A mask type is a 1 to 8-byte string that follows the "mask." annotation
in the format string. This enables obfuscating data in the event the
provided privacy level isn't enabled.
rdar://problem/36756282
llvm-svn: 346211
This is fifth in a series of patches to move intrinsic definitions out of intrin.h.
Note: This was reviewed and approved in D54065 but somehow that diff was messed
up. Committing this again with the proper diff.
llvm-svn: 346205
Summary: This is fifth in a series of patches to move intrinsic definitions out of intrin.h.
Reviewers: rnk, efriedma, mstorsjo, TomTan
Reviewed By: efriedma
Subscribers: javed.absar, kristof.beyls, chrib, jfb, kristina, cfe-commits
Differential Revision: https://reviews.llvm.org/D54065
llvm-svn: 346191
Summary: This is third in a series of patches to move intrinsic definitions out of intrin.h.
Reviewers: rnk, efriedma, mstorsjo, TomTan
Reviewed By: efriedma
Subscribers: javed.absar, kristof.beyls, chrib, jfb, kristina, cfe-commits
Differential Revision: https://reviews.llvm.org/D54062
llvm-svn: 346189
This exposes a (known) CodeGen bug: it can't cope with emitting lvalue
expressions that denote non-odr-used but usable-in-constant-expression
variables. See PR39528 for a testcase.
Reverted for now until that issue can be fixed.
llvm-svn: 346065
Summary: Windows SDK needs these intrinsics to be proper builtins. This is second in a series of patches to move intrinsic defintions out of intrin.h.
Reviewers: rnk, mstorsjo, efriedma, TomTan
Reviewed By: rnk, efriedma
Subscribers: javed.absar, kristof.beyls, chrib, jfb, kristina, cfe-commits
Differential Revision: https://reviews.llvm.org/D54046
llvm-svn: 346044
Handle it in the driver and propagate it to cc1
Reviewers: rjmccall, kcc, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D52615
llvm-svn: 346001
Coalesced memory access requires use of the new function
`__kmpc_data_sharing_coalesced_push_stack` instead of the
`__kmpc_data_sharing_push_stack`.
llvm-svn: 345991
The previously used combination `PTR_AND_OBJ | PRIVATE` could be used for mapping of some data in Fortran. Changed it to `PTR_AND_OBJ | LITERAL`.
llvm-svn: 345982
target/teams/distribute regions.
Target/teams/distribute regions exist for all the time the kernel is
executed. Thus, if the variable is declared in their context and then
escape it, we can allocate global memory statically instead of
allocating it dynamically.
Patch captures all the globalized variables in target/teams/distribute
contexts, merges them into the records, one per each target region.
Those records are then joined into the union, one per compilation unit
(to save the global memory). Those units are organized into
2 x dimensional arrays, where the first dimension is
the number of blocks per SM and the second one is the number of SMs.
Runtime functions manage this global memory space between the executing
teams.
llvm-svn: 345978
The size of an os_log buffer is known at any stage of compilation, so making it
a constant expression means that the common idiom of declaring a buffer for it
won't result in a VLA. That allows the compiler to skip saving and restoring
the stack pointer around such buffers.
This also moves the OSLog and other FormatString helpers from
libclangAnalysis to libclangAST to avoid a circular dependency.
llvm-svn: 345971
Failed assertion is
> Assertion failed: ((ND->isUsed(false) || !isa<VarDecl>(ND) || !E->getLocation().isValid()) && "Should not use decl without marking it used!"), function EmitDeclRefLValue, file llvm-project/clang/lib/CodeGen/CGExpr.cpp, line 2437.
`EmitDeclRefLValue` mentions
> // A DeclRefExpr for a reference initialized by a constant expression can
> // appear without being odr-used. Directly emit the constant initializer.
The fix is to use the similar approach for non-references as for references. It
is achieved by trying to emit a constant before we attempt to load non-odr-used
variable as LValue.
rdar://problem/40650504
Reviewers: ahatanak, rjmccall
Reviewed By: rjmccall
Subscribers: dexonsmith, erik.pilkington, cfe-commits
Differential Revision: https://reviews.llvm.org/D53674
llvm-svn: 345903
The goal is to use `emitConstant` in more places. Didn't move
`ComplexExprEmitter::emitConstant` because it returns a different type.
Reviewers: rjmccall, ahatanak
Reviewed By: rjmccall
Subscribers: dexonsmith, erik.pilkington, cfe-commits
Differential Revision: https://reviews.llvm.org/D53725
llvm-svn: 345897
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
of only 'break'.
We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
the outer case.
I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.
Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu
Differential Revision: https://reviews.llvm.org/D53950
llvm-svn: 345882
The size of an os_log buffer is known at any stage of compilation, so making it
a constant expression means that the common idiom of declaring a buffer for it
won't result in a VLA. That allows the compiler to skip saving and restoring
the stack pointer around such buffers.
This also moves the OSLog helpers from libclangAnalysis to libclangAST
to avoid a circular dependency.
llvm-svn: 345866
This also reverts a couple of follow-up commits trying to fix the
dependency issues. Latest revision added a cyclic dependency that can't
just be patched up in 5 minutes.
llvm-svn: 345846
The member type creation for a cpu-dispatch function was not correctly
including the 'this' parameter, so ensure that the type is properly
determined. Also, disable defer in the cases of emitting the functoins,
as it can end up resulting in the wrong version being emitted.
Change-Id: I0b8fc5e0b0d1ae1a9d98fd54f35f27f6e5d5d083
llvm-svn: 345838
The size of an os_log buffer is known at any stage of compilation, so making it
a constant expression means that the common idiom of declaring a buffer for it
won't result in a VLA. That allows the compiler to skip saving and restoring
the stack pointer around such buffers.
llvm-svn: 345828
When a dispatch function was being emitted that had both a generic and a
pentium configuration listed, we would assert. This is because neither
configuration has any 'features' associated with it so they were both
considered the 'default' version. 'pentium' lacks any features because
we implement it in terms of __builtin_cpu_supports (instead of Intel
proprietary checks), which is unable to decern between the two.
The fix for this is to omit the 'generic' version from the dispatcher if
both are present. This permits existing code to compile, and still will
choose the 'best' version available (since 'pentium' is technically
better than 'generic').
Change-Id: I4b69f3e0344e74cbdbb04497845d5895dd05fda0
llvm-svn: 345826
I fully expected for that to be handled by the canonical type check,
but it clearly wasn't. Sadly, somehow it hide until now.
Reported by Eli Friedman.
llvm-svn: 345816
Summary: Use the same convention as all the other WebAssembly builtin names.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, kristina, cfe-commits
Differential Revision: https://reviews.llvm.org/D53724
llvm-svn: 345804
__tls_guard.
__tls_guard can only ever transition from 0 to 1, and only once. This
permits LLVM to remove repeated checks for TLS initialization and
repeated initialization code in cases like:
int g();
thread_local int n = g();
int a = n + n;
where we could not prove that __tls_guard was still 'true' when checking
it for the second reference to 'n' in the initializer of 'a'.
llvm-svn: 345774
A ConstantExpr class represents a full expression that's in a context where a
constant expression is required. This class reflects the path the evaluator
took to reach the expression rather than the syntactic context in which the
expression occurs.
In the future, the class will be expanded to cache the result of the evaluated
expression so that it's not needlessly re-evaluated
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D53475
llvm-svn: 345692
For arguments, pass it indirectly, since the ABI doc says pretty clearly
that arguments larger than 8 bytes are passed indirectly. This makes
va_list handling easier, anyway.
When returning, GCC returns in XMM0, and we match them.
Fixes PR39492.
llvm-svn: 345676
This is the second half of Implicit Integer Conversion Sanitizer.
It completes the first half, and finally makes the sanitizer
fully functional! Only the bitfield handling is missing.
Summary:
C and C++ are interesting languages. They are statically typed, but weakly.
The implicit conversions are allowed. This is nice, allows to write code
while balancing between getting drowned in everything being convertible,
and nothing being convertible. As usual, this comes with a price:
```
void consume(unsigned int val);
void test(int val) {
consume(val);
// The 'val' is `signed int`, but `consume()` takes `unsigned int`.
// If val is negative, then consume() will be operating on a large
// unsigned value, and you may or may not have a bug.
// But yes, sometimes this is intentional.
// Making the conversion explicit silences the sanitizer.
consume((unsigned int)val);
}
```
Yes, there is a `-Wsign-conversion`` diagnostic group, but first, it is kinda
noisy, since it warns on everything (unlike sanitizers, warning on an
actual issues), and second, likely there are cases where it does **not** warn.
The actual detection is pretty easy. We just need to check each of the values
whether it is negative, and equality-compare the results of those comparisons.
The unsigned value is obviously non-negative. Zero is non-negative too.
https://godbolt.org/g/w93oj2
We do not have to emit the check *always*, there are obvious situations
where we can avoid emitting it, since it would **always** get optimized-out.
But i do think the tautological IR (`icmp ult %x, 0`, which is always false)
should be emitted, and the middle-end should cleanup it.
This sanitizer is in the `-fsanitize=implicit-conversion` group,
and is a logical continuation of D48958 `-fsanitize=implicit-integer-truncation`.
As for the ordering, i'we opted to emit the check **after**
`-fsanitize=implicit-integer-truncation`. At least on these simple 16 test cases,
this results in 1 of the 12 emitted checks being optimized away,
as compared to 0 checks being optimized away if the order is reversed.
This is a clang part.
The compiler-rt part is D50251.
Finishes fixing [[ https://bugs.llvm.org/show_bug.cgi?id=21530 | PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 | PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 | PR35409 ]].
Finishes partially fixing [[ https://bugs.llvm.org/show_bug.cgi?id=9821 | PR9821 ]].
Finishes fixing https://github.com/google/sanitizers/issues/940.
Only the bitfield handling is missing.
Reviewers: vsk, rsmith, rjmccall, #sanitizers, erichkeane
Reviewed By: rsmith
Subscribers: chandlerc, filcab, cfe-commits, regehr
Tags: #sanitizers, #clang
Differential Revision: https://reviews.llvm.org/D50250
llvm-svn: 345660
We haven't supported compiling ObjC1 for a long time (and never will again), so
there isn't any reason to keep these separate. This patch replaces
LangOpts::ObjC1 and LangOpts::ObjC2 with LangOpts::ObjC.
Differential revision: https://reviews.llvm.org/D53547
llvm-svn: 345637
Added support for mapping of lambdas in the target regions. It scans all
the captures by reference in the lambda, implicitly maps those variables
in the target region and then later reinstate the addresses of
references in lambda to the correct addresses of the captured|privatized
variables.
llvm-svn: 345609
Only store the NRVO candidate if needed in ReturnStmt.
A good chuck of all of the ReturnStmt have no NRVO candidate
(more than half when parsing all of Boost). For all of them
this saves one pointer. This has no impact on children().
Differential Revision: https://reviews.llvm.org/D53716
Reviewed By: rsmith
llvm-svn: 345605
Summary: So we can keep that not-so-great logic in one place.
Reviewers: rsmith, aaron.ballman
Reviewed By: rsmith
Subscribers: nemanjai, kbarton, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D53837
llvm-svn: 345594
nullptr_t does not access memory.
We now reuse CK_NullToPointer to represent a conversion from a glvalue
of type nullptr_t to a prvalue of nullptr_t where necessary.
llvm-svn: 345562
Summary: This patch adds a new code generation path for bound sharing directives containing distribute parallel for. The new code generation scheme applies to chunked schedules on distribute and parallel for directives. The scheme simplifies the code that is being generated by eliminating the need for an outer for loop over chunks for both distribute and parallel for directives. In the case of distribute it applies to any sized chunk while in the parallel for case it only applies when chunk size is 1.
Reviewers: ABataev, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D53448
llvm-svn: 345509
Summary: This patch enables the choosing of the default schedule for parallel for loops even in non-SPMD cases.
Reviewers: ABataev, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D53443
llvm-svn: 345507
If the loop counter is not declared in the context of the loop and it is
private, such loop counters should not be captured in the outlined
regions.
llvm-svn: 345505
Make the following changes to PredefinedExpr:
1. Move PredefinedExpr below StringLiteral so that it can use its definition.
2. Rename IdentType to IdentKind to be more in line with clang's conventions,
and propagate the change to its users.
3. Move the location and the IdentKind into the newly available space of
the bit-fields of Stmt.
4. Only store the function name when needed. When parsing all of Boost,
of the 1357 PredefinedExpr 919 have no function name.
Differential Revision: https://reviews.llvm.org/D53605
Reviewed By: rjmccall
llvm-svn: 345460
This reverts commit 8d6af840396f2da2e4ed6aab669214ae25443204 and commit
b78d19c287b6e4a9abc9fb0545de9a3106d38d3d which causes slower build times
by initializing the AddressSanitizer on every function run.
The corresponding revisions are https://reviews.llvm.org/D52814 and
https://reviews.llvm.org/D52739.
llvm-svn: 345433
Summary:
- Added names for some emitted values (such as "tobool" for
the result of a cast to boolean).
- Replaced explicit IRBuilder request for doing sext/zext/trunc
by using CreateIntCast instead.
- Simplify code for emitting satuation into one if-statement
for clamping to max, and one if-statement for clamping to min.
Reviewers: leonardchan, ebevhan
Reviewed By: leonardchan
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D53707
llvm-svn: 345398
This corrects the leader for the swift names. The encoding for 4.2 and
5.0 differ by a single bit on the second character and were swapped.
llvm-svn: 345360
Adds support for -mno-stack-arg-probe and -mstack-probe-size.
(Not really happy copy-pasting code, but that's what we do for all the
other Windows targets.)
Differential Revision: https://reviews.llvm.org/D53617
llvm-svn: 345354
Generate the FP16FML intrinsics into arm_neon.h (AArch64 only for now).
Add two new type modifiers to NeonEmitter to handle the new prototypes.
Define __ARM_FEATURE_FP16FML when +fp16fml is enabled and guard the
intrinsics with the macro in arm_neon.h.
Based on a patch by Gao Yiling.
Differential Revision: https://reviews.llvm.org/D53633
llvm-svn: 345344
storage class.
To be more in line with what GCC does, switch the condition to be based
on the Static Storage duration instead of the storage class.
Change-Id: I8e959d762433cda48855099353bf3c950b9d54b8
llvm-svn: 345302
Similar to how ICC handles CPU-Dispatch on Windows, this patch uses the
resolver function directly to forward the call to the proper function.
This is not nearly as efficient as IFuncs of course, but is still quite
useful for large functions specifically developed for certain
processors.
This is unfortunately still limited to x86, since it depends on
__builtin_cpu_supports and __builtin_cpu_is, which are x86 builtins.
The naming for the resolver/forwarding function for cpu-dispatch was
taken from ICC's implementation, which uses the unmodified name for this
(no mangling additions). This is possible, since cpu-dispatch uses '.A'
for the 'default' version.
In 'target' multiversioning, this function keeps the '.resolver'
extension in order to keep the default function keeping the default
mangling.
Change-Id: I4731555a39be26c7ad59a2d8fda6fa1a50f73284
Differential Revision: https://reviews.llvm.org/D53586
llvm-svn: 345298
The X86 backend will need to see the attribute to make decisions. If it isn't present the backend will have to assume large vectors may be present.
llvm-svn: 345237
Add a new driver level flag `-fcf-runtime-abi=` that allows one to specify the
runtime ABI for CoreFoundation. This controls the language interoperability.
In particular, this is relevant for generating the CFConstantString classes
(primarily through the `__builtin___CFStringMakeConstantString` builtin) which
construct a reference to the "CFObject"'s `isa` field. This type differs
between swift 4.1 and 4.2+.
Valid values for the new option include:
- objc [default behaviour] - enable ObjectiveC interoperability
- swift-4.1 - enable interoperability with swift 4.1
- swift-4.2 - enable interoperability with swift 4.2
- swift-5.0 - enable interoperability with swift 5.0
- swift [alias] - target the latest swift ABI
Furthermore, swift 4.2+ changed the layout for the CFString when building
CoreFoundation *without* ObjectiveC interoperability. In such a case, a field
was added to the CFObject base type changing it from: <{ const int*, int }> to
<{ uintptr_t, uintptr_t, uint64_t }>.
In swift 5.0, the CFString type will be further adjusted to change the length
from a uint32_t on everything but BE LP64 targets to uint64_t.
Note that the default behaviour for clang remains unchanged and the new layout
must be explicitly opted into via `-fcf-runtime-abi=swift*`.
llvm-svn: 345222
Summary:
For the following code:
```
int i;
#pragma omp taskloop
for (i = 0; i < 100; ++i)
{}
#pragma omp taskloop nogroup
for (i = 0; i < 100; ++i)
{}
```
Clang emits the following LLVM IR:
```
...
call void @__kmpc_taskgroup(%struct.ident_t* @0, i32 %0)
%2 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates*)* @.omp_task_entry. to i32 (i32, i8*)*))
...
call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %2, i32 1, i64* %8, i64* %9, i64 %13, i32 0, i32 0, i64 0, i8* null)
call void @__kmpc_end_taskgroup(%struct.ident_t* @0, i32 %0)
...
%15 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.1*)* @.omp_task_entry..2 to i32 (i32, i8*)*))
...
call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %15, i32 1, i64* %21, i64* %22, i64 %26, i32 0, i32 0, i64 0, i8* null)
```
The first set of instructions corresponds to the first taskloop construct. It is important to note that the implicit taskgroup region associated with the taskloop construct has been materialized in our IR: the `__kmpc_taskloop` occurs inside a taskgroup region. Note also that this taskgroup region does not exist in our second taskloop because we are using the `nogroup` clause.
The issue here is the 4th argument of the kmpc_taskloop call, starting from the end, is always a zero. Checking the LLVM OpenMP RT implementation, we see that this argument corresponds to the nogroup parameter:
```
void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val,
kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup,
int sched, kmp_uint64 grainsize, void *task_dup);
```
So basically we always tell to the RT to do another taskgroup region. For the first taskloop, this means that we create two taskgroup regions. For the second example, it means that despite the fact we had a nogroup clause we are going to have a taskgroup region, so we unnecessary wait until all descendant tasks have been executed.
Reviewers: ABataev
Reviewed By: ABataev
Subscribers: rogfer01, cfe-commits
Differential Revision: https://reviews.llvm.org/D53636
llvm-svn: 345180
This is a continuation of my patches to inform the X86 backend about what the largest IR types are in the function so that we can restrict the backend type legalizer to prevent 512-bit vectors on SKX when -mprefer-vector-width=256 is specified if no explicit 512 bit vectors were specified by the user.
This patch updates the vector width based on the argument and return types of the current function and from the types of any functions it calls. This is intended to make sure the backend type legalizer doesn't disturb any types that are required for ABI.
Differential Revision: https://reviews.llvm.org/D52441
llvm-svn: 345168
Extract the reference to the ASTContext and Triple and use them throughout the
function. This is simply a cosmetic cleanup while in the area. NFC.
llvm-svn: 345160
These declarations somehow survived a cleanup that combined them with the target
multiversioning functions. This patch removes them as they are no
longer necessary or used.
Change-Id: I318286401ace63bef1aa48018dabb25be0117ca0
llvm-svn: 345145
Before this patch, clang would emit a (module-)forward declaration for
template instantiations that are not anchored by an explicit template
instantiation, but still are guaranteed to be available in an imported
module. Unfortunately detecting the owning module doesn't reliably
work when local submodule visibility is enabled and the template is
inside a cross-module namespace.
This make clang debuggable again with -gmodules and LSV enabled.
rdar://problem/41552377
llvm-svn: 345109
This patch is a part of https://reviews.llvm.org/D48456 in an attempt to split
the casting logic up into smaller patches. This contains the code for casting
from fixed point types to boolean types.
Differential Revision: https://reviews.llvm.org/D53308
llvm-svn: 345063
This broke the Chromium build. See
https://bugs.chromium.org/p/chromium/issues/detail?id=898152#c1 for the
reproducer.
> Generate DILabel metadata and call llvm.dbg.label after label
> statement to associate the metadata with the label.
>
> After fixing PR37395.
> After fixing problems in LiveDebugVariables.
> After fixing NULL symbol problems in AddressPool when enabling
> split-dwarf-file.
> After fixing PR39094.
>
> Differential Revision: https://reviews.llvm.org/D45045
llvm-svn: 345026
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.
After fixing PR37395.
After fixing problems in LiveDebugVariables.
After fixing NULL symbol problems in AddressPool when enabling
split-dwarf-file.
After fixing PR39094.
Differential Revision: https://reviews.llvm.org/D45045
llvm-svn: 345009
For instantiated functions, search the template pattern to see if it marked
inline to determine if InlineHint attribute should be added to the function.
llvm-svn: 344987
Since multiversion variant functions can be inline, in C they become
available-externally linkage. This ends up causing the variants to not
be emitted, and not available to the linker.
The solution is to make sure that multiversion functions are always
emitted by marking them linkonce.
Change-Id: I897aa37c7cbba0c1eb2c57ee881d5000a2113b75
llvm-svn: 344957
Function calls without a !dbg location inside a function that has a
DISubprogram make it impossible to construct inline information and
are rejected by the verifier. This patch ensures that sanitizer check
function calls have a !dbg location, by carrying forward the location
of the preceding instruction or by inserting an artificial location if
necessary.
This fixes a crash when compiling the attached testcase with -Os.
rdar://problem/45311226
Differential Revision: https://reviews.llvm.org/D53459
llvm-svn: 344915
mangle types of lambda objects captured by a block instead of creating a
new mangle context everytime a captured field type is mangled.
This fixes a bug in IRGen's block helper merging code that was
introduced in r339438 where two blocks capturing two distinct lambdas
would end up sharing helper functions and the block descriptor. This
happened because the ID number used to distinguish lambdas defined
in the same context is reset everytime a mangled context is created.
rdar://problem/45314494
llvm-svn: 344833
libgcc supports more than 32 features by adding a new 32-bit variable __cpu_features2.
This adds the clang support for checking these feature bits.
Patches for compiler-rt and llvm to support this are coming as well.
Probably still need an additional patch for target multiversioning in clang.
Differential Revision: https://reviews.llvm.org/D53458
llvm-svn: 344832
Summary:
The multiversioning code repurposed the code from __builtin_cpu_supports for checking if a single feature is enabled. That code essentially performed (_cpu_features & (1 << C)) != 0. But with the multiversioning path, the mask is no longer guaranteed to be a power of 2. So we return true anytime any one of the bits in the mask is set not just all of the bits.
The correct check is (_cpu_features & mask) == mask
Reviewers: erichkeane, echristo
Reviewed By: echristo
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D53460
llvm-svn: 344824
Rather, they are subexpressions of the enclosing lambda-expression, and
any temporaries in them are destroyed at the end of that
full-expression, or when the corresponding lambda-expression is
destroyed if they are lifetime-extended.
llvm-svn: 344801
This patch exposes functionality added in rL344723 to the Clang driver/frontend
as a flag and adds appropriate metadata.
Driver tests pass:
```
ninja check-clang-driver
-snip-
Expected Passes : 472
Expected Failures : 3
Unsupported Tests : 65
```
Odd failure in CodeGen tests but unrelated to this:
```
ninja check-clang-codegen
-snip-
/SourceCache/llvm-trunk-8.0/tools/clang/test/CodeGen/builtins-wasm.c:87:10:
error: cannot compile this builtin function yet
-snip-
Failing Tests (1):
Clang :: CodeGen/builtins-wasm.c
Expected Passes : 1250
Expected Failures : 2
Unsupported Tests : 120
Unexpected Failures: 1
```
Original commit:
[X86] Support for the mno-tls-direct-seg-refs flag
Allows to disable direct TLS segment access (%fs or %gs). GCC supports a
similar flag, it can be useful in some circumstances, e.g. when a thread
context block needs to be updated directly from user space. More info and
specific use cases: https://bugs.llvm.org/show_bug.cgi?id=16145
Patch by nruslan (Ruslan Nikolaev).
Differential Revision: https://reviews.llvm.org/D53102
llvm-svn: 344739
Emit llvm.amdgcn.update.dpp for both __builtin_amdgcn_mov_dpp and
__builtin_amdgcn_update_dpp. The first argument to
llvm.amdgcn.update.dpp will be undef for __builtin_amdgcn_mov_dpp.
Differential Revision: https://reviews.llvm.org/D52320
llvm-svn: 344665
This patch is a part of https://reviews.llvm.org/D48456 in an attempt to
split them up. This contains the code for casting between fixed point types
and other fixed point types.
The method for converting between fixed point types is based off the convert()
method in APFixedPoint.
Differential Revision: https://reviews.llvm.org/D50616
llvm-svn: 344530
This reverts commit https://reviews.llvm.org/rL344150 which causes
MachineOutliner related failures on the ppc64le multistage buildbot.
llvm-svn: 344526
This removes the primary remaining API producing `TerminatorInst` which
will reduce the rate at which code is introduced trying to use it and
generally make it much easier to remove the remaining APIs across the
codebase.
Also clean up some of the stragglers that the previous mechanical update
of variables missed.
Users of LLVM and out-of-tree code generally will need to update any
explicit variable types to handle this. Replacing `TerminatorInst` with
`Instruction` (or `auto`) almost always works. Most of these edits were
made in prior commits using the perl one-liner:
```
perl -i -ple 's/TerminatorInst(\b.* = .*getTerminator\(\))/Instruction\1/g'
```
This also my break some rare use cases where people overload for both
`Instruction` and `TerminatorInst`, but these should be easily fixed by
removing the `TerminatorInst` overload.
llvm-svn: 344504
Some ObjC users declare a extern variable named OBJC_CLASS_$_Foo, then use it's
address as a Class. I.e., one could define isInstanceOfF:
BOOL isInstanceOfF(id c) {
extern void OBJC_CLASS_$_F;
return [c class] == (Class)&OBJC_CLASS_$_F;
}
This leads to asserts in clang CodeGen if there is an @implementation of F in
the same TU as an instance of this pattern, because CodeGen assumes that a
variable named OBJC_CLASS_$_* has the right type. This commit fixes the problem
by RAUWing the old (incorrectly typed) global with a new global, then removing
the old global.
rdar://45077269
Differential revision: https://reviews.llvm.org/D53154
llvm-svn: 344373
if the function has globalized variables and called in context of
target/teams/distribute regions, it does not need to globalize 32
copies of the same variables for memory coalescing, it is enough to
have just one copy, because there is parallel region.
Patch does this by adding call for `__kmpc_parallel_level` function and
checking its return value. If the code sees that the parallel level is
0, then only one variable is allocated, not 32.
llvm-svn: 344356
target/teams/distribute regions.
Previously introduced globalization scheme that uses memory coalescing
scheme may increase memory usage fr the variables that are devlared in
target/teams/distribute contexts. We don't need 32 copies of such
variables, just 1. Patch reduces memory use in this case.
llvm-svn: 344273
Summary:
As per IRC disscussion, it seems we really want to have more fine-grained `-fsanitize=implicit-integer-truncation`:
* A check when both of the types are unsigned.
* Another check for the other cases (either one of the types is signed, or both of the types is signed).
This is clang part.
Compiler-rt part is D50902.
Reviewers: rsmith, vsk, Sanitizers
Reviewed by: rsmith
Differential Revision: https://reviews.llvm.org/D50901
llvm-svn: 344230
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.
The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.
Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.
Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.
llvm-svn: 344199
This is currently a clang extension and a resolution
of the defect report in the C++ Standard.
Differential Revision: https://reviews.llvm.org/D46441
llvm-svn: 344150
When ifunc support was added to Clang (r265917) it did not allow
resolvers to take function arguments. This was based on GCC's
documentation, which states resolvers return a pointer and take no
arguments.
However, GCC actually allows resolvers to take arguments, and glibc (on
non-x86 platforms) and FreeBSD (on x86 and arm64) pass some CPU
identification information as arguments to ifunc resolvers. I believe
GCC's documentation is simply incorrect / out-of-date.
FreeBSD already removed the prohibition in their in-tree Clang copy.
Differential Revision: https://reviews.llvm.org/D52703
llvm-svn: 344100
Added support for memory coalescing for better performance for
globalized variables. From now on all the globalized variables are
represented as arrays of 32 elements and each thread accesses these
elements using `tid & 31` as index.
llvm-svn: 344049
DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which
indicates that all calls (both regular and tail) within the subprogram
have call site entries. The information within these call site entries
can be used by a debugger to populate backtraces with synthetic tail
call frames.
Tail calling frames go missing in backtraces because the frame of the
caller is reused by the callee. Call site entries allow a debugger to
reconstruct a sequence of (tail) calls which led from one function to
another. This improves backtrace quality. There are limitations: tail
recursion isn't handled, variables within synthetic frames may not
survive to be inspected, etc. This approach is not novel, see:
https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf
This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers
to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation
support needed to emit standards-compliant call site entries. For easier
deployment, when the debugger tuning is LLDB, the DWARF requirement is
adjusted to v4.
Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo
clang binary. Its dSYM passed verification and grew by 1.4% compared to
the baseline. 151,879 call site entries were added.
rdar://42001377
Differential Revision: https://reviews.llvm.org/D49887
llvm-svn: 343883
getGUID() returns an uint64_t and "%x" only prints 32 bits of it.
Use PRIx64 format string to print all 64 bits.
Differential Revision: https://reviews.llvm.org/D52938
llvm-svn: 343875
Fixed emission of the __kmpc_global_thread_num() so that it is not
messed up with alloca instructions anymore. Plus, fixes emission of the
__kmpc_global_thread_num() functions in the target outlined regions so
that they are not called before runtime is initialized.
llvm-svn: 343856
Summary: Add an optional attribute referring to a tuple of type and value template parameter nodes to the DIGlobalVariable node. This allows us to record the parameters of template variable specializations.
Reviewers: dblaikie, aprantl, probinson, JDevlieghere, clayborg, jingham
Reviewed By: JDevlieghere
Subscribers: cfe-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D52058
llvm-svn: 343707
Worker threads fork off to the compiler generated worker function
directly after entering the kernel function. Hence, there is no
need to check whether the current thread is the master if we are
outside of a parallel region (neither SPMD nor parallel_level > 0).
Differential Revision: https://reviews.llvm.org/D52732
llvm-svn: 343618
Only need to care about the 'distribute simd' case, all other composite
directives are handled elsewhere. This was already reflected in the
outer 'if' condition, so all other inner conditions could never be true.
Differential Revision: https://reviews.llvm.org/D52731
llvm-svn: 343617
This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original
options as aliases. When -fgpu-rdc is off,
clang will assume the device code in each translation unit does not call
external functions except those in the device library, therefore it is possible
to compile the device code in each translation unit to self-contained kernels
and embed them in the host object, so that the host object behaves like
usual host object which can be linked by lld.
The benefits of this feature is: 1. allow users to create static libraries which
can be linked by host linker; 2. amortized device code linking time.
This patch modifies HIP action builder to insert actions for linking device
code and generating HIP fatbin, and pass HIP fatbin to host backend action.
It extracts code for constructing command for generating HIP fatbin as
a function so that it can be reused by early finalization. It also modifies
codegen of HIP host constructor functions to embed the device fatbin
when it is available.
Differential Revision: https://reviews.llvm.org/D52377
llvm-svn: 343611
This reverts r326937 as it broke block argument handling in OpenCL.
See the discussion on https://reviews.llvm.org/D43783 .
The next commit will add a test case that revealed the issue.
llvm-svn: 343582
of a non-trivial C struct, copy the preceding trivial fields that
haven't been copied.
This commit fixes a bug where the instructions used to copy the
preceding trivial fields were emitted inside the loop body.
rdar://problem/44185064
llvm-svn: 343556
from those that aren't.
This patch changes the way __block variables that aren't captured by
escaping blocks are handled:
- Since non-escaping blocks on the stack never get copied to the heap
(see https://reviews.llvm.org/D49303), Sema shouldn't error out when
the type of a non-escaping __block variable doesn't have an accessible
copy constructor.
- IRGen doesn't have to use the specialized byref structure (see
https://clang.llvm.org/docs/Block-ABI-Apple.html#id8) for a
non-escaping __block variable anymore. Instead IRGen can emit the
variable as a normal variable and copy the reference to the block
literal. Byref copy/dispose helpers aren't needed either.
This reapplies r343518 after fixing a use-after-free bug in function
Sema::ActOnBlockStmtExpr where the BlockScopeInfo was dereferenced after
it was popped and deleted.
rdar://problem/39352313
Differential Revision: https://reviews.llvm.org/D51564
llvm-svn: 343542
from those that aren't.
This patch changes the way __block variables that aren't captured by
escaping blocks are handled:
- Since non-escaping blocks on the stack never get copied to the heap
(see https://reviews.llvm.org/D49303), Sema shouldn't error out when
the type of a non-escaping __block variable doesn't have an accessible
copy constructor.
- IRGen doesn't have to use the specialized byref structure (see
https://clang.llvm.org/docs/Block-ABI-Apple.html#id8) for a
non-escaping __block variable anymore. Instead IRGen can emit the
variable as a normal variable and copy the reference to the block
literal. Byref copy/dispose helpers aren't needed either.
This reapplies r341754, which was reverted in r341757 because it broke a
couple of bots. r341754 was calling markEscapingByrefs after the call to
PopFunctionScopeInfo, which caused the popped function scope to be
cleared out when the following code was compiled, for example:
$ cat test.m
struct A {
id data[10];
};
void foo() {
__block A v;
^{ (void)v; };
}
This commit calls markEscapingByrefs before calling PopFunctionScopeInfo
to prevent that from happening.
rdar://problem/39352313
Differential Revision: https://reviews.llvm.org/D51564
llvm-svn: 343518
lightweight runtime.
The datasharing flag must be set to `1` when executing SPMD-mode compatible directive with reduction|lastprivate clauses.
llvm-svn: 343492
There are a few leftovers of rC343147 that are not (\w+)\.begin but in
the form of ([-[:alnum:]>.]+)\.begin or spanning two lines. Change them
to use the container form in this commit. The 12 occurrences have been
inspected manually for safety.
llvm-svn: 343425
Summary: Set default schedule for parallel for loops to schedule(static, 1) when using SPMD mode on the NVPTX device offloading toolchain to ensure coalescing.
Reviewers: ABataev, Hahnfeld, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D52629
llvm-svn: 343260
Summary: For the OpenMP NVPTX toolchain choose a default distribute schedule that ensures coalescing on the GPU when in SPMD mode. This significantly increases the performance of offloaded target code and reduces the number of registers used on the GPU side.
Reviewers: ABataev, caomhin, Hahnfeld
Reviewed By: ABataev, Hahnfeld
Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D52434
llvm-svn: 343253
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.
After fixing PR37395.
After fixing problems in LiveDebugVariables.
After fixing NULL symbol problems in AddressPool when enabling
split-dwarf-file.
Differential Revision: https://reviews.llvm.org/D45045
llvm-svn: 343148
Previously we used a select and the zero_undef=true intrinsic. In -O2 this pattern will get optimized to zero_undef=false. But in -O0 this optimization won't happen. This results in a compare and cmov being wrapped around a tzcnt/lzcnt instruction.
By using the zero_undef=false intrinsic directly without the select, we can improve the -O0 codegen to just an lzcnt/tzcnt instruction.
Differential Revision: https://reviews.llvm.org/D52392
llvm-svn: 343126
Add support for OMP5.0 requires directive and unified_address clause.
Patches to follow will include support for additional clauses.
Differential Revision: https://reviews.llvm.org/D52359
llvm-svn: 343063
Relanding rL342883 with more fragmented tests to test ELF-specific
section emission separately from broad-scope CFString tests. Now this
tests the following separately
1). CoreFoundation builds and linkage for ELF while building it.
2). CFString ELF section emission outside CF in assembly output.
3). Broad scope `cfstring3.c` tests which cover all object formats at
bitcode level and assembly level (including ELF).
This fixes non-bridged CoreFoundation builds on ELF targets
that use -fconstant-cfstrings. The original changes from differential
for a similar patch to PE/COFF (https://reviews.llvm.org/D44491) did not
check for an edge case where the global could be a constant which surfaced
as an issue when building for ELF because of different linkage semantics.
This patch addresses several issues with crashes related to CF builds on ELF
as well as improves data layout by ensuring string literals that back
the actual CFConstStrings end up in .rodata in line with Mach-O.
Change itself tested with CoreFoundation on Linux x86_64 but should be valid
for BSD-like systems as well that use ELF as the native object format.
Differential Revision: https://reviews.llvm.org/D52344
llvm-svn: 343038
[Clang][CodeGen][ObjC]: Fix non-bridged CoreFoundation builds on ELF targets
that use `-fconstant-cfstrings`. The original changes from differential
for a similar patch to PE/COFF (https://reviews.llvm.org/D44491) did not
check for an edge case where the global could be a constant which surfaced
as an issue when building for ELF because of different linkage semantics.
This patch addresses several issues with crashes related to CF builds on ELF
as well as improves data layout by ensuring string literals that back
the actual CFConstStrings end up in .rodata in line with Mach-O.
Change itself tested with CoreFoundation on Linux x86_64 but should be valid
for BSD-like systems as well that use ELF as the native object format.
Differential Revision: https://reviews.llvm.org/D52344
llvm-svn: 342883
Comparison functions used in sorting algorithms need to have strict weak
ordering. Remove the assert and allow comparisons on all lists.
llvm-svn: 342774
Currently the code-model does not get saved in the module IR, so if a
code model is specified when compiling with LTO, it gets lost and is
not propagated properly to LTO. This patch does what is necessary in
the front end to pass the code-model to the module, so that the back
end can store it in the Module .
Differential Revision: https://reviews.llvm.org/D52323
llvm-svn: 342758
Summary:
This code was in CGDecl.cpp and really belongs in LLVM. It happened to have isBytewiseValue which served a very similar purpose but wasn't as powerful as clang's version. Remove the clang version, and augment isBytewiseValue to be as powerful so that clang does the same thing it used to.
LLVM part of this patch: D51751
Subscribers: dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D51752
llvm-svn: 342734
Summary:
Some lines have a hit counter where they should not have one.
Cleanup stuff is located to the last line of the body which is most of the time a '}'.
And Exception stuff is added at the beginning of a function and at the end (represented by '{' and '}').
So in such cases, the DebugLoc used in GCOVProfiling.cpp must be marked as not covered.
This patch is a followup of https://reviews.llvm.org/D49915.
Tests in projects/compiler_rt are fixed by: https://reviews.llvm.org/D49917
Reviewers: marco-c, davidxl
Reviewed By: marco-c
Subscribers: dblaikie, cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D49916
llvm-svn: 342717
unsigned long long builtin_unpack_vector_int128 (vector int128_t, int);
vector int128_t builtin_pack_vector_int128 (unsigned long long, unsigned long long);
Builtins should behave the same way as in GCC.
Patch By: wuzish (Zixuan Wu)
Differential Revision: https://reviews.llvm.org/D52074
llvm-svn: 342614
This special case was added in r264841, but the code breaks our
invariants by calling EmitTopLevelDecl without first creating a
HandlingTopLevelDeclRAII scope.
This fixes the PCH crash in https://crbug.com/884427. I was never able
to make a satisfactory reduction, unfortunately. I'm not very worried
about this regressing since this change makes the code simpler while
passing the existing test that shows we do emit dllexported friend
function definitions. Now we just defer their emission until the tag is
fully complete, which is generally good.
llvm-svn: 342516
Summary:
Before this change, we only emit the XRay attributes in LLVM IR when the
-fxray-instrument flag is provided. This may cause issues with thinlto
when the final binary is being built/linked with -fxray-instrument, and
the constitutent LLVM IR gets re-lowered with xray instrumentation.
With this change, we can honour the "never-instrument "attributes
provided in the source code and preserve those in the IR. This way, even
in thinlto builds, we retain the attributes which say whether functions
should never be XRay instrumented.
This change addresses llvm.org/PR38922.
Reviewers: mboerger, eizan
Subscribers: mehdi_amini, dexonsmith, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D52015
llvm-svn: 342200
Previously, both types (plus the future target-clones) of
multiversioning had a separate ResolverOption structure and emission
function. This patch combines the two, at the expense of a slightly
more expensive sorting function.
llvm-svn: 342152
declare reduction.
If the declare reduction construct with the non-dependent type is
defined in the template construct, the compiler might crash on the
template instantition. Reworked the whole instantiation scheme for the
declare reduction constructs to fix this problem correctly.
llvm-svn: 342151
Functions generated by clang and included in the .init_array section (such as
static constructors) do not follow the usual code path for adding
target-specific function attributes, so we have to add the return address
signing attribute here too, as is currently done for the sanitisers.
Differential revision: https://reviews.llvm.org/D51418
llvm-svn: 342126
Previously the alignment on the newly created rtti/typeinfo data was largely
not set, meaning that DataLayout::getPreferredAlignment was free to overalign
it to 16 bytes. This causes unnecessary code bloat.
Differential Revision: https://reviews.llvm.org/D51416
llvm-svn: 342053
Summary:
On targets that do not support FP16 natively LLVM currently legalizes
vectors of FP16 values by scalarizing them and promoting to FP32. This
causes problems for the following code:
void foo(int, ...);
typedef __attribute__((neon_vector_type(4))) __fp16 float16x4_t;
void bar(float16x4_t x) {
foo(42, x);
}
According to the AAPCS (appendix A.2) float16x4_t is a containerized
vector fundamental type, so 'foo' expects that the 4 16-bit FP values
are packed into 2 32-bit registers, but instead bar promotes them to
4 single precision values.
Since we already handle scalar FP16 values in the frontend by
bitcasting them to/from integers, this patch adds similar handling for
vector types and homogeneous FP16 vector aggregates.
One existing test required some adjustments because we now generate
more bitcasts (so the patch changes the test to target a machine with
native FP16 support).
Reviewers: eli.friedman, olista01, SjoerdMeijer, javed.absar, efriedma
Reviewed By: javed.absar, efriedma
Subscribers: efriedma, kristof.beyls, cfe-commits, chrib
Differential Revision: https://reviews.llvm.org/D50507
llvm-svn: 342034
This patch removes the last reason why DIFlagBlockByrefStruct from
Clang by directly implementing the drilling into the member type done
in DwarfDebug::DbgVariable::getType() into the frontend.
rdar://problem/31629055
Differential Revision: https://reviews.llvm.org/D51807
llvm-svn: 341842
from those that aren't.
This patch changes the way __block variables that aren't captured by
escaping blocks are handled:
- Since non-escaping blocks on the stack never get copied to the heap
(see https://reviews.llvm.org/D49303), Sema shouldn't error out when
the type of a non-escaping __block variable doesn't have an accessible
copy constructor.
- IRGen doesn't have to use the specialized byref structure (see
https://clang.llvm.org/docs/Block-ABI-Apple.html#id8) for a
non-escaping __block variable anymore. Instead IRGen can emit the
variable as a normal variable and copy the reference to the block
literal. Byref copy/dispose helpers aren't needed either.
rdar://problem/39352313
Differential Revision: https://reviews.llvm.org/D51564
llvm-svn: 341754
Summary:
The optimized (__atomic_foo_<n>) libcalls assume that the atomic object
is properly aligned, so should never be called on an underaligned
object.
This addresses one of several problems identified in PR38846.
Reviewers: jyknight, t.p.northover
Subscribers: jfb, cfe-commits
Differential Revision: https://reviews.llvm.org/D51817
llvm-svn: 341734
This is the clang side of D51803. The llvm intrinsic now returns two results. So we need to emit an explicit store in IR for the out parameter. This is similar to addcarry/subborrow/rdrand/rdseed.
Differential Revision: https://reviews.llvm.org/D51805
llvm-svn: 341699
This is the clang side of D51769. The llvm intrinsics now return two results instead of using an out parameter.
Differential Revision: https://reviews.llvm.org/D51771
llvm-svn: 341678
Boilerplate code for using KMSAN instrumentation in Clang.
We add a new command line flag, -fsanitize=kernel-memory, with a
corresponding SanitizerKind::KernelMemory, which, along with
SanitizerKind::Memory, maps to the memory_sanitizer feature.
KMSAN is only supported on x86_64 Linux.
It's incompatible with other sanitizers, but supports code coverage
instrumentation.
llvm-svn: 341641
This reverts commit r341519, which generates debug info that causes
backend crashes. (with -split-dwarf-file)
Details in https://reviews.llvm.org/D50495
llvm-svn: 341549
Generate DILabel metadata and call llvm.dbg.label after label
statement to associate the metadata with the label.
After fixing PR37395.
After fixing problems in LiveDebugVariables.
Differential Revision: https://reviews.llvm.org/D45045
llvm-svn: 341519
Load Hardening.
Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.
Most of the churn here is adding the IR attribute. I talked about this
Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.
While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.
This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.
Differential Revision: https://reviews.llvm.org/D51157
llvm-svn: 341363
These aren't documented in the Intel Intrinsics Guide, but are supported by gcc and icc.
Includes these intrinsics:
_ktestc_mask8_u8, _ktestz_mask8_u8, _ktest_mask8_u8
_ktestc_mask16_u8, _ktestz_mask16_u8, _ktest_mask16_u8
_ktestc_mask32_u8, _ktestz_mask32_u8, _ktest_mask32_u8
_ktestc_mask64_u8, _ktestz_mask64_u8, _ktest_mask64_u8
llvm-svn: 341265
This adds:
_cvtmask8_u32, _cvtmask16_u32, _cvtmask32_u32, _cvtmask64_u64
_cvtu32_mask8, _cvtu32_mask16, _cvtu32_mask32, _cvtu64_mask64
_load_mask8, _load_mask16, _load_mask32, _load_mask64
_store_mask8, _store_mask16, _store_mask32, _store_mask64
These are currently missing from the Intel Intrinsics Guide webpage.
llvm-svn: 341251
This adds the following intrinsics:
_kshiftli_mask8
_kshiftli_mask16
_kshiftli_mask32
_kshiftli_mask64
_kshiftri_mask8
_kshiftri_mask16
_kshiftri_mask32
_kshiftri_mask64
llvm-svn: 341234
Summary:
Added option -gline-directives-only to support emission of the debug directives
only. It behaves very similar to -gline-tables-only, except that it sets
llvm debug info emission kind to
llvm::DICompileUnit::DebugDirectivesOnly.
Reviewers: echristo
Subscribers: aprantl, fedor.sergeev, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D51177
llvm-svn: 341212
'declare target'.
All the functions, referenced in implicit|explicit target regions must
be emitted during code emission for the device.
llvm-svn: 341093
If the target construct can be executed in SPMD mode + it is a loop
based directive with static scheduling, we can use lightweight runtime
support.
llvm-svn: 340953
Since MinGW supports automatically importing external variables from
DLLs even without the DLLImport attribute, we shouldn't mark them
as DSO local unless we actually know them to be local for sure.
Keep marking thread local variables as DSO local.
Differential Revision: https://reviews.llvm.org/D51382
llvm-svn: 340941
Currently ident_t objects are created const when debug info is not
enabled, but the libittnotify libray in the OpenMP runtime writes to
the reserved_2 field (See __kmp_itt_region_forking in
openmp/runtime/src/kmp_itt.inl). Now create ident_t objects non-const.
Differential Revision: https://reviews.llvm.org/D51331
llvm-svn: 340934
This adds the following intrinsics:
_kadd_mask64
_kadd_mask32
_kadd_mask16
_kadd_mask8
These are missing from the Intel Intrinsics Guide, but are implemented by both gcc and icc.
llvm-svn: 340879
This also adds a second intrinsic name for the 16-bit mask versions.
These intrinsics match gcc and icc. They just aren't published in the Intel Intrinsics Guide so I only recently found they existed.
llvm-svn: 340719
If all LLVM passes are disabled, we can't emit a summary because there
could be unnamed globals in the IR.
Differential Revision: https://reviews.llvm.org/D51198
llvm-svn: 340640
constants by default when there is no optimization.
GCC's option -fno-keep-static-consts can be used to not emit
unused static constants.
In Clang, since default behavior does not keep unused static constants,
-fkeep-static-consts can be used to emit these if required. This could be
useful for producing identification strings like SVN identifiers
inside the object file even though the string isn't used by the program.
Differential Revision: https://reviews.llvm.org/D40925
llvm-svn: 340439
of the captured variable when determining whether the capture needs
special handing when the block is copied or disposed.
This fixes bugs in the handling of variables captured by a block that is
nested inside a lambda that captures the variables by reference.
rdar://problem/43540889
Differential Revision: https://reviews.llvm.org/D51025
llvm-svn: 340408
EmitX86BuiltinExpr() emits all args into Ops at the beginning, so don't do that
work again.
This changes behavior: If e.g. ++a was passed as an arg, we incremented a twice
previously. This change fixes that bug.
https://reviews.llvm.org/D50979
llvm-svn: 340348
If using a custom stack alignment, one is expected to make sure
that all callers provide such alignment, or realign the stack in
all entry points (and callbacks).
Despite this, the compiler can assume that the main function will
need realignment in these cases, since the startup routines calling
the main function most probably won't provide the custom alignment.
This matches what GCC does in similar cases; if compiling with
-mincoming-stack-boundary=X -mpreferred-stack-boundary=X, GCC normally
assumes such alignment on entry to a function, but specifically for
the main function still does realignment.
Differential Revision: https://reviews.llvm.org/D51026
llvm-svn: 340334
This commit adds the flag -fno-c++-static-destructors and the attributes
[[clang::no_destroy]] and [[clang::always_destroy]]. no_destroy specifies that a
specific static or thread duration variable shouldn't have it's destructor
registered, and is the default in -fno-c++-static-destructors mode.
always_destroy is the opposite, and is the default in -fc++-static-destructors
mode.
A variable whose destructor is disabled (either because of
-fno-c++-static-destructors or [[clang::no_destroy]]) doesn't count as a use of
the destructor, so we don't do any access checking or mark it referenced. We
also don't emit -Wexit-time-destructors for these variables.
rdar://21734598
Differential revision: https://reviews.llvm.org/D50994
llvm-svn: 340306