When creating a global variable with a type of a struct with bitfields, we must
forcibly set the alignment of the global from the RecordDecl. We must do this so
that the proper bitfield alignment makes its way down to LLVM, since clang will
mangle the bitfields into one large type.
llvm-svn: 235976
This makes sure that the front end is specific about what they're expecting
the backend to produce. Update a FIXME with the idea that the target-features
could be more precise using backend knowledge.
llvm-svn: 235936
Currently clang emits file-scope asm during *both* host and device
compilation modes which is usually a wrong thing to do.
There's no way to attach any attribute to an __asm statement, so
there's no way to differentiate between host-side and device-side
file-scope asm. This patch makes clang to match nvcc behavior and
emit file-scope-asm only during host-side compilation.
Differential Revision: http://reviews.llvm.org/D9270
llvm-svn: 235905
Emit the following code for 'taskwait' directive within tied task:
call i32 @__kmpc_omp_taskwait(<loc>, i32 <thread_id>);
Differential Revision: http://reviews.llvm.org/D9245
llvm-svn: 235836
Emit a code for reduction clause. Next code should be emitted for reductions:
static kmp_critical_name lock = { 0 };
void reduce_func(void *lhs[<n>], void *rhs[<n>]) {
*(Type0*)lhs[0] = ReductionOperation0(*(Type0*)lhs[0], *(Type0*)rhs[0]);
...
*(Type<n>-1*)lhs[<n>-1] =
ReductionOperation<n>-1(*(Type<n>-1*)lhs[<n>-1],
*(Type<n>-1*)rhs[<n>-1]);
}
...
void *RedList[<n>] = {&<RHSExprs>[0], ..., &<RHSExprs>[<n>-1]};
switch (__kmpc_reduce{_nowait}(<loc>, <gtid>, <n>, sizeof(RedList), RedList, reduce_func, &<lock>)) {
case 1:
<LHSExprs>[0] = ReductionOperation0(*<LHSExprs>[0], *<RHSExprs>[0]);
...
<LHSExprs>[<n>-1] = ReductionOperation<n>-1(*<LHSExprs>[<n>-1], *<RHSExprs>[<n>-1]);
__kmpc_end_reduce{_nowait}(<loc>, <gtid>, &<lock>);
break;
case 2:
Atomic(<LHSExprs>[0] = ReductionOperation0(*<LHSExprs>[0], *<RHSExprs>[0]));
...
Atomic(<LHSExprs>[<n>-1] = ReductionOperation<n>-1(*<LHSExprs>[<n>-1], *<RHSExprs>[<n>-1]));
break;
default:;
}
Reduction variables are a kind of a private variables, they have private copies, but initial values are chosen in accordance with the reduction operation.
If sections directive has only single section, then original shared variables are used instead with barrier at the end of the directive.
Differential Revision: http://reviews.llvm.org/D9242
llvm-svn: 235835
#pragma omp sections lastprivate(<var>)
<BODY>;
This construct is translated into something like:
<last_iter> = alloca i32
<init for lastprivates>;
<last_iter> = 0
; No initializer for simple variables or a default constructor is called for objects.
; For arrays perform element by element initialization by the call of the default constructor.
...
OMP_FOR_START(...,<last_iter>, ..); sets <last_iter> to 1 if this is the last iteration.
<BODY>
...
OMP_FOR_END
if (<last_iter> != 0) {
<final copy for lastprivate>; Update original variable with the lastprivate value.
}
call __kmpc_cancel_barrier() ; an implicit barrier to avoid possible data race.
If there is only one section, there is no special code generation, original shared variables are used + barrier is emitted at the end of the directive.
Differential Revision: http://reviews.llvm.org/D9240
llvm-svn: 235834
If there are 2 or more sections in a 'section' directive the following code is generated:
<default init for privates>
@__kmpc_for_static_init_4();
<BODY for sections directive>
@__kmpc_for_static_fini()
If there is only one section, the following code is generated:
if (@__kmpc_single()) {
<default init for privates>
@__kmpc_end_single();
}
Differential Revision: http://reviews.llvm.org/D9239
llvm-svn: 235833
Emit the following code for 'single' directive with 'private' clause:
if (@__kmpc_single()) {
<default init for privates>
@__kmpc_end_single();
}
Differential Revision: http://reviews.llvm.org/D9238
llvm-svn: 235832
Fixes rdar://20621065.
A more elegant fix would preclude this case by defining the
rules such that zero-size classes are always formally empty.
I believe the only extensions which create zero-size classes
right now are flexible arrays and zero-length arrays; it's
not abstractly unreasonable to say that those don't count
as members for the purposes of emptiness, just as zero-width
bitfields don't count. But that's an ABI-affecting change
and requires further discussion; in the meantime, let's not
assert / miscompile.
llvm-svn: 235815
This fixes a crash when we're emitting coverage and a macro appears
between two binary conditional operators, ie, "foo ?: MACRO ?: bar",
and fixes the interaction of macros and conditional operators in
general.
llvm-svn: 235793
Emit the following code for 'single' directive with 'firtstprivate' clause:
if (@__kmpc_single()) {
<init for firstprivates>
@__kmpc_end_single();
}
@__kmpc_cancel_barrier(); // To avoid data race in firstprivate init
Differential Revision: http://reviews.llvm.org/D9223
llvm-svn: 235694
Runtime function for 'copyprivate' directive generates implicit barriers, so no need to emit it.
Differential Revision: http://reviews.llvm.org/D9215
llvm-svn: 235692
If there are 2 or more sections in a 'section' directive the following code is generated:
<init for firstprivates>
@__kmpc_cancel_barrier();// To avoid data race in firstprivate init
@__kmpc_for_static_init_4();
<BODY for sections directive>
@__kmpc_for_static_fini()
If there is only one section, the following code is generated:
if (@__kmpc_single()) {
<init for firstprivates>
@__kmpc_end_single();
}
@__kmpc_cancel_barrier(); // To avoid data race in firstprivate init
Differential Revision: http://reviews.llvm.org/D9214
llvm-svn: 235691
The RegionCounter type does a lot of legwork, but most of it is only
meaningful within the implementation of CodeGenPGO. The uses elsewhere
in CodeGen generally just want to increment or read counters, so do
that directly.
llvm-svn: 235664
In r235553, Clang started emitting lifetime markers more often. This
caused false negative in MSan, because MSan only poisons all allocas
once at function entry. Eventually, MSan should poison allocas at
lifetime start and probably also lifetime end, but until then, let's not
emit markers that aren't going to be useful.
llvm-svn: 235613
Adds codegen for 'atomic capture' constructs with the following forms of expressions/statements:
v = x binop= expr;
v = x++;
v = ++x;
v = x--;
v = --x;
v = x = x binop expr;
v = x = expr binop x;
{v = x; x = binop= expr;}
{v = x; x++;}
{v = x; ++x;}
{v = x; x--;}
{v = x; --x;}
{x = x binop expr; v = x;}
{x binop= expr; v = x;}
{x++; v = x;}
{++x; v = x;}
{x--; v = x;}
{--x; v = x;}
{x = x binop expr; v = x;}
{x = expr binop x; v = x;}
{v = x; x = expr;}
If x and expr are integer and binop is associative or x is a LHS in a RHS of the assignment expression, and atomics are allowed for type of x on the target platform atomicrmw instruction is emitted.
Otherwise compare-and-swap sequence is emitted.
Update of 'v' is not required to be be atomic with respect to the read or write of the 'x'.
bb:
...
atomic load <x>
cont:
<expected> = phi [ <x>, label %bb ], [ <new_failed>, %cont ]
<desired> = <expected> binop <expr>
<res> = cmpxchg atomic &<x>, desired, expected
<new_failed> = <res>.field1;
br <res>field2, label %exit, label %cont
exit:
atomic store <old/new x>, <v>
...
Differential Revision: http://reviews.llvm.org/D9049
llvm-svn: 235573
Summary:
Make sure signed overflow in "x--" is checked with
llvm.ssub.with.overflow intrinsic and is reported as:
"-2147483648 - 1 cannot be represented in type 'int'"
instead of:
"-2147483648 + -1 cannot be represented in type 'int'"
, like we do for unsigned overflow.
Test Plan: clang + compiler-rt regression test suite
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D8236
llvm-svn: 235568
We try to use the member variable "FuncName" here, but we've also used
that name as a parameter. This ends with us getting the length of the
function name wrong when we generate the coverage data.
llvm-svn: 235565
These extra endcatch markers aren't helping identify regions to outline,
so let's get rid of them. LLVM outlines (more or less) from begincatch
to endcatch. Any unwind edge from an enclosed invoke is a transition to
a new exception handler, which has it's own outlining markers.
llvm-svn: 235562
This reverts commit r234700. It turns out that the lifetime markers
were not the cause of Chromium failing but a bug which was uncovered by
optimizations exposed by the markers.
llvm-svn: 235553
Otherwise -fno-omit-frame-pointer and other flags like it aren't
applied.
Basic idea taken from Gao's patch, thanks!
Differential Revision: http://reviews.llvm.org/D9203
llvm-svn: 235537
If condition evaluates to true, the code executes task by calling @__kmpc_omp_task() runtime function.
If condition evaluates to false, the code executes serial version of the code by executing the following code:
call void @__kmpc_omp_task_begin_if0(<loc>, <threadid>, <task_t_ptr, returned by @__kmpc_omp_task_alloc()>);
proxy_task_entry(<gtid>, <task_t_ptr, returned by @__kmpc_omp_task_alloc()>);
call void @__kmpc_omp_task_complete_if0(<loc>, <threadid>, <task_t_ptr, returned by @__kmpc_omp_task_alloc()>);
Also it checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).
Differential Revision: http://reviews.llvm.org/D9143
llvm-svn: 235507
This patch generates helper variables which used as a private copies of the corresponding original variables inside an OpenMP 'for' directive. These generated variables are initialized by default (with the default constructor, if any). In OpenMP region references to original variables are replaced by the references to these private helper variables.
Differential Revision: http://reviews.llvm.org/D9106
llvm-svn: 235503
Patch fixes bugs in codegen for loops with unsigned counters and zero trip count. Previously preconditions for all loops were built using logic (Upper - Lower) > 0. But if the loop is a loop with zero trip count, then Upper - Lower is < 0 only for signed integer, for unsigned we're running into an underflow situation.
In this patch we're using original Lower<Upper condition to check that loop body can be executed at least once. Also this allows to skip code generation for loops, if it is known that preconditions for the loop are always false.
Differential Revision: http://reviews.llvm.org/D9103
llvm-svn: 235500
Add codegen for 'ordered' directive:
__kmpc_ordered(ident_t *, gtid);
<associated statement>;
__kmpc_end_ordered(ident_t *, gtid);
Also for 'for' directives with the dynamic scheduling and an 'ordered' clause added a call to '__kmpc_dispatch_fini_(4|8)[u]()' function after increment expression for loop control variable:
while(__kmpc_dispatch_next(&LB, &UB)) {
idx = LB;
while (idx <= UB) { BODY; ++idx;
__kmpc_dispatch_fini_(4|8)[u](); // For ordered loops only.
} // inner loop
}
Differential Revision: http://reviews.llvm.org/D9070
llvm-svn: 235496
- Changed CUDALaunchBounds arguments from integers to Expr* so they can
be saved in AST for instantiation.
- Added support for template instantiation of launch_bounds attrubute.
- Moved evaluation of launch_bounds arguments to NVPTXTargetCodeGenInfo::
SetTargetAttributes() where it can be done after template instantiation.
- Added a warning on negative launch_bounds arguments.
- Amended test cases.
Differential Revision: http://reviews.llvm.org/D8985
llvm-svn: 235452
An upcoming LLVM commit will remove the `DIArray` and `DITypeArray`
typedefs that shadow `DebugNodeArray` and `MDTypeRefArray`,
respectively. Use those types directly.
llvm-svn: 235412
Code in CodeGenModule::GetOrCreateLLVMGlobal that sets up GlobalValue
object for LLVM external symbols has this comment:
// FIXME: This code is overly simple and should be merged with other global
// handling.
One part does seems to be "overly simple" currently is that this code
never sets any alignment info on the GlobalValue, so that the emitted
IR does not have any align attribute on external globals. This can
lead to unnecessarily inefficient code generation.
This patch adds a GV->setAlignment call to set alignment info.
llvm-svn: 235396
Prepare for the deletion in LLVM of the subclasses of (the already
deleted) `DIScope` by using the raw pointers they were wrapping
directly.
llvm-svn: 235355
Subclasses of (the already deleted) `DIType` will be deleted by an
upcoming LLVM commit. Remove references.
While `DICompositeType` wraps `MDCompositeTypeBase` and `DIDerivedType`
wraps `MDDerivedTypeBase`, most uses of each really meant the more
specific `MDCompositeType` and `MDDerivedType`. I updated accordingly.
llvm-svn: 235350
Something like { void*, void * } would be passed to a function as a [2 x i64], but returned as an i128. This patch unifies the 2 behaviours so that we also return it as a [2 x i64].
This is better for the quality of the IR, and the size of the final LLVM binary as we tend to want to insert/extract values from these types and do so with the insert/extract instructions is less IR than shifting, truncating, and or'ing values.
Reviewed by Tim Northover.
llvm-svn: 235231
LLVM r235111 changed the `DIBuilder` API to stop using `DIDescriptor`
and its subclasses. Rolled into this was some tightening up of types:
- Scopes: `DIDescriptor` => `MDScope*`.
- Generic debug nodes: `DIDescriptor` => `DebugNode*`.
- Subroutine types: `DICompositeType` => `MDSubroutineType*`.
- Composite types: `DICompositeType` => `MDCompositeType*`.
Note that `DIDescriptor` wraps `MDNode`, and `DICompositeType` wraps
`MDCompositeTypeBase`.
It's this new type strictness that requires changes here.
llvm-svn: 235112
Emits the following code for the clause at the beginning of the outlined function for implicit threads:
if (<not a master thread>) {
...
<thread local copy of var> = <master thread local copy of var>;
...
}
<sync point>;
Checking for a non-master thread is performed by comparing of the address of the thread local variable with the address of the master's variable. Master thread always uses original variables, so you always know the address of the variable in the master thread.
Differential Revision: http://reviews.llvm.org/D9026
llvm-svn: 235075
#pragma omp for lastprivate(<var>)
for (i = a; i < b; ++b)
<BODY>;
This construct is translated into something like:
<last_iter> = alloca i32
<lastprivate_var> = alloca <type>
<last_iter> = 0
; No initializer for simple variables or a default constructor is called for objects.
; For arrays perform element by element initialization by the call of the default constructor.
...
OMP_FOR_START(...,<last_iter>, ..); sets <last_iter> to 1 if this is the last iteration.
<BODY>
...
OMP_FOR_END
if (<last_iter> != 0) {
<var> = <lastprivate_var> ; Update original variable with the lastprivate value.
}
call __kmpc_cancel_barrier() ; an implicit barrier to avoid possible data race.
Differential Revision: http://reviews.llvm.org/D8658
llvm-svn: 235074
Things can't both be in comdats and have common linkage, so never give things
in comdats common linkage. Common linkage is only used in .c files, and the
only thing that can trigger a comdat in c is selectany from what I can tell.
Fixes PR23243.
Also address an over-the-shoulder review comment from rnk by moving the
hasAttr<SelectAnyAttr>() in Decl.cpp around a bit. It only makes a minor
difference for selectany on global variables, so it goes well with the rest of
this patch.
http://reviews.llvm.org/D9042
llvm-svn: 235053
This reverts commit r234767, as it was breaking all ARM buildbots for two days and the
assert is not in the code, making it difficult to spot the error, which would keep the
bots red for a few more days. New errors were silently introduced because of this bug,
and we don't want this to escalate.
llvm-svn: 234983
Adds proper codegen for 'firstprivate' clause in for directive. Initially codegen for 'firstprivate' clause was implemented for 'parallel' directive only.
Also this patch emits sync point only after initialization of firstprivate variables, not all private variables. This sync point is not required for privates, lastprivates etc., only for initialization of firstprivate variables.
Differential Revision: http://reviews.llvm.org/D8660
llvm-svn: 234978
Follow up to r234962, start respecting `-emit-llvm-uselists even for
LLVM assembly. Note that the driver never passes this flag; this is
just a interface convenience/consistency for those using `-cc1`
directly. This required LLVM r234969 (and predecessors).
llvm-svn: 234970
Stop relying on `cl::opt` to pass along the driver's decision to
preserve use-lists. Create a new `-cc1` option called
`-emit-llvm-uselists` that does the right thing (when -emit-llvm-bc).
Note that despite its generic name, it *doesn't* do the right thing when
-emit-llvm (LLVM assembly) yet. I'll hook that up soon.
This doesn't really change the behaviour of the driver. The default is
still to preserve use-lists for `clang -emit-llvm` and `clang
-save-temps`, and nothing else. But it stops relying on global state
(and also is a nicer interface for hackers using `clang -cc1`).
llvm-svn: 234962
Reverts the code changes from r234675 but keeps the test case.
We were already maintaining a DenseMap of globals with dynamic
initializers anyway.
Fixes the test case from PR23234.
llvm-svn: 234961
Now that `addBitcodeWriterPass()` requires an explicit bit to preserve
use-list order, send it in from `clang`. It looks like I'll be able to
push this up to the `-cc1` options.
llvm-svn: 234960
Fixed a bug with codegen of variables with array types specified in 'copyprivate' clause of 'single' directive.
Differential Revision: http://reviews.llvm.org/D8914
llvm-svn: 234856
Stop using `DIDescriptor`'s wrapper around
`MDNode::replaceAllUsesWith()` (which is going away). The new home for
this logic is `DIBuilder::replaceTemporary()`, added in LLVM r234695.
llvm-svn: 234696
Follow-up to r234666. With this, the -m[no-]global-merge options
have the expected behavior. Previously, -mglobal-merge was ignored,
and there was no way of enabling the optimization.
llvm-svn: 234668
Even though these symbols are in a comdat group, the Microsoft linker
really wants them to have internal linkage.
I'm planning to tweak the mangling in a follow-up change. This is a
straight revert with a 1-line fix.
llvm-svn: 234613
Now that TailRecursionElimination has been fixed with r222354, the
threshold on size for lifetime marker insertion can be removed. This
only affects named temporary though, as the patch for unnamed temporaries
is still in progress.
My previous commit (r222993) was not handling debuginfo correctly, but
this could only be seen with some asan tests. Basically, lifetime markers
are just instrumentation for the compiler's usage and should not affect
debug information; however, the cleanup infrastructure was assuming it
contained only destructors, i.e. actual code to be executed, and was
setting the breakpoint for the end of the function to the closing '}', and
not the return statement, in order to show some destructors have been
called when leaving the function. This is wrong when the cleanups are only
lifetime markers, and this is now fixed.
llvm-svn: 234581
WinEHPrepare was going to have to pattern match the control flow merge
and split that the old lowering used, and that wasn't really feasible.
Now we can teach WinEHPrepare to pattern match this, which is much
simpler:
%fp = call i8* @llvm.frameaddress(i32 0)
call void @func(iN [01], i8* %fp)
This prototype happens to match the prototype used by the Win64 SEH
personality function, so this is really simple.
llvm-svn: 234532
The driver currently accepts but ignores the -freciprocal-math flag.
This patch passes the flag through and enables 'arcp' fast-math-flag
generation in IR.
Note that this change does not actually enable the optimization for
any target. The reassociation optimization that this flag specifies
was implemented by http://reviews.llvm.org/D6334 :
http://llvm.org/viewvc/llvm-project?view=revision&revision=222510
Because the optimization is done in the backend rather than IR,
the backend must be modified to understand instruction-level
fast-math-flags or a new function-level attribute must be created.
Also note that -freciprocal-math is independent of any target-specific
usage of reciprocal estimate hardware instructions. That requires
its own flag ('-mrecip').
https://llvm.org/bugs/show_bug.cgi?id=20912
llvm-svn: 234493
_CxxFrameHandler3 calls terminate if a cleanup action throws, regardless
of what bits you put in the xdata tables. There's no need to model this
in the IR, since we just have to take it out later.
llvm-svn: 234448
The test should be fixed. It was failing in NDEBUG builds due to a
missing '*' character in a regex. In asserts builds, the pattern matched
a single digit value, which became a double digit value in NDEBUG
builds. Go figure.
This reverts commit r234261.
llvm-svn: 234447
Update a few calls to `DIBuilder` now that `MDTuple` array-wrappers
don't have implicit conversions to `MDTuple*`. I may circle back and
update `DIBuilder` to take arrays here, to make it easier for the
callers.
llvm-svn: 234327
Error message was:
CGDebugInfo.cpp(1047) : error C2666: 'llvm::MDTypeRefArray::operator []' : 2 overloads have similar conversions
DebugInfoMetadata.h(106): could be 'llvm::MDTypeRef llvm::MDTypeRefArray::operator [](unsigned int) const'
while trying to match the argument list '(llvm::DITypeArray, int)'
llvm-svn: 234308
The catch object parameter to llvm.eh.begincatch is optional, and can be
null. We can save some ourselves the stack space, copy ctor, and dtor
calls if we pass null.
llvm-svn: 234264
While capturing filters aren't very common, we'd like to outline
__finally blocks in the frontend to simplify -O0 EH preparation and
reduce code size. Finally blocks are usually have captures, and this is
the first step towards that.
Currently we don't support capturing 'this' or VLAs.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D8825
llvm-svn: 234261
The clang edition of r234255: use built-in `isa<>`, `dyn_cast<>`, etc.,
and only build `DIDescriptor`s from pointers that are correctly typed.
llvm-svn: 234256
`getScope()` passes the scope back through a `DILexicalBlock` even
though the underlying pointer may be an incompatible `MDSubprogram`.
Just use `getContext()` directly.
llvm-svn: 234245
Don't assume that all pointers are convertible to void pointer.
Instead correctly respect [conv.ptr]p2; only allow pointer types with an
object pointee type to be caught as pointer-to-void.
llvm-svn: 234090
Not all of them (there's still a fallback for this specific function
that omits the type parameter) but it's some I bothered to do now.
llvm-svn: 234063
C structs.
This comes up when we have a function that takes a struct and is defined in a
C++ file and used in a C file.
Before this commit, we will generate byval for C++ and will expand the struct
for C, thus causing difference at IR level. We will use bitcast of function type
at the callsite, which causes the inliner to not inline the function.
This commit changes how we handle small C like structs at IR level, but at
backend, we should generate the same argument passing before and after the
commit.
Note that the condition for expanding is still over conservative. We should be
able to expand type that is spelled with “class” and types that are not C-like.
But this commit fixes the inconsistent argument passing between C/C++.
Reviewed by John.
rdar://20121030
llvm-svn: 234033
Now the GEP constant utility functions require the type to be explicitly
passed (since eventually the pointer type will be opaque and not convey
the required type information). For now callers can still pass nullptr
(though none were needed here in Clang, which is nice) if
convenienc/necessary, but eventually that will be disallowed as well.
llvm-svn: 233937
This uses the same class metadata currently used for virtual call and
cast checks.
The new flag is -fsanitize=cfi-nvcall. For consistency, the -fsanitize=cfi-vptr
flag has been renamed -fsanitize=cfi-vcall.
Differential Revision: http://reviews.llvm.org/D8756
llvm-svn: 233874
The zEC12 provides the transactional-execution facility. This is exposed
to users via a set of builtin routines on other compilers. This patch
adds clang support to enable those builtins. In partciular, the patch:
- enables the transactional-execution feature by default on zEC12
- allows to override presence of that feature via the -mhtm/-mno-htm options
- adds a predefined macro __HTM__ if the feature is enabled
- adds support for the transactional-execution GCC builtins
- adds Sema checking to verify the __builtin_tabort abort code
- adds the s390intrin.h header file (for GCC compatibility)
- adds s390 sections to the htmintrin.h and htmxlintrin.h header files
Since this is first use of target-specific intrinsics on the platform,
the patch creates the include/clang/Basic/BuiltinsSystemZ.def file and
hooks it up in TargetBuiltins.h and lib/Basic/Targets.cpp.
An associated LLVM patch adds the required LLVM IR intrinsics.
For reference, the transactional-execution instructions are documented
in the z/Architecture Principles of Operation for the zEC12:
http://publibfp.boulder.ibm.com/cgi-bin/bookmgr/download/DZ9ZR009.pdf
The associated builtins are documented in the GCC manual:
http://gcc.gnu.org/onlinedocs/gcc/S_002f390-System-z-Built-in-Functions.html
The htmxlintrin.h intrinsics provided for compatibility with the IBM XL
compiler are documented in the "z/OS XL C/C++ Programming Guide".
llvm-svn: 233804
MSVC treats all non-empty exception specifications the same way: all
exceptions are permitted. The .xdata tables provide a way to
efficiently lower exception specifications *but* this probably has to be
implemented as a catch-all/rethrow mechanism instead of the Itanium way.
This fixes PR23092.
llvm-svn: 233787
The argument range checks for the HTM and Crypto builtins were implemented in
CGBuiltin.cpp, not in Sema. This change moves them to the appropriate location
in SemaChecking.cpp. It requires the creation of a new method in the Sema class
to do checks for PPC-specific builtins.
http://reviews.llvm.org/D8672
llvm-svn: 233586
Running the GCC's inter-compiler ABI compatibility test suite uncovered
a couple of errors in clang's SystemZ ABI implementation. These all
affect only rare corner cases:
- Short vector types
GCC synthetic vector types defined with __attribute__ ((vector_size ...))
are always passed and returned by reference. (This is not documented in
the official ABI document, but is the de-facto ABI implemented by GCC.)
clang would do that only for vector sizes >= 16 bytes, but not for shorter
vector types.
- Float-like aggregates and empty bitfields
clang would consider any aggregate containing an empty bitfield as
first element to be a float-like aggregate. That's obviously wrong.
According to the ABI doc, the presence of an empty bitfield makes
an aggregate to be *not* float-like. However, due to a bug in GCC,
empty bitfields are ignored in C++; this patch changes clang to be
compatible with this "feature" of GCC.
- Float-like aggregates and va_arg
The va_arg implementation would mis-detect some aggregates as float-like
that aren't actually passed as such. This applies to aggregates that
have only a single element of type float or double, but using an aligned
attribute that increases the total struct size to more than 8 bytes.
This error occurred because the va_arg implement used to have an copy
of the float-like aggregate detection logic (i.e. it would call the
isFPArgumentType routine, but not perform the size check).
To simplify the logic, this patch removes the duplicated logic and
instead simply checks the (possibly coerced) LLVM argument type as
already determined by classifyArgumentType.
llvm-svn: 233543
Adds atomic update codegen for the following forms of expressions:
x binop= expr;
x++;
++x;
x--;
--x;
x = x binop expr;
x = expr binop x;
If x and expr are integer and binop is associative or x is a LHS in a RHS of the assignment expression, and atomics are allowed for type of x on the target platform atomicrmw instruction is emitted.
Otherwise compare-and-swap sequence is emitted:
bb:
...
atomic load <x>
cont:
<expected> = phi [ <x>, label %bb ], [ <new_failed>, %cont ]
<desired> = <expected> binop <expr>
<res> = cmpxchg atomic &<x>, desired, expected
<new_failed> = <res>.field1;
br <res>field2, label %exit, label %cont
exit:
...
Differential Revision: http://reviews.llvm.org/D8536
llvm-svn: 233513
Replace boolean IsExplicit parameter of OpenMPRuntime::emitBarrierCall() method by OpenMPDirectiveKind Kind for better compatibility with the runtime library. Also add processing of 'nowait' clause on worksharing directives.
Differential Revision: http://reviews.llvm.org/D8659
llvm-svn: 233511
Utilizing IMAGEREL relocations for synthetic IR constructs isn't
valuable, just clutter. While we are here, simplify HandlerType names
by making the numeric value for the 'adjective' part of the mangled name
instead of appending '.const', etc. The old scheme made for very long
global names and leads to wordy things like '.std_bad_alloc'
llvm-svn: 233503
An upcoming LLVM commit will make calling
`DIBuilder::retainType(nullptr)` illegal (actually, it already was, but
it wasn't verified). Check for null before calling.
This triggered in test/CodeGenObjC/debug-info-block-helper.m.
llvm-svn: 233443
Summary:
Due to CUDA's implicit address space casting, the type of a static local
variable may be more specific (i.e. with address space qualifiers) than
the type expected by the constructor. Emit an addrspacecast in that
case.
Test Plan: Clang used to crash on the added test.
Reviewers: nlewycky, pcc, eliben, rsmith
Reviewed By: eliben, rsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8575
llvm-svn: 233208
This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07
(POWER8). The intrinsic support is based on GCC one [1], with both 'PowerPC HTM
Low Level Built-in Functions' and 'PowerPC HTM High Level Inline Functions'
implemented.
Along with builtins a new driver switch is added to enable/disable HTM
instruction support (-mhtm) and a header with common definitions (mostly to
parse the TFHAR register value). The HTM switch also sets a preprocessor builtin
HTM.
The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on
powerpc64 and powerpc64le.
This is send along a llvm patch to enabled the builtins and option switch.
[1]
https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html
Phabricator Review: http://reviews.llvm.org/D8248
llvm-svn: 233205
When we try to find the end loc for a token, we have to re-lex the
token. This was running into a problem when we'd store the end loc of
a macro's coverage region, since we wouldn't actually be at the
beginning of a token when we tried to re-lex it, leading us to do
silly things (and eventually assert) when whitespace or comments
followed.
This pushes our use of getPreciseTokenLocEnd earlier, so that we won't
call it when it doesn't make sense to. It also removes an unnecessary
adjustment by 1 that was working around this problem in some cases.
llvm-svn: 233169
We previously defaulted to long double, but it's also possible to have
a half inc/dec amount, when LangOpts NativeHalfType is set.
Currently, that's only true for OpenCL.
llvm-svn: 233135
On AArch64, the -fallow-half-args-and-returns option is the default.
With it, the half type is considered legal (rather than the i16 used
normally for __fp16), but no operation is, except conversions and
load/stores and such.
The previous behavior was tantamount to saying LangOpts.NativeHalfType
was implied by LangOpts.HalfArgsAndReturns, which isn't true.
Instead, teach the various parts of CodeGen that already know about
half (using the intrinsics or not) about this weird in-between case,
where the "half" type is legal, but operations on it aren't.
This is a smaller intermediate step to the end-goal of removing the
intrinsic, always using "half", and letting the backend legalize.
Builds on r232968.
rdar://20045970, rdar://17468714
Differential Revision: http://reviews.llvm.org/D8367
llvm-svn: 232971
Fix the CodeGen so that for types bigger than float, instead of
converting to fp16 via the sequence "InTy -> float -> fp16", we
perform conversions in just one step. This avoids the double
rounding which potentially changes results from a natural
IEEE-754 operation.
rdar://17594379, rdar://17468714
Differential Revision: http://reviews.llvm.org/D4602
Part of: http://reviews.llvm.org/D8367
llvm-svn: 232968
If there is at least one 'copyprivate' clause is associated with the single directive, the following code is generated:
```
i32 did_it = 0; \\ for 'copyprivate' clause
if(__kmpc_single(ident_t *, gtid)) {
SingleOpGen();
__kmpc_end_single(ident_t *, gtid);
did_it = 1; \\ for 'copyprivate' clause
}
<copyprivate_list>[0] = &var0;
...
<copyprivate_list>[n] = &varn;
call __kmpc_copyprivate(ident_t *, gtid, <copyprivate_list_size>,
<copyprivate_list>, <copy_func>, did_it);
...
void<copy_func>(void *LHSArg, void *RHSArg) {
Dst = (void * [n])(LHSArg);
Src = (void * [n])(RHSArg);
Dst[0] = Src[0];
... Dst[n] = Src[n];
}
```
All list items from all 'copyprivate' clauses are gathered into single <copyprivate list> (<copyprivate_list_size> is a size in bytes of this list) and <copy_func> is used to propagate values of private or threadprivate variables from the 'single' region to other implicit threads from outer 'parallel' region.
Differential Revision: http://reviews.llvm.org/D8410
llvm-svn: 232932
The linear variable is privatized (similar to 'private') and its
value on current iteration is calculated, similar to the loop
counter variables.
Differential revision: http://reviews.llvm.org/D8375
llvm-svn: 232890
the target-cpu, if different from the triple's cpu, and
target-features as they're written that are passed down from the
driver.
Together with LLVM r232885 this should allow the LTO'ing of binaries
that contain modules compiled with different code generation options
on a subset of architectures with full backend support (x86, powerpc,
aarch64).
llvm-svn: 232888
When we instrument a program for profiling, we copy the linkage of an
instrumented function so that our datastructures merge in the same way
as the function. This avoids redundant copies for things like
linkonce, but ends up emitting names we never need to reference for
normal and internal symbols. Promoting internal and external linkage
to private for these variables reduces the size overhead of profiling
drastically.
llvm-svn: 232799
A WIP patch to turn on stricter `DIDescriptor` accessor checks fires
here; it's obvious from the code that `T` can be null, so add an
explicit check. Caught by dozens of current testcases.
llvm-svn: 232791
There are no widely deployed standard libraries providing sized
deallocation functions, so we have to punt and ask the user if they want
us to use sized deallocation. In the future, when such libraries are
deployed, we can teach the driver to detect them and enable this
feature.
N3536 claimed that a weak thunk from sized to unsized deallocation could
be emitted to avoid breaking backwards compatibility with standard
libraries not providing sized deallocation. However, this approach and
other variations don't work in practice.
With the weak function approach, the thunk has to have default
visibility in order to ensure that it is overridden by other DSOs
providing sized deallocation. Weak, default visibility symbols are
particularly expensive on MachO, so John McCall was considering
disabling this feature by default on Darwin. It also changes behavior
ELF linking behavior, causing certain otherwise unreferenced object
files from an archive to be pulled into the link.
Our second approach was to use an extern_weak function declaration and
do an inline conditional branch at the deletion call site. This doesn't
work because extern_weak only works on MachO if you have some archive
providing the default value of the extern_weak symbol. Arranging to
provide such an archive has the same challenges as providing the symbol
in the standard library. Not to mention that extern_weak doesn't really
work on COFF.
Reviewers: rsmith, rjmccall
Differential Revision: http://reviews.llvm.org/D8467
llvm-svn: 232788
Somehow, we never managed to implement this fully. We could constant
fold it like crazy, including constant folding complex arguments, etc.
But if you actually needed to generate code for it, error.
I've implemented it using the somewhat obvious lowering. Happy for
suggestions on a more clever way to lower this.
Now, what you might ask does this have to do with modules? Fun story. So
it turns out that libstdc++ actually uses __builtin_isinf_sign to
implement std::isinf when in C++98 mode, but only inside of a template.
So if we're lucky, and we never instantiate that, everything is good.
But once we try to instantiate that template function, we need this
builtin. All of my customers at least are using C++11 and so they never
hit this code path.
But what does that have to do with modules? Fun story. So it turns out
that with modules we actually observe a bunch of bugs in libstdc++ where
their <cmath> header clobbers things exposed by <math.h>. To fix these,
we have to provide global function definitions to replace the macros
that C99 would have used. And it turns out that ::isinf needs to be
implemented using the exact semantics used by the C++98 variant of
std::isinf. And so I started to fix this bug in libstdc++ and ceased to
be able to compile libstdc++ with Clang.
The yaks are legion.
llvm-svn: 232778
This reverts commit r230580.
extern_weak functions don't appear to work on Darwin (PR22951), so we'll
need to come up with a new approach.
llvm-svn: 232731
There will be an explicit template instantiation in another translation
unit which will provide the definition of the VF/VB-Tables.
This fixes PR22932.
llvm-svn: 232680
Codegen for threadprivate variables (and in some other cases) may cause crash of the compiler if some diagnostic is produced later. This happens because some of the autogenerated globals are not removed from InternalVars StringMap when llvm::Module is reset.
Differential Revision: http://reviews.llvm.org/D8360
llvm-svn: 232610