Use a utility function to check whether the number of elements is a power of 2
and drop the redundant upper limit (a 128-bit vector with more than 16 elements
would have each element < 8 bits, not possible).
llvm-svn: 268422
If we are devirtualizing, then we want to compute the 'this' adjustment
of the devirtualized target, not the adjustment of the base's method
definition, which is usually zero.
Fixes PR27621
llvm-svn: 268418
r268261 made Clang "expand" more struct arguments on Windows. It removed
the check for 'RD->isCLike()', which was preventing us from attempting
to expand structs with reference type fields.
Our expansion code was attempting to load and pass each field of the
type in turn. We were accidentally doing one to many loads on reference
type fields.
On the function prologue side, we can use
EmitLValueForFieldInitialization, which obviously gets the address of
the field. On the call side, I tweaked EmitRValueForField directly,
since this is the only use of this method.
Fixes PR27607
llvm-svn: 268321
It isn't necessary to call hasDefaultArg because we can't rematerialize
a captured variable that is a function parameter, regardless of whether
or not it has a default argument. NFC.
llvm-svn: 268318
parameters in the body of a block.
This fixes a bug where clang would materialize the default argument
inside the body of a block instead of passing the value via the block
descriptor.
For example, in the code below, foo1 would always print 42 regardless
of the value of argument "a" passed to foo1.
void foo1(const int a = 42 ) {
auto block = ^{
printf("%d\n", a);
};
block();
}
rdar://problem/24449235
llvm-svn: 268314
Before this change, we would pass all non-HFA record arguments on
Windows with byval. Byval often blocks optimizations and results in bad
code generation. Windows now uses the existing workaround that other
x86_32 platforms use.
I also expanded the workaround to handle C++ records with constructors
on Windows. On non-Windows platforms, we have to keep generating the
same LLVM IR prototypes if we want our bitcode to be ABI compatible.
Otherwise we will encounter mismatch issues like PR21573.
Essentially fixes PR27522 in Clang instead of LLVM.
Reviewers: hans
Differential Revision: http://reviews.llvm.org/D19756
llvm-svn: 268261
http://reviews.llvm.org/D18474
This patch fixes a bug in code generation of the correct OpenMP runtime library call in presence of target and teams, when target is separated by teams with multiple curly brackets. The current implementation will not be able to see the teams directive inside target and issue a call to tgt_target instead of the correct one tgt_target_teams.
llvm-svn: 267972
Summary:
Port rL265324 to SystemZ to allow using the 'swiftcall' attribute on that architecture.
Depends on D19414.
Reviewers: kbarton, rjmccall, uweigand
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D19432
llvm-svn: 267879
OpenMP 4.5 defines 'taskloop simd' directive, which is combined
directive for 'taskloop' and 'simd' directives. Patch adds initial
codegen support for this directive and its 2 basic clauses 'safelen' and
'simdlen'.
llvm-svn: 267872
It makes compiler-rt tests fail if the gold plugin is enabled.
Revert "Rework interface for bitset-using features to use a notion of LTO visibility."
Revert "Driver: only produce CFI -fvisibility= error when compiling."
Revert "clang/test/CodeGenCXX/cfi-blacklist.cpp: Exclude ms targets. They would be non-cfi."
llvm-svn: 267871
directive.
OpenMP 4.5 defines 'taskloop' directive and 2 additional clauses
'grainsize' and 'num_tasks' for this directive. Patch adds codegen for
these clauses.
These clauses are generated as arguments of the '__kmpc_taskloop'
libcall and are encoded the following way:
void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val, kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup, int sched, kmp_uint64 grainsize, void *task_dup);
If 'grainsize' is specified, 'sched' argument must be set to '1' and
'grainsize' argument must be set to the value of the 'grainsize' clause.
If 'num_tasks' is specified, 'sched' argument must be set to '2' and
'grainsize' argument must be set to the value of the 'num_tasks' clause.
It is possible because these 2 clauses are mutually exclusive and can't
be used at the same time on the same directive.
If none of these clauses is specified, 'sched' argument must be set to
'0'.
llvm-svn: 267862
Summary:
This patch adds support for the target exit data directive code generation.
Given that, apart from the employed runtime call, target exit data requires the same code generation pattern as target enter data, the OpenMP codegen entry point was renamed and reused for both.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17369
llvm-svn: 267814
Summary: This patch adds support for the target enter data directive code generation.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17368
llvm-svn: 267812
Summary:
This patch adds support for the target data directive code generation.
Part of the already existent functionality related with data maps is moved to a new function so that it could be reused.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17367
llvm-svn: 267811
Summary:
Implement codegen for the map clause. All the new list items in 4.5 specification are supported.
Fix bug in the generation of array sections that was exposed by some of the map clause tests: for pointer types the offsets have to be calculated from the pointee not the pointer.
Reviewers: hfinkel, kkwli0, carlo.bertolli, arpith-jacob, ABataev
Subscribers: ABataev, cfe-commits, caomhin, fraggamuffin
Differential Revision: http://reviews.llvm.org/D16749
llvm-svn: 267808
Bitsets, and the compiler features they rely on (vtable opt, CFI),
only have visibility within the LTO'd part of the linkage unit. Therefore,
only enable these features for classes with hidden LTO visibility. This
notion is based on object file visibility or (on Windows)
dllimport/dllexport attributes.
We provide the [[clang::lto_visibility_public]] attribute to override the
compiler's LTO visibility inference in cases where the class is defined
in the non-LTO'd part of the linkage unit, or where the ABI supports
calling classes derived from abstract base classes with hidden visibility
in other linkage units (e.g. COM on Windows).
If the cross-DSO CFI mode is enabled, bitset checks are emitted even for
classes with public LTO visibility, as that mode uses a separate mechanism
to cause bitsets to be exported.
This mechanism replaces the whole-program-vtables blacklist, so remove the
-fwhole-program-vtables-blacklist flag.
Because __declspec(uuid()) now implies [[clang::lto_visibility_public]], the
support for the special attr:uuid blacklist entry is removed.
Differential Revision: http://reviews.llvm.org/D18635
llvm-svn: 267784
Summary:
As of D18614, TargetMachine exposes a hook to add a set of passes that should
be run as early as possible. Invoke this hook from clang when setting up the
pass manager.
Reviewers: chandlerc
Subscribers: rnk, cfe-commits, tra
Differential Revision: http://reviews.llvm.org/D18617
llvm-svn: 267764
Make 'nodebug' on a global/static variable suppress all debug info
for the variable. Previously it would only suppress info for the
associated initializer function, if any.
Differential Revision: http://reviews.llvm.org/D19567
llvm-svn: 267746
declare reductions.
If reduction clause is applied to instance of class with user-defined
reduction operation without initialization clause, it may cause a crash.
Patch fixes this issue.
llvm-svn: 267695
Currently there is a problem with codegen of inlined directives inside
lambdas, it may cause a crash during codegen because of incorrect
capturing of variables. Patch fixes this problem.
llvm-svn: 267677
SPIR target can be used for C/C++ inputs too (i.e. in OpenCL compatible mode for the libs creation).
Patch by Neil Henning!
Review: http://reviews.llvm.org/D19478
llvm-svn: 267561
instantiation is in a module.
This patch fixes the condition for determining whether the debug info for a
template instantiation will exist in an imported clang module by:
- checking whether the ClassTemplateSpecializationDecl is complete and
- checking that the instantiation was in a module by looking at the first field.
I also added a negative check to make sure that a typedef to a forward-declared
template (with the definition outside of the module) is handled correctly.
http://reviews.llvm.org/D19443
rdar://problem/25553724
llvm-svn: 267464
The taskloop construct specifies that the iterations of one or more associated loops will be executed in parallel using OpenMP tasks. The iterations are distributed across tasks created by the construct and scheduled to be executed.
The next code will be generated for the taskloop directive:
#pragma omp taskloop num_tasks(N) lastprivate(j)
for( i=0; i<N*GRAIN*STRIDE-1; i+=STRIDE ) {
int th = omp_get_thread_num();
#pragma omp atomic
counter++;
#pragma omp atomic
th_counter[th]++;
j = i;
}
Generated code:
task = __kmpc_omp_task_alloc(NULL,gtid,1,sizeof(struct
task),sizeof(struct shar),&task_entry);
psh = task->shareds;
psh->pth_counter = &th_counter;
psh->pcounter = &counter;
psh->pj = &j;
task->lb = 0;
task->ub = N*GRAIN*STRIDE-2;
task->st = STRIDE;
__kmpc_taskloop(
NULL, // location
gtid, // gtid
task, // task structure
1, // if clause value
&task->lb, // lower bound
&task->ub, // upper bound
STRIDE, // loop increment
0, // 1 if nogroup specified
2, // schedule type: 0-none, 1-grainsize, 2-num_tasks
N, // schedule value (ignored for type 0)
(void*)&__task_dup_entry // tasks duplication routine
);
llvm-svn: 267395
LLVM really wants a debug location on every inlinable call in a function
with debug info, because it otherwise cannot set up inlining debug info.
This change applies an artificial line 0 debug location (which is how
DWARF marks automatically generated code that has no corresponding
source code) to the .__kmpc_global_dtor_. functions to avoid the
LLVM Verifier complaining.
llvm-svn: 267369
LLVM stopped using MDString-based type references, and DIBuilder no
longer fills 'retainedTypes:' with every DICompositeType that has an
'identifier:' field. There are just minor changes to keep the same
behaviour in CFE.
Leaving 'retainedTypes:' unfilled has a dramatic impact on the output
order of the IR though. There are a huge number of testcase changes,
which were unfortunately not really scriptable.
llvm-svn: 267297
Write out the PGOFuncName meta data if PGOFuncName is different from function's
raw name. This should only apply to internal linkage functions. This is to be
consumed by indirect-call promotion when called in LTO optimization pass.
Differential Revision: http://reviews.llvm.org/D18624
llvm-svn: 267224
For various reasons, involving dllexport and class linkage compuations,
we have to wait until after the semicolon after a class declaration to
emit inline methods. These are "deferred" decls. Before this change,
finishing the tag decl would trigger us to deserialize some PCH so that
we could make a "pretty" IR-level type. Deserializing the PCH triggered
calls to HandleTopLevelDecl, which, when done, checked the deferred decl
list, and emitted some dllexported decls that weren't ready.
Avoid this re-entrancy. Deferred decls should not get emitted when a tag
is finished, they should only be emitted after a real top level decl in
the main file.
llvm-svn: 267186
causes code generation failure.
The codegen part of firstprivate clause for member decls used type of
original variable without skipping reference type from
OMPCapturedExprDecl. Patch fixes this problem.
llvm-svn: 267125
If loop control variable for simd-based directives is explicitly marked
as linear/lastprivate in clauses, codegen for such construct would
crash. Patch fixes this problem.
llvm-svn: 267101
Summary:
Adds a framework to enable the instrumentation pass for the new
EfficiencySanitizer ("esan") family of tools. Adds a flag for esan's
cache fragmentation tool via -fsanitize=efficiency-cache-frag.
Adds appropriate tests for the new flag.
Reviewers: eugenis, vitalybuka, aizatsky, filcab
Subscribers: filcab, kubabrecka, llvm-commits, zhaoqin, kcc
Differential Revision: http://reviews.llvm.org/D19169
llvm-svn: 267059
in the compile unit that contains their implementation even if their
interface is declared in a module.
The private @implementation of an @interface may have additional
hidden ivars so we should not defer to the public version of the
type that is found in the module.
<rdar://problem/25541798>
llvm-svn: 266937
If the untied clause is present on a task construct, any thread in the
team can resume the task region after a suspension. Patch adds proper
codegen for untied tasks.
llvm-svn: 266853
Summary:
This is a follow-on to apply Duncan's new DIType ODR uniquing from
r266549 and r266713 in more places.
When invoking ThinLTO backend compiles via clang (for a distributed
build), invoke enableDebugTypeODRUniquing() before parsing the module.
Reviewers: dexonsmith, joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19264
llvm-svn: 266852
The intrinsic is now called llvm.thread.pointer, not
llvm.aarch64.thread.pointer. Also, the code handling it in CGBuiltin.cpp
is dead - it's already covered by GCCBuiltin. Remove it.
Differential Revision: http://reviews.llvm.org/D19099
llvm-svn: 266817
r259537 added vfma/vfms to armv7, but the builtin was only lowered
on the AArch64 side. Instead of supporting it on ARM, get rid of it.
The vfms builtin lowered to:
%nb = fsub float -0.0, %b
%r = @llvm.fma.f32(%a, %nb, %c)
Instead, define the operation in terms of vfma, and swap the
multiplicands. It now lowers to:
%na = fsub float -0.0, %a
%r = @llvm.fma.f32(%na, %b, %c)
This matches the instruction more closely, and lets current LLVM
generate the "natural" operand ordering:
fmls.2s v0, v1, v2
instead of the crooked (but equivalent):
fmls.2s v0, v2, v1
Except for theses changes, assembly is identical.
LLVM accepts both commutations, and the LLVM tests in:
test/CodeGen/AArch64/arm64-fmadd.ll
test/CodeGen/AArch64/fp-dp3.ll
test/CodeGen/AArch64/neon-fma.ll
test/CodeGen/ARM/fusedMAC.ll
already check either the new one only, or both.
Also verified against the test-suite unittests.
llvm-svn: 266807
Currently, for the ppc64--gnu and aarch64 ABIs, we recognize:
typedef __attribute__((__ext_vector_type__(3))) float v3f32;
typedef __attribute__((__ext_vector_type__(16))) char v16i8;
struct HFA {
v3f32 a;
v16i8 b;
};
as an HFA. Since the first type encountered is used as the base type,
we pass the HFA as:
[2 x <3 x float>]
Which leads to incorrect IR (relying on padding values) when the
second field is used.
Instead, explicitly widen the vector (after size rounding) in
isHomogeneousAggregate.
Differential Revision: http://reviews.llvm.org/D18998
llvm-svn: 266784
If the untied clause is present on a task construct, any thread in the
team can resume the task region after a suspension. Patch adds proper
codegen for untied tasks.
llvm-svn: 266754
If the untied clause is present on a task construct, any thread in the team can resume the task region after a suspension. Patch adds proper codegen for untied tasks.
llvm-svn: 266722
Since elements of most kinds of DICompositeType have back references,
most are involved in uniquing cycles. Except via the ODR 'identifier:'
field, which doesn't care about the storage type (see r266549),
they have no hope of being uniqued.
Distinct nodes are far more efficient, so use them for most kinds of
DICompositeType definitions (i.e., when DIType::isForwardDecl is false).
The exceptions:
- DW_TAG_array_type, since their elements never have back-references
and they never have ODR 'identifier:' fields;
- DW_TAG_enumeration_type when there is no ODR 'identifier:' field,
since their elements usually don't have back-references.
This breaks the last major uniquing cycle I'm aware of in the debug info
graph. The impact won't be enormous for C++ because references to
ODR-uniqued nodes still use string-based DITypeRefs; but this should
prevent a regression in C++ when we drop the string-based references.
This wouldn't have been reasonable until r266549, when composite types
stopped relying on being uniqued by structural equivalence to prevent
blow-ups at LTO time.
llvm-svn: 266556
Since this patch provided support for the __float128 type but disabled it
on all platforms by default, some platforms can't compile type_traits with
-std=gnu++11 since there is a specialization with __float128.
This reverts the patch until D19125 is approved (i.e. we know which platforms
need this support enabled).
llvm-svn: 266460
Non-owning pointers that cache LLVM types and constants can use
'nullptr' default member initializers so that we don't need to mention
them in the constructor initializer list.
Owning pointers should use std::unique_ptr so that we don't need to
manually delete them in the destructor. They also don't need to be
mentioned in the constructor at that point.
NFC
llvm-svn: 266263
of a table entry in the corresponding decl, store an offset from the current
record to the relevant CXX_CTOR_INITIALIZERS record. This results in fewer
indirections and a minor .pcm file size reduction.
llvm-svn: 266254
This patch corresponds to review:
http://reviews.llvm.org/D15120
It adds support for the __float128 keyword, literals and a target feature to
enable it. This support is disabled by default on all targets and any target
that has support for this type is free to add it.
Based on feedback that I've received from target maintainers, this appears to
be the right thing for most targets. I have not heard from the maintainers of
X86 which I believe supports this type. I will subsequently investigate the
impact of enabling this on X86.
llvm-svn: 266186
Putting OpenCLImageTypes.def to clangAST library violates layering requirement: "It's not OK for a Basic/ header to include an AST/ header".
This fixes the modules build.
Differential revision: http://reviews.llvm.org/D18954
Reviewers: Richard Smith, Vassil Vassilev.
llvm-svn: 266180
exiting the for-in loop.
This commit fixes a bug where EmitObjCForCollectionStmt didn't pop
cleanups for captures.
For example, in the following for-in loop, a block which captures self
is passed to foo1:
for (id x in [self foo1:^{ use(self); }]) {
use(x);
break;
}
Previously, the code in EmitObjCForCollectionStmt wouldn't pop the
cleanup for the captured self before exiting the loop, which caused
code-gen to generate an IR in which objc_release was called twice on
the captured self.
This commit fixes the bug by entering a RunCleanupsScope before the
loop condition is evaluated and forcing its cleanup before exiting the
loop.
rdar://problem/16865751
Differential Revision: http://reviews.llvm.org/D18618
llvm-svn: 266147
Clang should pass -backend-option to LLVM even though there is no target machine, since LLVM passes are used when emitting LLVM IR.
Differential Revision: http://reviews.llvm.org/D17552
llvm-svn: 266117
In codegen different address spaces may be mapped to the same address
space for a target, e.g. in x86/x86-64 all address spaces are mapped
to 0. Therefore AddressSpaceConversion should be translated by
CreatePointerBitCastOrAddrSpaceCast instead of CreateAddrSpaceCast.
Differential Revision: http://reviews.llvm.org/D18713
llvm-svn: 266107
OpenMP 4.0 defines clause 'uniform' in 'declare simd' directive:
'uniform' '(' <argument-list> ')'
The uniform clause declares one or more arguments to have an invariant value for all concurrent invocations of the function in the execution of a single SIMD loop.
The special this pointer can be used as if was one of the arguments to the function in any of the linear, aligned, or uniform clauses.
llvm-svn: 266041
This patch add support for GCC attribute((ifunc("resolver"))) for
targets that use ELF as object file format. In general ifunc is a
special kind of function alias with type @gnu_indirect_function. LLVM
patch http://reviews.llvm.org/D15525
Differential Revision: http://reviews.llvm.org/D15524
llvm-svn: 265917
profiling and optimization remarks and indicate that no debug info shall
be emitted for these compile units.
http://reviews.llvm.org/D18808
<rdar://problem/25427165>
llvm-svn: 265862
It is possible to argue that the EABIVersion field is similar in spirit to the
ABI field in TargetOptions. It represents the embedded ABI that the target
follows. This will allow us to thread this information into the target
information construction.
llvm-svn: 265807
Revert the two changes to thread CodeGenOptions into the TargetInfo allocation
and to fix the layering violation by moving CodeGenOptions into Basic.
Code Generation is arguably not particularly "basic". This addresses Richard's
post-commit review comments. This change purely does the mechanical revert and
will be followed up with an alternate approach to thread the desired information
into TargetInfo.
llvm-svn: 265806
I. Current implementation of images is not conformant to spec in the following points:
1. It makes no distinction with respect to access qualifiers and therefore allows to use images with different access type interchangeably. The following code would compile just fine:
void write_image(write_only image2d_t img);
kernel void foo(read_only image2d_t img) { write_image(img); } // Accepted code
which is disallowed according to s6.13.14.
2. It discards access qualifier on generated code, which leads to generated code for the above example:
call void @write_image(%opencl.image2d_t* %img);
In OpenCL2.0 however we can have different calls into write_image with read_only and wite_only images.
Also generally following compiler steps have no easy way to take different path depending on the image access: linking to the right implementation of image types, performing IR opts and backend codegen differently.
3. Image types are language keywords and can't be redeclared s6.1.9, which can happen currently as they are just typedef names.
4. Default access qualifier read_only is to be added if not provided explicitly.
II. This patch corrects the above points as follows:
1. All images are encapsulated into a separate .def file that is inserted in different points where image handling is required. This avoid a lot of code repetition as all images are handled the same way in the code with no distinction of their exact type.
2. The Cartesian product of image types and image access qualifiers is added to the builtin types. This simplifies a lot handling of access type mismatch as no operations are allowed by default on distinct Builtin types. Also spec intended access qualifier as special type qualifier that are combined with an image type to form a distinct type (see statement above - images can't be created w/o access qualifiers).
3. Improves testing of images in Clang.
Author: Anastasia Stulova
Reviewers: bader, mgrang.
Subscribers: pxli168, pekka.jaaskelainen, yaxunl.
Differential Revision: http://reviews.llvm.org/D17821
llvm-svn: 265783
CodeGen-level implementation. Instead of adding an attribute to clang's
FunctionDecl, add the IR attribute directly. This means a module built with
this flag is now compatible with code built without it and vice versa.
This change also results in the 'noalias' attribute no longer being added to
calls to operator new in the IR; it's now only added to the declaration. It
also fixes a bug where we failed to add the attribute to the 'nothrow' versions
(because we didn't implicitly declare them, there was no good time to inject a
fake attribute).
llvm-svn: 265728
This is a mechanical move of CodeGenOptions from libFrontend to libBasic. This
fixes the layering violation introduced earlier by threading CodeGenOptions into
TargetInfo. It should also fix the modules based self-hosting builds. NFC.
llvm-svn: 265702
isinf (is infinite) and isfinite should be implemented with the same function
except we change the comparison operator.
See PR27145 for more details:
https://llvm.org/bugs/show_bug.cgi?id=27145
Ref: forked off of the discussion in D18513.
Differential Revision: http://reviews.llvm.org/D18648
llvm-svn: 265675
Turns out it was there mostly to prevent Clang asking people to report a bug.
This time we report something to Clang's real diagnostics handler so that it
exits with something approximating a real error and tidies up after itself.
llvm-svn: 265592
Summary: See LLVM change D18775 for details, this change depends on it.
Reviewers: jyknight, reames
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18776
llvm-svn: 265569
Summary:
Setting this flag causes all functions are annotated with the
"nvvm-f32ftz" = "true" attribute.
In addition, we annotate the module with "nvvm-reflect-ftz" set
to 0 or 1, depending on whether -cuda-flush-denormals-to-zero is set.
This is read by the NVVMReflect pass.
Reviewers: tra, rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18671
llvm-svn: 265435
Add no-jump-tables flag to disable use of jump tables when lowering
switch statements
Reviewers: echristo, hans
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18407
llvm-svn: 265425
This patch implements the teams directive for the NVPTX backend. It is different from the host code generation path as it:
Does not call kmpc_fork_teams. All necessary teams and threads are started upon touching the target region, when launching a CUDA kernel, and their execution is coordinated through sequential and parallel regions within the target region.
Does not call kmpc_push_num_teams even if a num_teams of thread_limit clause is present. Setting the number of teams and the thread limit is implemented by the nvptx-related runtime.
Please note that I am now passing a Clang Expr * to emitPushNumTeams instead of the originally chosen llvm::Value * type. The reason for that is that I want to avoid emitting expressions for num_teams and thread_limit if they are not needed in the target region.
http://reviews.llvm.org/D17963
llvm-svn: 265304
The objc_runtime_visible attribute deals with an odd corner case where
a particular Objective-C class is known to the Objective-C runtime
(and, therefore, accessible by name) but its symbol has been hidden
for some reason. For such classes, teach CodeGen to use
objc_lookUpClass to retrieve the Class object, rather than referencing
the class symbol directly.
Classes annotated with objc_runtime_visible have two major limitations
that fall out from places where Objective-C metadata needs to refer to
the class (or metaclass) symbol directly:
* One cannot implement a subclass of an objc_runtime_visible class.
* One cannot implement a category on an objc_runtime_visible class.
Implements rdar://problem/25494092.
llvm-svn: 265201
landing pads.
Previously, lifetime.end intrinsics were inserted only on normal control
flows. This prevented StackColoring from merging stack slots for objects
that were destroyed on the exception handling control flow since it
couldn't tell their lifetime ranges were disjoint. This patch fixes
code-gen to emit the intrinsic on both control flows.
rdar://problem/22181976
Differential Revision: http://reviews.llvm.org/D18196
llvm-svn: 265197
Whatever crash it was there to present appears to have been fixed in the
backend now, and it had the nasty side-effect of causing clang to exit(0) and
leave a .o containing goodness knows what even when an error hit.
llvm-svn: 265038
Value profiling should not profile constants and/or constant
expressions when they appear as callees in call instructions.
Constant expressions form when a direct callee has bitcasts or
inttoptr(ptrtint (callee)) nests surrounding it. Value profiling
should avoid instrumenting such cases. Mostly NFC.
llvm-svn: 265037
alignment on Darwin.
Itanium C++ ABI specifies that _Unwind_Exception should be double-word
aligned (16B). To conform to the ABI, libraries implementing exception
handling declare the struct with __attribute__((aligned)), which aligns
the unwindHeader field (and the end of __cxa_exception) to the default
target alignment (which is typically 16-bytes).
struct __cxa_exception {
...
// struct is declared with __attribute__((aligned)).
_Unwind_Exception unwindHeader;
};
Based on the assumption that _Unwind_Exception is declared with
__attribute__((aligned)), ItaniumCXXABI::getAlignmentOfExnObject returns
the target default alignment for __attribute__((aligned)). It turns out
that libc++abi, which is used on Darwin, doesn't declare the struct with
the attribute and therefore doesn't guarantee that unwindHeader is
aligned to the alignment specified by the ABI, which in some cases
causes the program to crash because of unaligned memory accesses.
This commit avoids crashes due to unaligned memory accesses by having
getAlignmentOfExnObject return an 8-byte alignment on Darwin. I've only
fixed the problem for Darwin, but we should also figure out whether other
platforms using libc++abi need similar fixes.
rdar://problem/25314277
Differential revision: http://reviews.llvm.org/D18479
llvm-svn: 264998
...as that is apparently what MSVC does. This is an updated version of r263738,
which had to be reverted in r263740 due to test failures. The original version
had erroneously emitted functions that are defined in class templates, too (see
the updated "Handle friend functions" code in EmitDeferredDecls,
lib/CodeGen/ModuleBuilder.cpp). (The updated tests needed to be split out into
their own dllexport-ms-friend.cpp because of the CHECK-NOTs which would have
interfered with subsequent CHECK-DAGs in dllexport.cpp.)
Differential Revision: http://reviews.llvm.org/D18430
llvm-svn: 264841
For terminator instructions, the value profiling instrumentation
happens in a basic block other than where the value site resides.
This CR moves the instrumentation point prior to the value site.
Mostly NFC.
llvm-svn: 264783
For better support of some specific GNU extensions some extra
transformation of AST nodes were introduced. These transformations are
very hard to handle. The code is improved in handling of these
extensions by using captured expressions construct.
llvm-svn: 264709
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
(required) code to support target specific codegen.
llvm-svn: 264700
to function names
Summary:
Hopefully this will make it easier for the next person to figure all
this out...
Reviewers: bogner, davidxl
Subscribers: davidxl, cfe-commits
Differential Revision: http://reviews.llvm.org/D18489
llvm-svn: 264681
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
(required) code to support target specific codegen.
llvm-svn: 264576
Solution unifies interface of RegionCodeGenTy type to allow insert
runtime-specific code before/after main codegen action defined in
CGStmtOpenMP.cpp file. Runtime should not define its own RegionCodeGenTy
for general OpenMP directives, but must be allowed to insert its own
(required) code to support target specific codegen.
llvm-svn: 264569
I broke this back in r264529 because I forgot to serialize the UuidAttr
member. Fix this by replacing the UuidAttr with a StringRef which is
properly serialized and deserialized.
llvm-svn: 264562
The _GUID_ descriptors emitted by MSVC have alignment 8 for 64-bit
builds: we should do the same if the linker picks the "wrong" COMDAT.
llvm-svn: 264530
Keep a pointer to the UuidAttr that the CXXUuidofExpr corresponds to.
This makes translating from __uuidof to the underlying constant a lot
more straightforward.
llvm-svn: 264529
Summary:
This patch adds base support for codegen of the target directive on the NVPTX device.
Reviewers: ABataev
Differential Revision: http://reviews.llvm.org/D17877
Reworked test case after buildbot failure on windows.
Updated patch to integrate r263837 and test case nvptx_target_firstprivate_codegen.cpp.
llvm-svn: 264018
These functions cannot be implemented as atomicrmw or cmpxchg
instructions, so they are implemented as a call to the NVVM intrinsics
@llvm.nvvm.atomic.load.inc.32.p0i32 and
@llvm.nvvm.atomic.load.dec.32.p0i32.
Patch by Jason Henline.
Reviewers: jlebar
Differential Revision: http://reviews.llvm.org/D18322
llvm-svn: 264009
This reverts commit r263607.
This change caused more objc_retain/objc_release calls in the IR but those
are then incorrectly optimized by the ARC optimizer. Work is going to have
to be done to ensure the ARC optimizer doesn't optimize user written RR, but
that should land before this change.
This change will also need to be updated to take account for any changes required
to ensure that user written calls to RR are distinct from those inserted by ARC.
llvm-svn: 263984
Implement lambda capture of *this by copy.
For e.g.:
struct A {
int d = 10;
auto foo() { return [*this] (auto a) mutable { d+=a; return d; }; }
};
auto L = A{}.foo(); // A{}'s lifetime is gone.
// Below is still ok, because *this was captured by value.
assert(L(10) == 20);
assert(L(100) == 120);
If the capture was implicit, or [this] (i.e. *this was captured by reference), this code would be otherwise undefined.
Implementation Strategy:
- amend the parser to accept *this in the lambda introducer
- add a new king of capture LCK_StarThis
- teach Sema::CheckCXXThisCapture to handle by copy captures of the
enclosing object (i.e. *this)
- when CheckCXXThisCapture does capture by copy, the corresponding
initializer expression for the closure's data member
direct-initializes it thus making a copy of '*this'.
- in codegen, when assigning to CXXThisValue, if *this was captured by
copy, make sure it points to the corresponding field member, and
not, unlike when captured by reference, what the field member points
to.
- mark feature as implemented in svn
Much gratitude to Richard Smith for his carefully illuminating reviews!
llvm-svn: 263921
This makes sure we don't generate a lot of code to spill/reload
CSRs when calling tls_init from the access functions.
This helps performance when tls_init is not inlined into the access
functions.
llvm-svn: 263854
This patch implements the following aspects:
It extends sema to check that a variable is not reference in both a map clause and firstprivate or private. This is needed to ensure correct functioning at codegen level, apart from being useful for the user.
It implements firstprivate for target in codegen. The implementation applies to both host and nvptx devices.
It adds regression tests for codegen of firstprivate, host and device side when using the host as device, and nvptx side.
Please note that the regression test for nvptx codegen is missing VLAs. This is because VLAs currently require saving and restoring the stack which appears not to be a supported operation by nvptx backend.
It adds a check in sema regression tests for target map, firstprivate, and private clauses.
http://reviews.llvm.org/D18203
llvm-svn: 263837
Summary:
r246764 handled __fp16 arguments and returns for AAPCS, but skipped this
handling for OpenCL. Simlar to OpenCL, RenderScript also handles __fp16
type natively.
This patch adds the -fnative-half-arguments-and-returns command line
flag to allow such languages to skip this coercion of __fp16.
Reviewers: srhines, olista01
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18138
llvm-svn: 263795
Summary:
Reworked test case after buildbot failure on windows.
This patch adds base support for codegen of the target directive on the NVPTX device.
Reviewers: ABataev
Differential Revision: http://reviews.llvm.org/D17877
llvm-svn: 263783
Summary: ...as that is apparently what MSVC does
Reviewers: rnk
Patch by Stephan Bergmann
Differential Revision: http://reviews.llvm.org/D15267
llvm-svn: 263738
OpenMP 4.0 allows to define custom reduction operations using '#pragma
omp declare reduction' construct. Patch allows to use this custom
defined reduction operations in 'reduction' clauses.
llvm-svn: 263701
This patch adds support for codegen of private clause of target and a regression test for host code generation, when the host is used as target device. I believe that code generation for nvptx backend would not require anything additional or different to what is done for the host.
http://reviews.llvm.org/D18105
llvm-svn: 263654
Till now, preserve_mostcc/preserve_allcc calling convention attributes were only
available at the LLVM IR level. This patch adds attributes for
preserve_mostcc/preserve_allcc calling conventions to the C/C++ front-end.
The code was mostly written by Juergen Ributzka.
I just added support for the AArch64 target and tests.
Differential Revision: http://reviews.llvm.org/D18025
llvm-svn: 263647
It is faster to directly call the ObjC runtime for methods such as retain/release instead of sending a message to those functions.
This patch adds support for converting messages to retain/release/alloc/autorelease to their equivalent runtime calls.
Tests included for the positive case of applying this transformation, negative tests that we ensure we only convert "alloc" to objc_alloc, not "alloc2", and also a driver test to ensure we enable this only for supported runtime versions.
Reviewed by John McCall.
Differential Revision: http://reviews.llvm.org/D14737
llvm-svn: 263607
Summary:
This patch adds base support for codegen of the target directive on the NVPTX device.
Reviewers: ABataev
Differential Revision: http://reviews.llvm.org/D17877
llvm-svn: 263587
In the cross-DSO CFI mode clang emits __cfi_check_fail that handles
errors triggered from other modules with targets in the current
module. With this change, __cfi_check_fail will handle errors for
CFI kinds that are not enabled in the current module as if they
have the trapping behaviour (-fsanitize-trap=...).
This fixes a bug where some combinations of -fsanitize* flags may
result in a link failure due to a missing sanitizer runtime library
for the diagnostic calls in __cfi_check_fail.
llvm-svn: 263578
Summary:
This patch adds base support for codegen of the target directive on the NVPTX device.
Reviewers: ABataev
Differential Revision: http://reviews.llvm.org/D17877
llvm-svn: 263552
This is the companion to an LLVM patch that renamed the function index
data structures and files to use the more general module summary index.
(Recommit after fixing LLVM side to add back missed file)
llvm-svn: 263514
This is the companion to an LLVM patch that renamed the function index
data structures and files to use the more general module summary index.
llvm-svn: 263491
The relative vtable ABI will use a struct rather than an array as the type
of a vtable. LLVM only allows 32-bit integers as struct indices, so we need
to use 32-bit integers to get addresses of address points. In order to keep
the code simple, we might as well do that unconditionally.
It's probably a reasonable implementation limit to support no more than 2
billion virtual functions per class.
This change causes quite a bit of churn in the test suite, so I'm making
it separately.
Differential Revision: http://reviews.llvm.org/D18113
llvm-svn: 263469
This marks virtual function declarations, as well as runtime library functions
__cxa_pure_virtual, __cxa_deleted_virtual and _purecall, as unnamed_addr. This
will allow us to correctly form relative references to them from vtables in
the relative vtable ABI.
Differential Revision: http://reviews.llvm.org/D18071
llvm-svn: 263464
Summary:
This flag is enabled by default in the driver when NDEBUG is set. It
is forwarded on the LLVMContext to discard all value names (but
GlobalValue) for performance purpose.
This an improved version of D18024
Reviewers: echristo, chandlerc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18127
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263394
commit 60d9845f6a037122d9be9a6d92d4de617ef45b04
Author: Mehdi Amini <mehdi.amini@apple.com>
Date: Fri Mar 11 18:48:02 2016 +0000
Fix clang crash: when CodeGenAction is initialized without a
context, use the member and not the parameter
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@263273
91177308-0d34-0410-b5e6-96231b3b80d8
commit af7ce3bf04a75ad5124b457b805df26006bd215b
Author: Mehdi Amini <mehdi.amini@apple.com>
Date: Fri Mar 11 17:32:58 2016 +0000
Fix build: use -> with pointers and not .
Silly typo.
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@263267
91177308-0d34-0410-b5e6-96231b3b80d8
commit d0eea119192814954e7368c77d0dc5a9eeec1fbb
Author: Mehdi Amini <mehdi.amini@apple.com>
Date: Fri Mar 11 17:15:44 2016 +0000
Remove compile time PreserveName switch based on NDEBUG
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.
Reviewers: chandlerc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18024
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@263257
91177308-0d34-0410-b5e6-96231b3b80d8
until we can fix the Release builds.
This reverts commits 263257, 263267, 263273
llvm-svn: 263320
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.
Reviewers: chandlerc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18024
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263257
Includes new built-in, conversion of built-in to target-independent intrinsic
and update in the header file. Tests are also updated. There is a second part in
the backend for which I will post a separate code-review. BACKEND PART SHOULD BE
COMMITTED FIRST.
Phabricator: http://reviews.llvm.org/D17816
llvm-svn: 263051
OpenMP 4.5 allows privatization of non-static data members in OpenMP
constructs. Patch adds proper codegen support for data members in
'linear' clause
llvm-svn: 263003
This patch provide basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling.
http://reviews.llvm.org/D17170
llvm-svn: 262832
This patch provide basic implementation of codegen for teams directive, excluding all clauses except dist_schedule. It also fixes parts of AST reader/writer to enable correct pre-compiled header handling.
http://reviews.llvm.org/D17170
llvm-svn: 262741
Use it to calculate UserLabelPrefix, instead of specifying it (often
incorrectly).
Note that the *actual* user label prefix has always come from the
DataLayout, and is handled within LLVM. The main thing clang's
TargetInfo::UserLabelPrefix did was to set the #define value. Having
these be different from each-other is just silly.
Differential Revision: http://reviews.llvm.org/D17183
llvm-svn: 262737
While pushing switch statements onto the region stack we neglected to
specify their start/end locations. This results in a crash (PR26825) if
we end up in nested macro expansions without enough information to
handle the relevant file exits.
I added a test in switchmacro.c and fixed up a bunch of incorrect CHECK
lines that specify strange end locations for switches.
llvm-svn: 262697
For compatibility with GCC, classify __m64 as SSE.
However, clang is a platform compiler for certain targets; retain our
old behavior on those targets: classify __m64 as integer.
This fixes PR26832.
llvm-svn: 262688
Add code generation support for firstprivate and private clauses of teams on the host. Add extensive regression tests including lambda functions and vla testing.
http://reviews.llvm.org/D17582
llvm-svn: 262663
Summary:
This patch implements the launching of a target region in the presence of a nested teams region, i.e calls tgt_target_teams with the required arguments gathered from the enclosed teams directive.
The actual codegen of the region enclosed by the teams construct will be contributed in a separate patch.
Reviewers: hfinkel, arpith-jacob, kkwli0, carlo.bertolli, ABataev
Subscribers: cfe-commits, caomhin, fraggamuffin
Differential Revision: http://reviews.llvm.org/D17019
llvm-svn: 262625
Add parsing, sema analysis and serialization/deserialization for 'declare reduction' construct.
User-defined reductions are defined as
#pragma omp declare reduction( reduction-identifier : typename-list : combiner ) [initializer ( initializer-expr )]
These custom reductions may be used in 'reduction' clauses of OpenMP constructs. The combiner specifies how partial results can be combined into a single value. The
combiner can use the special variable identifiers omp_in and omp_out that are of the type of the variables being reduced with this reduction-identifier. Each of them will
denote one of the values to be combined before executing the combiner. It is assumed that the special omp_out identifier will refer to the storage that holds the resulting
combined value after executing the combiner.
As the initializer-expr value of a user-defined reduction is not known a priori the initializer-clause can be used to specify one. Then the contents of the initializer-clause
will be used as the initializer for private copies of reduction list items where the omp_priv identifier will refer to the storage to be initialized. The special identifier
omp_orig can also appear in the initializer-clause and it will refer to the storage of the original variable to be reduced.
Differential Revision: http://reviews.llvm.org/D11182
llvm-svn: 262582
OpenMP 4.5 allows to privatize data members of current class in member
functions. Patch adds initial support for privatization of data members
in 'linear' clause, no codegen support.
llvm-svn: 262578
This patch changes cc1 option for PGO profile use from
-fprofile-instr-use=<path> to -fprofile-instrument-use-path=<path>.
-fprofile-instr-use=<path> is now a driver only option.
In addition to decouple the cc1 option from the driver level option, this patch
also enables IR level profile use. cc1 option handling now reads the profile
header and sets CodeGenOpt ProfileUse (valid values are {None, Clang, LLVM}
-- this is a common enum for -fprofile-instrument={}, for the profile
instrumentation), and invoke the pipeline to enable the respective PGO use pass.
Reviewers: silvas, davidxl
Differential Revision: http://reviews.llvm.org/D17737
llvm-svn: 262515
This is like r262493, but for pragma detect_mismatch instead of pragma comment.
The two pragmas have similar behavior, so use the same approach for both.
llvm-svn: 262506
... and register them with CUDA runtime.
This is needed for commonly used cudaMemcpy*() APIs that use address of
host-side shadow to access their counterparts on device side.
Fixes PR26340
Differential Revision: http://reviews.llvm.org/D17779
llvm-svn: 262498
`#pragma comment` was handled by Sema calling a function on ASTConsumer, and
CodeGen then implementing this function and writing things to its output.
Instead, introduce a PragmaCommentDecl AST node and hang one off the
TranslationUnitDecl for every `#pragma comment` line, and then use the regular
serialization machinery. (Since PragmaCommentDecl has codegen relevance, it's
eagerly deserialized.)
http://reviews.llvm.org/D17799
llvm-svn: 262493
Sema allows max values up to 2**28, use unsigned instead of unsiged
short to hold values that large.
Differential Revision: http://reviews.llvm.org/D17248
Patch by Don Hinton!
llvm-svn: 262466
OpenMP 4.5 allows to privatize non-static data members of current class
in non-static member functions. Patch supports codegen for non-static
data members in 'reduction' clauses.
llvm-svn: 262460
We'd lose track of the parent CodeGenFunction, leading us to get
confused with regard to which function a nested finally belonged to.
Differential Revision: http://reviews.llvm.org/D17752
llvm-svn: 262379
This patch expands cc1 option -fprofile-instrument= with a new value: -fprofile-instrument=llvm
which enables IR level PGO instrumentation.
Reviewers: davidxl, silvas
Differential Revision: http://reviews.llvm.org/D17622
llvm-svn: 262239
Summary:
OpenCL access qualifiers are now not only used for image types, refine it to avoid misleading,
Add semacheck for OpenCL access qualifier as well as test caees.
Reviewers: pekka.jaaskelainen, Anastasia, aaron.ballman
Subscribers: aaron.ballman, cfe-commits
Differential Revision: http://reviews.llvm.org/D16040
llvm-svn: 261961
OpenMP 4.5 allows to privatize non-static member decls in non-static
member functions. Patch captures such decls by reference in general (for
bitfields, by value) and then operates with this capture. For bitfields,
at the end of codegen for lastprivates original bitfield is updated with the value of captured copy.
llvm-svn: 261824
Summary:
This is important for e.g. the following case:
void sync() { __syncthreads(); }
void foo() {
do_something();
sync();
do_something_else():
}
Without this change, if the optimizer does not inline sync() (which it
won't because __syncthreads is also marked as noduplicate, for now
anyway), it is free to perform optimizations on sync() that it would not
be able to perform on __syncthreads(), because sync() is not marked as
convergent.
Similarly, we need a notion of convergent calls, since in the case when
we can't statically determine a call's target(s), we need to know
whether it's safe to perform optimizations around the call.
This change is conservative; the optimizer will remove these attrs where
it can, see r260318, r260319.
Reviewers: majnemer
Subscribers: cfe-commits, jhen, echristo, tra
Differential Revision: http://reviews.llvm.org/D17056
llvm-svn: 261779
This patch introduces the -fwhole-program-vtables flag, which enables the
whole-program vtable optimization feature (D16795) in Clang.
Differential Revision: http://reviews.llvm.org/D16821
llvm-svn: 261767
This is mainly for extensibility. Note that fragile category metadata,
metadata for classes and protocols all have a size field.
Initial patch was provided by Greg Parker.
rdar://problem/24804226
llvm-svn: 261756
Fixes PR11517 for SPARC.
On most targets, clang lowers va_arg itself, eschewing the use of the
llvm vaarg instruction. This is necessary (at least for now) as the type
argument to the vaarg instruction cannot represent all the ABI
information that is needed to support complex calling conventions.
However, on targets with a simpler varrags ABIs, the LLVM instruction
can work just fine, and clang can simply lower to it. Unfortunately,
even on such targets, vaarg with a struct argument would fail, because
the default lowering to vaarg was naive: it didn't take into account the
ABI attribute computed by classifyArgumentType. In particular, for the
DefaultABIInfo, structs are supposed to be passed indirectly and so
llvm's vaarg instruction should be emitted with a pointer argument.
Now, vaarg instruction emission is able to use computed ABIArgInfo for
the provided argument type, which allows the default ABI support to work
for structs too.
I haven't touched the EmitVAArg implementation for PPC32_SVR4 or XCore,
although I believe both are now redundant, and could be switched over to
use the default implementation as well.
Differential Revision: http://reviews.llvm.org/D16154
llvm-svn: 261717
Remove an unnecessary workaround introduced in r259975. (NFC)
Now that LLVM r259973 allows replacing a temporary type with another
temporary we can rely on the original implementation.
It is possible for enums to be created as part of
their own declcontext. In this case a FwdDecl will be created
twice. This doesn't cause a problem because both FwdDecls are
entered into the ReplaceMap: finalize() will replace the first
FwdDecl with the second and then replace the second with
complete type.
Thanks to echristo for pointing this out.
# Conflicts:
# lib/CodeGen/CGDebugInfo.cpp
llvm-svn: 261673
Now that LLVM r259973 allows replacing a temporary type with another
temporary we can rely on the original implementation.
It is possible for enums to be created as part of
their own declcontext. In this case a FwdDecl will be created
twice. This doesn't cause a problem because both FwdDecls are
entered into the ReplaceMap: finalize() will replace the first
FwdDecl with the second and then replace the second with
complete type.
Thanks to echristo for pointing this out.
llvm-svn: 261657
This uses the general emitVoidPtrVAArg lowering logic for everything, since
this supports all types, and we don't have any special requirements.
llvm-svn: 261557
We gave a VBTable dllimport storage class and external linkage while
also providing an initializer. An initializer is only valid if the
VBTable has available_externally linkage. Fix this by setting the
linkage to available_externally in situ while generating the
initializer.
This fixes PR26686.
llvm-svn: 261535
This modification applies the following Android commit when we have an
Android environment. This is the sole non-renderscript in the Android repo
commit 9212d4fb30a3ca2f4ee966dd2748c35573d9682c
Author: Tim Murray <timmurray@google.com>
Date: Fri Aug 15 16:00:15 2014 -0700
Update vector calling convention for AArch64.
bug 16846318
Change-Id: I3cfd167758b4bd634d8480ee6ba6bb55d61f82a7
Reviewers: srhines, jyknight
Subscribers: mcrosier, aemerson, rengolin, tberghammer, danalbert, srhines
Differential Revision: http://reviews.llvm.org/D17448
llvm-svn: 261533
It can happen that when we only have 1 more register left in the regsave
area we need to store a value bigger than 1 register and therefore we
go to the overflow area. In this case we have to leave the last slot
in the regsave area unused and keep using overflow area. Do this
by storing a limit value to the used register counter in the overflow block.
Issue diagnosed by and solution tested by Mark Millard!
llvm-svn: 261422
Add support for opencl_unroll_hint attribute from OpenCL v2.0 s6.11.5.
Reusing most of metadata generation from CGLoopInfo helper class.
The code is based on Khronos OpenCL compiler:
https://github.com/KhronosGroup/SPIR/tree/spirv-1.0
Patch by Liu Yaxun (Sam)!
Differential Revision: http://reviews.llvm.org/D16686
llvm-svn: 261350
Llvm module object is shared between CodeGenerator and BackendConsumer,
in both classes it is stored as std::unique_ptr, which is not a good
design solution and can cause double deletion error. Usually it does
not occur because in BackendConsumer::HandleTranslationUnit the
ownership of CodeGenerator over the module is taken away. If however
this method is not called, the module is deleted twice and compiler crashes.
As the module owned by BackendConsumer is always the same as CodeGenerator
has, pointer to llvm module can be removed from BackendGenerator.
Differential Revision: http://reviews.llvm.org/D15450
llvm-svn: 261222
Patch fixes bug with codegen for lastprivate loop counters. Also it may
improve performance for lastprivates calculations in some cases.
llvm-svn: 261209
The assert is triggered because isObjCRetainableType() is called on the
canonicalized return type that has been stripped of the typedefs and
attributes attached to it. To fix this assert, this commit gets the
original return type from CurCodeDecl or BlockInfo and uses it instead
of the canoicalized type.
rdar://problem/24470031
Differential Revision: http://reviews.llvm.org/D16914
llvm-svn: 261151
Summary: Use the new pipeline implemented in D17115
Reviewers: tejohnson
Subscribers: joker.eph, cfe-commits
Differential Revision: http://reviews.llvm.org/D17272
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 261045
Expressions inside 'schedule'|'dist_schedule' clause must be captured in
combined directives to avoid possible crash during codegen. Patch
improves handling of such constructs
llvm-svn: 260954
Sync barrier will be emitted after generation of firstprivate variables
only if one of the firstprivate vars is used in lastprivate clause.
llvm-svn: 260877
Summary:
Unlike other outlined regions in OpenMP, offloading entry points have to have be visible (external linkage) for the device side. Using dots in the names of the entries can be therefore problematic for some toolchains, e.g. NVPTX.
Also the patch drops the column information in the unique name of the entry points. The parsing of directives ignore unknown tokens, preventing several target regions to be implemented in the same line. Therefore, the line information is sufficient for the name to be unique. Also, the preprocessor printer does not preserve the column information, causing offloading-entry detection issues if the host uses an integrated preprocessor and the target doesn't (or vice versa).
Reviewers: hfinkel, arpith-jacob, carlo.bertolli, kkwli0, ABataev
Subscribers: cfe-commits, fraggamuffin, caomhin
Differential Revision: http://reviews.llvm.org/D17179
llvm-svn: 260837
This commit changes the root from "Simple C/C++ TBAA" to "Simple C++ TBAA" for
C++.
The problem is that the type name in the TBAA nodes is generated differently
for C vs C++. If we link an IR file for C with an IR file for C++, since they
have the same root and the type names are different, accesses to the two type
nodes will be considered no-alias, even though the two type nodes are from
the same type in a header file.
The fix is to use different roots for C and C++. Types from C will be treated
conservatively in respect to types from C++.
Follow-up commits will change the C root to "Simple C TBAA" plus some mangling
change for C types to make it a little more aggresive.
llvm-svn: 260567
This reverts commit r260449.
We would supress our emission of vftable definitions if we thought
another translation unit would provide the definition because we saw an
explicit instantiation declaration. This is not the case with
dllimport, we want to synthesize a definition of the vftable regardless.
This fixes PR26569.
llvm-svn: 260548
The current macho linker just copies symbols in section datacoal_nt to
section data, so it doesn't really matter whether or not section
"datacoal_nt" is attached to the global variable.
This is a follow-up to r250370, which made changes in llvm to stop
putting functions and data in the *coal* sections.
rdar://problem/24528611
llvm-svn: 260496
OMPCapturedExprDecl allows caopturing not only of fielddecls, but also
other expressions. It also allows to simplify codegen for several
clauses.
llvm-svn: 260492
Summary:
We can't do the right thing, since there's no right thing to do, but at
least we can not crash the compiler.
Reviewers: majnemer, rnk
Subscribers: cfe-commits, jhen, tra
Differential Revision: http://reviews.llvm.org/D17103
llvm-svn: 260479
Referencing a dllimported vtable is impossible in a constexpr
constructor. It would be friendlier to C++ programmers if we
synthesized a copy of the vftable which referenced imported virtual
functions. This would let us initialize the object in a way which
preserves both the intent to import functionality from another DLL while
also making constexpr work.
Differential Revision: http://reviews.llvm.org/D17061
llvm-svn: 260388
being called with the wrong size: convert CGFunctionInfo to use TrailingObjects
and ask TrailingObjects to provide a working 'operator delete' for us.
llvm-svn: 260181
When handling 'if' statements, we crash if the condition and the consequent
branch are spanned by a single macro expansion.
The crash occurs because of a sanity 'reset' in popRegions(): if an expansion
exactly spans an entire region, we set MostRecentLocation to the start of the
expansion (its 'include location'). This ensures we don't handleFileExit()
ourselves out of the expansion before we're done processing all of the regions
within it. This is tested in test/CoverageMapping/macro-expressions.c.
This causes a problem when an expansion spans both the condition and the
consequent branch of an 'if' statement. MostRecentLocation is updated to the
start of the 'if' statement in popRegions(), so the file for the expansion
isn't exited by the time we're done handling the statement. We then crash with
'fatal: File exit not handled before popRegions'.
The fix for this is to detect these kinds of expansions, and conservatively
update MostRecentLocation to the end of expansion region containing the
conditional. I've added tests to make sure we don't have the same problem with
other kinds of statements.
rdar://problem/23630316
Differential Revision: http://reviews.llvm.org/D16934
llvm-svn: 260129
OpenMP 4.5 introduces privatization of non-static data members of current class in non-static member functions.
To correctly handle such kind of privatization a new (pseudo)declaration VarDecl-based node is added. It allows to reuse an existing code for capturing variables in Lambdas/Block/Captured blocks of code for correct privatization and codegen.
llvm-svn: 260077
A dllimport'd vtable always points one past the RTTI data, this means
that the initializer will never end up referencing the data. Our
emission is a harmless waste.
llvm-svn: 260062
Summary:
Different devices may in some cases require different code generation schemes in order to implement OpenMP. This is required not only for performance reasons, but also because it may not be possible to have the current (default) implementation working for these devices. E.g. GPU's cannot implement the same scheme a target such as powerpc or x86b would use, in the sense that it does not have the ability to fork threads, instead all the threads are always executing and need to be managed by the implementation.
This patch proposes a reorganization of the code in the OpenMP code generation to pave the way to have specialized implementation of OpenMP support. More than a "real" patch this is more a request for comments in order to understand if what is proposed is acceptable or if there are better/easier ways to do it.
In this patch part of the common OpenMP codegen infrastructure is moved to a new file under a new namespace (CGOpenMPCommon) so it can be shared between the default implementation and the specialized one. When CGOpenMPRuntime is created, an attempt to select a specialized implementation is done.
In the patch a specialization for nvptx targets is done which currently checks if the target is an OpenMP device and trap if it is not.
Let me know comments suggestions you may have.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: Hahnfeld, cfe-commits, fraggamuffin, caomhin, jholewinski
Differential Revision: http://reviews.llvm.org/D16784
llvm-svn: 259977
It is possible for enums to be created as part of their own
declcontext. We need to cache a placeholder to avoid the type being
created twice before hitting the cache.
<rdar://problem/24493203>
llvm-svn: 259975
Because the Decl is explicitly passed as nullptr further up the call chain, it
is possible to invoke isa on a nullptr, which will assert. Guard against the
nullptr.
Take the opportunity to reuse the helper method rather than re-implementing this
logic.
llvm-svn: 259874
This patch changes cc1 option -fprofile-instr-generate to an enum option
-fprofile-instrument={clang|none}. It also changes cc1 options
-fprofile-instr-generate= to -fprofile-instrument-path=.
The driver level option -fprofile-instr-generate and -fprofile-instr-generate=
remain intact. This change will pave the way to integrate new PGO
instrumentation in IR level.
Review: http://reviews.llvm.org/D16730
llvm-svn: 259811
Codegen for array sections/array subscripts worked only for expressions with arrays as base. Patch fixes codegen for bases with pointer/reference types.
llvm-svn: 259776
Avoid crashing when printing diagnostics for vtable-related CFI
errors. In diagnostic mode, the frontend does an additional check of
the vtable pointer against the set of all known vtable addresses and
lets the runtime handler know if it is safe to inspect the vtable.
http://reviews.llvm.org/D16823
llvm-svn: 259716
Summary:
This patch adds parsing + sema for the target parallel for directive along with testcases.
Reviewers: ABataev
Differential Revision: http://reviews.llvm.org/D16759
llvm-svn: 259654
C++ ABI library for the same set of types for which we expect the C++ ABI
library to provide the RTTI.
Specifically:
1) __int128 and unsigned __int128 are now emitted into the ABI library. We
always expected them to be there but never actually made sure to emit them.
2) Do not expect OpenCL builtin types to have type info in the C++ ABI library.
Neither libc++abi nor libstdc++ puts them there when built with either GCC
or Clang.
This matches GCC's behavior.
llvm-svn: 259616
the details of the bug, but avoiding overloading llvm::cast with another
function template sidesteps it.
See gcc.gnu.org/PR58022 for details of the bug, and llvm.org/PR26362 for more
backgound on how it manifested in Clang. Patch by Igor Sugak!
llvm-svn: 259598
In general CUDA does not allow dynamic initialization of
global device-side variables. One exception is that CUDA allows
records with empty constructors as described in section E2.2.1 of
CUDA 7.5 Programming guide.
This patch applies initializer checks for all device-side variables.
Empty constructors are accepted, but no code is generated for them.
Differential Revision: http://reviews.llvm.org/D15305
llvm-svn: 259592
Re-commit of r258950 after fixing layering violation.
The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.
In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.
llvm-svn: 259499
The list of class properties is saved in
Old ABI: protocol->ext->class_properties (protocol->ext->size will be updated)
New ABI: protocol->class_properties (protocol->size will be updated)
rdar://23891898
llvm-svn: 259268
The list of class properties is saved in
Old ABI: category->class_properties (category->size will be updated as well)
New ABI: category->class_properties (a flag in objc_image_info to indicate
whether or not the list of class properties is present)
rdar://23891898
llvm-svn: 259267
Summary:
This is necessary to prevent llvm from generating stacksave intrinsics
around this alloca. NVVM doesn't have a stack, and we don't handle said
intrinsics.
Reviewers: rnk, echristo
Subscribers: cfe-commits, jhen, tra
Differential Revision: http://reviews.llvm.org/D16664
llvm-svn: 259122
Frontend can emit errors when releaseing the Builder. If there are errors before
or when releasing the Builder, we reset the module to stop here before invoking
the backend.
Before this commit, clang will continue to invoke the backend and backend can
crash.
Differential Revision: http://reviews.llvm.org/D16564
llvm-svn: 259116
Re-commit of r258950 after fixing layering violation.
Add backend dignostic printer for unsupported features
The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.
In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.
Differential Revision: http://reviews.llvm.org/D16591
llvm-svn: 259036
Various bits we'd like to use the new ABI actually compile with "-arch armv7k
-miphoneos-version-min=9.0". Not ideal, but also not ridiculous given how
slices work.
llvm-svn: 258976
reclaiming a call result in order to ignore it or assign it
to an __unsafe_unretained variable. This avoids adding
an unwanted retain/release pair when the return value is
not actually returned autoreleased (e.g. when it is returned
from a nonatomic getter or a typical collection accessor).
This runtime function is only available on the latest Apple
OS releases; the backwards-compatibility story is that you
don't get the optimization unless your deployment target is
recent enough. Sorry.
rdar://20530049
llvm-svn: 258962
The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.
In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.
Differential Revision: http://reviews.llvm.org/D16591
llvm-svn: 258950
Clang's CodeGen has several paths which end up invoking or calling a
function. The one that we used for calls to __RTtypeid did not
appropriately annotate the call with a funclet bundle.
This fixes PR26329.
llvm-svn: 258877
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html
"This is the way [autoconf] ends
Not with a bang but a whimper."
-T.S. Eliot
Reviewers: chandlerc, grosbach, bob.wilson, echristo
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D16472
llvm-svn: 258862
Member pointers in the MS ABI are tricky for a variety of reasons.
The size of a member pointer is indeterminate until the program reaches
a point where the representation is required to be known. However,
*pointers* to member pointers may exist without knowing the pointee
type's representation. In these cases, we synthesize an opaque LLVM
type for the pointee type.
However, we can be in a situation where the underlying member pointer's
representation became known mid-way through the program. To account for
this, we attempted to manicure CodeGen's type-cache so that we can
replace the opaque member pointer type with the real deal while leaving
the pointer types unperturbed. This, unfortunately, is a problematic
approach to take as we will violate CodeGen's invariants.
These violations are mostly harmless but let's do the right thing
instead: invalidate the type-cache if a member pointer's LLVM
representation changes.
This fixes PR26313.
llvm-svn: 258839
Summary:
This patch adds parsing + sema for the target parallel directive and its clauses along with testcases.
Reviewers: ABataev
Differential Revision: http://reviews.llvm.org/D16553
Rebased to current trunk and updated test cases.
llvm-svn: 258832
All current properties are instance properties.
This is the second patch in a series of patches to support class properties
in addition to instance properties in objective-c.
rdar://23891898
llvm-svn: 258824
Summary:
This patch adds parsing + sema for the defaultmap clause associated with the target directive (among others).
Reviewers: ABataev
Differential Revision: http://reviews.llvm.org/D16527
llvm-svn: 258817
Fix arc patch fuzz error.
Summary:
Support for the pipe built-in functions for OpenCL 2.0.
The pipe builtin functions may have infinite kinds of element types, one approach
would be to just generate calls that would always use generic types such as void*.
This patch is based on bader's opencl support patch on SPIR-V branch.
Reviewers: Anastasia, pekka.jaaskelainen
Subscribers: keryell, bader, cfe-commits
Differential Revision: http://reviews.llvm.org/D15914
llvm-svn: 258782
Summary:
Support for the pipe built-in functions for OpenCL 2.0.
The pipe builtin functions may have infinite kinds of element types, one approach
would be to just generate calls that would always use generic types such as void*.
This patch is based on bader's opencl support patch on SPIR-V branch.
Reviewers: Anastasia, pekka.jaaskelainen
Subscribers: keryell, bader, cfe-commits
Differential Revision: http://reviews.llvm.org/D15914
llvm-svn: 258773
* Runtime diagnostic data for cfi-icall changed to match the rest of
cfi checks
* Layout of all CFI diagnostic data changed to put Kind at the
beginning. There is no ABI stability promise yet.
* Call cfi_slowpath_diag instead of cfi_slowpath when needed.
* Emit __cfi_check_fail function, which dispatches a CFI check
faliure according to trap/recover settings of the current module.
* A tiny driver change to match the way the new handlers are done in
compiler-rt.
llvm-svn: 258745
Summary:
These aliases are done to support inline asm, but there's nothing we can
do: NVPTX doesn't support aliases.
Reviewers: tra
Subscribers: cfe-commits, jhen, echristo
Differential Revision: http://reviews.llvm.org/D16501
llvm-svn: 258734
In the Itanium ABI, vtable may be emitted speculatively as an
available_externally global. Such vtable may not be present at the
link time and should not have a corresponding CFI bit set entry.
llvm-svn: 258596
by stripping the path. Follow-up to r258555.
This is safe because only one PCH per CU is currently supported for
module debugging.
rdar://problem/24301262
llvm-svn: 258582
Keep the ones still used by libclc around for now.
Emit the new amdgcn intrinsic name if not targeting r600,
in which case the old AMDGPU name is still used.
llvm-svn: 258560
PCH files don't have a module signature and LLVM uses a nonzero DWO id as
an indicator for skeleton / module CUs. This change pins the DWO id for PCH
files to a known constant value.
The correct long-term solution here is to implement a module signature
that is an actual dterministic hash (at the moment module signatures are
just random nonzero numbers) and then enable this for PCH files as well.
<rdar://problem/24290667>
llvm-svn: 258507
Volatile loads of type wider than a pointer get split by MSVC because
the base x86 ISA doesn't provide loads which are wider than pointer
width. LLVM assumes that it can emit an cmpxchg8b but this is
problematic if the memory is in a CONST memory segment.
Instead, provide behavior compatible with MSVC: split loads wider than a
pointer.
llvm-svn: 258506
If 'sections' directive has only one sub-section, the code for 'single'-based directive was emitted. Removed this codegen, because it causes crashes in different cases.
llvm-svn: 258495
Replace a string append operation in addFunctionMappingRecord with a
vector append. The existing behavior is quadratic in the worst case:
this patch makes it linear.
Differential Revision: http://reviews.llvm.org/D16395
llvm-svn: 258424
can be found in a module.
There are externally visible anonymous types that can be found:
typedef struct { } s; // I can be found via the typedef.
There are anonymous internal types that can be found:
namespace { struct s {}; } // I can be found by name.
rdar://problem/24199640
llvm-svn: 258272
until we are visiting their declcontext.
This fixes a regression introduced in r256962:
When building debug info for a typdef'd anonymous tag type, we would be
visiting the inner anonymous type first thus creating a "typedef changes
linkage of anonymous type, but linkage was already computed" error.
rdar://problem/24199640
llvm-svn: 258152
This is part of a new statistics gathering feature for the sanitizers.
See clang/docs/SanitizerStats.rst for further info and docs.
Differential Revision: http://reviews.llvm.org/D16175
llvm-svn: 257971
This attribute may be attached to a function definition and instructs the backend to generate appropriate function entry/exit code so that
it can be used directly as an interrupt handler.
The IRET instruction, instead of the RET instruction, is used to return from interrupt or exception handlers. All registers, except for the EFLAGS register which is restored by the IRET instruction, are preserved by the compiler.
Any interruptible-without-stack-switch code must be compiled with -mno-red-zone since interrupt handlers can and will, because of the hardware design, touch
the red zone.
interrupt handler must be declared with a mandatory pointer argument:
struct interrupt_frame;
__attribute__ ((interrupt))
void f (struct interrupt_frame *frame) {
...
}
and user must properly define the structure the pointer pointing to.
exception handler:
The exception handler is very similar to the interrupt handler with a different mandatory function signature:
#ifdef __x86_64__
typedef unsigned long long int uword_t;
#else
typedef unsigned int uword_t;
#endif
struct interrupt_frame;
__attribute__ ((interrupt))
void f (struct interrupt_frame *frame, uword_t error_code) {
...
}
and compiler pops the error code off stack before the IRET instruction.
The exception handler should only be used for exceptions which push an error code and all other exceptions must use the interrupt handler.
The system will crash if the wrong handler is used.
Differential Revision: http://reviews.llvm.org/D15709
llvm-svn: 257867
This commit is a follow-up to r251734, r251476, and r249735, which fixes
a bug where function attributes were not attached to thread local
wrapper functions.
rdar://problem/20828324
llvm-svn: 257865
Summary:
Previously we compiled CUDA device code to PTX assembly and embedded
that asm as text in our host binary. Now we compile to PTX assembly and
then invoke ptxas to assemble the PTX into a cubin file. We gather the
ptx and cubin files for each of our --cuda-gpu-archs and combine them
using fatbinary, and then embed that into the host binary.
Adds two new command-line flags, -Xcuda_ptxas and -Xcuda_fatbinary,
which pass args down to the external tools.
Reviewers: tra, echristo
Subscribers: cfe-commits, jhen
Differential Revision: http://reviews.llvm.org/D16082
llvm-svn: 257809
Proper diagnostic and resolution of mangled names' conflicts in variables.
When there is a declaration and a definition using the same name but different
types, we emit what is in the definition. When there are two conflicting
definitions, we issue an error.
Differential Revision: http://reviews.llvm.org/D15686
llvm-svn: 257754
In {CG,}ExprConstant.cpp, we weren't treating vector splats properly.
This patch makes us treat splats more properly.
Additionally, this patch adds a new cast kind which allows a bool->int
cast to result in -1 or 0, instead of 1 or 0 (for true and false,
respectively), so we can sanely model OpenCL bool->int casts in the AST.
Differential Revision: http://reviews.llvm.org/D14877
llvm-svn: 257559
Summary:
Support for OpenCL 2.0 pipe type.
This is a bug-fix version for bader's patch reviews.llvm.org/D14441
Reviewers: pekka.jaaskelainen, Anastasia
Subscribers: bader, Anastasia, cfe-commits
Differential Revision: http://reviews.llvm.org/D15603
llvm-svn: 257254
Clang got itself into the situation where we mangled the same
constructor twice with two different constructor types. After one of
the constructors were utilized, the tag used for one of the types
changed from class to struct because a class template became complete.
This resulted in one of the constructor types varying from the other
constructor.
Instead, force "base" constructor types to "complete" if the ABI doesn't
have constructor variants. This will ensure that GlobalDecls for both
variants will get the same mangled name.
This fixes PR26029.
llvm-svn: 257205
Due to the new in-place renaming support added in r257174, we no
longer need to invoke ThinLTO global renaming from clang. It will be
invoked on the module in the FunctionImport pass (by an immediately
following llvm commit).
As a result, we don't need to load the FunctionInfoIndex as early,
so that is moved down into EmitAssemblyHelper::EmitAssembly.
llvm-svn: 257179
Coverage mapping data may reference names of functions
that are skipped by FE (e.g, unused inline functions). Since
those functions are skipped, normal instr-prof function lowering
pass won't put those names in the right section, so special
handling is needed to walk through coverage mapping structure
and recollect the references.
With this patch, only names that are skipped are processed. This
simplifies the lowering code and it no longer needs to make
assumptions coverage mapping data layout. It should also be
more efficient.
llvm-svn: 257092
was visited and all decls have been merged.
We only get a single chance to emit the types for virtual classes because
CGDebugInfo::completeRequiredType() categorically doesn't complete them.
llvm-svn: 256962
This patch attempts to fix the regressions identified when the patch was committed initially.
Thanks to Michael Liao for identifying the fix in the offloading metadata generation
related with side effects in evaluation of function arguments.
llvm-svn: 256933
Summary:
In order to offloading work properly two things need to be in place:
- a descriptor with all the offloading information (device entry functions, and global variable) has to be created by the host and registered in the OpenMP offloading runtime library.
- all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point with the actual function in the device.
This patch adds support for these two things. However, only entry functions are being registered given that 'declare target' directive is not yet implemented.
About offloading descriptor:
The details of the descriptor are explained with more detail in http://goo.gl/L1rnKJ. Basically the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to a the begin and end of table whose entries contain information about a specific entry point. Each entry has the type:
```
struct __tgt_offload_entry{
void *addr;
char *name;
int64_t size;
};
```
and will be implemented in a pre determined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in same index of the table, and efficiently implement the mapping.
The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high priority global initializer so that the registration happens always before any initializer (that can potentially include target regions) is run.
The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple.
About target codegen:
The target codegen is pretty much straightforward as it reuses completely the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check to see if it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information in checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related with the metadata. The metadata looks like this:
```
!omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6}
!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5}
!2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
!6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2}
```
The fields in each metadata entry are (in sequence):
Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region".
Entry 2) a unique ID of the device where the input source file that contain the target region lives.
Entry 3) a unique ID of the file where the input source file that contain the target region lives.
Entry 4) a mangled name of the function that encloses the target region.
Entries 5) and 6) line and column number where the target region was found.
Entry 7) is the order the entry was emitted.
Entry 2) and 3) are required to distinguish files that have the same function name.
Entry 4) is required to distinguish different instances of the same declaration (usually templated ones)
Entries 5) and 6) are required to distinguish the particular target region in body of the function (it is possible that a given target region is not an entry point - if clause can evaluate always to zero - and therefore we need to identify the "interesting" target regions. )
This patch replaces http://reviews.llvm.org/D12306.
Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao
Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits
Differential Revision: http://reviews.llvm.org/D12614
llvm-svn: 256842
NFC. These hints are only used for inlining and the inliner now uses
the same criteria to identify hot and cold callees and set appropriate
thresholds without relying on these hints. Hence this removed code is
superfluous.
Differential Revision: http://reviews.llvm.org/D15726
llvm-svn: 256793
This is one last remaining instrumentatation related structure
that needs to be migrate to use the centralized template
definition. With this change, instrumentation code
related to coverage module header will be kept in sync
with the coverage mapping reader. The remaining code
which makes implicit assumption about covmap control
structure layout in the the lowering pass will cleaned
up in a different patch. This patch is not intended to
have no functional change.
llvm-svn: 256714
It was copying an EHCleanupStack::Cleanup object into a
SmallVector<char>, with a comment saying that SmallVector's alignment is
always large enough. Unfortunately, that isn't actually true after
r162331 in 2012.
Expand the code (somewhat distastefully) to get a stack allocation with
a correct alignment.
llvm-svn: 256619
This adds support for the MCU psABI in a way different from r251223 and r251224,
basically reverting most of these two patches. The problem with the approach
taken in r251223/4 is that it only handled libcalls that originated from the backend.
However, the mid-end also inserts quite a few libcalls and assumes these use the
platform's default calling convention.
The previous patch tried to insert inregs when necessary both in the FE and,
somewhat hackily, in the CG. Instead, we now define a new default calling convention
for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does.
Differential Revision: http://reviews.llvm.org/D15055
llvm-svn: 256495
OpenMP 4.5 adds 'depend(sink:vec)' in 'ordered' directive for doacross loop synchronization. Patch adds parsing and semantic analysis for this clause.
llvm-svn: 256330
OpenMP 4.5 adds 'depend(sink:vec)' in 'ordered' directive for doacross loop synchronization. Patch adds parsing and semantic analysis for this clause.
llvm-svn: 256238
The /Brepro flag controls whether or not the compiler should embed
timestamps into the object file. Object files which do not embed
timestamps are not suitable for incremental linking but are suitable for
hermetic build systems and staged self-hosts of clang.
A normal clang spelling of this flag has been added,
-mincremental-linker-compatible.
llvm-svn: 256204
When using blocks, a byref structure is created to represent the
closure. The "byref.layout" field of this structure is an i8*. However,
some 'inline' layouts are represented as i64's, not i8*'s.
Prior to r246985 we cast the i64 'inline' layout to an i8* before
assigning it into the byref structure. This patch brings the cast back
and adds a regression test.
The original version of this patch was too invasive. This version only adds the
cast to BuildByrefLayout.
Differential Revision: http://reviews.llvm.org/D15674
rdar://23713871
llvm-svn: 256190
When using blocks, a byref structure is created to represent the
closure. The "byref.layout" field of this structure is an i8*. However,
some 'inline' layouts are represented as i64's, not i8*'s.
Prior to r246985 we cast the i64 'inline' layout to an i8* before
assigning it into the byref structure. This patch brings the cast back
and adds a regression test.
rdar://23713871
llvm-svn: 256185
Reapplies r256063, except instead of frugally re-using an LLVM enum,
we define a Clang enum, to avoid exposing too much LLVM interface.
Differential Revision: http://reviews.llvm.org/D15650
llvm-svn: 256078
#pragma omp parallel needs an implicit barrier that is currently done by an explicit call to __kmpc_barrier. However, the runtime already ensures a barrier in __kmpc_fork_call which currently leads to two barriers per region per thread.
Differential Revision: http://reviews.llvm.org/D15561
llvm-svn: 255992
OpenMP codegen tried to emit the code for its constructs even if it was detected as a dead-code. Added checks to ensure that the code is emitted if the code is not dead.
llvm-svn: 255990
OpenMP 4.5 adds 'depend(source)' clause for 'ordered' directive to support cross-iteration dependence. Patch adds parsing and semantic analysis for this construct.
llvm-svn: 255986
This sets the maximum entry count among all functions in the program to the module using module flags. This allows the optimizer to use this information.
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255918
C++ emits vtables for classes that have key function present in the
current TU. While we compile CUDA the fact that key function was found
in this TU does not mean that we are going to generate code for it. E.g.
vtable for a class with host-only methods should not (and can not) be
generated on device side, because we'll never generate code for them
during device-side compilation.
This patch adds an extra CUDA-specific check during key method computation
and filters out potential key methods that are not suitable for this side
of CUDA compilation.
When we codegen vtable, entries for unsuitable methods are set to null.
Differential Revision: http://reviews.llvm.org/D15309
llvm-svn: 255911
Clang-side cross-DSO CFI.
* Adds a command line flag -f[no-]sanitize-cfi-cross-dso.
* Links a runtime library when enabled.
* Emits __cfi_slowpath calls is bitset test fails.
* Emits extra hash-based bitsets for external CFI checks.
* Sets a module flag to enable __cfi_check generation during LTO.
This mode does not yet support diagnostics.
llvm-svn: 255694
This updates clang to use bundle operands to associate an invoke with
the funclet which it is contained within.
Depends on D15517.
Differential Revision: http://reviews.llvm.org/D15518
llvm-svn: 255675
Clang doesn’t support a use of “this” pointer inside inline asm.
When I tried to compile a class or a struct (see example) with an inline asm that contains "this" pointer.
Clang returns with an error.
This patch fixes that.
error: expected unqualified-id
For example:
'''
struct A {
void f() {
__asm mov eax, this
// error: expected unqualified-id
}
};
'''
Differential Revision: http://reviews.llvm.org/D15115
llvm-svn: 255645
We'd like to remove support for terminatepad from LLVM. To do this, we
need to move Clang off of it first. The intent behind terminatepad was
to carefully model exception specifications for the MSVC personality.
However, we don't support exception specifications for the MSVC
personality and neither does MSVC. Instead, MSVC supports
all-or-nothing exception specifications. We can model this limited
usage using cleanuppads which call std::terminate.
Differential Revision: http://reviews.llvm.org/D15478
llvm-svn: 255521
This patch enables soft float support for ppc32 architecture and fixes
the ABI for variadic functions. This is the first in a set of patches
for soft float support in LLVM.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D13351
llvm-svn: 255515
This sets the maximum entry count among all functions in the program to the
module using module flags. This allows the optimizer to use this information.
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255397
For PS4, generate explicit import for anonymous namespaces and mark it by DW_AT_artificial attribute.
Differential Revision: http://reviews.llvm.org/D12624
llvm-svn: 255281
implicitly-concatenated string literals. When looking for the start of a token
in the inline assembly, start from the end of the previous token, not the start
of the entire string.
Patch by Yunlian Jiang!
llvm-svn: 255198
OpenMP 4.5 adds directives 'taskloop' and 'taskloop simd'. These directives support clause 'num_tasks'. Patch adds parsing/semantic analysis for this clause.
llvm-svn: 255008
Summary:
Adds new option -fthinlto-index=<file> to invoke the LTO pipeline
along with function importing via clang using the supplied function
summary index file. This supports invoking the parallel ThinLTO
backend processes in a distributed build environment via clang.
Additionally, this causes the module linker to be invoked on the bitcode
file being compiled to perform any necessary promotion and renaming of
locals that are exported via the function summary index file.
Add a couple tests that confirm we get expected errors when we try to
use the new option on a file that isn't bitcode, or specify an invalid
index file. The tests also confirm that we trigger the expected function
import pass.
Depends on D15024
Reviewers: joker.eph, dexonsmith
Subscribers: joker.eph, davidxl, cfe-commits
Differential Revision: http://reviews.llvm.org/D15025
llvm-svn: 254927
OpenMP 4.5 adds 'taksloop' and 'taskloop simd' directives, which have 'grainsize' clause. Patch adds parsing/sema analysis of this clause.
llvm-svn: 254903
OpenMP 4.5 adds 'taskloop' and 'taskloop simd' directives. These directives have new 'nogroup' clause. Patch adds basic parsing/sema support for this clause.
llvm-svn: 254899
Constructors and destructors may be represented by several functions
in IR. Only base structors correspond to source code, others are
small pieces of code and eventually call the base variant. In this
case instrumentation of non-base structors has little sense, this
fix remove it. Now profile data of a declaration corresponds to
exactly one function in IR, it agrees with the current logic of the
profile data loading.
This change fixes PR24996.
Differential Revision: http://reviews.llvm.org/D15158
llvm-svn: 254876
This reverts commit r254195.
From the description, I suspect that the wrong patch was committed here,
and this is causing assertion failures in EmitDeferred() when the global
value ends up being a bitcast of a global.
llvm-svn: 254823
After r251874, properties from class extensions no longer show up in
ObjCInterfaceDecl::properties(). Make debug info emission explicitly
look for properties in class extensions before looking at direct properties.
Also add a test that checks for this. There are three interesting cases:
1. A property is only declared in a class extension, and the @implementation
is in a different file. This used to generated a DIObjcProperty before
r251874 and does again with this fix.
2. A property is declared as readonly in the class itself and redeclared as
readwrite in a class extension. clang before r251874 put the DIObjcProperty
on the first declaration. clang after r251874 didn't emit any DIObjcProperty,
and clang with this fix puts it on the readwrite redeclaration (which is
what lookup finds). This seems like a progression.
3. Like 2, but with an @implementation in the same file. In this case,
the property debug info gets generated a second time through the ivar
from the definition. In this case, lookup and declaration code need
to agree on the line number so that the DIObjcProperty isn't emitted
twice. In this case, clang before r251874 emitted one DIObjcProperty
on the first declaration, clang with r251874 emitted one on the second
declaration, and clang with this patch still does the latter.
llvm-svn: 254750
Summary:
Looking into some recent issues with LLDBs expression parser highlighted that upstream clang passes vectors types differently to Android Open Source Project's clang for Arm Android targets.
This patch reflects the changes present in the AOSP and allows LLDB's JIT expression evaluation to work correctly for Arm Android targets when passing vectors.
This is submitted with consent of the original author Stephen Hines.
Reviewers: asl, rsmith, ADodds, rnk
Subscribers: rnk, aemerson, tberghammer, danalbert, srhines, cfe-commits, pirama
Differential Revision: http://reviews.llvm.org/D14639
llvm-svn: 254682
Fix calculating address of arguments larger than 32 bit on stack for
variadic functions (rounding up address to alignment) on ppc32 architecture.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D14871
llvm-svn: 254670
After, properties from class extensions no longer show up in
ObjCInterfaceDecl::properties(). Make ObjCCommonMac::EmitPropertyList()
explicitly look for properties in class extensions before looking at
direct properties.
Also add a test that passes both with clang before r251874 and after this
patch (but fails with r251874 and without this patch).
llvm-svn: 254622
`pass_object_size` is our way of enabling `__builtin_object_size` to
produce high quality results without requiring inlining to happen
everywhere.
A link to the design doc for this attribute is available at the
Differential review link below.
Differential Revision: http://reviews.llvm.org/D13263
llvm-svn: 254554
Summary:
This patch implements the 4.5 specification for the implicit data maps. OpenMP 4.5 specification changes the default way data is captured into a target region. All the non-aggregate kinds are passed by value by default. This required activating the capturing by value during SEMA for the target region. All the non-aggregate values that can be encoded in the size of a pointer are properly casted and forwarded to the runtime library. On top of fixing the previous weird behavior for mapping pointers in nested data regions (an explicit map was always required), this also improves performance as the number of allocations/transactions to the device per non-aggregate map are reduced from two to only one - instead of passing a reference and the value, only the value passed.
Explicit maps will be added later on once firstprivate, private, and map clauses' SEMA and parsing are available.
Reviewers: hfinkel, rjmccall, ABataev
Subscribers: cfe-commits, carlo.bertolli
Differential Revision: http://reviews.llvm.org/D14940
llvm-svn: 254521
OpenMP 4.5 defines new clause 'priority' for 'task', 'taskloop' and 'taskloop simd' directives. Added parsing and sema analysis for 'priority' clause in 'task' and 'taskloop' directives.
llvm-svn: 254398
Summary: This patch adds support for the interrupt attribute for mips32r2+.
Patch by Simon Dardis.
Reviewers: dsanders, aaron.ballman
Subscribers: aaron.ballman, cfe-commits
Differential Revision: http://reviews.llvm.org/D10802
llvm-svn: 254205
Summary: This patch adds support for the interrupt attribute for mips32r2+.
Reviewers: dsanders, aaron.ballman
Subscribers: aaron.ballman, cfe-commits
Differential Revision: http://reviews.llvm.org/D10802
llvm-svn: 254203
This patch changes the generation of CGFunctionInfo to contain
the FunctionProtoType if it is available. This enables the code
generation for call instructions to look into this type for
exception information and therefore generate better quality
IR - it will not create invoke instructions for functions that
are know not to throw.
llvm-svn: 253926
Runtime library requires, that codegen for 'depend' clause for 'out' dependency kind must be the same as codegen for 'depend' clause with 'inout' dependency.
llvm-svn: 253866
The pointer returned by __RTDynamicCast must be bitcasted. However, it
was not expected that __RTDynamicCast would be invoked, resulting in the
bitcast occuring in a different BasicBlock than the invoke. This caused
a down-stream PHI to get confused about which BasicBlock the incomming
value was from.
This fixes PR25606.
llvm-svn: 253843
driving a canonical difference between that and an unqualified
type is a really bad idea when both are valid. Instead, remember
that it was there in a non-canonical way, then look for that in
the one place we really care about it: block captures. The net
effect closely resembles the behavior of a decl attribute, except
still closely following ARC's standard qualifier parsing rules.
llvm-svn: 253534
to start at the offset of the first ivar instead of the rounded-up
end of the superclass. The latter could include a large amount of
tail padding because of a highly-aligned ivar, and subclass ivars
can be laid out within that.
llvm-svn: 253533
The conversion from QuantityType to the (temporary) IntegerAlignment class
was ambiguous.
For now add in explicit conversion to unsigned to satisfy the clang-x86_64-debian-fast bot.
I'll remove the explicit conversion when I remove the IntegerAlignment class.
llvm-svn: 253519
This is a follow on from a similar LLVM commit: r253511.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
The only code change to clang is hidden in CGBuilder.h which now passes
both dest and source alignment to IRBuilder, instead of taking the minimum of
dest and source alignments.
Reviewed by Hal Finkel.
llvm-svn: 253512
target attribute, don't include "negative" subtarget features in the
list of required features. Builtins are positive by default so don't
need this change, but we pull the default list of features from the
command line and so need to make sure that we only include features
that are turned on for code generation in our error.
llvm-svn: 253242
In r253186, I changed the DIBuilder API to now take size and align
for reference types as well. This was done in preparation for upcoming
changes to the Verifier that will validate that sizes match between
DI types and IR values that are declared as having those types.
This updates clang to actually pass the information through.
llvm-svn: 253190
The ``disable_tail_calls`` attribute instructs the backend to not
perform tail call optimization inside the marked function.
For example,
int callee(int);
int foo(int a) __attribute__((disable_tail_calls)) {
return callee(a); // This call is not tail-call optimized.
}
Note that this attribute is different from 'not_tail_called', which
prevents tail-call optimization to the marked function.
rdar://problem/8973573
Differential Revision: http://reviews.llvm.org/D12547
llvm-svn: 252986
In r244063, I had caused these builtins to call the same-named library
functions, __atomic_*_fetch_SIZE. However, this was incorrect: while
those functions are in fact supported by GCC's libatomic, they're not
documented by the spec (and gcc doesn't ever call them).
Instead, you're /supposed/ to call the __atomic_fetch_* builtins and
then redo the operation inline to return the final value.
Differential Revision: http://reviews.llvm.org/D14385
llvm-svn: 252920
The C++ spec (3.6.1.3) says "The function `main` shall not be used within a program". This implies that it cannot recurse, so add the norecurse attribute to help the midend out a bit.
llvm-svn: 252902
target features that the caller function doesn't provide. This matches
the existing backend failure to inline functions that don't have
matching target features - and diagnoses earlier in the case of
always_inline.
Fix up a few test cases that were, in fact, invalid if you tried
to generate code from the backend with the specified target features
and add a couple of tests to illustrate what's going on.
This should fix PR25246.
llvm-svn: 252834
This is about how we handle static member of a template. Before this commit,
we use internal linkage for the IR thread-local variable, which is inefficient.
With this commit, we will start to follow Itanium C++ ABI.
rdar://problem/23415206
Reviewed by John McCall.
llvm-svn: 252814
We used to emit the store prior to branch in the entry block. To make it more
efficient, this commit moves it to the init block. We still mark as initialized
before initializing anything else.
llvm-svn: 252777
This comes up when a derived class destructor is equivalent to a base
class destructor defined in the same TU, and we try to alias them.
A COFF weak alias cannot satisfy a normal undefined symbol reference
from another TU. The other TU must also mark the referenced symbol as
weak, and we can't rely on that.
Clang already has a special case here for dllexport, but we failed to
realize that the problem also applies to other non-discardable symbols
such as those from explicit template instantiations.
Fixes PR25477.
llvm-svn: 252659
When a struct's size is not a power of 2, the corresponding _Atomic() type is
promoted to the nearest. We already correctly handled normal C++ expressions of
this form, but direct calls to the __c11_atomic_whatever builtins ended up
performing dodgy operations on the smaller non-atomic types (e.g. memcpy too
much). Later optimisations removed this as undefined behaviour.
This patch converts EmitAtomicExpr to allocate its temporaries at the full
atomic width, sidestepping the issue.
llvm-svn: 252507
The -meabi flag to control LLVM EABI version.
Without '-meabi' or with '-meabi default' imply LLVM triple default.
With '-meabi gnu' sets EABI GNU.
With '-meabi 4' or '-meabi 5' set EABI version 4 and 5 respectively.
A similar patch was introduced in LLVM.
Patch by Vinicius Tinti.
llvm-svn: 252463
This attribute is used to prevent tail-call optimizations to the marked
function. For example, in the following piece of code, foo1 will not be
tail-call optimized:
int __attribute__((not_tail_called)) foo1(int);
int foo2(int a) {
return foo1(a); // Tail-call optimization is not performed.
}
The attribute has effect only on statically bound calls. It has no
effect on indirect calls. Also, virtual functions and objective-c
methods cannot be marked as 'not_tail_called'.
rdar://problem/22667622
Differential Revision: http://reviews.llvm.org/D12922
llvm-svn: 252369
Function__builtin_signbit returns wrong value for type ppcf128 on big endian
machines. This patch fixes how value is generated in that case.
Patch by Aleksandar Beserminji.
Differential Revision: http://reviews.llvm.org/D14149
llvm-svn: 252307
Summary: This fixes a bug that's easily encountered in LLDB
(https://llvm.org/bugs/show_bug.cgi?id=22875). The problem here is that we
mangle a name during debug info emission, but never actually emit the actual
Decl, so we run into problems in EmitDeclMetadata (which assumes such a Decl
exists). Fix that by just skipping metadata emissions for mangled names that
don't have associated Decls.
Reviewers: rjmccall
Subscribers: labath, cfe-commits
Differential Revision: http://reviews.llvm.org/D13959
llvm-svn: 252229
for the root cause. The 'using llvm::isa;' declaration in Basic/LLVM.h only
pulls the declarations of llvm::isa that were declared prior to it into
namespace clang. In a modules build, this is a hermetic set of just the
declarations from LLVM. In a non-modules build, we happened to also pull the
declaration from lib/CodeGen/Address.h into namespace clang, which made the
code in question accidentally compile.
llvm-svn: 252211
This new builtin template allows for incredibly fast instantiations of
templates like std::integer_sequence.
Performance numbers follow:
My work station has 64 GB of ram + 20 Xeon Cores at 2.8 GHz.
__make_integer_seq<std::integer_sequence, int, 90000> takes 0.25
seconds.
std::make_integer_sequence<int, 90000> takes unbound time, it is still
running. Clang is consuming gigabytes of memory.
Differential Revision: http://reviews.llvm.org/D13786
llvm-svn: 252036
A 'readonly' Objective-C property declared in the primary class can
effectively be shadowed by a 'readwrite' property declared within an
extension of that class, so long as the types and attributes of the
two property declarations are compatible.
Previously, this functionality was implemented by back-patching the
original 'readonly' property to make it 'readwrite', destroying source
information and causing some hideously redundant, incorrect
code. Simplify the implementation to express how this should actually
be modeled: as a separate property declaration in the extension that
shadows (via the name lookup rules) the declaration in the primary
class. While here, correct some broken Fix-Its, eliminate a pile of
redundant code, clean up the ARC migrator's handling of properties
declared in extensions, and fix debug info's naming of methods that
come from categories.
A wonderous side effect of doing this write is that it eliminates the
"AddedObjCPropertyInClassExtension" method from the AST mutation
listener, which in turn eliminates the last place where we rewrite
entire declarations in a chained PCH file or a module file. This
change (which fixes rdar://problem/18475765) will allow us to
eliminate the rewritten-decls logic from the serialization library,
and fixes a crash (rdar://problem/23247794) illustrated by the
test/PCH/chain-categories.m example.
llvm-svn: 251874
Certain CXXConstructExpr nodes require zero-initialization before a
constructor is called. We had a bug in the case where the constructor
is called on a virtual base: we zero-initialized the base's vbptr field.
A complementary bug is present in MSVC where no zero-initialization
occurs for the subobject at all.
This fixes PR25370.
llvm-svn: 251783
attributes to internal functions.
This patch fixes CodeGenModule::CreateGlobalInitOrDestructFunction to
use SetInternalFunctionAttributes instead of SetLLVMFunctionAttributes
to attach function attributes to internal functions.
Also, make sure the correct CGFunctionInfo is passed instead of always
passing what arrangeNullaryFunction returns.
rdar://problem/20828324
Differential Revision: http://reviews.llvm.org/D13610
llvm-svn: 251734
This sets the mostly expected Darwin default ABI options for these two
platforms. Active changes from these defaults for watchOS are in a later patch.
llvm-svn: 251708
This works around PR25162. The MSVC tables make it very difficult to
correctly inline a C++ destructor that contains try / catch. We've
attempted to address PR25162 in LLVM's backend, but it feels pretty
infeasible. MSVC and ICC both appear to avoid inlining such complex
destructors.
Long term, we want to fix this by making the inliner smart enough to
know when it is inlining into a cleanup, so it can inline simple
destructors (~unique_ptr and ~vector) while avoiding destructors
containing try / catch.
llvm-svn: 251576
GCC uses the x87DoubleExtended model for long doubles, and passes them
indirectly by address through function calls.
Also replace the existing mingw-long-double assembly emitting test with
an IR-level test.
llvm-svn: 251567
functions.
This commit fixes a bug in CGOpenMPRuntime.cpp and CGObjC.cpp where
some of the function attributes are not attached to newly created
functions.
rdar://problem/20828324
Differential Revision: http://reviews.llvm.org/D13928
llvm-svn: 251476
Linking options for particular file depend on the option that specifies the file.
Currently there are two:
* -mlink-bitcode-file links in complete content of the specified file.
* -mlink-cuda-bitcode links in only the symbols needed by current TU.
Linked symbols are internalized. This bitcode linking mode is used to
link device-specific bitcode provided by CUDA.
Files are linked in order they are specified on command line.
-mlink-cuda-bitcode replaces -fcuda-uses-libdevice flag.
Differential Revision: http://reviews.llvm.org/D13913
llvm-svn: 251427
only one of a group of possibilities.
This changes the syntax in the builtin files to represent:
, as the and operator
| as the or operator
The former syntax matches how the backend tablegen files represent
multiple subtarget features being required.
Updated the builtin and intrinsic headers accordingly for the new
syntax.
llvm-svn: 251388
The MCU psABI calling convention is somewhat, but not quite, like -mregparm 3.
In particular, the rules involving structs are different.
Differential Revision: http://reviews.llvm.org/D13978
llvm-svn: 251224
Previously, __weak was silently accepted and ignored in MRC mode.
That makes this a potentially source-breaking change that we have to
roll out cautiously. Accordingly, for the time being, actual support
for __weak references in MRC is experimental, and the compiler will
reject attempts to actually form such references. The intent is to
eventually enable the feature by default in all non-GC modes.
(It is, of course, incompatible with ObjC GC's interpretation of
__weak.)
If you like, you can enable this feature with
-Xclang -fobjc-weak
but like any -Xclang option, this option may be removed at any point,
e.g. if/when it is eventually enabled by default.
This patch also enables the use of the ARC __unsafe_unretained qualifier
in MRC. Unlike __weak, this is being enabled immediately. Since
variables are essentially __unsafe_unretained by default in MRC,
the only practical uses are (1) communication and (2) changing the
default behavior of by-value block capture.
As an implementation matter, this means that the ObjC ownership
qualifiers may appear in any ObjC language mode, and so this patch
removes a number of checks for getLangOpts().ObjCAutoRefCount
that were guarding the processing of these qualifiers. I don't
expect this to be a significant drain on performance; it may even
be faster to just check for these qualifiers directly on a type
(since it's probably in a register anyway) than to do N dependent
loads to grab the LangOptions.
rdar://9674298
llvm-svn: 251041
We got this right for Itanium but not MSVC because CGRecordLayoutBuilder
was checking if the base's size was zero when it should have been
checking the non-virtual size.
This fixes PR21040.
llvm-svn: 251036
This is almost entirely a matter of just flipping a switch. 99% of
the runtime support is available all the way back to when it was
implemented in the non-fragile runtime, i.e. in Lion. However,
fragile runtimes do not recognize ARC-style ivar layout strings,
which means that accessing __strong or __weak ivars reflectively
(e.g. via object_setIvar) will end up accessing the ivar as if it
were __unsafe_unretained. Therefore, when using reflective
technologies like KVC, be sure that your paths always refer to a
property.
rdar://23209307
llvm-svn: 250955
Specifically, handle under-aligned object references (by explicitly
ignoring them, because this just isn't representable in the format;
yes, this means that GC silently ignores such references), descend
into anonymous structs and unions, stop classifying fields of
pointer-to-strong/weak type as strong/weak in ARC mode, and emit
skips to cover the entirety of block layouts in GC mode. As a
cleanup, extract this code into a helper class, avoid a number of
unnecessary copies and layout queries, generate skips implicitly
instead of explicitly tracking them, and clarify the bitmap-creation
logic.
llvm-svn: 250919
Summary: It breaks the build for the ASTMatchers
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D13893
llvm-svn: 250827
Currently debug info for types used in explicit cast only is not emitted. It happened after a patch for better alignment handling. This patch fixes this bug.
Differential Revision: http://reviews.llvm.org/D13582
llvm-svn: 250795
The Intel MCU psABI requires floating-point values to be passed in-reg.
This makes the x86-32 ABI code respect "-mfloat-abi soft" and generate float inreg arguments.
Differential Revision: http://reviews.llvm.org/D13554
llvm-svn: 250689
match the feature set of the function that they're being called from.
This ensures that we can effectively diagnose some[1] code that would
instead ICE in the backend with a failure to select message.
Example:
__m128d foo(__m128d a, __m128d b) {
return __builtin_ia32_addsubps(b, a);
}
compiled for normal x86_64 via:
clang -target x86_64-linux-gnu -c
would fail to compile in the back end because the normal subtarget
features for x86_64 only include sse2 and the builtin requires sse3.
[1] We're still not erroring on:
__m128i bar(__m128i const *p) { return _mm_lddqu_si128(p); }
where we should fail and error on an always_inline function being
inlined into a function that doesn't support the subtarget features
required.
llvm-svn: 250473
This recommits r250398 with fixes to the tests for bot failures.
Add "-target x86_64-unknown-linux" to the clang invocations that
check for the gold plugin.
llvm-svn: 250455
Rolling this back for now since there are a couple of bot failures on
the new tests I added, and I won't have a chance to look at them in detail
until later this afternoon. I think the new tests need some restrictions on
having the gold plugin available.
This reverts commit r250398.
llvm-svn: 250402
Summary:
Add clang support for -flto=thin option, which is used to set the
EmitFunctionSummary code gen option on compiles.
Add -flto=full as an alias to the existing -flto.
Add tests to check for proper overriding of -flto variants on the
command line, and convert grep tests to FileCheck.
Reviewers: dexonsmith, joker.eph
Subscribers: davidxl, cfe-commits
Differential Revision: http://reviews.llvm.org/D11908
llvm-svn: 250398
Add support for the `-fdebug-prefix-map=` option as in GCC. The syntax is
`-fdebug-prefix-map=OLD=NEW`. When compiling files from a path beginning with
OLD, change the debug info to indicate the path as start with NEW. This is
particularly helpful if you are preprocessing in one path and compiling in
another (e.g. for a build cluster with distcc).
Note that the linearity of the implementation is not as terrible as it may seem.
This is normally done once per file with an expectation that the map will be
small (1-2) entries, making this roughly linear in the number of input paths.
Addresses PR24619.
llvm-svn: 250094
CGBlocks.cpp.
This commit fixes a bug in clang's code-gen where it creates the
following functions but doesn't attach function attributes to them:
__copy_helper_block_
__destroy_helper_block_
__Block_byref_object_copy_
__Block_byref_object_dispose_
rdar://problem/20828324
Differential Revision: http://reviews.llvm.org/D13525
llvm-svn: 249735
early.
This is needed in a patch I plan to commit later, in which a null Decl
pointer is passed to SetLLVMFunctionAttributesForDefinition.
Relevant discussion is in http://reviews.llvm.org/D13525.
llvm-svn: 249722
No ABI for C++ currently makes it possible to implement the standard
100% perfectly. We wrongly hid some of our compatible behavior behind
-fms-compatibility instead of tying it to the compiler ABI.
llvm-svn: 249656
The backend restores the stack pointer after recovering from an
exception. This is similar to r245879, but it doesn't try to use the
normal cleanup mechanism, so hopefully it won't cause the same breakage.
llvm-svn: 249640
We don't have a good place to put them. Our previous spot was causing us
to optimize loads from the exception object to undef, because it was
after the catchpad instruction that models the write to the catch
object.
llvm-svn: 249616
Currently codegen crashes trying to emit casting to bool &. It happens because bool type is converted to i1 and later then lvalue for reference is converted to i1*. But when codegen tries to load this lvalue it crashes trying to load value from this i1*.
Differential Revision: http://reviews.llvm.org/D13325
llvm-svn: 249534
include/clang/CodeGenABITypes.h is in meant to be included by external
users, but using a unique_ptr on the private CodeGenModule introduces a
dependency on the type definition that prevents such a use.
NFC
llvm-svn: 249328
OpenCL v1.1 s6.2.2: for the boolean value true, every bit in the result vector should be set.
This change treats the i1 value as signed for the purposes of performing the cast to integer,
and therefore sign extend into the result.
Patch by Neil Hickey!
http://reviews.llvm.org/D13349
llvm-svn: 249301
I randomly came across this difference between AArch64 and other targets:
on the latter, we don't emit nil checks for known non-nil class method
calls thanks to r247350, but we still do for AArch64 stret calls.
They use different code paths, because those are special, as they go
through the regular msgSend, not the msgSend*_stret variants.
llvm-svn: 249205
Ensure that the vptr store in the most-derived constructor is not behind
an invariant group barrier. Previously, the base-most vptr store would
be the one behind no barrier, and that could result in the creator of
the object thinking it had the base-most vtable.
This bug caused clang call pure virtual functions when called from
constructor body.
http://reviews.llvm.org/D13373
llvm-svn: 249197
This patch implements the outlining for offloading functions for code
annotated with the OpenMP target directive. It uses a temporary naming
of the outlined functions that will have to be updated later on once
target side codegen and registration of offloading libraries is
implemented - the naming needs to be made unique in the produced
library.
llvm-svn: 249148
This reverts commit r248982 as it was breaking the ARM buildbots and the fix didn't work.
This reverts commit r248984, the fix that didn't work.
llvm-svn: 249005
Summary: __nvvm_atom_cas_* returns the old value instead of whether the swap succeeds.
Reviewers: eliben, tra
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D13306
llvm-svn: 248951
- Remove virtual SC_OpenCLWorkGroupLocal storage type specifier
as it conflicts with static local variables now and prevents
diagnosing static local address space variables correctly.
- Allow static local and global variables (OpenCL2.0 s6.8 and s6.5.1).
- Improve diagnostics of allowed ASes for variables in different scopes:
(i) Global or static local variables have to be in global
or constant ASes (OpenCL1.2 s6.5, OpenCL2.0 s6.5.1);
(ii) Non-kernel function variables can't be declared in local
or constant ASes (OpenCL1.1 s6.5.2 and s6.5.3).
http://reviews.llvm.org/D13105
llvm-svn: 248906
This is the clang commit associated with llvm r248887.
This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234],
vst[234]lane ARM neon intrinsics and associates an address space with the
pointer that these intrinsics take. This changes, e.g.,
<2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32)
to
<2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32)
This change ensures that address spaces are fully taken into account in the ARM
target during lowering of interleaved loads and stores.
Differential Revision: http://reviews.llvm.org/D13127
llvm-svn: 248888
Description.
If the simd clause is specified, the ordered regions encountered by any thread will use only a single SIMD lane to execute the ordered regions in the order of the loop iterations.
Restrictions.
An ordered construct with the simd clause is the only OpenMP construct that can appear in the simd region.
An ordered directive with ‘simd’ clause is generated as an outlined function and corresponding function call to prevent this part of code from vectorization later in backend.
llvm-svn: 248772
Parsing and sema analysis for 'simd' clause in 'ordered' directive.
Description
If the simd clause is specified, the ordered regions encountered by any thread will use only a single SIMD lane to execute the ordered
regions in the order of the loop iterations.
Restrictions
An ordered construct with the simd clause is the only OpenMP construct that can appear in the simd region
llvm-svn: 248696
OpenMP 4.1 extends format of '#pragma omp ordered'. It adds 3 additional clauses: 'threads', 'simd' and 'depend'.
If no clause is specified, the ordered construct behaves as if the threads clause had been specified. If the threads clause is specified, the threads in the team executing the loop region execute ordered regions sequentially in the order of the loop iterations.
The loop region to which an ordered region without any clause or with a threads clause binds must have an ordered clause without the parameter specified on the corresponding loop directive.
llvm-svn: 248569
when building a module. Clang already records the module signature when
building a skeleton CU to reference a clang module.
Matching the id in the skeleton with the one in the module allows a DWARF
consumer to verify that they found the correct version of the module
without them needing to know about the clang module format.
llvm-svn: 248345
* adds -aux-triple option to specify target triple
* propagates aux target info to AST context and Preprocessor
* pulls in target specific preprocessor macros.
* pulls in target-specific builtins from aux target.
* sets appropriate host or device attribute on builtins.
Differential Revision: http://reviews.llvm.org/D12917
llvm-svn: 248299
This commit fixes an assert that is triggered when optnone is being
added to an IR function that is already marked with minsize and optsize.
rdar://problem/22723716
Differential Revision: http://reviews.llvm.org/D13004
llvm-svn: 248191
The signature may not have been computed at the time the module reference
is generated (e.g.: in the future while emitting debug info for a clang
module). Using the full module name is safe because each clang module may
only have a single definition.
NFC.
llvm-svn: 248037
Summary:
This change adds support for `__builtin_ms_va_list`, a GCC extension for
variadic `ms_abi` functions. The existing `__builtin_va_list` support is
inadequate for this because `va_list` is defined differently in the Win64
ABI vs. the System V/AMD64 ABI.
Depends on D1622.
Reviewers: rsmith, rnk, rjmccall
CC: cfe-commits
Differential Revision: http://reviews.llvm.org/D1623
llvm-svn: 247941
Mingw generally wraps an old copy of msvcrt.dll which has these
personalities, so things should work out, or so I hear. I haven't tested
it.
llvm-svn: 247902
This avoids building a fake LLVM IR global variable just to ferry an i32
down into LLVM codegen. It also puts a nail in the coffin of using MS
ABI C++ EH with landingpads, since now we'll assert in the lpad code
when flags are present.
llvm-svn: 247843
ptr in dtor.
Summary:
After destruction, invocation of virtual functions prevented
by poisoning vtable pointer.
Reviewers: eugenis, kcc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D12712
Fixed testing callback emission order to account for vptr.
Poison vtable in either complete or base dtor, depending on
if virtual bases exist. If virtual bases exist, poison in
complete dtor. Otherwise, poison in base.
Remove commented-out block.
llvm-svn: 247762
It is dangerous to do LTO on code with strict-vtable-pointers, because
one module has invariant.group.barriers, and the other one not.
In the future I want to just strip all invariant.group metadata from
vptrs loads/stores and get rid of invariant.group.barrier calls.
http://reviews.llvm.org/D12580
llvm-svn: 247724
Patch improves codegen for OpenMP constructs. If the OpenMP region does not have internal 'cancel' construct, a call to 'void __kmpc_barrier()' runtime function is generated for all implicit/explicit barriers. If the region has inner 'cancel' directive, then
```
if (__kmpc_cancel_barrier())
exit from outer construct;
```
code is generated.
Also, the code for 'canellation point' directive is not generated if parent directive does not have 'cancel' directive.
llvm-svn: 247681
These are a few cleanups I happened to have from trying to go in a
different direction recently, so just flushing them out while I have
them.
llvm-svn: 247593
This was the wrong direction to take anyway (because ultimately the
GlobalValue needed the pointee type again and /it/ used
PointerType::getElementType eventually anyway)... let's go a different way.
This reverts commit r236161.
llvm-svn: 247586
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.
Libc++ relies on the compiler never emitting such undefined reference.
With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
-- or, depending on the linkage --
2b. A declaration of F.
The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).
This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.
This patch is based on ideas by Chandler Carruth and Richard Smith.
llvm-svn: 247494
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.
Libc++ relies on the compiler never emitting such undefined reference.
With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
-- or, depending on the linkage --
2b. A declaration of F.
The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).
This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.
This patch is based on ideas by Chandler Carruth and Richard Smith.
llvm-svn: 247465
-force-align-stack.
Also, make changes to the driver so that -mno-stack-realign is no longer
an option exposed to the end-user that disallows stack realignment in
the backend.
Differential Revision: http://reviews.llvm.org/D11815
llvm-svn: 247451
clang modules, if -dwarf-ext-refs (DebugTypesExtRefs) is specified.
This reimplements r247369 in about a third of the amount of code.
Thanks to David Blaikie pointing this out in post-commit review!
llvm-svn: 247432
When uses of personality functions were moved from LandingPadInst to
Function, we forgot to update SimplifyPersonality(). This patch corrects
that.
Note: SimplifyPersonality() is an optimization which replaces
personality functions with the default C++ personality when possible.
Without this update, some ObjC++ projects fail to link against C++
libraries (seeing as the exception ABI had effectively changed).
rdar://problem/22155434
llvm-svn: 247421
The type of a member pointer is incomplete if it has no inheritance
model. This lets us reuse more general logic already embedded in clang.
llvm-svn: 247346
It seems that there is small bug, and we can't generate assume loads
when some virtual functions have internal visibiliy
This reverts commit 982bb7d966947812d216489b3c519c9825cacbf2.
llvm-svn: 247332
Currently all variables used in OpenMP regions are captured into a record and passed to outlined functions in this record. It may result in some poor performance because of too complex analysis later in optimization passes. Patch makes to emit outlined functions for parallel-based regions with a list of captured variables. It reduces code for 2*n GEPs, stores and loads at least.
Codegen for task-based regions remains unchanged because runtime requires that all captured variables are passed in captured record.
llvm-svn: 247251
This flag causes the compiler to emit bit set entries for functions as well
as runtime bitset checks at indirect call sites. Depends on the new function
bitset mechanism.
Differential Revision: http://reviews.llvm.org/D11857
llvm-svn: 247238
Seems it broke the Polly build.
From http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/11687/steps/compile/logs/stdio:
In file included from /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/lib/TableGen/Record.cpp:14:0:
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:369:3: error: looser throw specifier for 'virtual llvm::TypedInit::~TypedInit()'
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.src/include/llvm/TableGen/Record.h:270:11: error: overriding 'virtual llvm::Init::~Init() noexcept (true)'
llvm-svn: 247222
Generating call assume(icmp %vtable, %global_vtable) after constructor
call for devirtualization purposes.
For more info go to:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html
Edit:
Fixed version because of PR24479.
After this patch got reverted because of ScalarEvolution bug (D12719)
Merged after John McCall big patch (Added Address).
http://reviews.llvm.org/D11859
llvm-svn: 247199
We know that a reference can always be dereferenced. However, we don't
always know the number of bytes if the reference's pointee type is
incomplete. This case was correctly handled but we didn't consider the
case where the type is complete but we cannot calculate its size for ABI
specific reasons. In this specific case, a member pointer's size is
available only under certain conditions.
This fixes PR24703.
llvm-svn: 247188
Summary:
Currently clang provides no general way to generate nontemporal loads/stores.
There are some architecture specific builtins for doing so (e.g. in x86), but
there is no way to generate non-temporal store on, e.g. AArch64. This patch adds
generic builtins which are expanded to a simple store with '!nontemporal'
attribute in IR.
Differential Revision: http://reviews.llvm.org/D12313
llvm-svn: 247104
This function can be used to create a metadata identifier for a specific
type. No functionality change, but this will be used by D11857 and D12026.
Differential Revision: http://reviews.llvm.org/D12038
llvm-svn: 247098
When -fmodule-format is set to "obj", emit debug info for all types
declared in a module or referenced by a declaration into the module's
object file container.
This patch adds support for Objective-C types and methods.
llvm-svn: 247068
When -fmodule-format is set to "obj", emit debug info for all types
declared in a module or referenced by a declaration into the module's
object file container.
This patch adds support for C and C++ types.
llvm-svn: 247049
instruction used the ReturnValue as pointer operand or value operand. This
led to wrong code gen - in later stages (load-store elision code) the found
store and its operand would be erased, causing ReturnValue to become a <badref>.
The patch adds a check that makes sure that ReturnValue is a pointer operand of
store instruction. Regression test is also added.
This fixes PR24386.
Differential Revision: http://reviews.llvm.org/D12400
llvm-svn: 247003
separately from building the instruction so that it's
preserved even in -Asserts builds.
Employ C++'s mystical "comment" feature to discourage
breaking this in the future.
llvm-svn: 246991
Introduce an Address type to bundle a pointer value with an
alignment. Introduce APIs on CGBuilderTy to work with Address
values. Change core APIs on CGF/CGM to traffic in Address where
appropriate. Require alignments to be non-zero. Update a ton
of code to compute and propagate alignment information.
As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.
The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned. Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay. I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.
Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do a alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.
We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment. In particular,
field access now uses alignmentAtOffset instead of min.
Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs. For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint. That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.
ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments. In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments. That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.
I partially punted on applying this work to CGBuiltin. Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.
llvm-svn: 246985
We were crashing in CodeGen given input like this:
int self_alias(void) __attribute__((weak, alias("self_alias")));
such a self-alias is invalid, but instead of diagnosing the situation, we'd
proceed to produce IR for both the function declaration and the alias. Because
we already had a function named 'self_alias', the alias could not be named the
same thing, and so LLVM would pick a different name ('self_alias1' for example)
for that value. When we later called CodeGenModule::checkAliases, we'd look up
the IR value corresponding to the alias name, find the function declaration
instead, and then assert in a cast to llvm::GlobalAlias. The easiest way to prevent
this is simply to avoid creating the wrongly-named alias value in the first
place and issue the diagnostic there (instead of in checkAliases). We detect a
related cycle case in CodeGenModule::EmitAliasDefinition already, so this just
adds a second such check.
Even though the other test cases for this 'alias definition is part of a cycle'
diagnostic are in test/Sema/attr-alias-elf.c, I've added a separate regression
test for this case. This is because I can't add this check to
test/Sema/attr-alias-elf.c without disturbing the other test cases in that
file. In order to avoid construction of the bad IR values, this diagnostic
is emitted from within CodeGenModule::EmitAliasDefinition (and the relevant
declaration is not added to the Aliases vector). The other cycle checks are
done within the CodeGenModule::checkAliases function based on the Aliases
vector, called from CodeGenModule::Release. However, if there have been errors
earlier, HandleTranslationUnit does not call Release, and so checkAliases is
never called, and so none of the other diagnostics would be produced.
Fixes PR23509.
llvm-svn: 246882
This fixes an issue raised in D12412, where we generated invalid IR.
Thanks to Vedant Kumar for coming up with the initial work around.
Differential Revision: http://reviews.llvm.org/D12412
llvm-svn: 246880
Fix processing of shared variables with reference types in OpenMP constructs. Previously, if the variable was not marked in one of the private clauses, the reference to this variable was emitted incorrectly and caused an assertion later.
llvm-svn: 246846
Summary:
Dtor sanitization handled amidst other dtor cleanups,
between cleaning bases and fields. Sanitizer call pushed onto
stack of cleanup operations.
Reviewers: eugenis, kcc
Differential Revision: http://reviews.llvm.org/D12022
Refactoring dtor sanitizing emission order.
- Support multiple inheritance by poisoning after
member destructors are invoked, and before base
class destructors are invoked.
- Poison for virtual destructor and virtual bases.
- Repress dtor aliasing when sanitizing in dtor.
- CFE test for dtor aliasing, and repression of aliasing in dtor
code generation.
- Poison members on field-by-field basis, with collective poisoning
of trivial members when possible.
- Check msan flags and existence of fields, before dtor sanitizing,
and when determining if aliasing is allowed.
- Testing sanitizing bit fields.
llvm-svn: 246815
This implements basic support for compiling (though not yet assembling
or linking) for a WebAssembly target. Note that ABI details are not yet
finalized, and may change.
Differential Revision: http://reviews.llvm.org/D12002
llvm-svn: 246814
The ACLE (ARM C Language Extensions) 2.0 allows the __fp16 type to be
used as a functon argument or return type (ACLE 1.1 did not).
The current public release of the AAPCS (2.09) states that __fp16 values
should be converted to single-precision before being passed or returned,
but AAPCS 2.10 (to be released shortly) changes this, so that they are
passed in the least-significant 16 bits of either a GPR (for base AAPCS)
or a single-precision register (for AAPCS-VFP). This does not change how
arguments are passed if they get passed on the stack.
This patch brings clang up to compliance with the latest versions of
both of these specs.
We can now set the __ARM_FP16_ARGS ACLE predefine, and we have always
been able to set the __ARM_FP16_FORMAT_IEEE predefine (we do not support
the alternative format).
llvm-svn: 246764
Original commit message:
[ARM] Allow passing/returning of __fp16 arguments
The ACLE (ARM C Language Extensions) 2.0 allows the __fp16 type to be
used as a functon argument or return type (ACLE 1.1 did not).
The current public release of the AAPCS (2.09) states that __fp16 values
should be converted to single-precision before being passed or returned,
but AAPCS 2.10 (to be released shortly) changes this, so that they are
passed in the least-significant 16 bits of either a GPR (for base AAPCS)
or a single-precision register (for AAPCS-VFP). This does not change how
arguments are passed if they get passed on the stack.
This patch brings clang up to compliance with the latest versions of
both of these specs.
We can now set the __ARM_FP16_ARGS ACLE predefine, and we have always
been able to set the __ARM_FP16_FORMAT_IEEE predefine (we do not support
the alternative format).
llvm-svn: 246760
The ACLE (ARM C Language Extensions) 2.0 allows the __fp16 type to be
used as a functon argument or return type (ACLE 1.1 did not).
The current public release of the AAPCS (2.09) states that __fp16 values
should be converted to single-precision before being passed or returned,
but AAPCS 2.10 (to be released shortly) changes this, so that they are
passed in the least-significant 16 bits of either a GPR (for base AAPCS)
or a single-precision register (for AAPCS-VFP). This does not change how
arguments are passed if they get passed on the stack.
This patch brings clang up to compliance with the latest versions of
both of these specs.
We can now set the __ARM_FP16_ARGS ACLE predefine, and we have always
been able to set the __ARM_FP16_FORMAT_IEEE predefine (we do not support
the alternative format).
llvm-svn: 246755
Fixed codegen for extended format of 'if' clauses with special 'directive-name-modifier' + ast-print tests for extended format of 'if' clause.
llvm-svn: 246748
every time it's called rather than attempting to cache the result.
It's unlikely to be called frequently and the overhead of using
it in the first place is already factored out.
llvm-svn: 246706
This patch depends on r246688 (D12341).
The goal is to make LLVM generate different code for these functions for a target that
has cheap branches (see PR23827 for more details):
int foo();
int normal(int x, int y, int z) {
if (x != 0 && y != 0) return foo();
return 1;
}
int crazy(int x, int y) {
if (__builtin_unpredictable(x != 0 && y != 0)) return foo();
return 1;
}
Differential Revision: http://reviews.llvm.org/D12458
llvm-svn: 246699
the main attribute and cache the results so we don't have to parse
a single attribute more than once.
This reapplies r246596 with a fix for an uninitialized class member,
and a couple of cleanups and formatting changes.
llvm-svn: 246610
GCC 4.8+ has a PowerPC-specific intrinsic, __builtin_ppc_get_timebase, to do
what Clang's __builtin_readcyclecounter does. For compatibility with code that
uses GCC's spelling (including glibc), support it as well.
Partially fixes PR23681.
llvm-svn: 246510
Also:
- Add a typedef to make working with the result easier.
- Update callers to use the new function.
- Make initFeatureMap out of line.
llvm-svn: 246468
Proper diagnostic and resolution of mangled names conflicts between C++ methods
and C functions. This patch implements support for functions/methods only;
support for variables is coming separately.
Differential Revision: http://reviews.llvm.org/D11297
llvm-svn: 246438
Added codegen for array section in 'depend' clause of 'task' directive. It emits to pointers, one for the begin of array section and another for the end of array section. Size of the section is calculated as (end + 1 - start) * sizeof(basic_element_type).
llvm-svn: 246422
This replaces the filtered generic iterator with a type-specfic one based
on dyn_cast instead of comparing the kind enum. This allows us to use
range-based for loops and eliminates casts. No functionality change
intended.
llvm-svn: 246384
This patch does two things:
1) Don't error about dllimport/export on thread-local static local variables.
We put those attributes on static locals in dllimport/export functions
implicitly in case the function gets inlined. Now, for TLS variables this
is a problem because we can't import such variables, but it's a benign
problem becase:
2) Make sure we never inline a dllimport function TLS static locals. In fact,
never inline a dllimport function that references a non-imported function
or variable (because these are not defined in the importing library). This
seems to match MSVC's behaviour.
Differential Revision: http://reviews.llvm.org/D12422
llvm-svn: 246338
Added codegen for array section in 'depend' clause of 'task' directive. It emits to pointers, one for the begin of array section and another for the end of array section. Size of the section is calculated as (end + 1 - start) * sizeof(basic_element_type).
llvm-svn: 246278
There was linker problem, and it turns out that it is not always safe
to refer to vtable. If the vtable is used, then we can refer to it
without any problem, but because we don't know when it will be used or
not, we can only check if vtable is external or it is safe to to emit it
speculativly (when class it doesn't have any inline virtual functions).
It should be fixed in the future.
http://reviews.llvm.org/D12385
llvm-svn: 246214
Usually debug info is created on the fly while during codegen.
With this API it becomes possible to create standalone debug info
for types that are not referenced by any code, such as emitting debug info
for a clang module or for implementing something like -gfull.
Because on-the-fly debug info generation may still insert retained types
on top of them, all RetainedTypes are uniqued in CGDebugInfo::finalize().
llvm-svn: 246210
A couple of changes here:
a) Do less work in the case where we don't have a target attribute on the
function. We've already canonicalized the attributes for the function -
no need to do more work.
b) Use the newer canonicalized feature adding functions from TargetInfo
to do the work when we do have a target attribute. This enables us to diagnose
some warnings in the case of conflicting written attributes (only ppc does
this today) and also make sure to get all of the features for a cpu that's
listed rather than just change the cpu.
Updated all testcases accordingly and added a new testcase to verify that we'll
error out on ppc if we have some incompatible options using the existing diagnosis
framework there.
llvm-svn: 246195
to enable the use of external type references in the debug info
(a.k.a. module debugging).
The driver expands -gmodules to "-g -fmodule-format=obj -dwarf-ext-refs"
and passes that to cc1. All this does at the moment is set a flag
codegenopts.
http://reviews.llvm.org/D11958
llvm-svn: 246192
I stared at `false /*declaration*/` for quite some time before giving up
and checking the actual function to see what it meant. Replacing with
`/* isDefinition = */ false` to save myself effort later.
llvm-svn: 246095
Eventually, we will need sample profiles to be incorporated into the
inliner's cost models. To do this, we need the sample profile pass to
be a module pass.
This patch makes no functional changes beyond the mechanical adjustments
needed to run SampleProfile as a module pass.
llvm-svn: 245941
Summary:
The signatures of the methods in LLVM for creating EH pads/rets are changing
to require token arguments on rets and assume token return type on pads.
Update creation code accordingly.
Reviewers: majnemer, rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D12109
llvm-svn: 245798
Summary:
According to CUDA documentation, global variables declared with __device__,
__constant__ can be initialized from host code, so mark them as
externally initialized. Because __shared__ variables cannot have an
initialization as part of their declaration and since the value maybe kept
across different kernel invocation, the value of __shared__ is effectively
undefined instead of zero initialized.
Wrongly using zero initializer may cause illegitimate optimization, e.g.
removing unused __constant__ variable because it's not updated in the device
code and the value is initialized with zero.
Test Plan: test/CodeGenCUDA/address-spaces.cu
Patch by Xuetian Weng
Reviewers: jholewinski, eliben, tra, jingyue
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12241
llvm-svn: 245786
We had "vcvt_f16" and "VCVT_HIGH_F16": for other FP types, this naming
is used for intrinsics with integer overloads. The FP->FP conversions,
on the other hand, use the full "vcvt_f32_f64" name instead.
Use the same naming convention for the f16<->f32 conversions.
While there, reorder the definitions a little bit.
llvm-svn: 245763
This is important in the case that the LLVM-inferred llvm-struct
alignment is not the same as the clang-known C-struct alignment.
Differential Revision: http://reviews.llvm.org/D12243
llvm-svn: 245719
Add emission of metadata for simd loops in presence of 'simdlen' clause.
If 'simdlen' clause is provided without 'safelen' clause, the vectorizer width for the loop is set to value of 'simdlen' clause + all read/write ops in loop are marked with '!llvm.mem.parallel_loop_access' metadata.
If 'simdlen' clause is provided along with 'safelen' clause, the vectorizer width for the loop is set to value of 'simdlen' clause + all read/write ops in loop are not marked with '!llvm.mem.parallel_loop_access' metadata.
If 'safelen' clause is provided without 'simdlen' clause, the vectorizer width for the loop is set to value of 'safelen' clause + all read/write ops in loop are not marked with '!llvm.mem.parallel_loop_access' metadata.
llvm-svn: 245697
Add parsing/sema analysis for 'simdlen' clause in simd directives. Also add check that if both 'safelen' and 'simdlen' clauses are specified, the value of 'simdlen' parameter is less than the value of 'safelen' parameter.
llvm-svn: 245692
OpenMP 4.1 allows to use variables with reference types in all private clauses (private, firstprivate, lastprivate, linear etc.). Patch allows to use such variables and fixes codegen for linear variables with reference types.
llvm-svn: 245268
blender uses statements expression in condition of the loop under control of the '#pragma omp parallel for'. This condition is used several times in different expressions required for codegen of the loop directive. If there are some variables defined in statement expression, it fires an assert during codegen because of redefinition of the same variables.
We have to rebuild several expression to be sure that all variables are unique.
llvm-svn: 245041
Make the copy/move ctors protected and defaulted in the base, make the
derived classes final to avoid exposing any slicing-prone APIs.
Also, while I'm here, simplify the use of buildByrefHelpers by taking
the parameter by value instead of non-const ref. None of the callers
care aobut observing the state after the call.
llvm-svn: 244990
We risk iterator invalidation issues if we use a DenseMap to hold the
backing storage for an APValue. Instead, BumpPtrAllocate them and
use APValue * as our DenseMap value.
Also, don't assume that MaterializedGlobalTemporaryMap won't regrow
between when we initially perform a lookup and later on when we actually
try to insert into it.
This fixes PR24289.
Differential Revision: http://reviews.llvm.org/D11629
llvm-svn: 244989
Summary: Poisoning applied to only class members, and before dtors for base class invoked
Implement poisoning of only class members in dtor, as opposed to also
poisoning fields inherited from base classes. Members are poisoned
only once, by the last dtor for a class. Skip poisoning if class has
no fields.
Verify emitted code for derived class with virtual destructor sanitizes
its members only once.
Removed patch file containing extraneous changes.
Reviewers: eugenis, kcc
Differential Revision: http://reviews.llvm.org/D11951
Simplified test cases for use-after-dtor
Summary: Simplified test cases to focus on one feature at time.
Tests updated to align with new emission order for sanitizing
callback.
Reviewers: eugenis, kcc
Differential Revision: http://reviews.llvm.org/D12003
llvm-svn: 244933
After r244870 flush() will only compare two null pointers and return,
doing nothing but wasting run time. The call is not required any more
as the stream and its SmallString are always in sync.
Thanks to David Blaikie for reviewing.
llvm-svn: 244928
Verify emitted code for derived class with virtual destructor sanitizes its members only once.
Changed emission order for dtor callback, so only the last dtor for a class emits the sanitizing callback, while ensuring that class members are poisoned before base class destructors are invoked.
Skip poisoning of members, if class has no fields.
Removed patch file containing extraneous changes.
Summary: Poisoning applied to only class members, and before dtors for base class invoked
Reviewers: eugenis, kcc
Differential Revision: http://reviews.llvm.org/D11951
llvm-svn: 244819
Summary:
float_cast_overflow is the only UBSan check without a source location attached.
This patch propagates SourceLocations where necessary to get them to the
EmitCheck() call.
Reviewers: rsmith, ABataev, rjmccall
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D11757
llvm-svn: 244568
This patche and a related llvm patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwasyPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>’. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints.
llvm-svn: 244556
Following one of the appended options will allow the loop to be vectorized. We do not include a command line option for modifying the pointer checking threshold because there is no clang-level interface for this currently.
llvm-svn: 244526
With this patch clang appends the command line options that would allow vectorization when floating-point commutativity is required. Specifically those are enabling fast-math or specifying a loop hint.
llvm-svn: 244492
These changes are for Android x86_64 targets to be compatible
with current Android g++ and conform to AMD64 ABI.
https://llvm.org/bugs/show_bug.cgi?id=23897
* Return type of long double (fp128) should be fp128, not x86_fp80.
* Vararg of long double (fp128) could be in register and overflowed to memory.
https://llvm.org/bugs/show_bug.cgi?id=24111
* Return value of long double (fp128) _Complex should be in memory like a structure of {fp128,fp128}.
Differential Revision: http://reviews.llvm.org/D11437
llvm-svn: 244468
This change adds the new unroll metadata "llvm.loop.unroll.enable" which directs
the optimizer to unroll a loop fully if the trip count is known at compile time, and
unroll partially if the trip count is not known at compile time. This differs from
"llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not
known at compile time
With this change "#pragma unroll" generates "llvm.loop.unroll.enable" rather than
"llvm.loop.unroll.full" metadata. This changes the semantics of "#pragma unroll" slightly
to mean "unroll aggressively (fully or partially)" rather than "unroll fully or not at all".
The motivating example for this change was some internal code with a loop marked
with "#pragma unroll" which only sometimes had a compile-time trip count depending
on template magic. When the trip count was a compile-time constant, everything works
as expected and the loop is fully unrolled. However, when the trip count was not a
compile-time constant the "#pragma unroll" explicitly disabled unrolling of the loop(!).
Removing "#pragma unroll" caused the loop to be unrolled partially which was desirable
from a performance perspective.
llvm-svn: 244467
MinGW has some pretty strange behvaior around RTTI and
dllimport/dllexport:
- RTTI data is never imported
- RTTI data is only exported if the class has no key function.
llvm-svn: 244266
OpenMP 4.1 allows to use variables with reference types in private clauses and, therefore, in init expressions of the cannonical loop forms.
llvm-svn: 244209
When a thunk is generated with a call to the original adjusted function,
the thunk appears in the debugger call stack. We want the backend to perform
tail-call optimization on the call, to make it invisible to the debugger.
This fixes PR24235
Patch by: amjad.aboud@intel.com
Differential Revision: http://reviews.llvm.org/D11476
llvm-svn: 244207
Summary:
By default, 'clang' emits dwarf and 'clang-cl' emits codeview. You can
force emission of one or both by passing -gcodeview and -gdwarf to
either driver.
Reviewers: dblaikie, hans
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D11742
llvm-svn: 244097
Support for emitting libcalls for __atomic_fetch_nand and
__atomic_{add,sub,and,or,xor,nand}_fetch was missing; add it, and some
test cases.
Differential Revision: http://reviews.llvm.org/D10847
llvm-svn: 244063
a BumpPtrAllocator. This at least now handles the case where there is no
concatentation without calling memcpy on a null pointer. It might be
interesting to handle the case where everything is empty without
round-tripping through the allocator, but it wasn't clear to me if the
pointer returned is significant in any way, so I've left it in
a conservatively more-correct state.
Again, found with UBSan.
llvm-svn: 243948
Summary: In addition to checking compiler flags, the front-end also examines the attributes of the destructor definition to ensure that the SanitizeMemory attribute is attached.
Reviewers: eugenis, kcc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D11727
refactored test into new file, revised how function attribute examined
modified test to examine default dtor with and without attribute
removed attribute check
llvm-svn: 243912
This patch fixes bug 23800 ( https://llvm.org/bugs/show_bug.cgi?id=23800#c2 ). There existed a case where the index operand from extractelement was directly used to create a shufflevector mask. Since the index can be of any integral type but the mask must only contain 32 bit integers a 64 bit index operand led to an assertion error later on.
Committed on behalf of mpflanzer (Moritz Pflanzer)
Differential Revision: http://reviews.llvm.org/D10838
llvm-svn: 243851
The new EH instructions make it possible for LLVM to generate .xdata
tables that the MSVC personality routines will be happy about. Because
this is experimental, hide it behind a -cc1 flag (-fnew-ms-eh).
Differential Revision: http://reviews.llvm.org/D11405
llvm-svn: 243767
Adjust to LLVM DIBuilder API changes in r243764, using
`createAutoVariable()` and `createParameterVariable()` in place of
`createLocalVariable()`. No real functionality change here.
llvm-svn: 243765
In llvm commit r243581, a reverse range adapter was added which allows
us to change code such as
for (auto I = Fields.rbegin(), E = Fields.rend(); I != E; ++I) {
in to
for (const FieldDecl *I : llvm::reverse(Fields))
This commit changes a few of the places in clang which are eligible to use
this new adapter.
llvm-svn: 243663
new GV (usually NAME.1) instead of the correct NAME of the old GV. Moving comdat
creation after GV replacement solves this. Patch + testcase.
Reviewed by Reid Kleckner.
http://reviews.llvm.org/D11594
llvm-svn: 243525
This will be used for old targets like Android that do not
support ELF TLS models.
Differential Revision: http://reviews.llvm.org/D10524
llvm-svn: 243441
- Use cached LLVM types
- Turn SmallVectors into Arrays/ArrayRef if the size is static
- Use ConstantInt::get's implicit splatting for vector types
No functionality change intended.
llvm-svn: 243425
This was calling FD->hasBody(), meaning "Does the function that this
decl refers to have a body?", rather than
FD->doesThisDeclarationHaveABody(), meaning "Is this decl a
non-deleted definition?".
We might want to consider renaming these APIs :/
llvm-svn: 243360
When ‘#pragma clang loop vectorize(assume_safety)’ was specified on a loop other loop hints were lost. The problem is that CGLoopInfo attaches metadata differently than EmitCondBrHints in CGStmt. For do-loops CGLoopInfo attaches metadata to the br in the body block and for while and for loops, the inc block. EmitCondBrHints on the other hand always attaches data to the br in the cond block. When specifying assume_safety CGLoopInfo emits an empty llvm.loop metadata shadowing the metadata in the cond block. Loop transformations like rotate and unswitch would then eliminate the cond block and its non-empty metadata.
This patch unifies both approaches for adding metadata and modifies the existing safety tests to include non-assume_safety loop hints.
llvm-svn: 243315
__builtin_frame_address requires its argument to be a constant
expression which already implies that it cannot have undefined behavior.
However, we used EmitScalarExpr to emit the argument causing UBSan to
try to check for overflow.
Instead, use the constant expression emission system.
This fixes PR24256.
llvm-svn: 243206
Change `getOrCreateLimitedType()` to return a `DICompositeType` and
remove the casts from its callers. Inside, I've strengthened a `cast`
from `DICompositeTypeBase`, but the casts in the callers already prove
that this is safe. There should be no functionality change here.
llvm-svn: 243155
Generating available_externally vtables for optimizations purposes.
Unfortunatelly ItaniumABI doesn't guarantee that we will be able to
refer to virtual inline method by name.
But when we don't have any inline virtual methods, and key function is
not defined in this TU, we can generate that there will be vtable and
mark it as available_externally.
This is patch will help devirtualize better.
Differential Revision: http://reviews.llvm.org/D11441
llvm-svn: 243090
The catch keyword isn't really part of a region, so it's fairly
meaningless to extend into it. This was usually harmless, but it could
crash when catch blocks involved macros in strange ways.
llvm-svn: 243066
Sometimes we can provide an initializer for static locals, in which case
we sometimes might need to change the type. Changing the type requires
making a new LLVM GlobalVariable, and in this codepath we were
forgetting to transfer the comdat.
Fixes PR23838.
Patch by Ivan Garramona.
llvm-svn: 242704
This is the PS4 counterpart to r229376, which quotes the library name if the
name contains space. It was discovered that if a library name contains both
double-quote and space characters, quoting the name might produce unexpected
results, but we are mostly concerned with a Windows host environment, which
does not allow double-quote or slashes in file/folder names.
Differential Revision: http://reviews.llvm.org/D11275
llvm-svn: 242689
- Make it a proper random access iterator with a little help from iterator_adaptor_base
- Clean up users of magic dereferencing. The iterator should behave like an Expr **.
- Make it an implementation detail of Stmt. This allows inlining of the assertions.
llvm-svn: 242608
If this assert does fire, the no-asserts behaviour is an infinite
loop. It's better to crash in this case so we get a crash report and
stop wasting the user's cpu cycles.
llvm-svn: 242591
That platform has alignof(uint64_t) == 4, but, since LLVM_ALIGNAS(...)
cannot take anything but literal integers due to MSVC limitations, the
literal '8' used there didn't match. Switch ScopeStackAlignment to
just use 8, as well.
llvm-svn: 242578
Currently, -save-temp will cause ObjCARC optimization to be dropped,
sanitizer pass to run early in the pipeline, and profiling
instrumentation to run twice.
Fix the issue by properly disable all passes in the optimization
pipeline when generating bitcode output and parse some of the Language
Options even when the input is bitcode so the passes can be setup
correctly.
llvm-svn: 242565
Some const-correctness changes snuck in here too, since they were in the
area of code I was modifying.
This seems to make Clang actually work without Bus Error on
32bit-sparc.
Follow-up patches will factor out a trailing-object helper class, to
make classes using the idiom of appending objects to other objects
easier to understand, and to ensure (with static_assert) that required
alignment guarantees continue to hold.
Differential Revision: http://reviews.llvm.org/D10272
llvm-svn: 242554
We shouldn't crash despite the AMD64 ABI not giving clear guidance as to
how to pass around vector types <= 32 bits. Instead, classify such
vectors as INTEGER to be compatible with GCC.
This fixes PR24162.
llvm-svn: 242508
- introduces a new cc1 option -fmodule-format=[raw,obj]
with 'raw' being the default
- supports arbitrary module container formats that libclang is agnostic to
- adds the format to the module hash to avoid collisions
- splits the old PCHContainerOperations into PCHContainerWriter and
a PCHContainerReader.
Thanks to Richard Smith for reviewing this patch!
llvm-svn: 242499
We now use the sanitizer special case list to decide which types to blacklist.
We also support a special blacklist entry for types with a uuid attribute,
which are generally COM types whose virtual tables are defined externally.
Differential Revision: http://reviews.llvm.org/D11096
llvm-svn: 242286
This patch corresponds to review:
http://reviews.llvm.org/D11184
A number of new interfaces for altivec.h (as mandated by the ABI):
vector float vec_cpsgn(vector float, vector float)
vector double vec_cpsgn(vector double, vector double)
vector double vec_or(vector bool long long, vector double)
vector double vec_or(vector double, vector bool long long)
vector double vec_re(vector double)
vector signed char vec_cntlz(vector signed char)
vector unsigned char vec_cntlz(vector unsigned char)
vector short vec_cntlz(vector short)
vector unsigned short vec_cntlz(vector unsigned short)
vector int vec_cntlz(vector int)
vector unsigned int vec_cntlz(vector unsigned int)
vector signed long long vec_cntlz(vector signed long long)
vector unsigned long long vec_cntlz(vector unsigned long long)
vector signed char vec_nand(vector bool signed char, vector signed char)
vector signed char vec_nand(vector signed char, vector bool signed char)
vector signed char vec_nand(vector signed char, vector signed char)
vector unsigned char vec_nand(vector bool unsigned char, vector unsigned char)
vector unsigned char vec_nand(vector unsigned char, vector bool unsigned char)
vector unsigned char vec_nand(vector unsigned char, vector unsigned char)
vector short vec_nand(vector bool short, vector short)
vector short vec_nand(vector short, vector bool short)
vector short vec_nand(vector short, vector short)
vector unsigned short vec_nand(vector bool unsigned short, vector unsigned short)
vector unsigned short vec_nand(vector unsigned short, vector bool unsigned short)
vector unsigned short vec_nand(vector unsigned short, vector unsigned short)
vector int vec_nand(vector bool int, vector int)
vector int vec_nand(vector int, vector bool int)
vector int vec_nand(vector int, vector int)
vector unsigned int vec_nand(vector bool unsigned int, vector unsigned int)
vector unsigned int vec_nand(vector unsigned int, vector bool unsigned int)
vector unsigned int vec_nand(vector unsigned int, vector unsigned int)
vector signed long long vec_nand(vector bool long long, vector signed long long)
vector signed long long vec_nand(vector signed long long, vector bool long long)
vector signed long long vec_nand(vector signed long long, vector signed long long)
vector unsigned long long vec_nand(vector bool long long, vector unsigned long long)
vector unsigned long long vec_nand(vector unsigned long long, vector bool long long)
vector unsigned long long vec_nand(vector unsigned long long, vector unsigned long long)
vector signed char vec_orc(vector bool signed char, vector signed char)
vector signed char vec_orc(vector signed char, vector bool signed char)
vector signed char vec_orc(vector signed char, vector signed char)
vector unsigned char vec_orc(vector bool unsigned char, vector unsigned char)
vector unsigned char vec_orc(vector unsigned char, vector bool unsigned char)
vector unsigned char vec_orc(vector unsigned char, vector unsigned char)
vector short vec_orc(vector bool short, vector short)
vector short vec_orc(vector short, vector bool short)
vector short vec_orc(vector short, vector short)
vector unsigned short vec_orc(vector bool unsigned short, vector unsigned short)
vector unsigned short vec_orc(vector unsigned short, vector bool unsigned short)
vector unsigned short vec_orc(vector unsigned short, vector unsigned short)
vector int vec_orc(vector bool int, vector int)
vector int vec_orc(vector int, vector bool int)
vector int vec_orc(vector int, vector int)
vector unsigned int vec_orc(vector bool unsigned int, vector unsigned int)
vector unsigned int vec_orc(vector unsigned int, vector bool unsigned int)
vector unsigned int vec_orc(vector unsigned int, vector unsigned int)
vector signed long long vec_orc(vector bool long long, vector signed long long)
vector signed long long vec_orc(vector signed long long, vector bool long long)
vector signed long long vec_orc(vector signed long long, vector signed long long)
vector unsigned long long vec_orc(vector bool long long, vector unsigned long long)
vector unsigned long long vec_orc(vector unsigned long long, vector bool long long)
vector unsigned long long vec_orc(vector unsigned long long, vector unsigned long long)
vector signed char vec_div(vector signed char, vector signed char)
vector unsigned char vec_div(vector unsigned char, vector unsigned char)
vector signed short vec_div(vector signed short, vector signed short)
vector unsigned short vec_div(vector unsigned short, vector unsigned short)
vector signed int vec_div(vector signed int, vector signed int)
vector unsigned int vec_div(vector unsigned int, vector unsigned int)
vector signed long long vec_div(vector signed long long, vector signed long long)
vector unsigned long long vec_div(vector unsigned long long, vector unsigned long long)
vector unsigned char vec_mul(vector unsigned char, vector unsigned char)
vector unsigned int vec_mul(vector unsigned int, vector unsigned int)
vector unsigned long long vec_mul(vector unsigned long long, vector unsigned long long)
vector unsigned short vec_mul(vector unsigned short, vector unsigned short)
vector signed char vec_mul(vector signed char, vector signed char)
vector signed int vec_mul(vector signed int, vector signed int)
vector signed long long vec_mul(vector signed long long, vector signed long long)
vector signed short vec_mul(vector signed short, vector signed short)
vector signed long long vec_mergeh(vector signed long long, vector signed long long)
vector signed long long vec_mergeh(vector signed long long, vector bool long long)
vector signed long long vec_mergeh(vector bool long long, vector signed long long)
vector unsigned long long vec_mergeh(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_mergeh(vector unsigned long long, vector bool long long)
vector unsigned long long vec_mergeh(vector bool long long, vector unsigned long long)
vector double vec_mergeh(vector double, vector double)
vector double vec_mergeh(vector double, vector bool long long)
vector double vec_mergeh(vector bool long long, vector double)
vector signed long long vec_mergel(vector signed long long, vector signed long long)
vector signed long long vec_mergel(vector signed long long, vector bool long long)
vector signed long long vec_mergel(vector bool long long, vector signed long long)
vector unsigned long long vec_mergel(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_mergel(vector unsigned long long, vector bool long long)
vector unsigned long long vec_mergel(vector bool long long, vector unsigned long long)
vector double vec_mergel(vector double, vector double)
vector double vec_mergel(vector double, vector bool long long)
vector double vec_mergel(vector bool long long, vector double)
vector signed int vec_pack(vector signed long long, vector signed long long)
vector unsigned int vec_pack(vector unsigned long long, vector unsigned long long)
vector bool int vec_pack(vector bool long long, vector bool long long)
llvm-svn: 242171
The fix is to remove duplicate copy-initialization of the only memcpy-able struct member and to correct the address of aggregately initialized members in destructors' calls during stack unwinding (in order to obtain address of struct member by using GEP instead of 'bitcast').
Differential Revision: http://reviews.llvm.org/D10990
llvm-svn: 242127
which is actually the variable backing up the llvm -time-passes command
line argument.
llvm::TimePassesIsEnabled is actually being initialized in CodeGenAction.
llvm-svn: 242099
Under the -fsanitize-memory-use-after-dtor (disabled by default) insert
an MSan runtime library call at the end of every destructor.
Patch by Naomi Musgrave.
llvm-svn: 242097
Previously, clang/llvm treated inline-asm instructions conservatively,
choosing not to eliminate the instructions or hoisting them out of a loop
even when it was safe to do so. This commit makes changes to attach a
readonly or readnone attribute to an inline-asm instruction, which enables
passes such as LICM and EarlyCSE to move or optimize away the instruction.
rdar://problem/11358192
Differential Revision: http://reviews.llvm.org/D10546
llvm-svn: 241930
tools/clang/test/CodeGen/packed-nest-unpacked.c contains this test:
struct XBitfield {
unsigned b1 : 10;
unsigned b2 : 12;
unsigned b3 : 10;
};
struct YBitfield {
char x;
struct XBitfield y;
} __attribute((packed));
struct YBitfield gbitfield;
unsigned test7() {
// CHECK: @test7
// CHECK: load i32, i32* getelementptr inbounds (%struct.YBitfield, %struct.YBitfield* @gbitfield, i32 0, i32 1, i32 0), align 4
return gbitfield.y.b2;
}
The "align 4" is actually wrong. Accessing all of "gbitfield.y" as a single
i32 is of course possible, but that still doesn't make it 4-byte aligned as
it remains packed at offset 1 in the surrounding gbitfield object.
This alignment was changed by commit r169489, which also introduced changes
to bitfield access code in CGExpr.cpp. Code before that change used to take
into account *both* the alignment of the field to be accessed within the
current struct, *and* the alignment of that outer struct itself; this logic
was removed by the above commit.
Neglecting to consider both values can cause incorrect code to be generated
(I've seen an unaligned access crash on SystemZ due to this bug).
In order to always use the best known alignment value, this patch removes
the CGBitFieldInfo::StorageAlignment member and replaces it with a
StorageOffset member specifying the offset from the start of the surrounding
struct to the bitfield's underlying storage. This offset can then be combined
with the best-known alignment for a bitfield access lvalue to determine the
alignment to use when accessing the bitfield's storage.
Differential Revision: http://reviews.llvm.org/D11034
llvm-svn: 241916
Code in CGCall.cpp that loads up function arguments that need to be
coerced to a different type may in some cases ignore the fact that
the source of the argument is not naturally aligned. This may cause
incorrect code to be generated. In some places in CreateCoercedLoad,
we already have setAlignment calls to address this, but I ran into one
where it was missing, causing wrong code generation on SystemZ.
However, in that location, we do not actually know what alignment of
the source location we can rely on; the callers do not pass anything
to this routine. This is already an issue in other places in
CreateCoercedLoad; and the same problem exists for CreateCoercedStore.
To avoid pessimising code, and to fix the FIXMEs already in place,
this patch also adds an alignment argument to the CreateCoerced*
routines and uses it instead of forcing an alignment of 1. The
callers are changed to pass in the best information they have.
This actually requires changes in a number of existing test cases
since we now get better alignment in many places.
Differential Revision: http://reviews.llvm.org/D11033
llvm-svn: 241898
We were previously creating bit set entries at virtual table offset
sizeof(void*) unconditionally under the Microsoft C++ ABI. This is incorrect
if RTTI data is disabled; in that case the "address point" is at offset
0. This change modifies bit set emission to take into account whether RTTI
data is being emitted.
Also make a start on a blacklisting scheme for records.
Differential Revision: http://reviews.llvm.org/D11048
llvm-svn: 241845
Move the diagnostic back to codegen so that we can compile ATL on the
self-host bot. We don't actually end up emitting code for the __try, so
the diagnostic won't be hit.
llvm-svn: 241761
For Mips direct-to-nacl, the goal is to be close to le32 front-end and
use Mips32EL backend. This patch defines new NaClMips32ELTargetInfo and
modifies it slightly to be close to le32. It also adds necessary parts,
inline with ARM and X86.
Differential Revision: http://reviews.llvm.org/D10739
llvm-svn: 241678
The fix is to emit cleanup for arrays of memcpy-able objects in struct if an exception is thrown later during copy-construction.
Differential Revision: http://reviews.llvm.org/D10989
llvm-svn: 241670
We didn't correctly process the case where a base class is classified as
MEMORY. This would cause us to trip over an assertion.
This fixes PR24020.
Differential Revision: http://reviews.llvm.org/D10907
llvm-svn: 241667
We forgot to run postMerge after decided that the union had to be
classified as MEMORY. This left us with Lo == MEMORY and Hi == SSEUp
which is an invalid combination.
This fixes PR24021.
Differential Revision: http://reviews.llvm.org/D10908
llvm-svn: 241666
This patch adds ObjectFilePCHContainerOperations uses the LLVM backend
to put the contents of a PCH into a __clangast section inside a COFF, ELF,
or Mach-O object file container.
This is done to facilitate module debugging by makeing it possible to
store the debug info for the types defined by a module alongside the AST.
rdar://problem/20091852
llvm-svn: 241620
When messaging a method that was defined in an Objective-C class (or
category or extension thereof) that has type parameters, substitute
the type arguments for those type parameters. Similarly, substitute
into property accesses, instance variables, and other references.
This includes general infrastructure for substituting the type
arguments associated with an ObjCObject(Pointer)Type into a type
referenced within a particular context, handling all of the
substitutions required to deal with (e.g.) inheritance involving
parameterized classes. In cases where no type arguments are available
(e.g., because we're messaging via some unspecialized type, id, etc.),
we substitute in the type bounds for the type parameters instead.
Example:
@interface NSSet<T : id<NSCopying>> : NSObject <NSCopying>
- (T)firstObject;
@end
void f(NSSet<NSString *> *stringSet, NSSet *anySet) {
[stringSet firstObject]; // produces NSString*
[anySet firstObject]; // produces id<NSCopying> (the bound)
}
When substituting for the type parameters given an unspecialized
context (i.e., no specific type arguments were given), substituting
the type bounds unconditionally produces type signatures that are too
strong compared to the pre-generics signatures. Instead, use the
following rule:
- In covariant positions, such as method return types, replace type
parameters with “id” or “Class” (the latter only when the type
parameter bound is “Class” or qualified class, e.g,
“Class<NSCopying>”)
- In other positions (e.g., parameter types), replace type
parameters with their type bounds.
- When a specialized Objective-C object or object pointer type
contains a type parameter in its type arguments (e.g.,
NSArray<T>*, but not NSArray<NSString *> *), replace the entire
object/object pointer type with its unspecialized version (e.g.,
NSArray *).
llvm-svn: 241543
Produce type parameter declarations for Objective-C type parameters,
and attach lists of type parameters to Objective-C classes,
categories, forward declarations, and extensions as
appropriate. Perform semantic analysis of type bounds for type
parameters, both in isolation and across classes/categories/extensions
to ensure consistency.
Also handle (de-)serialization of Objective-C type parameter lists,
along with sundry other things one must do to add a new declaration to
Clang.
Note that Objective-C type parameters are typedef name declarations,
like typedefs and C++11 type aliases, in support of type erasure.
Part of rdar://problem/6294649.
llvm-svn: 241541
different function signatures. (Previously clang would emit all block
pointer types with the type of the first block pointer in the compile
unit.)
rdar://problem/21602473
llvm-svn: 241534
This reverts commit r241244, but restricts SEH support to Win64.
This way, Chromium builds will still fall back on TUs with SEH, and
Clang developers can work on this incrementally upstream while patching
this small predicate locally. It'll also make it easier to review small
fixes.
llvm-svn: 241533
The patch is the same except for the addition of a new test for the
issue that required reverting the dependent llvm commit.
--Original Commit Message--
Pass down the -flto option to the -cc1 job, and from there into the
CodeGenOptions and onto the PassManagerBuilder. This enables gating
the new EliminateAvailableExternally module pass on whether we are
preparing for LTO.
If we are preparing for LTO (e.g. a -flto -c compile), the new pass is not
included as we want to preserve available externally functions for possible
link time inlining.
llvm-svn: 241467
This is needed to use clang's command line option "-ftrap-function" for LTO and
enable changing the trap function name on a per-call-site basis.
rdar://problem/21225723
Differential Revision: http://reviews.llvm.org/D10831
llvm-svn: 241306
While there replace stable_sort of std::string with just sort, stability
is not necessary for "simple" value types. No functional change intended.
llvm-svn: 241299
The next code is generated for this construct:
```
if (__kmpc_cancellationpoint(ident_t *loc, kmp_int32 global_tid, kmp_int32 cncl_kind) != 0)
<exit from outer innermost construct>;
```
llvm-svn: 241239
32-bit finally funclets are intended to be called both directly from the
parent function and indirectly from the EH runtime. Because we aren't
contorting LLVM's X86 prologue to match MSVC's, calling the finally
block directly passes in a different value of EBP than the one that the
runtime provides. We need an adapter thunk to adjust EBP to the expected
value. However, WinEHPrepare already has to solve this problem when
cleanups are not pre-outlined, so we can go ahead and rely on it rather
than duplicating work.
Now we only do the llvm.x86.seh.recoverfp dance for 32-bit SEH filter
functions.
llvm-svn: 241187
This re-lands r236052 and adds support for __exception_code().
In 32-bit SEH, the exception code is not available in eax. It is only
available in the filter function, and now we arrange to load it and
store it into an escaped variable in the parent frame.
As a consequence, we have to disable the "catch i8* null" optimization
on 32-bit and always generate a filter function. We can re-enable the
optimization if we detect an __except block that doesn't use the
exception code, but this probably isn't worth optimizing.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D10852
llvm-svn: 241171
Function static variables, typedefs and records (class, struct or union) declared inside
a lexical scope were associated with the function as their parent scope, rather than the
lexical scope they are defined or declared in.
This fixes PR19238
Patch by: amjad.aboud@intel.com
Differential Revision: http://reviews.llvm.org/D9760
llvm-svn: 241154
When an internal-linkage thunk is code gen'd, CodeGenVTables::emitThunk
will first be called with ForVTable=true (which incorrectly set the
thunk's linkage to available_externally under the Itanium ABI) and later
with ForVTable=false (which reset it to internal). Because we will always
see a call with ForVTable=false, this incorrect linkage never ended up in
the final IR. However, the temporary presence of this linkage caused us
to give such functions a comdat as a result of code introduced in r241102.
To avoid this, check that the thunk is externally visible before giving it
available_externally linkage.
llvm-svn: 241136
The LifetimeExtendedCleanupHeader is carefully fit into 32 bytes,
meaning that cleanups on the LifetimeExtendedCleanupStack are *always*
allocated at a misaligned address and cause undefined behaviour.
There are two ways to solve this - add padding after the header when
we allocated our cleanups, or just simplify the header and let it use
64 bits in the first place. I've opted for the latter, and added a
static assert to avoid the issue in the future.
llvm-svn: 241133
using a string map to canonicalize. Fix up a couple of testcases
that needed changing since we are no longer simply appending features
to the list, but all of their mask dependencies as well.
llvm-svn: 241129
Previously we were not assigning a comdat to thunks in the Microsoft ABI,
which would have required us to emit these functions outside of a comdat.
(Due to an inconsistency in how we were emitting objects, we were getting this
right most of the time, but only when compiling with function sections.) This
code generator change causes us to create a comdat for each thunk.
Differential Revision: http://reviews.llvm.org/D10829
llvm-svn: 241102
This allows a module-aware debugger such as LLDB to import the currently
visible modules before dropping into the expression evaluator.
rdar://problem/20965932
llvm-svn: 241084
isTriviallyRecursive is a hack used to bridge a gap between the
expectations that source code assumes and the semantics that LLVM IR can
provide. Specifically, asm labels on functions are treated as an
explicit name for a GlobalObject in Clang but treated like an
output-processing step in GCC. Tweak this hack a little further to emit
calls to library functions instead of emitting an incorrect definition.
The definition in question would have available_externally linkage (this
is OK) but result in a call to itself which will either result in an
infinite loop or stack overflow.
This fixes PR23964.
llvm-svn: 241043
MSVC only genreates array cookies if the class has a destructor. This
is problematic when having to call T::operator delete[](void *, size_t)
because the second argument's argument is impossible to synthesize
correctly if the class has no destructor (because there will be no array
cookie).
Instead, MSVC passes the size of the class. Do the same, for
compatibility, instead of crashing.
This fixes PR23990.
llvm-svn: 241038
This matches the implementation of the gcc support for the same
feature, including checking the values set up by libgcc at runtime.
The structure looks like this:
unsigned int __cpu_vendor;
unsigned int __cpu_type;
unsigned int __cpu_subtype;
unsigned int __cpu_features[1];
with a set of enums to match various fields that are field out after
parsing the output of the cpuid instruction.
This also adds a set of errors checking for valid input (and cpu).
compiler-rt support for this and the other builtins in this family
(__builtin_cpu_init and __builtin_cpu_is) are forthcoming.
llvm-svn: 240994
We failed to see that we should have deferred the creation of a type
which references a type currently under construction because of atomic
sugar.
This fixes PR23985.
llvm-svn: 240989
We had two separate paths for member pointer conversion: one which
takes a constant and another which takes an arbitrary value. In the
latter case, we are permitted to construct arbitrary instructions.
It turns out that the bulk of the member pointer conversion is sharable
if we construct an artificial IRBuilder.
llvm-svn: 240921
This patch corresponds to review:
http://reviews.llvm.org/D10637
This is the first round of additions of missing builtins listed in the ABI document. More to come (this builds onto what seurer already addes). This patch adds:
vector signed long long vec_abs(vector signed long long)
vector double vec_abs(vector double)
vector signed long long vec_add(vector signed long long, vector signed long long)
vector unsigned long long vec_add(vector unsigned long long, vector unsigned long long)
vector double vec_add(vector double, vector double)
vector double vec_and(vector bool long long, vector double)
vector double vec_and(vector double, vector bool long long)
vector double vec_and(vector double, vector double)
vector signed long long vec_and(vector signed long long, vector signed long long)
vector double vec_andc(vector bool long long, vector double)
vector double vec_andc(vector double, vector bool long long)
vector double vec_andc(vector double, vector double)
vector signed long long vec_andc(vector signed long long, vector signed long long)
vector double vec_ceil(vector double)
vector bool long long vec_cmpeq(vector double, vector double)
vector bool long long vec_cmpge(vector double, vector double)
vector bool long long vec_cmpge(vector signed long long, vector signed long long)
vector bool long long vec_cmpge(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmpgt(vector double, vector double)
vector bool long long vec_cmple(vector double, vector double)
vector bool long long vec_cmple(vector signed long long, vector signed long long)
vector bool long long vec_cmple(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmplt(vector double, vector double)
vector bool long long vec_cmplt(vector signed long long, vector signed long long)
vector bool long long vec_cmplt(vector unsigned long long, vector unsigned long long)
llvm-svn: 240821
Skip calls to HasTrivialDestructorBody() in the case where the
destructor is never invoked. Alternatively, Richard proposed to change
Sema to declare a trivial destructor for anonymous union member, which
seems too wasteful.
Differential Revision: http://reviews.llvm.org/D10508
llvm-svn: 240742
When a profile file cannot be opened, we used to display just the error
message but not the name of the profile the compiler was trying to open.
This will become useful in the next set of patches that introduce
GCC-compatible flags to specify profiles.
llvm-svn: 240715
Integer variants are implemented as atomicrmw or cmpxchg instructions.
Atomic add for floating point (__nvvm_atom_add_gen_f()) is implemented
as a call to an overloaded @llvm.nvvm.atomic.load.add.f32.* LVVM
intrinsic.
Differential Revision: http://reviews.llvm.org/D10666
llvm-svn: 240669
Summary:
Byval argument pair formation assumes that if a type is less than 8 bytes
it must be an integer and not a pointer, which is not true for x32 and NaCl.
Relax the assertion and add a test for a codegen case that triggered it.
Reviewers: jvoung
Subscribers: jfb, cfe-commits
Differential Revision: http://reviews.llvm.org/D10701
llvm-svn: 240600
If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list,kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task().
If this directive has associated 'if' clause then also before a call of kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called.
Array sections are not supported yet.
llvm-svn: 240532
This fixes a serious bug in r240462: checking the BuiltinID for
ARM::BI_MoveToCoprocessor* in EmitBuiltinExpr() ignores the fact that
each target has an overlapping range of the BuiltinID values. That check
can trigger for builtins from other targets, leading to very bad
behavior.
Part of the reason I did not implement r240462 this way to begin with is
the special handling of the last argument for Neon builtins. In this
change, I have factored out the check to see which builtins have that
extra argument into a new HasExtraNeonArgument() function. There is still
some awkwardness in having to check for those builtins in two separate
places, i.e., once to see if the extra argument is present and once to
generate the appropriate IR, but this seems much cleaner than my previous
patch.
llvm-svn: 240522
The Microsoft-extension _MoveToCoprocessor and _MoveToCoprocessor2
builtins take the register value to be moved as the first argument,
but the corresponding mcr and mcr2 LLVM intrinsics expect that value
to be the third argument. Handle this as a special case, while still
leaving those intrinsics as generic MSBuiltins. I considered the
alternative of handling these in EmitARMBuiltinExpr, but that does
not work well for the follow-up change that I'm going to make to improve
the error handling for PR22560 -- we need the GetBuiltinType() checks
for ICEArguments, and the ARM version of that code is only used for
Neon intrinsics where the last argument is special and not
checked in the normal way.
llvm-svn: 240462
Virtual inheritance member pointers are always relative to the vbindex,
even when the member pointer doesn't point into a virtual base. This is
corrected by adjusting the non-virtual offset backwards from the vbptr
back to the top of the most derived class. While we performed this
adjustment when manifesting member pointers as constants or when
performing conversions, we didn't perform the adjustment when mangling
them.
llvm-svn: 240453
Parsing and sema analysis (without support for array sections in arguments) for 'depend' clause (used in 'task' directive, OpenMP 4.0).
llvm-svn: 240409
Member pointers in the MS ABI are made complicated due to the following:
- Virtual methods in the most derived class (MDC) might live in a
vftable in a virtual base.
- There are four different representations of member pointer: single
inheritance, multiple inheritance, virtual inheritance and the "most
general" representation.
- Bases might have a *more* general representation than classes which
derived from them, a most surprising result.
We believed that we could treat all member pointers as-if they were a
degenerate case of the multiple inheritance model. This fell apart once
we realized that implementing standard member pointers using this ABI
requires referencing members with a non-zero vbindex.
On a bright note, all but the virtual inheritance model operate rather
similarly. The virtual inheritance member pointer representation
awkwardly requires a virtual base adjustment in order to refer to
entities in the MDC.
However, the first virtual base might be quite far from the start of the
virtual base. This means that we must add a negative non-virtual
displacement.
However, things get even more complicated. The most general
representation interprets vbindex zero differently from the virtual
inheritance model: it doesn't reference the vbtable at all.
It turns out that this complexity can increase for quite some time:
consider a derived to base conversion from the most general model to the
multiple inheritance model...
To manage this complexity we introduce a concept of "normalized" member
pointer which allows us to treat all three models as the most general
model. Then we try to figure out how to map this generalized member
pointer onto the destination member pointer model. I've done my best to
furnish the code with comments explaining why each adjustment is
performed.
This fixes PR23878.
llvm-svn: 240384
The MS ABI has very complicated member pointers. Don't attempt to
synthesize the final member pointer ab ovo usque ad mala in one go.
Instead, start with a member pointer which points to the declaration in
question as-if it's decl context was the target class. Then, utilize
our conversion logical to convert it to the target type.
This allows us to simplify how we think about member pointers because we
don't need to consider non-zero nv adjustments before we even generate
the member pointer. Furthermore, it gives our adjustment logic more
exposure by utilizing it in a common path.
llvm-svn: 240383
As specified in the SysV AVX512 ABI drafts. It follows the same scheme
as AVX2:
Arguments of type __m512 are split into eight eightbyte chunks.
The least significant one belongs to class SSE and all the others
to class SSEUP.
This also means we change the OpenMP SIMD default alignment on AVX512.
Based on r240337.
Differential Revision: http://reviews.llvm.org/D9894
llvm-svn: 240338
The patch is generated using this command:
$ tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
work/llvm/tools/clang
To reduce churn, not touching namespaces spanning less than 10 lines.
llvm-svn: 240270
Testcase provided, in the PR, by Christian Shelton and
reduced by David Majnemer.
PR: 23584
Differential Revision: http://reviews.llvm.org/D10508
Reviewed by: rnk
llvm-svn: 240242
This patch adds initial support for the -fsanitize=kernel-address flag to Clang.
Right now it's quite restricted: only out-of-line instrumentation is supported, globals are not instrumented, some GCC kasan flags are not supported.
Using this patch I am able to build and boot the KASan tree with LLVMLinux patches from github.com/ramosian-glider/kasan/tree/kasan_llvmlinux.
To disable KASan instrumentation for a certain function attribute((no_sanitize("kernel-address"))) can be used.
llvm-svn: 240131
Clang's control flow integrity implementation works by conceptually attaching
"tags" (in the form of bitset entries) to each virtual table, identifying
the names of the classes that the virtual table is compatible with. Under
the Itanium ABI, it is simple to assign tags to virtual tables; they are
simply the address points, which are available via VTableLayout. Because any
overridden methods receive an entry in the derived class's virtual table,
a check for an overridden method call can always be done by checking the
tag of whichever derived class overrode the method call.
The Microsoft ABI is a little different, as it does not directly use address
points, and overrides in a derived class do not cause new virtual table entries
to be added to the derived class; instead, the slot in the base class is
reused, and the compiler needs to adjust the this pointer at the call site
to (generally) the base class that initially defined the method. After the
this pointer has been adjusted, we cannot check for the derived class's tag,
as the virtual table may not be compatible with the derived class. So we
need to determine which base class we have been adjusted to.
Specifically, at each call site, we use ASTRecordLayout to identify the most
derived class whose virtual table is laid out at the "this" pointer offset
we are using to make the call, and check the virtual table for that tag.
Because address point information is unavailable, we "reconstruct" it as
follows: any virtual tables we create for a non-derived class receive a tag
for that class, and virtual tables for a base class inside a derived class
receive a tag for the base class, together with tags for any derived classes
which are laid out at the same position as the derived class (and therefore
have compatible virtual tables).
Differential Revision: http://reviews.llvm.org/D10520
llvm-svn: 240117
This causes programs compiled with this flag to print a diagnostic when
a control flow integrity check fails instead of aborting. Diagnostics are
printed using UBSan's runtime library.
The main motivation of this feature over -fsanitize=vptr is fidelity with
the -fsanitize=cfi implementation: the diagnostics are printed under exactly
the same conditions as those which would cause -fsanitize=cfi to abort the
program. This means that the same restrictions apply regarding compiling
all translation units with -fsanitize=cfi, cross-DSO virtual calls are
forbidden, etc.
Differential Revision: http://reviews.llvm.org/D10268
llvm-svn: 240109
This flag controls whether a given sanitizer traps upon detecting
an error. It currently only supports UBSan. The existing flag
-fsanitize-undefined-trap-on-error has been made an alias of
-fsanitize-trap=undefined.
This change also cleans up some awkward behavior around the combination
of -fsanitize-trap=undefined and -fsanitize=undefined. Previously we
would reject command lines containing the combination of these two flags,
as -fsanitize=vptr is not compatible with trapping. This required the
creation of -fsanitize=undefined-trap, which excluded -fsanitize=vptr
(and -fsanitize=function, but this seems like an oversight).
Now, -fsanitize=undefined is an alias for -fsanitize=undefined-trap,
and if -fsanitize-trap=undefined is specified, we treat -fsanitize=vptr
as an "unsupported" flag, which means that we error out if the flag is
specified explicitly, but implicitly disable it if the flag was implied
by -fsanitize=undefined.
Differential Revision: http://reviews.llvm.org/D10464
llvm-svn: 240105
The most general model has fields for the vbptr offset and the vbindex.
Don't initialize the vbptr offset if the vbindex is 0: we aren't
referencing an entity from a vbase.
Getting this wrong can make member pointer equality fail.
llvm-svn: 240043
Added parsing, sema analysis and codegen for '#pragma omp taskgroup' directive (OpenMP 4.0).
The code for directive is generated the following way:
#pragma omp taskgroup
<body>
void __kmpc_taskgroup(<loc>, thread_id);
<body>
void __kmpc_end_taskgroup(<loc>, thread_id);
llvm-svn: 240011
Added codegen for combined 'omp for simd' directives, that is a combination of 'omp for' directive followed by 'omp simd' directive. Includes support for all clauses.
llvm-svn: 239990
The following code is generated for reduction clause within 'omp simd' loop construct:
#pragma omp simd reduction(op:var)
for (...)
<body>
alloca priv_var
priv_var = <initial reduction value>;
<loop_start>:
<body> // references to original 'var' are replaced by 'priv_var'
<loop_end>:
var op= priv_var;
llvm-svn: 239881
Previously the last iteration for simd loop-based OpenMP constructs were generated as a separate code. This feature is not required and codegen is simplified.
llvm-svn: 239810
We were propagating the coverage map into the body of an if statement,
but not into the condition thereafter. This is fine as long as the two
locations are in the same virtual file, but they won't be when the
"if" part of the statement is from a macro and the condition is not.
llvm-svn: 239803
This patch adds the -fsanitize=safe-stack command line argument for clang,
which enables the Safe Stack protection (see http://reviews.llvm.org/D6094
for the detailed description of the Safe Stack).
This patch is our implementation of the safe stack on top of Clang. The
patches make the following changes:
- Add -fsanitize=safe-stack and -fno-sanitize=safe-stack options to clang
to control safe stack usage (the safe stack is disabled by default).
- Add __attribute__((no_sanitize("safe-stack"))) attribute to clang that can be
used to disable the safe stack for individual functions even when enabled
globally.
Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.
Differential Revision: http://reviews.llvm.org/D6095
llvm-svn: 239762
in section 10.1, __arm_{w,r}sr{,p,64}.
This includes arm_acle.h definitions with builtins and codegen to support
these, the intrinsics are implemented by generating read/write_register calls
which get appropriately lowered in the backend based on the register string
provided. SemaChecking is also implemented to fault invalid parameters.
Differential Revision: http://reviews.llvm.org/D9697
llvm-svn: 239737
Instead, just EvaluateAsInt().
Follow-up to r239549: rsmith points out that isICE() is expensive;
seems like it's not the right concept anyway, as it fails on
`static const' in C, and will actually trigger the assert below on:
test/Sema/inline-asm-validate-x86.c
llvm-svn: 239651
Summary:
In addition to easier syntax, IRBuilder makes sure to set correct
debug locations for newly added instructions (bitcast and
llvm.lifetime itself). This restores the original behavior, which
was modified by r234581 (reapplied as r235553).
Extend one of the tests to check for debug locations.
Test Plan: regression test suite
Reviewers: aadg, dblaikie
Subscribers: cfe-commits, majnemer
Differential Revision: http://reviews.llvm.org/D10418
llvm-svn: 239643
If llvm.lifetime.end turns out to be the first instruction in the last
basic block, we can decrement the iterator twice, going past rend.
At the moment, this can never happen because llvm.lifetime.end always
goes immediately after bitcast, but relying on this is very brittle.
llvm-svn: 239638
Right now we're ignoring the fpmath attribute since there's no
backend support for a feature like this and to do so would require
checking the validity of the strings and doing general subtarget
feature parsing of valid and invalid features with the target
attribute feature.
llvm-svn: 239582
Modeled after the gcc attribute of the same name, this feature
allows source level annotations to correspond to backend code
generation. In llvm particular parlance, this allows the adding
of subtarget features and changing the cpu for a particular function
based on source level hints.
This has been added into the existing support for function level
attributes without particular verification for any target outside
of whether or not the backend will support the features/cpu given
(similar to section, etc).
llvm-svn: 239579
Specifying #pragma clang loop vectorize(assume_safety) on a loop adds the
mem.parallel_loop_access metadata to each load/store operation in the loop. This
metadata tells loop access analysis (LAA) to skip memory dependency checking.
llvm-svn: 239572
For inline assembly immediate constraints, we currently always use
EmitScalarExpr, instead of directly emitting the constant. When the
overflow sanitizer is enabled, this generates overflow intrinsics
instead of constants.
Instead, emit a constant for constraints that either require an
immediate (e.g. 'I' on X86), or only accepts constants (immediate
or symbolic; i.e., don't accept registers or memory).
Fixes PR19763.
Differential Revision: http://reviews.llvm.org/D10255
llvm-svn: 239549
Remove the restriction which forbade forming pointers to member
functions which had parameter types or return types which were not
convertible.
llvm-svn: 239499
CodeGenOptions and onto the PassManagerBuilder. This enables gating
the new EliminateAvailableExternally module pass on whether we are
preparing for LTO.
If we are preparing for LTO (e.g. a -flto -c compile), the new pass is not
included as we want to preserve available externally functions for possible
link time inlining.
llvm-svn: 239481
Based on previous discussion on the mailing list, clang currently lacks support
for C99 partial re-initialization behavior:
Reference: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-April/029188.html
Reference: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_253.htm
This patch attempts to fix this problem.
Given the following code snippet,
struct P1 { char x[6]; };
struct LP1 { struct P1 p1; };
struct LP1 l = { .p1 = { "foo" }, .p1.x[2] = 'x' };
// this example is adapted from the example for "struct fred x[]" in DR-253;
// currently clang produces in l: { "\0\0x" },
// whereas gcc 4.8 produces { "fox" };
// with this fix, clang will also produce: { "fox" };
Differential Review: http://reviews.llvm.org/D5789
llvm-svn: 239446
This commit adds back the code that seems to have been dropped unintentionally
in r176985.
rdar://problem/13752163
Differential Revision: http://reviews.llvm.org/D10100
llvm-svn: 239426
On ARM/AArch64, we currently always use EmitScalarExpr for the immediate
builtin arguments, instead of directly emitting the constant. When the
overflow sanitizer is enabled, this generates overflow intrinsics
instead of constants, breaking assumptions in various places.
Instead, use the knowledge of "immediates" to directly emit a constant:
- teach the tablegen backend to emit the "immediate" modifiers
- use those modifiers in the NEON CodeGen, on ARM and AArch64.
Fixes PR23517.
Differential Revision: http://reviews.llvm.org/D10045
llvm-svn: 239002
This patch fixes an assertion failure in method
'X86_64ABIInfo::GetByteVectorType'.
Method 'GetByteVectorType' (in TargetInfo.cpp) is responsible
for mapping a QualType 'Ty' (for an argument or return value) to an LLVM IR
type that, according to the ABI, must be passed in a XMM/YMM vector register.
When selecting the IR vector type, method 'GetByteVectorType' always tries to
choose the "best" IR vector type for the 'Ty' in input. In particular, if Ty
is a wrapper structure, it keeps unwrapping it until it finds a vector type VTy.
That VTy is the "preferred IR type".
However, function 'isSingleElementStructure' (used to unwrap structures) does
not know how to look through union types. So, before this patch, if Ty was in
a nest of wrapper structures with at least two union types, we would have
triggered an assertion failure (added at revision 230971).
With this patch, if method 'GetByteVectorType' fails to find the preferred
vector type, we just return a valid (although potentially 'less friendly')
vector type based on the type size. So, rather than asserting on an 'unexpected'
'Ty' in input, we conservatively return vector type <2 x double> if Ty is 16
bytes, or <4 x double> if Ty is 32 bytes.
Differential Revision: http://reviews.llvm.org/D10190
llvm-svn: 238861
The first named data member is the field used to default initialize the
union. An IndirectFieldDecl can introduce the first named data member
of a union.
llvm-svn: 238649
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.
Call sites were found with the ASTMatcher + some semi-automated cleanup.
memberCallExpr(
argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
hasArgument(0, bindTemporaryExpr(
hasType(recordDecl(hasNonTrivialDestructor())),
has(constructExpr()))),
unless(isInTemplateInstantiation()))
No functional change intended.
llvm-svn: 238601
This is a follow-up to r238266. It turned out structors are codegened through a different path,
and didn't get the storage class set in EmitGlobalFunctionDefinition.
llvm-svn: 238443
The representation of a pointer-to-member in the MS ABI is governed by
the layout of the relevant class or if a model has been explicitly
specified. If no model is specified, then an appropriate
"worst-case-scenario" model is implicitly chosen if, and only, if the
pointer-to-member type's representation was needed.
Debug info cannot force a pointer-to-member type to have a
representation so do not try to query the size of such a type unless we
know it is safe to do so.
llvm-svn: 238259
Types can be classified as being zero-initializable or
non-zero-initializable. We used to classify array types by giving them
the classification of their base element type. However, incomplete
array types are never initialized directly and thus are always
zero-initializable.
llvm-svn: 238256
Re-land the change r238200, but with modifications in the tests that should
prevent new failures in some environments as reported with the original
change on the mailing list.
llvm-svn: 238253
On MIPS unsigned int type should not be zero extended but sign-extended.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D9198
llvm-svn: 238200
We already have the ABI, we don't need a "HasAVX" flag.
This will also makes it easier to add an AVX512 ABI.
No functional change intended.
llvm-svn: 237989
If loop control variable in a worksharing construct is marked as lastprivate, we should copy last calculated value of private counter back to original variable.
llvm-svn: 237879
-fprofile-instr-generate does not emit counter increment intrinsics
for Dtor_Deleting and Dtor_Complete destructors with assigned
counters. This causes unnecessary [-Wprofile-instr-out-of-date]
warnings during profile-use runs even if the source has never been
modified since profile collection.
Patch by Betul Buyukkurt. Thanks!
llvm-svn: 237804
Patch fixes codegen for aggregate copying of VLAs. Currently method CodeGenFunction::EmitAggregateCopy() does not support copying of VLAs. Patch checks if the size of the type is 0, then checks if the type is actually a variable-length array. Then it calculates total length for this array and calculates total size of the array in bytes:
<total number of elements in array> * aligned_sizeof(ElementType) (if copy assignment is requested).
If simple copying is requested, size is calculated like:
<total number of elements in array> * aligned_sizeof(ElementType) - aligned_sizeof(ElementType) + sizeof(ElementType).
memcpy() is used with this calculated size of the VLA.
Differential Revision: http://reviews.llvm.org/D9851
llvm-svn: 237768
The implicit conversion was causing issues for a helper being added that
would take an llvm::Function rather than an llvm::Value to make the
CallInst. Since we'll eventually need to specify the type of the call
explicitly anyway, fix these up to avoid the future ambiguity.
llvm-svn: 237729
This modification generates proper copyin/initialization sequences for array variables/parameters. Before they were considered as pointers, not arrays.
llvm-svn: 237691
Also add trivial handling of transparent unions.
PPC32, MSP430, and XCore apparently all rely on DefaultABIInfo. This
should worry you, because DefaultABIInfo is not implementing the rules
of any particular ABI.
Fixes PR23097, patch by Andy Gibbs.
llvm-svn: 237630
Internal task structure must be generated like
typedef struct kmp_task {
void * shareds;
kmp_routine_entry_t routine;
kmp_int32 part_id;
kmp_routine_entry_t destructors;
} kmp_task_t;
struct kmp_task_t_with_privates {
kmp_task_t task_data;
.kmp_private. privates;
};
to avoid possible additional alignment bytes in first fields (shareds, routine, part_id and destructors). Runtime library is not aware of such kind additional alignment bytes.
llvm-svn: 237561
With this change, enabling -fmodules-local-submodule-visibility results in name
visibility rules being applied to submodules of the current module in addition
to imported modules (that is, names no longer "leak" between submodules of the
same top-level module). This also makes it much safer to textually include a
non-modular library into a module: each submodule that textually includes that
library will get its own "copy" of that library, and so the library becomes
visible no matter which including submodule you import.
llvm-svn: 237473
The issue I was trying to solve in r236547 was about built-in macros,
but I disabled coverage in all system macros. This is actually a bit
of overkill, and makes the display of coverage around system macros
degrade unnecessarily. Instead, limit this to builtins specifically.
llvm-svn: 237397
Summary:
Space on stack allocated for unused structures returned by functions was unused
even when it's lifetime didn't intersect with lifetime of any other objects that
could use the same space.
The test added also checks for named and auto objects. It seems to make sense
to have this all in one place.
Reviewers: aadg, rsmith, rjmccall, rnk
Reviewed By: rnk
Subscribers: asl, cfe-commits
Differential Revision: http://reviews.llvm.org/D9743
llvm-svn: 237385
GetOutputStream() owns the stream it returns pointer to and the
pointer should never be freed by us. When we fail to load and exit
early, unique_ptr still holds the pointer and frees it which leads to
compiler crash when CompilerInstance attempts to free it again.
Added regression test for failed bitcode linking.
Differential Revision: http://reviews.llvm.org/D9625
llvm-svn: 237159
'schedule' clause for combined directives requires additional processing. Special helper variable is generated, that is captured in the outlined parallel region for 'parallel for' region. This captured variable is used to store chunk expression from the 'schedule' clause in this 'parallel for' region.
llvm-svn: 237100
We didn't supporting taking the address of virtual member functions
which overrode a method in a virtual base. We simply need to encode the
virtual base index in the member pointer.
This fixes PR23452.
N.B. There is no data member pointer side to this change because taking
the address of a virtual bases' data member gives you a member pointer
whose type is derived from the virtual bases' type, not the most derived
type.
llvm-svn: 236962
MSVC 2015 renamed the symbol found by name lookup for 'std::terminate'
so we cannot rely on using '?terminate@@YAXXZ'. Furthermore, it seems
that 2015 will be the first release of MSVC which permits inlining a
function which is noexcept into a function which isn't. This is
implemented by creating a cleanup for the invoker which jumps to
__std_terminate. Clang's implementation of this aspect of the MSVC
scheme is slightly less efficient in this respect because we use a
catch handler configured as a catch-all handler instead.
llvm-svn: 236961
Patch from Geoff Berry <gberry@codeaurora.org>
Fix BackendConsumer::EmitOptimizationMessage() to check if the
DiagnosticInfoOptimizationBase object has a valid location before
calling getLocation() to avoid dereferencing a null pointer inside
getLocation() when no debug info is present.
llvm-svn: 236898
Functions with available_externally linkage will not be emitted to object
files (they will just be undefined symbols), so it does not make sense to
put them in comdats.
Creates a second overload of maybeSetTrivialComdat that uses the GlobalObject
instead of the Decl, and uses that in several places that had the faulty
logic.
Differential Revision: http://reviews.llvm.org/D9580
llvm-svn: 236879
Summary:
Possible coverage levels are:
* -fsanitize-coverage=func - function-level coverage
* -fsanitize-coverage=bb - basic-block-level coverage
* -fsanitize-coverage=edge - edge-level coverage
Extra features are:
* -fsanitize-coverage=indirect-calls - coverage for indirect calls
* -fsanitize-coverage=trace-bb - tracing for basic blocks
* -fsanitize-coverage=trace-cmp - tracing for cmp instructions
* -fsanitize-coverage=8bit-counters - frequency counters
Levels and features can be combined in comma-separated list, and
can be disabled by subsequent -fno-sanitize-coverage= flags, e.g.:
-fsanitize-coverage=bb,trace-bb,8bit-counters -fno-sanitize-coverage=trace-bb
is equivalient to:
-fsanitize-coverage=bb,8bit-counters
Original semantics of -fsanitize-coverage flag is preserved:
* -fsanitize-coverage=0 disables the coverage
* -fsanitize-coverage=1 is a synonym for -fsanitize-coverage=func
* -fsanitize-coverage=2 is a synonym for -fsanitize-coverage=bb
* -fsanitize-coverage=3 is a synonym for -fsanitize-coverage=edge
* -fsanitize-coverage=4 is a synonym for -fsanitize-coverage=edge,indirect-calls
Driver tries to diagnose invalid flag usage, in particular:
* At most one level (func,bb,edge) must be specified.
* "trace-bb" and "8bit-counters" features require some level to be specified.
See test case for more examples.
Test Plan: regression test suite
Reviewers: kcc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D9577
llvm-svn: 236790
- added -fcuda-include-gpubinary option to incorporate results of
device-side compilation into host-side one.
- generate code to register GPU binaries and associated kernels
with CUDA runtime and clean-up on exit.
- added test case for init/deinit code generation.
Differential Revision: http://reviews.llvm.org/D9507
llvm-svn: 236765
Summary:
The next step is to add user-friendly control over these options
to driver via -fsanitize-coverage= option.
Test Plan: regression test suite
Reviewers: kcc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D9545
llvm-svn: 236756
Fix for codegen of static variables declared inside of captured statements. Captured statements are actually a transparent DeclContexts, so we have to skip them when trying to get a mangled name for statics.
Differential Revision: http://reviews.llvm.org/D9522
llvm-svn: 236701
The MSVC 2015 ABI utilizes a rather straightforward adaptation of the
algorithm found in the appendix of N2382. While we are here, implement
support for emitting cleanups if an exception is thrown while we are
intitializing a static local variable.
llvm-svn: 236697
Inner bodies of OpenMP worksharing loop-based constructs with dynamic or guided scheduling are allowed to be marked with !llvm.mem.parallel_loop_access metadata for better optimization. Worksharing constructs with static scheduling cannot be marked this way (according to OpenMP standard "A data dependence between the same logical iterations in two such loops is guaranteed").
Constructs with auto and runtime scheduling are also not marked because automatically chosen scheduling may be static also.
Differential Revision: http://reviews.llvm.org/D9518
llvm-svn: 236693
Fixed codegen for reduction operations min, max, && and ||. Codegen for them is quite similar and I was confused by this similarity.
Also added a call to kmpc_end_reduce() in atomic part of reduction codegen (call to kmpc_end_reduce_nowait() is not required).
Differential Revision: http://reviews.llvm.org/D9513
llvm-svn: 236689
All callers should be passing `CXXConstructorDecl` or
`CXXDestructorDecl` here, so use `cast<>` instead of `dyn_cast<>` when
setting up the `GlobalDecl`.
llvm-svn: 236651
It doesn't make much sense to try to show coverage inside system
macros, and source locations in builtins confuses the coverage
mapping. Just avoid doing this.
Fixes an assert that fired when a __block storage specifier starts a
region.
llvm-svn: 236547
This adds low-level builtins to allow access to all of the z13 vector
instructions. Note that instructions whose semantics can be described
by standard C (including clang extensions) do not get any builtins.
For each instructions whose semantics *cannot* (fully) be described, we
define a builtin named __builtin_s390_<insn> that directly maps to this
instruction. These are intended to be compatible with GCC.
For instructions that also set the condition code, the builtin will take
an extra argument of type "int *" at the end. The integer pointed to by
this argument will be set to the post-instruction CC value.
For many instructions, the low-level builtin is mapped to the corresponding
LLVM IR intrinsic. However, a number of instructions can be represented
in standard LLVM IR without requiring use of a target intrinsic.
Some instructions require immediate integer operands within a certain
range. Those are verified at the Sema level.
Based on a patch by Richard Sandiford.
llvm-svn: 236532
This patch adds support for the z13 architecture type. For compatibility
with GCC, a pair of options -mvx / -mno-vx can be used to selectively
enable/disable use of the vector facility.
When the vector facility is present, we default to the new vector ABI.
This is characterized by two major differences:
- Vector types are passed/returned in vector registers
(except for unnamed arguments of a variable-argument list function).
- Vector types are at most 8-byte aligned.
The reason for the choice of 8-byte vector alignment is that the hardware
is able to efficiently load vectors at 8-byte alignment, and the ABI only
guarantees 8-byte alignment of the stack pointer, so requiring any higher
alignment for vectors would require dynamic stack re-alignment code.
However, for compatibility with old code that may use vector types, when
*not* using the vector facility, the old alignment rules (vector types
are naturally aligned) remain in use.
These alignment rules are not only implemented at the C language level,
but also at the LLVM IR level. This is done by selecting a different
DataLayout string depending on whether the vector ABI is in effect or not.
Based on a patch by Richard Sandiford.
llvm-svn: 236531
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399
llvm-svn: 236491
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399
llvm-svn: 236487
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399
llvm-svn: 236482
Destructors are never called for cleanups, so we can't use SmallVector as a member.
Differential Revision: http://reviews.llvm.org/D9399
llvm-svn: 236480
For tasks codegen for private/firstprivate variables are different rather than for other directives.
1. Build an internal structure of privates for each private variable:
struct .kmp_privates_t. {
Ty1 var1;
...
Tyn varn;
};
2. Add a new field to kmp_task_t type with list of privates.
struct kmp_task_t {
void * shareds;
kmp_routine_entry_t routine;
kmp_int32 part_id;
kmp_routine_entry_t destructors;
.kmp_privates_t. privates;
};
3. Create a function with destructors calls for all privates after end of task region.
kmp_int32 .omp_task_destructor.(kmp_int32 gtid, kmp_task_t *tt) {
~Destructor(&tt->privates.var1);
...
~Destructor(&tt->privates.varn);
return 0;
}
4. Perform initialization of all firstprivate fields (by simple copying for POD data, copy constructor calls for classes) + provide address of a destructor function after kmpc_omp_task_alloc() and before kmpc_omp_task() calls.
kmp_task_t *new_task = __kmpc_omp_task_alloc(ident_t *, kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds, kmp_routine_entry_t *task_entry);
CopyConstructor(new_task->privates.var1, *new_task->shareds.var1_ref);
new_task->shareds.var1_ref = &new_task->privates.var1;
...
CopyConstructor(new_task->privates.varn, *new_task->shareds.varn_ref);
new_task->shareds.varn_ref = &new_task->privates.varn;
new_task->destructors = .omp_task_destructor.;
kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *new_task)
Differential Revision: http://reviews.llvm.org/D9370
llvm-svn: 236479
The fact that PGO has a say in how these branch weights are determined
isn't interesting to most of CodeGen, so it makes more sense for this
API to be accessible via CodeGenFunction rather than CodeGenPGO.
llvm-svn: 236380
The underlying problem is that there is currently no way to run
ObjCARCContract from llvm bitcode which is required by ObjC ARC.
This fix the problem by always enable ObjCARCContract pass if
optimization is enabled. The ObjCARC Contract pass has almost no
overhead on code that is not using ARC.
llvm-svn: 236372
No functional change. This just makes it more obvious that the logic
in ComputeRegionCounts only depends on the counter map and local
state.
llvm-svn: 236370
This removes the RegionCounter class, which is only used as a helper
in teh ComputeRegionCounts stmt visitor. This class is just an extra
layer of abstraction that makes the code harder to follow at this
point, and removing it makes the logic quite a bit more direct.
llvm-svn: 236364
This change is the third of 3 patches to add support for specifying
the profile output from the command line via -fprofile-instr-generate=<path>,
where the specified output path/file will be overridden by the
LLVM_PROFILE_FILE environment variable.
This patch adds the necessary support to the clang frontend, and adds a
new test.
The compiler-rt and llvm parts are r236055 and r236288, respectively.
Patch by Teresa Johnson. Thanks!
llvm-svn: 236289