No two elements of this array should be the same, but the standard library
may pass the same element as both arguments to this function.
llvm-svn: 230293
llvm.eh.sjlj.setjmp / llvm.eh.sjlj.longjmp, if the backend is known to
support them outside the Exception Handling context. The default
handling in LLVM codegen doesn't work and will create incorrect code.
The ARM backend on the other hand will assert if the intrinsics are
used.
llvm-svn: 230255
Somehow this file ended up with a strange hybrid of the old "indent
inside a namespace" style and the new "don't", giving us a wonderful
two-space indent starting halfway through a namespace. Fix it.
llvm-svn: 230244
This is a necessary prerequisite for debugging with modules.
The .pcm files become containers that hold the serialized AST which allows
us to store debug information in the module file that can be shared by all
object files that were built importing the module.
rdar://problem/19104245
This reapplies r230044 with a fixed configure+make build and updated
dependencies. Take 2.
llvm-svn: 230089
This is a necessary prerequisite for debugging with modules.
The .pcm files become containers that hold the serialized AST which allows
us to store debug information in the module file that can be shared by all
object files that were built importing the module.
rdar://problem/19104245
This reapplies r230044 with a fixed configure+make build and updated
dependencies.
llvm-svn: 230067
This patch introduces the -fsanitize=cfi-vptr flag, which enables a control
flow integrity scheme that checks that virtual calls take place using a vptr of
the correct dynamic type. More details in the new docs/ControlFlowIntegrity.rst
file.
It also introduces the -fsanitize=cfi flag, which is currently a synonym for
-fsanitize=cfi-vptr, but will eventually cover all CFI checks implemented
in Clang.
Differential Revision: http://reviews.llvm.org/D7424
llvm-svn: 230055
This is a necessary prerequisite for debugging with modules.
The .pcm files become containers that hold the serialized AST which allows
us to store debug information in the module file that can be shared by all
object files that were built importing the module.
rdar://problem/19104245
llvm-svn: 230044
For now -funique-section-names is the default, so no change in default behavior.
The total .o size in a build of llvm and clang goes from 241687775 to 230649031
bytes if -fno-unique-section-names is used.
llvm-svn: 230031
Classes can be defined in multiple translation units. This means that
the static constexpr data members should have identical initializers in
all translation units. Implement this by giving the reference temporary
linkonce_odr linkage.
llvm-svn: 229900
When tools like llvm-cov show regions, it's much easier to understand
what's happening if the condition of an if shows a counter as well as
the body.
llvm-svn: 229813
The coverage mapping generation code previously generated a large
number of redundant coverage regions and then tried to merge similar
ones back together. This then relied on some awkward heuristics to
prevent combining of regions that were importantly different but
happened to have the same count. The end result was inefficient and
hard to follow.
Now, we more carefully create the regions we actually want. This makes
it much easier to create regions at precise locations as well as
making the basic approach quite a bit easier to follow. There's still
a fair bit of complexity here dealing with included code and macro
expansions, but that's pretty hard to avoid without significantly
reducing the quality of data we provide.
I had to modify quite a few tests where the source ranges became more
precise or the old ranges seemed to be wrong anyways, and I've added
quite a few new tests since a large number of constructs didn't seem
to be tested before.
llvm-svn: 229748
This is a patch for PR22563 ( http://llvm.org/bugs/show_bug.cgi?id=22563 ).
We were not correctly unwrapping a single 256-bit AVX vector that was defined as an array of 1 inside a struct.
We would generate a <4 x float> param/return value instead of <8 x float> and lose half of the vector.
Differential Revision: http://reviews.llvm.org/D7614
llvm-svn: 229408
For #pragma comment(linker, ...) MSVC expects the comment string to be quoted, but for #pragma comment(lib, ...) the compiler itself quotes the library name.
Since this distinction disappears by the time the directive reaches the backend, move quoting for the "lib" version to the frontend.
Differential Revision: http://reviews.llvm.org/D7653
llvm-svn: 229376
Bools are a little tricky, they are i8 in memory and must be coerced
back to i1 before further operations can be performed on them.
This fixes PR22577.
llvm-svn: 229204
declarations and just use the legacy namespace qualifier in this file
and the permanent header name.
The old wrapper header is going away to make LLVM's build more modular,
and without updating Clang I can't easily start to add usage of the new
pass manager to Clang. This should make it more clear in the code which
set of types is doing what.
llvm-svn: 229090
The /volatile:ms semantics turn volatile loads and stores into atomic
acquire and release operations. This distinction is important because
volatile memory operations do not form a happens-before relationship
with non-atomic memory. This means that a volatile store is not
sufficient for implementing a mutex unlock routine.
Differential Revision: http://reviews.llvm.org/D7580
llvm-svn: 229082
Summary:
This patch installs an InlineAsmDiagnosticsHandler to avoid the crash
report when the input is bitcode and the bitcode contains invalid inline
assembly. The handler will simply print the same error message that will
print from the backend.
Add CHECK in test-case
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: rafael, cfe-commits
Differential Revision: http://reviews.llvm.org/D7568
llvm-svn: 228898
a non-uniqueable temporary node that is only turned into a permanent
unique or distinct node after it is finished.
Otherwise an intermediate node may get accidentally uniqued with another
node as illustrated by the testcase.
Paired commit with LLVM.
llvm-svn: 228855
Matches the existing code for scalar default arguments. Complex default
arguments probably need the same handling too (test/fix to that coming
next).
llvm-svn: 228588
It's slightly cheaper than copying it, if the DebugLoc points to replaceable
metadata every copy is recorded in a DenseMap, moving reduces the peak size of
that map.
llvm-svn: 228492
modifiers on them. If we have a matching output constraint with
an early clobber make sure we don't propagate that to the input
constraint.
llvm-svn: 228422
After r228258, Clang started emitting C++ EH IR that LLVM wasn't ready
to deal with, even when exceptions were disabled with /EHs-. This time,
make /EHs- turn off -fexceptions while still emitting exceptional
constructs in functions using __try. Since Sema rejects C++ exception
handling constructs before CodeGen, landingpads should only appear in
such functions as the result of a __try.
llvm-svn: 228329
This patch emits the following code for the single directive:
#pragma omp single
<body>
<---->
if(__kmpc_single(...)) {
<body>
__kmpc_end_single(...);
}
Differential Revision: http://reviews.llvm.org/D7045
llvm-svn: 228275
For 'taskyield' directive emit call to kmp_int32 __kmpc_omp_taskyield(ident_t *,
kmp_int32 global_tid, int end_part); runtime function call with end_part arg set
to 0 (it is ignored).
Differential Revision: http://reviews.llvm.org/D7047
llvm-svn: 228272
It caused a chromium base unittest that tests throwing and catching SEH
exceptions to fail (http://crbug.com/455488) and I suspect it might also
be the cause of the chromium clang win 64-bit shared release builder timing
out during compiles. So revert to see if that's true.
llvm-svn: 228262
Previously we would simply double-emit the body of the __finally block,
but that doesn't work when it contains any kind of Decl, which we can't
double emit.
This fixes that by emitting the block once and branching into a shared
code region and then branching back out.
llvm-svn: 228222
Now if you break on a dtor and go 'up' in your debugger (or you get an
asan failure in a dtor) during an exception unwind, you'll have more
context. Instead of all dtors appearing to be called from the '}' of the
function, they'll be attributed to the end of the scope of the variable,
the same as the non-exceptional dtor call.
This doesn't /quite/ remove all uses of CurEHLocation (which might be
nice to remove, for a few reasons) - it's still used to choose the
location for some other work in the landing pad. It'd be nice to
attribute that code to the same location as the exception calls within
the block and to remove CurEHLocation.
llvm-svn: 228181
Summary:
Allow user to provide multiple blacklists by passing several
-fsanitize-blacklist= options. These options now don't override
default blacklist from Clang resource directory, which is always
applied (which fixes PR22431).
-fno-sanitize-blacklist option now disables all blacklists that
were specified earlier in the command line (including the default
one).
This change depends on http://reviews.llvm.org/D7367.
Test Plan: regression test suite
Reviewers: timurrrr
Subscribers: cfe-commits, kcc, pcc
Differential Revision: http://reviews.llvm.org/D7368
llvm-svn: 228156
Summary:
Named registers with the constraint "=&r" currently lose the early clobber flag
and turn into "=r" when converted to LLVM-IR. This patch correctly passes it on.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7346
llvm-svn: 228143
__declspec(restrict) and __attribute(malloc) are both handled
identically by clang: they are allowed to the noalias LLVM attribute.
Seeing as how noalias models the C99 notion of 'restrict', rename the
internal clang attribute to Restrict from Malloc.
llvm-svn: 228120
We would synthesize memcpy intrinsics when emitting calls to trivial C++
constructors but we wouldn't take into account the alignment of the
destination.
llvm-svn: 228061
There are four major kinds of declarations that cause code generation:
- FunctionDecl (includes CXXMethodDecl etc)
- ObjCMethodDecl
- BlockDecl
- CapturedDecl
This patch tracks __try usage on FunctionDecls and diagnoses __try usage
in other decls. If someone wants to use __try from ObjC, they can use it
from a free function, since the ObjC code will need an ObjC-style EH
personality.
Eventually we will want to look through CapturedDecls and track SEH
usage on the parent FunctionDecl, if present.
llvm-svn: 228058
To handle default arguments in C++ in the debug info, we disable code
updating the debug location during the emission of default arguments.
This code was buggy in the case of default arguments which, themselves,
have default arguments - the inner default argument would re-enable
debug info when it was finished, but before the outer default argument
was finished.
This was already a bug, but got worse (because a crasher instead of just
a quality bug) with the recent improvements to debug info line quality
because... The ApplyDebugLocation scoped device would find the debug
info disabled and not save any debug location. But then in
~ApplyDebugLocation it would find the debug info had been enabled and
would then apply the no-location. Then the outer function call would be
emitted without any location. That's bad.
Arguably we could /also/ fix the ApplyDebugLocation to assert on this
situation (where debug info was disabled in the ctor and enabled in the
dtor, or the other way around) but this is at least the necessary fix
regardless.
(also, I imagine this disabling behavior might need to be in-place for
CGExprComplex and CGExprAgg too, maybe... ?)
And I seem to recall seeing some weird default arg stepping behavior
recently which might be related to this too... I'll have to look into
it.
llvm-svn: 228053
distinction between the different use-cases. With the previous default
behavior we would occasionally emit empty debug locations in situations
where they actually were strictly required (= on invoke insns).
We now have a choice between defaulting to an empty location or an
artificial location.
Specifically, this fixes a bug caused by a missing debug location when
emitting C++ EH cleanup blocks from within an artificial function, such as
an ObjC destroy helper function.
rdar://problem/19670595
llvm-svn: 228003
Summary:
This is especially important for targets that use multiple address spaces,
and commonly place global variables in address spaces other than zero.
Fixes PR22383
Test Plan: New test case added: llvm-used.cu
Reviewers: jingyue
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7345
llvm-svn: 227861
It is common for COM interface classes to be marked as 'novtable' to
tell the compiler that constructors and destructors should not reference
virtual function tables.
This commit implements this feature in clang.
llvm-svn: 227796
This moves all of the PassManager <-> Target communication to use the
new pass manager's TargetIRAnalysis even with the old pass manager. See
the LLVM commit for some of why things are moving in this direction, but
the short version is that this will enable us to create per-function
TargetTransformInfo objects that have correct subtarget information for
that function.
llvm-svn: 227732
Create a new TargetCodeGenInfo for Windows on ARM to permit annotating the
functions with stack-probe-size (for /Gs and -mstack-probe-support) for
generating the stack probe necessary for Windows targets. This will be used by
the backend when lowering the frame to generate the stack probe appropriately.
llvm-svn: 227641
On targets which use the MSVCRT, setjmp is a macro which expands to
_setjmp or _setjmpex.
_setjmp and _setjmpex have a secret, hidden argument which is not listed
in the function prototype on X64 and WoA. This hidden argument always
seems to be the frame pointer.
_setjmpex isn't used on X86, _setjmp is magically replaced with a call
to _setjmp3. The second argument is zero for 'normal' setjmp/longjmp
pairs, otherwise it is a count of additional variadic arguments. This
is used when setjmp appears inside of a try or __try.
It is not safe to use a pointer to setjmp because _setjmp, _setjmpex and
_setmp3 are not compatible with setjmp.
llvm-svn: 227426
Summary:
They just existed before to use NaCl's custom ABIInfos; now that those are gone,
the custom TargetCodeGenInfos are no longer needed either.
Test Plan: don't break the existing tests
Reviewers: jvoung
Subscribers: jfb, cfe-commits
Differential Revision: http://reviews.llvm.org/D7234
llvm-svn: 227406
Summary:
It was used for interoperability with PNaCl's calling conventions, but
it's no longer needed.
Also Remove NaCl*ABIInfo which just existed to delegate to either the portable
or native ABIInfo, and remove checkCallingConvention which was now a no-op
override.
Reviewers: jvoung
Subscribers: jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D7206
llvm-svn: 227362
This is half a fix for a GDB test suite failure that expects to start at
'a' in the following code:
void func(int a)
if (a
&&
b)
...
But instead, without this change, the comparison was assigned to '&&'
(well, worse actually - because there was a chained 'a && b && c' and it
was assigned to the second '&&' because of a recursive application of
this bug) and then the load folded into the comparison so breaking on
the function started at '&&' instead of 'a'.
The other part of this needs to be fixed in LLVM where it's ignoring the
location of the icmp and instead using the location of the branch
instruction.
The fix to the conditional operator is actually a no-op currently,
because the conditional operator's location coincides with 'a' (the
start of the conditional expression) but should probably be '?' instead.
See the FIXME in the test case that mentions the ARCMigration tool
failures when I tried to make that change.
llvm-svn: 227356
The backend won't run LowerExpect on -O0. In a debug LTO build, this results in llvm.expect intrinsics being in the LTO IR which doesn't know how to optimize them.
Thanks to Chandler for the suggestion and review.
Differential revision: http://reviews.llvm.org/D7183
llvm-svn: 227135
This causes things like assignment to refer to the '=' rather than the
LHS when attributing the store instruction, for example.
There were essentially 3 options for this:
* The beginning of an expression (this was the behavior prior to this
commit). This meant that stepping through subexpressions would bounce
around from subexpressions back to the start of the outer expression,
etc. (eg: x + y + z would go x, y, x, z, x (the repeated 'x's would be
where the actual addition occurred)).
* The end of an expression. This seems to be what GCC does /mostly/, and
certainly this for function calls. This has the advantage that
progress is always 'forwards' (never jumping backwards - except for
independent subexpressions if they're evaluated in interesting orders,
etc). "x + y + z" would go "x y z" with the additions occurring at y
and z after the respective loads.
The problem with this is that the user would still have to think
fairly hard about precedence to realize which subexpression is being
evaluated or which operator overload is being called in, say, an asan
backtrace.
* The preferred location or 'exprloc'. In this case you get sort of what
you'd expect, though it's a bit confusing in its own way due to going
'backwards'. In this case the locations would be: "x y + z +" in
lovely postfix arithmetic order. But this does mean that if the op+
were an operator overload, say, and in a backtrace, the backtrace will
point to the exact '+' that's being called, not to the end of one of
its operands.
(actually the operator overload case doesn't work yet for other reasons,
but that's being fixed - but this at least gets scalar/complex
assignments and other plain operators right)
llvm-svn: 227027
In ItaniumCXXABI::EmitCXXDestructors we first emit the base destructor
and then try to emit the complete one as an alias.
If in the base ends up calling the complete destructor, the GD for the
complete will be in the list of deferred decl by the time we replace
it with an alias and delete the original GV.
llvm-svn: 226896
"omp atomic read [seq_cst]" accepts expressions "v=x;". In this patch we perform
an atomic load of "x" (using builtin atomic loading instructions or a call to
"atomic_load()" for simple lvalues and "kmpc_atomic_start();load
<x>;kmpc_atomic_end();" for other lvalues), convert the result of loading to
type of "v" (using EmitScalarConversion() for simple types and
EmitComplexToScalarConversion() for conversions from complex to scalar) and then
store the result in "v".)
Differential Revision: http://reviews.llvm.org/D6431
llvm-svn: 226788
"omp atomic read [seq_cst]" accepts expressions "v=x;". In this patch we perform
an atomic load of "x" (using builtin atomic loading instructions or a call to
"atomic_load()" for simple lvalues and "kmpc_atomic_start();load
<x>;kmpc_atomic_end();" for other lvalues), convert the result of loading to
type of "v" (using EmitScalarConversion() for simple types and
EmitComplexToScalarConversion() for conversions from complex to scalar) and then
store the result in "v".)
Differential Revision: http://reviews.llvm.org/D6431
llvm-svn: 226786
"omp atomic read [seq_cst]" accepts expressions "v=x;". In this patch we perform
an atomic load of "x" (using builtin atomic loading instructions or a call to
"atomic_load()" for simple lvalues and "kmpc_atomic_start();load
<x>;kmpc_atomic_end();" for other lvalues), convert the result of loading to
type of "v" (using EmitScalarConversion() for simple types and
EmitComplexToScalarConversion() for conversions from complex to scalar) and then
store the result in "v".
Differential Revision: http://reviews.llvm.org/D6431
llvm-svn: 226784
We don't emit any coverage mapping for uncovered functions that come
from system headers, but we were creating a GlobalVariable with each
of their names. This is wasteful since the linker will need to dead
strip the unused symbols, and it can lead to issues when merging
coverage with others TUs that do have coverage for those functions.
llvm-svn: 226764
The lowering looks a lot like normal EH lowering, with the exception
that the exceptions are caught by executing filter expression code
instead of matching typeinfo globals. The filter expressions are
outlined into functions which are used in landingpad clauses where
typeinfo would normally go.
Major aspects that still need work:
- Non-call exceptions in __try bodies won't work yet. The plan is to
outline the __try block in the frontend to keep things simple.
- Filter expressions cannot use local variables until capturing is
implemented.
- __finally blocks will not run after exceptions. Fixing this requires
work in the LLVM SEH preparation pass.
The IR lowering looks like this:
// C code:
bool safe_div(int n, int d, int *r) {
__try {
*r = normal_div(n, d);
} __except(_exception_code() == EXCEPTION_INT_DIVIDE_BY_ZERO) {
return false;
}
return true;
}
; LLVM IR:
define i32 @filter(i8* %e, i8* %fp) {
%ehptrs = bitcast i8* %e to i32**
%ehrec = load i32** %ehptrs
%code = load i32* %ehrec
%matches = icmp eq i32 %code, i32 u0xC0000094
%matches.i32 = zext i1 %matches to i32
ret i32 %matches.i32
}
define i1 zeroext @safe_div(i32 %n, i32 %d, i32* %r) {
%rr = invoke i32 @normal_div(i32 %n, i32 %d)
to label %normal unwind to label %lpad
normal:
store i32 %rr, i32* %r
ret i1 1
lpad:
%ehvals = landingpad {i8*, i32} personality i32 (...)* @__C_specific_handler
catch i8* bitcast (i32 (i8*, i8*)* @filter to i8*)
%ehptr = extractvalue {i8*, i32} %ehvals, i32 0
%sel = extractvalue {i8*, i32} %ehvals, i32 1
%filter_sel = call i32 @llvm.eh.seh.typeid.for(i8* bitcast (i32 (i8*, i8*)* @filter to i8*))
%matches = icmp eq i32 %sel, %filter_sel
br i1 %matches, label %eh.except, label %eh.resume
eh.except:
ret i1 false
eh.resume:
resume
}
Reviewers: rjmccall, rsmith, majnemer
Differential Revision: http://reviews.llvm.org/D5607
llvm-svn: 226760
Currently we emit DeferredDeclsToEmit in reverse order. This patch changes that.
The advantages of the change are that
* The output order is a bit closer to the source order. The change to
test/CodeGenCXX/pod-member-memcpys.cpp is a good example.
* If we decide to deffer more, it will not cause as large changes in the
estcases as it would without this patch.
llvm-svn: 226751
This workaround was to provide unique call sites to ensure LLVM's inline
debug info handling would properly unique two calls to the same function
on the same line. Instead, this has now been fixed in LLVM (r226736) and
the workaround here can be removed.
Originally committed in r176895, but this isn't a straight revert due to
all the changes since then. I just searched for anything ForcedColumn*
related and removed them.
We could test this - but it didn't strike me as terribly valuable once
we're no longer adding this workaround everything just works as expected
& it's no longer a special case to test for.
llvm-svn: 226738
This attribute implies indicates that the function musttail calls
another function and returns whatever it returns. The return type of the
thunk is meaningless, as the thunk can dynamically call different
functions with different return types. So long as the callers bitcast
the thunk with the correct type, behavior is well defined.
This attribute was necessary to fix PR20944, where the indirect call
combiner noticed that the thunk returned void and replaced the results
of the indirect call instruction with undef.
Over-the-shoulder reviewed by David Majnemer.
llvm-svn: 226707
The test was fixed after a discussion with the revision author: the check
pattern was made more flexible as the "%call" part is not what we actually want
to check strictly there.
The original patch description:
===
Introduce SPIR calling conventions.
This implements Section 3.7 from the SPIR 1.2 spec:
SPIR kernels should use "spir_kernel" calling convention.
Non-kernel functions use "spir_func" calling convention. All
other calling conventions are disallowed.
The patch works only for OpenCL source. Any other uses will need
to ensure that kernels are assigned the spir_kernel calling
convention correctly.
===
llvm-svn: 226561
This implements Section 3.7 from the SPIR 1.2 spec:
SPIR kernels should use "spir_kernel" calling convention.
Non-kernel functions use "spir_func" calling convention. All
other calling conventions are disallowed.
The patch works only for OpenCL source. Any other uses will need
to ensure that kernels are assigned the spir_kernel calling
convention correctly.
llvm-svn: 226548
The code setting the debug location being removed here was accidentally
leaking a location into the call to the non-static data member's ctor
call. Without it the call had no location and could cause assertion
failures if it was inlined. Now that it has a location (and a correct
one at that) this code should hopefully be no longer needed.
It's possible of course that other parts of the debug info are also
relying on the debug locations being set here to leak to where they're
needed - so we might see the same assertions again & will have to
investigate what the dependence was/is. But the chances are good that
any of those are debug info line table quality bugs we've just not found
yet anyway - so it'll be good to flush them out.
llvm-svn: 226383
This was causing some trouble for otherwise dead code removed in r225085
(reverted in r225361). The location being set for function arguments was
leaking out to the call which wasn't setting its own location (so a
quality bug turned into a crasher with r225085). Fix this so r225085 can
be recommitted.
llvm-svn: 226382
This produces comdats for vtables, typeinfo, typeinfo names, and vtts.
When combined with llvm not producing implicit comdats, not doing this would
cause code bloat on ELF and link errors on COFF.
llvm-svn: 226227
object from the pass that provides access to it.
We should probably refactor the createTLI code here in Clang in light of
the new structure, but I wanted this patch to be a minimal one that just
patches the behavior back together.
llvm-svn: 226158
Sema calls HandleVTable() with a bool parameter which is then threaded through
three layers. The only effect of this bool is an early return at the last
layer.
Instead, remove this parameter and call HandleVTable() only if the bool is
true. No intended behavior change.
llvm-svn: 226096
Summary:
This fixes MultiSource/Applications/lemon on big-endian N32 by correcting the
handling of the argument to wait(). glibc defines it as a transparent union of
void* and int*. Such unions are passed according to the rules of the first
member so the argument must be passed as if it were a void* (sign extended from
i32 to i64) and not as a union (shifted to the upper bits of an i64).
wait() already behaves correctly on big-endian O32 and N64 since the union is
already the same size as an argument slot.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6963
llvm-svn: 225981
Sorry for the noise, I managed to miss a bunch of recent regressions of
include orderings here. This should actually sort all the includes for
Clang. Again, no functionality changed, this is just a mechanical
cleanup that I try to run periodically to keep the #include lines as
regular as possible across the project.
llvm-svn: 225979
Several pieces of code were relying on implicit debug location setting
which usually lead to incorrect line information anyway. So I've fixed
those (in r225955 and r225845) separately which should pave the way for
this commit to be cleanly reapplied.
The reason these implicit dependencies resulted in crashes with this
patch is that the debug location would no longer implicitly leak from
one place to another, but be set back to invalid. Once a call with
no/invalid location was emitted, if that call was ever inlined it could
produce invalid debugloc chains and assert during LLVM's codegen.
There may be further cases of such bugs in this patch - they're hard to
flush out with regression testing, so I'll keep an eye out for reports
and investigate/fix them ASAP if they come up.
Original commit message:
Reapply "DebugInfo: Generalize debug info location handling"
Originally committed in r224385 and reverted in r224441 due to concerns
this change might've introduced a crash. Turns out this change fixes the
crash introduced by one of my earlier more specific location handling
changes (those specific fixes are reverted by this patch, in favor of
the more general solution).
Recommitted in r224941 and reverted in r224970 after it caused a crash
when building compiler-rt. Looks to be due to this change zeroing out
the debug location when emitting default arguments (which were meant to
inherit their outer expression's location) thus creating call
instructions without locations - these create problems for inlining and
must not be created. That is fixed and tested in this version of the
change.
Original commit message:
This is a more scalable (fixed in mostly one place, rather than many
places that will need constant improvement/maintenance) solution to
several commits I've made recently to increase source fidelity for
subexpressions.
This resetting had to be done at the DebugLoc level (not the
SourceLocation level) to preserve scoping information (if the resetting
was done with CGDebugInfo::EmitLocation, it would've caused the tail end
of an expression's codegen to end up in a potentially different scope
than the start, even though it was at the same source location). The
drawback to this is that it might leave CGDebugInfo out of sync. Ideally
CGDebugInfo shouldn't have a duplicate sense of the current
SourceLocation, but for now it seems it does... - I don't think I'm
going to tackle removing that just now.
I expect this'll probably cause some more buildbot fallout & I'll
investigate that as it comes up.
Also these sort of improvements might be starting to show a weakness/bug
in LLVM's line table handling: we don't correctly emit is_stmt for
statements, we just put it on every line table entry. This means one
statement split over multiple lines appears as multiple 'statements' and
two statements on one line (without column info) are treated as one
statement.
I don't think we have any IR representation of statements that would
help us distinguish these cases and identify the beginning of each
statement - so that might be something we need to add (possibly to the
lexical scope chain - a scope for each statement). This does cause some
problems for GDB and possibly other DWARF consumers.
llvm-svn: 225956
Without setting the CurEHLocation these cleanups would be attributed to
whatever the last active debug line location was (the 'fn' call in the
included test cases). By setting CurEHLocation correctly the line
information is improved/corrected.
This quality bug turned into a crasher with r225000 when, instead of
allowing the last location to persist, it would be zero'd out. This
could lead to a function call (such as the dtor) being made without a
debug location - if that call was subsequently inlined (and the caller
and callee had debug info, just not the call instruction) the inliner
would violate important constraints about the debug location chains by
not updating the inlined instructions to chain up to the callee
locations.
So, by fixing this bug, I am addressing the assertion failures
introduced by r225000 and should be able to recommit that patch with
impunity...
llvm-svn: 225955
A pass that adds random noops to X86 binaries to introduce diversity with the goal of increasing security against most return-oriented programming attacks.
Command line options:
-noop-insertion // Enable noop insertion.
-noop-insertion-percentage=X // X% of assembly instructions will have a noop prepended (default: 50%, requires -noop-insertion)
-max-noops-per-instruction=X // Randomly generate X noops per instruction. ie. roll the dice X times with probability set above (default: 1). This doesn't guarantee X noop instructions.
In addition, the following 'quick switch' in clang enables basic diversity using default settings (currently: noop insertion and schedule randomization; it is intended to be extended in the future).
-fdiversify
This is the clang part of the patch.
llvm part: D3392
http://reviews.llvm.org/D3393
Patch by Stephen Crane (@rinon)
llvm-svn: 225910
This was previously piggybacking on whatever happened to be the last
location set on CGDebugInfo/DIBuilder, which was wrong (it was often the
current location, such as the 'fn()' call site, not the end of the
block). With my improvements to set/unset the location in a scoped
manner (r225000) this went from a bad quality situation, to a crash.
Fixing this goes part-way to unblocking the recommit of r225000.
It's likely that any call to CodeGenFunction::StartFunction without the
CurEHLocation set represents a similar bug or risk of a bug. Perhaps
there are some callers that know they won't generate EH cleanups, but
I'm not sure.
I considered a generic catch-fix in StartFunction (just fallback to the
GlobalDecl's location) but that seemed like it'd mask bugs where the EH
location shouldn't be the same as the decl's location (& indeed by not
using that stop-gap I found this bug). We'll see how long I can hold out
on the generic catch-all. I might eventually be able to add an assertion
in.
llvm-svn: 225845
Summary:
The Mips ABI's treat pointers in the same way as integers. They are
sign-extended to 32-bit for O32, and 64-bit for N32/N64. This doesn't matter
for O32 and N64 where pointers are already the correct width but it does matter
for big-endian N32, where pointers are 32-bit and need promoting.
The caller side is already passing pointers correctly. This patch corrects the
callee.
Reviewers: vmedic, atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6812
llvm-svn: 225782
Introduce the following -fsanitize-recover flags:
- -fsanitize-recover=<list>: Enable recovery for selected checks or
group of checks. It is forbidden to explicitly list unrecoverable
sanitizers here (that is, "address", "unreachable", "return").
- -fno-sanitize-recover=<list>: Disable recovery for selected checks or
group of checks.
- -f(no-)?sanitize-recover is now a synonym for
-f(no-)?sanitize-recover=undefined,integer and will soon be deprecated.
These flags are parsed left to right, and mask of "recoverable"
sanitizer is updated accordingly, much like what we do for -fsanitize= flags.
-fsanitize= and -fsanitize-recover= flag families are independent.
CodeGen change: If there is a single UBSan handler function, responsible
for implementing multiple checks, which have different recoverable setting,
then we emit two handler calls instead of one:
the first one for the set of "unrecoverable" checks, another one - for
set of "recoverable" checks. If all checks implemented by a handler have the
same recoverability setting, then the generated code will be the same.
llvm-svn: 225719
The llvm IR until recently had no support for comdats. This was a problem when
targeting C++ on ELF/COFF as just using weak linkage would cause quite a bit of
dead bits to remain on the executable (unless -ffunction-sections,
-fdata-sections and --gc-sections were used).
To fix the problem, llvm's codegen will just assume that any weak or linkonce
that is not in an explicit comdat should be output in one with the same name as
the global.
This unfortunately breaks cases like pr19848 where a weak symbol is not
xpected to be part of any comdat.
Now that we have explicit comdats in the IR, we can finally get both cases
right.
This first patch just makes clang give explicit comdats to GlobalValues where
t is allowed to.
A followup patch to llvm will then stop implicitly producing comdats.
llvm-svn: 225705
Their linkage can change if they are later explicitly instantiated. We would
previously emit such functions eagerly (as opposed to lazily on first use) if
they have a 'dllexport' or 'used' attribute, and fail an assert when hitting the
explicit instantiation.
This is achieved by replacing the old CodeGenModule::MayDeferGeneration() method
with two new ones: MustBeEmitted() and MayBeEmittedEagerly().
Differential Revision: http://reviews.llvm.org/D6674
llvm-svn: 225570
This reverts commit r225000, r225021, r225083, r225086, r225090.
The root change (r225000) still has several issues where it's caused
calls to be emitted without debug locations. This causes assertion
failures if/when those calls are inlined.
I'll work up some test cases and fixes before recommitting this.
llvm-svn: 225555
Allow blessed access to the symbol rewriter from the driver. Although the
symbol rewriter could be invoked through tools like opt and llc, it would not
accessible from the frontend. This allows us to read the rewrite map files in
the frontend rather than the backend and enable symbol rewriting for actually
performing the symbol interpositioning.
llvm-svn: 225504
We were previously emitting counter increments even if we didn't have
an insertion point, which would result in a CallInst with no
parent. This leads to a crash, as in pr22166, if we try to do
GlobalDCE.
llvm-svn: 225495
PR22096 has several test cases that assert that look fairly different. I'm
adding one of those as an automated test, but when relanding the other cases
should probably be checked as well.
llvm-svn: 225361
r225000 generalized debug info line info handling for expressions such
that this code is no longer necessary.
This removes the last use of CGDebugInfo::getLocation, but not all the
uses of CGDebugInfo::CurLoc, which is still used internally in
CGDebugInfo. I'd like to do away with all of that & might succeed after
a few more patches.
llvm-svn: 225085
The optimization (that appears to have been here since the earliest
implementation (r50848) & has become more complicated over the years) to
avoid recreating the debugloc if it would be the same was out of date
because ApplyDebugLocation was not re-updating the CurLoc/PrevLoc. This
optimization doesn't look terribly beneficial/necessary, so I'm removing
it - if it turns up in benchmarks, I'm happy to reconsider/reimplement
this with justification, but for now it just seems to add
complexity/problems.
llvm-svn: 225083
The DeclRefExpr might be for a variable initialized by a constant
expression which hasn't been ODR used.
Emit the initializer for the variable instead of trying to capture the
variable itself.
This fixes PR22071.
llvm-svn: 225060
Originally committed in r224385 and reverted in r224441 due to concerns
this change might've introduced a crash. Turns out this change fixes the
crash introduced by one of my earlier more specific location handling
changes (those specific fixes are reverted by this patch, in favor of
the more general solution).
Recommitted in r224941 and reverted in r224970 after it caused a crash
when building compiler-rt. Looks to be due to this change zeroing out
the debug location when emitting default arguments (which were meant to
inherit their outer expression's location) thus creating call
instructions without locations - these create problems for inlining and
must not be created. That is fixed and tested in this version of the
change.
Original commit message:
This is a more scalable (fixed in mostly one place, rather than many
places that will need constant improvement/maintenance) solution to
several commits I've made recently to increase source fidelity for
subexpressions.
This resetting had to be done at the DebugLoc level (not the
SourceLocation level) to preserve scoping information (if the resetting
was done with CGDebugInfo::EmitLocation, it would've caused the tail end
of an expression's codegen to end up in a potentially different scope
than the start, even though it was at the same source location). The
drawback to this is that it might leave CGDebugInfo out of sync. Ideally
CGDebugInfo shouldn't have a duplicate sense of the current
SourceLocation, but for now it seems it does... - I don't think I'm
going to tackle removing that just now.
I expect this'll probably cause some more buildbot fallout & I'll
investigate that as it comes up.
Also these sort of improvements might be starting to show a weakness/bug
in LLVM's line table handling: we don't correctly emit is_stmt for
statements, we just put it on every line table entry. This means one
statement split over multiple lines appears as multiple 'statements' and
two statements on one line (without column info) are treated as one
statement.
I don't think we have any IR representation of statements that would
help us distinguish these cases and identify the beginning of each
statement - so that might be something we need to add (possibly to the
lexical scope chain - a scope for each statement). This does cause some
problems for GDB and possibly other DWARF consumers.
llvm-svn: 225000
Summary:
In a JIT context it is useful to be able to access the GlobalCtors
and especially clear them once they have been emitted and called.
This adds a public method to be able to access the list.
Subscribers: yaron.keren, cfe-commits
Differential Revision: http://reviews.llvm.org/D6790
llvm-svn: 224982
Originally committed in r224385 and reverted in r224441 due to concerns
this change might've introduced a crash. Turns out this change fixes the
crash introduced by one of my earlier more specific location handling
changes (those specific fixes are reverted by this patch, in favor of
the more general solution).
Original commit message:
This is a more scalable (fixed in mostly one place, rather than many
places that will need constant improvement/maintenance) solution to
several commits I've made recently to increase source fidelity for
subexpressions.
This resetting had to be done at the DebugLoc level (not the
SourceLocation level) to preserve scoping information (if the resetting
was done with CGDebugInfo::EmitLocation, it would've caused the tail end
of an expression's codegen to end up in a potentially different scope
than the start, even though it was at the same source location). The
drawback to this is that it might leave CGDebugInfo out of sync. Ideally
CGDebugInfo shouldn't have a duplicate sense of the current
SourceLocation, but for now it seems it does... - I don't think I'm
going to tackle removing that just now.
I expect this'll probably cause some more buildbot fallout & I'll
investigate that as it comes up.
Also these sort of improvements might be starting to show a weakness/bug
in LLVM's line table handling: we don't correctly emit is_stmt for
statements, we just put it on every line table entry. This means one
statement split over multiple lines appears as multiple 'statements' and
two statements on one line (without column info) are treated as one
statement.
I don't think we have any IR representation of statements that would
help us distinguish these cases and identify the beginning of each
statement - so that might be something we need to add (possibly to the
lexical scope chain - a scope for each statement). This does cause some
problems for GDB and possibly other DWARF consumers.
llvm-svn: 224941
Create an ConstantAggregateZero upfront if we see that it is viable.
This saves us from having to manually push_back each and every
initializer and then looping back over them to determine if they are
'null'.
llvm-svn: 224908
This still lower to the same intrinsics as before.
This is preparation for bounds checking the immediate on the avx version of the builtin so we don't pass illegal immediates into the backend. Since SSE uses a smaller size immediate its not possible to bounds check when using a shared builtin. Rather than creating a clang specific builtin for the different immediate, I decided (after consulting with Chandler) that it was better to match gcc.
llvm-svn: 224879
Turns out there will be left-over deferred inline methods if there have
been errors, because in that case HandleTopLevelDecl bails out early.
llvm-svn: 224649
While we're here, also move the declaration of DeferredInlineMethodDefinitions
closer to the other member vars and make it a SmallVector. NFC.
llvm-svn: 224533
Fixed assertion on type checking for arguments and parameters on function call if arguments are pointers to VLA
Differential Revision: http://reviews.llvm.org/D6655
llvm-svn: 224504
Use new `DIBuilder` API from LLVM r224482 to mutate `DICompositeType`s,
rather than changing them directly. This allows `DIBuilder` to track
otherwise orphaned cycles when `CollectContainingType()` creates a
self-reference.
Fixes PR21941.
llvm-svn: 224483
This state object makes things harder to reason about and isn't really
useful, since we can just emit the mappings before the state changes
rather than holding on to it.
llvm-svn: 224476
For MSVC compatibility, add the `__emit' builtin. This is used in the Windows
SDK headers, and must therefore be implemented as a builtin rather than an
intrinsic.
The `__emit' builtin provides a mechanism to emit a 16-bit opcode instruction
into the stream. The value must be a compile time constant expression. No
guarantees are made about the CPU and memory states after the execution of the
instruction.
Due to the unchecked nature of the builtin, only support this on Windows on ARM.
llvm-svn: 224438
This is a more scalable (fixed in mostly one place, rather than many
places that will need constant improvement/maintenance) solution to
several commits I've made recently to increase source fidelity for
subexpressions.
This resetting had to be done at the DebugLoc level (not the
SourceLocation level) to preserve scoping information (if the resetting
was done with CGDebugInfo::EmitLocation, it would've caused the tail end
of an expression's codegen to end up in a potentially different scope
than the start, even though it was at the same source location). The
drawback to this is that it might leave CGDebugInfo out of sync. Ideally
CGDebugInfo shouldn't have a duplicate sense of the current
SourceLocation, but for now it seems it does... - I don't think I'm
going to tackle removing that just now.
I expect this'll probably cause some more buildbot fallout & I'll
investigate that as it comes up.
Also these sort of improvements might be starting to show a weakness/bug
in LLVM's line table handling: we don't correctly emit is_stmt for
statements, we just put it on every line table entry. This means one
statement split over multiple lines appears as multiple 'statements' and
two statements on one line (without column info) are treated as one
statement.
I don't think we have any IR representation of statements that would
help us distinguish these cases and identify the beginning of each
statement - so that might be something we need to add (possibly to the
lexical scope chain - a scope for each statement). This does cause some
problems for GDB and possibly other DWARF consumers.
llvm-svn: 224385
The variable (and the GV) is only ever used if the function is. Putting it
in the function's comdat make it easier for the linker to discard them.
The motivating example is
struct S {
static const int x;
};
// const int S::x = 42;
inline const int *f() {
static const int y = S::x;
return &y;
}
const int *g() { return f(); }
With S::x commented out, _ZZ1fvE1y is a variable with a guard variable
that is initialized by f.
With S::x present, _ZZ1fvE1y is a constant.
llvm-svn: 224369
Bitfield RefersToEnclosingLocal of Stmt::DeclRefExprBitfields renamed to RefersToCapturedVariable to reflect latest changes introduced in commit 224323. Also renamed method Expr::refersToEnclosingLocal() to Expr::refersToCapturedVariable() and comments for constant arguments.
No functional changes.
llvm-svn: 224329
Currently, if global variable is marked as a private OpenMP variable, the compiler crashes in debug version or generates incorrect code in release version. It happens because in the OpenMP region the original global variable is used instead of the generated private copy. It happens because currently globals variables are not captured in the OpenMP region.
This patch adds capturing of global variables iff private copy of the global variable must be used in the OpenMP region.
Differential Revision: http://reviews.llvm.org/D6259
llvm-svn: 224323
the simplest case, which is used when no chunk_size is specified in
the schedule(static) or no 'schedule' clause is specified - the
iteration space is divided by the library into chunks that are
approximately equal in size, and at most one chunk is distributed
to each thread. In this case, we do not need an outer loop in each
thread - each thread requests once which iterations range it should
handle (using __kmpc_for_static_init runtime call) and then runs the
inner loop on this range.
Differential Revision: http://reviews.llvm.org/D5865
llvm-svn: 224233
Currently clang fires assertions on x86-64 on any atomic operations for long double operands. Patch fixes codegen for such operations.
Differential Revision: http://reviews.llvm.org/D6499
llvm-svn: 224230
This actually came up as a break in UBSan tests (look for a follow-up
commit to this one to see the UBSan test fallout) when I tried a broader
fix to location information.
I have some other ideas about how to do that broader change & will keep
looking into it.
llvm-svn: 224221
CodeGen assumed that a compound literal with array type should have a
corresponding LLVM IR array type.
We had two bugs in this area:
- Zero sized arrays in compound literals would lead to the creation of
an opaque type. This is unnecessary, we should just create an array
type with a bound of zero.
- Funny record types (like unions) lead to exotic IR types for compound
literals. In this case, CodeGen must be prepared to deal with the
possibility that it might not have an array IR type.
This fixes PR21912.
llvm-svn: 224219
The extension has the following syntax:
__builtin_call_with_static_chain(Call, Chain)
where Call must be a function call expression and Chain must be of pointer type
This extension performs a function call Call with a static chain pointer
Chain passed to the callee in a designated register. This is useful for
calling foreign language functions whose ABI uses static chain pointers
(e.g. to implement closures).
Differential Revision: http://reviews.llvm.org/D6332
llvm-svn: 224167
having OptimizeNone remove them again, just don't add them in the
first place if the function already has OptimizeNone.
Note that MinSize can still appear due to attributes on different
declarations; a future patch will address that.
llvm-svn: 224047
arithmetic relaxation flags:
-cl-no-signed-zeros
-cl-unsafe-math-optimizations
-cl-finite-math-only
-cl-fast-relaxed-math
Propagate the info to FP instruction flags as well
as function attributes where they are available.
llvm-svn: 223928
This particularly helps the fidelity of ASan reports (which can occur
even in these examples - if, for example, one uses placement new over a
buffer of insufficient size - now ASan will correctly identify which
member's initialization went over the end of the buffer).
This doesn't cover all types of members - more coming.
llvm-svn: 223726
The logic for lowering profiling counters has been moved to an LLVM
pass. Emit the intrinsics rather than duplicating the whole pass in
clang.
llvm-svn: 223683
Summary:
When -fsanitize-address-field-padding=1 is present
don't emit memcpy for copy constructor.
Thanks Nico for the extra test case.
Test Plan: regression tests
Reviewers: thakis, rsmith
Reviewed By: rsmith
Subscribers: rsmith, cfe-commits
Differential Revision: http://reviews.llvm.org/D6515
llvm-svn: 223563
Adds generation of call to "i32 kmpc_cancel_barrier(ident_t *, i32)" libcall for explicitly specified barriers (OMP_IDENT_BARRIER_EXPL flag is added to "flags" field of "ident_t" structure).
Also this patch replaces all calls to "kmpc_barrier" function by calls of "__kmpc_cancel_barrier" function which provides additional functionality for OpenMP 4.0.
Also, library specific enum OpenMPLocationFlags moved to private section of CGOpenMPRuntime class to make it more independent from library implementation.
Differential Revision: http://reviews.llvm.org/D6447
llvm-svn: 223444
http://llvm.org/bugs/show_bug.cgi?id=21555
Currently, kernel argument metadata is omitted unless the
"-cl-kernel-arg-info" option is specified. But the SPIR 1.2 spec
requires that all metadata except kernel_arg_name should always be
emitted, and kernel_arg_name is only emitted when
"-cl-kernel-arg-info" is specified.
Patch ported by Ryan Burn from the Khronos SPIR generator.
https://github.com/KhronosGroup/SPIR
llvm-svn: 223340
Create a helper function to construct a value for the ARM hint intrinsic
rather than inling the construction. In order to avoid the use of the sentinel
value, inline the use of intrinsic instruction retrieval. NFC.
llvm-svn: 223338
We currently use i32 (...)** as the type of the vptr field in the LLVM
struct type. LLVM's GlobalOpt prefers any bitcasts to be on the side of
the data being stored rather than on the pointer being stored to.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D5916
llvm-svn: 223267
As the semantics of prefix data has changed. See D6454.
Patch by Ben Gamari!
Test Plan: Testsuite
Differential Revision: http://reviews.llvm.org/D6489
llvm-svn: 223190
Consider this program:
struct A {
virtual void operator-() { printf("base\n"); }
};
struct B final : public A {
virtual void operator-() override { printf("derived\n"); }
};
int main() {
B* b = new B;
-static_cast<A&>(*b);
}
Before this patch, clang saw the virtual call to A::operator-(), figured out
that it can be devirtualized, and then just called A::operator-() directly,
without going through the vtable. Instead, it should've looked up which
operator-() the call devirtualizes to and should've called that.
For regular virtual member calls, clang gets all this right already. So
instead of giving EmitCXXOperatorMemberCallee() all the logic that
EmitCXXMemberCallExpr() already has, cut the latter function into two pieces,
call the second piece EmitCXXMemberOrOperatorMemberCallExpr(), and use it also
to generate code for calls to virtual member operators.
This way, virtual overloaded operators automatically don't get devirtualized
if they have covariant returns (like it was done for regular calls in r218602),
etc.
This also happens to fix (or at least improve) codegen for explicit constructor
calls (`A a; a.A::A()`) in MS mode with -fsanitize-address-field-padding=1.
(This adjustment for virtual operator calls seems still wrong with the MS ABI.)
llvm-svn: 223185
There's no need to use different names for the local variables than we
use in the profile itself, and it's a bit simpler and easier to debug
if we're consistent.
llvm-svn: 223173
It doesn't make much sense to have std::unique_ptrs of std::string and
std::vector. Avoid some useless indirection by using these types
directly.
llvm-svn: 223166
ARM ABI specifies that all the libcalls use soft FP ABI
(even hard FP binaries). These days clang emits _mulsc3 / _muldc3
calls with default (C) calling convention which would be translated
into AAPCS_VFP LLVM calling and thus the result of complex
multiplication will be bogus.
Introduce a way for a target to specify explicitly calling
convention for libcalls. Right now this is temporary correctness
fix. Ultimately, we'll end with intrinsic for complex
multiplication and all calling convention decisions for libcalls
will be put into backend.
llvm-svn: 223123
Richard rejected my Sema change to interpret an integer literal zero in
a varargs context as a null pointer, so -Wsentinel sees an integer
literal zero and fires off a warning. Only CodeGen currently knows that
it promotes integer literal zeroes in this context to pointer size on
Windows. I didn't want to teach -Wsentinel about that compatibility
hack. Therefore, I'm migrating to C++11 nullptr.
llvm-svn: 223079
Now that TailRecursionElimination has been fixed with r222354, the
threshold on size for lifetime marker insertion can be removed. This
only affects named temporary though, as the patch for unnamed temporaries
is still in progress.
llvm-svn: 222993
Now that LLVM can count the registers needed to implement AAPCS rules, we don't
need to duplicate that logic here. This means we can drop the explicit padding
and also use more natural types in many cases (e.g. "struct { float arr[3]; }"
used to end up as "[2 x double]" to avoid holes on the stack.
The one wrinkle is that AAPCS va_arg was also using the register counting
machinery. But the local replacement isn't too bad.
llvm-svn: 222904
OpenCL v2.0 s6.5.5 restricts conversion of pointers to different address spaces:
- the named address spaces (__global, __local, and __private) => __generic - implicitly converted;
- __generic => named - with an explicit cast;
- named <=> named - disallowed;
- __constant <=> any other - disallowed.
llvm-svn: 222834
Rethrowing exceptions in the MS model is very simple: just call
_CxxThrowException with nullptr for both arguments.
N.B. They chose stdcall as the calling convention for x86 but cdecl for
all other platforms.
llvm-svn: 222733
All supported environments on x86 Windows return structs in EAX:EDX.
This removes code added in r204978 that had to get updated in r222680.
We should now have the same behavior we had before r204978.
llvm-svn: 222697
Cygwin and MinGW fail to conform to the underlying system's structure passing
ABI. Make the check more precise to ensure that we correctly generate code for
the itanium environment.
llvm-svn: 222626
When emitting nested block definitions, the insert-at-point variant of
DIBuilder::insertDeclare() could be called with the insertion point set
to the end-of-BasicBlock sentinel, causing the parent pointer of the
CallInst to be set to the intentionally bogus value of the sentinel.
Fixed by conditionally invoking the correct version of insertDeclare().
rdar://problem/19034882
llvm-svn: 222487
After LLVM r222434, the Variables field of DISubprograms for forward
declarations will always be null. No need to keep code around to
delete them.
llvm-svn: 222437
This is a followup to r222373. A better solution to the problem solved
there is to not create the leaked nodes at all (we know that they will
never be used for forward declared functions anyway). To avoid bot
breakage in the interval between the cfe and llvm commits, add a check
that the nMDNode is not null before deleting it. This code can completely
go away after the LLVM part is in.
llvm-svn: 222433
For each "omp flush" directive a call to "void kmpc_flush(ident_t *, ...)" function is generated.
Directive "omp flush" may have an associated list of variables to flush, but currently runtime function ignores them. So the patch generates just "call kmpc_flush(ident_t *<loc>, i32 0)".
Differential Revision: http://reviews.llvm.org/D6292
llvm-svn: 222409
While emitting debug information for function forward decalrations, we
create DISubprogram objects that aran't stored in the AllSubprograms
list, and thus won't get finalized by the DIBuilder. During the DIBuilder
finalize(), the temporary MDNode allocated for the DISubprogram
Variables field gets RAUWd with a non temporary DIArray. For the forward
declarations, simply delete that temporary node before we delete the
parent node, so that it doesn't leak.
llvm-svn: 222373
Summary:
With this patch, passing a va_list to another function and reading 10 int's from
it works correctly on a big-endian target.
Based on a pair of patches by David Chisnall, one of which I've reworked
for the current trunk.
Reviewers: theraven, atanasyan
Reviewed By: theraven, atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6248
llvm-svn: 222339
Summary:
This distinguishes between -fpic and -fPIC now, with the additions in LLVM for
PIC level support.
Test Plan: No regressions
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: rnk, emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D5400
llvm-svn: 222227
Currently this function would return nothing for functions or globals that
haven't seen a definition yet. Make it return a forward declaration that will
get RAUWed with the definition if one is seen at a later point. The strategy
used to implement this is similar to what's done for types: the forward
declarations are stored in a vector and post processed upon finilization to
perform the required RAUWs.
For now the only user of getDeclarationOrDefinition() is EmitUsingDecl(), thus
this patch allows to emit correct imported declarations even in the absence of
an actual definition of the imported entity.
(Another user will be the debug info generation for argument default values
that I need to resurect).
Differential Revision: http://reviews.llvm.org/D6173
llvm-svn: 222220
We include unused functions and methods in -fcoverage-mapping so that
we can differentiate between uninstrumented and unused. This can cause
problems for uninstantiated templates though, since they may involve
an incomplete type that can't be mangled. This shows up in things like
libc++'s <unordered_map> and makes coverage unusable.
Avoid the issue by skipping uninstantiated methods of a templated
class.
llvm-svn: 222204
When targeting Windows itanium (a MSVC environment), use itanium style
exceptions rather than SEH. Existing test cases already test this code path.
Applying this change ensures that tests wont break due to a parallel change in
LLVM (to correctly report isMSVCEnvironment).
llvm-svn: 222179
used inside blocks. It fixes a crash in naming code
for __func__ etc. when used in a block declared globally.
It also brings back old naming convention for
predefined expression which was broken. rdar://18961148
llvm-svn: 222065
This option was misleading because it looked like it enabled the
language feature of SEH (__try / __except), when this option was really
controlling which EH personality function to use. Mingw only supports
SEH and SjLj EH on x86_64, so we can simply do away with this flag.
llvm-svn: 221963
This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for
PowerPC, which provide programmer access to the lxvd2x, lxvw4x,
stxvd2x, and stxvw4x instructions.
New code in altivec.h defines these in terms of new builtins, which
are themselves defined in BuiltinsPPC.def. The builtins are converted
to LLVM intrinsics in CGBuiltin.cpp. Additional code is added to
builtins-ppc-vsx.c to verify the correct generation of the intrinsics.
Note that I moved the other VSX builtins so all VSX builtins will be
alphabetical in their own section in BuiltinsPPC.def.
There is a companion patch for LLVM.
llvm-svn: 221768
Summary: If we've added poisoned paddings to a type do not emit memcpy for operator=.
Test Plan: regression tests.
Reviewers: majnemer, rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6160
llvm-svn: 221739
Summary:
This change makes the asan-coverge (formerly -mllvm -asan-coverge)
accessible via a clang flag.
Companion patch to LLVM is http://reviews.llvm.org/D6152
Test Plan: regression tests, chromium
Reviewers: samsonov
Reviewed By: samsonov
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6153
llvm-svn: 221719
Summary:
This change makes CodeGenFunction::EmitCheck() take several
conditions that needs to be checked (all of them need to be true),
together with sanitizer kinds these checks are for. This would allow
to split one call into UBSan runtime into several calls in case
different sanitizer kinds would have different recoverability
settings.
Tests should be fixed accordingly, I'm working on it.
Test Plan: regression test suite.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6219
llvm-svn: 221716
So DWARF5 specs out auto deduced return types as DW_TAG_unspecified_type
with DW_AT_name "auto", and GCC implements this somewhat, but it
presents a few problems to do this with Clang.
GCC's implementation only applies to member functions where the auto
return type isn't deduced immediately (ie: member functions of templates
or member functions defined out of line). In the common case of an
inline deduced return type function, GCC emits the DW_AT_type as the
deduced return type.
Currently GDB doesn't seem to behave too well with this debug info - it
treats the return type as 'void', even though the definition of the
function has the correctly deduced return type (I guess it sees the
return type the declaration has, doesn't understand it, and assumes
void). This means the function's ABI might be broken (non-trivial return
types, etc), etc.
Clang, on the other hand doesn't track this particular case of a
deducable return type that is deduced immediately versus one that is
deduced 'later'. So if we implement the DWARF5 representation, all
deducible return type functions would get adverse GDB behavior
(including deduced return type lambda functions, inline deduced return
type functions, etc).
Also, we can't just do this for auto types that are not deduced -
because Clang marks even the declaration's return type as deduced (&
provides the underlying type) once a definition is seen that allows the
deduction. So we have to ignore even deduced types - but we can't do
that for auto variables (because this representation only applies to
function declarations - variables and function definitions need the real
type so the function can be called, etc) so we'd need to add an extra
flag to the type unwrapping/creation code to indicate when we want to
see through deduced types and when we don't. It's also not as simple as
just checking at the top level when building a function type (for one
thing, we reuse the function type building for building function pointer
types which might also have 'auto' in them - but be the type of a
variable instead) because the auto might be arbitrarily deeply nested
("auto &", "auto (*)()", etc...)
So, with all that said, let's do the simple thing that works in existing
debuggers for now and treat these functions the same way we do function
templates and implicit special members: omit them from the member list,
since they can't be correctly called anyway (without knowing the return
type the ABI isn't know and a function call could put the arguments in
the wrong place) so they're not much use to the user.
At some point in the future, when GDB understands the DWARF5
representation better it might be worth plumbing through the extra type
builder handling to avoid looking through AutoType for some callers,
etc...
llvm-svn: 221704
For all threadprivate variables which have constructor/destructor emit call to void __kmpc_threadprivate_register(ident_t * <Current Location>, void *<Original Global Addr>, kmpc_ctor <Constructor>, kmpc_cctor NULL, kmpc_dtor <Destructor>);
In expressions all references to such variables are replaced by calls to void *__kmpc_threadprivate_cached(ident_t *<Current Location>, kmp_int32 <Current Thread Id>, void *<Original Global Addr>, size_t <Size of Data>, void ***<Pointer to autogenerated cache – array of private copies of threadprivate variable>);
Test test/OpenMP/threadprivate_codegen.cpp checks that codegen is correct. Also it checks that codegen is correct after serialization/deserialization and one of passes verifies debug info.
Differential Revision: http://reviews.llvm.org/D4002
llvm-svn: 221663
Get rid of ugly SanitizerOptions class thrust into LangOptions:
* Make SanitizeAddressFieldPadding a regular language option,
and rely on default behavior to initialize/reset it.
* Make SanitizerBlacklistFile a regular member LangOptions.
* Introduce the helper class "SanitizerSet" to represent the
set of enabled sanitizers and make it a member of LangOptions.
It is exactly the entity we want to cache and modify in CodeGenFunction,
for instance. We'd also be able to reuse SanitizerSet in
CodeGenOptions for storing the set of recoverable sanitizers,
and in the Driver to represent the set of sanitizers
turned on/off by the commandline flags.
No functionality change.
llvm-svn: 221653
Make sure CodeGenFunction::EmitCheck() knows which sanitizer
it emits check for. Make CheckRecoverableKind enum an
implementation detail and move it away from header.
Currently CheckRecoverableKind is determined by the type of
sanitizer ("unreachable" and "return" are unrecoverable,
"vptr" is always-recoverable, all the rest are recoverable).
This will change in future if we allow to specify which sanitizers
are recoverable, and which are not by -fsanitize-recover= flag.
No functionality change.
llvm-svn: 221635
Homogeneous aggregates on AAPCS_VFP ARM need to be passed *without* being
flattened (e.g. [2 x float] rather than "float, float") for various weird ABI
reasons. However, this isn't the case for anything else; further, we know at
the ABIArgInfo::getDirect callsites whether this flattening is allowed.
So, we can get more unified ARM code, with a simpler Clang, by just using that
knowledge directly.
llvm-svn: 221559
Use the bitmask to store the set of enabled sanitizers instead of a
bitfield. On the negative side, it makes syntax for querying the
set of enabled sanitizers a bit more clunky. On the positive side, we
will be able to use SanitizerKind to eventually implement the
new semantics for -fsanitize-recover= flag, that would allow us
to make some sanitizers recoverable, and some non-recoverable.
No functionality change.
llvm-svn: 221558
We would blindly assume that RTTI data should have the same linkage as
the vtable because we didn't think the RTTI data was external. This
oversight stemmed because we didn't take dllimport into account.
This fixes PR21512.
llvm-svn: 221511
Custom targets in cmake cannot be exported, and this dependency is only
needed in the combined build to ensure that Intrinsics.gen is created
before compiling CodeGen. In the standalone, all of LLVM is build first.
llvm-svn: 221391
When we are generating the global initializer functions, we call
CGDebugInfo::EmitFunctionStart() with a valid decl which is describing
the initialized global variable. Do not update the DeclCache with this
key as it will overwrite the the cached variable DIGlobalVariable with
the newly created artificial DISubprogram.
One could wonder if we should put artificial subprograms in the DIE tree
at all (there are vaild uses for them carrying line information though).
llvm-svn: 221385
mingw64's headers implement fabs by calling __builtin_fabs, so using the
library call results in an infinite loop. If the backend legalizes
@llvm.fabs as a call to fabs later, things should work out, as the crt
provides a definition.
llvm-svn: 221206
Local variables are not initialized, and every target has
been (incorrectly) ignoring the unnecessary request for
zero initialization.
llvm-svn: 221162
It turns out that MinGW never dllimports of exports inline functions.
This means that code compiled with Clang would fail to link with
MinGW-compiled libraries since we might try to import functions that
are not imported.
To fix this, make Clang never dllimport inline functions when targeting
MinGW.
llvm-svn: 221154
It says there is a narrowing conversion when we assign it to an unsigned
3 bit bitfield.
Also, use unsigned instead of size_t for the Size field of the struct in
question. Otherwise they won't run together in MSVC or clang-cl.
llvm-svn: 221019
The most complex aspect of the convention is the handling of homogeneous
vector and floating point aggregates. Reuse the homogeneous aggregate
classification code that we use on PPC64 and ARM for this.
This convention also has a C mangling, and we apparently implement that
in both Clang and LLVM.
Reviewed By: majnemer
Differential Revision: http://reviews.llvm.org/D6063
llvm-svn: 221006
Summary:
The Itanium ABI approach of using offset-to-top isn't possible with the
MS ABI, it doesn't have that kind of information lying around.
Instead, we do the following:
- Call the virtual deleting destructor with the "don't delete the object
flag" set. The virtual deleting destructor will return a pointer to
'this' adjusted to the most derived class.
- Call the global delete using the adjusted 'this' pointer.
Reviewers: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5996
llvm-svn: 220993
Summary:
When we are adding field paddings for asan even an empty dtor has to remain in the code,
so we ignore -mconstructor-aliases if the paddings are going to be added.
Test Plan: added a test
Reviewers: rsmith, rnk, rafael
Reviewed By: rafael
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6038
llvm-svn: 220986
Reuse the PPC64 HVA detection algorithm for ARM and AArch64. This is a
nice code deduplication, since they are roughly identical. A few virtual
method extension points are needed to understand how big an HVA can be
and what element types it can have for a given architecture.
Also make the record expansion code work in the presence of non-virtual
bases.
Reviewed By: uweigand, asl
Differential Revision: http://reviews.llvm.org/D6045
llvm-svn: 220972
SanitizerOptions is not even a POD now, so having global variable of
this type, is not nice. Instead, provide a regular constructor and clear()
method, and let each CodeGenFunction has its own copy of SanitizerOptions
it uses.
llvm-svn: 220920
The Windows NT SDK uses __readfsdword and declares it as a compiler provided
builtin (#pragma intrinsic(__readfsword). Because intrin.h is not referenced
by winnt.h, it is not possible to provide an out-of-line definition for the
intrinsic. Provide a proper compiler builtin definition.
llvm-svn: 220859
Following the NVVM IR specifications, arguments of aggregate type should be
passed on the stack without splitting (byval).
http://reviews.llvm.org/D6020
Patch by Jacques Pienaar.
llvm-svn: 220854
As discussed in bug 21398, PowerPC ABI code needs to consider C++ base
classes when classifying a class as homogeneous aggregate (or not) for
ABI purposes.
llvm-svn: 220852
An updated implemnentation of VLA types capturing based on previously committed solution for Lambdas.
This version captures the whole VLA type instead of particular variables which are part of VLA size expression and allows to use previusly calculated size of VLA type in captured regions. Required for OpenMP.
Differential Revision: http://reviews.llvm.org/D5099
llvm-svn: 220850
The MS linker cannot do anything interesting with these, it doesn't make
sense to emit them.
This fixes PR21373.
Differential Revision: http://reviews.llvm.org/D5986
llvm-svn: 220595
Avoid an assertion when materializing a lifetime type aggregate temporary. When
performing CodeGen for ObjC++, we could generate a lifetime-only aggregate
temporary by using an initializer list (which is effectively an array). We
would reach through the temporary expression, fishing out the inner expression.
If this expression was a lifetime expression, we would attempt to emit this as a
scalar. This would eventually result in an assertion as the emission would
eventually assert that the expression being emitted has a scalar evaluation
kind.
Add a case to handle the aggregate expressions. Use the EmitAggExpr to emit the
aggregate expression rather than the EmitScalarInit.
Addresses PR21347.
llvm-svn: 220590
This fixes a corner-case where __uuidof as a template argument would
result in us trying to emit a GLValue as an RValue. This would lead to
a crash down the road.
llvm-svn: 220585
Wire it through everywhere we have support for fastcall, essentially.
This allows us to parse the MSVC "14" CTP headers, but we will
miscompile them because LLVM doesn't support __vectorcall yet.
Reviewed By: Aaron Ballman
Differential Revision: http://reviews.llvm.org/D5808
llvm-svn: 220573
Summary:
This allows us to easily identify them in the backend which in turn allows us
to handle them correctly for big-endian targets (where they must be shifted
into the upper bits of the register).
Depends on D5961
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits, theraven
Differential Revision: http://reviews.llvm.org/D5962
llvm-svn: 220566
Summary:
Ensure all integral/enumeration types are appropriately annotated with
signext/zeroext. In particular, i32 now has these attributes when using the
N32/N64 ABI. This paves the way for accurately representing the way the
N32/N64 ABI's promotes integer arguments to i64.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits, theraven
Differential Revision: http://reviews.llvm.org/D5961
llvm-svn: 220563
Clang would previously assert on the following code when targeting MinGW:
struct __declspec(dllimport) S {
virtual ~S();
};
S::~S() {}
Because ~S is a key function and the class is dllimport, we would try to emit a
strong definition of the vtable, with dllimport - which is a conflict. We
should not emit strong vtable definitions for imported classes.
Differential Revision: http://reviews.llvm.org/D5944
llvm-svn: 220532
The previous IR representation used the non-lexical decl context, which
placed the definitions in the same scope as the declarations (ie: within
the class) - this was hidden by the fact that LLVM currently doesn't
respect the context of global variable definitions at all, and always
puts them at the top level (as direct children of the compile_unit).
Having the correct lexical scope improves source fidelity and simplify
backend global variable emission (with changes coming shortly).
Doing something similar for non-member global variables would help
simplify/cleanup things further (see FIXME in the commit) and provide
similar source fidelity benefits to the final debug info.
llvm-svn: 220488
This eliminates some i8* GEPs and makes the IR that clang emits a bit
more canonical. More work is needed for vftables, but that isn't a clear
win so I plan to send it for review.
llvm-svn: 220398
This patch generates some helper variables which used as a private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by default (with the default constructor, if any). In outlined function references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables and implicit barier is set by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D4752
llvm-svn: 220262
This reverts commit r220169 which reverted r220153. However, it also
contains additional changes:
- We may need to add padding *after* we've packed the struct. This
occurs when the aligned next field offset is greater than the new
field's offset. When this occurs, we make the struct packed.
*However*, once packed the next field offset might be less than the
new feild's offset. It is in this case that we might further pad the
struct.
- We would pad structs which were perfectly sized! This behavior is
immensely old. This behavior came from blindly subtracting
NextFieldOffsetInChars from RecordSize. This doesn't take into
account the fact that the struct might have a greater overall
alignment than the last field.
llvm-svn: 220175
This commit caused two tests in LNT to regress. I'm able to reproduce on
any platform and will send reproduction steps to the original commit
log. This should restore the LNT bots that have been failing.
llvm-svn: 220169
a NaN-test prior to the call to the library function.
This should automatically make fastmath (including just non-NaNs) able to avoid
the expensive libcalls and also open the door to more advanced folding in LLVM
based on the rules for complex math.
Two important notes to remember: first is that this isn't yet a proper
limited range mode, it's still just improving the unlimited range mode.
Also, it isn't really perfecet w.r.t. what an unlimited range mode
should be doing because it isn't quite handling the flags produced by
all the operations in the way desirable for that mode, but then neither
is compiler-rt's libcall. When the compiler-rt libcall is improved to
carefully manage flags, the code emitted here should be improved
correspondingly. And it is still a long-term desirable thing to add
a limited range mode to Clang that would be able to use direct math
without library calls here.
Special thanks to Steve Canon for the careful review on this patch and
teaching me about these issues. =D
Differential Revision: http://reviews.llvm.org/D5756
llvm-svn: 220167
Before, ConstStructBuilder::AppendBytes would check packed constraints
prior to padding being added before the field's offset. However, adding
this padding might force our struct to be packed. Because we wouldn't
check *after* adding padding, ConstStructBuilder would be in an
inconsistent state leading to a crash.
This fixes PR21300.
llvm-svn: 220153
This commit changes the way we blacklist global variables in ASan.
Now the global is excluded from instrumentation (either regular
bounds checking, or initialization-order checking) if:
1) Global is explicitly blacklisted by its mangled name.
This part is left unchanged.
2) SourceLocation of a global is in blacklisted source file.
This changes the old behavior, where instead of looking at the
SourceLocation of a variable we simply considered llvm::Module
identifier. This was wrong, as identifier may not correspond to
the file name, and we incorrectly disabled instrumentation
for globals coming from #include'd files.
3) Global is blacklisted by type.
Now we build the type of a global variable using Clang machinery
(QualType::getAsString()), instead of llvm::StructType::getName().
After this commit, the active users of ASan blacklist files
may have to revisit them (this is a backwards-incompatible change).
llvm-svn: 220097
This commit changes the way we blacklist functions in ASan, TSan,
MSan and UBSan. We used to treat function as "blacklisted"
and turned off instrumentation in it in two cases:
1) Function is explicitly blacklisted by its mangled name.
This part is not changed.
2) Function is located in llvm::Module, whose identifier is
contained in the list of blacklisted sources. This is completely
wrong, as llvm::Module may not correspond to the actual source
file function is defined in. Also, function can be defined in
a header, in which case user had to blacklist the .cpp file
this header was #include'd into, not the header itself.
Such functions could cause other problems - for instance, if the
header was included in multiple source files, compiled
separately and linked into a single executable, we could end up
with both instrumented and non-instrumented version of the same
function participating in the same link.
After this change we will make blacklisting decision based on
the SourceLocation of a function definition. If a function is
not explicitly defined in the source file, (for example, the
function is compiler-generated and responsible for
initialization/destruction of a global variable), then it will
be blacklisted if the corresponding global variable is defined
in blacklisted source file, and will be instrumented otherwise.
After this commit, the active users of blacklist files may have
to revisit them. This is a backwards-incompatible change, but
I don't think it's possible or makes sense to support the
old incorrect behavior.
I plan to make similar change for blacklisting GlobalVariables
(which is ASan-specific).
llvm-svn: 219997
Summary:
The general approach is to add extra paddings after every field
in AST/RecordLayoutBuilder.cpp, then add code to CTORs/DTORs that poisons the paddings
(CodeGen/CGClass.cpp).
Everything is done under the flag -fsanitize-address-field-padding.
The blacklist file (-fsanitize-blacklist) allows to avoid the transformation
for given classes or source files.
See also https://code.google.com/p/address-sanitizer/wiki/IntraObjectOverflow
Test Plan: run SPEC2006 and some of the Chromium tests with -fsanitize-address-field-padding
Reviewers: samsonov, rnk, rsmith
Reviewed By: rsmith
Subscribers: majnemer, cfe-commits
Differential Revision: http://reviews.llvm.org/D5687
llvm-svn: 219961
They cannot be written to, so marking them const makes sense and may improve
optimisation.
As a side-effect, SectionInfos has to be moved from Sema to ASTContext.
It also fixes this problem, that occurs when compiling ATL:
warning LNK4254: section 'ATL' (C0000040) merged into '.rdata' (40000040) with different attributes
The ATL headers are putting variables in a special section that's marked
read-only. However, Clang currently can't model that read-onlyness in the IR.
But, by making the variables const, the section does become read-only, and
the linker warning is avoided.
Differential Revision: http://reviews.llvm.org/D5812
llvm-svn: 219960
Plumb through the full QualType of the TemplateArgument::Declaration, as
it's insufficient to only know whether the type is a reference or
pointer (that was necessary for mangling, but insufficient for debug
info). This shouldn't increase the size of TemplateArgument as
TemplateArgument::Integer is still longer by another 32 bits.
Several bits of code were testing that the reference-ness of the
parameters matched, but this seemed to be insufficient (various other
features of the type could've mismatched and wouldn't've been caught)
and unnecessary, at least insofar as removing those tests didn't cause
anything to fail.
(Richard - perchaps you can hypothesize why any of these checks might
need to test reference-ness of the parameters (& explain why
reference-ness is part of the mangling - I would've figured that for the
reference-ness to be different, a prior template argument would have to
be different). I'd be happy to add them in/beef them up and add test
cases if there's a reason for them)
llvm-svn: 219900
The functionality contained in CodeGenFunction::EmitAlignmentAssumption has
been moved to IRBuilder (so that it can also be used by LLVM-level code).
Remove this now-duplicate implementation in favor of the IRBuilder code.
llvm-svn: 219877
CodeGen wouldn't mark the aliasee as thread_local if the aliasee was a
tentative definition.
Even if the definition was already emitted, it would never mark the
alias as thread_local.
This fixes PR21288.
llvm-svn: 219859
Soon we'll need to have access to blacklist before the CodeGen
phase (see http://reviews.llvm.org/D5687), so parse and construct
the blacklist earlier.
llvm-svn: 219857
After http://reviews.llvm.org/D5687 is submitted, we will need
SanitizerBlacklist before the CodeGen phase, so make it a LangOpt
(as it will actually affect ABI / class layout).
llvm-svn: 219842
This change moves SanitizerBlacklist.h from lib/CodeGen
to public Clang headers in include/clang/Basic. SanitizerBlacklist
is currently only used in CodeGen to decide which functions/modules
should be instrumented, but this will soon change as ASan will
optionally modify class layouts during AST construction
(http://reviews.llvm.org/D5687). We need blacklist machinery
to be available at this point.
llvm-svn: 219840
In particular, if you have two identical templates in different TUs in
anonymous namespaces, we would use the same global_ctors comdat key for
both. As a result, only one would be run.
llvm-svn: 219806
Unions are initialized with the default initialization of their first
named member. If that member is not zero initialized, then we should
prefer that member's type. Otherwise, we might try to make an otherwise
unsuitable type (like an array) which we cannot easily initialize with a
pointer to member.
llvm-svn: 219781
When lazily constructing static member variable declarations (when
the vtable optimization fires and the definition of the type is omitted
(or built later, lazily), but the out of line definition of the static
member is provided and must be described in debug info) ensure we use
the canonical declaration when computing the file, line, etc for that
declaration (rather than the definition, which is also a declaration,
but not the canonical one).
llvm-svn: 219736
CodeGenFunction objects aren't really designed to be reused for more
than one function, and doing so can leak debug info location information
from one function into the prologue of the next.
Add an assertion in to catch reuses of CodeGenFunction, which
surprisingly only caught the ObjC atomic getter/setter cases. Fix those
and add a test to demonstrate the issue.
The test is a bit slim, because we're just testing for the absence of a
debug location on the prologue instructions, which by itself probably
wouldn't be the end of the world - but the particular debug location
that was ending up there was for the previous function's last
instruction. This produced debug info for another function within this
function, which is something I'm trying to remove all cases of as its a
substantial source of bugs, especially around inlining (see r219215).
llvm-svn: 219690
This change adds UBSan check to upcasts. Namely, when we
perform derived-to-base conversion, we:
1) check that the pointer-to-derived has suitable alignment
and underlying storage, if this pointer is non-null.
2) if vptr-sanitizer is enabled, and we perform conversion to
virtual base, we check that pointer-to-derived has a matching vptr.
llvm-svn: 219642
This patch generates call to "kmpc_push_num_threads(ident_t *loc, kmp_int32 global_tid, kmp_int32 num_threads);" library function before calling "kmpc_fork_call" each time there is an associated "num_threads" clause in the "omp parallel" directive.
Differential Revision: http://reviews.llvm.org/D5145
llvm-svn: 219599
Adds codegen for 'if' clause. Currently only for 'if' clause used with the 'parallel' directive.
If condition evaluates to true, the code executes parallel version of the code by calling __kmpc_fork_call(loc, 1, microtask, captured_struct/*context*/), where loc - debug location, 1 - number of additional parameters after "microtask" argument, microtask - is outlined finction for the code associated with the 'parallel' directive, captured_struct - list of variables captured in this outlined function.
If condition evaluates to false, the code executes serial version of the code by executing the following code:
global_thread_id.addr = alloca i32
store i32 global_thread_id, global_thread_id.addr
zero.addr = alloca i32
store i32 0, zero.addr
kmpc_serialized_parallel(loc, global_thread_id);
microtask(global_thread_id.addr, zero.addr, captured_struct/*context*/);
kmpc_end_serialized_parallel(loc, global_thread_id);
Where loc - debug location, global_thread_id - global thread id, returned by __kmpc_global_thread_num() call or passed as a first parameter in microtask() call, global_thread_id.addr - address of the variable, where stored global_thread_id value, zero.addr - implicit bound thread id (should be set to 0 for serial call), microtask() and captured_struct are the same as in parallel call.
Also this patch checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).
Differential Revision: http://reviews.llvm.org/D4716
llvm-svn: 219597
Previously loop hints such as #pragma loop vectorize_width(#) required a constant. This patch allows a constant expression to be used as well. Such as a non-type template parameter or an expression (2 * c + 1).
Reviewed by Richard Smith
llvm-svn: 219589
While we ran getUnqualifiedType over the catch type,
it isn't enough for array types. Use getUnqualifiedArrayType instead.
This fixes PR21252.
llvm-svn: 219582
and !=) to support mixed complex and real operand types.
This requires removing an assert from SemaChecking, and adding support
both to the constant evaluator and the code generator to synthesize the
imaginary part when needed. This seemed somewhat cleaner than having
just the comparison operators force real-to-complex conversions.
I've added test cases for these operations. I'm really terrified that
there were *no* tests in-tree which exercised this.
This turned up when trying to build R after my change to the complex
type lowering.
llvm-svn: 219570
for complex math.
This should fix the windows build bots that started having trouble here
and generally fix complex libcall emission on targets which use sret for
complex data types. It also makes the code a bit simpler (despite
calling into a much more complex bucket of code).
llvm-svn: 219565
operators where one type is a C complex type, and to emit both the
efficient and correct implementation for complex arithmetic according to
C11 Annex G using this extra information.
For both multiply and divide the old code was writing a long-hand
reduced version of the math without any of the special handling of inf
and NaN recommended by the standard here. Instead of putting more
complexity here, this change does what GCC does which is to emit
a libcall for the fully general case.
However, the old code also failed to do the proper minimization of the
set of operations when there was a mixed complex and real operation. In
those cases, C provides a spec for much more minimal operations that are
valid. Clang now emits the exact suggested operations. This change isn't
*just* about performance though, without minimizing these operations, we
again lose the correct handling of infinities and NaNs. It is critical
that this happen in the frontend based on assymetric type operands to
complex math operations.
The performance implications of this change aren't trivial either. I've
run a set of benchmarks in Eigen, an open source mathematics library
that makes heavy use of complex. While a few have slowed down due to the
libcall being introduce, most sped up and some by a huge amount: up to
100% and 140%.
In order to make all of this work, also match the algorithm in the
constant evaluator to the one in the runtime library. Currently it is
a broken port of the simplifications from C's Annex G to the long-hand
formulation of the algorithm.
Splitting this patch up is very hard because none of this works without
the AST change to preserve non-complex operands. Sorry for the enormous
change.
Follow-up changes will include support for sinking the libcalls onto
cold paths in common cases and fastmath improvements to allow more
aggressive backend folding.
Differential Revision: http://reviews.llvm.org/D5698
llvm-svn: 219557
It's possible to construct cases where the first field we are trying to
copy is in the middle of an IR field. In some complicated cases, we
would fail to use an appropriate offset inside the object. Earlier
builds of clang seemed to miscompile the code by copying an insufficient
number of bytes. Up until now, we would assert: the copying offset was
insufficiently aligned.
This fixes PR21232.
llvm-svn: 219524
Moved CGOpenMPRegionInfo from CGOpenMPRuntime.h to CGOpenMPRuntime.cpp file and reworked the code for this change. Also added processing of ThreadID variable passed as an argument in outlined functions in parallel and task directives.
llvm-svn: 219490
This patch makes class OMPPrivateScope a common class for all private variables. Reworked processing of firstprivate variables (now it is based on OMPPrivateScope too).
llvm-svn: 219486
Make it possible to pass NULL through variadic functions on 64-bit
Windows targets. The Visual C++ headers define NULL to 0, when they
should define it to 0LL on Win64 so that NULL is a pointer-sized
integer.
Fixes PR20949.
Reviewers: thakis, rsmith
Differential Revision: http://reviews.llvm.org/D5480
llvm-svn: 219456
Assertion failed: "Computed __func__ length differs from type!"
Reworked PredefinedExpr representation with internal StringLiteral field for function declaration.
Differential Revision: http://reviews.llvm.org/D5365
llvm-svn: 219393
Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
llvm-svn: 219385
Summary:
The current code uses memset to re-initialize EHCleanupScope objects
with breaks the assumptions of the upcoming asan's intra-object-overflow checker.
If there is no DTOR, the new checker will refuse to work.
Test Plan: bootstrap with asan
Reviewers: rnk
Reviewed By: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5656
llvm-svn: 219331
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
llvm-svn: 219306
Boostrapping LLVM+Clang+LLDB without threshold on object size for
lifetime markers insertion has shown there was no significant change
in compile time, so let the stack slot colorizer do its optimization
for all slots.
llvm-svn: 219303
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
llvm-svn: 219297
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
llvm-svn: 219295
Summary:
Previously CodeGen assumed that static locals were emitted before they
could be accessed, which is true for automatic storage duration locals.
However, it is possible to have CodeGen emit a nested function that uses
a static local before emitting the function that defines the static
local, breaking that assumption.
Fix it by creating the static local upon access and ensuring that the
deferred function body gets emitted. We may not be able to emit the
initializer properly from outside the function body, so don't try.
Fixes PR18020. See also previous attempts to fix static locals in
PR6769 and PR7101.
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4787
llvm-svn: 219265
We used to avoid these, but it looks like we did so just because we were
not handling dllexport alias correctly.
Dario Domizioli fixed that, so allow these aliases.
Based on a patch by Dario Domizioli!
llvm-svn: 219206
Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
llvm-svn: 219197
No functional changes intended.
Renamed EmitOMPSimdLoop to EmitOMPInnerLoop, I plan to re-use
it to emit inner loop in the future patches for CodeGen of the
worksharing loop directives (omp for, omp for simd).
llvm-svn: 219195
By leaving these members out of the member list, we avoid them being
emitted into type unit definitions - while still allowing the
definition/declaration to be injected into the compile unit as expected.
llvm-svn: 219101
By leaving these members out of the member list, we avoid them being
emitted into type unit definitions - while still allowing the
definition/declaration to be injected into the compile unit as expected.
llvm-svn: 219100
Summary:
This add support for the C++11 feature, thread_local global variables.
The ABI Clang implements is an improvement of the MSVC ABI. Sadly,
further improvements could be made but not without sacrificing ABI
compatibility.
The feature is implemented as follows:
- All thread_local initialization routines are pointed to from the
.CRT$XDU section.
- All non-weak thread_local variables have their initialization routines
call from a single function instead of getting their own .CRT$XDU
section entry. This is done to open up optimization opportunities to
the compiler.
- All weak thread_local variables have their own .CRT$XDU section entry.
This entry is in a COMDAT with the global variable it is initializing;
this ensures that we will initialize the global exactly once.
- Destructors are registered in the initialization function using
__tlregdtor.
Differential Revision: http://reviews.llvm.org/D5597
llvm-svn: 219074
We already add the align parameter attribute for function parameters that have
the align_value attribute (or those with a typedef type having that attribute),
which is an important special case, but does not handle pointers with value
alignment assumptions that come into scope in any other way. To handle the
general case, emit an @llvm.assume-based alignment assumption whenever we load
the pointer-typed lvalue of an align_value-attributed variable (except for
function parameters, which we already deal with at entry).
I'll also note that this is more general than Intel's described support in:
https://software.intel.com/en-us/articles/data-alignment-to-assist-vectorization
which states that the compiler inserts __assume_aligned directives in response
to align_value-attributed variables only for function parameters and for the
initializers of local variables. I think that we can make the optimizer deal
with this more-general scheme (which could lead to a lot of calls to
@llvm.assume inside of loop bodies, for example), but if not, I'll rework this
to be less aggressive.
llvm-svn: 219052
When the aligned clause of an OpenMP simd pragma is not provided with an
explicit alignment, a target-dependent default must be used. This adds such a
default of PPC targets.
This will become slightly more complicated when BG/Q support is added (because
then it will depend on the type). For now, 16 is a correct value for all
systems, and covers Altivec and VSX vectors.
llvm-svn: 218994
This adds support for the align_value attribute. This attribute is supported by
Intel's compiler (versions 14.0+), and several of my HPC users have requested
support in Clang. It specifies an alignment assumption on the values to which a
pointer points, and is used by numerical libraries to encourage efficient
generation of vector code.
Of course, we already have an aligned attribute that can specify enhanced
alignment for a type, so why is this additional attribute important? The
problem is that if you want to specify that an input array of T is, say,
64-byte aligned, you could try this:
typedef double aligned_double attribute((aligned(64)));
void foo(aligned_double *P) {
double x = P[0]; // This is fine.
double y = P[1]; // What alignment did those doubles have again?
}
the access here to P[1] causes problems. P was specified as a pointer to type
aligned_double, and any object of type aligned_double must be 64-byte aligned.
But if P[0] is 64-byte aligned, then P[1] cannot be, and this access causes
undefined behavior. Getting round this problem requires a lot of awkward
casting and hand-unrolling of loops, all of which is bad.
With the align_value attribute, we can accomplish what we'd like in a well
defined way:
typedef double *aligned_double_ptr attribute((align_value(64)));
void foo(aligned_double_ptr P) {
double x = P[0]; // This is fine.
double y = P[1]; // This is fine too.
}
This attribute does not create a new type (and so it not part of the type
system), and so will only "propagate" through templates, auto, etc. by
optimizer deduction after inlining. This seems consistent with Intel's
implementation (thanks to Alexey for confirming the various Intel-compiler
behaviors).
As a final note, I would have chosen to call this aligned_value, not
align_value, for better naming consistency with the aligned attribute, but I
think it would be more useful to users to adopt Intel's name.
llvm-svn: 218910
Prior to GCC 4.4, __sync_fetch_and_nand was implemented as:
{ tmp = *ptr; *ptr = ~tmp & value; return tmp; }
but this was changed in GCC 4.4 to be:
{ tmp = *ptr; *ptr = ~(tmp & value); return tmp; }
in response to this change, support for sync_fetch_and_nand (and
sync_nand_and_fetch) was removed in r99522 in order to avoid miscompiling code
depending on the old semantics. However, at this point:
1. Many years have passed, and the amount of code relying on the old
semantics is likely smaller.
2. Through the work of many contributors, all LLVM backends have been updated
such that "atomicrmw nand" provides the newer GCC 4.4+ semantics (this process
was complete July of 2014 (added to the release notes in r212635).
3. The lack of this intrinsic is now a needless impediment to porting codes
from GCC to Clang (I've now seen several examples of this).
It is true, however, that we still set GNUC_MINOR to 2 (corresponding to GCC
4.2). To compensate for this, and to address the original concern regarding
code relying on the old semantics, I've added a warning that specifically
details the fact that the semantics have changed and that we provide the newer
semantics.
Fixes PR8842.
llvm-svn: 218905
Summary:
Currently, with struct my_struct { int x; method_ptr y; };
a call to foo(my_struct s) may end up dropping the last 4 bytes
of the method pointer for x86_64 NaCl and x32.
When checking Has64BitPointers, also check if the method pointer
straddles an eightbyte boundary and classify Hi as well as Lo if needed.
Test Plan: test/CodeGenCXX/x86_64-arguments-nacl-x32.cpp
Reviewers: dschuff, pavel.v.chupin
Subscribers: jfb
Differential Revision: http://reviews.llvm.org/D5555
llvm-svn: 218889
Complex address expressions are no longer part of DIVariable, but
rather an extra argument to the debug intrinsics.
http://reviews.llvm.org/D4919
rdar://problem/17994491
llvm-svn: 218788
Complex address expressions are no longer part of DIVariable, but
rather an extra argument to the debug intrinsics.
http://reviews.llvm.org/D4919
rdar://problem/17994491
llvm-svn: 218777
This patch implements collapsing of the loops (in particular, in
presense of clause 'collapse'). It calculates number of iterations N
and expressions nesessary to calculate the nested loops counters
values based on new iteration variable (that goes from 0 to N-1)
in Sema. It also adds Codegen for 'omp simd', which uses
(and tests) this feature.
Differential Revision: http://reviews.llvm.org/D5184
llvm-svn: 218743
When generating coverage regions, we were doing a linear search
through the existing regions in order to try to merge related ones.
Most of the time this would find what it was looking for in a small
number of steps and it wasn't a big deal, but in cases with many
regions and few mergeable ones this leads to an absurd compile time
regression.
This changes the coverage mapping logic to do a single sort and then
merge as we go, which is a bit simpler and about 100 times faster.
I've also added FIXMEs on a couple of behaviours that seem a little
suspect, while keeping them behaving as they were - I'll look into
these soon.
The test changes here are mostly tedious reorganization, because the
ordering of regions we output has become slightly (but not completely)
more consistent from the almost completely arbitrary ordering we got
before.
llvm-svn: 218738
This struct has some members that are accessed directly and others
that need accessors, but it's all just public. This is confusing, so
I've changed it to a class and made more members private.
llvm-svn: 218737
This is the last piece of CGCall code that had implicit assumptions about
the order in which Clang arguments are translated to LLVM ones (positions
of inalloca argument, sret, this, padding arguments etc.) Now all of
this data is encapsulated in ClangToLLVMArgsMapping. If this information
would be required somewhere else, this class can be moved to a separate
header or pulled into CGFunctionInfo.
llvm-svn: 218634
Add a method to calculate the number of arguments given QualType
expnads to. Use this method in ClangToLLVMArgMapping calculation.
This number may be cached in CodeGenTypes for efficiency, if needed.
llvm-svn: 218623
Hoist the logic which determines the way QualType is expanded
into a separate method. Remove a bunch of copy-paste and simplify
getTypesFromArgs() / ExpandTypeFromArgs() / ExpandTypeToArgs() methods.
llvm-svn: 218615
Fixes incorrect codegen when devirtualization is aborted due to covariant return types.
Differential Revision: http://reviews.llvm.org/D5321
llvm-svn: 218602
Clang uses two types to talk about a C++ class, the
NonVirtualBaseLLVMType and the LLVMType. Previously, we would allow one
of these to be packed and the other not.
This is problematic. If both don't agree on a common subset of fields,
then routines like getLLVMFieldNo will point to the wrong field. Solve
this by copying the 'packed'-ness of the complete type to the
non-virtual subobject. For this to work, we need to take into account
the non-virtual subobject's size and alignment when we are computing the
layout of the complete object.
This fixes PR21089.
llvm-svn: 218577
(clang crashed in CodeGen in llvm::Module::getNamedValue on
thread_local std::unique_ptr<int>).
Differential Revision: http://reviews.llvm.org/D5353
llvm-svn: 218503
In addition to __builtin_assume_aligned, GCC also supports an assume_aligned
attribute which specifies the alignment (and optional offset) of a function's
return value. Here we implement support for the assume_aligned attribute by making
use of the @llvm.assume intrinsic.
llvm-svn: 218500
AFAICT the semantics of frem match libm's fmod.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 218488
Most of the debug info emission is powered essentially from function
definitions - if we emit the definition of a function, we emit the types
of its parameters, the members of those types, and so on and so forth.
For types that aren't referenced even indirectly due to this - because
they only appear in temporary expressions, not in any named variable, we
need to explicitly emit/add them as is done here. This is not the only
case of such code, and we might want to consider handling "void
func(void*); ... func(new T());" (currently debug info for T is not
emitted) at some point, though GCC doesn't. There's a much broader
solution to these issues, but it's a lot of work for possibly marginal
gain (but might help us improve the default -fno-standalone-debug
behavior to be even more aggressive in some places). See the original
review thread for more details.
Patch by jyoti allur (jyoti.yalamanchili@gmail.com)!
Differential Revision: http://reviews.llvm.org/D2498
llvm-svn: 218390
On further investigation, COMDATs should work with .ctors, and the issue
I was hitting probably reproduces with .init_array.
This reverts commit r218287.
llvm-svn: 218313
In particular, pre-.init_array ELF uses the .ctors section mechanism.
MinGW COFF also uses .ctors, now that I think about it. Therefore,
restrict this optimization to the two platforms that are currently known
to work: ELF with .init_array and COFF with .CRT$XCU.
llvm-svn: 218287
Summary:
Vectors are normally 16-byte aligned, however the O32 ABI enforces a
maximum alignment of 8-bytes since the base of the stack is 8-byte aligned.
Previously, this was enforced on the caller side, but not on the callee side.
This fixes the output of OpenCL's printf when given vectors.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: llvm-commits, pekka.jaaskelainen
Differential Revision: http://reviews.llvm.org/D5433
llvm-svn: 218248
This patch adds codegen for constructs:
#pragma omp critical [name]
<body>
It generates global variable ".gomp_critical_user_[name].var" of type int32[8]. Then it generates library call "kmpc_critical(loc, gtid, .gomp_critical_user_[name].var)", code for <body> statement and final call "kmpc_end_critical(loc, gtid, .gomp_critical_user_[name].var)".
Differential Revision: http://reviews.llvm.org/D5202
llvm-svn: 218239
This patch makes sure that the dllexport attribute is transferred to the alias when such alias is created. It only affects the Itanium ABI because for the MSVC ABI a workaround is in place to not generate aliases of dllexport ctors/dtors.
A new CodeGenModule function is provided, CodeGenModule::setAliasAttributes, to factor the code for transferring attributes to aliases.
llvm-svn: 218159
Clang can already handle
-------------------------------------------
struct S {
static const int x;
};
template<typename T> struct U {
static const int k;
};
template<typename T> const int U<T>::k = T::x;
const int S::x = 42;
extern const int *f();
const int *g() { return &U<S>::k; }
int main() {
return *f() + U<S>::k;
}
const int *f() { return &U<S>::k; }
-------------------------------------------
since r217264 which puts the .inint_array section in the same COMDAT
as the variable.
This patch allows the linker to more easily delete some dead code and data by
putting the guard variable and init function in the same COMDAT.
This is a fixed version of r218089.
llvm-svn: 218141
The field is defined as:
If the third field is present, non-null, and points to a global variable or function, the initializer function will only run if the associated data from the current module is not discarded.
And without COMDATs we can't implement that.
llvm-svn: 218097
Clang can already handle
-------------------------------------------
struct S {
static const int x;
};
template<typename T> struct U {
static const int k;
};
template<typename T> const int U<T>::k = T::x;
const int S::x = 42;
extern const int *f();
const int *g() { return &U<S>::k; }
int main() {
return *f() + U<S>::k;
}
const int *f() { return &U<S>::k; }
-------------------------------------------
since r217264 which puts the .inint_array section in the same COMDAT
as the variable.
This patch allows the linker to more easily delete some dead code and data by
putting the guard variable and init function in the same COMDAT.
llvm-svn: 218089
CodeGen would try to come up with an LLVM IR type for a pointer to
member type on the way to forming an LLVM IR type for a pointer to
pointer to member type.
However, if the pointer to member representation has not been locked in yet,
we would not be able to come up with a pointer to member IR type.
In these cases, make the pointer to member type an incomplete type.
This will make the pointer to pointer to member type a pointer to an
incomplete type. If the class eventually obtains an inheritance model,
we will make the pointer to member type represent the actual inheritance
model.
Differential Revision: http://reviews.llvm.org/D5373
llvm-svn: 218084
There are situations when clang knows that the C1 and C2 constructors
or the D1 and D2 destructors are identical. We already optimize some
of these cases, but cannot optimize it when the GlobalValue is
weak_odr.
The problem with weak_odr is that an old TU seeing the same code will
have a C1 and a C2 comdat with the corresponding symbols. We cannot
suddenly start putting the C2 symbol in the C1 comdat as we cannot
guarantee that the linker will not pick a .o with only C1 in it.
The solution implemented by GCC is to expand the ABI to have a comdat
whose name uses a C5/D5 suffix and always has both symbols. That is
what this patch implements.
llvm-svn: 217874
This adds a flag called -fseh-exceptions that uses the native Windows
.pdata and .xdata unwind mechanism to throw exceptions. The other EH
possibilities are DWARF and SJLJ exceptions.
Patch by Martell Malone!
Reviewed By: asl, rnk
Differential Revision: http://reviews.llvm.org/D3419
llvm-svn: 217790
Deleted virtual functions get _purecall inserted into the vftable.
Earlier CTPs would simply stick nullptr in there.
N.B. MSVC can't handle deleted virtual functions which require return
adjusting thunks, they give an error that a deleted function couldn't be
called inside of a compiler generated function. We get this correct by
making the thunk have a __purecall entry as well.
llvm-svn: 217654
We assumed that the incoming this argument would be the last argument.
However, this is not true under the MS ABI.
This fixes PR20897.
llvm-svn: 217642
This prevents initializers for comdat-folded globals from running multiple times.
Differential Revision: http://reviews.llvm.org/D5281
llvm-svn: 217534
Pointer-sized alignment is sufficient as we only ever read single values
from the table. Otherwise we'd bump the alignment to 16 bytes in the
backend if the vtable is larger than 16 bytes. This is great for
structures that are accessed with vector instructions or copied around, but
that's simply not the case for vtables.
Shrinks the data segment of a Release x86_64 clang by 0.3%. The wins are
larger for i386 and code bases that use vtables more often than we do.
This matches the behavior of GCC 5.
llvm-svn: 217495
Summary:
This patch implements a new UBSan check, which verifies
that function arguments declared to be nonnull with __attribute__((nonnull))
are actually nonnull in runtime.
To implement this check, we pass FunctionDecl to CodeGenFunction::EmitCallArgs
(where applicable) and if function declaration has nonnull attribute specified
for a certain formal parameter, we compare the corresponding RValue to null as
soon as it's calculated.
Test Plan: regression test suite
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits, rnk
Differential Revision: http://reviews.llvm.org/D5082
llvm-svn: 217389
There were code paths that are duplicated for constructors and destructors just
because we have both CXXCtorType and CXXDtorsTypes.
This patch introduces an unified enum and reduces code deplication a bit.
llvm-svn: 217383
This makes use of the recently-added @llvm.assume intrinsic to implement a
__builtin_assume(bool) intrinsic (to provide additional information to the
optimizer). This hooks up __assume in MS-compatibility mode to mirror
__builtin_assume (the semantics have been intentionally kept compatible), and
implements GCC's __builtin_assume_aligned as assume((p - o) & mask == 0). LLVM
now contains special logic to deal with assumptions of this form.
llvm-svn: 217349
This patch adds support for the 32bit numeric max/min and directed round-to-integral NEON intrinsics that were added as part of v8, along with unit tests.
Patch by Graham Hunter!
llvm-svn: 217242
For naked functions with parameters, Clang would still emit stores in the prologue
that would clobber the stack, because LLVM doesn't set up a stack frame. (This
shows up in -O0 compiles, because the stores are optimized away otherwise.)
For example:
__attribute__((naked)) int f(int x) {
asm("movl $42, %eax");
asm("retl");
}
Would result in:
_Z1fi:
movl 12(%esp), %eax
movl %eax, (%esp) <--- Oops.
movl $42, %eax
retl
Differential Revision: http://reviews.llvm.org/D5183
llvm-svn: 217198
If control falls off the end of a function after an __asm block, MSVC
assumes that the inline assembly filled the EAX and possibly EDX
registers with an appropriate return value. This functionality is used
in inline functions returning 64-bit integers in system headers, so we
need some amount of compatibility.
This is implemented in Clang by adding extra output constraints to every
inline asm block, and storing the resulting output registers into the
return value slot. If we see an asm block somewhere in the function
body, we emit a normal epilogue instead of marking the end of the
function with a return type unreachable.
Normal returns in functions not using this functionality will overwrite
the return value slot, and in most cases LLVM should be able to
eliminate the dead stores.
Fixes PR17201.
Reviewed By: majnemer
Differential Revision: http://reviews.llvm.org/D5177
llvm-svn: 217187
Summary:
This allows us to easily find them in the backend after the aggregates have
been lowered to other types. This is important on big-endian targets using
the N32/N64 ABI's since these ABI's must shift small structures into the
upper bits of the register.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5005
llvm-svn: 217160
Summary:
They are returned indirectly which causes the other arguments to move to
the next argument slot.
With this, utils/ABITest does not discover any failing cases in the first
500 attempts on big/little endian for O32. Previously some of these failed.
Also tested N32/N64 little endian (big endian has other known issues) with
no issues.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: atanasyan, cfe-commits
Differential Revision: http://reviews.llvm.org/D4811
llvm-svn: 217147
Using the intrinsic allows the SelectionDAGBuilder to turn this call
into the FABS Node and also the intrinsic is something the vectorizer knows
how to vectorize.
This patch also sets the readnone attribute on this call, which should
enable additional optmizations.
llvm-svn: 217042
This avoids encoding information about the function prototype into the
thunk at the cost of some function prototype bitcast gymnastics.
Fixes PR20653.
llvm-svn: 216782
In theory, it'd be nice if we could move to a case where all buried
pointers were buried via unique_ptr to demonstrate that the program had
finished with the value (that we could really have cleanly deallocated
it) but instead chose to bury it.
I think the main reason that's not possible right now is the various
IntrusiveRefCntPtrs in the Frontend, sharing ownership for a variety of
compiler bits (see the various similar
"CompilerInstance::releaseAndLeak*" functions). I have yet to figure out
their correct ownership semantics - but perhaps, even if the
intrusiveness can be removed, the shared ownership may yet remain and
that would lead to a non-unique burying as is there today. (though we
could model that a little better - by passing in a shared_ptr, etc -
rather than needing the two step that's currently used in those other
releaseAndLeak* functions)
This might be a bit more robust if BuryPointer took the boolean:
BuryPointer(bool, unique_ptr<T>)
and the choice to bury was made internally - that way, even when
DisableFree was not set, the unique_ptr would still be null in the
caller and there'd be no chance of accidentally having a different
codepath where the value is used after burial in !DisableFree, but it
becomes null only in DisableFree, etc...
llvm-svn: 216742
Previously, EnterStructPointerForCoercedAccess used Alloc size when determining how to convert. This was problematic, because there were situations were the alloc size was larger than the store size. For example, if the first element of a structure were i24 and the destination type were i32, the old code would generate a GEP and a load i24. The code should compare store sizes to ensure the whole object is loaded. I have attached a test case.
This patch modifies the output of arm64-be-bitfield.c test case, but the new IR seems to be equivalent, and after -O3, the compiler generates identical ARM assembly. (asr x0, x0, #54)
Patch by Thomas Jablin!
llvm-svn: 216722
Summary:
We did a great job getting this wrong:
- We messed up which LLVM IR types to use for arguments and return values.
The optimized libcalls use integer types for values.
Clang attempted to use the IR type which corresponds to the value
passed in instead of using an appropriately sized integer type. This
would result in violations of the ABI for, as an example, floating
point types.
- We didn't bother recording the result of the atomic libcall in the
destination memory.
Instead, call the functions with arguments matching the type of the
libcall prototype's parameters.
This fixes PR20780.
Differential Revision: http://reviews.llvm.org/D5098
llvm-svn: 216714
Summary:
The current implementation of asan cookie is incorrect:
we add nosanitize metadata to the cookie load, but the metadata may be lost
and we will instrument the load from poisoned memory.
This change replaces the load with a call to __asan_load_cxx_array_cookie (r216692)
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5111
llvm-svn: 216702
For the following code:
__declspec(dllimport) int f(int x);
int user(int x) {
return f(x);
}
int f(int x) { return 1; }
Clang will drop the dllimport attribute in the AST, but CodeGen would have
already put it on the LLVM::Function, and that would never get updated.
(The same thing happens for global variables.)
This makes Clang check dropped DLL attribute case each time the LLVM object
is referenced.
This isn't perfect, because we will still get it wrong if the function is
never referenced by codegen after the attribute is dropped, but this handles
the common cases and makes us not fail in the verifier.
llvm-svn: 216699
linkage related to generation of OBJC_SELECTOR_REFERENCES symbol
needed in generation of call to 'super' in a class method.
// rdar://18150301
llvm-svn: 216676
ACLE 2.0 allows __fp16 to be used as a function argument or return
type. This enables this for AArch64.
This also fixes an existing bug that causes clang to not allow
homogeneous floating-point aggregates with a base type of __fp16. This
is valid for AAPCS64, but not for AAPCS-VFP.
llvm-svn: 216558
This tidies up some ARM-specific code added by r208417 to move it out
of the target-independent parts of clang into TargetInfo.cpp. This
also has the advantage that we can now flatten struct arguments to
variadic AAPCS functions.
llvm-svn: 216535
This time though, preserve the extension for bool types since that's compatible
with what MSVC expects.
See http://reviews.llvm.org/D4380
llvm-svn: 216507
Summary: When the main file is created from a membuffer, there is no file entry that can be retrieved. This uses "__GLOBAL_I_a" in that case which is what was always used before r208128.
Reviewers: majnemer, thakis
Reviewed By: thakis
Subscribers: yaron.keren, rsmith, cfe-commits
Differential Revision: http://reviews.llvm.org/D5043
llvm-svn: 216495
Summary:
MSVC doesn't extend integer types smaller than 64bit, so to preserve
binary compatibility, clang shouldn't either.
For example, the following C code built with MSVC:
unsigned test(unsigned v);
unsigned foobar(unsigned short);
int main() { return test(0xffffffff) + foobar(28); }
Produces the following:
0000000000000004: B9 FF FF FF FF mov ecx,0FFFFFFFFh
0000000000000009: E8 00 00 00 00 call test
000000000000000E: 89 44 24 20 mov dword ptr [rsp+20h],eax
0000000000000012: 66 B9 1C 00 mov cx,1Ch
0000000000000016: E8 00 00 00 00 call foobar
And as you can see, when setting up the call to foobar, only cx is overwritten.
If foobar is compiled with clang, then the zero extension added by clang means
the rest of the register, which contains garbage, could be used.
For example if foobar is:
unsigned foobar(unsigned short v) {
return v;
}
Compiled with clang -fomit-frame-pointer -O3 gives the following assembly:
foobar:
0000000000000000: 89 C8 mov eax,ecx
0000000000000002: C3 ret
And that function would return garbage because the 16 most significant bits of
ecx still contain garbage from the first call.
With this change, the code for that function is now:
foobar:
0000000000000000: 0F B7 C1 movzx eax,cx
0000000000000003: C3 ret
Reviewers: chapuni, rnk
Reviewed By: rnk
Subscribers: majnemer, cfe-commits
Differential Revision: http://reviews.llvm.org/D4380
llvm-svn: 216491
Summary:
PR19838
When operator new[] is called and an array cookie is created
we want asan to detect buffer overflow bugs that touch the cookie.
For that we need to
a) poison the shadow for the array cookie (call __asan_poison_cxx_array_cookie).
b) ignore the legal accesses to the cookie generated by clang (add 'nosanitize' metadata)
Reviewers: timurrrr, samsonov, rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4774
llvm-svn: 216434
into EmitCXXMemberOrOperatorCall methods. In the end we want
to make declaration visible in EmitCallArgs() method, that
would allow us to alter CodeGen depending on function/parameter
attributes.
No functionality change.
llvm-svn: 216404
PowerPC uses the special PPC_FP128 type for long double on Linux, which is
composed of two 64-bit doubles. The higher-order double (which contains the
overall sign) comes first, and so the __builtin_signbitl implementation
requires special handling to extract the sign bit.
Fixes PR20691.
llvm-svn: 216341
Similar to r215768 (which fixed the same case for while loops). To quote
r215768's commit message:
"A little test case simplification - this could be simplified further,
though there are certainly interesting connections to the if/else
construct so I'm hesitant to remove that entirely though it does appear
somewhat unrelated.
(similar fix to r215766, related to PR19864)"
llvm-svn: 216297
for loops introduce two scopes - one for the outer loop variable and its
initialization, and another for the body of the loop, including any
variable declared inside the loop condition.
llvm-svn: 216288
Summary:
This refactoring introduces ClangToLLVMArgMapping class, which
encapsulates the information about the order in which function arguments listed
in CGFunctionInfo should be passed to actual LLVM IR function, such as:
1) positions of sret, if there is any
2) position of inalloca argument, if there is any
3) position of helper padding argument for each call argument
4) positions of regular argument (there can be many if it's expanded).
Simplify several related methods (ConstructAttributeList, EmitFunctionProlog
and EmitCall): now they don't have to maintain iterators over the list
of LLVM IR function arguments, dealing with all the sret/inalloca/this complexities,
and just use expected positions of LLVM IR arguments stored in ClangToLLVMArgMapping.
This may increase the running time of EmitFunctionProlog, as we have to traverse
expandable arguments twice, but in further refactoring we will be able
to speed up EmitCall by passing already calculated CallArgsToIRArgsMapping to
ConstructAttributeList, thus avoiding traversing expandable argument there.
No functionality change.
Test Plan: regression test suite
Reviewers: majnemer, rnk
Reviewed By: rnk
Subscribers: cfe-commits, rjmccall, timurrrr
Differential Revision: http://reviews.llvm.org/D4938
llvm-svn: 216251
Summary:
This is a first small step towards passing generic "Expr" instead of
ArgBeg/ArgEnd pair into EmitCallArgs() family of methods. Having "Expr" will
allow us to get the corresponding FunctionDecl and its ParmVarDecls,
thus allowing us to alter CodeGen depending on the function/parameter
attributes.
No functionality change.
Test Plan: regression test suite
Reviewers: rnk
Reviewed By: rnk
Subscribers: aemerson, cfe-commits
Differential Revision: http://reviews.llvm.org/D4915
llvm-svn: 216214
The profile data format was recently updated and the new indexing api
requires the code coverage tool to know the function's hash as well
as the function's name to get the execution counts for a function.
Differential Revision: http://reviews.llvm.org/D4995
llvm-svn: 216208