We still don't use the PGO to set branch weights for these loops, but at
least this keeps the compiler from crashing. <rdar://problem/16137778>
llvm-svn: 202002
CGRecordLayoutBuilder was aging, complex, multi-pass, and shows signs of
existing before ASTRecordLayoutBuilder. It redundantly performed many
layout operations that are now performed by ASTRecordLayoutBuilder and
asserted that the results were the same. With the addition of support
for the MS-ABI, such as placement of vbptrs, vtordisps, different
bitfield layout and a variety of other features, CGRecordLayoutBuilder
was growing unwieldy in its redundancy.
This patch re-architects CGRecordLayoutBuilder to not perform any
redundant layout but rather, as directly as possible, lower an
ASTRecordLayout to an llvm::type. The new architecture is significantly
smaller and simpler than the CGRecordLayoutBuilder and contains fewer
ABI-specific code paths. It's also one pass.
The architecture of the new system is described in the comments. For the
most part, the new system simply takes all of the fields and bases from
an ASTRecordLayout, sorts them, inserts padding and dumps a record.
Bitfields, unions and primary virtual bases make this process a bit more
complicated. See the inline comments.
In addition, this patch updates a few lit tests due to the fact that the
new system computes more accurate llvm types than CGRecordLayoutBuilder.
Each change is commented individually in the review.
Differential Revision: http://llvm-reviews.chandlerc.com/D2795
llvm-svn: 201907
Because GCC incorrectly defines _mm_prefetch to take anything that casts
to void*, people have started using that behavior. The previous patch
that made _mm_prefetch actually take a const char * broke compatibility
with existing code. This update to the patch leaves the macro that
defines _mm_prefetch with the (void*) cast when _MSC_VER is not defined.
llvm-svn: 201901
This extends the intrinsic lookup table format slightly, and adds
entries for use the shared ARM/AArch64 definitions. The benefit is
currently smaller than for the SISD intrinsics (there's more custom
code implementing this set), but a few lines are saved and there's
scope for future expansion.
llvm-svn: 201848
This extracts the table-driven intrinsic lookup phase into a separate
function, to be used by EmitCommonNeonBuiltinExpr soon.
It also simplifies the logic used in that lookup, since VectorCastArgN
and ScalarArgN were actually identical.
llvm-svn: 201847
This does;
- clang_tablegen() adds each tblgen'd target to global property CLANG_TABLEGEN_TARGETS as list.
- List of targets is added to LLVM_COMMON_DEPENDS.
- all clang libraries and targets depend on generated headers.
You might wonder this would be regression, but in fact, this is little loss.
- Almost all of clang libraries depend on tblgen'd files and clang-tblgen.
- clang-tblgen may cause short stall-out but doesn't cause unconditional rebuild.
- Each library's dependencies to tblgen'd files might vary along headers' structure.
It made hard to track and update *really optimal* dependencies.
Each dependency to intrinsics_gen and ClangSACheckers is left as DEPENDS.
llvm-svn: 201842
Virtual methods expect 'this' to point to the vfptr containing the
virtual method, and this extends to virtual member pointer thunks. The
relevant vfptr is always at offset zero on entry to the thunk, and no
this adjustment is needed.
Previously we would not include the vfptr adjustment in the member
pointer, and we'd look at the vfptr offset when loading from the vftable
in the thunk.
Fixes PR18917.
llvm-svn: 201835
The MS ABI requires that we determine the vbptr offset if have a
virtual inheritance model. Instead, raise an error pointing to the
diagnostic when this happens.
This fixes PR18583.
Differential Revision: http://llvm-reviews.chandlerc.com/D2842
llvm-svn: 201824
This breaks backwards compatibility with existing code. Previously, this
was defined as
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))
Which basically accepts any pointer. Changing this to char* simply
breaks a lot of existing code. I have tried changing char* to
"const void*", which seems to be the right thing as per Intel
specification this should work on basically any pointer. However,
apparently this breaks windows compatibility (because of a conflicting
declaration in windows.h).
So, we probably need to #ifdef this based on whether clang is compiling
for windows. According to Chandler, this might be done by introducing an
additional symbol to a fake type in BuiltinsX86.def and then condition
the type expansion on the platform.
llvm-svn: 201775
This patch adds several built-ins that are required for ms
compatibility. _mm_prefetch must be a built-in because it takes a
compile-time constant argument and our prior approach of using a #define
to the current built-in doesn't work in the presence of re-declaration
of _mm_prefetch. The others can be obtained by including the windows
system headers. If a user includes the windows system headers but not
intrin.h they still need to work and therefore must be built-in because
we don't get a chance to implement them in intrin.h in this case.
llvm-svn: 201734
This fixes one immediate bug where an expression with side-effects
could be emitted twice during a NEON call.
It also prepares the way for folding CodeGen for many of the SISD
intrinsics into a table, reducing code size and hopefully increasing
performance eventually ("binary search + few switch cases" should be
better than "lots of switch cases").
llvm-svn: 201667
These instructions (well, the f32 ones) are supported on 32-bit ARMv8, not just
AArch64. Now that the arm_neon.td refactoring is complete, adding them is
surprisingly simple.
rdar://problem/16035743
llvm-svn: 201661
Summary:
Generally the vector deleting dtor, which we model as a vtable thunk,
takes care of non-virtual adjustment and delegates to the other
destructor variants. The other non-complete destructor variants assume
that 'this' on entry points to the virtual base subobject that first
declared the virtual destructor.
We need to change the adjustment in both the prologue and the vdtor call
setup.
Reviewers: timurrrr
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2821
llvm-svn: 201612
Previously, we made one traversal of the AST prior to codegen to assign
counters to the ASTs and then propagated the count values during codegen. This
patch now adds a separate AST traversal prior to codegen for the
-fprofile-instr-use option to propagate the count values. The counts are then
saved in a map from which they can be retrieved during codegen.
This new approach has several advantages:
1. It gets rid of a lot of extra PGO-related code that had previously been
added to codegen.
2. It fixes a serious bug. My original implementation (which was mailed to the
list but never committed) used 3 counters for every loop. Justin improved it to
move 2 of those counters into the less-frequently executed breaks and continues,
but that turned out to produce wrong count values in some cases. The solution
requires visiting a loop body before the condition so that the count for the
condition properly includes the break and continue counts. Changing codegen to
visit a loop body first would be a fairly invasive change, but with a separate
AST traversal, it is easy to control the order of traversal. I've added a
testcase (provided by Justin) to make sure this works correctly.
3. It improves the instrumentation overhead, reducing the number of counters for
a loop from 3 to 1. We no longer need dedicated counters for breaks and
continues, since we can just use the propagated count values when visiting
breaks and continues.
To make this work, I needed to make a change to the way we count case
statements, going back to my original approach of not including the fall-through
in the counter values. This was necessary because there isn't always an AST node
that can be used to record the fall-through count. Now case statements are
handled the same as default statements, with the fall-through paths branching
over the counter increments. While I was at it, I also went back to using this
approach for do-loops -- omitting the fall-through count into the loop body
simplifies some of the calculations and make them behave the same as other
loops. Whenever we start using this instrumentation for coverage, we'll need
to add the fall-through counts into the counter values.
llvm-svn: 201528
When a function has a single counter, we will offset the pointer by 1 when
parsing the next function. If a function has multiple counters, we are
okay after skipping rest of the counters.
llvm-svn: 201456
properties by fixing shouldBindAsLValue to accept arrays
(like record types) because we always manipulate
them in memory. Patch suggested by John MaCall.
// rdar://15610943
llvm-svn: 201428
This changes ARM to use @llvm.fabs for floating-point vabs. Patterns
already existed in the backend, and it might help mid-end phases since
it's more likely to be understood than @llvm.arm.neon.vabs.
llvm-svn: 201313
The s64/u64 vcvt conversion operations are actually pretty much identical to
the s32/u32 ones in implementation, and can be shared with just one extra
variable.
llvm-svn: 201145
Xcore target ABI requires const data that is externally visible
to be handled differently if it has C-language linkage rather than
C++ language linkage.
llvm-svn: 201142
According to the AAPCS, we can split structs between GPRs and the stack,
except for when an argument has already been allocated on the stack. This
can occur when a large number of floating-point arguments fill up the VFP
registers, and are alllocated on the stack before the general-purpose argument
registers are full.
llvm-svn: 201137
This option has the following effects:
* It adds the sspstrong IR attribute to each function within the CU.
* It defines the macro __SSP_STRONG__ with the value of 2.
Differential Revision: http://llvm-reviews.chandlerc.com/D2717
llvm-svn: 201120
Now that both ARM backends use the same implementation for vshll operations,
the code can be shared. This is also a necessary LLVM/Clang interface update.
llvm-svn: 201094
Now the backend supports the natural LLVM IR, we can shamelessly steal the
AArch64 front-end code to implement the vshrn intrinsic on 32-bit ARM.
llvm-svn: 201086
unique them and permits the implementation of dynamic_cast (and
anything else which knows it's working with a complete class
type) to compare their addresses directly.
rdar://16005328
llvm-svn: 201020
An HFA is defined as a struct containing floating point values of the
same machine type. In the 32-bit ABI, double and long double have the
same machine type, so a struct with a mixture of these types must be an
HFA (assuming it meets the other criteria).
llvm-svn: 200971
We collect a maximal function count among all functions in the pgo data file.
For functions that are hot, we set its InlineHint attribute. For functions that
are cold, we set its Cold attribute.
We currently treat functions with >= 30% of the maximal function count as hot
and functions with <= 1% of the maximal function count are treated as cold.
These two numbers are from preliminary tuning on SPEC.
This commit should not affect non-PGO builds and should boost performance on
instrumentation based PGO.
llvm-svn: 200874
Member pointers are mangled as they would be represented at runtime.
They can be a single integer literal, single decl, or a tuple with some
more numbers tossed in. With Clang today, most of those numbers will be
zero because we reject pointers to members of virtual bases.
This change required moving VTableContextBase ownership from
CodeGenVTables to ASTContext, because mangling now depends on vtable
layout.
I also hoisted the inheritance model helpers up to be inline static
methods of MSInheritanceAttr. This makes the AST code that deals with
member pointers much more readable.
MSVC doesn't appear to have stable manglings of null member pointers:
- Null data memptrs in function templates have a mangling collision with
the first field of a non-polymorphic single inheritance class.
- The mangling of null data memptrs changes if you add casts.
- Large null function memptrs in class templates crash MSVC.
Clang uses the class template mangling for null data memptrs and the
function template mangling for null function memptrs to deal with this.
Reviewers: majnemer
Differential Revision: http://llvm-reviews.chandlerc.com/D2695
llvm-svn: 200857
Now that the back-end intrinsics are more regular, there's no need for the
special handling these got in the front-end, so they can be moved to
EmitCommonNeonBuiltinExpr.
llvm-svn: 200769
The LLVM backend now has invariant types on the various crypto-intrinsics,
because in all cases there's only really one interpretation.
llvm-svn: 200707
When a non-trivial parameter is present, clang now gathers up all the
parameters that lack inreg and puts them into a packed struct. MSVC
always aligns each parameter to 4 bytes and no more, so this is a pretty
simple struct to lay out.
On win64, non-trivial records are passed indirectly. Prior to this
change, clang was incorrectly using byval on win64.
I'm able to self-host a working clang with this change and additional
LLVM patches.
Reviewers: rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D2636
llvm-svn: 200597
This fixes PR15768, where the sret parameter and the 'this' parameter
are in the wrong order.
Instance methods compiled by MSVC never return records in registers,
they always return indirectly through an sret pointer. That sret
pointer always comes after the 'this' parameter, for both __cdecl and
__thiscall methods.
Unfortunately, the same is true for other calling conventions, so we'll
have to change the overall approach here relatively soon.
Reviewers: rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D2664
llvm-svn: 200587
This should be the last routine patch: AArch64 does still delegate to
EmitARMBuiltinExpr, but the remaining instances have complications of
one sort or another so some more cunning thought will be needed.
llvm-svn: 200528
Arguments and return values must always be marshalled as for the base
AAPCS when the callee is a variadic function.
Patch by Oliver Stannard!
llvm-svn: 200307
Replace the last incorrect uses and templatize the function to require a
compile-time constant string preventing further misuse.
The diagnostic formatter expects well-formed input and has undefined behaviour
with arbitrary input or crafted user strings in source files. Accepting user
input would also have caused unbounded generation of new diagnostic IDs which
can be problematic in long-running sessions or language bindings.
This completes the work to fix several incorrect callers that passed user
input or raw messages to the diagnostics engine where a constant format string
was expected.
llvm-svn: 200132
A return type is the declared or deduced part of the function type specified in
the declaration.
A result type is the (potentially adjusted) type of the value of an expression
that calls the function.
Rule of thumb:
* Declarations have return types and parameters.
* Expressions have result types and arguments.
llvm-svn: 200082
might have a smaller size as compared to the stand-alone type of the base class.
This is possible when the derived class is packed and hence might have smaller
alignment requirement than the base class.
Differential Revision: http://llvm-reviews.chandlerc.com/D2599
llvm-svn: 200031
Due to statement expressions supported as GCC extension, it is possible
to put 'break' or 'continue' into a loop/switch statement but outside
its body, for example:
for ( ; ({ if (first) { first = 0; continue; } 0; }); )
This code is rejected by GCC if compiled in C mode but is accepted in C++
code. GCC bug 44715 tracks this discrepancy. Clang used code generation
that differs from GCC in both modes: only statement of the third
expression of 'for' behaves as if it was inside loop body.
This change makes code generation more close to GCC, considering 'break'
or 'continue' statement in condition and increment expressions of a
loop as it was inside the loop body. It also adds error for the cases
when 'break'/'continue' appear outside loop due to this syntax. If
code generation differ from GCC, warning is issued.
Differential Revision: http://llvm-reviews.chandlerc.com/D2518
llvm-svn: 199897
PNaCl and Emscripten can both handle va_arg IR instructions with
struct type.
Also add a test to cover generating a va_arg IR instruction from
va_arg in C on le32 (as already handled by VisitVAArgExpr() in
CGExprScalar.cpp), which was not covered by a test before.
(This fixes https://code.google.com/p/nativeclient/issues/detail?id=2381)
Differential Revision: http://llvm-reviews.chandlerc.com/D2539
llvm-svn: 199830
of the current compilation unit.
As a side effect this enables many more LTO uniquing opportunities.
This reapplies r199757 with a better testcase.
llvm-svn: 199760
I misunderstood the discussion on this. The complexity here is
justified by the malloc overhead it saves.
This reverts commit r199302.
llvm-svn: 199700
Without them they can be merged with non unnamed_addr constants during LTO.
The resulting constant is not unnamed_addr and goes in a different section,
which causes ld64 to crash.
A testcase that would crash before:
* file1.mm:
void g(id notification) {
[notification valueForKey:@"name"];
}
* file2.cpp:
extern const char js_name_str[] = "name";
* file3.cpp
extern bool JS_GetProperty(const char *name);
extern const char js_name_str[];
bool js_ReportUncaughtException() { JS_GetProperty(js_name_str); }
run
clang file1.mm -o file1.o -c -w -emit-llvm
clang file2.cpp -o file2.o -c -w -emit-llvm
clang file3.cpp -o file3.o -c -w
ld -dylib -o XUL file1.o file2.o file3.o -undefined dynamic_lookup.
llvm-svn: 199688
Fix a perennial source of confusion in the clang type system: Declarations and
function prototypes have parameters to which arguments are supplied, so calling
these 'arguments' was a stretch even in C mode, let alone C++ where default
arguments, templates and overloading make the distinction important to get
right.
Readability win across the board, especially in the casting, ADL and
overloading implementations which make a lot more sense at a glance now.
Will keep an eye on the builders and update dependent projects shortly.
No functional change.
llvm-svn: 199686
This reverts commit r199475 (which reverted r199416) with fixes for the
breakages.
We wouldn't lock an inheritance model if we saw a pointer-to-member
formed as a result of the address-of operator.
llvm-svn: 199482
class and use it pervasively to restore debug locations.
Fixes an interaction between cleanup and EH that caused the location
to not be restored properly after emitting a landing pad.
rdar://problem/15208190
llvm-svn: 199444
The MSVC ABI is rather finicky about the exact representation of it's
pointer-to-member representation. The exact position of when and where
it will go with one representation versus another appears to be when it
desires the pointer-to-member to be complete.
To properly implement this in clang, do several things:
- Give up on tracking the polymorphic nature of the class. It isn't
useful to Sema and is only pertinent when choosing CodeGen-time
details like whether the field-offset can be 0 instead of -1.
- Insist on locking-in the inheritance model when we ask our
pointer-to-member type to be complete. From there, grab the
underlying CXXRecordDecl and try to make *that* complete. Once we've
done this, we can calculate it's inheritance model and apply it using
an attribute.
N.B. My first bullet point is a lie. We will eventually care about the
specifics of whether or not a CXXRecordDecl is or is not polymorphic
because MSVC compatible mangling of such things depends on it. However,
I believe we will handle this in a rather different way.
llvm-svn: 199416
flag from clang, and disable zero-base shadow support on all platforms
where it is not the default behavior.
- It is completely unused, as far as we know.
- It is ABI-incompatible with non-zero-base shadow, which means all
objects in a process must be built with the same setting. Failing to
do so results in a segmentation fault at runtime.
- It introduces a backward dependency of compiler-rt on user code,
which is uncommon and complicates testing.
This is the Clang part of a larger change.
llvm-svn: 199372
marked as AlwaysInline or ForceInline.
This moves us to what gcc does with -fno-inline. The attribute approach
was discussed to be better than switching to InlineAlways inliner in presence
of LTO.
llvm-svn: 199324
Way back in r129652 we tried to avoid emitting an empty block at -O0
for switch cases that did nothing but break. This led to a poor
debugging experience as reported in PR9796, so we disabled the
optimization for -O0 but left it in for higher optimization levels in
r154420.
Since the whole point of this was to improve -O0, it's silly to keep
the complexity at all.
llvm-svn: 199302
With the old linkage types removed, set the linkage to external for both
dllimport and dllexport to reflect what's currently supported.
llvm-svn: 199220
Fixes PR18435, where we generated a base ctor instead of a complete
ctor, and so failed to construct virtual bases when constructing the
complete object.
llvm-svn: 199160
adjustFallThroughCount isn't a good name, and the documentation was
even worse. This commit attempts to clarify what it's for and when to
use it.
llvm-svn: 199139
There are a number of places where we do PGO.setCurrentRegionCount(0)
directly after an unconditional branch. Give this operation a name so
that it's clearer why we're doing this.
llvm-svn: 199138
This call looks like it was an artifact of an earlier change, and
doesn't actually make sense. We begin a new region immediately anyway,
so it was mostly harmless.
llvm-svn: 199137
The MS abi lays out *all* non-virtual bases with leading vfptrs before
laying out non-virutal bases without vfptrs. This guarantees that the
primary base is laid out first. r198818 fixed RecordLayoutBuilder to
produce compatiable layouts. This patch fixes CGRecordLayoutBuilder to
be able to consume those layouts and produce meaningful output without
tripping any asserts about assumed incoming layout.
A test case is included that shows CGRecordLayoutBuilder in fact
produces output in the compatiable order.
llvm-svn: 198900
from the global address space (6.5.1 of the OpenCL 1.2 specification).
This makes clang construct the image arguments in the global address
space and generate the argument metadata with the correct address space
descriptor.
Patch by Pedro Ferreira!
llvm-svn: 198868
This patch refactors microsoft record layout to be more "natural". The
most dominant change is that vbptrs and vfptrs are injected after the
fact. This simplifies the implementation and the math for the offest
for the first base/field after the vbptr.
llvm-svn: 198818
Debug info: Implement a cleaner version of r198461. For symmetry with
C and C++ don't emit an extra lexical scope for the compound statement
that is the body of an Objective-C method.
llvm-svn: 198715
C and C++ don't emit an extra lexical scope for the compound statement
that is the body of an Objective-C method.
rdar://problem/15010825
llvm-svn: 198699
encodes the canonical rules for LLVM's style. I noticed this had drifted
quite a bit when cleaning up LLVM, so wanted to clean up Clang as well.
llvm-svn: 198686
It controls everything that -flimit-debug-info used to, plus the
vtable type optimization. The old -fno-limit-debug-info option is now an
alias to -fstandalone-debug and vice versa.
Standalone is the default on Darwin until dtrace is updated to work with
non-standalone debug info (rdar://problem/15758808).
Note: I kept the LimitedDebugInfo name in CodeGenOptions::DebugInfoKind
because NoStandaloneDebugInfo sounded even more confusing.
llvm-svn: 198655
Summary:
This makes us more compatible with MSVC 2012+ and fixes PR17748 where we
would give two tables the same name.
Rather than doing a fresh depth-first traversal of the inheritance graph
for every record's vbtables, now we memoize vbtable paths for each
record. By doing memoization, we end up considering virtual bases of
subobjects that come later in the depth-first traversal. Where
previously we would have ignored a virtual base that we'd already seen,
we now consider it for name mangling purposes without emitting a
duplicate vbtable for it.
Reviewers: majnemer
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2509
llvm-svn: 198462
Summary:
No functionality change.
This code should live here long-term because we should be able to use it
to compute correct vftable names.
It turns out that the most natural way to implement the naming algorithm
is to use a caching layer similar to what we already have for virtual
table info in VTableContext. Subsequent changes will take advantage of
this to fix PR17748, where we have a vbtable name collision.
Reviewers: majnemer
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2499
llvm-svn: 198380
The backend string is only verified when available as it's possible to run
clang IRGen for targets that haven't been built or don't exist in LLVM.
llvm-svn: 198309
Remove UnaryTypeTraitExpr and switch all remaining type trait related handling
over to TypeTraitExpr.
The UTT/BTT/TT enum prefix and evaluation code is retained pending further
cleanup.
This is part of the ongoing work to unify type traits following the removal of
BinaryTypeTraitExpr in r197273.
llvm-svn: 198271
'create' functions conventionally return a pointer, not a reference.
Also use an OwningPtr to get replace the delete of a reference member.
No functional change.
llvm-svn: 198126
DiagIDs are a cached resource generally only constructed from compile-time
constant or stable format strings.
Escaping arbitrary messages and constructing DiagIDs from them didn't make
sense.
llvm-svn: 197856
Now CodeGenVTables has only one VTableContext object, which is either
Itanium or Microsoft.
Fixes a FIXME with no functionality change intended.
Ideally we could avoid the downcasts by pushing the things that
reference the Itanium vtable context into ItaniumCXXABI.cpp, but we're
not there yet.
llvm-svn: 197845
These members will still be lazily added to the relevant DWARF DIEs in
LLVM but when enumerating the members they will not appear. This allows
DWARF type units to be more consistent - the type unit will never
contain these special members (so all instances of the type should have
the same DIEs without some having some special members and others having
others) and the special members will be added to the skeletal
declaration that appears in the relevant compile_unit.
llvm-svn: 197844
This matches llc's behavior.
Before this patch clang would create a TargetInfo base on -triple but a llvm
CodeGen based on the triple in the module.
llvm-svn: 197837
Unlike Itanium's VTTs, the 'most derived' boolean or bitfield is the
last parameter for non-variadic constructors, rather than the second.
For variadic constructors, the 'most derived' parameter comes after the
'this' parameter. This affects constructor calls and constructor decls
in a variety of places.
Reviewers: timurrrr
Differential Revision: http://llvm-reviews.chandlerc.com/D2405
llvm-svn: 197518
clang still doesn't emit the right llvm code when initializing multi-D arrays it seems.
For e.g. the following code would still crash for me on Windows 7, 64 bit:
auto f4 = new int[100][200][300]{{{1,2,3}, {4, 5, 6}}, {{10, 20, 30}}};
It seems that the final new loop that iterates through each outermost array and memsets it to zero gets confused with its final ptr arithmetic.
This patch ensures that it converts the pointer to the allocated type (int [200][300]) before incrementing it (instead of using the base type: 'int').
Richard somewhat squeamishly approved the patch (as a quick fix to potentially make it into 3.4) - while exhorting for a more optimized fix in the future. http://llvm-reviews.chandlerc.com/D2398
Thanks Richard!
llvm-svn: 197294
There's nothing special about type traits accepting two arguments.
This commit eliminates BinaryTypeTraitExpr and switches all related handling
over to TypeTraitExpr.
Also fixes a CodeGen failure with variadic type traits appearing in a
non-constant expression.
The BTT/TT prefix and evaluation code is retained as-is for now but will soon
be further cleaned up.
This is part of the ongoing work to unify type traits.
llvm-svn: 197273
point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.
llvm-svn: 197069
We were mistakengly giving linkonce_odr linkage instead of internal
linkage to the deleting and complete destructor thunks for classes in
anonymous namespaces.
Fixes PR17273.
llvm-svn: 197060
With the introduction of explicit address space casts into LLVM, there's
a need to provide a new cast kind the front-end can create for C/OpenCL/CUDA
and code to produce address space casts from those kinds when appropriate.
Patch by Michele Scandale!
llvm-svn: 197036
list, each element of the initializer list may provide more than one of the
base elements of the array. Be sure to initialize the right type and bump the
array pointer by the right amount.
llvm-svn: 196995
Thread an optional GV down to EmitGlobalFunctionDefinition so that it can
avoid the lookup when we already know the corresponding llvm global value.
llvm-svn: 196789
With this patch we output the in the order
C2
C1
D2
D1
D0
Which means that a destructor or constructor that call another is output after
the callee. This is a bit easier to read IHMO and a tiny bit more efficient
as we don't put a decl in DeferredDeclsToEmit.
llvm-svn: 196784
Testing has revealed that large integral constants (i.e. > INT64_MAX)
are always mangled as-if they are negative, even in places where it
would not make sense for them to be negative (like non-type template
parameters of type unsigned long long).
To address this, we change the way we model number mangling: always
mangle as-if our number is an int64_t. This should result in correct
results when we have large unsigned numbers.
N.B. Bizarrely, things that are 32-bit displacements like vbptr offsets
are mangled as-if they are unsigned 32-bit numbers. This is a pretty
egregious waste of space, it would be a 4x savings if we could mangle it
like a signed 32-bit number. Instead, we explicitly cast these
displacements to uint32_t and let the mangler proceed.
llvm-svn: 196771
Before this patch GetOrCreateLLVMFunction would add a decl to
DeferredDeclsToEmit even when it was being called by the function trying to
emit that decl.
llvm-svn: 196753
This can happen when we're trying to emit a thunk with available_externally
linkage with optimization enabled but bail because it doesn't make sense
for vararg functions.
PR18098.
llvm-svn: 196658
Summary:
In general, this type node can be used to represent any type adjustment
that occurs implicitly without losing type sugar. The immediate use of
this is to adjust the calling conventions of member function pointer
types without breaking template instantiation.
Fixes PR17996.
Reviewers: rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D2332
llvm-svn: 196451
The CodeGenOptions are not used for ABI type selection, so we will just
create one with the default constructor (there is a FloatABI option in
CodeGenOptions that is passed on to LLVM, but not used in Clang for LLVM
IR type generation).
We can use the DiagnosticsEngine on the ASTContext rather than making a
client pass one in explicitly.
llvm-svn: 196450
Summary:
MSVC destroys arguments in the callee from left to right. Because C++
objects have to be destroyed in the reverse order of construction, Clang
has to construct arguments from right to left and destroy arguments from
left to right.
This patch fixes the ordering by reversing the order of evaluation of
all call arguments under the MS C++ ABI.
Fixes PR18035.
Reviewers: rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D2275
llvm-svn: 196402
I'd misunderstood getIndirect() to mean that the argument should be passed
as a pointer at the ABI level, with the ByVal argument choosing caller-copy
semantics over no-caller-copy (callee-copy-on-write) semantics. But
getIndirect(x) actually means that x is passed by pointer at the IR
level but (at least on all other targets I looked at) directly at the
ABI level. getIndirect(x, false) selects a pointer to a caller-made
copy, which is what SystemZ was aiming for.
This fixes a miscompilation of c-index-test. Structure arguments were being
passed by pointer, but no copy was being made, so a write in the callee
stomped over a caller's local variable.
llvm-svn: 196370
This is a duplicate implementation.
E.g. this patch defines:
float64_t vabd_f64(float64_t a, float64_t b)
But there is already a similar intrinsic "vabdd_f64" with the same types.
Also, this intrinsic will be conflicted to the vector type intrinsic as following(Which is implemented by me and will be committed to trunk):
float64x1_t vabd_f64(float64x1_t a, float64x1_t b).
Two functions shouldn't have a same name in arm_neon.h.
According to ARM ACLE document, such vabd_f64 with float64_t is not existing.
So I revert this commit.
llvm-svn: 196205
clang's attribute support is mature by now so replace Sean's warning and email
address with a standard LLVM copyright heading.
Also remove a personal email address and credit docstring from CGObjCGNU that
shouldn't have been there.
This is in line with the LLVM developer policy introduced in r45410.
Contributors can add themselves to CREDITS.txt while active module owners are
listed in CODE_OWNERS.TXT.
llvm-svn: 195587
available always-inline functions. This breaks libc++'s locale
implementation. Code generation for this case should be fixed, but this
is a stop gap fix for clang 3.4.
llvm-svn: 195501
Not long ago I made the CodeGen of for loops simplify the condition at
-O0 in the same way we do for if and conditionals. Unfortunately this
ties how loops and simple conditions work together too tightly, which
makes features such as instrumentation based PGO awkward.
Ultimately, we should find a more general way to simplify the logic in
a given condition, but for now we'll just avoid using EmitBranchOnBool
for loops, like we already do for while and do loops.
llvm-svn: 195438
In OpenCL a vector of 3 elements, acts like a vector of four elements.
So for a vector of size 3 the '.hi' and '.odd' accessors, would access
the elements {2, 3} and {1, 3} respectively.
However, in EmitStoreThroughExtVectorComponentLValue we are still operating on
a vector of size 3, so we should only access {2} and {1}. We do this by checking
the last element to be accessed, and ignore it if it is out-of-bounds.
EmitLoadOfExtVectorElementLValue doesn't have a similar problem, because it does
a direct shufflevector with undef, so an out-of-bounds access just gives an undef
value.
Patch by Anastasia Stulova!
llvm-svn: 195367
This makes Clang emit a linkonce_odr definition for 'val' in the code below,
to be compatible with MSVC-compiled code:
struct Foo {
static const int val = 1;
};
Differential Revision: http://llvm-reviews.chandlerc.com/D2233
llvm-svn: 195283
Summary:
RTTI is not yet implemented for the Microsoft C++ ABI and isn't expected
soon. We could easily add the mangling, but the error is what prevents
us from silently miscompiling code that expects RTTI.
Instead, add a new mangleTypeName entry point that simply forwards to
mangleName or mangleType to produce a string that isn't part of the ABI.
Itanium can continue to use RTTI names to avoid unecessary test
breakage.
This also seems like the right design. The fact that TBAA names happen
to be RTTI names is now an implementation detail of the mangler, rather
than part of TBAA.
Differential Revision: http://llvm-reviews.chandlerc.com/D2153
llvm-svn: 195168
This adds -freroll-loops (and -fno-reroll-loops in the usual way) to enable
loop rerolling as part of the optimization pass manager. This transformation
can enable vectorization, reduce code size (or both).
Briefly, loop rerolling can transform a loop like this:
for (int i = 0; i < 3200; i += 5) {
a[i] += alpha * b[i];
a[i + 1] += alpha * b[i + 1];
a[i + 2] += alpha * b[i + 2];
a[i + 3] += alpha * b[i + 3];
a[i + 4] += alpha * b[i + 4];
}
into this:
for (int i = 0; i < 3200; ++i) {
a[i] += alpha * b[i];
}
Loop rerolling is currently disabled by default at all optimization levels.
llvm-svn: 194967
Instead of storing the vtable offset directly in the function pointer and
doing a branch to check for virtualness at each call site, the MS ABI
generates a thunk for calling the function at a specific vtable offset,
and puts that in the function pointer.
This patch adds support for emitting such thunks. However, it doesn't support
pointers to virtual member functions that are variadic, have an incomplete
aggregate return type or parameter, or are overriding a function in a virtual
base class.
Differential Revision: http://llvm-reviews.chandlerc.com/D2104
llvm-svn: 194827
This patch disables aliasing (and rauw) of derived dtors to base dtors at -O0.
This optimization can have a negative impact on the debug quality.
This was a latent bug for some time with local classes, but got noticed when it
was generalized and broke gdb's destrprint.exp.
llvm-svn: 194618
Now that the relevant tests use -mconstructor-aliases and the missing
features have been implemented, we can just drop this.
No functionality change.
llvm-svn: 194595
This adds a new option -fprofile-sample-use=filename to Clang. It
tells the driver to schedule the SampleProfileLoader pass and passes
on the name of the profile file to use.
llvm-svn: 194567
The problem was that given
template<typename T>
struct foo {
~foo() {}
};
template class foo<int>;
We would produce a alias, creating a comdat with D0 and D1, since the symbols
have to be weak. Another TU is not required to have a explicit template
instantiation definition or an explict template instantiation declaration and
for
template<typename T>
struct foo {
~foo() {}
};
foo<int> a;
we would produce a comdat with only one symbol in it.
llvm-svn: 194520
The original decls are created when used. The replacements are created at the
end of the TU in reverse order.
This makes the original order far better for testing. This is particularly
important since the replacement logic could be used even when
-mconstructor-aliases is not used, but that would make many tests hard to read.
This is a fixed version of r194357 which handles replacing a destructor with
another which is an alias to a third one.
llvm-svn: 194452
The assert this patch deletes was valid only when aliasing D2 to D1, not when
looking at a base class. Since the assert was in the path where we had already
decided to not produce an alias, just drop it.
llvm-svn: 194411