This reverts also r275029, "Update Clang tests after adding inference for the returned argument attribute"
It broke LTO build. Seems miscompilation.
llvm-svn: 275756
-fxray-instrument: enables XRay annotation of IR
-fxray-instruction-threshold: configures the threshold for function size (looking at IR instructions), and allow LLVM to decide whether to add the nop sleds later on in the process.
Also implements the related xray_always_instrument and xray_never_instrument function attributes.
Patch by Dean Michael Berris.
llvm-svn: 275330
Summary:
Nested if statements can generate empty BBs whose terminator branches
unconditionally to its successor. These branches are not eliminated
to help generate better line number information in some cases, but there
is no reason to keep the empty blocks that result from nested ifs.
Reviewers: mehdi_amini, dblaikie, echristo
Subscribers: mehdi_amini, cfe-commits
Differential review: http://reviews.llvm.org/D11360
llvm-svn: 275115
Place the structure data into `cfstring`. This both isolates the structures to
permit coalescing in the future (by the linker) as well as ensures that it
doesnt get marked as read-only data. The structures themselves are not
read-only, only the string contents.
llvm-svn: 274956
The builtin was renamed in r274770. But __syncthreads is part of our
user-facing API, so we need to keep the name as-is.
Patch by Justin Bogner.
llvm-svn: 274780
Revision r178818 added tests for TBAA but was missing negative tests to ensure
that TBAA markers are not emitted when TBAA is off.
Differential Revision: http://reviews.llvm.org/D21295
llvm-svn: 274610
add abs intrinsics that use native LLVM-IR.
change _mm512_mask[z]_and_epi{32|64} to use select intrinsic
Differential Revision: http://reviews.llvm.org/D21973
llvm-svn: 274542
Summary:
The TargetInfo for 'renderscript32' and 'renderscript64' ArchTypes are
subclasses of ARMleTargetInfo and AArch64leTargetInfo respectively.
RenderScript32TargetInfo modifies the ARM ABI to set LongWidth and
LongAlign to be 64-bits. Other than this modification, the underlying
TargetInfo base classes is initialized as if they have "armv7" and
"aarch64" architecture type respectively.
Reviewers: rsmith, echristo
Subscribers: aemerson, tberghammer, cfe-commits, danalbert, mehdi_amini, srhines
Differential Revision: http://reviews.llvm.org/D21334
llvm-svn: 274409
With all MaterializeTemporaryExprs coming with a ExprWithCleanups, it's
easy to add correct lifetime.end marks into the right RunCleanupsScope.
Differential Revision: http://reviews.llvm.org/D20499
llvm-svn: 274385
This is important for building libclc. Since r273039 tests are failing
due to now emitting calls to these functions instead of emitting the
DAG node. The libm function names are implemented for OpenCL, and should
call the locally defined versions, so -fno-builtin is used. The IR
Some functions use the __builtins and expect the intrinsics to be
emitted. Without this we end up with nobuiltin calls to intrinsics
or to unsupported library calls.
llvm-svn: 274370
Emit the underlying storage offset in addition to the starting bit
position of the field.
This fixes PR28162.
Differential Revision: http://reviews.llvm.org/D21783
llvm-svn: 274201
The final change is required to extend the back-end's AtomicExpandPass that was implemented for Sparc (64 bit) and later extended for Sparc (32 bit).
llvm-svn: 274012
smaller than register as argument in variadic functions on
big endian architectures.
Differential Revision: http://reviews.llvm.org/D21611
llvm-svn: 273665
Summary: InstCombine needs to be performed after simplifycfg and sroa, otherwise it may make bad optimization decisions.
Reviewers: davidxl, wmi, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21568
llvm-svn: 273606
We would incorrectly emit the directive sections due to the missing overridden
methods. We now emit the expected "/DEFAULTLIB" rather than "-l" options for
requested linkage
llvm-svn: 273558
Add support for /Ob1 (and equivalent -finline-hint-functions), which enable
inlining only for functions marked inline, either explicitly (via inline
keyword, for example), or implicitly (function definition in class body,
for example).
This works by enabling inlining pass, and adding noinline attribute to
every function not marked inline.
Patch by Rudy Pons <rudy.pons@ilod.org>!
Differential Revision: http://reviews.llvm.org/D20647
llvm-svn: 273440
It currently only takes 2048 gotos to overflow the FixupDepth bitfield,
causing silent miscompilation. Apparently some parser generators run into
this (see PR).
I don't know that that data structure is terribly size sensitive anyway,
and since there's no room to widen the bitfield, let's just use a separate
word in EHCatchScope for it.
Differential Revision: http://reviews.llvm.org/D21566
llvm-svn: 273434
This new test tests that functions are capable of being imported, rather than
that the import pass is run. This new test is compatible with the approach
being developed in D20268 which runs the importer on its own rather than in
a pass.
Differential Revision: http://reviews.llvm.org/D21542
llvm-svn: 273347
Summary:
If the RenderScript LangOpt is set, either via '-x renderscript' or the '.rs'
file extension, set the DWARF language tag to be that of RenderScript.
Reviewers: rsmith
Subscribers: cfe-commits, srhines
Differential Revision: http://reviews.llvm.org/D21451
llvm-svn: 273321
This is a fix for PR28112:
https://llvm.org/bugs/show_bug.cgi?id=28112
The FP comparison intrinsics that take an immediate parameter (rather than specifying
a comparison predicate in the function name) were added with AVX; these are macros in
avxintrin.h. This patch makes clang behavior match gcc (error if a program tries to use
these without -mavx) and matches the Intel documentation, eg:
VCMPPS: m128 _mm_cmp_ps(m128 a, __m128 b, const int imm)
'V' means this is intended to only work with the AVX form of the instruction.
Differential Revision: http://reviews.llvm.org/D21306
llvm-svn: 273311
Summary: We need to call PruneEH pass before AutoFDO pass so that some EH-related calls can get inlined in Sample Profile pass.
Reviewers: davidxl, dnovillo
Subscribers: junbuml, llvm-commits
Differential Revision: http://reviews.llvm.org/D21197
llvm-svn: 273298
Given something like:
void *v = (void *)100;
We need to synthesize a ptrtoint operation from 100. During constant
emission, we choose i64 as the type for our constant because it
guaranteed not to drop any bits from our CharUnits representation of the
value. However, this is suboptimal for 32-bit targets: LLVM passes like
GlobalOpt will get confused by these sorts of casts resulting in
pessimization.
Instead, make sure the ptrtoint operand has a pointer-sized integer
type.
llvm-svn: 273020
Reapplying patch in r272777 which was reverted
because the llvm patch which added support
for generating the mcrr/mcrr2 instructions
from the intrinsic was causing an assertion
failure. This has now been fixed in llvm.
llvm-svn: 272983
This patch fixes a bug where we'd segfault (in some cases) if we saw a
variadic function with one or more pass_object_size arguments.
Differential Revision: http://reviews.llvm.org/D17462
llvm-svn: 272971
As noted in the code comment, a potential follow-on would be to remove
the builtins themselves. Other than ord/unord, this already works as
expected. Eg:
typedef float v4sf __attribute__((__vector_size__(16)));
v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }
Differential Revision: http://reviews.llvm.org/D21268
llvm-svn: 272840
Patch adds intrinsics for mrrc/mrrc2. The
intrinsics for mrrc/mrrc2 return a single
uint64_t to represent two 32 bit values.
The mcrr/mcrr2 intrinsic was changed to
accept a single uint64_t instead of two
32 bit values as the input for consistency.
Differential Revision: http://reviews.llvm.org/D21179
llvm-svn: 272777
There is no need to use a target-specific intrinsic to implement
_bit_scan_forward or _bit_scan_reverse, reimplementing them using
generic intrinsics makes it more likely that the middle end will
understand what's going on.
llvm-svn: 272564
We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp
The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores.
The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load
Differential Revision: http://reviews.llvm.org/D21272
llvm-svn: 272540
Summary:
Create a new Frontend LangOpt to specify the renderscript language. It
is enabled by the "-x renderscript" option from the driver.
Add a "kernel" function attribute only for RenderScript (an "ignored
attribute" warning is generated otherwise).
Make the NativeHalfType and NativeHalfArgsAndReturns LangOpts be implied
by the RenderScript LangOpt.
Reviewers: rsmith
Subscribers: cfe-commits, srhines
Differential Revision: http://reviews.llvm.org/D21198
llvm-svn: 272342