With all MaterializeTemporaryExprs coming with a ExprWithCleanups, it's
easy to add correct lifetime.end marks into the right RunCleanupsScope.
Differential Revision: http://reviews.llvm.org/D20499
llvm-svn: 274385
This is important for building libclc. Since r273039 tests are failing
due to now emitting calls to these functions instead of emitting the
DAG node. The libm function names are implemented for OpenCL, and should
call the locally defined versions, so -fno-builtin is used. The IR
Some functions use the __builtins and expect the intrinsics to be
emitted. Without this we end up with nobuiltin calls to intrinsics
or to unsupported library calls.
llvm-svn: 274370
Emit the underlying storage offset in addition to the starting bit
position of the field.
This fixes PR28162.
Differential Revision: http://reviews.llvm.org/D21783
llvm-svn: 274201
The final change is required to extend the back-end's AtomicExpandPass that was implemented for Sparc (64 bit) and later extended for Sparc (32 bit).
llvm-svn: 274012
smaller than register as argument in variadic functions on
big endian architectures.
Differential Revision: http://reviews.llvm.org/D21611
llvm-svn: 273665
Summary: InstCombine needs to be performed after simplifycfg and sroa, otherwise it may make bad optimization decisions.
Reviewers: davidxl, wmi, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21568
llvm-svn: 273606
We would incorrectly emit the directive sections due to the missing overridden
methods. We now emit the expected "/DEFAULTLIB" rather than "-l" options for
requested linkage
llvm-svn: 273558
Add support for /Ob1 (and equivalent -finline-hint-functions), which enable
inlining only for functions marked inline, either explicitly (via inline
keyword, for example), or implicitly (function definition in class body,
for example).
This works by enabling inlining pass, and adding noinline attribute to
every function not marked inline.
Patch by Rudy Pons <rudy.pons@ilod.org>!
Differential Revision: http://reviews.llvm.org/D20647
llvm-svn: 273440
It currently only takes 2048 gotos to overflow the FixupDepth bitfield,
causing silent miscompilation. Apparently some parser generators run into
this (see PR).
I don't know that that data structure is terribly size sensitive anyway,
and since there's no room to widen the bitfield, let's just use a separate
word in EHCatchScope for it.
Differential Revision: http://reviews.llvm.org/D21566
llvm-svn: 273434
This new test tests that functions are capable of being imported, rather than
that the import pass is run. This new test is compatible with the approach
being developed in D20268 which runs the importer on its own rather than in
a pass.
Differential Revision: http://reviews.llvm.org/D21542
llvm-svn: 273347
Summary:
If the RenderScript LangOpt is set, either via '-x renderscript' or the '.rs'
file extension, set the DWARF language tag to be that of RenderScript.
Reviewers: rsmith
Subscribers: cfe-commits, srhines
Differential Revision: http://reviews.llvm.org/D21451
llvm-svn: 273321
This is a fix for PR28112:
https://llvm.org/bugs/show_bug.cgi?id=28112
The FP comparison intrinsics that take an immediate parameter (rather than specifying
a comparison predicate in the function name) were added with AVX; these are macros in
avxintrin.h. This patch makes clang behavior match gcc (error if a program tries to use
these without -mavx) and matches the Intel documentation, eg:
VCMPPS: m128 _mm_cmp_ps(m128 a, __m128 b, const int imm)
'V' means this is intended to only work with the AVX form of the instruction.
Differential Revision: http://reviews.llvm.org/D21306
llvm-svn: 273311
Summary: We need to call PruneEH pass before AutoFDO pass so that some EH-related calls can get inlined in Sample Profile pass.
Reviewers: davidxl, dnovillo
Subscribers: junbuml, llvm-commits
Differential Revision: http://reviews.llvm.org/D21197
llvm-svn: 273298
Given something like:
void *v = (void *)100;
We need to synthesize a ptrtoint operation from 100. During constant
emission, we choose i64 as the type for our constant because it
guaranteed not to drop any bits from our CharUnits representation of the
value. However, this is suboptimal for 32-bit targets: LLVM passes like
GlobalOpt will get confused by these sorts of casts resulting in
pessimization.
Instead, make sure the ptrtoint operand has a pointer-sized integer
type.
llvm-svn: 273020
Reapplying patch in r272777 which was reverted
because the llvm patch which added support
for generating the mcrr/mcrr2 instructions
from the intrinsic was causing an assertion
failure. This has now been fixed in llvm.
llvm-svn: 272983
This patch fixes a bug where we'd segfault (in some cases) if we saw a
variadic function with one or more pass_object_size arguments.
Differential Revision: http://reviews.llvm.org/D17462
llvm-svn: 272971
As noted in the code comment, a potential follow-on would be to remove
the builtins themselves. Other than ord/unord, this already works as
expected. Eg:
typedef float v4sf __attribute__((__vector_size__(16)));
v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }
Differential Revision: http://reviews.llvm.org/D21268
llvm-svn: 272840
Patch adds intrinsics for mrrc/mrrc2. The
intrinsics for mrrc/mrrc2 return a single
uint64_t to represent two 32 bit values.
The mcrr/mcrr2 intrinsic was changed to
accept a single uint64_t instead of two
32 bit values as the input for consistency.
Differential Revision: http://reviews.llvm.org/D21179
llvm-svn: 272777
There is no need to use a target-specific intrinsic to implement
_bit_scan_forward or _bit_scan_reverse, reimplementing them using
generic intrinsics makes it more likely that the middle end will
understand what's going on.
llvm-svn: 272564
We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp
The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores.
The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load
Differential Revision: http://reviews.llvm.org/D21272
llvm-svn: 272540
Summary:
Create a new Frontend LangOpt to specify the renderscript language. It
is enabled by the "-x renderscript" option from the driver.
Add a "kernel" function attribute only for RenderScript (an "ignored
attribute" warning is generated otherwise).
Make the NativeHalfType and NativeHalfArgsAndReturns LangOpts be implied
by the RenderScript LangOpt.
Reviewers: rsmith
Subscribers: cfe-commits, srhines
Differential Revision: http://reviews.llvm.org/D21198
llvm-svn: 272342
The 'cvtt' truncation (round to zero) conversions can be safely represented as generic __builtin_convertvector (fptosi) calls instead of x86 intrinsics. We already do this (implicitly) for the scalar equivalents.
Note: I looked at updating _mm_cvttpd_epi32 as well but this still requires a lot more backend work to correctly lower (both for debug and optimized builds).
Differential Revision: http://reviews.llvm.org/D20859
llvm-svn: 271436
The `isa' member was previously not given the correct DLL Storage. Ensure that
we give the `isa' constant `__CFConstantStringClassReference' the correct DLL
storage. Default to dllimport unless an explicit specification gives it a
dllexport storage.
llvm-svn: 271361
According to the gcc headers, intel intrinsics docs and msdn codegen the _mm_store1_pd (and its _mm_store_pd1 equivalent) should use an aligned pointer - the clang headers are the only implementation I can find that assume non-aligned stores (by storing with _mm_storeu_pd).
Additionally, according to the intel intrinsics docs and msdn codegen the _mm_store1_ps (_mm_store_ps1) requires a similarly aligned pointer.
This patch raises the alignment requirements to match the other implementations by calling _mm_store_ps/_mm_store_pd instead.
I've also added the missing _mm_store_pd1 intrinsic (which maps to _mm_store1_pd like _mm_store_ps1 does to _mm_store1_ps).
As a followup I'll update the llvm fast-isel tests to match this codegen.
Differential Revision: http://reviews.llvm.org/D20617
llvm-svn: 271218
Adjust the constant CFString emission to emit into more appropriate sections on
ELF and COFF targets. It would previously try to use MachO section names
irrespective of the file format.
llvm-svn: 271211
This extends the blocks support to support blocks with a dynamically linked
blocks runtime. The previous code generation would work only for static builds
of the blocks runtime. Mark the block "isa" pointers and functions as dllimport
if no explicit declaration marked with __declspec(dllexport) is found. This
additional check allows for the use of the functionality in the runtime library
if desired.
llvm-svn: 271138
The VPMOVSX and (V)PMOVZX sign/zero extension intrinsics can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics.
This patch removes the clang builtins and their use in the sse2/avx headers - a companion patch will remove/auto-upgrade the llvm intrinsics.
Note: We already did this for SSE41 PMOVSX sometime ago.
Differential Revision: http://reviews.llvm.org/D20684
llvm-svn: 271106
_InterlockedIncrement and _InterlockedDecrement have 'long' in their
prototypes. We assumed 'long' was the same size as an i32 which is
incorrect for other targets.
This fixes PR27892.
llvm-svn: 270953
If we have some function with dllimport attribute and then we have the function
definition in the same module but without dllimport attribute we should add
dllexport attribute to this function definition.
The same should be done for variables.
Example:
struct __declspec(dllimport) C3 {
~C3();
};
C3::~C3() {;} // we should export this definition.
Patch by Andrew V. Tischenko
Differential revision: http://reviews.llvm.org/D18953
llvm-svn: 270686
-finline-functions and /Ob2 are currently ignored by Clang. The only way to
enable inlining is to use the global O flags, which also enable other options,
or to emit LLVM bitcode using Clang, then running opt by hand with the inline
pass.
This patch allows to simply use the -finline-functions flag (same as GCC) or
/Ob2 in clang-cl mode to enable inlining without other optimizations.
This is the first patch of a serie to improve support for the /Ob flags.
Patch by Rudy Pons <rudy.pons@ilod.org>!
Differential Revision: http://reviews.llvm.org/D20576
llvm-svn: 270609
Underaligned atomic LValues require libcalls which MSVC doesn't have.
MSVC doesn't seem to consider such operations as requiring a barrier
anyway.
This fixes PR27843.
llvm-svn: 270576
Summary:
Following patch D19265 which enable software floating point support in the Sparc backend, this patch enables the option to be enabled in the front-end using the -msoft-float option.
The user should ensure a library (such as the builtins from Compiler-RT) that includes the software floating point routines is provided.
Reviewers: jyknight, lero_chris
Subscribers: jyknight, cfe-commits
Differential Revision: http://reviews.llvm.org/D20419
llvm-svn: 270538
Both the (V)CVTDQ2PD(Y) (i32 to f64) and (V)CVTPS2PD(Y) (f32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen.
This patch removes the clang builtins and their use in the sse2/avx headers - a future patch will deal with removing the llvm intrinsics, but that will require a bit more work.
Differential Revision: http://reviews.llvm.org/D20528
llvm-svn: 270499
Ensure _mm256_extract_epi8 and _mm256_extract_epi16 zero extend their i8/i16 result to i32. This matches _mm_extract_epi8 and _mm_extract_epi16.
Fix for PR27594
Differential Revision: http://reviews.llvm.org/D20468
llvm-svn: 270330
This is matching what trunk gcc is accepting. Also adds a missing ssse3
case. PR27779. The amount of duplication here is annoying, maybe it
should be factored into a separate .def file?
llvm-svn: 270224
Summary:
Previously it was implemented as inline asm in the CUDA headers.
This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions. This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.
Reviewers: tra, rsmith
Subscribers: jhen, cfe-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D19990
llvm-svn: 270150
- Fixed cdp intrinsic to only accept compile time
constant values previously you could pass in a
variable to the builtin which would result in
illegal llvm assembly output
Differential Revision: http://reviews.llvm.org/D20394
llvm-svn: 270058
Summary:
MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that another termination of the MWAITX
instruction occurs when the timer expires. The presence of the MONITORX and
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.
The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction
execution and enter an implementation-dependent optimized state until
occurrence of a class of events.
Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is
"0F 01 FB". These opcode information is used in adding tests for the
disassembler.
These instructions are enabled for AMD's bdver4 architecture.
Patch by Ganesh Gopalasubramanian!
Reviewers: echristo, craig.topper
Subscribers: RKSimon, joker.eph, llvm-commits, cfe-commits
Differential Revision: http://reviews.llvm.org/D19796
llvm-svn: 269907
Visual Studio's C++ standard library headers include intrin.h, so the intrinsic
headers get included a lot more often in Microsoft mode than elsewhere. The
AVX512 intrinsics are a lot of code (0.7 MB, causing 30% compile time overhead
for small programs including e.g. <string> and 6% compile time overhead for
larger projects like e.g. v8). Since multiversioning can't be relied on in
Microsoft mode (cl.exe doesn't support it), having faster compiles seems like
the much better tradeoff until we have a better intrinsic story going forward
(which we'll need for e.g. PR19898).
Actually using intrinsics on Windows already requires the right /arch:
settings, so this patch should have no big behavior change.
See also thread "The intrinsics headers (especially avx512) are too big. What
to do about it?" on cfe-dev.
http://reviews.llvm.org/D20291
llvm-svn: 269675
Summary:
Clang does not detect `aapcs-vfp` for the EABIHF environment. The reason is that only GNUEABIHF is considered while choosing calling convention, EABIHF is ignored.
This causes clang to use `aapcs` for EABIHF and add the `arm_aapcscc` specifier to functions in generated IR.
The modified `arm-cc.c` test checks that no calling convention specifier is added to functions for EABIHF, which means the default one is used (`CallingConv::ARM_AAPCS_VFP`).
Reviewers: rengolin, compnerd, t.p.northover
Subscribers: aemerson, rengolin, asl, cfe-commits
Differential Revision: http://reviews.llvm.org/D20219
llvm-svn: 269419
Summary:
This option allows the user to control how much of the file name is
emitted by UBSan. Tuning this option allows one to save space in the
resulting binary, which is helpful for restricted execution
environments.
With a positive N, UBSan skips the first N path components.
With a negative N, UBSan only keeps the last N path components.
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D19666
llvm-svn: 269309
Currently, if clang::isBetterOverloadCandidate encounters an enable_if
attribute on either candidate that it's inspecting, it will ignore all
lower priority attributes (e.g. pass_object_size). This is problematic
in cases like:
```
void foo(char *c) __attribute__((enable_if(1, "")));
void foo(char *c __attribute__((pass_object_size(0))))
__attribute__((enable_if(1, "")));
```
...Because we would ignore the pass_object_size attribute in the second
`foo`, and consider any call to `foo` to be ambiguous.
This patch makes overload resolution consult further tiebreakers (e.g.
pass_object_size) if two candidates have equally good enable_if
attributes.
llvm-svn: 269005
This option, like the corresponding gcc option, is SystemZ-specific and
enables storing frame backchain links, as specified in the ABI.
Differential Revision: http://reviews.llvm.org/D19891
llvm-svn: 268575
The -malign-double flag causes i64 and f64 types to have alignment 8
instead of 4. On x86-64, the behavior of -malign-double is enabled by default.
Rebases and cleans phosek's work here: http://reviews.llvm.org/D12860
Patch by Sean Klein
Reviewers: rnk
Subscribers: rnk, jfb, dschuff, phosek
Differential Revision: http://reviews.llvm.org/D19734
llvm-svn: 268473
We were seeing an internal failure when running this test. I can't
see a good reason for the difference, but the simple fix is to use
%{{.*}} instead of %1.
llvm-svn: 268416
Before this change, we would pass all non-HFA record arguments on
Windows with byval. Byval often blocks optimizations and results in bad
code generation. Windows now uses the existing workaround that other
x86_32 platforms use.
I also expanded the workaround to handle C++ records with constructors
on Windows. On non-Windows platforms, we have to keep generating the
same LLVM IR prototypes if we want our bitcode to be ABI compatible.
Otherwise we will encounter mismatch issues like PR21573.
Essentially fixes PR27522 in Clang instead of LLVM.
Reviewers: hans
Differential Revision: http://reviews.llvm.org/D19756
llvm-svn: 268261
Summary:
According to the ACLE spec, "__ARM_FEATURE_FMA is defined to 1 if
the hardware floating-point architecture supports fused floating-point
multiply-accumulate".
This changes clang's behaviour from emitting this macro for v7-A and v7-R
cores to only emitting it when the target has VFPv4 (and therefore support
for the floating point multiply-accumulate instruction).
Fixes PR27216
Reviewers: t.p.northover, rengolin
Subscribers: aemerson, rengolin, cfe-commits
Differential Revision: http://reviews.llvm.org/D18963
llvm-svn: 267869
LLVM stopped using MDString-based type references, and DIBuilder no
longer fills 'retainedTypes:' with every DICompositeType that has an
'identifier:' field. There are just minor changes to keep the same
behaviour in CFE.
Leaving 'retainedTypes:' unfilled has a dramatic impact on the output
order of the IR though. There are a huge number of testcase changes,
which were unfortunately not really scriptable.
llvm-svn: 267297
The intrinsic is now called llvm.thread.pointer, not
llvm.aarch64.thread.pointer. Also, the code handling it in CGBuiltin.cpp
is dead - it's already covered by GCCBuiltin. Remove it.
Differential Revision: http://reviews.llvm.org/D19099
llvm-svn: 266817
r259537 added vfma/vfms to armv7, but the builtin was only lowered
on the AArch64 side. Instead of supporting it on ARM, get rid of it.
The vfms builtin lowered to:
%nb = fsub float -0.0, %b
%r = @llvm.fma.f32(%a, %nb, %c)
Instead, define the operation in terms of vfma, and swap the
multiplicands. It now lowers to:
%na = fsub float -0.0, %a
%r = @llvm.fma.f32(%na, %b, %c)
This matches the instruction more closely, and lets current LLVM
generate the "natural" operand ordering:
fmls.2s v0, v1, v2
instead of the crooked (but equivalent):
fmls.2s v0, v2, v1
Except for theses changes, assembly is identical.
LLVM accepts both commutations, and the LLVM tests in:
test/CodeGen/AArch64/arm64-fmadd.ll
test/CodeGen/AArch64/fp-dp3.ll
test/CodeGen/AArch64/neon-fma.ll
test/CodeGen/ARM/fusedMAC.ll
already check either the new one only, or both.
Also verified against the test-suite unittests.
llvm-svn: 266807
Currently, for the ppc64--gnu and aarch64 ABIs, we recognize:
typedef __attribute__((__ext_vector_type__(3))) float v3f32;
typedef __attribute__((__ext_vector_type__(16))) char v16i8;
struct HFA {
v3f32 a;
v16i8 b;
};
as an HFA. Since the first type encountered is used as the base type,
we pass the HFA as:
[2 x <3 x float>]
Which leads to incorrect IR (relying on padding values) when the
second field is used.
Instead, explicitly widen the vector (after size rounding) in
isHomogeneousAggregate.
Differential Revision: http://reviews.llvm.org/D18998
llvm-svn: 266784
The builtins already exist in LLVM, but are not exposed to the C/C++
programmers. This patch adds all the information about the builtins
needed for clang, as well as a test for all available intrinsics.
llvm-svn: 266671
This is the clang part of http://reviews.llvm.org/D18846.
SafeStack instrumentation pass adds stack protector canaries if both
attributes are present on a function. StackProtector pass will step
back if the function has a safestack attribute.
llvm-svn: 266005
This patch add support for GCC attribute((ifunc("resolver"))) for
targets that use ELF as object file format. In general ifunc is a
special kind of function alias with type @gnu_indirect_function. LLVM
patch http://reviews.llvm.org/D15525
Differential Revision: http://reviews.llvm.org/D15524
llvm-svn: 265917
isinf (is infinite) and isfinite should be implemented with the same function
except we change the comparison operator.
See PR27145 for more details:
https://llvm.org/bugs/show_bug.cgi?id=27145
Ref: forked off of the discussion in D18513.
Differential Revision: http://reviews.llvm.org/D18648
llvm-svn: 265675
Turns out it was there mostly to prevent Clang asking people to report a bug.
This time we report something to Clang's real diagnostics handler so that it
exits with something approximating a real error and tidies up after itself.
llvm-svn: 265592
Add no-jump-tables flag to disable use of jump tables when lowering
switch statements
Reviewers: echristo, hans
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18407
llvm-svn: 265425
Whatever crash it was there to present appears to have been fixed in the
backend now, and it had the nasty side-effect of causing clang to exit(0) and
leave a .o containing goodness knows what even when an error hit.
llvm-svn: 265038
BuiltinsSystemZ.def is extended to include the required processor
features per intrinsic.
New test test/CodeGen/builtins-systemz-error2.c that checks for
expected errors when instrinsics are used with a subtarget that does
not support the required feature (e.g. vector support).
Reviewed by Ulrich Weigand.
llvm-svn: 264873
Modify these tests to ignore the source file name when looking for the
expected string. It was already catching the source file name once via
the ModuleID, and will catch it another time with an impending change to
LLVM to serialize out the module's SourceFileName.
llvm-svn: 264868
The addcarry and subborrow variants of the builtins do not require the adx
target attribute; only the addcarryx variants require them.
Differential Revision: http://reviews.llvm.org/D18533
llvm-svn: 264801
include altivec.h has come and gone.
Rationale: This causes modules, rewrite-includes, etc to be sad and
people should just include altivec.h in their source.
llvm-svn: 264235
Also includes a minor ``enable_if`` docs update.
Currently, our address-of overload machinery will only allow implicit
conversions of overloaded functions to void* in C. For example:
```
void f(int) __attribute__((overloadable));
void f(double) __attribute__((overloadable, enable_if(0, "")));
void *fp = f; // OK. This is C and the target is void*.
void (*fp2)(void) = f; // Error. This is C, but the target isn't void*.
```
This patch makes the assignment of `fp2` select the `f(int)` overload,
rather than emitting an error (N.B. you'll still get a warning about the
`fp2` assignment if you use -Wincompatible-pointer-types).
Differential Revision: http://reviews.llvm.org/D13704
llvm-svn: 264132
These functions cannot be implemented as atomicrmw or cmpxchg
instructions, so they are implemented as a call to the NVVM intrinsics
@llvm.nvvm.atomic.load.inc.32.p0i32 and
@llvm.nvvm.atomic.load.dec.32.p0i32.
Patch by Jason Henline.
Reviewers: jlebar
Differential Revision: http://reviews.llvm.org/D18322
llvm-svn: 264009
Summary:
r246764 handled __fp16 arguments and returns for AAPCS, but skipped this
handling for OpenCL. Simlar to OpenCL, RenderScript also handles __fp16
type natively.
This patch adds the -fnative-half-arguments-and-returns command line
flag to allow such languages to skip this coercion of __fp16.
Reviewers: srhines, olista01
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18138
llvm-svn: 263795
Till now, preserve_mostcc/preserve_allcc calling convention attributes were only
available at the LLVM IR level. This patch adds attributes for
preserve_mostcc/preserve_allcc calling conventions to the C/C++ front-end.
The code was mostly written by Juergen Ributzka.
I just added support for the AArch64 target and tests.
Differential Revision: http://reviews.llvm.org/D18025
llvm-svn: 263647
The purpose of this patch is to keep the same functionality without using LookupResult's implicit copy ctor and assignment operator, because they cause warnings when -Wdeprecated is passed.
This patch is meant to help the following review: http://reviews.llvm.org/D18123.
The functionality is covered by the tests in my original commit (255890)
The test case in this patch was added to test a bug caught in the review of the first version of this fix.
Differential Revision: http://reviews.llvm.org/D18175
llvm-svn: 263630
In the cross-DSO CFI mode clang emits __cfi_check_fail that handles
errors triggered from other modules with targets in the current
module. With this change, __cfi_check_fail will handle errors for
CFI kinds that are not enabled in the current module as if they
have the trapping behaviour (-fsanitize-trap=...).
This fixes a bug where some combinations of -fsanitize* flags may
result in a link failure due to a missing sanitizer runtime library
for the diagnostic calls in __cfi_check_fail.
llvm-svn: 263578
Summary:
This flag is enabled by default in the driver when NDEBUG is set. It
is forwarded on the LLVMContext to discard all value names (but
GlobalValue) for performance purpose.
This an improved version of D18024
Reviewers: echristo, chandlerc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18127
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263394
See https://llvm.org/bugs/show_bug.cgi?id=26894 for details. This change
fixes the incorrect flags to Clang and the piping issue. It also disables
the FileCheck portion of the test, which is currently failing.
llvm-svn: 263091
Includes new built-in, conversion of built-in to target-independent intrinsic
and update in the header file. Tests are also updated. There is a second part in
the backend for which I will post a separate code-review. BACKEND PART SHOULD BE
COMMITTED FIRST.
Phabricator: http://reviews.llvm.org/D17816
llvm-svn: 263051
This is mostly a one-time autoconversion of tests that checked assembly after
"-Owhatever" compiles to only run "opt -mem2reg" and check the assembly. This
should make them much more stable to changes in LLVM so they won't break on
unrelated changes.
"opt -mem2reg" is a compromise designed to increase the readability of tests
that check dataflow, while minimizing dependency on LLVM. Hopefully mem2reg is
stable enough that no surpises will come along.
Should address http://llvm.org/PR26815.
llvm-svn: 263048
Until now curly braces could only be used in MS inline assembly to mark block start/end.
All curly braces were removed completely at a very early stage.
This approach caused bugs like:
"m{o}v eax, ebx" turned into "mov eax, ebx" without any error.
In addition, AVX-512 added special operands (e.g., k registers), which are also surrounded by curly braces that mark them as such.
Now, we need to keep the curly braces and identify at a later stage if they are marking block start/end (if so, ignore them), or surrounding special AVX-512 operands (if so, parse them as such).
This patch fixes the bug described above and enables the use of AVX-512 special operands.
This commit is the the clang part of the patch.
The clang part of the review is: http://reviews.llvm.org/D17766
The llvm part of the review is: http://reviews.llvm.org/D17767
Differential Revision: http://reviews.llvm.org/D17766
llvm-svn: 262842
For compatibility with GCC, classify __m64 as SSE.
However, clang is a platform compiler for certain targets; retain our
old behavior on those targets: classify __m64 as integer.
This fixes PR26832.
llvm-svn: 262688
Applying the following restrictions for block types in OpenCL (v2.0 s6.12.5):
- __block storage class is disallowed
- every block declaration must be const qualified and initialized
- a block can't be used as a return type of a function
- a blocks can't be used to declare a structure or union field
- extern speficier is disallowed
Corrected image and sampler types diagnostics with struct and unions.
Review: http://reviews.llvm.org/D16928
llvm-svn: 262616
This patch changes cc1 option for PGO profile use from
-fprofile-instr-use=<path> to -fprofile-instrument-use-path=<path>.
-fprofile-instr-use=<path> is now a driver only option.
In addition to decouple the cc1 option from the driver level option, this patch
also enables IR level profile use. cc1 option handling now reads the profile
header and sets CodeGenOpt ProfileUse (valid values are {None, Clang, LLVM}
-- this is a common enum for -fprofile-instrument={}, for the profile
instrumentation), and invoke the pipeline to enable the respective PGO use pass.
Reviewers: silvas, davidxl
Differential Revision: http://reviews.llvm.org/D17737
llvm-svn: 262515
We'd lose track of the parent CodeGenFunction, leading us to get
confused with regard to which function a nested finally belonged to.
Differential Revision: http://reviews.llvm.org/D17752
llvm-svn: 262379
Issue: https://llvm.org/bugs/show_bug.cgi?id=26720
Fix compile error when building ffmpeg for PowerPC64LE because of some
vec_vsx_ld/vec_vsx_st intrinsics are not supported by current clang.
New added intrinsics:
(vector) {signed|unsigned} {short|char} vec_vsx_ld: (total: 8)
bool vec_vsx_ld: (total: 1)
(vector) {signed|unsigned} {short|char} vec_vsx_st: (total: 8)
bool vec_vsx_st: (total: 1)
Total: 18 intrinsics
Phabricator: http://reviews.llvm.org/D17637
llvm-svn: 262359
This patch expands cc1 option -fprofile-instrument= with a new value: -fprofile-instrument=llvm
which enables IR level PGO instrumentation.
Reviewers: davidxl, silvas
Differential Revision: http://reviews.llvm.org/D17622
llvm-svn: 262239
Fixes PR11517 for SPARC.
On most targets, clang lowers va_arg itself, eschewing the use of the
llvm vaarg instruction. This is necessary (at least for now) as the type
argument to the vaarg instruction cannot represent all the ABI
information that is needed to support complex calling conventions.
However, on targets with a simpler varrags ABIs, the LLVM instruction
can work just fine, and clang can simply lower to it. Unfortunately,
even on such targets, vaarg with a struct argument would fail, because
the default lowering to vaarg was naive: it didn't take into account the
ABI attribute computed by classifyArgumentType. In particular, for the
DefaultABIInfo, structs are supposed to be passed indirectly and so
llvm's vaarg instruction should be emitted with a pointer argument.
Now, vaarg instruction emission is able to use computed ABIArgInfo for
the provided argument type, which allows the default ABI support to work
for structs too.
I haven't touched the EmitVAArg implementation for PPC32_SVR4 or XCore,
although I believe both are now redundant, and could be switched over to
use the default implementation as well.
Differential Revision: http://reviews.llvm.org/D16154
llvm-svn: 261717
Fixing a crash caused by trying to merge a single-line asm statement with an asm block that follows it, e.g:
asm int 4
asm {
int 5
}
Now, only adjacent single-line asm statements that are not surrounded by braces will be merged into one asm call.
Differential Revision: http://reviews.llvm.org/D17496
llvm-svn: 261618
This uses the general emitVoidPtrVAArg lowering logic for everything, since
this supports all types, and we don't have any special requirements.
llvm-svn: 261557
This modification applies the following Android commit when we have an
Android environment. This is the sole non-renderscript in the Android repo
commit 9212d4fb30a3ca2f4ee966dd2748c35573d9682c
Author: Tim Murray <timmurray@google.com>
Date: Fri Aug 15 16:00:15 2014 -0700
Update vector calling convention for AArch64.
bug 16846318
Change-Id: I3cfd167758b4bd634d8480ee6ba6bb55d61f82a7
Reviewers: srhines, jyknight
Subscribers: mcrosier, aemerson, rengolin, tberghammer, danalbert, srhines
Differential Revision: http://reviews.llvm.org/D17448
llvm-svn: 261533
It can happen that when we only have 1 more register left in the regsave
area we need to store a value bigger than 1 register and therefore we
go to the overflow area. In this case we have to leave the last slot
in the regsave area unused and keep using overflow area. Do this
by storing a limit value to the used register counter in the overflow block.
Issue diagnosed by and solution tested by Mark Millard!
llvm-svn: 261422
The DataLayout can calculate alignment of vectors based on the alignment
of the element type and the number of elements. In fact, it is the product
of these two values. The problem is that for vectors of N x i1, this will
return the alignment of N bytes, since the alignment of i1 is 8 bits. The
vector types of vNi1 should be aligned to N bits instead. Provide explicit
alignment for HVX vectors to avoid such complications.
llvm-svn: 260680
This commit changes the root from "Simple C/C++ TBAA" to "Simple C++ TBAA" for
C++.
The problem is that the type name in the TBAA nodes is generated differently
for C vs C++. If we link an IR file for C with an IR file for C++, since they
have the same root and the type names are different, accesses to the two type
nodes will be considered no-alias, even though the two type nodes are from
the same type in a header file.
The fix is to use different roots for C and C++. Types from C will be treated
conservatively in respect to types from C++.
Follow-up commits will change the C root to "Simple C TBAA" plus some mangling
change for C types to make it a little more aggresive.
llvm-svn: 260567
This patch is to upgrade FunctionTypeUnwrapper for correct processing of
AttributedType. Fixes PR25786.
Patch by Alexander Makarov.
Differential Revision: http://reviews.llvm.org/D15373
llvm-svn: 260373
Avoid crashing when printing diagnostics for vtable-related CFI
errors. In diagnostic mode, the frontend does an additional check of
the vtable pointer against the set of all known vtable addresses and
lets the runtime handler know if it is safe to inspect the vtable.
http://reviews.llvm.org/D16823
llvm-svn: 259716
Defined the new AVX512 registers in clang inline asm.
Fixed a bug in the MC subtarget info creation during the parsing of MS asm statement - now it receives the actual CPU and target features information.
Differential Revision: http://reviews.llvm.org/D16757
llvm-svn: 259639
Re-commit of r258950 after fixing layering violation.
The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.
In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.
llvm-svn: 259499
Allow "mode" attribute for enum types, except for vector modes, for compatibility with GCC.
Support "mode" attribute with dependent types.
Differential Revision: http://reviews.llvm.org/D16219
llvm-svn: 259497
Frontend can emit errors when releaseing the Builder. If there are errors before
or when releasing the Builder, we reset the module to stop here before invoking
the backend.
Before this commit, clang will continue to invoke the backend and backend can
crash.
Differential Revision: http://reviews.llvm.org/D16564
llvm-svn: 259116
Re-commit of r258950 after fixing layering violation.
Add backend dignostic printer for unsupported features
The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.
In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.
Differential Revision: http://reviews.llvm.org/D16591
llvm-svn: 259036
In OpenCL, `bool` vectors are a reserved type, and are therefore
illegal.
Outside of OpenCL, if we try to make an extended vector of N `bool`s,
Clang will lower it to an `[N x i1]`. LLVM has no ABI for bitvectors, so
lots of operations on such vectors are thoroughly broken. As a result,
this patch makes them illegal in everything else, as well. :)
Differential Revision: http://reviews.llvm.org/D15721
llvm-svn: 259011
r258993 allows stricter testing for basic block labels by making
sure that they are always followed by ":". Use this to improve the
test.
llvm-svn: 258997
The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.
In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.
Differential Revision: http://reviews.llvm.org/D16591
llvm-svn: 258950
Support for the -fno-math-builtin option was added in r186899. The codegen side
is being tested in test/CodeGen/nomathbuiltin.c. The missing part was just
passing the option through the driver.
PR26317
llvm-svn: 258814
* Runtime diagnostic data for cfi-icall changed to match the rest of
cfi checks
* Layout of all CFI diagnostic data changed to put Kind at the
beginning. There is no ABI stability promise yet.
* Call cfi_slowpath_diag instead of cfi_slowpath when needed.
* Emit __cfi_check_fail function, which dispatches a CFI check
faliure according to trap/recover settings of the current module.
* A tiny driver change to match the way the new handlers are done in
compiler-rt.
llvm-svn: 258745
Volatile loads of type wider than a pointer get split by MSVC because
the base x86 ISA doesn't provide loads which are wider than pointer
width. LLVM assumes that it can emit an cmpxchg8b but this is
problematic if the memory is in a CONST memory segment.
Instead, provide behavior compatible with MSVC: split loads wider than a
pointer.
llvm-svn: 258506
This is part of a new statistics gathering feature for the sanitizers.
See clang/docs/SanitizerStats.rst for further info and docs.
Differential Revision: http://reviews.llvm.org/D16175
llvm-svn: 257971
This attribute may be attached to a function definition and instructs the backend to generate appropriate function entry/exit code so that
it can be used directly as an interrupt handler.
The IRET instruction, instead of the RET instruction, is used to return from interrupt or exception handlers. All registers, except for the EFLAGS register which is restored by the IRET instruction, are preserved by the compiler.
Any interruptible-without-stack-switch code must be compiled with -mno-red-zone since interrupt handlers can and will, because of the hardware design, touch
the red zone.
interrupt handler must be declared with a mandatory pointer argument:
struct interrupt_frame;
__attribute__ ((interrupt))
void f (struct interrupt_frame *frame) {
...
}
and user must properly define the structure the pointer pointing to.
exception handler:
The exception handler is very similar to the interrupt handler with a different mandatory function signature:
#ifdef __x86_64__
typedef unsigned long long int uword_t;
#else
typedef unsigned int uword_t;
#endif
struct interrupt_frame;
__attribute__ ((interrupt))
void f (struct interrupt_frame *frame, uword_t error_code) {
...
}
and compiler pops the error code off stack before the IRET instruction.
The exception handler should only be used for exceptions which push an error code and all other exceptions must use the interrupt handler.
The system will crash if the wrong handler is used.
Differential Revision: http://reviews.llvm.org/D15709
llvm-svn: 257867
Summary:
There are a number of files in the tree which have been accidentally checked in with DOS line endings. Convert these to native line endings.
There are also a few files which have DOS line endings on purpose, and I have set the svn:eol-style property to 'CRLF' on those.
Reviewers: joerg, aaron.ballman
Subscribers: aaron.ballman, cfe-commits
Differential Revision: http://reviews.llvm.org/D15849
llvm-svn: 256704
In MS inline asm syntax a label with '$' char produces an error, while in AT&T it does not.
In AT&T inline asm syntax Clang escapes the '$' char and replaces it with "$$". Adopted same approach for MS syntax.
Differential Revision: http://reviews.llvm.org/D15795
llvm-svn: 256545
the front end as well.
Note that DWARF5 isn't finalized and any feature support is subject to
change and accepting of the option doesn't mean we're supporting the
full range of the current standard.
llvm-svn: 256516
This adds support for the MCU psABI in a way different from r251223 and r251224,
basically reverting most of these two patches. The problem with the approach
taken in r251223/4 is that it only handled libcalls that originated from the backend.
However, the mid-end also inserts quite a few libcalls and assumes these use the
platform's default calling convention.
The previous patch tried to insert inregs when necessary both in the FE and,
somewhat hackily, in the CG. Instead, we now define a new default calling convention
for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does.
Differential Revision: http://reviews.llvm.org/D15055
llvm-svn: 256495
If there are two pointers passed to an atomic Builtin,
Clang doesn't allow the second (non-atomic) one to be qualified
with an address space.
Remove this restriction by recording the address space of passed pointers
in atomics type diagnostics.
llvm-svn: 256243
The Intel manual documents both an unsigned form (_mm_popcnt_u32)
and a signed form (_popcnt32) of the intrinsic. Add the missing signed form.
Differential Revision: http://reviews.llvm.org/D15568
llvm-svn: 256121
Add MS inline asm support for structs that contain fields that are also structs.
Differential Revision: http://reviews.llvm.org/D15578
llvm-svn: 255890
Clang-side cross-DSO CFI.
* Adds a command line flag -f[no-]sanitize-cfi-cross-dso.
* Links a runtime library when enabled.
* Emits __cfi_slowpath calls is bitset test fails.
* Emits extra hash-based bitsets for external CFI checks.
* Sets a module flag to enable __cfi_check generation during LTO.
This mode does not yet support diagnostics.
llvm-svn: 255694
Clang doesn’t support a use of “this” pointer inside inline asm.
When I tried to compile a class or a struct (see example) with an inline asm that contains "this" pointer.
Clang returns with an error.
This patch fixes that.
error: expected unqualified-id
For example:
'''
struct A {
void f() {
__asm mov eax, this
// error: expected unqualified-id
}
};
'''
Differential Revision: http://reviews.llvm.org/D15115
llvm-svn: 255645
The issue seems to be that .ll file may either use number of register
value or alias %numUsedRegs, so the check needs to cover both cases.
This will hopefully fix the last regression introduced by r255515.
llvm-svn: 255539
This patch enables soft float support for ppc32 architecture and fixes
the ABI for variadic functions. This is the first in a set of patches
for soft float support in LLVM.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D13351
llvm-svn: 255515
- Removed support for hexagonv3 and earlier.
- Added handling of hexagonv55 and hexagonv60.
- Added handling of target features (hvx, hvx-double).
- Updated paths to reflect current directory layout.
llvm-svn: 255502
This sets the maximum entry count among all functions in the program to the
module using module flags. This allows the optimizer to use this information.
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255397
As discussed on the ml, backend tests need to be put in llvm/test/CodeGen/X86 as fast-isel tests using IR that is as close to what is generated here as possible.
The llvm tests will (re)added in a future commit.
llvm-svn: 255050
variables in C, in the cases where we can constant-fold it to a value
regardless (such as floating-point division by zero and signed integer
overflow). Strictly enforcing this rule breaks too much code.
llvm-svn: 254992
Hopefully fix the remaining bot failure from r254927. Remove
target specification since it shouldn't be needed, and this causes
an error when trying to check the pass execution structure in
test/CodeGen/thinlto_backend.c on non-x86 arches.
llvm-svn: 254940
Summary:
Adds new option -fthinlto-index=<file> to invoke the LTO pipeline
along with function importing via clang using the supplied function
summary index file. This supports invoking the parallel ThinLTO
backend processes in a distributed build environment via clang.
Additionally, this causes the module linker to be invoked on the bitcode
file being compiled to perform any necessary promotion and renaming of
locals that are exported via the function summary index file.
Add a couple tests that confirm we get expected errors when we try to
use the new option on a file that isn't bitcode, or specify an invalid
index file. The tests also confirm that we trigger the expected function
import pass.
Depends on D15024
Reviewers: joker.eph, dexonsmith
Subscribers: joker.eph, davidxl, cfe-commits
Differential Revision: http://reviews.llvm.org/D15025
llvm-svn: 254927
As discussed on the ml, backend tests need to be put in llvm/test/CodeGen/X86 as fast-isel tests using IR that is as close to what is generated here as possible.
The llvm tests will (re)added in a future commit
llvm-svn: 254849
As discussed on the ml, backend tests need to be put in llvm/test/CodeGen/X86 as fast-isel tests using IR that is as close to what is generated here as possible.
The llvm tests will (re)added in a future commit
I will update PR24580 on this new plan
llvm-svn: 254847
Summary:
Looking into some recent issues with LLDBs expression parser highlighted that upstream clang passes vectors types differently to Android Open Source Project's clang for Arm Android targets.
This patch reflects the changes present in the AOSP and allows LLDB's JIT expression evaluation to work correctly for Arm Android targets when passing vectors.
This is submitted with consent of the original author Stephen Hines.
Reviewers: asl, rsmith, ADodds, rnk
Subscribers: rnk, aemerson, tberghammer, danalbert, srhines, cfe-commits, pirama
Differential Revision: http://reviews.llvm.org/D14639
llvm-svn: 254682
Fix calculating address of arguments larger than 32 bit on stack for
variadic functions (rounding up address to alignment) on ppc32 architecture.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D14871
llvm-svn: 254670
These additions were meant to go in as a part of r254554; while it's
certainly nice to have new functionality, it's nicer if we have tests to
go with it. :)
llvm-svn: 254632
This reverts commit r254143 which introduces a crash on the following input:
f(char *);
g(char *);
#pragma weak f = g
int g(char *p) {}
llvm-svn: 254605
As discussed on the ml, backend tests need to be put in llvm/test/CodeGen/X86 as fast-isel tests using IR that is as close to what is generated here as possible.
The llvm tests will (re)added in a future commit
I will update PR24580 on this new plan
llvm-svn: 254594
side-effect, so that we don't allow speculative evaluation of such expressions
during code generation.
This caused a diagnostic quality regression, so fix constant expression
diagnostics to prefer either the first "can't be constant folded" diagnostic or
the first "not a constant expression" diagnostic depending on the kind of
evaluation we're doing. This was always the intent, but didn't quite work
correctly before.
This results in certain initializers that used to be constant initializers to
no longer be; in particular, things like:
float f = 1e100;
are no longer accepted in C. This seems appropriate, as such constructs would
lead to code being executed if sanitizers are enabled.
llvm-svn: 254574
Add/Subtract.
Add missing tests that accidentally were not committed in rL254250.
Differential Revision: http://reviews.llvm.org/D14982
llvm-svn: 254251
Summary: This patch adds support for the interrupt attribute for mips32r2+.
Patch by Simon Dardis.
Reviewers: dsanders, aaron.ballman
Subscribers: aaron.ballman, cfe-commits
Differential Revision: http://reviews.llvm.org/D10802
llvm-svn: 254205
Summary: This patch adds support for the interrupt attribute for mips32r2+.
Reviewers: dsanders, aaron.ballman
Subscribers: aaron.ballman, cfe-commits
Differential Revision: http://reviews.llvm.org/D10802
llvm-svn: 254203
(Re-apply patch after bug fixing)
This diff makes sure that the driver does not pass
-fomit-frame-pointer or -momit-leaf-frame-pointer to
the frontend when -pg is used. Currently, clang gives
an error if -fomit-frame-pointer is used in combination
with -pg, but -momit-leaf-frame-pointer was forgotten.
Also, disable frame pointer elimination in the frontend
when -pg is set.
Patch by Stefan Kempf.
llvm-svn: 253886
This diff makes sure that the driver does not pass
-fomit-frame-pointer or -momit-leaf-frame-pointer to
the frontend when -pg is used. Currently, clang gives
an error if -fomit-frame-pointer is used in combination
with -pg, but -momit-leaf-frame-pointer was forgotten.
Also, disable frame pointer elimination in the frontend
when -pg is set.
Patch by Stefan Kempf.
llvm-svn: 253846
Specifying a fixed triple is not possible because that target may not
even be compiler. Go for a simpler fix by using a _? regex for the
prefix.
llvm-svn: 253758
This is similar to the earlier fix I did, r253702, expect that here it
is function names that are being searched for. If the function name
matches part of the directory name it can cause an apparent test
case failure.
llvm-svn: 253706
Summary: The frontend debuginfo tests should not invoke llvm passes which includes add-discriminators that will change the debug info generated by FE.
Reviewers: dblaikie
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D14848
llvm-svn: 253686
Summary: The discriminator change in http://reviews.llvm.org/D14738 will fail these clang tests. Update the test to accomendate the discriminator change.
Reviewers: dblaikie, davidxl, dnovillo
Differential Revision: http://reviews.llvm.org/D14836
llvm-svn: 253595
This is a follow on from a similar LLVM commit: r253511.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
The only code change to clang is hidden in CGBuilder.h which now passes
both dest and source alignment to IRBuilder, instead of taking the minimum of
dest and source alignments.
Reviewed by Hal Finkel.
llvm-svn: 253512
Currently, when there is a global register variable in a program that
is bound to an invalid register, clang/llvm prints an error message that
is not very user-friendly.
This commit improves the diagnostic and moves the check that used to be
in the backend to Sema. In addition, it makes changes to error out if
the size of the register doesn't match the declared variable size.
e.g., volatile register int B asm ("rbp");
rdar://problem/23084219
Differential Revision: http://reviews.llvm.org/D13834
llvm-svn: 253405
This reverts commit r253269.
This leads to assert / segfault triggering on the following reduced example:
float foo(float U, float base, float cell) { return (U = 2 * base) - cell; }
llvm-svn: 253337
target attribute, don't include "negative" subtarget features in the
list of required features. Builtins are positive by default so don't
need this change, but we pull the default list of features from the
command line and so need to make sure that we only include features
that are turned on for code generation in our error.
llvm-svn: 253242
These two intrinsics are defined in arm_acle.h.
__rev16l needs to rotate by 16 bits, bit it was actually rotating by 2 bits.
For AArch64, where long is 64 bits, this would still be wrong.
__rev16ll was incorrect, it reversed the bytes in each 32-bit word, rather than
each 16-bit halfword. The correct implementation is to apply __rev16 to the top
and bottom words of the 64-bit value.
For AArch32 targets, these get compiled down to the hardware rev16 instruction
at -O1 and above. For AArch64 targets, the 64-bit ones get compiled to two
32-bit rev16 instructions, because there is not currently a pattern for the
64-bit rev16 instruction.
Differential Revision: http://reviews.llvm.org/D14609
llvm-svn: 253211
Several of these tests (the two deleted, and the one removal edit) were
relying on the optimizer to collapse things to test some frontend
feature. The tests were really old and features seemed amply covered by
other parts of the test suite, so I just removed them.
If anyone thinks they're valuable enough to keep/fix, we can play around
with that, for sure.
(inspired by r252872)
llvm-svn: 253114
The ``disable_tail_calls`` attribute instructs the backend to not
perform tail call optimization inside the marked function.
For example,
int callee(int);
int foo(int a) __attribute__((disable_tail_calls)) {
return callee(a); // This call is not tail-call optimized.
}
Note that this attribute is different from 'not_tail_called', which
prevents tail-call optimization to the marked function.
rdar://problem/8973573
Differential Revision: http://reviews.llvm.org/D12547
llvm-svn: 252986
In r244063, I had caused these builtins to call the same-named library
functions, __atomic_*_fetch_SIZE. However, this was incorrect: while
those functions are in fact supported by GCC's libatomic, they're not
documented by the spec (and gcc doesn't ever call them).
Instead, you're /supposed/ to call the __atomic_fetch_* builtins and
then redo the operation inline to return the final value.
Differential Revision: http://reviews.llvm.org/D14385
llvm-svn: 252920
target features that the caller function doesn't provide. This matches
the existing backend failure to inline functions that don't have
matching target features - and diagnoses earlier in the case of
always_inline.
Fix up a few test cases that were, in fact, invalid if you tried
to generate code from the backend with the specified target features
and add a couple of tests to illustrate what's going on.
This should fix PR25246.
llvm-svn: 252834
When a struct's size is not a power of 2, the corresponding _Atomic() type is
promoted to the nearest. We already correctly handled normal C++ expressions of
this form, but direct calls to the __c11_atomic_whatever builtins ended up
performing dodgy operations on the smaller non-atomic types (e.g. memcpy too
much). Later optimisations removed this as undefined behaviour.
This patch converts EmitAtomicExpr to allocate its temporaries at the full
atomic width, sidestepping the issue.
llvm-svn: 252507
The -meabi flag to control LLVM EABI version.
Without '-meabi' or with '-meabi default' imply LLVM triple default.
With '-meabi gnu' sets EABI GNU.
With '-meabi 4' or '-meabi 5' set EABI version 4 and 5 respectively.
A similar patch was introduced in LLVM.
Patch by Vinicius Tinti.
llvm-svn: 252463
This attribute is used to prevent tail-call optimizations to the marked
function. For example, in the following piece of code, foo1 will not be
tail-call optimized:
int __attribute__((not_tail_called)) foo1(int);
int foo2(int a) {
return foo1(a); // Tail-call optimization is not performed.
}
The attribute has effect only on statically bound calls. It has no
effect on indirect calls. Also, virtual functions and objective-c
methods cannot be marked as 'not_tail_called'.
rdar://problem/22667622
Differential Revision: http://reviews.llvm.org/D12922
llvm-svn: 252369
This patch fixes one more thing in MCU psABI support: LongDoubleWidth should be set to 64.
Differential Revision: http://reviews.llvm.org/D14285
llvm-svn: 252156
This patch implements two things in front-end for MCU psABI support:
1) "long double type is the same as double."
2) "New predefined C/C++ pre-processor symbols: iamcu and iamcu__.
Differential Revision: http://reviews.llvm.org/D14205
llvm-svn: 251786
GCC uses the x87DoubleExtended model for long doubles, and passes them
indirectly by address through function calls.
Also replace the existing mingw-long-double assembly emitting test with
an IR-level test.
llvm-svn: 251567
Linking options for particular file depend on the option that specifies the file.
Currently there are two:
* -mlink-bitcode-file links in complete content of the specified file.
* -mlink-cuda-bitcode links in only the symbols needed by current TU.
Linked symbols are internalized. This bitcode linking mode is used to
link device-specific bitcode provided by CUDA.
Files are linked in order they are specified on command line.
-mlink-cuda-bitcode replaces -fcuda-uses-libdevice flag.
Differential Revision: http://reviews.llvm.org/D13913
llvm-svn: 251427
only one of a group of possibilities.
This changes the syntax in the builtin files to represent:
, as the and operator
| as the or operator
The former syntax matches how the backend tablegen files represent
multiple subtarget features being required.
Updated the builtin and intrinsic headers accordingly for the new
syntax.
llvm-svn: 251388
The MCU psABI calling convention is somewhat, but not quite, like -mregparm 3.
In particular, the rules involving structs are different.
Differential Revision: http://reviews.llvm.org/D13978
llvm-svn: 251224
According to the Intel documentation, the mask operand of a maskload and
maskstore intrinsics is always a vector of packed integer/long integer values.
This patch introduces the following two changes:
1. It fixes the avx maskload/store intrinsic definitions in avxintrin.h.
2. It changes BuiltinsX86.def to match the correct gcc definitions for avx
maskload/store (see D13861 for more details).
Differential Revision: http://reviews.llvm.org/D13861
llvm-svn: 250816
The Intel MCU psABI requires floating-point values to be passed in-reg.
This makes the x86-32 ABI code respect "-mfloat-abi soft" and generate float inreg arguments.
Differential Revision: http://reviews.llvm.org/D13554
llvm-svn: 250689
r246877 made __builtin_object_size substantially more aggressive with
unknown bases if Type=1 or Type=3, which causes issues when we encounter
code like this:
struct Foo {
int a;
char str[1];
};
const char str[] = "Hello, World!";
struct Foo *f = (struct Foo *)malloc(sizeof(*f) + strlen(str));
strcpy(&f->str, str);
__builtin_object_size(&f->str, 1) would hand back 1, which is
technically correct given the type of Foo, but the type of Foo lies to
us about how many bytes are available in this case.
This patch adds support for this "writing off the end" idiom -- we now
answer conservatively when we're given the address of the very last
member in a struct.
Differential Revision: http://reviews.llvm.org/D12169
llvm-svn: 250488
Previously, our logic when taking the address of an overloaded function
would not consider enable_if attributes, so long as all of the enable_if
conditions on a given candidate were true. So, two functions with
identical signatures (one with enable_if attributes, the other without),
would be considered equally good overloads. If we were calling the
function instead of taking its address, then the function with enable_if
attributes would be preferred.
This patch makes us prefer the candidate with enable_if regardless of if
we're calling or taking the address of an overloaded function.
Differential Revision: http://reviews.llvm.org/D13795
llvm-svn: 250486
match the feature set of the function that they're being called from.
This ensures that we can effectively diagnose some[1] code that would
instead ICE in the backend with a failure to select message.
Example:
__m128d foo(__m128d a, __m128d b) {
return __builtin_ia32_addsubps(b, a);
}
compiled for normal x86_64 via:
clang -target x86_64-linux-gnu -c
would fail to compile in the back end because the normal subtarget
features for x86_64 only include sse2 and the builtin requires sse3.
[1] We're still not erroring on:
__m128i bar(__m128i const *p) { return _mm_lddqu_si128(p); }
where we should fail and error on an always_inline function being
inlined into a function that doesn't support the subtarget features
required.
llvm-svn: 250473
Add support for the `-fdebug-prefix-map=` option as in GCC. The syntax is
`-fdebug-prefix-map=OLD=NEW`. When compiling files from a path beginning with
OLD, change the debug info to indicate the path as start with NEW. This is
particularly helpful if you are preprocessing in one path and compiling in
another (e.g. for a build cluster with distcc).
Note that the linearity of the implementation is not as terrible as it may seem.
This is normally done once per file with an expectation that the map will be
small (1-2) entries, making this roughly linear in the number of input paths.
Addresses PR24619.
llvm-svn: 250094
This fixes a bug where one can take the address of a conditionally
enabled function to drop its enable_if guards. For example:
int foo(int a) __attribute__((enable_if(a > 0, "")));
int (*p)(int) = &foo;
int result = p(-1); // compilation succeeds; calls foo(-1)
Overloading logic has been updated to reflect this change, as well.
Functions with enable_if attributes that are always true are still
allowed to have their address taken.
Differential Revision: http://reviews.llvm.org/D13607
llvm-svn: 250090
Rationale :
// sse3
__m128d test_mm_addsub_pd(__m128d A, __m128d B) {
return _mm_addsub_pd(A, B);
}
// mmx
void shift(__m64 a, __m64 b, int c) {
_mm_slli_pi16(a, c);
_mm_slli_pi32(a, c);
_mm_slli_si64(a, c);
_mm_srli_pi16(a, c);
_mm_srli_pi32(a, c);
_mm_srli_si64(a, c);
_mm_srai_pi16(a, c);
_mm_srai_pi32(a, c);
}
clang -msse3 -mno-mmx file.c -c
For this code we should be able to explicitly turn off MMX
without affecting the compilation of the SSE3 function and then
diagnose and error on compiling the MMX function.
This is a preparatory patch to the actual diagnosis code which is
coming in a future patch. This sets us up to have the correct information
where we need it and verifies that it's being emitted for the backend
to handle.
llvm-svn: 249733
that we can build up an accurate set of features rather than relying on
TargetInfo initialization via handleTargetFeatures to munge the list
of features.
llvm-svn: 249732
Enums without an explicit, fixed, underlying type are implicitly given a
fixed 'int' type for ABI compatibility with MSVC. However, we can
enforce the standard-mandated rules on these types as-if we didn't know
this fact if the tag is not part of a definition.
llvm-svn: 249667
These test updates almost exclusively around the change in behavior
around enum: enums without a definition are considered incomplete except
when targeting MSVC ABIs. Since these tests are interested in the
'incomplete-enum' behavior, restrict them to %itanium_abi_triple.
llvm-svn: 249660
No ABI for C++ currently makes it possible to implement the standard
100% perfectly. We wrongly hid some of our compatible behavior behind
-fms-compatibility instead of tying it to the compiler ABI.
llvm-svn: 249656
With this change, most 'g' options are rejected by CompilerInvocation.
They remain only as Driver options. The new way to request debug info
from cc1 is with "-debug-info-kind={line-tables-only|limited|standalone}"
and "-dwarf-version={2|3|4}". In the absence of a command-line option
to specify Dwarf version, the Toolchain decides it, rather than placing
Toolchain-specific logic in CompilerInvocation.
Also fix a bug in the Windows compatibility argument parsing
in which the "rightmost argument wins" principle failed.
Differential Revision: http://reviews.llvm.org/D13221
llvm-svn: 249655
Currently FastISel doesn't know how to select vector bitcasts.
During instruction selection, fast-isel always falls back to SelectionDAG
every time it encounters a vector bitcast.
As a consequence of this, all the 'packed vector shift by immedate count'
test cases in avx2-builtins.c are optimized by the DAGCombiner.
In particular, the DAGCombiner would always fold trivial stack loads of
constant shift counts into the operands of packed shift builtins.
This behavior would start changing as soon as I reapply revision 249121.
That revision would teach x86 fast-isel how to select bitcasts between vector
types of the same size.
As a consequence of that change, fast-isel would less often fall back to
SelectionDAG. More importantly, DAGCombiner would no longer be able to
simplify the code by folding the stack reload of a constant.
No functional change.
llvm-svn: 249142
test that our intrinsics behave the same under -fsigned-char and
-funsigned-char.
This further testing uncovered that AVX-2 has a broken cmpgt for 8-bit
elements, and has for a long time. This is fixed in the same way as
SSE4 handles the case.
The other ISA extensions currently work correctly because they use
specific instruction intrinsics. As soon as they are rewritten in terms
of generic IR, they will need to add these special casts. I've added the
necessary testing to catch this however, so we shouldn't have to chase
it down again.
I considered changing the core typedef to be signed, but that seems like
a bad idea. Notably, it would be an ABI break if anyone is reaching into
the innards of the intrinsic headers and passing __v16qi on an API
boundary. I can't be completely confident that this wouldn't happen due
to a macro expanding in a lambda, etc., so it seems much better to leave
it alone. It also matches GCC's behavior exactly.
A fun side note is that for both GCC and Clang, -funsigned-char really
does change the semantics of __v16qi. To observe this, consider:
% cat x.cc
#include <smmintrin.h>
#include <iostream>
int main() {
__v16qi a = { 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
__v16qi b = _mm_set1_epi8(-1);
std::cout << (int)(a / b)[0] << ", " << (int)(a / b)[1] << '\n';
}
% clang++ -o x x.cc && ./x
-1, 1
% clang++ -funsigned-char -o x x.cc && ./x
0, 1
However, while this may be surprising, both Clang and GCC agree.
Differential Revision: http://reviews.llvm.org/D13324
llvm-svn: 249097
recently when we started using direct conversion to model sign
extension. The __v16qi type we use for SSE v16i8 vectors is defined in
terms of 'char' which may or may not be signed! This causes us to
generate pmovsx and pmovzx depending on the setting of -funsigned-char.
This patch just forms an explicitly signed type and uses that to
formulate the sign extension. While this gets the correct behavior
(which we now verify with the enhanced test) this is just the tip of the
ice berg. Now that I know what to look for, I have found errors of this
sort *throughout* our vector code. Fortunately, this is the only
specific place where I know of users actively having their code
miscompiled by Clang due to this, so I'm keeping the fix for those users
minimal and targeted.
I'll be sending a proper email for discussion of how to fix these
systematically, what the implications are, and just how widely broken
this is... From what I can tell, we have never shipped a correct set of
builtin headers for x86 when users rely on -funsigned-char. Oops.
llvm-svn: 248980
Summary: __nvvm_atom_cas_* returns the old value instead of whether the swap succeeds.
Reviewers: eliben, tra
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D13306
llvm-svn: 248951
This is the clang commit associated with llvm r248887.
This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234],
vst[234]lane ARM neon intrinsics and associates an address space with the
pointer that these intrinsics take. This changes, e.g.,
<2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32)
to
<2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32)
This change ensures that address spaces are fully taken into account in the ARM
target during lowering of interleaved loads and stores.
Differential Revision: http://reviews.llvm.org/D13127
llvm-svn: 248888
This patch corresponds to review:
http://reviews.llvm.org/D13190
Implemented the following interfaces to conform to ELF V2 ABI version 1.1.
vector signed __int128 vec_adde (vector signed __int128, vector signed __int128, vector signed __int128);
vector unsigned __int128 vec_adde (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128);
vector signed __int128 vec_addec (vector signed __int128, vector signed __int128, vector signed __int128);
vector unsigned __int128 vec_addec (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128);
vector signed int vec_addc(vector signed int __a, vector signed int __b);
vector bool char vec_cmpge (vector signed char __a, vector signed char __b);
vector bool char vec_cmpge (vector unsigned char __a, vector unsigned char __b);
vector bool short vec_cmpge (vector signed short __a, vector signed short __b);
vector bool short vec_cmpge (vector unsigned short __a, vector unsigned short __b);
vector bool int vec_cmpge (vector signed int __a, vector signed int __b);
vector bool int vec_cmpge (vector unsigned int __a, vector unsigned int __b);
vector bool char vec_cmple (vector signed char __a, vector signed char __b);
vector bool char vec_cmple (vector unsigned char __a, vector unsigned char __b);
vector bool short vec_cmple (vector signed short __a, vector signed short __b);
vector bool short vec_cmple (vector unsigned short __a, vector unsigned short __b);
vector bool int vec_cmple (vector signed int __a, vector signed int __b);
vector bool int vec_cmple (vector unsigned int __a, vector unsigned int __b);
vector double vec_double (vector signed long long __a);
vector double vec_double (vector unsigned long long __a);
vector bool char vec_eqv(vector bool char __a, vector bool char __b);
vector bool short vec_eqv(vector bool short __a, vector bool short __b);
vector bool int vec_eqv(vector bool int __a, vector bool int __b);
vector bool long long vec_eqv(vector bool long long __a, vector bool long long __b);
vector signed short vec_madd(vector signed short __a, vector signed short __b, vector signed short __c);
vector signed short vec_madd(vector signed short __a, vector unsigned short __b, vector unsigned short __c);
vector signed short vec_madd(vector unsigned short __a, vector signed short __b, vector signed short __c);
vector unsigned short vec_madd(vector unsigned short __a, vector unsigned short __b, vector unsigned short __c);
vector bool long long vec_mergeh(vector bool long long __a, vector bool long long __b);
vector bool long long vec_mergel(vector bool long long __a, vector bool long long __b);
vector bool char vec_nand(vector bool char __a, vector bool char __b);
vector bool short vec_nand(vector bool short __a, vector bool short __b);
vector bool int vec_nand(vector bool int __a, vector bool int __b);
vector bool long long vec_nand(vector bool long long __a, vector bool long long __b);
vector bool char vec_orc(vector bool char __a, vector bool char __b);
vector bool short vec_orc(vector bool short __a, vector bool short __b);
vector bool int vec_orc(vector bool int __a, vector bool int __b);
vector bool long long vec_orc(vector bool long long __a, vector bool long long __b);
vector signed long long vec_sub(vector signed long long __a, vector signed long long __b);
vector signed long long vec_sub(vector bool long long __a, vector signed long long __b);
vector signed long long vec_sub(vector signed long long __a, vector bool long long __b);
vector unsigned long long vec_sub(vector unsigned long long __a, vector unsigned long long __b);
vector unsigned long long vec_sub(vector bool long long __a, vector unsigned long long __b);
vector unsigned long long vec_sub(vector unsigned long long __V2 ABI V1.1
http://ror float vec_sub(vector float __a, vector float __b);
unsigned char vec_extract(vector bool char __a, int __b);
signed short vec_extract(vector signed short __a, int __b);
unsigned short vec_extract(vector bool short __a, int __b);
signed int vec_extract(vector signed int __a, int __b);
unsigned int vec_extract(vector bool int __a, int __b);
signed long long vec_extract(vector signed long long __a, int __b);
unsigned long long vec_extract(vector unsigned long long __a, int __b);
unsigned long long vec_extract(vector bool long long __a, int __b);
double vec_extract(vector double __a, int __b);
vector bool char vec_insert(unsigned char __a, vector bool char __b, int __c);
vector signed short vec_insert(signed short __a, vector signed short __b, int __c);
vector bool short vec_insert(unsigned short __a, vector bool short __b, int __c);
vector signed int vec_insert(signed int __a, vector signed int __b, int __c);
vector bool int vec_insert(unsigned int __a, vector bool int __b, int __c);
vector signed long long vec_insert(signed long long __a, vector signed long long __b, int __c);
vector unsigned long long vec_insert(unsigned long long __a, vector unsigned long long __b, int __c);
vector bool long long vec_insert(unsigned long long __a, vector bool long long __b, int __c);
vector double vec_insert(double __a, vector double __b, int __c);
vector signed long long vec_splats(signed long long __a);
vector unsigned long long vec_splats(unsigned long long __a);
vector signed __int128 vec_splats(signed __int128 __a);
vector unsigned __int128 vec_splats(unsigned __int128 __a);
vector double vec_splats(double __a);
int vec_all_eq(vector double __a, vector double __b);
int vec_all_ge(vector double __a, vector double __b);
int vec_all_gt(vector double __a, vector double __b);
int vec_all_le(vector double __a, vector double __b);
int vec_all_lt(vector double __a, vector double __b);
int vec_all_nan(vector double __a);
int vec_all_ne(vector double __a, vector double __b);
int vec_all_nge(vector double __a, vector double __b);
int vec_all_ngt(vector double __a, vector double __b);
int vec_any_eq(vector double __a, vector double __b);
int vec_any_ge(vector double __a, vector double __b);
int vec_any_gt(vector double __a, vector double __b);
int vec_any_le(vector double __a, vector double __b);
int vec_any_lt(vector double __a, vector double __b);
int vec_any_ne(vector double __a, vector double __b);
vector unsigned char vec_sbox_be (vector unsigned char);
vector unsigned char vec_cipher_be (vector unsigned char, vector unsigned char);
vector unsigned char vec_cipherlast_be (vector unsigned char, vector unsigned char);
vector unsigned char vec_ncipher_be (vector unsigned char, vector unsigned char);
vector unsigned char vec_ncipherlast_be (vector unsigned char, vector unsigned char);
vector unsigned int vec_shasigma_be (vector unsigned int, const int, const int);
vector unsigned long long vec_shasigma_be (vector unsigned long long, const int, const int);
vector unsigned short vec_pmsum_be (vector unsigned char, vector unsigned char);
vector unsigned int vec_pmsum_be (vector unsigned short, vector unsigned short);
vector unsigned long long vec_pmsum_be (vector unsigned int, vector unsigned int);
vector unsigned __int128 vec_pmsum_be (vector unsigned long long, vector unsigned long long);
vector unsigned char vec_gb (vector unsigned char);
vector unsigned long long vec_bperm (vector unsigned __int128 __a, vector unsigned char __b);
Removed the folowing interfaces either because their signatures have changed
in version 1.1 of the ABI or because they were implemented for ELF V2 ABI but
have actually been deprecated in version 1.1.
vector signed char vec_eqv(vector bool char __a, vector signed char __b);
vector signed char vec_eqv(vector signed char __a, vector bool char __b);
vector unsigned char vec_eqv(vector bool char __a, vector unsigned char __b);
vector unsigned char vec_eqv(vector unsigned char __a, vector bool char __b);
vector signed short vec_eqv(vector bool short __a, vector signed short __b);
vector signed short vec_eqv(vector signed short __a, vector bool short __b);
vector unsigned short vec_eqv(vector bool short __a, vector unsigned short __b);
vector unsigned short vec_eqv(vector unsigned short __a, vector bool short __b);
vector signed int vec_eqv(vector bool int __a, vector signed int __b);
vector signed int vec_eqv(vector signed int __a, vector bool int __b);
vector unsigned int vec_eqv(vector bool int __a, vector unsigned int __b);
vector unsigned int vec_eqv(vector unsigned int __a, vector bool int __b);
vector signed long long vec_eqv(vector bool long long __a, vector signed long long __b);
vector signed long long vec_eqv(vector signed long long __a, vector bool long long __b);
vector unsigned long long vec_eqv(vector bool long long __a, vector unsigned long long __b);
vector unsigned long long vec_eqv(vector unsigned long long __a, vector bool long long __b);
vector float vec_eqv(vector bool int __a, vector float __b);
vector float vec_eqv(vector float __a, vector bool int __b);
vector double vec_eqv(vector bool long long __a, vector double __b);
vector double vec_eqv(vector double __a, vector bool long long __b);
vector unsigned short vec_nand(vector bool short __a, vector unsigned short __b);
llvm-svn: 248813
Currently it's 64-bit which will lead to mismatch between host and
device code if we compile for i386.
Differential Revision: http://reviews.llvm.org/D13181
llvm-svn: 248753
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in
a hand-rolled tricky condition block in lib/Basic/Targets.cpp, with a FIXME:
attached.
http://reviews.llvm.org/D12937 moved the handling of the DSP feature over to
ARMTargetParser.def in LLVM, to be in line with other architecture extensions.
This is the corresponding patch to clang, to clear the FIXME: and update
the tests.
Differential Revision: http://reviews.llvm.org/D12938
llvm-svn: 248521
Summary:
Strictly speaking, the MIPS*R2 ISA's should not permit -mnan=2008 since this
feature was added in MIPS*R3. However, other toolchains permit this and we
should do the same.
Reviewers: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D13057
llvm-svn: 248481
This commit fixes an assert that is triggered when optnone is being
added to an IR function that is already marked with minsize and optsize.
rdar://problem/22723716
Differential Revision: http://reviews.llvm.org/D13004
llvm-svn: 248191
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in
a hand-rolled tricky condition block in lib/Basic/Targets.cpp, with a FIXME:
attached.
http://reviews.llvm.org/D12937 moved the handling of +t2dsp over to
ARMTargetParser.def in LLVM, to be in line with other architecture extensions.
This is the corresponding patch to clang, to clear the FIXME: and update
the tests.
Differential Revision: http://reviews.llvm.org/D12938
llvm-svn: 248154
128-bit vector integer sign extensions correctly lower to the pmovsx instructions even for debug builds.
This patch removes the builtins and reimplements the _mm_cvtepi*_epi* intrinsics __using builtin_shufflevector (to extract the bottom most subvector) and __builtin_convertvector (to actually perform the sign extension).
Differential Revision: http://reviews.llvm.org/D12835
llvm-svn: 248092
Summary:
This change adds support for `__builtin_ms_va_list`, a GCC extension for
variadic `ms_abi` functions. The existing `__builtin_va_list` support is
inadequate for this because `va_list` is defined differently in the Win64
ABI vs. the System V/AMD64 ABI.
Depends on D1622.
Reviewers: rsmith, rnk, rjmccall
CC: cfe-commits
Differential Revision: http://reviews.llvm.org/D1623
llvm-svn: 247941
Mingw generally wraps an old copy of msvcrt.dll which has these
personalities, so things should work out, or so I hear. I haven't tested
it.
llvm-svn: 247902
convert i64 to FP and vice versa
reduceps & reducepd
rangeps & rangepd
all in their 512bit versions
Differential Revision: http://reviews.llvm.org/D11716
llvm-svn: 247881
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.
Libc++ relies on the compiler never emitting such undefined reference.
With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
-- or, depending on the linkage --
2b. A declaration of F.
The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).
This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.
This patch is based on ideas by Chandler Carruth and Richard Smith.
llvm-svn: 247494
Current implementation may end up emitting an undefined reference for
an "inline __attribute__((always_inline))" function by generating an
"available_externally alwaysinline" IR function for it and then failing to
inline all the calls. This happens when a call to such function is in dead
code. As the inliner is an SCC pass, it does not process dead code.
Libc++ relies on the compiler never emitting such undefined reference.
With this patch, we emit a pair of
1. internal alwaysinline definition (called F.alwaysinline)
2a. A stub F() { musttail call F.alwaysinline }
-- or, depending on the linkage --
2b. A declaration of F.
The frontend ensures that F.inlinefunction is only used for direct
calls, and the stub is used for everything else (taking the address of
the function, really). Declaration (2b) is emitted in the case when
"inline" is meant for inlining only (like __gnu_inline__ and some
other cases).
This approach, among other nice properties, ensures that alwaysinline
functions are always internal, making it impossible for a direct call
to such function to produce an undefined symbol reference.
This patch is based on ideas by Chandler Carruth and Richard Smith.
llvm-svn: 247465
-force-align-stack.
Also, make changes to the driver so that -mno-stack-realign is no longer
an option exposed to the end-user that disallows stack realignment in
the backend.
Differential Revision: http://reviews.llvm.org/D11815
llvm-svn: 247451
It seems that there is small bug, and we can't generate assume loads
when some virtual functions have internal visibiliy
This reverts commit 982bb7d966947812d216489b3c519c9825cacbf2.
llvm-svn: 247332
This flag causes the compiler to emit bit set entries for functions as well
as runtime bitset checks at indirect call sites. Depends on the new function
bitset mechanism.
Differential Revision: http://reviews.llvm.org/D11857
llvm-svn: 247238
Generating call assume(icmp %vtable, %global_vtable) after constructor
call for devirtualization purposes.
For more info go to:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html
Edit:
Fixed version because of PR24479.
After this patch got reverted because of ScalarEvolution bug (D12719)
Merged after John McCall big patch (Added Address).
http://reviews.llvm.org/D11859
llvm-svn: 247199
The tests in test/CodeGen/arm-target-features.c are currently
passing but warning messages are suppressed. These tests are now
synchronized with the corresponding changes in Target Parser.
This patch will fix the regressions in clang caused by r247136
Differential Revision: http://reviews.llvm.org/D12722
llvm-svn: 247138
Summary:
Currently clang provides no general way to generate nontemporal loads/stores.
There are some architecture specific builtins for doing so (e.g. in x86), but
there is no way to generate non-temporal store on, e.g. AArch64. This patch adds
generic builtins which are expanded to a simple store with '!nontemporal'
attribute in IR.
Differential Revision: http://reviews.llvm.org/D12313
llvm-svn: 247104
instruction used the ReturnValue as pointer operand or value operand. This
led to wrong code gen - in later stages (load-store elision code) the found
store and its operand would be erased, causing ReturnValue to become a <badref>.
The patch adds a check that makes sure that ReturnValue is a pointer operand of
store instruction. Regression test is also added.
This fixes PR24386.
Differential Revision: http://reviews.llvm.org/D12400
llvm-svn: 247003
Introduce an Address type to bundle a pointer value with an
alignment. Introduce APIs on CGBuilderTy to work with Address
values. Change core APIs on CGF/CGM to traffic in Address where
appropriate. Require alignments to be non-zero. Update a ton
of code to compute and propagate alignment information.
As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.
The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned. Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay. I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.
Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do a alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.
We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment. In particular,
field access now uses alignmentAtOffset instead of min.
Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs. For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint. That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.
ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments. In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments. That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.
I partially punted on applying this work to CGBuiltin. Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.
llvm-svn: 246985
Apparently there are many cast kinds that may cause implicit pointer
arithmetic to happen. In light of this, the cast ignoring logic
introduced in r246877 has been changed to only ignore a small set of
cast kinds, and a test for this behavior has been added.
Thanks to Richard for catching this before it became a bug report. :)
llvm-svn: 246890
Improvements:
- For all types, we would give up in a case such as:
__builtin_object_size((char*)&foo, N);
even if we could provide an answer to
__builtin_object_size(&foo, N);
We now provide the same answer for both of the above examples in all
cases.
- For type=1|3, we now support subobjects with unknown bases, as long
as the designator is valid.
Thanks to Richard Smith for the review + design planning.
Review: http://reviews.llvm.org/D12169
llvm-svn: 246877