Commit Graph

1062 Commits

Author SHA1 Message Date
Artem Belevich fda9905062 [CUDA] added __nvvm_atom_{sys|cta}_* builtins.
These builtins are available on sm_60+ GPU only.

Differential Revision: https://reviews.llvm.org/D24944

llvm-svn: 282609
2016-09-28 17:47:35 +00:00
Nemanja Ivanovic 10e2b5dcaa [Power9] Builtins for ELF v.2 ABI conformance - front end portion
This patch corresponds to review:
https://reviews.llvm.org/D24397

It adds the __POWER9_VECTOR__ macro and the -mpower9-vector option along with
a number of altivec.h functions (refer to the code review for a list).

llvm-svn: 282481
2016-09-27 10:45:22 +00:00
Alexey Bader 465c18973d [OpenCL] Augment pipe built-ins with pipe packet size and alignment.
Reviewers: Anastasia, vpykhtin

Subscribers: dmitry, cfe-commits

Differential Revision: https://reviews.llvm.org/D23992

llvm-svn: 282252
2016-09-23 14:20:00 +00:00
Albert Gutowski 727ab8a803 Add some MS aliases for existing intrinsics
Reviewers: thakis, compnerd, majnemer, rsmith, rnk

Subscribers: alexshap, cfe-commits

Differential Revision: https://reviews.llvm.org/D24330

llvm-svn: 281540
2016-09-14 21:19:43 +00:00
Dehao Chen 5d4f0be5b8 Convert finite to builtin
Summary: This patch converts finite/__finite to builtin functions so that it will be inlined by compiler.

Reviewers: hfinkel, davidxl, efriedma

Subscribers: efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D24483

llvm-svn: 281509
2016-09-14 17:34:14 +00:00
Albert Gutowski fc19fa3721 Temporary fix for MS _Interlocked intrinsics
llvm-svn: 281401
2016-09-13 21:51:37 +00:00
Albert Gutowski 9918cb6573 Reverse commit 281375 (breaks building Chromium)
llvm-svn: 281399
2016-09-13 21:24:51 +00:00
Albert Gutowski ce7a9a47b2 Add bunch of _Interlocked builtins
Reviewers: compnerd, thakis, Prazek, majnemer, rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D24153

llvm-svn: 281378
2016-09-13 19:43:33 +00:00
Albert Gutowski ae3fb3113f Add some MS aliases for existing intrinsics
Reviewers: thakis, compnerd, majnemer, rsmith, rnk

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D24330

llvm-svn: 281375
2016-09-13 19:26:42 +00:00
Albert Gutowski b6a11acb53 Implement MS _rot intrinsics
Reviewers: thakis, Prazek, compnerd, rnk

Subscribers: majnemer, cfe-commits

Differential Revision: https://reviews.llvm.org/D24311

llvm-svn: 280997
2016-09-08 22:32:19 +00:00
Changpeng Fang 03bdd8f797 AMDGPU: Add clang builtin for ds_swizzle.
Summary:
  int __builtin_amdgcn_ds_swizzle (int a, int imm);
while imm is a constant.

Differential Revision:
  http://reviews.llvm.org/D23682

llvm-svn: 279165
2016-08-18 22:04:54 +00:00
Reid Kleckner 66e7717b46 Revert "[X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows platforms"
This reverts commit r278783.  It breaks usage of _xgetbv on Windows.

llvm-svn: 278814
2016-08-16 16:04:14 +00:00
Marina Yatsina 197b65f833 [X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows platforms
commit on behalf of guyblank

Differential Revision: https://reviews.llvm.org/D21959

llvm-svn: 278783
2016-08-16 08:13:36 +00:00
Chandler Carruth 4c5e8ccf74 [x86] Fix a really nasty bug introduced in r276417 where alignment
constraints were added to _mm256_broadcast_{pd,ps} intel intrinsics.

The spec for these intrinics is ... pretty much silent on alignment.
This is especially frustrating considering the amount of discussion of
alignment in the load and store instrinsics. So I was forced to rely on
the specification for the VBROADCASTF128 instruction.

That instruction's spec is *also* completely silent on alignment.
Fortunately, when it comes to the instruction's spec, silence is enough.
There is no #GP fault option for an underaligned address so this
instruction, and by inference the intrinsic, can read any alignment.

As it happens, the old code worked exactly this way and in fact we have
plenty of code that hands pointers with less than 16-byte alignment to
these intrinsics. This code broke pretty spectacularly with this commit.

Fortunately, the fix is super simple! Change a 16 to a 1, and ta da!

Anyways, a lot of debugging for a really boring fix. =]

llvm-svn: 278202
2016-08-10 07:32:47 +00:00
Wei Ding 91c8450967 AMDGPU : Add Clang builtin intrinsics for compare with the full
wavefront result.

Differential Revision: http://reviews.llvm.org/D22934

llvm-svn: 277824
2016-08-05 15:38:46 +00:00
Alexey Bader d81623261a [OpenCL] Added underscores to the names of 'to_addr' OpenCL built-ins.
Summary:
In order to re-define OpenCL built-in functions
'to_{private,local,global}' in OpenCL run-time library LLVM names must
be different from the clang built-in function names.

Reviewers: yaxunl, Anastasia

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D23120

llvm-svn: 277743
2016-08-04 18:06:27 +00:00
Simon Pilgrim 2d8517303c [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 with generic IR
As discussed on D22460, I've updated the vbroadcastf128 pd256/ps256 builtins to map directly to generic IR - load+splat a 128-bit vector to both lanes of a 256-bit vector.

Fix for PR28657.

llvm-svn: 276417
2016-07-22 13:58:56 +00:00
Matt Arsenault c7536a5d60 AMDGPU: Remove legacy ldexp builtin
llvm-svn: 275623
2016-07-15 21:33:06 +00:00
Matt Arsenault c86671da09 AMDGPU: Update for rsq intrinsic changes
llvm-svn: 275622
2016-07-15 21:33:02 +00:00
Wei Ding ea41f356bb AMDGPU: Add Clang Builtin for v_lerp_u8
Differential Revision: http://reviews.llvm.org/D22380

llvm-svn: 275577
2016-07-15 16:43:03 +00:00
Jan Vesely d7e03a5bd9 AMDGPU: Export workitem builtins
Reviewers: tstellardAMD

Differential Revision: http://reviews.llvm.org/D20299

llvm-svn: 275030
2016-07-10 22:38:04 +00:00
Craig Topper f2f1a099a7 [CodeGen] Use llvm::Type::getVectorNumElements instead of casting to llvm::VectorType and calling getNumElements. This is equivalent and shorter.
llvm-svn: 274823
2016-07-08 02:17:35 +00:00
Craig Topper 0160063aeb [X86] Reuse existing lambda and remove unnecessary argument from vector cmp builtin handling. NFC
llvm-svn: 274821
2016-07-08 01:57:24 +00:00
Craig Topper 925ef0a135 [X86] Remove a couple calls to create V2F64 and V4F32 types for builtin handling. Just get the type from the operand of the builtin instead. NFC
llvm-svn: 274820
2016-07-08 01:48:44 +00:00
Craig Topper 425d02d33e [X86] Use native IR for immediate values 0-7 of packed fp cmp builtins. This makes them the same as what is done when using the SSE builtins for these same encodings.
llvm-svn: 274608
2016-07-06 06:27:31 +00:00
Craig Topper 46e7555d4b [AVX512] Use the generic ctlz intrinsic to implement the vplzcntd/q builtins.
llvm-svn: 274603
2016-07-06 04:24:29 +00:00
Anastasia Stulova db7a31cce7 [OpenCL] An implementation of device side enqueue (DSE) from OpenCL v2.0 s6.13.17.
- Added new Builtins: enqueue_kernel, get_kernel_work_group_size
and get_kernel_preferred_work_group_size_multiple.

These Builtins use custom check to diagnose parameters of the passed Blocks
i. e. variable number of 'local void*' type params, and check different
overloads specified in Table 6.31 of OpenCL v2.0.

- IR is generated as an internal library call for each OpenCL Builtin,
reusing ObjC Block implementation.

Review: http://reviews.llvm.org/D20249
llvm-svn: 274540
2016-07-05 11:31:24 +00:00
Anastasia Stulova 7f8d6dc0ef [OpenCL] Make OpenCL Builtins added according to the right version.
Currently we only have OpenCL 2.0 Builtins i.e. pipes or address space conversions.

They have to be added only in the version 2.0 compilation mode to make the identifiers
available for use in the other versions.

Review: http://reviews.llvm.org/D20249
llvm-svn: 274509
2016-07-04 16:07:18 +00:00
Craig Topper ac1823f6e9 [AVX512] Modify what indices we emit for the zero vector we use for zero extension of the result of a v2i1 or v4i1 masked compare. This way we emit something that the backend easily interprets as a concatenation rather than a true shuffle. This delivers slightly better codegen with the current backend capabilities.
llvm-svn: 274484
2016-07-04 07:09:46 +00:00
Matt Arsenault f652caea65 Emit more intrinsics for builtin functions
This is important for building libclc. Since r273039 tests are failing
due to now emitting calls to these functions instead of emitting the
DAG node. The libm function names are implemented for OpenCL, and should
call the locally defined versions, so -fno-builtin is used. The IR
Some functions use the __builtins and expect the intrinsics to be
emitted. Without this we end up with nobuiltin calls to intrinsics
or to unsupported library calls.

llvm-svn: 274370
2016-07-01 17:38:14 +00:00
Igor Breger 2c880cf9b1 [AVX512] Zero extend cmp intrinsic return value.
Differential Revision: http://reviews.llvm.org/D21746

llvm-svn: 274110
2016-06-29 08:14:17 +00:00
Matt Arsenault 64665bc50d AMDGPU: Add builtin to read exec mask
llvm-svn: 273965
2016-06-28 00:13:17 +00:00
Craig Topper d1691c7026 [AVX512] Replace masked integer cmp and ucmp builtins with native IR.
llvm-svn: 273378
2016-06-22 04:47:58 +00:00
Simon Pilgrim d39d026324 [X86][SSE4A] Use native IR for mask movntsd/movntss intrinsics.
Depends on llvm side commit r273002.

llvm-svn: 273003
2016-06-17 14:28:16 +00:00
Ranjeet Singh ca2b3e7b5c [ARM] Add mrrc/mrrc2 intrinsics and update existing mcrr/mcrr2 intrinsics.
Reapplying patch in r272777 which was reverted
because the llvm patch which added support
for generating the mcrr/mcrr2 instructions
from the intrinsic was causing an assertion
failure. This has now been fixed in llvm.

llvm-svn: 272983
2016-06-17 00:59:41 +00:00
Sanjay Patel dbd68dd09d [x86] generate IR for AVX2 integer min/max builtins
Sibling patch to r272932:
http://reviews.llvm.org/rL272932

llvm-svn: 272933
2016-06-16 18:45:01 +00:00
Marcin Koscielnicki a46fade624 [Builtin] Make __builtin_thread_pointer target-independent.
This is now supported for ARM, AArch64, PowerPC, SystemZ, SPARC, Mips.

Differential Revision: http://reviews.llvm.org/D19589

llvm-svn: 272893
2016-06-16 13:41:54 +00:00
Sanjay Patel 280cfd1a69 [x86] translate SSE packed FP comparison builtins to IR
As noted in the code comment, a potential follow-on would be to remove
the builtins themselves. Other than ord/unord, this already works as 
expected. Eg:

  typedef float v4sf __attribute__((__vector_size__(16)));
  v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }

Differential Revision: http://reviews.llvm.org/D21268

llvm-svn: 272840
2016-06-15 21:20:04 +00:00
Sanjay Patel 7495ec026e [x86] generate IR for SSE integer min/max builtins
Sibling patch to r272806:
http://reviews.llvm.org/rL272806

llvm-svn: 272807
2016-06-15 17:18:50 +00:00
Ranjeet Singh d48760da64 Reverting r272777 because one of the tests
added in the llvm patch is causing an assertion
to fail.

llvm-svn: 272790
2016-06-15 14:21:28 +00:00
Craig Topper a54c21e742 [AVX512] Use native IR for mask pcmpeq/pcmpgt intrinsics.
llvm-svn: 272787
2016-06-15 14:06:34 +00:00
Ranjeet Singh 8d5ad5bdf2 [ARM] Add mrrc/mrrc2 intrinsics and update existing mcrr/mcrr2 intrinsics.
Patch adds intrinsics for mrrc/mrrc2. The
intrinsics for mrrc/mrrc2 return a single
uint64_t to represent two 32 bit values.

The mcrr/mcrr2 intrinsic was changed to
accept a single uint64_t instead of two
32 bit values as the input for consistency.

Differential Revision: http://reviews.llvm.org/D21179

llvm-svn: 272777
2016-06-15 11:32:18 +00:00
Simon Pilgrim 532de1ceb9 Fix unused variable warning
llvm-svn: 272541
2016-06-13 10:05:19 +00:00
Simon Pilgrim beca5f295c [Clang][X86] Convert non-temporal store builtins to generic __builtin_nontemporal_store in headers
We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp

The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores.

The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load

Differential Revision: http://reviews.llvm.org/D21272

llvm-svn: 272540
2016-06-13 09:57:52 +00:00
Craig Topper d1cb4ceacd [CodeGen] Update to use an ArrayRef of uint32_t instead of int in calls to CreateShuffleVector to match llvm interface change.
llvm-svn: 272492
2016-06-12 00:41:24 +00:00
Craig Topper 2769bb5753 [X86] Handle AVX2 pslldqi and psrldqi intrinsics shufflevector creation directly in the header file instead of in CGBuiltin.cpp. Simplify the sse2 equivalents as well.
llvm-svn: 272246
2016-06-09 05:15:12 +00:00
Craig Topper c1442973c8 [X86] Reuse the EmitX86Select routine to handle the select for masked palignr too.
llvm-svn: 272245
2016-06-09 05:15:00 +00:00
Igor Breger aadb876200 [AVX512] Emit select instruction instead of using x86 specific instrinsics.
This will allow us to remove the x86 instrinics from the backend.

Differential Revision: http://reviews.llvm.org/D21060

llvm-svn: 272141
2016-06-08 13:59:20 +00:00
Craig Topper f51cc07719 [AVX512] Convert masked palignr builtins directly to native IR similar to the other palignr builtins, but with a select to handle masking.
llvm-svn: 271873
2016-06-06 06:13:01 +00:00
Craig Topper 4b060e31c9 [AVX512] Convert masked load builtins to generic masked load intrinsics instead of the x86 specific ones.
This will allow the x86 intrinsics to be removed from the backend.

llvm-svn: 271253
2016-05-31 06:58:07 +00:00
Craig Topper 6e891fbdd2 [AVX512] Emit generic masked store instrinsics instead of using x86 specific intrinsics.
This will allow us to remove the x86 instrinics from the backend.

llvm-svn: 271246
2016-05-31 01:50:10 +00:00
Craig Topper b8b4b7eb01 [X86] Simplify alignr builtin support by recognizing that NumLaneElts is always 16. NFC
llvm-svn: 271176
2016-05-29 07:06:02 +00:00
Craig Topper 832caf041f [CodeGen] Use the ArrayRef form CreateShuffleVector instead of building ConstantVectors or ConstantDataVectors and calling the other form.
llvm-svn: 271165
2016-05-29 02:39:30 +00:00
Matt Arsenault 2d51059ebb AMDGPU: Add fract builtin
llvm-svn: 271080
2016-05-28 00:43:27 +00:00
David Majnemer e6abf3d29f [CodeGen] Don't crash when sizeof(long) != 4 for some intrins
_InterlockedIncrement and _InterlockedDecrement have 'long' in their
prototypes.  We assumed 'long' was the same size as an i32 which is
incorrect for other targets.

This fixes PR27892.

llvm-svn: 270953
2016-05-27 02:06:19 +00:00
Yaxun Liu f7449a179b [OpenCL] Add to_{global|local|private} builtin functions.
OpenCL builtin functions to_{global|local|private} accepts argument of pointer type to arbitrary pointee type, and return a pointer to the same pointee type in different addr space, i.e.

global gentype *to_global(gentype *p);
It is not desirable to declare it as

global void *to_global(void *);
in opencl header file since it misses diagnostics.

This patch implements these builtin functions as Clang builtin functions. In the builtin def file they are defined to have signature void*(void*). When handling call expressions, their declarations are re-written to have correct parameter type and return type corresponding to the call argument.

In codegen call to addr void *to_addr(void*) is generated with addrcasts or bitcasts to facilitate implementation in builtin library.

Differential Revision: http://reviews.llvm.org/D19932

llvm-svn: 270261
2016-05-20 19:54:38 +00:00
Benjamin Kramer f4c520d5d2 Add all the avx512 flavors to __builtin_cpu_supports's list.
This is matching what trunk gcc is accepting. Also adds a missing ssse3
case. PR27779. The amount of duplication here is annoying, maybe it
should be factored into a separate .def file?

llvm-svn: 270224
2016-05-20 15:21:08 +00:00
Justin Lebar 2e4ecfdebe [CUDA] Implement __ldg using intrinsics.
Summary:
Previously it was implemented as inline asm in the CUDA headers.

This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions.  This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.

Reviewers: tra, rsmith

Subscribers: jhen, cfe-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D19990

llvm-svn: 270150
2016-05-19 22:49:13 +00:00
Derek Schuff dbd24b4593 [WebAssembly] Rename memory_size intrinsic to current_memory
This follows the recent change in the wasm spec.

llvm-svn: 268256
2016-05-02 17:26:19 +00:00
Marcin Koscielnicki 4005070e1b [AArch64] Fix D19098 fallout.
The intrinsic is now called llvm.thread.pointer, not
llvm.aarch64.thread.pointer.  Also, the code handling it in CGBuiltin.cpp
is dead - it's already covered by GCCBuiltin.  Remove it.

Differential Revision: http://reviews.llvm.org/D19099

llvm-svn: 266817
2016-04-19 20:51:00 +00:00
Ahmed Bougacha 1d9de10130 [ARM NEON] Define vfms_f32 on ARM, and all vfms using vfma.
r259537 added vfma/vfms to armv7, but the builtin was only lowered
on the AArch64 side. Instead of supporting it on ARM, get rid of it.

The vfms builtin lowered to:
  %nb = fsub float -0.0, %b
  %r = @llvm.fma.f32(%a, %nb, %c)

Instead, define the operation in terms of vfma, and swap the
multiplicands. It now lowers to:
  %na = fsub float -0.0, %a
  %r = @llvm.fma.f32(%na, %b, %c)

This matches the instruction more closely, and lets current LLVM
generate the "natural" operand ordering:
  fmls.2s v0, v1, v2
instead of the crooked (but equivalent):
  fmls.2s v0, v2, v1
Except for theses changes, assembly is identical.

LLVM accepts both commutations, and the LLVM tests in:
  test/CodeGen/AArch64/arm64-fmadd.ll
  test/CodeGen/AArch64/fp-dp3.ll
  test/CodeGen/AArch64/neon-fma.ll
  test/CodeGen/ARM/fusedMAC.ll
already check either the new one only, or both.

Also verified against the test-suite unittests.

llvm-svn: 266807
2016-04-19 19:44:45 +00:00
Sanjay Patel ae7a9df7bf make __builtin_isfinite more efficient (PR27145)
isinf (is infinite) and isfinite should be implemented with the same function
except we change the comparison operator.

See PR27145 for more details:
https://llvm.org/bugs/show_bug.cgi?id=27145

Ref: forked off of the discussion in D18513.

Differential Revision: http://reviews.llvm.org/D18648

llvm-svn: 265675
2016-04-07 14:29:05 +00:00
JF Bastien 92f4ef1017 NFC: make AtomicOrdering an enum class
Summary: See LLVM change D18775 for details, this change depends on it.

Reviewers: jyknight, reames

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D18776

llvm-svn: 265569
2016-04-06 17:26:42 +00:00
Matt Arsenault 3fb963389e AMDGPU: Add frexp_mant + frexp_exp builtins
llvm-svn: 264960
2016-03-30 22:57:40 +00:00
Aaron Ballman abd466ed04 Silencing warnings from MSVC 2015 Update 2. Both of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC.
llvm-svn: 264932
2016-03-30 21:33:34 +00:00
Matt Arsenault 08087c52eb Add missing __builtin_bitreverse8
Also add documentation for bitreverse builtins

llvm-svn: 264203
2016-03-23 22:14:43 +00:00
Justin Lebar 717d2b0a0d [CUDA] Implement atomicInc and atomicDec builtins
These functions cannot be implemented as atomicrmw or cmpxchg
instructions, so they are implemented as a call to the NVVM intrinsics
@llvm.nvvm.atomic.load.inc.32.p0i32 and
@llvm.nvvm.atomic.load.dec.32.p0i32.

Patch by Jason Henline.

Reviewers: jlebar

Differential Revision: http://reviews.llvm.org/D18322

llvm-svn: 264009
2016-03-22 00:09:28 +00:00
John McCall c56a8b3284 Preserve ExtParameterInfos into CGFunctionInfo.
As part of this, make the function-arrangement interfaces
a little simpler and more semantic.

NFC.

llvm-svn: 263191
2016-03-11 04:30:31 +00:00
Kit Barton fbab158767 [PPC] FE support for generating VSX [negated] absolute value instructions
Includes new built-in, conversion of built-in to target-independent intrinsic
and update in the header file. Tests are also updated. There is a second part in
the backend for which I will post a separate code-review. BACKEND PART SHOULD BE
COMMITTED FIRST.

Phabricator: http://reviews.llvm.org/D17816
llvm-svn: 263051
2016-03-09 19:28:31 +00:00
Matt Arsenault 2d9339890f Add __builtin_canonicalize
llvm-svn: 262122
2016-02-27 09:06:18 +00:00
Matt Arsenault 9b277b4ad4 AMDGPU: Add sin/cos builtins
llvm-svn: 260783
2016-02-13 01:21:09 +00:00
Matt Arsenault f5c1f47181 AMDGPU: Update builtin for intrinsic change
llvm-svn: 260781
2016-02-13 01:03:09 +00:00
Matt Arsenault 105e892c2c Add builtins for bitreverse intrinsic
Follow the naming convention that bswap uses since it's a
similar sort of operation.

llvm-svn: 259671
2016-02-03 17:49:38 +00:00
Xiuli Pan bb4d8d30b1 Recommit: R258773 [OpenCL] Pipe builtin functions
Fix arc patch fuzz error.
Summary:
Support for the pipe built-in functions for OpenCL 2.0.
The pipe builtin functions may have infinite kinds of element types, one approach
would be to just generate calls that would always use generic types such as void*.
This patch is based on bader's opencl support patch on SPIR-V branch.

Reviewers: Anastasia, pekka.jaaskelainen

Subscribers: keryell, bader, cfe-commits

Differential Revision: http://reviews.llvm.org/D15914

llvm-svn: 258782
2016-01-26 04:03:48 +00:00
David Majnemer 747f168e8d Revert "[OpenCL] Pipe builtin functions"
This reverts commit r258773, it broke the build bots:
http://bb.pgr.jp/builders/cmake-clang-x86_64-linux/builds/43853

llvm-svn: 258775
2016-01-26 02:22:31 +00:00
Xiuli Pan 3a9952c9e7 [OpenCL] Pipe builtin functions
Summary:
Support for the pipe built-in functions for OpenCL 2.0.
The pipe builtin functions may have infinite kinds of element types, one approach
would be to just generate calls that would always use generic types such as void*.
This patch is based on bader's opencl support patch on SPIR-V branch.

Reviewers: Anastasia, pekka.jaaskelainen

Subscribers: keryell, bader, cfe-commits

Differential Revision: http://reviews.llvm.org/D15914

llvm-svn: 258773
2016-01-26 02:06:04 +00:00
Justin Lebar 3039a593db [CUDA] Make printf work.
Summary:
The code in CGCUDACall is largely based on a patch written by Eli
Bendersky:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140324/210218.html

That patch implemented an LLVM pass lowering printf to vprintf; this
one does something similar, but in Clang codegen.

Reviewers: echristo

Subscribers: cfe-commits, jhen, tra, majnemer

Differential Revision: http://reviews.llvm.org/D16372

llvm-svn: 258642
2016-01-23 21:28:14 +00:00
Matt Arsenault 8a4078c741 AMDGPU: Rename builtins to use amdgcn prefix
Keep the ones still used by libclc around for now.

Emit the new amdgcn intrinsic name if not targeting r600,
in which case the old AMDGPU name is still used.

llvm-svn: 258560
2016-01-22 21:30:53 +00:00
Ben Craig cd7e9f143b Reordering fields to reduce padding in Clang. NFC
llvm-svn: 255552
2015-12-14 21:54:11 +00:00
George Burgess IV 3e3bb95b69 Add the `pass_object_size` attribute to clang.
`pass_object_size` is our way of enabling `__builtin_object_size` to
produce high quality results without requiring inlining to happen
everywhere.

A link to the design doc for this attribute is available at the
Differential review link below.

Differential Revision: http://reviews.llvm.org/D13263

llvm-svn: 254554
2015-12-02 21:58:08 +00:00
Eric Christopher fbfd97ed5c Move checkTargetFeatures to CodeGenFunction.cpp to make it
more obvious that it's generic.

llvm-svn: 252833
2015-11-12 00:44:07 +00:00
Eric Christopher c7e79dbec8 In preparation to use it in more places rename
checkBuiltinTargetFeatures to checkTargetFeatures and sink
the error handling into the function.

llvm-svn: 252832
2015-11-12 00:44:04 +00:00
Eric Christopher 2b90a64e31 Extract out a function onto CodeGenModule for getting the map of
features for a particular function, then use it to clean up some
code.

llvm-svn: 252819
2015-11-11 23:05:08 +00:00
Eric Christopher ed60b436d4 Fix a FIXME about using std::is_sorted.
llvm-svn: 252691
2015-11-11 02:04:08 +00:00
Petar Jovanovic 73d1044abe Fix __builtin_signbit for ppcf128 type
Function__builtin_signbit returns wrong value for type ppcf128 on big endian
machines. This patch fixes how value is generated in that case.

Patch by Aleksandar Beserminji.

Differential Revision: http://reviews.llvm.org/D14149

llvm-svn: 252307
2015-11-06 14:52:46 +00:00
Dan Gohman 24f0a08c1b [WebAssembly] Update wasm builtin functions to match spec changes.
The page_size operator has been removed from the spec, and the resize_memory
operator has been changed to grow_memory.

llvm-svn: 252201
2015-11-05 20:16:37 +00:00
John McCall 03107a4ef0 Add support for __builtin_{add,sub,mul}_overflow.
Patch by David Grayson!

llvm-svn: 251651
2015-10-29 20:48:01 +00:00
Benjamin Kramer e003ca2a03 Put global classes into the appropriate namespace.
Most of the cases belong into an anonymous namespace. No functionality
change intended.

llvm-svn: 251514
2015-10-28 13:54:16 +00:00
Eric Christopher 9d628c33b3 Reflow comment.
llvm-svn: 251501
2015-10-28 06:56:25 +00:00
Eric Christopher 99af5b2ea7 Handle target builtin options that are all required rather than
only one of a group of possibilities.

This changes the syntax in the builtin files to represent:

, as the and operator
| as the or operator

The former syntax matches how the backend tablegen files represent
multiple subtarget features being required.

Updated the builtin and intrinsic headers accordingly for the new
syntax.

llvm-svn: 251388
2015-10-27 06:11:03 +00:00
Eric Christopher 4a4367534b Use early exits to reduce indentation.
llvm-svn: 251371
2015-10-27 00:06:21 +00:00
Craig Topper 273dbc602f Make a bunch of static arrays const.
llvm-svn: 250647
2015-10-18 05:29:26 +00:00
Eric Christopher 15709991d0 Add an error when calling a builtin that requires features that don't
match the feature set of the function that they're being called from.

This ensures that we can effectively diagnose some[1] code that would
instead ICE in the backend with a failure to select message.

Example:

__m128d foo(__m128d a, __m128d b) {
  return __builtin_ia32_addsubps(b, a);
}

compiled for normal x86_64 via:

clang -target x86_64-linux-gnu -c

would fail to compile in the back end because the normal subtarget
features for x86_64 only include sse2 and the builtin requires sse3.

[1] We're still not erroring on:

__m128i bar(__m128i const *p) { return _mm_lddqu_si128(p); }

where we should fail and error on an always_inline function being
inlined into a function that doesn't support the subtarget features
required.

llvm-svn: 250473
2015-10-15 23:47:11 +00:00
Benjamin Kramer c2d2b4259c [CodeGen] Remove dead code. NFC.
llvm-svn: 250418
2015-10-15 15:29:40 +00:00
Amjad Aboud 2b9b8a5921 [X86] Add XSAVE intrinsic family
Add intrinsics for the
  XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64)
  XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64)
  XSAVEC instructions (XSAVEC/XSAVEC64)
  XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64)

Differential Revision: http://reviews.llvm.org/D13014

llvm-svn: 250158
2015-10-13 12:29:35 +00:00
Dan Gohman 266b38ab56 [WebAssembly] Add a __builtin_wasm_resize_memory() intrinsic.
llvm-svn: 249179
2015-10-02 20:20:01 +00:00
Dan Gohman d4c5fb597d [WebAssembly] Add a __builtin_wasm_memory_size() intrinsic.
llvm-svn: 249176
2015-10-02 19:38:47 +00:00
Jingyue Wu f1eca25b16 [CUDA] fix codegen for __nvvm_atom_cas_*
Summary: __nvvm_atom_cas_* returns the old value instead of whether the swap succeeds.

Reviewers: eliben, tra

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D13306

llvm-svn: 248951
2015-09-30 21:49:32 +00:00
Jeroen Ketema 55a8e80de8 [ARM][NEON] Use address space in vld([1234]|[234]lane) and vst([1234]|[234]lane) instructions
This is the clang commit associated with llvm r248887.

This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234],
vst[234]lane ARM neon intrinsics and associates an address space with the
pointer that these intrinsics take. This changes, e.g.,

<2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32)

to

<2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32)

This change ensures that address spaces are fully taken into account in the ARM
target during lowering of interleaved loads and stores.

Differential Revision: http://reviews.llvm.org/D13127

llvm-svn: 248888
2015-09-30 10:56:56 +00:00
Artem Belevich b5bc923af4 [CUDA] Allow parsing of host and device code simultaneously.
* adds -aux-triple option to specify target triple
 * propagates aux target info to AST context and Preprocessor
 * pulls in target specific preprocessor macros.
 * pulls in target-specific builtins from aux target.
 * sets appropriate host or device attribute on builtins.

Differential Revision: http://reviews.llvm.org/D12917

llvm-svn: 248299
2015-09-22 17:23:22 +00:00
Charles Davis c7d5c94f78 Support __builtin_ms_va_list.
Summary:
This change adds support for `__builtin_ms_va_list`, a GCC extension for
variadic `ms_abi` functions. The existing `__builtin_va_list` support is
inadequate for this because `va_list` is defined differently in the Win64
ABI vs. the System V/AMD64 ABI.

Depends on D1622.

Reviewers: rsmith, rnk, rjmccall

CC: cfe-commits

Differential Revision: http://reviews.llvm.org/D1623

llvm-svn: 247941
2015-09-17 20:55:33 +00:00
Steven Wu 0d22f2d57e Fix vld1_lane intrinsic generation
Fix a bug introduced in r246985 which causes assertion when generating
vld1_lane.

llvm-svn: 247117
2015-09-09 01:37:18 +00:00
Michael Zolotukhin 84df12375c Introduce __builtin_nontemporal_store and __builtin_nontemporal_load.
Summary:
Currently clang provides no general way to generate nontemporal loads/stores.
There are some architecture specific builtins for doing so (e.g. in x86), but
there is no way to generate non-temporal store on, e.g. AArch64. This patch adds
generic builtins which are expanded to a simple store with '!nontemporal'
attribute in IR.

Differential Revision: http://reviews.llvm.org/D12313

llvm-svn: 247104
2015-09-08 23:52:33 +00:00
John McCall 7f416cc426 Compute and preserve alignment more faithfully in IR-generation.
Introduce an Address type to bundle a pointer value with an
alignment.  Introduce APIs on CGBuilderTy to work with Address
values.  Change core APIs on CGF/CGM to traffic in Address where
appropriate.  Require alignments to be non-zero.  Update a ton
of code to compute and propagate alignment information.

As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.

The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned.  Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay.  I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.

Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do a alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.

We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment.  In particular,
field access now uses alignmentAtOffset instead of min.

Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs.  For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint.  That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.

ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments.  In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments.  That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.

I partially punted on applying this work to CGBuiltin.  Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.

llvm-svn: 246985
2015-09-08 08:05:57 +00:00
Dan Gohman c285307e14 [WebAssembly] Initial WebAssembly support in clang
This implements basic support for compiling (though not yet assembling
or linking) for a WebAssembly target. Note that ABI details are not yet
finalized, and may change.

Differential Revision: http://reviews.llvm.org/D12002

llvm-svn: 246814
2015-09-03 22:51:53 +00:00
Sanjay Patel a24296b459 add __builtin_unpredictable and convert to metadata
This patch depends on r246688 (D12341).

The goal is to make LLVM generate different code for these functions for a target that
has cheap branches (see PR23827 for more details):

int foo();

int normal(int x, int y, int z) {
   if (x != 0 && y != 0) return foo();
   return 1;
}

int crazy(int x, int y) {
   if (__builtin_unpredictable(x != 0 && y != 0)) return foo();
   return 1;
}

Differential Revision: http://reviews.llvm.org/D12458

llvm-svn: 246699
2015-09-02 20:01:30 +00:00
Hal Finkel 65e1e4dbe0 [PowerPC] Support __builtin_ppc_get_timebase
GCC 4.8+ has a PowerPC-specific intrinsic, __builtin_ppc_get_timebase, to do
what Clang's __builtin_readcyclecounter does. For compatibility with code that
uses GCC's spelling (including glibc), support it as well.

Partially fixes PR23681.

llvm-svn: 246510
2015-08-31 23:55:19 +00:00
Jingyue Wu 2d69f9608e [CUDA] fix codegen for __nvvm_atom_min/max_gen_u*
Summary: Clang should emit "atomicrmw umin/umax" instead of "atomicrmw min/max".

Reviewers: eliben, tra

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12487

llvm-svn: 246455
2015-08-31 17:25:51 +00:00
Simon Pilgrim 5aba9925c0 [X86][SSE] Add _mm_undefined_* intrinsics
Added missing SSE/AVX 'undefined' intrinsics (PR24040):

_mm_undefined_pd, _mm_undefined_ps + _mm_undefined_si128
_mm256_undefined_pd, _mm256_undefined_ps + _mm256_undefined_si256
_mm512_undefined, _mm512_undefined_ps, _mm512_undefined_pd + _mm512_undefined_epi32

Added builtin intrinsicss:

__builtin_ia32_undef128, __builtin_ia32_undef256 + __builtin_ia32_undef512

Differential Revision: http://reviews.llvm.org/D12052

llvm-svn: 246083
2015-08-26 21:17:12 +00:00
Ahmed Bougacha 40882bb9f8 [ARM NEON] Use CGF cached Types instead of llvm::Type::get. NFC.
llvm-svn: 245906
2015-08-24 23:47:29 +00:00
Ahmed Bougacha 774b5e296f [ARM NEON] Replace redundant code with a new GetFloatNeonType. NFC.
llvm-svn: 245904
2015-08-24 23:41:31 +00:00
Ahmed Bougacha cd5b8a0235 [ARM NEON] Use the common naming scheme for vcvt f16 builtins. NFC.
We had "vcvt_f16" and "VCVT_HIGH_F16": for other FP types, this naming
is used for intrinsics with integer overloads. The FP->FP conversions,
on the other hand, use the full "vcvt_f32_f64" name instead.

Use the same naming convention for the f16<->f32 conversions.
While there, reorder the definitions a little bit.

llvm-svn: 245763
2015-08-21 23:34:20 +00:00
Eric Christopher 02d5d86b4e Rename the non-coding style conformant functions in namespace Builtins
to match the rest of their brethren and reformat the bits that need it.

llvm-svn: 244186
2015-08-06 01:01:12 +00:00
Benjamin Kramer c385a808b8 [CodeGen] Clean up CGBuiltin a bit.
- Use cached LLVM types
- Turn SmallVectors into Arrays/ArrayRef if the size is static
- Use ConstantInt::get's implicit splatting for vector types

No functionality change intended.

llvm-svn: 243425
2015-07-28 15:40:11 +00:00
Adhemerval Zanella 3916c910d1 [AArch64] Implement __builtin_thread_pointer
This path add the aarch64 __builtin_thread_pointer support.  It will be
lowered to llvm.aarch64.thread.pointer.

llvm-svn: 243413
2015-07-28 13:10:10 +00:00
David Majnemer 6cd35912c0 [CodeGen] Don't UBSan-ize the argument to __builtin_frame_address
__builtin_frame_address requires its argument to be a constant
expression which already implies that it cannot have undefined behavior.
However, we used EmitScalarExpr to emit the argument causing UBSan to
try to check for overflow.

Instead, use the constant expression emission system.

This fixes PR24256.

llvm-svn: 243206
2015-07-25 05:57:24 +00:00
Benjamin Kramer b596056413 [CodeGen] Flip lanes when lowering __builtin_palignr with one lane
Otherwise we'd pick the wrong lane for the resulting shuffle and
miscompile code. PR24187.

llvm-svn: 242678
2015-07-20 15:31:17 +00:00
Nemanja Ivanovic 6c363ed67a Add missing builtins to altivec.h for ABI compliance (vol. 4)
This patch corresponds to review:
http://reviews.llvm.org/D11184

A number of new interfaces for altivec.h (as mandated by the ABI):
vector float vec_cpsgn(vector float, vector float)
vector double vec_cpsgn(vector double, vector double)
vector double vec_or(vector bool long long, vector double)
vector double vec_or(vector double, vector bool long long)
vector double vec_re(vector double)
vector signed char vec_cntlz(vector signed char)
vector unsigned char vec_cntlz(vector unsigned char)
vector short vec_cntlz(vector short)
vector unsigned short vec_cntlz(vector unsigned short)
vector int vec_cntlz(vector int)
vector unsigned int vec_cntlz(vector unsigned int)
vector signed long long vec_cntlz(vector signed long long)
vector unsigned long long vec_cntlz(vector unsigned long long)
vector signed char vec_nand(vector bool signed char, vector signed char)
vector signed char vec_nand(vector signed char, vector bool signed char)
vector signed char vec_nand(vector signed char, vector signed char)
vector unsigned char vec_nand(vector bool unsigned char, vector unsigned char)
vector unsigned char vec_nand(vector unsigned char, vector bool unsigned char)
vector unsigned char vec_nand(vector unsigned char, vector unsigned char)
vector short vec_nand(vector bool short, vector short)
vector short vec_nand(vector short, vector bool short)
vector short vec_nand(vector short, vector short)
vector unsigned short vec_nand(vector bool unsigned short, vector unsigned short)
vector unsigned short vec_nand(vector unsigned short, vector bool unsigned short)
vector unsigned short vec_nand(vector unsigned short, vector unsigned short)
vector int vec_nand(vector bool int, vector int)
vector int vec_nand(vector int, vector bool int)
vector int vec_nand(vector int, vector int)
vector unsigned int vec_nand(vector bool unsigned int, vector unsigned int)
vector unsigned int vec_nand(vector unsigned int, vector bool unsigned int)
vector unsigned int vec_nand(vector unsigned int, vector unsigned int)
vector signed long long vec_nand(vector bool long long, vector signed long long)
vector signed long long vec_nand(vector signed long long, vector bool long long)
vector signed long long vec_nand(vector signed long long, vector signed long long)
vector unsigned long long vec_nand(vector bool long long, vector unsigned long long)
vector unsigned long long vec_nand(vector unsigned long long, vector bool long long)
vector unsigned long long vec_nand(vector unsigned long long, vector unsigned long long)
vector signed char vec_orc(vector bool signed char, vector signed char)
vector signed char vec_orc(vector signed char, vector bool signed char)
vector signed char vec_orc(vector signed char, vector signed char)
vector unsigned char vec_orc(vector bool unsigned char, vector unsigned char)
vector unsigned char vec_orc(vector unsigned char, vector bool unsigned char)
vector unsigned char vec_orc(vector unsigned char, vector unsigned char)
vector short vec_orc(vector bool short, vector short)
vector short vec_orc(vector short, vector bool short)
vector short vec_orc(vector short, vector short)
vector unsigned short vec_orc(vector bool unsigned short, vector unsigned short)
vector unsigned short vec_orc(vector unsigned short, vector bool unsigned short)
vector unsigned short vec_orc(vector unsigned short, vector unsigned short)
vector int vec_orc(vector bool int, vector int)
vector int vec_orc(vector int, vector bool int)
vector int vec_orc(vector int, vector int)
vector unsigned int vec_orc(vector bool unsigned int, vector unsigned int)
vector unsigned int vec_orc(vector unsigned int, vector bool unsigned int)
vector unsigned int vec_orc(vector unsigned int, vector unsigned int)
vector signed long long vec_orc(vector bool long long, vector signed long long)
vector signed long long vec_orc(vector signed long long, vector bool long long)
vector signed long long vec_orc(vector signed long long, vector signed long long)
vector unsigned long long vec_orc(vector bool long long, vector unsigned long long)
vector unsigned long long vec_orc(vector unsigned long long, vector bool long long)
vector unsigned long long vec_orc(vector unsigned long long, vector unsigned long long)
vector signed char vec_div(vector signed char, vector signed char)
vector unsigned char vec_div(vector unsigned char, vector unsigned char)
vector signed short vec_div(vector signed short, vector signed short)
vector unsigned short vec_div(vector unsigned short, vector unsigned short)
vector signed int vec_div(vector signed int, vector signed int)
vector unsigned int vec_div(vector unsigned int, vector unsigned int)
vector signed long long vec_div(vector signed long long, vector signed long long)
vector unsigned long long vec_div(vector unsigned long long, vector unsigned long long)
vector unsigned char vec_mul(vector unsigned char, vector unsigned char)
vector unsigned int vec_mul(vector unsigned int, vector unsigned int)
vector unsigned long long vec_mul(vector unsigned long long, vector unsigned long long)
vector unsigned short vec_mul(vector unsigned short, vector unsigned short)
vector signed char vec_mul(vector signed char, vector signed char)
vector signed int vec_mul(vector signed int, vector signed int)
vector signed long long vec_mul(vector signed long long, vector signed long long)
vector signed short vec_mul(vector signed short, vector signed short)
vector signed long long vec_mergeh(vector signed long long, vector signed long long)
vector signed long long vec_mergeh(vector signed long long, vector bool long long)
vector signed long long vec_mergeh(vector bool long long, vector signed long long)
vector unsigned long long vec_mergeh(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_mergeh(vector unsigned long long, vector bool long long)
vector unsigned long long vec_mergeh(vector bool long long, vector unsigned long long)
vector double vec_mergeh(vector double, vector double)
vector double vec_mergeh(vector double, vector bool long long)
vector double vec_mergeh(vector bool long long, vector double)
vector signed long long vec_mergel(vector signed long long, vector signed long long)
vector signed long long vec_mergel(vector signed long long, vector bool long long)
vector signed long long vec_mergel(vector bool long long, vector signed long long)
vector unsigned long long vec_mergel(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_mergel(vector unsigned long long, vector bool long long)
vector unsigned long long vec_mergel(vector bool long long, vector unsigned long long)
vector double vec_mergel(vector double, vector double)
vector double vec_mergel(vector double, vector bool long long)
vector double vec_mergel(vector bool long long, vector double)
vector signed int vec_pack(vector signed long long, vector signed long long)
vector unsigned int vec_pack(vector unsigned long long, vector unsigned long long)
vector bool int vec_pack(vector bool long long, vector bool long long)

llvm-svn: 242171
2015-07-14 17:50:27 +00:00
David Blaikie 4ba525b727 Rely on default zero-arg value for IRBuilder::CreateCall calls to zero-arg functions
Patch by servuswiegehtz at yahoo.de

llvm-svn: 242168
2015-07-14 17:27:39 +00:00
Nemanja Ivanovic 1c7ad715ec Add missing builtins to altivec.h for ABI compliance (vol. 2)
This patch corresponds to review:
http://reviews.llvm.org/D10875

The bulk of the second round of additions to altivec.h.
The following interfaces were added:
vector double vec_floor(vector double)
vector double vec_madd(vector double, vector double, vector double)
vector float vec_msub(vector float, vector float, vector float)
vector double vec_msub(vector double, vector double, vector double)
vector float vec_mul(vector float, vector float)
vector double vec_mul(vector double, vector double)
vector float vec_nmadd(vector float, vector float, vector float)
vector double vec_nmadd(vector double, vector double, vector double)
vector double vec_nmsub(vector double, vector double, vector double)
vector double vec_nor(vector double, vector double)
vector double vec_or(vector double, vector double)
vector float vec_rint(vector float)
vector double vec_rint(vector double)
vector float vec_nearbyint(vector float)
vector double vec_nearbyint(vector double)
vector float vec_sqrt(vector float)
vector double vec_sqrt(vector double)
vector double vec_rsqrte(vector double)
vector double vec_sel(vector double, vector double, vector unsigned long long)
vector double vec_sel(vector double, vector double, vector unsigned long long)
vector double vec_sub(vector double, vector double)
vector double vec_trunc(vector double)
vector double vec_xor(vector double, vector double)
vector double vec_xor(vector double, vector bool long long)
vector double vec_xor(vector bool long long, vector double)

New VSX paths for the following interfaces:
vector float vec_madd(vector float, vector float, vector float)
vector float vec_nmsub(vector float, vector float, vector float)
vector float vec_rsqrte(vector float)
vector float vec_trunc(vector float)
vector float vec_floor(vector float)

llvm-svn: 241399
2015-07-05 06:40:52 +00:00
Akira Hatanaka 85365cd72a Attach attribute "trap-func-name" to call sites of llvm.trap and llvm.debugtrap.
This is needed to use clang's command line option "-ftrap-function" for LTO and
enable changing the trap function name on a per-call-site basis.

rdar://problem/21225723

Differential Revision: http://reviews.llvm.org/D10831

llvm-svn: 241306
2015-07-02 22:15:41 +00:00
Eric Christopher d983270976 Add support for the x86 builtin __builtin_cpu_supports.
This matches the implementation of the gcc support for the same
feature, including checking the values set up by libgcc at runtime.
The structure looks like this:

  unsigned int __cpu_vendor;
  unsigned int __cpu_type;
  unsigned int __cpu_subtype;
  unsigned int __cpu_features[1];

with a set of enums to match various fields that are field out after
parsing the output of the cpuid instruction.
This also adds a set of errors checking for valid input (and cpu).

compiler-rt support for this and the other builtins in this family
(__builtin_cpu_init and __builtin_cpu_is) are forthcoming.

llvm-svn: 240994
2015-06-29 21:00:05 +00:00
Nemanja Ivanovic 2f1f926e34 Add missing builtins to altivec.h for ABI compliance (vol. 1)
This patch corresponds to review:
http://reviews.llvm.org/D10637

This is the first round of additions of missing builtins listed in the ABI document. More to come (this builds onto what seurer already addes). This patch adds:
vector signed long long vec_abs(vector signed long long)
vector double vec_abs(vector double)
vector signed long long vec_add(vector signed long long, vector signed long long)
vector unsigned long long vec_add(vector unsigned long long, vector unsigned long long)
vector double vec_add(vector double, vector double)
vector double vec_and(vector bool long long, vector double)
vector double vec_and(vector double, vector bool long long)
vector double vec_and(vector double, vector double)
vector signed long long vec_and(vector signed long long, vector signed long long)
vector double vec_andc(vector bool long long, vector double)
vector double vec_andc(vector double, vector bool long long)
vector double vec_andc(vector double, vector double)
vector signed long long vec_andc(vector signed long long, vector signed long long)
vector double vec_ceil(vector double)
vector bool long long vec_cmpeq(vector double, vector double)
vector bool long long vec_cmpge(vector double, vector double)
vector bool long long vec_cmpge(vector signed long long, vector signed long long)
vector bool long long vec_cmpge(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmpgt(vector double, vector double)
vector bool long long vec_cmple(vector double, vector double)
vector bool long long vec_cmple(vector signed long long, vector signed long long)
vector bool long long vec_cmple(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmplt(vector double, vector double)
vector bool long long vec_cmplt(vector signed long long, vector signed long long)
vector bool long long vec_cmplt(vector unsigned long long, vector unsigned long long)

llvm-svn: 240821
2015-06-26 19:27:20 +00:00
Artem Belevich d21e5c6684 [CUDA] Implemented __nvvm_atom_*_gen_* builtins.
Integer variants are implemented as atomicrmw or cmpxchg instructions.
Atomic add for floating point (__nvvm_atom_add_gen_f()) is implemented
as a call to an overloaded @llvm.nvvm.atomic.load.add.f32.* LVVM
intrinsic.

Differential Revision: http://reviews.llvm.org/D10666

llvm-svn: 240669
2015-06-25 18:29:42 +00:00
Bob Wilson 63c931443d Move the special-case check from r240462 into ARM-specific code.
This fixes a serious bug in r240462: checking the BuiltinID for
ARM::BI_MoveToCoprocessor* in EmitBuiltinExpr() ignores the fact that
each target has an overlapping range of the BuiltinID values. That check
can trigger for builtins from other targets, leading to very bad
behavior.

Part of the reason I did not implement r240462 this way to begin with is
the special handling of the last argument for Neon builtins. In this
change, I have factored out the check to see which builtins have that
extra argument into a new HasExtraNeonArgument() function. There is still
some awkwardness in having to check for those builtins in two separate
places, i.e., once to see if the extra argument is present and once to
generate the appropriate IR, but this seems much cleaner than my previous
patch.

llvm-svn: 240522
2015-06-24 06:05:20 +00:00
Bob Wilson 09aa90bbe1 PR22560: Fix argument order for ARM _MoveToCoprocessor builtins.
The Microsoft-extension _MoveToCoprocessor and _MoveToCoprocessor2
builtins take the register value to be moved as the first argument,
but the corresponding mcr and mcr2 LLVM intrinsics expect that value
to be the third argument. Handle this as a special case, while still
leaving those intrinsics as generic MSBuiltins. I considered the
alternative of handling these in EmitARMBuiltinExpr, but that does
not work well for the follow-up change that I'm going to make to improve
the error handling for PR22560 -- we need the GetBuiltinType() checks
for ICEArguments, and the ARM version of that code is only used for
Neon intrinsics where the last argument is special and not
checked in the normal way.

llvm-svn: 240462
2015-06-23 21:10:15 +00:00
Matt Arsenault 3ea39f9e78 AMDGPU: Fix places missed in rename
llvm-svn: 240148
2015-06-19 17:54:10 +00:00
Luke Cheeseman 59b2d83909 This patch implements clang support for the ACLE special register intrinsics
in section 10.1, __arm_{w,r}sr{,p,64}.

This includes arm_acle.h definitions with builtins and codegen to support
these, the intrinsics are implemented by generating read/write_register calls
which get appropriately lowered in the backend based on the register string
provided. SemaChecking is also implemented to fault invalid parameters.

Differential Revision: http://reviews.llvm.org/D9697

llvm-svn: 239737
2015-06-15 17:51:01 +00:00
Ahmed Bougacha 94df730f7d [CodeGen][NEON] Emit constants for "immediate" intrinsic arguments.
On ARM/AArch64, we currently always use EmitScalarExpr for the immediate
builtin arguments, instead of directly emitting the constant. When the
overflow sanitizer is enabled, this generates overflow intrinsics
instead of constants, breaking assumptions in various places.

Instead, use the knowledge of "immediates" to directly emit a constant:
- teach the tablegen backend to emit the "immediate" modifiers
- use those modifiers in the NEON CodeGen, on ARM and AArch64.

Fixes PR23517.

Differential Revision: http://reviews.llvm.org/D10045

llvm-svn: 239002
2015-06-04 01:43:41 +00:00
Nuno Lopes 1ba2d78b9a ubsan: Check for null pointers given to certain builtins, such
as memcpy, memset, memmove, and bzero.

Reviewed by: Richard Smith

Differential Revision: http://reviews.llvm.org/D9673

llvm-svn: 238657
2015-05-30 16:11:40 +00:00
Justin Bogner 20eb9d486c wip: Remove some unused functions
llvm-svn: 238538
2015-05-29 02:42:14 +00:00
David Blaikie 43f9bb7371 API update for streamlining of IRBuilder::CreateCall to just use ArrayRef/initializer_list+braced init
llvm-svn: 237625
2015-05-18 22:14:03 +00:00
Ulrich Weigand 5722c0f192 [SystemZ] Add support for z13 low-level vector builtins
This adds low-level builtins to allow access to all of the z13 vector
instructions.  Note that instructions whose semantics can be described
by standard C (including clang extensions) do not get any builtins.

For each instructions whose semantics *cannot* (fully) be described, we
define a builtin named __builtin_s390_<insn> that directly maps to this
instruction.  These are intended to be compatible with GCC.

For instructions that also set the condition code, the builtin will take
an extra argument of type "int *" at the end.  The integer pointed to by
this argument will be set to the post-instruction CC value.

For many instructions, the low-level builtin is mapped to the corresponding
LLVM IR intrinsic.  However, a number of instructions can be represented
in standard LLVM IR without requiring use of a target intrinsic.

Some instructions require immediate integer operands within a certain
range.  Those are verified at the Sema level.

Based on a patch by Richard Sandiford.

llvm-svn: 236532
2015-05-05 19:36:42 +00:00
David Blaikie fb901c7abf [opaque pointer type] more GEP API migrations
llvm-svn: 234097
2015-04-04 15:12:29 +00:00
Ulrich Weigand 3a610ebf1e [SystemZ] Support transactional execution on zEC12
The zEC12 provides the transactional-execution facility.  This is exposed
to users via a set of builtin routines on other compilers.  This patch
adds clang support to enable those builtins.  In partciular, the patch:

- enables the transactional-execution feature by default on zEC12
- allows to override presence of that feature via the -mhtm/-mno-htm options
- adds a predefined macro __HTM__ if the feature is enabled
- adds support for the transactional-execution GCC builtins
- adds Sema checking to verify the __builtin_tabort abort code
- adds the s390intrin.h header file (for GCC compatibility)
- adds s390 sections to the htmintrin.h and htmxlintrin.h header files

Since this is first use of target-specific intrinsics on the platform,
the patch creates the include/clang/Basic/BuiltinsSystemZ.def file and
hooks it up in TargetBuiltins.h and lib/Basic/Targets.cpp.

An associated LLVM patch adds the required LLVM IR intrinsics.

For reference, the transactional-execution instructions are documented
in the z/Architecture Principles of Operation for the zEC12:
http://publibfp.boulder.ibm.com/cgi-bin/bookmgr/download/DZ9ZR009.pdf
The associated builtins are documented in the GCC manual:
http://gcc.gnu.org/onlinedocs/gcc/S_002f390-System-z-Built-in-Functions.html
The htmxlintrin.h intrinsics provided for compatibility with the IBM XL
compiler are documented in the "z/OS XL C/C++ Programming Guide".

llvm-svn: 233804
2015-04-01 12:54:25 +00:00
Kit Barton e50adcb6b1 [PPC] Move argument range checks for HTM and crypto builtins to Sema
The argument range checks for the HTM and Crypto builtins were implemented in
CGBuiltin.cpp, not in Sema. This change moves them to the appropriate location
in SemaChecking.cpp. It requires the creation of a new method in the Sema class
to do checks for PPC-specific builtins.

http://reviews.llvm.org/D8672

llvm-svn: 233586
2015-03-30 19:40:59 +00:00
Kit Barton 8246f28237 Add Hardware Transactional Memory (HTM) Support
This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07
(POWER8). The intrinsic support is based on GCC one [1], with both 'PowerPC HTM
Low Level Built-in Functions' and 'PowerPC HTM High Level Inline Functions'
implemented.

Along with builtins a new driver switch is added to enable/disable HTM
instruction support (-mhtm) and a header with common definitions (mostly to
parse the TFHAR register value). The HTM switch also sets a preprocessor builtin
HTM.

The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on
powerpc64 and powerpc64le.

This is send along a llvm patch to enabled the builtins and option switch.

[1]
https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html

Phabricator Review: http://reviews.llvm.org/D8248

llvm-svn: 233205
2015-03-25 19:41:41 +00:00
David Majnemer c403a1ce32 MS ABI: Accept calls to an unprototyped declaration of _setjmp
This fixes PR22961.

llvm-svn: 232824
2015-03-20 17:03:35 +00:00
Chandler Carruth c66deafb73 [Modules] Implement __builtin_isinf_sign in Clang.
Somehow, we never managed to implement this fully. We could constant
fold it like crazy, including constant folding complex arguments, etc.
But if you actually needed to generate code for it, error.

I've implemented it using the somewhat obvious lowering. Happy for
suggestions on a more clever way to lower this.

Now, what you might ask does this have to do with modules? Fun story. So
it turns out that libstdc++ actually uses __builtin_isinf_sign to
implement std::isinf when in C++98 mode, but only inside of a template.
So if we're lucky, and we never instantiate that, everything is good.
But once we try to instantiate that template function, we need this
builtin. All of my customers at least are using C++11 and so they never
hit this code path.

But what does that have to do with modules? Fun story. So it turns out
that with modules we actually observe a bunch of bugs in libstdc++ where
their <cmath> header clobbers things exposed by <math.h>. To fix these,
we have to provide global function definitions to replace the macros
that C99 would have used. And it turns out that ::isinf needs to be
implemented using the exact semantics used by the C++98 variant of
std::isinf. And so I started to fix this bug in libstdc++ and ceased to
be able to compile libstdc++ with Clang.

The yaks are legion.

llvm-svn: 232778
2015-03-19 22:39:51 +00:00
David Majnemer ba3e5ecf07 MS ABI: Implement __GetExceptionInfo for std::make_exception_ptr
std::make_exception_ptr calls std::__GetExceptionInfo in order to figure
out how to properly copy the exception object.

Differential Revision: http://reviews.llvm.org/D8280

llvm-svn: 232188
2015-03-13 18:26:17 +00:00
Joerg Sonnenberger 27173288c2 Under duress, move check for target support of __builtin_setjmp/
__builtin_longjmp to Sema as requested by John McCall.

llvm-svn: 231986
2015-03-11 23:46:32 +00:00
Nemanja Ivanovic 55e757db4a Add Clang support for PPC cryptography builtins
Review: http://reviews.llvm.org/D7951

llvm-svn: 231291
2015-03-04 21:48:22 +00:00
Joerg Sonnenberger 244a577754 Adjust the changes from r230255 to bail out if the backend can't lower
__builtin_setjmp/__builtin_longjmp and don't fall back to the libc
functions.

llvm-svn: 231245
2015-03-04 14:25:35 +00:00
Juergen Ributzka 9baa03fc07 Lower _mm256_broadcastsi128_si256 directly to a vector shuffle.
Originally we were using the same GCC builtins to lower this AVX2 vector
intrinsic. Instead we will now lower it directly to a vector shuffle.

This will not only allow LLVM to generate better code, but it will also allow us
to remove the GCC intrinsics.

Reviewed by Andrea

This is related to rdar://problem/18742778.

llvm-svn: 231081
2015-03-03 17:22:53 +00:00
David Majnemer ced8bdf74a Sema: Parenthesized bound destructor member expressions can be called
We would wrongfully reject (a.~A)() in both the destructor and
pseudo-destructor cases.

This fixes PR22668.

llvm-svn: 230512
2015-02-25 17:36:15 +00:00
Joerg Sonnenberger 096feeb741 Only lower __builtin_setjmp / __builtin_longjmp to
llvm.eh.sjlj.setjmp / llvm.eh.sjlj.longjmp, if the backend is known to
support them outside the Exception Handling context. The default
handling in LLVM codegen doesn't work and will create incorrect code.
The ARM backend on the other hand will assert if the intrinsics are
used.

llvm-svn: 230255
2015-02-23 20:23:47 +00:00
Craig Topper 96f9a573b5 [X86] Convert palignr builtin handling to use shuffle form of right shift instead of intrinsics. This should allow the instrinsics to removed from the backend.
llvm-svn: 229474
2015-02-17 07:18:01 +00:00
Craig Topper 480e2b6e43 [X86] Merge the 2 separate builtin handlers for PALIGNR into a single one that handles both.
llvm-svn: 229469
2015-02-17 06:37:58 +00:00
Craig Topper e994b8edad [X86] Remove code that does custom handling of the builtin for MMX palignr. This code is unreachable since its already marked for non-custom handling in llvm's IntrinsicsX86.td file.
llvm-svn: 229468
2015-02-17 06:22:50 +00:00
Craig Topper d2f814dca4 [X86] Remove completely unnecessary switch statement.
llvm-svn: 229435
2015-02-16 21:30:08 +00:00
Craig Topper 370644f66e [X86] Teach clang to lower __builtin_ia32_psrldqi256 and __builtin_ia32_pslldqi256 to vector shuffles the backend recognizes. This is a step towards removing the corresponding intrinsics from the backend.
llvm-svn: 229348
2015-02-16 00:42:49 +00:00
Reid Kleckner 1fcccdd78e Fix build break, these builtins don't exist
llvm-svn: 228241
2015-02-05 00:24:57 +00:00
Reid Kleckner 8a8c129a4b Do the same IRgen for __builtin_pow* as for pow*
There's no reason for these to be different.

llvm-svn: 228240
2015-02-05 00:18:01 +00:00
Reid Kleckner aca01db706 Implement IRGen for SEH __finally and AbnormalTermination
Previously we would simply double-emit the body of the __finally block,
but that doesn't work when it contains any kind of Decl, which we can't
double emit.

This fixes that by emitting the block once and branching into a shared
code region and then branching back out.

llvm-svn: 228222
2015-02-04 22:37:07 +00:00
David Majnemer 310e3a8f60 MS ABI: Implement proper support for setjmp
On targets which use the MSVCRT, setjmp is a macro which expands to
_setjmp or _setjmpex.

_setjmp and _setjmpex have a secret, hidden argument which is not listed
in the function prototype on X64 and WoA.  This hidden argument always
seems to be the frame pointer.

_setjmpex isn't used on X86, _setjmp is magically replaced with a call
to _setjmp3.  The second argument is zero for 'normal' setjmp/longjmp
pairs, otherwise it is a count of additional variadic arguments.  This
is used when setjmp appears inside of a try or __try.

It is not safe to use a pointer to setjmp because _setjmp, _setjmpex and
_setmp3 are not compatible with setjmp.

llvm-svn: 227426
2015-01-29 09:29:21 +00:00
Pete Cooper f051cbf631 Don't generate llvm.expect intrinsics with -O0.
The backend won't run LowerExpect on -O0.  In a debug LTO build, this results in llvm.expect intrinsics being in the LTO IR which doesn't know how to optimize them.

Thanks to Chandler for the suggestion and review.

Differential revision: http://reviews.llvm.org/D7183

llvm-svn: 227135
2015-01-26 20:51:58 +00:00
Reid Kleckner 1d59f99f5c Initial support for Win64 SEH IR emission
The lowering looks a lot like normal EH lowering, with the exception
that the exceptions are caught by executing filter expression code
instead of matching typeinfo globals. The filter expressions are
outlined into functions which are used in landingpad clauses where
typeinfo would normally go.

Major aspects that still need work:
- Non-call exceptions in __try bodies won't work yet. The plan is to
  outline the __try block in the frontend to keep things simple.
- Filter expressions cannot use local variables until capturing is
  implemented.
- __finally blocks will not run after exceptions. Fixing this requires
  work in the LLVM SEH preparation pass.

The IR lowering looks like this:

// C code:
bool safe_div(int n, int d, int *r) {
  __try {
    *r = normal_div(n, d);
  } __except(_exception_code() == EXCEPTION_INT_DIVIDE_BY_ZERO) {
    return false;
  }
  return true;
}

; LLVM IR:
define i32 @filter(i8* %e, i8* %fp) {
  %ehptrs = bitcast i8* %e to i32**
  %ehrec = load i32** %ehptrs
  %code = load i32* %ehrec
  %matches = icmp eq i32 %code, i32 u0xC0000094
  %matches.i32 = zext i1 %matches to i32
  ret i32 %matches.i32
}

define i1 zeroext @safe_div(i32 %n, i32 %d, i32* %r) {
  %rr = invoke i32 @normal_div(i32 %n, i32 %d)
      to label %normal unwind to label %lpad

normal:
  store i32 %rr, i32* %r
  ret i1 1

lpad:
  %ehvals = landingpad {i8*, i32} personality i32 (...)* @__C_specific_handler
      catch i8* bitcast (i32 (i8*, i8*)* @filter to i8*)
  %ehptr = extractvalue {i8*, i32} %ehvals, i32 0
  %sel = extractvalue {i8*, i32} %ehvals, i32 1
  %filter_sel = call i32 @llvm.eh.seh.typeid.for(i8* bitcast (i32 (i8*, i8*)* @filter to i8*))
  %matches = icmp eq i32 %sel, %filter_sel
  br i1 %matches, label %eh.except, label %eh.resume

eh.except:
  ret i1 false

eh.resume:
  resume
}

Reviewers: rjmccall, rsmith, majnemer

Differential Revision: http://reviews.llvm.org/D5607

llvm-svn: 226760
2015-01-22 01:36:17 +00:00
Matt Arsenault 6365ffea3e Add __builtin_amdgpu_class
llvm-svn: 225314
2015-01-06 23:14:57 +00:00
Tom Stellard d8e38a3206 R600: Handle amdgcn triple
For now there is no difference between amdgcn and r600.

llvm-svn: 225294
2015-01-06 20:34:47 +00:00
Craig Topper 2094d8fe88 [x86] Add the (v)cmpps/pd/ss/sd builtins to match gcc. Use them in the sse intrinsic files.
This still lower to the same intrinsics as before.

This is preparation for bounds checking the immediate on the avx version of the builtin so we don't pass illegal immediates into the backend. Since SSE uses a smaller size immediate its not possible to bounds check when using a shared builtin. Rather than creating a clang specific builtin for the different immediate, I decided (after consulting with Chandler) that it was better to match gcc.

llvm-svn: 224879
2014-12-27 06:59:57 +00:00
Saleem Abdulrasool 86b881c63e CodeGen: implement __emit intrinsic
For MSVC compatibility, add the `__emit' builtin. This is used in the Windows
SDK headers, and must therefore be implemented as a builtin rather than an
intrinsic.

The `__emit' builtin provides a mechanism to emit a 16-bit opcode instruction
into the stream. The value must be a compile time constant expression. No
guarantees are made about the CPU and memory states after the execution of the
instruction.

Due to the unchecked nature of the builtin, only support this on Windows on ARM.

llvm-svn: 224438
2014-12-17 17:52:30 +00:00
Peter Collingbourne f770683f14 Implement the __builtin_call_with_static_chain GNU extension.
The extension has the following syntax:

  __builtin_call_with_static_chain(Call, Chain)
  where Call must be a function call expression and Chain must be of pointer type

This extension performs a function call Call with a static chain pointer
Chain passed to the callee in a designated register. This is useful for
calling foreign language functions whose ABI uses static chain pointers
(e.g. to implement closures).

Differential Revision: http://reviews.llvm.org/D6332

llvm-svn: 224167
2014-12-12 23:41:25 +00:00
Duncan P. N. Exon Smith fb49491477 IR: Update clang for Metadata/Value split in r223802
Match LLVM API changes from r223802.

llvm-svn: 223803
2014-12-09 18:39:32 +00:00
Saleem Abdulrasool a14ac3f437 CodeGen: refactor ARM builtin handling
Create a helper function to construct a value for the ARM hint intrinsic
rather than inling the construction.  In order to avoid the use of the sentinel
value, inline the use of intrinsic instruction retrieval.  NFC.

llvm-svn: 223338
2014-12-04 04:52:37 +00:00
Reid Kleckner ee7cf84c8f Use nullptr to silence -Wsentinel when self-hosting on Windows
Richard rejected my Sema change to interpret an integer literal zero in
a varargs context as a null pointer, so -Wsentinel sees an integer
literal zero and fires off a warning. Only CodeGen currently knows that
it promotes integer literal zeroes in this context to pointer size on
Windows.  I didn't want to teach -Wsentinel about that compatibility
hack. Therefore, I'm migrating to C++11 nullptr.

llvm-svn: 223079
2014-12-01 22:02:27 +00:00
Bill Schmidt 9ec8cea02b [PowerPC] Add vec_vsx_ld and vec_vsx_st intrinsics
This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for
PowerPC, which provide programmer access to the lxvd2x, lxvw4x,
stxvd2x, and stxvw4x instructions.

New code in altivec.h defines these in terms of new builtins, which
are themselves defined in BuiltinsPPC.def.  The builtins are converted
to LLVM intrinsics in CGBuiltin.cpp.  Additional code is added to
builtins-ppc-vsx.c to verify the correct generation of the intrinsics.

Note that I moved the other VSX builtins so all VSX builtins will be
alphabetical in their own section in BuiltinsPPC.def.

There is a companion patch for LLVM.

llvm-svn: 221768
2014-11-12 04:19:56 +00:00
Alexey Samsonov e396bfc064 Bundle conditions checked by UBSan with sanitizer kinds they implement.
Summary:
This change makes CodeGenFunction::EmitCheck() take several
conditions that needs to be checked (all of them need to be true),
together with sanitizer kinds these checks are for. This would allow
to split one call into UBSan runtime into several calls in case
different sanitizer kinds would have different recoverability
settings.

Tests should be fixed accordingly, I'm working on it.

Test Plan: regression test suite.

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D6219

llvm-svn: 221716
2014-11-11 22:03:54 +00:00
Alexey Samsonov 4c1a96f519 Propagate SanitizerKind into CodeGenFunction::EmitCheck() call.
Make sure CodeGenFunction::EmitCheck() knows which sanitizer
it emits check for. Make CheckRecoverableKind enum an
implementation detail and move it away from header.

Currently CheckRecoverableKind is determined by the type of
sanitizer ("unreachable" and "return" are unrecoverable,
"vptr" is always-recoverable, all the rest are recoverable).
This will change in future if we allow to specify which sanitizers
are recoverable, and which are not by -fsanitize-recover= flag.

No functionality change.

llvm-svn: 221635
2014-11-10 22:27:30 +00:00
Alexey Samsonov edf99a92c0 Introduce a SanitizerKind enum to LangOptions.
Use the bitmask to store the set of enabled sanitizers instead of a
bitfield. On the negative side, it makes syntax for querying the
set of enabled sanitizers a bit more clunky. On the positive side, we
will be able to use SanitizerKind to eventually implement the
new semantics for -fsanitize-recover= flag, that would allow us
to make some sanitizers recoverable, and some non-recoverable.

No functionality change.

llvm-svn: 221558
2014-11-07 22:29:38 +00:00
Reid Kleckner 06ea7d6213 Lower __builtin_fabs* to @llvm.fabs.*
mingw64's headers implement fabs by calling __builtin_fabs, so using the
library call results in an infinite loop. If the backend legalizes
@llvm.fabs as a call to fabs later, things should work out, as the crt
provides a definition.

llvm-svn: 221206
2014-11-03 23:52:09 +00:00
Reid Kleckner 4cad00abf3 Remove dead AST type argument to EmitFAbs
llvm-svn: 221205
2014-11-03 23:51:40 +00:00
Alexey Samsonov 035462c1cf Get rid of SanitizerOptions::Disabled global. NFC.
SanitizerOptions is not even a POD now, so having global variable of
this type, is not nice. Instead, provide a regular constructor and clear()
method, and let each CodeGenFunction has its own copy of SanitizerOptions
it uses.

llvm-svn: 220920
2014-10-30 19:33:44 +00:00
Saleem Abdulrasool a25fbef088 CodeGen: add __readfsdword builtin
The Windows NT SDK uses __readfsdword and declares it as a compiler provided
builtin (#pragma intrinsic(__readfsword).  Because intrin.h is not referenced
by winnt.h, it is not possible to provide an out-of-line definition for the
intrinsic.  Provide a proper compiler builtin definition.

llvm-svn: 220859
2014-10-29 16:35:41 +00:00
Matt Arsenault 2174a9dc28 R600: Update for div_fmas intrinsic change
llvm-svn: 220339
2014-10-21 22:21:41 +00:00
Hal Finkel d2208b59cf Add __sync_fetch_and_nand (again)
Prior to GCC 4.4, __sync_fetch_and_nand was implemented as:

  { tmp = *ptr; *ptr = ~tmp & value; return tmp; }

but this was changed in GCC 4.4 to be:

  { tmp = *ptr; *ptr = ~(tmp & value); return tmp; }

in response to this change, support for sync_fetch_and_nand (and
sync_nand_and_fetch) was removed in r99522 in order to avoid miscompiling code
depending on the old semantics. However, at this point:

  1. Many years have passed, and the amount of code relying on the old
     semantics is likely smaller.

  2. Through the work of many contributors, all LLVM backends have been updated
     such that "atomicrmw nand" provides the newer GCC 4.4+ semantics (this process
     was complete July of 2014 (added to the release notes in r212635).

  3. The lack of this intrinsic is now a needless impediment to porting codes
     from GCC to Clang (I've now seen several examples of this).

It is true, however, that we still set GNUC_MINOR to 2 (corresponding to GCC
4.2). To compensate for this, and to address the original concern regarding
code relying on the old semantics, I've added a warning that specifically
details the fact that the semantics have changed and that we provide the newer
semantics.

Fixes PR8842.

llvm-svn: 218905
2014-10-02 20:53:50 +00:00
Jan Vesely b4379f9c2c CGBuiltin: Use frem instruction rather than libcall to implement fmod
AFAICT the semantics of frem match libm's fmod.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 218488
2014-09-26 01:19:41 +00:00
Hal Finkel bcc06085a8 Add __builtin_assume and __builtin_assume_aligned using @llvm.assume.
This makes use of the recently-added @llvm.assume intrinsic to implement a
__builtin_assume(bool) intrinsic (to provide additional information to the
optimizer). This hooks up __assume in MS-compatibility mode to mirror
__builtin_assume (the semantics have been intentionally kept compatible), and
implements GCC's __builtin_assume_aligned as assume((p - o) & mask == 0). LLVM
now contains special logic to deal with assumptions of this form.

llvm-svn: 217349
2014-09-07 22:58:14 +00:00
James Molloy 163b1ba471 [ARMv8] Add support for 32-bit MIN/MAXNM and directed rounding.
This patch adds support for the 32bit numeric max/min and directed round-to-integral NEON intrinsics that were added as part of v8, along with unit tests.

Patch by Graham Hunter!

llvm-svn: 217242
2014-09-05 13:50:34 +00:00
Tom Stellard c4e0c1075b CGBuiltin: Use @llvm.fabs rather than fabs libcall when emitting builtins
Using the intrinsic allows the SelectionDAGBuilder to turn this call
into the FABS Node and also the intrinsic is something the vectorizer knows
how to vectorize.

This patch also sets the readnone attribute on this call, which should
enable additional optmizations.

llvm-svn: 217042
2014-09-03 15:24:29 +00:00
Craig Topper 5fc8fc2d31 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
llvm-svn: 216528
2014-08-27 06:28:36 +00:00
Yi Kong 1d268af094 ARM: Add dbg builtin intrinsic
llvm-svn: 216452
2014-08-26 12:48:06 +00:00
Hal Finkel 6208251923 Implement __builtin_signbitl for PowerPC
PowerPC uses the special PPC_FP128 type for long double on Linux, which is
composed of two 64-bit doubles. The higher-order double (which contains the
overall sign) comes first, and so the __builtin_signbitl implementation
requires special handling to extract the sign bit.

Fixes PR20691.

llvm-svn: 216341
2014-08-24 03:47:06 +00:00
Alexey Samsonov 70b9c01bd4 Pass expressions instead of argument ranges to EmitCall/EmitCXXConstructorCall.
Summary:
This is a first small step towards passing generic "Expr" instead of
ArgBeg/ArgEnd pair into EmitCallArgs() family of methods. Having "Expr" will
allow us to get the corresponding FunctionDecl and its ParmVarDecls,
thus allowing us to alter CodeGen depending on the function/parameter
attributes.

No functionality change.

Test Plan: regression test suite

Reviewers: rnk

Reviewed By: rnk

Subscribers: aemerson, cfe-commits

Differential Revision: http://reviews.llvm.org/D4915

llvm-svn: 216214
2014-08-21 20:26:47 +00:00
Matt Arsenault dbb84916d9 R600: Add ldexp intrinsic
llvm-svn: 215738
2014-08-15 17:44:32 +00:00
Yi Kong a5548431a5 AArch64: Prefetch intrinsic
llvm-svn: 215569
2014-08-13 19:18:20 +00:00
Yi Kong 26d104a9ec ARM: Prefetch intrinsics
llvm-svn: 215568
2014-08-13 19:18:14 +00:00
Yi Kong 1083eb5c11 AArch64: Resolve some FIXMEs in CGBuiltin left over from backend merge
Merge vrshr_n_v and vqshlu_n_v with ARM.

Remove FIXME comments for others as they can't actually be shared.

NFC.

Differential Revision: http://reviews.llvm.org/D4697

llvm-svn: 214173
2014-07-29 09:25:17 +00:00
Tim Northover 40956e64f2 AArch64: update Clang for merged arm64/aarch64 triples.
The main subtlety here is that the Darwin tools still need to be given "-arch
arm64" rather than "-arch aarch64". Fortunately this already goes via a custom
function to handle weird edge-cases in other architectures, and it tested.

I removed a few arm64_be tests because that really isn't an interesting thing
to worry about. No-one using big-endian is also referring to the target as
arm64 (at least as far as toolchains go). Mostly they date from when arm64 was
a separate target and we *did* need a parallel name simply to test it at all.
Now aarch64_be is sufficient.

llvm-svn: 213744
2014-07-23 12:32:58 +00:00
Alexey Samsonov 24cad99307 [UBSan] Add !nosanitize metadata to the code generated by UBSan.
This is used to mark the instructions emitted by Clang to implement
variety of UBSan checks. Generally, we don't want to instrument these
instructions with another sanitizers (like ASan).

Reviewed in http://reviews.llvm.org/D4544

llvm-svn: 213291
2014-07-17 18:46:27 +00:00
Hal Finkel 3e49fda0d4 Add basic (noop) CodeGen support for __assume
Clang supports __assume, at least at the semantic level, when MS extensions are
enabled. Unfortunately, trying to actually compile code using __assume would
result in this error:

  error: cannot compile this builtin function yet

__assume is an optimizer hint, and can be ignored at the IR level. Until LLVM
supports assumptions at the IR level, a noop lowering is valid, and that is
what is done here.

llvm-svn: 213206
2014-07-16 22:44:54 +00:00
Matt Arsenault 8587711164 Add codegen for more R600 builtins
llvm-svn: 213079
2014-07-15 17:23:46 +00:00
Yi Kong 4d5e23f53a ARM: Implement __builtin_arm_nop intrinsic
This patch implements __builtin_arm_nop intrinsic for AArch32 and AArch64,
which generates hint 0x0, the alias of NOP instruction.

This intrinsic is necessary to implement ACLE __nop intrinsic.

Differential Revision: http://reviews.llvm.org/D4495

llvm-svn: 212947
2014-07-14 15:20:09 +00:00
Saleem Abdulrasool 572250d60a CodeGen: support hint intrinsics from ACLE on AArch64
This adds support for the ACLE hint intrinsics on AArch64 similar to ARM.  This
is required to properly support ACLE on AArch64.

llvm-svn: 212890
2014-07-12 23:27:22 +00:00
Reid Kleckner ed5d4adb36 MS extension: Make __noop be the integer zero, not void
We still don't accept '__noop;', and we don't consider __noop to be the
integer literal zero.  More work is needed.

llvm-svn: 212839
2014-07-11 20:22:55 +00:00
Saleem Abdulrasool e700cab4e9 CodeGen: add support for a few MSVC ARM intrinsics
This adds support for simple MSVC compatibility mode intrinsics.  These
intrinsics are simple in that they are either directly passed through to the
annotated MSBuiltin intrinsic or they mirror existing GCC builtins.

llvm-svn: 212378
2014-07-05 20:10:05 +00:00
Saleem Abdulrasool 96bfda8dbc CodeGen: add support for MSBuiltin aliases
This completes the infrastructure for the new MSBuiltin aliases in the
instruction definitions.  These behave similar to the GCCBuiltin in that they
can be implicitly constructed without special handling unless needed.

With this change it is possible to annotate an LLVM intrinsic in the backend
instruction definitions and indicate it as a builtin in the Builtin*.def files
in clang via LANGBUILTIN.  That will automatically pass through the instruction
much as a GCCBuiltin.

Note that there is no need for the special handling for ensuring that the
compatibility flag is enabled since the filtering on the LANGBUILTIN will
automatically prevent the intrinsic from bleeding into non-MS compatible
compiler invocations.

llvm-svn: 212359
2014-07-04 21:49:39 +00:00
Saleem Abdulrasool ece7217f70 ARM: rename ARM builtins to use __builtin_arm prefix
This corrects SVN r212196's naming change to use the proper prefix of
`__builtin_arm_` instead of `__builtin_`.

Thanks to Yi Kong for pointing out the incorrect naming!

llvm-svn: 212253
2014-07-03 02:43:20 +00:00
Saleem Abdulrasool 4bddd9d400 CodeGen: make target builtins support languages
This extends the target builtin support to allow language specific annotations
(i.e. LANGBUILTIN).  This is to allow MSVC compatibility whilst retaining the
ability to have EABI targets use a __builtin_ prefix.  This is merely to allow
uniformity in the EABI case where the unprefixed name is provided as an alias in
the header.

llvm-svn: 212196
2014-07-02 17:41:27 +00:00
Tim Northover 3acd6bd0b6 ARM: add support for v8 ldaex/stlex builtins.
ARMv8 adds (to both AArch32 and AArch64) acquiring and releasing
variants of the exclusive operations, in line with the C++11 memory
model.

This adds support for two new intrinsics to expose them to C & C++
developers directly: __builtin_arm_ldaex and __builtin_arm_stlex, in
direct analogy with the versions with no implicit barrier.

rdar://problem/15885451

llvm-svn: 212175
2014-07-02 12:56:02 +00:00
Craig Topper 00bbdcf9b3 Remove llvm:: from uses of ArrayRef.
llvm-svn: 211987
2014-06-28 23:22:23 +00:00
Matt Arsenault 56f008d538 Add R600 builtin codegen.
llvm-svn: 211631
2014-06-24 20:45:01 +00:00
Tim Northover 6ea28bdef5 ARM: remove dead CodeGen functions.
These two are no longer being used by NEON codegen.

llvm-svn: 211586
2014-06-24 12:07:44 +00:00
Jim Grosbach e59c43dc21 Fix spelling. s/overloaed/overloaded/
llvm-svn: 211530
2014-06-23 20:28:43 +00:00
Saleem Abdulrasool 114efe0dc8 CodeGen: improve ms instrincics support
Add support for _InterlockedCompareExchangePointer, _InterlockExchangePointer,
_InterlockExchange.  These are available as a compiler intrinsic on ARM and x86.
These are used directly by the Windows SDK headers without use of the intrin
header.

llvm-svn: 211216
2014-06-18 20:51:10 +00:00
Jim Grosbach 79140826bc AArch64: Support for __builtin_arm_rbit() and __builtin_arm_rbit64().
__builtin_arm_rbit() and __builtin_arm_rbit64().

rdar://9283021

llvm-svn: 211060
2014-06-16 21:56:02 +00:00
Jim Grosbach 171ec34544 ARM: Support for __builtin_arm_rbit() intrinsic.
Reverse the bits in a word. Maps to the RBIT instruction.

rdar://9283021

llvm-svn: 211059
2014-06-16 21:55:58 +00:00
Tim Northover b49b04bbe0 IR-change: cmpxchg operations now return { iN, i1 }.
This is a minimal fix for clang. I'll soon add support for generating
weak variants when requested, but that's not really necessary for the
LLVM change in isolation.

llvm-svn: 210907
2014-06-13 14:24:59 +00:00
Richard Smith 760520bcb7 Add __builtin_operator_new and __builtin_operator_delete, which act like calls
to the normal non-placement ::operator new and ::operator delete, but allow
optimizations like new-expressions and delete-expressions do.

llvm-svn: 210137
2014-06-03 23:27:44 +00:00
Michael J. Spencer 5ce26687f2 [CodeGen] Don't use SizeTy for EmitNeonSplat.
llvm-svn: 210042
2014-06-02 19:48:59 +00:00
Michael J. Spencer dd59775f06 [CodeGen] Don't cast and use SizeTy instead of Int32Ty when constructing {extract,insert} vector element instructions.
llvm-svn: 209942
2014-05-31 00:22:12 +00:00
Tim Northover 573cbee543 AArch64/ARM64: rename ARM64 components to AArch64
This keeps Clang consistent with backend naming conventions.

llvm-svn: 209579
2014-05-24 12:52:07 +00:00
Tim Northover 25e8a6754e AArch64/ARM64: update Clang after AArch64 removal.
A few (mostly CodeGen) parts of Clang were tightly coupled to the
AArch64 backend. Now that it's gone, they will not even compile.

I've also deduplicated RUN lines in many of the AArch64 tests. This
might improve "make check-all" time noticably: some of those NEON
tests were monsters.

llvm-svn: 209578
2014-05-24 12:51:25 +00:00
Craig Topper 8a13c4180e [C++11] Use 'nullptr'. CodeGen edition.
llvm-svn: 209272
2014-05-21 05:09:00 +00:00
Hao Liu 9f9492b657 [ARM64]Fix the bug right shift uint64_t by 64 generates incorrect result.
llvm-svn: 208761
2014-05-14 08:59:30 +00:00
Saleem Abdulrasool 956c2ec532 CodeGen: complete ARM ACLE hint 8.4 support
Add support for the remaining hints from the ACLE.  Although __dbg is listed as
a hint, it is handled different, so it is not covered by this change.

llvm-svn: 207930
2014-05-04 02:52:25 +00:00
Saleem Abdulrasool 38ed6de3a0 CodeGen: rename __builtin_arm_sevl to __sevl
ACLE adds the __sevl() extension.  Rename the hint from a custom name to the
ACLE specified name.

llvm-svn: 207829
2014-05-02 06:53:57 +00:00
James Molloy fa40368d9d [ARM64] Add arm64_be where it was accidentally missed from a bunch of if-conditions.
I think this is the last commit for ARM64 big endian in clang. This commit makes
arm_neon.h compile correctly.

llvm-svn: 207624
2014-04-30 10:11:40 +00:00
Hao Liu a19a2e2da6 [ARM64]Fix a bug cannot select UQSHL/SQSHL with constant i64 shift amount.
llvm-svn: 207401
2014-04-28 07:36:12 +00:00
Saleem Abdulrasool 2b99f2f7a4 CodeGen: remove an unused variable
llvm-svn: 207390
2014-04-28 02:29:11 +00:00
Sylvestre Ledru 464902589e remove useless code
llvm-svn: 207360
2014-04-27 14:57:31 +00:00
Saleem Abdulrasool b9f07e3dbc CodeGen: add __yield intrinsic for ARM
The __yield intrinsic generates a hint instruction to indicate that the thread
is not performing any useful operations at the moment.  This is for
compatibility with MSVC, although, the intrinsic is also part of the ACLE, and
is enabled globally as a result.

llvm-svn: 207275
2014-04-25 21:13:29 +00:00
Saleem Abdulrasool 0fd930e86c CodeGen: replace use of @llvm.arm.sevl with @llvm.arm.hint
Use the new generic @llvm.arm.hint hint intrinsic rather than the specialised
@llvm.arm.sevl hint instruction.

llvm-svn: 207243
2014-04-25 17:25:46 +00:00
Tim Northover b17f9a4609 ARM64: add a few bits of polynomial intrinsic codegen.
llvm-svn: 205303
2014-04-01 12:23:08 +00:00
Tim Northover 74b2def0c5 ARM64: add missing ldN/stN intrinsics and enable tests.
llvm-svn: 205296
2014-04-01 10:37:47 +00:00
Tim Northover 0c68faa455 ARM64: enable aarch64-neon-intrinsics.c test
This adds support for the various NEON intrinsics used by
aarch64-neon-intrinsics.c (originally written for AArch64) and enables the
test.

My implementations are designed to be semantically correct, the actual code
quality looks like its a wash between the two backends, and is frequently
different (hence the large number of CHECK changes).

llvm-svn: 205210
2014-03-31 15:47:09 +00:00
Dmitri Gribenko f461da5306 Remove unused variable
llvm-svn: 205169
2014-03-31 07:52:35 +00:00
Tim Northover 6166ec21be ARM64: remove currently trivial switch statement
llvm-svn: 205167
2014-03-31 07:20:13 +00:00
Tim Northover 97606edd75 ARM64: Fix GCC warning in CGBuiltin.cpp
llvm-svn: 205104
2014-03-29 15:26:07 +00:00
Tim Northover a2ee433c8d ARM64: initial clang support commit.
This adds Clang support for the ARM64 backend. There are definitely
still some rough edges, so please bring up any issues you see with
this patch.

As with the LLVM commit though, we think it'll be more useful for
merging with AArch64 from within the tree.

llvm-svn: 205100
2014-03-29 15:09:45 +00:00
Christian Pirker f01cd6f57b Add ARM big endian Target (armeb, thumbeb)
Reviewed at http://llvm-reviews.chandlerc.com/D3096

llvm-svn: 205008
2014-03-28 14:40:46 +00:00
Reid Kleckner 597e81dea1 -fms-extensions: Add __va_start builtin, which is used for x64
The main difference between __va_start and __builtin_va_start is that
the address of the va_list has already been taken, and the va_list is
always a char*.

__va_end and __va_arg are not needed.

llvm-svn: 204821
2014-03-26 15:38:33 +00:00
Renato Golin c491a8d457 Add support for __builtin___clear_cache in Clang
Adding the mapping between __builtin___clear_cache into @llvm.clear_cache

llvm-svn: 204820
2014-03-26 15:36:05 +00:00
Timur Iskhodzhanov f7af2e6de8 Fix a compile-time warning
lib/CodeGen/CGBuiltin.cpp:3136:12: warning: variable ‘TblPos’ set but not used [-Wunused-but-set-variable]

llvm-svn: 204599
2014-03-24 11:09:01 +00:00
Arnaud A. de Grandmaison 6756a497a1 Cleanup dead assignments reported by scan-build
llvm-svn: 204569
2014-03-23 20:28:07 +00:00
Tim Northover 0622b3a67a Update for IR: add a second AtomicOrdering to cmpxchg insts.
rdar://problem/15996804

llvm-svn: 203560
2014-03-11 10:49:03 +00:00
Ted Kremenek 90097491ed Remove 'break' dominated by 'return' in 'EmitBuiltinExpr'.
llvm-svn: 203080
2014-03-06 05:37:38 +00:00
Tim Northover b44e080dbb AArch64: use less cluttered intrinsic for vtbl/vtbx
The table is always 128-bit so there's no reason to specify it every time we
want the intrinsic.

llvm-svn: 202259
2014-02-26 11:55:15 +00:00
Tim Northover 2df47cedeb AArch64: use different type modifier in arm_neon.td
The 'f' modifier is designed for integer type arguments really (according to
its documentation). It's better to use the "half width, same number" modifier.

Should be no user-visible change.

llvm-svn: 202152
2014-02-25 13:53:01 +00:00
Christian Pirker 9b019ae899 Add AArch64 big endian Target (aarch64_be)
llvm-svn: 202151
2014-02-25 13:51:00 +00:00
Warren Hunt 20e4a5d2af Reapply 201734 but with appropriate gcc compatibility
Because GCC incorrectly defines _mm_prefetch to take anything that casts 
to void*, people have started using that behavior.  The previous patch 
that made _mm_prefetch actually take a const char * broke compatibility 
with existing code.  This update to the patch leaves the macro that 
defines _mm_prefetch with the (void*) cast when _MSC_VER is not defined.

llvm-svn: 201901
2014-02-21 23:08:53 +00:00
Tim Northover a0c95eb2d6 Remove commas at the end of lists (C++11 again)
llvm-svn: 201849
2014-02-21 12:16:59 +00:00
Tim Northover 8fe03d6111 ARM & AArch64: use table for EmitCommonNeonBuiltinExpr
This extends the intrinsic lookup table format slightly, and adds
entries for use the shared ARM/AArch64 definitions. The benefit is
currently smaller than for the SISD intrinsics (there's more custom
code implementing this set), but a few lines are saved and there's
scope for future expansion.

llvm-svn: 201848
2014-02-21 11:57:24 +00:00
Tim Northover 2d83796860 AArch64: refactor table-driven NEON lookup.
This extracts the table-driven intrinsic lookup phase into a separate
function, to be used by EmitCommonNeonBuiltinExpr soon.

It also simplifies the logic used in that lookup, since VectorCastArgN
and ScalarArgN were actually identical.

llvm-svn: 201847
2014-02-21 11:57:20 +00:00
Daniel Jasper 2f0f297bdb Revert r201734 and r201742.
This breaks backwards compatibility with existing code. Previously, this
was defined as

  #define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))

Which basically accepts any pointer. Changing this to char* simply
breaks a lot of existing code. I have tried changing char* to
"const void*", which seems to be the right thing as per Intel
specification this should work on basically any pointer. However,
apparently this breaks windows compatibility (because of a conflicting
declaration in windows.h).

So, we probably need to #ifdef this based on whether clang is compiling
for windows. According to Chandler, this might be done by introducing an
additional symbol to a fake type in BuiltinsX86.def and then condition
the type expansion on the platform.

llvm-svn: 201775
2014-02-20 11:10:48 +00:00
Warren Hunt 40d6f29ad8 Add _mm_prefetch and some others as MS builtins
This patch adds several built-ins that are required for ms 
compatibility. _mm_prefetch must be a built-in because it takes a 
compile-time constant argument and our prior approach of using a #define 
to the current built-in doesn't work in the presence of re-declaration 
of _mm_prefetch. The others can be obtained by including the windows 
system headers. If a user includes the windows system headers but not 
intrin.h they still need to work and therefore must be built-in because 
we don't get a chance to implement them in intrin.h in this case.

llvm-svn: 201734
2014-02-19 23:20:20 +00:00
Tim Northover db3e5e2408 AArch64: look up EmitAArch64Scalar support before calling.
This fixes one immediate bug where an expression with side-effects
could be emitted twice during a NEON call.

It also prepares the way for folding CodeGen for many of the SISD
intrinsics into a table, reducing code size and hopefully increasing
performance eventually ("binary search + few switch cases" should be
better than "lots of switch cases").

llvm-svn: 201667
2014-02-19 11:55:06 +00:00
Tim Northover 0f6c9d0a9b ARM NEON: add vcvtX (with rounding mode) intrinsics to v8 ARM.
These instructions (well, the f32 ones) are supported on 32-bit ARMv8, not just
AArch64. Now that the arm_neon.td refactoring is complete, adding them is
surprisingly simple.

rdar://problem/16035743

llvm-svn: 201661
2014-02-19 10:37:13 +00:00
Tim Northover 1994fa7d3d ARM & AArch64 NEON: share the vabs implementation.
This changes ARM to use @llvm.fabs for floating-point vabs. Patterns
already existed in the backend, and it might help mid-end phases since
it's more likely to be understood than @llvm.arm.neon.vabs.

llvm-svn: 201313
2014-02-13 10:44:17 +00:00
Tim Northover 02b438754c AArch64: share slgihtly more NEON implementation with ARM.
The s64/u64 vcvt conversion operations are actually pretty much identical to
the s32/u32 ones in implementation, and can be shared with just one extra
variable.

llvm-svn: 201145
2014-02-11 11:27:44 +00:00
Tim Northover d23fc6cceb ARM: move vshll NEON implementation to common code
Now that both ARM backends use the same implementation for vshll operations,
the code can be shared. This is also a necessary LLVM/Clang interface update.

llvm-svn: 201094
2014-02-10 16:20:36 +00:00
Tim Northover a2e0a27d26 ARM: implement vshrn NEON intrinsic in terms of shr/trunc
Now the backend supports the natural LLVM IR, we can shamelessly steal the
AArch64 front-end code to implement the vshrn intrinsic on 32-bit ARM.

llvm-svn: 201086
2014-02-10 14:04:12 +00:00
Tim Northover 7ffb2c5523 ARM & AArch64: combine implementation of vcaXYZ intrinsics
Now that the back-end intrinsics are more regular, there's no need for the
special handling these got in the front-end, so they can be moved to
EmitCommonNeonBuiltinExpr.

llvm-svn: 200769
2014-02-04 14:55:52 +00:00
Tim Northover 02e38609e7 ARM: implement support for crypto intrinsics in arm_neon.h
llvm-svn: 200708
2014-02-03 17:28:04 +00:00
Tim Northover 51ab388266 AArch64: use new non-polymorphic crypto intrinsics
The LLVM backend now has invariant types on the various crypto-intrinsics,
because in all cases there's only really one interpretation.

llvm-svn: 200707
2014-02-03 17:28:00 +00:00
Tim Northover 5309111c22 ARM & AArch64: unify the rest of the completely shared NEON implementations
This should be the last routine patch: AArch64 does still delegate to
EmitARMBuiltinExpr, but the remaining instances have complications of
one sort or another so some more cunning thought will be needed.

llvm-svn: 200528
2014-01-31 10:46:52 +00:00
Tim Northover ba1e344d90 ARM & AArch64: another block of miscellaneous NEON sharing.
llvm-svn: 200527
2014-01-31 10:46:49 +00:00
Tim Northover 027b4ee607 ARM & AArch64: move shared vld/vst intrinsics to common implementation.
llvm-svn: 200526
2014-01-31 10:46:45 +00:00
Tim Northover 9d3ab5fe9f ARM & AArch64: more instructions into common block
llvm-svn: 200525
2014-01-31 10:46:41 +00:00
Tim Northover 61fc835d6e ARM & AArch64: merge another NEON block completely.
llvm-svn: 200524
2014-01-31 10:46:36 +00:00
Tim Northover 58c4474dea ARM & AArch64: extend shared NEON implementation to first block.
This extends the refactoring to the whole of the first block of
trivial correspondences (as a fairly arbitrary boundary).

llvm-svn: 200472
2014-01-30 14:48:01 +00:00
Tim Northover ac85c341ae ARM & AArch64: fully share NEON implementation of permutation intrinsics
As a starting point, this moves the CodeGen for NEON permutation
instructions (vtrn, vzip, vuzp) into a new shared function.

llvm-svn: 200471
2014-01-30 14:47:57 +00:00
Tim Northover c322f838bc ARM & AArch64: share the BI__builtin_neon enum defs.
llvm-svn: 200470
2014-01-30 14:47:51 +00:00
Kevin Qin ce1f0e85ba [AArch64 NEON] Fix a bug about vcles_f32 and vcled_f64.
As vcles_f32() and vcled_f64 are implemented by FCMGE, operands
should make a swap.

llvm-svn: 199866
2014-01-23 03:42:06 +00:00
Hao Liu f96fd37888 [AArch64]The compare to zero intrinsics should be implemented by 'icmp/fcmp' and 'sext' not 'zext'. Modify the implementation by replacing zext with sext.
llvm-svn: 197898
2013-12-23 02:44:00 +00:00
Chad Rosier 6030c84a2f [AArch64] Refactor NEON floating-point Max/Min/Maxnm/Minnm across vector AArch64
intrinsics to use f32 types, rather than their vector equivalents.

llvm-svn: 197091
2013-12-11 23:21:39 +00:00
Chad Rosier c520fce72d [AArch64] Add NEON scalar floating-point compare LLVM AArch64 intrinsics that
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197071
2013-12-11 21:03:56 +00:00
Chad Rosier edd4403510 [AArch64] Refactor the NEON scalar floating-point reciprocal step and
floating-point reciprocal square root step LLVM AArch64 intrinsics to
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197070
2013-12-11 21:03:54 +00:00
Chad Rosier 6ce4387c5c [AArch64] Refactor the NEON scalar floating-point reciprocal estimate, floating-
point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.

llvm-svn: 197069
2013-12-11 21:03:52 +00:00
Chad Rosier 17c248a7a2 [AArch64] Refactor the NEON floating-point absolute difference LLVM AArch64
intrinsic to use f32/f64 types, rather than their vector equivalents.

llvm-svn: 196969
2013-12-10 21:34:23 +00:00
Chad Rosier 37051a80e9 [AArch64] Refactor the NEON signed/unsigned floating-point convert to fixed-point
LLVM AArch64 intrinsics to use f32/f64, rather than their vector equivalents.

llvm-svn: 196968
2013-12-10 21:34:21 +00:00
Chad Rosier 8f6f3d124c [AArch64] Overload NEON signed/unsigned floating-point convert to fixed-point
and fixed-point convert to floating-point LLVM AArch64 intrinsics.

llvm-svn: 196967
2013-12-10 21:34:20 +00:00
Chad Rosier 11a78c86e1 [AArch64] Overload NEON signed/unsigned integer convert to floating-point
LLVM AArch64 intrinsics.

llvm-svn: 196966
2013-12-10 21:34:17 +00:00
Chad Rosier 8d96c803df [AArch64] Refactor the redundant code in the EmitAArch64ScalarBuiltinExpr()
function.  No functional change intended.

llvm-svn: 196936
2013-12-10 17:44:36 +00:00
Chad Rosier 58f6a1fee7 [AArch64] Refactor the Neon vector/scalar floating-point convert intrinsics so
that they use float/double rather than the vector equivalents when appropriate.

llvm-svn: 196931
2013-12-10 16:11:55 +00:00
Chad Rosier ff3b79aead [AArch64] Refactor the Neon vector/scalar floating-point convert implementation.
Specifically, reuse the ARM intrinsics when possible.

llvm-svn: 196927
2013-12-10 15:35:40 +00:00
Kevin Qin fb79d7f843 [AArch64 NEON] Support poly128_t and implement relevant intrinsic.
llvm-svn: 196888
2013-12-10 06:49:01 +00:00
Chad Rosier ce511f2fcb [AArch64] Refactor the NEON scalar reduce pairwise intrinsics so that they use
float/double rather than the vector equivalents when appropriate.

llvm-svn: 196836
2013-12-09 22:47:59 +00:00
Chad Rosier 01703584eb [AArch64] Refactor the NEON scalar reduce pairwise front-end codegen to remove
unnecessary patterns in tablegen.

llvm-svn: 196835
2013-12-09 22:47:57 +00:00
Chad Rosier ad3683c3cb [AArch64] Remove q and non-q intrinsic definitions from the NEON scalar reduce
pairwise implementation, using an overloaded definition instead.

llvm-svn: 196834
2013-12-09 22:47:55 +00:00
Hao Liu 844a7da243 [AArch64]Add missing pair intrinsics such as:
int32_t vminv_s32(int32x2_t a) 
which should be compiled into SMINP Vd.2S,Vn.2S,Vm.2S

llvm-svn: 196750
2013-12-09 03:52:22 +00:00
Kevin Qin ad53b87c70 [AArch64 NEON] Add ACLE intrinsic vceqz_f64.
llvm-svn: 196361
2013-12-04 08:02:11 +00:00
Kevin Qin 8903f8df4b [AArch64 NEON] Add missing compare intrinsics.
llvm-svn: 196359
2013-12-04 07:53:09 +00:00
Hao Liu a5246fde90 [AArch64]Add missing floating point convert, round and misc intrinsics.
E.g. int64x1_t vcvt_s64_f64(float64x1_t a) -> FCVTZS Dd, Dn

llvm-svn: 196211
2013-12-03 06:07:13 +00:00
Hao Liu 4b850c5e0d revert r196152.
This is a duplicate implementation.
E.g. this patch defines:
     float64_t vabd_f64(float64_t a, float64_t b)
But there is already a similar intrinsic "vabdd_f64" with the same types.
Also, this intrinsic will be conflicted to the vector type intrinsic as following(Which is implemented by me and will be committed to trunk):
     float64x1_t vabd_f64(float64x1_t a, float64x1_t b).
Two functions shouldn't have a same name in arm_neon.h.
According to ARM ACLE document, such vabd_f64 with float64_t is not existing.
So I revert this commit.

llvm-svn: 196205
2013-12-03 05:35:17 +00:00
Hao Liu ce258820ca AArch64: Add missing scalar pair intrinsics.
E.g. "float32_t vaddv_f32(float32x2_t a)" to be matched into "faddp s0, v1.2s".

llvm-svn: 196199
2013-12-03 03:40:08 +00:00
Chad Rosier b0574f3bf7 [AArch64] Add missing NEON scalar floating-point to integer convert ACLEs.
llvm-svn: 196152
2013-12-02 21:07:24 +00:00
Hao Liu 8a0099e02c Fix the problem that the range check for scalar narrow shift is too wide.
E.g. the immediate value of vshrns_n_s16 is [1,16], which should be [1,8].

llvm-svn: 195942
2013-11-29 02:13:17 +00:00
Chad Rosier 9e59285cc8 [AArch64] Add support for NEON scalar floating-point absolute difference.
llvm-svn: 195804
2013-11-27 01:46:19 +00:00
Chad Rosier 52e31b20cb [AArch64] Add support for NEON scalar floating-point to integer convert
instructions.

llvm-svn: 195789
2013-11-26 22:17:51 +00:00
Ana Pazos dbd1a22496 Implemented Neon scalar vdup_lane intrinsics.
Fixed scalar dup alias and added test case.

llvm-svn: 195329
2013-11-21 08:15:01 +00:00
Ana Pazos 2b02688fd9 Implemented Neon scalar by element intrinsics.
Intrinsics implemented: vqdmull_lane, vqdmulh_lane, vqrdmulh_lane,
vqdmlal_lane, vqdmlsl_lane scalar Neon intrinsics.

llvm-svn: 195326
2013-11-21 07:36:33 +00:00
Hao Liu 171cedf61e Implement AArch64 neon instructions class SIMD lsone and SIMD lone-post.
llvm-svn: 195079
2013-11-19 02:17:31 +00:00
Hao Liu 5e4ce1ae9d Implement the newly added AArch64 ACLE functions for ld1/st1 with 2/3/4 vectors.
The functions are like: vst1_s8_x2 ...

llvm-svn: 194991
2013-11-18 06:33:43 +00:00
Benjamin Kramer 847c1d90e1 Remove unused but set variable.
llvm-svn: 194920
2013-11-16 11:47:52 +00:00
Ana Pazos 6f2a47a9e5 Implemented aarch64 Neon scalar vmulx_lane intrinsics
Implemented aarch64 Neon scalar vfma_lane intrinsics
Implemented aarch64 Neon scalar vfms_lane intrinsics

Implemented legacy vmul_n_f64, vmul_lane_f64, vmul_laneq_f64
intrinsics (v1f64 parameter type) using Neon scalar instructions.

Implemented legacy vfma_lane_f64, vfms_lane_f64,
vfma_laneq_f64, vfms_laneq_f64 intrinsics (v1f64 parameter type)
using Neon scalar instructions.

llvm-svn: 194889
2013-11-15 23:33:31 +00:00
Chad Rosier 7aaee48bf0 [AArch64] Add support for legacy AArch32 NEON scalar shift right by immediate
and accumulate instructions.

llvm-svn: 194732
2013-11-14 22:02:24 +00:00
Kevin Qin caac85e612 [AArch64 neon] support poly64 and relevant intrinsic functions.
llvm-svn: 194660
2013-11-14 03:29:16 +00:00
Kevin Qin 1718af6f0a Implement aarch64 neon instruction class misc.
llvm-svn: 194657
2013-11-14 02:45:18 +00:00
Jiangning Liu 18b707cb3f Implement AArch64 NEON instruction set AdvSIMD (table).
llvm-svn: 194649
2013-11-14 01:57:55 +00:00
Reid Kleckner 59e4a6f5e2 -fms-extensions: Recognize _alloca as an alias for the alloca builtin
Differential Revision: http://llvm-reviews.chandlerc.com/D1989

llvm-svn: 194617
2013-11-13 22:58:53 +00:00
Chad Rosier e714a962b5 [AArch64] Tests for legacy AArch32 NEON scalar shift by immediate instructions.
A number of non-overloaded intrinsics have been replaced by thier overloaded
counterparts.

llvm-svn: 194599
2013-11-13 20:05:44 +00:00
Chad Rosier 249c714bb4 [AArch64] Add support for NEON scalar floating-point convert to fixed-point instructions.
llvm-svn: 194395
2013-11-11 18:04:22 +00:00
Jiangning Liu c628af66c7 Implement AArch64 Neon instruction set Perm.
llvm-svn: 194124
2013-11-06 03:35:53 +00:00
Jiangning Liu 37f5bb1b28 Implement AArch64 Neon instruction set Bitwise Extract.
llvm-svn: 194119
2013-11-06 02:26:12 +00:00
Jiangning Liu 34a7109b47 Implement AArch64 Neon Crypto instruction classes AES, SHA, and 3 SHA.
llvm-svn: 194086
2013-11-05 17:42:24 +00:00
Kevin Qin 9eece7b5e0 Implemented aarch64 neon intrinsic vcopy_lane with float type.
llvm-svn: 194042
2013-11-05 02:05:44 +00:00
Chad Rosier 74329d6cff [AArch64] Add support for NEON scalar fixed-point convert to floating-point instructions.
llvm-svn: 193817
2013-10-31 22:37:08 +00:00
Chad Rosier bdca387884 [AArch64] Add support for NEON scalar shift immediate instructions.
llvm-svn: 193791
2013-10-31 19:29:05 +00:00
Mark Lacey a8e7df3602 Add CodeGenABITypes.h for use in LLDB.
CodeGenABITypes is a wrapper built on top of CodeGenModule that exposes
some of the functionality of CodeGenTypes (held by CodeGenModule),
specifically methods that determine the LLVM types appropriate for
function argument and return values.

I addition to CodeGenABITypes.h, CGFunctionInfo.h is introduced, and the
definitions of ABIArgInfo, RequiredArgs, and CGFunctionInfo are moved
into this new header from the private headers ABIInfo.h and CGCall.h.

Exposing this functionality is one part of making it possible for LLDB
to determine the actual ABI locations of function arguments and return
values, making it possible for it to determine this for any supported
target without hard-coding ABI knowledge in the LLDB code.

llvm-svn: 193717
2013-10-30 21:53:58 +00:00
Chad Rosier 4d55e6e0a4 [AArch64] Add support for NEON scalar floating-point compare instructions.
llvm-svn: 193692
2013-10-30 15:20:07 +00:00
Peter Collingbourne b453cd64a7 Implement function type checker for the undefined behavior sanitizer.
This uses function prefix data to store function type information at the
function pointer.

Differential Revision: http://llvm-reviews.chandlerc.com/D1338

llvm-svn: 193058
2013-10-20 21:29:19 +00:00
Chad Rosier 3c03dee1d1 [AArch64] Add support for NEON scalar extract narrow instructions.
llvm-svn: 192971
2013-10-18 14:03:36 +00:00
Chad Rosier e7465644c6 [AArch64] Add support for NEON scalar three register different instruction
class.  The instruction class includes the signed saturating doubling
multiply-add long, signed saturating doubling multiply-subtract long, and
the signed saturating doubling multiply long instructions.

llvm-svn: 192909
2013-10-17 18:12:50 +00:00
Chad Rosier 00eef17dbe [AArch64] Add support for NEON scalar negate instruction.
llvm-svn: 192845
2013-10-16 21:04:53 +00:00
Chad Rosier e904137c01 [AArch64] Add support for NEON scalar absolute value instruction.
llvm-svn: 192844
2013-10-16 21:04:49 +00:00
Chad Rosier 2681b3fb61 Update comment.
llvm-svn: 192807
2013-10-16 16:30:39 +00:00
Chad Rosier 069b90463d [AArch64] Add support for NEON scalar signed saturating accumulated of unsigned
value and unsigned saturating accumulate of signed value instructions.

llvm-svn: 192801
2013-10-16 16:09:16 +00:00
Chad Rosier a70fb7b716 [AArch64] Add support for NEON scalar signed saturating absolute value and
scalar signed saturating negate instructions.

llvm-svn: 192734
2013-10-15 21:19:02 +00:00
Chad Rosier 193573ec89 [AArch64] Add support for NEON scalar integer compare instructions.
llvm-svn: 192597
2013-10-14 14:37:40 +00:00
Kevin Qin f22bf50443 Implemented aarch64 SIMD copy related ACLE intrinsic :
vget_lane, vset_lane, vcopy_lane, vcreate, vdup_n, vdup_lane, vmov_n.

llvm-svn: 192411
2013-10-11 02:34:30 +00:00
Hao Liu 1eade6d927 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 192362
2013-10-10 17:01:49 +00:00
Tim Northover 72ace5cf12 Revert "Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem). "
This reverts commit r192351. The LLVM side broke the build and the Clang tests
will inevitably fail without it.

llvm-svn: 192356
2013-10-10 16:00:08 +00:00
Hao Liu c319193636 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

E.g. ld1(3 registers version) will load 32-bit elements {A, B, C, D, E, F} sequentially into the three 64-bit vectors list {BA, DC, FE}.
E.g. ld3 will load 32-bit elements {A, B, C, D, E, F} into the three 64-bit vectors list {DA, EB, FC}.

llvm-svn: 192351
2013-10-10 14:59:36 +00:00
Chad Rosier 0a903478c6 [AArch64] Add support for NEON scalar floating-point reciprocal estimate,
reciprocal exponent, and reciprocal square root estimate instructions.

llvm-svn: 192243
2013-10-08 22:09:29 +00:00
Chad Rosier 0babda4b9c [AArch64] Add support for NEON scalar signed/unsigned integer to floating-point
convert instructions.

llvm-svn: 192232
2013-10-08 20:43:46 +00:00
Matt Arsenault 2f15263807 Fix objectsize tests after r192117
llvm-svn: 192120
2013-10-07 19:00:18 +00:00
Chad Rosier 027dfade54 [AArch64] Add support for NEON scalar arithmetic instructions:
SQDMULH, SQRDMULH, FMULX, FRECPS, and FRSQRTS.

llvm-svn: 192112
2013-10-07 17:07:17 +00:00
Jiangning Liu b96ebac02b Implement aarch64 neon instruction set AdvSIMD (Across).
llvm-svn: 192029
2013-10-05 08:22:55 +00:00
Amaury de la Vieuville 21bf6ed730 Do not emit undefined lsrh/ashr for NEON shifts
These IR instructions are undefined when the amount is equal to operand
size, but NEON right shifts support such shifts. Work around that by
emitting a different IR in these cases.

llvm-svn: 191953
2013-10-04 13:13:15 +00:00
Jiangning Liu 4617e9dc85 Implement aarch64 neon instruction set AdvSIMD (3V elem).
llvm-svn: 191945
2013-10-04 09:21:17 +00:00
Joey Gouly 75987a65f3 [ARM] Add a builtin to allow you to use the 'sevl' instruction.
llvm-svn: 191816
2013-10-02 10:00:18 +00:00
Benjamin Kramer 9b1dfe8b56 Mark an impossible path as unreachable to pacify GCC.
llvm-svn: 191436
2013-09-26 16:36:08 +00:00
Benjamin Kramer 39c4924db9 Remove tabs.
llvm-svn: 191427
2013-09-26 12:16:47 +00:00
NAKAMURA Takumi 788af10a8a CGBuiltin.cpp: Prune a stray default: label. [-Wcovered-switch-default]
llvm-svn: 191277
2013-09-24 04:37:50 +00:00
Jiangning Liu 036f16dc8c Initial support for Neon scalar instructions.
Patch by Ana Pazos.

1.Added support for v1ix and v1fx types.
2.Added Scalar Pairwise Reduce instructions.
3.Added initial implementation of Scalar Arithmetic instructions.

llvm-svn: 191264
2013-09-24 02:48:06 +00:00
Eli Friedman f9d8c6cebb Add _mm_stream_si64 intrinsic.
While I'm here, also fix the alignment computation for the whole family of
intrinsics.

PR17298.

llvm-svn: 191243
2013-09-23 23:38:39 +00:00
Joey Gouly 1e8637b259 [ARMv8] Add builtins for CRC instructions.
Patch by Bradley Smith!

llvm-svn: 190931
2013-09-18 10:07:09 +00:00
Hal Finkel 28b2ae3692 Restore the sqrt -> llvm.sqrt mapping in fast-math mode
This restores the sqrt -> llvm.sqrt mapping, but only in fast-math mode
(specifically, when the UnsafeFPMath or NoNaNsFPMath CodeGen options are
enabled). The @llvm.sqrt* intrinsics have slightly different semantics from the
libm call, specifically, they are undefined when given a non-zero negative
number (the libm calls will always return NaN for any negative number).

This mapping was removed in r100613, and replaced with a TODO, but at that time
the fast-math flags were not yet implemented. Now that we have these, restoring
this mapping is important because it will enable autovectorization of sqrt
calls in loops (at least in fast-math mode).

llvm-svn: 190646
2013-09-12 23:57:55 +00:00
Jiangning Liu 1bda93a252 Implement aarch64 neon instruction set AdvSIMD (3V Diff), covering the following 26 instructions,
SADDL, UADDL, SADDW, UADDW, SSUBL, USUBL, SSUBW, USUBW, ADDHN, RADDHN, SABAL, UABAL, SUBHN, RSUBHN, SABDL, UABDL, SMLAL, UMLAL, SMLSL, UMLSL, SQDMLAL, SQDMLSL, SMULL, UMULL, SQDMULL, PMULL

llvm-svn: 190289
2013-09-09 02:21:08 +00:00
Hao Liu b1852eed38 Inplement aarch64 neon instructions in AdvSIMD(shift). About 24 shift instructions:
sshr,ushr,ssra,usra,srshr,urshr,srsra,ursra,sri,shl,sli,sqshlu,sqshl,uqshl,shrn,sqrshr$
 and 4 convert instructions:
      scvtf,ucvtf,fcvtzs,fcvtzu

llvm-svn: 189926
2013-09-04 09:29:13 +00:00
Tim Northover 550ce58312 ARM: comment on why vmull intrinsic has to exist for now.
llvm-svn: 189464
2013-08-28 09:46:40 +00:00
Tim Northover 4ae9812283 ARM: Emit normal IR for vaddhn/vsubhn NEON intrinsics
These operations "vector add high-half narrow" actually correspond to the
sequence:

    %sum = add <4 x i32> %lhs, %rhs
    %high = lshr <4 x i32> %sum, <i32 16, i32 16, i32 16, i32 16>
    %res = trunc <4 x i32> %high to <4 x i16>

Now that LLVM can spot this, Clang should emit the corresponding LLVM IR.

llvm-svn: 189463
2013-08-28 09:46:37 +00:00
Tim Northover 4e423f724a ARM: use vqdmull and vqadds/vqsubs to implement vqdmlal/vqdmlsl
The NEON intrinsics vqdmlal and vqdmlsl are really just combinations of a
saturating-doubling-multiply (vqdmull) and a saturating add/sub, so now that
LLVM can spot those patterns Clang should emit them instead of specialised
intrinsics.

Feature already tested by existing ARM NEON intrinsics tests.

llvm-svn: 189462
2013-08-28 09:46:34 +00:00
Juergen Ributzka 53e2f275d2 Fix last commit.
llvm-svn: 188724
2013-08-19 23:08:53 +00:00
Juergen Ributzka c6ab1f8bfd Simplify code by using CreateMemTemp. No functional change intended.
Reviewer: Eli
llvm-svn: 188722
2013-08-19 22:20:37 +00:00
Juergen Ributzka 2c2dbf4542 Fix the name and the type of the argument for intrinisc
_mm256_broadcastsi128_si256 to align with the Intel documentation.

This fixes bug PR 16581 and rdar:14747994.

llvm-svn: 188609
2013-08-17 16:40:09 +00:00
Hao Liu 0e9837a385 Fix the build failure of Realease version
llvm-svn: 188456
2013-08-15 11:38:54 +00:00
Hao Liu 4efa1402fe Clang and AArch64 backend patches to support shll/shl and vmovl instructions and ACLE functions
llvm-svn: 188452
2013-08-15 08:26:30 +00:00
Tim Northover 2fe823a6c3 AArch64: initial NEON support
Patch by Ana Pazos

- Completed implementation of instruction formats:
AdvSIMD three same
AdvSIMD modified immediate
AdvSIMD scalar pairwise

- Completed implementation of instruction classes
(some of the instructions in these classes
belong to yet unfinished instruction formats):
Vector Arithmetic
Vector Immediate
Vector Pairwise Arithmetic

- Initial implementation of instruction formats:
AdvSIMD scalar two-reg misc
AdvSIMD scalar three same

- Intial implementation of instruction class:
Scalar Arithmetic

- Initial clang changes to support arm v8 intrinsics.
Note: no clang changes for scalar intrinsics function name mangling yet.

- Comprehensive test cases for added instructions
To verify auto codegen, encoding, decoding, diagnosis, intrinsics.

llvm-svn: 187568
2013-08-01 09:23:19 +00:00
Bill Schmidt 778d387684 [PowerPC] Support powerpc64le as a syntax-checking target.
This patch provides basic support for powerpc64le as an LLVM target.
However, use of this target will not actually generate little-endian
code.  Instead, use of the target will cause the correct little-endian
built-in defines to be generated, so that code that tests for
__LITTLE_ENDIAN__, for example, will be correctly parsed for
syntax-only testing.  Code generation will otherwise be the same as
powerpc64 (big-endian), for now.

The patch leaves open the possibility of creating a little-endian
PowerPC64 back end, but there is no immediate intent to create such a
thing.

The new test case variant ensures that correct built-in defines for
little-endian code are generated.

llvm-svn: 187180
2013-07-26 01:36:11 +00:00
Eli Bendersky c3496b0643 Partial revert of r185568.
r186899 and r187061 added a preferred way for some architectures not to get
intrinsic generation for math builtins. So the code changes in r185568 can
now be undone (the test remains).

llvm-svn: 187079
2013-07-24 21:22:01 +00:00
Tim Northover 6aacd49094 ARM: implement low-level intrinsics for the atomic exclusive operations.
This adds three overloaded intrinsics to Clang:
    T __builtin_arm_ldrex(const volatile T *addr)
    int __builtin_arm_strex(T val, volatile T *addr)
    void __builtin_arm_clrex()

The intent is that these do what users would expect when given most sensible
types. Currently, "sensible" translates to ints, floats and pointers.

llvm-svn: 186394
2013-07-16 09:47:53 +00:00
Richard Smith 6cbd65d84d Add a __builtin_addressof that performs the same functionality as the built-in
& operator (ignoring any overloaded operator& for the type). The purpose of
this builtin is for use in std::addressof, to allow it to be made constexpr;
the existing implementation technique (reinterpret_cast to some reference type,
take address, reinterpert_cast back) does not permit this because
reinterpret_cast between reference types is not permitted in a constant
expression in C++11 onwards.

llvm-svn: 186053
2013-07-11 02:27:57 +00:00
Eli Bendersky 9b64ec18c1 Add target hook CodeGen queries when generating builtin pow*.
Without fmath-errno, Clang currently generates calls to @llvm.pow.* intrinsics
when it sees pow*(). This may not be suitable for all targets (for
example le32/PNaCl), so the attached patch adds a target hook that CodeGen
queries. The target can state its preference for having or not having the
intrinsic generated. Non-PNaCl behavior remains unchanged;
PNaCl-specific test added.

llvm-svn: 185568
2013-07-03 19:19:12 +00:00
Eli Bendersky 099888eccd Remove misplaced comment
llvm-svn: 184862
2013-06-25 17:07:56 +00:00
Michael Gottesman 930ecdb77b [checked-arithmetic builtins] Added builtins to enable users to perform checked-arithmetic in c.
This will enable users in security critical applications to perform
checked-arithmetic in a fast safe manner that is amenable to c.

Tests/an update to Language Extensions is included as well.

rdar://13421498.

llvm-svn: 184497
2013-06-20 23:28:10 +00:00
Michael Gottesman 1534399059 [multiprecision-builtins] Added missing builtin __builtin_{add,sub}cb for {add,sub} with carry for bytes.
I have had several people ask me about why this builtin was not available in
clang (since it seems like a logical conclusion). This patch implements said
builtins.

Relevant tests are included as well. I also updated the Clang language extension reference.

rdar://14192664.

llvm-svn: 184227
2013-06-18 20:40:40 +00:00
Rafael Espindola 2219fc5821 Fix __clear_cache on ARM.
Current gcc's produce an error if __clear_cache is anything but

__clear_cache(char *a, char *b);

It looks like we had just implemented a gcc bug that is now fixed.

llvm-svn: 181784
2013-05-14 12:45:47 +00:00
Benjamin Kramer 4757d0aadf Revert accidental commit.
llvm-svn: 181782
2013-05-14 12:23:08 +00:00
Benjamin Kramer 324bf7a159 Take a stab at trying to unbreak the makefile build.
There is no clangRewrite.a.

llvm-svn: 181781
2013-05-14 12:21:21 +00:00
Tim Northover 8ec8c4bf89 AArch64: teach Clang about __clear_cache intrinsic
libgcc provides a __clear_cache intrinsic on AArch64, much like it
does on 32-bit ARM.

llvm-svn: 181111
2013-05-04 07:15:13 +00:00
John McCall c8e0170578 Standardize accesses to the TargetInfo in IR-gen.
Patch by Stephen Lin!

llvm-svn: 179638
2013-04-16 22:48:15 +00:00
Michael Liao ffaae3511a Add RDSEED intrinsic support defined in AVX2 extension
llvm-svn: 178331
2013-03-29 05:17:55 +00:00
John McCall 47fb950871 Change hasAggregateLLVMType, which conflates complex and
aggregate types in a profoundly wrong way that has to be
worked around in every call site, to getEvaluationKind,
which classifies and distinguishes between all of these
cases.

Also, normalize the API for loading and storing complexes.

I'm working on a larger patch and wanted to pull these
changes out, but it would have be annoying to detangle
them from each other.

llvm-svn: 176656
2013-03-07 21:37:08 +00:00
John McCall 882987f30c Use the actual ABI-determined C calling convention for runtime
calls and declarations.

LLVM has a default CC determined by the target triple.  This is
not always the actual default CC for the ABI we've been asked to
target, and so we sometimes find ourselves annotating all user
functions with an explicit calling convention.  Since these
calling conventions usually agree for the simple set of argument
types passed to most runtime functions, using the LLVM-default CC
in principle has no effect.  However, the LLVM optimizer goes
into histrionics if it sees this kind of formal CC mismatch,
since it has no concept of CC compatibility.  Therefore, if this
module happens to define the "runtime" function, or got LTO'ed
with such a definition, we can miscompile;  so it's quite
important to get this right.

Defining runtime functions locally is quite common in embedded
applications.

llvm-svn: 176286
2013-02-28 19:01:20 +00:00
Will Dietz f54319c891 [ubsan] Add support for -fsanitize-blacklist
llvm-svn: 172808
2013-01-18 11:30:38 +00:00
Tim Northover 4ef746768b Correct order of operands forwarding NEON vfma to LLVM fma
llvm-svn: 172650
2013-01-16 20:13:15 +00:00
Michael Gottesman a2b5c4ba6a Multiprecision subtraction builtins.
We lower these into 2x chained usub.with.overflow intrinsics.

llvm-svn: 172476
2013-01-14 21:44:30 +00:00
NAKAMURA Takumi 7ab4fbf5c2 CGBuiltin.cpp: Fix abuse of ArrayRef in EmitOverflowIntrinsic().
In ArrayRef<T>(X), X should not be temporary value. It could be rewritten more redundantly;

  llvm::Type *XTy = X->getType();
  ArrayRef<llvm::Type *> Ty(XTy);
  llvm::Value *Callee = CGF.CGM.getIntrinsic(IntrinsicID, Ty);

Since it is safe if both XTy and Ty are temporary value in one statement, it could be shorten;

  llvm::Value *Callee = CGF.CGM.getIntrinsic(IntrinsicID, ArrayRef<llvm::Type*>(X->getType()));

ArrayRef<T> has an implicit constructor to create uni-entry of T;

  llvm::Value *Callee = CGF.CGM.getIntrinsic(IntrinsicID, X->getType());

MSVC-generated clang.exe crashed.

llvm-svn: 172352
2013-01-13 11:26:44 +00:00
Michael Gottesman 54398015bf Added builtins for multiprecision adds.
We lower all of these intrinsics into a 2x chained usage of
uadd.with.overflow.

llvm-svn: 172341
2013-01-13 02:22:39 +00:00
Dmitri Gribenko f857950d39 Remove useless 'llvm::' qualifier from names like StringRef and others that are
brought into 'clang' namespace by clang/Basic/LLVM.h

llvm-svn: 172323
2013-01-12 19:30:44 +00:00
Chandler Carruth ffd5551bc7 Rewrite #includes for llvm/Foo.h to llvm/IR/Foo.h as appropriate to
reflect the migration in r171366.

Re-sort the #include lines to reflect the new paths.

llvm-svn: 171369
2013-01-02 11:45:17 +00:00
Meador Inge b97878a235 CodeGen: Expand creal and cimag into complex field loads
PR 14529 was opened because neither Clang or LLVM was expanding
calls to creal* or cimag* into instructions that just load the
respective complex field.  After some discussion, it was not
considered realistic to do this in LLVM because of the platform
specific way complex types are expanded.  Thus a way to solve
this in Clang was pursued.  GCC does a similar expansion.

This patch adds the feature to Clang by making the creal* and
cimag* functions library builtins and modifying the builtin code
generator to look for the new builtin types.

llvm-svn: 170455
2012-12-18 20:58:04 +00:00
Chandler Carruth 3a02247dc9 Sort all of Clang's files under 'lib', and fix up the broken headers
uncovered.

This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.

I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.

llvm-svn: 169237
2012-12-04 09:13:33 +00:00
Will Dietz 88e0233ff4 [ubsan] Add flag to enable recovery from checks when possible.
llvm-svn: 169114
2012-12-02 19:50:33 +00:00
Richard Smith b1b0ab41e7 Use the individual -fsanitize=<...> arguments to control which of the UBSan
checks to enable. Remove frontend support for -fcatch-undefined-behavior,
-faddress-sanitizer and -fthread-sanitizer now that they don't do anything.

llvm-svn: 167413
2012-11-05 22:21:05 +00:00
Micah Villmow ea2fea2a60 Cleanup some clang code to use new type functions instead of using cast<>.
llvm-svn: 166684
2012-10-25 15:39:14 +00:00
Nico Weber 636fc09960 "Implement" codegen support for __noop().
Eli discovered that __noop's sema behavior also needs some love. I filed
PR14081 for that and intend to improve it.

llvm-svn: 165886
2012-10-13 22:30:41 +00:00
Richard Smith e30752c93b -fcatch-undefined-behavior: emit calls to the runtime library whenever one of the checks fails.
llvm-svn: 165536
2012-10-09 19:52:38 +00:00
Micah Villmow dd31ca10ef Move TargetData to DataLayout.
llvm-svn: 165395
2012-10-08 16:25:52 +00:00
Benjamin Kramer a801f4a81d Expose __builtin_bswap16.
GCC has always supported this on PowerPC and 4.8 supports it on all platforms,
so it's a good idea to expose it in clang too. LLVM supports this on all targets.

llvm-svn: 165362
2012-10-06 14:42:22 +00:00
Bob Wilson 39d8a132df Add an FMA intrinsic for ARM Neon.
llvm-svn: 164904
2012-09-29 23:52:48 +00:00
Sylvestre Ledru 33b5baf189 Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164766
llvm-svn: 164769
2012-09-27 10:16:10 +00:00
Sylvestre Ledru a876013dc9 Fix a typo 'iff' => 'if'
llvm-svn: 164766
2012-09-27 09:57:10 +00:00
Nico Weber ca496f34a2 Add codegen support for the __debugbreak intrinsic.
llvm-svn: 164660
2012-09-26 05:40:16 +00:00
Jim Grosbach 11b6fe5e9c ARM: Use a dedicated intrinsic for vector bitwise select.
The expression based expansion too often results in IR level optimizations
splitting the intermediate values into separate basic blocks, preventing
the formation of the VBSL instruction as the code author intended. In
particular, LICM would often hoist part of the computation out of a loop.

rdar://11011471

llvm-svn: 164342
2012-09-21 00:18:30 +00:00
Jim Grosbach d3608f433a Tidy up. Trailing whitespace and 80 columns.
llvm-svn: 164341
2012-09-21 00:18:27 +00:00
Richard Smith 4d1458ed38 -fcatch-undefined-behavior: Factor emission of the creation of, and branch to,
the trap BB out of the individual checks and into a common function, to prepare
for making this code call into a runtime library. Rename the existing EmitCheck
to EmitTypeCheck to clarify it and to move it out of the way of the new
EmitCheck.

llvm-svn: 163451
2012-09-08 02:08:36 +00:00
Eli Friedman 504f9a2872 Make alignment computation for pointer values for builtins handle
non-pointer types with a pointer representation correctly. PR13660.

llvm-svn: 162862
2012-08-29 21:21:11 +00:00
Eli Friedman 5d14c48dbb Attempt to fix clang bootstrap (broken by r162425).
llvm-svn: 162440
2012-08-23 11:27:56 +00:00
Eli Friedman a5dd5684dc Use the alignment from lvalue emission to more accurately compute the alignment
of a pointer for builtin emission, instead of just depending on the type of the
pointee.  <rdar://problem/11314941>.

llvm-svn: 162425
2012-08-23 03:10:17 +00:00
Fariborz Jahanian 1ac111989d irgen: inline code for several of complex builtin
calls. // rdar://8315199

llvm-svn: 161891
2012-08-14 20:09:28 +00:00
Bob Wilson 2605fef7db Avoid using i64 types for vld1q_lane/vst1q_lane intrinsics.
The backend has to legalize i64 types by splitting them into two 32-bit pieces,
which leads to poor quality code.  If we produce code for these intrinsics that
uses one-element vector types, which can live in Neon vector registers without
getting split up, then the generated code is much better.  Radar 11998303.

llvm-svn: 161879
2012-08-14 17:27:04 +00:00
Hal Finkel 3fadbb54fd Add __builtin_readcyclecounter() to produce the @llvm.readcyclecounter() intrinsic.
llvm-svn: 161310
2012-08-05 22:03:08 +00:00
Joel Jones 682150364a More replacing of target-dependent intrinsics with target-indepdent
intrinsics.  The second instruction(s) to be handled are the vector versions 
of count set bits (ctpop).

The changes here are to clang so that it generates a target independent 
vector ctpop when it sees an ARM dependent vector bits set count.  The changes 
in llvm are to match the target independent vector ctpop and in 
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM 
dependent vector pop counts with target-independent ctpops.  There are also 
changes to an existing test case in llvm for ARM vector count instructions and 
to a test for the bitcode upgrade.

<rdar://problem/11892519>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160409
2012-07-18 00:01:03 +00:00
Simon Atanasyan 94a6d863a9 Revert commit r160308. We decide to move builtins selection to the backend.
llvm-svn: 160353
2012-07-17 08:15:06 +00:00
Simon Atanasyan a06d06b660 MIPS: Implement __builtin_mips_shll_qb builtin function overloading.
This function has two versions. The first one is used for a register operand.
The second one is used for an immediate number.

llvm-svn: 160308
2012-07-16 18:52:02 +00:00
Eric Christopher 934a1c0231 Capitalize comment.
llvm-svn: 160220
2012-07-14 19:29:12 +00:00
Joel Jones 3e00e9d5c1 This is one of the first steps at moving to replace target-dependent
intrinsics with target-indepdent intrinsics.  The first instruction(s) to be 
handled are the vector versions of count leading zeros (ctlz).

The changes here are to clang so that it generates a target independent 
vector ctlz when it sees an ARM dependent vector ctlz.  The changes in llvm 
are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp 
to update any existing bc files containing ARM dependent vector ctlzs with 
target-independent ctlzs.  There are also changes to an existing test case in 
llvm for ARM vector count instructions and a new test for the bitcode upgrade.

<rdar://problem/11831778>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160201
2012-07-13 23:26:27 +00:00
Benjamin Kramer a43b6999ff Add _rdrand{16,32,64}_step intrinsics to immintrin.h
llvm-svn: 160118
2012-07-12 09:33:03 +00:00