The goal of this patch is to give distribution builds the flexibility to include only the applicable header files.
Currently, the clang-resource-headers target contains nearly all the files in clang/lib/Headers. Most of these files are platform specific (e.g. immintrin.h is x86 specific). A distribution build has to either include all the headers for all platforms, or not include any headers. For example, if a distribution build for powerpc includes the clang-resource-headers target, it will include all the x86-specific headers, even though those headers cannot be used.
This patch breaks up the clang-resource-headers list into a core list and platform-specific lists. With the patch, a distribution build can now include the ppc-resource-headers target to include only the headers applicable to the powerpc platform.
Specifically, one can now run
cmake ... LLVM_DISTRIBUTION_COMPONENTS="clang;ppc-resource-headers" ... ../llvm
and ninja install-distribution will then install the powerpc headers.
Similarly, one can do
cmake ... LLVM_DISTRIBUTION_COMPONENTS="clang;x86-resource-headers" ... ../llvm
to include headers applicable to the x86 platform in a distribution installation.
To implement this behaviour, the patch does two things:
* It breaks up the long header file list into a core list and platform-specific lists.
* It adds numerous platform-specific installation targets.
Differential Revision: https://reviews.llvm.org/D123498
Align with the `-fdeclare-opencl-builtins` option and other
get_image_* builtins which have the const attribute.
Differential Revision: https://reviews.llvm.org/D122728
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the argument name identifiers.
Continues the direction set out in D119560.
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" any single-letter identifiers.
Continues the direction set out in D119560.
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the identifiers "x", "y" and
"z".
Continues the direction set out in D119560.
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the identifiers "data" and
"offset".
Continues the direction set out in D119560.
https://reviews.llvm.org/D23944 implemented the #pragma intrinsic from
MSVC. This causes the statement #pragma intrinsic(cpuid) to fail [0]
on Clang because cpuid is currently implemented in intrin.h instead
of as a Clang builtin. Reimplementing cpuid (as well as its related
function, cpuidex) should resolve this.
[0]: https://crbug.com/1279344
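For illustration, a minimal sketch of the kind of code this unblocks (using the MSVC `__cpuid`/`__cpuidex` spellings; the exact code in the linked bug may differ):
```
#include <intrin.h>
#pragma intrinsic(__cpuid) /* previously rejected, since __cpuid lived only in intrin.h */

void vendor_leaf(int regs[4]) {
  __cpuid(regs, 0);        /* leaf 0: CPU vendor string returned in EBX/EDX/ECX */
}
```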
Differential revision: https://reviews.llvm.org/D121653
Extension to 4390c721cb - similar to the vanilla load/store intrinsics, _mm_lddqu_si128/_mm256_lddqu_si256 should take an unaligned pointer, but were using the aligned __m128i/__m256i types, which can cause alignment warnings.
The existing sse3-builtins.c and avx-builtins.c tests in llvm-project\clang\test\CodeGen\X86 should cover this.
Differential Revision: https://reviews.llvm.org/D121815
Fix the instruction names to match the WebAssembly spec:
- `i32x4.trunc_sat_zero_f64x2_{s,u}` => `i32x4.trunc_sat_f64x2_{s,u}_zero`
- `f32x4.demote_zero_f64x2` => `f32x4.demote_f64x2_zero`
Also rename related things like intrinsics, builtins, and test functions to
match.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D121661
Add the rest of the intrinsics to clang, except intrinsics using vector mask
registers.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D121586
The NVVM IR specification defines them with an i32 return type:
declare i32 @llvm.nvvm.match.any.sync.i64(i32 %membermask, i64 %value)
declare {i32, i1} @llvm.nvvm.match.all.sync.i64(i32 %membermask, i64 %value)
...
The i32 return value is a 32-bit mask where bit position in mask corresponds
to thread’s laneid.
as does the PTX ISA:
9.7.12.8. Parallel Synchronization and Communication Instructions: match.sync
match.any.sync.type d, a, membermask;
match.all.sync.type d[|p], a, membermask;
...
Destination d is a 32-bit mask where bit position in mask corresponds
to thread’s laneid.
Additionally, ptxas doesn't accept the instructions produced by the NVPTX backend.
After this patch, the generated code compiles with no issues.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D120499
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the identifiers "image",
"image_array", "coord", "sampler", "sample", "gradientX", "gradientY",
"lod", and "color".
Continues the direction set out in D119560.
Until now, subgroup builtins are available with `opencl-c.h` when at
least one of `cl_intel_subgroups`, `cl_khr_subgroups`, or
`__opencl_c_subgroups` is defined. With `-fdeclare-opencl-builtins`,
subgroup builtins are conditionalized on `cl_khr_subgroups` only.
Align `-fdeclare-opencl-builtins` to `opencl-c.h` by introducing the
internal `__opencl_subgroup_builtins` macro.
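A sketch of the condition the internal macro is meant to capture (assumed form; the actual definition in the headers may differ in detail):
```
// Defined whenever any of the three subgroup-related macros is available,
// so builtin declarations can be guarded in one place.
#if defined(cl_intel_subgroups) || defined(cl_khr_subgroups) || \
    defined(__opencl_c_subgroups)
#define __opencl_subgroup_builtins 1
#endif
```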
Differential Revision: https://reviews.llvm.org/D120254
Most places already seem to use the short spelling instead of
'unsigned int/long', so perform the following substitutions:
s/unsigned int /uint /g
s/unsigned long /ulong /g
This simplifies completeness comparisons against OpenCLBuiltins.td.
Differential Revision: https://reviews.llvm.org/D120032
This simplifies completeness comparisons against OpenCLBuiltins.td and
also makes the header no longer "claim" the identifiers "success",
"failure", "desired", "value".
Differential Revision: https://reviews.llvm.org/D119560
Commit 3c7d2f1b67 ("[OpenCL] opencl-c.h: add CL 3.0 non-generic
address space atomics", 2021-07-30) added some atomic_fetch_add/sub
overloads with uintptr_t arguments twice. Instead, they should have
been atomic_fetch_max overloads with non-generic address spaces.
Post-commit review feedback suggested dropping the deprecation
diagnostic for the 'noreturn' macro (the diagnostic from the header
file suffices and the macro diagnostic could be confusing) and only
issuing the deprecation diagnostic for [[_Noreturn]] when the attribute
identifier is either directly written or not from a system macro.
Amends the commit made in 5029dce492.
This adds support for http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2764.pdf,
which was adopted at the Feb 2022 WG14 meeting. That paper adds
[[noreturn]] and [[_Noreturn]] to the list of supported attributes in
C2x. These attributes have the same semantics as the [[noreturn]]
attribute in C++.
The [[_Noreturn]] attribute was added as a deprecated feature so that
translation units which include <stdnoreturn.h> do not get an error on
use of [[noreturn]] because the macro expands to _Noreturn. Users can
use -Wno-deprecated-attributes to silence the diagnostic.
Use of <stdnoreturn.h> and of the noreturn macro are both deprecated.
Users can define the _CLANG_DISABLE_CRT_DEPRECATION_WARNINGS macro to
suppress the deprecation diagnostics coming from the header file.
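A short illustrative sketch of the resulting behaviour in C2x mode (declarations invented for the example):
```
/* Compile with -std=c2x. */
[[noreturn]] void fatal(const char *msg);      /* preferred spelling, no diagnostic */
[[_Noreturn]] void fatal_alt(const char *msg); /* accepted, but warns as deprecated
                                                  when written directly by the user */
```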
Add the atomic overloads for the `global` and `local` address spaces,
which are new in OpenCL 3.0. Ensure the preexisting `generic`
overloads are guarded by the generic address space feature macro.
Ensure a subset of the atomic builtins are guarded by the
`__opencl_c_atomic_order_seq_cst` and `__opencl_c_atomic_scope_device`
feature macros, and enable those macros for SPIR/SPIR-V targets in
`opencl-c-base.h`.
Also guard the `cl_ext_float_atomics` builtins with the atomic order
and scope feature macros.
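An illustrative sketch of the guard pattern described above; the real declarations in `opencl-c.h` use additional internal macros and many more overloads:
```
#if defined(__opencl_c_atomic_order_seq_cst) && \
    defined(__opencl_c_atomic_scope_device)
int __attribute__((overloadable))
atomic_fetch_add(volatile __global atomic_int *object, int operand);
int __attribute__((overloadable))
atomic_fetch_add(volatile __local atomic_int *object, int operand);
#endif
```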
Differential Revision: https://reviews.llvm.org/D119420
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions
This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:
```
__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
// CHECK-LABEL: test_mm256_adds_epi8
// CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
return _mm256_adds_epi8(a, b);
}
```
This patch drops the exception specifier from the posix_memalign declaration
because it differs between glibc and other libc implementations, and Clang has a hack for it.
Differential Revision: https://reviews.llvm.org/D117972
Part of the _BitInt feature in C2x
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2763.pdf) is a new
macro in limits.h named BITINT_MAXWIDTH that can be used to determine
the maximum width of a bit-precise integer type. This macro must expand
to a value that is at least as large as ULLONG_WIDTH.
This adds an implementation-defined macro named __BITINT_MAXWIDTH__ to
specify that value, which is used by limits.h for the standard macro.
This also limits the maximum bit width to 128 bits because backends do
not currently support all mathematical operations (such as division) on
wider types yet. This maximum is expected to be increased in the future.
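A minimal C2x sketch of how the macro can be used (compile with -std=c2x; names here are only for the example):
```
#include <limits.h>

#if defined(BITINT_MAXWIDTH)
_Static_assert(BITINT_MAXWIDTH >= ULLONG_WIDTH,
               "BITINT_MAXWIDTH must be at least ULLONG_WIDTH");
typedef _BitInt(BITINT_MAXWIDTH) widest_int; /* 128 bits with this patch */
#endif
```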
AIX provides additional definitions in the system libc float.h that we
would like to be available to users, so we need to include_next through,
similar to what is done on some other platforms.
We also adjust the guards for some definitions which are restricted
based on language level to also be provided with the _ALL_SOURCE feature
test macro on AIX, similar to what is done by the platform float.h
header, so we don't run into cases where we don't provide the compiler
macro but still have a different definition from the system.
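A hedged sketch of the include_next idea (simplified; the actual guard logic in the header is more involved):
```
/* Provide the compiler-owned macros first, then defer to the AIX system
   header so its additional definitions remain visible to users. */
#if defined(_AIX) && __has_include_next(<float.h>)
#  include_next <float.h>
#endif
```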
Differential Revision: https://reviews.llvm.org/D117935
The named address space overloads of builtins that take a pointer
argument are conditionalized on the `__opencl_c_generic_address_space`
feature macro (in a `#else` body). Introduce an internal feature
macro instead, such that their availability can be controlled in a
single place and independently of the generic address space feature
macro.
This commit does not change the available builtins.
Differential Revision: https://reviews.llvm.org/D118158
This feature requires support of __opencl_c_generic_address_space and
__opencl_c_program_scope_global_variables, so diagnostics for that are provided as well.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D115640
Ensure any use of a `read_write` image is guarded behind the
`__opencl_c_read_write_images` feature macro.
Differential Revision: https://reviews.llvm.org/D117899
None of these have any reordering issues, and they still emit the same reduction intrinsics without any change in the existing test coverage:
llvm-project\clang\test\CodeGen\X86\avx512-reduceIntrin.c
llvm-project\clang\test\CodeGen\X86\avx512-reduceMinMaxIntrin.c
Differential Revision: https://reviews.llvm.org/D117881
D111985 added the generic `__builtin_elementwise_max` and `__builtin_elementwise_min` intrinsics with the same integer behaviour as the SSE/AVX instructions
This patch removes the `__builtin_ia32_pmax/min` intrinsics and just uses `__builtin_elementwise_max/min` - the existing tests see no changes:
```
__m256i test_mm256_max_epu32(__m256i a, __m256i b) {
// CHECK-LABEL: test_mm256_max_epu32
// CHECK: call <8 x i32> @llvm.umax.v8i32(<8 x i32> %{{.*}}, <8 x i32> %{{.*}})
return _mm256_max_epu32(a, b);
}
```
This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`).
Sibling patch to D117791
Differential Revision: https://reviews.llvm.org/D117798
D111986 added the generic `__builtin_elementwise_abs()` intrinsic with the same integer absolute behaviour as the SSE/AVX instructions (abs(INT_MIN) == INT_MIN)
This patch removes the `__builtin_ia32_pabs*` intrinsics and just uses `__builtin_elementwise_abs` - the existing tests see no changes:
```
__m256i test_mm256_abs_epi8(__m256i a) {
// CHECK-LABEL: test_mm256_abs_epi8
// CHECK: [[ABS:%.*]] = call <32 x i8> @llvm.abs.v32i8(<32 x i8> %{{.*}}, i1 false)
return _mm256_abs_epi8(a);
}
```
This requires us to add a `__v64qs` explicitly signed char vector type (we already have `__v16qs` and `__v32qs`).
Differential Revision: https://reviews.llvm.org/D117791
C17 deprecated ATOMIC_VAR_INIT with the resolution of DR 485. C++
followed suit when adopting P0883R2 for C++20, but additionally chose
to deprecate ATOMIC_FLAG_INIT at the same time despite the macro still
being required in C. This patch marks both macros as deprecated when
appropriate to do so.
This completes the implementation of
WG14 N2412 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2412.pdf),
which standardizes C on a two's complement representation for integer
types. The only work that remained there was to define the correct
macros in the standard headers, which this patch does.
ROCm 4.5 device library introduced __ockl_dm_alloc and __ockl_dm_dealloc
for supporting device side malloc/free.
This patch redefines device malloc/free to use these functions.
It also fixes a bug in the wrapper header which incorrectly defines free
with return type void* instead of void.
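A sketch of the corrected shape of the user-visible declarations (assumed form, not the verbatim wrapper header; the bodies forwarding to __ockl_dm_alloc/__ockl_dm_dealloc are omitted):
```
extern "C" __device__ void *malloc(__SIZE_TYPE__ size);
extern "C" __device__ void free(void *ptr); /* previously mis-declared as returning void* */
```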
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D116967
HVX does not have load/store instructions for vector predicates (i.e. bool
vectors). Because of that, vector predicates need to be converted to another
type before being stored, and the most convenient representation is an HVX
vector.
As a consequence, in C/C++, source-level builtins that either take or
produce vector predicates take or return regular vectors instead. On the
other hand, the corresponding LLVM intrinsics do have boolean types, and
so a conversion of the operand or the return value was necessary.
This conversion would happen inside clang's codegen, but was somewhat
fragile.
This patch changes the strategy: a builtin that takes a vector predicate
now really expects a vector predicate. Since such a predicate cannot be
provided via a variable, this builtin must be composed with other builtins
that either convert vector to a predicate (V6_vandvrt) or predicate to a
vector (V6_vandqrt).
For users using builtins defined in hvx_hexagon_protos.h there is no impact:
the conversions were added to that file. Other users will need to insert
- __builtin_HEXAGON_V6_vandvrt[_128B](V, -1) to convert vector V to a
vector predicate, or
- __builtin_HEXAGON_V6_vandqrt[_128B](Q, -1) to convert vector predicate Q
to a vector.
Builtins __builtin_HEXAGON_V6_vmaskedstore.* are a temporary exception to
that, but they are deprecated and should not be used anyway. In the future
they will either follow the same rule, or be removed.
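To illustrate the composition (a hedged sketch; the vector typedef is only for the example and assumes 64-byte HVX mode, compiled for Hexagon with HVX enabled):
```
typedef int hvx_vec __attribute__((__vector_size__(64))); /* 16 x i32 */

/* vector -> vector predicate -> vector: the predicate produced by V6_vandvrt
   cannot be stored in a C variable, so the two builtins are composed inline. */
hvx_vec roundtrip(hvx_vec v) {
  return __builtin_HEXAGON_V6_vandqrt(__builtin_HEXAGON_V6_vandvrt(v, -1), -1);
}
```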
Use the "pure" attribute (or "readonly") for the vload, vload_half and
vloada_half builtins.
Includes test changes to SemaOpenCL/fdeclare-opencl-builtins.cl to avoid
triggering unused-result warnings.
Reviewed By: svenvh
Differential Revision: https://reviews.llvm.org/D110742
Use the "pure" attribute (or "readonly") for the vload, vload_half and
vloada_half builtins.
Reviewed By: svenvh
Differential Revision: https://reviews.llvm.org/D110742
This patch implements the following:
- Emit PACBTI-M build attributes in libunwind asm files
- Authenticate LR in DWARF32 using PACBTI
Use Armv8.1-M.Main PACBTI extension to authenticate the return address
(stored in the LR register) before moving it to the PC (IP) register.
The AUTG instruction is used with the candidate return address, the CFA,
and the authentication code that is retrieved from the saved
pseudo-register RA_AUTH_CODE.
- Authenticate LR in EHABI using PACBTI
Authenticate the contents of the LR register using Armv8.1-M.Main PACBTI
extension.
A new frame unwinding instruction is introduced (0xb4). This
instruction pops out of the stack the return address authentication
code, which is then used in conjunction with the SP and the next-to-be
instruction pointer to perform authentication.
This authentication code is popped into a new register,
UNW_ARM_PSEUDO_PAC, which is a pseudo-register.
This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:
https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension
The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:
https://developer.arm.com/documentation/ddi0553/latest
The following people contributed to this patch:
- Momchil Velikov
- Victor Campos
- Ties Stuij
Reviewed By: #libunwind, danielkiss, mstorsjo
Differential Revision: https://reviews.llvm.org/D112430
The 14.31.30818 MSVC toolset has the following in its `stdatomic.h`:
~~~
#ifndef __cplusplus
#error <stdatomic.h> is not yet supported when compiling as C, but this is planned for a future release.
#endif
~~~
This results in clang failing to build existing code which relied on
`stdatomic.h` in C mode on Windows. Simply fall back to the clang header
until that header is available as a complete implementation.
The clang portion of c933c2eb33 was missed as I made
some kind of mistake squashing the commits with git.
This patch just adds those.
The original review: https://reviews.llvm.org/D114088
The XL implementation of vec_round for vector double uses
"round-to-nearest, ties to even" just as the vector float
version does. However, clang and gcc use "round-to-nearest-away"
for vector double and "round-to-nearest, ties to even"
for vector float.
The XL behaviour is implemented under the __XL_COMPAT_ALTIVEC__
macro similarly to other instances of incompatibility.
Differential revision: https://reviews.llvm.org/D113642
This enables Intel intrinsics support on FreeBSD.
Thanks to @pkubaj who noticed this feature was missing
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D113451
With this,
void f() { __asm__("mov eax, ebx"); }
now compiles with clang with -masm=intel.
This matches gcc.
The flag is not accepted in clang-cl mode. It has no effect on
MSVC-style `__asm {}` blocks, which are unconditionally in intel
mode both before and after this change.
One difference to gcc is that in clang, inline asm strings are
"local" while they're "global" in gcc. Building the following with
-masm=intel works with clang, but not with gcc where the ".att_syntax"
from the 2nd __asm__() is in effect until file end (or until a
".intel_syntax" somewhere later in the file):
__asm__("mov eax, ebx");
__asm__(".att_syntax\nmovl %ebx, %eax");
__asm__("mov eax, ebx");
This also updates clang's intrinsic headers to work both in
-masm=att (the default) and -masm=intel modes.
The official solution for this according to "Multiple assembler dialects in asm
templates" in gcc docs->Extensions->Inline Assembly->Extended Asm
is to write every inline asm snippet twice:
bt{l %[Offset],%[Base] | %[Base],%[Offset]}
This works in LLVM after D113932 and D113894, so use that.
(Just putting `.att_syntax` at the start of the snippet works in some but not
all cases: When LLVM interpolates in parameters like `%0`, it uses at&t or
intel syntax according to the inline asm snippet's flavor, so the `.att_syntax`
within the snippet happens too late: the interpolated-in parameter is already
in intel style, and then won't parse in the switched `.att_syntax`.)
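For illustration, a hypothetical wrapper written once with that dual-dialect syntax (an assumed helper, not code taken verbatim from the updated headers):
```
/* Usable under both -masm=att and -masm=intel: the {att | intel} alternative
   selects the operand order for the active dialect. */
static inline unsigned char my_bittest(const int *base, int offset) {
  unsigned char carry;
  __asm__("bt{l %[Off],%[Base] | %[Base],%[Off]}"
          : "=@ccc"(carry)
          : [Base] "m"(*base), [Off] "r"(offset));
  return carry;
}
```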
It might be nice to invent a `#pragma clang asm_dialect push "att"` /
`#pragma clang asm_dialect pop` to be able to force asm style per snippet,
so that the inline asm string doesn't contain the same code in two variants,
but let's leave that for a follow-up.
Fixes PR21401 and PR20241.
Differential Revision: https://reviews.llvm.org/D113707
This splits out the generated headers and conditionalises them on the
target being enabled.
The motivation here is that the RISCV header alone added 10MB to the
resource directory, which was previously at 10MB, increasing the build
size and time. This header is contributing ~50% of the size of the
resource headers (~10MB).
The ARM generated headers are contributing about ~10% or 1MB.
This could be extended further by adding only the static resource headers
for the targets that the LLVM build supports.
The changes to the tests for ARM mirror what the RISCV target already
did, and which rnk identified as a possible issue.
Testing:
cmake -G Ninja -D LLVM_TARGETS_TO_BUILD=X86 -D LLVM_ENABLE_PROJECTS="clang;lld" ../clang
ninja check-clang
Differential Revision: https://reviews.llvm.org/D112890
Reviewed By: craig.topper
Clang builtin utility `__remove_address_space` now works if generic
address space is not supported in C++ for OpenCL 2021.
Differential Revision: https://reviews.llvm.org/D110155
Add a new triple and target info for 'spirv32' and 'spirv64', thus
enabling clang (LLVM IR) code emission for the SPIR-V target.
The target for SPIR-V is mostly reused from SPIR by derivation
from a common base class, since IR output for SPIR-V is mostly
the same as for SPIR. Some refactoring is done accordingly.
Added and updated tests for parts that are different between
SPIR and SPIR-V.
Patch by linjamaki (Henry Linjamäki)!
Differential Revision: https://reviews.llvm.org/D109144
- Add the missing NVVM predicate builtins for address space checking.
- Redefine them as pure functions so that they can be used in
__builtin_assume.
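A minimal CUDA device-side sketch of why purity matters here (usage assumed; __nvvm_isspacep_global is one of the predicate builtins):
```
__device__ float load_global(const float *p) {
  /* Because the predicate is pure, __builtin_assume does not discard it as a
     side-effecting expression, and the hint survives to the optimizer. */
  __builtin_assume(__nvvm_isspacep_global(p));
  return *p;
}
```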
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D112053
CUDA-11 headers rely on these NVCC builtins.
Despite having the `__nv` prefix, those are *not* provided by libdevice.
Differential Revision: https://reviews.llvm.org/D111665
Add atomic_half types and builtins operating on the types from the
cl_ext_float_atomics extension.
Patch by Haonan Yang.
Differential Revision: https://reviews.llvm.org/D109740
The SSE4 header (smmintrin.h) should include SSSE3 (tmmintrin.h) instead
of SSE2 (emmintrin.h).
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D111482
This patch updates the vec_extract builtins to take a signed int as the second
parameter, as defined by the Power Vector Intrinsics Programming Reference.
This patch is NFC and all existing tests pass.
Differential Revision: https://reviews.llvm.org/D110935
This patch updates the vec_popcnt builtins to return vector unsigned,
as defined by the Power Vector Intrinsics Programming Reference.
This patch is NFC and all existing tests pass.
Differential Revision: https://reviews.llvm.org/D110934
The patch implements header-only support for texture lookups.
The patch has been tested on a source file with all possible combinations of
argument types supported by CUDA headers, compiled and verified that the
generated instructions and their parameters match the code generated by NVCC.
Unfortunately, compiling texture code requires CUDA headers and can't be tested
in clang itself. The test will need to be added to the test-suite later.
While the generated code compiles and seems to match NVCC, I do not have any
code using textures with which I could test the correctness of the
implementation. Hence the experimental status.
Differential Revision: https://reviews.llvm.org/D110089
It's true that docs.microsoft.com says:
"""The _ReadBarrier, _WriteBarrier, and _ReadWriteBarrier compiler
intrinsics and the MemoryBarrier macro are all deprecated and should not
be used. For inter-thread communication, use mechanisms such as
atomic_thread_fence and std::atomic<T>, which are defined in the C++
Standard Library. For hardware access, use the /volatile:iso compiler
option together with the volatile keyword."""
And these attributes have been here since these builtins were added in
r192860.
However:
- cl.exe does not warn on them even with /Wall
- none of the replacements are useful for C code
- we don't add __attribute__((__deprecated__())) to any other
declarations in intrin.h
- intrin0.h in the MSVC headers declares _ReadWriteBarrier() (but
without the deprecation attribute), so you get inconsistent
deprecation warnings depending on if you include intrin.h or intrin0.h
The motivation is that compiling sqlite.h with clang-cl produces a
deprecation warning for _ReadWriteBarrier(), while cl.exe does not.
Differential Revision: https://reviews.llvm.org/D111232
The builtin for vec_orc has support for the following two signatures,
but currently the compiler marks it ambiguous:
vector float vec_orc(vector float, vector float)
vector double vec_orc(vector double, vector double)
This patch implements these two builtins.
Differential revision: https://reviews.llvm.org/D110858
These builtins produce inefficient code for CPU's prior to Power8
due to vcmpequd being unavailable. The predicate forms can actually
leverage the available vcmpequw along with xxlxor to produce a better
sequence.
When a user specifies an out-of-range index for vec_insert, we
just produce IR that has undefined behaviour even though the
documentation states that modulo arithmetic is used. This patch
just truncates the value to a valid index.
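A small sketch of the documented behaviour this restores (illustrative only; requires an AltiVec-enabled PowerPC target):
```
#include <altivec.h>

/* With a 4-element vector int, an out-of-range index such as 5 is now reduced
   modulo 4, so this inserts x at element 1 instead of being undefined. */
vector int set_element(vector int v, int x) {
  return vec_insert(x, v, 5);
}
```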
The instruction has similar semantics to vbpermq but for doublewords.
It was added in Power9 and the ABI documents the builtin.
Differential revision: https://reviews.llvm.org/D107899
This patch adds range checking for some Power10 altivec builtins and
changes the signature of a builtin to match documentation. For `vec_cntm`,
range checking is done via SemaChecking. For `vec_splati_ins`, the second
argument is masked to extract the 0th bit so that we always receive either a `0`
or a `1`.
Reviewed By: lei, amyk
Differential Revision: https://reviews.llvm.org/D109710