llvm-project

Commit Graph

Author	SHA1	Message	Date
Levy Hsu	b001d574d7	[RISCV] Add IR intrinsic for Zbr extension Implementation for RISC-V Zbr extension intrinsic. Header files are included in separate patch in case the name needs to be changed RV32 / 64: crc32b crc32h crc32w crc32cb crc32ch crc32cw RV64 Only: crc32d crc32cd Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99009	2021-04-02 10:58:45 -07:00
Thomas Lively	45783d0e8a	[WebAssembly] Implement i64x2 comparisons Removes the prototype builtin and intrinsic for i64x2.eq and implements that instruction as well as the other i64x2 comparison instructions in the final SIMD spec. Unsigned comparisons were not included in the final spec, so they still need to be scalarized via a custom lowering. Differential Revision: https://reviews.llvm.org/D99623	2021-03-31 10:46:17 -07:00
Yaxun (Sam) Liu	cc9477166a	[CUDA][HIP] add __builtin_get_device_side_mangled_name Add builtin function __builtin_get_device_side_mangled_name to get device side manged name for functions and global variables, which can be used to get symbol address of kernels or variables by mangled name in dynamically loaded bundled code objects at run time. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D99301	2021-03-25 15:25:29 -04:00
Nemanja Ivanovic	4020932706	[PowerPC] Make altivec.h work with AIX which has no __int128 There are a number of functions in altivec.h that use vector __int128 which isn't supported on AIX. Those functions need to be guarded for targets that don't support the type. Furthermore, the functions that produce quadword instructions without using the type need a builtin. This patch adds the macro guards to altivec.h using the __SIZEOF_INT128__ which is only defined on targets that support the __int128 type.	2021-03-24 00:35:51 -05:00
Thomas Lively	f5764a8654	[WebAssembly] Finalize SIMD names and opcodes Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all SIMD instructions to match the finalized SIMD spec. Deliberately does not change the public interface in wasm_simd128.h yet; that will require more care. Depends on D98466. Differential Revision: https://reviews.llvm.org/D98676	2021-03-18 11:21:25 -07:00
Thomas Lively	2f2ae08da9	[WebAssembly] Remove experimental SIMD instructions Removes the instruction definitions, intrinsics, and builtins for qfma/qfms, signselect, and prefetch instructions, which were not included in the final WebAssembly SIMD spec. Depends on D98457. Differential Revision: https://reviews.llvm.org/D98466	2021-03-18 11:21:24 -07:00
Bradley Smith	cf0da91ba5	[AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE Previously NEON used a target specific intrinsic for frintn, given that the FROUNDEVEN ISD node now exists, move over to that instead and add codegen support for that node for both NEON and fixed length SVE. Differential Revision: https://reviews.llvm.org/D98487	2021-03-17 11:41:22 +00:00
Amy Huang	f5352dd9da	Emit inline implementation of __builtin__wmemchr on MSVCRT platforms. The MSVC runtime library doesn't have a definition for wmemchr, so provide an inline implementation. Differential Revision: https://reviews.llvm.org/D98472	2021-03-15 15:30:55 -07:00
Stelios Ioannou	ab86edbc88	[AArch64] Implement __rndr, __rndrrs intrinsics This patch implements the __rndr and __rndrrs intrinsics to provide access to the random number instructions introduced in Armv8.5-A. They are only defined for the AArch64 execution state and are available when __ARM_FEATURE_RNG is defined. These intrinsics store the random number in their pointer argument and return a status code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter intrinsic reseeds the random number generator. The instructions write the NZCV flags indicating the success of the operation that we can then read with a CSET. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics [2] https://bugs.llvm.org/show_bug.cgi?id=47838 Differential Revision: https://reviews.llvm.org/D98264 Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc	2021-03-15 17:51:48 +00:00
Thomas Preud'homme	f60b35340f	Stop traping on sNaN in __builtin_isinf __builtin_isinf currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: mibintc Differential Revision: https://reviews.llvm.org/D97125	2021-03-15 15:38:08 +00:00
Nikita Popov	42eb658f65	[OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC) This removes some (but not all) uses of type-less CreateGEP() and CreateInBoundsGEP() APIs, which are incompatible with opaque pointers. There are a still a number of tricky uses left, as well as many more variation APIs for CreateGEP.	2021-03-12 21:01:16 +01:00
Nikita Popov	46354bac76	[OpaquePtrs] Remove some uses of type-less CreateLoad APIs (NFC) Explicitly pass loaded type when creating loads, in preparation for the deprecation of these APIs. There are still a couple of uses left.	2021-03-11 14:40:57 +01:00
Nikita Popov	68e01339cc	[CGBuilder] Remove type-less CreateAlignedLoad() APIs (NFC) These are incompatible with opaque pointers. This is in preparation of dropping this API on the IRBuilder side as well. Instead explicitly pass the loaded type.	2021-03-11 10:41:23 +01:00
Zakk Chen	d6a0560bf2	[Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics. Demonstrate how to generate vadd/vfadd intrinsic functions 1. add -gen-riscv-vector-builtins for clang builtins. 2. add -gen-riscv-vector-builtin-codegen for clang codegen. 3. add -gen-riscv-vector-header for riscv_vector.h. It also generates ifdef directives with extension checking, base on D94403. 4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h. Generate overloading version Header for generic api. https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface 5. update tblgen doc for riscv related options. riscv_vector.td also defines some unused type transformers for vadd, because I think it could demonstrate how tranfer type work and we need them for the whole intrinsic functions implementation in the future. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D95016	2021-03-10 18:43:43 -08:00
Jingu Kang	25951c5ab8	[AArch64] Add missing intrinsics for scalar FP rounding Differential Revision: https://reviews.llvm.org/D98269	2021-03-10 13:22:29 +00:00
Jingu Kang	9b302513f6	[AArch64] Add missing intrinsics for vrnd	2021-03-05 11:26:12 +00:00
Thomas Preud'homme	b7aeece47c	Revert "Stop traping on sNaN in __builtin_isinf" This reverts commit `1b6eb56aa0` because the invert logic for isfinite is incorrect.	2021-03-04 12:07:35 +00:00
Soumi Manna	eec7f8f7b1	[WebAssembly] Add missing default cases in switch statements unsigned variable 'IntNo' has been declared but not been defined inside function EmitWebAssemblyBuiltinExpr(). static code analysis tool complains about uninitialized variable "IntNo" since this enters to default branch without setting any intrinsics and calls Function *Callee = CGM.getIntrinsic(IntNo). This patch fixes the problem by adding default cases in switch statements.	2021-03-03 13:15:23 -08:00
Thomas Preud'homme	1b6eb56aa0	Stop traping on sNaN in __builtin_isinf __builtin_isinf currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: mibintc Differential Revision: https://reviews.llvm.org/D97125	2021-03-02 15:54:56 +00:00
Hsiangkai Wang	1a35a1b074	[RISCV] Add vadd with mask and without mask builtin. Demonstrate how to add RISC-V V builtins and lower them to IR intrinsics for V extension. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Differential Revision: https://reviews.llvm.org/D93446	2021-02-24 07:57:31 +08:00
Ryan Santhiraraja	2c25efcbd3	[AArch64] Adding SHA3 Intrinsics support This patch adds the following SHA3 Intrinsics: vsha512hq_u64, vsha512h2q_u64, vsha512su0q_u64, vsha512su1q_u64 veor3q_u8 veor3q_u16 veor3q_u32 veor3q_u64 veor3q_s8 veor3q_s16 veor3q_s32 veor3q_s64 vrax1q_u64 vxarq_u64 vbcaxq_u8 vbcaxq_u16 vbcaxq_u32 vbcaxq_u64 vbcaxq_s8 vbcaxq_s16 vbcaxq_s32 vbcaxq_s64 Note need to include +sha3 and +crypto when building from the front-end Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D96381	2021-02-22 12:09:20 +00:00
Christopher Tetreault	55448ab540	[AArch64] Adding Neon Polynomial vadd Intrinsics This patch adds the following intrinsics: vadd_p8 vadd_p16 vadd_p64 vaddq_p8 vaddq_p16 vaddq_p64 vaddq_p128 Reviewed By: t.p.northover, DavidSpickett, ctetreau Differential Revision: https://reviews.llvm.org/D96825	2021-02-19 14:48:12 -08:00
Pengxuan Zheng	0ec32f1326	Revert "[AArch64] Adding Neon Polynomial vadd Intrinsics" Revert the patch due to buildbot failures. This reverts commit `d9645059c5`.	2021-02-18 12:38:16 -08:00
Pengxuan Zheng	d9645059c5	[AArch64] Adding Neon Polynomial vadd Intrinsics This patch adds the following intrinsics: vadd_p8 vadd_p16 vadd_p64 vaddq_p8 vaddq_p16 vaddq_p64 vaddq_p128 Reviewed By: t.p.northover, DavidSpickett Differential Revision: https://reviews.llvm.org/D96825	2021-02-18 11:33:24 -08:00
Jonas Paulsson	e57bd1ff4f	[CFE, SystemZ] New target hook testFPKind() for checks of FP values. The recent commit `00a6254` "Stop traping on sNaN in builtin_isnan" changed the lowering in constrained FP mode of builtin_isnan from an FP comparison to integer operations to avoid trapping. SystemZ has a special instruction "Test Data Class" which is the preferred way to do this check. This patch adds a new target hook "testFPKind()" that lets SystemZ emit the s390_tdc intrinsic instead. testFPKind() takes the BuiltinID as an argument and is expected to soon handle more opcodes than just 'builtin_isnan'. Review: Thomas Preud'homme, Ulrich Weigand Differential Revision: https://reviews.llvm.org/D96568	2021-02-18 12:36:46 -06:00
Wang, Pengfei	61da20575d	[X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) This is a follow up of D92940. We have successfully converted fadd/fmul _mm_reduce_* intrinsics to llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax too, i.e. llvm.reduction + nnan flag. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93179	2021-02-15 08:52:06 +08:00
Pengxuan Zheng	61cca0f2e5	[AArch64] Adding Neon Sm3 & Sm4 Intrinsics This adds SM3 and SM4 Intrinsics support for AArch64, specifically: vsm3ss1q_u32 vsm3tt1aq_u32 vsm3tt1bq_u32 vsm3tt2aq_u32 vsm3tt2bq_u32 vsm3partw1q_u32 vsm3partw2q_u32 vsm4eq_u32 vsm4ekeyq_u32 Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D95655	2021-02-11 14:20:20 -08:00
Wang, Pengfei	dd2460ed5d	[X86] Always assign reassoc flag for intrinsics reduce_add/mul_ps/pd. Intrinsics reduce_add/mul_ps/pd have assumption that the elements in the vector are reassociable. So we need to always assign the reassoc flag when we call _mm_reduce_* intrinsics. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D96231	2021-02-09 21:14:06 +08:00
Thomas Preud'homme	00a62547da	Stop traping on sNaN in __builtin_isnan __builtin_isnan currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: kpn Differential Revision: https://reviews.llvm.org/D95948	2021-02-05 18:28:48 +00:00
Kevin P. Neal	81b69879c9	[FPEnv][X86] Platform builtins edition: clang should get from the AST the metadata for constrained FP builtins Currently clang is not correctly retrieving from the AST the metadata for constrained FP builtins. This patch fixes that for the X86 specific builtins. Differential Revision: https://reviews.llvm.org/D94614	2021-02-03 11:49:17 -05:00
Thomas Lively	4b68b64dcc	[WebAssembly] Prototype i8x16 to i32x4 widening instructions As proposed in https://github.com/WebAssembly/simd/pull/395 and matching the opcodes used in V8: https://chromium-review.googlesource.com/c/v8/v8/+/2617385/4/src/wasm/wasm-opcodes.h Differential Revision: https://reviews.llvm.org/D95557	2021-01-28 10:59:32 -08:00
Thomas Lively	11802eced5	[WebAssembly] Prototype new f64x2 conversions As proposed in https://github.com/WebAssembly/simd/pull/383. Differential Revision: https://reviews.llvm.org/D95012	2021-01-20 11:28:06 -08:00
Qiu Chaofan	168be42083	[Clang] Mutate long-double math builtins into f128 under IEEE-quad Under -mabi=ieeelongdouble on PowerPC, IEEE-quad floating point semantic is used for long double. This patch mutates call to related builtins into f128 version on PowerPC. And in theory, this should be applied to other targets when their backend supports IEEE 128-bit style libcalls. GCC already has these mutations except nansl, which is not available on PowerPC along with other variants (nans, nansf). Reviewed By: RKSimon, nemanjai Differential Revision: https://reviews.llvm.org/D92080	2021-01-15 16:56:20 +08:00
Lucas Prates	2b1e25befe	[AArch64] Adding ACLE intrinsics for the LS64 extension This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`, and `__arm_st64bv0`. These are selected into the LS64 instructions LD64B, ST64B, ST64BV and ST64BV0, respectively. Based on patches written by Simon Tatham. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D93232	2021-01-14 09:43:58 +00:00
Heejin Ahn	7be271537e	[WebAssembly] Rename wasm_rethrow_in_catch intrinsic/builtin `wasm_rethrow_in_catch` intrinsic and builtin are used in order to rethrow an exception when the exception is caught but there is no matching clause within the current `catch`. For example, ``` try { foo(); } catch (int n) { ... } ``` If the caught exception does not correspond to C++ `int` type, it should be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch` because at the time I thought there would be another intrinsic for C++'s `throw` keyword, which rethrows an exception. It turned out that `throw` keyword doesn't require wasm's `rethrow` instruction, so we rename `rethrow_in_catch` to just `rethrow` here. Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D94038	2021-01-08 06:55:04 -08:00
Wang, Pengfei	c102b9697b	[X86] Correct the comments about comparison intrinsics. NFCI.	2021-01-08 15:36:15 +08:00
Thomas Lively	497026c902	[WebAssembly] Prototype prefetch instructions As proposed in https://github.com/WebAssembly/simd/pull/352 and using the opcodes used in the V8 prototype: https://chromium-review.googlesource.com/c/v8/v8/+/2543167. These instructions are only usable via intrinsics and clang builtins to make them opt-in while they are being benchmarked. Differential Revision: https://reviews.llvm.org/D93883	2021-01-05 11:32:03 -08:00
Thorsten Schütt	2fd11e0b1e	Revert "[NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II)" This reverts commit `efc82c4ad2`.	2021-01-04 23:17:45 +01:00
Thorsten Schütt	efc82c4ad2	[NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II) Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D93765	2021-01-04 22:58:26 +01:00
Brandon Bergren	6cee9d0cf8	[PowerPC] Support powerpcle target in Clang [3/5] Add powerpcle support to clang. For FreeBSD, assume a freestanding environment for now, as we only need it in the first place to build loader, which runs in the OpenFirmware environment instead of the FreeBSD environment. For Linux, recognize glibc and musl environments to match current usage in Void Linux PPC. Adjust driver to match current binutils behavior regarding machine naming. Adjust and expand tests. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D93919	2021-01-02 12:17:58 -06:00
Juneyoung Lee	9b29610228	Use unary CreateShuffleVector if possible As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923	2020-12-30 22:36:08 +09:00
Thomas Lively	5e09e9979b	[WebAssembly] Prototype extending pairwise add instructions As proposed in https://github.com/WebAssembly/simd/pull/380. This commit makes the new instructions available only via clang builtins and LLVM intrinsics to make their use opt-in while they are still being evaluated for inclusion in the SIMD proposal. Depends on D93771. Differential Revision: https://reviews.llvm.org/D93775	2020-12-28 14:11:14 -08:00
Tom Stellard	3203143f13	CodeGen: Improve generated IR for __builtin_mul_overflow(uint, uint, int) Add a special case for handling __builtin_mul_overflow with unsigned inputs and a signed output to avoid emitting the __muloti4 library call on x86_64. __muloti4 is not implemented in libgcc, so avoiding this call fixes compilation of some programs that call __builtin_mul_overflow with these arguments. For example, this fixes the build of cpio with clang, which includes code from gnulib that calls __builtin_mul_overflow with these argument types. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D84405	2020-12-17 14:30:31 -08:00
Baptiste Saleil	c2892978e9	[PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_ On PPC, the vector pair instructions are independent from MMA. This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names. We also move the vector pair type/intrinsic/builtin tests to their own files. Differential Revision: https://reviews.llvm.org/D91974	2020-12-17 13:19:27 -05:00
Simon Pilgrim	4855a1004d	[X86] Convert fadd/fmul _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Followup to D87604, having confirmed on PR47506 that we can use the llvm codegen expansion for fadd/fmul as well. Differential Revision: https://reviews.llvm.org/D92940	2020-12-13 15:37:35 +00:00
Melanie Blower	320af6b138	Create SPIRABIInfo to enable SPIR_FUNC calling convention. Background: Call to library arithmetic functions for div is emitted by the compiler and it set wrong “C” calling convention for calls to these functions, whereas library functions are declared with `spir_function` calling convention. InstCombine optimization replaces such calls with “unreachable” instruction. It looks like clang lacks SPIRABIInfo class which should specify default calling conventions for “system” function calls. SPIR supports only SPIR_FUNC and SPIR_KERNEL calling convention. Reviewers: Erich Keane, Anastasia Differential Revision: https://reviews.llvm.org/D92721	2020-12-12 05:48:20 -08:00
Florian Hahn	9c4cddb53a	[Clang] Add vcmla and rotated variants for Arm ACLE. This patch adds vcmla and the rotated variants as defined in "Arm Neon Intrinsics Reference for ACLE Q3 2020" [1] The _lane_ are still missing, but they can be added separately. This patch only adds the builtin mapping for AArch64. [1] https://developer.arm.com/documentation/ihi0073/latest Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D92930	2020-12-10 16:54:08 +00:00
Kevin P. Neal	abfbc5579b	[FPEnv] clang should get from the AST the metadata for constrained FP builtins Currently clang is not correctly retrieving from the AST the metadata for constrained FP builtins. This patch fixes that for the non-target specific builtins. Differential Revision: https://reviews.llvm.org/D92122	2020-11-30 11:59:37 -05:00
Reid Kleckner	1e843a987d	[MS] Add more 128bit cmpxchg intrinsics for AArch64 The MSVC STL for requires this on ARM64. Requested in https://llvm.org/pr47099 Depends on D92061 Differential Revision: https://reviews.llvm.org/D92062	2020-11-25 12:07:28 -08:00
Reid Kleckner	3bd0672726	[MS] Fix double evaluation of MSVC builtin arguments This code got quite twisted because we consider some MSVC builtins to be target agnostic, and some to be target specific. Target specific intrinsics have a pattern of doing up-front argument evaluation, while general intrinsics do not evaluate their arguments up front. As we tried to share codepaths between the target-specific and target-agnostic handling, we ended up doing double evaluation. Instead, have each target handle MSVC intrinsics consistently before up front argument evaluation. This requires passing less data around and is more consistent with target independent intrinsic handling. See D50979 for past examples of this bug. I noticed this while looking into adding some more intrinsics. Differential Revision: https://reviews.llvm.org/D92061	2020-11-25 11:55:01 -08:00
Florian Hahn	ca2e7e5999	[IRGen] Add !annotation metadata for auto-init stores. This patch updates Clang's IRGen to add !annotation nodes with an "auto-init" annotation to all stores for auto-initialization. As discussed in 'RFC: Combining Annotation Metadata and Remarks' (http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html) this allows using optimization remarks to track down where auto-init code was inserted (and not removed by optimizations). There are a few cases in the tests where !annotation gets dropped by optimizations. Those optimizations will be updated in subsequent patches. This patch is based on a patch by Francis Visoiu Mistrih. Reviewed By: thegameg, paquette Differential Revision: https://reviews.llvm.org/D91417	2020-11-16 10:37:02 +00:00
Mehdi Amini	42e88bd6b1	Replace sequences of v.push_back(v[i]); v.erase(&v[i]); with std::rotate (NFC) The code has a few sequence that looked like: Ops.push_back(Ops[0]); Ops.erase(Ops.begin()); And are equivalent to: std::rotate(Ops.begin(), Ops.begin() + 1, Ops.end()); The latter has the advantage of never reallocating the vector, which would be a bug in the original code as push_back would read from the memory it deallocated.	2020-11-14 00:55:33 +00:00
Heejin Ahn	902ea588ea	[WebAssembly] Rename atomic.notify and *.atomic.wait - atomic.notify -> memory.atomic.notify - i32.atomic.wait -> memory.atomic.wait32 - i64.atomic.wait -> memory.atomic.wait64 See https://github.com/WebAssembly/threads/pull/149. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D91447	2020-11-13 12:04:48 -08:00
Baptiste Saleil	3f78605a8c	[PowerPC] Add paired vector load and store builtins and intrinsics This patch adds the Clang builtins and LLVM intrinsics to load and store vector pairs. Differential Revision: https://reviews.llvm.org/D90799	2020-11-13 12:35:10 -06:00
Qiu Chaofan	7faf62a80b	[Clang] Add more fp128 math library function builtins Since glibc has supported math library functions conforming IEEE 128-bit floating point types on some platform (like ppc64le), we can fix clang's math builtins missing this type. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D90593	2020-11-04 17:58:42 +08:00
Baptiste Saleil	daa127d77e	[PowerPC] Add MMA builtin decoding and definitions Add MMA builtin decoding. These builtins use the new PowerPC-specific types __vector_pair and __vector_quad. So to avoid pervasive changes, we use custom type descriptors and custom decoding for these builtins. We also use custom code generation to expand builtin calls with pointers to simpler intrinsic calls with non-pointer types. Differential Revision: https://reviews.llvm.org/D81748	2020-11-03 15:08:46 -06:00
Thomas Lively	a787e09779	[WebAssembly] Prototype i64x2.bitmask As proposed in https://github.com/WebAssembly/simd/pull/368. Differential Revision: https://reviews.llvm.org/D90514	2020-10-30 17:23:30 -07:00
Thomas Lively	0a512a555a	[WebAssembly] Prototype i64x2.eq As proposed in https://github.com/WebAssembly/simd/pull/381. Since it is still in the prototyping phase, it is only accessible via a target builtin function and a target intrinsic. Depends on D90504. Differential Revision: https://reviews.llvm.org/D90508	2020-10-30 16:38:15 -07:00
Thomas Lively	1cb0b56607	[WebAssembly] Prototype i64x2.widen_{low,high}_i32x4_{s,u} As proposed in https://github.com/WebAssembly/simd/pull/290. As usual, these instructions are available only via builtin functions and intrinsics while they are in the prototyping stage. Differential Revision: https://reviews.llvm.org/D90504	2020-10-30 15:44:04 -07:00
Thomas Lively	be6f50798e	[WebAssembly] Implement SIMD signselect instructions As proposed in https://github.com/WebAssembly/simd/pull/124, using the opcodes adopted by V8 in https://chromium-review.googlesource.com/c/v8/v8/+/2486235/2/src/wasm/wasm-opcodes.h. Uses new builtin functions and a new target intrinsic exclusively to ensure that the new instructions are only emitted when a user explicitly opts in to using them since they are still in the prototyping and evaluation phase. Differential Revision: https://reviews.llvm.org/D90357	2020-10-29 11:06:20 -07:00
Jon Chesterfield	dee7704829	[AMDGPU] Add __builtin_amdgcn_grid_size [AMDGPU] Add __builtin_amdgcn_grid_size Similar to D76772, loads the data from the dispatch pointer. Marked invariant. Patch also updates the openmp devicertl to use this builtin. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D90251	2020-10-29 16:25:13 +00:00
Thomas Lively	5b464f2aa5	[WebAssembly] Fix incorrectly named target builtin Rename __builtin_wasm_q15mulr_saturate_s_i8x16 to __builtin_wasm_q15mulr_saturate_s_i16x8, fixing the implied lane interpretation of the result.	2020-10-28 10:22:43 -07:00
Heejin Ahn	98941279b9	[WebAssembly] Clang-format builtins generation (NFC) Differential Revision: https://reviews.llvm.org/D90294	2020-10-28 10:01:21 -07:00
Thomas Lively	31e944556f	[WebAssembly] Prototype extending multiplication SIMD instructions As proposed in https://github.com/WebAssembly/simd/pull/376. This commit implements new builtin functions and intrinsics for these instructions, but does not yet add them to wasm_simd128.h because they have not yet been merged to the proposal. These are the first instructions with opcodes greater than 0xff, so this commit updates the MC layer and disassembler to handle that correctly. Differential Revision: https://reviews.llvm.org/D90253	2020-10-28 09:38:59 -07:00
Tyker	d3205bbca3	[Annotation] Allows annotation to carry some additional constant arguments. This allows using annotation in a much more contexts than it currently has. especially when annotation with template or constexpr. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D88645	2020-10-26 10:50:05 +01:00
Caroline Concatto	e8d9ee9c7c	[SVE][CodeGen]Use getFixedSize() function for TypeSize comparison in clang This patch makes sure that the instance of TypeSize comparison operator is done with a fixed type size. Differential Revision: https://reviews.llvm.org/D89312	2020-10-16 10:56:39 +01:00
Fangrui Song	5a338599fb	[CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting rL131311 added `asm()` support for builtin functions, but `asm()` for builtins with specialized emitting (e.g. memcpy, various math functions) still do not work. This patch makes these functions work for `asm()` and `#pragma redefine_extname`. glibc uses `asm()` to redirect internal libc function calls to hidden aliases. Limitation: such a function is a builtin in clang, but will not be recognized as a libcall in optimization passes because Clang does not annotate the renamed function as a libcall. In GCC -O1 or above, `abs` can be optimized out but we can't. Additionally, we cannot redirect `__builtin_sin` to `real_sin` in the following example: double sin(double x) asm("real_sin"); double f(double d) { return __builtin_sin(d); } --- According to @rsmith, the following three statements cannot be simultaneously true: (1) The frontend function foo has known, builtin semantics X. (2) The symbol foo has known, builtin semantics X. (3) It's not correct to lower a call to the frontend function foo to the symbol foo. People do want (1) (if it is profitable to expand a memcpy, do it). This also means that people do not want to add -fno-builtin-memcpy. People do want (3): that is why they use asm("__GI_memcpy") in the first place. So unfortunately we make a compromise by not refuting (2) (see the limitation above). For most libcalls, there is a small loss because compilers don't synthesize them. For the few glibc cares about, it uses `asm("memcpy = __GI_memcpy");` to make the assembly level redirection. (Changing function names (e.g. `__memcpy`) is a hit to ergonomics which is not acceptable). Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D88712	2020-10-15 15:14:38 -07:00
Thomas Lively	1992e30c2d	[WebAssembly] Prototype i8x16.popcnt As proposed at https://github.com/WebAssembly/simd/pull/379. Use a target builtin and intrinsic rather than normal codegen patterns to make the instruction opt-in until it is merged to the proposal and stabilized in engines. Differential Revision: https://reviews.llvm.org/D89446	2020-10-15 21:18:22 +00:00
Thomas Lively	3f738d1f5e	Reland "[WebAssembly] v128.load{8,16,32,64}_lane instructions" This reverts commit `7c8385a352` with a typing fix to an instruction selection pattern.	2020-10-15 19:32:34 +00:00
Thomas Lively	7c8385a352	Revert "[WebAssembly] v128.load{8,16,32,64}_lane instructions" This reverts commit `7c6bfd90ab`.	2020-10-15 15:49:36 +00:00
Thomas Lively	7c6bfd90ab	[WebAssembly] v128.load{8,16,32,64}_lane instructions Prototype the newly proposed load_lane instructions, as specified in https://github.com/WebAssembly/simd/pull/350. Since these instructions are not available to origin trial users on Chrome stable, make them opt-in by only selecting them from intrinsics rather than normal ISel patterns. Since we only need rough prototypes to measure performance right now, this commit does not implement all the load and store patterns that would be necessary to make full use of the offset immediate. However, the full suite of offset tests is included to make it easy to track improvements in the future. Since these are the first instructions to have a memarg immediate as well as an additional immediate, the disassembler needed some additional hacks to be able to parse them correctly. Making that code more principled is left as future work. Differential Revision: https://reviews.llvm.org/D89366	2020-10-15 15:33:10 +00:00
Simon Pilgrim	d7fa9030d4	[CodeGen][X86] Emit fshl/fshr ir intrinsics for shiftleft128/shiftright128 ms intrinsics Now that funnel shift handling is pretty good, we can use the intrinsics directly and avoid a lot of zext/trunc issues. https://godbolt.org/z/YqhnnM Differential Revision: https://reviews.llvm.org/D89405	2020-10-15 10:22:41 +01:00
Simon Pilgrim	6c23cbc560	[X86] Convert integer _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Emit the equivalent integer reduction intrinsics in IR instead of expanding to shuffle+arithmetic sequences. The fadd/fmul reductions might be trickier as they assume a similar bisection reduction while the generic intrinsics assume a sequential reduction (intel docs are ambiguous on the correct approach) - I'm not sure if we want to always tag them with reassoc? Anyway, that issue can wait until a separate fp patch along with the fmin/fmax reductions. Differential Revision: https://reviews.llvm.org/D87604	2020-10-13 09:28:39 +01:00
Thomas Lively	d8f58bf53a	[WebAssembly] Prototype i16x8.q15mulr_sat_s This saturating, rounding, Q-format multiplication instruction is proposed in https://github.com/WebAssembly/simd/pull/365. Differential Revision: https://reviews.llvm.org/D88968	2020-10-09 21:17:53 +00:00
Craig Topper	a02b449bb1	[X86] Sync AESENC/DEC Key Locker builtins with gcc. For the wide builtins, pass a single input and output pointer to the builtins. Emit the GEPs and input loads from CGBuiltin.	2020-10-04 12:09:41 -07:00
Craig Topper	230c57b0bd	[X86] Synchronize the encodekey builtins with gcc. Don't assume void* is 16 byte aligned. We were taking multiple pointer arguments in the builtin. gcc accepts a single void. The cast from void to _m128i* caused the IR generation to assume the pointer was aligned. Instead make the builtin take a single void, emit i8 GEPs to adjust then cast to <2 x i64>* and perform a store with align of 1.	2020-10-04 12:09:35 -07:00
Richard Smith	8fb2a235b0	Don't reject calls to MinGW's unusual _setjmp declaration. We now recognize this function as a builtin despite it having an unexpected number of parameters; make sure we don't enforce that it has only 1 argument for its 2 parameters.	2020-10-02 15:12:15 -07:00
Xiang1 Zhang	413577a879	[X86] Support Intel Key Locker Key Locker provides a mechanism to encrypt and decrypt data with an AES key without having access to the raw key value by converting AES keys into “handles”. These handles can be used to perform the same encryption and decryption operations as the original AES keys, but they only work on the current system and only until they are revoked. If software revokes Key Locker handles (e.g., on a reboot), then any previous handles can no longer be used. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D88398	2020-09-30 18:08:45 +08:00
Craig Topper	288c5776c9	[X86] Use inlineasm flag output for the _bittest* intrinsics. Instead of expliciting emitting a setc in the inline asm instructions, we can use flag output. This allows the backend to use the flag directly if it is needed by a branch. Previously we needed a test instruction to convert the register back to a flag. If the flag can't be used directly, the backend will emit a setcc. Differential Revision: https://reviews.llvm.org/D87888	2020-09-28 13:33:22 -07:00
David Sherwood	bafdd11326	[SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy After some recent upstream discussion we decided that it was best to avoid having the / operator for both ElementCount and TypeSize, since this could give the impression that these classes can be used in the same way as basic integer integer types. However, division for scalable types is a bit odd because we are only dividing the minimum quantity by a value, as opposed to something like: (MinSize * Vscale) / SomeValue This is why when performing division it's important the caller first establishes whether the operation makes sense, perhaps by calling isKnownMultipleOf() prior to division. The caller must now explictly call divideCoefficientBy() on the class to perform the operation. Differential Revision: https://reviews.llvm.org/D87700	2020-09-28 08:03:00 +01:00
Amy Kwan	6b136b19cb	[Power10] Implement custom codegen for the vec_replace_elt and vec_replace_unaligned builtins. This patch implements custom codegen for the vec_replace_elt and vec_replace_unaligned builtins. These builtins map to the @llvm.ppc.altivec.vinsw and @llvm.ppc.altivec.vinsd intrinsics depending on the arguments. The main motivation for doing custom codegen for these intrinsics is because there are float and double versions of the builtin. Normally, the converting the float to an integer would be done via fptoui in the IR. This is incorrect as fptoui truncates the value and we must ensure the value is not truncated. Therefore, we provide custom codegen to utilize bitcast instead as bitcasts do not truncate. Differential Revision: https://reviews.llvm.org/D83500	2020-09-23 22:55:25 -05:00
Craig Topper	d9717d8ee7	[X86] Add a memory clobber to the bittest intrinsic inline asm. Get default clobbers from the target I believe the inline asm emitted here should have a memory clobber since it writes to memory. It was also missing the dirflag clobber that we use by default along with flags and fpsr. To avoid missing defaults in the future, get the default list from the target Differential Revision: https://reviews.llvm.org/D88121	2020-09-23 14:54:39 -07:00
Stanislav Mekhanoshin	59691dc874	[AMDGPU] Make ds fp atomics overloadable Differential Revision: https://reviews.llvm.org/D87947	2020-09-23 11:39:50 -07:00
Simon Pilgrim	9eab73fa17	[X86] Update SSE/AVX integer MINMAX intrinsics to emit llvm.smax.* etc. (PR46851) We're now getting close to having the necessary analysis/combines etc. for the new generic llvm smax/smin/umax/umin intrinsics. This patch updates the SSE/AVX integer MINMAX intrinsics to emit the generic equivalents instead of the icmp+select code pattern. Differential Revision: https://reviews.llvm.org/D87603	2020-09-15 11:19:08 +01:00
Qiu Chaofan	88ff4d2ca1	[PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering In standard C library, both rint and nearbyint returns rounding result in current rounding mode. But nearbyint never raises inexact exception. On PowerPC, x(v\|s)r(d\|s)pic may modify FPSCR XX, raising inexact exception. So we can't select constrained fnearbyint into xvrdpic. One exception here is xsrqpi, which will not raise inexact exception, so fnearbyint f128 is okay here. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87220	2020-09-09 22:40:58 +08:00
Simon Pilgrim	2853ae3c1b	[X86] Update SSE/AVX ABS intrinsics to emit llvm.abs.* (PR46851) We're now getting close to having the necessary analysis/combines etc. for the new generic llvm.abs.* intrinsics. This patch updates the SSE/AVX ABS vector intrinsics to emit the generic equivalents instead of the icmp+sub+select code pattern. Differential Revision: https://reviews.llvm.org/D87101	2020-09-07 13:54:12 +01:00
Simon Pilgrim	a8a91533dd	[X86] Replace EmitX86AddSubSatExpr with EmitX86BinaryIntrinsic generic helper. NFCI. Feed the Intrinsic::ID value directly instead of via the IsSigned/IsAddition bool flags.	2020-09-07 13:33:48 +01:00
David Sherwood	f4257c5832	[SVE] Make ElementCount members private This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue(). Differential Revision: https://reviews.llvm.org/D86065	2020-08-28 14:43:53 +01:00
Mikhail Maltsev	ae1396c7d4	[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics This patch adjusts the following ARM/AArch64 LLVM IR intrinsics: - neon_bfmmla - neon_bfmlalb - neon_bfmlalt so that they take and return bf16 and float types. Previously these intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from implementation lacking bf16 IR type). The neon_vbfdot[q] intrinsics are adjusted similarly. This change required some additional selection patterns for vbfdot itself and also for vector shuffles (in a previous patch) because of SelectionDAG transformations kicking in and mangling the original code. This patch makes the generated IR cleaner (less useless bitcasts are produced), but it does not affect the final assembly. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D86146	2020-08-27 18:43:16 +01:00
Christopher Tetreault	19e883fc59	[SVE] Remove calls to VectorType::getNumElements from clang Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D82582	2020-08-26 11:12:26 -07:00
Wang, Pengfei	9512525947	[X86][FPEnv] Teach X86 mask compare intrinsics to respect strict FP semantics. When we use mask compare intrinsics under strict FP option, the masked elements shouldn't raise any exception. So, we cann't replace the intrinsic with a full compare + "and" operation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D85385	2020-08-11 10:28:41 +08:00
Yonghong Song	00602ee7ef	BPF: simplify IR generation for __builtin_btf_type_id() This patch simplified IR generation for __builtin_btf_type_id(). For __builtin_btf_type_id(obj, flag), previously IR builtin looks like if (obj is a lvalue) llvm.bpf.btf.type.id(obj.ptr, 1, flag) !type else llvm.bpf.btf.type.id(obj, 0, flag) !type The purpose of the 2nd argument is to differentiate __builtin_btf_type_id(obj, flag) where obj is a lvalue vs. __builtin_btf_type_id(obj.ptr, flag) Note that obj or obj.ptr is never used by the backend and the `obj` argument is only used to derive the type. This code sequence is subject to potential llvm CSE when - obj is the same .e.g., nullptr - flag is the same - metadata type is different, e.g., typedef of struct "s" and strust "s". In the above, we don't want CSE since their metadata is different. This patch change IR builtin to llvm.bpf.btf.type.id(seq_num, flag) !type and seq_num is always increasing. This will prevent potential llvm CSE. Also report an error if the type name is empty for remote relocation since remote relocation needs non-empty type name to do relocation against vmlinux. Differential Revision: https://reviews.llvm.org/D85174	2020-08-04 16:29:42 -07:00
Yonghong Song	6d67506964	[clang][BPF] support type exist/size and enum exist/value relocations This patch added the following additional compile-once run-everywhere (CO-RE) relocations: - existence/size of typedef, struct/union or enum type - enum value and enum value existence These additional relocations will make CO-RE bpf programs more adaptive for potential kernel internal data structure changes. For existence/size relocations, the following two code patterns are supported: 1. uint32_t __builtin_preserve_type_info((<type> )0, flag); 2. <type> var; uint32_t __builtin_preserve_field_info(var, flag); flag = 0 for existence relocation and flag = 1 for size relocation. For enum value existence and enum value relocations, the following code pattern is supported: uint64_t __builtin_preserve_enum_value((<enum_type> )<enum_value>, flag); flag = 0 means existence relocation and flag = 1 for enum value. relocation. In the above <enum_type> can be an enum type or a typedef to enum type. The <enum_value> needs to be an enumerator value from the same enum type. The return type is uint64_t to permit potential 64bit enumerator values. Differential Revision: https://reviews.llvm.org/D83242	2020-08-04 08:39:53 -07:00
Thomas Lively	cb32792210	[WebAssembly] Implement prototype v128.load{32,64}_zero instructions Specified in https://github.com/WebAssembly/simd/pull/237, these instructions load the first vector lane from memory and zero the other lanes. Since these instructions are not officially part of the SIMD proposal, they are only available on an opt-in basis via LLVM intrinsics and clang builtin functions. If these instructions are merged to the proposal, this implementation will change so that the instructions will be generated from normal IR. At that point the intrinsics and builtin functions would be removed. This PR also changes the opcodes for the experimental f32x4.qfm{a,s} instructions because their opcodes conflicted with those of the v128.load{32,64}_zero instructions. The new opcodes were chosen to match those used in V8. Differential Revision: https://reviews.llvm.org/D84820	2020-08-03 13:54:00 -07:00
Eli Friedman	8dfb5d767e	[clang codegen][AArch64] Use llvm.aarch64.neon.fcvtzs/u where it's necessary fptosi/fptoui have similar, but not identical, semantics. In particular, the behavior on overflow is different. Fixes https://bugs.llvm.org/show_bug.cgi?id=46844 for 64-bit. (The corresponding patch for 32-bit is more involved because the equivalent intrinsics don't exist, as far as I can tell.) Differential Revision: https://reviews.llvm.org/D84703	2020-07-30 15:41:54 -07:00
Thomas Lively	11bb7eef41	[WebAssembly] Remove intrinsics for SIMD widening ops Instead, pattern match extends of extract_subvectors to generate widening operations. Since extract_subvector is not a legal node, this is implemented via a custom combine that recognizes extract_subvector nodes before they are legalized. The combine produces custom ISD nodes that are later pattern matched directly, just like the intrinsic was. Also removes the clang builtins for these operations since the instructions can now be generated from portable code sequences. Differential Revision: https://reviews.llvm.org/D84556	2020-07-28 18:25:55 -07:00
Richard Smith	6c18f7db73	For PR46800, implement the GCC __builtin_complex builtin. glibc's implementation of the CMPLX macro uses it (with -fgnuc-version set to 4.7 or later).	2020-07-22 13:43:10 -07:00
David Blaikie	36036aa70e	Reapply "Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression" Reapply `49e5f603d4` which had been reverted in `c94332919b`. Originally reverted because I hadn't updated it in quite a while when I got around to committing it, so there were a bunch of missing changes to new code since I'd written the patch. Reviewers: aaron.ballman Differential Revision: https://reviews.llvm.org/D76646	2020-07-21 20:57:12 -07:00
Tim Northover	9697a9e2d3	Fix typo in identifier in assert.	2020-07-15 09:57:53 +01:00
Tim Northover	5165b2b5fd	AArch64+ARM: make LLVM consider system registers volatile. Some of the system registers readable on AArch64 and ARM platforms return different values with each read (for example a timer counter), these shouldn't be hoisted outside loops or otherwise interfered with, but the normal @llvm.read_register intrinsic is only considered to read memory. This introduces a separate @llvm.read_volatile_register intrinsic and maps all system-registers on ARM platforms to use it for the __builtin_arm_rsr calls. Registers declared with asm("r9") or similar are unaffected.	2020-07-15 09:47:36 +01:00
David Blaikie	c94332919b	Revert "Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression" Broke buildbots since I hadn't updated this patch in a while. Sorry for the noise. This reverts commit `49e5f603d4`.	2020-07-12 20:29:19 -07:00
David Blaikie	49e5f603d4	Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression There is a version that just tests (also called isIntegerConstantExpression) & whereas this version is specifically used when the value is of interest (a few call sites were actually refactored to calling the test-only version) so let's make the API look more like it. Reviewers: aaron.ballman Differential Revision: https://reviews.llvm.org/D76646	2020-07-12 19:43:24 -07:00
Craig Topper	b4dbb37f32	[X86] Rename X86_CPU_TYPE_COMPAT_ALIAS/X86_CPU_TYPE_COMPAT/X86_CPU_SUBTYPE_COMPAT macros. NFC Remove _COMPAT. Drop the ARCHNAME. Remove the non-COMPAT versions that are no longer needed. We now only use these macros in places where we need compatibility with libgcc/compiler-rt. So we don't need to call out _COMPAT specifically.	2020-07-12 17:00:24 -07:00
Jennifer Yu	6cf0dac1ca	orrectly generate invert xor value for Binary Atomics of int size > 64 When using __sync_nand_and_fetch with __int128, a problem is found that the wrong value for the 'invert' value gets emitted to the xor in case where the int size is greater than 64 bits. This is because uses of llvm::ConstantInt::get which zero extends the greater than 64 bits, so instead -1 that we require, it end up getting 18446744073709551615 This patch replaces the call to llvm::ConstantInt::get with the call to llvm::Constant::getAllOnesValue which works for all integer types. Reviewers: jfp, erichkeane, rjmccall, hfinkel Differential Revision: https://reviews.llvm.org/D82832	2020-07-07 10:20:14 -07:00
Wouter van Oortmerssen	16d83c395a	[WebAssembly] Added 64-bit memory.grow/size/copy/fill This covers both the existing memory functions as well as the new bulk memory proposal. Added new test files since changes where also required in the inputs. Also removes unused init/drop intrinsics rather than trying to make them work for 64-bit. Differential Revision: https://reviews.llvm.org/D82821	2020-07-06 12:49:50 -07:00
Craig Topper	3537939cda	[X86] Move frontend CPU feature initialization to a look up table based implementation. NFCI This replaces the switch statement implementation in the clang's X86.cpp with a lookup table in X86TargetParser.cpp. I've used constexpr and copy of the FeatureBitset from SubtargetFeature.h to store the features in a lookup table. After the lookup the bitset is translated into strings for use by the rest of the frontend code. I had to modify the implementation of the FeatureBitset to avoid bugs in gcc 5.5 constexpr handling. It seems to not like the same array entry to be used on the left side and right hand side of an assignment or &= or \|=. I've also used uint32_t instead of uint64_t and sized based on the X86::CPU_FEATURE_MAX. I've initialized the features for different CPUs outside of the table so that we can express inheritance in an adhoc way. This was one of the big limitations of the switch and we had resorted to labels and gotos. Differential Revision: https://reviews.llvm.org/D82731	2020-06-30 12:04:58 -07:00
Francesco Petrogalli	67e4330fac	[sve][acle] Implement some of the C intrinsics for brain float. Summary: The following intrinsics have been extended to support brain float types: svbfloat16_t svclasta[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data) bfloat16_t svclasta[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data) bfloat16_t svlasta[_bf16](svbool_t pg, svbfloat16_t op) svbfloat16_t svclastb[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data) bfloat16_t svclastb[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data) bfloat16_t svlastb[_bf16](svbool_t pg, svbfloat16_t op) svbfloat16_t svdup[_n]_bf16(bfloat16_t op) svbfloat16_t svdup[_n]_bf16_m(svbfloat16_t inactive, svbool_t pg, bfloat16_t op) svbfloat16_t svdup[_n]_bf16_x(svbool_t pg, bfloat16_t op) svbfloat16_t svdup[_n]_bf16_z(svbool_t pg, bfloat16_t op) svbfloat16_t svdupq[_n]_bf16(bfloat16_t x0, bfloat16_t x1, bfloat16_t x2, bfloat16_t x3, bfloat16_t x4, bfloat16_t x5, bfloat16_t x6, bfloat16_t x7) svbfloat16_t svdupq_lane[_bf16](svbfloat16_t data, uint64_t index) svbfloat16_t svinsr[_n_bf16](svbfloat16_t op1, bfloat16_t op2) Reviewers: sdesmalen, kmclaughlin, c-rhodes, ctetreau, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82345	2020-06-29 16:09:08 +00:00
Matt Arsenault	9e03bdebc1	AMDGPU: Add llvm.amdgcn.sqrt intrinsic I spread the GlobalISel test into the regular one, which I've been avoiding so far.	2020-06-26 15:07:07 -04:00
Francesco Petrogalli	7200fa38a9	[sve][acle] Add some C intrinsics for brain float types. Summary: The following intrinsics has been added: svuint16_t svcnt[_bf16]_m(svuint16_t inactive, svbool_t pg, svbfloat16_t op) svuint16_t svcnt[_bf16]_x(svbool_t pg, svbfloat16_t op) svuint16_t svcnt[_bf16]_z(svbool_t pg, svbfloat16_t op) svbfloat16_t svtbl[_bf16](svbfloat16_t data, svuint16_t indices) svbfloat16_t svtbl2[_bf16](svbfloat16x2_t data, svuint16_t indices) svbfloat16_t svtbx[_bf16](svbfloat16_t fallback, svbfloat16_t data, svuint16_t indices) Reviewers: c-rhodes, kmclaughlin, efriedma, sdesmalen, ctetreau Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82429	2020-06-25 16:31:01 +00:00
Andrew Wock	15edd7aaa7	[FPEnv] PowerPC-specific builtin constrained FP enablement This change enables PowerPC compiler builtins to generate constrained floating point operations when clang is indicated to do so. A couple of possibly unexpected backend divergences between constrained floating point and regular behavior are highlighted under the test tag FIXME-CHECK. This may be something for those on the PPC backend to look at. Patch by: Drew Wock <drew.wock@sas.com> Differential Revision: https://reviews.llvm.org/D82020	2020-06-25 11:42:58 -04:00
Cullen Rhodes	05e10ee0ae	[AArch64][SVE2] Add bfloat16 support to whilerw/whilewr intrinsics Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D82399	2020-06-24 10:06:31 +00:00
Cullen Rhodes	fd2c4b8999	[AArch64][SVE] Add bfloat16 support to svlen intrinsic Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D82186	2020-06-24 10:05:51 +00:00
Mikhail Maltsev	3f353a2e5a	[BFloat] Add convert/copy instrinsic support This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a Specifically it adds intrinsic support in clang and llvm for Arm and AArch64. The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Alexandros Lamprineas - Luke Cheeseman - Mikhail Maltsev - Momchil Velikov - Luke Geeson Differential Revision: https://reviews.llvm.org/D80928	2020-06-23 14:27:05 +00:00
Mikhail Maltsev	9c579540ff	[ARM] BFloat MatMul Intrinsics&CodeGen Summary: This patch adds support for BFloat Matrix Multiplication Intrinsics and Code Generation from __bf16 to AArch32. This includes IR intrinsics. Tests are provided as needed. This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Luke Geeson - Momchil Velikov - Mikhail Maltsev - Luke Cheeseman - Simon Tatham Reviewers: stuij, t.p.northover, SjoerdMeijer, sdesmalen, fpetrogalli, LukeGeeson, simon_tatham, dmgreen, MarkMurrayARM Reviewed By: MarkMurrayARM Subscribers: MarkMurrayARM, danielkiss, kristof.beyls, hiraditya, cfe-commits, llvm-commits, chill, miyuki Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81740	2020-06-23 12:06:37 +00:00
Craig Topper	0dfc8e1837	[X86] Remove encoding value from the X86_FEATURE and X86_FEATURE_COMPAT macro. NFCI This was orignally done so we could separate the compatibility values and the llvm internal only features into a separate entries in the feature array. This was needed when we explicitly had to convert the feature into the proper 32-bit chunk at every reference and we didn't want things moving around. Now everything is in an array and we have helper funtions or macros to convert encoding to index. So we renumbering is no longer an issue.	2020-06-22 11:46:21 -07:00
Mikhail Maltsev	3a4feb1d53	[ARM][BFloat] Implement bf16 get/set_lane without casts to i16 vectors Currently, in order to extract an element from a bf16 vector, we cast the vector to an i16 vector, perform the extraction, and cast the result to bfloat. This behavior was copied from the old fp16 implementation. The goal of this patch is to achieve optimal code generation for lane copying intrinsics in a subsequent patch (LLVM fails to fold certain combinations of bitcast, insertelement, extractelement and shufflevector instructions leading to the generation of suboptimal code). Differential Revision: https://reviews.llvm.org/D82206	2020-06-22 17:35:43 +00:00
Zhi Zhuang	37fb860301	Add support of __builtin_expect_with_probability Add a new builtin-function __builtin_expect_with_probability and intrinsic llvm.expect.with.probability. The interface is __builtin_expect_with_probability(long expr, long expected, double probability). It is mainly the same as __builtin_expect besides one more argument indicating the probability of expression equal to expected value. The probability should be a constant floating-point expression and be in range [0.0, 1.0] inclusive. It is similar to builtin-expect-with-probability function in GCC built-in functions. Differential Revision: https://reviews.llvm.org/D79830	2020-06-22 10:21:28 -07:00
Sander de Smalen	ad828e3f4d	[SveEmitter] Add builtins for struct loads/stores (ld2/ld3/etc) The struct store intrinsics in LLVM IR take the individual parts as arguments, so this patch uses the intrinsics used for `svget` to break the tuples into individual parts. Reviewers: c-rhodes, efriedma, ctetreau, david-arm Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D81466	2020-06-19 10:35:42 +01:00
Francesco Petrogalli	3e59dfc301	[llvm][SveEmitter] Emit the bfloat version of `svld1ro`. Summary: The new SVE builtin type __SVBFloat16_t` is used to represent scalable vectors of bfloat elements. Reviewers: sdesmalen, efriedma, stuij, ctetreau, shafik, rengolin Subscribers: tschuett, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81304	2020-06-18 16:36:31 +00:00
Florian Hahn	b5e082e728	[Matrix] Add __builtin_matrix_column_store to Clang. This patch add __builtin_matrix_column_major_store to Clang, as described in clang/docs/MatrixTypes.rst. In the initial version, the stride is not optional yet. Reviewers: rjmccall, jfb, rsmith, Bigcheese Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D72782	2020-06-18 11:39:02 +01:00
Sander de Smalen	4ea8e27a64	[SveEmitter] Add builtins to insert/extract subvectors from tuples (svget/svset) For example: svint32_t svget4(svint32x4_t tuple, uint64_t imm_index) returns the subvector at `index`, which must be in range `0..3`. svint32x3_t svset3(svint32x3_t tuple, uint64_t index, svint32_t vec) returns a tuple vector with `vec` inserted into `tuple` at `index`, which must be in range `0..2`. Reviewers: c-rhodes, efriedma Reviewed By: c-rhodes Tags: #clang Differential Revision: https://reviews.llvm.org/D81464	2020-06-18 11:06:16 +01:00
Florian Hahn	934bcaf10b	[Matrix] Add __builtin_matrix_column_load to Clang. This patch add __builtin_matrix_column_major_load to Clang, as described in clang/docs/MatrixTypes.rst. In the initial version, the stride is not optional yet. Reviewers: rjmccall, rsmith, jfb, Bigcheese Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D72781	2020-06-18 10:47:55 +01:00
Sander de Smalen	1d7b4a7e5e	[SveEmitter] Add builtins for tuple creation (svcreate2/svcreate3/etc) The svcreate builtins allow constructing a tuple from individual vectors, e.g. svint32x2_t svcreate2(svint32_t v2, svint32_t v2)` Reviewers: c-rhodes, david-arm, efriedma Reviewed By: c-rhodes, efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D81463	2020-06-18 10:07:09 +01:00
Sander de Smalen	e51c1d06a9	[SveEmitter] Add builtins for svtbl2 Reviewers: david-arm, efriedma, c-rhodes Reviewed By: c-rhodes Tags: #clang Differential Revision: https://reviews.llvm.org/D81462	2020-06-17 09:41:38 +01:00
Christopher Tetreault	eb81c85afd	[SVE] Deprecate default false variant of VectorType::get Reviewers: efriedma, fpetrogalli, kmclaughlin, huntergr Reviewed By: fpetrogalli Subscribers: cfe-commits, tschuett, rkruppe, psnobl, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D80342	2020-06-16 15:16:11 -07:00
Luke Geeson	10b6567f49	[AArch64]: BFloat MatMul Intrinsics&CodeGen This patch upstreams support for BFloat Matrix Multiplication Intrinsics and Code Generation from __bf16 to AArch64. This includes IR intrinsics. Unittests are provided as needed. AArch32 Intrinsics + CodeGen will come after this patch. This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: Luke Geeson - Momchil Velikov - Mikhail Maltsev - Luke Cheeseman Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki, stuij Reviewed By: miyuki, stuij Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits, miyuki, chill, pbarrio, stuij Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80752 Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f	2020-06-16 15:23:30 +01:00
Jeff Mott	8799ebbc1f	[clang] Fix or emit diagnostic for checked arithmetic builtins with _ExtInt types - Fix computed size for _ExtInt types passed to checked arithmetic builtins. - Emit diagnostic when signed _ExtInt larger than 128-bits is passed to __builtin_mul_overflow. - Change Sema checks for builtins to accept placeholder types. Differential Revision: https://reviews.llvm.org/D81420	2020-06-15 06:51:54 -07:00
Sander de Smalen	91a4a592ed	[SveEmitter] Add SVE tuple types and builtins for svundef. This patch adds new SVE types to Clang that describe tuples of SVE vectors. For example `svint32x2_t` which maps to the twice-as-wide vector `<vscale x 8 x i32>`. Similarly, `svint32x3_t` will map to `<vscale x 12 x i32>`. It also adds builtins to return an `undef` vector for a given SVE type. Reviewers: c-rhodes, david-arm, ctetreau, efriedma, rengolin Reviewed By: c-rhodes Tags: #clang Differential Revision: https://reviews.llvm.org/D81459	2020-06-15 07:36:01 +01:00
Craig Topper	ed34140e11	[X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC	2020-06-10 22:06:34 -07:00
Thomas Lively	b7d369280b	[WebAssembly] Implement prototype SIMD rounding instructions Summary: As specified in https://github.com/WebAssembly/simd/pull/232. These instructions are implemented as LLVM intrinsics for now rather than normal ISel patterns to make these instructions opt-in. Once the instructions are merged to the spec proposal, the intrinsics will be replaced with proper ISel patterns. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81222	2020-06-09 10:14:14 -07:00
Saiyedul Islam	675cefbf60	[AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics Summary: __builtin_amdgcn_atomic_inc32(int Ptr, int Val, unsigned MemoryOrdering, const char SyncScope) __builtin_amdgcn_atomic_inc64(int64_t Ptr, int64_t Val, unsigned MemoryOrdering, const char SyncScope) __builtin_amdgcn_atomic_dec32(int Ptr, int Val, unsigned MemoryOrdering, const char SyncScope) __builtin_amdgcn_atomic_dec64(int64_t Ptr, int64_t Val, unsigned MemoryOrdering, const char SyncScope) First and second arguments gets transparently passed to the amdgcn atomic inc/dec intrinsic. Fifth argument of the intrinsic is set as true if the first argument of the builtin is a volatile pointer. The third argument of this builtin is one of the memory-ordering specifiers ATOMIC_ACQUIRE, ATOMIC_RELEASE, ATOMIC_ACQ_REL, or ATOMIC_SEQ_CST following C++11 memory model semantics. This is mapped to corresponding LLVM atomic memory ordering for the atomic inc/dec instruction using CLANG atomic C ABI. The fourth argument is an AMDGPU-specific synchronization scope defined as string. Reviewers: arsenm, sameerds, JonChesterfield, jdoerfert Reviewed By: arsenm, sameerds Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, kerbowa, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D80804	2020-06-09 17:02:58 +00:00
Florian Hahn	3323a628ec	[Matrix] Add __builtin_matrix_transpose to Clang. This patch add __builtin_matrix_transpose to Clang, as described in clang/docs/MatrixTypes.rst. Reviewers: rjmccall, jfb, rsmith, Bigcheese Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D72778	2020-06-09 10:14:37 +01:00
Jian Cai	4db2b70248	Add a flag to debug automatic variable initialization Summary: Add -ftrivial-auto-var-init-stop-after= to limit the number of times stack variables are initialized when -ftrivial-auto-var-init= is used to initialize stack variables to zero or a pattern. This flag can be used to bisect uninitialized uses of a stack variable exposed by automatic variable initialization, such as http://crrev.com/c/2020401. Reviewers: jfb, vitalybuka, kcc, glider, rsmith, rjmccall, pcc, eugenis, vlad.tsyrklevich Reviewed By: jfb Subscribers: phosek, hubert.reinterpretcast, srhines, MaskRay, george.burgess.iv, dexonsmith, inglorion, gbiv, llozano, manojgupta, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77168	2020-06-08 12:30:56 -07:00
Ties Stuij	8b137a4306	[clang][BFloat] Add create/set/get/dup intrinsics Summary: This patch is part of a series that adds support for the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Luke Cheeseman - Momchil Velikov - Luke Geeson - Ties Stuij - Mikhail Maltsev Reviewers: t.p.northover, sdesmalen, fpetrogalli, LukeGeeson, stuij, labrinea Reviewed By: labrinea Subscribers: miyuki, dmgreen, labrinea, kristof.beyls, ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79710	2020-06-05 14:35:10 +01:00
Ties Stuij	ecd682bbf5	[ARM] Add __bf16 as new Bfloat16 C Type Summary: This patch upstreams support for a new storage only bfloat16 C type. This type is used to implement primitive support for bfloat16 data, in line with the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile In detail this patch: - introduces an opaque, storage-only C-type __bf16, which introduces a new bfloat IR type. This is part of a patch series, starting with command-line and Bfloat16 assembly support. The subsequent patches will upstream intrinsics support for BFloat16, followed by Matrix Multiplication and the remaining Virtualization features of the armv8.6-a architecture. The following people contributed to this patch: - Luke Cheeseman - Momchil Velikov - Alexandros Lamprineas - Luke Geeson - Simon Tatham - Ties Stuij Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, fpetrogalli Reviewed By: SjoerdMeijer Subscribers: labrinea, majnemer, asmith, dexonsmith, kristof.beyls, arphaman, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76077	2020-06-05 10:32:43 +01:00
Craig Topper	dd863ccae1	[X86] Separate X86_CPU_TYPE_COMPAT_WITH_ALIAS from X86_CPU_TYPE_COMPAT. NFC Add a separate X86_CPU_TYPE_COMPAT_ALIAS that carries alias string and the enum from X86_CPU_TYPE_COMPAT.	2020-06-03 14:13:12 -07:00
Andrew Wock	15a1780a10	[PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins This is a re-revert with a corrected test. This patch adds a test for the PowerPC fma compiler builtins, some variations of which negate inputs and outputs. The code to generate IR for these builtins was untested before this patch. Originally, the code used the outdated method of subtracting floating point values from -0.0 as floating point negation. This patch remedies that. Patch by: Drew Wock <drew.wock@sas.com> Differential Revision: https://reviews.llvm.org/D76949	2020-06-03 09:45:27 -04:00
Lucas Prates	8beaba13b8	[Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts Summary: During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly assuming all the pointers from which loads were being generated for vld1 intrinsics were aligned according to the intrinsics result type, causing alignment faults on the code generated by the backend. This patch updates vld1 intrinsics' CodeGen to properly capture the correct load alignment based on the type of the pointer provided as input for the intrinsic. Reviewers: t.p.northover, ostannard, pcc, efriedma Reviewed By: ostannard, efriedma Subscribers: echristo, plotfi, nickdesaulniers, efriedma, kristof.beyls, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79721	2020-06-03 11:39:27 +01:00
Christopher Tetreault	796898172c	[SVE] Eliminate calls to default-false VectorType::get() from Clang Reviewers: efriedma, david-arm, fpetrogalli, ddunbar, rjmccall Reviewed By: fpetrogalli, rjmccall Subscribers: tschuett, rkruppe, psnobl, dmgreen, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D80323	2020-06-01 10:02:14 -07:00
Eric Christopher	97a133f157	Temporarily Revert "[Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts" as it's causing crashes on code generation and https://bugs.llvm.org/show_bug.cgi?id=46084 This reverts commit `98cad555e2`.	2020-05-26 18:51:00 -07:00
Adrian Prantl	b59b3640bc	Debug Info: Mark os_log helper functions as artificial The os_log helper functions are linkonce_odr and supposed to be uniqued across TUs, so attachine a DW_AT_decl_line on it is highly misleading. By setting the function decl to implicit, CGDebugInfo properly marks the functions as artificial and uses a default file / line 0 location for the function. rdar://problem/63450824 Differential Revision: https://reviews.llvm.org/D80463	2020-05-26 09:08:27 -07:00
Lucas Prates	98cad555e2	[Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts Summary: During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly assuming all the pointers from which loads were being generated for vld1 intrinsics were aligned according to the intrinsics result type, causing alignment faults on the code generated by the backend. This patch updates vld1 intrinsics' CodeGen to properly capture the correct load alignment based on the type of the pointer provided as input for the intrinsic. Reviewers: t.p.northover, ostannard, pcc Reviewed By: ostannard Subscribers: kristof.beyls, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79721	2020-05-26 10:09:35 +01:00
Eli Friedman	62f3ef2b53	[CGCall] Annotate references with "align" attribute. If we're going to assume references are dereferenceable, we should also assume they're aligned: otherwise, we can't actually dereference them. See also D80072. Differential Revision: https://reviews.llvm.org/D80166	2020-05-19 20:21:30 -07:00
Francesco Petrogalli	b593bfd4d8	[clang][SveEmitter] SVE builtins for `svusdot` and `svsudot` ACLE. Summary: Intrinsics, guarded by `__ARM_FEATURE_SVE_MATMUL_INT8`: * svusdot[_s32] * svusdot[_n_s32] * svusdot_lane[_s32] * svsudot[_s32] * svsudot[_n_s32] * svsudot_lane[_s32] Reviewers: sdesmalen, efriedma, david-arm, rengolin Subscribers: tschuett, kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79877	2020-05-18 23:07:23 +00:00
Yonghong Song	072cde03aa	[Clang][BPF] implement __builtin_btf_type_id() builtin function Such a builtin function is mostly useful to preserve btf type id for non-global data. For example, extern void foo(..., void *data, int size); int test(...) { struct t { int a; int b; int c; } d; d.a = ...; d.b = ...; d.c = ...; foo(..., &d, sizeof(d)); } The function "foo" in the above only see raw data and does not know what type of the data is. In certain cases, e.g., logging, the additional type information will help pretty print. This patch implemented a BPF specific builtin u32 btf_type_id = __builtin_btf_type_id(param, flag) which will return a btf type id for the "param". flag == 0 will indicate a BTF local relocation, which means btf type_id only adjusted when bpf program BTF changes. flag == 1 will indicate a BTF remote relocation, which means btf type_id is adjusted against linux kernel or future other entities. Differential Revision: https://reviews.llvm.org/D74668	2020-05-15 09:44:54 -07:00
Thomas Lively	3d49d1cfa7	[WebAssembly] Implement pseudo-min/max SIMD instructions Summary: As proposed in https://github.com/WebAssembly/simd/pull/122. Since these instructions are not yet merged to the SIMD spec proposal, this patch makes them entirely opt-in by surfacing them only through LLVM intrinsics and clang builtins. If these instructions are made official, these intrinsics and builtins should be replaced with simple instruction patterns. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D79742	2020-05-12 09:39:01 -07:00
Sander de Smalen	d6936be2ef	[SveEmitter] Add builtins for svdup and svindex Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79357	2020-05-12 11:02:32 +01:00
Thomas Lively	8e3e56f2a3	[WebAssembly] Add wasm-specific vector shuffle builtin and intrinsic Summary: Although using `__builtin_shufflevector` and the `shufflevector` instruction works fine, they are not opaque to the optimizer. As a result, DAGCombine can potentially reduce the number of shuffles and change the shuffle masks. This is unexpected behavior for users of the WebAssembly SIMD intrinsics who have crafted their shuffles to optimize the code generated by engines. This patch solves the problem by adding a new shuffle intrinsic that is opaque to the optimizers in line with the decision of the WebAssembly SIMD contributors at https://github.com/WebAssembly/simd/issues/196#issuecomment-622494748. In the future we may implement custom DAG combines to properly optimize shuffles and replace this solution. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D66983	2020-05-11 10:01:55 -07:00
Sander de Smalen	4cad97595f	[SveEmitter] Add builtins for svmovlb and svmovlt These builtins are expanded in CGBuiltin to use intrinsics for (signed/unsigned) shift left long top/bottom. Reviewers: efriedma, SjoerdMeijer Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79579	2020-05-11 09:41:58 +01:00
Sander de Smalen	3cb8b4c193	[SveEmitter] Add builtins for SVE2 Polynomial arithmetic This patch adds builtins for: - sveorbt - sveortb - svpmul - svpmullb, svpmullb_pair - svpmullt, svpmullt_pair The svpmullb and svpmullt builtins are expressed using the svpmullb_pair and svpmullt_pair LLVM IR intrinsics, respectively. Reviewers: SjoerdMeijer, efriedma, rengolin Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79480	2020-05-07 11:53:04 +01:00

1 2 3 4 5 ...

1493 Commits