llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun (Sam) Liu	8a8c6913a9	Revert "[HIP] Add default header and include path" This reverts commit `11d06b9511`.	2020-06-05 15:42:57 -04:00
Yaxun (Sam) Liu	11d06b9511	[HIP] Add default header and include path To support std::complex and some other standard C/C++ functions in HIP device code, they need to be forced to be __host__ __device__ functions by pragmas. This is done by some clang standard C++ wrapper headers which are shared between cuda-clang and hip-Clang. For these standard C++ wrapper headers to work properly, a specific include path order has to be enforced: clang C++ wrapper include path, standard C++ include path, clang include path. Also, these C++ wrapper headers require that device versions of some standard C/C++ functions be declared before including them. This needs to be done by including a default header which declares or defines these device functions. The default header is always included before any other headers are included by users. This patch adds the default header and include path for HIP. Differential Revision: https://reviews.llvm.org/D81176	2020-06-05 12:44:57 -04:00
Ties Stuij	a6fcf5ca03	[clang][BFloat] add NEON emitter for bfloat Summary: This patch adds the bfloat16_t struct typedefs (e.g. bfloat16x8x2_t) to arm_neon.h This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Luke Cheeseman - Simon Tatham - Ties Stuij Reviewers: t.p.northover, fpetrogalli, sdesmalen, az, LukeGeeson Reviewed By: fpetrogalli Subscribers: SjoerdMeijer, LukeGeeson, pbarrio, mgorny, kristof.beyls, ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79708	2020-06-05 14:11:51 +01:00
Anastasia Stulova	4a4402f0d7	[OpenCL] Add cl_khr_extended_subgroup extensions. Added extensions and their function declarations into the standard header. Patch by Piotr Fusik! Tags: #clang Differential Revision: https://reviews.llvm.org/D79781	2020-06-04 13:29:30 +01:00
Thomas Lively	237be3404b	[WebAssembly] Improve macro hygiene in wasm_simd128.h Summary: The shuffle intrinsic macros did not parenthesize usages of their constant parameters, which could lead to incorrect results due to operator precedence issues. This patch fixes the problem by adding the missing parentheses. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D80968	2020-06-02 12:55:06 -07:00
Craig Topper	1b02db52b7	[X86] Update some avx512 shift intrinsics to use "unsigned int" parameter instead of int to match Intel documentation There are 65 that take a scalar shift amount. Intel documentation shows 60 of them taking unsigned int. There are 5 versions of srli_epi16 that use int, the 512-bit maskz and 128/256 mask/maskz. Fixes PR45931 Differential Revision: https://reviews.llvm.org/D80251	2020-05-22 20:12:57 -07:00
Thomas Lively	3181273be7	[WebAssembly] Implement i64x2.mul and remove i8x16.mul Summary: This reflects changes in the spec proposal made since basic arithmetic was first implemented. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80174	2020-05-19 12:50:44 -07:00
Xiang1 Zhang	bcc0c894f3	Add cet.h for writing CET-enabled assembly code Summary: Add x86 feature with IBT and/or SHSTK bits to ELF program property if they are enabled. Otherwise, contents in this header file are unused. This file is mainly designed for assembly source code that wants to enable CET Reviewers: hjl.tools, annita.zhang, LuoYuanke, craig.topper, tstellar, pengfei, rsmith Reviewed By: LuoYuanke Subscribers: cfe-commits, mgorny Tags: #clang Differential Revision: https://reviews.llvm.org/D79617	2020-05-19 14:03:17 +08:00
Xiang1 Zhang	62a9eca859	Test asm-cet.S fail for window clang This reverts commit `e7e84ff24a`.	2020-05-19 13:18:05 +08:00
Xiang1 Zhang	e7e84ff24a	Add cet.h for writing CET-enabled assembly code Summary: Add x86 feature with IBT and/or SHSTK bits to ELF program property if they are enabled. Otherwise, contents in this header file are unused. This file is mainly designed for assembly source code that wants to enable CET Reviewers: hjl.tools, annita.zhang, LuoYuanke, craig.topper, tstellar, pengfei, rsmith Reviewed By: LuoYuanke Subscribers: mgorny Differential Revision: https://reviews.llvm.org/D79617	2020-05-19 10:37:46 +08:00
Thomas Lively	c702d4bf41	[WebAssembly] Update latest implemented SIMD instructions Summary: Move instructions that have recently been implemented in V8 from the `unimplemented-simd128` target feature to the `simd128` target feature. The updated instructions match the update at https://github.com/WebAssembly/simd/pull/223. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D79973	2020-05-15 10:53:02 -07:00
Thomas Lively	3d49d1cfa7	[WebAssembly] Implement pseudo-min/max SIMD instructions Summary: As proposed in https://github.com/WebAssembly/simd/pull/122. Since these instructions are not yet merged to the SIMD spec proposal, this patch makes them entirely opt-in by surfacing them only through LLVM intrinsics and clang builtins. If these instructions are made official, these intrinsics and builtins should be replaced with simple instruction patterns. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D79742	2020-05-12 09:39:01 -07:00
Thomas Lively	8e3e56f2a3	[WebAssembly] Add wasm-specific vector shuffle builtin and intrinsic Summary: Although using `__builtin_shufflevector` and the `shufflevector` instruction works fine, they are not opaque to the optimizer. As a result, DAGCombine can potentially reduce the number of shuffles and change the shuffle masks. This is unexpected behavior for users of the WebAssembly SIMD intrinsics who have crafted their shuffles to optimize the code generated by engines. This patch solves the problem by adding a new shuffle intrinsic that is opaque to the optimizers in line with the decision of the WebAssembly SIMD contributors at https://github.com/WebAssembly/simd/issues/196#issuecomment-622494748. In the future we may implement custom DAG combines to properly optimize shuffles and replace this solution. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D66983	2020-05-11 10:01:55 -07:00
Thomas Lively	e0f52842c8	[WebAssembly] Renumber SIMD opcodes Summary: As described in https://github.com/WebAssembly/simd/pull/209. This is the final reorganization of the SIMD opcode space before standardization. It has been landed in concert with corresponding changes in other projects in the WebAssembly SIMD ecosystem. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79224	2020-05-01 17:20:49 -07:00
Craig Topper	af28e02e74	[clang] Add vendor identity for Hygon Dhyana processor to cpuid.h The vendor id is used to determine whether the processor supports hardware CRC32 in the Scudo code. The previous discussion about the patch is in [1], and more information about Hygon Dhyana processor is in [2]. [1]: https://reviews.llvm.org/D62368 [2]: https://git.kernel.org/torvalds/c/c9661c1e80b609cd038db7c908e061f0535804ef Patch by fanjinke (Jinke Fan) Differential Revision: https://reviews.llvm.org/D78874	2020-04-30 18:17:01 -07:00
Douglas Yung	046130490f	Add header guards for header files that should not be included on the PS4 platform. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79194	2020-04-30 16:17:34 -07:00
Ulrich Weigand	095ccf4455	[SystemZ] Avoid __INTPTR_TYPE__ conversions in vecintrin.h Some intrinsics in vecintrin.h are currently implemented by performing address arithmetic in __INTPTR_TYPE__ and converting the result to some pointer type. While this works correctly, it leads to suboptimal code generation since many optimizers cannot trace the provenance of the resulting pointers. Fixed by using "char *" pointer arithmetic instead.	2020-04-28 18:49:49 +02:00
Ulrich Weigand	c90e09b13c	[SystemZ] Use reserved keywords in vecintrin.h System headers should avoid using the "vector" and "bool" keywords since those might be redefined by user code. For example, using <stdbool.h> before <vecintrin.h> will currently lead to compiler errors. Fixed by using the reserved "__vector" and "__bool" keywords instead. NFC otherwise.	2020-04-28 18:49:48 +02:00
Raul Tambre	8e20516540	[CUDA] Define __CUDACC__ before standard library headers libstdc++ since version 7 when GNU extensions are enabled (e.g. -std=gnu++11) use it to avoid defining overloads using `__float128`. This fixes compiling with GNU extensions failing due to `__float128` being used. Discovered at https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4442#note_737136. Differential Revision: https://reviews.llvm.org/D78392	2020-04-17 12:56:13 -07:00
Johannes Doerfert	17d8334223	[OpenMP] Allow <math.h> to go first in C++-mode in target regions If we are in C++ mode and include <math.h> (not <cmath>) first, we still need to make sure <cmath> is read first. The problem otherwise is that we haven't seen the declarations of the math.h functions when the system math.h includes our cmath overlay. However, our cmath overlay, or better the underlying overlay, e.g. CUDA, uses the math.h functions. Since we haven't declared them yet we get errors. CUDA avoids this by eagerly declaring all math functions (in the __device__ space) but we cannot do this. Instead we break the dependence by forcing cmath to go first. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D77774	2020-04-09 22:10:31 -05:00
WangTianQing	a3dc949000	[X86] Add TSXLDTRK instructions. Summary: For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77205	2020-04-09 13:17:29 +08:00
Johannes Doerfert	f85ae058f5	[OpenMP] Provide math functions in OpenMP device code via OpenMP variants For OpenMP target regions to piggy back on the CUDA/AMDGPU/... implementation of math functions, we include the appropriate definitions inside of an `omp begin/end declare variant match(device={arch(nvptx)})` scope. This way, the vendor specific math functions will become specialized versions of the system math functions. When a system math function is called and a specialized version is available, the selection logic introduced in D75779 calls the specialized version instead. In contrast to the code path we used so far, the system header is actually included. This means functions without specialized versions are available and so are macro definitions. This should address PR42061, PR42798, and PR42799. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D75788	2020-04-07 23:33:24 -05:00
Pierre Gousseau	08fab9ebec	[X86] Fix implicit sign conversion warnings in X86 headers. Warnings in emmintrin.h and xmmintrin.h are reported by -fsanitize=implicit-integer-sign-change. Reviewed By: RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D77393	2020-04-07 11:25:08 +01:00
WangTianQing	d08fadd662	[X86] Add SERIALIZE instruction. Summary: For more details about this instruction, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77193	2020-04-02 16:19:23 +08:00
Johannes Doerfert	b0b5f0416b	[OpenMP][FIX] Undo changes accidentally already introduced in NFC commit In `d1705c1196` (D77238) we accidentally included subsequent changes and did not only move the code into a new file (which was the intention). We undo the changes now and re-introduce them with the appropriate test changes later.	2020-04-02 01:33:39 -05:00
Johannes Doerfert	410cfc478f	[OpenMP][FIX] Add second include after header was split in `d1705c1196` The math wrapper handling is going to be replaced shortly and `d1705c1196` was actually a precursor for that.	2020-04-02 00:20:23 -05:00
Johannes Doerfert	d1705c1196	[CUDA][NFC] Split math.h functions out of __clang_cuda_device_functions.h This is not supposed to change anything but allow us to reuse the math functions separately from the device functions, e.g., source them at different times. This will be used by the OpenMP overlay. This also adds two `return` keywords that were missing. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D77238	2020-04-01 23:46:27 -05:00
Thomas Lively	95fac2e46b	[WebAssembly] Rename SIMD min/max/avgr intrinsics for consistency Summary: The convention for the wasm_simd128.h intrinsics is to have the integer sign in the lane interpretation rather than as a suffix. This PR changes the names of the integer min, max, and avgr intrinsics to match this convention. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77185	2020-04-01 09:38:41 -07:00
Thomas Lively	5074776de4	[WebAssembly] Import wasm_simd128.h from Emscripten Summary: As the WebAssembly SIMD proposal nears stabilization, there is desire to use it with toolchains other than Emscripten. Moving the intrinsics header to clang will make it available to WASI toolchains as well. Reviewers: aheejin, sunfish Subscribers: dschuff, mgorny, sbc100, jgravelle-google, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76959	2020-03-30 17:04:18 -07:00
Michael Liao	5be9b8cbe2	[cuda][hip] Add CUDA builtin surface/texture reference support. Summary: - Re-commit after fix Sema checks on partial template specialization. Reviewers: tra, rjmccall, yaxunl, a.sidorin Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76365	2020-03-27 17:18:49 -04:00
Artem Belevich	fe8063e1a0	Revert "[cuda][hip] Add CUDA builtin surface/texture reference support." This reverts commit `6a9ad5f3f4`. The patch breaks CUDA compilation. Differential Revision: https://reviews.llvm.org/D76365	2020-03-27 10:01:38 -07:00
Michael Liao	6a9ad5f3f4	[cuda][hip] Add CUDA builtin surface/texture reference support. Summary: - Even though the bindless surface/texture interfaces are promoted, there is still code using surface/texture references. For example, [PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the compilation issue for code using `tex2D` with texture references. For better compatibility, this patch proposes the support of surface/texture references. - Due to the absent documentation and magic headers, it's believed that `nvcc` does use builtins for texture support. From the limited NVVM documentation[^nvvm] and NVPTX backend texture/surface related tests[^test], it's believed that surface/texture references are supported by replacing their reference types, which are annotated with `device_builtin_surface_type`/`device_builtin_texture_type`, with the corresponding handle-like object types, `cudaSurfaceObject_t` or `cudaTextureObject_t`, in the device-side compilation. On the host side, those global handle variables are registered and will be established and updated later when corresponding binding/unbinding APIs are called[^bind]. Surface/texture references are most like device global variables but represented in different types on the host and device sides. - In this patch, the following changes are proposed to support that behavior: + Refine `device_builtin_surface_type` and `device_builtin_texture_type` attributes to be applied on `Type` decl only to check whether a variable is of the surface/texture reference type. + Add hooks in code generation to replace those reference types with the corresponding object types as well as all accesses to them. In particular, `nvvm.texsurf.handle.internal` should be used to load object handles from global reference variables[^texsurf] as well as metadata annotations. + Generate host-side registration with proper template argument parsing. 
--- [^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf [^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll [^bind]: See section 3.2.11.1.2 "Texture reference API" in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf). [^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used. But, the current backend doesn't have that supported. We may revise that later. Reviewers: tra, rjmccall, yaxunl, a.sidorin Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76365	2020-03-26 14:44:52 -04:00
Sander de Smalen	5087ace651	[Clang][SVE] Parse builtin type string for scalable vectors This patch adds 'q' to mean 'scalable vector' in the builtin type string, and for SVE will return the matching builtin type as defined in the C/C++ language extensions for SVE. This patch also adds some scaffolding to generate the arm_sve.h header file, and some builtin definitions (+CodeGen) to be able to implement some simple masked load intrinsics that use the ACLE types, such as: svint8_t test_svld1_s8(svbool_t pg, const int8_t *base) { return svld1_s8(pg, base); } Reviewers: efriedma, rjmccall, rovka, rsandifo-arm, rengolin Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D75298	2020-03-15 14:34:52 +00:00
Shengchen Kan	214d24e1f8	[X86] Support intrinsic _mm_broadcastsi128_si256 Reviewers: LuoYuanke, craig.topper, RKSimon, pengfei Reviewed By: craig.topper Subscribers: cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D75897	2020-03-12 10:56:39 +08:00
Shengchen Kan	ab69cd0779	[X86] Support intrinsic _mm_cldemote Reviewers: LuoYuanke, craig.topper, RKSimon, pengfei Reviewed By: craig.topper Subscribers: cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D75896	2020-03-12 10:03:41 +08:00
Shengchen Kan	560aa53f8f	[X86] Support intrinsics _bextr2* Reviewers: LuoYuanke, craig.topper, RKSimon, pengfei Reviewed By: craig.topper Subscribers: cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D75894	2020-03-12 09:26:51 +08:00
Mikhail Maltsev	47edf5bafb	[ARM,CDE] Generalize MVE intrinsics infrastructure to support CDE Summary: This patch generalizes the existing code to support CDE intrinsics which will share some properties with existing MVE intrinsics (some of the intrinsics will be polymorphic and accept/return values of MVE vector types). Specifically the patch: * Adds new tablegen backends -gen-arm-cde-builtin-def, -gen-arm-cde-builtin-codegen, -gen-arm-cde-builtin-sema, -gen-arm-cde-builtin-aliases, -gen-arm-cde-builtin-header based on existing MVE backends. * Renames the '__clang_arm_mve_alias' attribute into '__clang_arm_builtin_alias' (it will be used with CDE intrinsics as well as MVE intrinsics) * Implements semantic checks for the coprocessor argument of the CDE intrinsics as well as the existing coprocessor intrinsics. * Adds one CDE intrinsic __arm_cx1 to test the above changes Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen Reviewed By: simon_tatham Subscribers: sdesmalen, mgorny, kristof.beyls, danielkiss, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75850	2020-03-10 14:03:16 +00:00
Michael Spencer	16af23fae8	[clang][Headers] Use __has_builtin instead of _MSC_VER. arm_acle.h relied on `_MSC_VER` to determine if a given function was already defined as a builtin. This was incorrect because `-fms-extensions` enables these builtins, but is not responsible for defining `_MSC_VER` on any target. The next closest thing is `_MSC_EXTENSIONS`, which is only defined on Windows targets, but even this is suboptimal. What this conditional is actually trying to determine is if the given functions are defined as builtins, so just check that directly. I also attempted to do this for `__nop`, but in that case intrin.h, which is only includable if `_MSC_VER` is defined, has its own definition. So in that case `_MSC_VER` is correct. Differential Revision: https://reviews.llvm.org/D75719 rdar://60102353	2020-03-06 13:48:09 -08:00
Sven van Haastregt	8a37b9e617	[OpenCL] Remove spurious atomic_fetch_min/max builtins These declarations use a mix of unsigned and signed argument and return types. This is not in accordance with OpenCL v2.0 s6.13.11. Differential Revision: https://reviews.llvm.org/D74910	2020-03-02 15:56:48 +00:00
Mirko Brkusanin	5ba931a84a	[Mips] Add intrinsics for 4-byte and 8-byte MSA loads/stores. New intrinsics are implemented for when we need to port SIMD code from other architectures and only load or store portions of MSA registers. The following intrinsics are added, which only load/store element 0 of a vector: v4i32 __builtin_msa_ldrq_w (const void *, imm_n2048_2044); v2i64 __builtin_msa_ldr_d (const void *, imm_n4096_4088); void __builtin_msa_strq_w (v4i32, void *, imm_n2048_2044); void __builtin_msa_str_d (v2i64, void *, imm_n4096_4088); Differential Revision: https://reviews.llvm.org/D73644	2020-02-11 11:47:30 +01:00
Alexey Bataev	fd3437a4f7	[OPENMP][NVPTX]Add NVPTX specific definitions for new/delete operators. Summary: To use new/delete in NVPTX code we need to define them. Implementation copied from CUDA wrappers. Reviewers: hfinkel, jdoerfert Subscribers: mgorny, guansong, kkwli0, caomhin, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73128	2020-02-05 09:57:53 -05:00
Alexey Sotkin	f780e15caf	[OpenCL] Fix support for cl_khr_mipmap_image_writes Text of the extension is available here: https://github.com/KhronosGroup/OpenCL-Docs/blob/master/ext/cl_khr_mipmap_image.asciidoc Patch by Ilya Mashkov Differential Revision: https://reviews.llvm.org/D71460	2020-02-05 14:55:32 +03:00
Artem Belevich	12fefeef20	[CUDA] Assume the latest known CUDA version if we've found an unknown one. This makes clang somewhat forward-compatible with new CUDA releases without having to patch it for every minor release without adding any new function. If an unknown version is found, clang issues a warning (can be disabled with -Wno-cuda-unknown-version) and assumes that it has detected the latest known version. CUDA releases are usually supersets of older ones feature-wise, so it should be sufficient to keep released clang versions working with minor CUDA updates without having to upgrade clang, too. Differential Revision: https://reviews.llvm.org/D73231	2020-01-28 10:11:42 -08:00
Artem Belevich	cc14de88da	[CUDA] Fix order of memcpy arguments in __shfl_*(<64-bit type>). Wrong argument order resulted in broken shfl ops for 64-bit types.	2020-01-23 13:17:52 -08:00
Craig Topper	16b9410caa	[X86] Cast to __v4hi instead of __m64 in the implementation of _mm_extract_pi16 and _mm_insert_pi16. __m64 is a vector of 1 long long. But the builtins these intrinsics are calling expect a vector of 4 shorts. Fixes PR44589	2020-01-22 16:00:23 -06:00
Ulrich Weigand	cebba7ce39	[SystemZ] Avoid unnecessary conversions in vecintrin.h Use floating-point instead of integer zero constants to avoid creating implicit conversions, which currently cause suboptimal code to be generated with -ffp-exception-behavior=strict. NFC otherwise.	2020-01-16 18:58:14 +01:00
Richard Smith	388eaa1270	Work around PR43337: don't try to use the vec_sel overloads for vector long long, since clang's <altivec.h> doesn't provide it yet!	2020-01-15 13:14:57 -08:00
Warren Ristow	7fcd9e3f70	[X86] Mark various pointer arguments in builtins as const Enabling `-Wcast-qual` identified many casts in various system headers that were dropping the `const` qualifier. Fixing those missing qualifiers pointed out that a few of the definitions of the builtins did not properly identify their arguments as `const` pointers. This commit fixes those builtin definitions, and the system header files so that they no longer drop the qualifier. Differential Revision: https://reviews.llvm.org/D71718	2019-12-19 11:42:11 -08:00
Momchil Velikov	600d123c6f	[ARM][CMSE] Add CMSE header and builtins This is patch C2 as mentioned in RFC http://lists.llvm.org/pipermail/cfe-dev/2019-March/061834.html This adds CMSE builtin functions, and introduces arm_cmse.h header which has useful macros, functions, and data types for end-users of CMSE. Patch by Javed Absar. Differential Revision: https://reviews.llvm.org/D70817	2019-12-12 15:01:14 +00:00
Craig Topper	890c6ef1fb	[X86] Remove forward declaration of _invpcid from intrin.h. Rely on inline version from immintrin.h The forward declaration had a cdecl calling convention, but the inline version did not. This leads to a conflict if the default calling convention is not cdecl. Fix this by just removing the forward declaration. Fixes PR41503	2019-11-25 16:27:39 -08:00
Craig Topper	3cec2a17de	[X86] Fix the implementation of __readcr3/__writecr3 to work in 64-bit mode We need to use a 64-bit type in 64-bit mode so a 64-bit register will get used in the generated assembly. I've also changed the constraints to just use "r" instead of "q". "q" forces the use of only an a/b/c/d register in 32-bit mode, but I see no reason that would matter here. Fixes Nico's note in PR19301 over 4 years ago. Differential Revision: https://reviews.llvm.org/D70101	2019-11-14 13:21:36 -08:00
Nemanja Ivanovic	e0407f5496	[PowerPC][Altivec] Fix offsets for vec_xl and vec_xst As we currently have it implemented in altivec.h, the offsets for these two intrinsics are element offsets. The documentation in the ABI (as well as the implementation in both XL and GCC) states that these should be byte offsets. Differential revision: https://reviews.llvm.org/D63636	2019-11-07 20:58:11 -06:00
Nemanja Ivanovic	070e4027b0	[PowerPC][Altivec] Emit correct builtin for single precision vec_all_ne We currently emit a double precision comparison instruction for this, whereas we need to emit the single precision version. Differential revision: https://reviews.llvm.org/D64024	2019-11-07 20:40:32 -06:00
Eli Friedman	98286b569d	[Headers] Fix compatibility between arm_acle.h and intrin.h Make sure they don't both define __nop. Differential Revision: https://reviews.llvm.org/D69012	2019-10-29 14:52:56 -07:00
vhscampos	f6e11a36c4	[ARM][AArch64] Implement __cls, __clsl and __clsll intrinsics from ACLE Summary: Writing support for three ACLE functions: unsigned int __cls(uint32_t x) unsigned int __clsl(unsigned long x) unsigned int __clsll(uint64_t x) CLS stands for "Count number of leading sign bits". In AArch64, these two intrinsics can be translated into the 'cls' instruction directly. In AArch32, on the other hand, this functionality is achieved by implementing it in terms of clz (count number of leading zeros). Reviewers: compnerd Reviewed By: compnerd Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69250	2019-10-28 11:06:58 +00:00
vhscampos	5d35b7d9e1	[ARM][AArch64] Implement __arm_rsrf, __arm_rsrf64, __arm_wsrf & __arm_wsrf64 Summary: Adding support for ACLE intrinsics. Patch by Michael Platings. Reviewers: chill, t.p.northover, efriedma Reviewed By: chill Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69297	2019-10-28 10:59:18 +00:00
Greg Bedwell	d4758d4a8d	Fix a spelling mistake in a couple of intrinsic description comments. NFC	2019-10-27 09:42:14 +00:00
Simon Tatham	08074cc965	[clang,ARM] Initial ACLE intrinsics for MVE. This commit sets up the infrastructure for auto-generating <arm_mve.h> and doing clang-side code generation for the builtins it relies on, and demonstrates that it works by implementing a representative sample of the ACLE intrinsics, more or less matching the ones introduced in LLVM IR by D67158,D68699,D68700. Like NEON, that header file will provide a set of vector types like uint16x8_t and C functions with names like vaddq_u32(). Unlike NEON, the ACLE spec for <arm_mve.h> includes a polymorphism system, so that you can write plain vaddq() and disambiguate by the vector types you pass to it. Unlike the corresponding NEON code, I've arranged to make every user-facing ACLE intrinsic into a clang builtin, and implement all the code generation inside clang. So <arm_mve.h> itself contains nothing but typedefs and function declarations, with the latter all using the new `__attribute__((__clang_builtin))` system to arrange that the user-facing function names correspond to the right internal BuiltinIDs. So the new MveEmitter tablegen system specifies the full sequence of IRBuilder operations that each user-facing ACLE intrinsic should translate into. Where possible, the ACLE intrinsics map to standard IR operations such as vector-typed `add` and `fadd`; where no standard representation exists, I call down to the sample IR intrinsics introduced in an earlier commit. Doing it like this means that you get the polymorphism for free just by using __attribute__((overloadable)): the clang overload resolution decides which function declaration is the relevant one, and _then_ its BuiltinID is looked up, so by the time we're doing code generation, that's all been resolved by the standard system. It also means that you get really nice error messages if the user passes the wrong combination of types: clang will show the declarations from the header file and explain why each one doesn't match. 
(The obvious alternative approach would be to have wrapper functions in <arm_mve.h> which pass their arguments to the underlying builtins. But that doesn't work in the case where one of the arguments has to be a constant integer: the wrapper function can't pass the constantness through. So you'd have to do that case using a macro instead, and then use C11 `_Generic` to handle the polymorphism. Then you have to add horrible workarounds because `_Generic` requires even the untaken branches to type-check successfully, and //then// if the user gets the types wrong, the error message is totally unreadable!) Reviewers: dmgreen, miyuki, ostannard Subscribers: mgorny, javed.absar, kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D67161	2019-10-24 16:33:13 +01:00
Craig Topper	282eff3847	[X86] Always define the tzcnt intrinsics even when _MSC_VER is defined. These intrinsics use llvm.cttz intrinsics so are always available even without the bmi feature. We already don't check for the bmi feature on the intrinsics themselves. But we were blocking the include of the header file with _MSC_VER unless BMI was enabled on the command line. Fixes PR30506. llvm-svn: 374516	2019-10-11 06:07:53 +00:00
Pengfei Wang	1f3a15c397	[x86] Adding support for some missing intrinsics: _castf32_u32, _castf64_u64, _castu32_f32, _castu64_f64 Summary: Adding support for some missing intrinsics: _castf32_u32, _castf64_u64, _castu32_f32, _castu64_f64 Reviewers: craig.topper, LuoYuanke, RKSimon, pengfei Reviewed By: RKSimon Subscribers: llvm-commits Patch by yubing (Bing Yu) Differential Revision: https://reviews.llvm.org/D67212 llvm-svn: 372802	2019-09-25 02:24:05 +00:00
Richard Smith	5b2ba5afa9	Fix reliance on -flax-vector-conversions in AVX intrinsics headers and corresponding tests. llvm-svn: 372063	2019-09-17 03:56:30 +00:00
Richard Smith	a50884abad	Remove reliance on lax vector conversions from altivec.h in VSX mode. llvm-svn: 372061	2019-09-17 03:56:26 +00:00
Richard Smith	aeb279dd88	Remove reliance on lax vector conversions from altivec.h and its test. llvm-svn: 371814	2019-09-13 05:19:12 +00:00
Jinsong Ji	5309189d9b	[PowerPC][Altivec] Fix constant argument for vec_dss Summary: This is similar to vec_ct* in https://reviews.llvm.org/rL304205. The argument must be a constant, otherwise instruction selection will fail. always_inline is not enough for isel to always fold everything away at -O0. The fix is to turn the function into macros in altivec.h. Fixes https://bugs.llvm.org/show_bug.cgi?id=43072 Reviewers: nemanjai, hfinkel, #powerpc, wuzish Reviewed By: #powerpc, wuzish Subscribers: wuzish, kbarton, MaskRay, shchenz, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D66699 llvm-svn: 370902	2019-09-04 14:01:47 +00:00
Artem Belevich	ce94ec661f	[CUDA] Use activemask.b32 instruction to implement __activemask w/ CUDA-9.2+ vote.ballot instruction is gone in recent CUDA versions and vote.sync.ballot can not be used because it needs a thread mask parameter. Fortunately PTX 6.2 (introduced with CUDA-9.2) provides activemask.b32 instruction for this. Differential Revision: https://reviews.llvm.org/D66665 llvm-svn: 370792	2019-09-03 17:31:58 +00:00
Pengfei Wang	dea9cad10e	[x86] Fix bugs of some intrinsic functions in CLANG : _mm512_stream_ps, _mm512_stream_pd, _mm512_stream_si512 Reviewers: craig.topper, pengfei, LuoYuanke, RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Patch by Bing Yu (yubing) Differential Revision: https://reviews.llvm.org/D66786 llvm-svn: 370691	2019-09-03 02:06:15 +00:00
Pengfei Wang	caac097fbf	[x86] Adding support for some missing intrinsics: _mm512_cvtsi512_si32 Summary: Adding support for some missing intrinsics: _mm512_cvtsi512_si32 Reviewers: craig.topper, pengfei, LuoYuanke, spatel, RKSimon Reviewed By: craig.topper Subscribers: llvm-commits Patch by Bing Yu (yubing) Differential Revision: https://reviews.llvm.org/D66785 llvm-svn: 370297	2019-08-29 06:18:34 +00:00
Yaxun Liu	af478e240b	[OpenCL] Fix declaration of enqueue_marker Differential Revision: https://reviews.llvm.org/D66512 llvm-svn: 369641	2019-08-22 11:18:59 +00:00
Anastasia Stulova	ef58804ebc	[OpenCL] Fix lang mode predefined macros for C++ mode. In C++ mode we should only avoid adding __OPENCL_C_VERSION__, all other predefined macros about the language mode are still valid. This change also fixes the language version check in the headers accordingly. Differential Revision: https://reviews.llvm.org/D65941 llvm-svn: 368552	2019-08-12 10:44:07 +00:00
Raphael Isemann	c4b5b66a05	[clang] Fixed x86 cpuid NSC signature Summary: The signature "Geode by NSC" for NSC vendor is wrong. In lib/Headers/cpuid.h, signature_NSC_edx and signature_NSC_ecx constants are inverted (cpuid signature order is ebx # edx # ecx). Reviewers: teemperor, rsmith, craig.topper Reviewed By: teemperor, craig.topper Subscribers: craig.topper, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D65978 llvm-svn: 368510	2019-08-10 10:14:01 +00:00
Qiu Chaofan	e9efaf3529	[PowerPC] [Clang] Port SSE3, SSSE3 and SSE4 intrinsics to PowerPC Port existing headers which include x86 intrinsics implementation to PowerPC platform (using Altivec), along with tests. Also, tests about including these intrinsic headers are combined. The headers are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D65630 llvm-svn: 368392	2019-08-09 03:39:55 +00:00
Momchil Velikov	a36d31478c	[AArch64] Add support for Transactional Memory Extension (TME) Re-commit r366322 after some fixes TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Differential Revision: https://reviews.llvm.org/D64416 Patch by Javed Absar and Momchil Velikov llvm-svn: 367428	2019-07-31 12:52:17 +00:00
Qiu Chaofan	852d444671	[PowerPC] [Clang] Add platform guards to PPC vector intrinsics headers Move the platform check out of PPC Linux toolchain code and add platform guards to the intrinsic headers, since they are supported currently only on 64-bit PowerPC targets. Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D64849 llvm-svn: 367281	2019-07-30 02:18:11 +00:00
Paul Robinson	d2c0eefd5c	[X86] Remove const from some intrinsics that shouldn't have them llvm-svn: 366699	2019-07-22 16:14:09 +00:00
Sven van Haastregt	e9e59ad79f	[OpenCL] Define CLK_NULL_EVENT without cast Defining CLK_NULL_EVENT with a `(void*)` cast has the (unintended?) side-effect that the address space will be fixed (as generic in OpenCL 2.0 mode). The consequence is that any target specific address space for the clk_event_t type will not be applied. It is not clear why the void pointer cast was needed in the first place, and it seems we can do without it. Differential Revision: https://reviews.llvm.org/D63876 llvm-svn: 366546	2019-07-19 09:11:48 +00:00
Qiu Chaofan	03aaef8e72	[PowerPC][Clang] Remove use of malloc in mm_malloc Remove dependency of malloc in implementation of mm_malloc function in PowerPC intrinsics and alignment assumption on glibc. Reviewed By: Hal Finkel Differential Revision: https://reviews.llvm.org/D64850 llvm-svn: 366406	2019-07-18 06:20:12 +00:00
Momchil Velikov	0e2b74a2b0	Revert [AArch64] Add support for Transactional Memory Extension (TME) This reverts r366322 (git commit `4b8da3a503`) llvm-svn: 366355	2019-07-17 17:43:32 +00:00
Momchil Velikov	4b8da3a503	[AArch64] Add support for Transactional Memory Extension (TME) TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Patch by Javed Absar and Momchil Velikov Differential Revision: https://reviews.llvm.org/D64416 llvm-svn: 366322	2019-07-17 13:23:27 +00:00
Kyrylo Tkachov	eb72138340	[AArch64] Implement __jcvt intrinsic from Armv8.3-A The jcvt intrinsic defined in ACLE [1] is available when __ARM_FEATURE_JCVT is defined. This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function. The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used. I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example). make check-all didn't show any new failures. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics Differential Revision: https://reviews.llvm.org/D64495 llvm-svn: 366197	2019-07-16 09:27:39 +00:00
Ulrich Weigand	b98bf60ef7	[SystemZ] Add support for new cpu architecture - arch13 This patch series adds support for the next-generation arch13 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10303. Note: No currently available Z system supports the arch13 architecture. Once new systems become available, the official system name will be added as supported -march name. llvm-svn: 365933	2019-07-12 18:14:51 +00:00
Craig Topper	caf6b71ab2	[X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform the store as a <2 x float> instead of i64. This is similar to what we do for loadl_pi and loadh_pi. llvm-svn: 365669	2019-07-10 17:11:29 +00:00
Sven van Haastregt	b502a44110	[OpenCL] Restore ATOMIC_VAR_INIT We accidentally lost the ATOMIC_VAR_INIT and ATOMIC_FLAG_INIT macros in r363794. Also put the `memory_order` typedef back inside a `>= CL2.0` guard. llvm-svn: 364174	2019-06-24 10:06:40 +00:00
Sven van Haastregt	853dfab799	[OpenCL] Remove more duplicates from opencl-c.h Identified the duplicate declarations using sort lib/Headers/opencl-c.h | uniq -c | grep ' 2' llvm-svn: 364173	2019-06-24 10:06:34 +00:00
Anastasia Stulova	999f676d75	[OpenCL][PR41963] Add generic addr space to old atomics in C++ mode Add overloads with generic address space pointer to old atomics. This is currently only added for C++ compilation mode. Differential Revision: https://reviews.llvm.org/D62335 llvm-svn: 364071	2019-06-21 16:19:16 +00:00
Sven van Haastregt	772a7a7680	[OpenCL] Remove duplicate read_image declarations Patch by Pierre Gondois. llvm-svn: 364020	2019-06-21 10:26:10 +00:00
Craig Topper	6d9fb68c53	[X86] Make _mm_mask_cvtps_ph, _mm_maskz_cvtps_ph, _mm256_mask_cvtps_ph, and _mm256_maskz_cvtps_ph aliases for their corresponding cvt_roundps_ph intrinsic. These intrinsics should always take an immediate for the rounding mode. The base instruction comes from before EVEX embedded rounding. The user should always provide the immediate rather than us assuming CUR_DIRECTION. Make the 512-bit versions also explicit aliases instead of copy pasting the code. llvm-svn: 363961	2019-06-20 18:24:29 +00:00
Xing Xue	ab4bcd844a	AIX system headers need stdint.h and inttypes.h to be re-enterable Summary: AIX system headers need stdint.h and inttypes.h to be re-enterable when macro _STD_TYPES_T is defined so that limit macro definitions such as UINT32_MAX can be found. This patch attempts to allow that on AIX. Reviewers: hubert.reinterpretcast, jasonliu, mclow.lists, EricWF Reviewed by: hubert.reinterpretcast, mclow.lists Subscribers: jfb, jsji, christof, cfe-commits, libcxx-commits, llvm-commits Tags: #LLVM, #clang, #libc++ Differential Revision: https://reviews.llvm.org/D59253 llvm-svn: 363939	2019-06-20 15:36:32 +00:00
Craig Topper	24151619a0	[X86] Correct the __min_vector_width__ attribute on a few intrinsics. llvm-svn: 363890	2019-06-19 23:27:04 +00:00
Sven van Haastregt	af1c230e70	[OpenCL] Split type and macro definitions into opencl-c-base.h Using the -fdeclare-opencl-builtins option will require a way to predefine types and macros such as `int4`, `CLK_GLOBAL_MEM_FENCE`, etc. Move these out of opencl-c.h into opencl-c-base.h such that the latter can be shared by -fdeclare-opencl-builtins and -finclude-default-header. This changes the behaviour of -finclude-default-header when -fdeclare-opencl-builtins is specified: instead of including the full header, it will include the header with only the base definitions. Differential revision: https://reviews.llvm.org/D63256 llvm-svn: 363794	2019-06-19 12:48:22 +00:00
Zi Xuan Wu	cc12f68fff	[PowerPC] [Clang] Port SSE2 intrinsics to PowerPC Port emmintrin.h, which includes the Intel SSE2 intrinsics implementation, to the PowerPC platform (using Altivec). The new headers containing those implementations are located in a directory named ppc_wrappers which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. It's a follow-up patch of D62121. Patched by: Qiu Chaofan <qiucf@cn.ibm.com> Differential Revision: https://reviews.llvm.org/D62569 llvm-svn: 363122	2019-06-12 05:25:40 +00:00
Pengfei Wang	244062eece	[X86] Enable intrinsics that convert float and bf16 data to each other Scalar version : _mm_cvtsbh_ss , _mm_cvtness_sbh Vector version: _mm512_cvtpbh_ps , _mm256_cvtpbh_ps _mm512_maskz_cvtpbh_ps , _mm256_maskz_cvtpbh_ps _mm512_mask_cvtpbh_ps , _mm256_mask_cvtpbh_ps Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62363 llvm-svn: 363018	2019-06-11 01:17:28 +00:00
Pengfei Wang	3a29f7c99c	[X86] Add ENQCMD instructions For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Patch by Tianqing Wang (tianqing) Differential Revision: https://reviews.llvm.org/D62282 llvm-svn: 362685	2019-06-06 08:28:42 +00:00
Andrew Savonichev	9ed325e463	[OpenCL] Undefine cl_intel_planar_yuv extension Summary: Remove unnecessary definition (otherwise the extension will be defined where it's not supposed to be defined). Consider the code: #pragma OPENCL EXTENSION cl_intel_planar_yuv : begin // some declarations #pragma OPENCL EXTENSION cl_intel_planar_yuv : end is enough for extension to become known for clang. Patch by: Dmitry Sidorov <dmitry.sidorov@intel.com> Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Tags: #clang Differential Revision: https://reviews.llvm.org/D58666 llvm-svn: 362398	2019-06-03 13:02:43 +00:00
Pengfei Wang	cc3629d545	[X86] Add VP2INTERSECT instructions Support intel AVX512 VP2INTERSECT instructions in clang Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D62367 llvm-svn: 362196	2019-05-31 06:09:35 +00:00
Zi Xuan Wu	fc3ed1ec50	re-commit r361928: [PowerPC] [Clang] Port SSE intrinsics to PowerPC Port xmmintrin.h, which includes the Intel SSE intrinsics implementation, to the PowerPC platform (using Altivec). The new headers containing those implementations are located in a directory named ppc_wrappers which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. Patched by: Qiu Chaofan <qiucf@cn.ibm.com> Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D62121 llvm-svn: 362190	2019-05-31 04:42:13 +00:00
Zi Xuan Wu	48061cd999	revert rC361928: [PowerPC] [Clang] Port SSE intrinsics to PowerPC Because test fails in other targets rather than PowerPC llvm-svn: 361930	2019-05-29 07:09:54 +00:00
Zi Xuan Wu	b3bcbb5b66	[PowerPC] [Clang] Port SSE intrinsics to PowerPC Port xmmintrin.h, which includes the Intel SSE intrinsics implementation, to the PowerPC platform (using Altivec). The new headers containing those implementations are located in a directory named ppc_wrappers which has higher priority when the platform is PowerPC on Linux. They are mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. Patched by: Qiu Chaofan <qiucf@cn.ibm.com> Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D62121 llvm-svn: 361928	2019-05-29 05:17:03 +00:00
Kevin Petit	aa7754cc90	[OpenCL] Add support for the cl_arm_integer_dot_product extensions The specification is available in the Khronos OpenCL registry: https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_integer_dot_product.txt Signed-off-by: Kevin Petit <kevin.petit@arm.com> llvm-svn: 361641	2019-05-24 14:53:52 +00:00
Craig Topper	3d7ecc4618	[X86] Remove semicolons at the end of intrinsics implemented as macros so they can be used as arguments to other intrinsics. Also fix one intrinsic that was using variable names without underscores. Fixes PR41932 llvm-svn: 361109	2019-05-19 01:01:52 +00:00
Gheorghe-Teodor Bercea	144291e14c	[OpenMP][bugfix] Add missing math functions variants for log and abs. Summary: When including the random header in C++, some of the math functions it relies on are not present in the CUDA headers. We include these variants in this case. Reviewers: jdoerfert, hfinkel, tra, caomhin Reviewed By: tra Subscribers: efriedma, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D62046 llvm-svn: 361066	2019-05-17 19:15:53 +00:00
Craig Topper	20040db9a6	[X86] Stop implicitly enabling avx512vl when avx512bf16 is enabled. Previously we were doing this so that the 256 bit selectw builtin could be used in the implementation of the 512->256 bit conversion intrinsic. After this commit we now use a masked convert builtin that will emit the intrinsic call and the 256-bit select from custom code in CGBuiltin. Then the header only needs to call that one intrinsic. llvm-svn: 360924	2019-05-16 18:28:17 +00:00
Craig Topper	58964566e0	[X86] Update doxygen comments for AVX512BF16 to not refer to masks as 'immediates'. Refer to parameter names instead of 'src', 'src1', 'src2'. NFC llvm-svn: 360918	2019-05-16 17:34:35 +00:00
Gheorghe-Teodor Bercea	9392bd6987	[OpenMP][Bugfix] Move double and float versions of abs under c++ macro Summary: This is a fix for the reported bug 41861 (https://bugs.llvm.org/show_bug.cgi?id=41861): abs functions need to be moved under the c++ macro to avoid conflicts with included headers. Reviewers: tra, jdoerfert, hfinkel, ABataev, caomhin Reviewed By: jdoerfert Subscribers: guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61959 llvm-svn: 360809	2019-05-15 20:28:23 +00:00
Gheorghe-Teodor Bercea	7641f310d7	[OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions Summary: In OpenMP device offloading we must ensure that under C++ 17, the inclusion of cstdlib works correctly. Reviewers: ABataev, tra, jdoerfert, hfinkel, caomhin Reviewed By: jdoerfert Subscribers: Hahnfeld, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61949 llvm-svn: 360804	2019-05-15 20:18:21 +00:00
Volodymyr Sapsai	51e79f0634	[X86] Make `x86intrin.h`, `immintrin.h` includable with `-fno-gnu-inline-asm`. Currently `immintrin.h` includes `pconfigintrin.h` and `sgxintrin.h` which contain inline assembly. It causes failures when building with the flag `-fno-gnu-inline-asm`. Fix by excluding functions with inline assembly when this extension is disabled. So far there was no need to support `_pconfig_u32`, `_enclu_u32`, `_encls_u32`, `_enclv_u32` on platforms that require `-fno-gnu-inline-asm`. But if developers start using these functions, they'll have compile-time undeclared identifier errors, which is preferable to runtime errors. rdar://problem/49540880 Reviewers: craig.topper, GBuella, rnk, echristo Reviewed By: rnk Subscribers: jkorous, dexonsmith, cfe-commits Differential Revision: https://reviews.llvm.org/D61621 llvm-svn: 360630	2019-05-13 22:40:11 +00:00
Gheorghe-Teodor Bercea	946957189d	[OpenMP][Clang][BugFix] Split declares and math functions inclusion. Summary: This patch fixes an issue in which the __clang_cuda_cmath.h header is being included even when cmath or math.h headers are not included. Reviewers: jdoerfert, ABataev, hfinkel, caomhin, tra Reviewed By: tra Subscribers: tra, mgorny, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61765 llvm-svn: 360626	2019-05-13 22:11:44 +00:00
Reid Kleckner	55fab1ff48	Revert Include corecrt.h in stddef.h and vcruntime.h in stdarg.h to improve MS compatibility. This reverts r360271 (git commit `a0933bd8ec`) There are concerns on the review that this breaks EFI builds and that the transitive includes (sal.h) are actually heavy enough that we might care. llvm-svn: 360291	2019-05-08 22:01:20 +00:00
Mike Rice	a0933bd8ec	Include corecrt.h in stddef.h and vcruntime.h in stdarg.h to improve MS compatibility. This allows some applications developed with MSVC to compile with clang without any extra changes. Fixes: llvm.org/PR40789 Differential Revision: https://reviews.llvm.org/D61646 llvm-svn: 360271	2019-05-08 17:15:21 +00:00
Gheorghe-Teodor Bercea	e62c693c8e	[OpenMP][Clang] Support for target math functions Summary: In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang. We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism. Authors: @gtbercea @jdoerfert Reviewers: hfinkel, caomhin, ABataev, tra Reviewed By: hfinkel, ABataev, tra Subscribers: JDevlieghere, mgorny, guansong, cfe-commits, jdoerfert Tags: #clang Differential Revision: https://reviews.llvm.org/D61399 llvm-svn: 360265	2019-05-08 15:52:33 +00:00
Jonas Devlieghere	fe608c938c	Revert "[OpenMP][Clang] Support for target math functions" This commit appears to be breaking stage-2 builds on GreenDragon. The OpenMP wrappers for cmath and math.h are copied into the root of the resource directory and cause a cyclic dependency in module 'Darwin': Darwin -> std -> Darwin. This blows up when CMake is testing for modules support and breaks all stage 2 module builds, including the ThinLTO bot and all LLDB bots. CMake Error at cmake/modules/HandleLLVMOptions.cmake:497 (message): LLVM_ENABLE_MODULES is not supported by this compiler llvm-svn: 360192	2019-05-07 21:08:15 +00:00
Gheorghe-Teodor Bercea	1e28a668bc	[OpenMP][Clang] Support for target math functions Summary: In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang. We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism. Authors: @gtbercea @jdoerfert Reviewers: hfinkel, caomhin, ABataev, tra Reviewed By: hfinkel, ABataev, tra Subscribers: mgorny, guansong, cfe-commits, jdoerfert Tags: #clang Differential Revision: https://reviews.llvm.org/D61399 llvm-svn: 360063	2019-05-06 18:19:15 +00:00
Fangrui Song	041c377a59	[X86] Move files to correct directories after D60552 llvm-svn: 360022	2019-05-06 09:24:36 +00:00
Luo, Yuanke	844f662932	Enable intrinsics of AVX512_BF16, which are supported for BFLOAT16 in Cooper Lake Summary: 1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake; 2. Enable intrinsics for VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision. For more details about BF16 intrinsic, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Patch by LiuTianle Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, spatel, RKSimon Reviewed By: craig.topper Subscribers: mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60552 llvm-svn: 360018	2019-05-06 08:25:11 +00:00
Tom Stellard	dbe1c4aa6f	lib/Header: Fix Visual Studio builds try #2 Summary: This is a follow up to r355253 and a better fix than the first attempt which was r359257. We can't install anything from ${CMAKE_CFG_INTDIR}, because this value is only defined at build time, but we still must make sure to copy the headers into ${CMAKE_CFG_INTDIR}/lib/clang/$VERSION/include, because the lit tests look for headers there. So for this fix we revert to the old behavior of copying the headers to ${CMAKE_CFG_INTDIR}/lib/clang/$VERSION/include during the build and then installing them from the source tree. Reviewers: smeenai, vzakhari, phosek Reviewed By: smeenai, vzakhari Subscribers: mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61220 llvm-svn: 359654	2019-05-01 06:18:03 +00:00
Javed Absar	18b0c40bc5	[AArch64] Add support for MTE intrinsics This provides intrinsics support for Memory Tagging Extension (MTE), which was introduced with the Armv8.5-a architecture. These intrinsics are available when __ARM_FEATURE_MEMORY_TAGGING is defined. Each intrinsic is described in detail in the ACLE Q1 2019 documentation: https://developer.arm.com/docs/101028/latest Reviewed By: Tim Northover, David Spickett Differential Revision: https://reviews.llvm.org/D60485 llvm-svn: 359348	2019-04-26 21:08:11 +00:00
Tom Stellard	0184819e81	Revert lib/Header: Fix Visual Studio builds This reverts r359257 (git commit `00d9789509`) This broke check-clang. llvm-svn: 359258	2019-04-26 01:43:59 +00:00
Tom Stellard	00d9789509	lib/Header: Fix Visual Studio builds Summary: This is a follow up to r355253, which inadvertently broke Visual Studio builds by trying to copy files from CMAKE_CFG_INTDIR. See https://reviews.llvm.org/D58537#inline-532492 Reviewers: smeenai, vzakhari, phosek Reviewed By: smeenai Subscribers: mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61054 llvm-svn: 359257	2019-04-26 01:18:59 +00:00
Jinsong Ji	12450d51a2	[PowerPC][NFC]Update licence to Apache 2 llvm-svn: 359164	2019-04-25 02:40:06 +00:00
Qiu Chaofan	19828e399b	[PowerPC] [Clang] Port MMX intrinsics and basic test cases to Power Port mmintrin.h, which includes the x86 MMX intrinsics implementation, to the PowerPC platform (using Altivec). To make the include process correct, PowerPC's toolchain class is overridden to insert the new headers directory (named ppc_wrappers) into the path. Basic test cases for several intrinsic functions are added. The header is mainly developed by Steven Munroe, with contributions from Paul Clarke, Bill Schmidt, Jinsong Ji and Zixuan Wu. Reviewed By: Jinsong Ji Differential Revision: https://reviews.llvm.org/D59924 llvm-svn: 358949	2019-04-23 05:50:24 +00:00
Evgeny Mankov	88aa3d7237	[CUDA][Windows] Restrict long double device functions declarations to Windows As agreed in D60220, make long double declarations unobservable on non-windows platforms. [Testing] {Windows 10, Ubuntu 16.04.5}/{Visual C++ 2017 15.9.11 & 2019 16.0.1, gcc+ 5.4.0}/CUDA {8.0, 9.0, 9.1, 9.2, 10.0, 10.1} Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D60818 llvm-svn: 358654	2019-04-18 10:08:55 +00:00
Craig Topper	8e364c680f	[X86] Restore the pavg intrinsics. The pattern we replaced these with may be too hard to match as demonstrated by PR41496 and PR41316. This patch restores the intrinsics and then we can start focusing on optimizing the intrinsics. I've mostly reverted the original patch that removed them, though I modified the avx512 intrinsics to not have masking built in. Differential Revision: https://reviews.llvm.org/D60674 llvm-svn: 358427	2019-04-15 17:17:35 +00:00
Chandler Carruth	4cf5743b77	Move the builtin headers to use the new license file header. Summary: These all had somewhat custom file headers with different text from the ones I searched for previously, and so I missed them. Thanks to Hal and Kristina and others who prompted me to fix this, and sorry it took so long. Reviewers: hfinkel Subscribers: mcrosier, javed.absar, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60406 llvm-svn: 357941	2019-04-08 20:51:30 +00:00
Evgeny Mankov	66a8b07cd9	[CUDA][Windows] Last fix for the clang Bug 38811 "Clang fails to compile with CUDA-9.x on Windows" (https://bugs.llvm.org/show_bug.cgi?id=38811 ). [IMPORTANT] With that last fix, CUDA has just started being compiled by clang on Windows again after nearly a year and two major clang releases (7 and 8). As long as the last LLVM release, in which clang was compiling CUDA on Windows successfully, was 6.0.1, this fix and two previous have to be included into upcoming 7.1.0 and 8.0.1 releases. [How to repro] clang++.exe -x cuda "c:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\0_Simple\simplePrintf\simplePrintf.cu" -I"c:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\common\inc" --cuda-gpu-arch=sm_50 --cuda-path="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0" -L"c:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64" -lcudart.lib -v [Output] In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.9\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:390:11: error: no matching function for call to '__isinfl' return (__isinfl(a) != 0); ^~~~~~~~ C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2662:14: note: candidate function not viable: call to __host__ function from __device__ function __func__(int __isinfl(long double a)) ^ In file included from <built-in>:1: In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.9\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:438:11: error: no matching function for call to '__isnanl' return (__isnanl(a) != 0); ^~~~~~~~ C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2672:14: note: candidate function not viable: call to __host__ function from __device__ function 
__func__(int __isnanl(long double a)) ^ In file included from <built-in>:1: In file included from C:\GIT\LLVM\trunk-for-submits\llvm-64-release-vs2017-15.9.9\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:327: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:486:11: error: no matching function for call to '__finitel' return (__finitel(a) != 0); ^~~~~~~~~ C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0/include\crt/math_functions.hpp:2652:14: note: candidate function not viable: call to __host__ function from __device__ function __func__(int __finitel(long double a)) ^ 3 errors generated when compiling for sm_50. [Solution] Add missing long double device functions' declarations. Provide only declarations to prevent any use of long double on the device side, because CUDA does not support long double on the device side. [Testing] {Windows 10, Ubuntu 16.04.5}/{Visual C++ 2017 15.9.9, gcc+ 5.4.0}/CUDA {8.0, 9.0, 9.1, 9.2, 10.0, 10.1} Reviewed by: Artem Belevich Differential Revision: http://reviews.llvm.org/D60220 llvm-svn: 357779	2019-04-05 16:51:10 +00:00
Craig Topper	6af0363857	[X86] Make _bswap intrinsic a function instead of a macro to hopefully fix the chromium build. This intrinsic was added in r356848 but was implemented as a macro to match gcc. llvm-svn: 356862	2019-03-24 18:00:20 +00:00
Craig Topper	88f4054f48	[X86] Add BSR/BSF/BSWAP intrinsics to ia32intrin.h to match gcc. Summary: These are all implemented by icc as well. I made bit_scan_forward/reverse forward to the __bsfd/__bsrq since we also have __bsfq/__bsrq. Note, when lzcnt is enabled the bsr intrinsics generates lzcnt+xor instead of bsr. Reviewers: RKSimon, spatel Subscribers: cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D59682 llvm-svn: 356848	2019-03-24 00:56:52 +00:00
Craig Topper	1383340422	[X86] Add __popcntd and __popcntq to ia32intrin.h to match gcc and icc. Remove popcnt feature flag from _popcnt32/_popcnt64 and move to ia32intrin.h to match gcc. gcc and icc both implement popcntd and popcntq which we did not. gcc doesn't seem to require a feature flag for the _popcnt32/_popcnt64 spelling and will use a libcall if it's not supported. Differential Revision: https://reviews.llvm.org/D59567 llvm-svn: 356689	2019-03-21 17:43:53 +00:00
Craig Topper	e0941cb326	[X86] Add __crc32b/__crc32w/__crc32d/__crc32q intrinsics to match gcc and icc. gcc has these intrinsics in ia32intrin.h as well. And icc implements them though they aren't documented in the Intel Intrinsics Guide. Differential Revision: https://reviews.llvm.org/D59533 llvm-svn: 356609	2019-03-20 20:25:28 +00:00
Craig Topper	8b653d0308	[X86] Add gcc rotate intrinsics to ia32intrin.h This is another attempt at what Erich Keane tried to do in r355322. This adds rolb, rolw, rold, rolq and their ror equivalent as always_inline wrappers around __builtin_rotate* which will lower to funnel shift intrinsics in IR. Additionally, when _MSC_VER is not defined we will define _rotl, _lrotl, _rotr, _lrotr as macros to one of the always_inline intrinsics mentioned above. Making sure that _lrotl/_lrotr use either 32 or 64 bit based on the size of long. These need to be macros because we have builtins with the same name for MS compatibility, but _MSC_VER isn't always defined when those builtins are enabled. We also define _rotwl and _rotwr as macros aliasing to rolw/rorw just like gcc to complete the set. These don't need to be gated with _MSC_VER because these aren't MS builtins. I've added tests both for non-MS and -ms-extensions with and without _MSC_VER being defined. Differential Revision: https://reviews.llvm.org/D59346 llvm-svn: 356423	2019-03-18 22:25:57 +00:00
Evgeny Mankov	177301f048	[CUDA][Windows] Partial fix for bug 38811 (Step 2 of 3) Partial fix for the clang Bug 38811 "Clang fails to compile with CUDA-9.x on Windows". [Synopsis] __sptr is a new Microsoft specific modifier (https://docs.microsoft.com/en-us/cpp/cpp/sptr-uptr?view=vs-2017). [Solution] Replace all `__sptr` occurrences with `__s` (and all `__cptr` with `__c` as well) to eliminate the below clang compilation error on Windows. In file included from C:\GIT\LLVM\trunk\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_runtime_wrapper.h:162: C:\GIT\LLVM\trunk\llvm-64-release-vs2017-15.9.5\dist\lib\clang\9.0.0\include\__clang_cuda_device_functions.h:524:33: error: expected expression return __nv_fast_sincosf(__a, __sptr, __cptr); ^ Reviewed by: Artem Belevich Differential Revision: http://reviews.llvm.org/D59423 llvm-svn: 356291	2019-03-15 19:04:46 +00:00
Evgeny Mankov	04188fc0c6	[CUDA][Windows] Partial fix for bug #38811 (Step 1 of 3) Partial fix for the clang Bug https://bugs.llvm.org/show_bug.cgi?id=38811 "Clang fails to compile with CUDA-9.x on Windows". Adding defined(_WIN64) check along with existing #if defined(__LP64__) eliminates the below clang (64-bit) compilation error on Windows. C:/GIT/LLVM/trunk/llvm-64-release-vs2017/dist/lib/clang/9.0.0\include\__clang_cuda_device_functions.h(1609,45): error GEF7559A7: no matching function for call to 'roundf' __DEVICE__ long lroundf(float __a) { return roundf(__a); } Reviewed by: Artem Belevich Differential Revision: http://reviews.llvm.org/D59361 llvm-svn: 356255	2019-03-15 12:05:36 +00:00
Shoaib Meenai	5be71faf4b	[build] Rename clang-headers to clang-resource-headers Summary: The current install-clang-headers target installs clang's resource directory headers. This is different from the install-llvm-headers target, which installs LLVM's API headers. We want to introduce the corresponding target to clang, and the natural name for that new target would be install-clang-headers. Rename the existing target to install-clang-resource-headers to free up the install-clang-headers name for the new target, following the discussion on cfe-dev [1]. I didn't find any bots on zorg referencing install-clang-headers. I'll send out another PSA to cfe-dev to accompany this rename. [1] http://lists.llvm.org/pipermail/cfe-dev/2019-February/061365.html Reviewers: beanz, phosek, tstellar, rnk, dim, serge-sans-paille Subscribers: mgorny, javed.absar, jdoerfert, #sanitizers, openmp-commits, lldb-commits, cfe-commits, llvm-commits Tags: #clang, #sanitizers, #lldb, #openmp, #llvm Differential Revision: https://reviews.llvm.org/D58791 llvm-svn: 355340	2019-03-04 21:19:53 +00:00
Tom Stellard	4f076000c6	lib/Header: Simplify CMakeLists.txt Summary: Replace cut and pasted code with cmake macros and reduce the number of install commands. This fixes an issue where the headers were being installed twice. This clean up should also make future modifications easier, like adding a cmake option to install header files into a custom resource directory. Reviewers: chandlerc, smeenai, mgorny, beanz, phosek Reviewed By: smeenai Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58537 llvm-svn: 355253	2019-03-02 00:50:13 +00:00
Louis Dionne	c2d95792d6	[clang] Only provide C11 features in <float.h> starting with C++17 Summary: In r353970, I enabled those features in C++11 and above. To be strictly conforming, those features should only be enabled in C++17 and above. Reviewers: jfb, eli.friedman Subscribers: jkorous, dexonsmith, libcxx-commits Differential Revision: https://reviews.llvm.org/D58289 llvm-svn: 354691	2019-02-22 20:48:54 +00:00
Shoaib Meenai	defb5a383b	[clang] Switch to LLVM_ENABLE_IDE r344555 switched LLVM to guarding install targets with LLVM_ENABLE_IDE instead of CMAKE_CONFIGURATION_TYPES, which expresses the intent more directly and can be overridden by a user. Make the corresponding change in clang. LLVM_ENABLE_IDE is computed by HandleLLVMOptions, so it should be available for both standalone and integrated builds. Differential Revision: https://reviews.llvm.org/D58284 llvm-svn: 354525	2019-02-20 23:08:43 +00:00
Louis Dionne	defa9f8f85	[clang] Make sure C99/C11 features in <float.h> are provided in C++11 Summary: Previously, those #defines were only provided in C or when GNU extensions were enabled. We need those #defines in C++11 and above, too. Reviewers: jfb, eli.friedman Subscribers: jkorous, dexonsmith, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58149 llvm-svn: 353970	2019-02-13 19:08:01 +00:00
Simon Atanasyan	4c22a57414	[Headers][mips] Add `__attribute__((__mode__(__unwind_word__)))` to the _Unwind_Word / _Unwind_SWord definitions The rationale of this change is to fix _Unwind_Word / _Unwind_SWord definitions for MIPS N32 ABI. This ABI uses 32-bit pointers, but _Unwind_Word and _Unwind_SWord types are eight bytes long. # The __attribute__((__mode__(__unwind_word__))) is added to the type definitions. It makes them equal to the corresponding definitions used by GCC and allows overriding the types using the `getUnwindWordWidth` function. # The `getUnwindWordWidth` virtual function is overridden in the `MipsTargetInfo` class and provides correct type size values. Differential revision: https://reviews.llvm.org/D58165 llvm-svn: 353965	2019-02-13 18:27:09 +00:00
Reid Kleckner	79d7f4114d	[X86] Use __m128_u for _mm_loadu_ps after r353555 Add secondary triple to existing SSE test for it. I audited other uses of __attribute__((__packed__)) in the intrinsic headers, and this seemed to be the only missing one. llvm-svn: 353878	2019-02-12 21:04:21 +00:00
Craig Topper	4390c721cb	[X86] Use the new unaligned vector typedefs for the loadu/storeu intrinsics pointer arguments. This matches what gcc does and what was suggested by rnk in PR20670. llvm-svn: 353802	2019-02-12 07:44:40 +00:00
Tom Tan	42b2424e4f	[COFF, ARM64] Remove definitions for _byteswap library functions _byteswap_* functions are implemented in the file below as normal functions from libucrt.lib and declared in stdlib.h. Defining them in intrin.h triggers the lld errors "conflicting comdat type" and "duplicate symbols", which were just added to LLD (https://reviews.llvm.org/D57324). C:\Program Files (x86)\Windows Kits\10\Source\10.0.17763.0\ucrt\stdlib\byteswap.cpp Differential Revision: https://reviews.llvm.org/D57915 llvm-svn: 353740	2019-02-11 20:04:02 +00:00
Craig Topper	be4cbe8726	[X86] Add explicit alignment to __m128/__m128i/__m128d/etc. to allow matching of MSVC behavior with #pragma pack. Summary: With MSVC, #pragma pack is ignored when there is explicit alignment. This differs from gcc. Clang emulates this difference when compiling for Windows. It appears that MSVC and its headers consider the __m128/__m128i/__m128d/etc. types to be explicitly aligned and ignores #pragma pack for them. Since we don't have explicit alignment on them in our headers, we don't match the MSVC behavior here. This patch adds explicit alignment to match this behavior. I'm hoping this won't cause any problems when we're not emulating MSVC. But if someone knows of something that would be different we can switch to conditionally adding the alignment based on _MSC_VER. I had to add explicitly unaligned types as well so we could use them in the loadu/storeu intrinsics which use __attribute__(__packed__). Using the now explicitly aligned types wouldn't produce align 1 accesses when targeting Windows. Reviewers: rnk, erichkeane, spatel, RKSimon Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D57961 llvm-svn: 353555	2019-02-08 19:45:08 +00:00
Eli Friedman	3189d5f48c	[COFF, ARM64] Fix types for _ReadStatusReg, _WriteStatusReg r344765 added those intrinsics, but used the wrong types. Patch by Mike Hommey Differential Revision: https://reviews.llvm.org/D57636 llvm-svn: 353493	2019-02-08 01:17:49 +00:00
Artem Belevich	4071763bb8	Basic CUDA-10 support. Differential Revision: https://reviews.llvm.org/D57771 llvm-svn: 353232	2019-02-05 22:38:58 +00:00
Artem Belevich	c62214da3d	[CUDA] add support for the new kernel launch API in CUDA-9.2+. Instead of calling CUDA runtime to arrange function arguments, the new API constructs arguments in a local array and the kernels are launched with __cudaLaunchKernel(). The old API has been deprecated and is expected to go away in the next CUDA release. Differential Revision: https://reviews.llvm.org/D57488 llvm-svn: 352799	2019-01-31 21:34:03 +00:00
Matt Arsenault	58fc8082a8	OpenCL: Use length modifier for warning on vector printf arguments Re-enable format string warnings on printf. The warnings are still incomplete. Apparently it is undefined to use a vector specifier without a length modifier, which is not currently warned on. Additionally, type warnings appear to not be working with the hh modifier, and aren't warning on all of the special restrictions from c99 printf. llvm-svn: 352540	2019-01-29 20:49:54 +00:00
Matt Arsenault	297afb14ec	Revert "OpenCL: Extend argument promotion rules to vector types" This reverts r348083. This was based on a misreading of the spec for printf specifiers. Also revert r343653, as without a subsequent patch, a correctly specified format for a vector will incorrectly warn. Fixes bug 40491. llvm-svn: 352539	2019-01-29 20:49:47 +00:00
Craig Topper	8de5abc4c8	[X86] Remove mask and passthru arguments from vpconflict builtins. Use select in IR instead. llvm-svn: 352173	2019-01-25 07:08:22 +00:00
Craig Topper	9fddc3fd00	[X86] Remove the cvtuqq2ps256/cvtqq2ps256 mask builtins. Replace with uitofp/sitofp and select. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: kristina, cfe-commits Differential Revision: https://reviews.llvm.org/D56965 llvm-svn: 351694	2019-01-20 19:04:56 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Craig Topper	d08d90ce51	[X86] Only define _XCR_XFEATURE_ENABLED_MASK in xsaveintrin.h when _MSC_VER is defined. Remove from intrin.h. I think this was my intention when I added it to xsaveintrin.h llvm-svn: 351568	2019-01-18 17:51:51 +00:00
Craig Topper	931779761e	Recommit r351160 "[X86] Make _xgetbv/_xsetbv on non-windows platforms" V8 has been fixed now. llvm-svn: 351391	2019-01-16 22:56:25 +00:00

1732 Commits