llvm-project

Commit Graph

Author	SHA1	Message	Date
Alex Richardson	51e09e1d5a	[AMDGPU] Set the default globals address space to 1 This will ensure that passes that add new global variables will create them in address space 1 once the passes have been updated to no longer default to the implicit address space zero. This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't already to present to ensure bitcode backwards compatibility. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D84345	2020-11-20 15:46:53 +00:00
Yaxun (Sam) Liu	3f4b5893ef	[AMDGPU] Add option -munsafe-fp-atomics Add an option -munsafe-fp-atomics for AMDGPU target. When enabled, clang adds function attribute "amdgpu-unsafe-fp-atomics" to any functions for amdgpu target. This allows amdgpu backend to use unsafe fp atomic instructions in these functions. Differential Revision: https://reviews.llvm.org/D91546	2020-11-16 21:52:12 -05:00
Baptiste Saleil	170e45ae18	[PowerPC] Prevent the use of MMA with P9 and earlier We want to allow using MMA on P10 CPU only. This patch prevents the use of MMA with the -mmma option on P9 CPUs and earlier. Differential Revision: https://reviews.llvm.org/D91200	2020-11-12 10:36:50 -06:00
Qiu Chaofan	2abc33683b	[PowerPC] [Clang] Define macros to identify quad-fp semantics We have option -mabi=ieeelongdouble to set current long double to IEEEquad semantics. Like what GCC does, we need to define __LONG_DOUBLE_IEEE128__ macro in this case, and __LONG_DOUBLE_IBM128__ if using PPCDoubleDouble. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D90208	2020-11-12 10:26:13 +08:00
Kazushi (Jam) Marukawa	6e0ae20f3b	[VE] Support vector register in inline asm Support a vector register constraint in inline asm of clang. Add a regression test also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D91251	2020-11-12 06:18:35 +09:00
Akira Hatanaka	d9258a21f0	Fix the data layout mangling specification for 'arm64-pc-win32-macho' rdar://problem/70410504	2020-11-10 18:52:12 -08:00
Tim Renouf	89d41f3a2b	[AMDGPU] Add gfx1033 target Differential Revision: https://reviews.llvm.org/D90447 Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761	2020-11-03 16:27:48 +00:00
Tim Renouf	ee3e642627	[AMDGPU] Add gfx90c target This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were previously included in gfx909. Differential Revision: https://reviews.llvm.org/D90419 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-11-03 16:27:43 +00:00
Liu, Chen3	756f597841	[X86] Support Intel avxvnni This patch mainly made the following changes: 1. Support AVX-VNNI instructions; 2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix. Differential Revision: https://reviews.llvm.org/D89105	2020-10-31 12:39:51 +08:00
Ben Shi	5be50d79c0	[NFC][clang][AVR] Add more devices Reviewed By: dylanmckay Differential Revision: https://reviews.llvm.org/D88352	2020-10-29 11:49:21 +08:00
Benjamin Kramer	39a0d6889d	[X86] Add a stub for Intel's alderlake. No scheduling, no autodetection.	2020-10-24 19:01:22 +02:00
Benjamin Kramer	bd2cf96c09	[X86] Add a stub for znver3 based on the little public information there is in AMD's manuals No scheduling, no autodetection. Just enough so -march=znver3 works.	2020-10-24 19:01:22 +02:00
Xiangling Liao	05bef88eb3	[AIX] Let alloca return 16 bytes alignment On AIX, to support vector types, which should always be 16 bytes aligned, we set alloca to return 16 bytes aligned memory space. Differential Revision: https://reviews.llvm.org/D89910	2020-10-23 14:41:32 -04:00
Marco Antognini	a779a16993	[OpenCL] Remove unused extensions Many non-language extensions are defined but also unused. This patch removes them with their tests as they do not require compiler support. The cl_khr_select_fprounding_mode extension is also removed because it has been deprecated since OpenCL 1.1 and Clang doesn't have any specific support for it. The cl_khr_context_abort extension is only referred to in "The OpenCL Specification", version 1.2 and 2.0, in Table 4.3, but no specification is provided in "The OpenCL Extension Specification" for these versions. Because it is both unused in Clang and lacks specification, this extension is removed. The following extensions are platform extensions that bring new OpenCL APIs but do not impact the kernel language nor require compiler support. They are therefore removed. - cl_khr_gl_sharing, introduced in OpenCL 1.0 - cl_khr_icd, introduced in OpenCL 1.2 - cl_khr_gl_event, introduced in OpenCL 1.1 Note: this extension adds a new API to create cl_event but it also specifies that these can only be used by clEnqueueAcquireGLObjects. Hence, they cannot be used on the device side and the extension does not impact the kernel language. - cl_khr_d3d10_sharing, introduced in OpenCL 1.1 - cl_khr_d3d11_sharing, introduced in OpenCL 1.2 - cl_khr_dx9_media_sharing, introduced in OpenCL 1.2 - cl_khr_image2d_from_buffer, introduced in OpenCL 1.2 - cl_khr_initialize_memory, introduced in OpenCL 1.2 - cl_khr_gl_depth_images, introduced in OpenCL 1.2 Note: this extension is related to cl_khr_depth_images but only the latter adds new features to the kernel language. - cl_khr_spir, introduced in OpenCL 1.2 - cl_khr_egl_event, introduced in OpenCL 1.2 Note: this extension adds a new API to create cl_event but it also specifies that these can only be used by clEnqueueAcquire* API functions. Hence, they cannot be used on the device side and the extension does not impact the kernel language. - cl_khr_egl_image, introduced in OpenCL 1.2 - cl_khr_terminate_context, introduced in OpenCL 1.2 The minimum required OpenCL version used in OpenCLExtensions.def for these extensions is not always correct. Removing these address that issue. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D89372	2020-10-22 17:01:31 +01:00
Tianqing Wang	be39a6fe6f	[X86] Add User Interrupts(UINTR) instructions For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89301	2020-10-22 17:33:07 +08:00
Kito Cheng	cfa7094e49	[RISCV] Add -mtune support - The goal of this patch is improve option compatible with RISCV-V GCC, -mcpu support on GCC side will sent patch in next few days. - -mtune only affect the pipeline model and non-arch/extension related target feature, e.g. instruction fusion; in td file it called TuneFeatures, which is introduced by X86 back-end[1]. - -mtune accept all valid option for -mcpu and extra alias processor option, e.g. `generic`, `rocket` and `sifive-7-series`, the purpose is option compatible with RISCV-V GCC. - Processor alias for -mtune will resolve according the current target arch, rv32 or rv64, e.g. `rocket` will resolve to `rocket-rv32` or `rocket-rv64`. - Interaction between -mcpu and -mtune: * -mtune has higher priority than -mcpu for pipeline model and TuneFeatures. [1] https://reviews.llvm.org/D85165 Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D89025	2020-10-16 13:55:08 +08:00
Stanislav Mekhanoshin	d1beb95d12	[AMDGPU] gfx1032 target Differential Revision: https://reviews.llvm.org/D89487	2020-10-15 12:41:18 -07:00
Wang, Pengfei	412cdcf2ed	[X86] Add HRESET instruction. For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89102	2020-10-13 08:47:26 +08:00
Fangrui Song	012dd42e02	[X86] Support -march=x86-64-v[234] PR47686. These micro-architecture levels are defined in the x86-64 psABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9 GCC 11 will support these levels. Note, -mtune=x86-64-v[234] are invalid and __builtin_cpu_is cannot be used on them. Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D89197	2020-10-12 10:29:46 -07:00
Fangrui Song	cbe4d973ed	[X86] Define __LAHF_SAHF__ if feature 'sahf' is set or 32-bit mode GCC 11 will define this macro. In LLVM, the feature flag only applies to 64-bit mode and we always define the macro in 32-bit mode. This is different from GCC -m32 in which -mno-sahf can suppress the macro. The discrepancy can unlikely cause trouble. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89198	2020-10-11 09:46:00 -07:00
Tim Renouf	666ef0db20	[AMDGPU] Add gfx602, gfx705, gfx805 targets At AMD, in an internal audit of our code, we found some corner cases where we were not quite differentiating targets enough for some old hardware. This commit is part of fixing that by adding three new targets: * The "Oland" and "Hainan" variants of gfx601 are now split out into gfx602. LLPC (in the GPUOpen driver) and other front-ends could use that to avoid using the shaderZExport workaround on gfx602. * One variant of gfx703 is now split out into gfx705. LLPC and other front-ends could use that to avoid using the shaderSpiCsRegAllocFragmentation workaround on gfx705. * The "TongaPro" variant of gfx802 is now split out into gfx805. TongaPro has a faster 64-bit shift than its former friends in gfx802, and a subtarget feature could be set up for that to take advantage of it. This commit does not make that change; it just adds the target. V2: Add clang changes. Put TargetParser list in order. V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order, so fix the GPUKind order. Differential Revision: https://reviews.llvm.org/D88916 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-10-10 17:22:22 +01:00
Fanbo Meng	c781dc74a8	[SystemZ][z/OS] Set default alignment rules for z/OS target Set the default alignment control variables for z/OS target and add test case for alignment rules on z/OS. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D88845	2020-10-06 13:16:15 -04:00
Yaxun (Sam) Liu	cbd420c5ed	[CUDA][HIP] Fix bound arch for offload action for fat binary Currently CUDA/HIP toolchain uses "unknown" as bound arch for offload action for fat binary. This causes -mcpu or -march with "unknown" added in HIPToolChain::TranslateArgs or CUDAToolChain::TranslateArgs. This causes issue for https://reviews.llvm.org/D88377 since HIP toolchain needs to check -mcpu in HIPToolChain::TranslateArgs. The bound arch of offload action for fat binary is not really used, therefore set it to CudaArch::UNUSED. Differential Revision: https://reviews.llvm.org/D88524	2020-10-02 19:05:51 -04:00
Yaxun (Sam) Liu	36501b180a	Emit predefined macro for wavefront size for amdgcn Also fix the issue of multiple -m[no-]wavefrontsize64 options to make the last one wins. Differential Revision: https://reviews.llvm.org/D88370	2020-10-02 10:17:21 -04:00
Sjoerd Meijer	8825fec37e	[AArch64] Add CPU Cortex-R82 This adds support for -mcpu=cortex-r82. Some more information about this core can be found here: https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r82 One note about the system register: that is a bit of a refactoring because of small differences between v8.4-A AArch64 and v8-R AArch64. This is based on patches from Mark Murray and Mikhail Maltsev. Differential Revision: https://reviews.llvm.org/D88660	2020-10-02 12:47:23 +01:00
Akira Hatanaka	21cf2e6c26	Handle unknown OSes in DarwinTargetInfo::getExnObjectAlignment rdar://problem/69727650	2020-09-30 16:05:17 -07:00
Xiang1 Zhang	413577a879	[X86] Support Intel Key Locker Key Locker provides a mechanism to encrypt and decrypt data with an AES key without having access to the raw key value by converting AES keys into “handles”. These handles can be used to perform the same encryption and decryption operations as the original AES keys, but they only work on the current system and only until they are revoked. If software revokes Key Locker handles (e.g., on a reboot), then any previous handles can no longer be used. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D88398	2020-09-30 18:08:45 +08:00
Baptiste Saleil	0156914275	[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types This patch legalizes the v256i1 and v512i1 types that will be used for MMA. It implements loads and stores of these types. v256i1 is a pair of VSX registers, so for this type, we load/store the two underlying registers. v512i1 is used for MMA accumulators. So in addition to loading and storing the 4 associated VSX registers, we generate instructions to prime (copy the VSX registers to the accumulator) after loading and unprime (copy the accumulator back to the VSX registers) before storing. This patch also adds the UACC register class that is necessary to implement the loads and stores. This class represents accumulator in their unprimed form and allow the distinction between primed and unprimed accumulators to avoid invalid copies of the VSX registers associated with primed accumulators. Differential Revision: https://reviews.llvm.org/D84968	2020-09-28 14:39:37 -05:00
Abhina Sreeskantharajan	0fb97fd6a4	[SystemZ][z/OS] Set default wchar_t type for zOS Set the default wchar_t type on z/OS, and unsigned as the default. Reviewed By: hubert.reinterpretcast, fanbo-meng Differential Revision: https://reviews.llvm.org/D87624	2020-09-22 08:03:03 -04:00
Fanbo Meng	2240ca0bd1	[SystemZ][z/OS] Set aligned allocation unavailable by default for z/OS Aligned allocation is not supported on z/OS. This patch sets -faligned-alloc-unavailable as default in z/OS toolchain. Reviewed By: abhina.sreeskantharajan, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D87611	2020-09-16 14:49:03 -04:00
Qiu Chaofan	8ecc8520bc	[FPEnv] [Clang] Enable constrained FP support for PowerPC `d4ce862f` introduced HasStrictFP to disable generating constrained FP operations for platforms lacking support. Since work for enabling constrained FP on PowerPC is almost done, we'd like to enable it. Reviewed By: kpn, steven.zhang Differential Revision: https://reviews.llvm.org/D87223	2020-09-12 00:39:52 +08:00
Rainer Orth	76e85ae268	[clang][Sparc] Default to -mcpu=v9 for Sparc V8 on Solaris As reported in Bug 42535, `clang` doesn't inline atomic ops on 32-bit Sparc, unlike `gcc` on Solaris. In a 1-stage build with `gcc`, only two testcases are affected (currently `XFAIL`ed), while in a 2-stage build more than 100 tests `FAIL` due to this issue. The reason for this `gcc`/`clang` difference is that `gcc` on 32-bit Solaris/SPARC defaults to `-mpcu=v9` where atomic ops are supported, unlike with `clang`'s default of `-mcpu=v8`. This patch changes `clang` to use `-mcpu=v9` on 32-bit Solaris/SPARC, too. Doing so uncovered two bugs: `clang -m32 -mcpu=v9` chokes with any Solaris system headers included: /usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined" #error "Both _ILP32 and _LP64 are defined" While `clang` currently defines `__sparcv9` in a 32-bit `-mcpu=v9` compilation, neither `gcc` nor Studio `cc` do. In fact, the Studio 12.6 `cc(1)` man page clearly states: These predefinitions are valid in all modes: [...] __sparcv8 (SPARC) __sparcv9 (SPARC -m64) At the same time, the patch defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]` for a 32-bit Sparc compilation with any V9 cpu. I've also changed `MaxAtomicInlineWidth` for V9, matching what `gcc` does and the Oracle Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic Types, 3.1 Size and Alignment of Atomic C Types). The two testcases that had been `XFAIL`ed for Bug 42535 are un-`XFAIL`ed again. Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`. Differential Revision: https://reviews.llvm.org/D86621	2020-09-11 09:53:19 +02:00
Cullen Rhodes	f9091e56d3	[clang][aarch64] Drop experimental from __ARM_FEATURE_SVE_BITS macro The __ARM_FEATURE_SVE_BITS feature macro is specified in the Arm C Language Extensions (ACLE) for SVE [1] (version 00bet5). From the spec, where __ARM_FEATURE_SVE_BITS==N: When N is nonzero, indicates that the implementation is generating code for an N-bit SVE target and that the arm_sve_vector_bits(N) attribute is available. This was defined in D83550 as __ARM_FEATURE_SVE_BITS_EXPERIMENTAL and enabled under the -msve-vector-bits flag to simplify initial tests. This patch drops _EXPERIMENTAL now there is support for the feature. [1] https://developer.arm.com/documentation/100987/latest Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D86720	2020-09-03 09:39:37 +00:00
Craig Topper	71f3169e1b	[X86] Default to -mtune=generic unless -march is passed to the driver. Add TuneCPU to the AST serialization This patch defaults to -mtune=generic unless -march is present. If -march is present we'll use the empty string unless its overridden by mtune. The back should use the target cpu if the tune-cpu isn't present. It also adds AST serialization support to fix some tests that emit AST and parse it back. These tests diff the IR against the output from not going through AST. So if we don't serialize the tune CPU we fail the diff. Differential Revision: https://reviews.llvm.org/D86488	2020-08-26 14:52:03 -07:00
Abhina Sreeskantharajan	97ccf93b36	[SystemZ][z/OS] Add z/OS Target and define macros This patch adds the z/OS target and defines macros as a stepping stone towards enabling a native build on z/OS. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D85324	2020-08-25 15:51:59 -04:00
Freddy Ye	e02d081f2b	[X86] Support -march=sapphirerapids Support -march=sapphirerapids for x86. Compare with Icelake Server, it includes 14 more new features. They are amxtile, amxint8, amxbf16, avx512bf16, avx512vp2intersect, cldemote, enqcmd, movdir64b, movdiri, ptwrite, serialize, shstk, tsxldtrk, waitpkg. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86503	2020-08-25 14:21:21 +08:00
Baptiste Saleil	512e256c0d	[PowerPC] Add clang options to control MMA support This patch adds frontend and backend options to enable and disable the PowerPC MMA operations added in ISA 3.1. Instructions using these options will be added in subsequent patches. Differential Revision: https://reviews.llvm.org/D81442	2020-08-24 09:35:55 -05:00
Craig Topper	cc7bf9bcbf	[X86] Allow 32-bit mode only CPUs with -mtune on 64-bit targets gcc errors on this, but I'm nervous that since -mtune has been ignored by clang for so long that there may be code bases out there that pass 32-bit cpus to clang.	2020-08-22 16:38:05 -07:00
Craig Topper	724f570ad2	[X86] Add support 'tune' in target attribute This adds parsing and codegen support for tune in target attribute. I've implemented this so that arch in the target attribute implicitly disables tune from the command line. I'm not sure what gcc does here. But since -march implies -mtune. I assume 'arch' in the target attribute implies tune in the target attribute. Differential Revision: https://reviews.llvm.org/D86187	2020-08-19 15:58:19 -07:00
Yaxun (Sam) Liu	7546b29e76	[HIP] Support target id by --offload-arch This patch introduces support of target id by -offload-arch. Differential Revision: https://reviews.llvm.org/D60620	2020-08-18 23:43:53 -04:00
Brad Smith	d9ff48d038	WCharType and WIntType are always signed int on OpenBSD.	2020-08-18 19:59:54 -04:00
Craig Topper	2b8ad6b604	[WebAssembly] Don't depend on the flags set by handleTargetFeatures in initFeatureMap. Properly set "simd128" in the feature map when "unimplemented-simd128" is requested. initFeatureMap is used to create the feature vector used by handleTargetFeatures. There are later calls to initFeatureMap in CodeGen that were using these flags to recreate the map. But the original feature vector should be passed to those calls. So that should be enough to rebuild the map. The only issue seemed to be that simd128 was not enabled in the map by the first call to initFeatureMap. Using the SIMDLevel set by handleTargetFeatures in the later calls allowed simd128 to be set in the later versions of the map. To fix this I've added an override of setFeatureEnabled that will update the map the first time with the correct simd dependency. Differential Revision: https://reviews.llvm.org/D85806	2020-08-12 11:43:46 -07:00
Erich Keane	aa4bc1cb79	Limit Max Vector alignment on COFF targets to 8192. COFF targets have a max object alignment of 8192, so trying to create one with a larger size results in an unreachable in WinCOFFObjectWriter. For the reproducer I have uses thread local storage, however other alignments are likely affected as well. This patch sets the MaxVectorAlign for COFF to 8192. Additionally, though there is no longer a way to reproduce that I could find, it correctly sets the MaxTLSAlign for COFF to that value as well, so that if anyone comes up with a situation where this is true, it will cause an error. Differential Revision: https://reviews.llvm.org/D85543	2020-08-12 06:35:35 -07:00
Brad Smith	5fe171321c	[Sparc] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros on SPARCv9	2020-08-11 00:04:24 -04:00
Krzysztof Parzyszek	7406eb4f6a	[Hexagon] Avoid creating an empty target feature If the CPU string is empty, the target feature map may end up having an empty string inserted to it. The symptom of the problem is a warning message: '+' is not a recognized feature for this target (ignoring feature) Also, the target-features attribute in the module will have an empty string in it.	2020-08-10 10:37:24 -05:00
Brad Smith	92e82a2890	int64_t and intmax_t are always (signed) long long on OpenBSD.	2020-08-09 19:43:16 -04:00
Brad Smith	4eb4ebf76a	Hook up OpenBSD 64-bit PowerPC support	2020-08-08 17:51:19 -04:00
Simon Pilgrim	24cca30f7f	Remove unreachable return (PR47026)	2020-08-07 11:23:43 +01:00
Craig Topper	504a197fe5	[X86] Rename X86::getImpliedFeatures to X86::updateImpliedFeatures and pass clang's StringMap directly to it. No point in building a vector of StringRefs for clang to apply to the StringMap. Just pass the StringMap and modify it directly.	2020-08-06 00:20:46 -07:00
Stanislav Mekhanoshin	ea7d0e2996	[AMDGPU] gfx1031 target Differential Revision: https://reviews.llvm.org/D85337	2020-08-05 12:36:26 -07:00
Baptiste Saleil	7aaa85627b	[PowerPC] Add options to control paired vector memops support Adds frontend and backend options to enable and disable the PowerPC paired vector memory operations added in ISA 3.1. Instructions using these options will be added in subsequent patches. Differential Revision: https://reviews.llvm.org/D83722	2020-07-29 14:00:53 -05:00
Alexey Bader	8d27be8dba	[OpenCL] Add global_device and global_host address spaces This patch introduces 2 new address spaces in OpenCL: global_device and global_host which are a subset of a global address space, so the address space scheme will be looking like: ``` generic->global->host ->device ->private ->local constant ``` Justification: USM allocations may be associated with both host and device memory. We want to give users a way to tell the compiler the allocation type of a USM pointer for optimization purposes. (Link to the Unified Shared Memory extension: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/cl_intel_unified_shared_memory.asciidoc) Before this patch USM pointer could be only in opencl_global address space, hence a device backend can't tell if a particular pointer points to host or device memory. On FPGAs at least we can generate more efficient hardware code if the user tells us where the pointer can point - being able to distinguish between these types of pointers at compile time allows us to instantiate simpler load-store units to perform memory transactions. Patch by Dmitry Sidorov. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D82174	2020-07-29 17:24:53 +03:00
Jinsong Ji	d28f86723f	Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `bf544fa1c3`. Fixed the typo in PPCInstrInfo.cpp.	2020-07-28 14:00:11 +00:00
Jinsong Ji	bf544fa1c3	Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support" This reverts commit `adffce7153`. This is breaking test-suite, revert while investigation.	2020-07-27 21:07:00 +00:00
Jinsong Ji	adffce7153	[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html no one is making use of QPX/A2Q/BGQ/BGP CNK anymore. This patch remove the support of QPX/A2Q in llvm, BGQ/BGP in clang, CNK support in openmp/polly. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D83915	2020-07-27 19:24:39 +00:00
Xiangling Liao	05ad8e9429	[AIX] Implement AIX special alignment rule about double/long double Implement AIX default `power` alignment rule by adding `PreferredAlignment` and `PreferredNVAlignment` in ASTRecordLayout class. The patchh aims at returning correct value for `__alignof(x)` and `alignof(x)` under `power` alignment rules. Differential Revision: https://reviews.llvm.org/D79719	2020-07-27 15:13:03 -04:00
Ulrich Weigand	7f003957bf	[SystemZ] Implement __builtin_eh_return_data_regno Implement __builtin_eh_return_data_regno for SystemZ. Match behavior of GCC. Author: slavek-kucera Differential Revision: https://reviews.llvm.org/D84341	2020-07-24 10:28:06 +02:00
Akira Hatanaka	73bc23ff86	Fix the data layout mangling specification for 'i686-pc-macho' Use 'o' for the mangling specification instead of 'e'. This fixes an error in the backend caused by a mismatch between the data layouts generated by the backend and the frontend. rdar://problem/64168540	2020-07-21 12:58:17 -07:00
Anatoly Trosinenko	16a4350f76	[MSP430] Actualize the toolchain description Reviewed By: krisb Differential Revision: https://reviews.llvm.org/D81676	2020-07-17 15:42:12 +03:00
Cullen Rhodes	bb160e769d	[Sema][AArch64] Add parsing support for arm_sve_vector_bits attribute Summary: This patch implements parsing support for the 'arm_sve_vector_bits' type attribute, defined by the Arm C Language Extensions (ACLE, version 00bet5, section 3.7.3) for SVE [1]. The purpose of this attribute is to define fixed-length (VLST) versions of existing sizeless types (VLAT). For example: #if __ARM_FEATURE_SVE_BITS==512 typedef svint32_t fixed_svint32_t __attribute__((arm_sve_vector_bits(512))); #endif Creates a type 'fixed_svint32_t' that is a fixed-length version of 'svint32_t' that is normal-sized (rather than sizeless) and contains exactly 512 bits. Unlike 'svint32_t', this type can be used in places such as structs and arrays where sizeless types can't. Implemented in this patch is the following: * Defined and tested attribute taking single argument. * Checks the argument is an integer constant expression. * Attribute can only be attached to a single SVE vector or predicate type, excluding tuple types such as svint32x4_t. * Added the `-msve-vector-bits=<bits>` flag. When specified the `__ARM_FEATURE_SVE_BITS__EXPERIMENTAL` macro is defined. * Added a language option to store the vector size specified by the `-msve-vector-bits=<bits>` flag. This is used to validate `N == __ARM_FEATURE_SVE_BITS`, where N is the number of bits passed to the attribute and `__ARM_FEATURE_SVE_BITS` is the feature macro defined under the same flag. The `__ARM_FEATURE_SVE_BITS` macro will be made non-experimental in the final patch of the series. [1] https://developer.arm.com/documentation/100987/latest This is patch 1/4 of a patch series. Reviewers: sdesmalen, rsandifo-arm, efriedma, ctetreau, cameron.mcinally, rengolin, aaron.ballman Reviewed By: sdesmalen, aaron.ballman Differential Revision: https://reviews.llvm.org/D83550	2020-07-17 10:06:54 +00:00
Zakk Chen	294d1eae75	[RISCV] Add support for -mcpu option. Summary: 1. gcc uses `-march` and `-mtune` flag to chose arch and pipeline model, but clang does not have `-mtune` flag, we uses `-mcpu` to chose both infos. 2. Add SiFive e31 and u54 cpu which have default march and pipeline model. 3. Specific `-mcpu` with rocket-rv[32\|64] would select pipeline model only, and use the driver's arch choosing logic to get default arch. Reviewers: lenary, asb, evandro, HsiangKai Reviewed By: lenary, asb, evandro Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D71124	2020-07-16 11:46:22 -07:00
Logan Smith	2c2a297bb6	[clang][NFC] Add 'override' keyword to virtual function overrides This patch adds override to several overriding virtual functions that were missing the keyword within the clang/ directory. These were found by the new -Wsuggest-override.	2020-07-14 08:59:57 -07:00
Craig Topper	b4dbb37f32	[X86] Rename X86_CPU_TYPE_COMPAT_ALIAS/X86_CPU_TYPE_COMPAT/X86_CPU_SUBTYPE_COMPAT macros. NFC Remove _COMPAT. Drop the ARCHNAME. Remove the non-COMPAT versions that are no longer needed. We now only use these macros in places where we need compatibility with libgcc/compiler-rt. So we don't need to call out _COMPAT specifically.	2020-07-12 17:00:24 -07:00
Kevin P. Neal	d4ce862f2a	Reland "[FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support." We currently have strict floating point/constrained floating point enabled for all targets. Constrained SDAG nodes get converted to the regular ones before reaching the target layer. In theory this should be fine. However, the changes are exposed to users through multiple clang options already in use in the field, and the changes are _completely_ _untested_ on almost all of our targets. Bugs have already been found, like "https://bugs.llvm.org/show_bug.cgi?id=45274". This patch disables constrained floating point options in clang everywhere except X86 and SystemZ. A warning will be printed when this happens. Use the new -fexperimental-strict-floating-point flag to force allowing strict floating point on hosts that aren't already marked as supporting it (X86 and SystemZ). Differential Revision: https://reviews.llvm.org/D80952	2020-07-10 08:49:45 -04:00
Craig Topper	3cbfe988bc	[X86] Merge X86TargetInfo::setFeatureEnabled and X86TargetInfo::setFeatureEnabledImpl. NFC setFeatureEnabled is a virtual function. setFeatureEnabledImpl was its implementation. This split was to avoid virtual calls when we need to call setFeatureEnabled in initFeatureMap. With C++11 we can use 'final' on setFeatureEnabled to enable the compiler to perform de-virtualization for the initFeatureMap calls.	2020-07-06 23:54:56 -07:00
Craig Topper	16f3d698f2	[X86] Move the feature dependency handling in X86TargetInfo::setFeatureEnabledImpl to a table based lookup in X86TargetParser.cpp Previously we had to specify the forward and backwards feature dependencies separately which was error prone. And as dependencies have gotten more complex it was hard to be sure the transitive dependencies were handled correctly. The way it was written was also not super readable. This patch replaces everything with a table that lists what features a feature is dependent on directly. Then we can recursively walk through the table to find the transitive dependencies. This is largely based on how we handle subtarget features in the MC layer from the tablegen descriptions. Differential Revision: https://reviews.llvm.org/D83273	2020-07-06 23:14:02 -07:00
Xiang1 Zhang	939d8309db	[X86-64] Support Intel AMX Intrinsic INTEL ADVANCED MATRIX EXTENSIONS (AMX). AMX is a new programming paradigm, it has a set of 2-dimensional registers (TILES) representing sub-arrays from a larger 2-dimensional memory image and operate on TILES. These intrinsics use direct TMM register number as its params. Spec can be found in Chapter 3 here https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D83111	2020-07-07 10:13:40 +08:00
Craig Topper	c359c5d534	[X86] Centalize the 'sse4' hack to a single place in X86TargetInfo::setFeatureEnabledImpl. NFCI Instead of detecting the string in 2 places. Just swap the string to 'sse4.1' or 'sse4.2' at the top of the function. Prep work for a patch to switch the rest of this function to a table based system. And I don't want to include 'sse4a' in the table.	2020-07-06 15:00:32 -07:00
Kevin P. Neal	916e2ca997	Revert "[FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support." My mistake, I had a blocking reviewer. This reverts commit `39d2ae0afb`. This reverts commit `bfdafa32a0`. This reverts commit `2b35511350`. Differential Revision: https://reviews.llvm.org/D80952	2020-07-06 14:57:45 -04:00
Kevin P. Neal	39d2ae0afb	[FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support. We currently have strict floating point/constrained floating point enabled for all targets. Constrained SDAG nodes get converted to the regular ones before reaching the target layer. In theory this should be fine. However, the changes are exposed to users through multiple clang options already in use in the field, and the changes are _completely_ _untested_ on almost all of our targets. Bugs have already been found, like "https://bugs.llvm.org/show_bug.cgi?id=45274". This patch disables constrained floating point options in clang everywhere except X86 and SystemZ. A warning will be printed when this happens. Differential Revision: https://reviews.llvm.org/D80952	2020-07-06 13:32:49 -04:00
Kazushi (Jam) Marukawa	df3bda047d	[VE] Correct stack alignment Summary: Change stack alignment from 64 bits to 128 bits to follow ABI correctly. And add a regression test for datalayout. Reviewers: simoll, k-ishizaka Reviewed By: simoll Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #llvm, #ve, #clang Differential Revision: https://reviews.llvm.org/D83173	2020-07-06 17:25:29 +09:00
Kai Luo	68e07da3e5	[clang][PowerPC] Enable -fstack-clash-protection option for ppc64 Differential Revision: https://reviews.llvm.org/D81355	2020-07-05 03:43:56 +00:00
jasonliu	572dde55ee	[XCOFF][AIX] Use 'L..' instead of '.L' for getPrivateGlobalPrefix in DataLayout Summary: D80831 changed part of the prefix usage for AIX. But there are other places getting prefix from DataLayout. This patch intends to make prefix usage consistent on AIX. Reviewed by: hubert.reinterpretcast, daltenty Differential Revision: https://reviews.llvm.org/D81270	2020-07-03 18:25:14 +00:00
Dmitry Preobrazhensky	53422e8b4f	[AMDGPU] Added support of new inline assembler constraints Added support for constraints 'I', 'J', 'L', 'B', 'C', 'Kf', 'DA', 'DB'. See https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints. Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D81657	2020-07-03 18:01:12 +03:00
Craig Topper	3537939cda	[X86] Move frontend CPU feature initialization to a look up table based implementation. NFCI This replaces the switch statement implementation in the clang's X86.cpp with a lookup table in X86TargetParser.cpp. I've used constexpr and copy of the FeatureBitset from SubtargetFeature.h to store the features in a lookup table. After the lookup the bitset is translated into strings for use by the rest of the frontend code. I had to modify the implementation of the FeatureBitset to avoid bugs in gcc 5.5 constexpr handling. It seems to not like the same array entry to be used on the left side and right hand side of an assignment or &= or \|=. I've also used uint32_t instead of uint64_t and sized based on the X86::CPU_FEATURE_MAX. I've initialized the features for different CPUs outside of the table so that we can express inheritance in an adhoc way. This was one of the big limitations of the switch and we had resorted to labels and gotos. Differential Revision: https://reviews.llvm.org/D82731	2020-06-30 12:04:58 -07:00
Francesco Petrogalli	d54e4dded7	[sve][acle] Enable feature macros for SVE ACLE extensions. Summary: The following feature macros have been added: __ARM_FEATURE_SVE_BF16 __ARM_FEATURE_SVE_MATMUL_INT8 __ARM_FEATURE_SVE_MATMUL_FP32 __ARM_FEATURE_SVE_MATMUL_FP64 The driver has been updated to enable them accordingly to the value of the target feature passed at command line. The SVE ACLE tests using the macros have been modified to work with the target feature instead of passing the macro at command line. Reviewers: sdesmalen, efriedma, c-rhodes, kmclaughlin, SjoerdMeijer, rengolin Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D82623	2020-06-30 18:33:03 +00:00
Craig Topper	20a60f46f5	[X86] Explicitly add popcnt feature to Intel CPUs with SSE4.2 in the frontend. Previously we inferred it if sse4.2 ended up being enabled after all feature processing. But writing -march=nehalem -mno-sse4.2 should have popcnt enabled.	2020-06-28 11:06:40 -07:00
Craig Topper	9e8b5a20e9	[X86] Add MOVBE and RDRND features to BDVER4. Only 6 years behind gcc. https://gcc.gnu.org/legacy-ml/gcc-patches/2014-08/msg00231.html Found while working on improving how we define CPU features for clang and auditing for correctness.	2020-06-26 23:32:17 -07:00
Craig Topper	d298acde82	[X86] Don't disable xsave when avx is disabled. Implicitly enable xsave with avx is enabled and xsave wasn't explciitly disabled CPUs with avx always have xsave, but some CPUs without avx also have xsave. So we shouldn't disable xsave just because avx is disabled. This would prevent xsave from being enabled with -march=native on CPUs with xsave and not avx. But we also don't want -mavx -mno-avx to leave xsave eanabled. So only enable xsave if avx is enabled after processing all features. I thought about just not turning xsave on with avx at all, but there might be someone out there depending on it.	2020-06-26 16:45:44 -07:00
Simon Pilgrim	0069824fea	Revert rGf0bab7875e78e01c149d12302dcc4b6d4c43e25c - "Triple.h - reduce Twine.h include to forward declarations. NFC." This causes ICEs on the clang-ppc64be buildbots and I've limited ability to triage the problem.	2020-06-26 14:46:40 +01:00
Anatoly Trosinenko	cb56fa2196	[MSP430] Update register names When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4. The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it. That is, I cannot use `fp` register name from the C code because Clang does not accept it (exactly like GCC). But the accepted name `r4` is not recognised by `llc` (it can be used in listings passed to `llvm-mc` and even `fp` is replace to `r4` by `llvm-mc`). So I can specify any of `fp` or `r4` for the string literal of `asm(...)` but nothing in the clobber list. This patch replaces `MSP430::FP` with `MSP430::R4` in the backend code (even [MSP430 EABI](http://www.ti.com/lit/an/slaa534/slaa534.pdf) doesn't mention FP as a register name). The R0 - R3 registers, on the other hand, are left as is in the backend code (after all, they have some special meaning on the ISA level). It is just ensured clang is renaming them as expected by the downstream tools. There is probably not much sense in marking them clobbered but rename them //just in case// for use at potentially different contexts. Differential Revision: https://reviews.llvm.org/D82184	2020-06-26 15:32:07 +03:00
Simon Pilgrim	f0bab7875e	Triple.h - reduce Twine.h include to forward declarations. NFC. Move include down to a number of other files that had an implicit dependency on the Twine class.	2020-06-26 13:06:57 +01:00
Craig Topper	12665f2812	[X86] Make XSAVEC/XSAVEOPT/XSAVES properly depend on XSAVE in both the frontend and the backend. These features implicitly enabled XSAVE in the frontend, but not the backend. Disabling XSAVE in the frontend disabled XSAVEOPT, but not the other 2. Nothing happened in the backend.	2020-06-26 00:14:58 -07:00
Craig Topper	a7db230d75	[X86] Add CMPXCHG16B feature to amdfam10 in the frontend. We already have this feature on it in the backend.	2020-06-25 22:55:36 -07:00
Craig Topper	6673d69226	[X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature. The PREFETCHW instruction was originally part of the 3DNow. But it was given its own CPUID bit on later CPUs just before 3DNow was deprecated. We were setting the -mprfchw flag if -m3dnow was passed or the CPU supported 3dnow unless -mno-prfchw was passed. But -march=native on a CPU without the PRFCHW CPUID bit set will pass -mno-prfchw. So -march=k8 will behave differently than -march=native on a K8 for example. So remove this implicit setting from the frontend and instead enable the backend to use PREFETCHW if 3dnow OR prfchw is enabled. Also enable PRFCHW flag on amdfam10/barcelona which seems to be where this CPUID bit was introduced. That CPU also supported 3dnow.	2020-06-25 12:46:52 -07:00
Craig Topper	01c18f9199	Revert "[X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature." This is failing on the bots. This reverts commit `636d31a5c3`.	2020-06-25 11:43:02 -07:00
Craig Topper	636d31a5c3	[X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature. The PREFETCHW instruction was originally part of the 3DNow. But it was given its own CPUID bit on later CPUs just before 3DNow was deprecated. We were setting the -mprfchw flag if -m3dnow was passed or the CPU supported 3dnow unless -mno-prfchw was passed. But -march=native on a CPU without the PRFCHW CPUID bit set will pass -mno-prfchw. So -march=k8 will behave differently than -march=native on a K8 for example. So remove this implicit setting from the frontend and instead enable the backend to use PREFETCHW if 3dnow OR prfchw is enabled. Also enable PRFCHW flag on amdfam10/barcelona which seems to be where this CPUID bit was introduced. That CPU also supported 3dnow.	2020-06-25 11:25:35 -07:00
Sander de Smalen	fabe67728e	[AArch64][SVE] Enable __ARM_FEATURE_SVE macros. This patch enables the following macros when their corresponding target attributes are set: __ARM_FEATURE_SVE (+sve) __ARM_FEATURE_SVE2 (+sve2) __ARM_FEATURE_SVE2_AES (+sve2-aes) __ARM_FEATURE_SVE2_BITPERM (+sve2-bitperm) __ARM_FEATURE_SVE2_SHA3 (+sve2-sha3) __ARM_FEATURE_SVE2_SM4 (+sve2-sm4) This implies that the base SVE and SVE2 ACLE (00bet2) are now feature complete, meaning that all intrinsics are implemented in LLVM and Clang. Disclaimer: To implement the ACLE we have had to fix up many parts of LLVM to make it support scalable vectors. We have also used many target-specific intrinsics to reduce reliance on parts of LLVM where we know scalable vectors may not yet be handled properly (e.g. some transformation might drop the 'scalable' flag on a vector type). While we've done a best effort with the limited testing that is available to us, we're still working to improve the stability of the implementation. Additionally, Clang may print warnings that code may have miscompiled. We find this often to be a false alarm where the wrong interfaces have been used in LLVM and where resulting code is not actually incorrect. However, this warrants a bug report and investigation. If you find any bugs or issues, please raise them on bugs.llvm.org and let us know! Reviewers: rengolin, efriedma, david-arm, SjoerdMeijer Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D81725	2020-06-25 08:14:19 +01:00
Craig Topper	8dc92142e3	[X86] Replace PROC macros with an enum and a lookup table of processor information. This patch removes the PROC macro in favor of CPUKind enum and a table that contains information about CPUs. The current information in the table is the CPU name, CPUKind enum value, key feature for target multiversioning, and Is64Bit capable. For the strings that are aliases, I've duplicated the information in the table. This means there are more rows in the table than CPUKind enums. This replaces multiple StringSwitch's with loops through the table. They are linear searches due to the table being more logically ordered than alphabetical. The StringSwitch's would have also been linear. I've used StringLiteral on the strings in the table so we can quickly check the length while searching. I contemplated having a CPUKind for each string so there was a 1:1 mapping, but didn't want to spread more names to the places that use the enum. My ultimate goal here is to store the features for each CPU as a bitset within the table. Hoping to use constexpr to make this composable so we can group features and inherit them. After the table lookup we can turn the bitset into a list of strings for the frontend. The current switch we have for selecting features for CPUs has become difficult to maintain while trying to express inheritance relationships. Differential Revision: https://reviews.llvm.org/D82414	2020-06-24 10:46:25 -07:00
Kazushi (Jam) Marukawa	96d4ccf00c	[VE] Clang toolchain for VE Summary: This patch enables compilation of C code for the VE target with Clang. Differential Revision: https://reviews.llvm.org/D79411	2020-06-24 10:12:09 +02:00
Sam Clegg	5804a8b122	[WebAssebmly] Fully disable 'protected' visibility Emscripten doesn't use protected visibility either. Differential Revision: https://reviews.llvm.org/D82346	2020-06-23 17:50:05 -07:00
Craig Topper	0dfc8e1837	[X86] Remove encoding value from the X86_FEATURE and X86_FEATURE_COMPAT macro. NFCI This was orignally done so we could separate the compatibility values and the llvm internal only features into a separate entries in the feature array. This was needed when we explicitly had to convert the feature into the proper 32-bit chunk at every reference and we didn't want things moving around. Now everything is in an array and we have helper funtions or macros to convert encoding to index. So we renumbering is no longer an issue.	2020-06-22 11:46:21 -07:00
Anton Korobeynikov	6cb80fbe40	Revert "[MSP430] Update register names" This reverts commit `8f6620f663`.	2020-06-22 13:37:22 +03:00
Anatoly Trosinenko	8f6620f663	[MSP430] Update register names When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4. The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it. Differential Revision: https://reviews.llvm.org/D82184	2020-06-22 13:24:03 +03:00
Eric Christopher	1f593f46f3	[AST/Lex/Parse/Sema] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 01:15:32 -07:00
Brad Smith	0f92096c0a	Revert "Hook up OpenBSD 64-bit PowerPC support"	2020-06-18 20:05:39 -04:00
Brad Smith	3008609d45	Hook up OpenBSD 64-bit PowerPC support	2020-06-18 19:19:45 -04:00
Ahsan Saghir	37e72f47a4	[PowerPC] Add -m[no-]power10-vector clang and llvm option Summary: This patch adds command line option for enabling power10-vector support. Reviewers: hfinkel, nemanjai, lei, amyk, #powerpc Reviewed By: lei, amyk, #powerpc Subscribers: wuzish, kbarton, hiraditya, shchenz, cfe-commits, llvm-commits Tags: #llvm, #clang, #powerpc Differential Revision: https://reviews.llvm.org/D80758	2020-06-16 14:47:35 -05:00
Michael Liao	e830fa260d	[clang][amdgpu] Prefer not using `fp16` conversion intrinsics. Reviewers: yaxunl, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, kerbowa, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81849	2020-06-16 10:21:56 -04:00
Stanislav Mekhanoshin	9ee272f13d	[AMDGPU] Add gfx1030 target Differential Revision: https://reviews.llvm.org/D81886	2020-06-15 16:18:05 -07:00

1 2 3 4 5 ...

629 Commits