llvm-project

Commit Graph

Author	SHA1	Message	Date
Freddy Ye	e02d081f2b	[X86] Support -march=sapphirerapids Support -march=sapphirerapids for x86. Compare with Icelake Server, it includes 14 more new features. They are amxtile, amxint8, amxbf16, avx512bf16, avx512vp2intersect, cldemote, enqcmd, movdir64b, movdiri, ptwrite, serialize, shstk, tsxldtrk, waitpkg. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86503	2020-08-25 14:21:21 +08:00
Craig Topper	cc7bf9bcbf	[X86] Allow 32-bit mode only CPUs with -mtune on 64-bit targets gcc errors on this, but I'm nervous that since -mtune has been ignored by clang for so long that there may be code bases out there that pass 32-bit cpus to clang.	2020-08-22 16:38:05 -07:00
Craig Topper	504a197fe5	[X86] Rename X86::getImpliedFeatures to X86::updateImpliedFeatures and pass clang's StringMap directly to it. No point in building a vector of StringRefs for clang to apply to the StringMap. Just pass the StringMap and modify it directly.	2020-08-06 00:20:46 -07:00
Craig Topper	b4dbb37f32	[X86] Rename X86_CPU_TYPE_COMPAT_ALIAS/X86_CPU_TYPE_COMPAT/X86_CPU_SUBTYPE_COMPAT macros. NFC Remove _COMPAT. Drop the ARCHNAME. Remove the non-COMPAT versions that are no longer needed. We now only use these macros in places where we need compatibility with libgcc/compiler-rt. So we don't need to call out _COMPAT specifically.	2020-07-12 17:00:24 -07:00
Craig Topper	3cbfe988bc	[X86] Merge X86TargetInfo::setFeatureEnabled and X86TargetInfo::setFeatureEnabledImpl. NFC setFeatureEnabled is a virtual function. setFeatureEnabledImpl was its implementation. This split was to avoid virtual calls when we need to call setFeatureEnabled in initFeatureMap. With C++11 we can use 'final' on setFeatureEnabled to enable the compiler to perform de-virtualization for the initFeatureMap calls.	2020-07-06 23:54:56 -07:00
Craig Topper	16f3d698f2	[X86] Move the feature dependency handling in X86TargetInfo::setFeatureEnabledImpl to a table based lookup in X86TargetParser.cpp Previously we had to specify the forward and backwards feature dependencies separately which was error prone. And as dependencies have gotten more complex it was hard to be sure the transitive dependencies were handled correctly. The way it was written was also not super readable. This patch replaces everything with a table that lists what features a feature is dependent on directly. Then we can recursively walk through the table to find the transitive dependencies. This is largely based on how we handle subtarget features in the MC layer from the tablegen descriptions. Differential Revision: https://reviews.llvm.org/D83273	2020-07-06 23:14:02 -07:00
Xiang1 Zhang	939d8309db	[X86-64] Support Intel AMX Intrinsic INTEL ADVANCED MATRIX EXTENSIONS (AMX). AMX is a new programming paradigm, it has a set of 2-dimensional registers (TILES) representing sub-arrays from a larger 2-dimensional memory image and operate on TILES. These intrinsics use direct TMM register number as its params. Spec can be found in Chapter 3 here https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D83111	2020-07-07 10:13:40 +08:00
Craig Topper	c359c5d534	[X86] Centalize the 'sse4' hack to a single place in X86TargetInfo::setFeatureEnabledImpl. NFCI Instead of detecting the string in 2 places. Just swap the string to 'sse4.1' or 'sse4.2' at the top of the function. Prep work for a patch to switch the rest of this function to a table based system. And I don't want to include 'sse4a' in the table.	2020-07-06 15:00:32 -07:00
Craig Topper	3537939cda	[X86] Move frontend CPU feature initialization to a look up table based implementation. NFCI This replaces the switch statement implementation in the clang's X86.cpp with a lookup table in X86TargetParser.cpp. I've used constexpr and copy of the FeatureBitset from SubtargetFeature.h to store the features in a lookup table. After the lookup the bitset is translated into strings for use by the rest of the frontend code. I had to modify the implementation of the FeatureBitset to avoid bugs in gcc 5.5 constexpr handling. It seems to not like the same array entry to be used on the left side and right hand side of an assignment or &= or \|=. I've also used uint32_t instead of uint64_t and sized based on the X86::CPU_FEATURE_MAX. I've initialized the features for different CPUs outside of the table so that we can express inheritance in an adhoc way. This was one of the big limitations of the switch and we had resorted to labels and gotos. Differential Revision: https://reviews.llvm.org/D82731	2020-06-30 12:04:58 -07:00
Craig Topper	20a60f46f5	[X86] Explicitly add popcnt feature to Intel CPUs with SSE4.2 in the frontend. Previously we inferred it if sse4.2 ended up being enabled after all feature processing. But writing -march=nehalem -mno-sse4.2 should have popcnt enabled.	2020-06-28 11:06:40 -07:00
Craig Topper	9e8b5a20e9	[X86] Add MOVBE and RDRND features to BDVER4. Only 6 years behind gcc. https://gcc.gnu.org/legacy-ml/gcc-patches/2014-08/msg00231.html Found while working on improving how we define CPU features for clang and auditing for correctness.	2020-06-26 23:32:17 -07:00
Craig Topper	d298acde82	[X86] Don't disable xsave when avx is disabled. Implicitly enable xsave with avx is enabled and xsave wasn't explciitly disabled CPUs with avx always have xsave, but some CPUs without avx also have xsave. So we shouldn't disable xsave just because avx is disabled. This would prevent xsave from being enabled with -march=native on CPUs with xsave and not avx. But we also don't want -mavx -mno-avx to leave xsave eanabled. So only enable xsave if avx is enabled after processing all features. I thought about just not turning xsave on with avx at all, but there might be someone out there depending on it.	2020-06-26 16:45:44 -07:00
Craig Topper	12665f2812	[X86] Make XSAVEC/XSAVEOPT/XSAVES properly depend on XSAVE in both the frontend and the backend. These features implicitly enabled XSAVE in the frontend, but not the backend. Disabling XSAVE in the frontend disabled XSAVEOPT, but not the other 2. Nothing happened in the backend.	2020-06-26 00:14:58 -07:00
Craig Topper	a7db230d75	[X86] Add CMPXCHG16B feature to amdfam10 in the frontend. We already have this feature on it in the backend.	2020-06-25 22:55:36 -07:00
Craig Topper	6673d69226	[X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature. The PREFETCHW instruction was originally part of the 3DNow. But it was given its own CPUID bit on later CPUs just before 3DNow was deprecated. We were setting the -mprfchw flag if -m3dnow was passed or the CPU supported 3dnow unless -mno-prfchw was passed. But -march=native on a CPU without the PRFCHW CPUID bit set will pass -mno-prfchw. So -march=k8 will behave differently than -march=native on a K8 for example. So remove this implicit setting from the frontend and instead enable the backend to use PREFETCHW if 3dnow OR prfchw is enabled. Also enable PRFCHW flag on amdfam10/barcelona which seems to be where this CPUID bit was introduced. That CPU also supported 3dnow.	2020-06-25 12:46:52 -07:00
Craig Topper	01c18f9199	Revert "[X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature." This is failing on the bots. This reverts commit `636d31a5c3`.	2020-06-25 11:43:02 -07:00
Craig Topper	636d31a5c3	[X86] Don't imply -mprfchw when -m3dnow is specified. Enable prefetchw in the backend with 3dnow feature. The PREFETCHW instruction was originally part of the 3DNow. But it was given its own CPUID bit on later CPUs just before 3DNow was deprecated. We were setting the -mprfchw flag if -m3dnow was passed or the CPU supported 3dnow unless -mno-prfchw was passed. But -march=native on a CPU without the PRFCHW CPUID bit set will pass -mno-prfchw. So -march=k8 will behave differently than -march=native on a K8 for example. So remove this implicit setting from the frontend and instead enable the backend to use PREFETCHW if 3dnow OR prfchw is enabled. Also enable PRFCHW flag on amdfam10/barcelona which seems to be where this CPUID bit was introduced. That CPU also supported 3dnow.	2020-06-25 11:25:35 -07:00
Craig Topper	8dc92142e3	[X86] Replace PROC macros with an enum and a lookup table of processor information. This patch removes the PROC macro in favor of CPUKind enum and a table that contains information about CPUs. The current information in the table is the CPU name, CPUKind enum value, key feature for target multiversioning, and Is64Bit capable. For the strings that are aliases, I've duplicated the information in the table. This means there are more rows in the table than CPUKind enums. This replaces multiple StringSwitch's with loops through the table. They are linear searches due to the table being more logically ordered than alphabetical. The StringSwitch's would have also been linear. I've used StringLiteral on the strings in the table so we can quickly check the length while searching. I contemplated having a CPUKind for each string so there was a 1:1 mapping, but didn't want to spread more names to the places that use the enum. My ultimate goal here is to store the features for each CPU as a bitset within the table. Hoping to use constexpr to make this composable so we can group features and inherit them. After the table lookup we can turn the bitset into a list of strings for the frontend. The current switch we have for selecting features for CPUs has become difficult to maintain while trying to express inheritance relationships. Differential Revision: https://reviews.llvm.org/D82414	2020-06-24 10:46:25 -07:00
Craig Topper	0dfc8e1837	[X86] Remove encoding value from the X86_FEATURE and X86_FEATURE_COMPAT macro. NFCI This was orignally done so we could separate the compatibility values and the llvm internal only features into a separate entries in the feature array. This was needed when we explicitly had to convert the feature into the proper 32-bit chunk at every reference and we didn't want things moving around. Now everything is in an array and we have helper funtions or macros to convert encoding to index. So we renumbering is no longer an issue.	2020-06-22 11:46:21 -07:00
Craig Topper	ed34140e11	[X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC	2020-06-10 22:06:34 -07:00
Craig Topper	d5c28c4094	[X86] Move CPUKind enum from clang to llvm/lib/Support. NFCI Similar to what some other targets have done. This information could be reused by other frontends so doesn't make sense to live in clang. -Rename CK_Generic to CK_None to better reflect its illegalness. -Move function for translating from string to enum into llvm. -Call checkCPUKind directly from the string to enum translation and update CPU kind to CK_None accordinly. Caller will use CK_None as sentinel for bad CPU. I'm planning to move all the CPU to feature mapping out next. As part of that I want to devise a better way to express CPUs inheriting features from an earlier CPU. Allowing this to be expressed in a less rigid way than just falling through a switch. Or using gotos as we've had to do lately. Differential Revision: https://reviews.llvm.org/D81439	2020-06-09 12:52:41 -07:00
Craig Topper	dd863ccae1	[X86] Separate X86_CPU_TYPE_COMPAT_WITH_ALIAS from X86_CPU_TYPE_COMPAT. NFC Add a separate X86_CPU_TYPE_COMPAT_ALIAS that carries alias string and the enum from X86_CPU_TYPE_COMPAT.	2020-06-03 14:13:12 -07:00
Craig Topper	bb1d8bf270	[X86] Add CLWB to Tremont CPU. Remove CLDEMOTE, MOVDIRI, MOVDIR64B, and WAITPKG to match gcc.	2020-06-02 22:38:51 -07:00
Craig Topper	16c800b8b7	[X86] Remove support for Y0 constraint as an alias for Yz in inline assembly. Neither gcc or icc support this. Split out from D79472. I want to remove more, but it looks like icc does support some things gcc doesn't and I need to double check our internal test suites.	2020-05-06 14:58:53 -07:00
Craig Topper	9bb9ff0957	[X86] Remove incomplete support for 'Y' has an inline assembly constraint by itself. Y is the start of several 2 letter constraints, but we also had partial support to recognize it by itself. But it doesn't look like it can get through clang as a single letter so the backend support for this was effectively dead.	2020-05-06 14:23:04 -07:00
Craig Topper	0fac1c1912	[X86] Allow Yz inline assembly constraint to choose ymm0 or zmm0 when avx/avx512 are enabled and type is 256 or 512 bits gcc supports selecting ymm0/zmm0 for the Yz constraint when used with 256 or 512 bit vector types. Fixes PR45806 Differential Revision: https://reviews.llvm.org/D79448	2020-05-05 21:12:30 -07:00
WangTianQing	a3dc949000	[X86] Add TSXLDTRK instructions. Summary: For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77205	2020-04-09 13:17:29 +08:00
WangTianQing	d08fadd662	[X86] Add SERIALIZE instruction. Summary: For more details about this instruction, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77193	2020-04-02 16:19:23 +08:00
David Blaikie	819e540208	Use llvm_unreachable after a fully covered/always-returning switch	2020-03-26 20:09:57 -07:00
Michael Liao	d264f02c6f	Fix `-Wreturn-type` warning. NFC.	2020-03-26 00:53:24 -04:00
zoecarver	b915aec6b5	Add method to TargetInfo to get CPU cache line size Summary: This patch adds a virtual method `getCPUCacheLineSize()` to `TargetInfo`. Currently, I've only implemented the method in `X86TargetInfo`. It's extremely important that each CPU's cache line size correct (e.g., we can't just define it as `64` across the board) so, it has been a little slow getting to this point. I'll work on the ARM CPUs next, but that will probably come later in a different patch. Tags: #clang Differential Revision: https://reviews.llvm.org/D74918	2020-03-25 09:50:38 -07:00
Roland McGrath	271f964773	[Preprocessor][X86] Fix __code_model___ predefine macros GCC defines __code_model___ (two trailing underscores), not __code_model_*_ (one trailing underscore). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D75003	2020-02-21 23:30:07 -08:00
Craig Topper	d35bcbbb5d	[Sema][X86] Consider target attribute into the checks in validateOutputSize and validateInputSize. The validateOutputSize and validateInputSize need to check whether AVX or AVX512 are enabled. But this can be affected by the target attribute so we need to factor that in. This patch moves some of the code from CodeGen to create an appropriate feature map that we can pass to the function. Differential Revision: https://reviews.llvm.org/D68627	2019-12-23 11:23:30 -08:00
Reid Kleckner	eff08f4097	Revert "[Sema][X86] Consider target attribute into the checks in validateOutputSize and validateInputSize." This reverts commit `e1578fd2b7`. It introduces a dependency on Attr.h which I am removing from ASTContext.h.	2019-12-06 15:42:14 -08:00
Craig Topper	e1578fd2b7	[Sema][X86] Consider target attribute into the checks in validateOutputSize and validateInputSize. The validateOutputSize and validateInputSize need to check whether AVX or AVX512 are enabled. But this can be affected by the target attribute so we need to factor that in. This patch copies some of the code from CodeGen to create an appropriate feature map that we can pass to the function. Probably need some refactoring here to share more code with Codegen. Is there a good place to do that? Also need to support the cpu_specific attribute as well. Differential Revision: https://reviews.llvm.org/D68627	2019-12-06 15:30:59 -08:00
Craig Topper	a8ccb48f69	[X86] Add 'fxsr' feature to -march=pentium2 to match X86.td and gcc.	2019-11-06 10:27:53 -08:00
Craig Topper	ba73aad4f6	[X86] Add 'mmx' to all CPUs that have a version of 'sse' and weren't already enabling '3dnow' All SSE capable CPUs have MMX. 3dnow implicitly enables MMX. We have code that detects if sse is enabled and implicitly enables MMX unless -mno-mmx is passed. So in most cases we were already enabling MMX if march passed a CPU that supported SSE. The exception to this is if you pass -march for a cpu supports SSE and also pass -mno-sse. We should still enable MMX since its part of the CPU capability.	2019-11-06 10:02:40 -08:00
Craig Topper	5a43fdd313	[X86] Remove what little support we had for MPX -Deprecate -mmpx and -mno-mpx command line options -Remove CPUID detection of mpx for -march=native -Remove MPX from all CPUs -Remove MPX preprocessor define I've left the "mpx" string in the backend so we don't fail on old IR, but its not connected to anything. gcc has also deprecated these command line options. https://www.phoronix.com/scan.php?page=news_item&px=GCC-Patch-To-Drop-MPX Differential Revision: https://reviews.llvm.org/D66669 llvm-svn: 370393	2019-08-29 18:09:02 +00:00
Pengfei Wang	e28cbbd5d4	[X86] Support -march=tigerlake Support -march=tigerlake for x86. Compare with Icelake Client, It include 4 more new features ,they are avx512vp2intersect, movdiri, movdir64b, shstk. Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D65840 llvm-svn: 368543	2019-08-12 01:29:46 +00:00
Sjoerd Meijer	a19f5a76e6	Test commit. NFC. Removed 2 trailing whitespaces in 2 files that used to be in different repos to test my new github monorepo workflow. llvm-svn: 366904	2019-07-24 13:30:36 +00:00
JF Bastien	fff5dc0b17	Support __seg_fs and __seg_gs on x86 Summary: GCC supports named address spaces macros: https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html clang does as well with address spaces: https://clang.llvm.org/docs/LanguageExtensions.html#memory-references-to-specified-segments Add the __seg_fs and __seg_gs macros for compatibility with GCC. <rdar://problem/52944935> Subscribers: jkorous, dexonsmith, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D64676 llvm-svn: 366028	2019-07-14 18:33:51 +00:00
Craig Topper	8e1f3a0538	[X86] Attempt to make the Intel core CPU inheritance a little more readable and maintainable The recently added cooperlake CPU has made our already ugly switch statement even worse. There's a CPU exclusion list around the bf16 feature in the cooper lake block. I worry that we'll have to keep adding new CPUs to that until bf16 intercepts a client space CPU. We have several other exclusion lists in other parts of the switch due to skylakeserver, cascadelake, and cooperlake not having sgx. Another for cannonlake not having clwb but having all other features from skx. This removes all these special ifs at the cost of some duplication of features and a goto. I've copied all of the skx features into either cannonlake or icelakeclient(for clwb). And pulled sklyakeserver, cascadelake, and cooperlake out of the main inheritance chain into their own chain. At the end of skylakeserver we merge back into the main chain at skylakeclient but below sgx. I think this is at least easier to follow. Differential Revision: https://reviews.llvm.org/D63018 llvm-svn: 362965	2019-06-10 16:59:28 +00:00
Pengfei Wang	30bcda86db	[X86] -march=cooperlake (clang) Support intel -march=cooperlake in clang Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62835 llvm-svn: 362781	2019-06-07 08:53:37 +00:00
Pengfei Wang	3a29f7c99c	[X86] Add ENQCMD instructions For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Patch by Tianqing Wang (tianqing) Differential Revision: https://reviews.llvm.org/D62282 llvm-svn: 362685	2019-06-06 08:28:42 +00:00
Pengfei Wang	cc3629d545	[X86] Add VP2INTERSECT instructions Support intel AVX512 VP2INTERSECT instructions in clang Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D62367 llvm-svn: 362196	2019-05-31 06:09:35 +00:00
Craig Topper	a85c0fd918	[X86] Split multi-line chained assignments into single lines to avoid making clang-format create triangle shaped indentation. Simplify one if statement to remove a bunch of string matches. NFCI We had an if statement that checked over every avx512* feature to see if it should enabled avx512f. Since they are all prefixed with avx512 just check for that instead. llvm-svn: 361557	2019-05-23 21:34:36 +00:00
Craig Topper	20040db9a6	[X86] Stop implicitly enabling avx512vl when avx512bf16 is enabled. Previously we were doing this so that the 256 bit selectw builtin could be used in the implementation of the 512->256 bit conversion intrinsic. After this commit we now use a masked convert builtin that will emit the intrinsic call and the 256-bit select from custom code in CGBuiltin. Then the header only needs to call that one intrinsic. llvm-svn: 360924	2019-05-16 18:28:17 +00:00
Luo, Yuanke	844f662932	Enable intrinsics of AVX512_BF16, which are supported for BFLOAT16 in Cooper Lake Summary: 1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake; 2. Enable intrinsics for VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision. For more details about BF16 intrinsic, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Patch by LiuTianle Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, spatel, RKSimon Reviewed By: craig.topper Subscribers: mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60552 llvm-svn: 360018	2019-05-06 08:25:11 +00:00
Fangrui Song	75e74e077c	Range-style std::find{,_if} -> llvm::find{,_if}. NFC llvm-svn: 357359	2019-03-31 08:48:19 +00:00
Craig Topper	7339e61b89	[X86] Correct the value of MaxAtomicInlineWidth for pre-586 cpus Use the new cx8 feature flag that was added to the backend to represent support for cmpxchg8b. Use this flag to set the MaxAtomicInlineWidth. This also assumes all the cmpxchg instructions are enabled for CK_Generic which is what cc1 defaults to when nothing is specified. Differential Revision: https://reviews.llvm.org/D59566 llvm-svn: 356709	2019-03-21 20:36:08 +00:00

1 2 3

131 Commits