llvm-project


Author	SHA1	Message	Date
Matt Arsenault	a3c814d234	Separately track input and output denormal mode AMDGPU and x86 at least both have separate controls for whether denormal results are flushed on output, and for whether denormals are implicitly treated as 0 as an input. The current DAGCombiner use only really cares about the input treatment of denormals.	2020-02-04 12:59:21 -05:00
Konstantin Pyzhov	6d614a82a4	Summary: This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions. OpenCL tests for new built-ins are included. Differential Revision: https://reviews.llvm.org/D72723	2020-01-28 03:51:27 -05:00
Matt Arsenault	a4451d88ee	Consolidate internal denormal flushing controls Currently there are 4 different mechanisms for controlling denormal flushing behavior, and about as many equivalent frontend controls. - AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features - NVPTX uses the nvptx-f32ftz attribute - ARM directly uses the denormal-fp-math attribute - Other targets indirectly use denormal-fp-math in one DAGCombine - cl-denorms-are-zero has a corresponding denorms-are-zero attribute AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name). Work on consolidating these into the denormal-fp-math attribute, and a new type specific denormal-fp-math-f32 variant. Only ARM seems to support the two different flush modes, so this is overkill for the other use cases. Ideally we would error on the unsupported positive-zero mode on other targets from somewhere. Move the logic for selecting the flush mode into the compiler driver, instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32 are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as a user flag. -cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and -fno-cuda-flush-denormals-to-zero will be mapped to -fp-denormal-math-f32=ieee or preserve-sign rather than the old attributes. Stop emitting the denorms-are-zero attribute for the OpenCL flag. It has no in-tree users. The meaning would also be target dependent, such as the AMDGPU choice to treat this as only meaning allow flushing of f32 and not f16 or f64. The naming is also potentially confusing, since DAZ in other contexts refers to instructions implicitly treating input denormals as zero, not necessarily flushing output denormals to zero. This also does not attempt to change the behavior for the current attribute. The LangRef now states that the default is ieee behavior, but this is inaccurate for the current implementation. The clang handling is slightly hacky to avoid touching the existing denormal-fp-math uses. Fixing this will be left for a future patch. AMDGPU is still using the subtarget feature to control the denormal mode, but the new attributes are now emitted. A future change will switch this and remove the subtarget features.	2020-01-17 20:09:53 -05:00
Amy Huang	a85f5efd95	Add support for the MS qualifiers __ptr32, __ptr64, __sptr, __uptr. Summary: This adds parsing of the qualifiers __ptr32, __ptr64, __sptr, and __uptr and lowers them to the corresponding address space pointer for 32-bit and 64-bit pointers. (32/64-bit pointers added in https://reviews.llvm.org/D69639) A large part of this patch is making these pointers ignore the address space when doing things like overloading and casting. https://bugs.llvm.org/show_bug.cgi?id=42359 Reviewers: rnk, rsmith Subscribers: jholewinski, jvesely, nhaehnle, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D71039	2019-12-18 10:41:12 -08:00
Bjorn Pettersson	78424e5f84	Prune include of DataLayout.h from include/clang/Basic/TargetInfo.h. NFC Summary: Use a forward declaration of DataLayout instead of including DataLayout.h in clang's TargetInfo.h. This reduces include dependencies toward DataLayout.h (and other headers such as DerivedTypes.h, Type.h that are included by DataLayout.h). Needed to move implementation of TargetInfo::resetDataLayout from TargetInfo.h to TargetInfo.cpp. Reviewers: rnk Reviewed By: rnk Subscribers: jvesely, nhaehnle, cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69262 llvm-svn: 375438	2019-10-21 17:58:14 +00:00
Matt Arsenault	281f2e2c37	AMDGPU: Add builtins for is_shared/is_private llvm-svn: 371010	2019-09-05 03:00:43 +00:00
Stanislav Mekhanoshin	c17705b7fb	[AMDGPU] Do not assume a default GCN target Differential Revision: https://reviews.llvm.org/D66246 llvm-svn: 368917	2019-08-14 20:55:15 +00:00
Stanislav Mekhanoshin	0cfd75a07d	[AMDGPU] gfx908 clang target Differential Revision: https://reviews.llvm.org/D64430 llvm-svn: 365528	2019-07-09 18:19:00 +00:00
Matt Arsenault	fc84925208	AMDGPU: Fix target builtins for gfx10 This wasn't setting some of the features from older generations. llvm-svn: 364123	2019-06-22 01:30:00 +00:00
Stanislav Mekhanoshin	cafccd7a53	[AMDGPU] gfx1011/gfx1012 clang support Differential Revision: https://reviews.llvm.org/D63308 llvm-svn: 363345	2019-06-14 00:33:59 +00:00
Stanislav Mekhanoshin	91792f1b93	[AMDGPU] gfx1010 clang target Differential Revision: https://reviews.llvm.org/D61875 llvm-svn: 360634	2019-05-13 23:15:59 +00:00
Yaxun Liu	4469701207	AMDGPU: Enable _Float16 llvm-svn: 359594	2019-04-30 18:35:37 +00:00
Stanislav Mekhanoshin	1d9f286ecb	[AMDGPU] rename vi-insts into gfx8-insts Differential Revision: https://reviews.llvm.org/D60293 llvm-svn: 357792	2019-04-05 18:25:00 +00:00
Michael Liao	3c2aadbe67	[AMDGPU] Add the missing clang change of the experimental buffer fat pointer llvm-svn: 356385	2019-03-18 18:11:37 +00:00
Stanislav Mekhanoshin	1607a37308	[AMDGPU] Split dot-insts feature Differential Revision: https://reviews.llvm.org/D57972 llvm-svn: 353588	2019-02-09 00:34:41 +00:00
Yaxun Liu	277e064bf5	Do not copy long double and 128-bit fp format from aux target for AMDGPU rC352620 caused regressions because it copied floating point format from aux target. floating point format decides whether extended long double is supported. It is x86_fp80 on x86 but IEEE double on amdgcn. Document usage of long double type in HIP programming guide https://github.com/ROCm-Developer-Tools/HIP/pull/890 Differential Revision: https://reviews.llvm.org/D57527 llvm-svn: 352801	2019-01-31 21:57:51 +00:00
Yaxun Liu	95f2ca541f	[HIP] Fix size_t for MSVC environment In 64 bit MSVC environment size_t is defined as unsigned long long. In single source language like HIP, data layout should be consistent in device and host compilation, therefore copy data layout controlling fields from Aux target for AMDGPU target. Differential Revision: https://reviews.llvm.org/D56318 llvm-svn: 352620	2019-01-30 12:26:54 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Stanislav Mekhanoshin	6332f4d0d4	[AMDGPU] Separate feature dot-insts Differential Revision: https://reviews.llvm.org/D56525 llvm-svn: 350794	2019-01-10 03:25:47 +00:00
Richard Trieu	6368818fd5	Move CodeGenOptions from Frontend to Basic Basic uses CodeGenOptions and should not depend on Frontend. llvm-svn: 348827	2018-12-11 03:18:39 +00:00
Konstantin Zhuravlyov	06570954e2	AMDGPU: Handle gfx909 in AMDGPUTargetInfo::initFeatureMap + add required tests llvm-svn: 345181	2018-10-24 19:07:56 +00:00
Matt Arsenault	b666e73dd9	AMDGPU: Move target code into TargetParser llvm-svn: 340292	2018-08-21 16:13:29 +00:00
Matt Arsenault	45bc148093	AMDGPU: Fix enabling denormals by default on pre-VI targets Fast FMAF is not a sufficient condition to enable denormals. Before VI, enabling denormals caused F32 instructions to run at F64 speeds. llvm-svn: 339278	2018-08-08 17:48:37 +00:00
Matt Arsenault	31c895ecdf	AMDGPU: Add builtin for s_dcache_wb llvm-svn: 339110	2018-08-07 07:49:13 +00:00
Matt Arsenault	24f3924709	AMDGPU: Add builtin for s_dcache_inv_vol llvm-svn: 339109	2018-08-07 07:49:04 +00:00
Matt Arsenault	d2da3c20d7	AMDGPU: Add Vega12 and Vega20 Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331216	2018-04-30 19:08:27 +00:00
Yaxun Liu	bec8a66454	[CUDA] Revert defining __CUDA_ARCH__ for amdgcn targets amdgcn targets only support HIP, which does not define __CUDA_ARCH__. this is a partial unroll of r329232 / D45277. Differential Revision: https://reviews.llvm.org/D45387 llvm-svn: 329584	2018-04-09 15:43:01 +00:00
Yaxun Liu	8a5fc15aa4	[CUDA] Add amdgpu sub archs Patch by Greg Rodgers. Revised and lit tests added by Yaxun Liu. Differential Revision: https://reviews.llvm.org/D45277 llvm-svn: 329232	2018-04-04 21:19:27 +00:00
Matt Arsenault	b130ea5605	AMDGPU: Update datalayout for stack alignment llvm-svn: 328657	2018-03-27 19:26:51 +00:00
Yaxun Liu	1578a0a55d	[AMDGPU] Clean up old address space mapping and fix constant address space value Differential Revision: https://reviews.llvm.org/D43911 llvm-svn: 326725	2018-03-05 17:50:10 +00:00
Konstantin Zhuravlyov	d6b3453bdb	AMDGPU: Define FP_FAST_FMA{F} macros for amdgcn - Expand GK_*s (i.e. GFX6 -> GFX600, GFX601, etc.) - This allows us to choose features correctly in some cases (for example, fast fmaf is available on gfx600, but not gfx601) - Move HasFMAF, HasFP64, HasLDEXPF to GPUInfo tables - Add HasFastFMA, HasFastFMAF to GPUInfo tables - Add missing tests llvm-svn: 326254	2018-02-27 21:48:05 +00:00
Konstantin Zhuravlyov	cf71761495	Reapply r325193 llvm-svn: 325203	2018-02-15 02:37:04 +00:00
Konstantin Zhuravlyov	b7b86127f5	Revert r325193 as it breaks buildbots llvm-svn: 325200	2018-02-15 02:27:45 +00:00
Richard Smith	47c9b5d4d6	Add missing definition for class static after r325193. llvm-svn: 325195	2018-02-15 01:01:06 +00:00
Konstantin Zhuravlyov	5c9d4e7957	AMDGPU: Cleanup most of the macros - Insert __AMD__ macro - Insert __AMDGPU__ macro - Insert __devicename__ macro - Add missing tests for arch macros Differential Revision: https://reviews.llvm.org/D36802 llvm-svn: 325193	2018-02-15 00:20:26 +00:00
Yaxun Liu	651bd73c02	[AMDGPU] Change constant addr space to 4 Differential Revision: https://reviews.llvm.org/D43171 llvm-svn: 325031	2018-02-13 18:01:21 +00:00
Matt Arsenault	e7da136a74	AMDGPU: Update for datalayout change llvm-svn: 324748	2018-02-09 16:58:41 +00:00
Erich Keane	e44bdb3f70	Add Rest of Targets Support to ValidCPUList (enabling march notes) A followup to: https://reviews.llvm.org/D42978 Most of the rest of the Targets were pretty rote, so this patch knocks them all out at once. Differential Revision: https://reviews.llvm.org/D43057 llvm-svn: 324676	2018-02-08 23:16:55 +00:00
Yaxun Liu	f5f45e5e63	[AMDGPU] Switch to the new addr space mapping by default This requires corresponding llvm change. Differential Revision: https://reviews.llvm.org/D40956 llvm-svn: 324102	2018-02-02 16:08:24 +00:00
Matt Arsenault	e4f6280a26	AMDGPU: Don't add fp64 feature to r600 subtargets Should fix test after r319709 llvm-svn: 319735	2017-12-05 03:51:26 +00:00
Jan Vesely	cda72c9c3c	AMDGPU: Parse r600 CPU name early and expose FMAF capability Improve amdgcn macro test Differential Revision: https://reviews.llvm.org/D38667 llvm-svn: 316181	2017-10-19 20:40:13 +00:00
Alexander Richardson	6d989436d0	Convert clang::LangAS to a strongly typed enum Summary: Convert clang::LangAS to a strongly typed enum Currently both clang AST address spaces and target specific address spaces are represented as unsigned which can lead to subtle errors if the wrong type is passed. It is especially confusing in the CodeGen files as it is not possible to see what kind of address space should be passed to a function without looking at the implementation. I originally made this change for our LLVM fork for the CHERI architecture where we make extensive use of address spaces to differentiate between capabilities and pointers. When merging the upstream changes I usually run into some test failures or runtime crashes because the wrong kind of address space is passed to a function. By converting the LangAS enum to a C++11 strongly typed enum we can catch these errors at compile time. Additionally, it is now obvious from the function signature which kind of address space it expects. I found the following errors while writing this patch: - ItaniumRecordLayoutBuilder::LayoutField was passing a clang AST address space to TargetInfo::getPointer{Width,Align}() - TypePrinter::printAttributedAfter() prints the numeric value of the clang AST address space instead of the target address space. However, this code is not used so I kept the current behaviour - initializeForBlockHeader() in CGBlocks.cpp was passing LangAS::opencl_generic to TargetInfo::getPointer{Width,Align}() - CodeGenFunction::EmitBlockLiteral() was passing an AST address space to TargetInfo::getPointerWidth() - CGOpenMPRuntimeNVPTX::translateParameter() passed a target address space to Qualifiers::addAddressSpace() - CGOpenMPRuntimeNVPTX::getParameterAddress() was using llvm::Type::getPointerTo() with an AST address space - clang_getAddressSpace() returns either a LangAS or a target address space. As this is exposed to C I have kept the current behaviour and added a comment stating that it is probably not correct. Other than this the patch should not cause any functional changes. Reviewers: yaxunl, pcc, bader Reviewed By: yaxunl, bader Subscribers: jlebar, jholewinski, nhaehnle, Anastasia, cfe-commits Differential Revision: https://reviews.llvm.org/D38816 llvm-svn: 315871	2017-10-15 18:48:14 +00:00
Yaxun Liu	b7318e02c1	[OpenCL] Add LangAS::opencl_private to represent private address space in AST Currently Clang uses default address space (0) to represent private address space for OpenCL in AST. There are two issues with this: Multiple address spaces including private address space cannot be diagnosed. There is no mangling for default address space. For example, if private int* is emitted as i32 addrspace(5)* in IR. It is supposed to be mangled as PUAS5i but it is mangled as Pi instead. This patch attempts to represent OpenCL private address space explicitly in AST. It adds a new enum LangAS::opencl_private and adds it to the variable types which are implicitly private: automatic variables without address space qualifier function parameter pointee type without address space qualifier (OpenCL 1.2 and below) Differential Revision: https://reviews.llvm.org/D35082 llvm-svn: 315668	2017-10-13 03:37:48 +00:00
Konstantin Zhuravlyov	a42719406f	AMDGPU: add missing amdgcn processors and tests - gfx600 - gfx601 - gfx703 - gfx902 - gfx903 Differential Revision: https://reviews.llvm.org/D36771 llvm-svn: 311141	2017-08-18 01:13:39 +00:00
Yaxun Liu	39195062c2	Add OpenCL 2.0 atomic builtin functions as Clang builtin OpenCL 2.0 atomic builtin functions have a scope argument which is ideally represented as synchronization scope argument in LLVM atomic instructions. Clang supports translating Clang atomic builtin functions to LLVM atomic instructions. However it currently does not support synchronization scope of LLVM atomic instructions. Without this, users have to use LLVM assembly code to implement OpenCL atomic builtin functions. This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin functions, which supports generating LLVM atomic instructions with synchronization scope operand. Currently only constant memory scope argument is supported. Support of non-constant memory scope argument will be added later. Differential Revision: https://reviews.llvm.org/D28691 llvm-svn: 310082	2017-08-04 18:16:31 +00:00
Erich Keane	ebba592682	Break up Targets.cpp into a header/impl pair per target type[NFCI] Targets.cpp is getting unwieldy, and even minor changes cause the entire thing to cause recompilation for everyone. This patch bites the bullet and breaks it up into a number of files. I tended to keep function definitions in the class declaration unless it caused additional includes to be necessary. In those cases, I pulled it over into the .cpp file. Content is copy/paste for the most part, besides includes/format/etc. Differential Revision: https://reviews.llvm.org/D35701 llvm-svn: 308791	2017-07-21 22:37:03 +00:00


96 Commits