llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun (Sam) Liu	92d8ad02e9	[HIP] Fix rocm not found on rocm3.5 Currently rocm detector expects device library bitcodes named as *.bc instead of *.amdgcn.bc. However in rocm3.5 the device library bitcodes are named as *.amdgcn.bc, which causes rocm3.5 not to be detected. This patch fixes that. Differential Revision: https://reviews.llvm.org/D81713	2020-06-18 08:40:09 -04:00
Artem Belevich	d700237f1a	[CUDA,HIP] Use VFS for SDK detection. It's useful for using clang from tools that may need to provide SDK files from non-standard locations. Clang CLI only provides a way to specify VFS for include files, so there's no good way to test this yet. Differential Revision: https://reviews.llvm.org/D81771	2020-06-15 12:54:44 -07:00
Yaxun (Sam) Liu	8422bc9efc	recommit "[HIP] Add default header and include path" recommit `11d06b9511` with fix for lit tests.	2020-06-06 14:21:22 -04:00
Nico Weber	2920348063	Revert "recommit "[HIP] Add default header and include path"" This reverts commit `1fa43e0b34`. Still breaks tests on several bots, see https://reviews.llvm.org/D81176	2020-06-05 21:50:04 -04:00
Yaxun (Sam) Liu	1fa43e0b34	recommit "[HIP] Add default header and include path" recommit `11d06b9511` with fix for lit tests.	2020-06-05 20:41:15 -04:00
Yaxun (Sam) Liu	8a8c6913a9	Revert "[HIP] Add default header and include path" This reverts commit `11d06b9511`.	2020-06-05 15:42:57 -04:00
Yaxun (Sam) Liu	11d06b9511	[HIP] Add default header and include path To support std::complex and some other standard C/C++ functions in HIP device code, they need to be forced to be __host__ __device__ functions by pragmas. This is done by some clang standard C++ wrapper headers which are shared between cuda-clang and hip-Clang. For these standard C++ wrapper headers to work properly, a specific include path order has to be enforced: clang C++ wrapper include path, standard C++ include path, clang include path. Also, these C++ wrapper headers require that device versions of some standard C/C++ functions be declared before including them. This needs to be done by including a default header which declares or defines these device functions. The default header is always included before any other headers are included by users. This patch adds the default header and include path for HIP. Differential Revision: https://reviews.llvm.org/D81176	2020-06-05 12:44:57 -04:00
Matt Arsenault	1d96dca949	HIP: Try to deal with more llvm package layouts The various HIP builds are all inconsistent. The default llvm install goes to ${INSTALL_PREFIX}/bin/clang, but the rocm packaging scripts move this under ${INSTALL_PREFIX}/llvm/bin/clang. Some other builds further pollute this with ${INSTALL_PREFIX}/bin/x86_64/clang. These should really be consolidated, but try to handle them for now.	2020-05-23 13:28:24 -04:00
Matt Arsenault	235fb7dc24	AMDGPU/OpenCL: Accept -nostdlib in place of -nogpulib -nogpulib makes sense when there is a host (where -nostdlib would apply) and offload target. Accept nostdlib when there is no offload target as an alias.	2020-05-14 12:33:31 -04:00
Matt Arsenault	14e1845711	HIP: Merge builtin library handling Merge with the new --rocm-path handling used for OpenCL. This looks for a usable set of device libraries upfront, rather than giving a generic "no such file or directory error". If any of the required bitcode libraries are missing, this will now produce a "cannot find ROCm installation." error. This differs from the existing hip specific flags by pointing to a rocm root install instead of a single directory with bitcode files. This tries to maintain compatibility with the existing --hip-device-lib and --hip-device-lib-path flags, as well as the HIP_DEVICE_LIB_PATH environment variable, or at least the range of uses with testcases. The existing range of uses and behavior doesn't entirely make sense to me, so some of the untested edge cases change behavior. Currently the two path forms seem to have the double purpose of a search path for an arbitrary --hip-device-lib, and for finding the stock set of libraries. Since the stock set of libraries This also changes the behavior when multiple paths are specified, and only takes the last one (and the environment variable only handles a single path). If --hip-device-lib is used, it now only treats --hip-device-lib-path as the search path for it, and does not attempt to find the rocm installation. If not, --hip-device-lib-path and the environment variable are used as the directory to search instead of the rocm root based path. This should also automatically fix handling of the options to use wave64.	2020-05-12 09:50:22 -04:00
Matt Arsenault	123bee602a	AMDGPU: Search for new ROCm bitcode library structure The current install situation is a mess, but I'm working on fixing it. Search for the target layout instead of one of the N options that exist today.	2020-05-12 09:41:07 -04:00
Matt Arsenault	3a61245050	clang/AMDGPU: Assume denormals are enabled for the default target. Since the default logic was based on having fast denormal/fma features, and the default target has no features, we assumed flushing by default. This fixes incorrectly assuming flushing in builds for "generic" IR libraries. The handling for no specified --cuda-gpu-arch in HIP is kind of broken. Somewhere else forces a default target of gfx803, which does not enable denormal handling by default. We don't see this default switching here, so you'll end up with a different denormal mode depending on whether you explicitly requested gfx803, or used it by default.	2020-04-15 09:17:26 -04:00
Matt Arsenault	dc89a3efb4	HIP: Fix handling of denormal mode I didn't realize HIP was a distinct offloading kind, so the subtarget was looking for -march, which isn't correct for HIP. We also have the possibility of different denormal defaults in the case of multiple offload targets, so we need to thread the JobAction through the target hook.	2020-04-13 11:48:45 -07:00
Matt Arsenault	4593e4131a	AMDGPU: Teach toolchain to link rocm device libs Currently the library is separately linked, but this isn't correct to implement fast math flags correctly. Each module should get the version of the library appropriate for its combination of fast math and related flags, with the attributes propagated into its functions and internalized. HIP already maintains the list of libraries, but this is not used for OpenCL. Unfortunately, HIP uses a separate --hip-device-lib argument, despite both languages using the same bitcode library. Eventually these two searches need to be merged. An additional problem is there are 3 different locations the libraries are installed, depending on which build is used. This also needs to be consolidated (or at least the search logic needs to deal with this unnecessary complexity).	2020-04-10 13:37:32 -04:00
Matt Arsenault	6593360ee7	AMDGPU: Fix consistently backwards logic for default denormal mode I forgot to squash this into `c9d65a48af`	2020-04-01 12:36:22 -04:00
Matt Arsenault	c9d65a48af	HIP: Ensure new denormal mode attributes are set Apparently HIPToolChain does not subclass from AMDGPUToolChain, so this was not applying the new denormal attributes. I'm not sure why this doesn't subclass. Just copy the implementation for now.	2020-03-31 18:00:37 -04:00
Matt Arsenault	a3c814d234	Separately track input and output denormal mode AMDGPU and x86 at least both have separate controls for whether denormal results are flushed on output, and for whether denormals are implicitly treated as 0 as an input. The current DAGCombiner use only really cares about the input treatment of denormals.	2020-02-04 12:59:21 -05:00
Matt Arsenault	a4451d88ee	Consolidate internal denormal flushing controls Currently there are 4 different mechanisms for controlling denormal flushing behavior, and about as many equivalent frontend controls. - AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features - NVPTX uses the nvptx-f32ftz attribute - ARM directly uses the denormal-fp-math attribute - Other targets indirectly use denormal-fp-math in one DAGCombine - cl-denorms-are-zero has a corresponding denorms-are-zero attribute AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name). Work on consolidating these into the denormal-fp-math attribute, and a new type specific denormal-fp-math-f32 variant. Only ARM seems to support the two different flush modes, so this is overkill for the other use cases. Ideally we would error on the unsupported positive-zero mode on other targets from somewhere. Move the logic for selecting the flush mode into the compiler driver, instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32 are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as a user flag. -cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and -fno-cuda-flush-denormals-to-zero will be mapped to -fp-denormal-math-f32=ieee or preserve-sign rather than the old attributes. Stop emitting the denorms-are-zero attribute for the OpenCL flag. It has no in-tree users. The meaning would also be target dependent, such as the AMDGPU choice to treat this as only meaning allow flushing of f32 and not f16 or f64. The naming is also potentially confusing, since DAZ in other contexts refers to instructions implicitly treating input denormals as zero, not necessarily flushing output denormals to zero. This also does not attempt to change the behavior for the current attribute. The LangRef now states that the default is ieee behavior, but this is inaccurate for the current implementation. 
The clang handling is slightly hacky to avoid touching the existing denormal-fp-math uses. Fixing this will be left for a future patch. AMDGPU is still using the subtarget feature to control the denormal mode, but the new attributes are now emitted. A future change will switch this and remove the subtarget features.	2020-01-17 20:09:53 -05:00
Jonas Devlieghere	2b3d49b610	[Clang] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. Differential revision: https://reviews.llvm.org/D66259 llvm-svn: 368942	2019-08-14 23:04:18 +00:00
Stanislav Mekhanoshin	8a8131a3f6	[AMDGPU] gfx1010 wave32 clang support Differential Revision: https://reviews.llvm.org/D63209 llvm-svn: 363341	2019-06-13 23:47:59 +00:00
Matt Arsenault	d91bb4831b	AMDGPU: Don't emit debugger subtarget features Keep the flag around for compatibility. llvm-svn: 354624	2019-02-21 21:31:43 +00:00
Scott Linder	bef2663751	Add -fapply-global-visibility-to-externs for -cc1 Introduce an option to request global visibility settings be applied to declarations without a definition or an explicit visibility, rather than the existing behavior of giving these default visibility. When the visibility of all or most extern definitions is known this allows for the same optimisations -fvisibility permits without updating source code to annotate all declarations. Differential Revision: https://reviews.llvm.org/D56868 llvm-svn: 352391	2019-01-28 17:12:19 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Raphael Isemann	b23ccecbb0	Misc typos fixes in ./lib folder Summary: Found via `codespell -q 3 -I ../clang-whitelist.txt -L uint,importd,crasher,gonna,cant,ue,ons,orign,ned` Reviewers: teemperor Reviewed By: teemperor Subscribers: teemperor, jholewinski, jvesely, nhaehnle, whisperity, jfb, cfe-commits Differential Revision: https://reviews.llvm.org/D55475 llvm-svn: 348755	2018-12-10 12:37:46 +00:00
Matt Arsenault	cd5bc7be08	AMDGPU: Default to hidden visibility Object linking isn't supported, so it's not useful to emit default visibility. Default visibility requires relocations we don't yet support for functions compiled in another translation unit. WebAssembly already does this, although they insert these arguments in a different place for some reason. llvm-svn: 341033	2018-08-30 08:18:06 +00:00
Konstantin Zhuravlyov	37e9739a58	AMDGPU: Remove amdgpu-debugger-reserve-regs feature llvm-svn: 335287	2018-06-21 20:27:47 +00:00
Konstantin Zhuravlyov	8914a6d50e	AMDGPU/NFC: Move getAMDGPUTargetFeatures to AMDGPU toolchain Differential Revision: https://reviews.llvm.org/D39877 llvm-svn: 317909	2017-11-10 19:09:57 +00:00
Andrey Kasaurov	6618c39a95	[AMDGPU] Implement infrastructure to set options in AMDGPUToolChain In current OpenCL implementation some options are set in OpenCL RT/Driver, which causes discrepancy between online and offline paths. Implement infrastructure to move options from OpenCL RT/Driver to AMDGPUToolChain using overloaded TranslateArgs() method. Create map for default options values, as Options.td doesn't support default values (in contrast with OPTIONS.def). Add two driver options: -On and -mNN (like -O3, -m64). Some minor formatting changes to follow the clang-format style. Differential Revision: https://reviews.llvm.org/D37386 llvm-svn: 312524	2017-09-05 10:24:38 +00:00
Andrey Kasaurov	099f633da4	Test commit llvm-svn: 308744	2017-07-21 15:24:37 +00:00
Nikolay Haustov	208a597ee7	Test commit llvm-svn: 308741	2017-07-21 13:58:11 +00:00
David L. Jones	f561abab56	[Driver] Consolidate tools and toolchains by target platform. (NFC) Summary: (This is a move-only refactoring patch. There are no functionality changes.) This patch splits apart the Clang driver's tool and toolchain implementation files. Each target platform toolchain is moved to its own file, along with the closest-related tools. Each target platform toolchain has separate headers and implementation files, so the hierarchy of classes is unchanged. There are some remaining shared free functions, mostly from Tools.cpp. Several of these move to their own architecture-specific files, similar to r296056. Some of them are only used by a single target platform; since the tools and toolchains are now together, some helpers now live in a platform-specific file. The balance are helpers related to manipulating argument lists, so they are now in a new file pair, CommonArgs.h and .cpp. I've tried to cluster the code logically, which is fairly straightforward for most of the target platforms and shared architectures. I think I've made reasonable choices for these, as well as the various shared helpers; but of course, I'm happy to hear feedback in the review. There are some particular things I don't like about this patch, but haven't been able to find a better overall solution. The first is the proliferation of files: there are several files that are tiny because the toolchain is not very different from its base (usually the Gnu tools/toolchain). I think this is mostly a reflection of the true complexity, though, so it may not be "fixable" in any reasonable sense. The second thing I don't like are the includes like "../Something.h". I've avoided this largely by clustering into the current file structure. However, a few of these includes remain, and in those cases it doesn't make sense to me to sink an existing file any deeper. 
Reviewers: rsmith, mehdi_amini, compnerd, rnk, javed.absar Subscribers: emaste, jfb, danalbert, srhines, dschuff, jyknight, nemanjai, nhaehnle, mgorny, cfe-commits Differential Revision: https://reviews.llvm.org/D30372 llvm-svn: 297250	2017-03-08 01:02:16 +00:00

81 Commits