Commit Graph

57 Commits

Author SHA1 Message Date
Gheorghe-Teodor Bercea db900e389a [CUDA][Clang][Bugfix] Add missing CUDA 9.2 case
Summary:
The bug was reported on the OpenMP-dev list:

.../obj-release/lib/clang/9.0.0/include/__clang_cuda_intrinsics.h:173:35: error: '__nvvm_shfl_sync_idx_i32' needs target feature ptx60|ptx61|ptx63|ptx64
__MAKE_SYNC_SHUFFLES(__shfl_sync, __nvvm_shfl_sync_idx_i32,

This problem occurs when trying to compile a .cu file that requires a newer ptx version (>ptx60 in this case) than ptx42.



Reviewers: tra, ABataev, caomhin

Reviewed By: tra

Subscribers: jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61474

llvm-svn: 359910
2019-05-03 17:59:18 +00:00
Artem Belevich 4cbb235026 [CUDA] Do not pass deprecated option fo fatbinary
CUDA 10.1 tools deprecated some command line options.
fatbinary no longer needs --cuda.

Differential Revision: https://reviews.llvm.org/D61470

llvm-svn: 359838
2019-05-02 22:37:19 +00:00
Artem Belevich 5fe85a003f [CUDA] Implemented _[bi]mma* builtins.
These builtins provide access to the new integer and
sub-integer variants of MMA (matrix multiply-accumulate) instructions
provided by CUDA-10.x on sm_75 (AKA Turing) GPUs.

Also added a feature for PTX 6.4. While Clang/LLVM does not generate
any PTX instructions that need it, we still need to pass it through to
ptxas in order to be able to compile code that uses the new 'mma'
instruction as inline assembly (e.g used by NVIDIA's CUTLASS library
https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101)

Differential Revision: https://reviews.llvm.org/D60279

llvm-svn: 359248
2019-04-25 22:28:09 +00:00
Artem Belevich 4071763bb8 Basic CUDA-10 support.
Differential Revision: https://reviews.llvm.org/D57771

llvm-svn: 353232
2019-02-05 22:38:58 +00:00
Artem Belevich 8fa28a0db0 [CUDA] Propagate detected version of CUDA to cc1
..and use it to control that parts of CUDA compilation
that depend on the specific version of CUDA SDK.

This patch has a placeholder for a 'new launch API' support
which is in a separate patch. The list will be further
extended in the upcoming patch to support CUDA-10.1.

Differential Revision: https://reviews.llvm.org/D57487

llvm-svn: 352798
2019-01-31 21:32:24 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Alexey Bataev c92fc3c8bc [CUDA][OPENMP][NVPTX]Improve logic of the debug info support.
Summary:
Added support for the -gline-directives-only option + fixed logic of the
debug info for CUDA devices. If optimization level is O0, then options
--[no-]cuda-noopt-device-debug do not affect the debug info level. If
the optimization level is >O0, debug info options are used +
--no-cuda-noopt-device-debug is used or no --cuda-noopt-device-debug is
used, the optimization level for the device code is kept and the
emission of the debug directives is used.
If the opt level is > O0, debug info is requested +
--cuda-noopt-device-debug option is used, the optimization is disabled
for the device code + required debug info is emitted.

Reviewers: tra, echristo

Subscribers: aprantl, guansong, JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D51554

llvm-svn: 348930
2018-12-12 14:52:27 +00:00
Joel E. Denny 6dd34dc3dd [CUDA] Fix nvidia-cuda-toolkit detection on Ubuntu
This just extends D40453 (r319317) to Ubuntu.

Reviewed By: Hahnfeld, tra

Differential Revision: https://reviews.llvm.org/D55269

llvm-svn: 348504
2018-12-06 17:46:17 +00:00
Jonas Devlieghere fc51490baf Lift VFS from clang to llvm (NFC)
This patch moves the virtual file system form clang to llvm so it can be
used by more projects.

Concretely the patch:
 - Moves VirtualFileSystem.{h|cpp} from clang/Basic to llvm/Support.
 - Moves the corresponding unit test from clang to llvm.
 - Moves the vfs namespace from clang::vfs to llvm::vfs.
 - Formats the lines affected by this change, mostly this is the result of
   the added llvm namespace.

RFC on the mailing list:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/126657.html

Differential revision: https://reviews.llvm.org/D52783

llvm-svn: 344140
2018-10-10 13:27:25 +00:00
Yaxun Liu 9767089d00 [HIP] Support early finalization of device code for -fno-gpu-rdc
This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original
options as aliases. When -fgpu-rdc is off,
clang will assume the device code in each translation unit does not call
external functions except those in the device library, therefore it is possible
to compile the device code in each translation unit to self-contained kernels
and embed them in the host object, so that the host object behaves like
usual host object which can be linked by lld.

The benefits of this feature is: 1. allow users to create static libraries which
can be linked by host linker; 2. amortized device code linking time.

This patch modifies HIP action builder to insert actions for linking device
code and generating HIP fatbin, and pass HIP fatbin to host backend action.
It extracts code for constructing command for generating HIP fatbin as
a function so that it can be reused by early finalization. It also modifies
codegen of HIP host constructor functions to embed the device fatbin
when it is available.

Differential Revision: https://reviews.llvm.org/D52377

llvm-svn: 343611
2018-10-02 17:48:54 +00:00
Jonas Hahnfeld a981f67bcd [OpenMP] Improve search for libomptarget-nvptx
When looking for the bclib Clang considered the default library
path first while it preferred directories in LIBRARY_PATH when
constructing the invocation of nvlink. The latter actually makes
more sense because during development it allows using a non-default
runtime library. So change the search for the bclib to start
looking in directories given by LIBRARY_PATH.
Additionally add a new option --libomptarget-nvptx-path= which
will be searched first. This will be handy for testing purposes.

Differential Revision: https://reviews.llvm.org/D51686

llvm-svn: 343230
2018-09-27 16:12:32 +00:00
Artem Belevich 44ecb0e3c2 [CUDA] Added basic support for compiling with CUDA-10.0
llvm-svn: 342924
2018-09-24 23:10:44 +00:00
Matt Arsenault a13746b7eb Rename -mlink-cuda-bitcode to -mlink-builtin-bitcode
The same semantics work for OpenCL, and probably any offload
language. Keep the old name around as an alias.

llvm-svn: 340193
2018-08-20 18:16:48 +00:00
Alexey Bataev b83b4e40fe [DEBUGINFO] Disable unsupported debug info options for NVPTX target.
Summary:
Some targets support only default set of the debug options and do not
support additional debug options, like NVPTX target. Patch introduced
virtual function supportsDebugInfoOptions() that can be overloaded
by the toolchain, checks if the target supports some debug
options and emits warning when an unsupported debug option is
found.

Reviewers: echristo

Subscribers: aprantl, JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D49148

llvm-svn: 338155
2018-07-27 19:45:14 +00:00
Artem Belevich 578653a8fc [CUDA] Fixed the list of GPUs supported by CUDA-9.
Differential Revision: https://reviews.llvm.org/D47268

llvm-svn: 333098
2018-05-23 16:45:23 +00:00
Artem Belevich 679dafe69e [CUDA] Added -f[no-]cuda-short-ptr option
The option enables use of 32-bit pointers for accessing
const/local/shared memory. The feature is disabled by default.

Differential Revision: https://reviews.llvm.org/D46148

llvm-svn: 331938
2018-05-09 23:10:09 +00:00
Artem Belevich 3cce307799 [CUDA] Enable CUDA compilation with CUDA-9.2
Differential Revision: https://reviews.llvm.org/D45827

llvm-svn: 330753
2018-04-24 18:23:19 +00:00
Artem Belevich 0ae8590354 [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions.
The new instructions were added added for sm_70+ GPUs in CUDA-9.1.

Differential Revision: https://reviews.llvm.org/D45068

llvm-svn: 330296
2018-04-18 21:51:48 +00:00
Alexey Bataev e36c67b35c [NVPTX] Emit debug info in DWARF-2 by default for Cuda devices.
Summary:
NVPTX target supports debug info in DWARF-2 format. Patch adds emission
of debug info in DWARF-2 by default.

Reviewers: tra, jlebar

Subscribers: aprantl, JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D42581

llvm-svn: 330272
2018-04-18 16:31:09 +00:00
Artem Belevich dde3dc27ee [CUDA] Added --[no-]cuda-include-ptx=sm_XX|all option.
Currently we always include PTX into the fatbin along
with the GPU code.It about doubles the size of the GPU binary
we need to carry in the executable. These options allow control
inclusion of PTX into GPU binary.

This patch does not change the defaults, though we may consider
making no-PTX the default in the future.

Differential Revision: https://reviews.llvm.org/D45495

llvm-svn: 329737
2018-04-10 18:38:22 +00:00
Alexander Kornienko 2a8c18d991 Fix typos in clang
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:

  archtype
  cas
  classs
  checkk
  compres
  definit
  frome
  iff
  inteval
  ith
  lod
  methode
  nd
  optin
  ot
  pres
  statics
  te
  thru

Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)

Differential revision: https://reviews.llvm.org/D44188

llvm-svn: 329399
2018-04-06 15:14:32 +00:00
Gheorghe-Teodor Bercea 0d5aa84ad9 [OpenMP] Add flag for linking runtime bitcode library
Summary: This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel

Reviewed By: ABataev, grokos

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43197

llvm-svn: 327460
2018-03-13 23:19:52 +00:00
Gheorghe-Teodor Bercea 0805b80a73 Revert revision 327438.
llvm-svn: 327447
2018-03-13 20:50:12 +00:00
Gheorghe-Teodor Bercea 148046c11b [OpenMP] Add flag for linking runtime bitcode library
Summary: This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel

Reviewed By: ABataev, grokos

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43197

llvm-svn: 327438
2018-03-13 19:39:19 +00:00
Jonas Hahnfeld 5379c6d6fd [CUDA] Add option to generate relocatable device code
As a first step, pass '-c/--compile-only' to ptxas so that it
doesn't complain about references to external function. This
will successfully generate object files, but they won't work
at runtime because the registration routines need to adapted.

Differential Revision: https://reviews.llvm.org/D42921

llvm-svn: 324878
2018-02-12 10:46:45 +00:00
Jonas Hahnfeld 7f9c518423 [CUDA] Detect installation in PATH
If the CUDA toolkit is not installed to its default locations
in /usr/local/cuda, the user is forced to specify --cuda-path.
This is tedious and the driver can be smarter if well-known tools
(like ptxas) can already be found in the PATH environment variable.

Add option --cuda-path-ignore-env if the user wants to ignore
set environment variables. Also use it in the tests to make sure
the driver always finds the same CUDA installation, regardless
of the user's environment.

Differential Revision: https://reviews.llvm.org/D42642

llvm-svn: 323848
2018-01-31 08:26:51 +00:00
Artem Belevich fbc56a904f [CUDA] Added partial support for CUDA-9.1
Clang can use CUDA-9.1 now, though new APIs (are not implemented yet.

The major change is that headers in CUDA-9.1 went through substantial
changes that started in CUDA-9.0 which required substantial changes
in the cuda compatibility headers provided by clang.

There are two major issues:
* CUDA SDK no longer provides declarations for libdevice functions.
* A lot of device-side functions have become nvcc's builtins and
  CUDA headers no longer contain their implementations.

This patch changes the way CUDA headers are handled if we compile
with CUDA 9.x. Both 9.0 and 9.1 are affected.

* Clang provides its own declarations of libdevice functions.
* For CUDA-9.x clang now provides implementation of device-side
  'standard library' functions using libdevice.

This patch should not affect compilation with CUDA-8. There may be
some observable differences for CUDA-9.0, though they are not expected
to affect functionality.

Tested: CUDA test-suite tests for all supported combinations of:
        CUDA: 7.0,7.5,8.0,9.0,9.1
        GPU: sm_20, sm_35, sm_60, sm_70

Differential Revision: https://reviews.llvm.org/D42513

llvm-svn: 323713
2018-01-30 00:00:12 +00:00
Ismail Donmez 64f99dfa08 Fix function call to fix build
../tools/clang/lib/Driver/ToolChains/Cuda.cpp:80:18: error: reference to non-static member function must be called; did you mean to call it with no arguments?
    if (Distro(D.getVFS).IsDebian())
               ~~^~~~~~
                       ()

llvm-svn: 319322
2017-11-29 15:18:02 +00:00
Sylvestre Ledru 02082c39f3 Follow up of r319317, add the missing header file
llvm-svn: 319319
2017-11-29 15:11:53 +00:00
Sylvestre Ledru 0cfcdc3ffd Add the nvidia-cuda-toolkit Debian package path to search path
Summary:
Reported here:
http://bugs.debian.org/882505

Patch by Andreas Beckmann


Reviewers: Hahnfeld, tra

Reviewed By: tra

Subscribers: jlebar, cfe-commits

Differential Revision: https://reviews.llvm.org/D40453

llvm-svn: 319317
2017-11-29 15:03:28 +00:00
Jonas Hahnfeld 7c78cc5273 [OpenMP] Consistently use cubin extension for nvlink
This was previously done in some places, but for example not for
bundling so that single object compilation with -c failed. In
addition cubin was used for all file types during unbundling which
is incorrect for assembly files that are passed to ptxas.
Tighten up the tests so that we can't regress in that area.

Differential Revision: https://reviews.llvm.org/D40250

llvm-svn: 318763
2017-11-21 14:44:45 +00:00
Justin Lebar 066494d8c1 [CUDA] Print an error if you try to compile with < sm_30 on CUDA 9.
Summary:
CUDA 9's minimum sm is sm_30.

Ideally we should also make sm_30 the default when compiling with CUDA
9, but that seems harder than it should be.

Subscribers: sanjoy

Differential Revision: https://reviews.llvm.org/D39109

llvm-svn: 316611
2017-10-25 21:32:06 +00:00
Jonas Hahnfeld 30b4418e5a [CMake][OpenMP] Customize default offloading arch
For the shuffle instructions in reductions we need at least sm_30
but the user may want to customize the default architecture.

Differential Revision: https://reviews.llvm.org/D38883

llvm-svn: 315996
2017-10-17 13:37:36 +00:00
Jonas Hahnfeld e2c342fc65 [CUDA] Require libdevice only if needed
If the user passes -nocudalib, we can live without it being present.
Simplify the code by just checking whether LibDeviceMap is empty.

Differential Revision: https://reviews.llvm.org/D38901

llvm-svn: 315902
2017-10-16 13:31:30 +00:00
Gheorghe-Teodor Bercea 5a3608ccfa [OpenMP] Don't throw cudalib not found error if only front-end is required.
Summary: If we only use the compiler front-end, do not throw an error about the cuda device library not being found. This allows the front-end to be run on systems where no Cuda installation is found.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, tra

Reviewed By: tra

Subscribers: hfinkel, tra, cfe-commits

Differential Revision: https://reviews.llvm.org/D37914

llvm-svn: 314217
2017-09-26 15:36:20 +00:00
Gheorghe-Teodor Bercea 20789a5f09 [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain.
Summary: Enable the -nocudalib flag for the OpenMP device offloading toolchain as well. Currently it can only be used for the CUDA toolchain.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, hfinkel, tra

Reviewed By: tra

Subscribers: hfinkel, cfe-commits

Differential Revision: https://reviews.llvm.org/D37913

llvm-svn: 314164
2017-09-25 21:56:32 +00:00
Gheorghe-Teodor Bercea 5636f4b33a [OpenMP] Bugfix: output file name drops the absolute path where full path is needed.
Summary: When composing the output file name, the path to the file is being dropped. The full path is required.

Reviewers: Hahnfeld, ABataev, caomhin, carlo.bertolli, hfinkel, tra

Reviewed By: tra

Subscribers: hfinkel, tra, cfe-commits

Differential Revision: https://reviews.llvm.org/D37912

llvm-svn: 314156
2017-09-25 21:25:38 +00:00
Gheorghe-Teodor Bercea d45720b55a Revert commit with wrong message.
llvm-svn: 314154
2017-09-25 21:22:49 +00:00
Gheorghe-Teodor Bercea 8cf757ceda [OpenMP] Don't throw cudalib not found error if only front-end is required.
Summary: If we only use the compiler front-end, do not throw an error about the cuda device library not being found. This allows the front-end to be run on systems where no Cuda installation is found.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, tra

Reviewed By: tra

Subscribers: hfinkel, tra, cfe-commits

Differential Revision: https://reviews.llvm.org/D37914

llvm-svn: 314150
2017-09-25 21:07:16 +00:00
Artem Belevich 4654dc89be [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.
Differential Revision: https://reviews.llvm.org/D38090

llvm-svn: 313820
2017-09-20 21:23:07 +00:00
Artem Belevich 8af4e23d1e [CUDA] Added rudimentary support for CUDA-9 and sm_70.
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.

On LLVM side NVPTX added sm_70 GPU type which bumps required
PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment.

Differential Revision: https://reviews.llvm.org/D37576

llvm-svn: 312734
2017-09-07 18:14:32 +00:00
Gheorghe-Teodor Bercea 9c52574886 [OpenMP] Enable previously successful offloading tests.
Create a separate test file to contain all tests for OpenMP
offloading to GPUs.

Make libdevice checking more robust by accounting for
the case in which no libdevice is found.

This changes are in connrection with diff: D29660

llvm-svn: 310718
2017-08-11 15:46:22 +00:00
Gheorghe-Teodor Bercea 14528c60ba [OpenMP] Delete tests in openmp-offload.c which cuase failures
until a better way to perform these tests is figured out.

Change connected to diff: D29654

llvm-svn: 310625
2017-08-10 16:56:59 +00:00
Alex Lorenz 994f231792 Revert r310489 and follow-up commits r310505, r310519, r310537 and r310549
Commit r310489 caused 'openmp-offload.c' test failures on Darwin and other
platforms:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/39230/testReport/junit/Clang/Driver/openmp_offload_c/

The follow-up commits tried to fix the test, but the test is still failing.

llvm-svn: 310580
2017-08-10 10:34:46 +00:00
Gheorghe-Teodor Bercea a659943306 [OpenMP] Provide a default GPU arch that is supported by
the underlying hardware.

This fixes a bug triggered by diff: D29660

llvm-svn: 310549
2017-08-10 05:01:42 +00:00
Gheorghe-Teodor Bercea 690f6f9b9f [OpenMP] Enable executable lookup into driver directory.
Summary: Invoking the compiler inside a script causes the clang-offload-bundler executable to not be found. This patch enables the lookup for executables in the driver directory where the clang-offload-bundler resides.

Reviewers: hfinkel, carlo.bertolli, arpith-jacob, ABataev, caomhin

Reviewed By: hfinkel

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D36537

llvm-svn: 310513
2017-08-09 19:52:28 +00:00
Gheorghe-Teodor Bercea 6b26dcb6d6 [OpenMP] Add flag for overwriting default PTX version for OpenMP targets
Summary:
This flag "--fopenmp-ptx=" enables the overwriting of the default PTX version used for GPU offloaded OpenMP target regions: "+ptx42".



Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar

Reviewed By: ABataev

Subscribers: rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29660

llvm-svn: 310489
2017-08-09 15:56:54 +00:00
Gheorghe-Teodor Bercea 0846582878 [OpenMP] Add flag for disabling the default generation of relocatable OpenMP target code for NVIDIA GPUs.
Summary: Previously we have added the "-c" flag which gets passed to PTXAS by default to generate relocatable OpenMP target code by default. This set of flags exposes control over this behaviour.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar

Reviewed By: ABataev

Subscribers: Hahnfeld, rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29659

llvm-svn: 310484
2017-08-09 15:27:39 +00:00
Gheorghe-Teodor Bercea b9d117233f [OpenMP] Make OpenMP generated code for the NVIDIA device relocatable by default
Original Diff: D29642

This patch was previously reverted due to an error with patch D29654
that this depends on.

llvm-svn: 310479
2017-08-09 14:59:35 +00:00
Gheorghe-Teodor Bercea 2c92693280 [OpenMP] OpenMP device offloading code generation produces a cubin file which is then integrated in the host binary using the host linker.
Diff: D29654

llvm-svn: 310362
2017-08-08 14:33:05 +00:00