Commit Graph

124 Commits

Author SHA1 Message Date
Jonas Hahnfeld e2c342fc65 [CUDA] Require libdevice only if needed
If the user passes -nocudalib, we can live without it being present.
Simplify the code by just checking whether LibDeviceMap is empty.

Differential Revision: https://reviews.llvm.org/D38901

llvm-svn: 315902
2017-10-16 13:31:30 +00:00
Gheorghe-Teodor Bercea 5a3608ccfa [OpenMP] Don't throw cudalib not found error if only front-end is required.
Summary: If we only use the compiler front-end, do not throw an error about the cuda device library not being found. This allows the front-end to be run on systems where no Cuda installation is found.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, tra

Reviewed By: tra

Subscribers: hfinkel, tra, cfe-commits

Differential Revision: https://reviews.llvm.org/D37914

llvm-svn: 314217
2017-09-26 15:36:20 +00:00
Gheorghe-Teodor Bercea 20789a5f09 [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain.
Summary: Enable the -nocudalib flag for the OpenMP device offloading toolchain as well. Currently it can only be used for the CUDA toolchain.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, hfinkel, tra

Reviewed By: tra

Subscribers: hfinkel, cfe-commits

Differential Revision: https://reviews.llvm.org/D37913

llvm-svn: 314164
2017-09-25 21:56:32 +00:00
Gheorghe-Teodor Bercea 5636f4b33a [OpenMP] Bugfix: output file name drops the absolute path where full path is needed.
Summary: When composing the output file name, the path to the file is being dropped. The full path is required.

Reviewers: Hahnfeld, ABataev, caomhin, carlo.bertolli, hfinkel, tra

Reviewed By: tra

Subscribers: hfinkel, tra, cfe-commits

Differential Revision: https://reviews.llvm.org/D37912

llvm-svn: 314156
2017-09-25 21:25:38 +00:00
Gheorghe-Teodor Bercea d45720b55a Revert commit with wrong message.
llvm-svn: 314154
2017-09-25 21:22:49 +00:00
Gheorghe-Teodor Bercea 8cf757ceda [OpenMP] Don't throw cudalib not found error if only front-end is required.
Summary: If we only use the compiler front-end, do not throw an error about the cuda device library not being found. This allows the front-end to be run on systems where no Cuda installation is found.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, tra

Reviewed By: tra

Subscribers: hfinkel, tra, cfe-commits

Differential Revision: https://reviews.llvm.org/D37914

llvm-svn: 314150
2017-09-25 21:07:16 +00:00
Artem Belevich 4654dc89be [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.
Differential Revision: https://reviews.llvm.org/D38090

llvm-svn: 313820
2017-09-20 21:23:07 +00:00
Artem Belevich 8af4e23d1e [CUDA] Added rudimentary support for CUDA-9 and sm_70.
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.

On LLVM side NVPTX added sm_70 GPU type which bumps required
PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment.

Differential Revision: https://reviews.llvm.org/D37576

llvm-svn: 312734
2017-09-07 18:14:32 +00:00
Gheorghe-Teodor Bercea 9c52574886 [OpenMP] Enable previously successful offloading tests.
Create a separate test file to contain all tests for OpenMP
offloading to GPUs.

Make libdevice checking more robust by accounting for
the case in which no libdevice is found.

This changes are in connrection with diff: D29660

llvm-svn: 310718
2017-08-11 15:46:22 +00:00
Gheorghe-Teodor Bercea 14528c60ba [OpenMP] Delete tests in openmp-offload.c which cuase failures
until a better way to perform these tests is figured out.

Change connected to diff: D29654

llvm-svn: 310625
2017-08-10 16:56:59 +00:00
Alex Lorenz 994f231792 Revert r310489 and follow-up commits r310505, r310519, r310537 and r310549
Commit r310489 caused 'openmp-offload.c' test failures on Darwin and other
platforms:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/39230/testReport/junit/Clang/Driver/openmp_offload_c/

The follow-up commits tried to fix the test, but the test is still failing.

llvm-svn: 310580
2017-08-10 10:34:46 +00:00
Gheorghe-Teodor Bercea a659943306 [OpenMP] Provide a default GPU arch that is supported by
the underlying hardware.

This fixes a bug triggered by diff: D29660

llvm-svn: 310549
2017-08-10 05:01:42 +00:00
Gheorghe-Teodor Bercea 690f6f9b9f [OpenMP] Enable executable lookup into driver directory.
Summary: Invoking the compiler inside a script causes the clang-offload-bundler executable to not be found. This patch enables the lookup for executables in the driver directory where the clang-offload-bundler resides.

Reviewers: hfinkel, carlo.bertolli, arpith-jacob, ABataev, caomhin

Reviewed By: hfinkel

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D36537

llvm-svn: 310513
2017-08-09 19:52:28 +00:00
Gheorghe-Teodor Bercea 6b26dcb6d6 [OpenMP] Add flag for overwriting default PTX version for OpenMP targets
Summary:
This flag "--fopenmp-ptx=" enables the overwriting of the default PTX version used for GPU offloaded OpenMP target regions: "+ptx42".



Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar

Reviewed By: ABataev

Subscribers: rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29660

llvm-svn: 310489
2017-08-09 15:56:54 +00:00
Gheorghe-Teodor Bercea 0846582878 [OpenMP] Add flag for disabling the default generation of relocatable OpenMP target code for NVIDIA GPUs.
Summary: Previously we have added the "-c" flag which gets passed to PTXAS by default to generate relocatable OpenMP target code by default. This set of flags exposes control over this behaviour.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar

Reviewed By: ABataev

Subscribers: Hahnfeld, rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29659

llvm-svn: 310484
2017-08-09 15:27:39 +00:00
Gheorghe-Teodor Bercea b9d117233f [OpenMP] Make OpenMP generated code for the NVIDIA device relocatable by default
Original Diff: D29642

This patch was previously reverted due to an error with patch D29654
that this depends on.

llvm-svn: 310479
2017-08-09 14:59:35 +00:00
Gheorghe-Teodor Bercea 2c92693280 [OpenMP] OpenMP device offloading code generation produces a cubin file which is then integrated in the host binary using the host linker.
Diff: D29654

llvm-svn: 310362
2017-08-08 14:33:05 +00:00
Alex Lorenz 7e9c478cda Revert r310291, r310300 and r310332 because of test failure on Darwin
The commit r310291 introduced the failure. r310332 was a test fix commit and
r310300 was a followup commit. I reverted these two to avoid merge conflicts
when reverting.

The 'openmp-offload.c' test is failing on Darwin because the following
run lines:
// RUN:   touch %t1.o
// RUN:   touch %t2.o
// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %t1.o %t2.o 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-TWOCUBIN %s

trigger the following assertion:

Driver.cpp:3418:
    assert(CachedResults.find(ActionTC) != CachedResults.end() &&
           "Result does not exist??");

llvm-svn: 310345
2017-08-08 11:20:17 +00:00
Gheorghe-Teodor Bercea ceb422236a [OpenMP] Make OpenMP generated code for the NVIDIA device relocatable by default
Summary: When device offloading is enabled and the device is an NVIDIA GPU, OpenMP target regions must be compiled with relocation enabled by passing the "-c" flag to the PTXAS invocation.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar

Reviewed By: Hahnfeld

Subscribers: Hahnfeld, rengolin, mkuron, cfe-commits

Differential Revision: https://reviews.llvm.org/D29642

llvm-svn: 310300
2017-08-07 20:31:51 +00:00
Gheorghe-Teodor Bercea 53431bc046 [OpenMP] Pass -v to PTXAS if it was passed to the driver.
Summary: When compiling code being offloaded by OpenMP to an NVIDIA GPU, pass the -v to PTXAS if it was passed to the CLANG driver.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, jlebar, hfinkel, tstellar

Reviewed By: jlebar

Subscribers: Hahnfeld, rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29644

llvm-svn: 310295
2017-08-07 20:19:23 +00:00
Gheorghe-Teodor Bercea 4cdba82ee0 [OpenMP] Integrate OpenMP target region cubin into host binary
Summary: OpenMP device offloading code generation produces a cubin file which is then integrated in the host binary using the host linker.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, rnk, hfinkel, tstellar

Reviewed By: hfinkel

Subscribers: sfantao, rnk, rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29654

llvm-svn: 310291
2017-08-07 20:01:48 +00:00
Gheorghe-Teodor Bercea 47e0cf378c [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading
Summary:
OpenMP has the ability to offload target regions to devices which may have different architectures.

A new -fopenmp-target-arch flag is introduced to specify the device architecture.

In this patch I use the new flag to specify the compute capability of the underlying NVIDIA architecture for the OpenMP offloading CUDA tool chain.

Only a host-offloading test is provided since full device offloading capability will only be available when [[ https://reviews.llvm.org/D29654 | D29654 ]] lands.

Reviewers: hfinkel, Hahnfeld, carlo.bertolli, caomhin, ABataev

Reviewed By: hfinkel

Subscribers: guansong, cfe-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D34784

llvm-svn: 310263
2017-08-07 15:39:11 +00:00
Gheorghe-Teodor Bercea f0f29608d0 [OpenMP] Extend CLANG target options with device offloading kind.
Summary: Pass the type of the device offloading when building the tool chain for a particular target architecture. This is required when supporting multiple tool chains that target a single device type. In our particular use case, the OpenMP and CUDA tool chains will use the same ```addClangTargetOptions ``` method. This enables the reuse of common options and ensures control over options only supported by a particular tool chain.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, jlebar, hfinkel, tstellar, Hahnfeld

Reviewed By: hfinkel

Subscribers: jgravelle-google, aheejin, rengolin, jfb, dschuff, sbc100, cfe-commits

Differential Revision: https://reviews.llvm.org/D29647

llvm-svn: 307272
2017-07-06 16:22:21 +00:00
David L. Jones f561abab56 [Driver] Consolidate tools and toolchains by target platform. (NFC)
Summary:
(This is a move-only refactoring patch. There are no functionality changes.)

This patch splits apart the Clang driver's tool and toolchain implementation
files. Each target platform toolchain is moved to its own file, along with the
closest-related tools. Each target platform toolchain has separate headers and
implementation files, so the hierarchy of classes is unchanged.

There are some remaining shared free functions, mostly from Tools.cpp. Several
of these move to their own architecture-specific files, similar to r296056. Some
of them are only used by a single target platform; since the tools and
toolchains are now together, some helpers now live in a platform-specific file.
The balance are helpers related to manipulating argument lists, so they are now
in a new file pair, CommonArgs.h and .cpp.

I've tried to cluster the code logically, which is fairly straightforward for
most of the target platforms and shared architectures. I think I've made
reasonable choices for these, as well as the various shared helpers; but of
course, I'm happy to hear feedback in the review.

There are some particular things I don't like about this patch, but haven't been
able to find a better overall solution. The first is the proliferation of files:
there are several files that are tiny because the toolchain is not very
different from its base (usually the Gnu tools/toolchain). I think this is
mostly a reflection of the true complexity, though, so it may not be "fixable"
in any reasonable sense. The second thing I don't like are the includes like
"../Something.h". I've avoided this largely by clustering into the current file
structure. However, a few of these includes remain, and in those cases it
doesn't make sense to me to sink an existing file any deeper.

Reviewers: rsmith, mehdi_amini, compnerd, rnk, javed.absar

Subscribers: emaste, jfb, danalbert, srhines, dschuff, jyknight, nemanjai, nhaehnle, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D30372

llvm-svn: 297250
2017-03-08 01:02:16 +00:00