21 Commits

Author SHA1 Message Date
Fangrui Song 3a79f1caa9 [Driver][test] Restore %clang -cc1 in test/Driver
This partially reverts commit 1609a5d771 (the
test/Driver part). We want to discourage %clang_cc1 and clang -cc1 in
test/Driver. The clang -cc1 uses in hlsl/offload/etc are not good examples.
2022-09-29 12:21:57 -07:00
Michał Górny 1609a5d771 [clang] [test] Use %clang_cc1 substitution consistently
Use the `%clang_cc1` substitution consistently across the test suite,
replacing inline `%clang -cc1` invocations, except for one Preprocessor
test where this is causing breakage.  This is necessary to ensure that
additional parameters passed via `%clang` do not interfere with `-cc1`
that must always be passed as the first command-line argument.

Remove the additional substitution blocking `%clang_cc1` use in Driver
tests.  It was added in 2013 and was meant to prevent tests that call
`clang -cc1` from being added to Driver.  The state of the test suite
shows that it did not succeed.

Differential Revision: https://reviews.llvm.org/D134880
2022-09-29 20:59:00 +02:00
Joseph Huber f50a7c7a26 [LinkerWrapper] Fix optimized debugging builds for NVPTX LTO
The ptxas assembler does not allow the `-g` flag along with
optimizations. Normally this is degraded to line info in the driver, but
when using LTO we did not have this step and the linker wrapper was not
correctly degrading the option. Note that this will not work if the user
does not pass `-g` again to the linker invocation. That will require
setting some flags in the binary to indicate that debugging was used
when building.
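
The degradation described above can be sketched roughly as follows. This is an illustration only, not the linker wrapper's code: the helper name `ptxas_debug_flags` and its parameters are hypothetical, and it assumes `ptxas` spells full debug info as `-g` and line info as `-lineinfo`.

```python
def ptxas_debug_flags(opt_level, debug_requested):
    """Pick the debug flag to hand to ptxas (hypothetical helper).

    ptxas rejects `-g` together with optimizations, so an optimized
    debug build falls back to line info only.
    """
    if not debug_requested:
        return []
    if opt_level > 0:
        # Degrade full debug info to line info for optimized builds.
        return ["-lineinfo"]
    return ["-g"]
```

Note that `debug_requested` has to come from the link command line here, which mirrors the limitation stated above: the user must pass `-g` again to the linker invocation.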

This fixes #57990

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D134660
2022-09-27 10:49:17 -05:00
Joseph Huber 47b0aa5e4b [LinkerWrapper] Rework passing args to the LLVM backend 2022-07-18 12:44:15 -04:00
Joseph Huber 20d253e3bf [LinkerWrapper] Fix linker-wrapper not working with host-LTO 2022-07-13 12:32:02 -04:00
Joseph Huber fe6a391357 [Clang] Fix tests failing due to invalid syntax for host triple
Summary:
We use the `--host-triple=` argument to manually set the target triple.
This was changed to include the `=` previously, but these additional
test cases were not updated, causing them to fail on some unsupported
systems.
2022-07-11 21:31:56 -04:00
Joseph Huber ce091eb3b9 [HIP] Add support for handling HIP in the linker wrapper
This patch adds the necessary changes required to bundle and wrap HIP
files. The bundling is done using `clang-offload-bundler` currently to
mimic `fatbinary` and the wrapping is done using very similar runtime
calls to CUDA. This still does not support managed / surface / texture
variables; that would require some additional information in the entry.

One difference in the code generation with AMD is that I don't check
whether the handle is null before destructing it; I'm not sure if that's
required.

With this we should be able to support HIP with the new driver.

Depends on D128850

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D128914
2022-07-11 15:49:23 -04:00
Joseph Huber 22a01b860b [LinkerWrapper] Forward `-mllvm` options to the linker wrapper
This patch adds the ability to use `-mllvm` options in the linker
wrapper when performing bitcode linking or the module compilation.
This is done by passing in the LLVM argument to the clang-linker-wrapper
tool. Inside the linker-wrapper tool we invoke the `CommandLine` parser
solely to forward command-line options given to the `clang-linker-wrapper`
on to the LLVM tools that also use the `CommandLine` parser. The actual
arguments to the linker wrapper are parsed using the `Opt` library
instead.

For example, in the following command the `CommandLine` parser will attempt to
parse `-abc`, while the `Opt` parser takes `-mllvm <arg>` and ignores it so it is
not passed to the linker arguments.
```
clang-linker-wrapper -mllvm -abc -- <linker-args>
```

As far as I can tell this is the easiest way to forward arguments to
LLVM tool invocations. If there is a better way to pass these arguments
(such as through the LTO config) let me know.
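
The split between the two parsers can be pictured with a small sketch. It is a simplified model under stated assumptions, not the tool's implementation: `split_mllvm_args` is a hypothetical name, and the real tool uses LLVM's option libraries rather than manual list handling.

```python
def split_mllvm_args(argv):
    """Separate `-mllvm <arg>` pairs from the remaining arguments.

    Returns (llvm_args, rest): the values destined for the LLVM
    command-line parser, and everything else left untouched.
    """
    llvm_args, rest = [], []
    it = iter(argv)
    for arg in it:
        if arg == "--":
            # Everything after `--` belongs to the linker; stop scanning.
            rest.append(arg)
            rest.extend(it)
            break
        if arg == "-mllvm":
            value = next(it, None)
            if value is None:
                raise ValueError("-mllvm expects an argument")
            llvm_args.append(value)
        else:
            rest.append(arg)
    return llvm_args, rest
```

For the command shown above, this yields `-abc` for the LLVM parser and leaves only `--` and the linker arguments for the rest of the tool.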

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D129424
2022-07-09 21:18:19 -04:00
Joseph Huber d2ead9e324 [LinkerWrapper][NFC] Rework command line argument handling in the linker wrapper
Summary:
This patch reworks the command line argument handling in the linker
wrapper from using the LLVM `cl` interface to using the `Option`
interface with TableGen. This has several benefits compared to the old
method.

The linker wrapper uses some of the linker's arguments, such as the
libraries and input files, and this allows us to parse them properly.
Additionally, we can now easily set up aliases to
the linker wrapper arguments and pass them in the linker input directly.
That is, pass an option like `cuda-path=` as `--offload-arg=cuda-path=`
in the linker's inputs. This will allow us to handle offloading
compilation in the linker itself some day. Finally, this is also a much
cleaner interface for passing arguments to the individual device linking
jobs.
2022-07-08 11:18:38 -04:00
Joseph Huber 0bb1bf1b17 [LinkerWrapper] Add AMDGPU specific options to the LLD invocation
We use LLD to perform AMDGPU linking. This linker accepts some arguments
through the `-plugin-opt` facilities. These options match what `Clang`
will output when given the same input.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D128923
2022-07-05 13:43:51 -04:00
Joseph Huber 958a885050 [LinkerWrapper] Rework the linker wrapper and use owning binaries
The linker wrapper currently eagerly extracts all identified offloading
binaries to a file. This isn't ideal because we will soon open these
files again to examine their symbols for LTO and other things.
Additionally, we may not use every extracted file in the case of static
libraries. This would be very noisy in the case of static libraries that
may contain code for several targets not participating in the current
link.

Recent changes allow us to treat an Offloading binary as a standard
binary class. So that allows us to use an OwningBinary to model the
file. Now we keep it in memory and only write it once we know which
files will be participating in the final link job. This also reworks a
lot of the structure around how we handle this by removing the old
DeviceFile class.

The main benefit from this is that the following doesn't output 32+ files and
instead will only output a single temp file for the linked module.
```
$ clang input.c -fopenmp --offload-arch=sm_70 -foffload-lto -save-temps
```

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D127246
2022-06-22 09:24:10 -04:00
Joseph Huber f37101983f [OpenMP] Add `-Xoffload-linker` to forward input to the device linker
We use the clang-linker-wrapper to perform device linking of embedded
offloading object files. This is done by generating those jobs inside of
the linker-wrapper itself. This patch adds an argument in Clang and the
linker-wrapper that allows users to forward input to the device linking
phase. This can either be done for every device linker, or for a
specific target triple. We use the `-Xoffload-linker <arg>` and the
`-Xoffload-linker-<triple> <arg>` syntax to accomplish this.
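
As a rough picture of how the two spellings could be routed, consider the sketch below. It is an assumption-laden illustration, not Clang's option handling: `split_offload_linker_args` is a hypothetical helper, and the forwarded values in the usage note are placeholders rather than real linker flags.

```python
def split_offload_linker_args(argv):
    """Route forwarded arguments to device link jobs (hypothetical helper).

    Returns (for_all, per_triple): values for every device linker, and a
    dict mapping a target triple to the values meant only for it.
    """
    prefix = "-Xoffload-linker"
    for_all, per_triple = [], {}
    it = iter(argv)
    for arg in it:
        if arg == prefix:
            bucket = for_all
        elif arg.startswith(prefix + "-") and len(arg) > len(prefix) + 1:
            triple = arg[len(prefix) + 1:]
            bucket = per_triple.setdefault(triple, [])
        else:
            continue  # not a forwarding option; handled elsewhere
        value = next(it, None)
        if value is None:
            raise ValueError(arg + " expects an argument")
        bucket.append(value)
    return for_all, per_triple
```

Calling it with `["-Xoffload-linker", "--placeholder-a", "-Xoffload-linker-nvptx64-nvidia-cuda", "--placeholder-b"]` puts `--placeholder-a` in the list for all device linkers and `--placeholder-b` under the `nvptx64-nvidia-cuda` triple.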

Reviewed By: markdewing, tra

Differential Revision: https://reviews.llvm.org/D126226
2022-05-24 09:11:02 -04:00
Joseph Huber 26eb04268f [Clang] Introduce clang-offload-packager tool to bundle device files
In order to do offloading compilation we need to embed files into the
host and create fatbinaries. Clang uses a special binary format to
bundle several files along with their metadata into a single binary
image. This is currently performed using the `-fembed-offload-binary`
option. However, this is not very extensible since it requires changing
the command flag every time we want to add something and makes optional
arguments difficult. This patch introduces a new tool called
`clang-offload-packager` that behaves similarly to CUDA's `fatbinary`.
This tool takes several input files with metadata and embeds them into a
single image that can then be embedded in the host.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D125165
2022-05-11 09:39:13 -04:00
Joseph Huber f49d576a88 [CUDA] Add wrapper code generation for registering CUDA images
This patch adds the necessary code generation to create the wrapper code
that registers all the globals in CUDA. We create the necessary
functions and iterate through the list of
`__start_cuda_offloading_entries` to find which globals must be
registered. This is very similar to the code generation done currently
in Clang for non-rdc builds, but here we are registering a fully linked
fatbinary and finding the globals via the above sections.

With this we should be able to fully support basic RDC / LTO building of CUDA
code.

It's also worth noting that this does not include the necessary PTX to JIT the
image, so to use this support the offloading architecture must match the
system's architecture.

Depends on D123810

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D123812
2022-05-11 07:30:25 -04:00
Joseph Huber e12905b4d5 [OpenMP] Add basic support for properly handling static libraries
Currently we handle static libraries like any other object in the
linker wrapper. However, this does not preserve the semantics that
dictate static libraries should be lazily loaded as the symbols are
needed. This allows us to ignore linking in architectures that are not
used by the main application being compiled. This patch adds the basic
support for detecting if a file came from a static library, and only
including it in the link job if it's used by other object files.

This patch only adds the basic support, to be more correct we should
check the symbols and only include the library if the link job contains
symbols that are needed. Ideally we could just put this on the linker
itself, but nvlink doesn't seem to support `.a` files.
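
The symbol-driven behaviour mentioned as the more correct approach can be sketched as below. This models conventional lazy archive semantics, not what this patch implements; the function name and the `(name, defined, undefined)` tuples are hypothetical stand-ins for real object files.

```python
def select_link_inputs(objects, archive_members):
    """Choose which archive members join the link (illustrative only).

    Each entry is (name, defined_symbols, undefined_symbols), with the
    symbol collections given as sets. Plain objects are always linked;
    an archive member is pulled in only if it defines a symbol that is
    still unresolved.
    """
    linked = [name for name, _, _ in objects]
    defined, undefined = set(), set()
    for _, defs, undefs in objects:
        defined |= defs
        undefined |= undefs

    pending = list(archive_members)
    changed = True
    while changed:  # repeat, since a new member may need further members
        changed = False
        for member in list(pending):
            name, defs, undefs = member
            if defs & (undefined - defined):
                linked.append(name)
                defined |= defs
                undefined |= undefs
                pending.remove(member)
                changed = True
    return linked
```

With an object that needs `foo`, and an archive holding `foo.o` (which needs `bar`), `bar.o`, and an unrelated `unused.o`, only `foo.o` and `bar.o` are added to the link.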

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125092
2022-05-06 11:20:58 -04:00
Joseph Huber d9c64d33b9 [OpenMP] Allow CUDA to be linked with OpenMP using the new driver
After basic support for embedding and handling CUDA files was added to
the new driver, we should be able to call CUDA functions from OpenMP
code. This patch makes the necessary changes to successfully link in CUDA
programs that were compiled using the new driver. With this patch it
should be possible to compile device-only CUDA code (no kernels) and
call it from OpenMP as follows:

```
$ clang++ cuda.cu -fopenmp-new-driver --offload-arch=sm_70 -c
$ clang++ openmp.cpp cuda.o -fopenmp-new-driver -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 -march=sm_70
```

Currently this requires using a host variant to suppress the generation
of a CPU-side fallback call.

Depends on D120272

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120273
2022-04-29 11:38:40 -04:00
Joseph Huber 3530c35c66 [OpenMP] Use CUDA's non-RDC mode when LTO has whole program visibility
When we do LTO we consider ourselves to have whole program visibility if
every single input file we have contains LLVM bitcode. If we have whole
program visibility then we can create a single image and utilize CUDA's
non-RDC mode by not passing `-c` to `ptxas` and ignoring the `nvlink`
job. This should be faster for some situations and also saves us the
time executing `nvlink`.
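
A minimal sketch of that decision, assuming inputs are available as byte strings and that bitcode is recognised by its leading magic bytes (raw bitcode starts with `BC` `0xC0` `0xDE`; the wrapper format starts with `0x0B17C0DE` stored little-endian). The function names are hypothetical and the returned job list is only a description of which tools would run.

```python
BITCODE_MAGIC = b"BC\xc0\xde"        # raw LLVM bitcode
WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"  # bitcode wrapper header


def is_bitcode(data):
    """Return True if the buffer starts with an LLVM bitcode magic."""
    return data.startswith(BITCODE_MAGIC) or data.startswith(WRAPPER_MAGIC)


def has_whole_program_visibility(inputs):
    """Every single input must be bitcode, and there must be at least one."""
    inputs = list(inputs)
    return bool(inputs) and all(is_bitcode(data) for data in inputs)


def plan_nvptx_jobs(inputs):
    """Describe the device jobs to run (illustrative, not real commands)."""
    if has_whole_program_visibility(inputs):
        # Non-RDC mode: ptxas without `-c`, and no nvlink step.
        return ["ptxas"]
    return ["ptxas -c", "nvlink"]
```

A single non-bitcode input, such as a precompiled object, is enough to fall back to the `ptxas -c` plus `nvlink` path.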

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D124292
2022-04-23 12:42:40 -04:00
Joseph Huber ee74abaad7 [OpenMP] Add triple to the linker wrapper job
Summary:
I forgot to add the triple to the linker wrapper job, so we were still
generating code for the unintended platforms.
2022-04-20 08:23:43 -04:00
Joseph Huber 1dfe0273fd [OpenMP] Add explicit triple to linker wrapper test
Summary:
Some platforms like Mach-O require different handling of section names.
This is not supported on macOS or Windows yet so we shouldn't be
testing the compilation there. Add an explicit triple to the tests.
2022-04-20 07:24:51 -04:00
Joseph Huber 8c64928887 [OpenMP] Add necessary registered targets for linker wrapper test
Summary:
The linker wrapper needs to use the registered backend to perform LTO.
This was causing problems on the buildbots that didn't support it.
2022-04-19 18:48:58 -04:00
Joseph Huber 260c5df2d5 [OpenMP] Add better testing for the linker wrapper
The linker wrapper is used to perform linking and wrapping of embedded
device object files. Currently its internals are not able to be tested
easily. This patch adds the `--dry-run` and `--print-wrapped-module`
options to investigate the link jobs that will be run along with the
wrapped code that will be created to register the binaries.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D124039
2022-04-19 18:37:09 -04:00