Simplify debug info back to just "limited" or "full" by rolling the ctor
type homing fully into the "limited" debug info.
Also fix a bug I found along the way that was causing ctor type homing
to kick in even when something could be vtable homed (where vtable
homing is stronger/more effective than ctor homing) - fixing at the same
time as it keeps the tests (that were testing only "limited non ctor"
homing and now test ctor homing) passing.
HLSL supports half type.
When enable-16bit-types is not set, half will be treated as float.
When enable-16bit-types is set, half will be treated like real 16bit float type and map to llvm half type.
Also change CXXABI to Microsoft to match dxc behavior.
The mangle name for half is "$f16@" when half is treat as native half type and "$halff@" when treat as float.
In AST, half is still half.
The special thing is done at clang codeGen, when NativeHalfType is false, half will translated into float.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D124790
Previously, when using the new driver we created a fatbinary with the
PTX and Cubin output. This was mainly done in an attempt to create some
backwards compatibility with the existing CUDA support that embeds the
fatbinary in each TU. This will most likely be more work than necessary
to actually implement. The linker wrapper cannot do anything with these
embedded PTX files because we do not know how to link them, and if we
did want to include multiple files it should go through the
`clang-offload-packager` instead. Also this didn't repsect the setting
that disables embedding PTX (although it wasn't used anyway).
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D128441
The target features are necessary for correctly compiling most programs
in LTO mode. Currently, these are derived in clang at link time and
passed as an arguemnt to the linker wrapper. This is problematic because
it requires knowing the required toolchain at link time, which should
not be necessry. Instead, these features should be embedded into the
offloading binary so we can unify them in the linker wrapper for LTO.
This also required changing the offload packager to interpret multiple
arguments as concatenation with a comma. This is so we can still use the
`,` separator for the argument list.
Depends on D127246
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D127686
This patch refines //when// driver diagnostics are formatted so that
`flang-new` and `flang-new -fc1` behave consistently with `clang` and
`clang -cc1`, respectively. This change only applies to driver diagnostics.
Scanning, parsing and semantic diagnostics are separate and not covered here.
**NEW BEHAVIOUR**
To illustrate the new behaviour, consider the following input file:
```! file.f90
program m
integer :: i = k
end
```
In the following invocations, "error: Semantic errors in file.f90" _will be_
formatted:
```
$ flang-new file.f90
error: Semantic errors in file.f90
./file.f90:2:18: error: Must be a constant value
integer :: i = k
$ flang-new -fc1 -fcolor-diagnostics file.f90
error: Semantic errors in file.f90
./file.f90:2:18: error: Must be a constant value
integer :: i = k
```
However, in the following invocations, "error: Semantic errors in file.f90"
_will not be_ formatted:
```
$ flang-new -fno-color-diagnostics file.f90
error: Semantic errors in file.f90
./file.f90:2:18: error: Must be a constant value
integer :: i = k
$ flang-new -fc1 file.f90
error: Semantic errors in file.f90
./file.f90:2:18: error: Must be a constant value
integer :: i = k
```
Before this change, none of the above would be formatted. Note also that the
default behaviour in `flang-new` is different to `flang-new -fc1` (this is
consistent with Clang).
**NOTES ON IMPLEMENTATION**
Note that the diagnostic options are parsed in `createAndPopulateDiagOpt`s in
driver.cpp. That's where the driver's `DiagnosticEngine` options are set. Like
most command-line compiler driver options, these flags are "claimed" in
Flang.cpp (i.e. when creating a frontend driver invocation) by calling
`getLastArg` rather than in driver.cpp.
In Clang's Options.td, `defm color_diagnostics` is replaced with two separate
definitions: `def fcolor_diagnostics` and def fno_color_diagnostics`. That's
because originally `color_diagnostics` derived from `OptInCC1FFlag`, which is a
multiclass for opt-in options in CC1. In order to preserve the current
behaviour in `clang -cc1` (i.e. to keep `-fno-color-diagnostics` unavailable in
`clang -cc1`) and to implement similar behaviour in `flang-new -fc1`, we can't
re-use `OptInCC1FFlag`.
Formatting is only available in consoles that support it and will normally mean that
the message is printed in bold + color.
Co-authored-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
Reviewed By: rovka
Differential Revision: https://reviews.llvm.org/D126164
This patch updates the `--[no-]offload-arch` command line arguments to
allow the user to pass in many architectures in a single argument rather
than specifying it multiple times. This means that the following command
line,
```
clang foo.cu --offload-arch=sm_70 --offload-arch=sm_80
```
can become:
```
clang foo.cu --offload-arch=sm_70,sm_80
```
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D128206
When linking a Fortran program, we need to add the runtime libraries to
the command line. This is exactly what we do for Linux/Darwin, but the
MSVC interface is slightly different (e.g. -libpath instead of -L).
We also remove oldnames and libcmt, since they're not needed at the
moment and they bring in more dependencies.
We also pass `/subsystem:console` to the linker so it can figure out the
right entry point. This is only needed for MSVC's `link.exe`. For LLD it
is redundant but doesn't hurt.
Differential Revision: https://reviews.llvm.org/D126291
Co-authored-by: Markus Mützel <markus.muetzel@gmx.de>
No behavior change as GNU ld/gold/ld.lld ignore --dynamic-linker in -r mode.
This change makes the intention clearer as we already suppress --dynamic-linker
for -shared, -static, and -static-pie.
Previously, each job would overwrite the -MJ file. This didn't quite work for Clang invocations with multiple architectures, which got fixed in D121997 by always appending to the -MJ file. That's not correct either, since the file would grow indefinitely on subsequent Clang invocations. This patch ensures the driver always removes the file before jobs fill it in by appending.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D128098
Let -fsanitize-memory-param-retval be used together with
-fsanitize=kernel-memory, so that it can be applied when building the
Linux kernel.
Also add clang/test/CodeGen/kmsan-param-retval.c to ensure that
-fsanitize-memory-param-retval eliminates shadow accesses for parameters
marked as undef.
Reviewed By: eugenis, vitalybuka
Differential Revision: https://reviews.llvm.org/D127860
GNU ld has a hack that defaults to -X (--discard-locals) in the emulation file
`riscvelf.em`. The recommended way, as gcc/config/arm does, is to let the
compiler driver pass -X to ld.
(The motivation is likely to discard a plethora of `.L` symbols due to linker
relaxation.)
lld default to --discard-none. To make clang+lld match GNU ld's behavior, pass
-X to ld.
Note: GNU ld has a special rule to treat ld -r -s as ld -r -S -x. With -X, driver `-r -Wl,-s`
will behave as ld `-r -S -X`. This removes fewer symbols than `-r -S -x` but is safe.
Differential Revision: https://reviews.llvm.org/D127826
We shouldn't assume that libunwind.so is available. Rather can defer
the decision to the linker which defaults to libunwind.so, but if .so
isn't available, it'd pick libunwind.a. Users can use -static-libgcc
and -shared-libgcc to override this behavior and explicitly choose
the version they want.
Differential Revision: https://reviews.llvm.org/D127528
Adds the -fsanitize plumbing for memtag-globals. Makes -fsanitize=memtag
imply -fsanitize=memtag-globals.
This has no effect on codegen for now.
Reviewed By: eugenis, aaron.ballman
Differential Revision: https://reviews.llvm.org/D127163
1. Support user specified linker (-fuse-ld)
2. Support user specified linker script (-T)
Reviewed By: MaskRay, haowei
Differential Revision: https://reviews.llvm.org/D126192
This patch simplifies how we unify target features. Now we simply
iterate the input in reverse and only insert the feature if it hasn't
been seen yet. The only reason we need to reverse this at the end is to
keep the features in order for the existing tests.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D127707
The offloading packager doesn't have a normal offloading kind. This
would result in its output being sent to the executable name when in
save-temps mode. This would then get overwritten by the actual output.
This patch adds specific checks to make sure that it gets the correct
name.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D127673
Currently the a AAPCS compliant frame record is not always created for
functions when it should. Although a consistent frame record might not
be required in some cases, there are still scenarios where applications
may want to make use of the call hierarchy made available trough it.
In order to enable the use of AAPCS compliant frame records whilst keep
backwards compatibility, this patch introduces a new command-line option
(`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends.
The option allows users to explicitly select when to use it, and is also
useful to ensure the extra overhead introduced by the frame records is
only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
It was previously reverted by 8406839d19.
---
This flag was introduced by
6818991d71
commit 6818991d71
Author: Ted Kremenek <kremenek@apple.com>
Date: Mon Dec 7 22:06:12 2009 +0000
Add clang-cc option '-analyzer-opt-analyze-nested-blocks' to treat
block literals as an entry point for analyzer checks.
The last reference was removed by this commit:
5c32dfc5fb
commit 5c32dfc5fb
Author: Anna Zaks <ganna@apple.com>
Date: Fri Dec 21 01:19:15 2012 +0000
[analyzer] Add blocks and ObjC messages to the call graph.
This paves the road for constructing a better function dependency graph.
If we analyze a function before the functions it calls and inlines,
there is more opportunity for optimization.
Note, we add call edges to the called methods that correspond to
function definitions (declarations with bodies).
Consequently, we should remove this dead flag.
However, this arises a couple of burning questions.
- Should the `cc1` frontend still accept this flag - to keep
tools/users passing this flag directly to `cc1` (which is unsupported,
unadvertised) working.
- If we should remain backward compatible, how long?
- How can we get rid of deprecated and obsolete flags at some point?
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D126067
I'm trying to remove unused options from the `Analyses.def` file, then
merge the rest of the useful options into the `AnalyzerOptions.def`.
Then make sure one can set these by an `-analyzer-config XXX=YYY` style
flag.
Then surface the `-analyzer-config` to the `clang` frontend;
After all of this, we can pursue the tablegen approach described
https://discourse.llvm.org/t/rfc-tablegen-clang-static-analyzer-engine-options-for-better-documentation/61488
In this patch, I'm proposing flag deprecations.
We should support deprecated analyzer flags for exactly one release. In
this case I'm planning to drop this flag in `clang-16`.
In the clang frontend, now we won't pass this option to the cc1
frontend, rather emit a warning diagnostic reminding the users about
this deprecated flag, which will be turned into error in clang-16.
Unfortunately, I had to remove all the tests referring to this flag,
causing a mass change. I've also added a test for checking this warning.
I've seen that `scan-build` also uses this flag, but I think we should
remove that part only after we turn this into a hard error.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D126215
1. Support user specified linker (-fuse-ld)
2. Support user specified linker script (-T)
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D126192
The option mdefault-visibility-export-mapping is created to allow
mapping default visibility to an explicit shared library export
(e.g. dllexport). Exactly how and if this is manifested is target
dependent (since it depends on how they map dllexport in the IR).
Three values are provided for the option:
* none: the default and behavior without the option, no additional export linkage information is created.
* explicit: add the export for entities with explict default visibility from the source, including RTTI
* all: add the export for all entities with default visibility
This option is useful for targets which do not export symbols as part of
their usual default linkage behaviour (e.g. AIX), such targets
traditionally specified such information in external files (e.g. export
lists), but this mapping allows them to use the visibility information
typically used for this purpose on other (e.g. ELF) platforms.
This relands commit: 8c8a2679a2
with fixes for the compile time and assert problems that were reported
by:
* making shouldMapVisibilityToDLLExport inline and provide an early return
in the case where no mapping is in effect (aka non-AIX platforms)
* don't try to export RTTI types which we will give internal linkage to
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D126340
We use the flags `--offload-host-only` and `--offload-device-only` to
change the driver's code generation for offloading programs. These are
currently parsed out independently in many places. This patch simply
refactors this to work as a mode for the Driver. This stopped us from
emitting warnings if unused because it's always used now, but I don't
think this is a great loss.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D127515
Command lines with multiple `-arch` arguments expand into multiple entries in the compilation database. However, the file writes are not appending, meaning subsequent writes end up overwriting the previous ones, resulting in garbled output.
This patch fixes that by always appending to the file.
rdar://90165004
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D121997
This patch introduces the new -fdriver-only flag which instructs Clang to only execute the driver logic without running individual jobs. In a way, this is very similar to -###, with the following differences:
* it doesn't automatically print all jobs,
* it doesn't avoid side effects (e.g. it will generate compilation database when -MJ is specified).
This flag will be useful in testing D121997.
Reviewed By: dexonsmith, egorzhdan
Differential Revision: https://reviews.llvm.org/D127408
Currently the a AAPCS compliant frame record is not always created for
functions when it should. Although a consistent frame record might not
be required in some cases, there are still scenarios where applications
may want to make use of the call hierarchy made available trough it.
In order to enable the use of AAPCS compliant frame records whilst keep
backwards compatibility, this patch introduces a new command-line option
(`-mframe-chain=[none|aapcs|aapcs+leaf]`) for Aarch32 and Thumb backends.
The option allows users to explicitly select when to use it, and is also
useful to ensure the extra overhead introduced by the frame records is
only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
Previously, omitting unnecessary DWARF unwinds was only done in two
cases:
* For Darwin + aarch64, if no DWARF unwind info is needed for all the
functions in a TU, then the `__eh_frame` section would be omitted
entirely. If any one function needed DWARF unwind, then MC would emit
DWARF unwind entries for all the functions in the TU.
* For watchOS, MC would omit DWARF unwind on a per-function basis, as
long as compact unwind was available for that function.
This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
`--emit-dwarf-unwind=` which can toggle between `always`,
`no-compact-unwind` (only emit DWARF when CU cannot be emitted for a
given function), and the target platform `default`. `no-compact-unwind`
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.
**Motivation:** I'm working on adding support for `__eh_frame` to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.
**Addendum:** It turns out that there was a latent bug in the X86
backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.
For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK). This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
`UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there
is no impact to the final binary size.
Reviewed By: davide, lhames
Differential Revision: https://reviews.llvm.org/D122258
The DWARF version was raised to 5 for all platforms which do not opt
out. Default to DWARF version to 4 for z/OS again.
Reviewed By: abhina.sreeskantharajan, uweigand
Differential Revision: https://reviews.llvm.org/D127498
This flag was introduced by
6818991d71
commit 6818991d71
Author: Ted Kremenek <kremenek@apple.com>
Date: Mon Dec 7 22:06:12 2009 +0000
Add clang-cc option '-analyzer-opt-analyze-nested-blocks' to treat
block literals as an entry point for analyzer checks.
The last reference was removed by this commit:
5c32dfc5fb
commit 5c32dfc5fb
Author: Anna Zaks <ganna@apple.com>
Date: Fri Dec 21 01:19:15 2012 +0000
[analyzer] Add blocks and ObjC messages to the call graph.
This paves the road for constructing a better function dependency graph.
If we analyze a function before the functions it calls and inlines,
there is more opportunity for optimization.
Note, we add call edges to the called methods that correspond to
function definitions (declarations with bodies).
Consequently, we should remove this dead flag.
However, this arises a couple of burning questions.
- Should the `cc1` frontend still accept this flag - to keep
tools/users passing this flag directly to `cc1` (which is unsupported,
unadvertised) working.
- If we should remain backward compatible, how long?
- How can we get rid of deprecated and obsolete flags at some point?
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D126067
I'm trying to remove unused options from the `Analyses.def` file, then
merge the rest of the useful options into the `AnalyzerOptions.def`.
Then make sure one can set these by an `-analyzer-config XXX=YYY` style
flag.
Then surface the `-analyzer-config` to the `clang` frontend;
After all of this, we can pursue the tablegen approach described
https://discourse.llvm.org/t/rfc-tablegen-clang-static-analyzer-engine-options-for-better-documentation/61488
In this patch, I'm proposing flag deprecations.
We should support deprecated analyzer flags for exactly one release. In
this case I'm planning to drop this flag in `clang-16`.
In the clang frontend, now we won't pass this option to the cc1
frontend, rather emit a warning diagnostic reminding the users about
this deprecated flag, which will be turned into error in clang-16.
Unfortunately, I had to remove all the tests referring to this flag,
causing a mass change. I've also added a test for checking this warning.
I've seen that `scan-build` also uses this flag, but I think we should
remove that part only after we turn this into a hard error.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D126215
Until now, `-x` wasn't really taken into account in Flang's compiler and
frontend drivers. `flang-new` and `flang-new -fc1` only recently gained
powers to consume inputs other than Fortran files and that's probably
why this hasn't been noticed yet.
This patch makes sure that `-x` is supported correctly and consistently
with Clang. To this end, verification is added when reading LLVM IR
files (i.e. IR modules are verified with `llvm::verifyModule`). This
way, LLVM IR parsing errors are correctly reported to Flang users. This
also aids testing.
With the new functionality, we can verify that `-x ir` breaks
compilation for e.g. Fortran files and vice-versa. Tests are updated
accordingly.
Differential Revision: https://reviews.llvm.org/D127207
CLANG_MODULE_CACHE_PATH can be used to change where clang should
put the module cache, or can be set to "" to disable caching entirely.
Differential revision: https://reviews.llvm.org/D126678
As there 3 intercepts that depend on libresolv, link tests in ./configure scripts may be confuse by the presence of resolv symbols (i.e. dn_expand) even with -lresolv and get a runtime error.
Android provides the functionality in libc.
https://reviews.llvm.org/D122849https://reviews.llvm.org/D126851
Reviewed By: eugenis, MaskRay
Differential Revision: https://reviews.llvm.org/D127145
The backend now can generate working unwind information for this
target.
Improve the existing windows-exceptions.cpp testcase to check for
the state of unwind tables on all MSVC architectures.
Differential Revision: https://reviews.llvm.org/D126862
Instead of adding all devtoolset and gcc-toolset prefixes to the list of
prefixes, just scan the /opt/rh/ directory for the one with the highest
version number and only add that one.
Differential Revision: https://reviews.llvm.org/D125862
This caused assertions, see comment on the code review:
llvm/clang/lib/AST/Decl.cpp:1510:
clang::LinkageInfo clang::LinkageComputer::getLVForDecl(const clang::NamedDecl *, clang::LVComputationKind):
Assertion `D->getCachedLinkage() == LV.getLinkage()' failed.
> The option mdefault-visibility-export-mapping is created to allow
> mapping default visibility to an explicit shared library export
> (e.g. dllexport). Exactly how and if this is manifested is target
> dependent (since it depends on how they map dllexport in the IR).
>
> Three values are provided for the option:
>
> * none: the default and behavior without the option, no additional export linkage information is created.
> * explicit: add the export for entities with explict default visibility from the source, including RTTI
> * all: add the export for all entities with default visibility
>
> This option is useful for targets which do not export symbols as part of
> their usual default linkage behaviour (e.g. AIX), such targets
> traditionally specified such information in external files (e.g. export
> lists), but this mapping allows them to use the visibility information
> typically used for this purpose on other (e.g. ELF) platforms.
>
> Reviewed By: MaskRay
>
> Differential Revision: https://reviews.llvm.org/D126340
This reverts commit 8c8a2679a2.
LTO code may end up mixing bitcode files from various sources varying in
their use of opaque pointer types. The current strategy to decide
between opaque / typed pointers upon the first bitcode file loaded does
not work here, since we could be loading a non-opaque bitcode file first
and would then be unable to load any files with opaque pointer types
later.
So for LTO this:
- Adds an `lto::Config::OpaquePointer` option and enforces an upfront
decision between the two modes.
- Adds `-opaque-pointers`/`-no-opaque-pointers` options to the gold
plugin; disabled by default.
- `--opaque-pointers`/`--no-opaque-pointers` options with
`-plugin-opt=-opaque-pointers`/`-plugin-opt=-no-opaque-pointers`
aliases to lld; disabled by default.
- Adds an `-lto-opaque-pointers` option to the `llvm-lto2` tool.
- Changes the clang driver to pass `-plugin-opt=-opaque-pointers` to
the linker in LTO modes when clang was configured with opaque
pointers enabled by default.
This fixes https://github.com/llvm/llvm-project/issues/55377
Differential Revision: https://reviews.llvm.org/D125847
The option mdefault-visibility-export-mapping is created to allow
mapping default visibility to an explicit shared library export
(e.g. dllexport). Exactly how and if this is manifested is target
dependent (since it depends on how they map dllexport in the IR).
Three values are provided for the option:
* none: the default and behavior without the option, no additional export linkage information is created.
* explicit: add the export for entities with explict default visibility from the source, including RTTI
* all: add the export for all entities with default visibility
This option is useful for targets which do not export symbols as part of
their usual default linkage behaviour (e.g. AIX), such targets
traditionally specified such information in external files (e.g. export
lists), but this mapping allows them to use the visibility information
typically used for this purpose on other (e.g. ELF) platforms.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D126340
This reverts commit a544710cd4.
See discussion in D120540.
This breaks C++ Clang modules on Darwin and also more than a dozen
tests in the LLDB testsuite. I think we need to be more careful to
separate out the enabling of Clang C++ modules and C++20
modules. Either by having -fmodules-ts control the HaveModules flag,
or by adding a way to explicitly turn them off.
Compiler-rt doesn't provide support file for cfi on s390x ad ppc64le (at least).
When trying to use the flag, we get a file error.
This is an attempt at making the error more explicit.
Differential Revision: https://reviews.llvm.org/D120484
clang by default assumes static library name to be xxx.lib
when -lxxx is specified on Windows with MSVC environment,
instead of libxxx.a.
This patch fixes static device library unbundling for that.
It falls back to libxxx.a if xxx.lib is not found.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D126681
Create dxc_D as alias to option D which Define <macro> to <value> (or 1 if <value> omitted).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D125338
Vector types in hlsl is using clang ext_vector_type.
Declaration of vector types is in builtin header hlsl.h.
hlsl.h will be included by default for hlsl shader.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D125052
This is to avoid err_target_unknown_abi which is caused by use default TargetTriple instead of shader model target triple.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D125585
`-gen-reproducer` causes crash reproduction to be emitted
even when clang didn't crash, and now can optionally take an
argument of never, on-crash (default), on-error and always.
Differential revision: https://reviews.llvm.org/D120201
Vector types in hlsl is using clang ext_vector_type.
Declaration of vector types is in builtin header hlsl.h.
hlsl.h will be included by default for hlsl shader.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D125052
This patch allows user to use C++20 module by -fcxx-modules. Previously,
we could only use it under -std=c++20. Given that user could use C++20
coroutine standalonel by -fcoroutines-ts. It makes sense to offer an
option to use C++20 modules without enabling C++20.
Reviewed By: iains, MaskRay
Differential Revision: https://reviews.llvm.org/D120540
This patch allows user to use C++20 module by -fcxx-modules. Previously,
we could only use it under -std=c++20. Given that user could use C++20
coroutine standalonel by -fcoroutines-ts. It makes sense to offer an
option to use C++20 modules without enabling C++20.
Reviewed By: iains, MaskRay
Differential Revision: https://reviews.llvm.org/D120540
Update the diagnostic in D81404: the convention is to use
err_drv_unsupported_option_argument instead of adding a new diagnostic for every
option.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D126511
Currently if `--sysroot /` is passed to the Clang driver, the include paths generated by the Clang driver will start with a double slash: `//usr/include/...`.
If VFS is used to inject files into the include paths (for example, the Swift compiler does this), VFS will get confused and the injected files won't be visible.
This change makes sure that the include paths start with a single slash.
Fixes#28283.
Differential Revision: https://reviews.llvm.org/D126289
-gen-reproducer causes crash reproduction to be emitted even
when clang didn't crash, and now can optionally take an argument
of never, on-crash (default), on-error and always.
Differential revision: https://reviews.llvm.org/D120201
The new driver uses an augmented linker wrapper to perform the device
linking phase, but to the user looks like a regular linker invocation.
Contrary to the old driver, the new driver contains all the information
necessary to produce a linked device image in the host object itself.
Currently, we infer the usage of the device linker by the user
specifying an offloading toolchain, e.g. (--offload-arch=...) or
(-fopenmp-targets=...), but this shouldn't be strictly necessary.
This patch introduces a new option `--offload-link` to tell
the driver to use the offloading linker instead. So a compilation flow
can now look like this,
```
clang foo.cu --offload-new-driver -fgpu-rdc --offload-arch=sm_70 -c
clang foo.o --offload-link -lcudart
```
I was considering if this could be merged into the `-fuse-ld` option,
but because the device linker wraps over the users linker it would
conflict with that. In the future it's possible to merge this into `lld`
completely or `gold` via a plugin and we would use this option to
enable the device linking feature. Let me know what you think for this.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D126398
Following the new flow for external object code emission,
provide flags to switch between integrated and external
backend similar to the integrated assembler options.
SPIR-V target is the only user of this functionality at
this point.
This patch also updated SPIR-V documentation to clarify
that integrated object code emission for SPIR-V is an
experimental feature.
Differential Revision: https://reviews.llvm.org/D125679
We use the clang-linker-wrapper to perform device linking of embedded
offloading object files. This is done by generating those jobs inside of
the linker-wrapper itself. This patch adds an argument in Clang and the
linker-wrapper that allows users to forward input to the device linking
phase. This can either be done for every device linker, or for a
specific target triple. We use the `-Xoffload-linker <arg>` and the
`-Xoffload-linker-<triple> <arg>` syntax to accomplish this.
Reviewed By: markdewing, tra
Differential Revision: https://reviews.llvm.org/D126226
For generic targets such as SPIR-V clang sets all OpenCL
extensions/features as supported by default. However
concrete targets are unlikely to support all extensions
features, which creates a problem when such generic SPIR-V
binary is compiled for a specific target later on.
To allow compile time diagnostics for unsupported features
this flag is now being exposed in the clang driver.
Differential Revision: https://reviews.llvm.org/D125243
The new test is a copy of the corresponding PS4 test, with the triple
etc updated, because there's currently no good way to make one lit test
"iterate" with multiple targets.
The arch or cpu has its default fpu features and versions such as fpuv2_sf/fpuv3_sf.
And there is also -mfpu option to specify and override fpu version and features.
For example, C860 has fpuv3_sf/fpuv3_df feature as default, when
-mfpu=fpv2 is given, fpuv3_sf/fpuv3_df is replaced with fpuv2_sf/fpuv2_df.
Support the "-fzero-call-used-regs" option on AArch64. This involves much less
specialized code than the X86 version. Most of the checks can be done with
TableGen.
Reviewed By: nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D124836
This patch basically extends https://reviews.llvm.org/D122008 with
support for MacOSX/Darwin.
To facilitate this, I've added `MacOSX` to the list of supported OSes in
Target.cpp. Flang already supports `Darwin` and it doesn't really do
anything OS-specific there (it could probably safely skip checking the
OS for now).
Note that generating executables remains hidden behind the
`-flang-experimental-exec` flag. Also, we don't need to add `-lm` on
MacOSX as `libm` is effectively included in `libSystem` (which is linked
in unconditionally).
Differential Revision: https://reviews.llvm.org/D125628
This patch allows systems to build the llvm-project with the devtoolset-11
toolchain.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D125499
The Clang driver additional stages to build a complete offloading
program for applications using CUDA or OpenMP offloading. This normally
requires either a source file input or a valid object file to be
handled. This would cause problems when trying to compile an assembly or
LLVM IR file through clang with flags that would enable offloading. This
patch simply adds a check to prevent the offloading toolchain from being
used if we don't have a valid source file.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125705
Summary:
Normally we parse through the CUDA installation to disover the needed
features. However, we may want to build libraries on targets that do not
currently have CUDA installed but still need to know which features to
make use of when creating the PTX or bitcode. This flag is a simple way
to specify this so we can compile certain codes withotu a valid CUDA
installation.
Ideally this could be done via an -Xarch or simimlar flag but currently
they cannot handle this. We would need to support using an -Xarch flag
that takes multiple arguments that then pass them to the -Xclang
functionality.
The previous patches allowed us to create a static library containing
all the device code. This patch uses that library to perform the device
runtime linking late when performing LTO. This in addition to
simplifying the libraries, allows us to transparently handle the runtime
library as-needed without needing Clang to manually pass the necessary
library in the linker wrapper job.
Depends on D125315
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125333
We use globals to configure debugging at compile-time for the device
runtime. Because these are only used by the OpenMP runtime we shouldn't
define them if we aren't using the device runtime. When a user passes in
'-nogpulib' this indicates that we are not using the device runtime, so
we should check for the precense of this flag and not emit these globals
if used.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125314
OpenMP uses several wrapper hearders to provide the definitions of
needed symbols contained in the host. However, some users may use the
`-nostdinc` option to override these definitions themselves. The OpenMP
wrapper headers are stored in the same location as the clang install. If
the user passes `-nostdinc` then this include directory is never looked
at by default which means that including these wrappers will always
fail. These headers should instead be included manually if they are
needed with a `-nostdinc` build.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125265
This adds a late Machine Pass to work around a Cortex CPU Erratum
affecting Cortex-A57 and Cortex-A72:
- Cortex-A57 Erratum 1742098
- Cortex-A72 Erratum 1655431
The pass inserts instructions to make the inputs to the fused AES
instruction pairs no longer trigger the erratum. Here the pass errs on
the side of caution, inserting the instructions wherever we cannot prove
that the inputs came from a safe instruction.
The pass is used:
- for Cortex-A57 and Cortex-A72,
- for "generic" cores (which are used when using `-march=`),
- when the user specifies `-mfix-cortex-a57-aes-1742098` or
`mfix-cortex-a72-aes-1655431` in the command-line arguments to clang.
Reviewed By: dmgreen, simon_tatham
Differential Revision: https://reviews.llvm.org/D119720
When Clang generates the path prefix (i.e. the path of the directory
where the file is) when generating FILE, __builtin_FILE(), and
std::source_location, Clang uses the platform-specific path separator
character of the build environment where Clang _itself_ is built. This
leads to inconsistencies in Chrome builds where Clang running on
non-Windows environments uses the forward slash (/) path separator
while Clang running on Windows builds uses the backslash (\) path
separator. To fix this, we add a flag -ffile-reproducible (and its
inverse, -fno-file-reproducible) to have Clang use the target's
platform-specific file separator character.
Additionally, the existing flags -fmacro-prefix-map and
-ffile-prefix-map now both imply -ffile-reproducible. This can be
overriden by setting -fno-file-reproducible.
[0]: https://crbug.com/1310767
Differential revision: https://reviews.llvm.org/D122766
Create dxc_D as alias to option D which Define <macro> to <value> (or 1 if <value> omitted).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D125338
In order to do offloading compilation we need to embed files into the
host and create fatbainaries. Clang uses a special binary format to
bundle several files along with their metadata into a single binary
image. This is currently performed using the `-fembed-offload-binary`
option. However this is not very extensibile since it requires changing
the command flag every time we want to add something and makes optional
arguments difficult. This patch introduces a new tool called
`clang-offload-packager` that behaves similarly to CUDA's `fatbinary`.
This tool takes several input files with metadata and embeds it into a
single image that can then be embedded in the host.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125165
The changes made in D123460 generalized the code generation for OpenMP's
offloading entries. We can use the same scheme to register globals for
CUDA code. This patch adds the code generation to create these
offloading entries when compiling using the new offloading driver mode.
The offloading entries are simple structs that contain the information
necessary to register the global. The struct used is as follows:
```
Type struct __tgt_offload_entry {
void *addr; // Pointer to the offload entry info.
// (function or global)
char *name; // Name of the function or global.
size_t size; // Size of the entry info (0 if it a function).
int32_t flags;
int32_t reserved;
};
```
Currently CUDA handles RDC code generation by deferring the registration
of globals in the current TU to a callback function containing the
modules ID. Later all the module IDs will be used to register all of the
globals at once. Rather than mimic this, offloading entries allow us to
mimic the way OpenMP registers globals. That is, we create a simple
global struct for each device global to be registered. These are placed
at a special section `cuda_offloading_entires`. Because this section is
a valid C-identifier, the linker will profide a `__start` and `__stop`
pointer that we can use to iterate and register all globals at runtime.
the registration requires a flag variable to indicate which registration
function to use. I have assigned the flags somewhat arbitrarily, but
these use the following values.
Kernel: 0
Variable: 0
Managed: 1
Surface: 2
Texture: 3
Depends on D120272
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D123471
Summary:
We use the `--offload-new-driver` option to enable offload code
embedding. The check for when to do this was flawed and was enabling it
too early in the case of OpenMP, causing a segfault when dereferencing
the offloading toolchain.
Currently we require the `-fopenmp-targets=` option to specify the
triple to use for the offloading toolchains, and the `-Xopenmp-target=`
option to specify architectures to a specific toolchain. The changes
made in D124721 allowed us to use `--offload-arch=` to specify multiple
target architectures. However, this can become combersome with many
different architectures. This patch introduces functinality that
attempts to deduce the target triple and architectures from the
offloading action. Currently we will deduce known GPU architectures when
only `-fopenmp` is specified.
This required a bit of a hack to cache the deduced architectures,
without this we would've just thrown an error when we tried to look up
the architecture again when generating the job. Normally we require the
user to manually specify the toolchain arguments, but here they would
confict unless we overrode them.
Depends on: D124721
Reviewed By: saiislam
Differential Revision: https://reviews.llvm.org/D125050
This patch adds support for OpenMP to use the `--offload-arch` and
`--no-offload-arch` options. Traditionally, OpenMP has only supported
compiling for a single architecture via the `-Xopenmp-target` option.
Now we can pass in a bound architecture and use that if given, otherwise
we default to the value of the `-march` option as before.
Note that this only applies the basic support, the OpenMP target runtime
does not yet know how to choose between multiple architectures.
Additionally other parts of the offloading toolchain (e.g. LTO) require
the `-march` option, these should be worked out later.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D124721
When targeting cortex-a53, set this linker flag rather than relying
on the toolchain users to do it in their build.
Differential Revision: https://reviews.llvm.org/D114023
fcgl option will make compilation stop after clang codeGen and output the llvm ir.
It is added to check clang codeGen output for HLSL.
It will be translated into -S -emit-llvm and -disable-llvm-passes.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D124983
This change makes sure that Flang's driver recognises LLVM IR and BC as
supported file formats. To this end, `isFortran` is extended and renamed
as `isSupportedByFlang` (the latter better reflects the new
functionality).
New tests are added to verify that the target triple is correctly
overridden by the frontend driver's default value or the value specified
with `-triple`. Strictly speaking, this is not a functionality that's
new in this patch (it was added in D124664). This patch simply enables
us to write such tests and hence I'm including them here.
Differential Revision: https://reviews.llvm.org/D124667
If clang is passed "-include foo.h", it will rewrite to "-include-pch foo.h.pch"
before passing it to cc1, if foo.h.pch exists.
Existence is checked, but validity is not. This is probably a reasonable
assumption for the compiler itself, but not for clang-based tools where the
actual compiler may be a different version of clang, or even GCC.
In the end, we lose our -include, we gain a -include-pch that can't be used,
and the file often fails to parse.
I would like to turn this off for all non-clang invocations (i.e.
createInvocationFromCommandLine), but we have explicit tests of this behavior
for libclang and I can't work out the implications of changing it.
Instead this patch:
- makes it optional in the driver, default on (no change)
- makes it optional in createInvocationFromCommandLine, default on (no change)
- changes driver to do IO through the VFS so it can be tested
- tests the option
- turns the option off in clangd where the problem was reported
Subsequent patches should make libclang opt in explicitly and flip the default
for all other tools. It's probably also time to extract an options struct
for createInvocationFromCommandLine.
Fixes https://github.com/clangd/clangd/issues/856
Fixes https://github.com/clangd/vscode-clangd/issues/324
Differential Revision: https://reviews.llvm.org/D124970
This patch switches the PGO implementation on AIX from using the runtime
registration-based section tracking to the __start_SECNAME/__stop_SECNAME
based. In order to enable the recognition of __start_SECNAME/__stop_SECNAME
symbols in the AIX linker, the -bdbg:namedsects:ss needs to be used.
Reviewed By: jsji, MaskRay, davidxl
Differential Revision: https://reviews.llvm.org/D124857
Due to a think-o, it was only being accepted as a -cc1 flag. This adds
the proper forwarding from the driver to the frontend and adds test
coverage for the option.
The DXIL validator version option(/validator-version) decide the validator version when compile hlsl.
The format is major.minor like 1.0.
In normal case, the value of validator version should be got from DXIL validator. Before we got DXIL validator ready for llvm/main, DXIL validator version option is added first to set validator version.
It will affect code generation for DXIL, so it is treated as a code gen option.
A new member std::string DxilValidatorVersion is added to clang::CodeGenOptions.
Then CGHLSLRuntime is added to clang::CodeGenModule.
It is used to translate clang::CodeGenOptions::DxilValidatorVersion into a ModuleFlag under key "dx.valver" at end of clang code generation.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D123884
Pass the --compress-debug-sections=zlib argument to the linker when
the use of compressed debug info is requested.
Differential Revision: https://reviews.llvm.org/D114115
OpenMP recently moved to the new offloading driver, this had the effect
of making it more difficult to inspect intermediate code for the device.
This patch adds `-foffload-host-only` and `-foffload-device-only` to
control which sides get compiled. This will allow users to more easily
inspect output without needing the temp files.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D124220
When the denormal-fp-math option is used, this should set the
denormal handling mode for all floating point types. However,
currently 32-bit float types can ignore this setting as there is a
variant of the option, denormal-fp-math-f32, specifically for that type
which takes priority when checking the mode based on type and remains
at the default of IEEE. From the description, denormal-fp-math would
be expected to set the mode for floats unless overridden by the f32
variant, and code in the front end only emits the f32 option if it is
different to the general one, so setting just denormal-fp-math should
be valid.
This patch changes the denormal-fp-math option to also set the f32
mode. If denormal-fp-math-f32 is also specified, this is then
overridden as expected, but if it is absent floats will be set to the
mode specified by the former option, rather than remain on the default.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122589
This patch adds the basic support for the clang driver to compile and link CUDA
using the new offloading driver. This requires handling the CUDA offloading kind
and embedding the generated files into the host. This will allow us to link
OpenMP code with CUDA code in the linker wrapper. More support will be required
to create functional CUDA / HIP binaries using this method.
Depends on D120270 D120271 D120934
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D120272
In preparation for allowing other offloading kinds to use the new driver
a new opt-in flag `-foffload-new-driver` is added. This is distinct from
the existing `-fopenmp-new-driver` because OpenMP will soon use the new
driver by default while the others should not.
Reviewed By: yaxunl, tra
Differential Revision: https://reviews.llvm.org/D123325
In preparation for accepting other offloading kinds with the new driver,
this patch makes the way we handle offloading actions more generic. A
new field to get the associated device action's toolchain is used rather
than manually iterating a list. This makes building the arguments easier
and makes sure that we doin't rely on any implicit ordering.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D123313
This is a followup to D120158 which added an 'h' suffix to the
backend handling.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D124551
Pass the --compress-debug-sections=zlib argument to the linker when
the use of compressed debug info is requested.
Differential Revision: https://reviews.llvm.org/D114115
Summary:
We provide the `-f(no-)openmp-new-driver` option to allow users to use
the old or new driver. Previously this wasn't handled in the expected
way and only `-fno-openmp-new-driver` was checked. This patch fixes that
by using the `hasFlag` method as is standard.
Pass the --compress-debug-sections=zlib argument to the linker when
the use of compressed debug info is requested.
Differential Revision: https://reviews.llvm.org/D114115
Commit 5c90eca added some analyzer option checking, but a typo meant
it was redundantly checking PS4 and not adding checking for PS5.
With the test corrected, it identified the necessary driver updates,
added in this commit.
This patch adds 2 missing items required for `flang-new` to be able to
generate executables:
1. The Fortran_main runtime library, which implements the main entry
point into Fortran's `PROGRAM` in Flang,
2. Extra linker flags to include Fortran runtime libraries (e.g.
Fortran_main).
Fortran_main is the bridge between object files generated by Flang and
the C runtime that takes care of program set-up at system-level. For
every Fortran `PROGRAM`, Flang generates the `_QQmain` function.
Fortran_main implements the C `main` function that simply calls
`_QQmain`.
Additionally, "<driver-path>/../lib" directory is added to the list of
search directories for libraries. This is where the required runtime
libraries are currently located. Note that this the case for the build
directory. We haven't considered installation directories/targets yet.
With this change, you can generate an executable that will print `hello,
world!` as follows:
```bash
$ cat hello.f95
PROGRAM HELLO
write(*, *) "hello, world!"
END PROGRAM HELLO
$ flang-new -flang-experimental-exec hello.f95
./a.out
hello, world!
```
NOTE 1: Fortran_main has to be a static library at all times. It invokes
`_QQmain`, which is the main entry point generated by Flang for the
given input file (you can check this with `flang-new -S hello.f95 -o - |
grep "Qmain"`). This means that Fortran_main has an unresolved
dependency at build time. The linker will allow this for a static
library. However, if Fortran_main was a shared object, then the linker
will produce an error: `undefined symbol: `_QQmain`.
NOTE 2: When Fortran runtime libraries are generated as shared libraries
(excluding Fortran_main, which is always static), you will need to
tell the dynamic linker (by e.g. tweaking LD_LIBRARY_PATH) where to look
for them when invoking the executables. For example:
```bash
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<flang-build-dir>/lib/ ./a.out
```
NOTE 3: This feature is considered experimental and currently guarded
with a flag: `-flang-experimental-exec`.
Differential Revision: https://reviews.llvm.org/D122008
[1] https://github.com/flang-compiler/f18-llvm-project
CREDITS: Fortran_main was originally written by Eric Schweitz, Jean
Perier, Peter Klausler and Steve Scalpone in the fir-dev` branch in [1].
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Peter Klausler <pklausler@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Steve Scalpone <sscalpone@nvidia.com
When the -fdirectives-only option is used together with -E, the preprocessor
output reflects evaluation of if/then/else directives.
Thus it preserves macros that are still live after such processing.
This output can be consumed by a second compilation to produce a header unit.
We automatically invoke this (with -E) when we know that the job produces a
header unit so that the preprocessed output reflects the macros that will be
defined when the binary HU is emitted.
Differential Revision: https://reviews.llvm.org/D121591
Allow an invocation like clang -fmodule-header bar.h (which will be a C++
compilation, but using a header which will be recognised as a C one).
Also we do not want to produce:
"treating 'c-header' input as 'c++-header' when in C++ mode"
diagnostics when the user has been specific about the intent.
Differential Revision: https://reviews.llvm.org/D121590