This cc1 option -fallow-half-arguments-and-returns allows __fp16 to be
passed by argument and returned, without giving an error. It is
currently always enabled for Arm and AArch64, by forcing the option in
the driver. This means any cc1 tests (especially those needing
arm_neon.h) need to specify the option too, to prevent the error from
being emitted.
This changes it to a target option instead, set to true for Arm and
AArch64. This allows the option to be removed. Previously it was implied
by -fnative_half_arguments_and_returns, which is set for certain
languages like open_cl, renderscript and hlsl, so that option now too
controls the errors. There were are few other non-arm uses of
-fallow-half-arguments-and-returns but I believe they were unnecessary.
The strictfp_builtins.c tests were converted from __fp16 to _Float16 to
avoid the issues.
Differential Revision: https://reviews.llvm.org/D133885
The /Zl flag omits default C runtime library name from obj files.
This patch just adds an equivalent clang driver flag.
Differential Revision: https://reviews.llvm.org/D133959
This can be used to disable vectorization and memcpy/memset
expansion for things like OS kernels. It also disables implicit
uses of scalar FP, but I don't know if we have any of those for
RISC-V.
NOTE: Without this patch you can still do -Xclang -no-implicit-float
Reviewed By: rui.zhang
Differential Revision: https://reviews.llvm.org/D134077
Having the flags only pass through if you're using the dxc-driver means
that the clang driver doesn't work for HLSL, which is undesirable. This
change switches to instead passing flags based on the language mode
similar to how OpenCL does it. This allows the clang driver to be used
for HLSL source files as well.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D133958
When ‘ffast-math’ is set, ffp-contract is altered this way:
-ffast-math/ Ofast -> ffp-contract=fast
-fno-fast-math -> if ffp-contract= fast then ffp-contract=on else
ffp-contract unchanged
This differs from gcc which doesn’t connect the two options.
Connecting these two options in clang, resulted in spurious warnings
when the user combines these two options -ffast-math -fno-fast-math; see
issue https://github.com/llvm/llvm-project/issues/54625.
The issue is that the ‘ffast-math’ option is an on/off flag, but the
‘ffp-contract’ is an on/off/fast flag. So when ‘fno-fast-math’ is used
there is no obvious value for ‘ffp-contract’. What should the value of
ffp-contract be for -ffp-contract=fast -fno-fast-math and -ffast-math
-ffp-contract=fast -fno-fast-math? The current logic sets ffp-contract
back to on in these cases. This doesn’t take into account that the value
of ffp-contract is modified by an explicit ffp-contract` option.
This patch is proposing a set of rules to apply when ffp-contract',
ffast-math and fno-fast-math are combined. These rules would give the
user the expected behavior and no diagnostic would be needed.
See RFC
https://discourse.llvm.org/t/rfc-making-ffast-math-option-unrelated-to-ffp-contract-option/61912
Previously, we linked in the ROCm device libraries which provide math
and other utility functions late. This is not stricly correct as this
library contains several flags that are only set per-TU, such as fast
math or denormalization. This patch changes this to pass the bitcode
libraries per-TU using the same method we use for the CUDA libraries.
This has the advantage that we correctly propagate attributes making
this implementation more correct. Additionally, many annoying unused
functions were not being fully removed during LTO. This lead to
erroneous warning messages and remarks on unused functions.
I am not sure if not finding these libraries should be a hard error. let
me know if it should be demoted to a warning saying that some device
utilities will not work without them.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D133726
The old device runtime had a "simplified" version that prevented many of
the runtime features from being initialized. The old device runtime was
deleted in LLVM 14 and is no longer in use. Selectively deactivating
features is now done using specific flags rather than the old technique.
This patch simply removes the extra logic required for handling the old
simple runtime scheme.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D133802
Two new dxc mode options -O and -Od are added for dxc mode.
-O is just alias of existing cc1 -O option.
-Od will be lowered into -O0 and -dxc-opt-disable.
-dxc-opt-disable is cc1 option added to for build ShaderFlags.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D128845
The new driver currently crashses when attempting to use the
'-fsyntax-only' option. This is because the option causes all output to
be given the `TY_Nothing' type which should signal the end of the
pipeline. The new driver was not treating this correctly and attempting
to use empty input. This patch fixes the handling so we do not attempt
to continue when the input is nothing.
One concession is that we must now check when generating the arguments
for Clang if the input is of 'TY_Nothing'. This is because the new
driver will only create code if the device code is a dependency on the
host, creating the output without the dependency would require a
complete rewrite of the logic as we do not maintain any state between
calls to 'BuildOffloadingActions' so I believe this is the most
straightforward method.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D133161
Summary:
A previous patch removed support for the `-fopenmp-new-driver` and
accidentally used the `isHostOffloading` flag instead of
`isDeviceOffloading` which lead to some build errors when compiling for
the offloading device. This patch addresses that.
The changes in D130020 removed all support for the old method of
compiling OpenMP offloading programs. This means that
`-fopenmp-new-driver` has no effect and `-fno-openmp-new-driver` does
not work. This patch removes the use and documentation of this flag.
Note that the `--offload-new-driver` flag still exists for using the new
driver optionally with CUDA and HIP.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D133367
This reverts commit 33162a81d4.
This change breaks the usage of module maps with modules disabled, such
as for layering checking via `-fmodules-decluse`.
Regression test added.
Recently OpenMP has transitioned to using the "new" driver which
primarily merges the device and host linking phases into a single
wrapper that handles both at the same time. This replaced a few tools
that were only used for OpenMP offloading, such as the
`clang-offload-wrapper` and `clang-nvlink-wrapper`. The new driver
carries some marked benefits compared to the old driver that is now
being deprecated. Things like device-side LTO, static library
support, and more compatible tooling. As such, we should be able to
completely deprecate the old driver, at least for OpenMP. The old driver
support will still exist for CUDA and HIP, although both of these can
currently be compiled on Linux with `--offload-new-driver` to use the new
method.
Note that this does not deprecate the `clang-offload-bundler`, although
it is unused by OpenMP now, it is still used by the HIP toolchain both
as their device binary format and object format.
When I proposed deprecating this code I heard some vendors voice
concernes about needing to update their code in their fork. They should
be able to just revert this commit if it lands.
Reviewed By: jdoerfert, MaskRay, ye-luo
Differential Revision: https://reviews.llvm.org/D130020
The OpenMP device runtime needs to support the OpenMP standard. However
constructs like nested parallelism are very uncommon in real application
yet lead to complexity in the runtime that is sometimes difficult to
optimize out. As a stop-gap for performance we should supply an argument
that selectively disables this feature. This patch adds the
`-fopenmp-assume-no-nested-parallelism` argument which explicitly
disables the usee of nested parallelism in OpenMP.
Reviewed By: carlo.bertolli
Differential Revision: https://reviews.llvm.org/D132074
With the initial support added, clang can compile `helloworld` C
to executable file for loongarch64. For example:
```
$ cat hello.c
int main() {
printf("Hello, world!\n");
return 0;
}
$ clang --target=loongarch64-unknown-linux-gnu --gcc-toolchain=xxx --sysroot=xxx hello.c
```
The output a.out can run within qemu or native machine. For example:
```
$ file ./a.out
./a.out: ELF 64-bit LSB pie executable, LoongArch, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-loongarch-lp64d.so.1, for GNU/Linux 5.19.0, with debug_info, not stripped
$ ./a.out
Hello, world!
```
Currently gcc toolchain and sysroot can be found here:
https://github.com/loongson/build-tools/releases/download/2022.08.11/loongarch64-clfs-5.1-cross-tools-gcc-glibc.tar.xz
Reference: https://github.com/loongson/LoongArch-Documentation
The last commit hash (main branch) is:
99016636af64d02dee05e39974d4c1e55875c45b
Note loongarch32 is not fully tested because there is no reference
gcc toolchain yet.
Differential Revision: https://reviews.llvm.org/D130255
This patch does the following:
- Consumes the PIC flags (fPIC/fPIE/fropi/frwpi etc) in flang-new.
tools::ParsePICArgs() in ToolChains/CommonArgs.cpp is used for this.
- Adds FC1Option to "-mrelocation-model", "-pic-level", and "-pic-is-pie"
command line options.
- Adds the above options to flang/Frontend/CodeGenOptions' data structure.
- Sets the relocation model in the target machine, and
- Sets module flags for the respective PIC/PIE type in LLVM IR.
I have tried my best to replicate how clang does things.
Differential Revision: https://reviews.llvm.org/D131533
Change-Id: I68fe64910be28147dc5617826641cea71b92d94d
The previous implementation translated from names like sifive-7-series
to sifive-7-rv32 or sifive-7-rv64. This also required sifive-7-rv32
and sifive-7-rv64 to be valid CPU names. As those are not real
CPUs it doesn't make sense to accept them in -mcpu.
This patch does away with the translation and adds sifive-7-series
directly to RISCV.td. Removing sifive-7-rv32 and sifive-7-rv64.
sifive-7-series is only allowed in -mtune.
I've also added "rocket" to RISCV.td but have not removed rocket-rv32
or rocket-rv64.
To prevent -mcpu=sifive-7-series or -mcpu=rocket being used with llc,
I've added a Feature32Bit to all rv32 CPUs. And made it an error to
have an rv32 triple without Feature32Bit. sifive-7-series and rocket
do not have Feature32Bit or Feature64Bit set so the user would need
to provide -mattr=+32bit or -mattr=+64bit along with the -mcpu to
avoid the error.
SiFive no longer names their newer products with 3, 5, or 7 series.
Instead we have p200 series, x200 series, p500 series, and p600 series.
Following the previous behavior would require a sifive-p500-rv32 and
sifive-p500-rv64 in order to support -mtune=sifive-p500-series. There
is currently no p500 product, but it could start getting confusing if
there was in the future.
I'm open to hearing alternatives for how to achieve my main goal
of removing sifive-7-rv32/rv64 as a CPU name.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D131708
Add the clang option -finline-max-stacksize=<N> to suppress inlining
of functions whose stack size exceeds the given value.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D131986
When not set output, set default output to stdout.
When set output with -Fo and no -fcgl, set -emit-obj to generate dx container.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D130858
-E option will set entry function for hlsl.
The format is -E entry_name.
To avoid conflict with existing option with name 'E', add an extra prefix '--'.
A new field HLSLEntry is added to TargetOption.
To share code with HLSLShaderAttr, entry function will be add HLSLShaderAttr attribute too.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D124751
I went over the output of the following mess of a command:
(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Differential Revision: https://reviews.llvm.org/D130827
Closing https://github.com/llvm/llvm-project/issues/56803. The root
cause for this bug is that we lack a good method to detect the language
mdoe when parsing the command line. There is a FIXME too. Dut to we lack
a good solution now, keep the workaround.
To make use of SPARC support in `getHostCPUName` as implemented by D130272
<https://reviews.llvm.org/D130272>, this patch uses it to handle
`-mcpu=native` and `-mtune=native`. To match GCC, this patch rejects
`-march` instead of silently treating it as a no-op.
Tested on `sparcv9-sun-solaris2.11` and checking that those options are
passed on as `-target-cpu` resp. `-tune-cpu` as expected.
Differential Revision: https://reviews.llvm.org/D130273
Summary:
Linkers use `--verbose` to let users investigate search libraries among
other things. The linker wrapper was incorrectly not forwarding this to
the linker job. This patch simply renames this so users can still see
verbose messages from the linker if it was passed.
We are supporting quadword lock free atomics on AIX. For the situation that users on AIX are using a libatomic that is lock-based for quadword types, we can't enable quadword lock free atomics by default on AIX in case user's new code and existing code accessing the same shared atomic quadword variable, we can't guarentee atomicity. So we need an option to enable quadword lock free atomics on AIX, thus we can build a quadword lock-free libatomic(also for advanced users considering atomic performance critical) for users to make the transition smooth.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D127189
-gsplit-dwarf produces a .dwo file which will not be processed by the linker. If
.dwo files contain relocations, they will not be resolved. Therefore the
practice is that .dwo files do not contain relocations.
Address ranges and location description need to use forms/entry kinds indexing
into .debug_addr (DW_FORM_addrx/DW_RLE_startx_endx/etc), which is currently not
implemented.
There is a difficult-to-read MC error with -gsplit-dwarf with RISC-V for both -mrelax and -mno-relax.
```
% clang --target=riscv64-linux-gnu -g -gsplit-dwarf -c a.c
error: A dwo section may not contain relocations
```
We expect to fix -mno-relax soon, so report a driver error for -mrelax for now.
Link: https://github.com/llvm/llvm-project/issues/56642
Reviewed By: compnerd, kito-cheng
Differential Revision: https://reviews.llvm.org/D130190
Adds `sarif` option to the existing `-fdiagnostics-format` flag
for intended future work with SARIF diagnostics. Currently issues a warning
against the use of diagnostics in SARIF mode, then defaults to clang style for
diagnostics.
Reviewed By: cjdb, denik, aaron.ballman
Differential Revision: https://reviews.llvm.org/D129886
The new driver primarily allows us to support RDC-mode compilations with
proper linking. This is not needed for non-RDC mode compilation, but we
still would like the new driver to be able to handle this mode so we can
transition away from the old driver in the future. This patch adds the
necessary code to support creating a fatbinary for HIP code generation.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D129784