Summary: With OpenMP offloading host compilation is done in two phases to capture host IR that is passed to all device compilations as input. But it turns out that we currently run entire LLVM optimization pipeline on host IR on both compilations which may have unpredictable effects on the resulting code. This patch fixes this problem by disabling LLVM passes on the first compilation, so the host IR that is passed to device compilations will be captured right after front end.
Reviewers: ABataev, jdoerfert, hfinkel
Reviewed By: ABataev
Subscribers: guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73721
Summary:
Fat object size has significantly increased after D65819 which changed bundler tool to add host object as a normal bundle to the fat output which almost doubled its size. That patch was fixing the following issues
1. Problems associated with the partial linking - global constructors were not called for partially linking objects which clearly resulted in incorrect behavior.
2. Eliminating "junk" target object sections from the linked binary on the host side.
The first problem is no longer relevant because we do not use partial linking for creating fat objects anymore. Target objects sections are now inserted into the resulting fat object with a help of llvm-objcopy tool.
The second issue, "junk" sections in the linked host binary, has been fixed in D73408 by adding "exclude" flag to the fat object's sections which contain target objects. This flag tells linker to drop section from the inputs when linking executable or shared library, therefore these sections will not be propagated in the linked binary.
Since both problems have been solved, we can revert D65819 changes to reduce fat object size and this patch essentially is doing that.
Reviewers: ABataev, alexshap, jdoerfert
Reviewed By: ABataev
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73642
Summary: This flag tells link editor to exclude section from linker inputs when linking executable or shared library.
Reviewers: ABataev, alexshap, jdoerfert
Reviewed By: ABataev
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73408
include Clang builtin headers even with -nostdinc
Some projects use -nostdinc, but need to access some intrinsics files when building specific files.
The new -ibuiltininc flag lets them use this flag when compiling these files to ensure they can
find Clang's builtin headers.
The use of -nobuiltininc after the -ibuiltininc flag does not add the builtin header
search path to the list of header search paths.
Differential Revision: https://reviews.llvm.org/D73500
This makes clang somewhat forward-compatible with new CUDA releases
without having to patch it for every minor release without adding
any new function.
If an unknown version is found, clang issues a warning (can be disabled
with -Wno-cuda-unknown-version) and assumes that it has detected
the latest known version. CUDA releases are usually supersets
of older ones feature-wise, so it should be sufficient to keep
released clang versions working with minor CUDA updates without
having to upgrade clang, too.
Differential Revision: https://reviews.llvm.org/D73231
Currently device lib path set by environment variable HIP_DEVICE_LIB_PATH
does not work due to extra "-L" added to each entry.
This patch fixes that by allowing argument name to be empty in addDirectoryList.
Differential Revision: https://reviews.llvm.org/D73299
This required some fixes to the generic code for two issues:
1. -fsanitize=safe-stack is default on x86_64-fuchsia and is *not* incompatible with -fsanitize=leak on Fuchisa
2. -fsanitize=leak and other static-only runtimes must not be omitted under -shared-libsan (which is the default on Fuchsia)
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D73397
See
https://docs.google.com/document/d/1xMkTZMKx9llnMPgso0jrx3ankI4cv60xeZ0y4ksf4wc/preview
for background discussion.
This adds a warning, flags and pragmas to limit the number of
pre-processor tokens either at a certain point in a translation unit, or
overall.
The idea is that this would allow projects to limit the size of certain
widely included headers, or for translation units overall, as a way to
insert backstops for header bloat and prevent compile-time regressions.
Differential revision: https://reviews.llvm.org/D72703
Using the same strategy as c38e42527b.
D69825 revealed (introduced?) a problem when building with ASan, and
some memory leaks somewhere. More details are available in the original
patch.
Looks like we missed one failing tests, this patch adds the workaround
to this test as well.
After rGb4a99a061f517e60985667e39519f60186cbb469, passing a response file such as -Wp,@a.rsp wasn't working anymore because .rsp expansion happens inside clang's main() function.
This patch adds response file expansion in the -cc1 tool.
Differential Revision: https://reviews.llvm.org/D73120
The issue was reported by @xazax.hun here: https://reviews.llvm.org/D69825#1827826
"This patch (D69825) breaks scan-build-py which parses the output of "-###" to get -cc1 command. There might be other tools with the same problems. Could we either remove (in-process) from CC1Command::Print or add a line break?
Having the last line as a valid invocation is valuable and there might be tools relying on that."
Differential Revision: https://reviews.llvm.org/D72982
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.
- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute
AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).
Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.
Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.
-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.
Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.
This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.
AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
Summary:
This change implements the expansion in two parts:
- Add a utility function emitAMDGPUPrintfCall() in LLVM.
- Invoke the above function from Clang CodeGen, when processing a HIP
program for the AMDGPU target.
The printf expansion has undefined behaviour if the format string is
not a compile-time constant. As a sufficient condition, the HIP
ToolChain now emits -Werror=format-nonliteral.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D71365
Flags are clang's default UI is flags.
We can have an env var in addition to that, but in D69825 nobody has yet
mentioned why this needs an env var, so omit it for now. If someone
needs to set the flag via env var, the existing CCC_OVERRIDE_OPTIONS
mechanism works for it (set CCC_OVERRIDE_OPTIONS=+-fno-integrated-cc1
for example).
Also mention the cc1-in-process change in the release notes.
Also spruce up the test a bit so it actually tests something :)
Differential Revision: https://reviews.llvm.org/D72769
These driver options perform some checking and delegate to MC options -x86-align-branch* and -x86-branches-within-32B-boundaries.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D72463
Fix riscv-toolchain-extra tests to pass when CLANG_RESOURCE_DIR is set
to another value than the default.
Differential Revision: https://reviews.llvm.org/D72591
With this patch, the clang tool will now call the -cc1 invocation directly inside the same process. Previously, the -cc1 invocation was creating, and waiting for, a new process.
This patch therefore reduces the number of created processes during a build, thus it reduces build times on platforms where process creation can be costly (Windows) and/or impacted by a antivirus.
It also makes debugging a bit easier, as there's no need to attach to the secondary -cc1 process anymore, breakpoints will be hit inside the same process.
Crashes or signaling inside the -cc1 invocation will have the same side-effect as before, and will be reported through the same means.
This behavior can be controlled at compile-time through the CLANG_SPAWN_CC1 cmake flag, which defaults to OFF. Setting it to ON will revert to the previous behavior, where any -cc1 invocation will create/fork a secondary process.
At run-time, it is also possible to tweak the CLANG_SPAWN_CC1 environment variable. Setting it and will override the compile-time setting. A value of 0 calls -cc1 inside the calling process; a value of 1 will create a secondary process, as before.
Differential Revision: https://reviews.llvm.org/D69825
which is the default TLS model for non-PIC objects. This allows large/
many thread local variables or a compact/fast code in an executable.
Specification is same as that of GCC. For example, the code model
option precedes the TLS size option.
TLS access models other than local-exec are not changed. It means
supoort of the large code model is only in the local exec TLS model.
Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>)
Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard
Reviewd By: peter.smith
Committed by: peter.smith
Differential Revision: https://reviews.llvm.org/D71688
All 130+ f_Group flags that take an argument allow it after a '=',
except for fdebug-complation-dir. Add a Joined<> alias so that
it behaves consistently with all the other f_Group flags.
(Keep the old Separate flag for backwards compat.)
Follow-up of D72014. It is more appropriate to use a target
feature instead of a SubTypeArch to express the difference.
Reviewed By: #powerpc, jhibbits
Differential Revision: https://reviews.llvm.org/D72433
In the backend, this feature is implemented with the function attribute
"patchable-function-entry". Both the attribute and XRay use
TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are
incompatible.
Reviewed By: ostannard, MaskRay
Differential Revision: https://reviews.llvm.org/D72222
Architecturally, it's allowed to have MVE-I without an FPU, thus
-mfpu=none should not disable MVE-I, or moves to/from FP-registers.
This patch removes `+/-fpregs` from features unconditionally added to
target feature list, depending on FPU and moves the logic to Clang
driver, where the negative form (`-fpregs`) is conditionally added to
the target features list for the cases of `-mfloat-abi=soft`, or
`-mfpu=none` without either `+mve` or `+mve.fp`. Only the negative
form is added by the driver, the positive one is derived from other
features in the backend.
Differential Revision: https://reviews.llvm.org/D71843
Apple's CPUs are called A7-A13 in official communication, occasionally with
weird suffixes which we probably don't need to care about. This adds each one
and describes its features. It also switches the default CPU to the canonical
name for Cyclone, but leaves legacy support in so that existing bitcode still
compiles.
According to D53384, the default was switched from -fno-PIC to -fPIC to
work around a -fsanitize=leak bug on big-endian.
This gratuitous difference between little-endian and big-endian is
undesired, and not acceptable on powerpc64-unknown-freebsd. If
-fsanitize=leak still has the problem, we should consider defaulting to
-fPIC/-fPIE only when -fsanitize=leak is specified (see SanitizerArgs::requiresPIE())
powerpc64-ibm-aix is unaffected: it still defaults to -fPIC.
powerpc64-linux-musl is unaffected (-fPIE since D39588): it still defaults to -fPIE.
Reviewed By: #powerpc, jhibbits
Differential Revision: https://reviews.llvm.org/D72363
Summary:
Every powerpc64le platform uses elfv2.
For powerpc64, the environments "elfv1" and "elfv2" were added for
FreeBSD ELFv1->ELFv2 migration in D61950. FreeBSD developers have
decided to use OS versions to select ABI, and no one is relying on the
environments.
Also use elfv2 on powerpc64-linux-musl.
Users can always use -mabi=elfv1 and -mabi=elfv2 to override the default
ABI.
Reviewed By: adalava
Differential Revision: https://reviews.llvm.org/D72352
Driver test `cross-linux.c` fails when CLANG_DEFAULT_RTLIB is "compiler-rt"
as the it expects a GCC-style `"crtbegin.o"` after `"crti.o"` but instead
receives something akin to this in the frontend invocation:
```
"crt1.o" "crti.o"
"/o/b/llvm/bin/../lib/clang/10.0.0/lib/linux/clang_rt.crtbegin-x86_64.o"
```
This patch adds an override to `cross-linux.c` tests so the expected result
is produced regardless of the compile-time default rtlib, as having tests
fail due to that is fairly confusing. After applying the patch, the test
passes regardless of the CLANG_DEFAULT_RTLIB setting.
Differential Revision: https://reviews.llvm.org/D72236
-mpacked-stack is currently not supported with -mbackchain, so this should
result in a compilation error message instead of being silently ignored.
Review: Ulrich Weigand
gcc/config/{i386,s390} support -mnop-mcount. We currently only support
-mnop-mcount for SystemZ. The function attribute "mnop-mcount" is
ignored on other targets.
gcc/config/{i386,s390} support -mfentry. We currently only support
-mfentry for X86 and SystemZ. TargetOpcode::FENTRY_CALL is not handled
on other targets.
% clang -target aarch64 -pg -mfentry a.c -c
fatal error: error in backend: Not supported instr: <MCInst 21>
-mfentry, -mrecord-mcount, and -mnop-mcount were invented for Linux
ftrace. Linux uses $(call cc-option-yn,-mrecord-mcount) to detect if the
specific feature is available. Reject unsupported features so that Linux
build system will not wrongly consider them available and cause
build/runtime failures.
Note, GCC has stricter checks that we do not implement, e.g. -fpic/-fpie
-fnop-mcount is not allowed on x86, -fpic/-fpie -mfentry is not allowed
on x86-32.
GCC's x86 and s390 ports support -mrecord-mcount. Other ports reject the
option.
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mrecord-mcount’
Allowing this option can cause failures when building Linux kernel for
aarch64, powerpc64, etc, which will think the feature is available if
the clang command returns 0.
On Darwin, when used for generating a linked binary from a source file
(through an intermediate object file), the driver will invoke `cc1` to
generate a temporary object file. The temporary remark file will now be
emitted next to the object file, which will then be picked up by
`dsymutil` and emitted in the .dSYM bundle.
This is available for all formats except YAML since by default, YAML
doesn't need a section and the remark file will be lost.
When clang is invoked with a source file without -c or -S, it creates a
cc1 job, a linker job and if debug info is requested, a dsymutil job. In
case of remarks, we should also create a dsymutil job to avoid losing
the remarks that will be generated in a tempdir that gets removed.
Differential Revision: https://reviews.llvm.org/D71675
Our build system does not handle randomly named files created during
the build well. We'd prefer to write compilation output directly
without creating a temporary file. Function parameters already
existed to control this behavior but were not exposed all the way out
to the command line.
Patch by Zachary Henkel!
Differential revision: https://reviews.llvm.org/D70615
The driver actually adds a default -mlinker-version, based on HOST_LINK_VERSION
cmake variable. The tests should be explicit about which version they're using to
trigger the right behavior.
In Xcode 11, ld added a new flag called -platform_version that can be used instead of the old -<platform>_version_min flags.
The new flag allows Clang to pass the SDK version from the driver to the linker.
This patch adopts the new -platform_version flag in Clang, and starts using it by default,
unless a linker version < 520 is passed to the driver.
Differential Revision: https://reviews.llvm.org/D71579
Summary:
Remove REQUIRES-ANY alias lit directive since it is hardly used and can
be easily implemented using an OR expression using REQUIRES. Fixup
remaining testcases still using REQUIRES-ANY.
Reviewers: probinson, jdenny, gparker42
Reviewed By: gparker42
Subscribers: eugenis, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, delcypher, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, cfe-commits, #sanitizers, llvm-commits
Tags: #llvm, #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D71408
GCC implicitly adds an .exe suffix if it is given an output file name,
but the file name doesn't contain a suffix, and there are certain
users of GCC that rely on this behaviour (and run into issues when
trying to use Clang instead of GCC). And MSVC's cl.exe also does the
same (but not link.exe).
However, GCC only does this when actually running on windows, not when
operating as a cross compiler.
As GCC doesn't have this behaviour when cross compiling, we definitely
shouldn't introduce the behaviour in such cases (as it would break
at least as many cases as this fixes).
Differential Revision: https://reviews.llvm.org/D71400
This matches https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
> -momit-leaf-frame-pointer
> -mno-omit-leaf-frame-pointer
>
> Omit or keep the frame pointer in leaf functions. The former behavior is the default.
-mno-omit-leaf-frame-pointer is currently a no-op because
TargetOptions::DisableFramePointerElim is only considered for non-leaf
functions.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71167
D39317 made clang use .init_array when no gcc installations is found.
This change changes all gcc installations to use .init_array .
GCC 4.7 by default stopped providing .ctors/.dtors compatible crt files,
and stopped emitting .ctors for __attribute__((constructor)).
.init_array should always work.
FreeBSD rules are moved to FreeBSD.cpp to make Generic_ELF rules clean.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71434
Very few ELF platforms still use .ctors/.dtors now. Linux (glibc: 1999-07),
DragonFlyBSD, FreeBSD (2012-03) and Solaris have supported .init_array
for many years. Some architectures like AArch64/RISC-V default to
.init_array . GNU ld and gold can even convert .ctors to .init_array .
It makes more sense to flip the CC1 default, and only uses
-fno-use-init-array on platforms that don't support .init_array .
For example, OpenBSD did not support DT_INIT_ARRAY before Aug 2016
(86fa57a279)
I may miss some ELF platforms that still use .ctors, but their
maintainers can easily diagnose such problems.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71393
Serialized remarks contain debug locations for each remark, by storing a
file path, a line, and a column.
Also, remarks support being embedded in a .dSYM bundle using a separate
section in object files, that is found by `dsymutil` through the debug
map.
In order for tools to map addresses to source and display remarks in the
source, we need line tables, and in order for `dsymutil` to find the
object files containing the remark section, we need to keep the debug
map around.
Differential Revision: https://reviews.llvm.org/D71325
This is a follow up patch to use the OpenMP-IR-Builder, as discussed on
the mailing list ([1] and later) and at the US Dev Meeting'19.
[1] http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim
Subscribers: ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69922
The -fsplit-dwarf-inlining option does not conform to DWARF5 standard.
It creates children for Skeleton compilation unit. We need default behavior
to be DWARF5 compatible. Thus set default state for -fsplit-dwarf-inlining
into "false".
Differential Revision: https://reviews.llvm.org/D71304
This adds a check for the usage of -foptimization-record-file with
multiple -arch options. This is not permitted since it would require us
to rename the file requested by the user to avoid overwriting it for the
second cc1 invocation.
Fixed the test case to set --sysroot, which lets it succeed in the case where
a directory named "/usr/include/c++/v1" or "/usr/local/include/c++/v1" exists.
Original commit message:
> The NDK uses a separate set of libc++ headers in the sysroot. Any headers
> in the installation directory are not going to work on Android, not least
> because they use a different name for the inline namespace (std::__1 instead
> of std::__ndk1).
>
> This effectively makes it impossible to produce a single toolchain that is
> capable of targeting both Android and another platform that expects libc++
> headers to be installed in the installation directory, such as Mac.
>
> In order to allow this scenario to work, stop looking for headers in the
> install directory on Android.
Differential Revision: https://reviews.llvm.org/D71154
The NDK uses a separate set of libc++ headers in the sysroot. Any headers
in the installation directory are not going to work on Android, not least
because they use a different name for the inline namespace (std::__1 instead
of std::__ndk1).
This effectively makes it impossible to produce a single toolchain that is
capable of targeting both Android and another platform that expects libc++
headers to be installed in the installation directory, such as Mac.
In order to allow this scenario to work, stop looking for headers in the
install directory on Android.
Differential Revision: https://reviews.llvm.org/D71154
Summary:
The HIP toolchain invokes `llc` without an explicit opt-level, meaning
it always uses the default (-O2). This makes it impossible to use -O1,
for example. The HIP toolchain also coerces -Os/-Oz to -O2 even when
invoking opt, and it coerces -Og to -O2 rather than -O1.
Forward the opt-level to `llc` as well as `opt`, and only coerce levels
where it is required.
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70987
Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048
The original patch is modified to set the strictfp IR attribute
explicitly in CodeGen instead of as a side effect of IRBuilder.
In the 2nd attempt to reapply there was a windows lit test fail, the
tests were fixed to use wildcard matching.
Differential Revision: https://reviews.llvm.org/D62731
This was hard-coded to "clang". This change allows it to to be used on
processes other than clang (such as lld).
This gets reported as clang-10 on Linux and clang.exe on Windows so
adapted test to accommodate this.
Differential Revision: https://reviews.llvm.org/D70950
Summary:
clang test Driver/darwin-opt-record.c assumes the default target is
x86_64 by its uses of the -arch x86_64 and -arch x86_64h and thus fail
on systems where it is not the case. Adding a target
x86_64-apple-darwin10 reveals another problem: the driver refuses 2
-arch for an assembly output so this commit also changes the output to
be an object file.
Reviewers: thegameg
Reviewed By: thegameg
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70748
Summary:
A skeleton of AIX toolchain and system linker support has been introduced in D68340, and this is a follow on patch to it.
This patch adds support to system assembler invocation to the AIX toolchain.
Reviewers: daltenty, hubert.reinterpretcast, jasonliu, Xiangling_L, dlj
Reviewed By: daltenty, hubert.reinterpretcast
Subscribers: wuzish, nemanjai, kbarton, jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69620
GCC 8 implements -fmacro-prefix-map. Like -fdebug-prefix-map, it replaces a string prefix for the __FILE__ macro.
-ffile-prefix-map is the union of -fdebug-prefix-map and -fmacro-prefix-map
Reviewed By: rnk, Lekensteyn, maskray
Differential Revision: https://reviews.llvm.org/D49466
Summary:
On SUSE distributions for 32-bit PowerPC, gcc is configured
as a 64-bit compiler using the GNU triplet "powerpc64-suse-linux",
but invoked with "-m32" by default. Thus, the correct GNU triplet
for 32-bit PowerPC SUSE distributions is "powerpc64-suse-linux"
and not "powerpc-suse-linux".
Reviewers: jrtc27, nemanjai, glaubitz
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D55326
When there's a wasm-opt in the PATH, run the it to optimize LLVM's
output. This fixes PR43796.
And, add an "llvm-lto" directory to the sysroot library search paths,
so that sysroots can provide LTO-enabled system libraries.
Differential Revision: https://reviews.llvm.org/D70500
In the GNU toolchain, `-static-libgcc` implies that the unwindlib will
be linked statically. However, when `--unwindlib=libunwind`, this flag is
ignored, and a bare `-lunwind` is added to the linker args. Unfortunately,
this means that if both `libunwind.so`, and `libunwind.a` are present
in the library path, `libunwind.so` will be chosen in all cases where
`-static` is not set.
This change makes `-static-libgcc` affect the `-l` flag produced by
`--unwindlib=libunwind`. After this patch, providing
`-static-libgcc --unwindlib=libunwind` will cause the driver to explicitly
emit `-l:libunwind.a` to statically link libunwind. For all other cases
it will emit `-l:libunwind.so` matching current behavior with a more
explicit link line.
https://reviews.llvm.org/D70416
If a GCC installation is not detected, then this attempts to
use compiler-rt and the compiler-rt crtbegin/crtend
implementations as a fallback.
Differential Revision: https://reviews.llvm.org/D68407
1. Currently only support the set of multilibs same to riscv-gnu-toolchain.
2. Fix testcase typo causes fail on Windows.
3. Fix testcases to set empty sysroot.
Reviewers: espindola, asb, kito-cheng, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D67508
We don't have a full sysroot yet, so for now we only include compiler
support and compiler-rt builtins, the rest of the runtimes will get
enabled later.
Differential Revision: https://reviews.llvm.org/D70477
1. Currently only support the set of multilibs same to riscv-gnu-toolchain.
2. Fix testcase typo causes fail on Windows
Reviewers: espindola, asb, kito-cheng, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D67508
Currently only support the set of multilibs same to riscv-gnu-toolchain.
Reviewers: espindola, asb, kito-cheng, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D67508
Summary:
This allows `clang` to be used to compile CUDA programs. Compiled
simple helloworld.cu with this.
Reviewers: dim, emaste, tra, yaxunl, ABataev
Reviewed By: tra
Subscribers: dim, emaste, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69990
When the driver is targeting multiple architectures at once, for things
like Universal Mach-Os, we need to emit different remark files for each
cc1 invocation to avoid overwriting the files from a different
invocation.
For example:
$ clang -c -o foo.o -fsave-optimization-record -arch x86_64 -arch x86_64h
will create two remark files:
* foo-x86_64.opt.yaml
* foo-x86_64h.opt.yaml
and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread.
This reverts commits af57dbf12e and e6584b2b7b
When the driver is targeting multiple architectures at once, for things
like Universal Mach-Os, we need to emit different remark files for each
cc1 invocation to avoid overwriting the files from a different
invocation.
For example:
$ clang -c -o foo.o -fsave-optimization-record -arch x86_64 -arch x86_64h
will create two remark files:
* foo-x86_64.opt.yaml
* foo-x86_64h.opt.yaml
For RISC-V the value provided to -march should determine whether to
compile for 32- or 64-bit RISC-V irrespective of the target provided to
the Clang driver. This adds a test for this flag for RISC-V and sets the
Target architecture correctly in these cases.
Differential Revision: https://reviews.llvm.org/D54214
Provides support for using r6-r11 as globally scoped
register variables. This requires a -ffixed-rN flag
in order to reserve rN against general allocation.
If for a given GRV declaration the corresponding flag
is not found, or the the register in question is the
target's FP, we fail with a diagnostic.
Differential Revision: https://reviews.llvm.org/D68862
There doesn't seem to be much sense in defaulting "on" unwind tables on
amd64 and not on other arches. It causes surprising differences between
platforms, such as the PR below[1].
Prior to this change, FreeBSD inherited the default implementation of the
method from the Gnu.h Generic_Elf => Generic_GCC parent class, which
returned true only for amd64 targets. Override that and opt on always,
similar to, e.g., NetBSD's driver.
[1] https://bugs.freebsd.org/241562
Patch by cem (Conrad Meyer).
Reviewed By: dim
Differential Revision: https://reviews.llvm.org/D70110
Summary:
Clang/LLVM is a cross-compiler, and so we don't have to make a choice
about `-march`/`-mabi` at build-time, but we may have to compute a
default `-march`/`-mabi` when compiling a program. Until now, each
place that has needed a default `-march` has calculated one itself.
This patch adds a single place where a default `-march` is calculated,
in order to avoid calculating different defaults in different places.
This patch adds a new function `riscv::getRISCVArch` which encapsulates
this logic based on GCC's for computing a default `-march` value
when none is provided. This patch also updates the logic in
`riscv::getRISCVABI` to match the logic in GCC's build system for
computing a default `-mabi`.
This patch also updates anywhere that `-march` is used to now use the
new function which can compute a default. In particular, we now
explicitly pass a `-march` value down to the gnu assembler.
GCC has convoluted logic in its build system to choose a default
`-march`/`-mabi` based on build options, which would be good to match.
This patch is based on the logic in GCC 9.2.0. This commit's logic is
different to GCC's only for baremetal targets, where we default
to rv32imac/ilp32 or rv64imac/lp64 depending on the target triple.
Tests have been updated to match the new logic.
Reviewers: asb, luismarques, rogfer01, kito-cheng, khchen
Reviewed By: asb, luismarques
Subscribers: sameer.abuasal, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69383
Before this patch if we pass "-mcpu=hexagonv65 -march=hexagon" in this order,
the driver fails to figure out the correct cpu version. This patch fixed this
issue.
Summary:
Tooling around DWARF 5 is still not mature enough for this to be a sane
default, and the AMDGPU and HIP toolchains should agree on a single
default.
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, aprantl, dstuttard, tpr, t-tye, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70191
Summary:
This adds `-mreference-types` and `-mno-reference-types` flags to clang
and make `-fwasm-exceptions` enables reference types feature in clang
and the backend.
Reviewers: tlively
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69832
If a GCC installation is not detected, then this attempts to
use compiler-rt and the compiler-rt crtbegin/crtend
implementations as a fallback.
Differential Revision: https://reviews.llvm.org/D68407
8548 CPU is GCC's name for the e500v2, so accept this in clang. The
e500v2 doesn't support lwsync, so define __NO_LWSYNC__ for this as well,
as GCC does.
Differential Revision: https://reviews.llvm.org/D67787
This started passing target-features on the linker line, not just for RISCV but
for all targets, leading to error messages in Chromium Android build:
'+soft-float-abi' is not a recognized feature for this target (ignoring feature)
'+soft-float-abi' is not a recognized feature for this target (ignoring feature)
See Phabricator review for details.
Reverting until this can be fixed properly.
> Summary:
> 1. enable LTO need to pass target feature and abi to LTO code generation
> RISCV backend need the target feature to decide which extension used in
> code generation.
> 2. move getTargetFeatures to CommonArgs.h and add ForLTOPlugin flag
> 3. add general tools::getTargetABI in CommonArgs.h because different target uses different
> way to get the target ABI.
>
> Patch by Kuan Hsu Chen (khchen)
>
> Reviewers: lenary, lewis-revill, asb, MaskRay
>
> Reviewed By: lenary
>
> Subscribers: hiraditya, dschuff, aheejin, fedor.sergeev, mehdi_amini, inglorion, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, cfe-commits
>
> Tags: #clang
>
> Differential Revision: https://reviews.llvm.org/D67409
Previously these were reported from the driver which blocked clang-scan-deps from getting the full set of dependencies from cc1 commands.
Also the default sanitizer blacklist that is added in driver was never reported as a dependency. I introduced -fsanitize-system-blacklist cc1 option to keep track of which blacklists were user-specified and which were added by driver and clang -MD now also reports system blacklists as dependencies.
Differential Revision: https://reviews.llvm.org/D69290
This flag decouples specifying the DWARF version from enabling/disabling
DWARF in general (or the gN level - gmlt/limited/standalone, etc) while
still allowing existing -gdwarf-N flags to override this default.
Patch by Caroline Tice!
Differential Revision: https://reviews.llvm.org/D69822
Add options to control floating point behavior: trapping and
exception behavior, rounding, and control of optimizations that affect
floating point calculations. More details in UsersManual.rst.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D62731
If a GCC installed is not detected, the driver would default to
the root of the filesystem. This is not ideal when this doesn't
match the install directory of the toolchain and can cause
undesireable behavior such as picking up system libraries or
the system linker when cross-compiling.
Differential Revision: https://reviews.llvm.org/D68391
The linker options (e.g. pragma detect_mismatch) are intended for host
compilation only, therefore disable it for device compilation.
Differential Revision: https://reviews.llvm.org/D57829
-mvzeroupper will force the vzeroupper insertion pass to run on
CPUs that normally wouldn't. -mno-vzeroupper disables it on CPUs
where it normally runs.
To support this with the default feature handling in clang, we
need a vzeroupper feature flag in X86.td. Since this flag has
the opposite polarity of the fast-partial-ymm-or-zmm-write we
used to use to disable the pass, we now need to add this new
flag to every CPU except KNL/KNM and BTVER2 to keep identical
behavior.
Remove -fast-partial-ymm-or-zmm-write which is no longer used.
Differential Revision: https://reviews.llvm.org/D69786
Add support for continuously syncing profile counter updates to a file.
The motivation for this is that programs do not always exit cleanly. On
iOS, for example, programs are usually killed via a signal from the OS.
Running atexit() handlers after catching a signal is unreliable, so some
method for progressively writing out profile data is necessary.
The approach taken here is to mmap() the `__llvm_prf_cnts` section onto
a raw profile. To do this, the linker must page-align the counter and
data sections, and the runtime must ensure that counters are mapped to a
page-aligned offset within a raw profile.
Continuous mode is (for the moment) incompatible with the online merging
mode. This limitation is lifted in https://reviews.llvm.org/D69586.
Continuous mode is also (for the moment) incompatible with value
profiling, as I'm not sure whether there is interest in this and the
implementation may be tricky.
As I have not been able to test extensively on non-Darwin platforms,
only Darwin support is included for the moment. However, continuous mode
may "just work" without modification on Linux and some UNIX-likes. AIUI
the default value for the GNU linker's `--section-alignment` flag is set
to the page size on many systems. This appears to be true for LLD as
well, as its `no_nmagic` option is on by default. Continuous mode will
not "just work" on Fuchsia or Windows, as it's not possible to mmap() a
section on these platforms. There is a proposal to add a layer of
indirection to the profile instrumentation to support these platforms.
rdar://54210980
Differential Revision: https://reviews.llvm.org/D68351
6bf55804 added special-case code for TY_PP_Fortran to
ToolChain::LookupTypeForExtension(), but
Darwin::LookupTypeForExtension() overrode that method without calling
the superclass implementation.
Make it call the superclass implementation to fix things.
Differential Revision: https://reviews.llvm.org/D69636
This adds a flag to LLVM and clang to always generate a .debug_frame
section, even if other debug information is not being generated. In
situations where .eh_frame would normally be emitted, both .debug_frame
and .eh_frame will be used.
Differential Revision: https://reviews.llvm.org/D67216
D63607 made mac builders unhappy by failing this test, and it isn't
yet obvious why. Mark as unsupported as a temporary measure.
Signed-off-by: Peter Waller <peter.waller@arm.com>
On Windows and macOS, the filesystem is case insensitive, and these files
interfere with each other. Reading through, the case of the file extension
is part of the test. I've altered the rest of the name instead.
This patch adds a new Flang mode. When in Flang mode, the driver will
invoke flang for fortran inputs instead of falling back to the GCC
toolchain as it would otherwise do.
The behaviour of other driver modes are left unmodified to preserve
backwards compatibility.
It is intended that a soon to be implemented binary in the flang project
will import libclangDriver and run the clang driver in the new flang
mode.
Please note that since the binary invoked by the driver is under
development, there will no doubt be further tweaks necessary in future
commits.
* Initial support is added for basic driver phases
* -E, -fsyntax-only, -emit-llvm -S, -emit-llvm, -S, (none specified)
* -### tests are added for all of the above
* This is more than is supported by f18 so far, which will emit errors
for those options which are unimplemented.
* A test is added that ensures that clang gives a reasonable error
message if flang is not available in the path (without -###).
* Test that the driver accepts multiple inputs in --driver-mode=flang.
* Test that a combination of C and Fortran inputs run both clang and
flang in --driver-mode=flang.
* clang/test/Driver/fortran.f95 is fixed to use the correct fortran
comment character.
Differential revision: https://reviews.llvm.org/D63607
Submitted for mcgrathr.
On AArch64, Fuchsia fully supports both SafeStack and ShadowCallStack ABIs.
The latter is now preferred and will be the default. It's possible to
enable both simultaneously, but ShadowCallStack is believed to have most
of the practical benefit of SafeStack with less cost.
Differential Revision: https://reviews.llvm.org/D66712
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.
Reviewers: thakis, rnk, theraven, pcc
Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65761
- Changed FileHandler read/write methods to return llvm::Error
- Using unified way of reporting errors
- Removed trailing '.' from the error messages
Differential Revision: https://reviews.llvm.org/D67031
- Changed FileHandler read/write methods to return llvm::Error
- Using unified way of reporting errors
- Removed trailing '.' from the error messages
Differential Revision: https://reviews.llvm.org/D67031
Summary:
- As variadic parameters have the lowest rank in overload resolution,
without real usage of `va_arg`, they are commonly used as the
catch-all fallbacks in SFINAE. As the front-end still reports errors
on calls to `va_arg`, the declaration of functions with variadic
arguments should be allowed in general.
Reviewers: jlebar, tra, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69389
This adds support for reserving GPRs such that the compiler will not
choose a register for register allocation. The implementation follows
the same design as for AArch64; each reserved register becomes a target
feature and used for getting the reserved registers for a given
MachineFunction. The backend checks that it does not need to write to
any reserved register; if it does a relevant error is generated.
Differential Revision: https://reviews.llvm.org/D67185
When the %m filename pattern is used, the filename is unique to each
image, so the cached value is wrong.
It struck me that the full filename isn't something that's recomputed
often, so perhaps it doesn't need to be cached at all. David Li pointed
out we can go further and just hide lprofCurFilename. This may regress
workflows that depend on using the set-filename API to change filenames
across all loaded DSOs, but this is expected to be very rare.
rdar://55137071
Differential Revision: https://reviews.llvm.org/D69137
llvm-svn: 375301
This is clang part of the patch. It adds -flto-unit flag for thin LTO
builds on Mac and PS4
Differential revision: https://reviews.llvm.org/D68950
llvm-svn: 375224
Summary:
The sign extension proposal was motivated by a desire to not have
separate sign-extending atomic operations, so it is meant to be
enabled when threads are used.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69075
llvm-svn: 375199
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.
Original commit message:
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.
This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.
To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.
The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.
This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.
To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.
I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.
On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.
I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.
I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).
Differential revision: https://reviews.llvm.org/D63932
llvm-svn: 375094
Since `-mfloat-abi=soft` is taken to mean turning off all uses of the
FP registers, it should turn off the MVE vector instructions as well
as NEON and scalar FP. But it wasn't doing so.
So the options `-march=armv8.1-m.main+mve.fp+fp.dp -mfloat-abi=soft`
would cause the underlying LLVM to //not// support MVE (because it
knows the real target feature relationships and turned off MVE when
the `fpregs` feature was removed), but the clang layer still thought
it //was// supported, and would misleadingly define the feature macro
`__ARM_FEATURE_MVE`.
The ARM driver code already has a long list of feature names to turn
off when `-mfloat-abi=soft` is selected. The fix is to add the missing
entries `mve` and `mve.fp` to that list.
Reviewers: dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69025
llvm-svn: 375001
The final list of OpenMP offload targets becomes known only at the link time and since offload registration code depends on the targets list it makes sense to delay offload registration code generation to the link time instead of adding it to the host part of every fat object. This patch moves offload registration code generation from clang to the offload wrapper tool.
This is the last part of the OpenMP linker script elimination patch https://reviews.llvm.org/D64943
Differential Revision: https://reviews.llvm.org/D68746
llvm-svn: 374937
Remove REQUIRES and only keep the clang driver tests, since the
assembler are already tested with -Wa,--no-warn. This way we could run
the test on non-linux platforms and catch breaks on them.
llvm-svn: 374932
Summary:
Currently clang does not support -Wa,-W, which suppresses warning
messages in GNU assembler. Add this option for gcc compatibility.
https://bugs.llvm.org/show_bug.cgi?id=43651. Reland with differential
information.
Reviewers: bcain
Reviewed By: bcain
Subscribers: george.burgess.iv, gbiv, llozano, manojgupta, nickdesaulniers, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68884
llvm-svn: 374834
Currently clang does not support -Wa,-W, which suppresses warning
messages in GNU assembler. Add this option for gcc compatibility.
https://bugs.llvm.org/show_bug.cgi?id=43651
llvm-svn: 374822
Summary:
This patch restores the behaviour that -fpu overwrites the
architecture obtained from -march or -mcpu flags, not enforcing to
disable 'crypto' if march=armv7 and mfpu=neon-fp-armv8.
However, it does warn that 'crypto' is ignored when passing
mfpu=crypto-neon-fp-armv8.
Reviewers: peter.smith, labrinea
Reviewed By: peter.smith
Subscribers: nickdesaulniers, kristof.beyls, dmgreen, cfe-commits, krisb
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67608
llvm-svn: 374785
Summary:
1. enable LTO need to pass target feature and abi to LTO code generation
RISCV backend need the target feature to decide which extension used in
code generation.
2. move getTargetFeatures to CommonArgs.h and add ForLTOPlugin flag
3. add general tools::getTargetABI in CommonArgs.h because different target uses different
way to get the target ABI.
Patch by Kuan Hsu Chen (khchen)
Reviewers: lenary, lewis-revill, asb, MaskRay
Reviewed By: lenary
Subscribers: hiraditya, dschuff, aheejin, fedor.sergeev, mehdi_amini, inglorion, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67409
llvm-svn: 374774
That way, lit's builtin 'env' command can be used for the 'env' bit.
Also it's clearer that way that the 'not' shouldn't cover 'env'
failures.
llvm-svn: 374749
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.
This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.
To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.
The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.
This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.
To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.
I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.
On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.
I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.
I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).
Differential revision: https://reviews.llvm.org/D63932
llvm-svn: 374539
I noticed that compiling on Windows with -fno-ms-compatibility had the
side effect of defining __GNUC__, along with __GNUG__, __GXX_RTTI__, and
a number of other macros for GCC compatibility. This is undesirable and
causes Chromium to do things like mix __attribute__ and __declspec,
which doesn't work. We should have a positive language option to enable
GCC compatibility features so that we can experiment with
-fno-ms-compatibility on Windows. This change adds -fgnuc-version= to be
that option.
My issue aside, users have, for a long time, reported that __GNUC__
doesn't match their expectations in one way or another. We have
encouraged users to migrate code away from this macro, but new code
continues to be written assuming a GCC-only environment. There's really
nothing we can do to stop that. By adding this flag, we can allow them
to choose their own adventure with __GNUC__.
This overlaps a bit with the "GNUMode" language option from -std=gnu*.
The gnu language mode tends to enable non-conforming behaviors that we'd
rather not enable by default, but the we want to set things like
__GXX_RTTI__ by default, so I've kept these separate.
Helps address PR42817
Reviewed By: hans, nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D68055
llvm-svn: 374449
This patch removes the remaining part of the OpenMP offload linker scripts which was used for inserting device binaries into the output linked binary. Device binaries are now inserted into the host binary with a help of the wrapper bit-code file which contains device binaries as data. Wrapper bit-code file is dynamically created by the clang driver with a help of new tool clang-offload-wrapper which takes device binaries as input and produces bit-code file with required contents. Wrapper bit-code is then compiled to an object and resulting object is appended to the host linking by the clang driver.
This is the second part of the patch for eliminating OpenMP linker script (please see https://reviews.llvm.org/D64943).
Differential Revision: https://reviews.llvm.org/D68166
llvm-svn: 374219
Currently clang does not save some of the intermediate file generated during device compilation for HIP when -save-temps is specified.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D68665
llvm-svn: 374198
These were added to the MS docs in
85b9b6967e
and are supposedly available in VS 2019 16.4 (though my 2019 Preview,
version 16.4.0-pre.1.0 don't seem to have them.)
llvm-svn: 373887
Some Driver tests relied on the default resource direcory having per-os per-arch
subdirectory layout, and when clang is built with `-DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=ON`,
those test fail, because clang by default assumes per-target subdirectories.
Explicitly set `-resource-dir` flag to point to a tree with per-os per-arch layout.
See also: D45604, D62469
Differential Revision: https://reviews.llvm.org/D66981
Patch by Sergej Jaskiewicz <jaskiewiczs@icloud.com>.
llvm-svn: 373565
Sometimes it is useful to compile HIP device code to LLVM BC. It is not convenient to use clang -cc1 since
there are lots of options needed.
This patch allows clang driver to compile HIP device code to LLVM BC with -emit-llvm -c.
Differential Revision: https://reviews.llvm.org/D68284
llvm-svn: 373561
Summary:
To trigger the index-only Whole Program Devirt support added to LLVM, we
need to be able to specify -fno-split-lto-unit in conjunction with
-fwhole-program-vtables. Keep the default for -fwhole-program-vtables as
-fsplit-lto-unit, but don't error on that option combination.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68029
llvm-svn: 373370
When testing clang that has been compiled with `-DDEFAULT_SYSROOT` set to some path,
some tests would fail. Override sysroot to be empty string for the tests to succeed
when clang is configured with `DEFAULT_SYSROOT`.
Differential Revision: https://reviews.llvm.org/D66834
Patch by Sergej Jaskiewicz <jaskiewiczs@icloud.com>.
llvm-svn: 373147
Linker automatically provides __start_<section name> and __stop_<section name> symbols to satisfy unresolved references if <section name> is representable as a C identifier (see https://sourceware.org/binutils/docs/ld/Input-Section-Example.html for details). These symbols indicate the start address and end address of the output section respectively. Therefore, renaming OpenMP offload entries section name from ".omp.offloading_entries" to "omp_offloading_entries" to use this feature.
This is the first part of the patch for eliminating OpenMP linker script (please see https://reviews.llvm.org/D64943).
Differential Revision: https://reviews.llvm.org/D68070
llvm-svn: 373118
The only functional change here is that -coverage-notes-file is not
passed to -cc1 in some situations.
This code appears to be trying to put the gcno and gcda output next to
the final object file, but it's doing that in a really convoluted way
that needs to be re-examined. It looks for -c or -S in the original
command, and then looks at the -o argument if present in order to handle
the -fno-integrated-as case. However, this doesn't work if this is a
link command with multiple inputs. I looked into fixing this, but the
check-profile test suite has a lot of dependencies on this behavior, so
I left it all alone.
llvm-svn: 373004
You can't use -fno-integrated-as for *-msvc triples because no usable
standalone assembler exists. Perhaps we could teach clang to emit a .s
and then reinvoke itself, but that's a bit silly.
Anyway, fix the test by using an Itanium ABI triple, which will become
mingw, which will assume gnu as is a usable assembler.
llvm-svn: 372994
The recently announced IBM z15 processor implements the architecture
already supported as "arch13" in LLVM. This patch adds support for
"z15" as an alternate architecture name for arch13.
Corrsponding LLVM support was committed as rev. 372435.
llvm-svn: 372436
We need "xgot" flag in the MipsAsmParser to implement correct expansion
of some pseudo instructions in case of using 32-bit GOT (XGOT).
MipsAsmParser does not have reference to MipsSubtarget but has a
reference to "feature bit set".
llvm-svn: 372220
This is both for consistency with other `mkdir`s in tests, and
fixing permission issues with the non-temporary cwd during testing (they
are not always writable).
llvm-svn: 371897
- When using -o, the provided filename is using for constructing the depfile
name (when -MMD is passed).
- The logic looks for the rightmost '.' character and replaces what comes after
with 'd'.
- This works incorrectly when the filename has no extension and the directories
have '.' in them (e.g. out.dir/test)
- This replaces the funciton to just llvm::sys::path functionality
Differential Revision: https://reviews.llvm.org/D67542
llvm-svn: 371853
Summary:
This adds `-fwasm-exceptions` (in similar fashion with
`-fdwarf-exceptions` or `-fsjlj-exceptions`) that turns on everything
with wasm exception handling from the frontend to the backend.
We currently have `-mexception-handling` in clang frontend, but this is
only about the architecture capability and does not turn on other
necessary options such as the exception model in the backend. (This can
be turned on with `llc -exception-model=wasm`, but llc is not invoked
separately as a command line tool, so this option has to be transferred
from clang.)
Turning on `-fwasm-exceptions` in clang also turns on
`-mexception-handling` if not specified, and will error out if
`-mno-exception-handling` is specified.
Reviewers: dschuff, tlively, sbc100
Subscribers: aprantl, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67208
llvm-svn: 371708
Submittin in behalf of krisb (Kristina Bessonova) <ch.bessonova@gmail.com>
Summary:
'+crypto' means '+aes' and '+sha2' for arch >= ARMv8 when they were
not disabled explicitly. But this is correctly handled only in case of
'-march' option, though the feature may also be specified through
the '-mcpu' or '-mfpu' options. In the following example:
$ clang -mcpu=cortex-a57 -mfpu=crypto-neon-fp-armv8
'aes' and 'sha2' are disabled that is quite unexpected:
$ clang -cc1 -triple armv8--- -target-cpu cortex-a57
<...> -target-feature -sha2 -target-feature -aes -target-feature +crypto
This exposed by https://reviews.llvm.org/D63936 that makes
the 'aes' and 'sha2' features disabled by default.
So, while handling the 'crypto' feature we need to take into account:
- a CPU name, as it provides the information about architecture
(if no '-march' option specified),
- features, specified by the '-mcpu' and '-mfpu' options.
Reviewers: SjoerdMeijer, ostannard, labrinea, dnsampaio
Reviewed By: dnsampaio
Subscribers: ikudrin, javed.absar, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66018
Author: krisb
llvm-svn: 371597
This reverts r371497 (git commit 3d7e9ab7b9)
Reorder `not` with `env` in these two tests so they pass:
Driver/rewrite-map-in-diagnostics.c
Index/crash-recovery-modules.m.
This will not be necessary after D66531 lands.
llvm-svn: 371552
LLDB reads the various .apple* accelerator tables (and in the near
future: the DWARF 5 accelerator tables) which should make
.gnu_pubnames redundant. This changes the Clang driver to no longer
pass -ggnu-pubnames when tuning for LLDB.
Thanks to David Blaikie for pointing this out!
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/thread.html#646062
rdar://problem/50142073
Differential Revision: https://reviews.llvm.org/D67373
llvm-svn: 371530
When running clang as a native compiler in RISC-V Linux the flag
-mabi=ilp32d / -mabi=lp64d is always mandatory. This change makes it the
default there.
Differential Revision: https://reviews.llvm.org/D65634
llvm-svn: 371494
I see in the history for some of these tests REQUIRES:shell was used as
a way to disable tests on Windows because they are flaky there. I tried
not to re-enable such tests, but it's possible that I missed some and
this will re-enable flaky tests on Windows. If so, we should disable
them with UNSUPPORTED:system-windows and add a comment that they are
flaky there. So far as I can tell, the lit internal shell is capable of
running all of these tests, and we shouldn't use REQUIRES:shell as a
proxy for Windows.
llvm-svn: 371478