Summary:
This patch adds assembly-level support for a new Arm M-profile
architecture extension, Custom Datapath Extension (CDE).
A brief description of the extension is available at
https://developer.arm.com/architectures/instruction-sets/custom-instructions
The latest specification for CDE is currently a beta release and is
available at
https://static.docs.arm.com/ddi0607/aa/DDI0607A_a_armv8m_arm_supplement_cde.pdf
CDE allows chip vendors to add custom CPU instructions. The CDE
instructions re-use the same encoding space as existing coprocessor
instructions (such as MRC, MCR, CDP etc.). Each coprocessor in range
cp0-cp7 can be configured as either general purpose (GCP) or custom
datapath (CDEv1). This configuration is defined by the CPU vendor and
is provided to LLVM using 8 subtarget features: cdecp0 ... cdecp7.
The semantics of CDE instructions are implementation-defined, but the
instructions are guaranteed to be pure (that is, they are stateless,
they do not access memory or any registers except their explicit
inputs/outputs).
CDE requires the CPU to support at least Armv8.0-M mainline
architecture. CDE includes 3 sets of instructions:
* Instructions that operate on general purpose registers and NZCV
flags
* Instructions that operate on the S or D register file (require
either FP or MVE extension)
* Instructions that operate on the Q register file, require MVE
The user-facing names that can be specified on the command line are
the same as the 8 subtarget feature names. For example:
$ clang -target arm-none-none-eabi -march=armv8m.main+cdecp0+cdecp3
tells the compiler that the coprocessors 0 and 3 are configured as
CDEv1 and the remaining coprocessors are configured as GCP (which is
the default).
Reviewers: simon_tatham, ostannard, dmgreen, eli.friedman
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74044
This reverts commit 0a1123eb43.
Want to revert this because it's causing trouble for PowerPC
I also fixed test fp-model.c which was looking for an incorrect error message
Summary: Adds the RedHat Linux triple to the list of 64-bit RISC-V triples.
Without this the gcc libraries wouldn't be found by clang on a redhat/fedora
system, as the search list included `/usr/lib/gcc/riscv64-redhat-linux-gnu`
but the correct path didn't include the `-gnu` suffix.
Reviewers: lenary, asb, dlj
Reviewed By: lenary
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74399
When the clang baremetal driver selects the rt.builtins static library
it prefix with "-l" and appends ".a". The result is a nonsense option
which lld refuses to accept.
Differential Revision: https://reviews.llvm.org/D73904
Change-Id: Ic753b6104e259fbbdc059b68fccd9b933092d828
As discussed in https://reviews.llvm.org/D74447, this patch disables integrated-cc1 behavior if there's more than one job to be executed. This is meant to limit memory bloating, given that currently jobs don't clean up after execution (-disable-free is always active in cc1 mode).
I see this behavior as temporary until release 10.0 ships (to ease merging of this patch), then we'll reevaluate the situation, see if D74447 makes more sense on the long term.
Differential Revision: https://reviews.llvm.org/D74490
This is to avoid performance regressions when the default attribute
behavior is fixed to assume ieee.
I tested the default on x86_64 ubuntu, which seems to default to
FTZ/DAZ, but am guessing for x86 and PS4.
This reverts commit 99c5bcbce8.
Change clang option -ffp-model=precise to select ffp-contract=on
Including some small touch-ups to the original commit
Reviewers: rjmccall, Andy Kaylor
Differential Revision: https://reviews.llvm.org/D74436
The function attributes xray-skip-entry, xray-skip-exit, and
xray-ignore-loops were only being applied if a function had an
xray-instrument attribute, but they should apply if xray is enabled
globally too.
Differential Revision: https://reviews.llvm.org/D73842
This patch adds the support required for using the __riscv_save and
__riscv_restore libcalls to implement a size-optimization for prologue
and epilogue code, whereby the spill and restore code of callee-saved
registers is implemented by common functions to reduce code duplication.
Logic is also included to ensure that if both this optimization and
shrink wrapping are enabled then the prologue and epilogue code can be
safely inserted into the basic blocks chosen by shrink wrapping.
Differential Revision: https://reviews.llvm.org/D62686
Added a test for #pragma clang __debug llvm_fatal_error to test for the original issue.
Added llvm::sys::Process::Exit() and replaced ::exit() in places where it was appropriate. This new function would call the current CrashRecoveryContext if one is running on the same thread; or call ::exit() otherwise.
Fixes PR44705.
Differential Revision: https://reviews.llvm.org/D73742
-march=armv8.1-m.main+mve.fp+nomve -mfpu=none should disable FP
registers and instructions moving to/from FP registers.
This patch fixes the case when "+mve" (added to the feature list by
"+mve.fp"), is followed by "-mve" (added by "+nomve").
Differential Revision: https://reviews.llvm.org/D72633
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.
Differential Revision: https://reviews.llvm.org/D68720
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.
Differential Revision: https://reviews.llvm.org/D68720
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This a recommit of 39f50da2a3 with better option
handling and more portable testing
Differential Revision: https://reviews.llvm.org/D68720
LLVMgold.so tests are duplicated in several places. Deduplicate them.
Move the tests to lto.c and lto.cu
Specify -fuse-ld=bfd or -fuse-ld=gold.
In a future change, if -fuse-ld=lld or CLANG_DEFAULT_LINKER=lld without -fuse-ld=, we will remove -plugin /path/to/LLVMgold.so
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This a recommit of 39f50da2a3 with correct option
flags set.
Differential Revision: https://reviews.llvm.org/D68720
This reverts commit 39f50da2a3.
The -fstack-clash-protection is being passed to the linker too, which
is not intended.
Reverting and fixing that in a later commit.
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
Differential Revision: https://reviews.llvm.org/D68720
This reverts commits f41ec709d9 and 5fedc2b410. On some buildbots, Clang :: Driver/crash-report.c is broken with:
```
Command Output (stderr):
--
/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm-project/clang/test/Driver/crash-report.c:48:11: error: CHECK: expected string not found in input
// CHECK: Preprocessed source(s) and associated run script(s) are located at:
^
<stdin>:1:1: note: scanning from here
/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm-project/clang/test/Driver/crash-report.c:50:1: error: unknown type name 'BAZ'
```
Example: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/21321/steps/test-stage1-compiler/logs/stdio
Previously, when using '-MF file.d' on the command line, 'file.d' would not be deleted after a compiler crash.
The code path in Compilation::initCompilationForDiagnostics() that was modifying 'TranslatedArgs' had no effect, because 'TCArgs' was already created after the crash.
This was covered by clang/test/Driver/output-file-cleanup.c, the test was succeeding by fluke because Driver::generateCompilationDiagnostics() would fail to launch the subsequent clang -E (see D74070 for a fix for this). So the test was only covering Driver.cpp, C.CleanupFileMap().
After this patch, both cleanup and removal of -MF are exercised.
Differential Revision: https://reviews.llvm.org/D74076
Previously, when the above '#pragma clang __debug' were used, Driver::generateCompilationDiagnostics() wouldn't work as expected.
The 'clang -E' process created for diagnostics would crash, because it would reach again the intended crash in Pragma.cpp, PragmaDebugHandler::HandlePragma() while preprocessing.
When generating crash diagnostics, we now disable the intended crashing behavior with a new cc1 flag -disable-pragma-debug-crash.
Notes:
- #pragma clang __debug llvm_report_fatal isn't currently tested by crash-report.c, because it needs exit() to be handled differently in -fintegrated-cc1 mode. See https://reviews.llvm.org/D73742 for an upcoming fix.
- This is also needed to further validate that -MF is removed from the 'clang -E ' crash diagnostic cmd-line (currently not the case). See https://reviews.llvm.org/D74076 for an upcoming fix.
Differential Revision: https://reviews.llvm.org/D74070
Summary:
- Similar to other targets, instead of passing a toolchain, a driver
argument should be passed into `arm::getARMTargetFeatures`. Aslo, that
routine should honor the specified triple. Refactor
`arm::getARMFloatABI` with 2 separate interfaces. One has the original
parameters and the other uses the driver and the specified triple.
- That fixes an issue when target & features are queried during the
offload compilation, where the specified triple should be checked
instead of a effective triple. A previously failed test is re-enabled.
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74020
Summary:
As a first step this implementation enables compilation of the offload
code.
Reviewers: ABataev
Subscribers: ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74048
Summary:
- The device compilation needs to have a consistent source code compared
to the corresponding host compilation. If macros based on the
host-specific target processor is not properly populated, the device
compilation may fail due to the inconsistent source after the
preprocessor. So far, only the host triple is used to build the
macros. If a detailed host CPU target or certain features are
specified, macros derived from them won't be populated properly, e.g.
`__SSE3__` won't be added unless `+sse3` feature is present. On
Windows compilation compatible with MSVC, that missing macros result
in that intrinsics are not included and cause device compilation
failure on the host-side source.
- This patch addresses this issue by introducing two `cc1` options,
i.e., `-aux-target-cpu` and `-aux-target-feature`. If a specific host
CPU target or certain features are specified, the compiler driver will
append them during the construction of the offline compilation
actions. Then, the toolchain in `cc1` phase will populate macros
accordingly.
- An internal option `--gpu-use-aux-triple-only` is added to fall back
the original behavior to help diagnosing potential issues from the new
behavior.
Reviewers: tra, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73942
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
First attempt at implementing -fsemantic-interposition.
Rely on GlobalValue::isInterposable that already captures most of the expected
behavior.
Rely on a ModuleFlag to state whether we should respect SemanticInterposition or
not. The default remains no.
So this should be a no-op if -fsemantic-interposition isn't used, and if it is,
isInterposable being already used in most optimisation, they should honor it
properly.
Note that it only impacts architecture compiled with -fPIC and no pie.
Differential Revision: https://reviews.llvm.org/D72829
Summary: With OpenMP offloading host compilation is done in two phases to capture host IR that is passed to all device compilations as input. But it turns out that we currently run entire LLVM optimization pipeline on host IR on both compilations which may have unpredictable effects on the resulting code. This patch fixes this problem by disabling LLVM passes on the first compilation, so the host IR that is passed to device compilations will be captured right after front end.
Reviewers: ABataev, jdoerfert, hfinkel
Reviewed By: ABataev
Subscribers: guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73721
Summary:
Fat object size has significantly increased after D65819 which changed bundler tool to add host object as a normal bundle to the fat output which almost doubled its size. That patch was fixing the following issues
1. Problems associated with the partial linking - global constructors were not called for partially linking objects which clearly resulted in incorrect behavior.
2. Eliminating "junk" target object sections from the linked binary on the host side.
The first problem is no longer relevant because we do not use partial linking for creating fat objects anymore. Target objects sections are now inserted into the resulting fat object with a help of llvm-objcopy tool.
The second issue, "junk" sections in the linked host binary, has been fixed in D73408 by adding "exclude" flag to the fat object's sections which contain target objects. This flag tells linker to drop section from the inputs when linking executable or shared library, therefore these sections will not be propagated in the linked binary.
Since both problems have been solved, we can revert D65819 changes to reduce fat object size and this patch essentially is doing that.
Reviewers: ABataev, alexshap, jdoerfert
Reviewed By: ABataev
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73642
Summary: This flag tells link editor to exclude section from linker inputs when linking executable or shared library.
Reviewers: ABataev, alexshap, jdoerfert
Reviewed By: ABataev
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73408
include Clang builtin headers even with -nostdinc
Some projects use -nostdinc, but need to access some intrinsics files when building specific files.
The new -ibuiltininc flag lets them use this flag when compiling these files to ensure they can
find Clang's builtin headers.
The use of -nobuiltininc after the -ibuiltininc flag does not add the builtin header
search path to the list of header search paths.
Differential Revision: https://reviews.llvm.org/D73500
This makes clang somewhat forward-compatible with new CUDA releases
without having to patch it for every minor release without adding
any new function.
If an unknown version is found, clang issues a warning (can be disabled
with -Wno-cuda-unknown-version) and assumes that it has detected
the latest known version. CUDA releases are usually supersets
of older ones feature-wise, so it should be sufficient to keep
released clang versions working with minor CUDA updates without
having to upgrade clang, too.
Differential Revision: https://reviews.llvm.org/D73231
Currently device lib path set by environment variable HIP_DEVICE_LIB_PATH
does not work due to extra "-L" added to each entry.
This patch fixes that by allowing argument name to be empty in addDirectoryList.
Differential Revision: https://reviews.llvm.org/D73299
This required some fixes to the generic code for two issues:
1. -fsanitize=safe-stack is default on x86_64-fuchsia and is *not* incompatible with -fsanitize=leak on Fuchisa
2. -fsanitize=leak and other static-only runtimes must not be omitted under -shared-libsan (which is the default on Fuchsia)
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D73397
See
https://docs.google.com/document/d/1xMkTZMKx9llnMPgso0jrx3ankI4cv60xeZ0y4ksf4wc/preview
for background discussion.
This adds a warning, flags and pragmas to limit the number of
pre-processor tokens either at a certain point in a translation unit, or
overall.
The idea is that this would allow projects to limit the size of certain
widely included headers, or for translation units overall, as a way to
insert backstops for header bloat and prevent compile-time regressions.
Differential revision: https://reviews.llvm.org/D72703
Using the same strategy as c38e42527b.
D69825 revealed (introduced?) a problem when building with ASan, and
some memory leaks somewhere. More details are available in the original
patch.
Looks like we missed one failing tests, this patch adds the workaround
to this test as well.
After rGb4a99a061f517e60985667e39519f60186cbb469, passing a response file such as -Wp,@a.rsp wasn't working anymore because .rsp expansion happens inside clang's main() function.
This patch adds response file expansion in the -cc1 tool.
Differential Revision: https://reviews.llvm.org/D73120
The issue was reported by @xazax.hun here: https://reviews.llvm.org/D69825#1827826
"This patch (D69825) breaks scan-build-py which parses the output of "-###" to get -cc1 command. There might be other tools with the same problems. Could we either remove (in-process) from CC1Command::Print or add a line break?
Having the last line as a valid invocation is valuable and there might be tools relying on that."
Differential Revision: https://reviews.llvm.org/D72982
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.
- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute
AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).
Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.
Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.
-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.
Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.
This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.
AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
Summary:
This change implements the expansion in two parts:
- Add a utility function emitAMDGPUPrintfCall() in LLVM.
- Invoke the above function from Clang CodeGen, when processing a HIP
program for the AMDGPU target.
The printf expansion has undefined behaviour if the format string is
not a compile-time constant. As a sufficient condition, the HIP
ToolChain now emits -Werror=format-nonliteral.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D71365
Flags are clang's default UI is flags.
We can have an env var in addition to that, but in D69825 nobody has yet
mentioned why this needs an env var, so omit it for now. If someone
needs to set the flag via env var, the existing CCC_OVERRIDE_OPTIONS
mechanism works for it (set CCC_OVERRIDE_OPTIONS=+-fno-integrated-cc1
for example).
Also mention the cc1-in-process change in the release notes.
Also spruce up the test a bit so it actually tests something :)
Differential Revision: https://reviews.llvm.org/D72769
These driver options perform some checking and delegate to MC options -x86-align-branch* and -x86-branches-within-32B-boundaries.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D72463
Fix riscv-toolchain-extra tests to pass when CLANG_RESOURCE_DIR is set
to another value than the default.
Differential Revision: https://reviews.llvm.org/D72591
With this patch, the clang tool will now call the -cc1 invocation directly inside the same process. Previously, the -cc1 invocation was creating, and waiting for, a new process.
This patch therefore reduces the number of created processes during a build, thus it reduces build times on platforms where process creation can be costly (Windows) and/or impacted by a antivirus.
It also makes debugging a bit easier, as there's no need to attach to the secondary -cc1 process anymore, breakpoints will be hit inside the same process.
Crashes or signaling inside the -cc1 invocation will have the same side-effect as before, and will be reported through the same means.
This behavior can be controlled at compile-time through the CLANG_SPAWN_CC1 cmake flag, which defaults to OFF. Setting it to ON will revert to the previous behavior, where any -cc1 invocation will create/fork a secondary process.
At run-time, it is also possible to tweak the CLANG_SPAWN_CC1 environment variable. Setting it and will override the compile-time setting. A value of 0 calls -cc1 inside the calling process; a value of 1 will create a secondary process, as before.
Differential Revision: https://reviews.llvm.org/D69825
which is the default TLS model for non-PIC objects. This allows large/
many thread local variables or a compact/fast code in an executable.
Specification is same as that of GCC. For example, the code model
option precedes the TLS size option.
TLS access models other than local-exec are not changed. It means
supoort of the large code model is only in the local exec TLS model.
Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>)
Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard
Reviewd By: peter.smith
Committed by: peter.smith
Differential Revision: https://reviews.llvm.org/D71688
All 130+ f_Group flags that take an argument allow it after a '=',
except for fdebug-complation-dir. Add a Joined<> alias so that
it behaves consistently with all the other f_Group flags.
(Keep the old Separate flag for backwards compat.)
Follow-up of D72014. It is more appropriate to use a target
feature instead of a SubTypeArch to express the difference.
Reviewed By: #powerpc, jhibbits
Differential Revision: https://reviews.llvm.org/D72433
In the backend, this feature is implemented with the function attribute
"patchable-function-entry". Both the attribute and XRay use
TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are
incompatible.
Reviewed By: ostannard, MaskRay
Differential Revision: https://reviews.llvm.org/D72222
Architecturally, it's allowed to have MVE-I without an FPU, thus
-mfpu=none should not disable MVE-I, or moves to/from FP-registers.
This patch removes `+/-fpregs` from features unconditionally added to
target feature list, depending on FPU and moves the logic to Clang
driver, where the negative form (`-fpregs`) is conditionally added to
the target features list for the cases of `-mfloat-abi=soft`, or
`-mfpu=none` without either `+mve` or `+mve.fp`. Only the negative
form is added by the driver, the positive one is derived from other
features in the backend.
Differential Revision: https://reviews.llvm.org/D71843
Apple's CPUs are called A7-A13 in official communication, occasionally with
weird suffixes which we probably don't need to care about. This adds each one
and describes its features. It also switches the default CPU to the canonical
name for Cyclone, but leaves legacy support in so that existing bitcode still
compiles.
According to D53384, the default was switched from -fno-PIC to -fPIC to
work around a -fsanitize=leak bug on big-endian.
This gratuitous difference between little-endian and big-endian is
undesired, and not acceptable on powerpc64-unknown-freebsd. If
-fsanitize=leak still has the problem, we should consider defaulting to
-fPIC/-fPIE only when -fsanitize=leak is specified (see SanitizerArgs::requiresPIE())
powerpc64-ibm-aix is unaffected: it still defaults to -fPIC.
powerpc64-linux-musl is unaffected (-fPIE since D39588): it still defaults to -fPIE.
Reviewed By: #powerpc, jhibbits
Differential Revision: https://reviews.llvm.org/D72363
Summary:
Every powerpc64le platform uses elfv2.
For powerpc64, the environments "elfv1" and "elfv2" were added for
FreeBSD ELFv1->ELFv2 migration in D61950. FreeBSD developers have
decided to use OS versions to select ABI, and no one is relying on the
environments.
Also use elfv2 on powerpc64-linux-musl.
Users can always use -mabi=elfv1 and -mabi=elfv2 to override the default
ABI.
Reviewed By: adalava
Differential Revision: https://reviews.llvm.org/D72352
Driver test `cross-linux.c` fails when CLANG_DEFAULT_RTLIB is "compiler-rt"
as the it expects a GCC-style `"crtbegin.o"` after `"crti.o"` but instead
receives something akin to this in the frontend invocation:
```
"crt1.o" "crti.o"
"/o/b/llvm/bin/../lib/clang/10.0.0/lib/linux/clang_rt.crtbegin-x86_64.o"
```
This patch adds an override to `cross-linux.c` tests so the expected result
is produced regardless of the compile-time default rtlib, as having tests
fail due to that is fairly confusing. After applying the patch, the test
passes regardless of the CLANG_DEFAULT_RTLIB setting.
Differential Revision: https://reviews.llvm.org/D72236
-mpacked-stack is currently not supported with -mbackchain, so this should
result in a compilation error message instead of being silently ignored.
Review: Ulrich Weigand
gcc/config/{i386,s390} support -mnop-mcount. We currently only support
-mnop-mcount for SystemZ. The function attribute "mnop-mcount" is
ignored on other targets.
gcc/config/{i386,s390} support -mfentry. We currently only support
-mfentry for X86 and SystemZ. TargetOpcode::FENTRY_CALL is not handled
on other targets.
% clang -target aarch64 -pg -mfentry a.c -c
fatal error: error in backend: Not supported instr: <MCInst 21>
-mfentry, -mrecord-mcount, and -mnop-mcount were invented for Linux
ftrace. Linux uses $(call cc-option-yn,-mrecord-mcount) to detect if the
specific feature is available. Reject unsupported features so that Linux
build system will not wrongly consider them available and cause
build/runtime failures.
Note, GCC has stricter checks that we do not implement, e.g. -fpic/-fpie
-fnop-mcount is not allowed on x86, -fpic/-fpie -mfentry is not allowed
on x86-32.
GCC's x86 and s390 ports support -mrecord-mcount. Other ports reject the
option.
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mrecord-mcount’
Allowing this option can cause failures when building Linux kernel for
aarch64, powerpc64, etc, which will think the feature is available if
the clang command returns 0.
On Darwin, when used for generating a linked binary from a source file
(through an intermediate object file), the driver will invoke `cc1` to
generate a temporary object file. The temporary remark file will now be
emitted next to the object file, which will then be picked up by
`dsymutil` and emitted in the .dSYM bundle.
This is available for all formats except YAML since by default, YAML
doesn't need a section and the remark file will be lost.
When clang is invoked with a source file without -c or -S, it creates a
cc1 job, a linker job and if debug info is requested, a dsymutil job. In
case of remarks, we should also create a dsymutil job to avoid losing
the remarks that will be generated in a tempdir that gets removed.
Differential Revision: https://reviews.llvm.org/D71675
Our build system does not handle randomly named files created during
the build well. We'd prefer to write compilation output directly
without creating a temporary file. Function parameters already
existed to control this behavior but were not exposed all the way out
to the command line.
Patch by Zachary Henkel!
Differential revision: https://reviews.llvm.org/D70615
The driver actually adds a default -mlinker-version, based on HOST_LINK_VERSION
cmake variable. The tests should be explicit about which version they're using to
trigger the right behavior.
In Xcode 11, ld added a new flag called -platform_version that can be used instead of the old -<platform>_version_min flags.
The new flag allows Clang to pass the SDK version from the driver to the linker.
This patch adopts the new -platform_version flag in Clang, and starts using it by default,
unless a linker version < 520 is passed to the driver.
Differential Revision: https://reviews.llvm.org/D71579
Summary:
Remove REQUIRES-ANY alias lit directive since it is hardly used and can
be easily implemented using an OR expression using REQUIRES. Fixup
remaining testcases still using REQUIRES-ANY.
Reviewers: probinson, jdenny, gparker42
Reviewed By: gparker42
Subscribers: eugenis, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, delcypher, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, cfe-commits, #sanitizers, llvm-commits
Tags: #llvm, #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D71408
GCC implicitly adds an .exe suffix if it is given an output file name,
but the file name doesn't contain a suffix, and there are certain
users of GCC that rely on this behaviour (and run into issues when
trying to use Clang instead of GCC). And MSVC's cl.exe also does the
same (but not link.exe).
However, GCC only does this when actually running on windows, not when
operating as a cross compiler.
As GCC doesn't have this behaviour when cross compiling, we definitely
shouldn't introduce the behaviour in such cases (as it would break
at least as many cases as this fixes).
Differential Revision: https://reviews.llvm.org/D71400
This matches https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
> -momit-leaf-frame-pointer
> -mno-omit-leaf-frame-pointer
>
> Omit or keep the frame pointer in leaf functions. The former behavior is the default.
-mno-omit-leaf-frame-pointer is currently a no-op because
TargetOptions::DisableFramePointerElim is only considered for non-leaf
functions.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71167
D39317 made clang use .init_array when no gcc installations is found.
This change changes all gcc installations to use .init_array .
GCC 4.7 by default stopped providing .ctors/.dtors compatible crt files,
and stopped emitting .ctors for __attribute__((constructor)).
.init_array should always work.
FreeBSD rules are moved to FreeBSD.cpp to make Generic_ELF rules clean.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71434
Very few ELF platforms still use .ctors/.dtors now. Linux (glibc: 1999-07),
DragonFlyBSD, FreeBSD (2012-03) and Solaris have supported .init_array
for many years. Some architectures like AArch64/RISC-V default to
.init_array . GNU ld and gold can even convert .ctors to .init_array .
It makes more sense to flip the CC1 default, and only uses
-fno-use-init-array on platforms that don't support .init_array .
For example, OpenBSD did not support DT_INIT_ARRAY before Aug 2016
(86fa57a279)
I may miss some ELF platforms that still use .ctors, but their
maintainers can easily diagnose such problems.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71393
Serialized remarks contain debug locations for each remark, by storing a
file path, a line, and a column.
Also, remarks support being embedded in a .dSYM bundle using a separate
section in object files, that is found by `dsymutil` through the debug
map.
In order for tools to map addresses to source and display remarks in the
source, we need line tables, and in order for `dsymutil` to find the
object files containing the remark section, we need to keep the debug
map around.
Differential Revision: https://reviews.llvm.org/D71325
This is a follow up patch to use the OpenMP-IR-Builder, as discussed on
the mailing list ([1] and later) and at the US Dev Meeting'19.
[1] http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim
Subscribers: ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69922
The -fsplit-dwarf-inlining option does not conform to DWARF5 standard.
It creates children for Skeleton compilation unit. We need default behavior
to be DWARF5 compatible. Thus set default state for -fsplit-dwarf-inlining
into "false".
Differential Revision: https://reviews.llvm.org/D71304
This adds a check for the usage of -foptimization-record-file with
multiple -arch options. This is not permitted since it would require us
to rename the file requested by the user to avoid overwriting it for the
second cc1 invocation.
Fixed the test case to set --sysroot, which lets it succeed in the case where
a directory named "/usr/include/c++/v1" or "/usr/local/include/c++/v1" exists.
Original commit message:
> The NDK uses a separate set of libc++ headers in the sysroot. Any headers
> in the installation directory are not going to work on Android, not least
> because they use a different name for the inline namespace (std::__1 instead
> of std::__ndk1).
>
> This effectively makes it impossible to produce a single toolchain that is
> capable of targeting both Android and another platform that expects libc++
> headers to be installed in the installation directory, such as Mac.
>
> In order to allow this scenario to work, stop looking for headers in the
> install directory on Android.
Differential Revision: https://reviews.llvm.org/D71154
The NDK uses a separate set of libc++ headers in the sysroot. Any headers
in the installation directory are not going to work on Android, not least
because they use a different name for the inline namespace (std::__1 instead
of std::__ndk1).
This effectively makes it impossible to produce a single toolchain that is
capable of targeting both Android and another platform that expects libc++
headers to be installed in the installation directory, such as Mac.
In order to allow this scenario to work, stop looking for headers in the
install directory on Android.
Differential Revision: https://reviews.llvm.org/D71154
Summary:
The HIP toolchain invokes `llc` without an explicit opt-level, meaning
it always uses the default (-O2). This makes it impossible to use -O1,
for example. The HIP toolchain also coerces -Os/-Oz to -O2 even when
invoking opt, and it coerces -Og to -O2 rather than -O1.
Forward the opt-level to `llc` as well as `opt`, and only coerce levels
where it is required.
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70987
Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048
The original patch is modified to set the strictfp IR attribute
explicitly in CodeGen instead of as a side effect of IRBuilder.
In the 2nd attempt to reapply there was a windows lit test fail, the
tests were fixed to use wildcard matching.
Differential Revision: https://reviews.llvm.org/D62731
This was hard-coded to "clang". This change allows it to to be used on
processes other than clang (such as lld).
This gets reported as clang-10 on Linux and clang.exe on Windows so
adapted test to accommodate this.
Differential Revision: https://reviews.llvm.org/D70950
Summary:
clang test Driver/darwin-opt-record.c assumes the default target is
x86_64 by its uses of the -arch x86_64 and -arch x86_64h and thus fail
on systems where it is not the case. Adding a target
x86_64-apple-darwin10 reveals another problem: the driver refuses 2
-arch for an assembly output so this commit also changes the output to
be an object file.
Reviewers: thegameg
Reviewed By: thegameg
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70748
Summary:
A skeleton of AIX toolchain and system linker support has been introduced in D68340, and this is a follow on patch to it.
This patch adds support to system assembler invocation to the AIX toolchain.
Reviewers: daltenty, hubert.reinterpretcast, jasonliu, Xiangling_L, dlj
Reviewed By: daltenty, hubert.reinterpretcast
Subscribers: wuzish, nemanjai, kbarton, jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69620
GCC 8 implements -fmacro-prefix-map. Like -fdebug-prefix-map, it replaces a string prefix for the __FILE__ macro.
-ffile-prefix-map is the union of -fdebug-prefix-map and -fmacro-prefix-map
Reviewed By: rnk, Lekensteyn, maskray
Differential Revision: https://reviews.llvm.org/D49466
Summary:
On SUSE distributions for 32-bit PowerPC, gcc is configured
as a 64-bit compiler using the GNU triplet "powerpc64-suse-linux",
but invoked with "-m32" by default. Thus, the correct GNU triplet
for 32-bit PowerPC SUSE distributions is "powerpc64-suse-linux"
and not "powerpc-suse-linux".
Reviewers: jrtc27, nemanjai, glaubitz
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D55326
When there's a wasm-opt in the PATH, run the it to optimize LLVM's
output. This fixes PR43796.
And, add an "llvm-lto" directory to the sysroot library search paths,
so that sysroots can provide LTO-enabled system libraries.
Differential Revision: https://reviews.llvm.org/D70500
In the GNU toolchain, `-static-libgcc` implies that the unwindlib will
be linked statically. However, when `--unwindlib=libunwind`, this flag is
ignored, and a bare `-lunwind` is added to the linker args. Unfortunately,
this means that if both `libunwind.so`, and `libunwind.a` are present
in the library path, `libunwind.so` will be chosen in all cases where
`-static` is not set.
This change makes `-static-libgcc` affect the `-l` flag produced by
`--unwindlib=libunwind`. After this patch, providing
`-static-libgcc --unwindlib=libunwind` will cause the driver to explicitly
emit `-l:libunwind.a` to statically link libunwind. For all other cases
it will emit `-l:libunwind.so` matching current behavior with a more
explicit link line.
https://reviews.llvm.org/D70416
If a GCC installation is not detected, then this attempts to
use compiler-rt and the compiler-rt crtbegin/crtend
implementations as a fallback.
Differential Revision: https://reviews.llvm.org/D68407
1. Currently only support the set of multilibs same to riscv-gnu-toolchain.
2. Fix testcase typo causes fail on Windows.
3. Fix testcases to set empty sysroot.
Reviewers: espindola, asb, kito-cheng, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D67508
We don't have a full sysroot yet, so for now we only include compiler
support and compiler-rt builtins, the rest of the runtimes will get
enabled later.
Differential Revision: https://reviews.llvm.org/D70477
1. Currently only support the set of multilibs same to riscv-gnu-toolchain.
2. Fix testcase typo causes fail on Windows
Reviewers: espindola, asb, kito-cheng, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D67508
Currently only support the set of multilibs same to riscv-gnu-toolchain.
Reviewers: espindola, asb, kito-cheng, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D67508
Summary:
This allows `clang` to be used to compile CUDA programs. Compiled
simple helloworld.cu with this.
Reviewers: dim, emaste, tra, yaxunl, ABataev
Reviewed By: tra
Subscribers: dim, emaste, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69990
When the driver is targeting multiple architectures at once, for things
like Universal Mach-Os, we need to emit different remark files for each
cc1 invocation to avoid overwriting the files from a different
invocation.
For example:
$ clang -c -o foo.o -fsave-optimization-record -arch x86_64 -arch x86_64h
will create two remark files:
* foo-x86_64.opt.yaml
* foo-x86_64h.opt.yaml
and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread.
This reverts commits af57dbf12e and e6584b2b7b
When the driver is targeting multiple architectures at once, for things
like Universal Mach-Os, we need to emit different remark files for each
cc1 invocation to avoid overwriting the files from a different
invocation.
For example:
$ clang -c -o foo.o -fsave-optimization-record -arch x86_64 -arch x86_64h
will create two remark files:
* foo-x86_64.opt.yaml
* foo-x86_64h.opt.yaml
For RISC-V the value provided to -march should determine whether to
compile for 32- or 64-bit RISC-V irrespective of the target provided to
the Clang driver. This adds a test for this flag for RISC-V and sets the
Target architecture correctly in these cases.
Differential Revision: https://reviews.llvm.org/D54214
Provides support for using r6-r11 as globally scoped
register variables. This requires a -ffixed-rN flag
in order to reserve rN against general allocation.
If for a given GRV declaration the corresponding flag
is not found, or the the register in question is the
target's FP, we fail with a diagnostic.
Differential Revision: https://reviews.llvm.org/D68862
There doesn't seem to be much sense in defaulting "on" unwind tables on
amd64 and not on other arches. It causes surprising differences between
platforms, such as the PR below[1].
Prior to this change, FreeBSD inherited the default implementation of the
method from the Gnu.h Generic_Elf => Generic_GCC parent class, which
returned true only for amd64 targets. Override that and opt on always,
similar to, e.g., NetBSD's driver.
[1] https://bugs.freebsd.org/241562
Patch by cem (Conrad Meyer).
Reviewed By: dim
Differential Revision: https://reviews.llvm.org/D70110
Summary:
Clang/LLVM is a cross-compiler, and so we don't have to make a choice
about `-march`/`-mabi` at build-time, but we may have to compute a
default `-march`/`-mabi` when compiling a program. Until now, each
place that has needed a default `-march` has calculated one itself.
This patch adds a single place where a default `-march` is calculated,
in order to avoid calculating different defaults in different places.
This patch adds a new function `riscv::getRISCVArch` which encapsulates
this logic based on GCC's for computing a default `-march` value
when none is provided. This patch also updates the logic in
`riscv::getRISCVABI` to match the logic in GCC's build system for
computing a default `-mabi`.
This patch also updates anywhere that `-march` is used to now use the
new function which can compute a default. In particular, we now
explicitly pass a `-march` value down to the gnu assembler.
GCC has convoluted logic in its build system to choose a default
`-march`/`-mabi` based on build options, which would be good to match.
This patch is based on the logic in GCC 9.2.0. This commit's logic is
different to GCC's only for baremetal targets, where we default
to rv32imac/ilp32 or rv64imac/lp64 depending on the target triple.
Tests have been updated to match the new logic.
Reviewers: asb, luismarques, rogfer01, kito-cheng, khchen
Reviewed By: asb, luismarques
Subscribers: sameer.abuasal, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69383