Support -Wno-frame-larger-than (with no =) and make it properly
interoperate with -Wframe-larger-than. Reject -Wframe-larger-than with
no argument.
We continue to support Clang's old spelling, -Wframe-larger-than=, for
compatibility with existing users of that facility.
In passing, stop the driver from accepting and ignoring
-fwarn-stack-size and make it a cc1-only flag as intended.
-Wframe-larger-than= is an interesting warning; we can't know the frame
size until PrologueEpilogueInsertion (PEI); very late in the compilation
pipeline.
-Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then
was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO
and needed to be re-specified via -plugin-opt.
Instead, make it part of the IR proper as a module level attribute,
similar to D103048. Introduce -fwarn-stack-size CC1 option.
Reviewed By: rsmith, qcolombet
Differential Revision: https://reviews.llvm.org/D103928
This patch adds the command line option -foperator-names which acts as the opposite of -fno-operator-names. With this command line option it is possible to reenable C++ operator keywords on the command line if -fno-operator-names had previously been passed.
Differential Revision: https://reviews.llvm.org/D103749
This patch changes the ffp-model=precise to enables -ffp-contract=on
(previously -ffp-model=precise enabled -ffp-contract=fast). This is a
follow-up to Andy Kaylor's comments in the llvm-dev discussion
"Floating Point semantic modes". From the same email thread, I put
Andy's distillation of floating point options and floating point modes
into UsersManual.rst
Differential Revision: https://reviews.llvm.org/D74436
This fixed PR#48894 for AArch64. The issue has been fixed for Arm in
https://reviews.llvm.org/D95872
The following rules apply to -Wa,-march with this change:
- Only compiler options apply to non assembly files
- Compiler and assembler options apply to assembly files
- For assembly files, we prefer the assembler option(s) if we have both kinds of option
- Of the options that apply (or are preferred), the last value wins (it's not additive)
Reviewed By: DavidSpickett, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D103184
This patch fixes a Windows -EHa crash induced by previous commit 797ad70152.
The crash was caused by "LifetimeMarker" scope (with option -O2) that should not be considered as SEH Scope.
This change also turns off -fasync-exceptions by default under -EHa option for now.
Differential Revision: https://reviews.llvm.org/D103664#2799944
A recent change (D99683) to support ThinLTO for HIP caused a regression
when compiling cuda code with -flto=thin -fwhole-program-vtables.
Specifically, we now get an error:
error: invalid argument '-fwhole-program-vtables' only allowed with '-flto'
This error is coming from the device offload cc1 action being set up for
the cuda compile, for which -flto=thin doesn't apply and gets dropped.
This is a regression, but points to a potential issue that was silently
occurring before the patch, details below.
Before D99683, the check for fwhole-program-vtables in the driver looked
like:
if (WholeProgramVTables) {
if (!D.isUsingLTO())
D.Diag(diag::err_drv_argument_only_allowed_with)
<< "-fwhole-program-vtables"
<< "-flto";
CmdArgs.push_back("-fwhole-program-vtables");
}
And D.isUsingLTO() returned true since we have -flto=thin. However,
because the cuda cc1 compile is doing device offloading, which didn't
support any LTO, there was other code that suppressed -flto* options
from being passed to the cc1 invocation. So the cc1 invocation silently
had -fwhole-program-vtables without any -flto*. This seems potentially
problematic, since if we had any virtual calls we would get type test
assume sequences without the corresponding LTO pass that handles them.
However, with the patch, which adds support for device offloading LTO
option -foffload-lto=thin, the code has changed so that we set a bool
IsUsingLTO based on either -flto* or -foffload-lto*, depending on
whether this is the device offloading action. For the device offload
action in our compile, since we don't have -foffload-lto, IsUsingLTO is
false, and the check for LTO with -fwhole-program-vtables now fails.
What we should do is only pass through -fwhole-program-vtables to the
cc1 invocation that has LTO enabled (either the device offload action
with -foffload-lto, or the non-device offload action with -flto), and
otherwise drop the -fwhole-program-vtables for the non-LTO action.
Then we should error only if we have -fwhole-program-vtables without any
-f*lto* options.
Differential Revision: https://reviews.llvm.org/D103579
All fuchsia targets will now use the relative-vtables ABI by default.
Also remove -fexperimental-relative-c++-abi-vtables from test RUNs targeting fuchsia.
Differential Revision: https://reviews.llvm.org/D102374
VS 2019 16.11 (just released in Preview) is adding support for the
/std:c++20 option and bumping /std:c++latest to "post-c++20". This
updates clang-cl to match.
Differential revision: https://reviews.llvm.org/D103155
Since 4468e5b899 clang will prefer
the last one it finds of "-mimplicit-it" or "-Wa,-mimplicit-it".
Due to a mistake in that patch the compiler argument "-mimplicit-it"
was never marked as used, even if it was the last one and was passed
to llvm.
Move the Claim call back to the start of the loop and update
the testing to check we don't get any unused argument warnings.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D103086
Add options -[no-]offload-lto and -foffload-lto=[thin,full] for controlling
LTO for offload compilation. Allow LTO for AMDGPU target.
AMDGPU target does not support codegen of object files containing
call of external functions, therefore the LLVM module passed to
AMDGPU backend needs to contain definitions of all the callees.
An LLVM option is added to allow function importer to import
functions with noinline attribute.
HIP toolchain passes proper LLVM options to lld to make sure
function importer imports definitions of all the callees.
Reviewed by: Teresa Johnson, Artem Belevich
Differential Revision: https://reviews.llvm.org/D99683
If multiple instances of the -arm-implicit-it option is passed to
the backend, it errors out.
Also fix cases where there are multiple -Wa,-mimplicit-it; the existing
tests indicate that the last one specified takes effect, while in
practice it passed double options, which didn't work as intended.
Differential Revision: https://reviews.llvm.org/D102812
Instead of ignoring flto=auto and -flto=jobserver, treat them as -flto
and pass -flto=full along.
Differential Revision: https://reviews.llvm.org/D102479
This reverts commit 2919222d80.
That commit broke backwards compatibility. Additionally, the
replacement, -Wa,-mimplicit-it, isn't yet supported by any stable
release of Clang.
See D102812 for a fix for the error cases when callers specify both
-mimplicit-it and -Wa,-mimplicit-it.
This is a GNU as and Clang cc1as option, not a GCC option.
Users should specify `-Wa,-mimplicit-it=` instead.
Note: mixing the -m option and the -Wa, option doesn't work
`-Wa,-mimplicit-it=never -mimplicit-it=always` =>
`clang (LLVM option parsing): for the --arm-implicit-it option: may only occur zero or one times!`
Reviewed By: nickdesaulniers, raj.khem
Differential Revision: https://reviews.llvm.org/D102568
Currently, we have support for SYCL 1.2.1 (also known as SYCL 2017).
This patch introduces the start of support for SYCL 2020 mode, which is
the latest SYCL standard available at (https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html).
This sets the default SYCL to be 2020 in the driver, and introduces the
notion of a "default" version (set to 2020) when cc1 is in SYCL mode
but there was no explicit -sycl-std= specified on the command line.
This patch is the Part-1 (FE Clang) implementation of HW Exception handling.
This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).
This is the first step of this project; only X86_64 target is enabled in this patch.
Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.
NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.
The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow
three rules:
* First, no exception can move in or out of _try region., i.e., no "potential
faulty instruction can be moved across _try boundary.
* Second, the order of exceptions for instructions 'directly' under a _try
must be preserved (not applied to those in callees).
* Finally, global states (local/global/heap variables) that can be read
outside of _try region must be updated in memory (not just in register)
before the subsequent exception occurs.
The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on C++
side. When a C++ function (in the same compilation unit with option -EHa ) is
called by a SEH C function, a hardware exception occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as C++ Synchronous Exception during unwinding
process.
Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.
This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.
One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:
A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing
SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others. If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:
Warning: jump bypasses variable with a non-trivial destructor
In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.
Implementation:
Part-1: Clang implementation described below.
Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end().
_scope_begin() is immediately added after ctor() is called and EHStack is pushed.
So it must be an invoke, not a call. With that it's also guaranteed an
EH-cleanup-pad is created regardless whether there exists a call in this scope.
_scope_end is added before dtor(). These two intrinsics make the computation of
Block-State possible in downstream code gen pass, even in the presence of
ctor/dtor inlining.
Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark
_try boundary and to prevent from exceptions being moved across _try boundary.
All memory instructions inside a _try are considered as 'volatile' to assure
2nd and 3rd rules for C-code above. This is a little sub-optimized. But it's
acceptable as the amount of code directly under _try is very small.
Part-2 (will be in Part-2 patch): LLVM implementation described below.
For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).
For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.
The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.
Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.htmlhttps://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html
Differential Revision: https://reviews.llvm.org/D80344/new/
Follow up to D88631 but for aarch64; the Linux kernel uses the command
line flags:
1. -mstack-protector-guard=sysreg
2. -mstack-protector-guard-reg=sp_el0
3. -mstack-protector-guard-offset=0
to use the system register sp_el0 for the stack canary, enabling the
kernel to have a unique stack canary per task (like a thread, but not
limited to userspace as the kernel can preempt itself).
Address pr/47341 for aarch64.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/289
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed By: xiangzhangllvm, DavidSpickett, dmgreen
Differential Revision: https://reviews.llvm.org/D100919
This patch adds support for GCC's -fstack-usage flag. With this flag, a stack
usage file (i.e., .su file) is generated for each input source file. The format
of the stack usage file is also similar to what is used by GCC. For each
function defined in the source file, a line with the following information is
produced in the .su file.
<source_file>:<line_number>:<function_name> <size_in_byte> <static/dynamic>
"Static" means that the function's frame size is static and the size info is an
accurate reflection of the frame size. While "dynamic" means the function's
frame size can only be determined at run-time because the function manipulates
the stack dynamically (e.g., due to variable size objects). The size info only
reflects the size of the fixed size frame objects in this case and therefore is
not a reliable measure of the total frame size.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D100509
Previously clang would print a binary blob into the bundled file
for amdgcn. With this patch, it will instead print textual IR as
expected.
Reviewed By: JonChesterfield, ronlieb
Differential Revision: https://reviews.llvm.org/D102065
Change-Id: I10c0127ab7357787769fdf9a2edd4b3071e790a1
-fno-semantic-interposition (only effective with -fpic) can optimize default
visibility external linkage (non-ifunc-non-COMDAT) variable access and function
calls to avoid GOT/PLT, by using local aliases, e.g.
```
int var;
__attribute__((optnone)) int fun(int x) { return x * x; }
int test() { return fun(var); }
```
-fpic (var and fun are dso_preemptable)
```
test:
.LBB1_1:
auipc a0, %got_pcrel_hi(var)
ld a0, %pcrel_lo(.LBB1_1)(a0)
lw a0, 0(a0)
// fun is preemptible by default in ld -shared mode. ld will create a PLT.
tail fun@plt
```
vs -fpic -fno-semantic-interposition (var and fun are dso_local)
```
test:
.Ltest$local:
.LBB1_1:
auipc a0, %pcrel_hi(.Lvar$local)
addi a0, a0, %pcrel_lo(.LBB1_1)
lw a0, 0(a0)
// The assembler either resolves .Lfun$local at assembly time (-mno-relax
// -fno-function-sections), or produces a relocation referencing a non-preemptible
// local symbol (which can avoid PLT).
tail .Lfun$local
```
Note: Clang's default -fpic is more aggressive than GCC -fpic: interprocedural
optimizations (including inlining) are available but local aliases are not used.
-fpic -fsemantic-interposition can disable interprocedural optimizations.
Depends on D101875
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D101876
-fno-semantic-interposition (only effective with -fpic) can optimize default
visibility external linkage (non-ifunc-non-COMDAT) variable access and function
calls to avoid GOT/PLT, by using local aliases, e.g.
```
int var;
__attribute__((optnone)) int fun(int x) { return x * x; }
int test() { return fun(var); }
```
-fpic (var and fun are dso_preemptable)
```
test: // @test
adrp x8, :got:var
ldr x8, [x8, :got_lo12:var]
ldr w0, [x8]
// fun is preemptible by default in ld -shared mode. ld will create a PLT.
b fun
```
vs -fpic -fno-semantic-interposition (var and fun are dso_local)
```
test: // @test
.Ltest$local:
adrp x8, .Lvar$local
ldr w0, [x8, :lo12:.Lvar$local]
// The assembler either resolves .Lfun$local at assembly time, or produces a
// relocation referencing a non-preemptible section symbol (which can avoid PLT).
b .Lfun$local
```
Note: Clang's default -fpic is more aggressive than GCC -fpic: interprocedural
optimizations (including inlining) are available but local aliases are not used.
-fpic -fsemantic-interposition can disable interprocedural optimizations.
Depends on D101872
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D101873
Previously clang would print a binary blob into the bundled file
for amdgcn. With this patch, it will instead print textual IR as
expected.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D102065
This implements the flag proposed in RFC
http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through a
compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.
In this patch:
- Store -fc++-abi= in a LangOpt. This isn't stored in a CodeGenOpt because
there are instances outside of codegen where Clang needs to know what the
ABI is (particularly through ASTContext::createCXXABI), and we should be
able to override the target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
through this flag.
- Create a .def file for these ABIs to make it easier to check flag values.
- Add an error for diagnosing bad ABI flag values.
Differential Revision: https://reviews.llvm.org/D85802
GCC supports negative values for -mstack-protector-guard-offset=, this
should be a signed value. Pre-req to D100919.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101325
[clang][amdgpu] Use implicit code object version
At present, clang always passes amdhsa-code-object-version on to -cc1. That is
great for certainty over what object version is being used when debugging.
Unfortunately, the command line argument is in AMDGPUBaseInfo.cpp in the amdgpu
target. If clang is used with an llvm compiled with DLLVM_TARGETS_TO_BUILD
that excludes amdgpu, this will be diagnosed (as discovered via D98658):
- Unknown command line argument '--amdhsa-code-object-version=4'
This means that clang, built only for X86, can be used to compile the nvptx
devicertl for openmp but not the amdgpu one. That would shortly spawn fragile
logic in the devicertl cmake to try to guess whether the clang used will work.
This change omits the amdhsa-code-object-version parameter when it matches the
default that AMDGPUBaseInfo.cpp specifies, with a comment to indicate why. As
this is the only part of clang's codegen for amdgpu that depends on the target
in the back end it suffices to build the openmp runtime on most (all?) systems.
It is a non-functional change, though observable in the updated tests and when
compiling with -###. It may cause minor disruption to the amd-stg-open branch.
Revision of D98746, builds on refactor in D101077
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D101095
[clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion
Separates detection of deprecated or invalid code object version from
returning the version. Written to avoid any behaviour change.
Precursor to a revision of D98746.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D101077
This is a user-facing option, so it doesn't make sense for it to be cc1
only.
Follow-up to D100420
Differential revision: https://reviews.llvm.org/D100759
This allows frontend and backend diagnostic files to all go into the
same place. Have it control the Windows (mini-)dump location.
Differential Revision: https://reviews.llvm.org/D99199
Programmers would like to be able to test direct methods by calling them from a
different linkage unit or mocking them, both of which are impossible. This
patch adds a flag that effectively disables the attribute, which will fix this
when enabled in testable builds. rdar://71190891
Differential revision: https://reviews.llvm.org/D95845
Problem:
On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable.
Solution:
This patch adds two new flags
- OF_CRLF which indicates that CRLF translation is used.
- OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation.
Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF.
So this is the behaviour per platform with my patch:
z/OS:
OF_None: open in binary mode
OF_Text : open in text mode
OF_TextWithCRLF: open in text mode
Windows:
OF_None: open file with no carriage return
OF_Text: open file with no carriage return
OF_TextWithCRLF: open file with carriage return
The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set.
```
if (Flags & OF_CRLF)
CrtOpenFlags |= _O_TEXT;
```
These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows.
./llvm/lib/Support/raw_ostream.cpp
./llvm/lib/TableGen/Main.cpp
./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
./llvm/unittests/Support/Path.cpp
./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
./clang/lib/Frontend/CompilerInstance.cpp
./clang/lib/Driver/Driver.cpp
./clang/lib/Driver/ToolChains/Clang.cpp
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99426
In order to bring up scalable vector support in LLVM incrementally,
we introduced behaviour to emit a warning, instead of an error, when
asking the wrong question of a scalable vector, like asking for the
fixed number of elements.
This patch puts that behaviour under a flag. The default behaviour is
that the compiler will always error, which means that all LLVM unit
tests and regression tests will now fail when a code-path is taken that
still uses the wrong interface.
The behaviour to demote an error to a warning can be individually enabled
for tools that want to support experimental use of scalable vectors.
This patch enables that behaviour when driving compilation from Clang.
This means that for users who want to try out scalable-vector support,
fixed-width codegen support, or build user-code with scalable vector
intrinsics, Clang will not crash and burn when the compiler encounters
such a case.
This allows us to do away with the following pattern in many of the SVE tests:
RUN: .... 2>%t
RUN: cat %t | FileCheck --check-prefix=WARN
WARN-NOT: warning: ...
The behaviour to emit warnings is only temporary and we expect this flag
to be removed in the future when scalable vector support is more stable.
This patch also has fixes the following tests:
unittests:
ScalableVectorMVTsTest.SizeQueries
SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects
AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG
regression tests:
Transforms/InstCombine/vscale_gep.ll
Reviewed By: paulwalker-arm, ctetreau
Differential Revision: https://reviews.llvm.org/D98856
For DBX, it does not handle column info well. Set -gno-column-info
by default for DBX.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D99703
Based on this debugger type, for now, we plan to:
1: use inline string by default for XCOFF DWARF
2: generate no column info for debug line table.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D99400
The contents of the string returned by getenv() is not guaranteed across calls to getenv(). The code to handle the CC_PRINT etc env vars calls getenv() and saves the results in just a char *. The string returned by getenv() needs to be copied and saved. Switching the type of the strings from char * to std::string will do this and manage the alloated memory.
Differential Revision: https://reviews.llvm.org/D98554
Summary: Add -fno-split-stack and rename CC1 option from `-split-stacks`
to `-fsplit-stack`.
Test Plan: check-all
Differential Revision: https://reviews.llvm.org/D99245
This patch sets the OF_Text flag correctly for the json file created in Clang::DumpCompilationDatabaseFragmentToDir.
Reviewed By: amccarth
Differential Revision: https://reviews.llvm.org/D99200
SYCL compilations initiated by the driver will spawn off one or more
frontend compilation jobs (one for device and one for host). This patch
reworks the driver options to make upstreaming this from the downstream
SYCL fork easier.
This patch introduces a language option to identify host executions
(SYCLIsHost) and a -cc1 frontend option to enable this mode. -fsycl and
-fno-sycl become driver-only options that are rejected when passed to
-cc1. This is because the frontend and beyond should be looking at
whether the user is doing a device or host compilation specifically.
Because the frontend should only ever be in one mode or the other,
-fsycl-is-device and -fsycl-is-host are mutually exclusive options.
Initially, this flag was meant to only be used through cc1 and not directly
through the clang driver. However, we accidentally ended up using this flag
as a driver flag already for selecting multilibs within the fuchsia toolchain.
We're currently in an awkward state where it's only accepted as a driver flag
when targeting Fuchsia, and all other instances it can only be added via
-Xclang. Since we're ready to use this in Fuchsia, we can just expose this to
the driver for simplicity.
Differential Revision: https://reviews.llvm.org/D98375
In -fno-exceptions -fno-asynchronous-unwind-tables -g0 mode,
GCC does not emit `.cfi_*` directives.
```
% diff <(gcc -fno-asynchronous-unwind-tables -dM -E a.c) <(gcc -dM -E a.c)
130a131
> #define __GCC_HAVE_DWARF2_CFI_ASM 1
```
This macro is useful because code can decide whether inline asm should include `.cfi_*` directives.
`.cfi_*` directives without `.cfi_startproc` can cause assembler errors
(integrated assembler: `this directive must appear between .cfi_startproc and .cfi_endproc directives`).
Differential Revision: https://reviews.llvm.org/D97743
A bug was introduced when adding -munsafe-fp-atomics.
By default it should be off.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D97967
Like nvptx and some other targets, -mconstructor-aliases does not work well with amdgpu,
therefore we disable it in the same approach.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D97959
This makes CC1 and driver defaults consistent.
In addition, for more common cases (-g is specified without -gsplit-dwarf), users will not see -fno-split-dwarf-inlining in CC1 options.
Verified that the below is still true:
* `clang -g` => `splitDebugInlining: false` in DICompileUnit
* `clang -g -gsplit-dwarf` => `splitDebugInlining: false` in DICompileUnit
* `clang -g -gsplit-dwarf -fsplit-dwarf-inlining` => no `splitDebugInlining: false`
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D97706
These flags affect coverage mapping (-fcoverage-mapping), not
-fprofile-[instr-]generate so it makes more sense to use the
-fcoverage-* prefix.
Differential Revision: https://reviews.llvm.org/D97434
We introduce -ffile-compilation-dir shorthand to avoid having to set
-fdebug-compilation-dir and -fprofile-compilation-dir separately. This
is similar to -ffile-prefix-map.
Differential Revision: https://reviews.llvm.org/D97433
This change enables the builtin function declarations
in clang driver by default using the Tablegen solution
along with the implicit include of 'opencl-c-base.h'
header.
A new flag '-cl-no-stdinc' disabling all default
declarations and header includes is added. If any other
mechanisms were used to include the declarations (e.g.
with -Xclang -finclude-default-header) and the new default
approach is not sufficient the, `-cl-no-stdinc` flag has
to be used with clang to activate the old behavior.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D96515
This patch moves the creation of the '-Wspir-compat' argument from cc1 to the driver.
Without this change, generating command line arguments from `CompilerInvocation` cannot be done reliably: there's no way to distinguish whether '-Wspir-compat' was passed to cc1 on the command line (should be generated), or if it was created within `CompilerInvocation::CreateFromArgs` (should not be generated).
This is also in line with how other '-W' flags are handled.
(This was introduced in D21567.)
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D97041
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
would otherwise include template specialization types
This helps reduce the size of the encoded C++ type strings in the binary.
This is enabled by default only on Darwin, but can be enabled/disabled
via command line options.
rdar://63288571
Differential Revision: https://reviews.llvm.org/D96816
The following commits added commandline arguments to control following the Arm
Procedure Call Standard for certain volatile bitfield operations:
- https://reviews.llvm.org/D67399
- https://reviews.llvm.org/D72932
This commit fixes the oversight that these args weren't passed from the driver
to cc1 if appropriate.
Where *appropriate* means:
- `-faapcs-bitfield-width`: is the default, so won't be passed
- `-fno-aapcs-bitfield-width`: should be passed
- `-faapcs-bitfield-load`: should be passed
Differential Revision: https://reviews.llvm.org/D96784
This fixes an issue when "-gdwarf-N" switch was ignored if it was given
before another debug option.
Differential Revision: https://reviews.llvm.org/D96865
Drop the `Separate` form of `-fmodule-name X`, `-fprofile-remapping-file X`, and `-frewrite-map-file X`.
To the best of my knowledge they are not used. Their conventional Joined forms (`-fFOO=`) should be used instead.
`-fdebug-compilation-dir X` is used in several places, e.g. chromium/infra/goma.
It is also advertised in http://blog.llvm.org/2019/11/deterministic-builds-with-clang-and-lld.html
So we keep it but make the EQ form canonical and the Separate form an alias.
Differential Revision: https://reviews.llvm.org/D96886
The option was added in D90507 for C/C++ source files. This patch adds
support for assembly files.
Differential Revision: https://reviews.llvm.org/D96783
This patch adds 2 new options to control when Clang adds `mustprogress`:
1. -ffinite-loops: assume all loops are finite; mustprogress is added
to all loops, regardless of the selected language standard.
2. -fno-finite-loops: assume no loop is finite; mustprogress is not
added to any loop or function. We could add mustprogress to
functions without loops, but we would have to detect that in Clang,
which is probably not worth it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D96419
After D93264, using both -fdebug-info-for-profiling and
-fpseudo-probe-for-profiling will cause the compiler to crash.
Diagnose these conflicting options in the driver.
Also, the existing CodeGen test was using the driver when it should be
running cc1.
Differential Revision: https://reviews.llvm.org/D96354
This patch added a distinct CUID for each input file, which is represented by InputAction.
clang initially creates an InputAction for each input file for the host compilation. In CUDA/HIP action
builder, each InputAction is given a CUID and cloned for each GPU arch, and the CUID is also cloned. In this way,
we guarantee the corresponding device and host compilation for the same file shared the
same CUID. On the other hand, different compilation units have different CUID.
-fuse-cuid=random|hash|none is added to control the method to generate CUID. The default
is hash. -cuid=X is also added to specify CUID explicitly, which overrides -fuse-cuid.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D95007
Currently -fgpu-rdc is not passed to host clang -cc1.
This causes issue because -fgpu-rdc affects shadow
variable linkage in host compilation.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D96105
Generate outline atomics if compiling for armv8-a non-LSE AArch64 Linux
(including Android) targets to use LSE instructions, if they are available,
at runtime. Library support is checked by clang driver which doesn't enable
outline atomics if no proper libraries (libgcc >= 9.3.1 or compiler-rt) found.
Differential Revision: https://reviews.llvm.org/D93585
clang-cl already defaults to C17 for .c files, but no harm
in accepting these flags. Fixes PR48185.
Differential Revision: https://reviews.llvm.org/D95575
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.
Differential Revision: https://reviews.llvm.org/D94820
There are two use cases.
Assembler
We have accrued some code gated on MCAsmInfo::useIntegratedAssembler(). Some
features are supported by latest GNU as, but we have to use
MCAsmInfo::useIntegratedAs() because the newer versions have not been widely
adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26).
Linker
We want to use features supported only by LLD or very new GNU ld, or don't want
to work around older GNU ld. We currently can't represent that "we don't care
about old GNU ld". You can find such workarounds in a few other places, e.g.
Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp
AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276),
R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727https://sourceware.org/bugzilla/show_bug.cgi?id=22969)
Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001;
GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available).
This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table).
This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc.
It changes one codegen place in SHF_MERGE to demonstrate its usage.
`-fbinutils-version=2.35` means the produced object file does not care about GNU
ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced
assembly can be consumed by GNU as>=2.35, but older versions may not work.
`-fbinutils-version=none` means that we can use all ELF features, regardless of
GNU as/ld support.
Both clang and llc need `parseBinutilsVersion`. Such command line parsing is
usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen),
however, ClangCodeGen does not depend on LLVMCodeGen. So I add
`parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget).
Differential Revision: https://reviews.llvm.org/D85474
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.
Differential Revision: https://reviews.llvm.org/D94820
The previous implementation required that `-maltivec` be specified when using either `-mabi=vec-extabi` or `-mabi=vec-default`, this patch removes that requirement.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D94986
This generalizes D94647 to IR input, as suggested by @tejohnson.
Ideally the driver should just forward split dwarf options, but doing this currently will cause `clang -gsplit-dwarf -c a.c` to create a .dwo with just `.strtab`.
Reviewed By: dblaikie, tejohnson
Differential Revision: https://reviews.llvm.org/D94655
-g is an IR generation option while -gsplit-dwarf is an object file generation option.
For -gsplit-dwarf in the backend phase of a distributed ThinLTO (-fthinlto-index=) which does object file generation and no IR generation, -g should not be needed.
This patch makes `-fthinlto-index= -gsplit-dwarf` emit .dwo even in the absence of -g.
This should fix https://crbug.com/1158215 after D80391.
```
// Distributed ThinLTO usage
clang -g -O2 -c -flto=thin -fthin-link-bitcode=a.indexing.o a.c
clang -g -O2 -c -flto=thin -fthin-link-bitcode=b.indexing.o b.c
clang -fuse-ld=lld -Wl,--thinlto-index-only=a.rsp -Wl,--thinlto-prefix-replace=';lto/' -Wl,--thinlto-object-suffix-replace='.indexing.o;.o' a.indexing.o b.indexing.o
clang -gsplit-dwarf -O2 -c -fthinlto-index=lto/a.o.thinlto.bc a.o -o lto/a.o
clang -gsplit-dwarf -O2 -c -fthinlto-index=lto/b.o.thinlto.bc b.o -o lto/b.o
clang -fuse-ld=lld @a.rsp -o exe
```
Note: for implicit regular/Thin LTO, .dwo emission works without this patch:
`clang -flto=thin -gsplit-dwarf a.o b.o` passes `-plugin-opt=dwo_dir=` to the linker.
The linker forwards the option to LTO. LTOBackend.cpp emits `$dwo_dir/[01234].dwo`.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D94647
If conflicting `-fprofile-generate -fcs-profile-generate` are used together,
there is currently an assertion failure. Fix the failure.
Also add some driver tests.
Reviewed By: xur
Differential Revision: https://reviews.llvm.org/D94463
This patch removes the -f[no-]trapping-math flags from the -cc1 command line. These flags are ignored in the command line parser and their semantics is fully handled by -ffp-exception-mode.
This patch does not remove -f[no-]trapping-math from the driver command line. The driver flags are being used and do affect compilation.
Reviewed By: dexonsmith, SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D93395
GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made
-fpie code emit R_X86_64_PC32 to reference external data symbols by default.
Clang adopted -mpie-copy-relocations D19996 as a flexible alternative.
The name -mpie-copy-relocations can be improved [1] and does not capture the
idea that this option can apply to -fno-pic and -fpic [2], so this patch
introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations
their aliases for compatibility.
[1]
For
```
extern int var;
int get() { return var; }
```
if var is defined in another translation unit in the link unit, there is no copy
relocation.
[2]
-fno-pic -fno-direct-access-external-data is useful to avoid copy relocations.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888
If a shared object is linked with -Bsymbolic or --dynamic-list and exports a
data symbol, normally the data symbol cannot be accessed by -fno-pic code
(because by default an absolute relocation is produced which will lead to a copy
relocation). -fno-direct-access-external-data can prevent copy relocations.
-fpic -fdirect-access-external-data can avoid GOT indirection. This is like the
undefined counterpart of -fno-semantic-interposition. However, the user should
define var in another translation unit and link with -Bsymbolic or
--dynamic-list, otherwise the linker will error in a -shared link. Generally
the user has better tools for their goal but I want to mention that this
combination is valid.
On COFF, the behavior is like always -fdirect-access-external-data.
`__declspec(dllimport)` is needed to enable indirect access.
There is currently no plan to affect non-ELF behaviors or -fpic behaviors.
-fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch.
GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D92633
@ikudrin enabled support for dwarf64 in D87011. Adding a clang flag so it can be used through that compilation pass.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D90507
Summary:
Optimized debugging is not supported by ptxas. Debugging information is degraded to line information only if optimizations are enabled, but debugging information would be added back in by the driver if remarks were enabled. This solves https://bugs.llvm.org/show_bug.cgi?id=48153.
Reviewers: jdoerfert tra jholewinski serge-sans-paille
Differential Revision: https://reviews.llvm.org/D94123
This patch propagates the -moutline flag when LTO is enabled and avoids
passing it explicitly to the linker plugin.
Differential Revision: https://reviews.llvm.org/D93385
Add powerpcle support to clang.
For FreeBSD, assume a freestanding environment for now, as we only need it in the first place to build loader, which runs in the OpenFirmware environment instead of the FreeBSD environment.
For Linux, recognize glibc and musl environments to match current usage in Void Linux PPC.
Adjust driver to match current binutils behavior regarding machine naming.
Adjust and expand tests.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93919
The idea is that the CC1 default for ELF should set dso_local on default
visibility external linkage definitions in the default -mrelocation-model pic
mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar.
The refactoring is made available by 2820a2ca3a.
Currently only x86 supports local aliases. We move the decision to the driver.
There are three CC1 states:
* -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable.
* (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local
* -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out.
Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.
cl.exe doesn't understand Zd (in either MSVC 2017 or 2019), so neiter
should we. It used to do the same as `-gline-tables-only` which is
exposed as clang-cl flag as well, so if you want this behavior, use
`gline-tables-only`. That makes it clear that it's a clang-cl-only flag
that won't work with cl.exe.
Motivated by the discussion in D92958.
Differential Revision: https://reviews.llvm.org/D93458
There are out-of-tree tools using clang-offload-bundler to extract
bundles from bundled files. When a bundle is not in the bundled
file, clang-offload-bundler is expected to emit an error message
and return non-zero value. However currently clang-offload-bundler
silently generates empty file for the missing bundles.
Since OpenMP/HIP toolchains expect the current behavior, an option
-allow-missing-bundles is added to let clang-offload-bundler
create empty file when a bundle is missing when unbundling.
The unbundling job action is updated to use this option by
default.
clang-offload-bundler itself will emit error when a bundle
is missing when unbundling by default.
Changes are also made to check duplicate targets in -targets
option and emit error.
Differential Revision: https://reviews.llvm.org/D93068
This patch enables marshalling of the exception model options while enforcing their mutual exclusivity. The clang driver interface remains the same, this only affects the cc1 command line.
Depends on D93215.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D93216
The new PM is considered stable and many downstream groups have adopted it (some
have adopted it for more than two years). Add -f[no-]legacy-pass-manager to reflect the
fact that it is no longer experimental and the legacy pass manager is something we strive to retire.
In the future, when the legacy PM eventually goes away,
-fno-experimental-new-pass-manager and -flegacy-pass-manager will be removed.
This patch also changes -f[no-]legacy-pass-manager to pass `-plugin-opt={new,legacy}-pass-manager` to the linker (supported by both ld.lld and LLVMgold.so) when -flto/-flto=thin is specified
Reviewed By: aeubanks, rsmith
Differential Revision: https://reviews.llvm.org/D92915
This is needed for CUDA compilation where NVPTX back-end only supports DWARF2,
but host compilation should be allowed to use newer DWARF versions.
Differential Revision: https://reviews.llvm.org/D92617
Currently when -gsplit-dwarf is specified (could be buried in a build system),
there is no convenient way to cancel debug fission without affecting the debug
information amount (all of -g0, -g1 -fsplit-dwarf-inlining and -gline-directives-only
can, but they affect the debug information amount).
Reviewed By: #debug-info, dblaikie
Differential Revision: https://reviews.llvm.org/D92809
RFC: http://lists.llvm.org/pipermail/cfe-dev/2020-May/065430.html
Agreement from GCC: https://sourceware.org/pipermail/gcc-patches/2020-May/545688.html
g_flags_Group options generally don't affect the amount of debugging
information. -gsplit-dwarf is an exception. Its order dependency with
other gN_Group options make it inconvenient in a build system:
* -g0 -gsplit-dwarf -> level 2
-gsplit-dwarf "upgrades" the amount of debugging information despite
the previous intention (-g0) to drop debugging information
* -g1 -gsplit-dwarf -> level 2
-gsplit-dwarf "upgrades" the amount of debugging information.
* If we have a higher-level -gN, -gN -gsplit-dwarf will supposedly decrease the
amount of debugging information. This happens with GCC -g3.
The non-orthogonality has confused many users. GCC 11 will change the semantics
(-gsplit-dwarf no longer implies -g2) despite the backwards compatibility break.
This patch matches its behavior.
New semantics:
* If there is a g_Group, allow split DWARF if useful
(none of: -g0, -gline-directives-only, -g1 -fno-split-dwarf-inlining)
* Otherwise, no-op.
To restore the original behavior, replace -gsplit-dwarf with -gsplit-dwarf -g.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D80391
Currently, -ftime-report + new pass manager emits one line of report for each
pass run. This potentially causes huge output text especially with regular LTO
or large single file (Obeserved in private tests and was reported in D51276).
The behaviour of -ftime-report + legacy pass manager is
emitting one line of report for each pass object which has relatively reasonable
text output size. This patch adds a flag `-ftime-report=` to control time report
aggregation for new pass manager.
The flag is for new pass manager only. Using it with legacy pass manager gives
an error. It is a driver and cc1 flag. `per-pass` is the new default so
`-ftime-report` is aliased to `-ftime-report=per-pass`. Before this patch,
functionality-wise `-ftime-report` is aliased to `-ftime-report=per-pass-run`.
* Adds an boolean variable TimePassesHandler::PerRun to control per-pass vs per-pass-run.
* Adds a new clang CodeGen flag CodeGenOptions::TimePassesPerRun to work with the existing CodeGenOptions::TimePasses.
* Remove FrontendOptions::ShowTimers, its uses are replaced by the existing CodeGenOptions::TimePasses.
* Remove FrontendTimesIsEnabled (It was introduced in D45619 which was largely reverted.)
Differential Revision: https://reviews.llvm.org/D92436
This patch implements correct hostness based overloading resolution
in isBetterOverloadCandidate.
Based on hostness, if one candidate is emittable whereas the other
candidate is not emittable, the emittable candidate is better.
If both candidates are emittable, or neither is emittable based on hostness, then
other rules should be used to determine which is better. This is because
hostness based overloading resolution is mostly for determining
viability of a function. If two functions are both viable, other factors
should take precedence in preference.
If other rules cannot determine which is better, CUDA preference will be
used again to determine which is better.
However, correct hostness based overloading resolution
requires overloading resolution diagnostics to be deferred,
which is not on by default. The rationale is that deferring
overloading resolution diagnostics may hide overloading reslolutions
issues in header files.
An option -fgpu-exclude-wrong-side-overloads is added, which is off by
default.
When -fgpu-exclude-wrong-side-overloads is off, keep the original behavior,
that is, exclude wrong side overloads only if there are same side overloads.
This may result in incorrect overloading resolution when there are no
same side candates, but is sufficient for most CUDA/HIP applications.
When -fgpu-exclude-wrong-side-overloads is on, enable deferring
overloading resolution diagnostics and enable correct hostness
based overloading resolution, i.e., always exclude wrong side overloads.
Differential Revision: https://reviews.llvm.org/D80450
This change introduces a new clang switch `-fpseudo-probe-for-profiling` to enable AutoFDO with pseudo instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.
One implication from pseudo-probe instrumentation is that the profile is now sensitive to CFG changes. We perform the pseudo instrumentation very early in the pre-LTO pipeline, before any CFG transformation. This ensures that the CFG instrumented and annotated is stable and optimization-resilient.
The early instrumentation also allows the inliner to duplicate probes for inlined instances. When a probe along with the other instructions of a callee function are inlined into its caller function, the GUID of the callee function goes with the probe. This allows samples collected on inlined probes to be reported for the original callee function.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D86502
Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX.
The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc.
Reviewed By: Xiangling_L, DiggerLin
Differential Revision: https://reviews.llvm.org/D89684
This patch implements out of line atomics for LSE deployment
mechanism. Details how it works can be found in llvm/docs/Atomics.rst
Options -moutline-atomics and -mno-outline-atomics to enable and disable it
were added to clang driver. This is clang and llvm part of out-of-line atomics
interface, library part is already supported by libgcc. Compiler-rt
support is provided in separate patch.
Differential Revision: https://reviews.llvm.org/D91157
- The new option, -arcmt-action, is a simple enum based option.
- The driver is modified to translate the existing -ccc-acmt-* options accordingly
Depends on D83298
Reviewed By: Bigcheese
Original patch by Daniel Grumberg.
Differential Revision: https://reviews.llvm.org/D83315
Add an option -munsafe-fp-atomics for AMDGPU target.
When enabled, clang adds function attribute "amdgpu-unsafe-fp-atomics"
to any functions for amdgpu target. This allows amdgpu backend to use
unsafe fp atomic instructions in these functions.
Differential Revision: https://reviews.llvm.org/D91546
VE needs to support integrated assembler and "nas". This "nas"
doesn't recognize ".sigaddr" pseudo mnemonics, so need to disable
it. This patch disable it on VE by default. Also add a regression
test for that.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91350
The behavior is controlled by the `-fprebuilt-implicit-modules` option, and
allows searching for implicit modules in the prebuilt module cache paths.
The current command-line options for prebuilt modules do not allow to easily
maintain and use multiple versions of modules. Both the producer and users of
prebuilt modules are required to know the relationships between compilation
options and module file paths. Using a particular version of a prebuilt module
requires passing a particular option on the command line (e.g.
`-fmodule-file=[<name>=]<file>` or `-fprebuilt-module-path=<directory>`).
However the compiler already knows how to distinguish and automatically locate
implicit modules. Hence this proposal to introduce the
`-fprebuilt-implicit-modules` option. When set, it enables searching for
implicit modules in the prebuilt module paths (specified via
`-fprebuilt-module-path`). To not modify existing behavior, this search takes
place after the standard search for prebuilt modules. If not
Here is a workflow illustrating how both the producer and consumer of prebuilt
modules would need to know what versions of prebuilt modules are available and
where they are located.
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules_v1 <config 1 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules_v2 <config 2 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules_v3 <config 3 options>
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap -fprebuilt-module-path=prebuilt_modules_v1 <config 1 options>
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap <non-prebuilt config options>
With prebuilt implicit modules, the producer can generate prebuilt modules as
usual, all in the same output directory. The same mechanisms as for implicit
modules take care of incorporating hashes in the path to distinguish between
module versions.
Note that we do not specify the output module filename, so `-o` implicit modules are generated in the cache path `prebuilt_modules`.
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules <config 1 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules <config 2 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules <config 3 options>
The user can now simply enable prebuilt implicit modules and point to the
prebuilt modules cache. No need to "parse" command-line options to decide
what prebuilt modules (paths) to use.
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap -fprebuilt-module-path=prebuilt_modules -fprebuilt-implicit-modules <config 1 options>
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap -fprebuilt-module-path=prebuilt_modules -fprebuilt-implicit-modules <non-prebuilt config options>
This is for example particularly useful in a use-case where compilation is
expensive, and the configurations expected to be used are predictable, but not
controlled by the producer of prebuilt modules. Modules for the set of
predictable configurations can be prebuilt, and using them does not require
"parsing" the configuration (command-line options).
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D68997
415f7ee883 had LIT test failures on any build where the clang executable
was not called "clang". I have adjusted the LIT CHECKs to remove the
binary name to fix this.
Original commit message:
For PlayStation we offer source code compatibility with
Microsoft's dllimport/export annotations; however, our file
format is based on ELF.
To support this we translate from DLL storage class to ELF
visibility at the end of codegen in Clang.
Other toolchains have used similar strategies (e.g. see the
documentation for this ARM toolchain:
https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)
This patch adds the ability to perform this translation. Options
are provided to support customizing the mapping behaviour.
Differential Revision: https://reviews.llvm.org/D89970
Similar to -fprofile-generate=, add -fmemory-profile= which takes a
directory path. This is passed down to LLVM via a new module flag
metadata. LLVM in turn provides this name to the runtime via the new
__memprof_profile_filename variable.
Additionally, always pass a default filename (in $cwd if a directory
name is not specified vi the = form of the option). This is also
consistent with the behavior of the PGO instrumentation. Since the
memory profiles will generally be fairly large, it doesn't make sense to
dump them to stderr. Also, importantly, the memory profiles will
eventually be dumped in a compact binary format, which is another reason
why it does not make sense to send these to stderr by default.
Change the existing memprof tests to specify log_path=stderr when that
was being relied on.
Depends on D89086.
Differential Revision: https://reviews.llvm.org/D89087
Since Wasm comdat sections work similarly to ELF, we can use that mechanism
to eliminate duplicate dwarf type information in the same way.
Differential Revision: https://reviews.llvm.org/D88603
Make the virtual method Toolchain::GetDefaultStackProtectorLevel()
return an explict enum value rather than an integral constant. This
makes the code subjectively easier to read, and should help prevent bugs
that may (or may never) arise from changing the enum values. Previously,
these were just kept in sync via a comment, which is brittle. The trade
off is including a additional header in a few new places. It is not
necessary, but in my opinion helps the readability.
Split off from https://reviews.llvm.org/D90194 to help cut down on lines
changed in code review.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D90271
Since Wasm comdat sections work similarly to ELF, we can use that mechanism
to eliminate duplicate dwarf type information in the same way.
Differential Revision: https://reviews.llvm.org/D88603
1. Emit error for -G driver option on AIX
2. Adjust cmake file to use -Wl,-G instead of -G
On AIX, legacy XL compiler uses -G to produce a shared object enabled
for use with the run-time linker, which has different meanings from what
it is used for in Clang. And in Clang, other targets do not have -G map
to another functionality in their legacy compiler. So this error is more
important when we are on AIX.
Differential Revision: https://reviews.llvm.org/D89897
With -fbasicblock-sections=, let the front-end handle the case where the file
doesnt exist. The driver only checks if the option syntax is right.
Differential Revision: https://reviews.llvm.org/D89500
* Make cc1 and cc1as --compress-debug-sections an alias for --compress-debug-sections=zlib
* Make -gz an alias for -gz=zlib
The new behavior is consistent with GCC when binutils>=2.26 is detected:
-gz is translated to --compress-debug-sections=zlib instead of --compress-debug-sections.
- The goal of this patch is improve option compatible with RISCV-V GCC,
-mcpu support on GCC side will sent patch in next few days.
- -mtune only affect the pipeline model and non-arch/extension related
target feature, e.g. instruction fusion; in td file it called
TuneFeatures, which is introduced by X86 back-end[1].
- -mtune accept all valid option for -mcpu and extra alias processor
option, e.g. `generic`, `rocket` and `sifive-7-series`, the purpose is
option compatible with RISCV-V GCC.
- Processor alias for -mtune will resolve according the current target arch,
rv32 or rv64, e.g. `rocket` will resolve to `rocket-rv32` or `rocket-rv64`.
- Interaction between -mcpu and -mtune:
* -mtune has higher priority than -mcpu for pipeline model and
TuneFeatures.
[1] https://reviews.llvm.org/D85165
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D89025
This reverts commits 683b308c07 and
8487bfd4e9.
We will go for a more restricted approach that does not give freedom to
everyone to change ABIs on whichever platform.
See the discussion on https://reviews.llvm.org/D85802.
This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through
a compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.
In this patch:
- Store `-fc++-abi=` in a LangOpt. This isn't stored in a
CodeGenOpt because there are instances outside of codegen where Clang
needs to know what the ABI is (particularly through
ASTContext::createCXXABI), and we should be able to override the
target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
through this flag.
- Create a .def file for these ABIs to make it easier to check flag
values.
- Add an error for diagnosing bad ABI flag values.
Differential Revision: https://reviews.llvm.org/D85802
Summary:
This patch does the following:
1. Make InitTargetOptionsFromCodeGenFlags() accepts Triple as a
parameter, because some options' default value is triple dependant.
2. DataSections is turned on by default on AIX for llc.
3. Test cases change accordingly because of the default behaviour change.
4. Clang Driver passes in -fdata-sections by default on AIX.
Reviewed By: MaskRay, DiggerLin
Differential Revision: https://reviews.llvm.org/D88737
SUMMARY:
In IBM compiler xlclang , there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html.
We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default in AIX). With this option llvm does not emit any visibility attribute to ASM or XCOFF object file.
The option only work on the AIX OS, for other non-AIX OS using the option will report an unsupported options error.
In AIX OS:
1.1 the option -mignore-xcoff-visibility is enabled by default , if there is not -fvisibility=* and -mignore-xcoff-visibility explicitly in the clang command .
1.2 if there is -fvisibility=* explicitly but not -mignore-xcoff-visibility explicitly in the clang command. it will generate visibility attributes.
1.3 if there are both -fvisibility=* and -mignore-xcoff-visibility explicitly in the clang command. The option "-mignore-xcoff-visibility" wins , it do not emit the visibility attribute.
The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR.
Reviewer: daltenty,Jason Liu
Differential Revision: https://reviews.llvm.org/D87451
Object of class `Command` contains various properties of a command to
execute, but output file was missed from them. This change adds this
property. It is required for reporting consumed time and memory implemented
in D78903 and may be used in other cases too.
Differential Revision: https://reviews.llvm.org/D78902
A lot of our code building with clang-cl.exe using Clang 11 was failing with
the following 2 type of errors:
1. explicit specialization of 'foo' after instantiation
2. no matching function for call to 'bar'
Note that we also use -fdelayed-template-parsing in our builds.
I tried pretty hard to get a small repro for these failures, but couldn't. So
there is some subtle edge case in the -fpch-instantiate-templates feature
introduced by this change: https://reviews.llvm.org/D69585
When I tried turning this off using -fno-pch-instantiate-templates, builds
would silently fail with the same error without any indication that
-fno-pch-instantiate-templates was being ignored by the compiler. Then I
realized this "no" option wasn't actually working when I ran Clang under a
debugger.
Differential revision: https://reviews.llvm.org/D88680
GCC 7 introduced -fprofile-update={atomic,prefer-atomic} (prefer-atomic is for
best efforts (some targets do not support atomics)) to increment counters
atomically, which is exactly what we have done with -fprofile-instr-generate
(D50867) and -fprofile-arcs (b5ef137c11).
This patch adds the option to clang to surface the internal options at driver level.
GCC 7 also turned on -fprofile-update=prefer-atomic when -pthread is specified,
but it has performance regression
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89307). So we don't follow suit.
Differential Revision: https://reviews.llvm.org/D87737
recommit e50465ecef with fix for
regression in lldb tests.
Two issues:
1. the directory part of original .dwo file was dropped
2. if the stem of the .dwo file contains '.', the last dot
and strings after that were removed
This recommit fixes those two issues.
Set the default wchar_t type on z/OS, and unsigned as the default.
Reviewed By: hubert.reinterpretcast, fanbo-meng
Differential Revision: https://reviews.llvm.org/D87624
when -gsplit option is used with clang driver, clang driver will create
a filename with .dwo option based on the input file name and pass
it to clang -cc1. This file is used for storing the debug info. Since
HIP generate separate object files for different GPU arch's,
this file should be different for different GPU arch. This patch
adds _ and GPU arch to the stem of the dwo file.
Differential Revision: https://reviews.llvm.org/D87791
Enforcing a profile available check in the driver does not work with
incremental LTO builds where the LTO backend invocation does not include
the profile flags. At this point the profiles have already been consumed
and the IR contains profile metadata. Instead we always pass through the
-fsplit-machine-functions flag on user request. The pass itself contains
a check to return early if no profile information is available.
Differential Revision: https://reviews.llvm.org/D87943
Initial support for dwarf fission sections (-gsplit-dwarf) on wasm.
The most interesting change is support for writing 2 files (.o and .dwo) in the
wasm object writer. My approach moves object-writing logic into its own function
and calls it twice, swapping out the endian::Writer (W) in between calls.
It also splits the import-preparation step into its own function (and skips it when writing a dwo).
Differential Revision: https://reviews.llvm.org/D85685
Writing the .note.gnu.property manually is error prone and hard to
maintain in the assembly files.
The -mmark-bti-property is for the assembler to emit the section with the
GNU_PROPERTY_AARCH64_FEATURE_1_BTI. To be used when C/C++ is compiled
with -mbranch-protection=bti.
This patch refactors the .note.gnu.property handling.
Reviewed By: chill, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D81930
Reland with test dependency on aarch64 target.
Writing the .note.gnu.property manually is error prone and hard to
maintain in the assembly files.
The -mmark-bti-property is for the assembler to emit the section with the
GNU_PROPERTY_AARCH64_FEATURE_1_BTI. To be used when C/C++ is compiled
with -mbranch-protection=bti.
This patch refactors the .note.gnu.property handling.
Reviewed By: chill, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D81930
This patch adds a command line flag for the machine function splitter
(added in rG94faadaca4e1).
-fsplit-machine-functions
Split machine functions using profile information (x86 ELF). On
other targets an error is emitted. If profile information is not
provided a warning is emitted notifying the user that profile
information is required.
Differential Revision: https://reviews.llvm.org/D87047
Basic block sections is untested on other platforms and binary formats apart
from x86,elf. This patch emits a warning and drops the flag if the platform
and binary format are not compatible. Add a test to ensure that
specifying an incompatible target in the driver does not enable the
feature.
Differential Revision: https://reviews.llvm.org/D87426
This effectively disables r340386 on Darwin, and provides a command line flag
to opt into/out of this behaviour. This change is needed to compile certain
Apple headers correctly.
rdar://47688592
Differential revision: https://reviews.llvm.org/D86881
For the PS4, do not emit "-tune-cpu generic" since the platform only has 1 known CPU and we do not want to prevent optimizations by tuning for a generic rather than the specific processor it contains.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D86965
See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).
This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.
The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.
Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.
Differential Revision: https://reviews.llvm.org/D85948
This patch defaults to -mtune=generic unless -march is present. If -march is present we'll use the empty string unless its overridden by mtune. The back should use the target cpu if the tune-cpu isn't present.
It also adds AST serialization support to fix some tests that emit AST and parse it back. These tests diff the IR against the output from not going through AST. So if we don't serialize the tune CPU we fail the diff.
Differential Revision: https://reviews.llvm.org/D86488
Some code bases out there pass -mtune=generic to clang. This would have
been ignored prior to D85384. Now it results in an error
because "generic" isn't recognized by isValidCPUName.
And if we let it go through to the backend as a tune
setting it would get the tune flags closer to i386 rather
than a modern CPU.
I plan to change what tune=generic does in the backend in
a future patch. And allow this in the frontend.
But this should be a quick fix for the error some users
are seeing.
Building on the backend support from D85165. This parses the command line option in the driver, passes it on to CC1 and adds a function attribute.
-Still need to support tune on the target attribute.
-Need to use "generic" as the tuning by default. But need to change generic in the backend first.
-Need to set tune if march is specified and mtune isn't.
-May need to disable getHostCPUName's ability to guess CPU name from features when it doesn't have a family/model match for mtune=native. That's what gcc appears to do.
Differential Revision: https://reviews.llvm.org/D85384
Fixes pr/11710.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Resubmit after breaking Windows and OSX builds.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D80242
Using -fmodules-* options for PCHs is a bit confusing, so add -fpch-*
variants. Having extra options also makes it simple to do a configure
check for the feature.
Also document the options in the release notes.
Differential Revision: https://reviews.llvm.org/D83623
No real action is taken for a value of scalable but it provides a
route to disable an earlier specification and is effectively its
default value when omitted.
Patch also removes an "unused variable" warning.
Differential Revision: https://reviews.llvm.org/D84021
Summary:
This patch implements parsing support for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE, version 00bet5,
section 3.7.3) for SVE [1].
The purpose of this attribute is to define fixed-length (VLST) versions
of existing sizeless types (VLAT). For example:
#if __ARM_FEATURE_SVE_BITS==512
typedef svint32_t fixed_svint32_t __attribute__((arm_sve_vector_bits(512)));
#endif
Creates a type 'fixed_svint32_t' that is a fixed-length version of
'svint32_t' that is normal-sized (rather than sizeless) and contains
exactly 512 bits. Unlike 'svint32_t', this type can be used in places
such as structs and arrays where sizeless types can't.
Implemented in this patch is the following:
* Defined and tested attribute taking single argument.
* Checks the argument is an integer constant expression.
* Attribute can only be attached to a single SVE vector or predicate
type, excluding tuple types such as svint32x4_t.
* Added the `-msve-vector-bits=<bits>` flag. When specified the
`__ARM_FEATURE_SVE_BITS__EXPERIMENTAL` macro is defined.
* Added a language option to store the vector size specified by the
`-msve-vector-bits=<bits>` flag. This is used to validate `N ==
__ARM_FEATURE_SVE_BITS`, where N is the number of bits passed to the
attribute and `__ARM_FEATURE_SVE_BITS` is the feature macro defined under
the same flag.
The `__ARM_FEATURE_SVE_BITS` macro will be made non-experimental in the final
patch of the series.
[1] https://developer.arm.com/documentation/100987/latest
This is patch 1/4 of a patch series.
Reviewers: sdesmalen, rsandifo-arm, efriedma, ctetreau, cameron.mcinally, rengolin, aaron.ballman
Reviewed By: sdesmalen, aaron.ballman
Differential Revision: https://reviews.llvm.org/D83550
Do not detect device library by default in rocm detector.
Only detect device library in Rocm and HIP toolchain.
Separate detection of HIP runtime and Rocm device library.
Detect rocm path by version file in host toolchains.
Also added detecting rocm version and printing rocm
installation path and version with -v.
Fixed include path and device library detection for
ROCm 3.5.
Added --hip-version option. Renamed --hip-device-lib-path
to --rocm-device-lib-path.
Fixed default value for -fhip-new-launch-api.
Added default -std option for HIP.
Differential Revision: https://reviews.llvm.org/D82930
Summary:
-debug-info-kind=constructor reduces the amount of class debug info that
is emitted; this patch switches to using this as the default.
Constructor homing emits the complete type info for a class only when the
constructor is emitted, so it is expected that there will be some classes that
are not defined in the debug info anymore because they are never constructed,
and we shouldn't need debug info for these classes.
I compared the PDB files for clang, and there are 273 class types that are defined with `=limited`
but not with `=constructor` (out of ~60,000 total class types).
We've looked at a number of the types that are no longer defined with =constructor. The vast
majority of cases are something like class A is used as a parameter in a member function of
some other class B, which is emitted. But the function that uses class A is never called, and class A
is never constructed, and therefore isn't emitted in the debug info.
Bug: https://bugs.llvm.org/show_bug.cgi?id=46537
Subscribers: aprantl, cfe-commits, lldb-commits
Tags: #clang, #lldb
Differential Revision: https://reviews.llvm.org/D79147
Making -g[no-]column-info opt out reduces the length of a typical CC1 command line.
Additionally, in a non-debug compile, we won't see -dwarf-column-info.
Summary:
If you execute the following commandline multiple times, the behavior was not always the same:
clang++ --target=thumbv7em-none-windows-eabi-coff -march=armv7-m -mcpu=cortex-m7 -o temp.obj -c -x c++ empty.cpp
Most of the time the compilation succeeded, but sometimes clang reported this error:
clang++: error: the target architecture 'thumbv7em' is not supported by the target 'thumbv7em-none-windows-eabi'
The cause of the inconsistent behavior was the uninitialized variable Version.
With these commandline arguments, the variable Version was not set by getAsInteger(),
because it cannot parse a number from the substring "7em" (of "thumbv7em").
To get a consistent behaviour, it's enough to initialize the variable Version to zero.
Zero is smaller than 7, so the comparison will be true.
Then the command always fails with the error message seen above.
By using consumeInteger() instead of getAsInteger() we get 7 from the substring "7em"
and the command does not fail.
Reviewers: compnerd, danielkiss
Reviewed By: danielkiss
Subscribers: danielkiss, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75453
specified at Command creation, rather than as part of the Tool.
This resolves the hack I just added to allow Darwin toolchain to vary
its level of support based on `-mlinker-version=`.
The change preserves the _current_ settings for response-file support.
Some tools look likely to be declaring that they don't support
response files in error, however I kept them as-is in order for this
change to be a simple refactoring.
Differential Revision: https://reviews.llvm.org/D82782
This fixes a unit test. Otherwise here is the original commit:
1) Shared writable directories like /tmp are a security problem.
2) Systems provide dedicated cache directories these days anyway.
3) This also refines LLVM's cache_directory() on Darwin platforms to use
the Darwin per-user cache directory.
Reviewers: compnerd, aprantl, jakehehrlich, espindola, respindola, ilya-biryukov, pcc, sammccall
Reviewed By: compnerd, sammccall
Subscribers: hiraditya, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82362
1) Shared writable directories like /tmp are a security problem.
2) Systems provide dedicated cache directories these days anyway.
3) This also refines LLVM's cache_directory() on Darwin platforms to use
the Darwin per-user cache directory.
Reviewers: compnerd, aprantl, jakehehrlich, espindola, respindola, ilya-biryukov, pcc, sammccall
Reviewed By: compnerd, sammccall
Subscribers: hiraditya, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82362
Summary:
Added support for dynamic memory allocation for globalized variables in
case if execution of target regions in parallel is required.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82324
Add -fpch-instantiate-templates which makes template instantiations be
performed already in the PCH instead of it being done in every single
file that uses the PCH (but every single file will still do it as well
in order to handle its own instantiations). I can see 20-30% build
time saved with the few tests I've tried.
The change may reorder compiler output and also generated code, but
should be generally safe and produce functionally identical code.
There are some rare cases that do not compile with it,
such as test/PCH/pch-instantiate-templates-forward-decl.cpp. If
template instantiation bailed out instead of reporting the error,
these instantiations could even be postponed, which would make them
work.
Enable this by default for clang-cl. MSVC creates PCHs by compiling
them using an empty .cpp file, which means templates are instantiated
while building the PCH and so the .h needs to be self-contained,
making test/PCH/pch-instantiate-templates-forward-decl.cpp to fail
with MSVC anyway. So the option being enabled for clang-cl matches this.
Differential Revision: https://reviews.llvm.org/D69585
On AIX, we use __atexit to register dtor functions rather than __cxa_atexit.
So a driver change is needed to default AIX to using -fno-use-cxa-atexit.
Windows platform does not uses __cxa_atexit either. Following its precedent,
we remove the assertion for when -fuse-cxa-atexit is specified by the user,
do not produce a message and silently default to -fno-use-cxa-atexit behavior.
Differential Revision: https://reviews.llvm.org/D82136