The idea is that the CC1 default for ELF should set dso_local on default
visibility external linkage definitions in the default -mrelocation-model pic
mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar.
The refactoring is made available by 2820a2ca3a.
Currently only x86 supports local aliases. We move the decision to the driver.
There are three CC1 states:
* -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable.
* (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local
* -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out.
Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.
cl.exe doesn't understand Zd (in either MSVC 2017 or 2019), so neiter
should we. It used to do the same as `-gline-tables-only` which is
exposed as clang-cl flag as well, so if you want this behavior, use
`gline-tables-only`. That makes it clear that it's a clang-cl-only flag
that won't work with cl.exe.
Motivated by the discussion in D92958.
Differential Revision: https://reviews.llvm.org/D93458
There are out-of-tree tools using clang-offload-bundler to extract
bundles from bundled files. When a bundle is not in the bundled
file, clang-offload-bundler is expected to emit an error message
and return non-zero value. However currently clang-offload-bundler
silently generates empty file for the missing bundles.
Since OpenMP/HIP toolchains expect the current behavior, an option
-allow-missing-bundles is added to let clang-offload-bundler
create empty file when a bundle is missing when unbundling.
The unbundling job action is updated to use this option by
default.
clang-offload-bundler itself will emit error when a bundle
is missing when unbundling by default.
Changes are also made to check duplicate targets in -targets
option and emit error.
Differential Revision: https://reviews.llvm.org/D93068
This patch enables marshalling of the exception model options while enforcing their mutual exclusivity. The clang driver interface remains the same, this only affects the cc1 command line.
Depends on D93215.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D93216
The new PM is considered stable and many downstream groups have adopted it (some
have adopted it for more than two years). Add -f[no-]legacy-pass-manager to reflect the
fact that it is no longer experimental and the legacy pass manager is something we strive to retire.
In the future, when the legacy PM eventually goes away,
-fno-experimental-new-pass-manager and -flegacy-pass-manager will be removed.
This patch also changes -f[no-]legacy-pass-manager to pass `-plugin-opt={new,legacy}-pass-manager` to the linker (supported by both ld.lld and LLVMgold.so) when -flto/-flto=thin is specified
Reviewed By: aeubanks, rsmith
Differential Revision: https://reviews.llvm.org/D92915
This is needed for CUDA compilation where NVPTX back-end only supports DWARF2,
but host compilation should be allowed to use newer DWARF versions.
Differential Revision: https://reviews.llvm.org/D92617
Currently when -gsplit-dwarf is specified (could be buried in a build system),
there is no convenient way to cancel debug fission without affecting the debug
information amount (all of -g0, -g1 -fsplit-dwarf-inlining and -gline-directives-only
can, but they affect the debug information amount).
Reviewed By: #debug-info, dblaikie
Differential Revision: https://reviews.llvm.org/D92809
RFC: http://lists.llvm.org/pipermail/cfe-dev/2020-May/065430.html
Agreement from GCC: https://sourceware.org/pipermail/gcc-patches/2020-May/545688.html
g_flags_Group options generally don't affect the amount of debugging
information. -gsplit-dwarf is an exception. Its order dependency with
other gN_Group options make it inconvenient in a build system:
* -g0 -gsplit-dwarf -> level 2
-gsplit-dwarf "upgrades" the amount of debugging information despite
the previous intention (-g0) to drop debugging information
* -g1 -gsplit-dwarf -> level 2
-gsplit-dwarf "upgrades" the amount of debugging information.
* If we have a higher-level -gN, -gN -gsplit-dwarf will supposedly decrease the
amount of debugging information. This happens with GCC -g3.
The non-orthogonality has confused many users. GCC 11 will change the semantics
(-gsplit-dwarf no longer implies -g2) despite the backwards compatibility break.
This patch matches its behavior.
New semantics:
* If there is a g_Group, allow split DWARF if useful
(none of: -g0, -gline-directives-only, -g1 -fno-split-dwarf-inlining)
* Otherwise, no-op.
To restore the original behavior, replace -gsplit-dwarf with -gsplit-dwarf -g.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D80391
Currently, -ftime-report + new pass manager emits one line of report for each
pass run. This potentially causes huge output text especially with regular LTO
or large single file (Obeserved in private tests and was reported in D51276).
The behaviour of -ftime-report + legacy pass manager is
emitting one line of report for each pass object which has relatively reasonable
text output size. This patch adds a flag `-ftime-report=` to control time report
aggregation for new pass manager.
The flag is for new pass manager only. Using it with legacy pass manager gives
an error. It is a driver and cc1 flag. `per-pass` is the new default so
`-ftime-report` is aliased to `-ftime-report=per-pass`. Before this patch,
functionality-wise `-ftime-report` is aliased to `-ftime-report=per-pass-run`.
* Adds an boolean variable TimePassesHandler::PerRun to control per-pass vs per-pass-run.
* Adds a new clang CodeGen flag CodeGenOptions::TimePassesPerRun to work with the existing CodeGenOptions::TimePasses.
* Remove FrontendOptions::ShowTimers, its uses are replaced by the existing CodeGenOptions::TimePasses.
* Remove FrontendTimesIsEnabled (It was introduced in D45619 which was largely reverted.)
Differential Revision: https://reviews.llvm.org/D92436
This patch implements correct hostness based overloading resolution
in isBetterOverloadCandidate.
Based on hostness, if one candidate is emittable whereas the other
candidate is not emittable, the emittable candidate is better.
If both candidates are emittable, or neither is emittable based on hostness, then
other rules should be used to determine which is better. This is because
hostness based overloading resolution is mostly for determining
viability of a function. If two functions are both viable, other factors
should take precedence in preference.
If other rules cannot determine which is better, CUDA preference will be
used again to determine which is better.
However, correct hostness based overloading resolution
requires overloading resolution diagnostics to be deferred,
which is not on by default. The rationale is that deferring
overloading resolution diagnostics may hide overloading reslolutions
issues in header files.
An option -fgpu-exclude-wrong-side-overloads is added, which is off by
default.
When -fgpu-exclude-wrong-side-overloads is off, keep the original behavior,
that is, exclude wrong side overloads only if there are same side overloads.
This may result in incorrect overloading resolution when there are no
same side candates, but is sufficient for most CUDA/HIP applications.
When -fgpu-exclude-wrong-side-overloads is on, enable deferring
overloading resolution diagnostics and enable correct hostness
based overloading resolution, i.e., always exclude wrong side overloads.
Differential Revision: https://reviews.llvm.org/D80450
This change introduces a new clang switch `-fpseudo-probe-for-profiling` to enable AutoFDO with pseudo instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.
One implication from pseudo-probe instrumentation is that the profile is now sensitive to CFG changes. We perform the pseudo instrumentation very early in the pre-LTO pipeline, before any CFG transformation. This ensures that the CFG instrumented and annotated is stable and optimization-resilient.
The early instrumentation also allows the inliner to duplicate probes for inlined instances. When a probe along with the other instructions of a callee function are inlined into its caller function, the GUID of the callee function goes with the probe. This allows samples collected on inlined probes to be reported for the original callee function.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D86502
Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX.
The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc.
Reviewed By: Xiangling_L, DiggerLin
Differential Revision: https://reviews.llvm.org/D89684
This patch implements out of line atomics for LSE deployment
mechanism. Details how it works can be found in llvm/docs/Atomics.rst
Options -moutline-atomics and -mno-outline-atomics to enable and disable it
were added to clang driver. This is clang and llvm part of out-of-line atomics
interface, library part is already supported by libgcc. Compiler-rt
support is provided in separate patch.
Differential Revision: https://reviews.llvm.org/D91157
- The new option, -arcmt-action, is a simple enum based option.
- The driver is modified to translate the existing -ccc-acmt-* options accordingly
Depends on D83298
Reviewed By: Bigcheese
Original patch by Daniel Grumberg.
Differential Revision: https://reviews.llvm.org/D83315
Add an option -munsafe-fp-atomics for AMDGPU target.
When enabled, clang adds function attribute "amdgpu-unsafe-fp-atomics"
to any functions for amdgpu target. This allows amdgpu backend to use
unsafe fp atomic instructions in these functions.
Differential Revision: https://reviews.llvm.org/D91546
VE needs to support integrated assembler and "nas". This "nas"
doesn't recognize ".sigaddr" pseudo mnemonics, so need to disable
it. This patch disable it on VE by default. Also add a regression
test for that.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91350
The behavior is controlled by the `-fprebuilt-implicit-modules` option, and
allows searching for implicit modules in the prebuilt module cache paths.
The current command-line options for prebuilt modules do not allow to easily
maintain and use multiple versions of modules. Both the producer and users of
prebuilt modules are required to know the relationships between compilation
options and module file paths. Using a particular version of a prebuilt module
requires passing a particular option on the command line (e.g.
`-fmodule-file=[<name>=]<file>` or `-fprebuilt-module-path=<directory>`).
However the compiler already knows how to distinguish and automatically locate
implicit modules. Hence this proposal to introduce the
`-fprebuilt-implicit-modules` option. When set, it enables searching for
implicit modules in the prebuilt module paths (specified via
`-fprebuilt-module-path`). To not modify existing behavior, this search takes
place after the standard search for prebuilt modules. If not
Here is a workflow illustrating how both the producer and consumer of prebuilt
modules would need to know what versions of prebuilt modules are available and
where they are located.
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules_v1 <config 1 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules_v2 <config 2 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules_v3 <config 3 options>
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap -fprebuilt-module-path=prebuilt_modules_v1 <config 1 options>
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap <non-prebuilt config options>
With prebuilt implicit modules, the producer can generate prebuilt modules as
usual, all in the same output directory. The same mechanisms as for implicit
modules take care of incorporating hashes in the path to distinguish between
module versions.
Note that we do not specify the output module filename, so `-o` implicit modules are generated in the cache path `prebuilt_modules`.
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules <config 1 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules <config 2 options>
clang -cc1 -x c modulemap -fmodules -emit-module -fmodule-name=foo -fmodules-cache-path=prebuilt_modules <config 3 options>
The user can now simply enable prebuilt implicit modules and point to the
prebuilt modules cache. No need to "parse" command-line options to decide
what prebuilt modules (paths) to use.
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap -fprebuilt-module-path=prebuilt_modules -fprebuilt-implicit-modules <config 1 options>
clang -cc1 -x c use.c -fmodules fmodule-map-file=modulemap -fprebuilt-module-path=prebuilt_modules -fprebuilt-implicit-modules <non-prebuilt config options>
This is for example particularly useful in a use-case where compilation is
expensive, and the configurations expected to be used are predictable, but not
controlled by the producer of prebuilt modules. Modules for the set of
predictable configurations can be prebuilt, and using them does not require
"parsing" the configuration (command-line options).
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D68997
415f7ee883 had LIT test failures on any build where the clang executable
was not called "clang". I have adjusted the LIT CHECKs to remove the
binary name to fix this.
Original commit message:
For PlayStation we offer source code compatibility with
Microsoft's dllimport/export annotations; however, our file
format is based on ELF.
To support this we translate from DLL storage class to ELF
visibility at the end of codegen in Clang.
Other toolchains have used similar strategies (e.g. see the
documentation for this ARM toolchain:
https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)
This patch adds the ability to perform this translation. Options
are provided to support customizing the mapping behaviour.
Differential Revision: https://reviews.llvm.org/D89970
Similar to -fprofile-generate=, add -fmemory-profile= which takes a
directory path. This is passed down to LLVM via a new module flag
metadata. LLVM in turn provides this name to the runtime via the new
__memprof_profile_filename variable.
Additionally, always pass a default filename (in $cwd if a directory
name is not specified vi the = form of the option). This is also
consistent with the behavior of the PGO instrumentation. Since the
memory profiles will generally be fairly large, it doesn't make sense to
dump them to stderr. Also, importantly, the memory profiles will
eventually be dumped in a compact binary format, which is another reason
why it does not make sense to send these to stderr by default.
Change the existing memprof tests to specify log_path=stderr when that
was being relied on.
Depends on D89086.
Differential Revision: https://reviews.llvm.org/D89087
Since Wasm comdat sections work similarly to ELF, we can use that mechanism
to eliminate duplicate dwarf type information in the same way.
Differential Revision: https://reviews.llvm.org/D88603
Make the virtual method Toolchain::GetDefaultStackProtectorLevel()
return an explict enum value rather than an integral constant. This
makes the code subjectively easier to read, and should help prevent bugs
that may (or may never) arise from changing the enum values. Previously,
these were just kept in sync via a comment, which is brittle. The trade
off is including a additional header in a few new places. It is not
necessary, but in my opinion helps the readability.
Split off from https://reviews.llvm.org/D90194 to help cut down on lines
changed in code review.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D90271
Since Wasm comdat sections work similarly to ELF, we can use that mechanism
to eliminate duplicate dwarf type information in the same way.
Differential Revision: https://reviews.llvm.org/D88603
1. Emit error for -G driver option on AIX
2. Adjust cmake file to use -Wl,-G instead of -G
On AIX, legacy XL compiler uses -G to produce a shared object enabled
for use with the run-time linker, which has different meanings from what
it is used for in Clang. And in Clang, other targets do not have -G map
to another functionality in their legacy compiler. So this error is more
important when we are on AIX.
Differential Revision: https://reviews.llvm.org/D89897
With -fbasicblock-sections=, let the front-end handle the case where the file
doesnt exist. The driver only checks if the option syntax is right.
Differential Revision: https://reviews.llvm.org/D89500
* Make cc1 and cc1as --compress-debug-sections an alias for --compress-debug-sections=zlib
* Make -gz an alias for -gz=zlib
The new behavior is consistent with GCC when binutils>=2.26 is detected:
-gz is translated to --compress-debug-sections=zlib instead of --compress-debug-sections.
- The goal of this patch is improve option compatible with RISCV-V GCC,
-mcpu support on GCC side will sent patch in next few days.
- -mtune only affect the pipeline model and non-arch/extension related
target feature, e.g. instruction fusion; in td file it called
TuneFeatures, which is introduced by X86 back-end[1].
- -mtune accept all valid option for -mcpu and extra alias processor
option, e.g. `generic`, `rocket` and `sifive-7-series`, the purpose is
option compatible with RISCV-V GCC.
- Processor alias for -mtune will resolve according the current target arch,
rv32 or rv64, e.g. `rocket` will resolve to `rocket-rv32` or `rocket-rv64`.
- Interaction between -mcpu and -mtune:
* -mtune has higher priority than -mcpu for pipeline model and
TuneFeatures.
[1] https://reviews.llvm.org/D85165
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D89025
This reverts commits 683b308c07 and
8487bfd4e9.
We will go for a more restricted approach that does not give freedom to
everyone to change ABIs on whichever platform.
See the discussion on https://reviews.llvm.org/D85802.
This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through
a compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.
In this patch:
- Store `-fc++-abi=` in a LangOpt. This isn't stored in a
CodeGenOpt because there are instances outside of codegen where Clang
needs to know what the ABI is (particularly through
ASTContext::createCXXABI), and we should be able to override the
target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
through this flag.
- Create a .def file for these ABIs to make it easier to check flag
values.
- Add an error for diagnosing bad ABI flag values.
Differential Revision: https://reviews.llvm.org/D85802
Summary:
This patch does the following:
1. Make InitTargetOptionsFromCodeGenFlags() accepts Triple as a
parameter, because some options' default value is triple dependant.
2. DataSections is turned on by default on AIX for llc.
3. Test cases change accordingly because of the default behaviour change.
4. Clang Driver passes in -fdata-sections by default on AIX.
Reviewed By: MaskRay, DiggerLin
Differential Revision: https://reviews.llvm.org/D88737
SUMMARY:
In IBM compiler xlclang , there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html.
We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default in AIX). With this option llvm does not emit any visibility attribute to ASM or XCOFF object file.
The option only work on the AIX OS, for other non-AIX OS using the option will report an unsupported options error.
In AIX OS:
1.1 the option -mignore-xcoff-visibility is enabled by default , if there is not -fvisibility=* and -mignore-xcoff-visibility explicitly in the clang command .
1.2 if there is -fvisibility=* explicitly but not -mignore-xcoff-visibility explicitly in the clang command. it will generate visibility attributes.
1.3 if there are both -fvisibility=* and -mignore-xcoff-visibility explicitly in the clang command. The option "-mignore-xcoff-visibility" wins , it do not emit the visibility attribute.
The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR.
Reviewer: daltenty,Jason Liu
Differential Revision: https://reviews.llvm.org/D87451
Object of class `Command` contains various properties of a command to
execute, but output file was missed from them. This change adds this
property. It is required for reporting consumed time and memory implemented
in D78903 and may be used in other cases too.
Differential Revision: https://reviews.llvm.org/D78902
A lot of our code building with clang-cl.exe using Clang 11 was failing with
the following 2 type of errors:
1. explicit specialization of 'foo' after instantiation
2. no matching function for call to 'bar'
Note that we also use -fdelayed-template-parsing in our builds.
I tried pretty hard to get a small repro for these failures, but couldn't. So
there is some subtle edge case in the -fpch-instantiate-templates feature
introduced by this change: https://reviews.llvm.org/D69585
When I tried turning this off using -fno-pch-instantiate-templates, builds
would silently fail with the same error without any indication that
-fno-pch-instantiate-templates was being ignored by the compiler. Then I
realized this "no" option wasn't actually working when I ran Clang under a
debugger.
Differential revision: https://reviews.llvm.org/D88680
GCC 7 introduced -fprofile-update={atomic,prefer-atomic} (prefer-atomic is for
best efforts (some targets do not support atomics)) to increment counters
atomically, which is exactly what we have done with -fprofile-instr-generate
(D50867) and -fprofile-arcs (b5ef137c11).
This patch adds the option to clang to surface the internal options at driver level.
GCC 7 also turned on -fprofile-update=prefer-atomic when -pthread is specified,
but it has performance regression
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89307). So we don't follow suit.
Differential Revision: https://reviews.llvm.org/D87737
recommit e50465ecef with fix for
regression in lldb tests.
Two issues:
1. the directory part of original .dwo file was dropped
2. if the stem of the .dwo file contains '.', the last dot
and strings after that were removed
This recommit fixes those two issues.
Set the default wchar_t type on z/OS, and unsigned as the default.
Reviewed By: hubert.reinterpretcast, fanbo-meng
Differential Revision: https://reviews.llvm.org/D87624
when -gsplit option is used with clang driver, clang driver will create
a filename with .dwo option based on the input file name and pass
it to clang -cc1. This file is used for storing the debug info. Since
HIP generate separate object files for different GPU arch's,
this file should be different for different GPU arch. This patch
adds _ and GPU arch to the stem of the dwo file.
Differential Revision: https://reviews.llvm.org/D87791
Enforcing a profile available check in the driver does not work with
incremental LTO builds where the LTO backend invocation does not include
the profile flags. At this point the profiles have already been consumed
and the IR contains profile metadata. Instead we always pass through the
-fsplit-machine-functions flag on user request. The pass itself contains
a check to return early if no profile information is available.
Differential Revision: https://reviews.llvm.org/D87943
Initial support for dwarf fission sections (-gsplit-dwarf) on wasm.
The most interesting change is support for writing 2 files (.o and .dwo) in the
wasm object writer. My approach moves object-writing logic into its own function
and calls it twice, swapping out the endian::Writer (W) in between calls.
It also splits the import-preparation step into its own function (and skips it when writing a dwo).
Differential Revision: https://reviews.llvm.org/D85685
Writing the .note.gnu.property manually is error prone and hard to
maintain in the assembly files.
The -mmark-bti-property is for the assembler to emit the section with the
GNU_PROPERTY_AARCH64_FEATURE_1_BTI. To be used when C/C++ is compiled
with -mbranch-protection=bti.
This patch refactors the .note.gnu.property handling.
Reviewed By: chill, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D81930
Reland with test dependency on aarch64 target.
Writing the .note.gnu.property manually is error prone and hard to
maintain in the assembly files.
The -mmark-bti-property is for the assembler to emit the section with the
GNU_PROPERTY_AARCH64_FEATURE_1_BTI. To be used when C/C++ is compiled
with -mbranch-protection=bti.
This patch refactors the .note.gnu.property handling.
Reviewed By: chill, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D81930
This patch adds a command line flag for the machine function splitter
(added in rG94faadaca4e1).
-fsplit-machine-functions
Split machine functions using profile information (x86 ELF). On
other targets an error is emitted. If profile information is not
provided a warning is emitted notifying the user that profile
information is required.
Differential Revision: https://reviews.llvm.org/D87047
Basic block sections is untested on other platforms and binary formats apart
from x86,elf. This patch emits a warning and drops the flag if the platform
and binary format are not compatible. Add a test to ensure that
specifying an incompatible target in the driver does not enable the
feature.
Differential Revision: https://reviews.llvm.org/D87426
This effectively disables r340386 on Darwin, and provides a command line flag
to opt into/out of this behaviour. This change is needed to compile certain
Apple headers correctly.
rdar://47688592
Differential revision: https://reviews.llvm.org/D86881
For the PS4, do not emit "-tune-cpu generic" since the platform only has 1 known CPU and we do not want to prevent optimizations by tuning for a generic rather than the specific processor it contains.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D86965
See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).
This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.
The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.
Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.
Differential Revision: https://reviews.llvm.org/D85948
This patch defaults to -mtune=generic unless -march is present. If -march is present we'll use the empty string unless its overridden by mtune. The back should use the target cpu if the tune-cpu isn't present.
It also adds AST serialization support to fix some tests that emit AST and parse it back. These tests diff the IR against the output from not going through AST. So if we don't serialize the tune CPU we fail the diff.
Differential Revision: https://reviews.llvm.org/D86488
Some code bases out there pass -mtune=generic to clang. This would have
been ignored prior to D85384. Now it results in an error
because "generic" isn't recognized by isValidCPUName.
And if we let it go through to the backend as a tune
setting it would get the tune flags closer to i386 rather
than a modern CPU.
I plan to change what tune=generic does in the backend in
a future patch. And allow this in the frontend.
But this should be a quick fix for the error some users
are seeing.
Building on the backend support from D85165. This parses the command line option in the driver, passes it on to CC1 and adds a function attribute.
-Still need to support tune on the target attribute.
-Need to use "generic" as the tuning by default. But need to change generic in the backend first.
-Need to set tune if march is specified and mtune isn't.
-May need to disable getHostCPUName's ability to guess CPU name from features when it doesn't have a family/model match for mtune=native. That's what gcc appears to do.
Differential Revision: https://reviews.llvm.org/D85384
Fixes pr/11710.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Resubmit after breaking Windows and OSX builds.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D80242
Using -fmodules-* options for PCHs is a bit confusing, so add -fpch-*
variants. Having extra options also makes it simple to do a configure
check for the feature.
Also document the options in the release notes.
Differential Revision: https://reviews.llvm.org/D83623
No real action is taken for a value of scalable but it provides a
route to disable an earlier specification and is effectively its
default value when omitted.
Patch also removes an "unused variable" warning.
Differential Revision: https://reviews.llvm.org/D84021
Summary:
This patch implements parsing support for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE, version 00bet5,
section 3.7.3) for SVE [1].
The purpose of this attribute is to define fixed-length (VLST) versions
of existing sizeless types (VLAT). For example:
#if __ARM_FEATURE_SVE_BITS==512
typedef svint32_t fixed_svint32_t __attribute__((arm_sve_vector_bits(512)));
#endif
Creates a type 'fixed_svint32_t' that is a fixed-length version of
'svint32_t' that is normal-sized (rather than sizeless) and contains
exactly 512 bits. Unlike 'svint32_t', this type can be used in places
such as structs and arrays where sizeless types can't.
Implemented in this patch is the following:
* Defined and tested attribute taking single argument.
* Checks the argument is an integer constant expression.
* Attribute can only be attached to a single SVE vector or predicate
type, excluding tuple types such as svint32x4_t.
* Added the `-msve-vector-bits=<bits>` flag. When specified the
`__ARM_FEATURE_SVE_BITS__EXPERIMENTAL` macro is defined.
* Added a language option to store the vector size specified by the
`-msve-vector-bits=<bits>` flag. This is used to validate `N ==
__ARM_FEATURE_SVE_BITS`, where N is the number of bits passed to the
attribute and `__ARM_FEATURE_SVE_BITS` is the feature macro defined under
the same flag.
The `__ARM_FEATURE_SVE_BITS` macro will be made non-experimental in the final
patch of the series.
[1] https://developer.arm.com/documentation/100987/latest
This is patch 1/4 of a patch series.
Reviewers: sdesmalen, rsandifo-arm, efriedma, ctetreau, cameron.mcinally, rengolin, aaron.ballman
Reviewed By: sdesmalen, aaron.ballman
Differential Revision: https://reviews.llvm.org/D83550
Do not detect device library by default in rocm detector.
Only detect device library in Rocm and HIP toolchain.
Separate detection of HIP runtime and Rocm device library.
Detect rocm path by version file in host toolchains.
Also added detecting rocm version and printing rocm
installation path and version with -v.
Fixed include path and device library detection for
ROCm 3.5.
Added --hip-version option. Renamed --hip-device-lib-path
to --rocm-device-lib-path.
Fixed default value for -fhip-new-launch-api.
Added default -std option for HIP.
Differential Revision: https://reviews.llvm.org/D82930
Summary:
-debug-info-kind=constructor reduces the amount of class debug info that
is emitted; this patch switches to using this as the default.
Constructor homing emits the complete type info for a class only when the
constructor is emitted, so it is expected that there will be some classes that
are not defined in the debug info anymore because they are never constructed,
and we shouldn't need debug info for these classes.
I compared the PDB files for clang, and there are 273 class types that are defined with `=limited`
but not with `=constructor` (out of ~60,000 total class types).
We've looked at a number of the types that are no longer defined with =constructor. The vast
majority of cases are something like class A is used as a parameter in a member function of
some other class B, which is emitted. But the function that uses class A is never called, and class A
is never constructed, and therefore isn't emitted in the debug info.
Bug: https://bugs.llvm.org/show_bug.cgi?id=46537
Subscribers: aprantl, cfe-commits, lldb-commits
Tags: #clang, #lldb
Differential Revision: https://reviews.llvm.org/D79147
Making -g[no-]column-info opt out reduces the length of a typical CC1 command line.
Additionally, in a non-debug compile, we won't see -dwarf-column-info.
Summary:
If you execute the following commandline multiple times, the behavior was not always the same:
clang++ --target=thumbv7em-none-windows-eabi-coff -march=armv7-m -mcpu=cortex-m7 -o temp.obj -c -x c++ empty.cpp
Most of the time the compilation succeeded, but sometimes clang reported this error:
clang++: error: the target architecture 'thumbv7em' is not supported by the target 'thumbv7em-none-windows-eabi'
The cause of the inconsistent behavior was the uninitialized variable Version.
With these commandline arguments, the variable Version was not set by getAsInteger(),
because it cannot parse a number from the substring "7em" (of "thumbv7em").
To get a consistent behaviour, it's enough to initialize the variable Version to zero.
Zero is smaller than 7, so the comparison will be true.
Then the command always fails with the error message seen above.
By using consumeInteger() instead of getAsInteger() we get 7 from the substring "7em"
and the command does not fail.
Reviewers: compnerd, danielkiss
Reviewed By: danielkiss
Subscribers: danielkiss, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75453
specified at Command creation, rather than as part of the Tool.
This resolves the hack I just added to allow Darwin toolchain to vary
its level of support based on `-mlinker-version=`.
The change preserves the _current_ settings for response-file support.
Some tools look likely to be declaring that they don't support
response files in error, however I kept them as-is in order for this
change to be a simple refactoring.
Differential Revision: https://reviews.llvm.org/D82782
This fixes a unit test. Otherwise here is the original commit:
1) Shared writable directories like /tmp are a security problem.
2) Systems provide dedicated cache directories these days anyway.
3) This also refines LLVM's cache_directory() on Darwin platforms to use
the Darwin per-user cache directory.
Reviewers: compnerd, aprantl, jakehehrlich, espindola, respindola, ilya-biryukov, pcc, sammccall
Reviewed By: compnerd, sammccall
Subscribers: hiraditya, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82362
1) Shared writable directories like /tmp are a security problem.
2) Systems provide dedicated cache directories these days anyway.
3) This also refines LLVM's cache_directory() on Darwin platforms to use
the Darwin per-user cache directory.
Reviewers: compnerd, aprantl, jakehehrlich, espindola, respindola, ilya-biryukov, pcc, sammccall
Reviewed By: compnerd, sammccall
Subscribers: hiraditya, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82362
Summary:
Added support for dynamic memory allocation for globalized variables in
case if execution of target regions in parallel is required.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82324
Add -fpch-instantiate-templates which makes template instantiations be
performed already in the PCH instead of it being done in every single
file that uses the PCH (but every single file will still do it as well
in order to handle its own instantiations). I can see 20-30% build
time saved with the few tests I've tried.
The change may reorder compiler output and also generated code, but
should be generally safe and produce functionally identical code.
There are some rare cases that do not compile with it,
such as test/PCH/pch-instantiate-templates-forward-decl.cpp. If
template instantiation bailed out instead of reporting the error,
these instantiations could even be postponed, which would make them
work.
Enable this by default for clang-cl. MSVC creates PCHs by compiling
them using an empty .cpp file, which means templates are instantiated
while building the PCH and so the .h needs to be self-contained,
making test/PCH/pch-instantiate-templates-forward-decl.cpp to fail
with MSVC anyway. So the option being enabled for clang-cl matches this.
Differential Revision: https://reviews.llvm.org/D69585
On AIX, we use __atexit to register dtor functions rather than __cxa_atexit.
So a driver change is needed to default AIX to using -fno-use-cxa-atexit.
Windows platform does not uses __cxa_atexit either. Following its precedent,
we remove the assertion for when -fuse-cxa-atexit is specified by the user,
do not produce a message and silently default to -fno-use-cxa-atexit behavior.
Differential Revision: https://reviews.llvm.org/D82136
Summary:
Add -ftrivial-auto-var-init-stop-after= to limit the number of times
stack variables are initialized when -ftrivial-auto-var-init= is used to
initialize stack variables to zero or a pattern. This flag can be used
to bisect uninitialized uses of a stack variable exposed by automatic
variable initialization, such as http://crrev.com/c/2020401.
Reviewers: jfb, vitalybuka, kcc, glider, rsmith, rjmccall, pcc, eugenis, vlad.tsyrklevich
Reviewed By: jfb
Subscribers: phosek, hubert.reinterpretcast, srhines, MaskRay, george.burgess.iv, dexonsmith, inglorion, gbiv, llozano, manojgupta, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77168