Summary:
- Similar to other targets, instead of passing a toolchain, a driver
argument should be passed into `arm::getARMTargetFeatures`. Aslo, that
routine should honor the specified triple. Refactor
`arm::getARMFloatABI` with 2 separate interfaces. One has the original
parameters and the other uses the driver and the specified triple.
- That fixes an issue when target & features are queried during the
offload compilation, where the specified triple should be checked
instead of a effective triple. A previously failed test is re-enabled.
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74020
Summary:
As a first step this implementation enables compilation of the offload
code.
Reviewers: ABataev
Subscribers: ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74048
Summary:
- The device compilation needs to have a consistent source code compared
to the corresponding host compilation. If macros based on the
host-specific target processor is not properly populated, the device
compilation may fail due to the inconsistent source after the
preprocessor. So far, only the host triple is used to build the
macros. If a detailed host CPU target or certain features are
specified, macros derived from them won't be populated properly, e.g.
`__SSE3__` won't be added unless `+sse3` feature is present. On
Windows compilation compatible with MSVC, that missing macros result
in that intrinsics are not included and cause device compilation
failure on the host-side source.
- This patch addresses this issue by introducing two `cc1` options,
i.e., `-aux-target-cpu` and `-aux-target-feature`. If a specific host
CPU target or certain features are specified, the compiler driver will
append them during the construction of the offline compilation
actions. Then, the toolchain in `cc1` phase will populate macros
accordingly.
- An internal option `--gpu-use-aux-triple-only` is added to fall back
the original behavior to help diagnosing potential issues from the new
behavior.
Reviewers: tra, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73942
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
First attempt at implementing -fsemantic-interposition.
Rely on GlobalValue::isInterposable that already captures most of the expected
behavior.
Rely on a ModuleFlag to state whether we should respect SemanticInterposition or
not. The default remains no.
So this should be a no-op if -fsemantic-interposition isn't used, and if it is,
isInterposable being already used in most optimisation, they should honor it
properly.
Note that it only impacts architecture compiled with -fPIC and no pie.
Differential Revision: https://reviews.llvm.org/D72829
Summary: With OpenMP offloading host compilation is done in two phases to capture host IR that is passed to all device compilations as input. But it turns out that we currently run entire LLVM optimization pipeline on host IR on both compilations which may have unpredictable effects on the resulting code. This patch fixes this problem by disabling LLVM passes on the first compilation, so the host IR that is passed to device compilations will be captured right after front end.
Reviewers: ABataev, jdoerfert, hfinkel
Reviewed By: ABataev
Subscribers: guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73721
See
https://docs.google.com/document/d/1xMkTZMKx9llnMPgso0jrx3ankI4cv60xeZ0y4ksf4wc/preview
for background discussion.
This adds a warning, flags and pragmas to limit the number of
pre-processor tokens either at a certain point in a translation unit, or
overall.
The idea is that this would allow projects to limit the size of certain
widely included headers, or for translation units overall, as a way to
insert backstops for header bloat and prevent compile-time regressions.
Differential revision: https://reviews.llvm.org/D72703
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.
- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute
AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).
Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.
Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.
-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.
Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.
This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.
AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
These driver options perform some checking and delegate to MC options -x86-align-branch* and -x86-branches-within-32B-boundaries.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D72463
The option will limit debug info by only emitting complete class
type information when its constructor is emitted.
This patch changes comparisons with LimitedDebugInfo to use the new
level instead.
Differential Revision: https://reviews.llvm.org/D72427
With this patch, the clang tool will now call the -cc1 invocation directly inside the same process. Previously, the -cc1 invocation was creating, and waiting for, a new process.
This patch therefore reduces the number of created processes during a build, thus it reduces build times on platforms where process creation can be costly (Windows) and/or impacted by a antivirus.
It also makes debugging a bit easier, as there's no need to attach to the secondary -cc1 process anymore, breakpoints will be hit inside the same process.
Crashes or signaling inside the -cc1 invocation will have the same side-effect as before, and will be reported through the same means.
This behavior can be controlled at compile-time through the CLANG_SPAWN_CC1 cmake flag, which defaults to OFF. Setting it to ON will revert to the previous behavior, where any -cc1 invocation will create/fork a secondary process.
At run-time, it is also possible to tweak the CLANG_SPAWN_CC1 environment variable. Setting it and will override the compile-time setting. A value of 0 calls -cc1 inside the calling process; a value of 1 will create a secondary process, as before.
Differential Revision: https://reviews.llvm.org/D69825
which is the default TLS model for non-PIC objects. This allows large/
many thread local variables or a compact/fast code in an executable.
Specification is same as that of GCC. For example, the code model
option precedes the TLS size option.
TLS access models other than local-exec are not changed. It means
supoort of the large code model is only in the local exec TLS model.
Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>)
Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard
Reviewd By: peter.smith
Committed by: peter.smith
Differential Revision: https://reviews.llvm.org/D71688
All 130+ f_Group flags that take an argument allow it after a '=',
except for fdebug-complation-dir. Add a Joined<> alias so that
it behaves consistently with all the other f_Group flags.
(Keep the old Separate flag for backwards compat.)
In the backend, this feature is implemented with the function attribute
"patchable-function-entry". Both the attribute and XRay use
TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are
incompatible.
Reviewed By: ostannard, MaskRay
Differential Revision: https://reviews.llvm.org/D72222
Summary:
Every powerpc64le platform uses elfv2.
For powerpc64, the environments "elfv1" and "elfv2" were added for
FreeBSD ELFv1->ELFv2 migration in D61950. FreeBSD developers have
decided to use OS versions to select ABI, and no one is relying on the
environments.
Also use elfv2 on powerpc64-linux-musl.
Users can always use -mabi=elfv1 and -mabi=elfv2 to override the default
ABI.
Reviewed By: adalava
Differential Revision: https://reviews.llvm.org/D72352
-mpacked-stack is currently not supported with -mbackchain, so this should
result in a compilation error message instead of being silently ignored.
Review: Ulrich Weigand
gcc/config/{i386,s390} support -mnop-mcount. We currently only support
-mnop-mcount for SystemZ. The function attribute "mnop-mcount" is
ignored on other targets.
gcc/config/{i386,s390} support -mfentry. We currently only support
-mfentry for X86 and SystemZ. TargetOpcode::FENTRY_CALL is not handled
on other targets.
% clang -target aarch64 -pg -mfentry a.c -c
fatal error: error in backend: Not supported instr: <MCInst 21>
-mfentry, -mrecord-mcount, and -mnop-mcount were invented for Linux
ftrace. Linux uses $(call cc-option-yn,-mrecord-mcount) to detect if the
specific feature is available. Reject unsupported features so that Linux
build system will not wrongly consider them available and cause
build/runtime failures.
Note, GCC has stricter checks that we do not implement, e.g. -fpic/-fpie
-fnop-mcount is not allowed on x86, -fpic/-fpie -mfentry is not allowed
on x86-32.
GCC's x86 and s390 ports support -mrecord-mcount. Other ports reject the
option.
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mrecord-mcount’
Allowing this option can cause failures when building Linux kernel for
aarch64, powerpc64, etc, which will think the feature is available if
the clang command returns 0.
Method '-[NSCoder decodeValueOfObjCType:at:]' is not only deprecated
but also a security hazard, hence a loud check.
Differential Revision: https://reviews.llvm.org/D71728
Recognize -mrecord-mcount from the command line and add a function attribute
"mrecord-mcount" when passed.
Only valid on SystemZ (when used with -mfentry).
Review: Ulrich Weigand
https://reviews.llvm.org/D71627
Summary: Per D62731, the behavior of clang with `-frounding-math` is no worse than when the rounding flag was completely ignored, so remove this unnecessary warning.
Reviewers: mibintc, chandlerc, echristo, rjmccall, kpn, erichkeane, rsmith, andrew.w.kaylor
Reviewed By: mibintc
Subscribers: merge_guards_bot, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71671
On Darwin, when used for generating a linked binary from a source file
(through an intermediate object file), the driver will invoke `cc1` to
generate a temporary object file. The temporary remark file will now be
emitted next to the object file, which will then be picked up by
`dsymutil` and emitted in the .dSYM bundle.
This is available for all formats except YAML since by default, YAML
doesn't need a section and the remark file will be lost.
Our build system does not handle randomly named files created during
the build well. We'd prefer to write compilation output directly
without creating a temporary file. Function parameters already
existed to control this behavior but were not exposed all the way out
to the command line.
Patch by Zachary Henkel!
Differential revision: https://reviews.llvm.org/D70615
Recognize -mpacked-stack from the command line and add a function attribute
"mpacked-stack" when passed. This is needed for building the Linux kernel.
If this option is passed for any other target than SystemZ, an error is
generated.
Review: Ulrich Weigand
https://reviews.llvm.org/D71441
This matches https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
> -momit-leaf-frame-pointer
> -mno-omit-leaf-frame-pointer
>
> Omit or keep the frame pointer in leaf functions. The former behavior is the default.
-mno-omit-leaf-frame-pointer is currently a no-op because
TargetOptions::DisableFramePointerElim is only considered for non-leaf
functions.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71167
Very few ELF platforms still use .ctors/.dtors now. Linux (glibc: 1999-07),
DragonFlyBSD, FreeBSD (2012-03) and Solaris have supported .init_array
for many years. Some architectures like AArch64/RISC-V default to
.init_array . GNU ld and gold can even convert .ctors to .init_array .
It makes more sense to flip the CC1 default, and only uses
-fno-use-init-array on platforms that don't support .init_array .
For example, OpenBSD did not support DT_INIT_ARRAY before Aug 2016
(86fa57a279)
I may miss some ELF platforms that still use .ctors, but their
maintainers can easily diagnose such problems.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71393
Serialized remarks contain debug locations for each remark, by storing a
file path, a line, and a column.
Also, remarks support being embedded in a .dSYM bundle using a separate
section in object files, that is found by `dsymutil` through the debug
map.
In order for tools to map addresses to source and display remarks in the
source, we need line tables, and in order for `dsymutil` to find the
object files containing the remark section, we need to keep the debug
map around.
Differential Revision: https://reviews.llvm.org/D71325
This is a follow up patch to use the OpenMP-IR-Builder, as discussed on
the mailing list ([1] and later) and at the US Dev Meeting'19.
[1] http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html
Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim
Subscribers: ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69922
The -fsplit-dwarf-inlining option does not conform to DWARF5 standard.
It creates children for Skeleton compilation unit. We need default behavior
to be DWARF5 compatible. Thus set default state for -fsplit-dwarf-inlining
into "false".
Differential Revision: https://reviews.llvm.org/D71304
This adds a check for the usage of -foptimization-record-file with
multiple -arch options. This is not permitted since it would require us
to rename the file requested by the user to avoid overwriting it for the
second cc1 invocation.
This patch allows for -o to be used with -c when compiling with clang
interface stubs enabled. This is because the second file will be an
intermediate ifs stubs file that is the text stub analog of the .o file.
Both get produces in this case, so two files.
Why are we doing this? Because we want to support the case where
interface stubs are used bu first invoking clang like so:
clang -c <other flags> -emit-interface-stubs foo.c -o foo.o
...
clang -emit-interface-stubs <.o files> -o libfoo.so
This should generate N .ifs files, and one .ifso file. Prior to this
patch, using -o with the -c invocation was not possible. Currently the
clang driver supports generating a a.out/.so file at the same time as a
merged ifs file / ifso file, but this is done by checking that the final
job is the IfsMerge job. When -c is used, the final job is a Compile job
so what this patch does is check to figure out of the job type is
TY_IFS_CPP.
Differential Revision: https://reviews.llvm.org/D70763
Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048
The original patch is modified to set the strictfp IR attribute
explicitly in CodeGen instead of as a side effect of IRBuilder.
In the 2nd attempt to reapply there was a windows lit test fail, the
tests were fixed to use wildcard matching.
Differential Revision: https://reviews.llvm.org/D62731
Skip distro detection when we're not running on Linux, or when the target triple is not Linux. This saves a few OS calls for each invocation of clang.exe.
Differential Revision: https://reviews.llvm.org/D70467
Summary:
Removed the ```-fforce-experimental-new-constant-interpreter flag```, leaving
only the ```-fexperimental-new-constant-interpreter``` one. The interpreter
now always emits an error on an unsupported feature.
Allowing the interpreter to bail out would require a mapping from APValue to
interpreter memory, which will not be necessary in the final version. It is
more sensible to always emit an error if the interpreter fails.
Reviewers: jfb, Bigcheese, rsmith, dexonsmith
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70071
GCC 8 implements -fmacro-prefix-map. Like -fdebug-prefix-map, it replaces a string prefix for the __FILE__ macro.
-ffile-prefix-map is the union of -fdebug-prefix-map and -fmacro-prefix-map
Reviewed By: rnk, Lekensteyn, maskray
Differential Revision: https://reviews.llvm.org/D49466
When the driver is targeting multiple architectures at once, for things
like Universal Mach-Os, we need to emit different remark files for each
cc1 invocation to avoid overwriting the files from a different
invocation.
For example:
$ clang -c -o foo.o -fsave-optimization-record -arch x86_64 -arch x86_64h
will create two remark files:
* foo-x86_64.opt.yaml
* foo-x86_64h.opt.yaml
and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread.
This reverts commits af57dbf12e and e6584b2b7b
When the driver is targeting multiple architectures at once, for things
like Universal Mach-Os, we need to emit different remark files for each
cc1 invocation to avoid overwriting the files from a different
invocation.
For example:
$ clang -c -o foo.o -fsave-optimization-record -arch x86_64 -arch x86_64h
will create two remark files:
* foo-x86_64.opt.yaml
* foo-x86_64h.opt.yaml
This started passing target-features on the linker line, not just for RISCV but
for all targets, leading to error messages in Chromium Android build:
'+soft-float-abi' is not a recognized feature for this target (ignoring feature)
'+soft-float-abi' is not a recognized feature for this target (ignoring feature)
See Phabricator review for details.
Reverting until this can be fixed properly.
> Summary:
> 1. enable LTO need to pass target feature and abi to LTO code generation
> RISCV backend need the target feature to decide which extension used in
> code generation.
> 2. move getTargetFeatures to CommonArgs.h and add ForLTOPlugin flag
> 3. add general tools::getTargetABI in CommonArgs.h because different target uses different
> way to get the target ABI.
>
> Patch by Kuan Hsu Chen (khchen)
>
> Reviewers: lenary, lewis-revill, asb, MaskRay
>
> Reviewed By: lenary
>
> Subscribers: hiraditya, dschuff, aheejin, fedor.sergeev, mehdi_amini, inglorion, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, cfe-commits
>
> Tags: #clang
>
> Differential Revision: https://reviews.llvm.org/D67409
This flag decouples specifying the DWARF version from enabling/disabling
DWARF in general (or the gN level - gmlt/limited/standalone, etc) while
still allowing existing -gdwarf-N flags to override this default.
Patch by Caroline Tice!
Differential Revision: https://reviews.llvm.org/D69822
Add options to control floating point behavior: trapping and
exception behavior, rounding, and control of optimizations that affect
floating point calculations. More details in UsersManual.rst.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D62731
Recognize -mnop-mcount from the command line and add a function attribute
"mnop-mcount"="true" when passed.
When this option is used, a nop is added instead of a call to fentry. This
is used when building the Linux Kernel.
If this option is passed for any other target than SystemZ, an error is
generated.
Review: Ulrich Weigand
https://reviews.llvm.org/D67763
The linker options (e.g. pragma detect_mismatch) are intended for host
compilation only, therefore disable it for device compilation.
Differential Revision: https://reviews.llvm.org/D57829
This reverts commit 004ed2b0d1.
Original commit hash 6d03890384
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.
https://reviews.llvm.org/D67723
This adds a flag to LLVM and clang to always generate a .debug_frame
section, even if other debug information is not being generated. In
situations where .eh_frame would normally be emitted, both .debug_frame
and .eh_frame will be used.
Differential Revision: https://reviews.llvm.org/D67216
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.
See https://bugs.llvm.org/show_bug.cgi?id=42344
Reviewers: rnk
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D67723
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.
Reviewers: thakis, rnk, theraven, pcc
Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65761
Summary:
- As variadic parameters have the lowest rank in overload resolution,
without real usage of `va_arg`, they are commonly used as the
catch-all fallbacks in SFINAE. As the front-end still reports errors
on calls to `va_arg`, the declaration of functions with variadic
arguments should be allowed in general.
Reviewers: jlebar, tra, yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69389
Summary:
A necessary step to let build system caching work for its output.
Reviewers: tejohnson, steven_wu
Reviewed by: tejohnson
Subscribers: mehdi_amini, inglorion, dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69406
This is clang part of the patch. It adds -flto-unit flag for thin LTO
builds on Mac and PS4
Differential revision: https://reviews.llvm.org/D68950
llvm-svn: 375224
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.
Original commit message:
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.
This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.
To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.
The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.
This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.
To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.
I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.
On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.
I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.
I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).
Differential revision: https://reviews.llvm.org/D63932
llvm-svn: 375094
The final list of OpenMP offload targets becomes known only at the link time and since offload registration code depends on the targets list it makes sense to delay offload registration code generation to the link time instead of adding it to the host part of every fat object. This patch moves offload registration code generation from clang to the offload wrapper tool.
This is the last part of the OpenMP linker script elimination patch https://reviews.llvm.org/D64943
Differential Revision: https://reviews.llvm.org/D68746
llvm-svn: 374937
Summary:
When files often get touched during builds, the mtime based validation
leads to different problems in implicit modules builds, even when the
content doesn't actually change:
- Modules only: module invalidation due to out of date files. Usually causing rebuild traffic.
- Modules + PCH: build failures because clang cannot rebuild a module if it comes from building a PCH.
- PCH: build failures because clang cannot rebuild a PCH in case one of the input headers has different mtime.
This patch proposes hashing the content of input files (headers and
module maps), which is performed during serialization time. When looking
at input files for validation, clang only computes the hash in case
there's a mtime mismatch.
I've tested a couple of different hash algorithms availble in LLVM in
face of building modules+pch for `#import <Cocoa/Cocoa.h>`:
- `hash_code`: performace diff within the noise, total module cache increased by 0.07%.
- `SHA1`: 5% slowdown. Haven't done real size measurements, but it'd be BLOCK_ID+20 bytes per input file, instead of BLOCK_ID+8 bytes from `hash_code`.
- `MD5`: 3% slowdown. Like above, but BLOCK_ID+16 bytes per input file.
Given the numbers above, the patch uses `hash_code`. The patch also
improves invalidation error msgs to point out which type of problem the
user is facing: "mtime", "size" or "content".
rdar://problem/29320105
Reviewers: dexonsmith, arphaman, rsmith, aprantl
Subscribers: jkorous, cfe-commits, ributzka
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67249
> llvm-svn: 374841
llvm-svn: 374895
Summary:
When files often get touched during builds, the mtime based validation
leads to different problems in implicit modules builds, even when the
content doesn't actually change:
- Modules only: module invalidation due to out of date files. Usually causing rebuild traffic.
- Modules + PCH: build failures because clang cannot rebuild a module if it comes from building a PCH.
- PCH: build failures because clang cannot rebuild a PCH in case one of the input headers has different mtime.
This patch proposes hashing the content of input files (headers and
module maps), which is performed during serialization time. When looking
at input files for validation, clang only computes the hash in case
there's a mtime mismatch.
I've tested a couple of different hash algorithms availble in LLVM in
face of building modules+pch for `#import <Cocoa/Cocoa.h>`:
- `hash_code`: performace diff within the noise, total module cache increased by 0.07%.
- `SHA1`: 5% slowdown. Haven't done real size measurements, but it'd be BLOCK_ID+20 bytes per input file, instead of BLOCK_ID+8 bytes from `hash_code`.
- `MD5`: 3% slowdown. Like above, but BLOCK_ID+16 bytes per input file.
Given the numbers above, the patch uses `hash_code`. The patch also
improves invalidation error msgs to point out which type of problem the
user is facing: "mtime", "size" or "content".
rdar://problem/29320105
Reviewers: dexonsmith, arphaman, rsmith, aprantl
Subscribers: jkorous, cfe-commits, ributzka
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67249
llvm-svn: 374841
Summary:
Currently clang does not support -Wa,-W, which suppresses warning
messages in GNU assembler. Add this option for gcc compatibility.
https://bugs.llvm.org/show_bug.cgi?id=43651. Reland with differential
information.
Reviewers: bcain
Reviewed By: bcain
Subscribers: george.burgess.iv, gbiv, llozano, manojgupta, nickdesaulniers, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68884
llvm-svn: 374834
Currently clang does not support -Wa,-W, which suppresses warning
messages in GNU assembler. Add this option for gcc compatibility.
https://bugs.llvm.org/show_bug.cgi?id=43651
llvm-svn: 374822
The goal is to have 100% fidelity in clang-scan-deps behavior when
--analyze is present in compilation command.
At the same time I don't want to break clang-tidy which expects
__static_analyzer__ macro defined as built-in.
I introduce new cc1 options (-setup-static-analyzer) that controls
the macro definition and is conditionally set in driver.
Differential Revision: https://reviews.llvm.org/D68093
llvm-svn: 374815
Summary:
1. enable LTO need to pass target feature and abi to LTO code generation
RISCV backend need the target feature to decide which extension used in
code generation.
2. move getTargetFeatures to CommonArgs.h and add ForLTOPlugin flag
3. add general tools::getTargetABI in CommonArgs.h because different target uses different
way to get the target ABI.
Patch by Kuan Hsu Chen (khchen)
Reviewers: lenary, lewis-revill, asb, MaskRay
Reviewed By: lenary
Subscribers: hiraditya, dschuff, aheejin, fedor.sergeev, mehdi_amini, inglorion, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67409
llvm-svn: 374774
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.
This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.
To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.
The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.
This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.
To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.
I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.
On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.
I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.
I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).
Differential revision: https://reviews.llvm.org/D63932
llvm-svn: 374539
I noticed that compiling on Windows with -fno-ms-compatibility had the
side effect of defining __GNUC__, along with __GNUG__, __GXX_RTTI__, and
a number of other macros for GCC compatibility. This is undesirable and
causes Chromium to do things like mix __attribute__ and __declspec,
which doesn't work. We should have a positive language option to enable
GCC compatibility features so that we can experiment with
-fno-ms-compatibility on Windows. This change adds -fgnuc-version= to be
that option.
My issue aside, users have, for a long time, reported that __GNUC__
doesn't match their expectations in one way or another. We have
encouraged users to migrate code away from this macro, but new code
continues to be written assuming a GCC-only environment. There's really
nothing we can do to stop that. By adding this flag, we can allow them
to choose their own adventure with __GNUC__.
This overlaps a bit with the "GNUMode" language option from -std=gnu*.
The gnu language mode tends to enable non-conforming behaviors that we'd
rather not enable by default, but the we want to set things like
__GXX_RTTI__ by default, so I've kept these separate.
Helps address PR42817
Reviewed By: hans, nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D68055
llvm-svn: 374449
This patch removes the remaining part of the OpenMP offload linker scripts which was used for inserting device binaries into the output linked binary. Device binaries are now inserted into the host binary with a help of the wrapper bit-code file which contains device binaries as data. Wrapper bit-code file is dynamically created by the clang driver with a help of new tool clang-offload-wrapper which takes device binaries as input and produces bit-code file with required contents. Wrapper bit-code is then compiled to an object and resulting object is appended to the host linking by the clang driver.
This is the second part of the patch for eliminating OpenMP linker script (please see https://reviews.llvm.org/D64943).
Differential Revision: https://reviews.llvm.org/D68166
llvm-svn: 374219
Second Landing Attempt:
This patch enables end to end support for generating ELF interface stubs
directly from clang. Now the following:
clang -emit-interface-stubs -o libfoo.so a.cpp b.cpp c.cpp
will product an ELF binary with visible symbols populated. Visibility attributes
and -fvisibility can be used to control what gets populated.
* Adding ToolChain support for clang Driver IFS Merge Phase
* Implementing a default InterfaceStubs Merge clang Tool, used by ToolChain
* Adds support for the clang Driver to involve llvm-ifs on ifs files.
* Adds -emit-merged-ifs flag, to tell llvm-ifs to emit a merged ifs text file
instead of the final object format (normally ELF)
Differential Revision: https://reviews.llvm.org/D63978
llvm-svn: 374061
This patch enables end to end support for generating ELF interface stubs
directly from clang. Now the following:
clang -emit-interface-stubs -o libfoo.so a.cpp b.cpp c.cpp
will product an ELF binary with visible symbols populated. Visibility attributes
and -fvisibility can be used to control what gets populated.
* Adding ToolChain support for clang Driver IFS Merge Phase
* Implementing a default InterfaceStubs Merge clang Tool, used by ToolChain
* Adds support for the clang Driver to involve llvm-ifs on ifs files.
* Adds -emit-merged-ifs flag, to tell llvm-ifs to emit a merged ifs text file
instead of the final object format (normally ELF)
Differential Revision: https://reviews.llvm.org/D63978
llvm-svn: 373538
Summary:
To trigger the index-only Whole Program Devirt support added to LLVM, we
need to be able to specify -fno-split-lto-unit in conjunction with
-fwhole-program-vtables. Keep the default for -fwhole-program-vtables as
-fsplit-lto-unit, but don't error on that option combination.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D68029
llvm-svn: 373370
The only functional change here is that -coverage-notes-file is not
passed to -cc1 in some situations.
This code appears to be trying to put the gcno and gcda output next to
the final object file, but it's doing that in a really convoluted way
that needs to be re-examined. It looks for -c or -S in the original
command, and then looks at the -o argument if present in order to handle
the -fno-integrated-as case. However, this doesn't work if this is a
link command with multiple inputs. I looked into fixing this, but the
check-profile test suite has a lot of dependencies on this behavior, so
I left it all alone.
llvm-svn: 373004
We need "xgot" flag in the MipsAsmParser to implement correct expansion
of some pseudo instructions in case of using 32-bit GOT (XGOT).
MipsAsmParser does not have reference to MipsSubtarget but has a
reference to "feature bit set".
llvm-svn: 372220
- When using -o, the provided filename is using for constructing the depfile
name (when -MMD is passed).
- The logic looks for the rightmost '.' character and replaces what comes after
with 'd'.
- This works incorrectly when the filename has no extension and the directories
have '.' in them (e.g. out.dir/test)
- This replaces the funciton to just llvm::sys::path functionality
Differential Revision: https://reviews.llvm.org/D67542
llvm-svn: 371853
Summary:
This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.
Reviewers: Bigcheese, jfb, rsmith
Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64146
llvm-svn: 371834
levels:
-- none: no lax vector conversions [new GCC default]
-- integer: only conversions between integer vectors [old GCC default]
-- all: all conversions between same-size vectors [Clang default]
For now, Clang still defaults to "all" mode, but per my proposal on
cfe-dev (2019-04-10) the default will be changed to "integer" as soon as
that doesn't break lots of testcases. (Eventually I'd like to change the
default to "none" to match GCC and general sanity.)
Following GCC's behavior, the driver flag -flax-vector-conversions is
translated to -flax-vector-conversions=integer.
This reinstates r371805, reverted in r371813, with an additional fix for
lldb.
llvm-svn: 371817
levels:
-- none: no lax vector conversions [new GCC default]
-- integer: only conversions between integer vectors [old GCC default]
-- all: all conversions between same-size vectors [Clang default]
For now, Clang still defaults to "all" mode, but per my proposal on
cfe-dev (2019-04-10) the default will be changed to "integer" as soon as
that doesn't break lots of testcases. (Eventually I'd like to change the
default to "none" to match GCC and general sanity.)
Following GCC's behavior, the driver flag -flax-vector-conversions is
translated to -flax-vector-conversions=integer.
llvm-svn: 371805
Summary:
This adds `-fwasm-exceptions` (in similar fashion with
`-fdwarf-exceptions` or `-fsjlj-exceptions`) that turns on everything
with wasm exception handling from the frontend to the backend.
We currently have `-mexception-handling` in clang frontend, but this is
only about the architecture capability and does not turn on other
necessary options such as the exception model in the backend. (This can
be turned on with `llc -exception-model=wasm`, but llc is not invoked
separately as a command line tool, so this option has to be transferred
from clang.)
Turning on `-fwasm-exceptions` in clang also turns on
`-mexception-handling` if not specified, and will error out if
`-mno-exception-handling` is specified.
Reviewers: dschuff, tlively, sbc100
Subscribers: aprantl, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67208
llvm-svn: 371708
LLDB reads the various .apple* accelerator tables (and in the near
future: the DWARF 5 accelerator tables) which should make
.gnu_pubnames redundant. This changes the Clang driver to no longer
pass -ggnu-pubnames when tuning for LLDB.
Thanks to David Blaikie for pointing this out!
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/thread.html#646062
rdar://problem/50142073
Differential Revision: https://reviews.llvm.org/D67373
llvm-svn: 371530
This made clang unable to open files using relative paths on network shares on
Windows (PR43204). On the bug it was pointed out that createPhysicalFileSystem()
is not terribly mature, and using it is risky. Reverting for now until there's
a clear way forward.
> Currently the `-working-directory` option does not actually impact the working
> directory for all of the clang driver, it only impacts how files are looked up
> to make sure they exist. This means that that clang passes the wrong paths
> to -fdebug-compilation-dir and -coverage-notes-file.
>
> This patch fixes that by changing all the places in the driver where we convert
> to absolute paths to use the VFS, and then calling setCurrentWorkingDirectory on
> the VFS. This also changes the default VFS for `Driver` to use a virtualized
> working directory, instead of changing the process's working directory.
>
> Differential Revision: https://reviews.llvm.org/D62271
This also revertes the part of r369938 which checked that -working-directory works.
llvm-svn: 371027
Breaks BUILD_SHARED_LIBS build, introduces cycles in library dependency
graphs. (clangInterp depends on clangAST which depends on clangInterp)
This reverts r370839, which is an yet another recommit of D64146.
llvm-svn: 370874
Summary:
This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.
Reviewers: Bigcheese, jfb, rsmith
Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64146
llvm-svn: 370839
Summary:
This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.
Reviewers: Bigcheese, jfb, rsmith
Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64146
llvm-svn: 370636
Summary:
This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.
Reviewers: Bigcheese, jfb, rsmith
Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64146
llvm-svn: 370584
Summary:
This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.
Reviewers: Bigcheese, jfb, rsmith
Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64146
llvm-svn: 370531
Summary:
This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.
Reviewers: Bigcheese, jfb, rsmith
Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64146
llvm-svn: 370476
a fragment of a compilation database for each compilation
This patch adds a new option called -gen-cdb-fragment-path to the driver,
which can be used to specify a directory path to which clang can emit a fragment
of a CDB for each compilation it needs to invoke.
This option emits the same CDB contents as -MJ, and will be ignored if -MJ is specified.
Differential Revision: https://reviews.llvm.org/D66555
llvm-svn: 369938
-dA was in the d_group, which is a preprocessor state dumping group.
However -dA is a debug flag to cause a verbose asm. It was already
implemented to do the same thing as -fverbose-asm, so make it just be an
alias.
llvm-svn: 369926
I've been working on a new tool, llvm-ifs, for merging interface stub files
generated by clang and I've iterated on my derivative format of TBE to a newer
format. llvm-ifs will only support the new format, so I am going to drop the
older experimental interface stubs formats in this commit to make things
simpler.
Differential Revision: https://reviews.llvm.org/D66573
llvm-svn: 369719
After posting llvm-ifs on phabricator, I made some progress in hardening up how
I think the format for Interface Stubs should look. There are a number of
things I think the TBE format was missing (no endianness, no info about the
Object Format because it assumes ELF), so I have added those and broken off
from being as similar to the TBE schema. In a subsequent commit I can drop the
other formats.
An example of how The format will look is as follows:
--- !experimental-ifs-v1
IfsVersion: 1.0
Triple: x86_64-unknown-linux-gnu
ObjectFileFormat: ELF
Symbols:
_Z9nothiddenv: { Type: Func }
_Z10cmdVisiblev: { Type: Func }
...
The format is still marked experimental.
Differential Revision: https://reviews.llvm.org/D66446
llvm-svn: 369715
This broke compiling some ASan tests with never versions of MSVC/the Win
SDK, see https://crbug.com/996675
> MSVC 2017 update 3 (_MSC_VER 1911) enables /Zc:twoPhase by default, and
> so should clang-cl:
> https://docs.microsoft.com/en-us/cpp/build/reference/zc-twophase
>
> clang-cl takes the MSVC version it emulates from the -fmsc-version flag,
> or if that's not passed it tries to check what the installed version of
> MSVC is and uses that, and failing that it uses a default version that's
> currently 1911. So this changes the default if no -fmsc-version flag is
> passed and no installed MSVC is detected. (It also changes the default
> if -fmsc-version is passed or MSVC is detected, and either indicates
> _MSC_VER >= 1911.)
>
> As mentioned in the MSDN article, the Windows SDK header files in
> version 10.0.15063.0 (Creators Update or Redstone 2) and earlier
> versions do not work correctly with /Zc:twoPhase. If you need to use
> these old SDKs with a new clang-cl, explicitly pass /Zc:twoPhase- to get
> the old behavior.
>
> Fixes PR43032.
>
> Differential Revision: https://reviews.llvm.org/D66394
llvm-svn: 369647
MSVC 2017 update 3 (_MSC_VER 1911) enables /Zc:twoPhase by default, and
so should clang-cl:
https://docs.microsoft.com/en-us/cpp/build/reference/zc-twophase
clang-cl takes the MSVC version it emulates from the -fmsc-version flag,
or if that's not passed it tries to check what the installed version of
MSVC is and uses that, and failing that it uses a default version that's
currently 1911. So this changes the default if no -fmsc-version flag is
passed and no installed MSVC is detected. (It also changes the default
if -fmsc-version is passed or MSVC is detected, and either indicates
_MSC_VER >= 1911.)
As mentioned in the MSDN article, the Windows SDK header files in
version 10.0.15063.0 (Creators Update or Redstone 2) and earlier
versions do not work correctly with /Zc:twoPhase. If you need to use
these old SDKs with a new clang-cl, explicitly pass /Zc:twoPhase- to get
the old behavior.
Fixes PR43032.
Differential Revision: https://reviews.llvm.org/D66394
llvm-svn: 369402
Add an option group for all of the -mlong-double-* options and make
-mlong-double-80 restore the default long double behavior for X86. The
motivations are that GNU accepts the -mlong-double-80 option and that complex
Makefiles often need a way of undoing earlier options. Prior to this commit, if
one chooses 64-bit or 128-bit long double for X86, there is no way to undo that
choice and restore the 80-bit behavior.
Differential Revision: https://reviews.llvm.org/D66055
llvm-svn: 369183
Add an option group for all of the -mlong-double-* options and make
-mlong-double-80 restore the default long double behavior for X86. The
motivations are that GNU accepts the -mlong-double-80 option and that complex
Makefiles often need a way of undoing earlier options. Prior to this commit, if
one chooses 64-bit or 128-bit long double for X86, there is no way to undo that
choice and restore the 80-bit behavior.
Differential Revision: https://reviews.llvm.org/D66055
llvm-svn: 369152
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
Differential revision: https://reviews.llvm.org/D66259
llvm-svn: 368942
This fixes a regression from r365860: As that commit message
states, there are 3 valid states targeted by the combination of
-f(no-)omit-frame-pointer and -m(no-)omit-leaf-frame-pointer.
After r365860 it's impossible to get from state 10 (omit just
leaf frame pointers) to state 11 (omit all frame pointers)
in a single command line without getting a warning.
This change restores that functionality.
Fixes PR42966.
Differential Revision: https://reviews.llvm.org/D66142
llvm-svn: 368728
There are times when we wish to explicitly control the C++ standard
library search paths used by the driver. For example, when we're
building against the Android NDK, we might want to use the NDK's C++
headers (which have a custom inline namespace) even if we have C++
headers installed next to the driver. We might also be building against
a non-standard directory layout and wanting to specify the C++ standard
library include directories explicitly.
We could accomplish this by passing -nostdinc++ and adding an explicit
-isystem for our custom search directories. However, users of our
toolchain may themselves want to use -nostdinc++ and a custom C++ search
path (libc++'s build does this, for example), and our added -isystem
won't respect the -nostdinc++, leading to multiple C++ header
directories on the search path, which causes build failures.
Add a new driver option -stdlib++-isystem to support this use case.
Passing this option suppresses adding the default C++ library include
paths in the driver, and it also respects -nostdinc++ to allow users to
still override the C++ library paths themselves.
It's a bit unfortunate that we end up with both -stdlib++-isystem and
-cxx-isystem, but their semantics differ significantly. -cxx-isystem is
unaffected by -nostdinc++ and is added to the end of the search path
(which is not appropriate for C++ standard library headers, since they
often #include_next into other system headers), while -stdlib++-isystem
respects -nostdinc++, is added to the beginning of the search path, and
suppresses the default C++ library include paths.
Differential Revision: https://reviews.llvm.org/D64089
llvm-svn: 367982
This morally relands r365703 (and r365714), originally reviewed at
https://reviews.llvm.org/D64527, but with a different implementation.
Relanding the same approach with a fix for the revert reason got a bit
involved (see https://reviews.llvm.org/D65108) so use a simpler approach
with a more localized implementation (that in return duplicates code
a bit more).
This approach also doesn't validate flags for the integrated assembler
if the assembler step doesn't run.
Fixes PR42066.
Differential Revision: https://reviews.llvm.org/D65233
llvm-svn: 367165
Summary:
Move `-ftime-trace-granularity` option to frontend options. Without patch
this option is showed up in the help for any tool that links libSupport.
Reviewers: sammccall
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65202
llvm-svn: 366911
with '-mframe-pointer'
After D56351 and D64294, frame pointer handling is migrated to tri-state
(all, non-leaf, none) in clang driver and on the function attribute.
This patch makes the frame pointer handling cc1 option tri-state.
Reviewers: chandlerc, rnk, t.p.northover, MaskRay
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D56353
llvm-svn: 366645
Starting with Solaris 11.4 (which is now the required minimal version), Solaris does
support __cxa_atexit. This patch reflects that.
One might consider removing the affected tests altogether instead of inverting them,
as is done on other targets.
Besides, this lets two ASan tests PASS:
AddressSanitizer-i386-sunos :: TestCases/init-order-atexit.cc
AddressSanitizer-i386-sunos-dynamic :: TestCases/init-order-atexit.cc
Tested on x86_64-pc-solaris2.11 and sparcv9-sun-solaris2.11.
Differential Revision: https://reviews.llvm.org/D64491
llvm-svn: 366305
Summary:
Previously, passing -fthinlto-index= to clang required that bitcode
files be explicitly marked by -x ir. This change makes us detect files
with object file extensions as bitcode files when -fthinlto-index= is
present, so that explicitly marking them is no longer necessary.
Explicitly specifying -x ir is still accepted and continues to be part
of the test case to ensure we continue to support it.
Reviewers: tejohnson, rnk, pcc
Subscribers: mehdi_amini, steven_wu, dexonsmith, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64610
llvm-svn: 366127
gcc PowerPC supports 3 representations of long double:
* -mlong-double-64
long double has the same representation of double but is mangled as `e`.
In clang, this is the default on AIX, FreeBSD and Linux musl.
* -mlong-double-128
2 possible 128-bit floating point representations:
+ -mabi=ibmlongdouble
IBM extended double format. Mangled as `g`
In clang, this is the default on Linux glibc.
+ -mabi=ieeelongdouble
IEEE 754 quadruple-precision format. Mangled as `u9__ieee128` (`U10__float128` before gcc 8.2)
This is currently unavailable.
This patch adds -mabi=ibmlongdouble and -mabi=ieeelongdouble, and thus
makes the IEEE 754 quadruple-precision long double available for
languages supported by clang.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D64283
llvm-svn: 366044
This patch makes the driver option -mlong-double-128 available for X86
and PowerPC. The CC1 option -mlong-double-128 is available on all targets
for users to test on unsupported targets.
On PowerPC, -mlong-double-128 uses the IBM extended double format
because we don't support -mabi=ieeelongdouble yet (D64283).
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D64277
llvm-svn: 365866
Use a tri-state enum to represent shouldUseFramePointer() and
shouldUseLeafFramePointer().
This simplifies the logic and fixes PR9825:
-fno-omit-frame-pointer doesn't imply -mno-omit-leaf-frame-pointer.
and PR24003:
/Oy- /O2 should not omit leaf frame pointer: this matches MSVC x86-32.
(/Oy- is a no-op on MSVC x86-64.)
and:
when CC1 option -mdisable-fp-elim if absent, -momit-leaf-frame-pointer
can also be omitted.
The new behavior matches GCC:
-fomit-frame-pointer wins over -mno-omit-leaf-frame-pointer
-fno-omit-frame-pointer loses out to -momit-leaf-frame-pointer
The behavior makes lots of sense. We have 4 states:
- 00) leaf retained, non-leaf retained
- 01) leaf retained, non-leaf omitted (this is invalid)
- 10) leaf omitted, non-leaf retained (what -momit-leaf-frame-pointer was designed for)
- 11) leaf omitted, non-leaf omitted
"omit" options taking precedence over "no-omit" options is the only way
to make 3 valid states representable with -f(no-)?omit-frame-pointer and
-m(no-)?omit-leaf-pointer.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D64294
llvm-svn: 365860
clang currently warns when passing flags for the assembler (e.g.
-Wa,-mbig-obj) to an invocation that doesn't run the assembler (e.g.
-E).
At first sight, that makes sense -- the flag really is unused. But many
other flags don't have an effect if no assembler runs (e.g.
-fno-integrated-as, -ffunction-sections, and many others), and those
currently don't warn. So this seems more like a side effect of how
CollectArgsForIntegratedAssembler() is implemented than like an
intentional feature.
Since it's a bit inconvenient when debugging builds and adding -E,
always call CollectArgsForIntegratedAssembler() to make sure assembler
args always get claimed. Currently, this affects only these flags:
-mincremental-linker-compatible, -mimplicit-it= (on ARM), -Wa, -Xassembler
It does have the side effect that assembler options now need to be valid
even if -E is passed. Previously, `-Wa,-mbig-obj` would error for
non-coff output only if the assembler ran, now it always errors. This
too makes assembler flags more consistent with all the other flags and
seems like a progression.
Fixes PR42066.
Differential Revision: https://reviews.llvm.org/D64527
llvm-svn: 365703
-mlong-double-64 is supported on some ports of gcc (i386, x86_64, and ppc{32,64}).
On many other targets, there will be an error:
error: unrecognized command line option '-mlong-double-64'
This patch makes the driver option -mlong-double-64 available for x86
and ppc. The CC1 option -mlong-double-64 is available on all targets for
users to test on unsupported targets.
LongDoubleSize is added as a VALUE_LANGOPT so that the option can be
shared with -mlong-double-128 when we support it in clang.
Also, make powerpc*-linux-musl default to use 64-bit long double. It is
currently the only supported ABI on musl and is also how people
configure powerpc*-linux-musl-gcc.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D64067
llvm-svn: 365412
Summary:
The changes in D59673 made the choice redundant, since we can achieve
single-file split DWARF just by not setting an output file name.
Like llc we can also derive whether to enable Split DWARF from whether
-split-dwarf-file is set, so we don't need the flag at all anymore.
The test CodeGen/split-debug-filename.c distinguished between having set
or not set -enable-split-dwarf with -split-dwarf-file, but we can
probably just always emit the metadata into the IR.
The flag -split-dwarf wasn't used at all anymore.
Reviewers: dblaikie, echristo
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D63167
llvm-svn: 364479
This change reverts r363649; effectively re-landing r363626. At this point
clang::Index::CodegenNameGeneratorImpl has been refactored into
clang::AST::ASTNameGenerator. This makes it so that the previous circular link
dependency no longer exists, fixing the previous share lib
(-DBUILD_SHARED_LIBS=ON) build issue which was the reason for r363649.
Clang interface stubs (previously referred to as clang-ifsos) is a new frontend
action in clang that allows the generation of stub files that contain mangled
name info that can be used to produce a stub library. These stub libraries can
be useful for breaking up build dependencies and controlling access to a
library's internal symbols. Generation of these stubs can be invoked by:
clang -fvisibility=<visibility> -emit-interface-stubs \
-interface-stub-version=<interface format>
Notice that -fvisibility (along with use of visibility attributes) can be used
to control what symbols get generated. Currently the interface format is
experimental but there are a wide range of possibilities here.
Currently clang-ifs produces .ifs files that can be thought of as analogous to
object (.o) files, but just for the mangled symbol info. In a subsequent patch
I intend to add support for merging the .ifs files into one .ifs/.ifso file
that can be the input to something like llvm-elfabi to produce something like a
.so file or .dll (but without any of the code, just symbols).
Differential Revision: https://reviews.llvm.org/D60974
llvm-svn: 363948
This reverts commit rC363626.
clangIndex depends on clangFrontend. r363626 adds a dependency from
clangFrontend to clangIndex, which creates a circular dependency.
This is disallowed by -DBUILD_SHARED_LIBS=on builds:
CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle):
"clangFrontend" of type SHARED_LIBRARY
depends on "clangIndex" (weak)
"clangIndex" of type SHARED_LIBRARY
depends on "clangFrontend" (weak)
At least one of these targets is not a STATIC_LIBRARY. Cyclic dependencies are allowed only among static libraries.
Note, the dependency on clangIndex cannot be removed because
libclangFrontend.so is linked with -Wl,-z,defs: a shared object must
have its full direct dependencies specified on the linker command line.
In -DBUILD_SHARED_LIBS=off builds, this appears to work when linking
`bin/clang-9`. However, it can cause trouble to downstream clang library
users. The llvm build system links libraries this way:
clang main_program_object_file ... lib/libclangIndex.a ... lib/libclangFrontend.a -o exe
libclangIndex.a etc are not wrapped in --start-group.
If the downstream application depends on libclangFrontend.a but not any
other clang libraries that depend on libclangIndex.a, this can cause undefined
reference errors when the linker is ld.bfd or gold.
The proper fix is to not include clangIndex files in clangFrontend.
llvm-svn: 363649
Clang interface stubs (previously referred to as clang-ifsos) is a new frontend
action in clang that allows the generation of stub files that contain mangled
name info that can be used to produce a stub library. These stub libraries can
be useful for breaking up build dependencies and controlling access to a
library's internal symbols. Generation of these stubs can be invoked by:
clang -fvisibility=<visibility> -emit-interface-stubs \
-interface-stub-version=<interface format>
Notice that -fvisibility (along with use of visibility attributes) can be used
to control what symbols get generated. Currently the interface format is
experimental but there are a wide range of possibilities here.
Differential Revision: https://reviews.llvm.org/D60974
llvm-svn: 363626
Use -fsave-optimization-record=<format> to specify a different format
than the default, which is YAML.
For now, only YAML is supported.
llvm-svn: 363573
The flag is useful when wanting to create .o files that are independent
from the absolute path to the build directory. -fdebug-prefix-map= can
be used to the same effect, but it requires putting the absolute path
to the build directory on the build command line, so it still requires
the build command line to be dependent on the absolute path of the build
directory. With this flag, "-fdebug-compilation-dir ." makes it so that
both debug info and the compile command itself are independent of the
absolute path of the build directory, which is good for build
determinism (in the sense that the build is independent of which
directory it happens in) and for caching compile results.
(The tradeoff is that the debugger needs explicit configuration to know
the build directory. See also http://dwarfstd.org/ShowIssue.php?issue=171130.2)
Differential Revision: https://reviews.llvm.org/D63387
llvm-svn: 363548
Summary:
With Split DWARF the resulting object file (then called skeleton CU)
contains the file name of another ("DWO") file with the debug info.
This can be a problem for remote compilation, as it will contain the
name of the file on the compilation server, not on the client.
To use Split DWARF with remote compilation, one needs to either
* make sure only relative paths are used, and mirror the build directory
structure of the client on the server,
* inject the desired file name on the client directly.
Since llc already supports the latter solution, we're just copying that
over. We allow setting the actual output filename separately from the
value of the DW_AT_[GNU_]dwo_name attribute in the skeleton CU.
Fixes PR40276.
Reviewers: dblaikie, echristo, tejohnson
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D59673
llvm-svn: 363496
Summary:
This is the first in a series of changes trying to align clang -cc1
flags for Split DWARF with those of llc. The unfortunate side effect of
having -split-dwarf-output for single file Split DWARF will disappear
again in a subsequent change.
The change is the result of a discussion in D59673.
Reviewers: dblaikie, echristo
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D63130
llvm-svn: 363494
Modern ELF platforms use -fuse-init-array to emit .init_array instead of
.ctors . ld.bfd and gold --ctors-in-init-array merge .init_array and
.ctors into .init_array but lld doesn't do that.
If crtbegin*.o crtend*.o don't provide .ctors/.dtors, such .ctors in
user object files can lead to crash (see PR42002. The first and the last
elements in .ctors/.dtors are ignored - they are traditionally provided
by crtbegin*.o crtend*.o).
Call addClangTargetOptions() to ensure -fuse-init-array is rendered on
modern ELF platforms. On Hexagon, this renders -target-feature
+reserved-r19 for -ffixed-r19.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D62509
llvm-svn: 362052
Currently the `-working-directory` option does not actually impact the working
directory for all of the clang driver, it only impacts how files are looked up
to make sure they exist. This means that that clang passes the wrong paths
to -fdebug-compilation-dir and -coverage-notes-file.
This patch fixes that by changing all the places in the driver where we convert
to absolute paths to use the VFS, and then calling setCurrentWorkingDirectory on
the VFS. This also changes the default VFS for `Driver` to use a virtualized
working directory, instead of changing the process's working directory.
Differential Revision: https://reviews.llvm.org/D62271
llvm-svn: 361885
New -cc1 arguments, such as -faddrsig, have started appearing after the
input name. I personally find it convenient for the input to be the last
argument to the compile command line, since I often need to edit it when
running crash reproduction scripts.
Differential Revision: https://reviews.llvm.org/D62270
llvm-svn: 361530
Defines macro ARM_FEATURE_CMSE to 1 for v8-M targets and introduces
-mcmse option which for v8-M targets sets ARM_FEATURE_CMSE to 3.
A diagnostic is produced when the option is given on architectures
without support for Security Extensions.
Reviewed By: dmgreen, snidertm
Differential Revision: https://reviews.llvm.org/D59879
llvm-svn: 361261
This is needed so lld-link can find clang_rt.profile when self hosting
on Windows with PGO. Using clang-cl as a linker knows to add the library
but self hosting, using -DCMAKE_LINKER=<...>/lld-link.exe doesn't.
Differential Revision: https://reviews.llvm.org/D61742
llvm-svn: 360674
Summary: This patches fixes an issue in which the __clang_cuda_cmath.h header is being included even when cmath or math.h headers are not included.
Reviewers: jdoerfert, ABataev, hfinkel, caomhin, tra
Reviewed By: tra
Subscribers: tra, mgorny, guansong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61765
llvm-svn: 360626
Summary:
In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang.
We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.
Authors:
@gtbercea
@jdoerfert
Reviewers: hfinkel, caomhin, ABataev, tra
Reviewed By: hfinkel, ABataev, tra
Subscribers: JDevlieghere, mgorny, guansong, cfe-commits, jdoerfert
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61399
llvm-svn: 360265
This commit appears to be breaking stage-2 builds on GreenDragon. The
OpenMP wrappers for cmath and math.h are copied into the root of the
resource directory and cause a cyclic dependency in module 'Darwin':
Darwin -> std -> Darwin. This blows up when CMake is testing for modules
support and breaks all stage 2 module builds, including the ThinLTO bot
and all LLDB bots.
CMake Error at cmake/modules/HandleLLVMOptions.cmake:497 (message):
LLVM_ENABLE_MODULES is not supported by this compiler
llvm-svn: 360192
Summary:
In this patch we propose a temporary solution to resolving math functions for the NVPTX toolchain, temporary until OpenMP variant is supported by Clang.
We intercept the inclusion of math.h and cmath headers and if we are in the OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.
Authors:
@gtbercea
@jdoerfert
Reviewers: hfinkel, caomhin, ABataev, tra
Reviewed By: hfinkel, ABataev, tra
Subscribers: mgorny, guansong, cfe-commits, jdoerfert
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61399
llvm-svn: 360063
Summary:
When -gsplit-dwarf is used together with other -g options, in most cases
the computed debug info level is decided by the last -g option, with one
special case (see below). This patch drops that special case and thus
makes it easy to reason about:
// If a lower debug level -g comes after -gsplit-dwarf, in some cases
// -gsplit-dwarf is cancelled.
-gsplit-dwarf -g0 => 0
-gsplit-dwarf -gline-directives-only => DebugDirectivesOnly
-gsplit-dwarf -gmlt -fsplit-dwarf-inlining => 1
-gsplit-dwarf -gmlt -fno-split-dwarf-inlining => 1 + split
// If -gsplit-dwarf comes after -g options, with this patch, the net
// effect is 2 + split for all combinations
-g0 -gsplit-dwarf => 2 + split
-gline-directives-only -gsplit-dwarf => 2 + split
-gmlt -gsplit-dwarf -fsplit-dwarf-inlining => 2 + split
-gmlt -gsplit-dwarf -fno-split-dwarf-inlining => 1 + split (before) 2 + split (after)
The last case has been changed. In general, if the user intends to lower
debug info level, place that -g option after -gsplit-dwarf.
Some context:
In gcc, the last of -gsplit-dwarf -g0 -g1 -g2 -g3 -ggdb[0-3] -gdwarf-*
... decides the debug info level (-gsplit-dwarf -gdwarf-* have level 2).
It is a bit unfortunate that -gsplit-dwarf -gdwarf-* ... participate in
the level computation but that is the status quo.
Reviewers: dblaikie, echristo, probinson
Reviewed By: dblaikie, probinson
Subscribers: probinson, aprantl, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59923
llvm-svn: 358544
LLDB can't currently handle Clang's default (limit/no-standalone) DWARF,
so platforms that default to LLDB (Darwin) or anyone else manually
requesting LLDB tuning, should also get standalone DWARF.
That doesn't mean a user can't explicitly enable (because they have
other reasons to prefer standalone DWARF (such as that they're only
building half their application with debug info enabled, and half
without - or because they're tuning for GDB, but want to be able to use
it under LLDB too (this is the default on FreeBSD))) or disable (testing
LLDB fixes/improvements that handle no-standalone mode, building C code,
perhaps, which wouldn't have the LLDB<>no-standalone conflict, etc) the
feature regardless of the tuning.
llvm-svn: 358464
This change adds hierarchical "time trace" profiling blocks that can be visualized in Chrome, in a "flame chart" style. Each profiling block can have a "detail" string that for example indicates the file being processed, template name being instantiated, function being optimized etc.
This is taken from GitHub PR: https://github.com/aras-p/llvm-project-20170507/pull/2
Patch by Aras Pranckevičius.
Differential Revision: https://reviews.llvm.org/D58675
llvm-svn: 357340
In gcc, -gsplit-dwarf is handled in gcc/gcc.c as a spec
(ASM_FINAL_SPEC): objcopy --extract-dwo + objcopy --strip-dwo. In
gcc/opts.c, -gsplit_dwarf has the same semantic of a -g. Except for the
availability of the external command 'objcopy', nothing precludes the
feature working on other ELF OSes. llvm doesn't use objcopy, so it doesn't
have to exclude other OSes.
llvm-svn: 357150
The RISC-V assembler needs the target ABI because it defines a flag of the ELF
file, as described in [1].
Make clang (the driver) to pass the target ABI to -cc1as in exactly the same
way it does for -cc1.
Currently -cc1as knows about -target-abi but is not handling it. Handle it and
pass it to the MC layer via MCTargetOptions.
[1] https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#file-header
Differential Revision: https://reviews.llvm.org/D59298
llvm-svn: 356981
-malign-double is currently only implemented in the -cc1 interface. But its declared in Options.td so it is a driver option too. But you try to use it with the driver you'll get a message about the option being unused.
This patch teaches the driver to pass the option through to cc1 so it won't be unused. The Options.td says the option is x86 only but I didn't see any x86 specific code in its impementation in cc1 so not sure if the documentation is wrong or if I should only pass this option through the driver on x86 targets.
Differential Revision: https://reviews.llvm.org/D59624
llvm-svn: 356706
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.
This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`
will only emit the remarks coming from the pass `inline`.
This adds:
* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin
Differential Revision: https://reviews.llvm.org/D59268
Original llvm-svn: 355964
llvm-svn: 355984
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.
This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`
will only emit the remarks coming from the pass `inline`.
This adds:
* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin
Differential Revision: https://reviews.llvm.org/D59268
llvm-svn: 355964
When -forder-file-instrumentation is on, we pass llvm flag to enable the order file instrumentation pass.
https://reviews.llvm.org/D58751
llvm-svn: 355333
Part 1 of CSPGO change in Clang. This includes changes in clang options
and calls to llvm PassManager. Tests will be committed in part2.
This change needs the PassManager change in llvm.
Differential Revision: https://reviews.llvm.org/D54176
llvm-svn: 355331
Summary:
In the clang UI, replaces -mthread-model posix with -matomics as the
source of truth on threading. In the backend, replaces
-thread-model=posix with the atomics target feature, which is now
collected on the WebAssemblyTargetMachine along with all other used
features. These collected features will also be used to emit the
target features section in the future.
The default configuration for the backend is thread-model=posix and no
atomics, which was previously an invalid configuration. This change
makes the default valid because the thread model is ignored.
A side effect of this change is that objects are never emitted with
passive segments. It will instead be up to the linker to decide
whether sections should be active or passive based on whether atomics
are used in the final link.
Reviewers: aheejin, sbc100, dschuff
Subscribers: mehdi_amini, jgravelle-google, hiraditya, sunfish, steven_wu, dexonsmith, rupprecht, jfb, jdoerfert, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D58742
llvm-svn: 355112
A faster way to reduce the values in teams reductions was found, the
codegen is updated to use this faster algorithm and new runtime functions.
llvm-svn: 354479
This adds ACLE-defined macros to test for code being compiled in the ROPI and
RWPI position-independence modes.
Differential revision: https://reviews.llvm.org/D23610
llvm-svn: 354265
Summary:
There have been three options related to threads and users had to set
all three of them separately to get the correct compilation results.
This makes sure the relationship between the options makes sense and
sets necessary options for users if only part of the necessary options
are specified. This does:
- Remove `-matomics`; this option alone does not enable anything, so
removed it to not confuse users.
- `-mthread-model posix` sets `-target-feature +atomics`
- `-pthread` sets both `-target-feature +atomics` and
`-mthread-model posix`
Also errors out when explicitly given options don't match, such as
`-pthread` is given with `-mthread-model single`.
Reviewers: dschuff, sbc100, tlively, sunfish
Subscribers: jgravelle-google, jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57874
llvm-svn: 353761
This is suggested by 3.3.9 of MSP430 EABI document.
We do allow user to manually enable frame pointer. GCC toolchain uses the same behavior.
Patch by Dmitry Mikushev!
Differential Revision: https://reviews.llvm.org/D56925
llvm-svn: 353212
Summary:
This adds support for new-PM plugin loading to clang. The option
`-fpass-plugin=` may be used to specify a dynamic shared object file
that adheres to the PassPlugin API.
Tested: created simple plugin that registers an EP callback; with optimization level > 0, the pass is run as expected.
Committed on behalf of Marco Elver
Differential Revision: https://reviews.llvm.org/D56935
llvm-svn: 352972
..and use it to control that parts of CUDA compilation
that depend on the specific version of CUDA SDK.
This patch has a placeholder for a 'new launch API' support
which is in a separate patch. The list will be further
extended in the upcoming patch to support CUDA-10.1.
Differential Revision: https://reviews.llvm.org/D57487
llvm-svn: 352798
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
These two options enable/disable emission of R_{MICRO}MIPS_JALR fixups along
with PIC calls. The linker may then try to turn PIC calls into direct jumps.
By default, these fixups do get emitted by the backend, use
'-mno-relax-pic-calls' to omit them.
Differential revision: https://reviews.llvm.org/D56878
llvm-svn: 351579
This is an initial implementation for msp430 toolchain including
-mmcu option support
-mhwmult options support
-integrated-as by default
The toolchain uses msp430-elf-as as a linker and supports msp430-gcc toolchain tree.
Patch by Kristina Bessonova!
Differential Revision: https://reviews.llvm.org/D56658
llvm-svn: 351228
Summary:
Adds a new -f[no]split-lto-unit flag that is disabled by default to
control module splitting during ThinLTO. It is automatically enabled
for -fsanitize=cfi and -fwhole-program-vtables.
The new EnableSplitLTOUnit codegen flag is passed down to llvm
via a new module flag of the same name.
Depends on D53890.
Reviewers: pcc
Subscribers: ormris, mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D53891
llvm-svn: 350949
Summary: Introduce a compiler flag for cases when the user knows that the collapsed loop counter can be safely represented using at most 32 bits. This will prevent the emission of expensive mathematical operations (such as the div operation) on the iteration variable using 64 bits where 32 bit operations are sufficient.
Reviewers: ABataev, caomhin
Reviewed By: ABataev
Subscribers: hfinkel, kkwli0, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D55928
llvm-svn: 350758
Gentoo supports combining clang toolchain with GNU binutils, and many
users actually do that. As -faddrsig is not supported by GNU strip,
this results in a lot of warnings. Disable it by default and let users
enable it explicitly if they want it; with the intent of reevaluating
when the underlying feature becomes standarized.
See also: https://bugs.gentoo.org/667854
Differential Revision: https://reviews.llvm.org/D56047
llvm-svn: 350028
If an -analyzer-config is passed through -Xanalyzer, it is not found while
looking for -Xclang.
Additionally, don't emit -analyzer-config-compatibility-mode for *every*
-analyzer-config flag we encounter; one is enough.
https://reviews.llvm.org/D55823
rdar://problem/46504165
llvm-svn: 349866
Since r348038 we emit an error every time an -analyzer-config option is not
found. The driver, however, suppresses this error with another flag,
-analyzer-config-compatibility-mode, so backwards compatibility is maintained,
while analyzer developers still enjoy the new typo-free experience.
The backwards compatibility turns out to be still broken when the -analyze
action is not specified; it is still possible to specify -analyzer-config
in that case. This should be fixed now.
Patch by Kristóf Umann!
Differential Revision: https://reviews.llvm.org/D55823
rdar://problem/46504165
llvm-svn: 349824
Replace multiple comparisons of getOS() value with FreeBSD, NetBSD,
OpenBSD and DragonFly with matching isOS*BSD() methods. This should
improve the consistency of coding style without changing the behavior.
Direct getOS() comparisons were left whenever used in switch or switch-
like context.
Differential Revision: https://reviews.llvm.org/D55916
llvm-svn: 349752
Avoid passing -faddrsig by default on NetBSD. This platform is still
using old GNU binutils that crashes on executables containing those
sections.
Differential Revision: https://reviews.llvm.org/D55828
llvm-svn: 349647
NFC for targets other than PS4.
Respect -nostdlib and -nodefaultlibs when enabling asan or ubsan.
Differential Revision: https://reviews.llvm.org/D55712
llvm-svn: 349508
Summary:
Add an option to initialize automatic variables with either a pattern or with
zeroes. The default is still that automatic variables are uninitialized. Also
add attributes to request uninitialized on a per-variable basis, mainly to disable
initialization of large stack arrays when deemed too expensive.
This isn't meant to change the semantics of C and C++. Rather, it's meant to be
a last-resort when programmers inadvertently have some undefined behavior in
their code. This patch aims to make undefined behavior hurt less, which
security-minded people will be very happy about. Notably, this means that
there's no inadvertent information leak when:
- The compiler re-uses stack slots, and a value is used uninitialized.
- The compiler re-uses a register, and a value is used uninitialized.
- Stack structs / arrays / unions with padding are copied.
This patch only addresses stack and register information leaks. There's many
more infoleaks that we could address, and much more undefined behavior that
could be tamed. Let's keep this patch focused, and I'm happy to address related
issues elsewhere.
To keep the patch simple, only some `undef` is removed for now, see
`replaceUndef`. The padding-related infoleaks are therefore not all gone yet.
This will be addressed in a follow-up, mainly because addressing padding-related
leaks should be a stand-alone option which is implied by variable
initialization.
There are three options when it comes to automatic variable initialization:
0. Uninitialized
This is C and C++'s default. It's not changing. Depending on code
generation, a programmer who runs into undefined behavior by using an
uninialized automatic variable may observe any previous value (including
program secrets), or any value which the compiler saw fit to materialize on
the stack or in a register (this could be to synthesize an immediate, to
refer to code or data locations, to generate cookies, etc).
1. Pattern initialization
This is the recommended initialization approach. Pattern initialization's
goal is to initialize automatic variables with values which will likely
transform logic bugs into crashes down the line, are easily recognizable in
a crash dump, without being values which programmers can rely on for useful
program semantics. At the same time, pattern initialization tries to
generate code which will optimize well. You'll find the following details in
`patternFor`:
- Integers are initialized with repeated 0xAA bytes (infinite scream).
- Vectors of integers are also initialized with infinite scream.
- Pointers are initialized with infinite scream on 64-bit platforms because
it's an unmappable pointer value on architectures I'm aware of. Pointers
are initialize to 0x000000AA (small scream) on 32-bit platforms because
32-bit platforms don't consistently offer unmappable pages. When they do
it's usually the zero page. As people try this out, I expect that we'll
want to allow different platforms to customize this, let's do so later.
- Vectors of pointers are initialized the same way pointers are.
- Floating point values and vectors are initialized with a negative quiet
NaN with repeated 0xFF payload (e.g. 0xffffffff and 0xffffffffffffffff).
NaNs are nice (here, anways) because they propagate on arithmetic, making
it more likely that entire computations become NaN when a single
uninitialized value sneaks in.
- Arrays are initialized to their homogeneous elements' initialization
value, repeated. Stack-based Variable-Length Arrays (VLAs) are
runtime-initialized to the allocated size (no effort is made for negative
size, but zero-sized VLAs are untouched even if technically undefined).
- Structs are initialized to their heterogeneous element's initialization
values. Zero-size structs are initialized as 0xAA since they're allocated
a single byte.
- Unions are initialized using the initialization for the largest member of
the union.
Expect the values used for pattern initialization to change over time, as we
refine heuristics (both for performance and security). The goal is truly to
avoid injecting semantics into undefined behavior, and we should be
comfortable changing these values when there's a worthwhile point in doing
so.
Why so much infinite scream? Repeated byte patterns tend to be easy to
synthesize on most architectures, and otherwise memset is usually very
efficient. For values which aren't entirely repeated byte patterns, LLVM
will often generate code which does memset + a few stores.
2. Zero initialization
Zero initialize all values. This has the unfortunate side-effect of
providing semantics to otherwise undefined behavior, programs therefore
might start to rely on this behavior, and that's sad. However, some
programmers believe that pattern initialization is too expensive for them,
and data might show that they're right. The only way to make these
programmers wrong is to offer zero-initialization as an option, figure out
where they are right, and optimize the compiler into submission. Until the
compiler provides acceptable performance for all security-minded code, zero
initialization is a useful (if blunt) tool.
I've been asked for a fourth initialization option: user-provided byte value.
This might be useful, and can easily be added later.
Why is an out-of band initialization mecanism desired? We could instead use
-Wuninitialized! Indeed we could, but then we're forcing the programmer to
provide semantics for something which doesn't actually have any (it's
uninitialized!). It's then unclear whether `int derp = 0;` lends meaning to `0`,
or whether it's just there to shut that warning up. It's also way easier to use
a compiler flag than it is to manually and intelligently initialize all values
in a program.
Why not just rely on static analysis? Because it cannot reason about all dynamic
code paths effectively, and it has false positives. It's a great tool, could get
even better, but it's simply incapable of catching all uses of uninitialized
values.
Why not just rely on memory sanitizer? Because it's not universally available,
has a 3x performance cost, and shouldn't be deployed in production. Again, it's
a great tool, it'll find the dynamic uses of uninitialized variables that your
test coverage hits, but it won't find the ones that you encounter in production.
What's the performance like? Not too bad! Previous publications [0] have cited
2.7 to 4.5% averages. We've commmitted a few patches over the last few months to
address specific regressions, both in code size and performance. In all cases,
the optimizations are generally useful, but variable initialization benefits
from them a lot more than regular code does. We've got a handful of other
optimizations in mind, but the code is in good enough shape and has found enough
latent issues that it's a good time to get the change reviewed, checked in, and
have others kick the tires. We'll continue reducing overheads as we try this out
on diverse codebases.
Is it a good idea? Security-minded folks think so, and apparently so does the
Microsoft Visual Studio team [1] who say "Between 2017 and mid 2018, this
feature would have killed 49 MSRC cases that involved uninitialized struct data
leaking across a trust boundary. It would have also mitigated a number of bugs
involving uninitialized struct data being used directly.". They seem to use pure
zero initialization, and claim to have taken the overheads down to within noise.
Don't just trust Microsoft though, here's another relevant person asking for
this [2]. It's been proposed for GCC [3] and LLVM [4] before.
What are the caveats? A few!
- Variables declared in unreachable code, and used later, aren't initialized.
This goto, Duff's device, other objectionable uses of switch. This should
instead be a hard-error in any serious codebase.
- Volatile stack variables are still weird. That's pre-existing, it's really
the language's fault and this patch keeps it weird. We should deprecate
volatile [5].
- As noted above, padding isn't fully handled yet.
I don't think these caveats make the patch untenable because they can be
addressed separately.
Should this be on by default? Maybe, in some circumstances. It's a conversation
we can have when we've tried it out sufficiently, and we're confident that we've
eliminated enough of the overheads that most codebases would want to opt-in.
Let's keep our precious undefined behavior until that point in time.
How do I use it:
1. On the command-line:
-ftrivial-auto-var-init=uninitialized (the default)
-ftrivial-auto-var-init=pattern
-ftrivial-auto-var-init=zero -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
2. Using an attribute:
int dont_initialize_me __attribute((uninitialized));
[0]: https://users.elis.ugent.be/~jsartor/researchDocs/OOPSLA2011Zero-submit.pdf
[1]: https://twitter.com/JosephBialek/status/1062774315098112001
[2]: https://outflux.net/slides/2018/lss/danger.pdf
[3]: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00615.html
[4]: 776a0955ef
[5]: http://wg21.link/p1152
I've also posted an RFC to cfe-dev: http://lists.llvm.org/pipermail/cfe-dev/2018-November/060172.html
<rdar://problem/39131435>
Reviewers: pcc, kcc, rsmith
Subscribers: JDevlieghere, jkorous, dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D54604
llvm-svn: 349442
is not specified
The -target option allows the user to specify the build target using LLVM
triple. The triple includes the arch, and so the -arch option is redundant.
This should work just as well without the -arch. However, the driver has a bug
in which it doesn't target the "Cyclone" CPU for darwin if -target is used
without -arch. This commit fixes this issue.
rdar://46743182
Differential Revision: https://reviews.llvm.org/D55731
llvm-svn: 349382
Implement options in clang to enable recording the driver command-line
in an ELF section.
Implement a new special named metadata, llvm.commandline, to support
frontends embedding their command-line options in IR/ASM/ELF.
This differs from the GCC implementation in some key ways:
* In GCC there is only one command-line possible per compilation-unit,
in LLVM it mirrors llvm.ident and multiple are allowed.
* In GCC individual options are separated by NULL bytes, in LLVM entire
command-lines are separated by NULL bytes. The advantage of the GCC
approach is to clearly delineate options in the face of embedded
spaces. The advantage of the LLVM approach is to support merging
multiple command-lines unambiguously, while handling embedded spaces
with escaping.
Differential Revision: https://reviews.llvm.org/D54487
Clang Differential Revision: https://reviews.llvm.org/D54489
llvm-svn: 349155
Summary:
Added support for the -gline-directives-only option + fixed logic of the
debug info for CUDA devices. If optimization level is O0, then options
--[no-]cuda-noopt-device-debug do not affect the debug info level. If
the optimization level is >O0, debug info options are used +
--no-cuda-noopt-device-debug is used or no --cuda-noopt-device-debug is
used, the optimization level for the device code is kept and the
emission of the debug directives is used.
If the opt level is > O0, debug info is requested +
--cuda-noopt-device-debug option is used, the optimization is disabled
for the device code + required debug info is emitted.
Reviewers: tra, echristo
Subscribers: aprantl, guansong, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D51554
llvm-svn: 348930
It is faster to directly call the ObjC runtime for methods such as alloc/allocWithZone instead of sending a message to those functions.
This patch adds support for converting messages to alloc/allocWithZone to their equivalent runtime calls.
Tests included for the positive case of applying this transformation, negative tests that we ensure we only convert "alloc" to objc_alloc, not "alloc2", and also a driver test to ensure we enable this only for supported runtime versions.
Reviewed By: rjmccall
https://reviews.llvm.org/D55349
llvm-svn: 348687
The flag -fdebug-compilation-dir is useful to make generated .o files
independent of the path of the build directory, without making the compile
command-line dependent on the path of the build directory, like
-fdebug-prefix-map requires. This change makes it so that the driver can
forward the flag to -cc1as, like it already can for -cc1. We might want to
consider making -fdebug-compilation-dir a driver flag in a follow-up.
(Since -fdebug-compilation-dir defaults to PWD, it's already possible to get
this effect by setting PWD, but explicit compiler flags are better than env
vars, because e.g. ninja tracks command lines and reruns commands that change.)
Somewhat related to PR14625.
Differential Revision: https://reviews.llvm.org/D55377
llvm-svn: 348515
This is an updated version of the D54576, which was reverted.
Problem was that SplitDebugName calls the InputInfo::getFilename
which asserts if InputInfo given is not of type Filename:
const char *getFilename() const {
assert(isFilename() && "Invalid accessor.");
return Data.Filename;
}
At the same time at that point, it can be of type Nothing and
we need to use getBaseInput(), like original code did.
Differential revision: https://reviews.llvm.org/D55006
llvm-svn: 348352
When debugging a boost build with a modified
version of Clang, I discovered that the PTH implementation
stores TokenKind in 8 bits. However, we currently have 368
TokenKinds.
The result is that the value gets truncated and the wrong token
gets picked up when including PTH files. It seems that this will
go wrong every time someone uses a token that uses the 9th bit.
Upon asking on IRC, it was brought up that this was a highly
experimental features that was considered a failure. I discovered
via googling that BoostBuild (mostly Boost.Math) is the only user of
this
feature, using the CC1 flag directly. I believe that this can be
transferred over to normal PCH with minimal effort:
https://github.com/boostorg/build/issues/367
Based on advice on IRC and research showing that this is a nearly
completely unused feature, this patch removes it entirely.
Note: I considered leaving the build-flags in place and making them
emit an error/warning, however since I've basically identified and
warned the only user, it seemed better to just remove them.
Differential Revision: https://reviews.llvm.org/D54547
Change-Id: If32744275ef1f585357bd6c1c813d96973c4d8d9
llvm-svn: 348266
When the global new and delete operators aren't declared, Clang
provides and implicit declaration, but this declaration currently
always uses the default visibility. This is a problem when the
C++ library itself is being built with non-default visibility because
the implicit declaration will force the new and delete operators to
have the default visibility unlike the rest of the library.
The existing workaround is to use assembly to enforce the visiblity:
https://fuchsia.googlesource.com/zircon/+/master/system/ulib/zxcpp/new.cpp#108
but that solution is not always available, e.g. in the case of of
libFuzzer which is using an internal version of libc++ that's also built
with -fvisibility=hidden where the existing behavior is causing issues.
This change introduces a new option -fvisibility-global-new-delete-hidden
which makes the implicit declaration of the global new and delete
operators hidden.
Differential Revision: https://reviews.llvm.org/D53787
llvm-svn: 348234
This adds Hurd toolchain support to Clang's driver in addition
to handling translating the triple from Hurd-compatible form to
the actual triple registered in LLVM.
(Phabricator was stripping the empty files from the patch so I
manually created them)
Patch by sthibaul (Samuel Thibault)
Differential Revision: https://reviews.llvm.org/D54379
llvm-svn: 347833
This reverts commit r347035 as it introduced assertion failures under
certain conditions. More information can be found here:
https://reviews.llvm.org/rL347035
llvm-svn: 347676
Summary:
-mno-speculative-load-hardening isn't a cc1 option, therefore,
before this change:
clang -mno-speculative-load-hardening hello.cpp
would have the following error:
error: unknown argument: '-mno-speculative-load-hardening'
This change will only ever forward -mspeculative-load-hardening
which is a CC1 option based on which flag was passed to clang.
Also added a test that uses this option that fails if an error like the
above is ever thrown.
Thank you ericwf for help debugging and fixing this error.
Reviewers: chandlerc, EricWF
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54763
llvm-svn: 347582
Summary:
the previous patch (https://reviews.llvm.org/rC346642) has been reverted because of test failure under windows.
So this patch fix the test cfe/trunk/test/CodeGen/code-coverage-filter.c.
Reviewers: marco-c
Reviewed By: marco-c
Subscribers: cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D54600
llvm-svn: 347144
Summary:
Experience has shown that the functionality is useful. It makes linking
optimized clang with debug info for me a lot faster, 20s to 13s. The
type merging phase of PDB writing goes from 10s to 3s.
This removes the LLVM cl::opt and replaces it with a metadata flag.
After this change, users can do the following to use ghash:
- add -gcodeview-ghash to compiler flags
- replace /DEBUG with /DEBUG:GHASH in linker flags
Reviewers: zturner, hans, thakis, takuto.ikuta
Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D54370
llvm-svn: 347072
This should be NFC change.
SplitDebugName recently started to accept the `Output` that
can be used to simplify the logic a bit, also it
seems that code in SplitDebugName that uses
OPT_fdebug_compilation_dir is simply dead.
Differential revision: https://reviews.llvm.org/D54576
llvm-svn: 347035
-frewrite-imports already implies -frewrite-includes (it piggy-backs
on/extends the implementation) so there's no need to conditionally pass
-frewrite-includes when already using -frewrite-imports (& especially I
don't think these would want to be different between crash reporting and
not crash reporting)
llvm-svn: 346927
Summary:
If you're using the Microsoft ABI, chances are that you want PDBs and
codeview debug info. Currently, everyone has to remember to specific
-gcodeview by default, when it would be nice if the standard -g option
did the right thing by default.
Also, do some related cleanup of -cc1 options. When targetting the MS
C++ ABI, we probably shouldn't pass -debugger-tuning=gdb. We were also
passing -gcodeview twice, which is silly.
Reviewers: smeenai, zturner
Subscribers: aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D54499
llvm-svn: 346907
This unfortunately results in a substantial breaking change when
switching to C++20, but it's not yet clear what / how much we should
do about that. We may want to add a compatibility conversion from
u8 string literals to const char*, similar to how C++98 provided a
compatibility conversion from string literals to non-const char*,
but that's not handled by this patch.
The feature can be disabled in C++20 mode with -fno-char8_t.
llvm-svn: 346892
The DWARF5 specification says(Appendix F.1):
"The sections that do not require relocation, however, can be
written to the relocatable object (.o) file but ignored by the
linker or they can be written to a separate DWARF object (.dwo)
file that need not be accessed by the linker."
The first part describes a single file split DWARF feature and there
is no way to trigger this behavior atm.
Fortunately, no many changes are required to keep *.dwo sections
in a .o, the patch does that.
Differential revision: https://reviews.llvm.org/D52296
llvm-svn: 346837
Summary:
This saves a lot of relocations in optimized object files (at the cost
of some cost/increase in linked executable bytes), but gold's 32 bit
gdb-index support has a bug (
https://sourceware.org/bugzilla/show_bug.cgi?id=21894 ) so we can't
switch to this unconditionally. (& even if it weren't for that bug, one
might argue that some users would want to optimize in one direction or
the other - prioritizing object size or linked executable size)
Differential Revision: https://reviews.llvm.org/D54243
llvm-svn: 346789
Summary: /Zc:dllexportInlines with /fallback may cause unexpected linker error. It is better to disallow compile rather than warn for this combination.
Reviewers: hans, thakis
Reviewed By: hans
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D54426
llvm-svn: 346733
Summary:
These options are taking regex separated by colons to filter files.
- if both are empty then all files are instrumented
- if -fprofile-filter-files is empty then all the filenames matching any of the regex from exclude are not instrumented
- if -fprofile-exclude-files is empty then all the filenames matching any of the regex from filter are instrumented
- if both aren't empty then all the filenames which match any of the regex in filter and which don't match all the regex in filter are instrumented
- this patch is a follow-up of https://reviews.llvm.org/D52033
Reviewers: marco-c, vsk
Reviewed By: marco-c, vsk
Subscribers: cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D52034
llvm-svn: 346642
This reverts commit r345963. We have a path forward now.
Original commit message:
The driver accidentally stopped passing the input filenames on to -cc1
in this mode due to confusion over what action was being requested.
This change also fixes a couple of crashes I encountered when passing
multiple files to such a -cc1 invocation.
llvm-svn: 346130
Summary:
This CL adds /Zc:DllexportInlines flag to clang-cl.
When Zc:DllexportInlines- is specified, inline class member function is not exported if the function does not have local static variables.
By not exporting inline function, code for those functions are not generated and that reduces both compile time and obj size. Also this flag does not import inline functions from dllimported class if the function does not have local static variables.
On my 24C48T windows10 machine, build performance of chrome target in chromium repository is like below.
These stats are come with 'target_cpu="x86" enable_nacl = false is_component_build=true dcheck_always_on=true` build config and applied
* https://chromium-review.googlesource.com/c/chromium/src/+/1212379
* https://chromium-review.googlesource.com/c/v8/v8/+/1186017
Below stats were taken with this patch applied on a05115cd4c
| config | build time | speedup | build dir size |
| with patch, PCH on, debug | 1h10m0s | x1.13 | 35.6GB |
| without patch, PCH on, debug | 1h19m17s | | 49.0GB |
| with patch, PCH off, debug | 1h15m45s | x1.16 | 33.7GB |
| without patch, PCH off, debug | 1h28m10s | | 52.3GB |
| with patch, PCH on, release | 1h13m13s | x1.22 | 26.2GB |
| without patch, PCH on, release | 1h29m57s | | 37.5GB |
| with patch, PCH off, release | 1h23m38s | x1.32 | 23.7GB |
| without patch, PCH off, release | 1h50m50s | | 38.7GB |
This patch reduced obj size and the number of exported symbols largely, that improved link time too.
e.g. link time stats of blink_core.dll become like below
| | cold disk cache | warm disk cache |
| with patch, PCH on, debug | 71s | 30s |
| without patch, PCH on, debug | 111s | 48s |
This patch's implementation is based on Nico Weber's patch. I modified to support static local variable, added tests and took stats.
Bug: https://bugs.llvm.org/show_bug.cgi?id=33628
Reviewers: hans, thakis, rnk, javed.absar
Reviewed By: hans
Subscribers: kristof.beyls, smeenai, dschuff, probinson, cfe-commits, eraman
Differential Revision: https://reviews.llvm.org/D51340
llvm-svn: 346069
target/teams/distribute regions.
Target/teams/distribute regions exist for all the time the kernel is
executed. Thus, if the variable is declared in their context and then
escape it, we can allocate global memory statically instead of
allocating it dynamically.
Patch captures all the globalized variables in target/teams/distribute
contexts, merges them into the records, one per each target region.
Those records are then joined into the union, one per compilation unit
(to save the global memory). Those units are organized into
2 x dimensional arrays, where the first dimension is
the number of blocks per SM and the second one is the number of SMs.
Runtime functions manage this global memory space between the executing
teams.
llvm-svn: 345978
This reverts commit r345803 and r345915 (a follow-up fix to r345803).
Reason: r345803 blocks our internal integrate because of the new
warnings showing up in too many places. The fix is actually correct,
we will reland it after figuring out how to integrate properly.
llvm-svn: 345963
-fsyntax-only.
The driver accidentally stopped passing the input filenames on to -cc1
in this mode due to confusion over what action was being requested.
This change also fixes a couple of crashes I encountered when passing
multiple files to such a -cc1 invocation.
llvm-svn: 345803
This reverts commit r345370, as it uncovered even more issues in
tests with partial/inconsistent path normalization:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/13562http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/886http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/20994
In particular, these tests seem to have failed:
Clang :: CodeGen/thinlto-diagnostic-handler-remarks-with-hotness.ll
Clang :: CodeGen/thinlto-multi-module.ll
Clang :: Driver/cuda-external-tools.cu
Clang :: Driver/cuda-options.cu
Clang :: Driver/hip-toolchain-no-rdc.hip
Clang :: Driver/hip-toolchain-rdc.hip
Clang :: Driver/openmp-offload-gpu.c
At least the Driver tests could potentially be fixed by extending
the path normalization to even more places, but the issues with the
CodeGen tests are still unknown.
In addition, a number of other tests seem to have been broken in
other clang dependent tools such as clang-tidy and clangd.
llvm-svn: 345372
libtool inspects the output of $CC -v to detect what object files and
libraries are linked in by default. When clang is built as a native
windows executable, all paths are formatted with backslashes, and
the backslashes cause each argument to be enclosed in quotes. The
backslashes and quotes break further processing within libtool (which
is implemented in shell script, running in e.g. msys) pretty badly.
Between unix style pathes (that only work in tools that are linked
to the msys runtime, essentially the same as cygwin) and proper windows
style paths (with backslashes, that can easily break shell scripts
and msys environments), the best compromise is to use windows style
paths (starting with e.g. c:) but with forward slashes, which both
msys based tools, shell scripts and native windows executables can
cope with. This incidentally turns out to be the form of paths that
GCC prints out when run with -v on windows as well.
This change potentially makes the output from clang -v a bit more
inconsistent, but it is isn't necessarily very consistent to begin with.
Compared to the previous attempt in SVN r345004, this now does
the same transformation on more paths, hopefully on the right set
of paths so that all tests pass (previously some tests failed, where
path fragments that were required to be identical turned out to
use different path separators in different places). This now also
is done only for non-windows, or cygwin/mingw targets, to preserve
all backslashes for MSVC cases (where the paths can end up e.g. embedded
into PDB files. (The transformation function itself,
llvm::sys::path::convert_to_slash only has an effect when run on windows.)
Differential Revision: https://reviews.llvm.org/D53066
llvm-svn: 345370
Add a new driver level flag `-fcf-runtime-abi=` that allows one to specify the
runtime ABI for CoreFoundation. This controls the language interoperability.
In particular, this is relevant for generating the CFConstantString classes
(primarily through the `__builtin___CFStringMakeConstantString` builtin) which
construct a reference to the "CFObject"'s `isa` field. This type differs
between swift 4.1 and 4.2+.
Valid values for the new option include:
- objc [default behaviour] - enable ObjectiveC interoperability
- swift-4.1 - enable interoperability with swift 4.1
- swift-4.2 - enable interoperability with swift 4.2
- swift-5.0 - enable interoperability with swift 5.0
- swift [alias] - target the latest swift ABI
Furthermore, swift 4.2+ changed the layout for the CFString when building
CoreFoundation *without* ObjectiveC interoperability. In such a case, a field
was added to the CFObject base type changing it from: <{ const int*, int }> to
<{ uintptr_t, uintptr_t, uint64_t }>.
In swift 5.0, the CFString type will be further adjusted to change the length
from a uint32_t on everything but BE LP64 targets to uint64_t.
Note that the default behaviour for clang remains unchanged and the new layout
must be explicitly opted into via `-fcf-runtime-abi=swift*`.
llvm-svn: 345222
This patch exposes functionality added in rL344723 to the Clang driver/frontend
as a flag and adds appropriate metadata.
Driver tests pass:
```
ninja check-clang-driver
-snip-
Expected Passes : 472
Expected Failures : 3
Unsupported Tests : 65
```
Odd failure in CodeGen tests but unrelated to this:
```
ninja check-clang-codegen
-snip-
/SourceCache/llvm-trunk-8.0/tools/clang/test/CodeGen/builtins-wasm.c:87:10:
error: cannot compile this builtin function yet
-snip-
Failing Tests (1):
Clang :: CodeGen/builtins-wasm.c
Expected Passes : 1250
Expected Failures : 2
Unsupported Tests : 120
Unexpected Failures: 1
```
Original commit:
[X86] Support for the mno-tls-direct-seg-refs flag
Allows to disable direct TLS segment access (%fs or %gs). GCC supports a
similar flag, it can be useful in some circumstances, e.g. when a thread
context block needs to be updated directly from user space. More info and
specific use cases: https://bugs.llvm.org/show_bug.cgi?id=16145
Patch by nruslan (Ruslan Nikolaev).
Differential Revision: https://reviews.llvm.org/D53102
llvm-svn: 344739