Recently HIP toolchain made a change to use clang instead of opt/llc to do compilation
(https://reviews.llvm.org/D81861). The intention is to make HIP toolchain canonical like
other toolchains.
However, this change introduced an unintentional change regarding backend fp fuse
option, which caused regressions in some HIP applications.
Basically before the change, HIP toolchain used clang to generate bitcode, then use
opt/llc to optimize bitcode and generate ISA. As such, the amdgpu backend takes
the default fp fuse mode which is 'Standard'. This mode respect contract flag of
fmul/fadd instructions and do not fuse fmul/fadd instructions without contract flag.
However, after the change, HIP toolchain now use clang to generate IR, do optimization,
and generate ISA as one process. Now amdgpu backend fp fuse option is determined
by -ffp-contract option, which is 'fast' by default. And this -ffp-contract=fast language option
is translated to 'Fast' fp fuse option in backend. Suddenly backend starts to fuse fmul/fadd
instructions without contract flag.
This causes wrong result for some device library functions, e.g. tan(-1e20), which should
return 0.8446, now returns -0.933. What is worse is that since backend with 'Fast' fp fuse
option does not respect contract flag, there is no way to use #pragma clang fp contract
directive to enforce fp contract requirements.
This patch fixes the regression by introducing a new value 'fast-honor-pragmas' for -ffp-contract
and use it for HIP by default. 'fast-honor-pragmas' is equivalent to 'fast' in frontend but
let the backend to use 'Standard' fp fuse option. 'fast-honor-pragmas' is useful since 'Fast'
fp fuse option in backend does not honor contract flag, it is of little use to HIP
applications since all code with #pragma STDC FP_CONTRACT or any IR from a
source compiled with -ffp-contract=on is broken.
Differential Revision: https://reviews.llvm.org/D90174
The new option `-fproc-stat-info=<file>` can be used to generate report
about used memory and execution tile of each stage of compilation.
Documentation for this option can be found in `UserManual.rst`. The
option can be used in parallel builds.
Differential Revision: https://reviews.llvm.org/D78903
Old GCC used to aggressively fold VLAs to constant-bound arrays at block
scope in GNU mode. That's non-conforming, and more modern versions of
GCC only do this at file scope. Update Clang to do the same.
Also promote the warning for this from off-by-default to on-by-default
in all cases; more recent versions of GCC likewise warn on this by
default.
This is still slightly more permissive than GCC, as pointed out in
PR44406, as we still fold VLAs to constant arrays in structs, but that
seems justifiable given that we don't support VLA-in-struct (and don't
intend to ever support it), but GCC does.
Differential Revision: https://reviews.llvm.org/D89523
GCC 7 introduced -fprofile-update={atomic,prefer-atomic} (prefer-atomic is for
best efforts (some targets do not support atomics)) to increment counters
atomically, which is exactly what we have done with -fprofile-instr-generate
(D50867) and -fprofile-arcs (b5ef137c11).
This patch adds the option to clang to surface the internal options at driver level.
GCC 7 also turned on -fprofile-update=prefer-atomic when -pthread is specified,
but it has performance regression
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89307). So we don't follow suit.
Differential Revision: https://reviews.llvm.org/D87737
This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section.
The format of the emitted data is represented as follows. It includes a header for every function:
| Address of the function | -> 8 bytes (pointer size)
| Number of basic blocks in this function (>0) | -> ULEB128
The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows:
| Offset of the basic block relative to function begin | -> ULEB128
| Binary size of the basic block | -> ULEB128
| BB metadata | -> ULEB128 [ MBB.isReturn() OR MBB.hasTailCall() << 1 OR MBB.isEHPad() << 2 ]
The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels.
The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers.
Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%.
For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html
Reviewed By: MaskRay, snehasish
Differential Revision: https://reviews.llvm.org/D85408
Fixes pr/11710.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Resubmit after breaking Windows and OSX builds.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D80242
This is to correct bugs.llvm.org/show_bug.cgi?id=46444
ffinite-math-only is a gcc option. That is the correct spelling.
File modified is clang/docs/UsersManual.rst
This patch adds clang options:
-fbasic-block-sections={all,<filename>,labels,none} and
-funique-basic-block-section-names.
LLVM Support for basic block sections is already enabled.
+ -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic
block sections for all or a subset of basic blocks. "labels" only enables
basic block symbols.
+ -funique-basic-block-section-names: Enables unique section names for
basic block sections, disabled by default.
Differential Revision: https://reviews.llvm.org/D68049
This is a standalone patch and this would help Propeller do a better job of code
layout as it can accurately attribute the profiles to the right internal linkage
function.
This also helps SampledFDO/AutoFDO correctly associate sampled profiles to the
right internal function. Currently, if there is more than one internal symbol
foo, their profiles are aggregated by SampledFDO.
This patch adds a new clang option, -funique-internal-funcnames, to generate
unique names for functions with internal linkage. This patch appends the md5
hash of the module name to the function symbol as a best effort to generate a
unique name for symbols with internal linkage.
Differential Revision: https://reviews.llvm.org/D73307
Prior to this change, for a few compiler-rt libraries such as ubsan and
the profile library, Clang would embed "-defaultlib:path/to/rt-arch.lib"
into the .drective section of every object compiled with
-finstr-profile-generate or -fsanitize=ubsan as appropriate.
These paths assume that the link step will run from the same working
directory as the compile step. There is also evidence that sometimes the
paths become absolute, such as when clang is run from a different drive
letter from the current working directory. This is fragile, and I'd like
to get away from having paths embedded in the object if possible. Long
ago it was suggested that we use this for ASan, and apparently I felt
the same way back then:
https://reviews.llvm.org/D4428#56536
This is also consistent with how all other autolinking usage works for
PS4, Mac, and Windows: they all use basenames, not paths.
To keep things working for people using the standard GCC driver
workflow, the driver now adds the resource directory to the linker
library search path when it calls the linker. This is enough to make
check-ubsan pass, and seems like a generally good thing.
Users that invoke the linker directly (most clang-cl users) will have to
add clang's resource library directory to their linker search path in
their build system. I'm not sure where I can document this. Ideally I'd
also do it in the MSBuild files, but I can't figure out where they go.
I'd like to start with this for now.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D65543
Summary:
This flag has been deprecated, with an on-by-default warning encouraging
users to explicitly specify whether they mean "all" or ubsan for 5 years
(released in Clang 3.7). Change it to mean what we wanted and
undeprecate it.
Also make the argument to -fsanitize-trap optional, and likewise default
it to 'all', and express the aliases for these flags in the .td file
rather than in code. (Plus documentation updates for the above.)
Reviewers: kcc
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77753
Change clang option -ffp-model=precise, the default, to select ffp-contract=on
The patch caused some problems for PowerPC but ibm has made
adjustments so I am resubmitting this patch. Additionally, Andy looked
at the performance regressions on LNT and it looks like a loop
unrolling decision that could be adjusted.
Reviewers: rjmccall, Andy Kaylor
Differential Revision: https://reviews.llvm.org/D74436
This reverts commit 0a1123eb43.
Want to revert this because it's causing trouble for PowerPC
I also fixed test fp-model.c which was looking for an incorrect error message
This reverts commit 99c5bcbce8.
Change clang option -ffp-model=precise to select ffp-contract=on
Including some small touch-ups to the original commit
Reviewers: rjmccall, Andy Kaylor
Differential Revision: https://reviews.llvm.org/D74436
Remove description of language mode from the language
extensions and add a link to pdf document.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72076
On Darwin, when used for generating a linked binary from a source file
(through an intermediate object file), the driver will invoke `cc1` to
generate a temporary object file. The temporary remark file will now be
emitted next to the object file, which will then be picked up by
`dsymutil` and emitted in the .dSYM bundle.
This is available for all formats except YAML since by default, YAML
doesn't need a section and the remark file will be lost.
Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048
The original patch is modified to set the strictfp IR attribute
explicitly in CodeGen instead of as a side effect of IRBuilder.
In the 2nd attempt to reapply there was a windows lit test fail, the
tests were fixed to use wildcard matching.
Differential Revision: https://reviews.llvm.org/D62731
and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread.
This reverts commits af57dbf12e and e6584b2b7b
Add options to control floating point behavior: trapping and
exception behavior, rounding, and control of optimizations that affect
floating point calculations. More details in UsersManual.rst.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D62731
I noticed that compiling on Windows with -fno-ms-compatibility had the
side effect of defining __GNUC__, along with __GNUG__, __GXX_RTTI__, and
a number of other macros for GCC compatibility. This is undesirable and
causes Chromium to do things like mix __attribute__ and __declspec,
which doesn't work. We should have a positive language option to enable
GCC compatibility features so that we can experiment with
-fno-ms-compatibility on Windows. This change adds -fgnuc-version= to be
that option.
My issue aside, users have, for a long time, reported that __GNUC__
doesn't match their expectations in one way or another. We have
encouraged users to migrate code away from this macro, but new code
continues to be written assuming a GCC-only environment. There's really
nothing we can do to stop that. By adding this flag, we can allow them
to choose their own adventure with __GNUC__.
This overlaps a bit with the "GNUMode" language option from -std=gnu*.
The gnu language mode tends to enable non-conforming behaviors that we'd
rather not enable by default, but the we want to set things like
__GXX_RTTI__ by default, so I've kept these separate.
Helps address PR42817
Reviewed By: hans, nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D68055
llvm-svn: 374449
Commit message below, original caused the sphinx build bot to fail, this
one should fix it.
Create UsersManual section entitled 'Controlling Floating Point
Behavior'
Create a new section for documenting the floating point options. Move
all the floating point options into this section, and add new entries
for the floating point options that exist but weren't previously
described in the UsersManual.
Patch By: mibintc
Differential Revision: https://reviews.llvm.org/D67517
llvm-svn: 372229
Behavior'
Create a new section for documenting the floating point options. Move
all the floating point options into this section, and add new entries
for the floating point options that exist but weren't previously
described in the UsersManual.
Patch By: mibintc
Differential Revision: https://reviews.llvm.org/D67517
llvm-svn: 372180
Rename lang mode flag to -cl-std=clc++/-cl-std=CLC++
or -std=clc++/-std=CLC++.
This aligns with OpenCL C conversion and removes ambiguity
with OpenCL C++.
Differential Revision: https://reviews.llvm.org/D65102
llvm-svn: 367008
Added documentation of C++ for OpenCL mode into Clang
User Manual and Language Extensions document.
Differential Revision: https://reviews.llvm.org/D64418
llvm-svn: 366351
Use -fsave-optimization-record=<format> to specify a different format
than the default, which is YAML.
For now, only YAML is supported.
llvm-svn: 363573
Summary:
This updates all places in documentation that refer to "Mac OS X", "OS X", etc.
to instead use the modern name "macOS" when no specific version number is
mentioned.
If a specific version is mentioned, this attempts to use the OS name at the time
of that version:
* Mac OS X for 10.0 - 10.7
* OS X for 10.8 - 10.11
* macOS for 10.12 - present
Reviewers: JDevlieghere
Subscribers: mgorny, christof, arphaman, cfe-commits, lldb-commits, libcxx-commits, llvm-commits
Tags: #clang, #lldb, #libc, #llvm
Differential Revision: https://reviews.llvm.org/D62654
llvm-svn: 362113
Update the `cl` emulation to support the `/Zc:char8_t[-]?` options as per the
MSVC 2019.1 toolset. These are aliases for `-fchar8_t` and `-fno-char8_t`.
llvm-svn: 361859
Part 1 of CSPGO change in Clang. This includes changes in clang options
and calls to llvm PassManager. Tests will be committed in part2.
This change needs the PassManager change in llvm.
Differential Revision: https://reviews.llvm.org/D54176
llvm-svn: 355331
Summary:
the previous patch (https://reviews.llvm.org/rC346642) has been reverted because of test failure under windows.
So this patch fix the test cfe/trunk/test/CodeGen/code-coverage-filter.c.
Reviewers: marco-c
Reviewed By: marco-c
Subscribers: cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D54600
llvm-svn: 347144
Summary:
These options are taking regex separated by colons to filter files.
- if both are empty then all files are instrumented
- if -fprofile-filter-files is empty then all the filenames matching any of the regex from exclude are not instrumented
- if -fprofile-exclude-files is empty then all the filenames matching any of the regex from filter are instrumented
- if both aren't empty then all the filenames which match any of the regex in filter and which don't match all the regex in filter are instrumented
- this patch is a follow-up of https://reviews.llvm.org/D52033
Reviewers: marco-c, vsk
Reviewed By: marco-c, vsk
Subscribers: cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D52034
llvm-svn: 346642
The clang-cl driver disables access to command line options outside of the
"Core" and "CLOption" sets of command line arguments. This filtering makes it
impossible to pass arguments that are interpreted by the clang driver and not
by either 'cc1' (the frontend) or one of the other tools invoked by the driver.
An example driver-level flag is the '-fno-slp-vectorize' flag, which is
processed by the driver in Clang::ConstructJob and used to set the cc1 flag
"-vectorize-slp". There is no negative cc1 flag or -mllvm flag, so it is not
currently possible to disable the SLP vectorizer from the clang-cl driver.
This change introduces the "/clang:" argument that is available when the
driver mode is set to CL compatibility. This option works similarly to the
"-Xclang" option, except that the option values are processed by the clang
driver rather than by 'cc1'. An example usage is:
clang-cl /clang:-fno-slp-vectorize /O2 test.c
Another example shows how "/clang:" can be used to pass a flag where there is
a conflict between a clang-cl compat option and an overlapping clang driver
option:
clang-cl /MD /clang:-MD /clang:-MF /clang:test_dep_file.dep test.c
In the previous example, the unprefixed /MD selects the DLL version of the msvc
CRT, while the prefixed -MD flag and the -MF flags are used to create a make
dependency file for included headers.
One note about flag ordering: the /clang: flags are concatenated to the end of
the argument list, so in cases where the last flag wins, the /clang: flags
will be chosen regardless of their order relative to other flags on the driver
command line.
Patch by Neeraj K. Singh!
Differential revision: https://reviews.llvm.org/D53457
llvm-svn: 346393
Handle it in the driver and propagate it to cc1
Reviewers: rjmccall, kcc, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D52615
llvm-svn: 346001
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.
The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.
Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.
Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.
llvm-svn: 344199
which was reverted in r337336.
The problem that required a revert was fixed in r337338.
Also added a missing "REQUIRES: x86-registered-target" to one of
the tests.
Original commit message:
> Teach Clang to emit address-significance tables.
>
> By default, we emit an address-significance table on all ELF
> targets when the integrated assembler is enabled. The emission of an
> address-significance table can be controlled with the -faddrsig and
> -fno-addrsig flags.
>
> Differential Revision: https://reviews.llvm.org/D48155
llvm-svn: 337339
Causing multiple failures on sanitizer bots due to TLS symbol errors,
e.g.
/usr/bin/ld: __msan_origin_tls: TLS definition in /home/buildbots/ppc64be-clang-test/clang-ppc64be/stage1/lib/clang/7.0.0/lib/linux/libclang_rt.msan-powerpc64.a(msan.cc.o) section .tbss.__msan_origin_tls mismatches non-TLS reference in /tmp/lit_tmp_0a71tA/mallinfo-3ca75e.o
llvm-svn: 337336
By default, we emit an address-significance table on all ELF
targets when the integrated assembler is enabled. The emission of an
address-significance table can be controlled with the -faddrsig and
-fno-addrsig flags.
Differential Revision: https://reviews.llvm.org/D48155
llvm-svn: 337333
Summary:
In many cases we can't devirtualize
because definition of vtable is not present. Most of the
time it is caused by inline virtual function not beeing
emitted. Forcing emitting of vtable adds a reference of these
inline virtual functions.
Note that GCC was always doing it.
Reviewers: rjmccall, rsmith, amharc, kuhar
Subscribers: llvm-commits, cfe-commits
Differential Revision: https://reviews.llvm.org/D47108
Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 334600
As suggested in the post-commit thread for rL331056, we should match these
clang options with the established vocabulary of the corresponding sanitizer
option. Also, the use of 'strict' is well-known for these kinds of knobs,
and we can improve the descriptive text in the docs.
So this intends to match the logic of D46135 but only change the words.
Matching LLVM commit to match this spelling of the attribute to follow shortly.
Differential Revision: https://reviews.llvm.org/D46236
llvm-svn: 331209
As discussed in the post-commit thread for:
rL330437 ( http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180423/545906.html )
We need a way to opt-out of a float-to-int-to-float cast optimization because too much
existing code relies on the platform-specific undefined result of those casts when the
float-to-int overflows.
The LLVM changes associated with adding this function attribute are here:
rL330947
rL330950
rL330951
Also as suggested, I changed the LLVM doc to mention the specific sanitizer flag that
catches this problem:
rL330958
Differential Revision: https://reviews.llvm.org/D46135
llvm-svn: 331041
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use a function attribute to communicate to the AMDGPU backend.
Differential Revision: https://reviews.llvm.org/D43735
llvm-svn: 328347
Summary:
Currently, assertion-disabled Clang builds emit value names when generating LLVM IR. This is controlled by the `NDEBUG` macro, and is not easily overridable. In order to get IR output containing names from a release build of Clang, the user must manually construct the CC1 invocation w/o the `-discard-value-names` option. This is less than ideal.
For example, Godbolt uses a release build of Clang, and so when asked to emit LLVM IR the result lacks names, making it harder to read. Manually invoking CC1 on Compiler Explorer is not feasible.
This patch adds the driver options `-fdiscard-value-names` and `-fno-discard-value-names` which allow the user to override the default behavior. If neither is specified, the old behavior remains.
Reviewers: erichkeane, aaron.ballman, lebedev.ri
Reviewed By: aaron.ballman
Subscribers: bogner, cfe-commits
Differential Revision: https://reviews.llvm.org/D42887
llvm-svn: 324498
Summary:
Use monospace for option flags in the PCH section, instead of the
italics that were being used previously.
I believe these used to be links, for which single backticks would
have been appropriate, but since they were un-link-ified in
https://reviews.llvm.org/rL275560, I believe monospace is now more
appropriate, and so two backticks are needed.
Test Plan:
Build the `docs-clang-html` target and confirm the options are rendered
using monospace font.
Reviewers: sepavloff, aaron.ballman
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42005
llvm-svn: 322447
Clang is inherently a cross compiler and can generate code for any target
enabled during build. It however requires to specify many parameters in the
invocation, which could be hardcoded during configuration process in the
case of single-target compiler. The purpose of configuration files is to
make specifying clang arguments easier.
A configuration file is a collection of driver options, which are inserted
into command line before other options specified in the clang invocation.
It groups related options together and allows specifying them in simpler,
more flexible and less error prone way than just listing the options
somewhere in build scripts. Configuration file may be thought as a "macro"
that names an option set and is expanded when the driver is called.
Use of configuration files is described in `UserManual.rst`.
Differential Revision: https://reviews.llvm.org/D24933
llvm-svn: 321621
Clang is inherently a cross compiler and can generate code for any target
enabled during build. It however requires to specify many parameters in the
invocation, which could be hardcoded during configuration process in the
case of single-target compiler. The purpose of configuration files is to
make specifying clang arguments easier.
A configuration file is a collection of driver options, which are inserted
into command line before other options specified in the clang invocation.
It groups related options together and allows specifying them in simpler,
more flexible and less error prone way than just listing the options
somewhere in build scripts. Configuration file may be thought as a "macro"
that names an option set and is expanded when the driver is called.
Use of configuration files is described in `UserManual.rst`.
Differential Revision: https://reviews.llvm.org/D24933
llvm-svn: 321587
Summary:
This change allows generalizing pointers in type signatures used for
cfi-icall by enabling the -fsanitize-cfi-icall-generalize-pointers flag.
This works by 1) emitting an additional generalized type signature
metadata node for functions and 2) llvm.type.test()ing for the
generalized type for translation units with the flag specified.
This flag is incompatible with -fsanitize-cfi-cross-dso because it would
require emitting twice as many type hashes which would increase artifact
size.
Reviewers: pcc, eugenis
Reviewed By: pcc
Subscribers: kcc
Differential Revision: https://reviews.llvm.org/D39358
llvm-svn: 317044