Implement options in clang to enable recording the driver command-line
in an ELF section.
Implement a new special named metadata, llvm.commandline, to support
frontends embedding their command-line options in IR/ASM/ELF.
This differs from the GCC implementation in some key ways:
* In GCC there is only one command-line possible per compilation-unit,
in LLVM it mirrors llvm.ident and multiple are allowed.
* In GCC individual options are separated by NULL bytes, in LLVM entire
command-lines are separated by NULL bytes. The advantage of the GCC
approach is to clearly delineate options in the face of embedded
spaces. The advantage of the LLVM approach is to support merging
multiple command-lines unambiguously, while handling embedded spaces
with escaping.
Differential Revision: https://reviews.llvm.org/D54487
Clang Differential Revision: https://reviews.llvm.org/D54489
llvm-svn: 349155
It is faster to directly call the ObjC runtime for methods such as alloc/allocWithZone instead of sending a message to those functions.
This patch adds support for converting messages to alloc/allocWithZone to their equivalent runtime calls.
Tests included for the positive case of applying this transformation, negative tests that we ensure we only convert "alloc" to objc_alloc, not "alloc2", and also a driver test to ensure we enable this only for supported runtime versions.
Reviewed By: rjmccall
https://reviews.llvm.org/D55349
llvm-svn: 348687
When debugging a boost build with a modified
version of Clang, I discovered that the PTH implementation
stores TokenKind in 8 bits. However, we currently have 368
TokenKinds.
The result is that the value gets truncated and the wrong token
gets picked up when including PTH files. It seems that this will
go wrong every time someone uses a token that uses the 9th bit.
Upon asking on IRC, it was brought up that this was a highly
experimental features that was considered a failure. I discovered
via googling that BoostBuild (mostly Boost.Math) is the only user of
this
feature, using the CC1 flag directly. I believe that this can be
transferred over to normal PCH with minimal effort:
https://github.com/boostorg/build/issues/367
Based on advice on IRC and research showing that this is a nearly
completely unused feature, this patch removes it entirely.
Note: I considered leaving the build-flags in place and making them
emit an error/warning, however since I've basically identified and
warned the only user, it seemed better to just remove them.
Differential Revision: https://reviews.llvm.org/D54547
Change-Id: If32744275ef1f585357bd6c1c813d96973c4d8d9
llvm-svn: 348266
When the global new and delete operators aren't declared, Clang
provides and implicit declaration, but this declaration currently
always uses the default visibility. This is a problem when the
C++ library itself is being built with non-default visibility because
the implicit declaration will force the new and delete operators to
have the default visibility unlike the rest of the library.
The existing workaround is to use assembly to enforce the visiblity:
https://fuchsia.googlesource.com/zircon/+/master/system/ulib/zxcpp/new.cpp#108
but that solution is not always available, e.g. in the case of of
libFuzzer which is using an internal version of libc++ that's also built
with -fvisibility=hidden where the existing behavior is causing issues.
This change introduces a new option -fvisibility-global-new-delete-hidden
which makes the implicit declaration of the global new and delete
operators hidden.
Differential Revision: https://reviews.llvm.org/D53787
llvm-svn: 348234
In earlier patches regarding AnalyzerOptions, a lot of effort went into
gathering all config options, and changing the interface so that potential
misuse can be eliminited.
Up until this point, AnalyzerOptions only evaluated an option when it was
querried. For example, if we had a "-no-false-positives" flag, AnalyzerOptions
would store an Optional field for it that would be None up until somewhere in
the code until the flag's getter function is called.
However, now that we're confident that we've gathered all configs, we can
evaluate off of them before analysis, so we can emit a error on invalid input
even if that prticular flag will not matter in that particular run of the
analyzer. Another very big benefit of this is that debug.ConfigDumper will now
show the value of all configs every single time.
Also, almost all options related class have a similar interface, so uniformity
is also a benefit.
The implementation for errors on invalid input will be commited shorty.
Differential Revision: https://reviews.llvm.org/D53692
llvm-svn: 348031
This patch passes -fdebug-prefix-map (a feature for renaming source
paths in the debug info) through to the per-module codegen options and
adds the debug prefix map to the module hash.
<rdar://problem/46045865>
Differential Revision: https://reviews.llvm.org/D55037
llvm-svn: 347926
Summary:
the previous patch (https://reviews.llvm.org/rC346642) has been reverted because of test failure under windows.
So this patch fix the test cfe/trunk/test/CodeGen/code-coverage-filter.c.
Reviewers: marco-c
Reviewed By: marco-c
Subscribers: cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D54600
llvm-svn: 347144
Summary:
Experience has shown that the functionality is useful. It makes linking
optimized clang with debug info for me a lot faster, 20s to 13s. The
type merging phase of PDB writing goes from 10s to 3s.
This removes the LLVM cl::opt and replaces it with a metadata flag.
After this change, users can do the following to use ghash:
- add -gcodeview-ghash to compiler flags
- replace /DEBUG with /DEBUG:GHASH in linker flags
Reviewers: zturner, hans, thakis, takuto.ikuta
Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D54370
llvm-svn: 347072
This unfortunately results in a substantial breaking change when
switching to C++20, but it's not yet clear what / how much we should
do about that. We may want to add a compatibility conversion from
u8 string literals to const char*, similar to how C++98 provided a
compatibility conversion from string literals to non-const char*,
but that's not handled by this patch.
The feature can be disabled in C++20 mode with -fno-char8_t.
llvm-svn: 346892
The DWARF5 specification says(Appendix F.1):
"The sections that do not require relocation, however, can be
written to the relocatable object (.o) file but ignored by the
linker or they can be written to a separate DWARF object (.dwo)
file that need not be accessed by the linker."
The first part describes a single file split DWARF feature and there
is no way to trigger this behavior atm.
Fortunately, no many changes are required to keep *.dwo sections
in a .o, the patch does that.
Differential revision: https://reviews.llvm.org/D52296
llvm-svn: 346837
Summary:
This saves a lot of relocations in optimized object files (at the cost
of some cost/increase in linked executable bytes), but gold's 32 bit
gdb-index support has a bug (
https://sourceware.org/bugzilla/show_bug.cgi?id=21894 ) so we can't
switch to this unconditionally. (& even if it weren't for that bug, one
might argue that some users would want to optimize in one direction or
the other - prioritizing object size or linked executable size)
Differential Revision: https://reviews.llvm.org/D54243
llvm-svn: 346789
Summary:
These options are taking regex separated by colons to filter files.
- if both are empty then all files are instrumented
- if -fprofile-filter-files is empty then all the filenames matching any of the regex from exclude are not instrumented
- if -fprofile-exclude-files is empty then all the filenames matching any of the regex from filter are instrumented
- if both aren't empty then all the filenames which match any of the regex in filter and which don't match all the regex in filter are instrumented
- this patch is a follow-up of https://reviews.llvm.org/D52033
Reviewers: marco-c, vsk
Reviewed By: marco-c, vsk
Subscribers: cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D52034
llvm-svn: 346642
Summary:
This CL adds /Zc:DllexportInlines flag to clang-cl.
When Zc:DllexportInlines- is specified, inline class member function is not exported if the function does not have local static variables.
By not exporting inline function, code for those functions are not generated and that reduces both compile time and obj size. Also this flag does not import inline functions from dllimported class if the function does not have local static variables.
On my 24C48T windows10 machine, build performance of chrome target in chromium repository is like below.
These stats are come with 'target_cpu="x86" enable_nacl = false is_component_build=true dcheck_always_on=true` build config and applied
* https://chromium-review.googlesource.com/c/chromium/src/+/1212379
* https://chromium-review.googlesource.com/c/v8/v8/+/1186017
Below stats were taken with this patch applied on a05115cd4c
| config | build time | speedup | build dir size |
| with patch, PCH on, debug | 1h10m0s | x1.13 | 35.6GB |
| without patch, PCH on, debug | 1h19m17s | | 49.0GB |
| with patch, PCH off, debug | 1h15m45s | x1.16 | 33.7GB |
| without patch, PCH off, debug | 1h28m10s | | 52.3GB |
| with patch, PCH on, release | 1h13m13s | x1.22 | 26.2GB |
| without patch, PCH on, release | 1h29m57s | | 37.5GB |
| with patch, PCH off, release | 1h23m38s | x1.32 | 23.7GB |
| without patch, PCH off, release | 1h50m50s | | 38.7GB |
This patch reduced obj size and the number of exported symbols largely, that improved link time too.
e.g. link time stats of blink_core.dll become like below
| | cold disk cache | warm disk cache |
| with patch, PCH on, debug | 71s | 30s |
| without patch, PCH on, debug | 111s | 48s |
This patch's implementation is based on Nico Weber's patch. I modified to support static local variable, added tests and took stats.
Bug: https://bugs.llvm.org/show_bug.cgi?id=33628
Reviewers: hans, thakis, rnk, javed.absar
Reviewed By: hans
Subscribers: kristof.beyls, smeenai, dschuff, probinson, cfe-commits, eraman
Differential Revision: https://reviews.llvm.org/D51340
llvm-svn: 346069
Handle it in the driver and propagate it to cc1
Reviewers: rjmccall, kcc, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D52615
llvm-svn: 346001
target/teams/distribute regions.
Target/teams/distribute regions exist for all the time the kernel is
executed. Thus, if the variable is declared in their context and then
escape it, we can allocate global memory statically instead of
allocating it dynamically.
Patch captures all the globalized variables in target/teams/distribute
contexts, merges them into the records, one per each target region.
Those records are then joined into the union, one per compilation unit
(to save the global memory). Those units are organized into
2 x dimensional arrays, where the first dimension is
the number of blocks per SM and the second one is the number of SMs.
Runtime functions manage this global memory space between the executing
teams.
llvm-svn: 345978
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
of only 'break'.
We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
the outer case.
I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.
Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu
Differential Revision: https://reviews.llvm.org/D53950
llvm-svn: 345882
We haven't supported compiling ObjC1 for a long time (and never will again), so
there isn't any reason to keep these separate. This patch replaces
LangOpts::ObjC1 and LangOpts::ObjC2 with LangOpts::ObjC.
Differential revision: https://reviews.llvm.org/D53547
llvm-svn: 345637
Default property value 'true' preserves current behavior. Value 'false' can be
used to create VFS "root", file system that gives better control over which
files compiler can use during compilation as there are no unpredictable
accesses to real file system.
Non-fallthrough use case changes how we treat multiple VFS overlay
files. Instead of all of them being at the same level just above a real
file system, now they are nested and subsequent overlays can refer to
files in previous overlays.
Change is done both in LLVM and Clang, corresponding LLVM commit is r345431.
rdar://problem/39465552
Reviewers: bruno, benlangmuir
Reviewed By: bruno
Subscribers: dexonsmith, cfe-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D50539
llvm-svn: 345432
Summary:
- Add `UETT_PreferredAlignOf` to account for the difference between `__alignof` and `alignof`
- `AlignOfType` now returns ABI alignment instead of preferred alignment iff clang-abi-compat > 7, and one uses _Alignof or alignof
Patch by Nicole Mazzuca!
Differential Revision: https://reviews.llvm.org/D53207
llvm-svn: 345419
Add a new driver level flag `-fcf-runtime-abi=` that allows one to specify the
runtime ABI for CoreFoundation. This controls the language interoperability.
In particular, this is relevant for generating the CFConstantString classes
(primarily through the `__builtin___CFStringMakeConstantString` builtin) which
construct a reference to the "CFObject"'s `isa` field. This type differs
between swift 4.1 and 4.2+.
Valid values for the new option include:
- objc [default behaviour] - enable ObjectiveC interoperability
- swift-4.1 - enable interoperability with swift 4.1
- swift-4.2 - enable interoperability with swift 4.2
- swift-5.0 - enable interoperability with swift 5.0
- swift [alias] - target the latest swift ABI
Furthermore, swift 4.2+ changed the layout for the CFString when building
CoreFoundation *without* ObjectiveC interoperability. In such a case, a field
was added to the CFObject base type changing it from: <{ const int*, int }> to
<{ uintptr_t, uintptr_t, uint64_t }>.
In swift 5.0, the CFString type will be further adjusted to change the length
from a uint32_t on everything but BE LP64 targets to uint64_t.
Note that the default behaviour for clang remains unchanged and the new layout
must be explicitly opted into via `-fcf-runtime-abi=swift*`.
llvm-svn: 345222
This patch exposes functionality added in rL344723 to the Clang driver/frontend
as a flag and adds appropriate metadata.
Driver tests pass:
```
ninja check-clang-driver
-snip-
Expected Passes : 472
Expected Failures : 3
Unsupported Tests : 65
```
Odd failure in CodeGen tests but unrelated to this:
```
ninja check-clang-codegen
-snip-
/SourceCache/llvm-trunk-8.0/tools/clang/test/CodeGen/builtins-wasm.c:87:10:
error: cannot compile this builtin function yet
-snip-
Failing Tests (1):
Clang :: CodeGen/builtins-wasm.c
Expected Passes : 1250
Expected Failures : 2
Unsupported Tests : 120
Unexpected Failures: 1
```
Original commit:
[X86] Support for the mno-tls-direct-seg-refs flag
Allows to disable direct TLS segment access (%fs or %gs). GCC supports a
similar flag, it can be useful in some circumstances, e.g. when a thread
context block needs to be updated directly from user space. More info and
specific use cases: https://bugs.llvm.org/show_bug.cgi?id=16145
Patch by nruslan (Ruslan Nikolaev).
Differential Revision: https://reviews.llvm.org/D53102
llvm-svn: 344739
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.
The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.
Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.
Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.
llvm-svn: 344199
This patch moves the virtual file system form clang to llvm so it can be
used by more projects.
Concretely the patch:
- Moves VirtualFileSystem.{h|cpp} from clang/Basic to llvm/Support.
- Moves the corresponding unit test from clang to llvm.
- Moves the vfs namespace from clang::vfs to llvm::vfs.
- Formats the lines affected by this change, mostly this is the result of
the added llvm namespace.
RFC on the mailing list:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/126657.html
Differential revision: https://reviews.llvm.org/D52783
llvm-svn: 344140
Summary:
gcc defines macros such as __code_model_small_ based on the user passed command line flag -mcmodel. clang accepts a flag with the same name and similar effects, but does not generate any macro that the user can use. This cl narrows the gap between gcc and clang behaviour.
However, achieving full compatibility with gcc is not trivial: The set of valid values for mcmodel in gcc and clang are not equal. Also, gcc defines different macros for different architectures. In this cl, we only tackle an easy part of the problem and define the macro only for x64 architecture. When the user does not specify a mcmodel, the macro for small code model is produced, as is the case with gcc.
Reviewers: compnerd, MaskRay
Reviewed By: MaskRay
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D52920
llvm-svn: 344000
Summary: Its job is covered by -ast-dump. The option is rarely used and lacks many AST nodes which will lead to llvm_unreachable() crash.
Reviewers: rsmith, arphaman
Reviewed By: rsmith
Subscribers: jfb, cfe-commits
Differential Revision: https://reviews.llvm.org/D52529
llvm-svn: 343660
This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original
options as aliases. When -fgpu-rdc is off,
clang will assume the device code in each translation unit does not call
external functions except those in the device library, therefore it is possible
to compile the device code in each translation unit to self-contained kernels
and embed them in the host object, so that the host object behaves like
usual host object which can be linked by lld.
The benefits of this feature is: 1. allow users to create static libraries which
can be linked by host linker; 2. amortized device code linking time.
This patch modifies HIP action builder to insert actions for linking device
code and generating HIP fatbin, and pass HIP fatbin to host backend action.
It extracts code for constructing command for generating HIP fatbin as
a function so that it can be reused by early finalization. It also modifies
codegen of HIP host constructor functions to embed the device fatbin
when it is available.
Differential Revision: https://reviews.llvm.org/D52377
llvm-svn: 343611
There are a few leftovers of rC343147 that are not (\w+)\.begin but in
the form of ([-[:alnum:]>.]+)\.begin or spanning two lines. Change them
to use the container form in this commit. The 12 occurrences have been
inspected manually for safety.
llvm-svn: 343425
Dumping graphs instead of opening them is often very useful,
e.g. for transfer or converting to SVG.
Basic sanity check for generated exploded graphs.
Differential Revision: https://reviews.llvm.org/D52637
llvm-svn: 343352
With clang-cl, when the user specifies /Yc or /Yu without a filename
the compiler uses a #pragma hdrstop in the main source file to
determine the end of the PCH. If a header is specified with /Yc or
/Yu #pragma hdrstop has no effect.
The optional #pragma hdrstop filename argument is not yet supported.
Differential Revision: https://reviews.llvm.org/D51391
llvm-svn: 341963
Boilerplate code for using KMSAN instrumentation in Clang.
We add a new command line flag, -fsanitize=kernel-memory, with a
corresponding SanitizerKind::KernelMemory, which, along with
SanitizerKind::Memory, maps to the memory_sanitizer feature.
KMSAN is only supported on x86_64 Linux.
It's incompatible with other sanitizers, but supports code coverage
instrumentation.
llvm-svn: 341641
Ubigraph project has been dead since about 2008, and to the best of my
knowledge, no one was using it.
Previously, I wasn't able to launch the existing binary at all.
Differential Revision: https://reviews.llvm.org/D51655
llvm-svn: 341601
Load Hardening.
Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.
Most of the churn here is adding the IR attribute. I talked about this
Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.
While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.
This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.
Differential Revision: https://reviews.llvm.org/D51157
llvm-svn: 341363
Summary:
Added option -gline-directives-only to support emission of the debug directives
only. It behaves very similar to -gline-tables-only, except that it sets
llvm debug info emission kind to
llvm::DICompileUnit::DebugDirectivesOnly.
Reviewers: echristo
Subscribers: aprantl, fedor.sergeev, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D51177
llvm-svn: 341212
constants by default when there is no optimization.
GCC's option -fno-keep-static-consts can be used to not emit
unused static constants.
In Clang, since default behavior does not keep unused static constants,
-fkeep-static-consts can be used to emit these if required. This could be
useful for producing identification strings like SVN identifiers
inside the object file even though the string isn't used by the program.
Differential Revision: https://reviews.llvm.org/D40925
llvm-svn: 340439
This commit adds the flag -fno-c++-static-destructors and the attributes
[[clang::no_destroy]] and [[clang::always_destroy]]. no_destroy specifies that a
specific static or thread duration variable shouldn't have it's destructor
registered, and is the default in -fno-c++-static-destructors mode.
always_destroy is the opposite, and is the default in -fc++-static-destructors
mode.
A variable whose destructor is disabled (either because of
-fno-c++-static-destructors or [[clang::no_destroy]]) doesn't count as a use of
the destructor, so we don't do any access checking or mark it referenced. We
also don't emit -Wexit-time-destructors for these variables.
rdar://21734598
Differential revision: https://reviews.llvm.org/D50994
llvm-svn: 340306
This changes the current default behavior (from emitting pubnames by
default, to not emitting them by default) & moves to matching GCC's
behavior* with one significant difference: -gno(-gnu)-pubnames disables
pubnames even in the presence of -gsplit-dwarf (though -gsplit-dwarf
still by default enables -ggnu-pubnames). This allows users to disable
pubnames (& the new DWARF5 accelerated access tables) when they might
not be worth the size overhead.
* GCC's behavior is that -ggnu-pubnames and -gpubnames override each
other, and that -gno-gnu-pubnames and -gno-pubnames act as synonyms and
disable either kind of pubnames if they come last. (eg: -gpubnames
-gno-gnu-pubnames causes no pubnames (neither gnu or standard) to be
emitted)
llvm-svn: 340206
- Add a command line options -msign-return-address to enable return address
signing
- Armv8.3a added instructions to sign the return address to help mitigate
against ROP attacks
- This patch adds command line options to generate function attributes that
signal to the back whether return address signing instructions should be
added
Differential revision: https://reviews.llvm.org/D49793
llvm-svn: 340019
LLVM triple normalization is handling "unknown" and empty components
differently; for example given "x86_64-unknown-linux-gnu" and
"x86_64-linux-gnu" which should be equivalent, triple normalization
returns "x86_64-unknown-linux-gnu" and "x86_64--linux-gnu". autoconf's
config.sub returns "x86_64-unknown-linux-gnu" for both
"x86_64-linux-gnu" and "x86_64-unknown-linux-gnu". This changes the
triple normalization to behave the same way, replacing empty triple
components with "unknown".
This addresses PR37129.
Differential Revision: https://reviews.llvm.org/D50219
llvm-svn: 339294
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details : https://lkml.org/lkml/2018/4/4/601
GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.
This feature is implemented in as the function attribute
"null-pointer-is-valid"="true".
This CL only adds the attribute on the function.
It also strips "nonnull" attributes from function arguments but
keeps the related warnings unchanged.
Corresponding LLVM change rL336613 already updated the
optimizations to not treat null pointer dereferencing
as undefined if the attribute is present.
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: jyknight
Subscribers: drinkcat, xbolva00, cfe-commits
Differential Revision: https://reviews.llvm.org/D47894
llvm-svn: 337433
which was reverted in r337336.
The problem that required a revert was fixed in r337338.
Also added a missing "REQUIRES: x86-registered-target" to one of
the tests.
Original commit message:
> Teach Clang to emit address-significance tables.
>
> By default, we emit an address-significance table on all ELF
> targets when the integrated assembler is enabled. The emission of an
> address-significance table can be controlled with the -faddrsig and
> -fno-addrsig flags.
>
> Differential Revision: https://reviews.llvm.org/D48155
llvm-svn: 337339
Causing multiple failures on sanitizer bots due to TLS symbol errors,
e.g.
/usr/bin/ld: __msan_origin_tls: TLS definition in /home/buildbots/ppc64be-clang-test/clang-ppc64be/stage1/lib/clang/7.0.0/lib/linux/libclang_rt.msan-powerpc64.a(msan.cc.o) section .tbss.__msan_origin_tls mismatches non-TLS reference in /tmp/lit_tmp_0a71tA/mallinfo-3ca75e.o
llvm-svn: 337336
By default, we emit an address-significance table on all ELF
targets when the integrated assembler is enabled. The emission of an
address-significance table can be controlled with the -faddrsig and
-fno-addrsig flags.
Differential Revision: https://reviews.llvm.org/D48155
llvm-svn: 337333
Summary: Add a flag `-fno-digraphs` to disable digraphs in the lexer, similar to `-fno-operator-names` which disables alternative names for C++ operators.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: rsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D48266
llvm-svn: 337232
Implement support for MS-style PCH through headers.
This enables support for /Yc and /Yu where the through header is either
on the command line or included in the source. It replaces the current
support the requires the header also be specified with /FI.
This change adds a -cc1 option -pch-through-header that is used to either
start or stop compilation during PCH create or use.
When creating a PCH, the compilation ends after compilation of the through
header.
When using a PCH, tokens are skipped until after the through header is seen.
Patch By: mikerice
Differential Revision: https://reviews.llvm.org/D46652
llvm-svn: 336379
- Rename the `-fsame-fbits` flag to `-fpadding-on-unsigned-fixed-point`
- Move the flag from a driver option to a cc1 option
- Rename the `SameFBits` member in TargetInfo to `PaddingOnUnsignedFixedPoint`
- Updated descriptions
Differential Revision: https://reviews.llvm.org/D48727
llvm-svn: 335993
With MSVC, PCH files are created along with an object file that needs to
be linked into the final library or executable. That object file
contains the code generated when building the headers. In particular, it
will include definitions of inline dllexport functions, and because they
are emitted in this object file, other files using the PCH do not need
to emit them. See the bug for an example.
This patch makes clang-cl match MSVC's behaviour in this regard, causing
significant compile-time savings when building dlls using precompiled
headers.
For example, in a 64-bit optimized shared library build of Chromium with
PCH, it reduces the binary size and compile time of
stroke_opacity_custom.obj from 9315564 bytes to 3659629 bytes and 14.6
to 6.63 s. The wall-clock time of building blink_core.dll goes from
38m41s to 22m33s. ("user" time goes from 1979m to 1142m).
Differential Revision: https://reviews.llvm.org/D48426
llvm-svn: 335466
Since we are now producing a summary also for regular LTO builds, we
need to run the NameAnonGlobals pass in those cases as well (the
summary cannot handle anonymous globals).
See https://reviews.llvm.org/D34156 for details on the original change.
This reverts commit 6c9ee4a4a438a8059aacc809b2dd57128fccd6b3.
llvm-svn: 335385
This is breaking a couple of buildbots. We need to run the
NameAnonGlobal pass for regular LTO now as well (since we're producing a
summary). I'll post a separate patch for review to make this happen and
then re-commit.
This reverts commit c0759b7b1f4a81ff9021b952aa38a222d5fa4dfd.
llvm-svn: 335291
Summary:
With D33921, we gained the ability to have module summaries in regular
LTO modules without triggering ThinLTO compilation. Module summaries in
regular LTO allow garbage collection (dead stripping) before LTO
compilation and thus open up additional optimization opportunities.
This patch enables summary emission in regular LTO for all targets
except ld64-based ones (which use the legacy LTO API).
Reviewers: pcc, tejohnson, mehdi_amini
Subscribers: inglorion, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D34156
llvm-svn: 335284
This diff includes the logic for setting the precision bits for each primary fixed point type in the target info and logic for initializing a fixed point literal.
Fixed point literals are declared using the suffixes
```
hr: short _Fract
uhr: unsigned short _Fract
r: _Fract
ur: unsigned _Fract
lr: long _Fract
ulr: unsigned long _Fract
hk: short _Accum
uhk: unsigned short _Accum
k: _Accum
uk: unsigned _Accum
```
Errors are also thrown for illegal literal values
```
unsigned short _Accum u_short_accum = 256.0uhk; // expected-error{{the integral part of this literal is too large for this unsigned _Accum type}}
```
Differential Revision: https://reviews.llvm.org/D46915
llvm-svn: 335148
Summary:
In many cases we can't devirtualize
because definition of vtable is not present. Most of the
time it is caused by inline virtual function not beeing
emitted. Forcing emitting of vtable adds a reference of these
inline virtual functions.
Note that GCC was always doing it.
Reviewers: rjmccall, rsmith, amharc, kuhar
Subscribers: llvm-commits, cfe-commits
Differential Revision: https://reviews.llvm.org/D47108
Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 334600
Summary:
This kind of functionality is useful to other project apart from clang.
LLDB works with version numbers a lot, but it does not have a convenient
abstraction for this. Moving this class to a lower level library allows
it to be freely used within LLDB.
Since this class is used in a lot of places in clang, and it used to be
in the clang namespace, it seemed appropriate to add it to the list of
adopted classes in LLVM.h to avoid prefixing all uses with "llvm::".
Also, I didn't find any tests specific for this class, so I wrote a
couple of quick ones for the more interesting bits of functionality.
Reviewers: zturner, erik.pilkington
Subscribers: mgorny, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D47887
llvm-svn: 334399
The windows-msvc target is used for MSVC ABI compatibility, including
the exceptions model. It doesn't make sense to pair a windows-msvc
target with a non-MSVC exception model. This would previously cause an
assertion failure; explicitly error out for it in the frontend instead.
This also allows us to reduce the matrix of target/exception models a
bit (see the modified tests), and we can possibly simplify some of the
personality code in a follow-up.
Differential Revision: https://reviews.llvm.org/D47853
llvm-svn: 334243
// Primary fixed point types
signed short _Accum s_short_accum;
signed _Accum s_accum;
signed long _Accum s_long_accum;
unsigned short _Accum u_short_accum;
unsigned _Accum u_accum;
unsigned long _Accum u_long_accum;
// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;
This diff only allows for declaration of the fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches. The saturated versions of these types and the equivalent _Fract types will also be added in future patches.
The tests included are for asserting that we can declare these types.
Fixed the test that was failing by not checking for dso_local on some
targets.
Differential Revision: https://reviews.llvm.org/D46084
llvm-svn: 333923
```
// Primary fixed point types
signed short _Accum s_short_accum;
signed _Accum s_accum;
signed long _Accum s_long_accum;
unsigned short _Accum u_short_accum;
unsigned _Accum u_accum;
unsigned long _Accum u_long_accum;
// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;
```
This diff only allows for declaration of the fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches. The saturated versions of these types and the equivalent `_Fract` types will also be added in future patches.
The tests included are for asserting that we can declare these types.
Differential Revision: https://reviews.llvm.org/D46084
llvm-svn: 333814
Add the ability to dump compiler option-related information to a JSON file via the -compiler-options-dump option. Specifically, it dumps the features/extensions lists -- however, this output could be extended to other information should it be useful. In order to support features and extensions, I moved them into a .def file so that we could build the various lists we care about from them without a significant increase in maintenance burden.
llvm-svn: 333653
Codebases that need to be compatible with the Microsoft ABI can pass
this flag to avoid issues caused by the lack of a fixed ABI for
incomplete member pointers.
Differential Revision: https://reviews.llvm.org/D47503
llvm-svn: 333498
The return value of sys::getDefaultTargetTriple, which is derived from
-DLLVM_DEFAULT_TRIPLE, is used to construct tool names, default target,
and in the future also to control the search path directly; as such it
should be used textually, without interpretation by LLVM.
Normalization of this value may lead to unexpected results, for example
if we configure LLVM with -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu,
normalization will transform that value to x86_64--linux-gnu. Driver will
use that value to search for tools prefixed with x86_64--linux-gnu- which
may be confusing. This is also inconsistent with the behavior of the
--target flag which is taken as-is without any normalization and overrides
the value of LLVM_DEFAULT_TARGET_TRIPLE.
Users of sys::getDefaultTargetTriple already perform their own
normalization as needed, so this change shouldn't impact existing logic.
Differential Revision: https://reviews.llvm.org/D47153
llvm-svn: 333307
Currently getting such completions requires source correction, reparsing
and calling completion again. And if it shows no results and rollback is
required then it costs one more reparse.
With this change it's possible to get all results which can be later
filtered to split changes which require correction.
Differential Revision: https://reviews.llvm.org/D41537
llvm-svn: 333272
if `-fopenmp-simd` is specified alone, `_OPENMP` macro should not be
defined. If `-fopenmp-simd` is specified along with the `-fopenmp`,
`_OPENMP` macro should be defined with the value `201511`.
llvm-svn: 332852
The return value of sys::getDefaultTargetTriple, which is derived from
-DLLVM_DEFAULT_TRIPLE, is used to construct tool names, default target,
and in the future also to control the search path directly; as such it
should be used textually, without interpretation by LLVM.
Normalization of this value may lead to unexpected results, for example
if we configure LLVM with -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu,
normalization will transform that value to x86_64--linux-gnu. Driver will
use that value to search for tools prefixed with x86_64--linux-gnu- which
may be confusing. This is also inconsistent with the behavior of the
--target flag which is taken as-is without any normalization and overrides
the value of LLVM_DEFAULT_TARGET_TRIPLE.
Users of sys::getDefaultTargetTriple already perform their own
normalization as needed, so this change shouldn't impact existing logic.
Differential Revision: https://reviews.llvm.org/D46910
llvm-svn: 332750
in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html.
The -mibt feature flag is being removed, and the -fcf-protection
option now also defines a CET macro and causes errors when used
on non-X86 targets, while X86 targets no longer check for -mibt
and -mshstk to determine if -fcf-protection is supported. -mshstk
is now used only to determine availability of shadow stack intrinsics.
Comes with an LLVM patch (D46882).
Patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D46881
llvm-svn: 332704
The PS4 requires clang ABI version 6 for compatibility reasons. This change forces this and if the user specifies a different version when the PS4 target is specified, the compiler emits a warning that the specified version is being ignored.
Reviewers: probinson
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D46767
llvm-svn: 332160
The option enables use of 32-bit pointers for accessing
const/local/shared memory. The feature is disabled by default.
Differential Revision: https://reviews.llvm.org/D46148
llvm-svn: 331938
Restrict the following keywords in the OpenCL C++ language mode,
according to Sections 2.2 & 2.9 of the OpenCL C++ 1.0 Specification.
- dynamic_cast
- typeid
- register (already restricted in OpenCL C, update the diagnostic)
- thread_local
- exceptions (try/catch/throw)
- access qualifiers read_only, write_only, read_write
Support the `__global`, `__local`, `__constant`, `__private`, and
`__generic` keywords in OpenCL C++. Leave the unprefixed address
space qualifiers such as global available, i.e., do not mark them as
reserved keywords in OpenCL C++. libclcxx provides explicit address
space pointer classes such as `global_ptr` and `global<T>` that are
implemented using the `__`-prefixed qualifiers.
Differential Revision: https://reviews.llvm.org/D46022
llvm-svn: 331874
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
Generate a printable OpenCL language version number in a single place
and select between the OpenCL C or OpenCL C++ version accordingly.
Differential Revision: https://reviews.llvm.org/D46382
llvm-svn: 331766
Summary:
Passes down the necessary code ge options to the LTO Config to enable
-fdiagnostics-show-hotness and -fsave-optimization-record in the ThinLTO
backend for a distributed build.
Also, remove warning about not having PGO when the input is IR.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D46464
llvm-svn: 331592
This replicates 'cl.exe' behavior and allows for both preprocessor output and
dependency information to be extraced with a single compiler invocation.
This is especially useful for compiler caching with tools like Mozilla's sccache.
See: https://github.com/mozilla/sccache/issues/246
Patch By: fxb
Differential Revision: https://reviews.llvm.org/D46394
llvm-svn: 331533
enabled for the host.
If the compilation for the host enables C++ exceptions, but they are not
supported by the device, we still need to allow the code with the
exception handling constructs outside of the target regions.
llvm-svn: 331372
This is not yet part of any C++ working draft, and so is controlled by the flag
-fchar8_t rather than a -std= flag. (The GCC implementation is controlled by a
flag with the same name.)
This implementation is experimental, and will be removed or revised
substantially to match the proposal as it makes its way through the C++
committee.
llvm-svn: 331244
As suggested in the post-commit thread for rL331056, we should match these
clang options with the established vocabulary of the corresponding sanitizer
option. Also, the use of 'strict' is well-known for these kinds of knobs,
and we can improve the descriptive text in the docs.
So this intends to match the logic of D46135 but only change the words.
Matching LLVM commit to match this spelling of the attribute to follow shortly.
Differential Revision: https://reviews.llvm.org/D46236
llvm-svn: 331209
As discussed in the post-commit thread for:
rL330437 ( http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180423/545906.html )
We need a way to opt-out of a float-to-int-to-float cast optimization because too much
existing code relies on the platform-specific undefined result of those casts when the
float-to-int overflows.
The LLVM changes associated with adding this function attribute are here:
rL330947
rL330950
rL330951
Also as suggested, I changed the LLVM doc to mention the specific sanitizer flag that
catches this problem:
rL330958
Differential Revision: https://reviews.llvm.org/D46135
llvm-svn: 331041
HIP is a language similar to CUDA (https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md ).
The language syntax is very similar, which allows a hip program to be compiled as a CUDA program by Clang. The main difference
is the host API. HIP has a set of vendor neutral host API which can be implemented on different platforms. Currently there is open source
implementation of HIP runtime on amdgpu target (https://github.com/ROCm-Developer-Tools/HIP).
This patch adds support of input kind and language standard hip.
When hip file is compiled, both LangOpts.CUDA and LangOpts.HIP is turned on. This allows compilation of hip program as CUDA
in most cases and only special handling of hip program is needed LangOpts.HIP is checked.
This patch also adds support of kernel launching of HIP program using HIP host API.
When -x hip is not specified, there is no behaviour change for CUDA.
Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D44984
llvm-svn: 330790
Summary:
By default Clang outputs its version (including git commit hash, in
case of trunk builds) into object and assembly files. It might be
useful to have an option to disable this, especially for debugging
purposes.
This patch implements new command line flags -Qn and -Qy (the names
are chosen for compatibility with GCC). -Qn disables output of
the 'llvm.ident' metadata string and the 'producer' debug info. -Qy
(enabled by default) does the opposite.
Reviewers: faisalv, echristo, aprantl
Reviewed By: aprantl
Subscribers: aprantl, cfe-commits, JDevlieghere, rogfer01
Differential Revision: https://reviews.llvm.org/D45255
llvm-svn: 330442
This implements support for the previously ignored flag
`-falign-functions`. This allows the frontend to request alignment on
function definitions in the translation unit where they are not
explicitly requested in code. This is compatible with the GCC behaviour
and the ICC behaviour.
The scalar value passed to `-falign-functions` aligns functions to a
power-of-two boundary. If flag is used, the functions are aligned to
16-byte boundaries. If the scalar is specified, it must be an integer
less than or equal to 4096. If the value is not a power-of-two, the
driver will round it up to the nearest power of two.
llvm-svn: 330378
Summary:
A clang builtin for xray typed events. Differs from
__xray_customevent(...) by the presence of a type tag that is vended by
compiler-rt in typical usage. This allows xray handlers to expand logged
events with their type description and plugins to process traced events
based on type.
This change depends on D45633 for the intrinsic definition.
Reviewers: dberris, pelikan, rnk, eizan
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D45716
llvm-svn: 330220
register destructor functions annotated with __attribute__((destructor))
using __cxa_atexit or atexit.
Register destructor functions annotated with __attribute__((destructor))
calling __cxa_atexit in a synthesized constructor function instead of
emitting references to the functions in a special section.
The primary reason for adding this option is that we are planning to
deprecate the __mod_term_funcs section on Darwin in the future. This
feature is enabled by default only on Darwin. Users who do not want this
can use command line option 'fno_register_global_dtors_with_atexit' to
disable it.
rdar://problem/33887655
Differential Revision: https://reviews.llvm.org/D45578
llvm-svn: 330199
Summary:
The clang driver option -save-temps was not passed to the LTO config,
so when invoking the ThinLTO backends via clang during distributed
builds there was no way to get LTO to save temp files.
Getting this to work with ThinLTO distributed builds also required
changing the driver to avoid a separate compile step to emit unoptimized
bitcode when the input was already bitcode under -save-temps. Not only is
this unnecessary in general, it is problematic for ThinLTO backends since
the temporary bitcode file to the backend would not match the module path
in the combined index, leading to incorrect ThinLTO backend index-based
optimizations.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D45217
llvm-svn: 330194
Summary:
This change addresses http://llvm.org/PR36926 by allowing users to pick
which instrumentation bundles to use, when instrumenting with XRay. In
particular, the flag `-fxray-instrumentation-bundle=` has four valid
values:
- `all`: the default, emits all instrumentation kinds
- `none`: equivalent to -fnoxray-instrument
- `function`: emits the entry/exit instrumentation
- `custom`: emits the custom event instrumentation
These can be combined either as comma-separated values, or as
repeated flag values.
Reviewers: echristo, kpw, eizan, pelikan
Reviewed By: pelikan
Subscribers: mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D44970
llvm-svn: 329985
It means the same thing as -mllvm; there isn't any reason to have two
options which do the same thing.
Differential Revision: https://reviews.llvm.org/D45109
llvm-svn: 329965
Summary:
This change consolidates the always/never lists that may be provided to
clang to externally control which functions should be XRay instrumented
by imbuing attributes. The files follow the same format as defined in
https://clang.llvm.org/docs/SanitizerSpecialCaseList.html for the
sanitizer blacklist.
We also deprecate the existing `-fxray-instrument-always=` and
`-fxray-instrument-never=` flags, in favour of `-fxray-attr-list=`.
This fixes http://llvm.org/PR34721.
Reviewers: echristo, vlad.tsyrklevich, eugenis
Reviewed By: vlad.tsyrklevich
Subscribers: llvm-commits, cfe-commits
Differential Revision: https://reviews.llvm.org/D45357
llvm-svn: 329543
This CMake flag allows setting the default value for the
-f[no]-experimental-new-pass-manager flag.
Differential Revision: https://reviews.llvm.org/D44330
llvm-svn: 329366
Summary:
"-fmerge-all-constants" is a non-conforming optimization and should not
be the default. It is also causing miscompiles when building Linux
Kernel (https://lkml.org/lkml/2018/3/20/872).
Fixes PR18538.
Reviewers: rjmccall, rsmith, chandlerc
Reviewed By: rsmith, chandlerc
Subscribers: srhines, cfe-commits
Differential Revision: https://reviews.llvm.org/D45289
llvm-svn: 329300
ObjC and ObjC++ pass non-trivial structs in a way that is incompatible
with each other. For example:
typedef struct {
id f0;
__weak id f1;
} S;
// this code is compiled in c++.
extern "C" {
void foo(S s);
}
void caller() {
// the caller passes the parameter indirectly and destructs it.
foo(S());
}
// this function is compiled in c.
// 'a' is passed directly and is destructed in the callee.
void foo(S a) {
}
This patch fixes the incompatibility by passing and returning structs
with __strong or weak fields using the C ABI in C++ mode. __strong and
__weak fields in a struct do not cause the struct to be destructed in
the caller and __strong fields do not cause the struct to be passed
indirectly.
Also, this patch fixes the microsoft ABI bug mentioned here:
https://reviews.llvm.org/D41039?id=128767#inline-364710
rdar://problem/38887866
Differential Revision: https://reviews.llvm.org/D44908
llvm-svn: 328731
r327219 added wrappers to std::sort which randomly shuffle the container before
sorting. This will help in uncovering non-determinism caused due to undefined
sorting order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of
std::sort.
llvm-svn: 328636
This make -ivfsoverlay behave more like other fatal errors (e.g. missing
-include file) by skipping the missing file instead of bailing out of
the whole compilation. This makes it possible for libclang to still
provide some functionallity as well as to correctly produce the fatal
error diagnostic (previously we lost the diagnostic in libclang since
there was no TU to tie it to).
rdar://33385423
llvm-svn: 328337
When skipping building the module for a private framework module,
LangOpts.CurrentModule isn't enough for implict modules builds; for
instance, in case a private module is built while building a public one,
LangOpts.CurrentModule doesn't reflect the -fmodule-name being passed
down, but instead the module name which triggered the build.
Store the actual -fmodule-name in LangOpts.ModuleName and actually
check a name was provided during compiler invocation in order to
skip building the private module.
rdar://problem/38434694
llvm-svn: 328053
The patch adds nocf_check target independent attribute for disabling checks that were enabled by cf-protection flag.
The attribute can be appertained to functions and function pointers.
Attribute name follows GCC's similar attribute name.
Differential Revision: https://reviews.llvm.org/D41880
llvm-svn: 327768
Provide two new CMake cache variables -- CLANG_DEFAULT_STD_C
and CLANG_DEFAULT_STD_CXX -- that can be used to override the default
C/ObjC and C++/ObjC++ standards appropriately. They can be set to one of
the identifiers from LangStandards.def, or left unset (the default) to
respect the current platform default.
This option is mostly intended for compiler vendors that may wish
to adjust the defaults their compilers are using. For example, Gentoo
planned to use it to set clang and gcc to matching standards, so that
we could maintain as much compatibility between different compilers
as possible.
The code relies on explicit identifiers rather than the string aliases
for simplicity. This saves us from the necessity of parsing aliases at
build-time or adding additional processing at runtime. For the latter
case, it also adds trivial value check -- if incorrect value is passed,
the code simply fails to compile through referencing an undefined
constant.
If the variable is used to redefine the default standard, the explicit
value overrides the special case for PS4. It is done this way mostly
following other kinds of variables where 'platform defaults' are
redefined.
Differential Revision: https://reviews.llvm.org/D34365
llvm-svn: 326836
This makes it easier to debug crashes and hangs in block functions since
users can easily find out where the block is called from. The option
doesn't disable tail-calls from non-escaping blocks since non-escaping
blocks are not as hard to debug as escaping blocks.
rdar://problem/35758207
Differential Revision: https://reviews.llvm.org/D43841
llvm-svn: 326530
Since LLVM r326341, default EmulatedTLS mode is decided in backend
according to target triple. Any front-end should pass -f[no]-emulated-tls
to backend and set up ExplicitEmulatedTLS only when the flags are used.
Differential Revision: https://reviews.llvm.org/D43965
llvm-svn: 326499
Differential Revision: https://reviews.llvm.org/D43852
This patch extends the SPMD implementation to all target constructs and guards this implementation under a new flag.
llvm-svn: 326368
Binaries for multiple architectures are combined by fatbinary,
so the current code was effectively not needed.
Differential Revision: https://reviews.llvm.org/D43461
llvm-svn: 326342
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. This vendor extension to DWARF v5 allows source text to be
embedded directly in the line tables of the debug line section.
Add new flag (-g[no-]embed-source) to Driver and CC1 which indicates
that source should be passed through to LLVM during CodeGen.
Differential Revision: https://reviews.llvm.org/D42766
llvm-svn: 326102
Summary:
If the flag -fforce-enable-int128 is passed, it will enable support for __int128_t and __uint128_t types.
This flag can then be used to build compiler-rt for RISCV32.
Reviewers: asb, kito-cheng, apazos, efriedma
Reviewed By: asb, efriedma
Subscribers: shiva0217, efriedma, jfb, dschuff, sdardis, sbc100, jgravelle-google, aheejin, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, cfe-commits
Differential Revision: https://reviews.llvm.org/D43105
llvm-svn: 326045
Summary:
OpenCL 2.0 specification defines '-cl-uniform-work-group-size' option,
which requires that the global work-size be a multiple of the work-group
size specified to clEnqueueNDRangeKernel and allows optimizations that
are made possible by this restriction.
The patch introduces the support of this option.
To keep information about whether an OpenCL kernel has uniform work
group size or not, clang generates 'uniform-work-group-size' function
attribute for every kernel:
- "uniform-work-group-size"="true" for OpenCL 1.2 and lower,
- "uniform-work-group-size"="true" for OpenCL 2.0 and higher if
'-cl-uniform-work-group-size' option was specified,
- "uniform-work-group-size"="false" for OpenCL 2.0 and higher if no
'-cl-uniform-work-group-size' options was specified.
If the function is not an OpenCL kernel, 'uniform-work-group-size'
attribute isn't generated.
Patch by: krisb
Reviewers: yaxunl, Anastasia, b-sumner
Reviewed By: yaxunl, Anastasia
Subscribers: nhaehnle, yaxunl, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D43570
llvm-svn: 325771
Summary:
Right now clang is skipping array cookie poisoning for any operator
new[] which is not part of the set of replaceable global allocation
functions.
This commit adds a flag to tell clang to poison all operator new[]
cookies.
A previous review was poisoning all array cookies unconditionally, but
there is an edge case which would stop working under ASan (a custom
operator new[] saves whatever pointer it returned, and then accesses
it).
This newer revision adds a command line argument to toggle this feature.
Original revision: https://reviews.llvm.org/D41301
Compiler-rt test revision with an explanation of the edge case: https://reviews.llvm.org/D41664
Reviewers: rjmccall, kcc, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43013
llvm-svn: 324884
As a first step, pass '-c/--compile-only' to ptxas so that it
doesn't complain about references to external function. This
will successfully generate object files, but they won't work
at runtime because the registration routines need to adapted.
Differential Revision: https://reviews.llvm.org/D42921
llvm-svn: 324878
This patch adds a base-class called TemplateInstantiationObserver which gets
notified whenever a template instantiation is entered or exited during
semantic analysis. This is a base class used to implement the template
profiling and debugging tool called
Templight (https://github.com/mikael-s-persson/templight).
The patch also makes a few more changes:
* ActiveTemplateInstantiation class is moved out of the Sema class (so it can be used with inclusion of Sema.h).
* CreateFrontendAction function in front-end utilities is given external linkage (not longer a hidden static function).
* TemplateInstObserverChain data member added to Sema class to hold the list of template-inst observers.
* Notifications to the template-inst observer are added at the key places where templates are instantiated.
Patch by: Abel Sinkovics!
Differential Revision: https://reviews.llvm.org/D5767
llvm-svn: 324808
Analyzing problems which appear in scan-build results can be very
difficult, as after the launch no exact invocation is stored, and it's
super-hard to launch the debugger.
With this patch, the exact analyzer invocation appears in the footer,
and can be copied to debug/check reproducibility/etc.
rdar://35980230
llvm-svn: 323245
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.
The original patch didn't have the lit.local.cfg file that restricts the new
test to x86, thus the new test was failing on the non-x86 bots.
Differential Revision: https://reviews.llvm.org/D40531
The reverts r322008, which was a revert of r322005.
This reverts commit a05b89f9aca70597dc79fe97bc49b50b51f525ba.
llvm-svn: 322136
Summary:
A FIXME added 8 years ago (2010) in https://reviews.llvm.org/rL118203
mentioned that a FileManager should not need to be used when parsing
preprocessor arguments. In fact, its only use was removed 6 years ago
(2012), in https://reviews.llvm.org/rL166452. Remove the unused
variable and the obsolete FIXME.
Test Plan: `check-clang`
Reviewers: v.g.vassilev
Reviewed By: v.g.vassilev
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D41867
llvm-svn: 322118
Cf-protection is a target independent flag that instructs the back-end to instrument control flow mechanisms like: Branch, Return, etc.
For example in X86 this flag will be used to instrument Indirect Branch Tracking instructions.
Differential Revision: https://reviews.llvm.org/D40478
Change-Id: I5126e766c0e6b84118cae0ee8a20fe78cc373dea
llvm-svn: 322063
The new test fails on the Hexagon bot. Reverting while I investigate.
This reverts https://reviews.llvm.org/rL322005
This reverts commit b7e0026b4385180c378edc658ec91a39566f2942.
llvm-svn: 322008
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.
Differential Revision: https://reviews.llvm.org/D40531
llvm-svn: 322005
Adds the -fstack-size-section flag to enable the .stack_sizes section. The flag defaults to on for the PS4 triple.
Differential Revision: https://reviews.llvm.org/D40712
llvm-svn: 321992
Summary:
Depends on https://reviews.llvm.org/D41732.
Utilities such as `opt`, when invoked with arguments that are very
nearly spelled correctly, suggest the correctly spelled options:
```
bin/opt -hel
opt: Unknown command line argument '-hel'. Try: 'bin/opt -help'
opt: Did you mean '-help'?
```
Clang, on the other hand, prior to this commit, does not:
```
bin/clang -hel
clang-6.0: error: unknown argument: '-hel'
```
This commit makes use of the new libLLVMOption API from
https://reviews.llvm.org/D41732 in order to provide correct suggestions:
```
bin/clang -hel
clang-6.0: error: unknown argument: '-hel', did you mean '-help'?
```
Test Plan: `check-clang`
Reviewers: yamaguchi, v.g.vassilev, teemperor, ruiu, bruno
Reviewed By: bruno
Subscribers: bruno, jroelofs, cfe-commits
Differential Revision: https://reviews.llvm.org/D41733
llvm-svn: 321917
There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF:
1. In the driver/frontend, we accept the flag and its 'no' inverse and deal with the
interactions with other flags like -ffast-math -fno-signed-zeros -fno-trapping-math.
This was mostly already done - we just need to translate the flag as a codegen option.
The test file is complicated because there are many potential combinations of flags here.
Note that we are matching gcc's behavior that requires 'nsz' and no-trapping-math.
2. In codegen, we map the codegen option to FMF in the IR builder. This is simple code and
corresponding test.
For the motivating example from PR27372:
float foo(float a, float x) { return ((a + x) - x); }
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math -emit-llvm | egrep 'fadd|fsub'
%add = fadd nnan ninf nsz arcp contract float %0, %1
%sub = fsub nnan ninf nsz arcp contract float %add, %2
So 'reassoc' is off as expected (and so is the new 'afn' but that's a different patch).
This case now works as expected end-to-end although the underlying logic is still wrong:
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math | grep xmm
addss %xmm1, %xmm0
subss %xmm1, %xmm0
We're not done because the case where 'reassoc' is set is ignored by optimizer passes. Example:
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math -emit-llvm | grep fadd
%add = fadd reassoc float %0, %1
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math | grep xmm
addss %xmm1, %xmm0
subss %xmm1, %xmm0
Differential Revision: https://reviews.llvm.org/D39812
llvm-svn: 320920
This mimics FileCheck's --check-prefixes option.
The default prefix is "expected". That is, "-verify" is equivalent to
"-verify=expected".
The goal is to permit exercising a single test suite source file with different
compiler options producing different sets of diagnostics. While cpp can be
combined with the existing -verify to accomplish the same goal, source is often
easier to maintain when it's not cluttered with preprocessor directives or
duplicate passages of code. For example, this patch also rewrites some existing
clang tests to demonstrate the benefit of this feature.
Patch by Joel E. Denny, thanks!
Differential Revision: https://reviews.llvm.org/D39694
llvm-svn: 320908
This adds a new command line option -mprefer-vector-width to specify a preferred vector width for the vectorizers. Valid values are 'none' and unsigned integers. The driver will check that it meets those constraints. Specific supported integers will be managed by the targets in the backend.
Clang will take the value and add it as a new function attribute during CodeGen.
This represents the alternate direction proposed by Sanjay in this RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-November/118734.html
The syntax here matches gcc, though gcc treats it as an x86 specific command line argument. gcc only allows values of 128, 256, and 512. I'm not having clang check any values.
Differential Revision: https://reviews.llvm.org/D40230
llvm-svn: 320419
As reported in llvm bugzilla 32377.
Here’s a patch to add preinclude of stdc-predef.h.
The gcc documentation says “On GNU/Linux, <stdc-predef.h> is pre-included.” See https://gcc.gnu.org/gcc-4.8/porting_to.html;
The preinclude is inhibited with –ffreestanding.
Basically I fixed the failing test cases by adding –ffreestanding which inhibits this behavior.
I fixed all the failing tests, including some in extra/test, there's a separate patch for that which is linked here
Note: this is a recommit after a test failure took down the original (r318669)
Patch By: mibintc
Differential Revision: https://reviews.llvm.org/D34158
llvm-svn: 320391
This adds -std=c17, -std=gnu17, and -std=iso9899:2017 as language mode flags for C17 and updates the value of __STDC_VERSION__ to the value based on the C17 FDIS. Given that this ballot cannot succeed until 2018, it is expected that we (and GCC) will add c18 flags as aliases once the ballot passes.
llvm-svn: 320089
Summary:
The -fxray-always-emit-customevents flag instructs clang to always emit
the LLVM IR for calls to the `__xray_customevent(...)` built-in
function. The default behaviour currently respects whether the function
has an `[[clang::xray_never_instrument]]` attribute, and thus not lower
the appropriate IR code for the custom event built-in.
This change allows users calling through to the
`__xray_customevent(...)` built-in to always see those calls lowered to
the corresponding LLVM IR to lay down instrumentation points for these
custom event calls.
Using this flag enables us to emit even just the user-provided custom
events even while never instrumenting the start/end of the function
where they appear. This is useful in cases where "phase markers" using
__xray_customevent(...) can have very few instructions, must never be
instrumented when entered/exited.
Reviewers: rnk, dblaikie, kpw
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D40601
llvm-svn: 319388
This is a re-apply of r319294.
adds -fseh-exceptions and -fdwarf-exceptions flags
clang will check if the user has specified an exception model flag,
in the absense of specifying the exception model clang will then check
the driver default and append the model flag for that target to cc1
-fno-exceptions has a higher priority then specifying the model
move __SEH__ macro definitions out of Targets into InitPreprocessor
behind the -fseh-exceptions flag
move __ARM_DWARF_EH__ macrodefinitions out of verious targets and into
InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check
remove unused USESEHExceptions from the MinGW Driver
fold USESjLjExceptions into a new GetExceptionModel function that
gives the toolchain classes more flexibility with eh models
Reviewers: rnk, mstorsjo
Differential Revision: https://reviews.llvm.org/D39673
llvm-svn: 319297
adds -fseh-exceptions and -fdwarf-exceptions flags
clang will check if the user has specified an exception model flag,
in the absense of specifying the exception model clang will then check
the driver default and append the model flag for that target to cc1
clang cc1 assumes dwarf is the default if none is passed
and -fno-exceptions has a higher priority then specifying the model
move __SEH__ macro definitions out of Targets into InitPreprocessor
behind the -fseh-exceptions flag
move __ARM_DWARF_EH__ macrodefinitions out of verious targets and into
InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check
remove unused USESEHExceptions from the MinGW Driver
fold USESjLjExceptions into a new GetExceptionModel function that
gives the toolchain classes more flexibility with eh models
Reviewers: rnk, mstorsjo
Differential Revision: https://reviews.llvm.org/D39673
llvm-svn: 319294
This is an instrumentation flag that's similar to
-finstrument-functions, but it only inserts calls on function entry, the
calls are inserted post-inlining, and they don't take any arugments.
This is intended for users who want to instrument function entry with
minimal overhead.
(-pg would be another alternative, but forces frame pointer emission and
affects link flags, so is probably best left alone to be used for
generating gcov data.)
Differential revision: https://reviews.llvm.org/D40276
llvm-svn: 318785
As reported in llvm bugzilla 32377.
Here’s a patch to add preinclude of stdc-predef.h.
The gcc documentation says “On GNU/Linux, <stdc-predef.h> is pre-included.”
See https://gcc.gnu.org/gcc-4.8/porting_to.html;
The preinclude is inhibited with –ffreestanding.
Basically I fixed the failing test cases by adding –ffreestanding which inhibits
this behavior.
I fixed all the failing tests, including some in extra/test, there's a separate
patch for that which is linked here
Patch By: mibintc
Differential Revision: https://reviews.llvm.org/D34158
llvm-svn: 318669
This updates -mcount to use the new attribute names (LLVM r318195), and
switches over -finstrument-functions to also use these attributes rather
than inserting instrumentation in the frontend.
It also adds a new flag, -finstrument-functions-after-inlining, which
makes the cygprofile instrumentation get inserted after inlining rather
than before.
Differential Revision: https://reviews.llvm.org/D39331
llvm-svn: 318199
Added support for regcall as default calling convention. Also added code to
exclude main when applying default calling conventions.
Patch-By: eandrews
Differential Revision: https://reviews.llvm.org/D39210
llvm-svn: 317268
Summary:
This change allows generalizing pointers in type signatures used for
cfi-icall by enabling the -fsanitize-cfi-icall-generalize-pointers flag.
This works by 1) emitting an additional generalized type signature
metadata node for functions and 2) llvm.type.test()ing for the
generalized type for translation units with the flag specified.
This flag is incompatible with -fsanitize-cfi-cross-dso because it would
require emitting twice as many type hashes which would increase artifact
size.
Reviewers: pcc, eugenis
Reviewed By: pcc
Subscribers: kcc
Differential Revision: https://reviews.llvm.org/D39358
llvm-svn: 317044
GCC tries to shorten system headers in depfiles using its real path
(resolving components like ".." and following symlinks). Mimic this
feature to ensure that the Ninja build tool detects the correct
dependencies when a symlink changes directory levels, see
https://github.com/ninja-build/ninja/issues/1330
An option to disable this feature is added in case "these changed header
paths may conflict with some compilation environments", see
https://gcc.gnu.org/ml/gcc-patches/2012-09/msg00287.html
Note that the original feature request for GCC
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52974) also included paths
preprocessed output (-E) and diagnostics. That is not implemented now
since I am not sure if it breaks something else.
Differential Revision: https://reviews.llvm.org/D37954
llvm-svn: 316193
Currently all the consecutive bitfields are wrapped as a large integer unless there is unamed zero sized bitfield in between. The patch provides an alternative manner which makes the bitfield to be accessed as separate memory location if it has legal integer width and is naturally aligned. Such separate bitfield may split the original consecutive bitfields into subgroups of consecutive bitfields, and each subgroup will be wrapped as an integer. Now This is all controlled by an option -ffine-grained-bitfield-accesses. The alternative of bitfield access manner can improve the access efficiency of those bitfields with legal width and being aligned, but may reduce the chance of load/store combining of other bitfields, so it depends on how the bitfields are defined and actually accessed to choose when to use the option. For now the option is off by default.
Differential revision: https://reviews.llvm.org/D36562
llvm-svn: 315915
Move the logic for determining the `wchar_t` type information into the
driver. Rather than passing the single bit of information of
`-fshort-wchar` indicate to the frontend the desired type of `wchar_t`
through a new `-cc1` option of `-fwchar-type` and indicate the
signedness through `-f{,no-}signed-wchar`. This replicates the current
logic which was spread throughout Basic into the
`RenderCharacterOptions`.
Most of the changes to the tests are to ensure that the frontend uses
the correct type. Add a new test set under `test/Driver/wchar_t.c` to
ensure that we calculate the proper types for the various cases.
llvm-svn: 315126
to have child entries describing the template parameters. This will
be on by default for SCE tuning.
Differential Revision: https://reviews.llvm.org/D14358
llvm-svn: 314444
Extend the -fmodule-file option to support the [<name>=]<file> value format.
If the name is omitted, then the old semantics is preserved (the module file
is loaded whether needed or not). If the name is specified, then the mapping
is treated as just another prebuilt module search mechanism, similar to
-fprebuilt-module-path, and the module file is only loaded if actually used
(e.g., via import). With one exception: this mapping also overrides module
file references embedded in other modules (which can be useful if module files
are moved/renamed as often happens during remote compilation).
This override semantics requires some extra work: we now store the module name
in addition to the file name in the serialized AST representation.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D35020
llvm-svn: 312220
Extend the -fmodule-file option to support the [<name>=]<file> value format.
If the name is omitted, then the old semantics is preserved (the module file
is loaded whether needed or not). If the name is specified, then the mapping
is treated as just another prebuilt module search mechanism, similar to
-fprebuilt-module-path, and the module file is only loaded if actually used
(e.g., via import). With one exception: this mapping also overrides module
file references embedded in other modules (which can be useful if module files
are moved/renamed as often happens during remote compilation).
This override semantics requires some extra work: we now store the module name
in addition to the file name in the serialized AST representation.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D35020
llvm-svn: 312105
Summary:
An implementation of ubsan runtime library suitable for use in production.
Minimal attack surface.
* No stack traces.
* Definitely no C++ demangling.
* No UBSAN_OPTIONS=log_file=/path (very suid-unfriendly). And no UBSAN_OPTIONS in general.
* as simple as possible
Minimal CPU and RAM overhead.
* Source locations unnecessary in the presence of (split) debug info.
* Values and types (as in A+B overflows T) can be reconstructed from register/stack dumps, once you know what type of error you are looking at.
* above two items save 3% binary size.
When UBSan is used with -ftrap-function=abort, sometimes it is hard to reason about failures. This library replaces abort with a slightly more informative message without much extra overhead. Since ubsan interface in not stable, this code must reside in compiler-rt.
Reviewers: pcc, kcc
Subscribers: srhines, mgorny, aprantl, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D36810
llvm-svn: 312029
This patch adds a flag -fclang-abi-compat that can be used to request that
Clang attempts to be ABI-compatible with some older version of itself.
This is provided on a best-effort basis; right now, this can be used to undo
the ABI change in r310401, reverting Clang to its prior C++ ABI for pass/return
by value of class types affected by that change, and to undo the ABI change in
r262688, reverting Clang to using integer registers rather than SSE registers
for passing <1 x long long> vectors. The intent is that we will maintain this
backwards compatibility path as we make ABI-breaking fixes in future.
The reversion to the old behavior for r310401 is also applied to the PS4 target
since that change is not part of its platform ABI (which is essentially to do
whatever Clang 3.2 did).
llvm-svn: 311823
Summary: With accurate sample profile, we can do more aggressive size optimization. For some size-critical application, this can reduce the text size by 20%
Reviewers: davidxl, rsmith
Reviewed By: davidxl, rsmith
Subscribers: mehdi_amini, eraman, sanjoy, cfe-commits
Differential Revision: https://reviews.llvm.org/D37091
llvm-svn: 311707
Summary:
Augment SanitizerCoverage to insert maximum stack depth tracing for
use by libFuzzer. The new instrumentation is enabled by the flag
-fsanitize-coverage=stack-depth and is compatible with the existing
trace-pc-guard coverage. The user must also declare the following
global variable in their code:
thread_local uintptr_t __sancov_lowest_stack
https://bugs.llvm.org/show_bug.cgi?id=33857
Reviewers: vitalybuka, kcc
Reviewed By: vitalybuka
Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D36839
llvm-svn: 311186
This option when combined with -mgpopt and -membedded-data places all
uninitialized constant variables in the read-only section.
Reviewers: atanasyan, nitesh.jain
Differential Revision: https://reviews.llvm.org/D35917
llvm-svn: 309940
Summary:
Changed EABIVersion type from string to llvm::EABI.
It seems it was just a typo and this is intended implementation.
Differential Revision: https://reviews.llvm.org/D34595
llvm-svn: 306953
Summary:
Un-revert https://reviews.llvm.org/D34868, but with a slight tweak to the
documentation to fix an error -- I had used the wrong syntax for a link.
llvm-svn: 306948
Summary:
Depends on https://reviews.llvm.org/D34867.
Add a Clang frontend option to enable optimization remark hotness
thresholds, which were added to LLVM in https://reviews.llvm.org/D34867.
This prevents diagnostics that do not meet a minimum hotness
threshold from being output. When generating optimization remarks for large
codebases with a ton of cold code paths, this option can be used
to limit the optimization remark output at a reasonable size.
Discussion of this change can be read here:
http://lists.llvm.org/pipermail/llvm-dev/2017-June/114377.html
Reviewers: anemet, davidxl, hfinkel
Reviewed By: anemet
Subscribers: fhahn, cfe-commits
Differential Revision: https://reviews.llvm.org/D34868
llvm-svn: 306945
basic block vectorizer. This vectorizer has had no known users for many,
many years and is completely surpassed by the normal
'-fvectorize-slp'-controlled SLP vectorizer in LLVM.
Hal proposed this back in 2014 to no objections:
http://lists.llvm.org/pipermail/llvm-dev/2014-November/079091.html
While this patch completely removes the flag, Joerg is working on
a patch that will add it back in a way that warns users and ignores the
flag in a clear and well factored way (so that we can keep doing this
going forward).
Differential Revision: https://reviews.llvm.org/D34846
llvm-svn: 306786
a c++17 aligned allocation/deallocation function that is unavailable in
the standard library on Apple platforms.
The aligned functions are implemented only in the following versions or
later versions of the OSes, so clang issues diagnostics if the deployment
target being targeted is older than these:
macosx: 10.13
ios: 11.0
tvos: 11.0
watchos: 4.0
The diagnostics are issued whenever the aligned functions are selected
except when the selected function has a definition in the same file.
If there is a user-defined function available somewhere else, option
-Wno-aligned-allocation-unavailable can be used to silence the
diagnostics.
rdar://problem/32664169
Differential Revision: https://reviews.llvm.org/D34574
llvm-svn: 306722
Summary: Device offloading requires the specification of an additional flag containing the triple of the //other// architecture the code is being compiled on if such an architecture exists. If compiling for the host, the auxiliary triple flag will contain the triple describing the device and vice versa.
Reviewers: arpith-jacob, sfantao, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar
Reviewed By: Hahnfeld
Subscribers: rengolin, cfe-commits
Differential Revision: https://reviews.llvm.org/D29339
llvm-svn: 306689
Restore the `-gz` option to the driver with some minor tweaks to handle
the additional case for `-Wa,--compress-debug-sections`.
This intends to make the compression of the debug information
controllable from the driver. The following is the behaviour:
-gz enable compression (ambiguous for format, will default to zlib-gnu)
-gz=none disable compression
-gz=zlib-gnu enable compression (deprecated GNU style zlib compression)
-gz=zlib enable compression (zlib based compression)
Although -Wa,-compress-debug-sections works, it should be discouraged
when using the driver to invoke the assembler. However, we permit the
assembler to accept the GNU as style argument --compress-debug-sections
to maintain compatibility.
Note, -gz/-gz= does *NOT* imply -g. That is, you need to additionally
specific -g for debug information to be generated.
llvm-svn: 306115
Summary:
Prior to this change, using `-fdiagnostics-show-hotness` with a sampling
profile specified via `-fprofile-sample-use=` would result in the Clang
frontend emitting a warning: "argument '-fdiagnostics-show-hotness' requires
profile-guided optimization information". Of course, a sampling profile
*is* profile-guided optimization information, so the warning is misleading.
Furthermore, despite the warning, hotness was displayed based on the data in
the sampling profile.
Prevent the warning from being emitted when a sampling profile is used, and
add a test that verifies this.
Reviewers: anemet, davidxl
Reviewed By: davidxl
Subscribers: danielcdh, cfe-commits
Differential Revision: https://reviews.llvm.org/D34082
llvm-svn: 306079
for preprocessing
r300667 added support for editor placeholder to Clang. That commit didn’t take
into account that users who use Clang for preprocessing only (-E) will get the
"editor placeholder in source file" error when preprocessing their source
(PR33394). This commit ensures that Clang doesn't lex editor placeholders when
running a preprocessor only action.
rdar://32718000
Differential Revision: https://reviews.llvm.org/D34256
llvm-svn: 305576
Summary:
It seems -flto must be either "thin" or "full". I think the use of
containValue is just a typo.
Reviewers: ruiu, tejohnson
Subscribers: mehdi_amini, inglorion
Differential Revision: https://reviews.llvm.org/D34055
llvm-svn: 305392
cc1as does not currently access the "--" version of this flag. At the
very least this needs to be fixed and proper test cases need to be
added.
Simple reproducer:
clang -Wa,--compress-debug-sections /tmp/test.cc
Result:
error: unknown argument: '--compress-debug-sections'
llvm-svn: 305182
These options control the behaviour of the compression of debug info
sections on ELF targets. Our behaviour slightly diverges from the
behaviour of GCC. `-gz` maps to the `-compress-debug-sections` rather
than `-compress-debug-sections=zlib` or
`-compress-debug-sections=zlib-gnu`. This small divergence allows us to
be compatible across versions of binutils (=zlib support was introduced
in 2.26, while earlier versions only support =zlib-gnu). This also
allows users to not have to worry about the version of the assembler
they may be using if they are not using the IAS. Previously, users
would have had to go through the internal option
`-compress-debug-sectionss` and pass that through to the assembler,
which is no longer needed.
llvm-svn: 305165
If specified, when preprocessing, the contents of imported .pcm files will be
included in preprocessed output. The resulting preprocessed file can then be
compiled standalone without the module sources or .pcm files.
llvm-svn: 305116
This is tied with the LLVM side of the change to expose the debug
information compression types to clang. We now track the compression
type as an enumeration rather than a boolean. We still use the same
value (GNU) that we did previously. This is in preparation to support
passing down the compression type and switch it based on the command
line.
llvm-svn: 305039
This patch makes it an error to have a mismatch between the enabled
sanitizers in a CU, and in any module being imported into the CU. Only
mismatches between non-modular sanitizers are treated as errors.
This patch also includes non-modular sanitizers in module hashes, in
order to ensure module rebuilds occur when -fsanitize=X is toggled on
and off for non-modular sanitizers, and to cut down on module rebuilds
when the option is toggled for modular sanitizers.
This fixes a longstanding issue with implicit modules and sanitizers,
which Duncan originally diagnosed.
When building with implicit modules it's possible to hit a scenario
where modules are built without -fsanitize=address, and are subsequently
imported into CUs with -fsanitize=address enabled. This causes strange
failures at runtime. The case Duncan found affects libcxx, since its
vector implementation behaves differently when ASan is enabled.
Implicit module builds should "just work" when -fsanitize=X is toggled
on and off across multiple compiler invocations, which is what this
patch does.
Differential Revision: https://reviews.llvm.org/D32724
llvm-svn: 304463
Amongst other, this will help LTO to correctly handle/honor files
compiled with O0, helping debugging failures.
It also seems in line with how we handle other options, like how
-fnoinline adds the appropriate attribute as well.
Differential Revision: https://reviews.llvm.org/D28404
llvm-svn: 304127
There's a Microsoft header in the Windows SDK which won't
compile with clang because it uses an operator name (and)
as a field name. This patch allows that file to compile by
setting the option which disables operator names.
The header which doesn't compile <Query.h> C:/Program Files (x86)/
Windows Kits/10/include/10.0.14393.0/um\Query.h:259:40:
error: expected member name or ';' after declaration specifiers
/* [case()] */ NODERESTRICTION or;
~~~~~~~~~~~~~~~ ^
1 error generated.
Contributed for Melanie Blower
Differential Revision:https://reviews.llvm.org/D33505
llvm-svn: 303798
When a diagnostic includes a highlighted range spanning multiple lines, clang
now supports printing out multiple lines of context if necessary to show the
highlighted ranges. This is not yet exposed in the driver, but can be enabled
by "-Xclang -fcaret-diagnostics-max-lines -Xclang N".
This is experimental until we can find out whether it works well in practice,
and if so, what a good default for the maximum number of lines is.
llvm-svn: 303589
This feature is subtly broken when the linker is gold 2.26 or
earlier. See the following bug for details:
https://sourceware.org/bugzilla/show_bug.cgi?id=19002
Since the decision needs to be made at compilation time, we can not
test the linker version. The flag is off by default on ELF targets,
and on otherwise.
llvm-svn: 302591
action to the general FrontendAction infrastructure.
This permits applying -E, -ast-dump, -fsyntax-only, and so on to a module map
compilation. (The -E form is not currently especially useful yet as there's no
good way to take the output and use it to actually build a module.)
In order to support this, -cc1 now accepts -x <lang>-module-map in all cases
where it accepts -x <lang> for a language we can parse (not ir/ast). And for
uniformity, we also accept -x <lang>-header for all such languages (we used
to reject for cuda and renderscript), and -x <lang>-cpp-output for all such
languages (we used to reject for c, cl, and renderscript).
(None of these new alternatives are accepted by the driver yet, so no
user-visible changes.)
llvm-svn: 301610
Don't list deprecated -std= values (c++0x etc). Only produce one line of output
per standard, even if we know it by multiple names.
In passing, add missing -std=gnu++03 alias (supported by GCC), and add new
spelling '-std=cl1.0' for OpenCL 1.0 for consistency with the other values,
with the same meaning as the preexisting '-std=cl'.
llvm-svn: 301507
We already prohibited this in most cases (in r130710), but had some bugs in our
enforcement of this rule. Specifically, this prevents the following
combinations:
* -x c -std=clN.M, which would previously effectively act as if -x cl were
used, despite the input being a C source file. (-x cl -std=cNN continues
to be disallowed.)
* -x c++ -std=cuda, which would previously select C++98 + CUDA, despite that
not being a C++ standard. (-x cuda -std=c++NN is still permitted, and
selects CUDA with the given C++ standard as its base language.
-x cuda -std=cuda is still supported with the meaning of CUDA + C++98.)
* -x renderscript -std=c++NN, which would previously form a hybrid "C++ with
RenderScript extensions" language. We could support such a thing, but
shouldn't do so by accident.
llvm-svn: 301497
Since Split DWARF needs to name the actual .dwo file that is generated,
it can't be known at the time the llvm::Module is produced as it may be
merged with other Modules before the object is generated and that object
may be generated with any name.
By passing the Split DWARF file name when LLVM is producing object code
the .dwo file name in the object file can match correctly.
The support for Split DWARF for implicit modules remains the same -
using metadata to store the dwo name and dwo id so that potentially
multiple skeleton CUs referring to different dwo files can be generated
from one llvm::Module.
llvm-svn: 301063
This commit teaches Clang to recognize editor placeholders that are produced
when an IDE like Xcode inserts a code-completion result that includes a
placeholder. Now when the lexer sees a placeholder token, it emits an
'editor placeholder in source file' error and creates an identifier token
that represents the placeholder. The parser/sema can now recognize the
placeholders and can suppress the diagnostics related to the placeholders. This
ensures that live issues in an IDE like Xcode won't get spurious diagnostics
related to placeholders.
This commit also adds a new compiler option named '-fallow-editor-placeholders'
that silences the 'editor placeholder in source file' error. This is useful
for an IDE like Xcode as we don't want to display those errors in live issues.
rdar://31581400
Differential Revision: https://reviews.llvm.org/D32081
llvm-svn: 300667
The driver needs to know whether it's building a module interface or
implementation unit because it affects which outputs it produces and how it
builds the command pipeline. But the frontend doesn't need to know and should
not care: all it needs to know is what action it is being asked to perform on
the input.
(This is in preparation for permitting -emit-obj to be used on a module
interface unit to produce object code without going via a "full" PCM file.)
llvm-svn: 300611
Remove the restriction where this is only valid with C++
rdar://problem/29055656
Differential Revision: https://reviews.llvm.org/D31781
llvm-svn: 300108
This allows using and testing these two features separately. (noteably,
debug info is, so far as I know, always a win (basically). But function
modular codegen is currently a loss for highly optimized code - where
most of the linkonce_odr definitions are optimized away, so providing
weak_odr definitions is only overhead)
llvm-svn: 300104
This isn't need anymore and modules options -fbuild-session-file and
-fmodules-validate-once-per-build-session already provide a sane
mechanism to validate the system headers.
rdar://problem/19767523
llvm-svn: 300027
Summary:
The refactoring introduced a regression in the flag processing for
-fxray-instruction-threshold which causes it to not get passed properly.
This change should restore the previous behaviour.
Reviewers: rnk, pelikan
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D31491
llvm-svn: 299126
Summary:
The -fxray-always-instrument= and -fxray-never-instrument= flags take
filenames that are used to imbue the XRay instrumentation attributes
using a whitelist mechanism (similar to the sanitizer special cases
list). We use the same syntax and semantics as the sanitizer blacklists
files in the implementation.
As implemented, we respect the attributes that are already defined in
the source file (i.e. those that have the
[[clang::xray_{always,never}_instrument]] attributes) before applying
the always/never instrument lists.
Reviewers: rsmith, chandlerc
Subscribers: jfb, mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D30388
llvm-svn: 299041
FPContractModeKind is the codegen option flag which is already ternary (off,
on, fast). This makes it universally the type for the contractable info
across the front-end:
* In FPOptions (i.e. in the Sema + in the expression nodes).
* In LangOpts::DefaultFPContractMode which is the option that initializes
FPOptions in the Sema.
Another way to look at this change is that before fp-contractable on/off were
the only states handled to the front-end:
* For "on", FMA folding was performed by the front-end
* For "fast", we simply forwarded the flag to TargetOptions to handle it in
LLVM
Now off/on/fast are all exposed because for fast we will generate
fast-math-flags during CodeGen.
This is toward moving fp-contraction=fast from an LLVM TargetOption to a
FastMathFlag in order to fix PR25721.
---
This is a recommit of r299027 with an adjustment to the test
CodeGenCUDA/fp-contract.cu. The test assumed that even
though -ffp-contract=on is passed FE-based folding of FMA won't happen.
This is obviously wrong since the user is asking for this explicitly with the
option. CUDA is different that -ffp-contract=fast is on by default.
The test used to "work" because contract=fast and contract=on were maintained
separately and we didn't fold in the FE because contract=fast was on due to
the target-default. This patch consolidates the contract=on/fast/off state
into a ternary state hence the change in behavior.
---
Differential Revision: https://reviews.llvm.org/D31167
llvm-svn: 299033
FPContractModeKind is the codegen option flag which is already ternary (off,
on, fast). This makes it universally the type for the contractable info
across the front-end:
* In FPOptions (i.e. in the Sema + in the expression nodes).
* In LangOpts::DefaultFPContractMode which is the option that initializes
FPOptions in the Sema.
Another way to look at this change is that before fp-contractable on/off were
the only states handled to the front-end:
* For "on", FMA folding was performed by the front-end
* For "fast", we simply forwarded the flag to TargetOptions to handle it in
LLVM
Now off/on/fast are all exposed because for fast we will generate
fast-math-flags during CodeGen.
This is toward moving fp-contraction=fast from an LLVM TargetOption to a
FastMathFlag in order to fix PR25721.
Differential Revision: https://reviews.llvm.org/D31167
llvm-svn: 299027
The flag CXXOperatorNames was overwritten unconditionally
after being set for OpenCL.
There seems to be no necessity to set it, so removing the line.
llvm-svn: 298709
Summary:
Clang companion patch to LLVM patch D31027, which adds support
for emitting minimized bitcode file for use in the thin link step.
Add a cc1 option -fthin-link-bitcode=<file> to trigger this behavior.
Depends on D31027.
Reviewers: mehdi_amini, pcc
Subscribers: cfe-commits, Prazek
Differential Revision: https://reviews.llvm.org/D31050
llvm-svn: 298639
The alias was only ever used on darwin and had some issues there,
and isn't used in practice much. Also fixes a problem with -mno-altivec
not turning off -maltivec.
Also add a diagnostic for faltivec/fno-altivec that directs users to use
maltivec options and include the altivec.h file explicitly.
llvm-svn: 298449
This fixes lookup mismatches that could happen when the module cache
path contained a '/./' component.
<rdar://problem/30413458>
Differential Revision: https://reviews.llvm.org/D30915
llvm-svn: 297790
Change ASTFileSignature from a random 32-bit number to the hash of the
PCM content.
- Move definition ASTFileSignature to Basic/Module.h so Module and
ASTSourceDescriptor can use it.
- Change the signature from uint64_t to std::array<uint32_t,5>.
- Stop using (saving/reading) the size and modification time of PCM
files when there is a valid SIGNATURE.
- Add UNHASHED_CONTROL_BLOCK, and use it to store the SIGNATURE record
and other records that shouldn't affect the hash. Because implicit
modules reuses the same file for multiple levels of -Werror, this
includes DIAGNOSTIC_OPTIONS and DIAG_PRAGMA_MAPPINGS.
This helps to solve a PCH + implicit Modules dependency issue: PCH files
are handled by the external build system, whereas implicit modules are
handled by internal compiler build system. This prevents invalidating a
PCH when the compiler overwrites a PCM file with the same content
(modulo the diagnostic differences).
Design and original patch by Manman Ren!
llvm-svn: 297655
This commit adds support for a new -iframeworkwithsysroot compiler option which
allows the user to specify a framework path that can be prefixed with the
sysroot. This option is similar to the -iwithsysroot option that exists to
supplement -isystem.
rdar://21316352
Differential Revision: https://reviews.llvm.org/D30183
llvm-svn: 297614
In case user did not provide valid standard name for -std option, available
values (with short description) will be reported.
Patch by Paweł Żukowski!
llvm-svn: 295113
First pass at generating weak definitions of inline functions from module files
(& skipping (-O0) or emitting available_externally (optimizations)
definitions where those modules are used).
External functions defined in modules are emitted into the modular
object file as well (this may turn an existing ODR violation (if that
module were imported into multiple translations) into valid/linkable
code).
Internal symbols (static functions, for example) are not correctly
supported yet. The symbol will be produced, internal, in the modular
object - unreferenceable from the users.
Reviewers: rsmith
Differential Revision: https://reviews.llvm.org/D28845
llvm-svn: 293456
Summary:
Now when you ask clang to link in a bitcode module, you can tell it to
set attributes on that module's functions to match what we would have
set if we'd emitted those functions ourselves.
This is particularly important for fast-math attributes in CUDA
compilations.
Each CUDA compilation links in libdevice, a bitcode library provided by
nvidia as part of the CUDA distribution. Without this patch, if we have
a user-function F that is compiled with -ffast-math that calls a
function G from libdevice, F will have the unsafe-fp-math=true (etc.)
attributes, but G will have no attributes.
Since F calls G, the inliner will merge G's attributes into F's. It
considers the lack of an unsafe-fp-math=true attribute on G to be
tantamount to unsafe-fp-math=false, so it "merges" these by setting
unsafe-fp-math=false on F.
This then continues up the call graph, until every function that
(transitively) calls something in libdevice gets unsafe-fp-math=false
set, thus disabling fastmath in almost all CUDA code.
Reviewers: echristo
Subscribers: hfinkel, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D28538
llvm-svn: 293097
Summary:
SamplePGO uses profile with debug info to collect profile. Unlike the traditional debugging purpose, sample pgo needs more accurate debug info to represent the profile. We add -femit-accurate-debug-info for this purpose. It can be combined with all debugging modes (-g, -gmlt, etc). It makes sure that the following pieces of info is always emitted:
* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)
The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):
-gmlt(orig) -gmlt(patched) -g
433.milc 4.68% 5.40% 19.73%
444.namd 8.45% 8.93% 45.99%
447.dealII 97.43% 115.21% 374.89%
450.soplex 27.75% 31.88% 126.04%
453.povray 21.81% 26.16% 92.03%
470.lbm 0.60% 0.67% 1.96%
482.sphinx3 5.77% 6.47% 26.17%
400.perlbench 17.81% 19.43% 73.08%
401.bzip2 3.73% 3.92% 12.18%
403.gcc 31.75% 34.48% 122.75%
429.mcf 0.78% 0.88% 3.89%
445.gobmk 6.08% 7.92% 42.27%
456.hmmer 10.36% 11.25% 35.23%
458.sjeng 5.08% 5.42% 14.36%
462.libquantum 1.71% 1.96% 6.36%
464.h264ref 15.61% 16.56% 43.92%
471.omnetpp 11.93% 15.84% 60.09%
473.astar 3.11% 3.69% 14.18%
483.xalancbmk 56.29% 81.63% 353.22%
geomean 15.60% 18.30% 57.81%
Debug info size change for -gmlt binary with this patch:
433.milc 13.46%
444.namd 5.35%
447.dealII 18.21%
450.soplex 14.68%
453.povray 19.65%
470.lbm 6.03%
482.sphinx3 11.21%
400.perlbench 8.91%
401.bzip2 4.41%
403.gcc 8.56%
429.mcf 8.24%
445.gobmk 29.47%
456.hmmer 8.19%
458.sjeng 6.05%
462.libquantum 11.23%
464.h264ref 5.93%
471.omnetpp 31.89%
473.astar 16.20%
483.xalancbmk 44.62%
geomean 16.83%
Reviewers: davidxl, andreadb, rob.lougher, dblaikie, echristo
Reviewed By: dblaikie, echristo
Subscribers: hfinkel, rob.lougher, andreadb, gbedwell, cfe-commits, probinson, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25435
llvm-svn: 292458
In ThinLTO mode, type metadata will require the module to be written as a
multi-module bitcode file, which is currently incompatible with the Darwin
linker. It is also useful to be able to enable or disable multi-module bitcode
for testing purposes. This introduces a cc1-level flag, -f{,no-}lto-unit,
which is used by the driver to enable multi-module bitcode on all but
Darwin+ThinLTO, and can also be used to enable/disable the feature manually.
Differential Revision: https://reviews.llvm.org/D28877
llvm-svn: 292448
Aleksey Shlypanikov pointed out my mistake in migrating an explicit
unique_ptr to auto - I was expecting the function returned a unique_ptr,
but instead it returned a raw pointer - introducing a leak.
Thanks Aleksey!
This reapplies r291184, reverted in r291249.
llvm-svn: 291270
in non-void functions that fall off at the end without returning a value when
compiling C++.
Clang uses the new compiler flag to determine when it should treat control flow
paths that fall off the end of a non-void function as unreachable. If
-fno-strict-return is on, the code generator emits the ureachable and trap
IR only when the function returns either a record type with a non-trivial
destructor or another non-trivially copyable type.
The primary goal of this flag is to avoid treating falling off the end of a
non-void function as undefined behaviour. The burden of undefined behaviour
is placed on the caller instead: if the caller ignores the returned value then
the undefined behaviour is avoided. This kind of behaviour is useful in
several cases, e.g. when compiling C code in C++ mode.
rdar://13102603
Differential Revision: https://reviews.llvm.org/D27163
llvm-svn: 290960
to be specified for a template template parameter whenever the parameter is at
least as specialized as the argument (when there's an obvious and correct
mapping from uses of the parameter to uses of the argument). For example, a
template with more parameters can be passed to a template template parameter
with fewer, if those trailing parameters have default arguments.
This is disabled by default, despite being a DR resolution, as it's fairly
broken in its current state: there are no partial ordering rules to cope with
template template parameters that have different parameter lists, meaning that
code that attempts to decompose template-ids based on arity can hit unavoidable
ambiguity issues.
The diagnostics produced on a non-matching argument are also pretty bad right
now, but I aim to improve them in a subsequent commit.
llvm-svn: 290792
manager, and a code path to use it.
The option is actually a top-level option but does contain
'experimental' in the name. This is the compromise suggested by Richard
in discussions. We expect this option will be around long enough and
have enough users towards the end that it merits not being relegated to
CC1, but it still needs to be clear that this option will go away at
some point.
The backend code is a fresh codepath dedicated to handling the flow with
the new pass manager. This was also Richard's suggested code structuring
to essentially leave a clean path for development rather than carrying
complexity or idiosyncracies of how we do things just to share code with
the parts of this in common with the legacy pass manager. And it turns
out, not much is really in common even though we use the legacy pass
manager for codegen at this point.
I've switched a couple of tests to run with the new pass manager, and
they appear to work. There are still plenty of bugs that need squashing
(just with basic experiments I've found two already!) but they aren't in
this code, and the whole point is to expose the necessary hooks to start
experimenting with the pass manager in more realistic scenarios.
That said, I want to *strongly caution* anyone itching to play with
this: it is still *very shaky*. Several large components have not yet
been shaken down. For example I have bugs in both the always inliner and
inliner that I have already spotted and will be fixing independently.
Still, this is a fun milestone. =D
One thing not in this patch (but that might be very reasonable to add)
is some level of support for raw textual pass pipelines such as what
Sean had a patch for some time ago. I'm mostly interested in the more
traditional flow of getting the IR out of Clang and then running it
through opt, but I can see other use cases so someone may want to add
it.
And of course, *many* features are not yet supported!
- O1 is currently more like O2
- None of the sanitizers are wired up
- ObjC ARC optimizer isn't wired up
- ...
So plenty of stuff still lef to do!
Differential Revision: https://reviews.llvm.org/D28077
llvm-svn: 290450
-fno-inline-functions, -O0, and optnone.
These were really, really tangled together:
- We used the noinline LLVM attribute for -fno-inline
- But not for -fno-inline-functions (breaking LTO)
- But we did use it for -finline-hint-functions (yay, LTO is happy!)
- But we didn't for -O0 (LTO is sad yet again...)
- We had weird structuring of CodeGenOpts with both an inlining
enumeration and a boolean. They interacted in weird ways and
needlessly.
- A *lot* of set smashing went on with setting these, and then got worse
when we considered optnone and other inlining-effecting attributes.
- A bunch of inline affecting attributes were managed in a completely
different place from -fno-inline.
- Even with -fno-inline we failed to put the LLVM noinline attribute
onto many generated function definitions because they didn't show up
as AST-level functions.
- If you passed -O0 but -finline-functions we would run the normal
inliner pass in LLVM despite it being in the O0 pipeline, which really
doesn't make much sense.
- Lastly, we used things like '-fno-inline' to manipulate the pass
pipeline which forced the pass pipeline to be much more
parameterizable than it really needs to be. Instead we can *just* use
the optimization level to select a pipeline and control the rest via
attributes.
Sadly, this causes a bunch of churn in tests because we don't run the
optimizer in the tests and check the contents of attribute sets. It
would be awesome if attribute sets were a bit more FileCheck friendly,
but oh well.
I think this is a significant improvement and should remove the semantic
need to change what inliner pass we run in order to comply with the
requested inlining semantics by relying completely on attributes. It
also cleans up tho optnone and related handling a bit.
One unfortunate aspect of this is that for generating alwaysinline
routines like those in OpenMP we end up removing noinline and then
adding alwaysinline. I tried a bunch of other approaches, but because we
recompute function attributes from scratch and don't have a declaration
here I couldn't find anything substantially cleaner than this.
Differential Revision: https://reviews.llvm.org/D28053
llvm-svn: 290398
Much to my surprise, '-disable-llvm-optzns' which I thought was the
magical flag I wanted to get at the raw LLVM IR coming out of Clang
deosn't do that. It still runs some passes over the IR. I don't want
that, I really want the *raw* IR coming out of Clang and I strongly
suspect everyone else using it is in the same camp.
There is actually a flag that does what I want that I didn't know about
called '-disable-llvm-passes'. I suspect many others don't know about it
either. It both does what I want and is much simpler.
This removes the confusing version and makes that spelling of the flag
an alias for '-disable-llvm-passes'. I've also moved everything in Clang
to use the 'passes' spelling as it seems both more accurate (*all* LLVM
passes are disabled, not just optimizations) and much easier to remember
and spell correctly.
This is part of simplifying how Clang drives LLVM to make it cleaner to
wire up to the new pass manager.
Differential Revision: https://reviews.llvm.org/D28047
llvm-svn: 290392
In r267772, we had set the PS4's default dialect for both C and
Objective-C to gnu99. Make that change only for C; we don't really
support Objective-C/C++ so there's no point fiddling the dialect.
llvm-svn: 289625
Re-introduce r285411.
Implement the -dI as supported by GCC: Output ‘#include’ directives in addition
to the result of preprocessing.
This change aims to add this option, pass it through to the preprocessor via
the options class, and when inclusions occur we output some information (+ test
cases).
Patch by Steve O'Brien!
Differential Revision: https://reviews.llvm.org/D26089
llvm-svn: 287275
Summary:
Just like gcc, we should have the -Og option as more and more software are using it:
https://llvm.org/bugs/show_bug.cgi?id=20765
Reviewers: echristo, dberlin, dblaikie, keith.walker.arm, rengolin
Subscribers: aprantl, friss, mehdi_amini, RKSimon, probinson, majnemer, cfe-commits
Differential Revision: https://reviews.llvm.org/D24998
llvm-svn: 286602
Summary:
This patch adds a command line option '-cl-ext' to control a set of
supported OpenCL extensions. Option accepts a comma-separated list
of extensions prefixed with '+' or '-'.
It can be used together with a target triple to override support for some
extensions:
// spir target supports all extensions, but we want to disable fp64
clang -cc1 -triple spir-unknown-unknown -cl-ext=-cl_khr_fp64
Special 'all' extension allows to enable or disable all possible
extensions:
// only fp64 will be supported
clang -cc1 -triple spir-unknown-unknown -cl-ext=-all,+cl_khr_fp64
Patch by asavonic (Andrew Savonichev).
Reviewers: joey, yaxunl
Subscribers: yaxunl, bader, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D23712
llvm-svn: 285700
on cxx-abi-dev (thread starting 2016-10-11). This is currently hidden behind a
cc1-only -m flag, pending discussion of how best to deal with language changes
that require use of new symbols from the ABI library.
llvm-svn: 285664
Implement the -dI as supported by GCC: Output ‘#include’ directives in addition
to the result of preprocessing.
This change aims to add this option, pass it through to the preprocessor via
the options class, and when inclusions occur we output some information (+ test
cases).
Patch by Steve O'Brien!
Differential Revision: https://reviews.llvm.org/D25153
llvm-svn: 285411
r276653 suppressed the pragma once warning when generating a PCH file.
This patch extends that to any main file for which clang is told (with
the -x option) that it's a header file. It will also suppress the
warning "#include_next in primary source file".
Differential Revision: http://reviews.llvm.org/D25989
llvm-svn: 285295
Summary:
SetVector already used DenseSet, but SmallSetVector used std::set. This
leads to surprising performance differences. Moreover, it means that
the set of key types accepted by SetVector and SmallSetVector are
quite different!
In order to make this change, we had to convert some callsites that used
SmallSetVector<std::string, N> to use SmallSetVector<CachedHashString, N>
instead.
Reviewers: timshen
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25648
llvm-svn: 284887
Reapply r283827 by fixing the tests to not be target specific
Currently, driver level warnings do not show option names (e.g. warning:
complain about foo [-Woption-name]) in a diagnostic unless
-fdiagnostics-show-option is explictly specified. OTOH, the driver by
default turn this option on for CC1. Change the logic to show option
names by default in the driver as well.
Differential Revision: https://reviews.llvm.org/D24516
rdar://problem/27300909
llvm-svn: 283913
The backend now has the capability to save information from optimizations, the
same information that can be used to generate optimization diagnostics but in
machine-consumable form, into an output file. This can be enabled when using
opt (see r282539), and this change enables it when using clang. The idea is
that other tools will be able to consume these files, and perhaps in
combination with the original source code, produce various kinds of
optimization reports for users (and for compiler developers).
We now have at-least two tools that can consume these files:
* tools/llvm-opt-report
* utils/opt-viewer
Using the flag -fsave-optimization-record will cause the YAML file to be
generated; the file name will be based on the output file name (if we're using
-c or -S and have an output name), or the input file name. When we're using
CUDA, or some other offloading mechanism, separate files are generated for each
backend target. The output file name can be specified by the user using
-foptimization-record-file=filename.
Differential Revision: https://reviews.llvm.org/D25225
llvm-svn: 283834
Currently, driver level warnings do not show option names (e.g. warning:
complain about foo [-Woption-name]) in a diagnostic unless
-fdiagnostics-show-option is explictly specified. OTOH, the driver by
default turn this option on for CC1. Change the logic to show option
names by default in the driver as well.
Differential Revision: https://reviews.llvm.org/D24516
rdar://problem/27300909
llvm-svn: 283827
Summary: This matches the idiom we use for our other CUDA wrapper headers.
Reviewers: tra
Subscribers: beanz, mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D24978
llvm-svn: 283679
Summary:
Also makes -fcoroutines_ts to be both a Driver and CC1 flag.
Patch mostly by EricWF.
Reviewers: rnk, cfe-commits, rsmith, EricWF
Subscribers: mehdi_amini
Differential Revision: https://reviews.llvm.org/D25130
llvm-svn: 283064
assume that ::operator new provides no more alignment than is necessary for any
primitive type, except when we're on a GNU OS, where glibc's malloc guarantees
to provide 64-bit alignment on 32-bit systems and 128-bit alignment on 64-bit
systems. This can be controlled by the command-line -fnew-alignment flag.
llvm-svn: 282974
This option behaves in a similar spirit as -save-temps and writes
internal llvm statistics in json format to a file.
Differential Revision: https://reviews.llvm.org/D24820
llvm-svn: 282426
Clang has the default FP contraction setting of “-ffp-contract=on”, which
doesn't really mean “on” in the conventional sense of the word, but rather
really means “according to the per-statement effective value of the relevant
pragma”.
Before this patch, Clang has that pragma defaulting to “off”. Since the
“-ffp-contract=on” mode is really an AND of two booleans and the second of them
defaults to “off”, the whole thing effectively defaults to “off”. This patch
changes the default value of the pragma to “on”, thus making the default pair of
booleans (on, on) rather than (on, off). This makes FP optimization slightly
more aggressive than before when not using either “-Ofast”, “-ffast-math”, or
“-ffp-contract=fast”. Even with this patch the compiler still respects
“-ffp-contract=off”.
As per a suggestion by Steve Canon, the added code does _not_ require “-O3” or
higher. This is so as to try our best to preserve identical floating-point
results for unchanged source code compiling for an unchanged target when only
changing from any optimization level in the set (“-O0”, “-O1”, “-O2”, “-O3”) to
any other optimization level in that set. “-Os” and “-Oz” seem to be behaving
identically, i.e. should probably be considered a part of the aforementioned
set, but I have not reviewed this rigorously. “-Ofast” is explicitly _not_ a
member of that set.
Patch authored by Abe Skolnik [a.skolnik@samsung.com] and Stephen Canon [scanon@apple.com].
Differential Revision: https://reviews.llvm.org/D24481
llvm-svn: 282259
Original commit message:
Add -fdiagnostics-show-hotness
Summary:
I've recently added the ability for optimization remarks to include the
hotness of the corresponding code region. This uses PGO and allows
filtering of the optimization remarks by relevance. The idea was first
discussed here:
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334
The general goal is to produce a YAML file with the remarks. Then, an
external tool could dynamically filter these by hotness and perhaps by
other things.
That said it makes sense to also expose this at the more basic level
where we just include the hotness info with each optimization remark.
For example, in D22694, the clang flag was pretty useful to measure the
overhead of the additional analyses required to include hotness.
(Without the flag we don't even run the analyses.)
For the record, Hal has already expressed support for the idea of this
patch on IRC.
Differential Revision: https://reviews.llvm.org/D23284
llvm-svn: 281293
Summary:
I've recently added the ability for optimization remarks to include the
hotness of the corresponding code region. This uses PGO and allows
filtering of the optimization remarks by relevance. The idea was first
discussed here:
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334
The general goal is to produce a YAML file with the remarks. Then, an
external tool could dynamically filter these by hotness and perhaps by
other things.
That said it makes sense to also expose this at the more basic level
where we just include the hotness info with each optimization remark.
For example, in D22694, the clang flag was pretty useful to measure the
overhead of the additional analyses required to include hotness.
(Without the flag we don't even run the analyses.)
For the record, Hal has already expressed support for the idea of this
patch on IRC.
Differential Revision: https://reviews.llvm.org/D23284
llvm-svn: 281276
-fprofile-dir=path allows the user to specify where .gcda files should be
emitted when the program is run. In particular, this is the first flag that
causes the .gcno and .o files to have different paths, LLVM is extended to
support this. -fprofile-dir= does not change the file name in the .gcno (and
thus where lcov looks for the source) but it does change the name in the .gcda
(and thus where the runtime library writes the .gcda file). It's different from
a GCOV_PREFIX because a user can observe that the GCOV_PREFIX_STRIP will strip
paths off of -fprofile-dir= but not off of a supplied GCOV_PREFIX.
To implement this we split -coverage-file into -coverage-data-file and
-coverage-notes-file to specify the two different names. The !llvm.gcov
metadata node grows from a 2-element form {string coverage-file, node dbg.cu}
to 3-elements, {string coverage-notes-file, string coverage-data-file, node
dbg.cu}. In the 3-element form, the file name is already "mangled" with
.gcno/.gcda suffixes, while the 2-element form left that to the middle end
pass.
llvm-svn: 280306
r280133. Original commit message:
C++ Modules TS: driver support for building modules.
This works as follows: we add --precompile to the existing gamut of options for
specifying how far to go when compiling an input (-E, -c, -S, etc.). This flag
specifies that an input is taken to the precompilation step and no further, and
this can be specified when building a .pcm from a module interface or when
building a .pch from a header file.
The .cppm extension (and some related extensions) are implicitly recognized as
C++ module interface files. If --precompile is /not/ specified, the file is
compiled (via a .pcm) to a .o file containing the code for the module (and then
potentially also assembled and linked, if -S, -c, etc. are not specified). We
do not yet suppress the emission of object code for other users of the module
interface, so for now this will only work if everything in the .cppm file has
vague linkage.
As with the existing support for module-map modules, prebuilt modules can be
provided as compiler inputs either via the -fmodule-file= command-line argument
or via files named ModuleName.pcm in one of the directories specified via
-fprebuilt-module-path=.
This also exposes the -fmodules-ts cc1 flag in the driver. This is still
experimental, and in particular, the concrete syntax is subject to change as
the Modules TS evolves in the C++ committee. Unlike -fmodules, this flag does
not enable support for implicitly loading module maps nor building modules via
the module cache, but those features can be turned on separately and used in
conjunction with the Modules TS support.
llvm-svn: 280134
to CC1, which are translated to function attributes and can e.g. be mapped on
build attributes FP_exceptions and FP_denormal. Setting these build attributes
allows better selection of floating point libraries.
Differential Revision: https://reviews.llvm.org/D23840
llvm-svn: 280064
This works as follows: we add --precompile to the existing gamut of options for
specifying how far to go when compiling an input (-E, -c, -S, etc.). This flag
specifies that an input is taken to the precompilation step and no further, and
this can be specified when building a .pcm from a module interface or when
building a .pch from a header file.
The .cppm extension (and some related extensions) are implicitly recognized as
C++ module interface files. If --precompile is /not/ specified, the file is
compiled (via a .pcm) to a .o file containing the code for the module (and then
potentially also assembled and linked, if -S, -c, etc. are not specified). We
do not yet suppress the emission of object code for other users of the module
interface, so for now this will only work if everything in the .cppm file has
vague linkage.
As with the existing support for module-map modules, prebuilt modules can be
provided as compiler inputs either via the -fmodule-file= command-line argument
or via files named ModuleName.pcm in one of the directories specified via
-fprebuilt-module-path=.
This also exposes the -fmodules-ts cc1 flag in the driver. This is still
experimental, and in particular, the concrete syntax is subject to change as
the Modules TS evolves in the C++ committee. Unlike -fmodules, this flag does
not enable support for implicitly loading module maps nor building modules via
the module cache, but those features can be turned on separately and used in
conjunction with the Modules TS support.
llvm-svn: 280035
In cases where .dwo/.dwp files are guaranteed to be available, skipping
the extra online (in the .o file) inline info can save a substantial
amount of space - see the original r221306 for more details there.
llvm-svn: 279651
In this mode, there is no need to load any module map and the programmer can
simply use "@import" syntax to load the module directly from a prebuilt
module path. When loading from prebuilt module path, we don't support
rebuilding of the module files and we ignore compatible configuration
mismatches.
rdar://27290316
Differential Revision: http://reviews.llvm.org/D23125
llvm-svn: 279096
standard's Annex B. We now attempt to increase the process's stack rlimit to
8MiB on startup, which appears to be enough to allow this to work reliably.
(And if it turns out not to be, we can investigate increasing it further.)
llvm-svn: 278983
Let the driver pass the option to frontend. Do not set precision metadata for division instructions when this option is set. Set function attribute "correctly-rounded-divide-sqrt-fp-math" based on this option.
Differential Revision: https://reviews.llvm.org/D22940
llvm-svn: 278155
Adjust target features for amdgcn target when -cl-denorms-are-zero is set.
Denormal support is controlled by feature strings fp32-denormals fp64-denormals in amdgcn target. If -cl-denorms-are-zero is not set and the command line does not set fp32/64-denormals feature string, +fp32-denormals +fp64-denormals will be on for GPU's supporting them.
A new virtual function virtual void TargetInfo::adjustTargetOptions(const CodeGenOptions &CGOpts, TargetOptions &TargetOpts) const is introduced to allow adjusting target option by codegen option.
Differential Revision: https://reviews.llvm.org/D22815
llvm-svn: 278151
This patch adds a command line option to list the checkers that were enabled
by analyzer-checker and not disabled by -analyzer-disable-checker.
It can be very useful to debug long command lines when it is not immediately
apparent which checkers are turned on and which checkers are turned off.
Differential Revision: https://reviews.llvm.org/D23060
llvm-svn: 278006
Currently Clang use int32 to represent sampler_t, which have been a source of issue for some backends, because in some backends sampler_t cannot be represented by int32. They have to depend on kernel argument metadata and use IPA to find the sampler arguments and global variables and transform them to target specific sampler type.
This patch uses opaque pointer type opencl.sampler_t* for sampler_t. For each use of file-scope sampler variable, it generates a function call of __translate_sampler_initializer. For each initialization of function-scope sampler variable, it generates a function call of __translate_sampler_initializer.
Each builtin library can implement its own __translate_sampler_initializer(). Since the real sampler type tends to be architecture dependent, allowing it to be initialized by a library function simplifies backend design. A typical implementation of __translate_sampler_initializer could be a table lookup of real sampler literal values. Since its argument is always a literal, the returned pointer is known at compile time and easily optimized to finally become some literal values directly put into image read instructions.
This patch is partially based on Alexey Sotkin's work in Khronos Clang (3d4eec6162).
Differential Revision: https://reviews.llvm.org/D21567
llvm-svn: 277024
With PCH+Module, sometimes compiler gives a hard error:
Module file ‘<some-file path>.pcm' is out of date and needs to be rebuilt
This happens when we have a pch importing a module and the module gets
overwritten by another compiler instance after we build the pch (one example is
that both compiler instances hash to the same pcm file but use different
diagnostic options). When we try to load the pch later on, the compiler notices
that the imported module is out of date (modification date, size do not match)
but it can't handle this out of date pcm (i.e it does not know how to rebuild
the pch).
This commit introduces a new command line option so for PCH + module, we can
turn on this option and if two compiler instances only differ in diagnostic
options, the latter instance will not invalidate the original pcm.
rdar://26675801
Differential Revision: http://reviews.llvm.org/D22773
llvm-svn: 276769
-fxray-instrument: enables XRay annotation of IR
-fxray-instruction-threshold: configures the threshold for function size (looking at IR instructions), and allow LLVM to decide whether to add the nop sleds later on in the process.
Also implements the related xray_always_instrument and xray_never_instrument function attributes.
Patch by Dean Michael Berris.
llvm-svn: 275330
Also fixes strict-aliasing option to only be allowed when OpenCL Version 1.0. Added testcase in test/Frontend/opencl-blocks.cl.
Patch by Aaron En Ye Shi.
Differential Revision: http://reviews.llvm.org/D22170
llvm-svn: 275318
This is to allow distributed build systems, that do not preserve time stamps, to use PCH files.
Second and last part of the patch proposed at:
Differential Revision: http://reviews.llvm.org/D20867
llvm-svn: 275267
Add OCL option -cl-no-signed-zeros to driver options.
Also added to opencl.cl testcases.
Patch by Aaron En Ye Shi.
Differential Revision: http://reviews.llvm.org/D22067
llvm-svn: 274923
Summary: This patch changes the options used by offloading to start with -fopenmp instead of -fomp. This makes the option naming more consistent and materializes a suggestion by Richard Smith in http://reviews.llvm.org/D9888.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, ABataev
Subscribers: kkwli0, cfe-commits, caomhin
Differential Revision: http://reviews.llvm.org/D21841
llvm-svn: 274283
Allow -cl-std and other standard -cl- options from cc1 to driver.
Added a test for the options moved.
Patch by Aaron En Ye Shi.
Differential Revision: http://reviews.llvm.org/D21031
llvm-svn: 274150
The PIC and PIE levels are not independent. In fact, if PIE is defined
it is always the same as PIC.
This is clear in the driver where ParsePICArgs returns a PIC level and
a IsPIE boolean. Unfortunately that is currently lost and we pass two
redundant levels down the pipeline.
This patch keeps a bool and a PIC level all the way down to codegen.
llvm-svn: 273566
Add support for /Ob1 (and equivalent -finline-hint-functions), which enable
inlining only for functions marked inline, either explicitly (via inline
keyword, for example), or implicitly (function definition in class body,
for example).
This works by enabling inlining pass, and adding noinline attribute to
every function not marked inline.
Patch by Rudy Pons <rudy.pons@ilod.org>!
Differential Revision: http://reviews.llvm.org/D20647
llvm-svn: 273440
Include opencl-c.h by default as a module to utilize the automatic AST caching mechanism of clang modules.
Add an option -finclude-default-header to enable default header for OpenCL, which is off by default.
Differential Revision: http://reviews.llvm.org/D20444
llvm-svn: 273191
Fix a regression which forbids using -std=cl|CL1.1|CL1.2|CL2.0 in driver.
Allow -std and -cl-std={cl|CL}{|1.1|1.2|2.0}.
Differential Revision: http://reviews.llvm.org/D20630
llvm-svn: 273015
Summary:
Create a new Frontend LangOpt to specify the renderscript language. It
is enabled by the "-x renderscript" option from the driver.
Add a "kernel" function attribute only for RenderScript (an "ignored
attribute" warning is generated otherwise).
Make the NativeHalfType and NativeHalfArgsAndReturns LangOpts be implied
by the RenderScript LangOpt.
Reviewers: rsmith
Subscribers: cfe-commits, srhines
Differential Revision: http://reviews.llvm.org/D21198
llvm-svn: 272342
Diagnostics that happen during driver time do not have color output support
unless -fcolor-diagonostic is explicitly passed into the driver. This is not a
problem for cc1 since dianostic arguments are properly handled and color is
enabled by default if the terminal supports it.
Make the driver behave like CC1. There are tests that already check for these
flags, but for the color itself there's no sensible way to test it.
Differential Revision: http://reviews.llvm.org/D20404
rdar://problem/26290980
llvm-svn: 271042
OpenMP version.
If '-fopenmp' option is provided '-fopenmp-version=' allows to control,
which version of OpenMP must be supported. Currently it affects only the
value of _OPENMP define.
llvm-svn: 270838
-finline-functions and /Ob2 are currently ignored by Clang. The only way to
enable inlining is to use the global O flags, which also enable other options,
or to emit LLVM bitcode using Clang, then running opt by hand with the inline
pass.
This patch allows to simply use the -finline-functions flag (same as GCC) or
/Ob2 in clang-cl mode to enable inlining without other optimizations.
This is the first patch of a serie to improve support for the /Ob flags.
Patch by Rudy Pons <rudy.pons@ilod.org>!
Differential Revision: http://reviews.llvm.org/D20576
llvm-svn: 270609
This matches default nvcc behavior and gives substantial
performance boost on GPU where fmad is much cheaper compared to add+mul.
Differential Revision: http://reviews.llvm.org/D20341
llvm-svn: 270094
This reverts commit r269492 as the corresponding LLVM commit was
reverted due to lots of warnings. See the review thread for the original
LLVM commit (r269491) for details.
llvm-svn: 269549
Summary:
This option allows the user to control how much of the file name is
emitted by UBSan. Tuning this option allows one to save space in the
resulting binary, which is helpful for restricted execution
environments.
With a positive N, UBSan skips the first N path components.
With a negative N, UBSan only keeps the last N path components.
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D19666
llvm-svn: 269309
Summary:
Teach clang to embed bitcode inside bitcode. When -fembed-bitcode cc1
option is used, clang will embed both the input bitcode and cc1
commandline into the bitcode in special sections before compiling to
the object file. Using -fembed-bitcode-marker will only introduce a
marker in both sections.
Depends on D17390
Reviewers: rsmith
Subscribers: yaron.keren, vsk, cfe-commits
Differential Revision: http://reviews.llvm.org/D17392
llvm-svn: 269202
From the GCC manpage:
-I dir
... If dir begins with =, then the = will be replaced by the sysroot prefix;
see --sysroot and -isysroot.
Add support to expand the `=` as a prefix of the include path with the sysroot
if specified. `-isysroot` takes precedence over `--sysroot` as the normal
argument behaviour occurs. The ordering of the `-isysroot` is relevant to the
path substituted. If no `--sysroot=` or `-isysroot` option is present, the = is
not expanded.
Resolves PR26965!
llvm-svn: 268777
This option, like the corresponding gcc option, is SystemZ-specific and
enables storing frame backchain links, as specified in the ABI.
Differential Revision: http://reviews.llvm.org/D19891
llvm-svn: 268575
Let the loop unroll pass handle /Os. It already checks that option and adjust its thresholds accordingly. Also, will allow the #pragma unroll to have an effect in /Os.
Differential Revision: http://reviews.llvm.org/D19827
llvm-svn: 268509
The -malign-double flag causes i64 and f64 types to have alignment 8
instead of 4. On x86-64, the behavior of -malign-double is enabled by default.
Rebases and cleans phosek's work here: http://reviews.llvm.org/D12860
Patch by Sean Klein
Reviewers: rnk
Subscribers: rnk, jfb, dschuff, phosek
Differential Revision: http://reviews.llvm.org/D19734
llvm-svn: 268473
Summary:
Host and device types must match, otherwise when we pass values back and
forth between the host and device, we will get the wrong result.
This patch makes NVPTXTargetInfo inherit most of its type information
from the host's target info.
Reviewers: rsmith
Subscribers: cfe-commits, jhen, tra
Differential Revision: http://reviews.llvm.org/D19346
llvm-svn: 268131
It makes compiler-rt tests fail if the gold plugin is enabled.
Revert "Rework interface for bitset-using features to use a notion of LTO visibility."
Revert "Driver: only produce CFI -fvisibility= error when compiling."
Revert "clang/test/CodeGenCXX/cfi-blacklist.cpp: Exclude ms targets. They would be non-cfi."
llvm-svn: 267871
Bitsets, and the compiler features they rely on (vtable opt, CFI),
only have visibility within the LTO'd part of the linkage unit. Therefore,
only enable these features for classes with hidden LTO visibility. This
notion is based on object file visibility or (on Windows)
dllimport/dllexport attributes.
We provide the [[clang::lto_visibility_public]] attribute to override the
compiler's LTO visibility inference in cases where the class is defined
in the non-LTO'd part of the linkage unit, or where the ABI supports
calling classes derived from abstract base classes with hidden visibility
in other linkage units (e.g. COM on Windows).
If the cross-DSO CFI mode is enabled, bitset checks are emitted even for
classes with public LTO visibility, as that mode uses a separate mechanism
to cause bitsets to be exported.
This mechanism replaces the whole-program-vtables blacklist, so remove the
-fwhole-program-vtables-blacklist flag.
Because __declspec(uuid()) now implies [[clang::lto_visibility_public]], the
support for the special attr:uuid blacklist entry is removed.
Differential Revision: http://reviews.llvm.org/D18635
llvm-svn: 267784
OpenCL spec requires __OPENCL_C_VERSION__ to be defined based on -cl-std option. This patch implements that.
The patch also defines __FAST_RELAXED_MATH__ based on -cl-fast-relaxed-math option.
Also fixed a test using -std=c99 for OpenCL program. Limit allowed language standard of OpenCL to be OpenCL standards.
Differential Revision: http://reviews.llvm.org/D19071
llvm-svn: 267590
It is possible to argue that the EABIVersion field is similar in spirit to the
ABI field in TargetOptions. It represents the embedded ABI that the target
follows. This will allow us to thread this information into the target
information construction.
llvm-svn: 265807
CodeGen-level implementation. Instead of adding an attribute to clang's
FunctionDecl, add the IR attribute directly. This means a module built with
this flag is now compatible with code built without it and vice versa.
This change also results in the 'noalias' attribute no longer being added to
calls to operator new in the IR; it's now only added to the declaration. It
also fixes a bug where we failed to add the attribute to the 'nothrow' versions
(because we didn't implicitly declare them, there was no good time to inject a
fake attribute).
llvm-svn: 265728
Summary:
Setting this flag causes all functions are annotated with the
"nvvm-f32ftz" = "true" attribute.
In addition, we annotate the module with "nvvm-reflect-ftz" set
to 0 or 1, depending on whether -cuda-flush-denormals-to-zero is set.
This is read by the NVVMReflect pass.
Reviewers: tra, rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18671
llvm-svn: 265435
Add no-jump-tables flag to disable use of jump tables when lowering
switch statements
Reviewers: echristo, hans
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18407
llvm-svn: 265425
With this patch, by a constexpr function is implicitly host+device
unless:
a) it's a variadic function (variadic functions are not allowed on the
device side), or
b) it's preceeded by a __device__ overload in a system header.
The restriction on overloading __host__ __device__ functions on the
basis of their CUDA attributes remains in place, but we use (b) to allow
us to define __device__ overloads for constexpr functions in cmath,
which would otherwise be __host__ __device__ and thus not overloadable.
You can disable this behavior with -fno-cuda-host-device-constexpr.
Reviewers: tra, rnk, rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18380
llvm-svn: 264964
Summary:
* -fcuda-target-overloads
Previously unconditionally set to true by the driver. Necessary for
correct functioning of the compiler -- our CUDA headers wrapper won't
compile without this.
* -fcuda-disable-target-call-checks
Previously unconditionally set to true by the driver. Necessary to
compile almost any external CUDA code -- almost all libraries assume
that host+device code can call host or device functions.
* -fcuda-allow-host-calls-from-host-device
No effect when target overloading is enabled.
Reviewers: tra
Subscribers: rsmith, cfe-commits
Differential Revision: http://reviews.llvm.org/D18416
llvm-svn: 264739
include altivec.h has come and gone.
Rationale: This causes modules, rewrite-includes, etc to be sad and
people should just include altivec.h in their source.
llvm-svn: 264235
This reverts commit r263607.
This change caused more objc_retain/objc_release calls in the IR but those
are then incorrectly optimized by the ARC optimizer. Work is going to have
to be done to ensure the ARC optimizer doesn't optimize user written RR, but
that should land before this change.
This change will also need to be updated to take account for any changes required
to ensure that user written calls to RR are distinct from those inserted by ARC.
llvm-svn: 263984
This reapplies r261552 and r263748. Fixed testcase to reapply.
The VFS overlay mapping between virtual paths and real paths is done through
the 'external-contents' entries in YAML files, which contains hardcoded paths
to the real files.
When a module compilation crashes, headers are dumped into <name>.cache/vfs
directory and are mapped via the <name>.cache/vfs/vfs.yaml. The script
generated for reproduction uses -ivfsoverlay pointing to file to gather the
mapping between virtual paths and files inside <name>.cache/vfs. Currently, we
are only capable of reproducing such crashes in the same machine as they
happen, because of the hardcoded paths in 'external-contents'.
To be able to reproduce a crash in another machine, this patch introduces a new
option in the VFS yaml file called 'overlay-relative'. When it's equal to
'true' it means that the provided path to the YAML file through the
-ivfsoverlay option should also be used to prefix the final path for every
'external-contents'.
Example, given the invocation snippet "... -ivfsoverlay
<name>.cache/vfs/vfs.yaml" and the following entry in the yaml file:
"overlay-relative": "true",
"roots": [
...
"type": "directory",
"name": "/usr/include",
"contents": [
{
"type": "file",
"name": "stdio.h",
"external-contents": "/usr/include/stdio.h"
},
...
Here, a file manager request for virtual "/usr/include/stdio.h", that will map
into real path "/<absolute_path_to>/<name>.cache/vfs/usr/include/stdio.h.
This is a useful feature for debugging module crashes in machines other than
the one where the error happened.
Differential Revision: http://reviews.llvm.org/D17457
rdar://problem/24499339
llvm-svn: 263893
Summary:
r246764 handled __fp16 arguments and returns for AAPCS, but skipped this
handling for OpenCL. Simlar to OpenCL, RenderScript also handles __fp16
type natively.
This patch adds the -fnative-half-arguments-and-returns command line
flag to allow such languages to skip this coercion of __fp16.
Reviewers: srhines, olista01
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18138
llvm-svn: 263795
This reapplies r261552.
The VFS overlay mapping between virtual paths and real paths is done through
the 'external-contents' entries in YAML files, which contains hardcoded paths
to the real files.
When a module compilation crashes, headers are dumped into <name>.cache/vfs
directory and are mapped via the <name>.cache/vfs/vfs.yaml. The script
generated for reproduction uses -ivfsoverlay pointing to file to gather the
mapping between virtual paths and files inside <name>.cache/vfs. Currently, we
are only capable of reproducing such crashes in the same machine as they
happen, because of the hardcoded paths in 'external-contents'.
To be able to reproduce a crash in another machine, this patch introduces a new
option in the VFS yaml file called 'overlay-relative'. When it's equal to
'true' it means that the provided path to the YAML file through the
-ivfsoverlay option should also be used to prefix the final path for every
'external-contents'.
Example, given the invocation snippet "... -ivfsoverlay
<name>.cache/vfs/vfs.yaml" and the following entry in the yaml file:
"overlay-relative": "true",
"roots": [
...
"type": "directory",
"name": "/usr/include",
"contents": [
{
"type": "file",
"name": "stdio.h",
"external-contents": "/usr/include/stdio.h"
},
...
Here, a file manager request for virtual "/usr/include/stdio.h", that will map
into real path "/<absolute_path_to>/<name>.cache/vfs/usr/include/stdio.h.
This is a useful feature for debugging module crashes in machines other than
the one where the error happened.
Differential Revision: http://reviews.llvm.org/D17457
rdar://problem/24499339
llvm-svn: 263748
It is faster to directly call the ObjC runtime for methods such as retain/release instead of sending a message to those functions.
This patch adds support for converting messages to retain/release/alloc/autorelease to their equivalent runtime calls.
Tests included for the positive case of applying this transformation, negative tests that we ensure we only convert "alloc" to objc_alloc, not "alloc2", and also a driver test to ensure we enable this only for supported runtime versions.
Reviewed by John McCall.
Differential Revision: http://reviews.llvm.org/D14737
llvm-svn: 263607
Currently when an AST plugin is loaded it must then be enabled by passing
-plugin pluginname or -add-plugin pluginname to the -cc1 command line. This
patch adds a method to PluginASTAction which allows it to declare that the
action happens before, instead of, or after the main AST action, plus the
relevant changes to make the plugin action happen at that time automatically.
Differential Revision: http://reviews.llvm.org/D17959
llvm-svn: 263546
This is the companion to an LLVM patch that renamed the function index
data structures and files to use the more general module summary index.
(Recommit after fixing LLVM side to add back missed file)
llvm-svn: 263514
This is the companion to an LLVM patch that renamed the function index
data structures and files to use the more general module summary index.
llvm-svn: 263491
Summary:
The current offloading implementation is using -omptargets and -omp-host-ir-file-path options in the frontend. This causes the user a lot of trouble due to to the conflicts with the -o option. E.g. if the user misspells omptargets he will end up with a file with a weird name.
This patches replaces these two options with -fomptargets and -fomp-host-ir-file-path to avoid these issues, and it is also more consistent with the other options like -fopenmp.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: cfe-commits, caomhin, fraggamuffin
Differential Revision: http://reviews.llvm.org/D18112
llvm-svn: 263442
ASan also relies on names on allocas and will emit unhelpful output if
they're not present. Just force-enable value names for now. Should
unbreak release builds of asan.
llvm-svn: 263429
trying to track origins in the memory sanitizer since the backend
instrumentation pass currently takes names from the Instruction.
Fixes all of the origin tracking tests in compiler-rt after the
-discard-value-name option was added.
llvm-svn: 263412
Summary:
This flag is enabled by default in the driver when NDEBUG is set. It
is forwarded on the LLVMContext to discard all value names (but
GlobalValue) for performance purpose.
This an improved version of D18024
Reviewers: echristo, chandlerc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18127
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263394
This patch changes cc1 option for PGO profile use from
-fprofile-instr-use=<path> to -fprofile-instrument-use-path=<path>.
-fprofile-instr-use=<path> is now a driver only option.
In addition to decouple the cc1 option from the driver level option, this patch
also enables IR level profile use. cc1 option handling now reads the profile
header and sets CodeGenOpt ProfileUse (valid values are {None, Clang, LLVM}
-- this is a common enum for -fprofile-instrument={}, for the profile
instrumentation), and invoke the pipeline to enable the respective PGO use pass.
Reviewers: silvas, davidxl
Differential Revision: http://reviews.llvm.org/D17737
llvm-svn: 262515
In the gcc precompiled header model, one explicitly runs clang with `-x
c++-header` on a .h file to produce a gch file, and then includes the header
with `-include foo.h` and if a .gch file exists for that header it gets used.
This is documented at
http://clang.llvm.org/docs/UsersManual.html#precompiled-headers
cl.exe's model is fairly different, and controlled by the two flags /Yc and
/Yu. A pch file is generated as a side effect of a regular compilation when
/Ycheader.h is passed. While the compilation is running, the compiler keeps
track of #include lines in the main translation unit and writes everything up
to an `#include "header.h"` line into a pch file. Conversely, /Yuheader.h tells
the compiler to skip all code in the main TU up to and including `#include
"header.h"` and instead load header.pch. (It's also possible to use /Yc and /Yu
without an argument, in that case a `#pragma hrdstop` takes the role of
controlling the point where pch ends and real code begins.)
This patch implements limited support for this in that it requires the pch
header to be passed as a /FI force include flag – with this restriction,
it can be implemented almost completely in the driver with fairly small amounts
of code. For /Yu, this is trivial, and for /Yc a separate pch action is added
that runs before the actual compilation. After r261774, the first failing
command makes a compilation stop – this means if the pch fails to build the
main compilation won't run, which is what we want. However, in /fallback builds
we need to run the main compilation even if the pch build fails so that the
main compilation's fallback can run. To achieve this, add a ForceSuccessCommand
that pretends that the pch build always succeeded in /fallback builds (the main
compilation will then fail to open the pch and run the fallback cl.exe
invocation).
If /Yc /Yu are used in a setup that clang-cl doesn't implement yet, clang-cl
will now emit a "not implemented yet; flag ignored" warning that can be
disabled using -Wno-clang-cl-pch.
Since clang-cl doesn't yet serialize some important things (most notably
`pragma comment(lib, ...)`, this feature is disabled by default and only
enabled by an internal driver flag. Once it's more stable, this internal flag
will disappear.
(The default stdafx.h setup passes stdafx.h as explicit argument to /Yc but not
as /FI – instead every single TU has to `#include <stdafx.h>` as first thing it
does. Implementing support for this should be possible with the approach in
this patch with minimal frontend changes by passing a --stop-at / --start-at
flag from the driver to the frontend. This is left for a follow-up. I don't
think we ever want to support `#pragma hdrstop`, and supporting it with this
approach isn't easy: This approach relies on the driver knowing the pch
filename in advance, and `#pragma hdrstop(out.pch)` can set the output
filename, so the driver can't know about it in advance.)
clang-cl now also honors /Fp and puts pch files in the same spot that cl.exe
would put them, but the pch file format is of course incompatible. This has
ramifications on /fallback, so /Yc /Yu aren't passed through to cl.exe in
/fallback builds.
http://reviews.llvm.org/D17695
llvm-svn: 262420
This patch expands cc1 option -fprofile-instrument= with a new value: -fprofile-instrument=llvm
which enables IR level PGO instrumentation.
Reviewers: davidxl, silvas
Differential Revision: http://reviews.llvm.org/D17622
llvm-svn: 262239
This patch introduces the -fwhole-program-vtables flag, which enables the
whole-program vtable optimization feature (D16795) in Clang.
Differential Revision: http://reviews.llvm.org/D16821
llvm-svn: 261767
The VFS overlay mapping between virtual paths and real paths is done through
the 'external-contents' entries in YAML files, which contains hardcoded paths
to the real files.
When a module compilation crashes, headers are dumped into <name>.cache/vfs
directory and are mapped via the <name>.cache/vfs/vfs.yaml. The script
generated for reproduction uses -ivfsoverlay pointing to file to gather the
mapping between virtual paths and files inside <name>.cache/vfs. Currently, we
are only capable of reproducing such crashes in the same machine as they
happen, because of the hardcoded paths in 'external-contents'.
To be able to reproduce a crash in another machine, this patch introduces a new
option in the VFS yaml file called 'overlay-relative'. When it's equal to
'true' it means that the provided path to the YAML file through the
-ivfsoverlay option should also be used to prefix the final path for every
'external-contents'.
Example, given the invocation snippet "... -ivfsoverlay
<name>.cache/vfs/vfs.yaml" and the following entry in the yaml file:
"overlay-relative": "true",
"roots": [
...
"type": "directory",
"name": "/usr/include",
"contents": [
{
"type": "file",
"name": "stdio.h",
"external-contents": "/usr/include/stdio.h"
},
...
Here, a file manager request for virtual "/usr/include/stdio.h", that will map
into real path "/<absolute_path_to>/<name>.cache/vfs/usr/include/stdio.h.
This is a useful feature for debugging module crashes in machines other than
the one where the error happened.
Differential Revision: http://reviews.llvm.org/D17457
rdar://problem/24499339
llvm-svn: 261552
option. Previously these options could both be used to specify that you were
compiling the implementation file of a module, with a different set of minor
bugs in each case.
This change removes -fmodule-implementation-of, and instead tracks a flag to
determine whether we're currently building a module. -fmodule-name now behaves
the same way that -fmodule-implementation-of previously did.
llvm-svn: 261372
Summary:
Different devices may in some cases require different code generation schemes in order to implement OpenMP. This is required not only for performance reasons, but also because it may not be possible to have the current (default) implementation working for these devices. E.g. GPU's cannot implement the same scheme a target such as powerpc or x86b would use, in the sense that it does not have the ability to fork threads, instead all the threads are always executing and need to be managed by the implementation.
This patch proposes a reorganization of the code in the OpenMP code generation to pave the way to have specialized implementation of OpenMP support. More than a "real" patch this is more a request for comments in order to understand if what is proposed is acceptable or if there are better/easier ways to do it.
In this patch part of the common OpenMP codegen infrastructure is moved to a new file under a new namespace (CGOpenMPCommon) so it can be shared between the default implementation and the specialized one. When CGOpenMPRuntime is created, an attempt to select a specialized implementation is done.
In the patch a specialization for nvptx targets is done which currently checks if the target is an OpenMP device and trap if it is not.
Let me know comments suggestions you may have.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: Hahnfeld, cfe-commits, fraggamuffin, caomhin, jholewinski
Differential Revision: http://reviews.llvm.org/D16784
llvm-svn: 259977
This patch changes cc1 option -fprofile-instr-generate to an enum option
-fprofile-instrument={clang|none}. It also changes cc1 options
-fprofile-instr-generate= to -fprofile-instrument-path=.
The driver level option -fprofile-instr-generate and -fprofile-instr-generate=
remain intact. This change will pave the way to integrate new PGO
instrumentation in IR level.
Review: http://reviews.llvm.org/D16730
llvm-svn: 259811
Summary:
Turns out the variadic function checking added in r258643 was too strict
for some existing users; give them an escape valve. When
-fcuda-allow-variadic-functions is passed, the front-end makes no
attempt to disallow C-style variadic functions. Calls to va_arg are
still not allowed.
Reviewers: tra
Subscribers: cfe-commits, jhen, echristo, bkramer
Differential Revision: http://reviews.llvm.org/D16559
llvm-svn: 258822
This is part of a new statistics gathering feature for the sanitizers.
See clang/docs/SanitizerStats.rst for further info and docs.
Differential Revision: http://reviews.llvm.org/D16175
llvm-svn: 257971
of the file name. This is consistent with how other HeaderSearchOptions
are handled.
Due to the other inputs of the module hash (revision number) this is not
really testable in a meaningful way.
llvm-svn: 257520
This patch attempts to fix the regressions identified when the patch was committed initially.
Thanks to Michael Liao for identifying the fix in the offloading metadata generation
related with side effects in evaluation of function arguments.
llvm-svn: 256933
Summary:
In order to offloading work properly two things need to be in place:
- a descriptor with all the offloading information (device entry functions, and global variable) has to be created by the host and registered in the OpenMP offloading runtime library.
- all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point with the actual function in the device.
This patch adds support for these two things. However, only entry functions are being registered given that 'declare target' directive is not yet implemented.
About offloading descriptor:
The details of the descriptor are explained with more detail in http://goo.gl/L1rnKJ. Basically the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to a the begin and end of table whose entries contain information about a specific entry point. Each entry has the type:
```
struct __tgt_offload_entry{
void *addr;
char *name;
int64_t size;
};
```
and will be implemented in a pre determined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in same index of the table, and efficiently implement the mapping.
The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high priority global initializer so that the registration happens always before any initializer (that can potentially include target regions) is run.
The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple.
About target codegen:
The target codegen is pretty much straightforward as it reuses completely the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check to see if it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information in checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related with the metadata. The metadata looks like this:
```
!omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6}
!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5}
!2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
!6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2}
```
The fields in each metadata entry are (in sequence):
Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region".
Entry 2) a unique ID of the device where the input source file that contain the target region lives.
Entry 3) a unique ID of the file where the input source file that contain the target region lives.
Entry 4) a mangled name of the function that encloses the target region.
Entries 5) and 6) line and column number where the target region was found.
Entry 7) is the order the entry was emitted.
Entry 2) and 3) are required to distinguish files that have the same function name.
Entry 4) is required to distinguish different instances of the same declaration (usually templated ones)
Entries 5) and 6) are required to distinguish the particular target region in body of the function (it is possible that a given target region is not an entry point - if clause can evaluate always to zero - and therefore we need to identify the "interesting" target regions. )
This patch replaces http://reviews.llvm.org/D12306.
Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao
Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits
Differential Revision: http://reviews.llvm.org/D12614
llvm-svn: 256842
The /Brepro flag controls whether or not the compiler should embed
timestamps into the object file. Object files which do not embed
timestamps are not suitable for incremental linking but are suitable for
hermetic build systems and staged self-hosts of clang.
A normal clang spelling of this flag has been added,
-mincremental-linker-compatible.
llvm-svn: 256204
Reapplies r256063, except instead of frugally re-using an LLVM enum,
we define a Clang enum, to avoid exposing too much LLVM interface.
Differential Revision: http://reviews.llvm.org/D15650
llvm-svn: 256078
Clang-side cross-DSO CFI.
* Adds a command line flag -f[no-]sanitize-cfi-cross-dso.
* Links a runtime library when enabled.
* Emits __cfi_slowpath calls is bitset test fails.
* Emits extra hash-based bitsets for external CFI checks.
* Sets a module flag to enable __cfi_check generation during LTO.
This mode does not yet support diagnostics.
llvm-svn: 255694
For PS4, generate explicit import for anonymous namespaces and mark it by DW_AT_artificial attribute.
Differential Revision: http://reviews.llvm.org/D12624
llvm-svn: 255281
Summary:
Adds new option -fthinlto-index=<file> to invoke the LTO pipeline
along with function importing via clang using the supplied function
summary index file. This supports invoking the parallel ThinLTO
backend processes in a distributed build environment via clang.
Additionally, this causes the module linker to be invoked on the bitcode
file being compiled to perform any necessary promotion and renaming of
locals that are exported via the function summary index file.
Add a couple tests that confirm we get expected errors when we try to
use the new option on a file that isn't bitcode, or specify an invalid
index file. The tests also confirm that we trigger the expected function
import pass.
Depends on D15024
Reviewers: joker.eph, dexonsmith
Subscribers: joker.eph, davidxl, cfe-commits
Differential Revision: http://reviews.llvm.org/D15025
llvm-svn: 254927
This flag causes all files that were read by the compilation to be embedded
into a produced module file. This is useful for distributed build systems that
use an include scanning system to determine which files are "needed" by a
compilation, and only provide those files to remote compilation workers. Since
using a module can require any file that is part of that module (or anything it
transitively includes), files that are not found by an include scanner can be
required in a regular build using explicit modules. With this flag, only files
that are actually referenced by transitively-#included files are required to be
present on the build machine.
llvm-svn: 253950
(Re-apply patch after bug fixing)
This diff makes sure that the driver does not pass
-fomit-frame-pointer or -momit-leaf-frame-pointer to
the frontend when -pg is used. Currently, clang gives
an error if -fomit-frame-pointer is used in combination
with -pg, but -momit-leaf-frame-pointer was forgotten.
Also, disable frame pointer elimination in the frontend
when -pg is set.
Patch by Stefan Kempf.
llvm-svn: 253886
This diff makes sure that the driver does not pass
-fomit-frame-pointer or -momit-leaf-frame-pointer to
the frontend when -pg is used. Currently, clang gives
an error if -fomit-frame-pointer is used in combination
with -pg, but -momit-leaf-frame-pointer was forgotten.
Also, disable frame pointer elimination in the frontend
when -pg is set.
Patch by Stefan Kempf.
llvm-svn: 253846
This reverts commit r253269.
This leads to assert / segfault triggering on the following reduced example:
float foo(float U, float base, float cell) { return (U = 2 * base) - cell; }
llvm-svn: 253337
This failed to solve the problem it was aimed at, and introduced just as many
issues as it resolved. Realistically, we need to deal with the possibility that
multiple modules might define different internal linkage symbols with the same
name, and this isn't a problem unless two such symbols are simultaneously
visible.
The case where two modules define equivalent internal linkage symbols is
handled by r252063: if lookup finds multiple sufficiently-similar entities from
different modules, we just pick one of them as an extension (but we keep them
separate).
llvm-svn: 252957
The -meabi flag to control LLVM EABI version.
Without '-meabi' or with '-meabi default' imply LLVM triple default.
With '-meabi gnu' sets EABI GNU.
With '-meabi 4' or '-meabi 5' set EABI version 4 and 5 respectively.
A similar patch was introduced in LLVM.
Patch by Vinicius Tinti.
llvm-svn: 252463
Introduce the notion of a module file extension, which introduces
additional information into a module file at the time it is built that
can then be queried when the module file is read. Module file
extensions are identified by a block name (which must be unique to the
extension) and can write any bitstream records into their own
extension block within the module file. When a module file is loaded,
any extension blocks are matched up with module file extension
readers, that are per-module-file and are given access to the input
bitstream.
Note that module file extensions can only be introduced by
programmatic clients that have access to the CompilerInvocation. There
is only one such extension at the moment, which is used for testing
the module file extension harness. As a future direction, one could
imagine allowing the plugin mechanism to introduce new module file
extensions.
llvm-svn: 251955
Linking options for particular file depend on the option that specifies the file.
Currently there are two:
* -mlink-bitcode-file links in complete content of the specified file.
* -mlink-cuda-bitcode links in only the symbols needed by current TU.
Linked symbols are internalized. This bitcode linking mode is used to
link device-specific bitcode provided by CUDA.
Files are linked in order they are specified on command line.
-mlink-cuda-bitcode replaces -fcuda-uses-libdevice flag.
Differential Revision: http://reviews.llvm.org/D13913
llvm-svn: 251427
Previously, __weak was silently accepted and ignored in MRC mode.
That makes this a potentially source-breaking change that we have to
roll out cautiously. Accordingly, for the time being, actual support
for __weak references in MRC is experimental, and the compiler will
reject attempts to actually form such references. The intent is to
eventually enable the feature by default in all non-GC modes.
(It is, of course, incompatible with ObjC GC's interpretation of
__weak.)
If you like, you can enable this feature with
-Xclang -fobjc-weak
but like any -Xclang option, this option may be removed at any point,
e.g. if/when it is eventually enabled by default.
This patch also enables the use of the ARC __unsafe_unretained qualifier
in MRC. Unlike __weak, this is being enabled immediately. Since
variables are essentially __unsafe_unretained by default in MRC,
the only practical uses are (1) communication and (2) changing the
default behavior of by-value block capture.
As an implementation matter, this means that the ObjC ownership
qualifiers may appear in any ObjC language mode, and so this patch
removes a number of checks for getLangOpts().ObjCAutoRefCount
that were guarding the processing of these qualifiers. I don't
expect this to be a significant drain on performance; it may even
be faster to just check for these qualifiers directly on a type
(since it's probably in a register anyway) than to do N dependent
loads to grab the LangOptions.
rdar://9674298
llvm-svn: 251041
The ELF symbol visibilities are:
- internal: Not visibile across DSOs, cannot pass address across DSOs
- hidden: Not visibile across DSOs, can be called indirectly
- default: Usually visible across DSOs, possibly interposable
- protected: Visible across DSOs, not interposable
LLVM only supports the latter 3 visibilities. Internal visibility is in
theory useful, as it allows you to assume that the caller is maintaining
a PIC register for you in %ebx, or in some other pre-arranged location.
As far as LLVM is concerned, this isn't worth the trouble. Using hidden
visibility is always correct, so we can just do that.
Resolves PR9183.
llvm-svn: 250954
Summary: It breaks the build for the ASTMatchers
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D13893
llvm-svn: 250827
This recommits r250398 with fixes to the tests for bot failures.
Add "-target x86_64-unknown-linux" to the clang invocations that
check for the gold plugin.
llvm-svn: 250455
Rolling this back for now since there are a couple of bot failures on
the new tests I added, and I won't have a chance to look at them in detail
until later this afternoon. I think the new tests need some restrictions on
having the gold plugin available.
This reverts commit r250398.
llvm-svn: 250402
Summary:
Add clang support for -flto=thin option, which is used to set the
EmitFunctionSummary code gen option on compiles.
Add -flto=full as an alias to the existing -flto.
Add tests to check for proper overriding of -flto variants on the
command line, and convert grep tests to FileCheck.
Reviewers: dexonsmith, joker.eph
Subscribers: davidxl, cfe-commits
Differential Revision: http://reviews.llvm.org/D11908
llvm-svn: 250398
Add support for the `-fdebug-prefix-map=` option as in GCC. The syntax is
`-fdebug-prefix-map=OLD=NEW`. When compiling files from a path beginning with
OLD, change the debug info to indicate the path as start with NEW. This is
particularly helpful if you are preprocessing in one path and compiling in
another (e.g. for a build cluster with distcc).
Note that the linearity of the implementation is not as terrible as it may seem.
This is normally done once per file with an expectation that the map will be
small (1-2) entries, making this roughly linear in the number of input paths.
Addresses PR24619.
llvm-svn: 250094
With this change, most 'g' options are rejected by CompilerInvocation.
They remain only as Driver options. The new way to request debug info
from cc1 is with "-debug-info-kind={line-tables-only|limited|standalone}"
and "-dwarf-version={2|3|4}". In the absence of a command-line option
to specify Dwarf version, the Toolchain decides it, rather than placing
Toolchain-specific logic in CompilerInvocation.
Also fix a bug in the Windows compatibility argument parsing
in which the "rightmost argument wins" principle failed.
Differential Revision: http://reviews.llvm.org/D13221
llvm-svn: 249655
In versions of clang prior to r238238, __declspec was recognized as a keyword in
all modes. It was then changed to only be enabled when Microsoft or Borland
extensions were enabled (and for CUDA, as a temporary measure). There is a
desire to support __declspec in Playstation code, and possibly other
environments. This commit adds a command-line switch to allow explicit
enabling/disabling of the recognition of __declspec as a keyword. Recognition
is enabled by default in Microsoft, Borland, CUDA, and PS4 environments, and
disabled in all other environments.
Patch by Warren Ristow!
llvm-svn: 249279
* adds -aux-triple option to specify target triple
* propagates aux target info to AST context and Preprocessor
* pulls in target specific preprocessor macros.
* pulls in target-specific builtins from aux target.
* sets appropriate host or device attribute on builtins.
Differential Revision: http://reviews.llvm.org/D12917
llvm-svn: 248299
The patch makes it possible to parse CUDA files that contain host/device
functions with identical signatures, but different attributes without
having to physically split source into host-only and device-only parts.
This change is needed in order to parse CUDA header files that have
a lot of name clashes with standard include files.
Gory details are in design doc here: https://goo.gl/EXnymm
Feel free to leave comments there or in this review thread.
This feature is controlled with CC1 option -fcuda-target-overloads
and is disabled by default.
Differential Revision: http://reviews.llvm.org/D12453
llvm-svn: 248295