This patch fixes compilation failure with explicit modules caused by scanner not reporting the module map describing the module whose implementation is being compiled.
Below is a breakdown of the attached test case. Note the VFS that makes frameworks "A" and "B" share the same header "shared/H.h".
In non-modular build, Clang skips the last import, since the "shared/H.h" header has already been included.
During scan (or implicit build), the compiler handles "tu.m" as follows:
* `@import B` imports module "B", as expected,
* `#import <A/H.h>` is resolved textually (due to `-fmodule-name=A`) to "shared/H.h" (due to the VFS remapping),
* `#import <B/H.h>` is resolved to import module "A_Private", since the header "shared/H.h" is already known to be part of that module, and the import is skipped.
In the end, the only modular dependency of the TU is "B".
In explicit modular build without `-fmodule-name=A`, TU does depend on module "A_Private" properly, not just textually. Clang therefore builds & loads its PCM, and knows to ignore the last import, since "shared/H.h" is known to be part of "A_Private".
But with current scanner behavior and `-fmodule-name=A` present, the last import fails during explicit build. Clang doesn't know about "A_Private" (it's included textually) and tries to import "B_Private" instead, which it doesn't know about either (the scanner correctly didn't report it as dependency). This is fixed by reporting the module map describing "A" and matching the semantics of implicit build.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D134222
The patch diagnoses an identifier as a future keyword if it exists in a
future language mode, such as:
int restrict;
in C modes earlier than C99. We now give a warning to the user that
such an identifier is a future keyword. Handles keywords from C as well
as C++.
Differential Revision: https://reviews.llvm.org/D131683
This addresses [cpp.include]/7
(when encountering #include header-name)
If the header identified by the header-name denotes an importable header, it
is implementation-defined whether the #include preprocessing directive is
instead replaced by an import directive.
In this implementation, include translation is performed _only_ for headers
in the Global Module fragment, so:
```
module;
#include "will-be-translated.h" // IFF the header unit is available.
export module M;
#include "will-not-be-translated.h" // even if the header unit is available
```
The reasoning is that, in general, includes in the module purview would not
be validly translatable (they would have to immediately follow the module
decl and without any other intervening decls). Otherwise that would violate
the rules on contiguous import directives.
This would be quite complex to track in the preprocessor, and for relatively
little gain (the user can 'import "will-not-be-translated.h";' instead.)
TODO: This is one area where it becomes increasingly difficult to disambiguate
clang modules in C++ from C++ standard modules. That needs to be addressed in
both the driver and the FE.
Differential Revision: https://reviews.llvm.org/D128981
When running `clang -E -Ofast` on macOS, the `__FLT_EVAL_METHOD__` macro is `0`, which causes the following typedef to be emitted into the preprocessed source: `typedef float float_t`.
However, when running `clang -c -Ofast`, `__FLT_EVAL_METHOD__` is `-1`, and `typedef long double float_t` is emitted.
This causes build errors for certain projects, which are not reproducible when compiling from preprocessed source.
The issue is that `__FLT_EVAL_METHOD__` is configured in `Sema::Sema` which is not executed when running in `-E` mode.
This change moves that logic into the preprocessor initialization method, which is invoked correctly in `-E` mode.
rdar://96134605
rdar://92748429
Differential Revision: https://reviews.llvm.org/D128814
"Ascii" StringLiteral instances are actually narrow strings
that are UTF-8 encoded and do not have an encoding prefix.
(UTF8 StringLiteral are also UTF-8 encoded strings, but with
the u8 prefix.
To avoid possible confusion both with actuall ASCII strings,
and with future works extending the set of literal encodings
supported by clang, this rename StringLiteral::isAscii() to
isOrdinary(), matching C++ standard terminology.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D128762
This is a commit with the following changes:
* Remove `ExcludedPreprocessorDirectiveSkipMapping` and related functionality
Removes `ExcludedPreprocessorDirectiveSkipMapping`; its intended benefit for fast skipping of excluded directived blocks
will be superseded by a follow-up patch in the series that will use dependency scanning lexing for the same purpose.
* Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
Replaces the "source minimization" mechanism with a mechanism that produces lexed dependency directives tokens.
* Make the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`
This is bringing the following benefits:
* Full access to the preprocessor state during dependency scanning. E.g. a component can see what includes were taken and where they were located in the actual sources.
* Improved performance for dependency scanning. Measurements with a release+thin-LTO build shows ~ -11% reduction in wall time.
* Opportunity to use dependency scanning lexing to speed-up skipping of excluded conditional blocks during normal preprocessing (as follow-up, not part of this patch).
For normal preprocessing measurements show differences are below the noise level.
Since, after this change, we don't minimize sources and pass them in place of the real sources, `DependencyScanningFilesystem` is not technically necessary, but it has valuable performance benefits for caching file `stat`s along with the results of scanning the sources. So the setup of using the `DependencyScanningFilesystem` during a dependency scan remains.
Differential Revision: https://reviews.llvm.org/D125486
Differential Revision: https://reviews.llvm.org/D125487
Differential Revision: https://reviews.llvm.org/D125488
This patch replaces the exact include count of each file in `HeaderFileInfo` with a set of included files in `Preprocessor`.
The number of includes isn't a property of a header file but rather a preprocessor state. The exact number of includes is not used anywhere except statistic tracking.
Reviewed By: vsapsai
Differential Revision: https://reviews.llvm.org/D114095
The `{HeaderSearch,Preprocessor}::LookupFile()` functions take an out-parameter `const DirectoryLookup *&`. Most callers end up creating a `const DirectoryLookup *` variable that's otherwise unused.
This patch changes the out-parameter from reference to a pointer, making it possible to simply pass `nullptr` to the function without the ceremony.
Reviewed By: ahoppen
Differential Revision: https://reviews.llvm.org/D117312
This fixes an LLDB build failure where the `ImportLoc` argument is missing: https://lab.llvm.org/buildbot#builders/68/builds/19975
This change also makes it possible to drop `SourceLocation()` in `Preprocessor::getCurrentModule`.
This patch propagates the import `SourceLocation` into `HeaderSearch::lookupModule`. This enables remarks on search path usage (implemented in D102923) to point to the source code that initiated header search.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111557
This patch adds a new preprocessor extension ``#pragma clang final``
which enables warning on undefinition and re-definition of macros.
The intent of this warning is to extend beyond ``-Wmacro-redefined`` to
warn against any and all alterations to macros that are marked `final`.
This warning is part of the ``-Wpedantic-macros`` diagnostics group.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D108567
This change would treat the token `or` in system headers as an
identifier, and elsewhere as an operator. As reported in
llvm.org/pr42427, many users classify their third party library headers
as "system" headers to suppress warnings. There's no clean way to
separate Windows SDK headers from user headers.
Clang is still able to parse old Windows SDK headers if C++ operator
names are disabled. Traditionally this was controlled by
`-fno-operator-names`, but is now also enabled with `/permissive` since
D103773. This change will prevent `clang-cl` from parsing <query.h> from
the Windows SDK out of the box, but there are multiple ways to work
around that:
- Pass `/clang:-fno-operator-names`
- Pass `/permissive`
- Pass `-DQUERY_H_RESTRICTION_PERMISSIVE`
In all of these modes, the operator names will consistently be available
or not available, instead of depending on whether the code is in a
system header.
I added a release note for this, since it may break straightforward
users of the Windows SDK.
Fixes PR42427
Differential Revision: https://reviews.llvm.org/D108720
This patch adds `#pragma clang restrict_expansion ` to enable flagging
macros as unsafe for header use. This is to allow macros that may have
ABI implications to be avoided in headers that have ABI stability
promises.
Using macros in headers (particularly public headers) can cause a
variety of issues relating to ABI and modules. This new pragma logs
warnings when using annotated macros outside the main source file.
This warning is added under a new diagnostics group -Wpedantic-macros
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D107095
This patch adds `#pragma clang deprecated` to enable deprecation of
preprocessor macros.
The macro must be defined before `#pragma clang deprecated`. When
deprecating a macro a custom message may be optionally provided.
Warnings are emitted at the use site of a deprecated macro, and can be
controlled via the `-Wdeprecated` warning group.
This patch takes some rough inspiration and a few lines of code from
https://reviews.llvm.org/D67935.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106732
This patch adds the -fminimize-whitespace with the following effects:
* If combined with -E, remove as much non-line-breaking whitespace as
possible.
* If combined with -E -P, removes as much whitespace as possible,
including line-breaks.
The motivation is to reduce the amount of insignificant changes in the
preprocessed output with source files where only whitespace has been
changed (add/remove comments, clang-format, etc.) which is in particular
useful with ccache.
A patch for ccache for using this flag has been proposed to ccache as well:
https://github.com/ccache/ccache/pull/815, which will use
-fnormalize-whitespace when clang-13 has been detected, and additionally
uses -P in "unify_mode". ccache already had a unify_mode in an older
version which was removed because of problems that using the
preprocessor itself does not have (such that the custom tokenizer did
not recognize C++11 raw strings).
This patch slightly reorganizes which part is responsible for adding
newlines that are required for semantics. It is now either
startNewLineIfNeeded() or MoveToLine() but never both; this avoids the
ShouldUpdateCurrentLine workaround and avoids redundant lines being
inserted in some cases. It also fixes a mandatory newline not inserted
after a _Pragma("...") that is expanded into a #pragma.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D104601
WG14 adopted N2645 and WG21 EWG has accepted P2334 in principle (still
subject to full EWG vote + CWG review + plenary vote), which add
support for #elifdef as shorthand for #elif defined and #elifndef as
shorthand for #elif !defined. This patch adds support for the new
preprocessor directives.
Somewhat surprisingly, signature help is emitted as a side-effect of
computing the expected type of a function argument.
The reason is that both actions require enumerating the possible
function signatures and running partial overload resolution, and doing
this twice would be wasteful and complicated.
Change #1: document this, it's subtle :-)
However, sometimes we need to compute the expected type without having
reached the code completion cursor yet - in particular to allow
completion of designators.
eb4ab3358c did this but introduced a
regression - it emits signature help in the wrong location as a side-effect.
Change #2: only emit signature help if the code completion cursor was reached.
Currently there is PP.isCodeCompletionReached(), but we can't use it
because it's set *after* running code completion.
It'd be nice to set this implicitly when the completion token is lexed,
but ConsumeCodeCompletionToken() makes this complicated.
Change #3: call cutOffParsing() *first* when seeing a completion token.
After this, the fact that the Sema::Produce*SignatureHelp() functions
are even more confusing, as they only sometimes do that.
I don't want to rename them in this patch as it's another large
mechanical change, but we should soon.
Change #4: prepare to rename ProduceSignatureHelp() to GuessArgumentType() etc.
Differential Revision: https://reviews.llvm.org/D98488
More study has discovered this to not actually be useful: because
current C++20 implementations reject `#ifdef __VA_OPT__`, this can't
really be used as a feature-test mechanism. And it's not too hard to
detect __VA_OPT__ without this, for example:
#define THIRD_ARG(a, b, c, ...) c
#define HAS_VA_OPT(...) THIRD_ARG(__VA_OPT__(,), 1, 0, )
#if HAS_VA_OPT(?)
Partially reverts 0436ec2128.
These changes are intended to give code a path to move away from the GNU
,##__VA_ARGS__ extension, which is non-conforming in some situations and
which we'd like to disable in our conforming mode in those cases.
Replace `SourceManager::getMemoryBufferForFile`, which returned a
dereferenceable `MemoryBuffer*` and had a `bool*Invalid` out parameter,
with `getMemoryBufferForFileOrNone` (returning
`Optional<MemoryBufferRef>`) and `getMemoryBufferForFileOrFake`
(returning `MemoryBufferRef`).
Differential Revision: https://reviews.llvm.org/D89429
Summary:
We would like to use NumericLiteralParser in the implementation of the
syntax tree builder, and plumbing a preprocessor there seems
inconvenient and superfluous.
Reviewers: eduucaldas
Reviewed By: eduucaldas
Subscribers: gribozavr2, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83480
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
See
https://docs.google.com/document/d/1xMkTZMKx9llnMPgso0jrx3ankI4cv60xeZ0y4ksf4wc/preview
for background discussion.
This adds a warning, flags and pragmas to limit the number of
pre-processor tokens either at a certain point in a translation unit, or
overall.
The idea is that this would allow projects to limit the size of certain
widely included headers, or for translation units overall, as a way to
insert backstops for header bloat and prevent compile-time regressions.
Differential revision: https://reviews.llvm.org/D72703
Summary:
As discussed in http://lists.llvm.org/pipermail/cfe-dev/2019-October/063459.html
the overflow of the souce locations (limited to 2^31 chars) can generate all sorts of
weird things (bogus warnings, hangs, crashes, miscompilation and correct compilation).
In debug mode this assert would fail. So it might be a good start, as in PR42301,
to detect the failure and exit with a proper error message.
Reviewers: rsmith, thakis, miyuki
Reviewed By: miyuki
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70183
Builtins are rarely if ever accessed via the Preprocessor. They are
typically found on the ASTContext, so there should be no performance
penalty to using a pointer indirection to store the builtin context.
This commit adds an optimization to clang-scan-deps and clang's preprocessor that skips excluded preprocessor
blocks by bumping the lexer pointer, and not lexing the tokens until reaching appropriate #else/#endif directive.
The skip positions and lexer offsets are computed when the file is minimized, directly from the minimized tokens.
On an 18-core iMacPro with macOS Catalina Beta I got 10-15% speed-up from this optimization when running clang-scan-deps on
the compilation database for a recent LLVM and Clang (3511 files).
Differential Revision: https://reviews.llvm.org/D67127
llvm-svn: 371656
when the FileManager is reused across invocations
This commit introduces a parallel API to FileManager's getFile: getFileEntryRef, which returns
a reference to the FileEntry, and the name that was used to access the file. In the case of
a VFS with 'use-external-names', the FileEntyRef contains the external name of the file,
not the filename that was used to access it.
The new API is adopted only in the HeaderSearch and Preprocessor for include file lookup, so that the
accessed path can be propagated to SourceManager's FileInfo. SourceManager's FileInfo now can report this accessed path, using
the new getName method. This API is then adopted in the dependency collector, which now correctly reports dependencies when a file
is included both using a symlink and a real path in the case when the FileManager is reused across multiple Preprocessor invocations.
Note that this patch does not fix all dependency collector issues, as the same problem is still present in other cases when dependencies
are obtained using FileSkipped, InclusionDirective, and HasInclude. This will be fixed in follow-up commits.
Differential Revision: https://reviews.llvm.org/D65907
llvm-svn: 369680
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
Differential revision: https://reviews.llvm.org/D66259
llvm-svn: 368942
Summary:
By adding a hook to consume all tokens produced by the preprocessor.
The intention of this change is to make it possible to consume the
expanded tokens without re-runnig the preprocessor with minimal changes
to the preprocessor and minimal performance penalty when preprocessing
without recording the tokens.
The added hook is very low-level and reconstructing the expanded token
stream requires more work in the client code, the actual algorithm to
collect the tokens using this hook can be found in the follow-up change.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: eraman, nemanjai, kbarton, jsji, riccibruno, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59885
llvm-svn: 361007
is not used since it consumes all preprocessor directives until it returns
a real token. Using the specific Lexer (i.e. CurLexer->Lex) makes it
possible to stop skipping after an #include or #pragma hdrstop. Previously
the skipping code was only handling CurLexer, now all will be handled
correctly.
Fixes: llvm.org/PR41585
Differential Revision: https://reviews.llvm.org/D61217
llvm-svn: 359506
and the global and private module fragment.
For now, the private module fragment introducer is ignored, but use of
the global module fragment introducer should be properly enforced.
llvm-svn: 358353
internal lexing steps in the preprocessor.
It is not safe to use the preprocessor's token lookahead except when
operating on the final sequence of tokens that would be produced by
phase 4 of translation. Doing so corrupts the token lookahead cache used
by the parser. (See added testcase for an example.) Lookahead should
instead be viewed as a layer on top of the normal lexer.
Added assertions to catch any further incorrect uses of lookahead within
lexing actions.
llvm-svn: 358230