Directive `dependency_directives_scan::tokens_present_before_eof` is introduced to indicate there were tokens present before
the last scanned dependency directive and EOF.
This is useful to ensure we correctly identify the macro guards when lexing using the dependency directives.
Differential Revision: https://reviews.llvm.org/D133357
Per the documentation, these restrictions were intended to apply to textual headers but previously this didn't work because we decided there was no requesting module when the `#include` was in a textual header.
A `-cc1` flag is provided to restore the old behavior for transitionary purposes.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D132779
The Darwin module has specified [no_undeclared_includes] for at least five years now, there's no need to hard code it in the compiler.
Reviewed By: ributzka, Bigcheese
Differential Revision: https://reviews.llvm.org/D132971
Xcode 13's clang has them. For the included testcase, Xcode's clang
behaves like the implementation in this patch.
Availability.h in the macOS 12.0 SDK (part of Xcode 13, and the current
stable version of the macOS SDK) does something like:
#if defined(__has_builtin)
...
#if __has_builtin(__is_target_os)
#if __has_builtin(__is_target_environment)
#if __has_builtin(__is_target_variant_os)
#if __has_builtin(__is_target_variant_environment)
#if (... && ((__is_target_os(ios) && __is_target_environment(macabi)) || (__is_target_variant_os(ios) && __is_target_variant_environment(macabi))))
#define __OSX_AVAILABLE_STARTING(_osx, _ios) ...
#define __OSX_AVAILABLE_BUT_DEPRECATED(_osxIntro, _osxDep, _iosIntro, _iosDep) ...
#define __OSX_AVAILABLE_BUT_DEPRECATED_MSG(_osxIntro, _osxDep, _iosIntro, _iosDep, _msg) ...
So if __has_builtin(__is_target_variant_os) or
__has_builtin(__is_target_variant_environment) are false, these defines are not
defined.
Most of the time, this doesn't matter. But open-source clang currently fails
to commpile a file containing only `#include <Security/cssmtype.h>` when
building for catalyst by adding a `-target arm64-apple-ios13.1-macabi` triple,
due to those __OSX_AVAILABLE macros not being set correctly.
If a potential future SDK version were to include cssmtype.h transitively
from a common header such as `<Foundation/Foundation.h>`, then it would become
close to impossible to build Catalyst binaries with open-source clang.
To fix this for normal catalyst builds, it's only necessary that
__has_builtin() evaluates to true for these two built-ins -- the implementation
of them doesn't matter. But as a courtesy, a correct (at least on the test
cases I tried) implementation is provided. (This should also help people who
try to build zippered code, where having the correct implementation does
matter.)
Differential Revision: https://reviews.llvm.org/D132754
The patch diagnoses an identifier as a future keyword if it exists in a
future language mode, such as:
int restrict;
in C modes earlier than C99. We now give a warning to the user that
such an identifier is a future keyword. Handles keywords from C as well
as C++.
Differential Revision: https://reviews.llvm.org/D131683
When Clang encounters `@import M.Private` during implicit build, it precompiles module `M` and looks through its submodules. If the `Private` submodule is not found, Clang assumes `@import M_Private`. In the dependency scanner, we don't capture the dependency on `M`, since it's not imported. It's an affecting module, though: compilation of the import statement will fail when implicit modules are disabled and `M` is not precompiled and explicitly provided. This patch fixes that.
Depends on D132430.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D132502
When compiling a module, its semantics and Clang's behavior are affected by other modules. These modules are typically the **imported** ones. However, during implicit build, some modules end up being compiled and read without being actually imported. This patch starts tracking such modules and serializing them into `.pcm` files. This enables the dependency scanner to construct explicit compilations that mimic implicit build.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D132430
Adds
* `__add_lvalue_reference`
* `__add_pointer`
* `__add_rvalue_reference`
* `__decay`
* `__make_signed`
* `__make_unsigned`
* `__remove_all_extents`
* `__remove_extent`
* `__remove_const`
* `__remove_volatile`
* `__remove_cv`
* `__remove_pointer`
* `__remove_reference`
* `__remove_cvref`
These are all compiler built-in equivalents of the unary type traits
found in [[meta.trans]][1]. The compiler already has all of the
information it needs to answer these transformations, so we can skip
needing to make partial specialisations in standard library
implementations (we already do this for a lot of the query traits). This
will hopefully improve compile times, as we won't need use as much
memory in such a base part of the standard library.
[1]: http://wg21.link/meta.trans
Co-authored-by: zoecarver
Reviewed By: aaron.ballman, rsmith
Differential Revision: https://reviews.llvm.org/D116203
The commit of af2d11b1d5 missed a case where
the value of a suggested module needed to be reset to nullptr. Fixed thus
and added a testcase to cover the circumstance.
Adds
* `__add_lvalue_reference`
* `__add_pointer`
* `__add_rvalue_reference`
* `__decay`
* `__make_signed`
* `__make_unsigned`
* `__remove_all_extents`
* `__remove_extent`
* `__remove_const`
* `__remove_volatile`
* `__remove_cv`
* `__remove_pointer`
* `__remove_reference`
* `__remove_cvref`
These are all compiler built-in equivalents of the unary type traits
found in [[meta.trans]][1]. The compiler already has all of the
information it needs to answer these transformations, so we can skip
needing to make partial specialisations in standard library
implementations (we already do this for a lot of the query traits). This
will hopefully improve compile times, as we won't need use as much
memory in such a base part of the standard library.
[1]: http://wg21.link/meta.trans
Co-authored-by: zoecarver
Reviewed By: aaron.ballman, rsmith
Differential Revision: https://reviews.llvm.org/D116203
In the case of static compilation the file system is pretty much read-only
and taking a snapshot of it usually is sufficient. In the interactive C++
case the compilation is longer and people can create and include files, etc.
In that case we often do not want to open files or cache failures unless is
absolutely necessary.
This patch extends the original API call by forwarding some optional flags,
so we can continue use it in the previous way with no breakage.
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D131241
I went over the output of the following mess of a command:
(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Differential Revision: https://reviews.llvm.org/D130827
isAllowedInitiallyIDChar is only used with non-ASCII codepoints,
which are handled by isAsciiIdentifierStart.
To make that clearer, remove the check for _ from
isAllowedInitiallyIDChar, and assert on ASCII - to ensure neither
_ or $ are passed to this function.
Reviewed By: tahonermann, aaron.ballman
Differential Revision: https://reviews.llvm.org/D130750
The #warning directive is standard in C++2b and C2x,
this adjusts the pedantic and extensions warning accordingly.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D130415
This implements
N2836 Identifier Syntax using Unicode Standard Annex 31.
The feature was already implemented for C++,
and the semantics are the same.
Unlike C++ there was, afaict, no decision to
backport the feature in older languages mode,
so C17 and earlier are not modified and the
code point tables for these language modes are conserved.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D130416
Diagnostic for `-Wauto-import` shouldn't be a warning because it doesn't
represent a potential problem in code that should be fixed. And the
emitted fix-it is likely to trigger `-Watimport-in-framework-header`
which makes it challenging to have a warning-free codebase. But it is
still useful to see how include directives are translated into modular
imports and which module a header belongs to, that's why keep it as a remark.
Keep `-Wauto-import` for now to allow a gradual migration for codebases
using `-Wno-auto-import`, e.g., `-Weverything -Wno-auto-import`.
rdar://79594287
Differential Revision: https://reviews.llvm.org/D130138
WG21 approved delimited escape sequences and named escape
sequences.
Adjust the extension warnings accordingly, and update
the release notes.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D129664
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive
otherwise.
This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
This reverts commit cc309721d2 because it
breaks the following tests on GreenDragon:
TestDataFormatterObjCCF.py
TestDataFormatterObjCExpr.py
TestDataFormatterObjCKVO.py
TestDataFormatterObjCNSBundle.py
TestDataFormatterObjCNSData.py
TestDataFormatterObjCNSError.py
TestDataFormatterObjCNSNumber.py
TestDataFormatterObjCNSURL.py
TestDataFormatterObjCPlain.py
TestDataFormatterObjNSException.py
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45288/
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive
otherwise.
This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
This addresses [cpp.include]/7
(when encountering #include header-name)
If the header identified by the header-name denotes an importable header, it
is implementation-defined whether the #include preprocessing directive is
instead replaced by an import directive.
In this implementation, include translation is performed _only_ for headers
in the Global Module fragment, so:
```
module;
#include "will-be-translated.h" // IFF the header unit is available.
export module M;
#include "will-not-be-translated.h" // even if the header unit is available
```
The reasoning is that, in general, includes in the module purview would not
be validly translatable (they would have to immediately follow the module
decl and without any other intervening decls). Otherwise that would violate
the rules on contiguous import directives.
This would be quite complex to track in the preprocessor, and for relatively
little gain (the user can 'import "will-not-be-translated.h";' instead.)
TODO: This is one area where it becomes increasingly difficult to disambiguate
clang modules in C++ from C++ standard modules. That needs to be addressed in
both the driver and the FE.
Differential Revision: https://reviews.llvm.org/D128981
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive
otherwise.
This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
This reverts commit 4174f0ca61.
Also revert follow-up "[Clang] Fix invalid utf-8 detection"
This reverts commit bf45e27a67.
The second commit broke tests, see comments on
https://reviews.llvm.org/D129223, and it sounds like the first
commit isn't valid without the second one. So reverting both for now.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive
otherwise.
This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive
otherwise.
This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
This is a preprocessor callback focused on the lexed file changing, without conflating effects of line number directives and other pragmas.
A client that only cares about what files the lexer processes, like dependency generation, can use this more straightforward
callback instead of `PPCallbacks::FileChanged()`. Clients that want the pragma directive effects as well can keep using `FileChanged()`.
A use case where `PPCallbacks::LexedFileChanged()` is particularly simpler to use than `FileChanged()` is in a situation
where a client wants to keep track of lexed file changes that include changes from/to the predefines buffer, where it becomes
unnecessary complicated trying to use `FileChanged()` while filtering out the pragma directives effects callbacks.
Also take the opportunity to provide information about the prior `FileID` the `Lexer` moved from, even when entering a new file.
Differential Revision: https://reviews.llvm.org/D128947
Otherwise a header may be erroneously marked as having a header macro guard and won't get re-included.
Differential Revision: https://reviews.llvm.org/D128772
When running `clang -E -Ofast` on macOS, the `__FLT_EVAL_METHOD__` macro is `0`, which causes the following typedef to be emitted into the preprocessed source: `typedef float float_t`.
However, when running `clang -c -Ofast`, `__FLT_EVAL_METHOD__` is `-1`, and `typedef long double float_t` is emitted.
This causes build errors for certain projects, which are not reproducible when compiling from preprocessed source.
The issue is that `__FLT_EVAL_METHOD__` is configured in `Sema::Sema` which is not executed when running in `-E` mode.
This change moves that logic into the preprocessor initialization method, which is invoked correctly in `-E` mode.
rdar://96134605
rdar://92748429
Differential Revision: https://reviews.llvm.org/D128814
"Ascii" StringLiteral instances are actually narrow strings
that are UTF-8 encoded and do not have an encoding prefix.
(UTF8 StringLiteral are also UTF-8 encoded strings, but with
the u8 prefix.
To avoid possible confusion both with actuall ASCII strings,
and with future works extending the set of literal encodings
supported by clang, this rename StringLiteral::isAscii() to
isOrdinary(), matching C++ standard terminology.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D128762
Implements [[ https://wg21.link/p2071r1 | P2071 Named Universal Character Escapes ]] - as an extension in all language mode, the patch not warn in c++23 mode will be done later once this paper is plenary approved (in July).
We add
* A code generator that transforms `UnicodeData.txt` and `NameAliases.txt` to a space efficient data structure that can be queried in `O(NameLength)`
* A set of functions in `Unicode.h` to query that data, including
* A function to find an exact match of a given Unicode character name
* A function to perform a loose (ignoring case, space, underscore, medial hyphen) matching
* A function returning the best matching codepoint for a given string per edit distance
* Support of `\N{}` escape sequences in String and character Literals, with loose and typos diagnostics/fixits
* Support of `\N{}` as UCN with loose matching diagnostics/fixits.
Loose matching is considered an error to match closely the semantics of P2071.
The generated data contributes to 280kB of data to the binaries.
`UnicodeData.txt` and `NameAliases.txt` are not committed to the repository in this patch, and regenerating the data is a manual process.
Reviewed By: tahonermann
Differential Revision: https://reviews.llvm.org/D123064
This speeds up preprocessing, specifically for preprocessing the clang sources time is reduced by about -36%,
using measurements on M1Pro with a release+thinLTO build.
Differential Revision: https://reviews.llvm.org/D127379