Commit Graph

2548 Commits

Author SHA1 Message Date
Kazu Hirata cb2c8f694d [clang] Use value instead of getValue (NFC) 2022-07-13 23:39:33 -07:00
Corentin Jabot 6882ca9aff [Clang] Adjust extension warnings for delimited sequences
WG21 approved delimited escape sequences and named escape
sequences.
Adjust the extension warnings accordingly, and update
the release notes.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D129664
2022-07-14 07:50:58 +02:00
Corentin Jabot d4892a168f [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-13 10:19:26 +02:00
Jonas Devlieghere a262f4dbd7 Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit cc309721d2 because it
breaks the following tests on GreenDragon:

  TestDataFormatterObjCCF.py
  TestDataFormatterObjCExpr.py
  TestDataFormatterObjCKVO.py
  TestDataFormatterObjCNSBundle.py
  TestDataFormatterObjCNSData.py
  TestDataFormatterObjCNSError.py
  TestDataFormatterObjCNSNumber.py
  TestDataFormatterObjCNSURL.py
  TestDataFormatterObjCPlain.py
  TestDataFormatterObjNSException.py

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45288/
2022-07-12 15:22:29 -07:00
Corentin Jabot cc309721d2 [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-12 14:34:30 +02:00
Iain Sandoe af2d11b1d5 [C++20][Modules] Implement include translation.
This addresses [cpp.include]/7

(when encountering #include header-name)

If the header identified by the header-name denotes an importable header, it
is implementation-defined whether the #include preprocessing directive is
instead replaced by an import directive.

In this implementation, include translation is performed _only_ for headers
in the Global Module fragment, so:
```
module;
 #include "will-be-translated.h" // IFF the header unit is available.

export module M;
 #include "will-not-be-translated.h" // even if the header unit is available
```
The reasoning is that, in general, includes in the module purview would not
be validly translatable (they would have to immediately follow the module
decl and without any other intervening decls).  Otherwise that would violate
the rules on contiguous import directives.

This would be quite complex to track in the preprocessor, and for relatively
little gain (the user can 'import "will-not-be-translated.h";' instead.)

TODO: This is one area where it becomes increasingly difficult to disambiguate
clang modules in C++ from C++ standard modules.  That needs to be addressed in
both the driver and the FE.

Differential Revision: https://reviews.llvm.org/D128981
2022-07-10 11:06:51 +01:00
Corentin Jabot 50416e5454 Revert "[Clang] Add a warning on invalid UTF-8 in comments."
It is probable thart this change crashes on the powerpc bots.

This reverts commit 355532a149.
2022-07-09 17:18:35 +02:00
Corentin Jabot 355532a149 [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-09 11:26:45 +02:00
Nico Weber e9fe20dab3 Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit 4174f0ca61.

Also revert follow-up "[Clang] Fix invalid utf-8 detection"
This reverts commit bf45e27a67.

The second commit broke tests, see comments on
https://reviews.llvm.org/D129223, and it sounds like the first
commit isn't valid without the second one. So reverting both for now.
2022-07-06 22:51:52 +02:00
Corentin Jabot 4174f0ca61 [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-06 21:18:29 +02:00
Corentin Jabot fb06dd3e8c Revert "[Clang] Add a warning on invalid UTF-8 in comments."
Reverting while I investigate build failures

This reverts commit e3dc56805f.
2022-07-06 19:45:12 +02:00
Corentin Jabot e3dc56805f [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-06 17:59:44 +02:00
Argyrios Kyrtzidis 0d3a2b4c66 [Lex] Introduce `PPCallbacks::LexedFileChanged()` preprocessor callback
This is a preprocessor callback focused on the lexed file changing, without conflating effects of line number directives and other pragmas.
A client that only cares about what files the lexer processes, like dependency generation, can use this more straightforward
callback instead of `PPCallbacks::FileChanged()`. Clients that want the pragma directive effects as well can keep using `FileChanged()`.

A use case where `PPCallbacks::LexedFileChanged()` is particularly simpler to use than `FileChanged()` is in a situation
where a client wants to keep track of lexed file changes that include changes from/to the predefines buffer, where it becomes
unnecessary complicated trying to use `FileChanged()` while filtering out the pragma directives effects callbacks.

Also take the opportunity to provide information about the prior `FileID` the `Lexer` moved from, even when entering a new file.

Differential Revision: https://reviews.llvm.org/D128947
2022-07-01 14:22:31 -07:00
Argyrios Kyrtzidis c68b8c84eb [Lex] Make sure to notify `MultipleIncludeOpt` for "read tokens" during fast dependency directive lexing
Otherwise a header may be erroneously marked as having a header macro guard and won't get re-included.

Differential Revision: https://reviews.llvm.org/D128772
2022-06-29 15:50:16 -07:00
Egor Zhdan 5f2cf3a21f [Clang][Preprocessor] Fix inconsistent `FLT_EVAL_METHOD` when compiling vs preprocessing
When running `clang -E -Ofast` on macOS, the `__FLT_EVAL_METHOD__` macro is `0`, which causes the following typedef to be emitted into the preprocessed source: `typedef float float_t`.

However, when running `clang -c -Ofast`, `__FLT_EVAL_METHOD__` is `-1`, and `typedef long double float_t` is emitted.

This causes build errors for certain projects, which are not reproducible when compiling from preprocessed source.

The issue is that `__FLT_EVAL_METHOD__` is configured in `Sema::Sema` which is not executed when running in `-E` mode.

This change moves that logic into the preprocessor initialization method, which is invoked correctly in `-E` mode.

rdar://96134605
rdar://92748429

Differential Revision: https://reviews.llvm.org/D128814
2022-06-29 19:36:22 +01:00
Corentin Jabot a9a60f20e6 [Clang] Rename StringLiteral::isAscii() => isOrdinary() [NFC]
"Ascii" StringLiteral instances are actually narrow strings
that are UTF-8 encoded and do not have an encoding prefix.
(UTF8 StringLiteral are also UTF-8 encoded strings, but with
the u8 prefix.

To avoid possible confusion both with actuall ASCII strings,
and with future works extending the set of literal encodings
supported by clang, this rename StringLiteral::isAscii() to
isOrdinary(), matching C++ standard terminology.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D128762
2022-06-29 18:28:51 +02:00
Kazu Hirata ca05cc2064 [clang] Don't use Optional::hasValue (NFC)
This patch replaces x.hasValue() with x where x is contextually
convertible to bool.
2022-06-26 18:51:54 -07:00
Kazu Hirata 97afce08cb [clang] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 22:26:24 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Corentin Jabot c92056d038 [Clang][C++23] P2071 Named universal character escapes
Implements [[ https://wg21.link/p2071r1  | P2071 Named Universal Character Escapes ]] - as an extension in all language mode, the patch  not warn in c++23 mode will be done later once this paper is plenary approved (in July).

We add

 * A code generator that transforms `UnicodeData.txt` and `NameAliases.txt` to a space efficient data structure that can be queried in `O(NameLength)`
 * A set of functions in `Unicode.h` to query that data, including

   * A function to find an exact match of a given Unicode character name
   * A function to perform a loose (ignoring case, space, underscore, medial hyphen) matching
   * A function returning the best matching codepoint for a given string per edit distance

 * Support of `\N{}` escape sequences in String and character Literals, with loose and typos diagnostics/fixits
 * Support of `\N{}` as UCN with loose matching diagnostics/fixits.

Loose matching is considered an error to match closely the semantics of P2071.

The generated data contributes to 280kB of data to the binaries.

`UnicodeData.txt` and `NameAliases.txt`  are not committed to the repository in this patch, and regenerating the data is a manual process.

Reviewed By: tahonermann

Differential Revision: https://reviews.llvm.org/D123064
2022-06-25 19:03:33 +02:00
Kazu Hirata ca4af13e48 [clang] Don't use Optional::getValue (NFC) 2022-06-20 22:59:26 -07:00
Kazu Hirata d66cbc565a Don't use Optional::hasValue (NFC) 2022-06-20 20:26:05 -07:00
Kazu Hirata 0916d96d12 Don't use Optional::hasValue (NFC) 2022-06-20 20:17:57 -07:00
Kazu Hirata 064a08cd95 Don't use Optional::hasValue (NFC) 2022-06-20 20:05:16 -07:00
Kazu Hirata 5413bf1bac Don't use Optional::hasValue (NFC) 2022-06-20 11:33:56 -07:00
Kazu Hirata 452db157c9 [clang] Don't use Optional::hasValue (NFC) 2022-06-20 10:51:34 -07:00
Argyrios Kyrtzidis f7e19a5928 [Lex] Keep track of skipped preprocessor blocks and advance the lexer directly if they are revisited
This speeds up preprocessing, specifically for preprocessing the clang sources time is reduced by about -36%,
using measurements on M1Pro with a release+thinLTO build.

Differential Revision: https://reviews.llvm.org/D127379
2022-06-13 21:46:46 -07:00
Jan Svoboda d9390b6ac3 Reapply "[clang][lex] NFCI: Use DirectoryEntryRef in HeaderSearch::load*()"
This reverts commit 340654e0f2, essentially reapplying 1d3ba05e4a.

The test VFS/real-path-found-first.m that was failing on Windows is now passing with a workaround.
2022-06-13 17:03:32 +02:00
Nico Weber 5f57ca208b fix comment typo to cycle bots 2022-06-11 18:55:40 -04:00
Argyrios Kyrtzidis fbaa8b9ae5 [Lex] Fix `fixits` for typo-corrections of preprocessing directives within skipped blocks
The `EndLoc` parameter was always unset so no fixit was emitted. But it is also unnecessary for determining the range so we can remove it.

Differential Revision: https://reviews.llvm.org/D127251
2022-06-10 13:32:19 -07:00
Leonard Grey dd6bcdbf21 [Attributes] Remove AttrSyntax and migrate uses to AttributeCommonInfo::Syntax (NFC)
This is setup for allowing hasAttribute to work for plugin-provided attributes

Differential Revision: https://reviews.llvm.org/D126902
2022-06-03 12:11:48 -04:00
Paul Pluzhnikov 4ad17d2e96 Clean "./" from __FILE__ expansion.
This is alternative to https://reviews.llvm.org/D121733
and helps with Clang header modules in which FILE
may expand to "./foo.h" or "foo.h" depending on whether the file was
included directly or not.

Only do this when UseTargetPathSeparator is true, as we are already
changing the path in that case.

Reviewed By: ayzhao

Differential Revision: https://reviews.llvm.org/D126396
2022-06-02 18:00:19 -04:00
Argyrios Kyrtzidis fad6e37995 [Lex] Fix crash during dependency scanning while skipping an unmatched `#if` 2022-05-27 23:59:30 -07:00
Argyrios Kyrtzidis b4c83a13f6 [Tooling/DependencyScanning & Preprocessor] Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
This is a commit with the following changes:

* Remove `ExcludedPreprocessorDirectiveSkipMapping` and related functionality

Removes `ExcludedPreprocessorDirectiveSkipMapping`; its intended benefit for fast skipping of excluded directived blocks
will be superseded by a follow-up patch in the series that will use dependency scanning lexing for the same purpose.

* Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources

Replaces the "source minimization" mechanism with a mechanism that produces lexed dependency directives tokens.

* Make the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`

This is bringing the following benefits:

    * Full access to the preprocessor state during dependency scanning. E.g. a component can see what includes were taken and where they were located in the actual sources.
    * Improved performance for dependency scanning. Measurements with a release+thin-LTO build shows ~ -11% reduction in wall time.
    * Opportunity to use dependency scanning lexing to speed-up skipping of excluded conditional blocks during normal preprocessing (as follow-up, not part of this patch).

For normal preprocessing measurements show differences are below the noise level.

Since, after this change, we don't minimize sources and pass them in place of the real sources, `DependencyScanningFilesystem` is not technically necessary, but it has valuable performance benefits for caching file `stat`s along with the results of scanning the sources. So the setup of using the `DependencyScanningFilesystem` during a dependency scan remains.

Differential Revision: https://reviews.llvm.org/D125486
Differential Revision: https://reviews.llvm.org/D125487
Differential Revision: https://reviews.llvm.org/D125488
2022-05-26 12:50:06 -07:00
Argyrios Kyrtzidis b58a420ff4 [Tooling/DependencyScanning] Rename refactorings towards transitioning dependency scanning to use pre-lexed preprocessor directive tokens
This is first of a series of patches for making the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`.
This patch only includes NFC renaming changes to make reviewing of the functionality changing parts easier.

Differential Revision: https://reviews.llvm.org/D125484
2022-05-26 12:49:51 -07:00
Yaxun (Sam) Liu cefe472c51 [clang] Fix __has_builtin
Fix __has_builtin to return 1 only if the requested target features
of a builtin are enabled by refactoring the code for checking
required target features of a builtin and use it in evaluation
of __has_builtin.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D125829
2022-05-19 11:34:42 -04:00
Ken Matsui 45e01ce5fe [clang] Avoid suggesting typoed directives in `.S` files
This patch is itended to avoid suggesting typoed directives in `.S`
files to support the cases of `#` directives treated as comments or
various pseudo-ops. The feature is implemented in
https://reviews.llvm.org/D124726.

Fixes: https://reviews.llvm.org/D124726#3516346.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D125727
2022-05-16 15:46:59 -07:00
Ken Matsui a247ba9d15 Suggest typo corrections for preprocessor directives
When a preprocessor directive is unknown outside of a skipped
conditional block, we give an error diagnostic because we don't know
how to proceed with preprocessing. But when the directive is in a
skipped conditional block, we would not diagnose it on the theory that
the directive may be known to an implementation other than Clang.

Now, for unknown directives inside a skipped conditional block, we
diagnose the unknown directive as a warning if it is sufficiently
similar to a directive specific to preprocessor conditional blocks. For
example, we'll warn about `#esle` and suggest `#else` but we won't warn
about `#progma` because it's not a directive specific to preprocessor
conditional blocks.

Fixes #51598

Differential Revision: https://reviews.llvm.org/D124726
2022-05-13 09:16:46 -04:00
Timm Bäder b91073db6a [clang][preprocessor] Fix unsigned-ness of utf8 char literals
UTF8 char literals are always unsigned.

Fixes https://github.com/llvm/llvm-project/issues/54886

Differential Revision: https://reviews.llvm.org/D124996
2022-05-13 07:57:10 +02:00
Ken Matsui a1545f51a9 Warn if using `elifdef` & `elifndef` in not C2x & C++2b mode
This adds an extension warning when using the preprocessor conditionals
in a language mode they're not officially supported in, and an opt-in
warning for compatibility with previous standards.

Fixes #55306
Differential Revision: https://reviews.llvm.org/D125178
2022-05-12 09:26:44 -04:00
Alan Zhao 6398f3f2e9 [clang] Add the flag -ffile-reproducible
When Clang generates the path prefix (i.e. the path of the directory
where the file is) when generating FILE, __builtin_FILE(), and
std::source_location, Clang uses the platform-specific path separator
character of the build environment where Clang _itself_ is built. This
leads to inconsistencies in Chrome builds where Clang running on
non-Windows environments uses the forward slash (/) path separator
while Clang running on Windows builds uses the backslash (\) path
separator. To fix this, we add a flag -ffile-reproducible (and its
inverse, -fno-file-reproducible) to have Clang use the target's
platform-specific file separator character.

Additionally, the existing flags -fmacro-prefix-map and
-ffile-prefix-map now both imply -ffile-reproducible. This can be
overriden by setting -fno-file-reproducible.

[0]: https://crbug.com/1310767

Differential revision: https://reviews.llvm.org/D122766
2022-05-11 23:04:36 +02:00
Ken Matsui 786c721c2b Add extension diagnostic for linemarker directives
This adds the -Wgnu-line-marker diagnostic flag, grouped under -Wgnu,
to warn about use of the GNU linemarker preprocessor extension.

Fixes #55067

Differential Revision: https://reviews.llvm.org/D124534
2022-05-11 06:42:00 -04:00
Sam McCall 817550919e [Lex] Don't assert when decoding invalid UCNs.
Currently if a lexically-valid UCN encodes an invalid codepoint, then we
diagnose that, and then hit an assertion while trying to decode it.

Since there isn't anything preventing us reaching this state, remove the
assertion. expandUCNs("X\UAAAAAAAAY") will produce "XY".

Differential Revision: https://reviews.llvm.org/D125059
2022-05-06 08:51:42 +02:00
Cyndy Ishida b6c67c3c67 [clang] Track how headers get included generally during lookup time
tapi & clang-extractapi both attempt to construct then check against
how a header was included to determine api information when working
against multiple search paths, headermap, and vfsoverlay mechanisms.
Validating this against what the preprocessor sees during lookup time
makes this check more reliable.

Reviewed By: zixuw, jansvoboda11

Differential Revision: https://reviews.llvm.org/D124638
2022-05-04 09:52:31 -07:00
Senran Zhang ae76eb32a5 [NFC][Clang][Pragma] Remove unused variables
Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D124339
2022-04-24 14:50:59 +08:00
Christopher Di Bella e9a902c7f7 Revert "Revert "Revert "[clang][pp] adds '#pragma include_instead'"""
> Includes regression test for problem noted by @hans.
> is reverts commit 973de71.
>
> Differential Revision: https://reviews.llvm.org/D106898

Feature implemented as-is is fairly expensive and hasn't been used by
libc++. A potential reimplementation is possible if libc++ become
interested in this feature again.

Differential Revision: https://reviews.llvm.org/D123885
2022-04-22 16:37:20 +00:00
Jan Svoboda 340654e0f2 Revert "[clang][lex] NFCI: Use DirectoryEntryRef in HeaderSearch::load*()"
This reverts commit 1d3ba05e4a which caused failures of the VFS/real-path-found-first.m test on Windows build bots.
2022-04-20 20:27:14 +02:00
Jan Svoboda 99cfccdcb3 [clang][lex] NFCI: Use FileEntryRef in ModuleMap::diagnoseHeaderInclusion()
This patch removes uses of the deprecated `DirectoryEntry::getName()` from the `ModuleMap::diagnoseHeaderInclusion()` function by using `{File,Directory}EntryRef` instead.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D123856
2022-04-20 20:27:13 +02:00
Jan Svoboda f43ce5199d [clang][lex] NFCI: Use DirectoryEntryRef in FrameworkCacheEntry
This patch changes the member of `FrameworkCacheEntry` from `const DirectoryEntry *` to `Optional<DirectoryEntryRef>` in order to remove uses of the deprecated `DirectoryEntry::getName()`.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D123854
2022-04-20 19:01:02 +02:00