Commit Graph

2610 Commits

Author SHA1 Message Date
Ken Matsui a247ba9d15 Suggest typo corrections for preprocessor directives
When a preprocessor directive is unknown outside of a skipped
conditional block, we give an error diagnostic because we don't know
how to proceed with preprocessing. But when the directive is in a
skipped conditional block, we would not diagnose it on the theory that
the directive may be known to an implementation other than Clang.

Now, for unknown directives inside a skipped conditional block, we
diagnose the unknown directive as a warning if it is sufficiently
similar to a directive specific to preprocessor conditional blocks. For
example, we'll warn about `#esle` and suggest `#else` but we won't warn
about `#progma` because it's not a directive specific to preprocessor
conditional blocks.

Fixes #51598

Differential Revision: https://reviews.llvm.org/D124726
2022-05-13 09:16:46 -04:00
Timm Bäder b91073db6a [clang][preprocessor] Fix unsigned-ness of utf8 char literals
UTF8 char literals are always unsigned.

Fixes https://github.com/llvm/llvm-project/issues/54886

Differential Revision: https://reviews.llvm.org/D124996
2022-05-13 07:57:10 +02:00
Ken Matsui a1545f51a9 Warn if using `elifdef` & `elifndef` in not C2x & C++2b mode
This adds an extension warning when using the preprocessor conditionals
in a language mode they're not officially supported in, and an opt-in
warning for compatibility with previous standards.

Fixes #55306
Differential Revision: https://reviews.llvm.org/D125178
2022-05-12 09:26:44 -04:00
Alan Zhao 6398f3f2e9 [clang] Add the flag -ffile-reproducible
When Clang generates the path prefix (i.e. the path of the directory
where the file is) when generating FILE, __builtin_FILE(), and
std::source_location, Clang uses the platform-specific path separator
character of the build environment where Clang _itself_ is built. This
leads to inconsistencies in Chrome builds where Clang running on
non-Windows environments uses the forward slash (/) path separator
while Clang running on Windows builds uses the backslash (\) path
separator. To fix this, we add a flag -ffile-reproducible (and its
inverse, -fno-file-reproducible) to have Clang use the target's
platform-specific file separator character.

Additionally, the existing flags -fmacro-prefix-map and
-ffile-prefix-map now both imply -ffile-reproducible. This can be
overriden by setting -fno-file-reproducible.

[0]: https://crbug.com/1310767

Differential revision: https://reviews.llvm.org/D122766
2022-05-11 23:04:36 +02:00
Ken Matsui 786c721c2b Add extension diagnostic for linemarker directives
This adds the -Wgnu-line-marker diagnostic flag, grouped under -Wgnu,
to warn about use of the GNU linemarker preprocessor extension.

Fixes #55067

Differential Revision: https://reviews.llvm.org/D124534
2022-05-11 06:42:00 -04:00
Sam McCall 817550919e [Lex] Don't assert when decoding invalid UCNs.
Currently if a lexically-valid UCN encodes an invalid codepoint, then we
diagnose that, and then hit an assertion while trying to decode it.

Since there isn't anything preventing us reaching this state, remove the
assertion. expandUCNs("X\UAAAAAAAAY") will produce "XY".

Differential Revision: https://reviews.llvm.org/D125059
2022-05-06 08:51:42 +02:00
Cyndy Ishida b6c67c3c67 [clang] Track how headers get included generally during lookup time
tapi & clang-extractapi both attempt to construct then check against
how a header was included to determine api information when working
against multiple search paths, headermap, and vfsoverlay mechanisms.
Validating this against what the preprocessor sees during lookup time
makes this check more reliable.

Reviewed By: zixuw, jansvoboda11

Differential Revision: https://reviews.llvm.org/D124638
2022-05-04 09:52:31 -07:00
Senran Zhang ae76eb32a5 [NFC][Clang][Pragma] Remove unused variables
Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D124339
2022-04-24 14:50:59 +08:00
Christopher Di Bella e9a902c7f7 Revert "Revert "Revert "[clang][pp] adds '#pragma include_instead'"""
> Includes regression test for problem noted by @hans.
> is reverts commit 973de71.
>
> Differential Revision: https://reviews.llvm.org/D106898

Feature implemented as-is is fairly expensive and hasn't been used by
libc++. A potential reimplementation is possible if libc++ become
interested in this feature again.

Differential Revision: https://reviews.llvm.org/D123885
2022-04-22 16:37:20 +00:00
Jan Svoboda 340654e0f2 Revert "[clang][lex] NFCI: Use DirectoryEntryRef in HeaderSearch::load*()"
This reverts commit 1d3ba05e4a which caused failures of the VFS/real-path-found-first.m test on Windows build bots.
2022-04-20 20:27:14 +02:00
Jan Svoboda 99cfccdcb3 [clang][lex] NFCI: Use FileEntryRef in ModuleMap::diagnoseHeaderInclusion()
This patch removes uses of the deprecated `DirectoryEntry::getName()` from the `ModuleMap::diagnoseHeaderInclusion()` function by using `{File,Directory}EntryRef` instead.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D123856
2022-04-20 20:27:13 +02:00
Jan Svoboda f43ce5199d [clang][lex] NFCI: Use DirectoryEntryRef in FrameworkCacheEntry
This patch changes the member of `FrameworkCacheEntry` from `const DirectoryEntry *` to `Optional<DirectoryEntryRef>` in order to remove uses of the deprecated `DirectoryEntry::getName()`.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D123854
2022-04-20 19:01:02 +02:00
Jan Svoboda 1d3ba05e4a [clang][lex] NFCI: Use DirectoryEntryRef in HeaderSearch::load*()
This patch removes uses of the deprecated `DirectoryEntry::getName()` from `HeaderSearch::load*()` functions by using `DirectoryEntryRef` instead.

Note that we bail out in one case and use the also deprecated `FileEntry::getLastRef()`. That's to prevent this patch from growing, and is addressed in a follow-up.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D123771
2022-04-20 18:52:27 +02:00
Timm Bäder 33ec653055 [clang][lexer] Allow u8 character literal prefixes in C2x
Implement N2418 for C2x.

Differential Revision: https://reviews.llvm.org/D119221
2022-04-19 09:57:51 +02:00
Jan Svoboda 0b09b5d448 [clang][lex] NFC: Use FileEntryRef in PreprocessorLexer::getFileEntry()
This patch changes the return type of `PreprocessorLexer::getFileEntry()` so that its clients may stop using the deprecated APIs of `FileEntry`.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D123772
2022-04-15 15:16:17 +02:00
Paul Robinson 7726ad04e2 [PS5] Add basic PS5 driver behavior
This adds a PS5-specific ToolChain subclass, which defines some basic
PS5 driver behavior. Future patches will add more target-specific
driver behavior.
2022-04-14 12:45:33 -07:00
Jan Svoboda d79ad2f1db [clang][lex] NFCI: Use FileEntryRef in PPCallbacks::InclusionDirective()
This patch changes type of the `File` parameter in `PPCallbacks::InclusionDirective()` from `const FileEntry *` to `Optional<FileEntryRef>`.

With the API change in place, this patch then removes some uses of the deprecated `FileEntry::getName()` (e.g. in `DependencyGraph.cpp` and `ModuleDependencyCollector.cpp`).

Reviewed By: dexonsmith, bnbarham

Differential Revision: https://reviews.llvm.org/D123574
2022-04-14 10:46:12 +02:00
Timm Bäder 0eb5891adc [clang][preprocessor] Allow calling DumpToken() on annotation tokens
Differential Revision: https://reviews.llvm.org/D122659
2022-04-13 07:06:00 +02:00
Jan Svoboda b672638dbc [clang][deps] Ensure deterministic filename case
The dependency scanner can reuse single FileManager instance across multiple translation units. This may lead to non-deterministic output depending on which TU gets processed first.

One of the problems is that Clang uses DirectoryEntry::getName in the header search algorithm. This function returns the path that was first used to construct the (shared) entry in FileManager. Using DirectoryEntryRef::getName instead preserves the case as it was spelled out for the current "get directory entry" request.

rdar://90647508

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D123229
2022-04-08 09:18:00 +02:00
Paul Robinson 1e085448b3 [PS4] Fix header search list
A missing "break" in the initial implementation had us adding a
spurious "/usr/include" to the header search list. Later someone
introduced LLVM_FALLTHROUGH to prevent a warning.  Replace this with
the correct "break" and make sure the extra directory isn't added to
the PS4 header search list.
2022-04-05 14:14:13 -07:00
David Goldman d9739f29cd Serialize PragmaAssumeNonNullLoc to support preambles
Previously, if a `#pragma clang assume_nonnull begin` was at the
end of a premable with a `#pragma clang assume_nonnull end` at the
end of the main file, clang would diagnose an unterminated begin in
the preamble and an unbalanced end in the main file.

With this change, those errors no longer occur and the case above is
now properly handled. I've added a corresponding test to clangd,
which makes use of preambles, in order to verify this works as
expected.

Differential Revision: https://reviews.llvm.org/D122179
2022-03-31 11:08:01 -04:00
Chuanqi Xu ee572129ae [C++20] [Modules] Use '-' as the separator of partitions when searching
in filesystems

It is simpler to search for module unit by -fprebuilt-module-path
option. However, the separator ':' of partitions is not friendly.
According to the discussion in https://reviews.llvm.org/D118586, I think
we get consensus to use '-' as the separator instead. The '-' is the
choice of GCC too.

Previously I thought it would be better to add an option. But I feel it
is over-engineering now. Another reason here is that there are too many
options for modules (for clang module mainly) now. Given it is not bad
to use '-' when searching, I think it is acceptable to not add an
option.

Reviewed By: iains

Differential Revision: https://reviews.llvm.org/D120874
2022-03-31 11:21:58 +08:00
Iain Sandoe 6c0e60e884 [C++20][Modules][HU 1/5] Introduce header units as a module type.
This is the first in a series of patches that introduce C++20 importable
header units.

These differ from clang header modules in that:
 (a) they are identifiable by an internal name
 (b) they represent the top level source for a single header - although
     that might include or import other headers.

We name importable header units with the path by which they are specified
(although that need not be the absolute path for the file).

So "foo/bar.h" would have a name "foo/bar.h".  Header units are made a
separate module type so that we can deal with diagnosing places where they
are permitted but a named module is not.

Differential Revision: https://reviews.llvm.org/D121095
2022-03-25 09:17:14 +00:00
Jan Svoboda 59dadd178b [clang][lex] Fix failures with Microsoft header search rules
`HeaderSearch` currently assumes `LookupFileCache` is eventually populated in `LookupFile`. However, that's not always the case with `-fms-compatibility` and its early returns.

This patch adds a defensive check that the iterator pulled out of the cache is actually valid before using it.

(This bug was introduced in D119721. Before that, the cache was initialized to `0` - essentially the `search_dir_begin()` iterator.)

Reviewed By: dexonsmith, erichkeane

Differential Revision: https://reviews.llvm.org/D122237
2022-03-23 14:49:17 +01:00
Zahira Ammarguellat bbf0d1932a Currently the control of the eval-method is mixed with fast-math.
FLT_EVAL_METHOD tells the user the precision at which, temporary results
are evaluated but when fast-math is enabled, the numeric values are not
guaranteed to match the source semantics, so the eval-method is
meaningless.
For example, the expression `x + y + z` has as source semantics `(x + y)
+ z`. FLT_EVAL_METHOD is telling the user at which precision `(x + y)`
is evaluated. With fast-math enable the compiler can choose to
evaluate the expression as `(y + z) + x`.
The correct behavior is to set the FLT_EVAL_METHOD to `-1` to tell the
user that the precision of the intermediate values is unknow. This
patch is doing that.

Differential Revision: https://reviews.llvm.org/D121122
2022-03-17 11:48:03 -07:00
Jan Svoboda 6007b0b67b [clang][deps] NFC: Use range-based for loop instead of iterators
The iterator is not needed after the loop body anymore, meaning we can use more terse range-based for loop.

Depends on D121295.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121685
2022-03-16 12:17:53 +01:00
Jan Svoboda 77924d60ef [clang][deps] Modules don't contribute to search path usage
To reduce the number of modules we build in explicit builds (which use strict context hash), we prune unused header search paths. This essentially merges parts of the dependency graph.

Determining whether a search path was used to discover a module (through implicit module maps) proved to be somewhat complicated. Initial support landed in D102923, while D113676 attempts to fix some bugs.

However, now that we don't use implicit module maps in explicit builds (since D120465), we don't need to consider such search paths as used anymore. Modules are no longer discovered through the header search mechanism, so we can drop such search paths (provided they are not needed for other reasons).

This patch removes whatever support for detecting such usage we had, since it's buggy and not required anymore.

Depends on D120465.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121295
2022-03-16 12:17:52 +01:00
Aaron Ballman 9e3e85ac6e Silence -Wlogical-op-parentheses and fix a logic bug while doing so 2022-03-14 10:13:39 -04:00
Aaron Ballman 8cba72177d Implement literal suffixes for _BitInt
WG14 adopted N2775 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2775.pdf)
at our Feb 2022 meeting. This paper adds a literal suffix for
bit-precise types that automatically sizes the bit-precise type to be
the smallest possible legal _BitInt type that can represent the literal
value. The suffix chosen is wb (for a signed bit-precise type) which
can be combined with the u suffix (for an unsigned bit-precise type).

The preprocessor continues to operate as-if all integer types were
intmax_t/uintmax_t, including bit-precise integer types. It is a
constraint violation if the bit-precise literal is too large to fit
within that type in the context of the preprocessor (when still using
a pp-number preprocessing token), but it is not a constraint violation
in other circumstances. This allows you to make bit-precise integer
literals that are wider than what the preprocessor currently supports
in order to initialize variables, etc.
2022-03-14 09:24:19 -04:00
Paul Robinson 7b85f0f32f [PS4] isPS4 and isPS4CPU are not meaningfully different 2022-03-03 11:36:59 -05:00
Dawid Jurczak d813116c9d [NFC][Lexer] Remove getLangOpts function from Lexer
Given that there is only one external user of Lexer::getLangOpts
we can remove getter entirely without much pain.

Differential Revision: https://reviews.llvm.org/D120404
2022-03-02 11:17:05 +01:00
Adam Czachorowski 8f4ea36bfe [clang] Improve laziness of resolving module map headers.
clang has support for lazy headers in module maps - if size and/or
modtime and provided in the cppmap file, headers are only resolved when
an include directive for a file with that size/modtime is encoutered.

Before this change, the lazy resolution was all-or-nothing per module.
That means as soon as even one file in that module potentially matched
an include, all lazy files in that module were resolved. With this
change, only files with matching size/modtime will be resolved.

The goal is to avoid unnecessary stat() calls on non-included files,
which is especially valuable on networked file systems, with higher
latency.

Differential Revision: https://reviews.llvm.org/D120569
2022-03-01 15:56:23 +01:00
Dawid Jurczak a64d3c602f [NFC][Lexer] Make Lexer::LangOpts const reference
This change can be seen as code cleanup but motivation is more performance related.
While browsing perf reports captured during Linux build we can notice unusual portion of instructions executed in std::vector<std::string> copy constructor like:

0.59%     0.58%  clang-14    clang-14      [.] std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
                                                                std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::vector

or even:

1.42%     0.26%  clang    clang-14             [.] clang::LangOptions::LangOptions
       |
        --1.16%--clang::LangOptions::LangOptions
                  |
                   --0.74%--std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
                            std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::vector

After more digging we can see that relevant LangOptions std::vector members (*Files, ModuleFeatures and NoBuiltinFuncs)
are constructed when Lexer::LangOpts field is initialized on list:

Lexer::Lexer(..., const LangOptions &langOpts, ...)
            : ..., LangOpts(langOpts),

Since LangOptions copy constructor is called by Lexer(..., const LangOptions &LangOpts,...) and local Lexer objects are created thousands times
(in Lexer::getRawToken, Preprocessor::EnterSourceFile and more) during single module processing in frontend it makes std::vector copy constructors surprisingly hot.

Unfortunately even though in current Lexer implementation mentioned std::vector members are unused and most of time empty,
no compiler is smart enough to optimize their std::vector copy constructors out (take a look at test assembly): https://godbolt.org/z/hdoxPfMYY even with LTO enabled.
However there is simple way to fix this. Since Lexer doesn't access *Files, ModuleFeatures, NoBuiltinFuncs and any other LangOptions fields (but only LangOptionsBase)
we can simply get rid of redundant copy constructor assembly by changing LangOpts type to more appropriate const LangOptions reference: https://godbolt.org/z/fP7de9176

Additionally we need to store LineComment outside LangOpts because it's written in SkipLineComment function.
Also FormatTokenLexer need to be adjusted a bit to avoid lifetime issues related to passing local LangOpts reference to Lexer.

After this change I can see more than 1% speedup in some of my microbenchmarks when using Clang release binary built with LTO.
For Linux build gains are not so significant but still nice at the level of -0.4%/-0.5% instructions drop.

Differential Revision: https://reviews.llvm.org/D120334
2022-02-28 15:42:19 +01:00
Zahira Ammarguellat 1592d88aa7 Add support for floating-point option `ffp-eval-method` and for
`pragma clang fp eval_method`.

Differential Revision: https://reviews.llvm.org/D109239
2022-02-23 15:00:18 -08:00
Dawid Jurczak fbe38a784e [NFC][Lexer] Make access to LangOpts more consistent
Before this change without any good reason Lexer::LangOpts is sometimes accessed by getter and another time read directly in Lexer functions.
Since getLangOpts is a bit more verbose prefer direct access to LangOpts member when possible.

Differential Revision: https://reviews.llvm.org/D120333
2022-02-23 12:46:13 +01:00
Florian Hahn 09193f20a1
Revert "Add support for floating-point option `ffp-eval-method` and for"
This reverts commit 32b73bc6ab.

This breaks builds on macOS in some configurations, because
__FLT_EVAL_METHOD__ is set to an unexpected value.

E.g.
https://green.lab.llvm.org/green/job/clang-stage1-RA/28282/consoleFull#129538464349ba4694-19c4-4d7e-bec5-911270d8a58c

More details available in the review thread
https://reviews.llvm.org/D109239
2022-02-18 11:04:00 +00:00
Zahira Ammarguellat 32b73bc6ab Add support for floating-point option `ffp-eval-method` and for
`pragma clang fp eval_method`.

https://reviews.llvm.org/D109239
2022-02-17 08:59:21 -08:00
Nico Weber 125abb61f7 Revert "Add support for floating-point option `ffp-eval-method` and for"
This reverts commit 4bafe65c2b.
Breaks at least Misc/warning-flags.c, see comments on
https://reviews.llvm.org/D109239
2022-02-15 22:02:25 -05:00
Zahira Ammarguellat 4bafe65c2b Add support for floating-point option `ffp-eval-method` and for
`pragma clang fp eval_method`.
2022-02-15 13:59:27 -08:00
Jan Svoboda e7dcf09fc3 [clang][lex] Use `SearchDirIterator` types in for loops
This patch replaces a lot of index-based loops with iterators and ranges.

Depends on D117566.

Reviewed By: ahoppen

Differential Revision: https://reviews.llvm.org/D119722
2022-02-15 11:02:26 +01:00
Jan Svoboda 17c9fcd6f6 [clang][lex] Use `ConstSearchDirIterator` in lookup cache
This patch starts using the new iterator type in `LookupFileCacheInfo`.

Depends on D117566.

Reviewed By: ahoppen

Differential Revision: https://reviews.llvm.org/D119721
2022-02-15 10:39:05 +01:00
Jan Svoboda 7631c366c8 [clang][lex] Introduce `ConstSearchDirIterator`
The `const DirectoryLookup *` out-parameter of `{HeaderSearch,Preprocessor}::LookupFile()` is assigned the most recently used search directory, which callers use to implement `#include_next`.

From the function signature it's not obvious the `const DirectoryLookup *` is being used as an iterator. This patch introduces `ConstSearchDirIterator` to make that affordance obvious. This would've prevented a bug that occurred after initially landing D116750.

Reviewed By: ahoppen

Differential Revision: https://reviews.llvm.org/D117566
2022-02-15 10:36:54 +01:00
Jan Svoboda a081a0654f [clang][lex] NFC: De-duplicate some #include_next logic
This patch addresses a FIXME and de-duplicates some `#include_next` logic

Depends on D119714.

Reviewed By: ahoppen

Differential Revision: https://reviews.llvm.org/D119716
2022-02-15 09:52:39 +01:00
Jan Svoboda d8298f04a9 [clang][lex][minimizer] Avoid treating path separators as comments
The minimizer strips out single-line comments (introduced by `//`). This sequence of characters can also appear in `#include` or `#import` directives where they play the role of path separators. We already avoid stripping this character sequence for `#include` but not for `#import` (which has the same semantics). This patch makes it so `#import <A//A.h>` is not affected by minimization. Previously, we would incorrectly reduce it into `#import <A`.

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D119226
2022-02-15 09:49:19 +01:00
Jan Svoboda fd2dff17c5 [clang][lex][minimizer] Ensure whitespace between squashed lines
The minimizer tries to squash multi-line macro definitions into single line. For that to work, contents of each line need to be separated by a space. Since we always strip leading whitespace on lines of a macro definition, the code currently tries to preserve exactly one space that appeared before the backslash.

This means the following code:

```
#define FOO(BAR) \
  #BAR           \
  baz
```

gets minimized into:

```
#define FOO(BAR) #BAR baz
```

However, if there are no spaces before the backslash on line 2:

```
#define FOO(BAR) \
  #BAR\
  baz
```

no space can be preserved, leading to (most likely) malformed macro definition:

```
#define FOO(BAR) #BARbaz
```

This patch makes sure we always put exactly one space at the end of line ending with a backslash.

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D119231
2022-02-15 09:49:03 +01:00
Jan Svoboda edd09bb5a4 [clang][lex] Remove `Preprocessor::GetCurDirLookup()`
`Preprocessor` exposes the search directory iterator via `GetCurDirLookup()` getter, which is only used in two static functions.

To simplify reasoning about search directory iterators/references and to simplify the `Preprocessor` API, this patch makes the two static functions private member functions and removes the getter entirely.

Depends D119708.

Reviewed By: ahoppen, dexonsmith

Differential Revision: https://reviews.llvm.org/D119714
2022-02-15 09:48:25 +01:00
Jan Svoboda 7a124f4859 [clang][lex] Remove `PPCallbacks::FileNotFound()`
The purpose of the `FileNotFound` preprocessor callback was to add the ability to recover from failed header lookups. This was to support downstream project.

However, injecting additional search path while performing header search can invalidate currently used iterators/references to `DirectoryLookup` in `Preprocessor` and `HeaderSearch`.

The downstream project ended up maintaining a separate patch to further tweak the functionality. Since we don't have any upstream users nor open source downstream users, I'd like to remove this callback for good to prevent future misuse. I doubt there are any actual downstream users, since the functionality is definitely broken at the moment.

Reviewed By: ahoppen

Differential Revision: https://reviews.llvm.org/D119708
2022-02-15 09:48:25 +01:00
Alex Lorenz 00cd6c0420 [Preprocessor] Reduce the memory overhead of `#define` directives (Recommit)
Recently we observed high memory pressure caused by clang during some parallel builds.
We discovered that we have several projects that have a large number of #define directives
in their TUs (on the order of millions), which caused huge memory consumption in clang due
to a lot of allocations for MacroInfo. We would like to reduce the memory overhead of
clang for a single #define to reduce the memory overhead for these files, to allow us to
reduce the memory pressure on the system during highly parallel builds. This change achieves
that by removing the SmallVector in MacroInfo and instead storing the tokens in an array
allocated using the bump pointer allocator, after all tokens are lexed.

The added unit test with 1000000 #define directives illustrates the problem. Prior to this
change, on arm64 macOS, clang's PP bump pointer allocator allocated 272007616 bytes, and
used roughly 272 bytes per #define. After this change, clang's PP bump pointer allocator
allocates 120002016 bytes, and uses only roughly 120 bytes per #define.

For an example test file that we have internally with 7.8 million #define directives, this
change produces the following improvement on arm64 macOS: Persistent allocation footprint for
this test case file as it's being compiled to LLVM IR went down 22% from 5.28 GB to 4.07 GB
and the total allocations went down 14% from 8.26 GB to 7.05 GB. Furthermore, this change
reduced the total number of allocations made by the system for this clang invocation from
1454853 to 133663, an order of magnitude improvement.

The recommit fixes the LLDB build failure.

Differential Revision: https://reviews.llvm.org/D117348
2022-02-14 09:27:44 -08:00
Alex Lorenz 3f05192c4c Revert "[Preprocessor] Reduce the memory overhead of `#define` directives"
This reverts commit 0d9b91524e.

This change broke LLDB's build. I will need to recommit after fixing LLDB.
2022-02-11 15:53:16 -08:00
Alex Lorenz 0d9b91524e [Preprocessor] Reduce the memory overhead of `#define` directives
Recently we observed high memory pressure caused by clang during some parallel builds.
We discovered that we have several projects that have a large number of #define directives
in their TUs (on the order of millions), which caused huge memory consumption in clang due
to a lot of allocations for MacroInfo. We would like to reduce the memory overhead of
clang for a single #define to reduce the memory overhead for these files, to allow us to
reduce the memory pressure on the system during highly parallel builds. This change achieves
that by removing the SmallVector in MacroInfo and instead storing the tokens in an array
allocated using the bump pointer allocator, after all tokens are lexed.

The added unit test with 1000000 #define directives illustrates the problem. Prior to this
change, on arm64 macOS, clang's PP bump pointer allocator allocated 272007616 bytes, and
used roughly 272 bytes per #define. After this change, clang's PP bump pointer allocator
allocates 120002016 bytes, and uses only roughly 120 bytes per #define.

For an example test file that we have internally with 7.8 million #define directives, this
change produces the following improvement on arm64 macOS: Persistent allocation footprint for
this test case file as it's being compiled to LLVM IR went down 22% from 5.28 GB to 4.07 GB
and the total allocations went down 14% from 8.26 GB to 7.05 GB. Furthermore, this change
reduced the total number of allocations made by the system for this clang invocation from
1454853 to 133663, an order of magnitude improvement.

Differential Revision: https://reviews.llvm.org/D117348
2022-02-11 15:01:10 -08:00
ZezhengLi d7969012e4 [C++20] [Modules] Check if modulemap exists to avoid crash in implicit used C++ module
An impilt used of C++ module without prebuild path may cause crash.

For example:

```
// ./dir1/C.cppm
export module C;
// ./dir2/B.cppm
export module B;
import C;
// ./A.cpp
import B;
import C;
```

When we compile A.cpp without the prebuild path of C.pcm, the compiler
will crash.

```
clang++ -std=c++20 --precompile -c ./dir1/C.cppm -o ./dir1/C.pcm
clang++ -std=c++20 --precompile -fprebuilt-module-path=./dir2  -c
./dir2/B.cppm -o ./dir2/B.pcm
clang++ -std=c++20 -fprebuilt-module-path=./dir2 A.cpp

```

The prebuilt path of module C is cached when import module B, and in the
function HeaderSearch::getCachedModuleFileName, the compiler try to get
the filename by modulemap without check if modulemap exists, and there
is no modulemap in C++ module.

Reviewed By: ChuanqiXu

Differential review: https://reviews.llvm.org/D119426
2022-02-11 11:22:50 +08:00
David Goldman 9385ece95a [HeaderSearch] Track framework name in LookupFile
Previously, the Framework name was only set if the file
came from a header mapped framework; now we'll always
set the framework name if the file is in a framework.

Differential Revision: https://reviews.llvm.org/D117830
2022-02-04 13:32:39 -05:00
Alex Lorenz 979d0ee8ab [clang] fix out of bounds access in an empty string when lexing a _Pragma with missing string token
The lexer can attempt to lex a _Pragma and crash with an out of bounds string access when it's
lexing a _Pragma whose string token is an invalid buffer, e.g. when a module header file from which the macro
expansion for that token was deleted from the file system.

Differential Revision: https://reviews.llvm.org/D116052
2022-02-02 11:16:11 -08:00
Kadir Cetinkaya ff77071a4d
[clang][Lexer] Make raw and normal lexer behave the same for line comments
Normally there are heruistics in lexer to treat `//*` specially in
language modes that don't have line comments (to emit `/`). Unfortunately this
only applied to the first occurence of a line comment inside the file, as the
subsequent line comments were treated as if language had support for them.

This unfortunately only holds in normal lexing mode, as in raw mode all
occurences of line comments received this treatment, which created discrepancies
when comparing expanded and spelled tokens.

The proper fix would be to just make sure we treat all the line comments with a
subsequent `*` the same way, but it would imply breaking some code that's
accepted by clang today. So instead we introduce the same bug into raw lexing
mode.

Fixes https://github.com/clangd/clangd/issues/1003.

Differential Revision: https://reviews.llvm.org/D118471
2022-01-31 16:15:16 +01:00
Jan Svoboda f720272330 [clang][lex] Include tracking: simplify and move to preprocessor
This patch replaces the exact include count of each file in `HeaderFileInfo` with a set of included files in `Preprocessor`.

The number of includes isn't a property of a header file but rather a preprocessor state. The exact number of includes is not used anywhere except statistic tracking.

Reviewed By: vsapsai

Differential Revision: https://reviews.llvm.org/D114095
2022-01-26 15:56:26 +01:00
Jan Svoboda 105c913156 [clang][lex] NFC: Simplify calls to `LookupFile`
The `{HeaderSearch,Preprocessor}::LookupFile()` functions take an out-parameter `const DirectoryLookup *&`. Most callers end up creating a `const DirectoryLookup *` variable that's otherwise unused.

This patch changes the out-parameter from reference to a pointer, making it possible to simply pass `nullptr` to the function without the ceremony.

Reviewed By: ahoppen

Differential Revision: https://reviews.llvm.org/D117312
2022-01-18 16:02:18 +01:00
Jan Svoboda ccd7e7830f Revert "[clang][lex] Keep references to `DirectoryLookup` objects up-to-date"
This reverts commit 8503c688. This patch causes some issues with `#include_next`: https://github.com/llvm/llvm-project/issues/53161
2022-01-13 16:29:44 +01:00
Jan Svoboda f77d115cc1 [clang] Move `ApplyHeaderSearchOptions` from Frontend to Lex
In D116750, the `clangFrontend` library was added as a dependency of `LexTests` in order to make `clang::ApplyHeaderSearchOptions()` available. This increased the number of TUs the test depends on.

This patch moves the function into `clangLex` and removes dependency of `LexTests` on `clangFrontend`.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D117024
2022-01-11 17:57:40 +01:00
Jan Svoboda 8503c688d5 [clang][lex] Keep references to `DirectoryLookup` objects up-to-date
The elements of `SearchPath::SearchDirs` are being referenced to by their indices. This proved to be error-prone: `HeaderSearch::SearchDirToHSEntry` was accidentally not being updated in `HeaderSearch::AddSearchPath()`. This patch fixes that by referencing `SearchPath::SearchDirs` elements by their address instead, which is stable thanks to the bump-ptr-allocation strategy.

Reviewed By: ahoppen

Differential Revision: https://reviews.llvm.org/D116750
2022-01-11 15:24:46 +01:00
David Goldman 0cf7e61a42 [clang][HeaderSearch] Support framework includes in suggestPath...
Clang will now search through the framework includes to identify
the framework include path to a file, and then suggest a framework
style include spelling for the file.

Differential Revision: https://reviews.llvm.org/D115183
2022-01-10 12:25:53 -05:00
Jan Svoboda d5ba066cb6 [clang][lex] NFC: Move some HeaderSearch functions to .cpp file 2022-01-06 16:32:02 +01:00
Jan Svoboda 32c2ea5c33 [clang][lex] NFC: Simplify loop 2022-01-05 14:40:47 +01:00
Kazu Hirata 298367ee6e [clang] Use nullptr instead of 0 or NULL (NFC)
Identified with modernize-use-nullptr.
2021-12-29 08:34:20 -08:00
Kazu Hirata 713ee230f8 [clang] Use llvm::reverse (NFC) 2021-12-17 16:51:42 -08:00
Chuanqi Xu e587372f85 [C++20] [Module] Support extern C/C++ semantics
According to [module.unit]p7.2.3, a declaration within a linkage-specification
should be attached to the global module.
This let user to forward declare types across modules.

Reviewed by: rsmith, aaron.ballman

Differential Revision: https://reviews.llvm.org/D110215
2021-12-08 13:29:16 +08:00
Jan Svoboda 197576c409 [clang][lex] Refactor check for the first file include
This patch refactors the code that checks whether a file has just been included for the first time.

The `HeaderSearch::FirstTimeLexingFile` function is removed and the information is threaded to the original call site from `HeaderSearch::ShouldEnterIncludeFile`. This will make it possible to avoid tracking the number of includes in a follow up patch.

Depends on D114092.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D114093
2021-11-18 13:01:07 +01:00
Daan De Meyer 5a6dac66db LiteralSupport: Don't assert() on invalid input
When using clangd, it's possible to trigger assertions in
NumericLiteralParser and CharLiteralParser when switching git branches.
This commit removes the initial asserts on invalid input and replaces
those asserts with the error handling mechanism from those respective
classes instead. This allows clangd to gracefully recover without
crashing.

See https://github.com/clangd/clangd/issues/888 for more information
on the clangd crashes.
2021-11-17 23:51:30 +00:00
Nathan Sidwell bf6986d99e [clang] GCC directive extension extension: Hash NNN lines
Some time back I extended GCC's '# NNN' line marker semantics.
Specifically popping to a blank filename will restore the filename to
that of the popped-to include.  Restore to line 5 of including file
(escaped BOL #'s to avoid git eliding them):

\# 5 "" 2

Added documentation for this line control extension.

This was useful in developing modules tests, but turned out to also be
useful with machine-generated source code.  Specifically, a generated
include file that itself includes fragments from elsewhere.  The
ability to pop to the generated include file -- with its full path
prefix -- is useful for diagnostic & debug purposes.  For instance
something like:

// Machine generated -- DO NOT EDIT
Type Var = {
\# 7 "encoded.dsl" 1 // push to snippet-container
{snippet, of, code}
\# 6 " 2 // Restore to machined-generated source
,
};

// user-code
...
\#include "dsl.h"
...

That pop to "" will restore the filename to '..includepath../dsl.h',
which is better than restoring to plain "dsl.h".

Differential Revision: https://reviews.llvm.org/D113425
2021-11-09 07:31:03 -08:00
Benjamin Kramer 8adb6d6de2 [clang] Use llvm::reverse. NFCI. 2021-11-07 14:24:33 +01:00
Sylvain Audi a82a844961 [clang][deps] Keep #pragma push_macro, pop_macro and include_alias when minimizing source code.
The #pragma directives push_macro/pop_macro and include_alias may influence the #include / import directives encountered by dependency scanning tools like clang-scan-deps.

This patch ensures that those directives are not removed during source code minimization.

Differential Revision: https://reviews.llvm.org/D112088
2021-11-01 16:04:52 -04:00
Duncan P. N. Exon Smith 9902362701 Support: Use sys::path::is_style_{posix,windows}() in a few places
Use the new sys::path::is_style_posix() and is_style_windows() in a few
places that need to detect the system's native path style.

In llvm/lib/Support/Path.cpp, this patch removes most uses of the
private `real_style()`, where is_style_posix() and is_style_windows()
are just a little tidier.

Elsewhere, this removes `_WIN32` macro checks. Added a FIXME to a
FileManagerTest that seemed fishy, but maintained the existing
behaviour.

Differential Revision: https://reviews.llvm.org/D112289
2021-10-29 12:09:41 -07:00
Kazu Hirata 16ceb44e62 [clang] Use llvm::{count,count_if,find_if,all_of,none_of} (NFC) 2021-10-25 09:14:45 -07:00
Kazu Hirata 7cc8fa2dd2 Use llvm::is_contained (NFC) 2021-10-24 09:32:57 -07:00
Kazu Hirata dccfaddc6b [clang] Use StringRef::contains (NFC) 2021-10-21 08:58:19 -07:00
Kazu Hirata d245f2e859 [clang] Use llvm::erase_if (NFC) 2021-10-17 13:50:29 -07:00
Aaron Ballman 2edb89c746 Lex arguments for __has_cpp_attribute and friends as expanded tokens
The C and C++ standards require the argument to __has_cpp_attribute and
__has_c_attribute to be expanded ([cpp.cond]p5). It would make little sense
to expand the argument to those operators but not expand the argument to
__has_attribute and __has_declspec, so those were both also changed in this
patch.

Note that it might make sense for the other builtins to also expand their
argument, but it wasn't as clear to me whether the behavior would be correct
there, and so they were left for a future revision.
2021-10-17 07:54:48 -04:00
Volodymyr Sapsai d0e7bdc208 [modules] Make a module map referenced by a system map a system one too.
Mimic the behavior of including headers where a system includer makes an
includee a system header too.

rdar://84049469

Differential Revision: https://reviews.llvm.org/D111476
2021-10-15 12:46:51 -07:00
Cyndy Ishida 395e1fe305 [clang] Capture Framework when HeaderSearch is resolved via headermap
When building frameworks, headermaps responsible for mapping angle-included headers to their source file location are passed via
`-I` and not `-index-header-map`. Also, `-index-header-map` is only used for indexing purposes and not during most builds.
This patch holds on to the framework's name in HeaderFileInfo as this is retrieveable for cases outside of IndexHeaderMaps and
still represents the framework that is being built.

resolves: rdar://84046893

Reviewed By: jansvoboda11

Differential Revision: https://reviews.llvm.org/D111468
2021-10-15 09:12:31 -07:00
Kazu Hirata e567f37dab [clang] Use llvm::is_contained (NFC) 2021-10-13 20:41:55 -07:00
Jan Svoboda 4445135109 [clang][lex] Remark on search path usage
For dependency scanning, it would be useful to collect header search paths (provided on command-line via `-I` and friends) that were actually used during preprocessing. This patch adds that feature to `HeaderSearch` along with a new remark that reports such paths as they get used.

Previous version of this patch tried to use the existing `LookupFileCache` to report used paths via `HitIdx`. That doesn't work for `ComputeUserEntryUsage` (which is intended to be called *after* preprocessing), because it indexes used search paths by the file name. This means the values get overwritten when the code contains `#include_next`.

Note that `HeaderSearch` doesn't use `HeaderSearchOptions::UserEntries` directly. Instead, `InitHeaderSearch` pre-processes them (adds platform-specific paths, removes duplicates, removes paths that don't exist) and creates `DirectoryLookup` instances. This means we need a mechanism for translating between those two. It's not possible to go from `DirectoryLookup` back to the original `HeaderSearch`, so `InitHeaderSearch` now tracks the relationships explicitly.

Depends on D111557.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102923
2021-10-12 12:20:55 +02:00
Jan Svoboda 1341a2c19e [clang][modules] Default `SourceLocation` parameter in `HeaderSearch::lookupModule`
This fixes an LLDB build failure where the `ImportLoc` argument is missing: https://lab.llvm.org/buildbot#builders/68/builds/19975

This change also makes it possible to drop `SourceLocation()` in `Preprocessor::getCurrentModule`.
2021-10-12 09:58:54 +02:00
Jan Svoboda 638c673a8c [clang][modules] NFC: Propagate import `SourceLocation` into `HeaderSearch::lookupModule`
This patch propagates the import `SourceLocation` into `HeaderSearch::lookupModule`. This enables remarks on search path usage (implemented in D102923) to point to the source code that initiated header search.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111557
2021-10-12 09:31:51 +02:00
Aaron Ballman af971365a2 Fix a diagnoses-valid in C++20 with variadic macros
C++20 and later allow you to pass no argument for the ... parameter in
a variadic macro, whereas earlier language modes and C disallow it.

We no longer diagnose in C++20 and later modes. This fixes PR51609.
2021-10-09 08:20:20 -04:00
Richard Smith 7063b76b02 PR50644: Do not warn on a declaration of `operator"" _foo`.
Also do not warn on `#define _foo` or `#undef _foo`.

Only global scope names starting with _[a-z] are reserved, not the use
of such an identifier in any other context.
2021-10-06 15:13:05 -07:00
Jay Foad d933adeaca [APInt] Stop using soft-deprecated constructors and methods in clang. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in clang.

Differential Revision: https://reviews.llvm.org/D110808
2021-10-04 09:38:11 +01:00
Dávid Bolvanský fb84aa2a8f Fixed warnings in target/parser codes produced by -Wbitwise-instead-of-logicala 2021-10-03 15:04:01 +02:00
Nico Weber e31899c708 Reland "[clang-cl] Accept `#pragma warning(disable : N)` for some N"
This reverts commit 0cd9d8a48b and
adds the changes described in https://reviews.llvm.org/D110668#3034461.
2021-09-30 15:03:23 -04:00
Amy Huang 0cd9d8a48b Revert "[clang-cl] Accept `#pragma warning(disable : N)` for some N"
because it causes `error: error reading '/wd4091'` errors in
compiler-rt builds.
2021-09-29 18:46:55 -07:00
Nico Weber b2de52bec1 [clang-cl] Accept `#pragma warning(disable : N)` for some N
clang-cl maps /wdNNNN to -Wno-flags for a few warnings that map
cleanly from cl.exe concepts to clang concepts.

This patch adds support for the same numbers to
`#pragma warning(disable : NNNN)`. It also lets
`#pragma warning(push)` and `#pragma warning(pop)` have an effect,
since these are used together with `warning(disable)`.

The optional numeric argument to `warning(push)` is ignored,
as are the other non-`disable` `pragma warning()` arguments.
(Supporting `error` would be easy, but we also don't support
`/we`, and those should probably be added together.)

The motivating example is that a bunch of code (including in LLVM)
uses this idiom to locally disable warnings about calls to deprecated
functions in Windows-only code, and 4996 maps nicely to
-Wno-deprecated-declarations:

    #pragma warning(push)
    #pragma warning(disable: 4996)
      f();
    #pragma warning(pop)

Implementation-wise:
- Move `/wd` flag handling from Options.td to actual Driver-level code
- Extract the function mapping cl.exe IDs to warning groups to the
  new file clang/lib/Basic/CLWarnings.cpp
- Create a diag::Group enum so that CLWarnings.cpp can refer to
  existing groups by ID (and give DllexportExplicitInstantiationDecl
  a named group), and add a function to map a diag::Group to the
  spelling of it's associated commandline flag
- Call that new function from PragmaWarningHandler

Differential Revision: https://reviews.llvm.org/D110668
2021-09-29 13:14:23 -04:00
Nico Weber 5cf0606140 [clang] Let PPCallbacks::PragmaWarning() pass specifier as enum instead of string
Differential Revision: https://reviews.llvm.org/D110635
2021-09-28 19:47:27 -04:00
Chris Bieneman 1e48ef2035 Implement #pragma clang final extension
This patch adds a new preprocessor extension ``#pragma clang final``
which enables warning on undefinition and re-definition of macros.

The intent of this warning is to extend beyond ``-Wmacro-redefined`` to
warn against any and all alterations to macros that are marked `final`.

This warning is part of the ``-Wpedantic-macros`` diagnostics group.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D108567
2021-09-27 14:11:16 -05:00
Aaron Ballman 38d09080c9 Removing a default constructor argument; NFC
The argument is always used with its default value, so remove the
argument entirely.
2021-09-27 09:41:28 -04:00
Corentin Jabot afb6223bc5 Support Unicode 14 identifiers
This update the UAX tables to support new Unicode 14 identifiers.
2021-09-16 13:21:27 -04:00
Corentin Jabot 274adcb866 Implement delimited escape sequences.
\x{XXXX} \u{XXXX} and \o{OOOO} are accepted in all languages mode
in characters and string literals.

This is a feature proposed for both C++ (P2290R1) and C (N2785). The
papers have been seen by both committees but are not yet adopted into
either standard. However, they do have support from both committees.
2021-09-15 09:54:49 -04:00
Corentin Jabot 601102d282 Cleanup identifier parsing; NFC
Rename methods to clearly signal when they only deal with ASCII,
simplify the parsing of identifier, and use start/continue instead of
head/body for consistency with Unicode terminology.
2021-09-14 09:12:22 -04:00
Kazuaki Ishizaki 8f77dc459e [clang] NFC: Fix trivial typo in comments and document
`the the` -> `the`

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D77470
2021-09-04 12:59:42 +05:30
Volodymyr Sapsai 64ebf313a7 [HeaderSearch] Use `isImport` only for imported headers and not for `#pragma once`.
There is a separate field `isPragmaOnce` and when `isImport` combines
both, it complicates HeaderFileInfo serialization as `#pragma once` is
the inherent property of the header while `isImport` reflects how other
headers use it. The usage of the header can be different in different
contexts, that's why `isImport` requires tracking separate from `#pragma once`.

Differential Revision: https://reviews.llvm.org/D104351
2021-09-01 17:07:35 -07:00
Zahira Ammarguellat cec7c2b32e Revert "[CLANG][PATCH][FPEnv] Add support for option -ffp-eval-method and extend #pragma float_control similarly"
The intent of this patch is to add support of -fp-model=[source|double|extended] to allow
the compiler to use a wider type for intermediate floating point calculations. As a side
effect to that, the value of FLT_EVAL_METHOD is changed according to the pragma
float_control.
Unfortunately some issue was uncovered with this change in preprocessing. See details in
https://reviews.llvm.org/D93769 . We are therefore reverting this patch until we find a way
to reconcile the value of FLT_EVAL_METHOD, the pragma and the -E flow.

This reverts commit 66ddac22e2.
2021-09-01 04:48:50 -07:00
Reid Kleckner db3d029fbe Effectively revert 33c3d8a916 / D33782
This change would treat the token `or` in system headers as an
identifier, and elsewhere as an operator. As reported in
llvm.org/pr42427, many users classify their third party library headers
as "system" headers to suppress warnings. There's no clean way to
separate Windows SDK headers from user headers.

Clang is still able to parse old Windows SDK headers if C++ operator
names are disabled. Traditionally this was controlled by
`-fno-operator-names`, but is now also enabled with `/permissive` since
D103773. This change will prevent `clang-cl` from parsing <query.h> from
the Windows SDK out of the box, but there are multiple ways to work
around that:
- Pass `/clang:-fno-operator-names`
- Pass `/permissive`
- Pass `-DQUERY_H_RESTRICTION_PERMISSIVE`

In all of these modes, the operator names will consistently be available
or not available, instead of depending on whether the code is in a
system header.

I added a release note for this, since it may break straightforward
users of the Windows SDK.

Fixes PR42427

Differential Revision: https://reviews.llvm.org/D108720
2021-08-25 14:41:26 -07:00
Chris Bieneman 43de869d77 Implement #pragma clang restrict_expansion
This patch adds `#pragma clang restrict_expansion ` to enable flagging
macros as unsafe for header use. This is to allow macros that may have
ABI implications to be avoided in headers that have ABI stability
promises.

Using macros in headers (particularly public headers) can cause a
variety of issues relating to ABI and modules. This new pragma logs
warnings when using annotated macros outside the main source file.

This warning is added under a new diagnostics group -Wpedantic-macros

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D107095
2021-08-23 09:46:38 -07:00
Corentin Jabot bdeda959ab Make wide multi-character character literals ill-formed
This implements P2362, which has not yet been approved by the
C++ committee, but because wide-multi character literals are
implementation defined, clang might not have to wait for WG21.

This change is also being applied in C mode as the behavior is
implementation-defined in C as well and there's no benefit to
having different rules between the languages.

The other part of P2362, making non-representable character
literals ill-formed, is already implemented by clang
2021-08-20 11:10:53 -04:00
Corentin Jabot 2715c4da50 Do not emit diagnostics for invalid unicode characters in preprocessing mode
This amends 4e80636db7 with a fix for
https://lab.llvm.org/buildbot/#/builders/139/builds/8943
2021-08-18 09:12:36 -04:00
Corentin Jabot 4e80636db7 Implement P1949
This adds the Unicode 13 data for XID_Start and XID_Continue.
The definition of valid identifier is changed in all C++ modes
as P1949 (https://wg21.link/p1949) was accepted by WG21 as a defect
report.
2021-08-18 07:33:14 -04:00
Pavel Asyutchenko 7df405e079 Apply -fmacro-prefix-map to __builtin_FILE()
This matches the behavior of GCC.
Patch does not change remapping logic itself, so adding one simple smoke test should be enough.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D107393
2021-08-04 16:42:14 -07:00
Christopher Di Bella 0871954197 Revert "Revert "[clang][pp] adds '#pragma include_instead'""
Includes regression test for problem noted by @hans.
This reverts commit 973de71856.

Differential Revision: https://reviews.llvm.org/D106898
2021-07-29 19:21:43 +00:00
Chris Bieneman 26c695b789 Support macro deprecation #pragma clang deprecated
This patch adds `#pragma clang deprecated` to enable deprecation of
preprocessor macros.

The macro must be defined before `#pragma clang deprecated`. When
deprecating a macro a custom message may be optionally provided.

Warnings are emitted at the use site of a deprecated macro, and can be
controlled via the `-Wdeprecated` warning group.

This patch takes some rough inspiration and a few lines of code from
https://reviews.llvm.org/D67935.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D106732
2021-07-29 12:40:53 -05:00
Melanie Blower 66ddac22e2 [CLANG][PATCH][FPEnv] Add support for option -ffp-eval-method and extend #pragma float_control similarly
The Intel compiler ICC supports the option "-fp-model=(source|double|extended)"
which causes the compiler to use a wider type for intermediate floating point
calculations. Also supported is a way to embed this effect in the source
program with #pragma float_control(source|double|extended).
This patch extends pragma float_control syntax, and also adds support
for a new floating point option "-ffp-eval-method=(source|double|extended)".
source: intermediate results use source precision
double: intermediate results use double precision
extended: intermediate results use extended precision

Reviewed By: Aaron Ballman

Differential Revision: https://reviews.llvm.org/D93769
2021-07-28 10:50:32 -04:00
Hans Wennborg 973de71856 Revert "[clang][pp] adds '#pragma include_instead'"
> `#pragma clang include_instead(<header>)` is a pragma that can be used
> by system headers (and only system headers) to indicate to a tool that
> the file containing said pragma is an implementation-detail header and
> should not be directly included by user code.
>
> The library alternative is very messy code that can be seen in the first
> diff of D106124, and we'd rather avoid that with something more
> universal.
>
> This patch takes the first step by warning a user when they include a
> detail header in their code, and suggests alternative headers that the
> user should include instead. Future work will involve adding a fixit to
> automate the process, as well as cleaning up modules diagnostics to not
> suggest said detail headers. Other tools, such as clangd can also take
> advantage of this pragma to add the correct user headers.
>
> Differential Revision: https://reviews.llvm.org/D106394

This caused compiler crashes in Chromium builds involving PCH and an include
directive with macro expansion, when Token::getLiteralData() returned null. See
the code review for details.

This reverts commit e8a64e5491.
2021-07-27 17:29:48 +02:00
Christopher Di Bella e8a64e5491 [clang][pp] adds '#pragma include_instead'
`#pragma clang include_instead(<header>)` is a pragma that can be used
by system headers (and only system headers) to indicate to a tool that
the file containing said pragma is an implementation-detail header and
should not be directly included by user code.

The library alternative is very messy code that can be seen in the first
diff of D106124, and we'd rather avoid that with something more
universal.

This patch takes the first step by warning a user when they include a
detail header in their code, and suggests alternative headers that the
user should include instead. Future work will involve adding a fixit to
automate the process, as well as cleaning up modules diagnostics to not
suggest said detail headers. Other tools, such as clangd can also take
advantage of this pragma to add the correct user headers.

Differential Revision: https://reviews.llvm.org/D106394
2021-07-26 16:07:45 +00:00
Michael Kruse ae6b400002 [Preprocessor] Implement -fminimize-whitespace.
This patch adds the -fminimize-whitespace with the following effects:

 * If combined with -E, remove as much non-line-breaking whitespace as
   possible.

 * If combined with -E -P, removes as much whitespace as possible,
   including line-breaks.

The motivation is to reduce the amount of insignificant changes in the
preprocessed output with source files where only whitespace has been
changed (add/remove comments, clang-format, etc.) which is in particular
useful with ccache.

A patch for ccache for using this flag has been proposed to ccache as well:
https://github.com/ccache/ccache/pull/815, which will use
-fnormalize-whitespace when clang-13 has been detected, and additionally
uses -P in "unify_mode". ccache already had a unify_mode in an older
version which was removed because of problems that using the
preprocessor itself does not have (such that the custom tokenizer did
not recognize C++11 raw strings).

This patch slightly reorganizes which part is responsible for adding
newlines that are required for semantics. It is now either
startNewLineIfNeeded() or MoveToLine() but never both; this avoids the
ShouldUpdateCurrentLine workaround and avoids redundant lines being
inserted in some cases. It also fixes a mandatory newline not inserted
after a _Pragma("...") that is expanded into a #pragma.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D104601
2021-07-25 23:30:57 -05:00
Jan Svoboda aa245ddd46 [clang][lex] NFC: Add explicit cast to silence -Wsign-compare 2021-07-22 12:21:12 +02:00
Simon Tatham 21401a7262 [clang] Introduce SourceLocation::[U]IntTy typedefs.
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.

NFC: this patch introduces typedefs for the integer type used by
SourceLocation and makes all the boring changes to use the typedefs
everywhere, but for the moment, they are unconditionally defined to
uint32_t.

Patch originally by Mikhail Maltsev.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D105492
2021-07-21 10:45:46 +01:00
Melanie Blower d48ad358b1 Revert "[CLANG][PATCH][FPEnv] Add support for option -ffp-eval-method and extend #pragma float_control similarly"
This reverts commit ce8024e8ff.
There are a couple buildbot problems
2021-07-20 16:40:55 -04:00
Melanie Blower ce8024e8ff [CLANG][PATCH][FPEnv] Add support for option -ffp-eval-method and extend #pragma float_control similarly
The Intel compiler ICC supports the option "-fp-model=(source|double|extended)"
which causes the compiler to use a wider type for intermediate floating point
calculations. Also supported is a way to embed this effect in the source
program with #pragma float_control(source|double|extended).
This patch extends pragma float_control syntax, and also adds support
for a new floating point option "-ffp-eval-method=(source|double|extended)".
source: intermediate results use source precision
double: intermediate results use double precision
extended: intermediate results use extended precision

Reviewed By: Aaron Ballman

Differential Revision: https://reviews.llvm.org/D93769
2021-07-20 16:02:09 -04:00
Sam McCall fd22785054 [Lex] Consider a PCH header-guarded even with #endif truncated
This seems to be a more useful behavior for tools that use preambles.
I believe it doesn't affect real compiles: the PCH is only included once
when used, and recursive inclusion of the main-file *within* the PCH
isn't supported in any case.

Differential Revision: https://reviews.llvm.org/D106204
2021-07-20 14:25:36 +02:00
Yitzhak Mandelbaum 93dc73b1e0 [Lexer] Fix bug in `makeFileCharRange` called on split tokens.
When the end loc of the specified range is a split token, `makeFileCharRange`
does not process it correctly.  This patch adds proper support for split tokens.

Differential Revision: https://reviews.llvm.org/D105365
2021-07-14 14:36:31 +00:00
David Blaikie 1def2579e1 PR51018: Remove explicit conversions from SmallString to StringRef to future-proof against C++23
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
2021-07-08 13:37:57 -07:00
Saleem Abdulrasool 24f4c3ebef Lex: add a callback for `#pragma mark`
Allow a preprocessor observer to be notified of mark pragmas.  Although
this does not impact code generation in any way, it is useful for other
clients, such as clangd, to be able to identify any marked regions.

Reviewed By: dgoldman

Differential Revision: https://reviews.llvm.org/D105368
2021-07-02 15:44:01 -07:00
Martin Storsjö e5c7c171e5 [clang] Rename StringRef _lower() method calls to _insensitive()
This is mostly a mechanical change, but a testcase that contains
parts of the StringRef class (clang/test/Analysis/llvm-conventions.cpp)
isn't touched.
2021-06-25 00:22:01 +03:00
Hans Wennborg 24037c37b6 Add support for #pragma system_header with -fms-extensions
Clang already supports the pragma prefixed by "GCC" or "clang".

MSVC has more recently added support for the pragma, but without any prefix; see
https://devblogs.microsoft.com/cppblog/broken-warnings-theory/#external-headers

Differential revision: https://reviews.llvm.org/D104770
2021-06-23 13:26:03 +02:00
Simon Pilgrim 61cdaf66fe [ADT] Remove APInt/APSInt toString() std::string variants
<string> is currently the highest impact header in a clang+llvm build:

https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html

One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps.

This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code.

Differential Revision: https://reviews.llvm.org/D103888
2021-06-11 13:19:15 +01:00
Dmitry Polukhin 178ad93e3f [clang][clangd] Use reverse header map lookup in suggestPathToFileForDiagnostics
Summary:
suggestPathToFileForDiagnostics is actively used in clangd for converting
an absolute path to a header file to a header name as it should be spelled
in the sources. Current approach converts absolute path to relative path.
This diff implements missing logic that makes a reverse lookup from the
relative path to the key in the header map that should be used in the sources.

Prerequisite diff: https://reviews.llvm.org/D103229

Test Plan: check-clang

Reviewers: dexonsmith, bruno, rsmith

Subscribers: cfe-commits

Tasks:

Tags: #clang

Differential Revision: https://reviews.llvm.org/D103142
2021-06-03 01:37:55 -07:00
Aaron Ballman baa2b8d085 Fix a git apply that went bad somehow.
When applying the changes in 8edd3464af,
it seems that this bit got merged incorrectly and no test coverage
caught the issue. This fixes the diagnostic and adds a test.
2021-06-01 14:06:39 -04:00
Aaron Ballman ce276b7a64 Fix -Wswitch warning; NFC 2021-05-27 09:24:43 -04:00
Aaron Ballman 8edd3464af Add support for #elifdef and #elifndef
WG14 adopted N2645 and WG21 EWG has accepted P2334 in principle (still
subject to full EWG vote + CWG review + plenary vote), which add
support for #elifdef as shorthand for #elif defined and #elifndef as
shorthand for #elif !defined. This patch adds support for the new
preprocessor directives.
2021-05-27 08:57:47 -04:00
Richard Smith de6164ec4d PR50456: Properly handle multiple escaped newlines in a '*/'. 2021-05-24 16:21:03 -07:00
Melvin Fox b5b3843f8d [clang] Fix for "Bug 27113 - MSVC-compat __identifier implementation incomplete"
this patch fixes Bug 27113 by adding support for string literals to the
implementation of the MS extension __identifier.

Differential revision: https://reviews.llvm.org/D100252
2021-05-21 11:14:01 +02:00
Michael Spencer d3676d4b66 [clang][modules] Build inferred modules
This patch enables explicitly building inferred modules.

Effectively a cherry-pick of https://github.com/apple/llvm-project/pull/699 authored by @Bigcheese with libclang and dependency scanner changes omitted.

Contains the following changes:

1. [Clang] Fix the header paths in clang::Module for inferred modules.
  * The UmbrellaAsWritten and NameAsWritten fields in clang::Module are a lie for framework modules. For those they actually are the path to the header or umbrella relative to the clang::Module::Directory.
  * The exception to this case is for inferred modules. Here it actually is the name as written, because we print out the module and read it back in when implicitly building modules. This causes a problem when explicitly building an inferred module, as we skip the printing out step.
  * In order to fix this issue this patch adds a new field for the path we want to use in getInputBufferForModule. It also makes NameAsWritten actually be the name written in the module map file (or that would be, in the case of an inferred module).

2. [Clang] Allow explicitly building an inferred module.
  * Building the actual module still fails, but make sure it fails for the right reason.

Split from D100934.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102491
2021-05-17 10:40:51 +02:00
serge-sans-paille 6045cb89e5 Use an allow list on reserved macro identifiers
The allow list is based on various official sources (see in-code comment).

This fixes https://bugs.llvm.org/show_bug.cgi?id=50248

Differential Revision: https://reviews.llvm.org/D102168
2021-05-13 09:23:47 +02:00
serge-sans-paille 91a919e899 [NFC] Synchronize reserved identifier code between macro and variables / symbols
Differential Revision: https://reviews.llvm.org/D102164
2021-05-10 17:46:51 +02:00
Kadir Cetinkaya 761f3d1675
[clang][PreProcessor] Cutoff parsing after hitting completion point
This fixes a crash caused by Lexers being invalidated at code
completion points in
https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/PPLexerChange.cpp#L520.

Differential Revision: https://reviews.llvm.org/D102069
2021-05-10 11:24:27 +02:00
Anton Bikineev dc7ebd2cb0 [C++2b] Support size_t literals
This adds support for C++2b's z/uz suffixes for size_t literals (P0330).
2021-03-31 13:36:23 +00:00
Richard Smith 4259301aaf Support #__private_macro and #__public_macro in local submodule
visibility mode.
2021-03-23 16:54:28 -07:00
Richard Smith 4cd109891c Improve const-correctness. NFC. 2021-03-23 16:54:27 -07:00
Richard Smith 3775d811ff Improve module dumping for debugging.
* List inferred lists of imports in `#pragma clang __debug module_map`.

  * Add `#pragma clang __debug modules {all,visible,building}` to dump
    lists of known / visible module names or the building modules stack.
2021-03-22 19:07:46 -07:00
Sam McCall 128ce70eef [CodeCompletion] Avoid spurious signature help for init-list args
Somewhat surprisingly, signature help is emitted as a side-effect of
computing the expected type of a function argument.
The reason is that both actions require enumerating the possible
function signatures and running partial overload resolution, and doing
this twice would be wasteful and complicated.

Change #1: document this, it's subtle :-)

However, sometimes we need to compute the expected type without having
reached the code completion cursor yet - in particular to allow
completion of designators.
eb4ab3358c did this but introduced a
regression - it emits signature help in the wrong location as a side-effect.

Change #2: only emit signature help if the code completion cursor was reached.

Currently there is PP.isCodeCompletionReached(), but we can't use it
because it's set *after* running code completion.
It'd be nice to set this implicitly when the completion token is lexed,
but ConsumeCodeCompletionToken() makes this complicated.

Change #3: call cutOffParsing() *first* when seeing a completion token.

After this, the fact that the Sema::Produce*SignatureHelp() functions
are even more confusing, as they only sometimes do that.
I don't want to rename them in this patch as it's another large
mechanical change, but we should soon.

Change #4: prepare to rename ProduceSignatureHelp() to GuessArgumentType() etc.

Differential Revision: https://reviews.llvm.org/D98488
2021-03-16 12:46:40 +01:00
serge-sans-paille 9628cb1fee [NFC] Use higher level constructs to check for whitespace/newlines in the lexer
It turns out that according to valgrind and perf, it's also slightly faster.

Differential Revision: https://reviews.llvm.org/D98637
2021-03-15 18:27:19 +01:00
Jan Svoboda 23cc8ebf59 [clang][lex] Speculative fix for buffer overrun on raw string parse
This attempts to fix a (non-deterministic) buffer overrun when parsing raw string literals during modular build.

Similar fix to 4e5b5c36f4.

Reviewed By: beccadax

Differential Revision: https://reviews.llvm.org/D94950
2021-03-15 15:13:47 +01:00
Aaron Ballman e448310059 Add support for digit separators in C2x.
WG14 adopted N2626 at the meetings this week. This commit adds support
for using ' as a digit separator in a numeric literal which is
compatible with the C++ feature.
2021-03-12 07:21:03 -05:00
Aaron Ballman 8da090381d Improve static_assert/_Static_assert diagnostics
Our diagnostics relating to static assertions were a bit confused. For
instance, when in MS compatibility mode in C (where we accept
static_assert even without including <assert.h>), we would fail
to warn the user that they were using the wrong spelling (even in
pedantic mode), we were missing a compatibility warning about using
_Static_assert in earlier standards modes, diagnostics for the optional
message were not reflected in C as they were in C++, etc.
2021-03-03 08:48:27 -05:00
Duncan P. N. Exon Smith 64d8c7818d Revert "Module: Use FileEntryRef and DirectoryEntryRef in Umbrella, Header, and DirectoryName, NFC"
This (mostly) reverts 32c501dd88.  Hit a
case where this causes a behaviour change, perhaps the same root cause
that triggered the revert of a40db5502b in
7799ef7121.

(The API changes in DirectoryEntry.h have NOT been reverted as a number
of subsequent commits depend on those.)

https://reviews.llvm.org/D90497#2582166
2021-02-23 09:57:28 -08:00
Richard Smith 5dfa37a761 Don't allow __VA_OPT__ to be detected by #ifdef.
More study has discovered this to not actually be useful: because
current C++20 implementations reject `#ifdef __VA_OPT__`, this can't
really be used as a feature-test mechanism. And it's not too hard to
detect __VA_OPT__ without this, for example:

  #define THIRD_ARG(a, b, c, ...) c
  #define HAS_VA_OPT(...) THIRD_ARG(__VA_OPT__(,), 1, 0, )
  #if HAS_VA_OPT(?)

Partially reverts 0436ec2128.
2021-01-27 13:34:15 -08:00
Richard Smith 0436ec2128 Permit __VA_OPT__ in all language modes and allow it to be detected with #ifdef.
These changes are intended to give code a path to move away from the GNU
,##__VA_ARGS__ extension, which is non-conforming in some situations and
which we'd like to disable in our conforming mode in those cases.
2021-01-27 12:34:43 -08:00
Reid Kleckner 61a66e4b5e Revert "Suppress non-conforming GNU paste extension in all standard-conforming modes"
This reverts commit f4537935dc.
This reverts commit b43c26d036.

This GNU and MSVC extension turns out to be very popular. Most projects
are not using C++20, so cannot use the new __VA_OPT__ feature to be
standards conformant. The other workaround, using -std=gnu*, enables too
many language extensions and isn't viable.

Until there is a way for users to get the behavior provided by the
`, ## __VA_ARGS__` extension in the -std=c++17 and earlier language
modes, we need to revert this.
2021-01-27 10:59:57 -08:00
Harald van Dijk b43c26d036
Restore GNU , ## __VA_ARGS__ behavior in MSVC mode
As noted in D91913, MSVC implements the GNU behavior for
, ## __VA_ARGS__ as well. Do the same when `-fms-compatibility` is used.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D95392
2021-01-25 22:34:49 +00:00
Harald van Dijk f4537935dc
Suppress non-conforming GNU paste extension in all standard-conforming modes
The GNU token paste extension that removes the comma in , ## __VA_ARGS__
conflicts with C99/C++11's requirements when a variadic macro has no
named parameters: according to the standard, an invocation as FOO()
gives it a single empty argument, and concatenation of anything with an
empty argument is well-defined. For this reason, the GNU extension was
already disabled in C99 standard-conforming mode. It was not yet
disabled in C++11 standard-conforming mode.

The associated comment suggested that GCC keeps this extension enabled
in C90/C++03 standard-conforming mode, but it actually does not, so
rather than adding a check for C++ language version, this change simply
removes the check for C language version.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D91913
2021-01-25 00:56:45 +00:00
Timm Bäder ef3800e821 Return false from __has_declspec_attribute() if not explicitly enabled
Currently, projects can check for __has_declspec_attribute() and use
it accordingly, but the check for __has_declspec_attribute will return
true even if declspec attributes are not enabled for the target.

This changes Clang to instead return false when declspec attributes are
not supported for the target.
2021-01-12 13:20:08 -05:00
Nico Weber 7799ef7121 Revert "Lex: Migrate HeaderSearch::LoadedModuleMaps to FileEntryRef"
This reverts commit a40db5502b.
and follow-up d636b881bb

Somewhat speculative, likely broke check-clang on Windows:
https://reviews.llvm.org/D92975#2453482
2020-12-14 22:05:08 -05:00
Reid Kleckner d2ed9d6b7e Revert "ADT: Migrate users of AlignedCharArrayUnion to std::aligned_union_t, NFC"
We determined that the MSVC implementation of std::aligned* isn't suited
to our needs. It doesn't support 16 byte alignment or higher, and it
doesn't really guarantee 8 byte alignment. See
https://github.com/microsoft/STL/issues/1533

Also reverts "ADT: Change AlignedCharArrayUnion to an alias of std::aligned_union_t, NFC"

Also reverts "ADT: Remove AlignedCharArrayUnion, NFC" to bring back
AlignedCharArrayUnion.

This reverts commit 4d8bf870a8.

This reverts commit d10f9863a5.

This reverts commit 4b5dc150b9.
2020-12-14 17:04:06 -08:00
Duncan P. N. Exon Smith a40db5502b Lex: Migrate HeaderSearch::LoadedModuleMaps to FileEntryRef
Migrate `HeaderSearch::LoadedModuleMaps` and a number of APIs over to
`FileEntryRef`. This should have no functionality change. Note that two
`FileEntryRef`s hash the same if they point at the same `FileEntry`.

Differential Revision: https://reviews.llvm.org/D92975
2020-12-14 14:35:11 -08:00