Fixes https://github.com/llvm/llvm-project/issues/31592.
This commit enables lexing of digraphs in C++11 and onwards.
Enabling them in C++03 is error-prone, as it would unconditionally treat sequences like "<:" as digraphs, even if they are followed by a single colon, e.g. "<::" would be treated as "[:" instead of "<" followed by "::". Lexing in C++11 doesn't have this problem, as it looks ahead at the following token.
The relevant excerpt from Lexer::LexTokenInternal:
```
// C++0x [lex.pptoken]p3:
// Otherwise, if the next three characters are <:: and the subsequent
// character is neither : nor >, the < is treated as a preprocessor
// token by itself and not as the first character of the alternative
// token <:.
```
Also, note that both clang and gcc turn on digraphs by default (-fdigraphs), so clang-format should match this behaviour.
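As an illustration only, the C++11 lookahead rule quoted above can be sketched as a standalone function. `Tok` and `lexLess` are hypothetical names for this sketch and are not clang's actual lexer code:

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical token kinds for this sketch; not clang's real tok:: enum.
enum class Tok { Less, LSquare };

// Decide how to lex the '<' at Src[Pos] under the C++11 rule
// ([lex.pptoken]p3): if the next three characters are "<::" and the
// character after them is neither ':' nor '>', the '<' stays a token by
// itself. Otherwise "<:" is the alternative token for '['.
Tok lexLess(std::string_view Src, std::size_t Pos) {
  assert(Pos < Src.size() && Src[Pos] == '<');
  // Out-of-range reads yield '\0', which never matches ':' or '>'.
  auto At = [&](std::size_t I) { return I < Src.size() ? Src[I] : '\0'; };
  if (At(Pos + 1) != ':')
    return Tok::Less; // plain '<'
  if (At(Pos + 2) == ':' && At(Pos + 3) != ':' && At(Pos + 3) != '>')
    return Tok::Less; // "<::x" is '<' followed by "::"
  return Tok::LSquare; // "<:" digraph
}
```

With this rule, input such as `<::std::string>` keeps `<` as its own token, while `<:0:>` still lexes as `[0]`.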
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D118706
Make specializations of `DataflowAnalysis` extendable with domain-specific
logic for comparing distinct values when comparing environments.
This includes a breaking change to the `runDataflowAnalysis` interface
as the return type is now `llvm::Expected<...>`.
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D118596
Fixes https://github.com/llvm/llvm-project/issues/52772.
This patch fixes the formatting of the code:
```
auto aaaaaaaaaaaaaaaaaaaaa = {};
auto b = g([] {
return;
});
```
which should be left as is, but before this patch was formatted to:
```
auto aaaaaaaaaaaaaaaaaaaaa = {};
auto b = g([] {
return;
});
```
Reviewed By: MyDeveloperDay, HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D115972
Fixes https://github.com/llvm/llvm-project/issues/34626.
Before, the include sorter would break the code:
```
#include <stdio.h>
#include <stdint.h> /* long
comment */
```
and change it into:
```
#include <stdint.h> /* long
#include <stdio.h>
comment */
```
This commit handles only the most basic case of a single block comment on an include line, but does not try to handle all the possible edge cases with multiple comments.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D118627
Normally there are heuristics in the lexer to treat `//*` specially in
language modes that don't have line comments (to emit `/`). Unfortunately this
only applied to the first occurrence of a line comment inside the file, as the
subsequent line comments were treated as if the language had support for them.
This only holds in normal lexing mode, as in raw mode all
occurrences of line comments received this treatment, which created discrepancies
when comparing expanded and spelled tokens.
The proper fix would be to just make sure we treat all the line comments with a
subsequent `*` the same way, but it would imply breaking some code that's
accepted by clang today. So instead we introduce the same bug into raw lexing
mode.
Fixes https://github.com/clangd/clangd/issues/1003.
Differential Revision: https://reviews.llvm.org/D118471
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D118480
Fixes https://github.com/llvm/llvm-project/issues/53441.
Expected code:
```
/**/ //
int a; //
```
was previously misformatted to:
```
/**/ //
int a; //
```
The cause was that the "remaining length" (after the starting `/*`) of an empty block comment `/**/` was computed to be 0 instead of 2.
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D118475
Users can define aliases for long symbols using import aliases:
import X = A.B.C;
Previously, these were unhandled and would terminate import sorting.
With this change, aliases sort as their own group, coming last after all
other imports.
Aliases are not sorted within their group, as they may reference each
other, so order is significant.
This reverts commit f750c3d95a. It fixes
the msan issue by not parsing past the end of the line when handling
import aliases.
Differential Revision: https://reviews.llvm.org/D118446
Fixes https://github.com/llvm/llvm-project/issues/53430.
Initially, I had a quick and dirty approach, but it led to a myriad of special cases handling comments (that may add unwrapped lines).
So I added TT_RecordLBrace type annotations and it seems like a much nicer solution.
I think that in the future it will allow us to clean up some convoluted code that detects records.
Reviewed By: MyDeveloperDay, HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D118337
Users can define aliases for long symbols using import aliases:
import X = A.B.C;
Previously, these were unhandled and would terminate import sorting.
With this change, aliases sort as their own group, coming last after all
other imports.
Aliases are not sorted within their group, as they may reference each
other, so order is significant.
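A minimal sketch of the grouping described above, assuming a simplified model of an import statement. `ImportSketch` and `sortImportsSketch` are hypothetical names; clang-format's real import sorter uses a richer representation and ordering:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one import statement; clang-format's real
// representation differs.
struct ImportSketch {
  std::string Text;
  bool IsAlias; // true for `import X = A.B.C;`
};

void sortImportsSketch(std::vector<ImportSketch> &Imports) {
  // Move aliases behind all other imports. stable_partition preserves the
  // relative order inside each group, so aliases keep their source order.
  auto FirstAlias = std::stable_partition(
      Imports.begin(), Imports.end(),
      [](const ImportSketch &I) { return !I.IsAlias; });
  // Sort only the regular imports; the alias group is left untouched
  // because aliases may reference each other.
  std::sort(Imports.begin(), FirstAlias,
            [](const ImportSketch &A, const ImportSketch &B) {
              return A.Text < B.Text;
            });
}
```

Sorting by the raw statement text is an assumption made to keep the sketch short.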
Differential Revision: https://reviews.llvm.org/D118361
These built-in functions build the (sophisticated) model of the code's
memory. This model isn't used by all analyses, so we provide for disabling it to
avoid incurring the costs associated with its construction.
Differential Revision: https://reviews.llvm.org/D118178
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D118236
Underscore-uglified identifiers are used in standard library implementations to
guard against collisions with macros, and they hurt readability considerably.
(Consider `push_back(Tp_ &&__value)` vs `push_back(Tp value)`.)
When we're describing an interface, the exact names of parameters are not
critical so we can drop these prefixes.
This patch adds a new PrintingPolicy flag that applies this stripping
when recursively printing pieces of AST.
We set it in code completion/signature help, and in clangd's hover display.
All three features also do a bit of manual poking at names, so fix up those too.
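For illustration, the stripping could look roughly like the sketch below. `deuglifySketch` is a hypothetical helper, and the rule it applies (names starting with `__`, or with `_` followed by an uppercase letter, lose their leading underscores) is an assumption based on the reserved-identifier convention, not a copy of the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <string_view>

// Hypothetical helper: drop the leading underscores of a reserved-style
// identifier, and leave ordinary names alone.
std::string_view deuglifySketch(std::string_view Name) {
  const bool Reserved =
      Name.size() >= 2 && Name[0] == '_' &&
      (Name[1] == '_' || (Name[1] >= 'A' && Name[1] <= 'Z'));
  if (!Reserved)
    return Name;
  // find_first_not_of returns npos for an all-underscore name; clamp it so
  // that such a name becomes empty instead of triggering undefined behavior.
  Name.remove_prefix(std::min(Name.find_first_not_of('_'), Name.size()));
  return Name;
}
```

Under these assumptions `__value` prints as `value` and `_Tp` as `Tp`, while a name like `_x` is left unchanged.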
Fixes https://github.com/clangd/clangd/issues/736
Differential Revision: https://reviews.llvm.org/D116387
Make specializations of `DataflowAnalysis` extendable with domain-specific
logic for merging distinct values when joining environments. This could be
a strict lattice join or a more general widening operation.
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D118038
This patch ensures that the dataflow analysis framework does not crash
when it encounters access to members of union types.
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D118226
This patch adds a `buildAccess` function, which constructs a string with the
proper operator to use based on the expression's form and type. It also adds two
predicates related to smart pointers, which are needed by `buildAccess` but are
also of general value.
We deprecate `buildDot` and `buildArrow` in favor of the more general
`buildAccess`. These will be removed in a future patch.
Differential Revision: https://reviews.llvm.org/D116377
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D118119
- Fixes https://github.com/llvm/llvm-project/issues/53227 that wrongly
indents multiline comments
- Fixes wrong detection of single-line opening braces when used along
with those only opening scopes, causing crashes due to duplicated
replacements on the same token:
void foo()
{
{
int x;
}
}
- Fixes wrong recognition of the first line of a definition when the line
starts with a block comment, causing crashes due to duplicated
replacements on the same token, since this leads to skipping the line
starting with the inline block comment:
/*
Some descriptions about function
*/
/*inline*/ void bar() {
}
- Fixes wrong recognition of enum when used as a type name rather than
starting a definition block, causing crashes due to duplicated
replacements on the same token, since the actions for both enum and
definition blocks took place:
void foobar(const enum EnumType e) {
}
- Change to use function keyword for JavaScript instead of comparing
strings
- Resolves formatting conflict with options EmptyLineAfterAccessModifier
and EmptyLineBeforeAccessModifier (prompts with --dry-run (-n) or
--output-replacement-xml but no observable change)
- Recognize a long (len>=5) uppercased name taking a single line as a return
type and fix the problem of adding a newline below it, by adding a new
token type FunctionLikeOrFreestandingMacro and marking tokens in
UnwrappedLineParser:
void
afunc(int x) {
return;
}
TYPENAME
func(int x, int y) {
// ...
}
- Remove redundant and repeated initialization
- Do not change newlines before EOF
Reviewed By: MyDeveloperDay, curdeius, HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D117520
The minimizing filesystem used by the dependency scanner isn't great when it comes to the consistency of its caches. There are two problems that can be exposed by a filesystem that changes during dependency scan:
1. In-memory cache entries for original and minimized files are distinct, populated at different times using separate stat/open syscalls. This means that when a file is read with minimization disabled, its contents might be inconsistent when the same file is read with minimization enabled at a later point (and vice versa).
2. In-memory cache entries are indexed by filename. This is problematic for symlinks, where the contents of the symlink might be inconsistent with contents of the original file (for the same reason as in problem 1).
This patch ensures consistency by always stating/reading a file exactly once. The original contents are always cached and minimized contents are derived from that on demand. The cache entries are now indexed by their `UniqueID` ensuring consistency for symlinks too. Moreover, the stat/read syscalls are now issued outside of critical section.
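The caching scheme can be sketched as follows. Everything here is a simplified stand-in: `FileID` models a unique ID as a (device, inode) pair, `minimizeSketch` is a placeholder for the real minimizer, and `ContentCacheSketch` omits locking entirely:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <sstream>
#include <string>
#include <utility>

// Stand-in for a filesystem unique ID (device, inode); not LLVM's type.
using FileID = std::pair<std::uint64_t, std::uint64_t>;

// Placeholder "minimizer": keeps only lines that start with '#'.
// The real minimizer is more involved; this is for illustration only.
inline std::string minimizeSketch(const std::string &Source) {
  std::istringstream In(Source);
  std::string Line, Out;
  while (std::getline(In, Line))
    if (!Line.empty() && Line[0] == '#')
      Out += Line + "\n";
  return Out;
}

class ContentCacheSketch {
public:
  // Reader is invoked at most once per FileID.
  explicit ContentCacheSketch(std::function<std::string(FileID)> Reader)
      : Reader(std::move(Reader)) {}

  const std::string &original(FileID ID) { return entry(ID).Original; }

  // Minimized contents are derived on demand from the cached original.
  const std::string &minimized(FileID ID) {
    Entry &E = entry(ID);
    if (!E.Minimized)
      E.Minimized = minimizeSketch(E.Original);
    return *E.Minimized;
  }

private:
  struct Entry {
    std::string Original;
    std::optional<std::string> Minimized;
  };

  // Look up the entry for ID, reading the file only on the first request.
  Entry &entry(FileID ID) {
    auto It = Entries.find(ID);
    if (It == Entries.end())
      It = Entries.emplace(ID, Entry{Reader(ID), std::nullopt}).first;
    return It->second;
  }

  std::function<std::string(FileID)> Reader;
  std::map<FileID, Entry> Entries;
};
```

Because entries are keyed by ID rather than by filename, a symlink and its target share one entry, and both views of the file come from a single read.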
Depends on D115935.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D114966
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D117754
Fixes https://github.com/llvm/llvm-project/issues/44601.
This patch fixes a bug when parsing the example code below:
```
template <class> class S;
template <class T> bool operator<(S<T> const &x, S<T> const &y) {
return x.i < y.i;
}
template <class T> class S {
int i = 42;
friend bool operator< <>(S const &, S const &);
};
int main() { return S<int>{} < S<int>{}; }
```
which parsed `< <>` as `<< >` instead of `< <>` in terms of tokens, as discussed on Discord.
1. Add a condition in `tryMergeLessLess()` considering `operator` keyword and `>`
2. Force to leave a whitespace between `tok::less` and a template opener
3. Add unit test
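A rough sketch of the condition from step 1, using plain strings as tokens. `mayMergeLessLess` is a hypothetical function; the real `tryMergeLessLess()` works on clang-format tokens and performs further checks (such as whitespace between the tokens) that are omitted here:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if the adjacent "<" tokens at Toks[I] and Toks[I + 1] may be
// merged into a single "<<" token. The merge is refused for
// `operator< <>`, i.e. when the pair follows `operator` and precedes ">".
bool mayMergeLessLess(const std::vector<std::string> &Toks, std::size_t I) {
  if (I + 1 >= Toks.size() || Toks[I] != "<" || Toks[I + 1] != "<")
    return false;
  const bool AfterOperator = I > 0 && Toks[I - 1] == "operator";
  const bool BeforeGreater = I + 2 < Toks.size() && Toks[I + 2] == ">";
  return !(AfterOperator && BeforeGreater);
}
```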
Reviewed By: MyDeveloperDay, curdeius
Differential Revision: https://reviews.llvm.org/D117398
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D117667
Some tests were skipped in D114454 to resolve test failures on some
platforms, where the pointers have different bitwidth than expected.
This patch re-enables these tests, by relaxing the requirements on the
types of the SVal.
The issue:
There is no way to reconstruct the type of the `SVal` perfectly
accurately, since there could be multiple types having the required
bitwidth and signedness.
Consider platforms where `int` and `long` have the same bitwidth.
Additionally, we need to be careful about casting a pointer to an
integral representation, because we don't know what smallest integral
type can represent that.
To work around these issues, I propose enforcing a type that has the
same signedness and bitwidth as the expected type, instead of perfect
equality.
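As an analogy on host types (the tests themselves compare Clang types, not host types), the relaxed check amounts to something like the following. `sameWidthAndSign` is a hypothetical name used only in this sketch:

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// True if Actual and Expected have the same bit width and signedness,
// even when they are distinct types (e.g. int vs long on some platforms).
template <typename Actual, typename Expected>
constexpr bool sameWidthAndSign() {
  static_assert(std::is_integral_v<Actual> && std::is_integral_v<Expected>,
                "this sketch only handles integral types");
  return sizeof(Actual) * CHAR_BIT == sizeof(Expected) * CHAR_BIT &&
         std::is_signed_v<Actual> == std::is_signed_v<Expected>;
}
```

On a platform where `int` and `long` have the same width, `sameWidthAndSign<long, int>()` is true even though the two types are not identical, which is exactly the case where perfect type equality is too strict.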
In the `GetLocAsIntType` test, in case of pointer-to-integral casts
I'm using the widest standard integral type (long long) to make sure
that the pointer can be represented by the type without losing
precision. This won't affect the test in any meaningful way, since the
type of the `lvalue` remained the same.
In one case, I had to replace `getUIntPtrType()` with `UnsignedLongTy`
because on some platforms `getUIntPtrType()` is different than `long
int`.
In this patch, I also enforce that the tests must compile without
errors, to prevent narrowing conversions in the future.
Reviewed By: stevewan
Differential Revision: https://reviews.llvm.org/D115349
E.g. `Concept auto Func();`
The nameLoc for the constrained auto type loc pointed to the concept name
loc; it should be the auto token loc. This patch fixes it and removes
a relevant hack in the clang-tidy check.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D117009
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D117567
The `{HeaderSearch,Preprocessor}::LookupFile()` functions take an out-parameter `const DirectoryLookup *&`. Most callers end up creating a `const DirectoryLookup *` variable that's otherwise unused.
This patch changes the out-parameter from reference to a pointer, making it possible to simply pass `nullptr` to the function without the ceremony.
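The effect on call sites can be illustrated with a simplified stand-in. `DirSketch` and `lookupFileSketch` are hypothetical and far smaller than the real `LookupFile()` signatures:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Stand-in for clang's DirectoryLookup; only what the sketch needs.
struct DirSketch {
  std::string Path;
  std::vector<std::string> Files;
};

// Hypothetical lookup: returns true if Name is found in SearchDirs.
// CurDir is an optional out-parameter; pass nullptr when the directory
// that contained the file is not needed.
bool lookupFileSketch(const std::string &Name,
                      const std::vector<DirSketch> &SearchDirs,
                      const DirSketch **CurDir) {
  for (const DirSketch &Dir : SearchDirs) {
    if (std::find(Dir.Files.begin(), Dir.Files.end(), Name) !=
        Dir.Files.end()) {
      if (CurDir)
        *CurDir = &Dir;
      return true;
    }
  }
  if (CurDir)
    *CurDir = nullptr;
  return false;
}
```

A caller that does not care about the directory writes `lookupFileSketch(Name, Dirs, nullptr)` instead of declaring an otherwise unused variable.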
Reviewed By: ahoppen
Differential Revision: https://reviews.llvm.org/D117312
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D117496
Users outside of the clang repo may use different googletest versions. So, it's
better not to depend on llvm's googletest. This patch removes the dependency by
having `checkDataflow` return an `llvm::Error` instead of calling googletest's
`FAIL` or `ASSERT...` macros.
Differential Revision: https://reviews.llvm.org/D117304
The patch was reverted because it caused a crash during PCH build: we
failed to update the RParenLoc in TreeTransform<Derived>::TransformAutoType.
This relands 55d96ac and 37ec65e with a test and fix.
This style is similar to AlwaysBreak, but places closing brackets on new lines.
For example, if you have a multiline parameter list, clang-format currently only supports breaking per-parameter, but places the closing bracket on the line of the last parameter.
Function(
param1,
param2,
param3);
A style supported by other code styling tools (e.g. rustfmt) is to allow the closing brackets to be placed on their own line, aiding the user in being able to quickly infer the bounds of the block of code.
Function(
param1,
param2,
param3
);
For prior work on a similar feature, see: https://reviews.llvm.org/D33029.
Note: This currently only supports block indentation for closing parentheses.
Differential Revision: https://reviews.llvm.org/D109557
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Differential Revision: https://reviews.llvm.org/D117339