Commit Graph

154 Commits

Author SHA1 Message Date
Haojian Wu 263dcf452f [syntax] Introduce a TokenManager interface.
TokenManager defines Token interfaces for the clang syntax-tree. This is the level
of abstraction that the syntax-tree should use to operate on Tokens.

It decouples the syntax-tree from a particular token implementation (TokenBuffer
previously).  This enables us to use a different underlying token implementation
for the syntax Leaf node -- in clang pseudoparser, we want to produce a
syntax-tree with its own pseudo::Token rather than syntax::Token.

Differential Revision: https://reviews.llvm.org/D128411
2022-07-15 10:30:37 +02:00
Sam McCall 499d0b96cb [clang] createInvocationFromCommandLine -> createInvocation, delete former. NFC
(Followup from 40c13720a4)

Differential Revision: https://reviews.llvm.org/D125012
2022-05-06 16:21:48 +02:00
Sam McCall 4cec789c17 [Testing] Drop clangTesting from clang's public library interface
This was probably not particularly intended to be public, and disallows deps
on gtest which are useful in test helpers.

https://discourse.llvm.org/t/stop-exporting-clangtesting-library/61672

Differential Revision: https://reviews.llvm.org/D123610
2022-04-20 13:28:44 +02:00
Sam McCall 89cd86bbc5 Reapply [pseudo] Move pseudoparser from clang to clang-tools-extra"
This reverts commit 049f4e4eab.

The problem was a stray dependency in CLANG_TEST_DEPS which caused cmake
to fail if clang-pseudo wasn't built. This is now removed.
2022-03-16 01:10:55 +01:00
Sam McCall 049f4e4eab Revert "[pseudo] Move pseudoparser from clang to clang-tools-extra"
This reverts commit b97856c4cf.

Breaks a bunch of bots:
https://lab.llvm.org/buildbot/#/builders/193/builds/8513
2022-03-16 01:06:24 +01:00
Sam McCall b97856c4cf [pseudo] Move pseudoparser from clang to clang-tools-extra
This should make clearer that:
 - it's not part of clang proper
 - there's no expectation to update it along with clang (beyond green tests)
 - clang should not depend on it

This is intended to be expose a library, so unlike other tools has a split
between include/ and lib/.

The main renames are:
  clang/lib/Tooling/Syntax/Pseudo/*           => clang-tools-extra/pseudo/lib/*
  clang/include/clang/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/include/clang-pseudo/*
  clang/tools/clang/pseudo/*                  => clang-tools-extra/pseudo/tool/*
  clang/test/Syntax/*                         => clang-tools-extra/pseudo/test/*
  clang/unittests/Tooling/Syntax/Pseudo/*     => clang-tools-extra/pseudo/unittests/*
  #include "clang/Tooling/Syntax/Pseudo/*"    => #include "clang-pseudo/*"
  namespace clang::syntax::pseudo             => namespace clang::pseudo
  check-clang                                 => check-clang-pseudo
  clangToolingSyntaxPseudo                    => clangPseudo
The clang-pseudo and ClangPseudoTests binaries are not renamed.

See discussion around:
https://discourse.llvm.org/t/rfc-a-c-pseudo-parser-for-tooling/59217/50

Differential Revision: https://reviews.llvm.org/D121233
2022-03-16 00:14:11 +01:00
Haojian Wu 2d01ac18df [pseudo] Strip comments for TokenStream.
Add a utility function to strip comments from a "raw" tokenstream. The
derived stream will be fed to the GLR parser (for early testing).

Differential Revision: https://reviews.llvm.org/D121092
2022-03-07 20:24:37 +01:00
Sam McCall 54d6b5b67f [pseudo] Rename {Preprocess,PPStructure} -> DirectiveMap. NFC
More precisely describes what this file does.
Per comments on https://reviews.llvm.org/D121092
2022-03-07 17:41:35 +01:00
Sam McCall 7c1ee5e95f [Pseudo] Token/TokenStream, PP directive parser.
The TokenStream class is the representation of the source code that will
be fed into the GLR parser.

This patch allows a "raw" TokenStream to be built by reading source code.
It also supports scanning a TokenStream to find the directive structure.

Next steps (with placeholders in the code): heuristically choosing a
path through #ifs, preprocessing the code by stripping directives and comments.
These will produce a suitable stream to feed into the parser proper.

Differential Revision: https://reviews.llvm.org/D119162
2022-02-23 17:52:02 +01:00
Haojian Wu a2fab82f33 [pseudo] Implement LRTable.
This patch introduces a dense implementation of the LR parsing table, which is
used by LR parsers.

We build a SLR(1) parsing table from the LR(0) graph.

Statistics of the LR parsing table on the C++ spec grammar:
  - number of states: 1449
  - number of actions: 83069
  - size of the table (bytes): 334928

Differential Revision: https://reviews.llvm.org/D118196
2022-02-23 09:21:34 +01:00
Haojian Wu f1984b1433 [pseudo] Implement LRGraph
LRGraph is the key component of the clang pseudo parser, it is a
deterministic handle-finding finite-state machine, which is used to
generated the LR parsing table.

Separate from https://reviews.llvm.org/D118196.

Differential Revision: https://reviews.llvm.org/D119172
2022-02-09 11:20:07 +01:00
Haojian Wu fe932a88e9 [pseudo] Add first and follow set computation in Grammar.
These will be used when building parsing table for LR parsers.

Separate from https://reviews.llvm.org/D118196.

Differential Revision: https://reviews.llvm.org/D118990
2022-02-09 09:16:27 +01:00
Haojian Wu b94f09524e [pseudo] NFC, clangSyntaxPsuedo => clangToolingSyntaxPseudo
To be consistent with existing name pattern.
2022-02-04 09:57:20 +01:00
Haojian Wu 2189960e65 [pseudo] Rename Tests.cpp => Test.cpp
To be consistent with other files in clang unittest directory.
2022-02-04 09:48:14 +01:00
Haojian Wu 20e05b9f0e [syntax][pseudo] Add Grammar for the clang pseudo-parser
This patch introduces the Grammar class, which is a critial piece for constructing
a tabled-based parser.

As the first patch, the scope is limited to:
  - define base types (symbol, rules) of modeling the grammar
  - construct Grammar by parsing the BNF file (annotations are excluded for now)

Differential Revision: https://reviews.llvm.org/D114790
2022-02-03 11:28:27 +01:00
Jim Lin ad39b5bc59 [NFC] Remove duplicate include 2022-01-27 13:56:13 +08:00
Logan Smith 08eb614e30 [NFC][testing] Return underlying strings directly instead of OS.str()
This avoids an unnecessary copy required by 'return OS.str()', allowing
instead for NRVO or implicit move. The .str() call (which flushes the
stream) is no longer required since 65b13610a5,
which made raw_string_ostream unbuffered by default.

Differential Revision: https://reviews.llvm.org/D115374
2021-12-09 16:05:46 -08:00
Zarko Todorovski c79345fb7b [NFC][Clang][test] Inclusive language: Remove and rephrase uses of sanity test/check in clang/test
Part of work to use more inclusive terms in clang/llvm.
2021-11-24 14:03:49 -05:00
Benjamin Kramer d4d80a2903 Bump googletest to 1.10.0 2021-05-14 19:16:31 +02:00
Utkarsh Saxena aa979084df [clang][Syntax] Optimize expandedTokens for token ranges.
`expandedTokens(SourceRange)` used to do a binary search to get the
expanded tokens belonging to a source range. Each binary search uses
`isBeforeInTranslationUnit` to order two source locations. This is
inherently very slow.
By profiling clangd we found out that users like clangd::SelectionTree
spend 95% of time in `isBeforeInTranslationUnit`. Also it is worth
noting that users of `expandedTokens(SourceRange)` majorly use ranges
provided by AST to query this funciton. The ranges provided by AST are
token ranges (starting at the beginning of a token and ending at the
beginning of another token).

Therefore we can avoid the binary search in majority of the cases by
maintaining an index of ExpandedToken by their SourceLocations. We still
do binary search for ranges which are not token ranges but such
instances are quite low.

Performance:
`~/build/bin/clangd --check=clang/lib/Serialization/ASTReader.cpp`
Before: Took 2:10s to complete.
Now: Took 1:13s to complete.

Differential Revision: https://reviews.llvm.org/D99086
2021-03-25 18:54:15 +01:00
Kadir Cetinkaya f43ff34ae6
[clang] Mark re-injected tokens appropriately during pragma handling
This hides such tokens from TokenWatcher, preventing crashes in clients
trying to match spelled and expanded tokens.

Fixes https://github.com/clangd/clangd/issues/712

Differential Revision: https://reviews.llvm.org/D98483
2021-03-12 18:18:17 +01:00
Haojian Wu 780ead41e0 [Syntax] No crash on OpaqueValueExpr.
OpaqueValueExpr doesn't correspond to the concrete syntax, it has
invalid source location, ignore them.

Reviewed By: kbobyrev

Differential Revision: https://reviews.llvm.org/D96112
2021-02-18 10:32:04 +01:00
Haojian Wu e159a3ced4 [Syntax] Remove a strict valid source location assertion for TypeLoc.
The EndLoc of a type loc can be invalid for broken code.

Also extend the existing test to support error code with `error-ok`
annotation.

Differential Revision: https://reviews.llvm.org/D96261
2021-02-11 09:53:52 +01:00
Haojian Wu 35a5e88390 [Syntax] NFC, Simplify a test with annotations 2021-02-11 09:49:06 +01:00
Haojian Wu 6c1a23303d [Syntax] Support condition for IfStmt.
Differential Revision: https://reviews.llvm.org/D95782
2021-02-04 09:15:30 +01:00
Arthur O'Dwyer e181a6aedd s/instantate/instantiate/ throughout. NFCI.
The static_assert in "libcxx/include/memory" was the main offender here,
but then I figured I might as well `git grep -i instantat` and fix all
the instances I found. One was in user-facing HTML documentation;
the rest were in comments or tests.
2020-12-01 22:13:40 -05:00
Kirill Bobyrev 142c6f82fd
[clang] Simplify buildSyntaxTree API
Follow-up on https://reviews.llvm.org/D88553#inline-837013

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D90672
2020-11-09 22:49:54 +01:00
Sam McCall d4934eb5f8 [Syntax] Add iterators over children of syntax trees.
This gives us slightly nicer syntax (foreach) for idioms currently expressed
as a loop, and the option to use range algorithms where it makes sense
(e.g. llvm::all_of et al encapsulate the needed flow control in a useful way).

It's also a building block for iteration over filtered views (e.g. iterate over
all Stmt children, with the right type):
for (const Statement &S : filter<Statement>(N.children()))
  ...

I realize the recent direction has been mostly towards strongly-typed
node-specific facilities, but I think it's important we have convenient
generic facilities too.

Differential Revision: https://reviews.llvm.org/D90023
2020-10-28 12:37:57 +01:00
Eduardo Caldas 5011d43108 Migrate Declarators to use the List API
After this change all nodes that have a delimited-list are using the
`List` API.

Implementation details:
Let's look at a declaration with multiple declarators:
`int a, b;`
To generate a declarator list node we need to have the range of
declarators: `a, b`:
However, the `ClangAST` actually stores them as separate declarations:
`int a   ;`
`int    b;`
We solve that by appropriately marking the declarators on each separate
declaration in the `ClangAST` and then for the final declarator `int
b`, shrinking its range to fit to the already marked declarators.

Differential Revision: https://reviews.llvm.org/D88403
2020-10-01 13:56:31 +00:00
Eduardo Caldas c3c08bfdfd [SyntaxTree] Test the List API
Differential Revision: https://reviews.llvm.org/D87839
2020-09-22 17:07:41 +00:00
Eduardo Caldas 6dc06fa09d [SyntaxTree] Add tests for the assignment of the `canModify` tag.
Differential Revision: https://reviews.llvm.org/D88077
2020-09-22 13:17:33 +00:00
Eduardo Caldas 66bcb14312 [SyntaxTree][Synthesis] Fix: `deepCopy` -> `deepCopyExpandingMacros`.
There can be Macros that are tagged with `modifiable`. Thus verifying
`canModifyAllDescendants` is not sufficient to avoid macros when deep
copying.

We think the `TokenBuffer` could inform us whether a `Token` comes from
a macro. We'll look into that when we can surface this information
easily, for instance in unit tests for `ComputeReplacements`.

Differential Revision: https://reviews.llvm.org/D88034
2020-09-22 09:15:21 +00:00
Eduardo Caldas af582c9b0f [SyntaxTree] Test `findFirstLeaf` and `findLastLeaf`
* Introduce `TreeTest.cpp` to unit test `Tree.h`
* Add `generateAllTreesWithShape` to generating test cases
* Add tests for `findFirstLeaf` and `findLastLeaf`
* Fix implementations of `findFirstLeaf` and `findLastLeaf` that had
been broken when empty `Tree` were present.

Differential Revision: https://reviews.llvm.org/D87779
2020-09-22 06:47:36 +00:00
Eduardo Caldas 4a5cc389c5 [SyntaxTree][Synthesis] Implement `deepCopy`
Differential Revision: https://reviews.llvm.org/D87749
2020-09-21 09:27:15 +00:00
Eduardo Caldas e616a42598 [SyntaxTree] Test for '\' inside token.
Differential Revision: https://reviews.llvm.org/D87895
2020-09-21 06:56:14 +00:00
Eduardo Caldas bb5b28f12f [SyntaxTree][Synthesis] Improve testing `createLeaf`
The new test shows that `createLeaf` depends on the C++ version.

Differential Revision: https://reviews.llvm.org/D87896
2020-09-21 06:11:46 +00:00
Eduardo Caldas 7c37b82f5b [SyntaxTree][Synthesis] Add support for Tree.
In a future patch
* Implement helper function to generate Trees for tests
* and test Tree methods, namely `findFirstLeaf` and `findLastLeaf`

Differential Revision: https://reviews.llvm.org/D87533
2020-09-11 20:37:23 +00:00
Eduardo Caldas 5d152127d4 [SyntaxTree][Synthesis] Add support for simple Leafs and test based on tree dump
Differential Revision: https://reviews.llvm.org/D87495
2020-09-11 18:22:00 +00:00
Eduardo Caldas 4c14ee61b7 [SyntaxTree] Rename functions to start with verb
According to LLVM coding standards:
https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

Differential Revision: https://reviews.llvm.org/D87498
2020-09-11 14:54:18 +00:00
Eduardo Caldas c01d28dc51 [SyntaxTree] Specialize `TreeTestBase` for `BuildTreeTest`, `MutationsTest` and `SynthesisTest`
Differential Revision: https://reviews.llvm.org/D87374
2020-09-10 16:44:14 +00:00
Eduardo Caldas f5087d5c72 [SyntaxTree] Fix crash on functions with default arguments.
* Do not visit `CXXDefaultArgExpr`
* To build `CallArguments` nodes, just go through non-default arguments

Differential Revision: https://reviews.llvm.org/D87249
2020-09-08 09:49:30 +00:00
Eduardo Caldas 134455a07c [SyntaxTree] Ignore implicit `CXXFunctionalCastExpr` wrapping constructor
Differential Revision: https://reviews.llvm.org/D87229
2020-09-08 09:44:23 +00:00
Eduardo Caldas 46f4439dc9 [SyntaxTree] Ignore implicit leaf `CXXConstructExpr`
Differential Revision: https://reviews.llvm.org/D86700
2020-09-08 09:44:23 +00:00
Eduardo Caldas 2325d6b42f [SyntaxTree] Ignore implicit non-leaf `CXXConstructExpr`
Differential Revision: https://reviews.llvm.org/D86699
2020-09-08 09:44:23 +00:00
Eduardo Caldas a1461953f4 [SyntaxTree] Add coverage for declarators and init-declarators 2020-08-28 12:19:38 +00:00
Eduardo Caldas 718e550cd0 [SyntaxTree] Refactor `NodeRole`s
Previously a NodeRole would generally be prefixed with the `NodeKind`,
we remove this prefix, as it we redundant and made tests more noisy.

Differential Revision: https://reviews.llvm.org/D86636
2020-08-27 05:16:00 +00:00
Eduardo Caldas dc3d474327 [SyntaxTree] Migrate `ParamatersAndQualifiers` to use the new List API
Fix: Add missing `List::getTerminationKind()`, `List::canBeEmpty()`,
`List::getDelimiterTokenKind()` for `CallArguments`.

Differential Revision: https://reviews.llvm.org/D86600
2020-08-26 16:46:19 +00:00
Eduardo Caldas 3b75f65e6b [SyntaxTree] Fix C++ versions on tests of `BuildTreeTest.cpp`
Differential Revision: https://reviews.llvm.org/D86591
2020-08-26 07:19:49 +00:00
Eduardo Caldas 2de2ca348d [SyntaxTree] Add support for `CallExpression`
* Generate `CallExpression` syntax node for all semantic nodes inheriting from
`CallExpr` with call-expression syntax - except `CUDAKernelCallExpr`.
* Implement all the accessors
* Arguments of `CallExpression` have their own syntax node which is based on
the `List` base API

Differential Revision: https://reviews.llvm.org/D86544
2020-08-26 07:03:49 +00:00
Eduardo Caldas be2bc7d4ce [SyntaxTree] Update `Modifiable` tests to dump `NodeRole` and `unmodifiable` tag 2020-08-25 06:34:48 +00:00