llvm-project

Commit Graph

Author	SHA1	Message	Date
Haojian Wu	5b8337cf40	[syntax] Some #includes cleanup, NFC.	2022-07-15 21:05:59 +02:00
Haojian Wu	263dcf452f	[syntax] Introduce a TokenManager interface. TokenManager defines Token interfaces for the clang syntax-tree. This is the level of abstraction that the syntax-tree should use to operate on Tokens. It decouples the syntax-tree from a particular token implementation (TokenBuffer previously). This enables us to use a different underlying token implementation for the syntax Leaf node -- in clang pseudoparser, we want to produce a syntax-tree with its own pseudo::Token rather than syntax::Token. Differential Revision: https://reviews.llvm.org/D128411	2022-07-15 10:30:37 +02:00
Kazu Hirata	0916d96d12	Don't use Optional::hasValue (NFC)	2022-06-20 20:17:57 -07:00
Sam McCall	89cd86bbc5	Reapply "[pseudo] Move pseudoparser from clang to clang-tools-extra" This reverts commit `049f4e4eab`. The problem was a stray dependency in CLANG_TEST_DEPS which caused cmake to fail if clang-pseudo wasn't built. This is now removed.	2022-03-16 01:10:55 +01:00
Sam McCall	049f4e4eab	Revert "[pseudo] Move pseudoparser from clang to clang-tools-extra" This reverts commit `b97856c4cf`. Breaks a bunch of bots: https://lab.llvm.org/buildbot/#/builders/193/builds/8513	2022-03-16 01:06:24 +01:00
Sam McCall	b97856c4cf	[pseudo] Move pseudoparser from clang to clang-tools-extra This should make clearer that: - it's not part of clang proper - there's no expectation to update it along with clang (beyond green tests) - clang should not depend on it This is intended to expose a library, so unlike other tools it has a split between include/ and lib/. The main renames are: clang/lib/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/lib/* clang/include/clang/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/include/clang-pseudo/* clang/tools/clang/pseudo/* => clang-tools-extra/pseudo/tool/* clang/test/Syntax/* => clang-tools-extra/pseudo/test/* clang/unittests/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/unittests/* #include "clang/Tooling/Syntax/Pseudo/" => #include "clang-pseudo/" namespace clang::syntax::pseudo => namespace clang::pseudo check-clang => check-clang-pseudo clangToolingSyntaxPseudo => clangPseudo The clang-pseudo and ClangPseudoTests binaries are not renamed. See discussion around: https://discourse.llvm.org/t/rfc-a-c-pseudo-parser-for-tooling/59217/50 Differential Revision: https://reviews.llvm.org/D121233	2022-03-16 00:14:11 +01:00
Haojian Wu	2d01ac18df	[pseudo] Strip comments for TokenStream. Add a utility function to strip comments from a "raw" tokenstream. The derived stream will be fed to the GLR parser (for early testing). Differential Revision: https://reviews.llvm.org/D121092	2022-03-07 20:24:37 +01:00
Haojian Wu	d5b8ecbd33	[pseudo] empty parameter-declaration should be allowed in lambda declarator. This was an oversight, as we did an avoid-nullable modification to parameter-declaration-clause. Differential Revision: https://reviews.llvm.org/D121089	2022-03-07 20:05:35 +01:00
Sam McCall	54d6b5b67f	[pseudo] Rename {Preprocess,PPStructure} -> DirectiveMap. NFC More precisely describes what this file does. Per comments on https://reviews.llvm.org/D121092	2022-03-07 17:41:35 +01:00
Sam McCall	68b4e2d703	[pseudo] Add readme Differential Revision: https://reviews.llvm.org/D121108	2022-03-07 15:54:00 +01:00
Haojian Wu	28ccf32672	[pseudo] Fix an out-of-bound access for LRTable::Actions. Without this patch, when End == Start, we access Actions[Actions.end()] though we return an empty result. This fixes an assertion failure in MSVC STL debug build.	2022-03-03 14:27:44 +01:00
Haojian Wu	05d7e9f68e	[pseudo] fix some comment nits, NFC.	2022-03-02 10:19:17 +01:00
Haojian Wu	28efb1ccf5	[pseudo] Fix an out-of-bound error in LRTable::find. The linear scan should not escape the TargetedStates range. Differential Revision: https://reviews.llvm.org/D120723	2022-03-02 09:53:52 +01:00
Haojian Wu	302ca279cb	[pseudo] fix an out-of-bound error in LRTable. Fix Windows debug build.	2022-02-23 21:34:54 +01:00
Sam McCall	7c1ee5e95f	[Pseudo] Token/TokenStream, PP directive parser. The TokenStream class is the representation of the source code that will be fed into the GLR parser. This patch allows a "raw" TokenStream to be built by reading source code. It also supports scanning a TokenStream to find the directive structure. Next steps (with placeholders in the code): heuristically choosing a path through #ifs, preprocessing the code by stripping directives and comments. These will produce a suitable stream to feed into the parser proper. Differential Revision: https://reviews.llvm.org/D119162	2022-02-23 17:52:02 +01:00
Aaron Ballman	b1a8dcf8c1	Silence some "not all control paths return a value" warnings; NFC	2022-02-23 09:18:56 -05:00
Haojian Wu	a2fab82f33	[pseudo] Implement LRTable. This patch introduces a dense implementation of the LR parsing table, which is used by LR parsers. We build a SLR(1) parsing table from the LR(0) graph. Statistics of the LR parsing table on the C++ spec grammar: - number of states: 1449 - number of actions: 83069 - size of the table (bytes): 334928 Differential Revision: https://reviews.llvm.org/D118196	2022-02-23 09:21:34 +01:00
Erich Keane	8073da0bee	[NFC] Fix sign-compare warning in GrammarBNF thanks to int promotion	2022-02-09 11:25:58 -08:00
Haojian Wu	f1984b1433	[pseudo] Implement LRGraph LRGraph is the key component of the clang pseudo parser: a deterministic handle-finding finite-state machine that is used to generate the LR parsing table. Separate from https://reviews.llvm.org/D118196. Differential Revision: https://reviews.llvm.org/D119172	2022-02-09 11:20:07 +01:00
Haojian Wu	fe932a88e9	[pseudo] Add first and follow set computation in Grammar. These will be used when building parsing table for LR parsers. Separate from https://reviews.llvm.org/D118196. Differential Revision: https://reviews.llvm.org/D118990	2022-02-09 09:16:27 +01:00
Haojian Wu	e1db505b42	[syntax][pseudo] Introduce the C++ spec grammar. Add a dummy clang-pseudo tool (right now it accepts and parses the grammar file). Differential Revision: https://reviews.llvm.org/D115856	2022-02-04 11:58:50 +01:00
Haojian Wu	b94f09524e	[pseudo] NFC, clangSyntaxPsuedo => clangToolingSyntaxPseudo To be consistent with existing name pattern.	2022-02-04 09:57:20 +01:00
Haojian Wu	20e05b9f0e	[syntax][pseudo] Add Grammar for the clang pseudo-parser This patch introduces the Grammar class, which is a critical piece for constructing a table-based parser. As the first patch, the scope is limited to: - define base types (symbol, rules) of modeling the grammar - construct Grammar by parsing the BNF file (annotations are excluded for now) Differential Revision: https://reviews.llvm.org/D114790	2022-02-03 11:28:27 +01:00
Jan Svoboda	600c6714ac	[clang][syntax] Replace `std::vector<bool>` use LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement. This patch replaces `std::vector<bool>` with `llvm::BitVector` in the Syntax library and replaces range-based for loop with regular for loop. This is necessary due to `llvm::BitVector` not having `begin()` and `end()` (D117116). Reviewed By: dexonsmith, dblaikie Differential Revision: https://reviews.llvm.org/D118109	2022-01-26 11:20:18 +01:00
Logan Smith	5336befe8c	[NFC][tools] Return underlying strings directly instead of OS.str() This avoids an unnecessary copy required by 'return OS.str()', allowing instead for NRVO or implicit move. The .str() call (which flushes the stream) is no longer required since `65b13610a5`, which made raw_string_ostream unbuffered by default. Differential Revision: https://reviews.llvm.org/D115374	2021-12-09 16:05:46 -08:00
Zarko Todorovski	d8e5a0c42b	[clang][NFC] Inclusive terms: replace some uses of sanity in clang Rewording of comments to avoid using `sanity test, sanity check`. Reviewed By: aaron.ballman, Quuxplusone Differential Revision: https://reviews.llvm.org/D114025	2021-11-19 14:58:35 -05:00
Kazu Hirata	16ceb44e62	[clang] Use llvm::{count,count_if,find_if,all_of,none_of} (NFC)	2021-10-25 09:14:45 -07:00
Utkarsh Saxena	cd824a48cc	[clang][Syntax] Handle invalid source range in expandedTokens. Differential Revision: https://reviews.llvm.org/D99934	2021-04-07 11:19:01 +02:00
Utkarsh Saxena	aa979084df	[clang][Syntax] Optimize expandedTokens for token ranges. `expandedTokens(SourceRange)` used to do a binary search to get the expanded tokens belonging to a source range. Each binary search uses `isBeforeInTranslationUnit` to order two source locations. This is inherently very slow. By profiling clangd we found out that users like clangd::SelectionTree spend 95% of time in `isBeforeInTranslationUnit`. Also it is worth noting that users of `expandedTokens(SourceRange)` mostly use ranges provided by the AST to query this function. The ranges provided by the AST are token ranges (starting at the beginning of a token and ending at the beginning of another token). Therefore we can avoid the binary search in the majority of cases by maintaining an index of ExpandedToken by their SourceLocations. We still do binary search for ranges which are not token ranges but such instances are quite rare. Performance: `~/build/bin/clangd --check=clang/lib/Serialization/ASTReader.cpp` Before: Took 2:10s to complete. Now: Took 1:13s to complete. Differential Revision: https://reviews.llvm.org/D99086	2021-03-25 18:54:15 +01:00
Haojian Wu	780ead41e0	[Syntax] No crash on OpaqueValueExpr. OpaqueValueExpr doesn't correspond to the concrete syntax, it has invalid source location, ignore them. Reviewed By: kbobyrev Differential Revision: https://reviews.llvm.org/D96112	2021-02-18 10:32:04 +01:00
Haojian Wu	e159a3ced4	[Syntax] Remove a strict valid source location assertion for TypeLoc. The EndLoc of a type loc can be invalid for broken code. Also extend the existing test to support error code with `error-ok` annotation. Differential Revision: https://reviews.llvm.org/D96261	2021-02-11 09:53:52 +01:00
Haojian Wu	6c1a23303d	[Syntax] Support condition for IfStmt. Differential Revision: https://reviews.llvm.org/D95782	2021-02-04 09:15:30 +01:00
Sam McCall	1630e50874	[Syntax] Tablegen literal expressions. Non-mechanical changes: - Added FIXME to StringLiteral to cover multi-token string literals. - LiteralExpression::getLiteralToken() is gone. (It was never called) This is because we don't codegen methods in Alternatives It's conceptually suspect if we consider multi-token string literals, though. Differential Revision: https://reviews.llvm.org/D91277	2020-11-12 01:26:02 +01:00
Sam McCall	ea4d24c899	[Syntax] Tablegen Sequence classes. NFC Similar to the previous patch, this doesn't convert all the classes that could be converted. It also doesn't enforce any new invariants etc. It does include some data we don't use yet: specific token types that are allowed and optional/required status of sequence items. (Similar to Dmitri's prototype). I think these are easier to add as we go than later, and serve a useful documentation purpose. Differential Revision: https://reviews.llvm.org/D90659	2020-11-11 16:29:19 +01:00
Sam McCall	138189ee33	[Syntax] Tablegen operator<<(NodeKind). NFC Differential Revision: https://reviews.llvm.org/D90662	2020-11-11 16:02:01 +01:00
Sam McCall	454579e46a	Reland [Syntax] Add minimal TableGen for syntax nodes. NFC This reverts commit `09c6259d6d`. (Fixed side-effecting code being buried in an assert)	2020-11-11 11:24:47 +01:00
Sam McCall	09c6259d6d	Revert "[Syntax] Add minimal TableGen for syntax nodes. NFC" This reverts commit `55120f74ca`. Segfaults during build: http://lab.llvm.org:8011/#/builders/36/builds/1310	2020-11-09 23:59:11 +01:00
Sam McCall	55120f74ca	[Syntax] Add minimal TableGen for syntax nodes. NFC So far, only used to generate Kind and implement classof(). My plan is to have this general-purpose Nodes.inc in the style of AST DeclNodes.inc etc, and additionally a special-purpose backend generating the actual class definitions. But baby steps... Differential Revision: https://reviews.llvm.org/D90540	2020-11-09 23:45:50 +01:00
Kirill Bobyrev	142c6f82fd	[clang] Simplify buildSyntaxTree API Follow-up on https://reviews.llvm.org/D88553#inline-837013 Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D90672	2020-11-09 22:49:54 +01:00
Eduardo Caldas	23657d9cc3	[SyntaxTree] Add reverse links to syntax Nodes. Rationale: Children of a syntax tree had forward links only, because there was no need for reverse links. This need appeared when we started mutating the syntax tree. On a forward list, to remove a target node in O(1) we need a pointer to the node before the target. If we don't have this "before" pointer, we have to find it, and that requires O(n). So in order to remove a syntax node from a tree, we would similarly need to find the node before to then remove. This is neither ergonomic nor does it have good complexity. Differential Revision: https://reviews.llvm.org/D90240	2020-11-05 09:33:53 +00:00
Sam McCall	dd6f7ee05e	[Syntax] DeclaratorList is a List I think this was just an oversight. Differential Revision: https://reviews.llvm.org/D90541	2020-11-03 03:29:06 +01:00
Sam McCall	d4934eb5f8	[Syntax] Add iterators over children of syntax trees. This gives us slightly nicer syntax (foreach) for idioms currently expressed as a loop, and the option to use range algorithms where it makes sense (e.g. llvm::all_of et al encapsulate the needed flow control in a useful way). It's also a building block for iteration over filtered views (e.g. iterate over all Stmt children, with the right type): for (const Statement &S : filter<Statement>(N.children())) ... I realize the recent direction has been mostly towards strongly-typed node-specific facilities, but I think it's important we have convenient generic facilities too. Differential Revision: https://reviews.llvm.org/D90023	2020-10-28 12:37:57 +01:00
Kirill Bobyrev	5ad6bbacf0	[clangd] Start using SyntaxTrees for folding ranges feature This is an initial attempt to start using Syntax Trees in clangd while improving state of folding ranges feature and experimenting with Syntax Tree capabilities. Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D88553	2020-10-27 16:47:35 +01:00
Mikhail Maltsev	7819411837	[clang] Use SourceLocation as key in hash maps, NFCI The patch adjusts the existing `llvm::DenseMap<unsigned, T>` and `llvm::DenseSet<unsigned>` objects that store source locations, so that they use `SourceLocation` directly instead of `unsigned`. This patch relies on the `DenseMapInfo` trait added in D89719. It also replaces the construction of `SourceLocation` objects from the constants -1 and -2 with calls to the trait's methods `getEmptyKey` and `getTombstoneKey` where appropriate. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D69840	2020-10-20 16:24:09 +01:00
Eduardo Caldas	6fbad9bf30	[SyntaxTree][NFC] Nit on `replaceChildRangeLowLevel`	2020-10-14 09:40:37 +00:00
Eduardo Caldas	72732acade	[SyntaxTree] Bug fix in `MutationsImpl::addAfter`. * Add assertions to other `MutationsImpl` member functions * `findPrevious` is a free function Differential Revision: https://reviews.llvm.org/D89314	2020-10-14 09:22:01 +00:00
Eduardo Caldas	4178f8f2f0	[SyntaxTree] Improve safety of `replaceChildRangeLowLevel` * Add assertions for other preconditions. * If nothing is modified, don't mark it. Differential Revision: https://reviews.llvm.org/D89303	2020-10-14 09:18:32 +00:00
Eduardo Caldas	5011d43108	Migrate Declarators to use the List API After this change all nodes that have a delimited-list are using the `List` API. Implementation details: Let's look at a declaration with multiple declarators: `int a, b;` To generate a declarator list node we need to have the range of declarators: `a, b`: However, the `ClangAST` actually stores them as separate declarations: `int a ;` `int b;` We solve that by appropriately marking the declarators on each separate declaration in the `ClangAST` and then for the final declarator `int b`, shrinking its range to fit to the already marked declarators. Differential Revision: https://reviews.llvm.org/D88403	2020-10-01 13:56:31 +00:00
Eduardo Caldas	66bcb14312	[SyntaxTree][Synthesis] Fix: `deepCopy` -> `deepCopyExpandingMacros`. There can be Macros that are tagged with `modifiable`. Thus verifying `canModifyAllDescendants` is not sufficient to avoid macros when deep copying. We think the `TokenBuffer` could inform us whether a `Token` comes from a macro. We'll look into that when we can surface this information easily, for instance in unit tests for `ComputeReplacements`. Differential Revision: https://reviews.llvm.org/D88034	2020-09-22 09:15:21 +00:00
Eduardo Caldas	af582c9b0f	[SyntaxTree] Test `findFirstLeaf` and `findLastLeaf` * Introduce `TreeTest.cpp` to unit test `Tree.h` * Add `generateAllTreesWithShape` to generate test cases * Add tests for `findFirstLeaf` and `findLastLeaf` * Fix implementations of `findFirstLeaf` and `findLastLeaf` that had been broken when empty `Tree`s were present. Differential Revision: https://reviews.llvm.org/D87779	2020-09-22 06:47:36 +00:00

167 Commits