Commit Graph

6 Commits

Author SHA1 Message Date
Sam McCall 7d8e2742d9 [pseudo] Define recovery strategy as grammar extension.
Differential Revision: https://reviews.llvm.org/D129158
2022-07-06 15:03:38 +02:00
Sam McCall 3121167488 [pseudo] Add error-recovery framework & brace-based recovery
The idea is:

- a parse failure is detected when all heads die when trying to shift the next token
- we can recover by choosing a nonterminal we're partway through parsing, and
  determining where it ends through nonlocal means (e.g. matching brackets)
- we can find candidates by walking up the stack from the (ex-)heads
- the token range is defined using heuristics attached to grammar rules
- the unparsed region is represented in the forest by an Opaque node

This patch has the core GLR functionality.
It does not allow recovery heuristics to be attached as extensions to
the grammar, but rather infers a brace-based heuristic.

Expected followups:

- make recovery heuristics grammar extensions (depends on D127448)
- add recovery to our grammar for bracketed constructs and sequence nodes
- change the structure of our augmented `_ := start` rules to eliminate some
  special-cases in glrParse.
- (if I can work out how): avoid some spurious recovery cases described in comments

(Previously mistakenly committed as a0f4c10ae2)

Differential Revision: https://reviews.llvm.org/D128486
2022-07-05 20:49:41 +02:00
Haojian Wu 70c0d92930 [pseudo] Use the prebuilt cxx grammar for the lit tests, NFC.
Differential Revision: https://reviews.llvm.org/D129074
2022-07-05 15:17:18 +02:00
Sam McCall 743971faaf Revert "[pseudo] Add error-recovery framework & brace-based recovery"
This reverts commit a0f4c10ae2.
This commit hadn't been reviewed yet, and was unintentionally included
on another branch.
2022-06-28 21:11:09 +02:00
Sam McCall a0f4c10ae2 [pseudo] Add error-recovery framework & brace-based recovery
The idea is:
 - a parse failure is detected when all heads die when trying to shift
   the next token
 - we can recover by choosing a nonterminal we're partway through parsing,
   and determining where it ends through nonlocal means (e.g. matching brackets)
 - we can find candidates by walking up the stack from the (ex-)heads
 - the token range is defined using heuristics attached to grammar rules
 - the unparsed region is represented in the forest by an Opaque node

This patch has the core GLR functionality.
It does not allow recovery heuristics to be attached as extensions to
the grammar, but rather infers a brace-based heuristic.

Expected followups:
 - make recovery heuristics grammar extensions (depends on D127448)
 - add recover to our grammar for bracketed constructs and sequence nodes
 - change the structure of our augmented `_ := start` rules to eliminate
   some special-cases in glrParse.
 - (if I can work out how): avoid some spurious recovery cases described
   in comments
 - grammar changes to eliminate the hard distinction between init-list
   and designated-init-list shown in the recovery-init-list.cpp testcase

Differential Revision: https://reviews.llvm.org/D128486
2022-06-28 21:08:43 +02:00
Haojian Wu 0a6a17a4f9 [pseudo] Fix the member-specification grammar rule.
The grammar rule is not right, doesn't match the standard one.

Differential Revision: https://reviews.llvm.org/D126991
2022-06-07 10:18:18 +02:00