Commit Graph

54 Commits

Author SHA1 Message Date
Alexander Kornienko 4b67207157 Moved FormatToken to a separate header.
llvm-svn: 183115
2013-06-03 16:45:03 +00:00
Manuel Klimek 6e6310ec84 The second step in the token refactoring.
Gets rid of AnnotatedToken, putting everything into FormatToken.
FormatTokens are created once, and only referenced by pointer.  This
enables multiple future features, like having tokens shared between
multiple UnwrappedLines (while there's still work to do to fully enable
that).

llvm-svn: 182859
2013-05-29 14:47:47 +00:00
Manuel Klimek 591ab5a830 Make UnwrappedLines and AnnotatedToken contain pointers to FormatToken.
The FormatToken is now not copyable any more.

llvm-svn: 182772
2013-05-28 13:42:28 +00:00
Manuel Klimek 15dfe7ac40 A first step towards giving format tokens pointer identity.
With this patch, we create all tokens in one go before parsing and pass
an ArrayRef<FormatToken*> to the UnwrappedLineParser. The
UnwrappedLineParser is switched to use pointer-to-token internally.

The UnwrappedLineParser still copies the tokens into the UnwrappedLines.
This will be fixed in an upcoming patch.

llvm-svn: 182768
2013-05-28 11:55:06 +00:00
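
A minimal sketch of the pointer-identity idea described in r182768 and r182772: all tokens are created once up front, copying is disabled, and later stages refer to them only by pointer. The names `Token`, `TokenStore` and `StartsLine` are hypothetical stand-ins for illustration, not the actual clang-format classes, and `std::vector<Token *>` is used here in place of `ArrayRef<FormatToken*>`.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Illustrative token type (not the real FormatToken): copying is disabled so
// that every user has to refer to the single instance by pointer.
struct Token {
  explicit Token(std::string Text) : Text(std::move(Text)), StartsLine(false) {}
  Token(const Token &) = delete;
  Token &operator=(const Token &) = delete;

  std::string Text;
  bool StartsLine; // example annotation written by a later stage
};

// Hypothetical owner: creates every token once, before any parsing happens,
// and hands out stable pointers to them.
class TokenStore {
public:
  explicit TokenStore(const std::vector<std::string> &Texts) {
    for (const std::string &Text : Texts) {
      Owned.push_back(std::unique_ptr<Token>(new Token(Text)));
      Pointers.push_back(Owned.back().get());
    }
  }

  const std::vector<Token *> &tokens() const { return Pointers; }

private:
  std::vector<std::unique_ptr<Token>> Owned; // heap storage keeps addresses stable
  std::vector<Token *> Pointers;
};
```

Because a line in this sketch holds only pointers, an annotation written through one line is visible to every other holder of the same token, which is what would make sharing tokens between several lines possible.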
David Blaikie 8f6a2972ce Remove unreachable return
llvm-svn: 182742
2013-05-27 20:43:54 +00:00
Manuel Klimek 9043c74f49 Major refactoring of BreakableToken.
Unify handling of whitespace when breaking protruding tokens with other
whitespace replacements.

As a side effect, the BreakableToken structure changed significantly:
- have a common base class for single-line breakable tokens, as they are
  much more similar
- revamp handling of multi-line comments; we now calculate the
  information about lines in multi-line comments similar to normal
  tokens, and always issue replacements

As a result, we were able to get rid of special casing of trailing
whitespace deletion for comments in the whitespace manager and the
BreakableToken and fixed bugs related to tab handling and escaped
newlines.

llvm-svn: 182738
2013-05-27 15:23:34 +00:00
Manuel Klimek 75081b5cf8 Address post-review comment from dblaikie.
llvm-svn: 182732
2013-05-27 12:36:28 +00:00
Alexander Kornienko f2e021233c Ignore contents of #if 0 blocks.
Summary:
Added stack of preprocessor branching directives, and ignore all tokens
inside #if 0 except for preprocessor directives.

Reviewers: klimek, djasper

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D855

llvm-svn: 182658
2013-05-24 18:24:24 +00:00
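
The stack of preprocessor branching directives from r182658 could look roughly like the sketch below. It works on whole lines rather than tokens, and `dropDisabledCode` is a hypothetical helper, not part of clang-format. The handling of `#else` / `#elif` (re-enabling the branch unless an enclosing block is disabled) is an assumption; the commit message only states that tokens inside `#if 0` are ignored while directives are kept.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the lines that survive when everything inside
// "#if 0" is ignored except the preprocessor directives themselves.
std::vector<std::string>
dropDisabledCode(const std::vector<std::string> &Lines) {
  // One entry per open conditional; true means "inside an #if 0 region".
  std::vector<bool> Disabled;
  std::vector<std::string> Kept;
  for (const std::string &Line : Lines) {
    std::size_t First = Line.find_first_not_of(" \t");
    if (First != std::string::npos && Line[First] == '#') {
      std::istringstream Directive(Line.substr(First + 1));
      std::string Name, Arg;
      Directive >> Name >> Arg;
      if (Name == "if" || Name == "ifdef" || Name == "ifndef") {
        bool Parent = !Disabled.empty() && Disabled.back();
        Disabled.push_back(Parent || (Name == "if" && Arg == "0"));
      } else if (Name == "else" || Name == "elif") {
        // Assumption: the alternative branch is only as disabled as the
        // enclosing conditional.
        if (!Disabled.empty())
          Disabled.back() =
              Disabled.size() > 1 && Disabled[Disabled.size() - 2];
      } else if (Name == "endif") {
        if (!Disabled.empty())
          Disabled.pop_back();
      }
      Kept.push_back(Line); // directives are always kept
    } else if (Disabled.empty() || !Disabled.back()) {
      Kept.push_back(Line);
    }
  }
  return Kept;
}
```

Nested conditionals inherit the disabled state of their parent, so an `#ifdef` inside `#if 0` stays ignored until the matching `#endif` pops it off the stack.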
Manuel Klimek 5c24cca0f0 Use a SourceRange for the whitespace location in FormatToken.
Replaces the use of WhitespaceStart + WhitespaceLength.
This made a bug in the formatter obvious where we would incorrectly
calculate the next column.

FIXME: There's a similar bug left regarding TokenLength. We should
probably also move to have a TokenRange instead.

llvm-svn: 182572
2013-05-23 10:56:37 +00:00
Manuel Klimek 6734592c12 Fix no-assert compiles.
llvm-svn: 182569
2013-05-23 10:02:51 +00:00
Manuel Klimek ab41991c07 Expand parsing of braced init lists.
Allows formatting of C++11 braced init list constructs, like:
vector<int> v { 1, 2, 3 };
f({ 1, 2 });

This involves some changes of how tokens are handled in the
UnwrappedLineFormatter. Note that we have a plan to evolve the
design of the token flow into one where we create all tokens
up-front and then annotate them in the various layers (as we
currently already have to create all tokens at once anyway, the
current abstraction does not help). Thus, this introduces
FIXMEs towards that goal.

llvm-svn: 182568
2013-05-23 09:41:43 +00:00
Daniel Jasper d2ae41a7c6 Remove diagnostics from clang-format.
We only ever implemented one and that one is not actually all that
helpful (e.g. gets incorrectly triggered by macros).

llvm-svn: 181871
2013-05-15 08:14:19 +00:00
Alexander Kornienko 9e90b62e01 Unified token breaking logic: support for line comments.
Summary:
Added BreakableLineComment, moved common code from
BreakableBlockComment to newly added BreakableComment. As a side-effect of the
rewrite, found another problem with escaped newlines and had to change
code which removes trailing whitespace from line comments not to break after
this patch.

Reviewers: klimek, djasper

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D682

llvm-svn: 179693
2013-04-17 17:34:05 +00:00
Manuel Klimek 1a18c40468 Revamps structural error detection / handling.
Previously we'd only detect structural errors on the very first level.
This leads to incorrectly balanced braces not being discovered, and thus
incorrect indentation.

This change fixes the problem by:
- changing the parser to use an error state that can be detected
  anywhere inside the productions, for example if we get an eof on
  SOME_MACRO({ some block <eof>
- previously we'd never break lines when we discovered a structural
  error; now we break even in the case of a structural error if there
  are two unwrapped lines within the same line; thus,
  void f() { while (true) { g(); y(); } }
  will still be re-formatted, even if there are missing braces somewhere
  in the file
- still exclude macro definitions from generating structural errors;
  macro definitions are unbalanced snippets

llvm-svn: 179379
2013-04-12 14:13:36 +00:00
Daniel Jasper 973c9420e1 Format a line if a range in its leading whitespace was selected.
With [] marking the selected range, clang-format invoked on

    [  ]   int a;

Would so far not reformat anything. With this patch, it formats a
line if its leading whitespace is touched.

llvm-svn: 176435
2013-03-04 13:43:19 +00:00
Daniel Jasper 7a6d09b300 Move the token annotator into separate files.
No functional changes. Also removed experimental-warning from all of
clang-format's files, as it is no longer accurate.

llvm-svn: 173830
2013-01-29 21:01:14 +00:00
Manuel Klimek 0a3a3c9900 Allow us to better guess the context of an unwrapped line.
This gives us the ability to guess better defaults for whether a *
between identifiers is a pointer dereference or binary operator.

Now correctly formats:
void f(a *b);
void f() { f(a * b); }

llvm-svn: 173243
2013-01-23 09:32:48 +00:00
Manuel Klimek f92f7bc540 Implements more principled comment parsing.
Changing nextToken() in the UnwrappedLineParser to get the next
non-comment token. This allows us to correctly layout a whole class of
snippets, like:

if /* */(/* */ a /* */) /* */
  f() /* */; /* */
else /* */
  g();

Fixes a bug in the formatter where we would assume there is a previous
non-comment token.
Also adds the indent level of an unwrapped line to the debug output in
the parser.

llvm-svn: 173168
2013-01-22 16:31:55 +00:00
Manuel Klimek 762dd189a4 Fix parsing of return statements.
Previously, we would not detect brace initializer lists in return
statements, thus:
 return (a)(b) { 1, 2, 3 };
would put the semicolon onto the next line.

llvm-svn: 173017
2013-01-21 10:07:49 +00:00
Chandler Carruth 4b41745e05 Re-sort all the headers. Lots of regressions have crept in here.
Manually fix the order of UnwrappedLineParser.cpp as that one didn't
have its associated header as the first header.

This also uncovered a subtle inclusion order dependency as CLog.h didn't
include LLVM.h to pick up using declarations it relied upon.

llvm-svn: 172892
2013-01-19 08:09:44 +00:00
Manuel Klimek 05d82b72f1 Fix comment.
llvm-svn: 172831
2013-01-18 18:24:28 +00:00
Manuel Klimek d3b92fa61e Fixes problems with line merging in the face of preprocessor directives.
This patch prepares being able to test for and fix more problems (see
FIXME in the test for example).

Previously we would output unwrapped lines for preprocessor directives
at the point where we also parsed the hash token. Since often
productions only terminate (and thus output their own unwrapped line)
after peeking at the next token, this would lead to the formatter seeing
the preprocessor directives out-of-order (slightly earlier). To be able
to correctly identify lines to merge, the formatter needs a well-defined
order of unwrapped lines, which this patch introduces.

llvm-svn: 172819
2013-01-18 14:04:34 +00:00
Daniel Jasper a67a8f062b Calculate the total length of a line up to each token up front.
This makes the tedious fitsIntoLimit() method unnecessary and I can
replace one hack (constructor initializers) by a slightly better hack.

Furthermore, this will enable calculating whether a certain part of a
line fits into the limit for future modifications.

llvm-svn: 172604
2013-01-16 10:41:46 +00:00
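
The up-front length calculation from r172604 is essentially a prefix sum over the tokens of a line. The sketch below assumes a simplified token with hypothetical fields `SpacesBefore` and `TotalLength`; `LineToken`, `computeTotalLengths`, `lengthOfRange` and `rangeFits` are illustrative names, not clang-format functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an annotated token; the field names are assumptions.
struct LineToken {
  std::string Text;
  unsigned SpacesBefore; // spaces required in front of this token
  unsigned TotalLength;  // columns from line start through the end of this token
};

// One pass over the line: store the running length in every token.
void computeTotalLengths(std::vector<LineToken> &Line) {
  unsigned Length = 0;
  for (LineToken &Tok : Line) {
    Length += Tok.SpacesBefore + static_cast<unsigned>(Tok.Text.size());
    Tok.TotalLength = Length;
  }
}

// Columns needed for tokens First..Last (inclusive), not counting the
// whitespace in front of First. Constant time thanks to the prefix sums.
unsigned lengthOfRange(const std::vector<LineToken> &Line, std::size_t First,
                       std::size_t Last) {
  return Line[Last].TotalLength - Line[First].TotalLength +
         static_cast<unsigned>(Line[First].Text.size());
}

// True if tokens First..Last fit when First starts at StartColumn.
bool rangeFits(const std::vector<LineToken> &Line, std::size_t First,
               std::size_t Last, unsigned StartColumn, unsigned ColumnLimit) {
  return StartColumn + lengthOfRange(Line, First, Last) <= ColumnLimit;
}
```

With the running totals stored on the tokens, asking whether any part of a line fits into the limit no longer requires walking the tokens again.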
Daniel Jasper daffc0dd4c Change the datastructure for UnwrappedLines.
It was quite convoluted leading to us accidentally introducing O(N^2)
complexity while copying from UnwrappedLine to AnnotatedLine. We might
still want to improve the datastructure in AnnotatedLine (most
importantly not put them in a vector where they need to be copied on
vector resizing), but that will be done as a follow-up.

This fixes most of the regression in llvm.org/PR14959.

No formatting changes intended.

llvm-svn: 172602
2013-01-16 09:10:19 +00:00
Manuel Klimek e01bab587c Fixes various bugs around the keywords class, struct and union.
This switches to parsing record definitions only if we can clearly
identify them. We're specifically allowing common patterns for
visibility control through macros and attributes, but we cannot
currently fix all instances. This fixes all known bugs we have though.

Before:
static class A f() {
  return g();
} int x;

After:
static class A f() {
  return g();
}
int x;

llvm-svn: 172530
2013-01-15 13:38:33 +00:00
Dmitri Gribenko f857950d39 Remove useless 'llvm::' qualifier from names like StringRef and others that are
brought into 'clang' namespace by clang/Basic/LLVM.h

llvm-svn: 172323
2013-01-12 19:30:44 +00:00
Manuel Klimek d5e5f8f2a4 Fix parsing of initializer lists with elaborated type specifier.
Now we correctly parse and format:
verifyFormat("struct foo a = { bar };
int n;

llvm-svn: 172229
2013-01-11 18:13:04 +00:00
Alexander Kornienko 5b7157ac8d Basic support for diagnostics.
Summary: Uses DiagnosticsEngine to output diagnostics.

Reviewers: djasper, klimek

Reviewed By: djasper

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D278

llvm-svn: 172071
2013-01-10 15:05:09 +00:00
Manuel Klimek 8e07a1b64b Fix layout of blocks inside statements.
Previously, we would not indent:
SOME_MACRO({
  int i;
});
correctly. This is fixed by adding the trailing }); to the unwrapped
line starting with SOME_MACRO({, so the formatter can correctly match
the braces and indent accordingly.

Also fixes incorrect parsing of initializer lists, like:
int a[] = { 1 };

llvm-svn: 172058
2013-01-10 11:52:21 +00:00
Nico Weber 2ce0ac5a8c Formatter: Add support for @implementation.
Just reuse the @interface code for this. It accepts slightly more than
necessary (@implementation cannot have protocol lists), but that's ok.

llvm-svn: 172019
2013-01-09 23:25:37 +00:00
Nico Weber 8696a8d9e3 Formatting: Add support for @protocol.
Pull pieces of the @interface code into reusable methods.

llvm-svn: 172001
2013-01-09 21:15:03 +00:00
Nico Weber 7eecf4b6e3 Formatter: Add support for @interface.
Previously:
@interface Foo + (id)init; @end

Now:
@interface Foo
+ (id)init;
@end

Some tweaking remains, but this is a good first step.

llvm-svn: 171995
2013-01-09 20:25:35 +00:00
Manuel Klimek 52b1515405 Enables layouting unwrapped lines around preprocessor directives.
Previously, we'd always start at indent level 0 after a preprocessor
directive, now we layout the following snippet (column limit 69) as
follows:

functionCallTo(someOtherFunction(
    withSomeParameters, whichInSequence,
    areLongerThanALine(andAnotherCall,
  B
                       withMoreParamters,
                       whichStronglyInfluenceTheLayout),
    andMoreParameters),
               trailing);

Note that the different jumping indent is a different issue that will be
addressed separately.

This is the first step towards handling #ifdef->#else->#endif chains
correctly.

llvm-svn: 171974
2013-01-09 15:25:02 +00:00
Daniel Jasper 7c85fde500 Change the data structure used in clang-format.
This is a first step towards supporting more complex structures such
as #ifs inside unwrapped lines. This patch mostly converts the array-based
UnwrappedLine into a linked-list-based UnwrappedLine. Future changes will
allow multiple children for each Token turning the UnwrappedLine into a
tree.

No functional changes intended.

llvm-svn: 171856
2013-01-08 14:56:18 +00:00
Manuel Klimek 28cacc740d Fix parsing of variable declarations directly after a class / struct.
Previous indent:
class A {
}
a;
void f() {
};

With this patch:
class A {
} a;
void f() {
}
;

The patch introduces a production for classes and structs, and parses
the rest of the line to the semicolon after the class scope.
This allowed us to remove a long-standing wart in the parser that would
just munch the semicolon after any block.
Due to this suboptimal formatting some tests were broken.

Some unrelated formatting tests broke; those hit a bug in the ast
printing, and need to be fixed separately.

llvm-svn: 171761
2013-01-07 18:10:23 +00:00
Manuel Klimek 6b9eeba09a s/parseStatement/parseStructuralElement/g in the UnwrappedLineParser.
llvm-svn: 171737
2013-01-07 14:56:16 +00:00
Daniel Jasper 8d1832e091 Reformat clang-format's source code.
All changes done by clang-format itself. No functional changes.

llvm-svn: 171732
2013-01-07 13:26:07 +00:00
Manuel Klimek ef92069940 Fix layouting of tokens with a leading escaped newline.
If a token follows directly on an escaped newline, the escaped newline
is stored with the token. Since we re-layout escaped newlines, we need
to treat them just like normal whitespace - thus, we need to increase
the whitespace-length of the token, while decreasing the token length
(otherwise the token length contains the length of the escaped newline
and we double-count it while indenting).

llvm-svn: 171706
2013-01-07 07:56:50 +00:00
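
The accounting change in r171706 can be illustrated with a small sketch: leading escaped newlines are moved out of the token text and counted as whitespace instead. `RawToken` and `stripLeadingEscapedNewlines` are hypothetical names, and the sketch only recognizes the two-character sequence backslash followed by `\n`.

```cpp
#include <string>

// Hypothetical token as delivered by a lexer: the text may start with
// escaped newlines that really belong to the preceding whitespace.
struct RawToken {
  std::string Text;
  unsigned WhitespaceLength; // characters of whitespace in front of the token
};

// Treat every leading "\<newline>" like ordinary whitespace: grow the
// whitespace length and shrink the token by the same amount, so the
// escaped newline is not counted twice when indenting.
void stripLeadingEscapedNewlines(RawToken &Tok) {
  while (Tok.Text.size() >= 2 && Tok.Text[0] == '\\' && Tok.Text[1] == '\n') {
    Tok.Text.erase(0, 2);
    Tok.WhitespaceLength += 2;
  }
}
```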
Manuel Klimek 1058d987f9 Fixes handling of unbalanced braces.
If we find an unexpected closing brace, we must not stop parsing, as
we'd otherwise not layout anything beyond that point.

If we find a structural error on the highest level we'll not re-indent
anyway, but we'll still want to format within unwrapped lines.

Needed to introduce a differentiation between an expected and unexpected
closing brace.

llvm-svn: 171666
2013-01-06 20:07:31 +00:00
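
The distinction r171666 draws between expected and unexpected closing braces can be sketched with a plain brace scan. This is a deliberately simplified illustration that works on characters rather than tokens and ignores strings and comments; `BraceScan` and `scanBraces` are hypothetical names.

```cpp
#include <string>

// Result of scanning a snippet for brace structure.
struct BraceScan {
  bool StructuralError;    // an unexpected '}' was seen or a '{' stayed open
  unsigned UnclosedBlocks; // number of '{' still open at the end of input
};

BraceScan scanBraces(const std::string &Code) {
  BraceScan Result = {false, 0};
  for (char C : Code) {
    if (C == '{') {
      ++Result.UnclosedBlocks;
    } else if (C == '}') {
      if (Result.UnclosedBlocks == 0)
        Result.StructuralError = true; // unexpected: record it, keep scanning
      else
        --Result.UnclosedBlocks;       // expected: closes the innermost block
    }
  }
  if (Result.UnclosedBlocks != 0)
    Result.StructuralError = true;
  return Result;
}
```

The point of continuing after an unexpected `}` is that the remainder of the input is still scanned, so everything after the stray brace can still be handled.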
Manuel Klimek 52d0fd8961 Fixes parsing of hash tokens in the middle of a line.
To parse # correctly, we need to know whether it is the first token in a
line - we can deduce this either from the whitespace or seeing that the
token is the first in the file - we already calculate this information.
This patch moves the identification of the first token into the
getNextToken method and stores it inside the FormatToken, so the
UnwrappedLineParser can stay independent of the SourceManager.

llvm-svn: 171640
2013-01-05 22:56:06 +00:00
Manuel Klimek ef2cfb110d Fixes PR14801 - preprocessor directives shouldn't be indented
Uses indent 0 for macros for now and resets the indent state to the
level prior to the preprocessor directive.

llvm-svn: 171639
2013-01-05 22:14:16 +00:00
Manuel Klimek 1abf789c7a Various fixes to clang-format's macro handling.
Some of this is still pretty rough (note the load of FIXMEs), but it is
strictly an improvement and fixes various bugs that were related to
macro processing but are also important in non-macro use cases.

Specific fixes:
- correctly puts escaped newlines at the end of the line
- fixes counting of white space before a token when escaped newlines are
  present
- fixes parsing of "trailing" tokens when eof() is hit
- puts macro parsing orthogonal to parsing other structure
- general support for parsing of macro definitions

Due to the fix to format trailing tokens, this change also includes a
bunch of fixes to the c-index tests.

llvm-svn: 171556
2013-01-04 23:34:14 +00:00
Manuel Klimek a71e5d8115 Fixes use of unescaped newlines when formatting preprocessor directives.
This is the first step towards handling preprocessor directives. This
patch only fixes the most pressing issue, namely correctly escaping
newlines for tokens within a sequence of a preprocessor directive.

The next step will be to fix incorrect format decisions on #define
directives.

llvm-svn: 171393
2013-01-02 16:30:12 +00:00
Daniel Jasper 8fbd96855c Let clang-format format itself.
Apply all formatting changes that clang-format would apply to its own source
code. All choices seem to improve readability (or at least not make it worse).
No functional changes.

llvm-svn: 171039
2012-12-24 16:51:15 +00:00
Daniel Jasper e25509f857 Fix several formatting problems.
More specifically:
- Improve formatting of static initializers.
- Fix formatting of line comments in enums.
- Fix formatting of trailing line comments.

llvm-svn: 170316
2012-12-17 11:29:41 +00:00
Matt Beaumont-Gay 05e0ad5961 Appease -Wnon-virtual-dtor
llvm-svn: 169648
2012-12-07 22:49:27 +00:00
Alexander Kornienko e327684b2a Clang-format: extracted FormatTokenSource from UnwrappedLineParser.
Summary: FormatTokenLexer is here, FormatTokenBuffer is on the way. This will allow re-parsing unwrapped lines when needed.

Reviewers: djasper, klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D186

llvm-svn: 169605
2012-12-07 16:15:44 +00:00
Alexander Kornienko 578fdd8968 Clang-format: IndentCaseLabels option, proper namespace handling
Summary: + tests arranged in groups, as their number is already quite large.

Reviewers: djasper, klimek

Reviewed By: djasper

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D185

llvm-svn: 169520
2012-12-06 18:03:27 +00:00
Alexander Kornienko 37d6c94e28 Clang-format: parse for and while loops
Summary: Adds support for formatting for and while loops.

Reviewers: djasper, klimek

Reviewed By: klimek

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D174

llvm-svn: 169387
2012-12-05 15:06:06 +00:00
Alexander Kornienko bc09a7ea85 Follow-up to r169286, addresses comments in http://llvm-reviews.chandlerc.com/D164#comment-4 : comments and a method rename
llvm-svn: 169382
2012-12-05 13:56:52 +00:00