to whether the fileid is a 'extern c system header' in addition to whether it
is a system header, most of this is spreading plumbing around. Once we have that,
PPLexerChange bases its "file enter/exit" notifications to PPCallbacks to
base the system header state on FileIDInfo instead of HeaderSearch. Finally,
in Preprocessor::HandleIncludeDirective, mirror logic in GCC: the system headerness
of a file being entered can be set due to the #includer or the #includee.
llvm-svn: 56688
- Added as private members for each because it is not clear where to
put the common definition. Perhaps the IdentifierInfos all of these
"pseudo-keywords" should be collected into one place (this would
KnownFunctionIDs and Objective-C property IDs, for example).
Remove Token::isNamedIdentifier.
- There isn't a good reason to use strcmp when we have interned
strings, and there isn't a good reason to encourage clients to do
so.
llvm-svn: 54794
* Move FormatError() from TextDiagnostic up to DiagClient, remove now
empty class TextDiagnostic
* Make DiagClient optional for Diagnostic
This fixes the following problems:
* -html-diags (and probably others) does now output the same set of
warnings as console clang does
* nothing crashes if one forgets to call setHeaderSearch() on
TextDiagnostic
* some code duplication is removed
llvm-svn: 54620
1) New public methods added:
-EnableBacktrackAtThisPos
-DisableBacktrack
-Backtrack
-isBacktrackEnabled
2) LookAhead() implementation is replaced with a more efficient one.
3) LookNext() is removed.
llvm-svn: 54611
related to pp-expressions. Doing so is pretty simple and this
patch implements it, yielding nice diagnostics like:
t.c:2:7: error: division by zero in preprocessor expression
#if 1 / (0 + 0)
~ ^ ~~~~~~~
t.c:5:14: error: expected ')' in preprocessor expression
#if (412 + 42
~~~~~~~~^
t.c:5:5: error: to match this '('
#if (412 + 42
^
t.c:10:10: warning: left side of operator converted from negative value to unsigned: -42 to 18446744073709551574
#if (-42 + 0U) / -2
~~~ ^ ~~
t.c:10:16: warning: right side of operator converted from negative value to unsigned: -2 to 18446744073709551614
#if (-42 + 0U) / -2
~~~~~~~~~~ ^ ~~
5 diagnostics generated.
llvm-svn: 50638
clang.cpp: InitializePreprocessor now makes a copy of the contents of PredefinesBuffer and
passes it to the preprocessor object.
clang.cpp: DriverPreprocessorFactory now calls "InitializePreprocessor" instead of this being done in main().
html::HighlightMacros() now takes a PreprocessorFactory, allowing it to conjure up a new
Preprocessor to highlight macros.
class HTMLDiagnostics now takes a PreprocessorFactory* that it can use for html::HighlightMacros().
Updated clients of HTMLDiagnostics to use this new interface.
llvm-svn: 49875
theoretically useful, but not useful in practice. It adds a bunch of
complexity, and not much value. It's best to nuke it. One big advantage
is that it means the target interfaces will soon lose their SLoc arguments
and target queries can never emit diagnostics anymore (yay). Removing this
also simplifies some of the core preprocessor which should make it slightly
faster.
Ted, I didn't simplify TripleProcessor, which can now have at most one
triple, and can probably just be removed. Please poke at it when you have
time.
llvm-svn: 47930
- More enum signeness bitfield fixes (MSVC treats enums as signed).
- Fixed in Lex/HeaderSearch.cpp an unsafe copy between two
HeaderSearch::PerFileInfo entries in a common vector. The copy involved two
calls to getFileInfo() within the assignment; these calls could have
side-effects that enlarged the internal vector, and with MSVC this would
invalidate one of the values in the assignment.
Patch by Argiris Kirtzidis!
llvm-svn: 47536
Since this behavior is useful for most classes, we might consider adding a simple 3 method class that implements the behavior. Ted said that Boost has such a class.
llvm-svn: 46654
incorrectly apply the multiple include optimization to files with
guards like:
#if !defined(x) MACRO
where MACRO could expand to different things in different contexts.
Thanks Neil!
llvm-svn: 45716
both Preprocessor and ASTContext, we no longer need to explicitly pass
MainFileID around in function calls that also pass either Preprocessor or
ASTContext. This resulted in some nice cleanups in the ASTConsumers and the
driver.
llvm-svn: 45228
contents of the header map. Look ma, no assumptions about input data
here (aka, corrupt header maps can't crash the compiler - crazy thought).
llvm-svn: 45122
the internal representation. This also fixes a bug where -I foo -F foo would
not search foo as both a normal and framework include dir.
llvm-svn: 45092
NumericLiteralParser::GetFloatValue(). Upon method return, this flag has the value
true if the returned APFloat can exactly represent the number in the parsed text,
and false otherwise.
Modified the implementation of GetFloatValue() to parse literals using APFloat's
convertFromString method (which allows us to set the value of isExact).
llvm-svn: 44339
predefined macros. Previously, these were handled by the driver,
now they are handled by the preprocessor.
Some fallout of this:
1. Instead of preprocessing two buffers (the predefines, then the
main source file) we now start preprocessing the main source
file and inject the predefines as a "psuedo #include" from the
main source file.
2. #1 allows us to nuke the Lexer::IsMainFile flag and simplify
Preprocessor::isInPrimaryFile.
3. The driver doesn't have to know about standard #defines, the
preprocessor knows, which is nice for people wanting to define
their own drivers.
4. This allows us to put normal tokens in the predefine buffer,
for example a definition for __builtin_va_list that is
target-specific, and a typedef for id in objc.
llvm-svn: 42818
stuff like this:
// If we don't have a comma, it is either the end of the list (a ';') or
// an error, bail out.
if (Tok.isNot(tok::comma))
break;
instead of:
// If we don't have a comma, it is either the end of the list (a ';') or
// an error, bail out.
if (Tok.getKind() != tok::comma)
break;
There is obviously no functionality change, but the code reads a bit better and is
more terse.
llvm-svn: 42795
Now instead of IdentifierInfo knowing anything about MacroInfo,
only the preprocessor knows. This makes MacroInfo truly private
to the Lex library (and its direct clients) instead of being
accessed in the Basic library.
llvm-svn: 42727
(because all bitfields now fit in 32 bits). This shrinks the identifier
table for carbon.h from 1634428 to 1451424 bytes (12%) and has no impact
on compile time.
llvm-svn: 42723
moves the MacroInfo pointer to a side hash table (which currently
lives in IdentifierTable.cpp). This removes a pointer from
Identifier info, but doesn't shrink it, as it requires a new bit
be added. This strange approach with the 'hasmacro' bit is needed
to not lose preprocessor performance.
llvm-svn: 42722
fit in 32-bits, shrinking IdentifierInfo by a word.
This shrinks the total size of the identifier pool from
1817264 to 1634428 bytes (11%) on carbon.h.
llvm-svn: 42719
number of arguments now and does the right thing, but the nullary/unary
accessors are preserved as convenience functions. This allows us to
slightly simplify clients.
llvm-svn: 42716
Selector's instead of requiring void* to be used. I converted one use of
DenseSet<void*> over to use DenseSet<Selector> but the others should change
as well.
llvm-svn: 42645
- Add SelectorTable, which enables us to remove MultiKeywordSelector from the public header.
- Remove FoldingSet from IdentifierInfo.h and Preprocessor.h.
- Remove Parser::ObjcGetUnarySelector and Parser::ObjcGetKeywordSelector, they are subsumed by SelectorTable.
- Add MultiKeywordSelector to IdentifierInfo.cpp.
- Move a bunch of selector related methods from ParseObjC.cpp to IdentifierInfo.cpp.
- Added some comments.
llvm-svn: 42643
This motivated implementing a devious clattner inspired solution:-)
This approach uses a small value "Selector" class to point to an IdentifierInfo for the 0/1 case. For multi-keyword selectors, we instantiate a MultiKeywordSelector object (previously known as SelectorInfo). Now, the incremental cost for selectors is only 24,800 for Cocoa.h! This saves 156,592 bytes, or 86%!! The size reduction is also the result of getting rid of the AST slot, which was not strictly necessary (we will associate a selector with it's method using another table...most likely in Sema).
This change was critical to make now, before we have too many clients.
I still need to add some comments to the Selector class...will likely add later today/tomorrow.
llvm-svn: 42452
#1: It is cleaner. I never "liked" storing keyword selectors (i.e. foo:bar:baz) in the IdentifierTable.
#2: It is more space efficient. Since Cocoa keyword selectors can be quite long, this technique is space saving. For Cocoa.h, pulling the keyword selectors out saves ~180k. The cost of the SelectorInfo data is ~100k. Saves ~80k, or 43%.
#3: It results in many API simplifications. Here are some highlights:
- Removed 3 actions (ActOnKeywordMessage, ActOnUnaryMessage, & one flavor of ObjcBuildMethodDeclaration that was specific to unary messages).
- Removed 3 funky structs from DeclSpec.h (ObjcKeywordMessage, ObjcKeywordDecl, and ObjcKeywordInfo).
- Removed 2 ivars and 2 constructors from ObjCMessageExpr (fyi, this space savings has not been measured).
I am happy with the way it turned out (though it took a bit more hacking than I expected). Given the central role of selectors in ObjC, making sure this is "right" will pay dividends later.
Thanks to Chris for talking this through with me and suggesting this approach.
llvm-svn: 42395
Rationale:
We currently have a separate table to unique ObjC selectors. Since I don't need all the instance data in IdentifierInfo, I thought this would save space (and make more sense conceptually).
It turns out the cost of having duplicate entries for unary selectors (i.e. names without colons) outweighs the cost difference between the IdentifierInfo & SelectorInfo structures. Here is the data:
Two tables:
*** Selector/Identifier Stats:
# Selectors/Identifiers: 51635
Bytes allocated: 1999824
One table:
*** Identifier Table Stats:
# Identifiers: 49500
Bytes allocated: 1990316
llvm-svn: 42139
- Add SelectorInfo/SelectorTable classes, modeled after IdentifierInfo/IdentifierTable.
- Add SelectorTable instance to ASTContext, created lazily through ASTContext::getSelectorInfo().
- Add SelectorInfo slot to ObjcMethodDecl.
- Add helper function to derive a SelectorInfo from ObjcKeywordInfo.
Misc: Got the Decl stats stuff up and running again...it was missing support for ObjC AST's.
llvm-svn: 42023
2) Add support for lexing imaginary constants (a GCC extension):
t.c:5:10: warning: imaginary constants are an extension
A = 1.0iF;
^
3) Make the 'invalid suffix' diagnostic pointer more accurate:
t.c:6:10: error: invalid suffix 'qF' on floating constant
A = 1.0qF;
^
instead of:
t.c:6:10: error: invalid suffix 'qF' on floating constant
A = 1.0qF;
^
llvm-svn: 41411
memorybuffer instead of a pointer to the memorybuffer itself. This
reduces coupling and eliminates a pointer dereference on a hot path.
This speeds up -Eonly on 483.xalancbmk by 2.1%
llvm-svn: 40394
fileid/offset pair, it now contains a bit discriminating between
mapped locations and file locations. This separates the tables for
macros and files in SourceManager, and allows better separation of
concepts in the rest of the compiler. This allows us to have *many*
macro instantiations before running out of 'addressing space'.
This is also more efficient, because testing whether something is a
macro expansion is now a bit test instead of a table lookup (which
also used to require having a srcmgr around, now it doesn't).
This is fully functional, but there are several refinements and
optimizations left.
llvm-svn: 40103
specifying the start of a token and a logical (phase 3) character number,
returns a sloc representing the input character corresponding to it.
llvm-svn: 39905
out of the llvm namespace. This makes the clang namespace be a sibling of
llvm instead of being a child.
The good thing about this is that it makes many things unambiguous. The
bad things is that many things in the llvm namespace (notably data structures
like smallvector) now require an llvm:: qualifier. IMO, libsystem and libsupport
should be split out of llvm into their own namespace in the future, which will fix
this issue.
llvm-svn: 39659
Submitted by:
Reviewed by:
Fixed a bug in the parser's handling of attributes on pointer declarators.
For example, the following code was producing a syntax error...
int *__attribute(()) foo;
attrib.c:10:25: error: expected identifier or '('
int *__attribute(()) foo;
^
Changed Parser::ParseTypeQualifierListOpt to not consume the token following
an attribute declaration.
Also added LexerToken::getName() convenience method...useful when tracking
down errors like this.
llvm-svn: 39602
Submitted by:
Reviewed by:
Move string literal parsing from Sema=>LiteralSupport. This consolidates
all the quirky parsing code within the Lexer subsystem (yeah!). This
simplifies Sema and (more importantly) allows future parsers
(i.e. subclasses of Action) to benefit from this code.
llvm-svn: 39354
Submitted by:
Reviewed by:
-Converted the preprocessor to use NumericLiteralParser.
-Several minor changes to LiteralSupport interface/implementation.
-Added an error diagnostic for floating point usage in pp expr's.
llvm-svn: 39352
Submitted by:
Reviewed by:
Moved numeric literal support from SemaExpr.cpp to LiteralSupport.[h,cpp]
in Lex. This will allow it to be used by both Sema and Preprocessor (and
should be the last major refactoring of this sub-system).. Over
time, it will be reused by anyone implementing an actions module (i.e.
any subclass of llvm::clang::Action. Minor changes to IntegerLiteral in Expr.h.
More to come...
llvm-svn: 39351
whose decl objects are lazily created the first time they are referenced.
Builtin functions are described by the clang/AST/Builtins.def file, which
makes it easy to add new ones.
This is missing two important pieces:
1. Support for the rest of the gcc builtins.
2. Support for target-specific builtins (e.g. __builtin_ia32_emms).
Just adding this builtins reduces the number of implicit function definitions
by 6, reducing the # diagnostics from 550 to 544 when parsing carbon.h.
I need to add all the i386-specific ones to eliminate several hundred more.
ugh.
llvm-svn: 39327
of having a loose collection of function pointers. This also allows clients to
maintain state, and reduces the size of the Preprocessor.h interface.
llvm-svn: 39203
filenames (and also '#pragma GCC dependency' of course). Now, assuming
no cleaning is needed, we can go all the way from lexing the filename to
doing filename lookups with no mallocs. This speeds up user PP time from
0.077 to 0.075s for Cocoa.h (2.6%).
llvm-svn: 39092
argument tokens into instead of a real vector. This avoids some malloc
traffic in common cases. In an "abusive macro expansion" testcase, this
reduced -Eonly time by 25%.
llvm-svn: 38757
is an intra-lexer property, not a inter-lexer property, so it makes sense
for it to be define here. It also makes no sense for macros, and allows us
to define it more carefully in the header.
While I'm at it, improve comments and structuring in Lexer.h
llvm-svn: 38701
that are lexed are made up of tokens, so the calls are just ugly and redundant.
Hook up the MIOpt for the #if case. PPCExpressions doesn't currently implement
the hook though, so we still don't handle #if !defined(X) with the MIOpt.
llvm-svn: 38649
Now, instead of keeping a pointer to the start of the token in memory, we keep the
start of the token as a SourceLocation node. This means that each LexerToken knows
the full include stack it was created with, and means that the LexerToken isn't
reliant on a "CurLexer" member to be around (lexer tokens would previously go out of
scope when their lexers were deallocated).
This simplifies several things, and forces good cleanup elsewhere. Now the
Preprocessor is the one that knows how to dump tokens/macros and is the one that
knows how to get the spelling of a token (it has all the context).
llvm-svn: 38551