definitions from a precompiled header. This ensures that
code-completion with macro names behaves the same with or without
precompiled headers.
llvm-svn: 92497
We creating and free thousands of MacroArgs objects (and the related
std::vectors hanging off them) for the testcase in PR5610 even though
there are only ~20 live at a time. This doesn't actually use the
cache yet.
llvm-svn: 91391
files with the contents of an arbitrary memory buffer. Use this new
functionality to drastically clean up the way in which we handle file
truncation for code-completion: all of the truncation/completion logic
is now encapsulated in the preprocessor where it belongs
(<rdar://problem/7434737>).
llvm-svn: 90300
in diagnostics when we fail to open a file. This allows us to
report things like:
$ clang test.c -I.
test.c:2:10: fatal error: error opening file './foo.h': Permission denied
#include "foo.h"
^
llvm-svn: 90276
stat a file but where mmaping it fails. In this case, we emit an
error like:
t.c:1:10: fatal error: error opening file '../../foo.h'
instead of "cannot find file".
llvm-svn: 90110
handling template template parameters properly. This refactoring:
- Parses template template arguments as id-expressions, representing
the result of the parse as a template name (Action::TemplateTy)
rather than as an expression (lame!).
- Represents all parsed template arguments via a new parser-specific
type, ParsedTemplateArgument, which stores the kind of template
argument (type, non-type, template) along with all of the source
information about the template argument. This replaces an ad hoc
set of 3 vectors (one for a void*, which was either a type or an
expression; one for a bit telling whether the first was a type or
an expression; and one for a single source location pointing at
the template argument).
- Moves TemplateIdAnnotation into the new Parse/Template.h. It never
belonged in the Basic library anyway.
llvm-svn: 86708
a little fuzzy, but conceptually it's just uniquing the identifier.
Chris, please review. I debated splitting into const/non-const versions where
the const one propogated constness to the resulting IdentifierInfo*.
llvm-svn: 86106
operators, e.g.,
operator+<int>
which now works in declarators, id-expressions, and member access
expressions. This commit only implements the non-dependent case, where
we can resolve the template-id to an actual declaration.
llvm-svn: 85966
-code-completion-at=filename:line:column
which performs code completion at the specified location by truncating
the file at that position and enabling code completion. This approach
makes it possible to run multiple tests from a single test file, and
gives a more natural command-line interface.
llvm-svn: 82571
essence, code completion is triggered by a magic "code completion"
token produced by the lexer [*], which the parser recognizes at
certain points in the grammar. The parser then calls into the Action
object with the appropriate CodeCompletionXXX action.
Sema implements the CodeCompletionXXX callbacks by performing minimal
translation, then forwarding them to a CodeCompletionConsumer
subclass, which uses the results of semantic analysis to provide
code-completion results. At present, only a single, "printing" code
completion consumer is available, for regression testing and
debugging. However, the design is meant to permit other
code-completion consumers.
This initial commit contains two code-completion actions: one for
member access, e.g., "x." or "p->", and one for
nested-name-specifiers, e.g., "std::". More code-completion actions
will follow, along with improved gathering of code-completion results
for the various contexts.
[*] In the current -code-completion-dump testing/debugging mode, the
file is truncated at the completion point and EOF is translated into
"code completion".
llvm-svn: 82166
declaration in the AST.
The new ASTContext::getCommentForDecl function searches for a comment
that is attached to the given declaration, and returns that comment,
which may be composed of several comment blocks.
Comments are always available in an AST. However, to avoid harming
performance, we don't actually parse the comments. Rather, we keep the
source ranges of all of the comments within a large, sorted vector,
then lazily extract comments via a binary search in that vector only
when needed (which never occurs in a "normal" compile).
Comments are written to a precompiled header/AST file as a blob of
source ranges. That blob is only lazily loaded when one requests a
comment for a declaration (this never occurs in a "normal" compile).
The indexer testbed now supports comment extraction. When the
-point-at location points to a declaration with a Doxygen-style
comment, the indexer testbed prints the associated comment
block(s). See test/Index/comments.c for an example.
Some notes:
- We don't actually attempt to parse the comment blocks themselves,
beyond identifying them as Doxygen comment blocks to associate them
with a declaration.
- We won't find comment blocks that aren't adjacent to the
declaration, because we start our search based on the location of
the declaration.
- We don't go through the necessary hops to find, for example,
whether some redeclaration of a declaration has comments when our
current declaration does not. Similarly, we don't attempt to
associate a \param Foo marker in a function body comment with the
parameter named Foo (although that is certainly possible).
- Verification of my "no performance impact" claims is still "to be
done".
llvm-svn: 74704
Implement support for C++ Substitution Failure Is Not An Error
(SFINAE), which says that errors that occur during template argument
deduction do *not* produce diagnostics and do not necessarily make a
program ill-formed. Instead, template argument deduction silently
fails. This is currently implemented for template argument deduction
during matching of class template partial specializations, although
the mechanism will also apply to template argument deduction for
function templates. The scheme is simple:
- If we are in a template argument deduction context, any diagnostic
that is considered a SFINAE error (or warning) will be
suppressed. The error will be propagated up the call stack via the
normal means.
- By default, all warnings and errors are SFINAE errors. Add the
NoSFINAE class to a diagnostic in the .td file to make it a hard
error (e.g., for access-control violations).
Note that, to make this fully work, every place in Sema that emits an
error *and then immediately recovers* will need to check
Sema::isSFINAEContext() to determine whether it must immediately
return an error rather than recovering.
llvm-svn: 73332
preprocessor and initialize it early in clang-cc. This
ensures that __has_builtin works in all modes, not just
when ASTContext is around.
llvm-svn: 73319
builtin preprocessor macro. This appears to work with two caveats:
1) builtins are registered in -E mode, and 2) target-specific builtins
are unconditionally registered even if they aren't supported by the
target (e.g. SSE4 builtin when only SSE1 is enabled).
llvm-svn: 73289
PCH file. In the Cocoa-prefixed "Hello, World" benchmark, this takes
us from reading 503 identifiers down to 37 and from 470 macros down to
4. It also results in an 8% performance improvement.
llvm-svn: 70094
This allows it to accurately measure tokens, so that we get:
t.cpp:8:13: error: unknown type name 'X'
static foo::X P;
~~~~~^
instead of the woefully inferior:
t.cpp:8:13: error: unknown type name 'X'
static foo::X P;
~~~~ ^
Most of this is just plumbing to push the reference around.
llvm-svn: 69099
buffer generated for the current translation unit. If they are
different, complain and then ignore the PCH file. This effectively
checks for all compilation options that somehow would affect
preprocessor state (-D, -U, -include, the dreaded -imacros, etc.).
When we do accept the PCH file, throw away the contents of the
predefines buffer rather than parsing them, since all of the results
of that parsing are already stored in the PCH file. This eliminates
the ugliness with the redefinition of __builtin_va_list, among other
things.
llvm-svn: 68838
PCH. This works now, except for limitations not being able to do things
with identifiers. The basic example in the testcase works though.
llvm-svn: 68832
improvement, source locations read from the PCH file will properly
resolve to the source files that were used to build the PCH file
itself.
Once we have the preprocessor state stored in the PCH file, source
locations that refer to macro instantiations that occur in the PCH
file should have the appropriate instantiation information.
llvm-svn: 68758
- Make the Diagnostic::Level for PTH errors to be specified by the caller
clang (driver):
- Set the PTHManager diagnostic level to "Diagnostic::Error" for -include-pth
(a hard error) and Diagnostic::Warning for -token-cache (we can still
proceed).
llvm-svn: 67462
std::vector<int>::allocator_type
When we parse a template-id that names a type, it will become either a
template-id annotation (which is a parsed representation of a
template-id that has not yet been through semantic analysis) or a
typename annotation (where semantic analysis has resolved the
template-id to an actual type), depending on the context. We only
produce a type in contexts where we know that we only need type
information, e.g., in a type specifier. Otherwise, we create a
template-id annotation that can later be "upgraded" by transforming it
into a typename annotation when the parser needs a type. This occurs,
for example, when we've parsed "std::vector<int>" above and then see
the '::' after it. However, it means that when writing something like
this:
template<> class Outer::Inner<int> { ... };
We have two tokens to represent Outer::Inner<int>: one token for the
nested name specifier Outer::, and one template-id annotation token
for Inner<int>, which will be passed to semantic analysis to define
the class template specialization.
Most of the churn in the template tests in this patch come from an
improvement in our error recovery from ill-formed template-ids.
llvm-svn: 65467
to being allocated from the same bumpptr that the MacroInfo objects
themselves are.
This speeds up -Eonly cocoa.h pth by ~4%, fsyntax-only is barely measurable.
llvm-svn: 65195
escapes in the string for subtoken positioning. This gives
us working examples like:
t.m:5:16: warning: field width should have type 'int', but argument has type 'unsigned int'
printf("\n\n%*d", (unsigned) 1, 1);
^ ~~~~~~~~~~~~
where before the caret pointed two spaces to the left.
llvm-svn: 64940
Now instead of just tracking the expansion history, also track the full
range of the macro that got replaced. For object-like macros, this doesn't
change anything. For _Pragma and function-like macros, this means we track
the locations of the ')'.
This is required for PR3579 because apparently GCC uses the line of the ')'
of a function-like macro as the location to expand __LINE__ to.
llvm-svn: 64601
to use this stat information in the PTH file using a 'StatSysCallCache' object.
Performance impact (Cocoa.h, PTH):
- number of stat calls reduces from 1230 to 425
- fsyntax-only: time improves by 4.2%
We can reduce the number of stat calls to almost zero by caching negative stat
calls and directory stat calls in the PTH file as well.
llvm-svn: 64353
actually *slightly* slower than the binary search. Since this is algorithmically
better, further performance tuning should be able to make this faster.
llvm-svn: 64326
Performance impact for -fsyntax-only on Cocoa.h (with Cocoa.h in the PTH file):
- PTH generation time improves by 5%
- PTH reading improves by 0.3%.
llvm-svn: 63072
Token now has a class of kinds for "literals", which include
numeric constants, strings, etc. These tokens can optionally have
a pointer to the start of the token in the lexer buffer. This
makes it faster to get spelling and do other gymnastics, because we
don't have to go through source locations.
This change is performance neutral, but will make other changes
more feasible down the road.
llvm-svn: 63028
ground work for implementing #line, and fixes the "out of macro ID's"
problem.
There is nothing particularly tricky about the code, other than the
very performance sensitive SourceManager::getFileID() method.
llvm-svn: 62978
method. This lets us clean up the interface and make it more obvious that
this method is *really really* _Pragma specific.
Note that _Pragma handling uglifies the Lexer in the critical path. It would
be very interesting to consider making _Pragma remapping be a new special
lexer class of its own.
llvm-svn: 62425
"FileID" a concept that is now enforced by the compiler's type checker
instead of yet-another-random-unsigned floating around.
This is an important distinction from the "FileID" currently tracked by
SourceLocation. *That* FileID may refer to the start of a file or to a
chunk within it. The new FileID *only* refers to the file (and its
#include stack and eventually #line data), it cannot refer to a chunk.
FileID is a completely opaque datatype to all clients, only SourceManager
is allowed to poke and prod it.
llvm-svn: 62407
the "physical" location of tokens, refer to the "spelling" location.
This is more concrete and useful, tokens aren't really physical objects!
llvm-svn: 62309
- IdentifierInfo can now (optionally) have its string data not be
co-located with itself. This is for use with PTH. This aspect is a
little gross, as getName() and getLength() now make assumptions
about a possible alternate representation of IdentifierInfo.
Perhaps we should make IdentifierInfo have virtual methods?
IdentifierTable:
- Added class "IdentifierInfoLookup" that can be used by
IdentifierTable to perform "string -> IdentifierInfo" lookups using
an auxilliary data structure. This is used by PTH.
- Perform tests show that IdentifierTable::get() does not slow down
because of the extra check for the IdentiferInfoLookup object (the
regular StringMap lookup does enough work to mitigate the impact of
an extra null pointer check).
- The upshot is that now that some IdentifierInfo objects might be
owned by the IdentiferInfoLookup object. This should be reviewed.
PTH:
- Modified PTHManager::GetIdentifierInfo to *not* insert entries in
IdentifierTable's string map, and instead create IdentifierInfo
objects on the fly when mapping from persistent IDs to
IdentifierInfos. This saves a ton of work with string copies,
hashing, and StringMap lookup and resizing. This change was
motivated because when processing source files in the PTH cache we
don't need to do any string -> IdentifierInfo lookups.
- PTHManager now subclasses IdentifierInfoLookup, allowing clients of
IdentifierTable to transparently use IdentifierInfo objects managed
by the PTH file. PTHManager resolves "string -> IdentifierInfo"
queries by doing a binary search over a sorted table of identifier
strings in the PTH file (the exact algorithm we use can be changed
as needed).
These changes lead to the following performance changes when using PTH on Cocoa.h:
- fsyntax-only: 10% performance improvement
- Eonly: 30% performance improvement
llvm-svn: 62273
- Use canonical FileID when using getSpelling() caching. This
addresses some cache misses we were seeing with -fsyntax-only on
Cocoa.h
- Added Preprocessor::getPhysicalCharacterAt() utility method for
clients to grab the first character at a specified sourcelocation.
This uses the PTH spelling cache.
- Modified Sema::ActOnNumericConstant() to use
Preprocessor::getPhysicalCharacterAt() instead of
SourceManager::getCharacterData() (to get PTH hits).
These changes cause -fsyntax-only to not page in any sources from
Cocoa.h. We see a speedup of 27%.
llvm-svn: 62193
- Refactor caching logic into a helper class PTHSpellingSearch
- Allow "random accesses" in the spelling cache, thus catching the remaining
cases where 'getSpelling' wasn't hitting the PTH cache
For -Eonly, PTH, Cocoa.h:
- This reduces wall time by 3% (user time unchanged, sys time reduced)
- This reduces the amount of paged source by 1112K.
The remaining 1112K still being paged in is from somewhere else
(investigating).
llvm-svn: 62009
performance gain. Here's what we see for -Eonly on Cocoa.h (using PTH):
- wall time decreases by 21% (26% speedup overall)
- system time decreases by 35%
- user time decreases by 6%
These reductions are due to not paging source files just to get spellings for
literals. The solution in place doesn't appear to be 100% yet, as we still see
some of the pages for source files getting mapped in. Using -print-stats, we see
that SourceManager maps in 7179K less bytes of source text (reduction of 75%).
Will investigate why the remaining 25% are getting paged in.
With these changes, here's how PTH compares to non-PTH on Cocoa.h:
-Eonly: PTH takes 64% of the time as non-PTH (54% speedup)
-fsyntax-only: PTH takes 89% of the time as non-PTH (11% speedup)
llvm-svn: 61913
- Added stub PTHLexer::getSpelling() that will be used for fetching cached
spellings from the PTH file. This doesn't do anything yet.
- Added a hook in Preprocessor::getSpelling() to call PTHLexer::getSpelling()
when using a PTHLexer.
- Updated PTHLexer to read the offsets of spelling tables in the PTH file.
llvm-svn: 61911
- In PTHLexer::Lex read all of the token data from PTH file before
constructing the token. The idea is to enhance locality.
- Do not use Read8/Read32 in PTHLexer::Lex. Inline these operations manually.
- Change PTHManager::ReadIdentifierInfo() to PTHManager::GetIdentifierInfo().
They are functionally the same except that PTHLexer::Lex() reads the
persistent id.
These changes result in a 3.3% speedup for PTH on Cocoa.h (-Eonly).
llvm-svn: 61363
- Embed 'eom' tokens in PTH file.
- Use embedded 'eom' tokens to not lazily generate them in the PTHLexer.
This means that PTHLexer can always advance to the next token after
reading a token (instead of buffering tokens using a copy).
- Moved logic of 'ReadToken' into Lex. GetToken & ReadToken no longer exist.
- These changes result in a 3.3% speedup (-Eonly) on Cocoa.h.
- The code is a little gross. Many cleanups are possible and should be done.
llvm-svn: 61360
become useful or correct until we (1) parse template arguments
correctly, (2) have some way to turn template-ids into types,
declarators, etc., and (3) have a real representation of templates.
llvm-svn: 61208
- Added virtual method 'getSourceLocation()' (no arguments) that gets the location of the next "observable" location (e.g., next character, next token).
PPLexerChange.cpp:
- Implemented FIXME by using PreprocessorLexer::getSourceLocation() to get the location in the file we are returning to after lexing a #included file. This appears to be slightly faster than having the branch (i.e., 'if(CurLexer)'). It's also not a really hot part of the Preprocessor.
llvm-svn: 60860
- Added method "setPTHManager" that will be called by the driver to install
a PTHManager for the Preprocessor.
- Fixed some comments.
- Added EnterSourceFileWithPTH to mirror EnterSourceFileWithLexer.
llvm-svn: 60437
its call sites. This makes it more explicit when the hasError flag is
getting set and removes a confusing difference in behavior between
PP.Diag and Diag in this code.
llvm-svn: 59863
(and carefully calculated) effect of allowing the compiler to reason
about the aliasing properties of DiagnosticBuilder object better,
allowing the whole thing to be promoted to registers instead of
resulting in a ton of stack traffic.
While I'm not very concerned about the performance of the Diag() method
invocations, I *am* more concerned about their code size and impact on the
non-diagnostic code. This patch shrinks the clang executable (in
release-asserts mode with gcc-4.2) from 14523980 to 14519816 bytes. This
isn't much, but it shrinks the lexer from 38192 to 37776, PPDirectives.o
from 31116 to 28868 bytes, etc.
llvm-svn: 59862
one for building up the diagnostic that is in flight (DiagnosticBuilder)
and one for pulling structured information out of the diagnostic when
formatting and presenting it.
There is no functionality change with this patch.
llvm-svn: 59849
- Move out logic for handling the end-of-file to LexEndOfFile (to match the Lexer) class. The logic now mirrors the Lexer class more, which allows us to pass most of the Preprocessor test cases.
llvm-svn: 59768
- Rename 'CurToken' and 'LastToken' to 'CurTokenIdx' and 'LastTokenIdx'
respectively.
- Add helper methods GetToken(), AdvanceToken(), AtLastToken() to abstract away
details of the token stream. This also allows us to easily replace their
implementation later.
llvm-svn: 59733
- This is fairly gross but although the code is conceptually the
same, introducting the union causes gcc 4.2 on x86 (darwin, if that
matters) to pessimize LexTokenInternal which is critical to our
preprocessor performance.
This speeds up -Eonly lexing of Cocoa.h by ~4.7% in my timings and
reduces the code size of LexTokenInternal by 8.6%.
llvm-svn: 59725
- Add variants of IsNonPragmaNonMacroLexer to accept an IncludeMacroStack entry
(simplifies some uses).
- Use IsNonPragmaNonMacroLexer in Preprocessor::LookupFile.
- Add 'FileID' to PreprocessorLexer, and have Preprocessor query this fileid
when looking up the FileEntry for a file
Performance testing of -Eonly on Cocoa.h shows no performance regression because
of this patch.
llvm-svn: 59666
With this snippet:
void f(a::b);
An assert is hit:
Assertion failed: CachedTokens[CachedLexPos-1].getLocation() == Tok.getAnnotationEndLoc() && "The annotation should be until the most recent cached token", file ..\..\lib\Lex\PPCaching.cpp, line 98
Introduce Preprocessor::RevertCachedTokens that reverts a specific number of tokens when backtracking is enabled.
llvm-svn: 59636
- Add variants of IsNonPragmaNonMacroLexer to accept an IncludeMacroStack entry
(simplifies some uses).
- Use IsNonPragmaNonMacroLexer in Preprocessor::LookupFile.
Performance testing of -Eonly on Cocoa.h shows no performance regression because
of this patch.
llvm-svn: 59574
- Add static method to test if the current lexer is a non-macro/non-pragma
lexer.
- Refactor some code in PPLexerChange to use this static method.
- No performance change.
llvm-svn: 59486
PreprocessorLexer, which will either be a 'Lexer' or 'PTHLexer'.
- Added stub field 'CurPTHLexer' to keep track of the current PTHLexer.
- Modified IncludeStackInfo to track both the current PTHLexer and
current PreprocessorLexer.
llvm-svn: 59472
representing the names of declarations in the C family of
languages. DeclarationName is used in NamedDecl to store the name of
the declaration (naturally), and ObjCMethodDecl is now a NamedDecl.
llvm-svn: 59441
PreprocessorLexer now has a virtual method "IndirectLex" which allows it to call the lex method of its subclasses. This is not for performance intensive operations.
llvm-svn: 59185
even whitespace, as tokens from the file. This is enabled with
L->SetKeepWhitespaceMode(true) on a raw lexer. In this mode, you too
can use clang as a really complex version of 'cat' with code like this:
Lexer RawLex(SourceLocation::getFileLoc(SM.getMainFileID(), 0),
PP.getLangOptions(), File.first, File.second);
RawLex.SetKeepWhitespaceMode(true);
Token RawTok;
RawLex.LexFromRawLexer(RawTok);
while (RawTok.isNot(tok::eof)) {
std::cout << PP.getSpelling(RawTok);
RawLex.LexFromRawLexer(RawTok);
}
This will emit exactly the input file, with no canonicalization or other
translation. Realistic clients actually do something with the tokens of
course :)
llvm-svn: 57401
same we we do an unterminated string or character literal. This makes
it so we can guarantee that the lexer never calls into the
preprocessor (which would be suicide for a raw lexer).
llvm-svn: 57395
using LexRawToken, create one and use LexFromRawLexer. This avoids
twiddling the RawLexer flag around and simplifies some code (even
speeding raw lexing up a tiny bit).
This change also improves the token paster to use a Lexer on the stack
instead of new/deleting it.
llvm-svn: 57393
to whether the fileid is a 'extern c system header' in addition to whether it
is a system header, most of this is spreading plumbing around. Once we have that,
PPLexerChange bases its "file enter/exit" notifications to PPCallbacks to
base the system header state on FileIDInfo instead of HeaderSearch. Finally,
in Preprocessor::HandleIncludeDirective, mirror logic in GCC: the system headerness
of a file being entered can be set due to the #includer or the #includee.
llvm-svn: 56688
- Added as private members for each because it is not clear where to
put the common definition. Perhaps the IdentifierInfos all of these
"pseudo-keywords" should be collected into one place (this would
KnownFunctionIDs and Objective-C property IDs, for example).
Remove Token::isNamedIdentifier.
- There isn't a good reason to use strcmp when we have interned
strings, and there isn't a good reason to encourage clients to do
so.
llvm-svn: 54794
* Move FormatError() from TextDiagnostic up to DiagClient, remove now
empty class TextDiagnostic
* Make DiagClient optional for Diagnostic
This fixes the following problems:
* -html-diags (and probably others) does now output the same set of
warnings as console clang does
* nothing crashes if one forgets to call setHeaderSearch() on
TextDiagnostic
* some code duplication is removed
llvm-svn: 54620
1) New public methods added:
-EnableBacktrackAtThisPos
-DisableBacktrack
-Backtrack
-isBacktrackEnabled
2) LookAhead() implementation is replaced with a more efficient one.
3) LookNext() is removed.
llvm-svn: 54611
related to pp-expressions. Doing so is pretty simple and this
patch implements it, yielding nice diagnostics like:
t.c:2:7: error: division by zero in preprocessor expression
#if 1 / (0 + 0)
~ ^ ~~~~~~~
t.c:5:14: error: expected ')' in preprocessor expression
#if (412 + 42
~~~~~~~~^
t.c:5:5: error: to match this '('
#if (412 + 42
^
t.c:10:10: warning: left side of operator converted from negative value to unsigned: -42 to 18446744073709551574
#if (-42 + 0U) / -2
~~~ ^ ~~
t.c:10:16: warning: right side of operator converted from negative value to unsigned: -2 to 18446744073709551614
#if (-42 + 0U) / -2
~~~~~~~~~~ ^ ~~
5 diagnostics generated.
llvm-svn: 50638
clang.cpp: InitializePreprocessor now makes a copy of the contents of PredefinesBuffer and
passes it to the preprocessor object.
clang.cpp: DriverPreprocessorFactory now calls "InitializePreprocessor" instead of this being done in main().
html::HighlightMacros() now takes a PreprocessorFactory, allowing it to conjure up a new
Preprocessor to highlight macros.
class HTMLDiagnostics now takes a PreprocessorFactory* that it can use for html::HighlightMacros().
Updated clients of HTMLDiagnostics to use this new interface.
llvm-svn: 49875
theoretically useful, but not useful in practice. It adds a bunch of
complexity, and not much value. It's best to nuke it. One big advantage
is that it means the target interfaces will soon lose their SLoc arguments
and target queries can never emit diagnostics anymore (yay). Removing this
also simplifies some of the core preprocessor which should make it slightly
faster.
Ted, I didn't simplify TripleProcessor, which can now have at most one
triple, and can probably just be removed. Please poke at it when you have
time.
llvm-svn: 47930
- More enum signeness bitfield fixes (MSVC treats enums as signed).
- Fixed in Lex/HeaderSearch.cpp an unsafe copy between two
HeaderSearch::PerFileInfo entries in a common vector. The copy involved two
calls to getFileInfo() within the assignment; these calls could have
side-effects that enlarged the internal vector, and with MSVC this would
invalidate one of the values in the assignment.
Patch by Argiris Kirtzidis!
llvm-svn: 47536
Since this behavior is useful for most classes, we might consider adding a simple 3 method class that implements the behavior. Ted said that Boost has such a class.
llvm-svn: 46654
incorrectly apply the multiple include optimization to files with
guards like:
#if !defined(x) MACRO
where MACRO could expand to different things in different contexts.
Thanks Neil!
llvm-svn: 45716
both Preprocessor and ASTContext, we no longer need to explicitly pass
MainFileID around in function calls that also pass either Preprocessor or
ASTContext. This resulted in some nice cleanups in the ASTConsumers and the
driver.
llvm-svn: 45228
contents of the header map. Look ma, no assumptions about input data
here (aka, corrupt header maps can't crash the compiler - crazy thought).
llvm-svn: 45122
the internal representation. This also fixes a bug where -I foo -F foo would
not search foo as both a normal and framework include dir.
llvm-svn: 45092
NumericLiteralParser::GetFloatValue(). Upon method return, this flag has the value
true if the returned APFloat can exactly represent the number in the parsed text,
and false otherwise.
Modified the implementation of GetFloatValue() to parse literals using APFloat's
convertFromString method (which allows us to set the value of isExact).
llvm-svn: 44339
predefined macros. Previously, these were handled by the driver,
now they are handled by the preprocessor.
Some fallout of this:
1. Instead of preprocessing two buffers (the predefines, then the
main source file) we now start preprocessing the main source
file and inject the predefines as a "psuedo #include" from the
main source file.
2. #1 allows us to nuke the Lexer::IsMainFile flag and simplify
Preprocessor::isInPrimaryFile.
3. The driver doesn't have to know about standard #defines, the
preprocessor knows, which is nice for people wanting to define
their own drivers.
4. This allows us to put normal tokens in the predefine buffer,
for example a definition for __builtin_va_list that is
target-specific, and a typedef for id in objc.
llvm-svn: 42818
stuff like this:
// If we don't have a comma, it is either the end of the list (a ';') or
// an error, bail out.
if (Tok.isNot(tok::comma))
break;
instead of:
// If we don't have a comma, it is either the end of the list (a ';') or
// an error, bail out.
if (Tok.getKind() != tok::comma)
break;
There is obviously no functionality change, but the code reads a bit better and is
more terse.
llvm-svn: 42795
Now instead of IdentifierInfo knowing anything about MacroInfo,
only the preprocessor knows. This makes MacroInfo truly private
to the Lex library (and its direct clients) instead of being
accessed in the Basic library.
llvm-svn: 42727
(because all bitfields now fit in 32 bits). This shrinks the identifier
table for carbon.h from 1634428 to 1451424 bytes (12%) and has no impact
on compile time.
llvm-svn: 42723
moves the MacroInfo pointer to a side hash table (which currently
lives in IdentifierTable.cpp). This removes a pointer from
Identifier info, but doesn't shrink it, as it requires a new bit
be added. This strange approach with the 'hasmacro' bit is needed
to not lose preprocessor performance.
llvm-svn: 42722
fit in 32-bits, shrinking IdentifierInfo by a word.
This shrinks the total size of the identifier pool from
1817264 to 1634428 bytes (11%) on carbon.h.
llvm-svn: 42719
number of arguments now and does the right thing, but the nullary/unary
accessors are preserved as convenience functions. This allows us to
slightly simplify clients.
llvm-svn: 42716
Selector's instead of requiring void* to be used. I converted one use of
DenseSet<void*> over to use DenseSet<Selector> but the others should change
as well.
llvm-svn: 42645
- Add SelectorTable, which enables us to remove MultiKeywordSelector from the public header.
- Remove FoldingSet from IdentifierInfo.h and Preprocessor.h.
- Remove Parser::ObjcGetUnarySelector and Parser::ObjcGetKeywordSelector, they are subsumed by SelectorTable.
- Add MultiKeywordSelector to IdentifierInfo.cpp.
- Move a bunch of selector related methods from ParseObjC.cpp to IdentifierInfo.cpp.
- Added some comments.
llvm-svn: 42643
This motivated implementing a devious clattner inspired solution:-)
This approach uses a small value "Selector" class to point to an IdentifierInfo for the 0/1 case. For multi-keyword selectors, we instantiate a MultiKeywordSelector object (previously known as SelectorInfo). Now, the incremental cost for selectors is only 24,800 for Cocoa.h! This saves 156,592 bytes, or 86%!! The size reduction is also the result of getting rid of the AST slot, which was not strictly necessary (we will associate a selector with it's method using another table...most likely in Sema).
This change was critical to make now, before we have too many clients.
I still need to add some comments to the Selector class...will likely add later today/tomorrow.
llvm-svn: 42452
#1: It is cleaner. I never "liked" storing keyword selectors (i.e. foo:bar:baz) in the IdentifierTable.
#2: It is more space efficient. Since Cocoa keyword selectors can be quite long, this technique is space saving. For Cocoa.h, pulling the keyword selectors out saves ~180k. The cost of the SelectorInfo data is ~100k. Saves ~80k, or 43%.
#3: It results in many API simplifications. Here are some highlights:
- Removed 3 actions (ActOnKeywordMessage, ActOnUnaryMessage, & one flavor of ObjcBuildMethodDeclaration that was specific to unary messages).
- Removed 3 funky structs from DeclSpec.h (ObjcKeywordMessage, ObjcKeywordDecl, and ObjcKeywordInfo).
- Removed 2 ivars and 2 constructors from ObjCMessageExpr (fyi, this space savings has not been measured).
I am happy with the way it turned out (though it took a bit more hacking than I expected). Given the central role of selectors in ObjC, making sure this is "right" will pay dividends later.
Thanks to Chris for talking this through with me and suggesting this approach.
llvm-svn: 42395
Rationale:
We currently have a separate table to unique ObjC selectors. Since I don't need all the instance data in IdentifierInfo, I thought this would save space (and make more sense conceptually).
It turns out the cost of having duplicate entries for unary selectors (i.e. names without colons) outweighs the cost difference between the IdentifierInfo & SelectorInfo structures. Here is the data:
Two tables:
*** Selector/Identifier Stats:
# Selectors/Identifiers: 51635
Bytes allocated: 1999824
One table:
*** Identifier Table Stats:
# Identifiers: 49500
Bytes allocated: 1990316
llvm-svn: 42139
- Add SelectorInfo/SelectorTable classes, modeled after IdentifierInfo/IdentifierTable.
- Add SelectorTable instance to ASTContext, created lazily through ASTContext::getSelectorInfo().
- Add SelectorInfo slot to ObjcMethodDecl.
- Add helper function to derive a SelectorInfo from ObjcKeywordInfo.
Misc: Got the Decl stats stuff up and running again...it was missing support for ObjC AST's.
llvm-svn: 42023
2) Add support for lexing imaginary constants (a GCC extension):
t.c:5:10: warning: imaginary constants are an extension
A = 1.0iF;
^
3) Make the 'invalid suffix' diagnostic pointer more accurate:
t.c:6:10: error: invalid suffix 'qF' on floating constant
A = 1.0qF;
^
instead of:
t.c:6:10: error: invalid suffix 'qF' on floating constant
A = 1.0qF;
^
llvm-svn: 41411
memorybuffer instead of a pointer to the memorybuffer itself. This
reduces coupling and eliminates a pointer dereference on a hot path.
This speeds up -Eonly on 483.xalancbmk by 2.1%
llvm-svn: 40394
fileid/offset pair, it now contains a bit discriminating between
mapped locations and file locations. This separates the tables for
macros and files in SourceManager, and allows better separation of
concepts in the rest of the compiler. This allows us to have *many*
macro instantiations before running out of 'addressing space'.
This is also more efficient, because testing whether something is a
macro expansion is now a bit test instead of a table lookup (which
also used to require having a srcmgr around, now it doesn't).
This is fully functional, but there are several refinements and
optimizations left.
llvm-svn: 40103
specifying the start of a token and a logical (phase 3) character number,
returns a sloc representing the input character corresponding to it.
llvm-svn: 39905
out of the llvm namespace. This makes the clang namespace be a sibling of
llvm instead of being a child.
The good thing about this is that it makes many things unambiguous. The
bad things is that many things in the llvm namespace (notably data structures
like smallvector) now require an llvm:: qualifier. IMO, libsystem and libsupport
should be split out of llvm into their own namespace in the future, which will fix
this issue.
llvm-svn: 39659
Submitted by:
Reviewed by:
Fixed a bug in the parser's handling of attributes on pointer declarators.
For example, the following code was producing a syntax error...
int *__attribute(()) foo;
attrib.c:10:25: error: expected identifier or '('
int *__attribute(()) foo;
^
Changed Parser::ParseTypeQualifierListOpt to not consume the token following
an attribute declaration.
Also added LexerToken::getName() convenience method...useful when tracking
down errors like this.
llvm-svn: 39602
Submitted by:
Reviewed by:
Move string literal parsing from Sema=>LiteralSupport. This consolidates
all the quirky parsing code within the Lexer subsystem (yeah!). This
simplifies Sema and (more importantly) allows future parsers
(i.e. subclasses of Action) to benefit from this code.
llvm-svn: 39354
Submitted by:
Reviewed by:
-Converted the preprocessor to use NumericLiteralParser.
-Several minor changes to LiteralSupport interface/implementation.
-Added an error diagnostic for floating point usage in pp expr's.
llvm-svn: 39352
Submitted by:
Reviewed by:
Moved numeric literal support from SemaExpr.cpp to LiteralSupport.[h,cpp]
in Lex. This will allow it to be used by both Sema and Preprocessor (and
should be the last major refactoring of this sub-system).. Over
time, it will be reused by anyone implementing an actions module (i.e.
any subclass of llvm::clang::Action. Minor changes to IntegerLiteral in Expr.h.
More to come...
llvm-svn: 39351
whose decl objects are lazily created the first time they are referenced.
Builtin functions are described by the clang/AST/Builtins.def file, which
makes it easy to add new ones.
This is missing two important pieces:
1. Support for the rest of the gcc builtins.
2. Support for target-specific builtins (e.g. __builtin_ia32_emms).
Just adding this builtins reduces the number of implicit function definitions
by 6, reducing the # diagnostics from 550 to 544 when parsing carbon.h.
I need to add all the i386-specific ones to eliminate several hundred more.
ugh.
llvm-svn: 39327
of having a loose collection of function pointers. This also allows clients to
maintain state, and reduces the size of the Preprocessor.h interface.
llvm-svn: 39203
filenames (and also '#pragma GCC dependency' of course). Now, assuming
no cleaning is needed, we can go all the way from lexing the filename to
doing filename lookups with no mallocs. This speeds up user PP time from
0.077 to 0.075s for Cocoa.h (2.6%).
llvm-svn: 39092
argument tokens into instead of a real vector. This avoids some malloc
traffic in common cases. In an "abusive macro expansion" testcase, this
reduced -Eonly time by 25%.
llvm-svn: 38757
is an intra-lexer property, not a inter-lexer property, so it makes sense
for it to be define here. It also makes no sense for macros, and allows us
to define it more carefully in the header.
While I'm at it, improve comments and structuring in Lexer.h
llvm-svn: 38701
that are lexed are made up of tokens, so the calls are just ugly and redundant.
Hook up the MIOpt for the #if case. PPCExpressions doesn't currently implement
the hook though, so we still don't handle #if !defined(X) with the MIOpt.
llvm-svn: 38649
Now, instead of keeping a pointer to the start of the token in memory, we keep the
start of the token as a SourceLocation node. This means that each LexerToken knows
the full include stack it was created with, and means that the LexerToken isn't
reliant on a "CurLexer" member to be around (lexer tokens would previously go out of
scope when their lexers were deallocated).
This simplifies several things, and forces good cleanup elsewhere. Now the
Preprocessor is the one that knows how to dump tokens/macros and is the one that
knows how to get the spelling of a token (it has all the context).
llvm-svn: 38551