Fixes a crash printing diagnostics on the gcc testsuite, and also makes
diagnostic range printing print nicer results for token pastes.
llvm-svn: 169068
import of that module elsewhere, don't try to build the module again:
it won't work, and the experience is quite dreadful. We track this
information somewhat globally, shared among all of the related
CompilerInvocations used to build modules on-the-fly, so that a
particular Clang instance will only try to build a given module once.
Fixes <rdar://problem/12552849>.
llvm-svn: 168961
string literal needs cleaning (because it contains line-splicing in the
encoding prefix or in the ud-suffix), do not clean the section between the
double-quotes -- that's the "raw" bit!
llvm-svn: 168776
This makes LexCharConstant() look more like LexStringLiteral(), which doesn't
have this bug. Add tests for eof after \ for several other cases.
llvm-svn: 168269
common LexStringLiteral function. In doing so, some consistency problems have
been ironed out (e.g. where the first token in the string literal was lexed
with macro expansion, but subsequent ones were not) and also an erroneous
diagnostic has been corrected.
LexStringLiteral is complemented by a FinishLexStringLiteral function which
can be used in the situation where the first token of the string literal has
already been lexed.
llvm-svn: 168266
the related comma pasting extension.
In certain cases, we used to get two diagnostics for what is essentially one
extension. This change suppresses the first diagnostic in certain cases
where we know we're going to print the second diagnostic. The
diagnostic is redundant, and it can't be suppressed in the definition
of the macro because it points at the use of the macro, so we want to
avoid printing it if possible.
The implementation works by detecting constructs which look like comma
pasting at the time of the definition of the macro; this information
is then used when the macro is used. (We can't actually detect
whether we're using the comma pasting extension until the macro is
actually used, but we can detecting constructs which will be comma
pasting if the varargs argument is elided.)
<rdar://problem/12292192>
llvm-svn: 167907
is empty in a variadic macro expansion. This fixes a divergence in support for
the ", ## __VA_ARGS__" GCC extension which differed in behaviour when in strict
C99 mode (note: there is no change in behaviour has been made in the gnu99 mode
that clang uses by default). In addition, there is improved support for the
Microsoft alternative extension ", __VA_ARGS__".
llvm-svn: 167613
allowing a module map to be placed one level above the '.framework'
directories to specify that all .frameworks within that directory can
be inferred as framework modules. One can also specifically exclude
frameworks known not to work.
This makes explicit (and more restricted) behavior modules have had
"forever", where *any* .framework was assumed to be able to be built
as a module. That's not necessarily true, so we white-list directories
(with exclusions) when those directories have been audited.
llvm-svn: 167482
the various stakeholders bump up the reference count. In particular,
the diagnostics engine now keeps the DiagnosticOptions object alive.
llvm-svn: 166508
description. Previously, one could emulate this behavior by placing
the header in an always-unavailable submodule, but Argyrios guilted me
into expressing this idea properly.
llvm-svn: 165921
macro history.
When deserializing macro history, we arrange history such that the
macros that have definitions (that haven't been #undef'd) and are
visible come at the beginning of the list, which is what the
preprocessor and other clients of Preprocessor::getMacroInfo()
expect. If additional macro definitions become visible later, they'll
be moved toward the front of the list. Note that it's possible to have
ambiguities, but we don't diagnose them yet.
There is a partially-implemented design decision here that, if a
particular identifier has been defined or #undef'd within the
translation unit, that definition (or #undef) hides any macro
definitions that come from imported modules. There's still a little
work to do to ensure that the right #undef'ing happens.
Additionally, we'll need to scope the update records for #undefs, so
they only kick in when the submodule containing that update record
becomes visible.
llvm-svn: 165682
MacroInfo*. Instead of simply dumping an offset into the current file,
give each macro definition a proper ID with all of the standard
modules-remapping facilities. Additionally, when a macro is modified
in a subsequent AST file (e.g., #undef'ing a macro loaded from another
module or from a precompiled header), provide a macro update record
rather than rewriting the entire macro definition. This gives us
greater consistency with the way we handle declarations, and ties
together macro definitions much more cleanly.
Note that we're still not actually deserializing macro history (we
never were), but it's far easy to do properly now.
llvm-svn: 165560
Summary:
When issuing a diagnostic message for the -Wimplicit-fallthrough diagnostics, always try to find the latest macro, defined at the point of fallthrough, which is immediately expanded to "[[clang::fallthrough]]", and use it's name instead of the actual sequence.
Known issues:
* uses PP.getSpelling() to compare macro definition with a string (anyone can suggest a convenient way to fill a token array, or maybe lex it in runtime?);
* this can be generalized and used in other similar cases, any ideas where it should reside then?
Reviewers: doug.gregor, rsmith
Reviewed By: rsmith
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D50
llvm-svn: 164858
have PPCallbacks::InclusionDirective pass the character range for the filename quotes or brackets.
rdar://11113134 & http://llvm.org/PR13880
llvm-svn: 164743
This makes the behavior clearer concerning literals with the maximum
number of digits. For a 32-bit example, 4,000,000,000 is a valid uint32_t,
but 5,000,000,000 is not, so we'd have to count 10-digit decimal numbers
as "unsafe" (meaning we have to check for overflow when parsing them,
just as we would for numbers with 11 digits or higher). This is the same,
only with 64 bits to play with.
No functionality change.
llvm-svn: 164639
top-level frameworks can actually be symlinked over to embedded
frameworks, and accessed via the top-level framework's headers. In
this case, we need to determine that the framework was *actually* an
embedded framework, so we can load the appropriate top-level module.
llvm-svn: 164620
Summary: Passes all tests (+ the new one with code completion), but needs a thorough review in part related to modules.
Reviewers: doug.gregor
Reviewed By: alexfh
CC: cfe-commits, rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D41
llvm-svn: 164610
specific module (__building_module(modulename)) and to get the name of
the current module as an identifier (__MODULE__).
Used to help headers behave differently when they're being included as
part of building a module. Oh, the irony.
llvm-svn: 164605
Fixed by pointing the end location of the preprocessed entity for the #include
at the closing '>', instead of the start of '<'.
rdar://11113134
llvm-svn: 163588
(__is_pod, __is_signed, etc.) to normal identifiers if they are
encountered in certain places in the grammar where we know that prior
versions of libstdc++ or libc++ use them, to still allow the use of
these keywords as type traits. Fixes <rdar://problem/9836262> and PR10184.
llvm-svn: 162937
within its own argument list. The original definition is used for the immediate
expansion, but the new definition is used for any subsequent occurences within
the argument list or after the expansion.
llvm-svn: 162906
Summary:
The problem was with the following sequence:
#pragma push_macro("long")
#undef long
#pragma pop_macro("long")
in case when "long" didn't represent a macro.
Fixed crash and removed code duplication for #undef/pop_macro case. Added regression tests.
Reviewers: doug.gregor, klimek
Reviewed By: doug.gregor
CC: cfe-commits, chapuni
Differential Revision: http://llvm-reviews.chandlerc.com/D31
llvm-svn: 162845
Summary:
Summary: Keep history of macro definitions and #undefs with corresponding source locations, so that we can later find out all macros active in a specified source location. We don't save the history in PCH (no need currently). Memory overhead is about sizeof(void*)*3*<number of macro definitions and #undefs>+<in-memory size of all #undef'd macros>
I've run a test on a file composed of 109 .h files from boost 1.49 on x86-64 linux.
Stats before this patch:
*** Preprocessor Stats:
73222 directives found:
19171 #define.
4345 #undef.
#include/#include_next/#import:
5233 source files entered.
27 max include stack depth
19210 #if/#ifndef/#ifdef.
2384 #else/#elif.
6891 #endif.
408 #pragma.
14466 #if/#ifndef#ifdef regions skipped
80023/451669/1270 obj/fn/builtin macros expanded, 85724 on the fast path.
127145 token paste (##) operations performed, 11008 on the fast path.
Preprocessor Memory: 5874615B total
BumpPtr: 4399104
Macro Expanded Tokens: 417768
Predefines Buffer: 8135
Macros: 1048576
#pragma push_macro Info: 0
Poison Reasons: 1024
Comment Handlers: 8
Stats with this patch:
...
Preprocessor Memory: 7541687B total
BumpPtr: 6066176
Macro Expanded Tokens: 417768
Predefines Buffer: 8135
Macros: 1048576
#pragma push_macro Info: 0
Poison Reasons: 1024
Comment Handlers: 8
In my test increase in memory usage is about 1.7Mb, which is ~28% of initial preprocessor's memory usage and about 0.8% of clang's total VMM allocation.
As for CPU overhead, it should only be noticeable when iterating over all macros, and should mostly consist of couple extra dereferences and one comparison per macro + skipping of #undef'd macros. It's less trivial to measure, though, as the preprocessor consumes a very small fraction of compilation time.
Reviewers: doug.gregor, klimek, rsmith, djasper
Reviewed By: doug.gregor
CC: cfe-commits, chandlerc
Differential Revision: http://llvm-reviews.chandlerc.com/D28
llvm-svn: 162810
diagnostics for bad deployment targets and adding a few
more predicates. Includes a patch by Jonathan Schleifer
to enable ARC for ObjFW.
llvm-svn: 162252
current directory, propagate the framework and in-index-header-map
from the including header's information down to the included header's
information. Fixes <rdar://problem/11261291>.
As with everything header-map related, we can't really test this in
isolation within Clang, so it's tested elsewhere.
llvm-svn: 161759
This tests for the ability to include a "message" field in availability
attributes, like so:
extern void ATSFontGetName(const char *oName)
__attribute__((availability(macosx,introduced=8.0,deprecated=9.0,
message="use CTFontCopyFullName")));
This was actually supported in Clang 3.1, but we got a request for a
__has_feature so that header files can use this more safely. It's
unfortunate that the 3.1 release doesn't include this, however.
<rdar://problem/11886458>
llvm-svn: 160699
In future changes we should:
* use __builtin_trap rather than derefing 'random' volatile pointers.
* avoid dumping temporary files into /tmp when running tests, instead
preferring a location that is properly cleaned up by lit.
Review by Chandler Carruth.
llvm-svn: 159469
undefined behaviour, and move the diagnostic for '' from an Error into
an ExtWarn in this group. This is important for some users of the preprocessor,
and is necessary for gcc compatibility.
llvm-svn: 159335
express library-level dependencies within Clang.
This is no more verbose really, and plays nicer with the rest of the
CMake facilities. It should also have no change in functionality.
llvm-svn: 158888
places. I've turned this off for the GNU runtimes --- I don't know if
they support weak class import, but it's easy enough for them to opt in.
Also tweak a comment per review by Jordan.
llvm-svn: 158860
target Objective-C runtime down to the frontend: break this
down into a single target runtime kind and version, and compute
all the relevant information from that. This makes it
relatively painless to add support for new runtimes to the
compiler. Make the new -cc1 flag, -fobjc-runtime=blah-x.y.z,
available at the driver level as a better and more general
alternative to -fgnu-runtime and -fnext-runtime. This new
concept of an Objective-C runtime also encompasses what we
were previously separating out as the "Objective-C ABI", so
fragile vs. non-fragile runtimes are now really modelled as
different kinds of runtime, paving the way for better overall
differentiation.
As a sort of special case, continue to accept the -cc1 flag
-fobjc-runtime-has-weak, as a sop to PLCompatibilityWeak.
I won't go so far as to say "no functionality change", even
ignoring the new driver flag, but subtle changes in driver
semantics are almost certainly not intended.
llvm-svn: 158793
* Escaped # and < characters in Doxygen comments as needed;
* Removed a Doxygen comment in HeaderSearch.cpp that was redundant with
the corresponding comment in the header file.
llvm-svn: 158776
* Retain comments in the AST
* Serialize/deserialize comments
* Find comments attached to a certain Decl
* Expose raw comment text and SourceRange via libclang
llvm-svn: 158771
The original r158700 caused crashes in the gcc test suite,
g++.abi/vtable3a.C among others. It also caused failures in the libc++
test suite.
llvm-svn: 158749
r158085 added some logic to track predefined declarations. The main reason we
had predefined declarations in the input was because the __builtin_va_list
declarations were injected into the preprocessor input. As of r158592 we
explicitly build the __builtin_va_list declarations. Therefore the predefined
decl tracking is no longer needed.
llvm-svn: 158732
Note that this is mostly a structural patch that handles the change from the old
spelling style to the new one. One consequence of this is that all AT_foo_bar
enum values have changed to not be based off of the first spelling, but rather
off of the class name, so they are now AT_FooBar and the like (a straw poll on
IRC showed support for this). Apologies for code churn.
Most attributes have GNU spellings as a temporary solution until everything else
is sorted out (such as a Keyword spelling, which I intend to add if someone else
doesn't beat me to it). This is definitely a WIP.
I've also killed BaseCheckAttr since it was unused, and I had to go through
every attribute anyway.
llvm-svn: 158700
* Removed docs for Lexer::makeFileCharRange from Lexer.cpp, as they're in
the header file;
* Reworked the documentation for SkipBlockComment so that it doesn't confuse
Doxygen's comment parsing;
* Added another summary with \brief markup.
llvm-svn: 158618
1. Teach Lexer that pragma lexers are like macro expansions at EOF.
2. Treat pragmas like #define/#undef when printing.
3. If we just printed a directive, add a newline before any more tokens.
(4. Miscellaneous cleanup in PrintPreprocessedOutput.cpp)
PR10594 and <rdar://problem/11562490> (two separate related problems)
llvm-svn: 158571
modes. For languages other than C99/C11, this isn't quite a conforming
extension, and for C++11, it breaks some reasonable code containing
user-defined literals.
In languages which don't officially have hexfloats, pare back this extension
to only apply in cases where the token starts 0x and does not contain an
underscore. The extension is still not quite conforming, but it's a lot closer
now.
llvm-svn: 158487
override whether headers are system headers by checking for prefixes of the
header name specified in the #include directive.
This allows warnings to be disabled for third-party code which is found in
specific subdirectories of include paths.
llvm-svn: 158418
The preprocessor's handling of diagnostic push/pops is stateful, so
encountering pragmas during a re-parse causes problems. HTMLRewrite
already filters out normal # directives including #pragma, so it's
clear it's not expected to be interpreting pragmas in this mode.
This fix adds a flag to Preprocessor to explicitly disable pragmas.
The "right" fix might be to separate pragma lexing from pragma
parsing so that we can throw away pragmas like we do preprocessor
directives, but right now it's important to get the fix in.
Note that this has nothing to do with the "hack" of re-using the
input preprocessor in HTMLRewrite. Even if we someday copy the
preprocessor instead of re-using it, the copy would (and should) include
the diagnostic level tables and have the same problems.
llvm-svn: 158214
This was a problem for people who write 'return(result);'
Also fix ARCMT's corresponding code, though there's no test case for this
because implicit casts like this are rejected by the migrator for being
ambiguous, and explicit casts have no problem.
<rdar://problem/11577346>
llvm-svn: 158130
In standard C since C89, a 'translation-unit' is syntactically defined to have
at least one "external-declaration", which is either a decl or a function
definition. In Clang the latter gives us a declaration as well.
The tricky bit about this warning is that our predefines can contain external
declarations (__builtin_va_list and the 128-bit integer types). Therefore our
AST parser now makes sure we have at least one declaration that doesn't come
from the predefines buffer.
Also, remove bogus warning about empty source files. This doesn't catch source
files that only contain comments, and never fired anyway because of our
predefines.
PR12665 and <rdar://problem/9165548>
llvm-svn: 158085
so we can destroy it even if it was constructed with "DelayInitialization = true",
and we didn't end up calling Preprocessor::Initialize.
Fixes crashes in rdar://11558355
llvm-svn: 157892
- Developers of system frameworks need a way for their framework to be treated as a "system framework" during development. Otherwise, they are unable to properly test how their framework behaves when installed because of the semantic changes (in warning behavior) applied to system frameworks.
llvm-svn: 154105
cached during the non-cached lex, otherwise we are going to drop them.
Fixes a bogus "_Pragma takes a parenthesized string literal" error when
expanding consecutive _Pragmas in a macro argument.
Part of rdar://11168596
llvm-svn: 153994
If we are pre-expanding a macro argument don't actually "activate"
the pragma at that point, activate the pragma whenever we encounter
it again in the token stream.
This ensures that we will activate it in the correct location
or that we will ignore it if it never enters the token stream, e.g:
\#define EMPTY(x)
\#define INACTIVE(x) EMPTY(x)
INACTIVE(_Pragma("clang diagnostic ignored \"-Wconversion\""))
This also fixes the crash in rdar://11168596.
llvm-svn: 153959
"#include MACRO(STUFF)".
-As an inclusion position for the included file, use the file location of the file where it
was included but *after* the macro expansions. We want the macro expansions to be considered
as before-in-translation-unit for everything in the included file.
-In the preprocessing record take into account that only inclusion directives can be encountered
as "out-of-order" (by comparing the start of the range which for inclusions is the hash location)
and use binary search if there is an extreme number of macro expansions in the include directive.
Fixes rdar://11111779
llvm-svn: 153527
Enable incremental parsing by the Preprocessor,
where more code can be provided after an EOF.
It mainly prevents the tearing down of the topmost lexer.
To be used like this:
PP.enableIncrementalProcessing();
while (getMoreSource()) {
while (Parser.ParseTopLevelDecl(ADecl)) {...}
}
PP.enableIncrementalProcessing(false);
llvm-svn: 152914
basic source character set in C++98. Add -Wc++98-compat diagnostics for same in
literals in C++11. Extend such support to cover string literals as well as
character literals, and mark N2170 as done.
This seems too minor to warrant a release note to me. Let me know if you disagree.
llvm-svn: 152444
first codepoint! Also, don't reject empty raw string literals for spurious
"encoding" issues. Also, don't rely on undefined behavior in ConvertUTF.c.
llvm-svn: 152344
starting with an underscore is ill-formed.
Since this rule rejects programs that were using <inttypes.h>'s macros, recover
from this error by treating the ud-suffix as a separate preprocessing-token,
with a DefaultError ExtWarn. The approach of treating such cases as two tokens
is under discussion for standardization, but is in any case a conforming
extension and allows existing codebases to keep building while the committee
makes up its mind.
Reword the warning on the definition of literal operators not starting with
underscores (which are, strangely, legal) to more explicitly state that such
operators can't be called by literals. Remove the special-case diagnostic for
hexfloats, since it was both triggering in the wrong cases and incorrect.
llvm-svn: 152287
NSNumber, and boolean literals. This includes both Sema and Codegen support.
Included is also support for new Objective-C container subscripting.
My apologies for the large patch. It was very difficult to break apart.
The patch introduces changes to the driver as well to cause clang to link
in additional runtime support when needed to support the new language features.
Docs are forthcoming to document the implementation and behavior of these features.
llvm-svn: 152137
grammar requires a string-literal and not a user-defined-string-literal. The
two constructs are still represented by the same TokenKind, in order to prevent
a combinatorial explosion of different kinds of token. A flag on Token tracks
whether a ud-suffix is present, in order to prevent clients from needing to look
at the token's spelling.
llvm-svn: 152098
Introduce PreprocessingRecord::rangeIntersectsConditionalDirective() which returns
true if a given range intersects with a conditional directive block.
llvm-svn: 152018
-Add location parameter for the directives callbacks
-Skip callbacks if the directive is inside a skipped range.
-Make sure the directive callbacks are invoked in source order.
llvm-svn: 152017
Fixes PR10606.
I'm not sure if this is the best way to go about it, but
I locally enabled this code path without the msext conditional,
and all tests pass, except for test/Preprocessor/cxx_oper_keyword.cpp
which explicitly checks that operator keywords can't be redefined.
I also parsed chromium/win with a clang with and without this patch.
It introduced no new errors, but removes 43 existing errors.
llvm-svn: 151768
that provides the behavior of the C++11 library trait
std::is_trivially_constructible<T, Args...>, which can't be
implemented purely as a library.
Since __is_trivially_constructible can have zero or more arguments, I
needed to add Yet Another Type Trait Expression Class, this one
handling arbitrary arguments. The next step will be to migrate
UnaryTypeTrait and BinaryTypeTrait over to this new, more general
TypeTrait class.
Fixes the Clang side of <rdar://problem/10895483> / PR12038.
llvm-svn: 151352
This seems to negatively affect compile time onsome ObjC tests
(which use a lot of partial diagnostics I assume). I have to come
up with a way to keep them inline without including Diagnostic.h
everywhere. Now adding a new diagnostic requires a full rebuild
of e.g. the static analyzer which doesn't even use those diagnostics.
This reverts commit 6496bd10dc3a6d5e3266348f08b6e35f8184bc99.
This reverts commit 7af19b817ba964ac560b50c1ed6183235f699789.
This reverts commit fdd15602a42bbe26185978ef1e17019f6d969aa7.
This reverts commit 00bd44d5677783527d7517c1ffe45e4d75a0f56f.
This reverts commit ef9b60ffed980864a8db26ad30344be429e58ff5.
llvm-svn: 150006
- Move the offending methods out of line and fix transitive includers.
- This required changing an enum in the PPCallback API into an unsigned.
llvm-svn: 149782
into using non-absolute system includes (<foo>)...
... and introduce another hack that is simultaneously more heineous
and more effective. We whitelist Clang-supplied headers that augment
or override system headers (such as float.h, stdarg.h, and
tgmath.h). For these headers, Clang does not provide a module
mapping. Instead, a system-supplied module map can refer to these
headers in a system module, and Clang will look both in its own
include directory and wherever the system-supplied module map
suggests, then adds either or both headers. The end result is that
Clang-supplied headers get merged into the system-supplied module for
the C standard library.
As a drive-by, fix up a few dependencies in the _Builtin_instrinsics
module.
llvm-svn: 149611
__has_builtin
in an empty file, as we were overwriting the EOF token. Overwriting an arbitrary token
never seems like a good idea in the error case. This fixes a bug reported on the GCC
list :)
llvm-svn: 149397
for getting the name of the module file, unifying the code for
searching for a module with a given name (into lookupModule()) and
separating out the mapping to a module file (into
getModuleFileName()). No functionality change.
llvm-svn: 149197
single attribute ("system") that allows us to mark a module as being a
"system" module. Each of the headers that makes up a system module is
considered to be a system header, so that we (for example) suppress
warnings there.
If a module is being inferred for a framework, and that framework
directory is within a system frameworks directory, infer it as a
system framework.
llvm-svn: 149143
when it actually has changed (and not, e.g., when we've simply attached a
deserialized macro definition). Good for ~1.5% reduction in module
file size, mostly in the identifier table.
llvm-svn: 148808
Updates ProcessUCNExcape() for C++. C++11 allows UCNs in character
and string literals that represent control characters and basic
source characters. Also C++03 allows UCNs that refer to surrogate
codepoints.
UTF-8 sequences in character literals are now handled as single
c-chars.
Added error for multiple characters in Unicode character literals.
Added errors for when a the execution charset encoding of a c-char
cannot be represented as a single code unit in the associated
character type. Note that for the purposes of this error the asso-
ciated character type for a narrow character literal is char, not
int, even though in C narrow character literals have type int.
llvm-svn: 148389
- Add atomic-to/from-nonatomic cast types
- Emit atomic operations for arithmetic on atomic types
- Emit non-atomic stores for initialisation of atomic types, but atomic stores and loads for every other store / load
- Add a __atomic_init() intrinsic which does a non-atomic store to an _Atomic() type. This is needed for the corresponding C11 stdatomic.h function.
- Enables the relevant __has_feature() checks. The feature isn't 100% complete yet, but it's done enough that we want people testing it.
Still to do:
- Make the arithmetic operations on atomic types (e.g. Atomic(int) foo = 1; foo++;) use the correct LLVM intrinsic if one exists, not a loop with a cmpxchg.
- Add a signal fence builtin
- Properly set the fenv state in atomic operations on floating point values
- Correctly handle things like _Atomic(_Complex double) which are too large for an atomic cmpxchg on some platforms (this requires working out what 'correctly' means in this context)
- Fix the many remaining corner cases
llvm-svn: 148242
re-computed rather than the variables be re-used just after the assert.
Just use the variables since we have them already. Fixes an unused
variable warning.
Also fix an 80-column violation.
llvm-svn: 148212
framework is actually a subframework within a top-level framework. If
so, only infer a module for the top-level framework and then dig out
the appropriate submodule.
This helps us cope with an amusing subframeworks anti-pattern, where
one uses -F <framework>/Frameworks to get direct include access to the
subframeworks of a framework (which otherwise would not be
permitted).
llvm-svn: 148148
include stack to find the first file that is known to be part of the
module. This copes with situations where the module map doesn't
completely specify all of the headers that are involved in the module,
which can come up when there are very strange #include_next chains
(e.g., with weird compiler/stdlib headers like stdarg.h or float.h).
llvm-svn: 147662
in the module map. This provides a bit more predictability for the
user, as well as eliminating the need to sort the submodules when
serializing them.
llvm-svn: 147564
any language variant), and restrict __has_feature(objc_modules) to
mean that we also have the Objective-C @import syntax. I anticipate
__has_feature(cxx_modules) and/or __has_feature(c_modules) for when we
nail down the module syntax for C/C++.
llvm-svn: 147548
modules. This leaves us without an explicit syntax for importing
modules in C/C++, because such a syntax needs to be discussed
first. In Objective-C/Objective-C++, the @import syntax is used to
import modules.
Note that, under -fmodules, C/C++ programs can import modules via the
#include mechanism when a module map is in place for that header. This
allows us to work with modules in C/C++ without committing to a syntax.
llvm-svn: 147467
features needed for a particular module to be available. This allows
mixed-language modules, where certain headers only work under some
language variants (e.g., in C++, std.tuple might only be available in
C++11 mode).
llvm-svn: 147387
found within that umbrella directory that were not actually included
by the umbrella header. They should either be referenced in the module
map or included by the umbrella header.
llvm-svn: 147207
hitting a submodule that was never actually created, e.g., because
that header wasn't parsed. In such cases, complain (because the
module's umbrella headers don't cover everything) and fall back to
including the header.
Later, we'll add a warning at module-build time to catch all such
cases. However, this fallback is important to eliminate assertions in
the ASTWriter when this happens.
llvm-svn: 146933
#__include_macros) in the arguments of a function-style macro. Directives in the
arguments of such macros have undefined behaviour, and GCC does not correctly
support these cases. In some situations, this can lead to better diagnostics.
llvm-svn: 146765
all of the headers below that particular directory. Use umbrella
directories as a clean way to deal with (1) directories/frameworks
that don't have an umbrella header, but don't want to enumerate all of
their headers, and (2) PrivateHeaders, which we never want to
enumerate and want to keep separate from the main umbrella header.
This also eliminates a little more of the "magic" for private headers,
and frameworks in general.
llvm-svn: 146235
part of HeaderSearch. This function just normalizes filenames for use
inside of a synthetic include directive, but it is used in both the
Frontend and Serialization libraries so it needs a common home.
llvm-svn: 146227
umbrella headers in the sense that all of the headers within that
directory (and eventually its subdirectories) are considered to be
part of the module with that umbrella directory. However, unlike
umbrella headers, which are expected to include all of the headers
within their subdirectories, Clang will automatically include all of
the headers it finds in the named subdirectory.
The intent here is to allow a module map to trivially turn a
subdirectory into a module, where the module's structure can mimic the
directory structure.
llvm-svn: 146165
a modifier for a header declarartion, e.g.,
umbrella header "headername"
Collapse the umbrella-handling code in the parser into the
header-handling code, so we don't duplicate the header-search logic.
llvm-svn: 146159
header to also support umbrella directories. The umbrella directory
for an umbrella header is the directory in which the umbrella header
resides.
No functionality change yet, but it's coming.
llvm-svn: 146158
when we load a module map (module.map) from a directory, also load a
private module map (module_private.map) for that directory, if
present. That private module map can inject a new submodule that
captures private headers.
llvm-svn: 146012
most specific (sub)module based on the actual file we find, rather
than always importing the top-level module. This means
that #include'ing <Foo/Blah.h> should give us the submodule Foo.Blah.
llvm-svn: 145942
frameworks). A submodule can now be labeled as a "framework", and
header search will look into the appropriate Headers/PrivateHeaders
subdirectories for named headers.
llvm-svn: 145941
to re-export anything that it imports. This opt-in feature makes a
module behave more like a header, because it can be used to re-export
the transitive closure of a (sub)module's dependencies.
llvm-svn: 145811
within module maps, which will (eventually) be used to re-export a
module from another module. There are still some pieces missing,
however.
llvm-svn: 145665
(sub)module, all of the names may be hidden, just the macro names may
be exposed (for example, after the preprocessor has seen the import of
the module but the parser has not), or all of the names may be
exposed. Importing a module makes its names, and the names in any of
its non-explicit submodules, visible to name lookup (transitively).
This commit only introduces the notion of name visible and marks
modules and submodules as visible when they are imported. The actual
name-hiding logic in the AST reader will follow (along with test cases).
llvm-svn: 145586
library, since modules cut across all of the libraries. Rename
serialization::Module to serialization::ModuleFile to side-step the
annoying naming conflict. Prune a bunch of ModuleMap.h includes that
are no longer needed (most files only needed the Module type).
llvm-svn: 145538
callback client to suggest an alternative search path and after we
complain when the included file can't be found. The former can't be
tested in isolation, the latter doesn't actually matter (because we
won't make a module suggestion if no header is available). However,
the flow is better this way.
llvm-svn: 145502
submodules. This information will eventually be used for name hiding
when dealing with submodules. For now, we only use it to ensure that
the module "key" returned when loading a module will always be a
module (rather than occasionally being a FileEntry).
llvm-svn: 145497
return the module itself (in the module map) rather than returning the
umbrella header used to build the module. While doing this, make sure
that we're inferring modules for frameworks to build that module.
llvm-svn: 145310
into a module. This module can either be loaded from a module map in
the framework directory (which isn't quite working yet) or inferred
from an umbrella header (which does work, and replaces the existing
hack).
llvm-svn: 144877
the umbrella header's directory and its subdirectories are part of the
module (that's why it's an umbrella). Make sure that these headers are
considered to be part of the module for lookup purposes.
llvm-svn: 144859
the module is described in one of the module maps in a search path or
in a subdirectory off the search path that has the same name as the
module we're looking for.
llvm-svn: 144433
map, so long as they have an umbrella header. This makes it possible
to introduce a module map + umbrella header for a given set of
headers, to turn it into a module.
There are two major deficiencies here: first, we don't go hunting for
module map files when we just see a module import (so we won't know
about the modules described therein). Second, we don't yet have a way
to build modules that don't have umbrella headers, or have incomplete
umbrella headers.
llvm-svn: 144424
the corresponding (top-level) modules. This isn't actually useful yet,
because we don't yet have a way to build modules out of module maps.
llvm-svn: 144410
Module map files provide a way to map between headers and modules, so
that we can layer a module system on top of existing headers without
changing those headers at all.
This commit introduces the module map file parser and the module map
that it generates, and wires up the module map file parser so that
we'll automatically find module map files as part of header
search. Note that we don't yet use the information stored in the
module map.
llvm-svn: 144402
AST file more lazy, so that we don't eagerly load that information for
all known identifiers each time a new AST file is loaded. The eager
reloading made some sense in the context of precompiled headers, since
very few identifiers were defined before PCH load time. With modules,
however, a huge amount of code can get parsed before we see an
@import, so laziness becomes important here.
The approach taken to make this information lazy is fairly simple:
when we load a new AST file, we mark all of the existing identifiers
as being out-of-date. Whenever we want to access information that may
come from an AST (e.g., whether the identifier has a macro definition,
or what top-level declarations have that name), we check the
out-of-date bit and, if it's set, ask the AST reader to update the
IdentifierInfo from the AST files. The update is a merge, and we now
take care to merge declarations before/after imports with declarations
from multiple imports.
The results of this optimization are fairly dramatic. On a small
application that brings in 14 non-trivial modules, this takes modules
from being > 3x slower than a "perfect" PCH file down to 30% slower
for a full rebuild. A partial rebuild (where the PCH file or modules
can be re-used) is down to 7% slower. Making the PCH file just a
little imperfect (e.g., adding two smallish modules used by a bunch of
.m files that aren't in the PCH file) tips the scales in favor of the
modules approach, with 24% faster partial rebuilds.
This is just a first step; the lazy scheme could possibly be improved
by adding versioning, so we don't search into modules we already
searched. Moreover, we'll need similar lazy schemes for all of the
other lookup data structures, such as DeclContexts.
llvm-svn: 143100
preprocessed entities that are #included in the range that we are interested.
This is useful when we are interested in preprocessed entities of a specific file, e.g
when we are annotating tokens. There is also an optimization where we cache the last
result of PreprocessingRecord::getPreprocessedEntitiesInRange and we re-use it if
there is a call with the same range as before.
rdar://10313365
llvm-svn: 142887
This also adds a -Wc++98-compat-pedantic for warning on constructs which would
be diagnosed by -std=c++98 -pedantic (that is, it warns even on C++11 features
which we enable by default, with no warning, in C++98 mode).
llvm-svn: 142034
CoreFoundation object-transfer properties audited, and add a #pragma
to cause them to be automatically applied to functions in a particular
span of code. This has to be implemented largely in the preprocessor
because of the requirement that the region be entirely contained in
a single file; that's hard to impose from the parser without registering
for a ton of callbacks.
llvm-svn: 140846
buffer as an 'unsigned char', so that integer promotion doesn't
sign-extend character values > 127 into oblivion. Fixes
<rdar://problem/10188919>.
llvm-svn: 140608
which will do a binary search and return a pair of iterators
for preprocessed entities in the given source range.
Source ranges of preprocessed entities are stored twice currently in
the PCH/Module file but this will be fixed in a subsequent commit.
llvm-svn: 140058
the AST reader), merge that header file information with whatever
header file information we already have. Otherwise, we might forget
something we already knew (e.g., that the header was #import'd already).
llvm-svn: 139979
-Use an array of offsets for all preprocessed entities
-Get rid of the separate array of offsets for just macro definitions;
for references to macro definitions use an index inside the preprocessed
entities array.
-Deserialize each preprocessed entity lazily, at first request; not in bulk.
Paves the way for binary searching of preprocessed entities that will offer
efficiency and will simplify things on the libclang side a lot.
llvm-svn: 139809
target triple to separate modules built under different
conditions. The hash is used to create a subdirectory in the module
cache path where other invocations of the compiler (with the same
version, language options, etc.) can find the precompiled modules.
llvm-svn: 139662
but there is a corresponding umbrella header in a framework, build the
module on-the-fly so it can be immediately loaded at the import
statement. This is very much proof-of-concept code, with details to be
fleshed out over time.
llvm-svn: 139558
where the compiler will look for module files. Eliminates the
egregious hack where we looked into the header search paths for
modules.
llvm-svn: 139538
'id' that can be used (only!) via a contextual keyword as the result
type of an Objective-C message send. 'instancetype' then gives the
method a related result type, which we have already been inferring for
a variety of methods (new, alloc, init, self, retain). Addresses
<rdar://problem/9267640>.
llvm-svn: 139275
keyword. We now handle this keyword in HandleIdentifier, making a note
for ourselves when we've seen the __import_module__ keyword so that
the next lexed token can trigger a module import (if needed). This
greatly simplifies Preprocessor::Lex(), and completely erases the 5.5%
-Eonly slowdown Argiris noted when I originally implemented
__import_module__. Big thanks to Argiris for noting that horrible
regression!
llvm-svn: 139265
Previously we would cut off the source file buffer at the code-completion
point; this impeded code-completion inside C++ inline methods and,
recently, with buffering ObjC methods.
Have the code-completion inserted into the source buffer so that it can
be buffered along with a method body. When we actually hit the code-completion
point the cut-off lexing or parsing.
Fixes rdar://10056932&8319466
llvm-svn: 139086
The function was only counting lines that included tokens and not empty lines,
but MaxLines (mainly initiated to the line where the code-completion point resides)
is a count of overall lines (even empty ones).
llvm-svn: 139085
and language-specific initialization. Use this to allow ASTUnit to
create a preprocessor object *before* loading the AST file. No actual
functionality change.
llvm-svn: 138983
LangOptions, rather than making distinct copies of
LangOptions. Granted, LangOptions doesn't actually get modified, but
this will eventually make it easier to construct ASTContext and
Preprocessor before we know all of the LangOptions.
llvm-svn: 138959
include guards don't show up as macro definitions in every translation
unit that imports a module. Macro definitions can, however, be
exported with the intentionally-ugly #__export_macro__
directive. Implement this feature by not even bothering to serialize
non-exported macros to a module, because clients of that module need
not (should not) know that these macros even exist.
llvm-svn: 138943
existing practice with Python extension modules. Not that Python
extension modules should be using a double-underscored identifier
anyway, but...
llvm-svn: 138870
collision between C99 hexfloats and C++0x user-defined literals by
giving C99 hexfloats precedence. Also, warning about user-defined
literals that conflict with hexfloats and those that have names that
are reserved by the implementation. Fixes <rdar://problem/9940194>.
llvm-svn: 138839
__import__ within the preprocessor, since the prior one foolishly
assumed that Preprocessor::Lex() was re-entrant. We now handle
__import__ at the top level (only), after macro expansion. This should
fix the buildbot failures.
llvm-svn: 138704
loads the named module. The syntax itself is intentionally hideous and
will be replaced at some later point with something more
palatable. For now, we're focusing on the semantics:
- Module imports are handled first by the preprocessor (to get macro
definitions) and then the same tokens are also handled by the parser
(to get declarations). If both happen (as in normal compilation),
the second one is redundant, because we currently have no way to
hide macros or declarations when loading a module. Chris gets credit
for this mad-but-workable scheme.
- The Preprocessor now holds on to a reference to a module loader,
which is responsible for loading named modules. CompilerInstance is
the only important module loader: it now knows how to create and
wire up an AST reader on demand to actually perform the module load.
- We search for modules in the include path, using the module name
with the suffix ".pcm" (precompiled module) for the file name. This
is a temporary hack; we hope to improve the situation in the
future.
llvm-svn: 138679
to increased calls to SourceManager::getFileID. (rdar://9992664)
Use a slightly different approach that is more efficient both in terms of speed
(no extra getFileID calls) and in SLocEntries reduction.
Comparing pre-r138129 and this patch we get:
For compiling SemaExpr.cpp reduction of SLocEntries by 26%.
For the boost enum library:
-SLocEntries -34% (note that this was -5% for r138129)
-Memory consumption -50%
-PCH size -31%
Reduced SLocEntries also benefit the hot function SourceManager::getFileID,
evident by the reduced "FileID scans".
llvm-svn: 138380
Currently getMacroArgExpandedLocation is very inefficient and for the case
of a location pointing at the main file it will end up checking almost all of
the SLocEntries. Make it faster:
-Use a map of macro argument chunks to their expanded source location. The map
is for a single source file, it's stored in the file's ContentCache and lazily
computed, like the source lines cache.
-In SLocEntry's FileInfo add an 'unsigned NumCreatedFIDs' field that keeps track
of the number of FileIDs (files and macros) that were created during preprocessing
of that particular file SLocEntry. This is useful when computing the macro argument
map in skipping included files while scanning for macro arg FileIDs that lexed from
a specific source file. Due to padding, the new field does not increase the size
of SLocEntry.
llvm-svn: 138225
for tokens that are lexed consecutively from the same FileID, instead of creating
a SLocEntry for each token. e.g for
assert(foo == bar);
there will be a single SLocEntry for the "foo == bar" chunk and locations
for the 'foo', '==', 'bar' tokens will point inside that chunk.
For parsing SemaExpr.cpp, this reduced the number of SLocEntries by 25%.
llvm-svn: 138129
1. Be more tolerant of comments in -CC (comment-preserving) mode. We were missing a few cases.
2. Make sure to expand the second FOO in "#if defined FOO FOO". (See also
r97253, which addressed the case of "#if defined(FOO FOO".)
Fixes PR10286.
llvm-svn: 136748
etc. With this I think essentially all of the SourceManager APIs are
converted. Comments and random other bits of cleanup should be all thats
left.
llvm-svn: 136057
and various other 'expansion' based terms. I've tried to reformat where
appropriate and catch as many references in comments but I'm going to do
several more passes. Also I've tried to expand parameter names to be
more clear where appropriate.
llvm-svn: 136056
FullSourceLoc::getInstantiationLoc to ...::getExpansionLoc. This is part
of the API and documentation update from 'instantiation' as the term for
macros to 'expansion'.
llvm-svn: 135914
entities generated directly by the preprocessor from those loaded from
the external source (e.g., the ASTReader). By separating these two
sets of entities into different vectors, we allow both to grow
independently, and eliminate the need for preallocating all of the
loaded preprocessing entities. This is similar to the way the recent
SourceManager refactoring treats FileIDs and the source location
address space.
As part of this, switch over to building a continuous range map to
track preprocessing entities.
llvm-svn: 135646
source locations from source locations loaded from an AST/PCH file.
Previously, loading an AST/PCH file involved carefully pre-allocating
space at the beginning of the source manager for the source locations
and FileIDs that correspond to the prefix, and then appending the
source locations/FileIDs used for parsing the remaining translation
unit. This design forced us into loading PCH files early, as a prefix,
whic has become a rather significant limitation.
This patch splits the SourceManager space into two parts: for source
location "addresses", the lower values (growing upward) are used to
describe parsed code, while upper values (growing downward) are used
for source locations loaded from AST/PCH files. Similarly, positive
FileIDs are used to describe parsed code while negative FileIDs are
used to file/macro locations loaded from AST/PCH files. As a result,
we can load PCH/AST files even during parsing, making various
improvemnts in the future possible, e.g., teaching #include <foo.h> to
look for and load <foo.h.gch> if it happens to be already available.
This patch was originally written by Sebastian Redl, then brought
forward to the modern age by Jonathan Turner, and finally
polished/finished by me to be committed.
llvm-svn: 135484
variants to 'expand'. This changed a couple of public APIs, including
one public type "MacroInstantiation" which is now "MacroExpansion". The
rest of the codebase was updated to reflect this, especially the
libclang code. Two of the C++ (and thus easily changed) libclang APIs
were updated as well because they pertained directly to the old
MacroInstantiation class.
No functionality changed.
llvm-svn: 135139
'expand'. Also update the public API it provides to the new term, and
propagate that update to the various clients.
No functionality changed.
llvm-svn: 135138
argument expansion to use the macro argument source locations as well.
Add a few tests to exercise this. There is still a bit more work needed
here though.
llvm-svn: 134674
instantiation and improve diagnostics which are stem from macro
arguments to trace the argument itself back through the layers of macro
expansion.
This requires some tricky handling of the source locations, as the
argument appears to be expanded in the opposite direction from the
surrounding macro. This patch provides helper routines that encapsulate
the logic and explain the reasoning behind how we step through macros
during diagnostic printing.
This fixes the rest of the test cases originially in PR9279, and later
split out into PR10214 and PR10215.
There is still some more work we can do here to improve the macro
backtrace, but those will follow as separate patches.
llvm-svn: 134660
When a macro instantiation occurs, reserve a SLocEntry chunk with length the
full length of the macro definition source. Set the spelling location of this chunk
to point to the start of the macro definition and any tokens that are lexed directly
from the macro definition will get a location from this chunk with the appropriate offset.
For any tokens that come from argument expansion, '##' paste operator, etc. have their
instantiation location point at the appropriate place in the instantiated macro definition
(the argument identifier and the '##' token respectively).
This improves macro instantiation diagnostics:
Before:
t.c:5:9: error: invalid operands to binary expression ('struct S' and 'int')
int y = M(/);
^~~~
t.c:5:11: note: instantiated from:
int y = M(/);
^
After:
t.c:5:9: error: invalid operands to binary expression ('struct S' and 'int')
int y = M(/);
^~~~
t.c:3:20: note: instantiated from:
\#define M(op) (foo op 3);
~~~ ^ ~
t.c:5:11: note: instantiated from:
int y = M(/);
^
The memory savings for a candidate boost library that abuses the preprocessor are:
- 32% less SLocEntries (37M -> 25M)
- 30% reduction in PCH file size (900M -> 635M)
- 50% reduction in memory usage for the SLocEntry table (1.6G -> 800M)
llvm-svn: 134587
structure to hold inferred information, then propagate each invididual
bit down to -cc1. Separate the bits of "supports weak" and "has a native
ARC runtime"; make the latter a CodeGenOption.
The tool chain is still driving this decision, because it's the place that
has the required deployment target information on Darwin, but at least it's
better-factored now.
llvm-svn: 134453
Previously macro expanded tokens were added to Preprocessor's bump allocator and never released,
even after the TokenLexer that were lexing them was finished, thus they were wasting memory.
A very "useful" boost library was causing clang to eat 1 GB just for the expanded macro tokens.
Introduce a special cache that works like a stack; a TokenLexer can add the macro expanded tokens
in the cache, and when it finishes, the tokens are removed from the end of the cache.
Now consumed memory by expanded tokens for that library is ~ 1.5 MB.
Part of rdar://9327049.
llvm-svn: 134105
Language-design credit goes to a lot of people, but I particularly want
to single out Blaine Garst and Patrick Beard for their contributions.
Compiler implementation credit goes to Argyrios, Doug, Fariborz, and myself,
in no particular order.
llvm-svn: 133103
Patch by Matthieu Monrocq with tweaks by me to avoid StringRefs in the static
diagnostic data structures, which resulted in a huge global-var-init function.
Depends on llvm commit r132046.
llvm-svn: 132047
header. Getting it in the wrong order generated incorrect line markers in -E
mode. In the testcase from PR9861 we used to generate:
# 1 "test.c" 2
# 1 "./foobar.h" 1
# 0 "./foobar.h"
# 0 "./foobar.h" 3
# 2 "test.c" 2
now we properly produce:
# 1 "test.c" 2
# 1 "./foobar.h" 1
# 1 "./foobar.h" 3
# 2 "test.c" 2
This fixes PR9861.
llvm-svn: 131871
minor issues along the way:
- Non-type template parameters of type 'std::nullptr_t' were not
permitted.
- We didn't properly introduce built-in operators for nullptr ==,
!=, <, <=, >=, or > as candidate functions .
To my knowledge, there's only one (minor but annoying) part of nullptr
that hasn't been implemented: catching a thrown 'nullptr' as a pointer
or pointer-to-member, per C++0x [except.handle]p4.
llvm-svn: 131813
1. We would assume that the length of the string literal token was at least 2
2. We would allocate a buffer with size length-2
And when the stars aligned (one of which would be an invalid source location due to stale PCH)
The length would be 0 and we would try to allocate a 4GB buffer.
Add checks for this corner case and a bunch of asserts.
(We really really should have had an assert for 1.).
Note that there's no test case since I couldn't get one (it was major PITA to reproduce),
maybe later.
llvm-svn: 131492
__has_extension is a function-like macro which takes the same set
of feature identifiers as __has_feature. It evaluates to 1 if the
feature is supported by Clang in the current language (either as a
language extension or a standard language feature) or 0 if not.
At the same time, add support for the C1X feature identifiers
c_generic_selections (renamed from generic_selections) and
c_static_assert, and document them.
Patch by myself and Jean-Daniel Dupas.
llvm-svn: 131308
CXTranslationUnit_NestedMacroInstantiations, which indicates whether
we want to see "nested" macro instantiations (e.g., those that occur
inside other macro instantiations) within the detailed preprocessing
record. Many clients (e.g., those that only care about visible tokens)
don't care about this information, and in code that uses preprocessor
metaprogramming, this information can have a very high cost.
Addresses <rdar://problem/9389320>.
llvm-svn: 130990
which determines whether a particular file is actually a header that
is intended to be guarded from multiple inclusions within the same
translation unit.
llvm-svn: 130808
As far as I know, this implementation is complete but might be missing a
few optimizations. Exceptions and virtual bases are handled correctly.
Because I'm an optimist, the web page has appropriately been updated. If
I'm wrong, feel free to downgrade its support categories.
llvm-svn: 130642
includes get resolved, especially when they are found relatively to
another include file. We also try to get it working for framework
includes, but that part of the code is untested, as I don't have a code
base that uses it.
llvm-svn: 130246
This introduces a few APIs on the AST to bundle up the standard-based
logic so that programmatic clients have access to exactly the same
behavior.
There is only one serious FIXME here: checking for non-trivial move
constructors and move assignment operators. Those bits need to be added
to the declaration and accessors provided.
This implementation should be enough for the uses of __is_trivial in
libstdc++ 4.6's C++98 library implementation.
Ideas for more thorough test cases or any edge cases missing would be
appreciated. =D
llvm-svn: 130057
Add 'openFile' bool to FileManager::getFile to specify whether we want to have the file opened or not, have it
false by default, and enable it only in HeaderSearch.cpp where the open+fstat optimization matters.
Fixes rdar://9139899.
llvm-svn: 127748
clients to observe the exact path through which an #included file was
located. This is very useful when trying to record and replay inclusion
operations without it beind influenced by the aggressive caching done
inside the FileManager to avoid redundant system calls and filesystem
operations.
The work to compute and return this is only done in the presence of
callbacks, so it should have no effect on normal compilation.
Patch by Manuel Klimek.
llvm-svn: 127742
Find out that our C++0x status has only one field for noexcept expression and specification together, and that it was accidentally already marked as fully implemented.
This completes noexcept specification work.
llvm-svn: 127701
of an Objective-C method to be overridden on a case-by-case basis. This
is a higher-level tool than ns_returns_retained &c.; it lets users specify
that not only does a method have different retain/release semantics, but
that it semantically acts differently than one might assume from its name.
This in turn is quite useful to static analysis.
llvm-svn: 126839
The previous name was inaccurate as this token in fact appears at
the end of every preprocessing directive, not just macro definitions.
No functionality change, except for a diagnostic tweak.
llvm-svn: 126631
AST/PCH files more lazy:
- Don't preload all of the file source-location entries when reading
the AST file. Instead, load them lazily, when needed.
- Only look up header-search information (whether a header was already
#import'd, how many times it's been included, etc.) when it's needed
by the preprocessor, rather than pre-populating it.
Previously, we would pre-load all of the file source-location entries,
which also populated the header-search information structure. This was
a relatively minor performance issue, since we would end up stat()'ing
all of the headers stored within a AST/PCH file when the AST/PCH file
was loaded. In the normal PCH use case, the stat()s were cached, so
the cost--of preloading ~860 source-location entries in the Cocoa.h
case---was relatively low.
However, the recent optimization that replaced stat+open with
open+fstat turned this into a major problem, since the preloading of
source-location entries would now end up opening those files. Worse,
those files wouldn't be closed until the file manager was destroyed,
so just opening a Cocoa.h PCH file would hold on to ~860 file
descriptors, and it was easy to blow through the process's limit on
the number of open file descriptors.
By eliminating the preloading of these files, we neither open nor stat
the headers stored in the PCH/AST file until they're actually needed
for something. Concretely, we went from
*** HeaderSearch Stats:
835 files tracked.
364 #import/#pragma once files.
823 included exactly once.
6 max times a file is included.
3 #include/#include_next/#import.
0 #includes skipped due to the multi-include optimization.
1 framework lookups.
0 subframework lookups.
*** Source Manager Stats:
835 files mapped, 3 mem buffers mapped.
37460 SLocEntry's allocated, 11215575B of Sloc address space used.
62 bytes of files mapped, 0 files with line #'s computed.
with a trivial program that uses a chained PCH including a Cocoa PCH
to
*** HeaderSearch Stats:
4 files tracked.
1 #import/#pragma once files.
3 included exactly once.
2 max times a file is included.
3 #include/#include_next/#import.
0 #includes skipped due to the multi-include optimization.
1 framework lookups.
0 subframework lookups.
*** Source Manager Stats:
3 files mapped, 3 mem buffers mapped.
37460 SLocEntry's allocated, 11215575B of Sloc address space used.
62 bytes of files mapped, 0 files with line #'s computed.
for the same program.
llvm-svn: 125286
- Don't publicize a C++0x feature through __has_feature if we aren't
in C++0x mode (even if the feature is available only with a
warning).
- "auto" is not implemented well enough for its __has_feature to be
turned on.
- Fix the test of C++0x __has_feature to actually test what we're
trying to test. Searching for the substring "foo" when our options
are "foo" and "no_foo" doesn't work :)
llvm-svn: 124291
and turn on __has_feature(cxx_rvalue_references). The core rvalue
references proposal seems to be fully implemented now, pending lots
more testing.
llvm-svn: 124169
Turn on the __has_feature switch for variadic templates, document
their completion, and put the ExtWarn into the c++0x-extensions
warning group.
llvm-svn: 123854
new gcc warning that complains on self-assignments and
self-initializations. Fix one bug found by the warning, in which one
clang::OverloadCandidate constructor failed to initialize its
FunctionTemplate member.
llvm-svn: 122459
Diagnostic pragmas are broken because we don't keep track of the diagnostic state changes and we only check the current/latest state.
Problems manifest if a diagnostic is emitted for a source line that has different diagnostic state than the current state; this can affect
a lot of places, like C++ inline methods, template instantiations, the lexer, etc.
Fix the issue by having the Diagnostic object keep track of the source location of the pragmas so that it is able to know what is the diagnostic state at any given source location.
Fixes rdar://8365684.
llvm-svn: 121873
pointer that is passed down through the APIs, and make
FileSystemStatCache::get be the one that filters out
directory lookups that hit files. This also paves the
way to have stat queries be able to return opened files.
llvm-svn: 120060
FileSystemOpts through a ton of apis, simplifying a lot of code.
This also fixes a latent bug in ASTUnit where it would invoke
methods on FileManager without creating one in some code paths
in cindextext.
llvm-svn: 120010
and use a better and more general approach, where NullStmt has a flag to indicate whether it was preceded by an empty macro.
Thanks to Abramo Bagnara for the hint!
llvm-svn: 119887
than a Token that holds the same information all in one easy-to-use
package. There's no technical reason to prefer the former -- the
information comes from a Token originally -- and it's clumsier to use,
so I've changed the code to use tokens everywhere.
Approved by clattner
llvm-svn: 119845
The callback info for #if/#elif is not great -- ideally it would give
us a list of tokens in the #if, or even better, a little parse tree.
But that's a lot more work. Instead, clients can retokenize using
Lexer::LexFromRawLexer().
Reviewed by nlewycky.
llvm-svn: 118318
When -working-directory is passed in command line, file paths are resolved relative to the specified directory.
This helps both when using libclang (where we can't require the user to actually change the working directory)
and to help reproduce test cases when the reproduction work comes along.
--FileSystemOptions is introduced which controls how file system operations are performed (currently it just contains
the working directory value if set).
--FileSystemOptions are passed around to various interfaces that perform file operations.
--Opening & reading the content of files should be done only through FileManager. This is useful in general since
file operations will be abstracted in the future for the reproduction mechanism.
FileSystemOptions is independent of FileManager so that we can have multiple translation units sharing the same
FileManager but with different FileSystemOptions.
Addresses rdar://8583824.
llvm-svn: 118203
load identifiers without loading their corresponding macro
definitions. This is likely to improve PCH performance slightly, and
reduces deserialization stack depth considerably when using
preprocessor metaprogramming.
llvm-svn: 117750
inclusion directives, keeping track of every #include, #import,
etc. in the translation unit. We keep track of the source location and
kind of the inclusion, how the file name was spelled, and the
underlying file to which the inclusion resolved.
llvm-svn: 116952
Now MICache is a linked list (per the FIXME), where we tradeoff between MacroInfo objects being in MICache
and MIChainHead. MacroInfo objects in the MICache chain are already "Destroy()'ed", so they can be reused. When
inserting into MICache, we need to remove them from the regular linked list so that they aren't destroyed more than
once.
llvm-svn: 116869
The problem was not the management of MacroInfo objects, but that when we recycle them
via the MICache the memory of the underlying SmallVector (within MacroInfo) was not getting
released. This is because objects stashed into MICache simply are reused with a placement
new, and never have their destructor called.
llvm-svn: 116862
list of allocated MacroInfos. This requires only 1 extra pointer per MacroInfo object, and allows us to blow them
away in one place. This fixes an elusive memory leak with MacroInfos (whose exact location I couldn't still figure
out despite substantial digging).
Fixes <rdar://problem/8361834>.
llvm-svn: 116842
spelled (#pragma, _Pragma, __pragma). In -E mode, use that information
to add appropriate newlines when translating _Pragma and __pragma into
#pragma, like GCC does. Fixes <rdar://problem/8412013>.
llvm-svn: 113553
The extra data stored on user-defined literal Tokens is stored in extra
allocated memory, which is managed by the PreprocessorLexer because there isn't
a better place to put it that makes sure it gets deallocated, but only after
it's used up. My testing has shown no significant slowdown as a result, but
independent testing would be appreciated.
llvm-svn: 112458
__debug overflow_stack'.
- For testing crash reporting stuff... you'd think I could just use some C++
code but Doug keeps fixing stuff!
llvm-svn: 109587
reparsing an ASTUnit. When saving a preamble, create a buffer larger
than the actual file we're working with but fill everything from the
end of the preamble to the end of the file with spaces (so the lexer
will quickly skip them). When we load the file, create a buffer of the
same size, filling it with the file and then spaces. Then, instruct
the lexer to start lexing after the preamble, therefore continuing the
parse from the spot where the preamble left off.
It's now possible to perform a simple preamble build + parse (+
reparse) with ASTUnit. However, one has to disable a bunch of checking
in the PCH reader to do so. That part isn't committed; it will likely
be handled with some other kind of flag (e.g., -fno-validate-pch).
As part of this, fix some issues with null termination of the memory
buffers created for the preamble; we were trying to explicitly
NULL-terminate them, even though they were also getting implicitly
NULL terminated, leading to excess warnings about NULL characters in
source files.
llvm-svn: 109445
is present.
Rather than using clang_getCursorExtent(), which requires
us to lex the token at the ending position to determine its
length. Then, we'd be comparing [a, b) source ranges that cover the
characters in the range rather than the normal behavior for Clang's
source ranges, which covers the tokens in the range. However, relexing
causes us to read the source file (which may come from a precompiled
header), which is rather unfortunate and affects performance.
In the new scheme, we only use Clang-style source ranges that cover
the tokens in the range. At the entry points where this matters
(clang_annotateTokens, clang_getCursor), we make sure to move source
locations to the start of the token.
Addresses most of <rdar://problem/8049381>.
llvm-svn: 109134
which is the part of the file that contains all of the initial
comments, includes, and preprocessor directives that occur before any
of the actual code. Added a new -print-preamble cc1 action that is
only used for testing.
llvm-svn: 108913
When loading the PCH, IdentifierInfos that are associated with pragmas cause declarations that use these identifiers to be deserialized (e.g. the "clang" pragma causes the "clang" namespace to be loaded).
We can avoid this if we just use StringRefs for the pragmas.
As a bonus, since we don't have to create and pass IdentifierInfos, the pragma interfaces get a bit more simplified.
llvm-svn: 108237
In the case of backtracking, the cached token lexer will be the only
lexer on the stack, without this the token stack will be empty and EOF
won't be returned.
This fixes PR7072.
llvm-svn: 108124
Fix string concatenation to treat escapes in concatenated strings that
are wide because of other string chunks to process the escapes as wide
themselves. Before we would warn about and miscompile the attached testcase.
This fixes rdar://8040728 - miscompile + warning: hex escape sequence out of range
llvm-svn: 106012
diagnostics. That would be while we're parsing string literals for the
sole purpose of producing a diagnostic about them. Fixes
<rdar://problem/8026030>.
llvm-svn: 104684
1) Suppress diagnostics as soon as we form the code-completion
token, so we don't get any error/warning spew from the early
end-of-file.
2) If we consume a code-completion token when we weren't expecting
one, go into a code-completion recovery path that produces the best
results it can based on the context that the parser is in.
llvm-svn: 104585
make it miss (invalid) things like:
<<<<<<<
>>>>>>>
and crash if
<<<<<<<
was at the end of the line. When we find a >>>>>>> that is not at the
end of the line, make sure to reset Pos so we don't crash on something
like:
<<<<<<< >>>>>>>
This isn't worth making testcases for, since each would require a new file.
rdar://7987078 - signal 11 compiling "<<<<<<<<<<"
llvm-svn: 103968
in an input file like this:
# 42
int x;
we were emitting:
# <something>
int x;
(with a space before the int) because we weren't clearing the leading
whitespace flag properly after the \n from the directive was handled.
llvm-svn: 101084
instantiations when we have the corresponding macro definition and by
removing macro definition information from our table when the macro is
undefined.
llvm-svn: 99004
record (which includes all macro instantiations and definitions). As
with all lay deserialization, this introduces a new external source
(here, an external preprocessing record source) that loads all of the
preprocessed entities prior to iterating over the entities.
The preprocessing record is an optional part of the precompiled header
that is disabled by default (enabled with
-detailed-preprocessing-record). When the preprocessor given to the
PCH writer has a preprocessing record, that record is written into the
PCH file. When the PCH reader is given a PCH file that contains a
preprocessing record, it will be lazily loaded (which, effectively,
implicitly adds -detailed-preprocessing-record). This is the first
case where we have sections of the precompiled header that are
added/removed based on a compilation flag, which is
unfortunate. However, this data consumes ~550k in the PCH file for
Cocoa.h (out of ~9.9MB), and there is a non-trivial cost to gathering
this detailed preprocessing information, so it's too expensive to turn
on by default. In the future, we should investigate a better encoding
of this information.
llvm-svn: 99002
eliminating the extra PopulatePreprocessingRecord object. This will
become useful once we start writing the preprocessing record to
precompiled headers.
llvm-svn: 98966
preprocessing record. Use that link with clang_getCursorReferenced()
and clang_getCursorDefinition() to match instantiations of a macro to
the definition of the macro.
llvm-svn: 98842
the macro definitions and macro instantiations that are found
during preprocessing. Preprocessing records are *not* generated by
default; rather, we provide a PPCallbacks subclass that hooks into the
existing callback mechanism to record this activity.
The only client of preprocessing records is CIndex, which keeps track
of macro definitions and instantations so that they can be exposed via
cursors. At present, only token annotation uses these facilities, and
only for macro instantiations; both will change in the near
future. However, with this change, token annotation properly annotates
macro instantiations that do not produce any tokens and instantiations
of macros that are later undef'd, improving our consistency.
Preprocessing directives that are not macro definitions are still
handled by clang_annotateTokens() via re-lexing, so that we don't have
to track every preprocessing directive in the preprocessing record.
Performance impact of preprocessing records is still TBD, although it
is limited to CIndex and therefore out of the path of the main compiler.
llvm-svn: 98836
SourceManager's getBuffer() and, therefore, could fail, along with
Preprocessor::getSpelling(). Use the Invalid parameters in the literal
parsers (string, floating point, integral, character) to make them
robust against errors that stem from, e.g., PCH files that are not
consistent with the underlying file system.
I still need to audit every use caller to all of these routines, to
determine which ones need specific handling of error conditions.
llvm-svn: 98608
SourceManager's getBuffer() (and similar) operations. This abstract
can be used to force callers to cope with errors in getBuffer(), such
as missing files and changed files. Fix a bunch of callers to use the
new interface.
Add some very basic checks for file consistency (file size,
modification time) into ContentCache::getBuffer(), although these
checks don't help much until we've updated the main callers (e.g.,
SourceManager::getSpelling()).
llvm-svn: 98585
guard macro is already defined for the first occurrence of the
header. If it is, the body will be skipped and not be properly
analyzed for the include guard optimization.
llvm-svn: 95972
doing so invalidates the file guard optimization and is not
in the spirit of "#if 0" because it is supposed to completely
skip everything, even if it isn't lexically valid. Patch by
Abramo Bagnara!
llvm-svn: 95253
region of interest (if provided). Implement clang_getCursor() in terms
of this traversal rather than using the Index library; the unified
cursor visitor is more complete, and will be The Way Forward.
Minor other tweaks needed to make this work:
- Extend Preprocessor::getLocForEndOfToken() to accept an offset
from the end, making it easy to move to the last character in the
token (rather than just past the end of the token).
- In Lexer::MeasureTokenLength(), the length of whitespace is zero.
llvm-svn: 94200
disabled with the intent that users can start with them now and not have to change
a thing to have them work when we implement the features.
llvm-svn: 93312
incompatible with user-defined literals, specifically with the following form:
0x1p+1
The preprocessing-number token extends only as far as the 'p'; the '+' is not
included. Previously we could get away with this extension as p was an invalid
suffix, but now with user-defined literals, 'p' might well be a valid suffix
and we are forced to consider it as such.
This patch also adds a warning in non-0x C++ modes telling the user that
this extension is incompatible with C++0x that is enabled by default
(previously and with other languages, we warn only with a compliance
option such as -pedantic).
llvm-svn: 93135
definitions from a precompiled header. This ensures that
code-completion with macro names behaves the same with or without
precompiled headers.
llvm-svn: 92497
1. Don't make a copy of LangOptions every time a lexer is created.
2. Don't make CharInfo global mutable state.
3. Fix the implementation to properly treat ^Z as EOF instead of as
horizontal whitespace, which matches the semantic implemented by VC++.
llvm-svn: 91586
on PR5610 (2.185 -> 2.130s). The big issue is that this is making
insanely huge macro argument lists with over a million tokens in it.
The reason that mallco and free are so expensive is that we are
actually going to the kernel to get it, and switching to a bump
pointer allocator won't change this in an interesting way.
llvm-svn: 91449
We creating and free thousands of MacroArgs objects (and the related
std::vectors hanging off them) for the testcase in PR5610 even though
there are only ~20 live at a time. This doesn't actually use the
cache yet.
llvm-svn: 91391
expanding directives withing macro expansions. This is undefined behavior
according to 6.10.3p11, so we might as well be undefined in ways similar to
GCC.
llvm-svn: 91266
files with the contents of an arbitrary memory buffer. Use this new
functionality to drastically clean up the way in which we handle file
truncation for code-completion: all of the truncation/completion logic
is now encapsulated in the preprocessor where it belongs
(<rdar://problem/7434737>).
llvm-svn: 90300
in diagnostics when we fail to open a file. This allows us to
report things like:
$ clang test.c -I.
test.c:2:10: fatal error: error opening file './foo.h': Permission denied
#include "foo.h"
^
llvm-svn: 90276
stat a file but where mmaping it fails. In this case, we emit an
error like:
t.c:1:10: fatal error: error opening file '../../foo.h'
instead of "cannot find file".
llvm-svn: 90110
annotation token, because some of the tokens we're annotating might
not be in the set of cached tokens (we could have consumed them
unconditionally).
Also, move the tentative parsing from ParseTemplateTemplateArgument
into the one caller that needs it, improving recovery.
llvm-svn: 86904
a little fuzzy, but conceptually it's just uniquing the identifier.
Chris, please review. I debated splitting into const/non-const versions where
the const one propogated constness to the resulting IdentifierInfo*.
llvm-svn: 86106
only supporting a single stat cache. The immediate benefit of this
change is that we can now generate a PCH/AST file when including
another PCH file; in the future, the chain of stat caches will likely
be useful with multiple levels of PCH files.
llvm-svn: 84263
-code-completion-at=filename:line:column
which performs code completion at the specified location by truncating
the file at that position and enabling code completion. This approach
makes it possible to run multiple tests from a single test file, and
gives a more natural command-line interface.
llvm-svn: 82571
essence, code completion is triggered by a magic "code completion"
token produced by the lexer [*], which the parser recognizes at
certain points in the grammar. The parser then calls into the Action
object with the appropriate CodeCompletionXXX action.
Sema implements the CodeCompletionXXX callbacks by performing minimal
translation, then forwarding them to a CodeCompletionConsumer
subclass, which uses the results of semantic analysis to provide
code-completion results. At present, only a single, "printing" code
completion consumer is available, for regression testing and
debugging. However, the design is meant to permit other
code-completion consumers.
This initial commit contains two code-completion actions: one for
member access, e.g., "x." or "p->", and one for
nested-name-specifiers, e.g., "std::". More code-completion actions
will follow, along with improved gathering of code-completion results
for the various contexts.
[*] In the current -code-completion-dump testing/debugging mode, the
file is truncated at the completion point and EOF is translated into
"code completion".
llvm-svn: 82166
declaration in the AST.
The new ASTContext::getCommentForDecl function searches for a comment
that is attached to the given declaration, and returns that comment,
which may be composed of several comment blocks.
Comments are always available in an AST. However, to avoid harming
performance, we don't actually parse the comments. Rather, we keep the
source ranges of all of the comments within a large, sorted vector,
then lazily extract comments via a binary search in that vector only
when needed (which never occurs in a "normal" compile).
Comments are written to a precompiled header/AST file as a blob of
source ranges. That blob is only lazily loaded when one requests a
comment for a declaration (this never occurs in a "normal" compile).
The indexer testbed now supports comment extraction. When the
-point-at location points to a declaration with a Doxygen-style
comment, the indexer testbed prints the associated comment
block(s). See test/Index/comments.c for an example.
Some notes:
- We don't actually attempt to parse the comment blocks themselves,
beyond identifying them as Doxygen comment blocks to associate them
with a declaration.
- We won't find comment blocks that aren't adjacent to the
declaration, because we start our search based on the location of
the declaration.
- We don't go through the necessary hops to find, for example,
whether some redeclaration of a declaration has comments when our
current declaration does not. Similarly, we don't attempt to
associate a \param Foo marker in a function body comment with the
parameter named Foo (although that is certainly possible).
- Verification of my "no performance impact" claims is still "to be
done".
llvm-svn: 74704
with dos style newlines. I have a trivial test for this:
// RUN: clang-cc %s -verify
#define test(x, y) \
x ## y
but I don't know how to get svn to not change newlines and testrunner
doesn't work with dos style newlines either, so "not worth it". :)
rdar://6994000
llvm-svn: 73945
line, and when the pragma is at the end of a file. In this case, the last
token consumed could pop the lexer, invalidating CurPPLexer. Thanks to
Peter Thoman for pointing it out.
llvm-svn: 73689
registered when PCH wasn't being used. We should always install (in BuiltinInfo)
information about target-specific builtins, but we shouldn't register any builtin
identifier infos. This fixes the build of apps that use PCH and target specific
builtins together.
llvm-svn: 73492
builtin preprocessor macro. This appears to work with two caveats:
1) builtins are registered in -E mode, and 2) target-specific builtins
are unconditionally registered even if they aren't supported by the
target (e.g. SSE4 builtin when only SSE1 is enabled).
llvm-svn: 73289
diagnostic to include the full instantiation location for the
invalid paste. For:
#define foo(a, b) a ## b
#define bar(x) foo(x, ])
bar(a)
bar(zdy)
Instead of:
t.c:3:22: error: pasting formed 'a]', an invalid preprocessing token
#define foo(a, b) a ## b
^
t.c:3:22: error: pasting formed 'zdy]', an invalid preprocessing token
we now produce:
t.c:7:1: error: pasting formed 'a]', an invalid preprocessing token
bar(a)
^
t.c:4:16: note: instantiated from:
#define bar(x) foo(x, ])
^
t.c:3:22: note: instantiated from:
#define foo(a, b) a ## b
^
t.c:8:1: error: pasting formed 'zdy]', an invalid preprocessing token
bar(zdy)
^
t.c:4:16: note: instantiated from:
#define bar(x) foo(x, ])
^
t.c:3:22: note: instantiated from:
#define foo(a, b) a ## b
^
llvm-svn: 72519
1. When we accept "#garbage" in asm-with-cpp mode, change the token kind
of the # to unknown so that the preprocessor won't try to process it as
a real #. This fixes a crash on the attached example
2. Fix macro definition extents processing to handle #foo at the end of a
macro to say the definition ends with the foo, not the #.
This is a follow-on fix to r72283, and rdar://6916026
llvm-svn: 72388
two empty arguments. Also, add an assert so that this bug
manifests as an assertion failure, not a valgrind problem.
This fixes rdar://6880648 - [cpp] crash in ArgNeedsPreexpansion
llvm-svn: 71616
that if we're going to print an extension warning anyway,
there's no point to changing behavior based on NoExtensions: it will
only make error recovery worse.
Note that this doesn't cause any behavior change because NoExtensions
isn't used by the current front-end. I'm still considering what to do about
the remaining use of NoExtensions in IdentifierTable.cpp.
llvm-svn: 70273
PCH file. In the Cocoa-prefixed "Hello, World" benchmark, this takes
us from reading 503 identifiers down to 37 and from 470 macros down to
4. It also results in an 8% performance improvement.
llvm-svn: 70094
will let us test for multiple different warning modes in the same
file in regression tests.
This implements rdar://2362963, a 10-year old feature request :)
llvm-svn: 69560
support it. I don't know what evaluation method we use for complex
arithmetic, so I don't know whether/if we should warn about use of
CX_LIMITED_RANGE.
This concludes my planned hacking on STDC pragmas, flame away :)
llvm-svn: 69556
in a function-like macro body. This has the added bonus of moving some
function-like macro specific code out of the object-like macro codepath.
llvm-svn: 69530
as decimal, even if it starts with 0. Also, since things like 0x1 are
completely illegal, don't even bother using numericliteralparser for them.
llvm-svn: 69454
Highlights: PP::isNextPPTokenLParen() no longer eats the (
when present. We now simplify slightly the logic parsing
macro arguments. We now handle PR3937 and other related cases
correctly.
llvm-svn: 69411
This allows it to accurately measure tokens, so that we get:
t.cpp:8:13: error: unknown type name 'X'
static foo::X P;
~~~~~^
instead of the woefully inferior:
t.cpp:8:13: error: unknown type name 'X'
static foo::X P;
~~~~ ^
Most of this is just plumbing to push the reference around.
llvm-svn: 69099
t.c:3:8: warning: extra tokens at end of #endif directive
#endif foo
^
//
Don't do this in strict-C89 mode because bcpl comments aren't
valid there, and it is too much trouble to analyze whether
C block comments are safe.
llvm-svn: 69024