ModuleManager::visit() by keeping a free list of the two data
structures used to store state (a preallocated stack and a visitation
number vector). Improves -fsyntax-only performance for my modules test
case by 2.8%. Modules has pulled ahead by almost 10% with the global
module index.
llvm-svn: 173692
index, optimizing the operation that skips lookup in modules where we
know the identifier will not be found. This makes the global module
index optimization actually useful, providing an 8.5% speedup over
modules without the global module index for -fsyntax-only.
llvm-svn: 173529
and limiting ourselves to two memory allocations. 10% speedup in
-fsyntax-only time for modules.
With this change, we can actually see some performance different from
the global module index, but it's still about 1%.
llvm-svn: 173512
AST reader.
The global module index tracks all of the identifiers known to a set
of module files. Lookup of those identifiers looks first in the global
module index, which returns the set of module files in which that
identifier can be found. The AST reader only needs to look into those
module files and any module files not known to the global index (e.g.,
because they were (re)built after the global index), reducing the
number of on-disk hash tables to visit. For an example source I'm
looking at, we go from 237844 total identifier lookups into on-disk
hash tables down to 126817.
Unfortunately, this does not translate into a performance advantage.
At best, it's a wash once the global module index has been built, but
that's ignore the cost of building the global module index (which
is itself fairly large). Profiles show that the global module index
code is far less efficient than it should be; optimizing it might give
enough of an advantage to justify its continued inclusion.
llvm-svn: 173405
The global module index is a "global" index for all of the module
files within a particular subdirectory in the module cache, which
keeps track of all of the "interesting" identifiers and selectors
known in each of the module files. One can perform a fast lookup in
the index to determine which module files will have more information
about entities with a particular name/selector. This information can
help eliminate redundant lookups into module files (a serious
performance problem) and help with creating auto-import/auto-include
Fix-Its.
The global module index is created or updated at the end of a
translation unit that has triggered a (re)build of a module by
scraping all of the .pcm files out of the module cache subdirectory,
so it catches everything. As with module rebuilds, we use the file
system's atomicity to synchronize.
llvm-svn: 173301
This change also makes the serialisation store the required semantics,
fixing an issue where PPC128 was always assumed when re-reading a
128-bit value.
llvm-svn: 173139
forming the identifier, e.g., as part of a selector or a declaration
name, don't actually deserialize any information about the
identifier. Instead, simply mark it "out-of-date" and we'll load the
the information on demand. 2% speedup on the modules testcase I'm
looking at; should also help PCH.
llvm-svn: 173056
DeclContext. When the DeclContext is of a kind that can only be
defined once and never updated, we limit the search to the module file
that conatins the lookup table. Provides a 15% speedup in one
modules-heavy source file.
llvm-svn: 173050
Makes sure that a deserialized macro is only added to the preprocessor macro definitions only once.
Unfortunately I couldn't get a reduced test case.
rdar://13016031
llvm-svn: 172843
Previously we would serialize the macro redefinitions as a list, part of
the identifier, and try to chain them together across modules individually
without having the info that they were already chained at definition time.
Change this by serializing the macro redefinition chain and then try
to synthesize the chain parts across modules. This allows us to correctly
pinpoint when 2 different definitions are ambiguous because they came from
unrelated modules.
Fixes bogus "ambiguous expansion of macro" warning when a macro in a PCH
is redefined without undef'ing it first.
rdar://13016031
llvm-svn: 172620
metadata for linking against the libraries/frameworks for imported
modules.
The module map language is extended with a new "link" directive that
specifies what library or framework to link against when a module is
imported, e.g.,
link "clangAST"
or
link framework "MyFramework"
Importing the corresponding module (or any of its submodules) will
eventually link against the named library/framework.
For now, I've added some placeholder global metadata that encodes the
imported libraries/frameworks, so that we can test that this
information gets through to the IR. The format of the data is still
under discussion.
llvm-svn: 172437
which a particular declaration resides. Use this information to
customize the "definition of 'blah' must be imported from another
module" diagnostic with the module the user actually has to
import. Additionally, recover by importing that module, so we don't
complain about other names in that module.
Still TODO: coming up with decent Fix-Its for these cases, and expand
this recovery approach for other name lookup failures.
llvm-svn: 172290
don't crash when loading a PCH with the older format.
The introduction of the control block broke compatibility with PCHs from
older versions. This patch allows loading (and rejecting) PCHs from an older
version and allows newer PCHs to be rejected from older clang versions as well.
rdar://12821386
llvm-svn: 170150
module, provide a module import stack similar to what we would get for
an include stack, e.g.,
In module 'DependsOnModule' imported from build-fail-notes.m:4:
In module 'Module' imported from DependsOnModule.framework/Headers/DependsOnModule.h:1:
Inputs/Module.framework/Headers/Module.h:15:12: note: previous definition is here
@interface Module
<rdar://problem/12696425>
llvm-svn: 169042
allocated using the allocator associated with an ASTContext.
Use this inside CXXRecordDecl::DefinitionData instead of an UnresolvedSet to
avoid a potential memory leak.
rdar://12761275
llvm-svn: 168771
The stat cache became essentially useless ever since we started
validating all file entries in the PCH.
But the motivating reason for removing it now is that it also affected
correctness in this situation:
-You have a header without include guards (using "#pragma once" or #import)
-When creating the PCH:
-The same header is referenced in an #include with different filename cases.
-In the PCH, of course, we record only one file entry for the header file
-But we cache in the PCH file the stat info for both filename cases
-Then the source files are updated and the header file is updated in a way that
its size and modification time are the same but its inode changes
-When using the PCH:
-We validate the headers, we check that header file and we create a file entry with its current inode
-There's another #include with a filename with different case than the previously created file entry
-In order to get its stat info we go through the cached stat info of the PCH and we receive the old inode
-because of the different inodes, we think they are different files so we go ahead and include its contents.
Removing the stat cache will potentially break clients that are attempting to use the stat cache
as a way of avoiding having the actual input files available. If that use case is important, patches are welcome
to bring it back in a way that will actually work correctly (i.e., emit a PCH that is self-contained, coping with
literal strings, line/column computations, etc.).
This fixes rdar://5502805
llvm-svn: 167172
the macros that are #define'd or #undef'd on the command line. This
checking happens much earlier than the current macro-definition
checking and is far cleaner, because it does a direct comparison
rather than a diff of the predefines buffers. Moreover, it allows us
to use the result of this check to skip over PCH files within a
directory that have non-matching -D's or -U's on the command
line. Finally, it improves the diagnostics a bit for mismatches,
fixing <rdar://problem/8612222>.
The old predefines-buffer diff'ing will go away in a subsequent commit.
llvm-svn: 166641
check each of the files within that directory to determine if any of
them is an AST file that matches the language and target options. If
so, the first matching AST file is loaded. This fixes a longstanding
discrepency with GCC's precompiled header implementation.
llvm-svn: 166469
failures they know how to tolerate, e.g., out-of-date input files or
configuration/version mismatches. Suppress the corresponding
diagnostics if the client can handle it.
No clients actually use this functionality, yet.
llvm-svn: 166449
manager block and input-file information in the control block. The
source manager entries now point back into the control block. Input
files are now lazily deserialized (if validation is disabled). Reduces
Cocoa's PCH by the ~70k I added when I introduced the redundancy in
r166251.
llvm-svn: 166429
block, so the input files are validated early on, before we've
committed to loading the AST file. This (accidentally) fixed a but
wherein the main file used to generate the AST file would *not* be
validated by the existing validation logic.
At the moment, this leads to some duplication of filenames between the
source manager block and input-file blocks, as well as validation
logic. This will be handled via an upcoming patch.
llvm-svn: 166251
block, which stores information about how the AST file to generated,
from the AST block, which stores the actual serialized AST. The
information in the control block should be enough to determine whether
the AST file is up-to-date and compatible with the current translation
unit, and reading it should not cause any side effects that aren't
easy to undo. That way, we can back out from an attempt to read an
incompatible or out-of-date AST file.
Note that there is still more factoring to do. In particular,
information about the source files used to generate the AST file
(along with their time stamps, sizes, etc.) still resides in the
source manager block.
llvm-svn: 166166
description. Previously, one could emulate this behavior by placing
the header in an always-unavailable submodule, but Argyrios guilted me
into expressing this idea properly.
llvm-svn: 165921
macro history.
When deserializing macro history, we arrange history such that the
macros that have definitions (that haven't been #undef'd) and are
visible come at the beginning of the list, which is what the
preprocessor and other clients of Preprocessor::getMacroInfo()
expect. If additional macro definitions become visible later, they'll
be moved toward the front of the list. Note that it's possible to have
ambiguities, but we don't diagnose them yet.
There is a partially-implemented design decision here that, if a
particular identifier has been defined or #undef'd within the
translation unit, that definition (or #undef) hides any macro
definitions that come from imported modules. There's still a little
work to do to ensure that the right #undef'ing happens.
Additionally, we'll need to scope the update records for #undefs, so
they only kick in when the submodule containing that update record
becomes visible.
llvm-svn: 165682
MacroInfo*. Instead of simply dumping an offset into the current file,
give each macro definition a proper ID with all of the standard
modules-remapping facilities. Additionally, when a macro is modified
in a subsequent AST file (e.g., #undef'ing a macro loaded from another
module or from a precompiled header), provide a macro update record
rather than rewriting the entire macro definition. This gives us
greater consistency with the way we handle declarations, and ties
together macro definitions much more cleanly.
Note that we're still not actually deserializing macro history (we
never were), but it's far easy to do properly now.
llvm-svn: 165560
whether that function/method already has a body (loaded from some
other AST file), as introduced in r165137. Delay this check until
after the redeclaration chains have been wired up.
While I'm here, make the loading of method bodies lazy.
llvm-svn: 165513
Check whether a pending instantiation needs to be instantiated (or whether an instantiation already exists).
Verify the size of the PendingInstantiations record (was only checking size of existing PendingInstantiations).
Migrate Obj-C++ part of redecl-merge into separate test, now that this is growing.
templates.mm: test that CodeGen has seen exactly one definition of template instantiations.
redecl-merge.m: use "@" specifier for expected-diagnostics.
llvm-svn: 164993
unexpanded parameter pack is a pack expansion. Thus, as with a non-type template
parameter which is a pack expansion, it needs to be expanded early into a fixed
list of template parameters.
Since the expanded list of template parameters is not itself a parameter pack,
it is permitted to appear before the end of the template parameter list, so also
remove that restriction (for both template template parameter pack expansions and
non-type template parameter pack expansions).
llvm-svn: 163369
(__builtin_* etc.) so that it isn't possible to take their address.
Specifically, introduce a new type to represent a reference to a builtin
function, and a new cast kind to convert it to a function pointer in the
operand of a call. Fixes PR13195.
llvm-svn: 162962
coming from an AST file are registered for serialization.
A static data member instantiation of in a chained PCH could be missed
when serializing decls; the result was that when emitting the visible decls
map of its DeclContext, we would use a DeclID that was not actually emitted,
leading to crashes or hangs.
Fix this by making sure such decls are always registered for serialization.
Also introduce extra sanity checks to make sure we don't register new
declarations or types after we have serialized the types/decls block.
rdar://11728990
llvm-svn: 159550
For some targets a structure named __va_list_tag is built to help define
the __builtin_va_list type. However, __va_list_tag was not being treated as a
predefined type thus causing problems when serializing the AST. This commit
fixes that oversight by adding the necessary support to treat __va_list_tag
as a predefined type.
llvm-svn: 159508
target Objective-C runtime down to the frontend: break this
down into a single target runtime kind and version, and compute
all the relevant information from that. This makes it
relatively painless to add support for new runtimes to the
compiler. Make the new -cc1 flag, -fobjc-runtime=blah-x.y.z,
available at the driver level as a better and more general
alternative to -fgnu-runtime and -fnext-runtime. This new
concept of an Objective-C runtime also encompasses what we
were previously separating out as the "Objective-C ABI", so
fragile vs. non-fragile runtimes are now really modelled as
different kinds of runtime, paving the way for better overall
differentiation.
As a sort of special case, continue to accept the -cc1 flag
-fobjc-runtime-has-weak, as a sop to PLCompatibilityWeak.
I won't go so far as to say "no functionality change", even
ignoring the new driver flag, but subtle changes in driver
semantics are almost certainly not intended.
llvm-svn: 158793
* Retain comments in the AST
* Serialize/deserialize comments
* Find comments attached to a certain Decl
* Expose raw comment text and SourceRange via libclang
llvm-svn: 158771
This reduces the number of warnings generated by Doxygen by about 100
(roughly 10%). Issues addressed:
(1) Primarily, backslash-escaped "@foo" and "#bah" in Doxygen comments
when they're not supposed to be Doxygen commands or links, and
similarly for "<baz>" when it's not intended as as HTML tag;
(2) Changed some \t commands (which don't exist) to \c ("to refer to a
word of code", as the Doxygen manual says);
(3) \precondition becomes \pre;
(4) When touching comments, deleted a couple of spurious spaces in them;
(5) Changed some \n and \r to \\n and \\r;
(6) Fixed one tiny typo: #pragms -> #pragma.
This patch touches documentation/comments only.
llvm-svn: 158422
such as "protocol" and "expression" being implicitly turned into links to
mistakenly-generated Doxygen pages:
- Escaping @ symbols when Doxygen would otherwise incorrectly interpret them;
- Escaping # symbols when they're not intended as explicit Doxygen link
requests, such as when discussing preprocessor directives;
- In one odd case, unescaping @ in @__experimental_modules_import, because
Doxygen wrote '\@' to the output in that case, causing the example in the
description of ImportDecl to be wrong; and
- Fixing a typo: @breif -> @brief.
llvm-svn: 158299
includes a patch from Matthias Kleine with a regression testcase!
Adds a new iterator 'data_iterator' to OnDiskHashTable which doesn't try to
reconstruct the external_key from the internal_key, which is useful for traits
that don't store enough information to do that mapping in their key. Also
deletes the 'item_iterator' from OnDiskHashTable as dead code.
llvm-svn: 154784
attached. Since we do not support any attributes which appertain to a statement
(yet), testing of this is necessarily quite minimal.
Patch by Alexander Kornienko!
llvm-svn: 154723
track whether the referenced declaration comes from an enclosing
local context. I'm amenable to suggestions about the exact meaning
of this bit.
llvm-svn: 152491
analysis to make the AST representation testable. They are represented by a
new UserDefinedLiteral AST node, which is a sugared CallExpr. All semantic
properties, including full CodeGen support, are achieved for free by this
representation.
UserDefinedLiterals can never be dependent, so no custom instantiation
behavior is required. They are mangled as if they were direct calls to the
underlying literal operator. This matches g++'s apparent behavior (but not its
actual mangling, which is broken for literal-operator-ids).
User-defined *string* literals are now fully-operational, but the semantic
analysis is quite hacky and needs more work. No other forms of user-defined
literal are created yet, but the AST support for them is present.
This patch committed after midnight because we had already hit the quota for
new kinds of literal yesterday.
llvm-svn: 152211
compiler errors or not.
-Control whether ASTReader should reject such a PCH by a boolean flag at ASTReader's creation time.
By default, such a PCH file will be rejected with an error when trying to load it.
[libclang] Allow clang_saveTranslationUnit to create a PCH file even if compiler errors
occurred.
-Have libclang API calls accept a PCH that had compiler errors.
The general idea is that we want libclang to stay functional even if a PCH had a compiler error.
rdar://10976363.
llvm-svn: 152192
NSNumber, and boolean literals. This includes both Sema and Codegen support.
Included is also support for new Objective-C container subscripting.
My apologies for the large patch. It was very difficult to break apart.
The patch introduces changes to the driver as well to cause clang to link
in additional runtime support when needed to support the new language features.
Docs are forthcoming to document the implementation and behavior of these features.
llvm-svn: 152137
- This reduces our total # of allocations building a PCH for Cocoa.h by almost
a whopping 50%.
- A SmallPtrMap would be cleaner, but since we don't have one yet...
llvm-svn: 151697
that provides the behavior of the C++11 library trait
std::is_trivially_constructible<T, Args...>, which can't be
implemented purely as a library.
Since __is_trivially_constructible can have zero or more arguments, I
needed to add Yet Another Type Trait Expression Class, this one
handling arbitrary arguments. The next step will be to migrate
UnaryTypeTrait and BinaryTypeTrait over to this new, more general
TypeTrait class.
Fixes the Clang side of <rdar://problem/10895483> / PR12038.
llvm-svn: 151352
We were passing a decl to the consumer after all pending deserializations were finished
but this was not enough; due to processing by the consumer we may end up into yet another
deserialization process but the way FinishedDeserializing() was setup we would not ensure
that everything was fully deserialized before returning to the consumer.
Separate ASTReader::FinishedDeserializing() into two semantic actions.
The first is ensuring that a deserialization process ends up will fully deserialized decls/types even
if the process is started by the consumer.
The second is pushing "interesting" decls to the consumer; we make sure that we don't re-enter this
section recursively be checking a variable.
llvm-svn: 150160
This seems to negatively affect compile time onsome ObjC tests
(which use a lot of partial diagnostics I assume). I have to come
up with a way to keep them inline without including Diagnostic.h
everywhere. Now adding a new diagnostic requires a full rebuild
of e.g. the static analyzer which doesn't even use those diagnostics.
This reverts commit 6496bd10dc3a6d5e3266348f08b6e35f8184bc99.
This reverts commit 7af19b817ba964ac560b50c1ed6183235f699789.
This reverts commit fdd15602a42bbe26185978ef1e17019f6d969aa7.
This reverts commit 00bd44d5677783527d7517c1ffe45e4d75a0f56f.
This reverts commit ef9b60ffed980864a8db26ad30344be429e58ff5.
llvm-svn: 150006
Fix all the files that depended on transitive includes of Diagnostic.h.
With this patch in place changing a diagnostic no longer requires a full rebuild of the StaticAnalyzer.
llvm-svn: 149781
the direct serialization of the linked-list structure. Instead, use a
scheme similar to how we handle redeclarations, with redeclaration
lists on the side. This addresses several issues:
- In cases involving mixing and matching of many categories across
many modules, the linked-list structure would not be consistent
across different modules, and categories would get lost.
- If a module is loaded after the class definition and its other
categories have already been loaded, we wouldn't see any categories
in the newly-loaded module.
llvm-svn: 149112
return pre-built lists. Instead, it feeds the methods it deserializes
to Sema so that Sema can unique them, which keeps the chains shorter.
llvm-svn: 148889
generational scheme for identifiers that avoids searching the hash
tables of a given module more than once for a given
identifier. Previously, loading any new module invalidated all of the
previous lookup results for all identifiers, causing us to perform the
lookups repeatedly.
llvm-svn: 148412
protocol, record the definition pointer in the canonical declaration
for that entity, and then propagate that definition pointer from the
canonical declaration to all other deserialized declarations. This
approach works well even when deserializing declarations that didn't
know about the original definition, which can occur with modules.
A nice bonus from this definition-deserialization approach is that we
no longer need update records when a definition is added, because the
redeclaration chains ensure that the if any declaration is loaded, the
definition will also get loaded.
llvm-svn: 148223
chains, again. The prior implementation was very linked-list oriented, and
the list-splicing logic was both fairly convoluted (when loading from
multiple modules) and failed to preserve a reasonable ordering for the
redeclaration chains.
This new implementation uses a simpler strategy, where we store the
ordered redeclaration chains in an array-like structure (indexed based
on the first declaration), and use that ordering to add individual
deserialized declarations to the end of the existing chain. That way,
the chain mimics the ordering from its modules, and a bug somewhere is
far less likely to result in a broken linked list.
llvm-svn: 148222
storage for the global declaration ID. Declarations that are parsed
(rather than deserialized) are unaffected, so the number of
declarations that pay this cost tends to be relatively small (since
relatively few declarations are ever deserialized).
This replaces a largish DenseMap within the AST reader. It's not
strictly a win in terms of memory use---not every declaration was
added to that DenseMap in the first place---but it's cleaner to have
this information available for every deserialized declaration, so that
future clients can rely on it.
llvm-svn: 147617
features needed for a particular module to be available. This allows
mixed-language modules, where certain headers only work under some
language variants (e.g., in C++, std.tuple might only be available in
C++11 mode).
llvm-svn: 147387
set of (previously-canonical) declaration IDs to the module file, so
that future AST reader instances that load the module know which
declarations are merged. This is important in the fairly tricky case
where a declaration of an entity, e.g.,
@class X;
occurs before the import of a module that also declares that
entity. We merge the declarations, and record the fact that the
declaration of X loaded from the module was merged into the (now
canonical) declaration of X that we parsed.
llvm-svn: 147181
declaration of that same class that either came from some other module
or occurred in the translation unit loading the module. In this case,
we need to merge the two redeclaration chains immediately so that all
such declarations have the same canonical declaration in the resulting
AST (even though they don't in the module files we've imported).
Focusing on Objective-C classes until I'm happy with the design, then
I'll both (1) extend this notion to other kinds of declarations, and
(2) optimize away this extra checking when we're not dealing with
modules. For now, doing this checking for PCH files/preambles gives us
better testing coverage.
llvm-svn: 147123
with a definition pointer (e.g., C++ and Objective-C classes), zip
through the redeclaration chain to make sure that all of the
declarations point to the definition data.
As part of this, realized again why the first redeclaration of an
entity in a file is important, and brought back that idea.
llvm-svn: 146886
imported modules that don't introduce any new entities of a particular
kind. Allow these entries to be replaced with entries for another
loaded module.
In the included test case, selectors exhibit this behavior.
llvm-svn: 146870
chains. The previous implementation relied heavily on the declaration
chain being stored as a (circular) linked list on disk, as it is in
memory. However, when deserializing from multiple modules, the
different chains could get mixed up, leading to broken declaration chains.
The new solution keeps track of the first and last declarations in the
chain for each module file. When we load a declaration, we search all
of the module files for redeclarations of that declaration, then
splice together all of the lists into a coherent whole (along with any
redeclarations that were actually parsed).
As a drive-by fix, (de-)serialize the redeclaration chains of
TypedefNameDecls, which had somehow gotten missed previously. Add a
test of this serialization.
This new scheme creates a redeclaration table that is fairly large in
the PCH file (on the order of 400k for Cocoa.h's 12MB PCH file). The
table is mmap'd in and searched via a binary search, but it's still
quite large. A future tweak will eliminate entries for declarations
that have no redeclarations anywhere, and should
drastically reduce the size of this table.
llvm-svn: 146841