forked from OSchip/llvm-project
				
			
		
			
				
	
	
		
			533 lines
		
	
	
		
			25 KiB
		
	
	
	
		
			HTML
		
	
	
	
			
		
		
	
	
			533 lines
		
	
	
		
			25 KiB
		
	
	
	
		
			HTML
		
	
	
	
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 | 
						|
          "http://www.w3.org/TR/html4/strict.dtd">
 | 
						|
<html>
 | 
						|
<head>
 | 
						|
  <title>Precompiled Headers (PCH)</title>
 | 
						|
  <link type="text/css" rel="stylesheet" href="../menu.css">
 | 
						|
  <link type="text/css" rel="stylesheet" href="../content.css">
 | 
						|
  <style type="text/css">
 | 
						|
    td {
 | 
						|
    vertical-align: top;
 | 
						|
    }
 | 
						|
  </style>
 | 
						|
</head>
 | 
						|
 | 
						|
<body>
 | 
						|
 | 
						|
<!--#include virtual="../menu.html.incl"-->
 | 
						|
 | 
						|
<div id="content">
 | 
						|
 | 
						|
<h1>Precompiled Headers</h1>
 | 
						|
 | 
						|
  <p>This document describes the design and implementation of Clang's
 | 
						|
  precompiled headers (PCH). If you are interested in the end-user
 | 
						|
  view, please see the <a
 | 
						|
   href="UsersManual.html#precompiledheaders">User's Manual</a>.</p>
 | 
						|
 | 
						|
  <p><b>Table of Contents</b></p>
 | 
						|
  <ul>
 | 
						|
    <li><a href="#usage">Using Precompiled Headers with
 | 
						|
    <tt>clang</tt></a></li>
 | 
						|
    <li><a href="#philosophy">Design Philosophy</a></li>
 | 
						|
    <li><a href="#contents">Precompiled Header Contents</a>
 | 
						|
      <ul>
 | 
						|
        <li><a href="#metadata">Metadata Block</a></li>
 | 
						|
        <li><a href="#sourcemgr">Source Manager Block</a></li>
 | 
						|
        <li><a href="#preprocessor">Preprocessor Block</a></li>
 | 
						|
        <li><a href="#types">Types Block</a></li>
 | 
						|
        <li><a href="#decls">Declarations Block</a></li>
 | 
						|
        <li><a href="#stmt">Statements and Expressions</a></li>
 | 
						|
        <li><a href="#idtable">Identifier Table Block</a></li>
 | 
						|
        <li><a href="#method-pool">Method Pool Block</a></li>
 | 
						|
      </ul>
 | 
						|
    </li>
 | 
						|
    <li><a href="#tendrils">Precompiled Header Integration
 | 
						|
    Points</a></li>
 | 
						|
</ul>
 | 
						|
    
 | 
						|
<h2 id="usage">Using Precompiled Headers with <tt>clang</tt></h2>
 | 
						|
 | 
						|
<p>The Clang compiler frontend, <tt>clang -cc1</tt>, supports two command line
 | 
						|
options for generating and using PCH files.<p>
 | 
						|
 | 
						|
<p>To generate PCH files using <tt>clang -cc1</tt>, use the option
 | 
						|
<b><tt>-emit-pch</tt></b>:
 | 
						|
 | 
						|
<pre> $ clang -cc1 test.h -emit-pch -o test.h.pch </pre>
 | 
						|
 | 
						|
<p>This option is transparently used by <tt>clang</tt> when generating
 | 
						|
PCH files. The resulting PCH file contains the serialized form of the
 | 
						|
compiler's internal representation after it has completed parsing and
 | 
						|
semantic analysis. The PCH file can then be used as a prefix header
 | 
						|
with the <b><tt>-include-pch</tt></b> option:</p>
 | 
						|
 | 
						|
<pre>
 | 
						|
  $ clang -cc1 -include-pch test.h.pch test.c -o test.s
 | 
						|
</pre>
 | 
						|
 | 
						|
<h2 id="philosophy">Design Philosophy</h2>
 | 
						|
  
 | 
						|
<p>Precompiled headers are meant to improve overall compile times for
 | 
						|
  projects, so the design of precompiled headers is entirely driven by
 | 
						|
  performance concerns. The use case for precompiled headers is
 | 
						|
  relatively simple: when there is a common set of headers that is
 | 
						|
  included in nearly every source file in the project, we
 | 
						|
  <i>precompile</i> that bundle of headers into a single precompiled
 | 
						|
  header (PCH file). Then, when compiling the source files in the
 | 
						|
  project, we load the PCH file first (as a prefix header), which acts
 | 
						|
  as a stand-in for that bundle of headers.</p>
 | 
						|
 | 
						|
<p>A precompiled header implementation improves performance when:</p>
 | 
						|
<ul>
 | 
						|
  <li>Loading the PCH file is significantly faster than re-parsing the
 | 
						|
  bundle of headers stored within the PCH file. Thus, a precompiled
 | 
						|
  header design attempts to minimize the cost of reading the PCH
 | 
						|
  file. Ideally, this cost should not vary with the size of the
 | 
						|
  precompiled header file.</li>
 | 
						|
  
 | 
						|
  <li>The cost of generating the PCH file initially is not so large
 | 
						|
  that it counters the per-source-file performance improvement due to
 | 
						|
  eliminating the need to parse the bundled headers in the first
 | 
						|
  place. This is particularly important on multi-core systems, because
 | 
						|
  PCH file generation serializes the build when all compilations
 | 
						|
  require the PCH file to be up-to-date.</li>
 | 
						|
</ul>
 | 
						|
 | 
						|
<p>Clang's precompiled headers are designed with a compact on-disk
 | 
						|
representation, which minimizes both PCH creation time and the time
 | 
						|
required to initially load the PCH file. The PCH file itself contains
 | 
						|
a serialized representation of Clang's abstract syntax trees and
 | 
						|
supporting data structures, stored using the same compressed bitstream
 | 
						|
as <a href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitcode
 | 
						|
file format</a>.</p>
 | 
						|
 | 
						|
<p>Clang's precompiled headers are loaded "lazily" from disk. When a
 | 
						|
PCH file is initially loaded, Clang reads only a small amount of data
 | 
						|
from the PCH file to establish where certain important data structures
 | 
						|
are stored. The amount of data read in this initial load is
 | 
						|
independent of the size of the PCH file, such that a larger PCH file
 | 
						|
does not lead to longer PCH load times. The actual header data in the
 | 
						|
PCH file--macros, functions, variables, types, etc.--is loaded only
 | 
						|
when it is referenced from the user's code, at which point only that
 | 
						|
entity (and those entities it depends on) are deserialized from the
 | 
						|
PCH file. With this approach, the cost of using a precompiled header
 | 
						|
for a translation unit is proportional to the amount of code actually
 | 
						|
used from the header, rather than being proportional to the size of
 | 
						|
the header itself.</p> 
 | 
						|
 | 
						|
<p>When given the <code>-print-stats</code> option, Clang produces
 | 
						|
statistics describing how much of the precompiled header was actually
 | 
						|
loaded from disk. For a simple "Hello, World!" program that includes
 | 
						|
the Apple <code>Cocoa.h</code> header (which is built as a precompiled
 | 
						|
header), this option illustrates how little of the actual precompiled
 | 
						|
header is required:</p>
 | 
						|
 | 
						|
<pre>
 | 
						|
*** PCH Statistics:
 | 
						|
  933 stat cache hits
 | 
						|
  4 stat cache misses
 | 
						|
  895/39981 source location entries read (2.238563%)
 | 
						|
  19/15315 types read (0.124061%)
 | 
						|
  20/82685 declarations read (0.024188%)
 | 
						|
  154/58070 identifiers read (0.265197%)
 | 
						|
  0/7260 selectors read (0.000000%)
 | 
						|
  0/30842 statements read (0.000000%)
 | 
						|
  4/8400 macros read (0.047619%)
 | 
						|
  1/4995 lexical declcontexts read (0.020020%)
 | 
						|
  0/4413 visible declcontexts read (0.000000%)
 | 
						|
  0/7230 method pool entries read (0.000000%)
 | 
						|
  0 method pool misses
 | 
						|
</pre>
 | 
						|
 | 
						|
<p>For this small program, only a tiny fraction of the source
 | 
						|
locations, types, declarations, identifiers, and macros were actually
 | 
						|
deserialized from the precompiled header. These statistics can be
 | 
						|
useful to determine whether the precompiled header implementation can
 | 
						|
be improved by making more of the implementation lazy.</p>
 | 
						|
 | 
						|
<p>Precompiled headers can be chained. When you create a PCH while
 | 
						|
including an existing PCH, Clang can create the new PCH by referencing
 | 
						|
the original file and only writing the new data to the new file. For
 | 
						|
example, you could create a PCH out of all the headers that are very
 | 
						|
commonly used throughout your project, and then create a PCH for every
 | 
						|
single source file in the project that includes the code that is
 | 
						|
specific to that file, so that recompiling the file itself is very fast,
 | 
						|
without duplicating the data from the common headers for every file.</p>
 | 
						|
 | 
						|
<h2 id="contents">Precompiled Header Contents</h2>
 | 
						|
 | 
						|
<img src="PCHLayout.png" style="float:right" alt="Precompiled header layout">
 | 
						|
 | 
						|
<p>Clang's precompiled headers are organized into several different
 | 
						|
blocks, each of which contains the serialized representation of a part
 | 
						|
of Clang's internal representation. Each of the blocks corresponds to
 | 
						|
either a block or a record within <a
 | 
						|
 href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitstream
 | 
						|
format</a>. The contents of each of these logical blocks are described
 | 
						|
below.</p>
 | 
						|
 | 
						|
<p>For a given precompiled header, the <a
 | 
						|
href="http://llvm.org/cmds/llvm-bcanalyzer.html"><code>llvm-bcanalyzer</code></a>
 | 
						|
utility can be used to examine the actual structure of the bitstream
 | 
						|
for the precompiled header. This information can be used both to help
 | 
						|
understand the structure of the precompiled header and to isolate
 | 
						|
areas where precompiled headers can still be optimized, e.g., through
 | 
						|
the introduction of abbreviations.</p>
 | 
						|
 | 
						|
<h3 id="metadata">Metadata Block</h3>
 | 
						|
 | 
						|
<p>The metadata block contains several records that provide
 | 
						|
information about how the precompiled header was built. This metadata
 | 
						|
is primarily used to validate the use of a precompiled header. For
 | 
						|
example, a precompiled header built for a 32-bit x86 target cannot be used
 | 
						|
when compiling for a 64-bit x86 target. The metadata block contains
 | 
						|
information about:</p>
 | 
						|
 | 
						|
<dl>
 | 
						|
  <dt>Language options</dt>
 | 
						|
  <dd>Describes the particular language dialect used to compile the
 | 
						|
PCH file, including major options (e.g., Objective-C support) and more
 | 
						|
minor options (e.g., support for "//" comments). The contents of this
 | 
						|
record correspond to the <code>LangOptions</code> class.</dd>
 | 
						|
  
 | 
						|
  <dt>Target architecture</dt>
 | 
						|
  <dd>The target triple that describes the architecture, platform, and
 | 
						|
ABI for which the PCH file was generated, e.g.,
 | 
						|
<code>i386-apple-darwin9</code>.</dd>
 | 
						|
  
 | 
						|
  <dt>PCH version</dt>
 | 
						|
  <dd>The major and minor version numbers of the precompiled header
 | 
						|
format. Changes in the minor version number should not affect backward
 | 
						|
compatibility, while changes in the major version number imply that a
 | 
						|
newer compiler cannot read an older precompiled header (and
 | 
						|
vice-versa).</dd>
 | 
						|
 | 
						|
  <dt>Original file name</dt>
 | 
						|
  <dd>The full path of the header that was used to generate the
 | 
						|
precompiled header.</dd>
 | 
						|
 | 
						|
  <dt>Predefines buffer</dt>
 | 
						|
  <dd>Although not explicitly stored as part of the metadata, the
 | 
						|
predefines buffer is used in the validation of the precompiled header.
 | 
						|
The predefines buffer itself contains code generated by the compiler
 | 
						|
to initialize the preprocessor state according to the current target,
 | 
						|
platform, and command-line options. For example, the predefines buffer
 | 
						|
will contain "<code>#define __STDC__ 1</code>" when we are compiling C
 | 
						|
without Microsoft extensions. The predefines buffer itself is stored
 | 
						|
within the <a href="#sourcemgr">source manager block</a>, but its
 | 
						|
contents are verified along with the rest of the metadata.</dd>
 | 
						|
 | 
						|
</dl>
 | 
						|
 | 
						|
<p>A chained PCH file (that is, one that references another PCH) has
 | 
						|
a slightly different metadata block, which contains the following
 | 
						|
information:</p>
 | 
						|
 | 
						|
<dl>
 | 
						|
  <dt>Referenced file</dt>
 | 
						|
  <dd>The name of the referenced PCH file. It is looked up like a file
 | 
						|
specified using -include-pch.</dd>
 | 
						|
 | 
						|
  <dt>PCH version</dt>
 | 
						|
  <dd>This is the same as in normal PCH files.</dd>
 | 
						|
 | 
						|
  <dt>Original file name</dt>
 | 
						|
  <dd>The full path of the header that was used to generate this
 | 
						|
precompiled header.</dd>
 | 
						|
 | 
						|
</dl>
 | 
						|
 | 
						|
<p>The language options, target architecture and predefines buffer data
 | 
						|
is taken from the end of the chain, since they have to match anyway.</p>
 | 
						|
 | 
						|
<h3 id="sourcemgr">Source Manager Block</h3>
 | 
						|
 | 
						|
<p>The source manager block contains the serialized representation of
 | 
						|
Clang's <a
 | 
						|
 href="InternalsManual.html#SourceLocation">SourceManager</a> class,
 | 
						|
which handles the mapping from source locations (as represented in
 | 
						|
Clang's abstract syntax tree) into actual column/line positions within
 | 
						|
a source file or macro instantiation. The precompiled header's
 | 
						|
representation of the source manager also includes information about
 | 
						|
all of the headers that were (transitively) included when building the
 | 
						|
precompiled header.</p>
 | 
						|
 | 
						|
<p>The bulk of the source manager block is dedicated to information
 | 
						|
about the various files, buffers, and macro instantiations into which
 | 
						|
a source location can refer. Each of these is referenced by a numeric
 | 
						|
"file ID", which is a unique number (allocated starting at 1) stored
 | 
						|
in the source location. Clang serializes the information for each kind
 | 
						|
of file ID, along with an index that maps file IDs to the position
 | 
						|
within the PCH file where the information about that file ID is
 | 
						|
stored. The data associated with a file ID is loaded only when
 | 
						|
required by the front end, e.g., to emit a diagnostic that includes a
 | 
						|
macro instantiation history inside the header itself.</p>
 | 
						|
 | 
						|
<p>The source manager block also contains information about all of the
 | 
						|
headers that were included when building the precompiled header. This
 | 
						|
includes information about the controlling macro for the header (e.g.,
 | 
						|
when the preprocessor identified that the contents of the header
 | 
						|
dependent on a macro like <code>LLVM_CLANG_SOURCEMANAGER_H</code>)
 | 
						|
along with a cached version of the results of the <code>stat()</code>
 | 
						|
system calls performed when building the precompiled header. The
 | 
						|
latter is particularly useful in reducing system time when searching
 | 
						|
for include files.</p>
 | 
						|
 | 
						|
<h3 id="preprocessor">Preprocessor Block</h3>
 | 
						|
 | 
						|
<p>The preprocessor block contains the serialized representation of
 | 
						|
the preprocessor. Specifically, it contains all of the macros that
 | 
						|
have been defined by the end of the header used to build the
 | 
						|
precompiled header, along with the token sequences that comprise each
 | 
						|
macro. The macro definitions are only read from the PCH file when the
 | 
						|
name of the macro first occurs in the program. This lazy loading of
 | 
						|
macro definitions is triggered by lookups into the <a
 | 
						|
 href="#idtable">identifier table</a>.</p>
 | 
						|
 | 
						|
<h3 id="types">Types Block</h3>
 | 
						|
 | 
						|
<p>The types block contains the serialized representation of all of
 | 
						|
the types referenced in the translation unit. Each Clang type node
 | 
						|
(<code>PointerType</code>, <code>FunctionProtoType</code>, etc.) has a
 | 
						|
corresponding record type in the PCH file. When types are deserialized
 | 
						|
from the precompiled header, the data within the record is used to
 | 
						|
reconstruct the appropriate type node using the AST context.</p>
 | 
						|
 | 
						|
<p>Each type has a unique type ID, which is an integer that uniquely
 | 
						|
identifies that type. Type ID 0 represents the NULL type, type IDs
 | 
						|
less than <code>NUM_PREDEF_TYPE_IDS</code> represent predefined types
 | 
						|
(<code>void</code>, <code>float</code>, etc.), while other
 | 
						|
"user-defined" type IDs are assigned consecutively from
 | 
						|
<code>NUM_PREDEF_TYPE_IDS</code> upward as the types are encountered.
 | 
						|
The PCH file has an associated mapping from the user-defined types
 | 
						|
block to the location within the types block where the serialized
 | 
						|
representation of that type resides, enabling lazy deserialization of
 | 
						|
types. When a type is referenced from within the PCH file, that
 | 
						|
reference is encoded using the type ID shifted left by 3 bits. The
 | 
						|
lower three bits are used to represent the <code>const</code>,
 | 
						|
<code>volatile</code>, and <code>restrict</code> qualifiers, as in
 | 
						|
Clang's <a
 | 
						|
 href="http://clang.llvm.org/docs/InternalsManual.html#Type">QualType</a>
 | 
						|
class.</p>
 | 
						|
 | 
						|
<h3 id="decls">Declarations Block</h3>
 | 
						|
 | 
						|
<p>The declarations block contains the serialized representation of
 | 
						|
all of the declarations referenced in the translation unit. Each Clang
 | 
						|
declaration node (<code>VarDecl</code>, <code>FunctionDecl</code>,
 | 
						|
etc.) has a corresponding record type in the PCH file. When
 | 
						|
declarations are deserialized from the precompiled header, the data
 | 
						|
within the record is used to build and populate a new instance of the
 | 
						|
corresponding <code>Decl</code> node. As with types, each declaration
 | 
						|
node has a numeric ID that is used to refer to that declaration within
 | 
						|
the PCH file. In addition, a lookup table provides a mapping from that
 | 
						|
numeric ID to the offset within the precompiled header where that
 | 
						|
declaration is described.</p>
 | 
						|
 | 
						|
<p>Declarations in Clang's abstract syntax trees are stored
 | 
						|
hierarchically. At the top of the hierarchy is the translation unit
 | 
						|
(<code>TranslationUnitDecl</code>), which contains all of the
 | 
						|
declarations in the translation unit. These declarations (such as
 | 
						|
functions or struct types) may also contain other declarations inside
 | 
						|
them, and so on. Within Clang, each declaration is stored within a <a
 | 
						|
href="http://clang.llvm.org/docs/InternalsManual.html#DeclContext">declaration
 | 
						|
context</a>, as represented by the <code>DeclContext</code> class.
 | 
						|
Declaration contexts provide the mechanism to perform name lookup
 | 
						|
within a given declaration (e.g., find the member named <code>x</code>
 | 
						|
in a structure) and iterate over the declarations stored within a
 | 
						|
context (e.g., iterate over all of the fields of a structure for
 | 
						|
structure layout).</p>
 | 
						|
 | 
						|
<p>In Clang's precompiled header format, deserializing a declaration
 | 
						|
that is a <code>DeclContext</code> is a separate operation from
 | 
						|
deserializing all of the declarations stored within that declaration
 | 
						|
context. Therefore, Clang will deserialize the translation unit
 | 
						|
declaration without deserializing the declarations within that
 | 
						|
translation unit. When required, the declarations stored within a
 | 
						|
declaration context will be deserialized. There are two representations
 | 
						|
of the declarations within a declaration context, which correspond to
 | 
						|
the name-lookup and iteration behavior described above:</p>
 | 
						|
 | 
						|
<ul>
 | 
						|
  <li>When the front end performs name lookup to find a name
 | 
						|
  <code>x</code> within a given declaration context (for example,
 | 
						|
  during semantic analysis of the expression <code>p->x</code>,
 | 
						|
  where <code>p</code>'s type is defined in the precompiled header),
 | 
						|
  Clang deserializes a hash table mapping from the names within that
 | 
						|
  declaration context to the declaration IDs that represent each
 | 
						|
  visible declaration with that name. The entire hash table is
 | 
						|
  deserialized at this point (into the <code>llvm::DenseMap</code>
 | 
						|
  stored within each <code>DeclContext</code> object), but the actual
 | 
						|
  declarations are not yet deserialized. In a second step, those
 | 
						|
  declarations with the name <code>x</code> will be deserialized and
 | 
						|
  will be used as the result of name lookup.</li>
 | 
						|
 | 
						|
  <li>When the front end performs iteration over all of the
 | 
						|
  declarations within a declaration context, all of those declarations
 | 
						|
  are immediately de-serialized. For large declaration contexts (e.g.,
 | 
						|
  the translation unit), this operation is expensive; however, large
 | 
						|
  declaration contexts are not traversed in normal compilation, since
 | 
						|
  such a traversal is unnecessary. However, it is common for the code
 | 
						|
  generator and semantic analysis to traverse declaration contexts for
 | 
						|
  structs, classes, unions, and enumerations, although those contexts
 | 
						|
  contain relatively few declarations in the common case.</li>
 | 
						|
</ul>
 | 
						|
 | 
						|
<h3 id="stmt">Statements and Expressions</h3>
 | 
						|
 | 
						|
<p>Statements and expressions are stored in the precompiled header in
 | 
						|
both the <a href="#types">types</a> and the <a
 | 
						|
 href="#decls">declarations</a> blocks, because every statement or
 | 
						|
expression will be associated with either a type or declaration. The
 | 
						|
actual statement and expression records are stored immediately
 | 
						|
following the declaration or type that owns the statement or
 | 
						|
expression. For example, the statement representing the body of a
 | 
						|
function will be stored directly following the declaration of the
 | 
						|
function.</p>
 | 
						|
 | 
						|
<p>As with types and declarations, each statement and expression kind
 | 
						|
in Clang's abstract syntax tree (<code>ForStmt</code>,
 | 
						|
<code>CallExpr</code>, etc.) has a corresponding record type in the
 | 
						|
precompiled header, which contains the serialized representation of
 | 
						|
that statement or expression. Each substatement or subexpression
 | 
						|
within an expression is stored as a separate record (which keeps most
 | 
						|
records to a fixed size). Within the precompiled header, the
 | 
						|
subexpressions of an expression are stored, in reverse order, prior to the expression
 | 
						|
that owns those expression, using a form of <a
 | 
						|
href="http://en.wikipedia.org/wiki/Reverse_Polish_notation">Reverse
 | 
						|
Polish Notation</a>. For example, an expression <code>3 - 4 + 5</code>
 | 
						|
would be represented as follows:</p>
 | 
						|
 | 
						|
<table border="1">
 | 
						|
  <tr><td><code>IntegerLiteral(5)</code></td></tr>
 | 
						|
  <tr><td><code>IntegerLiteral(4)</code></td></tr>
 | 
						|
  <tr><td><code>IntegerLiteral(3)</code></td></tr>
 | 
						|
  <tr><td><code>BinaryOperator(-)</code></td></tr>
 | 
						|
  <tr><td><code>BinaryOperator(+)</code></td></tr>
 | 
						|
  <tr><td>STOP</td></tr>
 | 
						|
</table>
 | 
						|
 | 
						|
<p>When reading this representation, Clang evaluates each expression
 | 
						|
record it encounters, builds the appropriate abstract syntax tree node,
 | 
						|
and then pushes that expression on to a stack. When a record contains <i>N</i>
 | 
						|
subexpressions--<code>BinaryOperator</code> has two of them--those
 | 
						|
expressions are popped from the top of the stack. The special STOP
 | 
						|
code indicates that we have reached the end of a serialized expression
 | 
						|
or statement; other expression or statement records may follow, but
 | 
						|
they are part of a different expression.</p>
 | 
						|
 | 
						|
<h3 id="idtable">Identifier Table Block</h3>
 | 
						|
 | 
						|
<p>The identifier table block contains an on-disk hash table that maps
 | 
						|
each identifier mentioned within the precompiled header to the
 | 
						|
serialized representation of the identifier's information (e.g, the
 | 
						|
<code>IdentifierInfo</code> structure). The serialized representation
 | 
						|
contains:</p>
 | 
						|
 | 
						|
<ul>
 | 
						|
  <li>The actual identifier string.</li>
 | 
						|
  <li>Flags that describe whether this identifier is the name of a
 | 
						|
  built-in, a poisoned identifier, an extension token, or a
 | 
						|
  macro.</li>
 | 
						|
  <li>If the identifier names a macro, the offset of the macro
 | 
						|
  definition within the <a href="#preprocessor">preprocessor
 | 
						|
  block</a>.</li>
 | 
						|
  <li>If the identifier names one or more declarations visible from
 | 
						|
  translation unit scope, the <a href="#decls">declaration IDs</a> of these
 | 
						|
  declarations.</li>
 | 
						|
</ul>
 | 
						|
 | 
						|
<p>When a precompiled header is loaded, the precompiled header
 | 
						|
mechanism introduces itself into the identifier table as an external
 | 
						|
lookup source. Thus, when the user program refers to an identifier
 | 
						|
that has not yet been seen, Clang will perform a lookup into the
 | 
						|
identifier table. If an identifier is found, its contents (macro 
 | 
						|
definitions, flags, top-level declarations, etc.) will be deserialized, at which point the corresponding <code>IdentifierInfo</code> structure will have the same contents it would have after parsing the headers in the precompiled header.</p>
 | 
						|
 | 
						|
<p>Within the PCH file, the identifiers used to name declarations are represented with an integral value. A separate table provides a mapping from this integral value (the identifier ID) to the location within the on-disk
 | 
						|
hash table where that identifier is stored. This mapping is used when
 | 
						|
deserializing the name of a declaration, the identifier of a token, or
 | 
						|
any other construct in the PCH file that refers to a name.</p>
 | 
						|
 | 
						|
<h3 id="method-pool">Method Pool Block</h3>
 | 
						|
 | 
						|
<p>The method pool block is represented as an on-disk hash table that
 | 
						|
serves two purposes: it provides a mapping from the names of
 | 
						|
Objective-C selectors to the set of Objective-C instance and class
 | 
						|
methods that have that particular selector (which is required for
 | 
						|
semantic analysis in Objective-C) and also stores all of the selectors
 | 
						|
used by entities within the precompiled header. The design of the
 | 
						|
method pool is similar to that of the <a href="#idtable">identifier
 | 
						|
table</a>: the first time a particular selector is formed during the
 | 
						|
compilation of the program, Clang will search in the on-disk hash
 | 
						|
table of selectors; if found, Clang will read the Objective-C methods
 | 
						|
associated with that selector into the appropriate front-end data
 | 
						|
structure (<code>Sema::InstanceMethodPool</code> and
 | 
						|
<code>Sema::FactoryMethodPool</code> for instance and class methods,
 | 
						|
respectively).</p>
 | 
						|
 | 
						|
<p>As with identifiers, selectors are represented by numeric values
 | 
						|
within the PCH file. A separate index maps these numeric selector
 | 
						|
values to the offset of the selector within the on-disk hash table,
 | 
						|
and will be used when de-serializing an Objective-C method declaration
 | 
						|
(or other Objective-C construct) that refers to the selector.</p>
 | 
						|
 | 
						|
<h2 id="tendrils">Precompiled Header Integration Points</h2>
 | 
						|
 | 
						|
<p>The "lazy" deserialization behavior of precompiled headers requires
 | 
						|
their integration into several completely different submodules of
 | 
						|
Clang. For example, lazily deserializing the declarations during name
 | 
						|
lookup requires that the name-lookup routines be able to query the
 | 
						|
precompiled header to find entities within the PCH file.</p>
 | 
						|
 | 
						|
<p>For each Clang data structure that requires direct interaction with
 | 
						|
the precompiled header logic, there is an abstract class that provides
 | 
						|
the interface between the two modules. The <code>PCHReader</code>
 | 
						|
class, which handles the loading of a precompiled header, inherits
 | 
						|
from all of these abstract classes to provide lazy deserialization of
 | 
						|
Clang's data structures. <code>PCHReader</code> implements the
 | 
						|
following abstract classes:</p>
 | 
						|
 | 
						|
<dl>
 | 
						|
  <dt><code>StatSysCallCache</code></dt>
 | 
						|
  <dd>This abstract interface is associated with the
 | 
						|
    <code>FileManager</code> class, and is used whenever the file
 | 
						|
    manager is going to perform a <code>stat()</code> system call.</dd>
 | 
						|
    
 | 
						|
  <dt><code>ExternalSLocEntrySource</code></dt>
 | 
						|
  <dd>This abstract interface is associated with the
 | 
						|
    <code>SourceManager</code> class, and is used whenever the
 | 
						|
    <a href="#sourcemgr">source manager</a> needs to load the details
 | 
						|
    of a file, buffer, or macro instantiation.</dd>
 | 
						|
 | 
						|
  <dt><code>IdentifierInfoLookup</code></dt>
 | 
						|
  <dd>This abstract interface is associated with the
 | 
						|
    <code>IdentifierTable</code> class, and is used whenever the
 | 
						|
    program source refers to an identifier that has not yet been seen.
 | 
						|
    In this case, the precompiled header implementation searches for
 | 
						|
    this identifier within its <a href="#idtable">identifier table</a>
 | 
						|
    to load any top-level declarations or macros associated with that
 | 
						|
    identifier.</dd>
 | 
						|
 | 
						|
  <dt><code>ExternalASTSource</code></dt>
 | 
						|
  <dd>This abstract interface is associated with the
 | 
						|
    <code>ASTContext</code> class, and is used whenever the abstract
 | 
						|
    syntax tree nodes need to loaded from the precompiled header. It
 | 
						|
    provides the ability to de-serialize declarations and types
 | 
						|
    identified by their numeric values, read the bodies of functions
 | 
						|
    when required, and read the declarations stored within a
 | 
						|
    declaration context (either for iteration or for name lookup).</dd>
 | 
						|
    
 | 
						|
  <dt><code>ExternalSemaSource</code></dt>
 | 
						|
  <dd>This abstract interface is associated with the <code>Sema</code>
 | 
						|
    class, and is used whenever semantic analysis needs to read
 | 
						|
    information from the <a href="#methodpool">global method
 | 
						|
    pool</a>.</dd>
 | 
						|
</dl>
 | 
						|
 | 
						|
</div>
 | 
						|
 | 
						|
</body>
 | 
						|
</html>
 |