This issue was encountered on libcmt.pdb, which has a type record that
looks like this:
Struct (0x1094) {
TypeLeafKind: LF_STRUCTURE (0x1505)
MemberCount: 3
Properties [ (0x200)
HasUniqueName (0x200)
]
FieldList: <field list> (0x1093)
DerivedFrom: 0x0
VShape: 0x0
SizeOf: 4
Name: <unnamed-tag>
LinkageName: .?AU<unnamed-tag>@@
}
The checks for startswith/endswith "<unnamed-tag>" should look at the
display name, not the linkage name.
llvm-svn: 274376
This allows better catching of compiler errors since we can use
the override keyword to verify that methods are actually
overridden.
Also in this patch I've changed from storing a boolean Error
code everywhere to returning an llvm::Error, to propagate richer
error information up the call stack.
Reviewed By: ruiu, rnk
Differential Revision: http://reviews.llvm.org/D21410
llvm-svn: 272926
Both parameters to visitTypeBegin are actually members of CVRecord,
so we can just pass CVRecord instead of destructuring it.
Differential Revision: http://reviews.llvm.org/D21435
llvm-svn: 272899
TPI hash table contains a parallel array for the type records.
For each type record R, a hash value is calculated by `H(R) % NumBuckets`
where H is a hash function, and the result is stored to a bucket element.
H is TPI1::hashPrec function in microsoft-pdb repository.
Our hash function does not support all type record types yet.
Currently it supports only records for line number.
I'll extend it in a follow up patch.
The aim of verify the hash table is not only detect corrupted files.
It ensures that our understanding of how the hash values are calculated
is correct.
llvm-svn: 272229
In order to efficiently write PDBs, we need to be able to make a
StreamWriter class similar to a StreamReader, which can transparently deal
with writing to discontiguous streams, and we need to use this for all
writing, similar to how we use StreamReader for all reading.
Most discontiguous streams are the typical numbered streams that appear in
a PDB file and are described by the directory, but the exception to this,
that until now has been parsed by hand, is the directory itself.
MappedBlockStream works by querying the directory to find out which blocks
a stream occupies and various other things, so naturally the same logic
could not possibly work to describe the blocks that the directory itself
resided on.
To solve this, I've introduced an abstraction IPDBStreamData, which allows
the client to query for the list of blocks occupied by the stream, as well
as the stream length. I provide two implementations of this: one which
queries the directory (for indexed streams), and one which queries the
super block (for the directory stream).
This has the side benefit of vastly simplifying the code to parse the
directory. Whereas before a mini state machine was rolled by hand, now we
simply use FixedStreamArray to read out the stream sizes, then build a
vector of FixedStreamArrays for the stream map, all in just a few lines of
code.
Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21046
llvm-svn: 271982
This converts remaining uses of ByteStream, which was still
left in the symbol stream and type stream, to using the new
StreamInterface zero-copy classes.
RecordIterator is finally deleted, so this is the only way left
now. Additionally, more error checking is added when iterating
the various streams.
With this, the transition to zero copy pdb access is complete.
llvm-svn: 271101
PDBs can be extremely large. We're already mapping the entire
PDB into the process's address space, but to make matters worse
the blocks of the PDB are not arranged contiguously. So, when
we have something like an array or a string embedded into the
stream, we have to make a copy. Since it's convenient to use
traditional data structures to iterate and manipulate these
records, we need the memory to be contiguous.
As a result of this, we were using roughly twice as much memory
as the file size of the PDB, because every stream was copied
out and re-stitched together contiguously.
This patch addresses this by improving the MappedBlockStream
to allocate from a BumpPtrAllocator only when a read requires
a discontiguous read. Furthermore, it introduces some data
structures backed by a stream which can iterate over both
fixed and variable length records of a PDB. Since everything
is backed by a stream and not a buffer, we can read almost
everything from the PDB with zero copies.
Differential Revision: http://reviews.llvm.org/D20654
Reviewed By: ruiu
llvm-svn: 270951
We have need to reuse this functionality, including making
additional generic stream types that are smarter about how and
when they copy memory versus referencing the original memory.
So all of these structures belong in the common library
rather than being pdb specific.
llvm-svn: 270751
Summary:
Port the dumper in llvm-readobj over to it.
I'm planning to use this visitor to power type stream merging.
While we're at it, try to switch from StringRef to ArrayRef<uint8_t> in some
places.
Reviewers: zturner, amccarth
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19899
llvm-svn: 268535
Ability to parse codeview type streams is also needed by
DebugInfoPDB for parsing PDBs, so moving this into a library
gives us this option. Since DebugInfoPDB had already hand
rolled some code to do this, that code is now convereted over
to using this common abstraction.
Differential Revision: http://reviews.llvm.org/D19887
Reviewed By: dblaikie, amccarth
llvm-svn: 268454
This parses the TPI stream (stream 2) from the PDB file. This stream
contains some header information followed by a series of codeview records.
There is some additional complexity here in that alongside this stream of
codeview records is a serialized hash table in order to efficiently query
the types. We parse the necessary bookkeeping information to allow us to
reconstruct the hash table, but we do not actually construct it yet as
there are still a few things that need to be understood first.
Differential Revision: http://reviews.llvm.org/D19840
Reviewed By: ruiu, rnk
llvm-svn: 268343