Commit Graph

33 Commits

Author SHA1 Message Date
Zachary Turner a3225b0451 [msf] Resubmit "Rename Msf -> MSF".
Previously this change was submitted from a Windows machine, so
changes made to the case of filenames and directory names did
not survive the commit, and as a result the CMake source file
names and the on-disk file names did not match on case-sensitive
file systems.

I'm resubmitting this patch from a Linux system, which hopefully
allows the case changes to make it through unfettered.

llvm-svn: 277213
2016-07-29 20:56:36 +00:00
Zachary Turner 334aec4dd2 Revert "[msf] Rename Msf to MSF."
This reverts commit 4d1557ffac41e079bcb1abbcf04f512474dcd6fe.

llvm-svn: 277194
2016-07-29 18:38:47 +00:00
Zachary Turner a010f5cef0 [msf] Rename Msf to MSF.
In a previous patch, it was suggested to use all caps instead of
rolling caps for initialisms, so this patch changes everything
to do this.

llvm-svn: 277190
2016-07-29 18:24:26 +00:00
Zachary Turner d66889cbae [pdb] Refactor library to more clearly separate reading/writing
Reviewed By: amccarth, ruiu
Differential Revision: https://reviews.llvm.org/D22693

llvm-svn: 277019
2016-07-28 19:12:28 +00:00
Zachary Turner 199f48a5f0 Get rid of IMsfStreamData class.
This was a pure virtual base class whose purpose was to abstract
away the notion of how you retrieve the layout of a discontiguous
stream of blocks in an Msf file.  This led to too many layers of
abstraction making it difficult to figure out what was going on
and extend things.  Ultimately, a stream's layout is decided by
its length and the array of block numbers that it lives on.  So
rather than have an abstract base class which can return this in
any number of ways, it's more straightforward to simply store them
as fields of a trivial struct, and also to give a more appropriate
name.

This patch does that.  It renames IMsfStreamData to MsfStreamLayout,
and deletes the 2 concrete implementations, DirectoryStreamData
and IndexedStreamData.  MsfStreamLayout is a trivial struct
with the necessary data.

llvm-svn: 277018
2016-07-28 19:11:09 +00:00
Zachary Turner bac69d33d0 [msf] Create LLVMDebugInfoMsf
This provides a better layering of responsibilities among different
aspects of PDB writing code.  Some of the MSF related code was
contained in CodeView, and some was in PDB prior to this.  Further,
we were often saying PDB when we meant MSF, and the two are
actually independent of each other since in theory you can have
other types of data besides PDB data in an MSF.  So, this patch
separates the MSF specific code into its own library, with no
dependencies on anything else, and DebugInfoCodeView and
DebugInfoPDB take dependencies on DebugInfoMsf.

llvm-svn: 276458
2016-07-22 19:56:05 +00:00
Zachary Turner b383d628df [pdb] Move file layout header structs to RawTypes.h
This facilitates code reuse between the builder classes and the
"frozen" read only versions of the classes used for parsing
existing PDB files.

llvm-svn: 276427
2016-07-22 15:46:46 +00:00
Zachary Turner d218c26124 [pdb] Round-trip module & file info to/from YAML.
This implements support for writing compiland and compiland source
file info to a binary PDB.  This is tested by adding support for
dumping these fields from an existing PDB to yaml, reading them
back in, and dumping them again and verifying the values are as
expected.

llvm-svn: 276426
2016-07-22 15:46:37 +00:00
Zachary Turner faa554b2fd [pdb] Use MsfBuilder to handle the writing PDBs.
Previously we would read a PDB, then write some of it back out,
but write the directory, super block, and other pertinent metadata
back out unchanged.  This generates incorrect PDBs since the amount
of data written was not always the same as the amount of data read.

This patch changes things to use the newly introduced `MsfBuilder`
class to write out a correct and accurate set of Msf metadata for
the data *actually* written, which opens up the door for adding and
removing type records, symbol records, and other types of data to
an existing PDB.

llvm-svn: 275627
2016-07-15 22:16:56 +00:00
Zachary Turner dbeaea7b35 Refactor the PDB writing to use a builder approach
llvm-svn: 275110
2016-07-11 21:45:26 +00:00
Benjamin Kramer 4d09892e9a Give helper classes/functions internal linkage. NFC.
llvm-svn: 275014
2016-07-10 11:28:51 +00:00
Zachary Turner 8848a7a6b2 [pdb] Round trip the PDB stream between YAML and binary PDB.
This gets writing of the PDB stream working.

llvm-svn: 274647
2016-07-06 18:05:57 +00:00
Reid Kleckner 11582c59d7 [pdb] Don't error on missing FPO streams
64-bit PDBs never have FPO data. They have xdata instead.

Also improve error recovery of stream summary dumping while I'm here.

llvm-svn: 273046
2016-06-17 20:38:01 +00:00
Zachary Turner a1657a9e64 [pdb] Handle stream index errors better.
Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21128

llvm-svn: 272172
2016-06-08 17:26:39 +00:00
Zachary Turner d2b2bfed94 [pdb] Try to fix use after free.
llvm-svn: 272078
2016-06-08 00:25:08 +00:00
Zachary Turner d8447990b0 [pdb] Use MappedBlockStream to parse the PDB directory.
In order to efficiently write PDBs, we need to be able to make a
StreamWriter class similar to a StreamReader, which can transparently deal
with writing to discontiguous streams, and we need to use this for all
writing, similar to how we use StreamReader for all reading.

Most discontiguous streams are the typical numbered streams that appear in
a PDB file and are described by the directory, but the exception to this,
that until now has been parsed by hand, is the directory itself.
MappedBlockStream works by querying the directory to find out which blocks
a stream occupies and various other things, so naturally the same logic
could not possibly work to describe the blocks that the directory itself
resided on.

To solve this, I've introduced an abstraction IPDBStreamData, which allows
the client to query for the list of blocks occupied by the stream, as well
as the stream length. I provide two implementations of this: one which
queries the directory (for indexed streams), and one which queries the
super block (for the directory stream).

This has the side benefit of vastly simplifying the code to parse the
directory. Whereas before a mini state machine was rolled by hand, now we
simply use FixedStreamArray to read out the stream sizes, then build a
vector of FixedStreamArrays for the stream map, all in just a few lines of
code.

Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21046

llvm-svn: 271982
2016-06-07 05:28:55 +00:00
Rui Ueyama ef2b488482 [pdbdump] Print out New FPO stream contents.
The data strucutre in the new FPO stream is described in the
PE/COFF spec. There is one record per function if frame pointer
is omitted.

Differential Revision: http://reviews.llvm.org/D20999

llvm-svn: 271926
2016-06-06 18:39:21 +00:00
Zachary Turner 3df1bfaaec [pdb] Print out file names instead of file offsets.
When printing line information and file checksums, we were printing
the file offset field from the struct header.  This teaches
llvm-pdbdump how to turn those numbers into the filename.  In the
case of file checksums, this is done by looking in the global
string table.  In the case of line contributions, this is done
by indexing into the file names buffer of the DBI stream.  Why
they use a different technique I don't know.

llvm-svn: 271630
2016-06-03 05:52:57 +00:00
Rui Ueyama 90db78816b pdbdump: print out COFF section headers.
Unlike other sections that can grow to any size, the COFF section header
stream has maximum length because each record is fixed size and the COFF
file format limits the maximum number of sections. So I decided to not
create a specific stream class for it. Instead, I added a member function
to DbiStream class which returns a vector of COFF headers.

Differential Revision: http://reviews.llvm.org/D20717

llvm-svn: 271557
2016-06-02 18:20:20 +00:00
Zachary Turner 93839cb4ac [pdb] Parse and dump section map and section contribs
Differential Revision: http://reviews.llvm.org/D20876
Reviewed By: rnk, ruiu

llvm-svn: 271488
2016-06-02 05:07:49 +00:00
Zachary Turner 1de49c9ffd Resubmit "[pdb] Allow zero-copy read support for symbol streams.""
Due to differences in template instantiation rules, it is not
portable to static_assert(false) inside of an invalid specialization
of a template.  Instead I just =delete the method so that it can't
be used, and leave a comment that it must be explicitly specialized.

llvm-svn: 271027
2016-05-27 18:47:20 +00:00
Chad Rosier 6c247c8cc8 Revert "[pdb] Allow zero-copy read support for symbol streams."
This reverts commit r271024 due to error: static_assert failed
"You must either provide a specialization of VarStreamArrayExtractor
or a custom extractor"

llvm-svn: 271026
2016-05-27 18:31:02 +00:00
Zachary Turner 3a9a23ae62 [pdb] Allow zero-copy read support for symbol streams.
This reduces the amount of memory used by llvm-pdbdump by roughly
1/3 of the size of the PDB file.

Differential Revision: http://reviews.llvm.org/D20724
Reviewed By: ruiu

llvm-svn: 271025
2016-05-27 18:20:20 +00:00
Zachary Turner 8dbe3629a0 [codeview,pdb] Try really hard to conserve memory when reading.
PDBs can be extremely large.  We're already mapping the entire
PDB into the process's address space, but to make matters worse
the blocks of the PDB are not arranged contiguously.  So, when
we have something like an array or a string embedded into the
stream, we have to make a copy.  Since it's convenient to use
traditional data structures to iterate and manipulate these
records, we need the memory to be contiguous.

As a result of this, we were using roughly twice as much memory
as the file size of the PDB, because every stream was copied
out and re-stitched together contiguously.

This patch addresses this by improving the MappedBlockStream
to allocate from a BumpPtrAllocator only when a read requires
a discontiguous read.  Furthermore, it introduces some data
structures backed by a stream which can iterate over both
fixed and variable length records of a PDB.  Since everything
is backed by a stream and not a buffer, we can read almost
everything from the PDB with zero copies.

Differential Revision: http://reviews.llvm.org/D20654
Reviewed By: ruiu

llvm-svn: 270951
2016-05-27 01:54:44 +00:00
Zachary Turner d5d37dcf83 [codeview] Move StreamInterface and StreamReader to libcodeview.
We have need to reuse this functionality, including making
additional generic stream types that are smarter about how and
when they copy memory versus referencing the original memory.
So all of these structures belong in the common library
rather than being pdb specific.

llvm-svn: 270751
2016-05-25 20:37:03 +00:00
Zachary Turner d3076ab36f [llvm-pdbdump] Decipher the remaining PDB streams.
We know at least know the meaning of every stream of the
PDB file.  Yay!

llvm-svn: 270669
2016-05-25 05:49:48 +00:00
Zachary Turner 96e60f7573 [llvm-pdbdump] Rework command line options.
When dumping huge PDB files, too many of the options were grouped
together so you would get neverending spew of output.  This patch
introduces more granular display options so you can only dump the
fields you actually care about.

llvm-svn: 270607
2016-05-24 20:31:48 +00:00
Rui Ueyama 0376b1a2d7 pdbdump: Rename NumberOfSymbols -> SymbolRecordStreamIndex.
Differential Revision: http://reviews.llvm.org/D20441

llvm-svn: 270088
2016-05-19 18:05:58 +00:00
Rui Ueyama 1f6b6e2c53 pdbdump: Print "Publics" stream.
Publics stream seems to contain information as to public symbols.
It actually contains a serialized hash table along with fixed-sized
headers. This patch is not complete. It scans only till the end of
the stream and dump the header information. I'll write code to
de-serialize the hash table later.

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20256

llvm-svn: 269484
2016-05-13 21:21:53 +00:00
Zachary Turner 819e77d196 Port DebugInfoPDB over to using llvm::Error.
Differential Revision: http://reviews.llvm.org/D19940
Reviewed By: rnk

llvm-svn: 268791
2016-05-06 20:51:57 +00:00
Zachary Turner b56d904433 PDB - Instead of hardcoding stream numbers, use an enum.
llvm-svn: 268270
2016-05-02 18:09:21 +00:00
Zachary Turner 0eace0bae5 Parse PDB Name Hash Table
PDB has a lot of similar data structures.  We already have code
for parsing a Name Map, but PDB seems to have a different but
very similar structure that is a hash table.  This is the
beginning of code needed in order to parse the name hash table,
but it is not yet complete.  It parses the basic metadata of
the hash table, the bucket array, and the names buffer, but
doesn't use any of these fields yet as the data structure
requires a non-trivial amount of work to understand.

llvm-svn: 268268
2016-05-02 18:09:14 +00:00
Zachary Turner 2f09b5091c Put PDB parsing code into a pdb namespace.
llvm-svn: 268072
2016-04-29 17:28:47 +00:00