Commit Graph

2041 Commits

Author SHA1 Message Date
Reid Kleckner 291d015de4 [PDB] Add symbol records in bulk
Summary:
This speeds up linking clang.exe/pdb with /DEBUG:GHASH by 31%, from
12.9s to 9.8s.

Symbol records are typically small (16.7 bytes on average), but we
processed them one at a time. CVSymbol is a relatively "large" type. It
wraps an ArrayRef<uint8_t> with a kind an optional 32-bit hash, which we
don't need. Before this change, each DbiModuleDescriptorBuilder would
maintain an array of CVSymbols, and would write them individually with a
BinaryItemStream.

With this change, we now add symbols that happen to appear contiguously
in bulk. For each .debug$S section (roughly one per function), we
allocate two copies, one for relocation, and one for realignment
purposes. For runs of symbols that go in the module stream, which is
most symbols, we now add them as a single ArrayRef<uint8_t>, so the
vector DbiModuleDescriptorBuilder is roughly linear in the number of
.debug$S sections (O(# funcs)) instead of the number of symbol records
(very large).

Some stats on symbol sizes for the curious:
  PDB size: 507M
  sym bytes: 316,508,016
  sym count:  18,954,971
  sym byte avg: 16.7

As future work, we may be able to skip copying symbol records in the
linker for realignment purposes if we make LLVM write them aligned into
the object file. We need to double check that such symbol records are
still compatible with link.exe, but if so, it's definitely worth doing,
since my profile shows we spend 500ms in memcpy in the symbol merging
code. We could potentially cut that in half by saving a copy.
Alternatively, we could apply the relocations *after* we iterate the
symbols. This would require some careful re-engineering of the
relocation processing code, though.

Reviewers: zturner, aganea, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D54554

llvm-svn: 347687
2018-11-27 19:00:23 +00:00
Luke Cheeseman 6db3a6a4a7 Revert r347490 as it breaks address sanitizer builds
llvm-svn: 347499
2018-11-23 17:13:06 +00:00
Luke Cheeseman d6dbd64104 Revert r343341
- Cannot reproduce the build failure locally and the build logs have
  been deleted.

llvm-svn: 347490
2018-11-23 11:01:47 +00:00
Zachary Turner c68f895702 [CodeView] Add support for ref-qualified member functions.
When you have a member function with a ref-qualifier, for example:

struct Foo {
  void Func() &;
  void Func2() &&;
};

clang-cl was not emitting this information. Doing so is a bit
awkward, because it's not a property of the LF_MFUNCTION type, which
is what you'd expect. Instead, it's a property of the this pointer
which is actually an LF_POINTER. This record has an attributes
bitmask on it, and our handling of this bitmask was all wrong. We
had some parts of the bitmask defined incorrectly, but importantly
for this bug, we didn't know about these extra 2 bits that represent
the ref qualifier at all.

Differential Revision: https://reviews.llvm.org/D54667

llvm-svn: 347354
2018-11-20 22:13:43 +00:00
Zachary Turner b35e1d7dc3 [CodeView] Don't print PointerAttributes when dumping.
PointerAttributes is a bitwise-or of several other fields, each of
which is already printed on its own line with a better explanation.
So this doesn't really help much.

llvm-svn: 347275
2018-11-20 00:10:27 +00:00
David Blaikie 81959a2730 llvm-symbolizer: Avoid calling getFromOffset when the index entry is already available
Especially for symbolizer it can be efficient to have to search through
the entire index when it isn't needed - llvm-symbolizer looks up only a
few CUs & already has an index available in getUnitForEntry, once it's
passed down to DWARFUnitHeader::extract then there's no need for it to
call getFromOffset.

llvm-svn: 347134
2018-11-17 05:57:58 +00:00
Fangrui Song 7570932977 Use llvm::copy. NFC
llvm-svn: 347126
2018-11-17 01:44:25 +00:00
Simon Atanasyan 705fbd5d4f [DWARF] Use PRIx64 instead of 'x' to format 64-bit values
This is a follow-up to r346715. Use PRIx64 to formatted print of 64-bit
value in the `DWARFDebugLoclists::LocationList::dump` to escape problem
on big-endian hosts.

llvm-svn: 347049
2018-11-16 13:14:26 +00:00
Zachary Turner 03a24052f3 [NativePDB] Improved support for nested type reconstruction.
In a previous patch, we pre-processed the TPI stream in order to build
the reverse mapping from nested type -> parent type so that we could
accurately reconstruct a DeclContext hierarchy.

However, there were some issues. An LF_NESTTYPE record is really just a
typedef, so although it happens to be used to indicate the name of the
nested type and referring to the global record which defines the type,
it is also used for every other kind of nested typedef. When we rebuild
the DeclContext hierarchy, we want it to be as accurate as possible,
which means that if we have something like:

  struct A {
    struct B {};
    using C = B;
  };

We don't want to create two CXXRecordDecls in the AST each with the
exact same definition. We just want to create one for B and then
define C as an alias to B. Previously, however, it would not be able
to distinguish between the two cases and it would treat A::B and
A::C as being two classes each with separate definitions. We address
the first half of improving the pre-processing logic so that only
actual definitions are treated this way.

Later, in a followup patch, we can handle the case of nested
typedefs since we're already going to be enumerating the field list
anyway and this patch introduces the general framework for
distinguishing between the two cases.

Differential Revision: https://reviews.llvm.org/D54357

llvm-svn: 346786
2018-11-13 20:07:32 +00:00
Simon Atanasyan 22dc538618 [DWARF] Do not use PRIx32 for printing uint64_t values
The `DWARFDebugAddrTable::dump` routine prints 32/64-bits addresses.
These values are stored in a vector of `uint64_t` independently of their
original sizes. But `format` function gets format string with PRIx32
suffix in case of 32-bit address size. At least on MIPS 32-bit targets
that leads to incorrect output.

This patch changes formats strings and always use PRIx64 to print
`uint64_t` values.

Differential Revision: http://reviews.llvm.org/D54424

llvm-svn: 346715
2018-11-12 22:43:17 +00:00
David Blaikie 582a5ebce0 NFC: DebugInfo: Reduce scope of DebugOffset to simplify code
This was being used as a sort of indirect out parameter from shouldDump
- seems simpler to use it as the actual result of the call. (this does
mean using a pointer to an Optional & actually using all 3 states (null,
None, and present) which is, admittedly, a tad subtle - but given the
limited scope, seems OK to me - open to discussion though, if others
feel strongly about it)

llvm-svn: 346691
2018-11-12 18:53:28 +00:00
Fangrui Song 158b26213f [DWARF] Change pubnames to use DWARFSection instead of StringRef
Summary: The debug_info_offset values in .debug_{,gnu_}pub{name,types} may be relocated. Change it to DWARFSection so that we can get relocated values.

Reviewers: ruiu, dblaikie, grimar, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D54375

llvm-svn: 346615
2018-11-11 18:57:28 +00:00
Alexandre Ganea 4b2957243b [LLD] Fix Microsoft precompiled headers cross-compile on Linux
Differential revision: https://reviews.llvm.org/D54122

llvm-svn: 346403
2018-11-08 14:42:37 +00:00
Jorge Gorbe Moya bf1badb6bb Add parentheses to silence warning.
DWARFContext.cpp:356:20: error: using the result of an assignment as a condition without parentheses [-Werror,-Wparentheses]

llvm-svn: 346365
2018-11-07 22:30:01 +00:00
Paul Robinson 746c22389c [DWARFv5] Read and dump multiple .debug_info sections.
Type units go in .debug_info comdats, not .debug_types, in v5.

Differential Revision: https://reviews.llvm.org/D53907

llvm-svn: 346360
2018-11-07 21:39:09 +00:00
Fangrui Song 54d23a8eb7 [DWARF] Support types CU list in .gdb_index dumping
Some executables have non-empty types CU list and -gdb-index would report "<error reporting>" before.

llvm-svn: 346181
2018-11-05 23:27:53 +00:00
Alexandre Ganea 71c43ceaf8 [COFF][LLD] Add link support for Microsoft precompiled headers OBJs
This change allows for link-time merging of debugging information from
Microsoft precompiled types OBJs compiled with cl.exe /Z7 /Yc and /Yu.

This fixes llvm.org/PR34278

Differential Revision: https://reviews.llvm.org/D45213

llvm-svn: 346154
2018-11-05 19:20:47 +00:00
Wolfgang Pieb 5253cccbd5 [DWARF v5] Verifier: Add checks for DW_FORM_strx* forms.
Adding functionality to the DWARF verifier for DWARF v5 strx* forms which 
index into the string offsets table.

Differential Revision: https://reviews.llvm.org/D54049

llvm-svn: 346061
2018-11-03 00:27:35 +00:00
Fangrui Song 999570a2f4 [DWARF] Fix typo, .gnu_index -> .gdb_index
llvm-svn: 346039
2018-11-02 20:34:40 +00:00
Reid Kleckner 46ff186b29 [codeview] Add breaks to fix -Wimplicit-fallthrough
This is a minor bug fix. Previously, if you tried to encode the RSP
register on the x86 platform, that might have succeeded and been encoded
incorrectly. However, no existing producer or consumer passes the x86_64
registers when targeting x86_32.

llvm-svn: 345879
2018-11-01 19:36:29 +00:00
Zachary Turner 56a5a0c3ce [CodeView] Emit the correct TypeIndex for std::nullptr_t.
The TypeIndex used by cl.exe is 0x103, which indicates a SimpleTypeMode
of NearPointer (note the absence of the bitness, normally pointers use a
mode of NearPointer32 or NearPointer64) and a SimpleTypeKind of void.
So this is basically a void*, but without a specified size, which makes
sense given how std::nullptr_t is defined.

clang-cl was actually not emitting *anything* for this. Instead, when we
encountered std::nullptr_t in a DIType, we would actually just emit a
TypeIndex of 0, which is obviously wrong.

std::nullptr_t in DWARF is represented as a DW_TAG_unspecified_type with
a name of "decltype(nullptr)", so we add that logic along with a test,
as well as an update to the dumping code so that we no longer print
void* when dumping 0x103 (which would previously treat Void/NearPointer
no differently than Void/NearPointer64).

Differential Revision: https://reviews.llvm.org/D53957

llvm-svn: 345811
2018-11-01 04:02:41 +00:00
Wolfgang Pieb 8eb3c81457 [DWARF][NFC] Refactor a function to return Optional<> instead of bool
Minor refactor of DWARFUnit::getStringOffsetSectionItem().

Differential Revision: https://reviews.llvm.org/D53948

llvm-svn: 345776
2018-10-31 21:05:51 +00:00
Wolfgang Pieb f39a9bbe72 [DWARF] Revert r345546: Refactor range list extraction and dumping
This patch caused some internal tests to break which are being investigated.

llvm-svn: 345687
2018-10-31 01:12:58 +00:00
Saleem Abdulrasool 91242b788a DWARFVerifier: make the verifier more comprehensive for objects
Make the code do what was mentioned in the comment: only skip the CU types.
This enables the lexical blocks to be verified as well.

llvm-svn: 345675
2018-10-30 23:45:27 +00:00
Wolfgang Pieb fb6cffca09 [DWARF][NFC] Refactor range list extraction and dumping
The purpose of this patch is twofold: 
- Fold pre-DWARF v5 functionality into v5 to eliminate the need for 2 different 
  versions of range list handling. We get rid of DWARFDebugRangelist{.cpp,.h}.
- Templatize the handling of range list tables so that location list handling
  can take advantage of it as well. Location list and range list tables have the 
  same basic layout.

A non-NFC version of this patch was previously submitted with r342218, but it caused
errors with some TSan tests. This patch has no functional changes. The difference to
the non-NFC patch is that there are no changes to rangelist dumping in this patch.

Differential Revision: https://reviews.llvm.org/D53545

llvm-svn: 345546
2018-10-29 22:16:47 +00:00
Saleem Abdulrasool ec77a6517f Revert "Revert "DebugInfo: reduce DIE range verification on object files""
This reverts commit 836c763dadbd9478fa35b1a291a38bf17aa206ba.  Default
initialize the values that MSAN caught.

llvm-svn: 345482
2018-10-28 22:30:48 +00:00
Vlad Tsyrklevich 50d2683a00 Revert "DebugInfo: reduce DIE range verification on object files"
This reverts commits r345441 and r345444, they were causing msan
buildbot failures.

llvm-svn: 345457
2018-10-27 17:39:13 +00:00
Saleem Abdulrasool b342446fe0 DebugInfo: reduce DIE range verification on object files
Relocatable content may have overlapping ranges until the sections are
finalized.  This reduces the amount of verification that is done on an object
file so that invalid errors are not raised.

llvm-svn: 345441
2018-10-27 00:49:33 +00:00
Wolfgang Pieb d57b5251d4 [DWARF][NFC] cleanup (mostly leftovers from the implementation of string offsets tables)
Majority of the patch by David Blaikie.

Differential Revision: https://reviews.llvm.org/D53741

llvm-svn: 345404
2018-10-26 17:14:46 +00:00
David Blaikie 2f9c42c994 llvm-dwarfdump: loclists: Don't expect an (albeit empty) expression for LLE_base_address
llvm-svn: 345320
2018-10-25 21:35:59 +00:00
George Rimar 581fc63dc0 [llvm-dwarfdump] - Fix incorrect parsing of the DW_LLE_startx_length
As was already mentioned in comments for D53364, DWARF 5
spec says about DW_LLE_startx_length:

"This is a form of bounded location description that has two unsigned ULEB operands.
The first value is an address index (into the .debug_addr section) that indicates the beginning of the address range
over which the location is valid. The second value is the length of the range. ")

Currently, the length is always parsed as U32.
Patch change the behavior to parse DW_LLE_startx_length as ULEB128 for DWARF 5
and keeps it as U32 for DWARF4+(pre-DWARF5) for compatibility.

Differential revision: https://reviews.llvm.org/D53564

llvm-svn: 345254
2018-10-25 10:56:44 +00:00
David Blaikie c8ae096739 llvm-dwarfdump: Account for skeleton addr_base when dumping addresses in split unit in the same file
llvm-svn: 345215
2018-10-24 22:44:54 +00:00
Reid Kleckner 075897292f [PDB] Fix -Wunused-private-field in DIA
llvm-svn: 345054
2018-10-23 17:20:16 +00:00
Aleksandr Urakov c43e086c74 Revert "Revert "[PDB] Extend IPDBSession's interface to retrieve frame data""
This reverts commit 466ce67d6ec444962e5cc0136243c16a453190c0.

llvm-svn: 345010
2018-10-23 08:14:53 +00:00
Zachary Turner b96181c2bf Some cleanups to the native pdb plugin [NFC].
This is mostly some cleanup done in the process of implementing
some basic support for types.  I tried to split up the patch a
bit to get some of the NFC portion of the patch out into a separate
commit, and this is the result of that.  It moves some code around,
deletes some spurious namespace qualifications, removes some
unnecessary header includes, forward declarations, etc.

llvm-svn: 344913
2018-10-22 16:19:07 +00:00
Aleksandr Urakov 738df2de7f Revert "[PDB] Extend IPDBSession's interface to retrieve frame data"
This reverts commit b5c7e2f9a4dbb34e3667c4bb4972735eadd3247a.

llvm-svn: 344909
2018-10-22 15:30:48 +00:00
George Rimar 209232091c [llvm-dwarfdump] - Fix win10 build bot failture.
Bot failed: 
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/20877/steps/test/logs/stdio

This was broken after the 
r344895 "[llvm-dwarfdump] - Add the support of parsing .debug_loclists."
because of wrong formatting specifiers used.

llvm-svn: 344896
2018-10-22 12:18:30 +00:00
George Rimar 4c7dd9cf0a [llvm-dwarfdump] - Add the support of parsing .debug_loclists.
This teaches llvm-dwarfdump to dump the content of .debug_loclists sections.

It converts the DWARFDebugLocDWO class to DWARFDebugLoclists,
teaches llvm-dwarfdump about .debug_loclists section and
adds the implementation for parsing the DW_LLE_offset_pair entries.

Differential revision: https://reviews.llvm.org/D53364

llvm-svn: 344895
2018-10-22 11:30:54 +00:00
Aleksandr Urakov d4a82f6f74 [PDB] Extend IPDBSession's interface to retrieve frame data
Summary:
This patch just extends the `IPDBSession` interface to allow retrieving
of frame data through it, and adds an implementation over DIA. It is needed
for an implementation (for now with DIA) of the conversion from FPO programs
to DWARF expressions mentioned in D53086.

Reviewers: zturner, asmith, rnk

Reviewed By: asmith

Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D53324

llvm-svn: 344886
2018-10-22 07:18:08 +00:00
David Blaikie 2df23a4e2e DebugInfo: Use DW_OP_addrx in DWARFv5
Reuse addresses in the address pool, even in non-split cases.

llvm-svn: 344838
2018-10-20 08:54:05 +00:00
David Blaikie 59ac206433 llvm-dwarfdump: Support RLE_addressx and RLE_startx_length in .debug_rnglists
llvm-svn: 344835
2018-10-20 06:16:25 +00:00
David Blaikie 161dd3c186 DebugInfo: Use debug_addr for non-dwo addresses in DWARF 5
Putting addresses in the address pool, even with non-fission, can reduce
relocations - reusing the addresses from debug_info and debug_rnglists
(the latter coming soon)

llvm-svn: 344834
2018-10-20 06:02:15 +00:00
Wolfgang Pieb 6214c11cb7 [DWARF] Make llvm-dwarfdump display location lists in a .dwp file correctly. Fixes PR38990.
Considers the index when extracting location lists from a .dwp file.
Majority of the patch by David Blaikie.

Reviewers: dblaikie

Differential revision: https://reviews.llvm.org/D53155

llvm-svn: 344807
2018-10-19 19:23:16 +00:00
Jonas Devlieghere 344cac5efd [dwarfdump] Hide ranges in diff-mode.
llvm-dwarfdump --diff should not print DW_AT_ranges. This patch fixes
that.

Differential revision: https://reviews.llvm.org/D53353

llvm-svn: 344794
2018-10-19 17:57:53 +00:00
David Bolvansky 7e30c91dca [DwarfVerifier] Fixed -Wimplicit-fallthrough warning
Reviewers: JDevlieghere, RKSimon

Reviewed By: JDevlieghere

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52963

llvm-svn: 344176
2018-10-10 20:10:37 +00:00
Zachary Turner 5989281cf3 [PDB] Fix another bug in globals stream name lookup.
When we're on the last bucket the computation is tricky.
We were failing when the last bucket contained multiple
matches.  Added a new test for this.

llvm-svn: 344081
2018-10-09 21:19:03 +00:00
Wolfgang Pieb a9ea9c5034 [DWARF] Make llvm-dwarfdump display the .debug_loc.dwo section. Fixes PR38991.
Reviewer: dblaikie

Differential Revision: https://reviews.llvm.org/D52444

llvm-svn: 344068
2018-10-09 18:38:55 +00:00
Zachary Turner b7dd12b7a8 [PDB] Fix failure on big endian machines.
We changed an ArrayRef<uint8_t> to an ArrayRef<uint32_t>, but
it needs to be an ArrayRef<support::ulittle32_t>.

We also change ArrayRef<> to FixedStreamArray<>.  Technically
an ArrayRef<> will work, but it can cause a copy in the underlying
implementation if the memory is not contiguous, and there's no
reason not to use a FixedStreamArray<>.

Thanks to nemanjai@ and thakis@ for helping me track this down
and confirm the fix.

llvm-svn: 344063
2018-10-09 17:58:51 +00:00
Zachary Turner 0f556f88c5 Remove unused variable.
llvm-svn: 344002
2018-10-08 22:56:57 +00:00
Zachary Turner c8207fa59b [PDB] fix a bug in global stream name lookup.
When we're looking up a record in the last hash bucket chain, we
need to be careful with the end-offset calculation.

llvm-svn: 344001
2018-10-08 22:38:27 +00:00
Kristina Brooks bcc86a95c1 [DebugInfo][PDB] Fix a signed/unsigned coversion warning
Fix the following warning when compiling with clang (caused by commit
rL343951):

GlobalsStream.cpp:61:33: warning: comparison of integers of different
signs: 'int' and 'uint32_t'

This also avoids double evaluation of `GlobalsTable.HashBuckets.size()`.

llvm-svn: 343957
2018-10-08 09:03:17 +00:00
Zachary Turner ba73a91491 Fix a -Wsign-compare warning.
llvm-svn: 343953
2018-10-08 04:44:12 +00:00
Zachary Turner 9f6ac4c264 Fix a compilation failure on non-MSVC compilers.
llvm-svn: 343952
2018-10-08 04:34:41 +00:00
Zachary Turner 94926a6db8 [PDB] Add the ability to lookup global symbols by name.
The Globals table is a hash table keyed on symbol name, so
it's possible to lookup symbols by name in O(1) time.  Add
a function to the globals stream to do this, and add an option
to llvm-pdbutil to exercise this, then use it to write some
tests to verify correctness.

llvm-svn: 343951
2018-10-08 04:19:16 +00:00
David Blaikie fdada09fa4 dwarfdump: Avoid parsing units unnecessarily
NFC-ish (the parsing of the units is not a functional change - no
errors/warnings are emitted during the shallow parsing - though without
parsing them here, the "max version" would be wrong (still zero) later
on, so in those cases the units do need to be parsed)

llvm-svn: 343884
2018-10-05 20:55:20 +00:00
Vedant Kumar 5931b4e5b5 [DebugInfo] Add support for DWARF5 call site-related attributes
DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which
indicates that all calls (both regular and tail) within the subprogram
have call site entries. The information within these call site entries
can be used by a debugger to populate backtraces with synthetic tail
call frames.

Tail calling frames go missing in backtraces because the frame of the
caller is reused by the callee. Call site entries allow a debugger to
reconstruct a sequence of (tail) calls which led from one function to
another. This improves backtrace quality. There are limitations: tail
recursion isn't handled, variables within synthetic frames may not
survive to be inspected, etc. This approach is not novel, see:

  https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf

This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers
to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation
support needed to emit standards-compliant call site entries. For easier
deployment, when the debugger tuning is LLDB, the DWARF requirement is
adjusted to v4.

Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo
clang binary. Its dSYM passed verification and grew by 1.4% compared to
the baseline. 151,879 call site entries were added.

rdar://42001377

Differential Revision: https://reviews.llvm.org/D49887

llvm-svn: 343883
2018-10-05 20:37:17 +00:00
Zachary Turner a67765ac8d [PDB] Add support for more kinds of PDB Sym Tags.
DIA SDK is returning several new sym tag types, so we update
the enumeration and printing code to support these.

llvm-svn: 343547
2018-10-01 22:39:19 +00:00
Reid Kleckner 9ea2c01264 [codeview] Emit S_FRAMEPROC and use S_DEFRANGE_FRAMEPOINTER_REL
Summary:
Before this change, LLVM would always describe locals on the stack as
being relative to some specific register, RSP, ESP, EBP, ESI, etc.
Variables in stack memory are pretty common, so there is a special
S_DEFRANGE_FRAMEPOINTER_REL symbol for them. This change uses it to
reduce the size of our debug info.

On top of the size savings, there are cases on 32-bit x86 where local
variables are addressed from ESP, but ESP changes across the function.
Unlike in DWARF, there is no FPO data to describe the stack adjustments
made to push arguments onto the stack and pop them off after the call,
which makes it hard for the debugger to find the local variables in
frames further up the stack.

To handle this, CodeView has a special VFRAME register, which
corresponds to the $T0 variable set by our FPO data in 32-bit.  Offsets
to local variables are instead relative to this value.

This is part of PR38857.

Reviewers: hans, zturner, javed.absar

Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D52217

llvm-svn: 343543
2018-10-01 21:59:45 +00:00
Zachary Turner a5e3e02602 [PDB] Add support for dumping Typedef records.
These work a little differently because they are actually in
the globals stream and are treated as symbol records, even though
DIA presents them as types.  So this also adds the necessary
infrastructure to cache records that live somewhere other than
the TPI stream as well.

llvm-svn: 343507
2018-10-01 17:55:38 +00:00
Zachary Turner 5c1873b213 [PDB] Add support for parsing VFTable Shape records.
This allows them to be returned from the native API.

llvm-svn: 343506
2018-10-01 17:55:16 +00:00
Zachary Turner 518cb2d560 [PDB] Add native support for dumping array types.
llvm-svn: 343412
2018-09-30 16:19:18 +00:00
Zachary Turner 6ca6a03c51 [PDB] Better native API support for pointers.
We didn't properly detect when a pointer was a member
pointer, and when that was the case we were not
properly returning class parent info.  This caused
member pointers to render incorrectly in pretty mode.
However, we didn't even have pretty tests for pointers
in native mode, so those are also added now to ensure
this.

llvm-svn: 343393
2018-09-29 23:28:19 +00:00
Luke Cheeseman 10981cc884 Revert r343317
- asan buildbots are breaking and I need to investigate the issue

llvm-svn: 343341
2018-09-28 17:01:50 +00:00
Luke Cheeseman 21f2955bb2 Reapply changes reverted by r343235
- Add fix so that all code paths that create DWARFContext
  with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method

llvm-svn: 343317
2018-09-28 13:37:27 +00:00
Aaron Smith 757274f9b2 [pdb] Simplify the code by replacing a few string conversions with calls to invokeBstrMethod()
Reviewers: aleksandr.urakov, zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D52624

llvm-svn: 343291
2018-09-28 02:32:07 +00:00
Luke Cheeseman 8e5676b1aa Revert r343192 as an ubsan build is currently failing
llvm-svn: 343235
2018-09-27 16:47:30 +00:00
Luke Cheeseman f6844b307a Reapply changes reverted in r343114, lldb patch to follow shortly
llvm-svn: 343192
2018-09-27 10:39:20 +00:00
Fangrui Song 0cac726a00 llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Luke Cheeseman 77aaa22081 Revert r343112 as CallFrameString API change has broken lldb builds
llvm-svn: 343114
2018-09-26 14:48:03 +00:00
Luke Cheeseman 03ad8812f5 [AArch64] - Return address signing dwarf support
- Reapply r343089 with a fix for DebugInfo/Sparc/gnu-window-save.ll

llvm-svn: 343112
2018-09-26 14:30:29 +00:00
Hans Wennborg 00b88bbcaf Revert r343089 "[AArch64] - Return address signing dwarf support"
This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.

> Functions that have signed return addresses need additional dwarf support:
> - After signing the LR, and before authenticating it, the LR register is in a
>   state the is unusable by a debugger or unwinder
> - To account for this a new directive, .cfi_negate_ra_state, is added
> - This directive says the signed state of the LR register has now changed,
>   i.e. unsigned -> signed or signed -> unsigned
> - This directive has the same CFA code as the SPARC directive GNU_window_save
>   (0x2d), adding a macro to account for multiply defined codes
> - This patch matches the gcc implementation of this support:
>   https://patchwork.ozlabs.org/patch/800271/
>
> Differential Revision: https://reviews.llvm.org/D50136

llvm-svn: 343103
2018-09-26 12:57:45 +00:00
Luke Cheeseman f755e687fc [AArch64] - Return address signing dwarf support
Functions that have signed return addresses need additional dwarf support:
- After signing the LR, and before authenticating it, the LR register is in a
  state the is unusable by a debugger or unwinder
- To account for this a new directive, .cfi_negate_ra_state, is added
- This directive says the signed state of the LR register has now changed,
  i.e. unsigned -> signed or signed -> unsigned
- This directive has the same CFA code as the SPARC directive GNU_window_save
  (0x2d), adding a macro to account for multiply defined codes
- This patch matches the gcc implementation of this support:
  https://patchwork.ozlabs.org/patch/800271/

Differential Revision: https://reviews.llvm.org/D50136

llvm-svn: 343089
2018-09-26 10:14:15 +00:00
Zachary Turner a9defc348b Add missing include.
llvm-svn: 342781
2018-09-21 22:44:31 +00:00
Zachary Turner 6345e84dde [NativePDB] Add support for reading function signatures.
This adds support for parsing function signature records and returning
them through the native DIA interface.

llvm-svn: 342780
2018-09-21 22:36:28 +00:00
Zachary Turner 355ffb0032 [PDB] Add native reading support for UDT / class types.
This allows the native reader to find records of class/struct/
union type and dump them.  This behavior is tested by using the
diadump subcommand against golden output produced by actual DIA
SDK on the same PDB file, and again using pretty -native to
confirm that we actually dump the classes.  We don't find class
members or anything like that yet, for now it's just the class
itself.

llvm-svn: 342779
2018-09-21 22:36:04 +00:00
Jonas Devlieghere b32274242d [dwarfdump] Verify DW_AT_type is set and points to a compatible DIE.
This extends the verifier to catch three new errors:

  * Missing DW_AT_type attributes for DW_TAG_formal_parameter,
    DW_TAG_variable and DW_TAG_array_type.

  * Valid references for DW_AT_type pointing to a non-type tag.

Differential revision: https://reviews.llvm.org/D52223

llvm-svn: 342713
2018-09-21 07:50:21 +00:00
Jonas Devlieghere 7ef2c2021e [dwarfdump] Verify compatibility of attribute TAGs.
Verify that DW_AT_specification and DW_AT_abstract_origin reference a
DIE with a compatible tag.

Differential revision: https://reviews.llvm.org/D38719

llvm-svn: 342712
2018-09-21 07:49:29 +00:00
Zachary Turner 4e0295bed3 [PDB] Fix -Wcovered-switch-default warning.
llvm-svn: 342681
2018-09-20 19:57:49 +00:00
Zachary Turner 68f0eeff83 Fix warnings.
llvm-svn: 342670
2018-09-20 17:48:44 +00:00
Zachary Turner 5907a780f0 [PDB] Better printing of builtin types when using DIA dumper.
llvm-svn: 342658
2018-09-20 16:12:05 +00:00
Zachary Turner cfa1d499f9 [PDB] Add the ability to map forward references to full decls.
Some records point to an LF_CLASS, LF_UNION, LF_STRUCTURE, or LF_ENUM
which is a forward reference and doesn't contain complete debug
information. In these cases, we'd like to be able to quickly locate the
full record. The TPI stream stores an array of pre-computed record hash
values, one for each type record. If we pre-process this on startup, we
can build a mapping from hash value -> {list of possible matching type
indices}. Since hashes of full records are only based on the name and or
unique name and not the full record contents, we can then use forward
ref record to compute the hash of what *would* be the full record by
just hashing the name, use this to get the list of possible matches, and
iterate those looking for a match on name or unique name.

llvm-pdbutil is updated to resolve forward references for the purposes
of testing (plus it's just useful).

Differential Revision: https://reviews.llvm.org/D52283

llvm-svn: 342656
2018-09-20 15:50:13 +00:00
Jonas Devlieghere f1f3e7377c [DWARF Verifier] Add helper function to dump DIEs. [NFC]
It's pretty common for the verifier to dump the relevant DIE when it
finds an issue. This tends to be relatively verbose and error prone
because we have to pass the DIDumpOptions to the DIE's dump method. This
patch adds a helper function to the verifier to make this easier.

llvm-svn: 342526
2018-09-19 08:08:13 +00:00
Zachary Turner c41ce8355f [PDB] Better support for enumerating pointer types.
There were several issues with the previous implementation.

1) There were no tests.
2) We didn't support creating PDBSymbolTypePointer records for
   builtin types since those aren't described by LF_POINTER
   records.
3) We didn't support a wide enough variety of builtin types even
   ignoring pointers.

This patch fixes all of these issues.  In order to add tests,
it's helpful to be able to ignore the symbol index id hierarchy
because it makes the golden output from the DIA version not match
our output, so I've extended the dumper to disable dumping of id
fields.

llvm-svn: 342493
2018-09-18 16:35:05 +00:00
Zachary Turner bdf0381e21 [PDB] Make the native reader support enumerators.
Previously we would dump the names of enum types, but not their
enumerator values.  This adds support for enumerator values.  In
doing so, we have to introduce a general purpose mechanism for
caching symbol indices of field list members.  Unlike global
types, FieldList members do not have a TypeIndex.  So instead,
we identify them by the pair {TypeIndexOfFieldList, IndexInFieldList}.

llvm-svn: 342415
2018-09-17 21:08:11 +00:00
Zachary Turner 4727ac2394 [PDB] Make the native reader support modified types.
Previously for cv-qualified types, we would just ignore them
and they would never get printed.  Now we can enumerate them
and cache them like any other symbol type.

llvm-svn: 342414
2018-09-17 21:07:48 +00:00
Alexander Kornienko e74e0f11d1 Revert "[DWARF] reposting r342048, which was reverted in r342056 due to buildbot errors. Adjusted 2 test cases for ARM and darwin and fixed a bug with the original change in dsymutil."
This reverts commit r342218. Due to a number of failures under TSAN. An isolated
test case is being worked on.

llvm-svn: 342399
2018-09-17 15:40:01 +00:00
Jonas Devlieghere 9d7cecfcbf [DebugInfo] Remove redundant argument. [NFC]
Removes the redundant UnitType parameter from verifyUnitContents. I also
fixed  some formatting issues as I was touching the file.

llvm-svn: 342396
2018-09-17 14:23:47 +00:00
Nico Weber 205ca68b8d Give InfoStreamBuilder an opt-in method to write a hash of the PDB as GUID.
Naively computing the hash after the PDB data has been generated is in practice
as fast as other approaches I tried. I also tried online-computing the hash as
parts of the PDB were written out (https://reviews.llvm.org/D51887; that's also
where all the measuring data is) and computing the hash in parallel
(https://reviews.llvm.org/D51957). This approach here is simplest, without
being slower.

Differential Revision: https://reviews.llvm.org/D51956

llvm-svn: 342333
2018-09-15 18:35:51 +00:00
Zachary Turner 4d68951e6d [PDB] Refactor a little of the Symbol creation code.
Eventually we need to be able to support nested types, which don't
have an associated CVType record.  To handle this, remove the
CVType from all of the record classes, and instead store the
deserialized record.  Then move the deserialization up to the thing
that creates the type.  This actually makes error handling better
anyway as we can return an invalid symbol instead of asserting false.

llvm-svn: 342284
2018-09-14 21:03:57 +00:00
Reid Kleckner ba732f213d Remove unused DIASession field
llvm-svn: 342272
2018-09-14 20:16:31 +00:00
Wolfgang Pieb 55dbac9f07 [DWARF] reposting r342048, which was reverted in r342056 due to buildbot
errors.
Adjusted 2 test cases for ARM and darwin and fixed a bug with the original
change in dsymutil.

llvm-svn: 342218
2018-09-14 09:14:10 +00:00
Simon Pilgrim 5b65e41a8f Fix unused variable warning. NFCI.
llvm-svn: 342128
2018-09-13 10:54:23 +00:00
David Blaikie eee709f03c DebugInfo/PDB: Remove unused member
llvm-svn: 342101
2018-09-13 00:02:02 +00:00
David Blaikie da36f3f482 dwarfdump: Improve performance on large DWP files
llvm-svn: 342099
2018-09-12 23:39:51 +00:00
Zachary Turner c43d55602f [PDB] Remove all clone() methods.
These are dead code and encourage poor usage patterns, so I'm
removing them.  They weren't called anywhere anyway.

llvm-svn: 342093
2018-09-12 22:57:03 +00:00
Zachary Turner a1f85f8bdd [PDB] Emit old fpo data to the PDB file.
r342003 added support for emitting FPO data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the PDB
file.  However, that is not the end of the story.  FPO can end
up in two different destinations in a PDB, each corresponding to
a different FPO data source.

The case handled by r342003 involves copying data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the
"New FPO" stream in the PDB, which is then referred to by the
DBI stream.  The case handled by this patch involves copying
records from the .debug$F section of an object file to the "FPO"
stream (or perhaps more aptly, the "Old FPO" stream) in the PDB
file, which is also referred to by the DBI stream.

The formats are largely similar, and the difference is mostly
only visible in masm generated object files, such as some of the
low-level CRT object files like memcpy.  MASM doesn't appear to
support writing the DEBUG_S_FRAMEDATA subsection, and instead
just writes these records to the .debug$F section.

Although clang-cl does not emit a .debug$F section ever, lld still
needs to support it so we have good debugging for CRT functions.

Differential Revision: https://reviews.llvm.org/D51958

llvm-svn: 342080
2018-09-12 21:02:01 +00:00
Wolfgang Pieb 233bc73047 Reverting r342048, which caused UBSan failures in dsymutil.
llvm-svn: 342056
2018-09-12 14:40:04 +00:00
Wolfgang Pieb 3a8781cf6c [DWARF] Refactoring range list dumping to fold DWARF v4 functionality into v5 handling
Eliminating some duplication of rangelist dumping code at the expense of
some version-dependent code in dump and extract routines.

Reviewer: dblaikie, JDevlieghere, vleschuk

Differential revision: https://reviews.llvm.org/D51081

llvm-svn: 342048
2018-09-12 12:01:19 +00:00
Zachary Turner 42e7cc1b0f [PDB] Write FPO Data to the PDB.
llvm-svn: 342003
2018-09-11 22:35:01 +00:00
Reid Kleckner a6f64265ea [codeview] Decode and dump FP regs from S_FRAMEPROC records
Summary:
There are two registers encoded in the S_FRAMEPROC flags: one for locals
and one for parameters. The encoding is described by the
ExpandEncodedBasePointerReg function in cvinfo.h. Two bits are used to
indicate one of four possible values:

  0: no register - Used when there are no variables.
  1: SP / standard - Variables are stored relative to the standard SP
     for the ISA.
  2: FP - Variables are addressed relative to the ISA frame
     pointer, i.e. EBP on x86. If realignment is required, parameters
     use this. If a dynamic alloca is used, locals will be EBP relative.
  3: Alternative - Variables are stored relative to some alternative
     third callee-saved register. This is required to address highly
     aligned locals when there are dynamic stack adjustments. In this
     case, both the incoming SP saved in the standard FP and the current
     SP are at some dynamic offset from the locals. LLVM uses ESI in
     this case, MSVC uses EBX.

Most of the changes in this patch are to pass around the CPU so that we
can decode these into real, named architectural registers.

Subscribers: hiraditya

Differential Revision: https://reviews.llvm.org/D51894

llvm-svn: 341999
2018-09-11 22:00:50 +00:00
Nico Weber e2745b5d86 pdb output: Initialize padding in PublicsStreamHeader.
Makes the produced pdbs more deterministic; before they'd contain 2 arbitary
bytes where this padding was.

Also reorder initialization to match the order of the fields in the struct (nfc)

llvm-svn: 341945
2018-09-11 14:11:52 +00:00
David Blaikie 4ec5a9159b llvm-symbolizer: Fix bug related to TUs interfering with symbolizing
With the merge of TUs and CUs into a single container, some code that
relied on the CU range having an ordered range of contiguous addresses
(for locating a CU at a given offset) broke. But the units from
debug_info (currently only CUs, but CUs and TUs in DWARFv5) are in a
contiguous sub-range of that container - searching only through that
subrange is still valid & so do that.

llvm-svn: 341889
2018-09-11 02:04:45 +00:00
Zachary Turner b789458e0c Re-run clang-format on one file.
clang-format was getting confused due to the presence of a macro
invocation that was not terminated by a semicolon.  Fixed this by
terminating the macro lines with semicolons and re-ran clang-format
on the file.

llvm-svn: 341864
2018-09-10 21:31:21 +00:00
Zachary Turner cae734588f [PDB] Change uint32_t to SymIndex wherever it makes sense.
Although it's just a typedef, it helps for readability.  NFC.

llvm-svn: 341863
2018-09-10 21:30:59 +00:00
Alexandre Ganea d93b07f0b0 [LLD][COFF] Cleanup error messages / add more coverage tests
- Log the reason for a PDB or precompiled-OBJ load failure
- Properly handle out-of-date PDB or precompiled-OBJ signature by displaying a corresponding error
- Slightly change behavior on PDB failure: any subsequent load attempt from another OBJ would result in the same error message being logged
- Slightly change behavior on PDB failure: retry with filename only if previous error was ENOENT ("no such file or directory")
- Tests: a. for native PDB errors; b. cover all the cases above

Differential Revision: https://reviews.llvm.org/D51559

llvm-svn: 341825
2018-09-10 13:51:21 +00:00
Zachary Turner 0119e38491 Fix some of the PDB tests.
They were unintentionally calling DIA directly, which requires
Windows.  We need to pass the -native flag, and this then required
fixing up one or two tests.

llvm-svn: 341731
2018-09-07 23:36:08 +00:00
Zachary Turner da4b63ab9a [PDB] Support pointer types in the native reader.
In order to start testing this, I've added a new mode to
llvm-pdbutil which is only really useful for writing tests.
It just dumps the value of raw fields in record format.
This isn't really ideal and it won't allow us to test some
important cases, but it's better than nothing for now.

llvm-svn: 341729
2018-09-07 23:21:33 +00:00
Zachary Turner 5d629966a9 [PDB] Rename some files in the native reader.
By calling these NativeType<foo>.cpp, they will all be sorted
together, and it also distinguishes the types from the symbols.

llvm-svn: 341609
2018-09-07 00:12:56 +00:00
Zachary Turner 8ab7dd6028 [PDB] Create a SymbolCache class.
Part of the responsibility of the native PDB reader is to cache
symbols the first time they are accessed, so they can then be
looked up by an ID.  Furthermore, we need to resolve type indices
to records that we vend to the user, and other things.  Previously
this code was all thrown together a bit haphazardly in the native
session class, but it makes sense to collect all of this into a
single class whose sole responsibility is to manage the collection
of known symbols.

llvm-svn: 341608
2018-09-07 00:12:34 +00:00
Zachary Turner 5cda1b802d Fix some warnings.
llvm-svn: 341508
2018-09-06 00:06:20 +00:00
Zachary Turner 7999b4fa48 [PDB] Refactor the PDB symbol classes to fix a reuse bug.
The way DIA SDK works is that when you request a symbol, it
gets assigned an internal identifier that is unique for the
life of the session.  You can then use this identifier to
get back the same symbol, with all of the same internal state
that it had before, even if you "destroyed" the original
copy of the object you had.

This didn't work properly in our native implementation, and
if you destroyed an object for a particular symbol, then
requested the same symbol again, it would get assigned a new
ID and you'd get a fresh copy of the object.  In order to fix
this some refactoring had to happen to properly reuse cached
objects.  Some unittests are added to verify that symbol
reuse is taking place, making use of the new unittest input
feature.

llvm-svn: 341503
2018-09-05 23:30:38 +00:00
Jonas Devlieghere 881452384a [dwarfdump] Improve -diff option by hiding more data.
The -diff option makes it easy to diff dwarf by hiding addresses and
offsets. However not all of them were hidden, which should be fixed by
this patch.

Differential revision: https://reviews.llvm.org/D51593

llvm-svn: 341377
2018-09-04 16:21:37 +00:00
Jonas Devlieghere 6e5c7e6037 [DebugInfo] Have the verifier accept missing linkage names.
According to the standard, for the .debug_names (the "dwarf accelerator
tables"):

> If a subprogram or inlined subroutine is included, and has a
> DW_AT_linkage_name attribute, there will be an additional index entry
> for the linkage name.

For Swift we generate DW_structure_types with a linkage name and the
verifier was incorrectly rejecting this. This patch fixes that by only
considering the linkage name in those particular cases. The test is the
"reduced" debug info of the failing swift test on swift.org.

Differential revision: https://reviews.llvm.org/D51420

llvm-svn: 341311
2018-09-03 12:12:17 +00:00
Alexandre Ganea 6a7efef4af [DebugInfo] Common behavior for error types
Following D50807, and heading towards D50664, this intermediary change does the following:

1. Upgrade all custom Error types in llvm/trunk/lib/DebugInfo/ to use the new StringError behavior (D50807).
2. Implement std::is_error_code_enum and make_error_code() for DebugInfo error enumerations.
3. Rename GenericError -> PDBError (the file will be renamed in a subsequent commit)
4. Update custom error messages to follow the same formatting: (\w\s*)+\.
5. Keep generic "file not found" (ENOENT) errors as they are in PDB code. Previously, there used to be a custom enumeration for that purpose.
6. Remove a few extraneous LF in log() implementations. Printing LF is a responsability at a higher level, not at the error level.

Differential Revision: https://reviews.llvm.org/D51499

llvm-svn: 341228
2018-08-31 17:41:58 +00:00
Victor Leschuk cf1f714d3b [DWARF] Unify warning callbacks. NFC.
Both DWARFDebugLine and DWARFDebugAddr used the same callback mechanism
for handling recoverable errors. They both implemented similar warn() function
to be used as such callbacks.

In this revision we get rid of code duplication and move this warn() function
to DWARFContext as DWARFContext::dumpWarning().

Reviewers: lhames, jhenderson, aprantl, probinson, dblaikie, JDevlieghere

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D51033

llvm-svn: 340528
2018-08-23 12:43:33 +00:00
Victor Leschuk cba595da82 [DWARF] Refactor DWARF classes to use unified error reporting. NFC.
DWARF-related classes in lib/DebugInfo/DWARF contained 
duplicating code for creating StringError instances, like:

template <typename... Ts>
static Error createError(char const *Fmt, const Ts &... Vals) {
  std::string Buffer;
  raw_string_ostream Stream(Buffer);
  Stream << format(Fmt, Vals...);
  return make_error<StringError>(Stream.str(), inconvertibleErrorCode());
}

Similar function was placed in Support lib in https://reviews.llvm.org/D49824

This revision makes DWARF classes use this function
instead of their local implementation of it.

Reviewers: aprantl, dblaikie, probinson, wolfgangp, JDevlieghere, jhenderson

Reviewed By: JDevlieghere, jhenderson

Differential Revision: https://reviews.llvm.org/D49964

llvm-svn: 340163
2018-08-20 09:59:08 +00:00
Reid Kleckner bd5d71229d [codeview] Use push_macro to avoid conflicts instead of a prefix
Summary:
This prefix was added in r333421, and it changed our dumper output to
say things like "CVRegEAX" instead of just "EAX". That's a functional
change that I'd rather avoid.

I tested GCC, Clang, and MSVC, and all of them support #pragma
push_macro. They don't issue warnings whem the macro is not defined
either.

I don't have a Mac so I can't test the real termios.h header, but I
looked at the termios.h sources online and looked for other conflicts.
I saw only the CR* macros, so those are the ones we work around.

Reviewers: zturner, JDevlieghere

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50851

llvm-svn: 339907
2018-08-16 17:34:31 +00:00
Paul Robinson 508b081514 [DWARF] Verifier now handles .debug_types sections.
Differential Revision: https://reviews.llvm.org/D50466

llvm-svn: 339302
2018-08-08 23:50:22 +00:00
Alexandre Ganea 741cc3531a [llvm-pdbutil] Support PDBs without a DBI stream
Differential Revision: https://reviews.llvm.org/D50258

llvm-svn: 339045
2018-08-06 19:35:00 +00:00
Jonas Devlieghere 3a92c5c1d3 [DebugInfo/Verifier] Don't emit error for missing module in index
We don't expect module names to be present in the index. This patch adds
DW_TAG_module to the blacklist.

Differential revision: https://reviews.llvm.org/D50237

llvm-svn: 338878
2018-08-03 12:01:43 +00:00
Paul Robinson 96545db374 [DebugInfo/DWARF] Remove redundant iterator type. NFC
llvm-svn: 338759
2018-08-02 19:29:38 +00:00
Paul Robinson 2c25f345d7 [DebugInfo/DWARF] [4/4] Unify handling of compile and type units. NFC
This is patch 4 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.

Patch 4 combines separate DWARFUnitVectors for compile and type units
into a single DWARFUnitVector that contains both.  For now the
implementation distinguishes compile units from type units by putting
all compile units at the front of the vector, reflecting the DWARF v4
distinction between .debug_info and .debug_types sections.  A future
patch will change this to allow the free mixing of unit kinds, as is
specified by DWARF v5.

Differential Revision: https://reviews.llvm.org/D49744

llvm-svn: 338633
2018-08-01 20:54:11 +00:00
Paul Robinson 11307fab93 [DebugInfo/DWARF] [3/4] Rename DWARFUnitSection to DWARFUnitVector. NFC
This is patch 3 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.

Patch 3 simply renames DWARFUnitSection to DWARFUnitVector, as the
object-file section of a unit is nearly irrelevant now.

Differential Revision: https://reviews.llvm.org/D49743

llvm-svn: 338632
2018-08-01 20:49:44 +00:00
Paul Robinson 7f33094486 [DebugInfo/DWARF] [2/4] Type units no longer in a std::deque. NFC
This is patch 2 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.

Patch 2 takes the existing std::deque<DWARFUnitSection> for type units
and makes it a simple DWARFUnitSection, simplifying the handling of
type units and making it more consistent with compile units.

Differential Revision: https://reviews.llvm.org/D49742

llvm-svn: 338629
2018-08-01 20:46:46 +00:00
Paul Robinson 143eaeab53 [DebugInfo/DWARF] [1/4] De-templatize DWARFUnitSection. NFC
This is patch 1 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.

Patch 1 replaces the templated DWARFUnitSection with a non-templated
version. That is, instead of being a SmallVector of pointers to a
specific unit kind, it is not a SmallVector of pointers to the base
class for both type and compile units.  Virtual methods are magic.

Differential Revision: https://reviews.llvm.org/D49741

llvm-svn: 338628
2018-08-01 20:43:47 +00:00
Victor Leschuk 58d3399d8a [DWARF] Support for .debug_addr (consumer)
This patch implements basic support for parsing
  and dumping DWARFv5 .debug_addr section.

llvm-svn: 338447
2018-07-31 22:19:19 +00:00
Alexandre Ganea ee8a720051 [CodeView] Minimal support for S_UNAMESPACE records
Differential Revision: https://reviews.llvm.org/D50007

llvm-svn: 338417
2018-07-31 19:15:50 +00:00
Alexandre Ganea 0bb8e89187 This fixes a crash when a second pass is required for the Codeview Type merging *and* the index points outside of the table (which should lead to an error being printed).
This occurs currently until MS precompiled headers .obj is added (see D45213)

Differential Revision: https://reviews.llvm.org/D50006

llvm-svn: 338308
2018-07-30 21:14:25 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Wolfgang Pieb 1d56b4ae40 [DWARF v5] Don't report an error when the .debug_rnglists section is empty or non-existent. Fixes PR38297.
Reviewer: JDevlieghere

Differential Revision: https://reviews.llvm.org/D49815
 

llvm-svn: 337993
2018-07-26 01:12:41 +00:00
Fangrui Song 5bad9d835a [DWARF] Use deque in place of SmallVector to fix use-after-free issue
Summary: SmallVector's elements are moved when resizing and cause use-after-free.

Reviewers: probinson, dblaikie

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D49702

llvm-svn: 337772
2018-07-23 23:27:45 +00:00
Wolfgang Pieb 790d86cefc Embed a template specialization in a namespace to work around a gcc bug.
llvm-svn: 337770
2018-07-23 23:14:23 +00:00
Wolfgang Pieb 439801ba1d [DWARF v5] Refactor range lists dumping by using a more generic way of handling tables of lists.
The intent is to use it for location list tables as well. Change is almost NFC with the exception
of the spelling of some strings used during dumping (all lowercase now).

Reviewer: JDevlieghere

Differential Revision: https://reviews.llvm.org/D49500

llvm-svn: 337763
2018-07-23 22:37:17 +00:00
Mandeep Singh Grang 20239b18bb [llvm] Change 2 instances of std::sort to llvm::sort
llvm-svn: 337192
2018-07-16 17:26:37 +00:00
Jonas Devlieghere 327e7a1608 [dwarfdump] Add pretty printer for accelerator table based on Atom.
For instance, When dumping .apple_types, the second atom represents the
DW_TAG. In addition to printing the raw value, we now also pretty print
the value if the ATOM tells us how.

llvm-svn: 337026
2018-07-13 17:21:51 +00:00
Fangrui Song 24452316c6 [DebugInfo] Fix getPreviousSibling after r336823
llvm-svn: 336837
2018-07-11 19:09:37 +00:00
Jonas Devlieghere 3f27e57ade [DebugInfo] Make children iterator bidirectional
Make the DIE iterator bidirectional so we can move to the previous
sibling of a DIE.

Differential revision: https://reviews.llvm.org/D49173

llvm-svn: 336823
2018-07-11 17:11:11 +00:00
Rui Ueyama 0230f7c763 Use StringRef instead of `const char *`.
I don't think there's a need to use `const char *`. In most (probably all?)
cases, we need a length of a name later, so discarding a length will
lead to a wasted effort.

Differential Revision: https://reviews.llvm.org/D49046

llvm-svn: 336612
2018-07-09 22:26:49 +00:00
Maksim Panchenko fa762cc19b [DebugInfo] Change default value of FDEPointerEncoding
Summary:
If the encoding is not specified in CIE augmentation string, then it
should be DW_EH_PE_absptr instead of DW_EH_PE_omit.

Reviewers: ruiu, MaskRay, plotfi, rafauler

Reviewed By: MaskRay

Subscribers: rafauler, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D49000

llvm-svn: 336577
2018-07-09 18:45:38 +00:00
Benjamin Kramer 9fc944ae36 [PDB] memicmp only exists on Windows, use StringRef::compare_lower instead
llvm-svn: 336469
2018-07-06 21:56:57 +00:00
Zachary Turner 648bebdc67 [PDB] One more fix for hasing GSI records.
The reference implementation uses a case-insensitive string
comparison for strings of equal length.  This will cause the
string "tEo" to compare less than "VUo".  However we were using
a case sensitive comparison, which would generate the opposite
outcome.  Switch to a case insensitive comparison.  Also, when
one of the strings contains non-ascii characters, fallback to
a straight memcmp.

The only way to really test this is with a DIA test.  Before this
patch, the test will fail (but succeed if link.exe is used instead
of lld-link).  After the patch, it succeeds even with lld-link.

llvm-svn: 336464
2018-07-06 21:01:42 +00:00
Zachary Turner 1f200adfa7 [PDB] Sort globals symbols by name in GSI hash buckets.
It seems like the debugger first computes a symbol's bucket,
and then does a binary search of entries in the bucket using the
symbol's name in order to find it.  If the bucket entries are not
in sorted order, this obviously won't work.  After this patch a
couple of simple test cases show that we generate an exactly
identical GSI hash stream, which is very nice.

llvm-svn: 336405
2018-07-06 02:33:58 +00:00
Zachary Turner 68e1919d14 [CodeView] Correctly compute the name of S_PROCREF symbols.
We have a function which switches on the type of a symbol record
to return a hardcoded offset into the record that contains the
symbol name.  Not all symbols have names to begin with, and for
those records we return -1 for the offset.

Names are used for various things.  Importantly for this particular
bug, a hash of the record name is used as a key for certain hash
tables which are serialied into the PDB file.  One of these hash
tables is for the global symbol stream, which is basically a
collection of S_PROCREF symbols which contain the name of the
symbol, a module, and an address offset.

However, for S_PROCREF symbols, the function to return the offset
of the name was returning -1: basically it wasn't implemented.
As a result of this, all global symbols were hashing to the same
value, essentially it was as if every single global symbol's name
was the empty string.

This manifests in the VS debugger when you try to call a function
(global or member, doesn't matter) through the immediate window
and the debugger simply reports an error because it can't find the
function.  This makes perfect sense, because it is hashing the name
for real, looking in the global symbol hash table, and there is only
1 entry there which corresponds to a symbol whose name is the empty
string.

Fixing this fixes the MSVC debugger in this case.

llvm-svn: 336024
2018-06-29 22:19:02 +00:00
Paul Robinson 50f8ca38ee Pass DWARFUnit to verifier by reference not by value. I am moderately
sure this should not cause a memory leak.

llvm-svn: 336007
2018-06-29 19:17:44 +00:00
Zachary Turner ee8010abe3 Move some code from PDBFileBuilder to MSFBuilder.
The code to emit the pieces of the MSF file were actually in
PDBFileBuilder.  Move this to MSFBuilder so that we can
theoretically emit an MSF without having a PDB file.

llvm-svn: 335789
2018-06-27 21:18:15 +00:00
Kamil Rytarowski a8448ad098 Handle NetBSD specific path in findDebugBinary()
Summary:
The NetBSD Operating System installs debuginfo
files into /usr/libdata/debug, rather than other path
like in some other popular distribution.

This change makes llvm-symbolizer functional with
the basesystem executables.

Reviewers: joerg, vitalybuka

Reviewed By: vitalybuka

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48525

llvm-svn: 335511
2018-06-25 18:49:13 +00:00
Wolfgang Pieb 61d8c8d9b3 [DWARF] Improved error reporting for range lists.
Errors found processing the DW_AT_ranges attribute are propagated by lower level 
routines and reported by their callers.

Reviewer: JDevlieghere

Differential Revision: https://reviews.llvm.org/D48344

llvm-svn: 335188
2018-06-20 22:56:37 +00:00
Pavel Labath 4adc88ed25 [DWARF/AccelTable] Remove getDIESectionOffset for DWARF v5 entries
Summary:
This method was not correct for entries in DWO files as it assumed it
could just add up the CU and DIE offsets to get the absolute DIE offset.
This is not correct for the DWO files, as here the CU offset will
reference the skeleton unit, whereas the DIE offset will be the offset
in the full unit in the DWO file.

Unfortunately, this means that we are not able to determine the absolute
DIE offset using the information in the .debug_names section alone,
which means we have to offload some of this work to the users of this
class.

To demonstrate how this can be done, I've added/fixed the ability to
lookup entries using accelerator tables in DWO files in llvm-dwarfdump.
To make this happen, I've needed to make two extra changes in other
classes:
- made the DWARFContext method to lookup a CU based on the section
  offset public. I've needed this functionality to lookup a CU, and this
  seems like a useful thing in general.
- made DWARFUnit::getDWOId call extractDIEsIfNeeded. Before this, the
  DWOId was filled in only if the root DIE happened to be parsed
  before we called the accessor. Since the lazy parsing is supposed to
  happen under the hood, calling extractDIEsIfNeeded seems appropriate.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D48009

llvm-svn: 334578
2018-06-13 08:14:27 +00:00
Pavel Labath d6ca063907 DWARFAcceleratorTable: Add an iterator-based api for accessing names in the index
Summary:
Back when we were introducing the DWARF v5 name index, there was a
short discussion whether we shouldn't have a nicer api for iterating
over the index. At that time, I did not find it necessary since the
iteration over names was done only from within the index itself (and I
figured the internal implementation can deal with a slightly rough
interface).

However, now I ran into a use for this kind of API in LLDB (for finding
all names matching a regular expression), so it looked like a nice
opportunity to introduce one. To make the API more useful, I've made the
NameTableEntry class a bit smarter: it now stores the string section
reference (so it can return its name) and its position in the name index
(mainly useful for dumping/logging).

I also convert the internal users to use the new API, which also gives
test coverage for the added code.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47590

llvm-svn: 333738
2018-06-01 10:33:11 +00:00
Pavel Labath 59870af66f DWARFAcceleratorTable: fix equal_range iterators
Summary:
Both (Apple and DWARF5) implementations of the iterators had bugs which
resulted in crashes if one attempted to iterate through the accelerator
tables all the way.

For the Apple tables, the issue was that we did not clear the DataOffset
field when we reached the end, which made our iterator compare unequal
to the "end" iterator. For the Dwarf5 tables, the problem was that we
incremented the CurrentIndex pointer and then used the incremented
(possibly invalid) pointer to check whether we have reached the end of
the index list.

The reason these bugs went undetected is because their only user
(dwarfdump) only ever searched for the first match. Besides allowing us
to test this fix, changing llvm-dwarfdump --find to display all matches
seems like a good improvement (it makes the behavior consistent with the
--name option), so I change llvm-dwarfdump to do that.

The existing tests would be sufficient to test this fix with the new
llvm-dwarfdump behavior, but I add a special test that demonstrates that
the tool indeed displays multiple results. The find.test test needed to
be tweaked a bit as the tool now does not print the ".debug_info
contents" header (also consistent with how --name works).

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D47543

llvm-svn: 333635
2018-05-31 08:47:00 +00:00
Jonas Devlieghere 43dce3edbe [CodeView] Add prefix to CodeView registers.
Adds CVReg to CodeView register names to prevent a duplicate symbol with
CR3 defined in termios.h, as suggested by Zachary on the mailing list.

http://lists.llvm.org/pipermail/llvm-dev/2018-May/123372.html

Differential revision: https://reviews.llvm.org/D47478

rdar://39863705

llvm-svn: 333421
2018-05-29 14:35:34 +00:00
Jonas Devlieghere cb547cbb5c [dwarfdump] Make -c and -p work together
When requesting to dump both the parent chain and children, we used to
print the DIE more than once because we propagated the dump options to
the parent without clearing the respective flags. This commit fixes this
oversight and adds a test.

rdar://39415292

Differential revision: https://reviews.llvm.org/D47263

llvm-svn: 333350
2018-05-26 19:39:56 +00:00
Jonas Devlieghere 63eca15e95 [DebugInfo] Invert DIE order for range errors.
When printing an error for an invalid address range in a DIE, we used to
print the child above the parent, which is counter intuitive. This patch
reverses the order and indents the child to mimic the way we print the
debug info section.

llvm-svn: 333006
2018-05-22 17:38:03 +00:00
Jonas Devlieghere 7e0b023302 [DebugInfo] Fix location list check in the verifier
We weren't properly verifying location lists because we tried obtaining
the offset as a constant.

llvm-svn: 333005
2018-05-22 17:37:27 +00:00
Paul Robinson 543c0e1d50 [DWARFv5] Put the DWO ID in its place.
In DWARF v5, the DWO ID is in the (split/skeleton) CU header, not an
attribute on the CU DIE.

This changes the size of those headers, so use the parsed size whenever
we have one, for simplicitly.

Differential Revision: https://reviews.llvm.org/D47158

llvm-svn: 333004
2018-05-22 17:27:31 +00:00
Jonas Devlieghere c111382aa8 [DebugInfo] Use absolute addresses in location lists
Rather than relying on the user to do the address calculating in
DW_AT_location we should just dump the absolute address.

rdar://problem/38513870

Differential revision: https://reviews.llvm.org/D47152

llvm-svn: 332873
2018-05-21 19:36:54 +00:00
James Henderson 004b729ed1 [DWARF] Refactor callback usage for .debug_line error handling
Change the "recoverable" error callback to take an Error instaed of a
string.

Reviewed by: JDevlieghere

Differential Revision: https://reviews.llvm.org/D46831

llvm-svn: 332845
2018-05-21 15:30:54 +00:00
Wolfgang Pieb 20e1546655 Fixing buildbot error introduced with r332759.
llvm-svn: 332772
2018-05-18 21:44:28 +00:00
Wolfgang Pieb 401b5ecfea Addressing a couple of compiler warnings introduced with r332759.
llvm-svn: 332766
2018-05-18 20:51:16 +00:00
Wolfgang Pieb da71639cdb Fixing build error introduced with r332759.
llvm-svn: 332762
2018-05-18 20:35:13 +00:00
Wolfgang Pieb ad60559be7 [DWARF v5] Improved support for .debug_rnglists (consumer). Enables any consumer to
extract DWARF v5 encoded rangelists.

Reviewer: JDevlieghere

Differential Revision: https://reviews.llvm.org/D45549

llvm-svn: 332759
2018-05-18 20:12:54 +00:00
Zachary Turner c762666e87 Resubmit [pdb] Change /DEBUG:GHASH to emit 8 byte hashes."
This fixes the remaining failing tests, so resubmitting with no
functional change.

llvm-svn: 332676
2018-05-17 22:55:15 +00:00
Zachary Turner 1de9fce151 Revert "[pdb] Change /DEBUG:GHASH to emit 8 byte hashes."
A few tests haven't been properly updated, so reverting while
I have time to investigate proper fixes.

llvm-svn: 332672
2018-05-17 21:49:25 +00:00
Zachary Turner 3c4c8a0937 [pdb] Change /DEBUG:GHASH to emit 8 byte hashes.
Previously we emitted 20-byte SHA1 hashes.  This is overkill
for identifying debug info records, and has the negative side
effect of making object files bigger and links slower.  By
using only the last 8 bytes of a SHA1, we get smaller object
files and ~10% faster links.

This modifies the format of the .debug$H section by adding a new
value for the hash algorithm field, so that the linker will still
work when its object files have an old format.

Differential Revision: https://reviews.llvm.org/D46855

llvm-svn: 332669
2018-05-17 21:22:48 +00:00
Reid Kleckner f40f85868e [codeview] Include record prefix in global type hashing
The prefix includes type kind, which is important to preserve. Two
different type leafs can easily have the same interior record contents
as another type.

We ran into this issue in PR37492 where a bitfield type record collided
with a const modifier record. Their contents were bitwise identical, but
their kinds were different.

llvm-svn: 332664
2018-05-17 20:47:22 +00:00
Pavel Labath 80827f10a1 Reapply "DWARFVerifier: Check "completeness" of .debug_names section"
This is a resubmit of r331868 (D46583), which was reverted due to
failures on the PS4 bot.

These have been resolved with r332246/D46748.

llvm-svn: 332349
2018-05-15 13:24:10 +00:00
Paul Robinson 5f53f07b66 [DWARF] Factor out a DWARFUnitHeader class. NFC
Extract information related to a "unit header" from DWARFUnit into a
new DWARFUnitHeader class, and add a DWARFUnit member for the header.
This is one step in the direction of allowing type units in the
.debug_info section for DWARF v5.

Differential Revision: https://reviews.llvm.org/D46707

llvm-svn: 332289
2018-05-14 20:32:31 +00:00
Pavel Labath 2a6afe5f87 [CodeGen/AccelTable]: Handle -dwarf-linkage-names=Abstract correctly
Summary:
If we are not emitting a linkage name in the .debug_info sections, we
should not add it into the index either. This makes sure our index is
consistent with the actual debug info.

I am also explicitly setting the --dwarf-linkage-names=All in the
name-collsions test as that one would now fail on targets where this
defaults to "Abstract" (in fact, it would have failed already if there
wasn't a bug in the DWARF verifier, which I fix as well).

Reviewers: probinson, aprantl, JDevlieghere

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46748

llvm-svn: 332246
2018-05-14 14:13:20 +00:00
Wolfgang Pieb f2b6915ed4 [DWARF] Fixing a bug in DWARF v5 string offsets tables where the length encoded the contribution
length excluding the table header. Instead it must encode the contribution length minus the length
field itself.

Reviewer: JDevliegehere

Differential Revision: https://reviews.llvm.org/D45922

llvm-svn: 332030
2018-05-10 20:02:34 +00:00
James Henderson 11a9de74c9 Fix signed/unsigned comparison warning and print format
The print format was causing at least 2 unit-test failures from r331971.

The signed/unsigned comparison warnings only appeared to affect two lines but
it was unclear whether it might just pop up on other lines, so I have been
explicit in all the literals in the tests.

There were other bot unit-test failures that I am still investigating.

llvm-svn: 331978
2018-05-10 12:15:43 +00:00
James Henderson a3acf99e59 [DWARF] Rework debug line parsing to use llvm::Error and callbacks
Reviewed by: dblaikie, JDevlieghere, espindola

Differential Revision: https://reviews.llvm.org/D44560

Summary:
The .debug_line parser previously reported errors by printing to stderr and
return false. This is not particularly helpful for clients of the library code,
as it prevents them from handling the errors in a manner based on the calling
context. This change switches to using llvm::Error and callbacks to indicate
what problems were detected during parsing, and has updated clients to handle
the errors in a location-specific manner. In general, this means that they
continue to do the same thing to external users. Below, I have outlined what
the known behaviour changes are, relating to this change.

There are two levels of "errors" in the new error mechanism, to broadly
distinguish between different fail states of the parser, since not every
failure will prevent parsing of the unit, or of subsequent unit. Malformed
table errors that prevent reading the remainder of the table (reported by
returning them) and other minor issues representing problems with parsing that
do not prevent attempting to continue reading the table (reported by calling a
specified callback funciton). The only example of this currently is when the
last sequence of a unit is unterminated. However, I think it would be good to
change the handling of unrecognised opcodes to report as minor issues as well,
rather than just printing to the stream if --verbose is used (this would be a
subsequent change however).

I have substantially extended the DwarfGenerator to be able to handle
custom-crafted .debug_line sections, allowing for comprehensive unit-testing
of the parser code. For now, I am just adding unit tests to cover the basic
error reporting, and positive cases, and do not currently intend to test every
part of the parser, although the framework should be sufficient to do so at a
later point.

Known behaviour changes:
  - The dump function in DWARFContext now does not attempt to read subsequent
  tables when searching for a specific offset, if the unit length field of a
  table before the specified offset is a reserved value.
  - getOrParseLineTable now returns a useful Error if an invalid offset is
  encountered, rather than simply a nullptr.
  - The parse functions no longer use `WithColor::warning` directly to report
  errors, allowing LLD to call its own warning function.
  - The existing parse error messages have been updated to not specifically
  include "warning" in their message, allowing consumers to determine what
  severity the problem is.
  - If the line table version field appears to have a value less than 2, an
  informative error is returned, instead of just false.
  - If the line table unit length field uses a reserved value, an informative
  error is returned, instead of just false.
  - Dumping of .debug_line.dwo sections is now implemented the same as regular
  .debug_line sections.
  - Verbose dumping of .debug_line[.dwo] sections now prints the prologue, if
  there is a prologue error, just like non-verbose dumping.

As a helper for the generator code, I have re-added emitInt64 to the
AsmPrinter code. This previously existed, but was removed way back in r100296,
presumably because it was dead at the time.

This change also requires a change to LLD, which will be committed separately.

llvm-svn: 331971
2018-05-10 10:51:33 +00:00
Pavel Labath e0207a60dd Revert "DWARFVerifier: Check "completeness" of .debug_names section"
The new verifier check has found an error in the
debug-names-name-collisions.ll test on the PS4 bot:

error: Name Index @ 0x0: Entry @ 0xdc: mismatched Name of DIE @ 0x23: index - _ZN3foo3fooE; debug_info - foo.

Reverting while I investigate whether this is a bug in the verifier or
the generator.

This reverts commit r331868.

llvm-svn: 331869
2018-05-09 12:26:19 +00:00
Pavel Labath 3280e0467f DWARFVerifier: Check "completeness" of .debug_names section
Summary:
This patch implements a check which makes sure all entries required by
the DWARF v5 specification are present in the Name Index. The algorithm
tries to follow the wording of Section 6.1.1.1 of the spec as closely as
possible.

The main deviation from it is that instead of a whitelist-based approach
in the spec "The name index must contain an entry for each debugging
information entry that defines a named subprogram, label, variable,
type, or namespace" I chose a blacklist-based one, where I consider
everything to be "in" and then remove the entries that don't make sense.
I did this because it has more potential for catching interesting cases
and the above is a bit vague (it uses plain words like "variable" and
"subprogram", but the rest of the section speaks about specific TAGs).

This approach has raised some interesting questions, the main one being
whether enumerator values should be indexed. The consensus seems to be
that they should, although it does not follow from section 6.1.1.1.
For the time being I made the verifier ignore these, as LLVM does not do
this yet, and I wanted to get a clean run when verifying generated debug
info.

Another interesting case was the DW_TAG_imported_declaration. It was not
immediately clear to me whether this should go in or not, but currently
it is not indexed, and (unlike the enumerators) in does not seem to cause
problems for LLDB, so I've also ignored it.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46583

llvm-svn: 331868
2018-05-09 12:06:17 +00:00
Fangrui Song bd088560a8 [DebugInfo] Accept `S` in augmentation strings in CIE.
glibc libc.a(sigaction.o) compiled from sysdeps/unix/sysv/linux/x86_64/sigaction.c uses "zRS".

llvm-svn: 331738
2018-05-08 06:21:12 +00:00
David Blaikie aa537da89f llvm-symbolizer: Handle function definitions nested within other functions
LLVM always puts function definition DIEs at the top level, but under
some circumstances GCC does not (at least in this case with member
functions of a function-local type).

To ensure that doesn't appear as though the local type's member function
is unduly inlined within the outer function - ensure the inline
discovery DIE parent walk stops at the first DW_TAG_subprogram.

llvm-svn: 331291
2018-05-01 18:08:45 +00:00
Jonas Devlieghere 4bbcb5ab04 [DebugInfo] Prevent infinite recursion for malformed DWARF
This prevents infinite recursion in DWARFDie::findRecursively for
malformed DWARF where a DIE references itself.

This fixes PR36257.

Differential revision: https://reviews.llvm.org/D43092

llvm-svn: 331200
2018-04-30 17:02:41 +00:00
Zachary Turner 194be871b9 [LLD/PDB] Emit first section contribution for DBI Module Descriptor.
Part of the DBI stream is a list of variable length structures
describing each module that contributes to the final executable.

One member of this structure is a section contribution entry that
describes the first section contribution in the output file for
the given module.

We have been leaving this structure unpopulated until now, so with
this patch it is now filled out correctly.

Differential Revision: https://reviews.llvm.org/D45832

llvm-svn: 330457
2018-04-20 18:00:46 +00:00
Andrew Ng 7a2fa74ab0 [DebugInfo] Use WithColor for more debug line warnings
Updated two more debug line related warnings to use WithColor. This was
necessary to ensure consistent output order of the warnings on Windows
for debug line tests.

Differential Revision: https://reviews.llvm.org/D45871

llvm-svn: 330440
2018-04-20 15:29:47 +00:00
Zachary Turner bee6c22414 [llvm-pdbutil] Dump first section contribution for each module.
The DBI stream contains a list of module descriptors.  At the
beginning of each descriptor is a structure representing the first
section contribution in the output file for that module.  LLD
currently doesn't fill out this structure at all, but link.exe
does.  So as a precursor to emitting this data in LLD, we first
need a way to dump it so that it can be checked.

This patch adds support for the dumping, and verifies via a test
that LLD emits bogus information.

llvm-svn: 330208
2018-04-17 20:06:43 +00:00
Zachary Turner d8d97de514 [PDB] Correctly use the target machine when writing DBI stream.
Using Config->is64() will treat ARM64 as Amd64, which is incorrect.
Furthermore, there are more esoteric architectures that could
theoretically be encountered.  Just set it directly to the machine
type, which we already know anyway.

llvm-svn: 330157
2018-04-16 20:42:06 +00:00
Zachary Turner e3fe669855 Resubmit "Fix some incorrect fields in our generated PDBs."
This fixes the failing tests.  They simply hadn't been updated
to match the new output resulting from this patch.

llvm-svn: 330145
2018-04-16 18:17:13 +00:00
Zachary Turner 52c80e3860 Revert "Fix some incorrect fields in our generated PDBs."
There are a couple of failing tests which slipped under my radar
so I'm reverting this while I attempt to fix.

llvm-svn: 330133
2018-04-16 16:55:41 +00:00
Brock Wyma 94ece8fbc9 [CodeView] Initial support for emitting S_THUNK32 symbols for compiler...
When emitting CodeView debug information, compiler-generated thunk routines
should be emitted using S_THUNK32 symbols instead of S_GPROC32_ID symbols so
Visual Studio can properly step into the user code.  This initial support only
handles standard thunk ordinals.

Differential Revision: https://reviews.llvm.org/D43838

llvm-svn: 330132
2018-04-16 16:53:57 +00:00
Zachary Turner 1b06cc7817 Fix some incorrect fields in our generated PDBs.
Most of these are pretty trivial and obvious. Setting the toolchain
version to 14.11 is perhaps a little questionable, but we've been bitten
in the past where one of our version fields sidn't match MSVC's, and I
definitely don't want to go through that diagnosis again as it was
pretty time consuming and hard to track down.

I found all of these by using llvm-pdbutil export to dump the dbi and
pdb streams to a file, then using fc followed by llvm-pdbutil explain to
explain the mismatched bytes.

There are still some more, these are just the low hanging fruit.

Differential Revision: https://reviews.llvm.org/D45276

llvm-svn: 330130
2018-04-16 16:27:49 +00:00
Jonas Devlieghere 6be1f01935 [Support] Extend WithColor helpers
Although printing warnings and errors to stderr is by far the most
common case, this patch makes it possible to specify any stream.

llvm-svn: 330094
2018-04-15 08:44:15 +00:00
Jonas Devlieghere 84e99265d6 [DebugInfo] Use WithColor to print errors/warnings
Use the convenience methods from WithColor to consistently print errors
and warnings in libDebugInfo.

llvm-svn: 330092
2018-04-14 22:07:23 +00:00
Mandeep Singh Grang 0f035ebed2 [DebugInfo] Change std::sort to llvm::sort in response to r327219
r327219 added wrappers to std::sort which randomly shuffle the container before
sorting.  This will help in uncovering non-determinism caused due to undefined
sorting order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of
std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to
llvm::sort.  Refer the comments section in D44363 for a list of all the
required patches.

llvm-svn: 330061
2018-04-13 19:50:51 +00:00
Aaron Smith 3dca0bedbb [DebugInfoPDB] Add DIA implementations of findSymbolByRVA and findSymbolByAddr
llvm-svn: 329724
2018-04-10 17:33:18 +00:00
Aaron Smith c0a5c01aeb [PDB] Remove dead code and run clang format; NFC
llvm-svn: 329712
2018-04-10 15:25:04 +00:00
Alexandre Ganea 08df84e4f0 [DebugInfo][COFF] Fix reading variable-length encoded records
While reading Codeview records which contain variable-length encoded integers,
such as LF_BCLASS, LF_ENUMERATE, LF_MEMBER, LF_VBCLASS or LF_IVBCLASS,
the record's size would be improperly calculated in cases where the value was
indeed of a variable length (>= LF_NUMERIC). This caused a bad alignement on
the next record, which would/might crash later on.

Differential Revision: https://reviews.llvm.org/D45104

llvm-svn: 329659
2018-04-10 01:58:45 +00:00
Alexandre Ganea 3241cec577 Fix line endings (CR/LF -> LF) introduced by rL329613
reviewer: zturner
llvm-svn: 329646
2018-04-10 00:09:15 +00:00
Alexandre Ganea d9e96741c4 [Debuginfo][COFF] Minimal serialization support for precompiled types records
This change adds support for the LF_PRECOMP and LF_ENDPRECOMP records required
to read/write Microsoft precompiled types .objs.
See https://en.wikipedia.org/wiki/Precompiled_header#Microsoft_Visual_C_and_C++

This also adds handling for the .debug$P section, which is actually a .debug$T
section in disguise, found only in precompiled .objs.

Differential Revision: https://reviews.llvm.org/D45283

llvm-svn: 329613
2018-04-09 20:17:56 +00:00
Hiroshi Inoue 9ff2380ea6 [NFC] fix trivial typos in comments and error message
"is is" -> "is", "are are" -> "are"

llvm-svn: 329546
2018-04-09 04:37:53 +00:00
Pavel Labath c9f07b06a1 DWARFVerifier: validate information in name index entries
Summary:
This patch add checks to verify that the information in the name index
entries is consistent with the debug_info section. Specifically, we
check that entries point to valid DIEs, and their names, tags, and
compile units match the information in the debug_info sections.

These checks are only run if the previous checks did not find any errors
in the name index headers. Attempting to proceed with the checks anyway
would likely produce a lot of spurious errors and the verification code
would need to be very careful to avoid crashing.

I also add a couple of more checks to the abbreviation-validation code
to verify that some attributes are always present (an index without a
DW_IDX_die_offset attribute is fairly useless).

The entry verification works only on indexes without any type units - I
haven't attempted to extend it to type units, as we don't even have a
DWARF v5-compatible type unit generator at the moment.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45323

llvm-svn: 329392
2018-04-06 13:34:12 +00:00
Pavel Labath 54ca2d688a [debug_loc] Fix typo in DWARFExpression constructor
Summary:
The positions of the DwarfVersion and AddressSize arguments were
reversed, which caused parsing for dwarf opcodes which contained
address-size-dependent operands (such as DW_OP_addr). Amusingly enough,
none of the address-size asserts fired, as dwarf version was always 4,
which is a valid address size.

I ran into this when constructing weird inputs for the DWARF verifier. I
I add a test case as hand-written dwarf -- I am not sure how to trigger
this differently, as having a DW_OP_addr inside a location list is a
fairly non-standard thing to do.

Fixing this error exposed a bug in the debug_loc.dwo parser, which was
always being constructed with an address size of 0. I fix that as well
by following the pattern in the non-dwo parser of picking up the address
size from the first compile unit (which is technically not correct, but
probably good enough in practice).

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45324

llvm-svn: 329381
2018-04-06 08:49:57 +00:00
Wolfgang Pieb 3fb9e3f398 [DWARF v5][NFC]: Refactor DebugRnglists to prepare for the support of the DW_AT_ranges
attribute in conjunction with .debug_rnglists.

Reviewers: JDevlieghere
 
Differential Revision: https://reviews.llvm.org/D45307

llvm-svn: 329345
2018-04-05 21:01:49 +00:00
Zachary Turner 15b2bdfd8b [llvm-pdbutil] Add the ability to explain binary files.
Using this, you can use llvm-pdbutil to export the contents of a
stream to a binary file, then run explain on the binary file so
that it treats the offset as an offset into the stream instead
of an offset into a file.  This makes it easy to compare the
contents of the same stream from two different files.

llvm-svn: 329207
2018-04-04 17:29:09 +00:00
Nico Weber 086b1c8118 Minor no-op cmake file style fix.
llvm-svn: 329137
2018-04-04 00:50:22 +00:00
Aaron Smith 47f18b91bb [DebugInfoPDB] Add a few missing definitions to PDBTypes.h
The missing definitions are from cvconst.h shipped with DIA SDK.

Correct the url to MSDN for MemoryTypeEnum and set the underlying 
type of PDB_StackFrameType and PDB_MemoryType to uint16_t.

llvm-svn: 329104
2018-04-03 19:41:27 +00:00
Zachary Turner d11328a1bb [llvm-pdbutil] Add an export subcommand.
This command can dump the binary contents of a stream to a file.
This is useful when you want to do side-by-side comparisons of
a specific stream from two PDBs to examine the differences between
them.  You can export both of them to a file, then open them up
side by side in a hex editor (for example), so as to eliminate any
differences that might arise from the contents being on different
blocks in the PDB.

In subsequent patches I plan to improve the "explain" subcommand
so that you can explain the contents of a binary file that isn't
necessarily a full PDB, but one of these dumped streams, by telling
the subcommand how to interpret the contents.

llvm-svn: 329002
2018-04-02 18:35:21 +00:00
Mandeep Singh Grang fe1d28e83d [DebugInfo] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: echristo, zturner, samsonov

Reviewed By: echristo

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D45134

llvm-svn: 328935
2018-04-01 16:18:49 +00:00
Zachary Turner d5cf5cf637 [llvm-pdbutil] Dig deeper into the PDB and DBI streams when explaining.
This will show more detail when using `llvm-pdbutil explain` on an
offset in the DBI or PDB streams.  Specifically, it will dig into
individual header fields and substreams to give a more precise
description of what the byte represents.

llvm-svn: 328878
2018-03-30 17:16:50 +00:00
Zachary Turner 3203e27473 [MSF] Default to FPM2, and always mark FPM pages allocated.
There are two FPMs in an MSF file, the idea being that for
incremental updates you can write to the alternate one and then
atomically swap them on commit.  LLVM defaulted to using FPM1
on the first commit, but this differs from Microsoft's behavior
which is to default to using FPM2 on the first commit.  To
eliminate some byte-level file differences, this patch changes
LLVM's default to also be FPM2.

Additionally, LLVM was trying to be "smart" about marking FPM
pages allocated.  In addition to marking every page belonging
to the alternate FPM as unallocated, LLVM also marked pages at
the end of the main FPM which were not needed as unallocated.

In order to match the behavior of Microsoft-generated PDBs, we
now always mark every FPM block as allocated, regardless of
whether it is in the main FPM or the alt FPM, and regardless of
whether or not it describes blocks which are actually in the file.

This has the side benefit of simplifying our code.

llvm-svn: 328812
2018-03-29 18:34:15 +00:00
Pavel Labath ea0f841c3b .debug_names: Correctly align the AugmentationStringSize field
We should align the value of the field, not the overall section offset.

This distinction matters if one of the debug_names contributions is not
of size which is a multiple of four. The dwarf producers may choose to
emit rounded contributions, but they are not required to do so. In the
latter case, without this patch we would corrupt the parsing state, as
we would adjust the offset even if subsequent contributions contained
correctly rounded augmentation strings.

llvm-svn: 328796
2018-03-29 15:12:45 +00:00
Pavel Labath 2d1fc4375f .debug_names: Parse DW_IDX_die_offset as a reference
Before this patch we were parsing the attributes as section offsets, as
that is what apple_names is doing. However, this is not correct as DWARF
v5 specifies that this attribute should use the Reference form class.

This also updates all the testcases (except the ones that deliberately
pass a different form) to use the correct form class.

llvm-svn: 328773
2018-03-29 13:47:57 +00:00
Wolfgang Pieb ab068eaa57 [DWARF][DWARF v5]: Adding support for dumping DW_RLE_offset_pair and DW_RLE_base_address
Reviewers: dblakie, aprantl

Differential Revision: https://reviews.llvm.org/D44811

llvm-svn: 328662
2018-03-27 20:27:36 +00:00
Aaron Smith f13938382c [DebugInfoPDB] Print the method name along with the variant value
Before this change, using dumpProperties() with PDBSymbolData
would look like this:

  get_locationType: 3
  1

After this change:

  get_locationType: 3
  get_value: 1

llvm-svn: 328590
2018-03-26 22:53:38 +00:00
Aaron Smith 1af50bcf89 [DebugInfoPDB] Add methods to get the compiland and line numbers with PDBSymbolData
llvm-svn: 328587
2018-03-26 22:17:12 +00:00
Aaron Smith ed81a9db29 [DebugInfoPDB] Add DIA implementation of findLineNumbersByRVA
This method is used to find line numbers for PDBSymbolData
that have an invalid virtual address.

llvm-svn: 328586
2018-03-26 22:13:22 +00:00
Aaron Smith 53708a5e9e [DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVA
These are used in finding line numbers for PDBSymbolData

llvm-svn: 328585
2018-03-26 22:10:02 +00:00
Paul Robinson 82e4864730 Use correct format specifier.
Review comment on r328235 by James Henderson.

llvm-svn: 328578
2018-03-26 19:55:01 +00:00
Zachary Turner f228276262 [PDB] Resubmit "Support embedding natvis files in PDBs."
This was reverted several times due to what ultimately turned out
to be incompatibilities in our serialized hash table format.

Several changes went in prior to this to fix those issues since
they were more fundamental and independent of supporting injected
sources, so now that those are fixed this change should hopefully
pass.

llvm-svn: 328363
2018-03-23 19:57:25 +00:00
Zachary Turner a6fb536e5b [PDB] Make our PDBs look more like MS PDBs.
When investigating bugs in PDB generation, the first step is
often to do the same link with link.exe and then compare PDBs.

But comparing PDBs is hard because two completely different byte
sequences can both be correct, so it hampers the investigation when
you also have to spend time figuring out not just which bytes are
different, but also if the difference is meaningful.

This patch fixes a couple of cases related to string table emission,
hash table emission, and the order in which we emit strings that
makes more of our bytes the same as the bytes generated by MS PDBs.

Differential Revision: https://reviews.llvm.org/D44810

llvm-svn: 328348
2018-03-23 18:43:39 +00:00
Paul Robinson 7947468e69 [DWARF] Replace assert with diagnostic. PR36868.
llvm-svn: 328235
2018-03-22 19:37:56 +00:00
Zachary Turner 71d36ad9f9 [Codeview/PDB] Rename some methods for clarity.
NFC, this just renames some methods to better express what they
do, and also adds a few helper methods to add some symmetry to the
API in a few places (for example there was a getStringFromId but not
a getIdFromString method in the string table).

llvm-svn: 328221
2018-03-22 17:37:28 +00:00
Pavel Labath 79cd942c23 DWARFVerifier: verify debug_names abbreviation table
Summary:
This commit adds checks of the abbreviation table in a DWARF v5 Name
Index. The most interesting/useful check is the one which checks that
each index attributes is encoded using the correct form class, but it
also checks for the more obvious errors like unknown
forms/tags/attributes and duplicated attributes.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44736

llvm-svn: 328202
2018-03-22 14:50:44 +00:00
Aaron Smith 523de05a1f [DIA] Add IPDBSectionContrib interfaces and DIA implementation
To resolve symbol context at a particular address, we need to
determine the compiland for the address. We are able to determine
the parent compiland of PDBSymbolFunc, PDBSymbolTypeUDT,
PDBSymbolTypeEnum symbols indirectly through line information. 
However no such information is availabile for PDBSymbolData, 
i.e. variables.

The Section Contribution table from PDBs has information about
each compiland's contribution to sections by address. For example,
a piece of a contribution looks like,

  VA         RelativeVA  Sect No.  Offset    Length    Compiland
  14000087B0 000087B0    0001      000077B0  000000BB  exe_main.obj

So given an address, it's possible to determine its compiland with
this information.

llvm-svn: 328178
2018-03-22 04:08:15 +00:00
Aaron Smith 58a32a478f [PDB] Get more DIA table enumerators
Rename the original function and make it a static template.

llvm-svn: 328177
2018-03-22 03:57:06 +00:00
Zachary Turner eb62999455 [PDB] Don't ignore bucket 0 when writing the PDB string table.
The hash table is a list of buckets, and the *value* stored in
the bucket cannot be 0 since that is reserved.  However, the code
here was incorrectly skipping over the 0'th bucket entirely.
The 0'th bucket is perfectly fine, just none of these buckets
can contain the value 0.

As a result, whenever there was a string where hash(S) % Size
was equal to 0, we would write the value in the next bucket
instead.  We never caught this in our tests due to *another*
bug, which is that we would iterate the entire list of buckets
looking for the value, only using the hash value as a starting
point.  However, the real algorithm stops when it finds 0 in
a bucket since it takes that to mean "the item is not in the
hash table".

The unit test is updated to carefully construct a set of hash
values that will cause one item to hash to 0 mod bucket count,
and the reader is also updated to return an error indicating that
the item is not found when it encounters a 0 bucket.

llvm-svn: 328162
2018-03-21 22:23:59 +00:00
Reid Kleckner 8562c1a198 [PDB] Remove unused private variable, re-applying r327900 after relanding more natvis changes[4~
llvm-svn: 328156
2018-03-21 21:47:26 +00:00
Rafael Espindola c51dc906ea Handle abbr_offset with relocations.
This is mostly just plumbing to get a DWARFDataExtractor where we
compute abbr_offset so we can use getRelocatedValue.

This is part of PR36793.

llvm-svn: 328154
2018-03-21 21:31:25 +00:00
Pavel Labath 9025f9559d [dwarf] Unify unknown dwarf enum formatting code
Summary:
We have had at least three pieces of code (in DWARFAbbreviationDeclaration,
DWARFAcceleratorTable and DWARFDie) that have hand-rolled support for
dumping unknown dwarf enum values. While not terrible, they are a bit
distracting and enable small differences to creep in (Unknown_ffff vs.
Unknown_0xffff). I ended up needing to add a fourth place
(DWARFVerifier), so it seems it would be a good time to centralize.

This patch creates an alternative to the XXXString dumping functions in
the BinaryFormat library, which formats an unknown value as
DW_TYPE_unknown_1234, instead of just an empty string. It is based on
the formatv function, as that allows us to avoid materializing the
string for unknown values (and because this way I don't have to invent a
name for the new functions :P).

In this patch I add formatters for dwarf attributes, forms, tags, and
index attributes as these are the ones in use currently, but adding
other enums is straight-forward.

Reviewers: dblaikie, JDevlieghere, aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44570

llvm-svn: 328090
2018-03-21 11:46:37 +00:00
Zachary Turner fced530650 Revert "Resubmit "Support embedding natvis files in PDBs.""
This is still failing on a different bot this time due to some
issue related to hashing absolute paths.  Reverting until I can
figure it out.

llvm-svn: 328014
2018-03-20 18:37:03 +00:00
Zachary Turner 132d7a134f Resubmit "Support embedding natvis files in PDBs."
The issue causing this to fail in certain configurations
should be fixed.

It was due to the fact that DIA apparently expects there to be
a null string at ID 1 in the string table.  I'm not sure why this
is important but it seems to make a difference, so set it.

llvm-svn: 328002
2018-03-20 17:06:39 +00:00
Aaron Smith da61120749 [PDB] Add a method to get the full path of the source file for PDBSymbolCompiland
Summary:
Redefine PDBSymbolCompiland::getSourceFileName() to return the filename (w/o directory) of the source file that is used to compile the compiland. This is because the result returned previously is ambiguous. It could be the filename, relative path or full path of the source file. 

Move the implementation of SymbolFilePDB::GetSourceFileNameForPDBCompiland() into a new method PDBSymbolCompiland::getSourceFileFullPath(). 

Reviewers: zturner, rnk, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D44458

llvm-svn: 327910
2018-03-19 21:20:04 +00:00
Aaron Smith 06173e8b46 [PDB] Add exclusive methods to derived symbol class
Summary: This commit adds two methods to the PDBSymboFunc class used in parsing symbols. getLineNumbers() is used to determine a Function symbol's declaration and getCompilandId() is used to initialize the SymbolContext field sc.comp_unit.

Reviewers: zturner, rnk, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D44457

llvm-svn: 327909
2018-03-19 21:18:39 +00:00
Zachary Turner a21558897b Revert "Support embedding natvis files in PDBs."
This is causing a test failure on a certain bot, so I'm removing
this temporarily until we can figure out the source of the error.

llvm-svn: 327903
2018-03-19 20:41:59 +00:00
Zachary Turner 426885b10c Remove an unused private variable.
llvm-svn: 327900
2018-03-19 20:22:48 +00:00
Zachary Turner de53aaf132 Support embedding natvis files in PDBs.
Natvis is a debug language supported by Visual Studio for
specifying custom visualizers.  The /NATVIS option is an
undocumented link.exe flag which will take a .natvis file
and "inject" it into the PDB.  This way, you can ship the
debug visualizers for a program along with the PDB, which
is very useful for postmortem debugging.

This is implemented by adding a new "named stream" to the
PDB with a special name of /src/files/<natvis file name>
and simply copying the contents of the xml into this file.

Additionally, we need to emit a single stream named
/src/headerblock which contains a hash table of embedded
files to records describing them.

This patch adds this functionality, including the /NATVIS
option to lld-link.

Differential Revision: https://reviews.llvm.org/D44328

llvm-svn: 327895
2018-03-19 19:53:51 +00:00
Pavel Labath 906b777a6a DWARFVerifier: Enhance validation of .debug_names hash tables
Summary:
This patch adds more checks to the .debug_names validator. Specifically,
they check for:
- buckets claiming to be non-empty but pointing to mismatched hashes
  (most consumers would interpret this as an empty bucket, but it
  questionable whether the generator meant that)
- hashes that are not reachable from any bucket
- names with incorrect hashes

Together, these checks ensure that any name in the index can be reached
through the hash table using the regular lookup algorithm. We also warn
if we encounter a name index without a hash table.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44433

llvm-svn: 327699
2018-03-16 10:02:16 +00:00
Zachary Turner edbcbe0b62 [PDB] Fix a bug where we were serializing hash tables incorrectly.
There was some code that tried to calculate the number of 4-byte
words required to hold N bits, but it was instead computing the
number of bytes required to hold N bits.  This was leading to
extraneous data being output into the hash table, which would
cause certain operations in DIA (the Microsoft PDB reader) to
fail.

llvm-svn: 327675
2018-03-15 22:31:00 +00:00
Zachary Turner ebf03f6c46 Refactor the PDB HashTable class.
It previously only worked when the key and value types were
both 4 byte integers.  We now have a use case for a non trivial
value type, so we need to extend it to support arbitrary value
types, which means templatizing it.

llvm-svn: 327647
2018-03-15 17:38:26 +00:00
Aaron Smith 40198f5905 [DebugInfo] Add a new method IPDBSession::findLineNumbersBySectOffset
Summary:
Some PDB symbols do not have a valid VA or RVA but have Addr by Section and Offset. For example, a variable in thread-local storage has the following properties:

     get_addressOffset: 0
     get_addressSection: 5
     get_lexicalParentId: 2
     get_name: g_tls
     get_symIndexId: 12
     get_typeId: 4
     get_dataKind: 6
     get_symTag: 7
     get_locationType: 2

This change provides a new method to locate line numbers by Section and Offset from those symbols.

Reviewers: zturner, rnk, llvm-commits

Subscribers: asmith, JDevlieghere

Differential Revision: https://reviews.llvm.org/D44407

llvm-svn: 327601
2018-03-15 06:04:51 +00:00
Pavel Labath 322711f529 DWARF: Unify form size handling code
Summary:
This patch replaces the two switches which are deducing the size of
various forms with a single implementation. I have put the new
implementation into BinaryFormat, to avoid introducing dependencies
between the two independent libraries (DebugInfo and CodeGen) that need
this functionality.

Reviewers: aprantl, JDevlieghere, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44418

llvm-svn: 327486
2018-03-14 09:39:54 +00:00
Eugene Zemtsov 82d60d6b29 Handle mixed-OS paths in DWARF reader
Make sure that DWARF line information generated by Windows can be properly read by Posix OS and vice versa.

Differential Revision: https://reviews.llvm.org/D44290

llvm-svn: 327430
2018-03-13 17:54:29 +00:00
Zachary Turner 679aeadda1 [PDB] Support dumping injected sources via the DIA reader.
Injected sources are basically a way to add actual source file content
to your PDB. Presumably you could use this for shipping your source code
with your debug information, but in practice I can only find this being
used for embedding natvis files inside of PDBs.

In order to effectively test LLVM's natvis file injection, we need a way
to dump the injected sources of a PDB in a way that is authoritative
(i.e. based on Microsoft's understanding of the PDB format, and not
LLVM's). To this end, I've added support for dumping injected sources
via DIA. I made a PDB file that used the /natvis option to generate a
test case.

Differential Revision: https://reviews.llvm.org/D44405

llvm-svn: 327428
2018-03-13 17:46:06 +00:00
Jonas Devlieghere b4c85cf4a4 [DebugInfo] Replace unreachable with None
Invalid user input should not trigger assertions and unreachables. We
already return an Option so we should just return None here.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5532

llvm-svn: 327274
2018-03-12 14:45:08 +00:00
Pavel Labath 47c3472c41 [DebugInfo/AccelTable] Fix inconsistency in getDIEOffset implementations
Summary:
Even though the getDIEOffset offset function was common for the two
accelerator table implementations, it was doing two different things:
for the Apple tables, it was returning the die offset relative to the
start of the section, whereas for DWARF v5 tables, it was relative to
the start of the CU.

I resolve this by renaming the function to getDIESectionOffset to make
it obvious what the function returns, and change the DWARF
implementation to return the section offset. I also keep the CU-relative
accessor, but only in the DWARF implementation (there is no way to get
this information for the Apple tables). This was not caught by existing
tests because the hand-written inputs also erroneously used section
offsets instead of CU-relative ones.

While looking at this, I noticed that the Apple implementation was not
fully correct either -- the header contains a DIEOffsetBase field, which
should be added to offsets encoded with the DW_FORM_ref*** family, but
this was not being used. This went unnoticed because all current writers
set this field to zero anyway. I fix this as well and add a hand-written
test which demonstrates the issue.

Reviewers: JDevlieghere, dblaikie

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D44202

llvm-svn: 327116
2018-03-09 11:58:59 +00:00
Jonas Devlieghere 6921753994 [Support] Move syntax highlighting into support
Move the DWARF syntax highlighting into support. This has several
advantages, most notably that this makes the WithColor RAII wrapper
available outside libDebugInfo. Furthermore, several projects all have
their own code for handling colored output. This provides a place to
centralize it.

Differential revision: https://reviews.llvm.org/D44215

llvm-svn: 327108
2018-03-09 09:56:24 +00:00
Benjamin Kramer 70e6faaa0d [DebugInfo] Move RangeListEntries instead of copying.
This is needed for correctness as RangeListEntry is not copy-assignable,
which std::vector might rely on.

llvm-svn: 327067
2018-03-08 21:31:10 +00:00
Zachary Turner 145bc6e0d3 Fix compilation failure with MSVC.
llvm-svn: 327063
2018-03-08 21:07:30 +00:00
Wolfgang Pieb a0729d4126 [DWARF v5] Support for verbose dumping of .debug_rnglist entries
Adding verbose dumping to the recent implementation of dumping of v5 range list entries. 
We're capturing the entries as is as they come in during extraction, including their file offset,
so we can dump them in more detail.
The offset table entries which are table-relative are shown as is (as in non-verbose mode)
and with the actual file offset they map to.

Reviewers: dblaikie, aprantl, jdevlieghere, jhenderson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43366

llvm-svn: 327059
2018-03-08 20:52:35 +00:00
Pavel Labath b136c3934e DWARFVerifier: Basic verification of .debug_names
Summary:
This patch adds basic .debug_names verification capabilities to the
DWARF verifier. Right now, it checks that the headers and abbreviation
tables of the individual name indexes can be parsed correctly, it
verifies the buckets table and the cross-checks the CU lists for
consistency. I intend to add further checks in follow-up patches.

Reviewers: JDevlieghere, aprantl, probinson, dblaikie

Subscribers: vleschuk, echristo, clayborg, llvm-commits

Differential Revision: https://reviews.llvm.org/D44211

llvm-svn: 327011
2018-03-08 15:34:42 +00:00
James Henderson 667026297d [DWARF] Don't attempt to parse line tables at invalid offsets
Whilst working on improvements to the error handling of the debug line
parsing code, I noticed that if an invalid offset were to be specified
in a call to getOrParseLineTable(), an entry in the LineTableMap would
still be created, even if the offset was not within the section range.
The immediate parsing attempt afterwards would fail (it would end up
getting a version of 0), and thereafter, any subsequent calls to
getOrParseLineTable or getLineTable would return the default-
constructed, invalid line table. In reality, we shouldn't even attempt
to parse this table, and we should always return a nullptr from these
two functions for this situation.

I have tested this via a unit test, which required some new framework
for unit testing debug line. My plan is to add quite a few more unit
tests for the new error reporting mechanism that will follow shortly,
hence the reason why the supporting code for the tests are written the
way they are - I intend to extend the DwarfGenerator class to support
generating debug line. At that point, I'll make sure that there are a
few positive test cases for this and the parsing code too.

Differential Revision: https://reviews.llvm.org/D44200

Reviewers: JDevlieghere, aprantl
llvm-svn: 326995
2018-03-08 10:53:34 +00:00
Rafael Auler 86fb7bf2bc Reland "[DebugInfo] Support DWARF expressions in eh_frame"
Summary:
Original change was D43313 (r326932) and reverted by r326953 because it
broke an LLD test and a windows build. The LLD test was already fixed in
lld commit r326944 (thanks maskray). This is the original change with
the windows build fixed.

llvm-svn: 326970
2018-03-08 00:46:53 +00:00
Eugene Zemtsov c4a13015fd Fix build broken by r326959
Adding Demangle to link time dependencies of Symbolize

llvm-svn: 326964
2018-03-08 00:07:26 +00:00
Eugene Zemtsov cd72cbc667 Use itaniumDemangle in llvm-symbolizer
Currently on Windows (_MSC_VER) LLVMSymbolizer supports only Microsoft mangling.
This fix just explicitly uses itaniumDemangle when mangled name starts with _Z.

Differential Revision: https://reviews.llvm.org/D44192

llvm-svn: 326959
2018-03-07 23:07:34 +00:00
Rui Ueyama 6aa8b3491f Revert r326932: [DebugInfo] Support DWARF expressions in eh_frame
This reverts commit rr326932 because it broke lld/test/ELF/eh-frame-hdr-augmentation.s.

llvm-svn: 326953
2018-03-07 22:29:48 +00:00
Rafael Auler 7fdf44440c [DebugInfo] Support DWARF expressions in eh_frame
This patch enhances DWARFDebugFrame with the capability of parsing and
printing DWARF expressions in CFI instructions. It also makes FDEs and
CIEs accessible to lib users, so they can process them in client tools
that rely on LLVM. To make it self-contained with a test case, it
teaches llvm-readobj to be able to dump EH frames and checks they are
correct in a unit test. The llvm-readobj code is Maksim Panchenko's work
(maksfb).

Reviewers: JDevlieghere, espindola

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D43313

llvm-svn: 326932
2018-03-07 19:19:51 +00:00
Jonas Devlieghere bf8596f9cf [dwarfdump] Only print CU relative offset in verbose mode
Instead of only printing the CU-relative offset in non-verbose mode, it
makes more sense to only printed the resolved address. In verbose mode
we still print both.

Differential revision: https://reviews.llvm.org/D44148

rdar://33525475

llvm-svn: 326903
2018-03-07 16:28:53 +00:00
Aaron Smith 25409ddf2a [DebugInfoPDB] Add DIA implementation for getSrcLineOnTypeDefn
Summary: This helps to determine the line number for a PDB type with definition

Reviewers: zturner, llvm-commits, rnk

Reviewed By: zturner

Subscribers: rengolin, JDevlieghere

Differential Revision: https://reviews.llvm.org/D44119

llvm-svn: 326857
2018-03-07 00:33:09 +00:00
Francis Ricci fe6cbceec2 [llvm-symbolizer] Use correct path when resolving .gnu_debuglink in .debug
Summary:
The symbolizer was checking for .debug as a subdirectory of the
binary file itself, not of the directory containing the binary. This led to
a failure to find split debug info when it was contained in a .debug directory.

Reviewers: rnk, glider, zturner

Subscribers: llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D44025

llvm-svn: 326630
2018-03-02 22:56:45 +00:00
Zachary Turner c6a75a69f1 [PDB] Defer writing the build id until the rest of the PDB is written.
For now this is NFC, but this small refactor opens the door to
letting us embed a hash of the PDB in the build id field of the
PDB.

Differential Revision: https://reviews.llvm.org/D43913

llvm-svn: 326453
2018-03-01 18:00:29 +00:00
Reid Kleckner 3acdc67734 [CodeView] Lower __restrict and other pointer qualifiers correctly
Qualifiers on a pointer or reference type may apply to either the
pointee or the pointer itself. Consider 'const char *' and 'char *
const'. In the first example, the pointee data may not be modified
without casts, and in the second example, the pointer may not be updated
to point to new data.

In the general case, qualifiers are applied to types with LF_MODIFIER
records, which support the usual const and volatile qualifiers as well
as the __unaligned extension qualifier.

However, LF_POINTER records, which are used for pointers, references,
and member pointers, have flags for qualifiers applying to the
*pointer*. In fact, this is the only way to represent the restrict
qualifier, which can only apply to pointers, and cannot qualify regular
data types.

This patch causes LLVM to correctly fold 'const' and 'volatile' pointer
qualifiers into the pointer record, as well as adding support for
'__restrict' qualifiers in the same place.

Based on a patch from Aaron Smith

Differential Revision: https://reviews.llvm.org/D43060

llvm-svn: 326260
2018-02-27 22:08:15 +00:00
Reid Kleckner 22d838cd31 [codeview] Remove unused variable
llvm-svn: 326253
2018-02-27 21:46:40 +00:00
Pavel Labath d99072bc97 Implement equal_range for the DWARF v5 accelerator table
Summary:
This patch implements the name lookup functionality of the .debug_names
accelerator table and hooks it up to "llvm-dwarfdump -find". To make the
interface of the two kinds of accelerator tables more consistent, I've
created an abstract "DWARFAcceleratorTable::Entry" class, which provides
a consistent interface to access the common functionality of the table
entries (such as getting the die offset, die tag, etc.). I've also
modified the apple table to vend entries conforming to this interface.

Reviewers: JDevlieghere, aprantl, probinson, dblaikie

Subscribers: vleschuk, clayborg, echristo, llvm-commits

Differential Revision: https://reviews.llvm.org/D43067

llvm-svn: 326003
2018-02-24 00:35:21 +00:00
Scott Linder 16c7bdaf32 [DebugInfo] Support DWARF v5 source code embedding extension
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. In this extension a content type is added,
DW_LNCT_LLVM_source, which contains the embedded source code of the file.

Add new optional attribute for !DIFile IR metadata called source which contains
source text. Use this to output the source to the DWARF line table of code
objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM
to support optional source.

Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output
format of llvm-dwarfdump to make room for the new attribute on file_names
entries, and support embedded sources for the -source option in llvm-objdump.

Differential Revision: https://reviews.llvm.org/D42765

llvm-svn: 325970
2018-02-23 23:01:06 +00:00
Aaron Smith 89a19ac38d [PDB] Check the result of setLoadAddress()
Summary: Change setLoadAddress() to return true or false on failure.

Reviewers: zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D43638

llvm-svn: 325843
2018-02-23 00:02:27 +00:00
Aaron Smith 9161a6cb25 [PDB] Fix buildbot failure from missing include for DIAEnumLineNumbers
llvm-svn: 325826
2018-02-22 20:00:07 +00:00
Aaron Smith fbe65404fd [PDB] Implement more find methods for PDB symbols
Summary:
Add additional find methods on PDB raw symbols.

findChildrenByAddr()
findChildrenByVA()
findInlineFramesByAddr()
findInlineFramesByVA()
findInlineLines()
findInlineLinesByAddr()
findInlineLinesByRVA()
findInlineLinesByVA()




Reviewers: zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D43637

llvm-svn: 325824
2018-02-22 19:47:43 +00:00
Jonas Devlieghere 7d4a974d8b [dwarfdump] Fix spurious verification errors for DW_AT_location attributes
Verifying any DWARF file that is optimized and contains at least one tag
with a DW_AT_location with a location list offset as a
DW_AT_form_dataXXX results in dwarfdump spuriously claiming that the
location list is invalid.

Differential revision: https://reviews.llvm.org/D40199

llvm-svn: 325430
2018-02-17 13:06:37 +00:00
Zachary Turner cafd476836 Fix emission of PDB string table.
This was originally reported as a bug with the symptom being "cvdump
crashes when printing an LLD-linked PDB that has an S_FILESTATIC record
in it". After some additional investigation, I determined that this was
a symptom of a larger problem, and in fact the real problem was in the
way we emitted the global PDB string table. As evidence of this, you can
take any lld-generated PDB, run cvdump -stringtable on it, and it would
return no results.

My hypothesis was that cvdump could not *find* the string table to begin
with. Normally it would do this by looking in the "named stream map",
finding the string /names, and using its value as the stream index. If
this lookup fails, then cvdump would fail to load the string table.

To test this hypothesis, I looked at the name stream map generated by a
link.exe PDB, and I emitted exactly those bytes into an LLD-generated
PDB. Suddenly, cvdump could read our string table!

This code has always been hacky and we knew there was something we
didn't understand. After all, there were some comments to the effect of
"we have to emit strings in a specific order, otherwise things don't
work". The key to fixing this was finally understanding this.

The way it works is that it makes use of a generic serializable hash map
that maps integers to other integers. In this case, the "key" is the
offset into a buffer, and the value is the stream number. If you index
into the buffer at the offset specified by a given key, you find the
name. The underlying cause of all these problems is that we were using
the identity function for the hash. i.e. if a string's offset in the
buffer was 12, the hash value was 12. Instead, we need to hash the
string *at that offset*. There is an additional catch, in that we have
to compute the hash as a uint32 and then truncate it to uint16.

Making this work is a little bit annoying, because we use the same hash
table in other places as well, and normally just using the identity
function for the hash function is actually what's desired. I'm not
totally happy with the template goo I came up with, but it works in any
case.

The reason we never found this bug through our own testing is because we
were building a /parallel/ hash table (in the form of an
llvm::StringMap<>) and doing all of our lookups and "real" hash table
work against that. I deleted all of that code and now everything goes
through the real hash table. Then, to test it, I added a unit test which
adds 7 strings and queries the associated values. I test every possible
insertion order permutation of these 7 strings, to verify that it really
does work as expected.

Differential Revision: https://reviews.llvm.org/D43326

llvm-svn: 325386
2018-02-16 20:46:04 +00:00
David Blaikie 3b6de6fe1c Revert "Rewrite the cached map used for locating the most precise DIE among inlined subroutines for a given address."
Seeing some inlining missing in internal uses of symbolizer. I'll work
on a reproduction, tests, improvements & recommit as soon as possible.

(Chandler would like it to be known that this improvement did make
check-llvm 4x faster... - so there's certainly some fairly good
motivation to push on fixing/figuring this out & getting it back in)

This reverts commit r321345.

llvm-svn: 324981
2018-02-13 01:52:30 +00:00
Adrian Prantl a5eee4de07 Simplify switch statement (NFC)
llvm-svn: 324945
2018-02-12 22:09:57 +00:00
Adrian Prantl 520789e139 Fix the syntax highlighting of strings in dwarfdump.
llvm-svn: 324936
2018-02-12 21:11:23 +00:00
Adrian Prantl baf9c20b02 Factor out common condition into an easier to understand helper function (NFC).
llvm-svn: 324935
2018-02-12 21:11:14 +00:00
Paul Robinson ceafcd41cf [DWARFv5] Fix dumper to show the file table starts at index 0.
Emitting the correct (root of compilation) file at index 0 will be
posted for review later; I wanted to get this minor change out of the
way first.

llvm-svn: 324669
2018-02-08 23:08:02 +00:00
Paul Robinson 0a22709f06 [DWARF] Regularize dumping strings from line tables.
The major visible difference here is that in line-table dumps,
directory and file names are wrapped in double-quotes; previously,
directory names got single quotes and file names were not quoted at
all.

The improvement in this patch is that when a DWARF v5 line table
header has indirect strings, in a verbose dump these will all have
their section[offset] printed as well as the name itself.  This
matches the format used for dumping strings in the .debug_info
section.

Differential Revision: https://reviews.llvm.org/D42802

llvm-svn: 324270
2018-02-05 20:43:15 +00:00
James Henderson 10392cdbf7 Fix more print format specifiers in debug_rnglists dumping
See also r324096.

I have made the assumption that DWARF64 is not an issue for the time
being with these fixes.

llvm-svn: 324223
2018-02-05 10:47:13 +00:00
Simon Pilgrim 3d5ee7aa41 Fix MSVC signed/unsigned comparison warning. NFCI.
llvm-svn: 324171
2018-02-03 12:38:56 +00:00
James Henderson 2465633234 Fix type sizes that were causing incorrect string formatting
llvm-svn: 324096
2018-02-02 15:09:31 +00:00
James Henderson c2dfd502a2 Add missing new files from r324077
Differential Revision: https://reviews.llvm.org/D42481

llvm-svn: 324078
2018-02-02 12:45:57 +00:00
James Henderson 3fcc74500a [DWARF v5] Add limited support for dumping .debug_rnglists
This change adds support to llvm-dwarfdump for dumping DWARF5
.debug_rnglists sections in regular ELF files.

It is not complete, in that several DW_RLE_* encodings are currently
not supported, but does dump the headert and the basic ranges for
DW_RLE_start_length and DW_RLE_start_end encodings.

Obvious next steps are to add verbose dumping that dumps the raw
encodings, rather than the interpreted contents, to add -verify support
of the section (e.g. to show that the correct number of offsets are
specified), add dumping of .debug_rnglists.dwo, and to add support for
other encodings.

Reviewed by: dblaikie, JDevlieghere

Differential Revision: https://reviews.llvm.org/D42481

llvm-svn: 324077
2018-02-02 12:35:52 +00:00
Zachary Turner 07d803777c [CodeView] Micro-optimizations to speed up type merging.
Based on a profile, a couple of hot spots were identified in the
main type merging loop.  The code was simplified, a few loops
were re-arranged, and some outlined functions were inlined.  This
speeds up type merging by a decent amount, shaving around 3-4 seconds
off of a 40 second link in my test case.

Differential Revision: https://reviews.llvm.org/D42559

llvm-svn: 323790
2018-01-30 17:12:04 +00:00
Paul Robinson d0c89f851b Stop tracking .debug_line_str in DWARFUnit. NFC.
llvm-svn: 323701
2018-01-29 22:02:56 +00:00
Paul Robinson bf750c80e9 [DWARFv5] Re-enable dumping a line table with no CU.
r323476 added support for DW_FORM_line_strp, and incorrectly made that
depend on having a DWARFUnit available.  We shouldn't be tracking
.debug_line_str in DWARFUnit after all.  After this patch, I can do an
NFC follow up and undo a bunch of the "plumbing" part of r323476.

Differential Revision: https://reviews.llvm.org/D42609

llvm-svn: 323691
2018-01-29 20:57:43 +00:00
Pavel Labath e7264106d4 Fix windows test failure caused by r323638
The test was failing because of an incorrect sizeof check in the name
index parsing code. This code was meant to check that we have enough
input to parse the fixed-size part of the dwarf header, which it did by
comparing the input to sizeof(Header). Originally struct Header only
contained the fixed-size part, but during review, we've moved additional
members into it, which rendered the sizeof check invalid.

I resolve this by moving the fixed-size part to a separate struct and
updating the sizeof-expression to use that.

llvm-svn: 323648
2018-01-29 13:53:48 +00:00
Pavel Labath 3460957ea3 Fix build broken by r323641
The call to ScopedPrinter::printNumber with size_t argument was
ambiguous (I think) on 32-bit builds. Explicitly cast to a 64-bit int to
avoid this.

llvm-svn: 323642
2018-01-29 11:53:46 +00:00
Pavel Labath 394e805668 Refactor dwarfdump -apple-names output
Summary:
This modifies the dwarfdump output to align it with the new .debug_names
dump. It also renames two header fields to match similar fields in the
dwarf5 header.

A couple of tests needed to be updated to match new output. The changes
were fairly straight-forward, although not really automatable.

Reviewers: JDevlieghere, aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42415

llvm-svn: 323641
2018-01-29 11:33:17 +00:00
Pavel Labath 3c9a918c9e [DebugInfo] Basic .debug_names dumping support
Summary:
This commit renames DWARFAcceleratorTable to AppleAcceleratorTable to free up
the first name as an interface for the different accelerator tables.
Then I add a DWARFDebugNames class for the dwarf5 table.

Presently, the only common functionality of the two classes is the dump()
method, because this is the only method that was necessary to implement
dwarfdump -debug-names; and because the rest of the
AppleAcceleratorTable interface does not directly transfer to the dwarf5
tables (the main reason for that is that the present interface assumes
the tables are homogeneous, but the dwarf5 tables can have different
keys associated with each entry).

I expect to make the common interface richer as I add more functionality
to the new class (and invent a way to represent it in generic way).

In terms of sharing the implementation, I found the format of the two
tables sufficiently different to frustrate any attempts to have common
parsing or dumping code, so presently the implementations share just low
level code for formatting dwarf constants.

Reviewers: vleschuk, JDevlieghere, clayborg, aprantl, probinson, echristo, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42297

llvm-svn: 323638
2018-01-29 11:08:32 +00:00
Jonas Devlieghere 92ac9d3e1b [Support] Move DJB hash to support. NFC
This patch moves the DJB hash to support. This is consistent with other
hashing algorithms living there. The hash is used by the DWARF
accelerator tables. We're doing this now because the hashing function is
needed by dsymutil and we don't want to link against libBinaryFormat.

Differential revision: https://reviews.llvm.org/D42594

llvm-svn: 323616
2018-01-28 11:05:10 +00:00
Paul Robinson 0115844c2f [DWARFv5] Classify all the new forms. NFC.
Move standard forms from a switch statement to the table of forms;
fill in all the missing ones defined in DWARF v5.  I'm guessing at
classifications in a couple of cases where v5 forms aren't actually
supported yet, but whoever adds support for the forms can fix the
classifications as needed.

llvm-svn: 323481
2018-01-25 23:06:36 +00:00
Paul Robinson b6aa01ca99 [DWARFv5] Support DW_FORM_line_strp in llvm-dwarfdump.
This form is like DW_FORM_strp, but points to .debug_line_str instead
of .debug_str as the string section.  It's intended to be used from
the line-table header, and allows string-pooling of directory and
filenames across compilation units.

Differential Revision: https://reviews.llvm.org/D42553

llvm-svn: 323476
2018-01-25 22:02:36 +00:00
Pavel Labath 9b36fd2541 Rename DwarfAcceleratorTable to AppleAcceleratorTable. NFC
This frees up the first name to be used as an base class for the
apple table and the dwarf5 .debug_names accel table. The rename  was
split off from D42297 (adding of debug_names support), which is still
under review.

llvm-svn: 323113
2018-01-22 13:17:23 +00:00
Paul Robinson 8181d23b3d [DWARFv5] Number the line-table's directory array correctly.
The compilation directory has always been #0, but as of DWARF v5 it is
explicitly listed in the line-table section instead of implicitly
being a reference to the compile_unit DIE's DW_AT_comp_dir attribute.
This means the dumper should number the dumped array starting with 0
or 1 depending on the DWARF version of the line table.

References in the generated DWARF are correct, it's just the dumper
that was wrong.  Also some assembler-coded tests were similarly
confused about directory numbers.

llvm-svn: 322884
2018-01-18 20:33:35 +00:00
Zachary Turner 1bc2ce6b9b Speed up iteration of CodeView record streams.
There's some abstraction overhead in the underlying
mechanisms that were being used, and it was leading to an
abundance of small but not-free copies being made.  This
showed up on a profile.  Eliminating this and going back to
a low-level byte-based implementation speeds up lld with
/DEBUG between 10 and 15%.

Differential Revision: https://reviews.llvm.org/D42148

llvm-svn: 322871
2018-01-18 18:35:01 +00:00
Aaron Smith 53a1a1616c Fix pretty printing the unspecified param of a variadic function
Summary:
 - Fix a bug in PrettyBuiltinDumper that returns "void" as the name for
  an unspecified builtin type. Since the unspecified param of a variadic
  function is considered a builtin of unspecified type in PDBs, we set
  "..." for its name.

  - Provide a method to determine if a PDBSymbolFunc is variadic in
  PrettyFunctionDumper since PDBSymbolFunc::getArgument() doesn't return the
  last unspecified-type param.

  - Add a pretty-func-dumper.test to test pretty dumping of variadic
  functions.

Reviewers: zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D41801

llvm-svn: 322608
2018-01-17 01:22:03 +00:00
Jonas Devlieghere 6f24c8778c [DebugInfo] Unify dumping of address ranges
Summary:
This patch unifies the printing of address ranges as [0x0, 0x1).

rdar://34822059

Reviewers: aprantl, dblaikie

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D42056

llvm-svn: 322543
2018-01-16 11:17:57 +00:00
Adrian Prantl 146ed408f4 dwarfdump: Match the --uuid output with that of Darwin dwarfdump.
This option is widely used by scripts and there is no reason to break them.

rdar://problem/36032398

llvm-svn: 321901
2018-01-05 21:44:17 +00:00
Zachary Turner de6a487d70 [MSF] Fix FPM interval calcluation
We have some code to try to determine how many pieces an MSF
Free Page Map is split into, and this code had an off by one
error which would cause the calculation to be incorrect when
there were exactly 4096*k + 1 blocks in an MSF file.

Original investigation and patch outline by Colden Cullen.

Differential Revision: https://reviews.llvm.org/D41742

llvm-svn: 321880
2018-01-05 18:12:14 +00:00
Jonas Devlieghere cbf651f739 [DebugInfo] Don't crash when given invalid DWARFv5 line table prologue.
This patch replaces an assertion with an explicit check for the validity
of the FORM parameters. The assertion was triggered when the DWARFv5
line table contained a zero address size.

This fixes OSS-Fuzz Issue 4644
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=4644

Differential revision: https://reviews.llvm.org/D41615

llvm-svn: 321863
2018-01-05 10:03:02 +00:00
Chandler Carruth 54a5ad3681 Rewrite the cached map used for locating the most precise DIE among
inlined subroutines for a given address.

This is essentially the hot path of llvm-symbolizer when extracting
inlined frames during symbolization. Previously, we would read every
subprogram and every inlined subroutine, building a std::map across the
entire PC space to the best DIE, and then do only a handful of queries
as we symbolized a backtrace. A huge fraction of the time was spent
building the map itself.

This patch changes it two a two-level system. First, we just build a map
from PC-interval to DWARF subprograms. These are required to be disjoint
and so constructing this is pretty easy. Second, we build a map *just*
for the inlined subroutines within the subprogram containing the query
address. This allows us to look at far fewer DIEs and build a *much*
smaller set of cached maps in the llvm-symbolizer case where only a few
address get symbolized during the entire run.

It also builds both interval maps in a very different way. It constructs
a single flat vector of pairs that maps from offset -> index. The
indices point into collections of DIE objects, but can also be
"tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
just simplifies the data structure a bit. For inlined subroutines,
because we carefully split them as we build the map, we end up in many
cases having no holes and not having to store both start and stop
offsets.

Finally, the PC ranges for the inlined subroutines are compressed into
32-bits by making them relative to the base PC of the outer subprogram.
This means that if you have a single function body with over 2gb of
executable code in it, we will stop mapping address past the first 2gb
of that function into inlined subroutines and just give you the
subprogram. This doesn't seem like a problem. ;]

All of this combines to make llvm-symbolizer *well* over 2x faster for
symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
tests are running >2x faster. I'm still going to look at completely
disabling symbolization there, but figured while I had a good benchmark
we should make symbolization a bit better.

Sadly, the logic to build the flat interval map for the inlined
subroutines is fairly complex. I'm not super happy about this and
welcome any simplifying suggestions.

Huge thanks to Dave Blaikie who helped walk me through what the various
things I needed to do in DWARF to make this work.

Differential Revision: https://reviews.llvm.org/D40987

llvm-svn: 321345
2017-12-22 06:41:23 +00:00
Wolfgang Pieb b4ba1aa486 [DWARF] Fix formatting bug with r321295. This fixes a MIPS buildbot failure.
llvm-svn: 321330
2017-12-22 01:12:24 +00:00
Wolfgang Pieb 6ecd6a8088 [DWARF v5] Rework of string offsets table reader
Reorganizes the DWARF consumer to derive the string offsets table 
contribution's format from the contribution header instead of 
(incorrectly) from the unit's format.

Reviewers: JDevliegehere, aprantl

Differential Revision: https://reviews.llvm.org/D41146
 

llvm-svn: 321295
2017-12-21 19:38:13 +00:00
Adrian Prantl 0e6694d111 Silence a bunch of implicit fallthrough warnings
llvm-svn: 321114
2017-12-19 22:05:25 +00:00
Paul Robinson a06f8dcca6 Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header."
Adds missing support for DW_FORM_data16.

Update of r320852/r320886, fixing the unittest again, this time use a
raw char string for the test data.

Differential Revision: https://reviews.llvm.org/D41090

llvm-svn: 321011
2017-12-18 19:08:35 +00:00
Paul Robinson 6d0484f2b6 Revert "Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header.""
This reverts commit 0afef672f63f0e4e91938656bc73424a8c058bfc.
Still failing at runtime on bots.

llvm-svn: 320888
2017-12-15 23:21:52 +00:00
Paul Robinson 5c8f7d7de4 Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header."
Adds missing support for DW_FORM_data16.

Update of r320852, fixing the unittest to use a hand-coded struct
instead of std::array to guarantee data layout.

Differential Revision: https://reviews.llvm.org/D41090

llvm-svn: 320886
2017-12-15 22:57:17 +00:00
Paul Robinson 67ca67d1b2 Revert "[DWARFv5] Dump an MD5 checksum in the line-table header."
Unit test fails on some bots.

llvm-svn: 320857
2017-12-15 20:29:25 +00:00
Paul Robinson 72546fe87b [DWARFv5] Dump an MD5 checksum in the line-table header.
Adds missing support for DW_FORM_data16.

Differential Revision: https://reviews.llvm.org/D41090

llvm-svn: 320852
2017-12-15 19:52:34 +00:00
Zachary Turner 0d07a8e948 [COFF] Teach LLD to use the COFF .debug$H section.
This adds the /DEBUG:GHASH option to LLD which will look for
the existence of .debug$H sections in linker inputs and use them
to accelerate type merging.  The clang-cl side has already been
added, so this completes the work necessary to begin experimenting
with this feature.

Differential Revision: https://reviews.llvm.org/D40980

llvm-svn: 320719
2017-12-14 18:07:04 +00:00
Zachary Turner 048f8f99bf [CodeView] Teach clang to emit the .debug$H COFF section.
Currently this is an LLVM extension to the COFF spec which is
experimental and intended to speed up linking.  For now it is
behind a hidden cl::opt flag, but in the future we can move it
to a "real" cc1 flag and have the driver pass it through whenever
it is appropriate.

The patch to actually make use of this section in lld will come
in a followup.

Differential Revision: https://reviews.llvm.org/D40917

llvm-svn: 320649
2017-12-13 22:33:58 +00:00
Michael Zolotukhin 0c169bf7f7 Remove redundant includes from lib/DebugInfo.
llvm-svn: 320620
2017-12-13 21:30:49 +00:00
Jonas Devlieghere ba915897da [dwarfdump] Fix off-by-one bug in accelerator table extractor.
This fixes a bug where the verifier was complaining about empty
accelerator tables. When the table is empty, its size is not a valid
offset as it points after the end of the section.

This patch also makes the extractor return llvm:Error instead of bool
for better error reporting in the verifier.

Differential revision: https://reviews.llvm.org/D41063

rdar://35932007

llvm-svn: 320399
2017-12-11 18:22:47 +00:00
Adrian Prantl 01fb31cc89 dwarfdump: Add support for the --diff option.
--diff      Emit the output in a diff-friendly way by omitting offsets and
            addresses.

<rdar://problem/34502625>

llvm-svn: 320214
2017-12-08 23:32:47 +00:00
Zachary Turner ecd2684ed7 [DebugInfo] Fix register variables not showing up in pdb.
Previously, when linking against libcmt from the MSVC runtime,
lld-link /verbose would show "Ignoring unknown symbol record
with kind 0x1006".  It turns out this was because
TypeIndexDiscovery did not handle S_REGISTER records, so these
records were not getting properly remapped.

Patch by: Alexnadre Ganea
Differential Revision: https://reviews.llvm.org/D40919

llvm-svn: 320108
2017-12-07 22:51:16 +00:00
Zachary Turner 376d437776 Teach llvm-pdbutil to dump types from object files.
llvm-svn: 319859
2017-12-05 23:58:18 +00:00
Zachary Turner 023f88ef42 Fix -Wmissing-braces error.
llvm-svn: 319855
2017-12-05 23:19:33 +00:00
Zachary Turner 87b78e9d1e [CodeView] Add support for content hashing CodeView type records.
Currently nothing uses this, but this at least gets the core
algorithm in, and adds some test to demonstrate correctness.

Differential Revision: https://reviews.llvm.org/D40736

llvm-svn: 319854
2017-12-05 23:08:58 +00:00
Paul Robinson ab69b477a9 [DebugInfo] Bail out if making no progress dumping line tables.
llvm-svn: 319564
2017-12-01 18:25:30 +00:00
Zachary Turner f0e4c6a819 Simplify the DenseSet used for hashing CodeView records.
This was storing the hash alongside the key so that the hash
doesn't need to be re-computed every time, but in doing so it
was allocating a structure to keep the key size small in the
DenseMap.  This is a noble goal, but it also leads to a pointer
indirection on every probe, and this cost of this pointer
indirection ends up being higher than the cost of having a
slightly larger entry in the hash table.  Removing this not only
simplifies the code, but yields a small but noticeable
performance improvement in the type merging algorithm.

llvm-svn: 319493
2017-11-30 23:00:30 +00:00
Zachary Turner ca6dbf1440 Split TypeTableBuilder into two classes.
llvm-svn: 319456
2017-11-30 18:39:50 +00:00
Zachary Turner 52d036e693 [CodeView] Factor some code out of TypeTableBuilder.
This class had some code that would automatically remap type
indices before hashing and serializing.  The only caller of
this method was the TypeStreamMerger anyway, and the method
doesn't make general sense, and prevents making certain future
improvements to the class.  So, factoring this up one level
into the TypeStreamMerger where it belongs.

llvm-svn: 319377
2017-11-29 22:41:56 +00:00
Zachary Turner 3e3936da93 Make TypeTableBuilder inherit from TypeCollection.
A couple of places in LLD were passing references to
TypeTableCollections around, which makes it hard to change the
implementation at runtime.  However, these cases only needed to
iterate over the types in the collection, and TypeCollection
already provides a handy abstract interface for this purpose.

By implementing this interface, we can get rid of the need to
pass TypeTableBuilder references around, which should allow us
to swap the implementation at runtime in subsequent patches.

llvm-svn: 319345
2017-11-29 19:35:21 +00:00
Adrian Prantl 5da51f435a llvm-dwarfdump: honor the --show-children option when dumping a specific DIE.
llvm-svn: 319271
2017-11-29 01:12:22 +00:00
Zachary Turner 4c1fa68590 Fix a warning.
llvm-svn: 319263
2017-11-29 00:13:44 +00:00
Zachary Turner 29b081dcd1 [NFC] Minor cleanups in CodeView TypeTableBuilder.
llvm-svn: 319260
2017-11-28 23:57:13 +00:00
Rafael Espindola bba7f862d8 Fix non assert build warnings.
llvm-svn: 319200
2017-11-28 18:50:08 +00:00
Zachary Turner 6900de1dfb [CodeView] Refactor / Rewrite TypeSerializer and TypeTableBuilder.
The motivation behind this patch is that future directions require us to
be able to compute the hash value of records independently of actually
using them for de-duplication.

The current structure of TypeSerializer / TypeTableBuilder being a
single entry point that takes an unserialized type record, and then
hashes and de-duplicates it is not flexible enough to allow this.

At the same time, the existing TypeSerializer is already extremely
complex for this very reason -- it tries to be too many things. In
addition to serializing, hashing, and de-duplicating, ti also supports
splitting up field list records and adding continuations. All of this
functionality crammed into this one class makes it very complicated to
work with and hard to maintain.

To solve all of these problems, I've re-written everything from scratch
and split the functionality into separate pieces that can easily be
reused. The end result is that one class TypeSerializer is turned into 3
new classes SimpleTypeSerializer, ContinuationRecordBuilder, and
TypeTableBuilder, each of which in isolation is simple and
straightforward.

A quick summary of these new classes and their responsibilities are:

- SimpleTypeSerializer : Turns a non-FieldList leaf type into a series of
  bytes. Does not do any hashing. Every time you call it, it will
  re-serialize and return bytes again. The same instance can be re-used
  over and over to avoid re-allocations, and in exchange for this
  optimization the bytes returned by the serializer only live until the
  caller attempts to serialize a new record.

- ContinuationRecordBuilder : Turns a FieldList-like record into a series
  of fragments. Does not do any hashing. Like SimpleTypeSerializer,
  returns references to privately owned bytes, so the storage is
  invalidated as soon as the caller tries to re-use the instance. Works
  equally well for LF_FIELDLIST as it does for LF_METHODLIST, solving a
  long-standing theoretical limitation of the previous implementation.

- TypeTableBuilder : Accepts sequences of bytes that the user has already
  serialized, and inserts them by de-duplicating with a hash table. For
  the sake of convenience and efficiency, this class internally stores a
  SimpleTypeSerializer so that it can accept unserialized records. The
  same is not true of ContinuationRecordBuilder. The user is required to
  create their own instance of ContinuationRecordBuilder.

Differential Revision: https://reviews.llvm.org/D40518

llvm-svn: 319198
2017-11-28 18:33:17 +00:00
Greg Clayton d6b67eb15c Fixed the ability to recursively get an attribute value from a DWARFDie.
The previous implementation would only look 1 DW_AT_specification or DW_AT_abstract_origin deep. This means DWARFDie::getName() would fail in certain cases. I ran into such a case while creating a tool that used the LLVM DWARF parser to generate a symbolication format so I have seen this in the wild.

Differential Revision: https://reviews.llvm.org/D40156

llvm-svn: 319104
2017-11-27 22:12:44 +00:00
Zachary Turner 96c6985b53 [BinaryStream] Support growable streams.
The existing library assumed that a stream's length would never
change.  This makes some things simpler, but it's not flexible
enough for what we need, especially for writable streams where
what you really want is for each call to write to actually append.

llvm-svn: 319070
2017-11-27 18:48:37 +00:00
Jonas Devlieghere 6a9c5929d4 [llvm-dwarfdump] Display DW_AT_high_pc as absolute value
DWARF4 relative DW_AT_high_pc values are now displayed as absolute
addresses. The relative value is only shown when explicitly dumping the
forms, i.e. in show-form or verbose mode.

```
DW_AT_low_pc	(0x0000000000000049)
DW_AT_high_pc	(0x00000019)
```

becomes

```
DW_AT_low_pc	(0x0000000000000049)
DW_AT_high_pc	(0x0000000000000062)
```

Differential revision: https://reviews.llvm.org/D40317

rdar://35416943

llvm-svn: 319044
2017-11-27 16:40:46 +00:00
Paul Robinson 6ca1dd6fa3 [DwarfDump] -debug-line=offset applies to .dwo too.
llvm-svn: 318856
2017-11-22 18:23:55 +00:00
Paul Robinson 511b54cadc [DebugInfo] Dump a .debug_line section, including line-number program,
without any compile units.

Differential Revision: https://reviews.llvm.org/D40114

llvm-svn: 318842
2017-11-22 15:48:30 +00:00
Paul Robinson 63811a472e [DWARFv5] Support DW_FORM_strp in the .debug_line.dwo header.
As a side effect, the .debug_line section will be dumped in physical
order, rather than in the order that compile units refer to their
associated portions of the .debug_line section.  These are probably
always the same order anyway, and no tests noticed the difference.

Differential Revision: https://reviews.llvm.org/D39854

llvm-svn: 318839
2017-11-22 15:33:17 +00:00
Paul Robinson e0833349b6 [DWARF] Fix handling of extended line-number opcodes
Differential Revision: https://reviews.llvm.org/D40200

llvm-svn: 318838
2017-11-22 15:14:49 +00:00
Zachary Turner bd159d32c4 Don't #include MemoryBuffer.h from Host.h.
It turns out this #include isn't used from Host.h anyway,
but by having it it causes circular include dependencies.
This issues only surfaced while I was working on a separate
patch, so I'm submitting this first so that it's independent
of the other, unrelated patch.

llvm-svn: 318489
2017-11-17 01:00:35 +00:00
Reid Kleckner b5d17d8d30 Fix my typo of PDB_TableType
llvm-svn: 318447
2017-11-16 19:41:12 +00:00
Reid Kleckner 4ca69bdac6 Fix -Wreturn-type falling off the end of a function in new DIA code
llvm-svn: 318444
2017-11-16 19:32:53 +00:00
Aaron Smith 89bca9e566 [DebugInfo/PDB] Adding getUndecoratedNameEx and IPDB interfaces for IDiaEnumTables and IDiaTable.
Initial changes to support debugging PE/COFF files with LLDB on Windows through DIA SDK.
There is another set of changes required on the LLDB side before this does anything.

Differential Revision: https://reviews.llvm.org/D39517

llvm-svn: 318403
2017-11-16 14:33:09 +00:00
Aaron Smith c6ef575909 Test commit. Add a missing dash to the standard llvm file header; NFC.
llvm-svn: 318400
2017-11-16 13:42:28 +00:00
Rafael Espindola e0df357dbd Convert FileOutputBuffer to Expected. NFC.
llvm-svn: 317649
2017-11-08 01:05:44 +00:00
Paul Robinson e5400f8a6e [DWARFv5] Support DW_FORM_strp in the .debug_line header.
Supporting this form in .debug_line.dwo will be done as a follow-up.

Differential Revision: https://reviews.llvm.org/D33155

llvm-svn: 317607
2017-11-07 19:57:12 +00:00
NAKAMURA Takumi 1657f2ad99 Fix warnings discovered by rL317076. [-Wunused-private-field]
llvm-svn: 317091
2017-11-01 13:47:55 +00:00
Benjamin Kramer 0fad6dd3c4 Revert "[DWARF] Now that Optional is standard layout, put it into an union instead of splatting it."
GCC doesn't like it. This reverts commit r317028.

llvm-svn: 317030
2017-10-31 19:55:08 +00:00
Benjamin Kramer 8732bbec1e [DWARF] Now that Optional is standard layout, put it into an union instead of splatting it.
No functionality change intended.

llvm-svn: 317028
2017-10-31 19:40:03 +00:00
George Rimar 3d07f6004e Fix BB after r316756 "[llvm-dwarfdump] - Teach verifier to report broken DWARF expressions."
Bot:
http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/6255

Changed format of this message by mistake.

llvm-svn: 316757
2017-10-27 10:58:04 +00:00
George Rimar 144e4c5a32 [llvm-dwarfdump] - Teach verifier to report broken DWARF expressions.
Patch improves next things:

* Fixes assert/crash in getOpDesc when giving it a invalid expression op code.
* DWARFExpression::print() called DWARFExpression::Operation::getEndOffset() which
  returned and used uninitialized field EndOffset. Patch fixes that.
* Teaches verifier to verify DW_AT_location and error out on broken expressions.

Differential revision: https://reviews.llvm.org/D39294

llvm-svn: 316756
2017-10-27 10:42:04 +00:00
Reid Kleckner 145090f124 [PDB] Handle an empty globals hash table with no buckets
llvm-svn: 316722
2017-10-27 00:45:51 +00:00
Jonas Devlieghere f63ee64c4b Re-land "[dwarfdump] Add -lookup option"
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.

Differential revision: https://reviews.llvm.org/D38409

llvm-svn: 316619
2017-10-25 21:56:41 +00:00
George Rimar 0be860f695 [llvm-dwarfdump] - Fix array out of bounds access crash.
This fixes possible out of bound access in
DWARFDie::getFirstChild()
which might happen when .debug_info section is corrupted,
like shown in testcase.

Differential revision: https://reviews.llvm.org/D39185

llvm-svn: 316566
2017-10-25 10:23:49 +00:00
Reid Kleckner 8aa32ffbad [codeview] Fix handling of S_HEAPALLOCSITE
The type index is from the TPI stream, not the IPI stream. Fix the
dumper, fix type index discovery, and add a test in LLD.

Also improve the log message we emit when we fail to rewrite type
indices in LLD. That's how I found this bug.

llvm-svn: 316461
2017-10-24 17:02:40 +00:00
Reid Kleckner 0e88118dd7 [codeview] Add support for inlinee lists
This adds type index discovery and dumper support for symbol record kind
0x1168, which is a list of inlined function ids. This symbol kind is
undocumented, but S_INLINEES is consistent with the existing
nomenclature.

Fixes PR34222

llvm-svn: 316398
2017-10-23 23:43:40 +00:00
Reid Kleckner ecddee27a8 [codeview] Recognize two records with no type index fields
Thunk records do not have types and frame cookies do not have types.

These were found while linking libconcrt.lib from MSVC.

llvm-svn: 316385
2017-10-23 22:44:24 +00:00
George Rimar 7fc298afe4 [llvm-dwarfdump] - Teach tool about few GNU call_sites constants.
This teaches tool about following consants: 
DW_TAG_GNU_call_site,
DW_TAG_GNU_call_site_parameter,
DW_AT_GNU_call_site_value,
DW_AT_GNU_all_call_sites.

Constants documented here: https://sourceware.org/elfutils/DwarfExtensions

Differential revision: https://reviews.llvm.org/D39119

llvm-svn: 316321
2017-10-23 11:24:14 +00:00
Peter Collingbourne 75257bc2ec COFF: Add type server pdb files to linkrepro tar file.
Differential Revision: https://reviews.llvm.org/D38977

llvm-svn: 316233
2017-10-20 19:48:26 +00:00
George Rimar 68b285f69e [llvm-dwarfdump] - Teach tool to parse DW_CFA_GNU_args_size.
Currently llvm-dwarfdump runs into llvm_unreachable when
faces DW_CFA_GNU_args_size. Patch implements the support.

Differential revision: https://reviews.llvm.org/D38879

llvm-svn: 315897
2017-10-16 10:26:17 +00:00
Jonas Devlieghere aa6be823a4 Re-land "[llvm-dwarfdump] Print type names in DW_AT_type DIEs"
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.

Recommit after being reverted in r315299.

Differential revision: https://reviews.llvm.org/D36993

llvm-svn: 315316
2017-10-10 14:15:25 +00:00
Jonas Devlieghere 5b0f885691 Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs"
This reverts commit r315297.

llvm-svn: 315299
2017-10-10 11:49:56 +00:00
Jonas Devlieghere 2eb95c33f6 [llvm-dwarfdump] Print type names in DW_AT_type DIEs
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.

Differential revision: https://reviews.llvm.org/D36993

llvm-svn: 315297
2017-10-10 11:24:41 +00:00
Jonas Devlieghere f2fa9ebe3f [dwarfdump] Verify that unit type matches root DIE
This patch adds two new verifiers:

  - It checks that the root DIE of a CU is actually a valid unit DIE.
    (based on its tag)
  - For DWARF5 which contains a unit type int he CU header, it checks that
    this matches the type of the unit DIE.

Differential revision: https://reviews.llvm.org/D38453

llvm-svn: 315121
2017-10-06 22:27:31 +00:00
Adrian Prantl b4a67907b7 clang-format file.
llvm-svn: 314942
2017-10-04 22:26:19 +00:00
Adrian Prantl 617a007b7c delete commented out code.
llvm-svn: 314941
2017-10-04 22:26:19 +00:00
Hans Wennborg dc8d6f2527 Fix -Wcovered-switch-default warnings from r314821
llvm-svn: 314826
2017-10-03 18:44:12 +00:00
Hans Wennborg ab2177edf7 Revert r314817 "[dwarfdump] Add -lookup option"
The test fails on Linux; see follow-up email on the llvm-commits list.

> Add the option to lookup an address in the debug information and print
> out the file, function, block and line table details.
>
> Differential revision: https://reviews.llvm.org/D38409

This also reverts the follow-up r314818:

> [test] Fix llvm-dwarfdump/cmdline.test
>
> Fixes test/tools/llvm-dwarfdump/cmdline.test

llvm-svn: 314825
2017-10-03 18:39:13 +00:00
Hans Wennborg 660531085a CodeView: Provide a .def file with the register ids
The list of register ids was previously written out in a couple of dirrent
places. This puts it in a .def file and also adds a few more registers (e.g.
the x87 regs) which should lead to more readable dumps, but I didn't include
the whole list since that seems unnecessary.

X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not
relying on magic constants anymore. The TODO of using tablegen still stands.

Differential revision: https://reviews.llvm.org/D38480

llvm-svn: 314821
2017-10-03 18:27:22 +00:00
Jonas Devlieghere f998c501b6 [dwarfdump] Add -lookup option
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.

Differential revision: https://reviews.llvm.org/D38409

llvm-svn: 314817
2017-10-03 17:10:21 +00:00
Hans Wennborg ea89ff7c25 CodeView symbol dumper: use symbolic names for registers
https://reviews.llvm.org/D38469

llvm-svn: 314690
2017-10-02 17:44:47 +00:00
Jonas Devlieghere f91dc28b7b [dwarfdump] Add -show-form
This enables printing of DWARF form types after the DWARF attribute
types.

Differential revision: https://reviews.llvm.org/D38459

llvm-svn: 314685
2017-10-02 16:02:04 +00:00
Jonas Devlieghere a15f25d325 [dwarfdump][NFC] Consistent printing of address ranges
This implement the insertion operator for DWARF address ranges so they
are consistently printed as [LowPC, HighPC).

While a dump method might have felt more consistent, it is used
exclusively for printing error messages in the verifier and never used
for actual dumping. Hence this approach is more intuitive and creates
less clutter at the call sites.

Differential revision: https://reviews.llvm.org/D38395

llvm-svn: 314523
2017-09-29 15:41:22 +00:00
Jonas Devlieghere 19fc4d941f [dwarfdump][NFC] Consistent errors and warnings with --verify
This patch introduces 3 helper functions: error(), warn() and note() to
make printing  during verification more consistent. When supported, the
respective prefixes are printed in color using the same color scheme as
clang.

Differential revision: https://reviews.llvm.org/D38368

llvm-svn: 314498
2017-09-29 09:33:31 +00:00
Adrian Prantl f51e78017d llvm-dwarfdump: support .apple-namespaces in --find
llvm-svn: 314481
2017-09-29 00:52:33 +00:00
Adrian Prantl 714ee4d536 llvm-dwarfdump: add support for .apple_types in --find
llvm-svn: 314479
2017-09-29 00:33:22 +00:00
Adrian Prantl 99fdb9d927 llvm-dwarfdump: implement --find for .apple_names
This patch implements the dwarfdump option --find=<name>.  This option
looks for a DIE in the accelerator tables and dumps it if found.  This
initial patch only adds support for .apple_names to keep the review
small, adding the other sections and pubnames support should be
trivial though.

Differential Revision: https://reviews.llvm.org/D38282

llvm-svn: 314439
2017-09-28 18:10:52 +00:00
Jonas Devlieghere 35fdaa94f7 [dwarfdump] Verify that CUs have a unit DIE.
This patch adds a check to the DWARF verifier to detect CUs without a
unit DIE.

Differential revision: https://reviews.llvm.org/D38363

llvm-svn: 314426
2017-09-28 15:57:50 +00:00
Jonas Devlieghere 777731ab2b [dwarfdump] Fix printing of .debug_line offset.
Fixes 32-bit buildbots:
  http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/542
  http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15/builds/11533
  http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/11494

llvm-svn: 314291
2017-09-27 10:00:27 +00:00
Jonas Devlieghere 65af0f9584 [dwarfdump] Add support for -debug-line=OFFSET
This patch adds support for passing an offset to -debug-line.

Differential revision: https://reviews.llvm.org/D38240

llvm-svn: 314288
2017-09-27 09:33:45 +00:00
Jonas Devlieghere 622c563b5a [dwarfdump] Add support for -debug-loc=OFFSET
This patch adds support for passing an offset to -debug-loc.

Differential revision: https://reviews.llvm.org/D38237

llvm-svn: 314286
2017-09-27 09:33:36 +00:00
Jonas Devlieghere 8af2387b91 [dwarfdump] Skip 'stripped' sections
When dsymutil generates the companion file, its strips all unnecessary
sections by omitting their body and setting the offset in their
corresponding load command to zero.

One such section is the .eh_frame section, as it contains runtime
information rather than debug information and is part of the __TEXT
segment. When reading this section, we would just read the number of
bytes specified in the load command, starting from offset 0 (i.e. the
beginning of the file).

Rather than trying to parse this obviously invalid section, dwarfdump
now skips this.

Differential revision: https://reviews.llvm.org/D38135

llvm-svn: 314208
2017-09-26 14:22:35 +00:00
Jonas Devlieghere 26f9a0c529 [dwarfdump] Add verbose output for .debug-line section
This patch adds dumping of line table instructions as well as the final
state at each specified pc value in verbose mode. This is essentially
the same as the default in Darwin's dwarfdump. Dumping the actual line
table opcodes can be particularly useful for something like debugging a
bad `.debug_line` section.

Differential revision: https://reviews.llvm.org/D37971

llvm-svn: 313910
2017-09-21 20:15:30 +00:00
Adrian Prantl 62528e69c0 llvm-dwarfdump support --debug-frame=<offset> and --eh-frame=<offset>
llvm-svn: 313900
2017-09-21 18:52:03 +00:00
Benjamin Kramer eb14c1109f [DWARF] Shrink AttributeSpec from 24 to 16 bytes.
This is a bit ugly because we can't put Optional into a union. Hide all
of that behind a set of accessors and make accesses safer using asserts.

llvm-svn: 313884
2017-09-21 15:27:45 +00:00
Adrian Prantl d3f9f2138d llvm-dwarfdump: implement --recurse-depth=<N>
This patch implements the Darwin dwarfdump option --recurse-depth=<N>,
which limits the recursion depth when selectively printing DIEs at an
offset.

Differential Revision: https://reviews.llvm.org/D38064

llvm-svn: 313778
2017-09-20 17:44:00 +00:00
David Blaikie e79dda31e9 dwarfdump/symbolizer: Avoid loading unneeded CUs from a DWP
When symbolizing large binaries, parsing every CU in a DWP file is a
significant performance penalty. Instead, use the index to only load the
CUs that are needed.

llvm-svn: 313659
2017-09-19 18:36:11 +00:00
David Blaikie 485e01be26 dwarfdump: Delay parsing abbreviations until they're needed
This speeds up dumping specific DIEs by not parsing abbreviations for
units that are not used.

(this is also handy to have in eventually to speed up llvm-symbolizer
for .dwp files, where parsing most of the DWP file can be avoided by
using the index)

llvm-svn: 313635
2017-09-19 15:13:55 +00:00
Adrian Prantl c8d8653de8 llvm-dwarfdump: use more efficient API (NFC)
llvm-svn: 313573
2017-09-18 21:44:40 +00:00
Adrian Prantl 283eae82fd Fix indentation.
llvm-svn: 313568
2017-09-18 21:28:13 +00:00
Adrian Prantl c2bc717028 llvm-dwarfdump: add a --show-parents options when selectively dumping DIEs.
llvm-svn: 313567
2017-09-18 21:27:44 +00:00
Adrian Prantl dc7d460945 llvm-dwarfdump: Sink the handling of ShowChildren into DWARFDie::dump(). NFC.
llvm-svn: 313560
2017-09-18 19:55:00 +00:00
Jonas Devlieghere c0a758d8ab [dwarfdump] Make .eh_frame an alias for .debug_frame
This patch makes the `.eh_frame` extension an alias for `.debug_frame`.
Up till now it was only possible to dump the section using objdump, but
not with dwarfdump. Since the two are essentially interchangeable, we
dump whichever of the two is present.

As a workaround, this patch also adds parsing for 3 currently
unimplemented CFA instructions: `DW_CFA_def_cfa_expression`,
`DW_CFA_expression`, and `DW_CFA_val_expression`. Because I lack the
required knowledge, I just parse the fields without actually creating
the instructions.

Finally, this also fixes the typo in the `.debug_frame` section name
which incorrectly contained a trailing `s`.

Differential revision: https://reviews.llvm.org/D37852

llvm-svn: 313530
2017-09-18 14:15:57 +00:00
Adrian Prantl 597aa48d11 llvm-dwarfdump: support a --show-children option
This will print all children of a DIE when selectively printing only
one DIE at a given offset.

llvm-svn: 313464
2017-09-16 17:28:00 +00:00
Adrian Prantl 099d7e452a llvm-dwarfdump: Add support for -debug-types=<offset>.
llvm-svn: 313463
2017-09-16 16:58:18 +00:00
Adrian Prantl 057d336c0d llvm-dwarfdump: Add support for -debug-info=<offset>.
This is the first of many commits that enable selectively dumping just
one record from the debug info.

This reapplies r313412 with some extra qualification to appease GCC and MSVC.

llvm-svn: 313419
2017-09-15 23:04:04 +00:00
Adrian Prantl b5abcc558d Revert "llvm-dwarfdump: Add support for -debug-info=<offset>."
This reverts commit r313412 because of a g++ incompatibility.

llvm-svn: 313413
2017-09-15 22:47:16 +00:00
Adrian Prantl fb5d284e97 llvm-dwarfdump: Add support for -debug-info=<offset>.
This is the first of many commits that enable selectively dumping just
one record from the debug info.

llvm-svn: 313412
2017-09-15 22:37:56 +00:00
Adrian Prantl 8416802ea4 llvm-dwarfdump: Factor out the printing of the section header (NFC)
llvm-svn: 313370
2017-09-15 17:39:50 +00:00
Jonas Devlieghere d585a20394 [test] Fix TestDWARFDieRangeInfoIntersects
Fixes heap buffer overflow triggered in DWARF verifier, detected by ASAN.

llvm-svn: 313280
2017-09-14 17:46:23 +00:00
Adrian Prantl 5fd3d49bc4 llvm-dwarfdump: support dumping static archives.
llvm-svn: 313272
2017-09-14 17:01:53 +00:00
Jonas Devlieghere 5891060ff8 [dwarfdump] Add DWARF verifiers for address ranges
This patch started as an attempt to rebase Greg's differential (D32821).
The result is both quite similar and different at the same time. It adds
the following checks:

 - Verify that all address ranges in a DIE are valid.
 - Verify that no ranges within the DIE overlap.
 - Verify that no ranges overlap with the ranges of a sibling.
 - Verify that children are completely contained in its (direct)
   parent's address range. (unless both are subprograms)

Differential revision: https://reviews.llvm.org/D37696

llvm-svn: 313255
2017-09-14 11:33:42 +00:00
Jonas Devlieghere a9f55bed8a Revert "[dwarfdump] Add DWARF verifiers for address ranges"
This reverts commit r313250.

llvm-svn: 313253
2017-09-14 10:49:15 +00:00
Jonas Devlieghere d7201b3a36 [dwarfdump] Add DWARF verifiers for address ranges
This patch started as an attempt to rebase Greg's differential (D32821).
The result is both quite similar and different at the same time. It adds
the following checks:

 - Verify that all address ranges in a DIE are valid.
 - Verify that no ranges within the DIE overlap.
 - Verify that no ranges overlap with the ranges of a sibling.
 - Verify that children are completely contained in its (direct)
   parent's address range. (unless both are subprograms)

Differential revision: https://reviews.llvm.org/D37696

llvm-svn: 313250
2017-09-14 10:38:18 +00:00
Adrian Prantl 3ae35eb56b llvm-dwarfdump: automatically dump both regular and .dwo variant of sections
Since users typically don't really care about the .dwo / non.dwo
distinction, this patch makes it so dwarfdump --debug-<info,...> dumps
.debug_info and (if available) also .debug_info.dwo. This simplifies
the command line interface (I've removed all dwo-specific dump
options) and makes the tool friendlier to use.

Differential Revision: https://reviews.llvm.org/D37771

llvm-svn: 313207
2017-09-13 22:09:01 +00:00
Adrian Prantl 3dcd122151 llvm-dwarfdump: support dumping UUIDs of Mach-O binaries.
This is a feature supported by Darwin dwarfdump. UUIDs are used to
associate executables with their .dSYM bundles.

llvm-svn: 313165
2017-09-13 18:22:59 +00:00
Jonas Devlieghere 27476ce24b [dwarfdump] Rename Brief to Verbose in DIDumpOptions
This patches renames "brief" to "verbose" in de DIDumpOptions and
inverts the logic to match the new behavior where brief is the default.
Changing the default value uncovered some bugs related to the
DIDumpOptions not being propagated and have been fixed as well.

Differential revision: https://reviews.llvm.org/D37745

llvm-svn: 313139
2017-09-13 09:43:05 +00:00
Adrian Prantl 7bc1b28291 llvm-dwarfdump: Replace -debug-dump=sect option with individual options.
As discussed on llvm-dev in
http://lists.llvm.org/pipermail/llvm-dev/2017-September/117301.html
this changes the command line interface of llvm-dwarfdump to match the
one used by the dwarfdump utility shipping on macOS. In addition to
being shorter to type this format also has the advantage of allowing
more than one section to be specified at the same time.

In a nutshell, with this change

  $ llvm-dwarfdump --debug-dump=info
  $ llvm-dwarfdump --debug-dump=apple-objc

becomes

  $ dwarfdump --debug-info --apple-objc

Differential Revision: https://reviews.llvm.org/D37714

llvm-svn: 312970
2017-09-11 22:59:45 +00:00
Jonas Devlieghere f4ed65da04 [dwarfdump] Verify line table prologue
This patch adds prologue verification, which is already present in
Apple's dwarfdump. It checks for invalid directory indices and warns
about duplicate file paths.

Differential revision: https://reviews.llvm.org/D37511

llvm-svn: 312782
2017-09-08 09:48:51 +00:00
Peter Collingbourne 9e26e97955 COFF: PDB: Allow multiple modules with the same name.
It is possible for two modules to have the same name if they are
archive members with the same name, or if we are doing LTO (in which
case all modules will have the name "lto.tmp").

Differential Revision: https://reviews.llvm.org/D37589

llvm-svn: 312744
2017-09-07 20:39:46 +00:00
Peter Collingbourne 8ad3aab4e5 Remove dead code. NFCI.
llvm-svn: 312740
2017-09-07 19:17:30 +00:00
George Rimar 2f95c8bccb [DebugInfo] - Fix for lld DWARF parsing of base address selection entries in range lists.
It solves issue of wrong section index evaluating for ranges when
base address is used.

Based on David Blaikie's patch D36097.

Differential revision: https://reviews.llvm.org/D37214

llvm-svn: 312477
2017-09-04 10:30:39 +00:00
Zachary Turner abb17cc084 [llvm-pdbutil] Support dumping CodeView from object files.
We have llvm-readobj for dumping CodeView from object files, and
llvm-pdbutil has always been more focused on PDB.  However,
llvm-pdbutil has a lot of useful options for summarizing debug
information in aggregate and presenting high level statistical
views.  Furthermore, it's arguably better as a testing tool since
we don't have to write tests to conform to a state-machine like
structure where you match multiple lines in succession, each
depending on a previous match.  llvm-pdbutil dumps much more
concisely, so it's possible to use single-line matches in many
cases where as with readobj tests you have to use multi-line
matches with an implicit state machine.

Because of this, I'm adding object file support to llvm-pdbutil.
In fact, this mirrors the cvdump tool from Microsoft, which also
supports both object files and pdb files.  In the future we could
perhaps rename this tool llvm-cvutil.

In the meantime, this allows us to deep dive into object files
the same way we already can with PDB files.

llvm-svn: 312358
2017-09-01 20:06:56 +00:00
Zachary Turner 99c6982bcd [llvm-pdbutil] Print detailed S_UDT stats.
This adds a new command line option, -udt-stats, which breaks
down the stats of S_UDT records.  These are one of the biggest
contributors to the size of /DEBUG:FASTLINK PDBs, so they need
some additional tools to be able to analyze their usage.  This
option will dig into each S_UDT record and determine what kind
of record it points to, and then break down the statistics by
the target type.  The goal here is to identify how our object
files differ from MSVC object files in S_UDT records, so that
we can output fewer of them and reach size parity.

llvm-svn: 312276
2017-08-31 20:43:22 +00:00
Reid Kleckner a058736c9c [dwarfdump] Pretty print location expressions and location lists
Summary:
Based on Fred's patch here: https://reviews.llvm.org/D6771

I can't seem to commandeer the old review, so I'm creating a new one.

With that change the locations exrpessions are pretty printed inline in the
DIE tree. The output looks like this for debug_loc entries:

    DW_AT_location [DW_FORM_data4]        (0x00000000
       0x0000000000000001 - 0x000000000000000b: DW_OP_consts +3
       0x000000000000000b - 0x0000000000000012: DW_OP_consts +7
       0x0000000000000012 - 0x000000000000001b: DW_OP_reg0 RAX, DW_OP_piece 0x4
       0x000000000000001b - 0x0000000000000024: DW_OP_breg5 RDI+0)

And like this for debug_loc.dwo entries:
    DW_AT_location [DW_FORM_sec_offset]   (0x00000000
      Addr idx 2 (w/ length 190): DW_OP_consts +0, DW_OP_stack_value
      Addr idx 3 (w/ length 23): DW_OP_reg0 RAX, DW_OP_piece 0x4)

Simple locations without ranges are printed inline:

   DW_AT_location [DW_FORM_block1]       (DW_OP_reg4 RSI, DW_OP_piece 0x4, DW_OP_bit_piece 0x20 0x0)

The debug_loc(.dwo) dumping in changed accordingly to factor the code.

Reviewers: dblaikie, aprantl, friss

Subscribers: mgorny, javed.absar, hiraditya, llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D37123

llvm-svn: 312042
2017-08-29 21:41:21 +00:00
Jonas Devlieghere 4942a0b0f3 Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs"
This reverts commit r311492.

llvm-svn: 311499
2017-08-22 21:59:46 +00:00
Jonas Devlieghere f456d1864d [llvm-dwarfdump] Print type names in DW_AT_type DIEs
This patch adds printing for DW_AT_type DIEs like it's currently already
the case for DW_AT_specification DIEs.

llvm-svn: 311492
2017-08-22 21:41:49 +00:00
Zachary Turner 5641c07d6b [PDB] Serialize records into a stack-allocated buffer.
We were using a std::vector<> and resizing to MaxRecordLength,
which is ~64KB.  We would then do this repeatedly often many
times in a tight loop, which was causing measurable performance
impact when linking PDBs.

Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36940

llvm-svn: 311375
2017-08-21 20:17:19 +00:00
Zachary Turner d76dc2d31e [lld/pdb] Speed up construction of publics & globals addr map.
computeAddrMap function calls std::stable_sort with a comparison
function that computes deserialized symbols every time its called.
In the result deserializeAs<PublicSym32> is called 20-30 times per
symbol. It's much faster to calculate it beforehand and pass a
pointer to it to the comparison function.

Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36941

llvm-svn: 311373
2017-08-21 20:08:40 +00:00
Zachary Turner d1de2f4f5e [llvm-pdbutil] Add support for dumping detailed module stats.
This adds support for dumping a summary of module symbols
and CodeView debug chunks.  This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size.  Then
at the end it prints the totals for the entire file.

Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.

llvm-svn: 311338
2017-08-21 14:53:25 +00:00
Benjamin Kramer 49a49fe816 Move helper classes into anonymous namespaces.
No functionality change intended.

llvm-svn: 311288
2017-08-20 13:03:48 +00:00
Jonas Devlieghere a2faf7b60f [llvm-dwarfdump] Hide .debug_str and DIE reference offsets in brief mode
This patch hides the .debug_str offset and DIE reference offsets into
the CU when llvm-dwarfdump is invoked with -brief.

Differential Revision: https://reviews.llvm.org/D36835

llvm-svn: 311201
2017-08-18 21:35:44 +00:00
Zachary Turner 197bba0028 Remove unused variable.
llvm-svn: 311119
2017-08-17 20:18:36 +00:00
Zachary Turner 96bcd6a37a [llvm-pdbutil] Fix some dumping issues.
When dumping, we were treating the S_INLINESITESYM as referring
to a type record, when it actually refers to an id record.  We
had this correct in TypeIndexDiscovery, so our merging algorithm
should be fine, but we had it wrong in the dumper, which means it
would appear to work most of the time, unless the index was out
of bounds in the type stream, when it would fail.  Fixed this, and
audited a few other cases to make them match the behavior in
TypeIndexDiscovery.

Also, I've now observed a new symbol record with kind 0x1168 which
I have no clue what it is, so to avoid crashing we have to just
print "Unknown Symbol Kind".

llvm-svn: 311117
2017-08-17 20:04:51 +00:00
Adrian Prantl 3d523a657a Add a convenience overload of DWARFDie::dump() for debugging purposes.
llvm-svn: 311026
2017-08-16 17:43:01 +00:00
George Rimar e5269439cd [llvm-dwarfdump] - Attemp to fix BB after r310915.
Now MIPS one is unhappy:
http://lab.llvm.org:8011/builders/llvm-mips-linux/builds/2221

llvm-svn: 310928
2017-08-15 16:42:21 +00:00
George Rimar e1c30f74f7 [llvm-dwarfdump] - Refactor section name/uniqueness gathering.
As was requested in D36313 thread,

with this patch section names and uniqueness calculated once,
and not every time when a range is dumped.

Differential revision: https://reviews.llvm.org/D36740

llvm-svn: 310923
2017-08-15 15:54:43 +00:00
George Rimar 6957ab5b7b [llvm-dwarfdump] - Print section name and index when dumping .debug_info ranges
Teaches llvm-dwarfdump to print section index and name of range
when it dumps .debug_info.

Differential revision: https://reviews.llvm.org/D36313

llvm-svn: 310915
2017-08-15 12:32:54 +00:00
Zachary Turner ee9906d884 [LLD/PDB] Write actual records to the globals stream.
Previously we were writing an empty globals stream.  Windows
tools interpret this as "private symbols are not present in
this PDB", even when they are, so we need to fix this.  Regardless,
without it we don't have information about global variables, so
we need to fix it anyway.  This patch does that.

With this patch, the "lm" command in WinDbg correctly reports
that we have private symbols available, but the "dv" command
still refuses to display local variables.

Differential Revision: https://reviews.llvm.org/D36535

llvm-svn: 310743
2017-08-11 19:00:03 +00:00
Zachary Turner 5448dabbdd [PDB] Fix an issue writing the publics stream.
In the refactor to merge the publics and globals stream, a bug
was introduced that wrote the wrong value for one of the fields
of the PublicsStreamHeader.  This caused debugging in WinDbg
to break.

We had no way of dumping any of these fields, so in addition to
fixing the bug I've added dumping support for them along with a
test that verifies the correct value is written.

llvm-svn: 310439
2017-08-09 04:23:59 +00:00
Zachary Turner 946204c83e [PDB] Merge Global and Publics Builders.
The publics stream and globals stream are very similar. They both
contain a list of hash buckets that refer into a single shared stream,
the symbol record stream. Because of the need for each builder to manage
both an independent hash stream as well as a single shared record
stream, making the two builders be independent entities is not the right
design. This patch merges them into a single class, of which only a
single instance is needed to create all 3 streams.  PublicsStreamBuilder
and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder
class, which writes all 3 streams at once.

Note that this patch does not contain any functionality change. So we're
still not yet writing any records to the globals stream. All we're doing
is making it so that when we do start writing records to the globals,
this refactor won't have to be part of that patch.

Differential Revision: https://reviews.llvm.org/D36489

llvm-svn: 310438
2017-08-09 04:23:25 +00:00
Zachary Turner 59e3ae827d [PDB] Fix linking of function symbols and local variables.
The compiler outputs PROC32_ID symbols into the object files
for functions, and these symbols have an embedded type index
which, when copied to the PDB, refer to the IPI stream.  However,
the symbols themselves are also converted into regular symbols
(e.g. S_GPROC32_ID -> S_GPROC32), and type indices in the regular
symbol records refer to the TPI stream.  So this patch applies
two fixes to function records.
  1. It converts ID symbols to the proper non-ID record type.
  2. After remapping the type index from the object file's index
     space to the PDB file/IPI stream's index space, it then
     remaps that index to the TPI stream's index space by.

Besides functions, during the remapping process we were also
discarding symbol record types which we did not recognize.
In particular, we were discarding S_BPREL32 records, which is
what MSVC uses to describe local variables on the stack.  So
this patch fixes that as well by copying them to the PDB.

Differential Revision: https://reviews.llvm.org/D36426

llvm-svn: 310394
2017-08-08 18:34:44 +00:00
Simon Dardis b1b52c0200 [DebugInfo][DWARF] Address paulr's comment on rL310253.
llvm-svn: 310267
2017-08-07 16:08:11 +00:00
Simon Dardis 02d9945e6f [DebugInfo][DWARF] Correct some usages of PRIx32 to PRIx64
These lead to tests failing spuriously as the values after being rendered to a
string were incorrect.

Reviewers: clayborg

Differential Revision: https://reviews.llvm.org/D36319

llvm-svn: 310262
2017-08-07 15:37:57 +00:00
Simon Dardis ec4ea99766 [DebugInfo][DWARF] Use PRIx64 explicitly in output.
llvm-svn: 310253
2017-08-07 13:30:03 +00:00
Adrian McCarthy b41f03e768 Enable llvm-pdbutil to list enumerations using native PDB reader
This extends the native reader to enable llvm-pdbutil to list the enums in a
PDB and it includes a simple test. It does not yet list the values in the
enumerations, which requires an actual implementation of
NativeEnumSymbol::FindChildren.

To exercise this code, use a command like:

    llvm-pdbutil pretty -native -enums foo.pdb

Differential Revision: https://reviews.llvm.org/D35738

llvm-svn: 310144
2017-08-04 22:37:58 +00:00
Reid Kleckner 175af4bcc7 [PDB] Fix section contributions
Summary:
PDB section contributions are supposed to use output section indices and
offsets, not input section indices and offsets.

This allows the debugger to look up the index of the module that it
should look up in the modules stream for symbol information. With this
change, windbg can now find line tables, but it still cannot print local
variables.

Fixes PR34048

Reviewers: zturner

Subscribers: hiraditya, ruiu, llvm-commits

Differential Revision: https://reviews.llvm.org/D36285

llvm-svn: 309987
2017-08-03 21:15:09 +00:00
Zachary Turner 9fb9d71d3e [pdb/lld] Write a valid FPM.
The PDB reserves certain blocks for the FPM that describe which
blocks in the file are allocated and which are free.  We weren't
filling that out at all, and in some cases we were even stomping
it with incorrect data.  This patch writes a correct FPM.

Differential Revision: https://reviews.llvm.org/D36235

llvm-svn: 309896
2017-08-02 22:31:39 +00:00
Zachary Turner c3d8eec9e9 [pdbutil] Add a command to dump the FPM.
Recently problems have been discovered in the way we write the FPM
(free page map).  In order to fix this, we first need to establish
a baseline about what a correct FPM looks like using an MSVC
generated PDB, so that we can then make our own generated PDBs
match.  And in order to do this, the dumper needs a mode where it
can dump an FPM so that we can write tests for it.

This patch adds a command to dump the FPM, as well as a test against
a known-good PDB.

llvm-svn: 309894
2017-08-02 22:25:52 +00:00
David Blaikie 22dc4474a6 DebugInfo: Test & handle (differently) non-zero DW_AT_ranges_base
Followup to r309570, fixing it slightly differently (ranges_base and
addr_base should never be read from a DWO file - so there shouldn't be
any issue with 'overriding' the values - conditionalize the code and
assert that the values aren't being overriden).

llvm-svn: 309879
2017-08-02 20:16:22 +00:00
Benjamin Kramer 295cf4de37 [DebugInfo] Use shrink_to_fit to simplify code. NFCI.
llvm-svn: 309683
2017-08-01 14:38:08 +00:00
Zachary Turner 8d927b6bf9 [lld/pdb] Add an empty globals stream.
We don't write any actual symbols to this stream yet, but for
now we just create the stream and hook it up to the appropriate
places and give it a valid header.

Differential Revision: https://reviews.llvm.org/D35290

llvm-svn: 309608
2017-07-31 19:36:08 +00:00
Spyridoula Gravani 70d35e102e [DWARF] Added verification check for tags in accelerator tables. This patch verifies that the atom tag is actually the same with the tag of the DIE that we retrieve from the table.
Differential Revision: https://reviews.llvm.org/D35963

llvm-svn: 309596
2017-07-31 18:01:16 +00:00
Benjamin Kramer d7b1e5af0a [DebugInfo] Don't overwrite DWARFUnit fields if the CU DIE doesn't have them.
DIEs are lazily deserialized so it's possible that the DWO CU is created
before the DIE is parsed. DWO shares .debug_addr and .debug_ranges with the
object file so overwriting the offset with 0 will make the CU unusable.

No test case because I couldn't get clang to emit a non-zero range base.

llvm-svn: 309570
2017-07-31 15:32:39 +00:00
David Blaikie a62f1cb1fa DebugInfo: Fix for CU index usage in 309507
Not sure quite how I failed so clearly to test this, but anyway.

llvm-svn: 309514
2017-07-30 15:15:58 +00:00
David Blaikie ebac0b9c62 DebugInfo: Use DWP cu_index to speed up symbolizing (as intended)
I was a bit lazy when I first implemented this & skipped the index
lookup - obviously for large files this becomes pretty crucial, so here
we go, do the index lookup. Speeds up large DWP symbolizing by... lots.
(20m -> 20s, actually, maybe more in a release build (that was a release
build without index lookup, compared to a debug/non-release build with
the index usage))

llvm-svn: 309507
2017-07-30 08:12:07 +00:00
David Blaikie e5adb68e04 DebugInfo: Provide option for explicitly specifying the name of the DWP file
If you've archived the DWP file somewhere it's probably useful to be
able to just tell llvm-symbolizer where it is when you're symbolizing
stack traces from the binary.

This only provides a mechanism for specifying a single DWP file, good if
you're symbolizing a program with a single DWP file, but it's likely if
the program is dynamically linked that you might have a DWP for each
dynamic library - in which case this feature won't help (at least as
it's surfaced in llvm-symbolizer for now) - in theory it could be
extended to specify a collection of DWP files that could all be
consulted for split CU hash resolution.

llvm-svn: 309498
2017-07-30 01:34:08 +00:00
Reid Kleckner ef443296a4 [PDB] Initialize the std::array<ulittle32_t> used for the gsi bitmap
With ASan, we would write about 512 bytes of malloc fill value to the
PDB, with some random bits ORed in here and there. Dumping the PDB would
always fail reliably.

llvm-svn: 309331
2017-07-27 23:13:05 +00:00
Reid Kleckner eacdf04fdd [PDB] Write public symbol records and the publics hash table
Summary:
MSVC link.exe records all external symbol names in the publics stream.
It provides similar functionality to an ELF .symtab.

Reviewers: zturner, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35871

llvm-svn: 309303
2017-07-27 18:25:59 +00:00
Spyridoula Gravani 73e1796da2 [DWARF] Minor code style modification, no functionality change.
llvm-svn: 309240
2017-07-27 00:59:33 +00:00
Reid Kleckner 037bcd9345 [PDB] Remove stale GSI.h header that I intended to remove in the previous commit
llvm-svn: 309069
2017-07-26 00:58:49 +00:00
Spyridoula Gravani dc635f40bb [DWARF] Generalized verification of .apple_names accelerator table to be applicable to any acceleration table. Added verification for .apple_types, .apple_namespaces and .apple_objc sections.
Differential Revision: https://reviews.llvm.org/D35853

llvm-svn: 309068
2017-07-26 00:52:31 +00:00
Reid Kleckner 14d90fd05c [PDB] Improve GSI hash table dumping for publics and globals
The PDB "symbol stream" actually contains symbol records for the publics
and the globals stream. The globals and publics streams are essentially
hash tables that point into a single stream of records. In order to
match cvdump's behavior, we need to only dump symbol records referenced
from the hash table. This patch implements that, and then implements
global stream dumping, since it's just a subset of public stream
dumping.

Now we shouldn't see S_PROCREF or S_GDATA32 records when dumping
publics, and instead we should see those record in the globals stream.

llvm-svn: 309066
2017-07-26 00:40:36 +00:00
NAKAMURA Takumi 7ddaf3cf88 DWARFVerifier.cpp: Fix -m32 in r308928. Use PRIx64.
llvm-svn: 308949
2017-07-25 05:03:17 +00:00
Spyridoula Gravani e0ba415740 [DWARF] Added verification check for die ranges. If highPC is an address, then it should be greater than lowPC for each range.
Differential Revision: https://reviews.llvm.org/D35733

llvm-svn: 308928
2017-07-24 21:04:11 +00:00
Rafael Espindola 87c3f4a938 Move DWARFSectionMap to a .cpp file.
Thanks to Paul Robinson for the suggestion.

llvm-svn: 308913
2017-07-24 19:34:26 +00:00
Tim Northover fe6be421a7 Revert "Debug: handle dumping the D language."
Reid beat me to it.

llvm-svn: 308902
2017-07-24 17:47:46 +00:00
Tim Northover c7bd8255b9 Debug: handle dumping the D language.
Mostly just to silence a warning about an unhandled case. There don't seem to
be any tests for this operator (at least that I could find).

llvm-svn: 308901
2017-07-24 17:39:44 +00:00
Reid Kleckner e2ba971302 Add missing case to switch
llvm-svn: 308894
2017-07-24 16:30:44 +00:00
Reid Kleckner 898ddf61c0 [codeview] Emit 'D' as the cv source language for D code
This matches DMD:
522263965c/src/ddmd/backend/cv8.c (L199)

Fixes PR33899.

llvm-svn: 308890
2017-07-24 16:16:42 +00:00
Reid Kleckner c85041fe00 Fix DebugInfo/PDB build by adding missing changes
llvm-svn: 308765
2017-07-21 18:32:00 +00:00
Reid Kleckner 686f121a5d [PDB] Dump extra info about the publics stream
This includes the hash table, the address map, and the thunk table and
section offset table. The last two are only used for incremental
linking, which LLD doesn't support, so they are less interesting. The
hash table is particularly important to get right, since this is the one
of the streams that debuggers use to translate addresses to symbols.

llvm-svn: 308764
2017-07-21 18:28:55 +00:00
Spyridoula Gravani c6ef9873ac [DWARF] Generalized verification of .debug_abbrev to be applicable to both .debug_abbrev and .debug_abbrev.dwo sections.
Differential Revision: https://reviews.llvm.org/D35698

llvm-svn: 308703
2017-07-21 00:51:32 +00:00
Spyridoula Gravani 364b535234 [DWARF] Added check that verifies that no abbreviation declaration has more than one attribute with the same name.
SUMMARY

This patch adds a verification check on the abbreviation declarations in the .debug_abbrev section.
The check makes sure that no abbreviation declaration has more than one attributes with the same name.

Differential Revision: https://reviews.llvm.org/D35643

llvm-svn: 308579
2017-07-20 02:06:52 +00:00
Reid Kleckner 388f88070e Use llvm::make_unique once more to avoid ADL ambiguity with std::make_unique
llvm-svn: 308552
2017-07-19 23:42:53 +00:00
Rafael Espindola 2e942fbaef Use llvm::make_unique to try to fix the windows build.
llvm-svn: 308551
2017-07-19 23:38:54 +00:00
Rafael Espindola 3ee9e11acb Remove some leftover DWARFContextInMemory.
Not sure how I missed these on the previous commit.

llvm-svn: 308550
2017-07-19 23:34:59 +00:00
Rafael Espindola c398e67fed Use delegation instead of inheritance.
This changes DwarfContext to delegate to DwarfObject instead of having
pure virtual methods.

With this DwarfContextInMemory is replaced with an implementation of
DwarfObject that is local to a .cpp file.

llvm-svn: 308543
2017-07-19 22:27:28 +00:00
Spyridoula Gravani f6bd788dda [DWARF] Modification of code for the verification of .debug_info section.
Summary:
This patch modifies the handleDebugInfo() function so that we verify the contents of each unit
in the .debug_info section only if its header has been successfully verified.

This change will allow for more/different verification checks depending on the type of the unit since from
dwarf5, the .debug_info section may consist of different types of units.

Subscribers: aprantl

Differential Revision: https://reviews.llvm.org/D35521

llvm-svn: 308245
2017-07-18 01:00:26 +00:00
Reid Kleckner c50349d4c6 [PDB] Finish and simplify TPI hashing
Summary:
This removes the CVTypeVisitor updater and verifier classes. They were
made dead by the minimal type dumping refactoring. Replace them with a
single function that takes a type record and produces a hash. Call this
from the minimal type dumper and compare the hash.

I also noticed that the microsoft-pdb reference repository uses a basic
CRC32 for records that aren't special. We already have an implementation
of that CRC ready to use, because it's used in COFF for ICF.

I'll make LLD call this hashing utility in a follow-up change. We might
also consider using this same hash in type stream merging, so that we
don't have to hash our records twice.

Reviewers: inglorion, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D35515

llvm-svn: 308240
2017-07-18 00:33:45 +00:00
Reid Kleckner 67653ee086 [codeview] Fix YAML for LF_TYPESERVER2 by hoisting PDB_UniqueId
Summary:
We were treating the GUIDs in TypeServer2Record as strings, and the
non-ASCII bytes in the GUID would not round-trip through YAML.

We already had the PDB_UniqueId type portably represent a Windows GUID,
but we need to hoist that up to the DebugInfo/CodeView library so that
we can use it in the TypeServer2Record as well as in PDB parsing code.

Reviewers: inglorion, amccarth

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D35495

llvm-svn: 308234
2017-07-17 23:59:44 +00:00
Reid Kleckner 3167480ca6 [codeview] Don't use the type visitor to merge types
Summary:
This didn't do much to speed things up, but it implements a FIXME, and I
think it's a nice simplification. We don't need the record kind switch.
We're doing that ourselves.

Reviewers: ruiu, inglorion

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D35496

llvm-svn: 308213
2017-07-17 20:31:38 +00:00
Reid Kleckner a842cd75e2 [codeview] Remove TypeServerHandler and PDBTypeServerHandler
Summary:
Instead of wiring these through the CVTypeVisitor interface, clients
should inspect the CVTypeArray before visiting it and potentially load
up the type server's TPI stream if they need it.

No tests relied on this functionality because LLD was the only client.

Reviewers: ruiu

Subscribers: mgorny, hiraditya, zturner, llvm-commits

Differential Revision: https://reviews.llvm.org/D35394

llvm-svn: 308212
2017-07-17 20:28:06 +00:00
Reid Kleckner af88a910fd [CodeView] Dump BuildInfoSym and ProcSym type indices
I need to print the type index in hex so that I can match it in
FileCheck for a test I'm writing.

llvm-svn: 308107
2017-07-15 18:10:39 +00:00
Eric Christopher f73870eefa Remove set but not used variables from the debug info verifier code.
llvm-svn: 307987
2017-07-14 01:40:47 +00:00
Spyridoula Gravani 890eedc4e4 [DWARF] Introduce verification for the unit header chain in .debug_info section to llvm-dwarfdump.
This patch adds verification checks for the unit header chain in the .debug_info section.
Specifically, for each unit in the .debug_info section, the verifier checks that:

The unit length is valid (i.e. the unit can actually fit in the .debug_info section)
The dwarf version of the unit is valid
The address size is valid (4 or 8)
The unit type (if the unit is in dwarf5) is valid
The debug_abbrev_offset is valid

llvm-svn: 307975
2017-07-13 23:25:24 +00:00
Reid Kleckner 6597c28d76 [PDB] Fix type server handling for archives
Summary:
This fixes type indices for SDK or CRT static archives. Previously we'd
try to look next to the archive object file path, which would not exist
on the local machine.

Also error out if we can't resolve a type server record. Hypothetically
we can recover from this error by discarding debug info for this object,
but that is not yet implemented.

Reviewers: ruiu, amccarth

Subscribers: aprantl, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35369

llvm-svn: 307946
2017-07-13 20:12:23 +00:00
Wolfgang Pieb 515d0e5001 [DWARF] Fixing a bug with processing of DWARF v5 indexed strings in Mach-O objects.
Code to convert MachO - specific section debug section names to standard DWARF v5
section names was in the wrong place.

Differential Revision: https://reviews.llvm.org/D35321

llvm-svn: 307872
2017-07-13 01:03:28 +00:00
Rafael Espindola 5e5dfa1fc5 Don't expose a map in the DWARFContext interface.
Doing so is leaking an implementation detail.

I have an implementation that uses the lld infrastructure and doesn't
use a map or object::SectionRef.

llvm-svn: 307846
2017-07-12 21:08:24 +00:00
Reid Kleckner 0962cb2e3a Fix non-Windows build after PDB native builtin type change
Some C++14 features slipped in along with an extra member qualification.

llvm-svn: 307835
2017-07-12 19:46:35 +00:00
Adrian McCarthy 8d090fc531 [PDB] Enable NativeSession to create symbols for built-in types on demand
Summary:
There is a reserved range of type indexes for built-in types (like integers).
This will create a symbol for a built-in type if the caller askes for one by
type index.  This is also plumbing for being able to recall symbols by type
index in general, but user-defined types will come in subsequent patches.

Reviewers: rnk, zturner

Subscribers: mgorny, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35163

llvm-svn: 307834
2017-07-12 19:38:11 +00:00
Reid Kleckner 8d8888ff42 [codeview] Change readobj symbol dumping format
Avoid duplicating DictScope with hand-written names everywhere.  Print
the S_-prefixed symbol kind for every record. This should make it easier
to search for certain kinds of records when debugging PDB linking.

llvm-svn: 307732
2017-07-11 23:41:41 +00:00
Reid Kleckner 447f677133 [codeview] Fix type index discovery for four symbol records
I encountered these when linking LLD, which uses atls.lib. Those objects
appear to use these uncommon symbol records:

  0x115E S_HEAPALLOCSITE
  0x113D S_ENVBLOCK
  0x1113 S_GTHREAD32
  0x1153 S_FILESTATIC

llvm-svn: 307725
2017-07-11 22:37:25 +00:00
Zachary Turner 7eaf1d96ad [lld/pdb] Create an empty public symbol record stream.
This is part of the continuing effort to increase parity between
LLD and MSVC PDBs.  link still doesn't like our PDBs, so the most
obvious thing to check was whether adding an empty publics stream
would get it to do something else.  It still fails in the same way
but at least this removes one more variable from the equation.
The next logical step would be to try creating an empty globals
stream.

Differential Revision: https://reviews.llvm.org/D35224

llvm-svn: 307598
2017-07-10 22:40:20 +00:00
George Rimar c7147b368a [DWARF] - Rename variable. NFC.
Variable was called 'Name' and contained text
name of relocation type. Problem was that
outside of this error handling scope we already
have different 'Name' variable that contains
section name.

Change helps to avoid confusion. 

llvm-svn: 307530
2017-07-10 10:04:51 +00:00
George Rimar 4be1388ebb [DWARF] - Remove unused variables. NFC.
llvm-svn: 307528
2017-07-10 09:36:44 +00:00
Zachary Turner 3a11fdf8ce [PDB] More changes to bring lld PDBs to parity with MSVC.
1) Don't write a /src/headerblock stream.  This appears to be
   written conditionally by MSVC, but it's not clear what the
   condition is.  For now, just remove it since we dont' know
   what it is anyway and the particular pdb we've checked in
   for the test doesn't have one.
2) Write a valid timestamp for the PDB file signature.  This
   leads to non-reproducible builds, but it matches the default
   behavior of link, so it should be out default as well.  If
   we need reproducibility, we should add a separate command
   line option for it that is off by default.
3) Write an empty FPO stream.  MSVC seems to always write an
   FPO stream.  This change makes the stream directory match
   up, although we still need to make the contents of the FPO
   stream match.

llvm-svn: 307436
2017-07-07 20:25:39 +00:00
Zachary Turner c1e93e5fa4 Fix some differences between lld and MSVC generated PDBs.
A couple of things were different about our generated PDBs.

1) We were outputting the wrong Version on the PDB Stream.
   The version we were setting was newer than what MSVC is setting.
   It's not clear what the implications are, but we change LLD
   to use PdbImplVC70, as MSVC does.
2) For the optional debug stream indices in the DBI Stream, we
   were outputting 0 to mean "the stream is not present".  MSVC
   outputs uint16_t(-1), which is the "correct" way to specify
   that a stream is not present.  So we fix that as well.
3) We were setting the PDB Stream signature to 0.  This is supposed
   to be the result of calling time(nullptr).  Although this leads
   to non-deterministic builds, a better way to solve that is by
   having a command line option explicitly for generating a
   reproducible build, and have the default behavior of lld-link
   match the default behavior of link.

To test this, I'm making use of the new and improved `pdb diff`
sub command.  To make it suitable for writing tests against, I had
to modify the diff subcommand slightly to print less verbose output.
Previously it would always print | <column> | <value1> | <value2> |
which is quite verbose, and the values are fragile.  All we really
want to know is "did we produce the same value as link?"  So I added
command line options to print a single character representing the
result status (different, identical, equivalent), and another to
hide the value display.  Note that just inspecting the diff output
used to write the test, you can see some things that are obviously
wrong.  That is just reflective of the fact that this is the state
of affairs today, not that we're asserting that this is "correct".
We can use this as a starting point to discover differences, fix
them, and update the test.

Differential Revision: https://reviews.llvm.org/D35086

llvm-svn: 307422
2017-07-07 18:45:56 +00:00
Zachary Turner f3b4b2d89d [llvm-pdbutil] Improve diff mode.
We're getting to the point that some MS tools (e.g. DIA) can recognize
our PDBs but others (e.g. link.exe) cannot. I think the way forward is
to improve our tooling to help us find differences more easily. For
example, if we can compile the same program with clang-cl and cl and
have a tool tell us all the places where the PDBs differ, this could
tell us what we're doing wrong. It's tricky though, because there are a
lot of "benign" differences in a PDB. For example, if the string table
in one PDB consists of "foo" followed by "bar" and in the other PDB it
consists of "bar" followed by "foo", this is not necessarily a critical
difference, as long as the uses of these strings also refer to the
correct location. On the other hand, if the second PDB doesn't even
contain the string "foo" at all, this is a critical difference.

diff mode has been in llvm-pdbutil for quite a while, but because of the
above challenge along with some others, it's been hard to make it
useful. I think this patch addresses that. It looks for all the same
things, but it now prints the output in tabular format (carefully
formatted and aligned into tables and fields), and it highlights
critical differences in red, non-critical differences in yellow, and
identical fields in green.  This makes it easy to spot the places we
differ, and the general concept of outputting arbitrary fields in
tabular format can be extended to provide analysis into many of the
different types of information that show up in a PDB.

Differential Revision: https://reviews.llvm.org/D35039

llvm-svn: 307421
2017-07-07 18:45:37 +00:00
Rafael Espindola 7413496f8d Fix variable names. NFC.
llvm-svn: 307406
2017-07-07 15:20:55 +00:00
Rafael Espindola 77e87b3444 Reduce code duplication.
By addding a mapNameToDWARFSection we only need to check section names
in one place.

llvm-svn: 307359
2017-07-07 05:36:53 +00:00
Zachary Turner 6c4bfba8f3 [PDB] Teach libpdb to write DBI Stream ECNames.
Based strictly on the name, this seems to have something to do
width edit & continue.  The goal of this patch has nothing to do
with supporting edit and continue though.  msvc link.exe writes
very basic information into this area even when *not* compiling
with support for E&C, and so the goal here is to bring lld-link
to parity.  Since we cannot know what assumptions standard tools
make about the content of PDB files, we need to be as close as
possible.

This ECNames data structure is a standard PDB string hash table.
link.exe puts a single string into this hash table, which is the
full path to the PDB file on disk.  It then references this string
from the module descriptor for the compiler generated `* Linker *`
module.

With this patch, lld-link will generate the exact same sequence of
bytes as MSVC link for this subsection for a given object file
input (as reported by `llvm-pdbutil bytes -ec`).

llvm-svn: 307356
2017-07-07 05:04:36 +00:00
Hiroshi Inoue ddb34d84c9 fix trivial typos in comments; NFC
llvm-svn: 307004
2017-07-03 06:32:59 +00:00
Eugene Zelenko 4fcfc19976 [CodeView, PDB] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306911
2017-06-30 23:06:03 +00:00
Zachary Turner af8c75a8c0 [llvm-pdbutil] Output the symbol offset when dumping.
Type records have a unique type index, but symbol records do
not.  Instead, symbol records refer to other symbol records
by referencing their offset in the symbol stream.  In a sense
this is the analogue of the TypeIndex, but we are not printing
it in the dumper.  Printing it not only gives us more useful
information when manually investigating the contents of a PDB,
but also allows us to write better tests by enabling us to
verify that fields that reference other symbol records do
so correctly.

Differential Revision: https://reviews.llvm.org/D34906

llvm-svn: 306890
2017-06-30 21:35:00 +00:00
Zachary Turner 02a267758e [llvm-pdbutil] Add the ability to dump the dependency tree for a type
Previously we had the -type-index option which would dump the record of
a single, but we had no way to follow the dependency graph backwards and
also dump all dependent types.

Having this option makes test-writing better, because we can limit the
test to only those records that are of importance for the thing we're
trying to test, which allows us to use things like CHECK-NEXT to reduce
fragility.

Differential Revision: https://reviews.llvm.org/D34899

llvm-svn: 306852
2017-06-30 18:15:47 +00:00
Spyridoula Gravani 837c110cb1 [DWARF] Added verification checks for the .apple_names section.
This patch verifies the number of atoms, the validity of the form for each atom, as well as the validity of the
hashdata. For hashdata, we're verifying that the hashdata offset is correct and that the offset in the .debug_info for
each DIE in the hashdata is also valid.

llvm-svn: 306735
2017-06-29 20:13:05 +00:00
Paul Robinson 17536b935a [DWARF] NFC: DWARFDataExtractor combines relocs with DataExtractor.
Requires callers to directly associate relocations with a DataExtractor
used to read data from a DWARF section, which helps a callee not make
assumptions about which section it is reading.
This is the next step in reducing DWARFFormValue's dependence on DWARFUnit.

Differential Revision: https://reviews.llvm.org/D34704

llvm-svn: 306699
2017-06-29 16:52:08 +00:00
George Rimar b25a09d0f5 [DWARF] - Fix message reporting about broken relocation.
Because of mistake introduced in r306517,
wrong variable ("name" instead of "Name") was used
in error message.
As a result it reported section name instead of
relocation name.

This file still needs cleanup to match LLVM coding style
and more tests I think.

llvm-svn: 306677
2017-06-29 14:05:18 +00:00
Eugene Zelenko 8456b16ea9 [CodeView] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306616
2017-06-29 00:05:44 +00:00
Adrian McCarthy bf0afc3246 Introduce symbol cache to PDB NativeSession
Instead of creating symbols directly in the findChildren methods of the native
symbol implementations, they will rely on the NativeSession to act as a factory
for these types.  This lets NativeSession cache the NativeRawSymbols in its
new symbol cache and makes that cache the source of unique IDs for the symbols.

Right now, this affects only NativeCompilandSymbols.  There's no external
change yet, so I think the existing tests are still sufficient.  Coming soon
are patches to extend this to built-in types and enums.

llvm-svn: 306610
2017-06-28 22:47:40 +00:00
George Rimar 1af3cb2912 Recommit "[ELF] - Add ability for DWARFContextInMemory to exit early when any error happen."
With fix in include folder character case:
#include "llvm/Codegen/AsmPrinter.h" -> #include "llvm/CodeGen/AsmPrinter.h"

Original commit message:

Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.

That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.

Differential revision: https://reviews.llvm.org/D34328

llvm-svn: 306517
2017-06-28 08:21:19 +00:00
George Rimar 7a82cffd68 Revert r306512 "[ELF] - Add ability for DWARFContextInMemory to exit early when any error happen."
It broke BB:

[13/106] 13 0.022 Generating VCSRevision.h
[25/106] 24 1.209 Building CXX object unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o
FAILED: unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o 
/home/bb/bin/g++  -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_GLOBAL_ISEL -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iunittests/DebugInfo/DWARF -I../llvm-project/llvm/unittests/DebugInfo/DWARF -Iinclude -I../llvm-project/llvm/include -I../llvm-project/llvm/utils/unittest/googletest/include -I../llvm-project/llvm/utils/unittest/googlemock/include -fPIC -fvisibility-inlines-hidden -m32 -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O3    -UNDEBUG  -Wno-variadic-macros -fno-exceptions -fno-rtti -MD -MT unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o -MF unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o.d -o unittests/DebugInfo/DWARF/CMakeFiles/DebugInfoDWARFTests.dir/DWARFDebugInfoTest.cpp.o -c ../llvm-project/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
../llvm-project/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp:18:37: fatal error: llvm/Codegen/AsmPrinter.h: No such file or directory
 #include "llvm/Codegen/AsmPrinter.h"
                                     ^
compilation terminated.

llvm-svn: 306513
2017-06-28 07:06:17 +00:00
George Rimar 397a70425b [ELF] - Add ability for DWARFContextInMemory to exit early when any error happen.
Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.

That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.

Differential revision: https://reviews.llvm.org/D34328

llvm-svn: 306512
2017-06-28 06:57:20 +00:00
Paul Robinson d66ee0f9a7 [DWARF] NFC: Make string-offset handling more like address-table handling;
do the indirection and relocation all in the same method.

llvm-svn: 306418
2017-06-27 15:40:18 +00:00
Paul Robinson 36e85a867b [DWARF] NFC: Give DwarfFormat a 1-byte base type.
In particular this reduces DWARFFormParams from 64 to 32 bits; pass it
around by value.

llvm-svn: 306324
2017-06-26 19:52:32 +00:00
Paul Robinson 75c068c50b [DWARF] NFC: Collect info used by DWARFFormValue into a helper.
Some forms have sizes that depend on the DWARF version, DWARF format
(32/64-bit), or the size of an address.  Collect these into a struct
to simplify passing them around.  Require callers to provide one when
they query a form's size.

Differential Revision: http://reviews.llvm.org/D34570

llvm-svn: 306315
2017-06-26 18:43:01 +00:00
Zachary Turner 1affd805fc [pdb] Fix reading of llvm-generated PDBs by cvdump.
If you dump a pdb to yaml, and then round-trip it back to a pdb,
and run cvdump -l <file> on the new pdb, cvdump will generate
output such as this.

*** LINES

** Module: "d:\src\llvm\test\DebugInfo\PDB\Inputs\empty.obj"

Error: Line number corrupted: invalid file id 0
  <Unknown> (MD5), 0001:00000010-0000001A, line/addr pairs = 3

        5 00000010      6 00000013      7 00000018

Note the error message about the corrupted line number.

It turns out that the problem is that cvdump cannot find the
/names stream (e.g. the global string table), and the reason it
can't find the /names stream is because it doesn't understand
the NameMap that we serialize which tells pdb consumers which
stream has the string table.

Some experimentation shows that if we add items to the hash
table in a specific order before serializing it, cvdump can read
it. This suggests that either we're using the wrong hash function,
or we're serializing something incorrectly, but it will take some
deeper investigation to figure out how / why.  For now, this at
least allows cvdump to read our line information (and incidentally,
produces an identical byte sequence to what Microsoft tools
produce when writing the named stream map).

Differential Revision: https://reviews.llvm.org/D34491

llvm-svn: 306233
2017-06-25 03:51:42 +00:00
Zachary Turner fa33282774 [llvm-pdbutil] Dump raw bytes of module symbols and debug chunks.
llvm-svn: 306179
2017-06-23 23:08:57 +00:00
Eugene Zelenko 2db0cfa617 [DebugInfo] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306169
2017-06-23 21:57:40 +00:00
Zachary Turner c2f5b4bfd9 [llvm-pdbutil] Dump raw bytes of type and id records.
llvm-svn: 306167
2017-06-23 21:50:54 +00:00
Zachary Turner dd73968256 [llvm-pdbutil] Dump raw bytes of various DBI stream subsections.
llvm-svn: 306160
2017-06-23 21:11:54 +00:00
Zachary Turner 6c3e41bbd3 [llvm-pdbutil] Dump raw bytes of pdb name map.
This patch dumps the raw bytes of the pdb name map which contains
the mapping of stream name to stream index for the string table
and other reserved streams.

llvm-svn: 306148
2017-06-23 20:18:38 +00:00
Zachary Turner 0b36c3ebd0 [llvm-pdbutil] Add a function for formatting MSF data.
The goal here is to make it possible to display absolute
file offsets when dumping byets from an MSF.  The problem is
that when dumping bytes from an MSF, often the bytes will
cross a block boundary and encounter a discontinuity.  We
can't use the normal formatBinary() function for this because
this would just treat the sequence as entirely ascending, and
not account out-of-order blocks.

This patch adds a formatMsfData() function to our printer, and
then uses this function to improve the output of the -stream-data
command line option for dumping bytes from a particular stream.

Test coverage is also expanded to make sure to include all possible
scenarios of offsets, sizes, and crossing block boundaries.

llvm-svn: 306141
2017-06-23 18:52:13 +00:00
Adrian McCarthy 4aedc81b8c Fix build break by using llvm::make_unique instead of std::make_unique.
llvm-svn: 306043
2017-06-22 18:57:51 +00:00
Adrian McCarthy 31bcb6f680 Add IDs and clone methods to NativeRawSymbol
All NativeRawSymbols will have a unique symbol ID (retrievable via
getSymIndexId).  For now, these are initialized to 0, but soon the
NativeSession will be responsible for creating the raw symbols, and it will
assign unique IDs.

The symbol cache in the NativeSession will also require the ability to clone
raw symbols, so I've provided implementations for that as well.

llvm-svn: 306042
2017-06-22 18:43:18 +00:00
Adrian McCarthy 6a4b080a5f Make IPDBSession::getGlobalScope a non-const method
There doesn't seem to be a compelling reason why this method should be const
other than it was possible with the DIA implementation.  The native session
is going to act as a symbol factory and cache.  This could be acheived with
mutable (and the existing const_cast), but it seems cleaner to accept that
this method affects the state of the session.

This change eliminates an existing const_cast.

llvm-svn: 306041
2017-06-22 18:42:23 +00:00
Wolfgang Pieb 258927e3da [DWARF] Support for DW_FORM_strx3 and complete support for DW_FORM_strx{1,2,4}
(consumer).

Reviewer: aprantl

Differential Revision:  https://reviews.llvm.org/D34418

llvm-svn: 305944
2017-06-21 19:37:44 +00:00
Reid Kleckner d0e6e24a53 [PDB] Add symbols to the PDB
Summary:
The main complexity in adding symbol records is that we need to
"relocate" all the type indices. Type indices do not have anything like
relocations, an opaque data structure describing where to find existing
type indices for fixups. The linker just has to "know" where the type
references are in the symbol records. I added an overload of
`discoverTypeIndices` that works on symbol records, and it seems to be
able to link the standard library.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34432

llvm-svn: 305933
2017-06-21 17:25:56 +00:00
Zachary Turner 297b6eb20d [PDB] Don't write uninitialized bytes to a PDB file.
There were certain fields that we didn't know how to write, as
well as various padding bytes that we would ignore.  This leads
to garbage data in the PDB.  While not strictly necessary, we
should initialize these bytes to something meaningful, as it
makes for easier binary comparison between PDBs.

llvm-svn: 305819
2017-06-20 18:50:55 +00:00
David Blaikie 6ab0eb4764 Remove convenient but probably not worthwhile macro for lambda workaround
Cleanup from r305405

llvm-svn: 305731
2017-06-19 19:01:08 +00:00
Reid Kleckner 44cdb10964 [PDB] Start emitting source file and line information
Summary:
This is a first step towards getting line info to show up in VS and
windbg. So far, only llvm-pdbutil can parse the PDBs that we produce.
cvdump doesn't like something about our file checksum tables. I'll have
to dig into that next.

This patch adds a new DebugSubsectionRecordBuilder which takes bytes
directly from some other producer, such as a linker, and sticks it into
the PDB. Line tables only need to be relocated. No data needs to be
rewritten.

File checksums and string tables, on the other hand, need to be re-done.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34257

llvm-svn: 305713
2017-06-19 17:21:45 +00:00
Reid Kleckner 18d90e17ad [CodeView] Fix dumping of public symbol record flags
I noticed nonsensical type information while dumping PDBs produced by
MSVC.

llvm-svn: 305708
2017-06-19 16:54:51 +00:00
Zachary Turner 26dbc5420d Delete TypeDatabase.
Merge the functionality into the random access type collection.
This class was only being used in 2 places, so getting rid of it
simplifies the code.

llvm-svn: 305653
2017-06-18 20:52:45 +00:00
Zachary Turner b0fdd214b7 Don't crash if a type record can't be found.
This was a regression introduced in a previous patch.  Adding
back the code that handles this case.

llvm-svn: 305617
2017-06-17 00:02:24 +00:00
Zachary Turner ad859bd472 [CodeView] Fix random access of type names.
Suppose we had a type index offsets array with a boundary at type index
N. Then you request the name of the type with index N+1, and that name
requires the name of index N-1 (think a parameter list, for example). We
didn't handle this, and we would print something like (<unknown UDT>,
<unknown UDT>).

The fix for this is not entirely trivial, and speaks to a larger
problem. I think we need to kill TypeDatabase, or at the very least kill
TypeDatabaseVisitor. We need a thing that doesn't do any caching
whatsoever, just given a type index it can compute the type name "the
slow way". The reason for the bug is that we don't have anything like
that. Everything goes through the type database, and if we've visited a
record, then we're "done". It doesn't know how to do the expensive thing
of re-visiting dependent records if they've not yet been visited.

What I've done here is more or less copied the code (albeit greatly
simplified) from TypeDatabaseVisitor, but wrapped it in an interface
that just returns a std::string. The logic of caching the name is now in
LazyRandomTypeCollection. Eventually I'd like to move the record
database here as well and the visited record bitfield here as well, at
which point we can actually just delete TypeDatabase. I don't see any
reason for it if a "sequential" collection is just a special case of a
random access collection with an empty partial offsets array.

Differential Revision: https://reviews.llvm.org/D34297

llvm-svn: 305612
2017-06-16 23:42:44 +00:00
Zachary Turner 59224cba2e Remove some dead code / includes.
I'm trying to get rid of the TypeDatabase class, so the first
step is to minimize its footprint.

llvm-svn: 305611
2017-06-16 23:42:15 +00:00
Spyridoula Gravani 32614fcf42 [DWARF] Corrected behavior for when no .apple_names section is present in the object.
The verifier should not output any message in such a case.
Added test case with no .apple_name section in the file to verify new functionality.
Made existing test case more specific.

llvm-svn: 305597
2017-06-16 22:03:21 +00:00
Zachary Turner 4e950647fb [llvm-pdbutil] Add support for dumping lines and inlinee lines.
llvm-svn: 305529
2017-06-15 23:56:19 +00:00
Zachary Turner 0e327d0360 [llvm-pdbutil] Add back support for dumping file checksums.
When dumping module source files, also dump checksums.

llvm-svn: 305526
2017-06-15 23:12:41 +00:00
Zachary Turner f8a2e04812 [llvm-pdbutil] Add back the ability to dump hashes and index offsets.
This was regressed in a previous patch that re-wrote the dumper,
and I'm incrementally adding back the pieces that are missing.

llvm-svn: 305524
2017-06-15 23:04:42 +00:00
Zachary Turner 6305545527 Resubmit "[llvm-pdbutil] rewrite the "raw" output style."
This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7.

It was broken due to some weird template issues, which have
since been fixed.

llvm-svn: 305517
2017-06-15 22:24:24 +00:00
Zachary Turner da504b794c Revert "[llvm-pdbutil] rewrite the "raw" output style."
This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad.

This is failing due to some strange template problems, so reverting
until it can be straightened out.

llvm-svn: 305505
2017-06-15 20:55:51 +00:00
Spyridoula Gravani 7a27a26db0 [DWARF] Removed dead code. The verifier functionality is provided by
the DWARFVerifier class (as it should).

llvm-svn: 305503
2017-06-15 20:40:08 +00:00
Zachary Turner b560fdf3b8 [llvm-pdbutil] rewrite the "raw" output style.
After some internal discussions, we agreed that the raw output style had
outlived its usefulness. It was originally created before we had even
thought of dumping to YAML, and it was intended to give us some insight
into the internals of a PDB file. Now we have YAML mode which does
almost exactly this but is more powerful in that it can round-trip back
to a PDB, which the raw mode could not do. So the raw mode had become
purely a maintenance burden.

One option was to just delete it. However, its original goal was to be
as readable as possible while staying close to the "metal" - i.e.
presenting the output in a way that maps directly to the underlying file
format. We don't actually need that last requirement anymore since it's
covered by the yaml mode, so we could repurpose "raw" mode to actually
just be as readable as possible.

This patch implements about 80% of the functionality previously in raw
mode, but in a completely different style that is more akin to what
cvdump outputs. Records are very compressed, often times appearing on
just one line. One nice thing about this is that it makes full record
matching easier, because you can grep for indices, names, and leaf types
on a single line often.

See the tests for some examples of what the new output looks like.

Note that this patch actually regresses the functionality of raw mode in
a few areas, but only because the patch was already unreasonably large
and going 100% would have been even worse. Specifically, this patch is
missing:

The ability to dump module debug subsections (checksums, lines, etc)
The ability to dump section headers
Aside from that everything is here. While goign through the tests fixing
them all up, I found many duplicate tests. They've been deleted. In
subsequent patches I will go through and re-add the missing
functionality.

Differential Revision: https://reviews.llvm.org/D34191

llvm-svn: 305495
2017-06-15 19:34:41 +00:00
Galina Kistanova 3c0505d30c Specified ReportError as noreturn friendly to old compilers.
llvm-svn: 305405
2017-06-14 17:32:53 +00:00
Zachary Turner a8cfc29c9a Resubmit "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This was originally reverted because of some non-deterministic
failures on certain buildbots.  Luckily ASAN eventually caught
this as a stack-use-after-scope, so the fix is included in
this patch.

llvm-svn: 305393
2017-06-14 15:59:27 +00:00
Zachary Turner 0085dce221 Revert "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This is causing failures on linux bots with an invalid stream
read.  It doesn't repro in any configuration on Windows, so
reverting until I have a chance to investigate on Linux.

llvm-svn: 305371
2017-06-14 06:24:24 +00:00
Zachary Turner a3da4467fa [codeview] Make obj2yaml/yaml2obj support .debug$S/T sections.
This allows us to use yaml2obj and obj2yaml to round-trip CodeView
symbol and type information without having to manually specify the bytes
of the section. This makes for much easier to maintain tests. See the
tests under lld/COFF in this patch for example. Before they just said
SectionData: <blob> whereas now we can use meaningful record
descriptions. Note that it still supports the SectionData yaml field,
which could be useful for initializing a section to invalid bytes for
testing, for example.

Differential Revision: https://reviews.llvm.org/D34127

llvm-svn: 305366
2017-06-14 05:31:00 +00:00
Spyridoula Gravani e41823bb89 Added partial verification for .apple_names accelerator table in llvm-dwarfdump output.
This patch adds code which verifies that each bucket in the .apple_names
accelerator table is either empty or has a valid hash index.

Differential Revision: https://reviews.llvm.org/D34177

llvm-svn: 305344
2017-06-14 00:17:55 +00:00
Galina Kistanova 41def9b72c Reverted r305339 as MSVC is not happy with noreturn in lambda.
llvm-svn: 305343
2017-06-13 23:57:51 +00:00
Galina Kistanova 680c7605a7 Specified LLVM_ATTRIBUTE_NORETURN for ReportError.
llvm-svn: 305339
2017-06-13 23:39:42 +00:00
Reid Kleckner 8cbdd0c0f2 [PDB] Add a module descriptor for every object file
Summary:
Expose the module descriptor index and fill it in for section
contributions.

Reviewers: zturner

Subscribers: llvm-commits, ruiu, hiraditya

Differential Revision: https://reviews.llvm.org/D34126

llvm-svn: 305296
2017-06-13 15:49:13 +00:00
Zachary Turner 68ea80d0a7 Slightly better fix for dealing with no-id-stream PDBs.
The last fix required the user to manually add the required
feature.  This caused an LLD test to fail because I failed to
update LLD.  In practice we can hide this logic so it can just
be transparently added when we write the PDB.

llvm-svn: 305236
2017-06-12 21:46:51 +00:00
Zachary Turner 990d0c8158 [llvm-pdbdump] Don't fail on PDBs with no ID stream.
Older PDBs don't have this.  Its presence is detected by using
the various "feature" flags that come at the end of the PDB
Stream.  Detect this, and don't try to dump the ID stream if the
features tells us it's not present.

llvm-svn: 305235
2017-06-12 21:34:53 +00:00
Zachary Turner d334cebac4 Fix a null pointer dereference in llvm-pdbutil pretty.
Static data members were causing a problem because I mistakenly
assumed all members would affect a class's layout and so the
Layout member would be non-null.

llvm-svn: 305229
2017-06-12 20:46:35 +00:00
Sylvestre Ledru 337804d86a Same expressions on both sides of the return
Summary:
I guess we want PointerToMemberFunction & PointerToDataMember


Fix coverity cid 1376038 


Reviewers: zturner

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34110

llvm-svn: 305219
2017-06-12 18:53:46 +00:00
David Blaikie a91885a08c dwarfdump: Handle relocs to zlib (.zdebug*) compressed sections
llvm-svn: 305152
2017-06-10 19:32:50 +00:00
Galina Kistanova 038f9854ec Added llvm_unreachable as ReportError cannot be specified as noreturn.
llvm-svn: 305143
2017-06-10 07:50:14 +00:00
Zachary Turner 3226fe95bb [pdb] Support CoffSymbolRVA debug subsection.
llvm-svn: 305108
2017-06-09 20:46:52 +00:00
Zachary Turner 7e62cd17d6 Allow VarStreamArray to use stateful extractors.
Previously extractors tried to be stateless with any additional
context information needed in order to parse items being passed
in via the extraction method.  This led to quite cumbersome
implementation challenges and awkwardness of use.  This patch
brings back support for stateful extractors, making the
implementation and usage simpler.

llvm-svn: 305093
2017-06-09 17:54:36 +00:00
Bob Haarman fdf499bf2d [codeview] use 32-bit integer for RelocOffset in DebugLinesSubsection
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.

Fixes PR33335.

Reviewers: zturner, hiraditya!

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33968

llvm-svn: 305043
2017-06-09 01:18:10 +00:00
Zachary Turner 28c22c83e3 [pdb] Don't crash on unknown debug subsections.
More and more unknown debug subsection kinds are being discovered
so we should make it possible to dump these and display the
bytes.

llvm-svn: 305041
2017-06-09 00:53:59 +00:00
Zachary Turner deb391309c [CodeView] Support remaining debug subsection types
This adds support for Symbols, StringTable, and FrameData subsection
types.  Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB.  The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them.  And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time.  Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.

llvm-svn: 305037
2017-06-09 00:28:08 +00:00
Zachary Turner 1bf7762049 [llvm-pdbdump] Support native ordering of subsections in raw mode.
This is the same change for the YAML Output style applied to the
raw output style.  Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order.  This was because some subsections need to be
read first in order to properly dump later subsections.  This
patch allows them to be dumped in the order they appear.

Differential Revision: https://reviews.llvm.org/D34015

llvm-svn: 305034
2017-06-08 23:49:01 +00:00
Zachary Turner 15eb237fd3 [PDB] Don't crash on /debug:fastlink PDBs.
Apparently support for /debug:fastlink PDBs isn't part of the
DIA SDK (!), and it was causing llvm-pdbdump to crash because
we weren't checking for a null pointer return value.  This
manifests when calling findChildren on the IDiaSymbol, and
it returns E_NOTIMPL.

llvm-svn: 304982
2017-06-08 16:00:40 +00:00
NAKAMURA Takumi 92c99cd6dc Update libdeps to add BinaryFormat, introduced in r304864.
llvm-svn: 304869
2017-06-07 04:48:49 +00:00
Zachary Turner 264b5d9e88 Move Object format code to lib/BinaryFormat.
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.

Differential Revision: https://reviews.llvm.org/D33843

llvm-svn: 304864
2017-06-07 03:48:56 +00:00
Zachary Turner 1bfb9f47af Fix uninitialized read.
llvm-svn: 304846
2017-06-06 23:54:23 +00:00
Adrian Prantl 318d1195f2 Introduce -brief command line option to llvm-dwarfdump
This patch introduces a new command line option, called brief, to
llvm-dwarfdump.  When -brief is used, the attribute forms for the
.debug_info section will not be emitted to output.

Patch by Spyridoula Gravani!

rdar://problem/21474365
Differential Revision: https://reviews.llvm.org/D33867

llvm-svn: 304844
2017-06-06 23:28:45 +00:00
Chandler Carruth aaeada6c75 Fix another ordering constraint with windows.h and comment about
a revers constraint that we got right (by chance).

llvm-svn: 304792
2017-06-06 12:43:20 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Wolfgang Pieb 77d3e938f8 [DWARF] Adding support for the DWARF v5 string offsets table (consumer/reader part only).
Reviewers: dblaikie, aprantl

Differential Revision: https://reviews.llvm.org/D32779

llvm-svn: 304759
2017-06-06 01:22:34 +00:00
Zachary Turner 88101dadcc [CodeView] Fix endianness bug.
We should be outputting in little endian, but we were writing
in host endianness.

llvm-svn: 304741
2017-06-05 22:12:23 +00:00
Zachary Turner 349c18f837 [CodeView] Handle Cross Module Imports and Exports.
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.

This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.

llvm-svn: 304738
2017-06-05 21:40:33 +00:00
Zachary Turner 5b74ff33e7 [PDB] Fix use after free.
Previously MappedBlockStream owned its own BumpPtrAllocator that
it would allocate from when a read crossed a block boundary.  This
way it could still return the user a contiguous buffer of the
requested size.  However, It's not uncommon to open a stream, read
some stuff, close it, and then save the information for later.
After all, since the entire file is mapped into memory, the data
should always be available as long as the file is open.

Of course, the exception to this is when the data isn't *in* the
file, but rather in some buffer that we temporarily allocated to
present this contiguous view.  And this buffer would get destroyed
as soon as the strema was closed.

The fix here is to force the user to specify the allocator, this
way it can provide an allocator that has whatever lifetime it
chooses.

Differential Revision: https://reviews.llvm.org/D33858

llvm-svn: 304623
2017-06-03 00:33:35 +00:00
Zachary Turner 92dcdda623 [CodeView] Support CodeView subsections in any order.
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way.  This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.

Differential Revision: https://reviews.llvm.org/D33807

llvm-svn: 304588
2017-06-02 19:49:14 +00:00
Zachary Turner afb81a83a9 Fix 2 more -Wreorder warnings.
llvm-svn: 304494
2017-06-01 23:24:50 +00:00
Zachary Turner ebd3ae8371 [CodeView] Properly align symbol records on read/write.
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries.  Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.

Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.

Differential Revision: https://reviews.llvm.org/D33785

llvm-svn: 304484
2017-06-01 21:52:41 +00:00
Adrian Prantl f4bc1f77b7 [DWARF] Introduce Dump Options
This commit introduces a structure that holds all the flags that
control the pretty printing of dwarf output.

Patch by Spyridoula Gravani!

Differential Revision: https://reviews.llvm.org/D33749

llvm-svn: 304446
2017-06-01 18:18:23 +00:00
Zachary Turner d427383cb8 [CodeView] Move CodeView YAML code to ObjectYAML.
This is the beginning of an effort to move the codeview yaml
reader / writer into ObjectYAML so that it can be shared.
Currently the only consumer / producer of CodeView YAML is
llvm-pdbdump, but CodeView can exist outside of PDB files, and
indeed is put into object files and passed to the linker to
produce PDB files.  Furthermore, there are subtle differences
in the types of records that show up in object file CodeView
vs PDB file CodeView, but they are otherwise 99% the same.

By having this code in ObjectYAML, we can have llvm-pdbdump
reuse this code, while teaching obj2yaml and yaml2obj to use
this syntax for dealing with object files that can contain
CodeView.

This patch only adds support for CodeView type information
to ObjectYAML.  Subsequent patches will add support for
CodeView symbol information.

llvm-svn: 304248
2017-05-30 21:53:05 +00:00
Galina Kistanova 8c1e2f9108 Added missing break.
llvm-svn: 304230
2017-05-30 19:02:49 +00:00
Zachary Turner 591312c5c1 [CodeView] Add more DebugSubsection implementations.
This adds implementations for Symbols and FrameData, and renames
the existing codeview::StringTable class to conform to the
DebugSectionStringTable convention.

llvm-svn: 304222
2017-05-30 17:13:33 +00:00
Zachary Turner 8c099fe06e [CodeView] Rename ModuleDebugFragment -> DebugSubsection.
This is more concise, and matches the terminology used in other
parts of the codebase more closely.

llvm-svn: 304218
2017-05-30 16:36:15 +00:00
George Rimar a25d329b33 Recommit "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
With fix of uninitialized variable.

Original commit message:

This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 304078
2017-05-27 18:10:23 +00:00
George Rimar 1f9cab6b1c Revert r304002 "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
Revert it again. Now another bot unhappy: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/8750

llvm-svn: 304011
2017-05-26 17:36:23 +00:00
George Rimar bc223c63cc [DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC
This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 304002
2017-05-26 16:26:18 +00:00
George Rimar a8403a64ea Revert "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
Broked BB again:

TEST 'LLVM :: DebugInfo/X86/dbg-value-regmask-clobber.ll' FAILED
...
LLVM ERROR: Section was outside of section table.

llvm-svn: 303984
2017-05-26 13:20:09 +00:00
George Rimar 655b7b63f6 Recommit r303978 "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
With fix of test compilation.

Initial commit message:

This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section 
which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason 
of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. 
That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 303983
2017-05-26 13:13:50 +00:00
George Rimar 7d5f12185a Revert r303978 "[DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC"
It failed BB.

llvm-svn: 303981
2017-05-26 12:53:41 +00:00
George Rimar 732f268aa0 [DWARF] - Make collectAddressRanges() return section index in addition to Low/High PC
This change is intended to use for LLD in D33183. 
Problem we have in LLD when building .gdb_index is that we need to know section 
which address range belongs to.

Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason 
of incorrect behavior when
sections share the same offsets, like D33176 shows.

This patch makes DWARF parsers to return section index as well. 
That solves problem mentioned above.

Differential revision: https://reviews.llvm.org/D33184

llvm-svn: 303978
2017-05-26 12:46:41 +00:00
Zachary Turner f2110283c6 Remove unused member.
llvm-svn: 303942
2017-05-25 23:47:56 +00:00
Zachary Turner fed467eefb [CV Type Merging] Find nested type indices faster.
Merging two type streams is one of the most time consuming
parts of generating a PDB, and as such it needs to be as
fast as possible.  The visitor abstractions used for interoperating
nicely with many different types of inputs and outputs have
been used widely and help greatly for testability and implementing
tools, but the abstractions build up and get in the way of
performance.

This patch removes all of the visitation stuff from the type
stream merger, essentially re-inventing the leaf / member switch
and loop, but at a very low level.  This allows us many other
optimizations, such as not actually deserializing *any* records
(even member records which don't describe their own length), as
the operation of "figure out how long this record is" is somewhat
faster than "figure out how long this record *and* get all its
fields out".  Furthermore, whereas before we had to deserialize,
re-write type indices, then re-serialize, now we don't have to
do any of those 3 steps.  We just find out where the type indices
are and pull them directly out of the byte stream and re-write
them.

This is worth a 50-60% performance increase.  On top of all other
optimizations that have been applied this week, I now get the
following numbers when linking lld.exe and lld.pdb

MSVC: 25.67s
Before This Patch: 18.59s
After This Patch: 8.92s

So this is a huge performance win.

Differential Revision: https://reviews.llvm.org/D33564

llvm-svn: 303935
2017-05-25 23:36:16 +00:00
Zachary Turner 2897e0306e [lld] Fix a bug where we continually re-follow type servers.
Originally this was intended to be set up so that when linking
a PDB which refers to a type server, it would only visit the
PDB once, and on subsequent visitations it would just skip it
since all the records had already been added.

Due to some C++ scoping issues, this was not occurring and it
was revisiting the type server every time, which caused every
record to end up being thrown away on all subsequent visitations.

This doesn't affect the performance of linking clang-cl generated
object files because we don't use type servers, but when linking
object files and libraries generated with /Zi via MSVC, this means
only 1 object file has to be linked instead of N object files, so
the speedup is quite large.

llvm-svn: 303920
2017-05-25 21:16:03 +00:00
Zachary Turner 7f97c362a4 [CodeView Type Merging] Don't keep re-allocating temp serializer.
Previously, every time we wanted to serialize a field list record, we
would create a new copy of FieldListRecordBuilder, which would in turn
create a temporary instance of TypeSerializer, which itself had a
std::vector<> that was about 128K in size. So this 128K allocation was
happening every time. We can re-use the same instance over and over, we
just have to clear its internal hash table and seen records list between
each run. This saves us from the constant re-allocations.

This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests.

Differential Revision: https://reviews.llvm.org/D33506

llvm-svn: 303919
2017-05-25 21:15:37 +00:00
Bob Haarman 55256ada25 [pdb] pad source file name buffer at the end instead of the beginning
Summary:
DbiStreamBuilder calculated the offset of the source file names inside
the file info substream as the size of the file info substream minus
the size of the file names. Since the file info substream is padded to
a multiple of 4 bytes, this caused the first file name to be aligned
on a 4-byte boundary. By contrast, DbiModuleList would read the file
names immediately after the file name offset table, without skipping
to the next 4-byte boundary. This change makes it so that the file
names are written to the location where DbiModuleList expects them,
and puts any necessary padding for the file info substream after the
file names instead of before it.

Reviewers: amccarth, rnk, zturner

Reviewed By: amccarth, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33475

llvm-svn: 303917
2017-05-25 21:12:15 +00:00
Zachary Turner c4e4b7e31e Fix a bug in MappedBlockStream.
It was using the number of blocks of the entire PDB file as the number
of blocks of each stream that was created.  This was only an issue in
the readLongestContiguousChunk function, which  was never called prior.
This bug surfaced when I updated an algorithm to use this function and
the algorithm broke.

llvm-svn: 303916
2017-05-25 21:12:00 +00:00
Zachary Turner dda25b128c [CodeView Type Merging] Avoid record deserialization when possible.
A profile shows the majority of time doing type merging is spent
deserializing records from sequences of bytes into friendly C++ structures
that we can easily access members of in order to find the type indices to
re-write.

Records are prefixed with their length, however, and most records have
type indices that appear at fixed offsets in the record. For these
records, we can save some cycles by just looking at the right place in the
byte sequence and re-writing the value, then skipping the record in the
type stream. This saves us from the costly deserialization of examining
every field, including potentially null terminated strings which are the
slowest, even though it was unnecessary to begin with.

In addition, we apply another optimization. Previously, after
deserializing a record and re-writing its type indices, we would
unconditionally re-serialize it in order to compute the hash of the
re-written record. This would result in an alloc and memcpy for every
record. If no type indices were re-written, however, this was an
unnecessary allocation. In this patch re-writing is made two phase. The
first phase discovers the indices that need to be rewritten and their new
values. This information is passed through to the de-duplication code,
which only copies and re-writes type indices in the serialized byte
sequence if at least one type index is different.

Some records have type indices which only appear after variable length
strings, or which have lists of type indices, or various other situations
that can make it tricky to make this optimization. While I'm not giving up
on optimizing these cases as well, for now we can get the easy cases out
of the way and lay the groundwork for more complicated cases later.

This patch yields another 50% speedup on top of the already large speedups
submitted over the past 2 days. In two tests I have run, I went from 9
seconds to 3 seconds, and from 16 seconds to 8 seconds.

Differential Revision: https://reviews.llvm.org/D33480

llvm-svn: 303914
2017-05-25 21:06:28 +00:00
Zachary Turner bb64231d2d Don't do a full scan of the type stream before processing records.
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types.  In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted.  Changing the
algorithm to work on sequential arrays of types rather than
random access type collections eliminates this up front scan.

llvm-svn: 303707
2017-05-24 00:26:27 +00:00
Zachary Turner 7daf62e743 [CodeView] Eliminate redundant hashes and allocations.
When writing field list records, we would construct a temporary
type serializer that shared a bump ptr allocator with the rest
of the application, so anything allocated from here would live
forever.  Furthermore, this temporary serializer had all the
properties of a full blown serializer including record hashing
and de-duplication.

These features are required when you're merging multiple type
streams into each other, because different streams may contain
identical records, but records from the same type stream will
never collide with each other.  So all of this hashing was
unnecessary.

To solve this, two fixes are made:

1) The temporary serializer keeps its own bump ptr allocator
instead of sharing a global one.  When it's finished, all of
its memory is freed.

2) Instead of using the same temporary serializer for the life
of an entire type stream, we use it only for the life of a single
field list record and delete it when the field list record is
completed.  This way the hash table will not grow as other
records from the same type stream get inserted.  Further improvements
could eliminate hashing entirely from this codepath.

This reduces the link time by 85% in my test, from 1 minute to 9
seconds.

llvm-svn: 303676
2017-05-23 18:56:23 +00:00
Reid Kleckner 36238b15d7 Speculative build fix for non-Windows
llvm-svn: 303667
2017-05-23 18:28:13 +00:00
Reid Kleckner ded38803c5 [PDB] Hash types up front when merging types instead of using StringMap
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.

Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.

This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.

Reviewers: zturner, inglorion, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33428

llvm-svn: 303665
2017-05-23 18:23:59 +00:00
Zachary Turner bf35e6ab2a Revert "Make TypeSerializer's StringMap use the same allocator."
This reverts commit e34ccb7b57da25cc89ded913d8638a2906d1110a.

This is causing failures on the ASAN bots.

llvm-svn: 303640
2017-05-23 15:50:37 +00:00
David Blaikie 15d85fc537 libDebugInfo: Support symbolizing using DWP files
llvm-svn: 303609
2017-05-23 06:48:53 +00:00
David Blaikie 37d1cff491 FIX: Remove debugging assert left in previous commit
Sorry for the bot noise.

llvm-svn: 303592
2017-05-23 00:31:24 +00:00
David Blaikie f9803fb4bb libDebugInfo: Avoid independently parsing the same .dwo file for two separate CUs residing there
NFC, just an optimization. Will be building on this for DWP support
shortly.

llvm-svn: 303591
2017-05-23 00:30:42 +00:00
Zachary Turner d4136e945e Implement various flavors of type merging.
Previous algotirhm assumed that types and ids are in a single
unified stream.  For inputs that come from object files, this
is the case.  But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.

Differential Revision: https://reviews.llvm.org/D33417

llvm-svn: 303577
2017-05-22 21:07:43 +00:00
Zachary Turner 12f8c31c04 Make TypeSerializer's StringMap use the same allocator.
llvm-svn: 303576
2017-05-22 21:07:14 +00:00
David Blaikie d2f3a941e0 libDebugInfo/DWARF: Apply relocations for debug_addr addresses in object files
llvm-symbolizer would fail to symbolize addresses in unlinked object
files when handling .dwo file data because the addresses would not be
relocated in the same way as the ranges in the skeleton CU in the object
file.

Fix that so object files can be symbolized the same as executables.

llvm-svn: 303532
2017-05-22 07:02:47 +00:00
David Blaikie 8d039d40c5 llvm-symbolizer: Support multiple CUs in a single DWO file
llvm-svn: 303482
2017-05-20 03:32:49 +00:00
Zachary Turner 526f4f2aa8 Resubmit "[CodeView] Provide a common interface for type collections."
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows.  After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling.  Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.

llvm-svn: 303446
2017-05-19 19:26:58 +00:00
Zachary Turner 1dfcf8d92c Revert "[CodeView] Provide a common interface for type collections."
This is a squash of ~5 reverts of, well, pretty much everything
I did today.  Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures.  I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!

At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
hopefully fix the remaining set of problems for real.

llvm-svn: 303409
2017-05-19 05:57:45 +00:00
Zachary Turner 47fdc73771 Don't crash if someone tries to visit an empty type stream.
llvm-svn: 303408
2017-05-19 05:18:09 +00:00
Zachary Turner 59ab6a3816 [CodeView] Reduce memory usage in TypeSerializer.
We were using a BumpPtrAllocator to allocate stable storage for
a record, then trying to insert that into a hash table.  If a
collision occurred, the bytes were never inserted and the
allocation was unnecessary.  At the cost of an extra hash
computation, check first if it exists, and only if it does do
we allocate and insert.

llvm-svn: 303407
2017-05-19 04:56:48 +00:00
Zachary Turner 8f1d87a79a Fix crasher in CodeView test.
Apparently this was always broken, but previously we were more
graceful about it and we would print "unknown udt" if we couldn't
find the type index, whereas now we just segfault because we
assume it's valid.  But this exposed a real bug, which is that
we weren't looking in the right place.  So fix that, and also
fix this crash at the same time.

llvm-svn: 303397
2017-05-19 00:56:39 +00:00
Zachary Turner 7b62d7ccc0 Fix some build errors and warnings.
llvm-svn: 303391
2017-05-18 23:12:42 +00:00
Zachary Turner b32ec02b80 [CodeView] Raise the source to ID map out of the TypeStreamMerger.
This map will be needed to rewrite symbol streams after re-writing
the corresponding type streams.

llvm-svn: 303390
2017-05-18 23:04:08 +00:00
Zachary Turner 8fb441ab9c [llvm-pdbdump] Add the ability to merge PDBs.
Merging PDBs is a feature that will be used heavily by
the linker.  The functionality already exists but does not
have deep test coverage because it's not easily exposed through
any tools.  This patch aims to address that by adding the
ability to merge PDBs via llvm-pdbdump.  It takes arbitrarily
many PDBs and outputs a single PDB.

Using this new functionality, a test is added for merging
type records.  Future patches will add the ability to merge
symbol records, module information, etc.

llvm-svn: 303389
2017-05-18 23:03:41 +00:00
Zachary Turner 0c60f269fc [CodeView] Provide a common interface for type collections.
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).

But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.

This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.

This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.

The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.

Differential Revision: https://reviews.llvm.org/D33293

llvm-svn: 303388
2017-05-18 23:03:06 +00:00
Zachary Turner 5a83fb153f Fix some minor issues in PDB parsing library.
1) Until now I'd never seen a valid PDB where the DBI stream and
   the PDB Stream disagreed on the "Age" field.  Because of that,
   we had code to assert that they matched.  Recently though I was
   given a PDB where they disagreed, so this assumption has proven
   to be incorrect.  Remove this check.

2) We were walking the entire list of hash values for types up front
   and then throwing away the values.  For large PDBs this was a
   significant slow down.  Remove this.

With this patch, I can dump the list of all compilands from a
1.5GB PDB file in just a few seconds.

llvm-svn: 303351
2017-05-18 15:14:44 +00:00
George Rimar 47f84b1a3c [DWARF] - Simplify RelocVisitor implementation.
We do not need to store relocation width field.
Patch removes relative code, that simplifies implementation.

Differential revision: https://reviews.llvm.org/D33274

llvm-svn: 303335
2017-05-18 08:25:11 +00:00
George Rimar f98b9ac5da [lib/Object] - Minor API update for llvm::Decompressor.
I revisited Decompressor API (issue with it was triggered during D32865 review)
and found it is probably provides more then we really need.

Issue was about next method's signature:

Error decompress(SmallString<32> &Out);
It is too strict. At first I wanted to change it to decompress(SmallVectorImpl<char> &Out),
but then found it is still not flexible because sticks to SmallVector.

During reviews was suggested to use templating to simplify code. Patch do that.

Differential revision: https://reviews.llvm.org/D33200

llvm-svn: 303331
2017-05-18 08:00:01 +00:00
Bob Haarman de33a63784 [llvm-pdbdump] in yaml2pdb, generate default output filename if none given
Summary:
llvm-pdbdump yaml2pdb used to fail with a misleading error
message ("An I/O error occurred on the file system") if no output file
was specified. This change adds an assert to PDBFileBuilder to check
that an output file name is specified, and makes llvm-pdbdump generate
an output file name based on the input file name if no output file
name is explicitly specified.

Reviewers: amccarth, zturner

Reviewed By: zturner

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D33296

llvm-svn: 303299
2017-05-17 20:46:48 +00:00