Commit Graph

1625 Commits

Author SHA1 Message Date
Fangrui Song b3be23d334 [DWARF] Simplify LineTable::findRowInSeq
We want the last row whose address is less than or equal to Address.
This can be computed as upper_bound - 1, which is simpler than
lower_bound followed by skipping equal rows in a loop.

Since FirstRow (LowPC) does not satisfy the predicate (OrderByAddress)
while LastRow-1 (HighPC) satisfies the predicate. We can decrease the
search range by two, i.e.

upper_bound [FirstRow,LastRow) = upper_bound [FirstRow+1,LastRow-1)

llvm-svn: 358053
2019-04-10 07:44:23 +00:00
Fangrui Song 9b22c469ca [DWARF] DWARFDebugLine: replace Sequence::orderByLowPC with orderByHighPC
In a sorted list of non-overlapping [LowPC,HighPC) ranges, locating an address with
upper_bound on HighPC is simpler than lower_bound on LowPC.

llvm-svn: 358012
2019-04-09 15:08:32 +00:00
Eugene Leviant 7671a1daa7 Use llvm::crc32 instead of crc32. NFC
llvm-svn: 357911
2019-04-08 13:40:58 +00:00
Eugene Leviant 18873b22be Attempt to recommit r357901
llvm-svn: 357905
2019-04-08 12:31:12 +00:00
Eugene Leviant 03d28a4490 Reverting r357901 as fails to build on some of the buildbots
llvm-svn: 357902
2019-04-08 11:37:20 +00:00
Eugene Leviant ad69bd6870 [Support] Add zlib independent CRC32
Differential revision: https://reviews.llvm.org/D59816

llvm-svn: 357901
2019-04-08 11:25:48 +00:00
Fangrui Song c4c8bcaeec [DWARF] DWARFDebugLine: delete unused parameter `Offset`
llvm-svn: 357866
2019-04-07 13:56:14 +00:00
Fangrui Song 6a0746a92f Change some StringRef::data() reinterpret_cast to bytes_begin() or arrayRefFromStringRef()
llvm-svn: 357852
2019-04-07 03:58:42 +00:00
Fangrui Song 4be8629e49 [DWARF] Simplify DWARFDebugAranges::findAddress
The current lower_bound approach has to check two iterators pos and pos-1.
Changing it to upper_bound allows us to check one iterator (similar to
DWARFUnitVector::getUnitFor*).

llvm-svn: 357834
2019-04-06 09:12:53 +00:00
Fangrui Song cb300f1243 [Symbolize] Uniquify sorted vector<pair<SymbolDesc, StringRef>>
llvm-svn: 357833
2019-04-06 02:18:56 +00:00
Fangrui Song afb54fd629 [Symbolize] Replace map<SymbolDesc, StringRef> with sorted vector
llvm-svn: 357758
2019-04-05 12:52:04 +00:00
Fangrui Song e2622b3e33 [Symbolize] Keep SymbolDescs with the same address and improve getNameFromSymbolTable heuristic
I'll follow up with better heuristics or tests.

llvm-svn: 357683
2019-04-04 11:08:45 +00:00
Igor Kudrin 0fed7b0564 [llvm-symbolizer] Add `--output-style` switch.
In general, llvm-symbolizer follows the output style of GNU's addr2line.
However, there are still some differences; in particular, for a requested
address, llvm-symbolizer prints line and column, while addr2line prints
only the line number.

This patch adds a new switch to select the preferred style.

Differential Revision: https://reviews.llvm.org/D60190

llvm-svn: 357675
2019-04-04 08:39:40 +00:00
Reid Kleckner e10d00419a [codeview] Remove Type member from CVRecord
Summary:
Now CVType and CVSymbol are effectively type-safe wrappers around
ArrayRef<uint8_t>. Make the kind() accessor load it from the
RecordPrefix, which is the same for types and symbols.

Reviewers: zturner, aganea

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60018

llvm-svn: 357658
2019-04-04 00:28:48 +00:00
Jonas Devlieghere 2156797cf0 [dwarfdump] Remove bogus verifier error
The standard doesn't require a DW_TAG_variable, DW_TAG_formal_parameter
or DW_TAG_constant to A DW_AT_type attribute describing the type of the
variable. It only specifies that it *can* have one.

llvm-svn: 357628
2019-04-03 19:57:13 +00:00
Paul Semel 0c27bc2e1f [DWARF] check whether the DIE is valid before querying for information
Differential Revision: https://reviews.llvm.org/D60147

llvm-svn: 357607
2019-04-03 17:13:45 +00:00
Reid Kleckner 85e2cdac73 Delay initialization of three static global maps, NFC
This avoids allocating a few KB of heap memory on startup, and instead
allocates these maps lazily. I noticed this while profiling LLD.

llvm-svn: 357192
2019-03-28 17:33:41 +00:00
Fangrui Song 3f2e29b013 [DWARF] Add D to Seen early to avoid duplicate elements in Worklist
llvm-svn: 357054
2019-03-27 09:38:05 +00:00
Fangrui Song 38a4c619eb [DWARF] Simplify DWARFVerifier::handleDebugAbbrev. NFC
llvm-svn: 357053
2019-03-27 08:43:21 +00:00
Ali Tamur 02e96648d7 Revert "[llvm] Reapply "Prevent duplicate files in debug line header in dwarf 5.""
This reverts commit rL357020.

The commit broke the test llvm/test/tools/llvm-objdump/embedded-source.test
on some builds including clang-ppc64be-linux-multistage,
clang-s390x-linux, clang-with-lto-ubuntu, clang-x64-windows-msvc,
llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast (and others).

llvm-svn: 357026
2019-03-26 20:05:27 +00:00
Ali Tamur 2f5cd03a3f [llvm] Reapply "Prevent duplicate files in debug line header in dwarf 5."
Reapply rL356941 after regenerating the object file in the failing test
llvm/test/tools/llvm-objdump/embedded-source.test from source.

Original commit message:

[llvm] Prevent duplicate files in debug line header in dwarf 5.

Motivation: In previous dwarf versions, file name indexes started from 1, and
the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes
the primary source file to be explicitly given an entry with an index number 0.

The current implementation honors the specification by just duplicating the
main source file, once with index number 0, and later maybe with another
index number. While this is compliant with the letter of the standard, the
duplication causes problems for consumers of this information such as lldb.
(Some files are duplicated, where only some of them have a line table although
all refer to the same file)

With this change, dwarf 5 debug line section files always start from 0, and
the zeroth entry is not duplicated whenever possible. This requires different
handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns
an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5)
However, I think the minor complication is worth it, because it enables all
consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the
file name list homogenously.

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D59515

llvm-svn: 357018
2019-03-26 18:53:23 +00:00
Ali Tamur fdce82a814 Revert "[llvm] Prevent duplicate files in debug line header in dwarf 5."
This reverts commit 312ab05887.

My commit broke the build; I will revert and find out what happened.

llvm-svn: 356951
2019-03-25 21:09:07 +00:00
Ali Tamur 312ab05887 [llvm] Prevent duplicate files in debug line header in dwarf 5.
Summary:

Motivation: In previous dwarf versions, file name indexes started from 1, and
the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes
the primary source file to be explicitly given an entry with an index number 0.

The current implementation honors the specification by just duplicating the
main source file, once with index number 0, and later maybe with another
index number. While this is compliant with the letter of the standard, the
duplication causes problems for consumers of this information such as lldb.
(Some files are duplicated, where only some of them have a line table although
all refer to the same file)

With this change, dwarf 5 debug line section files always start from 0, and
the zeroth entry is not duplicated whenever possible. This requires different
handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns
an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5)
However, I think the minor complication is worth it, because it enables all
consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the
file name list homogenously.

Reviewers: dblaikie, probinson, aprantl, espindola

Reviewed By: probinson

Subscribers: emaste, jvesely, nhaehnle, aprantl, javed.absar, arichardson, hiraditya, MaskRay, rupprecht, jdoerfert, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D59515

llvm-svn: 356941
2019-03-25 20:08:00 +00:00
Fangrui Song 40483e1831 [DWARF] Delete a stray break and a stray comment. NFC
llvm-svn: 356838
2019-03-23 16:15:40 +00:00
Alexey Lapshin b2c4b8bded [DebugInfo] follow up for "add SectionedAddress to DebugInfo interfaces"
[Symbolizer] Add getModuleSectionIndexForAddress() helper routine

  The https://reviews.llvm.org/D58194 patch changed symbolizer interface.
  Particularily it requires not only Address but SectionIndex also.
  Note object::SectionedAddress parameter:

  Expected<DILineInfo> symbolizeCode(const std::string &ModuleName,
                                   object::SectionedAddress ModuleOffset,
                                   StringRef DWPName = "");

  There are callers of symbolizer which do not know particular section index.
  That patch creates getModuleSectionIndexForAddress() routine which
  will detect section index for the specified address. Thus if caller
  set ModuleOffset.SectionIndex into object::SectionedAddress::UndefSection
  state then symbolizer would detect section index using
  getModuleSectionIndexForAddress routine.

  Differential Revision: https://reviews.llvm.org/D58848

llvm-svn: 356829
2019-03-23 08:08:40 +00:00
Fangrui Song 4597dce483 [DWARF] Refactor RelocVisitor and fix computation of SHT_RELA-typed relocation entries
Summary:
getRelocatedValue may compute incorrect value for SHT_RELA-typed relocation entries.

// DWARFDataExtractor.cpp
uint64_t DWARFDataExtractor::getRelocatedValue(uint32_t Size, uint32_t *Off,
...
  // This formula is correct for REL, but may be incorrect for RELA if the value
  // stored in the location (getUnsigned(Off, Size)) is not zero.
  return getUnsigned(Off, Size) + Rel->Value;

In this patch, we

* refactor these visit* functions to include a new parameter `uint64_t A`.
  Since these visit* functions are no longer used as visitors, rename them to resolve*.
  + REL: A is used as the addend. A is the value stored in the location where the
    relocation applies: getUnsigned(Off, Size)
  + RELA: The addend encoded in RelocationRef is used, e.g. getELFAddend(R)
* and add another set of supports* functions to check if a given relocation type is handled.
  DWARFObjInMemory uses them to fail early.

Reviewers: echristo, dblaikie

Reviewed By: echristo

Subscribers: mgorny, aprantl, aheejin, fedor.sergeev, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57939

llvm-svn: 356729
2019-03-22 02:43:11 +00:00
Reid Kleckner cda7ff9ddc [llvm-pdbutil] Add -type-ref-stats to help find unused type info
Summary:
This considers module symbol streams and the global symbol stream to be
roots. Most types that this considers "unreferenced" are referenced by
LF_UDT_MOD_SRC_LINE id records, which VC seems to always include.
Essentially, they are types that the user can only find in the debugger
if they call them by name, they cannot be found by traversing a symbol.

In practice, around 80% of type information in a PDB is referenced by a
symbol. That seems like a reasonable number.

I don't really plan to do anything with this tool. It mostly just exists
for informational purposes, and to confirm that we probably don't need
to implement type reference tracking in LLD. We can continue to merge
all types as we do today without wasting space.

Reviewers: zturner, aganea

Subscribers: mgorny, hiraditya, arphaman, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59620

llvm-svn: 356692
2019-03-21 18:02:34 +00:00
Markus Lavin b86ce219f4 [DebugInfo] Introduce DW_OP_LLVM_convert
Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows
for a convenient way to perform type conversions on the Dwarf expression
stack. As an additional bonus it paves the way for using other Dwarf
v5 ops that need to reference a base_type.

The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp
to perform sext/zext on debug values but mainly the patch is about
preparing terrain for adding other Dwarf v5 ops that need to reference a
base_type.

For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a
complex shift & mask pattern is generated to emulate sext/zext.

This is a recommit of r356442 with trivial fixes for the failing tests.

Differential Revision: https://reviews.llvm.org/D56587

llvm-svn: 356451
2019-03-19 13:16:28 +00:00
Markus Lavin ad78768d59 Revert "[DebugInfo] Introduce DW_OP_LLVM_convert"
This reverts commit 1cf4b593a7ebd666fc6775f3bd38196e8e65fafe.

Build bots found failing tests not detected locally.

Failing Tests (3):
  LLVM :: DebugInfo/Generic/convert-debugloc.ll
  LLVM :: DebugInfo/Generic/convert-inlined.ll
  LLVM :: DebugInfo/Generic/convert-linked.ll

llvm-svn: 356444
2019-03-19 09:17:28 +00:00
Markus Lavin cd8a940b37 [DebugInfo] Introduce DW_OP_LLVM_convert
Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows
for a convenient way to perform type conversions on the Dwarf expression
stack. As an additional bonus it paves the way for using other Dwarf
v5 ops that need to reference a base_type.

The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp
to perform sext/zext on debug values but mainly the patch is about
preparing terrain for adding other Dwarf v5 ops that need to reference a
base_type.

For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a
complex shift & mask pattern is generated to emulate sext/zext.

Differential Revision: https://reviews.llvm.org/D56587

llvm-svn: 356442
2019-03-19 08:48:19 +00:00
Alexandre Ganea 4aeea4cc42 [DebugInfo][PDB] Don't write empty debug streams
Before, empty debug streams were written as 8 bytes (4 bytes signature + 4 bytes for the GlobalRefs count).

With this patch, unused empty streams aren't emitted anymore. Modules now encode 65535 as an 'unused stream' value, by convention.
Also fix the * Linker * contrib section which wasn't correctly emitted previously.

Differential Revision: https://reviews.llvm.org/D59502

llvm-svn: 356395
2019-03-18 19:13:23 +00:00
Mircea Trofin 2c3ab66539 [llvm] Skip over empty line table entries.
Summary:
This is similar to how addr2line handles consecutive entries with the
same address - pick the last one.

Reviewers: dblaikie, friss, JDevlieghere

Reviewed By: dblaikie

Subscribers: eugenis, vitalybuka, echristo, JDevlieghere, probinson, aprantl, hiraditya, rupprecht, jdoerfert, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D58952

llvm-svn: 356265
2019-03-15 15:00:12 +00:00
Evgeniy Stepanov 6e64a14804 Revert "[llvm] Skip over empty line table entries."
This reverts commit r355972.
See the discussion at https://reviews.llvm.org/D58952.

llvm-svn: 356001
2019-03-13 01:37:58 +00:00
Mircea Trofin 0c29402eb4 [llvm] Skip over empty line table entries.
Summary:
This is similar to how addr2line handles consecutive entries with the
same address - pick the last one.

Reviewers: dblaikie, friss, JDevlieghere

Reviewed By: dblaikie

Subscribers: ormris, echristo, JDevlieghere, probinson, aprantl, hiraditya, rupprecht, jdoerfert, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D58952

llvm-svn: 355972
2019-03-12 20:48:45 +00:00
Nathan Lanza cc51dc649a Add Swift enumerator value for CodeView::SourceLanguage
Summary:
Swift now generates PDBs for debugging on Windows. llvm and lldb
need a language enumerator value too properly handle the output
emitted by swiftc.

Subscribers: jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59231

llvm-svn: 355882
2019-03-11 23:27:59 +00:00
Petar Jovanovic 95817d3641 [DebugInfo] Fix the type of the formated variable
Change the format type of *Personality and *LSDAAddress to PRIx64 since
they are of type uint64_t.
The problem was detected on mips builds, where it was printing junk values
and causing test failure.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D58451

llvm-svn: 355607
2019-03-07 16:31:08 +00:00
Jonas Devlieghere 4cc567bb9e [DWARFFormValue] Don't consider DW_FORM_data4/8 to be section offsets.
When dumping ToT clan's debug info with dwarfdump, we were seeing an
error saying that that the location list overflows the debug_loc
section. After reducing the testcase we figured out that we were
interpreting the DW_FORM_data4 as a section offset.

In DWARF3 DW_FORM_data4 and DW_FORM_data8 served also as a section
offset. Until now we didn't check check for the DWARF version, because
some producers (read old versions of clang) were still emitting this.
The relevant code/comment was added in 2013, and I believe it's now
reasonable to start checking the version.

The FormValue class is a little bit of a mess because it cashes the
DWARF unit and context when it extracted the value itself. Several
methods of the class rely on it being present, or return an Optional for
the code path that needs it. At the same time the FormValue class also
used in places where there's no DWARF unit.

For this patch I went with the least invasive change: checking the
version from the CU when it's available. If it's not (because the form
value was created from a value directly) we default to the old behavior.

Differential revision: https://reviews.llvm.org/D58698

llvm-svn: 355456
2019-03-05 23:47:22 +00:00
Vlad Tsyrklevich 53a9f1d367 Revert "[DWARFFormValue] Cleanup DWARFFormValue interface. (2/2) (NFC)"
This reverts commit r355233, it was causing UBSan failures.

llvm-svn: 355255
2019-03-02 01:10:00 +00:00
Jonas Devlieghere 2dc2baa8cc [DWARFFormValue] Cleanup DWARFFormValue interface. (2/2) (NFC)
Continues the work started in r354941. Changes (all but one) uses of the
extractValue to static createFromData.

llvm-svn: 355233
2019-03-01 22:14:24 +00:00
Adrian Prantl fa37a00044 dsymutil support for DW_OP_convert
Add support for cloning DWARF expressions that contain base type DIE
references in dsymutil.

<rdar://problem/48167812>

Differential Revision: https://reviews.llvm.org/D58534

llvm-svn: 355148
2019-02-28 22:12:32 +00:00
Alexey Lapshin 77fc1f6049 [DebugInfo] add SectionedAddress to DebugInfo interfaces.
That patch is the fix for https://bugs.llvm.org/show_bug.cgi?id=40703
   "wrong line number info for obj file compiled with -ffunction-sections"
   bug. The problem happened with only .o files. If object file contains
   several .text sections then line number information showed incorrectly.
   The reason for this is that DwarfLineTable could not detect section which
   corresponds to specified address(because address is the local to the
   section). And as the result it could not select proper sequence in the
   line table. The fix is to pass SectionIndex with the address. So that it
   would be possible to differentiate addresses from various sections. With
   this fix llvm-objdump shows correct line numbers for disassembled code.

   Differential review: https://reviews.llvm.org/D58194

llvm-svn: 354972
2019-02-27 13:17:36 +00:00
Jonas Devlieghere bb111152b7 [DWARFFormValue] Cleanup DWARFFormValue interface. (NFC)
DWARFFormValues can be created from a data extractor or by passing its
value directly. Until now this was done by member functions that
modified an existing object's internal state. This patch replaces a
subset of these methods with static method that return a new
DWARFFormValue.

llvm-svn: 354941
2019-02-27 00:58:09 +00:00
Markus Lavin 76dda218a0 [DebugInfo] Prep llvm-dwarfdump for typed DW5 ops.
Adds llvm-dwarfdump support for pretty printing Dwarf5 expressions ops
that reference a base type (right now only DW_OP_convert is added).
Includes verification to verify that the ops operand is actually a
DW_TAG_base_type DIE.

Differential Revision: https://reviews.llvm.org/D58442

llvm-svn: 354552
2019-02-21 08:20:24 +00:00
Matt Davis 123be5d4c0 [symbolizer] Avoid collecting symbols belonging to invalid sections.
Summary:
llvm-symbolizer would originally report symbols that belonged to an invalid object file section.
Specifically the case where: `*Symbol.getSection() == ObjFile.section_end()`
This patch prevents the Symbolizer from collecting symbols that belong to invalid sections.

The test  (from PR40591) introduces a case where two symbols have address 0,
one symbol is defined, 'foo', and the other is not defined, 'bar'.  This patch will cause
the Symbolizer to keep 'foo' and ignore 'bar'.

As a side note, the logic for adding symbols to the Symbolizer's store
(`SymbolizableObjectFile::addSymbol`) replaces symbols with the
same <address, size> pair.  At some point that logic should be revisited as in the
aforementioned case, 'bar' was overwriting 'foo' in the Symbolizer's store,
and 'foo' was forgotten.

This fixes PR40591

Reviewers: jhenderson, rupprecht

Reviewed By: rupprecht

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58146

llvm-svn: 354083
2019-02-14 23:50:35 +00:00
Jordan Rupprecht 5b7ad42729 [DebugInfo] Fix /usr/lib/debug llvm-symbolizer lookup with relative paths
Summary:
rL189250 added a realpath call, and rL352916 because realpath breaks assumptions with some build systems. However, the /usr/lib/debug case has been clarified, falling back to /usr/lib/debug is currently broken if the obj passed in is a relative path. Adding a call to use absolute paths when falling back to /usr/lib/debug fixes that while still not making any realpath assumptions.

This also adds a --fallback-debug-path command line flag for testing (since we probably can't write to /usr/lib/debug from buildbot environments), but was also verified manually:

```
$ rm -f path/to/dwarfdump-test.elf-x86-64
$ strace llvm-symbolizer --obj=relative/path/to/dwarfdump-test.elf-x86-64.debuglink 0x40113f |& grep dwarfdump
```

Lookups went to relative/path/to/dwarfdump-test.elf-x86-64, relative/path/to/.debug/dwarfdump-test.elf-x86-64, and then finally /usr/lib/debug/absolute/path/to/dwarfdump-test.elf-x86-64.

Reviewers: dblaikie, samsonov

Reviewed By: dblaikie

Subscribers: krytarowski, aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57916

llvm-svn: 353730
2019-02-11 18:05:48 +00:00
Benjamin Kramer 711950c116 Move some classes into anonymous namespaces. NFC.
llvm-svn: 353710
2019-02-11 15:16:21 +00:00
Alexandre Ganea 120366edc7 [CodeView] Fix cycles in debug info when merging Types with global hashes
When type streams with forward references were merged using GHashes, cycles 
were introduced in the debug info. This was caused by 
GlobalTypeTableBuilder::insertRecordAs() not inserting the record on the second
pass, thus yielding an empty ArrayRef at that record slot. Later on, upon PDB 
emission, TpiStreamBuilder::commit() would skip that empty record, thus 
offseting all indices that came after in the stream. 

This solution comes in two steps: 

1. Fix the hash calculation, by doing a multiple-step resolution, iff there are
forward references in the input stream.
2. Fix merge by resolving with multiple passes, therefore moving records with
forward references at the end of the stream. 

This patch also adds support for llvm-readoj --codeview-ghash.
Finally, fix dumpCodeViewMergedTypes() which previously could reference deleted
memory. 

Fixes PR40221 

Differential Revision: https://reviews.llvm.org/D57790 

llvm-svn: 353412
2019-02-07 15:24:18 +00:00
James Henderson b6b5b1a592 [DebugInfo]Print correct value for special opcode address increment
The wrong variable was being used when printing the address increment in
verbose output of .debug_line. This patch fixes this.

Reviewed by: JDevlieghere

Differential Revision: https://reviews.llvm.org/D57693

llvm-svn: 353288
2019-02-06 10:31:50 +00:00
Jordan Rupprecht 835df27f85 [DebugInfo] Don't use realpath when looking up debug binary locations.
Summary:
Using realpath makes assumptions about build systems that do not always hold true. The debug binary referred to from the .gnu_debuglink should exist in the same directory (or in a .debug directory, etc.), but the files may only exist as symlinks to a differently named files elsewhere, and using realpath causes that lookup to fail.

This was added in r189250, and this is basically a revert + regression test case.

Reviewers: dblaikie, samsonov, jhenderson

Reviewed By: dblaikie

Subscribers: llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57609

llvm-svn: 352916
2019-02-01 21:04:16 +00:00
Wolfgang Pieb 58513b7761 [DWARF v5] Fix DWARF emitter and consumer to produce/expect a uleb for a location description's length.
Reviewer: davide, JDevliegere

Differential Revision: https://reviews.llvm.org/D57550

llvm-svn: 352889
2019-02-01 17:11:58 +00:00
Aleksandr Urakov d17f6ab61b [NativePDB] Fix access to both old & new fpo data entries from dbi stream
Summary:
This patch fixes access to fpo streams in native pdb from DbiStream and makes
code consistent with DbiStreamBuilder.

Patch By: leonid.mashinskiy

Reviewers: zturner, aleksandr.urakov

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D56725

llvm-svn: 352615
2019-01-30 10:40:45 +00:00
Zachary Turner 8371da385a [PDB] Increase TPI hash bucket count.
PDBs contain several serialized hash tables. In the microsoft-pdb
repo published to support LLVM implementing PDB support, the
provided initializes the bucket count for the TPI and IPI streams
to the maximum size. This occurs in tpi.cpp L33 and tpi.cpp L398.
In the LLVM code for generating PDBs, these streams are created with
minimum number of buckets. This difference makes LLVM generated
PDBs slower for when used for debugging.

Patch by C.J. Hebert
Differential Revision: https://reviews.llvm.org/D56942

llvm-svn: 352117
2019-01-24 22:25:55 +00:00
James Henderson 33c16a3f16 [llvm-symbolizer] Add support for --basenames/-s
This fixes https://bugs.llvm.org/show_bug.cgi?id=40068.

--basenames is a GNU addr2line switch which strips the directory names
from the file path in the output.

Reviewed by: ruiu

Differential Revision: https://reviews.llvm.org/D56919

llvm-svn: 351795
2019-01-22 10:24:32 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Alexandre Ganea 90f4b94da3 [CodeView] More appropriate name and type for a Microsoft precompiled headers parameter. NFC
llvm-svn: 350520
2019-01-07 13:53:16 +00:00
David Blaikie b917c3a41a llvm-dwarfdump: Skip address index info (and dump only the address, if found) when non-verbose dumping addrx forms
There's a few bugs here still - demonstrated with FIXITs in the test.

llvm-svn: 350046
2018-12-24 06:52:31 +00:00
David Blaikie 2a38c17b34 DebugInfo: Accurately propagate the section used by a relocation when accessing ranges defined by low/high_pc
This is difficult/not possible to test in LLVM, but is visible as a
crash in LLD when parsing DWARF to generate gdb-index.

This function is called by llvm-dwarfdump when parsing high_pc for
non-verbose output (to print the actual high_pc rather than the low_pc
relative value), but in that case llvm-dwarfdump doesn't print section
names (if it did, it would hit this problem).

We could add some other features to llvm-dwarfdump to expose this, but
nothing really springs to my mind. I will add a test to lld, though.

llvm-svn: 350010
2018-12-22 22:20:40 +00:00
David Blaikie 25179613f6 llvm-dwarfdump: Dump the section name/number for addr attributes
llvm-svn: 350009
2018-12-22 20:34:58 +00:00
David Blaikie 9efb0153f0 llvm-dwarfdump: Remove extraneous space between '(' and 'indexed'
When dumping string or address indexes

llvm-svn: 349997
2018-12-22 08:43:08 +00:00
David Blaikie c04d2bf22a llvm-dwarfdump: Print the section name/number for addr_index attributes
(addr attributes coming shortly)

llvm-svn: 349996
2018-12-22 08:33:55 +00:00
David Blaikie 87ae80fb2f DebugInfo: Refactor named section dumping into a reusable helper
Currently the section name (& possibly number) is only printed on
addresses in ranges - but no reason it couldn't also be displayed on
other addresses (like low/high PC).

Refactor in that direction by pulling out the section lookup and name
ambiguity dumping logic into a reusable helper.

llvm-svn: 349995
2018-12-22 08:23:10 +00:00
David Blaikie e4e0b9f48f DebugInfo: Remove extra attribute lookup
llvm-svn: 349985
2018-12-22 02:24:13 +00:00
David Blaikie 219c6bd388 libDebugInfo: Refactor error handling in range list parsing
Propagate the llvm::Error a little further up. This is NFC for
llvm-dwarfdump in this change, but allows ld.lld to emit more precise
error messages about which object and archive the erroneous DWARF is in.

llvm-svn: 349978
2018-12-22 00:31:02 +00:00
David Blaikie c3f30a7fc6 Reapply: DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses)
Originally committed in r349333, reverted in r349353.

GCC emitted these unconditionally on/before 4.4/March 2012
Clang emitted these unconditionally on/before 3.5/March 2014

This improves performance when parsing CUs (especially those using split
DWARF) that contain no code ranges (such as the mini CUs that may be
created by ThinLTO importing - though generally they should be/are
avoided, especially for Split DWARF because it produces a lot of very
small CUs, which don't scale well in a bunch of other ways too
(including size)).

The revert was due to a (Google internal) test that had some checked in old
object files missing DW_AT_ranges. That's since been fixed.

llvm-svn: 349968
2018-12-21 22:25:01 +00:00
Luke Cheeseman 41a9e53500 [Dwarf/AArch64] Return address signing B key dwarf support
- When signing return addresses with -msign-return-address=<scope>{+<key>},
  either the A key instructions or the B key instructions can be used. To
  correctly authenticate the return address, the unwinder/debugger must know
  which key was used to sign the return address.
- When and exception is thrown or a break point reached, it may be necessary to
  unwind the stack. To accomplish this, the unwinder/debugger must be able to
  first authenticate an the return address if it has been signed.
- To enable this, the augmentation string of CIEs has been extended to allow
  inclusion of a 'B' character. Functions that are signed using the B key
  variant of the instructions should have and FDE whose associated CIE has a 'B'
  in the augmentation string.
- One must also be able to preserve these semantics when first stepping from a
  high level language into assembly and then, as a second step, into an object
  file. To achieve this, I have introduced a new assembly directive
  '.cfi_b_key_frame ', that tells the assembler the current frame uses return
  address signing with the B key.
- This ensures that the FDE is associated with a CIE that has 'B' in the
  augmentation string.

Differential Revision: https://reviews.llvm.org/D51798

llvm-svn: 349895
2018-12-21 10:45:08 +00:00
David Blaikie ac69af7ad6 llvm-dwarfdump: Improve/fix pretty printing of array dimensions
This is to address post-commit feedback from Paul Robinson on r348954.

The original commit misinterprets count and upper bound as the same thing (I thought I saw GCC producing an upper bound the same as Clang's count, but GCC correctly produces an upper bound that's one less than the count (in C, that is, where arrays are zero indexed)).

I want to preserve the C-like output for the common case, so in the absence of a lower bound the count (or one greater than the upper bound) is rendered between []. In the trickier cases, where a lower bound is specified, a half-open range is used (eg: lower bound 1, count 2 would be "[1, 3)" and an unknown parts use a '?' (eg: "[1, ?)" or "[?, 7)" or "[?, ? + 3)").

Reviewers: aprantl, probinson, JDevlieghere

Differential Revision: https://reviews.llvm.org/D55721

llvm-svn: 349670
2018-12-19 19:34:24 +00:00
Luke Cheeseman f57d7d8237 [AArch64] - Return address signing dwarf support
- Reapply changes intially introduced in r343089
- The archtecture info is no longer loaded whenever a DWARFContext is created
- The runtimes libraries (santiziers) make use of the dwarf context classes but
  do not intialise the target info
- The architecture of the object can be obtained without loading the target info
- Adding a method to the dwarf context to get this information and multiplex the
  string printing later on

Differential Revision: https://reviews.llvm.org/D55774

llvm-svn: 349472
2018-12-18 10:37:42 +00:00
Zachary Turner bb3d7e565f [PDB] Add some helper functions for working with scopes.
llvm-svn: 349361
2018-12-17 16:15:36 +00:00
Eric Liu 6c933a2bed Revert "DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses)"
This reverts commit r349333. It caused internal test to fail. I have
sent more information to the author.

llvm-svn: 349353
2018-12-17 14:14:40 +00:00
David Blaikie 884deed1b3 DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses)
GCC emitted these unconditionally on/before 4.4/March 2012
Clang emitted these unconditionally on/before 3.5/March 2014

This improves performance when parsing CUs (especially those using split
DWARF) that contain no code ranges (such as the mini CUs that may be
created by ThinLTO importing - though generally they should be/are
avoided, especially for Split DWARF because it produces a lot of very
small CUs, which don't scale well in a bunch of other ways too
(including size)).

llvm-svn: 349333
2018-12-17 08:27:19 +00:00
David Blaikie 023674a9e4 DebugInfo/DWARF: Pretty print subroutine types
Doesn't handle varargs and other fun things, but it's a start. (also
doesn't print these strictly as valid C++ when it's a pointer to
function, it'll print as "void(int)*" instead of "void (*)(int)")

llvm-svn: 348965
2018-12-12 19:53:03 +00:00
David Blaikie 3f8f004daf DebugInfo/DWARF: Improve dumping of pointers to members ('int foo::*' rather than 'int*')
llvm-svn: 348962
2018-12-12 19:34:02 +00:00
David Blaikie 815cffaad8 DebugInfo/DWARF: Refactor type dumping to dump types, rather than DIEs that reference types
This lays the foundation for dumping types not referenced by DW_AT_type
attributes (in the near-term, that'll be DW_AT_containing_type for a
DW_TAG_ptr_to_member_type - in the future, potentially dumping the
pretty printed name next to the DW_TAG for the type, rather than only
when the type is referenced from elsewhere)

llvm-svn: 348961
2018-12-12 19:33:08 +00:00
David Blaikie 92b5493a14 DebugInfo/DWARF: Refactor getAttributeValueAsReferencedDie to accept a DWARFFormValue
Save searching for the attribute again when you already have the
DWARFFormValue at hand.

llvm-svn: 348960
2018-12-12 19:23:55 +00:00
David Blaikie 73066d60f1 llvm-dwarfdump: Dump array dimensions in stringified type names
llvm-svn: 348954
2018-12-12 18:46:25 +00:00
Zachary Turner a42bbe3981 [NativePDB] Reconstruct function declarations from debug info.
Previously we would create an lldb::Function object for each function
parsed, but we would not add these to the clang AST. This is a first
step towards getting local variable support working, as we first need an
AST decl so that when we create local variable entries, they have the
proper DeclContext.

Differential Revision: https://reviews.llvm.org/D55384

llvm-svn: 348631
2018-12-07 19:34:02 +00:00
Zachary Turner a93458b050 [PDB] Move some code around. NFC.
llvm-svn: 348505
2018-12-06 17:49:15 +00:00
Zachary Turner 579264bd59 Support skewed stream arrays.
VarStreamArray was built on the assumption that it is backed by a
StreamRef, and offset 0 of that StreamRef is the first byte of the first
record in the array.

This is a logical and intuitive assumption, but unfortunately we have
use cases where it doesn't hold. Specifically, a PDB module's symbol
stream is prefixed by 4 bytes containing a magic value, and the first
byte of record data in the array is actually at offset 4 of this byte
sequence.

Previously, we would just truncate the first 4 bytes and then construct
the VarStreamArray with the resulting StreamRef, so that offset 0 of the
underlying stream did correspond to the first byte of the first record,
but this is problematic, because symbol records reference other symbol
records by the absolute offset including that initial magic 4 bytes. So
if another record wants to refer to the first record in the array, it
would say "the record at offset 4".

This led to extremely confusing hacks and semantics in loading code, and
after spending 30 minutes trying to get some math right and failing, I
decided to fix this in the underlying implementation of VarStreamArray.
Now, we can say that a stream is skewed by a particular amount. This
way, when we access a record by absolute offset, we can use the same
values that the records themselves contain, instead of having to do
fixups.

Differential Revision: https://reviews.llvm.org/D55344

llvm-svn: 348499
2018-12-06 16:55:00 +00:00
Zachary Turner 7c6b19f49b [PDB] Emit S_UDT records in LLD.
Previously these were dropped.  We now understand them sufficiently
well to start emitting them.  From the debugger's perspective, this
now enables us to have debug info about typedefs (both global and
function-locally scoped)

Differential Revision: https://reviews.llvm.org/D55228

llvm-svn: 348306
2018-12-04 21:48:46 +00:00
George Rimar 7e981f330b [llvm-dwarfdump] - Dump the older versions of .eh_frame/.debug_frame correctly.
The issue is the following.

DWARF 2 used version 1 for .debug_frame.
(Appendix G, p. 416 http://dwarfstd.org/doc/DWARF5.pdf)

lib/MC now always sets version 1 for .eh_frame (and sets 1-4 versions for .debug_frame correctly):
https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCDwarf.cpp#L1530
https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCDwarf.cpp#L1562
https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCDwarf.cpp#L1602

In version 1, return_address_register was defined as ubyte, while other versions
switched to uleb128.
(p 62, http://www.dwarfstd.org/doc/dwarf-2.0.0.pdf)

Patch teaches llvm-dwarfdump about this difference.

Differential revision: https://reviews.llvm.org/D54860

llvm-svn: 348242
2018-12-04 10:01:39 +00:00
Zachary Turner 1e0cce796c Fix issue with Tpi Stream hash map.
Part of the patch to not build the hash map eagerly was omitted
due to a merge conflict.  Add it back, which should fix the failing
tests.

llvm-svn: 348166
2018-12-03 19:05:12 +00:00
Zachary Turner f861e291d6 Don't build the Tpi Hash map by default.
This is very slow and should be done for specific cases where
lookups will need to happen.

llvm-svn: 348160
2018-12-03 18:32:05 +00:00
George Rimar 6d85c58328 [llvm-dwarfdump] - Stop printing the bogus empty section name on invalid dwarf.
When there is no .debug_addr section for some reason,
llvm-dwarfdump would print the bogus empty section name when dumping ranges
in .debug_info:

DW_AT_ranges [DW_FORM_rnglistx]   (indexed (0x0) rangelist = 0x00000004
    [0x0000000000000000, 0x0000000000000001) ""
    [0x0000000000000000, 0x0000000000000002) "")

That happens because of the code which uses 0 (zero) as a section index as a default value.
The code should use -1ULL instead because technically 0 is a valid zero section index
in ELF and -1ULL is a special constant used that means "no section available".

This is mostly a fix for the overall correctness/safety of the code,
but a test case is provided too.

Differential revision: https://reviews.llvm.org/D55113

llvm-svn: 348115
2018-12-03 10:33:40 +00:00
Reid Kleckner ffba54493f Add missing error checking code intended for r347687
llvm-svn: 347690
2018-11-27 19:14:11 +00:00
Reid Kleckner 291d015de4 [PDB] Add symbol records in bulk
Summary:
This speeds up linking clang.exe/pdb with /DEBUG:GHASH by 31%, from
12.9s to 9.8s.

Symbol records are typically small (16.7 bytes on average), but we
processed them one at a time. CVSymbol is a relatively "large" type. It
wraps an ArrayRef<uint8_t> with a kind an optional 32-bit hash, which we
don't need. Before this change, each DbiModuleDescriptorBuilder would
maintain an array of CVSymbols, and would write them individually with a
BinaryItemStream.

With this change, we now add symbols that happen to appear contiguously
in bulk. For each .debug$S section (roughly one per function), we
allocate two copies, one for relocation, and one for realignment
purposes. For runs of symbols that go in the module stream, which is
most symbols, we now add them as a single ArrayRef<uint8_t>, so the
vector DbiModuleDescriptorBuilder is roughly linear in the number of
.debug$S sections (O(# funcs)) instead of the number of symbol records
(very large).

Some stats on symbol sizes for the curious:
  PDB size: 507M
  sym bytes: 316,508,016
  sym count:  18,954,971
  sym byte avg: 16.7

As future work, we may be able to skip copying symbol records in the
linker for realignment purposes if we make LLVM write them aligned into
the object file. We need to double check that such symbol records are
still compatible with link.exe, but if so, it's definitely worth doing,
since my profile shows we spend 500ms in memcpy in the symbol merging
code. We could potentially cut that in half by saving a copy.
Alternatively, we could apply the relocations *after* we iterate the
symbols. This would require some careful re-engineering of the
relocation processing code, though.

Reviewers: zturner, aganea, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D54554

llvm-svn: 347687
2018-11-27 19:00:23 +00:00
Luke Cheeseman 6db3a6a4a7 Revert r347490 as it breaks address sanitizer builds
llvm-svn: 347499
2018-11-23 17:13:06 +00:00
Luke Cheeseman d6dbd64104 Revert r343341
- Cannot reproduce the build failure locally and the build logs have
  been deleted.

llvm-svn: 347490
2018-11-23 11:01:47 +00:00
Zachary Turner c68f895702 [CodeView] Add support for ref-qualified member functions.
When you have a member function with a ref-qualifier, for example:

struct Foo {
  void Func() &;
  void Func2() &&;
};

clang-cl was not emitting this information. Doing so is a bit
awkward, because it's not a property of the LF_MFUNCTION type, which
is what you'd expect. Instead, it's a property of the this pointer
which is actually an LF_POINTER. This record has an attributes
bitmask on it, and our handling of this bitmask was all wrong. We
had some parts of the bitmask defined incorrectly, but importantly
for this bug, we didn't know about these extra 2 bits that represent
the ref qualifier at all.

Differential Revision: https://reviews.llvm.org/D54667

llvm-svn: 347354
2018-11-20 22:13:43 +00:00
Zachary Turner b35e1d7dc3 [CodeView] Don't print PointerAttributes when dumping.
PointerAttributes is a bitwise-or of several other fields, each of
which is already printed on its own line with a better explanation.
So this doesn't really help much.

llvm-svn: 347275
2018-11-20 00:10:27 +00:00
David Blaikie 81959a2730 llvm-symbolizer: Avoid calling getFromOffset when the index entry is already available
Especially for symbolizer it can be efficient to have to search through
the entire index when it isn't needed - llvm-symbolizer looks up only a
few CUs & already has an index available in getUnitForEntry, once it's
passed down to DWARFUnitHeader::extract then there's no need for it to
call getFromOffset.

llvm-svn: 347134
2018-11-17 05:57:58 +00:00
Fangrui Song 7570932977 Use llvm::copy. NFC
llvm-svn: 347126
2018-11-17 01:44:25 +00:00
Simon Atanasyan 705fbd5d4f [DWARF] Use PRIx64 instead of 'x' to format 64-bit values
This is a follow-up to r346715. Use PRIx64 to formatted print of 64-bit
value in the `DWARFDebugLoclists::LocationList::dump` to escape problem
on big-endian hosts.

llvm-svn: 347049
2018-11-16 13:14:26 +00:00
Zachary Turner 03a24052f3 [NativePDB] Improved support for nested type reconstruction.
In a previous patch, we pre-processed the TPI stream in order to build
the reverse mapping from nested type -> parent type so that we could
accurately reconstruct a DeclContext hierarchy.

However, there were some issues. An LF_NESTTYPE record is really just a
typedef, so although it happens to be used to indicate the name of the
nested type and referring to the global record which defines the type,
it is also used for every other kind of nested typedef. When we rebuild
the DeclContext hierarchy, we want it to be as accurate as possible,
which means that if we have something like:

  struct A {
    struct B {};
    using C = B;
  };

We don't want to create two CXXRecordDecls in the AST each with the
exact same definition. We just want to create one for B and then
define C as an alias to B. Previously, however, it would not be able
to distinguish between the two cases and it would treat A::B and
A::C as being two classes each with separate definitions. We address
the first half of improving the pre-processing logic so that only
actual definitions are treated this way.

Later, in a followup patch, we can handle the case of nested
typedefs since we're already going to be enumerating the field list
anyway and this patch introduces the general framework for
distinguishing between the two cases.

Differential Revision: https://reviews.llvm.org/D54357

llvm-svn: 346786
2018-11-13 20:07:32 +00:00
Simon Atanasyan 22dc538618 [DWARF] Do not use PRIx32 for printing uint64_t values
The `DWARFDebugAddrTable::dump` routine prints 32/64-bits addresses.
These values are stored in a vector of `uint64_t` independently of their
original sizes. But `format` function gets format string with PRIx32
suffix in case of 32-bit address size. At least on MIPS 32-bit targets
that leads to incorrect output.

This patch changes formats strings and always use PRIx64 to print
`uint64_t` values.

Differential Revision: http://reviews.llvm.org/D54424

llvm-svn: 346715
2018-11-12 22:43:17 +00:00
David Blaikie 582a5ebce0 NFC: DebugInfo: Reduce scope of DebugOffset to simplify code
This was being used as a sort of indirect out parameter from shouldDump
- seems simpler to use it as the actual result of the call. (this does
mean using a pointer to an Optional & actually using all 3 states (null,
None, and present) which is, admittedly, a tad subtle - but given the
limited scope, seems OK to me - open to discussion though, if others
feel strongly about it)

llvm-svn: 346691
2018-11-12 18:53:28 +00:00
Fangrui Song 158b26213f [DWARF] Change pubnames to use DWARFSection instead of StringRef
Summary: The debug_info_offset values in .debug_{,gnu_}pub{name,types} may be relocated. Change it to DWARFSection so that we can get relocated values.

Reviewers: ruiu, dblaikie, grimar, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D54375

llvm-svn: 346615
2018-11-11 18:57:28 +00:00
Alexandre Ganea 4b2957243b [LLD] Fix Microsoft precompiled headers cross-compile on Linux
Differential revision: https://reviews.llvm.org/D54122

llvm-svn: 346403
2018-11-08 14:42:37 +00:00
Jorge Gorbe Moya bf1badb6bb Add parentheses to silence warning.
DWARFContext.cpp:356:20: error: using the result of an assignment as a condition without parentheses [-Werror,-Wparentheses]

llvm-svn: 346365
2018-11-07 22:30:01 +00:00
Paul Robinson 746c22389c [DWARFv5] Read and dump multiple .debug_info sections.
Type units go in .debug_info comdats, not .debug_types, in v5.

Differential Revision: https://reviews.llvm.org/D53907

llvm-svn: 346360
2018-11-07 21:39:09 +00:00
Fangrui Song 54d23a8eb7 [DWARF] Support types CU list in .gdb_index dumping
Some executables have non-empty types CU list and -gdb-index would report "<error reporting>" before.

llvm-svn: 346181
2018-11-05 23:27:53 +00:00