It is possible to try to keep parsing a debug line program even when the
length of an extended opcode does not match what is expected for that
opcode. This patch changes what was previously a fatal error to be
non-fatal. The parser now continues by assuming the the claimed length
is correct, even if it means moving the offset backwards.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72155
This helps to detect and report parsing errors better.
The patch follows the ideas of LLDB's patches D59370 and D59381.
It adds tests for valid and some invalid cases. More checks and
tests to come. Note that the patch fixes validation of the Length
field because the value does not include the field itself.
The existing users are updated to show the error messages.
Differential Revision: https://reviews.llvm.org/D71875
Reasonable assumptions can be made when a parsed address length does not
match the expected length, so there's no need for this to be fatal.
Reviewed by: ikudrin
Differential Revision: https://reviews.llvm.org/D72154
Unlike most of our errors in the debug line parser, the "no end of
sequence" message was missing any reference to which line table it
refererred to. This change adds the offset to this message.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72443
If the claimed unit length of a debug line program is such that the line
table would finish past the end of the .debug_line section, an infinite
loop occurs because the data extractor will continue to "read" zeroes
without changing the offset. This previously didn't hit an error because
the line table program handles a series of zeroes as a bad extended
opcode.
This patch fixes the inifinite loop and adds a warning if the program
doesn't fit in the available data.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D72279
The checkGetOrParseLineTableEmitsError function could end up generating
both recoverable and unrecoverable errors, but it is only intended for
handling the latter.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72156
The Isa register is a uint8_t, but at least on Windows this is
internally an unsigned char, which meant that prior to this patch it got
formatted as an ASCII character, rather than a decimal number. This
patch fixes this by casting it to a uint64_t before printing. I did it
this way instead of using a uint8_t formatter because a) it is simpler,
and b) it allows us to change the internal type of Isa in the future
without this code breaking.
I also took the opportunity to test the printing of the other standard
opcodes.
Reviewed by: probinson
Differential Revision: https://reviews.llvm.org/D71274
Summary:
Lookup functions are designed to not fully decode a FunctionInfo, LineTable or InlineInfo, they decode only what is needed into a LookupResult object. This allows lookups to avoid costly memory allocations and avoid parsing large amounts of information one a suitable match is found.
LookupResult objects contain the address that was looked up, the concrete function address range, the name of the concrete function, and a list of source locations. One for each inline function, and one for the concrete function. This allows one address to turn into multiple frames and improves the signal you get when symbolicating addresses in GSYM files.
Reviewers: labath, aprantl
Subscribers: mgorny, hiraditya, llvm-commits, lldb-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70993
This recommits 089c0f5814, which was
reverted due to failing tests on big endian machines. It includes a fix
which I believe (I don't have BE machine) should fix this issue. The fix
consists of correcting the invocation DWARFYAML::EmitDebugSections,
which was missing one (default) function arguments, and so didn't
actually force the little-endian mode.
The original commit message follows.
Summary:
This patch adds DWARFDie::getLocations, which returns the location
expressions for a given attribute (typically DW_AT_location). It handles
both "inline" locations and references to the external location list
sections (currently only of the DW_FORM_sec_offset type). It is
implemented on top of DWARFUnit::findLoclistFromOffset, which is also
added in this patch. I tried to make their signatures similar to the
equivalent range list functionality.
The actual location list interpretation logic is in
DWARFLocationTable::visitAbsoluteLocationList. This part is not
equivalent to the range list code, but this deviation is motivated by a
desire to reuse the same location list parsing code within lldb.
The functionality is tested via a c++ unit test of the DWARFDie API.
Reviewers: dblaikie, JDevlieghere, SouraVX
Subscribers: mgorny, hiraditya, cmtice, probinson, llvm-commits, aprantl
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70394
Summary:
This patch adds DWARFDie::getLocations, which returns the location
expressions for a given attribute (typically DW_AT_location). It handles
both "inline" locations and references to the external location list
sections (currently only of the DW_FORM_sec_offset type). It is
implemented on top of DWARFUnit::findLoclistFromOffset, which is also
added in this patch. I tried to make their signatures similar to the
equivalent range list functionality.
The actual location list interpretation logic is in
DWARFLocationTable::visitAbsoluteLocationList. This part is not
equivalent to the range list code, but this deviation is motivated by a
desire to reuse the same location list parsing code within lldb.
The functionality is tested via a c++ unit test of the DWARFDie API.
Reviewers: dblaikie, JDevlieghere, SouraVX
Subscribers: mgorny, hiraditya, cmtice, probinson, llvm-commits, aprantl
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70394
MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64
regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo
we can find out Mips ABI and pick appropriate prefix.
Tags: #llvm, #clang, #lldb
Differential Revision: https://reviews.llvm.org/D66795
This patch adds the ability to create GSYM files with GsymCreator, and read them with GsymReader. Full testing has been added for both new classes.
This patch differs from the original patch https://reviews.llvm.org/D53379 in that is uses a StringTableBuilder class from llvm instead of a custom version. Support for big and little endian files has been added. If the endianness matches the current host, we use efficient extraction for the header, address table and address info offset tables.
Differential Revision: https://reviews.llvm.org/D68744
llvm-svn: 374381
This patch adds the llvm::gsym::Header class which appears at the start of a stand alone GSYM file, or in the first bytes of the GSYM data in a GSYM section within a file. Added encode and decode methods with full error handling and full tests.
Differential Revision: https://reviews.llvm.org/D67666
llvm-svn: 372149
This patch adds encoding and decoding of the FunctionInfo objects along with full error handling and tests. Full details of the FunctionInfo encoding format appear in the FunctionInfo.h header file.
Differential Revision: https://reviews.llvm.org/D67506
llvm-svn: 372135
This patch adds the ability to create a gsym::LineTable object, populate it, encode and decode it and test all functionality.
The full format of the LineTable encoding is specified in the header file LineTable.h.
Differential Revision: https://reviews.llvm.org/D66602
llvm-svn: 371657
This is a follow-up of rL369529, where the return value of
DWARFUnit::getLength() was changed from uint32_t to uint64_t.
The test checks that a unit header with Length > 4G can be successfully
parsed and the value of the Length field is not truncated.
Differential Revision: https://reviews.llvm.org/D67276
llvm-svn: 371510
This is a follow-up of rL369529, where the return value of
DWARFUnit::getLength() was changed from uint32_t to uint64_t.
The test checks that a unit header with Length > 4G can be successfully
parsed and the value of the Length field is not truncated.
Differential Revision: https://reviews.llvm.org/D67276
llvm-svn: 371499
This patch adds the ability to encode and decode InlineInfo objects and adds test coverage. Error handling is introduced in the encoding and decoding which will be used from here on out for remaining patches.
Differential Revision: https://reviews.llvm.org/D66600
llvm-svn: 370936
The full GSYM patch started with: https://reviews.llvm.org/D53379
This patch add the ability to encode data using the new llvm::gsym::FileWriter class.
FileWriter is a simplified binary data writer class that doesn't require targets, target definitions, architectures, or require any other optional compile time libraries to be enabled via the build process. This class needs the ability to seek to different spots in the binary data that it produces to fix up offsets and sizes in GSYM data. It currently uses std::ostream over llvm::raw_ostream because llvm::raw_ostream doesn't support seeking which is required when encoding and decoding GSYM data.
AddressRange objects are encoded and decoded to be relative to a base address. This will be the FunctionInfo's start address if the AddressRange is directly contained in a FunctionInfo, or a base address of the containing parent AddressRange or AddressRanges. This allows address ranges to be efficiently encoded using ULEB128 encodings as we encode the offset and size of each range instead of full addresses. This also makes encoded addresses easy to relocate as we just need to relocate one base address.
Differential Revision: https://reviews.llvm.org/D63828
llvm-svn: 369587
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
Some of these names were abbreviated, some were not, some pluralised,
some not. Made the API difficult to use - since it's an exact 1:1
mapping to the DWARF sections - use those names (changing underscore
separation for camel casing).
llvm-svn: 368189
Using 64-bit offsets is required to fully implement 64-bit DWARF.
As these classes are used in many different libraries they should
temporarily support both 32- and 64-bit offsets.
Differential Revision: https://reviews.llvm.org/D64006
llvm-svn: 368013
This patch exnteds the error handling in the debug line parser to get
rid of the existing MD5 assertion. I want to reuse the debug line parser
from LLVM in LLDB where we cannot crash on invalid input.
Differential revision: https://reviews.llvm.org/D64544
llvm-svn: 366762
The DWARF3 documentation had inconsistency concerning the reserved range
for unit length values. The issue was fixed in DWARF4.
Differential Revision: https://reviews.llvm.org/D64622
llvm-svn: 366190
The traits object is only used by a few methods. Deserializing a hash
table and walking it is possible without the traits object, so it
shouldn't be required to build a dummy object for that use case.
The TraitsT object used to be a function template parameter before
r327647, this restores it to that state.
This makes it clear that the traits object isn't needed at all in 1 of
the current 3 uses of HashTable (and I am going to add another use that
doesn't need it), and that the default PdbHashTraits isn't used outside
of tests.
While here, also re-enable 3 checks in the test that were commented out
(which requires making HashTableInternals templated and giving FooBar
an operator==).
No intended behavior change.
Differential Revision: https://reviews.llvm.org/D64640
llvm-svn: 365974
Delete unnecessary getters of AddressRange.
Simplify AddressRange::size(): Start <= End check should be checked in an upper layer.
Delete isContiguousWith() that doesn't make sense.
Simplify AddressRanges::insert. Delete commented code. Fix it when more than 1 ranges are to be deleted.
Delete trailing newline.
llvm-svn: 364637
The full GSYM patch started with: https://reviews.llvm.org/D53379
In that patch we wanted to split up getting GSYM into the LLVM code base so we are not committing too much code at once.
This is a first in a series of patches where I only add the foundation classes along with complete unit tests. They provide the foundation for encoding and decoding a GSYM file.
File entries are defined in llvm::gsym::FileEntry. This class splits the file up into a directory and filename represented by uniqued string table offsets. This allows all files that are referred to in a GSYM file to be encoded as 1 based indexes into a global file table in the GSYM file.
Function information in stored in llvm::gsym::FunctionInfo. This object represents a contiguous address range that has a name and range with an optional line table and inline call stack information.
Line table entries are defined in llvm::gsym::LineEntry. They store only address, file and line information to keep the line tables simple and allows the information to be efficiently encoded in a subsequent patch.
Inline information is defined in llvm::gsym::InlineInfo. These structs store the name of the inline function, along with one or more address ranges, and the file and line that called this function. They also contain any child inline information.
There are also utility classes for address ranges in llvm::gsym::AddressRange, and string table support in llvm::gsym::StringTable which are simple classes.
The unit tests test all the APIs on these simple classes so they will be ready for the next patches where we will create GSYM files and parse GSYM files.
Differential Revision: https://reviews.llvm.org/D63104
llvm-svn: 364427
Summary:
When building with a Default Target set we can experience issues
in the DWARF DebugInfo unit tests because:
They assume we can generate object files for the host platform.
Some tests assume the endianess of the target we are generating
DWARF for and the host match.
This patch correct these issues by ensuring the tests which
generate objects in memory are run with respect to
LVM_DEFAULT_TARGET_TRIPLE and it's endianess.
We also make sure we don't use the hosts address size for line test
and split the triple util function in DwarfUtils into a version
that takes an address size and one that doesn't.
See also for discussion:
http://lists.llvm.org/pipermail/llvm-dev/2019-March/131212.html
Patch by: daltenty
Differential Revision: https://reviews.llvm.org/D62084
llvm-svn: 362454
lld-link used to write PDB files that DIA couldn't recover natvis
files from if:
- The global strings table was > 64kiB
- There were at least 3 natvis files
The cause was that the hash function for the /src/headerblock stream
was incorrect: It needs to be truncated to 16 bit.
If the global strings table was <= 64kiB, truncating to 16 bit is a
no-op, so this wasn't needed for small programs.
If there are only 1 or 2 natvis files, then the growth strategy in
HashTable::grow() would mean the hash table would have 2 buckets (for 1
natvis file) or 4 buckets (for 4 natvis files), and since the hash
function is used modulo number of buckets, and since 2 and 4 divide
0x10000, the missing `% 0x10000` is a no-op there too. For 3 natvis
files, the hash table grows to 6 buckets, which has a factor that's not
common with 0x10000 and the difference starts to matter.
Fixes PR41626.
Differential Revision: https://reviews.llvm.org/D61277
llvm-svn: 359515
It didn't handle empty LHS correctly. If two ranges of LHS were
contiguous and jointly contained one range of RHS, it could also be incorrect.
DWARFAddressRange::contains can be removed and its tests can be merged into DWARFVerifier::DieRangeInfo::contains
llvm-svn: 358387
Summary:
Now CVType and CVSymbol are effectively type-safe wrappers around
ArrayRef<uint8_t>. Make the kind() accessor load it from the
RecordPrefix, which is the same for types and symbols.
Reviewers: zturner, aganea
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60018
llvm-svn: 357658
DWARFFormValues can be created from a data extractor or by passing its
value directly. Until now this was done by member functions that
modified an existing object's internal state. This patch replaces a
subset of these methods with static method that return a new
DWARFFormValue.
llvm-svn: 354941
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This defines member function base on the specialization of
std::reverse_iterator for DWARFDie::iterator as required by C++
[reverse.iter.conv].
This fixes unit test DWARFDebugInfoTest.cpp under EXPENSIVE_CHECKS which
currently can't be built due to GNU C++ Library calling this member
function in debug mode.
This fixes https://llvm.org/PR38785
Patch by: Eugene Sharygin
Differential revision: https://reviews.llvm.org/D53792
llvm-svn: 345621
Summary:
This patch just extends the `IPDBSession` interface to allow retrieving
of frame data through it, and adds an implementation over DIA. It is needed
for an implementation (for now with DIA) of the conversion from FPO programs
to DWARF expressions mentioned in D53086.
Reviewers: zturner, asmith, rnk
Reviewed By: asmith
Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D53324
llvm-svn: 344886
Using llvm::getInputFileDirectory() in unit tests is discouraged, so require an explicit opt-in.
This way, cmake also writes ~60 fewer unused files to disk.
Differential Revision: https://reviews.llvm.org/D52095
llvm-svn: 342248
libLLVMTestingSupport.so references a symbol in utils/unittest/UnitTestMain/TestMain.cpp (a layering issue) and will cause a link error because of -Wl,-z,defs (cmake/modules/HandleLLVMOptions.cmake)
Waiting zturner for a better fix.
llvm-svn: 341580
The way DIA SDK works is that when you request a symbol, it
gets assigned an internal identifier that is unique for the
life of the session. You can then use this identifier to
get back the same symbol, with all of the same internal state
that it had before, even if you "destroyed" the original
copy of the object you had.
This didn't work properly in our native implementation, and
if you destroyed an object for a particular symbol, then
requested the same symbol again, it would get assigned a new
ID and you'd get a fresh copy of the object. In order to fix
this some refactoring had to happen to properly reuse cached
objects. Some unittests are added to verify that symbol
reuse is taking place, making use of the new unittest input
feature.
llvm-svn: 341503
Summary:
This prefix was added in r333421, and it changed our dumper output to
say things like "CVRegEAX" instead of just "EAX". That's a functional
change that I'd rather avoid.
I tested GCC, Clang, and MSVC, and all of them support #pragma
push_macro. They don't issue warnings whem the macro is not defined
either.
I don't have a Mac so I can't test the real termios.h header, but I
looked at the termios.h sources online and looked for other conflicts.
I saw only the CR* macros, so those are the ones we work around.
Reviewers: zturner, JDevlieghere
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D50851
llvm-svn: 339907
Summary:
The accelerator tables use the debug_str section to store their strings.
However, they do not support the indirect method of access that is
available for the debug_info section (DW_FORM_strx et al.).
Currently our code is assuming that all strings can/will be referenced
indirectly, and puts all of them into the debug_str_offsets section.
This is generally true for regular (unsplit) dwarf, but in the DWO case,
most of the strings in the debug_str section will only be used from the
accelerator tables. Therefore the contents of the debug_str_offsets
section will be largely unused and bloating the main executable.
This patch rectifies this by teaching the DwarfStringPool to
differentiate between strings accessed directly and indirectly. When a
user inserts a string into the pool it has to declare whether that
string will be referenced directly or not. If at least one user requsts
indirect access, that string will be assigned an index ID and put into
debug_str_offsets table. Otherwise, the offset table is skipped.
This approach reduces the overall binary size (when compiled with
-gdwarf-5 -gsplit-dwarf) in my tests by about 2% (debug_str_offsets is
shrunk by 99%).
Reviewers: probinson, dblaikie, JDevlieghere
Subscribers: aprantl, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D49493
llvm-svn: 339122
This is patch 2 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.
Patch 2 takes the existing std::deque<DWARFUnitSection> for type units
and makes it a simple DWARFUnitSection, simplifying the handling of
type units and making it more consistent with compile units.
Differential Revision: https://reviews.llvm.org/D49742
llvm-svn: 338629
This is patch 1 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.
Patch 1 replaces the templated DWARFUnitSection with a non-templated
version. That is, instead of being a SmallVector of pointers to a
specific unit kind, it is not a SmallVector of pointers to the base
class for both type and compile units. Virtual methods are magic.
Differential Revision: https://reviews.llvm.org/D49741
llvm-svn: 338628
The DWARFDie is a lightweight utility wrapper that stores a pointer to a
compile unit and a debug info entry. Currently, its iterator (used for
walking over its children) stores a DWARFDie and returns a const
reference when dereferencing it.
When the iterator is modified (by incrementing or decrementing it), this
reference becomes invalid. This was happening when calling reverse on
it, because the std::reverse_iterator is keeping a temporary copy of the
iterator (see
https://en.cppreference.com/w/cpp/iterator/reverse_iterator for a good
illustration).
The relevant code in libcxx:
reference operator*() const {_Iter __tmp = current; return *--__tmp;}
When dereferencing the reverse iterator, we decrement and return a
reference to a DWARFDie stored in the stack frame of this function,
resulting in UB at runtime.
This patch specifies the std::reverse_iterator for DWARFDie to do the
right thing.
Differential revision: https://reviews.llvm.org/D49679
llvm-svn: 338506
Previous version of this patch failed on darwin targets because of
different handling of cross-debug-section relocations. This fixes the
tests to emit the DW_AT_str_offsets_base attribute correctly in both
cases. Since doing this is a non-trivial amount of code, and I'm going
to need it in more than one test, I've added a helper function to the
dwarfgen DIE class to do it.
Original commit message follows:
The motivation for this is D49493, where we'd like to test details of
debug_str_offsets behavior which is difficult to trigger from a
traditional test.
This adds the plubming necessary for dwarfgen to generate this section.
The more interesting changes are:
- I've moved emitStringOffsetsTableHeader function from DwarfFile to
DwarfStringPool, so I can generate the section header more easily from
the unit test.
- added a new addAttribute overload taking an MCExpr*. This is used to
generate the DW_AT_str_offsets_base, which links a compile unit to the
offset table.
I've also added a basic test for reading and writing DW_form_strx forms.
Reviewers: dblaikie, JDevlieghere, probinson
Subscribers: llvm-commits, aprantl
Differential Revision: https://reviews.llvm.org/D49670
llvm-svn: 338031
The AsmPrinter created in the tests contained an uninitialized
TargetLoweringObjectFile. Things mostly worked regardless, because we
used a separate instance of that class to specify sections to emit.
This rearanges the object construction order so that we can avoid
creating two lowering objects. Instead, we properly initialize the
object in the AsmPrinter, and have the DWARF generator store a pointer
to it.
llvm-svn: 338026
This recommits r337910 after fixing an "ambiguous call to addAttribute"
error with some compilers (gcc circa 4.9 and MSVC). It seems that these
compilers will consider a "false -> pointer" conversion during overload
resolution. This creates ambiguity because one I added an overload which
takes a MCExpr * as an argument.
I fix this by making the new overload take MCExpr&, which avoids the
conversion. It also documents the fact that we expect a valid MCExpr
object.
Original commit message follows:
The motivation for this is D49493, where we'd like to test details of
debug_str_offsets behavior which is difficult to trigger from a
traditional test.
This adds the plubming necessary for dwarfgen to generate this section.
The more interesting changes are:
- I've moved emitStringOffsetsTableHeader function from DwarfFile to
DwarfStringPool, so I can generate the section header more easily from
the unit test.
- added a new addAttribute overload taking an MCExpr*. This is used to
generate the DW_AT_str_offsets_base, which links a compile unit to the
offset table.
I've also added a basic test for reading and writing DW_form_strx forms.
Reviewers: dblaikie, JDevlieghere, probinson
Subscribers: llvm-commits, aprantl
Differential Revision: https://reviews.llvm.org/D49670
llvm-svn: 337933
This reverts commit r337910 as it's generating "ambiguous call to
addAttribute" errors on some bots.
Will resubmit once I get a chance to look into the problem.
llvm-svn: 337924
Summary:
The motivation for this is D49493, where we'd like to test details of
debug_str_offsets behavior which is difficult to trigger from a
traditional test.
This adds the plubming necessary for dwarfgen to generate this section.
The more interesting changes are:
- I've moved emitStringOffsetsTableHeader function from DwarfFile to
DwarfStringPool, so I can generate the section header more easily from
the unit test.
- added a new addAttribute overload taking an MCExpr*. This is used to
generate the DW_AT_str_offsets_base, which links a compile unit to the
offset table.
I've also added a basic test for reading and writing DW_form_strx forms.
Reviewers: dblaikie, JDevlieghere, probinson
Subscribers: llvm-commits, aprantl
Differential Revision: https://reviews.llvm.org/D49670
llvm-svn: 337910
Make the DIE iterator bidirectional so we can move to the previous
sibling of a DIE.
Differential revision: https://reviews.llvm.org/D49173
llvm-svn: 336823
The code to emit the pieces of the MSF file were actually in
PDBFileBuilder. Move this to MSFBuilder so that we can
theoretically emit an MSF without having a PDB file.
llvm-svn: 335789
Change the "recoverable" error callback to take an Error instaed of a
string.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D46831
llvm-svn: 332845
The idea is that a client that wants split dwarf would create a
specific kind of object writer that creates two files, and use it to
create the streamer.
Part of PR37466.
Differential Revision: https://reviews.llvm.org/D47050
llvm-svn: 332749
The print format was causing at least 2 unit-test failures from r331971.
The signed/unsigned comparison warnings only appeared to affect two lines but
it was unclear whether it might just pop up on other lines, so I have been
explicit in all the literals in the tests.
There were other bot unit-test failures that I am still investigating.
llvm-svn: 331978
Reviewed by: dblaikie, JDevlieghere, espindola
Differential Revision: https://reviews.llvm.org/D44560
Summary:
The .debug_line parser previously reported errors by printing to stderr and
return false. This is not particularly helpful for clients of the library code,
as it prevents them from handling the errors in a manner based on the calling
context. This change switches to using llvm::Error and callbacks to indicate
what problems were detected during parsing, and has updated clients to handle
the errors in a location-specific manner. In general, this means that they
continue to do the same thing to external users. Below, I have outlined what
the known behaviour changes are, relating to this change.
There are two levels of "errors" in the new error mechanism, to broadly
distinguish between different fail states of the parser, since not every
failure will prevent parsing of the unit, or of subsequent unit. Malformed
table errors that prevent reading the remainder of the table (reported by
returning them) and other minor issues representing problems with parsing that
do not prevent attempting to continue reading the table (reported by calling a
specified callback funciton). The only example of this currently is when the
last sequence of a unit is unterminated. However, I think it would be good to
change the handling of unrecognised opcodes to report as minor issues as well,
rather than just printing to the stream if --verbose is used (this would be a
subsequent change however).
I have substantially extended the DwarfGenerator to be able to handle
custom-crafted .debug_line sections, allowing for comprehensive unit-testing
of the parser code. For now, I am just adding unit tests to cover the basic
error reporting, and positive cases, and do not currently intend to test every
part of the parser, although the framework should be sufficient to do so at a
later point.
Known behaviour changes:
- The dump function in DWARFContext now does not attempt to read subsequent
tables when searching for a specific offset, if the unit length field of a
table before the specified offset is a reserved value.
- getOrParseLineTable now returns a useful Error if an invalid offset is
encountered, rather than simply a nullptr.
- The parse functions no longer use `WithColor::warning` directly to report
errors, allowing LLD to call its own warning function.
- The existing parse error messages have been updated to not specifically
include "warning" in their message, allowing consumers to determine what
severity the problem is.
- If the line table version field appears to have a value less than 2, an
informative error is returned, instead of just false.
- If the line table unit length field uses a reserved value, an informative
error is returned, instead of just false.
- Dumping of .debug_line.dwo sections is now implemented the same as regular
.debug_line sections.
- Verbose dumping of .debug_line[.dwo] sections now prints the prologue, if
there is a prologue error, just like non-verbose dumping.
As a helper for the generator code, I have re-added emitInt64 to the
AsmPrinter code. This previously existed, but was removed way back in r100296,
presumably because it was dead at the time.
This change also requires a change to LLD, which will be committed separately.
llvm-svn: 331971
See r331124 for how I made a list of files missing the include.
I then ran this Python script:
for f in open('filelist.txt'):
f = f.strip()
fl = open(f).readlines()
found = False
for i in xrange(len(fl)):
p = '#include "llvm/'
if not fl[i].startswith(p):
continue
if fl[i][len(p):] > 'Config':
fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
found = True
break
if not found:
print 'not found', f
else:
open(f, 'w').write(''.join(fl))
and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.
No intended behavior change.
llvm-svn: 331184
This patch adds the ability for the ObjectYAML DWARFEmitter to calculate
the lengths of DIEs. This is accomplished by creating a DIEFixupVisitor
class which traverses the DWARF DIEs to calculate and fix up the lengths
in the Compile Unit header.
The DIEFixupVisitor can be extended in the future to enable more complex
fix ups which will enable simplified YAML string representations.
This is also very useful when using the YAML format in unit tests
because you no longer need to know the length of the compile unit when
writing the YAML string.
Differential commandeered from Chris Bieneman (beanz)
Differential revision: https://reviews.llvm.org/D30666
llvm-svn: 330421
These aren't the .def style files used in LLVM that require a macro
defined before their inclusion - they're just basic non-modular includes
to stamp out command line flag variables.
llvm-svn: 329840
While reading Codeview records which contain variable-length encoded integers,
such as LF_BCLASS, LF_ENUMERATE, LF_MEMBER, LF_VBCLASS or LF_IVBCLASS,
the record's size would be improperly calculated in cases where the value was
indeed of a variable length (>= LF_NUMERIC). This caused a bad alignement on
the next record, which would/might crash later on.
Differential Revision: https://reviews.llvm.org/D45104
llvm-svn: 329659
There are two FPMs in an MSF file, the idea being that for
incremental updates you can write to the alternate one and then
atomically swap them on commit. LLVM defaulted to using FPM1
on the first commit, but this differs from Microsoft's behavior
which is to default to using FPM2 on the first commit. To
eliminate some byte-level file differences, this patch changes
LLVM's default to also be FPM2.
Additionally, LLVM was trying to be "smart" about marking FPM
pages allocated. In addition to marking every page belonging
to the alternate FPM as unallocated, LLVM also marked pages at
the end of the main FPM which were not needed as unallocated.
In order to match the behavior of Microsoft-generated PDBs, we
now always mark every FPM block as allocated, regardless of
whether it is in the main FPM or the alt FPM, and regardless of
whether or not it describes blocks which are actually in the file.
This has the side benefit of simplifying our code.
llvm-svn: 328812
To resolve symbol context at a particular address, we need to
determine the compiland for the address. We are able to determine
the parent compiland of PDBSymbolFunc, PDBSymbolTypeUDT,
PDBSymbolTypeEnum symbols indirectly through line information.
However no such information is availabile for PDBSymbolData,
i.e. variables.
The Section Contribution table from PDBs has information about
each compiland's contribution to sections by address. For example,
a piece of a contribution looks like,
VA RelativeVA Sect No. Offset Length Compiland
14000087B0 000087B0 0001 000077B0 000000BB exe_main.obj
So given an address, it's possible to determine its compiland with
this information.
llvm-svn: 328178
The hash table is a list of buckets, and the *value* stored in
the bucket cannot be 0 since that is reserved. However, the code
here was incorrectly skipping over the 0'th bucket entirely.
The 0'th bucket is perfectly fine, just none of these buckets
can contain the value 0.
As a result, whenever there was a string where hash(S) % Size
was equal to 0, we would write the value in the next bucket
instead. We never caught this in our tests due to *another*
bug, which is that we would iterate the entire list of buckets
looking for the value, only using the hash value as a starting
point. However, the real algorithm stops when it finds 0 in
a bucket since it takes that to mean "the item is not in the
hash table".
The unit test is updated to carefully construct a set of hash
values that will cause one item to hash to 0 mod bucket count,
and the reader is also updated to return an error indicating that
the item is not found when it encounters a 0 bucket.
llvm-svn: 328162