This will ensure that nothing can ever start parsing data from a future
sequence and part-read data will be returned as 0 instead.
Reviewed by: aprantl, labath
Differential Revision: https://reviews.llvm.org/D80796
The debug_line_invalid.test test case was previously using the
interpreted line table dumping to identify which opcodes have been
parsed. This change moves to looking for the expected opcodes
explicitly. This is probably a little clearer and also allows for
testing some cases that wouldn't be easily identifiable from the
interpreted table.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D80795
For most tables, we already use commas in headers. This set of patches
unifies dumping the remaining ones.
Differential Revision: https://reviews.llvm.org/D80806
For describing section/symbol names we can use unique suffixes,
e.g:
```
- Name: '.foo [1]`
- Name: '.foo [2]`
```
It can be a problem (see https://reviews.llvm.org/D79984#inline-734829),
because `[]` are sometimes used to describe a macros:
```
- Name: "[[a0]]"
```
Seems the better approach is to use something else, like "()".
This patch does it and refactors the code related.
Differential revision: https://reviews.llvm.org/D80123
The patch changes dumping of a unit_length field and offsets in headers
in .debug_loclists and .debug_rnglists sections so that they are printed
as 16-digit hex values if the contribution is in the DWARF64 format.
Differential Revision: https://reviews.llvm.org/D79997
The patch changes dumping of unit_length and header_length fields in
headers in .debug_line sections so that they are printed as 16-digit hex
values if the contribution is in the DWARF64 format.
Differential Revision: https://reviews.llvm.org/D79997
The patch changes dumping of the unit_length field in a unit header so
that it is printed as a 16-digit hex value if the unit is in the DWARF64
format.
Differential Revision: https://reviews.llvm.org/D79997
It looks like that was an initial intention, but some code paths in
`DWARFExpression::Operation::extract()` did not initialize `EndOffset`
properly.
Differential Revision: https://reviews.llvm.org/D79622
This addresses:
-Clean up the source code
-Refactor the JSON fields
-Fix the test cases
-Improve the docs for the stats output
Differential Revision: https://reviews.llvm.org/D77789
Summary:
Without this we could silently accept an invalid prologue because the
default DataExtractor behavior is to return an empty string when
reaching the end of file. And empty string is also used to terminate
these lists.
This makes the parsing code slightly more complicated, but this
complexity will go away once the parser starts working with truncating
data extractors. The reason I am doing it this way is because without
this, the truncation would regress the quality of error messages (right
now, we produce bad error messages only near EOF, but truncation would
make everything behave as if it was near EOF).
Reviewers: dblaikie, probinson, jhenderson
Subscribers: hiraditya, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77555
There is no need to use `--check-prefix` multiple times.
It helps to improve readability/test maintainability.
This patch does it for all tools at once.
Differential revision: https://reviews.llvm.org/D78217
Originally committed as 416fa7720e
Reverted (due to buildbot failure - breaking lldb) in 7a45aeacf3.
I still can't seem to build lldb locally, but Pavel Labath has kindly
provided a potential fix to preserve the old behavior in lldb by
registering a simple recoverable error handler there that prints to the
desired stream in lldb, rather than stderr.
Summary:
Without that we could be silently reading zeroes, as that's the default
DataExtractor behavior. The entire parse would still most likely fail,
but it would do that with a seemingly unrelated/nonsensical error
message.
Reviewers: dblaikie, probinson, jhenderson
Subscribers: hiraditya, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77554
This probably isn't ideal - the error was being printed specifically
inline with the dumping that was more legible - but then the error
wasn't reported to stderr and didn't produce a non-zero exit code.
Probably the error message could be improved by adding more context now
that it isn't printed in-situ of the DIE dumping as much.
Makes it easier to test "this doesn't produce an error" (& indeed makes
that the implied default so we don't accidentally write tests that have
silent/sneaky errors as well as the positive behavior we're testing for)
Though the support for applying relocations is patchy enough that a
bunch of tests treat lack of relocation application as more of a warning
than an error - so rather than me trying to figure out how to add
support for a bunch of relocation types, let's degrade that to a warning
to match the usage (& indeed, it's sort of more of a tool warning anyway
- it's not that the DWARF is wrong, just that the tool can't fully cope
with it - and it's not like the tool won't dump the DWARF, it just won't
follow/render certain relocations - I guess in the most general case it
might try to render an unrelocated value & instead render something
bogus... but mostly seems to be about interesting relocations used in
eh_frame (& honestly it might be nice if we were lazier about doing this
relocation resolution anyway - if you're not dumping eh_frame, should we
really be erroring about the relocations in it?))
Add an option to llvm-dwarfdump to calculate the bytes within
the debug sections. Dump this numbers when using --statistics
option as well.
This is an initial patch (e.g. we should support other units,
since we only support 'bytes' now).
Differential Revision: https://reviews.llvm.org/D74205
Summary:
The directory_count and file_name_count fields are (section 6.2.4 of
DWARF5 spec) supposed to be uleb128s, not bytes. This bug meant that it
was not possible to correctly parse headers with more than 128 files or
directories.
I've found this bug by code inspection, though the limit is so small
someone would have run into it for real sooner or later. I've verified
that the producer side handles many files correctly, and that we are
able to parse such files after this fix.
Reviewers: dblaikie, jhenderson
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76498
Summary:
This is a preparatory change for allowing LLVM to emit DW_OP_convert
operations converting to the generic type.
If DW_OP_convert's operand is 0, it converts the top of stack to the
generic type, as specified by DWARFv5 section 2.5.1.6:
"[...] takes one operand, which is an unsigned LEB128 integer that
represents the offset of a debugging information entry in the current
compilation unit, or value 0 which represents the generic type."
This adds support for such operations to llvm-dwarfdump.
Reviewers: aprantl, markus, jdoerfert, jhenderson
Reviewed By: aprantl
Subscribers: hiraditya, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D76141
This fixes printing long values that might reside in CIE and FDE,
including offsets, lengths, and addresses.
Differential Revision: https://reviews.llvm.org/D73887
Summary:
In this patch I've done a slightly bigger rewrite to also remove the
hardcoded header lengths.
Reviewers: jhenderson, dblaikie, ikudrin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75119
Summary:
This could be considered obvious, but I am putting it up to illustrate
the usefulness/impact of the getInitialLength change.
Reviewers: dblaikie, jhenderson, ikudrin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75117
The patch was reverted in 69da40033 because of test failures on windows.
The problem was the unpredictable order of some of the error messages,
which I've tried to strenghten in that patch.
It turns out this is not possible to do in verbose mode because there
the data is being writted as it is being parsed. No amount of flushing
(as I've done in the non-verbose mode) will help that. Indeed, even
without any buffering the warning messages can end in the middle of a
line in non-verbose mode.
In this patch, I have reverted the changes which tested the relative
position of the warning message, except for the messages about
unsupported initial length, which are the ones I really wanted to test,
and which do come out reasonably.
The original commit message was:
This patch if motivated by D74560, specifically the subthread about what
to print upon encountering reserved initial length values.
If the debug_line prologue has an unsupported version, we skip parsing
the rest of the data. If we encounter an reserved initial length field,
we don't even parse the version. However, we still print out all members
(with value 0) in the dump function.
This patch introduces early exits in the Prologue::dump function so that
we print only the fields that were parsed successfully. In case of an
unsupported version, we skip printing all subsequent prologue fields --
because we don't even know if this version has those fields. In case of a
reserved unit length, we don't print anything -- if the very first field
of the prologue is invalid, it's hard to say if we even have a prologue
to begin with.
Note that the user will still be able to see the invalid/reserved
initial length value in the error message. I've modified (reordered)
debug_line_invalid.test to show that the error message comes straight
after the debug_line offset. I've also added some flush() calls to the
dumping code to ensure this is the case in all situations (without that,
the warnings could get out of sync if the output was not a terminal -- I
guess this is why std::iostreams have the tie() function).
Reviewers: jhenderson, ikudrin, dblaikie
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75043
Summary:
This patch if motivated by D74560, specifically the subthread about what
to print upon encountering reserved initial length values.
If the debug_line prologue has an unsupported version, we skip parsing
the rest of the data. If we encounter an reserved initial length field,
we don't even parse the version. However, we still print out all members
(with value 0) in the dump function.
This patch introduces early exits in the Prologue::dump function so that
we print only the fields that were parsed successfully. In case of an
unsupported version, we skip printing all subsequent prologue fields --
because we don't even know if this version has those fields. In case of a
reserved unit length, we don't print anything -- if the very first field
of the prologue is invalid, it's hard to say if we even have a prologue
to begin with.
Note that the user will still be able to see the invalid/reserved
initial length value in the error message. I've modified (reordered)
debug_line_invalid.test to show that the error message comes straight
after the debug_line offset. I've also added some flush() calls to the
dumping code to ensure this is the case in all situations (without that,
the warnings could get out of sync if the output was not a terminal -- I
guess this is why std::iostreams have the tie() function).
Reviewers: jhenderson, ikudrin, dblaikie
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75043
While the value of the CIE pointer field in a DWARF FDE record is
an offset to the corresponding CIE record from the beginning of
the section, for EH FDE records it is relative to the current offset.
Previously, we did not make that distinction when dumped both kinds
of FDE records and just printed the same value for the CIE pointer
field and the CIE offset; that was acceptable for DWARF FDEs but was
wrong for EH FDEs.
This patch fixes the issue by explicitly printing the offset of the
linked CIE object.
Differential Revision: https://reviews.llvm.org/D74613
A future MC change may add a warning/error when a .section directive
specifies incorrect sh_flags/sh_type. Fix the tests to use correct
sh_flags/sh_type.
This patch enables the debug entry values feature.
- Remove the (CC1) experimental -femit-debug-entry-values option
- Enable it for x86, arm and aarch64 targets
- Resolve the test failures
- Leave the llc experimental option for targets that do not
support the CallSiteInfo yet
Differential Revision: https://reviews.llvm.org/D73534
We do not keep the actual value of the CIE ID field, because it is
predefined, and use a constant when dumping a CIE record. The issue
was that the predefined value is different for .debug_frame and
.eh_frame sections, but we always printed the one which corresponds
to .debug_frame. The patch fixes that by choosing an appropriate
constant to print.
See the following for more information about .eh_frame sections:
https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html
Differential Revision: https://reviews.llvm.org/D73627
The DWARFv2-4 specification for the line table header states that the
include directories and file name tables both end with a single null
byte. Prior to this change, the parser did not detect if this byte was
missing, because it also stopped reading the tables once it reached the
prologue end, as claimed by the header_length field. This change adds a
check that the terminator has been seen at the end of each table.
Reviewed by: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D74413
The number of standard opcodes is defined to be opcode_base - 1, so a
value of 0 for the opcode_base caused a crash as an attempt was made to
reserve many entries in a vector. This change fixes the crash, by
issuing a warning and skipping reading of standard opcode lengths in the
event of an opcode_base of 0.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D74309
Also remove some test duplication and add a test case that shows the
maximum version is rejected (this also shows that the value in the error
message is actually in decimal, and not just missing an 0x prefix).
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D74403
This patch enables the debug entry values feature.
- Remove the (CC1) experimental -femit-debug-entry-values option
- Enable it for x86, arm and aarch64 targets
- Resolve the test failures
- Leave the llc experimental option for targets that do not
support the CallSiteInfo yet
Differential Revision: https://reviews.llvm.org/D73534
The patch removes unnecessary members of DWARFDebugAddr and further
simplifies the implementation by separating parsing methods of tables
in the DWARFv5 and pre-standard formats.
Differential Revision: https://reviews.llvm.org/D74197
As a preparation for the subsequent patches, this updates the wordings
of some error messages in DWARFDebugAddr.
Differential Revision: https://reviews.llvm.org/D74196
This replaces a collocation "a .debug_addr table" with "an address table"
because the latter sounds more accurate.
Differential Revision: https://reviews.llvm.org/D74407
As there is no header in pre-DWARFv5 address tables, and we fill
the class data members with some artificial values, we should not
dump them as that might be misleading.
Differential Revision: https://reviews.llvm.org/D74195
As addresses in the address tables may have relocations, thus,
the relocations should be resolved to read the correct address.
That is especially important for targets that use RELA relocations
because in that case addends are stored in relocation sections.
Differential Revision: https://reviews.llvm.org/D74404
The DebugInfo/dwarfdump-invalid-line-table test used a pre-canned binary
generated by a fuzzer to demonstrate a bug fix. Unfortunately, the
binary is rigid and requires hand-editing if we change behaviour, such
as rejecting certain properties within it (as I plan on doing in another
change).
Rather than hand-edit the binary, I have replaced it with two tests. The
first tests the high-level code path from the debug line parser that
produces the same error as this test previously did, and the second is a
set of unit test cases that comprehensively cover the
FormValue::skipValue method, which in turn covers the area that the
original bug fix touched.
Reviewed by: MaskRay, dblaikie
Differential Revision: https://reviews.llvm.org/D74202
If dumping an Split DWARF file that hasn't been split into separate
files (such as from llc - that includes the plain and .dwo sections in
the same file) allow both macinfo and macinfo.dwo sections to be dumped.
Many of the debug line prologue errors are not inherently fatal. In most
cases, we can make reasonable assumptions and carry on. This patch does
exactly that. In the case of length problems, the approach of "assume
stated length is correct" is taken which means the offset might need
adjusting.
This is a relanding of b94191fe, fixing an LLD test and the LLDB build.
Reviewed by: dblaikie, labath
Differential Revision: https://reviews.llvm.org/D72158
It isn't known how many times we've seen the same variable or member in
the global scope (unlike in functions), but there still can be some duplicates
among different CUs.
So, this patch proposes to count variables in the global scope just as a sum of
the number of vars, constant members and artificial entities.
Reviewed by: aprantl
Differential Revision: https://reviews.llvm.org/D73004
A few DW_TAG_formal_parameter's of the same function may have the same
name (e.g. variadic (template) functions) or don't have a name at all
(if the parameter isn't used inside the function body), but we still
need to be able to distinguish between them to get correct number of 'total vars'
and 'availability' metric.
Reviewed by: aprantl
Differential Revision: https://reviews.llvm.org/D73003
Here may be more than one out-of-line instance of the same function
among different CUs. All of them should be accounted for to get an accurate
total number of variables/parameters.
Reviewed by: aprantl
Differential Revision: https://reviews.llvm.org/D73002
DW_TAG_subroutine_type is not really useful for statistics purposes, as it never
has location information. But it may contain DW_TAG_formal_parameter
children that generate number of parameters w/o location and decrease
'availability' metric significantly.
Reviewed by: djtodoro
Differential Revision: https://reviews.llvm.org/D72983
Different variables and functions might have the same name in different CU.
To calculate 'Availability' metric more accurate (i.e. to avoid getting
availability above 100%), we need to have some additional logic to
distinguish between them.
The patch introduces a DIE identifier that consists of a function/variable name
and declaration information: a filename and a line number. This allows
distinguishing different functions/variables (different means declared in
different files/lines) with the same name, keeping duplicates counted
as duplicates.
Reviewed by: aprantl, djtodoro
Differential Revision: https://reviews.llvm.org/D72797
Many of the debug line prologue errors are not inherently fatal. In most
cases, we can make reasonable assumptions and carry on. This patch does
exactly that. In the case of length problems, the approach of "the
claimed length is correct" is taken to be consistent with other
instances such as the SectionParser, which ignores the read length.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72158
A subsequent patch will change how an invalid file name table is handled
to allow parsing to continue. This patch adds a test case that will
demonstrate a difference in behaviour with that change between invalid
file tables where the error is before the end of the stated prologue
length and where the error occurs after the stated length.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72157
It is possible to try to keep parsing a debug line program even when the
length of an extended opcode does not match what is expected for that
opcode. This patch changes what was previously a fatal error to be
non-fatal. The parser now continues by assuming the the claimed length
is correct, even if it means moving the offset backwards.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72155
This helps to detect and report parsing errors better.
The patch follows the ideas of LLDB's patches D59370 and D59381.
It adds tests for valid and some invalid cases. More checks and
tests to come. Note that the patch fixes validation of the Length
field because the value does not include the field itself.
The existing users are updated to show the error messages.
Differential Revision: https://reviews.llvm.org/D71875
Summary:
This patch implements `formatv()` formatting for `dwarf::LineNumberOps`
and makes use of it for the `llvm-dwarfdump --debug-line` dump.
Previously, unknown line number standard opcodes would lead to undefined
behaviour. The code would attempt to format the data pointer of an empty
`StringRef` (a null pointer) using `%s`. According to the description
for `format()`, use of that interface carries the "risk of `printf`".
Passing a null pointer in place of an array to a C library function
results in undefined behaviour.
Reviewers: jhenderson, daltenty, stevewan
Reviewed By: jhenderson
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72369
Unlike most of our errors in the debug line parser, the "no end of
sequence" message was missing any reference to which line table it
refererred to. This change adds the offset to this message.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72443
The V5 directory and filename tables had checks in to make sure we
hadn't read past the end of the line table prologue. Since previous
changes to the data extractor class ensure we never read past the end,
these checks are now redundant, so this patch removes them.
There is still a check to show that the whole prologue remains within
the prologue length.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D71768
This removes the need to duplicate the LASTONLY check pattern and the
last part of the NONFATAL pattern in the modified test.
Reviewed By: MaskRay, JDevlieghere
Differential Revision: https://reviews.llvm.org/D71757
The line tables in debug_line_malformed.s had contents that varied more
than was necessary for the testing, making it harder to follow what was
important. This patch normalises them so that they all share
more-or-less the same body. Additionally, it makes the testing for what
was printed more consistent, to show that the right parts of the line
table prologue and body are/are not parsed and printed.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D71755
Some of the tables in debug_line_malformed.s were not being checked in
the NONFATAL checks in debug_line_invalid.test (only the warnings coming
from them were being checked). This made the test harder to follow.
Additionally, a later change will change the way the errors are handled
such that more of the line table will be printed. That will require
checks for these tables (or something equivalent) so that the difference
in behaviour can be observed. This patch adds checks for the three
tables that were missing checks.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D71753
This patch adds and improves comments in the debug_line_invalid.test and
its associated input file so that it is easier to follow. It uses '##'
to make comments stand out from lit and FileCheck commands.
It also reflows some commands so that the lines are not so long and are
easier to read and fixes some copy/paste errors.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D71752
Now that DWARFv5 provides a way to identify DWARF expressions based on
form, rather than only by attribute - use it to always provide pretty
printing for any exprloc attribute, not only the attributes known to
contain expressions.
The debug line verbose printing was printing the wrong values for rows
added via DW_LNE_end_sequence, because the row was being printed AFTER
its state had been reset following it being appended to the line table.
This patch fixes this issue by printing the row before appending it.
Reviewers: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D71664
Commit 84a9756 added an extra blank line at the end of any line table.
However, a blank line is also printed after the line table header, which
meant that two blank lines in a row were being printed after a header,
if there were no rows. This patch defers the post-header blank line
printing until it has been determined that there are rows to print.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D71540
Summary:
This is a follow up for D70548.
Currently, variables with debug info coverage between 0% and 1% are put into
zero-bucket. D70548 changed the way statistics calculate a variable's coverage:
we began to use enclosing scope rather than a possible variable life range.
Thus more variables might be moved to zero-bucket despite they have some debug
info coverage.
The patch is to distinguish between a variable that has location info but
it's significantly less than its enclosing scope and a variable that doesn't
have it at all.
Reviewers: djtodoro, aprantl, dblaikie, avl
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71070
Summary:
This changes the representation of 'coverage buckets' in llvm-dwarfdump and
llvm-locstats to one that makes more clear what the buckets contain.
See some related details in D71070.
Reviewers: djtodoro, aprantl, cmtice, jhenderson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71366
This helps delineate it in the output from later tables or other output.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D71344
A number of the --debug-* options in llvm-dwarfdump are not particularly
well tested. In some cases, the option is only tested as part of testing
another feature, or a specific part of the section that the options
dump. This change adds four new tests to address some of these holes. It
is not aiming to address every hole however.
I kept the --debug-line switch test separate to X86/brief.s because the
latter only considers the parts of the line table that are affected by
verbose printing, thus missing out things like the header and different
values for things like the Line, Column etc registers.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D71276
Summary:
The patch removes OffsetToFirstDefinition in the 'scope bytes total'
statistic computation. Thus it unifies the way the scope and the coverage
buckets are computed. The rationals behind that are the following:
1. OffsetToFirstDefinition was used to calculate the variable's life range.
However, there is no simple way to do it accurately, so the scope calculated
this way might be misleading. See D69027 for more details on the subject.
2. Both 'scope bytes total' and coverage buckets seem to be intended
to represent the same data in different ways. Otherwise, the statistics
might be controversial and confusing.
Note that the approach gives up a thorough evaluation of debug information
completeness (i.e. coverage buckets by themselves doesn't tell how good
the debug information is). Only changes in coverage over time make
a 'physical' sense.
Reviewers: djtodoro, aprantl, vsk, dblaikie, avl
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70548
Summary:
Currently these function return the raw content of the appropriate table
header, which means they are relative to the DW_AT_{loc,rng}list_base,
and one has to relocate them in order to do anything.
This changes the functions to perform the relocation themselves, which
seems more clearer, particularly as they are sitting right next to the
find{Rng,Loc}listFromOffset functions, but one *cannot* simply take the
result of these functions and take pass them there.
The only effect of this patch is to change what value is dumped for the
DW_AT_ranges attribute, which I think is for the better, as previously
the values appeared to point into thin air.
(The main reason I am looking at this is because I was trying to
implement equivalent functionality in lldb's DWARFUnit, and was stumped
by this behavior.
Reviewers: dblaikie, JDevlieghere, aprantl
Subscribers: hiraditya, llvm-commits, SouraVX
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71006
Summary:
lldb's loclists parser has support for DW_LLE_start_end(x) encodings. To
avoid regressing when switching the implementation to llvm's, I add
parsing support for all previously unsupported location list encodings.
Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, probinson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70949
Summary:
This does exactly what it says on the box. The only small gotcha is the
section index computation for offset_pair entries, which can use either
the base address section, or the section from the offset_pair entry.
This is to support both the cases where the base address is relocated
(points to the base of the CU, typically), and the case where the base
address is a constant (typically zero) and relocations are on the
offsets themselves.
Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, llvm-commits, probinson
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70540
Summary:
Instead of going to the debug_loc section directly, use new
DWARFDie::getLocations instead. This means that the code will now
automatically support debug_loclists sections.
This is the last usage of the old debug_loc methods, and they can now be
removed.
Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, probinson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70534
Summary:
This patch removes manual location list handling in the statistics code
and replaces it with the new DWARFDie api, which provides access to a
"cooked" location list. This has the following effects:
- the code now properly handles split-dwarf location lists
- it will automatically support dwarf5 location lists once support for
those is added
- it properly handles location lists with base address selection entries
- it fixes a bug where the location list code was using the first
DW_AT_ranges range as a "base address" of the compile unit (it should
have used DW_AT_low_pc instead. The effect of this was that the
computation of the start address of a variable in its scope was broken
for these kinds of compile units. This only manifested itself on
linked files, since in object files the first DW_AT_ranges range
normally starts at 0.
Since pretty much every kind of location list was broken in some way,
it's hard to verify that the new implementation is correct -- the output
will be different in all non-trivial cases, and mostly with good reason.
Most of the existing statistics tests continue to pass though, and a
visual inspection of the statistics for non-trivial inputs shows that
the data is more "reasonable" now. I have updated the "dwo statistics"
test to include the new numbers, as the previous ones were completely
bogus, and I have added a targeted test for the "base address" bug.
Reviewers: dblaikie, cmtice, vsk
Subscribers: aprantl, SouraVX, JDevlieghere, djtodoro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70444
This patch replaces the tabs by spaces and avoid the need for a
debug_str section by moving all strings inline. It also removes the
hardcoded DIE offsets in the test, which will simplify a follow-up
patch.
Summary:
This adds a visitLocationList function to the DWARF v4 location lists,
similar to what already exists for DWARF v5. It follows the approach
outlined in previous patches (D69672), where the parsed form is always
stored in the DWARF v5 format, which makes it easier for generic code to
be built on top of that. v4 location lists are "upgraded" during
parsing, and then this upgrade is undone while dumping.
Both "inline" and section-based dumping is rewritten to reuse the
existing "generic" location list dumper. This means that the output
format is consistent for all location lists (the only thing one needs to
implement is the function which prints the "raw" form of a location
list), and that debug_loc dumping correctly processes base address
selection entries, etc.
The previous existing debug_loc functionality (e.g.,
parseOneLocationList) is rewritten on top of the new API, but it is not
removed as there is still code which uses them. This will be done in
follow-up patches, after I build the API to access the "interpreted"
location lists in a generic way (as that is what those users really
want).
Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69847
Summary:
This removes the use of zero as a base address in section-based dumping.
Although this will often be true for (unlinked) object files with a
single compile unit, it is not true in general. This means that
section-based dumping will not be able to resolve entries referencing
the base address (DW_LLE_offset_pair) -- it wasn't able to do that
correctly before either, but now it will be more explicit about it. One
exception to that is if the location list contains an explicit
DW_LLE_base_address entry -- in this case the dumper will pick it up,
and resolve subsequent entries normally.
The patch also removes the fallback to zero in the "inline" dumping in
case the compile unit does not contain a base address.
Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70115
Summary:
This patch extracts the logic for computing the "absolute" locations,
which was partially present in the debug_loclists dumper, completes it,
and moves it into a separate function. This makes it possible to later
reuse the same logic for uses other than dumping.
The dumper is changed to reuse the location list interpreter, and its
format is changed somewhat. In "verbose" mode it prints the "raw" value
of a location list, the interpreted location (if available) and the
expression itself. In non-verbose mode it prints only one of the
location forms: it prefers the interpreted form, but falls back to the
"raw" format if interpretation is not possible (for instance, because we
were not given a base address, or the resolution of indirect addresses
failed).
This patch also undos some of the changes made in D69672, namely the
part about making all functions static. The main reason for this is that
I learned that the original approach (dumping only fully resolved
locations) meant that it was impossible to rewrite one of the existing
tests. To make that possible (and make the "inline location" dump work
in more cases), I now reuse the same dumping mechanism as is used for
section-based dumping. As this required having more objects know about
the various location lists classes, it seemed like a good idea to create
an interface abstracting the difference between them.
Therefore, I now create a DWARFLocationTable class, which will serve as
a base class for the location list classes. DWARFDebugLoclists is made
to inherit from that. DWARFDebugLoc will follow.
Another positive effect of this change is that section-based dumping
code will not need to use templates (as originally) envisioned, and that
the argument lists of the dumping functions become shorter.
Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70081
Summary:
This patch stems from the discussion D68270 (including some offline
talks). The idea is to provide an "incremental" api for parsing location
lists, which will avoid caching or materializing parsed data. An
additional goal is to provide a high level location list api, which
abstracts the differences between different encoding schemes, and can be
used by users which don't care about those (such as LLDB).
This patch implements the first part. It implements a call-back based
"visitLocationList" api. This function parses a single location list,
calling a user-specified callback for each entry. This is going to be
the base api, which other location list functions (right now, just the
dumping code) are going to be based on.
Future patches will do something similar for the v4 location lists, and
add a mechanism to translate raw entries into concrete address ranges.
Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69672
Summary:
Handling relocations was not needed when the loclists section was a
DWO-only thing. But since DWARF5, it is possible to use it in regular
objects too, and the standard permits embedding addresses into the
section directly. These addresses need to be relocated in unlinked
files.
Reviewers: JDevlieghere, dblaikie, probinson
Subscribers: aprantl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68271
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.
Differential Revision: https://reviews.llvm.org/D66526
llvm-svn: 372554
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.
The tool will be very useful for tracking improvements regarding the
"debugging optimized code" support with LLVM ecosystem.
Differential Revision: https://reviews.llvm.org/D66526
llvm-svn: 371520
The additional fields will be parsed by the llvm-locstats tool in order to
produce more human readable output of the DWARF debug location quality
generated.
Differential Revision: https://reviews.llvm.org/D66525
llvm-svn: 371506
Verify that the call site DWARF symbols (added during the implementation
of the debug entry values feature) are generated properly.
Differential Revision: https://reviews.llvm.org/D66865
llvm-svn: 370631
The type_offset field is 8 bytes long in DWARF64. The patch extends
TypeOffset to uint64_t and fixes its reading. The patch also fixes
checking of TypeOffset bounds as it was inaccurate in DWARF64 case.
Differential Revision: https://reviews.llvm.org/D66465
llvm-svn: 369378
This patch exnteds the error handling in the debug line parser to get
rid of the existing MD5 assertion. I want to reuse the debug line parser
from LLVM in LLDB where we cannot crash on invalid input.
Differential revision: https://reviews.llvm.org/D64544
llvm-svn: 366762
Dump the DWARF information about call sites and call site parameters into
debug info sections.
The patch also provides an interface for the interpretation of instructions
that could load values of a call site parameters in order to generate DWARF
about the call site parameters.
([13/13] Introduce the debug entry values.)
Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D60716
llvm-svn: 365467
Add the IR and the AsmPrinter parts for handling of the DW_OP_entry_values
DWARF operation.
([11/13] Introduce the debug entry values.)
Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D60866
llvm-svn: 364542
This adds `-parent-recurse-depth` which limits the number of parent DIEs
being dumped.
Differential revision: https://reviews.llvm.org/D62359
llvm-svn: 361671
Add check for, and parsing of, .dwo files to Statistics.cpp; create a new getNon
SkeletonUnitDie function for DWARFUnit.h
Reviewers: dblaikie
Differential Revision: https://review.llvm.org/D61755
llvm-svn: 360380
Summary:
Make DW_LNS_copy set the discriminator register to 0, to conform to
DWARF 4 & 5: "Then it sets the discriminator register to 0, and sets the
basic_block, prologue_end and epilogue_begin registers to false."
Because all of DW_LNE_end_sequence, DN_LNS_copy, and special opcodes reset
discriminator to 0, we can move discriminator=0 to appendRowToMatrix.
Also, make DW_LNS_copy print before appending the row, as it is similar
to a address+=0,line+=0 special opcode, which prints before appending
the row.
Reviewers: dblaikie, probinson, aprantl
Reviewed By: dblaikie
Subscribers: danielcdh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60364
llvm-svn: 358148
The standard doesn't require a DW_TAG_variable, DW_TAG_formal_parameter
or DW_TAG_constant to A DW_AT_type attribute describing the type of the
variable. It only specifies that it *can* have one.
llvm-svn: 357628
When dumping ToT clan's debug info with dwarfdump, we were seeing an
error saying that that the location list overflows the debug_loc
section. After reducing the testcase we figured out that we were
interpreting the DW_FORM_data4 as a section offset.
In DWARF3 DW_FORM_data4 and DW_FORM_data8 served also as a section
offset. Until now we didn't check check for the DWARF version, because
some producers (read old versions of clang) were still emitting this.
The relevant code/comment was added in 2013, and I believe it's now
reasonable to start checking the version.
The FormValue class is a little bit of a mess because it cashes the
DWARF unit and context when it extracted the value itself. Several
methods of the class rely on it being present, or return an Optional for
the code path that needs it. At the same time the FormValue class also
used in places where there's no DWARF unit.
For this patch I went with the least invasive change: checking the
version from the CU when it's available. If it's not (because the form
value was created from a value directly) we default to the old behavior.
Differential revision: https://reviews.llvm.org/D58698
llvm-svn: 355456
Add statistics for abstract origins, function, variable and parameter
locations; break the 'variable' counts down into variables and
parameters. Also update call site counting to check for
DW_AT_call_{file,line} in addition to DW_TAG_call_site.
Differential revision: https://reviews.llvm.org/D58849
llvm-svn: 355243
Adds llvm-dwarfdump support for pretty printing Dwarf5 expressions ops
that reference a base type (right now only DW_OP_convert is added).
Includes verification to verify that the ops operand is actually a
DW_TAG_base_type DIE.
Differential Revision: https://reviews.llvm.org/D58442
llvm-svn: 354552
DW_TAG_subprogram DIEs should not be counted in the inlined function statistic. This also addresses the source variables count, as that uses the inlined function count in its calculations.
Differential revision: https://reviews.llvm.org/D57849
llvm-svn: 353491
The wrong variable was being used when printing the address increment in
verbose output of .debug_line. This patch fixes this.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D57693
llvm-svn: 353288
This is to address post-commit feedback from Paul Robinson on r348954.
The original commit misinterprets count and upper bound as the same thing (I thought I saw GCC producing an upper bound the same as Clang's count, but GCC correctly produces an upper bound that's one less than the count (in C, that is, where arrays are zero indexed)).
I want to preserve the C-like output for the common case, so in the absence of a lower bound the count (or one greater than the upper bound) is rendered between []. In the trickier cases, where a lower bound is specified, a half-open range is used (eg: lower bound 1, count 2 would be "[1, 3)" and an unknown parts use a '?' (eg: "[1, ?)" or "[?, 7)" or "[?, ? + 3)").
Reviewers: aprantl, probinson, JDevlieghere
Differential Revision: https://reviews.llvm.org/D55721
llvm-svn: 349670
This is https://bugs.llvm.org/show_bug.cgi?id=39992,
If we have the following code (test.cpp):
thread_local int tdata = 24;
and build an .o file with debug information:
clang --target=x86_64-pc-linux -c bar.cpp -g
Then object produced may have R_X86_64_DTPOFF64/R_X86_64_DTPOFF32 relocations.
(clang emits R_X86_64_DTPOFF64 and gcc emits R_X86_64_DTPOFF32 for the code above for me)
Currently, llvm-dwarfdump fails to compute this TLS relocation when dumping
object and reports an
error:
failed to compute relocation: R_X86_64_DTPOFF64, Invalid data was encountered while parsing the file
This relocation represents the offset in the TLS block and resolved by the linker,
but this info is unavailable at the
point when the object file is dumped by this tool.
The patch adds the simple evaluation for such relocations to avoid emitting errors.
Resulting behavior seems to be equal to GNU dwarfdump.
Differential revision: https://reviews.llvm.org/D55762
llvm-svn: 349476
Doesn't handle varargs and other fun things, but it's a start. (also
doesn't print these strictly as valid C++ when it's a pointer to
function, it'll print as "void(int)*" instead of "void (*)(int)")
llvm-svn: 348965
The test was fully rewritten for simplification.
New test code was suggested by David Blaikie.
Differential revision: https://reviews.llvm.org/D55261
llvm-svn: 348464
When there is no .debug_addr section for some reason,
llvm-dwarfdump would print the bogus empty section name when dumping ranges
in .debug_info:
DW_AT_ranges [DW_FORM_rnglistx] (indexed (0x0) rangelist = 0x00000004
[0x0000000000000000, 0x0000000000000001) ""
[0x0000000000000000, 0x0000000000000002) "")
That happens because of the code which uses 0 (zero) as a section index as a default value.
The code should use -1ULL instead because technically 0 is a valid zero section index
in ELF and -1ULL is a special constant used that means "no section available".
This is mostly a fix for the overall correctness/safety of the code,
but a test case is provided too.
Differential revision: https://reviews.llvm.org/D55113
llvm-svn: 348115
Adding functionality to the DWARF verifier for DWARF v5 strx* forms which
index into the string offsets table.
Differential Revision: https://reviews.llvm.org/D54049
llvm-svn: 346061
Relocatable content may have overlapping ranges until the sections are
finalized. This reduces the amount of verification that is done on an object
file so that invalid errors are not raised.
llvm-svn: 345441
As was already mentioned in comments for D53364, DWARF 5
spec says about DW_LLE_startx_length:
"This is a form of bounded location description that has two unsigned ULEB operands.
The first value is an address index (into the .debug_addr section) that indicates the beginning of the address range
over which the location is valid. The second value is the length of the range. ")
Currently, the length is always parsed as U32.
Patch change the behavior to parse DW_LLE_startx_length as ULEB128 for DWARF 5
and keeps it as U32 for DWARF4+(pre-DWARF5) for compatibility.
Differential revision: https://reviews.llvm.org/D53564
llvm-svn: 345254