Summary:
This adds support for DWARF5 location lists which are specified
indirectly, via an index into the debug_loclists offset table. This
includes parsing the DW_AT_loclists_base attribute which determines the
location of this offset table, and support for new form DW_FORM_loclistx
which is used in conjuction with DW_AT_location to refer to the location
lists in this way.
The code uses the llvm class to parse the offset information, and I've
also tried to structure it similarly to how the relevant llvm
functionality works.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71268
In DWARF5 DW_AT_low_pc (and DW_AT_entry_pc, and possibly others) can use
DW_FORM_addrx to refer to the address indirectly. This means we need to
have processed the DW_AT_addr_base attribute before we can do anything
with these.
Since we were processing the unit attributes serially, this created a
problem in cases where the DW_AT_addr_base comes after DW_AT_low_pc --
we would end up computing the wrong unit base address, which also
corrupted any values which later depended on that (for instance range
lists). Clang currently always emits DW_AT_addr_base last.
The fix is simple -- process DW_AT_addr_base first, regardless of its
position in the attribute list.
the value of DW_AT_rnglists_base of the skeleton unit is for that unit
alone (e.g. used in DW_AT_ranges of the unit DIE) and should not apply
to the split unit.
The split unit has a hardcoded range list base value -- we should
initialize range list code whenever we detect a nonempty
debug_rnglists.dwo section.
Summary:
Our rnglist support was working only for the trivial cases (one CU),
because we only ever parsed one contribution out of the debug_rnglists
section. This means we were never able to resolve range lists for the
second and subsequent units (DW_FORM_sec_offset references came out
blang, and DW_FORM_rnglistx references always used the ranges lists from
the first unit).
Since both llvm and lldb rnglist parsers are sufficiently
self-contained, and operate similarly, we can fix this problem by
switching to the llvm parser instead. Besides the changes which are due
to variations in the interface, the main thing is that now the range
list object is a member of the DWARFUnit, instead of the entire symbol
file. This ensures that each unit can get it's own private set of range
list indices, and is consistent with how llvm's DWARFUnit does it
(overall, I've tried to structure the code the same way as the llvm
version).
I've also added a test case for the two unit scenario.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: dblaikie, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71021
Originally I wanted to remove the RegularExpression class in Utility and
replace it with llvm::Regex. However, during that transition I noticed
that there are several places where need the regular expression string.
So instead I propose to keep the RegularExpression class and make it a
thin wrapper around llvm::Regex.
This patch also removes the workaround for empty regular expressions.
The result is that we are now (more or less) POSIX conformant.
Differential revision: https://reviews.llvm.org/D66174
llvm-svn: 369153
Summary:
This commit achieves the following:
- Functions used to return a `TypeSystem *` return an
`llvm::Expected<TypeSystem *>` now. This means that the result of a call
is always checked, forcing clients to move more carefully.
- `TypeSystemMap::GetTypeSystemForLanguage` will either return an Error or a
non-null pointer to a TypeSystem.
Reviewers: JDevlieghere, davide, compnerd
Subscribers: jdoerfert, lldb-commits
Differential Revision: https://reviews.llvm.org/D65122
llvm-svn: 367360
Delete the abstract GetOffset function, which is only defined for
rnglists entries. Instead fix up entries which refer to the range list
classes so that one can statically know that he is dealing with the
rnglists section and call the function that way.
llvm-svn: 367106
Summary:
With the last round of refactors, supporting type units in dwo files
becomes almost trivial. This patch contains a couple of small fixes,
which taken as a whole make type units work in the split dwarf scenario
(both DWARF4 and DWARF5):
- DWARFContext: make sure we actually read the debug_types.dwo section
- DWARFUnit: set string offsets base on all units in the dwo file, not
just the main CU
- ManualDWARFIndex: index all units in the file
- SymbolFileDWARFDwo: Search for the single compile unit in the file, as
we can no longer assume it will be the first one
The last part makes it obvious that there is still some work to be done
here, namely that we do not support dwo files with multiple compile
units. That is something that should be easier after the DIERef
refactors, but it still requires more work.
Tests are added for the type units+split dwarf + dwarf4/5 scenarios, as
well as a test that checks we behave reasonably in the presence of dwo
files with multiple CUs.
Reviewers: clayborg, JDevlieghere, aprantl
Subscribers: arphaman, lldb-commits
Differential Revision: https://reviews.llvm.org/D63643
llvm-svn: 364274
Summary:
When dwo support was introduced, it used a trick where debug info
entries were referenced by the offset of the compile unit in the main
file, but the die offset was relative to the dwo file. Although there
was some elegance to it, this representation was starting to reach its
breaking point:
- the fact that the skeleton compile unit owned the DWO file meant that
it was impossible (or at least hard and unintuitive) to support DWO
files containing more than one compile unit. These kinds of files are
produced by LTO for example.
- it made it impossible to reference any DIEs in the skeleton compile
unit (although the skeleton units are generally empty, clang still
puts some info into them with -fsplit-dwarf-inlining).
- (current motivation) it made it very hard to support type units placed
in DWO files, as type units don't have any skeleton units which could
be referenced in the main file
This patch addresses this problem by introducing an new
"dwo_num" field to the DIERef class, whose purpose is to identify the
dwo file. It's kind of similar to the dwo_id field in DWARF5 unit
headers, but while this is a 64bit hash whose main purpose is to catch
file mismatches, this is just a smaller integer used to indentify a
loaded dwo file. Currently, this is based on the index of the skeleton
compile unit which owns the dwo file, but it is intended to be
eventually independent of that (to support the LTO use case).
Simultaneously the cu_offset is dropped to conserve space, as it is no
longer necessary. This means we can remove the "BaseObjectOffset" field
from the DWARFUnit class. It also means we can remove some of the
workarounds put in place to support the skeleton-unit+dwo-die combo.
More work is needed to remove all of them, which is out of scope of this
patch.
Reviewers: JDevlieghere, clayborg, aprantl
Subscribers: mehdi_amini, dexonsmith, arphaman, lldb-commits
Differential Revision: https://reviews.llvm.org/D63428
llvm-svn: 364009
Previously it was storing a *pointer*, which left open the possibility
of this pointer being null. We never made use of that possibility (it
does not make sense), and most of the code was already assuming that.
However, there were a couple of null-checks scattered around the code.
This patch replaces the reference with a pointer, making the
non-null-ness explicit, and removes the remaining null-checks.
llvm-svn: 363381
Summary:
This patch creates a cache of file lists in line tables referenced by
type units. This cache is used to avoid parsing a line table twice
(since a file list will generally be shared by many type units).
It also sets things up in a way that parsing of DW_AT_decl_file
attributes will keep working even when we stop creating lldb compile
units for dwarf type units, but it stops short of actually doing that.
This means that the request for files now go directly to SymbolFileDWARF
instead of being routed there indirectly via the
lldb_private::CompileUnit class.
As a result of this, a number of occurences of SymbolContext variables
in DWARFASTParserClang have become unused, so I remove them.
This patch reduces the number of times a file list is being parsed, but
the situation is still suboptimal, as the parsed list is being copied
multiple times. This will be fixed when we stop creating CompileUnits
for DWARF type units.
Reviewers: clayborg, aprantl, JDevlieghere
Subscribers: jdoerfert, lldb-commits
Differential Revision: https://reviews.llvm.org/D62894
llvm-svn: 363143
Summary:
r362103 exposed a bug, where we could read incorrect data if a skeleton
unit contained more than the single unit DIE. Clang emits these kinds of
units with -fsplit-dwarf-inlining (which is also the default).
Changing lldb to handle these DIEs is nontrivial, as we'd have to change
the UID encoding logic to be able to reference these DIEs, and fix up
various places which are assuming that all DIEs come from the separate
compile unit.
However, it turns out this is not necessary, as the DWO unit contains
all the information that the skeleton unit does. So, this patch just
skips parsing the extra DIEs if we have successfully found the DWO file.
This enforces the invariant that the rest of the code is already
operating under.
This patch fixes a couple of existing tests, but I've also included a
simpler test which does not depend on execution of binaries, and would
have helped us in catching this sooner.
Reviewers: clayborg, JDevlieghere, aprantl
Subscribers: probinson, dblaikie, lldb-commits
Differential Revision: https://reviews.llvm.org/D62852
llvm-svn: 362586
The function Extract() is almost a duplicate of FastExtract() but is not used.
Delete it and rename FastExtract() to Extract().
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D62593
llvm-svn: 362049
Summary:
debug_ranges got renamed to debug_rnglists in DWARF 5. Prior to this
patch lldb was just picking the first section it could find in the file,
and using that for all address ranges lookups. This is not correct in
case the file contains a mixture of compile units with various standard
versions (not a completely unlikely scenario).
In this patch I make lldb support reading from both sections
simulaneously, and decide the correct section to use based on the
version number of the compile unit. SymbolFileDWARF::DebugRanges is
split into GetDebugRanges and GetDebugRngLists (the first one is renamed
mainly so we can catch all incorrect usages).
I tried to structure the code similarly to how llvm handles this logic
(hence DWARFUnit::FindRnglistFromOffset/Index), but the implementations
are still relatively far from each other.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D62302
llvm-svn: 361938
The fix form sizes use to have two arrays: one for 4 byte addresses and in for 8 byte addresses. The table had an issue where DW_FORM_flag_present wasn't being represented as a fixed size form because its actual size _is_ zero and zero was used to indicate the form isn't fixed in size. Any code that needed to quickly access the DWARF had to get a FixedFormSizes instance using the address byte size.
This fix cleans things up by adding a DWARFFormValue::GetFixedSize() both as a static method and as a member function on DWARFFormValue. It correctly can indicate if a form size is zero. This cleanup is a precursor to a follow up patch where I hope to speed up DWARF parsing.
I verified performance doesn't regress by loading hundreds of DWARF files and setting a breakpoint by file and line and by name in files that do not have DWARF indexes. Performance remained consistent between the two approaches.
Differential Revision: https://reviews.llvm.org/D62416
llvm-svn: 361675
Summary:
This patch implements the main feature of type units. When completing a
type, if we encounter a DW_AT_signature attribute, we use it's value to
lookup the complete definition of the type in the relevant type unit.
To enable this lookup, we build up a map of all type units in a symbol
file when parsing the units. Then we consult this map when resolving the
DW_AT_signature attribute.
I include add a couple of tests which exercise the type lookup feature,
including one that ensure we do something reasonable in case we fail to
lookup the type.
A lot of the ideas in this patch have been taken from D32167 and D61505.
Reviewers: clayborg, JDevlieghere, aprantl, alexshap
Subscribers: mgrang, lldb-commits
Differential Revision: https://reviews.llvm.org/D62246
llvm-svn: 361603
Summary:
NFC = [[ https://llvm.org/docs/Lexicon.html#nfc | Non functional change ]]
This commit is the result of modernizing the LLDB codebase by using
`nullptr` instread of `0` or `NULL`. See
https://clang.llvm.org/extra/clang-tidy/checks/modernize-use-nullptr.html
for more information.
This is the command I ran and I to fix and format the code base:
```
run-clang-tidy.py \
-header-filter='.*' \
-checks='-*,modernize-use-nullptr' \
-fix ~/dev/llvm-project/lldb/.* \
-format \
-style LLVM \
-p ~/llvm-builds/debug-ninja-gcc
```
NOTE: There were also changes to `llvm/utils/unittest` but I did not
include them because I felt that maybe this library shall be updated in
isolation somehow.
NOTE: I know this is a rather large commit but it is a nobrainer in most
parts.
Reviewers: martong, espindola, shafik, #lldb, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: arsenm, jvesely, nhaehnle, hiraditya, JDevlieghere, teemperor, rnkovacs, emaste, kubamracek, nemanjai, ki.stfu, javed.absar, arichardson, kbarton, jrtc27, MaskRay, atanasyan, dexonsmith, arphaman, jfb, jsji, jdoerfert, lldb-commits, llvm-commits
Tags: #lldb, #llvm
Differential Revision: https://reviews.llvm.org/D61847
llvm-svn: 361484
Summary:
Type units don't describe any code, so they should never be the result
of any address lookup queries.
Previously, we would compute the address ranges for the type units for
via the line tables they reference because the type units looked a lot
like line-tables-only compile units. However, this is not correct, as
the line tables are only referenced from type units so that other
declarations can use the file names contained in them.
In this patch I make the BuildAddressRangeTable function virtual, and
implement it only for compile units.
Testing this was a bit tricky, because the behavior depends on the order
in which we add things to the address range map. This rarely caused a
problem with DWARF v4 type units, as they are always added after all
CUs. It happened more frequently with DWARF v5, as there clang emits the
type units first. However, this is still not something that it is
required to do, so for testing I've created an assembly file where I've
deliberately sandwiched a compile unit between two type units, which
should isolate us from both changes in how the compiler emits the units
and changes in the order we process them.
Reviewers: clayborg, aprantl, JDevlieghere
Subscribers: jdoerfert, lldb-commits
Differential Revision: https://reviews.llvm.org/D62178
llvm-svn: 361465
Summary:
This patch introduces the DWARFTypeUnit class, and teaches lldb to parse
type units out of both the debug_types section (DWARF v4), and from the
regular debug_info section (DWARF v5).
The most important piece of functionality - resolving DW_AT_signatures
to connect type forward declarations to their definitions - is not
implemented here, but even without that, a lot of functionality becomes
available. I've added tests for the commands that start to work after
this patch.
The changes in this patch were greatly inspired by D61505, which in turn took
over changes from D32167.
Reviewers: JDevlieghere, clayborg, aprantl
Subscribers: mgorny, jankratochvil, lldb-commits
Differential Revision: https://reviews.llvm.org/D62008
llvm-svn: 361360
In D61502#1503247 @clayborg suggested that SymbolFileDWARF *dwarf2Data is
really redundant in all the calls with also having DWARFUnit *cu. So remove it.
One `SymbolFileDWARF *` nullptr check
(DWARFDebugInfoEntry::GetDIENamesAndRanges) could be removed, other two nullptr
checks (DWARFDebugInfoEntry::GetName and DWARFDebugInfoEntry::AppendTypeName)
need to stay in place (now for `DWARFUnit *`).
Differential Revision: https://reviews.llvm.org/D62011
llvm-svn: 361277
Summary:
This patch introduces the DWARFUnitHeader class. Its purpose (and its
structure, to the extent it was possible to make it) is the same as its
LLVM counterpart -- to extract the unit header information before we
actually construct the unit, so that we know which kind of units to
construct. This is needed because as of DWARF5, type units live in the
.debug_info section, which means it's not possible to statically
determine the type of units in a given section.
Reviewers: aprantl, clayborg, JDevlieghere
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D62073
llvm-svn: 361224
This moves the sections from SymbolFileDWARF to DWARFContext, where it
was trivial to do so. A couple of sections are still left in
SymbolFileDWARF. These will be handled by separate patches.
llvm-svn: 361127
So far dw_offset_t was global for the whole SymbolFileDWARF but with
.debug_types the same dw_offset_t may mean two different things depending on
its section (=CU). So references now return whole new referenced DWARFDIE
instead of just dw_offset_t.
This means that some functions have to now handle 16 bytes instead of 8 bytes
but I do not see that anywhere performance critical.
Differential Revision: https://reviews.llvm.org/D61502
llvm-svn: 360795
D42892 changed a lot of code to use superclass DWARFUnit instead of its
subclass DWARFCompileUnit.
Finish this change more thoroughly for any *CompileUnit* -> *Unit* names.
Later patch will introduce DWARFTypeUnit which needs to be sometimes different
from DWARFCompileUnit and it would be confusing without this renaming.
Differential Revision: https://reviews.llvm.org/D61501
llvm-svn: 360443
Summary:
The implementation of GetID used a relatively complicated algorithm,
which returned some kind of an offset of the unit in some file
(depending on the debug info flavour). The only thing this ID was used
for was to enable subseqent retrieval of the unit from the SymbolFile.
This can be made simpler if we just make the "ID" of the unit an index
into the list of the units belonging to the symbol file. We already
support indexed access to the units, so each unit already has a well
"index" -- this just makes it accessible from within the unit.
To make the distincion between "id" and "offset" clearer (and help catch
any misuses), I also rename DWARFDebugInfo::GetCompileUnit (which
accesses by offset) into DWARFDebugInfo::GetCompileUnitAtOffset.
On its own, this only brings a minor simplification, but it enables
further simplifications in the DIERef class (coming in a follow-up
patch).
Reviewers: JDevlieghere, clayborg, aprantl
Subscribers: arphaman, jdoerfert, lldb-commits, tberghammer, jankratochvil
Differential Revision: https://reviews.llvm.org/D61481
llvm-svn: 360014
A lot of comments in LLDB are surrounded by an ASCII line to delimit the
begging and end of the comment.
Its use is not really consistent across the code base, sometimes the
lines are longer, sometimes they are shorter and sometimes they are
omitted. Furthermore, it looks kind of weird with the 80 column limit,
where the comment actually extends past the line, but not by much.
Furthermore, when /// is used for Doxygen comments, it looks
particularly odd. And when // is used, it incorrectly gives the
impression that it's actually a Doxygen comment.
I assume these lines were added to improve distinguishing between
comments and code. However, given that todays editors and IDEs do a
great job at highlighting comments, I think it's worth to drop this for
the sake of consistency. The alternative is fixing all the
inconsistencies, which would create a lot more churn.
Differential revision: https://reviews.llvm.org/D60508
llvm-svn: 358135
This reverts commit r356682 because it breaks the DWO flavours of some
tests:
lldb-Suite :: lang/c/const_variables/TestConstVariables.py
lldb-Suite :: lang/c/local_variables/TestLocalVariables.py
lldb-Suite :: lang/c/vla/TestVLA.py
llvm-svn: 356773
This is mostly mechanical, and just moves the remaining non-DWO
related sections over to DWARFContext.
Differential Revision: https://reviews.llvm.org/D59611
llvm-svn: 356682
All of this is code that is unreferenced. Removing as much of
this as possible makes it more easy to determine what functionality
is missing and/or shared between LLVM and LLDB's DWARF interfaces.
llvm-svn: 356509
These log statements have questionable value, and hinder the effort
of separating the high and low level DWARF parsing interfaces inside
of LLDB. Removing them for now, and if/when we need such log statements
again in the future, we can add them back (if possible) or introduce a
mechanism for logging from the low-level interface in such a way that it
isn't coupled to the high level interface.
Differential Revision: https://reviews.llvm.org/D59498
llvm-svn: 356469
LLVM doesn't produce DWARF64, and neither does GCC. LLDB's support
for DWARF64 is only partial, and if enabled appears to also not work.
Finally, it's untested. Removing this makes merging LLVM and
LLDB's DWARF parsing implementations simpler.
Differential Revision: https://reviews.llvm.org/D59235
llvm-svn: 355975
This is a very thin wrapper over a std::vector<DWARFDIE> and does
not seem to provide any real value over just using a container
directly.
Differential Revision: https://reviews.llvm.org/D59165
llvm-svn: 355974
The `ap` suffix is a remnant of lldb's former use of auto pointers,
before they got deprecated. Although all their uses were replaced by
unique pointers, some variables still carried the suffix.
In r353795 I removed another auto_ptr remnant, namely redundant calls to
::get for unique_pointers. Jim justly noted that this is a good
opportunity to clean up the variable names as well.
I went over all the changes to ensure my find-and-replace didn't have
any undesired side-effects. I hope I didn't miss any, but if you end up
at this commit doing a git blame on a weirdly named variable, please
know that the change was unintentional.
llvm-svn: 353912
Summary:
This adds support for auto-detection of path style to SymbolFileBreakpad
(similar to how r351328 did the same for DWARF). We guess each file
entry separately, as we have no idea which file came from which compile
units (and different compile units can have different path styles). The
breakpad generates should have already converted the paths to absolute
ones, so this guess should be reasonable accurate, but as always with
these kinds of things, it is hard to give guarantees about anything.
In an attempt to bring some unity to the path guessing logic, I move the
guessing logic from inside SymbolFileDWARF into the FileSpec class and
have both symbol files use it to implent their desired behavior.
Reviewers: clayborg, lemo, JDevlieghere
Subscribers: aprantl, markmentovai, lldb-commits
Differential Revision: https://reviews.llvm.org/D57895
llvm-svn: 353702
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
If we opened a file which was produced on system with different path
syntax, we would parse the paths from the debug info incorrectly.
The reason for that is that we would parse the paths as they were
native. For example this meant that on linux we would treat the entire
windows path as a single file name with no directory component, and then
we would concatenate that with the single directory component from the
DW_AT_comp_dir attribute. When parsing posix paths on windows, we would
at least get the directory separators right, but we still would treat
the posix paths as relative, and concatenate them where we shouldn't.
This patch attempts to remedy this by guessing the path syntax used in
each compile unit. (Unfortunately, there is no info in DWARF which would
give the definitive path style used by the produces, so guessing is all
we can do.) Currently, this guessing is based on the DW_AT_comp_dir
attribute of the compile unit, but this can be refined later if needed
(for example, the DW_AT_name of the compile unit may also contain some
useful info). This style is then used when parsing the line table of
that compile unit.
This patch is sufficient to make the line tables come out right, and
enable breakpoint setting by file name work correctly. Setting a
breakpoint by full path still has some kinks (specifically, using a
windows-style full path will not work on linux because the path will be
parsed as a linux path), but this will require larger changes in how
breakpoint setting works.
Reviewers: clayborg, zturner, JDevlieghere
Subscribers: aprantl, lldb-commits
Differential Revision: https://reviews.llvm.org/D56543
llvm-svn: 351328
This patch simplifies boolean expressions acorss LLDB. It was generated
using clang-tidy with the following command:
run-clang-tidy.py -checks='-*,readability-simplify-boolean-expr' -format -fix $PWD
Differential revision: https://reviews.llvm.org/D55584
llvm-svn: 349215
A skeleton compilation unit may contain the DW_AT_str_offsets_base attribute
that points to the first string offset of the CU contribution to the
.debug_str_offsets. At the same time, when we use split dwarf,
the corresponding split debug unit also
may use DW_FORM_strx* forms pointing to its own .debug_str_offsets.dwo.
In that case, DWO does not contain DW_AT_str_offsets_base, but LLDB
still need to know and skip the .debug_str_offsets.dwo section header to
access the offsets.
The patch implements the support of DW_AT_str_offsets_base.
Differential revision: https://reviews.llvm.org/D54844
llvm-svn: 347859
The issue happens because starting from DWARF v5
DW_AT_addr_base attribute should be used
instead of DW_AT_GNU_addr_base. LLDB does not do that and
we end up reading the .debug_addr header as section content
(as addresses) instead of skipping it and reading the real addresses.
Then LLDB is unable to match 2 similar locations and
thinks they are different.
Differential revision: https://reviews.llvm.org/D54751
llvm-svn: 347842
Summary:
While parsing a childless compile unit DIE we could crash if the DIE was
followed by any extra data (such as a superfluous end-of-children
marker). This happened because the break-on-depth=0 check was performed
only when parsing the null DIE, which was not correct because with a
childless root DIE, we could reach the end of the unit without ever
encountering the null DIE.
If the compile unit contribution ended directly after the CU DIE,
everything would be fine as we would terminate parsing due to reaching
EOF. However, if the contribution contained extra data (perhaps a
superfluous end-of-children marker), we would crash because we would
treat that data as the begging of another compile unit.
This fixes the crash by moving the depth=0 check to a more generic
place, and also adds a regression test.
Reviewers: clayborg, jankratochvil, JDevlieghere
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D54417
llvm-svn: 346849
This adds support for DW_RLE_base_addressx, DW_RLE_startx_endx,
DW_RLE_startx_length, DW_FORM_rnglistx.
Differential revision: https://reviews.llvm.org/D53929
llvm-svn: 345958
This adds the support for DW_FORM_addrx, DW_FORM_addrx1,
DW_FORM_addrx2, DW_FORM_addrx3, DW_FORM_addrx4 forms.
Differential revision: https://reviews.llvm.org/D53813
llvm-svn: 345706
It merges DWARFDebugInfoEntry's m_empty_children into m_has_children.
m_empty_children was implemented by rL144983.
As Greg confirmed m_has_children was used to represent what was in the DWARF in
the byte that follows the DW_TAG. m_empty_children was used for DIEs that said
they had children but actually only contain a single NULL tag. It is fine to
not differentiate between the two.
Also changed assert()->lldbassert() for m_abbr_idx 16-bit overflow check as
that could be a tough bug to catch if it ever happens.
I have checked all calls of HasChildren() that this change should not matter to
them. The code even wants to know if there are any children - it does not
matter how the children presence is coded in the binary.
Patch written based on suggestions by Greg Clayton.
Differential Revision: https://reviews.llvm.org/D53321
llvm-svn: 344644
xbolva00 bugreported $subj in: https://reviews.llvm.org/D46810#1247410
It can happen only from the line:
m_die_array.back().SetEmptyChildren(true);
In the case DW_TAG_compile_unit has DW_CHILDREN_yes but there is only 0 (end of
list, no children present). Therefore the assertion can fortunately happen only
with a hand-crafted DWARF or with DWARF from some suboptimal compilers.
Differential Revision: https://reviews.llvm.org/D53255
llvm-svn: 344605
If BuildAddressRangeTable called ExtractDIEsIfNeeded(false), then another
thread started processing data from m_die_array and then the first thread
called final ClearDIEs() the second thread would crash.
It is also required without multithreaded debugger using DW_TAG_partial_unit
for DWZ.
Differential revision: https://reviews.llvm.org/D40470
llvm-svn: 333987
rL145086 introduced m_die_array.shrink_to_fit() implemented by
exact_size_die_array.swap, it was before LLVM became written in C++11.
Differential revision: https://reviews.llvm.org/D47492
llvm-svn: 333636
GetUnitDIEPtrOnly() needs to return pointer to the first DIE.
But the first element of m_die_array after ExtractDIEsIfNeeded(true)
may move in memory after later ExtractDIEsIfNeeded(false).
DWARFDebugInfoEntry::collection m_die_array is std::vector,
its data may move during its expansion.
Differential revision: https://reviews.llvm.org/D46810
llvm-svn: 333437
As suggested by Pavel Labath in D46810 DWARFUnit::GetUnitDIEOnly() returning
a pointer to m_first_die should not permit using methods like GetFirstChild().
Differential revision: https://reviews.llvm.org/D47276
llvm-svn: 333224
Summary:
I think this makes sense for several reasons:
- better separation of concerns: DWARFUnit's job should be to provide a
nice interface to its users to access the unit contents.
ManualDWARFIndex can then use this interface to build an index and
provide it to its users.
- closer alignment with llvm parsers: there is no indexing equivalent in
llvm, and there probably never will be, as the index is very centered
around how lldb wants to access debug info. If we ever switch to
llvm's parser, this will allow us swap out DWARFUnit implementations
and keep indexing as-is.
- closer proximity of the indexing code to AppleDWARFIndex will make it
easier to keep the two in sync (e.g. right now the two use very
different algorithms to determine whether a DW_TAG_subroutine
represents a "method"). This is my primary motivation for making this
change now, but I am leaving this work to a separate patch.
The only interface change to DWARFUnit I needed to make was to add an
efficient way to iterate over the list of all DIEs. Adding this also
aligns us closer to the llvm parser.
Reviewers: JDevlieghere, clayborg, aprantl
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D47253
llvm-svn: 333178
Summary:
This places the `if(m_using_apple_tables)` branches inside the
SymbolFileDWARF class behind an abstract DWARFIndex class. The class
currently has two implementations:
- AppleIndex, which searches using .apple_names and friends
- ManualIndex, which searches using a manually built index
Most of the methods of the class are very simple, and simply extract the
list of DIEs for the given name from the appropriate sub-table. The main
exception are the two GetFunctions overloads, which take a couple of
extra paramenters, including some callbacks. It was not possible to
split these up the same way as other methods, as here we were doing a
lot of post-processing on the results. The post-processing is similar
for the two cases, but not identical. I hope to factor these further in
separate patches.
Other interesting methods are:
- Preload(): do any preprocessing to make lookups faster (noop for
AppleIndex, forces a build of the lookup tables for ManualIndex).
- ReportInvalidDIEOffset(): Used to notify the users of an invalid index
(prints a message for AppleIndex, noop for ManualIndex).
- Dump(): dumps the index state (noop for AppleIndex, prints the lookup
tables for ManualIndex).
Reviewers: clayborg, JDevlieghere
Subscribers: mgorny, aprantl, lldb-commits
Differential Revision: https://reviews.llvm.org/D46889
llvm-svn: 332719
Pavel Labath found this patch is incomplete and racy. I think there needs to
be some more mutexes even before considering DW_TAG_partial_unit.
This reverts commit 331229 which was: https://reviews.llvm.org/D40470
llvm-svn: 332200
This cleanup is designed to make the https://reviews.llvm.org/D32167 patch smaller and easier to read.
Cleanup in this patch:
Allow DWARFUnit subclasses to hand out the data that should be used when decoding data for a DIE. The information might be in .debug_info or could be in .debug_types. There is a new virtual function on DWARFUnit that each subclass must override:
virtual const lldb_private::DWARFDataExtractor &DWARFUnit::GetData() const;
This allows DWARFCompileUnit and eventually DWARFTypeUnit to hand out different data to be used when decoding the DIE information.
Add a new pure virtual function to get the size of the DWARF unit header:
virtual uint32_t DWARFUnit::GetHeaderByteSize() const = 0;
This allows DWARFCompileUnit and eventually DWARFTypeUnit to hand out different offsets where the first DIE starts when decoding DIE information from the unit.
Added a new function to DWARFDataExtractor to get the size of an offset:
size_t DWARFDataExtractor::GetDWARFSizeOfOffset() const;
Removed dead dumping and parsing code in the DWARFDebugInfo class.
Inlined a bunch of calls in DWARFUnit for accessors that were just returning integer member variables.
Renamed DWARFUnit::Size() to DWARFUnit::GetHeaderByteSize() as it clearly states what it is doing and makes more sense.
Differential Revision: https://reviews.llvm.org/D46606
llvm-svn: 331892
Summary:
Before this patch the two paths were doing very different things
- the apple path searched the .apple_names section, which contained
mangled names, as well as basenames of all functions. It returned any
name it found.
- the non-accelerated path looked in the "full name" index we built
ourselves, which contained mangled as well as demangled names of all
functions (but no basenames). Then however, if it did not find a match
it did an extra search in the basename index, with some special
handling for anonymous namespaces.
This aligns the two paths by changing the non-accelerated path to return
the same results as in the apple-tables one. In pratice, this means we
will search in both the "basename", "method" and "fullname" indexes (in
the manual indexes these are separate indexes. This means the function
will return some slightly inappropriate results (e.g. bar::baz::foo when
one asks for a "full name" foo), but this can be handled by additional
filtering, independently indexing method. I've also stopped inserting
demangled names into the "fullname" index, as that is inconsistent with
the apple path.
Reviewers: clayborg, JDevlieghere
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D46576
llvm-svn: 331855
Multiple DW_TAG_compile_unit being indexed in a multithreaded way can request
reading of the same DW_TAG_partial_unit.
Unfortunately one cannot detect DWZ file ahead of time to disable such locking
overhead as DWARFCompileUnit::Extract does not read the first DIE which is the
only place one could find early enough if the DWARF file is using any
DW_TAG_partial_unit.
Differential revision: https://reviews.llvm.org/D40470
llvm-svn: 331229
This is intended as a clean up after the big clang-format commit
(r280751), which unfortunately resulted in many of the comment
paragraphs in LLDB being very hard to read.
FYI, the script I used was:
import textwrap
import commands
import os
import sys
import re
tmp = "%s.tmp"%sys.argv[1]
out = open(tmp, "w+")
with open(sys.argv[1], "r") as f:
header = ""
text = ""
comment = re.compile(r'^( *//) ([^ ].*)$')
special = re.compile(r'^((([A-Z]+[: ])|([0-9]+ )).*)|(.*;)$')
for line in f:
match = comment.match(line)
if match and not special.match(match.group(2)):
# skip intentionally short comments.
if not text and len(match.group(2)) < 40:
out.write(line)
continue
if text:
text += " " + match.group(2)
else:
header = match.group(1)
text = match.group(2)
continue
if text:
filled = textwrap.wrap(text, width=(78-len(header)),
break_long_words=False)
for l in filled:
out.write(header+" "+l+'\n')
text = ""
out.write(line)
os.rename(tmp, sys.argv[1])
Differential Revision: https://reviews.llvm.org/D46144
llvm-svn: 331197
Code commonly checks if the parent DIE is DW_TAG_compile_unit.
But DW_TAG_partial_unit also acts as DW_TAG_compile_unit for DWZ
as DWZ is using DW_TAG_imported_unit only at the top unit level.
Differential revision: https://reviews.llvm.org/D40469
llvm-svn: 331194
This patch by Greg Clayton drops the virtualization for DWARFPartialUnit.
The virtualization of DWARFUnit now matches more its LLVM counterpart.
DWZ patchset is going to be implementable without DWARFPartialUnit remapping.
https://reviews.llvm.org/D40474
This reverts commit 329423.
This reapplies commit r329305.
llvm-svn: 330084
The reverted commit changed DWARFUnit from https://reviews.llvm.org/D40466 and
https://reviews.llvm.org/D42892 that was prepared for DWARFPartialUnit and
made from it a superclass for DWARFTypeUnit. DWARFUnit's intention was:
DWARFUnit->DWARFSomeNameUnit->DWARFCompileUnit
DWARFUnit->DWARFSomeNameUnit->DWARFTypeUnit
DWARFUnit->DWARFPartialUnit
Discussed at: https://reviews.llvm.org/D45170
This reverts commit r329305.
llvm-svn: 329423
Many things that were in DWARFCompileUnit actually need to be in DWARFUnit. This patch moves all DWARFUnit specific things over into DWARFUnit and fixes the layering. This is in preparation for adding DWARFTypeUnit for the .debug_types patch.
Differential Revision: https://reviews.llvm.org/D45170
llvm-svn: 329305
Now the codebase can use the DWARFUnit superclass. It will make it later
seamlessly work also with DWARFPartialUnit for DWZ.
This patch is only a search-and-replace easily undone, nothing interesting
in it.
Differential revision: https://reviews.llvm.org/D42892
llvm-svn: 327810
DW_TAG_partial_unit for DWZ can be then presented by DWARFPartialUnit also
inherited from DWARFUnit.
Differential revision: https://reviews.llvm.org/D40466
llvm-svn: 327809