This change makes sure that DwarfUnit does not load a .dwo file until
necessary. I also take advantage of DWARF 5's guarantee that the first
support file is also the primary file to make it possible to create
a compile unit without loading the .dwo file.
Review By: jankratochvil, dblaikie
Differential Revision: https://reviews.llvm.org/D100299
This change makes sure that DwarfUnit does not load a .dwo file until
necessary. I also take advantage of DWARF 5's guarantee that the first
support file is also the primary file to make it possible to create
a compile unit without loading the .dwo file.
Review By: jankratochvil, dblaikie
Differential Revision: https://reviews.llvm.org/D100299
If we succeed at gathering global variables for a compile
unit, there is no need to fallback to generating a manual index.
Reviewed By: jankratochvil
Differential Revision: https://reviews.llvm.org/D106355
This patch adds the ability to get a DWARFDIE's children as an LLVM range.
This way we can use for range loops to iterate over them and we can use LLVM's
algorithms like `llvm::all_of` to query all children.
The implementation has to do some small shenanigans as the iterator needs to
store a DWARFDIE, but a DWARFDIE container is also a DWARFDIE so it can't return
the iterator by value. I just made the `children` getter a templated function to
avoid the cyclic dependency.
Reviewed By: #lldb, werat, JDevlieghere
Differential Revision: https://reviews.llvm.org/D103172
LLDB's DWARF parser has some heuristics for guessing and fixing up the
accessibility of C++ class/struct members after they were already created in the
internal Clang AST. The heuristic is that if a struct/class has a base class,
then it's actually a class and it's members are private unless otherwise
specified.
From what I can see this heuristic isn't sound and also unnecessary. The idea
that inheritance implies that the `class` keyword was used and the default
visibility is `private` is incorrect. Also both GCC and Clang use
`DW_TAG_structure_type` and `DW_TAG_class_type` for `struct` and `class` types
respectively, so the default visibility we infer from that information is always
correct and there is no need to fix it up.
And finally, the access specifiers we set in the Clang AST are anyway unused
within LLDB. The expression parser explicitly ignores them to give users access
to private members and there is not SBAPI functionality that exposes this
information.
This patch removes all the heuristic code for the reasons above and instead
just relies on the access values we infer from the tag kind and explicit
annotations in DWARF.
This patch is NFCI.
Reviewed By: werat
Differential Revision: https://reviews.llvm.org/D105463
C doesn't allow empty structs but Clang/GCC support them and give them a size of 0.
LLDB implements this by checking the tag kind and if it's `DW_TAG_structure_type` then
we give it a size of 0 via an empty external RecordLayout. This is done because our
internal TypeSystem is always in C++ mode (which means we would give them a size
of 1).
The current check for when we have this special case is currently too lax as types with
`DW_TAG_structure_type` can also occur in C++ with types defined using the `struct`
keyword. This means that in a C++ program with `struct Empty{};`, LLDB would return
`0` for `sizeof(Empty)` even though the correct size is 1.
This patch removes this special case and replaces it with a generic approach that just
assigns empty structs the byte_size as specified in DWARF. The GCC/Clang special
case is handles as they both emit an explicit `DW_AT_byte_size` of 0. And if another
compiler decides to use a different byte size for this case then this should also be
handled by the same code as long as that information is provided via `DW_AT_byte_size`.
Reviewed By: werat, shafik
Differential Revision: https://reviews.llvm.org/D105471
My D99653 implemented a getter GetRnglist() for m_rnglist_table.
That was confusing as the getter returns DWARFDebugRnglistTable which
contains DWARFDebugRnglist as its elements.
when dealing with -gmodules debug info.
This fixes the bot failures on Darwin.
A recent clang change (presumably https://reviews.llvm.org/D104291)
introduced a bug where .pcm files would identify themselves as
DW_LANG_C_plus_plus, but the .o that references them would identify as
DW_LANG_C_plus_plus_14. While that bug needs to be fixed, too, it
shows that the current strict comparison also isn't meaningful.
rdar://79423225
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.
This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.
This reapplies the previously reverted patch with additional include
order fixes for non-modular builds of LLDB.
Differential Revision: https://reviews.llvm.org/D103575
This converts a default constructor's member initializers into C++11
default member initializers. This patch was automatically generated with
clang-tidy and the modernize-use-default-member-init check.
$ run-clang-tidy.py -header-filter='lldb' -checks='-*,modernize-use-default-member-init' -fix
This is a mass-refactoring patch and this commit will be added to
.git-blame-ignore-revs.
Differential revision: https://reviews.llvm.org/D103483
The `lock` call directly will check for us if the `weak_ptr` is expired and
returns an invalid `shared_ptr` (which we correctly handle), so this check is
redundant.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D103442
The C headers are deprecated so as requested in D102845, this is replacing them
all with their (not deprecated) C++ equivalent.
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D103084
In D98289#inline-939112 @dblaikie said:
Perhaps this could be more informative about what makes the range list
index of 0 invalid? "index 0 out of range of range list table (with
range list base 0xXXX) with offset entry count of XX (valid indexes
0-(XX-1))" Maybe that's too verbose/not worth worrying about since
this'll only be relevant to DWARF producers trying to debug their
DWARFv5, maybe no one will ever see this message in practice. Just
a thought.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102851
DW_AT_ranges can use DW_FORM_sec_offset (instead of DW_FORM_rnglistx).
In such case DW_AT_rnglists_base does not need to be present.
DWARF-5 spec:
"If the offset_entry_count is zero, then DW_FORM_rnglistx cannot
be used to access a range list; DW_FORM_sec_offset must be used
instead. If the offset_entry_count is non-zero, then
DW_FORM_rnglistx may be used to access a range list;"
This fix is for TestTypeCompletion.py category `dwarf` using GCC with DWARF-5.
The fix just provides GetRnglist() lazy getter for `m_rnglist_table`.
The testcase is easier to review by:
diff -u lldb/test/Shell/SymbolFile/DWARF/DW_AT_low_pc-addrx.s \
lldb/test/Shell/SymbolFile/DWARF/DW_AT_range-DW_FORM_sec_offset.s
Differential Revision: https://reviews.llvm.org/D98289
We have a bug in which using member_clang_type.GetByteSize() triggers record
layout and during this process since the record was not yet complete we ended
up reaching a record that had not been layed out yet.
Using member_type->GetByteSize() avoids this situation since it relies on size
from DWARF and will not trigger record layout.
For reference: rdar://77293040
Differential Revision: https://reviews.llvm.org/D102445
A type system is not guaranteed to have a symbol file. This patch adds null-pointer checks so we don't crash when trying to access a type system's symbol file.
Reviewed By: aprantl, teemperor
Differential Revision: https://reviews.llvm.org/D101539
This patch refactors a good part of the code base turning the usual
FileSpec, Line, Column, CheckInlines, ExactMatch arguments into a
SourceLocationSpec object.
This change is required for a following patch that will add handling of the
column line information when doing symbol resolution.
Differential Revision: https://reviews.llvm.org/D100965
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch moves the Declaration class from the Symbol library to the
Core library. This will allow to use it in a more generic fashion and
aims to lower the dependency cycles when it comes to the linking.
The patch also does some cleaning up by making column information
permanent and removing the LLDB_ENABLE_DECLARATION_COLUMNS directives.
Differential revision: https://reviews.llvm.org/D101556
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
`InsertSequence` doesn't take ownership of the pointer so releasing this pointer
is just leaking memory.
Follow up to D100806 that was fixing other leak sanitizer test failures
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D100846
DWARF allows .dwo file paths to be relative rather than absolute. When
they are relative, DWARF uses DW_AT_comp_dir to find the .dwo
file. DW_AT_comp_dir can also be relative, making the entire search
patch for the .dwo file relative. In this case, LLDB currently
searches relative to its current working directory, i.e. the directory
from which the debugger was launched. This is not right, as the
compiler, which generated the relative paths, can have no idea where
the debugger will be launched. The correct thing is to search relative
to the location of the executable binary. That is what this patch
does.
Differential Revision: https://reviews.llvm.org/D97786
DWARF allows .dwo file paths to be relative rather than absolute. When
they are relative, DWARF uses DW_AT_comp_dir to find the .dwo
file. DW_AT_comp_dir can also be relative, making the entire search
patch for the .dwo file relative. In this case, LLDB currently
searches relative to its current working directory, i.e. the directory
from which the debugger was launched. This is not right, as the
compiler, which generated the relative paths, can have no idea where
the debugger will be launched. The correct thing is to search relative
to the location of the executable binary. That is what this patch
does.
Differential Revision: https://reviews.llvm.org/D97786
When LLDB's DWARF parser is parsing the member DIEs of a struct/class it
currently fully resolves the types of static member variables in a class before
adding the respective `VarDecl` to the record.
For record types fully resolving the type will also parse the member DIEs of the
respective class. The other way of resolving is just 'forward' resolving the type
which will try to load only the minimum amount of information about the type
(for records that would only be the name/kind of the type). Usually we always
resolve types on-demand so it's rarely useful to speculatively fully resolve
them on the first use.
This patch changes makes that we only 'forward' resolve the types of static
members. This solves the fact that LLDB unnecessarily loads debug information
to parse the type if it's maybe not needed later and it also avoids a crash where
the parsed type might in turn reference the surrounding class that is currently
being parsed.
The new test case demonstrates the crash that might happen. The crash happens
with the following steps:
1. We parse class `ToLayout` and it's members.
2. We parse the static class member and fully resolve its type
(`DependsOnParam2<ToLayout>`).
3. That type has a non-static class member `DependsOnParam1<ToLayout>` for which
LLDB will try to calculate the size.
4. The layout (and size)`DependsOnParam1<ToLayout>` turns depends on the
`ToLayout` size/layout.
5. Clang will calculate the record layout/size for `ToLayout` even though we are
currently parsing it and it's missing it's non-static member.
The created is missing the offset for the yet unparsed non-static member. If we
later try to get the offset we end up hitting different asserts. Most common is
the one in `TypeSystemClang::DumpValue` where it checks that the record layout
has offsets for the current FieldDecl.
```
assert(field_idx < record_layout.getFieldCount());
```
Fixed rdar://67910011
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D100180
If the debug info is missing the terminating null die, we would crash
when trying to access the nonexisting children/siblings. This was
discovered because the test case for D98619 accidentaly produced such
input.
Remove the "depth" variable, as the same information can be obtained
through die_index_stack.size().
Also add a test case for a one tricky case I noticed -- a unit
containing only a null unit die.
When LLVM error handling was introduced to the parsing of the .debug_aranges it would cause major issues if any DWARFDebugArangeSet::extract() calls returned any errors. The code in DWARFDebugInfo::GetCompileUnitAranges() would end up calling DWARFDebugAranges::extract() which would return an error if _any_ DWARFDebugArangeSet had any errors, but it default constructed a DWARFDebugAranges object into DWARFDebugInfo::m_cu_aranges_up and populated it partially, and returned an error prior to finishing much needed functionality in the DWARFDebugInfo::GetCompileUnitAranges() function. Subsequent callers to this function would see that the DWARFDebugInfo::m_cu_aranges_up was actually valid and return this partially populated DWARFDebugAranges reference _and_ it would not be sorted or minimized.
This above bugs would cause an incomplete .debug_aranges parsing, it would skip manually parsing any compile units for ranges, and would not sort the DWARFDebugAranges in m_cu_aranges_up.
This bug would also cause breakpoints set by file and line to fail to set correctly if a symbol context for an address could not be resolved properly, which the incomplete and unsorted DWARFDebugAranges object that DWARFDebugInfo::GetCompileUnitAranges() returned would cause symbol context lookups resolved by address (breakpoint address) to fail to find any DWARF debug info for a given address.
This patch fixes all of the issues that I found:
- DWARFDebugInfo::GetCompileUnitAranges() no longer returns a "llvm::Expected<DWARFDebugAranges &>", but just returns a "const DWARFDebugAranges &". Why? Because this code contained a fallback that would parse all of the valid DWARFDebugArangeSet objects, and would check which compile units had valid .debug_aranges set entries, and manually build an address ranges table using DWARFUnit::BuildAddressRangeTable(). If we return an error because any DWARFDebugArangeSet has any errors, then we don't do any of this code. Now we parse all DWARFDebugArangeSet objects that have no errors, if any calls to DWARFDebugArangeSet::extract() return errors, we skip that DWARFDebugArangeSet so that we can use the fallback call to DWARFUnit::BuildAddressRangeTable(). Since DWARFDebugInfo::GetCompileUnitAranges() needs to parse what it can from the .debug_aranges and build address ranges tables for any compile units that don't have any .debug_aranges sets, everything now works as expected.
- Fix an issue where a DWARFDebugArangeSet contains multiple terminator entries. The LLVM parser and llvm-dwarfdump properly warn about this because it happens with linux compilers and linkers and was the original cause of the bug I am fixing here. We now correctly warn about this issue if "log enable dwarf info" is enabled, but we continue to parse the DWARFDebugArangeSet correctly so we don't lose data that is contained in the .debug_aranges section.
- DWARFDebugAranges::extract() no longer returns a llvm::Error because we need to be able to parse all of the valid DWARFDebugArangeSet objects. It also will correctly skip a DWARFDebugArangeSet object that has errors in the middle of the stream by setting the start offsets of each DWARFDebugArangeSet to be calculated by the previous DWARFDebugArangeSet::extract() calculated offset that uses the header which contains the length of the DWARFDebugArangeSet. This means if do we run into real errors while parsing individual DWARFDebugArangeSet objects, we can continue to parse the rest of the validly encoded DWARFDebugArangeSet objects in the .debug_aranges section. This will allow LLDB to parse DWARF that contains a possibly newer .debug_aranges set format than LLDB currently supports because we will error out for the parsing of the DWARFDebugArangeSet, but be able to skip to the next DWARFDebugArangeSet object using the "DWARFDebugArangeSet.m_header.length" field to calculate the next starting offset.
Tests were added to cover all new functionality.
Differential Revision: https://reviews.llvm.org/D99401
LLDB can often appear deadlocked to users that use IDEs when it is indexing DWARF, or parsing symbol tables. These long running operations can make a debug session appear to be doing nothing even though a lot of work is going on inside LLDB. This patch adds a public API to allow clients to listen to debugger events that report progress and will allow UI to create an activity window or display that can show users what is going on and keep them informed of expensive operations that are going on inside LLDB.
Differential Revision: https://reviews.llvm.org/D97739
SymbolFileDWARF::ResolveSymbolContext is currently unaware that in DWARF5 the primary file is specified at file index 0. As a result it misses to correctly resolve the symbol context for the primary file when DWARF5 debug data is used and the primary file is only specified at index 0.
This change makes use of CompileUnit::ResolveSymbolContext to resolve the symbol context. The ResolveSymbolContext in CompileUnit has been previously already updated to reflect changes in DWARF5
and contains a more readable version. It can resolve more, but will also do a bit more work than
SymbolFileDWARF::ResolveSymbolContext (getting the Module, and going through SymbolFileDWARF::ResolveSymbolContextForAddress), however, it's mostly directed by $resolve_scope
what will be resolved, and ensures that code is easier to maintain if there's only one path.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D98619
The idiom:
```
DeclContext::lookup_result R = DeclContext::lookup(Name);
for (auto *D : R) {...}
```
is not safe when in the loop body we trigger deserialization from an AST file.
The deserialization can insert new declarations in the StoredDeclsList whose
underlying type is a vector. When the vector decides to reallocate its storage
the pointer we hold becomes invalid.
This patch replaces a SmallVector with an singly-linked list. The current
approach stores a SmallVector<NamedDecl*, 4> which is around 8 pointers.
The linked list is 3, 5, or 7. We do better in terms of memory usage for small
cases (and worse in terms of locality -- the linked list entries won't be near
each other, but will be near their corresponding declarations, and we were going
to fetch those memory pages anyway). For larger cases: the vector uses a
doubling strategy for reallocation, so will generally be between half-full and
full. Let's say it's 75% full on average, so there's N * 4/3 + 4 pointers' worth
of space allocated currently and will be 2N pointers with the linked list. So we
break even when there are N=6 entries and slightly lose in terms of memory usage
after that. We suspect that's still a win on average.
Thanks to @rsmith!
Differential revision: https://reviews.llvm.org/D91524
Apply changes from https://reviews.llvm.org/D91014 to other places where DWARF entries are being processed.
Test case is provided by @jankratochvil.
The test is marked to run only on x64 and exclude Windows and Darwin, because the assembly is not OS-independent.
(First attempt https://reviews.llvm.org/D96778 broke the build bots)
Reviewed By: jankratochvil
Differential Revision: https://reviews.llvm.org/D97765
In DWARF v4 compile units go in .debug_info and type units go in
.debug_types. However, in v5 both kinds of units are in .debug_info.
Therefore we can't decide whether to use the CU or TU index just by
looking at which section we're reading from. We have to wait until we
have read the unit type from the header.
Differential Revision: https://reviews.llvm.org/D96194
The comment for ValueType claims that all values <1 are errors, but
not all switch statements take this into account. This patch
introduces an explicit Error case and deletes all default: cases, so
we get warned about incomplete switch coverage.
https://reviews.llvm.org/D96537
Finishing out the support (to the best of my knowledge/based on current
testing running the whole check-lldb with a clang forcibly using
DW_AT_ranges on all DW_TAG_subprograms) for this feature.
Differential Revision: https://reviews.llvm.org/D94064
gcc already produces debug info with this form
-freorder-block-and-partition
clang produces this sort of thing with -fbasic-block-sections and with a
coming-soon tweak to use ranges in DWARFv5 where they can allow greater
reuse of debug_addr than the low/high_pc forms.
This fixes the case of breaking on a function name, but leaves broken
printing a variable - a follow-up commit will add that and improve the
test case to match.
Differential Revision: https://reviews.llvm.org/D94063
In split DWARF v5 files, the DWO id is no longer in the DW_AT_GNU_dwo_id
attribute. It's in the CU header instead. This change makes lldb look in
both places.
Differential Revision: https://reviews.llvm.org/D93444
This patch introduces a LLDB_SCOPED_TIMER macro to hide the needlessly
repetitive creation of scoped timers in LLDB. It's similar to the
LLDB_LOG(F) macro.
Differential revision: https://reviews.llvm.org/D93663
To get LLDB one step closer to fulfil the software redundancy requirements of
modern aircrafts, we apparently decided to have two separately maintained
implementations of `CreateTypedef` in TypeSystemClang. Let's pass on the idea of
an LLDB-powered jetliner and deleted one implementation.
On a more serious note: This function got duplicated a long time ago when the
idea of CompilerType with a backing TypeSystemClang subclass happened
(56939cb310). One implementation was supposed to
be called from CompilerType::CreateTypedef and the other has just always been
around to create typedefs. By accident one of the implementations is only used
by the PDB parser while the CompilerType::CreateTypedef backend is used by the
rest of LLDB.
We also had some patches over the year that only fixed one of the two functions
(D18099 for example only fixed up the CompilerType::CreateTypedef
implementation). D51162 and D86140 both fixed the same missing `addDecl` call
for one of the two implementations.
This patch:
* deletes the `CreateTypedefType` function as its only used by the PDB parser
and the `CreateTypedef` implementation is anyway needed as it's the backend
implementation of CompilerType.
* replaces the calls in the PDB parser by just calling the CompilerType wrapper.
* moves the documentation to the remaining function.
* moves the check for empty typedef names that was only in the deleted
implementation to the other (I don't think this fixes anything as I believe
all callers are already doing the same check).
I'll fix up the usual stuff (not using StringRef, not doing early exit) in a NFC
follow-up.
This patch is not NFC as the PDB parser now calls the function that has the fix
from D18099.
Reviewed By: labath, JDevlieghere
Differential Revision: https://reviews.llvm.org/D93382
We currently reject all templates that have either zero args or that have a
parameter pack without a name. Both cases are actually allowed in C++, so
rejecting them leads to LLDB instead falling back to a dummy 'void' type. This
leads to all kind of errors later on (most notable, variables that have such
template types appear to be missing as we can't have 'void' variables and
inheriting from such a template type will cause Clang to hit some asserts when
finding that the base class is 'void').
This just removes the too strict tests and adds a few tests for this stuff (+
some combinations of these tests with preceding template parameters).
Things that I left for follow-up patches:
* All the possible interactions with template-template arguments which seem like a whole new source of possible bugs.
* Function templates which completely lack sanity checks.
* Variable templates are not implemented.
* Alias templates are not implemented too.
* The rather strange checks that just make sure that the separate list of
template arg names and values always have the same length. I believe those
ought to be asserts, but my current plan is to move both those things into a
single list that can't end up in this inconsistent state.
Reviewed By: JDevlieghere, shafik
Differential Revision: https://reviews.llvm.org/D92425
When parsing DWARF and laying out bit-fields we don't properly take into account when they are in a union, they will all have a zero offset.
Differential Revision: https://reviews.llvm.org/D91118
I found a few cases where entries in the debug_line for a specific line of code have invalid entries (the address is outside of a code section or no section at all) and also valid entries. When this happens lldb might not set the breakpoint because the first line entry it will find in the line table might be the invalid one and since it's range is "invalid" no location is resolved. To get around this I changed the way we parse the line sequences to ignore those starting at an address under the first code segment.
Greg suggested to implement it this way so we don't need to check all sections for every line sequence.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D87172
This would be reproducible in future DWZ category of the testsuite as:
Failed Tests (1):
lldb-api :: python_api/symbol-context/two-files/TestSymbolContextTwoFiles.py
Differential Revision: https://reviews.llvm.org/D91014
Current user_id_t format is:
63{isDebugTypes} 62..32{dwo || 7fffffff}
31..0 {die_offset}
while current DIERef format is (I have made up the bit positions but the
field widths do match):
63{m_section==isDebugTypes} 62{m_dwo_num_valid} 61..32{m_dwo_num}
31..0 {m_die_offset}
Proposing to change user_id_t to:
63{isDebugTypes} 62{dwo_is_valid} 61..32{dwo; 0 if !valid}
31..0 {die_offset}
There is no benefit of having 31-bits wide dwo_num in user_id_t when it
gets converted to 30-bits width in DIERef.
This patch is for future DWZ patchset which extends the dwo_is_valid bit
into a 2-bit field (normal, DWO, DWZ, DWZcommon) so that both user_id_t
and DIERef can be changed then the same way.
It would be best to somehow unify user_id_t and DIERef but I do not plan
to do that. user_id_t should probably remain a number for the Python API
compatibility while there still needs to be some class with all the
methods to access it.
SymbolFileDWARF::GetDwpSymbolFile() and SymbolFileDWARF::GetDIE use
0x3fffffff for DWP but that does not clash:
formerly:
31bits32..62:0x7fffffff = normal unit / not any DWO
31bits32..62:0x3fffffff = DWP
31bits32..62:others = DWO unit number
after this patch:
bit62=0 30bits32..61:any = normal unit / not any DWO
bit62=1 30bits32..61:0x3fffffff = DWP
bit62=1 30bits32..61:others = DWO unit number
Differential Revision: https://reviews.llvm.org/D90413
SymbolFileDWARF::GetTypes was not handling dwo correctly. The fix is
simple -- adding a GetNonSkeletonUnit call -- but I've snuck in a small
refactor as well.
LookupAddress makes no sense for DWARFTypeUnit.
Also make GetNonSkeletonUnit to preserve the called type.
Differential Revision: https://reviews.llvm.org/D89646
Only SymbolFileDWARF::ParseCompileUnit creates a CompileUnit and it uses
DWARFCompileUnit for that.
Differential Revision: https://reviews.llvm.org/D89165
This is a polymorphic class, copying it is a bad idea.
This was not a problem because most classes inheriting from it were
deleting their copy operations themselves. However, this enables us to
delete those explicit deletions, and ensure noone forgets to add them in
the future.
Add an optimal thread strategy to execute specified amount of tasks.
This strategy should prevent us from creating too many threads if we
occasionaly have an unexpectedly small amount of tasks.
Differential Revision: https://reviews.llvm.org/D87765
Code was added that used llvm error checking to parse .debug_aranges, but the error check after parsing the DWARFDebugArangesSet was reversed and was causing no error to be returned with no valid address ranges being actually used. This meant we always would fall back onto creating out own address ranges by parsing the compile unit's ranges. This was causing problems for cases where the DW_TAG_compile_unit had a single address range by using a DW_AT_low_pc and DW_AT_high_pc attribute pair (not using a DW_AT_ranges attribute), but the .debug_aranges had correct split ranges. In this case we would end up using the single range for the compile unit that encompassed all of the ranges from the .debug_aranges section and would cause address resolving issues in LLDB where address lookups would fail for certain addresses.
Differential Revision: https://reviews.llvm.org/D87626
`image dump symtab` seems to output the symbols in whatever order they appear in
the DenseMap that is used to filter out symbols with non-unique addresses. As
DenseMap is a hash map this order can change at any time so the output of this
command is pretty unstable. This also causes the `Breakpad/symtab.test` to fail
with enabled reverse iteration (which reverses the DenseMap order to find issues
like this).
This patch makes the DenseMap a std::vector and uses a separate DenseSet to do
the address filtering. The output order is now dependent on the order in which
the symbols are read (which should be deterministic). It might also avoid a bit
of work as all the work for creating the Symbol constructor parameters is only
done when we can actually emplace a new Symbol.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D87036
Class-level static constexpr variables can have both DW_AT_const_value
(in the "declaration") and a DW_AT_location (in the "definition")
attributes. Our code was trying to handle this, but it was brittle and
hard to follow (and broken) because it was processing the attributes in
the order in which they were found.
Refactor the code to make the intent clearer -- DW_AT_location trumps
DW_AT_const_value, and fix the bug which meant that we were not
displaying these variables properly (the culprit was the delayed parsing
of the const_value attribute due to a need to fetch the variable type.
Differential Revision: https://reviews.llvm.org/D86615
This fixes several issues in handling of DW_AT_const_value attributes:
- the first is that the size of the data given by data forms does not
need to match the size of the underlying variable. We already had the
case to handle this for DW_FORM_(us)data -- this extends the handling
to other data forms. The main reason this was not picked up is because
clang uses leb forms in these cases while gcc prefers the fixed-size
ones.
- The handling of DW_AT_strp form was completely broken -- we would end
up using the pointer value as the result. I've reorganized this code
so that it handles all string forms uniformly.
- In case of a completely bogus form we would crash due to
strlen(nullptr).
Depends on D86311.
Differential Revision: https://reviews.llvm.org/D86348
This is very similar to D85968, only more elusive to since we were not
adding the typedef type to the relevant DeclContext until D86140, which
meant that the DeclContext was populated (and the relevant assertion
hit) only after importing the type into the expression ast in a
particular way.
I haven't checked whether this situation can be hit in the gmodules
case, but my money is on "yes".
Differential Revision: https://reviews.llvm.org/D86216
Parsing DWARFv5 debug_loclist offsets when a CU is parsed is weighing
down memory usage of symbolizers that don't need to parse this data at
all. There's not much benefit to caching these anyway - since they are
O(1) lookup and reading once you know where the offset list starts (and
can do bounds checking with the offset list size too).
In general, I think it might be time to start paying down some of the
technical debt of loc/loclist/range/rnglist parsing to try to unify it a
bit more.
eg:
* Currently DWARFUnit has: RangeSection, RangeSectionBase, LocSection,
LocSectionBase, LocTable, RngListTable, LoclistTableHeader (be nice if
these were all wrapped up in two variables - one for loclists, one for
rnglists)
* rnglists and loclists are handled differently (see:
LoclistTableHeader, but no RnglistTableHeader)
* maybe all these types could be less stateful - lazily parse what they
need to, even reparsing rather than caching because it doesn't seem
too expensive, for instance. (though admittedly so long as it's
constantcost/overead per compilatiton that's probably adequate)
* Maybe implementing and using a DWARFDataExtractor that can be
sub-ranged (so we could slice it up to just the single contribution) -
though maybe that's not so useful because loc/ranges need to refer to
it by absolute, not contribution-relative mechanisms
Differential Revision: https://reviews.llvm.org/D86110
CreateFunctionDeclaration should just take a StringRef. GetDeclarationName is
(only) used by CreateFunctionDeclaration so that's why now also takes a
StringRef.
With -flimit-debug-info, we can run into cases when we only have a class
as a declaration, but we do have a definition of a nested class. In this
case, clang will hit an assertion when adding a member to an incomplete
type (but only if it's adding a c++ class, and not C struct).
It turns out we already had code to handle a similar situation arising
in the -gmodules scenario. This extends the code to handle
-flimit-debug-info as well, and reorganizes bits of other code handling
completion of types to move functions doing similar things closer
together.
Differential Revision: https://reviews.llvm.org/D85968
When loading a PE/COFF target, the associated PDB file often wasn't
found. The executable module contains a path for the associated PDB
file, but people often debug from a different directory than the one
their build system uses. (This is especially common in post-mortem
and cross platform debugging.)
Suppose the COFF executable being debugged is `~/proj/foo.exe`, but
it was built elsewhere and refers to `D:\remote\build\env\foobar.pdb`,
LLDB wouldn't find it.
With this change, if no file exists at the PDB path, LLDB will look
in the executable directory for a PDB file that matches the name of
the one it expected (e.g., `~/proj/foobar.pdb`). If found, the PDB
is subject to the same matching criteria (GUIDs and age) as would
have been used had it been in the original location.
This same-directory-as-the-binary rule is commonly used by debuggers
on Windows.
Differential Review: https://reviews.llvm.org/D84815
Summary:
This effectively reverts r188124, which added code to handle
(DW_AT_)declarations of structures with some kinds of children as
definitions. The commit message claims this is a workaround for some
kind of debug info produced by gcc. However, it does not go into
specifics, so it's hard to reproduce or verify that this is indeed still a
problem.
Having this code is definitely a problem though, because it mistakenly
declares incomplete dwarf declarations to be complete. Both clang (with
-flimit-debug-info) and gcc (by default) generate DW_AT_declarations of
structs with children. This happens when full debug info for a class is
not emitted in a given compile unit (e.g. because of vtable homing), but
the class has inline methods which are used in the given compile unit.
In that case, the compilers emit a DW_AT_declaration of a class, but
add a DW_TAG_subprogram child to it to describe the inlined instance of
the method.
Even though the class tag has some children, it definitely does not
contain enough information to construct a full class definition (most
notably, it lacks any members). Keeping the class as incomplete allows
us to search for a real definition in other modules, helping the
-flimit-debug-info flow. And in case the definition is not found we can
display a error message saying that, instead of just showing an empty
struct.
Reviewers: clayborg, aprantl, JDevlieghere, shafik
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D83302
This patch has no effect for C and C++. In more dynamic languages,
such as Objective-C and Swift GetByteSize() needs to call into the
language runtime, so it's important to pass one in where possible. My
primary motivation for this is some work I'm doing on the Swift
branch, however, it looks like we are also seeing warnings in
Objective-C that this may resolve. Everything in the SymbolFile
hierarchy still passes in nullptrs, because we don't have an execution
context in SymbolFile, since SymbolFile transcends processes.
Differential Revision: https://reviews.llvm.org/D84267
Currently expressions dealing with bit-fields in Objective-C objects is pretty broken. When generating debug-info for Objective-C bit-fields DW_AT_data_bit_offset has a different meaning than it does to C and C++.
When we parse the DWARF we validate bit offsets for C and C++ correctly but not for ObjC. For ObjC in some cases we end up incorrectly flagging an error and we don't generate further bit-fields in the AST.
Later on when we do a name lookup we don't find the ObjCIvarDecl in the ObjCInterfaceDecl in some cases since we never added it and then we don't go to the runtime to obtain the offset.
This will fix how we handle bit-fields for the Objective-C case and add tests to verify this fix but also to documents areas that still don't work and will be addressed in follow-up PRs.
Note: we can never correctly calculate offsets statically because of how Objective-C deals with the fragile base class issue. Which means the runtime may need to shift fields over.
Differential Revision: https://reviews.llvm.org/D83433
Summary:
With D81784, lld has started debug info resolving relocations to
garbage-collected symbols as -1 (instead of relocation addend). For an
unaware consumer this generated sequences which seemingly wrap the
address space -- their first entry was 0xfffff, but all other entries
were low numbers.
Lldb stores line sequences concatenated into one large vector, sorted by
the first entry, and searched with std::lower_bound. This resulted in
the low-value entries being placed at the end of the vector, which
utterly confused the lower_bound algorithm, and caused it to not find a
match. (Previously, these sequences would be at the start of the vector,
and normally would contain addresses that are far smaller than any real
address we want to look up, so std::lower_bound was fine.)
This patch makes lldb ignore these kinds of sequences completely. It
does that by changing the construction algorithm from iterating over the
rows (as parsed by llvm), to iterating over the sequences. This is
important because the llvm parsed performs validity checks when
constructing the sequence array, whereas the row array contains raw
data.
Reviewers: JDevlieghere, MaskRay
Differential Revision: https://reviews.llvm.org/D83957
There is a local 'class_language' veriable in DWARFASTParserClang which is named
as if it is related to the 'class_language' member of ParsedDWARFTypeAttributes.
However, it actually only has two possible enum values: 'ObjC' (which means the
current record is a Objective-C class) or 'Unknown' (which covers all other
cases).
This is confusing for the reader and also lead to some strange code where we
have several comparisons against the value "ObjC_plus_plus" (which is always
false).
This replaces the variable with either a const bool variable (if there are
multiple checks for that condition in a function) or a direct call to the
TypeSystemClang utility method for checking if it's a Objective-C
Object/Interface type.
Summary:
DWARF-parsing methods in SymbolFileDWARF which update module state
typically take the module lock. ParseCallEdgesInFunction doesn't do
this, but higher-level locking within lldb::Function (which owns the
storage for parsed call edges) is necessary.
The lack of locking could explain some as-of-yet unreproducible crashes
which occur in Function::GetTailCallingEdges(). In these crashes, the
`m_call_edges` vector is non-empty but contains a nullptr, which
shouldn't be possible. (If this vector is non-empty, it _must_ contain a
non-null unique_ptr.)
This may address rdar://55622443 and rdar://65119458.
Reviewers: jasonmolenda, friss, jingham
Subscribers: aprantl, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D83359
Summary:
Unify the code for requiring a complete type and move it into a single
place. The only functional change is that the "cannot start a definition
of an incomplete type" is upgrated from a runtime error/warning to an
lldbassert. An plain assert might also be fine, since (AFAICT) this can
only happen in case of a programmer error.
Reviewers: teemperor, aprantl, shafik
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D83199
With -flimit-debug-info, we can have a definition of a class, but no
definition for some of its members. This extends the same logic we were
using for incomplete base classes to cover incomplete members too.
Test forward-declarations.s is removed as it is no longer applicable --
we don't warn anymore when encountering incomplete members as they could
be completed elsewhere. New checks added to TestLimitDebugInfo cover the
handling of incomplete members more thoroughly.
Summary:
This patch adds support for evaluation of expressions referring to types
which were compiled in -flimit-debug-info (a.k.a -fno-standalone-debug)
in clang. In this mode it's possible that the debug information needed
to fully describe a c++ type is not present in a single shared library
-- for example debug info for a base class or a member of a type can
only be found in another shared library. This situation is not
currently handled well within lldb as we are limited to searching within
a single shared library (lldb_private::Module) when searching for the
definition of these types.
The way that this patch gets around this limitation is by doing the
search at a later stage -- during the construction of the expression ast
context. This works by having the parser (currently SymbolFileDWARF, but
a similar approach is probably needed for PDBs too) mark a type as
"forcefully completed". What this means is that the parser has marked
the type as "complete" in the module ast context (as this is necessary
to e.g. derive classes from it), but its definition is not really there.
This is done via a new field on the ClangASTMetadata struct.
Later, when we are importing such a type into the expression ast, we
check this flag. If the flag is set, we try to find a better definition
for the type in other shared libraries. We do this by initiating a
new lookup for the "forcefully completed" classes, which then imports the
type from a module with a full definition.
This patch only implements this handling for base classes, but other
cases (members, array element types, etc.). The changes for that should
be fairly simple and mostly revolve around marking these types as
"forcefully completed" at an approriate time -- the importing logic is
generic already.
Another aspect, which is also not handled by this patch is viewing these
types via the "frame variable" command. This does not use the AST
importer and so it will need to handle these types on its own -- that
will be the subject of another patch.
Differential Revision: https://reviews.llvm.org/D81561
Summary:
When evaluating an expression referencing a constexpr static member variable, an
error is issued because the PDB does not specify a symbol with an address that
can be relocated against.
Rather than attempt to resolve the variable's value within the IR execution, the
values of all constants can be looked up and incorporated into the AST of the
record type as a literal, mirroring the original compiler AST.
This change applies to DIA and native PDB loaders.
Patch By: jackoalan
Reviewers: aleksandr.urakov, jasonmolenda, zturner, jdoerfert, teemperor
Reviewed By: aleksandr.urakov
Subscribers: sstefan1, lldb-commits, llvm-commits, #lldb
Tags: #lldb, #llvm
Differential Revision: https://reviews.llvm.org/D82160
Prior to my patch of using the LLVM line table parsing code,
SymbolFileDWARF::ParseSupportFiles would only parse the line table
prologues to get the file list for any files that could be in the line
table.
With the old behavior, if we found the file that someone is setting the
breakpoint in in the support files list, we would get a valid index. If
we didn't, we would not look any further. So someone sets a breakpoint
one "MyFile.cpp:12" and if we find "MyFile.cpp" in the support file list
for the compile unit, then and only then would we get the entire line
table for that compile unit.
With the current behavior, no matter what, we always fully parse the
line table for all compile units any time any file and line breakpoint
is set. This creates a serious problem when debugging a large DWARF in
.o file project.
This patch re-instates the old behavior. Unfortunately it means we might
end up parsing to prologue twice, but I don't think that outweighs the
cost of trying to cache/reuse it.
Differential revision: https://reviews.llvm.org/D81589
D80519 <https://reviews.llvm.org/D80519>
added support for `DW_TAG_GNU_call_site` but
Bug 45886 <https://bugs.llvm.org/show_bug.cgi?id=45886>
found one case did not work.
There is:
0x000000b1: DW_TAG_GNU_call_site
DW_AT_low_pc (0x000000000040111e)
DW_AT_abstract_origin (0x000000cc "a")
...
0x000000cc: DW_TAG_subprogram
DW_AT_name ("a")
DW_AT_prototyped (true)
DW_AT_low_pc (0x0000000000401109)
^^^^^^^^^^^^ - here it did overwrite the 'low_pc' variable containing value 0x40111e we wanted
DW_AT_high_pc (0x0000000000401114)
DW_AT_frame_base (DW_OP_call_frame_cfa)
DW_AT_GNU_all_call_sites (true)
DW_TAG_GNU_call_site attributes order as produced by GCC:
0x000000b1: DW_TAG_GNU_call_site
DW_AT_low_pc (0x000000000040111e)
DW_AT_abstract_origin (0x000000cc "a")
clang produces the attributes in opposite order:
0x00000064: DW_TAG_GNU_call_site
DW_AT_abstract_origin (0x0000002a "a")
DW_AT_low_pc (0x0000000000401146)
Differential Revision: https://reviews.llvm.org/D81334
Summary:
The way that the support for the GNU dialect of tail call frames was
implemented in D80519 meant that the were reporting very bogus PC values
which pointed into the middle of an instruction: the -1 trick is
necessary for the address to resolve to the right function, but we
should still be reporting a more realistic PC value -- I say "realistic"
and not "real", because it's very debatable what should be the correct
PC value for frames like this.
This patch achieves that my moving the -1 from SymbolFileDWARF into the
stack frame computation code. The idea is that SymbolFileDWARF will
merely report whether it has provided an address of the instruction
after the tail call, or the address of the call instruction itself. The
StackFrameList machinery uses this information to set the "behaves like
frame zero" property of the artificial frames (the main thing this flag
does is it controls the -1 subtraction when looking up the function
address).
This required a moderate refactor of the CallEdge class, because it was
implicitly assuming that edges pointing after the call were real calls
and those pointing the the call insn were tail calls. The class now
carries this information explicitly -- it carries three mostly
independent pieces of information:
- an address of interest in the caller
- a bit saying whether this address points to the call insn or after it
- whether this is a tail call
Reviewers: vsk, dblaikie
Subscribers: aprantl, mgrang, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D81010
Summary:
The code changes are very straight-forward -- just handle both DW_AT_GNU
and DW_AT_call versions of all tags and attributes. There is just one
small gotcha: in the GNU version, DW_AT_low_pc was used both for the
"return pc" and the "call pc" values, depending on whether the tag was
describing a tail call, while the official scheme uses different
attributes for the two things.
Reviewers: vsk, dblaikie
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D80519
The llvm DWARFExpression dump is nearly identical, but better -- for
example it does print a spurious space after zero-argument expressions.
Some parts of our code (variable locations) have been already switched
to llvm-based expression dumping. This switches the remainder: unwind
plans and some unit tests.
For Swift LLDB (but potentially also for module support in Clang-land)
we need a way to accumulate the path remappings produced by
Module::RegisterXcodeSDK(). In order to make this work for
SymbolFileDebugMaps, registering the search path remapping with both
modules is necessary.
Differential Revision: https://reviews.llvm.org/D79384
<rdar://problem/62750529>
When debugging from a SymbolMap the creation of CompileUnits for the
individual object files is so lazy that RegisterXcodeSDK() is not
invoked at all before the Swift TypeSystem wants to read it. This
patch fixes this by introducing an explicit
SymbolFile::ParseXcodeSDK() call that can be invoked deterministically
before the result is required.
<rdar://problem/62532151+62326862>
https://reviews.llvm.org/D79273
The dumping code is not used by anyone, and is a source of
inconsistencies with the llvm dwarf parser, as dumping is implemented at
a different level (DWARFDie) there.
The cause of this crash is relatively simple -- we are using a
SymbolFileDWARFDwo to parse a (skeleton) dwarf unit. This cause the
CompileUnit to be created with the wrong ID, which later triggers an
assertion in SymbolFile::SetCompileUnitAtIndex. The fix is also simple
-- ensure we use the right symbol file for parsing.
However, a fairly elaborate setup is needed trigger this bug, because
ParseCompileUnit is normally called very early on (and with the right
symbol file object) during the process of accessing a compile unit.
The only way this can be triggered is if the DWARF unit is
"accidentally" pulled into scope during expression evaluation
This can happen if the "this" object used for the context of an
expression is in a namespace, and that namespace is also present in
other compile units
The included test recreates this setup.
Summary:
The code in DWARFCompileUnit::BuildAddressRangeTable tries hard to avoid
relying on DW_AT_low/high_pc for compile unit range information, and
this logic is a big cause of llvm/lldb divergence in the lowest layers
of dwarf parsing code.
The implicit assumption in that code is that this information (as opposed to
DW_AT_ranges) is unreliable. However, I have not been able to verify
that assumption. It is definitely not true for all present-day
compilers (gcc, clang, icc), and it was also not the case for the
historic compilers that I have been able to get a hold of (thanks Matt
Godbolt).
All compiler included in my research either produced correct
DW_AT_ranges or .debug_aranges entries, or they produced no DW_AT_hi/lo
pc at all. The detailed findings are:
- gcc >= 4.4: produces DW_AT_ranges and .debug_aranges
- 4.1 <= gcc < 4.4: no DW_AT_ranges, no DW_AT_high_pc, .debug_aranges
present. The upper version range here is uncertain as godbolt.org does
not have intermediate versions.
- gcc < 4.1: no versions on godbolt.org
- clang >= 3.5: produces DW_AT_ranges, and (optionally) .debug_aranges
- 3.4 <= clang < 3.5: no DW_AT_ranges, no DW_AT_high_pc, .debug_aranges
present.
- clang <= 3.3: no DW_AT_ranges, no DW_AT_high_pc, no .debug_aranges
- icc >= 16.0.1: produces DW_AT_ranges
- icc < 16.0.1: no functional versions on godbolt.org (some are present
but fail to compile)
Based on this analysis, I believe it is safe to start trusting
DW_AT_low/high_pc information in dwarf as well as remove the code for
manually reconstructing range information by traversing the DIE
structure, and just keep the line table fallback. The only compilers
where this will change behavior are pre-3.4 clangs, which are almost 7
years old now. However, the functionality should remain unchanged
because we will be able to reconstruct this information from the line
table, which seems to be needed for some line-tables-only scenarios
anyway (haven't researched this too much, but at least some compilers
seem to emit DW_AT_ranges even in these situations).
In addition, benchmarks showed that for these compilers computing the
ranges via line tables is noticably faster than doing so via the DIE
tree.
Other advantages include simplifying the code base, removing some
untested code (the only test changes are recent tests with overly
reduced synthetic dwarf), and increasing llvm convergence.
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D78489
Pavel Labath wrote in D73206:
The internal representation of DebugNames and Apple indexes is fixed by
the relevant (pseudo-)standards, so we can't really change it. The
question is how to efficiently (and cleanly) convert from the internal
representation to some common thing. The conversion from AppleIndex to
DIERef is trivial (which is not surprising as it was the first and the
overall design was optimized for that). With debug_names, the situation
gets more tricky. The internal representation of debug_names uses
CU-relative DIE offsets, but DIERef wants an absolute offset. That means
the index has to do more work to produce the common representation. And
it needs to do that for all results, even though a lot of the index
users are really interested only in a single entry. With the switch to
user_id_t, _all_ indexes would have to do some extra work to encode it,
only for their users to have to immediately decode it back. Having
a iterator/callback based api would allow us to minimize the impact of
that, as it would only need to happen for the entries that are really
used. And /I think/ we could make it interface returns DWARFDies
directly, and each index converts to that using the most direct approach
available.
Jan Kratochvil:
It also makes all the callers shorter as they no longer need to fetch
DWARFDIE from DIERef (and handling if not found by ReportInvalidDIERef)
but the callers are already served DWARFDIE which they need.
In some cases the DWARFDIE had to be fetched both by callee (DWARFIndex
implementation) and caller.
Differential Revision: https://reviews.llvm.org/D77970
It removes some needless deep indentation and some redundant statements.
It prepares the code for a more clean next patch - DWARF index callbacks
D77327.
Differential Revision: https://reviews.llvm.org/D77326
Types that came from a Clang module are nested in DW_TAG_module tags
in DWARF. This patch recreates the Clang module hierarchy in LLDB and
1;95;0csets the owning module information accordingly. My primary motivation
is to facilitate looking up per-module APINotes for individual
declarations, but this likely also has other applications.
This reapplies the previously reverted commit, but without support for
ClassTemplateSpecializations, which I'm going to look into separately.
rdar://problem/59634380
Differential Revision: https://reviews.llvm.org/D75488
This is mostly useful for Swift support; it allows LLDB to substitute
a matching SDK it shipped with instead of the sysroot path that was
used at compile time.
The goal of this is to make the Xcode SDK something that behaves more
like the compiler's resource directory, as in that it ships with LLDB
rather than with the debugged program. This important primarily for
importing Swift and Clang modules in the expression evaluator, and
getting at the APINotes from the SDK in Swift.
For a cross-debugging scenario, this means you have to have an SDK for
your target installed alongside LLDB. In Xcode this will always be the
case.
rdar://problem/60640017
Differential Revision: https://reviews.llvm.org/D76471
This is a preparation for an upcoming patch which adds support for
DWARFv5 unit index sections. The patch adds tag "_EXT_" to identifiers
which reference sections that are deprecated in the DWARFv5 standard.
See D75929 for the discussion.
Differential Revision: https://reviews.llvm.org/D77141
Apparently the intention was to copy the condition above:
if (types.GetSize() >= max_matches)
break;
So that if the iteration stopped because of too many matches we do not
add even more matches in this 'Clang modules' block downward.
It was implemented by:
SymbolFileDWARF: Unconditionally scan through clang modules. NFCish
fe9eaadd68
Differential Revision: https://reviews.llvm.org/D77336
It is an obvious part of D77326.
It removes some needless deep indentation and some redundant statements.
It prepares the code for a more clean next patch - DWARF index callbacks
in D77327.
The old name was a bit misleading because the functions actually return
contributions to the corresponding sections.
Differential revision: https://reviews.llvm.org/D77302
Types that came from a Clang module are nested in DW_TAG_module tags
in DWARF. This patch recreates the Clang module hierarchy in LLDB and
sets the owning module information accordingly. My primary motivation
is to facilitate looking up per-module APINotes for individual
declarations, but this likely also has other applications.
rdar://problem/59634380
Differential Revision: https://reviews.llvm.org/D75488
When parsing DWARF and laying out bit-fields we currently don't take into account whether we have a base class or not.
Currently if the first field is a bit-field but the bit offset is due a field we inherit from a base class we currently
treat it as an unnamed bit-field and therefore add an extra field.
This fix will not check if we have a base class and assume that this offset is due to members we are inheriting from the base.
We are currently seeing asserts during codegen when debugging clang::DiagnosticOptions.
This assumption will fail in the case where the first field in the derived class in an unnamed bit-field. Fixing the first field
being an unnamed bit-field looks like it will require a larger change since we will need a way to track or discover the last field offset of the bases(s).
Differential Revision: https://reviews.llvm.org/D76808
In breakpad, only x86 (and mips) registers have a leading '$' in their
names. Arm architectures use plain register names.
Previously, lldb was assuming all registers have a '$'. Fix the code to
match the (unfortunately, inconsistent) reality.
Reland with changes: the test modified in this change originally failed
on a Debian/x86_64 builder, and I suspect the cause was that lldb looked
up the line location for an artificial frame by subtracting 1 from the
frame's address. For artificial frames, the subtraction must not happen
because the address is already exact.
---
lldb currently guesses the address to use when creating an artificial
frame (i.e., a frame constructed by determining the sequence of (tail)
calls which must have happened).
Guessing the address creates problems -- use the actual address provided
by the DW_AT_call_pc attribute instead.
Depends on D76336.
rdar://60307600
Differential Revision: https://reviews.llvm.org/D76337
This reverts commit 6905394d15. The
changed test is failing on Debian/x86_64, possibly because lldb is
subtracting an offset from the DW_AT_call_pc address used for the
artificial frame:
http://lab.llvm.org:8011/builders/lldb-x86_64-debian/builds/7171/steps/test/logs/stdio
/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/test/API/functionalities/tail_call_frames/unambiguous_sequence/main.cpp:6:17: error: CHECK-NEXT: expected string not found in input
// CHECK-NEXT: frame #1: 0x{{[0-9a-f]+}} a.out`func3() at main.cpp:14:3 [opt] [artificial]
^
<stdin>:3:2: note: scanning from here
frame #1: 0x0000000000401127 a.out`func3() at main.cpp:13:4 [opt] [artificial]
lldb currently guesses the address to use when creating an artificial
frame (i.e., a frame constructed by determining the sequence of (tail)
calls which must have happened).
Guessing the address creates problems -- use the actual address provided
by the DW_AT_call_pc attribute instead.
Depends on D76336.
rdar://60307600
Differential Revision: https://reviews.llvm.org/D76337
Fix to get the AST we generate for function templates closer to what clang generates and expects.
We fix which FuntionDecl we are passing to CreateFunctionTemplateSpecializationInfo and we strip
template parameters from the name when creating the FunctionDecl and FunctionTemplateDecl.
These two fixes together fix asserts and ambiguous lookup issues for several cases which are added to the already existing small function template test.
This fixes issues with overloads, overloads and ADL, variadic function templates and templated operator overloads.
Differential Revision: https://reviews.llvm.org/D75761
If a producer emits a nonzero segment size, `lldb` will silently read
incorrect values and crash, or do something worse later as the tuple
size is expected to be 2, rather than 3.
Neither LLVM, nor GCC produce segmented aranges, but this dangerous case
should still be checked and handled.
Reviewed by: clayborg, labath
Differential Revision: https://reviews.llvm.org/D75925
Subscribers: lldb-commits
Tags: #lldb
Summary:
Currently `SymbolFileDWARF::TypeSet` is a typedef to a `std::set<Type *>`.
In `SymbolFileDWARF::GetTypes` we iterate over a TypeSet variable when finding
types so that logic is non-deterministic as it depends on the actual pointer address values.
This patch changes the `TypeSet` to a `llvm::UniqueVector` which always iterates in
the order in which we inserted the types into the list.
Reviewers: JDevlieghere, aprantl
Reviewed By: JDevlieghere
Subscribers: mgrang, abidh, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D75481
The convention is that the dwp file name is derived from the name of the
file holding the executable code, even if the linked portion of the
debug info is elsewhere (objcopy --only-keep-debug).
Summary:
When we added support for type units in dwo files, we changed the
"manual" dwarf index to index _all_ dwarf units in the dwo file instead
of just the split unit belonging to our skeleton unit. This was fine for
dwo files, as they contain only a single compile units and type units do
not have a split type unit which would point to them.
However, this does not work for dwp files because, these files do
contain multiple split compile units, and the current approach means
that each unit gets indexed multiple times (once for each split unit =>
n^2 complexity).
This patch teaches the manual dwarf index to treat dwp files specially.
Any type units in the dwp file added to the main list of compile units
and indexed with them in a single batch. Split compile units in dwp
files are still indexed as a part of their skeleton unit -- this is done
because we need the DW_AT_language attribute from the skeleton unit to
index them properly.
Handling of dwo files remains unchanged -- all units (type and skeleton)
are indexed when we reach the dwo file through the split unit.
Reviewers: clayborg, JDevlieghere, aprantl
Subscribers: arphaman, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D74964
Change the return value of SymbolFileDWARF::DebugInfo from a pointer to
a reference, and remove all null checks.
Previously, we were not constructing the DebugInfo object when the
debug_info section was empty. Now we always construct the object but
it will return an empty list of dwarf units (a thing which it already
supported).
The only thing needed was to account for the offset from the
debug_cu_index section when searching for the location list.
This patch also fixes a bug in the Module::ParseAllDebugSymbols
function, which meant that we would only parse the variables of the
first compile unit in the module. This function is only used from
lldb-test, so this does not fix any real issue, besides preventing me
from writing a test for this patch.
Summary:
In dwp files a constant (from the debug_cu_index section) needs to be
added to each reference into the debug_str_offsets section.
I've tried to implement this to roughly match the llvm flow: I've
changed the DWARFormValue to stop resolving the indirect string
references directly -- instead, it calls into DWARFUnit, which resolves
this for it (similar to how it already resolves indirect range and
location list references). I've also done a small refactor of the string
offset base computation code in DWARFUnit in order to make it easier to
access the debug_cu_index base offset.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D74723
Summary:
All of our lookup APIs either use `CompilerDeclContext &` or `CompilerDeclContext *` semi-randomly it seems.
This leads to us constantly converting between those two types (and doing nullptr checks when going from
pointer to reference). It also leads to the confusing situation where we have two possible ways to express
that we don't have a CompilerDeclContex: either a nullptr or an invalid CompilerDeclContext (aka a default
constructed CompilerDeclContext).
This moves all APIs to use references and gets rid of all the nullptr checks and conversions.
Reviewers: labath, mib, shafik
Reviewed By: labath, shafik
Subscribers: shafik, arphaman, abidh, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D74607
LLDB has a few different styles of header guards and they're not very
consistent because things get moved around or copy/pasted. This patch
unifies the header guards across LLDB and converts everything to match
LLVM's style.
Differential revision: https://reviews.llvm.org/D74743
This patch changes the way we initialize and terminate the plugins in
the system initializer. It uses an approach similar to LLVM's
TARGETS_TO_BUILD with a def file that enumerates the plugins.
The previously landed patch got reverted because it was lacking:
(1) A plugin definition for the Objective-C language runtime,
(2) The dependency between the Static and WASM dynamic loader,
(3) Explicit initialization of ScriptInterpreterNone for lldb-test.
All issues have been addressed in this patch.
Differential revision: https://reviews.llvm.org/D73067
This patch changes the way we initialize and terminate the plugins in
the system initializer. It uses an approach similar to LLVM's
TARGETS_TO_BUILD with a def file that enumerates the plugins.
Differential revision: https://reviews.llvm.org/D73067
As discussed in https://reviews.llvm.org/D73206#1871895 there is both
`DIERef` and `user_id_t` and sometimes (for DWZ) we need to encode Main
CU into them and sometimes we cannot as it is unavailable at that point
and at the same time not even needed.
I have also noticed `DIERef` and `user_id_t` in fact contain the same
information which can be seen in SymbolFileDWARF::GetUID.
SB* API/ABI is already using `user_id_t` and it needs to encode Main CU
for DWZ. Therefore what about making `DIERef` the identifier not
containing Main CU and `user_id_t` the identifier containing Main CU?
It is sort of a revert of D63322.
I find this patch as a NFC cleanup to the codebase - to satisfy a new
premise `user_id_t` is used as little as possible and thus only for
external interfaces which must not deal with MainCU in any way.
Its larger goal is to satisfy a plan to implement DWZ support.
Differential Revision: https://reviews.llvm.org/D74637
Summary:
This patch removes the bitrotted SymbolFileDWARF(Dwo)Dwp classes, and
replaces them with dwp support implemented directly inside
SymbolFileDWARFDwo, in a manner mirroring the implementation in llvm.
This patch does:
- add support for the .debug_cu_index section to our DWARFContext
- adds a llvm::DWARFUnitIndex argument to the DWARFUnit constructors.
This argument is used to look up the offsets of the debug_info and
debug_abbrev contributions in the sections of the dwp file.
- makes sure the creation of the DebugInfo object as well as the initial
discovery of DWARFUnits is thread-safe, as we can now call this
concurrently when doing parallel indexing.
This patch does not:
- use the DWARFUnitIndex to search for other kinds of contributions
(debug_loc, debug_ranges, etc.). This means that units which reference
these sections will not work correctly. These will be handled by
follow-up patches, but even the present level of support is sufficient
to enable basic functionality.
- Make the llvm::DWARFContext thread-safe. Right now, it just avoids this
problem by ensuring everything is initialized ahead of time. However,
this is something we will run into more often as we try to use more of
llvm, and so I plan to start looking into our options here.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: mgorny, mgrang, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73783
As discussed in https://reviews.llvm.org/D73206#1871895> simplifying
usage of `user_id_t`.
There is even written:
// The compile unit ID is the index of the DWARF unit.
DWARFUnit *dwarf_cu = info->GetUnitAtIndex(comp_unit->GetID());
Differential Revision: https://reviews.llvm.org/D74670
This is the second dwp preparatory patch. When a SymbolFileDWARFDwo will
hold more than one split unit, it will not be able to be uniquely owned
by a single DWARFUnit. I achieve this by changing the
unique_ptr<SymbolFileDWARFDwo> member of DWARFUnit to
shared_ptr<DWARFUnit>. The shared_ptr points to a DWARFUnit, but it is
in fact holding the entire SymbolFileDWARFDwo alive. This is the same
method used by llvm DWARFUnit (except that is uses the DWARFContext
class).
Differential Revision: https://reviews.llvm.org/D73782
Move the logic for initialization and termination for
SymbolFileDWARFDebugMap into SymbolFileDWARF so that there's one
initializer for the SymbolFileDWARF plugin.
This is a step towards making the initialize and terminate calls be
generated by CMake, which in turn is towards making it possible to
disable plugins at configuration time.
Differential revision: https://reviews.llvm.org/D74245
Summary:
This is a preparatory patch to re-enable DWP support in lldb (we already
have code claiming to do that, but it has been completely broken for a
while now).
The idea of the new approach is to make the SymbolFileDWARFDwo class
handle both dwo and dwo files, similar to how llvm uses one DWARFContext
to handle the two.
The first step is to remove the assumption that a SymbolFileDWARFDwo
holds just a single compile unit, i.e. the GetBaseCompileUnit method.
This requires changing the way how we reach the skeleton compile unit
(and the lldb_private::CompileUnit) from a dwo unit, which was
previously done via GetSymbolFile()->GetBaseCompileUnit() (and some
virtual dispatch).
The new approach reuses the "user data" mechanism of DWARFUnits, which
was used to link dwarf units (both skeleton and split) to their
lldb_private counterparts. Now, this is done only for non-dwo units, and
instead of that, the dwo units holds a pointer to the relevant skeleton
unit.
Reviewers: JDevlieghere, aprantl, clayborg
Reviewed By: JDevlieghere, clayborg
Subscribers: arphaman, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73781
Summary:
I think that there are very few things from clang that actually need forward
declaration, so not having a ClangForward header makes sense.
Differential Revision: https://reviews.llvm.org/D73827
Summary:
This change represents the move of ClangASTImporter, ClangASTMetadata,
ClangExternalASTSourceCallbacks, ClangUtil, CxxModuleHandler, and
TypeSystemClang from lldbSource to lldbPluginExpressionParserClang.h
This explicitly removes knowledge of clang internals from lldbSymbol,
moving towards a more generic core implementation of lldb.
Reviewers: JDevlieghere, davide, aprantl, teemperor, clayborg, labath, jingham, shafik
Subscribers: emaste, mgorny, arphaman, jfb, usaxena95, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73661
Reverting part of commit 789beeeca3.
Its DWARFDebugInfoEntry::GetDWARFDeclContext() refactorization for
return value is now adding it in opposite order.
This patchset is removing non-DWARF code from DWARFUnit for better
future merge with LLVM DWARF as discussed with @labath.
Differential revision: https://reviews.llvm.org/D70646
- m_debug_loc(lists) are unused since the relevant logic was moved to
DWARFContext.
- const versions of DebugInfo(), DebugAbbrev() are not used, and they
are dangerous to use as they do not initialize the relevant objects.
This adds a conversion function from clang::Decl to CompilerDecl. It checks
that the TypeSystemClang in the CompilerDecl actually fits to the clang::Decl
AST during creation, thus preventing the creation of CompilerDecl instances with
inconsistent state.
Many of the debug line prologue errors are not inherently fatal. In most
cases, we can make reasonable assumptions and carry on. This patch does
exactly that. In the case of length problems, the approach of "assume
stated length is correct" is taken which means the offset might need
adjusting.
This is a relanding of b94191fe, fixing an LLD test and the LLDB build.
Reviewed by: dblaikie, labath
Differential Revision: https://reviews.llvm.org/D72158
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
The old method of adding line sequences one by one can easily go
quadratic if the sequences are not perfectly sorted. The equivalent
change in DWARF brought a considerable improvement in line table
parsing. It is not clear if the same will be the case for PDB, but this
does bring us a step closer towards removing the dangerous API.
This reverts commit 1b12766883 because of
breaking the mac test suite.
I'm not certain this is the cause because of a concurrent build breakage
which masked this problem, but the failure messages are related to
symbol lookup, which makes this very likely.
Summary:
In the spirit of https://reviews.llvm.org/D70846, we only return functions with matching mangled name from Apple/DebugNamesDWARFIndex::GetFunction if eFunctionNameTypeFull is requested.
This speeds up lookup in the presence of large amount of class methods of the same name (a typical examples would be constructors of templates with many instantiations or overloaded operators).
Reviewers: labath
Reviewed By: labath
Subscribers: aprantl, arphaman, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73191
This was needed when asking a compile unit for its dwo component
triggered a infinite recursion if the dwo unit has not been already
parsed.
This has since been fixed.
Summary:
A *.cpp file header in LLDB (and in LLDB) should like this:
```
//===-- TestUtilities.cpp -------------------------------------------------===//
```
However in LLDB most of our source files have arbitrary changes to this format and
these changes are spreading through LLDB as folks usually just use the existing
source files as templates for their new files (most notably the unnecessary
editor language indicator `-*- C++ -*-` is spreading and in every review
someone is pointing out that this is wrong, resulting in people pointing out that this
is done in the same way in other files).
This patch removes most of these inconsistencies including the editor language indicators,
all the different missing/additional '-' characters, files that center the file name, missing
trailing `===//` (mostly caused by clang-format breaking the line).
Reviewers: aprantl, espindola, jfb, shafik, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: dexonsmith, wuzish, emaste, sdardis, nemanjai, kbarton, MaskRay, atanasyan, arphaman, jfb, abidh, jsji, JDevlieghere, usaxena95, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73258
We ran into an assert when debugging clang and performing an expression on a class derived from DeclContext. The assert was indicating we were getting the offsets wrong for RecordDeclBitfields. We were getting both the size and offset of unnamed bit-field members wrong. We could fix this case with a quick change but as I extended the test suite to include more combinations we kept finding more cases that were being handled incorrectly. A fix that handled all the new cases as well as the cases already covered required a refactor of the existing technique.
Differential Revision: https://reviews.llvm.org/D72953
Summary:
This commit renames ClangASTContext to TypeSystemClang to better reflect what this class is actually supposed to do
(implement the TypeSystem interface for Clang). It also gets rid of the very confusing situation that we have both a
`clang::ASTContext` and a `ClangASTContext` in clang (which sometimes causes Clang people to think I'm fiddling
with Clang's ASTContext when I'm actually just doing LLDB work).
I also have plans to potentially have multiple clang::ASTContext instances associated with one ClangASTContext so
the ASTContext naming will then become even more confusing to people.
Reviewers: #lldb, aprantl, shafik, clayborg, labath, JDevlieghere, davide, espindola, jdoerfert, xiaobai
Reviewed By: clayborg, labath, xiaobai
Subscribers: wuzish, emaste, nemanjai, mgorny, kbarton, MaskRay, arphaman, jfb, usaxena95, jingham, xiaobai, abidh, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D72684
Summary:
Our DWARFUnit was automatically forwarding the requests to the split
unit when looking for a DIE by offset. llvm::DWARFUnit does not do that,
and is not likely to start doing it any time soon.
This patch deletes the this logic and updates the callers to request the
correct unit instead. While doing that, I've found a bit of duplicated
code for lookup up a function and block by address, so I've extracted
that into a helper function.
Reviewers: JDevlieghere, aprantl, clayborg, jdoerfert
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D73112
We were creating a bunch of LineSequence objects but never deleting
them.
This fixes the leak and changes the code to use std::unique_ptr, to make
it harder to make the same mistake again.
Summary:
This code is handling debug info paths starting with /proc/self/cwd,
which is one of the mechanisms people use to obtain "relocatable" debug
info (the idea being that one starts the debugger with an appropriate
cwd and things "just work").
Instead of resolving the symlinks inside DWARFUnit, we can do the same
thing more elegantly by hooking into the existing Module path remapping
code. Since llvm::DWARFUnit does not support any similar functionality,
doing things this way is also a step towards unifying llvm and lldb
dwarf parsers.
Reviewers: JDevlieghere, aprantl, clayborg, jdoerfert
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71770
Summary:
Motivation: When setting breakpoints in certain projects line sequences are frequently being inserted out of order.
Rather than inserting sequences one at a time into a sorted line table, store all the line sequences as we're building them up and sort and flatten afterwards.
Reviewers: jdoerfert, labath
Reviewed By: labath
Subscribers: teemperor, labath, mgrang, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D72909
Summary:
This method was doing a lot more than it's only caller needed
(DWARFDIE::LookupDeepestBlock) needed, so I inline it into the caller,
and remove any code which is not actually used. This includes code for
searching for the deepest function, and the code for working around
incomplete DW_AT_low_pc/high_pc attributes on a compile unit DIE (modern
compiler get this right, and this method is called on function DIEs
anyway).
This also improves our llvm consistency, as llvm::DWARFDebugInfoEntry is
just a very simple struct with no nontrivial logic.
Reviewers: JDevlieghere, aprantl
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D72920
Summary:
The goal of this patch is two-fold. First, it fixes a use-after-free in
the construction of the llvm DWARFContext. This happened because the
construction code was throwing away the lldb DataExtractors it got while
reading the sections (unlike their llvm counterparts, these are also
responsible for memory ownership). In most cases this did not matter,
because the sections are just slices of the mmapped file data. But this
isn't the case for compressed elf sections, in which case the section is
decompressed into a heap buffer. A similar thing also happen with object
files which are loaded from process memory.
The second goal is to make it explicit which sections go into the llvm
DWARFContext -- any access to the sections through both DWARF parsers
carries a risk of parsing things twice, so it's better if this is a
conscious decision. Also, this avoids loading completely irrelevant
sections (e.g. .text). At present, the only section that needs to be
present in the llvm DWARFContext is the debug_line_str. Using it through
both APIs is not a problem, as there is no parsing involved.
The first goal is achieved by loading the sections through the existing
lldb DWARFContext APIs, which already do the caching. The second by
explicitly enumerating the sections we wish to load.
Reviewers: JDevlieghere, aprantl
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D72917
[this re-applies c0176916a4
with the correct commit message and phabricator link]
This addresses point 1 of PR44213.
https://bugs.llvm.org/show_bug.cgi?id=44213
The DW_AT_LLVM_sysroot attribute is used for Clang module debug info,
to allow LLDB to import a Clang module from source. Currently it is
part of each DW_TAG_module, however, it is the same for all modules in
a compile unit. It is more efficient and less ambiguous to store it
once in the DW_TAG_compile_unit.
This should have no effect on DWARF consumers other than LLDB.
Differential Revision: https://reviews.llvm.org/D71732
This is a purely cosmetic change that is NFC in terms of the binary
output. I bugs me that I called the attribute DW_AT_LLVM_isysroot
since the "i" is an artifact of GCC command line option syntax
(-isysroot is in the category of -i options) and doesn't carry any
useful information otherwise.
This attribute only appears in Clang module debug info.
Differential Revision: https://reviews.llvm.org/D71722
This reverts D53469, which changed llvm's DWARF emission to emit
DW_AT_call_return_pc as a function-local offset. Such an encoding is not
compatible with post-link block re-ordering tools and isn't standards-
compliant.
In addition to reverting back to the original DW_AT_call_return_pc
encoding, teach lldb how to fix up DW_AT_call_return_pc when the address
comes from an object file pointed-to by a debug map. While doing this I
noticed that lldb's support for tail calls that cross a DSO/object file
boundary wasn't covered, so I added tests for that. This latter case
exercises the newly added return PC fixup.
The dsymutil changes in this patch were originally included in D49887:
the associated test should be sufficient to test DW_AT_call_return_pc
encoding purely on the llvm side.
Differential Revision: https://reviews.llvm.org/D72489
These are the last sections not managed by the DWARFContext object. I
also introduce separate SectionType enums for dwo section variants, as
this is necessary for proper handling of single-file split dwarf.
Summary:
This change is connected with
https://reviews.llvm.org/D69843
In large codebases, we sometimes see Module::FindFunctions (when called from
ClangExpressionDeclMap::FindExternalVisibleDecls) returning huge amounts of
functions.
In current fix I trying to return only function_fullnames from ManualDWARFIndex::GetFunctions when eFunctionNameTypeFull is passed as argument.
Reviewers: labath, jarin, aprantl
Reviewed By: labath
Subscribers: shafik, clayborg, teemperor, arphaman, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D70846
This patch removes the code (deep inside DWARFDebugInfoEntry) which
automagically returned the attributes of the dwo unit DIE when asking
for the attributes of the skeleton unit. This is fairly hacky, and not
consistent with how llvm DWARF parser operates.
Instead, I change the code the explicitly request (via
GetNonSkeletonUnit) the right unit to search (there were just two places
that needed this). If it turns out we need this more often, we can
create a utility function (external to DWARFUnit) for doing this.
Summary:
Our code was expecting that a single (symbol) file contains only one
kind of location lists. This is not correct (on non-apple platforms, at
least) as a file can compile units with different dwarf versions.
This patch moves the deteremination of location list flavour down to the
compile unit level, fixing this problem. I have also tried to rougly
align the code with the llvm DWARFUnit. Fully matching the API is not
possible because of how lldb's DWARFExpression lives separately from the
rest of the DWARF code, but this is at least a step in the right
direction.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: dblaikie, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71751
Summary:
A skeleton unit can easily be detected by checking the m_dwo_symbol_file
member, but we cannot tell a split unit from a normal unit from the
"inside", which is sometimes useful.
This patch adds a m_is_dwo member to enable this, and align the code
with llvm::DWARFUnit. Right now it's only used to avoid creating a split
unit inside another split unit (which removes one override from
SymbolFileDWARFDwo and brings us a step closer to deleting it), but my
main motivation is fixing the handling of location lists in mixed v4&v5
files. This comes in a separate patch.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: dblaikie, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71750
LLDB frequently converts QualType to CompilerType. This is currently done like this:
result = CompilerType(this, qual_type_var.getAsOpaquePtr())
There are a few shortcomings in this current approach:
1. CompilerType's constructor takes a void* pointer so it isn't type safe.
2. We can't add any sanity checks to the CompilerType constructor (e.g. that the type
actually belongs to the passed ClangASTContext) without expanding the TypeSystem API.
3. The logic for converting QualType->CompilerType is spread out over all of LLDB so
changing it is difficult (e.g., what if we want to just pass the type ptr and not the
1type_ptr | qual_flags1 to CompilerType).
This patch adds a `ClangASTContext::GetType` function similar to the other GetTypeForDecl
functions that does this conversion in a type safe way.
It also adds a sanity check for Tag-based types that the type actually belongs to the
current ClangASTContext (Types don't seem to know their ASTContext, so we have to
workaround by looking at the decl for the underlying TagDecl. This doesn't cover all types
we construct but it's better than no sanity check).
GetASTContext is really expensive to call as it makes use of the global
mapping from ASTContext to ClangASTContext. This replaces all calls where
we already have the ClangASTContext around and don't need to call
GetASTContext again.
This function is not very useful, as it's forcing a materialization of
the returned DIEs, and calling it is not substantially simpler than just
iterating over the DIEs manually. Delete it, and rewrite the single
caller.
This bit of code is trying to strip everything up to the first colon
from all debug info paths, as dwarf2 recommends this syntax for storing
the compilation host name. However, this code was too eager, and it
ended up stripping the entire compilation directory, if it did not
contain a forward slash (or a "x:\").
Normally this does not matter, as all absolute paths will contain one of
these patterns, but this does not have to be the case in case the debug
info is produced by "clang -fdebug-compilation-dir", which can end up
producing a relative compilation directory with no slashes (this is one
of the techniques for producing "relocatable" debug info).
Summary:
This code is handling debug info paths starting with /proc/self/cwd,
which is one of the mechanisms people use to obtain "relocatable" debug
info (the idea being that one starts the debugger with an appropriate
cwd and things "just work").
Instead of resolving the symlinks inside DWARFUnit, we can do the same
thing more elegantly by hooking into the existing Module path remapping
code. Since llvm::DWARFUnit does not support any similar functionality,
doing things this way is also a step towards unifying llvm and lldb
dwarf parsers.
Reviewers: JDevlieghere, aprantl, clayborg, jdoerfert
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71770
The CompilerDeclContext constructor takes a void* pointer which
means that all callers of this constructor need to first explicitly
convert all pointers to clang::DeclContext*. This causes that we
for example can't just pass a TranslationUnitDecl* to the constructor without
first casting it to its parent class (as it inherits from both
Decl and DeclContext so the void* pointer is actually a Decl*).
This patch introduces a utility function in the ClangASTContext
which gets rid of the requirement to cast all pointers to
clang::DeclContext. Also moves all constructor calls to use this
function instead which is NFC (beside the change in
DWARFASTParserClangTests.cpp).
ClangASTContext::getASTContext() currently returns a ptr but we have an assert there since a
while that the ASTContext is not a nullptr. This causes that we still have a lot of code
that is doing nullptr checks on the result of getASTContext() which is all unreachable code.
This patch changes the return value to a reference to make it clear this can't be a nullptr
and deletes all the nullptr checks.
This is a purely cosmetic change that is NFC in terms of the binary
output. I bugs me that I called the attribute DW_AT_LLVM_isysroot
since the "i" is an artifact of GCC command line option syntax
(-isysroot is in the category of -i options) and doesn't carry any
useful information otherwise.
This attribute only appears in Clang module debug info.
Differential Revision: https://reviews.llvm.org/D71722
Summary:
Fixes PR41237 - SIGSEGV on call expression evaluation when debugging clang
When linking multiple compilation units that define the same functions,
the functions is merged but their debug info is not. This ignores debug
info entries for functions in a non-executable sections; those are
functions that were definitely dropped by the linker.
Reviewers: spyffe, clayborg, jasonmolenda
Reviewed By: clayborg
Subscribers: labath, aprantl, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71487
Summary:
D69991 introduced `__attribute__((objc_direct))` that allows directly calling methods without message passing.
This patch adds support for calling methods with this attribute to LLDB's expression evaluator.
The patch can be summarised in that LLDB just adds the same attribute to our module AST when we find a
method with `__attribute__((objc_direct))` in our debug information.
Reviewers: aprantl, shafik
Reviewed By: shafik
Subscribers: JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71196
Summary:
LLDB associates additional information with Types and Declarations which it calls ClangASTMetadata.
ClangASTMetadata is stored by the ClangASTSourceCommon which is implemented by having a large map of
`void *` keys to associated `ClangASTMetadata` values. To make this whole mechanism even unsafer
we also decided to use `clang::Decl *` as one of pointers we throw in there (beside `clang::Type *`).
The Decl class hierarchy uses multiple inheritance which means that not all pointers have the
same address when they are implicitly converted to pointers of their parent classes. For example
`clang::Decl *` and `clang::DeclContext *` won't end up being the same address when they
are implicitly converted from one of the many Decl-subclasses that inherit from both.
As we use the addresses as the keys in our Metadata map, this means that any implicit type
conversions to parent classes (or anything else that changes the addresses) will break our metadata tracking
in obscure ways.
Just to illustrate how broken this whole mechanism currently is:
```lang=cpp
// m_ast is our ClangASTContext. Let's double check that from GetTranslationUnitDecl
// in ClangASTContext and ASTContext return the same thing (one method just calls the other).
assert(m_ast->GetTranslationUnitDecl() == m_ast->getASTContext()->getTranslationUnitDecl());
// Ok, both methods have the same TU*. Let's store metadata with the result of one method call.
m_ast->SetMetadataAsUserID(m_ast->GetTranslationUnitDecl(), 1234U);
// Retrieve the same Metadata for the TU by using the TU* from the other method... which fails?
EXPECT_EQ(m_ast->GetMetadata(m_ast->getASTContext()->getTranslationUnitDecl())->GetUserID(), 1234U);
// Turns out that getTranslationUnitDecl one time returns a TranslationUnitDecl* but the other time
// we return one of the parent classes of TranslationUnitDecl (DeclContext).
```
This patch splits up the `void *` API into two where one does the `clang::Type *` tracking and one the `clang::Decl *` mapping.
Type and Decl are disjoint class hierarchies so there is no implicit conversion possible that could influence
the address values.
I had to change the storing of `clang::QualType` opaque pointers to their `clang::Type *` equivalents as
opaque pointers are already `void *` pointers to begin with. We don't seem to ever set any qualifier in any of these
QualTypes to this conversion should be NFC.
Reviewers: labath, shafik, aprantl
Reviewed By: labath
Subscribers: JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71409
Summary:
This adds support for DWARF5 location lists which are specified
indirectly, via an index into the debug_loclists offset table. This
includes parsing the DW_AT_loclists_base attribute which determines the
location of this offset table, and support for new form DW_FORM_loclistx
which is used in conjuction with DW_AT_location to refer to the location
lists in this way.
The code uses the llvm class to parse the offset information, and I've
also tried to structure it similarly to how the relevant llvm
functionality works.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71268
Summary:
Lldb support base address selection entries in location lists was broken
for a long time. This wasn't noticed until llvm started producing these
kinds of entries more frequently with r374600.
In r374769, I made a quick patch which added sufficient support for them
to get the test suite to pass. However, I did not fully understand how
this code operates, and so the fix was not complete. Specifically, what
was lacking was the ability to handle modules which were not loaded at
their preferred load address (for instance, due to ASLR).
Now that I better understand how this code works, I've come to the
conclusion that the current setup does not provide enough information
to correctly process these entries. In the current setup the location
lists were parameterized by two addresses:
- the distance of the function start from the start of the compile unit.
The purpose of this was to make the location ranges relative to the
start of the function.
- the actual address where the function was loaded at. With this the
function-start-relative ranges can be translated to actual memory
locations.
The reason for the two values, instead of just one (the load bias) is (I
think) MachO, where the debug info in the object files will appear to be
relative to the address zero, but the actual code it refers to
can be moved and reordered by the linker. This means that the location
lists need to be "linked" to reflect the locations in the actual linked
file.
These two bits of information were enough to correctly process location
lists which do not contain base address selection entries (and so all
entries are relative to the CU base). However, they don't work with
them because, in theory two base address can be completely unrelated (as
can happen for instace with hot/cold function splitting, where the
linker can reorder the two pars arbitrarily).
To fix that, I split the first parameter into two:
- the compile unit base address
- the function start address, as is known in the object file
The new algorithm becomes:
- the location lists are processed as they were meant to be processed.
The CU base address is used as the initial base address value. Base
address selection entries can set a new base.
- the difference between the "file" and "load" function start addresses
is used to compute the load bias. This value is added to the final
ranges to get the actual memory location.
This algorithm is correct for non-MachO debug info, as there the
location lists correctly describe the code in the final executable, and
the dynamic linker can just move the entire module, not pieces of it. It
will also be correct for MachO if the static linker preserves relative
positions of the various parts of the location lists -- I don't know
whether it actually does that, but judging by the lack of base address
selection support in dsymutil and lldb, this isn't something that has
come up in the past.
I add a test case which simulates the ASLR scenario and demonstrates
that base address selection entries now work correctly here.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: dblaikie, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D70532
Summary:
This patch adds support for atomic types (DW_TAG_atomic_type) to LLDB. It's mostly just filling out all the switch-statements that didn't implement Atomic case with the usual boilerplate.
Thanks Pavel for writing the test case.
Reviewers: labath, aprantl, shafik
Reviewed By: labath
Subscribers: jfb, abidh, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71183
In DWARF5 DW_AT_low_pc (and DW_AT_entry_pc, and possibly others) can use
DW_FORM_addrx to refer to the address indirectly. This means we need to
have processed the DW_AT_addr_base attribute before we can do anything
with these.
Since we were processing the unit attributes serially, this created a
problem in cases where the DW_AT_addr_base comes after DW_AT_low_pc --
we would end up computing the wrong unit base address, which also
corrupted any values which later depended on that (for instance range
lists). Clang currently always emits DW_AT_addr_base last.
The fix is simple -- process DW_AT_addr_base first, regardless of its
position in the attribute list.
the value of DW_AT_rnglists_base of the skeleton unit is for that unit
alone (e.g. used in DW_AT_ranges of the unit DIE) and should not apply
to the split unit.
The split unit has a hardcoded range list base value -- we should
initialize range list code whenever we detect a nonempty
debug_rnglists.dwo section.
Summary:
Yet another step on the long road towards getting rid of lldb's Stream class.
We probably should just make this some kind of member of Address/AddressRange, but it seems quite often we just push
in random integers in there and this is just about getting rid of Stream and not improving arbitrary APIs.
I had to rename another `DumpAddress` function in FormatEntity that is dumping the content of an address to make Clang happy.
Reviewers: labath
Reviewed By: labath
Subscribers: JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71052
Summary:
Our rnglist support was working only for the trivial cases (one CU),
because we only ever parsed one contribution out of the debug_rnglists
section. This means we were never able to resolve range lists for the
second and subsequent units (DW_FORM_sec_offset references came out
blang, and DW_FORM_rnglistx references always used the ranges lists from
the first unit).
Since both llvm and lldb rnglist parsers are sufficiently
self-contained, and operate similarly, we can fix this problem by
switching to the llvm parser instead. Besides the changes which are due
to variations in the interface, the main thing is that now the range
list object is a member of the DWARFUnit, instead of the entire symbol
file. This ensures that each unit can get it's own private set of range
list indices, and is consistent with how llvm's DWARFUnit does it
(overall, I've tried to structure the code the same way as the llvm
version).
I've also added a test case for the two unit scenario.
Reviewers: JDevlieghere, aprantl, clayborg
Subscribers: dblaikie, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71021
Summary:
Lldb's "format-independent" debug info made use of the fact that DWARF
(<=4) did not use the file index zero, and reused the support file index
zero for storing the compile unit name.
While this provided some convenience for DWARF<=4, it meant that the PDB
plugin needed to artificially remap file indices in order to free up
index 0. Furthermore, DWARF v5 make file index 0 legal, which meant that
similar remapping would be needed in the dwarf plugin too.
What this patch does instead is remove the requirement of having the
compile unit name in the index 0. It is not that useful since the name
can always be fetched from the CompileUnit object. Remapping code in the
pdb plugin(s) has been removed or simplified.
DWARF plugin has started inserting an empty FileSpec at index 0 to
ensure the indices keep matching up (in case of DWARF<=4). For DWARF5,
we insert the file 0 from the line table.
I add a test to ensure we can correctly lookup line table entries
referencing file 0, and in particular the case where the file 0 is also
duplicated in another file entry, as this is how clang produces line
tables in some circumstances (see pr44170). Though this is probably a
bug in clang, this is not forbidden by DWARF, and lldb already has
support for that in some (but not all) cases -- this adds a test for the
code path which was not fixed in this patch.
Reviewers: clayborg, JDevlieghere, jdoerfert
Subscribers: aprantl, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D70954
Summary:
The FileSpec class is often used as a sort of a pattern -- one specifies
a bare file name to search, and we check if in matches the full file
name of an existing module (for example).
These comparisons used FileSpec::Equal, which had some support for it
(via the full=false argument), but it was not a good fit for this job.
For one, it did a symmetric comparison, which makes sense for a function
called "equal", but not for typical searches (when searching for
"/foo/bar.so", we don't want to find a module whose name is just
"bar.so"). This resulted in patterns like:
if (FileSpec::Equal(pattern, file, pattern.GetDirectory()))
which would request a "full" match only if the pattern really contained
a directory. This worked, but the intended behavior was very unobvious.
On top of that, a lot of the code wanted to handle the case of an
"empty" pattern, and treat it as matching everything. This resulted in
conditions like:
if (pattern && !FileSpec::Equal(pattern, file, pattern.GetDirectory())
which are nearly impossible to decipher.
This patch introduces a FileSpec::Match function, which does exactly
what most of FileSpec::Equal callers want, an asymmetric match between a
"pattern" FileSpec and a an actual FileSpec. Empty paterns match
everything, filename-only patterns match only the filename component.
I've tried to update all callers of FileSpec::Equal to use a simpler
interface. Those that hardcoded full=true have been changed to use
operator==. Those passing full=pattern.GetDirectory() have been changed
to use FileSpec::Match.
There was also a handful of places which hardcoded full=false. I've
changed these to use FileSpec::Match too. This is a slight change in
semantics, but it does not look like that was ever intended, and it was
more likely a result of a misunderstanding of the "proper" way to use
FileSpec::Equal.
[In an ideal world a "FileSpec" and a "FileSpec pattern" would be two
different types, but given how widespread FileSpec is, it is unlikely
we'll get there in one go. This at least provides a good starting point
by centralizing all matching behavior.]
Reviewers: teemperor, JDevlieghere, jdoerfert
Subscribers: emaste, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D70851