This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
I encountered an issue where `p &variable` was finding an incorrect address for
32-bit PIC ELF files loaded into a running process. The problem was that the
R_386_32 ELF relocations were not being applied to the DWARF section, so all
variables in that file were reporting as being at the start of their respective
section. There is an assert that catches this on debug builds, but silently
ignores the issue on non-debug builds.
In this changeset, I added handling for the R_386_32 relocation type to
ObjectFileELF, and a supporting function to ELFRelocation to differentiate
between DT_REL & DT_RELA in ObjectFileELF::ApplyRelocations().
Demonstration of issue:
```
[dmlary@host work]$ cat rel.c
volatile char padding[32] = "make sure var isnt at .data+0";
volatile char var[] = "test";
[dmlary@host work]$ gcc -c rel.c -FPIC -fpic -g -m32
[dmlary@host work]$ lldb ./exec
(lldb) target create "./exec"
Current executable set to '/home/dmlary/src/work/exec' (i386).
(lldb) process launch --stop-at-entry
Process 21278 stopped
* thread #1, name = 'exec', stop reason = signal SIGSTOP
frame #0: 0xf7fdb150 ld-2.17.so`_start
ld-2.17.so`_start:
-> 0xf7fdb150 <+0>: movl %esp, %eax
0xf7fdb152 <+2>: calll 0xf7fdb990 ; _dl_start
ld-2.17.so`_dl_start_user:
0xf7fdb157 <+0>: movl %eax, %edi
0xf7fdb159 <+2>: calll 0xf7fdb140
Process 21278 launched: '/home/dmlary/src/work/exec' (i386)
(lldb) image add ./rel.o
(lldb) image load --file rel.o .text 0x40000000 .data 0x50000000
section '.text' loaded at 0x40000000
section '.data' loaded at 0x50000000
(lldb) image dump symtab rel.o
Symtab, file = rel.o, num_symbols = 13:
Debug symbol
|Synthetic symbol
||Externally Visible
|||
Index UserID DSX Type File Address/Value Load Address Size Flags Name
------- ------ --- --------------- ------------------ ------------------ ------------------ ---------- ----------------------------------
[ 0] 1 SourceFile 0x0000000000000000 0x0000000000000000 0x00000004 rel.c
[ 1] 2 Invalid 0x0000000000000000 0x0000000000000020 0x00000003
[ 2] 3 Invalid 0x0000000000000000 0x50000000 0x0000000000000020 0x00000003
[ 3] 4 Invalid 0x0000000000000025 0x0000000000000000 0x00000003
[ 4] 5 Invalid 0x0000000000000000 0x0000000000000020 0x00000003
[ 5] 6 Invalid 0x0000000000000000 0x0000000000000020 0x00000003
[ 6] 7 Invalid 0x0000000000000000 0x0000000000000020 0x00000003
[ 7] 8 Invalid 0x0000000000000000 0x0000000000000020 0x00000003
[ 8] 9 Invalid 0x0000000000000000 0x0000000000000020 0x00000003
[ 9] 10 Invalid 0x0000000000000000 0x0000000000000020 0x00000003
[ 10] 11 Invalid 0x0000000000000000 0x0000000000000020 0x00000003
[ 11] 12 X Data 0x0000000000000000 0x50000000 0x0000000000000020 0x00000011 padding
[ 12] 13 X Data 0x0000000000000020 0x50000020 0x0000000000000005 0x00000011 var
(lldb) p &var
(volatile char (*)[5]) $1 = 0x50000000
```
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D132954
Previously, depending on how you constructed a UUID from data or a
StringRef, an input value of all zeros was valid (e.g. setFromData)
or not (e.g. setFromOptionalData). Since there was no way to tell
which interpretation to use, it was done somewhat inconsistently.
This standardizes the meaning of a UUID of all zeros to Not Valid,
and removes all the Optional methods and their uses, as well as the
static factories that supported them.
Differential Revision: https://reviews.llvm.org/D132191
Symbols that have the section index of SHN_ABS were previously creating extra top level sections that contained the value of the symbol as if the symbol's value was an address. As far as I can tell, these symbol's values are not addresses, even if they do have a size. To make matters worse, adding these extra sections can stop address lookups from succeeding if the symbol's value + size overlaps with an existing section as these sections get mapped into memory when the image is loaded by the dynamic loader. This can cause stack frames to appear empty as the address lookup fails completely.
This patch:
- doesn't create a section for any SHN_ABS symbols
- makes symbols that are absolute have values that are not addresses
- add accessors to SBSymbol to get the value and size of a symbol as raw integers. Prevoiusly there was no way to access a symbol's value from a SBSymbol because the only accessors were:
SBAddress SBSymbol::GetStartAddress();
SBAddress SBSymbol::GetEndAddress();
and these accessors would return an invalid SBAddress if the symbol's value wasn't an address
- Adds a test to ensure no ".absolute.<symbol-name>" sections are created
- Adds a test to test the new SBSymbol APIs
Differential Revision: https://reviews.llvm.org/D131705
clang 14 removed -gz=zlib-gnu support and ld.lld/llvm-objcopy removed zlib-gnu
support recently. Remove lldb support by migrating away from
llvm::object::Decompressor::isCompressedELFSection.
The API has another user llvm-dwp, so it is not removed in this patch.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D129724
It's more natural to use uint8_t * (std::byte needs C++17 and llvm has
too much uint8_t *) and most callers use uint8_t * instead of char *.
The functions are recently moved into `llvm::compression::zlib::`, so
downstream projects need to make adaption anyway.
Currently, ppc64le and ppc64 (defaulting to big endian) have the same
descriptor, thus the linear scan always return ppc64le. Handle that through
subtype.
This is a recommit of f114f00948 with a new test
setup that doesn't involves (unsupported) corefiles.
Differential Revision: https://reviews.llvm.org/D124760
This reverts commit f114f00948.
Due to hitting an assert on our lldb bots:
https://lab.llvm.org/buildbot/#/builders/96/builds/22715
../llvm-project/lldb/source/Plugins/Process/elf-core/ThreadElfCore.cpp:170:
virtual lldb::RegisterContextSP ThreadElfCore::CreateRegisterContextForFrame(
lldb_private::StackFrame *): Assertion `false && "Architecture or OS not supported"' failed.
Currently, ppc64le and ppc64 (defaulting to big endian) have the same
descriptor, thus the linear scan always return ppc64le. Handle that through
subtype.
Differential Revision: https://reviews.llvm.org/D124760
Currently, all data buffers are assumed to be writable. This is a
problem on macOS where it's not allowed to load unsigned binaries in
memory as writable. To be more precise, MAP_RESILIENT_CODESIGN and
MAP_RESILIENT_MEDIA need to be set for mapped (unsigned) binaries on our
platform.
Binaries are mapped through FileSystem::CreateDataBuffer which returns a
DataBufferLLVM. The latter is backed by a llvm::WritableMemoryBuffer
because every DataBuffer in LLDB is considered to be writable. In order
to use a read-only llvm::MemoryBuffer I had to split our abstraction
around it.
This patch distinguishes between a DataBuffer (read-only) and
WritableDataBuffer (read-write) and updates LLDB to use the appropriate
one.
rdar://74890607
Differential revision: https://reviews.llvm.org/D122856
The current design allows that the object file contents could be mapped
by one object file plugin and then used by another. Presumably the idea
here was to avoid mapping the same file twice.
This becomes an issue when one object file plugin wants to map the file
differently from the others. For example, ObjectFileELF needs to map its
memory as writable while others likeObjectFileMachO needs it to be
mapped read-only.
This patch prevents plugins from changing the buffer by passing them is
by value rather than by reference.
Differential revision: https://reviews.llvm.org/D122944
Add support to inspect the ELF headers for RISCV targets to determine if
RVC or RVE are enabled and the floating point support to enable. As per
the RISCV specification, d implies f, q implies d implies f, which gives
us the cascading effect that is used to enable the features when setting
up the disassembler. With this change, it is now possible to attach the
debugger to a remote process and be able to disassemble the instruction
stream.
~~~
$ bin/lldb tmp/reduced
(lldb) target create "reduced"
Current executable set to '/tmp/reduced' (riscv64).
(lldb) gdb-remote localhost:1234
(lldb) Process 5737 stopped
* thread #1, name = 'reduced', stop reason = signal SIGTRAP
frame #0: 0x0000003ff7fe1b20
-> 0x3ff7fe1b20: mv a0, sp
0x3ff7fe1b22: jal 1936
0x3ff7fe1b26: mv s0, a0
0x3ff7fe1b28: auipc a0, 27
~~~
Reflow the textual comment which preserves formatted output from
tooling. This makes the content legible again after the lldb source
code was reformatted with automated tooling.
Most of our code was including Log.h even though that is not where the
"lldb" log channel is defined (Log.h defines the generic logging
infrastructure). This worked because Log.h included Logging.h, even
though it should.
After the recent refactor, it became impossible the two files include
each other in this direction (the opposite inclusion is needed), so this
patch removes the workaround that was put in place and cleans up all
files to include the right thing. It also renames the file to LLDBLog to
better reflect its purpose.
Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symbtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.
This is a retry of the original patch https://reviews.llvm.org/D113965 which was reverted. There was a deadlock in the Manual DWARF indexing code during symbol preloading where the module was asked on the main thread to preload its symbols, and this would in turn cause the DWARF manual indexing to use a thread pool to index all of the compile units, and if there were relocations on the debug information sections, these threads could ask the ObjectFile to load section contents, which could cause a call to ObjectFileELF::RelocateSection() which would ask for the symbol table from the module and it would deadlock. We can't lock the module in ObjectFile::GetSymtab(), so the solution I am using is to use a llvm::once_flag to create the symbol table object once and then lock the Symtab object. Since all APIs on the symbol table use this lock, this will prevent anyone from using the symbol table before it is parsed and finalized and will avoid the deadlock I mentioned. ObjectFileELF::GetSymtab() was never locking the module lock before and would put off creating the symbol table until somewhere inside ObjectFileELF::GetSymtab(). Now we create it one time inside of the ObjectFile::GetSymtab() and immediately lock it which should be safe enough. This avoids the deadlocks and still provides safety.
Differential Revision: https://reviews.llvm.org/D114288
This reverts commit 951b107eed.
Buildbots were failing, there is a deadlock in /Users/gclayton/Documents/src/llvm/clean/llvm-project/lldb/test/Shell/SymbolFile/DWARF/DW_AT_range-DW_FORM_sec_offset.s when ELF files try to relocate things.
Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symbtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.
Differential Revision: https://reviews.llvm.org/D113965
The new module stats adds the ability to measure the time it takes to parse and index the symbol tables for each module, and reports modules statistics in the output of "statistics dump" along with the path, UUID and triple of the module. The time it takes to parse and index the symbol tables are also aggregated into new top level key/value pairs at the target level.
Differential Revision: https://reviews.llvm.org/D112279
This patch deals with ObjectFile, ObjectContainer and OperatingSystem
plugins. I'll convert the other types in separate patches.
In order to enable piecemeal conversion, I am leaving some ConstStrings
in the lowest PluginManager layers. I'll convert those as the last step.
Differential Revision: https://reviews.llvm.org/D112061
There is no reason why this function should be returning a ConstString.
While modifying these files, I also fixed several instances where
GetPluginName and GetPluginNameStatic were returning different strings.
I am not changing the return type of GetPluginNameStatic in this patch, as that
would necessitate additional changes, and this patch is big enough as it is.
Differential Revision: https://reviews.llvm.org/D111877
.. and reduce the scope of others. They don't follow llvm coding
standards (which say they should be used only when the same effect
cannot be achieved with the static keyword), and they set a bad example.
In all these years, we haven't found a use for this function (it has
zero callers). Lets just remove the boilerplate.
Differential Revision: https://reviews.llvm.org/D109600
This is a resubmission of https://reviews.llvm.org/D105160 after fixing testing issues.
This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.
Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.
This patch fixes the issue by:
not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool
Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)
After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)
The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.
Differential Revision: https://reviews.llvm.org/D106837
This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.
Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.
This patch fixes the issue by:
- not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
- doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
- removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool
Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)
After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)
The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.
Differential Revision: https://reviews.llvm.org/D105160
Reverts commits:
"Fix failing tests after https://reviews.llvm.org/D104488."
"Fix buildbot failure after https://reviews.llvm.org/D104488."
"Create synthetic symbol names on demand to improve memory consumption and startup times."
This series of commits broke the windows lldb bot and then failed to fix all of the failing tests.
This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.
Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.
This patch fixes the issue by:
- not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
- doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
- removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool
Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)
After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)
The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.
Differential Revision: https://reviews.llvm.org/D104488
The code used the total number of symbols to create a symbol ID for the
synthetic symbols. This is not correct because the IDs of real symbols
can be higher than their total number, as we do not add all symbols (and
in particular, we never add symbol zero, which is not a real symbol).
This meant we could have symbols with duplicate IDs, which caused
problems if some relocations were referring to the duplicated IDs. This
was the cause of the failure of the test D97786.
This patch fixes the code to use the ID of the highest (last) symbol
instead.
Commiting this patch for Augusto Noronha who is getting set
up still.
This patch changes Target::ReadMemory so the default behavior
when a read is in a Section that is read-only is to fetch the
data from the local binary image, instead of reading it from
memory. Update all callers to use their old preferences
(the old prefer_file_cache bool) using the new API; we should
revisit these calls and see if they really intend to read
live memory, or if reading from a read-only Section would be
equivalent and important for performance-sensitive cases.
rdar://30634422
Differential revision: https://reviews.llvm.org/D100338
LLDB can often appear deadlocked to users that use IDEs when it is indexing DWARF, or parsing symbol tables. These long running operations can make a debug session appear to be doing nothing even though a lot of work is going on inside LLDB. This patch adds a public API to allow clients to listen to debugger events that report progress and will allow UI to create an activity window or display that can show users what is going on and keep them informed of expensive operations that are going on inside LLDB.
Differential Revision: https://reviews.llvm.org/D97739
It is possible for the GetSectionHeaderByIndex lookup to fail because
the previous FindSectionContainingFileAddress lookup found a segment
instead of a section. This is possible if the binary does not have
a PLT (which means that lld will in some circumstances set DT_JMPREL
to 0, which is typically an address that is part of the ELF headers
and not in a section) and may also be possible if the section headers
have been stripped. To handle this possibility, replace the assert
with an if.
Differential Revision: https://reviews.llvm.org/D93438
Adds the RISC-V ArchSpec bits contributed by @simoncook as part of D62732,
plus logic to distinguish between riscv32 and riscv64 based on ELF class.
The patch follows the implementation approach previously used for MIPS.
It defines RISC-V architecture subtypes and inspects the ELF header,
namely the ELF class, to detect the right subtype.
Differential Revision: https://reviews.llvm.org/D86292
This patch introduces a LLDB_SCOPED_TIMER macro to hide the needlessly
repetitive creation of scoped timers in LLDB. It's similar to the
LLDB_LOG(F) macro.
Differential revision: https://reviews.llvm.org/D93663
Summary:
This patch extends the ModuleSpec class to include a
DataBufferSP which contains the module data. If this
data is provided, LLDB won't try to hit the filesystem
to create the Module, but use only the data stored in
the ModuleSpec.
Reviewers: labath, espindola
Subscribers: emaste, MaskRay, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D83512
Leverage ARM ELF build attribute section to create ELF attribute section
for RISC-V. Extract the common part of parsing logic for this section
into ELFAttributeParser.[cpp|h] and ELFAttributes.[cpp|h].
Differential Revision: https://reviews.llvm.org/D74023
This is a step towards making the initialize and terminate calls be
generated by CMake, which in turn is towards making it possible to
disable plugins at configuration time.
Differential revision: https://reviews.llvm.org/D74245
Scale segment identifier up to user_id_t before negating it. This fixes
the identifers being wrongly e.g. 0x00000000fffffffe instead of
0xfffffffffffffffe. Fix suggested by Pavel Labath.
This fixes 5 tests failing on i386 (PR #44748):
lldb-shell :: ObjectFile/ELF/PT_LOAD-overlap-PT_INTERP.yaml
lldb-shell :: ObjectFile/ELF/PT_LOAD-overlap-PT_TLS.yaml
lldb-shell :: ObjectFile/ELF/PT_LOAD-overlap-section.yaml
lldb-shell :: ObjectFile/ELF/PT_LOAD.yaml
lldb-shell :: ObjectFile/ELF/PT_TLS-overlap-PT_LOAD.yaml
Differential Revision: https://reviews.llvm.org/D73914
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.