The current design allows that the object file contents could be mapped
by one object file plugin and then used by another. Presumably the idea
here was to avoid mapping the same file twice.
This becomes an issue when one object file plugin wants to map the file
differently from the others. For example, ObjectFileELF needs to map its
memory as writable while others likeObjectFileMachO needs it to be
mapped read-only.
This patch prevents plugins from changing the buffer by passing them is
by value rather than by reference.
Differential revision: https://reviews.llvm.org/D122944
Version 2 of 'main bin spec' LC_NOTE allows for the specification
of a slide of where the binary is loaded in the corefile virtual
address space. It also adds a (currently unused) platform field
for the main binary.
Some corefile creators will only have a UUID and an offset to be
applied to the binary.
Changed TestFirmwareCorefiles.py to test this new form of
'main bin spec' with a slide, and also to run on both x86_64
and arm64 macOS systems.
Differential Revision: https://reviews.llvm.org/D116094
rdar://85938455
Add lldb support for a Mach-O "load binary" LC_NOTE which provides
a UUID, load address/slide, and possibly a name of a binary that
should be loaded when examining the core.
struct load_binary
{
uint32_t version; // currently 1
uuid_t uuid; // all zeroes if uuid not specified
uint64_t load_address; // virtual address where the macho is loaded, UINT64_MAX if unavail
uint64_t slide; // slide to be applied to file address to get load address, 0 if unavail
char name_cstring[]; // must be nul-byte terminated c-string, '\0' alone if name unavail
} __attribute__((packed));
Differential Revision: https://reviews.llvm.org/D115494
rdar://85069250
Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symbtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.
This is a retry of the original patch https://reviews.llvm.org/D113965 which was reverted. There was a deadlock in the Manual DWARF indexing code during symbol preloading where the module was asked on the main thread to preload its symbols, and this would in turn cause the DWARF manual indexing to use a thread pool to index all of the compile units, and if there were relocations on the debug information sections, these threads could ask the ObjectFile to load section contents, which could cause a call to ObjectFileELF::RelocateSection() which would ask for the symbol table from the module and it would deadlock. We can't lock the module in ObjectFile::GetSymtab(), so the solution I am using is to use a llvm::once_flag to create the symbol table object once and then lock the Symtab object. Since all APIs on the symbol table use this lock, this will prevent anyone from using the symbol table before it is parsed and finalized and will avoid the deadlock I mentioned. ObjectFileELF::GetSymtab() was never locking the module lock before and would put off creating the symbol table until somewhere inside ObjectFileELF::GetSymtab(). Now we create it one time inside of the ObjectFile::GetSymtab() and immediately lock it which should be safe enough. This avoids the deadlocks and still provides safety.
Differential Revision: https://reviews.llvm.org/D114288
This reverts commit 951b107eed.
Buildbots were failing, there is a deadlock in /Users/gclayton/Documents/src/llvm/clean/llvm-project/lldb/test/Shell/SymbolFile/DWARF/DW_AT_range-DW_FORM_sec_offset.s when ELF files try to relocate things.
Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symbtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.
Differential Revision: https://reviews.llvm.org/D113965
This patch deals with ObjectFile, ObjectContainer and OperatingSystem
plugins. I'll convert the other types in separate patches.
In order to enable piecemeal conversion, I am leaving some ConstStrings
in the lowest PluginManager layers. I'll convert those as the last step.
Differential Revision: https://reviews.llvm.org/D112061
There is no reason why this function should be returning a ConstString.
While modifying these files, I also fixed several instances where
GetPluginName and GetPluginNameStatic were returning different strings.
I am not changing the return type of GetPluginNameStatic in this patch, as that
would necessitate additional changes, and this patch is big enough as it is.
Differential Revision: https://reviews.llvm.org/D111877
In all these years, we haven't found a use for this function (it has
zero callers). Lets just remove the boilerplate.
Differential Revision: https://reviews.llvm.org/D109600
When adding a dSYM to a Module and it has different file addresses
from the already-present ObjectFile binary, change the Sections to
use the dSYM's file addresses so the symbol table and DWARF are
properly contained in the Sections. Previously this was only done
for IsInMemory ObjectFiles, but it's more common than that.
Differential Revision: https://reviews.llvm.org/D108889
rdar://81504400
This patch adds code to process save-core for Mach-O files which
embeds an "addrable bits" LC_NOTE when the process is using a
code address mask (e.g. AArch64 v8.3 with ptrauth aka arm64e).
Add code to ObjectFileMachO to read that LC_NOTE from corefiles,
and ProcessMachCore to set the process masks based on it when reading
a corefile back in.
Also have "process status --verbose" print the current address masks
that lldb is using internally to strip ptrauth bits off of addresses.
Differential Revision: https://reviews.llvm.org/D106348
rdar://68630113
Add a new feature to process save-core on Darwin systems -- for
lldb to create a user process corefile with only the dirty (modified
memory) pages included. All of the binaries that were used in the
corefile are assumed to still exist on the system for the duration
of the use of the corefile. A new --style option to process save-core
is added, so a full corefile can be requested if portability across
systems, or across time, is needed for this corefile.
debugserver can now identify the dirty pages in a memory region
when queried with qMemoryRegionInfo, and the size of vm pages is
given in qHostInfo.
Create a new "all image infos" LC_NOTE for Mach-O which allows us
to describe all of the binaries that were loaded in the process --
load address, UUID, file path, segment load addresses, and optionally
whether code from the binary was executing on any thread. The old
"read dyld_all_image_infos and then the in-memory Mach-O load
commands to get segment load addresses" no longer works when we
only have dirty memory.
rdar://69670807
Differential Revision: https://reviews.llvm.org/D88387
When a Mach-O corefile has an LC_NOTE "main bin spec" for a
standalone binary / firmware, with only a UUID and no load
address, try to locate the binary and dSYM by UUID and if
found, load it at offset 0 for the user.
Add a test case that tests a firmware/standalone corefile
with both the "kern ver str" and "main bin spec" LC_NOTEs.
<rdar://problem/68193804>
Differential Revision: https://reviews.llvm.org/D88282
Summary:
On macOS 11, the libraries that have been integrated in the system
shared cache are not present on the filesystem anymore. LLDB was
using those files to get access to the symbols of those libraries.
LLDB can get the images from the target process memory though.
This has 2 consequences:
- LLDB cannot load the images before the process starts, reporting
an error if someone tries to break on a system symbol.
- Loading the symbols by downloading the data from the inferior
is super slow. It takes tens of seconds at the start of the
debug session to populate the Module list.
To fix this, we can use the library images LLDB has in its own
mapping of the shared cache. Shared cache images are somewhat
special as their LINKEDIT segment is moved to the end of the cache
and thus the images are not contiguous in memory. All of this can
hidden in ObjectFileMachO.
This patch fixes a number of test failures on macOS 11 due to the
first problem described above and adds some specific unittesting
for the new SharedCache Host utilities.
Reviewers: jasonmolenda, labath
Subscribers: llvm-commits, lldb-commits
Tags: #lldb, #llvm
Differential Revision: https://reviews.llvm.org/D83023
The two classes are equivalent, except:
- the former uses a llvm::SmallVector (with a configurable size), while
the latter uses std::vector.
- the former has a typo in one of the functions name
This patch just leaves one class, using llvm::SmallVector, and defaults
the small size to zero. This is the same thing we did with the
RangeDataVector class in D56170.
LLDB has a few different styles of header guards and they're not very
consistent because things get moved around or copy/pasted. This patch
unifies the header guards across LLDB and converts everything to match
LLVM's style.
Differential revision: https://reviews.llvm.org/D74743
On macOS one Mach-O slice can contain multiple load commands: One load
command for being loaded into a macOS process and one load command for
being loaded into a macCatalyst process. This patch adds support for
the new load command and makes sure ObjectFileMachO returns the
Architecture that matches the Module.
Differential Revision: https://reviews.llvm.org/D66626
llvm-svn: 369814
Summary:
On the heels of D62934, this patch uses the same approach to introduce
llvm RTTI support to the ObjectFile hierarchy. It also replaces the
existing uses of GetPluginName doing run-time type checks with
llvm::dyn_cast and friends.
This formally introduces new dependencies from some other plugins to
ObjectFile plugins. However, I believe this is fine because:
- these dependencies were already kind of there, and the only reason
we could get away with not modeling them explicitly was because the
code was relying on magically knowing what will GetPluginName() return
for a particular kind of object files.
- the dependencies themselves are logical (it makes sense for
SymbolVendorELF to depend on ObjectFileELF), or at least don't
actively get in the way (the JitLoaderGDB->MachO thing).
- they don't introduce any new dependency loops as ObjectFile plugins
don't depend on any other plugins
Reviewers: xiaobai, JDevlieghere, espindola
Subscribers: emaste, mgorny, arichardson, MaskRay, lldb-commits
Differential Revision: https://reviews.llvm.org/D65450
llvm-svn: 367413
The new DriverKit user-land kernel drivers in macOS 10.15 / Catalina
do not have a main() function or an LC_MAIN load command. lldb uses
the address of main() as the return address for inferior function
calls; it puts a breakpoint on main, runs the inferior function call,
and when the main() breakpoint is hit, lldb knows unambiguously that
the inferior function call ran to completion - no other function calls
main.
This change hoists the logic for finding the "entry address" from
ThreadPlanCallFunction to Target. It changes the logic to first
try to get the entry address from the main executable module,
but if that module does not have one, it will iterate through all
modules looking for an entry address.
The patch also adds code to ObjectFileMachO to use dyld's
_dyld_start function as an entry address.
<rdar://problem/52343958>
Differential Revision: https://reviews.llvm.org/D64897
llvm-svn: 366493
Summary: This patch modernizes the GetSDKVersion API and hopefully prevents problems such as the ones discovered in D61218.
Reviewers: aprantl, jasonmolenda, clayborg
Reviewed By: aprantl, clayborg
Subscribers: clayborg, labath, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D61233
llvm-svn: 365090
A lot of comments in LLDB are surrounded by an ASCII line to delimit the
begging and end of the comment.
Its use is not really consistent across the code base, sometimes the
lines are longer, sometimes they are shorter and sometimes they are
omitted. Furthermore, it looks kind of weird with the 80 column limit,
where the comment actually extends past the line, but not by much.
Furthermore, when /// is used for Doxygen comments, it looks
particularly odd. And when // is used, it incorrectly gives the
impression that it's actually a Doxygen comment.
I assume these lines were added to improve distinguishing between
comments and code. However, given that todays editors and IDEs do a
great job at highlighting comments, I think it's worth to drop this for
the sake of consistency. The alternative is fixing all the
inconsistencies, which would create a lot more churn.
Differential revision: https://reviews.llvm.org/D60508
llvm-svn: 358135
My apologies for the large patch. With the exception of ConstString.h
itself it was entirely produced by sed.
ConstString has exactly one const char * data member, so passing a
ConstString by reference is not any more efficient than copying it by
value. In both cases a single pointer is passed. But passing it by
value makes it harder to accidentally return the address of a local
object.
(This fixes rdar://problem/48640859 for the Apple folks)
Differential Revision: https://reviews.llvm.org/D59030
llvm-svn: 355553
Summary:
This file implements some general purpose data structures, and so it
belongs to the Utility module.
Reviewers: zturner, jingham, JDevlieghere, clayborg, espindola
Subscribers: emaste, mgorny, javed.absar, arichardson, MaskRay, lldb-commits
Differential Revision: https://reviews.llvm.org/D58970
llvm-svn: 355509
instead of returning the UUID through by-ref argument and a boolean
value indicating success, we can just return it directly. Since the UUID
class already has an invalid state, it can be used to denote the failure
without the additional bool.
llvm-svn: 353714
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
instead of returning the architecture through by-ref argument and a
boolean value indicating success, we can just return the ArchSpec
directly. Since the ArchSpec already has an invalid state, it can be
used to denote the failure without the additional bool.
Reviewers: clayborg, zturner, espindola
Subscribers: emaste, arichardson, JDevlieghere, lldb-commits
Differential Revision: https://reviews.llvm.org/D56129
llvm-svn: 350291
Prior to this there were 3 places that were duplicating the logic to detect if a section can/should be loaded and some were doing things a bit differently. Now it is all centralized in one place and it is done correctly.
llvm-svn: 349926
Summary:
This function was named such because in the case of MachO files, the
mach header is located at this address. However all (most?) usages of
this function were not interested in that fact, but the fact that this
address is used as the base address for expressing various relative
addresses in the object file.
For other object file formats, this name is not appropriate (and it's
probably the reason why this function was not implemented in these
classes). In the ELF case the ELF header will usually end up at this
address, but this is a result of the linker optimizing the file layout
and not a requirement of the spec. For COFF files, I believe the is no
header located at this address either.
Reviewers: clayborg, jasonmolenda, amccarth, lemo, stella.stamenova
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D55422
llvm-svn: 348849
This patch removes the comments grouping header includes. They were
added after running IWYU over the LLDB codebase. However they add little
value, are often outdates and burdensome to maintain.
llvm-svn: 346626
Summary:
One of the conclusions of the discussion on D49740 was that SafeMachO is better
off in the Host module (as that's the only place which should include
mach/machine.h, which is what this header is working around). Also, Utility,
which is the only module which cannot include Host, should not be doing
anything with object file formats.
This patch implements that move, and also removes any unneded includes of that
file.
I've verified that MacOS still compiles after this.
Reviewers: jingham, zturner, teemperor
Subscribers: fedor.sergeev, lldb-commits
Differential Revision: https://reviews.llvm.org/D50383
llvm-svn: 342050
This change allows to make AddressClass strongly typed enum and not to have issues with old versions of SWIG that don't support enum classes.
llvm-svn: 335710
Summary:
This has multiple advantages:
- we need only one function argument/instance variable instead of three
- no need to default initialize variables
- no custom parsing code
- VersionTuple has comparison operators, which makes version comparisons much
simpler
Reviewers: zturner, friss, clayborg, jingham
Subscribers: emaste, lldb-commits
Differential Revision: https://reviews.llvm.org/D47889
llvm-svn: 334950
when it and the inferior process both have the same shared cache
(a conglomeration of all libraries at the same fixed address for
all processes), lldb will read data out of its own memory to speed
things up. The shared cache has a UUID, so lldb currently checks
that the UUID of its own shared cache matches that of the inferior.
This change adds one refinement to that -- it checks that the UUID
is the same and that the base address of the shared cache is the
same. And only uses its local shared cache if they are both identical.
This involved using a different style of SPI with dyld to get lldb's
shared cache load address, but it's not especially difficult.
One unattractive part of the change is that I'm using the real
underlying types of task_t and kern_return_t instead of picking
them up from mach/mach.h. The defines that get picked up there (a
lot from machine.h but others too) conflict with llvm/Support/MachO.h
even when I have mach.h included before our SafeMachO.h which
undefines most of the defines before including llvm/Support/MachO.h.
I'll need to augment the #undefs in SafeMachO.h to get this to
compile cleanly, but that'll be another day.
<rdar://problem/39868238>
llvm-svn: 331497
Summary:
In an effort to understand the function's operation, I've split it into logical
pieces. Parsing of a single segment is moved to a separate function (and the
parsing state that is carried from one segment to another is explicitly
captured in the SegmentParsingContext object). I've also extracted some pieces
of code which were already standalone (validation of the segment load command,
determining the section type, determining segment permissions) into
separate functions.
Parsing of a single section within the segment should probably also be a
separate function, but I've left that for a separate patch.
This patch is intended to be NFC.
Reviewers: clayborg, davide
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D44074
llvm-svn: 326791
This renames the LLDB error class to Status, as discussed
on the lldb-dev mailing list.
A change of this magnitude cannot easily be done without
find and replace, but that has potential to catch unwanted
occurrences of common strings such as "Error". Every effort
was made to find all the obvious things such as the word "Error"
appearing in a string, etc, but it's possible there are still
some lingering occurences left around. Hopefully nothing too
serious.
llvm-svn: 302872
lldb should use when given a corefile.
This uses an LC_NOTE "main bin spec" or an LC_NOTE "kern ver str"
if they are present in a Mach-O core file.
Core files may have multiple different binaries -- different kernels,
or a mix of user process and kernel binaries -- and it can be
difficult for lldb to detect the correct one to use simply by looking
at the pages of memory. These two new LC_NOTE load commands allow
for the correct binary to be recorded unambiguously.
<rdar://problem/20878266>
llvm-svn: 300138
*** to conform to clang-format’s LLVM style. This kind of mass change has
*** two obvious implications:
Firstly, merging this particular commit into a downstream fork may be a huge
effort. Alternatively, it may be worth merging all changes up to this commit,
performing the same reformatting operation locally, and then discarding the
merge for this particular commit. The commands used to accomplish this
reformatting were as follows (with current working directory as the root of
the repository):
find . \( -iname "*.c" -or -iname "*.cpp" -or -iname "*.h" -or -iname "*.mm" \) -exec clang-format -i {} +
find . -iname "*.py" -exec autopep8 --in-place --aggressive --aggressive {} + ;
The version of clang-format used was 3.9.0, and autopep8 was 1.2.4.
Secondly, “blame” style tools will generally point to this commit instead of
a meaningful prior commit. There are alternatives available that will attempt
to look through this change and find the appropriate prior commit. YMMV.
llvm-svn: 280751
This is a pretty straightforward first pass over removing a number of uses of
Mutex in favor of std::mutex or std::recursive_mutex. The problem is that there
are interfaces which take Mutex::Locker & to lock internal locks. This patch
cleans up most of the easy cases. The only non-trivial change is in
CommandObjectTarget.cpp where a Mutex::Locker was split into two.
llvm-svn: 269877
should not be used for this module -- for use when an ObjectFile
knows that it does not have meaningful or accurate function start
addresses.
More commonly, it is not clear that function start addresses are
missing in a module. There are certain cases on Mac OS X where we
can tell that a Mach-O binary has been stripped of this essential
information, and the unwinder can end up emulating many megabytes
of instructions for a single "function" in the binary.
When a Mach-O binary is missing both an LC_FUNCTION_STARTS load
command (very unusual) and an eh_frame section, then we will assume
it has also been stripped of symbols and that instruction emulation
will not be useful on this module.
<rdar://problem/25988067>
llvm-svn: 268475
Background: dyld binaries often have extra symbols in their symbol table like "malloc" and "free" for the early bringup of dyld and we often don't want to set breakpoints in dynamic linker binaries. We also don't want to call the "malloc" or "free" function in dyld when a user writes an expression like "(void *)malloc(123)" so we need to avoid doing name lookups in dyld. We mark Modules as being dynamic link editors and this helps do correct lookups for breakpoints by name and function lookups.
<rdar://problem/19716267>
llvm-svn: 228261
(lldb) file /bin/ls
(lldb) b malloc
(lldb) run
(lldb) process save-core /tmp/ls.core
Each ObjectFile plug-in now has the option to save core files by registering a new static callback.
llvm-svn: 210864
Changes include:
- ObjectFileMachO can now determine if a binary is "*-apple-ios" or "*-apple-macosx" by checking the min OS and SDK load commands
- ArchSpec now says "<arch>-apple-macosx" is equivalent to "<arch>-apple-ios" since the simulator mixes and matches binaries (some from the system and most from the iOS SDK).
- Getting process inforamtion on MacOSX now correctly classifies iOS simulator processes so they have "*-apple-ios" architectures in the ProcessInstanceInfo
- PlatformiOSSimulator can now list iOS simulator processes correctly instead of showing nothing by using:
(lldb) platform select ios-simulator
(lldb) platform process list
- debugserver can now properly return "*-apple-ios" for the triple in the process info packets for iOS simulator executables
- GDBRemoteCommunicationClient now correctly passes along the triples it gets for process info by setting the OS in the llvm::Triple correctly
<rdar://problem/17060217>
llvm-svn: 209852