3 patches, aiming to improve PE/COFF support:
- First patch fix symbol reading (invalid header size from sizeof() == 20 != 18, and various bugfixes such as invalid skipping of auxiliary symbols, 4 bytes shift from beginning, etc...).
- Second patch add image_base to section vmaddr offset so that VM addr is in image_base space.
- Third patch add support for DWARF section in PECOFF (taken from ELF counterpart), since they are generated by gcc/clang under windows.
llvm-svn: 184153
Accept mach-o files with bad segments. Many core files are not created correctly and we should still be able to glean any information we can from them.
llvm-svn: 183247
Which means "platform process list" should work and list the architecture.
We are now parsing the elf build-id if it exists, which should allow us to load stripped symbols (looking at that next).
llvm-svn: 182610
Fixed "target symbols add" to correctly extract all module specifications from a dSYM file that is supplied and match the symbol file to a current target module using the UUID values if they are available.
This fixes the case where you add a dSYM file (like "foo.dSYM") which is for a renamed executable (like "bar"). In our case it was "mach_kernel.dSYM" which didn't match "mach_kernel.sys".
llvm-svn: 181916
Combine N_GSYM stab entries with their non-stab counterpart (data symbols) to make the symbol table smaller with less duplicate named symbols.
llvm-svn: 181841
Most importantly, have DoReadGPR/DoReadFPU/DoReadEXC return -1
to indicate failure if they're called. Else these could override
the Error setting for the relevant thread state -- if the core file
didn't include a floating point thread state, for instance, these
functions would clear the Error setting for that register set and
lldb would display random bytes as those registers' contents.
<rdar://problem/13665075>
llvm-svn: 181757
<rdar://problem/13594769>
Main changes in this patch include:
- cleanup plug-in interface and use ConstStrings for plug-in names
- Modfiied the BSD Archive plug-in to be able to pick out the correct .o file when .a files contain multiple .o files with the same name by using the timestamp
- Modified SymbolFileDWARFDebugMap to properly verify the timestamp on .o files it loads to ensure we don't load updated .o files and cause problems when debugging
The plug-in interface changes:
Modified the lldb_private::PluginInterface class that all plug-ins inherit from:
Changed:
virtual const char * GetPluginName() = 0;
To:
virtual ConstString GetPluginName() = 0;
Removed:
virtual const char * GetShortPluginName() = 0;
- Fixed up all plug-in to adhere to the new interface and to return lldb_private::ConstString values for the plug-in names.
- Fixed all plug-ins to return simple names with no prefixes. Some plug-ins had prefixes and most ones didn't, so now they all don't have prefixed names, just simple names like "linux", "gdb-remote", etc.
llvm-svn: 181631
std::string
Module::GetSpecificationDescription () const;
This returns the module as "/usr/lib/libfoo.dylib" for normal files (calls "std::string FileSpec::GetPath()" on m_file) but it also might include the object name in case the module is for a .o file in a BSD archive ("/usr/lib/libfoo.a(bar.o)"). Cleaned up necessary logging code to use it.
llvm-svn: 180717
There is a new static ObjectFile function you can call:
size_t
ObjectFile::GetModuleSpecifications (const FileSpec &file,
lldb::offset_t file_offset,
ModuleSpecList &specs)
This will fill in "specs" which the details of all the module specs (file + arch + UUID (if there is one) + object name (for BSD archive objects eventually) + file offset to the object in question).
This helps us when a user specifies a file that contains a single architecture, and also helps us when we are given a debug symbol file (like a dSYM file on MacOSX) that contains one or more architectures and we need to be able to match it up to an existing Module that has no debug info.
llvm-svn: 180224
- conditionally build mac-specific plugins only on mac (PluginObjectFileMachO, PluginDynamicLoaderDrawinKernel and PluginDynamicLoaderMacOSXDYLD)
- clean up warnings by ignoring deprecated declarations (auto_ptr for example)
llvm-svn: 179694
differs from lldb's own shared cache, and where the inferior process shared
cache does not match up with the on-disk shared cache file.
Simplify the code where lldb gets its own shared cache uuid a little bit.
llvm-svn: 179633
Show an error message when we have a corrupt mach-o file where the LC_SEGMENT or LC_SEGMENT_64 load command have file offsets or file offsets + sizes that extend beyond the end of the file.
llvm-svn: 179605
a UUID for the shared cache libraries that can be used to confirm
that one process' shared cache is the same as another, or that a
process' in-memory shared cache is a match for a given on-disk
dyld_shared_cache binary file. Use these UUIDs to catch some
uncommon problems when the shared caches are being changed for debug
purposes.
<rdar://problem/13524467>
llvm-svn: 179583
- Do not add symbols with no names
- Make sure that symbols from ELF symbol tables know that the byte size is correct. Previously the symbols would calculate their sizes by looking for the next symbol and take symbols that had zero size and make them have invalid sizes.
- Added the ability to dump raw ELF symbols by adding a Dump method to ELFSymbol
Also removed some unused code from lldb_private::Symtab.
llvm-svn: 179466
SectionList so we don't try to do anything with this file. Currently we end up crashing
later in the debug session when we read past the end of the file -- this at least gets us
closer with something like ProcessMachCore printing "error: core file has no sections".
<rdar://problem/13468295>
llvm-svn: 179152
LLDB is crashing when logging is enabled from lldb-perf-clang. This has to do with the global destructor chain as the process and its threads are being torn down.
All logging channels now make one and only one instance that is kept in a global pointer which is never freed. This guarantees that logging can correctly continue as the process tears itself down.
llvm-svn: 178191
This returns a vector of <file address, size> entries for all of
the functions in the module that have an eh_frame FDE.
Update ObjectFileMachO to use the eh_frame FDE function addresses if
the LC_FUNCTION_STARTS section is missing, to fill in the start
addresses of any symbols that have been stripped from the binary.
Generally speaking, lldb works best if it knows the actual start
address of every function in a module - it's especially important
for unwinding, where lldb inspects the instructions in the prologue
of the function. In a stripped binary, it is deprived of this
information and it reduces the quality of our unwinds and saved
register retrieval.
Other ObjectFile users may want to use the function addresses from
DWARFCallFrameInfo to fill in any stripped symbols like ObjectFileMachO
does already.
<rdar://problem/13365659>
llvm-svn: 177624
DWARF with .o files now uses 40-60% less memory!
Big fixes include:
- Change line table internal representation to contain "file addresses". Since each line table is owned by a compile unit that is owned by a module, it makes address translation into lldb_private::Address easy to do when needed.
- Removed linked address members/methods from lldb_private::Section and lldb_private::Address
- lldb_private::LineTable can now relink itself using a FileRangeMap to make it easier to re-link line tables in the future
- Added ObjectFile::ClearSymtab() so that we can get rid of the object file symbol tables after we parse them once since they are not needed and kept memory allocated for no reason
- Moved the m_sections_ap (std::auto_ptr to section list) and m_symtab_ap (std::auto_ptr to the lldb_private::Symtab) out of each of the ObjectFile subclasses and put it into lldb_private::ObjectFile.
- Changed how the debug map is parsed and stored to be able to:
- Lazily parse the debug map for each object file
- not require the address map for a .o file until debug information is linked for a .o file
llvm-svn: 176454
- generate-vers.pl has to be called by cmake to generate the version number
- parallel builds not yet supported; dependency on clang must be explicitly specified
Tested on Linux.
- Building on Mac will require code-signing logic to be implemented.
- Building on Windows will require OS-detection logic and some selective directory inclusion
Thanks to Carlo Kok (who originally prepared these CMakefiles for Windows) and Ben Langmuir
who ported them to Linux!
llvm-svn: 175795
lldb was mmap'ing archive files once per .o file it loads, now it correctly shares the archive between modules.
LLDB was also always mapping entire contents of universal mach-o files, now it maps just the slice that is required.
Added a new logging channel for "lldb" called "mmap" to help track future regressions.
Modified the ObjectFile and ObjectContainer plugin interfaces to take a data offset along with the file offset and size so we can implement the correct caching and efficient reading of parts of files without mmap'ing the entire file like we used to.
The current implementation still keeps entire .a files mmaped (once) and entire slices from universal files mmaped to ensure that if a client builds their binaries during a debug session we don't lose our data and get corrupt object file info and debug info.
llvm-svn: 174524
function stub routine addresses from an in-memory-only
MachO object file. This was the only remaining part of
ParseSymtab() that was assuming a file exists.
<rdar://problem/13139585>
llvm-svn: 174455
Major fixed to allow reading files that are over 4GB. The main problems were that the DataExtractor was using 32 bit offsets as a data cursor, and since we mmap all of our object files we could run into cases where if we had a very large core file that was over 4GB, we were running into the 4GB boundary.
So I defined a new "lldb::offset_t" which should be used for all file offsets.
After making this change, I enabled warnings for data loss and for enexpected implicit conversions temporarily and found a ton of things that I fixed.
Any functions that take an index internally, should use "size_t" for any indexes and also should return "size_t" for any sizes of collections.
llvm-svn: 173463
equality can be strict or loose and we want code to
explicitly choose one or the other.
Also renamed the Compare function to IsEqualTo, to
avoid confusion.
<rdar://problem/12856749>
llvm-svn: 170152
When using the same-device optimization for shared cache libraries, if
we have an invalid load address for __LINKEDIT, don't try to read
anything out of lldb's own address space. Reading it out of the remote
address space will fail gracefully if we have bad addresses but reading
it out of lldb's own address space will result in a crash.
llvm-svn: 169582
Allow the expression parser to see more than just data symbols. We now accept any symbol that has an address. We take precautions to only accept symbols by their mangled or demangled names only if the demangled name was not synthesized. If the demangled name is synthesized, then we now mark symbols accordingly and only compare against the mangled original name.
llvm-svn: 168668
RegisterContextKDP_i386 was not correctly writing registers due to missing "virtual" keywords. Added the virtual keywords and made the functions pure virtual to ensure subclasses can't get away without implementing these functions.
llvm-svn: 167066
is being run on iOS natively and we are examining a binary that is
in the shared-cache. The shared cache may be set up to not load the
symbol names in memory (and may be missing some local symbols entirely,
to boot) so we need to read the on-disk-but-not-mapped-into-memory cache
of symbol names/symbols before we start processing the in-memory nlist
entries.
This code needs to be reorganized into its own separate method, ideally
we'll find some way to not duplicate the nlist symbol handling. But
we need to handle this new format quickly and we'll clean up later.
Thanks for James McIlree for the patch. Fixes <rdar://problem/11639018>.
llvm-svn: 158891
a cache of address ranges for child sections,
accelerating lookups. This cache is built during
object file loading, and is then set in stone once
the object files are done loading. (In Debug builds,
we ensure that the cache is never invalidated after
that.)
llvm-svn: 158188
Fixed an issue with the symbol table parsing of files that have STAB entries in them where there are two N_SO entries where the first has a directory, and the second contains a full path:
[ 0] 00000002 64 (N_SO ) 00 0000 0000000000000000 '/Volumes/data/src/'
[ 1] 0000001e 64 (N_SO ) 00 0000 0000000000000000 '/Volumes/data/src/Source/main.m'
[ 2] 00000047 66 (N_OSO ) 09 0001 000000004fc642d2 '/tmp/main.o'
[ 3] 00000001 2e (N_BNSYM ) 01 0000 0000000000003864
[ 4] 000000bd 24 (N_FUN ) 01 0000 0000000000003864 '_main'
[ 5] 00000001 24 (N_FUN ) 00 0000 00000000000000ae
[ 6] 00000001 4e (N_ENSYM ) 01 0000 00000000000000ae
[ 7] 00000001 64 (N_SO ) 01 0000 0000000000000000
We now correctly combine entries 0 and 1 into a single entry.
llvm-svn: 157712
indicates that the section is thread specific. Any functions the load a module
given a slide, will currently ignore any sections that are thread specific.
lldb_private::Section now has:
bool
Section::IsThreadSpecific () const
{
return m_thread_specific;
}
void
Section::SetIsThreadSpecific (bool b)
{
m_thread_specific = b;
}
The ELF plug-in has been modified to set this for the ".tdata" and the ".tbss"
sections.
Eventually we need to have each lldb_private::Thread subclass be able to
resolve a thread specific section, but for now they will just not resolve. The
code for that should be trivual to add, but the address resolving functions
will need to be changed to take a "ExecutionContext" object instead of just
a target so that thread specific sections can be resolved.
llvm-svn: 153537
1 - sections only get a valid VM size if they have SHF_ALLOC in the section flags
2 - symbol names are marked as mangled if they start with "_Z"
Also fixed the DWARF parser to correctly use the section file size when extracting the DWARF.
llvm-svn: 153496
Fixed an issue with the FUNC_STARTS load command where we would get the
symbol size wrong and we would add all sorts of symbols due to bit zero being
set to indicate thumb.
llvm-svn: 152696
Simplify the locking strategy for Module and its owned objects to always use the Module's mutex to avoid A/B deadlocks. We had a case where a symbol vendor was locking itself and then calling a function that would try to get it's Module's mutex and at the same time another thread had the Module mutex that was trying to get the SymbolVendor mutex. Now any classes that inherit from ModuleChild should use the module lock using code like:
void
ModuleChildSubclass::Function
{
ModuleSP module_sp(GetModule());
if (module_sp)
{
lldb_private::Mutex::Locker locker(module_sp->GetMutex());
... do work here...
}
}
This will help avoid deadlocks by using as few locks as possible for a module and all its child objects and also enforce detecting if a module has gone away (the ModuleSP will be returned empty if the weak_ptr does refer to a valid object anymore).
llvm-svn: 152679
Get function boundaries from the LC_FUNCTION_STARTS load command. This helps to determine symbol sizes and also allows us to be able to debug stripped binaries.
If you have a stack backtrace that goes through a function that has been stripped from the symbol table, the variables for any functions above that stack frame will most likely be incorrect. It can also affect our ability to step in/out/through of a function.
llvm-svn: 152381
This takes two important changes:
- Calling blocks is now supported. You need to
cast their return values, but that works fine.
- We now can correctly run JIT-compiled
expressions that use floating-point numbers.
Also, we have taken a fix that allows us to
ignore access control in Objective-C as in C++.
llvm-svn: 152286
This fix really needed to happen as a previous fix I had submitted for
calculating symbol sizes made many symbols appear to have zero size since
the function that was calculating the symbol size was calling another function
that would cause the calculation to happen again. This resulted in some symbols
having zero size when they shouldn't. This could then cause infinite stack
traces and many other side affects.
llvm-svn: 152244
I started work on being able to add symbol files after a debug session
had started with a new "target symfile add" command and quickly ran into
problems with stale Address objects in breakpoint locations that had
lldb_private::Section pointers into modules that had been removed or
replaced. This also let to grabbing stale modules from those sections.
So I needed to thread harded the Address, Section and related objects.
To do this I modified the ModuleChild class to now require a ModuleSP
on initialization so that a weak reference can created. I also changed
all places that were handing out "Section *" to have them hand out SectionSP.
All ObjectFile, SymbolFile and SymbolVendors were inheriting from ModuleChild
so all of the find plug-in, static creation function and constructors now
require ModuleSP references instead of Module *.
Address objects now have weak references to their sections which can
safely go stale when a module gets destructed.
This checkin doesn't complete the "target symfile add" command, but it
does get us a lot clioser to being able to do such things without a high
risk of crashing or memory corruption.
llvm-svn: 151336
subclasses if the object files support version numbering. Exposed
this through SBModule for upcoming data formatter version checking stuff.
llvm-svn: 151190
Tracking modules down when you have a UUID and a path has been improved.
DynamicLoaderDarwinKernel no longer parses mach-o load commands and it
now uses the memory based modules now that we can load modules from memory.
Added a target setting named "target.exec-search-paths" which can be used
to supply a list of directories to use when trying to look for executables.
This allows one or more directories to be used when searching for modules
that may not exist in the SDK/PDK. The target automatically adds the directory
for the main executable to this list so this should help us in tracking down
shared libraries and other binaries.
llvm-svn: 150426
detection of kernels into the object file and
adding a new category for raw binary images.
Fixed all clients who previously searched for
sections manually, making them use the object
file's facilities instead.
llvm-svn: 150272
user space programs. The core file support is implemented by making a process
plug-in that will dress up the threads and stack frames by using the core file
memory.
Added many default implementations for the lldb_private::Process functions so
that plug-ins like the ProcessMachCore don't need to override many many
functions only to have to return an error.
Added new virtual functions to the ObjectFile class for extracting the frozen
thread states that might be stored in object files. The default implementations
return no thread information, but any platforms that support core files that
contain frozen thread states (like mach-o) can make a module using the core
file and then extract the information. The object files can enumerate the
threads and also provide the register state for each thread. Since each object
file knows how the thread registers are stored, they are responsible for
creating a suitable register context that can be used by the core file threads.
Changed the process CreateInstace callbacks to return a shared pointer and
to also take an "const FileSpec *core_file" parameter to allow for core file
support. This will also allow for lldb_private::Process subclasses to be made
that could load crash logs. This should be possible on darwin where the crash
logs contain all of the stack frames for all of the threads, yet the crash
logs only contain the registers for the crashed thrad. It should also allow
some variables to be viewed for the thread that crashed.
llvm-svn: 150154
Fixed "target modules list" (aliased to "image list") to output more information
by default. Modified the "target modules list" to have a few new options:
"--header" or "-h" => show the image header address
"--offset" or "-o" => show the image header address offset from the address in the file (the slide applied to the shared library)
Removed the "--symfile-basename" or "-S" option, and repurposed it to
"--symfile-unique" "-S" which will show the symbol file if it differs from
the executable file.
ObjectFile's can now be loaded from memory for cases where we don't have the
files cached locally in an SDK or net mounted root. ObjectFileMachO can now
read mach files from memory.
Moved the section data reading code into the ObjectFile so that the object
file can get the section data from Process memory if the file is only in
memory.
lldb_private::Module can now load its object file in a target with a rigid
slide (very common operation for most dynamic linkers) by using:
bool
Module::SetLoadAddress (Target &target, lldb::addr_t offset, bool &changed)
lldb::SBModule() now has a new constructor in the public interface:
SBModule::SBModule (lldb::SBProcess &process, lldb::addr_t header_addr);
This will find an appropriate ObjectFile plug-in to load an image from memory
where the object file header is at "header_addr".
llvm-svn: 149804
mmap() the entire object file contents into memory with MAP_PRIVATE.
We do this because object file contents can change on us and currently
this helps alleviate this situation. It also make the code for accessing
object file data much easier to manage and we don't end up opening the
file, reading some data and closing the file over and over.
llvm-svn: 148017
so that we don't have "fprintf (stderr, ...)" calls sprinkled everywhere.
Changed all needed locations over to using this.
For non-darwin, we log to stderr only. On darwin, we log to stderr _and_
to ASL (Apple System Log facility). This will allow GUI apps to have a place
for these error and warning messages to go, and also allows the command line
apps to log directly to the terminal.
llvm-svn: 147596
Watch for empty symbol tables by doing a lot more error checking on
all mach-o symbol table load command values and data that is obtained.
This avoids a crash that was happening when there was no string table.
llvm-svn: 147358
Objective-C, making symbol lookups for various raw
Objective-C symbols work correctly. The IR interpreter
makes these lookups because Clang has emitted raw
symbol references for ivars and classes.
Also improved performance in SymbolFiles, caching the
result of asking for SymbolFile abilities.
llvm-svn: 145758
object file can correctly make these symbols which will abstract us from the
file format and ABI and we can then ask for the objective C class symbol for
a class and find out which object file it was defined in.
llvm-svn: 145744
This is the actual fix for the above radar where global variables that weren't
initialized were not being shown correctly when leaving the DWARF in the .o
files. Global variables that aren't intialized have symbols in the .o files
that specify they are undefined and external to the .o file, yet document the
size of the variable. This allows the compiler to emit a single copy, but makes
it harder for our DWARF in .o files with the executable having a debug map
because the symbol for the global in the .o file doesn't exist in a section
that we can assign a fixed up linked address to, and also the DWARF contains
an invalid address in the "DW_OP_addr" location (always zero). This means that
the DWARF is incorrect and actually maps all such global varaibles to the
first file address in the .o file which is usually the first function. So we
can fix this in either of two ways: make a new fake section in the .o file
so that we have a file address in the .o file that we can relink, or fix the
the variable as it is created in the .o file DWARF parser and actually give it
the file address from the executable. Each variable contains a
SymbolContextScope, or a single pointer that helps us to recreate where the
variables came from (which module, file, function, etc). This context helps
us to resolve any file addresses that might be in the location description of
the variable by pointing us to which file the file address comes from, so we
can just replace the SymbolContextScope and also fix up the location, which we
would have had to do for the other case as well, and update the file address.
Now globals display correctly.
The above changes made it possible to determine if a variable is a global
or static variable when parsing DWARF. The DWARF emits a DW_TAG_variable tag
for each variable (local, global, or static), yet DWARF provides no way for
us to classify these variables into these categories. We can now detect when
a variable has a simple address expressions as its location and this will help
us classify these correctly.
While making the above changes I also noticed that we had two symbol types:
eSymbolTypeExtern and eSymbolTypeUndefined which mean essentially the same
thing: the symbol is not defined in the current object file. Symbol objects
also have a bit that specifies if a symbol is externally visible, so I got
rid of the eSymbolTypeExtern symbol type and moved all code locations that
used it to use the eSymbolTypeUndefined type.
llvm-svn: 144489
Fixed an issue where if a mach-o symbol table was corrupt and had a string
table offset that is invalid, we could crash. We now properly check the string
table offset and ignore any symbols with invalid strings.
llvm-svn: 143362
in the same hashed format as the ".apple_names", but they map objective C
class names to all of the methods and class functions. We need to do this
because in the DWARF the methods for Objective C are never contained in the
class definition, they are scattered about at the translation unit level and
they don't even have attributes that say the are contained within the class
itself.
Added 3 new formats which can be used to display data:
eFormatAddressInfo
eFormatHexFloat
eFormatInstruction
eFormatAddressInfo describes an address such as function+offset and file+line,
or symbol + offset, or constant data (c string, 2, 4, 8, or 16 byte constants).
The format character for this is "A", the long format is "address".
eFormatHexFloat will print out the hex float format that compilers tend to use.
The format character for this is "X", the long format is "hex float".
eFormatInstruction will print out disassembly with bytes and it will use the
current target's architecture. The format character for this is "i" (which
used to be being used for the integer format, but the integer format also has
"d", so we gave the "i" format to disassembly), the long format is
"instruction".
Mate the lldb::FormatterChoiceCriterion enumeration private as it should have
been from the start. It is very specialized and doesn't belong in the public
API.
llvm-svn: 143114