Produce another specific error message for a malformed Mach-O file when a symbol’s
string index is past the end of the string table. The existing test case in test/Object/macho-invalid.test
for macho-invalid-symbol-name-past-eof now reports the error with the message indicating
that a symbol at a specific index has a bad sting index and that bad string index value.
Again converting interfaces to Expected<> from ErrorOr<> does involve
touching a number of places. Where the existing code reported the error with a
string message or an error code it was converted to do the same. There is some
code for this that could be factored into a routine but I would like to leave that for
the code owners post-commit to do as they want for handling an llvm::Error. An
example of how this could be done is shown in the diff in
lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h which had a Check() routine
already for std::error_code so I added one like it for llvm::Error .
Also there some were bugs in the existing code that did not deal with the
old ErrorOr<> return values. So now with Expected<> since they must be
checked and the error handled, I added a TODO and a comment:
“// TODO: Actually report errors helpfully” and a call something like
consumeError(NameOrErr.takeError()) so the buggy code will not crash
since needed to deal with the Error.
Note there fixes needed to lld that goes along with this that I will commit right after this.
So expect lld not to built after this commit and before the next one.
llvm-svn: 266919
Produce the first specific error message for a malformed Mach-O file describing
the problem instead of the generic message for object_error::parse_failed of
"Invalid data was encountered while parsing the file”. Many more good error
messages will follow after this first one.
This is built on Lang Hames’ great work of adding the ’Error' class for
structured error handling and threading Error through MachOObjectFile
construction. And making createMachOObjectFile return Expected<...> .
So to to get the error to the llvm-obdump tool, I changed the stack of
these methods to also return Expected<...> :
object::ObjectFile::createObjectFile()
object::SymbolicFile::createSymbolicFile()
object::createBinary()
Then finally in ParseInputMachO() in MachODump.cpp the error can
be reported and the specific error message can be printed in llvm-objdump
and can be seen in the existing test case for the existing malformed binary
but with the updated error message.
Converting these interfaces to Expected<> from ErrorOr<> does involve
touching a number of places. To contain the changes for now use of
errorToErrorCode() and errorOrToExpected() are used where the callers
are yet to be converted.
Also there some were bugs in the existing code that did not deal with the
old ErrorOr<> return values. So now with Expected<> since they must be
checked and the error handled, I added a TODO and a comment:
“// TODO: Actually report errors helpfully” and a call something like
consumeError(ObjOrErr.takeError()) so the buggy code will not crash
since needed to deal with the Error.
Note there is one fix also needed to lld/COFF/InputFiles.cpp that goes along
with this that I will commit right after this. So expect lld not to built
after this commit and before the next one.
llvm-svn: 265606
in the test suite. While this is not really an interesting tool and option to run
on a Mach-O file to show the symbol table in a generic libObject format
it shouldn’t crash.
The reason for the crash was in MachOObjectFile::getSymbolType() when it was
calling MachOObjectFile::getSymbolSection() without checking its return value
for the error case.
What makes this fix require a fair bit of diffs is that the method getSymbolType() is
in the class ObjectFile defined without an ErrorOr<> so I needed to add that all
the sub classes. And all of the uses needed to be updated and the return value
needed to be checked for the error case.
The MachOObjectFile version of getSymbolType() “can” get an error in trying to
come up with the libObject’s internal SymbolRef::Type when the Mach-O symbol
symbol type is an N_SECT type because the code is trying to select from the
SymbolRef::ST_Data or SymbolRef::ST_Function values for the SymbolRef::Type.
And it needs the Mach-O section to use isData() and isBSS to determine if
it will return SymbolRef::ST_Data.
One other possible fix I considered is to simply return SymbolRef::ST_Other
when MachOObjectFile::getSymbolSection() returned an error. But since in
the past when I did such changes that “ate an error in the libObject code” I
was asked instead to push the error out of the libObject code I chose not
to implement the fix this way.
As currently written both the COFF and ELF versions of getSymbolType()
can’t get an error. But if isReservedSectionNumber() wanted to check for
the two known negative values rather than allowing all negative values or
the code wanted to add the same check as in getSymbolAddress() to use
getSection() and check for the error then these versions of getSymbolType()
could return errors.
At the end of the day the error printed now is the generic “Invalid data was
encountered while parsing the file” for object_error::parse_failed. In the
future when we thread Lang’s new TypedError for recoverable error handling
though libObject this will improve. And where the added // Diagnostic(…
comment is, it would be changed to produce and error message
like “bad section index (42) for symbol at index 8” for this case.
llvm-svn: 264187
Summary:
For relocation types that are known to not require stub functions, there
is no need to allocate extra space for the stub functions.
Reviewers: lhames, reames, maksfb
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14676
llvm-svn: 253920
Now that setExecutable() changed to do all the ground work to make
memory executable on the host, we can remove all (redundant) calls
to invalidate instruction cache here.
As an added bonus, this makes invalidateInstructionCache() dead
code, so it can be removed.
Differential Revision: http://reviews.llvm.org/D13631
llvm-svn: 253343
This commit introduces an option, --preallocate, so that we can get memory
upfront and use it in small memory model tests (in order to get
reliable results).
Differential Revision: http://reviews.llvm.org/D13630
llvm-svn: 250956
TrivialMemoryManager currently doesn't check the return type of AllocateRWX --
and returns a 'null' MemoryBlock to its caller. As pointed out by Lang,
this exposes some serious issues with the MemoryManager interface. There's,
in fact, no way to report back an error to clients rather than aborting in
case memory can't be allocated. Eventually the interface will grow to support
this, but for now, fail sooner rather than later.
Differential Revision: http://reviews.llvm.org/D13627
llvm-svn: 250350
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247692
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247683
Requested by Eugene Rozenfeld of the LLILC team, this feature allows JIT
clients to skip relocations for selected external symbols by returning ~0ULL
from their symbol resolver. If this value is returned for a given symbol,
RuntimeDyld will skip all relocations for that symbol. The client will be
responsible for applying the skipped relocations manually before the code
is executed.
llvm-svn: 241383
This function can really fail since the string table offset can be out of
bounds.
Using ErrorOr makes sure the error is checked.
Hopefully a lot of the boilerplate code in tools/* can go away once we have
a diagnostic manager in Object.
llvm-svn: 241297
This is still a really odd function. Most calls are in object format specific
contexts and should probably be replaced with a more direct query, but at least
now this is not too obnoxious to use.
llvm-svn: 240777
On ELF that was already the case since getting the size of a symbol
never fails.
On MachO and COFF we could fail trying to get the section of a symbol. But
we don't really need the section, just the section number to know if two
symbols are in the same section or not.
llvm-svn: 240580
This patch adds the -mcpu= option to llvm-rtdyld. With this option, one
can test relocations for different types of CPUs (e.g. Mips64r6).
Patch by Vladimir Radosavljevic.
Differential Revision: http://reviews.llvm.org/D10503
llvm-svn: 240477
llvm-rtdyld was relying on the default memory manager's EH frame registration,
which is host-dependent rather than target-dependent. As a result, big-endian
ELF Mips EH frames were being registered on OS X (and elsewhere). This is a
really bad idea.
llvm-svn: 238951
This commit adds partial support for MachO relocations to RelocVisitor.
A simple test case is added to show that relocations are indeed being
applied and that using llvm-dwarfdump on MachO files no longer errors.
Correctness is not yet tested, due to an unrelated bug in DebugInfo,
which will be fixed with appropriate testcase in a followup commit.
Differential Revision: http://reviews.llvm.org/D8148
llvm-svn: 238663
Summary:
This supersedes http://reviews.llvm.org/D4010, hopefully properly
dealing with the JIT case and also adds an actual test case.
DwarfContext was basically already usable for the JIT (and back when
we were overwriting ELF files it actually worked out of the box by
accident), but in order to resolve relocations correctly it needs
to know the load address of the section.
Rather than trying to get this out of the ObjectFile or requiring
the user to create a new ObjectFile just to get some debug info,
this adds the capability to pass in that info directly.
As part of this I separated out part of the LoadedObjectInfo struct
from RuntimeDyld, since it is now required at a higher layer.
Reviewers: lhames, echristo
Reviewed By: echristo
Subscribers: vtjnash, friss, rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D6961
llvm-svn: 237961
This will enable us to create a PDBContext so as to expose some
amount of debug info functionality through a common interace.
Differential Revision: http://reviews.llvm.org/D9205
Reviewed by: Alexey Samsonov
llvm-svn: 235612
MCJIT.
This patch decouples the two responsibilities of the RTDyldMemoryManager class,
memory management and symbol resolution, into two new classes:
RuntimeDyld::MemoryManager and RuntimeDyld::SymbolResolver.
The symbol resolution interface is modified slightly, from:
uint64_t getSymbolAddress(const std::string &Name);
to:
RuntimeDyld::SymbolInfo findSymbol(const std::string &Name);
The latter passes symbol flags along with symbol addresses, allowing RuntimeDyld
and others to reason about non-strong/non-exported symbols.
The memory management interface removes the following method:
void notifyObjectLoaded(ExecutionEngine *EE,
const object::ObjectFile &) {}
as it is not related to memory management. (Note: Backwards compatibility *is*
maintained for this method in MCJIT and OrcMCJITReplacement, see below).
The RTDyldMemoryManager class remains in-tree for backwards compatibility.
It inherits directly from RuntimeDyld::SymbolResolver, and indirectly from
RuntimeDyld::MemoryManager via the new MCJITMemoryManager class, which
just subclasses RuntimeDyld::MemoryManager and reintroduces the
notifyObjectLoaded method for backwards compatibility).
The EngineBuilder class retains the existing method:
EngineBuilder&
setMCJITMemoryManager(std::unique_ptr<RTDyldMemoryManager> mcjmm);
and includes two new methods:
EngineBuilder&
setMemoryManager(std::unique_ptr<MCJITMemoryManager> MM);
EngineBuilder&
setSymbolResolver(std::unique_ptr<RuntimeDyld::SymbolResolver> SR);
Clients should use EITHER:
A single call to setMCJITMemoryManager with an RTDyldMemoryManager.
OR (exclusive)
One call each to each of setMemoryManager and setSymbolResolver.
This patch should be fully compatible with existing uses of RTDyldMemoryManager.
If it is not it should be considered a bug, and the patch either fixed or
reverted.
If clients find the new API to be an improvement the goal will be to deprecate
and eventually remove the RTDyldMemoryManager class in favor of the new classes.
llvm-svn: 233509
Author: Lang Hames <lhames@gmail.com>
Date: Mon Mar 9 23:51:09 2015 +0000
[Orc][MCJIT][RuntimeDyld] Add header that was accidentally left out of r231724.
Author: Lang Hames <lhames@gmail.com>
Date: Mon Mar 9 23:44:13 2015 +0000
[Orc][MCJIT][RuntimeDyld] Add symbol flags to symbols in RuntimeDyld. Thread the
new types through MCJIT and Orc.
In particular, add a 'weak' flag. When plumbed through RTDyldMemoryManager, this
will allow us to distinguish between weak and strong definitions and find the
right ones during symbol resolution.
llvm-svn: 231731
new types through MCJIT and Orc.
In particular, add a 'weak' flag. When plumbed through RTDyldMemoryManager, this
will allow us to distinguish between weak and strong definitions and find the
right ones during symbol resolution.
llvm-svn: 231724
In preparation for adding PDB support to LLVM, this moves the
DWARF parsing code to its own subdirectory under DebugInfo, and
renames LLVMDebugInfo to LLVMDebugInfoDWARF.
This is purely a mechanical / build system change.
Differential Revision: http://reviews.llvm.org/D7269
Reviewed by: Eric Christopher
llvm-svn: 227586
utils/sort_includes.py.
I clearly haven't done this in a while, so more changed than usual. This
even uncovered a missing include from the InstrProf library that I've
added. No functionality changed here, just mechanical cleanup of the
include order.
llvm-svn: 225974
Previously, when loading an object file, RuntimeDyld (1) took ownership of the
ObjectFile instance (and associated MemoryBuffer), (2) potentially modified the
object in-place, and (3) returned an ObjectImage that managed ownership of the
now-modified object and provided some convenience methods. This scheme accreted
over several years as features were tacked on to RuntimeDyld, and was both
unintuitive and unsafe (See e.g. http://llvm.org/PR20722).
This patch fixes the issue by removing all ownership and in-place modification
of object files from RuntimeDyld. Existing behavior, including debugger
registration, is preserved.
Noteworthy changes include:
(1) ObjectFile instances are now passed to RuntimeDyld by const-ref.
(2) The ObjectImage and ObjectBuffer classes have been removed entirely, they
existed to model ownership within RuntimeDyld, and so are no longer needed.
(3) RuntimeDyld::loadObject now returns an instance of a new class,
RuntimeDyld::LoadedObjectInfo, which can be used to construct a modified
object suitable for registration with the debugger, following the existing
debugger registration scheme.
(4) The JITRegistrar class has been removed, and the GDBRegistrar class has been
re-written as a JITEventListener.
This should fix http://llvm.org/PR20722 .
llvm-svn: 222810
mode.
This will cause -verify mode to report failure when RuntimeDyld encounters an
internal error (e.g. overflows in relocation computations). Previously we had
let these errors slip past unreported.
llvm-svn: 214925
use in -verify mode.
This patch adds three hidden command line options to llvm-rtdyld:
-target-addr-start <start-addr> : Specify the start of the virtual address
space on the phony target.
-target-addr-end <end-addr> : Specify the end of the virtual address space
on the phony target.
-target-section-sep <sep> : Specify the separation (in bytes) between the
end of one section and the start of the next.
These options automatically default to sane values for the target platform. In
particular, they allow narrow (e.g. 32-bit, 16-bit) targets to be tested from
wider (e.g. 64-bit, 32-bit) hosts without overflowing pointers.
The section separation option defaults to zero, but can be set to a large number
(e.g. 1 << 32) to force large separations between sections in order to
stress-test large-code-model code.
llvm-svn: 214255
This patch introduces a 'stub_addr' builtin that can be used to find the address
of the stub for a given (<file>, <section>, <symbol>) tuple. This address can be
used both to verify the contents of stubs (by loading from the returned address)
and to verify references to stubs (by comparing against the returned address).
Example (1) - Verifying stub contents:
Load 8 bytes (assuming a 64-bit target) from the stub for 'x' in the __text
section of f.o, and compare that value against the addres of 'x'.
# rtdyld-check: *{8}(stub_addr(f.o, __text, x) = x
Example (2) - Verifying references to stubs:
Decode the immediate of the instruction at label 'l', and verify that it's
equal to the offset from the next instruction's PC to the stub for 'y' in the
__text section of f.o (i.e. it's the correct PC-rel difference).
# rtdyld-check: decode_operand(l, 4) = stub_addr(f.o, __text, y) - next_pc(l)
l:
movq y@GOTPCREL(%rip), %rax
Since stub inspection requires cooperation with RuntimeDyldImpl this patch
pimpl-ifies RuntimeDyldChecker. Its implementation is moved in to a new class,
RuntimeDyldCheckerImpl, that has access to the definition of RuntimeDyldImpl.
llvm-svn: 213698
This patch adds a "-verify" mode to the llvm-rtdyld utility. In verify mode,
llvm-rtdyld will test supplied expressions against the linked program images
that it creates in memory. This scheme can be used to verify the correctness
of the relocation logic applied by RuntimeDyld.
The expressions to test will be read out of files passed via the -check option
(there may be more than one of these). Expressions to check are extracted from
lines of the form:
# rtdyld-check: <expression>
This system is designed to fit the llvm-lit regression test workflow. It is
format and target agnostic, and supports verification of images linked for
remote targets. The expression language is defined in
llvm/include/llvm/RuntimeDyldChecker.h . Examples can be found in
test/ExecutionEngine/RuntimeDyld.
llvm-svn: 211956
libraries before linking and executing the target objects.
This allows programs that use external calls (e.g. to libc) to be run under
llvm-rtdyld.
llvm-svn: 208739
Immutable DILineInfo doesn't bring any benefits and complicates
code. Also, use std::string instead of SmallString<16> for file
and function names - their length can vary significantly.
No functionality change.
llvm-svn: 206654
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.
llvm-svn: 203083
None of the object file formats reported error on iterator increment. In
retrospect, that is not too surprising: no object format stores symbols or
sections in a linked list or other structure that requires chasing pointers.
As a consequence, all error checking can be done on begin() and end().
This reduces the text segment of bin/llvm-readobj in my machine from 521233 to
518526 bytes.
llvm-svn: 200442
It's useful for the memory managers that are allocating a section to know what the name of the section is.
At a minimum, this is useful for low-level debugging - it's customary for JITs to be able to tell you what
memory they allocated, and as part of any such dump, they should be able to tell you some meta-data about
what each allocation is for. This allows clients that supply their own memory managers to do this.
Additionally, we also envision the SectionName being useful for passing meta-data from within LLVM to an LLVM
client.
This changes both the C and C++ APIs, and all of the clients of those APIs within LLVM. I'm assuming that
it's safe to change the C++ API because that API is allowed to change. I'm assuming that it's safe to change
the C API because we haven't shipped the API in a release yet (LLVM 3.3 doesn't include the MCJIT memory
management C API).
llvm-svn: 191804
For Mach-O there were 2 implementations for parsing object files. A
standalone llvm/Object/MachOObject.h and llvm/Object/MachO.h which
implements the generic interface in llvm/Object/ObjectFile.h.
This patch adds the missing features to MachO.h, moves macho-dump to
use MachO.h and removes ObjectFile.h.
In addition to making sure that check-all is clean, I checked that the
new version produces exactly the same output in all Mach-O files in a
llvm+clang build directory (including executables and shared
libraries).
To test the performance, I ran macho-dump over all the files in a
llvm+clang build directory again, but this time redirecting the output
to /dev/null. Both the old and new versions take about 4.6 seconds
(2.5 user) to finish.
llvm-svn: 180624
Again, tools are trickier to pick the main module header for than
library source files. I've started to follow the pattern of using
LLVMContext.h when it is included as a stub for program source files.
llvm-svn: 169252
It's more flexible for MCJIT tasks, in addition it's provides a invalidation instruction cache for code sections which will be used before JIT code will be executed.
llvm-svn: 156933
1. The main works will made in the RuntimeDyLdImpl with uses the ObjectFile class. RuntimeDyLdMachO and RuntimeDyLdELF now only parses relocations and resolve it. This is allows to make improvements of the RuntimeDyLd more easily. In addition the support for COFF can be easily added.
2. Added ARM relocations to RuntimeDyLdELF.
3. Added support for stub functions for the ARM, allowing to do a long branch.
4. Added support for external functions that are not loaded from the object files, but can be loaded from external libraries. Now MCJIT can correctly execute the code containing the printf, putc, and etc.
5. The sections emitted instead functions, thanks Jim Grosbach. MemoryManager.startFunctionBody() and MemoryManager.endFunctionBody() have been removed.
6. MCJITMemoryManager.allocateDataSection() and MCJITMemoryManager. allocateCodeSection() used JMM->allocateSpace() instead of JMM->allocateCodeSection() and JMM->allocateDataSection(), because I got an error: "Cannot allocate an allocated block!" with object file contains more than one code or data sections.
llvm-svn: 153754
(and hopefully on Windows). The bots have been down most of the day
because of this, and it's not clear to me what all will be required to
fix it.
The commits started with r153205, then r153207, r153208, and r153221.
The first commit seems to be the real culprit, but I couldn't revert
a smaller number of patches.
When resubmitting, r153207 and r153208 should be folded into r153205,
they were simple build fixes.
llvm-svn: 153241
Move to a by-section allocation and relocation scheme. This allows
better support for sections which do not contain externally visible
symbols.
Flesh out the relocation address vs. local storage address separation a
bit more as well. Remote process JITs use this to tell the relocation
resolution code where the code will live when it executes.
The startFunctionBody/endFunctionBody interfaces to the JIT and the
memory manager are deprecated. They'll stick around for as long as the
old JIT does, but the MCJIT doesn't use them anymore.
llvm-svn: 148258
Add handling for tracking the relocations on symbols and resolving them.
Keep track of the relocations even after they are resolved so that if
the RuntimeDyld client moves the object, it can update the address and any
relocations to that object will be updated.
For our trival object file load/run test harness (llvm-rtdyld), this enables
relocations between functions located in the same object module. It should
be trivially extendable to load multiple objects with mutual references.
As a simple example, the following now works (running on x86_64 Darwin 10.6):
$ cat t.c
int bar() {
return 65;
}
int main() {
return bar();
}
$ clang t.c -fno-asynchronous-unwind-tables -o t.o -c
$ otool -vt t.o
t.o:
(__TEXT,__text) section
_bar:
0000000000000000 pushq %rbp
0000000000000001 movq %rsp,%rbp
0000000000000004 movl $0x00000041,%eax
0000000000000009 popq %rbp
000000000000000a ret
000000000000000b nopl 0x00(%rax,%rax)
_main:
0000000000000010 pushq %rbp
0000000000000011 movq %rsp,%rbp
0000000000000014 subq $0x10,%rsp
0000000000000018 movl $0x00000000,0xfc(%rbp)
000000000000001f callq 0x00000024
0000000000000024 addq $0x10,%rsp
0000000000000028 popq %rbp
0000000000000029 ret
$ llvm-rtdyld t.o -debug-only=dyld ; echo $?
Function sym: '_bar' @ 0
Function sym: '_main' @ 16
Extracting function: _bar from [0, 15]
allocated to 0x100153000
Extracting function: _main from [16, 41]
allocated to 0x100154000
Relocation at '_main' + 16 from '_bar(Word1: 0x2d000000)
Resolving relocation at '_main' + 16 (0x100154010) from '_bar (0x100153000)(pcrel, type: 2, Size: 4).
loaded '_main' at: 0x100154000
65
$
llvm-svn: 129388