ELFLinkGraphBuilder<ELFT> will hold generic parsing and LinkGraph-building code
that can be shared between JITLink ELF backends for different architectures.
For now it's just a stub. The plan is to incrementally move functionality down
from ELFLinkGraphBuilder_x86_64 into the new template.
This patch was derived from Valentin Churavy's work in
https://reviews.llvm.org/D104480. It adds support for setting the transform on
an IRTransformLayer, and for accessing the IRTransformLayer in LLJIT. It also
adds access to the ThreadSafeModule::withModuleDo method for thread-safe
access to modules.
A new example has been added to show how to use these APIs to optimize a module
during materialization.
Thanks Valentin!
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D103855
Provides ObjectTransformLayer APIs, a getter to access the
ObjectTransformLayer member of LLJIT, and the DumpObjects utility
to make construction of a dump-to-disk transform easy.
An example showing how the new APIs can be used has been added in
llvm/examples/OrcV2Examples/OrcV2CBindingsDumpObjects.
Addresses FIXMEs in TPC-based EH-frame and debug object registration code by
replacing manual argument serialization with WrapperFunction utility calls.
Replace the existing WrapperFunctionResult type in
llvm/include/ExecutionEngine/Orc/Shared/TargetProcessControlTypes.h with a
version adapted from the ORC runtime's implementation.
Also introduce the SimplePackedSerialization scheme (also adapted from the ORC
runtime's implementation) for wrapper functions to avoid manual serialization
and deserialization for calls to runtime functions involving common types.
The C-string section splitting support added in f9649d123d triggered an assert
("Duplicate canonical symbol at address") when multiple symbols were defined at
the the same offset within a C-string block (this triggered on arm64, where we
always add a block start symbol). The bug was caused by a failure to update the
record of the last canonical symbol address. The fix was to maintain this record
correctly, and move the auto-generation of the block-start symbol above the
handling for symbols defined in the object itself so that all symbols
(auto-generated and defined) are processed in address order.
MachO C-string literal sections should be split on null-terminator boundaries,
rather than the usual symbol boundaries. This patch updates
MachOLinkGraphBuilder to do that.
As suggested on rG937c4cffd024, use llvm_unreachable for unhandled integer types (which shouldn't be possible) instead of breaking and dropping down to the existing fatal error handler.
Helps silence static analyzer warnings.
During the generic x86-64 support refactor in ecf6466f01 the implementation
of MachO_arm64_GOTAndStubsBuilder::isGOTEdgeToFix was altered to only return
true for external symbols. This behavior is incorrect: GOT entries may be
required for defined symbols (e.g. in the large code model).
This patch fixes the bug and adds a test case for it (renaming an old test
case to avoid any ambiguity).
Currently, BPF only contains three relocations:
R_BPF_NONE for no relocation
R_BPF_64_64 for LD_imm64 and normal 64-bit data relocation
R_BPF_64_32 for call insn and normal 32-bit data relocation
Also .BTF and .BTF.ext sections contain symbols in allocated
program and data sections. These two sections reserved 32bit
space to hold the offset relative to the symbol's section.
When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld
may attempt to resolve relocations for .BTF and .BTF.ext,
which we want to prevent. So we used R_BPF_NONE for such relocations.
This all works fine until when we try to do linking of
multiple objects.
. R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data
is different, so lld target->relocate() needs more context
to do a correct job.
. The same for R_BPF_64_32. More context is needed for
lld target->relocate() to differentiate call insn vs.
normal 32-bit data relocation.
. Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE,
they will not be relocated properly when multiple .BTF/.BTF.ext
sections are merged by lld.
This patch intends to address this issue by adding additional
relocation kinds:
R_BPF_64_ABS64 for normal 64-bit data relocation
R_BPF_64_ABS32 for normal 32-bit data relocation
R_BPF_64_NODYLD32 for .BTF and .BTF.ext style relocations.
The old R_BPF_64_{64,32} semantics:
R_BPF_64_64 for LD_imm64 relocation
R_BPF_64_32 for call insn relocation
The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values
is maintained. They are the most common use cases for
bpf programs and we want to maintain backward compatibility
as much as possible.
ExecutionEngine RuntimeDyld BPF relocations are adjusted as well.
R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and
other relocations will be ignored.
Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in
RuntimeDyldELF.cpp will result in "Relocation type not implemented yet!"
fatal error.
FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp
are removed as they are not triggered in BPF backend.
BPF backend used FK_SecRel_8 for LD_imm64 instruction operands.
Differential Revision: https://reviews.llvm.org/D102712
This patch introduces new operations on jitlink::Blocks: setMutableContent,
getMutableContent and getAlreadyMutableContent. The setMutableContent method
will set the block content data and size members and flag the content as
mutable. The getMutableContent method will return a mutable copy of the existing
content value, auto-allocating and populating a new mutable copy if the existing
content is marked immutable. The getAlreadyMutableMethod asserts that the
existing content is already mutable and returns it.
setMutableContent should be used when updating the block with totally new
content backed by mutable memory. It can be used to change the size of the
block. The argument value should *not* be shared with any other block.
getMutableContent should be used when clients want to modify the existing
content and are unsure whether it is mutable yet.
getAlreadyMutableContent should be used when clients want to modify the existing
content and know from context that it must already be immutable.
These operations reduce copy-modify-update boilerplate and unnecessary copies
introduced when clients couldn't me sure whether the existing content was
mutable or not.
The implementation and intent behind freeing the triple string here is the same
as LLVMGetDefaultTargetTriple (and any other owned c string returned from the C
API), so we should use LLVMDisposeMessage for to free the string for
consistency.
Patch by Mats Larsen -- thanks Mats!
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D102957
This patch introduces functionality used by BOLT when
re-linking the final binary. It adds to MemoryManager a new member
function allowStubAllocation to control whether this MemoryManager
supports increasing code size with stubs or not. Since BOLT can
rewrite some files in-place, it needs to avoid stub insertion done
by the linker. This patch also introduces allowsZeroSymbols to the
JITSymbolResolver class, enabling us to finish a link successfully
even when some symbols resolve to the value zero. When rewriting a
binary, sometimes we do need to resolve a target to zero in case
the input binary calls address zero and we want to be bug
compatible. We also expose reassignSectionAddress as it is used by
BOLT.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97898
Fix was implemented in the ittap repo to solve an error about cross-compiling ITTAPI in LLVM with mingw.
The problem occurred in the cross-compilation environment for Julia's dependencies.
The corresponding issue item in ittapi repo: https://github.com/intel/ittapi/issues/19
A new tag was created in ittapi repo for that fix.
This patch contains changes to update the ittapi tag in LLVM.
Reviewed By: bader
Differential Revision: https://reviews.llvm.org/D102471
This is separate from (but builds on) the support added in ec6b71df70 for
emitting LinkGraphs in the context of an active materialization. This commit
makes LinkGraphs a first-class data structure with features equivalent to
object files within ObjectLinkingLayer.
These can be used to create eh-frame section fixing passes outside the usual
linker pipeline, which can be useful for tests and tools that just want to
verify or dump graphs.
This commit reorders some fields and fixes the width of others to try to
maintain more consistent columns. It also switches to long-hand scope
and linkage names, since LinkGraph dumps aren't read often enough for
single-character codes to be memorable.
Generalizing this API allows work to be distributed more evenly. In particular,
query callbacks can now be dispatched (rather than running immediately on the
thread that satisfied the query). This avoids the pathalogical case where an
operation on one thread satisfies many queries simultaneously, causing large
amounts of work to be run on that thread while other threads potentially sit
idle.
This can be useful for clients constructing custom JIT stacks: If the C API
for your custom stack exposes API to obtain a reference to an object layer
(e.g. LLVMOrcLLJITGetObjLinkingLayer) then the newly added
LLVMOrcObjectLayerAddObjectFile and LLVMOrcObjectLayerAddObjectFileWithRT
functions can be used to add objects directly to that layer.
This reapplies 8740360093, which was reverted in bbddadd46e due to buildbot
errors.
This version checks that a JIT instance can be safely constructed, skipping
tests if it can not be. To enable this it introduces new C API to retrieve and
set the target triple for a JITTargetMachineBuilder.
Adds support for creating custom MaterializationUnits in the C API with the new
LLVMOrcCreateCustomMaterializationUnit function.
Modifies ownership rules for LLVMOrcAbsoluteSymbols to make it consistent with
LLVMOrcCreateCustomMaterializationUnit. This is an ABI breaking change for any
clients of the LLVMOrcAbsoluteSymbols API.
Adds LLVMOrcLLJITGetObjLinkingLayer and LLVMOrcObjectLayerEmit functions to
allow clients to get a reference to an LLJIT instance's linking layer, then
emit an object file using it. This can be used to support construction of
custom materialization units in the common case where those units will
generate an object file that needs to be emitted to complete the
materialization.
In EHFrameRegistrationPlugin::notifyTransferringResources if SrcKey had
eh-frames associated but DstKey did not we would create a new entry for DskKey,
invalidating the iterator for SrcKey in the process. This commit fixes that by
removing SrcKey first in this case.
It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked.
This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke.
After this lands and has baked for a couple days, planned cleanups:
Make GCStatepointInst a IntrinsicInst subclass.
Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor.
Do the same in SelectionDAG.
Do the same in FastISEL.
Differential Revision: https://reviews.llvm.org/D99976
Adds utilities for creating anonymous pointers and jump stubs to x86_64.h. These
are used by the GOT and Stubs builder, but may also be used by pass writers who
want to create pointer stubs for indirection.
This patch also switches the underlying type for LinkGraph content from
StringRef to ArrayRef<char>. This avoids any confusion when working with buffers
that contain null bytes in the middle like, for example, a newly added null
pointer content array. ;)
This option tells LLJIT to disable platform support explicitly: JITDylibs aren't scanned for special init/deinit symbols and no runtime API interposes are injected.
It's useful in two cases: for platforms that don't have such requirements and platforms for which we have no explicit support yet and that don't work well with the generic IR platform.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D99416
LLVMOrcDisposeObjectLayer and LLVMOrcExecutionSessionGetJITDylibByName did not
have matching signatures between the C-API header and binding implementations.
Fixes http://llvm.org/PR49745.
Patch by Mats Larsen. Thanks Mats!
Reviewed by: lhames
Differential Revision: https://reviews.llvm.org/D99478
JITLink now requires section names to be unique. In MachO section names are only
guaranteed to be unique within their containing segment (e.g. a '__const' section
in the '__DATA' segment does not clash with a '__const' section in the '__TEXT'
segment), so we need to use the fully qualified <segment>,<section> section
names (e.g. '__DATA,__const' or '__TEXT,__const') when constructing
jitlink::Sections for MachO objects.
Apply the way createLocalIndirectStubsManagerBuilder() deals with unsupported achritectures to createLocalLazyCallThroughManager(). The returned call-through manager is dysfunctional: It runs into an unreachable as soon as a lazy JIT attempts to use it. However, this results in broader platform support for lli in default (greedy) ORC mode where no lazy materialization is required.
Don't leak ResourceKeys from MaterializationResponsibility::withResourceKeyDo() in notifyEmitted().
Also make some improvements in the overall implementation.
Differential Revision: https://reviews.llvm.org/D98863
There can be multiple MaterializationResponsibilitys in-flight for a single ResourceKey. Hence, pending debug objects must be tracked by MaterializationResponsibility and not by ResourceKey.
Differential Revision: https://reviews.llvm.org/D98785
Introduces DefineExternalSectionStartAndEndSymbols.h, which defines a template
for a JITLink pass that transforms external symbols meeting a user-supplied
predicate into defined symbols pointing at the start and end of a Section
identified by the predicate. JITLink.h is updated with a new makeAbsolute
function to support this pass.
Also renames BasicGOTAndStubsBuilder to PerGraphGOTAndPLTStubsBuilder -- the new
name better describes the intent of this GOT and PLT stubs builder, and will
help to distinguish it from future GOT and PLT stub builders that build entries
that may be shared between multiple graphs.
Issuing a lookup for an empty symbol set is legal, but can actually result in
unrelated work being done if there was a work queue left over from the previous
lookup. We can avoid doing this unrelated work (reducing stack depth and
interleaving of debugging output) by not issuing these no-op lookups in the
first place.
This patch introduces generic x86-64 edge kinds, and refactors the MachO/x86-64
backend to use these edge kinds. This simplifies the implementation of the
MachO/x86-64 backend and makes it possible to write generic x86-64 passes and
utilities.
The new edge kinds are different from the original set used in the MachO/x86-64
backend. Several edge kinds that were not meaningfully distinguished in that
backend (e.g. the PCRelMinusN edges) have been merged into single edge kinds in
the new scheme (these edge kinds can be reintroduced later if we find a use for
them). At the same time, new edge kinds have been introduced to convey extra
information about the state of the graph. E.g. The Request*AndTransformTo**
edges represent GOT/TLVP relocations prior to synthesis of the GOT/TLVP
entries, and the 'Relaxable' suffix distinguishes edges that are candidates for
optimization from edges which should be left as-is (e.g. to enable runtime
redirection).
ELF/x86-64 will be refactored to use these generic edges at some point in the
future, and I anticipate a similar refactor to create a generic arm64 support
header too.
Differential Revision: https://reviews.llvm.org/D98305
This makes the target triple, graph name, and full graph content available
when making decisions about how to populate the linker pass pipeline.
Also updates the LLJITWithObjectLinkingLayerPlugin example to show more
API use, including use of the API changes in this patch.
During finalization the debug object is registered with the target. Materialization must wait for this process to finish. Otherwise we might start running code before the debugger finished processing the corresponding debug info.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D98417
From what I can tell, the loop inside applyExternalSymbolRelocations()
used to call getSymbolAddress(). After the JITSymbolResolver interface
redesign, the functionality has changed, and the loop should no longer
trigger repopulation of ExternalSymbolRelocations. If that's the case,
there is no need to update the loop iterator manually, and
ExternalSymbolRelocations can be cleared at once. This way, when there
are many external symbols in the program, the function runs much faster.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97531
This patch introduces functionality used by BOLT when
re-linking the final binary. It adds new relocation types that
are currently unsupported by RuntimeDyldELF.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97899
GCC warning:
```
In file included from /usr/include/c++/9/cassert:44,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/ADT/BitVector.h:21,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/Support/Program.h:17,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/Support/Process.h:32,
from /home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp:11:
/home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp: In member function ‘virtual llvm::Expected<std::unique_ptr<llvm::jitlink::JITLinkMemoryManager::Allocation> > llvm::jitlink::InProcessMemoryManager::allocate(const llvm::jitlink::JITLinkDylib*, const SegmentsRequestMap&)’:
/home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp:129:40: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
129 | assert(SlabRemaining.allocatedSize() >= 0 && "Mapping exceeds allocation");
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
```
The return type of `allocatedSize()` is `size_t`, thus the expression
`SlabRemaining.allocatedSize() >= 0` always evaluate to `true`.
lli aims to provide both, RuntimeDyld and JITLink, as the dynamic linkers/loaders for it's JIT implementations. And they both offer debugging via the GDB JIT interface, which builds on the two well-known symbol names `__jit_debug_descriptor` and `__jit_debug_register_code`. As these symbols must be unique accross the linked executable, we can only define them in one of the libraries and make the other depend on it. OrcTargetProcess is a minimal stub for embedding a JIT client in remote executors. For the moment it seems reasonable to have the definition there and let ExecutionEngine depend on it, until we find a better solution.
This is the second commit for the reviewed patch.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97339
Add a new ObjectLinkingLayer plugin `DebugObjectManagerPlugin` and infrastructure to handle creation of `DebugObject`s as well as their registration in OrcTargetProcess. The current implementation only covers ELF on x86-64, but the infrastructure is not limited to that.
The journey starts with a new `LinkGraph` / `JITLinkContext` pair being created for a `MaterializationResponsibility` in ORC's `ObjectLinkingLayer`. It sends a `notifyMaterializing()` notification, which is forwarded to all registered plugins. The `DebugObjectManagerPlugin` aims to create a `DebugObject` form the provided target triple and object buffer. (Future implementations might create `DebugObject`s from a `LinkGraph` in other ways.) On success it will track it as the pending `DebugObject` for the `MaterializationResponsibility`.
This patch only implements the `ELFDebugObject` for `x86-64` targets. It follows the RuntimeDyld approach for debug object setup: it captures a copy of the input object, parses all section headers and prepares to patch their load-address fields with their final addresses in target memory. It instructs the plugin to report the section load-addresses once they are available. The plugin overrides `modifyPassConfig()` and installs a JITLink post-allocation pass to capture them.
Once JITLink emitted the finalized executable, the plugin emits and registers the `DebugObject`. For emission it requests a new `JITLinkMemoryManager::Allocation` with a single read-only segment, copies the object with patched section load-addresses over to working memory and triggers finalization to target memory. For registration, it notifies the `DebugObjectRegistrar` provided in the constructor and stores the previously pending`DebugObject` as registered for the corresponding MaterializationResponsibility.
The `DebugObjectRegistrar` registers the `DebugObject` with the target process. `llvm-jitlink` uses the `TPCDebugObjectRegistrar`, which calls `llvm_orc_registerJITLoaderGDBWrapper()` in the target process via `TargetProcessControl` to emit a `jit_code_entry` compatible with the GDB JIT interface [1]. So far the implementation only supports registration and no removal. It appears to me that it wouldn't raise any new design questions, so I left this as an addition for the near future.
[1] https://sourceware.org/gdb/current/onlinedocs/gdb/JIT-Interface.html
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97335
So far we had no way to distinguish between JITLink and RuntimeDyld in lli. Instead, we used implicit knowledge that RuntimeDyld would be used for linking ELF. In order to get D97337 to work with lli though, we have to move on and allow JITLink for ELF. This patch uses extensible RTTI to allow external clients to add their own layers without touching the LLVM sources.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97338
This commit fixes how metadata is handled in CloneModule to be sound,
and improves how it's handled in CloneFunctionInto (although the latter
is still awkward when called within a module).
Ruiling Song pointed out in PR48841 that CloneModule was changed to
unsoundly use the RF_ReuseAndMutateDistinctMDs flag (renamed in
fa35c1f80f for clarity). This flag papered
over a crash caused by other various changes made to CloneFunctionInto
over the past few years that made it unsound to use cloning between
different modules.
(This commit partially addresses PR48841, fixing the repro from
preprocessed source but not textual IR. MDNodeMapper::mapDistinctNode
became unsound in df763188c9 and this
commit does not address that regression.)
RF_ReuseAndMutateDistinctMDs is designed for the IRMover to use,
avoiding unnecessary clones of all referenced metadata when linking
between modules (with IRMover, the source module is discarded after
linking). It never makes sense to use when you're not discarding the
source. This commit drops its incorrect use in CloneModule.
Sadly, the right thing to do with metadata when cloning a function is
complicated, and this patch doesn't totally fix it.
The first problem is that there are two different types of referenceable
metadata and it's not obvious what to with one of them when remapping.
- `!0 = !{!1}` is metadata's version of a constant. Programatically it's
called "uniqued" (probably a better term would be "constant") because,
like `ConstantArray`, it's stored in uniquing tables. Once it's
constructed, it's illegal to change its arguments.
- `!0 = distinct !{!1}` is a bit closer to a global variable. It's legal
to change the operands after construction.
What should be done with distinct metadata when cloning functions within
the same module?
- Should new, cloned nodes be created?
- Should all references point to the same, old nodes?
The answer depends on whether that metadata is effectively owned by a
function.
And that's the second problem. Referenceable metadata's ownership model
is not clear or explicit. Technically, it's all stored on an
LLVMContext. However, any metadata that is `distinct`, that transitively
references a `distinct` node, or that transitively references a
GlobalValue is specific to a Module and is effectively owned by it. More
specifically, some metadata is effectively owned by a specific Function
within a module.
Effectively function-local metadata was introduced somewhere around
c10d0e5ccd, which made it illegal for two
functions to share a DISubprogram attachment.
When cloning a function within a module, you need to clone the
function-local debug info and suppress cloning of global debug info (the
status quo suppresses cloning some global debug info but not all). When
cloning a function to a new/different module, you need to clone all of
the debug info.
Here's what I think we should do (eventually? soon? not this patch
though):
- Distinguish explicitly (somehow) between pure constant metadata owned
by the LLVMContext, global metadata owned by the Module, and local
metadata owned by a GlobalValue (such as a function).
- Update CloneFunctionInto to trigger cloning of all "local" metadata
(only), perhaps by adding a bit to RemapFlag. Alternatively, split
out a separate function CloneFunctionMetadataInto to prime the
metadata map that callers are updated to call ahead of time as
appropriate.
Here's the somewhat more isolated fix in this patch:
- Converted the `ModuleLevelChanges` parameter to `CloneFunctionInto` to
an enum called `CloneFunctionChangeType` that is one of
LocalChangesOnly, GlobalChanges, DifferentModule, and ClonedModule.
- The code maintaining the "functions uniquely own subprograms"
invariant is now only active in the first two cases, where a function
is being cloned within a single module. That's necessary because this
code inhibits cloning of (some) "global" metadata that's effectively
owned by the module.
- The code maintaining the "all compile units must be explicitly
referenced by !llvm.dbg.cu" invariant is now only active in the
DifferentModule case, where a function is being cloned into a new
module in isolation.
- CoroSplit.cpp's call to CloneFunctionInto in CoroCloner::create
uses LocalChangeOnly, since fa635d730f
only set `ModuleLevelChanges` to trigger cloning of local metadata.
- CloneModule drops its unsound use of RF_ReuseAndMutateDistinctMDs
and special handling of !llvm.dbg.cu.
- Fixed some outdated header docs and left a couple of FIXMEs.
Differential Revision: https://reviews.llvm.org/D96531
A fix has been implemented in the ittap repo to fix an error about implicit fallthrough in a switch that was occurring during self build.
A new tag has been created for that fix. This is to update the tag.
Reviewed By: bader
Differential Revision: https://reviews.llvm.org/D95462
Patch by Zahira Ammarguellat.
Compilers may insert new definitions during compilation, E.g. EH personality
function pointers, or named constant pool entries. This commit causes
ObjectLinkingLayer to attempt to claim responsibility for all weak definitions
in objects as they're linked. This is always safe (first claimant for each
symbol is granted responsibility, subsequent claims are rejected without error)
and prevents compiler-injected symbols from being dead-stripped (which they
will be if they remain unclaimed by anyone).
This change was motivated by errors seen by an out-of-tree client while testing
eh-frame support in JITLink ELF/x86-64: IR containing exceptions didn't define
DW.ref.__gxx_personality_v0 (since it's added by CodeGen), and this caused
DW.ref.__gxx_personality_v0 to be dead-stripped leading to linker failures.
No test case yet: We won't have a way to test in-tree until we enable JITLink
for lli on Linux.
This is required for ELF where PCRel32 doesn't implicitly subtract 4.
No test case yet: I haven't figured out a good way to test stub
generation -- this may required extensions to jitlink-check.
Adds the EHFrameSplitter and EHFrameEdgeFixer passes to the default JITLink
pass pipeline for ELF/x86-64, and teaches EHFrameEdgeFixer to handle some
new pointer encodings.
Together these changes enable exception handling (at least for the basic
cases that I've tested so far) for ELF/x86-64 objects loaded via JITLink.
Previously FDE field names were used, but the fixup kind used for a field can
vary based on the pointer encoding.
This change will improve readability / maintainability when EH-frame support is
added to JITLink/ELF.
It can be useful for an ObjectLinkingLayerCreator to allow callee errors to get propagated to the builder. Specifically, this is the case when the ObjectLayer uses the EHFrameRegistrationPlugin, because it requires a TPCEHFrameRegistrar and instantiation for it may fail (e.g. if the required registration symbols are missing in the target process).
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D94690
All other layers in LLJIT are stored as unique_ptr's already. At this point, it is not strictly necessary for ObjTransformLayer, but it makes a follow-up change more straightforward.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D94689
Passes in the new PostAllocationPasses list will run immediately after memory
allocation and address assignment for defined symbols, and before
JITLinkContext::notifyResolved is called. These passes can set up state
associated with the addresses of defined symbols before any query for these
addresses completes.
PreFixupPasses better reflects when these passes will run.
A future patch will (re)introduce a PostAllocationPasses list that will run
after allocation, but before JITLinkContext::notifyResolved is called to notify
the rest of the JIT about the resolved symbol addresses.
Add a triple for powerpcle-*-*.
This is a little-endian encoding of the 32-bit PowerPC ABI, useful in certain niche situations:
1) A loader such as the FreeBSD loader which will be loading a little endian kernel. This is required for PowerPC64LE to load properly in pseries VMs.
Such a loader is implemented as a freestanding ELF32 LSB binary.
2) Userspace emulation of a 32-bit LE architecture such as x86 on 64-bit hosts such as PowerPC64LE with tools like box86 requires having a 32-bit LE toolchain and library set, as they operate by translating only the main binary and switching to native code when making library calls.
3) The Void Linux for PowerPC project is experimenting with running an entire powerpcle userland.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93918
Moves all headers from Orc/RPC to Orc/Shared, and from the llvm::orc::rpc
namespace into llvm::orc::shared. Also renames RPCTypeName to
SerializationTypeName and Function to RPCFunction.
In addition to being a more reasonable home for this code, this will make it
easier for the upcoming Orc runtime to re-use the Serialization system for
creating and parsing wrapper-function binary blobs.
Separates link graph creation from linking. This allows raw LinkGraphs to be
created and passed to a link. ObjectLinkingLayer is updated to support emission
of raw LinkGraphs in addition to object buffers.
Raw LinkGraphs can be created by in-memory compilers to bypass object encoding /
decoding (though this prevents caching, as LinkGraphs have do not have an
on-disk representation), and by utility code to add programatically generated
data structures to the JIT target process.
JITLinkDylib represents a target dylib for a JITLink link. By representing this
explicitly we can:
- Enable JITLinkMemoryManagers to manage allocations on a per-dylib basis
(e.g by maintaining a seperate allocation pool for each JITLinkDylib).
- Enable new features and diagnostics that require information about the
target dylib (not implemented in this patch).
To support llorg builds this patch provides the following changes:
1) Added cmake variable ITTAPI_GIT_REPOSITORY to control the location of ITTAPI repository.
Default value of ITTAPI_GIT_REPOSITORY is github location: https://github.com/intel/ittapi.git
Also, the separate cmake variable ITTAPI_GIT_TAG was added for repo tag.
2) Added cmake variable ITTAPI_SOURCE_DIR to control the place where the repo will be cloned.
Default value of ITTAPI_SOURCE_DIR is build area: PROJECT_BINARY_DIR
Reviewed By: etyurin, bader
Patch by ekovanov.
Differential Revision: https://reviews.llvm.org/D91935
The LLVM_ENABLE_MODULES builds currently randomly fail due depending on the
headers generated by the intrinsics_gen target, but the current dependency only model
the non-modules dependencies:
```
While building module 'LLVM_ExecutionEngine' imported from llvm-project/llvm/lib/ExecutionEngine/Orc/Shared/TargetProcessControlTypes.cpp:13:
While building module 'LLVM_intrinsic_gen' imported from llvm-project/llvm/include/llvm/ExecutionEngine/Orc/ThreadSafeModule.h:17:
In file included from <module-includes>:1:
In file included from llvm-project/llvm/include/llvm/IR/Argument.h:18:
llvm/include/llvm/IR/Attributes.h:75:14: fatal error: 'llvm/IR/Attributes.inc' file not found
#include "llvm/IR/Attributes.inc"
^~~~~~~~~~~~~~~~~~~~~~~~
```
Depending on whether intrinsics_gen runs before compiling Orc/Shared files we either fail or include an outdated Attributes.inc
in module builds. The Clang modules require these additional dependencies as including/importing one module requires all
includes headers by that module to be parsable.
Differential Revision: https://reviews.llvm.org/D92873
There is one result per lookup symbol, so we have to advance the result iterator no matter whether it's NULL or not.
MissingSymbols variable is unused.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D91707
Distinguish objects by target properties address size, endian and machine architecture. So far we only
support x86-64 (ELFCLASS64, ELFDATA2LSB, EM_X86_64).
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D90860
LLVMBuild has been removed from the build system. However, three LLVMBuild.txt
files remain in the tree. This patch simply removes them.
llvm/lib/ExecutionEngine/Orc/TargetProcess/LLVMBuild.txt
llvm/tools/llvm-jitlink/llvm-jitlink-executor/LLVMBuild.txt
llvm/tools/llvm-profgen/LLVMBuild.txt
Differential Revision: https://reviews.llvm.org/D92693
This reverts commit c6ef6e1690.
Basically, publicly linked libraries have a different semantic than components,
which link libraries privately.
Differential Revision: https://reviews.llvm.org/D91461
Use LINK_COMPONENTS instead of explicit target_link_libraries for components.
This avoids redundancy and potential inconsistencies.
Differential Revision: https://reviews.llvm.org/D91461
Patch by Elena Kovanova. Thanks Elena!
Problem:
LLVM already has a feature to profile the JIT-compiled code with VTune. This is
done using Intel JIT Profiling API (https://github.com/intel/ittapi). Function
information is captured by VTune as soon as the function is JIT-compiled. We
tried to use the same approach to report the function information generated by
the MCJIT engine – read parsing the debug information for in-memory ELF module
and report it using JIT API. As the results, we figured out that it did not work
properly for the following cases: inline functions, the functions located in
multiple source files, the functions having several bodies (address ranges).
Solution:
To overcome limitations described above, we have introduced new APIs as a part
of Intel ITT APIs to report the entire in-memory ELF module to be further
processed as regular ELF binaries with debug information.
This patch
1. Switches LLVM to open source version of Intel ITT/JIT APIs
(https://github.com/intel/ittapi) to keep it always up to date.
2. Adds support of profiling the code generated by MCJIT engine using Intel
VTune profiler
Another separate patch will get rid of obsolete Intel ITT APIs stuff, having
LLVM already switched to https://github.com/intel/ittapi.
Differential Revision: https://reviews.llvm.org/D86435
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
implementation.
This patch aims to improve support for out-of-process JITing using OrcV2. It
introduces two new class templates, OrcRPCTargetProcessControlBase and
OrcRPCTPCServer, which together implement the TargetProcessControl API by
forwarding operations to an execution process via an Orc-RPC Endpoint. These
utilities are used to implement out-of-process JITing from llvm-jitlink to
a new llvm-jitlink-executor tool.
This patch also breaks the OrcJIT library into three parts:
-- OrcTargetProcess: Contains code needed by the JIT execution process.
-- OrcShared: Contains code needed by the JIT execution and compiler
processes
-- OrcJIT: Everything else.
This break-up allows JIT executor processes to link against OrcTargetProcess
and OrcShared only, without having to link in all of OrcJIT. Clients executing
JIT'd code in-process should start linking against OrcTargetProcess as well as
OrcJIT.
In the near future these changes will enable:
-- Removal of the OrcRemoteTargetClient/OrcRemoteTargetServer class templates
which provided similar functionality in OrcV1.
-- Restoration of Chapter 5 of the Building-A-JIT tutorial series, which will
serve as a simple usage example for these APIs.
-- Implementation of lazy, cross-target compilation in lli's -jit-kind=orc-lazy
mode.
The macro HAVE_EHTABLE_SUPPORT is used by parts of ExecutionEngine to tell __register_frame/__deregister_frame is available to register the
FDE for a generated (JIT) code. It's currently set by a slowly growing set of macro tests in the respective headers, which is updated now and then when it fails to link on some platform or another due to the symbols being missing (see for example https://bugs.llvm.org/show_bug.cgi?id=5715).
This change converts the macro in two HAVE_(DE)REGISTER_FRAME config.h macros (like most of the other HAVE_* macros) and set's them based on whether CMake can actually find a definition for these symbols to link to at configuration time.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D87114
Basic implementation for call and jmp branches with 32 bit offset. Branches to local targets produce
Branch32 edges that are resolved like a regular PCRel32 relocations. Branches to external (undefined)
targets produce Branch32ToStub edges and go through a PLT entry by default. If the target happens to
get resolved within the 32 bit range from the callsite, the edge is relaxed during post-allocation
optimization. There is a test for each of these cases.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D90331
We have been producing R_X86_64_REX_GOTPCRELX (MOV64rm/TEST64rm/...) and
R_X86_64_GOTPCRELX for CALL64m/JMP64m without the REX prefix since 2016 (to be
consistent with GNU as), but not for MOV32rm/TEST32rm/...
Symbols with special section index SHN_COMMON (0xfff2) haven't been handled so far and caused an invalid section error.
This is a more or less straightforward use of the code commented out at the end of the function. I checked with the ELF spec, that the symbol value gives the alignment.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D89795
The difference is that the former are indirect and go to the GOT while the latter go to the target directly. This info can be used to relax indirect ones that don't need the GOT (because the target is in range). We check for this optimization beforehand. For formal correctness and to avoid confusion, we should only change the relocation kind if we actually apply the relaxation.
This re-applies e2fceec2fd with fixes. Apparently we already *do* support
relaxation for ELF, so we need to make sure the test case allocates a slab at
a fixed address, and that the R_X86_64_REX_GOTPCRELX test references an external
that is guaranteed to be out of range.
This patch breaks Orc.h up into Orc.h, LLJIT.h and OrcEE.h.
Orc.h contain core Orc utilities.
LLJIT.h contains LLJIT specific types and functions.
OrcEE.h contains types and functions that depend on ExecutionEngine.
The intent is that these headers should match future library divisions: Clients
who only use Orc.h should only need to link againt the Orc core libraries,
clients using LLJIT.h will also need to link against LLVM core, and clients
using OrcEE.h will also have to link against ExecutionEngine.
In addition to breaking up the Orc.h header this patch introduces functions to:
(1) Set the object linking layer creation function on LLJITBuilder.
(2) Create an RTDyldObjectLinkingLayer instance (particularly for use in (1)).
(3) Register JITEventListeners with an RTDyldObjectLinkingLayer.
Together (1), (2) and (3) can be used to force use of RTDyldObjectLinkingLayer
as the underlying JIT linker for LLJIT, rather than the platform default, and
to register event listeners with the RTDyldObjectLinkingLayer.
C API clients can now define a custom definition generator by providing a
callback function (to implement DefinitionGenerator::tryToGenerate) and context
object. All arguments for the DefinitionGenerator::tryToGenerate method have
been given C API counterparts, and the API allows for optionally asynchronous
generation.
Symbol string pool entries are ref counted, but not automatically cleared.
This can cause the size of the pool to grow without bound if it's not
periodically cleared. These functions allow that to be done via the C API.
This patch moves definition generation out from the session lock, instead
running it under a per-dylib generator lock. It also makes the
DefinitionGenerator::tryToGenerate method optionally asynchronous: Generators
are handed an opaque LookupState object which can be captured to stop/restart
the lookup process.
The new scheme provides the following benefits and guarantees:
(1) Queries that do not need to attempt definition generation (because all
requested symbols matched against existing definitions in the JITDylib)
can proceed without being blocked by any running definition generators.
(2) Definition generators can capture the LookupState to continue their work
asynchronously. This allows generators to run for an arbitrary amount of
time without blocking a thread. Definition generators that do not need to
run asynchronously can return without capturing the LookupState to eliminate
unnecessary recursion and improve lookup performance.
(3) Definition generators still do not need to worry about concurrency or
re-entrance: Since they are still run under a (per-dylib) lock, generators
will never be re-entered concurrently, or given overlapping symbol sets to
generate.
Finally, the new system distinguishes between symbols that are candidates for
generation (generation candidates) and symbols that failed to match for a query
(due to symbol visibility). This fixes a bug where an unresolved symbol could
trigger generation of a duplicate definition for an existing hidden symbol.
MaterializationResponsibility, JITDylib, and ExecutionSession collectively
manage the OrcV2 core JIT state. Responsibility for maintaining and
updating this state has previously been spread among these classes, resulting
in implementations that are each non-trivial, but all tightly coupled. This has
in turn made reading the code and reasoning about state update and locking
rules difficult.
The core state model can be simplified by thinking of
MaterializationResponsibility and JITDylib as facets of ExecutionSession. This
commit is the first in a series intended to refactor Core.cpp to reflect this
model. Operations on MaterializationResponsibility and JITDylib will forward to
implementation methods inside ExecutionSession. Raw state will remain with the
original classes, but in most cases will only be modified by the
ExecutionSession.
This patch updates the Kaleidoscope and BuildingAJIT tutorial series (chapter
1-4) to OrcV2. Chapter 5 of the BuildingAJIT series is removed -- it will be
re-instated once we have in-tree support for out-of-process JITing.
This patch only updates the tutorial code, not the text. Patches welcome for
that, otherwise I will try to update it in a few weeks.
This patch introduces new APIs to support resource tracking and removal in Orc.
It is intended as a thread-safe generalization of the removeModule concept from
OrcV1.
Clients can now create ResourceTracker objects (using
JITDylib::createResourceTracker) to track resources for each MaterializationUnit
(code, data, aliases, absolute symbols, etc.) added to the JIT. Every
MaterializationUnit will be associated with a ResourceTracker, and
ResourceTrackers can be re-used for multiple MaterializationUnits. Each JITDylib
has a default ResourceTracker that will be used for MaterializationUnits added
to that JITDylib if no ResourceTracker is explicitly specified.
Two operations can be performed on ResourceTrackers: transferTo and remove. The
transferTo operation transfers tracking of the resources to a different
ResourceTracker object, allowing ResourceTrackers to be merged to reduce
administrative overhead (the source tracker is invalidated in the process). The
remove operation removes all resources associated with a ResourceTracker,
including any symbols defined by MaterializationUnits associated with the
tracker, and also invalidates the tracker. These operations are thread safe, and
should work regardless of the the state of the MaterializationUnits. In the case
of resource transfer any existing resources associated with the source tracker
will be transferred to the destination tracker, and all future resources for
those units will be automatically associated with the destination tracker. In
the case of resource removal all already-allocated resources will be
deallocated, any if any program representations associated with the tracker have
not been compiled yet they will be destroyed. If any program representations are
currently being compiled then they will be prevented from completing: their
MaterializationResponsibility will return errors on any attempt to update the
JIT state.
Clients (usually Layer writers) wishing to track resources can implement the
ResourceManager API to receive notifications when ResourceTrackers are
transferred or removed. The MaterializationResponsibility::withResourceKeyDo
method can be used to create associations between the key for a ResourceTracker
and an allocated resource in a thread-safe way.
RTDyldObjectLinkingLayer and ObjectLinkingLayer are updated to use the
ResourceManager API to enable tracking and removal of memory allocated by the
JIT linker.
The new JITDylib::clear method can be used to trigger removal of every
ResourceTracker associated with the JITDylib (note that this will only
remove resources for the JITDylib, it does not run static destructors).
This patch includes unit tests showing basic usage. A follow-up patch will
update the Kaleidoscope and BuildingAJIT tutorial series to OrcV2 and will
use this API to release code associated with anonymous expressions.
This removes all legacy layers, legacy utilities, the old Orc C bindings,
OrcMCJITReplacement, and OrcMCJITReplacement regression tests.
ExecutionEngine and MCJIT are not affected by this change.
Report a fatal error if an IMAGE_REL_AMD64_ADDR32NB cannot be applied due to an
out-of-range target. Previously we emitted a diagnostic to llvm::errs and
continued.
Patch by Dale Martin. Thanks Dale!
This patch enables basic BSS section handling, and improves a couple of error
messages in the ELF section parsing code.
Patch by Christian Schafmeister. Thanks Christian!
Differential Revision: https://reviews.llvm.org/D88867
`ELFFile<ELFT>` has many methods that take pointers,
though they assume that arguments are never null and
hence could take references instead.
This patch performs such clean-up.
Differential revision: https://reviews.llvm.org/D87385
Making MaterializationResponsibility instances immovable allows their
associated VModuleKeys to be updated by the ExecutionSession while the
responsibility is still in-flight. This will be used in the upcoming
removable code feature to enable safe merging of resource keys even if
there are active compiles using the keys being merged.
TPCDynamicLibrarySearchGenerator was generating errors on missing
symbols, but that doesn't fit the DefinitionGenerator contract: A symbol
that isn't generated by a particular generator should not cause an
error.
This commit fixes the error by using SymbolLookupFlags::WeaklyReferencedSymbol
for all elements of the lookup, and switches llvm-jitlink to use
TPCDynamicLibrarySearchGenerator.
If there's no initializer symbol in the current MaterializationResponsibility
then bail out without installing JITLink passes: they're going to be no-ops
anyway.
A think-o in the existing code meant that dependencies were never registered.
This failure could lead to crashes rather than orderly error propagation if
initialization dependencies failed to materialize.
No test case: The bug was discovered in an out-of-tree code and requires
pathalogically misconfigured JIT to generate the original error that lead to
the crash.
DFS and Reverse-DFS linkage orders are used to order execution of
deinitializers and initializers respectively.
This patch replaces uses of special purpose DFS order functions in
MachOPlatform and LLJIT with uses of the new methods.
The functions `__register_frame`/`__deregister_frame` are not
available on z/OS, so add a guard to not use them.
Reviewed By: lhames, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D84787
MachOLinkGraphBuilder has been treating these as hidden, but they should be
treated as local.
Symbols with N_PEXT set and N_EXT unset are produced when hidden symbols are
run through 'ld -r' without passing -keep_private_externs. They will show up
under 'nm -m' as "was private extern", hence the name of the test cases.
Testcase commited as relocatable object to ensure that the test suite doesn't
depend on having 'ld -r' available.
This loop caused me a little headache once, because I didn't see the assigned variable is a member. The refactored version appears more readable to me.
Differential Revision: https://reviews.llvm.org/D85922
Correctly sign extend the addend, and fix implicit shift operand decoding
(it incorrectly returned 0 for some cases), and check that the initial
encoded immediate is 0.
The code in SectionMemoryManager.cpp unnecessarily maps
read-only data sections with the READ+EXECUTE flags. This is
undesirable from a security stand-point.
Moreover, on the Fuchsia platform, which is now very strict
about mapping pages with the EXECUTE permission, this simply
fails, because the section's pages were initially allocated
with only the READ+WRITE flags.
A more detailed description of the issue can be found in this
public SwiftShader bug:
https://issuetracker.google.com/issues/154586551
This patch just restrict the mapping to the READ flag for ROData
sections. Code sections are still mapped with READ+EXECUTE as
expected.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D78574
Archives can now be specified as input files the same way that object
files are. Archives will always be linked after all objects (regardless
of the relative order of the inputs) but before any dynamic libraries or
process symbols.
This patch also relaxes matching for slice triples in
StaticLibraryDefinitionGenerator in order to support this feature:
Vendors need not match if the source vendor is unknown.
The -phony-externals option adds a generator which explicitly defines any
otherwise unresolved externals as null. This transforms link-time
unresolved-symbol errors into potential runtime null pointer accesses
(if an unresolved external is actually accessed during execution).
This option can be useful in -harness mode to avoid having to mock a
large number of symbols that are not reachable at runtime (e.g. unused
methods referenced by a class vtable).
This fixes the ExecutionEngine/MCJIT/stubs-sm-pic.ll test in no-asserts
builds which is set to XFAIL on some platforms like 32-bit x86. More
importantly, we probably don't want to silently error in these cases.
Differential revision: https://reviews.llvm.org/D84390
The -harness option enables new testing use-cases for llvm-jitlink. It takes a
list of objects to treat as a test harness for any regular objects passed to
llvm-jitlink.
If any files are passed using the -harness option then the following
transformations are applied to all other files:
(1) Symbols definitions that are referenced by the harness files are promoted
to default scope. (This enables access to statics from test harness).
(2) Symbols definitions that clash with definitions in the harness files are
deleted. (This enables interposition by test harness).
(3) All other definitions in regular files are demoted to local scope.
(This causes untested code to be dead stripped, reducing memory cost and
eliminating spurious unresolved symbol errors from untested code).
These transformations allow the harness files to reference and interpose
symbols in the regular object files, which can be used to support execution
tests (including fuzz tests) of functions in relocatable objects produced by a
build.
This allows clients to detect invalid transformations applied by JITLink passes
(e.g. inserting or removing symbols in unexpected ways) and terminate linking
with an error.
This change is used to simplify the error propagation logic in
ObjectLinkingLayer.
Subclasses will commonly gather that information from a remote during
construction, in which case they won't have meaningful values to pass to
TargetProcessControl's constructor.