This patch is the initial support, it implements translation from object file to JIT link graph, and very few relocations were supported. Currently, the test file ELF_pc_indirect.s is passed, the HelloWorld program(compiled with mno-relax flag) can be linked correctly and run on instruction emulator correctly.
In the downstream implementation, I have implemented the GOT, PLT function, and EHFrame and some optimization will be implement soon. I will organize the code in to patches, then gradually send it to upstream.
Differential Revision: https://reviews.llvm.org/D105429
By replacing a lambda expression with a functor class instance, this
patch works around an issue encountered on AIX where the IBM XL compiler
appears to make no progress for many hours.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D106554
This reverts commit 6b2a96285b.
The ccache builders are still failing. Looks like they need to be updated to
get the llvm-zorg config change in 490633945677656ba75d42ff1ca9d4a400b7b243.
I'll re-apply this as soon as the builders are updated.
This reapplies commit a7733e9556 ("Re-apply
[ORC][ORC-RT] Add initial native-TLV support to MachOPlatform."), and
d4abdefc99 ("[ORC-RT] Rename macho_tlv.x86-64.s
to macho_tlv.x86-64.S (uppercase suffix)").
These patches were reverted in 48aa82cacb while I
investigated bot failures (e.g.
https://lab.llvm.org/buildbot/#/builders/109/builds/18981). The fix was to
disable building of the ORC runtime on buliders using ccache (which is the same
fix used for other compiler-rt projects containing assembly code). This fix was
commited to llvm-zorg in 490633945677656ba75d42ff1ca9d4a400b7b243.
This reverts commit d4abdefc99 ("[ORC-RT] Rename
macho_tlv.x86-64.s to macho_tlv.x86-64.S (uppercase suffix)", and
a7733e9556 ("Re-apply "[ORC][ORC-RT] Add initial
native-TLV support to MachOPlatform."), while I investigate failures on
ccache builders (e.g. https://lab.llvm.org/buildbot/#/builders/109/builds/18981)
Reapplies fe1fa43f16, which was reverted in
6d8c63946c, with fixes:
1. Remove .subsections_via_symbols directive from macho_tlv.x86-64.s (it's
not needed here anyway).
2. Return error from pthread_key_create to the MachOPlatform to silence unused
variable warning.
Adds code to LLVM (MachOPlatform) and the ORC runtime to support native MachO
thread local variables. Adding new TLVs to a JITDylib at runtime is supported.
On the LLVM side MachOPlatform is updated to:
1. Identify thread local variables in the LinkGraph and lower them to GOT
accesses to data in the __thread_data or __thread_bss sections.
2. Merge and report the address range of __thread_data and thread_bss sections
to the runtime.
On the ORC runtime a MachOTLVManager class introduced which records the address
range of thread data/bss sections, and creates thread-local instances from the
initial data on demand. An orc-runtime specific tlv_get_addr implementation is
included which saves all register state then calls the MachOTLVManager to get
the address of the requested variable for the current thread.
LinkGraph::transferBlock can be used to move a block and all associated symbols
from one section to another.
LinkGraph::mergeSections moves all blocks and sections from a source section to
a destination section.
Adds support for MachO static initializers/deinitializers and eh-frame
registration via the ORC runtime.
This commit introduces cooperative support code into the ORC runtime and ORC
LLVM libraries (especially the MachOPlatform class) to support macho runtime
features for JIT'd code. This commit introduces support for static
initializers, static destructors (via cxa_atexit interposition), and eh-frame
registration. Near-future commits will add support for MachO native
thread-local variables, and language runtime registration (e.g. for Objective-C
and Swift).
The llvm-jitlink tool is updated to use the ORC runtime where available, and
regression tests for the new MachOPlatform support are added to compiler-rt.
Notable changes on the ORC runtime side:
1. The new macho_platform.h / macho_platform.cpp files contain the bulk of the
runtime-side support. This includes eh-frame registration; jit versions of
dlopen, dlsym, and dlclose; a cxa_atexit interpose to record static destructors,
and an '__orc_rt_macho_run_program' function that defines running a JIT'd MachO
program in terms of the jit- dlopen/dlsym/dlclose functions.
2. Replaces JITTargetAddress (and casting operations) with ExecutorAddress
(copied from LLVM) to improve type-safety of address management.
3. Adds serialization support for ExecutorAddress and unordered_map types to
the runtime-side Simple Packed Serialization code.
4. Adds orc-runtime regression tests to ensure that static initializers and
cxa-atexit interposes work as expected.
Notable changes on the LLVM side:
1. The MachOPlatform class is updated to:
1.1. Load the ORC runtime into the ExecutionSession.
1.2. Set up standard aliases for macho-specific runtime functions. E.g.
___cxa_atexit -> ___orc_rt_macho_cxa_atexit.
1.3. Install the MachOPlatformPlugin to scrape LinkGraphs for information
needed to support MachO features (e.g. eh-frames, mod-inits), and
communicate this information to the runtime.
1.4. Provide entry-points that the runtime can call to request initializers,
perform symbol lookup, and request deinitialiers (the latter is
implemented as an empty placeholder as macho object deinits are rarely
used).
1.5. Create a MachO header object for each JITDylib (defining the __mh_header
and __dso_handle symbols).
2. The llvm-jitlink tool (and llvm-jitlink-executor) are updated to use the
runtime when available.
3. A `lookupInitSymbolsAsync` method is added to the Platform base class. This
can be used to issue an async lookup for initializer symbols. The existing
`lookupInitSymbols` method is retained (the GenericIRPlatform code is still
using it), but is deprecated and will be removed soon.
4. JIT-dispatch support code is added to ExecutorProcessControl.
The JIT-dispatch system allows handlers in the JIT process to be associated with
'tag' symbols in the executor, and allows the executor to make remote procedure
calls back to the JIT process (via __orc_rt_jit_dispatch) using those tags.
The primary use case is ORC runtime code that needs to call bakc to handlers in
orc::Platform subclasses. E.g. __orc_rt_macho_jit_dlopen calling back to
MachOPlatform::rt_getInitializers using __orc_rt_macho_get_initializers_tag.
(The system is generic however, and could be used by non-runtime code).
The new ExecutorProcessControl::JITDispatchInfo struct provides the address
(in the executor) of the jit-dispatch function and a jit-dispatch context
object, and implementations of the dispatch function are added to
SelfExecutorProcessControl and OrcRPCExecutorProcessControl.
5. OrcRPCTPCServer is updated to support JIT-dispatch calls over ORC-RPC.
6. Serialization support for StringMap is added to the LLVM-side Simple Packed
Serialization code.
7. A JITLink::allocateBuffer operation is introduced to allocate writable memory
attached to the graph. This is used by the MachO header synthesis code, and will
be generically useful for other clients who want to create new graph content
from scratch.
At most these use the StringRef/Twine wrappers and don't have any implicit uses of std::string.
Move the include down to any cpp implementation where std::string is actually used.
Renames CommonOrcRuntimeTypes.h to ExecutorAddress.h and moves ExecutorAddress
into the 'orc' namespace (rather than orc::shared).
Also makes ExecutorAddress a class, adds an ExecutorAddrDiff type and some
arithmetic operations on the pair (subtracting two addresses yields an addrdiff,
adding an addrdiff and an address yields an address).
The computeNamedSymbolDependencies and computeLocalDeps methods on
ObjectLinkingLayerJITLinkContext are responsible for computing, for each symbol
in the current MaterializationResponsibility, the set of non-locally-scoped
symbols that are depended on. To calculate this we have to consider the effect
of chains of dependence through locally scoped symbols in the LinkGraph. E.g.
.text
.globl foo
foo:
callq bar ## foo depneds on external 'bar'
movq Ltmp1(%rip), %rcx ## foo depends on locally scoped 'Ltmp1'
addl (%rcx), %eax
retq
.data
Ltmp1:
.quad x ## Ltmp1 depends on external 'x'
In this example symbol 'foo' depends directly on 'bar', and indirectly on 'x'
via 'Ltmp1', which is locally scoped.
Performance of the existing implementations appears to have been mediocre:
Based on flame graphs posted by @drmeister (in #jit on the LLVM discord server)
the computeLocalDeps function was taking up a substantial amount of time when
starting up Clasp (https://github.com/clasp-developers/clasp).
This commit attempts to address the performance problems in three ways:
1. Using jitlink::Blocks instead of jitlink::Symbols as the nodes of the
dependencies-introduced-by-locally-scoped-symbols graph.
Using either Blocks or Symbols as nodes provides the same information, but since
there may be more than one locally scoped symbol per block the block-based
version of the dependence graph should always be a subgraph of the Symbol-based
version, and so faster to operate on.
2. Improved worklist management.
The older version of computeLocalDeps used a fixed worklist containing all
nodes, and iterated over this list propagating dependencies until no further
changes were required. The worklist was not sorted into a useful order before
the loop started.
The new version uses a variable work-stack, visiting nodes in DFS order and
only adding nodes when there is meaningful work to do on them.
Compared to the old version the new version avoids revisiting nodes which
haven't changed, and I suspect it converges more quickly (due to the DFS
ordering).
3. Laziness and caching.
Mappings of...
jitlink::Symbol* -> Interned Name (as SymbolStringPtr)
jitlink::Block* -> Immediate dependencies (as SymbolNameSet)
jitlink::Block* -> Transitive dependencies (as SymbolNameSet)
are all built lazily and cached while running computeNamedSymbolDependencies.
According to @drmeister these changes reduced Clasp startup time in his test
setup (averaged over a handful of starts) from 4.8 to 2.8 seconds (with
ORC/JITLink linking ~11,000 object files in that time), which seems like
enough to justify switching to the new algorithm in the absence of any other
perf numbers.
MachOJITDylibInitializers::SectionExtent represented the address range of a
section as an (address, size) pair. The new ExecutorAddressRange type
generalizes this to an address range (for any object, not necessarily a section)
represented as a (start-address, end-address) pair.
The aim is to express more of ORC (and the ORC runtime) in terms of simple types
that can be serialized/deserialized via SPS. This will simplify SPS-based RPC
involving arguments/return-values of these types.
Adds support for both synchronous and asynchronous calls to wrapper functions
using SPS (Simple Packed Serialization). Also adds support for wrapping
functions on the JIT side in SPS-based wrappers that can be called from the
executor.
These new methods simplify calls between the JIT and Executor, and will be used
in upcoming ORC runtime patches to enable communication between ORC and the
runtime.
This is a first step towards consistently using the term 'executor' for the
process that executes JIT'd code. I've opted for 'executor' as the preferred
term over 'target' as target is already heavily overloaded ("the target
machine for the executor" is much clearer than "the target machine for the
target").
ELFLinkGraphBuilder<ELFT> will hold generic parsing and LinkGraph-building code
that can be shared between JITLink ELF backends for different architectures.
For now it's just a stub. The plan is to incrementally move functionality down
from ELFLinkGraphBuilder_x86_64 into the new template.
This patch was derived from Valentin Churavy's work in
https://reviews.llvm.org/D104480. It adds support for setting the transform on
an IRTransformLayer, and for accessing the IRTransformLayer in LLJIT. It also
adds access to the ThreadSafeModule::withModuleDo method for thread-safe
access to modules.
A new example has been added to show how to use these APIs to optimize a module
during materialization.
Thanks Valentin!
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D103855
Provides ObjectTransformLayer APIs, a getter to access the
ObjectTransformLayer member of LLJIT, and the DumpObjects utility
to make construction of a dump-to-disk transform easy.
An example showing how the new APIs can be used has been added in
llvm/examples/OrcV2Examples/OrcV2CBindingsDumpObjects.
Addresses FIXMEs in TPC-based EH-frame and debug object registration code by
replacing manual argument serialization with WrapperFunction utility calls.
Replace the existing WrapperFunctionResult type in
llvm/include/ExecutionEngine/Orc/Shared/TargetProcessControlTypes.h with a
version adapted from the ORC runtime's implementation.
Also introduce the SimplePackedSerialization scheme (also adapted from the ORC
runtime's implementation) for wrapper functions to avoid manual serialization
and deserialization for calls to runtime functions involving common types.
The C-string section splitting support added in f9649d123d triggered an assert
("Duplicate canonical symbol at address") when multiple symbols were defined at
the the same offset within a C-string block (this triggered on arm64, where we
always add a block start symbol). The bug was caused by a failure to update the
record of the last canonical symbol address. The fix was to maintain this record
correctly, and move the auto-generation of the block-start symbol above the
handling for symbols defined in the object itself so that all symbols
(auto-generated and defined) are processed in address order.
MachO C-string literal sections should be split on null-terminator boundaries,
rather than the usual symbol boundaries. This patch updates
MachOLinkGraphBuilder to do that.
As suggested on rG937c4cffd024, use llvm_unreachable for unhandled integer types (which shouldn't be possible) instead of breaking and dropping down to the existing fatal error handler.
Helps silence static analyzer warnings.
During the generic x86-64 support refactor in ecf6466f01 the implementation
of MachO_arm64_GOTAndStubsBuilder::isGOTEdgeToFix was altered to only return
true for external symbols. This behavior is incorrect: GOT entries may be
required for defined symbols (e.g. in the large code model).
This patch fixes the bug and adds a test case for it (renaming an old test
case to avoid any ambiguity).
Currently, BPF only contains three relocations:
R_BPF_NONE for no relocation
R_BPF_64_64 for LD_imm64 and normal 64-bit data relocation
R_BPF_64_32 for call insn and normal 32-bit data relocation
Also .BTF and .BTF.ext sections contain symbols in allocated
program and data sections. These two sections reserved 32bit
space to hold the offset relative to the symbol's section.
When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld
may attempt to resolve relocations for .BTF and .BTF.ext,
which we want to prevent. So we used R_BPF_NONE for such relocations.
This all works fine until when we try to do linking of
multiple objects.
. R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data
is different, so lld target->relocate() needs more context
to do a correct job.
. The same for R_BPF_64_32. More context is needed for
lld target->relocate() to differentiate call insn vs.
normal 32-bit data relocation.
. Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE,
they will not be relocated properly when multiple .BTF/.BTF.ext
sections are merged by lld.
This patch intends to address this issue by adding additional
relocation kinds:
R_BPF_64_ABS64 for normal 64-bit data relocation
R_BPF_64_ABS32 for normal 32-bit data relocation
R_BPF_64_NODYLD32 for .BTF and .BTF.ext style relocations.
The old R_BPF_64_{64,32} semantics:
R_BPF_64_64 for LD_imm64 relocation
R_BPF_64_32 for call insn relocation
The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values
is maintained. They are the most common use cases for
bpf programs and we want to maintain backward compatibility
as much as possible.
ExecutionEngine RuntimeDyld BPF relocations are adjusted as well.
R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and
other relocations will be ignored.
Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in
RuntimeDyldELF.cpp will result in "Relocation type not implemented yet!"
fatal error.
FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp
are removed as they are not triggered in BPF backend.
BPF backend used FK_SecRel_8 for LD_imm64 instruction operands.
Differential Revision: https://reviews.llvm.org/D102712
This patch introduces new operations on jitlink::Blocks: setMutableContent,
getMutableContent and getAlreadyMutableContent. The setMutableContent method
will set the block content data and size members and flag the content as
mutable. The getMutableContent method will return a mutable copy of the existing
content value, auto-allocating and populating a new mutable copy if the existing
content is marked immutable. The getAlreadyMutableMethod asserts that the
existing content is already mutable and returns it.
setMutableContent should be used when updating the block with totally new
content backed by mutable memory. It can be used to change the size of the
block. The argument value should *not* be shared with any other block.
getMutableContent should be used when clients want to modify the existing
content and are unsure whether it is mutable yet.
getAlreadyMutableContent should be used when clients want to modify the existing
content and know from context that it must already be immutable.
These operations reduce copy-modify-update boilerplate and unnecessary copies
introduced when clients couldn't me sure whether the existing content was
mutable or not.
The implementation and intent behind freeing the triple string here is the same
as LLVMGetDefaultTargetTriple (and any other owned c string returned from the C
API), so we should use LLVMDisposeMessage for to free the string for
consistency.
Patch by Mats Larsen -- thanks Mats!
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D102957
This patch introduces functionality used by BOLT when
re-linking the final binary. It adds to MemoryManager a new member
function allowStubAllocation to control whether this MemoryManager
supports increasing code size with stubs or not. Since BOLT can
rewrite some files in-place, it needs to avoid stub insertion done
by the linker. This patch also introduces allowsZeroSymbols to the
JITSymbolResolver class, enabling us to finish a link successfully
even when some symbols resolve to the value zero. When rewriting a
binary, sometimes we do need to resolve a target to zero in case
the input binary calls address zero and we want to be bug
compatible. We also expose reassignSectionAddress as it is used by
BOLT.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97898
Fix was implemented in the ittap repo to solve an error about cross-compiling ITTAPI in LLVM with mingw.
The problem occurred in the cross-compilation environment for Julia's dependencies.
The corresponding issue item in ittapi repo: https://github.com/intel/ittapi/issues/19
A new tag was created in ittapi repo for that fix.
This patch contains changes to update the ittapi tag in LLVM.
Reviewed By: bader
Differential Revision: https://reviews.llvm.org/D102471
This is separate from (but builds on) the support added in ec6b71df70 for
emitting LinkGraphs in the context of an active materialization. This commit
makes LinkGraphs a first-class data structure with features equivalent to
object files within ObjectLinkingLayer.
These can be used to create eh-frame section fixing passes outside the usual
linker pipeline, which can be useful for tests and tools that just want to
verify or dump graphs.
This commit reorders some fields and fixes the width of others to try to
maintain more consistent columns. It also switches to long-hand scope
and linkage names, since LinkGraph dumps aren't read often enough for
single-character codes to be memorable.
Generalizing this API allows work to be distributed more evenly. In particular,
query callbacks can now be dispatched (rather than running immediately on the
thread that satisfied the query). This avoids the pathalogical case where an
operation on one thread satisfies many queries simultaneously, causing large
amounts of work to be run on that thread while other threads potentially sit
idle.
This can be useful for clients constructing custom JIT stacks: If the C API
for your custom stack exposes API to obtain a reference to an object layer
(e.g. LLVMOrcLLJITGetObjLinkingLayer) then the newly added
LLVMOrcObjectLayerAddObjectFile and LLVMOrcObjectLayerAddObjectFileWithRT
functions can be used to add objects directly to that layer.
This reapplies 8740360093, which was reverted in bbddadd46e due to buildbot
errors.
This version checks that a JIT instance can be safely constructed, skipping
tests if it can not be. To enable this it introduces new C API to retrieve and
set the target triple for a JITTargetMachineBuilder.
Adds support for creating custom MaterializationUnits in the C API with the new
LLVMOrcCreateCustomMaterializationUnit function.
Modifies ownership rules for LLVMOrcAbsoluteSymbols to make it consistent with
LLVMOrcCreateCustomMaterializationUnit. This is an ABI breaking change for any
clients of the LLVMOrcAbsoluteSymbols API.
Adds LLVMOrcLLJITGetObjLinkingLayer and LLVMOrcObjectLayerEmit functions to
allow clients to get a reference to an LLJIT instance's linking layer, then
emit an object file using it. This can be used to support construction of
custom materialization units in the common case where those units will
generate an object file that needs to be emitted to complete the
materialization.
In EHFrameRegistrationPlugin::notifyTransferringResources if SrcKey had
eh-frames associated but DstKey did not we would create a new entry for DskKey,
invalidating the iterator for SrcKey in the process. This commit fixes that by
removing SrcKey first in this case.
It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked.
This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke.
After this lands and has baked for a couple days, planned cleanups:
Make GCStatepointInst a IntrinsicInst subclass.
Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor.
Do the same in SelectionDAG.
Do the same in FastISEL.
Differential Revision: https://reviews.llvm.org/D99976
Adds utilities for creating anonymous pointers and jump stubs to x86_64.h. These
are used by the GOT and Stubs builder, but may also be used by pass writers who
want to create pointer stubs for indirection.
This patch also switches the underlying type for LinkGraph content from
StringRef to ArrayRef<char>. This avoids any confusion when working with buffers
that contain null bytes in the middle like, for example, a newly added null
pointer content array. ;)
This option tells LLJIT to disable platform support explicitly: JITDylibs aren't scanned for special init/deinit symbols and no runtime API interposes are injected.
It's useful in two cases: for platforms that don't have such requirements and platforms for which we have no explicit support yet and that don't work well with the generic IR platform.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D99416
LLVMOrcDisposeObjectLayer and LLVMOrcExecutionSessionGetJITDylibByName did not
have matching signatures between the C-API header and binding implementations.
Fixes http://llvm.org/PR49745.
Patch by Mats Larsen. Thanks Mats!
Reviewed by: lhames
Differential Revision: https://reviews.llvm.org/D99478
JITLink now requires section names to be unique. In MachO section names are only
guaranteed to be unique within their containing segment (e.g. a '__const' section
in the '__DATA' segment does not clash with a '__const' section in the '__TEXT'
segment), so we need to use the fully qualified <segment>,<section> section
names (e.g. '__DATA,__const' or '__TEXT,__const') when constructing
jitlink::Sections for MachO objects.
Apply the way createLocalIndirectStubsManagerBuilder() deals with unsupported achritectures to createLocalLazyCallThroughManager(). The returned call-through manager is dysfunctional: It runs into an unreachable as soon as a lazy JIT attempts to use it. However, this results in broader platform support for lli in default (greedy) ORC mode where no lazy materialization is required.
Don't leak ResourceKeys from MaterializationResponsibility::withResourceKeyDo() in notifyEmitted().
Also make some improvements in the overall implementation.
Differential Revision: https://reviews.llvm.org/D98863
There can be multiple MaterializationResponsibilitys in-flight for a single ResourceKey. Hence, pending debug objects must be tracked by MaterializationResponsibility and not by ResourceKey.
Differential Revision: https://reviews.llvm.org/D98785
Introduces DefineExternalSectionStartAndEndSymbols.h, which defines a template
for a JITLink pass that transforms external symbols meeting a user-supplied
predicate into defined symbols pointing at the start and end of a Section
identified by the predicate. JITLink.h is updated with a new makeAbsolute
function to support this pass.
Also renames BasicGOTAndStubsBuilder to PerGraphGOTAndPLTStubsBuilder -- the new
name better describes the intent of this GOT and PLT stubs builder, and will
help to distinguish it from future GOT and PLT stub builders that build entries
that may be shared between multiple graphs.
Issuing a lookup for an empty symbol set is legal, but can actually result in
unrelated work being done if there was a work queue left over from the previous
lookup. We can avoid doing this unrelated work (reducing stack depth and
interleaving of debugging output) by not issuing these no-op lookups in the
first place.
This patch introduces generic x86-64 edge kinds, and refactors the MachO/x86-64
backend to use these edge kinds. This simplifies the implementation of the
MachO/x86-64 backend and makes it possible to write generic x86-64 passes and
utilities.
The new edge kinds are different from the original set used in the MachO/x86-64
backend. Several edge kinds that were not meaningfully distinguished in that
backend (e.g. the PCRelMinusN edges) have been merged into single edge kinds in
the new scheme (these edge kinds can be reintroduced later if we find a use for
them). At the same time, new edge kinds have been introduced to convey extra
information about the state of the graph. E.g. The Request*AndTransformTo**
edges represent GOT/TLVP relocations prior to synthesis of the GOT/TLVP
entries, and the 'Relaxable' suffix distinguishes edges that are candidates for
optimization from edges which should be left as-is (e.g. to enable runtime
redirection).
ELF/x86-64 will be refactored to use these generic edges at some point in the
future, and I anticipate a similar refactor to create a generic arm64 support
header too.
Differential Revision: https://reviews.llvm.org/D98305
This makes the target triple, graph name, and full graph content available
when making decisions about how to populate the linker pass pipeline.
Also updates the LLJITWithObjectLinkingLayerPlugin example to show more
API use, including use of the API changes in this patch.
During finalization the debug object is registered with the target. Materialization must wait for this process to finish. Otherwise we might start running code before the debugger finished processing the corresponding debug info.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D98417
From what I can tell, the loop inside applyExternalSymbolRelocations()
used to call getSymbolAddress(). After the JITSymbolResolver interface
redesign, the functionality has changed, and the loop should no longer
trigger repopulation of ExternalSymbolRelocations. If that's the case,
there is no need to update the loop iterator manually, and
ExternalSymbolRelocations can be cleared at once. This way, when there
are many external symbols in the program, the function runs much faster.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97531
This patch introduces functionality used by BOLT when
re-linking the final binary. It adds new relocation types that
are currently unsupported by RuntimeDyldELF.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97899
GCC warning:
```
In file included from /usr/include/c++/9/cassert:44,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/ADT/BitVector.h:21,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/Support/Program.h:17,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/Support/Process.h:32,
from /home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp:11:
/home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp: In member function ‘virtual llvm::Expected<std::unique_ptr<llvm::jitlink::JITLinkMemoryManager::Allocation> > llvm::jitlink::InProcessMemoryManager::allocate(const llvm::jitlink::JITLinkDylib*, const SegmentsRequestMap&)’:
/home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp:129:40: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
129 | assert(SlabRemaining.allocatedSize() >= 0 && "Mapping exceeds allocation");
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
```
The return type of `allocatedSize()` is `size_t`, thus the expression
`SlabRemaining.allocatedSize() >= 0` always evaluate to `true`.
lli aims to provide both, RuntimeDyld and JITLink, as the dynamic linkers/loaders for it's JIT implementations. And they both offer debugging via the GDB JIT interface, which builds on the two well-known symbol names `__jit_debug_descriptor` and `__jit_debug_register_code`. As these symbols must be unique accross the linked executable, we can only define them in one of the libraries and make the other depend on it. OrcTargetProcess is a minimal stub for embedding a JIT client in remote executors. For the moment it seems reasonable to have the definition there and let ExecutionEngine depend on it, until we find a better solution.
This is the second commit for the reviewed patch.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97339
Add a new ObjectLinkingLayer plugin `DebugObjectManagerPlugin` and infrastructure to handle creation of `DebugObject`s as well as their registration in OrcTargetProcess. The current implementation only covers ELF on x86-64, but the infrastructure is not limited to that.
The journey starts with a new `LinkGraph` / `JITLinkContext` pair being created for a `MaterializationResponsibility` in ORC's `ObjectLinkingLayer`. It sends a `notifyMaterializing()` notification, which is forwarded to all registered plugins. The `DebugObjectManagerPlugin` aims to create a `DebugObject` form the provided target triple and object buffer. (Future implementations might create `DebugObject`s from a `LinkGraph` in other ways.) On success it will track it as the pending `DebugObject` for the `MaterializationResponsibility`.
This patch only implements the `ELFDebugObject` for `x86-64` targets. It follows the RuntimeDyld approach for debug object setup: it captures a copy of the input object, parses all section headers and prepares to patch their load-address fields with their final addresses in target memory. It instructs the plugin to report the section load-addresses once they are available. The plugin overrides `modifyPassConfig()` and installs a JITLink post-allocation pass to capture them.
Once JITLink emitted the finalized executable, the plugin emits and registers the `DebugObject`. For emission it requests a new `JITLinkMemoryManager::Allocation` with a single read-only segment, copies the object with patched section load-addresses over to working memory and triggers finalization to target memory. For registration, it notifies the `DebugObjectRegistrar` provided in the constructor and stores the previously pending`DebugObject` as registered for the corresponding MaterializationResponsibility.
The `DebugObjectRegistrar` registers the `DebugObject` with the target process. `llvm-jitlink` uses the `TPCDebugObjectRegistrar`, which calls `llvm_orc_registerJITLoaderGDBWrapper()` in the target process via `TargetProcessControl` to emit a `jit_code_entry` compatible with the GDB JIT interface [1]. So far the implementation only supports registration and no removal. It appears to me that it wouldn't raise any new design questions, so I left this as an addition for the near future.
[1] https://sourceware.org/gdb/current/onlinedocs/gdb/JIT-Interface.html
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97335
So far we had no way to distinguish between JITLink and RuntimeDyld in lli. Instead, we used implicit knowledge that RuntimeDyld would be used for linking ELF. In order to get D97337 to work with lli though, we have to move on and allow JITLink for ELF. This patch uses extensible RTTI to allow external clients to add their own layers without touching the LLVM sources.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D97338
This commit fixes how metadata is handled in CloneModule to be sound,
and improves how it's handled in CloneFunctionInto (although the latter
is still awkward when called within a module).
Ruiling Song pointed out in PR48841 that CloneModule was changed to
unsoundly use the RF_ReuseAndMutateDistinctMDs flag (renamed in
fa35c1f80f for clarity). This flag papered
over a crash caused by other various changes made to CloneFunctionInto
over the past few years that made it unsound to use cloning between
different modules.
(This commit partially addresses PR48841, fixing the repro from
preprocessed source but not textual IR. MDNodeMapper::mapDistinctNode
became unsound in df763188c9 and this
commit does not address that regression.)
RF_ReuseAndMutateDistinctMDs is designed for the IRMover to use,
avoiding unnecessary clones of all referenced metadata when linking
between modules (with IRMover, the source module is discarded after
linking). It never makes sense to use when you're not discarding the
source. This commit drops its incorrect use in CloneModule.
Sadly, the right thing to do with metadata when cloning a function is
complicated, and this patch doesn't totally fix it.
The first problem is that there are two different types of referenceable
metadata and it's not obvious what to with one of them when remapping.
- `!0 = !{!1}` is metadata's version of a constant. Programatically it's
called "uniqued" (probably a better term would be "constant") because,
like `ConstantArray`, it's stored in uniquing tables. Once it's
constructed, it's illegal to change its arguments.
- `!0 = distinct !{!1}` is a bit closer to a global variable. It's legal
to change the operands after construction.
What should be done with distinct metadata when cloning functions within
the same module?
- Should new, cloned nodes be created?
- Should all references point to the same, old nodes?
The answer depends on whether that metadata is effectively owned by a
function.
And that's the second problem. Referenceable metadata's ownership model
is not clear or explicit. Technically, it's all stored on an
LLVMContext. However, any metadata that is `distinct`, that transitively
references a `distinct` node, or that transitively references a
GlobalValue is specific to a Module and is effectively owned by it. More
specifically, some metadata is effectively owned by a specific Function
within a module.
Effectively function-local metadata was introduced somewhere around
c10d0e5ccd, which made it illegal for two
functions to share a DISubprogram attachment.
When cloning a function within a module, you need to clone the
function-local debug info and suppress cloning of global debug info (the
status quo suppresses cloning some global debug info but not all). When
cloning a function to a new/different module, you need to clone all of
the debug info.
Here's what I think we should do (eventually? soon? not this patch
though):
- Distinguish explicitly (somehow) between pure constant metadata owned
by the LLVMContext, global metadata owned by the Module, and local
metadata owned by a GlobalValue (such as a function).
- Update CloneFunctionInto to trigger cloning of all "local" metadata
(only), perhaps by adding a bit to RemapFlag. Alternatively, split
out a separate function CloneFunctionMetadataInto to prime the
metadata map that callers are updated to call ahead of time as
appropriate.
Here's the somewhat more isolated fix in this patch:
- Converted the `ModuleLevelChanges` parameter to `CloneFunctionInto` to
an enum called `CloneFunctionChangeType` that is one of
LocalChangesOnly, GlobalChanges, DifferentModule, and ClonedModule.
- The code maintaining the "functions uniquely own subprograms"
invariant is now only active in the first two cases, where a function
is being cloned within a single module. That's necessary because this
code inhibits cloning of (some) "global" metadata that's effectively
owned by the module.
- The code maintaining the "all compile units must be explicitly
referenced by !llvm.dbg.cu" invariant is now only active in the
DifferentModule case, where a function is being cloned into a new
module in isolation.
- CoroSplit.cpp's call to CloneFunctionInto in CoroCloner::create
uses LocalChangeOnly, since fa635d730f
only set `ModuleLevelChanges` to trigger cloning of local metadata.
- CloneModule drops its unsound use of RF_ReuseAndMutateDistinctMDs
and special handling of !llvm.dbg.cu.
- Fixed some outdated header docs and left a couple of FIXMEs.
Differential Revision: https://reviews.llvm.org/D96531
A fix has been implemented in the ittap repo to fix an error about implicit fallthrough in a switch that was occurring during self build.
A new tag has been created for that fix. This is to update the tag.
Reviewed By: bader
Differential Revision: https://reviews.llvm.org/D95462
Patch by Zahira Ammarguellat.
Compilers may insert new definitions during compilation, E.g. EH personality
function pointers, or named constant pool entries. This commit causes
ObjectLinkingLayer to attempt to claim responsibility for all weak definitions
in objects as they're linked. This is always safe (first claimant for each
symbol is granted responsibility, subsequent claims are rejected without error)
and prevents compiler-injected symbols from being dead-stripped (which they
will be if they remain unclaimed by anyone).
This change was motivated by errors seen by an out-of-tree client while testing
eh-frame support in JITLink ELF/x86-64: IR containing exceptions didn't define
DW.ref.__gxx_personality_v0 (since it's added by CodeGen), and this caused
DW.ref.__gxx_personality_v0 to be dead-stripped leading to linker failures.
No test case yet: We won't have a way to test in-tree until we enable JITLink
for lli on Linux.
This is required for ELF where PCRel32 doesn't implicitly subtract 4.
No test case yet: I haven't figured out a good way to test stub
generation -- this may required extensions to jitlink-check.
Adds the EHFrameSplitter and EHFrameEdgeFixer passes to the default JITLink
pass pipeline for ELF/x86-64, and teaches EHFrameEdgeFixer to handle some
new pointer encodings.
Together these changes enable exception handling (at least for the basic
cases that I've tested so far) for ELF/x86-64 objects loaded via JITLink.
Previously FDE field names were used, but the fixup kind used for a field can
vary based on the pointer encoding.
This change will improve readability / maintainability when EH-frame support is
added to JITLink/ELF.
It can be useful for an ObjectLinkingLayerCreator to allow callee errors to get propagated to the builder. Specifically, this is the case when the ObjectLayer uses the EHFrameRegistrationPlugin, because it requires a TPCEHFrameRegistrar and instantiation for it may fail (e.g. if the required registration symbols are missing in the target process).
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D94690
All other layers in LLJIT are stored as unique_ptr's already. At this point, it is not strictly necessary for ObjTransformLayer, but it makes a follow-up change more straightforward.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D94689
Passes in the new PostAllocationPasses list will run immediately after memory
allocation and address assignment for defined symbols, and before
JITLinkContext::notifyResolved is called. These passes can set up state
associated with the addresses of defined symbols before any query for these
addresses completes.
PreFixupPasses better reflects when these passes will run.
A future patch will (re)introduce a PostAllocationPasses list that will run
after allocation, but before JITLinkContext::notifyResolved is called to notify
the rest of the JIT about the resolved symbol addresses.
Add a triple for powerpcle-*-*.
This is a little-endian encoding of the 32-bit PowerPC ABI, useful in certain niche situations:
1) A loader such as the FreeBSD loader which will be loading a little endian kernel. This is required for PowerPC64LE to load properly in pseries VMs.
Such a loader is implemented as a freestanding ELF32 LSB binary.
2) Userspace emulation of a 32-bit LE architecture such as x86 on 64-bit hosts such as PowerPC64LE with tools like box86 requires having a 32-bit LE toolchain and library set, as they operate by translating only the main binary and switching to native code when making library calls.
3) The Void Linux for PowerPC project is experimenting with running an entire powerpcle userland.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93918
Moves all headers from Orc/RPC to Orc/Shared, and from the llvm::orc::rpc
namespace into llvm::orc::shared. Also renames RPCTypeName to
SerializationTypeName and Function to RPCFunction.
In addition to being a more reasonable home for this code, this will make it
easier for the upcoming Orc runtime to re-use the Serialization system for
creating and parsing wrapper-function binary blobs.
Separates link graph creation from linking. This allows raw LinkGraphs to be
created and passed to a link. ObjectLinkingLayer is updated to support emission
of raw LinkGraphs in addition to object buffers.
Raw LinkGraphs can be created by in-memory compilers to bypass object encoding /
decoding (though this prevents caching, as LinkGraphs have do not have an
on-disk representation), and by utility code to add programatically generated
data structures to the JIT target process.
JITLinkDylib represents a target dylib for a JITLink link. By representing this
explicitly we can:
- Enable JITLinkMemoryManagers to manage allocations on a per-dylib basis
(e.g by maintaining a seperate allocation pool for each JITLinkDylib).
- Enable new features and diagnostics that require information about the
target dylib (not implemented in this patch).
To support llorg builds this patch provides the following changes:
1) Added cmake variable ITTAPI_GIT_REPOSITORY to control the location of ITTAPI repository.
Default value of ITTAPI_GIT_REPOSITORY is github location: https://github.com/intel/ittapi.git
Also, the separate cmake variable ITTAPI_GIT_TAG was added for repo tag.
2) Added cmake variable ITTAPI_SOURCE_DIR to control the place where the repo will be cloned.
Default value of ITTAPI_SOURCE_DIR is build area: PROJECT_BINARY_DIR
Reviewed By: etyurin, bader
Patch by ekovanov.
Differential Revision: https://reviews.llvm.org/D91935
The LLVM_ENABLE_MODULES builds currently randomly fail due depending on the
headers generated by the intrinsics_gen target, but the current dependency only model
the non-modules dependencies:
```
While building module 'LLVM_ExecutionEngine' imported from llvm-project/llvm/lib/ExecutionEngine/Orc/Shared/TargetProcessControlTypes.cpp:13:
While building module 'LLVM_intrinsic_gen' imported from llvm-project/llvm/include/llvm/ExecutionEngine/Orc/ThreadSafeModule.h:17:
In file included from <module-includes>:1:
In file included from llvm-project/llvm/include/llvm/IR/Argument.h:18:
llvm/include/llvm/IR/Attributes.h:75:14: fatal error: 'llvm/IR/Attributes.inc' file not found
#include "llvm/IR/Attributes.inc"
^~~~~~~~~~~~~~~~~~~~~~~~
```
Depending on whether intrinsics_gen runs before compiling Orc/Shared files we either fail or include an outdated Attributes.inc
in module builds. The Clang modules require these additional dependencies as including/importing one module requires all
includes headers by that module to be parsable.
Differential Revision: https://reviews.llvm.org/D92873
There is one result per lookup symbol, so we have to advance the result iterator no matter whether it's NULL or not.
MissingSymbols variable is unused.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D91707
Distinguish objects by target properties address size, endian and machine architecture. So far we only
support x86-64 (ELFCLASS64, ELFDATA2LSB, EM_X86_64).
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D90860
LLVMBuild has been removed from the build system. However, three LLVMBuild.txt
files remain in the tree. This patch simply removes them.
llvm/lib/ExecutionEngine/Orc/TargetProcess/LLVMBuild.txt
llvm/tools/llvm-jitlink/llvm-jitlink-executor/LLVMBuild.txt
llvm/tools/llvm-profgen/LLVMBuild.txt
Differential Revision: https://reviews.llvm.org/D92693
This reverts commit c6ef6e1690.
Basically, publicly linked libraries have a different semantic than components,
which link libraries privately.
Differential Revision: https://reviews.llvm.org/D91461
Use LINK_COMPONENTS instead of explicit target_link_libraries for components.
This avoids redundancy and potential inconsistencies.
Differential Revision: https://reviews.llvm.org/D91461
Patch by Elena Kovanova. Thanks Elena!
Problem:
LLVM already has a feature to profile the JIT-compiled code with VTune. This is
done using Intel JIT Profiling API (https://github.com/intel/ittapi). Function
information is captured by VTune as soon as the function is JIT-compiled. We
tried to use the same approach to report the function information generated by
the MCJIT engine – read parsing the debug information for in-memory ELF module
and report it using JIT API. As the results, we figured out that it did not work
properly for the following cases: inline functions, the functions located in
multiple source files, the functions having several bodies (address ranges).
Solution:
To overcome limitations described above, we have introduced new APIs as a part
of Intel ITT APIs to report the entire in-memory ELF module to be further
processed as regular ELF binaries with debug information.
This patch
1. Switches LLVM to open source version of Intel ITT/JIT APIs
(https://github.com/intel/ittapi) to keep it always up to date.
2. Adds support of profiling the code generated by MCJIT engine using Intel
VTune profiler
Another separate patch will get rid of obsolete Intel ITT APIs stuff, having
LLVM already switched to https://github.com/intel/ittapi.
Differential Revision: https://reviews.llvm.org/D86435
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
implementation.
This patch aims to improve support for out-of-process JITing using OrcV2. It
introduces two new class templates, OrcRPCTargetProcessControlBase and
OrcRPCTPCServer, which together implement the TargetProcessControl API by
forwarding operations to an execution process via an Orc-RPC Endpoint. These
utilities are used to implement out-of-process JITing from llvm-jitlink to
a new llvm-jitlink-executor tool.
This patch also breaks the OrcJIT library into three parts:
-- OrcTargetProcess: Contains code needed by the JIT execution process.
-- OrcShared: Contains code needed by the JIT execution and compiler
processes
-- OrcJIT: Everything else.
This break-up allows JIT executor processes to link against OrcTargetProcess
and OrcShared only, without having to link in all of OrcJIT. Clients executing
JIT'd code in-process should start linking against OrcTargetProcess as well as
OrcJIT.
In the near future these changes will enable:
-- Removal of the OrcRemoteTargetClient/OrcRemoteTargetServer class templates
which provided similar functionality in OrcV1.
-- Restoration of Chapter 5 of the Building-A-JIT tutorial series, which will
serve as a simple usage example for these APIs.
-- Implementation of lazy, cross-target compilation in lli's -jit-kind=orc-lazy
mode.
The macro HAVE_EHTABLE_SUPPORT is used by parts of ExecutionEngine to tell __register_frame/__deregister_frame is available to register the
FDE for a generated (JIT) code. It's currently set by a slowly growing set of macro tests in the respective headers, which is updated now and then when it fails to link on some platform or another due to the symbols being missing (see for example https://bugs.llvm.org/show_bug.cgi?id=5715).
This change converts the macro in two HAVE_(DE)REGISTER_FRAME config.h macros (like most of the other HAVE_* macros) and set's them based on whether CMake can actually find a definition for these symbols to link to at configuration time.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D87114
Basic implementation for call and jmp branches with 32 bit offset. Branches to local targets produce
Branch32 edges that are resolved like a regular PCRel32 relocations. Branches to external (undefined)
targets produce Branch32ToStub edges and go through a PLT entry by default. If the target happens to
get resolved within the 32 bit range from the callsite, the edge is relaxed during post-allocation
optimization. There is a test for each of these cases.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D90331
We have been producing R_X86_64_REX_GOTPCRELX (MOV64rm/TEST64rm/...) and
R_X86_64_GOTPCRELX for CALL64m/JMP64m without the REX prefix since 2016 (to be
consistent with GNU as), but not for MOV32rm/TEST32rm/...
Symbols with special section index SHN_COMMON (0xfff2) haven't been handled so far and caused an invalid section error.
This is a more or less straightforward use of the code commented out at the end of the function. I checked with the ELF spec, that the symbol value gives the alignment.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D89795
The difference is that the former are indirect and go to the GOT while the latter go to the target directly. This info can be used to relax indirect ones that don't need the GOT (because the target is in range). We check for this optimization beforehand. For formal correctness and to avoid confusion, we should only change the relocation kind if we actually apply the relaxation.
This re-applies e2fceec2fd with fixes. Apparently we already *do* support
relaxation for ELF, so we need to make sure the test case allocates a slab at
a fixed address, and that the R_X86_64_REX_GOTPCRELX test references an external
that is guaranteed to be out of range.
This patch breaks Orc.h up into Orc.h, LLJIT.h and OrcEE.h.
Orc.h contain core Orc utilities.
LLJIT.h contains LLJIT specific types and functions.
OrcEE.h contains types and functions that depend on ExecutionEngine.
The intent is that these headers should match future library divisions: Clients
who only use Orc.h should only need to link againt the Orc core libraries,
clients using LLJIT.h will also need to link against LLVM core, and clients
using OrcEE.h will also have to link against ExecutionEngine.
In addition to breaking up the Orc.h header this patch introduces functions to:
(1) Set the object linking layer creation function on LLJITBuilder.
(2) Create an RTDyldObjectLinkingLayer instance (particularly for use in (1)).
(3) Register JITEventListeners with an RTDyldObjectLinkingLayer.
Together (1), (2) and (3) can be used to force use of RTDyldObjectLinkingLayer
as the underlying JIT linker for LLJIT, rather than the platform default, and
to register event listeners with the RTDyldObjectLinkingLayer.
C API clients can now define a custom definition generator by providing a
callback function (to implement DefinitionGenerator::tryToGenerate) and context
object. All arguments for the DefinitionGenerator::tryToGenerate method have
been given C API counterparts, and the API allows for optionally asynchronous
generation.
Symbol string pool entries are ref counted, but not automatically cleared.
This can cause the size of the pool to grow without bound if it's not
periodically cleared. These functions allow that to be done via the C API.
This patch moves definition generation out from the session lock, instead
running it under a per-dylib generator lock. It also makes the
DefinitionGenerator::tryToGenerate method optionally asynchronous: Generators
are handed an opaque LookupState object which can be captured to stop/restart
the lookup process.
The new scheme provides the following benefits and guarantees:
(1) Queries that do not need to attempt definition generation (because all
requested symbols matched against existing definitions in the JITDylib)
can proceed without being blocked by any running definition generators.
(2) Definition generators can capture the LookupState to continue their work
asynchronously. This allows generators to run for an arbitrary amount of
time without blocking a thread. Definition generators that do not need to
run asynchronously can return without capturing the LookupState to eliminate
unnecessary recursion and improve lookup performance.
(3) Definition generators still do not need to worry about concurrency or
re-entrance: Since they are still run under a (per-dylib) lock, generators
will never be re-entered concurrently, or given overlapping symbol sets to
generate.
Finally, the new system distinguishes between symbols that are candidates for
generation (generation candidates) and symbols that failed to match for a query
(due to symbol visibility). This fixes a bug where an unresolved symbol could
trigger generation of a duplicate definition for an existing hidden symbol.
MaterializationResponsibility, JITDylib, and ExecutionSession collectively
manage the OrcV2 core JIT state. Responsibility for maintaining and
updating this state has previously been spread among these classes, resulting
in implementations that are each non-trivial, but all tightly coupled. This has
in turn made reading the code and reasoning about state update and locking
rules difficult.
The core state model can be simplified by thinking of
MaterializationResponsibility and JITDylib as facets of ExecutionSession. This
commit is the first in a series intended to refactor Core.cpp to reflect this
model. Operations on MaterializationResponsibility and JITDylib will forward to
implementation methods inside ExecutionSession. Raw state will remain with the
original classes, but in most cases will only be modified by the
ExecutionSession.
This patch updates the Kaleidoscope and BuildingAJIT tutorial series (chapter
1-4) to OrcV2. Chapter 5 of the BuildingAJIT series is removed -- it will be
re-instated once we have in-tree support for out-of-process JITing.
This patch only updates the tutorial code, not the text. Patches welcome for
that, otherwise I will try to update it in a few weeks.
This patch introduces new APIs to support resource tracking and removal in Orc.
It is intended as a thread-safe generalization of the removeModule concept from
OrcV1.
Clients can now create ResourceTracker objects (using
JITDylib::createResourceTracker) to track resources for each MaterializationUnit
(code, data, aliases, absolute symbols, etc.) added to the JIT. Every
MaterializationUnit will be associated with a ResourceTracker, and
ResourceTrackers can be re-used for multiple MaterializationUnits. Each JITDylib
has a default ResourceTracker that will be used for MaterializationUnits added
to that JITDylib if no ResourceTracker is explicitly specified.
Two operations can be performed on ResourceTrackers: transferTo and remove. The
transferTo operation transfers tracking of the resources to a different
ResourceTracker object, allowing ResourceTrackers to be merged to reduce
administrative overhead (the source tracker is invalidated in the process). The
remove operation removes all resources associated with a ResourceTracker,
including any symbols defined by MaterializationUnits associated with the
tracker, and also invalidates the tracker. These operations are thread safe, and
should work regardless of the the state of the MaterializationUnits. In the case
of resource transfer any existing resources associated with the source tracker
will be transferred to the destination tracker, and all future resources for
those units will be automatically associated with the destination tracker. In
the case of resource removal all already-allocated resources will be
deallocated, any if any program representations associated with the tracker have
not been compiled yet they will be destroyed. If any program representations are
currently being compiled then they will be prevented from completing: their
MaterializationResponsibility will return errors on any attempt to update the
JIT state.
Clients (usually Layer writers) wishing to track resources can implement the
ResourceManager API to receive notifications when ResourceTrackers are
transferred or removed. The MaterializationResponsibility::withResourceKeyDo
method can be used to create associations between the key for a ResourceTracker
and an allocated resource in a thread-safe way.
RTDyldObjectLinkingLayer and ObjectLinkingLayer are updated to use the
ResourceManager API to enable tracking and removal of memory allocated by the
JIT linker.
The new JITDylib::clear method can be used to trigger removal of every
ResourceTracker associated with the JITDylib (note that this will only
remove resources for the JITDylib, it does not run static destructors).
This patch includes unit tests showing basic usage. A follow-up patch will
update the Kaleidoscope and BuildingAJIT tutorial series to OrcV2 and will
use this API to release code associated with anonymous expressions.
This removes all legacy layers, legacy utilities, the old Orc C bindings,
OrcMCJITReplacement, and OrcMCJITReplacement regression tests.
ExecutionEngine and MCJIT are not affected by this change.
Report a fatal error if an IMAGE_REL_AMD64_ADDR32NB cannot be applied due to an
out-of-range target. Previously we emitted a diagnostic to llvm::errs and
continued.
Patch by Dale Martin. Thanks Dale!
This patch enables basic BSS section handling, and improves a couple of error
messages in the ELF section parsing code.
Patch by Christian Schafmeister. Thanks Christian!
Differential Revision: https://reviews.llvm.org/D88867
`ELFFile<ELFT>` has many methods that take pointers,
though they assume that arguments are never null and
hence could take references instead.
This patch performs such clean-up.
Differential revision: https://reviews.llvm.org/D87385
Making MaterializationResponsibility instances immovable allows their
associated VModuleKeys to be updated by the ExecutionSession while the
responsibility is still in-flight. This will be used in the upcoming
removable code feature to enable safe merging of resource keys even if
there are active compiles using the keys being merged.
TPCDynamicLibrarySearchGenerator was generating errors on missing
symbols, but that doesn't fit the DefinitionGenerator contract: A symbol
that isn't generated by a particular generator should not cause an
error.
This commit fixes the error by using SymbolLookupFlags::WeaklyReferencedSymbol
for all elements of the lookup, and switches llvm-jitlink to use
TPCDynamicLibrarySearchGenerator.
If there's no initializer symbol in the current MaterializationResponsibility
then bail out without installing JITLink passes: they're going to be no-ops
anyway.
A think-o in the existing code meant that dependencies were never registered.
This failure could lead to crashes rather than orderly error propagation if
initialization dependencies failed to materialize.
No test case: The bug was discovered in an out-of-tree code and requires
pathalogically misconfigured JIT to generate the original error that lead to
the crash.
DFS and Reverse-DFS linkage orders are used to order execution of
deinitializers and initializers respectively.
This patch replaces uses of special purpose DFS order functions in
MachOPlatform and LLJIT with uses of the new methods.
The functions `__register_frame`/`__deregister_frame` are not
available on z/OS, so add a guard to not use them.
Reviewed By: lhames, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D84787
MachOLinkGraphBuilder has been treating these as hidden, but they should be
treated as local.
Symbols with N_PEXT set and N_EXT unset are produced when hidden symbols are
run through 'ld -r' without passing -keep_private_externs. They will show up
under 'nm -m' as "was private extern", hence the name of the test cases.
Testcase commited as relocatable object to ensure that the test suite doesn't
depend on having 'ld -r' available.
This loop caused me a little headache once, because I didn't see the assigned variable is a member. The refactored version appears more readable to me.
Differential Revision: https://reviews.llvm.org/D85922
Correctly sign extend the addend, and fix implicit shift operand decoding
(it incorrectly returned 0 for some cases), and check that the initial
encoded immediate is 0.
The code in SectionMemoryManager.cpp unnecessarily maps
read-only data sections with the READ+EXECUTE flags. This is
undesirable from a security stand-point.
Moreover, on the Fuchsia platform, which is now very strict
about mapping pages with the EXECUTE permission, this simply
fails, because the section's pages were initially allocated
with only the READ+WRITE flags.
A more detailed description of the issue can be found in this
public SwiftShader bug:
https://issuetracker.google.com/issues/154586551
This patch just restrict the mapping to the READ flag for ROData
sections. Code sections are still mapped with READ+EXECUTE as
expected.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D78574
Archives can now be specified as input files the same way that object
files are. Archives will always be linked after all objects (regardless
of the relative order of the inputs) but before any dynamic libraries or
process symbols.
This patch also relaxes matching for slice triples in
StaticLibraryDefinitionGenerator in order to support this feature:
Vendors need not match if the source vendor is unknown.
The -phony-externals option adds a generator which explicitly defines any
otherwise unresolved externals as null. This transforms link-time
unresolved-symbol errors into potential runtime null pointer accesses
(if an unresolved external is actually accessed during execution).
This option can be useful in -harness mode to avoid having to mock a
large number of symbols that are not reachable at runtime (e.g. unused
methods referenced by a class vtable).
This fixes the ExecutionEngine/MCJIT/stubs-sm-pic.ll test in no-asserts
builds which is set to XFAIL on some platforms like 32-bit x86. More
importantly, we probably don't want to silently error in these cases.
Differential revision: https://reviews.llvm.org/D84390
The -harness option enables new testing use-cases for llvm-jitlink. It takes a
list of objects to treat as a test harness for any regular objects passed to
llvm-jitlink.
If any files are passed using the -harness option then the following
transformations are applied to all other files:
(1) Symbols definitions that are referenced by the harness files are promoted
to default scope. (This enables access to statics from test harness).
(2) Symbols definitions that clash with definitions in the harness files are
deleted. (This enables interposition by test harness).
(3) All other definitions in regular files are demoted to local scope.
(This causes untested code to be dead stripped, reducing memory cost and
eliminating spurious unresolved symbol errors from untested code).
These transformations allow the harness files to reference and interpose
symbols in the regular object files, which can be used to support execution
tests (including fuzz tests) of functions in relocatable objects produced by a
build.
This allows clients to detect invalid transformations applied by JITLink passes
(e.g. inserting or removing symbols in unexpected ways) and terminate linking
with an error.
This change is used to simplify the error propagation logic in
ObjectLinkingLayer.
Subclasses will commonly gather that information from a remote during
construction, in which case they won't have meaningful values to pass to
TargetProcessControl's constructor.
This patch makes ownership of the JITLinkMemoryManager by ObjectLinkingLayer
optional: the layer can still own the memory manager but no longer has to.
Evevntually we want to move to a state where ObjectLinkingLayer never owns its
memory manager. For now allowing optional ownership makes it easier to develop
classes that can dynamically use either RTDyldObjectLinkingLayer, which owns
its memory managers, or ObjectLinkingLayer (e.g. LLJIT).
TPCDynamicLibrarySearchGenerator uses a TargetProcessControl instance to
load libraries and search for symbol addresses in a target process. It
can be used in place of a DynamicLibrarySearchGenerator to enable
target-process agnostic lookup.
When allocating a new memory block in SectionMemoryManager, initialize
the Near hint for the other memory groups if they have not been set
already.
Patch by Dana Koch. Thanks Dana!
Identify relocations by (section name, offset) pairs, rather than plain
vmaddrs. This makes it easier to cross-reference debugging output for
relocations with output from standard object inspection tools (otool,
readelf, objdump, etc.).
When processing a MachO SUBTRACTOR/UNSIGNED pair, if the UNSIGNED target
is non-extern then check the r_symbolnum field of the relocation to find
the targeted section and use the section's address to find 'ToSymbol'.
Previously 'ToSymbol' was found by loading the initial value stored at
the fixup location and treating this as an address to search for. This
is incorrect, however: the initial value includes the addend and will
point to the wrong block if the addend is less than zero or greater than
the block size.
rdar://65756694
TargetProcessControl is a new API for communicating with JIT target processes.
It supports memory allocation and access, and inspection of some process
properties, e.g. the target proces triple and page size.
Centralizing these APIs allows utilities written against TargetProcessControl
to remain independent of the communication procotol with the target process
(which may be direct memory access/allocation for in-process JITing, or may
involve some form of IPC or RPC).
An initial set of TargetProcessControl-based utilities for lazy compilation is
provided by the TPCIndirectionUtils class.
An initial implementation of TargetProcessControl for in-process JITing
is provided by the SelfTargetProcessControl class.
An example program showing how the APIs can be used is provided in
llvm/examples/OrcV2Examples/LLJITWithTargetProcessControl.
Summary: This adds the basic support for GOT in elf x86.
Was able to just get away using the macho code by generalising the edges.
There will be a follow up patch to turn that into a generic utility for both of the x86 and Mach-O code.
This patch also lands support for relocations relative to symbol.
Reviewers: lhames
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83748
Summary: This PR contains a build failure fix that occurs on both AIX and z/OS as a result of this commit https://reviews.llvm.org/rG670915094462d831e3733e5b01a76471b8cf6dd8.
Reviewers: uweigand, Kai, hubert.reinterpretcast, daltenty, lhames
Reviewed By: Kai, hubert.reinterpretcast, daltenty
Subscribers: SeanP, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83889
destructor via a pointer of the wrong static type.
This caused crashes during deallocation in C++14 builds when using a
deallocator whose sized delete requires the size argument to be correct.
Also make the LazyCallThroughManager destructor protected to catch this
sort of bug in the future.
LazyReexportsManager instances use the trampoline pool, but they don't need to
own it. Keeping TrampolinePool ownership separate allows re-use of the
trampoline pool by other clients.
This patch generalizes the APIs for writing re-entry blocks, trampolines and
stubs to allow their final linked address to differ from the address of
their initial working memory. This will allow these routines to be used with
JITLinkMemoryManagers, which will in turn allow for unification of code paths
for in-process and cross-process lazy JITing.
If a symbol name begins with the linker private global prefix (as
described by the DataLayout) then it should be treated as non-exported,
regardless of its LLVM IR visibility value.
This patch allows for usage of the @PLT modifier in AArch64 assembly which
lowers to an R_AARCH64_PLT32 relocation. See D81184 for handling this
relocation in lld.
Differential Revision: https://reviews.llvm.org/D81446
JITLink supports all code and relocation models, so there's no reason to
conditionalize using JITLink on the code or relocation model settings.
Clients wanting to use RTDyldObjectLinkingLayer/RuntimeDyld will now
need to use a custom object linking layer creator.
Debug sections will not be linked into the final executable and may contain
ambiguous relocations*. Skipping them avoids both some unnecessary processing
cost and the hassle of dealing with the problematic relocations.
* E.g. __debug_ranges contains non-extern relocations to the end of functions
hat begin with named symbols. Under the usual rules for interpreting non-extern
relocations these will be incorrectly associated with the following block, or
no block at all (if there is a gap between one block and the next).
Summary:
Adding in our first relocation type, and all the required plumbing to support the rest in following patches
Differential Revision: https://reviews.llvm.org/D80613
Reviewer: lhames
This patch adds a jitlink pass, 'registerELFGraphInfo', that records section
and symbol information about each LinkGraph in the llvm-jitlink session object.
This allows symbols and sections to be referred to by name in llvm-jitlink
regression tests. This will enable a testcase to be written for
https://reviews.llvm.org/D80613.
This initial implementation supports section and symbol parsing, but no
relocation support. It enables JITLink to link and execute ELF relocatable
objects that do not require relocations.
Patch by Jared Wyles. Thanks Jared!
Differential Revision: https://reviews.llvm.org/D79832
MaterializationResponsibility.
MaterializationResponsibility objects provide a connection between a
materialization process (compiler, jit linker, etc.) and the JIT state held in
the ExecutionSession and JITDylib objects. Switching to shared ownership
extends the lifetime of JITDylibs to ensure they remain accessible until all
materializers targeting them have completed. This will allow (in a follow-up
patch) JITDylibs to be removed from the ExecutionSession and placed in a
pending-destruction state while they are kept alive to communicate errors
to/from any still-runnning materialization processes. The intent is to enable
JITDylibs to be safely removed even if they have running compiles targeting
them.
Refering to the link order of a dylib better matches the terminology used in
static compilation. As upcoming patches will increase the number of places where
link order matters (for example when closing JITDylibs) it's better to get this
name change out of the way early.
Summary:
In D77860, we have changed `getSymbolFlags()` return type to `Expected<uint32_t>`.
This change helps bubble the error further up the stack.
Reviewers: jhenderson, grimar, JDevlieghere, MaskRay
Reviewed By: jhenderson
Subscribers: hiraditya, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79075
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().
I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.
Differential Revision: https://reviews.llvm.org/D78882
This patch changes Block::removeEdge to return a valid iterator to the new next
element, and uses this to update the edge removal algorithm in
LinkGraph::splitBlock.
LLJIT::defineAbsolute did not mangle its Name argument, which is inconsistent
with the behavior of other LLJIT methods (e.g. lookup). Since it is currently
unused anyway, this commit replaces it with a generic 'define' convenience
method for adding MaterializationUnits to the main JITDylib. This simplifies
use of the generic absoluteSymbols function (as well as the symbolAlias,
reexports and other functions that generate MaterializationUnits) with LLJIT.
This removes the conditional layout of relocation_info bitfields that was
introduced in 3ccd677bf (svn r358839). The platform relocation_info
struct (defined in usr/include/mach-o/reloc.h) does not define the layout of
this struct differently on big-endian platforms and we want to keep the LLVM
and platform definitions in sync.
To fix the bug that 3ccd677bf addressed this patch modifies JITLink to construct
its relocation_info structs from the raw relocation words using shift and mask
operations.
Adds basic support for LLJITBuilder and DynamicLibrarySearchGenerator. This
allows C API clients to configure LLJIT to expose process symbols to JIT'd
code. An example of this is added in
llvm/examples/OrcV2CBindingsReflectProcessSymbols.
Add a new overload of StaticLibraryDefinitionGenerator::Load that takes a triple
argument and supports loading archives from MachO universal binaries in addition
to regular archives.
The LLI tool is updated to use this overload.
Failure to export __cxa_atexit can lead to an attempt to import a definition
from the process itself (if __cxa_atexit is referenced from another JITDylib),
but the process definition will clash with the existing non-exported definition
to produce an unexpected DuplicateDefinitionError.
This patch fixes the immediate issue by exporting __cxa_atexit. It also fixes a
bug where atexit functions in other JITDylibs were not being run by adding a
copy of run_atexits_helper to every JITDylib.
A follow up patch will deal with the bug where definition generators are called
despite a non-exported definition being present.
The MemoryBuffer::getMemBuffer method's RequiresNullTerminator parameter
defaults to true, but object files are not null terminated so we need to
explicitly pass false here.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.
This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.
I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors. Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it. Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.
Differential Revision: https://reviews.llvm.org/D72467
This flag can be used to mark a symbol as existing only for the purpose of
enabling materialization. Such a symbol can be looked up to trigger
materialization with the lookup returning only once materialization is
complete. Symbols with this flag will never resolve however (to avoid
permanently polluting the symbol table), and should only be looked up using
the SymbolLookupFlags::WeaklyReferencedSymbol flag. The primary use case for
this flag is initialization symbols.
Summary:
Rename `succ_const_iterator` to `const_succ_iterator` and
`succ_const_range` to `const_succ_range` for consistency with the
predecessor iterators, and the corresponding iterators in
MachineBasicBlock.
Reviewers: nicholas, dblaikie, nlewycky
Subscribers: hiraditya, bmahjour, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75952
Updates the object buffer ownership scheme in jitLinkForOrc and related
functions: Ownership of both the object::ObjectFile and underlying
MemoryBuffer is passed into jitLinkForOrc and passed back to the onEmit
callback once linking is complete. This avoids the use-after-free errors
that were seen in 98f2bb4461.
Along the same lines as eb918d8daf1: This code also had to acquire the session
mutex, and this could cause a deadlock under the wrong circumstances. This
patch updates GenericLLVMIRPlatformSupport to just use the session lock for
everything.
In MachOPlatform, obtaining the link-order for a JITDylib requires locking the
session, but also needs to be part of a larger atomic operation that collates
initializer symbols tracked by the platform. Trying to do this under a separate
platform mutex leads to potential locking order issues, e.g.
T1 locks session then tries to lock platform to register a new init symbol
meanwhile
T2 locks platform then tries to lock session to obtain link order.
Removing the platform lock and performing all these operations under the session
lock eliminates this possibility.
At the same time we also need to collate init pointers from the
MachOPlatform::InitScraperPlugin, and we don't need or want to lock the session
for that. The new InitSeqMutex has been added to guard these init pointers, and
the session mutex is never obtained while the InitSeqMutex is held.
The MU may define no symbols, but still contain a non-trivial destructor (e.g.
an LLVM IR module that has been stripped of all externally visible
definitions, but which still needs to lock its context to be destroyed).
Bailing out early ensures that we destroy the unit outside the session lock,
rather than under it which may cause deadlocks.
Also adds some extra sanity-checking assertions.
Follow-up for D74433
What the function returns are almost standard BFD names, except that "ELF" is
in uppercase instead of lowercase.
This patch changes "ELF" to "elf" and changes ARM/AArch64 to use their BFD names.
MIPS and PPC64 have endianness differences as well, but this patch does not intend to address them.
Advantages:
* llvm-objdump: the "file format " line matches GNU objdump on ARM/AArch64 objects
* "file format " line can be extracted and fed into llvm-objcopy -O literally.
(https://github.com/ClangBuiltLinux/linux/issues/779 has such a use case)
Affected tools: llvm-readobj, llvm-objdump, llvm-dwarfdump, MCJIT (internal implementation detail, not exposed)
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D76046
Enable use of ExecutionEngine JITEventListeners in RTDyldObjectLinkingLayer.
This allows existing MCJIT clients to more easily migrate to LLJIT / ORCv2.
Example usage in llvm/examples/OrcV2Examples/LLJITWithGDBRegistrationListener.
Differential Revision: https://reviews.llvm.org/D75838
Global symbols with linker-private prefixes should be resolvable across object
boundaries, but internal symbols with linker-private prefixes should not.
Renames the llvm/examples/LLJITExamples directory to llvm/examples/OrcV2Examples
since it is becoming a home for all OrcV2 examples, not just LLJIT.
See http://llvm.org/PR31103.
Patch based on https://reviews.llvm.org/D75912 by Alexander Shishkin. Thanks
Alexander!
To minimize disruption to existing clients, who may be relying on the fact that
unused references to unresolved symbols do not generate an error, this patch
makes error checking opt-in: Clients can call ExecutionEngine::hasError or
LLVMExecutionEngineGetError to check whether and error has occurred.
Differential revision: https://reviews.llvm.org/D75912
This patch enables exception handling in code added to LLJIT on Darwin by
adding an orc::EHFrameRegistrationPlugin instance to the ObjectLinkingLayer
(which is currently used on Darwin only).
These may be accessed from multiple threads if concurrent materialization is
enabled in ORC.
Testcase coming in a follow-up patch that enables eh-frame registration for
LLJIT.
Summary:
Enables JIT-linking by RuntimeDyld of COFF objects that contain references to
dllimport symbols. This is done by recognizing symbols that start with the
reserved "__imp_" prefix and building a pointer entry to the target symbol in
the stubs area of the section. References to the "__imp_" symbol are updated to
point to this pointer.
Work in progress: The generic code is in place, but only RuntimeDyldCOFFX86_64
and RuntimeDyldCOFFI386 have been updated to look for and update references to
dllimport symbols.
Reviewers: compnerd
Subscribers: hiraditya, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75884
This patch allows rtdyld-check / jitlink-check expressions to be extended over
multiple lines by terminating each line with a '\'. E.g.
# llvm-rtdyld: *{8}X = \
# llvm-rtdyld: Y
X:
.quad Y
This will be used to break up some long lines in upcoming test cases.
The LLJIT::MachOPlatformSupport class used to unconditionally attempt to
register __objc_selrefs and __objc_classlist sections. If libobjc had not
been loaded this resulted in an assertion, even if no objc sections were
actually present. This patch replaces this unconditional registration with
a check that no objce sections are present if libobjc has not been loaded.
This will allow clients to use MachOPlatform with LLJIT without requiring
libobjc for non-objc code.
Summary: Decompose callThroughToSymbol() into findReexport(), resolveSymbol(), notifyResolved() and reportCallThroughError(). This allows derived classes to reuse the functionality while adding their own code in between.
Reviewers: lhames
Reviewed By: lhames
Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75084
ST_File symbols aren't relevant for linking purposes, but can end up shadowing
real symbols if they're not filtered.
No test case yet: The ideal testcase for this would be an ELF llvm-jitlink test,
but llvm-jitlink support for ELF is still under development. We should add a
testcase for this once support lands in tree.
reinterpret_cast'ing the block base address directly to a uint64_t leaves the
high bits in an implementation-defined state, but JITLink expects them to be
zero. Switching to pointerToJITTargetAddress for the cast should fix this.
This should fix the jitlink test failures that we have seen on some of the
32-bit testers.
Lots of headers pass around MemoryBuffer objects, but very few open
them. Let those that do include FileSystem.h.
Saves ~250 includes of Chrono.h & FileSystem.h:
$ diff -u thedeps-before.txt thedeps-after.txt | grep '^[-+] ' | sort | uniq -c | sort -nr
254 - ../llvm/include/llvm/Support/FileSystem.h
253 - ../llvm/include/llvm/Support/Chrono.h
237 - ../llvm/include/llvm/Support/NativeFormatting.h
237 - ../llvm/include/llvm/Support/FormatProviders.h
192 - ../llvm/include/llvm/ADT/StringSwitch.h
190 - ../llvm/include/llvm/Support/FormatVariadicDetails.h
...
This requires duplicating the file_t typedef, which is unfortunate. I
sunk the choice of mapping mode down into the cpp file using variable
template specializations instead of class members in headers.
Summary: A function that creates JITSymbolFlags from a GlobalValueSummary. Similar functions exist: fromGlobalValue(), fromObjectSymbol()
Reviewers: lhames
Reviewed By: lhames
Subscribers: hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75082
This optimization bypasses GOT loads and calls/branches through stubs when the
ultimate target of the access/branch is found to be within range of the
reference.
Extra debugging output is also added to the generic JITLink algorithm and
basic GOT and Stubs builder utility to aid debugging.
The GenericLLVMIRPlatformSupport class runs a transform on all LLVM IR added to
the LLJIT instance to replace instances of llvm.global_ctors with a specially
named function that runs the corresponing static initializers (See
(GlobalCtorDtorScraper from lib/ExecutionEngine/Orc/LLJIT.cpp). This patch
updates the GenericIRPlatform class to check for this specially named function
in other materialization units that are added to the JIT and, if found, add
the function to the initializer work queue. Doing this allows object files
that were compiled from IR and cached to be reloaded in subsequent JIT sessions
without their initializers being skipped.
To enable testing this patch also updates the lli tool's -jit-kind=orc-lazy mode
to respect the -enable-cache-manager and -object-cache-dir options, and modifies
the CompileOnDemandLayer to rename extracted submodules to include a hash of the
names of their symbol definitions. This allows a simple object caching scheme
based on module names (which was already implemented in lli) to work with the
lazy JIT.
This patch adds new errors and error checking to the ObjectLinkingLayer to
catch cases where a compiled or loaded object either:
(1) Contains definitions not covered by its responsibility set, or
(2) Is missing definitions that are covered by its responsibility set.
Proir to this patch providing the correct set of definitions was treated as
an API contract requirement, however this requires that the client be confident
in the correctness of the whole compiler / object-cache pipeline and results
in difficult-to-debug assertions upon failure. Treating this as a recoverable
error results in clearer diagnostics.
The performance overhead of this check is one comparison of densemap keys
(symbol string pointers) per linking object, which is minimal. If this overhead
ever becomes a problem we can add the check under a flag that can be turned off
if the client fully trusts the rest of the pipeline.
Initializers and deinitializers are used to implement C++ static constructors
and destructors, runtime registration for some languages (e.g. with the
Objective-C runtime for Objective-C/C++ code) and other tasks that would
typically be performed when a shared-object/dylib is loaded or unloaded by a
statically compiled program.
MCJIT and ORC have historically provided limited support for discovering and
running initializers/deinitializers by scanning the llvm.global_ctors and
llvm.global_dtors variables and recording the functions to be run. This approach
suffers from several drawbacks: (1) It only works for IR inputs, not for object
files (including cached JIT'd objects). (2) It only works for initializers
described by llvm.global_ctors and llvm.global_dtors, however not all
initializers are described in this way (Objective-C, for example, describes
initializers via specially named metadata sections). (3) To make the
initializer/deinitializer functions described by llvm.global_ctors and
llvm.global_dtors searchable they must be promoted to extern linkage, polluting
the JIT symbol table (extra care must be taken to ensure this promotion does
not result in symbol name clashes).
This patch introduces several interdependent changes to ORCv2 to support the
construction of new initialization schemes, and includes an implementation of a
backwards-compatible llvm.global_ctor/llvm.global_dtor scanning scheme, and a
MachO specific scheme that handles Objective-C runtime registration (if the
Objective-C runtime is available) enabling execution of LLVM IR compiled from
Objective-C and Swift.
The major changes included in this patch are:
(1) The MaterializationUnit and MaterializationResponsibility classes are
extended to describe an optional "initializer" symbol for the module (see the
getInitializerSymbol method on each class). The presence or absence of this
symbol indicates whether the module contains any initializers or
deinitializers. The initializer symbol otherwise behaves like any other:
searching for it triggers materialization.
(2) A new Platform interface is introduced in llvm/ExecutionEngine/Orc/Core.h
which provides the following callback interface:
- Error setupJITDylib(JITDylib &JD): Can be used to install standard symbols
in JITDylibs upon creation. E.g. __dso_handle.
- Error notifyAdding(JITDylib &JD, const MaterializationUnit &MU): Generally
used to record initializer symbols.
- Error notifyRemoving(JITDylib &JD, VModuleKey K): Used to notify a platform
that a module is being removed.
Platform implementations can use these callbacks to track outstanding
initializers and implement a platform-specific approach for executing them. For
example, the MachOPlatform installs a plugin in the JIT linker to scan for both
__mod_inits sections (for C++ static constructors) and ObjC metadata sections.
If discovered, these are processed in the usual platform order: Objective-C
registration is carried out first, then static initializers are executed,
ensuring that calls to Objective-C from static initializers will be safe.
This patch updates LLJIT to use the new scheme for initialization. Two
LLJIT::PlatformSupport classes are implemented: A GenericIR platform and a MachO
platform. The GenericIR platform implements a modified version of the previous
llvm.global-ctor scraping scheme to provide support for Windows and
Linux. LLJIT's MachO platform uses the MachOPlatform class to provide MachO
specific initialization as described above.
Reviewers: sgraenitz, dblaikie
Subscribers: mgorny, hiraditya, mgrang, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74300
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
ObjectLinkingLayer was not correctly propagating dependencies through local
symbols within an object. This could cause symbol lookup to return before a
searched-for symbol is ready if the following conditions are met:
(1) The definition of the symbol being searched for transitively depends on a
local symbol within the same object, and that local symbol in turn
transitively depends on an external symbol provided by some other module
in the JIT.
(2) Concurrent compilation is enabled.
(3) Thread scheduling causes the lookup of the searched-for symbol to return
before all transitive dependencies of the looked-up symbol are emitted.
This bug was found by inspection and has not been observed in practice.
A jitlink test case has been added to verify that symbol dependencies are
correctly propagated through local symbol definitions.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.
Reviewers: xbolva00, courbet, bollu
Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73099
This commit adds a ManglingOptions struct to IRMaterializationUnit, and replaces
IRCompileLayer::CompileFunction with a new IRCompileLayer::IRCompiler class. The
ManglingOptions struct defines the emulated-TLS state (via a bool member,
EmulatedTLS, which is true if emulated-TLS is enabled and false otherwise). The
IRCompileLayer::IRCompiler class wraps an IRCompiler (the same way that the
CompileFunction typedef used to), but adds a method to return the
IRCompileLayer::ManglingOptions that the compiler will use.
These changes allow us to correctly determine the symbols that will be produced
when a thread local global variable defined at the IR level is compiled with or
without emulated TLS. This is required for ORCv2, where MaterializationUnits
must declare their interface up-front.
Most ORCv2 clients should not require any changes. Clients writing custom IR
compilers will need to wrap their compiler in an IRCompileLayer::IRCompiler,
rather than an IRCompileLayer::CompileFunction, however this should be a
straightforward change (see modifications to CompileUtils.* in this patch for an
example).
The MaterializationResponsibility::defineMaterializing method allows clients to
add new definitions that are in the process of being materialized to the JIT.
This patch adds support to defineMaterializing for symbols with weak linkage
where the new definitions may be rejected if another materializer concurrently
defines the same symbol. If a weak symbol is rejected it will not be added to
the MaterializationResponsibility's responsibility set. Clients can check for
membership in the responsibility set via the
MaterializationResponsibility::getSymbols() method before resolving any
such weak symbols.
This patch also adds code to RTDyldObjectLinkingLayer to tag COFF comdat symbols
introduced during codegen as weak, on the assumption that these are COFF comdat
constants. This fixes http://llvm.org/PR40074.
Based on Don Hinton's patch in https://reviews.llvm.org/D72406. This feature
was accidentally left out of e9e26c01cd, and
would have pessimized concurrent compilation in the default case.
Thanks for spotting this Don!
This patch makes the target triple available via the LLJIT interface, and moves
the IRTransformLayer from LLLazyJIT down into LLJIT. Together these changes make
it easier to use the lazyReexports utility with LLJIT, and to apply IR
transforms to code as it is compiled in LLJIT (rather than requiring transforms
to be applied manually before code is added). An code example is added in
llvm/examples/LLJITExamples/LLJITWithLazyReexports
A bug in the existing implementation meant that lazyReexports would not work if
the aliased name differed from the alias's name, i.e. all lazy reexports had to
be of the form (lib1, name) -> (lib2, name). This patch fixes the issue by
capturing the alias's name in the NotifyResolved callback. To simplify this
capture, and the LazyCallThroughManager code in general, the NotifyResolved
callback is updated to use llvm::unique_function rather than a custom class.
No test case yet: This can only be tested at runtime, and the only in-tree
client (lli) always uses aliases with matching names. I will add a new LLJIT
example shortly that will directly test the lazyReexports API and the
non-trivial alias use case.
The argument is llvm::null() everywhere except llvm::errs() in
llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no
target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds.
If we ever have the needs to add verbose log to disassemblers, we can
record log with a member function, instead of passing it around as an
argument.
This fixes an off-by-one error in the argc value computed by runAsMain, and
switches lli back to using the input bitcode (rather than the string "lli") as
the effective program name.
Thanks to Stefan Graenitz for spotting the bug.
LLJITBuilder will now use JITLink on supported platforms even if a custom
JITTargetMachineBuilder is supplied, provided that neither the code model,
nor the relocation model, nor the ObjectLinkingLayerCreator is set.
JITLink (which underlies ObjectLinkingLayer) is a replacement for RuntimeDyld.
It supports the native code model, and linker plugins that enable a wider range
of features than RuntimeDyld.
Currently only enabled for MachO/x86-64 and MachO/arm64. New architectures will
be added as JITLink support for them is developed.
This relieves ObjectLinkingLayer clients of the responsibility of holding the
memory manager. This makes it easier to select between RTDyldObjectLinkingLayer
(which already owned its memory manager factory) and ObjectLinkingLayer at
runtime as clients aren't required to hold a jitlink::MemoryManager field just
in case ObjectLinkingLayer is selected.
This patch removes the magic "main" JITDylib from ExecutionEngine. The main
JITDylib was created automatically at ExecutionSession construction time, and
all subsequently created JITDylibs were added to the main JITDylib's
links-against list by default. This saves a couple of lines of boilerplate for
simple JIT setups, but this isn't worth introducing magical behavior for.
ORCv2 clients should now construct their own main JITDylib using
ExecutionSession::createJITDylib and set up its linkages manually using
JITDylib::setSearchOrder (or related methods in JITDylib).
The runAsMain function takes a pointer to a function with a standard C main
signature, int(*)(int, char*[]), and invokes it using the given arguments and
program name. The arguments are copied into writable temporary storage as
required by the C and C++ specifications, so runAsMain safe to use when calling
main functions that modify their arguments in-place.
This patch also uses the new runAsMain function to replace hand-rolled versions
in lli, llvm-jitlink, and the SpeculativeJIT example.
libraries.
This patch substantially updates ORCv2's lookup API in order to support weak
references, and to better support static archives. Key changes:
-- Each symbol being looked for is now associated with a SymbolLookupFlags
value. If the associated value is SymbolLookupFlags::RequiredSymbol then
the symbol must be defined in one of the JITDylibs being searched (or be
able to be generated in one of these JITDylibs via an attached definition
generator) or the lookup will fail with an error. If the associated value is
SymbolLookupFlags::WeaklyReferencedSymbol then the symbol is permitted to be
undefined, in which case it will simply not appear in the resulting
SymbolMap if the rest of the lookup succeeds.
Since lookup now requires these flags for each symbol, the lookup method now
takes an instance of a new SymbolLookupSet type rather than a SymbolNameSet.
SymbolLookupSet is a vector-backed set of (name, flags) pairs. Clients are
responsible for ensuring that the set property (i.e. unique elements) holds,
though this is usually simple and SymbolLookupSet provides convenience
methods to support this.
-- Lookups now have an associated LookupKind value, which is either
LookupKind::Static or LookupKind::DLSym. Definition generators can inspect
the lookup kind when determining whether or not to generate new definitions.
The StaticLibraryDefinitionGenerator is updated to only pull in new objects
from the archive if the lookup kind is Static. This allows lookup to be
re-used to emulate dlsym for JIT'd symbols without pulling in new objects
from archives (which would not happen in a normal dlsym call).
-- JITLink is updated to allow externals to be assigned weak linkage, and
weak externals now use the SymbolLookupFlags::WeaklyReferencedSymbol value
for lookups. Unresolved weak references will be assigned the default value of
zero.
Since this patch was modifying the lookup API anyway, it alo replaces all of the
"MatchNonExported" boolean arguments with a "JITDylibLookupFlags" enum for
readability. If a JITDylib's associated value is
JITDylibLookupFlags::MatchExportedSymbolsOnly then the lookup will only
match against exported (non-hidden) symbols in that JITDylib. If a JITDylib's
associated value is JITDylibLookupFlags::MatchAllSymbols then the lookup will
match against any symbol defined in the JITDylib.
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so. Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:
1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so. This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.
With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.
2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set. This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.
I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:
- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON
Reviewers: beanz, smeenai, compnerd, phosek
Reviewed By: beanz
Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70179
In a34680a33e OrcError.h and Orc/RPC/*.h were split out from the rest of
ExecutionEngine in order to eliminate false dependencies for remote JIT
targets (see https://reviews.llvm.org/D68732), however this broke modules
builds (see https://reviews.llvm.org/D69817).
This patch splits these headers out into a separate module, LLVM_OrcSupport,
in order to fix the modules build.
Fixes <rdar://56377508>.
It was failing with
PerfJITEventListener.cpp:489:7: error: 'ManagedStatic' in namespace 'llvm' does not name a template type
llvm::ManagedStatic<PerfJITEventListener> PerfListener;
It was failing with
llvm/lib/ExecutionEngine/Orc/DebugUtils.cpp:56:10:
error: could not convert ‘Obj’ from ‘std::unique_ptr<llvm::MemoryBuffer>’
to ‘llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> >’
return Obj;
^
Adds a DumpObjects utility that can be used to dump JIT'd objects to disk.
Instances of DebugObjects may be used by ObjectTransformLayer as no-op
transforms.
This patch also adds an ObjectTransformLayer to LLJIT and an example of how
to use this utility to dump JIT'd objects in LLJIT.
Some targets (E.g. MachO/arm64) use relocations to fix some CFI record fields
in the eh-frame section. When relocations are used the initial (pre-relocation)
content of the eh-frame section can no longer be interpreted by following the
eh-frame specification. This causes errors in the existing eh-frame parser.
This patch moves eh-frame handling into two LinkGraph passes that are run after
relocations have been parsed (but before they are applied). The first] pass
breaks up blocks in the eh-frame section into per-CFI-record blocks, and the
second parses blocks of (potentially multiple) CFI records and adds the
appropriate edges to any CFI fields that do not have existing relocations.
These passes can be run independently of one another. By handling eh-frame
splitting/fixing with LinkGraph passes we can both re-use existing relocations
for CFI record fields and avoid applying eh-frame fixups before parsing the
section (which would complicate the linker and require extra temporary
allocations of working memory).
LinkGraph::splitBlock will split a block at a given index, returning a new
block covering the range [ 0, index ) and modifying the original block to
cover the range [ index, original-block-size ). Block addresses, content,
edges and symbols will be updated as necessary. This utility will be used
in upcoming improvements to JITLink's eh-frame support.
Summary:
When createing an ORC remote JIT target the current library split forces the target process to link large portions of LLVM (Core, Execution Engine, JITLink, Object, MC, Passes, RuntimeDyld, Support, Target, and TransformUtils). This occurs because the ORC RPC interfaces rely on the static globals the ORC Error types require, which starts a cycle of pulling in more and more.
This patch breaks the ORC RPC Error implementations out into an "OrcError" library which only depends on LLVM Support. It also pulls the ORC RPC headers into their own subdirectory.
With this patch code can include the Orc/RPC/*.h headers and will only incur link dependencies on LLVMOrcError and LLVMSupport.
Reviewers: lhames
Reviewed By: lhames
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68732
Sections may have zero size and zero-sized sections may share a start address
with other zero-sized sections. For the section overlap test to function
correctly zero-sized sections must be ordered before any non-zero sized ones.
This should fix the intermittent failures in the
test/ExecutionEngine/JITLink/X86/MachO_zero_fill_alignment.s test case that
have been observed on some builders.
It returns just a section_iterator currently and have a report_fatal_error call inside.
This change adds a way to return errors and handle them on caller sides.
The patch also changes/improves current users and adds test cases.
Differential revision: https://reviews.llvm.org/D69167
llvm-svn: 375408
Works on this dependency chain:
ArrayRef.h ->
Hashing.h -> --CUT--
Host.h ->
StringMap.h / StringRef.h
ArrayRef is very popular, but Host.h is rarely needed. Move the
IsBigEndianHost constant to SwapByteOrder.h. Clients of that header are
more likely to need it.
llvm-svn: 375316
RTDyldObjectLinkingLayer allowed clients to register a NotifyEmitted function to
reclaim ownership of object buffers once they had been linked. This patch adds
similar functionality to ObjectLinkingLayer: Clients can now optionally call the
ObjectLinkingLayer::setReturnObjectBuffer method to register a function that
will be called when discarding object buffers. If set, this function will be
called to return ownership of the object regardless of whether the link
succeeded or failed.
Use cases for this function include debug dumping (it provides a way to dump
all objects linked into JIT'd code) and object re-use (e.g. storing an
object in a cache).
llvm-svn: 374951
InProcessMemoryManager used to make separate memory allocation calls for each
permission level (RW, RX, RO), which could lead to target-out-of-range errors
if data and code were placed too far apart (this was the source of failures in
the JITLink/AArch64 testcase when it was first landed).
This patch updates InProcessMemoryManager to allocate a single slab which is
subdivided between text and data. This should guarantee that accesses remain
in-range provided that individual object files do not exceed 1Mb in size.
This patch also re-enables the JITLink/AArch64 testcase.
llvm-svn: 374948
This implementation has support for all relocation types except TLV.
Compact unwind sections are not yet supported, so exceptions/unwinding will not
work.
llvm-svn: 374476
Doing this makes MSVC complain that `empty(someRange)` could refer to
either C++17's std::empty or LLVM's llvm::empty, which previously we
avoided via SFINAE because std::empty is defined in terms of an empty
member rather than begin and end. So, switch callers over to the new
method as it is added.
https://reviews.llvm.org/D68439
llvm-svn: 373935