TargetProcessControl is a new API for communicating with JIT target processes.
It supports memory allocation and access, and inspection of some process
properties, e.g. the target proces triple and page size.
Centralizing these APIs allows utilities written against TargetProcessControl
to remain independent of the communication procotol with the target process
(which may be direct memory access/allocation for in-process JITing, or may
involve some form of IPC or RPC).
An initial set of TargetProcessControl-based utilities for lazy compilation is
provided by the TPCIndirectionUtils class.
An initial implementation of TargetProcessControl for in-process JITing
is provided by the SelfTargetProcessControl class.
An example program showing how the APIs can be used is provided in
llvm/examples/OrcV2Examples/LLJITWithTargetProcessControl.
Summary: This PR contains a build failure fix that occurs on both AIX and z/OS as a result of this commit https://reviews.llvm.org/rG670915094462d831e3733e5b01a76471b8cf6dd8.
Reviewers: uweigand, Kai, hubert.reinterpretcast, daltenty, lhames
Reviewed By: Kai, hubert.reinterpretcast, daltenty
Subscribers: SeanP, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83889
destructor via a pointer of the wrong static type.
This caused crashes during deallocation in C++14 builds when using a
deallocator whose sized delete requires the size argument to be correct.
Also make the LazyCallThroughManager destructor protected to catch this
sort of bug in the future.
LazyReexportsManager instances use the trampoline pool, but they don't need to
own it. Keeping TrampolinePool ownership separate allows re-use of the
trampoline pool by other clients.
This patch generalizes the APIs for writing re-entry blocks, trampolines and
stubs to allow their final linked address to differ from the address of
their initial working memory. This will allow these routines to be used with
JITLinkMemoryManagers, which will in turn allow for unification of code paths
for in-process and cross-process lazy JITing.
JITLink supports all code and relocation models, so there's no reason to
conditionalize using JITLink on the code or relocation model settings.
Clients wanting to use RTDyldObjectLinkingLayer/RuntimeDyld will now
need to use a custom object linking layer creator.
MaterializationResponsibility.
MaterializationResponsibility objects provide a connection between a
materialization process (compiler, jit linker, etc.) and the JIT state held in
the ExecutionSession and JITDylib objects. Switching to shared ownership
extends the lifetime of JITDylibs to ensure they remain accessible until all
materializers targeting them have completed. This will allow (in a follow-up
patch) JITDylibs to be removed from the ExecutionSession and placed in a
pending-destruction state while they are kept alive to communicate errors
to/from any still-runnning materialization processes. The intent is to enable
JITDylibs to be safely removed even if they have running compiles targeting
them.
Refering to the link order of a dylib better matches the terminology used in
static compilation. As upcoming patches will increase the number of places where
link order matters (for example when closing JITDylibs) it's better to get this
name change out of the way early.
LLJIT::defineAbsolute did not mangle its Name argument, which is inconsistent
with the behavior of other LLJIT methods (e.g. lookup). Since it is currently
unused anyway, this commit replaces it with a generic 'define' convenience
method for adding MaterializationUnits to the main JITDylib. This simplifies
use of the generic absoluteSymbols function (as well as the symbolAlias,
reexports and other functions that generate MaterializationUnits) with LLJIT.
Adds basic support for LLJITBuilder and DynamicLibrarySearchGenerator. This
allows C API clients to configure LLJIT to expose process symbols to JIT'd
code. An example of this is added in
llvm/examples/OrcV2CBindingsReflectProcessSymbols.
Add a new overload of StaticLibraryDefinitionGenerator::Load that takes a triple
argument and supports loading archives from MachO universal binaries in addition
to regular archives.
The LLI tool is updated to use this overload.
Failure to export __cxa_atexit can lead to an attempt to import a definition
from the process itself (if __cxa_atexit is referenced from another JITDylib),
but the process definition will clash with the existing non-exported definition
to produce an unexpected DuplicateDefinitionError.
This patch fixes the immediate issue by exporting __cxa_atexit. It also fixes a
bug where atexit functions in other JITDylibs were not being run by adding a
copy of run_atexits_helper to every JITDylib.
A follow up patch will deal with the bug where definition generators are called
despite a non-exported definition being present.
The MemoryBuffer::getMemBuffer method's RequiresNullTerminator parameter
defaults to true, but object files are not null terminated so we need to
explicitly pass false here.
This flag can be used to mark a symbol as existing only for the purpose of
enabling materialization. Such a symbol can be looked up to trigger
materialization with the lookup returning only once materialization is
complete. Symbols with this flag will never resolve however (to avoid
permanently polluting the symbol table), and should only be looked up using
the SymbolLookupFlags::WeaklyReferencedSymbol flag. The primary use case for
this flag is initialization symbols.
Summary:
Rename `succ_const_iterator` to `const_succ_iterator` and
`succ_const_range` to `const_succ_range` for consistency with the
predecessor iterators, and the corresponding iterators in
MachineBasicBlock.
Reviewers: nicholas, dblaikie, nlewycky
Subscribers: hiraditya, bmahjour, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75952
Updates the object buffer ownership scheme in jitLinkForOrc and related
functions: Ownership of both the object::ObjectFile and underlying
MemoryBuffer is passed into jitLinkForOrc and passed back to the onEmit
callback once linking is complete. This avoids the use-after-free errors
that were seen in 98f2bb4461.
Along the same lines as eb918d8daf1: This code also had to acquire the session
mutex, and this could cause a deadlock under the wrong circumstances. This
patch updates GenericLLVMIRPlatformSupport to just use the session lock for
everything.
In MachOPlatform, obtaining the link-order for a JITDylib requires locking the
session, but also needs to be part of a larger atomic operation that collates
initializer symbols tracked by the platform. Trying to do this under a separate
platform mutex leads to potential locking order issues, e.g.
T1 locks session then tries to lock platform to register a new init symbol
meanwhile
T2 locks platform then tries to lock session to obtain link order.
Removing the platform lock and performing all these operations under the session
lock eliminates this possibility.
At the same time we also need to collate init pointers from the
MachOPlatform::InitScraperPlugin, and we don't need or want to lock the session
for that. The new InitSeqMutex has been added to guard these init pointers, and
the session mutex is never obtained while the InitSeqMutex is held.
The MU may define no symbols, but still contain a non-trivial destructor (e.g.
an LLVM IR module that has been stripped of all externally visible
definitions, but which still needs to lock its context to be destroyed).
Bailing out early ensures that we destroy the unit outside the session lock,
rather than under it which may cause deadlocks.
Also adds some extra sanity-checking assertions.
Enable use of ExecutionEngine JITEventListeners in RTDyldObjectLinkingLayer.
This allows existing MCJIT clients to more easily migrate to LLJIT / ORCv2.
Example usage in llvm/examples/OrcV2Examples/LLJITWithGDBRegistrationListener.
Differential Revision: https://reviews.llvm.org/D75838
Global symbols with linker-private prefixes should be resolvable across object
boundaries, but internal symbols with linker-private prefixes should not.
Renames the llvm/examples/LLJITExamples directory to llvm/examples/OrcV2Examples
since it is becoming a home for all OrcV2 examples, not just LLJIT.
See http://llvm.org/PR31103.
This patch enables exception handling in code added to LLJIT on Darwin by
adding an orc::EHFrameRegistrationPlugin instance to the ObjectLinkingLayer
(which is currently used on Darwin only).
These may be accessed from multiple threads if concurrent materialization is
enabled in ORC.
Testcase coming in a follow-up patch that enables eh-frame registration for
LLJIT.
The LLJIT::MachOPlatformSupport class used to unconditionally attempt to
register __objc_selrefs and __objc_classlist sections. If libobjc had not
been loaded this resulted in an assertion, even if no objc sections were
actually present. This patch replaces this unconditional registration with
a check that no objce sections are present if libobjc has not been loaded.
This will allow clients to use MachOPlatform with LLJIT without requiring
libobjc for non-objc code.
Summary: Decompose callThroughToSymbol() into findReexport(), resolveSymbol(), notifyResolved() and reportCallThroughError(). This allows derived classes to reuse the functionality while adding their own code in between.
Reviewers: lhames
Reviewed By: lhames
Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75084
ST_File symbols aren't relevant for linking purposes, but can end up shadowing
real symbols if they're not filtered.
No test case yet: The ideal testcase for this would be an ELF llvm-jitlink test,
but llvm-jitlink support for ELF is still under development. We should add a
testcase for this once support lands in tree.
Lots of headers pass around MemoryBuffer objects, but very few open
them. Let those that do include FileSystem.h.
Saves ~250 includes of Chrono.h & FileSystem.h:
$ diff -u thedeps-before.txt thedeps-after.txt | grep '^[-+] ' | sort | uniq -c | sort -nr
254 - ../llvm/include/llvm/Support/FileSystem.h
253 - ../llvm/include/llvm/Support/Chrono.h
237 - ../llvm/include/llvm/Support/NativeFormatting.h
237 - ../llvm/include/llvm/Support/FormatProviders.h
192 - ../llvm/include/llvm/ADT/StringSwitch.h
190 - ../llvm/include/llvm/Support/FormatVariadicDetails.h
...
This requires duplicating the file_t typedef, which is unfortunate. I
sunk the choice of mapping mode down into the cpp file using variable
template specializations instead of class members in headers.
The GenericLLVMIRPlatformSupport class runs a transform on all LLVM IR added to
the LLJIT instance to replace instances of llvm.global_ctors with a specially
named function that runs the corresponing static initializers (See
(GlobalCtorDtorScraper from lib/ExecutionEngine/Orc/LLJIT.cpp). This patch
updates the GenericIRPlatform class to check for this specially named function
in other materialization units that are added to the JIT and, if found, add
the function to the initializer work queue. Doing this allows object files
that were compiled from IR and cached to be reloaded in subsequent JIT sessions
without their initializers being skipped.
To enable testing this patch also updates the lli tool's -jit-kind=orc-lazy mode
to respect the -enable-cache-manager and -object-cache-dir options, and modifies
the CompileOnDemandLayer to rename extracted submodules to include a hash of the
names of their symbol definitions. This allows a simple object caching scheme
based on module names (which was already implemented in lli) to work with the
lazy JIT.
This patch adds new errors and error checking to the ObjectLinkingLayer to
catch cases where a compiled or loaded object either:
(1) Contains definitions not covered by its responsibility set, or
(2) Is missing definitions that are covered by its responsibility set.
Proir to this patch providing the correct set of definitions was treated as
an API contract requirement, however this requires that the client be confident
in the correctness of the whole compiler / object-cache pipeline and results
in difficult-to-debug assertions upon failure. Treating this as a recoverable
error results in clearer diagnostics.
The performance overhead of this check is one comparison of densemap keys
(symbol string pointers) per linking object, which is minimal. If this overhead
ever becomes a problem we can add the check under a flag that can be turned off
if the client fully trusts the rest of the pipeline.
Initializers and deinitializers are used to implement C++ static constructors
and destructors, runtime registration for some languages (e.g. with the
Objective-C runtime for Objective-C/C++ code) and other tasks that would
typically be performed when a shared-object/dylib is loaded or unloaded by a
statically compiled program.
MCJIT and ORC have historically provided limited support for discovering and
running initializers/deinitializers by scanning the llvm.global_ctors and
llvm.global_dtors variables and recording the functions to be run. This approach
suffers from several drawbacks: (1) It only works for IR inputs, not for object
files (including cached JIT'd objects). (2) It only works for initializers
described by llvm.global_ctors and llvm.global_dtors, however not all
initializers are described in this way (Objective-C, for example, describes
initializers via specially named metadata sections). (3) To make the
initializer/deinitializer functions described by llvm.global_ctors and
llvm.global_dtors searchable they must be promoted to extern linkage, polluting
the JIT symbol table (extra care must be taken to ensure this promotion does
not result in symbol name clashes).
This patch introduces several interdependent changes to ORCv2 to support the
construction of new initialization schemes, and includes an implementation of a
backwards-compatible llvm.global_ctor/llvm.global_dtor scanning scheme, and a
MachO specific scheme that handles Objective-C runtime registration (if the
Objective-C runtime is available) enabling execution of LLVM IR compiled from
Objective-C and Swift.
The major changes included in this patch are:
(1) The MaterializationUnit and MaterializationResponsibility classes are
extended to describe an optional "initializer" symbol for the module (see the
getInitializerSymbol method on each class). The presence or absence of this
symbol indicates whether the module contains any initializers or
deinitializers. The initializer symbol otherwise behaves like any other:
searching for it triggers materialization.
(2) A new Platform interface is introduced in llvm/ExecutionEngine/Orc/Core.h
which provides the following callback interface:
- Error setupJITDylib(JITDylib &JD): Can be used to install standard symbols
in JITDylibs upon creation. E.g. __dso_handle.
- Error notifyAdding(JITDylib &JD, const MaterializationUnit &MU): Generally
used to record initializer symbols.
- Error notifyRemoving(JITDylib &JD, VModuleKey K): Used to notify a platform
that a module is being removed.
Platform implementations can use these callbacks to track outstanding
initializers and implement a platform-specific approach for executing them. For
example, the MachOPlatform installs a plugin in the JIT linker to scan for both
__mod_inits sections (for C++ static constructors) and ObjC metadata sections.
If discovered, these are processed in the usual platform order: Objective-C
registration is carried out first, then static initializers are executed,
ensuring that calls to Objective-C from static initializers will be safe.
This patch updates LLJIT to use the new scheme for initialization. Two
LLJIT::PlatformSupport classes are implemented: A GenericIR platform and a MachO
platform. The GenericIR platform implements a modified version of the previous
llvm.global-ctor scraping scheme to provide support for Windows and
Linux. LLJIT's MachO platform uses the MachOPlatform class to provide MachO
specific initialization as described above.
Reviewers: sgraenitz, dblaikie
Subscribers: mgorny, hiraditya, mgrang, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74300
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
ObjectLinkingLayer was not correctly propagating dependencies through local
symbols within an object. This could cause symbol lookup to return before a
searched-for symbol is ready if the following conditions are met:
(1) The definition of the symbol being searched for transitively depends on a
local symbol within the same object, and that local symbol in turn
transitively depends on an external symbol provided by some other module
in the JIT.
(2) Concurrent compilation is enabled.
(3) Thread scheduling causes the lookup of the searched-for symbol to return
before all transitive dependencies of the looked-up symbol are emitted.
This bug was found by inspection and has not been observed in practice.
A jitlink test case has been added to verify that symbol dependencies are
correctly propagated through local symbol definitions.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.
Reviewers: xbolva00, courbet, bollu
Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73099
This commit adds a ManglingOptions struct to IRMaterializationUnit, and replaces
IRCompileLayer::CompileFunction with a new IRCompileLayer::IRCompiler class. The
ManglingOptions struct defines the emulated-TLS state (via a bool member,
EmulatedTLS, which is true if emulated-TLS is enabled and false otherwise). The
IRCompileLayer::IRCompiler class wraps an IRCompiler (the same way that the
CompileFunction typedef used to), but adds a method to return the
IRCompileLayer::ManglingOptions that the compiler will use.
These changes allow us to correctly determine the symbols that will be produced
when a thread local global variable defined at the IR level is compiled with or
without emulated TLS. This is required for ORCv2, where MaterializationUnits
must declare their interface up-front.
Most ORCv2 clients should not require any changes. Clients writing custom IR
compilers will need to wrap their compiler in an IRCompileLayer::IRCompiler,
rather than an IRCompileLayer::CompileFunction, however this should be a
straightforward change (see modifications to CompileUtils.* in this patch for an
example).
The MaterializationResponsibility::defineMaterializing method allows clients to
add new definitions that are in the process of being materialized to the JIT.
This patch adds support to defineMaterializing for symbols with weak linkage
where the new definitions may be rejected if another materializer concurrently
defines the same symbol. If a weak symbol is rejected it will not be added to
the MaterializationResponsibility's responsibility set. Clients can check for
membership in the responsibility set via the
MaterializationResponsibility::getSymbols() method before resolving any
such weak symbols.
This patch also adds code to RTDyldObjectLinkingLayer to tag COFF comdat symbols
introduced during codegen as weak, on the assumption that these are COFF comdat
constants. This fixes http://llvm.org/PR40074.
Based on Don Hinton's patch in https://reviews.llvm.org/D72406. This feature
was accidentally left out of e9e26c01cd, and
would have pessimized concurrent compilation in the default case.
Thanks for spotting this Don!
This patch makes the target triple available via the LLJIT interface, and moves
the IRTransformLayer from LLLazyJIT down into LLJIT. Together these changes make
it easier to use the lazyReexports utility with LLJIT, and to apply IR
transforms to code as it is compiled in LLJIT (rather than requiring transforms
to be applied manually before code is added). An code example is added in
llvm/examples/LLJITExamples/LLJITWithLazyReexports
A bug in the existing implementation meant that lazyReexports would not work if
the aliased name differed from the alias's name, i.e. all lazy reexports had to
be of the form (lib1, name) -> (lib2, name). This patch fixes the issue by
capturing the alias's name in the NotifyResolved callback. To simplify this
capture, and the LazyCallThroughManager code in general, the NotifyResolved
callback is updated to use llvm::unique_function rather than a custom class.
No test case yet: This can only be tested at runtime, and the only in-tree
client (lli) always uses aliases with matching names. I will add a new LLJIT
example shortly that will directly test the lazyReexports API and the
non-trivial alias use case.
This fixes an off-by-one error in the argc value computed by runAsMain, and
switches lli back to using the input bitcode (rather than the string "lli") as
the effective program name.
Thanks to Stefan Graenitz for spotting the bug.
LLJITBuilder will now use JITLink on supported platforms even if a custom
JITTargetMachineBuilder is supplied, provided that neither the code model,
nor the relocation model, nor the ObjectLinkingLayerCreator is set.
JITLink (which underlies ObjectLinkingLayer) is a replacement for RuntimeDyld.
It supports the native code model, and linker plugins that enable a wider range
of features than RuntimeDyld.
Currently only enabled for MachO/x86-64 and MachO/arm64. New architectures will
be added as JITLink support for them is developed.
This relieves ObjectLinkingLayer clients of the responsibility of holding the
memory manager. This makes it easier to select between RTDyldObjectLinkingLayer
(which already owned its memory manager factory) and ObjectLinkingLayer at
runtime as clients aren't required to hold a jitlink::MemoryManager field just
in case ObjectLinkingLayer is selected.
This patch removes the magic "main" JITDylib from ExecutionEngine. The main
JITDylib was created automatically at ExecutionSession construction time, and
all subsequently created JITDylibs were added to the main JITDylib's
links-against list by default. This saves a couple of lines of boilerplate for
simple JIT setups, but this isn't worth introducing magical behavior for.
ORCv2 clients should now construct their own main JITDylib using
ExecutionSession::createJITDylib and set up its linkages manually using
JITDylib::setSearchOrder (or related methods in JITDylib).
The runAsMain function takes a pointer to a function with a standard C main
signature, int(*)(int, char*[]), and invokes it using the given arguments and
program name. The arguments are copied into writable temporary storage as
required by the C and C++ specifications, so runAsMain safe to use when calling
main functions that modify their arguments in-place.
This patch also uses the new runAsMain function to replace hand-rolled versions
in lli, llvm-jitlink, and the SpeculativeJIT example.
libraries.
This patch substantially updates ORCv2's lookup API in order to support weak
references, and to better support static archives. Key changes:
-- Each symbol being looked for is now associated with a SymbolLookupFlags
value. If the associated value is SymbolLookupFlags::RequiredSymbol then
the symbol must be defined in one of the JITDylibs being searched (or be
able to be generated in one of these JITDylibs via an attached definition
generator) or the lookup will fail with an error. If the associated value is
SymbolLookupFlags::WeaklyReferencedSymbol then the symbol is permitted to be
undefined, in which case it will simply not appear in the resulting
SymbolMap if the rest of the lookup succeeds.
Since lookup now requires these flags for each symbol, the lookup method now
takes an instance of a new SymbolLookupSet type rather than a SymbolNameSet.
SymbolLookupSet is a vector-backed set of (name, flags) pairs. Clients are
responsible for ensuring that the set property (i.e. unique elements) holds,
though this is usually simple and SymbolLookupSet provides convenience
methods to support this.
-- Lookups now have an associated LookupKind value, which is either
LookupKind::Static or LookupKind::DLSym. Definition generators can inspect
the lookup kind when determining whether or not to generate new definitions.
The StaticLibraryDefinitionGenerator is updated to only pull in new objects
from the archive if the lookup kind is Static. This allows lookup to be
re-used to emulate dlsym for JIT'd symbols without pulling in new objects
from archives (which would not happen in a normal dlsym call).
-- JITLink is updated to allow externals to be assigned weak linkage, and
weak externals now use the SymbolLookupFlags::WeaklyReferencedSymbol value
for lookups. Unresolved weak references will be assigned the default value of
zero.
Since this patch was modifying the lookup API anyway, it alo replaces all of the
"MatchNonExported" boolean arguments with a "JITDylibLookupFlags" enum for
readability. If a JITDylib's associated value is
JITDylibLookupFlags::MatchExportedSymbolsOnly then the lookup will only
match against exported (non-hidden) symbols in that JITDylib. If a JITDylib's
associated value is JITDylibLookupFlags::MatchAllSymbols then the lookup will
match against any symbol defined in the JITDylib.
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so. Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:
1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so. This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.
With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.
2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set. This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.
I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:
- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON
Reviewers: beanz, smeenai, compnerd, phosek
Reviewed By: beanz
Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70179
It was failing with
llvm/lib/ExecutionEngine/Orc/DebugUtils.cpp:56:10:
error: could not convert ‘Obj’ from ‘std::unique_ptr<llvm::MemoryBuffer>’
to ‘llvm::Expected<std::unique_ptr<llvm::MemoryBuffer> >’
return Obj;
^
Adds a DumpObjects utility that can be used to dump JIT'd objects to disk.
Instances of DebugObjects may be used by ObjectTransformLayer as no-op
transforms.
This patch also adds an ObjectTransformLayer to LLJIT and an example of how
to use this utility to dump JIT'd objects in LLJIT.
Summary:
When createing an ORC remote JIT target the current library split forces the target process to link large portions of LLVM (Core, Execution Engine, JITLink, Object, MC, Passes, RuntimeDyld, Support, Target, and TransformUtils). This occurs because the ORC RPC interfaces rely on the static globals the ORC Error types require, which starts a cycle of pulling in more and more.
This patch breaks the ORC RPC Error implementations out into an "OrcError" library which only depends on LLVM Support. It also pulls the ORC RPC headers into their own subdirectory.
With this patch code can include the Orc/RPC/*.h headers and will only incur link dependencies on LLVMOrcError and LLVMSupport.
Reviewers: lhames
Reviewed By: lhames
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68732
Works on this dependency chain:
ArrayRef.h ->
Hashing.h -> --CUT--
Host.h ->
StringMap.h / StringRef.h
ArrayRef is very popular, but Host.h is rarely needed. Move the
IsBigEndianHost constant to SwapByteOrder.h. Clients of that header are
more likely to need it.
llvm-svn: 375316
RTDyldObjectLinkingLayer allowed clients to register a NotifyEmitted function to
reclaim ownership of object buffers once they had been linked. This patch adds
similar functionality to ObjectLinkingLayer: Clients can now optionally call the
ObjectLinkingLayer::setReturnObjectBuffer method to register a function that
will be called when discarding object buffers. If set, this function will be
called to return ownership of the object regardless of whether the link
succeeded or failed.
Use cases for this function include debug dumping (it provides a way to dump
all objects linked into JIT'd code) and object re-use (e.g. storing an
object in a cache).
llvm-svn: 374951
Doing this makes MSVC complain that `empty(someRange)` could refer to
either C++17's std::empty or LLVM's llvm::empty, which previously we
avoided via SFINAE because std::empty is defined in terms of an empty
member rather than begin and end. So, switch callers over to the new
method as it is added.
https://reviews.llvm.org/D68439
llvm-svn: 373935
In the Atom model the symbols, content and relocations of a relocatable object
file are represented as a graph of atoms, where each Atom represents a
contiguous block of content with a single name (or no name at all if the
content is anonymous), and where edges between Atoms represent relocations.
If more than one symbol is associated with a contiguous block of content then
the content is broken into multiple atoms and layout constraints (represented by
edges) are introduced to ensure that the content remains effectively contiguous.
These layout constraints must be kept in mind when examining the content
associated with a symbol (it may be spread over multiple atoms) or when applying
certain relocation types (e.g. MachO subtractors).
This patch replaces the Atom model in JITLink with a blocks-and-symbols model.
The blocks-and-symbols model represents relocatable object files as bipartite
graphs, with one set of nodes representing contiguous content (Blocks) and
another representing named or anonymous locations (Symbols) within a Block.
Relocations are represented as edges from Blocks to Symbols. This scheme
removes layout constraints (simplifying handling of MachO alt-entry symbols,
and hopefully ELF sections at some point in the future) and simplifies some
relocation logic.
llvm-svn: 373689
This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM.
FastISel is mostly disabled for now since it would generate incorrect code for
ILP32.
llvm-svn: 371722
Summary:
This patch introduces, SequenceBBQuery - new heuristic to find likely next callable functions it tries to find the blocks with calls in order of execution sequence of Blocks.
It still uses BlockFrequencyAnalysis to find high frequency blocks. For a handful of hottest blocks (plan to customize), the algorithm traverse and discovered the caller blocks along the way to Entry Basic Block and Exit Basic Block. It uses Block Hint, to stop traversing the already visited blocks in both direction. It implicitly assumes that once the block is visited during discovering entry or exit nodes, revisiting them again does not add much. It also branch probability info (cached result) to traverse only hot edges (planned to customize) from hot blocks. Without BPI, the algorithm mostly return's all the blocks in the CFG with calls.
It also changes the heuristic queries, so they don't maintain states. Hence it is safe to call from multiple threads.
It also implements, new instrumentation to avoid jumping into JIT on every call to the function with the help _orc_speculate.decision.block and _orc_speculate.block.
"Speculator Registration Mechanism is also changed" - kudos to @lhames
Open to review, mostly looking to change implementation of SequeceBBQuery heuristics with good data structure choices.
Reviewers: lhames, dblaikie
Reviewed By: lhames
Subscribers: mgorny, hiraditya, mgrang, llvm-commits, lhames
Tags: #speculative_compilation_in_orc, #llvm
Differential Revision: https://reviews.llvm.org/D66399
llvm-svn: 370092
On MachO, processing of the eh-frame section should stop if the end of the
__eh_frame section is reached, regardless of whether or not there is a null CFI
length field at the end of the section. This patch tracks the eh-frame section
size and threads it through the appropriate APIs so that processing can be
terminated correctly.
No testcase yet: This patch is all API plumbing (rather than modification of
linked memory) which the existing infrastructure does not provide a way of
testing. Committing without a testcase until I have an idea of how to write
one.
llvm-svn: 370074
In r369808 the failure scheme for ORC symbols was changed to make
MaterializationResponsibility objects responsible for failing the symbols
they represented. This simplifies error logic in the case where symbols are
still covered by a MaterializationResponsibility, but left a gap in error
handling: Symbols that have been emitted but are not yet ready (due to a
dependence on some unemitted symbol) are not covered by a
MaterializationResponsibility object. Under the scheme introduced in r369808
such symbols would be moved to the error state, but queries on those symbols
were never notified. This led to deadlocks when such symbols were failed.
This commit updates error logic to immediately fail queries on any symbol that
has already been emitted if one of its dependencies fails.
llvm-svn: 369976
Symbols that have not been queried will not have MaterializingInfo entries,
so remove the assert that all failed symbols should have these entries.
Also updates the loop to only remove entries that were found earlier.
llvm-svn: 369975
If the dependencies are not removed then a late failure (one symbol covered by
the query failing after others have already been resolved) can result in an
attempt to detach the query from already finalized symbol, resulting in an
assert/crash. This patch fixes the issue by removing query dependencies in
JITDylib::resolve for symbols that meet the required state.
llvm-svn: 369809
When symbols are failed (via MaterializationResponsibility::failMaterialization)
any symbols depending on them will now be moved to an error state. Attempting
to resolve or emit a symbol in the error state (via the notifyResolved or
notifyEmitted methods on MaterializationResponsibility) will result in an error.
If notifyResolved or notifyEmitted return an error due to failure of a
dependence then the caller should log or discard the error and call
failMaterialization to propagate the failure to any queries waiting on the
symbols being resolved/emitted (plus their dependencies).
llvm-svn: 369808
Summary:
rL367756 (f5c40cb) increases the dependency of LLVMOrcJIT on LLVMPasses.
In particular, symbols defined in LLVMPasses that are referenced by the
destructor of `PassBuilder` are now referenced by LLVMOrcJIT through
`Speculation.cpp.o`.
We believe that referencing symbols defined in LLVMPasses in the
destructor of `PassBuilder` is valid, and that adding to the set of such
symbols is legitimate. To support such cases, this patch adds LLVMPasses
to the set of libraries being linked when linking in LLVMOrcJIT causes
such symbols from LLVMPasses to be referenced.
Reviewers: Whitney, anhtuyen, pree-jackie
Reviewed By: pree-jackie
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66441
llvm-svn: 369310
Summary:
llvm/lib/ExecutionEngine/Orc/Layer.cpp:53:12: warning: returning address of local temporary object [-Wreturn-stack-address]
In
```
StringRef IRMaterializationUnit::getName() const {
[...]
return TSM.withModuleDo(
[](const Module &M) { return M.getModuleIdentifier(); });
```
`getModuleIdentifier()` returns a `const std::string &`, but the implicit return type
of the lambda is `std::string` by value, and thus the returned `StringRef` refers
to a temporary `std::string`.
Detect by annotating `llvm::StringRef` with `[[gsl::Pointer]]`.
Reviewers: lhames, sgraenitz
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66440
llvm-svn: 369306
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
This patch replaces the JITDylib::DefinitionGenerator typedef with a class of
the same name, and adds support for attaching a sequence of DefinitionGeneration
objects to a JITDylib.
This patch also adds a new definition generator,
StaticLibraryDefinitionGenerator, that can be used to add symbols fom a static
library to a JITDylib. An object from the static library will be added (via
a supplied ObjectLayer reference) whenever a symbol from that object is
referenced.
To enable testing, lli is updated to add support for the --extra-archive option
when running in -jit-kind=orc-lazy mode.
llvm-svn: 368707
This commit adds host CPU name and sub-target features to the
`JITTargetMachineBuilder` created by `JITTargetMachineBuilder::detectHost()`.
Differential Revision: https://reviews.llvm.org/D65760
llvm-svn: 367944
libObject does not apply the Exported flag to symbols in COFF object files,
which can lead to assertions when the symbol flags initially derived from
IR added to the JIT clash with the flags seen by the JIT linker. Both
RTDyldObjectLinkingLayer and ObjectLinkingLayer have a workaround for this:
they can be told to override the flags seen by the linker with the flags
attached to the materialization responsibility object that was passed down
to the linker. This patch modifies LLJIT's setup code to enable this override
by default on platforms where COFF is the default object format.
llvm-svn: 367712
ThreadSafeModule/ThreadSafeContext are used to manage lifetimes and locking
for LLVMContexts in ORCv2. Prior to this patch contexts were locked as soon
as an associated Module was emitted (to be compiled and linked), and were not
unlocked until the emit call returned. This could lead to deadlocks if
interdependent modules that shared contexts were compiled on different threads:
when, during emission of the first module, the dependence was discovered the
second module (which would provide the required symbol) could not be emitted as
the thread emitting the first module still held the lock.
This patch eliminates this possibility by moving to a finer-grained locking
scheme. Each client holds the module lock only while they are actively operating
on it. To make this finer grained locking simpler/safer to implement this patch
removes the explicit lock method, 'getContextLock', from ThreadSafeModule and
replaces it with a new method, 'withModuleDo', that implicitly locks the context,
calls a user-supplied function object to operate on the Module, then implicitly
unlocks the context before returning the result.
ThreadSafeModule TSM = getModule(...);
size_t NumFunctions = TSM.withModuleDo(
[](Module &M) { // <- context locked before entry to lambda.
return M.size();
});
Existing ORCv2 layers that operate on ThreadSafeModules are updated to use the
new method.
This method is used to introduce Module locking into each of the existing
layers.
llvm-svn: 367686
Summary:
ORCv1 is deprecated. The current aim is to remove it before the LLVM 10.0
release. This patch adds deprecation attributes to the ORCv1 layers and
utilities to warn clients of the change.
Reviewers: dblaikie, sgraenitz, AlexDenisov
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64609
llvm-svn: 366344
LLJITBuilder now has a setCompileFunctionCreator method which can be used to
construct a CompileFunction for the LLJIT instance being created. The motivating
use-case for this is supporting ObjectCaches, which can now be set up at
compile-function construction time. To demonstrate this an example project,
LLJITWithObjectCache, is included.
llvm-svn: 365671
Replaces direct calls to eh-frame registration with calls to methods on an
EHFrameRegistrar instance. This allows clients to substitute a registrar that
registers frames in a remote process via IPC/RPC.
llvm-svn: 365098
notifyResolved/notifyEmitted.
The 'notify' prefix better describes what these methods do: they update the JIT
symbol states and notify any pending queries that the 'resolved' and 'emitted'
states have been reached (rather than actually performing the resolution or
emission themselves). Since new states are going to be introduced in the near
future (to track symbol registration/initialization) it's worth changing the
convention pre-emptively to avoid further confusion.
llvm-svn: 363322
rather than two callbacks.
The asynchronous lookup API (which the synchronous lookup API wraps for
convenience) used to take two callbacks: OnResolved (called once all requested
symbols had an address assigned) and OnReady to be called once all requested
symbols were safe to access). This patch updates the asynchronous lookup API to
take a single 'OnComplete' callback and a required state (SymbolState) to
determine when the callback should be made. This simplifies the common use case
(where the client is interested in a specific state) and will generalize neatly
as new states are introduced to track runtime initialization of symbols.
Clients who were making use of both callbacks in a single query will now need to
issue two queries (one for SymbolState::Resolved and another for
SymbolState::Ready). Synchronous lookup API clients who were explicitly passing
the WaitOnReady argument will now need neeed to pass a SymbolState instead (for
'WaitOnReady == true' use SymbolState::Ready, for 'WaitOnReady == false' use
SymbolState::Resolved). Synchronous lookup API clients who were using default
arugment values should see no change.
llvm-svn: 362832
Prior to this patch, JITDylibs inferred symbol states (whether a symbol was
newly added, materializing, resolved, or ready to run) via a combination of (1)
bits in the JITSymbolFlags member, and (2) the state of some internal JITDylib
data structures. This patch explicitly tracks symbol states by adding a new
SymbolState member to the symbol table entries, and removing the 'Lazy' and
'Materializing' bits from JITSymbolFlags. This is a first step towards adding
additional states representing initialization phases (e.g. eh-frame registration,
registration with the language runtime, and static initialization).
llvm-svn: 361899
Summary:
scan-build flagged a potential use-after-move in debug builds. It's not
safe that a moved from value contains anything but garbage. Manually
DRY up these repeated expressions.
Reviewers: lhames
Reviewed By: lhames
Subscribers: hiraditya, llvm-commits, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62112
llvm-svn: 361203
SymbolStringPtr used to use nullptr as its empty value and (since it performed
ref-count operations on any non-nullptr) a pointer to a special pool-entry
instance as its tombstone.
This commit changes the scheme to use two invalid pointer values as the empty
and tombstone values, and broadens the ref-count guard to prevent ref-counting
operations from being performed on these pointers. This should improve the
performance of SymbolStringPtrs used in DenseMaps/DenseSets, as ref counting
operations will no longer be performed on the tombstone.
llvm-svn: 360925
This patch changes the return type of sys::Process::getPageSize to
Expected<unsigned> to account for the fact that the underlying syscalls used to
obtain the page size may fail (see below).
For clients who use the page size as an optimization only this patch adds a new
method, getPageSizeEstimate, which calls through to getPageSize but discards
any error returned and substitues a "reasonable" page size estimate estimate
instead. All existing LLVM clients are updated to call getPageSizeEstimate
rather than getPageSize.
On Unix, sys::Process::getPageSize is implemented in terms of getpagesize or
sysconf, depending on which macros are set. The sysconf call is documented to
return -1 on failure. On Darwin getpagesize is implemented in terms of sysconf
and may also fail (though the manpage documentation does not mention this).
These failures have been observed in practice when highly restrictive sandbox
permissions have been applied. Without this patch, the result is that
getPageSize returns -1, which wreaks havoc on any subsequent code that was
assuming a sane page size value.
<rdar://problem/41654857>
Reviewers: dblaikie, echristo
Subscribers: kristina, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59107
llvm-svn: 360221
Clients who want to regain ownership of object buffers after they have been
linked may now use the NotifyEmitted callback for this purpose.
Note: Currently NotifyEmitted is only called if linking succeeds. If linking
fails the buffer is always discarded.
llvm-svn: 359735
Background: A definition generator can be attached to a JITDylib to generate
new definitions in response to queries. For example: a generator that forwards
calls to dlsym can map symbols from a dynamic library into the JIT process on
demand.
If definition generation fails then the generator should be able to return an
error. This allows the JIT API to distinguish between the case where a
generator does not provide a definition, and the case where it was not able to
determine whether it provided a definition due to an error.
The immediate motivation for this is cross-process symbol lookups: If the
remote-lookup generator is attached to a JITDylib early in the search list, and
if a generator failure is misinterpreted as "no definition in this JITDylib" then
lookup may continue and bind to a different definition in a later JITDylib, which
is a bug.
llvm-svn: 359521
LLJITBuilder and LLLazyJITBuilder construct LLJIT and LLLazyJIT instances
respectively. Over time these will allow more configurable options to be
added while remaining easy to use in the default case, which for default
in-process JITing is now:
auto J = ExitOnErr(LLJITBuilder.create());
llvm-svn: 359511
ObjectLinkingLayer::Plugin provides event notifications when objects are loaded,
emitted, and removed. It also provides a modifyPassConfig callback that allows
plugins to modify the JITLink pass configuration.
This patch moves eh-frame registration into its own plugin, and teaches
llvm-jitlink to only add that plugin when performing execution runs on
non-Windows platforms. This should allow us to re-enable the test case that was
removed in r359198.
llvm-svn: 359357
When failing materialization of a symbol X, remove X from the dependants list
of any of X's dependencies. This ensures that when X's dependencies are
emitted (or fail themselves) they do not try to access the no-longer-existing
MaterializationInfo for X.
llvm-svn: 359252
Summary:
JITLink is a jit-linker that performs the same high-level task as RuntimeDyld:
it parses relocatable object files and makes their contents runnable in a target
process.
JITLink aims to improve on RuntimeDyld in several ways:
(1) A clear design intended to maximize code-sharing while minimizing coupling.
RuntimeDyld has been developed in an ad-hoc fashion for a number of years and
this had led to intermingling of code for multiple architectures (e.g. in
RuntimeDyldELF::processRelocationRef) in a way that makes the code more
difficult to read, reason about, extend. JITLink is designed to isolate
format and architecture specific code, while still sharing generic code.
(2) Support for native code models.
RuntimeDyld required the use of large code models (where calls to external
functions are made indirectly via registers) for many of platforms due to its
restrictive model for stub generation (one "stub" per symbol). JITLink allows
arbitrary mutation of the atom graph, allowing both GOT and PLT atoms to be
added naturally.
(3) Native support for asynchronous linking.
JITLink uses asynchronous calls for symbol resolution and finalization: these
callbacks are passed a continuation function that they must call to complete the
linker's work. This allows for cleaner interoperation with the new concurrent
ORC JIT APIs, while still being easily implementable in synchronous style if
asynchrony is not needed.
To maximise sharing, the design has a hierarchy of common code:
(1) Generic atom-graph data structure and algorithms (e.g. dead stripping and
| memory allocation) that are intended to be shared by all architectures.
|
+ -- (2) Shared per-format code that utilizes (1), e.g. Generic MachO to
| atom-graph parsing.
|
+ -- (3) Architecture specific code that uses (1) and (2). E.g.
JITLinkerMachO_x86_64, which adds x86-64 specific relocation
support to (2) to build and patch up the atom graph.
To support asynchronous symbol resolution and finalization, the callbacks for
these operations take continuations as arguments:
using JITLinkAsyncLookupContinuation =
std::function<void(Expected<AsyncLookupResult> LR)>;
using JITLinkAsyncLookupFunction =
std::function<void(const DenseSet<StringRef> &Symbols,
JITLinkAsyncLookupContinuation LookupContinuation)>;
using FinalizeContinuation = std::function<void(Error)>;
virtual void finalizeAsync(FinalizeContinuation OnFinalize);
In addition to its headline features, JITLink also makes other improvements:
- Dead stripping support: symbols that are not used (e.g. redundant ODR
definitions) are discarded, and take up no memory in the target process
(In contrast, RuntimeDyld supported pointer equality for weak definitions,
but the redundant definitions stayed resident in memory).
- Improved exception handling support. JITLink provides a much more extensive
eh-frame parser than RuntimeDyld, and is able to correctly fix up many
eh-frame sections that RuntimeDyld currently (silently) fails on.
- More extensive validation and error handling throughout.
This initial patch supports linking MachO/x86-64 only. Work on support for
other architectures and formats will happen in-tree.
Differential Revision: https://reviews.llvm.org/D58704
llvm-svn: 358818
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57172
llvm-svn: 352911
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57170
llvm-svn: 352909
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This will allow other utilities (including a future RuntimeDyld replacement) to
use these types without pulling in the major Core types (JITDylib, etc.).
llvm-svn: 351138
MIPS `jr` instruction uses a delay-slot. To escape execution of
arbitrary instruction we should either fill the delay-slot by `nop`
instruction or swap `jr` instruction and logically preceding
instruction. This fix implements the second method to generate a bit
more effective code.
llvm-svn: 351001
MIPS ABI states that every function must be called through jalr $t9. In
other words, a function expect that t9 register points to the beginning
of its code. A function uses this register to calculate offset to the
Global Offset Table and save it to the `gp` register.
```
lui $gp, %hi(_gp_disp)
addiu $gp, %lo(_gp_disp)
addu $gp, $gp, $t9
```
If `t9` and as a result `$gp` point to the wrong place the following code
loads incorrect value from GOT and passes control to invalid code.
```
lw $v0,%call16(foo)($gp)
jalr $t9
```
OrcMips32 and OrcMips64 writeResolverCode methods pass control to the
resolved address, but do not setup `$t9` before the call. The `t9` holds
value of the beginning of `resolver` code so any attempts to call
routines via GOT failed.
This change fixes the problem. The `OrcLazy/hidden-visibility.ll` test
starts to pass correctly. Before the change it fails on MIPS because the
`exitOnLazyCallThroughFailure` called from the resolver code could not
call libc routine `exit` via GOT.
Differential Revision: http://reviews.llvm.org/D56058
llvm-svn: 351000
This patch renames both methods (NotifyObjectEmitted -> notifyObjectLoaded, and
NotifyObjectFreed -> notifyObjectFreed), adds an abstract "ObjectKey" (uint64_t)
parameter to notifyObjectLoaded, and replaces the ObjectFile parameter for
notifyObjectFreed with an ObjectKey. Using an ObjectKey to track identify
events, rather than a reference to the ObjectFile, allows us to free the
ObjectFile after notifyObjectLoaded is called, saving memory.
https://reviews.llvm.org/D53773
llvm-svn: 348223
Doesn't build on Windows. The call to 'lookup' is ambiguous. Clang and
MSVC agree, anyway.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/787
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): error C2668: 'llvm::orc::ExecutionSession::lookup': ambiguous call to overloaded function
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(823): note: could be 'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(llvm::ArrayRef<llvm::orc::JITDylib *>,llvm::orc::SymbolStringPtr)'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(817): note: or 'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(const llvm::orc::JITDylibSearchList &,llvm::orc::SymbolStringPtr)'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): note: while trying to match the argument list '(initializer list, llvm::orc::SymbolStringPtr)'
llvm-svn: 345078
In the new scheme the client passes a list of (JITDylib&, bool) pairs, rather
than a list of JITDylibs. For each JITDylib the boolean indicates whether or not
to match against non-exported symbols (true means that they should be found,
false means that they should not). The MatchNonExportedInJD and MatchNonExported
parameters on lookup are removed.
The new scheme is more flexible, and easier to understand.
This patch also updates JITDylib search orders to be lists of (JITDylib&, bool)
pairs to match the new lookup scheme. Error handling is also plumbed through
the LLJIT class to allow regression tests to fail predictably when a lookup from
a lazy call-through fails.
llvm-svn: 345077
Otherwise we can end up with a data-race when linking concurrently.
This should fix an intermittent failure in the multiple-compile-threads-basic.ll
testcase.
llvm-svn: 344956
MaterializationResponsibility.
VModuleKeys are intended to enable selective removal of modules from a JIT
session, however for a wide variety of use cases selective removal is not
needed and introduces unnecessary overhead. As of this commit, the default
constructed VModuleKey value is reserved as a "do not track" value, and
becomes the default when adding a new module to the JIT.
This commit also changes the propagation of VModuleKeys. They were passed
alongside the MaterializationResponsibity instance in XXLayer::emit methods,
but are now propagated as part of the MaterializationResponsibility instance
itself (and as part of MaterializationUnit when stored in a JITDylib).
Associating VModuleKeys with MaterializationUnits in this way should allow
for a thread-safe module removal mechanism in the future, even when a module
is in the process of being compiled, by having the
MaterializationResponsibility object check in on its VModuleKey's state
before commiting its results to the JITDylib.
llvm-svn: 344643
This commit adds a 'Legacy' prefix to old ORC layers and utilities, and removes
the '2' suffix from the new ORC layers. If you wish to continue using the old
ORC layers you will need to add a 'Legacy' prefix to your classes. If you were
already using the new ORC layers you will need to drop the '2' suffix.
The legacy layers will remain in-tree until the new layers reach feature
parity with them. This will involve adding support for removing code from the
new layers, and ensuring that performance is comperable.
llvm-svn: 344572
The new name is a better fit: This class does not actually spawn any new
threads for compilation, it is just safe to call from multiple threads
concurrently.
The "Simple" part of the name did not convey much either, so it was
dropped.
llvm-svn: 344567
Renames:
JITDylib's setFallbackDefinitionGenerator method to setGenerator.
DynamicLibraryFallbackGenerator class to DynamicLibrarySearchGenerator.
ReexportsFallbackDefinitionGenerator to ReexportsGenerator.
llvm-svn: 344489
This adds two arguments to the main ExecutionSession::lookup method:
MatchNonExportedInJD, and MatchNonExported. These control whether and where
hidden symbols should be matched when searching a list of JITDylibs.
A similar effect could have been achieved by filtering search results, but
this would have involved materializing symbol definitions (since materialization
is triggered on lookup) only to throw the results away, among other issues.
llvm-svn: 344467
rather than require them to have been promoted before being passed in.
Dropping this precondition is better for layer composition (CompileOnDemandLayer
was the only one that placed pre-conditions on the modules that could be added).
It also means that the promoted private symbols do not show up in the target
JITDylib's symbol table. Instead, they are confined to the hidden implementation
dylib that contains the actual definitions.
For the 403.gcc testcase this cut down the public symbol table size from ~15,000
symbols to ~4000, substantially reducing symbol dependence tracking costs.
llvm-svn: 344078
Symbols can be removed provided that all are present in the JITDylib and none
are currently in the materializing state. On success all requested symbols are
removed. On failure an error is returned and no symbols are removed.
llvm-svn: 343928
(1) Adds comments for the API.
(2) Removes the setArch method: This is redundant: the setArchStr method on the
triple should be used instead.
(3) Turns EmulatedTLS on by default. This matches EngineBuilder's behavior.
llvm-svn: 343423
CompileOnDemandLayer2 now supports user-supplied partition functions (the
original CompileOnDemandLayer already supported these).
Partition functions are called with the list of requested global values
(i.e. global values that currently have queries waiting on them) and have an
opportunity to select extra global values to materialize at the same time.
Also adds testing infrastructure for the new feature to lli.
llvm-svn: 343396
This makes it available for use in IRTransformLayer2::TransformFunction
instances (since a const MaterializationResponsibility& parameter was
added in r343365).
llvm-svn: 343367
(1) A const accessor for the LLVMContext held by a ThreadSafeContext.
(2) A const accessor for the ThreadSafeModules held by an IRMaterializationUnit.
(3) A const MaterializationResponsibility reference to IRTransformLayer2's
transform function. This makes IRTransformLayer2 useful for JIT debugging
(since it can inspect JIT state through the responsibility argument) as well
as program transformations.
llvm-svn: 343365
(1) Print debugging output under a session lock to avoid garbled messages when
compiling on multiple threads.
(2) Name MaterializationUnits, add an ostream operator for them, and so they can
be easily referenced in debugging output, and have that ostream operator
optionally print code/data/hidden symbols provided by that materialization unit
based on command line options.
llvm-svn: 343323
flag to true in LLJIT when running in multithreaded mode.
The IRLayer::setCloneToNewContextOnEmit method sets a flag within the IRLayer
that causes modules added to that layer to be moved to a new context (by
serializing to/from a memory buffer) when they are emitted. This allows modules
that were all loaded on the same context to be compiled in parallel.
llvm-svn: 343266
one SymbolLinkagePromoter utility.
SymbolLinkagePromoter renames anonymous and private symbols, and bumps all
linkages to at least global/hidden-visibility. Modules whose symbols have been
promoted by this utility can be decomposed into sub-modules without introducing
link errors. This is used by the CompileOnDemandLayer to extract single-function
modules for lazy compilation.
llvm-svn: 343257
Modifies lit to add a 'thread_support' feature that can be used in lit test
REQUIRES clauses. The thread_support flag is set if -DLLVM_ENABLE_THREADS=ON
and unset if -DLLVM_ENABLE_THREADS=OFF. The lit flag is used to disable the
multiple-compile-threads-basic.ll testcase when threading is disabled.
llvm-svn: 343122
This doesn't work well in builds configured with LLVM_ENABLE_THREADS=OFF,
causing the following assert when running
ExecutionEngine/OrcLazy/multiple-compile-threads-basic.ll:
lib/ExecutionEngine/Orc/Core.cpp:1748: Expected<llvm::JITEvaluatedSymbol>
llvm::orc::lookup(const llvm::orc::JITDylibList &, llvm::orc::SymbolStringPtr):
Assertion `ResultMap->size() == 1 && "Unexpected number of results"' failed.
> LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads
> arguments. If this is non-zero then a thread-pool will be created with the
> given number of threads, and compile tasks will be dispatched to the thread
> pool.
>
> To enable testing of this feature, two new flags are added to lli:
>
> (1) -compile-threads=N (N = 0 by default) controls the number of compile threads
> to use.
>
> (2) -thread-entry can be used to execute code on additional threads. For each
> -thread-entry argument supplied (multiple are allowed) a new thread will be
> created and the given symbol called. These additional thread entry points are
> called after static constructors are run, but before main.
llvm-svn: 343099
for lazy compilation, rather than a callback manager.
The new mechanism does not block compile threads, and does not require
function bodies to be renamed.
Future modifications should allow laziness on a per-module basis to work
without any modification of the input module.
llvm-svn: 343065
implementation as lazy compile callbacks, and a "lazy re-exports" utility that
builds lazy call-throughs.
Lazy call-throughs are similar to lazy compile callbacks (and are based on the
same underlying state saving/restoring trampolines) but resolve their targets
by performing a standard ORC lookup rather than invoking a user supplied
compiler callback. This allows them to inherit the thread-safety of ORC lookups
while blocking only the calling thread (whereas compile callbacks also block one
compile thread).
Lazy re-exports provide a simple way of building lazy call-throughs. Unlike a
regular re-export, a lazy re-export generates a new address (a stub entry point)
that will act like the re-exported symbol when called. The first call via a
lazy re-export will trigger compilation of the re-exported symbol before calling
through to it.
llvm-svn: 343061
This will allow trampoline pools to be re-used for a new lazy-reexport utility
that generates looks up function bodies using the standard symbol lookup process
(rather than using a user provided compile function). This new utility provides
the same capabilities (since MaterializationUnits already allow user supplied
compile functions to be run) as JITCompileCallbackManager, but can use the new
asynchronous lookup functions to avoid blocking a compile thread.
This patch also updates createLocalCompileCallbackManager to return an error if
a callback manager can not be created, and updates clients of that API to
account for the change. Finally, the OrcCBindingsStack is updates so that if
a callback manager is not available for the target platform a valid stack
(without support for lazy compilation) can still be constructed.
llvm-svn: 343059
LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads
arguments. If this is non-zero then a thread-pool will be created with the
given number of threads, and compile tasks will be dispatched to the thread
pool.
To enable testing of this feature, two new flags are added to lli:
(1) -compile-threads=N (N = 0 by default) controls the number of compile threads
to use.
(2) -thread-entry can be used to execute code on additional threads. For each
-thread-entry argument supplied (multiple are allowed) a new thread will be
created and the given symbol called. These additional thread entry points are
called after static constructors are run, but before main.
llvm-svn: 343058
compilation of IR in the JIT.
ThreadSafeContext is a pair of an LLVMContext and a mutex that can be used to
lock that context when it needs to be accessed from multiple threads.
ThreadSafeModule is a pair of a unique_ptr<Module> and a
shared_ptr<ThreadSafeContext>. This allows the lifetime of a ThreadSafeContext
to be managed automatically in terms of the ThreadSafeModules that refer to it:
Once all modules using a ThreadSafeContext are destructed, and providing the
client has not held on to a copy of shared context pointer, the context will be
automatically destructed.
This scheme is necessary due to the following constraits: (1) We need multiple
contexts for multithreaded compilation (at least one per compile thread plus
one to store any IR not currently being compiled, though one context per module
is simpler). (2) We need to free contexts that are no longer being used so that
the JIT does not leak memory over time. (3) Module lifetimes are not
predictable (modules are compiled as needed depending on the flow of JIT'd
code) so there is no single point where contexts could be reclaimed.
JIT clients not using concurrency can safely use one ThreadSafeContext for all
ThreadSafeModules.
JIT clients who want to be able to compile concurrently should use a different
ThreadSafeContext for each module, or call setCloneToNewContextOnEmit on their
top-level IRLayer. The former reduces compile latency (since no clone step is
needed) at the cost of additional memory overhead for uncompiled modules (as
every uncompiled module will duplicate the LLVM types, constants and metadata
that have been shared).
llvm-svn: 343055
switch RTDyldObjectLinkingLayer2 to use it.
RuntimeDyld::loadObject is currently a blocking operation. This means that any
JIT'd code whose call-graph contains an embedded complete K graph will require
at least K threads to link, which precludes the use of a fixed sized thread
pool for concurrent JITing of arbitrary code (whatever K the thread-pool is set
at, any code with a K+1 complete subgraph will deadlock at JIT-link time).
To address this issue, this commmit introduces a function called jitLinkForORC
that uses continuation-passing style to pass the fix-up and finalization steps
to the asynchronous symbol resolver interface so that linking can be performed
without blocking.
llvm-svn: 343043
This reverts commit r342939.
MSVC's promise/future implementation does not like types that are not default
constructible. Reverting while I figure out a solution.
llvm-svn: 342941