llvm-project

Commit Graph

Author	SHA1	Message	Date
Lang Hames	b8e5f91816	[ORC] Flesh out ExecutorAddress, rename CommonOrcRuntimeTypes header. Renames CommonOrcRuntimeTypes.h to ExecutorAddress.h and moves ExecutorAddress into the 'orc' namespace (rather than orc::shared). Also makes ExecutorAddress a class, adds an ExecutorAddrDiff type and some arithmetic operations on the pair (subtracting two addresses yields an addrdiff, adding an addrdiff and an address yields an address).	2021-07-10 13:53:52 +10:00
Lang Hames	963378bd82	[ORC] Improve computeLocalDeps / computeNamedSymbolDependencies performance. The computeNamedSymbolDependencies and computeLocalDeps methods on ObjectLinkingLayerJITLinkContext are responsible for computing, for each symbol in the current MaterializationResponsibility, the set of non-locally-scoped symbols that are depended on. To calculate this we have to consider the effect of chains of dependence through locally scoped symbols in the LinkGraph. E.g. .text .globl foo foo: callq bar ## foo depends on external 'bar' movq Ltmp1(%rip), %rcx ## foo depends on locally scoped 'Ltmp1' addl (%rcx), %eax retq .data Ltmp1: .quad x ## Ltmp1 depends on external 'x' In this example symbol 'foo' depends directly on 'bar', and indirectly on 'x' via 'Ltmp1', which is locally scoped. Performance of the existing implementations appears to have been mediocre: Based on flame graphs posted by @drmeister (in #jit on the LLVM discord server) the computeLocalDeps function was taking up a substantial amount of time when starting up Clasp (https://github.com/clasp-developers/clasp). This commit attempts to address the performance problems in three ways: 1. Using jitlink::Blocks instead of jitlink::Symbols as the nodes of the dependencies-introduced-by-locally-scoped-symbols graph. Using either Blocks or Symbols as nodes provides the same information, but since there may be more than one locally scoped symbol per block the block-based version of the dependence graph should always be a subgraph of the Symbol-based version, and so faster to operate on. 2. Improved worklist management. The older version of computeLocalDeps used a fixed worklist containing all nodes, and iterated over this list propagating dependencies until no further changes were required. The worklist was not sorted into a useful order before the loop started. The new version uses a variable work-stack, visiting nodes in DFS order and only adding nodes when there is meaningful work to do on them.
Compared to the old version the new version avoids revisiting nodes which haven't changed, and I suspect it converges more quickly (due to the DFS ordering). 3. Laziness and caching. Mappings of... jitlink::Symbol* -> Interned Name (as SymbolStringPtr) jitlink::Block* -> Immediate dependencies (as SymbolNameSet) jitlink::Block* -> Transitive dependencies (as SymbolNameSet) are all built lazily and cached while running computeNamedSymbolDependencies. According to @drmeister these changes reduced Clasp startup time in his test setup (averaged over a handful of starts) from 4.8 to 2.8 seconds (with ORC/JITLink linking ~11,000 object files in that time), which seems like enough to justify switching to the new algorithm in the absence of any other perf numbers.	2021-07-08 16:31:59 +10:00
Lang Hames	5471766f9d	[ORC] Replace MachOJITDylibInitializers::SectionExtent with ExecutorAddressRange MachOJITDylibInitializers::SectionExtent represented the address range of a section as an (address, size) pair. The new ExecutorAddressRange type generalizes this to an address range (for any object, not necessarily a section) represented as a (start-address, end-address) pair. The aim is to express more of ORC (and the ORC runtime) in terms of simple types that can be serialized/deserialized via SPS. This will simplify SPS-based RPC involving arguments/return-values of these types.	2021-07-08 14:15:44 +10:00
Lang Hames	425b908301	[ORC] Rename SPSTargetAddress to SPSExecutorAddress. Also removes SPSTagTargetAddress, which was accidentally introduced at some point (and never used).	2021-07-02 12:40:14 +10:00
Valentin Churavy	45e8a0befb	[Orc] Add CBindings for LazyReexports Add C bindings and an example for LLJIT with lazy reexports Differential Revision: https://reviews.llvm.org/D104672	2021-07-01 21:52:05 +02:00
Lang Hames	39f64c4c83	[ORC] Add wrapper-function support methods to ExecutorProcessControl. Adds support for both synchronous and asynchronous calls to wrapper functions using SPS (Simple Packed Serialization). Also adds support for wrapping functions on the JIT side in SPS-based wrappers that can be called from the executor. These new methods simplify calls between the JIT and Executor, and will be used in upcoming ORC runtime patches to enable communication between ORC and the runtime.	2021-07-01 18:21:49 +10:00
Lang Hames	662c55442f	[ORC] Rename TargetProcessControl to ExecutorProcessControl. NFC. This is a first step towards consistently using the term 'executor' for the process that executes JIT'd code. I've opted for 'executor' as the preferred term over 'target' as target is already heavily overloaded ("the target machine for the executor" is much clearer than "the target machine for the target").	2021-07-01 13:31:12 +10:00
Valentin Churavy	69e0f790e0	[Orc] Fix name of LLVMOrcIRTransformLayerSetTransform In https://reviews.llvm.org/D103855 we added access to IRTransformLayer, but I just noticed that the function name is following the wrong pattern. Differential Revision: https://reviews.llvm.org/D104840	2021-06-30 21:43:34 +02:00
Eugene Zhulenev	e88ac7295f	[perf] Fix a data race in the PerfJITEventListener Concurrent JIT compilation + PerfJITEventListener triggers tsan error Reviewed By: cota Differential Revision: https://reviews.llvm.org/D104977	2021-06-29 08:30:31 -07:00
Lang Hames	8e66fc4384	[JITLink][ELF] Move ELF section and symbol parsing into ELFLinkGraphBuilder. Move architecture independent ELF parsing/graph-building code from ELFLinkGraphBuilder_x86_64 to the ELFLinkGraphBuilder base class template.	2021-06-29 09:59:49 +10:00
Lang Hames	aff57ff24a	[JITLink][ELF] Add generic ELFLinkGraphBuilder template. ELFLinkGraphBuilder<ELFT> will hold generic parsing and LinkGraph-building code that can be shared between JITLink ELF backends for different architectures. For now it's just a stub. The plan is to incrementally move functionality down from ELFLinkGraphBuilder_x86_64 into the new template.	2021-06-26 21:37:33 +10:00
Lang Hames	80f30a6b85	[ORC][C-bindings] Add access to LLJIT IRTransformLayer, ThreadSafeModule utils. This patch was derived from Valentin Churavy's work in https://reviews.llvm.org/D104480. It adds support for setting the transform on an IRTransformLayer, and for accessing the IRTransformLayer in LLJIT. It also adds access to the ThreadSafeModule::withModuleDo method for thread-safe access to modules. A new example has been added to show how to use these APIs to optimize a module during materialization. Thanks Valentin! Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D103855	2021-06-19 11:50:27 +10:00
Lang Hames	8962c68ad0	[ORC][C-bindings] Re-order object transform function arguments. ObjInOut is an in-out parameter not a return value argument, so by convention it should come after the context value (Ctx).	2021-06-18 22:12:39 +10:00
Lang Hames	cec8e69f01	[ORC] Add support for dumping objects to the C API. Provides ObjectTransformLayer APIs, a getter to access the ObjectTransformLayer member of LLJIT, and the DumpObjects utility to make construction of a dump-to-disk transform easy. An example showing how the new APIs can be used has been added in llvm/examples/OrcV2Examples/OrcV2CBindingsDumpObjects.	2021-06-18 20:56:45 +10:00
Lang Hames	838490de7e	[ORC] Switch from uint8_t to char buffers for TargetProcessControl::runWrapper. This matches WrapperFunctionResult's char buffer, cutting down on the number of pointer casts needed.	2021-06-17 13:27:09 +10:00
Lang Hames	834616146b	[ORC] Switch to WrapperFunction utility for calls to registration functions. Addresses FIXMEs in TPC-based EH-frame and debug object registration code by replacing manual argument serialization with WrapperFunction utility calls.	2021-06-16 18:05:58 +10:00
Lang Hames	89fa1a3a83	[ORC] Fix endianness in manual serialization to match WrapperFunctionUtils.	2021-06-15 21:51:52 +10:00
Lang Hames	4eb9fe2e1a	[ORC] Port WrapperFunctionUtils and SimplePackedSerialization from ORC runtime. Replace the existing WrapperFunctionResult type in llvm/include/ExecutionEngine/Orc/Shared/TargetProcessControlTypes.h with a version adapted from the ORC runtime's implementation. Also introduce the SimplePackedSerialization scheme (also adapted from the ORC runtime's implementation) for wrapper functions to avoid manual serialization and deserialization for calls to runtime functions involving common types.	2021-06-15 21:13:57 +10:00
Lang Hames	82f8aef3de	[JITLink][MachO] Handle multiple symbols at same offset when splitting C-strings. The C-string section splitting support added in `f9649d123d` triggered an assert ("Duplicate canonical symbol at address") when multiple symbols were defined at the same offset within a C-string block (this triggered on arm64, where we always add a block start symbol). The bug was caused by a failure to update the record of the last canonical symbol address. The fix was to maintain this record correctly, and move the auto-generation of the block-start symbol above the handling for symbols defined in the object itself so that all symbols (auto-generated and defined) are processed in address order.	2021-06-09 19:16:49 +10:00
Lang Hames	f9649d123d	[JITLink][MachO] Split C-string literal sections on null-terminators. MachO C-string literal sections should be split on null-terminator boundaries, rather than the usual symbol boundaries. This patch updates MachOLinkGraphBuilder to do that.	2021-06-09 10:19:27 +10:00
Simon Pilgrim	52396577a2	Use llvm_unreachable for unsupported integer types. As suggested on rG937c4cffd024, use llvm_unreachable for unhandled integer types (which shouldn't be possible) instead of breaking and dropping down to the existing fatal error handler. Helps silence static analyzer warnings.	2021-06-08 17:59:05 +01:00
Simon Pilgrim	937c4cffd0	Fix implicit fall through compiler warning. NFCI.	2021-06-06 13:45:11 +01:00
Arthur Eubanks	372237487e	[OpaquePtr] Remove some uses of PointerType::getElementType()	2021-05-31 16:11:25 -07:00
Lang Hames	249cd9dd60	[JITLink][MachO][arm64] Build GOT entries for defined symbols too. During the generic x86-64 support refactor in `ecf6466f01` the implementation of MachO_arm64_GOTAndStubsBuilder::isGOTEdgeToFix was altered to only return true for external symbols. This behavior is incorrect: GOT entries may be required for defined symbols (e.g. in the large code model). This patch fixes the bug and adds a test case for it (renaming an old test case to avoid any ambiguity).	2021-05-25 12:19:09 -07:00
Yonghong Song	6a2ea84600	BPF: Add more relocation kinds Currently, BPF only contains three relocations: R_BPF_NONE for no relocation R_BPF_64_64 for LD_imm64 and normal 64-bit data relocation R_BPF_64_32 for call insn and normal 32-bit data relocation Also .BTF and .BTF.ext sections contain symbols in allocated program and data sections. These two sections reserved 32bit space to hold the offset relative to the symbol's section. When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld may attempt to resolve relocations for .BTF and .BTF.ext, which we want to prevent. So we used R_BPF_NONE for such relocations. This all works fine until when we try to do linking of multiple objects. . R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data is different, so lld target->relocate() needs more context to do a correct job. . The same for R_BPF_64_32. More context is needed for lld target->relocate() to differentiate call insn vs. normal 32-bit data relocation. . Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE, they will not be relocated properly when multiple .BTF/.BTF.ext sections are merged by lld. This patch intends to address this issue by adding additional relocation kinds: R_BPF_64_ABS64 for normal 64-bit data relocation R_BPF_64_ABS32 for normal 32-bit data relocation R_BPF_64_NODYLD32 for .BTF and .BTF.ext style relocations. The old R_BPF_64_{64,32} semantics: R_BPF_64_64 for LD_imm64 relocation R_BPF_64_32 for call insn relocation The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values is maintained. They are the most common use cases for bpf programs and we want to maintain backward compatibility as much as possible. ExecutionEngine RuntimeDyld BPF relocations are adjusted as well. R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and other relocations will be ignored. Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in RuntimeDyldELF.cpp will result in "Relocation type not implemented yet!" fatal error. 
FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp are removed as they are not triggered in BPF backend. BPF backend used FK_SecRel_8 for LD_imm64 instruction operands. Differential Revision: https://reviews.llvm.org/D102712	2021-05-25 08:19:13 -07:00
Simon Pilgrim	ed14062be0	Fix MSVC "truncation of constant value" warning. NFCI.	2021-05-25 11:35:57 +01:00
Lang Hames	82ad2b6e94	[JITLink] Enable creation and management of mutable block content. This patch introduces new operations on jitlink::Blocks: setMutableContent, getMutableContent and getAlreadyMutableContent. The setMutableContent method will set the block content data and size members and flag the content as mutable. The getMutableContent method will return a mutable copy of the existing content value, auto-allocating and populating a new mutable copy if the existing content is marked immutable. The getAlreadyMutableContent method asserts that the existing content is already mutable and returns it. setMutableContent should be used when updating the block with totally new content backed by mutable memory. It can be used to change the size of the block. The argument value should not be shared with any other block. getMutableContent should be used when clients want to modify the existing content and are unsure whether it is mutable yet. getAlreadyMutableContent should be used when clients want to modify the existing content and know from context that it must already be mutable. These operations reduce copy-modify-update boilerplate and unnecessary copies introduced when clients couldn't be sure whether the existing content was mutable or not.	2021-05-24 22:09:36 -07:00
Lang Hames	20634ece15	[ORC] Fix debugging output: printDescription should not have a newline.	2021-05-21 21:11:54 -07:00
Lang Hames	40df1b15b4	[ORC][C-bindings] Replace LLVMOrcJITTargetMachineBuilderDisposeTargetTriple. The implementation and intent behind freeing the triple string here is the same as LLVMGetDefaultTargetTriple (and any other owned c string returned from the C API), so we should use LLVMDisposeMessage to free the string for consistency. Patch by Mats Larsen -- thanks Mats! Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D102957	2021-05-21 17:38:06 -07:00
Rafael Auler	a33687ec58	[RuntimeDyld] Add allowStubs/allowZeroSyms This patch introduces functionality used by BOLT when re-linking the final binary. It adds to MemoryManager a new member function allowStubAllocation to control whether this MemoryManager supports increasing code size with stubs or not. Since BOLT can rewrite some files in-place, it needs to avoid stub insertion done by the linker. This patch also introduces allowsZeroSymbols to the JITSymbolResolver class, enabling us to finish a link successfully even when some symbols resolve to the value zero. When rewriting a binary, sometimes we do need to resolve a target to zero in case the input binary calls address zero and we want to be bug compatible. We also expose reassignSectionAddress as it is used by BOLT. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D97898	2021-05-18 11:35:27 -07:00
Lang Hames	c42580bf20	[ORC] Don't try to obtain a ref to a non-existent buffer.	2021-05-18 08:44:15 -07:00
Lang Hames	d1a7630369	[JITLink] Fix symbol comparator in LinkGraph::dump. The existing implementation did not provide a strict weak ordering.	2021-05-16 10:11:58 -07:00
Alexey Bader	444f02d73c	New tag for ittapi - fix an error related to cross-compiling ITTAPI in LLVM with mingw Fix was implemented in the ittapi repo to solve an error about cross-compiling ITTAPI in LLVM with mingw. The problem occurred in the cross-compilation environment for Julia's dependencies. The corresponding issue item in ittapi repo: https://github.com/intel/ittapi/issues/19 A new tag was created in ittapi repo for that fix. This patch contains changes to update the ittapi tag in LLVM. Reviewed By: bader Differential Revision: https://reviews.llvm.org/D102471	2021-05-14 08:18:49 +03:00
Lang Hames	0fda4c4745	[ORC] Add support for adding LinkGraphs directly to ObjectLinkingLayer. This is separate from (but builds on) the support added in `ec6b71df70` for emitting LinkGraphs in the context of an active materialization. This commit makes LinkGraphs a first-class data structure with features equivalent to object files within ObjectLinkingLayer.	2021-05-13 21:44:13 -07:00
Lang Hames	d63860a052	[JITLink] Fix bogus format string.	2021-05-11 16:04:00 -07:00
Lang Hames	a0162a81b1	[JITLink][MachO/x86_64] Expose API for creating eh-frame fixing passes. These can be used to create eh-frame section fixing passes outside the usual linker pipeline, which can be useful for tests and tools that just want to verify or dump graphs.	2021-05-11 15:26:16 -07:00
Lang Hames	74a96b4c98	[JITLink][x86-64] Add an x86_64 PointerSize constexpr. This can be used in place of magic '8' values in generic x86-64 utilities.	2021-05-11 15:26:15 -07:00
Lang Hames	cbcfca343f	[JITLink] Make LinkGraph debug dumps more readable. This commit reorders some fields and fixes the width of others to try to maintain more consistent columns. It also switches to long-hand scope and linkage names, since LinkGraph dumps aren't read often enough for single-character codes to be memorable.	2021-05-11 15:26:15 -07:00
Lang Hames	7f9a89f9a2	[ORC] Use the new dispatchTask API to run query callbacks. Dispatching query callbacks, rather than running them on the current thread, will allow them to be distributed across multiple threads.	2021-05-09 19:19:40 -07:00
Lang Hames	5344c88dcb	[ORC] Generalize materialization dispatch to task dispatch. Generalizing this API allows work to be distributed more evenly. In particular, query callbacks can now be dispatched (rather than running immediately on the thread that satisfied the query). This avoids the pathological case where an operation on one thread satisfies many queries simultaneously, causing large amounts of work to be run on that thread while other threads potentially sit idle.	2021-05-09 19:19:39 -07:00
Lang Hames	7b73cd684a	[ORC] Introduce C API for adding object buffers directly to an object layer. This can be useful for clients constructing custom JIT stacks: If the C API for your custom stack exposes API to obtain a reference to an object layer (e.g. LLVMOrcLLJITGetObjLinkingLayer) then the newly added LLVMOrcObjectLayerAddObjectFile and LLVMOrcObjectLayerAddObjectFileWithRT functions can be used to add objects directly to that layer.	2021-05-05 19:02:13 -07:00
David Stuttard	417b1164c2	[JITLink] Minor fix to avoid Windows compiler warning for static-cast Change-Id: Id0c1d5535b53e2aebe314151c0efa585e763f3f6 Differential Revision: https://reviews.llvm.org/D100093	2021-04-30 11:08:05 +01:00
Lang Hames	aaf026d9da	[ORC] JITDylib::addDependencies should be run under the session lock.	2021-04-29 14:09:40 -07:00
Lang Hames	a702fa2a04	[ORC] Make LLVMOrcLLJITBuilderSetJITTargetMachineBuilder consume as advertised. This should fix some of the memory leaks seen in the ORC C API test case.	2021-04-26 22:26:38 -07:00
Lang Hames	d122d80b3d	Reapply "[ORC] Add unit tests for parts of the ..." with fixes and improvements. This reapplies `8740360093`, which was reverted in `bbddadd46e` due to buildbot errors. This version checks that a JIT instance can be safely constructed, skipping tests if it can not be. To enable this it introduces new C API to retrieve and set the target triple for a JITTargetMachineBuilder.	2021-04-26 20:44:40 -07:00
Lang Hames	c8fc5e3ba9	[ORC] C API updates. Adds support for creating custom MaterializationUnits in the C API with the new LLVMOrcCreateCustomMaterializationUnit function. Modifies ownership rules for LLVMOrcAbsoluteSymbols to make it consistent with LLVMOrcCreateCustomMaterializationUnit. This is an ABI breaking change for any clients of the LLVMOrcAbsoluteSymbols API. Adds LLVMOrcLLJITGetObjLinkingLayer and LLVMOrcObjectLayerEmit functions to allow clients to get a reference to an LLJIT instance's linking layer, then emit an object file using it. This can be used to support construction of custom materialization units in the common case where those units will generate an object file that needs to be emitted to complete the materialization.	2021-04-26 13:58:37 -07:00
Moritz Sichert	10038d0b3d	[RuntimeDyld] Fixed buffer overflows with absolute symbols Differential Revision: https://reviews.llvm.org/D95596	2021-04-26 19:24:03 +02:00
Lang Hames	c1baf946e6	[ORC] Avoid invalidating iterators in EHFrameRegistrationPlugin. In EHFrameRegistrationPlugin::notifyTransferringResources, if SrcKey had eh-frames associated but DstKey did not, we would create a new entry for DstKey, invalidating the iterator for SrcKey in the process. This commit fixes that by removing SrcKey first in this case.	2021-04-25 16:55:19 -07:00
Philip Reames	4824d876f0	Revert "Allow invokable sub-classes of IntrinsicInst" This reverts commit `d87b9b81cc`. Post commit review raised concerns, reverting while discussion happens.	2021-04-20 15:38:38 -07:00
Philip Reames	d87b9b81cc	Allow invokable sub-classes of IntrinsicInst It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked. This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke. After this lands and has baked for a couple days, planned cleanups: Make GCStatepointInst an IntrinsicInst subclass. Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor. Do the same in SelectionDAG. Do the same in FastISEL. Differential Revision: https://reviews.llvm.org/D99976	2021-04-20 15:03:49 -07:00
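
Commit 963378bd82 above describes replacing a fixed worklist with a work-stack that only revisits nodes whose dependence sets changed. The following is a minimal sketch of that idea, not the code from the commit: BlockNode, Users and propagateDeps are hypothetical names, plain std::string stands in for interned symbol names, and the lazy caching described in the commit is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a jitlink::Block in the dependence graph.
struct BlockNode {
  // Starts as the block's immediate external dependencies and grows to the
  // transitive set as propagation runs.
  std::set<std::string> Deps;
  // Indices of blocks that reference this block through a locally scoped
  // symbol, i.e. blocks that must inherit this block's dependencies.
  std::vector<std::size_t> Users;
};

// Propagate dependencies along local edges until nothing changes. A block is
// pushed onto the work-stack only when its set has something new to offer.
void propagateDeps(std::vector<BlockNode> &Blocks) {
  std::vector<std::size_t> WorkStack;
  for (std::size_t I = 0; I != Blocks.size(); ++I)
    if (!Blocks[I].Deps.empty())
      WorkStack.push_back(I);

  while (!WorkStack.empty()) {
    std::size_t Cur = WorkStack.back();
    WorkStack.pop_back();
    for (std::size_t U : Blocks[Cur].Users) {
      if (U == Cur)
        continue; // A self-reference adds nothing new.
      std::size_t Before = Blocks[U].Deps.size();
      Blocks[U].Deps.insert(Blocks[Cur].Deps.begin(), Blocks[Cur].Deps.end());
      if (Blocks[U].Deps.size() != Before)
        WorkStack.push_back(U); // U changed, so its users need another look.
    }
  }
}
```

For the assembly example in that commit message, the block holding 'foo' would start with {bar}, the block holding 'Ltmp1' would start with {x} and list the 'foo' block as a user, and propagation would leave the 'foo' block with {bar, x}. Because the sets only grow and are bounded by the number of external names, the loop terminates even when local references form a cycle.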


2920 Commits
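
Commits b8e5f91816 and 5471766f9d above describe address arithmetic (address minus address yields a difference, address plus difference yields an address) and a range expressed as a (start-address, end-address) pair rather than (address, size). The sketch below illustrates those rules with simplified stand-in types. Addr, AddrDiff, AddrRange and rangeFromExtent are hypothetical names chosen for illustration and are not the declarations in ExecutorAddress.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an address difference.
class AddrDiff {
public:
  constexpr explicit AddrDiff(uint64_t Value = 0) : Value(Value) {}
  constexpr uint64_t getValue() const { return Value; }

private:
  uint64_t Value;
};

// Hypothetical stand-in for an address in the executor process.
class Addr {
public:
  constexpr explicit Addr(uint64_t Value = 0) : Value(Value) {}
  constexpr uint64_t getValue() const { return Value; }

  friend constexpr bool operator==(Addr L, Addr R) { return L.Value == R.Value; }
  friend constexpr bool operator<(Addr L, Addr R) { return L.Value < R.Value; }

private:
  uint64_t Value;
};

// Subtracting two addresses yields a difference.
constexpr AddrDiff operator-(Addr L, Addr R) {
  return AddrDiff(L.getValue() - R.getValue());
}

// Adding a difference to an address (in either order) yields an address.
constexpr Addr operator+(Addr A, AddrDiff D) {
  return Addr(A.getValue() + D.getValue());
}
constexpr Addr operator+(AddrDiff D, Addr A) { return A + D; }

// A half-open [Start, End) range of addresses.
struct AddrRange {
  Addr Start;
  Addr End;

  constexpr AddrDiff size() const { return End - Start; }
  constexpr bool contains(Addr A) const { return !(A < Start) && A < End; }
};

// Convert an (address, size) extent into a (start, end) range.
constexpr AddrRange rangeFromExtent(Addr Start, uint64_t Size) {
  return {Start, Start + AddrDiff(Size)};
}
```

Keeping the two types distinct means an expression such as adding two addresses together does not compile, which is one plausible motivation for separating them. Treating the range as half-open is an assumption of this sketch.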