RuntimeDyld does not support RISC-V, so it makes sense to enable
JITLink by default. This also makes relocations work without support
for a large code model.
Differential Revision: https://reviews.llvm.org/D129092
Define atexit symbol in GenericLLVMIRPlatformSupport so that it doesn't need to be defined by user.
On windows, llvm codegen emits atexit runtime calls to support global deinitializers as there is no lower function like cxa_atexit as in Itanium C++ ABI. ORC JIT user had to define custom atexit symbol manually. This was a hassle as it has to deal with dso_handle and cxa_atexit internals of LLJIT. If client didn't provide atexit definition, the default behaviour is just linking with host atexit function which is destined to fail as it calls dtors when the host program exits. This is after jit instances and buffers are freed, so users would see weird access violation exception from the uknown location. (in console application, the debugger thinks exception happened in scrt_common_main_seh)
This is a hack that has some caveats. (e.g. memory address is not identical) But, it's better than the situation described in the above. Ultimately, we will move on to ORC runtime that is able to solve the memory address issue properly.
Reviewed By: sgraenitz
Differential Revision: https://reviews.llvm.org/D128037
Running iwyu-diff on LLVM codebase since fb67d683db detected a few
regressions, fixing them.
The impact on preprocessed output is negligible: -4k lines.
[JITLink][Orc] Add MemoryMapper interface with InProcess implementation
MemoryMapper class takes care of cross-process and in-process address space
reservation, mapping, transferring content and applying protections.
Implementations of this class can support different ways to do this such
as using shared memory, transferring memory contents over EPC or just
mapping memory in the same process (InProcessMemoryMapper).
The original patch landed with commit 6ede652050
It was reverted temporarily in commit 6a4056ab2a
Reviewed By: sgraenitz, lhames
Differential Revision: https://reviews.llvm.org/D127491
MemoryMapper class takes care of cross-process and in-process address space
reservation, mapping, transferring content and applying protections.
Implementations of this class can support different ways to do this such
as using shared memory, transferring memory contents over EPC or just
mapping memory in the same process (InProcessMemoryMapper).
Reviewed By: sgraenitz, lhames
Differential Revision: https://reviews.llvm.org/D127491
Logs enum name of unsupported relocation type. This also changes elf/x86 to use common util function (getELFRelocationTypeName) inside llvm object module.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127715
Implements R_AARCH64_MOVW_UABS_G*_NC fixup edges. These relocation entries can be generated when code is compiled without a PIC flag. With this patch, clang-repl can printf Hello World with ObjectLinkerLayer on aarch64 linux.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127585
Implements MoveWide16 generic edge kind that can be used to patch MOVZ/MOVK (imm16) instructions.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127584
Lift fixup functions from aarch64.cpp to aarch64.h so that they have better chance of getting inlined. Also, adds some comments documenting the purpose of functions.
Reviewed By: sgraenitz
Differential Revision: https://reviews.llvm.org/D127559
Unifies GOT/PLT table managers of ELF and MachO on aarch64 architecture. Additionally, it migrates table managers from PerGraphGOTAndPLTStubsBuilder to generic crtp TableManager.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127558
Slow definition generators may suspend lookups to temporarily release the
session lock, allowing unrelated lookups to proceed.
Using this functionality is discouraged: it is best to make definition
generation fast, rather than suspending the lookup. As a last resort where
this is not possible, suspension may be used.
An API to wrap ExecutionSession::lookup, this allows C API clients to use async
lookup.
The immediate motivation for adding this is to simplify upcoming
definition-generator unit tests.
As we're adding more tests that need to convert between C and C++ flag values
this commit adds helper functions to support this. This patch also updates the
CAPIDefinitionGenerator to use these new utilities.
Previously, omitting unnecessary DWARF unwinds was only done in two
cases:
* For Darwin + aarch64, if no DWARF unwind info is needed for all the
functions in a TU, then the `__eh_frame` section would be omitted
entirely. If any one function needed DWARF unwind, then MC would emit
DWARF unwind entries for all the functions in the TU.
* For watchOS, MC would omit DWARF unwind on a per-function basis, as
long as compact unwind was available for that function.
This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
`--emit-dwarf-unwind=` which can toggle between `always`,
`no-compact-unwind` (only emit DWARF when CU cannot be emitted for a
given function), and the target platform `default`. `no-compact-unwind`
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.
**Motivation:** I'm working on adding support for `__eh_frame` to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.
**Addendum:** It turns out that there was a latent bug in the X86
backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.
For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK). This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
`UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there
is no impact to the final binary size.
Reviewed By: davide, lhames
Differential Revision: https://reviews.llvm.org/D122258
Implements eh frame handling by using generic EHFrame passes. The c++ exception handling works correctly with this change.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127063
Removes CodeAlignmentFactor and DataAlignmentFactor validation in EHFrameEdgeFixer. I observed some of aarch64 elf files generated by clang contains CIE record with code_alignment_factor = 4 or data_alignment_factor = -8. code_alignment_factor and data_alignment_factor are used by call fram instruction that should be correctled handled by libunwind.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127062
ELF-based platforms currently support defining multiple static
initializer table sections with differing priorities, for example
.init_array.0 or .init_array.100; the default .init_array corresponds
to a priority of 65535. When building a shared library or executable,
the system linker normally sorts these sections and combines them into
a single .init_array section. This change adds the capability to
recognize ELF static initializers with priorities other than the
default, and to properly sort them by priority, to Orc and the Orc
runtime.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127056
This change enables integrating orc::LLJIT with the ORCv2
platforms (MachOPlatform and ELFNixPlatform) and the compiler-rt orc
runtime. Changes include:
- Adding SPS wrapper functions for the orc runtime's dlfcn emulation
functions, allowing initialization and deinitialization to be invoked
by LLJIT.
- Changing the LLJIT code generation default to add UseInitArray so
that .init_array constructors are generated for ELF platforms.
- Integrating the ORCv2 Platforms into lli, and adding a
PlatformSupport implementation to the LLJIT instance used by lli which
implements initialization and deinitialization by calling the new
wrapper functions in the runtime.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D126492
Implements Procedure Linkage Table (PLT) for ELF/AARCH64. The aarch64 linux calling convention also uses r16 as the intra-procedure-call scratch register same as MachO/ARM64. We can use the same stub sequence for this reason.
Also, BR regiseter doesn't touch X30 register. External function call by BL instruction (touched by CALL26 relocation) will set X30 to the original PC + 4, which is the intended behavior. External function call by B instruction (touched by JUMP26 relocation) doesn't requite to set X30, so the patch will be correct in this case too.
Reference: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#611general-purpose-registers
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127061
Adds the aarch64 support in ELFNixPlatform. These are few simple changes, but it allows us to use the orc runtime in ELF/AARCH64 backend. It succesfully run the static initializers of stdlibc++ iostream so that "cout << Hello world" testcase starts to work.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127060
Implements R_AARCH64_JUMP26. We can use the same generic aarch64 Branch26 edge since B instruction and BL nstruction have the same sized&offseted immediate field, and the relocation address calculation is the same.
Reference: ELF for the ARM ® 64-bit Architecture Tabel 4-10, ARM Architecture Reference Manual ® ARMv8, for ARMv8-A architecture profile C6.2.24, C6.2.31
Reviewed By: sgraenitz
Differential Revision: https://reviews.llvm.org/D127059
This patch implements R_AARCH64_PREL64 and R_AARCH64_PREL32 relocations that is
used in eh frame pointers. The test case utlizes obj2yaml tool to create an
artifical eh frame that generates related relocation types.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127058
This patch implements two most commonly used Global Offset Table relocations in
ELF/AARCH64: R_AARCH64_ADR_GOT_PAGE and R_AARCH64_LD64_GOT_LO12_NC. It
implements the GOT table manager by extending the existing
PerGraphGOTAndPLTStubsBuilder. A future patch will unify this with the MachO
implementation to produce a generic aarch64 got table manager.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127057
Implement R_AARCH64_ABS64 relocation entry. This relocation type is generated
when creating a static function pointer to symbol.
Reviewed By: lhames, sgraenitz
Differential Revision: https://reviews.llvm.org/D126658
Implement R_AARCH64_LDST*_ABS_LO12_NC relocaiton entries by reusing PageOffset21
generic relocation edge. The difference between MachO backend is that in ELF,
the shift value is explicitly given by relocation type. lld generates the
relocation type that matches with instruction bitwidth, so getting the shift
value implicitly from instruction bytes should be fine in typical use cases.
The separate isLoadStoreImm12 predicate will be used for validating ELF/aarch64
ldst relocation types.
Reviewed By: lhames, sgraenitz
Differential Revision: https://reviews.llvm.org/D126628
This patch moves the aarch64 fixup logic from the MachO/arm64 backend to
aarch64.h header so that it can be re-used in the ELF/aarch64 backend. This
significantly expands relocation support in the ELF/aarch64 backend.
Reviewed By: lhames, sgraenitz
Differential Revision: https://reviews.llvm.org/D126286
This code previously used cantFail, but both steps (resolution and emission)
can fail if the resource tracker associated with the
AbsoluteSymbolsMaterializationUnit is removed. Checking these errors is
necessary for correct error propagation.
Idiomatic llvm::Error usage can result in a FailedToMaterialize error tearing
down an ExecutionSession instance. Since the FailedToMaterialize error holds
SymbolStringPtrs and JITDylib references this leads to crashes when accessing
or logging the error.
This patch modifies FailedToMaterialize to retain the SymbolStringPool and
JITDylibs involved in the failure so that we can safely report an error message
to the client, even if the error tears down the session.
The contract for JITDylibs allows the getName method to be used even after the
session has been torn down, but no other JITDylib fields should be accessed via
the FailedToMaterialize error if the ssesion has been torn down. Logging the
error is guaranteed to be safe in all cases.
Clients are required to call ExecutionSession::endSession before destroying the
ExecutionSession. Failure to do so can lead to memory leaks and other difficult
to debug issues. Enforcing this requirement by assertion makes it easy to spot
or debug situations where the contract was not followed.
This changes the ELFNix platform Orc runtime to use, when available,
the __unw_add_dynamic_eh_frame_section interface provided by libunwind
for registering .eh_frame sections loaded by JITLink. When libunwind
is not being used for unwinding, the ELFNix platform detects this and
defaults to the __register_frame interface provided by libgcc_s.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D114961
It's idiomatic to require that plugins (especially platform plugins) be
installed to handle special edge kinds. If the plugins are not installed and an
object is loaded that uses one of the special edge kinds then we want to error
out rather than asserting.
Adds support for pointer encodings commonly used in large/static models,
including non-pcrel, sdata/udata8, indirect, and omit.
Also refactors pointer-encoding handling to consolidate error generation inside
common functions, rather than callees of those functions.
This adds resolver, indirection and trampoline stubs for riscv64,
allowing lazy compilation to work.
It assumes hard float extension exists. I don't know the proper way to detect it as Triple doesn't provide the interface to check riscv +f +d abi.
I am also not sure if orclazy tests should be enabled because lli needs an additional -codemodel=melany for tests to pass.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D122543
Removes a bogus dyn_cast_or_null that was breaking cast-expression handling when
parsing llvm.global_ctors.
The intent of this code was to identify Functions nested within cast
expressions, but the offending dyn_cast_or_null was actually blocking that:
Since a function is not a cast expression, we would set FuncC to null and break
the loop without finding the Function. The cast was not necessary either:
Functions are already Constants, and we didn't need to do anything
ConstantExpr-specific with FuncC, so we could just drop the cast.
Thanks to Jonas Hahnfeld for tracking this down.
http://llvm.org/PR54797
This function had been assuming a 1-byte alignment, which isn't always correct.
This commit updates it to take the alignment from the __cstring section.
The key change is to the createContentBlock call, but the surrounding code is
updated with clearer debugging output to support the testcase (and any future
debugging work).
The sort should have been lexicographic, but wasn't. This resulted in us
choosing a common symbol at address zero over the intended target function,
leading to a crash.
This patch also moves sorting up to the start of the pass, which means that we
only need to hold on to the canonical symbol at each address rather than a list
of candidates.
With 229d576b31 the class EHFrameSplitter was renamed to DWARFRecordSectionSplitter. This change merely moves it to it's own .cpp/.h file
Differential Revision: https://reviews.llvm.org/D121721
EHFrameSplitter does the exact same work to split up the eh_frame as it would need for any section that follows the DWARF record, therefore this patch just changes the name of it to DWARFRecordSectionSplitter to be more general.
Differential Revision: https://reviews.llvm.org/D121486
This patch refactors the range checking function to make it compatible with all relocation types and supports range checking for R_RISCV_BRANCH. Moreover, it refactors the alignment check functions.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D117946
This patch removes the unintended resolution of locally scoped absolute symbols
(which was causing unexpected definition errors).
It stops using the JITSymbolFlags::Absolute flag (it isn't set or used elsewhere,
and causes mismatch-flags asserts), and adds JITSymbolFlags::Exported to default
scoped absolute symbols.
Finally, we now set the scope of absolute symbols correctly in
MachOLinkGraphBuilder.
Without this, EPCIndirectionUtils::getResolverBlockAddr (and lazy compilation
via EPC) won't work.
No test case: lli is still using LocalLazyCallThroughManager. I'll revisit this
soon when I look at adding lazy compilation support to the ORC runtime.
This patch supports the R_RISCV_JAL relocation.
Moreover, it will fix the extractBits function's behavior as it extracts Size + 1 bits.
In the test ELF_jal.s:
Before:
```
Hi: 4294836480
extractBits(Hi, 12, 8): 480
```
After:
```
Hi: 4294836480
extractBits(Hi, 12, 8): 224
```
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D117975
This patch supports the R_RISCV_JAL relocation.
Moreover, it will fix the extractBits function's behavior as it extracts Size + 1 bits.
In the test ELF_jal.s:
Before:
```
Hi: 4294836480
extractBits(Hi, 12, 8): 480
```
After:
```
Hi: 4294836480
extractBits(Hi, 12, 8): 224
```
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D117975
Most notably,
llvm/Object/Binary.h no longer includes llvm/Support/MemoryBuffer.h
llvm/Object/MachOUniversal*.h no longer include llvm/Object/Archive.h
llvm/Object/TapiUniversal.h no longer includes llvm/Object/TapiFile.h
llvm-project preprocessed size:
before: 1068185081
after: 1068324320
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119457
This patch updates the MachO platform (both the ORC MachOPlatform class and the
ORC-Runtime macho_platform.* files) to use allocation actions, rather than EPC
calls, to transfer the initializer information scraped from each linked object.
Interactions between the ORC and ORC-Runtime sides of the platform are
substantially redesigned to accomodate the change.
The high-level changes in this patch are:
1. The MachOPlatform::setupJITDylib method now calls into the runtime to set up
a dylib name <-> header mapping, and a dylib state object (JITDylibState).
2. The MachOPlatformPlugin builds an allocation action that calls the
__orc_rt_macho_register_object_platform_sections and
__orc_rt_macho_deregister_object_platform_sections functions in the runtime
to register the address ranges for all "interesting" sections in the object
being allocated (TLS data sections, initializers, language runtime metadata
sections, etc.).
3. The MachOPlatform::rt_getInitializers method (the entry point in the
controller for requests from the runtime for initializer information) is
replaced by MachOPlatform::rt_pushInitializers. The former returned a data
structure containing the "interesting" section address ranges, but these are
now handled by __orc_rt_macho_register_object_platform_sections. The new
rt_pushInitializers method first issues a lookup to trigger materialization
of the "interesting" sections, then returns the dylib dependence tree rooted
at the requested dylib for dlopen to consume. (The dylib dependence tree is
returned by rt_pushInitializers, rather than being handled by some dedicated
call, because rt_pushInitializers can alter the dependence tree).
The advantage of these changes (beyond the performance advantages of using
allocation actions) is that it moves more information about the materialized
portions of the JITDylib into the executor. This tends to make the runtime
easier to reason about, e.g. the implementation of dlopen in the runtime is now
recursive, rather than relying on recursive calls in the controller to build a
linear data structure for consumption by the runtime. This change can also make
some operations more efficient, e.g. JITDylibs can be dlclosed and then
re-dlopened without having to pull all initializers over from the controller
again.
In addition to the high-level changes, there are some low-level changes to ORC
and the runtime:
* In ORC, at ExecutionSession teardown time JITDylibs are now destroyed in
reverse creation order. This is on the assumption that the ORC runtime will be
loaded into an earlier dylib that will be used by later JITDylibs. This is a
short-term solution to crashes that arose during testing when the runtime was
torn down before its users. Longer term we will likely destroy dylibs in
dependence order.
* toSPSSerializable(Expected<T> E) is updated to explicitly initialize the T
value, allowing it to be used by Ts that have explicit constructors.
* The ORC runtime now (1) attempts to track ref-counts, and (2) distinguishes
not-yet-processed "interesting" sections from previously processed ones. (1)
is necessary for standard dlopen/dlclose emulation. (2) is intended as a step
towards better REPL support -- it should enable future runtime calls that
run only newly registered initializers ("dlopen_more", "dlopen_additions",
...?).
In D116573, the relocation behavior of R_RISCV_BRANCH didn't consider that branch instruction like 'bge' has a branch target address which is given as a PC-relative offset, sign-extend and multiplied by 2.
Although the target address is a 12-bits number, acctually its range is [-4096, 4094].
This patch fix it.
Differential Revision: https://reviews.llvm.org/D118151
Summary:
This is the first path to support more relocation types on aarch64.
The patch just uses the isInt<N> to replace fitsRangeSignedInt.
Test Plan:
check-all
Differential Revision: https://reviews.llvm.org/D118231
with fixes
This reapply `de872382951572b70dfaefe8d77eb98d15586115`, which was
reverted in `fdb6578514dd3799ad23c8bbb7699577c0fb414d`
Add `# REQUIRES: asserts` in test file `anonymous_symbol.s` to disable
this test for non-debug build
This patch supports R_RISCV_SET* and R_RISCV_32_PCREL relocations in JITLink.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D117082
In RISCV, temporary symbols will be used to generate dwarf, eh_frame sections..., and will be placed in object code's symbol table. However, LLVM does not use names on these temporary symbols. This patch add anonymous symbols in LinkGraph for these temporary symbols.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D116475
Calls to JITDylib's getDFSLinkOrder and getReverseDFSLinkOrder methods (both
static an non-static versions) are now valid to make on defunct JITDylibs, but
will return an error if any JITDylib in the link order is defunct.
This means that platforms can safely lookup link orders by name in response to
jit-dlopen calls from the ORC runtime, even if the call names a defunct
JITDylib -- the call will just fail with an error.
ELF object files can contain duplicated sections (thus section symbols
as well), espeically when comdats/section groups are present. This patch
adds support for generating LinkGraph from object files that have
duplicated section names. This is the first step to properly model
comdats/section groups.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D114753
This is a counterpart to Platform::setupJITDylib, and is called when JITDylib
instances are removed (via ExecutionSession::removeJITDylib).
Upcoming MachOPlatform patches will use this to clear per-JITDylib data when
JITDylibs are removed.
This reapplies 253ce92844, which was reverted in
66b2ed477f due to bot failures.
I have added the `-phony-externals` option added, which should fix the
unresolved symbol errors.
Do nothing on R_AARCH64_NONE relocation. The relocation is used by BOLT when re-linking the final binary. It is used as a dummy relocation hack in order to stop the RuntimeDyld to skip the allocation of the section.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D117066
This patch makes jitlink to report an out of range error when the fixup value out of range
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D107328
This is needed for DWARF eh-frame exception handling on AArch64.
https://github.com/llvm/llvm-project/issues/52921.
Original patch by David Nadlinger <code@klickverbot.at> (thanks David!),
testcase and comments added by me.
This consistent with ld64's treatment of the GOT, but the main aim here is a
short-term workaround for a bad interaction between stub code sequences and
memory layout: Stubs use LDRLiteral19 relocations to reference the GOT, but
BasicLayout currently puts RW- segments between R-- and R-X segments -- a large
RW- segment (or a large R-- for that matter) can cause the relocation to fail
with an out-of-range error.
Putting the GOT in R-X fixes this efficiently in practice. A more robust fix
will be to use a longer code sequence to materialize the GOT pointer and then
rewrite the stub to use a shorter sequence where possible.
The previous error message only provided the address of the anonymous symbol.
Knowing the containing section makes the error easier to diagnose at a glance.
runFinalizeActions takes an AllocActions vector and attempts to run its finalize
actions. If any finalize action fails then all paired dealloc actions up to the
failing pair are run, and the error(s) returned. If all finalize actions succeed
then a vector containing the dealloc actions is returned.
runDeallocActions takes a vector<WrapperFunctionCall> containing dealloc action
calls and runs them all, returning any error(s).
These helpers are intended to simplify the implementation of
JITLinkMemoryManager::InFlightAlloc::finalize and
JITLinkMemoryManager::deallocate overrides by taking care of execution (and
potential roll-back) of allocation actions.
This re-applies 133f86e954, which was reverted in
c5965a411c while I investigated bot failures.
The original failure contained an arithmetic conversion think-o (on line 419 of
EHFrameSupport.cpp) that could cause failures on 32-bit platforms. The issue
should be fixed in this patch.
This patch makes jitlink to report an out of range error when the fixup value out of range
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D107328
Address the advice proposed at patch D105429 . Use [Low, Low+size) to represent bits.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D107250
This reverts commit fd4808887e.
This patch causes gcc to issue a lot of warnings like:
warning: base class ‘class llvm::MCParsedAsmOperand’ should be
explicitly initialized in the copy constructor [-Wextra]
This adds a GetObjectFileInterface callback member to
StaticLibraryDefinitionGenerator, and adds an optional argument for initializing
that member to StaticLibraryDefinitionGenerator's named constructors. If not
supplied, it will default to getObjectFileInterface from ObjectFileInterface.h.
To enable testing a `-hidden-l<x>` option is added to the llvm-jitlink tool.
This allows archives to be loaded with all contained symbol visibilities demoted
to hidden.
The ObjectLinkingLayer::setOverrideObjectFlagsWithResponsibilityFlags method is
(belatedly) hooked up, and enabled in llvm-jitlink when `-hidden-l<x>` is used
so that the demotion is also applied at symbol resolution time (avoiding any
"mismatched symbol flags" crashes).
Also moves object interface building functions out of Mangling.h and in to the
new ObjectFileInterfaces.h header, and updates the llvm-jitlink tool to use
custom object interfaces rather than a custom link layer.
ObjectLayer::add overloads are added to match the old signatures (which
do not take a MaterializationUnit::Interface). These overloads use the
standard getObjectFileInterface function to build an interface.
Passing a MaterializationUnit::Interface explicitly makes it easier to alter
the effective interface of the object file being added, e.g. by changing symbol
visibility/linkage, or renaming symbols (in both cases the changes will need to
be mirrored by a JITLink pass at link time to update the LinkGraph to match the
explicit interface). Altering interfaces in this way can be useful when lazily
compiling (e.g. for renaming function bodies) or emulating linker options (e.g.
demoting all symbols to hidden visibility to emulate -load_hidden).
If all symbols in a lookup match before we reach the end of the search order
then bail out of the search-order loop early.
This should reduce unnecessary contention on the session lock and improve
readability of the debug logs.
Most of `MemoryBuffer` interfaces expose a `RequiresNullTerminator` parameter that's being used to:
* determine how to open a file (`mmap` vs `open`),
* assert newly initialized buffer indeed has an implicit null terminator.
This patch adds the paramater to the `SmallVectorMemoryBuffer` constructors, meaning:
* null terminator can now be added to `SmallVector`s that didn't have one before,
* `SmallVectors` that had a null terminator before keep it even after the move.
In line with existing code, the new parameter is defaulted to `true`. This patch makes sure all calls to the `SmallVectorMemoryBuffer` constructor set it to `false` to preserve the current semantics.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D115331
MaterializationUnit::Interface holds the values that make up the interface
(for ORC's purposes) of a materialization unit: the symbol flags map and
initializer symbol.
Having a type for this will make functions that build materializer interfaces
more readable and maintainable.
In order to present a well-formed MachO debug object for debugger registration
the first block in each section must have a zero alignment offset (since there
is no way to represent a non-zero offset in a MachO section load command). This
patch updates the MachODebugObjectSynthesizer class to introduce a padding
padding block at the start of the section if necessary to guarantee a zero
alignment offset.
Improves cross-distro portability of LLVM cmake package by resolving paths for
terminfo and libffi via import targets.
When LLVMExports.cmake is generated for installation, it contains absolute
library paths which are likely to be a common cause of portability issues. To
mitigate this, the discovery logic for these dependencies is refactored into
find modules which get installed alongside LLVMConfig.cmake. The result is
cleaner, cmake-friendly management of these dependencies that respect the
environment of the LLVM package importer.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D114327
R_X86_64_PLT32 explicitly represents the '-4' PC-adjustment in the relocation's
addend, but JITLink's x86_64::Branch32PCRel includes the PC-adjustment
implicitly. We have been zeroing the addend to account for the difference, but
this breaks for branches to non-zero offsets past labels. This patch updates the
relocation parsing code to unconditionally adjust the offset by '+4' instead.
For branches directly to labels the result is still 0, for branches to offsets
past labels the result is the correct addend for x86_64::Branch32PCRel.
This allows JITDylibs to be removed from the ExecutionSession. Calling
ExecutionSession::removeJITDylib will disconnect the JITDylib from the
ExecutionSession and clear it (removing all trackers associated with it). The
JITDylib object will then be destroyed as soon as the last JITDylibSP pointing
at it is destroyed.
GeneratorsMutex should prevent lookups from proceeding through the
generators of a single JITDylib concurrently (since this could
result in redundant attempts to generate definitions). Mutation of
the generators list itself should be done under the session lock.
This keeps the tracker alive for the lifetime of the MR. This is needed so that
we can check whether the tracker has become defunct before posting results (or
failure) for the MR.
Size 0 sections can have symbols that have size 0. Build those sections
and symbols into the LinkGraph so they can be used properly if needed.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D114749
Add support for reading extended table in ELF object file. This allows
JITLink to support ELF object files with many sections.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D114747
We were adding all defined weak symbols to the materialization
responsibility, but local symbols will not be in the symbol table, so it
failed to materialize due to the "missing" symbol.
Local weak symbols come up in practice when using `ld -r` with a hidden
weak symbol.
rdar://85574696
Fix `splitBlock` so that it can handle the case when the block being
split has symbols span across the split boundary. This is an error
case in general but for EHFrame splitting on macho platforms, there is an
anonymous symbol that marks the entire block. Current implementation
will leave a symbol that is out of bound of the underlying block. Fix
the problem by dropping such symbols when the block is split.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D113912
This reapplies e1933a0488 (which was reverted in
f55ba3525e due to bot failures, e.g.
https://lab.llvm.org/buildbot/#/builders/117/builds/2768).
The bot failures were due to a missing symbol error: We use the input object's
mangling to decide how to mangle the debug-info registration function name. This
caused lookup of the registration function to fail when the input object
mangling didn't match the host mangling.
Disbaling the test on non-Darwin platforms is the easiest short-term solution.
I have filed https://llvm.org/PR52503 with a proposed longer term solution.
This commit adds a new plugin, GDBJITDebugInfoRegistrationPlugin, that checks
for objects containing debug info and registers any debug info found via the
GDB JIT registration API.
To enable this registration without redundantly representing non-debug sections
this plugin synthesizes a new embedded object within a section of the LinkGraph.
An allocation action is used to make the registration call.
Currently MachO only. ELF users can still use the DebugObjectManagerPlugin. The
two are likely to be merged in the near future.
Only search within the requested section, and allow one-past-then-end addresses.
This is needed to support section-end-address references to sections with no
symbols in them.
Similar to how the other swift sections are registered by the ORC
runtime's macho platform, add the __swift5_types section, which contains
type metadata. Add a simple test that demonstrates that the swift
runtime recognized the registered types.
rdar://85358530
Differential Revision: https://reviews.llvm.org/D113811
We need to skip the length field when generating error strings.
No test case: This hand-hacked deserializer should be removed in the near future
once JITLink can use generic ORC APIs (including SPS and WrapperFunction).
If a tool wants to introduce new indirections via stubs at link-time in
ORC, it can cause fidelity issues around the address of the function if
some references to the function do not have relocations. This is known
to happen inside the body of the function itself on x86_64 for example,
where a PC-relative address is formed, but without a relocation.
```
_foo:
leaq -7(%rip), %rax ## form pointer to '_foo' without relocation
_bar:
leaq (%rip), %rax ## uses X86_64_RELOC_SIGNED to '_foo'
```
The consequence of introducing a stub for such a function at link time
is that if it forms a pointer to itself without relocation, it will not
have the same value as a pointer from outside the function. If the
function pointer is used as a key, this can cause problems.
This utility provides best-effort support for adding such missing
relocations using MCDisassembler and MCInstrAnalysis to identify the
problematic instructions. Currently it is only implemented for x86_64.
Note: the related issue with call/jump instructions is not handled
here, only forming function pointers.
rdar://83514317
Differential revision: https://reviews.llvm.org/D113038
MachOPlatform used to make an EPC-call (registerObjectSections) to register the
eh-frame and thread-data sections for each linked object with the ORC runtime.
Now that JITLinkMemoryManager supports allocation actions we can use these
instead of an EPC call. This saves us one EPC-call per object linked, and
manages registration/deregistration in the executor, rather than the controller
process. In the future we may use this to allow JIT'd code in the executor to
outlive the controller object while still being able to be cleanly destroyed.
Since the code for allocation actions must be available when the actions are
run, and since the eh-frame registration code lives in the ORC runtime itself,
this change required that MachO eh-frame support be split out of
macho_platform.cpp and into its own macho_ehframe_registration.cpp file that has
no other dependencies. During bootstrap we start by forcing emission of
macho_ehframe_registration.cpp so that eh-frame registration is guaranteed to be
available for the rest of the bootstrap process. Then we load the rest of the
MachO-platform runtime support, erroring out if there is any attempt to use
TLVs. Once the bootstrap process is complete all subsequent code can use all
features.
Alloc actions should return a CWrapperFunctionResult. JITLink does not have
access to this type yet, due to library layering issues, so add a cut-down
version with a fixme.
This type has been moved up into the llvm::orc::shared namespace.
This type was originally put in the detail:: namespace on the assumption that
few (if any) LLVM source files would need to use it. In practice it has been
needed in many places, and will continue to be needed until/unless
OrcTargetProcess is fully merged into the ORC runtime.
The new name better suits the type.
This patch also changes the signature of the run method (it now returns a
WrapperFunctionResult), and adds runWithSPSRet methods that deserialize the
function result using SPS.
Together these chages bring this type into close alignment with its ORC runtime
counterpart.
SPSExecutorAddr will now be serializable to/from ExecutorAddr, rather than
uint64_t. This improves type safety when working with serialized addresses.
Also updates the SupportFunctionCall to use an ExecutorAddrRange (rather than
a separate ExecutorAddr addr and uint64_t size field), and updates the
tpctypes::*Write data structures to use ExecutorAddr rather than
JITTargetAddress.
Enables the arm64 MachO platform, adds basic tests, and implements the
missing TLV relocations and runtime wrapper function. The TLV
relocations are just handled as GOT accesses.
rdar://84671534
Differential Revision: https://reviews.llvm.org/D112656
This lifts the global offset table and procedure linkage table builders out of
ELF_x86_64.h and into x86_64.h, renaming them with generic names
x86_64::GOTTableBuilder and x86_64::PLTTableBuilder. MachO_x86_64.cpp is updated
to use these classes instead of the older PerGraphGOTAndStubsBuilder tool.
Moves visitEdge into the TableManager derivatives, replacing the fixEdgeKind
methods in those classes. The visitEdge method takes on responsibility for
updating the edge target, as well as its kind.
This patch add a TableManager which reponsible for fixing edges that need entries to reference the target symbol and constructing such entries.
In the past, the PerGraphGOTAndPLTStubsBuilder pass was used to build GOT and PLT entry, and the PerGraphTLSInfoEntryBuilder pass was used to build TLSInfo entry. By generalizing the behavior of building entry, I added a TableManager which could be reused when built GOT, PLT and TLSInfo entries.
If this patch makes sense and can be accepted, I will apply the TableManager to other targets(MachO_x86_64, MachO_arm64, ELF_riscv), and delete the file PerGraphGOTAndPLTStubsBuilder.h
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D110383
SimpleRemoteEPC notionally allowed subclasses to override the
createMemoryManager and createMemoryAccess methods to use custom objects, but
could not actually be subclassed in practice (The construction process in
SimpleRemoteEPC::Create could not be re-used).
Instead of subclassing, this commit adds a SimpleRemoteEPC::Setup class that
can be used by clients to set up the memory manager and memory access members.
A default-constructed Setup object results in no change from previous behavior
(EPCGeneric* memory manager and memory access objects used by default).
Negative deltas for LDRLiteral19 have their high bits set. If these bits aren't
masked out then they will overwrite other instruction bits, leading to a bogus
encoding.
This long-standing relocation bug was exposed by e50aea58d5, "[JITLink][ORC]
Major JITLinkMemoryManager refactor.", which caused memory layouts to be
reordered, which in turn lead to a previously unseen negative delta. (Unseen
because LDRLiteral19s were only created in JITLink passes where they always
pointed at segments that were layed-out-after in the old layout).
No testcase yet: Our existing regression test infrastructure is good at checking
that operand bits are correct, but provides no easy way to test for bad opcode
bits. I'll have a think about the right way to approach this.
https://llvm.org/PR52153
f341161689 added a task dispatcher for async handlers, but didn't add a
TaskDispatcher::shutdown call to SelfExecutorProcessControl or SimpleRemoteEPC.
This patch adds the missing call, which ensures that we don't destroy the
dispatcher while tasks are still running.
This should fix the use-after-free crash seen in
https://lab.llvm.org/buildbot/#/builders/5/builds/13063
Adds explicit narrowing casts to JITLinkMemoryManager.cpp.
Honors -slab-address option in llvm-jitlink.cpp, which was accidentally
dropped in the refactor.
This effectively reverts commit 6641d29b70.
This commit substantially refactors the JITLinkMemoryManager API to: (1) add
asynchronous versions of key operations, (2) give memory manager implementations
full control over link graph address layout, (3) enable more efficient tracking
of allocated memory, and (4) support "allocation actions" and finalize-lifetime
memory.
Together these changes provide a more usable API, and enable more powerful and
efficient memory manager implementations.
To support these changes the JITLinkMemoryManager::Allocation inner class has
been split into two new classes: InFlightAllocation, and FinalizedAllocation.
The allocate method returns an InFlightAllocation that tracks memory (both
working and executor memory) prior to finalization. The finalize method returns
a FinalizedAllocation object, and the InFlightAllocation is discarded. Breaking
Allocation into InFlightAllocation and FinalizedAllocation allows
InFlightAllocation subclassses to be written more naturally, and FinalizedAlloc
to be implemented and used efficiently (see (3) below).
In addition to the memory manager changes this commit also introduces a new
MemProt type to represent memory protections (MemProt replaces use of
sys::Memory::ProtectionFlags in JITLink), and a new MemDeallocPolicy type that
can be used to indicate when a section should be deallocated (see (4) below).
Plugin/pass writers who were using sys::Memory::ProtectionFlags will have to
switch to MemProt -- this should be straightworward. Clients with out-of-tree
memory managers will need to update their implementations. Clients using
in-tree memory managers should mostly be able to ignore it.
Major features:
(1) More asynchrony:
The allocate and deallocate methods are now asynchronous by default, with
synchronous convenience wrappers supplied. The asynchronous versions allow
clients (including JITLink) to request and deallocate memory without blocking.
(2) Improved control over graph address layout:
Instead of a SegmentRequestMap, JITLinkMemoryManager::allocate now takes a
reference to the LinkGraph to be allocated. The memory manager is responsible
for calculating the memory requirements for the graph, and laying out the graph
(setting working and executor memory addresses) within the allocated memory.
This gives memory managers full control over JIT'd memory layout. For clients
that don't need or want this degree of control the new "BasicLayout" utility can
be used to get a segment-based view of the graph, similar to the one provided by
SegmentRequestMap. Once segment addresses are assigned the BasicLayout::apply
method can be used to automatically lay out the graph.
(3) Efficient tracking of allocated memory.
The FinalizedAlloc type is a wrapper for an ExecutorAddr and requires only
64-bits to store in the controller. The meaning of the address held by the
FinalizedAlloc is left up to the memory manager implementation, but the
FinalizedAlloc type enforces a requirement that deallocate be called on any
non-default values prior to destruction. The deallocate method takes a
vector<FinalizedAlloc>, allowing for bulk deallocation of many allocations in a
single call.
Memory manager implementations will typically store the address of some
allocation metadata in the executor in the FinalizedAlloc, as holding this
metadata in the executor is often cheaper and may allow for clean deallocation
even in failure cases where the connection with the controller is lost.
(4) Support for "allocation actions" and finalize-lifetime memory.
Allocation actions are pairs (finalize_act, deallocate_act) of JITTargetAddress
triples (fn, arg_buffer_addr, arg_buffer_size), that can be attached to a
finalize request. At finalization time, after memory protections have been
applied, each of the "finalize_act" elements will be called in order (skipping
any elements whose fn value is zero) as
((char*(*)(const char *, size_t))fn)((const char *)arg_buffer_addr,
(size_t)arg_buffer_size);
At deallocation time the deallocate elements will be run in reverse order (again
skipping any elements where fn is zero).
The returned char * should be null to indicate success, or a non-null
heap-allocated string error message to indicate failure.
These actions allow finalization and deallocation to be extended to include
operations like registering and deregistering eh-frames, TLS sections,
initializer and deinitializers, and language metadata sections. Previously these
operations required separate callWrapper invocations. Compared to callWrapper
invocations, actions require no extra IPC/RPC, reducing costs and eliminating
a potential source of errors.
Finalize lifetime memory can be used to support finalize actions: Sections with
finalize lifetime should be destroyed by memory managers immediately after
finalization actions have been run. Finalize memory can be used to support
finalize actions (e.g. with extra-metadata, or synthesized finalize actions)
without incurring permanent memory overhead.
In SimpleRemoteEPC, calls to from callWrapperAsync to sendMessage may fail.
The handlers may or may not be sent failure messages by handleDisconnect,
depending on when that method is run. This patch adds a check for an un-failed
handler, and if it finds one sends it a failure message.
On the controller-side, handle `Hangup` messages from the executor. The executor passed `Error::success()` or a failure message as payload.
Hangups cause an immediate disconnect of the transport layer. The disconnect function may be called later again and so implementations should be prepared. `FDSimpleRemoteEPCTransport::disconnect()` already has a flag to check that:
cd1bd95d87/llvm/lib/ExecutionEngine/Orc/Shared/SimpleRemoteEPCUtils.cpp (L112)
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D111527
Adds LLVMOrcCreateStaticLibrarySearchGeneratorForPath and
LLVMOrcCreateDynamicLibrarySearchGeneratorForPath functions to create generators
for static and dynamic libraries.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D108535
The callWrapperAsync and callSPSWrapperAsync methods take a handler object
that is run on the return value of the call when it is ready. The new RunPolicy
parameters allow clients to control how these handlers are run. If no policy is
specified then the handler will be packaged as a GenericNamedTask and dispatched
using the ExecutorProcessControl's TaskDispatch member. Callers can use the
ExecutorProcessControl::RunInPlace policy to cause the handler to be run
directly instead, which may be preferrable for simple handlers, or they can
write their own policy object (e.g. to dispatch as some other kind of Task,
rather than GenericNamedTask).
f341161689 introduced a dependence (for builds with LLVM_ENABLE_THREADS) on
pthreads. This commit updates the CMakeLists.txt file to include a LINK_LIBS
entry for pthreads.
ExecutorProcessControl objects will now have a TaskDispatcher member which
should be used to dispatch work (in particular, handling incoming packets in
the implementation of remote EPC implementations like SimpleRemoteEPC).
The GenericNamedTask template can be used to wrap function objects that are
callable as 'void()' (along with an optional name to describe the task).
The makeGenericNamedTask functions can be used to create GenericNamedTask
instances without having to name the function object type.
In a future patch ExecutionSession will be updated to use the
ExecutorProcessControl's dispatcher, instead of its DispatchTaskFunction.
The callee address is now the first parameter and the 'SendResult' function
the second. This change improves consistentency with the non-async functions
where the callee is the first address and the return value the second.
There is a bug reported at https://bugs.llvm.org/show_bug.cgi?id=48938
After looking through the glibc, I found the `atexit(f)` is the same as `__cxa_atexit(f, NULL, NULL)`. In orc runtime, we identify different JITDylib by their dso_handle value, so that a NULL dso_handle is invalid. So in this patch, I added a `PlatformJDDSOHandle` to ELFNixRuntimeState, and functions which are registered by atexit will be registered at PlatformJD.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D111413
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
This reverts commit dfd74db981.
SimpleRemoteEPC should share dispatch with the ExecutionSession, rather than
having two different dispatch systems on the controller side.
SimpleRemoteEPCServer::Dispatch doesn't need to be shared.
Renames SimpleRemoteEPCServer::Dispatcher to SimpleRemoteEPCDispatcher and
moves it into OrcShared. SimpleRemoteEPCServer::ThreadDispatcher is similarly
moved and renamed to DynamicThreadPoolSimpleRemoteEPCDispatcher.
This will allow these classes to be reused by SimpleRemoteEPC on the controller
side of the connection.
This patch add a TableManager which reponsible for fixing edges that need entries to reference the target symbol and constructing such entries.
In the past, the PerGraphGOTAndPLTStubsBuilder pass was used to build GOT and PLT entry, and the PerGraphTLSInfoEntryBuilder pass was used to build TLSInfo entry. By generalizing the behavior of building entry, I added a TableManager which could be reused when built GOT, PLT and TLSInfo entries.
If this patch makes sense and can be accepted, I will apply the TableManager to other targets(MachO_x86_64, MachO_arm64, ELF_riscv), and delete the file PerGraphGOTAndPLTStubsBuilder.h
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D110383
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
We can use the raw_string_ostream::str() method to perform the implicit flush() and return a reference to the std::string container that we can then wrap inside Twine().
With the removal of OrcRPCExecutorProcessControl and OrcRPCTPCServer in
6aeed7b19c the ORC RPC library no longer has any in-tree users.
Clients needing serialization for ORC should move to Simple Packed
Serialization (usually by adopting SimpleRemoteEPC for remote JITing).
CompactUnwindSplitter splits compact-unwind sections on record boundaries and
adds keep-alive edges from target functions back to their respective records.
In MachO_arm64.cpp, a CompactUnwindSplitter pass is added to the pre-prune pass
list when setting up the standard pipeline.
This patch does not provide runtime support for compact-unwind, but is a first
step towards enabling it.
The getPerDylibResources method may be called concurrently from multiple
threads, so we need to protect access to the underlying map.
Possible for fix https://llvm.org/PR51064
Adds a 'start' method to SimpleRemoteEPCTransport to defer transport startup
until the client has been configured. This avoids races on client members if the
first messages arrives while the client is being configured.
Also fixes races on the file descriptors in FDSimpleRemoteEPCTransport.
This reintroduces "[ORC] Introduce EPCGenericRTDyldMemoryManager."
(bef55a2b47) and "[lli] Add ChildTarget dependence
on OrcTargetProcess library." (7a219d801b) which were
reverted in 99951a5684 due to bot failures.
The root cause of the bot failures should be fixed by "[ORC] Fix uninitialized
variable." (0371049277) and "[ORC] Wait for
handleDisconnect to complete in SimpleRemoteEPC::disconnect."
(320832cc9b).
Disconnect should block until handleDisconnect completes, otherwise we might
destroy the SimpleRemoteEPC instance while it's still in use.
Thanks to Dave Blaikie for helping me track this down.
This reverts commit bef55a2b47 while I investigate
failures on some bots. Also reverts "[lli] Add ChildTarget dependence on
OrcTargetProcess library." (7a219d801b) which was
a fallow-up to bef55a2b47.
EPCGenericRTDyldMemoryMnaager is an EPC-based implementation of the
RuntimeDyld::MemoryManager interface. It enables remote-JITing via EPC (backed
by a SimpleExecutorMemoryManager instance on the executor side) for RuntimeDyld
clients.
The lli and lli-child-target tools are updated to use SimpleRemoteEPC and
SimpleRemoteEPCServer (rather than OrcRemoteTargetClient/Server), and
EPCGenericRTDyldMemoryManager for MCJIT tests.
By enabling remote-JITing for MCJIT and RuntimeDyld-based ORC clients,
EPCGenericRTDyldMemoryManager allows us to deprecate older remote-JITing
support, including OrcTargetClient/Server, OrcRPCExecutorProcessControl, and the
Orc RPC system itself. These will be removed in future patches.