This is an implementation of orc::MemoryMapper that maps shared memory
pages in both the executor and controller processes and writes directly to
them, avoiding transferring content over EPC. All allocations are properly
deinitialized automatically on the executor side at shutdown by the
ExecutorSharedMemoryMapperService.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D128544
Adds initial COFF support in JITLink. This is able to run a hello world C program on x86 Windows successfully.
Implemented
- COFF object loader
- Static local symbols
- Absolute symbols
- External symbols
- Weak external symbols
- Common symbols
- COFF jitlink-check support
- All COMDAT selection types except largest
- Implicit symbol size calculation
- Rel32 relocation with PLT stub.
- IMAGE_REL_AMD64_ADDR32NB relocation
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D128968
An earlier version of this change originally landed as part of
e6f1f06245 (D129120), which caused a
Fuchsia buildbot regression in ExecutionEngine tests.
Careful review suggests that the issue was that in the earlier version,
the destructor of the JITDebugLock was run before the destructor of
GDBJITRegistrationListener. The new version of the change moves the lock
to a member variable of the (singleton!) GDBJITRegistrationListener so
that destructors are run in the right order.
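The ordering argument above can be illustrated with a minimal sketch. The types `Lock` and `Listener` and the helper `destructionLog` are hypothetical stand-ins, not the real JITDebugLock or GDBJITRegistrationListener. It relies only on the C++ rule that an object's destructor body runs before its members are destroyed, so a lock held as a member is still alive while the owner's destructor body executes.

```cpp
#include <string>
#include <vector>

// Hypothetical helper for this sketch: records the order of destructor calls.
std::vector<std::string> &destructionLog() {
  static std::vector<std::string> Log;
  return Log;
}

// Stand-in for the lock; not the real JITDebugLock.
struct Lock {
  ~Lock() { destructionLog().push_back("lock"); }
};

// Stand-in for the listener. Owning the lock as a member ties the two
// lifetimes together: members are destroyed after the destructor body runs.
struct Listener {
  Lock L;
  ~Listener() { destructionLog().push_back("listener body"); }
};

// Creates and destroys one listener, returning the observed order.
std::vector<std::string> runDestructionOrderSketch() {
  destructionLog().clear();
  { Listener Singleton; }
  return destructionLog();
}
```

With two independent static objects the relative order depends on construction order, which is what made the earlier version fragile; making one a member of the other removes that dependency.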
This change originally landed as part of
e6f1f06245 (D129120), which caused a
Fuchsia buildbot regression in ExecutionEngine tests.
I am resubmitting the backed out parts in smaller pieces after a careful
review.
(Reapply after revert in e9ce1a5880 due to
Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other
than error categories, to be checked in more detail and reapplied
separately.)
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
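As a rough illustration of the getter pattern described above, the following sketch replaces a global object with a function-local static. `Registry`, `getRegistry` and `registerName` are hypothetical names chosen for the example and do not correspond to any specific LLVM code touched by this change.

```cpp
#include <map>
#include <string>

// Hypothetical type standing in for whatever object used to be a global.
using Registry = std::map<std::string, int>;

// Getter with a function-local static: the object is constructed on first
// use, and C++11 guarantees that this initialization is thread-safe.
Registry &getRegistry() {
  static Registry R;
  return R;
}

// Returns a stable id for Name, assigning the next free id on first use.
int registerName(const std::string &Name) {
  Registry &R = getRegistry();
  auto It = R.find(Name);
  if (It != R.end())
    return It->second;
  int Id = static_cast<int>(R.size());
  R.emplace(Name, Id);
  return Id;
}
```

Unlike a ManagedStatic, an object defined this way is destroyed during normal static destruction rather than by an explicit shutdown call, which is why destruction order has to be reviewed for the less trivial cases.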
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
PointerToGOT lowering was accidentally changed from Delta32 to Delta64 in
db37225803. This patch moves it back to Delta32 and renames the generic
aarch64 edge to Delta32ToGOT to avoid the ambiguity.
No test case yet -- I haven't figured out how to write a succinct test case
(this typically appears in CIEs in eh-frames).
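To show why the field width matters, here is a minimal sketch of a 32-bit delta fixup, assuming the usual Target - Fixup + Addend formula. The function names are hypothetical and this is not the JITLink implementation. A 64-bit write in the same place would overwrite the four bytes that follow the field.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Computes Target - Fixup + Addend and reports whether the result fits in a
// signed 32-bit field. Out is only written on success.
bool computeDelta32(uint64_t Target, uint64_t Fixup, int64_t Addend,
                    int32_t &Out) {
  int64_t Delta = static_cast<int64_t>(Target - Fixup) + Addend;
  if (Delta < std::numeric_limits<int32_t>::min() ||
      Delta > std::numeric_limits<int32_t>::max())
    return false;
  Out = static_cast<int32_t>(Delta);
  return true;
}

// Writes the delta into exactly four bytes at FixupPtr (host byte order).
// Bytes after the field are left untouched.
bool applyDelta32(unsigned char *FixupPtr, uint64_t Target, uint64_t Fixup,
                  int64_t Addend) {
  int32_t Delta = 0;
  if (!computeDelta32(Target, Fixup, Addend, Delta))
    return false;
  std::memcpy(FixupPtr, &Delta, sizeof(Delta));
  return true;
}
```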
Passing OrcAArch64 as the template parameter to stubAndPointerRangesOk on MIPS appears to be an oversight.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D129076
It is fine to not implement and ignore linker relaxation for now, but
we need to check the alignment. Luckily, an alignment of only 2 bytes
is the most common case when interpreting C++ code in clang-repl, and
already guaranteed by the length of compressed instructions.
Differential Revision: https://reviews.llvm.org/D129159
Implements TLS descriptor relocations in the JITLink ELF/AARCH64 backend and supports the relevant runtime functions in ELFNixPlatform.
Unlike the traditional TLS model, the TLS descriptor model requires the linker to return the "offset" from the thread pointer via relocation, not the actual pointer to the thread-local variable. There is no public libc API for dynamically adding new allocations to the TLS block that the thread pointer points to. So, we support this by taking the delta from the thread base pointer to the actual thread-local variable in our allocated section.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D128601
RuntimeDyld does not support RISC-V, so it makes sense to enable
JITLink by default. This also makes relocations work without support
for a large code model.
Differential Revision: https://reviews.llvm.org/D129092
Define the atexit symbol in GenericLLVMIRPlatformSupport so that it doesn't need to be defined by the user.
On Windows, LLVM codegen emits atexit runtime calls to support global deinitializers, as there is no lower-level function like cxa_atexit in the Itanium C++ ABI. ORC JIT users had to define a custom atexit symbol manually. This was a hassle, as it has to deal with the dso_handle and cxa_atexit internals of LLJIT. If the client didn't provide an atexit definition, the default behaviour was to link against the host atexit function, which is destined to fail because it calls dtors when the host program exits. This happens after JIT instances and buffers are freed, so users would see a weird access violation exception from an unknown location. (In a console application, the debugger thinks the exception happened in scrt_common_main_seh.)
This is a hack that has some caveats (e.g. the memory address is not identical), but it's better than the situation described above. Ultimately, we will move on to the ORC runtime, which is able to solve the memory address issue properly.
Reviewed By: sgraenitz
Differential Revision: https://reviews.llvm.org/D128037
Running iwyu-diff on LLVM codebase since fb67d683db detected a few
regressions, fixing them.
The impact on preprocessed output is negligible: -4k lines.
[JITLink][Orc] Add MemoryMapper interface with InProcess implementation
MemoryMapper class takes care of cross-process and in-process address space
reservation, mapping, transferring content and applying protections.
Implementations of this class can support different ways to do this such
as using shared memory, transferring memory contents over EPC or just
mapping memory in the same process (InProcessMemoryMapper).
The original patch landed with commit 6ede652050
It was reverted temporarily in commit 6a4056ab2a
Reviewed By: sgraenitz, lhames
Differential Revision: https://reviews.llvm.org/D127491
MemoryMapper class takes care of cross-process and in-process address space
reservation, mapping, transferring content and applying protections.
Implementations of this class can support different ways to do this such
as using shared memory, transferring memory contents over EPC or just
mapping memory in the same process (InProcessMemoryMapper).
Reviewed By: sgraenitz, lhames
Differential Revision: https://reviews.llvm.org/D127491
Logs the enum name of an unsupported relocation type. This also changes elf/x86 to use the common utility function (getELFRelocationTypeName) from the llvm object module.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127715
Implements R_AARCH64_MOVW_UABS_G*_NC fixup edges. These relocation entries can be generated when code is compiled without a PIC flag. With this patch, clang-repl can printf Hello World with ObjectLinkerLayer on aarch64 linux.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127585
Implements MoveWide16 generic edge kind that can be used to patch MOVZ/MOVK (imm16) instructions.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127584
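A minimal sketch of what patching a MOVZ/MOVK immediate involves, based on the A64 encoding (imm16 in bits 20:5, the shift selector hw in bits 22:21). The function names are hypothetical and the code is an illustration, not the JITLink fixup itself.

```cpp
#include <cstdint>

// Shift selected by the hw field (bits 22:21) of a MOVZ/MOVK instruction:
// 0, 16, 32 or 48.
unsigned moveWideShift(uint32_t Instr) { return ((Instr >> 21) & 0x3) * 16; }

// Replaces the imm16 field (bits 20:5) with the 16-bit chunk of Value that
// the instruction's hw field selects. All other instruction bits are kept.
uint32_t patchMoveWide16(uint32_t Instr, uint64_t Value) {
  uint32_t Imm16 =
      static_cast<uint32_t>((Value >> moveWideShift(Instr)) & 0xffff);
  return (Instr & ~(0xffffu << 5)) | (Imm16 << 5);
}
```

For example, patching `movz x0, #0` (0xD2800000) with the value 0x1234 gives 0xD2824680, and patching `movk x0, #0, lsl #16` (0xF2A00000) with 0xABCD1234 stores the upper chunk 0xABCD.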
Lifts fixup functions from aarch64.cpp to aarch64.h so that they have a better chance of getting inlined. Also adds some comments documenting the purpose of the functions.
Reviewed By: sgraenitz
Differential Revision: https://reviews.llvm.org/D127559
Unifies the GOT/PLT table managers of ELF and MachO on the aarch64 architecture. Additionally, it migrates the table managers from PerGraphGOTAndPLTStubsBuilder to the generic CRTP TableManager.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127558
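The CRTP shape mentioned above can be sketched as follows. This is a simplified illustration: `TableManagerSketch` and `GOTSketch` are hypothetical names, and targets are plain strings here rather than LinkGraph symbols. The base class caches one entry per target and asks the derived class to create the entry on first request.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Generic CRTP base: returns the cached entry for a target, or asks the
// derived class to create one the first time the target is seen.
template <typename Derived> class TableManagerSketch {
public:
  std::size_t getEntryForTarget(const std::string &Target) {
    auto It = Entries.find(Target);
    if (It != Entries.end())
      return It->second;
    std::size_t Index = static_cast<Derived &>(*this).createEntry(Target);
    Entries.emplace(Target, Index);
    return Index;
  }

private:
  std::map<std::string, std::size_t> Entries;
};

// Hypothetical GOT-like table: each new target gets the next free slot.
class GOTSketch : public TableManagerSketch<GOTSketch> {
public:
  std::size_t createEntry(const std::string &Target) {
    Slots.push_back(Target);
    return Slots.size() - 1;
  }
  std::vector<std::string> Slots;
};
```

Because the caching logic lives in the base template, a GOT manager and a PLT manager only differ in `createEntry`, which is the part that can be shared between object formats when the entry layout is the same.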
Slow definition generators may suspend lookups to temporarily release the
session lock, allowing unrelated lookups to proceed.
Using this functionality is discouraged: it is best to make definition
generation fast, rather than suspending the lookup. As a last resort where
this is not possible, suspension may be used.
An API to wrap ExecutionSession::lookup, this allows C API clients to use async
lookup.
The immediate motivation for adding this is to simplify upcoming
definition-generator unit tests.
As we're adding more tests that need to convert between C and C++ flag values
this commit adds helper functions to support this. This patch also updates the
CAPIDefinitionGenerator to use these new utilities.
Previously, omitting unnecessary DWARF unwinds was only done in two
cases:
* For Darwin + aarch64, if no DWARF unwind info is needed for all the
functions in a TU, then the `__eh_frame` section would be omitted
entirely. If any one function needed DWARF unwind, then MC would emit
DWARF unwind entries for all the functions in the TU.
* For watchOS, MC would omit DWARF unwind on a per-function basis, as
long as compact unwind was available for that function.
This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
`--emit-dwarf-unwind=` which can toggle between `always`,
`no-compact-unwind` (only emit DWARF when CU cannot be emitted for a
given function), and the target platform `default`. `no-compact-unwind`
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.
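The per-function decision described above could be sketched like this. It is an illustration only: the enum, the function names and the `PlatformOmitsDwarfWithCU` parameter (standing in for the target platform default) are assumptions made for the example, not the MC implementation.

```cpp
#include <string>

// The three values accepted by --emit-dwarf-unwind= as described above.
enum class EmitDwarfUnwind { Always, NoCompactUnwind, Default };

// Parses the option value. Returns false for an unrecognized string.
bool parseEmitDwarfUnwind(const std::string &Value, EmitDwarfUnwind &Out) {
  if (Value == "always")
    Out = EmitDwarfUnwind::Always;
  else if (Value == "no-compact-unwind")
    Out = EmitDwarfUnwind::NoCompactUnwind;
  else if (Value == "default")
    Out = EmitDwarfUnwind::Default;
  else
    return false;
  return true;
}

// Decides whether one function gets a DWARF unwind entry.
// PlatformOmitsDwarfWithCU is a hypothetical flag modelling the platform
// default (true where DWARF is dropped when compact unwind is available).
bool shouldEmitDwarfUnwind(EmitDwarfUnwind Mode, bool HasCompactUnwind,
                           bool PlatformOmitsDwarfWithCU) {
  switch (Mode) {
  case EmitDwarfUnwind::Always:
    return true;
  case EmitDwarfUnwind::NoCompactUnwind:
    return !HasCompactUnwind;
  case EmitDwarfUnwind::Default:
    return PlatformOmitsDwarfWithCU ? !HasCompactUnwind : true;
  }
  return true;
}
```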
**Motivation:** I'm working on adding support for `__eh_frame` to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.
**Addendum:** It turns out that there was a latent bug in the X86
backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.
For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK). This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
`UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there
is no impact to the final binary size.
Reviewed By: davide, lhames
Differential Revision: https://reviews.llvm.org/D122258
Implements eh-frame handling by using the generic EHFrame passes. C++ exception handling works correctly with this change.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127063
Removes CodeAlignmentFactor and DataAlignmentFactor validation in EHFrameEdgeFixer. I observed that some aarch64 ELF files generated by clang contain CIE records with code_alignment_factor = 4 or data_alignment_factor = -8. code_alignment_factor and data_alignment_factor are used by call frame instructions, which should be handled correctly by libunwind.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127062
ELF-based platforms currently support defining multiple static
initializer table sections with differing priorities, for example
.init_array.0 or .init_array.100; the default .init_array corresponds
to a priority of 65535. When building a shared library or executable,
the system linker normally sorts these sections and combines them into
a single .init_array section. This change adds the capability to
recognize ELF static initializers with priorities other than the
default, and to properly sort them by priority, to Orc and the Orc
runtime.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127056
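A minimal sketch of the priority handling described above: derive a priority from the section name, treating plain .init_array as 65535, and order sections so that lower numbers run first. The function names are hypothetical, and rejecting values above 65535 is an assumption of this sketch rather than documented Orc behaviour.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Extracts the priority from ".init_array" (65535) or ".init_array.N" (N).
// Returns false for other section names or a malformed suffix.
bool getInitArrayPriority(const std::string &Name, uint32_t &Priority) {
  const std::string Prefix = ".init_array";
  if (Name.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  if (Name.size() == Prefix.size()) {
    Priority = 65535;
    return true;
  }
  if (Name[Prefix.size()] != '.' || Name.size() == Prefix.size() + 1)
    return false;
  uint32_t Value = 0;
  for (std::size_t I = Prefix.size() + 1; I < Name.size(); ++I) {
    if (Name[I] < '0' || Name[I] > '9')
      return false;
    Value = Value * 10 + static_cast<uint32_t>(Name[I] - '0');
    if (Value > 65535) // Assumption of this sketch: reject larger values.
      return false;
  }
  Priority = Value;
  return true;
}

// Keeps only init_array sections and orders them by ascending priority.
// The sort is stable, so sections with equal priority keep their order.
std::vector<std::string> sortInitArraySections(std::vector<std::string> Names) {
  std::vector<std::pair<uint32_t, std::string>> Keyed;
  for (const std::string &Name : Names) {
    uint32_t Priority = 0;
    if (getInitArrayPriority(Name, Priority))
      Keyed.emplace_back(Priority, Name);
  }
  std::stable_sort(Keyed.begin(), Keyed.end(),
                   [](const std::pair<uint32_t, std::string> &A,
                      const std::pair<uint32_t, std::string> &B) {
                     return A.first < B.first;
                   });
  std::vector<std::string> Sorted;
  for (const auto &Entry : Keyed)
    Sorted.push_back(Entry.second);
  return Sorted;
}
```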
This change enables integrating orc::LLJIT with the ORCv2
platforms (MachOPlatform and ELFNixPlatform) and the compiler-rt orc
runtime. Changes include:
- Adding SPS wrapper functions for the orc runtime's dlfcn emulation
functions, allowing initialization and deinitialization to be invoked
by LLJIT.
- Changing the LLJIT code generation default to add UseInitArray so
that .init_array constructors are generated for ELF platforms.
- Integrating the ORCv2 Platforms into lli, and adding a
PlatformSupport implementation to the LLJIT instance used by lli which
implements initialization and deinitialization by calling the new
wrapper functions in the runtime.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D126492
Implements the Procedure Linkage Table (PLT) for ELF/AARCH64. The aarch64 Linux calling convention also uses r16 as the intra-procedure-call scratch register, the same as MachO/ARM64, so we can use the same stub sequence.
Also, the BR (register) instruction doesn't touch the X30 register. An external function call by a BL instruction (touched by a CALL26 relocation) will set X30 to the original PC + 4, which is the intended behavior. An external function call by a B instruction (touched by a JUMP26 relocation) doesn't require X30 to be set, so the patch is correct in this case too.
Reference: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#611general-purpose-registers
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127061
Adds aarch64 support in ELFNixPlatform. These are a few simple changes, but they allow us to use the orc runtime in the ELF/AARCH64 backend. It successfully runs the static initializers of libstdc++ iostream, so the "cout << Hello world" testcase starts to work.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127060
Implements R_AARCH64_JUMP26. We can use the same generic aarch64 Branch26 edge, since the B and BL instructions have an immediate field of the same size and offset, and the relocation address calculation is the same.
Reference: ELF for the ARM® 64-bit Architecture Table 4-10, ARM Architecture Reference Manual® ARMv8, for ARMv8-A architecture profile C6.2.24, C6.2.31
Reviewed By: sgraenitz
Differential Revision: https://reviews.llvm.org/D127059
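The shared immediate layout can be sketched as follows, based on the A64 encoding in which both B (0x14000000) and BL (0x94000000) carry a signed word offset in bits 25:0. The function names are hypothetical, and `Delta` is assumed to be the byte distance Target - Fixup + Addend; this is an illustration, not the JITLink code.

```cpp
#include <cstdint>

// True if the byte delta is encodable: 4-byte aligned and within +/-128 MiB.
bool fitsBranch26(int64_t Delta) {
  return (Delta & 0x3) == 0 && Delta >= -(1LL << 27) && Delta < (1LL << 27);
}

// Writes the word offset into imm26 (bits 25:0) and keeps the top six opcode
// bits, so the same routine serves both B and BL.
uint32_t patchBranch26(uint32_t Instr, int64_t Delta) {
  uint32_t Imm26 =
      static_cast<uint32_t>((static_cast<uint64_t>(Delta) >> 2) & 0x03ffffffu);
  return (Instr & 0xfc000000u) | Imm26;
}
```

For instance, a BL whose target is 8 bytes ahead becomes 0x94000002, and a B that jumps back 4 bytes becomes 0x17FFFFFF.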