729 Commits

Author SHA1 Message Date
Fangrui Song 685b212553 [ELF] Make relocateAlloc target specific. NFC
The target-specific code (AArch64, PPC64) does not fit into the generic code and
adds virtual function overhead. Move relocateAlloc into ELF/Arch/ instead. This
removes many virtual functions (relaxTls*). In addition, this helps get rid of
getRelocTargetVA dispatch and many RelExpr members in the future.
2022-10-17 11:01:11 -07:00
Fangrui Song 14f996dca8 [ELF] Move inputSections/ehInputSections into Ctx. NFC 2022-10-16 00:49:48 -07:00
Fangrui Song 9c626d4a0d [ELF] Remove symtab indirection. NFC
Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr indirection.
2022-10-01 14:46:49 -07:00
Fangrui Song 367997d0d6 [Support] Rename llvm::compression::{zlib,zstd}::uncompress to more appropriate decompress
This improves consistency with other places (e.g. llvm::compression::decompress,
llvm::object::Decompressor::decompress, llvm-objcopy).
Note: when zstd::uncompress was added, we noticed that the API `ZSTD_decompress`
is fine while the zlib API `uncompress` is a misnomer.
2022-09-17 12:35:17 -07:00
Fangrui Song 5e0464e38b [ELF] Support ELFCOMPRESS_ZSTD input
so that lld accepts relocatable object files produced by `clang -c -g -gz=zstd`.

We don't want to increase the size of InputSection, so do redundant but cheap
ch_type checks instead.

Differential Revision: https://reviews.llvm.org/D129406
2022-09-09 10:25:37 -07:00
Fangrui Song c682c26942 [ELF] Rename InputSectionBase::uncompress to decompress. NFC
The canonical verb is "decompress" (also used in llvm-objcopy). "uncompressed"
describes the state.
2022-09-09 10:18:46 -07:00
Fangrui Song 2515cb80cd [ELF] Parallelize input section initialization
This implements the last step of
https://discourse.llvm.org/t/parallel-input-file-parsing/60164 for the ELF port.

For an ELF object file, we previously did: parse, (parallel) initializeLocalSymbols, (parallel) postParseObjectFile.
Now we do: parse, (parallel) initSectionsAndLocalSyms, (parallel) postParseObjectFile.

initSectionsAndLocalSyms does most of input section initialization.
The sequential `parse` does SHT_ARM_ATTRIBUTES/SHT_RISCV_ATTRIBUTES/SHT_GROUP initialization for now.

Performance linking some programs with --threads=8 (glibc 2.33 malloc and mimalloc):

* clang: 1.05x as fast with glibc malloc, 1.03x as fast with mimalloc
* chrome: 1.04x as fast with glibc malloc, 1.03x as fast with mimalloc
* internal search program: 1.08x as fast with glibc malloc, 1.05x as fast with mimalloc

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D130810
2022-08-04 11:47:52 -07:00
Gabriel Ravier 5dbd8faad5 [lld] Fixed a number of typos
I went over the output of the following mess of a command:

`(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)`

and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).

Differential Revision: https://reviews.llvm.org/D130982
2022-08-02 09:52:31 -04:00
Fangrui Song d0cf7b2015 [ELF] EhInputSection::getParentOffset: fix out-of-bounds access for symbols relative to a non-empty .eh_frame
This has unclear semantics and can be considered invalid. Return an arbitrary value.
2022-08-01 01:10:51 -07:00
Fangrui Song af1328ef45 [ELF] Simplify EhInputSection::split. NFC
* Inline getReloc
* Fold the UINT32_MAX length check into the section size check.
  This transformation is valid because we don't support .eh_frame input sections
  larger than 32-bit (unrealistic even for large code models).
2022-07-31 16:59:57 -07:00
Fangrui Song 3e9adff456 [ELF] Split EhInputSection::pieces into cies and fdes
This simplifies code, removes a read32 (for id==0 check), and makes it feasible
to combine some operations in EhInputSection::split and EhFrameSection::addRecords.

Mostly NFC, but fixes "Relocation not in any piece" assertion failure in an
erroneous case when a relocation offset precedes all CIE/FDE pieces.
2022-07-31 16:16:10 -07:00
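For readers unfamiliar with the .eh_frame layout, the following is a minimal sketch of splitting a section into CIE and FDE records, as the commit above describes. It assumes little-endian input and the usual record layout (a 4-byte length that excludes itself, followed by a 4-byte ID that is zero for a CIE). `Piece`, `EhSplit`, `read32le` and `splitEhFrame` are hypothetical names for this example, not lld's code. Note how an extended-length marker (0xffffffff) is rejected by the plain size check, in the spirit of af1328ef45.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for this sketch (not lld's data structures).
struct Piece {
  size_t offset; // start of the record, including its length field
  size_t size;   // total record size, including the 4-byte length field
};

struct EhSplit {
  std::vector<Piece> cies, fdes;
  bool ok = true; // false if a malformed record was seen
};

static uint32_t read32le(const uint8_t *p) {
  return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 |
         uint32_t(p[3]) << 24;
}

EhSplit splitEhFrame(const std::vector<uint8_t> &d) {
  EhSplit s;
  size_t off = 0;
  while (off < d.size()) {
    size_t remaining = d.size() - off;
    if (remaining < 4) { // not enough bytes for a length field
      s.ok = false;
      break;
    }
    uint64_t length = read32le(d.data() + off);
    if (length == 0) // zero terminator
      break;
    uint64_t size = length + 4;
    // A length of 0xffffffff (64-bit extended format) yields a size larger
    // than any section accepted here, so this one check rejects it as well.
    if (size < 8 || size > remaining) {
      s.ok = false;
      break;
    }
    uint32_t id = read32le(d.data() + off + 4);
    (id == 0 ? s.cies : s.fdes).push_back({off, size_t(size)});
    off += size;
  }
  return s;
}
```

Keeping the two record kinds in separate vectors means later passes never need to re-read the ID field to tell them apart.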
Fangrui Song c09d323599 [ELF] Move EhInputSection out of inputSections. NFC
inputSections temporarily contains EhInputSection objects mainly for
combineEhSections. Place EhInputSection objects into a new vector
ehInputSections instead of inputSections.
2022-07-31 11:58:08 -07:00
Fangrui Song a465e79f19 [ELF] Move SyntheticSections to InputSection.h. NFC
Keep the main SectionBase hierarchy in InputSection.h.
And inline MergeInputSection::getParent.
2022-07-30 17:42:08 -07:00
Fangrui Song e690137dde [Support] Change compression::zlib::{compress,uncompress} to use uint8_t *
It's more natural to use uint8_t * (std::byte needs C++17 and llvm has
too many uses of uint8_t *) and most callers use uint8_t * instead of char *.
The functions were recently moved into `llvm::compression::zlib::`, so
downstream projects need to adapt anyway.
2022-07-13 16:26:54 -07:00
Fangrui Song dd74d3117d [ELF] Refactor ELFCOMPRESS_ZLIB handling and improve diagnostics
And add some tests.
2022-07-08 14:04:19 -07:00
Cole Kissane ea61750c35 [NFC] Refactor llvm::zlib namespace
* Refactor compression namespaces across the project, making way for a possible
  introduction of alternatives to zlib compression.
  Changes are as follows:
  * Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`.

Reviewed By: MaskRay, leonardchan, phosek

Differential Revision: https://reviews.llvm.org/D128953
2022-07-08 11:19:07 -07:00
Fangrui Song 6611d58f5b [ELF] Relax R_RISCV_ALIGN
Alternative to D125036. Implement R_RISCV_ALIGN relaxation so that we can handle
-mrelax object files (i.e. -mno-relax is no longer needed) and create a
framework for future relaxation.

`relaxAux` is placed in a union with InputSectionBase::jumpInstrMod, storing
auxiliary information for relaxation. In the first pass, `relaxAux` is allocated.
The main data structure is `relocDeltas`: when referencing `relocations[i]`, the
actual offset is `r_offset - (i ? relocDeltas[i-1] : 0)`.
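As a rough illustration of the `relocDeltas` bookkeeping, the sketch below applies the formula above to plain vectors. `adjustedOffset` and its parameters are hypothetical names for this example, and reading `relocDeltas[i]` as a cumulative count of removed bytes is an assumption; this is not lld's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not lld's API). rOffsets[i] is the original r_offset
// of relocations[i]; relocDeltas[i] is assumed to be the cumulative number of
// bytes removed up to and including relocations[i].
uint64_t adjustedOffset(const std::vector<uint64_t> &rOffsets,
                        const std::vector<uint32_t> &relocDeltas, size_t i) {
  assert(i < rOffsets.size() && rOffsets.size() == relocDeltas.size());
  // Bytes removed by earlier relocations shift relocations[i] toward the
  // start of the section; the first relocation is never shifted.
  return rOffsets[i] - (i ? relocDeltas[i - 1] : 0);
}
```

With original offsets {0, 8, 16} and deltas {2, 2, 6}, the adjusted offsets are 0, 6 and 14.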

`relaxOnce` performs one relaxation pass. It computes `relocDeltas` for all text
sections. Then, it adjusts st_value/st_size for symbols relative to each section
based on `SymbolAnchor`. `bytesDropped` is set so that `assignAddresses` knows
that the size has changed.

Run `relaxOnce` in the `finalizeAddressDependentContent` loop to wait for
convergence of text sections and other address dependent sections (e.g.
SHT_RELR). Note: extracting `relaxOnce` into a separate loop works for many cases
but has issues in some linker script edge cases.

After convergence, compute section contents: shrink the NOP sequence of each
R_RISCV_ALIGN as appropriate. Instead of deleting bytes, we run a sequence of
memcpy on the content delimited by relocation locations. For R_RISCV_ALIGN, let
the next memcpy skip the desired number of bytes. Section content computation is
parallelizable, but let's ensure the implementation is mature before
optimizations. Technically we can save a copy if we interleave some code with
`OutputSection::writeTo`, but let's not pollute the generic code (we don't have
templated relocation resolving, so using conditions can impose overhead to
non-RISCV.)
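The memcpy-based compaction described above might look roughly like the following sketch. `Drop` and `compactContent` are hypothetical names, and the drop list is assumed to be sorted by offset with non-overlapping ranges; the real implementation works on relocations and section buffers rather than standalone vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record for this sketch: drop `remove` bytes starting at
// `offset` (part of an R_RISCV_ALIGN NOP sequence in the scenario above).
struct Drop {
  size_t offset;
  size_t remove;
};

// Builds the shrunk content with one memcpy per span between drop points
// instead of deleting bytes in place.
std::vector<uint8_t> compactContent(const std::vector<uint8_t> &in,
                                    const std::vector<Drop> &drops) {
  std::vector<uint8_t> out(in.size()); // upper bound; trimmed at the end
  size_t src = 0, dst = 0;
  auto copySpan = [&](size_t end) {
    assert(src <= end && end <= in.size());
    if (size_t n = end - src) {
      std::memcpy(out.data() + dst, in.data() + src, n);
      dst += n;
    }
  };
  for (const Drop &d : drops) {
    copySpan(d.offset);         // copy everything before the dropped bytes
    src = d.offset + d.remove;  // the next memcpy skips the dropped bytes
  }
  copySpan(in.size());          // copy the tail after the last drop
  out.resize(dst);
  return out;
}
```

Each span is copied exactly once, so the cost stays linear in the section size regardless of how many alignment relocations are relaxed.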

Tested:
`make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- LLVM=1 defconfig all`: the Linux kernel built using -mrelax is bootable.
A FreeBSD RISCV64 system built using -mrelax is bootable.
bash/curl/firefox/libevent/vim/tmux built using -mrelax work.

Differential Revision: https://reviews.llvm.org/D127581
2022-07-07 10:16:09 -07:00
Fangrui Song 16ca490f45 [ELF] Change getRISCVPCRelHi20 error to conventional errorOrWarn 2022-06-12 21:15:06 -07:00
Fangrui Song e09f77d394 [ELF] Remove support for legacy .zdebug sections
.zdebug is unlikely to be used any longer: gcc -gz switched from legacy
.zdebug to SHF_COMPRESSED with binutils 2.26 (2016), which was
several years ago. clang 14 dropped -gz=zlib-gnu support. According to
Debian Code Search (`gz=zlib-gnu`), no project uses -gz=zlib-gnu.

Remove .zdebug support to (a) simplify code and (b) allow removal of llvm-mc's
--compress-debug-sections=zlib-gnu.

If an old object file `a.o` uses .zdebug, run `objcopy --decompress-debug-sections a.o`.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D126793
2022-06-02 13:37:19 -07:00
Andrew Ng c78c00dc16 [LLD][ELF] Drop the string null terminator from the hash in splitStrings
Differential Revision: https://reviews.llvm.org/D126484
2022-05-27 10:48:53 +01:00
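To make the effect of the change above concrete, here is a small sketch of splitting a buffer of NUL-terminated strings and hashing each string without its terminator. It assumes an entry size of 1 and uses `std::hash` as a stand-in for whatever hash function the linker uses; `StringPiece` and `splitStringsSketch` are hypothetical names, not lld's code.

```cpp
#include <cstddef>
#include <functional>
#include <string_view>
#include <vector>

// Hypothetical record for this sketch: where a string starts and its hash.
struct StringPiece {
  size_t offset;
  size_t hash;
};

// Splits `data` at NUL bytes. The hash covers only the string contents, so
// the terminator does not influence it.
std::vector<StringPiece> splitStringsSketch(std::string_view data) {
  std::vector<StringPiece> pieces;
  size_t off = 0;
  while (off < data.size()) {
    size_t end = data.find('\0', off);
    if (end == std::string_view::npos)
      break; // unterminated tail; a real linker would report an error
    std::string_view s = data.substr(off, end - off);
    pieces.push_back({off, std::hash<std::string_view>{}(s)});
    off = end + 1; // continue after the terminator
  }
  return pieces;
}
```

Since every string ends with the same terminator, leaving it out of the hash input loses no information.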
Fangrui Song 02eab52866 [ELF][AArch64] Fix unneeded thunk for branches to hidden undefined weak
Similar to D119787 for PPC64.

A hidden undefined weak may change its binding to local before some
`isUndefinedWeak` code, so some `isUndefinedWeak` code needs to be changed to
`isUndefined`. The undefined non-weak case has already been reported as an
error, so just using `isUndefined` is fine.

The Linux kernel recently gained a usage where a branch from 0xffff800008491ee0
references a hidden undefined weak symbol `vfio_group_set_kvm`.
It relies on the behavior that a branch to an undefined weak resolves to the next
instruction; otherwise it would see spurious relocation out of range errors.

Fixes https://github.com/ClangBuiltLinux/linux/issues/1624

Differential Revision: https://reviews.llvm.org/D123750
2022-04-14 11:32:30 -07:00
Fangrui Song 4645311933 [ELF] --emit-relocs: adjust offsets of .rel[a].eh_frame relocations
Two code paths may reach the EHFrame case in SectionBase::getOffset:

* .eh_frame reference
* relocation copy for --emit-relocs

The first may be used by clang_rt.crtbegin.o and GCC crtbeginT.o to get the
start address of the output .eh_frame. The relocation has an offset of 0 or
(x86-64 PC-relative leaq for clang_rt.crtbegin.o) -4. The current code just
returns `offset`, which handles this case well.

The second is related to InputSection::copyRelocations on .eh_frame (used by
--emit-relocs). .eh_frame pieces may be dropped due to GC/ICF, so we should
convert the input offset to the output offset. Use the same approach as
MergeInputSection, with special-case handling of outSecOff==-1 for an invalid
piece (see eh-frame-marker.s).

This exposes an issue in mips64-eh-abs-reloc.s that we don't reliably
handle anyway. Just add --no-check-dynamic-relocations to paper over it.

Differential Revision: https://reviews.llvm.org/D122459
2022-03-29 09:51:41 -07:00
Fangrui Song 48e251b1d6 Revert D122459 "[ELF] --emit-relocs: adjust offsets of .rel[a].eh_frame relocations"
This reverts commit 6faba31e0d.

It may cause "offset is outside the section".
2022-03-28 20:26:21 -07:00
Fangrui Song 6faba31e0d [ELF] --emit-relocs: adjust offsets of .rel[a].eh_frame relocations
.eh_frame pieces may be dropped due to GC/ICF. When --emit-relocs adds
relocations against .eh_frame, the offsets need to be adjusted. Use the same
approach as MergeInputSection, with special-case handling of outSecOff==-1 for
an invalid piece (see eh-frame-marker.s).

This exposes an issue in mips64-eh-abs-reloc.s that we don't reliably
handle anyway. Just add --no-check-dynamic-relocations to paper over it.

Original patch by Ayrton Muñoz

Differential Revision: https://reviews.llvm.org/D122459
2022-03-28 16:23:13 -07:00
Fangrui Song 8565a87fd4 [ELF] Simplify MergeInputSection::getParentOffset. NFC
and remove overly verbose comments.
2022-03-28 10:02:35 -07:00
Fangrui Song 72bedf46c7 [ELF] Inline InputSection::getParent. NFC
Combined with the previous change, lld executable is ~2K smaller and some code
paths using InputSection::getParent are more efficient.

The fragmented headers lead to a design limitation that OutputSection has to be
incomplete, so we cannot use static_cast.
2022-03-08 11:26:12 -08:00
Fangrui Song a815424cc5 Reland D119909 [ELF] Parallelize initializeLocalSymbols
ObjFile::parse combines symbol initialization and resolution. Many tasks
unrelated to symbol resolution can be postponed and parallelized. This patch
extracts local symbol initialization and parallelizes it.

Technically the new function initializeLocalSymbols can be merged into
ObjFile::postParse, but functions like getSrcMsg may access the
uninitialized (all nullptr) local part of InputFile::symbols.

Linking chrome: 1.02x as fast with glibc malloc, 1.04x as fast with mimalloc

Depends on f456c3ae3f and D119908

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D119909
2022-03-04 19:00:10 -08:00
Jorge Gorbe Moya 449b649fec Revert "[ELF] Parallelize initializeLocalSymbols"
This reverts commit 09602d3b47.
2022-03-04 15:01:17 -08:00
Fangrui Song bb3eeac773 [ELF] Make InputSection::classof inline. NFC 2022-02-28 00:16:45 -08:00
Fangrui Song 4976d1fe58 [ELF] Move SyntheticSection check from InputSection::writeTo to OutputSection::writeTo. NFC
Simplify code and move the heavyweight operation to the call site so that it is
clearer how to improve the inefficient scheduling in the future.
2022-02-27 23:28:52 -08:00
Fangrui Song 09602d3b47 [ELF] Parallelize initializeLocalSymbols
ObjFile::parse combines symbol initialization and resolution. Many tasks
unrelated to symbol resolution can be postponed and parallelized. This patch
extracts local symbol initialization and parallelizes it.

Technically the new function initializeLocalSymbols can be merged into
ObjFile::postParse, but functions like getSrcMsg may access the
uninitialized (all nullptr) local part of InputFile::symbols.

Linking chrome: 1.02x as fast with glibc malloc, 1.04x as fast with mimalloc

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D119909
2022-02-24 20:05:59 -08:00
Fangrui Song ae1ba6194f [ELF] Replace uncompressed InputSectionBase::data() with rawData. NFC
In many call sites we know uncompression cannot happen (non-SHF_ALLOC, or the
data (even if compressed) must have been uncompressed by a previous pass).
Prefer rawData in these cases. data() increases code size and prevents
optimization on rawData.
2022-02-21 00:39:26 -08:00
Fangrui Song 27bb799095 [ELF] Clean up headers. NFC 2022-02-07 21:53:34 -08:00
Alexander Shaposhnikov 4450a2a23d [lld][ELF] Add support for ADRP+ADD optimization for AArch64
This diff adds support for ADRP+ADD optimization for AArch64 described in
d2ca58c54b
i.e. under appropriate constraints

ADRP  x0, symbol
ADD   x0, x0, :lo12: symbol

can be turned into

NOP
ADR   x0, symbol
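A minimal C++17 sketch of the rewrite shown above, covering only the range check and the instruction encoding. `relaxAdrpAdd` and `encodeAdr` are hypothetical helpers for this example; the other constraints a linker would have to check (for instance that both instructions use the same register and are adjacent) are not modeled here. ADR takes a signed 21-bit byte offset, so the symbol must be within ±1 MiB of the rewritten instruction.

```cpp
#include <array>
#include <cstdint>
#include <optional>

constexpr uint32_t kNop = 0xd503201f; // AArch64 NOP

// Encodes `ADR Xd, <pc + imm>`: immlo goes in bits 30:29, immhi in bits 23:5.
// The caller must ensure imm fits in a signed 21-bit field.
uint32_t encodeAdr(unsigned rd, int64_t imm) {
  uint64_t u = static_cast<uint64_t>(imm);
  uint32_t immlo = static_cast<uint32_t>(u & 0x3);
  uint32_t immhi = static_cast<uint32_t>((u >> 2) & 0x7ffff);
  return 0x10000000u | (immlo << 29) | (immhi << 5) | (rd & 0x1f);
}

// Hypothetical helper: addPc is the address of the ADD instruction, which is
// the slot that receives the ADR. Returns {NOP, ADR} if the symbol is in
// range, or std::nullopt if the pair must be left alone.
std::optional<std::array<uint32_t, 2>> relaxAdrpAdd(uint64_t addPc,
                                                    uint64_t sym,
                                                    unsigned rd) {
  int64_t disp = static_cast<int64_t>(sym - addPc);
  if (disp < -(int64_t(1) << 20) || disp >= (int64_t(1) << 20))
    return std::nullopt;
  return std::array<uint32_t, 2>{kNop, encodeAdr(rd, disp)};
}
```

Returning `std::nullopt` for out-of-range symbols keeps the original ADRP+ADD pair intact, which is always a valid fallback.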

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D117614
2022-02-02 06:09:55 +00:00
Fangrui Song 17a39aecd1 [ELF] Simplify code with invokeELFT. NFC 2022-02-01 09:53:29 -08:00
Fangrui Song d97749fabc [ELF] Switch split-stack to use SmallVector. NFC
My x86-64 lld executable is 1.1KiB smaller.
2022-02-01 00:09:30 -08:00
Fangrui Song 457273fda5 [ELF] splitStrings: replace entSize==1 special case with manual loop unswitch. NFC
My x86-64 lld executable is actually smaller.
2022-01-30 17:15:45 -08:00
Fangrui Song 5a2020d069 [ELF] copyShtGroup: replace unordered_set<uint32_t> with DenseSet<uint32_t>. NFC
We don't need to support the empty/tombstone key section index.
2022-01-30 01:18:41 -08:00
Fangrui Song bc1369fae3 [ELF] Optimize MergeInputSection::splitNonStrings with resize_for_overwrite. NFC 2022-01-30 00:10:52 -08:00
Fangrui Song 14b7785c09 [ELF] Simplify InputSection::writeTo. NFC 2022-01-26 22:03:26 -08:00
Alexandre Ganea 83d59e05b2 Re-land [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

The previous land f860fe3622 caused issues in https://lab.llvm.org/buildbot/#/builders/123/builds/8383, fixed by 22ee510dac.

Differential Revision: https://reviews.llvm.org/D108850
2022-01-20 14:53:26 -05:00
Fangrui Song a7a4115bf3 [ELF] Replace .zdebug string comparison with SHF_COMPRESSED check. NFC 2022-01-19 22:33:32 -08:00
Fangrui Song 5bd38a2826 [ELF] Fix split-stack caller with hidden non-split-stack callee
Fix a regression after aabe901d57 (`[ELF] Remove
one redundant computeBinding`): isLocal() does not indicate that the symbol is
originally local. For simplicity, just drop this optimization.
2022-01-19 12:25:01 -08:00
Fangrui Song 5f404a749a [ELF] De-template InputSectionBase::getLocation. NFC 2022-01-18 17:33:58 -08:00
Fangrui Song eafd34581f [ELF] Simplify/optimize EhInputSection::split
and change some `fatal` to `errorOrWarn`.

EhFrame.cpp is a helper file. We don't place all .eh_frame implementation there,
so the code move is fine.
2022-01-18 17:03:23 -08:00
Fangrui Song 83c7f5d3fb [ELF] EhInputSection::split: remove unneeded check 2022-01-17 13:59:52 -08:00
Alexandre Ganea e6b153947d Revert [LLD] Remove global state in lldCommon
It seems to be causing issues on https://lab.llvm.org/buildbot/#/builders/123/builds/8383
2022-01-16 11:03:06 -05:00
Alexandre Ganea f860fe3622 [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

Differential Revision: https://reviews.llvm.org/D108850
2022-01-16 08:57:57 -05:00
Fangrui Song 8b2f33231c [ELF] Make some diagnostics follow the convention 2022-01-15 10:46:25 -08:00
Igor Kudrin e00ac48df3 [ELF] Use tombstone values for discarded symbols in relocatable output
This extends D81784. Sections can be discarded when linking a
relocatable output. Before the patch, LLD did not update the content
of debug sections and only replaced the corresponding relocations with
R_*_NONE, which could break the debug information.

Differential Revision: https://reviews.llvm.org/D116946
2022-01-13 11:38:26 +07:00