Commit Graph

442 Commits

Author SHA1 Message Date
Fangrui Song e980f16d52 [ELF] Move whyExtract/backwardReferences from LinkerDriver to Ctx. NFC
Ctx was recently added as a more suitable place for such singletons.
2022-06-29 17:34:31 -07:00
Shoaib Meenai 1af25a9860 [ELF] Fix wrapping symbols produced during LTO codegen
We were previously not correctly wrapping symbols that were only
produced during LTO codegen and unreferenced before then, or symbols
only referenced from such symbols. The root cause was that we weren't
marking the wrapped symbol as used if we only saw the use after LTO
codegen, leading to the failed wrapping.

Fix this by explicitly tracking whether a symbol will become referenced
after wrapping is done. We can use this property to tell LTO to preserve
such symbols, instead of overload isUsedInRegularObj for this purpose.
Since we're no longer setting isUsedInRegularObj for all symbols which
will be wrapped, its value at the time of performing the wrapping in the
symbol table will accurately reflect whether the symbol was actually
used in an object (including in an LTO-generated object), and we can
propagate that value to the wrapped symbol and thereby ensure we wrap
correctly.

This incorrect wrapping was the only scenario I was aware of where we
produced an invalid PLT relocation, which D123985 started diagnosing,
and with it fixed, we lose the test for that diagnosis. I think it's
worth keeping the diagnosis though, in case we run into other issues in
the future which would be caught by it.

Fixes PR50675.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D124056
2022-04-22 16:45:21 -07:00
Shoaib Meenai b862bcbf44 [ELF] Move SymbolUnion assertions to source file
Otherwise they fires for every single file which includes the header,
which is very noisy when building.

Reviewed By: MaskRay, peter.smith

Differential Revision: https://reviews.llvm.org/D124041
2022-04-22 16:41:14 -07:00
Shoaib Meenai 4641d86e45 [ELF] Shrink binding and type in Symbol
STB_HIPROC and STT_HIPROC are both 15, so we can fit the symbol binding
and type in 4 bits. This gives us an additional byte to use for Symbol
flags (without increasing the type's size), which I'll be making use of
in the next diff.

Reorder type and binding based on a suggestion from @MaskRay, to
optimize st_info computation on little-endian systems (see
https://godbolt.org/z/nMn8Yar43).

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D124042
2022-04-20 10:46:36 -07:00
Fangrui Song 7c7702b318 [ELF] Move section assignment from initializeSymbols to postParse
https://discourse.llvm.org/t/parallel-input-file-parsing/60164

initializeSymbols currently sets Defined::section and handles non-prevailing
COMDAT groups. Move the code to the parallel postParse to reduce work from the
single-threading code path and make parallel section initialization infeasible.

Postpone reporting duplicate symbol errors so that the messages have the
section information. (`Defined::section` is assigned in postParse and another
thread may not have the information).

* duplicated-synthetic-sym.s: BinaryFile duplicate definition (very rare) now
  has no section information
* comdat-binding: `%t/w.o %t/g.o` leads to an undesired undefined symbol. This
  is not ideal but we report a diagnostic to inform that this is unsupported.
  (See release note)
* comdat-discarded-lazy.s: %tdef.o is unextracted. The new behavior (discarded
  section error) makes more sense
* i386-comdat.s: switched to a better approach working around
  .gnu.linkonce.t.__x86.get_pc_thunk.bx in glibc<2.32 for x86-32.
  Drop the ancient no-longer-relevant workaround for __i686.get_pc_thunk.bx

Depends on D120640

Differential Revision: https://reviews.llvm.org/D120626
2022-03-15 19:24:41 -07:00
Fangrui Song 9b61fff0eb Revert D120626 "[ELF] Move section assignment from initializeSymbols to postParse"
This reverts commit c30e6447c0.
It exposed brittle support for __x86.get_pc_thunk.bx.
Need to think a bit how to support __x86.get_pc_thunk.bx.
2022-03-15 19:00:54 -07:00
Fangrui Song c30e6447c0 [ELF] Move section assignment from initializeSymbols to postParse
https://discourse.llvm.org/t/parallel-input-file-parsing/60164

initializeSymbols currently sets Defined::section and handles non-prevailing
COMDAT groups. Move the code to the parallel postParse to reduce work from the
single-threading code path and make parallel section initialization infeasible.

Postpone reporting duplicate symbol errors so that the messages have the
section information. (`Defined::section` is assigned in postParse and another
thread may not have the information).

* duplicated-synthetic-sym.s: BinaryFile duplicate definition (very rare) now
  has no section information
* comdat-binding: `%t/w.o %t/g.o` leads to an undesired undefined symbol. This
  is not ideal but we report a diagnostic to inform that this is unsupported.
  (See release note)
* comdat-discarded-lazy.s: %tdef.o is unextracted. The new behavior (discarded
  section error) makes more sense

Depends on D120640

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D120626
2022-03-14 14:13:41 -07:00
Fangrui Song c7cf960d85 [ELF] Set the priority of STB_GNU_UNIQUE the same as STB_WEAK
In GCC -fgnu-unique output, STB_GNU_UNIQUE symbols are always defined relative
to a section in a COMDAT group. Currently `other` cannot be STB_GNU_UNIQUE for
valid input, so this patch is NFC.

If we switch to the model that ignores COMDAT resolution when performing symbol
resolution (D120626), this will fix bogus `relocation refers to a symbol in a
discarded section` errors when mixing -fno-gnu-unique objects with -fgnu-unique
objects.

Differential Revision: https://reviews.llvm.org/D120640
2022-03-14 12:00:15 -07:00
Fangrui Song b07ef4d566 [ELF] Rename Symbol::compare to shouldReplace. NFC
The return value is not a boolean instead of a tri-state.
Suggested by Peter Smith in D120640.
2022-02-28 18:25:21 +00:00
Fangrui Song d14d8664e3 [ELF] Change global variable backwardReferences to a LinkerDriver member variable. NFC
Similar to whyExtract.
2022-02-27 20:33:28 +00:00
Fangrui Song 7fd3849b35 [ELF] Move --print-archive-stats= and --why-extract= beside --warn-backrefs report
So that early errors don't suppress their output.
2022-02-27 20:23:09 +00:00
Fangrui Song 6d94340809 [ELF] Simplify resolveDefined and resolveCommon
This is NFC for valid input (COMMON symbols cannot be weak or versioned).
2022-02-24 14:08:06 -08:00
Fangrui Song 4129890dd8 [ELF] De-template Symbol::resolveLazy. NFC 2022-02-24 12:20:05 -08:00
Fangrui Song 5bc4e15c6e [ELF] Set config->exportDynamic to true if config->shared. NFC 2022-02-24 11:31:58 -08:00
Fangrui Song 9f9ac3464e [ELF] Symbols.h: remove #include "InputFiles.h" 2022-02-23 21:36:45 -08:00
Fangrui Song 8ca46bba23 [ELF] Move isUsedInRegularObj assignment from ctor to call sites. NFC
This removes the tricky
`isUsedInRegularObj(!file || file->kind() == InputFile::ObjKind)`
and the copy from `Symbol::mergeProperties`.
2022-02-23 21:32:50 -08:00
Fangrui Song 38fbedab32 [ELF] Don't rely on Symbols.h's transitive inclusion of InputFiles.h. NFC 2022-02-23 20:44:34 -08:00
Fangrui Song ba061713d3 [ELF] Move TLS mismatch error from Symbol::replace to postParse
* detect `def_tls.o undef_nontls.o` violation
* place error checking code (checking duplicate symbol) together
* allow `--defsym tls1=tls2 def_tls.o`

As a degraded error checking, `--defsym tls1=42` violation will not be detected.
2022-02-23 20:34:48 -08:00
Fangrui Song 47d18be58b [ELF] Remove SharedSymbol::getFile. NFC
Symbol.h depends on InputFiles.h. This change moves us toward dropping the
weird dependency.

The call sites will become slightly uglier (`cast<SharedFile>(s->file)`), but
the compromise is acceptable.
2022-02-23 17:57:52 -08:00
Fangrui Song 88d66f6ed1 [ELF] Move duplicate symbol check after input file parsing
https://discourse.llvm.org/t/parallel-input-file-parsing/60164

To decouple symbol initialization and section initialization, `Defined::section`
assignment should be postponed after input file parsing. To avoid spurious
duplicate definition error due to two definitions in COMDAT groups of the same
signature, we should postpone the duplicate symbol check.

The function is called postScan instead of a more specific name like
checkDuplicateSymbols, because we may merge Symbol::mergeProperties into
postScan. It is placed after compileBitcodeFiles to apply to ET_REL files
produced by LTO. This causes minor diagnostic regression
for skipLinkedOutput configurations: ld.lld --thinlto-index-only a.bc b.o
(bitcode definition prevails) won't detect duplicate symbol error. I think this
is an acceptable compromise. The important cases where (a) both files are
bitcode or (b) --thinlto-index-only is unused are still detected.

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D119908
2022-02-22 10:07:58 -08:00
Fangrui Song 3d85424096 [ELF] Parse archives as --start-lib object files
https://maskray.me/blog/2022-01-16-archives-and-start-lib

For every definition in an extracted archive member, we intern the symbol twice,
once for the archive index entry, once for the .o symbol table after extraction.
This is inefficient.

Symbols in a --start-lib ObjFile/BitcodeFile are only interned once because the
result is cached in symbols[i].

Just handle an archive using the --start-lib code path. We can therefore remove
ArchiveFile and LazyArchive. For many projects, archive member extraction ratio
is high and it is a net performance win. Linking a Release build of clang is
1.01x as fast.

Note: --start-lib scans symbols in the same order that llvm-ar adds them to the
index, so in the common case the semantics should be identical. If the archive
symbol table was created in a different order, or is incomplete, this strategy
may have different semantics. Such cases are considered user error.

The `is neither ET_REL nor LLVM bitcode` error is changed to a warning.
Previously an archive may have such members without a diagnostic. Using a
warning prevents breakage.

* For some tests, the diagnostics get improved where we did not consider
  the archive member name: `b.a:` => `b.a(b.o):`.
* `no-obj.s`: the link is now allowed, matching GNU ld
* `archive-no-index.s`: the `is neither ET_REL nor LLVM bitcode` diagnostic is
  demoted to a warning.
* `incompatible.s`: even when an archive is unextracted, we may report an
  "incompatible with" error.

---

I recently decreased sizeof(SymbolUnion) by 8 and decreased memory usage quite a
bit, so retaining `symbols` for un-extracted archive members should not cause a
memory usage problem.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D119074
2022-02-15 09:38:00 -08:00
Ben Dunbobbin 666aa43cbf Fix comment after upstream: 9e08e92980 - [ELF] Allow STV_PROTECTED shared definition to set exportDynamic? 2022-02-09 23:51:31 +00:00
Fangrui Song 27bb799095 [ELF] Clean up headers. NFC 2022-02-07 21:53:34 -08:00
Fangrui Song cb03ac0b5d [ELF] Move Symbol::needsTlsLd to config->needsTlsLd
to decrease sizeof(SymbolUnion) from 72 to 64 on ELF64 platforms.

Use a dummy `Undefined` to prevent null pointer dereference (though unused)
`*rel.sym` in InputSectionBase::relocateAlloc.

The relocation order may shuffle a bit, but otherwise there is no behavior
difference.
2022-02-07 10:26:16 -08:00
Alexander Kornienko ec8a693717 Revert "[ELF] Move Symbol::needsTlsLd to config->needsTlsLd. NFC"
This reverts commit f9e3ca542e.

The commit results in internal test failures. Test case provided offline.
2022-02-07 19:00:09 +01:00
Fangrui Song 5ad2aae244 [ELF] SharedFile::parse: move verdefIndex assignment outside of ctor. NFC
SharedSymbol::SharedSymbol initializes verdefIndex and Symbol::replace
copies verdefIndex.

By move verdefIndex assignment outside of ctor, Symbol::replace can be changed
to not copy verdefIndex. This can be used to decrease work for for
ObjKind/BitcodeKind.
2022-02-05 20:43:51 -08:00
Fangrui Song 977a1a523c [ELF] Symbol::replace: use the old nameData/nameSize. NFC
Currently `this->getName() == newSym.getName()`.
By keeping the old nameData/nameSize, newSym's nameData/nameSize will be
ignored. The call sites can avoid calling getName().

printTraceSymbol needs to take the symbol name since `other`'s name is empty.
2022-02-05 16:34:02 -08:00
Fangrui Song f9e3ca542e [ELF] Move Symbol::needsTlsLd to config->needsTlsLd. NFC
to decrease sizeof(SymbolUnion) from 72 to 64 on ELF64 platforms.
2022-02-05 14:40:15 -08:00
Fangrui Song 73f55fba76 [ELF] Reorder Symbol members to improve access locality. NFC
* partition and isPreemptible are frequently used. Move it to the front
* move used beside isUsedInRegularObj. They are similar and accessed together in .symtab finalizing
* move auxIdx/dynsymIndex/verdefIndex to the end.

This decreases code size.
2022-02-05 14:11:37 -08:00
Fangrui Song 7c675923c7 [ELF] Merge canInline into scriptDefined
They perform similar tasks and are essentially the same after
d28c26bbdd.
2022-02-05 12:00:34 -08:00
Fangrui Song bb4eacdb70 [ELF] Refactor how Symbol::used is set. NFC 2022-02-05 11:09:40 -08:00
Fangrui Song ac2911e738 [ELF] Refactor how exportDynamic is set. NFC 2022-02-05 10:25:25 -08:00
Fangrui Song 9e08e92980 [ELF] Allow STV_PROTECTED shared definition to set exportDynamic
A STV_PROTECTED shared definition does not set exportDynamic of a defined
symbol. This is on the basis that a protected definition cannot be preempted so
the export is unnecessary. However, the condition is imperfect because we don't
know whether the shared object was built with a symbolic option. Since dropping
the condition simplifies code and matches GNU ld, let's do it.
2022-02-05 01:10:43 -08:00
Fangrui Song 03909c4400 [ELF] Remove StringRefZ
StringRefZ does not improve performance. Non-local symbols always have eagerly
computed nameSize. Most local symbols's lengths will be updated in either:

* shouldKeepInSymtab
* SymbolTableBaseSection::addSymbol

Its benefit is offsetted by strlen in every call site (sums up to 5KiB code in a
release x86-64 build), so using StringRefZ may be slower.

In a -s link (uncommon) there is minor speedup, like ~0.3% for clang and chrome.

Reviewed By: alexander-shaposhnikov

Differential Revision: https://reviews.llvm.org/D117644
2022-01-19 20:09:41 -08:00
Fangrui Song 7f1955dc96 [ELF] Support mixed TLSDESC and TLS GD
We only support both TLSDESC and TLS GD for x86 so this is an x86-specific
problem. If both are used, only one R_X86_64_TLSDESC is produced and TLS GD
accesses will incorrectly reference R_X86_64_TLSDESC. Fix this by introducing
SymbolAux::tlsDescIdx.

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D116900
2022-01-10 10:03:21 -08:00
Fangrui Song 5d3bd7f360 [ELF] Move gotIndex/pltIndex/globalDynIndex to SymbolAux
to decrease sizeof(SymbolUnion) by 8 on ELF64 platforms.

Symbols needing such information are typically 1% or fewer (5134 out of 560520
when linking clang, 19898 out of 5550705 when linking chrome). Storing them
elsewhere can decrease memory usage and symbol initialization time.
There is a ~0.8% saving on max RSS when linking a large program.

Future direction:

* Move some of dynsymIndex/verdefIndex/versionId to SymbolAux
* Support mixed TLSDESC and TLS GD without increasing sizeof(SymbolUnion)

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D116281
2022-01-09 13:43:27 -08:00
Fangrui Song 7924b3814f [ELF] Add Symbol::hasVersionSuffix
"Process symbol versions" may take 2+% time.
"Redirect symbols" may take 0.6% time.
This change speeds up the two passes and makes `*sym.getVersionSuffix()
== '@'` in the `undefined reference` diagnostic cleaner.

Linking chrome (no debug info) and another large program is 1.5% faster.

For empty-ver2.s: the behavior now matches GNU ld, though I'd consider the input
invalid and the exact behavior does not matter.
2021-12-26 17:25:54 -08:00
Fangrui Song 70912420bb [ELF] Move TLS dynamic relocations to postScanRelocations
This temporarily increases sizeof(SymbolUnion), but allows us to mov GOT/PLT/etc
index members outside Symbol in the future.

Then, we can make TLSDESC and TLSGD use different indexes and support mixed
TLSDESC and TLSGD (tested by x86-64-tlsdesc-gd-mixed.s).

Note: needsTlsGd and needsTlsGdToIe may optionally be combined.

Test updates are due to reordered GOT entries.
2021-12-24 22:36:49 -08:00
Fangrui Song e1b6b5be46 [ELF] Avoid referencing SectionBase::repl after ICF
It is fairly easy to forget SectionBase::repl after ICF.
Let ICF rewrite a Defined symbol's `section` field to avoid references to
SectionBase::repl in subsequent passes. This slightly improves the --icf=none
performance due to less indirection (maybe for --icf={safe,all} as well if most
symbols are Defined).

With this change, there is only one reference to `repl` (--gdb-index D89751).
We can undo f4fb5fd752 (`Move Repl to SectionBase.`)
but move `repl` to `InputSection` instead.

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D116093
2021-12-24 12:09:48 -08:00
Fangrui Song 3a5fb57393 [ELF] Replace LazyObjFile with lazy ObjFile/BitcodeFile
The new `lazy` state is the inverse of the previous `LazyObjFile::extracted`.
There are many advantages:

* previously when a LazyObjFile was extracted, a new ObjFile/BitcodeFile was created; now the file is reused, just with `lazy` cleared
* avoid the confusing transfer of `symbols` from LazyObjFile to the new file
* the `incompatible file:` diagnostic is unified with `is incompatible with`
* simpler code, smaller executable (6200+ bytes smaller on x86-64)
* make eager parsing feasible (for parallel section/symbol table initialization)
2021-12-22 17:41:50 -08:00
Fangrui Song b0211de5e3 [ELF] Change Symbol::verdefIndex from uint32_t to uint16_t
The SHT_GNU_version index is 16-bit, so the 32-bit value is a waste.
Technically non-default version index 0x7fff uses version index 0xffff,
but it is impossible in practice.

This change decreases sizeof(SymbolUnion) from 80 to 72 on ELF64 platforms.
Memory usage decreases by 1% when linking a large executable.
2021-12-15 17:59:30 -08:00
Fangrui Song 68009b78f2 [ELF] Symbol::replace: remove dead code 2021-12-15 16:08:18 -08:00
Fangrui Song a8d6d2614b [ELF] Replace make<Defined> with makeDefined. NFC
This removes SpecificAlloc<Defined> and makes my lld executable 1.5k smaller.
This drops the small memory waste due to the separate BumpPtrAllocator.
2021-12-15 13:15:03 -08:00
Fangrui Song cf783be8d7 Reland D114783/D115603 [ELF] Split scanRelocations into scanRelocations/postScanRelocations
(Fixed an issue about GOT on a copy relocated alias.)
(Fixed an issue about not creating r_addend=0 IRELATIVE for unreferenced non-preemptible ifunc.)

The idea is to make scanRelocations mark some actions are needed (GOT/PLT/etc)
and postpone the real work to postScanRelocations. It gives some flexibility:

* Make it feasible to support .plt.got (PR32938): we need to know whether GLOB_DAT and JUMP_SLOT are both needed.
* Make non-preemptible IFUNC handling slightly cleaner: avoid setting/clearing sym.gotInIgot
* -z nocopyrel: report all copy relocation places for one symbol
* Make GOT deduplication feasible
* Make parallel relocation scanning feasible (if we can avoid all stateful operations and make Symbol attributes atomic), but parallelism may not be the appealing choice

Since this patch moves a large chunk of code out of ELFT templates. My x86-64
executable is actually a few hundred bytes smaller.

For ppc32-ifunc-nonpreemptible-pic.s: I remove absolute relocation references to non-preemptible ifunc
because absolute relocation references are incorrect in -fpie mode.

Reviewed By: peter.smith, ikudrin

Differential Revision: https://reviews.llvm.org/D114783
2021-12-14 16:28:41 -08:00
Fangrui Song ea15b862d7 Revert D114783 [ELF] Split scanRelocations into scanRelocations/postScanRelocations
May cause a failure for non-preemptible `bcmp` in a glibc -static link.
2021-12-14 14:33:50 -08:00
Fangrui Song b79686c6dc [ELF] Remove needsPltAddr in favor of needsCopy
needsPltAddr is equivalent to `needsCopy && isFunc`. In many places, it is
equivalent to `needsCopy` because the non-STT_FUNC cases are ruled out.

Reviewed By: ikudrin, peter.smith

Differential Revision: https://reviews.llvm.org/D115603
2021-12-14 09:52:43 -08:00
Fangrui Song e7a95b0674 Reland [ELF] Split scanRelocations into scanRelocations/postScanRelocations
(Fixed an issue about GOT on a copy relocated alias.)

The idea is to make scanRelocations mark some actions are needed (GOT/PLT/etc)
and postpone the real work to postScanRelocations. It gives some flexibility:

* Make it feasible to support .plt.got (PR32938): we need to know whether GLOB_DAT and JUMP_SLOT are both needed.
* Make non-preemptible IFUNC handling slightly cleaner: avoid setting/clearing sym.gotInIgot
* -z nocopyrel: report all copy relocation places for one symbol
* Make GOT deduplication feasible
* Make parallel relocation scanning feasible (if we can avoid all stateful operations and make Symbol attributes atomic), but parallelism may not be the appealing choice

Since this patch moves a large chunk of code out of ELFT templates. My x86-64
executable is actually a few hundred bytes smaller.

For ppc32-ifunc-nonpreemptible-pic.s: I remove absolute relocation references to non-preemptible ifunc
because absolute relocation references are incorrect in -fpie mode.

Reviewed By: peter.smith, ikudrin

Differential Revision: https://reviews.llvm.org/D114783
2021-12-13 20:11:24 -08:00
Fangrui Song 0b8b86e30f Revert "[ELF] Split scanRelocations into scanRelocations/postScanRelocations"
This reverts commit fc33861d48.

`replaceWithDefined` should copy needsGot, otherwise an alias for a copy
relocated symbol may not have GOT entry if its needsGot was originally true.
2021-12-13 19:29:53 -08:00
Fangrui Song fc33861d48 [ELF] Split scanRelocations into scanRelocations/postScanRelocations
The idea is to make scanRelocations mark some actions are needed (GOT/PLT/etc)
and postpone the real work to postScanRelocations. It gives some flexibility:

* Make it feasible to support .plt.got (PR32938): we need to know whether GLOB_DAT and JUMP_SLOT are both needed.
* Make non-preemptible IFUNC handling slightly cleaner: avoid setting/clearing sym.gotInIgot
* -z nocopyrel: report all copy relocation places for one symbol
* Make parallel relocation scanning possible (if we can avoid all stateful operations and make Symbol attributes atomic), but parallelism may not be the appealing choice
* Make GOT deduplication feasible

Since this patch moves a large chunk of code out of ELFT templates. My x86-64
executable is actually a few hundred bytes smaller.

For ppc32-ifunc-nonpreemptible-pic.s: I remove absolute relocation references to non-preemptible ifunc
because absolute relocation references are incorrect in -fpie mode.

Reviewed By: peter.smith, ikudrin

Differential Revision: https://reviews.llvm.org/D114783
2021-12-13 09:56:52 -08:00
Fangrui Song 09401dfcf1 [ELF] Rename fetch to extract
The canonical term is "extract" (GNU ld documentation, Solaris's `-z *extract`
options). Avoid inventing a term and match --why-extract. (ld64 prefers "load"
but the word is overloaded too much)

Mostly MFC, except for --help messages and the header row in
--print-archive-stats output.
2021-11-26 10:58:50 -08:00