We have been creating many ConcatInputSections with identical values due
to .subsections_via_symbols. This diff factors out the identical values
into a Shared struct, to reduce memory consumption and make copying
cheaper.
I also changed `callSiteCount` from a uint32_t to a 31-bit field to save an
extra word.
All in all, this takes InputSection from 120 to 72 bytes (and
ConcatInputSection from 160 to 112 bytes), i.e. 30% size reduction in
ConcatInputSection.
Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:
N Min Max Median Avg Stddev
x 20 4.14 4.24 4.18 4.183 0.027548999
+ 20 4.04 4.11 4.075 4.0775 0.018027756
Difference at 95.0% confidence
-0.1055 +/- 0.0149005
-2.52211% +/- 0.356215%
(Student's t, pooled s = 0.0232803)
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D105305
`__cfstring` is a special literal section, so instead of breaking it up
at symbol boundaries, we break it up at fixed-width boundaries (since
each literal is the same size). Symbols can only occur at one of those
boundaries, so this is strictly more powerful than
`.subsections_via_symbols`.
With that in place, we then run the section through ICF.
This change is about perf-neutral when linking chromium_framework.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D105045
SymtabSection::emitStabs() writes the symbol table in the order
of externalSymbols, which has the order of symtab->getSymbols(),
which is just the order symbols are added to the symbol table.
In practice, symbols in the symbol files of input .o files are
sorted, but since that's not guaranteed we sort them in
ObjFile::parseSymbols(). To make sure several symbols with the same
address keep the order they're in the input file, we have to use
stable_sort().
In practice, std::sort() on already-sorted inputs won't change the order
of just adjacent elements, and while in theory std::sort() could use a
random pivot, in practice the code should be deterministic as it was
previously too.
But now lld/test/MachO/stabs.s passes with LLVM_ENABLE_EXPENSIVE_CHECKS=ON
(the last test that was failing with that set).
Fixes a regression from D99972.
While here, remove an empty section in stabs.s and move
.subsections_via_symbols to the end where it usually is (this part no
behavior change).
Differential Revision: https://reviews.llvm.org/D105071
Fixes PR50637.
Downstream bug: https://crbug.com/1218958
Currently, we split __cstring along symbol boundaries with .subsections_via_symbols
when not deduplicating, and along null bytes when deduplicating. This change splits
along null bytes unconditionally, and preserves original alignment in the non-
deduplicated case.
Removing subsections-section-relocs.s because with this change, __cstring
is never reordered based on the order file.
Differential Revision: https://reviews.llvm.org/D104919
Previously, we asserted that such a case was invalid, but in fact
`ld -r` can emit such symbols if the input contained a (true) private
extern, or if it contained a symbol started with "L".
Non-extern symbols marked as private extern are essentially equivalent
to regular TU-scoped symbols, so no new functionality is needed.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D104502
ICF = Identical C(ode|OMDAT) Folding
This is the LLD ELF/COFF algorithm, adapted for MachO. So far, only `-icf all` is supported. In order to support `-icf safe`, we will need to port address-significance tables (`.addrsig` directives) to MachO, which will come in later diffs.
`check-{llvm,clang,lld}` have 0 regressions for `lld -icf all` vs. baseline ld64.
We only run ICF on `__TEXT,__text` for reasons explained in the block comment in `ConcatOutputSection.cpp`.
Here is the perf impact for linking `chromium_framekwork` on a Mac Pro (16-core Xeon W) for the non-ICF case vs. pre-ICF:
```
N Min Max Median Avg Stddev
x 20 4.27 4.44 4.34 4.349 0.043029977
+ 20 4.37 4.46 4.405 4.4115 0.025188761
Difference at 95.0% confidence
0.0625 +/- 0.0225658
1.43711% +/- 0.518873%
(Student's t, pooled s = 0.0352566)
```
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D103292
It's a warning in ld64. While having LLD be stricter would be nice, it
makes it harder for it to be a drop-in replacement into existing builds.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D104333
This is motivated by an upcoming diff in which the
WordLiteralInputSection ctor sets itself up based on the value of its
section flags. As such, it needs to be passed the `flags` value as part
of its ctor parameters, instead of having them assigned after the fact
in `parseSection()`. While refactoring code to make that possible, I
figured it would make sense for the other InputSections to also take
their initial values as ctor parameters.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D103978
Conceptually, the implementation is pretty straightforward: we put each
literal value into a hashtable, and then write out the keys of that
hashtable at the end.
In contrast with ELF, the Mach-O format does not support variable-length
literals that aren't strings. Its literals are either 4, 8, or 16 bytes
in length. LLD-ELF dedups its literals via sorting + uniq'ing, but since
we don't need to worry about overly-long values, we should be able to do
a faster job by just hashing.
That said, the implementation right now is far from optimal, because we
add to those hashtables serially. To parallelize this, we'll need a
basic concurrent hashtable (only needs to support concurrent writes w/o
interleave reads), which shouldn't be to hard to implement, but I'd like
to punt on it for now.
Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:
N Min Max Median Avg Stddev
x 20 4.27 4.39 4.315 4.3225 0.033225703
+ 20 4.36 4.82 4.44 4.4845 0.13152846
Difference at 95.0% confidence
0.162 +/- 0.0613971
3.74783% +/- 1.42041%
(Student's t, pooled s = 0.0959262)
This corresponds to binary size savings of 2MB out of 335MB, or 0.6%.
It's not a great tradeoff as-is, but as mentioned our implementation can
be signficantly optimized, and literal dedup will unlock more
opportunities for ICF to identify identical structures that reference
the same literals.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D103113
This is important for Frameworks, which are usually symlinks.
ld64 gets this right for @rpath that's replaced with @loader_path, but not for
bare @loader_path -- ld64's code calls realpath() in that case too, but ignores
the result.
ld64 somehow manages to find libbar1.dylib in the test without the
explicit `-rpath` in Foo1. I don't understand why or how. But this
change is a step forward and fixes an immediate problem I'm having,
so let's start with this :)
Differential Revision: https://reviews.llvm.org/D103990
Our implementation draws heavily from LLD-ELF's, which in turn delegates
its string deduplication to llvm-mc's StringTableBuilder. The messiness of
this diff is largely due to the fact that we've previously assumed that
all InputSections get concatenated together to form the output. This is
no longer true with CStringInputSections, which split their contents into
StringPieces. StringPieces are much more lightweight than InputSections,
which is important as we create a lot of them. They may also overlap in
the output, which makes it possible for strings to be tail-merged. In
fact, the initial version of this diff implemented tail merging, but
I've dropped it for reasons I'll explain later.
**Alignment Issues**
Mergeable cstring literals are found under the `__TEXT,__cstring`
section. In contrast to ELF, which puts strings that need different
alignments into different sections, clang's Mach-O backend puts them all
in one section. Strings that need to be aligned have the `.p2align`
directive emitted before them, which simply translates into zero padding
in the object file.
I *think* ld64 extracts the desired per-string alignment from this data
by preserving each string's offset from the last section-aligned
address. I'm not entirely certain since it doesn't seem consistent about
doing this; but perhaps this can be chalked up to cases where ld64 has
to deduplicate strings with different offset/alignment combos -- it
seems to pick one of their alignments to preserve. This doesn't seem
correct in general; we can in fact can induce ld64 to produce a crashing
binary just by linking in an additional object file that only contains
cstrings and no code. See PR50563 for details.
Moreover, this scheme seems rather inefficient: since unaligned and
aligned strings are all put in the same section, which has a single
alignment value, it doesn't seem possible to tell whether a given string
doesn't have any alignment requirements. Preserving offset+alignments
for strings that don't need it is wasteful.
In practice, the crashes seen so far seem to stem from x86_64 SIMD
operations on cstrings. X86_64 requires SIMD accesses to be
16-byte-aligned. So for now, I'm thinking of just aligning all strings
to 16 bytes on x86_64. This is indeed wasteful, but implementation-wise
it's simpler than preserving per-string alignment+offsets. It also
avoids the aforementioned crash after deduplication of
differently-aligned strings. Finally, the overhead is not huge: using
16-byte alignment (vs no alignment) is only a 0.5% size overhead when
linking chromium_framework.
With these alignment requirements, it doesn't make sense to attempt tail
merging -- most strings will not be eligible since their overlaps aren't
likely to start at a 16-byte boundary. Tail-merging (with alignment) for
chromium_framework only improves size by 0.3%.
It's worth noting that LLD-ELF only does tail merging at `-O2`. By
default (at `-O1`), it just deduplicates w/o tail merging. @thakis has
also mentioned that they saw it regress compressed size in some cases
and therefore turned it off. `ld64` does not seem to do tail merging at
all.
**Performance Numbers**
CString deduplication reduces chromium_framework from 250MB to 242MB, or
about a 3.2% reduction.
Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:
N Min Max Median Avg Stddev
x 20 3.91 4.03 3.935 3.95 0.034641016
+ 20 3.99 4.14 4.015 4.0365 0.0492336
Difference at 95.0% confidence
0.0865 +/- 0.027245
2.18987% +/- 0.689746%
(Student's t, pooled s = 0.0425673)
As expected, cstring merging incurs some non-trivial overhead.
When passing `--no-literal-merge`, it seems that performance is the
same, i.e. the refactoring in this diff didn't cost us.
N Min Max Median Avg Stddev
x 20 3.91 4.03 3.935 3.95 0.034641016
+ 20 3.89 4.02 3.935 3.9435 0.043197831
No difference proven at 95.0% confidence
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D102964
When a library "host"'s reexports change their installName with
`$ld$os10.11$install_name$host`, we used to write a load command for "host" but
write the version numbers of the reexport instead of "host". This fixes that.
I first thought that the rule is to take the version numbers from the library
that originally had that install name (implemented in D103819), but that's not
what ld64 seems to be doing: It takes the version number from the first dylib
with that install name it loads, and it loads the reexporting library before
the reexports. We already did most of that, we just added reexports before the
reexporter. After this change, we add the reexporter before the reexports.
Addresses https://bugs.llvm.org/show_bug.cgi?id=49800#c11 part 1.
(ld64 seems to add reexports after processing _all_ files on the command line,
while we add them right after the reexporter. For the common case of reexport +
$ld$ symbol changing back to the exporter name, this doesn't make a difference,
but you can construct a case where it does. I expect this to not make a
difference in practice though.)
Differential Revision: https://reviews.llvm.org/D103821
Also adjust a few comments, and move the DylibFile comment talking about
umbrella next to the parameter again.
Differential Revision: https://reviews.llvm.org/D103783
The flag to set it is called `-install_name`, and it's called `installName` in tbd files.
No behavior change.
Differential Revision: https://reviews.llvm.org/D103776
This diff adds first bits to support special symbols $ld$previous* in LLD.
$ld$* symbols modify properties/behavior of the library
(e.g. its install name, compatibility version or hide/add symbols)
for specific target versions.
Test plan: make check-lld-macho
Differential revision: https://reviews.llvm.org/D103505
D103423 neglected to call `parseReexports()` for nested TBD
documents, leading to symbol resolution failures when trying to look up
a symbol nested more than one level deep in a TBD file. This fixes the
regression and adds a test.
It also appears that `umbrella` wasn't being set properly when calling
`parseLoadCommands` -- it's supposed to resolve to `this` if `nullptr`
is passed. I didn't write a failing test case for this but I've made
`umbrella` a member so the previous behavior should be preserved.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D103586
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.
Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).
Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:
% ministat lld_*
x lld_nostrip.txt
+ lld_strip.txt
N Min Max Median Avg Stddev
x 10 3.929414 4.07692 4.0269079 4.0089678 0.044214794
+ 10 3.8129408 3.9025559 3.8670411 3.8642573 0.024779651
Difference at 95.0% confidence
-0.144711 +/- 0.0336749
-3.60967% +/- 0.839989%
(Student's t, pooled s = 0.0358398)
This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols
It's possible it interacts with more features I didn't think of,
of course.
I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests
Implemenation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.
Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
(`.no_dead_strip` in asm)
Fixes PR49276.
Differential Revision: https://reviews.llvm.org/D103324
I forgot to move the message() call around as requested in D103428
before committing that change. Move it now.
Also, improve the ordinal uniq'ing comment. I hadn't realized that the
distinct-but-identical files happen with --reproduce and not in general.
No behavior change.
Differential Revision: https://reviews.llvm.org/D103522
In all of these cases, the functions could simply return a nullptr instead of {}.
There is no case where Optional<nullptr> has a special meaning.
Differential Revision: https://reviews.llvm.org/D103489
This omits load commands for unreferenced dylibs if:
- the dylib was loaded implicitly,
- it is marked MH_DEAD_STRIPPABLE_DYLIB
- or -dead_strip_dylibs is passed
This matches ld64.
Currently, the "is dylib referenced" state is computed before dead code
stripping and is not updated after dead code stripping. This too matches ld64.
We should do better here.
With this, clang-format linked with lld (like with ld64) no longer has
libobjc.A.dylib in `otool -L` output. (It was implicitly loaded as a reexport
of CoreFoundation.framework, but it's not needed.)
Differential Revision: https://reviews.llvm.org/D103430
loadDylib() keeps a name->DylibFile cache, but it only writes
to the cache once the DylibFile constructor has completed.
So dylib loads done recursively from the DylibFile constructor
wouldn't use the cache.
Now, we load additional dylibs after writing to the cache,
which means the cache now gets used for dylibs loaded because
they're referenced from other dylibs.
Related to PR49514 and PR50101, but no dramatic behavior change in itself.
(Technically we no longer crash when a tbd file reexports itself,
but that doesn't happen in practice. We now accept it silently instead
of crashing; ld64 has a diag for the reexport cycle.)
Differential Revision: https://reviews.llvm.org/D103423
* Move `static_asserts` into cpp instead of header file. I noticed they
had been separated from the main class definition in the header, so I
set about to clean that up, then figured it made more sense as part of
the cpp file so as not to incur unnecessary compile-time overhead.
* Remove unnecessary `virtual`s
* Remove unnecessary comment / reword another comment
Has the effect that `__mh_execute_header` stays in the symbol table of
outputs even after running `strip` on the output. I don't know if that's
important for anything -- my motivation for the patch is just is to make
the output more similar to ld64.
(Corresponds to symbolTableInAndNeverStrip in ld64.)
Differential Revision: https://reviews.llvm.org/D102619
We had a hardcoded check and a stale TODO, written back when we only had
support for one architecture.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D102154
On a section with alignment of 16, subsections aligned to 16-byte
boundaries should keep their 16-byte alignment.
Fixes PR50274. (The same bug could have happened with -order_file
previously.)
Differential Revision: https://reviews.llvm.org/D102139
Before this, if an inline function was defined in several input files,
lld would write each copy of the inline function the output. With this
patch, it only writes one copy.
Reduces the size of Chromium Framework from 378MB to 345MB (compared
to 290MB linked with ld64, which also does dead-stripping, which we
don't do yet), and makes linking it faster:
N Min Max Median Avg Stddev
x 10 3.9957051 4.3496981 4.1411121 4.156837 0.10092097
+ 10 3.908154 4.169318 3.9712729 3.9846753 0.075773012
Difference at 95.0% confidence
-0.172162 +/- 0.083847
-4.14165% +/- 2.01709%
(Student's t, pooled s = 0.0892373)
Implementation-wise, when merging two weak symbols, this sets a
"canOmitFromOutput" on the InputSection belonging to the weak symbol not put in
the symbol table. We then don't write InputSections that have this set, as long
as they are not referenced from other symbols. (This happens e.g. for object
files that don't set .subsections_via_symbols or that use .alt_entry.)
Some restrictions:
- not yet done for bitcode inputs
- no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) --
Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs)
(that is, catch block unwind information) and Personality Routines
associated with weak functions still not stripped. This is wasteful,
but harmless.
- However, this does strip weaks from __unwind_info (which is needed for
correctness and not just for size)
- This nopes out on InputSections that are referenced form more than
one symbol (eg from .alt_entry) for now
Things that work based on symbols Just Work:
- map files (change in MapFile.cpp is no-op and not needed; I just
found it a bit more explicit)
- exports
Things that work with inputSections need to explicitly check if
an inputSection is written (e.g. unwind info).
This patch is useful in itself, but it's also likely also a useful foundation
for dead_strip.
I used to have a "canoncialRepresentative" pointer on InputSection instead of
just the bool, which would be handy for ICF too. But I ended up not needing it
for this patch, so I removed that again for now.
Differential Revision: https://reviews.llvm.org/D102076
ld64 can emit dylibs that support more than one platform (typically macOS and
macCatalyst). This diff allows LLD to read in those dylibs. Note that this is a
super bare-bones implementation -- in particular, I haven't added support for
LLD to emit those multi-platform dylibs, nor have I added a variety of
validation checks that ld64 does. Until we have a use-case for emitting zippered
dylibs, I think this is good enough.
Fixes PR49597.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D101954
Currently the linker causes unnecessary errors when either the target or the config's platform is a simulator.
Differential Revision: https://reviews.llvm.org/D101855
@thakis pointed out that `mach_header` and `mach_header_64`
actually have the same set of (used) fields, with the 64-bit version
having extra padding. So we can access the fields we need using the
single `mach_header` type instead of using templates to switch between
the two.
I also spotted a potential issue where hasObjCSection tries to parse a
file w/o checking if it does indeed match the target arch... As such,
I've added a quick magic number check to ensure we don't access invalid
memory during `findCommand()`.
Addresses PR50180.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D101724
I first had a more invasive patch for this (D101069), but while trying
to get that polished for review I realized that lld's current symbol
merging semantics mean that only a very small code change is needed.
So this goes with the smaller patch for now.
This has no effect on projects that build with -fvisibility=hidden
(e.g. chromium), since these see .private_extern symbols instead.
It does have an effect on projects that build with -fvisibility-inlines-hidden
(e.g. llvm) in -O2 builds, where LLVM's GlobalOpt pass will promote most inline
functions from .weak_definition to .weak_def_can_be_hidden.
Before this patch:
% ls -l out/gn/bin/clang out/gn/lib/libclang.dylib
-rwxr-xr-x 1 thakis staff 113059936 Apr 22 11:51 out/gn/bin/clang
-rwxr-xr-x 1 thakis staff 86370064 Apr 22 11:51 out/gn/lib/libclang.dylib
% out/gn/bin/llvm-objdump --macho --weak-bind out/gn/bin/clang | wc -l
8291
% out/gn/bin/llvm-objdump --macho --weak-bind out/gn/lib/libclang.dylib | wc -l
5698
With this patch:
% ls -l out/gn/bin/clang out/gn/lib/libclang.dylib
-rwxr-xr-x 1 thakis staff 111721096 Apr 22 11:55 out/gn/bin/clang
-rwxr-xr-x 1 thakis staff 85291208 Apr 22 11:55 out/gn/lib/libclang.dylib
thakis@MBP llvm-project % out/gn/bin/llvm-objdump --macho --weak-bind out/gn/bin/clang | wc -l
725
thakis@MBP llvm-project % out/gn/bin/llvm-objdump --macho --weak-bind out/gn/lib/libclang.dylib | wc -l
542
Linking clang becomes a tiny bit faster with this patch:
x 100 0.67263818 0.77847815 0.69430709 0.69877208 0.017715892
+ 100 0.67209601 0.73323393 0.68600798 0.68917346 0.012824377
Difference at 95.0% confidence
-0.00959861 +/- 0.00428661
-1.37364% +/- 0.613449%
(Student's t, pooled s = 0.0154648)
This only happens if lld with the patch and lld without the patch are both
linked with an lld with the patch or both linked with an lld without the patch
(...or with ld64). I accidentally linked the lld with the patch with an lld
without the patch and the other way round at first. In that setup, no
difference is found. That makese sense, since having fewer weak imports will
make the linked output a bit faster too. So not only does this make linking
binaries such as clang a bit faster (since fewer exports need to be written to
the export trie by lld), the linked output binary binary is also a bit faster
(since dyld needs to process fewer dynamic imports).
This also happens to fix the one `check-clang` failure when using lld as host
linker, but mostly for silly reasons: See crbug.com/1183336, mostly comment 26.
The real bug here is that c-index-test links all of LLVM both statically and
dynamically, which is an ODR violation. Things just happen to work with this
patch.
So after this patch, check-clang, check-lld, check-llvm all pass with lld as
host linker :)
Differential Revision: https://reviews.llvm.org/D101080
We had got it backwards... the minimum version of the target
should be higher than the min version of the object files, presumably
since new platforms are backwards-compatible with older formats.
Fixes PR50078.
The original commit (aa05439c9c) broke many tests that had inputs too
new for our target platform (10.0). This commit changes the inputs to
target 10.0, which was the simpler thing to do, but we should really
just have our lit.local.cfg default to targeting 10.15... we're not
likely to ever have proper support for the older versions anyway. I will
follow up with a change to that effect.
Differential Revision: https://reviews.llvm.org/D101114
We had got it backwards... the minimum version of the target
should be higher than the min version of the object files, presumably
since new platforms are backwards-compatible with older formats.
Fixes PR50078.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D101114
This diff adds initial support for the legacy LC_VERSION_MIN_* load commands.
Test plan: make check-lld-macho
Differential revision: https://reviews.llvm.org/D100523
XCode 12 ships with mismatched platforms for these libraries,
so this hack is necessary...
Fixes PR49799.
Reviewed By: #lld-macho, gkm, smeenai
Differential Revision: https://reviews.llvm.org/D100913
The minuend (but not the subtrahend) can reference a section.
Note that we do not yet properly validate that the subtrahend isn't
referencing a section; I've filed PR50034 to track that.
I've also extended the reloc-subtractor.s test to reorder symbols, to
make sure that the addends are being associated with the minuend (and not
the subtrahend) relocation.
Fixes PR49999.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D100804
It doesn't make sense to take just the base filename for archives when we emit
the full path for object files. (LLD-ELF emits the full path too.)
This will also make it easier to write a proper test for {D100147}.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D100357
I noticed two problems with the previous implementation:
* N_ALT_ENTRY symbols weren't being handled correctly -- they should
determine the size of the previous symbol, even though they don't
cause a new section to be created
* The last symbol in a section had its size calculated wrongly;
the first subsection's size was used instead of the last one
I decided to take the opportunity to refactor things as well, mainly to
realize my observation
[here](https://reviews.llvm.org/D98837#inline-931511) that we could
avoid doing a binary search to match symbols with subsections. I think
the resulting code is a bit simpler too.
N Min Max Median Avg Stddev
x 20 4.31 4.43 4.37 4.3775 0.034162922
+ 20 4.32 4.43 4.38 4.3755 0.02799906
No difference proven at 95.0% confidence
Reviewed By: #lld-macho, alexshap
Differential Revision: https://reviews.llvm.org/D99972
We bikeshedded about it here: https://reviews.llvm.org/D98837#inline-931557
I initially suggested SubsectionMapping, but I thought the discussion
landed on doing `std::vector<SubsectionEntry>`. @alexshap went and did
both, but on hindsight I regret adding 3 more characters to an already
long name, and I think SubsectionEntry is descriptive enough...
This diff also renames `subsectionMap` to `subsecMap` for consistency
with other variable names in the codebase.
TextAPI/ELF has moved out into InterfaceStubs, so theres no longer a
need to seperate out TextAPI between formats.
Reviewed By: ributzka, int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D99811
The main challenge was handling the different on-disk structures (e.g.
`mach_header` vs `mach_header_64`). I tried to strike a balance between
sprinkling `target->wordSize == 8` checks everywhere (branchy = slow, and ugly)
and templatizing everything (causes code bloat, also ugly). I think I struck a
decent balance by judicious use of type erasure.
Note that LLD-ELF has a similar architecture, though it seems to use more templating.
Linking chromium_framework takes about the same time before and after this
change:
N Min Max Median Avg Stddev
x 20 4.52 4.67 4.595 4.5945 0.044423204
+ 20 4.5 4.71 4.575 4.582 0.056344803
No difference proven at 95.0% confidence
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D99633
This diff addresses FIXME in SyntheticSections.cpp and removes
the dependency of emitEndFunStab on .subsections_via_symbols.
Test plan: make check-lld-macho
Differential revision: https://reviews.llvm.org/D99054
This diff is a preparation for fixing FunStabs (incorrect size calculation).
std::map<uint32_t, InputSection*> (SubsectionMap) is replaced with
a sorted vector + binary search. If .subsections_via_symbols is set
this vector will contain the list of subsections, otherwise,
the offsets will be used for calculating the symbols sizes.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D98837
This diff required fixing `getEmbeddedAddend` to apply sign
extension to 32-bit values. We were previously passing around wrong
64-bit addend values that became "right" after being truncated back to
32-bit.
I've also made `getEmbeddedAddend` return a signed int, which is similar
to what LLD-ELF does for its `getImplicitAddend`.
`reportRangeError`, `checkUInt`, and `checkInt` are counterparts of similar
functions in LLD-ELF.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D98387
SUBTRACTOR relocations are always paired with UNSIGNED
relocations to indicate a pair of symbols whose address difference we
want. Functionally they are like a single relocation: only one pointer
gets written / relocated. Previously, we would handle these pairs by
skipping over the SUBTRACTOR relocation and writing the pointer when
handling the UNSIGNED reloc. This diff reverses things, so we write
while handling SUBTRACTORs and skip over the UNSIGNED relocs instead.
Being able to distinguish between SUBTRACTOR and UNSIGNED relocs in the
write phase (i.e. inside `relocateOne`) is useful for the upcoming range
check diff: we want to check that SUBTRACTOR relocs write signed values,
but UNSIGNED relocs (naturally) write unsigned values.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D98386
The previous implementation miscalculated the addend, resulting
in an underflow. This meant that every SIGNED_N section relocation would
be associated with the last subsection (since the addend would now be a
huge number). We were "lucky" that this mistake was typically cancelled
out -- 64-to-32-bit-truncation meant that the final value was correct,
as long as subsections were not rearranged.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D98385
Previously, SyntheticSections.cpp did not have a top-level `using namespace
llvm::MachO` because it caused a naming conflict: `llvm::MachO::Symbol` would
collide with `lld::macho::Symbol`.
`MachO::Symbol` represents the symbols defined in InterfaceFiles (TBDs). By
moving the inclusion of InterfaceFile.h into our .cpp files, we can avoid this
name collision in other files where we are only dealing with LLD's own symbols.
Along the way, I removed all unnecessary "MachO::" prefixes in our code.
Cons of this approach: If TextAPI/MachO/Symbol.h gets included via some other
header file in the future, we could run into this collision again.
Alternative 1: Have either TextAPI/MachO or BinaryFormat/MachO.h use a different
namespace. Most of the benefit of `using namespace llvm::MachO` comes from being
able to use things in BinaryFormat/MachO.h conveniently; if TextAPI was under a
different (and fully-qualified) namespace like `llvm::tapi` that would solve our
problems. Cons: lots of files across llvm-project will need to be updated, and
folks who own the TextAPI code need to agree to the name change.
Alternative 2: Rename our Symbol to something like `LldSymbol`. I think this is
ugly.
Personally I think alternative #1 is ideal, but I'm not sure the effort to do it is
worthwhile, this diff's halfway solution seems good enough to me. Thoughts?
Reviewed By: #lld-macho, oontvoo, MaskRay
Differential Revision: https://reviews.llvm.org/D98149
lld policy discourages `auto`. Replace it with a type name whenever reasonable. Retain `auto` to avoid ...
* redundancy, as for decls such as `auto *t = mumble_cast<TYPE *>` or similar that specifies the result type on the RHS
* verbosity, as for iterators
* gratuitous suffering, as for lambdas
Along the way, add `const` when appropriate.
Note: a future diff will ...
* add more `const` qualifiers
* remove `opt::` when we are already `using llvm::opt`
Differential Revision: https://reviews.llvm.org/D98313
clang appears to emit symbols in `__debug_aranges`, at least
for arm64... in the examples I've seen, it doesn't seem like those
symbols are referenced outside of `__DWARF`, so I think they're safe to
ignore. But hopefully @clayborg can confirm.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D98073
We'll need to properly handle object files with multiple source inputs
eventually, but remove the assert for now so we can successfully emit binaries
for testing.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D98067
Since multiple dylibs can be defined in one TBD, this is
necessary to avoid confusion.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D97905
Previously, we were loading re-exports without checking whether
they were compatible with our target. Prior to {D97209}, it meant that
we were defining dylib symbols that were invalid -- usually a silent
failure unless our binary actually used them. D97209 exposed this as an
explicit error.
Along the way, I've extended our TAPI compatibility check to cover the
platform as well, instead of just checking the arch. To this end, I've
replaced MachO::Architecture with MachO::Target in our Config struct.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D97867
The reexport-nested-libs test added in D97438 was a bit wonky.
First, it was linking against libReexportSystem.tbd which targets the
iOS simulator, and which in turn attempted to re-export the iOS
simulator's libSystem. However, due to the way `-syslibroot` works, it
was actually re-exporting the macOS libSystem.
As a result, the test was not actually able to resolve the symbols in
the desired libSystem. I'm guessing that @oontvoo was confused by this
and therefore included those symbols in libReexportSystem.tbd itself.
But this means that the test wasn't actually testing the resolution of
re-exported symbols (though it did at least verify that the re-exported
libraries could be located).
After some consideration, I figured that stub-link.s could be extended
to cover what reexport-nested-libs.s was attempting to do. The test
targets macOS, so we only have one `-syslibroot` and no chance of
confusion.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D97866
Suppose we are linking against libFoo, which re-exports the
implicitly-bound libSystem, which in turn re-exports some
non-explicitly-bound library like `/usr/lib/system/libsystem_c.dylib`.
Then any bindings we have to a symbol in libsystem_c should use
libSystem (and not libFoo) as the umbrella library.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D97865
We previously defaulted to x86_64 and an unknown platform, which was fine when
we only supported one arch and did no platform checks, but that will no longer
be true going ahead. Therefore, we should require those flags to be specified
whenever the linker is invoked.
Note that LLD-ELF and ld64 both infer the arch from their input object files,
but the usefulness of that is questionable since clang will always specify these
flags, and most of the time `lld` will be invoked via clang.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D97799
This reverts diff D97610 (commit 0223ab035c) and adds a one-line fix to verify that a `MemoryBufferRef` has sufficient length before reading a 4-byte magic number.
Differential Revision: https://reviews.llvm.org/D97757
Currently, it was delibrately impleneted to not handle this case, but as it has turnt out, we need this feature.
The concrete use case is
`System/Library/Frameworks/Cocoa.framework/Versions/A/Cocoa` reexports
/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit , which then rexports
/System/Library/PrivateFrameworks/UIFoundation.framework/Versions/A/UIFoundation
The current implemention uses a global currentTopLevelTapi, which is not reset until it finishes loading the whole tree.
This is a problem because if the top-level is set to Cocoa, then when we get to UIFoundation, it will try to find UIFoundation in the current top level, which is Cocoa and will not find it.
The right thing should be:
- When loading a library from a TBD file, re-exports need to be looked up in the auxiliary documents within the same TBD.
- When loading from an actual dylib, no additional TBD documents need to be examined.
- In no case does a re-export mentioned in one TBD file need to be looked up in a document in an auxiliary document from a different TBD file
Differential Revision: https://reviews.llvm.org/D97438
-flat_namespace makes lld emit binaries that use name lookup that's more in
line with other POSIX systems: Instead of looking up symbols as (dylib,name)
pairs by dyld, they're instead looked up just by name.
-flat_namespace has three effects:
1. MH_TWOLEVEL and MH_NNOUNDEFS are no longer set in the Mach-O header
2. All symbols use BIND_SPECIAL_DYLIB_FLAT_LOOKUP as ordinal
3. When a dylib is added to the link, its dependent dylibs are also added,
so that lld can verify that no undefined symbols remain at the end of
a link with -flat_namespace. These transitive dylibs are added for symbol
resolution, but they are not emitted in LC_LOAD_COMMANDs.
-undefined with -flat_namespace still isn't implemented. Before this change,
it was impossible to hit that combination because -flat_namespace caused a
diagnostic. Now that it no longer does, emit a dedicated temporary diagnostic
when both flags are used.
Differential Revision: https://reviews.llvm.org/D97641
There was initially some concern around the correct handling of pcrel
section relocations with r_length != 2. But it looks like there are no such
relocations in practice -- x86_64's pcrel section relocs all have r_length == 2,
and ARM64 doesn't even have pcrel section relocs. So we can replace the TODO
with an assert.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D97576
Bifurcate the `readFile()` API into ...
* `readRawFile()` which performs no checks, and
* `readLinkableFile()` which enforces minimum length of 20 bytes, same as ld64
There are no new tests because tweaks to existing tests are sufficient.
Differential Revision: https://reviews.llvm.org/D97610
On arm64, UNSIGNED relocs are the only ones that use embedded addends
instead of the ADDEND relocation.
Also ensure that the addend works when UNSIGNED is part of a SUBTRACTOR
pair.
Reviewed By: #lld-macho, alexshap
Differential Revision: https://reviews.llvm.org/D97105
Also add a few asserts to verify that we are indeed handling an
UNSIGNED relocation as the minued. I haven't made it an actual
user-facing error since I don't think llvm-mc is capable of generating
SUBTRACTOR relocations without an associated UNSIGNED.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D97103
When parsing bitcode, convert LTO Symbols to LLD Symbols in order to perform
resolution. The "winning" symbol will then be marked as Prevailing at LTO
compilation time. This is similar to what the other LLD ports do.
This change allows us to handle `linkonce` symbols correctly, and to deal with
duplicate bitcode symbols gracefully. Previously, both scenarios would result in
an assertion failure inside the LTO code, complaining that multiple Prevailing
definitions are not allowed.
While at it, I also added basic logic around visibility. We don't do anything
useful with it yet, but we do check that its value is valid. LLD-ELF appears to
use it only to set FinalDefinitionInLinkageUnit for LTO, which I think is just a
performance optimization.
From my local experimentation, the linker itself doesn't seem to do anything
differently when encountering linkonce / linkonce_odr / weak / weak_odr. So I've
only written a test for one of them. LLD-ELF has more, but they seem to mostly
be testing the intermediate bitcode output of their LTO backend...? I'm far from
an expert here though, so I might very well be missing things.
Reviewed By: #lld-macho, MaskRay, smeenai
Differential Revision: https://reviews.llvm.org/D94342
The silent failures had confused me a few times.
I haven't added a similar check for platform yet as we don't yet have logic to
infer the platform automatically, and so adding that check would require
updating dozens of test files.
Reviewed By: #lld-macho, thakis, alexshap
Differential Revision: https://reviews.llvm.org/D97209
I've adjusted the RelocAttrBits to better fit the semantics of
the relocations. In particular:
1. *_UNSIGNED relocations are no longer marked with the `TLV` bit, even
though they can occur within TLV sections. Instead the `TLV` bit is
reserved for relocations that can reference thread-local symbols, and
*_UNSIGNED relocations have their own `UNSIGNED` bit. The previous
implementation caused TLV and regular UNSIGNED semantics to be
conflated, resulting in rebase opcodes being incorrectly emitted for TLV
relocations.
2. I've added a new `POINTER` bit to denote non-relaxable GOT
relocations. This distinction isn't important on x86 -- the GOT
relocations there are either relaxable or non-relaxable loads -- but
arm64 has `GOT_LOAD_PAGE21` which loads the page that the referent
symbol is in (regardless of whether the symbol ends up in the GOT). This
relocation must reference a GOT symbol (so must have the `GOT` bit set)
but isn't itself relaxable (so must not have the `LOAD` bit). The
`POINTER` bit is used for relocations that *must* reference a GOT
slot.
3. A similar situation occurs for TLV relocations.
4. ld64 supports both a pcrel and an absolute version of
ARM64_RELOC_POINTER_TO_GOT. But the semantics of the absolute version
are pretty weird -- it results in the value of the GOT slot being
written, rather than the address. (That means a reference to a
dynamically-bound slot will result in zeroes being written.) The
programs I've tried linking don't use this form of the relocation, so
I've dropped our partial support for it by removing the relevant
RelocAttrBits.
Reviewed By: alexshap
Differential Revision: https://reviews.llvm.org/D97031
Differential Revision: https://reviews.llvm.org/D95913
Usage: -bundle_loader <executable>
This option specifies the executable that will load the build output file being linked.
When building a bundle, users can use the --bundle_loader to specify an executable
that contains symbols referenced, but not implemented in the bundle.
This is an initial base commit for ARM64 target arch support. I don't represent that it complete or bug-free, but wish to put it out for review now that some basic things like branch target & load/store address relocs are working.
I can add more tests to this base commit, or add them in follow-up commits.
It is not entirely clear whether I use the "ARM64" (Apple) or "AArch64" (non-Apple) naming convention. Guidance is appreciated.
Differential Revision: https://reviews.llvm.org/D88629
This extends {D92539} to work even when we are loading archive
members via `-force_load`. I uncovered this issue while trying to
force-load archives containing bitcode -- we were segfaulting.
In addition to fixing the `-force_load` case, this diff also addresses
the behavior of `-ObjC` when LTO bitcode is involved -- we need to
force-load those archive members if they contain ObjC categories.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D95265
This makes our error messages more informative. But the bigger motivation is for
LTO symbol resolution, which will be in an upcoming diff. The changes in this
one are largely mechanical.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D94316
Add per-reloc-type attribute bits and migrate code from per-target file into target independent code, driven by reloc attributes.
Many cleanups
Differential Revision: https://reviews.llvm.org/D95121
Private extern symbols are used for things scoped to the linkage unit.
They cause duplicate symbol errors (so they're in the symbol table,
unlike TU-scoped truly local symbols), but they don't make it into the
export trie. They are created e.g. by compiling with
-fvisibility=hidden.
If two weak symbols have differing privateness, the combined symbol is
non-private external. (Example: inline functions and some TUs that
include the header defining it were built with
-fvisibility-inlines-hidden and some weren't).
A weak private external symbol implicitly has its "weak" dropped and
behaves like a regular strong private external symbol: Weak is an export
trie concept, and private symbols are not in the export trie.
If a weak and a strong symbol have different privateness, the strong
symbol wins.
If two common symbols have differing privateness, the larger symbol
wins. If they have the same size, the privateness of the symbol seen
later during the link wins (!) -- this is a bit lame, but it matches
ld64 and this behavior takes 2 lines less to implement than the less
surprising "result is non-private external), so match ld64.
(Example: `int a` in two .c files, both built with -fcommon,
one built with -fvisibility=hidden and one without.)
This also makes `__dyld_private` a true TU-local symbol, matching ld64.
To make this work, make the `const char*` StringRefZ ctor to correctly
set `size` (without this, writing the string table crashed when calling
getName() on the __dyld_private symbol).
Mention in CommonSymbol's comment that common symbols are now disabled
by default in clang.
Mention in -keep_private_externs's HelpText that the flag only has an
effect with `-r` (which we don't implement yet -- so this patch here
doesn't regress any behavior around -r + -keep_private_externs)). ld64
doesn't explicitly document it, but the commit text of
http://reviews.llvm.org/rL216146 does, and ld64's
OutputFile::buildSymbolTable() checks `_options.outputKind() ==
Options::kObjectFile` before calling `_options.keepPrivateExterns()`
(the only reference to that function).
Fixes PR48536.
Differential Revision: https://reviews.llvm.org/D93609
This is a refactor to pave the way for supporting paired-ADDEND for ARM64. The only paired reloc type for X86_64 is SUBTRACTOR. In a later diff, I will add SUBTRACTOR for both X86_64 and ARM64.
* s/`getImplicitAddend`/`getAddend`/ because it handles all forms of addend: implicit, explicit, paired.
* add predicate `bool isPairedReloc()`
* check range of `relInfo.r_symbolnum` is internal, unrelated to user-input, so use `assert()`, not `error()`
* minor cleanups & rearrangements in `InputFile::parseRelocations()`
Differential Revision: https://reviews.llvm.org/D90614
Note that dylibs without *any* refs will still be loaded in the usual
(strong) fashion.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D93435
Weak references need not necessarily be satisfied at runtime (but they must
still be satisfied at link time). So symbol resolution still works as per usual,
but we now pass around a flag -- ultimately emitting it in the bind table -- to
indicate if a given dylib symbol is a weak reference.
ld64's behavior for symbols that have both weak and strong references is
a bit bizarre. For non-function symbols, it will emit a weak import. For
function symbols (those referenced by BRANCH relocs), it will emit a
regular import. I'm not sure what value there is in that behavior, and
since emulating it will make our implementation more complex, I've
decided to treat regular weakrefs like function symbol ones for now.
Fixes PR48511.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D93369
We were not setting forceWeakImport for file paths given by
`-weak_library` if we had already loaded the file. This diff fixes that
by having `loadDylib` return a cached DylibFile instance even if we have
already loaded that file.
We still avoid emitting multiple LC_LOAD_DYLIBs, but we achieve this by
making inputFiles a SetVector instead of relying on the `loadedDylibs`
cache.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D93255
Dylibs that are "public" -- i.e. top-level system libraries -- are considered
implicitly linked when another library re-exports them. That is, we should load
them & bind directly to their symbols instead of via their re-exporting
umbrella library. This diff implements that behavior by default, as well as an
opt-out flag.
In theory, this is just a performance optimization, but in practice it seems
that it's needed for correctness.
Fixes llvm.org/PR48395.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D93000
This was causing a crash as we were attempting to look up the
nonexistent parent OutputSection of the debug sections. We didn't detect
it earlier because there was no test for PIEs with debug info (PIEs
require us to emit rebases for X86_64_RELOC_UNSIGNED).
This diff filters out the debug sections while loading the ObjFiles. In
addition to fixing the above problem, it also lets us avoid doing
redundant work -- we no longer parse / apply relocations / attempt to
emit dyld opcodes for these sections that we don't emit.
Fixes llvm.org/PR48392.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D92904
Also error out if we find anything other than an object or bitcode file
in the archive.
Note that we were previously inserting the symbols and sections of the
unpacked ObjFile into the containing ArchiveFile. This was actually
unnecessary -- we can just insert the ObjectFile (or BitcodeFile) into
the `inputFiles` vector. This is the approach taken by LLD-ELF.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D92539
Additionally:
1. Move the helper functions in InputSection.h below the definition of
`InputSection`, so the important stuff is on top
2. Remove unnecessary `explicit`
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D92453
clang puts `-framework CoreFoundation` in this load command for files
that use @available / __builtin_available. Without support for this,
binaries that don't explicitly link to CoreFoundation fail to link.
Differential Revision: https://reviews.llvm.org/D92624
The problem was that `sym` became replaced in the call
to make<ObjFile> and referring to it afer that read memory that now
stored a different kind of symbol (a Defined instead of a LazySymbol).
Since this happens only once per archive, just copy the symbol to the
stack before make<ObjFile> and read the copy instead.
Originally reviewed at https://reviews.llvm.org/D92496
This is useful for debugging why lld loads .o files it shouldn't load.
It's also useful for users of lld -- I've used ld64's version of this a
few times.
Differential Revision: https://reviews.llvm.org/D92496
Also, for .o files, include full path as given on link command line.
Before:
lld: error: undefined symbol [...], referenced from sandbox_logging.o
After:
lld: error: undefined symbol [...], referenced from libseatbelt.a(sandbox_logging.o)
Move archiveName up to InputFile so we can consistently use toString()
to print InputFiles in diags, and pass it to the ObjFile ctor. This
matches the ELF and COFF ports.
Differential Revision: https://reviews.llvm.org/D92437
- most importantly, fix a use-after-free when using thin archives,
by putting the archive unique_ptr to the arena allocator. This
ports D65565 to MachO
- correctly demangle symbol namess from archives in diagnostics
- add a test for thin archives -- it finds this UaF, but only when
running it under asan (it also finds the demangling fix)
- make forceLoadArchive() use addFile() with a bool to have the archive
loading code in fewer places. no behavior change; matches COFF port a
bit better
Differential Revision: https://reviews.llvm.org/D92360
This addresses a lot of the comments in {D89257}. Ideally it'd have been
done in the same diff, but the commits in between make that difficult.
This diff implements:
* N_GSYM and N_STSYM, the STABS for global and static symbols
* Has the STABS reflect the section IDs of their referent symbols
* Ensures we don't fail when encountering absolute symbols or files with
no debug info
* Sorts STABS symbols by file to minimize the number of N_OSO entries
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D92366
We should also set the modtime when running LTO. That will be done in a
future diff, together with support for the `-object_path_lto` flag.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D91318
Debug sections contain a large amount of data. In order not to bloat the size
of the final binary, we remove them and instead emit STABS symbols for
`dsymutil` and the debugger to locate their contents in the object files.
With this diff, `dsymutil` is able to locate the debug info. However, we need
a few more features before `lldb` is able to work well with our binaries --
e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols,
emitting `LC_UUID`, and more. Those will be handled in follow-up diffs.
Note also that the STABS we emit differ slightly from what ld64 does. First, we
emit the path to the source file as one `N_SO` symbol instead of two. (`ld64`
emits one `N_SO` for the dirname and one of the basename.) Second, we do not
emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions,
because the `N_FUN` STABS already serve that purpose. @clayborg recommended
these changes based on his knowledge of what the debugging tools look for.
Additionally, this current implementation doesn't accurately reflect the size
of function symbols. It uses the size of their containing sectioins as a proxy,
but that is only accurate if `.subsections_with_symbols` is set, and if there
isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two
options to solve this:
1. We can split up subsections by symbol even if `.subsections_with_symbols`
is not set, but include constraints to ensure those subsections retain
their order in the final output. This is `ld64`'s approach.
2. We could just add a `size` field to our `Symbol` class. This seems simpler,
and I'm more inclined toward it, but I'm not sure if there are use cases
that it doesn't handle well. As such I'm punting on the decision for now.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D89257
This adds support for ld.lld's --reproduce / lld-link's /reproduce:
flag to the MachO port. This flag can be added to a link command
to make the link write a tar file containing all inputs to the link
and a response file containing the link command. This can be used
to reproduce the link on another machine, which is useful for sharing
bug report inputs or performance test loads.
Since the linker is usually called through the clang driver and
adding linker flags can be a bit cumbersome, setting the env var
`LLD_REPRODUCE=foo.tar` triggers the feature as well.
The file response.txt in the archive can be used with
`ld64.lld.darwinnew $(cat response.txt)` as long as the contents are
smaller than the command-line limit, or with `ld64.lld.darwinnew
@response.txt` once D92149 is in.
The support in this patch is sufficient to create a tar file for
Chromium's base_unittests that can link after unpacking on a different
machine.
Differential Revision: https://reviews.llvm.org/D92274
Just enough to consume some bitcode files and link them. There's more
to be done around the symbol resolution API and the LTO config, but I don't yet
understand what all the various LTO settings do...
Reviewed By: #lld-macho, compnerd, smeenai, MaskRay
Differential Revision: https://reviews.llvm.org/D90663
They operate like Defined symbols but with no associated InputSection.
Note that `ld64` seems to treat the weak definition flag like a no-op for
absolute symbols, so I have replicated that behavior.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D87909
On Unix, it is traditionally allowed to write variable definitions without
initialization expressions (such as "int foo;") to header files. These are
called tentative definitions.
The compiler creates common symbols when it sees tentative definitions. When
linking the final binary, if there are remaining common symbols after name
resolution is complete, the linker converts them to regular defined symbols in
a `__common` section.
This diff implements most of that functionality, though we do not yet handle
the case where there are both common and non-common definitions of the same
symbol.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D86909
The word "target" is overloaded, so lighten its load by using another word to denote the symbol or section to which a reloc points. While more stilted than "target", "referent" is rather less pompous than "designatum" or "denotatum". :P
Along the way, make a few neighboring variable names more descriptive.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D87584
Previously, we were only emitting regular bindings to weak
dynamic symbols; this diff adds support for the weak bindings too, which
can overwrite the regular bindings at runtime. We also treat weak
defined global symbols similarly -- since they can also be interposed at
runtime, they need to be treated as potentially dynamic symbols.
Note that weak bindings differ from regular bindings in that they do not
specify the dylib to do the lookup in (i.e. weak symbol lookup happens
in a flat namespace.)
Differential Revision: https://reviews.llvm.org/D86572
The re-exports list in a TAPI document can either refer to other inlined
TAPI documents, or to on-disk files (which may themselves be TBD or
regular files.) Similarly, the re-exports of a regular dylib can refer
to a TBD file.
Differential Revision: https://reviews.llvm.org/D85404
Two things needed fixing for that to work:
1. getName() no longer returns null for DylibFiles constructed from TAPIs
2. markSubLibrary() now accepts .tbd as a possible extension
Differential Revision: https://reviews.llvm.org/D86180
DylibFile doesn't store a pointer to its InterfaceFile
parameter, so there's no need to use a shared_ptr.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D85402
References to symbols in dylibs work very similarly regardless of
whether the symbol is a TLV. The main difference is that we have a
separate `__thread_ptrs` section that acts as the GOT for these
thread-locals.
We can identify thread-locals in dylibs by a flag in their export trie
entries, and we cross-check it with the relocations that refer to them
to ensure that we are not using a GOT relocation to reference a
thread-local (or vice versa).
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D85081
Handle command-line option `-sectcreate SEG SECT FILE`, which inputs a binary blob from `FILE` into `SEG,SECT`
Reviewed By: int3
Differential Revision: https://reviews.llvm.org/D85501
This diff adds support for weak definitions, though it doesn't handle weak
symbols in dylibs quite correctly -- we need to emit binding opcodes for them
in the weak binding section rather than the lazy binding section.
What *is* covered in this diff:
1. Reading the weak flag from symbol table / export trie, and writing it to the
export trie
2. Refining the symbol table's rules for choosing one symbol definition over
another. Wrote a few dozen test cases to make sure we were matching ld64's
behavior.
We can now link basic C++ programs.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D83532
Summary:
llvm-mc emits `__bss` sections with an offset of zero, but we weren't expecting
that in our input, so we were copying non-zero data from the start of the file and
putting it in `__bss`, with obviously undesirable runtime results. (It appears that
the kernel will copy those nonzero bytes as long as the offset is nonzero, regardless
of whether S_ZERO_FILL is set.)
I debated on whether to make a special ZeroFillSection -- separate from a
regular InputSection -- but it seemed like too much work for now. But I'm happy
to refactor if anyone feels strongly about having it as a separate class.
Depends on D80857.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80859
This removes the stub library that lld injected to satisfy the
dependency on the libSystem. Now with TBD support, we can provide the
stub library to permit the tests to function properly as they would on a
real system.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D81418
Summary:
We should be reading / writing our addends / relocated addresses based on
r_length, and not just based on the type of the relocation. But since only
some r_length values are valid for a given reloc type, I've also added some
validation.
ld64 has code to allow for r_length = 0 in X86_64_RELOC_BRANCH relocs, but I'm
not sure how to create such a relocation...
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D80854
Add support to lld to use Text Based API stubs for linking. This is
support is incomplete not filtering out platforms. It also does not
account for architecture specific API handling and potentially does not
correctly handle trees of re-exports with inlined libraries being
treated as direct children of the top level library.
My test refactoring in D80217 seems to have caused yaml2obj to emit
unaligned nlist_64 structs, causing ASAN'd lld to be unhappy. I don't
think this is an issue with yaml2obj though -- llvm-mc also seems to
emit unaligned nlist_64s. This diff makes lld able to safely do aligned
reads under ASAN builds while hopefully creating no overhead for regular
builds on architectures that support unaligned reads.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D80414
I considered making a `Target::validate()` method, but I wasn't sure how
I felt about the overhead of doing yet another switch-dispatch on the
relocation type, so I put the validation in `relocateOne` instead...
might be a bit of a micro-optimization, but `relocateOne` does assume
certain things about the relocations it gets, and this error handling
makes that explicit, so it's not a totally unreasonable code
organization.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D80049
Summary:
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.
The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.
We exercise this functionality in our tests by using order files that
rearrange those symbols.
Depends on D79668.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: thakis, llvm-commits, pcc, ruiu
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79926
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.
The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.
We exercise this functionality in our tests by using order files that
rearrange those symbols.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D79926
The order file indicates how input sections should be sorted within each
output section, based on the symbols contained within those sections.
This diff sets the stage for implementing and testing
`.subsections_via_symbols`, where we will break up InputSections by each
symbol and sort them more granularly.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D79668
With this change, basic archive files can be linked together. Input
section discovery has been refactored into a function since archive
files lazily resolve their symbols / the object files containing those
symbols.
Reviewed By: int3, smeenai
Differential Revision: https://reviews.llvm.org/D78342