Commit Graph

260 Commits

Author SHA1 Message Date
Jez Ng e4b286211c [lld-macho][nfc] Rearrange order of statements to clarify data dependencies 2022-04-07 00:00:41 -04:00
Jez Ng 1c0234dfcc [lld-macho][nfc] Have findContainingSubsection take a Section
... instead of an instance of `Subsections`.

This simplifies the code slightly since all its callsites have a Section
instance anyway.
2022-03-21 07:23:09 -04:00
Jez Ng ce2ae38124 [lld-macho] Deduplicate the `__objc_classrefs` section contents
ld64 breaks down `__objc_classrefs` on a per-word level and deduplicates
them. This greatly reduces the number of bind entries emitted (and
therefore the amount of work `dyld` has to do at runtime). For
chromium_framework, this change to LLD cuts the number of (non-lazy)
binds from 912 to 190, getting us to parity with ld64 in this aspect.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D121053
2022-03-08 08:34:04 -05:00
Jez Ng fd3669c256 [lld-macho] Improve hiding of unnamed_addr symbols
Symbols for which `canBeOmittedFromSymbolTable()` is true should be
treated as private externs. This diff tries to do that by unsetting the
ExportDynamic bit. It seems to mostly work with the FullLTO backend, but
with the ThinLTO backend, the `local_unnamed_addr` symbols still fail to
be properly hidden. Nonetheless, this is a step in the right direction.

I've documented all the remaining differences between our behavior and
LD64's in the lto-internalized-unnamed-addr.ll test.

See also https://discourse.llvm.org/t/mach-o-lto-handling-of-linkonce-odr-unnamed-addr/60015

Reviewed By: #lld-macho, thevinster

Differential Revision: https://reviews.llvm.org/D119767
2022-02-18 12:09:38 -05:00
Jez Ng 94c28d289a [lld-macho][nfc] Factor out callgraph parsing code
`parseSections()` is a getting a bit large unwieldy, let's factor out
logic where we can.

Other minor changes in this diff:
* `"__cg_profile"` is now a global constexpr
* We now use `checkError()` instead of `fatal()`-ing without handling
  the Error
* Check for `callGraphProfileSort` before checking the section name,
  since the boolean comparison is likely cheaper

Reviewed By: #lld-macho, lgrey, oontvoo

Differential Revision: https://reviews.llvm.org/D119892
2022-02-15 21:13:55 -05:00
Jez Ng 06f863ac5e [lld-macho] Include address offsets in error messages
This makes it easier to pinpoint the source of the problem.

TODO: Have more relocation error messages make use of this
functionality.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D118798
2022-02-07 21:06:18 -05:00
Byoungchan Lee da08d50fd6 [lld][macho] Add more skip platform check for libSystem re-exports
Xcode 13 comes with a mismatched platform in libcompiler_rt.dylib,
so this creates a linker error on mac catalyst.
Fix it by adding it to the skip list.

Reviewed By: MaskRay, #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D117925
2022-02-04 16:34:56 -05:00
Jez Ng 2b78ef06c2 [lld-macho][nfc] Eliminate InputSection::Shared
Earlier in LLD's evolution, I tried to create the illusion that
subsections were indistinguishable from "top-level" sections. Thus, even
though the subsections shared many common field values, I hid those
common values away in a private Shared struct (see D105305). More
recently, however, @gkm added a public `Section` struct in D113241 that
served as an explicit way to store values that are common to an entire
set of subsections (aka InputSections). Now that we have another "common
value" struct, `Shared` has been rendered redundant. All its fields can
be moved into `Section` instead, and the pointer to `Shared` can be replaced
with a pointer to `Section`.

This `Section` pointer also has the advantage of letting us inspect other
subsections easily, simplifying the implementation of {D118798}.

P.S. I do think that having both `Section` and `InputSection` makes for
a slightly confusing naming scheme. I considered renaming `InputSection`
to `Subsection`, but that would break the symmetry with `OutputSection`.
It would also make us deviate from LLD-ELF's naming scheme.

This change is perf-neutral on my 3.2 GHz 16-Core Intel Xeon W machine:

             base           diff           difference (95% CI)
  sys_time   1.258 ± 0.031  1.248 ± 0.023  [  -1.6% ..   +0.1%]
  user_time  3.659 ± 0.047  3.658 ± 0.041  [  -0.5% ..   +0.4%]
  wall_time  4.640 ± 0.085  4.625 ± 0.063  [  -1.0% ..   +0.3%]
  samples    49             61

There's also no stat sig change in RSS (as measured by `time -l`):

           base                         diff                           difference (95% CI)
  time     998038627.097 ± 13567305.958 1003327715.556 ± 15210451.236  [  -0.2% ..   +1.2%]
  samples  31                           36

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D118797
2022-02-03 19:55:42 -05:00
Jez Ng 9408b75ec3 [lld-macho][nfc] Hoist out creation of Section in parseSections()
Simplifies the code slightly.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D118796
2022-02-02 17:09:14 -05:00
Alexandre Ganea 83d59e05b2 Re-land [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

The previous land f860fe3622 caused issues in https://lab.llvm.org/buildbot/#/builders/123/builds/8383, fixed by 22ee510dac.

Differential Revision: https://reviews.llvm.org/D108850
2022-01-20 14:53:26 -05:00
Fangrui Song 0aae2bf373 [lld-macho] Add --start-lib --end-lib
In ld.lld, when an ObjFile/BitcodeFile is read in --start-lib state, the file is
given archive semantics. --end-lib closes the previous --start-lib. A build
system can use this feature as an alternative to archives. This patch ports
the feature to lld-macho.

--start-lib and --end-lib are positional, unlike usual ld64 options.
I think the slight drawback does not matter as (a) reusing option names
make build systems convenient (b) `--start-lib a.o b.o --end-lib` conveys more
information than an alternative design: `-objlib a.o -objlib b.o` because
--start-lib makes it clear which objects are in the same conceptual archive.
This provides flexibility (c) `-objlib`/`-filelist` interaction may be weird.

Close https://github.com/llvm/llvm-project/issues/52931

Reviewed By: #lld-macho, Jez Ng, oontvoo

Differential Revision: https://reviews.llvm.org/D116913
2022-01-19 10:14:49 -08:00
Alexandre Ganea e6b153947d Revert [LLD] Remove global state in lldCommon
It seems to be causing issues on https://lab.llvm.org/buildbot/#/builders/123/builds/8383
2022-01-16 11:03:06 -05:00
Alexandre Ganea f860fe3622 [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

Differential Revision: https://reviews.llvm.org/D108850
2022-01-16 08:57:57 -05:00
Juergen Ributzka 3025c3eded Replace PlatformKind with PlatformType.
The PlatformKind/PlatformType enums contain the same information, which requires
them to be kept in-sync. This commit changes over to PlatformType as the sole
source of truth, which allows the removal of the redundant PlatformKind.

The majority of the changes were in LLD and TextAPI.

Reviewed By: cishida

Differential Revision: https://reviews.llvm.org/D117163
2022-01-13 09:23:49 -08:00
Leonard Grey 6db04b97e6 [lld-macho] Port CallGraphSort from COFF/ELF
Depends on D112160

This adds the new options `--call-graph-profile-sort` (default),
`--no-call-graph-profile-sort` and `--print-symbol-order=`. If call graph
profile sorting is enabled, reads `__LLVM,__cg_profile` sections from object
files and uses the resulting graph to put callees and callers close to each
other in the final binary via the C3 clustering heuristic.

Differential Revision: https://reviews.llvm.org/D112164
2022-01-12 10:47:04 -05:00
Fangrui Song 97a5dccb7d [lld-macho] Rename LazySymbol to LazyArchive. NFC
D116913 will add LazyObject. Rename LazySymbol to LazyArchive to avoid confusion
and mirror ELF.

Reviewed By: #lld-macho, Jez Ng

Differential Revision: https://reviews.llvm.org/D116914
2022-01-11 16:49:06 -08:00
Vy Nguyen 4f90e67e2f [lld-macho] Handle $ld$hide[$os] symbols.
PR/52708

Differential Revision: https://reviews.llvm.org/D115775
2021-12-17 16:40:07 -05:00
Nico Weber c4b45eeb44 [lld/mac] Don't lose "weak ref" bit when doing LTO
Fixes #52778.

Probably fixes Chromium crashing on startup on macOS 10.15 (and older) systems
when building with LTO, but I haven't verified that yet.

Differential Revision: https://reviews.llvm.org/D115949
2021-12-17 15:26:35 -05:00
Jez Ng 098430cd25 [lld-macho][nfc] Simplify LC_DATA_IN_CODE generation
1. After D113241, we have the section address easily accessible and no
   longer need to iterate across the LC_SEGMENT commands to emit
   LC_DATA_IN_CODE.

2. There's no need to store a pointer to the data in code entries during
   the parse step; we can just look it up as part of the output step.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D115556
2021-12-11 01:01:57 -05:00
Jez Ng 8a1f2d6580 [lld-macho] Include archive name in bitcode files
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D115281
2021-12-07 19:11:23 -05:00
Greg McGary 9cc489a4b2 [lld-macho][nfc] Factor-out NFC changes from main __eh_frame diff
In order to keep signal:noise high for the `__eh_frame` diff, I have teased-out the NFC changes and put them here.

Differential Revision: https://reviews.llvm.org/D114017
2021-11-17 15:16:44 -07:00
Vy Nguyen 34d15eaced [lld-macho][nfc] Sanity check on template type
Differential Revision: https://reviews.llvm.org/D114044
2021-11-16 20:04:49 -05:00
Shoaib Meenai 93bf271f27 [MachO] Shrink reloc from 32 bytes to 24 bytes
The `r_address` field of `relocation_info` is only 4 bytes, so our
offset field (which is the `r_address` field adjusted for subsection
splitting) also only needs to be 4 bytes. This reduces the structure
size from 32 bytes to 24 bytes.

Combined with https://reviews.llvm.org/D113813, this is a minor perf
improvement for linking an internal app, tested on two machines:

```
           smol-relocs     baseline        difference (95% CI)
sys_time   7.367 ± 0.138   7.543 ± 0.157   [  +0.9% ..   +3.8%]
user_time  21.843 ± 0.351  21.861 ± 0.450  [  -1.3% ..   +1.4%]
wall_time  20.301 ± 0.307  20.556 ± 0.324  [  +0.1% ..   +2.4%]
samples    16              16

           smol-relocs     baseline        difference (95% CI)
sys_time   2.923 ± 0.050   2.992 ± 0.018   [  +1.4% ..   +3.4%]
user_time  10.345 ± 0.039  10.448 ± 0.023  [  +0.8% ..   +1.2%]
wall_time  12.068 ± 0.071  12.229 ± 0.021  [  +1.0% ..   +1.7%]
samples    15              12
```

More importantly though, this change by itself reduces our maximum
resident set size by 220 MB (2.75%, from 7.85 GB to 7.64 GB) on the
first machine. On the second machine, it reduces it by 125 MB (1.94%,
from 6.31 GB to 6.19 GB).

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D113818
2021-11-16 16:30:34 -08:00
Greg McGary 3a1b3c9afe [lld-macho][nfc] rename parsed-section types & variables
This is an NFC diff that prepares for pruning & relocating `__eh_frame`.

Along the way, I made the following changes to ...
* clarify usage of `section` vs. `subsection`
* remove `map` & `vec` from type names
* disambiguate class `Section` from template parameter `SectionHeader`.

Differential Revision: https://reviews.llvm.org/D113241
2021-11-16 07:06:41 -07:00
Vy Nguyen 9b29dae3ca [lld-macho] Allow exporting weak_def_can_be_hidden(AKA "autohide") symbols
autohide symbols behaves similarly to private_extern symbols.
However, LD64 allows exporting autohide symbols. LLD currently does not.
This patch allows LLD to export them.

Differential Revision: https://reviews.llvm.org/D113167
2021-11-12 21:57:30 -05:00
Jez Ng d9b6f7e312 [lld-macho] Teach ICF to dedup functions with identical unwind info
Dedup'ing unwind info is tricky because each CUE contains a different
function address, if ICF operated naively and compared the entire
contents of each CUE, entries with identical unwind info but belonging
to different functions would never be considered identical. To work
around this problem, we slice away the function address before
performing ICF. We rely on `relocateCompactUnwind()` to correctly handle
these truncated input sections.

Here are the numbers before and after D109944, D109945, and this diff
were applied, as tested on my 3.2 GHz 16-Core Intel Xeon W:

Without any optimizations:

             base           diff           difference (95% CI)
  sys_time   0.849 ± 0.015  0.896 ± 0.012  [  +4.8% ..   +6.2%]
  user_time  3.357 ± 0.030  3.512 ± 0.023  [  +4.3% ..   +5.0%]
  wall_time  3.944 ± 0.039  4.032 ± 0.031  [  +1.8% ..   +2.6%]
  samples    40             38

With `-dead_strip`:

             base           diff           difference (95% CI)
  sys_time   0.847 ± 0.010  0.896 ± 0.012  [  +5.2% ..   +6.5%]
  user_time  3.377 ± 0.014  3.532 ± 0.015  [  +4.4% ..   +4.8%]
  wall_time  3.962 ± 0.024  4.060 ± 0.030  [  +2.1% ..   +2.8%]
  samples    47             30

With `-dead_strip` and `--icf=all`:

             base           diff           difference (95% CI)
  sys_time   0.935 ± 0.013  0.957 ± 0.018  [  +1.5% ..   +3.2%]
  user_time  3.472 ± 0.022  6.531 ± 0.046  [ +87.6% ..  +88.7%]
  wall_time  4.080 ± 0.040  5.329 ± 0.060  [ +30.0% ..  +31.2%]
  samples    37             30

Unsurprisingly, ICF is now a lot slower, likely due to the much larger
number of input sections it needs to process. But the rest of the
linker only suffers a mild slowdown.

Note that the compact-unwind-bad-reloc.s test was expanded because we
now handle the relocation for CUE's function address in a separate code
path from the rest of the CUE relocations. The extended test covers both
code paths.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D109946
2021-11-12 16:02:49 -05:00
Jez Ng ad8df21db2 [reland][lld-macho] Fix symbol relocs handling for compact unwind's functionAddress
Clang seems to emit all functionAddress relocs as section relocs, but
`ld -r` can turn those relocs into symbol ones. It turns out that we
weren't handling that case correctly when the symbol was a weak def
whose definition did not prevail.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D113702
2021-11-12 15:01:51 -05:00
Kazu Hirata 835135a8ae Revert "[lld-macho] Fix symbol relocs handling for compact unwind's functionAddress"
This reverts commit e941fe5061.

The commit in question causes:

  lld/MachO/InputFiles.cpp:916:13: error: use of undeclared identifier
  'it'
2021-11-11 20:29:48 -08:00
Jez Ng e941fe5061 [lld-macho] Fix symbol relocs handling for compact unwind's functionAddress
Clang seems to emit all functionAddress relocs as section relocs, but
`ld -r` can turn those relocs into symbol ones. It turns out that we
weren't handling that case correctly when the symbol was a weak def
whose definition did not prevail.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D113702
2021-11-11 22:53:35 -05:00
Keith Smiley 0bce3e3b84 [lld-macho] Clear resolvedReads cache
https://reviews.llvm.org/D113153#3108083

smeenai, int3

Differential Revision: https://reviews.llvm.org/D113198
2021-11-04 18:02:34 -07:00
Keith Smiley d49e7244cc [lld-macho] Cache readFile results
In one of our links lld was reading 760k files, but the unique number of
files was only 1500. This takes that link from 30 seconds to 8.

This seems like a heavy hammer, especially since some things don't need
to be cached, like the filelist arguments and the passed static
archives (the latter is already cached as a one off), but it seems ld64
does something similar here to short circuit these duplicate reads:

82e429e186/src/ld/InputFiles.cpp (L644-L665)

Of the types of files being read for our iOS app, the biggest problem
was constantly re-reading small tbd files:

```
% wc -l /tmp/read.txt
761414 /tmp/read.txt
% cat /tmp/read.txt | sort -u | wc -l
1503

% cat /tmp/read.txt | grep "\.a$" | wc -l
43721
% cat /tmp/read.txt | grep "\.tbd$" | wc -l
717656
```

We could likely hoist this logic up to not cache at this level, but it
would be a more invasive change to make sure all callers that needed it
cached the results.

I could see this being an issue with OOMs, and I'm not a linker expert so
maybe there's another way we should solve this problem? Feedback welcome!

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D113153
2021-11-03 22:12:21 -07:00
Keith Smiley 6629ec3ecc [lld-macho] Implement -arch_errors_fatal
By default with ld64, architecture mismatches are just warnings, then
this flag can be passed to make these fail. This matches that behavior.

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D113082
2021-11-03 22:01:53 -07:00
Vy Nguyen 3f35dd06a5 [lld-macho][nfc][cleanup] Fix a few code style lints and clang-tidy findings
- Use .empty() instead of `size() == 0` when possible.
- Use const-ref to avoid copying

Differential Revision: https://reviews.llvm.org/D112978
2021-11-02 11:26:15 -04:00
Jez Ng 1d2a4cd57d [lld-macho] Fix compact-unwind-bad-reloc.s test
Broken by a9353dbe51.

Now that the functions point to the compact unwind entries, instead of
the other way around, we need to perform the "invalid reference" check
in a different place.

This change was originally part of the stacked diff D109946, but should
have been included as part of D109945.
2021-10-26 18:59:12 -04:00
Jez Ng 002eda7056 [lld-macho] Associate compact unwind entries with function symbols
Compact unwind entries (CUEs) contain pointers to their respective
function symbols. However, during the link process, it's far more useful
to have pointers from the function symbol to the CUE than vice versa.
This diff adds that pointer in the form of `Defined::compactUnwind`.

In particular, when doing dead-stripping, we want to mark CUEs live when
their function symbol is live; and when doing ICF, we want to dedup
sections iff the symbols in that section have identical CUEs. In both
cases, we want to be able to locate the symbols within a given section,
as well as locate the CUEs belonging to those symbols. So this diff also
adds `InputSection::symbols`.

The ultimate goal of this refactor is to have ICF support dedup'ing
functions with unwind info, but that will be handled in subsequent
diffs. This diff focuses on simplifying `-dead_strip` --
`findFunctionsWithUnwindInfo` is no longer necessary, and
`Defined::isLive()` is now a lot simpler. Moreover, UnwindInfoSection no
longer has to check for dead CUEs -- we simply avoid adding them in the
first place.

Additionally, we now support stripping of dead LSDAs, which follows
quite naturally since `markLive()` can now reach them via the CUEs.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D109944
2021-10-26 16:04:15 -04:00
Kaining Zhong aab0f2264a [lld-macho] Fix dangling string reference when adding frameworks
In Driver.cpp, addFramework used std::string instance to represent the path of a framework, which will be freed after the function returns. However, this string is stored in loadedArchive, which will be used later to compare with path of newly added frameworks. This caused https://bugs.llvm.org/show_bug.cgi?id=52133. A test is included in this commit to reproduce this bug.

Now resolveDylibPath returns a StringRef instance, and it uses StringSaver to save its data, then returns it to functions on the top. This ensures the resolved framework path is still valid after LC_LINKER_OPTION is parsed.

Reviewed By: int3, #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D111706
2021-10-20 11:21:40 -04:00
Jez Ng 91ace9f062 [lld-macho] Construct CFString literals by copying the ConcatInputSection
... instead of constructing a new one each time. This allows us
to take advantage of {D105305}.

I didn't see a substantial difference when linking chromium_framework,
but this paves the way for reusing similar logic for splitting compact
unwind entries into sections. There are a lot more of those, so the
performance impact is significant.

Differential Revision: https://reviews.llvm.org/D109895
2021-09-17 19:46:20 -04:00
Jez Ng 9065fe5591 [lld-macho] Refactor archive loading
The previous logic was duplicated between symbol-initiated
archive loads versus flag-initiated loads (i.e. `-force_load` and
`-ObjC`). This resulted in code duplication as well as redundant work --
we would create Archive instances twice whenever we had one of those
flags; once in `getArchiveMembers` and again when we constructed the
ArchiveFile.

This was motivated by an upcoming diff where we load archive members
containing ObjC-related symbols before loading those containing
ObjC-related sections, as well as before performing symbol resolution.
Without this refactor, it would be difficult to do that while avoiding
loading the same archive member twice.

Differential Revision: https://reviews.llvm.org/D108780
2021-08-26 18:52:07 -04:00
Vincent Lee 08d55c5c01 [lld-macho] Refactor parseSections to avoid creating isec on LLVM segments
Address post follow up comment in D108016. Avoid creating isec for
LLVM segments since we are skipping over it.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D108167
2021-08-16 18:47:50 -07:00
Vincent Lee 15dc93e61c [lld-macho] Ignore LLVM segments to prevent duplicate syms
There was an instance of a third-party archive containing multiple
_llvm symbols from different files that clashed with each other
producing duplicate symbols. Symbols under the LLVM segment
don't seem to be producing any meaningful value, so just ignore them.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D108016
2021-08-16 12:41:03 -07:00
Jez Ng e49374f9e0 [lld-macho] Support common symbols in bitcode (but differently from ld64)
ld64 seems to handle common symbols in bitcode rather
bizarrely. They follow entirely different precedence rules from their
non-bitcode counterparts. I initially tried to emulate ld64 in D106597,
but I'm not sure the extra complexity is worth it, especially given that
common symbols are not, well, very common.

This diff accords common bitcode symbols the same precedence as regular
common symbols, just as we treat all other pairs of bitcode and
non-bitcode symbol types. The tests document ld64's behavior in detail,
just in case we want to revisit this.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D107027
2021-07-29 11:07:50 -04:00
Nico Weber dd57915b1e [lld/mac] Fix sub-library.s on Windows after 8e8701abca
The endswith() check for the framework name fails when joining
with the native path separator. Always use the posix separator as fix.
2021-07-27 15:25:52 -04:00
Nico Weber 8e8701abca [lld/mac] When loading reexports, look for basename in -F / -L first
Matches ld64 (cf Options::findIndirectDylib()), and fixes PR51218.

Differential Revision: https://reviews.llvm.org/D106842
2021-07-27 14:28:52 -04:00
Leonard Grey 5acc6d4572 [lld-macho] Disambiguate bitcode files with the same name by archive name/offset in archive
Ported from COFF/ELF; test is adapted from
test/COFF/thinlto-archivecollision.ll

LTO expects every bitcode file to have a unique name. If given multiple bitcode
files with the same name, it errors with "Expected at most one ThinLTO module
per bitcode file".

This change incorporates the archive name, to disambiguate members with the
same name in different archives and the offset in archive to disambiguate
members with the same name in the same archive.

Differential Revision: https://reviews.llvm.org/D106179
2021-07-22 22:50:25 -04:00
Nico Weber fbb45947b2 [lld/mac] Resolve defined symbols before undefined symbols
Ports https://reviews.llvm.org/D95985 to the MachO port.
Happens to fix PR51135; see that bug for details.
Also makes lld's behavior match ld64 for the included test case.

Differential Revision: https://reviews.llvm.org/D106293
2021-07-19 16:37:41 -04:00
Nico Weber f21801dab2 [lld/mac] Implement -application_extension
Differential Revision: https://reviews.llvm.org/D105818
2021-07-12 13:42:16 -04:00
Jez Ng 11a0d23650 [lld-macho][nfc] clang-format 2021-07-11 18:36:59 -04:00
Jez Ng f6e84a84f9 [lld-macho][nfc] Avoid using std::map for PlatformKinds
The mappings we were using had a small number of keys, so a vector is
probably better. This allows us to remove the last usage of std::map in
our codebase.

I also used `removeSimulator` to simplify the code a bit further.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105786
2021-07-11 18:24:53 -04:00
Nico Weber 2c25f39fcc [lld/mac] Implement -final_output
This is one of two flags clang passes to the linker when giving calling
clang with multiple -arch flags.

I think it'd make sense to also use finalOutput instead of outputFile
in CodeSignatureSection() and when replacing @executable_path, but
ld64 doesn't do that, so I'll at least put those in separate commits.

Differential Revision: https://reviews.llvm.org/D105449
2021-07-05 20:06:26 -04:00
Jez Ng bcaf57cae8 [lld-macho] Parse relocations quickly by assuming sorted order
clang and gcc both seem to emit relocations in reverse order of
address. That means we can match relocations to their containing
subsections in `O(relocs + subsections)` rather than the `O(relocs *
log(subsections))` that our previous binary search implementation
required.

Unfortunately, `ld -r` can still emit unsorted relocations, so we have a
fallback code path for that (less common) case.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.04          4.11         4.075        4.0775   0.018027756
  +  20          3.95          4.02          3.98         3.985   0.020900768
  Difference at 95.0% confidence
          -0.0925 +/- 0.0124919
          -2.26855% +/- 0.306361%
          (Student's t, pooled s = 0.0195172)

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105410
2021-07-05 01:13:44 -04:00
Vy Nguyen c7c5a1c9ae [lld-macho] Ignore debug symbols while preparing relocations.
Details: see https://bugs.llvm.org/show_bug.cgi?id=50812

Differential Revision: https://reviews.llvm.org/D105210
2021-07-02 13:51:46 -04:00
Jez Ng f6b6e72143 [lld-macho] Factor out common InputSection members
We have been creating many ConcatInputSections with identical values due
to .subsections_via_symbols. This diff factors out the identical values
into a Shared struct, to reduce memory consumption and make copying
cheaper.

I also changed `callSiteCount` from a uint32_t to a 31-bit field to save an
extra word.

All in all, this takes InputSection from 120 to 72 bytes (and
ConcatInputSection from 160 to 112 bytes), i.e. 30% size reduction in
ConcatInputSection.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.14          4.24          4.18         4.183   0.027548999
  +  20          4.04          4.11         4.075        4.0775   0.018027756
  Difference at 95.0% confidence
          -0.1055 +/- 0.0149005
          -2.52211% +/- 0.356215%
          (Student's t, pooled s = 0.0232803)

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105305
2021-07-01 21:22:39 -04:00
Jez Ng ac2dd06b91 [lld-macho] Deduplicate CFStrings
`__cfstring` is a special literal section, so instead of breaking it up
at symbol boundaries, we break it up at fixed-width boundaries (since
each literal is the same size). Symbols can only occur at one of those
boundaries, so this is strictly more powerful than
`.subsections_via_symbols`.

With that in place, we then run the section through ICF.

This change is about perf-neutral when linking chromium_framework.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D105045
2021-07-01 21:22:38 -04:00
Nico Weber aed0a08c69 [lld/mac] Make symbol table order deterministic
SymtabSection::emitStabs() writes the symbol table in the order
of externalSymbols, which has the order of symtab->getSymbols(),
which is just the order symbols are added to the symbol table.

In practice, symbols in the symbol files of input .o files are
sorted, but since that's not guaranteed we sort them in
ObjFile::parseSymbols(). To make sure several symbols with the same
address keep the order they're in the input file, we have to use
stable_sort().

In practice, std::sort() on already-sorted inputs won't change the order
of just adjacent elements, and while in theory std::sort() could use a
random pivot, in practice the code should be deterministic as it was
previously too.

But now lld/test/MachO/stabs.s passes with LLVM_ENABLE_EXPENSIVE_CHECKS=ON
(the last test that was failing with that set).

Fixes a regression from D99972.

While here, remove an empty section in stabs.s and move
.subsections_via_symbols to the end where it usually is (this part no
behavior change).

Differential Revision: https://reviews.llvm.org/D105071
2021-06-29 09:29:49 -04:00
Leonard Grey a8a6e5b094 [lld-macho] Preserve alignment for non-deduplicated cstrings
Fixes PR50637.

Downstream bug: https://crbug.com/1218958

Currently, we split __cstring along symbol boundaries with .subsections_via_symbols
when not deduplicating, and along null bytes when deduplicating. This change splits
along null bytes unconditionally, and preserves original alignment in the non-
deduplicated case.

Removing subsections-section-relocs.s because with this change, __cstring
is never reordered based on the order file.

Differential Revision: https://reviews.llvm.org/D104919
2021-06-28 22:26:43 -04:00
Jez Ng 4c49f9ceaf [lld-macho] Handle non-extern symbols marked as private extern
Previously, we asserted that such a case was invalid, but in fact
`ld -r` can emit such symbols if the input contained a (true) private
extern, or if it contained a symbol started with "L".

Non-extern symbols marked as private extern are essentially equivalent
to regular TU-scoped symbols, so no new functionality is needed.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104502
2021-06-18 16:36:14 -04:00
Muhammad Omair Javaid 9777f3fd06 Fix build failure on 32 bit Arm
This patch fixes build failure caused by commit
f27e4548fc on 32 bit arm.

Differential Revision: https://reviews.llvm.org/D103292
2021-06-18 15:27:09 +00:00
Greg McGary f27e4548fc [lld-macho] Implement ICF
ICF = Identical C(ode|OMDAT) Folding

This is the LLD ELF/COFF algorithm, adapted for MachO. So far, only `-icf all` is supported. In order to support `-icf safe`, we will need to port address-significance tables (`.addrsig` directives) to MachO, which will come in later diffs.

`check-{llvm,clang,lld}` have 0 regressions for `lld -icf all` vs. baseline ld64.

We only run ICF on `__TEXT,__text` for reasons explained in the block comment in `ConcatOutputSection.cpp`.

Here is the perf impact for linking `chromium_framekwork` on a Mac Pro (16-core Xeon W) for the non-ICF case vs. pre-ICF:
```
    N           Min           Max        Median           Avg        Stddev
x  20          4.27          4.44          4.34         4.349   0.043029977
+  20          4.37          4.46         4.405        4.4115   0.025188761
Difference at 95.0% confidence
        0.0625 +/- 0.0225658
        1.43711% +/- 0.518873%
        (Student's t, pooled s = 0.0352566)
```

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D103292
2021-06-17 10:07:44 -07:00
Jez Ng eeac6b2bec [lld-macho] Handle multiple LC_LINKER_OPTIONs
We previously only parsed the first one.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104352
2021-06-16 15:23:06 -04:00
Jez Ng d52d1b93c3 [lld-macho] Downgrade version mismatch to warning
It's a warning in ld64. While having LLD be stricter would be nice, it
makes it harder for it to be a drop-in replacement into existing builds.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104333
2021-06-16 11:06:26 -04:00
Nico Weber b579938d40 [lld/mac] Add support for -no_data_in_code_info flag
Differential Revision: https://reviews.llvm.org/D104345
2021-06-16 06:40:42 -04:00
Alexander Shaposhnikov 928394d109 [lld][MachO] Add support for LC_DATA_IN_CODE
Add first bits for emitting LC_DATA_IN_CODE.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D103006
2021-06-14 19:21:59 -07:00
Jez Ng cc17bfe489 [lld-macho] Fix "shift exponent too large" UBSAN error
UBSAN seems to have added this check somewhere along the way...

This might also fix the PPC buildbot, which is failing on the same test
2021-06-14 13:47:25 -04:00
Jez Ng 681cfeb591 [lld-macho][nfc] Have InputSection ctors take some parameters
This is motivated by an upcoming diff in which the
WordLiteralInputSection ctor sets itself up based on the value of its
section flags. As such, it needs to be passed the `flags` value as part
of its ctor parameters, instead of having them assigned after the fact
in `parseSection()`. While refactoring code to make that possible, I
figured it would make sense for the other InputSections to also take
their initial values as ctor parameters.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103978
2021-06-11 19:50:09 -04:00
Jez Ng 5d88f2dd94 [lld-macho] Deduplicate fixed-width literals
Conceptually, the implementation is pretty straightforward: we put each
literal value into a hashtable, and then write out the keys of that
hashtable at the end.

In contrast with ELF, the Mach-O format does not support variable-length
literals that aren't strings. Its literals are either 4, 8, or 16 bytes
in length. LLD-ELF dedups its literals via sorting + uniq'ing, but since
we don't need to worry about overly-long values, we should be able to do
a faster job by just hashing.

That said, the implementation right now is far from optimal, because we
add to those hashtables serially. To parallelize this, we'll need a
basic concurrent hashtable (only needs to support concurrent writes w/o
interleave reads), which shouldn't be to hard to implement, but I'd like
to punt on it for now.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.27          4.39         4.315        4.3225   0.033225703
  +  20          4.36          4.82          4.44        4.4845    0.13152846
  Difference at 95.0% confidence
          0.162 +/- 0.0613971
          3.74783% +/- 1.42041%
          (Student's t, pooled s = 0.0959262)

This corresponds to binary size savings of 2MB out of 335MB, or 0.6%.
It's not a great tradeoff as-is, but as mentioned our implementation can
be signficantly optimized, and literal dedup will unlock more
opportunities for ICF to identify identical structures that reference
the same literals.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D103113
2021-06-11 19:50:08 -04:00
Nico Weber 0e399eb527 [lld/mac] When handling @loader_path, use realpath() of symlinks
This is important for Frameworks, which are usually symlinks.

ld64 gets this right for @rpath that's replaced with @loader_path, but not for
bare @loader_path -- ld64's code calls realpath() in that case too, but ignores
the result.

ld64 somehow manages to find libbar1.dylib in the test without the
explicit `-rpath` in Foo1. I don't understand why or how. But this
change is a step forward and fixes an immediate problem I'm having,
so let's start with this :)

Differential Revision: https://reviews.llvm.org/D103990
2021-06-09 20:36:07 -04:00
Jez Ng 04259cde15 [lld-macho] Implement cstring deduplication
Our implementation draws heavily from LLD-ELF's, which in turn delegates
its string deduplication to llvm-mc's StringTableBuilder. The messiness of
this diff is largely due to the fact that we've previously assumed that
all InputSections get concatenated together to form the output. This is
no longer true with CStringInputSections, which split their contents into
StringPieces. StringPieces are much more lightweight than InputSections,
which is important as we create a lot of them. They may also overlap in
the output, which makes it possible for strings to be tail-merged. In
fact, the initial version of this diff implemented tail merging, but
I've dropped it for reasons I'll explain later.

**Alignment Issues**

Mergeable cstring literals are found under the `__TEXT,__cstring`
section. In contrast to ELF, which puts strings that need different
alignments into different sections, clang's Mach-O backend puts them all
in one section. Strings that need to be aligned have the `.p2align`
directive emitted before them, which simply translates into zero padding
in the object file.

I *think* ld64 extracts the desired per-string alignment from this data
by preserving each string's offset from the last section-aligned
address. I'm not entirely certain since it doesn't seem consistent about
doing this; but perhaps this can be chalked up to cases where ld64 has
to deduplicate strings with different offset/alignment combos -- it
seems to pick one of their alignments to preserve. This doesn't seem
correct in general; we can in fact can induce ld64 to produce a crashing
binary just by linking in an additional object file that only contains
cstrings and no code. See PR50563 for details.

Moreover, this scheme seems rather inefficient: since unaligned and
aligned strings are all put in the same section, which has a single
alignment value, it doesn't seem possible to tell whether a given string
doesn't have any alignment requirements. Preserving offset+alignments
for strings that don't need it is wasteful.

In practice, the crashes seen so far seem to stem from x86_64 SIMD
operations on cstrings. X86_64 requires SIMD accesses to be
16-byte-aligned. So for now, I'm thinking of just aligning all strings
to 16 bytes on x86_64. This is indeed wasteful, but implementation-wise
it's simpler than preserving per-string alignment+offsets. It also
avoids the aforementioned crash after deduplication of
differently-aligned strings. Finally, the overhead is not huge: using
16-byte alignment (vs no alignment) is only a 0.5% size overhead when
linking chromium_framework.

With these alignment requirements, it doesn't make sense to attempt tail
merging -- most strings will not be eligible since their overlaps aren't
likely to start at a 16-byte boundary. Tail-merging (with alignment) for
chromium_framework only improves size by 0.3%.

It's worth noting that LLD-ELF only does tail merging at `-O2`. By
default (at `-O1`), it just deduplicates w/o tail merging. @thakis has
also mentioned that they saw it regress compressed size in some cases
and therefore turned it off. `ld64` does not seem to do tail merging at
all.

**Performance Numbers**

CString deduplication reduces chromium_framework from 250MB to 242MB, or
about a 3.2% reduction.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          3.91          4.03         3.935          3.95   0.034641016
  +  20          3.99          4.14         4.015        4.0365     0.0492336
  Difference at 95.0% confidence
          0.0865 +/- 0.027245
          2.18987% +/- 0.689746%
          (Student's t, pooled s = 0.0425673)

As expected, cstring merging incurs some non-trivial overhead.

When passing `--no-literal-merge`, it seems that performance is the
same, i.e. the refactoring in this diff didn't cost us.

      N           Min           Max        Median           Avg        Stddev
  x  20          3.91          4.03         3.935          3.95   0.034641016
  +  20          3.89          4.02         3.935        3.9435   0.043197831
  No difference proven at 95.0% confidence

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D102964
2021-06-07 23:48:35 -04:00
Nico Weber 17c43c4045 [lld/mac] Add reexports after reexporter to inputFiles
When a library "host"'s reexports change their installName with
`$ld$os10.11$install_name$host`, we used to write a load command for "host" but
write the version numbers of the reexport instead of "host". This fixes that.

I first thought that the rule is to take the version numbers from the library
that originally had that install name (implemented in D103819), but that's not
what ld64 seems to be doing: It takes the version number from the first dylib
with that install name it loads, and it loads the reexporting library before
the reexports. We already did most of that, we just added reexports before the
reexporter. After this change, we add the reexporter before the reexports.

Addresses https://bugs.llvm.org/show_bug.cgi?id=49800#c11 part 1.

(ld64 seems to add reexports after processing _all_ files on the command line,
while we add them right after the reexporter. For the common case of reexport +
$ld$ symbol changing back to the exporter name, this doesn't make a difference,
but you can construct a case where it does. I expect this to not make a
difference in practice though.)

Differential Revision: https://reviews.llvm.org/D103821
2021-06-07 17:04:03 -04:00
Nico Weber c5ffe97988 [lld/mac] Implement support for searching dylibs with @rpath/ in install name
Also adjust a few comments, and move the DylibFile comment talking about
umbrella next to the parameter again.

Differential Revision: https://reviews.llvm.org/D103783
2021-06-07 06:22:52 -04:00
Nico Weber 52489021cf [lld/mac] Implement support for searching dylibs with @loader_path/ in install name
Differential Revision: https://reviews.llvm.org/D103779
2021-06-06 20:19:50 -04:00
Nico Weber a48bd587f7 [lld/mac] Implement support for searching dylibs with @executable_path/ in install name
Differential Revision: https://reviews.llvm.org/D103775
2021-06-06 20:01:50 -04:00
Nico Weber 7def700667 [lld/mac] Rename DylibFile::dylibName to DylibFile::installName
The flag to set it is called `-install_name`, and it's called `installName` in tbd files.

No behavior change.

Differential Revision: https://reviews.llvm.org/D103776
2021-06-06 20:00:35 -04:00
Nico Weber e910437443 [lld/mac] Use fewer magic numbers in magic $ld$ handling code
Also simply a conditional and de-alias a variable.
Minor cleanups, no behavior change.

Differential Revision: https://reviews.llvm.org/D103774
2021-06-06 18:13:16 -04:00
Alexander Shaposhnikov 5e49ee8794 [lld][MachO] Add support for $ld$install_name symbols
This diff adds support for $ld$install_name symbols.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D103746
2021-06-05 12:58:59 -07:00
Alexander Shaposhnikov 1309c181a8 [lld][MachO] Add first bits to support special symbols
This diff adds first bits to support special symbols $ld$previous* in LLD.
$ld$* symbols modify properties/behavior of the library
(e.g. its install name, compatibility version or hide/add symbols)
for specific target versions.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D103505
2021-06-04 23:32:26 -07:00
Jez Ng 6881f29a36 [lld-macho] Parse re-exports of nested TAPI documents
D103423 neglected to call `parseReexports()` for nested TBD
documents, leading to symbol resolution failures when trying to look up
a symbol nested more than one level deep in a TBD file. This fixes the
regression and adds a test.

It also appears that `umbrella` wasn't being set properly when calling
`parseLoadCommands` -- it's supposed to resolve to `this` if `nullptr`
is passed. I didn't write a failing test case for this but I've made
`umbrella` a member so the previous behavior should be preserved.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103586
2021-06-03 12:02:30 -04:00
Nico Weber a5645513db [lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.

Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).

Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:

    % ministat lld_*
    x lld_nostrip.txt
    + lld_strip.txt
        N           Min           Max        Median           Avg        Stddev
    x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
    +  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
    Difference at 95.0% confidence
            -0.144711 +/- 0.0336749
            -3.60967% +/- 0.839989%
            (Student's t, pooled s = 0.0358398)

This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols

It's possible it interacts with more features I didn't think of,
of course.

I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
  as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests

Implemenation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.

Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
  since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
  (`.no_dead_strip` in asm)

Fixes PR49276.

Differential Revision: https://reviews.llvm.org/D103324
2021-06-02 11:09:26 -04:00
Nico Weber 476e4d65d4 [lld/mac] Address review feedback and improve a comment
I forgot to move the message() call around as requested in D103428
before committing that change. Move it now.

Also, improve the ordinal uniq'ing comment. I hadn't realized that the
distinct-but-identical files happen with --reproduce and not in general.

No behavior change.

Differential Revision: https://reviews.llvm.org/D103522
2021-06-02 10:54:53 -04:00
Vy Nguyen 8f89c054af [lld-macho][nfc] Remove unnecessary use of Optional<T*>
In all of these cases, the functions could simply return a nullptr instead of {}.
There is no case where Optional<nullptr> has a special meaning.

Differential Revision: https://reviews.llvm.org/D103489
2021-06-01 18:35:31 -04:00
Nico Weber 2c1903412b [lld/mac] Implement removal of unused dylibs
This omits load commands for unreferenced dylibs if:
- the dylib was loaded implicitly,
- it is marked MH_DEAD_STRIPPABLE_DYLIB
- or -dead_strip_dylibs is passed

This matches ld64.

Currently, the "is dylib referenced" state is computed before dead code
stripping and is not updated after dead code stripping. This too matches ld64.
We should do better here.

With this, clang-format linked with lld (like with ld64) no longer has
libobjc.A.dylib in `otool -L` output. (It was implicitly loaded as a reexport
of CoreFoundation.framework, but it's not needed.)

Differential Revision: https://reviews.llvm.org/D103430
2021-06-01 16:06:30 -04:00
Nico Weber 24979e1113 [lld/mac] Don't load DylibFiles from the DylibFile constructor
loadDylib() keeps a name->DylibFile cache, but it only writes
to the cache once the DylibFile constructor has completed.
So dylib loads done recursively from the DylibFile constructor
wouldn't use the cache.

Now, we load additional dylibs after writing to the cache,
which means the cache now gets used for dylibs loaded because
they're referenced from other dylibs.

Related to PR49514 and PR50101, but no dramatic behavior change in itself.
(Technically we no longer crash when a tbd file reexports itself,
but that doesn't happen in practice. We now accept it silently instead
of crashing; ld64 has a diag for the reexport cycle.)

Differential Revision: https://reviews.llvm.org/D103423
2021-06-01 15:31:02 -04:00
Jez Ng 8535834ef7 [lld-macho][nfc] Misc code cleanup
* Move `static_asserts` into cpp instead of header file. I noticed they
  had been separated from the main class definition in the header, so I
  set about to clean that up, then figured it made more sense as part of
  the cpp file so as not to incur unnecessary compile-time overhead.

* Remove unnecessary `virtual`s

* Remove unnecessary comment / reword another comment
2021-05-25 14:58:29 -04:00
Alexander Shaposhnikov 57501e512e [lld][MachO] Fix code formatting
Apply clang-format -style=llvm to InputFile.cpp. NFC.

Test plan: make check-all
2021-05-23 20:35:55 -07:00
Nico Weber 4a12248ee2 [lld/mac] Honor REFERENCED_DYAMICALLY, set it on __mh_execute_header
Has the effect that `__mh_execute_header` stays in the symbol table of
outputs even after running `strip` on the output. I don't know if that's
important for anything -- my motivation for the patch is just is to make
the output more similar to ld64.

(Corresponds to symbolTableInAndNeverStrip in ld64.)

Differential Revision: https://reviews.llvm.org/D102619
2021-05-17 14:22:12 -04:00
Jez Ng b1c3c2e4fc [lld-macho] Fix order file arch filtering
We had a hardcoded check and a stale TODO, written back when we only had
support for one architecture.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D102154
2021-05-10 15:45:54 -04:00
Nico Weber 7f673fcaa9 [lld/mac] Fix alignment on subsections
On a section with alignment of 16, subsections aligned to 16-byte
boundaries should keep their 16-byte alignment.

Fixes PR50274. (The same bug could have happened with -order_file
previously.)

Differential Revision: https://reviews.llvm.org/D102139
2021-05-09 21:00:56 -04:00
Nico Weber d5a70db193 [lld/mac] Write every weak symbol only once in the output
Before this, if an inline function was defined in several input files,
lld would write each copy of the inline function the output. With this
patch, it only writes one copy.

Reduces the size of Chromium Framework from 378MB to 345MB (compared
to 290MB linked with ld64, which also does dead-stripping, which we
don't do yet), and makes linking it faster:

        N           Min           Max        Median           Avg        Stddev
    x  10     3.9957051     4.3496981     4.1411121      4.156837    0.10092097
    +  10      3.908154      4.169318     3.9712729     3.9846753   0.075773012
    Difference at 95.0% confidence
            -0.172162 +/- 0.083847
            -4.14165% +/- 2.01709%
            (Student's t, pooled s = 0.0892373)

Implementation-wise, when merging two weak symbols, this sets a
"canOmitFromOutput" on the InputSection belonging to the weak symbol not put in
the symbol table. We then don't write InputSections that have this set, as long
as they are not referenced from other symbols. (This happens e.g. for object
files that don't set .subsections_via_symbols or that use .alt_entry.)

Some restrictions:
- not yet done for bitcode inputs
- no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) --
  Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs)
  (that is, catch block unwind information) and Personality Routines
  associated with weak functions still not stripped. This is wasteful,
  but harmless.
- However, this does strip weaks from __unwind_info (which is needed for
  correctness and not just for size)
- This nopes out on InputSections that are referenced form more than
  one symbol (eg from .alt_entry) for now

Things that work based on symbols Just Work:
- map files (change in MapFile.cpp is no-op and not needed; I just
  found it a bit more explicit)
- exports

Things that work with inputSections need to explicitly check if
an inputSection is written (e.g. unwind info).

This patch is useful in itself, but it's also likely also a useful foundation
for dead_strip.

I used to have a "canoncialRepresentative" pointer on InputSection instead of
just the bool, which would be handy for ICF too. But I ended up not needing it
for this patch, so I removed that again for now.

Differential Revision: https://reviews.llvm.org/D102076
2021-05-07 17:11:40 -04:00
Jez Ng 9260760235 [lld-macho] Support loading of zippered dylibs
ld64 can emit dylibs that support more than one platform (typically macOS and
macCatalyst). This diff allows LLD to read in those dylibs. Note that this is a
super bare-bones implementation -- in particular, I haven't added support for
LLD to emit those multi-platform dylibs, nor have I added a variety of
validation checks that ld64 does. Until we have a use-case for emitting zippered
dylibs, I think this is good enough.

Fixes PR49597.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D101954
2021-05-06 11:19:40 -04:00
Vy Nguyen 23233ad139 [lld-macho] Check simulator platforms to avoid issuing false positive errors.
Currently the linker causes unnecessary errors when either the target or the config's platform is a simulator.

Differential Revision: https://reviews.llvm.org/D101855
2021-05-05 18:07:58 -04:00
Jez Ng 001ba65375 [lld-macho] De-templatize mach_header operations
@thakis pointed out that `mach_header` and `mach_header_64`
actually have the same set of (used) fields, with the 64-bit version
having extra padding. So we can access the fields we need using the
single `mach_header` type instead of using templates to switch between
the two.

I also spotted a potential issue where hasObjCSection tries to parse a
file w/o checking if it does indeed match the target arch... As such,
I've added a quick magic number check to ensure we don't access invalid
memory during `findCommand()`.

Addresses PR50180.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D101724
2021-05-03 18:31:23 -04:00
Jez Ng 05c5363b39 [lld-macho] Parse & emit the N_ARM_THUMB_DEF symbol flag
Eventually we'll use this flag to properly handle bl/blx
opcodes.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D101558
2021-04-30 16:17:26 -04:00
Nico Weber 4b456038e4 [lld/mac] Tweak two comments and fix style on one variable name
Cosmetic, no behavior change.
2021-04-30 09:30:51 -04:00
Nico Weber a38ebed258 [lld/mac] Implement support for .weak_def_can_be_hidden
I first had a more invasive patch for this (D101069), but while trying
to get that polished for review I realized that lld's current symbol
merging semantics mean that only a very small code change is needed.
So this goes with the smaller patch for now.

This has no effect on projects that build with -fvisibility=hidden
(e.g.  chromium), since these see .private_extern symbols instead.

It does have an effect on projects that build with -fvisibility-inlines-hidden
(e.g. llvm) in -O2 builds, where LLVM's GlobalOpt pass will promote most inline
functions from .weak_definition to .weak_def_can_be_hidden.

Before this patch:

    % ls -l out/gn/bin/clang out/gn/lib/libclang.dylib
    -rwxr-xr-x  1 thakis  staff  113059936 Apr 22 11:51 out/gn/bin/clang
    -rwxr-xr-x  1 thakis  staff   86370064 Apr 22 11:51 out/gn/lib/libclang.dylib
    % out/gn/bin/llvm-objdump --macho --weak-bind out/gn/bin/clang | wc -l
        8291
    % out/gn/bin/llvm-objdump --macho --weak-bind out/gn/lib/libclang.dylib | wc -l
        5698

With this patch:

    % ls -l out/gn/bin/clang out/gn/lib/libclang.dylib
    -rwxr-xr-x  1 thakis  staff  111721096 Apr 22 11:55 out/gn/bin/clang
    -rwxr-xr-x  1 thakis  staff   85291208 Apr 22 11:55 out/gn/lib/libclang.dylib
    thakis@MBP llvm-project % out/gn/bin/llvm-objdump --macho --weak-bind out/gn/bin/clang | wc -l
         725
    thakis@MBP llvm-project % out/gn/bin/llvm-objdump --macho --weak-bind out/gn/lib/libclang.dylib | wc -l
         542

Linking clang becomes a tiny bit faster with this patch:

    x 100    0.67263818    0.77847815    0.69430709    0.69877208   0.017715892
    + 100    0.67209601    0.73323393    0.68600798    0.68917346   0.012824377
    Difference at 95.0% confidence
            -0.00959861 +/- 0.00428661
            -1.37364% +/- 0.613449%
            (Student's t, pooled s = 0.0154648)

This only happens if lld with the patch and lld without the patch are both
linked with an lld with the patch or both linked with an lld without the patch
(...or with ld64). I accidentally linked the lld with the patch with an lld
without the patch and the other way round at first. In that setup, no
difference is found. That makese sense, since having fewer weak imports will
make the linked output a bit faster too. So not only does this make linking
binaries such as clang a bit faster (since fewer exports need to be written to
the export trie by lld), the linked output binary binary is also a bit faster
(since dyld needs to process fewer dynamic imports).

This also happens to fix the one `check-clang` failure when using lld as host
linker, but mostly for silly reasons: See crbug.com/1183336, mostly comment 26.
The real bug here is that c-index-test links all of LLVM both statically and
dynamically, which is an ODR violation. Things just happen to work with this
patch.

So after this patch, check-clang, check-lld, check-llvm all pass with lld as
host linker :)

Differential Revision: https://reviews.llvm.org/D101080
2021-04-22 22:51:34 -04:00
Jez Ng 8c17a87515 [re-land][lld-macho] Fix min version check
We had got it backwards... the minimum version of the target
should be higher than the min version of the object files, presumably
since new platforms are backwards-compatible with older formats.

Fixes PR50078.

The original commit (aa05439c9c) broke many tests that had inputs too
new for our target platform (10.0). This commit changes the inputs to
target 10.0, which was the simpler thing to do, but we should really
just have our lit.local.cfg default to targeting 10.15... we're not
likely to ever have proper support for the older versions anyway. I will
follow up with a change to that effect.

Differential Revision: https://reviews.llvm.org/D101114
2021-04-22 19:35:32 -04:00
Jez Ng 75ecb804b1 Revert "[lld-macho] Fix min version check"
This reverts commit aa05439c9c.
2021-04-22 19:07:41 -04:00
Jez Ng aa05439c9c [lld-macho] Fix min version check
We had got it backwards... the minimum version of the target
should be higher than the min version of the object files, presumably
since new platforms are backwards-compatible with older formats.

Fixes PR50078.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D101114
2021-04-22 18:25:44 -04:00
Jez Ng ed4a4e3312 [lld-macho][nfc] Add accessors for commonly-used PlatformInfo fields
As discussed here: https://reviews.llvm.org/D100523#inline-951543

Reviewed By: #lld-macho, thakis, alexshap

Differential Revision: https://reviews.llvm.org/D100978
2021-04-21 15:43:56 -04:00
Alexander Shaposhnikov b5720354ef [lld][MachO] Refactor findCommand
Refactor findCommand to allow passing multiple types. NFC.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D100954
2021-04-21 08:38:17 -07:00
Alexander Shaposhnikov 5c835e1ae5 [lld][MachO] Add support for LC_VERSION_MIN_* load commands
This diff adds initial support for the legacy LC_VERSION_MIN_* load commands.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D100523
2021-04-21 05:41:14 -07:00
Jez Ng 7208bd4b32 [lld-macho] Skip platform checks for a few libSystem re-exports
XCode 12 ships with mismatched platforms for these libraries,
so this hack is necessary...

Fixes PR49799.

Reviewed By: #lld-macho, gkm, smeenai

Differential Revision: https://reviews.llvm.org/D100913
2021-04-20 19:54:53 -04:00