Commit Graph

328 Commits

Author SHA1 Message Date
Vincent Lee d695d0d6f6 [lld-macho] Optimize bind opcodes with multiple passes
In D105866, we used an intermediate container to store a list of opcodes. Here,
we use that data structure to help us perform optimization passes that would allow
a more efficient encoding of bind opcodes. Currently, the functionality mirrors the
optimization pass {1,2} done in ld64 for bind opcodes under optimization gate
to prevent slight regressions.

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D105867
2021-07-15 20:52:46 -07:00
Leonard Grey c931ff72bd [lld-macho] Add LTO cache support
This adds support for the lld-only `--thinlto-cache-policy` option, as well as
implementations for ld64's `-cache_path_lto`, `-prune_interval_lto`,
`-prune_after_lto`, and `-max_relative_cache_size_lto`.

Test is adapted from lld/test/ELF/lto/cache.ll

Differential Revision: https://reviews.llvm.org/D105922
2021-07-15 12:56:13 -04:00
Alexander Shaposhnikov d21772fa21 [lld][MachO] Code cleanup
Make use of ArgList::getLastArgValue. NFC.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D105452
2021-07-14 04:33:09 -07:00
Nico Weber f21801dab2 [lld/mac] Implement -application_extension
Differential Revision: https://reviews.llvm.org/D105818
2021-07-12 13:42:16 -04:00
Jez Ng 11a0d23650 [lld-macho][nfc] clang-format 2021-07-11 18:36:59 -04:00
Jez Ng 28a2102ee3 [lld-macho][nfc] Remove unnecessary llvm:: namespace prefixes 2021-07-11 18:36:53 -04:00
Jez Ng f6e84a84f9 [lld-macho][nfc] Avoid using std::map for PlatformKinds
The mappings we were using had a small number of keys, so a vector is
probably better. This allows us to remove the last usage of std::map in
our codebase.

I also used `removeSimulator` to simplify the code a bit further.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105786
2021-07-11 18:24:53 -04:00
Nico Weber 6e05c1cd5f [lld/mac] Always reference dyld_stub_binder when linked with libSystem
lld currently only references dyld_stub_binder when it's needed.
ld64 always references it when libSystem is linked.
Match ld64.

The (somewhat lame) motivation is that `nm` on a binary without any
export writes a "no symbols" warning to stderr, and this change makes
it so that every binary in practice has at least a reference to
dyld_stub_binder, which suppresses that.

Every "real" output file will reference dyld_stub_binder, so most
of the time this shouldn't make much of a difference. And if you
really don't want to have this reference for whatever reason, you
can stop passing -lSystem, like you have to for ld64 anyways.

(After linking any dylib, we dump the exported list of symbols to
a txt file with `nm` and only relink downstream deps if that txt
file changes. A nicer fix is to make lld optionally write .tbd files
with the public interface of a linked dylib and use that instead,
but for now the txt files are what we do.)

Differential Revision: https://reviews.llvm.org/D105782
2021-07-11 13:37:48 -04:00
Mikael Holmen 21fd875952 [lld/mac] Fix warning about unused variable [NFC]
Change "dyn_cast" to "isa" to get rid of the unused
variable "bitcodeFile".

gcc warned with

lld/MachO/Driver.cpp:531:17: warning: unused variable 'bitcodeFile' [-Wunused-variable]
531 |       if (auto *bitcodeFile = dyn_cast<BitcodeFile>(file)) {
    |                 ^~~~~~~~~~~
2021-07-08 09:46:30 +02:00
Nico Weber 3eb2fc4b50 [lld/mac] Partially implement -export_dynamic
This implements the part of -export_dynamic that adds external
symbols as dead strip roots even for executables.

It does not yet implement the effect -export_dynamic has for LTO.
I tried just replacing `config->outputType != MH_EXECUTE` with
`(config->outputType != MH_EXECUTE || config->exportDynamic)` in
LTO.cpp, but then local symbols make it into the symbol table too,
which is too much (and also doesn't match ld64). So punt on this
for now until I understand it better.
(D91583 may or may not be related too).

Differential Revision: https://reviews.llvm.org/D105482
2021-07-06 11:22:18 -04:00
Nico Weber 64be5b7d87 [lld/mac] Implement -arch_multiple
This is the other flag clang passes when calling clang with two -arch
flags (which means with this, `clang -arch x86_64 -arch arm64 -fuse-ld=lld ...`
now no longer prints any warnings \o/). Since clang calls the linker several
times in that setup, it's not clear to the user from which invocation the
errors are. The flag's help text is

    Specifies that the linker should augment error and warning messages
    with the architecture name.

In ld64, the only effect of the flag is that undefined symbols are prefaced
with

    Undefined symbols for architecture x86_64:

instead of the usual "Undefined symbols:". So for now, let's add this
only to undefined symbol errors too. That's probably the most common
linker diagnostic.

Another idea would be to prefix errors and warnings with "ld64.lld(x86_64):"
instead of the usual "ld64.lld:", but I'm not sure if people would
misunderstand that as a comment about the arch of ld itself.
But open to suggestions on what effect this flag should have :) And we
don't have to get it perfect now, we can iterate on it.

Differential Revision: https://reviews.llvm.org/D105450
2021-07-06 00:25:18 -04:00
Nico Weber 2c25f39fcc [lld/mac] Implement -final_output
This is one of two flags clang passes to the linker when giving calling
clang with multiple -arch flags.

I think it'd make sense to also use finalOutput instead of outputFile
in CodeSignatureSection() and when replacing @executable_path, but
ld64 doesn't do that, so I'll at least put those in separate commits.

Differential Revision: https://reviews.llvm.org/D105449
2021-07-05 20:06:26 -04:00
Nico Weber db64306d99 [lld/mac] Implement -umbrella
I think this is an old way for doing what is done with
-reexport_library these days, but it's e.g. still used in libunwind's
build (the opensource.apple.com one, not the llvm one).

Differential Revision: https://reviews.llvm.org/D105448
2021-07-05 20:06:25 -04:00
Jez Ng f6b6e72143 [lld-macho] Factor out common InputSection members
We have been creating many ConcatInputSections with identical values due
to .subsections_via_symbols. This diff factors out the identical values
into a Shared struct, to reduce memory consumption and make copying
cheaper.

I also changed `callSiteCount` from a uint32_t to a 31-bit field to save an
extra word.

All in all, this takes InputSection from 120 to 72 bytes (and
ConcatInputSection from 160 to 112 bytes), i.e. 30% size reduction in
ConcatInputSection.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.14          4.24          4.18         4.183   0.027548999
  +  20          4.04          4.11         4.075        4.0775   0.018027756
  Difference at 95.0% confidence
          -0.1055 +/- 0.0149005
          -2.52211% +/- 0.356215%
          (Student's t, pooled s = 0.0232803)

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105305
2021-07-01 21:22:39 -04:00
Jez Ng 3a11528d97 [lld-macho] Move ICF earlier to avoid emitting redundant binds
This is a pretty big refactoring diff, so here are the motivations:

Previously, ICF ran after scanRelocations(), where we emitting
bind/rebase opcodes etc. So we had a bunch of redundant leftovers after
ICF. Having ICF run before Writer seems like a better design, and is
what LLD-ELF does, so this diff refactors it accordingly.

However, ICF had two dependencies on things occurring in Writer: 1) it
needs literals to be deduplicated beforehand and 2) it needs to know
which functions have unwind info, which was being handled by
`UnwindInfoSection::prepareRelocations()`.

In order to do literal deduplication earlier, we need to add literal
input sections to their corresponding output sections. So instead of
putting all input sections into the big `inputSections` vector, and then
filtering them by type later on, I've changed things so that literal
sections get added directly to their output sections during the 'gather'
phase. Likewise for compact unwind sections -- they get added directly
to the UnwindInfoSection now. This latter change is not strictly
necessary, but makes it easier for ICF to determine which functions have
unwind info.

Adding literal sections directly to their output sections means that we
can no longer determine `inputOrder` from iterating over
`inputSections`. Instead, we store that order explicitly on
InputSection. Bloating the size of InputSection for this purpose would
be unfortunate -- but LLD-ELF has already solved this problem: it reuses
`outSecOff` to store this order value.

One downside of this refactor is that we now make an additional pass
over the unwind info relocations to figure out which functions have
unwind info, since want to know that before `processRelocations()`. I've
made sure to run that extra loop only if ICF is enabled, so there should
be no overhead in non-optimizing runs of the linker.

The upside of all this is that the `inputSections` vector now contains
only ConcatInputSections that are destined for ConcatOutputSections, so
we can clean up a bunch of code that just existed to filter out other
elements from that vector.

I will test for the lack of redundant binds/rebases in the upcoming
cfstring deduplication diff. While binds/rebases can also happen in the
regular `.text` section, they're more common in `.data` sections, so it
seems more natural to test it that way.

This change is perf-neutral when linking chromium_framework.

Reviewed By: oontvoo

Differential Revision: https://reviews.llvm.org/D105044
2021-07-01 21:22:38 -04:00
Leonard Grey fe08e9c487 [lld-macho] Add support for LTO optimization level
Everything (including test) modified from ELF/COFF. Using the same syntax
(--lto-O3, etc) as ELF.

Differential Revision: https://reviews.llvm.org/D105223
2021-07-01 15:01:59 -04:00
Jez Ng b41b4148e7 [lld-macho] Only enable `__DATA_CONST` for newer platforms
Matches ld64.

Reviewed By: #lld-macho, alexander-shaposhnikov

Differential Revision: https://reviews.llvm.org/D105080
2021-06-30 18:55:48 -04:00
Jez Ng 557e1fa02f [lld-macho] Extend ICF to literal sections
Literal sections can be deduplicated before running ICF. That makes it
easy to compare them during ICF: we can tell if two literals are
constant-equal by comparing their offsets in their OutputSection.

LLD-ELF takes a similar approach.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D104671
2021-06-28 14:49:39 -04:00
Nico Weber 3a6a60f6c9 [lld/mac] Make a variable more local; no behavior change
The variable used to need the wider scope, but doesn't after the
reland. See LC_LINKER_OPTIONS-related discussion on
https://reviews.llvm.org/D104353 for background.
2021-06-20 21:59:15 -04:00
Jez Ng f79e7a5a48 [lld-macho] Have inputOrder default to less than INT_MAX
We make it less than INT_MAX in order not to conflict with the ordering
of zerofill sections, which must always be placed at the end of their
segment.

This is the more structural fix for the issue addressed in {D104596}.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104607
2021-06-20 19:49:27 -04:00
Jez Ng 4507f64165 [re-land][lld-macho] Avoid force-loading the same archive twice
This reverts commit c9b241efd6, which was
a backout diff to fix the buildbots.

The real culprit of the crash is
1d31fb8d12,
which is being reverted.

Differential Revision: https://reviews.llvm.org/D104353
2021-06-18 22:43:50 -04:00
Jez Ng a79c018325 Revert "[lld-macho] Have path-related functions return std::string, not StringRef"
This reverts commit 1d31fb8d12.

Making `rerootPath` return a temporary std::string caused a
use-after-free:

https://ci.chromium.org/ui/p/chromium/builders/try/win_upload_clang/1608/overview
2021-06-18 22:43:49 -04:00
Nico Weber c9b241efd6 Revert "[lld-macho] Avoid force-loading the same archive twice"
This reverts commit 24706cd73c.
Test seems to fail flakily. See comments on https://reviews.llvm.org/D104353
for a hypothesis for why.
2021-06-18 20:25:27 -04:00
Jez Ng 1d31fb8d12 [lld-macho] Have path-related functions return std::string, not StringRef
findLibrary() returned a StringRef while findFramework & other helper
functions returned std::strings. Standardize on std::string.

(I initially tried making the helper functions all return StringRefs,
but I realized we shouldn't return input StringRefs since their
lifetimes would not be obvious from the calling code.)
2021-06-18 16:36:14 -04:00
Nico Weber f7366890c2 [lld/mac] Support -data_in_code_info, -function_starts flags
These are on by default, but there's also an explicit flag for them.

Differential Revision: https://reviews.llvm.org/D104543
2021-06-18 13:01:42 -04:00
Greg McGary 8120c9e379 Rename option -icf MODE to --icf=MODE
The `icf` command-line option is not present in ld64, so it should use the LLD option syntax, which begins with double dashes and separates primary option from any suboption with the equal sign.

Differential Revision: https://reviews.llvm.org/D104548
2021-06-18 09:52:15 -07:00
Greg McGary f27e4548fc [lld-macho] Implement ICF
ICF = Identical C(ode|OMDAT) Folding

This is the LLD ELF/COFF algorithm, adapted for MachO. So far, only `-icf all` is supported. In order to support `-icf safe`, we will need to port address-significance tables (`.addrsig` directives) to MachO, which will come in later diffs.

`check-{llvm,clang,lld}` have 0 regressions for `lld -icf all` vs. baseline ld64.

We only run ICF on `__TEXT,__text` for reasons explained in the block comment in `ConcatOutputSection.cpp`.

Here is the perf impact for linking `chromium_framekwork` on a Mac Pro (16-core Xeon W) for the non-ICF case vs. pre-ICF:
```
    N           Min           Max        Median           Avg        Stddev
x  20          4.27          4.44          4.34         4.349   0.043029977
+  20          4.37          4.46         4.405        4.4115   0.025188761
Difference at 95.0% confidence
        0.0625 +/- 0.0225658
        1.43711% +/- 0.518873%
        (Student's t, pooled s = 0.0352566)
```

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D103292
2021-06-17 10:07:44 -07:00
Jez Ng 24706cd73c [lld-macho] Avoid force-loading the same archive twice
We need to dedup archive loads (similar to what we do for dylib
loads).

I noticed this issue after building some Swift stuff that used
`-force_load_swift_libs`, as it caused some Swift archives to be loaded
many times.

Reviewed By: #lld-macho, thakis, MaskRay

Differential Revision: https://reviews.llvm.org/D104353
2021-06-17 11:13:54 -04:00
Nico Weber b579938d40 [lld/mac] Add support for -no_data_in_code_info flag
Differential Revision: https://reviews.llvm.org/D104345
2021-06-16 06:40:42 -04:00
Jez Ng 681cfeb591 [lld-macho][nfc] Have InputSection ctors take some parameters
This is motivated by an upcoming diff in which the
WordLiteralInputSection ctor sets itself up based on the value of its
section flags. As such, it needs to be passed the `flags` value as part
of its ctor parameters, instead of having them assigned after the fact
in `parseSection()`. While refactoring code to make that possible, I
figured it would make sense for the other InputSections to also take
their initial values as ctor parameters.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103978
2021-06-11 19:50:09 -04:00
Jez Ng 7f2ba39b16 [lld-macho][nfc] Move liveness-tracking fields into ConcatInputSection
These fields currently live in the parent InputSection class,
but they should be specific to ConcatInputSection, since the other
InputSection classes (that contain literals) aren't atomically live or
dead -- rather their component string/int literals should have
individual liveness states. (An upcoming diff will add liveness bits for
StringPieces and fixed-sized literals.)

I also factored out some asserts for isCoalescedWeak() in MarkLive.cpp.
We now avoid putting coalesced sections in the `inputSections` vector,
so we don't have to check/assert against it everywhere.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103977
2021-06-11 19:50:08 -04:00
Nico Weber e87c095af3 [lld/mac] Print dylib search details with --print-dylib-search or RC_TRACE_DYLIB_SEARCHING
For debugging dylib loading, it's useful to have some insight into what
the linker is doing.

ld64 has the undocumented RC_TRACE_DYLIB_SEARCHING env var
for this printing dylib search candidates.

This adds a flag --print-dylib-search to make lld print the seame information.
It's useful for users, but also for writing tests. The output is formatted
slightly differently than ld64, but we still support RC_TRACE_DYLIB_SEARCHING
to offer at least a compatible way to trigger this.

ld64 has both `-print_statistics` and `-trace_symbol_output` to enable
diagnostics output. I went with "print" since that seems like a more
straightforward name.

Differential Revision: https://reviews.llvm.org/D103985
2021-06-09 22:08:20 -04:00
Jez Ng 447dfbe005 [lld-macho] Implement -force_load_swift_libs
It causes libraries whose names start with "swift" to be force-loaded.
Note that unlike the more general `-force_load`, this flag only applies
to libraries specified via LC_LINKER_OPTIONS, and not those passed on
the command-line. This is what ld64 does.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103709
2021-06-07 23:48:35 -04:00
Jez Ng 04259cde15 [lld-macho] Implement cstring deduplication
Our implementation draws heavily from LLD-ELF's, which in turn delegates
its string deduplication to llvm-mc's StringTableBuilder. The messiness of
this diff is largely due to the fact that we've previously assumed that
all InputSections get concatenated together to form the output. This is
no longer true with CStringInputSections, which split their contents into
StringPieces. StringPieces are much more lightweight than InputSections,
which is important as we create a lot of them. They may also overlap in
the output, which makes it possible for strings to be tail-merged. In
fact, the initial version of this diff implemented tail merging, but
I've dropped it for reasons I'll explain later.

**Alignment Issues**

Mergeable cstring literals are found under the `__TEXT,__cstring`
section. In contrast to ELF, which puts strings that need different
alignments into different sections, clang's Mach-O backend puts them all
in one section. Strings that need to be aligned have the `.p2align`
directive emitted before them, which simply translates into zero padding
in the object file.

I *think* ld64 extracts the desired per-string alignment from this data
by preserving each string's offset from the last section-aligned
address. I'm not entirely certain since it doesn't seem consistent about
doing this; but perhaps this can be chalked up to cases where ld64 has
to deduplicate strings with different offset/alignment combos -- it
seems to pick one of their alignments to preserve. This doesn't seem
correct in general; we can in fact can induce ld64 to produce a crashing
binary just by linking in an additional object file that only contains
cstrings and no code. See PR50563 for details.

Moreover, this scheme seems rather inefficient: since unaligned and
aligned strings are all put in the same section, which has a single
alignment value, it doesn't seem possible to tell whether a given string
doesn't have any alignment requirements. Preserving offset+alignments
for strings that don't need it is wasteful.

In practice, the crashes seen so far seem to stem from x86_64 SIMD
operations on cstrings. X86_64 requires SIMD accesses to be
16-byte-aligned. So for now, I'm thinking of just aligning all strings
to 16 bytes on x86_64. This is indeed wasteful, but implementation-wise
it's simpler than preserving per-string alignment+offsets. It also
avoids the aforementioned crash after deduplication of
differently-aligned strings. Finally, the overhead is not huge: using
16-byte alignment (vs no alignment) is only a 0.5% size overhead when
linking chromium_framework.

With these alignment requirements, it doesn't make sense to attempt tail
merging -- most strings will not be eligible since their overlaps aren't
likely to start at a 16-byte boundary. Tail-merging (with alignment) for
chromium_framework only improves size by 0.3%.

It's worth noting that LLD-ELF only does tail merging at `-O2`. By
default (at `-O1`), it just deduplicates w/o tail merging. @thakis has
also mentioned that they saw it regress compressed size in some cases
and therefore turned it off. `ld64` does not seem to do tail merging at
all.

**Performance Numbers**

CString deduplication reduces chromium_framework from 250MB to 242MB, or
about a 3.2% reduction.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          3.91          4.03         3.935          3.95   0.034641016
  +  20          3.99          4.14         4.015        4.0365     0.0492336
  Difference at 95.0% confidence
          0.0865 +/- 0.027245
          2.18987% +/- 0.689746%
          (Student's t, pooled s = 0.0425673)

As expected, cstring merging incurs some non-trivial overhead.

When passing `--no-literal-merge`, it seems that performance is the
same, i.e. the refactoring in this diff didn't cost us.

      N           Min           Max        Median           Avg        Stddev
  x  20          3.91          4.03         3.935          3.95   0.034641016
  +  20          3.89          4.02         3.935        3.9435   0.043197831
  No difference proven at 95.0% confidence

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D102964
2021-06-07 23:48:35 -04:00
Nico Weber 17c43c4045 [lld/mac] Add reexports after reexporter to inputFiles
When a library "host"'s reexports change their installName with
`$ld$os10.11$install_name$host`, we used to write a load command for "host" but
write the version numbers of the reexport instead of "host". This fixes that.

I first thought that the rule is to take the version numbers from the library
that originally had that install name (implemented in D103819), but that's not
what ld64 seems to be doing: It takes the version number from the first dylib
with that install name it loads, and it loads the reexporting library before
the reexports. We already did most of that, we just added reexports before the
reexporter. After this change, we add the reexporter before the reexports.

Addresses https://bugs.llvm.org/show_bug.cgi?id=49800#c11 part 1.

(ld64 seems to add reexports after processing _all_ files on the command line,
while we add them right after the reexporter. For the common case of reexport +
$ld$ symbol changing back to the exporter name, this doesn't make a difference,
but you can construct a case where it does. I expect this to not make a
difference in practice though.)

Differential Revision: https://reviews.llvm.org/D103821
2021-06-07 17:04:03 -04:00
Nico Weber a5645513db [lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.

Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).

Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:

    % ministat lld_*
    x lld_nostrip.txt
    + lld_strip.txt
        N           Min           Max        Median           Avg        Stddev
    x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
    +  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
    Difference at 95.0% confidence
            -0.144711 +/- 0.0336749
            -3.60967% +/- 0.839989%
            (Student's t, pooled s = 0.0358398)

This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols

It's possible it interacts with more features I didn't think of,
of course.

I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
  as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests

Implemenation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.

Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
  since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
  (`.no_dead_strip` in asm)

Fixes PR49276.

Differential Revision: https://reviews.llvm.org/D103324
2021-06-02 11:09:26 -04:00
Nico Weber 66a1ecd2cf [lld/mac] Implement -needed_framework, -needed_library, -needed-l
These allow overriding dead_strip_dylibs.

Differential Revision: https://reviews.llvm.org/D103499
2021-06-02 11:06:42 -04:00
Nico Weber e14fd7d879 [lld/mac] Don't strip explicit dylib also mentioned in LC_LINKER_OPTION
Noticed by Jez in D103499.

Differential Revision: https://reviews.llvm.org/D103521
2021-06-02 10:59:56 -04:00
Nico Weber 78ce89bb1e [lld/mac] Implement -reexport_framework, -reexport_library, -reexport-l
These are slightly easier-to-use versions of -sub_library and -sub_umbrella.

Differential Revision: https://reviews.llvm.org/D103497
2021-06-02 06:37:34 -04:00
Nico Weber 222a88a243 [lld/mac] Make -t work correctly with -flat_namespace
We used to not print dylibs referenced by other dylibs in `-t` mode. This
affected reexports, and with `-flat_namespace` also just dylibs loaded by
dylibs. Now we print them.

Fixes PR49514.

Differential Revision: https://reviews.llvm.org/D103428
2021-06-01 19:23:39 -04:00
Vy Nguyen 8f89c054af [lld-macho][nfc] Remove unnecessary use of Optional<T*>
In all of these cases, the functions could simply return a nullptr instead of {}.
There is no case where Optional<nullptr> has a special meaning.

Differential Revision: https://reviews.llvm.org/D103489
2021-06-01 18:35:31 -04:00
Nico Weber 2c1903412b [lld/mac] Implement removal of unused dylibs
This omits load commands for unreferenced dylibs if:
- the dylib was loaded implicitly,
- it is marked MH_DEAD_STRIPPABLE_DYLIB
- or -dead_strip_dylibs is passed

This matches ld64.

Currently, the "is dylib referenced" state is computed before dead code
stripping and is not updated after dead code stripping. This too matches ld64.
We should do better here.

With this, clang-format linked with lld (like with ld64) no longer has
libobjc.A.dylib in `otool -L` output. (It was implicitly loaded as a reexport
of CoreFoundation.framework, but it's not needed.)

Differential Revision: https://reviews.llvm.org/D103430
2021-06-01 16:06:30 -04:00
Nico Weber 0b39f055d8 [lld/mac] Don't write mtimes to N_OSO entries if ZERO_AR_DATE is set.
This is important for build determinism. This matches ld64.

Differential Revision: https://reviews.llvm.org/D103446
2021-06-01 15:29:38 -04:00
Mariusz Ceier 9383e9c1e6 Fix lld macho standalone build by including llvm/Config/llvm-config.h instead of llvm/Config/config.h
lld/MachO/Driver.cpp and lld/MachO/SyntheticSections.cpp include
llvm/Config/config.h which doesn't exist when building standalone lld.

This patch replaces llvm/Config/config.h include with llvm/Config/llvm-config.h
just like it is in lld/ELF/Driver.cpp and HAVE_LIBXAR with LLVM_HAVE_LIXAR and
moves LLVM_HAVE_LIBXAR from config.h to llvm-config.h

Also it adds LLVM_HAVE_LIBXAR to LLVMConfig.cmake and links liblldMachO2.so
with XAR_LIB if LLVM_HAVE_LIBXAR is set.

Differential Revision: https://reviews.llvm.org/D102084
2021-05-19 11:15:07 -04:00
Nico Weber 095c520fb4 [lld/mac] Propagate -(un)exported_symbol(s_list) to privateExtern in Driver
That way, it's done only once instead of every time shouldExportSymbol() is
called.

Possibly a bit faster:

    % ministat at_main at_symtodo
    x at_main
    + at_symtodo
        N           Min           Max        Median           Avg        Stddev
    x  30     3.9732189      4.114846      4.024621     4.0304692   0.037022865
    +  30       3.93766     4.0510042     3.9973931      3.991469   0.028472565
    Difference at 95.0% confidence
            -0.0390002 +/- 0.0170714
            -0.967635% +/- 0.423559%
            (Student's t, pooled s = 0.0330256)

In other runs with n=30 it makes no perf difference, so maybe it's just noise.
But being able to quickly and conveniently answer "is this symbol exported?"
is useful for fixing PR50373 and for implementing -dead_strip, so this seems
like a good change regardless.

No behavior change.

Differential Revision: https://reviews.llvm.org/D102661
2021-05-18 07:42:58 -04:00
Nico Weber 4a12248ee2 [lld/mac] Honor REFERENCED_DYAMICALLY, set it on __mh_execute_header
Has the effect that `__mh_execute_header` stays in the symbol table of
outputs even after running `strip` on the output. I don't know if that's
important for anything -- my motivation for the patch is just is to make
the output more similar to ld64.

(Corresponds to symbolTableInAndNeverStrip in ld64.)

Differential Revision: https://reviews.llvm.org/D102619
2021-05-17 14:22:12 -04:00
Greg McGary 93c8559baf [lld-macho] Implement branch-range-extension thunks
Extend the range of calls beyond an architecture's limited branch range by first calling a thunk, which loads the far address into a scratch register (x16 on ARM64) and branches through it.

Other ports (COFF, ELF) use multiple passes with successively-refined guesses regarding the expansion of text-space imposed by thunk-space overhead. This MachO algorithm places thunks during MergedOutputSection::finalize() in a single pass using exact thunk-space overheads. Thunks are kept in a separate vector to avoid the overhead of inserting into the `inputs` vector of `MergedOutputSection`.

FIXME:
* arm64-stubs.s test is broken
* add thunk tests
* Handle thunks to DylibSymbol in MergedOutputSection::finalize()

Differential Revision: https://reviews.llvm.org/D100818
2021-05-12 09:44:58 -07:00
Nico Weber 9ab49ae55d [lld/mac] Implement -sectalign
clang sometimes passes this flag along (see D68351), so we should implement it.

Differential Revision: https://reviews.llvm.org/D102247
2021-05-11 13:31:32 -04:00
Jez Ng b1c3c2e4fc [lld-macho] Fix order file arch filtering
We had a hardcoded check and a stale TODO, written back when we only had
support for one architecture.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D102154
2021-05-10 15:45:54 -04:00
Jez Ng 2516b0b526 [lld-macho] Treat undefined symbols uniformly
In particular, we should apply the `-undefined` behavior to all
such symbols, include those that are specified via the command line
(i.e.  `-e`, `-u`, and `-exported_symbol`). ld64 supports this too.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D102143
2021-05-10 15:45:54 -04:00
Jez Ng 0f8854f7f5 [lld-macho] Don't reference entry symbol for non-executables
This would cause us to pull in symbols (and code) that should
be unused.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D102137
2021-05-09 20:30:26 -04:00
Greg McGary 5be8502271 [lld-macho] Explicitly undefine literal exported symbols
Symbols explicitly exported via command-line options `--exported_symbol SYM` and `--exported_symbols_list FILE` must be defined. Before this fix, lazy symbols defined in archives would be left to languish. We now force them to be included in the linked output.

Differential Revision: https://reviews.llvm.org/D102100
2021-05-08 11:37:00 -07:00
Jez Ng 20f51ffe67 [lld-macho] Have --reproduce account for path rerooting
We need to account for path rerooting when generating the response
file. We could either reroot the paths before generating the file, or pass
through the original filenames and change just the syslibroot. I've opted for
the latter, in order that the reproduction run more closely mirrors the
original.

We must also be careful *not* to make an absolute path relative if it is
shadowed by a rerooted path. See repro6.tar in reroot-path.s for
details.

I've moved the call to `createResponseFile()` after the initialization of
`config->systemLibraryRoots`, since it now needs to know what those roots are.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D101224
2021-05-05 14:41:01 -04:00
Greg McGary 27b426b0c8 [lld-macho] Implement builtin section renaming
ld64 automatically renames many sections depending on output type and assorted flags. Here, we implement the most common configs. We can add more obscure flags and behaviors as needed.

Depends on D101393

Differential Revision: https://reviews.llvm.org/D101395
2021-05-03 21:26:51 -07:00
Jez Ng 001ba65375 [lld-macho] De-templatize mach_header operations
@thakis pointed out that `mach_header` and `mach_header_64`
actually have the same set of (used) fields, with the 64-bit version
having extra padding. So we can access the fields we need using the
single `mach_header` type instead of using templates to switch between
the two.

I also spotted a potential issue where hasObjCSection tries to parse a
file w/o checking if it does indeed match the target arch... As such,
I've added a quick magic number check to ensure we don't access invalid
memory during `findCommand()`.

Addresses PR50180.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D101724
2021-05-03 18:31:23 -04:00
Jez Ng 05c5363b39 [lld-macho] Parse & emit the N_ARM_THUMB_DEF symbol flag
Eventually we'll use this flag to properly handle bl/blx
opcodes.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D101558
2021-04-30 16:17:26 -04:00
Jez Ng 2d28100bf2 [lld-macho] Initial scaffolding for ARM32 support
This just parses the `-arch armv7` and emits the right header flags.
The rest will be slowly fleshed out in upcoming diffs.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D101557
2021-04-30 16:17:25 -04:00
Jez Ng 7e115da5df [lld-macho] Make everything PIE by default
Modern versions of macOS (>= 10.7) and in general all modern Mach-O
target archs want PIEs by default. ld64 defaults to PIE for iOS >= 4.3,
as well as for all versions of watchOS and simulators. Basically all the
platforms LLD is likely to target want PIE. So instead of cluttering LLD's
code with legacy version checks, I think it's simpler to just default to
PIE for everything.

Note that `-no_pie` still works, so users can still opt out of it.

Reviewed By: #lld-macho, thakis, MaskRay

Differential Revision: https://reviews.llvm.org/D101513
2021-04-29 15:11:23 -04:00
Greg McGary c2419aae76 [lld-macho] Add option --error-limit=N
Add option to limit (or remove limits) on the number of errors printed before exiting. This option exists in the other lld ports: COFF & ELF.

Differential Revision: https://reviews.llvm.org/D101274
2021-04-26 07:10:12 -07:00
Jez Ng ed4a4e3312 [lld-macho][nfc] Add accessors for commonly-used PlatformInfo fields
As discussed here: https://reviews.llvm.org/D100523#inline-951543

Reviewed By: #lld-macho, thakis, alexshap

Differential Revision: https://reviews.llvm.org/D100978
2021-04-21 15:43:56 -04:00
Jez Ng ab9c21bbab [lld-macho] Support LC_ENCRYPTION_INFO
This load command records a range spanning from the end of the load
commands to the end of the `__TEXT` segment. Presumably the kernel will encrypt
all this data.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D100973
2021-04-21 13:39:56 -04:00
Alexander Shaposhnikov 5c835e1ae5 [lld][MachO] Add support for LC_VERSION_MIN_* load commands
This diff adds initial support for the legacy LC_VERSION_MIN_* load commands.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D100523
2021-04-21 05:41:14 -07:00
Jez Ng ca6751043d [lld-macho] Initial groundwork for -bitcode_bundle
This diff creates an empty XAR file and copies it into
`__LLVM,__bundle`. Follow-up work will actually populate the contents of
that XAR.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D100650
2021-04-16 16:47:14 -04:00
Jez Ng 3bc88eb392 [lld-macho] Add support for arm64_32
From what I can tell, it's pretty similar to arm64. The two main differences
are:

1. No 64-bit relocations
2. Stub code writes to 32-bit registers instead of 64-bit

Plus of course the various on-disk structures like `segment_command` are using
the 32-bit instead of the 64-bit variants.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D99822
2021-04-15 21:16:33 -04:00
Jez Ng db7a413e51 [lld-macho] Re-root absolute input file paths if -syslibroot is specified
Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D100147
2021-04-15 21:16:33 -04:00
Jez Ng 8ca366935b Revert "[lld-macho] Add support for arm64_32" and other stacked diffs
This reverts commits:
* 8914902b01
* 35a745d814
* 682d1dfe09
2021-04-13 12:40:58 -04:00
Jez Ng 84c52f3a19 [lld-macho] arm64_32 executables are always PIE
This should fix the assert that's currently breaking the build.
2021-04-13 11:55:45 -04:00
Jez Ng 8914902b01 [lld-macho] Add support for arm64_32
From what I can tell, it's pretty similar to arm64. The two main differences
are:

1. No 64-bit relocations
2. Stub code writes to 32-bit registers instead of 64-bit

Plus of course the various on-disk structures like `segment_command` are using
the 32-bit instead of the 64-bit variants.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D99822
2021-04-13 10:43:28 -04:00
Jez Ng c2e76a9a6d [lld-macho][nfc] Use varargs form of hasArg() 2021-04-08 14:12:55 -04:00
Jez Ng c23b92acd0 [lld-macho] Support -add_ast_path
Swift builds seem to use it. All it requires is emitting the
corresponding paths as STABS.

Fixes llvm.org/PR49385.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D100076
2021-04-08 14:12:55 -04:00
Jez Ng 050a7a27ca [lld-macho] Support --thinlto-jobs
The test is loosely based off LLD-ELF's `thinlto.ll`. However, I
found that test questionable because the the -save_temps behavior it
checks for is identical regardless of whether we are running in single-
or multi-threaded mode. I tried writing a test based on `--time-trace`
but couldn't get it to run deterministically... so I've opted to just
skip checking that behavior for now.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D99356
2021-04-08 12:21:01 -04:00
Vy Nguyen db851dfb49 [lld-macho] Make time-trace* options more permissive.
If either `time-trace-granularity` or `time-trace-file` is specified, then don't make users specify `-time-trace`.
It seems silly that I have to type all three options, eg, `-time-trace -time-trace-file=- -time-trace-granularity=...`.

Differential Revision: https://reviews.llvm.org/D100011
2021-04-07 16:00:20 -04:00
Vy Nguyen ffc65824f0 [lld-macho][nfc] Minor refactoring + clang-tidy fixes
- use "empty()" instead of "size()"
- refactor the re-export code so it doesn't create a new vector every time.

Differential Revision: https://reviews.llvm.org/D100019
2021-04-07 13:55:52 -04:00
Jez Ng 174deb0539 [lld-macho] clang-format cleanup
find . -type f -name "*.cpp" -o -name "*.h" | xargs clang-format -i
2021-04-06 14:26:13 -04:00
Jez Ng e0df2b540a [lld-macho] Rename SubsectionMapping to SubsectionMap
We bikeshedded about it here: https://reviews.llvm.org/D98837#inline-931557

I initially suggested SubsectionMapping, but I thought the discussion
landed on doing `std::vector<SubsectionEntry>`. @alexshap went and did
both, but on hindsight I regret adding 3 more characters to an already
long name, and I think SubsectionEntry is descriptive enough...

This diff also renames `subsectionMap` to `subsecMap` for consistency
with other variable names in the codebase.
2021-04-06 14:26:13 -04:00
Cyndy Ishida 0116d04d04 [TextAPI] move source code files out of subdirectory, NFC
TextAPI/ELF has moved out into InterfaceStubs, so theres no longer a
need to seperate out TextAPI between formats.

Reviewed By: ributzka, int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D99811
2021-04-05 10:24:42 -07:00
Jez Ng 817d98d841 [lld-macho][nfc] Refactor in preparation for 32-bit support
The main challenge was handling the different on-disk structures (e.g.
`mach_header` vs `mach_header_64`). I tried to strike a balance between
sprinkling `target->wordSize == 8` checks everywhere (branchy = slow, and ugly)
and templatizing everything (causes code bloat, also ugly). I think I struck a
decent balance by judicious use of type erasure.

Note that LLD-ELF has a similar architecture, though it seems to use more templating.

Linking chromium_framework takes about the same time before and after this
change:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.52          4.67         4.595        4.5945   0.044423204
  +  20           4.5          4.71         4.575         4.582   0.056344803
  No difference proven at 95.0% confidence

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D99633
2021-04-02 18:46:39 -04:00
Alexander Shaposhnikov f6ad045366 [lld][MachO] Make emitEndFunStab independent from .subsections_via_symbols
This diff addresses FIXME in SyntheticSections.cpp and removes
the dependency of emitEndFunStab on .subsections_via_symbols.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D99054
2021-04-01 17:48:09 -07:00
Alexander Shaposhnikov f1e4e2fb20 [lld][MachO] Refactor handling of subsections
This diff is a preparation for fixing FunStabs (incorrect size calculation).
std::map<uint32_t, InputSection*> (SubsectionMap) is replaced with
a sorted vector + binary search. If .subsections_via_symbols is set
this vector will contain the list of subsections, otherwise,
the offsets will be used for calculating the symbols sizes.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D98837
2021-03-31 16:52:53 -07:00
Greg McGary 427d359721 [lld-macho][NFC] Drop unnecessary macho:: namespace prefix on unambiguous references to Symbol
Within `lld/macho/`, only `InputFiles.cpp` and `Symbols.h` require the `macho::` namespace qualifier to disambiguate references to `class Symbol`.

Add braces to outer `for` of a 5-level single-line `if`/`for` nest.

Differential Revision: https://reviews.llvm.org/D99555
2021-03-30 14:58:35 -07:00
Jez Ng a43f588e01 [lld-macho] Implement -segprot
Addresses llvm.org/PR49405.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D99389
2021-03-29 14:08:12 -04:00
Jez Ng 94e369400e [lld-macho] Fix parsing of --time-trace-{granularity,file}
Summary: We needed to use `Joined` instead of `Flag`. This wasn't caught
because the relevant test that was copied from LLD-ELF was still
invoking LLD-ELF instead of LLD-MachO...

Differential Revision: https://reviews.llvm.org/D99313
2021-03-26 18:14:10 -04:00
Jez Ng 45cdceb40c [lld-macho] Support -no_function_starts
Pretty simple code-wise. Also threw in some refactoring:

* Put the functionStartSection under Writer instead of InStruct, since
  it doesn't need to be accessed outside of Writer
* Adjusted the test to put all files under the temp dir instead of at
  the top-level
* Added some CHECK-LABELs to make it clearer where the function starts
  data is

Differential Revision: https://reviews.llvm.org/D99112
2021-03-26 18:14:10 -04:00
Jez Ng 0113cf00b6 [lld-macho] Add support for --threads
Code and test are largely identical to the LLD-ELF equivalents.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D99312
2021-03-25 14:51:31 -04:00
Jez Ng 4bcaafeb0e [lld-macho] Add more TimeTraceScopes
I added just enough to allow us to see a top-level breakdown of time taken. This
is the result of loading the time-trace output into `chrome:://tracing`:

ef5e8234f3/tracing.png

Reviewed By: oontvoo

Differential Revision: https://reviews.llvm.org/D99311
2021-03-25 14:51:31 -04:00
Jez Ng 53fd1ada76 [lld-macho] Fix typo in diagnostic message 2021-03-25 14:51:31 -04:00
Vy Nguyen f499b932bf Revert "Revert "Revert "Revert "Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)""""""
This reverts commit 4876ba5b2d.

Third-attemp relanding D98559, new change:
  - explicitly cast enum to underlying type to avoid ambiguity (workaround to clang's bug).
2021-03-23 14:51:05 -04:00
Mehdi Amini 4876ba5b2d Revert "Revert "Revert "Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)"""""
This reverts commit 3c21166a94.
The build is broken (clang-8 host compiler):

lld/MachO/DriverUtils.cpp:271:8: error: use of overloaded operator '<<' is ambiguous (with operand types 'llvm::raw_fd_ostream' and 'lld::macho::DependencyTracker::DepOpCode')
    os << opcode;
    ~~ ^  ~~~~~~
2021-03-23 00:19:12 +00:00
Vy Nguyen 3c21166a94 Revert "Revert "Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)""""
This reverts commit 9670d2e4af.

Second attemp to reland D98559. New changes:
 - inline functions removed from cpp file.
 - updated tests to use CHECK-DAG instead of CHECK-NEXT
 - fixed ambiguous "<<" operator by switching `char` to uint8_t
2021-03-22 19:34:51 -04:00
Vy Nguyen 9670d2e4af Revert "Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)"""
This reverts commit 5ad2c225f3.

bots still  unhappy - revertting again
2021-03-22 14:54:01 -04:00
Vy Nguyen 5ad2c225f3 Revert "Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)""
This reverts commit 2554b95db5.

Relanding [lld-macho] Implement -dependency_info (D98559) with changes:
 - inline functions removed from cpp file.
 - updated tests to not check libSystem.tbd with other input files (because of possible indeterministic ordering)
2021-03-22 14:41:57 -04:00
Nico Weber 2554b95db5 Revert "[lld-macho] Implement -dependency_info (partially - more opcodes needed)"
This reverts commit c53a1322f3.
Test only passes depending on build dir having a lexicographically later name
than the source dir, and doesn't link on mac/win. See
https://reviews.llvm.org/D98559#2640265 onward.
2021-03-21 16:35:38 -04:00
Vy Nguyen c53a1322f3 [lld-macho] Implement -dependency_info (partially - more opcodes needed)
Bug: https://bugs.llvm.org/show_bug.cgi?id=49278
The flag is not well documented, so this implementation is based on observed behaviour.

When specified, `-dependency_info <path>` produced a text file containing information pertaining to the current linkage, such as input files, output file, linker version, etc.

This file's layout is also not documented, but it seems to be a series of null ('\0') terminated strings in the form `<op code><path>`

`<op code>` could be:
   `0x00` : linker version
   `0x10` : input
   `0x11` : files not found(??)
   `0x40` : output

`<path>` : is the file path, except for the linker-version case.

(??) This part is a bit unclear. I think it means all the files the linker attempted to look at, but could not find.

Differential Revision: https://reviews.llvm.org/D98559
2021-03-21 14:35:46 -04:00
Vy Nguyen 66f340051a [lld-macho] Define __mh_*_header synthetic symbols.
Bug: https://bugs.llvm.org/show_bug.cgi?id=49290

    Differential Revision: https://reviews.llvm.org/D97007
2021-03-19 14:14:40 -04:00
caoming.roy ed8bff13dc [lld-macho] implement options -map
Implement command-line options -map

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D98323
2021-03-18 10:39:19 -04:00
Greg McGary db1e845a96 [lld-macho] Handle error cases properly for -exported_symbol(s_list)
This fixes defects in D98223 [lld-macho] implement options -(un)exported_symbol(s_list):
* disallow export of hidden symbols
* verify that whitelisted literal names are defined in the symbol table
* reflect export-status overrides in `nlist` attribute of `N_EXT` or `N_PEXT`

Thanks to @thakis for raising these issues

Differential Revision: https://reviews.llvm.org/D98381
2021-03-16 21:20:39 -07:00
Jez Ng 38a6374564 [lld-macho] Only codesign by default on arm64 macOS
instead of doing it on all arm64 platforms.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D98446
2021-03-12 17:26:27 -05:00
Jez Ng d8283d9ddc [lld-macho][nfc] Give every SyntheticSection a fake InputSection
Previously, it was difficult to write code that handled both synthetic
and regular sections generically. We solve this problem by creating a
fake InputSection at the start of every SyntheticSection.

This refactor allows us to handle DSOHandle like a regular Defined
symbol (since Defined symbols must be attached to an InputSection), and
paves the way for supporting `__mh_*header` symbols. Additionally, it
simplifies our binding/rebase code.

I did have to extend Defined a little -- it now has a `linkerInternal`
flag, to indicate that `___dso_handle` should not be in the final symbol
table.

I've also added some additional testing for `___dso_handle`.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D98545
2021-03-12 17:26:27 -05:00
Jez Ng 29bbbd06fe [lld-macho] Unbreak build breakage from rG1752f2850685 2021-03-11 13:35:13 -05:00
Jez Ng 1752f28506 [lld-macho][nfc] Remove `MachO::` prefix where possible
Previously, SyntheticSections.cpp did not have a top-level `using namespace
llvm::MachO` because it caused a naming conflict: `llvm::MachO::Symbol` would
collide with `lld::macho::Symbol`.

`MachO::Symbol` represents the symbols defined in InterfaceFiles (TBDs). By
moving the inclusion of InterfaceFile.h into our .cpp files, we can avoid this
name collision in other files where we are only dealing with LLD's own symbols.

Along the way, I removed all unnecessary "MachO::" prefixes in our code.

Cons of this approach: If TextAPI/MachO/Symbol.h gets included via some other
header file in the future, we could run into this collision again.

Alternative 1: Have either TextAPI/MachO or BinaryFormat/MachO.h use a different
namespace. Most of the benefit of `using namespace llvm::MachO` comes from being
able to use things in BinaryFormat/MachO.h conveniently; if TextAPI was under a
different (and fully-qualified) namespace like `llvm::tapi` that would solve our
problems. Cons: lots of files across llvm-project will need to be updated, and
folks who own the TextAPI code need to agree to the name change.

Alternative 2: Rename our Symbol to something like `LldSymbol`. I think this is
ugly.

Personally I think alternative #1 is ideal, but I'm not sure the effort to do it is
worthwhile, this diff's halfway solution seems good enough to me. Thoughts?

Reviewed By: #lld-macho, oontvoo, MaskRay

Differential Revision: https://reviews.llvm.org/D98149
2021-03-11 13:28:08 -05:00
Thorsten Schütt 50c1b21851 [lld-macho] minimal TimeTrace support
This is the minimal port from ELF. Any extension should easy from here

Test plan: ninja check-all-macho

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D98419
2021-03-11 15:30:45 +01:00
Greg McGary 98fe9e41f7 [lld-macho][NFC] add const to pointer/reference induction variables of range-based for loops
Pointer and reference induction variables of range-based for loops are often const, and code authors often lax about qualifying them.

Differential Revision: https://reviews.llvm.org/D98317
2021-03-10 12:07:31 -08:00
Nico Weber 6e92f468c8 [lld/mac] warn on -install_name without -dylib
The flag doesn't (and shouldn't) have an effect in that case.
ld64 doesn't warn on this, but it seems like a good thing to do.
If it causes problems in practice for some reason, we can revert it.

Also add a dedicated test for install_name.

Differential Revision: https://reviews.llvm.org/D98259
2021-03-10 09:05:44 -05:00
Nico Weber 1aafaaca67 [lld/mac] Implement support for -mark_dead_strippable_dylib
lld doesn't read MH_DEAD_STRIPPABLE_DYLIB to strip dead dylibs yet,
but now it can produce dylibs with it set.

While here, also switch an existing test that looks only at the main Mach-O
header from --all-headers to --private-header.

Differential Revision: https://reviews.llvm.org/D98262
2021-03-10 08:57:46 -05:00
Greg McGary 714ec86c02 [lld-macho][NFC] drop opt:: when already using llvm::opt
Top-level `using llvm::opt` has been present in `lld/MachO/Driver*.cpp` for some time, so remove lingering `opt::` prefixes.

Differential Revision: https://reviews.llvm.org/D98314
2021-03-09 22:08:32 -08:00
Greg McGary fdc0c21973 [lld-macho][NFC] when reasonable, replace auto keyword with type names
lld policy discourages `auto`. Replace it with a type name whenever reasonable. Retain `auto` to avoid ...
* redundancy, as for decls such as `auto *t = mumble_cast<TYPE *>` or similar that specifies the result type on the RHS
* verbosity, as for iterators
* gratuitous suffering, as for lambdas

Along the way, add `const` when appropriate.

Note: a future diff will ...
* add more `const` qualifiers
* remove `opt::` when we are already `using llvm::opt`

Differential Revision: https://reviews.llvm.org/D98313
2021-03-09 22:08:32 -08:00
Greg McGary 06c4aadeb6 [lld-macho] implement options -(un)exported_symbol(s_list)
Implement command-line options to alter a dylib's exported-symbols list:
* `-exported_symbol*` options override the default export list. The export list is compiled according to the command-line option(s) only.
* `-unexported_symbol*` options hide otherwise public symbols.
* `-*exported_symbol PATTERN` options specify a single literal or glob pattern.
* `-*exported_symbols_list FILE` options specify a file containing a series of lines containing symbol literals or glob patterns. Whitespace and `#`-prefix comments are stripped.

Note: This is a simple implementation of the primary use case. ld64 has much more complexity surrounding interactions with other options, many of which are obscure and undocumented. We will start simple and complexity as necessary.

Differential Revision: https://reviews.llvm.org/D98223
2021-03-09 18:43:39 -08:00
Nico Weber 90085d9286 [lld/mac] fix clang-format violation from 0e319bd0be 2021-03-05 12:12:56 -05:00
Nico Weber 0e319bd0be [lld/mac] ad-hoc sign dylibs and bundles on arm64 by default, support -(no_)adhoc_codesign flags
Previously, lld/mac only ad-hoc codesigned executables on arm64.

Matches ld64 behavior. Part of PR49443. Fixes 14 of 17 failures when running
check-llvm with lld as host linker on an M1 MBP.

Differential Revision: https://reviews.llvm.org/D97994
2021-03-05 09:12:34 -05:00
Jez Ng 55a32812fa [lld-macho] Filter TAPI re-exports by target
Previously, we were loading re-exports without checking whether
they were compatible with our target. Prior to {D97209}, it meant that
we were defining dylib symbols that were invalid -- usually a silent
failure unless our binary actually used them. D97209 exposed this as an
explicit error.

Along the way, I've extended our TAPI compatibility check to cover the
platform as well, instead of just checking the arch. To this end, I've
replaced MachO::Architecture with MachO::Target in our Config struct.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D97867
2021-03-04 14:36:47 -05:00
Jez Ng b63919e180 [lld-macho] Require -arch and -platform_version to always be specified
We previously defaulted to x86_64 and an unknown platform, which was fine when
we only supported one arch and did no platform checks, but that will no longer
be true going ahead. Therefore, we should require those flags to be specified
whenever the linker is invoked.

Note that LLD-ELF and ld64 both infer the arch from their input object files,
but the usefulness of that is questionable since clang will always specify these
flags, and most of the time `lld` will be invoked via clang.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D97799
2021-03-03 15:52:10 -05:00
Jez Ng 1168736c66 [lld-macho][nfc] Parse more options using getLastArg{Value}
The option-iterating loop should be reserved for options whose command-line
order is important. I think LLD-ELF follows a similar design.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D97797
2021-03-03 15:52:06 -05:00
Greg McGary 4af1522a85 [lld-macho] Rework length check when opening input files
This reverts diff D97610 (commit 0223ab035c) and adds a one-line fix to verify that a `MemoryBufferRef` has sufficient length before reading a 4-byte magic number.

Differential Revision: https://reviews.llvm.org/D97757
2021-03-02 13:00:57 -08:00
Nico Weber 8174f33dc9 [lld/mac] Add support for -flat_namespace
-flat_namespace makes lld emit binaries that use name lookup that's more in
line with other POSIX systems: Instead of looking up symbols as (dylib,name)
pairs by dyld, they're instead looked up just by name.

-flat_namespace has three effects:

1. MH_TWOLEVEL and MH_NNOUNDEFS are no longer set in the Mach-O header
2. All symbols use BIND_SPECIAL_DYLIB_FLAT_LOOKUP as ordinal
3. When a dylib is added to the link, its dependent dylibs are also added,
   so that lld can verify that no undefined symbols remain at the end of
   a link with -flat_namespace. These transitive dylibs are added for symbol
   resolution, but they are not emitted in LC_LOAD_COMMANDs.

-undefined with -flat_namespace still isn't implemented. Before this change,
it was impossible to hit that combination because -flat_namespace caused a
diagnostic. Now that it no longer does, emit a dedicated temporary diagnostic
when both flags are used.

Differential Revision: https://reviews.llvm.org/D97641
2021-03-01 15:25:10 -05:00
Nico Weber ab45289d2e [lld/mac] Make -v print version and search paths in additon to linking, not instead of linking
This matches ld64's behavior.

Differential Revision: https://reviews.llvm.org/D97718
2021-03-01 15:09:46 -05:00
Nico Weber bacacb9d5c [lld/mac] Prefix errors with "ld64.lld" instead of just "lld"
Matches the ELF and COFF ports, which use ld.lld and lld-link, respectively.

While here, also move up `cleanupCallback` to match ELF / COFF.

Differential Revision: https://reviews.llvm.org/D97715
2021-03-01 15:08:19 -05:00
Greg McGary 0223ab035c [lld-macho] check minimum header length when opening linkable input files
Bifurcate the `readFile()` API into ...
* `readRawFile()` which performs no checks, and
* `readLinkableFile()` which enforces minimum length of 20 bytes, same as ld64

There are no new tests because tweaks to existing tests are sufficient.

Differential Revision: https://reviews.llvm.org/D97610
2021-02-27 14:41:40 -08:00
Greg McGary 6f9dd843db [lld-macho] Implement options -rename_section -rename_segment
Implement command-line options to rename output sections & segments.

Differential Revision: https://reviews.llvm.org/D97600
2021-02-27 11:44:12 -08:00
Nico Weber cafb6cd10c [lld/mac] Add some support for dynamic lookup symbols, and implement -U
Dynamic lookup symbols are symbols that work like dynamic symbols
in ELF: They're not bound to a dylib like normal Mach-O twolevel lookup
symbols, but they live in a global pool and dyld resolves them against
exported symbols from all loaded dylibs.

This adds support for dynamical lookup symbols to lld/mac. They are
represented as DylibSymbols with file set to nullptr.

This also uses this support to implement the -U flag, which makes
a specific symbol that's undefined at the end of the link a
dynamic lookup symbol.

For -U, it'd be sufficient to just to a pass over remaining undefined symbols
at the end of the link and to replace them with dynamic lookup symbols then.
But I'd like to use this code to implement flat_namespace too, and that will
require real support for resolving dynamic lookup symbols in SymbolTable. So
this patch adds this now already.

While writing tests for this, I noticed that we didn't set N_WEAK_DEF in the
symbol table for DylibSymbols, so this fixes that too.

Differential Revision: https://reviews.llvm.org/D97521
2021-02-26 16:50:53 -05:00
Vy Nguyen 5a856f5b44 Reland [lld-macho]Implement bundle_loader
Reland 1a0afcf518
  https://reviews.llvm.org/D95913

New change: fix UB bug caused by copying empty path/name. (since the executable does not have a name)
2021-02-22 14:05:12 -05:00
Nico Weber 28d9953af9 [lld/mac] reject -undefined warning and -undefined suppress with -twolevel_namespace
See discussion on https://reviews.llvm.org/D93263

-flat_namespace isn't implemented yet, and neither is -undefined dynamic,
so this makes -undefined pretty pointless in lld/MachO for now. But once
we implement -flat_namespace (which we need to do anyways to get check-llvm
to pass with lld as host linker), the code's already there.

Follow-up to https://reviews.llvm.org/D93263#2491865

Differential Revision: https://reviews.llvm.org/D96963
2021-02-20 13:35:22 -05:00
Vitaly Buka c17547df44 Revert "Implement -bundle_loader"
D95913 passes null pointer into memcpy

This reverts commit 1a0afcf518.
2021-02-19 17:40:07 -08:00
Vy Nguyen 1a0afcf518 Implement -bundle_loader
Differential Revision: https://reviews.llvm.org/D95913

Usage: -bundle_loader <executable>
This option specifies the executable that will load the build output file being linked.
When building a bundle, users can use the --bundle_loader  to specify an executable
that contains symbols referenced, but not implemented in the bundle.
2021-02-18 16:11:37 -05:00
Nico Weber e0b8604e5d [lld/mac] Implement -u flag
Since we emit diagnostics for undefineds in Writer::scanRelocations()
and symbols referenced by -u flags aren't referenced by any relocations,
this needs some manual code (similar to the entry point).

Differential Revision: https://reviews.llvm.org/D94371
2021-02-09 08:23:06 -05:00
Greg McGary 87104faac4 [lld-macho] Add ARM64 target arch
This is an initial base commit for ARM64 target arch support. I don't represent that it complete or bug-free, but wish to put it out for review now that some basic things like branch target & load/store address relocs are working.

I can add more tests to this base commit, or add them in follow-up commits.

It is not entirely clear whether I use the "ARM64" (Apple) or "AArch64" (non-Apple) naming convention. Guidance is appreciated.

Differential Revision: https://reviews.llvm.org/D88629
2021-02-08 18:14:07 -07:00
Jez Ng f843bb82c0 [lld-macho] Force-loading should share code path with regular archive loads
This extends {D92539} to work even when we are loading archive
members via `-force_load`. I uncovered this issue while trying to
force-load archives containing bitcode -- we were segfaulting.

In addition to fixing the `-force_load` case, this diff also addresses
the behavior of `-ObjC` when LTO bitcode is involved -- we need to
force-load those archive members if they contain ObjC categories.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D95265
2021-02-03 13:43:47 -05:00
Jez Ng 163dcd8513 [lld-macho] Associate each Symbol with an InputFile
This makes our error messages more informative. But the bigger motivation is for
LTO symbol resolution, which will be in an upcoming diff. The changes in this
one are largely mechanical.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D94316
2021-02-03 13:43:47 -05:00
Greg McGary 3a9d2f1488 [lld-macho][NFC] refactor relocation handling
Add per-reloc-type attribute bits and migrate code from per-target file into target independent code, driven by reloc attributes.

Many cleanups

Differential Revision: https://reviews.llvm.org/D95121
2021-02-02 10:54:53 -07:00
Jez Ng 697f4e429b [lld-macho] Run ObjCContractPass during LTO
Run the ObjCARCContractPass during LTO. The legacy LTO backend (under
LTO/ThinLTOCodeGenerator.cpp) already does this; this diff just adds that
behavior to the new LTO backend. Without that pass, the objc.clang.arc.use
intrinsic will get passed to the instruction selector, which doesn't know how to
handle it.

In order to test both the new and old pass managers, I've also added support for
the `--[no-]lto-legacy-pass-manager` flags.

P.S. Not sure if the ordering of the pass within the pipeline matters...

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94547
2021-01-20 14:21:32 -05:00
Nico Weber 1198478c42 [lld/mac] remove redundant null check
This is already checked two lines up. No behavior change.
2021-01-09 21:18:32 -05:00
Jez Ng e98b441a09 [lld-macho] Remove unnecessary llvm:: namespace prefixes 2021-01-09 12:44:35 -05:00
Jez Ng 9d1140e18e [lld-macho] Simulator & DriverKit executables should always be PIE
We didn't have support for parsing DriverKit in our `-platform`
flag, so add that too. Also remove a bunch of unnecessary namespace
prefixes.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93741
2020-12-23 11:24:12 -05:00
Nico Weber 77fb45e59e [lld/mac] Add --version flag
It's an extension to ld64, but all the other ports have it, and
someone asked for it in PR43721.

While here, change the COFF help text to match the other ports.

Differential Revision: https://reviews.llvm.org/D93491
2020-12-22 22:06:39 -05:00
Nico Weber 13f439a187 [lld/mac] Implement support for private extern symbols
Private extern symbols are used for things scoped to the linkage unit.
They cause duplicate symbol errors (so they're in the symbol table,
unlike TU-scoped truly local symbols), but they don't make it into the
export trie. They are created e.g. by compiling with
-fvisibility=hidden.

If two weak symbols have differing privateness, the combined symbol is
non-private external. (Example: inline functions and some TUs that
include the header defining it were built with
-fvisibility-inlines-hidden and some weren't).

A weak private external symbol implicitly has its "weak" dropped and
behaves like a regular strong private external symbol: Weak is an export
trie concept, and private symbols are not in the export trie.

If a weak and a strong symbol have different privateness, the strong
symbol wins.

If two common symbols have differing privateness, the larger symbol
wins. If they have the same size, the privateness of the symbol seen
later during the link wins (!) -- this is a bit lame, but it matches
ld64 and this behavior takes 2 lines less to implement than the less
surprising "result is non-private external), so match ld64.
(Example: `int a` in two .c files, both built with -fcommon,
one built with -fvisibility=hidden and one without.)

This also makes `__dyld_private` a true TU-local symbol, matching ld64.
To make this work, make the `const char*` StringRefZ ctor to correctly
set `size` (without this, writing the string table crashed when calling
getName() on the __dyld_private symbol).

Mention in CommonSymbol's comment that common symbols are now disabled
by default in clang.

Mention in -keep_private_externs's HelpText that the flag only has an
effect with `-r` (which we don't implement yet -- so this patch here
doesn't regress any behavior around -r + -keep_private_externs)). ld64
doesn't explicitly document it, but the commit text of
http://reviews.llvm.org/rL216146 does, and ld64's
OutputFile::buildSymbolTable() checks `_options.outputKind() ==
Options::kObjectFile` before calling `_options.keepPrivateExterns()`
(the only reference to that function).

Fixes PR48536.

Differential Revision: https://reviews.llvm.org/D93609
2020-12-21 21:23:33 -05:00
Jez Ng 5f9896d3b2 [lld-macho] Support Obj-C symbols in order files
Obj-C symbols may have spaces and colons, which our previous order file
parser would be confused by. The order file format has made the very unfortunate
choice of using colons for its delimiters, which means that we have to use
heuristics to determine if a given colon is part of a symbol or not...

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93567
2020-12-20 13:49:18 -05:00
Greg McGary cc1cf6332a [lld-macho] Implement option: -undefined TREATMENT
TREATMENT can be `error`, `warning`, `suppress`, or `dynamic_lookup`
The `dymanic_lookup` remains unimplemented for now.

Differential Revision: https://reviews.llvm.org/D93263
2020-12-17 17:40:50 -08:00
Nico Weber f710bb7063 lld: Replace some lld::outs()s with message()
No behavior change.
2020-12-17 16:19:09 -05:00
Jez Ng 811444d7a1 [lld-macho] Add support for weak references
Weak references need not necessarily be satisfied at runtime (but they must
still be satisfied at link time). So symbol resolution still works as per usual,
but we now pass around a flag -- ultimately emitting it in the bind table -- to
indicate if a given dylib symbol is a weak reference.

ld64's behavior for symbols that have both weak and strong references is
a bit bizarre. For non-function symbols, it will emit a weak import. For
function symbols (those referenced by BRANCH relocs), it will emit a
regular import. I'm not sure what value there is in that behavior, and
since emulating it will make our implementation more complex, I've
decided to treat regular weakrefs like function symbol ones for now.

Fixes PR48511.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93369
2020-12-17 08:49:16 -05:00
dfukalov 9ed8e0caab [NFC] Reduce include files dependency and AA header cleanup (part 2).
Continuing work started in https://reviews.llvm.org/D92489:

Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h".

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92852
2020-12-17 14:04:48 +03:00
Nico Weber 09edd9df6e [mac/lld] simplify code using PackedVersion instead of VersionTuple
PackedVersion already does the correct range checks.

No behavior change.

Differential Revision: https://reviews.llvm.org/D93338
2020-12-15 19:23:07 -05:00
Jez Ng 8a5e068823 [lld-macho] Support -sub_umbrella
From what I can tell, it's essentially identical to
`-sub_library`, but it doesn't match files ending in ".dylib".

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93276
2020-12-15 15:58:26 -05:00
Jez Ng 544148ae70 [lld-macho] -weak_{library,framework} should always take priority
We were not setting forceWeakImport for file paths given by
`-weak_library` if we had already loaded the file. This diff fixes that
by having `loadDylib` return a cached DylibFile instance even if we have
already loaded that file.

We still avoid emitting multiple LC_LOAD_DYLIBs, but we achieve this by
making inputFiles a SetVector instead of relying on the `loadedDylibs`
cache.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D93255
2020-12-15 15:58:26 -05:00
Nico Weber d058b69b1c [lld/mac] implement -compatibility_version, -current_version
Differential Revision: https://reviews.llvm.org/D93237
2020-12-14 18:41:36 -05:00
Jez Ng 76c36c11a9 [lld-macho] Don't load dylibs more than once
Also remove `DylibFile::reexported` since it's unused.

Fixes llvm.org/PR48393.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D93001
2020-12-10 15:57:52 -08:00
Jez Ng 6a348f6158 [lld-macho] Implement `-no_implicit_dylibs`
Dylibs that are "public" -- i.e. top-level system libraries -- are considered
implicitly linked when another library re-exports them. That is, we should load
them & bind directly to their symbols instead of via their re-exporting
umbrella library. This diff implements that behavior by default, as well as an
opt-out flag.

In theory, this is just a performance optimization, but in practice it seems
that it's needed for correctness.

Fixes llvm.org/PR48395.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D93000
2020-12-10 15:57:52 -08:00
Jez Ng 74d799926e [lld-macho] Initialize AsmParsers earlier
We need to initialize AsmParsers before any calls to `addFile`, as
bitcode files may require them. Otherwise we trigger `Assertion T &&
T->hasMCAsmParser()' failed`.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D92913
2020-12-10 15:57:52 -08:00
Jez Ng 29d3b0e471 [lld-macho] Add support for -mcpu, -mattr, -code-model in LTO
`-mcpu` and `-code-model` tests were copied from similar ones in
LLD-ELF.

There doesn't seem to be an equivalent test for `-mattr` in LLD-ELF, so
I've verified our behavior by cribbing a test from
CodeGen/X86/recip-fastmath.ll.

Reviewed By: #lld-macho, compnerd, MaskRay

Differential Revision: https://reviews.llvm.org/D92912
2020-12-10 15:57:51 -08:00
Jez Ng 95831a56d0 [lld-macho] Implement -object_path_lto
This makes it possible for STABS entries to reference the debug info
contained in the LTO-compiled output.

I'm not sure how to test the file mtime within llvm-lit -- GNU and BSD
`stat` take different command-line arguments. I've omitted the check for
now.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D92537
2020-12-10 15:57:51 -08:00
Nico Weber 16b1f6e385 [mac/lld] Add support for the LC_LINKER_OPTION load command in o files
clang puts `-framework CoreFoundation` in this load command for files
that use @available / __builtin_available. Without support for this,
binaries that don't explicitly link to CoreFoundation fail to link.

Differential Revision: https://reviews.llvm.org/D92624
2020-12-04 08:46:53 -05:00
Nico Weber 7cb0a373d1 [mac/lld] Implement -t
Goes well with `-why_load` to get an idea of load order.

Differential Revision: https://reviews.llvm.org/D92583
2020-12-03 16:02:38 -05:00