llvm-project

Commit Graph

Author	SHA1	Message	Date
Jez Ng	1aa29dffce	[lld-macho] Support subtractor relocations that reference sections The minuend (but not the subtrahend) can reference a section. Note that we do not yet properly validate that the subtrahend isn't referencing a section; I've filed PR50034 to track that. I've also extended the reloc-subtractor.s test to reorder symbols, to make sure that the addends are being associated with the minuend (and not the subtrahend) relocation. Fixes PR49999. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D100804	2021-04-20 16:58:57 -04:00
Jez Ng	3142fc3b5b	[lld-macho] Have toString() emit full path to archive files It doesn't make sense to take just the base filename for archives when we emit the full path for object files. (LLD-ELF emits the full path too.) This will also make it easier to write a proper test for {D100147}. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D100357	2021-04-13 10:43:28 -04:00
Jez Ng	2461804b48	[lld-macho] Symbol::value should always be uint64_t D98837 migrated a bunch of `value`s to uint64_t, but missed these.	2021-04-06 17:54:11 -04:00
Jez Ng	ceec610754	[lld-macho] Fix & refactor symbol size calculations I noticed two problems with the previous implementation: * N_ALT_ENTRY symbols weren't being handled correctly -- they should determine the size of the previous symbol, even though they don't cause a new section to be created * The last symbol in a section had its size calculated wrongly; the first subsection's size was used instead of the last one I decided to take the opportunity to refactor things as well, mainly to realize my observation [here](https://reviews.llvm.org/D98837#inline-931511) that we could avoid doing a binary search to match symbols with subsections. I think the resulting code is a bit simpler too. N Min Max Median Avg Stddev x 20 4.31 4.43 4.37 4.3775 0.034162922 + 20 4.32 4.43 4.38 4.3755 0.02799906 No difference proven at 95.0% confidence Reviewed By: #lld-macho, alexshap Differential Revision: https://reviews.llvm.org/D99972	2021-04-06 15:10:01 -04:00
Jez Ng	e0df2b540a	[lld-macho] Rename SubsectionMapping to SubsectionMap We bikeshedded about it here: https://reviews.llvm.org/D98837#inline-931557 I initially suggested SubsectionMapping, but I thought the discussion landed on doing `std::vector<SubsectionEntry>`. @alexshap went and did both, but in hindsight I regret adding 3 more characters to an already long name, and I think SubsectionEntry is descriptive enough... This diff also renames `subsectionMap` to `subsecMap` for consistency with other variable names in the codebase.	2021-04-06 14:26:13 -04:00
Cyndy Ishida	0116d04d04	[TextAPI] move source code files out of subdirectory, NFC TextAPI/ELF has moved out into InterfaceStubs, so there's no longer a need to separate out TextAPI between formats. Reviewed By: ributzka, int3, #lld-macho Differential Revision: https://reviews.llvm.org/D99811	2021-04-05 10:24:42 -07:00
Jez Ng	817d98d841	[lld-macho][nfc] Refactor in preparation for 32-bit support The main challenge was handling the different on-disk structures (e.g. `mach_header` vs `mach_header_64`). I tried to strike a balance between sprinkling `target->wordSize == 8` checks everywhere (branchy = slow, and ugly) and templatizing everything (causes code bloat, also ugly). I think I struck a decent balance by judicious use of type erasure. Note that LLD-ELF has a similar architecture, though it seems to use more templating. Linking chromium_framework takes about the same time before and after this change: N Min Max Median Avg Stddev x 20 4.52 4.67 4.595 4.5945 0.044423204 + 20 4.5 4.71 4.575 4.582 0.056344803 No difference proven at 95.0% confidence Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D99633	2021-04-02 18:46:39 -04:00
Yang Fan	d441dee5c2	[lld][MachO] Fix -Wsign-compare warning (NFC) GCC warning: ``` /llvm-project/lld/MachO/InputFiles.cpp:484:24: warning: comparison of integer expressions of different signedness: ‘int64_t’ {aka ‘long int’} and ‘uint64_t’ {aka ‘long unsigned int’} [-Wsign-compare] 484 | return value < subsectionEntry.offset; | ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~ ```	2021-04-02 11:33:56 +08:00
Alexander Shaposhnikov	f6ad045366	[lld][MachO] Make emitEndFunStab independent from .subsections_via_symbols This diff addresses FIXME in SyntheticSections.cpp and removes the dependency of emitEndFunStab on .subsections_via_symbols. Test plan: make check-lld-macho Differential revision: https://reviews.llvm.org/D99054	2021-04-01 17:48:09 -07:00
Alexander Shaposhnikov	f1e4e2fb20	[lld][MachO] Refactor handling of subsections This diff is a preparation for fixing FunStabs (incorrect size calculation). std::map<uint32_t, InputSection*> (SubsectionMap) is replaced with a sorted vector + binary search. If .subsections_via_symbols is set this vector will contain the list of subsections, otherwise, the offsets will be used for calculating the symbols sizes. Test plan: make check-all Differential revision: https://reviews.llvm.org/D98837	2021-03-31 16:52:53 -07:00
Jez Ng	dc8bee9265	[lld-macho] Check address ranges when applying relocations This diff required fixing `getEmbeddedAddend` to apply sign extension to 32-bit values. We were previously passing around wrong 64-bit addend values that became "right" after being truncated back to 32-bit. I've also made `getEmbeddedAddend` return a signed int, which is similar to what LLD-ELF does for its `getImplicitAddend`. `reportRangeError`, `checkUInt`, and `checkInt` are counterparts of similar functions in LLD-ELF. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D98387	2021-03-12 17:26:27 -05:00
Jez Ng	a723db92d8	[lld-macho][nfc] Refactor subtractor reloc handling SUBTRACTOR relocations are always paired with UNSIGNED relocations to indicate a pair of symbols whose address difference we want. Functionally they are like a single relocation: only one pointer gets written / relocated. Previously, we would handle these pairs by skipping over the SUBTRACTOR relocation and writing the pointer when handling the UNSIGNED reloc. This diff reverses things, so we write while handling SUBTRACTORs and skip over the UNSIGNED relocs instead. Being able to distinguish between SUBTRACTOR and UNSIGNED relocs in the write phase (i.e. inside `relocateOne`) is useful for the upcoming range check diff: we want to check that SUBTRACTOR relocs write signed values, but UNSIGNED relocs (naturally) write unsigned values. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D98386	2021-03-11 13:28:13 -05:00
Jez Ng	e8a3058303	[lld-macho] Fix handling of X86_64_RELOC_SIGNED_{1,2,4} The previous implementation miscalculated the addend, resulting in an underflow. This meant that every SIGNED_N section relocation would be associated with the last subsection (since the addend would now be a huge number). We were "lucky" that this mistake was typically cancelled out -- 64-to-32-bit-truncation meant that the final value was correct, as long as subsections were not rearranged. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D98385	2021-03-11 13:28:11 -05:00
Jez Ng	5433a79176	[lld-macho][nfc] Create Relocations.{h,cpp} for relocation-specific code This more closely mirrors the structure of lld-ELF. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D98384	2021-03-11 13:28:09 -05:00
Jez Ng	1752f28506	[lld-macho][nfc] Remove `MachO::` prefix where possible Previously, SyntheticSections.cpp did not have a top-level `using namespace llvm::MachO` because it caused a naming conflict: `llvm::MachO::Symbol` would collide with `lld::macho::Symbol`. `MachO::Symbol` represents the symbols defined in InterfaceFiles (TBDs). By moving the inclusion of InterfaceFile.h into our .cpp files, we can avoid this name collision in other files where we are only dealing with LLD's own symbols. Along the way, I removed all unnecessary "MachO::" prefixes in our code. Cons of this approach: If TextAPI/MachO/Symbol.h gets included via some other header file in the future, we could run into this collision again. Alternative 1: Have either TextAPI/MachO or BinaryFormat/MachO.h use a different namespace. Most of the benefit of `using namespace llvm::MachO` comes from being able to use things in BinaryFormat/MachO.h conveniently; if TextAPI was under a different (and fully-qualified) namespace like `llvm::tapi` that would solve our problems. Cons: lots of files across llvm-project will need to be updated, and folks who own the TextAPI code need to agree to the name change. Alternative 2: Rename our Symbol to something like `LldSymbol`. I think this is ugly. Personally I think alternative #1 is ideal, but I'm not sure the effort to do it is worthwhile, this diff's halfway solution seems good enough to me. Thoughts? Reviewed By: #lld-macho, oontvoo, MaskRay Differential Revision: https://reviews.llvm.org/D98149	2021-03-11 13:28:08 -05:00
Greg McGary	fdc0c21973	[lld-macho][NFC] when reasonable, replace auto keyword with type names lld policy discourages `auto`. Replace it with a type name whenever reasonable. Retain `auto` to avoid ... * redundancy, as for decls such as `auto t = mumble_cast<TYPE >` or similar that specifies the result type on the RHS * verbosity, as for iterators * gratuitous suffering, as for lambdas Along the way, add `const` when appropriate. Note: a future diff will ... * add more `const` qualifiers * remove `opt::` when we are already `using llvm::opt` Differential Revision: https://reviews.llvm.org/D98313	2021-03-09 22:08:32 -08:00
Vy Nguyen	70c0dbf151	[lld-macho][NFC] Replace config param with a global in hasCompatVersion() helper. Differential Revision: https://reviews.llvm.org/D98115	2021-03-06 11:32:51 -05:00
Vy Nguyen	fc5d804ddb	[lld-macho] Check platform and version when constructing ObjFile Differential Revision: https://reviews.llvm.org/D97979	2021-03-05 17:34:38 -05:00
Jez Ng	3c19b4f34d	[lld-macho] Skip over symbols in un-parsed debug info sections clang appears to emit symbols in `__debug_aranges`, at least for arm64... in the examples I've seen, it doesn't seem like those symbols are referenced outside of `__DWARF`, so I think they're safe to ignore. But hopefully @clayborg can confirm. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D98073	2021-03-05 17:24:32 -05:00
Jez Ng	fc011b5eb1	[lld-macho] Replace debug-info-related assert with FIXME We'll need to properly handle object files with multiple source inputs eventually, but remove the assert for now so we can successfully emit binaries for testing. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D98067	2021-03-05 17:24:31 -05:00
Jez Ng	0d4dadc64c	[lld-macho] Include install name in error messages for dylibs from TBDs Since multiple dylibs can be defined in one TBD, this is necessary to avoid confusion. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D97905	2021-03-04 14:36:49 -05:00
Jez Ng	55a32812fa	[lld-macho] Filter TAPI re-exports by target Previously, we were loading re-exports without checking whether they were compatible with our target. Prior to {D97209}, it meant that we were defining dylib symbols that were invalid -- usually a silent failure unless our binary actually used them. D97209 exposed this as an explicit error. Along the way, I've extended our TAPI compatibility check to cover the platform as well, instead of just checking the arch. To this end, I've replaced MachO::Architecture with MachO::Target in our Config struct. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D97867	2021-03-04 14:36:47 -05:00
Jez Ng	8601be809e	[lld-macho] Fix & fold reexport-nested-libs test into stub-link.s The reexport-nested-libs test added in D97438 was a bit wonky. First, it was linking against libReexportSystem.tbd which targets the iOS simulator, and which in turn attempted to re-export the iOS simulator's libSystem. However, due to the way `-syslibroot` works, it was actually re-exporting the macOS libSystem. As a result, the test was not actually able to resolve the symbols in the desired libSystem. I'm guessing that @oontvoo was confused by this and therefore included those symbols in libReexportSystem.tbd itself. But this means that the test wasn't actually testing the resolution of re-exported symbols (though it did at least verify that the re-exported libraries could be located). After some consideration, I figured that stub-link.s could be extended to cover what reexport-nested-libs.s was attempting to do. The test targets macOS, so we only have one `-syslibroot` and no chance of confusion. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D97866	2021-03-04 14:36:46 -05:00
Jez Ng	5d9aafc09a	[lld-macho] Bind re-exported symbols directly to implicitly-linked umbrellas Suppose we are linking against libFoo, which re-exports the implicitly-bound libSystem, which in turn re-exports some non-explicitly-bound library like `/usr/lib/system/libsystem_c.dylib`. Then any bindings we have to a symbol in libsystem_c should use libSystem (and not libFoo) as the umbrella library. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D97865	2021-03-04 14:36:44 -05:00
Jez Ng	b63919e180	[lld-macho] Require -arch and -platform_version to always be specified We previously defaulted to x86_64 and an unknown platform, which was fine when we only supported one arch and did no platform checks, but that will no longer be true going ahead. Therefore, we should require those flags to be specified whenever the linker is invoked. Note that LLD-ELF and ld64 both infer the arch from their input object files, but the usefulness of that is questionable since clang will always specify these flags, and most of the time `lld` will be invoked via clang. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D97799	2021-03-03 15:52:10 -05:00
Greg McGary	4af1522a85	[lld-macho] Rework length check when opening input files This reverts diff D97610 (commit `0223ab035c`) and adds a one-line fix to verify that a `MemoryBufferRef` has sufficient length before reading a 4-byte magic number. Differential Revision: https://reviews.llvm.org/D97757	2021-03-02 13:00:57 -08:00
Vy Nguyen	9a2e2de15f	[lld-macho] Change loadReexport to handle the case where a TAPI re-exports to reference documents nested within other TBD. Currently, it was deliberately implemented to not handle this case, but as it has turned out, we need this feature. The concrete use case is `System/Library/Frameworks/Cocoa.framework/Versions/A/Cocoa` reexports /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit , which then reexports /System/Library/PrivateFrameworks/UIFoundation.framework/Versions/A/UIFoundation The current implementation uses a global currentTopLevelTapi, which is not reset until it finishes loading the whole tree. This is a problem because if the top-level is set to Cocoa, then when we get to UIFoundation, it will try to find UIFoundation in the current top level, which is Cocoa and will not find it. The right thing should be: - When loading a library from a TBD file, re-exports need to be looked up in the auxiliary documents within the same TBD. - When loading from an actual dylib, no additional TBD documents need to be examined. - In no case does a re-export mentioned in one TBD file need to be looked up in a document in an auxiliary document from a different TBD file Differential Revision: https://reviews.llvm.org/D97438	2021-03-02 12:14:31 -05:00
Nico Weber	8174f33dc9	[lld/mac] Add support for -flat_namespace -flat_namespace makes lld emit binaries that use name lookup that's more in line with other POSIX systems: Instead of looking up symbols as (dylib,name) pairs by dyld, they're instead looked up just by name. -flat_namespace has three effects: 1. MH_TWOLEVEL and MH_NOUNDEFS are no longer set in the Mach-O header 2. All symbols use BIND_SPECIAL_DYLIB_FLAT_LOOKUP as ordinal 3. When a dylib is added to the link, its dependent dylibs are also added, so that lld can verify that no undefined symbols remain at the end of a link with -flat_namespace. These transitive dylibs are added for symbol resolution, but they are not emitted in LC_LOAD_COMMANDs. -undefined with -flat_namespace still isn't implemented. Before this change, it was impossible to hit that combination because -flat_namespace caused a diagnostic. Now that it no longer does, emit a dedicated temporary diagnostic when both flags are used. Differential Revision: https://reviews.llvm.org/D97641	2021-03-01 15:25:10 -05:00
Jez Ng	f083f652c3	[lld-macho][nfc] Remove TODO regarding addends There was initially some concern around the correct handling of pcrel section relocations with r_length != 2. But it looks like there are no such relocations in practice -- x86_64's pcrel section relocs all have r_length == 2, and ARM64 doesn't even have pcrel section relocs. So we can replace the TODO with an assert. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D97576	2021-03-01 12:30:08 -05:00
Greg McGary	0223ab035c	[lld-macho] check minimum header length when opening linkable input files Bifurcate the `readFile()` API into ... * `readRawFile()` which performs no checks, and * `readLinkableFile()` which enforces minimum length of 20 bytes, same as ld64 There are no new tests because tweaks to existing tests are sufficient. Differential Revision: https://reviews.llvm.org/D97610	2021-02-27 14:41:40 -08:00
Jez Ng	82b3da6f6f	[lld-macho] Extract embedded addends for arm64 UNSIGNED relocations On arm64, UNSIGNED relocs are the only ones that use embedded addends instead of the ADDEND relocation. Also ensure that the addend works when UNSIGNED is part of a SUBTRACTOR pair. Reviewed By: #lld-macho, alexshap Differential Revision: https://reviews.llvm.org/D97105	2021-02-27 12:31:34 -05:00
Jez Ng	541390131e	[lld-macho] Don't emit rebase opcodes for subtractor minuend relocs Also add a few asserts to verify that we are indeed handling an UNSIGNED relocation as the minuend. I haven't made it an actual user-facing error since I don't think llvm-mc is capable of generating SUBTRACTOR relocations without an associated UNSIGNED. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D97103	2021-02-27 12:31:34 -05:00
Jez Ng	84579fc24f	[lld-macho] Basic support for linkage and visibility attributes in LTO When parsing bitcode, convert LTO Symbols to LLD Symbols in order to perform resolution. The "winning" symbol will then be marked as Prevailing at LTO compilation time. This is similar to what the other LLD ports do. This change allows us to handle `linkonce` symbols correctly, and to deal with duplicate bitcode symbols gracefully. Previously, both scenarios would result in an assertion failure inside the LTO code, complaining that multiple Prevailing definitions are not allowed. While at it, I also added basic logic around visibility. We don't do anything useful with it yet, but we do check that its value is valid. LLD-ELF appears to use it only to set FinalDefinitionInLinkageUnit for LTO, which I think is just a performance optimization. From my local experimentation, the linker itself doesn't seem to do anything differently when encountering linkonce / linkonce_odr / weak / weak_odr. So I've only written a test for one of them. LLD-ELF has more, but they seem to mostly be testing the intermediate bitcode output of their LTO backend...? I'm far from an expert here though, so I might very well be missing things. Reviewed By: #lld-macho, MaskRay, smeenai Differential Revision: https://reviews.llvm.org/D94342	2021-02-25 13:27:40 -05:00
Jez Ng	4752cdc9a2	[lld-macho] Check for arch compatibility when loading ObjFiles and TBDs The silent failures had confused me a few times. I haven't added a similar check for platform yet as we don't yet have logic to infer the platform automatically, and so adding that check would require updating dozens of test files. Reviewed By: #lld-macho, thakis, alexshap Differential Revision: https://reviews.llvm.org/D97209	2021-02-23 22:02:38 -05:00
Jez Ng	5e851733c5	[lld-macho] Fix semantics & add tests for ARM64 GOT/TLV relocs I've adjusted the RelocAttrBits to better fit the semantics of the relocations. In particular: 1. _UNSIGNED relocations are no longer marked with the `TLV` bit, even though they can occur within TLV sections. Instead the `TLV` bit is reserved for relocations that can reference thread-local symbols, and _UNSIGNED relocations have their own `UNSIGNED` bit. The previous implementation caused TLV and regular UNSIGNED semantics to be conflated, resulting in rebase opcodes being incorrectly emitted for TLV relocations. 2. I've added a new `POINTER` bit to denote non-relaxable GOT relocations. This distinction isn't important on x86 -- the GOT relocations there are either relaxable or non-relaxable loads -- but arm64 has `GOT_LOAD_PAGE21` which loads the page that the referent symbol is in (regardless of whether the symbol ends up in the GOT). This relocation must reference a GOT symbol (so must have the `GOT` bit set) but isn't itself relaxable (so must not have the `LOAD` bit). The `POINTER` bit is used for relocations that must reference a GOT slot. 3. A similar situation occurs for TLV relocations. 4. ld64 supports both a pcrel and an absolute version of ARM64_RELOC_POINTER_TO_GOT. But the semantics of the absolute version are pretty weird -- it results in the value of the GOT slot being written, rather than the address. (That means a reference to a dynamically-bound slot will result in zeroes being written.) The programs I've tried linking don't use this form of the relocation, so I've dropped our partial support for it by removing the relevant RelocAttrBits. Reviewed By: alexshap Differential Revision: https://reviews.llvm.org/D97031	2021-02-23 22:02:38 -05:00
Jez Ng	e5d780e049	[lld-macho] Use full input file name in invalid relocation error message Just something I noticed while debugging arm relocations... Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D97078	2021-02-23 22:02:38 -05:00
Vy Nguyen	5a856f5b44	Reland [lld-macho]Implement bundle_loader Reland `1a0afcf518` https://reviews.llvm.org/D95913 New change: fix UB bug caused by copying empty path/name. (since the executable does not have a name)	2021-02-22 14:05:12 -05:00
Vitaly Buka	c17547df44	Revert "Implement -bundle_loader" D95913 passes null pointer into memcpy This reverts commit `1a0afcf518`.	2021-02-19 17:40:07 -08:00
Vy Nguyen	1a0afcf518	Implement -bundle_loader Differential Revision: https://reviews.llvm.org/D95913 Usage: -bundle_loader <executable> This option specifies the executable that will load the build output file being linked. When building a bundle, users can use -bundle_loader to specify an executable that contains symbols referenced, but not implemented, in the bundle.	2021-02-18 16:11:37 -05:00
Greg McGary	87104faac4	[lld-macho] Add ARM64 target arch This is an initial base commit for ARM64 target arch support. I don't represent that it is complete or bug-free, but wish to put it out for review now that some basic things like branch target & load/store address relocs are working. I can add more tests to this base commit, or add them in follow-up commits. It is not entirely clear whether I should use the "ARM64" (Apple) or "AArch64" (non-Apple) naming convention. Guidance is appreciated. Differential Revision: https://reviews.llvm.org/D88629	2021-02-08 18:14:07 -07:00
Jez Ng	f843bb82c0	[lld-macho] Force-loading should share code path with regular archive loads This extends {D92539} to work even when we are loading archive members via `-force_load`. I uncovered this issue while trying to force-load archives containing bitcode -- we were segfaulting. In addition to fixing the `-force_load` case, this diff also addresses the behavior of `-ObjC` when LTO bitcode is involved -- we need to force-load those archive members if they contain ObjC categories. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D95265	2021-02-03 13:43:47 -05:00
Jez Ng	163dcd8513	[lld-macho] Associate each Symbol with an InputFile This makes our error messages more informative. But the bigger motivation is for LTO symbol resolution, which will be in an upcoming diff. The changes in this one are largely mechanical. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D94316	2021-02-03 13:43:47 -05:00
Greg McGary	3a9d2f1488	[lld-macho][NFC] refactor relocation handling Add per-reloc-type attribute bits and migrate code from per-target file into target independent code, driven by reloc attributes. Many cleanups Differential Revision: https://reviews.llvm.org/D95121	2021-02-02 10:54:53 -07:00
Jez Ng	e98b441a09	[lld-macho] Remove unnecessary llvm:: namespace prefixes	2021-01-09 12:44:35 -05:00
Nico Weber	13f439a187	[lld/mac] Implement support for private extern symbols Private extern symbols are used for things scoped to the linkage unit. They cause duplicate symbol errors (so they're in the symbol table, unlike TU-scoped truly local symbols), but they don't make it into the export trie. They are created e.g. by compiling with -fvisibility=hidden. If two weak symbols have differing privateness, the combined symbol is non-private external. (Example: inline functions and some TUs that include the header defining it were built with -fvisibility-inlines-hidden and some weren't). A weak private external symbol implicitly has its "weak" dropped and behaves like a regular strong private external symbol: Weak is an export trie concept, and private symbols are not in the export trie. If a weak and a strong symbol have different privateness, the strong symbol wins. If two common symbols have differing privateness, the larger symbol wins. If they have the same size, the privateness of the symbol seen later during the link wins (!) -- this is a bit lame, but it matches ld64 and this behavior takes 2 lines less to implement than the less surprising "result is non-private external), so match ld64. (Example: `int a` in two .c files, both built with -fcommon, one built with -fvisibility=hidden and one without.) This also makes `__dyld_private` a true TU-local symbol, matching ld64. To make this work, make the `const char*` StringRefZ ctor to correctly set `size` (without this, writing the string table crashed when calling getName() on the __dyld_private symbol). Mention in CommonSymbol's comment that common symbols are now disabled by default in clang. Mention in -keep_private_externs's HelpText that the flag only has an effect with `-r` (which we don't implement yet -- so this patch here doesn't regress any behavior around -r + -keep_private_externs)). ld64 doesn't explicitly document it, but the commit text of http://reviews.llvm.org/rL216146 does, and ld64's OutputFile::buildSymbolTable() checks `_options.outputKind() == Options::kObjectFile` before calling `_options.keepPrivateExterns()` (the only reference to that function). Fixes PR48536. Differential Revision: https://reviews.llvm.org/D93609	2020-12-21 21:23:33 -05:00
Greg McGary	d4ec3346b1	[lld-macho][nfc] Refactor to accommodate paired relocs This is a refactor to pave the way for supporting paired-ADDEND for ARM64. The only paired reloc type for X86_64 is SUBTRACTOR. In a later diff, I will add SUBTRACTOR for both X86_64 and ARM64. * s/`getImplicitAddend`/`getAddend`/ because it handles all forms of addend: implicit, explicit, paired. * add predicate `bool isPairedReloc()` * check range of `relInfo.r_symbolnum` is internal, unrelated to user-input, so use `assert()`, not `error()` * minor cleanups & rearrangements in `InputFile::parseRelocations()` Differential Revision: https://reviews.llvm.org/D90614	2020-12-17 20:21:41 -08:00
Jez Ng	4c8276cdc1	[lld-macho] Use LC_LOAD_WEAK_DYLIB for dylibs with only weakrefs Note that dylibs without any refs will still be loaded in the usual (strong) fashion. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D93435	2020-12-17 08:49:17 -05:00
Jez Ng	811444d7a1	[lld-macho] Add support for weak references Weak references need not necessarily be satisfied at runtime (but they must still be satisfied at link time). So symbol resolution still works as per usual, but we now pass around a flag -- ultimately emitting it in the bind table -- to indicate if a given dylib symbol is a weak reference. ld64's behavior for symbols that have both weak and strong references is a bit bizarre. For non-function symbols, it will emit a weak import. For function symbols (those referenced by BRANCH relocs), it will emit a regular import. I'm not sure what value there is in that behavior, and since emulating it will make our implementation more complex, I've decided to treat regular weakrefs like function symbol ones for now. Fixes PR48511. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D93369	2020-12-17 08:49:16 -05:00
Nico Weber	ec88746a05	[lld/mac] fill in current and compatibility version for LC_LOAD_(WEAK_)DYLIB Not sure if anything actually depends on this, but it makes `otool -L` output look nicer. Differential Revision: https://reviews.llvm.org/D93332	2020-12-15 19:34:59 -05:00
Jez Ng	3aa8e071dd	[lld-macho] Add implicit dylib support for frameworks {D93000} applied to frameworks. Partial fix for PR48511. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D93277	2020-12-15 15:58:26 -05:00
Jez Ng	544148ae70	[lld-macho] -weak_{library,framework} should always take priority We were not setting forceWeakImport for file paths given by `-weak_library` if we had already loaded the file. This diff fixes that by having `loadDylib` return a cached DylibFile instance even if we have already loaded that file. We still avoid emitting multiple LC_LOAD_DYLIBs, but we achieve this by making inputFiles a SetVector instead of relying on the `loadedDylibs` cache. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D93255	2020-12-15 15:58:26 -05:00
Jez Ng	76c36c11a9	[lld-macho] Don't load dylibs more than once Also remove `DylibFile::reexported` since it's unused. Fixes llvm.org/PR48393. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D93001	2020-12-10 15:57:52 -08:00
Jez Ng	6a348f6158	[lld-macho] Implement `-no_implicit_dylibs` Dylibs that are "public" -- i.e. top-level system libraries -- are considered implicitly linked when another library re-exports them. That is, we should load them & bind directly to their symbols instead of via their re-exporting umbrella library. This diff implements that behavior by default, as well as an opt-out flag. In theory, this is just a performance optimization, but in practice it seems that it's needed for correctness. Fixes llvm.org/PR48395. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D93000	2020-12-10 15:57:52 -08:00
Jez Ng	863f7a745e	[lld-macho] Don't attempt to emit rebase opcodes for debug sections This was causing a crash as we were attempting to look up the nonexistent parent OutputSection of the debug sections. We didn't detect it earlier because there was no test for PIEs with debug info (PIEs require us to emit rebases for X86_64_RELOC_UNSIGNED). This diff filters out the debug sections while loading the ObjFiles. In addition to fixing the above problem, it also lets us avoid doing redundant work -- we no longer parse / apply relocations / attempt to emit dyld opcodes for these sections that we don't emit. Fixes llvm.org/PR48392. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D92904	2020-12-10 15:57:51 -08:00
Jez Ng	78976bf3da	[lld-macho] Support parsing of bitcode within archives Also error out if we find anything other than an object or bitcode file in the archive. Note that we were previously inserting the symbols and sections of the unpacked ObjFile into the containing ArchiveFile. This was actually unnecessary -- we can just insert the ObjectFile (or BitcodeFile) into the `inputFiles` vector. This is the approach taken by LLD-ELF. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D92539	2020-12-08 10:34:32 -08:00
Jez Ng	7b007ac080	[lld-macho][nfc] Move some methods from InputFile to ObjFile Additionally: 1. Move the helper functions in InputSection.h below the definition of `InputSection`, so the important stuff is on top 2. Remove unnecessary `explicit` Reviewed By: #lld-macho, compnerd Differential Revision: https://reviews.llvm.org/D92453	2020-12-08 10:34:32 -08:00
Nico Weber	16b1f6e385	[mac/lld] Add support for the LC_LINKER_OPTION load command in o files clang puts `-framework CoreFoundation` in this load command for files that use @available / __builtin_available. Without support for this, binaries that don't explicitly link to CoreFoundation fail to link. Differential Revision: https://reviews.llvm.org/D92624	2020-12-04 08:46:53 -05:00
Nico Weber	7cb0a373d1	[mac/lld] Implement -t Goes well with `-why_load` to get an idea of load order. Differential Revision: https://reviews.llvm.org/D92583	2020-12-03 16:02:38 -05:00
Nico Weber	3422f3cc6e	Reland "[mac/lld] Implement -why_load". The problem was that `sym` was replaced in the call to make<ObjFile>, and referring to it after that read memory that now stored a different kind of symbol (a Defined instead of a LazySymbol). Since this happens only once per archive, just copy the symbol to the stack before make<ObjFile> and read the copy instead. Originally reviewed at https://reviews.llvm.org/D92496	2020-12-03 08:35:12 -05:00
Nico Weber	ea0029f55d	Revert "[mac/lld] Implement -why_load" This reverts commit `542d3b609d`. Seems to break check-lld. Reverting while I take a look.	2020-12-02 18:57:46 -05:00
Nico Weber	542d3b609d	[mac/lld] Implement -why_load This is useful for debugging why lld loads .o files it shouldn't load. It's also useful for users of lld -- I've used ld64's version of this a few times. Differential Revision: https://reviews.llvm.org/D92496	2020-12-02 18:33:12 -05:00
Nico Weber	ca634393fc	[mac/lld] Make --reproduce work with thin archives See http://reviews.llvm.org/rL268229 and http://reviews.llvm.org/rL313832 which did the same for the ELF port. Differential Revision: https://reviews.llvm.org/D92456	2020-12-02 09:48:31 -05:00
Nico Weber	b2f00f24a3	[mac/lld] Include archive name in diagnostics Also, for .o files, include full path as given on link command line. Before: lld: error: undefined symbol [...], referenced from sandbox_logging.o After: lld: error: undefined symbol [...], referenced from libseatbelt.a(sandbox_logging.o) Move archiveName up to InputFile so we can consistently use toString() to print InputFiles in diags, and pass it to the ObjFile ctor. This matches the ELF and COFF ports. Differential Revision: https://reviews.llvm.org/D92437	2020-12-01 23:00:25 -05:00
Nico Weber	07ab597bb0	[lld/mac] Fix issues around thin archives - most importantly, fix a use-after-free when using thin archives, by putting the archive unique_ptr to the arena allocator. This ports D65565 to MachO - correctly demangle symbol names from archives in diagnostics - add a test for thin archives -- it finds this UaF, but only when running it under asan (it also finds the demangling fix) - make forceLoadArchive() use addFile() with a bool to have the archive loading code in fewer places. no behavior change; matches COFF port a bit better Differential Revision: https://reviews.llvm.org/D92360	2020-12-01 18:48:29 -05:00
Jez Ng	78f6498cdc	[lld-macho] Flesh out STABS implementation This addresses a lot of the comments in {D89257}. Ideally it'd have been done in the same diff, but the commits in between make that difficult. This diff implements: * N_GSYM and N_STSYM, the STABS for global and static symbols * Has the STABS reflect the section IDs of their referent symbols * Ensures we don't fail when encountering absolute symbols or files with no debug info * Sorts STABS symbols by file to minimize the number of N_OSO entries Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D92366	2020-12-01 15:05:21 -08:00
Jez Ng	b768d57b36	[lld-macho] Add archive name and file modtime to STABS output We should also set the modtime when running LTO. That will be done in a future diff, together with support for the `-object_path_lto` flag. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D91318	2020-12-01 15:05:21 -08:00
Jez Ng	3fcb0eeb15	[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one for the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sections as a proxy, but that is only accurate if `.subsections_via_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_via_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257	2020-12-01 15:05:20 -08:00
Nico Weber	83e60f5a55	[lld/mac] Add --reproduce option This adds support for ld.lld's --reproduce / lld-link's /reproduce: flag to the MachO port. This flag can be added to a link command to make the link write a tar file containing all inputs to the link and a response file containing the link command. This can be used to reproduce the link on another machine, which is useful for sharing bug report inputs or performance test loads. Since the linker is usually called through the clang driver and adding linker flags can be a bit cumbersome, setting the env var `LLD_REPRODUCE=foo.tar` triggers the feature as well. The file response.txt in the archive can be used with `ld64.lld.darwinnew $(cat response.txt)` as long as the contents are smaller than the command-line limit, or with `ld64.lld.darwinnew @response.txt` once D92149 is in. The support in this patch is sufficient to create a tar file for Chromium's base_unittests that can link after unpacking on a different machine. Differential Revision: https://reviews.llvm.org/D92274	2020-11-30 08:40:21 -05:00
Nico Weber	c519bc7e16	lld/MachO: Move MachOOptTable to DriverUtils.cpp, remove DriverUtils.h This makes lld/MachO look more like lld/COFF and lld/ELF, as discussed in D91640.	2020-11-18 12:33:15 -05:00
Jez Ng	21f831134c	[lld-macho] Add very basic support for LTO Just enough to consume some bitcode files and link them. There's more to be done around the symbol resolution API and the LTO config, but I don't yet understand what all the various LTO settings do... Reviewed By: #lld-macho, compnerd, smeenai, MaskRay Differential Revision: https://reviews.llvm.org/D90663	2020-11-10 12:19:28 -08:00
Jez Ng	62a3f0c984	[lld-macho] Support absolute symbols They operate like Defined symbols but with no associated InputSection. Note that `ld64` seems to treat the weak definition flag like a no-op for absolute symbols, so I have replicated that behavior. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D87909	2020-09-25 11:28:35 -07:00
Jez Ng	c32e69b2ce	[lld-macho][re-land] Initial support for common symbols Fix earlier build break via a static_cast. This reverts commit `8112d494d3`. Differential Revision: https://reviews.llvm.org/D86909	2020-09-24 15:00:20 -07:00
Muhammad Omair Javaid	8112d494d3	Revert "[lld-macho] Initial support for common symbols" This reverts commit `63ace77962`. Breaks LLDB Arm build: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4409	2020-09-24 12:26:40 +05:00
Jez Ng	9c70281497	[lld-macho][NFC] Make `!= nullptr` implicit	2020-09-23 20:09:49 -07:00
Jez Ng	63ace77962	[lld-macho] Initial support for common symbols On Unix, it is traditionally allowed to write variable definitions without initialization expressions (such as "int foo;") to header files. These are called tentative definitions. The compiler creates common symbols when it sees tentative definitions. When linking the final binary, if there are remaining common symbols after name resolution is complete, the linker converts them to regular defined symbols in a `__common` section. This diff implements most of that functionality, though we do not yet handle the case where there are both common and non-common definitions of the same symbol. Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D86909	2020-09-23 19:26:40 -07:00
Greg McGary	1a3ef0417c	[lld-macho] In the context of relocs, s/target/referent/ for sections & symbols The word "target" is overloaded, so lighten its load by using another word to denote the symbol or section to which a reloc points. While more stilted than "target", "referent" is rather less pompous than "designatum" or "denotatum". :P Along the way, make a few neighboring variable names more descriptive. Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D87584	2020-09-22 20:31:01 -07:00
Jez Ng	cbe27316ef	[lld-macho] Implement weak bindings for GOT/TLV Previously, we were only emitting regular bindings to weak dynamic symbols; this diff adds support for the weak bindings too, which can overwrite the regular bindings at runtime. We also treat weak defined global symbols similarly -- since they can also be interposed at runtime, they need to be treated as potentially dynamic symbols. Note that weak bindings differ from regular bindings in that they do not specify the dylib to do the lookup in (i.e. weak symbol lookup happens in a flat namespace.) Differential Revision: https://reviews.llvm.org/D86572	2020-08-26 19:21:09 -07:00
Jez Ng	cf918c809b	[lld-macho] Implement -ObjC It's roughly like -force_load with some filtering. Differential Revision: https://reviews.llvm.org/D86181	2020-08-26 19:20:55 -07:00
Jez Ng	7394460d87	[lld-macho] Handle TAPI and regular re-exports uniformly The re-exports list in a TAPI document can either refer to other inlined TAPI documents, or to on-disk files (which may themselves be TBD or regular files.) Similarly, the re-exports of a regular dylib can refer to a TBD file. Differential Revision: https://reviews.llvm.org/D85404	2020-08-26 19:20:48 -07:00
Jez Ng	6336c042f6	[lld-macho] Make it possible to re-export .tbd files Two things needed fixing for that to work: 1. getName() no longer returns null for DylibFiles constructed from TAPIs 2. markSubLibrary() now accepts .tbd as a possible extension Differential Revision: https://reviews.llvm.org/D86180	2020-08-26 19:20:42 -07:00
Jez Ng	7e6d675499	[lld-macho] Avoid unnecessary shared_ptr in DylibFile ctor DylibFile doesn't store a pointer to its InterfaceFile parameter, so there's no need to use a shared_ptr. Reviewed By: #lld-macho, compnerd Differential Revision: https://reviews.llvm.org/D85402	2020-08-12 19:50:12 -07:00
Jez Ng	a499898e86	[lld-macho] Generate ObjC symbols from .tbd files I followed similar logic in TapiFile.cpp. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D85255	2020-08-12 19:50:10 -07:00
Jez Ng	3c9100fb78	[lld-macho] Support dynamic linking of thread-locals References to symbols in dylibs work very similarly regardless of whether the symbol is a TLV. The main difference is that we have a separate `__thread_ptrs` section that acts as the GOT for these thread-locals. We can identify thread-locals in dylibs by a flag in their export trie entries, and we cross-check it with the relocations that refer to them to ensure that we are not using a GOT relocation to reference a thread-local (or vice versa). Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D85081	2020-08-12 19:50:09 -07:00
Greg McGary	a379f2c251	[lld-macho] Handle command-line option -sectcreate SEG SECT FILE Handle command-line option `-sectcreate SEG SECT FILE`, which inputs a binary blob from `FILE` into `SEG,SECT` Reviewed By: int3 Differential Revision: https://reviews.llvm.org/D85501	2020-08-10 18:47:13 -07:00
Jez Ng	31d5885842	[lld-macho] Partial support for weak definitions This diff adds support for weak definitions, though it doesn't handle weak symbols in dylibs quite correctly -- we need to emit binding opcodes for them in the weak binding section rather than the lazy binding section. What is covered in this diff: 1. Reading the weak flag from symbol table / export trie, and writing it to the export trie 2. Refining the symbol table's rules for choosing one symbol definition over another. Wrote a few dozen test cases to make sure we were matching ld64's behavior. We can now link basic C++ programs. Reviewed By: #lld-macho, compnerd Differential Revision: https://reviews.llvm.org/D83532	2020-07-24 15:55:25 -07:00
Jez Ng	74871cdad7	[lld-macho] Ensure __bss sections we output have file offset of zero Summary: llvm-mc emits `__bss` sections with an offset of zero, but we weren't expecting that in our input, so we were copying non-zero data from the start of the file and putting it in `__bss`, with obviously undesirable runtime results. (It appears that the kernel will copy those nonzero bytes as long as the offset is nonzero, regardless of whether S_ZERO_FILL is set.) I debated on whether to make a special ZeroFillSection -- separate from a regular InputSection -- but it seemed like too much work for now. But I'm happy to refactor if anyone feels strongly about having it as a separate class. Depends on D80857. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Reviewed By: smeenai Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80859	2020-06-17 20:41:28 -07:00
Jez Ng	fcde378dcb	[lld-macho] Support non-pcrel section relocs Summary: Depends on D80854. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80855	2020-06-17 20:41:28 -07:00
Saleem Abdulrasool	73312976ad	lld: remove old test support path This removes the stub library that lld injected to satisfy the dependency on the libSystem. Now with TBD support, we can provide the stub library to permit the tests to function properly as they would on a real system. Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D81418	2020-06-16 15:57:58 -07:00
Jez Ng	53c796b948	[lld-macho] Properly handle & validate relocation r_length Summary: We should be reading / writing our addends / relocated addresses based on r_length, and not just based on the type of the relocation. But since only some r_length values are valid for a given reloc type, I've also added some validation. ld64 has code to allow for r_length = 0 in X86_64_RELOC_BRANCH relocs, but I'm not sure how to create such a relocation... Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D80854	2020-06-14 16:35:23 -07:00
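The `r_length` handling described in the commit above can be sketched briefly. In Mach-O relocation entries, `r_length` encodes the operand width as a power of two (0 = 1 byte, 1 = 2 bytes, 2 = 4 bytes, 3 = 8 bytes). The Python below is an illustrative sketch only, not lld's code: the function name `read_addend` and the table of permitted lengths per relocation type are assumptions made up for this example.

```python
# Sketch: read an addend whose width is given by r_length (width = 1 << r_length).
# The table of permitted r_length values per relocation type is an assumption
# for illustration; it is not copied from lld.
import struct

# Signed little-endian formats for 1-, 2-, 4- and 8-byte operands.
FORMATS = {0: "<b", 1: "<h", 2: "<i", 3: "<q"}

ALLOWED_LENGTHS = {  # hypothetical validation table
    "X86_64_RELOC_UNSIGNED": {2, 3},
    "X86_64_RELOC_BRANCH": {2},
}


def read_addend(buf, offset, reloc_type, r_length):
    """Validate r_length for the relocation type, then read the addend."""
    if r_length not in ALLOWED_LENGTHS.get(reloc_type, set()):
        raise ValueError(f"invalid r_length {r_length} for {reloc_type}")
    return struct.unpack_from(FORMATS[r_length], buf, offset)[0]


# A 4-byte addend followed by an 8-byte addend, packed without padding.
data = struct.pack("<iq", -4, 0x1000)
branch_addend = read_addend(data, 0, "X86_64_RELOC_BRANCH", 2)
abs_addend = read_addend(data, 4, "X86_64_RELOC_UNSIGNED", 3)
```

Keeping validation next to the read means an unexpected width is rejected before any bytes are interpreted, which is the point the commit message makes about not relying on the relocation type alone.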
Saleem Abdulrasool	6fe27b5fed	lld: initial pass at supporting TBD Add support to lld to use Text Based API stubs for linking. This support is incomplete: it does not filter out platforms. It also does not account for architecture specific API handling and potentially does not correctly handle trees of re-exports with inlined libraries being treated as direct children of the top level library.	2020-06-08 18:15:40 -07:00
Jez Ng	1e1a3f67ee	[lld-macho] Ensure reads from nlist_64 structs are aligned when necessary My test refactoring in D80217 seems to have caused yaml2obj to emit unaligned nlist_64 structs, causing ASAN'd lld to be unhappy. I don't think this is an issue with yaml2obj though -- llvm-mc also seems to emit unaligned nlist_64s. This diff makes lld able to safely do aligned reads under ASAN builds while hopefully creating no overhead for regular builds on architectures that support unaligned reads. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D80414	2020-06-02 13:19:38 -07:00
Jez Ng	6f6d91867d	[lld-macho] Add some relocation validation logic I considered making a `Target::validate()` method, but I wasn't sure how I felt about the overhead of doing yet another switch-dispatch on the relocation type, so I put the validation in `relocateOne` instead... might be a bit of a micro-optimization, but `relocateOne` does assume certain things about the relocations it gets, and this error handling makes that explicit, so it's not a totally unreasonable code organization. Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D80049	2020-06-02 13:19:38 -07:00
Jez Ng	ce0d8beebc	[lld-macho][re-land] Support X86_64_RELOC_UNSIGNED This reverts commit `db8559eee4`.	2020-05-19 12:31:55 -07:00
Jez Ng	4eb6f4854e	[lld-macho][re-land] Support .subsections_via_symbols Summary: This diff restores and builds upon @pcc and @ruiu's initial work on subsections. The .subsections_via_symbols directive indicates we can split each section along symbol boundaries, unless those symbols have been marked with `.alt_entry`. We exercise this functionality in our tests by using order files that rearrange those symbols. Depends on D79668. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Reviewed By: smeenai Subscribers: thakis, llvm-commits, pcc, ruiu Tags: #llvm Differential Revision: https://reviews.llvm.org/D79926	2020-05-19 12:31:54 -07:00
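The splitting rule described in the commit above (one subsection per symbol boundary, except at `.alt_entry` symbols) can be illustrated with a small sketch. This is a simplified Python model, not lld's implementation; the function name `split_subsections` and the tuple layout for symbols are hypothetical.

```python
# Sketch of splitting a section into subsections along symbol boundaries.
# Symbols flagged as alt_entry do not start a new subsection.
# Function and field names are hypothetical, not lld's.

def split_subsections(section_size, symbols):
    """symbols: iterable of (name, offset, is_alt_entry) tuples.

    Returns a list of (start, end) half-open byte ranges.
    """
    # Offsets at which a new subsection begins; offset 0 always starts one.
    starts = {0}
    for _name, offset, is_alt_entry in symbols:
        if not is_alt_entry:
            starts.add(offset)
    ordered = sorted(starts)
    # Each subsection ends where the next begins; the last ends with the section.
    ends = ordered[1:] + [section_size]
    return list(zip(ordered, ends))


symbols = [("_foo", 0, False), ("_foo_alt", 4, True), ("_bar", 8, False)]
ranges = split_subsections(16, symbols)
```

Here `_foo_alt` stays inside the subsection that begins at `_foo`, so reordering subsections can never separate the two.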
Jez Ng	70fbbcdd34	Revert "[lld-macho] Support .subsections_via_symbols" Due to build breakage mentioned in https://reviews.llvm.org/D79926. This reverts commit `e270b2f172`.	2020-05-19 08:30:02 -07:00
Jez Ng	db8559eee4	Revert "[lld-macho] Support X86_64_RELOC_UNSIGNED" This reverts commit `1f820e3559`.	2020-05-19 08:30:02 -07:00
Jez Ng	1f820e3559	[lld-macho] Support X86_64_RELOC_UNSIGNED Note that it's only used for non-pc-relative contexts. Reviewed By: MaskRay, smeenai Differential Revision: https://reviews.llvm.org/D80048	2020-05-19 07:46:57 -07:00
Jez Ng	e270b2f172	[lld-macho] Support .subsections_via_symbols This diff restores and builds upon @pcc and @ruiu's initial work on subsections. The .subsections_via_symbols directive indicates we can split each section along symbol boundaries, unless those symbols have been marked with `.alt_entry`. We exercise this functionality in our tests by using order files that rearrange those symbols. Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D79926	2020-05-19 07:46:57 -07:00
Jez Ng	55e9eb416e	[lld-macho] Support -order_file The order file indicates how input sections should be sorted within each output section, based on the symbols contained within those sections. This diff sets the stage for implementing and testing `.subsections_via_symbols`, where we will break up InputSections by each symbol and sort them more granularly. Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D79668	2020-05-19 07:46:57 -07:00
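As a rough illustration of the order-file idea in the commit above, the sketch below sorts input sections by the earliest position of any symbol they contain in the order file. It is a toy Python model under stated assumptions: the `(section_name, symbols)` layout and the function name `sort_by_order_file` are invented for the example, and sections whose symbols are not listed are assumed to keep their original relative order after all listed ones.

```python
# Sketch of order-file sorting: sections whose symbols appear earlier in the
# order file are placed first; unlisted sections keep their relative order.
# The data layout is hypothetical, not lld's.

def sort_by_order_file(sections, order):
    """sections: list of (section_name, [symbol names]); order: list of symbols."""
    rank = {sym: i for i, sym in enumerate(order)}
    unlisted = len(order)  # ranks after every listed symbol

    def priority(section):
        _name, syms = section
        return min((rank.get(s, unlisted) for s in syms), default=unlisted)

    # sorted() is stable, so sections with equal priority keep input order.
    return sorted(sections, key=priority)


sections = [("a", ["_main"]), ("b", ["_helper"]), ("c", ["_init"])]
result = sort_by_order_file(sections, ["_init", "_main"])
```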
Kellie Medlin	2b920ae78c	[lld] Add archive file support to Mach-O backend With this change, basic archive files can be linked together. Input section discovery has been refactored into a function since archive files lazily resolve their symbols / the object files containing those symbols. Reviewed By: int3, smeenai Differential Revision: https://reviews.llvm.org/D78342	2020-05-14 12:58:35 -07:00
Jez Ng	87b6fd3e02	[lld-macho] Add support for creating and reading reexported dylibs This unblocks the linking of real programs, since many core system functions are only available as sub-libraries of libSystem. Differential Revision: https://reviews.llvm.org/D79228	2020-05-12 07:52:03 -07:00
Jez Ng	198b0c57df	[lld-macho] Support pc-relative section relocations Summary: So far we've only supported symbol relocations. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79211	2020-05-09 20:56:23 -07:00
Jez Ng	7bbdbacd00	[lld-macho] Use export trie instead of symtab when linking against dylibs Summary: This allows us to link against stripped dylibs. Moreover, it's simply more correct: The symbol table includes symbols that the dylib uses but doesn't export. This temporarily regresses our ability to do lazy symbol binding because dyld_stub_binder isn't in libSystem's export trie. Rather, it is in one of the sub-libraries libSystem re-exports. (This doesn't affect our tests since we are mocking out dyld_stub_binder there.) A follow-up diff will address this by adding support for sub-libraries. Depends on D79114. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79226	2020-05-09 20:56:22 -07:00
Jez Ng	b3e2fc931d	[lld-macho] Support calls to functions in dylibs Summary: This diff implements lazy symbol binding -- very similar to the PLT mechanism in ELF. ELF's .plt section is broken up into two sections in Mach-O: StubsSection and StubHelperSection. Calls to functions in dylibs will end up calling into StubsSection, which contains indirect jumps to addresses stored in the LazyPointerSection (the counterpart to ELF's .plt.got). Initially, the LazyPointerSection contains addresses that point into one of the entry points in the middle of the StubHelperSection. The code in StubHelperSection will push on the stack an offset into the LazyBindingSection. The push is followed by a jump to the beginning of the StubHelperSection (similar to PLT0), which then calls into dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this call looks it up in the GOT. The stub binder will look up the bind opcodes in the LazyBindingSection at the given offset. The bind opcodes will tell the binder to update the address in the LazyPointerSection to point to the symbol, so that subsequent calls don't have to redo the symbol resolution. The binder will then jump to the resolved symbol. Depends on D78269. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78270	2020-05-09 20:56:22 -07:00
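The lazy-binding flow described in the commit above can be modelled in a few lines. The following is a toy Python simulation, not lld or dyld code: the class name `LazyBinder`, the dictionary standing in for a dylib's exports, and the bind counter are all made up for illustration. It shows only the key property: the first call through a stub goes via the binder, which patches the lazy pointer so that later calls jump straight to the resolved function.

```python
# Toy model of Mach-O lazy binding (illustrative only; not lld's implementation).
# All names here (LazyBinder, dylib_exports, ...) are hypothetical.

class LazyBinder:
    def __init__(self, dylib_exports):
        self.dylib_exports = dylib_exports  # stands in for the dylib's symbols
        self.bind_count = 0                 # how often the "stub binder" ran
        # Lazy pointer table: each slot initially routes to a stub-helper
        # entry that knows which symbol to bind.
        self.lazy_pointers = {
            name: self._make_stub_helper(name) for name in dylib_exports
        }

    def _make_stub_helper(self, name):
        def stub_helper(*args):
            # Plays the role of dyld_stub_binder: resolve the symbol, patch
            # the lazy pointer, then jump to the resolved function.
            self.bind_count += 1
            target = self.dylib_exports[name]
            self.lazy_pointers[name] = target
            return target(*args)
        return stub_helper

    def call_stub(self, name, *args):
        # The stub is just an indirect jump through the lazy pointer.
        return self.lazy_pointers[name](*args)


binder = LazyBinder({"puts": lambda s: len(s)})
first = binder.call_stub("puts", "hello")    # goes through the binder
second = binder.call_stub("puts", "world!")  # direct call, no rebinding
```

After both calls, `binder.bind_count` is 1, mirroring the commit's point that subsequent calls do not redo symbol resolution.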
Kellie Medlin	6cb073133c	[lld] Merge Mach-O input sections Summary: Similar to other formats, input sections in the MachO implementation are now grouped under output sections. This is primarily a refactor, although there's some new logic (like resolving the output section's flags based on its inputs). Differential Revision: https://reviews.llvm.org/D77893	2020-05-01 16:57:18 -07:00
Jez Ng	918948db4d	[lld-macho] Support reading of universal binaries Differential Revision: https://reviews.llvm.org/D77006	2020-04-29 15:44:44 -07:00
Jez Ng	060efd24c7	[lld-macho] Add basic support for linking against dylibs This diff implements: * dylib loading (much of which is being restored from @pcc and @ruiu's original work) * The GOT_LOAD relocation, which allows us to load non-lazy dylib symbols * Basic bind opcode emission, which tells `dyld` how to populate the GOT Differential Revision: https://reviews.llvm.org/D76252	2020-04-21 13:43:19 -07:00
Fangrui Song	6acd300375	Reland D75382 "[lld] Initial commit for new Mach-O backend" With a fix for http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636 Also trims some unneeded dependencies.	2020-04-02 12:03:43 -07:00
Oliver Stannard	af39151f3c	Revert "[lld] Initial commit for new Mach-O backend" This is causing buildbot failures on 32-bit hosts, for example: http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636 This reverts commit `03f43b3aca`.	2020-04-02 13:23:30 +01:00
Jez Ng	03f43b3aca	[lld] Initial commit for new Mach-O backend Summary: This is the first commit for the new Mach-O backend, designed to roughly follow the architecture of the existing ELF and COFF backends, and building off work that @ruiu and @pcc did in a branch a while back. Note that this is a very stripped-down commit with the bare minimum of functionality for ease of review. We'll be following up with more diffs soon. Currently, we're able to generate a simple "Hello World!" executable that runs on OS X Catalina (and possibly on earlier OS X versions; I haven't tested them). (This executable can be obtained by compiling `test/MachO/relocations.s`.) We're mocking out a few load commands to achieve this -- for example, we can't load dynamic libraries, but Catalina requires binaries to be linked against `dyld`, so we hardcode the emission of a `LC_LOAD_DYLIB` command. Other mocked out load commands include LC_SYMTAB and LC_DYSYMTAB. Differential Revision: https://reviews.llvm.org/D75382	2020-03-31 11:58:47 -07:00


260 Commits