r363016 let lld-link and llvm-lib share the /machine: parsing code.
This lets llvm-cvtres share it as well.
Making llvm-cvtres depend on llvm-lib seemed a bit strange (it doesn't
need llvm-lib's dependencies on BinaryFormat and BitReader) and I
couldn't find a good place to put this code. Since it's just a few
lines, put it in lib/Object for now.
Differential Revision: https://reviews.llvm.org/D63120
llvm-svn: 363144
Users are exepcted to pass all .res files to the linker, which then
merges all the resource in all .res files into a tree structure and then
converts the final tree structure to a .obj file with .rsrc$01 and
.rsrc$02 sections and then links that.
If the user instead passes several .obj files containing such resources,
the correct thing to do would be to have custom code to merge the trees
in the resource sections instead of doing normal section merging -- but
link.exe rejects if multiple resource obj files are passed in with
LNK4078, so let lld-link do that too instead of silently writing broken
.rsrc sections in that case.
The only real way to run into this is if users manually convert .res
files to .obj files by running cvtres and then handing the resulting
.obj files to lld-link instead, which in practice likely never happens.
(lld-link is slightly stricter than link.exe now: If link.exe is passed
one .obj file created by cvtres, and a .res file, for some reason it
just emits a warning instead of an error and outputs strange looking
data. lld-link now errors out on mixed input like this.)
One way users could accidentally run into this is the following
scenario: If a .res file is passed to lib.exe, then lib.exe calls
cvtres.exe on the .res file before putting it in the output .lib.
(llvm-lib currently doesn't do this.)
link.exe's /wholearchive seems to only add obj files referenced from the
static library index, but lld-link current really adds all files in the
archive. So if lld-link /wholearchive is used with .lib files produced
by lib.exe and .res files were among the files handed to lib.exe, we
previously silently produced invalid output, but now we error out.
link.exe's /wholearchive semantics on the other hand mean that it
wouldn't load the resource object files from the .lib file at all.
Since this scenario is probably still an unlikely corner case,
the difference in behavior here seems fine -- and lld-link might have to
change to use link.exe's /wholearchive semantics in the future anyways.
Vaguely related to PR42180.
Differential Revision: https://reviews.llvm.org/D63109
llvm-svn: 363078
And share some code with lld-link.
While here, also add a FIXME about PR42180 and merge r360150 to llvm-lib.
Differential Revision: https://reviews.llvm.org/D63021
llvm-svn: 363016
Patch by Erik McClure with a modification to rebase to HEAD.
When calling `COFF::link()` with `CanExitEarly` set to `false`, the
function needs to clean up several global variable caches to ensure that
the next invocation of the function starts from a clean slate. The
`MergeChunk::Instances` cache is missing from this cleanup code, and as
a result will create nondeterministic memory access errors and sometimes
infinite loops due to invalid memory being referenced on the next call
to `COFF::link()`.
This fix simply clears `MergeChunk::Instances` before exiting the function.
An additional review of the COFF library was made to try and find any
other missing global caches, but I was unable to find any other than
`MergeChunk`. Someone more familiar with the global variables might want
to do their own check.
This fix was made to support inNative
<https://github.com/innative-sdk/innative>'s `.wast` script compiler,
which must build multiple incremental builds. It relies on statically
linking LLD because the entire compiler must be a single statically
embeddable library, thus preventing it from being able to call LLD as a
new process.
Differential Revision: https://reviews.llvm.org/D63042
llvm-svn: 362930
This works like /include, but is not fatal if the requested symbol
wasn't found. This allows implementing the GNU ld option -u.
Differential Revision: https://reviews.llvm.org/D62976
llvm-svn: 362881
Summary:
When handling exports from the command line or from .def files, the
linker does a "fuzzy" string lookup to allow finding mangled symbols.
However, when the symbol is re-exported under a new name, the linker has
to transfer the decorations from the exported symbol over to the new
name. This is implemented by taking the mangled symbol that was found in
the object and replacing the original symbol name with the export name.
Before this patch, LLD implemented the fuzzy search by adding an
undefined symbol with the unmangled name, and then during symbol
resolution, checking if similar mangled symbols had been added after the
last round of symbol resolution. If so, LLD makes the original symbol a
weak alias of the mangled symbol. Later, to get the original symbol
name, LLD would look through the weak alias and forward it on to the
import library writer, which copies the symbol decorations. This
approach doesn't work when bar is itself a weak alias, as is the case in
asan. It's especially bad when the aliasee of bar contains the string
"bar", consider "bar_default". In this case, we would end up exporting
the symbol "foo_default" when we should've exported just "foo".
To fix this, don't look through weak aliases to find the mangled name.
Save the mangled name earlier during fuzzy symbol lookup.
Fixes PR42074
Reviewers: mstorsjo, ruiu
Subscribers: thakis, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62984
llvm-svn: 362849
We need to have all input files ready before doing debuginfo type merging.
This patch is moving the late PDB type server discovery much earlier in the process, when the explicit inputs (OBJs, LIBs) are loaded.
The short term goal is to parallelize type merging.
Differential Revision: https://reviews.llvm.org/D60095
llvm-svn: 362393
We need to have all input files ready before doing debuginfo type merging.
This patch is moving the late PDB type server discovery much earlier in the process, when the explicit inputs (OBJs, LIBs) are loaded.
The short term goal is to parallelize type merging.
Differential Revision: https://reviews.llvm.org/D60095
llvm-svn: 361842
Patch by Stefan Schmidt.
This adds the /filealign parameter to lld, which allows to specify the
section alignment in the output file (as it does on Microsoft's
link.exe).
This is required to be able to load dynamically linked libraries on the
original Xbox, where the debugger monitor expects the section alignment
in the file to be the same as in memory.
llvm-svn: 361634
The KeepUnique bit is used during ICF, which only operates on
SectionChunks, so only SectionChunks need it. This frees up a byte in
Chunk, which I plan to use in a follow-up change.
llvm-svn: 361549
OptTable treats arguments starting with / that aren't a known option
as filenames. This means lld-link's and clang-cl's typo correction for
unknown flags didn't do spell checking for misspelled options that start
with /.
I first tried changing OptTable, but that got pretty messy, see PR41787
comments 2 and 3.
Instead, let lld-link's and clang's (including clang-cl's) "file not
found" diagnostic check if a non-existent file looks like it could be a
mis-spelled option, and if so add a "did you mean" suggestion to the
"file not found" diagnostic.
While here, make formatting of a few diagnostics a bit more
self-consistent.
Fixes PR41787.
Differential Revision: https://reviews.llvm.org/D62276
llvm-svn: 361518
The previous patch lost the call to PowerOf2Ceil, which causes LLD to
crash when handling common symbols with a non-power-of-2 size. I tweaked
the existing common.test to make the bsspad16 common symbol be 15 bytes
to add coverage for this case.
llvm-svn: 361426
Summary:
Valid section or chunk alignments are powers of 2 in the range [1,
8192]. These can be stored more canonically in log2 form to free up some
bits in Chunk. Combined with D61696, SectionChunk gets 8 bytes smaller.
Reviewers: ruiu, aganea
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61698
llvm-svn: 361206
Change
std::error_code getSectionContents(DataRefImpl, StringRef &) const;
to
Expected<ArrayRef<uint8_t>> getSectionContents(DataRefImpl) const;
Many object formats use ArrayRef<uint8_t> as the underlying type, which
is generally better than StringRef to represent binary data, so change
the type to decrease the number of type conversions.
Reviewed By: ruiu, sbc100
Differential Revision: https://reviews.llvm.org/D61781
llvm-svn: 360648
Summary:
When using lld-link to build static libraries containing object files
with module assembly, the program would crash with "Assertion `T &&
T->hasMCAsmParser()' failed". This change causes the code in lld-link
that initialized Targets, TargetInfos, and AsmParsers (which already
existed) to be run before entering the lib building path (which needs
it). This avoids the error (and is what llvm-lib and llvm-ar do, too).
Fixes PR41803.
Reviewers: ruiu, rnk, hans
Reviewed By: ruiu
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61699
llvm-svn: 360295
As a side benefit, lld-link now reports more than one duplicate resource
entry before exiting with an error even if the new flag is not passed.
llvm-svn: 359829
r191276 added this to old LLD, but it never made it to new LLD -- except
that the flag was in Options.td, so it was silently ignored. I figured
it should be easy to implement, so I did that instead of removing the
flags from Options.td.
I then discovered that link.exe also supports comma-separated lists of
'cd' and 'net', which made the parsing code a bit annoying.
The Alias technique in Options.td is to get nice help output.
Differential Revision: https://reviews.llvm.org/D61067
llvm-svn: 359192
The following options: /pdb, /out or /implib now emit in the repro.tar/response.txt only a filename stripped from its path, to avoid non-existent paths on the reproducer's machine.
Differential Revision: https://reviews.llvm.org/D59530
llvm-svn: 358980
Summary:
Archives can contain multiple members with the same name. This would
cause ThinLTO links to fail ("Expected at most one ThinLTO module per
bitcode file"). This change implements the same strategy we use in
the ELF linker: make the offset in the archive part of the module
name so that names are unique.
Reviewers: pcc, mehdi_amini, ruiu
Reviewed By: ruiu
Subscribers: eraman, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60549
llvm-svn: 358440
When faced with command line options such as "crtbegin.o appmain.o
-lsomelib crtend.o", GNU ld pulls in all necessary object files from
somelib before proceeding to crtend.o.
LLD operates differently, only loading object files from any
referenced static libraries after processing all input object files.
This uses a similar hack as in the ELF linker. Here, it moves crtend.o
to the end of the vector of object files. This makes sure that
terminator chunks for sections such as .eh_frame gets ordered last,
fixing DWARF exception handling for libgcc and gcc's crtend.o.
Differential Revision: https://reviews.llvm.org/D60628
llvm-svn: 358394
Generate import modules for each imported DLL, along with its symbol stream.
Also create COFF groups in the * Linker * module, one for each PartialSection (input, unmerged sections)
Currently COFF groups are disabled for MINGW because it significantly increases PDB sizes. We could enable that later with an option.
The overall objective for this change is to support code hot patching tools. Such tools need to know the import libraries used, from the PDB alone.
Differential Revision: https://reviews.llvm.org/D54802
llvm-svn: 357308
/summary prints information about the data (OBJ/LIB/PDB) processed by LLD. The goal is have an estimate about the inputs and outputs, to better understand where the timings go.
Differential Revision: https://reviews.llvm.org/D58599
llvm-svn: 356188
This makes lld-link's output a bit more concise. Since most developers can't
read mangled names, this should make the output a bit easier to understand as
well. It also makes lld-link's output consistent with ld.lld's output.
(link.exe prints both demangled and mangled names; lld-link used to match
link.exe output but now no longer does.)
For people working on toolchains, add a `/demangle:no` flag that makes lld-link
print the mangled name instead of the demangled name. (If desired, people could
pipe that through `demumble -b` to get the old behavior of both demangled and
mangled output.)
Differential Revision: https://reviews.llvm.org/D58132
llvm-svn: 355878
When mismatched #pragma detect_mismatch declarations occur, now print the conflicting OBJs.
lld-link: error: /failifmismatch: mismatch detected for 'TEST':
>>> test.obj has value 1
>>> test2.obj has value 2
Fixes PR38579
Differential Revision: https://reviews.llvm.org/D58910
llvm-svn: 355543
This is a private undocumented option, intended to be used by
the MinGW driver frontend.
Also restructure the condition to put if (Config->MinGW) first.
This changes the behaviour for the tautological combination of
-export-all-symbols without -lldmingw.
Differential Revision: https://reviews.llvm.org/D58380
llvm-svn: 354386
Summary:
The message "could not get the buffer for the member defining symbol"
now also contains the name of the archive and the name of the archive
member that we tried to open.
Reviewers: ruiu
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57974
llvm-svn: 353572
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This allows using #pragma comment(lib, "foo") in MinGW built code,
if built with -fms-extensions. (This works for system libraries and
static libraries only, as it doesn't try to look for .dll.a. As
ld.bfd doesn't support embedded defaultlib directives, this isn't
in widespread use among mingw users.)
Differential Revision: https://reviews.llvm.org/D53017
llvm-svn: 344124
ld.bfd doesn't do any inference of subsystem; unless the windows
subsystem is specified, the console subsystem is used.
For the console subsystem, the entry point is called mainCRTStartup,
regardless of whether the the user code entry point is main or wmain.
The same goes for the windows subsystem, where the entry point always
is WinMainCRTStartup, for both WinMain and wWinMain in user code.
One detail that we don't emulate, is that if the inferred entry point
is undefined, ld.bfd silently just sets the entry point to the start
of the image. And if an explicit entry point is set, but it is
undefined, the link still succeeds but the linker warns about the
entry point not being found.
Differential Revision: https://reviews.llvm.org/D52931
llvm-svn: 343879
Three related changes:
1. link.exe uses the presence of main and wmain to decide if it should call
mainCRTStartup or wmainCRTStartup, even if /nodefaultlib is passed. For
compatibility, remove FindMain logic.
2. Default to the non-wide entrypoint if main is not found. This has two effects:
2a. In normal links, lld-link now prints
lld-link: error: undefined symbol: _main
>>> referenced by f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:78
>>> libcmt.lib(exe_main.obj):("int __cdecl invoke_main(void)" (?invoke_main@@YAHXZ))
>>> referenced by f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283
>>> libcmt.lib(exe_main.obj):("int __cdecl __scrt_common_main_seh(void)" (?__scrt_common_main_seh@@YAHXZ))
instead of
lld-link: error: entry point must be defined
This is arguably a better error message, since it now mentions that _main is
missing. (This matches link.exe's diagnostic in this case.)
2b. With /nodefautlib, we now default to mainCRTStartup if no main() is
present, again matching link.exe. This makes r337407 obsolete.
This means if you have a cc file containing both mainCRTStartup and
wmainCRTStartup and you pass /nodefaultlib /subsystem:console, lld-link will
now call mainCRTStartup, matching link.exe
3. Print a warning if both main and wmain are present, similar to link.exe's
LNK4067.
Differential Revision: https://reviews.llvm.org/D52832
llvm-svn: 343698
Implement final argument precedence if multiple /debug arguments are passed on the command-line to match expected link.exe behavior.
Support /debug:none and emit warning for /debug:fastlink with automatic fallback to /debug:full.
Emit error if last /debug:option is unknown.
Emit warning if last /debugtype:option is unknown.
https://reviews.llvm.org/D50404
llvm-svn: 342894
MinGW uses these kind of list terminator symbols for traversing
the constructor/destructor lists. These list terminators are
actual pointers entries in the lists, with the values 0 and
(uintptr_t)-1 (instead of just symbols pointing to the start/end
of the list).
(This mechanism exists in both the mingw-w64 crt startup code and
in libgcc; normally the mingw-w64 one is used, but a DLL build of
libgcc uses the libgcc one. Therefore it's not trivial to change
the mechanism without lots of cross-project synchronization and
potentially invalidating some combinations of old/new versions
of them.)
When mingw-w64 has been used with lld so far, the CRT startup object
files have so far provided these symbols, ending up with different,
incompatible builds of the CRT startup object files depending on
whether binutils or lld are going to be used.
In order to avoid the need of different configuration of the CRT startup
object files depending on what linker to be used, provide these symbols
in lld instead. (Mingw-w64 checks at build time whether the linker
provides these symbols or not.) This unifies this particular detail
between the two linkers.
This does disallow the use of the very latest lld with older versions
of mingw-w64 (the configure check for the list was added recently;
earlier it simply checked whether the CRT was built with gcc or clang),
and requires rebuilding the mingw-w64 CRT. But the number of users of
lld+mingw still is low enough that such a change should be tolerable,
and unifies this aspect of the toolchains, easing interoperability
between the toolchains for the future.
The actual test for this feature is added in ctors_dtors_priority.s,
but a number of other tests that checked absolute output addresses
are updated.
Differential Revision: https://reviews.llvm.org/D52053
llvm-svn: 342294
Patch by Thomas Roughton.
This patch adds support for linking with multiple definitions to LLD's
COFF driver, in line with link.exe's /force:multiple option.
Differential Revision: https://reviews.llvm.org/D50598
llvm-svn: 342191
For lld-link missing.obj, lld-link currently prints:
lld-link: error: could not open foo.obj: No such file or directory
lld-link: warning: /machine is not specified. x64 is assumed
lld-link: error: subsystem must be defined
The 2nd and 3rd diagnostics are consequences of the input not existing and are
not interesting. If input files are missing, the best thing we can do is point
that out and then return.
Differential Revision: https://reviews.llvm.org/D51981
llvm-svn: 342158
When building a shared libc++.dll, it pulls in libc++abi.a statically
with the --wholearchive flag. If such a build is done with
--export-all-symbols, it's reasonable to assume that everything
from that library also should be exported with the same rules as normal
local object files, even though we normally avoid autoexporting things
from libc++abi.a in other cases when linking a DLL (user code).
Differential Revision: https://reviews.llvm.org/D51529
llvm-svn: 341403
There's no point in keeping them as separate sections.
This differs from GNU ld, which places .ctors and .dtors content in
.text (implemented by a built-in linker script). But since the content
only is pointers, there's no need to have it executable.
GNU ld also leaves .CRT separate as its own standalone section.
MSVC merges .CRT into .rdata similarly, with a directive embedded in
an object file in msvcrt.lib or libcmt.lib.
Differential Revision: https://reviews.llvm.org/D51414
llvm-svn: 340940
Normally, in order to reference exported data symbols from a different
DLL, the declarations need to have the dllimport attribute, in order to
use the __imp_<var> symbol (which contains an address to the actual
variable) instead of the variable itself directly. This isn't an issue
in the same way for functions, since any reference to the function without
the dllimport attribute will end up as a reference to a thunk which loads
the actual target function from the import address table (IAT).
GNU ld, in MinGW environments, supports automatically importing data
symbols from DLLs, even if the references didn't have the appropriate
dllimport attribute. Since the PE/COFF format doesn't support the kind
of relocations that this would require, the MinGW's CRT startup code
has an custom framework of their own for manually fixing the missing
relocations once module is loaded and the target addresses in the IAT
are known.
For this to work, the linker (originall in GNU ld) creates a list of
remaining references needing fixup, which the runtime processes on
startup before handing over control to user code.
While this feature is rather controversial, it's one of the main features
allowing unix style libraries to be used on windows without any extra
porting effort.
Some sort of automatic fixing of data imports is also necessary for the
itanium C++ ABI on windows (as clang implements it right now) for importing
vtable pointers in certain cases, see D43184 for some discussion on that.
The runtime pseudo relocation handler supports 8/16/32/64 bit addresses,
either PC relative references (like IMAGE_REL_*_REL32*) or absolute
references (IMAGE_REL_AMD64_ADDR32, IMAGE_REL_AMD64_ADDR32,
IMAGE_REL_I386_DIR32). On linking, the relocation is handled as a
relocation against the corresponding IAT slot. For the absolute references,
a normal base relocation is created, to update the embedded address
in case the image is loaded at a different address.
The list of runtime pseudo relocations contains the RVA of the
imported symbol (the IAT slot), the RVA of the location the relocation
should be applied to, and a size of the memory location. When the
relocations are fixed at runtime, the difference between the actual
IAT slot value and the IAT slot address is added to the reference,
doing the right thing for both absolute and relative references.
With this patch alone, things work fine for i386 binaries, and mostly
for x86_64 binaries, with feature parity with GNU ld. Despite this,
there are a few gotchas:
- References to data from within code works fine on both x86 architectures,
since their relocations consist of plain 32 or 64 bit absolute/relative
references. On ARM and AArch64, references to data doesn't consist of
a plain 32 or 64 bit embedded address or offset in the code. On ARMNT,
it's usually a MOVW+MOVT instruction pair represented by a
IMAGE_REL_ARM_MOV32T relocation, each instruction containing 16 bit of
the target address), on AArch64, it's usually an ADRP+ADD/LDR/STR
instruction pair with an even more complex encoding, storing a PC
relative address (with a range of +/- 4 GB). This could theoretically
be remedied by extending the runtime pseudo relocation handler with new
relocation types, to support these instruction encodings. This isn't an
issue for GCC/GNU ld since they don't support windows on ARMNT/AArch64.
- For x86_64, if references in code are encoded as 32 bit PC relative
offsets, the runtime relocation will fail if the target turns out to be
out of range for a 32 bit offset.
- Fixing up the relocations at runtime requires making sections writable
if necessary, with the VirtualProtect function. In Windows Store/UWP apps,
this function is forbidden.
These limitations are addressed by a few later patches in lld and
llvm.
Differential Revision: https://reviews.llvm.org/D50917
llvm-svn: 340726