Many programs, for reasons unknown, really like to look at the
AddressOfRelocationTable to determine whether or not they are looking at
a bona fide PE file. Without this, programs like the UNIX `file'
utility will insist that they are looking at a MS DOS executable.
llvm-svn: 221335
LLD skipped COMDAT section symbols when reading them because
I thought we don't want to have symbols with the same name.
But they are actually needed because relocations may refer to
the section symbols. So we shoulnd't skip them.
llvm-svn: 221329
The job of the CompactUnwind pass is to turn __compact_unwind data (and
__eh_frame) into the compressed final form in __unwind_info. After it's done,
the original atoms are no longer relevant and should be deleted (they cause
problems during actual execution, quite apart from the fact that they're not
needed).
llvm-svn: 221301
The ELF writer creates a invalid binary for few cases with large filesize and
memory size for segments. This patch addresses the functionality and updates the
test. This patch also cleans up parts of the ELF writer for future enhancements
to support Linker scripts.
llvm-svn: 221233
Normally, PE files have section names of eight characters or less.
However, this is problematic for DWARF because DWARF section names are
things like .debug_aranges.
Instead of truncating the section name, redirect the section name into
the string table.
Differential Revision: http://reviews.llvm.org/D6104
llvm-svn: 221212
This patch does *not* implement any semantic actions, but it is a first step to
teach LLD how to read complete linker scripts. The additional linker scripts
statements whose parsing is now supported are:
* SEARCH_DIR directive
* SECTIONS directive
* Symbol definitions inside SECTIONS including PROVIDE and PROVIDE_HIDDEN
* C-like expressions used in many places in linker scripts
* Input to output sections mapping
The goal of this commit was guided towards completely parsing a default GNU ld
linker script and the linker script used to link the FreeBSD kernel. Thus, it
also adds a test case based on the default linker script used in GNU ld for
x86_64 ELF targets. I tested SPEC userland programs linked by GNU ld, using the
linker script dump'ed by this parser, and everything went fine. I then tested
linking the FreeBSD kernel with a dump'ed linker script, installed the new
kernel and booted it, everything went fine.
Directives that still need to be implemented:
* MEMORY
* PHDRS
Reviewers: silvas, shankarke and ruiu
http://reviews.llvm.org/D5852
llvm-svn: 221126
lld was regenerating LC_DATA_IN_CODE in .o output files, but not into
final linked images.
Update test case to verify data-in-code info makes it into final linked images.
llvm-svn: 220827
Objective-C switched to a new ABI which uses a different mangling for class
names. But to keep projects building that use export lists that use the old
class name mangling, the linker recognizes the old names and transforms them
to the new mangling.
llvm-svn: 220598
In final linked shared images, the __TEXT segment contains both code and
the mach-o header/load-commands. In the case of a data-only dylib, there is
no code, so we need to force the addition of the __TEXT segment.
llvm-svn: 220597
This is a follow-up patch for r220333. r220333 renames exported symbols.
That raised another issue; if we have both decorated and undecorated names
for the same symbol, we'll end up have two duplicate exported symbol
entries.
This is a fix for that issue by removing duplciate entries.
llvm-svn: 220350
All compiler generated mach-o object files are marked with MH_SUBSECTIONS_VIA_SYMBOLS.
But hand written assembly files need to opt-in if they are written correctly.
The flag means the linker can break up a sections at symbol addresses and
dead strip or re-order functions.
This change recognizes object files without the flag and marks its atoms as
not dead strippable and adds a layout-after chain of references so that the
atoms cannot be re-ordered.
llvm-svn: 220348
There are two ways to specify a symbol to be exported in the module
definition file.
1) EXPORT <external name> = <symbol>
2) EXPORT <symbol>
In (1), you give both external name and internal name. In that case,
the linker tries to find a symbol using the internal name, and write
that address to the export table with the external name. Thus, from
the outer world, the symbol seems to be exported as the external name.
In (2), internal name is basically the same as the external name
with an exception: if you give an undecorated symbol to the EXPORT
directive, and if the linker finds a decorated symbol, the external
name for the symbol will become the decorated symbol.
LLD didn't implement that exception correctly. This patch fixes that.
llvm-svn: 220333
HAVE_CXXABI_H is not defined on FreeBSD but the system actually
has the header. CMake test fails because the header depends on size_t.
llvm-svn: 220315
The canParse function for all the ELF subtargets check if the input files match
the subtarget.
There were few mismatches in the input files that didnt match the subtarget for
which the link was being invoked, which also acts as a test for this change.
llvm-svn: 220182
For PC relative accesses, negative addends were to be ignored. The linker was
not ignoring it and would fail with an assert. This fixes the issue and is able
to get Helloworld working.
llvm-svn: 220179
The old code was used as a workaround to fix how relocations are calculated for
sections with SHF_MERGE|SHF_STRINGS attribute. This patch removes the erroneous
code.
llvm-svn: 220159
This fixes the way archive members are displayed when the linker is used with a
flag to show all the files that it processes.
When an archive file member is read, we need to show the archive filename and
the member.
llvm-svn: 220144
This would permit the ELF reader to check the architecture that is being
selected by the linking process.
This patch also sorts the include files according to LLVM conventions.
llvm-svn: 220129
The code was making non-portable assumptions about the exact string returned by
the glob (possibly by the shell?); this is more robust and matches what is done
everywhere else.
llvm-svn: 220117
Previously we supported x86 only. This patch is to support x64.
The array of pointers to delay-loaded functions points the
DLL delay loading function at start-up. When a function is called
for the first time, the delay loading function gets called and
then rewrite the function pointer array.
llvm-svn: 220096
First, add a comment to support more variation in FDE formats. Second, refactor
fde -> function handling into a separate function living in the ArchHandler.
llvm-svn: 219959
To deal with cycles in shared library dependencies, the darwin linker supports
marking specific link dependencies as "upward". An upward link is when a
lower level library links against a higher level library.
llvm-svn: 219949
This patch creates the import address table and sets its
address to the delay-load import table. This also creates
wrapper functions for __delayLoadHelper2.
x86 only for now.
llvm-svn: 219948
Not all situations are representable in the compressed __unwind_info format,
and when this happens the entry needs to point to the more general __eh_frame
description.
Just x86_64 implementation for now.
rdar://problem/18208653
llvm-svn: 219836
We'll also need references back to the CIE eventually, but for now making sure
we can work out what an FDE is referring to is enough.
The actual kind of reference needs to be different between architectures,
probably because of MachO's chronic shortage of relocation types but I don't
really want to know in case I find out something that distresses me even more.
rdar://problem/18208653
llvm-svn: 219824
Because we use cast<> at the beginning of this function, it will
abort there if a given atom is not a DefinedAtom.
In the switch statement, we checked if a given atom is a DefinedAtom
again by evaluating definition() == Atom::definitionRegular.
This was always true. So we can remove the outer switch statement.
llvm-svn: 219724
Arm code has two instruction encodings "thumb" and "arm". When branching from
one code encoding to another, you need to use an instruction that switches
the instruction mode. Usually the transition only happens at call sites, and
the linker can transform a BL instruction in BLX (or vice versa). But if the
compiler did a tail call optimization and a function ends with a branch (not
branch and link), there is no pc-rel BX instruction.
The ShimPass looks for pc-rel B instructions that will need to switch mode.
For those cases it synthesizes a shim which does the transition, then modifies
the original atom with the B instruction to target to the shim atom.
llvm-svn: 219655
When committed in r219353, this patch originally caused problems because it was
not tested in debug build. In such scenarios, Driver.cpp adds two additional
passes. These passes serialize all atoms via YAML and reads it back. Since the
patch changed ObjectAtom to hold a new reference, the serialization was removing
the extra data.
This commit implements r219853 in another way, similar to the original MIPS way,
by using a StringSet that holds the names of all copied atoms instead of
directly holding a reference to the copied atom. In this way, this commit is
simpler and eliminate the necessity of changing the DefinedAtom hierarchy to
hold a new data.
Reviewers: shankarke
http://reviews.llvm.org/D5713
llvm-svn: 219449
We are going to have another type of jump table for the delay-load
import table. In order to prepare for that, we want to factor out
the function handling the jump table. No functionality change.
llvm-svn: 219446
Previously the field was not set. The field should be pointing to
a placeholder where the DLL delay-loader writes the base address
of a DLL.
llvm-svn: 219415
This is a partial patch to emit the delay-import table. With this,
LLD is now able to emit the table that llvm-readobj can read and
dump.
The table lacks a few fields, such as the address of HMODULE, the
import address table, etc. They'll be added in subsequent patches.
llvm-svn: 219384
Enhances the creation of an ELF dynamic executable by avoiding recording
unnecessary shared libraries as NEEDED to load a program.
To do this, we must keep track of not only symbols that were referenced but
also of COPY relocations, which steal the symbol from a shared library but does
not store from which lib this symbol came from. To fix this, this commit changes
ObjectSymbol to store the original library from which this symbol came. With
this information, we are able to build a list of the exact shared libraries that
must be marked as DT_NEEDED, instead of blindly marking all shared libraries as
needed.
This logic originally came from the MIPS backend with some adaptation.
Reviewers: atanasyan, shankar.easwaran
http://reviews.llvm.org/D5574
llvm-svn: 219353
Updates the remaining tasks in the X86_64 ELF lld backend after the commit that
handles general dynamic TLS relocations.
Reviewer: shankarke
http://reviews.llvm.org/D5673
llvm-svn: 219350
This commit implements in the X86_64 ELF lld backend yet another feature that
was only available in the MIPS backend. However, this patch changes generic ELF
classes to make it trivial for other ELF backends to use this logic too. When
creating a dynamic executable that has dynamic relocations against weak
undefined symbols, these symbols must be exported to the dynamic symbol table
to seek a possible resolution at run time.
A common use case is the __gmon_start__ weak glibc undefined symbol.
Reviewer: shankarke
http://reviews.llvm.org/D5571
llvm-svn: 219349
When creating a dynamic executable and receiving the -E flag, the linker should
export all globally visible symbols in its dynamic symbol table.
This commit also moves the logic that exports symbols in the dynamic symbol
table from OutputELFWriter to the ExecutableWriter class. It is not correct to
leave this at OutputELFWriter because DynamicLibraryWriter, another subclass of
OutputELFWriter, already exports all symbols, meaning we can potentially end up
with duplicated symbols in the dynamic symbol table when creating shared libs.
Reviewers: shankarke
http://reviews.llvm.org/D5585
llvm-svn: 219334
mach-o supports "fat" files which are a header/table-of-contents followed by a
concatenation of mach-o files (or archives of mach-o files) built for
different architectures. Previously, the support for fat files was in the
MachOReader, but that only supported fat .o files and dylibs (not archives).
The fix is to put the fat handing into MachOFileNode. That way any input file
kind (including archives) can be fat. MachOFileNode selects the sub-range
of the fat file that matches the arch being linked and creates a MemoryBuffer
for just that subrange.
llvm-svn: 219268
Teach the reader about ARM NT relocation types. Although the writer cannot yet
perform the actual application of these relocations, the reader can at least now
identify the relocation types.
llvm-svn: 219178
Previously, we would not check the target machine type and the module (object)
machine type. Add a check to ensure that we do not attempt to use an object
file with a different target architecture.
This change identified a couple of tests which were incorrectly mixing up
architecture types, using x86 input for a x64 target. Adjust the tests
appropriately. The renaming of the input and the architectures covers the
changes to the existing tests.
One significant change to the existing tests is that the newly added test input
for x64 uses the correct user label prefix for X64.
llvm-svn: 219093
In order to support more than x86/x86_64, we need to change the behaviour to use
the actual machine type rather than checking the bitness and assuming that we
are on X86. This replaces the use of is64bit in applyAllRelocations with a
check on the machine type. This will enable adding support for handling ARM
relocations.
Rename the existing applyRelocation methods to be similarly named and to make it
clear the types of relocations they will process.
llvm-svn: 219088
This option is added by Xcode when it runs the linker. It produces a binary
file which contains the file the linker used. Xcode uses the info to
dynamically update it dependency tracking.
To check the content of the binary file, the test case uses a python script
to dump the binary file as text which FileCheck can check.
llvm-svn: 219039
When creating the graph edges of the atoms of an ELF file, special care must be
taken with atoms that represent weak symbols. They cannot be the target of any
Reference::kindLayoutAfter edge because they can be merged and point to other
code, screwing up the final layout of the atoms. ELFFile::createAtoms()
correctly handles this corner case. The problem is that createAtoms() assumed
that there can be no zero-sized weak symbols, which is not true. Consider:
my_weak_func1:
my_weak_func2:
my_weak_func3:
code
In this case, we have two zero-sized weak symbols, my_weak_func1 and
my_weak_func2, and one non-zero weak symbol my_weak_func3. createAtoms() would
correctly handle my_weak_func3, but not the first two symbols. This problem
happens in the musl C library when a zero-sized weak symbol is merged and
screws up the file layout. Since this musl code lives at the finalization hooks,
any C program linked with LLD and musl was correctly executing, but segfaulting
at the end.
Reviewers: shankarke
http://reviews.llvm.org/D5606
llvm-svn: 219034
This patch adds logic to avoid putting the dynamic linker library (ld.so) as a
DT_NEEDED entry in the dynamic table. It should only appear in PT_INTERP.
This patch fixes SPEC programs 433, 445, 450, 453, 456, 462 when running on
Ubuntu Linux x86_64 and when linking SPEC programs with LLD and glibc 2.19.
http://reviews.llvm.org/D5573
llvm-svn: 218847
Summary: With r218633, the logic that monitors which shared library symbols were used was copied from the MIPS lld backend to ELF classes, making it available to all ELF backends. However, this made the isDynSymEntryRequired() functions in MipsDynamicLibraryWriter.h/MipsELFWriters.h/MipsExecutableWriter.h to be duplicated logic, since this is already implemented in OutputELFWriter<>/DefaultLayout. This patch removes this duplicated code from MIPS.
Reviewers: Bigcheese, shankarke
Reviewed By: shankarke
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5564
llvm-svn: 218846
No functionality change. This removes a down-cast from LinkingContext to
MachOLinkingContext.
Also, remove const from LinkingContext::createImplicitFiles() to remove
the need for another const cast. Seems reasonable for createImplicitFiles()
to need to modify the context (MachOLinkingContext does).
llvm-svn: 218796
The darwin linker has the -demangle option which directs it to demangle C++
(and soon Swift) mangled symbol names. Long term we need some Diagnostics object
for formatting errors and warnings. But for now we have the Core linker just
writing messages to llvm::errs(). So, to enable demangling, I changed the
Resolver to call a LinkingContext method on the symbol name.
To make this more interesting, the demangling code is done via __cxa_demangle()
which is part of the C++ ABI, which is only supported on some platforms, so I
had to conditionalize the code with the config generated HAVE_CXXABI_H.
llvm-svn: 218718
This is yet another edge case of ambiguous name resolution.
When a symbol is specified with /entry:SYM, SYM may be resolved
to the C++ mangled function name (?SYM@@YAXXZ).
llvm-svn: 218706
This is a minimally useful pass to construct the __unwind_info section in a
final object from the various __compact_unwind inputs. Currently it doesn't
produce any compressed pages, only works for x86_64 and will fail if any
function ends up without __compact_unwind.
rdar://problem/18208653
llvm-svn: 218703
Summary:
This patch adds support for the general dynamic TLS access model for X86_64 (see www.akkadia.org/drepper/tls.pdf).
To properly support TLS, the patch also changes the __tls_get_addr atom to be a shared library atom instead of a regularly defined atom (the previous lld approach). This closely models the reality of a function that will be resolved at runtime by the dynamic linker and loader itself (ld.so). I was tempted to force LLD to link against ld.so itself to resolve these symbols, but since GNU ld does not need the ld.so library to resolve this symbol, I decided to mimic its behavior and keep hardwired a definition of __tls_get_addr in the lld code.
This patch also moves some important logic that previously was only available to the MIPS lld backend to be used to all ELF backends. This logic, which now lives in the DefaultLayout class, will monitor which external (shared lib) symbols are really imported by the current module and will only populate the dynamic symbol table with used symbols, as opposed to the previous approach of dumping all shared lib symbols in the dynamic symbol table. This is important to this patch to avoid __tls_get_addr from getting injected into all dynamic symbol tables.
By solving the previous problem of always adding __tls_get_addr, now the produced symbol tables are slightly smaller. But this impacted several tests that relied on hardwired/predefined sizes of the symbol table, requiring this patch to update such tests.
Test Plan: Added a LIT test case that exercises a simple use case of TLS variable in a shared library.
Reviewers: ruiu, rafael, Bigcheese, shankarke
Reviewed By: Bigcheese, shankarke
Subscribers: emaste, shankarke, joerg, kledzik, mcrosier, llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D5505
llvm-svn: 218633
Previously we emit two or more identical definitions for an
exported symbol if the same /export option is given more than
once. This patch fixes that bug.
llvm-svn: 218433
This patch is difficult to test in isolation, so a subsequent patch will test
further.
Patch by Daniel Stewart <stewartd@codeaurora.org>!
Phabricator Revision: http://reviews.llvm.org/D5377
llvm-svn: 218418
lib.exe prints a warning if a symbol in a module definition file has
both the PRIVATE attribute and an ordinal like this.
EXPORTS
foo @1 PRIVATE
This patch suppresses that.
llvm-svn: 218395
Currently you can omit the leading underscore from exported
symbol name. LLD will look for mangled name for you. But it won't
look for C++ mangled name.
This patch is to support that.
If "sym" is specified to be exported, the linker looks for not
only "sym", but also "_sym" and "?sym@@<whatever>", so that you
can export a C++ function without decorating it.
llvm-svn: 218355
Exported symbol name resolution is two-pass. In the first pass,
we try to resolve that as a regular undefined symbol. If it fails,
we look for mangled name for the symbol and rename the undefined
symbol and try again.
After all name resolution is done, we look for an atom for each
exported symbol again, to construct the export table. In this
process we try the regular names first, and then try mangled names.
But at this moment we should have knew which name is correct.
This patch is to keep the information we get in the first process
to use it later.
llvm-svn: 218354
The export table descriptor is a data structure to keep information
about the export table. It contains a symbol name, and the name may
or may not be mangled.
We need unmangled names for the export table, so we demangle them
before writing them to the export table.
Obviously this is not a correct round-trip conversion. That could
drop a leading underscore from a symbol because that's
indistinguishable from a mangled name.
What we need to do is to keep unmangled names. This patch does that.
llvm-svn: 218345
/machine:ebc was previously recognized but rejected. Unknown architecture
names were handled differently but eventually rejected too. We don't need
to distinguish them.
llvm-svn: 218344
This patch changes the type of export table set from std::set to
std::vector. The new code is slightly inefficient, but because
export table elements are actually mutable, std::vector is better
here. No functionality change.
llvm-svn: 218343
If two or more /export options are given for the same symbol, we should
always print a warning message and use the first one regardless of other
parameters.
Previously there was a case that the first parameter is not used.
llvm-svn: 218342
A symbol in a module definition file may be annotated with the
PRIVATE keyword like this.
EXPORTS
func PRIVATE
The PRIVATE keyword does not affect the resulting .dll file.
But it prevents the symbol to be listed in the .lib (import
library) file.
llvm-svn: 218273
Patch from Rafael Auler!
When a shared lib has an undefined symbol that is defined in a regular object
(the program), the final executable must export this symbol in the dynamic
symbol table. However, in the current logic, lld only puts the symbol in the
dynamic symbol table if the symbol is weak. This patch fixes lld to put the
symbol in the dynamic symbol table regardless if it is weak or not.
This caused a problem in FreeBSD10, whose programs link against a crt1.o
that defines the symbol __progname, which is, in turn, undefined in libc.so.7
and will only be resolved in runtime.
http://reviews.llvm.org/D5424
llvm-svn: 218259
llvm\tools\lld\lib\readerwriter\macho\macholinkingcontext.cpp(647):
warning C4715: 'lld::MachOLinkingContext::exportSymbolNamed' :
not all control paths return a value
llvm\tools\lld\lib\readerwriter\macho\machonormalizedfilefromatoms.cpp(723):
warning C4715: '`anonymous namespace'::Util::getSymbolTableRegion' :
not all control paths return a value
While all enum values do appear in the switch, an uninitialized or corrupted
enum variable would not be caught without the default: case in the switch.
llvm-svn: 218197
Atoms are ordered in the output file by ordinal. File has file ordinal,
and atom has atom ordinal which is unique within the file.
No two atoms should have the same combination of ordinals.
However that contract was not satisifed for alias atoms. Alias atom
is defined by /alternatename:sym1=sym2. In this case sym1 is defined
as an alias for sym2. sym1 always got ordinal 0.
As a result LLD failed with an assertion failure.
This patch assigns ordinal to alias atoms.
llvm-svn: 218158
Cache the machine type value of the linking context. We need this in order to
calculate the virtual address of the atom when resolving function symbols.
Windows on ARM must check if the atom is a function and if so, set the Thumb bit
for the returned virtual address. Failure to do so will result in an abnormal
exit due to a trap caused by invalid instruction decoding. The same information
can be used to determine the relocation type that was previously being done via
is64 to select between x86 and x86_64.
llvm-svn: 218106
Accept /machine:arm as an argument. This is changed to support ARM NT.
Although there is no way to differentiate between ARM (Windows CE) and ARM NT
(Windows on ARM), since LLVM currently only supports Windows on ARM, simply take
/machine:arm to mean Windows on ARM.
llvm-svn: 218105
Rather than saving whether we are targeting 64-bit x86 (x86_64), simply convert
the single use of that information to the actual relocation type. This will
permit the selection of non-x86 relocation types (e.g. for WoA support).
Inline the access of the machine type field as it is relatively cheap (a couple
of pointer dereferences) rather than storing the relocation type as a member
variable.
llvm-svn: 218104
When we encounter an unknown machine type, we print out the machine type magic.
However, we would print out the magic in decimal rather than hex. Perform this
conversion to make it easier to identify what machine is unsupported.
llvm-svn: 218103
This patch fixes a forbidden use of Twine. It should only be used
as an intermediary value, but never stored.
This caused a bug in lld when running on Linux and compiled with
optimizations - it couldn't properly search libs.
Patch from Rafael Auler!
llvm-svn: 218083
I made LLD to report an error if /safeseh:no option is given on x64,
but it turned out MSVC link.exe doesn't report error on it.
Removing the check.
llvm-svn: 218077
The contents from section .CRT$XLA to .CRT$XLZ is an array of function
pointers. They are called by the runtime when a new thread is created
or (gracefully) terminated.
You can make your own initialization function to be called by that
mechanism. All you have to do is:
- Define a pointer to a function in a .CRT$XL* section using pragma
- Make an external reference to "__tls_used" symbol
That technique is used in many projects. This patch is to support that.
What this patch does is to set the relative virtual address of
"__tls_used" to the PECOFF directory table. __tls_used is actually a
struct containing pointers to a symbol in .CRT$XLA and another symbol
in .CRT$XLZ. The runtime looks at the directory table, gets the address
of the struct, and call the function pointers between XLA and XLZ.
llvm-svn: 218007
On darwin, the linker tools records which dylib (DSO) each undefined was found
in, and then at runtime, the loader (dyld) only looks in that one specific
dylib for each undefined symbol. Now that llvm-objdump can display that info
I can write test cases.
llvm-svn: 217898
lld shouldn't directly use the COFF header nor should it use raw
coff_symbols. Instead, query the header properties from the
COFFObjectFile and use COFFSymbolRef to abstractly reference COFF
symbols.
This is just enough to get lld compiling with the changes to
llvm::object. Bigobj specific testing will come later.
Differential Revision: http://reviews.llvm.org/D5280
llvm-svn: 217497
Most of the changes are in the new file ArchHandler_arm64.cpp. But a few
things had to be fixed to support 16KB pages (instead of 4KB) which iOS arm64
requires. In addition the StubInfo struct had to be expanded because
arm64 uses two instruction (ADRP/LDR) to load a global which requires two
relocations. The other mach-o arches just needed one relocation.
llvm-svn: 217469
There is a bit (MH_PIE) in the flags field of the mach_header which tells
the kernel is a program was built position independent (for ASLR). The linker
automatically attempts to build programs PIE if they are built for a recent
OS version. But the -pie and -no_pie options override that default behavior.
llvm-svn: 217408
defined in a shared library.
Now LLD does not export a strong defined symbol if it coalesces away a
weak symbol defined in a shared library. This bug affects all ELF
architectures and leads to segfault:
% cat foo.c
extern int __attribute__((weak)) flag;
int foo() { return flag; }
% cat main.c
int flag = 1;
int foo();
int main() { return foo() == 1 ? 0 : -1; }
% clang -c -fPIC foo.c main.c
% lld -flavor gnu -target x86_64 -shared -o libfoo.so ... foo.o
% lld -flavor gnu -target x86_64 -o a.out ... main.o libfoo.so
% ./a.out
Segmentation fault
The problem is caused by the fact that we lose all information about
coalesced symbols after the `Resolver::resolve()` method is finished.
The patch solves the problem by overriding the
`LinkingContext::notifySymbolTableCoalesce()` method and saving names
of coalesced symbols. Later in the `buildDynamicSymbolTable()` routine
we use this information to export these symbols.
llvm-svn: 217363
By default linker would not create a separate segment to hold read only data.
This option overrides that behavior by creating the a separate read only segment
for read only data.
llvm-svn: 217358
Moved code used only by isDataSymbol from find to isDataSymbol member
function. Also changed the return type of isDataSymbol because
previously "if (isDataSymbol(...))" meant "if it is *not* a data symbol"
which is opposite from what you'd expect.
llvm-svn: 217285
Mach-O has a "fat" (or "universal") variant where the same contents built for
different architectures are concatenated into one file with a table-of-contents
header at the start. But this leaves a dilemma for the linker - which
architecture to use.
Normally, the linker command line -arch is used to force which slice of any fat
files are used. The clang compiler always passes -arch to the linker when
invoking it. But some Makefiles invoke the linker directly and don’t specify
the -arch option. For those cases, the linker scans all input files in command
line order and finds the first non-fat object file. Whatever architecture it
is becomes the architecture for the link.
llvm-svn: 217189
The use of default: was disabling the warning about unused enumerators. Fix
that, then fix the one enumerator that was not handled. Add coverage for
it in test suite.
llvm-svn: 217078
On Darwin at runtime, dyld will prefer to use the export trie of a dylib instead
of the traditional symbol table (which is large and requires a binary search).
This change enables the linker to generate an export trie and to prefer it if
found in a dylib being linked against. This also simples the yaml for dylibs
because the yaml form of the trie can be reduced to just a sequence of names.
llvm-svn: 217066
I hope this is the last fix for x64 relocations as I've wasted
a few days on this.
This caused a mysterious issue that some C++ programs crash on
startup. It was because a null pointer is passed as argv to main.
__tmainCRTStartup calls main, but before that it calls all
initialization routines between .text$xc_a and .text$xc_z.
pre_cpp_init is one of such routines, and it is the one who
initializes a heap pointer for argv for later use. That routine
was not called for some reason.
It turned out that __tmainCRTStartup was skipping a block of
code because of the relocation bug. A condition in the function
depends on a memory load, and that memory load was referring
a wrong location. As a result a jump instruction took the
wrong branch, skipping pre_cpp_init and so on.
This patch fixes the issue. Also added more tests to fix them
once and for all.
llvm-svn: 216772
When a relocation is applied to a location, the new value needs
to be added to the existing value at the location. Existing
value is in most cases zero, but if not, the current code does
not work.
llvm-svn: 216680
Image Base field in the PE/COFF header is used as hint for the loader.
If the loader can load the executable at the specified address, that's
fine, but if not, it has to load it at a different address.
If that happens, the loader has to fix up the addresses in the
executable by adding the offset. The list of addresses that need to
be fixed is in .reloc section.
This patch is to emit x64 .reloc section contents.
llvm-svn: 216636
IMAGE_REL_AMD64_ADDR64 relocation should set 64-bit *VA* (virtual
address) instead of *RVA* (relative virtual address), so we have
to add the iamge base to the target's RVA.
llvm-svn: 216512
The implementation of AMD64 relocations was imcomplete
and wrong. On AMD64, we of course have to use AMD64
relocations instead of i386 ones. This patch fixes the
issue.
LLD is now able to link hello64.obj (created from
hello64.asm) against user32.lib and kernel32.lib to
create a Win64 binary.
llvm-svn: 216253
Mach-O symbols can have an attribute on them means their content should never be
dead code stripped. This translates to deadStrip() == deadStripNever.
llvm-svn: 216234
Both options control the final scope of atoms.
When -exported_symbols_list <file> is used, the file is parsed into one
symbol per line in the file. Only those symbols will be exported (global)
in the final linked image.
The -keep_private_externs option is only used with -r mode. Normally, -r
mode reduces private extern (scopeLinkageUnit) symbols to non-external. But
add the -keep_private_externs option keeps them private external.
llvm-svn: 216146
The darwin linker has an option, heavily used by Xcode, in which, instead
of listing all input files on the command line, the input file paths are
written to a text file and the path of that text file is passed to the linker
with the -filelist option (similar to @file).
In order to make test cases for this, I generalized the -test_libresolution
option to become -test_file_usage.
llvm-svn: 215762
Darwin has a packaging mechanism for shared libraries and headers called
frameworks. A directory Foo.framework contains a shared library binary file
"Foo" and a subdirectory "Headers". Most OS frameworks are all in one
directory /System/Library/Frameworks/. As a linking convenience, the linker
option "-framework Foo" means search the framework directories specified
with -F (analogous to -L) looking for a shared library Foo.framework/Foo.
llvm-svn: 215680
In general two-level namespace means each program records exactly which dylib
each undefined (imported) symbol comes from. But, sometimes the implementor
wants to hide the implementation dylib. For instance libSytem.dylib is the base
dylib all Darwin programs must link with. A few years ago it was split up
into two dozen dylibs by all are hidden behind libSystem.dylib which re-exports
each sub-dylib. All clients still think libSystem.dylib is the implementor.
To support this, the linker must load "indirect" dylibs and not just the
"direct" dylibs specified on the command line. This is done in the
createImplicitFiles() method after all command line specified files are
loaded. Since an indirect dylib may have already been loaded as a direct dylib
(or indirectly via a previous direct dylib), the MachOLinkingContext keeps
a list of all loaded dylibs.
With this change hello world can now be linked against the real OS or SDK.
llvm-svn: 215605
Split up the CRuntimeFile into one part for output types that need an entry
point and another part for output types that use stubs.
Add file 'test/mach-o/Inputs/libSystem.yaml' for use by test cases that
use -dylib and therefore may now need the helper symbol in libSystem.dylib.
llvm-svn: 215602
Mach-o uses "two-level namespace" where each undefined symbols is associated
with a specific dylib. This means at runtime the loader (dyld) does need to
search all loaded dylibs for that symbol but rather just the one specified.
Now that llvm-nm -m prints out that info, properly set that info, and test
it in the hello world test cases.
llvm-svn: 215598
This patch adds the initial ELF/AArch64 support to lld. Only a basic "Hello
World" app has been successfully tested for both dynamic and static compiling.
Differential Revision: http://reviews.llvm.org/D4778
Patch by Daniel Stewart <stewartd@codeaurora.org>!
llvm-svn: 215544
The tests assume the path separator is '/', but if you run
them on Windows it is '\'. As a result the tests are failing
on Windows. This should be the minimal change to make these
tests to pass on Windows platform.
Differential Revision: http://reviews.llvm.org/D4710
llvm-svn: 214990
/INCLUDE arguments passed as command line options are handled in the
same way as Unix -u. All option values are converted to an undefined
symbol and added to a dummy input file, so that the specified symbols
are resolved.
One tricky thing on Windows is that the option is also allowed to
appear in the object file's directive section. At the time when
it's being read, all (regular) command line options have already
been processed. We cannot add undefined atoms to the dummy file
anymore.
Previously, we added such /INCLUDE to a set that has already been
processed. As a result the options were ignored.
This patch fixes the issue. Now, /INCLUDE symbols in the directive
section are handled as real undefined symbol in the COFF file.
We create an undefined symbol for each /INCLUDE argument and add
it to the file being parsed.
llvm-svn: 214824
The PE/COFF spec says that SizeOfRawData field in the section
header must be a multiple of FileAlignment from the optional
header. LLD emits 512 as FileAlignment, so it must have been
a multiple of 512.
LLD did not follow that. It emitted the actual section size
without the last padding as the SizeOfRawData. Although it's
not correct as per the spec, the Windows loader doesn't seem
to actually bother to check that. Executables created by LLD
worked fine.
However, tools dealing with executalbe files may expect it
to be the correct value, and one instance of it is mt.exe
tool distributed as a part of Windows SDK.
If CMake is invoked with "-E vs_link_exe" option, it silently
run mt.exe to embed a resource file to the resulting file.
And mt.exe sometimes breaks an input file if it's section
header does not follow the standard. That caused a misterous
error that CMake with Ninja occasionally produces a broken
executable.
This patch fixes the section header to make mt.exe and
other tools happy.
llvm-svn: 214453
In some cases the address of a function will be materialized with a movw/movt
pair. If the function is a thumb function, the low bit needs to be set on
the movw immediate value.
llvm-svn: 214277
The -sectalign option is used to increase the alignment required for a section.
It required some reworking of how the __TEXT segment is laid out because that
segment also contains the mach_header and load commands. And the size of load
commands depend on the number of segments, sections, and dependent dylibs used.
Using this option will simplify some future test cases because the final
address of code can be pinned down, making tests of its content easier.
llvm-svn: 214268
All iOS arm processor support switching between arm and thumb mode at call sites
by using the BLX instruction (instead of BL). But the compiler does not know
the implementation mode for extern functions, so the linker must update BL/BLX
instructions to match what is linked is actually linked together. In addition,
pointers to functions (such as vtables) must have the low bit set if the target
of the pointer is a thumb mode function.
llvm-svn: 214140
Sometimes compilers emit data into code sections (e.g. constant pools or
jump tables). These runs of data can throw off disassemblers. The solution
in mach-o is that ranges of data-in-code are encoded into a table pointed to
by the LC_DATA_IN_CODE load command.
The way the data-in-code information is encoded into lld's Atom model is that
that start and end of each data run is marked with a Reference whose offset
is the start/end of the data run. For arm, the switch back to code also marks
whether it is thumb or arm code.
llvm-svn: 213901
The entry point file needs to be processed after all other
object files and before all .lib files. It was processed
after .lib files. That caused an issue that the entry point
function was not resolved from the standard library files.
llvm-svn: 213804
On Windows there are four "main" functions -- main, wmain, WinMain,
or wWinMain. Their parameter types are diffferent. The standard
library provides four different entry functions (i.e.
{w,}{WinMain,main}CRTStartup) for them. You need to use the right
entry routine for your "main" function.
If you give an /entry option, the specified name is used
unconditionally.
Otherwise, the linker needs to select the right one based on
user-supplied entry point function. This can be done after the
linker reads all the input files.
This patch moves the code to determine the entry point function
from the driver to a virtual input file. It also implements the
correct logic for the entry point function selection.
llvm-svn: 213713
This patch just supports marking ranges that are thumb code (vs arm code).
Future patches will mark data and jump table ranges. The ranges are encoded
as References with offsetInAtom being the start of the range and the target
being the same atom.
llvm-svn: 213712
Over time the symbols and relocations have changed for dwarf unwind info
in the __eh_frame section. Add test cases for older and new style.
llvm-svn: 213585
Add support for adding section relocations in -r mode. Enhance the test
cases which validate the parsing of .o files to also round trip. They now
write out the .o file and then parse that, verifying all relocations survived
the round trip.
llvm-svn: 213333
The code to manage resolvable symbols is now separated from
ExportedSymbolRenameFile so that other class can reuse it.
I'm planning to use it to find the entry function symbol
based on resolvable symbols.
llvm-svn: 213322
All architecture specific handling is now done in the appropriate
ArchHandler subclass.
The StubsPass and GOTPass have been simplified. All architecture specific
variations in stubs are now encoded in a table which is vended by the
current ArchHandler.
llvm-svn: 213187
There are two forms of `-l` prefixed expression:
* -l<libname>
* -l:<filename>
In the first case a linker should construct a full library name
`lib + libname + .[so|a]` and search this library as usual. In the second case
a linker should use the `<filename>` as is and search this file through library
search directories.
The patch reviewed by Shankar Easwaran.
llvm-svn: 213077
Previously we invoked cvtres.exe for each compiled Windows
resource file. The generated files were then concatenated
and embedded to the executable.
That was not the correct way to merge compiled Windows
resource files. If you just concatenate generated files,
only the first file would be recognized and the rest would
be ignored as trailing garbage.
The right way to merge them is to call cvtres.exe with
multiple input files. In this patch we do that in the
Windows driver.
llvm-svn: 212763
These behave slightly idiosyncratically in the best of cases, and have
additional hacks layered on top of that for compatibility with badly behaved
build systems (via ld64).
For -lXYZ:
+ If XYZ is actually XY.o then search all library paths for XY.o
+ Otherwise search all library paths, first for libXYZ.dylib, then libXYZ.a
+ By default the library paths are /usr/lib and /usr/local/lib in that order.
For -syslibroot:
+ -syslibroot options apply to absolute paths in the search order.
+ All -syslibroot prefixes that exist are added to the search path *instead*
of the original.
+ If no -syslibroot prefixed path exists, the original is kept.
+ Hacks^WExceptions:
+ If only 1 -syslibroot is given and doesn't contain /usr/lib or
/usr/local/lib, that path is dropped entirely. (rdar://problem/6438270).
+ If the last -syslibroot is "/", all of them are ignored entirely.
(rdar://problem/5829579).
At least, that's my best interpretation of what ld64 does in buildSearchPaths.
llvm-svn: 212706
Previously the alignment of the .bss section was not
properly set because of a bug in AtomizeDefinedSymbolsInSection.
We set the alignment of the section at the end of the function,
but we use an eraly return for the .bss section, so the code had
been skipped.
llvm-svn: 212571
This converts the very complicated mach-o arm
relocations into the simple Reference Kinds in lld.
The next patch will use the internal Reference kinds
to fix up arm/thumb code.
llvm-svn: 212306
When trying to map atom types to sections, we were iterating through an array
until we hit a sentinel value. There's no need for such dances when range-based
for loops are available.
llvm-svn: 212035
This isn't really the right place to put them in final object files (that would
be __TEXT,__unwind_info), but the format is different between relocatable and
final objects, which means we really need a pass to handle the translation.
For now, re-emitting in __LD,__compact_unwind is harmless (dyld ignores it and
moves straight on to inspecting __TEXT,__eh_frame), and sidesteps an assertion
failure when processing files containing compact-unwind info.
llvm-svn: 212032
Segments must occupy a multiple of the page size in memory (4096 currently). We
check for this when emitting files, but the placement algorithm broke down for
the second non-__TEXT segment encountered: the offset wasn't aligned up to 4096
before starting its layout.
llvm-svn: 212031
Because of how we were calculating fileOffset and fileSize for segments, most
ended up at a single offset in a finalised MachO file. This meant the data
often didn't even get written in the final object, let alone where it would be
useful.
llvm-svn: 212030
This is first step in reworking how mach-o relocations are processed.
The existing KindHandler is going to become a delgate/helper object for
processing architecture specific references. The KindHandler knows how
to convert mach-o relocations into References and back, as well, as fixing
up the content the relocation is on.
One of the messy things about mach-o relocations is that they sometime
come in pairs, but the pairs still convert to one lld::Reference. So, the
conversion has to detect pairs (arch specific) and change the stride.
llvm-svn: 211921
The previous function returned true for "s < s", which could completely mess up
the sorting of symbols within a section.
Unfortunately, I don't think there's a robust way to write a test for this.
Anything I come up with will be making assumptions about the particular
implementation of std::sort.
llvm-svn: 211704
When looking through sections with zero-terminated string-literals (__cstring
or __ustring) we were constantly rechecking the first few bytes of the string
for '\0' rather than advancing along. This obviously failed unless all strings
within the section had the same length as that first one.
llvm-svn: 211682
We were trying to examine the first symbol in a section that we wanted to
atomize by symbols, even when there wasn't one. Instead, we should make the
initial anonymous symbol cover the entire section in that situation.
llvm-svn: 211681
dynamic symbol table populating and DT_NEEDED tag creation.
The `isDynSymEntryRequired` function returns true if the specified shared
library atom requires a dynamic symbol table entry. The `isNeededTagRequired`
function returns true if we need to create DT_NEEDED tag for the shared
library defined specified shared atom.
By default the both functions return true. So there is no functional changes
for all targets except MIPS. Probably we need to spread the same modifications
on other ELF targets but I want to implement and fully tested complete set of
changes for MIPS target first.
For MIPS we create a dynamic symbol table entry for a shared library atom iif
this atom is referenced by a regular defined atom. For example, if library L1
defines symbol T1, library L2 defines symbol T2 and uses symbol T1
and executable file E1 uses symbol T2 but does not use symbol T1 we create
an entry in the E1 dynamic symbol table for symbol T2 and do not create
an entry for T1.
The patch creates DT_NEEDED tags for shared libraries contain shared library
atoms which a) referenced by regular defined atoms; b) have corresponding
copy dynamic relocations (R_MIPS_COPY).
Now the patch does not take in account --as-needed / --no-as-needed command
line options. So it is too restrictive and create DT_NEEDED tags for really
needed shared libraries only. I plan to fix that by subsequent patches.
llvm-svn: 211674
COFF supports a feature similar to ELF's section groups. This
patch implements it.
In ELF, section groups are identified by their names, and they are
treated somewhat differently from regular symbols. In COFF, the
feature is realized in a more straightforward way. A section can
have an annotation saying "if Nth section is linked, link this
section too."
I added a new reference type, kindAssociate. If a target atom is
coalesced away, the referring atom is removed by Resolver, so that
they are treated as a group.
Differential Revision: http://reviews.llvm.org/D4028
llvm-svn: 211106
The previous commit uncovered a bug in the mach-o writer whereby two __text
sections were created. But the test case did not catch that. So I updated
the test case to run the linker a second time, reading the output of the
first pass.
llvm-svn: 210624
COFF supports a feature similar to ELF's section groups. This
patch implements it.
In ELF, section groups are identified by their names, and they are
treated somewhat differently from regular symbols. In COFF, the
feature is realized in a more straightforward way. A section can
have an annotation saying "if Nth section is linked, link this
section too."
Implementing such feature is easy. We can add a reference from a
target atom to an original atom, so that if the target is linked,
the original atom is also linked. If not linked, both will be
dead-stripped. So they are treated as a group.
I added a new reference type, kindAssociate. It does nothing except
preventing referenced atoms from being dead-stripped.
No change to the Resolver is needed.
Reviewers: Bigcheese, shankarke, atanasyan
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3946
llvm-svn: 210240
This provides support for the autoconfing & make build style.
The format, style and implementation follows that used within the llvm and clang projects.
TODO: implement out-of-source documentation builds.
llvm-svn: 210177
Previously FileArchive ctor comment said that only its subclasses
can be instantiated, but the ctor is actually public and is
instantiated by ArchiveReader.
Remove the wrong comment and reorder the member functions so that
public members appear before private ones.
llvm-svn: 210175
In sections that are broken into atoms at symbols, if the first symbol in the
section is not at the start of the section, then make an anonymous atom for
the section content that is before the first symbol.
llvm-svn: 210142
Previously each section kind had its own code to loop over the section and
parse it into atoms. This refactoring has two tables. The first maps sections
to ContentType. The second maps ContentType to information on how to find
the atom boundaries.
A few bugs in test cases were discovered as part of the refactoring.
No change in functionality intended.
llvm-svn: 210138
Previously section groups are doubly linked to their children.
That is, an atom representing a group has group-child references
to its group contents, and content atoms also have group-parent
references to the group atom. That relationship was invariant;
if X has a group-child edge to Y, Y must have a group-parent
edge to X.
However we were not using group-parent references at all. The
resolver only needs group-child edges.
This patch simplifies the section group by removing the unused
reverse edge. No functionality change intended.
Differential Revision: http://reviews.llvm.org/D3945
llvm-svn: 210066
Arrange .ctors/.dtors sections in the following order:
.ctors from crtbegin.o or crtbegin?.o
.ctors from regular object files
.ctors.* (sorted) from regular object files
.ctors from crtend.o or crtend?.o
This order is specific for MIPS traget. For example, on X86
the .ctors.* sections are merged into the .init_array section.
llvm-svn: 209987
The main problem is in the predicate passed to the `std::stable_sort()`.
This predicate always returns false if **both** section's names do not
start with `.init_array` or `.fini_array` prefixes. In short, it does not
define a strict weak orderng. Suppose we have the following sections:
.A .init_array.1 .init_array.2
The predicate states that:
not .init_array.1 < .A
not .A < .init_array.2
but .init_array.1 < .init_array.2 !!!
The second problem is that `.init_array` section without number should
go last in the list. Not it has the lowest priority '0' and goes first.
The patch fixes both of the problems.
llvm-svn: 209875
This is a short-term fix to allow lld Readers to return error messages
with dynamic content.
The long term fix will be to enhance ErrorOr<> to work with errors other
than error_code. Or to change the interface to Readers to pass down a
diagnostics object through which all error messages are written.
llvm-svn: 209681
/alternatename is a command line option to define a weak alias. You
can use it as /alternatename:foo=bar to define "foo" as a weak alias
for "bar".
Because it's a command line option, the weak alias mapping is in the
LinkingContext object, and not in a object file being read.
Previously, we looked up the mapping each time we read a new symbol
from a file, to check if there is a weak alias defined for the symbol.
That's not wrong, but had made function signature's a bit complicated --
we had to pass the mapping object to many functions. Now their
parameter lists are much cleaner.
This also has another (unrealized) benefit. parseFile() now read a
file and then add alias symbols to the file. In the first pass a
LinkingContext object is not used at all. That should make it easy
to read files from archive files speculatively, as the first pass
is free from side effect.
llvm-svn: 209486
Alias symbols are SimpleDefinedAtoms and are platform neutral. They
don't have to belong ELF. This patch is to make it available to all
platforms. No functionality change intended.
Differential Revision: http://reviews.llvm.org/D3862
llvm-svn: 209475
addResolvableSymbols() queues input files, and readAllSymbols() reads
from them. In practice it's currently safe because they are called from
a single thread. But it's not guaranteed.
Also, acquiring the same mutex is needed not to see inconsistent memory
contents that is allowed in the C++ memory model.
llvm-svn: 209254
ExportedSymbolRenameFile is not always used. In most cases we don't
need to read given files at all. So lazy load would help. This doesn't
change the meaining of the program.
llvm-svn: 208818
As written in the comment in this patch, symbol names specified with
/export option is resolved in a special way; for /export:foo, linker
finds a foo@<number> symbol if such symbols exists.
On Windows, a function in stdcall calling convention is mangled with
a leading underscore and following "@" and numbers. This name
mangling is kind of automatic, so you can sometimes omit _ and @number
when specifying a symbol. /export option is that case.
Previously, if a file in an archive file foo.lib provides a symbol
_fn@8, and /export:fn is specified, LLD failed to resolve the symbol.
It only tried to find _fn, and failed to find _fn@8. With this patch,
_fn@8 will be searched on the second iteration.
Differential Revision: http://reviews.llvm.org/D3736
llvm-svn: 208754
Make it possible to add observers to an Input Graph, so that files
returned from an Input Graph can be examined before they are
passed to Resolver.
To implement some PE/COFF features we need to know all the symbols
that *can* be solved, including ones in archive files that are not
yet to be read.
Currently, Resolver only maintains a set of symbols that are
already read. It has no knowledge on symbols in skipped files in
an archive file.
There are many ways to implement that. I chose to apply the
observer pattern here because it seems most non-intrusive. We don't
want to mess up Resolver with architecture specific features.
Even in PE/COFF, the feature that needs this mechanism is minor.
So I chose not to modify Resolver, but add a hook to Input Graph.
Differential Revision: http://reviews.llvm.org/D3735
llvm-svn: 208753
If one or more dynamic relocation might modify a read-only section,
dynamic table should contain DT_TEXTREL tag.
The patch introduces new `RelocationTable::canModifyReadonlySection()`
method. This method checks through the relocations to see if any modifies
a read-only section. The DynamicTable class calls this method and emits
the DT_TEXTREL tag if necessary.
The patch reviewed by Rui Ueyama and Shankar Easwaran.
llvm-svn: 208670
We did not actively try to resolve dllexported symbols specified
by /export or by a module definition file. So if exported symbols
would be resolved for other reasons, like other symbols refer to
them, that was fine, but if (unreferenced) exported symbols were
in an archive file, and no one refers to that file in the archive,
they remained unresolved.
That would obviously cause the issue that dllexported symbols are
not in a resultant DLL.
In this patch, we create an undefined symbol for each dllexported
symbol, to let the core linker to resolve it.
llvm-svn: 208452
Previously the handling of exported symbol was wrong if it's
specified in a module definition file in the form of
<externalname>=<internalname>. Export the correct symbol.
llvm-svn: 208446
isAlias always returns false and no one is using it. It was
originally added Atom to query if an atom is an alias for another
atom, assuming that alias atoms are different from normal atoms.
We now support atom aliasing, but the way that's implemented is
in a different way than what isAlias assumed. An alias atom is
just a regular defined atom with no content, and it has a layout-
before edge to alias-to atom so that they are layed out at the
same location in the result. So this is dead code, and it doesn't
make much sense to keep it.
llvm-svn: 207884
Export definitions in a module definition file is as follows:
exportedname[=internalname] [@ordinal [NONAME]] [PRIVATE] [DATA]
Previously we did not support =internalname, so users couldn't export
symbols from a DLL with a different name.
llvm-svn: 207827
In general the linker scripts's GROUP command works like a pair
of command line options --start-group/--end-group. But there is
a difference in the files look up algorithm.
The --start-group/--end-group commands use a trivial approach:
a) If the path has '-l' prefix, add 'lib' prefix and '.a'/'.so'
suffix and search the path through library search directories.
b) Otherwise, use the path 'as-is'.
The GROUP command implements more compicated approach:
a) If the path has '-l' prefix, add 'lib' prefix and '.a'/'.so'
suffix and search the path through library search directories.
b) If the path does not have '-l' prefix, and sysroot is configured,
and the path starts with the / character, and the script being
processed is located inside the sysroot, search the path under
the sysroot. Otherwise, try to open the path in the current
directory. If it is not found, search through library search
directories.
https://www.sourceware.org/binutils/docs-2.24/ld/File-Commands.html
The patch reviewed by Shankar Easwaran, Rui Ueyama.
llvm-svn: 207769
When creating a .lib file, we should strip the leading underscore,
but should not strip stdcall atsign suffix. Otherwise produced .lib
files cannot be linked.
llvm-svn: 207729
Previously the input file for the lib.exe command would be removed
as soon as the command exits, so we couldn't write a test to check
the file contents are correct.
This patch adds /lldmoduledeffile: option to retain a copy of the
temporary file at the given file path, so that you can see the file
if you want.
llvm-svn: 207727
Linker should create _imp_ symbols for local use only when such
symbols cannot be resolved in any other way. If it overrides real
imported symbols, such symbols remain virtually unresolved without
error, causing odd issues. I observed that a program linked with
LLD entered an infinite loop before reaching main() because of
this issue.
This patch moves the virtual file creating _imp_ symbols to the
very end of the input file list. Previously, the file is at the end
of the library file group. Linker might revisit the group many times,
so it was not really at the end of the input file list.
llvm-svn: 207605
1. Re-implement PLT entries and dynamic relocations emitting to keep PLT
and relocations table in a consistent state.
2. Initialize st_value and st_other fields for dynamic symbols table
entry if this entry corresponds to an external function which address is
taken in a non-PIC executable. In that case the st_value field holds an
address of the function's PLT entry. Also set STO_MIPS_PLT bit in the
st_other field.
llvm-svn: 207494
Implicit symbol for local use implemented in r207141 was not fully
compatible with MSVC link.exe. In r207141, I implemented the feature
in such way that implicit symbols are defined only when they are
exported with /EXPORT option.
After that I found that implicit symbols are defined not only for
dllexported symbols but for all defined symbols. Actually _imp_
implicit symbols have no relationship with the dllexport feature. You
could add _imp_ to any symbol to get a pointer to the symbol, whether
the symbol is dllexported or not. It looks pretty weird to me but
that's what we want if link.exe behaves that way.
Here is a bit about the implementation: Creating all implicit symbols
beforehand is going to be a huge waste of resource. This feature is
rarely used, and MSVC link.exe even prints out a warning message when
it finds this feature is being used. So we create implicit symbols
on demand. There is an archive file that creates implicit symbols when
they are needed.
llvm-svn: 207476
This patch is to fix a compatibility issue with MSVC link.exe as to
use of dllexported symbols inside DLL.
A DLL exports two symbols for a function. One is non-decorated one,
and the other is with __imp_ prefix. The former is a function that
you can directly call, and the latter is a pointer to the function.
These dllexported symbols are created by linker for programs that
link against the DLL. So, I naturally believed that __imp_ symbols
become available when you once create a DLL and link against it, but
they don't exist until then. And that's not true.
MSVC link.exe is smart enough to allow users to use __imp_ symbols
locally. That is, if a symbol is specified with /export option, it
implicitly creates a new symbol with __imp_ prefix as a pointer to
the exported symbol. This feature allows the following program to
be linked and run, although _imp__hello is not defined in this code.
#include <stdio.h>
__declspec(dllexport)
void hello(void) { printf("Hello\n"); }
extern void (*_imp__hello)(void);
int main() {
_imp__hello();
return 0;
}
MSVC link.exe prints out the following warning when linking it.
LNK4217: locally defined symbol _hello imported in function _main
Using __imp_ symbols locally is I think not a good coding style. One
should just take an address using "&" operator rather than appending
__imp_ prefix. However, there are programs in the wild that depends
on this link.exe's behavior, so we need this feature.
llvm-svn: 207141
Not all symbols are decorated with an underscore in x86. You can
write undecorated symbols in assembly, for example. Thus this
assertion is too strong.
llvm-svn: 207125
We don't use sections with IMAGE_SYM_DEBUG attribute so we basically
want to the symbols for them when reading symbol table. When we skip
them, we need to skip auxiliary symbols too. Otherwise weird error
would happen because aux symbols would be interpreted as regular ones.
llvm-svn: 206931
definition below all of the header #include lines, LLD edition.
IF you want to know more details about this, you can see the recent
commits to Debug.h in LLVM. This is just the LLD segment of a cleanup
I'm doing globally for this macro.
llvm-svn: 206851
Currently LLD supports --defsym only in the form of
--defsym=<symbol>=<integer>, where the integer is interpreted as the
absolute address of the symbol. This patch extends it to allow other
symbol name to be given as an RHS value. If a RHS value is a symbol
name, the LHS symbol will be defined as an alias for the RHS symbol.
Internally, a LHS symbol is represented as a zero-size defined atom
who has an LayoutAfter reference to an undefined atom, whose name is
the RHS value. Everything else is already implemented -- Resolver
will resolve the undefined symbol, and the layout pass will layout
the two atoms at the same location. Looks like it's working fine.
Note that GNU LD supports --defsym=<symbol>=<symbol>+<addend>. That
feature is out of scope of this patch.
Differential Revision: http://reviews.llvm.org/D3332
llvm-svn: 206417
a couple of new virtual functions.
Follow-up to the rL203408. Two virtual functions `createRelocationReference()`
responsible for creation of `ELFReference` have been replaced by a couple of
new virtual functions `createRelocationReferences()` (plural). Each former
function creates a //single// ELFReference for a specified `Elf_Rela`
or `Elf_Rel` relocation records. The new functions responsible for creation
of //all// relocation references for provided symbol.
For all targets except MIPS there are no functional changes.
MIPS ABI has a notion of //paired// relocations. An effective addend of such
relocations are calculated using addends of both pair's members.
Each `R_MIPS_HI16` and `R_MIPS_GOT16` (for local symbols) relocations must have
an associated `R_MIPS_LO16` entry immediately following it in the list
of relocations. Immediately does not mean "next record" in relocations section
but "next record referenced the same symbol". Moreover a single `R_MIPS_LO16`
relocation can be paired with multiple preceding `R_MIPS_HI16/R_MIPS_GOT16`
relocations.
The paired relocation can have offsets belong to the different symbols.
That is why we need to have access to list of all relocations during
construction of `ELFReference` for MIPS target.
The patch reviewed by Shankar Easwaran.
llvm-svn: 206102
Seems getSomething() is more common naming scheme than just a noun
to get something, so renaming these members.
Differential Revision: http://llvm-reviews.chandlerc.com/D3285
llvm-svn: 205589
"x.empty()" is more idiomatic than "x.size() == 0". This patch is to
add such method and use it in LLD.
Differential Revision: http://llvm-reviews.chandlerc.com/D3279
llvm-svn: 205558
.gnu.linkonce sections are similar to section groups.
They were supported before section groups existed and provided a way
to resolve COMDAT sections using a different design.
There are few implementations that use .gnu.linkonce sections
to store simple floating point constants which doesnot require complex section
group support but need a way to store only one copy of the floating point
constant in a binary.
.gnu.linkonce based symbol resolution achieves that.
Review : http://llvm-reviews.chandlerc.com/D3242
llvm-svn: 205280
This reverts commit 5d5ca72a7876c3dd3dd1db83dc6a0d74be9e2cd1.
Discuss on a better design to raise error when there is a similar group with Gnu
linkonce sections and COMDAT sections.
llvm-svn: 205224
.gnu.linkonce sections are similar to section groups. They were supported before
section groups existed and provided a way to resolve COMDAT sections using a
different design. There are few implementations that use .gnu.linkonce sections
to store simple floating point constants which doesnot require complex section
group support but need a way to store only one copy of the floating point
constant. .gnu.linkonce based symbol resolution achieves that.
llvm-svn: 205163
This patch is to support --defsym option for ELF file format/GNU-compatible
driver. Currently it takes a symbol name followed by '=' and a number. If such
option is given, the driver sets up an absolute symbol with the specified
address. You can specify multiple --defsym options to define multiple symbols.
GNU LD's --defsym provides many more features. For example, it allows users to
specify another symbol name instead of a number to define a symbol alias, or it
even allows a symbol plus an offset (e.g. --defsym=foo+3) to define symbol-
relative alias. This patch does not support that, but will be supported in
subsequent patches.
Differential Revision: http://llvm-reviews.chandlerc.com/D3208
llvm-svn: 205029
These classes are declared in a .cpp file but not used in the same compliation
unit. They seems to have been copy-and-pasted from ELFReader.h.
llvm-svn: 204988
of GOT.
* Read addend for R_MIPS_GOT16 relocation.
* Put only high 16 bits of symbol + addend into GOT entries for
locally visible symbols.
llvm-svn: 204247
COMDAT_SELECT_LARGEST is a COMDAT type that make linker to choose the largest
definition from among all of the definition of a symbol. If the size is the
same, the choice is arbitrary.
Differential Revision: http://llvm-reviews.chandlerc.com/D3011
llvm-svn: 204172
The COFF spec says that the SectionNumber field in the symbol table is 16 bit
signed type, but MSVC treats the field as if it is unsigned.
llvm-svn: 203901
This results in some simplifications to the code where an OwningPtr had to
be used with the previous api and then ownership moved to a unique_ptr for
the rest of lld.
llvm-svn: 203809
An object whose machine type header value is unknown looks a bit odd but
is valid. If an object contains only machine-type-independent data, you
can leave the type field unspecified. Some files in oldname.lib are such
object files.
llvm-svn: 203752
Summary:
COMDAT_SELECT_SAME_SIZE is a COMDAT type that I presume exist only in COFF.
The semantics of the type is that linker should merge such COMDAT sections if
their sizes are the same. Otherwise it's an error.
Reviewers: Bigcheese, shankarke, kledzik
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2996
llvm-svn: 203308
Just like x86 exception handler table, the table for x64 needs to be sorted
so that runtime can binary search on it. Unlike x86, the table entry for x64
has multiple fields, and they need to be sorted according to its BeginAddress
field. This patch also fixes a bug in relocations.
llvm-svn: 202874
It looks like the contents of the table need to be sorted according to its
value, so that the runtime can find the entry by binary search. I'm not 100%
sure if we really have to do that, but at least I can say it's safe to do
because the contents of .sxdata is just a list of exception handlers' RVAs.
llvm-svn: 202550
If all input files are compatible with Structured Exception Handling, linker
is supposed to create an exectuable with a table for SEH handlers. The table
consists of exception handlers entry point addresses.
The basic idea of SEH in x86 Microsoft ABI is to list all valid entry points
of exception handlers in an read-only memory, so that an attacker cannot
override the addresses in it. In x86 ABI, data for exception handling is mostly
on stack, so it's volnerable to stack overflow attack. In order to protect
against it, Windows runtime uses the table to check a return address, to
ensure that the address is really an valid entry point for an exception handler.
Compiler emits a list of exception handler functions to .sxdata section. It
also emits a marker symbol "@feat.00" to indicate that the object is compatible
with SEH. SEH is a relatively new feature for COFF, and mixing SEH-compatible
and SEH-incompatible objects will result in an invalid executable, so is the
marker.
If all input files are compatible with SEH, LLD emits a SEH table. SEH table
needs to be pointed by Load Configuration strucutre, so when emitting a SEH
table LLD emits it too. The address of a Load Configuration will be stored to
the file header.
llvm-svn: 202248
target_link_libraries(INTERFACE) doesn't bring inter-target dependencies in add_library,
although final targets have dependencies to whole dependent libraries.
It makes most libraries can be built in parallel.
target_link_libraries(PRIVATE) is used to shaared library.
Each dependent library is linked to the target.so, and its user will not see its grandchildren.
For example,
- libclang.so has sufficient libclang*.a(s).
- c-index-test requires just only libclang.so.
FIXME: lld is tweaked minimally. Adding INTERFACE in each library would be better thing.
llvm-svn: 202241
This restores the debug output to how it was before r197727 broke it. This
went undetected because the corresponding test was never run due to broken
feature detection.
llvm-svn: 202079
The sections .rela/.rel.(*) have a alignment of 2 in the final image created by
the linker. This needs to be properly set to the right alignment depending on
the architecture(32/64bits).
llvm-svn: 201740
The comments in the files that described the file name as part of each file
header ran over 80 columns, which clang-format split over multiple lines.
This commit fixes to make them appear properly.
llvm-svn: 200181
ELFFile would be a class that rest of the targets would derive from.
To keep the implementation clean, separate the implementation from
rest of the Header file.
llvm-svn: 200168
Add new virtual virtual function `isRelaOutputFormat` to the
`ELFLinkingContext` class. Call this function everywhere we need to
select a relocation table format.
Patch reviewed by Shankar Easwaran and Rui Ueyama.
llvm-svn: 199973
method returns the DefaultLayout::_segments field. The type of this field is
a vector of Segment<ELFT>* pointers. This type cannot be implicitly casted to
the range<ChunkIter>.
llvm-svn: 199233
The main goal of this patch is to allow "mach-o encoded as yaml" and "native
encoded as yaml" documents to be intermixed. They are distinguished via
yaml tags at the start of the document. This will enable all mach-o test cases
to be written using yaml instead of checking in object files.
The Registry was extend to allow yaml tag handlers to be registered. The
mach-o Reader adds a yaml tag handler for the tag "!mach-o".
Additionally, this patch fixes some buffer ownership issues. When parsing
mach-o binaries, the mach-o atoms can have pointers back into the memory
mapped .o file. But with yaml encoded mach-o, name and content are ephemeral,
so a copyRefs parameter was added to cause the mach-o atoms to make their
own copy.
llvm-svn: 198986
Currently LLD always print a warning message if the same symbol is specified
more than once for /export option. It's a bit annoying because specifying the
same symbol with compatible options is actually safe and considered as a
normal use case. This patch makes LLD to warn only when incompatible export
options are given.
llvm-svn: 198104
Each export symbol descriptor has unique name attribute, so std::set is
better container than std::vector for it. No functionality change.
llvm-svn: 198102
Currently .drectve section contents are parsed after other sections are parsed.
That order may result in wrong results if other sections depend on command line
options in the directive section.
For example, if a weak symbol is defined using /alternatename option in the
directive section, we have to read it first and then read the text section
contents. Otherwise the weak symbol won't be defined.
This patch changes the order to fix the issue.
llvm-svn: 198071
There are many object files in the standard library who have empty .drective
sections. Parsing the empty string is not wrong but a waste.
llvm-svn: 198067
Subsystem field in the PE/COFF file header has no meanining for the DLL.
It looks like MSVC link.exe sets the default subsystem (Windows GUI) to
the field if no /subsystem option is specified.
llvm-svn: 198015
It will configure resonable defaults for other settings in the
MachOLinkingContext object based on the parameters.
Patch by Joe Ranieri
llvm-svn: 197851
If a symbol in an import library is marked as "data", the linker will not
create a jump table entry for the symbol, since jump table makes sense only
for a symbol pointing to a function.
I don't think NONAME attribute has a meaning when creating an import library.
The attribute is emitted for debugging purpose.
llvm-svn: 197803
If the linker is instructed to create a DLL, it will also create an import
library (.lib file) to describe the symbols exported by the DLL. This patch is
to create the import library file.
There is a convenient command "lib.exe" which can create an import library
from a module definition file (.def file). The command is used in this patch.
llvm-svn: 197801
Default ordinals were assigned in EdataPass, and the assigned values were
then discarded in the pass. No code other than EdataPass would not be able
to get all of the information about ordinals. That's not ideal since I'm
writing code to emit an Import Library file, which also needs ordinals.
This is a patch to move the code to assign default ordinals from EdataPass
to LinkingContext::verify(), so that assigned ordinals will be available
anywhere.
No functionality change.
llvm-svn: 197797
The main changes are in:
include/lld/Core/Reference.h
include/lld/ReaderWriter/Reader.h
Everything else is details to support the main change.
1) Registration based Readers
Previously, lld had a tangled interdependency with all the Readers. It would
have been impossible to make a streamlined linker (say for a JIT) which
just supported one file format and one architecture (no yaml, no archives, etc).
The old model also required a LinkingContext to read an object file, which
would have made .o inspection tools awkward.
The new model is that there is a global Registry object. You programmatically
register the Readers you want with the registry object. Whenever you need to
read/parse a file, you ask the registry to do it, and the registry tries each
registered reader.
For ease of use with the existing lld code base, there is one Registry
object inside the LinkingContext object.
2) Changing kind value to be a tuple
Beside Readers, the registry also keeps track of the mapping for Reference
Kind values to and from strings. Along with that, this patch also fixes
an ambiguity with the previous Reference::Kind values. The problem was that
we wanted to reuse existing relocation type values as Reference::Kind values.
But then how can the YAML write know how to convert a value to a string? The
fix is to change the 32-bit Reference::Kind into a tuple with an 8-bit namespace
(e.g. ELF, COFFF, etc), an 8-bit architecture (e.g. x86_64, PowerPC, etc), and
a 16-bit value. This tuple system allows conversion to and from strings with
no ambiguities.
llvm-svn: 197727
Executable files do not use a string table, so section names longer than 8
characters are not permitted. Long section names should just be truncated.
llvm-svn: 197470
If NONAME option is given for an export, that symbol will be exported only by
its ordinal. LLD will not emit the symbol name to the export table.
llvm-svn: 197371
OrdinalBase is an addend to the ordinals. We used to always set 1 to the field.
Although it produced a valid a DLL export table, it'd be a waste if the first
ordinal does not start with 1 -- we had to have NULL fields at the beginning of
the export address table. By setting the ordinal base, we can eliminate the
NULL fields.
llvm-svn: 197367
You can specify exported function's ordinal by /export:func,@<number> command
line option, but LLD ignored the option until now. This patch implements the
feature.
Ordinal is basically the index into the exported function address table. So,
for example, if /export:foo,@42 is specified, the linker writes foo's address
to 42th entry in the address table. Windows supports import-by-ordinal; you
can not only import a function by name, but by its ordinal. If you want to
allow your DLL users to import your functions by their ordinals, you need to
make sure that your functions are always exported with the same ordinals.
This is the feature for that situation.
llvm-svn: 197364
The following are the most significant peculiarities of MIPS target:
- MIPS ABI requires some special tags in the dynamic table.
- GOT consists of two parts local and global. The local part contains
entries refer locally visible symbols. The global part contains entries
refer global symbols.
- Entries in the .dynsym section which have corresponded entries in the
GOT should be:
* Emitted at the end of .dynsym section
* Sorted accordingly to theirs GOT counterparts
- There are "paired" relocations. One or more R_MIPS_HI16 and R_MIPS_GOT16
relocations should be followed by R_MIPS_LO16 relocation. To calculate
result of R_MIPS_HI16 and R_MIPS_GOT16 relocations we need to combine
addends from these relocations and paired R_MIPS_LO16 relocation.
The patch reviewed by Michael Spencer, Shankar Easwaran, Rui Ueyama.
http://llvm-reviews.chandlerc.com/D2156
llvm-svn: 197342
The only data in .edata whose length varies is the string. This patch moves
all the strings to the end of the section, so that 16-bit or 32-bit integers
are aligned on correct boundaries.
llvm-svn: 197213
This is the first patch to emit data for the DLL export table. The DLL export
table is the data used by the Windows loader to find the address of exported
function from DLL. With this patch, LLD is able to emit a valid DLL export
table which the Windows loader can interpret and load.
The data structure of the DLL export table is described in the Microsoft
PE/COFF Specification, section 5.3.
DLL support is not complete yet; the linker needs to emit an import library
for a DLL, otherwise the linker cannot link against the DLL. We also do not
support export-only-by-ordinal yet.
llvm-svn: 197212
DLLNameAtom is an atom whose content is a string. IdataAtom is not going to
be the only place we need such atom, so I want to generalize it.
llvm-svn: 197137
I'm planning to create a new pass for the DLL export table, and I want to use
the class both from IdataPass and the new pass, EdataPass. So move the class to
a common place.
llvm-svn: 197132
Before this patch, we had the following class hierarchy.
Chunk -> AtomChunk -> SectionChunk -> GenericSectionChunk
-> BaseRelocChunk
-> HeaderChunk
Chunk represented the generic concept of contiguous range in an output
file. AtomChunk represented a chunk consists of atoms.
That class hierarchy had many issues: 1) BaseRelocChunk does not really
consist of atoms, so inheriting from AtomChunk was plainly wrong, and 2)
the hierarchy is unecessarily too deep.
This patch correct them. The new hierachy is shown below.
Chunk -> SectionChunk -> AtomChunk
-> BaseRelocChunk
-> HeaderChunk
In the new hierarchy, AtomChunk represents a chunk consists of atoms. Other
types of sections (currently only BaseRelocChunk) should inherit directly
from SectionChunk.
llvm-svn: 197038
This patch is to basically move the functionality to construct Data Directory
from IdataPass to WriterPECOFF.
Data Directory is a part of the PE/COFF header and contains the addresses of
the import tables.
We used to represent the link from Data Directory to the import tables as
relocation references. The idea behind it is that, because relocation
references are processed by the Writer, we wouldn't have to do anything special
to fill the addresses of the import tables. I thought that the addresses would
be set "automatically".
But it turned out that that design made the pass and the writer rather
complicated. In order to make relocation references between Data Directory to
the import tables, these data structures needed to be represented as Atom.
However, because Data Directory is not a section content but a part of the
PE/COFF header, it did not fit well as an Atom. So we ended up having
complicated code both in IdataPass and the writer.
This patch simplifies it.
One side effect of this patch is that we now have ".idata.a", ".idata.d" and
"idata.t" sections for the import address table, the import directory table,
and the import lookup table. The writer looks for the sections by name to find
the start addresses of the sections. We probably should have a better way to
find a specific atom from the core linking result, but currently using the
section name seems to be the easiest way to do that. The Windows loader do not
care about the import table's section layout.
llvm-svn: 197016
If section size is not multiple of 512, the writer added NULL bytes at the end
of it to make it so. That is not required by the PE/COFF spec, and the MSVC's
linker does not do that too. So we don't need to do that, too.
llvm-svn: 197002
Code to create COFF section header was scattered across many member functions
of SectionChunk. Consolidate it to a member function of SectionHeaderTableChunk.
llvm-svn: 196895
/ALTERNATENAME is a rarely-used, undocumented command line option that is
needed to link LLD for release build. It seems that the option is for defining
an weak alias; /alternatename:foo=bar defines weak symbol "foo" for "bar".
If "foo" is defined in an input file, it'll be linked normally and the command
line option will have no effect. If it's not defined, "foo" will be handled
as an alias for "bar".
This patch implements the parser for the option. The actual weak alias handling
will be implemented in a separate patch.
llvm-svn: 196743
GroupedSectionsPass was a complicated pass. That pass's job was to reorder
atoms by section name, so that the atoms with the same section prefix will be
emitted consecutively to the executable. The pass added layout edges to atoms,
and let the layout pass to actually reorder them.
This patch simplifies the design by making GroupedSectionPass to directly
reorder atoms, rather than adding layout edges. This resembles ELF's
ArrayOrderPass.
This patch improves the performance of LLD; it used to take 7.1 seconds to
link LLD with LLD on my Macbook Pro, but it now takes 6.1 seconds.
llvm-svn: 196628
Emitting idata atoms to their own section would make debugging easier.
The Windows loader do not really care about whether the DLL import table is
in .rdata or its own .idata section, so there is no change in functionality.
llvm-svn: 196458
This is a patch to let the PECOFF writer to use the information passed
by the parser for /section option. The implementation of /section should
now be complete.
llvm-svn: 195893
/MERGE option is a bit complicated for many reasons. Firstly, it takes both
positive and negative arguments. That means we have to have one of three
distinctive values (set, clear or unchange) for each permission bit. In this
patch we represent the three values using two bitmasks.
Secondly, the permissions specified by the parameter is bitwise or-ed with the
default permissions of a section. There is an exception for that rule; if one
of READ, WRITE or EXECUTE bit is specified, unspecified bits need to be
cleared. (So if you specify only WRITE for example, the resulting section will
not have WRITE nor EXECUTE bits.)
Lastly, multiple /merge options are allowed.
llvm-svn: 195882
Atom ordinals are the indeces in a file. Currently the PECOFF reader assigns
ordinals for each section, so it's (incorrectly) assigning duplicate ordinals.
llvm-svn: 195852
Instead of having multiple SectionChunks for each section (.text, .data,
.rdata and .bss), we could have one chunk writer that can emit any sections.
This patch does that -- removing all section-sepcific chunk writers and
replace them with one "generic" writer.
This change should simplify the code because it eliminates similar-but-
slightly-different classes.
It also fixes an issue in the previous design. Before this patch, we could
emit only limited set of sections (i.e. .text, .data, .rdata and .bss). With
this patch, we can emit any sections.
llvm-svn: 195797
According to the PE/COFF spec, a section with IMAGE_SCN_LNK_INFO should only
appear in an object file, and not allowed in an executable. So I believe
treating it as the same way as IMAGE_SCN_LNK_INFO is the right thing.
llvm-svn: 195692
This patch won't change the output because the layout of linker internal
atoms is forced by layout-{before,after} references. Ordinals of the linker
internal atoms are not currently used. (That's why it's working even if there
are atoms having the same ordinals.)
llvm-svn: 195610
Change the attribute from sectionBasedOnContent to sectionCustomRequired
because its the right attribute for atoms read from COFF files to have.
COFF atoms should basically be emitted to the section having the same name
as input. Permissions/attributes should not affect that.
There's no functionality change because the writer doesn't yet use the
section name. The writer will be modified in a following patch, so that atoms
are written to its customSectionName()'s section.
llvm-svn: 195595
Looks like -L paths are not positional. They need to be added to a list of
search paths and those needs to be searched when lld looks for a library.
llvm-svn: 195594
If /subsystem option is not specified, the linker needs to infer it from the
entry point function. If "main" or "wmain" is defined, it's a console
application. If "WinMain" or "wWinMain" is defined, it's a GUI application.
llvm-svn: 195592
This adds LinkerScript support by creating a type Script which is of type
FileNode in the InputGraph. Once the LinkerScript Parser converts the
LinkerScript into a sequence of command, the commands are handled by the
equivalent LinkerScript node for the current Flavor/Target. For ELF, a
ELFGNULdScript gets created which converts the commands to ELF nodes and ELF
control nodes(ELFGroup for handling Group nodes).
Since the Inputfile type has to be determined in the Driver, the Driver needs
to determine the complete path of the file that needs to be processed by the
Linker. Due to this, few tests have been removed since the Driver uses paths
that doesnot exist.
llvm-svn: 195583
NativeReferenceIvarsV1 cannot handle more than 65535 relocation targets
because its field to point to the target table is of type uint16_t. Because
of that limitation, the LLD couldn't link a file containing more than 65535
relocations. 65535 is not a big number - the LLD couldn't even link itself
with V1.
This patch solves the issue by adding NativeReferenceIvarsV2 support. The
new structure has more bits for the target table, so it can handle a large
number of relocatinos.
V2 structure is larger than V1. In order to prevent file bloating, V2 format
is used only when the resulting file cannot be represented in V1 format. The
writer and the reader support both V1 and V2 formats.
Differential Revision: http://llvm-reviews.chandlerc.com/D2217
llvm-svn: 195270
The maximum number of references the file with NativeReferenceIvarsV1 can
contain is 65534. If a file larger than that is converted to Native format,
the conversion will fail without any error message. This caused a subtle bug
that the LLD would produce a broken executable only when input files contain
too many references.
This issue exists since the RoundTripNativeTest is introduced in r193585. Since
then, it seems that nobody have linked any program having more than 65534
relocations with the LLD. Otherwise we would have found it earlier.
llvm-svn: 194987
This patch does not change the meaning of the program, but if something's wrong
in the linker or the compiler and the control reaches to the gap of imported
function table, it will stop immediately because of the presence of INT3. If
NOP, it'd fall through to the next call instruction, which is usually a
completely foreign function call.
llvm-svn: 194860
The result of sizeof(SymbolTable<ELFT>::SymbolEntry) in DynamicSymbolTable
<ELFT>::write() was different from the same expression in RelocationTable
<ELFT>::write(), although the same template parameters were passed. They were
40 and 32, respectively. As a result, the same vector was treated as a
vector of 40 byte values in some places and a vector of 32 values in other
places. That caused an weird issue, resulting in collapse of the rela.dyn
section.
I suspect that this is a padding size calculation bug in MSVC 2012, but I
may be wrong. Reordering the fields to eliminate padding seems to fix the
issue.
llvm-svn: 194349
This patch adds support for converting normalized mach-o to and from binary
mach-o. It also changes WriterMachO (which previously directly wrote a
mach-o binary given a set of Atoms) to instead do it in two steps. The first
step uses normalizedFromAtoms() to convert Atoms to normalized mach-o, and the
second step uses writeBinary() which to generate the mach-o binary file.
llvm-svn: 194167
These fields are for /align option. Section alignment can be set per-section
basis with /section option too. In order to avoid name conflicts, rename the
existing identifiers to become more specific. No functionality change.
llvm-svn: 194160
We wrapped the linker internal file with a virtual archive file, so that the
linker internal file was linked only when it's actually used. This was to avoid
__ImageBase being included to the resulting executable. __ImageBase used to
occupy four bytes when emitted to executable.
And then it turned out that the implementation of __ImageBase was wrong -- it
shouldn't have been a regular atom but an absolute atom. Absolute atoms point
to some memory location, but they don't occupy disk space themselves. So it
wouldn't increase executable size (except the symbol table.) That means that
it's OK to link the linker internal file unconditionally.
So this patch does that, removing the wrapper archive file. Doing this
simplifies the code.
llvm-svn: 194127
n_desc field in MachO string table was not initialized. On Unix,
test/darwin/hello-world.objtxt did not fail because I think an nlist object
is always allocated to a fresh heap initialized with zeros. On Windows,
uninitialized fields are filled with 0xCC when compiled with /GZ. Because
of that the test was failing on Windows.
llvm-svn: 193909
Bugs that would be caught by this assertion would also be caught by
RoundTripYAMLPass test. We've enabled the pass for PECOFF, so we can remove
this.
llvm-svn: 193886
The data directory in the PE/COFF header consisted of list of data directory
atoms. This patch changes it -- now there's only one data directory entry that
contains former data directories. That's easier to handle in the writer as well
as to write to/read from YAML/Native files. The main purpose of this refactoring
is to enable RoundTrip tests for PE/COFF.
There's no functionality change.
llvm-svn: 193854
This reverts commit r193479.
The atoms are already added to the file, so re-adding them caused the YAML
writer to write the same atoms twice. That made the YAML reader to fail with
"duplicate atom name" error.
This is not the only error we've got for RoundTripYAMLPass for PECOFF, so we
cannot enable the test yet. More fixes will come.
Differential Revision: http://llvm-reviews.chandlerc.com/D2069
llvm-svn: 193762
Enable this for the following flavors
a) core
b) gnu
c) darwin
Its disabled for the flavor PECOFF. Convenient markers are added with FIXME
comments in the Driver that would be removed and code removed from each flavor.
llvm-svn: 193585
__ImageBase is an absolute symbol whose address is the same as the image base
address. What we did before this patch was to create __ImageBase symbol as a
symbol whose *contents* (not location) is the image base address, which is
clearly wrong.
llvm-svn: 193565
On discussing this with Nick, it looks like the StubAtoms
that contain a lazyImmediate reference kind should be null
and the location needs to be fixed up later with some value
that is an offset into the __LINKEDIT segment.
The drawback is that it allows yaml files with references
that expect a target to be considered without one.
This results in bad yaml files that would need to be handled
in the YAML Reader.
Inorder to fix this, the Stub Atoms use a dummy target such
as itself.
llvm-svn: 193476
/merge:<from>=<to> option makes the linker to combine "from" section to "to"
section. This patch is to parse the option. The actual feature will be
implemented in a subsequent patch.
llvm-svn: 193454
Disable tests to be run with REQUIRES: disable. Note disable is not added to the
config by the test runner Mkaefiles, so essentially disables the test.
Code changes would be required to fix these tests :-
test/darwin/hello-world.objtxt
test/elf/check.test
test/elf/phdr.test
test/elf/ppc.test
test/elf/undef-from-main-dso.test
test/elf/X86_64/note-sections-ro_plus_rw.test
test/pecoff/alignment.test
test/pecoff/base-reloc.test
test/pecoff/bss-section.test
test/pecoff/drectve.test
test/pecoff/dynamic.test
test/pecoff/dynamicbase.test
test/pecoff/entry.test
test/pecoff/hello.test
test/pecoff/imagebase.test
test/pecoff/importlib.test
test/pecoff/lib.test
test/pecoff/multi.test
test/pecoff/reloc.test
test/pecoff/weak-external.test
llvm-svn: 193300
This patch won't change LLD's behavior because it's a temporary file and
LLD does not use the file extension to determine file type. But using the
correct file extension is a good thing.
llvm-svn: 193211
We should dead-strip atoms only if they are created for COMDAT symbols. If we
remove non-COMDAT atoms from a binary, it will no longer be guaranteed that
the binary will work correctly.
In COFF, you can manipulate the order of section contents in the resulting
binary by section name. For example, if you have four sections
.data$unique_prefix_{a,b,c,d}, it's guaranteed that the contents of A, B, C,
and D will be consecutive in the resulting .data section in that order.
Thus, you can access B's and C's contents by incrementing a pointer pointing
to A until it reached to D. That's why we cannot dead-strip B or C even if
no one is directly referencing to them.
Some object files in the standard library actually use that technique.
llvm-svn: 193017
INT 3 (machine code 0xCC) will raise an interrupt when executed. That is better
for filling the gap than NOP because we want to stop the execution immediately
when the control reached to non-code address.
llvm-svn: 192945
This change removes code in various places which was setting the File Ordinals.
This is because the file ordinals are assigned by the way files are resolved.
There was no other way than making the getNextFileAndOrdinal be set const and
change the _nextOrdinal to mutable.
There are so many places in code, that you would need to cleanup to make
LinkingContext non-const!
llvm-svn: 192280
This is the first step in how I plan to get mach-o object files support into
lld. We need to be able to test the mach-o Reader and Write on systems without
a mach-o tools. Therefore, we want to support a textual way (YAML) to represent
mach-o files.
MachONormalizedFile.h defines an in-memory abstraction of the content of mach-o
files. The in-memory data structures are always native endianess and always
use 64-bit sizes. That internal data structure can then be converted to or
from three different formats: 1) yaml (text) encoded mach-o, 2) binary mach-o
files, 3) lld Atoms.
This patch defines the internal model and uses YAML I/O to implement the
conversion to and from the model to yaml. The next patch will implement
the conversion from normalized to binary mach-o.
This patch includes unit tests to validate the yaml conversion APIs.
llvm-svn: 192147
Changes :-
a) Functionality in InputGraph to insert Input elements at any position
b) Functionality in the Resolver to use nextFile
c) Move the functionality of assigning file ordinals to InputGraph
d) Changes all inputs to MemoryBuffers
e) Remove LinkerInput, InputFiles, ReaderArchive
llvm-svn: 192081
This will eventually need to be refactored to better handle COPY relocations,
as other relocations can also generate them. I'm not yet sure the exact
circumstances in which they are needed yet.
llvm-svn: 191567
This patch inverts the return value of these functions, so that they return
"true" on success and "false" on failure. The meaning of boolean return value
was mixed in LLD; for example, InputGraph::validate() returns true on success.
With this patch they'll become consistent.
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1748
llvm-svn: 191341
Summary:
This patch changes WriterPECOFF to actually write down the address instead of ignoring it.
Also, it changes the order of adding the BaseReloc chunk as otherwise the address wasn't set yet.
I think a better way of doing it would be to change DataDirectoryAtom to create a Reference
instead of using a number, and to change IdataPass accordingly, but I'm not sure how to do that.
Reviewers: ruiu
Reviewed By: ruiu
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1743
llvm-svn: 191220
Summary: This patch changes WritePECOFF to calculate the value of the SizeOfHeaders PE header field instead of just using 512.
Reviewers: rui314, ruiu
Reviewed By: ruiu
CC: llvm-commits, ruiu
Differential Revision: http://llvm-reviews.chandlerc.com/D1708
llvm-svn: 191212
This adds an option --output-filetype that can be set to either
YAML/Native(case insensitive). The linker would create the outputs
associated with the type specified by the user.
Changes all the tests to use the new option.
llvm-svn: 191183
This also makes it support debugging executables built with lld.
Initial patch done by Bigcheese. This is only a revised patch to
have the functionality in the Writer.
llvm-svn: 191032
Base relocation block should be aligned on a 32-bit boundary. While the PECOFF
spec mentions only aligning the blocks, and not padding them, link.exe seems
to add an extra IMAGE_REL_I386_ABSOLUTE entry (just a zeroed WORD) in order to
pad the blocks.
Patch by Ron Ofir.
llvm-svn: 190951
This sets the sectionChoice property for DefinedAtoms. The output section name
is derived by the property of the atom. This also decreases native file size.
Adds a test.
llvm-svn: 190840
This patch changes lld to go through all sections while calculating the size
for SizeOfCode, SizeOfInitializedData and SizeOfUninitializedData fields in the
PE header, instead of using only a small set of hard-coded sections.
This only really changes SizeOfInitializedData which didn't include .reloc
section before this patch.
Patch by Ron Ofir.
llvm-svn: 190799
This patch sets the IMAGE_SCN_MEM_DISCARDABLE characteristic to the base
relocations section in order to match MS PECOFF specification.
Patch by Ron Ofir.
llvm-svn: 190798
There was a bug that if a section has an alignment requirement and there are
multiple symbols at offset 0 in the section, only the last atom at offset 0
would be aligned properly. That bug would move only the last symbol to an
alignment boundary, leaving other symbols unaligned, although they should be at
the same location. That caused a mysterious SEGV error of the resultant
executable.
With this patch, we manage all symbols at the same location properly, rather
than keeping the last one.
llvm-svn: 190724
Alignment(1) does not mean that the atom should be aligned on a 1 byte
boundary but on a 2^1 boundary. So, atoms without any specific alignment
requirements should have Alignment(0).
llvm-svn: 190723
This handles multiple weak symbols which appear back to back. This fix is needed
which otherwise will lead to symbols getting initialized to arbitrary values.
There was a constructor/destructor test that really triggered this to be fixed
on X86_64.
Adds a test.
llvm-svn: 190658
So that we can determine what the target architecture is. Adding this
field does not mean that we are going to support non-i386 architectures
soon; there are many things to do to support them, and I'm focusing on
i386 now. But this is the first step toward multi architecture support.
llvm-svn: 190627
In COFF, an undefined symbol can have up to one alternative name. If a symbol
is resolved by its regular name, then it's linked normally. If a symbol is not
found in any input files, all references to the regular name are resolved using
the alternative name. If the alternative name is not found, it's a link error.
This mechanism is called "weak externals".
To support this mechanism, I added a new member function fallback() to undefined
atom. If an undefined atom has the second name, fallback() returns a new undefined
atom that should be used instead of the original one to resolve undefines. If it
does not have the second name, the function returns nullptr.
Differential Revision: http://llvm-reviews.chandlerc.com/D1550
llvm-svn: 190625
Mangling scheme varies on platform, and prepending an underscore is valid only
on 32-bit x86. Added a method to mangle name to PECOFFLinkingContext and use
it to avoid hard coding mangled names.
llvm-svn: 190585
attribute in LinkerInput to isWholeArchive and use that for deciding
whether library archives should be expanded. Implement the -all_load
option of the Darwin linker using this flag and drop the support for it
in GNU mode.
llvm-svn: 190275
It looks like there is a possibility of seeing RO/RW note sections
and we would need to create an appropriate RO/RW segment associated
with them.
Adds a test too.
llvm-svn: 189907
The compiler is allowed to add a linker option starting with -?<name> to
.drectve section. If the linker can interpret -<name>, it's processed as if
there's no question mark there. If not, such option is silently ignored.
This is a COFF's feature to allow the compiler to emit new linker options
while keeping compatibility with older linkers.
llvm-svn: 189897
This changes the interface of createLinkerInput to use ErrorOr, so that
errors from the linker can be captured.
Also adds a convenience function for error strings to be returned from
file nodes.
llvm-svn: 189871
This creates .init_array/.fini_array section for X86_64 ELF
targets and executes init/fini functions specified by the
-init/-fini options respectively.
llvm-svn: 189719
This adds an API to the LinkingContext for flavors to add Internal files
containing atoms that need to appear in the YAML output as well, when -emit-yaml
switch is used.
Flavors can add more internal files for other options that are needed.
llvm-svn: 189718
We added layout edges to the head atoms in grouped sections. That was wrong,
because the head atom needs to be followed by the other atoms in the *same*
section, not by the other section contents. With this patch, layout edges are
added from tail atom, which is the last atom in a section, to head atom.
llvm-svn: 189573
Because of a bug, the last atom of each section contained a garbage at the
end of its data. In most cases the garbage is harmless but it could have cause
SEGV.
llvm-svn: 189572
We were creating undefined atoms for common symbols by mistake. That did not
lead to a link failure, for undefined atoms would be resolved by common symbols
in the same file, but that's a waste of resource.
llvm-svn: 189534
We scanned the symbol table twice; first to gather all regular symbols, and
second to process aux symbols. That's a bit inefficient and complicated. We
can instead cache aux symbols in the first pass, to eliminate the need of the
second pass.
llvm-svn: 189525
The cleanup includes :-
* Rename ambiguous Header class to ELFHeader
* Convert Chunk contentype and kind to be a enumerated class
* Remove functions that are not being used, avoids future confusion
llvm-svn: 189209
This change processes fini_array section in addition to processing
init_array sections. This also makes functions registered at compile
time for initialization and finalization to be run during execution
llvm-svn: 189196
There may be relocations that may be pointing to the section
even if the section sizes are 0. We shouldnot ignore them
for that regard.
llvm-svn: 189139
typeTLV content type is used by Darwin to represent thread local
storage. A new contentType has to be made to represent ELF
thread local storage data. These have been set to
- typeThreadZeroFill (represents TBSS storage)
- typeThreadData (represents TDATA storage)
llvm-svn: 189137
BSS atoms dont take any file space in the Input file. They are associated
with a contentType(typeZeroFill). Similiar zero fill types also exist which
have the same meaning in terms of occupying file space in the Input.
These atoms have to be handled seperately when writing to the
lld's intermediate file or the lld test infrastructure.
Also adds a test.
llvm-svn: 189136
The import name is not always the same as the symbol name. If the name/type
field in the import header is NOPREFIX or UNDECORATE, we need to strip some
characters from symbol to get its import name.
The Microsoft PE/COFF spec is vague if symbol contains more than two
consecutive characters to be stripped. We used to strip all characters,
but it doesn't seem right as we couldn't link against the system library
because of this name mangling. Looks like we shouldn't strip more than one
character.
llvm-svn: 188154
__ImageBase is a symbol having 4 byte integer equal to the image base address
of the resultant executable. The linker is expected to create the symbol as if
it were read from a file.
In order to emit the symbol contents only when the symbol is actually
referenced, we created a pseudo library file to wrap the linker generated
symbol. The library file member is emitted to the output only when the member
is actually referenced, which is suitable for our purpose.
llvm-svn: 188052
The COMDAT section is a section with a special attribute to tell the linker
whether the symbols in the section are allowed to be merged or not. This patch
add a function to interpret the COMDAT data and set "merge" attribute to the
atoms accordingly.
LLD supports multiple policies to merge atoms; atoms can be merged by name or
by content. COFF supports them, and in addition to that, it supports
choose-the-largest-atom policy, which LLD currently does not support. I simply
mapped it to merge-by-name attribute for now, but we eventually have to support
that policy in the core linker.
llvm-svn: 188025
Also change some local variable names: "ti" -> "context" and
"_targetInfo" -> "_context".
Differential Revision: http://llvm-reviews.chandlerc.com/D1301
llvm-svn: 187823
The aim of this patch is to reduce the dependency from COFFDefinedAtom
to COFF structs defined in llvm/Object/COFF.h. Currently many attributes
of the atom are computed in the atom. That provide a simple interface but
does not work well in some cases.
There are some cases that the same type atom is created from different
parts of a COFF file. One example is the BSS atom, which can be created
from the defined symbol in the .bss section or from the undefined symbol.
Computing attributes from different sources in the atom complicates the
code. We should compute it outside the atom.
In the next patch, I'll move more code from Atoms.h to ReaderCOFF.cpp.
llvm-svn: 187681
Summary:
The .drectve section contains linker command line options, and the linker is
expected to interpret them as if they were given via the command line. In this
patch, the command line parser in the driver is called from the object file
reader to parse the string.
I think this patch is important, because this is the first step towards mutable
TargetInfo. We had a discussion about that on llvm-commits mailing list before.
I haven't removed "const" from the function signature yet. Instead, I just use
cast to remove "const". This is a temporary aid for an experiment. If we don't
see any issue with this mutable TargetInfo appraoch, I'll change the function
signature, and rename the class LinkerContext from TargetInfo.
Reviewers: kledzik
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1246
llvm-svn: 187677
For an invalid input we should not call report_fatal_error(), because
when LLD is used as a library, we don't want to kill the whole app
because of a malformed input.
llvm-svn: 187673
A instance of the class always represents a BSS atom, so we don't need
to look at the symbol or the section to retrieve its attributes.
llvm-svn: 187643
The BSS atom is similar to the regular defined atom, but it's different
in the sense that it does not have contents. Until now we assumed all the
defined atoms have its contents. That did not fit well to the BSS atom.
llvm-svn: 187453
This patch removes hacky mangle() function, which strips all decorations
uncondtitionally. LLD now interprets Import Name/Type field in the import
library properly as described in the Microsoft PE/COFF Spec.
llvm-svn: 187388
Member functions to read the symbol table had too many parameters to propagate
all the temporary information from one to another. By storing the information
to data members, we can simplify the function signatures and improve the
readability.
llvm-svn: 187321
Some sections, such as with IMAGE_SCN_LNK_REMOVE attribute, is skipped
in the first pass. Such sections need to be skipped in the latter passes.
llvm-svn: 187281
Missing files will be reported as errors in the later pass, so this patch
does not change the behavior of the LLD linker, but it helps writing unit
tests for the driver.
llvm-svn: 187256
The /include command line option is equivalent to Unix --undefined
option, which forces the linker to resolve the given symbol name
as if it's an unresolved symbol in one of its input files. This feature
is used to link an additional object file or a shared library that no
input files refer to.
llvm-svn: 187084
This is a follow up patch for r186336.
Reviewers: LegalizeAdulthood
CC: silvas, llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1144
llvm-svn: 186384
Emit .reloc section. This is the first step to support DLL creation. The
executable doesn't need .reloc section, but the DLL does.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1126
llvm-svn: 186336
This patch adds a new pass, IdataPass, to transform shared atom references
to real references and to construct the .idata section data. With this patch
lld can produce a working Hello World program by linking it against
kernel32.dll and user32.dll.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1096
llvm-svn: 186071
Contents of ".reloc" section depends on the addresses of other sections, so
the section cannot be created until all the other sections are created and get
their memory addresses (RVAs). That means that computation of section size
needs to be at least two pass.
Techynically there's no reason to compute it all at once, but instead we can
compute the address of a section as added to the output file. Doing so helps
us to create ".reloc" section.
llvm-svn: 185902
The optional data directory header contains addresses to some atoms such
as the import address table in data section. Such fields can naturally
be set by relocation if we make the optional data driectory as an atom.
Currently we assume that atoms are always in a section, so we can't create
a file header with atoms. This patch separates section chunk from atom
chunk, to allow atom-based file header.
llvm-svn: 185521
A hint is an index of the export pointer table in a DLL, at which
PE/COFF loader starts looking for a symbol name. The import library
comes with hints and symbol pairs, and as long as hints are in sync
with the actual symbol table in DLL, the symbols will be resolved
quickly. So, we shouldn't ignore hints but propagate them to an output.
llvm-svn: 185516
In order to support linking against DLL, the linker needs to create defined
atoms for jump tables and etc. Because such atoms are not read from a file,
they lack some information such as an ordinal. With this patch, COFFDefinedAtom
is split into two classes; one is the base class of all COFF defined atoms, and
another is a concrete class for atoms read from file. More classes inheriting
COFFBaseDefinedAtom will be added for jump tables and etc.
llvm-svn: 185195
This renames variable name to reflect initial undefined symbols that are
defined by the linker -u option.
This doesnot change any functionality in lld, and updates code to reflect
Nick's comment.
llvm-svn: 184682
Extract atom definitions as Atoms.h so that we can use them in other files.
Also applied clang-format to Atoms.h.
Reviewers: shankarke
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D995
llvm-svn: 184124
This is the first patch toward full DLL support. With this patch, lld can
read .lib file for a DLL.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D987
llvm-svn: 184101
This change adds functionality to add more sections like .gcc_except_table,
.data.rel.local, .data.rel.ro into the default section map, so that they are
all merged into appropriate output sections.
This also makes c++ static binaries comparable to what you get with the default
linker.
Adds a test for testing the functionality.
llvm-svn: 184071
With this patch, it can now resolve relocations in the same output file.
"Hello world" program does not still work because call to the DLL routine
is not supported yet.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D985
llvm-svn: 184063
Archive file in Windows has file extension of ".lib" but the file format is
in fact the same as Unix. It's an ar archive holding multiple .obj files.
The existing archive reader can read .lib files.
llvm-svn: 184036
Summary:
COFFReference class is defined to represent relocation information for
COFFDefinedAtom, as ELFReference for ELFDefinedAtom. ReaderCOFF can now
read relocation entries and create COFFReferences accordingly.
I need to make WriterPECOFF to handle the relocation references created by
the reader, but this patch is already big, so I think it's probably better
to get it reviewed now.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D976
llvm-svn: 183964
Architecture specific code should reside in architecture specific directory
not in Atom. Looks like there are no efforts being made at this moment to
support ARM, so let's remove it for now.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D959
llvm-svn: 183877
The yaml reader is not specific to any file format. This patch moves
it to TargetInfo and makes validate a non virtual interface so that it
can be constructed from a single location.
The same method will be used to create a reader for llvm bitcode
files.
llvm-svn: 183740
Split FileCOFF's constructor into mainly two private methods.
One method is responsible to iterate over symbol tables, and other
method is to atomize defined atoms. This is for readability and
no changes in functionality.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D940
llvm-svn: 183708
- Split createAtom() in lib/ReaderWriter/ELF/File.h into small methods.
- Added comments to code in other methods.
No functionality changes.
Reviewers: shankarke
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D921
llvm-svn: 183696
This fixes a recent regression (r183338). Stripped elf files (like installed
crtn.o for example), are not required to have a symbol table. Handle that
correctly.
llvm-svn: 183573
lld can now output a valid Windows executable with a text section that does nothing
but just returns immediately. It's not able to handle relocations, symbol tables,
data sections, etc, so it still can't do anything practical, though.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D892
llvm-svn: 183478
This is the parser for the ENTRY command. Note that because the parsing result
is currentlyd discarded, lld can parse but just ignore the command.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D833
llvm-svn: 183141
Users can override the default value of the dynamic linker to be set to the
one that appears in the command line. The path can even be empty!.
Added a test for the option.
llvm-svn: 182889
Add WinLinkDriver and connect it to the existing COFF reader. Remaining
parts are still stubs, so while it can now read a COFF file, it still
cannot link or output PE/COFF files yet.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D865
llvm-svn: 182784
only if they are relative. This removes the FIXME when the
relocations are being emitted and checks if the relocation
is relative and only then populates the addend information.
I couldnt add a testcase for this as llvm-readobj lacks
functionality of printing dynamic relocations.
When the functionality is added, remove the commented lines
from elf/ifunc.test to test functionality.
llvm-svn: 182077
This patch splits X86_64Target specific code so that
the dynamic atoms and the Relocation handlers are in seperate
files for easier maintenace. The files are sure to grow and this
makes it easier to work with.
* There is no change in functionality *
llvm-svn: 182076
to the list of undefined atoms.
The processing of undefined atoms from dynamic libraries is controlled by
use-shlib-undefines command line option.
This patch also adds additional command line arguments to allow/disallow
unresolved symbols from shared libraries and mimics GNU ld behavior.
llvm-svn: 179257
InputFile is being pulled from the Archive library to resolve a symbol.
The buffer which was being used was already being handed over to the
MemoryBuffer object and was being accessed after the hand over.
Moving it before the buffer is handed over.
llvm-svn: 178838
The major changes are:
1) LinkerOptions has been merged into TargetInfo
2) LinkerInvocation has been merged into Driver
3) Drivers no longer convert arguments into an intermediate (core) argument
list, but instead create a TargetInfo object and call setter methods on
it. This is only how in-process linking would work. That is, you can
programmatically set up a TargetInfo object which controls the linking.
4) Lots of tweaks to test suite to work with driver changes
5) Add the DarwinDriver
6) I heavily doxygen commented TargetInfo.h
Things to do after this patch is committed:
a) Consider renaming TargetInfo, given its new roll.
b) Consider pulling the list of input files out of TargetInfo. This will
enable in-process clients to create one TargetInfo the re-use it with
different input file lists.
c) Work out a way for Drivers to format the warnings and error done in
core linking.
llvm-svn: 178776
This changes from reading each relocation individually for each section to just
storing the range of relocations. It also counts the relocations to preallocate
the _references array.
llvm-svn: 177562
This seems to be what ld does, but I'm not sure how it works with symbol interposition.
With this hello-world with glibc dynamically linked works.
llvm-svn: 176310
const U *>>. Even in C++11 it doesn't seem this is valid as vector's
emplace support requires move assignment, and there is no way to move
assign a reference.
The real motivation however is that this fixes the build of lld with
libstdc++ 4.6.
llvm-svn: 175481
The purpose of this change is to simplify creating non-atom sections.
Previously _contentType, _sectionKind and _order were used for multiple
purposes and collided in places. This moves all of the Atom specific logic down
into AtomSection and makes Section just have raw Elf_Shdr flags.
llvm-svn: 175207
This removes Target and moves the functionality it had over to TargetInfo.
This also simplifies LinkerInput by removing the InputKind. This will be handled elsewhere.
llvm-svn: 174589
In an previous commit I managed to completely disable the IRELATIVE relocation
writing code. I also used the wrong addend for the static relocation. Fix both
these issues and add a test. This test is quite brittle because there's no way
to do arithmetic on variables in FileCheck.
llvm-svn: 174161