1. Move relocation addendum reading code to the MipsRelocationHandler
class to reduce code duplication.
2. Factor out the relocations calculation code into the separate
function to be ready to handle MIPS N64 ABI relocation chains.
No functional changes.
llvm-svn: 231641
We should not take in account a type of "source" symbol. Cross mode jump
adjustment is requred when target symbol and relocation belong to
different (regular/microMIPS) instruction sets.
llvm-svn: 231639
All readers except PE/COFF reader create layout-after edges to preserve
the original symbol order. PE/COFF uses layout-before edges as primary
edges for no reason.
This patch makes PE/COFF reader to create layout-after edges.
Resolver is updated to recognize reverse edges of layout-after edges
in the garbage collection pass.
Now we can retire layout-before edges. I don't do that in this patch
because if I do, I would have updated many tests to replace all
occurrrences of "layout-before" with "layout-after". So that's a TODO.
llvm-svn: 231615
If an output is large, its base relocation section can be also large.
For example, chrome.dll is almost 300 MB, and it has about 9 million
base relocations. Creating the section took 1.5 seconds on my machine.
This patch changes the way to create the section so that we can use
parallel_sort to group base relocations by high bits. This change
makes the linker almost 4% faster for the above test case on my machine.
If I replace parallel_sort with std::sort, performance remains the same,
so single thread performance should remain the same.
This has no functionality change. The output should be identical as
before.
llvm-svn: 231585
This patch reverts r231545 "PECOFF: Do not add extraneous symbols
to the dead strip root." CrWinClangLLD buildbot is currently broken.
Since I can't reproduce the issue locally, I'm reverting the most
relevant change.
llvm-svn: 231582
Previously we added all undefined symbols found in object files to
the dead strip root. This patch makes the linker to stop doing that.
Undefined symbols would be resolved anyway, so this patch doesn't
change the linker behavior. It should slightly improve performance
but it's really marginal. This is a cleanup.
llvm-svn: 231545
Previously applying 1 million relocations took about 2 seconds on my
Xeon 2.4GHz 8 core workstation. After this patch, it takes about 300
milliseconds. As a result, time to link chrome.dll becomes 23 seconds
to 21 seconds.
llvm-svn: 231454
The last use of layout-after edge for PE/COFF was removed in r231290.
Now layout-after edges do nothing. We can stop adding them to the graph.
No functionality change intended.
llvm-svn: 231301
Merge::mergeByLargestSection is half-baked since it's defined
in terms of section size, there's no way to get the section size
of an atom.
Currently we work around the issue by traversing the layout edges
to both directions and calculate the sum of all atoms reachable.
I wrote that code but I knew it's hacky. It's even not guaranteed
to work. If you add layout edges before the core linking, it
miscalculates a size.
Also it's of course slow. It's basically a linked list traversal.
In this patch I added DefinedAtom::sectionSize so that we can use
that for mergeByLargestSection. I'm not very happy to add a new
field to DefinedAtom base class, but I think it's legitimate since
mergeByLargestSection is defined for section size, and the section
size is currently just missing.
http://reviews.llvm.org/D7966
llvm-svn: 231290
File objects are not really const in the resolver. We set ordinals to
them and call beforeLink hooks. Also, File's member functions marked
as const are not really const. ArchiveFile never returns the same
member file twice, so it remembers files returned before. find() has
side effects.
In order to deal with the inconsistencies, we sprinkled const_casts
and marked member varaibles as mutable.
This patch removes const from there to reflect the reality.
llvm-svn: 231212
std::promise and std::future in old version of libstdc++ are buggy.
I think that's the reason why LLD tests were flaky on Ubuntu 13
buildbots until we disabled file preloading.
In this patch, I implemented very simple future and used that in
FileArchive. Compared to std::promise and std::future, it lacks
many features, but should serve our purpose.
http://reviews.llvm.org/D8025
llvm-svn: 231153
Yet another chapter in the story. We're getting there, finally.
Note for the future: the tests for relocation have a lot of duplication
and probably can be unified in a single file. Let's reevaluate this once
the support will be complete (hopefully, soon).
llvm-svn: 231057
"virtual" was present at a wrong place. FileArchive is a subclass of
ArchiveLibraryFile, and a FileArchive can be deleted through a
pointer of ArchiveLibraryFile. We want to make the destructor of the
base class virtual.
llvm-svn: 231033
This reverts commit r230086. I added a lock to guard FileCOFF::doParse(),
which killed parallel file parsing. Now the buildbots got back to green,
I believe the threading issue was resolved, so it's time to remove the
guard to see if it works with the buildbots.
llvm-svn: 230886
In doParse, we shouldn't do anything that has side effects. That function may be
called speculatively and possibly in parallel.
We called WinLinkDriver::parse from doParse to parse a command line in a .drectve
section. The parse function updates a linking context object, so it has many side
effects. It was not safe to call that function from doParse. beforeLink is a
function for a File object to do something that has side effects. Moving a call
of WinLinkDriver::parse to there.
llvm-svn: 230791
If no initial live symbols are set up, and deadStrip() == true,
the Resolver ends up reclaiming all the symbols that aren't absolute. This is wrong.
This patch fixes the issue by setting entrySymbolName() as live, and this allows
us to self-host lld when --gc-sections is enabled. There are still quite a few problems
with --gc-sections (test failures), so the option can't be enabled by default.
Differential Revision: D7926
Reviewed by: ruiu, shankarke
llvm-svn: 230737
This reverts commit r230732.
sectionSize() in lib/Core/SymbolTable.cpp still depends on the layout-
after edges, so we couldn't remove them yet.
llvm-svn: 230734
Previously we needed to create atoms as a doubly-linked link, but it's
no longer needed. Also we don't use layout-after edges in PE/COFF.
Creating such edges is just waste.
llvm-svn: 230732
Nothing wrong with reinterpret_cast<llvm::support::ulittle32_t *>(loc),
but that's redundant and not great from readability point of view.
The new functions are wrappers for that kind of reinterpet_casts.
Surprisingly or unsurprisingly, there was no use of big endian read
and write. {read,write}{16,32,64}be have no user. But I think they
still worth to be there in the header for completeness.
http://reviews.llvm.org/D7927
llvm-svn: 230725
Previously we have a string -> string map to keep the weak alias
symbol mapping. Naturally we can't define more than one weak alias
with that data structure.
This patch is to allow multiple aliases for the same symbol by
changing the map type to string -> set of string map.
llvm-svn: 230702
In LLD's model, symbol is a property of the node (atom) and not a property of
edge (reference). Prior to this patch, we stored the symbol in the reference.
From post-commit comments, it seemed better to create a map from the reference
to the symbol instead and use this mapping wherever desired.
Address comments from Ruiu/Simon Atanasyan.
llvm-svn: 230273
SHF_GROUP: Group Member Sections
----------------------------------
A section which is part of a group, and is to be retained or discarded with the
group as a whole, is identified by a new section header attribute: SHF_GROUP
This section is a member (perhaps the only one) of a group of sections, and the
linker should retain or discard all or none of the members. This section must be
referenced in a SHT_GROUP section. This attribute flag may be set in any section
header, and no other modification or indication is made in the grouped sections.
All additional information is contained in the associated SHT_GROUP section.
SHT_GROUP: Section Group Definition
-------------------------------------
Represents a group section.
The section group's sh_link field identifies a symbol table section, and its
sh_info field the index of a symbol in that section. The name of that symbol is
treated as the identifier of the section group.
More information: https://mentorembedded.github.io/cxx-abi/abi/prop-72-comdat.html
Added a lot of extensive tests, that tests functionality.
llvm-svn: 230195
When the GNU linker sees two input sections with the same name, and the name
starts with ".gnu.linkonce.", the linker will only keep one copy and discard the
other. Any section whose name starts with “.gnu.linkonce.” is a COMDAT section.
Some architectures like Hexagon use this section to store floating point constants,
that need be deduped.
This patch adds gnu.linkonce functionality to the ELFReader.
llvm-svn: 230194
There is code(added by me) in the YAMLReader which isn't correct when it handles references
for section groups. The test case was also checking for wrong outputs.
This fixes the bug and the testcase so that they check for proper outputs.
llvm-svn: 230190
This is mainly for back-compatibility with GNU ld.
Ideally --stats should be a general option in LinkingContext, providing
individual stats for every pass in the linking process.
In the GNU driver, a better wording could be used, but there's no need
to change it for now.
Differential Revision: D7657
Reviewed by: ruiu
llvm-svn: 230157
Now since the correct file path for atoms is available and not clobbered,
commit r222309 which was reverted previously can be added back.
No change in functionality.
llvm-svn: 230138
Looks like there's a threading issue in the COFF reader which makes
buildbot unstable. Probability of crash varies depending on the number
of input. If we are linking a big executalbe, LLD almost always crash.
This patch temporarily adds a lock to guard the reader so that LLD
doesn't crash. I'll investigate and fix the issue as soon as possible
because this patch has negative performance impact.
llvm-svn: 230086
The round-trip passes were introduced in r193300. The intention of
the change was to make sure that LLD is capable of reading end
writing such file formats.
But that turned out to be yet another over-designed stuff that had
been slowing down everyday development.
The passes ran after the core linker and before the writer. If you
had an additional piece of information that needs to be passed from
front-end to the writer, you had to invent a way to save the data to
YAML/Native. These passes forced us to do that even if that data
was not needed to be represented neither in an object file nor in
an executable/DSO. It doesn't make sense. We don't need these passes.
http://reviews.llvm.org/D7480
llvm-svn: 230069
LinkerScript AST nodes are never destroyed which means that their
std::vector members will never be destroyed.
Instead, allocate the operand list itself in the Parser's
BumpPtrAllocator. This ensures that the storage will be destroyed along
with the nodes when the Parser is destroyed.
llvm-svn: 229967
This is yet another edge case of base relocation for symbols. Absolute
symbols are in general not target of base relocation because absolute
atom is a way to point to a specific memory location. In r229816, I
removed entries for absolute atoms from the base relocation table
(so that they won't be fixed by the loader).
However, there was one exception -- ImageBase. ImageBase points to the
start address of the current image in memory. That needs to be fixed up
at load time. This patch is to treat the symbol in a special manner.
llvm-svn: 229961
Previously we wrongly emitted a base relocation entry for an absolute symbol.
That made the loader to rewrite some instruction operands with wrong values
only when a DLL is not loaded at the default address. That caused a
misterious crash of some executable.
Absolute symbols will of course never change value wherever the binary is
loaded to memory. We shouldn't emit base relocations for absolute symbols.
llvm-svn: 229816
Weak aliases defined using /alternatename command line option were getting
wrong RVAs in the final output because of wrong atom ordinal. Alias atoms
were assigned large ordinals than any other regular atoms because they were
instantiated after other atoms and just got new (larger) ordinals.
Atoms are sorted by its file and atom ordinals in the order pass. Alias
atoms were located after all other atoms in the same file.
An alias atom's ordinal needs to be smaller than its alias target but larger
than the atom appeared before the target -- so that the alias is located
between the two. Since an alias has no size, the alias target will be located
at the same location as the alias.
In this patch, I made a gap between two regular atoms so that we can put
aliases after instantiating them (without re-numbering existing atoms).
llvm-svn: 229762
atomContent's memory is freed at the end of the stack frame,
but it is referenced by the atom pushed into _definedAtoms.
Differential Revision: http://reviews.llvm.org/D7732
llvm-svn: 229749
Summary:
Define an explicit type for arch specific reference kind and use it in switch statement to make the compiler emit warnings if some case is not cover.
It will help to catch such errors when we add new mach-o reference kind.
Reviewers: shankarke, kledzik
Reviewed By: shankarke
Subscribers: shankarke, aemerson, llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D7612
llvm-svn: 229246
Wrap functionality was using a std::set to record symbols that need to be
wrapped. This changes the implementation to use a StringSet instead.
No change in functionality.
llvm-svn: 229165
If the name field of a symbol table entry is all zero, it's interpreted
as it's pointing to the beginning of the string table. The first four
bytes of the string table is the size field, so dumpbin dumps that number
as an ASCIZ string.
This patch fills a dummy value to name field.
llvm-svn: 228971
Use a wrapper function for symbol. Any undefined reference to symbol will be
resolved to "__wrap_symbol". Any undefined reference to "__real_symbol" will be
resolved to symbol.
This can be used to provide a wrapper for a system function. The wrapper
function should be called "__wrap_symbol". If it wishes to call the system
function, it should call "__real_symbol".
Here is a trivial example:
void * __wrap_malloc (size_t c)
{
printf ("malloc called with %zu\n", c);
return __real_malloc (c);
}
If you link other code with this file using --wrap malloc, then all calls
to "malloc" will call the function "__wrap_malloc" instead. The call to
"__real_malloc" in "__wrap_malloc" will call the real "malloc" function.
llvm-svn: 228906
This adds the LinkingContext parameter to the ELFReader. Previously the flags in
that were needed in the Context was passed to the ELFReader, this made it very
hard to access data structures in the LinkingContext when reading an ELF file.
This change makes the ELFReader more flexible so that required parameters can be
grabbed directly from the LinkingContext.
Future patches make use of the changes.
There is no change in functionality though.
llvm-svn: 228905
The dumpbin tool in the MSVC toolchain cannot handle an executable created
by LLD if the executable contains a long section name.
In PE/COFF, a section name is stored to a section table entry. Because the
section name field in the table is only 8 byte long, a name longer than
that is stored to the string table and the offset in the string table is
stored to the section table entry instead.
In order to look up a string from the string table, tools need to handle
the symbol table, because the string table is defined as it immediately
follows the symbol table.
And seems the dumpbin doesn't like zero-length symbol table.
This patch teaches LLD how to emit a dummy symbol table. The dummy table
has one dummy entry in it.
llvm-svn: 228900
When calling ARM code from Thumb and vice versa,
a veneer that switches instruction set should be generated.
Added veneer generation for ARM_JUMP24 ARM_THM_JUMP24 instructions.
Differential Revision: http://reviews.llvm.org/D7502
llvm-svn: 228680
After the total number of program headers are determined, virtual addresses
and file offsets need not be reassigned for sections whose virtual addresses and
fileoffsets remained the same.
This doesnot change any functionality.
llvm-svn: 228377
Only search library directories explicitly specified
on the command line. Library directories specified in linker
scripts (including linker scripts specified on the command
line) are ignored.
llvm-svn: 228375
Previously we only have File::path() to get the path name of a file.
If a file was a member of an archive file, path() returns a concatenated
string of the file name in the archive and the archive file name.
If we wanted to get a file name or an archive file name, we had to
parse that string. That's of course not good.
This patch adds new member functions, archivePath and memberPath, to File.
http://reviews.llvm.org/D7447
llvm-svn: 228352
The real user of the LayoutPass is now only Mach-O, so move that
pass out of the common directory to Mach-O directory.
"Core" architecture were using the LayoutPass. I modified that
to use a simple OrderPass. I think no one actually have authority
what feature should be in Core and what's not, but I believe the
LayoutPass is not very suitable for Core. Before more code starts
depending on the complex pass, it's better to remove that from
Core.
I could have simplified that pass because Mach-O is the only user
of the LayoutPass. For example, the second parameter of the
LayoutPass constructor can be converted from optional to mandatory.
I didn't do that in this patch to keep it simple. I'll do in a
followup patch.
http://reviews.llvm.org/D7311
llvm-svn: 228341
Previously, we incorrectly added the image base address to an absolute
symbol address (that calculation doesn't make any sense) if an
IMAGE_REL_I386_DIR32 relocation is applied to an absolute symbol.
This patch fixes the issue. With this fix, we can link Bochs using LLD.
(Choosing Bochs has no special meaining -- I just picked it up as a
test program and found it didn't work.) This also fixes one of the
issues we currently have to link Chromium using LLD.
llvm-svn: 228279
This may be a little bit inefficient than the original code
but that should be okay as this is not really in a performance
critical pass.
http://reviews.llvm.org/D7393
llvm-svn: 228077
INPUT directive is a variant of GROUP in the sense that that specifies
a list of input files. The only difference is whether the entire file
list is wrapped with a --start-group/--end-group or not.
http://reviews.llvm.org/D7390
llvm-svn: 228060
Currently, no one owns script::Parser buffers, but yet ELFLinkingContext gets
updated with StringRef pointers to data inside Parser buffers. Since this buffer
is locally owned inside GnuLdDriver::evalLinkerScript(), as soon as this
function finishes, all pointers in ELFLinkingContext that comes from linker
scripts get invalid. The problem is that we need someone to own linker scripts
data structures and, since ELFLinkingContext transports references to linker
scripts data, we can simply make it also own all linker scripts data.
Differential Revision: http://reviews.llvm.org/D7323
llvm-svn: 227975
Added relocations to perform function calls with and without passing arguments.
ARM-only, Thumb-only and mixed mode code generations are supported.
Only simple veneers (direct instruction modification) are supported as ARM-Thumb interwork.
Differential Revision: http://reviews.llvm.org/D7223
llvm-svn: 227961
Previously we applied the LayoutPass to order atoms and then
apply elf::ArrayOrderPass to sort them again. The first pass is
basically supposed to sort atoms in the normal fashion (which
is to sort symbols in the same order as the input files).
The second pass sorts atoms in {init,fini}_array.<priority> by
priority.
The problem is that the LayoutPass is overkill. It analyzes
references between atoms to make a decision how to sort them.
It's slow, hard to understand, and above all, it doesn't seem
that we need its feature for ELF in the first place.
This patch remove the LayoutPass from ELF pass list. Now all
reordering is done in elf::OrderPass. That pass sorts atoms by
{init,fini}_array, and if they are not in the special section,
they are ordered as the same order as they appear in the command
line. The new code is far easier to understand, faster, and
still able to create valid executables.
Unlike the previous layout pass, elf::OrderPass doesn't count
any attributes of an atom (e.g. permissions) except its
position. It's OK because the writer takes care of them if we
have to.
This patch changes the order of final output, although that's
benign. Tests are updated.
http://reviews.llvm.org/D7278
llvm-svn: 227666
The LayoutPass is one of the slowest pass. This change is to skip
that pass. This change not only improve performance but also improve
maintainability of the code because the LayoutPass is pretty complex.
Previously we used the LayoutPass to sort all atoms in a specific way,
and reorder them again for PE/COFF in GroupedSectionPass.
I spent time on improving and fixing bugs in the LayoutPass (e.g.
r193029), but the pass is still hard to understand and hard to use.
It's better not to depend on that if we don't need. For PE/COFF, we
just wanted to sort atoms in the same order as the file order in the
command line.
The feature we used in the LayoutPass is now simplified to
compareByPosition function in OrderPass.cpp. The function is just 5
lines.
This patch changes the order of final output because it changes the
sort order a bit. The output is still correct, though.
llvm-svn: 227500
That kind of reference was used only in ELFFile, and the use of
that reference there didn't seem to make sense. All test still
pass (after adjusting symbol names) without that code. LLD is
still be able to link LLD and Clang. Looks like we just don't
need this.
http://reviews.llvm.org/D7189
llvm-svn: 227259
Misread buildbot's log.
Both gcc and clang compile this fine.
Original fix reason:
gcc allows template specializations only in the same namespace
where template has been declared.
llvm-svn: 227183
Symbols addressing Thumb code have zero bit set in st_value to distinguish them from ARM instructions.
This caused wrong atoms' forming because of offset of one byte brought in by that corrected st_value.
Fixed reading of st_value & st_value-related things in ARMELFFile while forming atoms.
Symbol table generation is also fixed for Thumb atoms.
Differential Revision: http://reviews.llvm.org/D7161
llvm-svn: 227174
Anonymous atoms created there were getting wrong atom ordinal.
LayoutAfter references take precedence over atom ordinals, so
the bug was not visible, though.
llvm-svn: 227168
* Removed cyclic dependency between lldPECOFF and lldDriver
* Added missing dependencies in unit tests
Differential Revision: http://reviews.llvm.org/D7185
llvm-svn: 227134
This is initial patch to support MIPS64 object files linking.
The patch just makes some classes more generalized, and rejects
attempts to interlinking O32 and N64 ABI object files.
I try to reuse the current MIPS target related classes as much as
possible because O32 and N64 MIPS ABI are tightly related and share
almost the same set of relocations, GOT, flags etc.
llvm-svn: 227058
The `Elf_Rel_Impl::setSymbolAndType` method now has the third argument
`IsMips64EL` (like complement methods `getSymbol` and `getType`). While
we do not support linking of MIPS64 ELF object file just pass `false`
to the `setSymbolAndType`.
llvm-svn: 227045
lldELF is used by each ELF backend. lldELF's ELFLinkingContext
also held a reference to each backend, creating a link-time
cycle. This patch moves the backend references to lldDriver.
Differential Revision: http://reviews.llvm.org/D7119
llvm-svn: 226976
Moved getMemoryBuffer from DarwnLdDriver to MachOLinkingContext.
lldMachO shared library target now builds.
Differential Review: http://reviews.llvm.org/D7155
llvm-svn: 226963
lldELF is used by each ELF backend. lldELF's ELFLinkingContext
also held a reference to each backend, creating a link-time
cycle. This patch moves the backend references to lldDriver.
Differential Revision: http://reviews.llvm.org/D7119
llvm-svn: 226922
Before this patch there was a cyclic dependency between lldCore and
lldReaderWriter. Only lldConfig could be built as a shared library.
* Moved Reader and Writer base classes into lldCore.
* The following shared libraries can now be built:
lldCore
lldYAML
lldNative
lldPasses
lldReaderWriter
Differential Revision: http://reviews.llvm.org/D7105
From: Greg Fitzgerald <garious@gmail.com>
llvm-svn: 226732
The code is able to statically link the simplest case of:
int main() { return 0; }
* Only works with ARM code - no Thumb code, no interwork (-marm -mno-thumb-interwork)
* musl libc built with no interwork and no Thumb code
Differential Revision: http://reviews.llvm.org/D6716
From: Denis Protivensky <dprotivensky@accesssoftek.com>
llvm-svn: 226643
The use of std::future introduces an implicit dependency on the PPL
subcomponent of ConcRT. ConcRT in general is pretty noisy with
warnings, so this patch just disables one of the noisy warnings.
llvm-svn: 226590
sh_addralign of zero is equivalent to sh_addralign of one, meaning
no alignment specified. Avoid calculating Log2 or modulus when
sh_addralign is zero as the results will not be useful.
llvm-svn: 226572
At the moment errors in relocation processing such as out of range
values are not detected or at best trapped by asserts which will not
be present in release builds. This patch adds support for checking
error return values from applyRelocation() calls and printing an
appropriate error message. It also adds support for printing multiple
errors rather than just the first one.
llvm-svn: 226557
LLD parses archive file index table only at first. When it finds a symbol
it is looking for is defined in a member file in an archive file, it actually
reads the member from the archive file. That's done in the core linker.
That's a single-thread process since the core linker is single threaded.
If your command line contains a few object files and a lot of archive files
(which is quite often the case), LLD hardly utilizes hardware parallelism.
This patch improves parallelism by speculatively instantiating archive
file members. At the beginning of the core linking, we first create a map
containing all symbols defined in all members, and each time we find a
new undefined symbol, we instantiate a member file containing the
symbol (if such file exists). File instantiation is side effect free, so this
should not affect correctness.
This is a quick benchmark result. Time to link self-link LLD executable:
Linux 9.78s -> 8.50s (0.86x)
Windows 6.18s -> 4.51s (0.73x)
http://reviews.llvm.org/D7015
llvm-svn: 226336
Generalise the base relocation handling slightly to support multiple base
relocation types in PE/COFF. This is necessary to generate proper executables
for WoA.
Track the base relocation type from the decision that we need a base relocation
to the point where we emit the base relocation into base relocation directory.
Remove an outdated TODO item while in the area.
llvm-svn: 226335
We had such class there because of InputGraph abstraction.
Previously, no one except InputGraph itself has complete picture of
input file list. In order to create a set of all defined symbols,
we had to use some indirections there to workaround InputGraph.
It can now be rewritten as simple code. No change in functionality.
llvm-svn: 226319
This patch makes File::parse() multi-thread safe. If one thread is running
File::parse(), other threads will block if they try to call the same method.
File::parse() is idempotent, so you can safely call multiple times.
With this change, we don't have to wait for all worker threads to finish
in Driver::link(). Previously, Driver::link() calls TaskGroup::sync() to
wait for all threads running File::parse(). This was not ideal because
we couldn't start the resolver until we parse all files.
This patch increase parallelism by making Driver::link() to not wait for
worker threads. The resolver calls parse() to make sure that the file
being read has been parsed, and then uses the file. In this approach,
the resolver can run with the parser threads in parallel.
http://reviews.llvm.org/D6994
llvm-svn: 226281
The previous default behavior of LLD is --as-needed. LLD linked
against a DSO only if the DSO file was actually used to link an
executable (i.e. at least one symbol was resolved using the shared
library file.)
In this patch I added a boolean flag to FileNode for --as-needed.
I also added an accessor to DSO name to shared library file class.
llvm-svn: 226274
This class defines a relocation handler interface. The interface does
not depend on the template argument so the argument is redundant.
llvm-svn: 226259
The target may be a synthetic symbol like __ImageBase. cast_or_null will ensure
that the atom is a DefinedAtom, which is not guaranteed, which was the original
reason for the cast_or_null. Switch this to dyn_cast, which should enable
building of executables for WoA. Unfortunately, the issue of missing base
relocations still needs to be investigated.
llvm-svn: 226246
InputElement was named that because it's an element of an InputGraph.
It's losing the origin because the InputGraph is now being removed.
InputElement's subclass is FileNode, that naming inconsistency needed
to be fixed.
llvm-svn: 226147
They were the last member functions of InputGraph (besides members()).
Now InputGraph is just a container of a vector. We are ready to replace
InputGraph with plain File vector.
llvm-svn: 226146
These changes depended on r225674 and had been rolled back in r225859.
Because r225674 has been re-submitted, it's safe to re-submit them.
llvm-svn: 226132
The original commit had an issue with Mac OS dylib files. It didn't
handle fat binary dylib files correctly. This patch includes a fix.
A test for that case has already been committed in r225764.
llvm-svn: 226123
This is just a mechanical cleanup, no functionality changed. This just
fixes very minor inconsistencies with how #include lines were spaced and
sorted in LLD.
llvm-svn: 225978
r225764 broke a basic functionality on Mac OS. This change reverts
r225764, r225766, r225767, r225769, r225814, r225816, r225829, and r225832.
llvm-svn: 225859
PECOFF was the only user of the API, and the reason why we created
the API is because, although the driver creates a list of input files,
it has no knowledge on what files are being created. It was because
everything was hidden behind the InputGraph abstraction.
Now the driver knows what that's doing. We no longer need this
indirection to get the file list being processed.
llvm-svn: 225767
This is necessary to support linking a basic program which references symbols
outside of the module itself. Add the import thunk for ARM NT style imports.
This allows us to create the reference. However, it is still insufficient to
generate executables that will run due to base relocations not being emitted for
the import.
llvm-svn: 225428
This adds the ability to export symbols from a DLL built for ARMNT. Add this
support first to help work towards adding support for import thunks on Windows
on ARM. In order to generate the exports, add support for
IMAGE_REL_ARM_ADDR32NB relocations.
llvm-svn: 225339
ARM NT assumes a purely THUMB execution, and as such requires that the address
of entry point is adjusted to indicate a thumb entry point. Unconditionally
adjust the AddressOfEntryPoint in the PE header for PE/COFF ARM as we only
support ARM NT at the moment.
llvm-svn: 225139
ARM NT assumes a THUMB only environment. As such, any address that is detected
as residing in an executable section is adjusted to have its bottom bit set to
indicate THUMB in case of a mode exchange.
Although the testing here seems insufficient (missing the negative cases) the
existing test cases for the IMAGE_REL_ARM_{ADDR32,MOV32T} are relevant as they
ensure that we do not incorrectly set the bit.
llvm-svn: 225104
This adds support for IMAGE_REL_ARM_BRANCH24T relocations. Similar to the
IMAGE_REL_ARM_BLX32T relocation, this relocation requires munging an
instruction. The instruction encoding is quite similar, allowing us to reuse
the same munging implementation. This is needed by the entry point stubs for
modules provided by MSVCRT.
llvm-svn: 225082
This adds support for IMAGE_REL_ARM_BLX23T relocations. Similar to the
IMAGE_REL_ARM_MOV32T relocation, this relocation requires munging an
instruction. This inches us closer to supporting a basic hello world
application.
llvm-svn: 225081
This adds support for the IMAGE_REL_ARM_MOV32T relocation. This is one of the
most complicated relocations for the Window on ARM target. It involves
re-encoding an instruction to contain an immediate value which is the relocation
target.
llvm-svn: 225072
This implements the IMAGE_REL_ARM_ADDR32 relocation. There are still a few more
relocation types that need to resolved before lld can even attempt to link a
trivial program for Windows on ARM.
llvm-svn: 225057
This teaches lld about the ARM NT object types. Add a trivial test to ensure
that it can handle ARM NT object file inputs. It is still unable to perform the
necessary relocations for ARM NT, but this allows the linker to at least read
the objects.
llvm-svn: 225052
If a regular symbol has microMIPS-bit we need to stop linking. Now the
LLD does not check the `applyRelocation` return value and continues
linking anyway. As a temporary workaround use the `llvm_unreachable`
call to stop the linker.
llvm-svn: 224831
Summary:
Fix the binary file reader to properly read dyld version info.
Update the install_name test case to properly test the binary reader. We can't use '-print_atoms' as the output format is 'native' yaml and it does not contains the dyld current and compatibility versions.
Also change the timestamp value of LD_ID_DYLD to match the one generated by ld64.
The dynamic linker (dyld) used to expects different values for timestamp in LD_ID_DYLD and LD_LOAD_DYLD for prebound images. While prebinding is deprecated, we should probably keep it safe and match ld64.
Reviewers: kledzik
Subscribers: llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D6736
llvm-svn: 224681
Summary:
Work on adding -rpath support to the mach-o linker.
This patch is based on the ld64 behavior for the command line option validation.
It includes a basic test to check that the LC_RPATH load commands are properly generated when that option is used.
It also add LC_RPATH support to the binary reader, but I don't know how to test it though.
Reviewers: kledzik
Subscribers: llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D6724
llvm-svn: 224544
The documentation of parseFile() said that "the resulting File
object may take ownership of the MemoryBuffer." So, whether or not
the ownership of a MemoryBuffer would be taken was not clear.
A FileNode (a subclass of InputElement, which is being deprecated)
keeps the ownership if a File doesn't take it.
This patch makes File always take the ownership of a buffer.
Buffers lifespan is not always the same as File instances.
Files are able to deallocate buffers after parsing the contents.
llvm-svn: 224113
This is a second patch for InputGraph cleanup.
Sorry about the size of the patch, but what I did in this
patch is basically moving code from constructor to a new
method, parse(), so the amount of new code is small.
This has no change in functionality.
We've discussed the issue that we have too many classes
to represent a concept of "file". We have File subclasses
that represent files read from disk. In addition to that,
we have bunch of InputElement subclasses (that are part
of InputGraph) that represent command line arguments for
input file names. InputElement is a wrapper for File.
InputElement has parseFile method. The method instantiates
a File. The File's constructor reads a file from disk and
parses that.
Because parseFile method is called from multiple worker
threads, file parsing is processed in parallel. In other
words, one reason why we needed the wrapper classes is
because a File would start reading a file as soon as it
is instantiated.
So, the reason why we have too many classes here is at
least partly because of the design flaw of File class.
Just like threads in a good threading library, we need
to separate instantiation from "start" method, so that
we can instantiate File objects when we need them (which
should be very fast because it involves only one mmap()
and no real file IO) and use them directly instead of
the wrapper classes. Later, we call parse() on each
file in parallel to let them do actual file IO.
In this design, we can eliminate a reason to have the
wrapper classes.
In order to minimize the size of the patch, I didn't go so
far as to replace the wrapper classes with File classes.
The wrapper classes are still there.
In this patch, we call parse() immediately after
instantiating a File, so this really has no change in
functionality. Eventually the call of parse() should be
moved to Driver::link(). That'll be done in another patch.
llvm-svn: 224102
Some targets like microMIPS and ARM Thumb use the last bit of a symbol's
value to mark 'compressed' code. This patch adds new virtual function
`DynamicTable::getAtomVirtualAddress` which allows to adjust a symbol's
value before using it in a dynamic table tags like DT_INIT / DT_FINI.
llvm-svn: 223963
The LLD linker searches initializer and finalizer function names
and emits DT_INIT/DT_FINI dynamic table tags to point to these symbols.
The -init/-fini command line options override initializer ("_init") and
finalizer ("_fini") function names used by default.
Now the -init/-fini options do not affect .init_array/.fini_array
sections. The corresponding code has been removed.
Differential Revision: http://reviews.llvm.org/D6578
llvm-svn: 223917
At present each TargetRelocationHandler generates a pretty similar error
string and calls llvm_unreachable() when encountering an unknown
relocation. This is not ideal for two reasons:
1. llvm_unreachable disappears in release builds but we still want to
know if we encountered a relocation we couldn't handle in release
builds.
2. Duplication is bad - there is no need to have a per-architecture error
message.
This change adds a test for AArch64 to test whether or not the error
message actually works. The other architectures have not been tested
but they compile and check-lld passes.
llvm-svn: 223782
Looks like if you have symbol foo in a module-definition file
(.def file), and if the actual symbol name to match that export
description is _foo@x (where x is an integer), the exported
symbol name becomes this.
- foo in the .dll file
- foo@x in the .lib file
I have checked in a few fixes recently for exported symbol name mangling.
I haven't found a simple rule that governs all the mangling rules.
There may not ever exist. For now, this is a patch to improve .lib
file compatibility.
llvm-svn: 223524
This reverts commit r223330 because it broke Darwin and ELF
linkers in a way that we couldn't have caught with the existing
test cases.
llvm-svn: 223373
To find an AtomLayout object for the given symbol I replace the
`Layout::findAtomAddrByName` method by `Layout::findAtomLayoutByName` method.
llvm-svn: 223359
Looks like the rule of /export is more complicated than
I was thinking. If /export:foo, for example, is given, and
if the actual symbol name in an object file is _foo@<number>,
we need to export that symbol as foo, not as the mangled name.
If only /export:_foo@<number> is given, the symbol is exported
as _foo@<number>.
If both /export:foo and /export:_foo@<number> are given,
they are considered as duplicates, and the linker needs to
choose the unmangled name.
The basic idea seems that the linker needs to export a symbol
with the same name as given as /export.
We exported mangled symbols. This patch fixes that issue.
llvm-svn: 223341
The aim of this patch is to reduce the excessive abstraction from
the InputGraph. We found that even a simple thing, such as sorting
input files (Mach-O) or adding a new file to the input file list
(PE/COFF), is nearly impossible with the InputGraph abstraction,
because it hides too much information behind it. As a result,
we invented complex interactions between components (e.g.
notifyProgress() mechanism) and tricky code to work around that
limitation. There were many occasions that we needed to write
awkward code.
This patch is a first step to make it cleaner. As a first step,
this removes Group class from the InputGraph. The grouping feature
is now directly handled by the Resolver. notifyProgress is removed
since we no longer need that. I could have cleaned it up even more,
but in order to keep the patch minimum, I focused on Group.
SimpleFileNode class, a container of File objects, is now limited
to have only one File. We shold have done this earlier.
We used to allow putting multiple File objects to FileNode.
Although SimpleFileNode usually has only one file, the Driver class
actually used that capability. I modified the Driver class a bit,
so that one FileNode is created for each input File.
We should now probably remove SimpleFileNode and directly store
File objects to the InputGraph in some way, because a container
that can contain only one object is useless. This is a TODO.
Mach-O input files are now sorted before they are passe to the
Resolver. DarwinInputGraph class is no longer needed, so removed.
PECOFF still has hacky code to add a new file to the input file list.
This will be cleaned up in another patch.
llvm-svn: 223330
/export option can be given multiple times to specify multiple
symbols to be exported. /export accepts both decorated and
undecorated name.
If you give both undecorated and decorated name of the same symbol
to /export, they are resolved to the same symbol. In this case,
we need to de-duplicate the exported names, so that we don't have
duplicated items in the export symbol table in a DLL.
We remove duplicate items from a vector. The bug was there.
Because we had pointers pointing to elements of the vector,
after an item is removed, they would point wrong elements.
This patch is to remove these pointers. Added a test for that case.
llvm-svn: 223200
In PR21682 Jean-Daliel Dupas found a leak in the trie builder and suggested
a fix was to use a list instead of SmallVector so that the list elements
could be allocated in the BumpPtrAllocator.
llvm-svn: 223104
The AtomSections were improperly merging sections from various input files. This
patch fixes the problem, with an updated test that was provided by Simon.
Thanks to Simon Atanasyan for catching this issue.
llvm-svn: 222982
Opening a file using openFileForWrite and closing it using close
was asymmetric. It also had a subtle portability problem (details are
described in the commit message for r219189).
llvm-svn: 222802
This was basically benign resource leak on Unix, but on Windows
it could cause builds to fail because opened file descriptor
prevents other processes from moving or removing the file.
llvm-svn: 222799
There are many build files in the wild that depend on the fact that
link.exe produces a PDB file if /DEBUG option is given. They fail
if the file is not created.
This patch is to make LLD create an empty (dummy) file to satisfy
such build targets. This doesn't do anything other than "touching"
the file.
If a target depends on the content of the PDB file, this workaround
is no help, of course. Otherwise this patch should help build some
stuff.
llvm-svn: 222773
Export table entries need to be sorted in ASCII-betical order,
so that the loader can find an entry for a function by binary search.
We sorted the entries by its mangled names. That can be different
from their exported names. As a result, LLD produces incorrect export
table, from which the loader complains that a function that actually
exists in a DLL cannot be found.
This patch fixes that issue.
llvm-svn: 222452
This reverts commit r222310.
Not sure which commit is the cause of the failure on the darwin bot. Will need
to revert my changes and commit one change at a time.
llvm-svn: 222330
Mach-o does not use a simple SO_NEEDED to track dependent dylibs. Instead,
the linker copies four things from each dylib to each client: the runtime path
(aka "install name"), the build time, current version (dylib build number), and
compatibility version The build time is no longer used (it cause every rebuild
of a dylib to be different). The compatibility version is usually just 1.0
and never changes, or the dylib becomes incompatible.
This patch copies that information into the NormalizedMachO format and
propagates it to clients.
llvm-svn: 222300
This patch fixes the following MSVC warning.
warning C4334: '<<' : result of 32-bit shift implicitly
converted to 64 bits (was 64-bit shift intended?)
llvm-svn: 222293
When fixing up BL instructions, the linker has to compare the thumbness of the
target to decide if the instruction needs to be converted to BLX. But with B
instruction there is no BX, so the linker asserts if the target is not the
same thumbness. This assert was firing in -r mode when the target was undefined
which it interpreted as being non-thumb.
Test case change is to add a B (in both thumb and arm code) to an undefined
symbol and round trip through -r mode.
llvm-svn: 222266
The arm64 assembler almost always uses r_extern=1 relocations in which the
r_symbolnum field is the index of the symbol the relocation references. But
sometimes it will set r_extern=0 in which case the linker needs to read the
content of the reloction to determine the target.
Add test case that the r_extern=0 relocation round trips.
llvm-svn: 222200
The arm64 assembler almost always uses r_extern=1 relocations in which the
r_symbolnum field is the index of the symbol the relocation references. But
sometimes it will set r_extern=0 in which case the linker needs to read the
content of the reloction to determine the target.
Add test case that the r_extern=0 relocation round trips.
llvm-svn: 222198
If you have something like
__declspec(align(8192)) int foo = 1;
in your code, the compiler makes the data to be aligned to 8192-byte
boundary, and the linker align the section containing the data to 8192.
LLD always aligned the section to 4192. So, as long as alignment
requirement is smaller than 4192, it was correct, but for larger
requirements, it's wrong.
This patch fixes the issue.
llvm-svn: 222043
AddressOfEntryPoint is overridden after we layout all atoms (until then,
we don't know the entry point address for obvious reason.)
I believe this code is leftover from very early version of the
PE/COFF port that we only had an entry function in a test object file.
llvm-svn: 222026
With --no-align-segments, there is a bug that the fileoffset may not be
congruent to virtual address modulo page alignment.
This patch fixes the problem.
llvm-svn: 221890
MIPS ELF symbols might contain some additional MIPS-specific flags
in the st_other field besides visibility ones. These flags indicate
code properties like microMIPS / MIPS16 encoding, position independent
code etc. We need to transfer the flags from input objects to the
output linked file to write them into the symbol table, adjust symbols
addresses etc.
I add new attribute CodeModel to the DefinedAtom class to hold target
specific flag and to get over YAML/Native format conversion barrier.
Other architectures/targets can extend CodeModel enumeration by their
own flags.
MIPS specific part of this patch adds support for STO_MIPS_MICROMIPS
flag. This flag marks microMIPS symbols. Such symbol should:
a) Has STO_MIPS_MICROMIPS in the corresponding .symtab record.
b) Has adjusted (odd) address in the corresponding .symtab
and .dynsym records.
llvm-svn: 221864
The segment alignment for PT_LOAD segments is set to page size by default, but
if any of the sections require an alignment more than the page size, the segment
alignment property is set to the maximum alignment of the sections that are part
of the segment.
llvm-svn: 221862
The user can use the max-page-size option and set the maximum page size. Dont
check for maximum allowed values for page size, as its what the kernel is
configured with.
Fix the test as well.
llvm-svn: 221858
Each entry in the delay-import address table had a wrong alignment
requirement if 32 bit. As a result it got wrong delay-import table.
Because llvm-readobj doesn't print out that field, we don't have a
test for that. I'll submit a test that would catch this bug after
improving llvm-readobj.
llvm-svn: 221853
The GOT slots were being laid out in a random order by the GOTPass which
caused randomness in the output file.
Note: With this change lld now bootstraps on darwin. That is:
1) link lld using system linker to make lld.1
2) link lld using lld.1 to make lld.2
3) link lld using lld.2 to make lld.3
Now lld.2 and lld.3 are identical.
llvm-svn: 221831
On darwin in final linked images, the __TEXT segment covers that start of the
file. That means in memory a process can see the mach_header (and load commands)
for every loaded image in a process. There are APIs that take and return the
mach_header addresses as a way to specify a particular loaded image.
For completeness, any code can get the address of the mach_header of the image
it is in by using &__dso_handle. In addition there are mach-o type specific
symbols like __mh_execute_header.
The linker needs to supply a definition for any of these symbols if used. But
the address the symbol it resolves to is not in any section. Instead it is the
address of the start of the __TEXT segment.
I needed to make a small change to SimpleFileNode to not override
resetNextIndex() because the Driver creates a SimpleFileNode to hold the
internal/implicit files that the context/writer can create. For some reason
SimpleFileNode overrode resetNextIndex() to do nothing instead of reseting
the index (which mach-o needs if the internal file is an archive).
llvm-svn: 221822
The way lazy binding works in mach-o is that the linker generates a helper
function and has the stub (PLT) initially jump to it. The helper function
pushes an extra parameter then jumps into dyld. The extra parameter is an
offset into the lazy binding info where dyld will find the information about
which symbol to bind and way lazy binding pointer to update.
llvm-svn: 221654
The dynamic table was creating the entry DT_FINI_ARRAY{SZ} even when there was
no .fini_array section. The entries should be creating in the dynamic section
only if there are sections .init_array/.fini_array in the output.
Fixes the tests that checked for errroneous outputs.
llvm-svn: 221588
The value of _DYNAMIC should be pointing at the start of the .dynamic segment.
This was pointing to the end of the dynamic segment. Similarly the value of
_GLOBAL_OFFSET_TABLE_ was not proper too.
llvm-svn: 221587
The parsing routines in the linker script to parse strings encoded in various
formats(hexadecimal, octal, decimal, etc), is needed by the GNU driver too. This
library provides helper functions for all flavors and flavors to add helper
functions which other flavors may make use of.
llvm-svn: 221583
lld generates an ELF by adhering to the ELF spec by aligning vma/fileoffset to a
page boundary, but this becomes an issue when dealing with large pages. This
adds support so that lld generated executables adheres to the ELF spec with the
rule vma % p_align = offset % p_align.
This is supported by the flag --no-align-segments.
This could be the default in few targets like X86_64 to save space on disk.
llvm-svn: 221571
The darwin linker lets you rearrange functions and data for better locality
(less paging). You do this with the -order_file option which supplies a text
file containing one symbol per line.
Implementing this required a small change to LayoutPass to add a custom sorter
hook.
llvm-svn: 221545
When FileArchive loads a member, it instantiates a temporary MemoryBuffer
which points to the member range of the archive file. The problem is that the
object file parsers call getBufferIndentifer() on that temporary MemoryBuffer
and store that StringRef as the _path data member for that lld::File. When
FileArchive::instantiateMember() goes out of scope the MemoryBuffer is deleted
and the File::._path becomes a dangling reference.
The fix adds a vector<> to FileArchive to own the instantiated MemoryBuffers.
In addition it fixes member's path to be the standard format
(e.g. "/path/libfoo.a(foo.o)") instead of just the leaf name.
llvm-svn: 221544
Request `getPairRelocation()` function to get paired relocation type.
That allows us to look up another pairs like R_MICROMIPS_HI16/LO16
in the future.
llvm-svn: 221539
ELFLinkingContext had these two functions, which is really not needed since
the Writer uses a llvm::object template composed of Endianness, Alignment,
Is32bit/64bit. We could just use that and not duplicate functionality.
No Change In Functionality.
llvm-svn: 221523
If /subsystem option is not given, the linker needs to infer the
subsystem based on the entry point symbol. If it fails to infer
that, the linker should error out on it.
LLD was almost correct, but it would fail to infer the subsystem
if the entry point is specified with /entry. This is because the
subsystem inference was coupled with the entry point function
searching (if no entry point name is specified, the linker needs
to find the right entry name).
This patch makes the subsystem inference an independent pass to
fix the issue. Now, as long as an entry point function is defined,
LLD can infer the subsystem no matter how it resolved the entry
point.
I don't think scanning all the defined symbols is fast, although
it shouldn't be that slow. The file class there does not provide
any easy way to find an atom by name, so this is what we can do
at this moment. I'd like to revisit this later to make it more
efficient.
llvm-svn: 221499
1. The path checks ELF header flags to prevent linking of incompatible files.
For example we do not allow to link files with different ABI, -mnan
flags, some combination of target CPU etc.
2. The patch merge ELF header flags from input object files to put their
combination to the generated file. For example, if some input files
have EF_MIPS_NOREORDER flag we need to put this flag to the output
file header.
I use the `parseFile()` (not `canParse()`) method because in case of
recognition of incorrect input flags combination we should show detailed
error message and stop the linking process and should not try to use
another `Reader`.
llvm-svn: 221439
The darwin linker does not process dwarf debug info. Instead it produces a
"debug map" in the output file which points back to the original .o files for
anything that wants debug info (e.g. debugger).
The -S option means "don't add a debug map". lld for mach-o currently does
not generate the debug map, so there is nothing to do when this option is used.
But we need to process the option to get existing projects building.
llvm-svn: 221432
Darwin uses two-level-namespace lookup for symbols which means the static
linker records where each symbol must be found at runtime. Thus defining a
symbol in a dylib loaded earlier will not effect where symbols needed by
later dylibs will be found. Instead overriding is done through a section
of type S_INTERPOSING which contains tuples of <interposer, interposee>.
llvm-svn: 221421
SECREL relocation's value is the offset to the beginning of the section.
Because of the off-by-one error, if a SECREL relocation target is at the
beginning of a section, it got wrong value.
Added a test that would have caught this.
llvm-svn: 221420
The local variable `cfi` became dead in r220730 when it's use was
obviated; it was replaced with a call to read32.
No functionality change intended.
llvm-svn: 221412
Many programs, for reasons unknown, really like to look at the
AddressOfRelocationTable to determine whether or not they are looking at
a bona fide PE file. Without this, programs like the UNIX `file'
utility will insist that they are looking at a MS DOS executable.
llvm-svn: 221335
LLD skipped COMDAT section symbols when reading them because
I thought we don't want to have symbols with the same name.
But they are actually needed because relocations may refer to
the section symbols. So we shoulnd't skip them.
llvm-svn: 221329
The job of the CompactUnwind pass is to turn __compact_unwind data (and
__eh_frame) into the compressed final form in __unwind_info. After it's done,
the original atoms are no longer relevant and should be deleted (they cause
problems during actual execution, quite apart from the fact that they're not
needed).
llvm-svn: 221301
The ELF writer creates a invalid binary for few cases with large filesize and
memory size for segments. This patch addresses the functionality and updates the
test. This patch also cleans up parts of the ELF writer for future enhancements
to support Linker scripts.
llvm-svn: 221233
Normally, PE files have section names of eight characters or less.
However, this is problematic for DWARF because DWARF section names are
things like .debug_aranges.
Instead of truncating the section name, redirect the section name into
the string table.
Differential Revision: http://reviews.llvm.org/D6104
llvm-svn: 221212
This patch does *not* implement any semantic actions, but it is a first step to
teach LLD how to read complete linker scripts. The additional linker scripts
statements whose parsing is now supported are:
* SEARCH_DIR directive
* SECTIONS directive
* Symbol definitions inside SECTIONS including PROVIDE and PROVIDE_HIDDEN
* C-like expressions used in many places in linker scripts
* Input to output sections mapping
The goal of this commit was guided towards completely parsing a default GNU ld
linker script and the linker script used to link the FreeBSD kernel. Thus, it
also adds a test case based on the default linker script used in GNU ld for
x86_64 ELF targets. I tested SPEC userland programs linked by GNU ld, using the
linker script dump'ed by this parser, and everything went fine. I then tested
linking the FreeBSD kernel with a dump'ed linker script, installed the new
kernel and booted it, everything went fine.
Directives that still need to be implemented:
* MEMORY
* PHDRS
Reviewers: silvas, shankarke and ruiu
http://reviews.llvm.org/D5852
llvm-svn: 221126
lld was regenerating LC_DATA_IN_CODE in .o output files, but not into
final linked images.
Update test case to verify data-in-code info makes it into final linked images.
llvm-svn: 220827
Objective-C switched to a new ABI which uses a different mangling for class
names. But to keep projects building that use export lists that use the old
class name mangling, the linker recognizes the old names and transforms them
to the new mangling.
llvm-svn: 220598
In final linked shared images, the __TEXT segment contains both code and
the mach-o header/load-commands. In the case of a data-only dylib, there is
no code, so we need to force the addition of the __TEXT segment.
llvm-svn: 220597
This is a follow-up patch for r220333. r220333 renames exported symbols.
That raised another issue; if we have both decorated and undecorated names
for the same symbol, we'll end up have two duplicate exported symbol
entries.
This is a fix for that issue by removing duplciate entries.
llvm-svn: 220350
All compiler generated mach-o object files are marked with MH_SUBSECTIONS_VIA_SYMBOLS.
But hand written assembly files need to opt-in if they are written correctly.
The flag means the linker can break up a sections at symbol addresses and
dead strip or re-order functions.
This change recognizes object files without the flag and marks its atoms as
not dead strippable and adds a layout-after chain of references so that the
atoms cannot be re-ordered.
llvm-svn: 220348
There are two ways to specify a symbol to be exported in the module
definition file.
1) EXPORT <external name> = <symbol>
2) EXPORT <symbol>
In (1), you give both external name and internal name. In that case,
the linker tries to find a symbol using the internal name, and write
that address to the export table with the external name. Thus, from
the outer world, the symbol seems to be exported as the external name.
In (2), internal name is basically the same as the external name
with an exception: if you give an undecorated symbol to the EXPORT
directive, and if the linker finds a decorated symbol, the external
name for the symbol will become the decorated symbol.
LLD didn't implement that exception correctly. This patch fixes that.
llvm-svn: 220333
HAVE_CXXABI_H is not defined on FreeBSD but the system actually
has the header. CMake test fails because the header depends on size_t.
llvm-svn: 220315
The canParse function for all the ELF subtargets check if the input files match
the subtarget.
There were few mismatches in the input files that didnt match the subtarget for
which the link was being invoked, which also acts as a test for this change.
llvm-svn: 220182
For PC relative accesses, negative addends were to be ignored. The linker was
not ignoring it and would fail with an assert. This fixes the issue and is able
to get Helloworld working.
llvm-svn: 220179
The old code was used as a workaround to fix how relocations are calculated for
sections with SHF_MERGE|SHF_STRINGS attribute. This patch removes the erroneous
code.
llvm-svn: 220159
This fixes the way archive members are displayed when the linker is used with a
flag to show all the files that it processes.
When an archive file member is read, we need to show the archive filename and
the member.
llvm-svn: 220144
This would permit the ELF reader to check the architecture that is being
selected by the linking process.
This patch also sorts the include files according to LLVM conventions.
llvm-svn: 220129
The code was making non-portable assumptions about the exact string returned by
the glob (possibly by the shell?); this is more robust and matches what is done
everywhere else.
llvm-svn: 220117
Previously we supported x86 only. This patch is to support x64.
The array of pointers to delay-loaded functions points the
DLL delay loading function at start-up. When a function is called
for the first time, the delay loading function gets called and
then rewrite the function pointer array.
llvm-svn: 220096
First, add a comment to support more variation in FDE formats. Second, refactor
fde -> function handling into a separate function living in the ArchHandler.
llvm-svn: 219959
To deal with cycles in shared library dependencies, the darwin linker supports
marking specific link dependencies as "upward". An upward link is when a
lower level library links against a higher level library.
llvm-svn: 219949
This patch creates the import address table and sets its
address to the delay-load import table. This also creates
wrapper functions for __delayLoadHelper2.
x86 only for now.
llvm-svn: 219948
Not all situations are representable in the compressed __unwind_info format,
and when this happens the entry needs to point to the more general __eh_frame
description.
Just x86_64 implementation for now.
rdar://problem/18208653
llvm-svn: 219836
We'll also need references back to the CIE eventually, but for now making sure
we can work out what an FDE is referring to is enough.
The actual kind of reference needs to be different between architectures,
probably because of MachO's chronic shortage of relocation types but I don't
really want to know in case I find out something that distresses me even more.
rdar://problem/18208653
llvm-svn: 219824
Because we use cast<> at the beginning of this function, it will
abort there if a given atom is not a DefinedAtom.
In the switch statement, we checked if a given atom is a DefinedAtom
again by evaluating definition() == Atom::definitionRegular.
This was always true. So we can remove the outer switch statement.
llvm-svn: 219724
Arm code has two instruction encodings "thumb" and "arm". When branching from
one code encoding to another, you need to use an instruction that switches
the instruction mode. Usually the transition only happens at call sites, and
the linker can transform a BL instruction in BLX (or vice versa). But if the
compiler did a tail call optimization and a function ends with a branch (not
branch and link), there is no pc-rel BX instruction.
The ShimPass looks for pc-rel B instructions that will need to switch mode.
For those cases it synthesizes a shim which does the transition, then modifies
the original atom with the B instruction to target to the shim atom.
llvm-svn: 219655
When committed in r219353, this patch originally caused problems because it was
not tested in debug build. In such scenarios, Driver.cpp adds two additional
passes. These passes serialize all atoms via YAML and reads it back. Since the
patch changed ObjectAtom to hold a new reference, the serialization was removing
the extra data.
This commit implements r219853 in another way, similar to the original MIPS way,
by using a StringSet that holds the names of all copied atoms instead of
directly holding a reference to the copied atom. In this way, this commit is
simpler and eliminate the necessity of changing the DefinedAtom hierarchy to
hold a new data.
Reviewers: shankarke
http://reviews.llvm.org/D5713
llvm-svn: 219449
We are going to have another type of jump table for the delay-load
import table. In order to prepare for that, we want to factor out
the function handling the jump table. No functionality change.
llvm-svn: 219446
Previously the field was not set. The field should be pointing to
a placeholder where the DLL delay-loader writes the base address
of a DLL.
llvm-svn: 219415
This is a partial patch to emit the delay-import table. With this,
LLD is now able to emit the table that llvm-readobj can read and
dump.
The table lacks a few fields, such as the address of HMODULE, the
import address table, etc. They'll be added in subsequent patches.
llvm-svn: 219384
Enhances the creation of an ELF dynamic executable by avoiding recording
unnecessary shared libraries as NEEDED to load a program.
To do this, we must keep track of not only symbols that were referenced but
also of COPY relocations, which steal the symbol from a shared library but does
not store from which lib this symbol came from. To fix this, this commit changes
ObjectSymbol to store the original library from which this symbol came. With
this information, we are able to build a list of the exact shared libraries that
must be marked as DT_NEEDED, instead of blindly marking all shared libraries as
needed.
This logic originally came from the MIPS backend with some adaptation.
Reviewers: atanasyan, shankar.easwaran
http://reviews.llvm.org/D5574
llvm-svn: 219353
Updates the remaining tasks in the X86_64 ELF lld backend after the commit that
handles general dynamic TLS relocations.
Reviewer: shankarke
http://reviews.llvm.org/D5673
llvm-svn: 219350
This commit implements in the X86_64 ELF lld backend yet another feature that
was only available in the MIPS backend. However, this patch changes generic ELF
classes to make it trivial for other ELF backends to use this logic too. When
creating a dynamic executable that has dynamic relocations against weak
undefined symbols, these symbols must be exported to the dynamic symbol table
to seek a possible resolution at run time.
A common use case is the __gmon_start__ weak glibc undefined symbol.
Reviewer: shankarke
http://reviews.llvm.org/D5571
llvm-svn: 219349
When creating a dynamic executable and receiving the -E flag, the linker should
export all globally visible symbols in its dynamic symbol table.
This commit also moves the logic that exports symbols in the dynamic symbol
table from OutputELFWriter to the ExecutableWriter class. It is not correct to
leave this at OutputELFWriter because DynamicLibraryWriter, another subclass of
OutputELFWriter, already exports all symbols, meaning we can potentially end up
with duplicated symbols in the dynamic symbol table when creating shared libs.
Reviewers: shankarke
http://reviews.llvm.org/D5585
llvm-svn: 219334
mach-o supports "fat" files which are a header/table-of-contents followed by a
concatenation of mach-o files (or archives of mach-o files) built for
different architectures. Previously, the support for fat files was in the
MachOReader, but that only supported fat .o files and dylibs (not archives).
The fix is to put the fat handing into MachOFileNode. That way any input file
kind (including archives) can be fat. MachOFileNode selects the sub-range
of the fat file that matches the arch being linked and creates a MemoryBuffer
for just that subrange.
llvm-svn: 219268
Teach the reader about ARM NT relocation types. Although the writer cannot yet
perform the actual application of these relocations, the reader can at least now
identify the relocation types.
llvm-svn: 219178
Previously, we would not check the target machine type and the module (object)
machine type. Add a check to ensure that we do not attempt to use an object
file with a different target architecture.
This change identified a couple of tests which were incorrectly mixing up
architecture types, using x86 input for a x64 target. Adjust the tests
appropriately. The renaming of the input and the architectures covers the
changes to the existing tests.
One significant change to the existing tests is that the newly added test input
for x64 uses the correct user label prefix for X64.
llvm-svn: 219093
In order to support more than x86/x86_64, we need to change the behaviour to use
the actual machine type rather than checking the bitness and assuming that we
are on X86. This replaces the use of is64bit in applyAllRelocations with a
check on the machine type. This will enable adding support for handling ARM
relocations.
Rename the existing applyRelocation methods to be similarly named and to make it
clear the types of relocations they will process.
llvm-svn: 219088
This option is added by Xcode when it runs the linker. It produces a binary
file which contains the file the linker used. Xcode uses the info to
dynamically update it dependency tracking.
To check the content of the binary file, the test case uses a python script
to dump the binary file as text which FileCheck can check.
llvm-svn: 219039
When creating the graph edges of the atoms of an ELF file, special care must be
taken with atoms that represent weak symbols. They cannot be the target of any
Reference::kindLayoutAfter edge because they can be merged and point to other
code, screwing up the final layout of the atoms. ELFFile::createAtoms()
correctly handles this corner case. The problem is that createAtoms() assumed
that there can be no zero-sized weak symbols, which is not true. Consider:
my_weak_func1:
my_weak_func2:
my_weak_func3:
code
In this case, we have two zero-sized weak symbols, my_weak_func1 and
my_weak_func2, and one non-zero weak symbol my_weak_func3. createAtoms() would
correctly handle my_weak_func3, but not the first two symbols. This problem
happens in the musl C library when a zero-sized weak symbol is merged and
screws up the file layout. Since this musl code lives at the finalization hooks,
any C program linked with LLD and musl was correctly executing, but segfaulting
at the end.
Reviewers: shankarke
http://reviews.llvm.org/D5606
llvm-svn: 219034
This patch adds logic to avoid putting the dynamic linker library (ld.so) as a
DT_NEEDED entry in the dynamic table. It should only appear in PT_INTERP.
This patch fixes SPEC programs 433, 445, 450, 453, 456, 462 when running on
Ubuntu Linux x86_64 and when linking SPEC programs with LLD and glibc 2.19.
http://reviews.llvm.org/D5573
llvm-svn: 218847
Summary: With r218633, the logic that monitors which shared library symbols were used was copied from the MIPS lld backend to ELF classes, making it available to all ELF backends. However, this made the isDynSymEntryRequired() functions in MipsDynamicLibraryWriter.h/MipsELFWriters.h/MipsExecutableWriter.h to be duplicated logic, since this is already implemented in OutputELFWriter<>/DefaultLayout. This patch removes this duplicated code from MIPS.
Reviewers: Bigcheese, shankarke
Reviewed By: shankarke
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5564
llvm-svn: 218846
No functionality change. This removes a down-cast from LinkingContext to
MachOLinkingContext.
Also, remove const from LinkingContext::createImplicitFiles() to remove
the need for another const cast. Seems reasonable for createImplicitFiles()
to need to modify the context (MachOLinkingContext does).
llvm-svn: 218796
The darwin linker has the -demangle option which directs it to demangle C++
(and soon Swift) mangled symbol names. Long term we need some Diagnostics object
for formatting errors and warnings. But for now we have the Core linker just
writing messages to llvm::errs(). So, to enable demangling, I changed the
Resolver to call a LinkingContext method on the symbol name.
To make this more interesting, the demangling code is done via __cxa_demangle()
which is part of the C++ ABI, which is only supported on some platforms, so I
had to conditionalize the code with the config generated HAVE_CXXABI_H.
llvm-svn: 218718
This is yet another edge case of ambiguous name resolution.
When a symbol is specified with /entry:SYM, SYM may be resolved
to the C++ mangled function name (?SYM@@YAXXZ).
llvm-svn: 218706
This is a minimally useful pass to construct the __unwind_info section in a
final object from the various __compact_unwind inputs. Currently it doesn't
produce any compressed pages, only works for x86_64 and will fail if any
function ends up without __compact_unwind.
rdar://problem/18208653
llvm-svn: 218703
Summary:
This patch adds support for the general dynamic TLS access model for X86_64 (see www.akkadia.org/drepper/tls.pdf).
To properly support TLS, the patch also changes the __tls_get_addr atom to be a shared library atom instead of a regularly defined atom (the previous lld approach). This closely models the reality of a function that will be resolved at runtime by the dynamic linker and loader itself (ld.so). I was tempted to force LLD to link against ld.so itself to resolve these symbols, but since GNU ld does not need the ld.so library to resolve this symbol, I decided to mimic its behavior and keep hardwired a definition of __tls_get_addr in the lld code.
This patch also moves some important logic that previously was only available to the MIPS lld backend to be used to all ELF backends. This logic, which now lives in the DefaultLayout class, will monitor which external (shared lib) symbols are really imported by the current module and will only populate the dynamic symbol table with used symbols, as opposed to the previous approach of dumping all shared lib symbols in the dynamic symbol table. This is important to this patch to avoid __tls_get_addr from getting injected into all dynamic symbol tables.
By solving the previous problem of always adding __tls_get_addr, now the produced symbol tables are slightly smaller. But this impacted several tests that relied on hardwired/predefined sizes of the symbol table, requiring this patch to update such tests.
Test Plan: Added a LIT test case that exercises a simple use case of TLS variable in a shared library.
Reviewers: ruiu, rafael, Bigcheese, shankarke
Reviewed By: Bigcheese, shankarke
Subscribers: emaste, shankarke, joerg, kledzik, mcrosier, llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D5505
llvm-svn: 218633
Previously we emit two or more identical definitions for an
exported symbol if the same /export option is given more than
once. This patch fixes that bug.
llvm-svn: 218433
This patch is difficult to test in isolation, so a subsequent patch will test
further.
Patch by Daniel Stewart <stewartd@codeaurora.org>!
Phabricator Revision: http://reviews.llvm.org/D5377
llvm-svn: 218418
lib.exe prints a warning if a symbol in a module definition file has
both the PRIVATE attribute and an ordinal like this.
EXPORTS
foo @1 PRIVATE
This patch suppresses that.
llvm-svn: 218395
Currently you can omit the leading underscore from exported
symbol name. LLD will look for mangled name for you. But it won't
look for C++ mangled name.
This patch is to support that.
If "sym" is specified to be exported, the linker looks for not
only "sym", but also "_sym" and "?sym@@<whatever>", so that you
can export a C++ function without decorating it.
llvm-svn: 218355
Exported symbol name resolution is two-pass. In the first pass,
we try to resolve that as a regular undefined symbol. If it fails,
we look for mangled name for the symbol and rename the undefined
symbol and try again.
After all name resolution is done, we look for an atom for each
exported symbol again, to construct the export table. In this
process we try the regular names first, and then try mangled names.
But at this moment we should have knew which name is correct.
This patch is to keep the information we get in the first process
to use it later.
llvm-svn: 218354
The export table descriptor is a data structure to keep information
about the export table. It contains a symbol name, and the name may
or may not be mangled.
We need unmangled names for the export table, so we demangle them
before writing them to the export table.
Obviously this is not a correct round-trip conversion. That could
drop a leading underscore from a symbol because that's
indistinguishable from a mangled name.
What we need to do is to keep unmangled names. This patch does that.
llvm-svn: 218345
/machine:ebc was previously recognized but rejected. Unknown architecture
names were handled differently but eventually rejected too. We don't need
to distinguish them.
llvm-svn: 218344
This patch changes the type of export table set from std::set to
std::vector. The new code is slightly inefficient, but because
export table elements are actually mutable, std::vector is better
here. No functionality change.
llvm-svn: 218343
If two or more /export options are given for the same symbol, we should
always print a warning message and use the first one regardless of other
parameters.
Previously there was a case that the first parameter is not used.
llvm-svn: 218342
A symbol in a module definition file may be annotated with the
PRIVATE keyword like this.
EXPORTS
func PRIVATE
The PRIVATE keyword does not affect the resulting .dll file.
But it prevents the symbol to be listed in the .lib (import
library) file.
llvm-svn: 218273
Patch from Rafael Auler!
When a shared lib has an undefined symbol that is defined in a regular object
(the program), the final executable must export this symbol in the dynamic
symbol table. However, in the current logic, lld only puts the symbol in the
dynamic symbol table if the symbol is weak. This patch fixes lld to put the
symbol in the dynamic symbol table regardless if it is weak or not.
This caused a problem in FreeBSD10, whose programs link against a crt1.o
that defines the symbol __progname, which is, in turn, undefined in libc.so.7
and will only be resolved in runtime.
http://reviews.llvm.org/D5424
llvm-svn: 218259
llvm\tools\lld\lib\readerwriter\macho\macholinkingcontext.cpp(647):
warning C4715: 'lld::MachOLinkingContext::exportSymbolNamed' :
not all control paths return a value
llvm\tools\lld\lib\readerwriter\macho\machonormalizedfilefromatoms.cpp(723):
warning C4715: '`anonymous namespace'::Util::getSymbolTableRegion' :
not all control paths return a value
While all enum values do appear in the switch, an uninitialized or corrupted
enum variable would not be caught without the default: case in the switch.
llvm-svn: 218197
Atoms are ordered in the output file by ordinal. File has file ordinal,
and atom has atom ordinal which is unique within the file.
No two atoms should have the same combination of ordinals.
However that contract was not satisifed for alias atoms. Alias atom
is defined by /alternatename:sym1=sym2. In this case sym1 is defined
as an alias for sym2. sym1 always got ordinal 0.
As a result LLD failed with an assertion failure.
This patch assigns ordinal to alias atoms.
llvm-svn: 218158
Cache the machine type value of the linking context. We need this in order to
calculate the virtual address of the atom when resolving function symbols.
Windows on ARM must check if the atom is a function and if so, set the Thumb bit
for the returned virtual address. Failure to do so will result in an abnormal
exit due to a trap caused by invalid instruction decoding. The same information
can be used to determine the relocation type that was previously being done via
is64 to select between x86 and x86_64.
llvm-svn: 218106
Accept /machine:arm as an argument. This is changed to support ARM NT.
Although there is no way to differentiate between ARM (Windows CE) and ARM NT
(Windows on ARM), since LLVM currently only supports Windows on ARM, simply take
/machine:arm to mean Windows on ARM.
llvm-svn: 218105
Rather than saving whether we are targeting 64-bit x86 (x86_64), simply convert
the single use of that information to the actual relocation type. This will
permit the selection of non-x86 relocation types (e.g. for WoA support).
Inline the access of the machine type field as it is relatively cheap (a couple
of pointer dereferences) rather than storing the relocation type as a member
variable.
llvm-svn: 218104
When we encounter an unknown machine type, we print out the machine type magic.
However, we would print out the magic in decimal rather than hex. Perform this
conversion to make it easier to identify what machine is unsupported.
llvm-svn: 218103
This patch fixes a forbidden use of Twine. It should only be used
as an intermediary value, but never stored.
This caused a bug in lld when running on Linux and compiled with
optimizations - it couldn't properly search libs.
Patch from Rafael Auler!
llvm-svn: 218083
I made LLD to report an error if /safeseh:no option is given on x64,
but it turned out MSVC link.exe doesn't report error on it.
Removing the check.
llvm-svn: 218077
The contents from section .CRT$XLA to .CRT$XLZ is an array of function
pointers. They are called by the runtime when a new thread is created
or (gracefully) terminated.
You can make your own initialization function to be called by that
mechanism. All you have to do is:
- Define a pointer to a function in a .CRT$XL* section using pragma
- Make an external reference to "__tls_used" symbol
That technique is used in many projects. This patch is to support that.
What this patch does is to set the relative virtual address of
"__tls_used" to the PECOFF directory table. __tls_used is actually a
struct containing pointers to a symbol in .CRT$XLA and another symbol
in .CRT$XLZ. The runtime looks at the directory table, gets the address
of the struct, and call the function pointers between XLA and XLZ.
llvm-svn: 218007
On darwin, the linker tools records which dylib (DSO) each undefined was found
in, and then at runtime, the loader (dyld) only looks in that one specific
dylib for each undefined symbol. Now that llvm-objdump can display that info
I can write test cases.
llvm-svn: 217898
lld shouldn't directly use the COFF header nor should it use raw
coff_symbols. Instead, query the header properties from the
COFFObjectFile and use COFFSymbolRef to abstractly reference COFF
symbols.
This is just enough to get lld compiling with the changes to
llvm::object. Bigobj specific testing will come later.
Differential Revision: http://reviews.llvm.org/D5280
llvm-svn: 217497
Most of the changes are in the new file ArchHandler_arm64.cpp. But a few
things had to be fixed to support 16KB pages (instead of 4KB) which iOS arm64
requires. In addition the StubInfo struct had to be expanded because
arm64 uses two instruction (ADRP/LDR) to load a global which requires two
relocations. The other mach-o arches just needed one relocation.
llvm-svn: 217469
There is a bit (MH_PIE) in the flags field of the mach_header which tells
the kernel is a program was built position independent (for ASLR). The linker
automatically attempts to build programs PIE if they are built for a recent
OS version. But the -pie and -no_pie options override that default behavior.
llvm-svn: 217408
defined in a shared library.
Now LLD does not export a strong defined symbol if it coalesces away a
weak symbol defined in a shared library. This bug affects all ELF
architectures and leads to segfault:
% cat foo.c
extern int __attribute__((weak)) flag;
int foo() { return flag; }
% cat main.c
int flag = 1;
int foo();
int main() { return foo() == 1 ? 0 : -1; }
% clang -c -fPIC foo.c main.c
% lld -flavor gnu -target x86_64 -shared -o libfoo.so ... foo.o
% lld -flavor gnu -target x86_64 -o a.out ... main.o libfoo.so
% ./a.out
Segmentation fault
The problem is caused by the fact that we lose all information about
coalesced symbols after the `Resolver::resolve()` method is finished.
The patch solves the problem by overriding the
`LinkingContext::notifySymbolTableCoalesce()` method and saving names
of coalesced symbols. Later in the `buildDynamicSymbolTable()` routine
we use this information to export these symbols.
llvm-svn: 217363
By default linker would not create a separate segment to hold read only data.
This option overrides that behavior by creating the a separate read only segment
for read only data.
llvm-svn: 217358
Moved code used only by isDataSymbol from find to isDataSymbol member
function. Also changed the return type of isDataSymbol because
previously "if (isDataSymbol(...))" meant "if it is *not* a data symbol"
which is opposite from what you'd expect.
llvm-svn: 217285
Mach-O has a "fat" (or "universal") variant where the same contents built for
different architectures are concatenated into one file with a table-of-contents
header at the start. But this leaves a dilemma for the linker - which
architecture to use.
Normally, the linker command line -arch is used to force which slice of any fat
files are used. The clang compiler always passes -arch to the linker when
invoking it. But some Makefiles invoke the linker directly and don’t specify
the -arch option. For those cases, the linker scans all input files in command
line order and finds the first non-fat object file. Whatever architecture it
is becomes the architecture for the link.
llvm-svn: 217189
The use of default: was disabling the warning about unused enumerators. Fix
that, then fix the one enumerator that was not handled. Add coverage for
it in test suite.
llvm-svn: 217078
On Darwin at runtime, dyld will prefer to use the export trie of a dylib instead
of the traditional symbol table (which is large and requires a binary search).
This change enables the linker to generate an export trie and to prefer it if
found in a dylib being linked against. This also simples the yaml for dylibs
because the yaml form of the trie can be reduced to just a sequence of names.
llvm-svn: 217066
I hope this is the last fix for x64 relocations as I've wasted
a few days on this.
This caused a mysterious issue that some C++ programs crash on
startup. It was because a null pointer is passed as argv to main.
__tmainCRTStartup calls main, but before that it calls all
initialization routines between .text$xc_a and .text$xc_z.
pre_cpp_init is one of such routines, and it is the one who
initializes a heap pointer for argv for later use. That routine
was not called for some reason.
It turned out that __tmainCRTStartup was skipping a block of
code because of the relocation bug. A condition in the function
depends on a memory load, and that memory load was referring
a wrong location. As a result a jump instruction took the
wrong branch, skipping pre_cpp_init and so on.
This patch fixes the issue. Also added more tests to fix them
once and for all.
llvm-svn: 216772
When a relocation is applied to a location, the new value needs
to be added to the existing value at the location. Existing
value is in most cases zero, but if not, the current code does
not work.
llvm-svn: 216680
Image Base field in the PE/COFF header is used as hint for the loader.
If the loader can load the executable at the specified address, that's
fine, but if not, it has to load it at a different address.
If that happens, the loader has to fix up the addresses in the
executable by adding the offset. The list of addresses that need to
be fixed is in .reloc section.
This patch is to emit x64 .reloc section contents.
llvm-svn: 216636
IMAGE_REL_AMD64_ADDR64 relocation should set 64-bit *VA* (virtual
address) instead of *RVA* (relative virtual address), so we have
to add the iamge base to the target's RVA.
llvm-svn: 216512
The implementation of AMD64 relocations was imcomplete
and wrong. On AMD64, we of course have to use AMD64
relocations instead of i386 ones. This patch fixes the
issue.
LLD is now able to link hello64.obj (created from
hello64.asm) against user32.lib and kernel32.lib to
create a Win64 binary.
llvm-svn: 216253
Mach-O symbols can have an attribute on them means their content should never be
dead code stripped. This translates to deadStrip() == deadStripNever.
llvm-svn: 216234
Both options control the final scope of atoms.
When -exported_symbols_list <file> is used, the file is parsed into one
symbol per line in the file. Only those symbols will be exported (global)
in the final linked image.
The -keep_private_externs option is only used with -r mode. Normally, -r
mode reduces private extern (scopeLinkageUnit) symbols to non-external. But
add the -keep_private_externs option keeps them private external.
llvm-svn: 216146
The darwin linker has an option, heavily used by Xcode, in which, instead
of listing all input files on the command line, the input file paths are
written to a text file and the path of that text file is passed to the linker
with the -filelist option (similar to @file).
In order to make test cases for this, I generalized the -test_libresolution
option to become -test_file_usage.
llvm-svn: 215762
Darwin has a packaging mechanism for shared libraries and headers called
frameworks. A directory Foo.framework contains a shared library binary file
"Foo" and a subdirectory "Headers". Most OS frameworks are all in one
directory /System/Library/Frameworks/. As a linking convenience, the linker
option "-framework Foo" means search the framework directories specified
with -F (analogous to -L) looking for a shared library Foo.framework/Foo.
llvm-svn: 215680
In general two-level namespace means each program records exactly which dylib
each undefined (imported) symbol comes from. But, sometimes the implementor
wants to hide the implementation dylib. For instance libSytem.dylib is the base
dylib all Darwin programs must link with. A few years ago it was split up
into two dozen dylibs by all are hidden behind libSystem.dylib which re-exports
each sub-dylib. All clients still think libSystem.dylib is the implementor.
To support this, the linker must load "indirect" dylibs and not just the
"direct" dylibs specified on the command line. This is done in the
createImplicitFiles() method after all command line specified files are
loaded. Since an indirect dylib may have already been loaded as a direct dylib
(or indirectly via a previous direct dylib), the MachOLinkingContext keeps
a list of all loaded dylibs.
With this change hello world can now be linked against the real OS or SDK.
llvm-svn: 215605
Split up the CRuntimeFile into one part for output types that need an entry
point and another part for output types that use stubs.
Add file 'test/mach-o/Inputs/libSystem.yaml' for use by test cases that
use -dylib and therefore may now need the helper symbol in libSystem.dylib.
llvm-svn: 215602
Mach-o uses "two-level namespace" where each undefined symbols is associated
with a specific dylib. This means at runtime the loader (dyld) does need to
search all loaded dylibs for that symbol but rather just the one specified.
Now that llvm-nm -m prints out that info, properly set that info, and test
it in the hello world test cases.
llvm-svn: 215598
This patch adds the initial ELF/AArch64 support to lld. Only a basic "Hello
World" app has been successfully tested for both dynamic and static compiling.
Differential Revision: http://reviews.llvm.org/D4778
Patch by Daniel Stewart <stewartd@codeaurora.org>!
llvm-svn: 215544
The tests assume the path separator is '/', but if you run
them on Windows it is '\'. As a result the tests are failing
on Windows. This should be the minimal change to make these
tests to pass on Windows platform.
Differential Revision: http://reviews.llvm.org/D4710
llvm-svn: 214990
/INCLUDE arguments passed as command line options are handled in the
same way as Unix -u. All option values are converted to an undefined
symbol and added to a dummy input file, so that the specified symbols
are resolved.
One tricky thing on Windows is that the option is also allowed to
appear in the object file's directive section. At the time when
it's being read, all (regular) command line options have already
been processed. We cannot add undefined atoms to the dummy file
anymore.
Previously, we added such /INCLUDE to a set that has already been
processed. As a result the options were ignored.
This patch fixes the issue. Now, /INCLUDE symbols in the directive
section are handled as real undefined symbol in the COFF file.
We create an undefined symbol for each /INCLUDE argument and add
it to the file being parsed.
llvm-svn: 214824
The PE/COFF spec says that SizeOfRawData field in the section
header must be a multiple of FileAlignment from the optional
header. LLD emits 512 as FileAlignment, so it must have been
a multiple of 512.
LLD did not follow that. It emitted the actual section size
without the last padding as the SizeOfRawData. Although it's
not correct as per the spec, the Windows loader doesn't seem
to actually bother to check that. Executables created by LLD
worked fine.
However, tools dealing with executalbe files may expect it
to be the correct value, and one instance of it is mt.exe
tool distributed as a part of Windows SDK.
If CMake is invoked with "-E vs_link_exe" option, it silently
run mt.exe to embed a resource file to the resulting file.
And mt.exe sometimes breaks an input file if it's section
header does not follow the standard. That caused a misterous
error that CMake with Ninja occasionally produces a broken
executable.
This patch fixes the section header to make mt.exe and
other tools happy.
llvm-svn: 214453
In some cases the address of a function will be materialized with a movw/movt
pair. If the function is a thumb function, the low bit needs to be set on
the movw immediate value.
llvm-svn: 214277
The -sectalign option is used to increase the alignment required for a section.
It required some reworking of how the __TEXT segment is laid out because that
segment also contains the mach_header and load commands. And the size of load
commands depend on the number of segments, sections, and dependent dylibs used.
Using this option will simplify some future test cases because the final
address of code can be pinned down, making tests of its content easier.
llvm-svn: 214268
All iOS arm processor support switching between arm and thumb mode at call sites
by using the BLX instruction (instead of BL). But the compiler does not know
the implementation mode for extern functions, so the linker must update BL/BLX
instructions to match what is linked is actually linked together. In addition,
pointers to functions (such as vtables) must have the low bit set if the target
of the pointer is a thumb mode function.
llvm-svn: 214140
Sometimes compilers emit data into code sections (e.g. constant pools or
jump tables). These runs of data can throw off disassemblers. The solution
in mach-o is that ranges of data-in-code are encoded into a table pointed to
by the LC_DATA_IN_CODE load command.
The way the data-in-code information is encoded into lld's Atom model is that
that start and end of each data run is marked with a Reference whose offset
is the start/end of the data run. For arm, the switch back to code also marks
whether it is thumb or arm code.
llvm-svn: 213901
The entry point file needs to be processed after all other
object files and before all .lib files. It was processed
after .lib files. That caused an issue that the entry point
function was not resolved from the standard library files.
llvm-svn: 213804
On Windows there are four "main" functions -- main, wmain, WinMain,
or wWinMain. Their parameter types are diffferent. The standard
library provides four different entry functions (i.e.
{w,}{WinMain,main}CRTStartup) for them. You need to use the right
entry routine for your "main" function.
If you give an /entry option, the specified name is used
unconditionally.
Otherwise, the linker needs to select the right one based on
user-supplied entry point function. This can be done after the
linker reads all the input files.
This patch moves the code to determine the entry point function
from the driver to a virtual input file. It also implements the
correct logic for the entry point function selection.
llvm-svn: 213713
This patch just supports marking ranges that are thumb code (vs arm code).
Future patches will mark data and jump table ranges. The ranges are encoded
as References with offsetInAtom being the start of the range and the target
being the same atom.
llvm-svn: 213712
Over time the symbols and relocations have changed for dwarf unwind info
in the __eh_frame section. Add test cases for older and new style.
llvm-svn: 213585
Add support for adding section relocations in -r mode. Enhance the test
cases which validate the parsing of .o files to also round trip. They now
write out the .o file and then parse that, verifying all relocations survived
the round trip.
llvm-svn: 213333
The code to manage resolvable symbols is now separated from
ExportedSymbolRenameFile so that other class can reuse it.
I'm planning to use it to find the entry function symbol
based on resolvable symbols.
llvm-svn: 213322
All architecture specific handling is now done in the appropriate
ArchHandler subclass.
The StubsPass and GOTPass have been simplified. All architecture specific
variations in stubs are now encoded in a table which is vended by the
current ArchHandler.
llvm-svn: 213187
There are two forms of `-l` prefixed expression:
* -l<libname>
* -l:<filename>
In the first case a linker should construct a full library name
`lib + libname + .[so|a]` and search this library as usual. In the second case
a linker should use the `<filename>` as is and search this file through library
search directories.
The patch reviewed by Shankar Easwaran.
llvm-svn: 213077
Previously we invoked cvtres.exe for each compiled Windows
resource file. The generated files were then concatenated
and embedded to the executable.
That was not the correct way to merge compiled Windows
resource files. If you just concatenate generated files,
only the first file would be recognized and the rest would
be ignored as trailing garbage.
The right way to merge them is to call cvtres.exe with
multiple input files. In this patch we do that in the
Windows driver.
llvm-svn: 212763
These behave slightly idiosyncratically in the best of cases, and have
additional hacks layered on top of that for compatibility with badly behaved
build systems (via ld64).
For -lXYZ:
+ If XYZ is actually XY.o then search all library paths for XY.o
+ Otherwise search all library paths, first for libXYZ.dylib, then libXYZ.a
+ By default the library paths are /usr/lib and /usr/local/lib in that order.
For -syslibroot:
+ -syslibroot options apply to absolute paths in the search order.
+ All -syslibroot prefixes that exist are added to the search path *instead*
of the original.
+ If no -syslibroot prefixed path exists, the original is kept.
+ Hacks^WExceptions:
+ If only 1 -syslibroot is given and doesn't contain /usr/lib or
/usr/local/lib, that path is dropped entirely. (rdar://problem/6438270).
+ If the last -syslibroot is "/", all of them are ignored entirely.
(rdar://problem/5829579).
At least, that's my best interpretation of what ld64 does in buildSearchPaths.
llvm-svn: 212706
Previously the alignment of the .bss section was not
properly set because of a bug in AtomizeDefinedSymbolsInSection.
We set the alignment of the section at the end of the function,
but we use an eraly return for the .bss section, so the code had
been skipped.
llvm-svn: 212571
This converts the very complicated mach-o arm
relocations into the simple Reference Kinds in lld.
The next patch will use the internal Reference kinds
to fix up arm/thumb code.
llvm-svn: 212306
When trying to map atom types to sections, we were iterating through an array
until we hit a sentinel value. There's no need for such dances when range-based
for loops are available.
llvm-svn: 212035
This isn't really the right place to put them in final object files (that would
be __TEXT,__unwind_info), but the format is different between relocatable and
final objects, which means we really need a pass to handle the translation.
For now, re-emitting in __LD,__compact_unwind is harmless (dyld ignores it and
moves straight on to inspecting __TEXT,__eh_frame), and sidesteps an assertion
failure when processing files containing compact-unwind info.
llvm-svn: 212032
Segments must occupy a multiple of the page size in memory (4096 currently). We
check for this when emitting files, but the placement algorithm broke down for
the second non-__TEXT segment encountered: the offset wasn't aligned up to 4096
before starting its layout.
llvm-svn: 212031
Because of how we were calculating fileOffset and fileSize for segments, most
ended up at a single offset in a finalised MachO file. This meant the data
often didn't even get written in the final object, let alone where it would be
useful.
llvm-svn: 212030
This is first step in reworking how mach-o relocations are processed.
The existing KindHandler is going to become a delgate/helper object for
processing architecture specific references. The KindHandler knows how
to convert mach-o relocations into References and back, as well, as fixing
up the content the relocation is on.
One of the messy things about mach-o relocations is that they sometime
come in pairs, but the pairs still convert to one lld::Reference. So, the
conversion has to detect pairs (arch specific) and change the stride.
llvm-svn: 211921
The previous function returned true for "s < s", which could completely mess up
the sorting of symbols within a section.
Unfortunately, I don't think there's a robust way to write a test for this.
Anything I come up with will be making assumptions about the particular
implementation of std::sort.
llvm-svn: 211704
When looking through sections with zero-terminated string-literals (__cstring
or __ustring) we were constantly rechecking the first few bytes of the string
for '\0' rather than advancing along. This obviously failed unless all strings
within the section had the same length as that first one.
llvm-svn: 211682
We were trying to examine the first symbol in a section that we wanted to
atomize by symbols, even when there wasn't one. Instead, we should make the
initial anonymous symbol cover the entire section in that situation.
llvm-svn: 211681
dynamic symbol table populating and DT_NEEDED tag creation.
The `isDynSymEntryRequired` function returns true if the specified shared
library atom requires a dynamic symbol table entry. The `isNeededTagRequired`
function returns true if we need to create DT_NEEDED tag for the shared
library defined specified shared atom.
By default the both functions return true. So there is no functional changes
for all targets except MIPS. Probably we need to spread the same modifications
on other ELF targets but I want to implement and fully tested complete set of
changes for MIPS target first.
For MIPS we create a dynamic symbol table entry for a shared library atom iif
this atom is referenced by a regular defined atom. For example, if library L1
defines symbol T1, library L2 defines symbol T2 and uses symbol T1
and executable file E1 uses symbol T2 but does not use symbol T1 we create
an entry in the E1 dynamic symbol table for symbol T2 and do not create
an entry for T1.
The patch creates DT_NEEDED tags for shared libraries contain shared library
atoms which a) referenced by regular defined atoms; b) have corresponding
copy dynamic relocations (R_MIPS_COPY).
Now the patch does not take in account --as-needed / --no-as-needed command
line options. So it is too restrictive and create DT_NEEDED tags for really
needed shared libraries only. I plan to fix that by subsequent patches.
llvm-svn: 211674
COFF supports a feature similar to ELF's section groups. This
patch implements it.
In ELF, section groups are identified by their names, and they are
treated somewhat differently from regular symbols. In COFF, the
feature is realized in a more straightforward way. A section can
have an annotation saying "if Nth section is linked, link this
section too."
I added a new reference type, kindAssociate. If a target atom is
coalesced away, the referring atom is removed by Resolver, so that
they are treated as a group.
Differential Revision: http://reviews.llvm.org/D4028
llvm-svn: 211106
The previous commit uncovered a bug in the mach-o writer whereby two __text
sections were created. But the test case did not catch that. So I updated
the test case to run the linker a second time, reading the output of the
first pass.
llvm-svn: 210624
COFF supports a feature similar to ELF's section groups. This
patch implements it.
In ELF, section groups are identified by their names, and they are
treated somewhat differently from regular symbols. In COFF, the
feature is realized in a more straightforward way. A section can
have an annotation saying "if Nth section is linked, link this
section too."
Implementing such feature is easy. We can add a reference from a
target atom to an original atom, so that if the target is linked,
the original atom is also linked. If not linked, both will be
dead-stripped. So they are treated as a group.
I added a new reference type, kindAssociate. It does nothing except
preventing referenced atoms from being dead-stripped.
No change to the Resolver is needed.
Reviewers: Bigcheese, shankarke, atanasyan
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3946
llvm-svn: 210240
This provides support for the autoconfing & make build style.
The format, style and implementation follows that used within the llvm and clang projects.
TODO: implement out-of-source documentation builds.
llvm-svn: 210177
Previously FileArchive ctor comment said that only its subclasses
can be instantiated, but the ctor is actually public and is
instantiated by ArchiveReader.
Remove the wrong comment and reorder the member functions so that
public members appear before private ones.
llvm-svn: 210175
In sections that are broken into atoms at symbols, if the first symbol in the
section is not at the start of the section, then make an anonymous atom for
the section content that is before the first symbol.
llvm-svn: 210142
Previously each section kind had its own code to loop over the section and
parse it into atoms. This refactoring has two tables. The first maps sections
to ContentType. The second maps ContentType to information on how to find
the atom boundaries.
A few bugs in test cases were discovered as part of the refactoring.
No change in functionality intended.
llvm-svn: 210138
Previously section groups are doubly linked to their children.
That is, an atom representing a group has group-child references
to its group contents, and content atoms also have group-parent
references to the group atom. That relationship was invariant;
if X has a group-child edge to Y, Y must have a group-parent
edge to X.
However we were not using group-parent references at all. The
resolver only needs group-child edges.
This patch simplifies the section group by removing the unused
reverse edge. No functionality change intended.
Differential Revision: http://reviews.llvm.org/D3945
llvm-svn: 210066
Arrange .ctors/.dtors sections in the following order:
.ctors from crtbegin.o or crtbegin?.o
.ctors from regular object files
.ctors.* (sorted) from regular object files
.ctors from crtend.o or crtend?.o
This order is specific for MIPS traget. For example, on X86
the .ctors.* sections are merged into the .init_array section.
llvm-svn: 209987
The main problem is in the predicate passed to the `std::stable_sort()`.
This predicate always returns false if **both** section's names do not
start with `.init_array` or `.fini_array` prefixes. In short, it does not
define a strict weak orderng. Suppose we have the following sections:
.A .init_array.1 .init_array.2
The predicate states that:
not .init_array.1 < .A
not .A < .init_array.2
but .init_array.1 < .init_array.2 !!!
The second problem is that `.init_array` section without number should
go last in the list. Not it has the lowest priority '0' and goes first.
The patch fixes both of the problems.
llvm-svn: 209875
This is a short-term fix to allow lld Readers to return error messages
with dynamic content.
The long term fix will be to enhance ErrorOr<> to work with errors other
than error_code. Or to change the interface to Readers to pass down a
diagnostics object through which all error messages are written.
llvm-svn: 209681
/alternatename is a command line option to define a weak alias. You
can use it as /alternatename:foo=bar to define "foo" as a weak alias
for "bar".
Because it's a command line option, the weak alias mapping is in the
LinkingContext object, and not in a object file being read.
Previously, we looked up the mapping each time we read a new symbol
from a file, to check if there is a weak alias defined for the symbol.
That's not wrong, but had made function signature's a bit complicated --
we had to pass the mapping object to many functions. Now their
parameter lists are much cleaner.
This also has another (unrealized) benefit. parseFile() now read a
file and then add alias symbols to the file. In the first pass a
LinkingContext object is not used at all. That should make it easy
to read files from archive files speculatively, as the first pass
is free from side effect.
llvm-svn: 209486
Alias symbols are SimpleDefinedAtoms and are platform neutral. They
don't have to belong ELF. This patch is to make it available to all
platforms. No functionality change intended.
Differential Revision: http://reviews.llvm.org/D3862
llvm-svn: 209475
addResolvableSymbols() queues input files, and readAllSymbols() reads
from them. In practice it's currently safe because they are called from
a single thread. But it's not guaranteed.
Also, acquiring the same mutex is needed not to see inconsistent memory
contents that is allowed in the C++ memory model.
llvm-svn: 209254
ExportedSymbolRenameFile is not always used. In most cases we don't
need to read given files at all. So lazy load would help. This doesn't
change the meaining of the program.
llvm-svn: 208818
As written in the comment in this patch, symbol names specified with
/export option is resolved in a special way; for /export:foo, linker
finds a foo@<number> symbol if such symbols exists.
On Windows, a function in stdcall calling convention is mangled with
a leading underscore and following "@" and numbers. This name
mangling is kind of automatic, so you can sometimes omit _ and @number
when specifying a symbol. /export option is that case.
Previously, if a file in an archive file foo.lib provides a symbol
_fn@8, and /export:fn is specified, LLD failed to resolve the symbol.
It only tried to find _fn, and failed to find _fn@8. With this patch,
_fn@8 will be searched on the second iteration.
Differential Revision: http://reviews.llvm.org/D3736
llvm-svn: 208754
Make it possible to add observers to an Input Graph, so that files
returned from an Input Graph can be examined before they are
passed to Resolver.
To implement some PE/COFF features we need to know all the symbols
that *can* be solved, including ones in archive files that are not
yet to be read.
Currently, Resolver only maintains a set of symbols that are
already read. It has no knowledge on symbols in skipped files in
an archive file.
There are many ways to implement that. I chose to apply the
observer pattern here because it seems most non-intrusive. We don't
want to mess up Resolver with architecture specific features.
Even in PE/COFF, the feature that needs this mechanism is minor.
So I chose not to modify Resolver, but add a hook to Input Graph.
Differential Revision: http://reviews.llvm.org/D3735
llvm-svn: 208753
If one or more dynamic relocation might modify a read-only section,
dynamic table should contain DT_TEXTREL tag.
The patch introduces new `RelocationTable::canModifyReadonlySection()`
method. This method checks through the relocations to see if any modifies
a read-only section. The DynamicTable class calls this method and emits
the DT_TEXTREL tag if necessary.
The patch reviewed by Rui Ueyama and Shankar Easwaran.
llvm-svn: 208670
We did not actively try to resolve dllexported symbols specified
by /export or by a module definition file. So if exported symbols
would be resolved for other reasons, like other symbols refer to
them, that was fine, but if (unreferenced) exported symbols were
in an archive file, and no one refers to that file in the archive,
they remained unresolved.
That would obviously cause the issue that dllexported symbols are
not in a resultant DLL.
In this patch, we create an undefined symbol for each dllexported
symbol, to let the core linker to resolve it.
llvm-svn: 208452
Previously the handling of exported symbol was wrong if it's
specified in a module definition file in the form of
<externalname>=<internalname>. Export the correct symbol.
llvm-svn: 208446
isAlias always returns false and no one is using it. It was
originally added Atom to query if an atom is an alias for another
atom, assuming that alias atoms are different from normal atoms.
We now support atom aliasing, but the way that's implemented is
in a different way than what isAlias assumed. An alias atom is
just a regular defined atom with no content, and it has a layout-
before edge to alias-to atom so that they are layed out at the
same location in the result. So this is dead code, and it doesn't
make much sense to keep it.
llvm-svn: 207884
Export definitions in a module definition file is as follows:
exportedname[=internalname] [@ordinal [NONAME]] [PRIVATE] [DATA]
Previously we did not support =internalname, so users couldn't export
symbols from a DLL with a different name.
llvm-svn: 207827
In general the linker scripts's GROUP command works like a pair
of command line options --start-group/--end-group. But there is
a difference in the files look up algorithm.
The --start-group/--end-group commands use a trivial approach:
a) If the path has '-l' prefix, add 'lib' prefix and '.a'/'.so'
suffix and search the path through library search directories.
b) Otherwise, use the path 'as-is'.
The GROUP command implements more compicated approach:
a) If the path has '-l' prefix, add 'lib' prefix and '.a'/'.so'
suffix and search the path through library search directories.
b) If the path does not have '-l' prefix, and sysroot is configured,
and the path starts with the / character, and the script being
processed is located inside the sysroot, search the path under
the sysroot. Otherwise, try to open the path in the current
directory. If it is not found, search through library search
directories.
https://www.sourceware.org/binutils/docs-2.24/ld/File-Commands.html
The patch reviewed by Shankar Easwaran, Rui Ueyama.
llvm-svn: 207769
When creating a .lib file, we should strip the leading underscore,
but should not strip stdcall atsign suffix. Otherwise produced .lib
files cannot be linked.
llvm-svn: 207729
Previously the input file for the lib.exe command would be removed
as soon as the command exits, so we couldn't write a test to check
the file contents are correct.
This patch adds /lldmoduledeffile: option to retain a copy of the
temporary file at the given file path, so that you can see the file
if you want.
llvm-svn: 207727
Linker should create _imp_ symbols for local use only when such
symbols cannot be resolved in any other way. If it overrides real
imported symbols, such symbols remain virtually unresolved without
error, causing odd issues. I observed that a program linked with
LLD entered an infinite loop before reaching main() because of
this issue.
This patch moves the virtual file creating _imp_ symbols to the
very end of the input file list. Previously, the file is at the end
of the library file group. Linker might revisit the group many times,
so it was not really at the end of the input file list.
llvm-svn: 207605
1. Re-implement PLT entries and dynamic relocations emitting to keep PLT
and relocations table in a consistent state.
2. Initialize st_value and st_other fields for dynamic symbols table
entry if this entry corresponds to an external function which address is
taken in a non-PIC executable. In that case the st_value field holds an
address of the function's PLT entry. Also set STO_MIPS_PLT bit in the
st_other field.
llvm-svn: 207494
Implicit symbol for local use implemented in r207141 was not fully
compatible with MSVC link.exe. In r207141, I implemented the feature
in such way that implicit symbols are defined only when they are
exported with /EXPORT option.
After that I found that implicit symbols are defined not only for
dllexported symbols but for all defined symbols. Actually _imp_
implicit symbols have no relationship with the dllexport feature. You
could add _imp_ to any symbol to get a pointer to the symbol, whether
the symbol is dllexported or not. It looks pretty weird to me but
that's what we want if link.exe behaves that way.
Here is a bit about the implementation: Creating all implicit symbols
beforehand is going to be a huge waste of resource. This feature is
rarely used, and MSVC link.exe even prints out a warning message when
it finds this feature is being used. So we create implicit symbols
on demand. There is an archive file that creates implicit symbols when
they are needed.
llvm-svn: 207476
This patch is to fix a compatibility issue with MSVC link.exe as to
use of dllexported symbols inside DLL.
A DLL exports two symbols for a function. One is non-decorated one,
and the other is with __imp_ prefix. The former is a function that
you can directly call, and the latter is a pointer to the function.
These dllexported symbols are created by linker for programs that
link against the DLL. So, I naturally believed that __imp_ symbols
become available when you once create a DLL and link against it, but
they don't exist until then. And that's not true.
MSVC link.exe is smart enough to allow users to use __imp_ symbols
locally. That is, if a symbol is specified with /export option, it
implicitly creates a new symbol with __imp_ prefix as a pointer to
the exported symbol. This feature allows the following program to
be linked and run, although _imp__hello is not defined in this code.
#include <stdio.h>
__declspec(dllexport)
void hello(void) { printf("Hello\n"); }
extern void (*_imp__hello)(void);
int main() {
_imp__hello();
return 0;
}
MSVC link.exe prints out the following warning when linking it.
LNK4217: locally defined symbol _hello imported in function _main
Using __imp_ symbols locally is I think not a good coding style. One
should just take an address using "&" operator rather than appending
__imp_ prefix. However, there are programs in the wild that depends
on this link.exe's behavior, so we need this feature.
llvm-svn: 207141
Not all symbols are decorated with an underscore in x86. You can
write undecorated symbols in assembly, for example. Thus this
assertion is too strong.
llvm-svn: 207125
We don't use sections with IMAGE_SYM_DEBUG attribute so we basically
want to the symbols for them when reading symbol table. When we skip
them, we need to skip auxiliary symbols too. Otherwise weird error
would happen because aux symbols would be interpreted as regular ones.
llvm-svn: 206931
definition below all of the header #include lines, LLD edition.
IF you want to know more details about this, you can see the recent
commits to Debug.h in LLVM. This is just the LLD segment of a cleanup
I'm doing globally for this macro.
llvm-svn: 206851
Currently LLD supports --defsym only in the form of
--defsym=<symbol>=<integer>, where the integer is interpreted as the
absolute address of the symbol. This patch extends it to allow other
symbol name to be given as an RHS value. If a RHS value is a symbol
name, the LHS symbol will be defined as an alias for the RHS symbol.
Internally, a LHS symbol is represented as a zero-size defined atom
who has an LayoutAfter reference to an undefined atom, whose name is
the RHS value. Everything else is already implemented -- Resolver
will resolve the undefined symbol, and the layout pass will layout
the two atoms at the same location. Looks like it's working fine.
Note that GNU LD supports --defsym=<symbol>=<symbol>+<addend>. That
feature is out of scope of this patch.
Differential Revision: http://reviews.llvm.org/D3332
llvm-svn: 206417
a couple of new virtual functions.
Follow-up to the rL203408. Two virtual functions `createRelocationReference()`
responsible for creation of `ELFReference` have been replaced by a couple of
new virtual functions `createRelocationReferences()` (plural). Each former
function creates a //single// ELFReference for a specified `Elf_Rela`
or `Elf_Rel` relocation records. The new functions responsible for creation
of //all// relocation references for provided symbol.
For all targets except MIPS there are no functional changes.
MIPS ABI has a notion of //paired// relocations. An effective addend of such
relocations are calculated using addends of both pair's members.
Each `R_MIPS_HI16` and `R_MIPS_GOT16` (for local symbols) relocations must have
an associated `R_MIPS_LO16` entry immediately following it in the list
of relocations. Immediately does not mean "next record" in relocations section
but "next record referenced the same symbol". Moreover a single `R_MIPS_LO16`
relocation can be paired with multiple preceding `R_MIPS_HI16/R_MIPS_GOT16`
relocations.
The paired relocation can have offsets belong to the different symbols.
That is why we need to have access to list of all relocations during
construction of `ELFReference` for MIPS target.
The patch reviewed by Shankar Easwaran.
llvm-svn: 206102
Seems getSomething() is more common naming scheme than just a noun
to get something, so renaming these members.
Differential Revision: http://llvm-reviews.chandlerc.com/D3285
llvm-svn: 205589
"x.empty()" is more idiomatic than "x.size() == 0". This patch is to
add such method and use it in LLD.
Differential Revision: http://llvm-reviews.chandlerc.com/D3279
llvm-svn: 205558
.gnu.linkonce sections are similar to section groups.
They were supported before section groups existed and provided a way
to resolve COMDAT sections using a different design.
There are few implementations that use .gnu.linkonce sections
to store simple floating point constants which doesnot require complex section
group support but need a way to store only one copy of the floating point
constant in a binary.
.gnu.linkonce based symbol resolution achieves that.
Review : http://llvm-reviews.chandlerc.com/D3242
llvm-svn: 205280
This reverts commit 5d5ca72a7876c3dd3dd1db83dc6a0d74be9e2cd1.
Discuss on a better design to raise error when there is a similar group with Gnu
linkonce sections and COMDAT sections.
llvm-svn: 205224
.gnu.linkonce sections are similar to section groups. They were supported before
section groups existed and provided a way to resolve COMDAT sections using a
different design. There are few implementations that use .gnu.linkonce sections
to store simple floating point constants which doesnot require complex section
group support but need a way to store only one copy of the floating point
constant. .gnu.linkonce based symbol resolution achieves that.
llvm-svn: 205163
This patch is to support --defsym option for ELF file format/GNU-compatible
driver. Currently it takes a symbol name followed by '=' and a number. If such
option is given, the driver sets up an absolute symbol with the specified
address. You can specify multiple --defsym options to define multiple symbols.
GNU LD's --defsym provides many more features. For example, it allows users to
specify another symbol name instead of a number to define a symbol alias, or it
even allows a symbol plus an offset (e.g. --defsym=foo+3) to define symbol-
relative alias. This patch does not support that, but will be supported in
subsequent patches.
Differential Revision: http://llvm-reviews.chandlerc.com/D3208
llvm-svn: 205029
These classes are declared in a .cpp file but not used in the same compliation
unit. They seems to have been copy-and-pasted from ELFReader.h.
llvm-svn: 204988
of GOT.
* Read addend for R_MIPS_GOT16 relocation.
* Put only high 16 bits of symbol + addend into GOT entries for
locally visible symbols.
llvm-svn: 204247
COMDAT_SELECT_LARGEST is a COMDAT type that make linker to choose the largest
definition from among all of the definition of a symbol. If the size is the
same, the choice is arbitrary.
Differential Revision: http://llvm-reviews.chandlerc.com/D3011
llvm-svn: 204172
The COFF spec says that the SectionNumber field in the symbol table is 16 bit
signed type, but MSVC treats the field as if it is unsigned.
llvm-svn: 203901
This results in some simplifications to the code where an OwningPtr had to
be used with the previous api and then ownership moved to a unique_ptr for
the rest of lld.
llvm-svn: 203809
An object whose machine type header value is unknown looks a bit odd but
is valid. If an object contains only machine-type-independent data, you
can leave the type field unspecified. Some files in oldname.lib are such
object files.
llvm-svn: 203752