This change causes us to read partition specifications from partition
specification sections and split output sections into partitions according
to their reachability from partition entry points.
This is only the first step towards a full implementation of partitions. Later
changes will add additional synthetic sections to each partition so that
they can be loaded independently.
Differential Revision: https://reviews.llvm.org/D60353
llvm-svn: 361925
My recent commits separated symbol resolution from the symbol table,
so the functions to resolve symbols are now in a somewhat wrong file.
This patch moves it to Symbols.cpp.
The functions are now member functions of the symbol.
This is code move change. I modified function names so that they are
appropriate as member functions, though. No functionality change
intended.
Differential Revision: https://reviews.llvm.org/D62290
llvm-svn: 361474
--{start,end}-lib give files grouped by the options the archive file
semantics. That is, each object file between them acts as if it were
in an archive file whose sole member is the file.
Therefore, files between --{start,end}-lib are linked to the final
output only if they are needed to resolve some undefined symbols.
Previously, the feature was implemented this way:
1. We read a symbol table and insert defined symbols to the symbol
table as lazy symbols.
2. If an undefind symbol is resolved to a lazy symbol, that lazy
symbol instantiate ObjFile class for that symbol, which re-insert
all defined symbols to the symbol table.
So, if an ObjFile is instantiated, defined symbols are inserted to the
symbol table twice. Since inserting long symbol names is not cheap,
there's a room to optimize here.
This patch optimzies it. Now, LazyObjFile remembers symbol handles and
passed them over to a new ObjFile instance, so that the ObjFile
doesn't insert the same strings.
Here is a quick benchmark to link clang. "Original" is the original
lld with unmodified command line options. For "Case 1" and "Case 2", I
extracted all files from archive files and replace .a's in a command
line with .o's wrapped with --{start,end}-lib. I used the original lld
for Case 1" and use this patch for Case 2.
Original: 5.892
Case 1: 6.001 (+1.8%)
Case 2: 5.701 (-3.2%)
So, interestingly, --{start,end}-lib are now faster than the regular
linking scheme with archive files. That's perhaps not too surprising,
though, because for regular archive files, we look up the symbol table
with the same string twice.
Differential Revision: https://reviews.llvm.org/D62188
llvm-svn: 361473
Also renames it LinkerDriver::compileBitcodeFiles.
The function doesn't logically belong to SymbolTable. We added this
function to the symbol table because symbol table used to be a
container of input files. This is no longer the case.
Differential Revision: https://reviews.llvm.org/D62291
llvm-svn: 361469
Rather than report "undefined symbol: ", give more informative message
about the object file that defines the discarded section.
In particular, PR41133, if the section is a discarded COMDAT, print the
section group signature and the object file with the prevailing
definition. This is useful to track down some ODR issues.
We need to
* add `uint32_t DiscardedSecIdx` to Undefined for this feature.
* make ComdatGroups public and change its type to DenseMap<CachedHashStringRef, const InputFile *>
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D59649
llvm-svn: 361359
This is a mechanical rewrite of replaceSymbol(A, B) to A->replace(B).
I also added a comment to Symbol::replace().
Technically this change is not necessary, but this change makes code a
bit more concise.
Differential Revision: https://reviews.llvm.org/D62117
llvm-svn: 361123
This is the last patch of the series of patches to make it possible to
resolve symbols without asking SymbolTable to do so.
The main point of this patch is the introduction of
`elf::resolveSymbol(Symbol *Old, Symbol *New)`. That function resolves
or merges given symbols by examining symbol types and call
replaceSymbol (which memcpy's New to Old) if necessary.
With the new function, we have now separated symbol resolution from
symbol lookup. If you already have a Symbol pointer, you can directly
resolve the symbol without asking SymbolTable to do that.
Now that the nice abstraction become available, I can start working on
performance improvement of the linker. As a starter, I'm thinking of
making --{start,end}-lib faster.
--{start,end}-lib is currently unnecessarily slow because it looks up
the symbol table twice for each symbol.
- The first hash table lookup/insertion occurs when we instantiate a
LazyObject file to insert LazyObject symbols.
- The second hash table lookup/insertion occurs when we create an
ObjFile from LazyObject file. That overwrites LazyObject symbols
with Defined symbols.
I think it is not too hard to see how we can now eliminate the second
hash table lookup. We can keep LazyObject symbols in Step 1, and then
call elf::resolveSymbol() to do Step 2.
Differential Revision: https://reviews.llvm.org/D61898
llvm-svn: 360975
Previously, we handled common symbols as a kind of Defined symbol,
but what we were doing for common symbols is pretty different from
regular defined symbols.
Common symbol and defined symbol are probably as different as shared
symbol and defined symbols are different.
This patch introduces CommonSymbol to represent common symbols.
After symbols are resolved, they are converted to Defined symbols
residing in a .bss section.
Differential Revision: https://reviews.llvm.org/D61895
llvm-svn: 360841
SymbolTable's add-family functions have lots of parameters because
when they have to create a new symbol, they forward given arguments
to Symbol's constructors. Therefore, the functions take at least as
many arguments as their corresponding constructors.
This patch simplifies the add-family functions. Now, the functions
take a symbol instead of arguments to construct a symbol. If there's
no existing symbol, a given symbol is memcpy'ed to the symbol table.
Otherwise, the functions attempt to merge the existing and a given
new symbol.
I also eliminated `CanOmitFromDynSym` parameter, so that the functions
take really one argument.
Symbol classes are trivially constructible, so looks like constructing
them to pass to add-family functions is as cheap as passing a lot of
arguments to the functions. A quick benchmark showed that this patch
seems performance-neutral.
This is a preparation for
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131902.html
Differential Revision: https://reviews.llvm.org/D61855
llvm-svn: 360838
The symbol table used to be a container of vectors of input files,
but that's no longer the case because the vectors are moved out of
SymbolTable and are now global variables.
Therefore, addFile doesn't have to belong to any class. This patch
moves the function out of the class.
This patch is a preparation for my RFC [1].
[1] http://lists.llvm.org/pipermail/llvm-dev/2019-April/131902.html
Differential Revision: https://reviews.llvm.org/D61854
llvm-svn: 360666
It makes the --plugin-opt=obj-path= and --plugin-opt=thinlto-index-only=
behavior more consistent - the files will be created in the
BitcodeFiles.empty() case, but I assume whether it behaves this way is
not required by anyone.
LTOObj->run() cannot run with empty BitcodeFiles. There would be an error:
ld.lld: error: No available targets are compatible with triple ""
Differential Revision: https://reviews.llvm.org/D61635
llvm-svn: 360129
Summary:
The gold plugin behavior (creating empty index files for lazy bitcode
files) was added in D46034, but it missed the case when there is no
non-lazy bitcode files, e.g.
ld.lld -shared crti.o crtbeginS.o --start-lib bitcode.o --end-lib ...
crti.o crtbeginS.o are not bitcode, but our distributed build system
wants bitcode.o.thinlto.bc to confirm all expected outputs are created
based on all of the modules provided to the linker.
Differential Revision: https://reviews.llvm.org/D61420
llvm-svn: 359788
Summary:
In ld.bfd/gold, --no-allow-shlib-undefined is the default when linking
an executable. This patch implements a check to error on undefined
symbols in a shared object, if all of its DT_NEEDED entries are seen.
Our approach resembles the one used in gold, achieves a good balance to
be useful but not too smart (ld.bfd traces all DSOs and emulates the
behavior of a dynamic linker to catch more cases).
The error is issued based on the symbol table, different from undefined
reference errors issued for relocations. It is most effective when there
are DSOs that were not linked with -z defs (e.g. when static sanitizers
runtime is used).
gold has a comment that some system libraries on GNU/Linux may have
spurious undefined references and thus system libraries should be
excluded (https://sourceware.org/bugzilla/show_bug.cgi?id=6811). The
story may have changed now but we make --allow-shlib-undefined the
default for now. Its interaction with -shared can be discussed in the
future.
Reviewers: ruiu, grimar, pcc, espindola
Reviewed By: ruiu
Subscribers: joerg, emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D57385
llvm-svn: 352826
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
If a DSO appears more than once with and without --as-needed, ld.bfd and gold consider --no-as-needed to takes precedence over --as-needed. lld didn't and this patch makes it do so.
This makes it a bit away from the position-dependent behavior (how
different occurrences of the same DSO interact) and protects us from
some mysterious runtime errors: if some interceptor libraries add their
own --no-as-needed dependencies (e.g. librt.so), and the user
application specifies -Wl,--as-needed -lrt , the absence of the
DT_NEEDED entry would make dlsym(RTLD_NEXT, "clock_gettime") return NULL
and would break at runtime.
Reviewers: ruiu, espindola
Reviewed By: ruiu
Subscribers: emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D56089
llvm-svn: 350105
Summary:
In glibc, libc.so is a linker script with an as-needed dependency on ld-linux-x86-64.so.2
GROUP ( /lib/x86_64-linux-gnu/libc.so.6 /usr/lib/x86_64-linux-gnu/libc_nonshared.a AS_NEEDED ( /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 ) )
ld-linux-x86-64.so.2 (as-needed) defines some symbols which resolve undefined references in libc.so.6, it will therefore be added as a DT_NEEDED entry, which isn't necessary.
The test case as-needed-not-in-regular.s emulates the libc.so scenario, where ld.bfd and gold don't add DT_NEEDED for a.so
The relevant code in gold/resolve.cc:
// If we have a non-WEAK reference from a regular object to a
// dynamic object, mark the dynamic object as needed.
if (to->is_from_dynobj() && to->in_reg() && !to->is_undef_binding_weak())
to->object()->set_is_needed();
in_reg() appears to do something similar to IsUsedInRegularObj.
This patch makes lld do the similar thing, but moves the check from
addShared to a later stage MarkLive where all symbols are scanned.
Reviewers: ruiu, pcc, espindola
Reviewed By: ruiu
Subscribers: emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D55902
llvm-svn: 349849
Now it returns Symbol. This should be NFC that
avoids doing cast at the caller's sides.
Differential revision: https://reviews.llvm.org/D54627
llvm-svn: 347455
`Type` parameter was used only to check for TLS attribute mismatch,
but we can do that when we actually replace symbols, so we don't need
to type as an argument. This change should simplify the interface of
the symbol table a bit.
llvm-svn: 344394
`SymbolTable` is a singleton class and is a global variable for the
unique instance, so we can always refer the symtab by `Symtab->`.
However, we don't need to use the global varaible from member functions
of SymbolTable class.
llvm-svn: 344075
We have an issue with -wrap that the option doesn't work well when
renamed symbols get PLT entries. I'll explain what is the issue and
how this patch solves it.
For one -wrap option, we have three symbols: foo, wrap_foo and real_foo.
Currently, we use memcpy to overwrite wrapped symbols so that they get
the same contents. This works in most cases but doesn't when the relocation
processor sets some flags in the symbol. memcpy'ed symbols are just
aliases, so they always have to have the same contents, but the
relocation processor breaks that assumption.
r336609 is an attempt to fix the issue by memcpy'ing again after
processing relocations, so that symbols that are out of sync get the
same contents again. That works in most cases as well, but it breaks
ASan build in a mysterious way.
We could probably fix the issue by choosing symbol attributes that need
to be copied after they are updated. But it feels too complicated to me.
So, in this patch, I fixed it once and for all. With this patch, we no
longer memcpy symbols. All references to renamed symbols point to new
symbols after wrapSymbols() is done.
Differential Revision: https://reviews.llvm.org/D50569
llvm-svn: 340387
If we have 2 bitcode inputs for different targets, LLD would
print "<internal>" instead of the name of one of the files.
The patch adds a test and fixes this issue.
llvm-svn: 336794
Patch by Matthew Koontz!
Before, direct calls to __wrap_sym would not map to valid PLT entries,
so they would crash at runtime. This change maps such calls to the same
PLT entry as calls to sym that are then wrapped.
Differential Revision: https://reviews.llvm.org/D48502
llvm-svn: 336609
Summary:
Suppose we visit symbols in this order:
1. weak definition of foo in a lazy object
2. reference of foo
3 (optional). definition of foo
bfd/gold allows 123 but not 12.
Current --warn-backrefs implementation will report both cases as a backward reference. With this change, both 123 (intended) and 12 (unintended) are allowed. The usage of weak definitions usually imply there are also global definitions, so the trade-off is justified.
Reviewers: ruiu, espindola
Subscribers: emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D46624
llvm-svn: 332061
Our code for LazyObject and LazyArchive duplicates.
This patch extracts the common part to remove
the duplication.
Differential revision: https://reviews.llvm.org/D45516
llvm-svn: 330701
I'm proposing a new command line flag, --warn-backrefs in this patch.
The flag and the feature proposed below don't exist in GNU linkers
nor the current lld.
--warn-backrefs is an option to detect reverse or cyclic dependencies
between static archives, and it can be used to keep your program
compatible with GNU linkers after you switch to lld. I'll explain the
feature and why you may find it useful below.
lld's symbol resolution semantics is more relaxed than traditional
Unix linkers. Therefore,
ld.lld foo.a bar.o
succeeds even if bar.o contains an undefined symbol that have to be
resolved by some object file in foo.a. Traditional Unix linkers
don't allow this kind of backward reference, as they visit each
file only once from left to right in the command line while
resolving all undefined symbol at the moment of visiting.
In the above case, since there's no undefined symbol when a linker
visits foo.a, no files are pulled out from foo.a, and because the
linker forgets about foo.a after visiting, it can't resolve
undefined symbols that could have been resolved otherwise.
That lld accepts more relaxed form means (besides it makes more
sense) that you can accidentally write a command line or a build
file that works only with lld, even if you have a plan to
distribute it to wider users who may be using GNU linkers. With
--check-library-dependency, you can detect a library order that
doesn't work with other Unix linkers.
The option is also useful to detect cyclic dependencies between
static archives. Again, lld accepts
ld.lld foo.a bar.a
even if foo.a and bar.a depend on each other. With --warn-backrefs
it is handled as an error.
Here is how the option works. We assign a group ID to each file. A
file with a smaller group ID can pull out object files from an
archive file with an equal or greater group ID. Otherwise, it is a
reverse dependency and an error.
A file outside --{start,end}-group gets a fresh ID when
instantiated. All files within the same --{start,end}-group get the
same group ID. E.g.
ld.lld A B --start-group C D --end-group E
A and B form group 0, C, D and their member object files form group
1, and E forms group 2. I think that you can see how this group
assignment rule simulates the traditional linker's semantics.
Differential Revision: https://reviews.llvm.org/D45195
llvm-svn: 329636