In the lld perf builder r328686 had a negative impact in
stalled-cycles-frontend. Somehow that stat is not showing on my
machine, but the attached patch shows an improvement on cache-misses,
which is probably a reasonable proxy.
My working theory is that given a large input the pieces vector is out
of cache by the time initOffsetMap runs.
Both finalizeContents implementation have a convenient location for
initializing the OffsetMap, so this seems the best solution.
llvm-svn: 329117
Also make certain Thunk methods non-const as this will be required for
an upcoming change.
Differential Revision: https://reviews.llvm.org/D44961
llvm-svn: 328732
When the target saves ElfSym::GlobalOffsetTable in the .got rather than
.got.plt, Target->GotHeaderEntriesNum states the number of extra entries
required in the .got. Rather than having to add Target->GotHeaderEntriesNum to
NumEntries in every function which refers to NumEntries, this patch changes the
initial value of NumEntries in the constructor.
Differential Revision: https://reviews.llvm.org/D44744
llvm-svn: 328559
Previously, we used 0 as an alias for VER_NDX_GLOBAL and had a dummy
entry in SharedFile::Verdefs so that the access to the array is within
its boundary. But that's not straightforwad. We can just stop doing both.
llvm-svn: 328401
Choosing a Shift2 value based on wordsize is cargo-culted from gold.
Assuming that djb hash is a good hash function, choosing bits [4,9]
shouldn't be any worse or better than choosing bits [5,10]. We shouldn't
have copied that behavior that we can't justify in the first place.
Differential Revision: https://reviews.llvm.org/D44547
llvm-svn: 327921
This patch adds changes to start supporting the Power 64-Bit ELF V2 ABI.
This includes:
- Changing the ElfSym::GlobalOffsetTable to be named .TOC.
- Creating a GotHeader so the first entry in the .got is .TOC.
- Setting the e_flags to be 1 for ELF V1 and 2 for ELF V2
Differential Revision: https://reviews.llvm.org/D44483
llvm-svn: 327871
This is the same as 327248 except Arm defining _GLOBAL_OFFSET_TABLE_ to
be the base of the .got section as some existing code is relying upon it.
For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is expected to be at
the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] =
reserved value that is by convention the address of the dynamic section.
Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end
of the .got section with the intention that the .got.plt section would
follow the .got. However this does not always hold with the current
default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent
with the reserved first entry of the .got.plt.
X86, X86_64 and AArch64 will use the .got.plt. Arm, Mips and Power use .got
Fixes PR36555
Differential Revision: https://reviews.llvm.org/D44259
llvm-svn: 327823
This change broke ARM code that expects to be able to add
_GLOBAL_OFFSET_TABLE_ to the result of an R_ARM_REL32.
I will provide a reproducer on llvm-commits.
llvm-svn: 327688
Currently, we can end up with NBuckets==0 and android loader
does not like it (PR36537).
Seems we can go with a minimal amount of changes here for simplicity
and be consistent with gold and so just always use >= 1 value for NBuckets.
Differential revision: https://reviews.llvm.org/D44422
llvm-svn: 327481
the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] =
reserved value that is by convention the address of the dynamic section.
Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end
of the .got section with the intention that the .got.plt section would
follow the .got. However this does not always hold with the current
default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent
with the reserved first entry of the .got.plt.
X86, X86_64, Arm and AArch64 will use the .got.plt. Mips and Power use .got
Fixes PR36555
Differential Revision: https://reviews.llvm.org/D44259
llvm-svn: 327248
Summary:
If an executable needs text relocations, it should be marked as such so
that the loader can prepare for text relocations. We currently create a
dummy segment with DT_TEXTREL for that purpose.
Generic ABI as of 2000 [1] mentioned that "Its [DT_TEXTREL's] use
has been superseded by the DF_TEXTREL flag". However, it's actually not
superseded even after 18 years. OpenBSD and musl recognize only DT_TEXTREL.
So we still need to set both.
[1] http://www.sco.com/developers/gabi/2000-07-17/ch5.dynamic.html
Reviewers: rafael
Subscribers: emaste, llvm-commits, arichardson
Differential Revision: https://reviews.llvm.org/D43920
llvm-svn: 326503
This should resolve the issue that lld build fails in some hosts
that uses case-insensitive file system.
Differential Revision: https://reviews.llvm.org/D43788
llvm-svn: 326339
We sometimes need to iterate over input sections for a given
output section. It is not very convinent because we have to iterate
over section descriptions.
Patch introduces getInputSections helper, it simplifies things.
Differential revision: https://reviews.llvm.org/D43574
llvm-svn: 325763
Summary:
Before the name of the function sounded like it was just a getter for the
private class member Addend. However, it actually calculates the final
value for the r_addend field in Elf_Rela that is used when writing the
.rela.dyn section. I also added a comment to the UseSymVA member to
explain how it interacts with computeAddend().
Differential Revision: https://reviews.llvm.org/D43161
llvm-svn: 325485
Now that we have R_ADDEND, UseSymVA was redundant. We only want to
write the symbol virtual address when using an expression other than
R_ADDEND.
llvm-svn: 325360
Summary:
This follows up on r321889 where writing of Elf_Rel addends was partially
moved to RelocationBaseSection. This patch ensures that the addends are
always written to the output section when a input section uses RELA but the
output is REL.
Differential Revision: https://reviews.llvm.org/D42843
llvm-svn: 325328
When resolving dynamic RELA relocations the addend is taken from the
relocation and not the place being relocated. Accordingly lld does not
write the addend field to the place like it would for a REL relocation.
Unfortunately there is some system software, in particlar dynamic loaders
such as Bionic's linker64 that use the value of the place prior to
relocation to find the offset that they have been loaded at. Both gold
and bfd control this behavior with the --[no-]apply-dynamic-relocs option.
This change implements the option and defaults it to true for compatibility
with gold and bfd.
Differential Revision: https://reviews.llvm.org/D42797
llvm-svn: 324221
Compiler doesn't know the fact that Config->WordSize * 8 is always a
power of two, so it had to use the div instruction to divide some
number with C.
llvm-svn: 323014
I created https://reviews.llvm.org/D42202 to see how large the bloom
filter should be. With that patch, I tested various bloom filter sizes
with the following commands:
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug -DLLVM_ENABLE_LLD=true \
-DLLVM_ENABLE_PROJECTS='clang;lld' -DBUILD_SHARED_LIBS=ON \
-DCMAKE_SHARED_LINKER_FLAGS=-Wl,-bloom-filter-bits=<some integer> \
../llvm-project/llvm
$ rm -f $(find . -name \*.so.7.0.0svn)
$ ninja lld
$ LD_BIND_NOW=1 perf stat bin/ld.lld
Here is the result:
-bloom-filter-bits=8 0.220351609 seconds
-bloom-filter-bits=10 0.217146597 seconds
-bloom-filter-bits=12 0.206870826 seconds
-bloom-filter-bits=16 0.209456312 seconds
-bloom-filter-bits=32 0.195092075 seconds
Currently we allocate 8 bits for a symbol, but according to the above
result, that number is not optimal. Even though the numbers follow the
diminishing return rule, the point where a marginal improvement becomes
too small is not -bloom-filter-bits=8 but 12. So this patch sets it to 12.
Differential Revision: https://reviews.llvm.org/D42204
llvm-svn: 323010
Previously we checked (HeaderSize == 0) to find out if
PltSection section is IPLT or PLT. Some targets does not set
HeaderSize though. For example PPC64 has no lazy binding implemented
and does not set PltHeaderSize constant.
Because of that using of both IPLT and PLT relocations worked
incorrectly there (testcase is provided).
Patch fixes the issue.
Differential revision: https://reviews.llvm.org/D41613
llvm-svn: 322362
Summary:
As reported in bug 35788, rL316280 reintroduces a race between two
members of SectionPiece, which share the same 64 bit memory location.
To fix the race, check the hash before checking the Live member, as
suggested by Rafael.
Reviewers: ruiu, rafael
Reviewed By: ruiu
Subscribers: smeenai, emaste, llvm-commits
Differential Revision: https://reviews.llvm.org/D41884
llvm-svn: 322264
When setting up the chain, we copy over the bucket's previous symbol
index, assuming that this index will be 0 (STN_UNDEF) for an unused
bucket (marking the end of the chain). When linking with --no-rosegment,
however, unused buckets will in fact contain the padding value, and so
the hash table will end up containing invalid chains. Zero out the hash
table section explicitly to avoid this, similar to what's already done
for GNU hash sections.
Differential Revision: https://reviews.llvm.org/D41928
llvm-svn: 322259
This is "Bug 35751 - .dynamic relocation entries omitted if output
contains only IFUNC relocations"
We have InX::RelaPlt and InX::RelaIPlt synthetic sections for PLT relocations.
They are usually live in rela.plt section. Problem appears when InX::RelaPlt
section is empty. In that case we did not produce normal set of dynamic tags
required, because logic was written in the way assuming we always have
non-IRelative relocations in rela.plt.
Patch fixes the issue.
Differential revision: https://reviews.llvm.org/D41592
llvm-svn: 321600
This was raised in comments for D41592.
With current code we always assign parent
section for Rel[a] sections like
InX::RelaPlt or InX::RelaDyn, so checking
their parent for null is excessive.
llvm-svn: 321581
We add dynamic section entries both in the ctor of the class and
DynamicSection::finalizeContents(). Some entries need to be added early
in the ctor because they add strings to .dynstr. Other entries were
intended to be added in finalizeContents(). However, some entries are
added in the ctor even though they don't add strings. This patch
fix the issue.
llvm-svn: 320851
We might crash in 'ARMExidxSentinelSection::writeTo()' because it expected
the sentinel entry to be put in the same 'InputSectionDescription' as
the last real entry. This assumption fails if the last output section command
for .ARM.exidx is anything but an input section description, because in this
case 'OutputSection::addSection()' creates a new 'InputSectionDescription'.
Differential Revision: https://reviews.llvm.org/D41105
llvm-svn: 320668
By using an index instead of a pointer for verdef we can put the index
next to the alignment field. This uses the otherwise wasted area and
reduces the shared symbol size.
By itself the performance change of this is in the noise, but I have a
followup patch to remove another 8 bytes that improves performance
when combined with this.
llvm-svn: 320449
We fill executable sections with trap instructions (0xcc or equivalent).
If a .gnu.hash section was put into an executable segment, we created
corrupted .gnu.hash section. This patch fixes the issue.
llvm-svn: 319863
This change actually makes the linker slightly faster. My observation
is that, with this patch, link time of clang without debug is about 1%
faster.
Differential Revision: https://reviews.llvm.org/D40697
llvm-svn: 319600
This is for "Bug 35474 - --emit-relocs produces wrongly-named reloc sections".
LLD currently for scripts like:
.text.boot : { *(.text.boot) }
emits relocation section with name .rela.text because does not take
redefined name of output section into account and builds section name
using rules for non-scripted case. Patch fixes this oddness.
Differential revision: https://reviews.llvm.org/D40652
llvm-svn: 319526
The MIPS GOT section has a number of local entries based on the number of pages
needed for output sections referenced by GOT page relocations. The number is
recorded in the DT_MIPS_LOCAL_GOTNO dynamic section tag. However, the dynamic tag
is added before assignAddresses has been called, meaning that any section size used
to calculate the value will not include size modifications caused by, for example,
linker scripts and thunks.
This change moves the calculation of DT_MIPS_LOCAL_GOTNO until writeTo, by which
time the output section sizes have been finalized.
Reviewers: ruiu, rafael
Differential Revision: https://reviews.llvm.org/D39493
llvm-svn: 318828
No difference in practice other than having sh_entsize in the output.
This should simplify the patch for handling SHF_MERGE in -r.
Based on a patch by George Rimar.
llvm-svn: 318306
microMIPS symbols including microMIPS PLT records created for regular
symbols needs to be marked by STO_MIPS_MICROMIPS flag in a symbol table.
Additionally microMIPS entries in a dynamic symbol table should have
configured less-significant bit. That allows to escape teaching a
dynamic linker about microMIPS symbols.
llvm-svn: 318097
We have a lot of "if (MIPS)" conditions in lld because the MIPS' ABI
is different at various places than other arch's ABIs at where it
don't have to be different, but we at least want to reduce MIPS-ness
from the regular classes.
llvm-svn: 317525
Now that DefinedRegular is the only remaining derived class of
Defined, we can merge the two classes.
Differential Revision: https://reviews.llvm.org/D39667
llvm-svn: 317448
Common symbols are now represented with a DefinedRegular that points
to a BssSection, even during symbol resolution.
Differential Revision: https://reviews.llvm.org/D39666
llvm-svn: 317447
Now that we have only SymbolBody as the symbol class. So, "SymbolBody"
is a bit strange name now. This is a mechanical change generated by
perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)
nd clang-format-diff.
Differential Revision: https://reviews.llvm.org/D39459
llvm-svn: 317370
SymbolBody and Symbol were separated classes due to a historical reason.
Symbol used to be a pointer to a SymbolBody, and the relationship
between Symbol and SymbolBody was n:1.
r2681780 changed that. Since that patch, SymbolBody and Symbol are
allocated next to each other to improve memory locality, and they have
1:1 relationship now. So, the separation of Symbol and SymbolBody no
longer makes sense.
This patch merges them into one class. In order to avoid updating too
many places, I chose SymbolBody as a unified name. I'll rename it Symbol
in a follow-up patch.
Differential Revision: https://reviews.llvm.org/D39406
llvm-svn: 317006
DSO is short for dynamic shared object, so the function name was a
little confusing because it sounded like it didn't work when we were
a creating statically-linked executable or something.
What we mean by "DSO" here is the current output file that we are
creating. Thus the new name. Alternatively, we could call it the current
ELF module, but "module" is a overloaded word, so I avoided that.
llvm-svn: 316809
The Android relocation packing format is a more compact
format for dynamic relocations in executables and DSOs
that is based on delta encoding and SLEBs. An overview
of the format can be found in the Android source code:
https://android.googlesource.com/platform/bionic/+/refs/heads/master/tools/relocation_packer/src/delta_encoder.h
This patch implements relocation packing using that format.
This implementation uses a more intelligent algorithm for compressing
relative relocations than Android's own relocation packer. As a
result it can generally create smaller relocation sections than
that packer. If I link Chromium for Android targeting ARM32 I get a
.rel.dyn of size 174693 bytes, as compared to 371832 bytes with gold
and the Android packer.
Differential Revision: https://reviews.llvm.org/D39152
llvm-svn: 316775
When an OutputSection is larger than the branch range for a Target we
need to place thunks such that they are always in range of their caller,
and sufficiently spaced to maximise the number of callers that can use
the thunk. We use the simple heuristic of placing the
ThunkSection at intervals corresponding to a target specific branch range.
If the OutputSection is small we put the thunks at the end of the executable
sections.
Differential Revision: https://reviews.llvm.org/D34689
llvm-svn: 316751
Previously, EhFrameSection pushes data to EhFrameHdr by calling addFde
function. By making EhFrameHdr pull data from EhFrameSection, we can
eliminate a member from EhFrameHdr.
llvm-svn: 316730
When LLD do ICF for 2 identical sections it leaves 2 duplicate entries in .eh_frame
pointing to the same address. After that it fixes .eh_frame_header's header,
so that it says it contains single FDE, though section itself contains 2
(it contains garbage data at tail).
As a result excessive entries in .eh_frame and excessive dummy data in .eh_frame_header
emited to output. Patch fixes that. This is PR34518.
Differential revision: https://reviews.llvm.org/D38998
llvm-svn: 316648
Summary:
The COFF linker and the ELF linker have long had similar but separate
Error.h and Error.cpp files to implement error handling. This change
introduces new error handling code in Common/ErrorHandler.h, changes the
COFF and ELF linkers to use it, and removes the old, separate
implementations.
Reviewers: ruiu
Reviewed By: ruiu
Subscribers: smeenai, jyknight, emaste, sdardis, nemanjai, nhaehnle, mgorny, javed.absar, kbarton, fedor.sergeev, llvm-commits
Differential Revision: https://reviews.llvm.org/D39259
llvm-svn: 316624
By assuming that mergeable input sections are smaller than 4 GiB,
lld's memory usage when linking clang with debug info drops from
2.788 GiB to 2.019 GiB (measured by valgrind, and that does not include
memory space for mmap'ed files). I think that's a reasonable assumption
given such a large RAM savings, so this patch.
According to valgrind, gold needs 3.54 GiB of RAM to do the same thing.
NB: This patch does not introduce a limitation on the size of
output sections. You can still create sections larger than 4 GiB.
llvm-svn: 316280