The --fix-cortex-a8 option implements a linker workaround for the
coretex-a8 erratum 657417. A summary of the erratum conditions is:
- A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two
4KiB regions.
- The destination of the branch is to the first 4KiB region.
- The instruction before the branch is a 32-bit Thumb-2 non-branch
instruction.
The linker fix is to redirect the branch to a patch not in the first
4KiB region. The patch forwards the branch on to its target.
The cortex-a8, is an old CPU, with the first implementation of this
workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in
early Android Phones and there are some critical applications that still
need to run on a cortex-a8 that have the erratum. The patch is applied
roughly 10 times on LLD and 20 on Clang when they are built with
--fix-cortex-a8 on an Arm system.
The formal erratum description is avaliable in the ARM Core Cortex-A8
(AT400/AT401) Errata Notice document. This is available from Arm on
request but it seems to be findable via a web search.
Differential Revision: https://reviews.llvm.org/D67284
llvm-svn: 371965
If there is no readonly section, we map:
* The ELF header at imageBase+maxPageSize
* Program headers at imageBase+maxPageSize+sizeof(Ehdr)
* The first section .text at imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)
Due to the interaction between Writer<ELFT>::fixSectionAlignments and
LinkerScript::allocateHeaders,
`alignDown(p_vaddr(R PT_LOAD)) = alignDown(p_vaddr(RX PT_LOAD))`.
The RX PT_LOAD will override the R PT_LOAD at runtime, which is not ideal:
```
// PHDR at 0x401034, should be 0x400034
PHDR 0x000034 0x00401034 0x00401034 0x000a0 0x000a0 R 0x4
// R PT_LOAD contains just Ehdr and program headers.
// At 0x401000, should be 0x400000
LOAD 0x000000 0x00401000 0x00401000 0x000d4 0x000d4 R 0x1000
LOAD 0x0000d4 0x004010d4 0x004010d4 0x00001 0x00001 R E 0x1000
```
* createPhdrs allocates the headers to the R PT_LOAD.
* fixSectionAlignments assigns `imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)` (formula: `alignTo(dot, maxPageSize) + dot % config->maxPageSize`) to addrExpr of .text
* allocateHeaders computes the minimum address among SHF_ALLOC sections, i.e. addr(.text)
* allocateHeaders sets address of ELF header to `addr(.text)-sizeof(Ehdr)-sizeof(program headers) = imageBase+maxPageSize`
The main observation is that when the SECTIONS command is not used, we
don't have to call allocateHeaders. This requires an assumption that
the presence of PT_PHDR and addresses of headers can be decided
regardless of address information.
This may seem natural because dot is not manipulated by a linker script.
The other thing is that we have to drop the special rule for -T<section>
in `getInitialDot`. If -Ttext is smaller than the image base, the headers
will not be allocated with the old behavior (allocateHeaders is called)
but always allocated with the new behavior.
The behavior change is not a problem. Whether and where headers are
allocated can vary among linkers, or ld.bfd across different versions
(--enable-separate-code or not). It is thus advised to use a linker
script with the PHDRS command to have a consistent behavior across
linkers. If PT_PHDR is needed, an explicit --image-base can be a simpler
alternative.
Differential Revision: https://reviews.llvm.org/D67325
llvm-svn: 371957
```
part.phdrs = script->hasPhdrsCommands() ? script->createPhdrs() : createPhdrs(part);
```
createPhdrs() allocates a PT_PHDR and a PF_R PT_LOAD, which will be
deleted later in LinkerScript::allocateHeaders, but leave a gap between
the program headers and the first section. Don't allocate the segments
to avoid the gap. PT_INTERP is likely not needed as well.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D67324
llvm-svn: 371398
Recommit r370635 (reverted by r371202), with one change: move addOrphanSections() before ICF.
Before, orphan sections in two different partitions may be folded and
moved to the main partition.
Now, InputSection->OutputSection assignment for orphans happens before
ICF. ICF does not fold input sections with different output sections.
With the PR43241 reproduce,
`llvm-objcopy --extract-partition libvr.so libchrome__combined.so libvr.so` => no error
Updated description:
Fixes PR39418. Complements D47241 (the non-linker-script case).
processSectionCommands() assigns input sections to output sections.
ICF is called before it, so .text.foo and .text.bar may be folded even if
their output sections are made different by SECTIONS commands.
```
markLive<ELFT>()
doIcf<ELFT>() // During ICF, we don't know the output sections
writeResult()
combineEhSections<ELFT>()
script->processSectionCommands() // InputSection -> OutputSection assignment
```
This patch splits processSectionCommands() into processSectionCommands()
and processSymbolAssignments(), and moves
processSectionCommands()/addOrphanSections() before ICF:
```
markLive<ELFT>()
combineEhSections<ELFT>()
script->processSectionCommands()
script->addOrphanSections();
doIcf<ELFT>() // should remove folded input sections
writeResult()
script->processSymbolAssignments()
```
An alternative approach is to unfold a section `sec` in
processSectionCommands() when we find `sec` and `sec->repl` belong to
different output sections. I feel this patch is superior because this
can fold more sections and the decouple of
SectionCommand/SymbolAssignment gives flexibility:
* An ExprValue can't be evaluated before its section is assigned to an
output section -> we can delete getOutputSectionVA and simplify
another place where we had to check if the output section is null.
Moreover, a case in linkerscript/early-assign-symbol.s can be handled
now.
* processSectionCommands/processSymbolAssignments can be freely moved
around.
llvm-svn: 371216
```
Writer<ELFT>::run
assignFileOffsets
setFileOffset
computeFileOffset
os->ptLoad->p_align may be smaller than config->maxPageSize
setPhdrs
p_align = max(p_align, config->maxPageSize)
```
If we move the config->maxPageSize logic to the constructor of
PhdrEntry, computeFileOffset can be simplified.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D67211
llvm-svn: 371085
Previously, segments were aligned according to their first section's
alignment requirements. That was not correct, but segments are also
aligned to a page boundary, and a page boundary is usually much larger
than a section alignment requirement, so no one noticed this bug before.
Now, lld has --nmagic option which sets maxPageSize to 1 to effectively
disable page alignment, which reveals the issue.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43212
Differential Revision: https://reviews.llvm.org/D67152
llvm-svn: 371013
Fixes PR39418. Complements D47241 (the non-linker-script case).
processSectionCommands() assigns input sections to output sections.
ICF is called before it, so .text.foo and .text.bar may be folded even if
their output sections are made different by SECTIONS commands.
```
markLive<ELFT>()
doIcf<ELFT>() // During ICF, we don't know the output sections
writeResult()
combineEhSections<ELFT>()
script->processSectionCommands() // InputSection -> OutputSection assignment
```
This patch splits processSectionCommands() into processSectionCommands() and
processSymbolAssignments(), and moves processSectionCommands() before ICF:
```
markLive<ELFT>()
combineEhSections<ELFT>()
script->processSectionCommands()
doIcf<ELFT>() // should remove folded input sections
writeResult()
script->processSymbolAssignments()
```
An alternative approach is to unfold a section `sec` in
processSectionCommands() when we find `sec` and `sec->repl` belong to
different output sections. I feel this patch is superior because this
can fold more sections and the decouple of
SectionCommand/SymbolAssignment gives flexibility:
* An ExprValue can't be evaluated before its section is assigned to an
output section -> we can delete getOutputSectionVA and simplify
another place where we had to check if the output section is null.
Moreover, a case in linkerscript/early-assign-symbol.s can be handled
now.
* processSectionCommands/processSymbolAssignments can be freely moved
around.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D66717
llvm-svn: 370635
Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=998712
SHT_LLVM_PART_EHDR marks the start of a partition. The partition
sections will be extracted to a separate file. Align to the next maximum
page size boundary so that we can find the ELF header at the start. We
cannot benefit from overlapping p_offset ranges with the previous
segment anyway.
It seems we lack some llvm-objcopy --extract-main-partition and
--extract-partition sanity checks. It may place EHDR at the start
even if p_offset if non zero. Anyway, the lld change is justified for
the reasons above.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D67032
llvm-svn: 370629
Port the D64906 technique to RISC-V. It deletes 3 alignments at
PT_LOAD boundaries for the default case: the size of a RISC-V binary
decreases by at most 12kb.
llvm-svn: 370192
This essentially reverts the code change of D63132 and switches to a simpler approach.
In an executable/shared object, st_shndx of a symbol can be:
1) SHN_UNDEF: undefined symbol (or canonical PLT)
2) SHN_ABS: absolute symbol
3) any other value (usually a regular section index) represents a relative symbol.
The actual value does not matter.
Many ld.so (musl, all archs except MIPS of FreeBSD rtld-elf) even treat 2) and 3)
the same. If .sdata does not exist, it does not matter what value/section
__global_pointer$ has, as long as it is relative (otherwise there will be a pedantic
lld error. See D63132). Just set the st_shndx arbitrarily to 1.
Dummy st_shndx=1 may be used by __rela_iplt_start, linker-script-defined symbols outside a section, __dso_handle, etc.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D66798
llvm-svn: 370172
Port the D64906 technique to ARM. It deletes 3 alignments at
PT_LOAD boundaries for the default case: the size of an arm binary
decreases by at most 12kb.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D66749
llvm-svn: 370049
EhFrameSection::addSection checks liveness of FDE early. This makes it
infeasible to move combineEhSections() before ICF.
Postpone the check to EhFrameSection::finalizeContents(). This is what
ARMExidxSyntheticSection does and it will make a subsequent patch D66717
simpler.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D66727
llvm-svn: 369890
PR42990. For `SECTIONS { b = a; . = 0xff00 + (a >> 8); a = .; }`,
we currently set st_value(a)=0xff00 while st_value(b)=0xffff.
The following call tree demonstrates the problem:
```
link<ELF64LE>(Args);
Script->declareSymbols(); // insert a and b as absolute Defined
Writer<ELFT>().run();
Script->processSectionCommands();
addSymbol(cmd); // a and b are re-inserted. LinkerScript::getSymbolValue
// is lazily called by subsequent evaluation
finalizeSections();
forEachRelSec(scanRelocations<ELFT>);
processRelocAux // another problem PR42506, not affected by this patch
finalizeAddressDependentContent(); // loop executed once
script->assignAddresses(); // a = 0, b = 0xff00
script->assignAddresses(); // a = 0xff00, _end = 0xffff
```
We need another assignAddresses() to finalize the value of `a`.
This patch
1) modifies assignAddress() to track the original section/value of each
symbol and return a symbol whose section/value has changed.
2) moves the post-finalizeSections assignAddress() inside the loop
of finalizeAddressDependentContent() and makes it iterative.
Symbol assignment may not converge so we make a few attempts before
bailing out.
Note, assignAddresses() must be called at least twice. The penultimate
call finalized section addresses while the last finalized symbol values.
It is somewhat obscure and there was no comment.
linkerscript/addr-zero.test tests this.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D66279
llvm-svn: 369889
Reported at https://reviews.llvm.org/D64930#1642223
If the only section of a PT_LOAD is a SHT_NOBITS section (e.g. .bss), we
may not align its sh_offset. p_offset of the PT_LOAD will be set to
sh_offset, and we will get p_offset!=p_vaddr (mod p_align). If such
executable is mapped by the Linux kernel, it will segfault.
After D64906, this may happen the non-linker script case.
The linker script case has had this issue for a long time.
This was fixed by rL321657 (but the test linkerscript/nobits-offset.s
failed to test a SHT_NOBITS section), but broken by rL345154.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D66658
llvm-svn: 369828
Ported the D64906 technique to EM_386.
If `sh_addralign(.tdata) < sh_addralign(.tbss)`,
we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`.
ld.so that are known to have problems if p_vaddr%p_align!=0:
* FreeBSD 13.0-CURRENT rtld-elf
* glibc https://sourceware.org/bugzilla/show_bug.cgi?id=24606
New test i386-tls-vaddr-align.s checks our workaround makes p_vaddr%p_align = 0.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D65865
llvm-svn: 369347
Ported the D64906 technique to AArch64. It deletes 3 alignments at
PT_LOAD boundaries for the default case: the size of an aarch64 binary
decreases by at most 192kb.
If `sh_addralign(.tdata) < sh_addralign(.tbss)`,
we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`.
ld.so that are known to have problems if p_vaddr%p_align!=0:
* musl<=1.1.22
* FreeBSD 13.0-CURRENT (and before) rtld-elf arm64
New test aarch64-tls-vaddr-align.s checks that our workaround makes p_vaddr%p_align = 0.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D64930
llvm-svn: 369344
This change affects the non-linker script case (precisely, when the
`SECTIONS` command is not used). It deletes 3 alignments at PT_LOAD
boundaries for the default case: the size of a powerpc64 binary can be
decreased by at most 192kb. The technique can be ported to other
targets.
Let me demonstrate the idea with a maxPageSize=65536 example:
When assigning the address to the first output section of a new PT_LOAD,
if the end p_vaddr of the previous PT_LOAD is 0x10020, we advance to
the next multiple of maxPageSize: 0x20000. The new PT_LOAD will thus
have p_vaddr=0x20000. Because p_offset and p_vaddr are congruent modulo
maxPageSize, p_offset will be 0x20000, leaving a p_offset gap [0x10020,
0x20000) in the output.
Alternatively, if we advance to 0x20020, the new PT_LOAD will have
p_vaddr=0x20020. We can pick either 0x10020 or 0x20020 for p_offset!
Obviously 0x10020 is the choice because it leaves no gap. At runtime,
p_vaddr will be rounded down by pagesize (65536 if
pagesize=maxPageSize). This PT_LOAD will load additional initial
contents from p_offset ranges [0x10000,0x10020), which will also be
loaded by the previous PT_LOAD. This is fine if -z noseparate-code is in
effect or if we are not transiting between executable and non-executable
segments.
ld.bfd -z noseparate-code leverages this technique to keep output small.
This patch implements the technique in lld, which is mostly effective on
targets with large defaultMaxPageSize (AArch64/MIPS/PPC: 65536). The 3
removed alignments can save almost 3*65536 bytes.
Two places that rely on p_vaddr%pagesize = 0 have to be updated.
1) We used to round p_memsz(PT_GNU_RELRO) up to commonPageSize (defaults
to 4096 on all targets). Now p_vaddr%commonPageSize may be non-zero.
The updated formula takes account of that factor.
2) Our TP offsets formulae are only correct if p_vaddr%p_align = 0.
Fix them. See the updated comments in InputSection.cpp for details.
On targets that we enable the technique (only PPC64 now),
we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`
if `sh_addralign(.tdata) < sh_addralign(.tbss)`
This exposes many problems in ld.so implementations, especially the
offsets of dynamic TLS blocks. Known issues:
FreeBSD 13.0-CURRENT rtld-elf (i386/amd64/powerpc/arm64)
glibc (HEAD) i386 and x86_64 https://sourceware.org/bugzilla/show_bug.cgi?id=24606
musl<=1.1.22 on TLS Variant I architectures (aarch64/powerpc64/...)
So, force p_vaddr%p_align = 0 by rounding dot up to p_align(PT_TLS).
The technique will be enabled (with updated tests) for other targets in
subsequent patches.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D64906
llvm-svn: 369343
Currently the following 3 relocation types do not trigger the creation
of a canonical PLT (which changes STT_GNU_IFUNC to STT_FUNC and
redirects all references):
1) GOT-generating (`needsGot`)
2) PLT-generating (`needsPlt`)
3) R_ABS with 0 addend in a writable location. This is used for
for ifunc function pointers in writable sections such as .data and .toc.
This patch deletes case 3) to simplify the R_*_IRELATIVE generating
logic added in D57371. Other advantages:
* It is guaranteed no more than 1 R_*_IRELATIVE is created for an ifunc.
* PPC64: no need to special case ifunc in toc-indirect to toc-relative relaxation. See D65755
The deleted elf::addIRelativeRelocs demonstrates that one-pass scan
through relocations makes several optimizations difficult. This is
something we can think about in the future.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D65995
llvm-svn: 368661
In Writer::includeInDynSym(), exportDynamic is used by a Defined with
protected or default visibility, to record whether it is required to be
exported into .dynsym. It is set when any of the following conditions
hold:
1) There is an interposable symbol from a DSO (Undefined or SharedSymbol with default visibility)
2) If -shared or --export-dynamic is specified, any symbol in an object file/bitcode sets this property, unless suppressed by canBeOmittedFromSymbolTable().
3) --dynamic-list when producing an executable
4) protected symbol from a DSO preempted by copy relocation/canonical PLT when
--ignore-{data,function}-address-equality is specified
5) ifunc is exported when -z ifunc-noplt is specified
Bullet points 4) and 5) are irrelevant in this patch.
Bullet 3) does not play well with 1) and 2). When -shared is specified,
exportDynamic of most symbols is true. This makes it incapable to record
--dynamic-list marked symbols. We thus have obscure:
if (!config->shared)
b->exportDynamic = true;
else if (b->includeInDynsym())
b->isPreemptible = true;
This patch adds another bit `Symbol::inDynamicList` to record
3). We can thus simplify handleDynamicList() by unifying the DSO and
executable cases. It also allows us to simplify isPreemptible - now
the field is only used in finalizeSections() and later stages.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D66091
llvm-svn: 368659
We prioritize non-* wildcards overs VER_NDX_LOCAL/VER_NDX_GLOBAL "*".
This patch generalizes the rule to "*" of other versions and thus fixes PR40176.
I don't feel strongly about this GNU linkers' behavior but the
generalization simplifies code.
Delete `config->defaultSymbolVersion` which was used to special case
VER_NDX_LOCAL/VER_NDX_GLOBAL "*".
In `SymbolTable::scanVersionScript`, custom versions are handled the same
way as VER_NDX_LOCAL/VER_NDX_GLOBAL. So merge
`config->versionScript{Locals,Globals}` into `config->versionDefinitions`.
Overall this seems to simplify the code.
In `SymbolTable::assign{Exact,Wildcard}Versions`,
`sym->verdefIndex == config->defaultSymbolVersion` is changed to
`verdefIndex == UINT32_C(-1)`.
This allows us to give duplicate assignment diagnostics for
`{ global: foo; };` `V1 { global: foo; };`
In test/linkerscript/version-script.s:
vs_index of an undefined symbol changes from 0 to 1. This doesn't matter (arguably 1 is better because the binding is STB_GLOBAL) because vs_index of an undefined symbol is ignored.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D65716
llvm-svn: 367869
An R_*_IRELATIVE represents the address of a STT_GNU_IFUNC symbol
(redirected at runtime) which is non-preemptable and is not associated
with a canonical PLT (associated with a symbol with a section index of
SHN_UNDEF but a non-zero st_value).
.rel[a].plt [DT_JMPREL, DT_JMPREL+DT_JMPRELSZ) contains relocations that
can be lazily resolved. R_*_IRELATIVE are always eagerly resolved, so
conceptually they do not belong to .rela.plt. "iplt" is mostly a misnomer.
glibc powerpc and powerpc64 do not resolve R_*_IRELATIVE if they are in .rela.plt.
// a.o - synthesized PLT call stub has an R_*_IRELATIVE
void ifunc(); int main() { ifunc(); }
// b.o
static void real() {}
asm (".type ifunc, %gnu_indirect_function");
void *ifunc() { return ℜ }
The lld-linked executable crashes. ld.bfd places R_*_IRELATIVE in
.rela.dyn and the executable works.
glibc i386, x86_64, and aarch64 have logic
(glibc/sysdeps/*/dl-machine.h:elf_machine_lazy_rel) to eagerly resolve
R_*_IRELATIVE in .rel[a].plt so the lld-linked executable works.
Move R_*_IRELATIVE from .rel[a].plt to .rel[a].dyn to fix the crashes on
glibc powerpc/powerpc64. This also helps simplifying ifunc
implementation in FreeBSD rtld-elf powerpc64.
If --pack-dyn-relocs=android[+relr] is specified, the Android packed
dynamic relocation format is used for .rela.dyn. We cannot name
in.relaIplt ".rela.dyn" because the output section will have mixed
formats. This can be improved in the future.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D65651
llvm-svn: 367745
This patch
1) adds -z separate-code and -z noseparate-code (default).
2) changes the condition that the last page of last PF_X PT_LOAD is
padded with trap instructions.
Current condition (after D33630): if there is no `SECTIONS` commands.
After this change: if -z separate-code is specified.
-z separate-code was introduced to ld.bfd in 2018, to place the text
segment in its own pages. There is no overlap in pages between an
executable segment and a non-executable segment:
1) RX cannot load initial contents from R or RW(or non-SHF_ALLOC).
2) R and RW(or non-SHF_ALLOC) cannot load initial contents from RX.
lld's current status:
- Between R and RX: in `Writer<ELFT>::fixSectionAlignments()`, the start of a
segment is always aligned to maxPageSize, so the initial contents loaded by R
and RX do not overlap. I plan to allow overlaps in D64906 if -z noseparate-code
is in effect.
- Between RX and RW(or non-SHF_ALLOC if RW doesn't exist):
we currently unconditionally pad the last page to commonPageSize
(defaults to 4096 on all targets we support).
This patch will make it effective only if -z separate-code is specified.
-z separate-code is a dubious feature that intends to reduce the number
of ROP gadgets (which is actually ineffective because attackers can find
plenty of gadgets in the text segment, no need to find gadgets in
non-code regions).
With the overlapping PT_LOAD technique D64906, -z noseparate-code
removes two more alignments at segment boundaries than -z separate-code.
This saves at most defaultCommonPageSize*2 bytes, which are significant
on targets with large defaultCommonPageSize (AArch64/MIPS/PPC: 65536).
Issues/feedback on alignment at segment boundaries to help understand
the implication:
* binutils PR24490 (the situation on ld.bfd is worse because they have
two R-- on both sides of R-E so more alignments.)
* In binutils, the 2018-02-27 commit "ld: Add --enable-separate-code" made -z separate-code the default on Linux.
d969dea983
In musl-cross-make, binutils is configured with --disable-separate-code
to address size regressions caused by -z separate-code. (lld actually has the same
issue, which I plan to fix in a future patch. The ld.bfd x86 status is
worse because they default to max-page-size=0x200000).
* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237676 people want
smaller code size. This patch will remove one alignment boundary.
* Stef O'Rear: I'm opposed to any kind of page alignment at the
text/rodata line (having a partial page of text aliased as rodata and
vice versa has no demonstrable harm, and I actually care about small
systems).
So, make -z noseparate-code the default.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D64903
llvm-svn: 367537
Summary:
After D58892 split the RW PT_LOAD on the PT_GNU_RELRO boundary, the new
layout is:
PT_LOAD(PT_GNU_RELRO(.data.rel.ro .bss.rel.ro)) PT_LOAD(.data. .bss)
The two pageAlign() calls at PT_GNU_RELRO boundaries are redundant due
to the existence of PT_LOAD.
Reviewers: grimar, peter.smith, ruiu, espindola
Reviewed By: ruiu
Subscribers: sfertile, atanasyan, emaste, arichardson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64854
llvm-svn: 366307
This patch does the same thing as r365595 to other subdirectories,
which completes the naming style change for the entire lld directory.
With this, the naming style conversion is complete for lld.
Differential Revision: https://reviews.llvm.org/D64473
llvm-svn: 365730
This patch is mechanically generated by clang-llvm-rename tool that I wrote
using Clang Refactoring Engine just for creating this patch. You can see the
source code of the tool at https://reviews.llvm.org/D64123. There's no manual
post-processing; you can generate the same patch by re-running the tool against
lld's code base.
Here is the main discussion thread to change the LLVM coding style:
https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html
In the discussion thread, I proposed we use lld as a testbed for variable
naming scheme change, and this patch does that.
I chose to rename variables so that they are in camelCase, just because that
is a minimal change to make variables to start with a lowercase letter.
Note to downstream patch maintainers: if you are maintaining a downstream lld
repo, just rebasing ahead of this commit would cause massive merge conflicts
because this patch essentially changes every line in the lld subdirectory. But
there's a remedy.
clang-llvm-rename tool is a batch tool, so you can rename variables in your
downstream repo with the tool. Given that, here is how to rebase your repo to
a commit after the mass renaming:
1. rebase to the commit just before the mass variable renaming,
2. apply the tool to your downstream repo to mass-rename variables locally, and
3. rebase again to the head.
Most changes made by the tool should be identical for a downstream repo and
for the head, so at the step 3, almost all changes should be merged and
disappear. I'd expect that there would be some lines that you need to merge by
hand, but that shouldn't be too many.
Differential Revision: https://reviews.llvm.org/D64121
llvm-svn: 365595
If .sdata is absent, linker synthesized __global_pointer$ gets a section index of SHN_ABS.
(ld.bfd has a similar issue: binutils PR24678)
Scrt1.o may use `lla gp, __global_pointer$` to reference the symbol PC
relatively. In -pie/-shared mode, lld complains if a PC relative
relocation references an absolute symbol (SHN_ABS) but ld.bfd doesn't:
ld.lld: error: relocation R_RISCV_PCREL_HI20 cannot refer to lute symbol: __global_pointer$
Let the reference of __global_pointer$ to force creation of .sdata to
fix the problem. This is similar to _GLOBAL_OFFSET_TABLE_, which forces
creation of .got or .got.plt .
Also, change the visibility from STV_HIDDEN to STV_DEFAULT and don't
define the symbol for -shared. This matches ld.bfd, though I don't
understand why it uses STV_DEFAULT.
Reviewed By: ruiu, jrtc27
Differential Revision: https://reviews.llvm.org/D63132
llvm-svn: 363351
Otherwise the getPartition() accessor may return an OOB pointer. Found
using _GLIBCXX_DEBUG.
The error is benign (we never dereference the pointer for the end marker)
so this wasn't caught by e.g. the sanitizer bots.
llvm-svn: 363026
We create several types of synthetic sections for loadable partitions, including:
- The dynamic symbol table. This allows code outside of the loadable partitions
to find entry points with dlsym.
- Creating a dynamic symbol table also requires the creation of several other
synthetic sections for the partition, such as the dynamic table and hash table
sections.
- The partition's ELF header is represented as a synthetic section in the
combined output file, and will be used by llvm-objcopy to extract partitions.
Differential Revision: https://reviews.llvm.org/D62350
llvm-svn: 362819
Many -static/-no-pie/-shared/-pie applications linked against glibc or musl
should work with this patch. This also helps FreeBSD PowerPC64 to migrate
their lib32 (PR40888).
* Fix default image base and max page size.
* Support new-style Secure PLT (see below). Old-style BSS PLT is not
implemented, so it is not suitable for FreeBSD rtld now because it doesn't
support Secure PLT yet.
* Support more initial relocation types:
R_PPC_ADDR32, R_PPC_REL16*, R_PPC_LOCAL24PC, R_PPC_PLTREL24, and R_PPC_GOT16.
The addend of R_PPC_PLTREL24 is special: it decides the call stub PLT type
but it should be ignored for the computation of target symbol VA.
* Support GNU ifunc
* Support .glink used for lazy PLT resolution in glibc
* Add a new thunk type: PPC32PltCallStub that is similar to PPC64PltCallStub.
It is used by R_PPC_REL24 and R_PPC_PLTREL24.
A PLT stub used in -fPIE/-fPIC usually loads an address relative to
.got2+0x8000 (-fpie/-fpic code uses _GLOBAL_OFFSET_TABLE_ relative
addresses).
Two .got2 sections in two object files have different addresses, thus a PLT stub
can't be shared by two object files. To handle this incompatibility,
change the parameters of Thunk::isCompatibleWith to
`const InputSection &, const Relocation &`.
PowerPC psABI specified an old-style .plt (BSS PLT) that is both
writable and executable. Linkers don't make separate RW- and RWE segments,
which causes all initially writable memory (think .data) executable.
This is a big security concern so a new PLT scheme (secure PLT) was developed to
address the security issue.
TLS will be implemented in D62940.
glibc older than ~2012 requires .rela.dyn to include .rela.plt, it can
not handle the DT_RELA+DT_RELASZ == DT_JMPREL case correctly. A hack
(not included in this patch) in LinkerScript.cpp addOrphanSections() to
work around the issue:
if (Config->EMachine == EM_PPC) {
// Older glibc assumes .rela.dyn includes .rela.plt
Add(In.RelaDyn);
if (In.RelaPlt->isLive() && !In.RelaPlt->Parent)
In.RelaDyn->getParent()->addSection(In.RelaPlt);
}
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D62464
llvm-svn: 362721
We currently (ab)use the Live bit on output sections to track whether
the section has ever had an input section added to it, and then later
use it during orphan placement. This will conflict with one of my upcoming
partition-related changes that will assign all output sections to a partition
(thus marking them as live) so that they can be added to the correct segment
by the code that creates program headers.
Instead of using the Live bit for this purpose, create a new flag and
start using it to track the property explicitly.
Differential Revision: https://reviews.llvm.org/D62348
llvm-svn: 362444
(1) {gcc,clang} -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created
(2) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp not created
(3) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c a.so => .interp created
The inconsistency of (2) is due to the condition `!Config->SharedFiles.empty()`.
To make lld behave more like ld.bfd, we could change the condition to:
Config->HasDynSymTab && !Config->DynamicLinker.empty() && Script->needsInterpSection();
However, that would bring another inconsistency as can be observed with:
(4) {gcc,clang} -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created
So instead, use `!Config->DynamicLinker.empty() && Script->needsInterpSection()`,
which is both simple and consistent in these cases.
The inconsistency of (4) likely originated from ld.bfd and gold's choice to have a default --dynamic-linker.
Their condition to create .interp is ANDed with (not -shared).
Since lld doesn't have a default --dynamic-linker,
compiler drivers (gcc/clang) don't pass --dynamic-linker for -shared,
and direct ld users are not supposed to specify --dynamic-linker for -shared,
we do not need the condition !Config->Shared.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D62765
llvm-svn: 362355
For the Local Dynamic case of TLSDESC, _TLS_MODULE_BASE_ is defined as a
special TLS symbol that makes:
1) Without relaxation: it produces a dynamic TLSDESC relocation that
computes 0. Adding @dtpoff to access a TLS symbol.
2) With LD->LE relaxation: _TLS_MODULE_BASE_@tpoff = 0 (lowest address in
the TLS block). Adding @tpoff to access a TLS symbol.
For 1), this saves dynamic relocations and GOT slots as otherwise
(General Dynamic) we would create an R_X86_64_TLSDESC and reserve two
GOT slots for each symbol.
Add ElfSym::TlsModuleBase and change the signature of getTlsTpOffset()
to special case _TLS_MODULE_BASE_.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D62577
llvm-svn: 362078
This change causes us to read partition specifications from partition
specification sections and split output sections into partitions according
to their reachability from partition entry points.
This is only the first step towards a full implementation of partitions. Later
changes will add additional synthetic sections to each partition so that
they can be loaded independently.
Differential Revision: https://reviews.llvm.org/D60353
llvm-svn: 361925