This is mostly extracted from http://reviews.llvm.org/D18960.
The general idea for tlsdesc is that the two GD got entries are used
for a function pointer and its argument. The dynamic linker sets
both. In the non-dlopen case the dynamic linker sets the function to
the identity and the argument to the offset in the tls block.
All that the static linker has to do in the non-dlopen case is
relocate the code to point to the got entries and create a dynamic
relocation.
The dlopen case is more complicated, but can be implemented in another patch.
llvm-svn: 271569
Patch implements next relaxation from latest ABI:
"Convert memory operand of test and binop into immediate operand, where binop is one of adc, add, and, cmp, or,
sbb, sub, xor instructions, when position-independent code is disabled."
It is described in System V Application Binary Interface AMD64 Architecture Processor
Supplement Draft Version 0.99.8 (https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf,
B.2 "B.2 Optimize GOTPCRELX Relocations").
Differential revision: http://reviews.llvm.org/D20793
llvm-svn: 271405
D15779 introduced basic approach to support new relaxations.
This patch implements relaxations for jmp and call instructions,
described in System V Application Binary Interface AMD64 Architecture Processor
Supplement Draft Version 0.99.8 (https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf,
B.2 "B.2 Optimize GOTPCRELX Relocations")
Differential revision: http://reviews.llvm.org/D20622
llvm-svn: 270721
System V Application Binary Interface AMD64 Architecture Processor Supplement Draft Version 0.99.8
(https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf, B.2 "B.2 Optimize GOTPCRELX Relocations")
introduces possible relaxations for R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX.
That patch implements the next relaxation:
mov foo@GOTPCREL(%rip), %reg => lea foo(%rip), %reg
and also opens door for implementing all other ones.
Implementation was suggested by Rafael Ávila de Espíndola with few additions and testcases by myself.
Differential revision: http://reviews.llvm.org/D15779
llvm-svn: 270705
This makes it explicit that each R_RELAX_TLS_* is equivalent to some
other expression.
With this I think we are at a sweet spot for how much is done in
Target.cpp. I did experiment with moving *all* the value math out of it.
It has the advantage that we know the final value in target independent
code, but it gets quite verbose.
llvm-svn: 270277
This adds direct support for computing offsets from the thread pointer
for both variants. Of the architectures we support, variant 1 is used
only by aarch64 (but that doesn't seem to be documented anywhere.)
llvm-svn: 270243
Lazy binding is quite important for use case like a shared build of
llvm. Also, if someone wants to disable it, it is better done in the
compiler (disable plt generation).
The only reason to keep it is to make it easier to add a new
architecture. But it doesn't really help much as it is possible to start
with non lazy relocation and plt code but still let the generic part
create a dedicated .got.plt and .rela.plt.
llvm-svn: 269982
New names reflect purpose of corresponding GOT entries better.
Both expression types related to entries allocated in the 'local'
part of MIPS GOT. R_MIPS_GOT_LOCAL_PAGE is for entries contain 'page'
addresses. R_MIPS_GOT_LOCAL is for entries contain 'full' address.
llvm-svn: 269597
MIPS N64 ABI packs multiple relocations into the single relocation
record. In general, all up to three relocations can have arbitrary types.
In fact, Clang and GCC uses only a few combinations. For now, we support
two of them. That is allow to pass at least all LLVM test suite cases.
<any relocation> / R_MIPS_SUB / R_MIPS_HI16 | R_MIPS_LO16
<any relocation> / R_MIPS_64 / R_MIPS_NONE
The first relocation is a 'real' relocation which is calculated using
the corresponding symbol's value. The second and the third relocations
used to modify result of the first one: extend it to 64-bit, extract
high or low part etc. For details, see part 2.9 'Relocation' at
https://dmz-portal.mips.com/mw/images/8/82/007-4658-001.pdf
llvm-svn: 268876
MIPS N64 ABI packs multiple relocations into the single relocation
record. Particularly it requires to represent dynamic relative
relocation as a combination of R_MIPS_REL32 and R_MIPS_64 relocations.
llvm-svn: 268565
We were already checking for non relative relocations.
If we ever decide to add support for rw text segments this means we will
have a single spot to add the flag.
llvm-svn: 268558
These relocations introduced by MIPS N64 ABI. R_MIPS_GOT_DISP references
GOT entry with full symbol's address, R_MIPS_GOT_PAGE creates GOT entry
with address of memory page which includes symbol's address,
R_MIPS_GOT_OFST used together with R_MIPS_GOT_PAGE. This relocation
calculates offset from beginning of memory page to the symbol address.
llvm-svn: 268525
It is now used only for relocations that only set the low bits inside a
page. Everything else is handled by getRelExpr.
I will send a another review renaming and better documenting
isRelRelative.
llvm-svn: 267705
MIPS is the only target requires GOT header. We already have MIPS
specific code in the `GotSection` class, so move MIPS GOT header
generation there and delete redundant stuff like `GotHeaderEntriesNum`
field and `writeGotHeader` method.
Differential Revision: http://reviews.llvm.org/D19465
llvm-svn: 267460
These relocations are calculated as S + A - DTPREL or S + A - TPREL,
where DTPREL = TlsVA - 0x8000, TPREL = TlsVA - 0x7000. So the result
is relative to the TLS output section and is not an absolut value
The fix allows to escape creation of unnecessary dynamic relocations
in case of DSO linking.
llvm-svn: 266923
It is now redundant. Writer.cpp can reason that 2 dynamic relocations
are needed: one to find the final got entry address and one to fill the
got entry.
llvm-svn: 266876
This requires adding a few more expression types, but is already a small
simplification. Having Writer.cpp know the exact expression will also
allow further simplifications.
llvm-svn: 266604
The _gp_disp symbol designates offset between start of function and 'gp'
pointer into GOT. The following code is a typical MIPS function preamble
used to setup $gp register:
lui $gp, %hi(_gp_disp)
addi $gp, $gp, %lo(_gp_disp)
To calculate R_MIPS_HI16 / R_MIPS_LO16 relocations results we use
the following formulas:
%hi(_gp - P + A)
%lo(_gp - P + A + 4),
where _gp is a value of _gp symbol, A is addend, and P current address.
The R_MIPS_LO16 relocation references _gp_disp symbol is always the second
instruction. That is why we need four byte adjustments. The patch assigns
R_PC type for R_MIPS_LO16 relocation and adjusts its addend by 4. That fix
R_MIPS_LO16 calculation.
For details see p. 4-19 at ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf
Differential Revision: http://reviews.llvm.org/D19115
llvm-svn: 266368
With this patch we use the first scan over the relocations to remember
the information we found about them: will them be relaxed, will a plt be
used, etc.
With that the actual relocation application becomes much simpler. That
is particularly true for the interfaces in Target.h.
This unfortunately means that we now do two passes over relocations for
non SHF_ALLOC sections. I think this can be solved by factoring out the
code that scans a single relocation. It can then be used both as a scan
that record info and for a dedicated direct relocation of non SHF_ALLOC
sections.
I also think it is possible to reduce the number of enum values by
representing a target with just an OutputSection and an offset (which
can be from the start or end).
This should unblock adding features like relocation optimizations.
llvm-svn: 266158
This patch fixes dynamic relocation creation from GOT access in dynamic
objects on aarch64. Current code creates a plt relative one
(R_AARCH64_JUMP_SLOT) instead of a got relative (R_AARCH64_GLOB_DAT).
It leads the programs fails with:
$ cat t.cc
std::string test = "hello...\n";
int main ()
{
printf ("%s\n", test.c_str());
return 0;
}
$ clang++ t.cc -fpic -o t
$ ./t
hello...
Segmentation fault (core dumped)
Due the fact it will try to access the plt instead of the got for
__cxa_atexit registration for the std::string destruction. It will
lead in a bogus function address in atexit.
llvm-svn: 265784
Similar to r265462, TLS related relocations aren't marked as relative,
meaning that we end up generating R_AARCH64_RELATIVE relocations for
them. This change adds TLS relocations that I've seen on my system. With
this patch applied CloudABI's unit testing binary now passes on aarch64.
Approved by: ruiu
Differential Revision: http://reviews.llvm.org/D18816
llvm-svn: 265575
While trying to get PIE work on CloudABI for x86-64, I noticed that even
though GNU ld would generate functional binaries, LLD would not. It
turns out that we generate relocations for referencing TLS objects
inside of the text segment, which shouldn't happen.
This change extends the isRelRelative() function to list some additional
relocation types that should be treated as relative. This makes my C
library unit testing binary work on x86-64.
Approved by: ruiu
Differential Revision: http://reviews.llvm.org/D18688
Fixes bug: https://llvm.org/bugs/show_bug.cgi?id=27174
llvm-svn: 265462
Our symbol representation was redundant, and some times would get out of
sync. It had an Elf_Sym, but some fields were copied to SymbolBody.
Different parts of the code were checking the bits in SymbolBody and
others were checking Elf_Sym.
There are two general approaches to fix this:
* Copy the required information and don't store and Elf_Sym.
* Don't copy the information and always use the Elf_Smy.
The second way sounds tempting, but has a big problem: we would have to
template SymbolBody. I started doing it, but it requires templeting
*everything* and creates a bit chicken and egg problem at the driver
where we have to find ELFT before we can create an ArchiveFile for
example.
As much as possible I compared the test differences with what gold and
bfd produce to make sure they are still valid. In most cases we are just
adding hidden visibility to a local symbol, which is harmless.
In most tests this is a small speedup. The only slowdown was scylla
(1.006X). The largest speedup was clang with no --build-id, -O3 or
--gc-sections (i.e.: focus on the relocations): 1.019X.
llvm-svn: 265293
The extra fix is to note that it still requires copy relocations.
Original message:
Change how we handle R_MIPS_LO16.
Mips aligns PT_LOAD to 16 bits (0x10000). That means that the lower 16
bits are always the same, so we can, effectively, say that the
relocation is relative.
P.S.: Suggestions for a better name for the predicate are welcome :-)
llvm-svn: 265150
Some targets might require creation of thunks. For example, MIPS targets
require stubs to call PIC code from non-PIC one. The patch implements
infrastructure for thunk code creation and provides support for MIPS
LA25 stubs. Any MIPS PIC code function is invoked with its address
in register $t9. So if we have a branch instruction from non-PIC code
to the PIC one we cannot make the jump directly and need to create a small
stub to save the target function address.
See page 3-38 ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf
- In relocation scanning phase we ask target about thunk creation necessity
by calling `TagetInfo::needsThunk` method. The `InputSection` class
maintains list of Symbols requires thunk creation.
- Reassigning offsets performed for each input sections after relocation
scanning complete because position of each section might change due
thunk creation.
- The patch introduces new dedicated value for DefinedSynthetic symbols
DefinedSynthetic::SectionEnd. Synthetic symbol with that value always
points to the end of the corresponding output section. That allows to
escape updating synthetic symbols if output sections sizes changes after
relocation scanning due thunk creation.
- In the `InputSection::writeTo` method we write thunks after corresponding
input section. Each thunk is written by calling `TargetInfo::writeThunk` method.
- The patch supports the only type of thunk code for each target. For now,
it is enough.
Differential Revision: http://reviews.llvm.org/D17934
llvm-svn: 265059
We have to check the final value that is written.
I don't think this has any real word implications (unless something
supports unaligned instructions), but unblocks simplifying the handling
of PC relative relocations.
llvm-svn: 265009
If we make R_MIPS_LO16 a relative relocation, linker:
- never creates R_MIPS_COPY relocation for it
- attempts to create R_MIPS_REL32 dynamic relocation if R_MIPS_LO16's
target is a preemptible symbol
Differential Revision: http://reviews.llvm.org/D18607
llvm-svn: 264956
This simplifies a few things
* Read the value as early as possible, instead of passing a pointer to
the location.
* Print the warning for missing pair close to where we find out it is
missing.
* Don't pass the value to relocateOne.
llvm-svn: 264802
Mips aligns PT_LOAD to 16 bits (0x10000). That means that the lower 16
bits are always the same, so we can, effectively, say that the
relocation is relative.
llvm-svn: 264761
When R_X86_64_PC32/R_X86_64_32 relocations are
used against preemptible symbol and output is position independent,
error should be generated.
Differential revision: http://reviews.llvm.org/D18190
llvm-svn: 264707
This patch simplifies the isRelRelative for AArch64 and add the missing
ones for bootstrap and test-suite. It also adds more testing for
shared object creation.
llvm-svn: 264322
R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX relocations were added in latest ABI:
https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf
They should be generated instead of R_X86_64_GOTPCREL for cases
when relaxation is possible. Currently this patch just process them in the
same way like R_X86_64_GOTPCREL. That should work for now
and we can implement relaxations later.
There is no testcases provided as I think there is no way to generate
such relocations using llvm-mc atm.
Differential revision: http://reviews.llvm.org/D18301
llvm-svn: 264043