Commit Graph

225 Commits

Author SHA1 Message Date
Rafael Espindola 2deeb6093d Fix PR28575.
Not all relocations from a .eh_frame that point to an executable
section should be ignored. In particular, the relocation finding the
personality function should not.

This is a reduction from trying to bootstrap a static lld on linux.

llvm-svn: 276329
2016-07-21 20:18:30 +00:00
Rafael Espindola 6eae9f2c67 Delete SplitInputSection.
This opens the way for having a different Piece type for EhInputSection.

llvm-svn: 276275
2016-07-21 13:32:37 +00:00
Rafael Espindola 2197311c31 Delete EhInputSection::getOffset.
We no longer need it for relocations in .eh_frame.

The only relocations that point to .eh_frame are the ones trying to
find the output .eh_frame.

This actually fixes a bug in the symbol value code. It was not
handling -1 as an indicator for a piece not being included in the
output.

llvm-svn: 276175
2016-07-20 20:19:58 +00:00
Rafael Espindola 0f7cedaa1e Create thunks before regular relocation scan.
We will need to do something like this to support range extension
thunks since that process is iterative.

Doing this also has the advantage that when doing the regular
relocation scan the offset in the output section is known and we can
just store that. This reduces the number of times we have to run
getOffset and I think will allow a more specialized .eh_frame
representation.

By itself this is already a performance win.

firefox
  master 7.295045737
  patch  7.209466989 0.98826892235
chromium
  master 4.531254468
  patch  4.509221804 0.995137623774
chromium fast
  master 1.836928973
  patch  1.823805241 0.992855612714
the gold plugin
  master 0.379768791
  patch  0.380043405 1.00072310839
clang
  master 0.642698284
  patch  0.642215663 0.999249070657
llvm-as
  master 0.036665467
  patch  0.036456225 0.994293213284
the gold plugin fsds
  master 0.40395817
  patch  0.404384555 1.0010555177
clang fsds
  master 0.722045545
  patch  0.720946135 0.998477367518
llvm-as fsds
  master 0.03292646
  patch  0.032759965 0.994943428477
scylla
  master 3.427376378
  patch  3.368316181 0.98276810292

llvm-svn: 276146
2016-07-20 17:58:07 +00:00
Eugene Leviant e63d81bd05 [ELF] Create output sections in LinkerScript class
llvm-svn: 276121
2016-07-20 14:43:20 +00:00
George Rimar 5d53d1f42c [ELF] - Make few members of Writer to be global and export them for reuse
Creating sections on linkerscript side requires some methods
that can be reused if are exported from writer.

Patch implements that change.

Differential revision: http://reviews.llvm.org/D20104

llvm-svn: 275162
2016-07-12 08:50:42 +00:00
Rui Ueyama ec1b80fd11 Remove unused parameters.
llvm-svn: 275153
2016-07-12 03:49:41 +00:00
Peter Smith fb05cd997c Recommit R274836 Add Thunk support framework for ARM and Mips
The TinyPtrVector of const Thunk<ELFT>* in InputSections.h can cause 
build failures on certain compiler/library combinations when Thunk<ELFT> 
is not a complete type or is an abstract class. Fixed by making Thunk<ELFT>
non Abstract.

type or is an abstract class 

llvm-svn: 274863
2016-07-08 16:10:27 +00:00
Peter Smith eeb827447e Revert R274836 Add Thunk support framework for ARM and Mips
This seems to be causing a buildbot failure on lld-x86_64-freebsd. Will
reproduce locally and fix. 

llvm-svn: 274841
2016-07-08 12:25:50 +00:00
Peter Smith de01b98a26 Add Thunk support framework for ARM and Mips
Generalise the Mips LA25 Thunk code and implement ARM and Thumb
    interworking Thunks.
    
    - Introduce a new module Thunks.cpp to store the Target Specific Thunk
      implementations.
    - DefinedRegular and Shared have a ThunkData field to record Thunk.
    - A Target can have more than one type of Thunk.
    - Support PC-relative calls to Thunks.
    - Support Thunks to PLT entries.
    - Existing Mips LA25 Thunk code integrated.
    - Support for ARMv7A interworking Thunks.
    
    Limitations:
    - Only one Thunk per SymbolBody, this is sufficient for all currently
      implemented Thunks.
    - ARM thunks assume presence of V6T2 MOVT and MOVW instructions.

    Differential revision: http://reviews.llvm.org/D21891

llvm-svn: 274836
2016-07-08 11:13:40 +00:00
Rui Ueyama 1d12ac1d11 Fix endianness issue.
Previously, ch_size was read in host byte order, so if a host and
a target are different in byte order, we would produce a corrupted
output.

llvm-svn: 274729
2016-07-07 03:55:55 +00:00
George Rimar 602fbee9fc [ELF] - Support of compressed input sections implemented.
Patch implements support of zlib style compressed sections.
SHF_COMPRESSED flag is used to recognize that decompression is required.
After that decompression is performed and flag is removed from output.

Differential revision: http://reviews.llvm.org/D20272

llvm-svn: 273661
2016-06-24 11:18:44 +00:00
Simon Atanasyan 002e244717 [ELF][MIPS] Support MIPS TLS relocations
The patch adds one more partition to the MIPS GOT. This time it is for
TLS related GOT entries. Such entries are located after 'local' and 'global'
ones. We cannot get a final offset for these entries at the time of
creation because we do not know size of 'local' and 'global' partitions.
So we have to adjust the offset later using `getMipsTlsOffset()` method.

All MIPS TLS relocations which need GOT entries operates MIPS style GOT
offset - 'offset from the GOT's beginning' - MipsGPOffset constant. That
is why I add new types of relocation expressions.

One more difference from othe ABIs is that the MIPS ABI does not support
any TLS relocation relaxations. I decided to make a separate function
`handleMipsTlsRelocation` and put MIPS TLS relocation handling code
there. It is similar to `handleTlsRelocation` routine and duplicates its
code. But it allows to make the code cleaner and prevent pollution of
the `handleTlsRelocation` by MIPS 'if' statements.

Differential Revision: http://reviews.llvm.org/D21606

llvm-svn: 273569
2016-06-23 15:26:31 +00:00
Rui Ueyama 809d8e2d41 Fix a bug that MIPS thunks can overwrite other section contents.
Peter Smith found while trying to support thunk creation for ARM that
LLD sometimes creates broken thunks for MIPS. The cause of the bug is
that we assign file offsets to input sections too early. We need to
create all sections and then assign section offsets because appending
thunks changes file offsets for all following sections.

This patch separates the pass to assign file offsets from thunk
creation pass. This effectively reverts r265673.

Differential Revision: http://reviews.llvm.org/D21598

llvm-svn: 273532
2016-06-23 04:33:42 +00:00
Simon Atanasyan 4132511cdc [ELF][MIPS] Support GOT entries for non-preemptible symbols with different addends
There are two motivations for this patch. The first one is a preparation
for support MIPS TLS relocations. It might sound like a joke but for GOT
entries related to TLS relocations MIPS ABI uses almost regular approach
with creation of dynamic relocations for each GOT enty etc. But we need
to separate these 'regular' TLS related entries from MIPS specific local
and global parts of GOT. ABI declare simple solution - all TLS related
entries allocated at the end of GOT after local/global parts. The second
motivation it to support GOT relocations for non-preemptible symbols
with addends. If we have more than one GOT relocations against symbol S
with different addends we need to create GOT entries for each unique
Symbol/Addend pairs.

So we store all MIPS GOT entries in separate containers. For non-preemptible
symbols we have to maintain two data structures. The first one is MipsLocal
vector. Each entry corresponds to the GOT entry from the 'local' part
of the GOT contains the symbol's address plus addend. The second one
is MipsLocalMap. It is a map from Symbol/Addend pair to the GOT index.

Differential Revision: http://reviews.llvm.org/D21297

llvm-svn: 273127
2016-06-19 21:39:37 +00:00
Rui Ueyama 424b408165 Rename Align -> Alignment.
I think it is me who named these variables, but I always find that
they are slightly confusing because align is a verb.
Adding four letters is worth it.

llvm-svn: 272984
2016-06-17 01:18:46 +00:00
Rafael Espindola e1979aed0a Implement gd to ie relaxation for aarch64.
llvm-svn: 271815
2016-06-04 23:33:31 +00:00
Rafael Espindola 69f5402b26 Use adjustRelaxExpr for tls relaxations too.
This remove some EM_386 specific code from InputSection.cpp and opens
the way for more relaxations.

llvm-svn: 271814
2016-06-04 23:22:34 +00:00
Rafael Espindola 12dc446939 Fix implicit plt creation on aarch64.
We were not handling page relative relocations.

llvm-svn: 271798
2016-06-04 19:11:14 +00:00
Rafael Espindola e37d13b9ec Start adding tlsdesc support for aarch64.
This is mostly extracted from http://reviews.llvm.org/D18960.

The general idea for tlsdesc is that the two GD got entries are used
for a function pointer and its argument. The dynamic linker sets
both. In the non-dlopen case the dynamic linker sets the function to
the identity and the argument to the offset in the tls block.

All that the static linker has to do in the non-dlopen case is
relocate the code to point to the got entries and create a dynamic
relocation.

The dlopen case is more complicated, but can be implemented in another patch.

llvm-svn: 271569
2016-06-02 19:49:53 +00:00
George Rimar f10c8290fa [ELF] - Implemented support for test/binop relaxations from latest ABI.
Patch implements next relaxation from latest ABI:

"Convert memory operand of test and binop into immediate operand, where binop is one of adc, add, and, cmp, or,
sbb, sub, xor instructions, when position-independent code is disabled."

It is described in System V Application Binary Interface AMD64 Architecture Processor 
Supplement Draft Version 0.99.8 (https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf, 
B.2 "B.2 Optimize GOTPCRELX Relocations").

Differential revision: http://reviews.llvm.org/D20793

llvm-svn: 271405
2016-06-01 16:45:30 +00:00
Rafael Espindola a8433c1d1b Revert "bar"
This reverts commit r271365.
Sorry, wrong branch.

llvm-svn: 271366
2016-06-01 06:15:22 +00:00
Rafael Espindola 74540516ef bar
llvm-svn: 271365
2016-06-01 06:13:54 +00:00
Rui Ueyama 8b972d221e Simplify. NFC.
llvm-svn: 271133
2016-05-28 18:40:38 +00:00
Rui Ueyama 406b469de4 Avoid doing binary search.
MergedInputSection::getOffset is the busiest function in LLD if string
merging is enabled and input files have lots of mergeable sections.
It is usually the case when creating executable with debug info,
so it is pretty common.

The reason why it is slow is because it has to do faily complex
computations. For non-mergeable sections, section contents are
contiguous in output, so in order to compute an output offset,
we only have to add the output section's base address to an input
offset. But for mergeable strings, section contents are split for
merging, so they are not contigous. We've got to do some lookups.

We used to do binary search on the list of section pieces.
It is slow because I think it's hostile to branch prediction.

This patch replaces it with hash table lookup. Seems it's working
pretty well. Below is "perf stat -r10" output when linking clang
with debug info. In this case this patch speeds up about 4%.

Before:

       6584.153205 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.09% )
               238 context-switches          #    0.036 K/sec                    ( +-  6.59% )
                 0 cpu-migrations            #    0.000 K/sec                    ( +- 50.92% )
         1,067,675 page-faults               #    0.162 M/sec                    ( +-  0.15% )
    18,369,931,470 cycles                    #    2.790 GHz                      ( +-  0.09% )
     9,640,680,143 stalled-cycles-frontend   #   52.48% frontend cycles idle     ( +-  0.18% )
   <not supported> stalled-cycles-backend
    21,206,747,787 instructions              #    1.15  insns per cycle
                                             #    0.45  stalled cycles per insn  ( +-  0.04% )
     3,817,398,032 branches                  #  579.786 M/sec                    ( +-  0.04% )
       132,787,249 branch-misses             #    3.48% of all branches          ( +-  0.02% )

       6.579106511 seconds time elapsed                                          ( +-  0.09% )

After:

       6312.317533 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.19% )
               221 context-switches          #    0.035 K/sec                    ( +-  4.11% )
                 1 cpu-migrations            #    0.000 K/sec                    ( +- 45.21% )
         1,280,775 page-faults               #    0.203 M/sec                    ( +-  0.37% )
    17,611,539,150 cycles                    #    2.790 GHz                      ( +-  0.19% )
    10,285,148,569 stalled-cycles-frontend   #   58.40% frontend cycles idle     ( +-  0.30% )
   <not supported> stalled-cycles-backend
    18,794,779,900 instructions              #    1.07  insns per cycle
                                             #    0.55  stalled cycles per insn  ( +-  0.03% )
     3,287,450,865 branches                  #  520.799 M/sec                    ( +-  0.03% )
        72,259,605 branch-misses             #    2.20% of all branches          ( +-  0.01% )

       6.307411828 seconds time elapsed                                          ( +-  0.19% )

Differential Revision: http://reviews.llvm.org/D20645

llvm-svn: 270999
2016-05-27 14:39:13 +00:00
Simon Atanasyan 84bb355c3a [ELF][MIPS] Handle section symbol points to the .MIPS.options / .reginfo section
MIPS .reginfo and .MIPS.options sections are consumed by the linker, and
the linker produces a single output section. But it is possible that
input files contain section symbol points to the corresponding input
section. In case of generation a relocatable output we need to write
such symbols to the output file.

Fixes bug 27878.

Differential Revision: http://reviews.llvm.org/D20688

llvm-svn: 270910
2016-05-26 20:46:01 +00:00
George Rimar 5c33b91bbe [ELF] - Implemented optimization for R_X86_64_GOTPCREL relocation.
System V Application Binary Interface AMD64 Architecture Processor Supplement Draft Version 0.99.8 
(https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf, B.2 "B.2 Optimize GOTPCRELX Relocations")
introduces possible relaxations for R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX.

That patch implements the next relaxation: 
mov foo@GOTPCREL(%rip), %reg => lea foo(%rip), %reg
and also opens door for implementing all other ones.

Implementation was suggested by Rafael Ávila de Espíndola with few additions and testcases by myself.

Differential revision: http://reviews.llvm.org/D15779

llvm-svn: 270705
2016-05-25 14:31:37 +00:00
Rafael Espindola bfffa94ea7 Fix crash in .eh_frame marker section.
llvm-svn: 270563
2016-05-24 14:51:50 +00:00
Rafael Espindola 29da3e3577 Simplify. Thanks to Rui for the suggestion.
llvm-svn: 270555
2016-05-24 12:17:11 +00:00
Rafael Espindola fe3a2f1b81 Revert "Simplify. Thanks to Rui for the suggestion."
This reverts commit r270551.

Sorry, I commited the wrong branch :-(

llvm-svn: 270554
2016-05-24 12:12:06 +00:00
Rafael Espindola dba64b8ea4 Simplify. Thanks to Rui for the suggestion.
llvm-svn: 270551
2016-05-24 11:53:15 +00:00
Rui Ueyama 0b9a90364b Rename EHInputSection -> EhInputSection.
llvm-svn: 270532
2016-05-24 04:19:20 +00:00
Rui Ueyama f5febef249 Create a new file EhFrame.cpp and move code to read .eh_frame there.
llvm-svn: 270526
2016-05-24 02:55:45 +00:00
Rui Ueyama b91bf1a9a0 Do not split mergeable sections if they are gc'ed.
Previously, mergeable section's constructors did more than just
setting member variables; it split section contents into small
pieces. It is not always computationally cheap task because if
the section is a mergeable string section, it needs to scan the
entire section to split them by NUL characters.

If a section would be thrown away by GC, that cost ended up
being a waste of time. It is going to be larger problem if the
section is compressed -- the whole time to uncompress it and
split it up is going to be a waste.

Luckily, we can defer section splitting after GC. We just have
to remember which offsets are in use during GC and apply that later.
This patch implements it.

Differential Revision: http://reviews.llvm.org/D20516

llvm-svn: 270455
2016-05-23 16:55:43 +00:00
Rui Ueyama 744d47ea05 Make file-local function file-local. NFC.
llvm-svn: 270387
2016-05-23 00:45:54 +00:00
Rui Ueyama 518f1af04d Split MergeInputSection's ctor. NFC.
llvm-svn: 270386
2016-05-23 00:40:24 +00:00
Rui Ueyama 88abd9b300 Move splitInputSection from EHOutputSection to EHInputSection.
llvm-svn: 270385
2016-05-22 23:53:00 +00:00
Rui Ueyama 34dc99e2c5 Store section contents to SectionPiece. NFC.
So that we don't need to cut a slice when we use a SectionPiece.

llvm-svn: 270348
2016-05-22 01:15:32 +00:00
Rui Ueyama 90fa3722d2 Simplify SplitInputSection::getRangeAndSize.
This patch adds Size member to SectionPiece so that getRangeAndSize
can just return a SectionPiece instead of a std::pair<SectionPiece *, uint_t>.
Also renamed the function.

llvm-svn: 270346
2016-05-22 00:41:38 +00:00
Rui Ueyama 3ea8727188 Define SectionPiece and use it instead of std::pair<uint_t, uint_t>.
We were using std::pair to represents pieces of splittable section
contents. It hurt readability because "first" and "second" are not
meaningful. This patch give them names.

One more thing is that piecewise liveness information is stored to
the second element of the pair as a special value of output section
offset. It was confusing, so I defiend a new bit, "Live", in the
new struct.

llvm-svn: 270340
2016-05-22 00:13:04 +00:00
Simon Atanasyan 1c980ca5aa [ELF] Take into account offset in the output section when read addends for a non-alloc input section
llvm-svn: 270328
2016-05-21 19:48:54 +00:00
Rafael Espindola ebed1fe0de Refactor R_RELAX_TLS_* value computation.
This makes it explicit that each R_RELAX_TLS_* is equivalent to some
other expression.

With this I think we are at a sweet spot for how much is done in
Target.cpp. I did experiment with moving *all* the value math out of it.
It has the advantage that we know the final value in target independent
code, but it gets quite verbose.

llvm-svn: 270277
2016-05-20 21:23:52 +00:00
Rafael Espindola 50223310ba Simplify a bit. NFC.
llvm-svn: 270275
2016-05-20 21:14:06 +00:00
Rafael Espindola 74f3dbe438 Directly compute the right value for R_RELAX_TLS_GD_TO_IE.
This avoid doing math in Target.cpp to compensate.

llvm-svn: 270266
2016-05-20 20:09:35 +00:00
Rafael Espindola 8818ca69dc Make tp offset computation target independent.
This adds direct support for computing offsets from the thread pointer
for both variants. Of the architectures we support, variant 1 is used
only by aarch64 (but that doesn't seem to be documented anywhere.)

llvm-svn: 270243
2016-05-20 17:41:09 +00:00
Simon Atanasyan 4e3a15c9f3 [ELF][MIPS] Rename R_MIPS_GOT_xxx relocation expression kinds
New names reflect purpose of corresponding GOT entries better.
Both expression types related to entries allocated in the 'local'
part of MIPS GOT. R_MIPS_GOT_LOCAL_PAGE is for entries contain 'page'
addresses. R_MIPS_GOT_LOCAL is for entries contain 'full' address.

llvm-svn: 269597
2016-05-15 18:13:50 +00:00
Rafael Espindola 3e0b7837bf Cache result when tail merging too.
This speeds up a link of chromium with -O2 (but no icf,gc) from
1.940664632 to 1.925578119.

llvm-svn: 268639
2016-05-05 16:12:25 +00:00
Peter Collingbourne e29e142a10 ELF: Do not use -1 to mark pieces of merge sections as being tail merged.
We were previously using an output offset of -1 for both GC'd and tail
merged pieces. We need to distinguish these two cases in order to filter
GC'd symbols from the symbol table -- we were previously asserting when we
asked for the VA of a symbol pointing into a dead piece, which would end
up asking the tail merging string table for an offset even though we hadn't
initialized it properly.

This patch fixes the bug by using an offset of -1 to exclusively mean GC'd
pieces, using 0 for tail merges, and distinguishing the tail merge case from
an offset of 0 by asking the output section whether it is tail merge.

Differential Revision: http://reviews.llvm.org/D19953

llvm-svn: 268604
2016-05-05 04:10:12 +00:00
Rafael Espindola ebb04b9eb6 Simplify handling of hint relocations.
llvm-svn: 268501
2016-05-04 14:44:22 +00:00
Simon Atanasyan 5e85a1b5be [ELF][MIPS] Fix typo in the comment. NFC.
llvm-svn: 268486
2016-05-04 10:15:12 +00:00