I've noticed that it is not convenient to create YAMLs from
binaries (using obj2yaml) that have to be test cases for obj2yaml
later (after applying yaml2obj).
The problem, for example is that obj2yaml emits "DynamicSymbols:"
key instead of .dynsym. It also does not create .dynstr.
And when a YAML document without explicitly defined .dynsym/.dynstr
is given to yaml2obj, we have issues:
1) These sections are placed after non-allocatable sections (I've fixed it in D74756).
2) They have VA == 0. User needs create descriptions for such sections explicitly manually
to set a VA.
This patch addresses (2). I suggest to let yaml2obj assign virtual addresses by itself.
It makes an output binary to be much closer to "normal" ELF.
(It is still possible to use "Address: 0x0" for a section to get the original behavior
if it is needed)
Differential revision: https://reviews.llvm.org/D74764
.dynsym and .dynstr are allocatable and therefore normally are placed
before non-allocatable .strtab, .shstrtab, .symtab sections.
But we are placing them after currently what creates a mix of
alloc/non-alloc sections and does not look normal.
Differential revision: https://reviews.llvm.org/D74756
Previously the description allowed to describe symbols with use of
`Name` and `Index` keys. This patch removes them and now it is still
possible to use either names or symbol indexes, but the code is simpler
and the format is slightly different.
Such a change will be useful for another patches, e.g:
https://reviews.llvm.org/D73788#inline-671077
Differential revision: https://reviews.llvm.org/D73888
We are missing ability to override the sh_entsize field for
SHT_REL[A] sections. It would be useful for writing test cases.
Differential revision: https://reviews.llvm.org/D73621
Note: this is a reland with a trivial 2 lines fix in ELFState<ELFT>::writeSectionContent.
It adds a check similar to ones we already have for other sections to fix the case revealed
by bots, like http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/60744.
The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks
like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ]
i.e. start with an address, followed by any number of bitmaps. The address
entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31)
relocations each, at subsequent offsets following the last address entry.
More information is here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272
This patch adds a support for these sections.
Differential revision: https://reviews.llvm.org/D71872
The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks
like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ]
i.e. start with an address, followed by any number of bitmaps. The address
entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31)
relocations each, at subsequent offsets following the last address entry.
More information is here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272
This patch adds a support for these sections.
Differential revision: https://reviews.llvm.org/D71872
Currently we have the `Flags` property that allows to
set flags for a section. The problem is that it does not
allow us to set an arbitrary value, because of bit fields
validation under the hood. An arbitrary values can be used
to test specific broken cases.
We probably do not want to relax the validation, so this
patch adds a `ShSize` property that allows to
override the `sh_size`. It is inline with others `Sh*` properties
we have already.
Differential revision: https://reviews.llvm.org/D71411
We already have Symbols property to list regular symbols and
it is currently Optional<>. This patch makes DynamicSymbols to be optional
too. With this there is no need to define a dummy symbol anymore to trigger
creation of the .dynsym and it is now possible to define an empty .dynsym using
just the following line:
DynamicSymbols: []
(it is important to have when you do not want to have dynamic symbols,
but want to have a .dynsym)
Now the code is consistent and it helped to fix a bug: previously we
did not report an error when both Content/Size and an empty
Symbols/DynamicSymbols list were specified.
Differential revision: https://reviews.llvm.org/D70956
This section contains strings specifying libraries to be added to the link by the linker.
The strings are encoded as standard null-terminated UTF-8 strings.
This patch adds a way to describe and dump SHT_LLVM_DEPENDENT_LIBRARIES sections.
I introduced a new YAMLFlowString type here. That used to teach obj2yaml to dump
them like:
```
Libraries: [ foo, bar ]
```
instead of the following (if StringRef would be used):
```
Libraries:
- foo
- bar
```
Differential revision: https://reviews.llvm.org/D70598
SHT_LLVM_LINKER_OPTIONS section contains pairs of null-terminated strings.
This patch adds support for them.
Differential revision: https://reviews.llvm.org/D69895
Currently there is no way to describe the data that is not a part of an output section.
It can be a data used to align sections or to fill the gaps with something,
or another kind of custom data. In this patch I suggest a way to describe it. It looks like that:
```
Sections:
- Type: CustomFiller
Pattern: "CCDD"
Size: 4
- Name: .bar
Type: SHT_PROGBITS
Content: "FF"
```
I.e. I've added a kind of synthetic section with a synthetic type "CustomFiller".
In the code it is called a "SyntheticFiller", which is "a synthetic section which
might be used to write the custom data around regular output sections. It does
not present in the sections header table, but it might affect the output file size and
program headers produced. Think about it as about piece of data."
`SyntheticFiller` currently has a `Pattern` field and a `Size` field + an optional `Name`.
When written, `Size` of bytes in the output will be filled with a `Pattern`.
It is possible to reference a named filler it by name from the program headers description,
just like any other normal section.
Differential revision: https://reviews.llvm.org/D69709
Before this change .symtab section was required for SHT_REL[A] section
declarations. yaml2obj automatically defined it in case when YAML document
did not have it.
With this change it is now possible to produce an object that
has a relocation section, but has no symbol table.
It simplifies the code and also it is inline with how we handle Link fields
for another special sections.
Differential revision: https://reviews.llvm.org/D69260
Currently, when we do not specify "Info" field in a YAML description
for SHT_GROUP section, yaml2obj reports an error:
"error: unknown symbol referenced: '' by YAML section '.group1'"
Also, we do not link it with a symbol table by default,
though it is what we do for AddrsigSection, HashSection, RelocationSection.
(http://www.sco.com/developers/gabi/latest/ch4.sheader.html#sh_link)
The patch fixes missings mentioned.
Differential revision: https://reviews.llvm.org/D69299
SHT_NOTE is the section that consists of
namesz, descsz, type, name + padding, desc + padding data.
This patch teaches yaml2obj, obj2yaml to dump and parse them.
This patch implements the section how it is described here:
https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-18048.html
Which says: "For 64–bit objects and 32–bit objects, each entry is an array of 4-byte words in
the format of the target processor"
The official specification is different
http://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section
And says: "n 64-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS64), each entry is an array
of 8-byte words in the format of the target processor. In 32-bit objects (files with e_ident[EI_CLASS]
equal to ELFCLASS32), each entry is an array of 4-byte words in the format of the target processor"
Since LLVM uses the first, 32-bit way, this patch follows it.
Differential revision: https://reviews.llvm.org/D68983
This patch tries to resolve problems faced in D68943
and uses some of the code written by Konrad Wilhelm Kleine
in that patch.
Previously, yaml2obj tool always created a .symtab section.
This patch changes that. With it we only create it when
have a "Symbols:" tag in the YAML document or when
we need to create it because it is used by another section(s).
obj2yaml follows the new behavior and does not print "Symbols:"
anymore when there is no symbol table.
Differential revision: https://reviews.llvm.org/D69041
llvm-svn: 375361
It allows using "Size" with or without "Content" in YAML descriptions of
SHT_LLVM_ADDRSIG sections.
Differential revision: https://reviews.llvm.org/D68334
llvm-svn: 373610
This is a follow-up for D68085 which allows using "Size"
tag together with "Content" tag or alone.
Differential revision: https://reviews.llvm.org/D68216
llvm-svn: 373473
Currently we can't use unique suffixes in section names to describe
stack sizes sections. E.g. '.stack_sizes [1]' will be treated as a regular section.
This happens because we recognize stack sizes section by name and
do not yet drop the suffix before the check.
The patch fixes it.
Differential revision: https://reviews.llvm.org/D68018
llvm-svn: 372853
It is a follow-up requested in the review comment
for D67757. Allows to use Content + Size or just Size
when describing .stack_sizes sections in YAML document
Differential revision: https://reviews.llvm.org/D67958
llvm-svn: 372845
.stack_sizes is a SHT_PROGBITS section that contains pairs of
<address (4/8 bytes), stack size (uleb128)>.
This patch teach tools to parse and dump it.
Differential revision: https://reviews.llvm.org/D67757
llvm-svn: 372762
This is a continuation of the YAML library error reporting
refactoring/improvement and the idea by itself was mentioned
in the following thread:
https://reviews.llvm.org/D67182?id=218714#inline-603404
This performs a cleanup of all object emitters in the library.
It allows using the custom one provided by the caller.
One of the nice things is that each tool can now print its tool name,
e.g: "yaml2obj: error: <text>"
Also, the code became a bit simpler.
Differential revision: https://reviews.llvm.org/D67445
llvm-svn: 371865
The address difference between two sections in a PT_LOAD is a constant.
Consider a hypothetical case (pagesize can be very small, say, 4).
```
.text sh_addralign=4
.text.hot sh_addralign=16
```
If we set p_align to 4, the PT_LOAD will be loaded at an address which
is a multiple of 4. The address of .text.hot is guaranteed to be a
multiple of 4, but not necessarily a multiple of 16.
This patch deletes the constraint
if (SHeader->sh_offset == PHeader.p_offset)
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D67260
llvm-svn: 371501
This fixes a bug as well. When "FileSize:" (p_filesz) is specified and
different from the actual value, the following code probably should not
use PHeader.p_filesz:
if (SHeader->sh_offset == PHeader.p_offset + PHeader.p_filesz)
PHeader.p_memsz += SHeader->sh_size;
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D67256
llvm-svn: 371420
The aim of this patch is to refactor how we handle and report error.
I suggest to use the same approach we use in LLD: delayed error reporting.
For that I introduced 'HasError' flag which triggers when we report an error.
Now we do not exit instantly on any error. The benefits are:
1) There are no more 'exit(1)' calls in the library code.
2) Code was simplified significantly in a few places.
3) It is now possible to print multiple errors instead of only one.
Also, I changed the messages to be lower case and removed a full stop.
Differential revision: https://reviews.llvm.org/D67182
llvm-svn: 371380
`struct Elf*_Shdr` has a field `sh_offset`, named `ShOffset` in
llvm::ELFYAML::Section. Rename SHOffset (e_shoff) to SHOff to prevent confusion.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67254
llvm-svn: 371185
Summary: It says [[ http://www.sco.com/developers/gabi/latest/ch4.eheader.html | here ]] that if there are no program headers than e_phoff should be 0, but currently it is always set after the header. GNU's `readelf` (but not `llvm-readelf`) complains about this: `readelf: Warning: possibly corrupt ELF header - it has a non-zero program header offset, but no program headers`.
Reviewers: jhenderson, grimar, MaskRay, rupprecht
Reviewed By: jhenderson, grimar, MaskRay
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67054
llvm-svn: 371162
Linkers (ld.bfd/gold/lld) place the section header table at the very
end. This allows tools to strip it, which is optional in executable/shared objects.
In addition, if we add or section, the size of the section header table
will change. Placing the section header table in the end keeps section
offsets unchanged.
yaml2obj currently places the section header table immediately after the
program header. Follow what linkers do to make offset updating easier.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67221
llvm-svn: 371074
In D62809 I accidentally added "ELFState<ELFT> &State" as the
first parameter to two methods. There is no reason for having that.
I removed this argument and also moved finalizeStrings declaration to
remove an excessive 'private:' tag.
Differential revision: https://reviews.llvm.org/D67157
llvm-svn: 371033
This is in line with the previous changes which allowed to
override the sh_offset/sh_size and useful for writing test cases.
Differential revision: https://reviews.llvm.org/D66998
llvm-svn: 370633
Currenly we can encode the 'st_other' field of symbol using 3 fields.
'Visibility' is used to encode STV_* values.
'Other' is used to encode everything except the visibility, but it can't handle arbitrary values.
'StOther' is used to encode arbitrary values when 'Visibility'/'Other' are not helpfull enough.
'st_other' field is used to encode symbol visibility and platform-dependent
flags and values. Problem to encode it is that it consists of Visibility part (STV_* values)
which are enumeration values and the Other part, which is different and inconsistent.
For MIPS the Other part contains flags for all STO_MIPS_* values except STO_MIPS_MIPS16.
(Like comment in ELFDumper says: "Someones in their infinite wisdom decided to make
STO_MIPS_MIPS16 flag overlapped with other ST_MIPS_xxx flags."...)
And for PPC64 the Other part might actually encode any value.
This patch implements custom logic for handling the st_other and removes
'Visibility' and 'StOther' fields.
Here is an example of a new YAML style this patch allows:
- Name: foo
Other: [ 0x4 ]
- Name: bar
Other: [ STV_PROTECTED, 4 ]
- Name: zed
Other: [ STV_PROTECTED, STO_MIPS_OPTIONAL, 0xf8 ]
Differential revision: https://reviews.llvm.org/D66886
llvm-svn: 370472
This allows us to produce broken binaries with local
symbols placed after global in '.dynsym'/'.symtab'
Also, simplifies the code.
Differential revision: https://reviews.llvm.org/D66799
llvm-svn: 370331
This relands this commit, I mistakenly reverted the original change
thinking it was the cause of the observed MSan failures but it was not.
llvm-svn: 370206