This is a NFC splitted from D75342.
Previously obj2yaml never dumped a normal SHT_NULL section (i.e. when it is just zeroed)
or non-allocatable SHT_STRTAB/SHT_SYMTAB/SHT_DYNSYM sections.
This patch does not change the output, but it changes the logic so that we now dump these
sections, and them remove them later. It allows us to create and work with our internal representation
of sections, i.e. to work with the vector of Chunks, what looks cleaner.
It is used by D75342 and also should help us to support dumping a content that does not
belong to a section (i.e. to dump some data as `Fill` chunks).
Differential revision: https://reviews.llvm.org/D76684
This method it a bit too large.
It is becoming inconvenient to update it.
This patch suggests a way to reduce and cleanup it.
Differential revision: https://reviews.llvm.org/D76499
Currently obj2yaml always emits the `EntSize` property when `sh_entsize != 0`.
It is not correct. For example, for `SHT_DYNAMIC` section, `EntSize == 0`
is abnormal, while `sizeof(ELFT::Dyn)` is the expected default.
To reduce the output produces we should not dump default values.
yaml2obj tests that shows `sh_entsize` values produced are:
1) For `SHT_REL*` sections: `yaml2obj\ELF\reloc-sec-entry-size.yaml`
2) For `SHT_DYNAMIC`: `yaml2obj\ELF\dynamic-section.yaml`
Differential revision: https://reviews.llvm.org/D76227
`.rela.dyn` is a dynamic relocation section that normally has
no value in `sh_info` field.
The existent `elf-reladyn-section-shinfo.yaml` which tests this piece has issues:
1) It does not check the case when we have more than one `SHT_REL[A]`
section with `sh_info == 0` in the object. Because of this it did not catch the issue.
Currently we print an excessive "Info" field:
```
- Name: .rela.dyn
Type: SHT_RELA
EntSize: 0x0000000000000018
- Name: .rel.dyn
Type: SHT_REL
EntSize: 0x0000000000000010
Info: ' [1]'
```
2) It seems can be more generic. I've added a `rel-rela-section.yaml` instead.
Differential revision: https://reviews.llvm.org/D76281
Sometimes we need to dump an object and build it again from a YAML
description produced. The problem is that obj2yaml does not dump some
of sections, like string tables and symbol tables.
Because of that yaml2obj implicitly creates them and sections created
are not placed at their original locations. They are added to the end of a section list.
That makes a preparing test cases task harder than it can be.
This patch teaches obj2yaml to dump parts of allocatable SHT_STRTAB, SHT_SYMTAB
and SHT_DYNSYM sections to print placeholders for them.
This also allows to preserve usefull parameters, like virtual address.
Differential revision: https://reviews.llvm.org/D74955
I've noticed that it is not convenient to create YAMLs from
binaries (using obj2yaml) that have to be test cases for obj2yaml
later (after applying yaml2obj).
The problem, for example is that obj2yaml emits "DynamicSymbols:"
key instead of .dynsym. It also does not create .dynstr.
And when a YAML document without explicitly defined .dynsym/.dynstr
is given to yaml2obj, we have issues:
1) These sections are placed after non-allocatable sections (I've fixed it in D74756).
2) They have VA == 0. User needs create descriptions for such sections explicitly manually
to set a VA.
This patch addresses (2). I suggest to let yaml2obj assign virtual addresses by itself.
It makes an output binary to be much closer to "normal" ELF.
(It is still possible to use "Address: 0x0" for a section to get the original behavior
if it is needed)
Differential revision: https://reviews.llvm.org/D74764
I was reported that with commit:
https://github.com/llvm/llvm-project/commit/d3963051c490
gcc-9.2 is giving the warning below.
This should help (I have no gcc 9.2 to test).
[ 57%] Building CXX object tools/obj2yaml/CMakeFiles/obj2yaml.dir/elf2yaml.cpp.o
/llvm/tools/obj2yaml/elf2yaml.cpp: In instantiation of ‘llvm::Expected<llvm::ELFYAML::Object*>
{anonymous}::ELFDumper<ELFT>::dump() [with ELFT = llvm::object::ELFType<llvm::support::little, false>]’:
/llvm/tools/obj2yaml/elf2yaml.cpp:1218:31: required from ‘llvm::Error elf2yaml(llvm::raw_ostream&,
const llvm::object::ELFFile<ELFT>&) [with ELFT = llvm::object::ELFType<llvm::support::little, false>]’
/llvm/tools/obj2yaml/elf2yaml.cpp:1231:47: required from here
/llvm/tools/obj2yaml/elf2yaml.cpp:207:41: warning: comparison of integer expressions of different
signedness: ‘llvm::support::detail::packed_endian_specific_integral<unsigned int, llvm::support::little, 1>::value_type’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
207 | if (!SymTab || SymTabShndx->sh_link != SymTab - Sections.begin())
/llvm/tools/obj2yaml/elf2yaml.cpp: In instantiation of ‘llvm::Expected<llvm::ELFYAML::Object*>
{anonymous}::ELFDumper<ELFT>::dump() [with ELFT = llvm::object::ELFType<llvm::support::big, false>]’:
...
Previously the description allowed to describe symbols with use of
`Name` and `Index` keys. This patch removes them and now it is still
possible to use either names or symbol indexes, but the code is simpler
and the format is slightly different.
Such a change will be useful for another patches, e.g:
https://reviews.llvm.org/D73788#inline-671077
Differential revision: https://reviews.llvm.org/D73888
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Note: this is a reland with a trivial 2 lines fix in ELFState<ELFT>::writeSectionContent.
It adds a check similar to ones we already have for other sections to fix the case revealed
by bots, like http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/60744.
The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks
like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ]
i.e. start with an address, followed by any number of bitmaps. The address
entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31)
relocations each, at subsequent offsets following the last address entry.
More information is here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272
This patch adds a support for these sections.
Differential revision: https://reviews.llvm.org/D71872
The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks
like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ]
i.e. start with an address, followed by any number of bitmaps. The address
entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31)
relocations each, at subsequent offsets following the last address entry.
More information is here:
https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272
This patch adds a support for these sections.
Differential revision: https://reviews.llvm.org/D71872
We already have Symbols property to list regular symbols and
it is currently Optional<>. This patch makes DynamicSymbols to be optional
too. With this there is no need to define a dummy symbol anymore to trigger
creation of the .dynsym and it is now possible to define an empty .dynsym using
just the following line:
DynamicSymbols: []
(it is important to have when you do not want to have dynamic symbols,
but want to have a .dynsym)
Now the code is consistent and it helped to fix a bug: previously we
did not report an error when both Content/Size and an empty
Symbols/DynamicSymbols list were specified.
Differential revision: https://reviews.llvm.org/D70956
This section contains strings specifying libraries to be added to the link by the linker.
The strings are encoded as standard null-terminated UTF-8 strings.
This patch adds a way to describe and dump SHT_LLVM_DEPENDENT_LIBRARIES sections.
I introduced a new YAMLFlowString type here. That used to teach obj2yaml to dump
them like:
```
Libraries: [ foo, bar ]
```
instead of the following (if StringRef would be used):
```
Libraries:
- foo
- bar
```
Differential revision: https://reviews.llvm.org/D70598
SHT_LLVM_LINKER_OPTIONS section contains pairs of null-terminated strings.
This patch adds support for them.
Differential revision: https://reviews.llvm.org/D69895
Currently there is no way to describe the data that is not a part of an output section.
It can be a data used to align sections or to fill the gaps with something,
or another kind of custom data. In this patch I suggest a way to describe it. It looks like that:
```
Sections:
- Type: CustomFiller
Pattern: "CCDD"
Size: 4
- Name: .bar
Type: SHT_PROGBITS
Content: "FF"
```
I.e. I've added a kind of synthetic section with a synthetic type "CustomFiller".
In the code it is called a "SyntheticFiller", which is "a synthetic section which
might be used to write the custom data around regular output sections. It does
not present in the sections header table, but it might affect the output file size and
program headers produced. Think about it as about piece of data."
`SyntheticFiller` currently has a `Pattern` field and a `Size` field + an optional `Name`.
When written, `Size` of bytes in the output will be filled with a `Pattern`.
It is possible to reference a named filler it by name from the program headers description,
just like any other normal section.
Differential revision: https://reviews.llvm.org/D69709
SHT_NOTE is the section that consists of
namesz, descsz, type, name + padding, desc + padding data.
This patch teaches yaml2obj, obj2yaml to dump and parse them.
This patch implements the section how it is described here:
https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-18048.html
Which says: "For 64–bit objects and 32–bit objects, each entry is an array of 4-byte words in
the format of the target processor"
The official specification is different
http://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section
And says: "n 64-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS64), each entry is an array
of 8-byte words in the format of the target processor. In 32-bit objects (files with e_ident[EI_CLASS]
equal to ELFCLASS32), each entry is an array of 4-byte words in the format of the target processor"
Since LLVM uses the first, 32-bit way, this patch follows it.
Differential revision: https://reviews.llvm.org/D68983
This just reorders the code and removes an assignment
of an empty string for the case when a relocation has
no symbol associated. With this our output becomes
cleaner and shorter.
Differential revision: https://reviews.llvm.org/D69255
This patch tries to resolve problems faced in D68943
and uses some of the code written by Konrad Wilhelm Kleine
in that patch.
Previously, yaml2obj tool always created a .symtab section.
This patch changes that. With it we only create it when
have a "Symbols:" tag in the YAML document or when
we need to create it because it is used by another section(s).
obj2yaml follows the new behavior and does not print "Symbols:"
anymore when there is no symbol table.
Differential revision: https://reviews.llvm.org/D69041
llvm-svn: 375361
.stack_sizes is a SHT_PROGBITS section that contains pairs of
<address (4/8 bytes), stack size (uleb128)>.
This patch teach tools to parse and dump it.
Differential revision: https://reviews.llvm.org/D67757
llvm-svn: 372762
`struct Elf*_Shdr` has a field `sh_offset`, named `ShOffset` in
llvm::ELFYAML::Section. Rename SHOffset (e_shoff) to SHOff to prevent confusion.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D67254
llvm-svn: 371185
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
In some cases a symbol might have section index == SHN_XINDEX.
This is an escape value indicating that the actual section header index
is too large to fit in the containing field.
Then the SHT_SYMTAB_SHNDX section is used. It contains the 32bit values
that stores section indexes.
ELF gABI says that there can be multiple SHT_SYMTAB_SHNDX sections,
i.e. for example one for .symtab and one for .dynsym
(1) https://groups.google.com/forum/#!topic/generic-abi/-XJAV5d8PRg
(2) DT_SYMTAB_SHNDX: http://www.sco.com/developers/gabi/latest/ch5.dynamic.html
In this patch I am only supporting a single SHT_SYMTAB_SHNDX associated
with a .symtab. This is a more or less common case which is used a few tests I saw in LLVM.
I decided not to create the SHT_SYMTAB_SHNDX section as "implicit",
but implement is like a kind of regular section for now.
i.e. tools do not recreate this section or its content, like they do for
symbol table sections, for example. That should allow to write all kind of
possible broken test cases for our needs and keep the output closer to requested.
Differential revision: https://reviews.llvm.org/D65446
llvm-svn: 368272
Recently an advanced support of SHT_NULL sections
was implemented in yaml2obj.
This patch adds a corresponding support to obj2yaml.
Differential revision: https://reviews.llvm.org/D65215
llvm-svn: 367852
Because of a bug we did not report a error in the case
shown in the test. With this patch we do.
Differential revision: https://reviews.llvm.org/D65214
llvm-svn: 367203
No changes, LLD code was updated in r366057.
Original commit message:
ELF.h contains two getSymbol methods
which seems to be used only from obj2yaml.
One of these methods calls another, which in turn
contains untested error message which doesn't
provide enough information.
Problem is that after improving only just that message,
obj2yaml will not show it,
("Error reading file: yaml: Invalid data was
encountered while parsing the file" message will be shown instead),
because internal errors handling of tool is based on ErrorOr<> class which
stores a error code and as a result can only show a predefined error string, what
actually isn't very useful.
In this patch, I rework obj2yaml's error reporting system
for ELF targets to use Error Expected<> classes.
Also, I improve the error message produced
by getSymbol for demonstration of the new functionality.
Differential revision: https://reviews.llvm.org/D64631
llvm-svn: 366058
ELF.h contains two getSymbol methods
which seems to be used only from obj2yaml.
One of these methods calls another, which in turn
contains untested error message which doesn't
provide enough information.
Problem is that after improving only just that message,
obj2yaml will not show it,
("Error reading file: yaml: Invalid data was
encountered while parsing the file" message will be shown instead),
because internal errors handling of tool is based on ErrorOr<> class which
stores a error code and as a result can only show a predefined error string, what
actually isn't very useful.
In this patch, I rework obj2yaml's error reporting system
for ELF targets to use Error Expected<> classes.
Also, I improve the error message produced
by getSymbol for demonstration of the new functionality.
Differential revision: https://reviews.llvm.org/D64631
llvm-svn: 366052
Some of our test cases are using objects which
has sections with a broken sh_offset field.
There was no way to set it from YAML until this patch.
Differential revision: https://reviews.llvm.org/D63879
llvm-svn: 364898
This allows setting different values for e_shentsize, e_shoff, e_shnum
and e_shstrndx fields and is useful for producing broken inputs for various
test cases.
Differential revision: https://reviews.llvm.org/D63771
llvm-svn: 364517
The patch teaches yaml2obj/obj2yaml to support parsing/dumping
the sections and symbols with the same name.
A special suffix is added to a name to make it unique.
Differential revision: https://reviews.llvm.org/D63596
llvm-svn: 364282
With this patch we get ability to set any flags we want
for implicit sections defined in YAML.
Differential revision: https://reviews.llvm.org/D63136
llvm-svn: 363367
This is a follow-up for D62809.
Content and Size fields should be optional as was discussed in comments
of the D62809's thread. With that, we can describe a specific string table and
symbol table sections in a more correct way and also show appropriate errors.
The patch adds lots of test cases where the behavior is described in details.
Differential revision: https://reviews.llvm.org/D62957
llvm-svn: 362931
This makes the variables naming to match LLVM style,
simplifies the code used to extract the group members,
simplifies the loop and reorders the code around a bit.
llvm-svn: 359101
Currently, YAML has the following syntax for describing the symbols:
Symbols:
Local:
LocalSymbol1:
...
LocalSymbol2:
...
...
Global:
GlobalSymbol1:
...
Weak:
...
GNUUnique:
I.e. symbols are grouped by their bindings. That is not very convenient,
because:
It does not allow to set a custom binding, what can be useful for producing
broken/special outputs for test cases. Adding a new binding would require to
change a syntax (what we observed when added GNUUnique recently).
It does not allow to change the order of the symbols in .symtab/.dynsym,
i.e. currently all Local symbols are placed first, then Global, Weak and GNUUnique
are following, but we are not able to change the order.
It is not consistent. Binding is just one of the properties of the symbol,
we do not group them by other properties.
It makes the code more complex that it can be. This patch shows it can be simplified
with the change performed.
The patch changes the syntax to just:
Symbols:
Symbol1:
...
Symbol2:
...
...
With that, we are able to work with the binding field just like with any other symbol property.
Differential revision: https://reviews.llvm.org/D60122
llvm-svn: 357595