68 Commits

Author SHA1 Message Date
Saleem Abdulrasool 06036dbc6e MC: correct the emission of weak aliases in COFF
The weak alias should have the characteristics set to
`IMAGE_WEAK_EXTERN_SEARCH_ALIAS` to indicate that the weak external here
is a symbol alias and that the symbol is aliased to a locally defined
symbol.  We were previously setting the characteristics to
`IMAGE_WEAK_EXTERN_SEARCH_LIBRARY` which indicates that the symbol
should be looked for in the libraries.
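
A rough illustration of the record involved, written in Python for brevity. It
assumes the weak-external auxiliary symbol layout from the PE/COFF
specification (a 4-byte TagIndex, a 4-byte Characteristics field and 10 unused
bytes) and the specification's `IMAGE_WEAK_EXTERN_SEARCH_*` values;
`pack_weak_external_aux` is a hypothetical helper, not LLVM code.

```python
import struct

# Characteristics values for weak externals, as named in the PE/COFF spec.
IMAGE_WEAK_EXTERN_SEARCH_NOLIBRARY = 1
IMAGE_WEAK_EXTERN_SEARCH_LIBRARY = 2
IMAGE_WEAK_EXTERN_SEARCH_ALIAS = 3


def pack_weak_external_aux(tag_index, characteristics):
    """Pack an 18-byte weak-external auxiliary symbol record (hypothetical helper)."""
    # Little-endian: TagIndex (u32), Characteristics (u32), then 10 padding bytes.
    return struct.pack("<II10x", tag_index, characteristics)


# A weak alias pointing at the symbol with table index 5.
record = pack_weak_external_aux(5, IMAGE_WEAK_EXTERN_SEARCH_ALIAS)
print(len(record), record[:8].hex())
```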

llvm-svn: 364370
2019-06-26 01:09:52 +00:00
George Rimar 60dc5d4b61 [yaml2obj/obj2yaml] - Allow having the symbols and sections with duplicated names.
The patch teaches yaml2obj/obj2yaml to support parsing/dumping
the sections and symbols with the same name.
A special suffix is added to a name to make it unique.

Differential revision: https://reviews.llvm.org/D63596

llvm-svn: 364282
2019-06-25 08:22:57 +00:00
Fangrui Song ac14f7b10c [lit] Delete empty lines at the end of lit.local.cfg NFC
llvm-svn: 363538
2019-06-17 09:51:07 +00:00
Peter Collingbourne 31fda09b2d Add IR support, ELF section and user documentation for partitioning feature.
The partitioning feature was proposed here:
http://lists.llvm.org/pipermail/llvm-dev/2019-February/130583.html

This is mostly just documentation. The feature itself will be contributed
in subsequent patches.

Differential Revision: https://reviews.llvm.org/D60242

llvm-svn: 361923
2019-05-29 03:29:01 +00:00
Ben Dunbobbin 1d16515fb4 [ELF] Implement Dependent Libraries Feature
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.

Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.

The design goals were to provide:

- A simple linking model for developers to reason about.
- The ability to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
  environments (MSVC in particular).

Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.

In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:

1. There are no competing defined symbols in a given set of libraries, or
   if they exist, the program owner doesn't care which is linked to their
   program.
2. There may be circular dependencies between libraries.

The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:

.section ".deplibs","MS",@llvm_dependent_libraries,1
         .asciz "foo"
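
As a sketch of how a consumer might read such a section back, the Python
snippet below splits the raw section bytes into NUL-terminated strings.
`parse_deplibs` is a hypothetical name used for illustration and is not part
of LLD; the UTF-8 decoding is also an assumption.

```python
def parse_deplibs(section_data):
    """Split raw string-section bytes into their NUL-terminated entries."""
    entries = section_data.split(b"\0")
    # Every entry ends with a NUL byte, so the last split element is empty.
    if entries and entries[-1] == b"":
        entries.pop()
    return [entry.decode("utf-8") for entry in entries]


# Two specifiers concatenated in the order they were encountered.
print(parse_deplibs(b"foo\0bar\0"))
```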

For LTO, equivalent information to the contents of the .deplibs section can be
retrieved by LLD for bitcode input files.

LLD processes the dependent library specifiers in the following way:

1. Dependent libraries which are found from the specifiers in .deplibs sections
   of relocatable object files are added when the linker decides to include that
   file (which could itself be in a library) in the link. Dependent libraries
   behave as if they were appended to the command line after all other options. As
   a consequence the set of dependent libraries is searched last to resolve
   symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
   to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
   strings in a .deplibs section by: first, handling the string as if it was
   specified on the command line; second, by looking for the string in each of the
   library search paths in turn; third, by looking for a lib<string>.a or
   lib<string>.so (depending on the current mode of the linker) in each of the
   library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
   dependent libraries.
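
The lookup order in point 4 can be sketched in Python as follows. This is only
an illustration under stated assumptions: `find_dependent_library` and its
`static_mode` parameter are hypothetical, and the way the linker mode selects
between lib<string>.a and lib<string>.so (here: only .a when static, otherwise
.so before .a) is a guess rather than LLD's actual rule.

```python
import os


def find_dependent_library(specifier, search_paths, static_mode=False):
    """Resolve one .deplibs string to a file path (hypothetical sketch)."""
    # First: handle the string as if it was specified on the command line.
    if os.path.isfile(specifier):
        return specifier
    # Second: look for the string itself in each library search path in turn.
    for directory in search_paths:
        candidate = os.path.join(directory, specifier)
        if os.path.isfile(candidate):
            return candidate
    # Third: look for lib<string>.a or lib<string>.so in each search path.
    # Assumption: static mode considers only archives.
    extensions = [".a"] if static_mode else [".so", ".a"]
    for directory in search_paths:
        for extension in extensions:
            candidate = os.path.join(directory, "lib" + specifier + extension)
            if os.path.isfile(candidate):
                return candidate
    # Point 2: it is an error if no file can be found for a specifier.
    raise FileNotFoundError("unable to find library from dependent library "
                            "specifier: " + specifier)
```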

Rationale for the above points:

1. Adding the dependent libraries last makes the process simple to understand
   from a developer's perspective. All linkers are able to implement this scheme.
2. Reporting an error for libraries that are not found seems like better
   behavior than failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
   will affect all of the dependent libraries. There is a potential problem of
   surprise for developers, who might not realize that these options would apply
   to these "invisible" input files; however, despite the potential for surprise,
   this is easy for developers to reason about and gives developers the control
   that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
   find input files. The different search methods are tried by the linker in most
   obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
   ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
   is not necessary: if finer control is required developers can fall back to using
   the command line directly.

RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.

Differential Revision: https://reviews.llvm.org/D60274

llvm-svn: 360984
2019-05-17 03:44:15 +00:00
Fangrui Song 5387c2cd17 [llvm-objdump] Print newlines before and after "Disassembly of section ...:"
This improves readability and the behavior is consistent with GNU objdump.

The new test test/tools/llvm-objdump/X86/disassemble-section-name.s
checks we print newlines before and after "Disassembly of section ...:"

Differential Revision: https://reviews.llvm.org/D61127

llvm-svn: 359668
2019-05-01 10:40:48 +00:00
George Rimar 6da44ad75d [yaml2obj][obj2yaml] - Change how symbol's binding is described when parsing/dumping.
Currently, YAML has the following syntax for describing the symbols:

Symbols:
  Local:
    LocalSymbol1:
    ...
    LocalSymbol2:
    ...
  ...
  Global:
    GlobalSymbol1:
  ...
  Weak:
  ...
  GNUUnique:

I.e. symbols are grouped by their bindings. That is not very convenient,
because:

It does not allow setting a custom binding, which can be useful for producing
broken/special outputs for test cases. Adding a new binding would require
changing the syntax (as we observed when GNUUnique was added recently).

It does not allow changing the order of the symbols in .symtab/.dynsym,
i.e. currently all Local symbols are placed first, followed by Global, Weak and
GNUUnique, and we are not able to change that order.

It is not consistent. Binding is just one of the properties of a symbol,
and we do not group symbols by other properties.

It makes the code more complex than it needs to be. This patch shows it can be
simplified with the change performed.

The patch changes the syntax to just:

Symbols:
  Symbol1:
  ...
  Symbol2:
  ...
...

With that, we are able to work with the binding field just like with any other symbol property.
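
To illustrate the idea, here is a small Python sketch that flattens the old
grouped layout into a single list where the binding is an ordinary field. The
function name `flatten_symbols` and the key names `Name` and `Binding` are
assumptions made for this example and are not taken from the patch.

```python
def flatten_symbols(grouped):
    """Turn {binding: [symbol, ...]} into one list with a Binding field."""
    flat = []
    for binding, symbols in grouped.items():
        for symbol in symbols:
            entry = dict(symbol)         # copy so the input is left untouched
            entry["Binding"] = binding   # binding becomes a regular property
            flat.append(entry)
    return flat


old_layout = {
    "Local": [{"Name": "LocalSymbol1"}],
    "Global": [{"Name": "GlobalSymbol1"}],
}
for symbol in flatten_symbols(old_layout):
    print(symbol)
```

Once the symbols live in one flat list, their order can be chosen freely and
a custom binding is just another value of the field.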

Differential revision: https://reviews.llvm.org/D60122

llvm-svn: 357595
2019-04-03 14:53:42 +00:00
James Henderson 9bc817a0ae [yaml2obj] Allow explicit symbol indexes in relocations and emit error for bad names
Prior to this change, the "Symbol" field of a relocation would always be
assumed to be a symbol name, and if no such symbol existed, the
relocation would reference index 0. This confused me when I tried to use
a literal symbol index in the field: since "0x1" was not a known symbol
name, the symbol index was set as 0. This change falls back to treating
unknown symbol names as integers, and emits an error if the name is not
found and the string is not an integer.
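
The new lookup order can be sketched in Python as below. The helper
`resolve_relocation_symbol` is hypothetical and only mirrors the description
above; it assumes symbol names are listed in table order and that real symbols
start at index 1, since index 0 of an ELF symbol table is the null symbol.

```python
def resolve_relocation_symbol(field, symbol_names):
    """Map a relocation's Symbol field to a symbol table index (sketch)."""
    # Known names win: position in the list plus one for the null symbol.
    if field in symbol_names:
        return symbol_names.index(field) + 1
    # Otherwise fall back to treating the field as an integer (e.g. "0x1").
    try:
        return int(field, 0)
    except ValueError:
        # Neither a known name nor an integer: report an error.
        raise ValueError("unknown symbol referenced by relocation: " + repr(field))


print(resolve_relocation_symbol("bar", ["foo", "bar"]))
print(resolve_relocation_symbol("0x1", ["foo", "bar"]))
```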

Note that the Symbol field is optional, so if a relocation doesn't
reference a symbol, it shouldn't be specified. The new error required a
number of test updates.

Reviewed by: grimar, ruiu
Differential Revision: https://reviews.llvm.org/D58510

llvm-svn: 355938
2019-03-12 17:00:25 +00:00
Sunil Srivastava ae8fe4e093 Improve "llvm-nm -f sysv" output for Elf files
Specifically, compute and print the Type and Section columns.

This is a re-commit of rL354833, after fixing the Asan problem found on a buildbot.

Differential Revision: https://reviews.llvm.org/D59060

llvm-svn: 355742
2019-03-08 22:00:50 +00:00
Bob Haarman 6710cc7db5 simplify COFF module assembly test and move it to Object
Reviewers: pcc, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D57192

llvm-svn: 352142
2019-01-25 00:33:05 +00:00
Daniel Cederman d72b9fd141 Implemented sane default for llvm-objdump's relocation Value format
Summary:
"Unknown" for platforms that were not manually added into the switch
did not make sense at all. Now it prints Target + addend for all
elf-machines that were not explicitly mentioned.

Addresses PR21059 and PR25124.

Original author: fedor.sergeev

Reviewers: jyknight, espindola, fedor.sergeev

Reviewed By: jyknight

Subscribers: eraman, dcederman, jfb, dschuff, aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36464

llvm-svn: 333726
2018-06-01 05:31:58 +00:00
Rafael Espindola dc8b7a96bd Use the section name if a STT_SECTION symbol has empty name.
Without this we would have multiple relocations pointing to symbols
with the same name: the empty string. There was no way for yaml2obj to
be able to handle that.

A more general solution would be to unique symbol names in a similar
way to how we unique section names.  In practice I think this covers
all common cases and is a bit more user friendly than using names like
sym1, sym2, sym3, etc.

llvm-svn: 312603
2017-09-06 00:57:53 +00:00
Rafael Espindola 88ee57ebed obj2yaml: Print unique section names.
Without this patch passing a .o file with multiple sections with the
same name to obj2yaml produces a yaml file that yaml2obj cannot
handle. This is pr34162.

The problem is that when specifying, for example, the section of a
symbol, we get only

Section: foo

and don't know which of the sections whose name is foo we have to use.

One alternative would be to use section numbers. This would work, but
the output from obj2yaml would be very inconvenient to edit as
deleting a section would invalidate all indexes.

Another alternative would be to invent a unique section id that would
exist only on yaml. This would work, but seems a bit heavy handed. We
could make the id optional and default it to the section name.

Since in the last alternative the id is basically what this patch uses
as a name, it can be implemented as a followup patch if needed.

llvm-svn: 312585
2017-09-05 22:30:00 +00:00
Teresa Johnson a83c3f7879 [LTO] Prevent dead stripping and internalization of symbols with sections
Summary:
ELF linkers generate __start_<secname> and __stop_<secname> symbols
when there is a value in a section <secname> where the name is a valid
C identifier.  If dead stripping determines that the values declared
in section <secname> are dead, and we then internalize (and delete)
such a symbol, programs that reference the corresponding start and end
section symbols will get undefined reference linking errors.

To fix this, add the section name to the IRSymtab entry when a symbol is
defined in a specific section. Then use this in the gold-plugin to mark
the symbol as external and visible from outside the summary when the
section name is a valid C identifier.
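
A minimal Python sketch of the check described above, assuming the usual
ASCII definition of a C identifier. The names `is_valid_c_identifier` and
`must_preserve` are hypothetical and do not correspond to functions in the
gold plugin.

```python
import re

# ASCII C identifier: a letter or underscore, then letters, digits, underscores.
_C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def is_valid_c_identifier(name):
    return _C_IDENTIFIER.match(name) is not None


def must_preserve(section_name):
    """True if a linker could emit __start_/__stop_ symbols for the section."""
    return bool(section_name) and is_valid_c_identifier(section_name)


print(must_preserve("my_section"), must_preserve(".data"))
```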

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D35639

llvm-svn: 309009
2017-07-25 19:42:32 +00:00
George Rimar 892c6c86ea [YAML] - Teach yaml2obj/obj2yaml to work with numeric relocation values.
That may be useful if we want to produce or parse objects containing
broken relocation values using yaml2obj/obj2yaml.

Previously that was impossible because only enum values were parsed
correctly. This patch allows any numeric value to be used as a
relocation type.

Differential revision: https://reviews.llvm.org/D34758

llvm-svn: 306814
2017-06-30 10:31:03 +00:00
Peter Collingbourne 99b98c21f2 Object: Teach irsymtab::read() to try to use the irsymtab that we wrote to disk.
Fixes PR27551.

Differential Revision: https://reviews.llvm.org/D33974

llvm-svn: 306488
2017-06-27 23:50:24 +00:00
Peter Collingbourne 92648c25a4 Bitcode: Write the irsymtab to disk.
Differential Revision: https://reviews.llvm.org/D33973

llvm-svn: 306487
2017-06-27 23:50:11 +00:00
Teresa Johnson 41db92f9ae Add support for handling ifuncs to GlobalValue::getBaseObject
Summary:
All GlobalIndirectSymbol types (not just GlobalAlias) should return
their base object.

Without this patch LTO would warn "Unable to determine comdat of
alias!" for an ifunc.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D33202

llvm-svn: 303096
2017-05-15 18:28:29 +00:00
Rafael Espindola 04bf953de4 Add an extra test for archive symbol tables.
The table should include only defined symbols.

llvm-svn: 303075
2017-05-15 15:56:23 +00:00
Rafael Espindola b26bc7fddc Add ifunc support to ModuleSymbolTable.
Do that by creating a global_values, which is similar to
global_objects, but also iterates over aliases and ifuncs.

llvm-svn: 299018
2017-03-29 19:26:26 +00:00
Davide Italiano f7518498ff [IRObjectFile] Handle undefined weak symbols in RecordStreamer.
Differential Revision:  https://reviews.llvm.org/D24594

llvm-svn: 281629
2016-09-15 17:54:22 +00:00
Mehdi Amini f9721ba5f1 RecordStreamer: handle inline asm "lazy_reference" and mark symbols as "used"
llvm-svn: 277564
2016-08-03 03:51:42 +00:00
Chris Bieneman 8ff0c11357 [yaml2obj] Remove --format option in favor of YAML tags
Summary:
Our YAML library's handling of tags isn't perfect, but it is good enough to get rid of the need for the --format argument to yaml2obj. This patch does exactly that.

Instead of requiring --format, it infers the format based on the tags found in the object file. The supported tags are:

!ELF
!COFF
!mach-o
!fat-mach-o
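
A rough Python sketch of tag-based format detection. It assumes the tag sits
on the first meaningful line of the input (for example `--- !ELF`), which is a
simplification; `infer_format` is a hypothetical helper and not how the YAML
library is actually queried.

```python
SUPPORTED_TAGS = ("!ELF", "!COFF", "!mach-o", "!fat-mach-o")


def infer_format(yaml_text):
    """Return the format tag found on the first meaningful line (sketch)."""
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        for token in stripped.split():
            if token in SUPPORTED_TAGS:
                return token
        break  # only the first meaningful line is inspected
    raise ValueError("no supported format tag found")


print(infer_format("--- !ELF\nFileHeader:\n  Class: ELFCLASS64\n"))
```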

I have a corresponding patch that is quite large that fixes up all the in-tree test cases.

Reviewers: rafael, Bigcheese, compnerd, silvas

Subscribers: compnerd, llvm-commits

Differential Revision: http://reviews.llvm.org/D21711

llvm-svn: 273915
2016-06-27 19:53:53 +00:00
Davide Italiano ec7e29e941 [IRObjectFile] Propagate .weak attribute correctly for ASM symbols.
PR: 28256
Differential Revision:  http://reviews.llvm.org/D21616

llvm-svn: 273474
2016-06-22 20:48:15 +00:00
Davide Italiano 16bfa13a77 [IRObjectFile] Handle .weak in RecordStreamer.
Differential Revision:  http://reviews.llvm.org/D21476

llvm-svn: 273027
2016-06-17 18:20:14 +00:00
Colin LeMahieu efe3732883 Revert r265817
lld tests need to be addressed.

llvm-svn: 265822
2016-04-08 18:15:37 +00:00
Colin LeMahieu 4a1975ba8e [llvm-objdump] Printing hex instead of dec by default
Differential Revision: http://reviews.llvm.org/D18770

llvm-svn: 265817
2016-04-08 17:55:03 +00:00
Rafael Espindola 8d6fbc3a4e IRObject: Mark extern_weak as weak.
llvm-svn: 262222
2016-02-29 14:26:06 +00:00
David Blaikie 2f40830dde [opaque pointer type] Add textual IR support for explicit type parameter for global aliases
update.py:
import fileinput
import sys
import re

alias_match_prefix = r"(.*(?:=|:|^)\s*(?:external |)(?:(?:private|internal|linkonce|linkonce_odr|weak|weak_odr|common|appending|extern_weak|available_externally) )?(?:default |hidden |protected )?(?:dllimport |dllexport )?(?:unnamed_addr |)(?:thread_local(?:\([a-z]*\))? )?alias"
plain = re.compile(alias_match_prefix + r" (.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|addrspacecast|\[\[[a-zA-Z]|\{\{).*$)")
cast  = re.compile(alias_match_prefix + r") ((?:bitcast|inttoptr|addrspacecast)\s*\(.* to (.*?)(| addrspace\(\d+\) *)\*\)\s*(?:;.*)?$)")
gep   = re.compile(alias_match_prefix + r") ((?:getelementptr)\s*(?:inbounds)?\s*\((?P<type>.*), (?P=type)(?:\s*addrspace\(\d+\)\s*)?\* .*\)\s*(?:;.*)?$)")

def conv(line):
  m = re.match(cast, line)
  if m:
    return m.group(1) + " " + m.group(3) + ", " + m.group(2)
  m = re.match(gep, line)
  if m:
    return m.group(1) + " " + m.group(3) + ", " + m.group(2)
  m = re.match(plain, line)
  if m:
    return m.group(1) + ", " + m.group(2) + m.group(3) + "*" + m.group(4) + "\n"
  return line

for line in sys.stdin:
  sys.stdout.write(conv(line))

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

llvm-svn: 247378
2015-09-11 03:22:04 +00:00
Rafael Espindola be8b0ea854 Delete UnknownAddress. It is a perfectly valid symbol value.
getSymbolValue now returns a value that is convenient for most callers:
* 0 for undefined
* symbol size for common symbols
* offset/address for the rest of the symbols

Code that needs something more specific can check getSymbolFlags.
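
The rule can be modelled with a few lines of Python. This is a sketch only:
the `Symbol` dataclass and `symbol_value` function are hypothetical stand-ins
for the lib/Object interface.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    undefined: bool = False
    common: bool = False
    size: int = 0
    address: int = 0


def symbol_value(symbol):
    """Mirror the three cases listed above (illustrative model)."""
    if symbol.undefined:
        return 0               # undefined symbols report 0
    if symbol.common:
        return symbol.size     # common symbols report their size
    return symbol.address      # everything else reports offset/address


print(symbol_value(Symbol(common=True, size=16)))
```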

llvm-svn: 241605
2015-07-07 17:12:59 +00:00
Rafael Espindola d82477278b Common symbols are not undefined, at least for ObjectFile.
They are implemented like that in some object formats, but for the interface
provided by lib/Object, SF_Undefined and SF_Common are different things.

This matches the ELF and COFF implementation and fixes llvm-nm for MachO.

llvm-svn: 241587
2015-07-07 14:26:39 +00:00
Rafael Espindola 2d5d23d41d llvm-nm: treat weak undefined as undefined.
This matches the behavior of gnu ld.

llvm-svn: 241512
2015-07-06 21:36:23 +00:00
Rafael Espindola 80c3354634 Fix printing of common symbols.
Printing the symbol size matches the behavior of both gnu nm and freebsd nm.

llvm-svn: 241480
2015-07-06 18:18:44 +00:00
Rafael Espindola 60c1a8c01a llvm-nm: print 'n' instead of '?'
This matches gnu nm and has the advantage that there is an upper case N.

llvm-svn: 240655
2015-06-25 16:01:53 +00:00
Rafael Espindola d7a32ea4b8 Change how symbol sizes are handled in lib/Object.
COFF and MachO only define symbol sizes for common symbols. Reflect that
in the class hierarchy by having a method for common symbols only in the base
and a general one in ELF.

This avoids the need of using a magic value for the size, which had a few
problems
* Most callers didn't check for it.
* The ones that did could not tell the magic value from a file actually having
  that value.

llvm-svn: 240529
2015-06-24 10:20:30 +00:00
Rafael Espindola 09e5b1ca76 Move test that depends on x86 to the x86 directory.
llvm-svn: 239043
2015-06-04 15:25:47 +00:00
Jan Wen Voung ce2164f45c Fix getRelocationValueString to return the symbol name for EM_386.
Summary: This helps llvm-objdump -r to print out the symbol name along
with the relocation type on x86. Adjust existing tests from checking
for "Unknown" to check for the symbol now.

Test Plan: Adjusted test/Object tests.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5987

llvm-svn: 220866
2014-10-29 18:37:13 +00:00
Sean Silva 888320e9fa Nuke MCAnalysis.
The code is buggy and barely tested. It is also mostly boilerplate.
(This includes MCObjectDisassembler, which is the interface to that
functionality)

Following an IRC discussion with Jim Grosbach, it seems sensible to just
nuke the whole lot of functionality, and dig it up from VCS if
necessary (I hope not!).

All of this stuff appears to have been added in a huge patch dump (look
at the timeframe surrounding e.g. r182628) where almost every patch
seemed to be untested and not reviewed before being committed.
Post-review responses to the patches were never addressed. I don't think
any of it would have passed pre-commit review.

I doubt anyone is depending on this, since this code appears to be
extremely buggy. In limited testing that Michael Spencer and I did, we
couldn't find a single real-world object file that wouldn't crash the
CFG reconstruction stuff. The symbolizer stuff has O(n^2) behavior and
so is not much use to anyone anyway. It seemed simpler to remove them as
a whole. Most of this code is boilerplate, which is the only way it was
able to scrape by 60% coverage.

HEADSUP: Modules folks, some files I nuked were referenced from
include/llvm/module.modulemap; I just deleted the references. Hopefully
that is the right fix (one was a FIXME though!).

llvm-svn: 216983
2014-09-02 22:32:20 +00:00
Rafael Espindola e45c740370 Fix an off-by-one bug in the target independent llvm-objdump.
It would prevent the display of a single byte instruction before a label.

Patch by Steve King!

llvm-svn: 215837
2014-08-17 16:31:39 +00:00
Rafael Espindola 464fe024c5 Use "weak alias" instead of "alias weak"
Before this patch we had

@a = weak global ...
but
@b = alias weak ...

The patch changes aliases to look more like global variables.

Looking at some really old code suggests that the reason was that the old
bison based parser had a reduction for alias linkages and another one for
global variable linkages. Putting the alias first avoided the reduce/reduce
conflict.

The days of the old .ll parser are long gone. The new one parses just "linkage"
and a later check is responsible for deciding if a linkage is valid in a
given context.

llvm-svn: 214355
2014-07-30 22:51:54 +00:00
Kevin Enderby 8da4bd60fb Changed the llvm-nm alias "-s" for -print-armap to "-M".
This will allow the "-s" flag to be implemented in the future as it
is in darwin’s nm(1) to list symbols only in the specified section.

Given a LGTM by Shankar Easwaran who originally implemented
the support for llvm-nm’s -print-armap and archive map symbols.

llvm-svn: 212576
2014-07-08 23:47:31 +00:00
Rafael Espindola d69a347128 Move test since it now depends on the x86 backend.
llvm-svn: 212289
2014-07-03 20:26:21 +00:00
Rafael Espindola 8e8debc756 Add support for inline asm symbols in llvm-ar.
This should allow llvm-ar to be used instead of gnu ar + plugin in a LTO
build. I will add a release note about it once I finish a LTO bootstrap with it.

llvm-svn: 212287
2014-07-03 19:40:08 +00:00
Alp Toker d3d017cf00 Reduce verbiage of lit.local.cfg files
We can just split targets_to_build in one place and make it immutable.

llvm-svn: 210496
2014-06-09 22:42:55 +00:00
Simon Atanasyan d6a20e5115 [yaml2obj] Follow-up to the r208228 and r208406. Remove duplicated YAML
map keys.

llvm-svn: 208412
2014-05-09 13:57:33 +00:00
NAKAMURA Takumi f50871f460 Mark yaml2obj-elf-x86-rel.yaml as XFAIL:vg_leak for now. This has two pairs of duplicate hashes.
llvm-svn: 208406
2014-05-09 11:24:18 +00:00
Simon Atanasyan 68f6150156 [yaml2obj] Support ELF x86 relocations.
llvm-svn: 208228
2014-05-07 17:06:38 +00:00
David Majnemer 7788033be6 YAMLIO: Allow scalars to dictate quotation rules
Introduce ScalarTraits::mustQuote which determines whether or not a
StringRef needs quoting before it is acceptable to output.

llvm-svn: 205955
2014-04-10 07:37:33 +00:00
Filipe Cabecinhas 2c4e8ae0fd Revert "YAMLIO: Encode ambiguous hex strings explicitly"
This reverts commit r205839.

It broke several tests in lld.

llvm-svn: 205857
2014-04-09 14:35:17 +00:00
David Majnemer 815433587c YAMLIO: Encode ambiguous hex strings explicitly
YAMLIO would turn a BinaryRef into the string 0000000004000000.
However, the leading zero causes parsers to interpret it as being an
octal number instead of a hexadecimal one.

Instead, escape such strings as needed.

llvm-svn: 205839
2014-04-09 07:56:27 +00:00