Summary:
When a desired symbol name contains invalid character that the
system assembler could not process, we need to emit .rename
directive in assembly path in order for that desired symbol name
to appear in the symbol table.
Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L
Differential Revision: https://reviews.llvm.org/D82481
In D49466, sys::path::replace_path_prefix was used instead startswith for -f[macro/debug/file]-prefix-map options.
However those were reverted later (commit rG3bb24bf25767ef5bbcef958b484e7a06d8689204) due to broken Windows tests.
This patch restores those replace_path_prefix calls.
It also modifies the prefix matching to be case-insensitive under Windows.
Differential Revision : https://reviews.llvm.org/D76869
Ensure that symbols explicitly* assigned a section name are placed into
a section with a compatible entry size.
This is done by creating multiple sections with the same name** if
incompatible symbols are explicitly given the name of an incompatible
section, whilst:
- Avoiding using uniqued sections where possible (for readability and
to maximize compatibly with assemblers).
- Creating as few SHF_MERGE sections as possible (for efficiency).
Given that each symbol is assigned to a section in a single pass, we
must decide which section each symbol is assigned to without seeing the
properties of all symbols. A stable and easy to understand assignment is
desirable. The following rules facilitate this: The "generic" section
for a given section name will be mergeable if the name is a mergeable
"default" section name (such as .debug_str), a mergeable "implicit"
section name (such as .rodata.str2.2), or MC has already created a
mergeable "generic" section for the given section name (e.g. in response
to a section directive in inline assembly). Otherwise, the "generic"
section for a given name is non-mergeable; and, non-mergeable symbols
are assigned to the "generic" section, while mergeable symbols are
assigned to uniqued sections.
Terminology:
"default" sections are those always created by MC initially, e.g. .text
or .debug_str.
"implicit" sections are those created normally by MC in response to the
symbols that it encounters, i.e. in the absence of an explicit section
name assignment on the symbol, e.g. a function foo might be placed into
a .text.foo section.
"generic" sections are those that are referred to when a unique section
ID is not supplied, e.g. if there are multiple unique .bob sections then
".quad .bob" will reference the generic .bob section. Typically, the
generic section is just the first section of a given name to be created.
Default sections are always generic.
* Typically, section names might be explicitly assigned in source code
using a language extension e.g. a section attribute: _attribute_
((section ("section-name"))) -
https://clang.llvm.org/docs/AttributeReference.html
** I refer to such sections as unique/uniqued sections. In assembly the
", unique," assembly syntax is used to express such sections.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43457.
See https://reviews.llvm.org/D68101 for previous discussions leading to
this patch.
Some minor fixes were required to LLVM's tests, for tests had been using
the old behavior - which allowed for explicitly assigning globals with
incompatible entry sizes to a section.
This fix relies on the ",unique ," assembly feature. This feature is not
available until bintuils version 2.35
(https://sourceware.org/bugzilla/show_bug.cgi?id=25380). If the
integrated assembler is not being used then we avoid using this feature
for compatibility and instead try to place mergeable symbols into
non-mergeable sections or issue an error otherwise.
Differential Revision: https://reviews.llvm.org/D72194
I plan to use MCSection::getName() in D78138. Having the function in the base class is also convenient for debugging.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D78251
Fixed an issue exposed by D74006.
In clang cc1as, MCContext::UseNamesOnTempLabels is true.
When parsing a .fnstart directive, FnStart gets redefined to a temporary symbol of a different name (.Ltmp0, .Ltmp1, ...).
MCContext::getELFSection() called by SwitchToEHSection() will create a different .ARM.exidx each time.
llvm-mc uses `Ctx.setUseNamesOnTempLabels(false);` and FnStart is unnamed.
MCContext::getELFSection() called by SwitchToEHSection() will reuse the same .ARM.exidx .
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D75095
https://bugs.llvm.org/show_bug.cgi?id=44775
This rule has been implemented by GNU as https://sourceware.org/ml/binutils/2020-02/msg00028.html (binutils >= 2.35)
It allows us to simplify
```
.section .foo,"o",foo,unique,0
.section .foo,"o",bar,unique,1 # different section
```
to
```
.section .foo,"o",foo
.section .foo,"o",bar # different section
```
We consider the two `.foo` different even if the linked-to symbols foo and bar
are defined in the same section. This is a deliberate choice so that we don't
need to know the section where foo and bar are defined beforehand.
Differential Revision: https://reviews.llvm.org/D74006
"linked-to section" is used by the ELF spec. By analogy, "linked-to
symbol" is a good name for the signature symbol. The word "linked-to"
implies a directed edge and makes it clear its relation with "sh_link",
while one can argue that "associated" means an undirected edge.
Also, combine tests and add precise SMLoc to improve diagnostics.
Reviewed By: eugenis, grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D74082
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Summary:
We are using symbols to represent label and csect interchangeably before, and that could be a problem.
There are cases we would need to add storage mapping class to the symbol if that symbol is actually the name of a csect, but it's hard for us to figure out whether that symbol is a label or csect.
This patch intend to do the following:
1. Construct a QualName (A name include the storage mapping class)
MCSymbolXCOFF for every MCSectionXCOFF.
2. Keep a pointer to that QualName inside of MCSectionXCOFF.
3. Use that QualName whenever we need a symbol refers to that
MCSectionXCOFF.
4. Adapt the snowball effect from the above changes in
XCOFFObjectWriter.cpp.
Reviewers: xingxue, DiggerLin, sfertile, daltenty, hubert.reinterpretcast
Reviewed By: DiggerLin, daltenty
Subscribers: wuzish, nemanjai, mgorny, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69633
Adds Wrapper classes for MCSymbol and MCSection into the XCOFF target
object writer. Also adds a class to represent the top-level sections, which we
materialize in the ObjectWriter.
executePostLayoutBinding will map all csects into the appropriate
container depending on its storage mapping class, and map all symbols
into their containing csect. Once all symbols have been processed we
- Assign addresses and symbol table indices.
- Calaculte section sizes.
- Build the section header table.
- Assign the sections raw-pointer value for non-virtual sections.
Since the .bss section is virtual, writing the header table is enough to
add support. Writing of a sections raw data, or of any relocations is
not included in this patch.
Testing is done by dumping the section header table, but it needs to be
extended to include dumping the symbol table once readobj support for
dumping auxiallary entries lands.
Differential Revision: https://reviews.llvm.org/D65159
llvm-svn: 369454
Summary:
This patch keeps track of MCSymbols created for blocks that were
referenced in inline asm. It prevents creating a new symbol which
doesn't refer to the block.
Inline asm may have a reference to a label. The asm parser however
doesn't recognize it as a label and tries to create a new symbol. The
result being that instead of the original symbol (e.g. ".Ltmp0") the
parser replaces it in the inline asm with the new one (e.g. ".Ltmp00")
without updating it in the symbol table. So the machine basic block
retains the "old" symbol (".Ltmp0"), but the inline asm uses the new one
(".Ltmp00").
Reviewers: nickdesaulniers, craig.topper
Subscribers: nathanchance, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65304
llvm-svn: 368477
Stubs out a TargetLoweringObjectFileXCOFF class, implementing only
SelectSectionForGlobal for common symbols. Also adds an override of
EmitGlobalVariable in PPCAIXAsmPrinter which adds a number of defensive errors
and adds support for emitting common globals.
llvm-svn: 366727
Stubs out a number of the classes needed to produce a new object file format
(XCOFF) for the powerpc-aix target. For testing input is an empty module which
produces an object file with just a file header.
Differential Revision: https://reviews.llvm.org/D61694
llvm-svn: 365541
Summary:
(1) Function descriptor on AIX
On AIX, a called routine may have 2 distinct symbols associated with it:
* A function descriptor (Name)
* A function entry point (.Name)
The descriptor structure on AIX is the same as those in the ELF V1 ABI:
* The address of the entry point of the function.
* The TOC base address for the function.
* The environment pointer.
The descriptor symbol uses the same name as the source level function in C.
The function entry point is analogous to the symbol we would generate for a
function in a non-descriptor-based ABI, except that it is renamed by
prepending a ".".
Which symbol gets referenced depends on the context:
* Taking the address of the function references the descriptor symbol.
* Calling the function references the entry point symbol.
(2) Speaking of implementation on AIX, for direct function call target, we
create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to
replace original TargetGlobalAddress SDNode. Then down the path, we can
take advantage of this MCSymbol.
Patch by: Xiangling_L
Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara
Differential Revision: https://reviews.llvm.org/D62532
llvm-svn: 362735
This option provides only the base filename, not a full relative path.
Part of the fix for PR41839.
Differential Revision: https://reviews.llvm.org/D62071
llvm-svn: 361245
Another attempt to land the changes in debug line header to prevent duplicate
files in Dwarf 5. I rolled back my previous commit because of a mistake in
generating the object file in a test. Meanwhile, I addressed some offline
comments and changed the implementation; the largest difference is that
MCDwarfLineTableHeader does not keep DwarfVersion but gets it as a parameter. I
also merged the patch to fix two lld tests that will strt to fail into this
patch.
Original Commit:
https://reviews.llvm.org/D59515
Original Message:
Motivation: In previous dwarf versions, file name indexes started from 1, and
the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes
the primary source file to be explicitly given an entry with an index number 0.
The current implementation honors the specification by just duplicating the
main source file, once with index number 0, and later maybe with another
index number. While this is compliant with the letter of the standard, the
duplication causes problems for consumers of this information such as lldb.
(Some files are duplicated, where only some of them have a line table although
all refer to the same file)
With this change, dwarf 5 debug line section files always start from 0, and
the zeroth entry is not duplicated whenever possible. This requires different
handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns
an index zero for a file name, it signals an error in dwarf 4, but not in dwarf
5) However, I think the minor complication is worth it, because it enables all
consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the
file name list homogenously.
llvm-svn: 358732
This reverts commit rL357020.
The commit broke the test llvm/test/tools/llvm-objdump/embedded-source.test
on some builds including clang-ppc64be-linux-multistage,
clang-s390x-linux, clang-with-lto-ubuntu, clang-x64-windows-msvc,
llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast (and others).
llvm-svn: 357026
Reapply rL356941 after regenerating the object file in the failing test
llvm/test/tools/llvm-objdump/embedded-source.test from source.
Original commit message:
[llvm] Prevent duplicate files in debug line header in dwarf 5.
Motivation: In previous dwarf versions, file name indexes started from 1, and
the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes
the primary source file to be explicitly given an entry with an index number 0.
The current implementation honors the specification by just duplicating the
main source file, once with index number 0, and later maybe with another
index number. While this is compliant with the letter of the standard, the
duplication causes problems for consumers of this information such as lldb.
(Some files are duplicated, where only some of them have a line table although
all refer to the same file)
With this change, dwarf 5 debug line section files always start from 0, and
the zeroth entry is not duplicated whenever possible. This requires different
handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns
an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5)
However, I think the minor complication is worth it, because it enables all
consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the
file name list homogenously.
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D59515
llvm-svn: 357018
Summary:
Motivation: In previous dwarf versions, file name indexes started from 1, and
the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes
the primary source file to be explicitly given an entry with an index number 0.
The current implementation honors the specification by just duplicating the
main source file, once with index number 0, and later maybe with another
index number. While this is compliant with the letter of the standard, the
duplication causes problems for consumers of this information such as lldb.
(Some files are duplicated, where only some of them have a line table although
all refer to the same file)
With this change, dwarf 5 debug line section files always start from 0, and
the zeroth entry is not duplicated whenever possible. This requires different
handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns
an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5)
However, I think the minor complication is worth it, because it enables all
consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the
file name list homogenously.
Reviewers: dblaikie, probinson, aprantl, espindola
Reviewed By: probinson
Subscribers: emaste, jvesely, nhaehnle, aprantl, javed.absar, arichardson, hiraditya, MaskRay, rupprecht, jdoerfert, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D59515
llvm-svn: 356941
This patch adds an XCOFF triple object format type into LLVM.
This XCOFF triple object file type will be used later by object file and assembly generation for the AIX platform.
Differential Revision: https://reviews.llvm.org/D58930
llvm-svn: 355989
This was sometimes causing clang or llvm-mc to crash, and in other
cases could emit a bogus DWARF line-table header. I did an interim
patch in r352541; this patch should be a cleaner and more complete
fix, and retains the test.
Addresses PR40538.
Differential Revision: https://reviews.llvm.org/D58750
llvm-svn: 355226
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
The initial patch was not reviewed, and does not have any tests;
it should not have been merged.
This reverts 344395, 344390, 344387, 344385, 344381, 344376,
and 344366.
llvm-svn: 344405
BTF is the debug format for BPF, a kernel virtual machine
and widely used for tracing, networking and security, etc ([1]).
Currently only instruction streams are passed to kernel,
the kernel verifier verifies them before execution. In order to
provide better visibility of bpf programs to user space
tools, some debug information, e.g., function names and
debug line information are desirable for kernel so tools
can get such information with better annotation
for jited instructions for performance or other reasons.
The dwarf is too complicated in kernel and for BPF.
Hence, BTF is designed to be the debug format for BPF ([2]).
Right now, pahole supports BTF for types, which
are generated based on dwarf sections in the ELF file.
In order to annotate performance metrics for jited bpf insns,
it is necessary to pass debug line info to the kernel.
Furthermore, we want to pass the actual code to the
kernel because of the following reasons:
. bpf program typically is small so storage overhead
should be small.
. in bpf land, it is totally possible that
an application loads the bpf program into the
kernel and then that application quits, so
holding debug info by the user space application
is not practical.
. having source codes directly kept by kernel
would ease deployment since the original source
code does not need ship on every hosts and
kernel-devel package does not need to be
deployed even if kernel headers are used.
The only reliable time to get the source code is
during compilation time. This will result in both more
accurate information and easier deployment as
stated in the above.
Another consideration is for JIT. The project like bcc
use MCJIT to compile a C program into bpf insns and
load them to the kernel ([3]). The generated BTF sections
will be readily available for such cases as well.
This patch implemented generation of BTF info in llvm
compiler. The BTF related sections will be generated
when both -target bpf and -g are specified. Two sections
are generated:
.BTF contains all the type and string information, and
.BTF.ext contains the func_info and line_info.
The separation is related to how two sections are used
differently in bpf loader, e.g., linux libbpf ([4]).
The .BTF section can be loaded into the kernel directly
while .BTF.ext needs loader manipulation before loading
to the kernel. The format of the each section is roughly
defined in llvm:include/llvm/MC/MCBTFContext.h and
from the implementation in llvm:lib/MC/MCBTFContext.cpp.
A later example also shows the contents in each section.
The type and func_info are gathered during CodeGen/AsmPrinter
by traversing dwarf debug_info. The line_info is
gathered in MCObjectStreamer before writing to
the object file. After all the information is gathered,
the two sections are emitted in MCObjectStreamer::finishImpl.
With cmake CMAKE_BUILD_TYPE=Debug, the compiler can
dump out all the tables except insn offset, which
will be resolved later as relocation records.
The debug type "btf" is used for BTFContext dump.
Dwarf tests the debug info generation with
llvm-dwarfdump to decode the binary sections and
check whether the result is expected. Currently
we do not have such a tool yet. We will implement
btf dump functionality in bpftool ([5]) as the bpftool is
considered the recommended tool for bpf introspection.
The implementation for type and func_info is tested
with linux kernel test cases. The line_info is visually
checked with dump from linux kernel libbpf ([4]) and
checked with readelf dumping section raw data.
Note that the .BTF and .BTF.ext information will not
be emitted to assembly code and there is no assembler
support for BTF either.
In the below, with a clang/llvm built with CMAKE_BUILD_TYPE=Debug,
Each table contents are shown for a simple C program.
-bash-4.2$ cat -n test.c
1 struct A {
2 int a;
3 char b;
4 };
5
6 int test(struct A *t) {
7 return t->a;
8 }
-bash-4.2$ clang -O2 -target bpf -g -mllvm -debug-only=btf -c test.c
Type Table:
[1] FUNC name_off=1 info=0x0c000001 size/type=2
param_type=3
[2] INT name_off=12 info=0x01000000 size/type=4
desc=0x01000020
[3] PTR name_off=0 info=0x02000000 size/type=4
[4] STRUCT name_off=16 info=0x04000002 size/type=8
name_off=18 type=2 bit_offset=0
name_off=20 type=5 bit_offset=32
[5] INT name_off=22 info=0x01000000 size/type=1
desc=0x02000008
String Table:
0 :
1 : test
6 : .text
12 : int
16 : A
18 : a
20 : b
22 : char
27 : test.c
34 : int test(struct A *t) {
58 : return t->a;
FuncInfo Table:
sec_name_off=6
insn_offset=<Omitted> type_id=1
LineInfo Table:
sec_name_off=6
insn_offset=<Omitted> file_name_off=27 line_off=34 line_num=6 column_num=0
insn_offset=<Omitted> file_name_off=27 line_off=58 line_num=7 column_num=3
-bash-4.2$ readelf -S test.o
......
[12] .BTF PROGBITS 0000000000000000 0000028d
00000000000000c1 0000000000000000 0 0 1
[13] .BTF.ext PROGBITS 0000000000000000 0000034e
0000000000000050 0000000000000000 0 0 1
[14] .rel.BTF.ext REL 0000000000000000 00000648
0000000000000030 0000000000000010 16 13 8
......
-bash-4.2$
The latest linux kernel ([6]) can already support .BTF with type information.
The [7] has the reference implementation in linux kernel side
to support .BTF.ext func_info. The .BTF.ext line_info support is not
implemented yet. If you have difficulty accessing [6], you can
manually do the following to access the code:
git clone https://github.com/yonghong-song/bpf-next-linux.git
cd bpf-next-linux
git checkout btf
The change will push to linux kernel soon once this patch is landed.
References:
[1]. https://www.kernel.org/doc/Documentation/networking/filter.txt
[2]. https://lwn.net/Articles/750695/
[3]. https://github.com/iovisor/bcc
[4]. https://github.com/torvalds/linux/tree/master/tools/lib/bpf
[5]. https://github.com/torvalds/linux/tree/master/tools/bpf/bpftool
[6]. https://github.com/torvalds/linux
[7]. https://github.com/yonghong-song/bpf-next-linux/tree/btf
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Differential Revision: https://reviews.llvm.org/D52950
llvm-svn: 344366
I am experimenting with a single split dwarf (.dwo sections in .o files).
I want to make linker to ignore .dwo sections in .o, for that I am trying to add
SHF_EXCLUDE flag ("E") for them in my asm sample.
I found that currently, it is impossible to add any flag for debug sections using llvm-mc.
That happens because we have a set of predefined unique sections created early with default flags:
https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCObjectFileInfo.cpp#L391
This patch allows a user to add any flags he wants.
I had to edit TargetLoweringObjectFileImpl.cpp to set MetaData type for debug sections.
Their kind was Data by default (so they were allocatable) and so after changes introduced by
this patch the SHF_ALLOC flag was applied for them, what does not make sense for debug sections.
One of OrcJITTests tests failed because of that.
Differential revision: https://reviews.llvm.org/D51361
llvm-svn: 340904
Now that we create the label at the point of the directive, we don't
need to set the "current CV location", and then later when we emit the
next instruction, create a label for it and emit it.
DWARF still defers the labels used in .debug_loc until the next
instruction or value, for reasons unknown.
llvm-svn: 340883
AT_NAME was being emitted before the directory paths were remapped. This
ensures that all paths are remapped before anything is emitted.
An additional test case has been added.
Note that this only works if the replacement string is an absolute path.
If not, then AT_decl_file believes the new path is a relative path, and
joins that path with the compilation directory. I do not know of a good
way to resolve this.
Patch by: Siddhartha Bagaria (starsid)
Differential revision: https://reviews.llvm.org/D49169
llvm-svn: 336793
debug compilation dir when compiling assembly files with -g.
Part of PR38050.
Patch by Siddhartha Bagaria!
Differential Revision: https://reviews.llvm.org/D48988
llvm-svn: 336680
DWARF v5 explicitly represents file #0 in the line table. Prior
versions did not, so ".loc 0" is still an error in those cases.
Differential Revision: https://reviews.llvm.org/D48452
llvm-svn: 335350
These symbols only get included in the output symbols table if
they are used in a relocation.
This behaviour matches more closely the ELF object writer.
Differential Revision: https://reviews.llvm.org/D46561
llvm-svn: 332005
If no data or instructions are emitted after a location directive, we
should clear the cv_loc when we change sections, or it will be emitted
at the beginning of the next section. This violates our invariant that
all .cv_loc directives belong to the same section. Add clearer
assertions for this.
llvm-svn: 330884
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. In this extension a content type is added,
DW_LNCT_LLVM_source, which contains the embedded source code of the file.
Add new optional attribute for !DIFile IR metadata called source which contains
source text. Use this to output the source to the DWARF line table of code
objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM
to support optional source.
Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output
format of llvm-dwarfdump to make room for the new attribute on file_names
entries, and support embedded sources for the -source option in llvm-objdump.
Differential Revision: https://reviews.llvm.org/D42765
llvm-svn: 325970
Looks like these were copied from the ELF sections but
don't apply to Wasm and were not used anywhere.
Also remove unused Wasm methods in MCContext.
Differential Revision: https://reviews.llvm.org/D37633
llvm-svn: 313058
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
Make MCSectionELF::AssociatedSection be a link to a symbol, because
that's how it works in the assembly, and use it in the asm printer.
llvm-svn: 297769
This just adds the basic skeleton for supporting a new object file format.
All of the actual encoding will be implemented in followup patches.
Differential Revision: https://reviews.llvm.org/D26722
llvm-svn: 295803
Fixed test.
Summary:
Enables source location in diagnostic messages from the backend. This
is after parsing, during finalization. This requires the SourceMgr, the
inline assembly string buffer, and DiagInfo to still be alive after
EmitInlineAsm returns.
This patch creates a single SourceMgr for inline assembly inside the
AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one
SourceMgr per call to EmitInlineAsm would make it difficult for
MCContext to figure out in which SourceMgr the SMLoc is located, while a
single SourceMgr can figure it out if it has multiple buffers.
The Str argument to EmitInlineAsm is copied into a buffer and owned by
the inline asm SourceMgr. This ensures that DiagHandlers won't print
garbage. (Clang emits a "note: instantiated into assembly here", which
refers to this string.)
The AsmParser gets destroyed before finalization, which means that the
DiagHandlers the AsmParser installs into the SourceMgr will be stale.
Restore the saved DiagHandlers.
Since now we're using just one SourceMgr for multiple inline asm
strings, we need to tell the AsmParser which buffer it needs to parse
currently. Hand a buffer id -- returned from SourceMgr::
AddNewSourceBuffer -- to the AsmParser.
Reviewers: rnk, grosbach, compnerd, rengolin, rovka, anemet
Reviewed By: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29441
llvm-svn: 294458
Summary:
Enables source location in diagnostic messages from the backend. This
is after parsing, during finalization. This requires the SourceMgr, the
inline assembly string buffer, and DiagInfo to still be alive after
EmitInlineAsm returns.
This patch creates a single SourceMgr for inline assembly inside the
AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one
SourceMgr per call to EmitInlineAsm would make it difficult for
MCContext to figure out in which SourceMgr the SMLoc is located, while a
single SourceMgr can figure it out if it has multiple buffers.
The Str argument to EmitInlineAsm is copied into a buffer and owned by
the inline asm SourceMgr. This ensures that DiagHandlers won't print
garbage. (Clang emits a "note: instantiated into assembly here", which
refers to this string.)
The AsmParser gets destroyed before finalization, which means that the
DiagHandlers the AsmParser installs into the SourceMgr will be stale.
Restore the saved DiagHandlers.
Since now we're using just one SourceMgr for multiple inline asm
strings, we need to tell the AsmParser which buffer it needs to parse
currently. Hand a buffer id -- returned from SourceMgr::
AddNewSourceBuffer -- to the AsmParser.
Reviewers: rnk, grosbach, compnerd, rengolin, rovka, anemet
Reviewed By: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29441
llvm-svn: 294433
On ELF every section can have a corresponding section symbol. When in
an assembly file we have
.quad .text
the '.text' refers to that symbol.
The way we used to handle them is to leave .text an undefined symbol
until the very end when the object writer would map them to the
actual section symbol.
The problem with that is that anything before the end would see an
undefined symbol. This could result in bad diagnostics
(test/MC/AArch64/label-arithmetic-diags-elf.s), or incorrect results
when using the asm streamer (est/MC/Mips/expansion-jal-sym-pic.s).
Fixing this will also allow using the section symbol earlier for
setting sh_link of SHF_METADATA sections.
This patch includes a few hacks to avoid changing our behaviour when
handling conflicts between section symbols and other symbols. I
reported pr31850 to track that.
llvm-svn: 293936
This implements execute-only support for ARM code generation, which
prevents the compiler from generating data accesses to code sections.
The following changes are involved:
* Add the CodeGen option "-arm-execute-only" to the ARM code generator.
* Add the clang flag "-mexecute-only" as well as the GCC-compatible
alias "-mpure-code" to enable this option.
* When enabled, literal pools are replaced with MOVW/MOVT instructions,
with VMOV used in addition for floating-point literals. As the MOVT
instruction is required, execute-only support is only available in
Thumb mode for targets supporting ARMv8-M baseline or Thumb2.
* Jump tables are placed in data sections when in execute-only mode.
* The execute-only text section is assigned section ID 0, and is
marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'.
This also overrides selection of ELF sections for globals.
llvm-svn: 289784
Summary:
Changes to llvm-mc to move common logic to separate function.
Related clang patch: https://reviews.llvm.org/D26213
Reviewers: rafael, t.p.northover, colinl, echristo, rengolin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26214
llvm-svn: 288396
MCContext already has many tasks, and separating CodeView out from it is
probably a good idea. The .cv_loc tracking was modelled on the DWARF
tracking which lived directly in MCContext.
Removes the inclusion of MCCodeView.h from MCContext.h, so now there are
only 10 build actions while I hack on CodeView support instead of 265.
llvm-svn: 279847
Group" sections while lowering. In particular, for ELF sections this is
useful for creating function-specific groups that get merged into the
same named section.
Also use const Twine& instead of StringRef for the getELF functions
while we're here.
Differential Revision: http://reviews.llvm.org/D21743
llvm-svn: 274336
Now, after landing r270560, r270557, r270320 it is a proper time.
Original commit message:
[llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style.
Before this patch llvm-mc generated zlib-gnu styled sections.
That means no SHF_COMPRESSED flag was set, magic 'zlib' signature
was used in combination with full size field. Sections were renamed to "*.z*".
This patch reimplements the compression style to zlib one as zlib-gnu looks
to be depricated everywhere.
Differential revision: http://reviews.llvm.org/D20331
llvm-svn: 270569
It broke buildbot:
http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/4817/steps/ninja%20check%201/logs/stdio
Actually it is just because D20273 not yet commited, but these 2 were crossing with each other,
and I`ll better find the way to land them separatelly soon.
Initial commit message:
[llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style.
Before this patch llvm-mc generated zlib-gnu styled sections.
That means no SHF_COMPRESSED flag was set, magic 'zlib' signature
was used in combination with full size field. Sections were renamed to "*.z*".
This patch reimplements the compression style to zlib one as zlib-gnu looks
to be depricated everywhere.
Differential revision: http://reviews.llvm.org/D20331
llvm-svn: 270075
Before this patch llvm-mc generated zlib-gnu styled sections.
That means no SHF_COMPRESSED flag was set, magic 'zlib' signature
was used in combination with full size field. Sections were renamed to "*.z*".
This patch reimplements the compression style to zlib one as zlib-gnu looks
to be depricated everywhere.
Differential revision: http://reviews.llvm.org/D20331
llvm-svn: 270070
Summary:
This adds a unique ID to the COFF section uniquing map, similar to the
one we have for ELF. The unique id is not currently exposed via the
assembler because we don't have a use case for it yet. Users generally
create .pdata with the .seh_* family of directives, and the assembler
internally needs to produce .pdata and .xdata sections corresponding to
the code section.
The association between .text sections and the assembler-created .xdata
and .pdata sections is maintained as an ID field of MCSectionCOFF. The
CFI-related sections are created with the given unique ID, so if more
code is added to the same text section, we can find and reuse the CFI
sections that were already created.
Reviewers: majnemer, rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19376
llvm-svn: 268331
Removed some unused headers, replaced some headers with forward class declarations.
Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
Patch by Eugene Kosov <claprix@yandex.ru>
Differential Revision: http://reviews.llvm.org/D19219
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
This is a fix for PR26941.
When there is both a section and a global definition with the same
name, the global wins.
Section symbols are not added to the symbol table; section references
are left undefined and fixed up in the object writer unless they've
been satisfied by some other definition.
llvm-svn: 264649
MCContext shouldn't be accessing the filesystem - that's a gross
layering violation and makes it awkward to use as a library or in a
daemon where it may not even be allowed filesystem access.
The CWD lookup here is normally redundant anyway, since the calling
context either also looks up the CWD or sets this to something more
specific. Here, we fix up the one caller that doesn't already set up a
debug compilation dir and make it clear that the responsibility for
such set up is in the users of MCContext.
llvm-svn: 264109
This reverts commit r259117.
The LineInfo constructor is defined in the codeview library and we have
to link against it now. Doing that isn't trivial, so reverting for now.
llvm-svn: 259126
Adds a new family of .cv_* directives to LLVM's variant of GAS syntax:
- .cv_file: Similar to DWARF .file directives
- .cv_loc: Similar to the DWARF .loc directive, but starts with a
function id. CodeView line tables are emitted by function instead of
by compilation unit, so we needed an extra field to communicate this.
Rather than overloading the .loc direction further, we decided it was
better to have our own directive.
- .cv_stringtable: Emits the codeview string table at the current
position. Currently this just contains the filenames as
null-terminated strings.
- .cv_filechecksums: Emits the file checksum table for all files used
with .cv_file so far. There is currently no support for emitting
actual checksums, just filenames.
This moves the line table emission code down into the assembler. This
is in preparation for implementing the inlined call site line table
format. The inline line table format encoding algorithm requires knowing
the absolute code offsets, so it must run after the assembler has laid
out the code.
David Majnemer collaborated on this patch.
llvm-svn: 259117
This adds reportError to MCContext, which can be used as an alternative to
reportFatalError when the assembler wants to try to continue processing the
rest of the file after the error is reported, so that all of the errors ina
file can be reported. It records the fact that an error was encountered, so we
can avoid emitting an object file if any errors occurred.
This patch doesn't add any uses of this function (a later patch will convert
most uses of reportFatalError to use it), but there is a small functional
change: we use the SourceManager to print the error message, even if we have a
null SMLoc. This means that we get a SourceManager-style message, with the file
and line information shown as <unknown>, rather than the "LLVM ERROR" style
used by report_fatal_error.
llvm-svn: 253327
MCRelaxableFragment previously kept a copy of MCSubtargetInfo and
MCInst to enable re-encoding the MCInst later during relaxation. A copy
of MCSubtargetInfo (instead of a reference or pointer) was needed
because the feature bits could be modified by the parser.
This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment
with a constant reference to MCSubtargetInfo. The copies of
MCSubtargetInfo are kept in MCContext, and the target parsers are now
responsible for asking MCContext to provide a copy whenever the feature
bits of MCSubtargetInfo have to be toggled.
With this patch, I saw a 4% reduction in peak memory usage when I
compiled verify-uselistorder.lto.bc using llc.
rdar://problem/21736951
Differential Revision: http://reviews.llvm.org/D14346
llvm-svn: 253127
This was just forgotten when SectionSymbols was introduced and could cause
corruption if the MCContext was reused after Reset.
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13547
llvm-svn: 249854
This prevents MC clients from getting COFF.h, which conflicts with
winnt.h macros. Also a minor IWYU cleanup. Now the only public headers
including COFF.h are in Object, and they actually need it.
llvm-svn: 246784
This reverts commit r245047.
It was failing on the darwin bots. The problem was that when running
./bin/llc -march=msp430
llc gets to
if (TheTriple.getTriple().empty())
TheTriple.setTriple(sys::getDefaultTargetTriple());
Which means that we go with an arch of msp430 but a triple of
x86_64-apple-darwin14.4.0 which fails badly.
That code has to be updated to select a triple based on the value of
march, but that is not a trivial fix.
llvm-svn: 245062
Other than some places that were handling unknown as ELF, this should
have no change. The test updates are because we were detecting
arm-coff or x86_64-win64-coff as ELF targets before.
It is not clear if the enum should live on the Triple. At least now it lives
in a single location and should be easier to move somewhere else.
llvm-svn: 245047
This causes errors like:
ld: error: blah.o: requires dynamic R_X86_64_PC32 reloc against '' which
may overflow at runtime; recompile with -fPIC
blah.cc:function f(): error: undefined reference to ''
blah.o:g(): error: undefined reference to ''
I have not yet come up with an appropriate reproduction.
llvm-svn: 240394
Now that pr23900 is fixed, we can bring it back with no changes.
Original message:
Make all temporary symbols unnamed.
What this does is make all symbols that would otherwise start with a .L
(or L on MachO) unnamed.
Some of these symbols still show up in the symbol table, but we can just
make them unnamed.
In order to make sure we produce identical results when going thought assembly,
all .L (not just the compiler produced ones), are now unnamed.
Running llc on llvm-as.opt.bc, the peak memory usage goes from 208.24MB to
205.57MB.
llvm-svn: 240302
What this does is make all symbols that would otherwise start with a .L
(or L on MachO) unnamed.
Some of these symbols still show up in the symbol table, but we can just
make them unnamed.
In order to make sure we produce identical results when going thought assembly,
all .L (not just the compiler produced ones), are now unnamed.
Running llc on llvm-as.opt.bc, the peak memory usage goes from 208.24MB to
205.57MB.
llvm-svn: 240130
Directional labels can show up in symbol tables (and we have a llvm-mc test for
that). Given that, we need to make sure they are named.
With that out of the way, use setUseNamesOnTempLabels in llvm-mc so that it
too benefits from the memory saving.
llvm-svn: 239914
Similarly to User which allocates a number of Use's prior to the this pointer,
allocate space for the Name* for MCSymbol only when we need a name.
Given that an MCSymbol is 48-bytes on 64-bit systems, this saves a decent % of space.
Given the verify_uselistorder test case with debug info and llc, 50k symbols have names
out of 700k so this optimises for the common case of temporary unnamed symbols.
Reviewed by David Blaikie.
llvm-svn: 239423
Some temporary symbols are created by MC itself. These symbols are never used
for lookup and are never included in the object symbol table, so we can
avoid creating a name for them.
Other temporaries are created by CodeGen or by the user by explicitly asking
for a name starting with .L (or L on MachO).
These temporaries behave like regular symbols, we just try to avoid including
them in the object symbol table, but sometimes they end up there:
const char *foo() {
return "abc" + 3;
}
will have a relocation pointing to a .L symbol.
It just so happens that almost all MC created temporary has the AlwaysAddSuffix
option and CodeGen/user created ones don't.
One interesting future optimization would be to use unnamed symbols for
all temporaries, but that would require use an st_name of 0 or
having the object writer create the names if a symbol does end up in the
symbol table.
No testcase since this just avoid creating a few extra names for MC created
temporaries.
llvm-svn: 238887