Commit Graph

4495 Commits

Author SHA1 Message Date
Dan Gohman 698c6b0a09 [WebAssembly] Support single-floating-point immediate value
As mentioned in TODO comment, casting double to float causes NaNs to change bits.
To avoid the change, this patch adds support for single-floating-point immediate value on MachineCode.

Patch by Yuta Saito.

Differential Revision: https://reviews.llvm.org/D77384
2021-02-04 18:05:06 -08:00
Fangrui Song 3e8ab54ba0 [MC] Upgrade DWARF version to 5 upon .file 0
Without `-dwarf-version`, llvm-mc uses the default `MCContext::DwarfVersion` 4.

Without `-gdwarf-N`, Clang cc1as uses `clang::driver::ToolChain::GetDefaultDwarfVersion`
which is 4 on many toolchains. Note: `clang -c` can synthesize .debug_info without -g.

There is currently a MCParser warning upon `.file 0` and MCParser errors upon
`.loc 0` if the DWARF version is less than 5. This causes friction to the
following usage:

```
clang -S -g -gdwarf-5 a.c

// MC warning due to .file 0, MC error due to .loc 0
clang -c a.s
llvm-mc -filetype=obj a.s
```

My idea is that we can just upgrade `MCContext::DwarfVersion` to 5 upon
`.file 0` to make the above commands work.

The downside is that for an explicit version `clang -c -gdwarf-4 a.s`, it can be
argued that the new behavior drops the probably intended diagnostic. I think the
downside is small because in most cases DWARF version for an assembly action
should either match the original compile action or be omitted.

Ongoing discussion taking a similar action for GNU as: https://sourceware.org/pipermail/binutils/2021-January/114980.html

Differential Revision: https://reviews.llvm.org/D94882
2021-02-02 09:41:05 -08:00
Fangrui Song 1477ed8465 [MC] Support SHF_GNU_RETAIN as section flag 'R'
On Linux target triples, GNU as sets EI_OSABI to ELFOSABI_GNU when SHF_GNU_RETAIN is used。
On `*-*-freebsd`, it usually sets EI_OSABI to ELFOSABI_FREEBSD.

GNU ld respects SHF_GNU_RETAIN only for ELFOSABI_FREEBSD/ELFOSABI_GNU.
https://sourceware.org/bugzilla/show_bug.cgi?id=27282

MC doesn't set ELFOSABI_GNU for SHF_GNU_RETAIN/STB_GNU_UNIQUE/STT_GNU_IFUNC.
MC assembled object files do not have special semantics in GNU ld.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D95730
2021-02-02 09:34:09 -08:00
Wouter van Oortmerssen 0d9b17d0ef [WebAssembly] fixed wasm64 data segment init exp not 64-bit
As defined in the spec:
https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md

Differential Revision: https://reviews.llvm.org/D95651
2021-02-01 11:32:50 -08:00
Kazu Hirata 1a2d67fa23 [llvm] Use llvm::lower_bound and llvm::upper_bound (NFC) 2021-01-29 23:23:36 -08:00
Tobias Burnus 70ea15b889 [MC][ELF] Fix accepting abbreviated form with sh_flags and sh_entsize
Followup to D92052 as I missed an issue as shown via GCC bug https://gcc.gnu.org/PR97827, namely: (e.g.) ".rodata." implies ELF::SHF_ALLOC.

Crossref:

- D73999 / commit 75af9da755
  added for LLVM 11 a check that sh_flags and sh_entsize (and sh_type)
  changes are an error, in line with GNU assembler.

-  D92052 / commit 1deff4009e
   permitted the abbreviated form which many assemblers accept and
   GCC generates: while the first .section contains the flags and entsize,
   subsequent sections simply contain the name without repeating entsize or
   flags.

However, the latter patch missed in the check that some flags are automatically set, e.g. '.rodata." implies ELF::SHF_ALLOC.

Related https://bugs.llvm.org/show_bug.cgi?id=48201

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D94072
2021-01-28 14:54:43 +00:00
Sam Clegg d705c2fbd4 Revert "[WebAssembly] MC layer writes table symbols to object files"
This reverts commit d806618636.
Review: https://reviews.llvm.org/D92215

We had issues where older versions of wasm-ld were crashing on object
files containing a table symbol.  We decided that the best strategy
going forward is to only generate these symbol if refernece types is
enabled.  Without reference types enabled we should never geneate a
table symbol or a TABLE_NUMBER relocation.

This revert is in addition to the one already reverted in
https://reviews.llvm.org/D95005.

The plan is to re-land these in updated form after the llvm 12 branch.

Differential Revision: https://reviews.llvm.org/D95420
2021-01-25 22:32:36 -08:00
Kazu Hirata 5f843b2dd2 [llvm] Use isAlpha/isAlnum (NFC) 2021-01-22 23:25:03 -08:00
Kazu Hirata cfa241680f [llvm] Don't include StringSwitch.h where unnecessary (NFC) 2021-01-21 19:59:48 -08:00
Mikael Holmen 2b4716d6df [MC] Use std::make_tuple to make some toolchains happy again
My toolchain (LLVM 8.0, libstdc++ 5.4.0) complained with:

12:27:43 ../lib/MC/MCDwarf.cpp:814:10: error: chosen constructor is explicit in copy-initialization
12:27:43   return {Offset, Size, SetDelta};
12:27:43          ^~~~~~~~~~~~~~~~~~~~~~~~
12:27:43 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
12:27:43         constexpr tuple(_UElements&&... __elements)
12:27:43                   ^
12:27:43 1 error generated.

This commit adds explicit calls to std::make_tuple to work around
the problem.
2021-01-21 14:05:14 +01:00
Fangrui Song 71635ea5ff MCDwarf: Delete uneeded parameter
And change signature
2021-01-21 00:55:07 -08:00
Sam Clegg 96ef4f307d Revert "[WebAssembly] call_indirect issues table number relocs"
This reverts commit 418df4a6ab.

This change broke emscripten tests, I believe because it started
generating 5-byte a wide table index in the call_indirect instruction.
Neither v8 nor wabt seem to be able to handle that.  The spec
currently says that this is single 0x0 byte and:

"In future versions of WebAssembly, the zero byte occurring in the
encoding of the call_indirectcall_indirect instruction may be used to
index additional tables."

So we need to revisit this change.  For backwards compat I guess
we need to guarantee that __indirect_function_table is always at
address zero.   We could also consider making this a single-byte
relocation with and assert if have more than 127 tables (for now).

Differential Revision: https://reviews.llvm.org/D95005
2021-01-19 15:06:07 -08:00
Andy Wingo 831a143e50 [WebAssembly] Change prefix on data segment flags to WASM_DATA_SEGMENT
Element sections will also need flags, so we shouldn't squat the
WASM_SEGMENT namespace.

Depends on D90948.

Differential Revision: https://reviews.llvm.org/D92315
2021-01-19 09:40:42 +01:00
Andy Wingo 418df4a6ab [WebAssembly] call_indirect issues table number relocs
This patch changes to make call_indirect explicitly refer to the
corresponding function table, residualizing TABLE_NUMBER relocs against
it.

With this change, wasm-ld now sees all references to tables, and can
link multiple tables.

Differential Revision: https://reviews.llvm.org/D90948
2021-01-19 09:32:45 +01:00
Yang Fan 9a0900dc4c
[NFC][AIX][XCOFF] Fix compile warning on strncpy
GCC warning:
```
In file included from /usr/include/string.h:495,
                 from /usr/include/c++/9/cstring:42,
                 from /llvm-project/llvm/include/llvm/ADT/Hashing.h:53,
                 from /llvm-project/llvm/include/llvm/ADT/ArrayRef.h:12,
                 from /llvm-project/llvm/include/llvm/MC/MCAsmBackend.h:12,
                 from /llvm-project/llvm/lib/MC/XCOFFObjectWriter.cpp:14:
In function ‘char* strncpy(char*, const char*, size_t)’,
    inlined from ‘{anonymous}::Section::Section(const char*, llvm::XCOFF::SectionTypeFlags, bool, {anonymous}::CsectGroups)’ at /llvm-project/llvm/lib/MC/XCOFFObjectWriter.cpp:146:12:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:106:34: warning: ‘char* __builtin_strncpy(char*, const char*, long unsigned int)’ specified bound 8 equals destination size [-Wstringop-truncation]
  106 |   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
      |          ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D94872
2021-01-19 14:07:11 +08:00
Andy Wingo d806618636 [WebAssembly] MC layer writes table symbols to object files
Now that the linker handles table symbols, we can allow the frontend to
produce them.

Depends on D91870.

Differential Revision: https://reviews.llvm.org/D92215
2021-01-18 16:57:18 +01:00
Derek Schuff e65b9b04cd Revert "[WebAssembly] MC layer writes table symbols to object files"
This reverts commit e9f1ed2306.

Reverting because it depends on 38dfce706f
2021-01-15 15:50:22 -08:00
Andy Wingo e9f1ed2306 [WebAssembly] MC layer writes table symbols to object files
Now that the linker handles table symbols, we can allow the frontend to
produce them.

Depends on D91870.

Differential Revision: https://reviews.llvm.org/D92215
2021-01-15 14:55:55 +01:00
Jian Cai 9dfeec8530 Reland "[AsmParser] make .ascii support spaces as separators"
This relands commit e0963ae274, which was
reverted on commit 82c4153e66 due to a
test failure, which turned out to be a false positive.
2021-01-14 17:51:47 -08:00
Jian Cai 82c4153e66 Revert "[AsmParser] make .ascii support spaces as separators"
This reverts commit e0963ae274. The change
breaks some GDB tests. Revert it while we investigate.
2021-01-13 14:38:22 -08:00
Kazu Hirata 89e8eb946d [llvm] Use llvm::find_if (NFC) 2021-01-11 18:48:06 -08:00
Kazu Hirata 1d0bc05551 [llvm] Use llvm::append_range (NFC) 2021-01-06 18:27:33 -08:00
Sam Clegg 30d314aae1 [MC][WebAssembly] Avoid recalculating indexes in -gsplit-dwarf mode
Be consistent about asserting before setting WasmIndices.  Adding
these assertions revealed that we were duplicating a lot of work
and setting these indexed twice when running in DWO mode.

Differential Revision: https://reviews.llvm.org/D93650
2021-01-06 01:35:06 -08:00
Andy Wingo 9ad83fd6dc [WebAssembly] call_indirect causes indirect function table import
For wasm-ld table linking work to proceed, object files should indicate
if they use an indirect function table.  In the future this will be done
by the usual symbols and relocations mechanism, but until that support
lands in the linker, the presence of an `__indirect_function_table` in
the object file's import section shows that the object file needs an
indirect function table.

Prior to https://reviews.llvm.org/D91637, this condition was met by all
object files residualizing an `__indirect_function_table` import.

Since https://reviews.llvm.org/D91637, the intention has been that only
those object files needing an indirect function table would have the
`__indirect_function_table` import.  However, we missed the case of
object files which use the table via `call_indirect` but which
themselves do not declare any indirect functions.

This changeset makes it so that when we lower a call to `call_indirect`,
that we ensure that a `__indirect_function_table` symbol is present and
that it will be propagated to the linker.

A followup patch will revise this mechanism to make an explicit link
between `call_indirect` and its associated indirect function table; see
https://reviews.llvm.org/D90948.

Differential Revision: https://reviews.llvm.org/D92840
2021-01-05 11:09:24 +01:00
Fangrui Song d9a0c40bce [MC] Split MCContext::createTempSymbol, default AlwaysAddSuffix to true, and add comments
CanBeUnnamed is rarely false. Splitting to a createNamedTempSymbol makes the
intention clearer and matches the direction of reverted r240130 (to drop the
unneeded parameters).

No behavior change.
2020-12-21 14:04:13 -08:00
Fangrui Song 8ffda237a6 MCContext::reportError: don't call report_fatal_error
Errors from MCAssembler, MCObjectStreamer and *ObjectWriter typically cause a crash:

```
% cat c.c
int bar;
extern int foo __attribute__((alias("bar")));
% clang -c -fcommon c.c
fatal error: error in backend: Common symbol 'bar' cannot be used in assignment expr
PLEASE submit a bug report to ...
Stack dump:
...
```

`LLVMTargetMachine::addPassesToEmitFile` constructs `MachineModuleInfoWrapperPass`
which creates a MCContext without SourceMgr. `MCContext::reportError` calls
`report_fatal_error` which gets captured by Clang `LLVMErrorHandler` and gets translated
to the output above.

Since `MCContext::reportError` errors indicate user errors, such a crashing style error
is inappropriate. So this patch changes `report_fatal_error` to `SourceMgr().PrintMessage`.
```
% clang -c -fcommon c.c
<unknown>:0: error: Common symbol 'bar' cannot be used in assignment expr
```

Ideally we should at least recover the original filename (the line information
is generally lost).  That requires general improvement to MC diagnostics,
because currently in many cases SMLoc information is lost.
2020-12-20 23:23:12 -08:00
Jian Cai e0963ae274 [AsmParser] make .ascii support spaces as separators
Currently the integrated assembler only allows commas as the separator
between string arguments in .ascii. This patch adds support to using
space as separators and make IAS consistent with GNU assembler.

Link: https://github.com/ClangBuiltLinux/linux/issues/1196

Reviewed By: nickdesaulniers, jrtc27

Differential Revision: https://reviews.llvm.org/D91460
2020-12-20 22:41:00 -08:00
Fangrui Song 7b9890e17e [MC][ELF] Remove unneeded MCSymbol::setExternal calls
ELF code uses symbol bindings and does not call isExternal().
2020-12-20 21:26:36 -08:00
Fangrui Song e4c360a897 [MC][ELF] Drop MCSymbol::isExternal call sites
ELF uses symbol bindings and MCSymbol::isExternal is not really useful.
The function is no longer used in ELF code now.
2020-12-20 21:18:22 -08:00
Fangrui Song 553d4d08d2 [MC] Report locations for .symver errors 2020-12-20 21:04:12 -08:00
Fangrui Song 72e75ca343 [MC][ELF] Allow STT_SECTION referencing SHF_MERGE on REL targets
This relands D64327 with a more specific workaround for R_386_GOTOFF
(gold<2.34 bug https://sourceware.org/bugzilla/show_bug.cgi?id=16794)

.debug_info has quite a few .debug_str relocations (R_386_32/R_ARM_ABS32).
The original workaround was too general and introduced too many .L symbols
used just as relocation targets.

From the original review:

  ... it reduced the size of a big ARM-32 debug image by 33%. It contained ~68M
  of relocations symbols out of total ~71M symbols (96% of symbols table was
  generated for relocations with symbol).
2020-12-20 18:37:14 -08:00
Fangrui Song 01d1de8196 [MC] Reject byte alignment if larger than or equal to 2**32
This is consistent with the resolution to power-of-2 alignments.
Otherwise, emitCodeAlignment and emitValueToAlignment cannot handle alignments
larger than 2**32 and will trigger assertion failure (PR35218).

Note: GNU as as of 2.35 will use 1 for such a large byte `.align`
2020-12-20 14:17:00 -08:00
Tobias Burnus 1deff4009e [MC][ELF] Accept abbreviated form with sh_flags and sh_entsize
D73999 / commit 75af9da755
added for LLVM 11 a check that sh_flags and sh_entsize (and sh_type)
changes are an error, in line with GNU assembler.

However, GNU assembler accepts and GCC generates an abbreviated form:
while the first .section contains the flags and entsize, subsequent
sections simply contain the name without repeating entsize or flags.

Do likewise for better compatibility.

See https://bugs.llvm.org/show_bug.cgi?id=48201

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D92052
2020-12-11 16:45:45 +00:00
Hongtao Yu 705a4c149d [CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s

Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections.  The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. 

**ELF object emission**

The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission.

Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication.  A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.

The format of `.pseudo_probe_desc` section looks like:

```
.section   .pseudo_probe_desc,"",@progbits
.quad   6309742469962978389  // Func GUID
.quad   4294967295           // Func Hash
.byte   9                    // Length of func name
.ascii  "_Z5funcAi"          // Func name
.quad   7102633082150537521
.quad   138828622701
.byte   12
.ascii  "_Z8funcLeafi"
.quad   446061515086924981
.quad   4294967295
.byte   9
.ascii  "_Z5funcBi"
.quad   -2016976694713209516
.quad   72617220756
.byte   7
.ascii  "_Z3fibi"
```

For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :

```
FUNCTION BODY (one for each outlined function present in the text section)
    GUID (uint64)
        GUID of the function
    NPROBES (ULEB128)
        Number of probes originating from this function.
    NUM_INLINED_FUNCTIONS (ULEB128)
        Number of callees inlined into this function, aka number of
        first-level inlinees
    PROBE RECORDS
        A list of NPROBES entries. Each entry contains:
          INDEX (ULEB128)
          TYPE (uint4)
            0 - block probe, 1 - indirect call, 2 - direct call
          ATTRIBUTE (uint3)
            reserved
          ADDRESS_TYPE (uint1)
            0 - code address, 1 - address delta
          CODE_ADDRESS (uint64 or ULEB128)
            code address or address delta, depending on ADDRESS_TYPE
    INLINED FUNCTION RECORDS
        A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
        callees.  Each record contains:
          INLINE SITE
            GUID of the inlinee (uint64)
            ID of the callsite probe (ULEB128)
          FUNCTION BODY
            A FUNCTION BODY entry describing the inlined function.
```

To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index.

**Assembling**

Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis.

A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file.

A example assembly looks like:

```
foo2: # @foo2
# %bb.0: # %bb0
pushq %rax
testl %edi, %edi
.pseudoprobe 837061429793323041 1 0 0
je .LBB1_1
# %bb.2: # %bb2
.pseudoprobe 837061429793323041 6 2 0
callq foo
.pseudoprobe 837061429793323041 3 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
.LBB1_1: # %bb1
.pseudoprobe 837061429793323041 5 1 0
callq *%rsi
.pseudoprobe 837061429793323041 2 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
# -- End function
.section .pseudo_probe_desc,"",@progbits
.quad 6699318081062747564
.quad 72617220756
.byte 3
.ascii "foo"
.quad 837061429793323041
.quad 281547593931412
.byte 4
.ascii "foo2"
```

With inlining turned on, the assembly may look different around %bb2 with an inlined probe:

```
# %bb.2:                                # %bb2
.pseudoprobe    837061429793323041 3 0
.pseudoprobe    6699318081062747564 1 0 @ 837061429793323041:6
.pseudoprobe    837061429793323041 4 0
popq    %rax
retq
```

**Disassembling**

We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file.

An example disassembly looks like:

```
00000000002011a0 <foo2>:
  2011a0: 50                    push   rax
  2011a1: 85 ff                 test   edi,edi
  [Probe]:  FUNC: foo2  Index: 1  Type: Block
  2011a3: 74 02                 je     2011a7 <foo2+0x7>
  [Probe]:  FUNC: foo2  Index: 3  Type: Block
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  [Probe]:  FUNC: foo   Index: 1  Type: Block  Inlined: @ foo2:6
  2011a5: 58                    pop    rax
  2011a6: c3                    ret
  [Probe]:  FUNC: foo2  Index: 2  Type: Block
  2011a7: bf 01 00 00 00        mov    edi,0x1
  [Probe]:  FUNC: foo2  Index: 5  Type: IndirectCall
  2011ac: ff d6                 call   rsi
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  2011ae: 58                    pop    rax
  2011af: c3                    ret
```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D91878
2020-12-10 17:29:28 -08:00
Derek Schuff 8d396acac3 [WebAssembly] Support COMDAT sections in assembly syntax
This CL changes the asm syntax for section flags, making them more like ELF
(previously "passive" was the only option). Now we also allow "G" to designate
COMDAT group sections. In these sections we set the appropriate comdat flag on
function symbols, and also avoid auto-creating a new section for them.

This also adds asm-based tests for the changes D92691 to go along with
the direct-to-object tests.

Differential Revision: https://reviews.llvm.org/D92952
This is a reland of rG4564553b8d8a with a fix to the lit pipeline in
llvm/test/MC/WebAssembly/comdat.ll
2020-12-10 16:43:59 -08:00
Derek Schuff dd1aa4fdd8 Revert "[WebAssembly] Support COMDAT sections in assembly syntax"
This reverts commit 4564553b8d.
It broke several buildbots.
2020-12-10 15:55:33 -08:00
Mitch Phillips 7ead5f5aa3 Revert "[CSSPGO] Pseudo probe encoding and emission."
This reverts commit b035513c06.

Reason: Broke the ASan buildbots:
  http://lab.llvm.org:8011/#/builders/5/builds/2269
2020-12-10 15:53:39 -08:00
Mitch Phillips b955eb688d Revert "[NFC] Fix a gcc build break by using an explict constructor."
This reverts commit 248b279cf0.

Reason: Dependency of patch that broke the ASan buildbots:
  http://lab.llvm.org:8011/#/builders/5/builds/2269
2020-12-10 15:53:38 -08:00
Derek Schuff 4564553b8d [WebAssembly] Support COMDAT sections in assembly syntax
This CL changes the asm syntax for section flags, making them more like ELF
(previously "passive" was the only option). Now we also allow "G" to designate
COMDAT group sections. In these sections we set the appropriate comdat flag on
function symbols, and also avoid auto-creating a new section for them.

This also adds asm-based tests for the changes D92691 to go along with
the direct-to-object tests.

Differential Revision: https://reviews.llvm.org/D92952
2020-12-10 14:46:24 -08:00
Hongtao Yu 248b279cf0 [NFC] Fix a gcc build break by using an explict constructor. 2020-12-10 11:21:40 -08:00
Hongtao Yu b035513c06 [CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s

Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections.  The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. 

**ELF object emission**

The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission.

Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication.  A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.

The format of `.pseudo_probe_desc` section looks like:

```
.section   .pseudo_probe_desc,"",@progbits
.quad   6309742469962978389  // Func GUID
.quad   4294967295           // Func Hash
.byte   9                    // Length of func name
.ascii  "_Z5funcAi"          // Func name
.quad   7102633082150537521
.quad   138828622701
.byte   12
.ascii  "_Z8funcLeafi"
.quad   446061515086924981
.quad   4294967295
.byte   9
.ascii  "_Z5funcBi"
.quad   -2016976694713209516
.quad   72617220756
.byte   7
.ascii  "_Z3fibi"
```

For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :

```
FUNCTION BODY (one for each outlined function present in the text section)
    GUID (uint64)
        GUID of the function
    NPROBES (ULEB128)
        Number of probes originating from this function.
    NUM_INLINED_FUNCTIONS (ULEB128)
        Number of callees inlined into this function, aka number of
        first-level inlinees
    PROBE RECORDS
        A list of NPROBES entries. Each entry contains:
          INDEX (ULEB128)
          TYPE (uint4)
            0 - block probe, 1 - indirect call, 2 - direct call
          ATTRIBUTE (uint3)
            reserved
          ADDRESS_TYPE (uint1)
            0 - code address, 1 - address delta
          CODE_ADDRESS (uint64 or ULEB128)
            code address or address delta, depending on ADDRESS_TYPE
    INLINED FUNCTION RECORDS
        A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
        callees.  Each record contains:
          INLINE SITE
            GUID of the inlinee (uint64)
            ID of the callsite probe (ULEB128)
          FUNCTION BODY
            A FUNCTION BODY entry describing the inlined function.
```

To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index.

**Assembling**

Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis.

A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file.

A example assembly looks like:

```
foo2: # @foo2
# %bb.0: # %bb0
pushq %rax
testl %edi, %edi
.pseudoprobe 837061429793323041 1 0 0
je .LBB1_1
# %bb.2: # %bb2
.pseudoprobe 837061429793323041 6 2 0
callq foo
.pseudoprobe 837061429793323041 3 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
.LBB1_1: # %bb1
.pseudoprobe 837061429793323041 5 1 0
callq *%rsi
.pseudoprobe 837061429793323041 2 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
# -- End function
.section .pseudo_probe_desc,"",@progbits
.quad 6699318081062747564
.quad 72617220756
.byte 3
.ascii "foo"
.quad 837061429793323041
.quad 281547593931412
.byte 4
.ascii "foo2"
```

With inlining turned on, the assembly may look different around %bb2 with an inlined probe:

```
# %bb.2:                                # %bb2
.pseudoprobe    837061429793323041 3 0
.pseudoprobe    6699318081062747564 1 0 @ 837061429793323041:6
.pseudoprobe    837061429793323041 4 0
popq    %rax
retq
```

**Disassembling**

We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file.

An example disassembly looks like:

```
00000000002011a0 <foo2>:
  2011a0: 50                    push   rax
  2011a1: 85 ff                 test   edi,edi
  [Probe]:  FUNC: foo2  Index: 1  Type: Block
  2011a3: 74 02                 je     2011a7 <foo2+0x7>
  [Probe]:  FUNC: foo2  Index: 3  Type: Block
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  [Probe]:  FUNC: foo   Index: 1  Type: Block  Inlined: @ foo2:6
  2011a5: 58                    pop    rax
  2011a6: c3                    ret
  [Probe]:  FUNC: foo2  Index: 2  Type: Block
  2011a7: bf 01 00 00 00        mov    edi,0x1
  [Probe]:  FUNC: foo2  Index: 5  Type: IndirectCall
  2011ac: ff d6                 call   rsi
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  2011ae: 58                    pop    rax
  2011af: c3                    ret
```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D91878
2020-12-10 09:50:08 -08:00
Scott Linder 19c56e11fa [MC] Fix ICE with non-newline terminated input
There is an explicit option for the lexer to support this, but we crash
when `-preserve-comments` is enabled because it checks for
`getTok().getString().empty()` to detect the case. This doesn't
work currently because the lexer reports this case as a string of length
1, containing a null byte.

Change the lexer to instead report this case via an empty string, as the
null terminator isn't logically a part of the textual input, and the
check for `.empty()` seems natural and obvious in the calling code.

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D92681
2020-12-09 23:39:32 +00:00
Amy Huang 399bc48ecc [CodeView] Fix inline sites that are missing code offsets.
When an inline site has a starting code offset of 0, we sometimes
don't emit the starting offset.

Bug: https://bugs.llvm.org/show_bug.cgi?id=48377

Differential Revision: https://reviews.llvm.org/D92590
2020-12-07 13:01:53 -08:00
Derek Schuff 0a391060f1 [WebAssembly] Add Object and ObjectWriter support for wasm COMDAT sections
Allow sections to be placed into COMDAT groups, in addtion to functions and data
segments.

Also make section symbols unnamed, which allows sections with identical names
(section names are independent of their section symbols, but previously we
gave the symbols the same name as their sections, which results in collisions
when sections are identically-named).

Differential Revision: https://reviews.llvm.org/D92691
2020-12-07 12:12:44 -08:00
Andy Wingo d823cc7cad [WebAssembly][MC] Fix placement of table section
The table section goes after functions.

Differential Revision: https://reviews.llvm.org/D92323
2020-12-07 16:17:32 +01:00
Fangrui Song 9c53b2adc8 [MC] Delete unused declarations
Notes:

* llvm::createAsmStreamer: it has been moved to TargetRegistry.h
* (anon ns)::WasmObjectWriter::updateCustomSectionRelocations: remnant of D46335
* COFFAsmParser::ParseSEHRegisterNumber: remnant of D66625
* llvm::CodeViewContext::isValidCVFileNumber: accidentally added by r279847
2020-12-06 15:36:39 -08:00
Scott Linder d55d6806ad [MC] Consume EndOfStatement in .cfi_{sections,endproc}
Previously these directives were always interpreted as having an extra
blank line after them.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92612
2020-12-04 22:30:29 +00:00
jasonliu 2c63e7604c [XCOFF][AIX] Alternative path in EHStreamer for platforms do not have uleb128 support
Summary:
Not all system assembler supports `.uleb128 label2 - label1` form.
When the target do not support this form, we have to take
alternative manual calculation to get the offsets from them.

Reviewed By: hubert.reinterpretcast

Diffierential Revision: https://reviews.llvm.org/D92058
2020-12-02 20:03:15 +00:00
jasonliu a65d8c5d72 [XCOFF][AIX] Generate LSDA data and compact unwind section on AIX
Summary:
AIX uses the existing EH infrastructure in clang and llvm.
The major differences would be
1. AIX do not have CFI instructions.
2. AIX uses a new personality routine, named __xlcxx_personality_v1.
   It doesn't use the GCC personality rountine, because the
   interoperability is not there yet on AIX.
3. AIX do not use eh_frame sections. Instead, it would use a eh_info
section (compat unwind section) to store the information about
personality routine and LSDA data address.

Reviewed By: daltenty, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D91455
2020-12-02 18:42:44 +00:00
Eric Astor c64037b784 [ms] [llvm-ml] Support command-line defines
Enable command-line defines as textmacros

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D90059
2020-12-01 18:06:05 -05:00