Commit Graph

5281 Commits

Author SHA1 Message Date
Roman Lebedev 93f2642871
[X86] AMD Zen 3: same-reg AVX YMM VPSUB{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:01 +03:00
Roman Lebedev 7a45b96e04
[X86] AMD Zen 3: same-reg AVX XMM VPSUB{B,W,D,Q} is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:01 +03:00
Roman Lebedev 1ea8be214f
[X86] AMD Zen 3: same-reg SSE XMM PSUB{B,W,D,Q} is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev bbd2117c34
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPSUB{B,W,D,Q} tests 2021-05-14 20:23:00 +03:00
Roman Lebedev d08909d1cb
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPSUB{B,W,D,Q} tests 2021-05-14 20:23:00 +03:00
Roman Lebedev a6f5351443
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PSUB{B,W,D,Q} tests 2021-05-14 20:23:00 +03:00
Roman Lebedev ce22f53916
[X86] AMD Zen 3: same-reg AVX YMM VPANDN is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev 44c2b4fe91
[X86] AMD Zen 3: same-reg AVX XMM VPANDN is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:23:00 +03:00
Roman Lebedev a72cacb53f
[X86] AMD Zen 3: same-reg SSE XMM PANDN is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:22:59 +03:00
Roman Lebedev 9acc589e5a
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPANDN tests 2021-05-14 20:22:59 +03:00
Roman Lebedev a3617138c2
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPANDN tests 2021-05-14 20:22:59 +03:00
Roman Lebedev 3f235a0b84
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PANDN tests 2021-05-14 20:22:59 +03:00
Roman Lebedev 1d73c2b8cf
[X86] AMD Zen 3: same-reg AVX YMM VPXOR is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:22:59 +03:00
Roman Lebedev 31669b5073
[X86] AMD Zen 3: same-reg AVX XMM VPXOR is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 20:22:58 +03:00
Roman Lebedev 498bf365f4
[X86] AMD Zen 3: same-reg SSE XMM PXOR is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 20:22:58 +03:00
Roman Lebedev 3009f8a383
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VPXOR tests 2021-05-14 20:22:58 +03:00
Roman Lebedev d58d020b6c
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VPXOR tests 2021-05-14 20:22:58 +03:00
Roman Lebedev 0f7a595095
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM PXOR tests 2021-05-14 20:22:58 +03:00
Roman Lebedev 4af4afe014
[X86] AMD Zen 3: same-reg AVX YMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev 17f99a8a41
[X86] AMD Zen 3: same-reg AVX XMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev 38ceb46fb0
[X86] AMD Zen 3: same-reg SSE XMM ANDNPD is a 1-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev 3221e06e9b
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPD tests 2021-05-14 14:06:24 +03:00
Roman Lebedev 0b7e52e725
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPD tests 2021-05-14 14:06:24 +03:00
Roman Lebedev 055fa84cd8
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPD tests 2021-05-14 14:06:24 +03:00
Roman Lebedev d8a595b81c
[X86] AMD Zen 3: same-reg AVX YMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:24 +03:00
Roman Lebedev fd4cbc822b
[X86] AMD Zen 3: same-reg AVX XMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 14:06:23 +03:00
Roman Lebedev f38dcbecb6
[X86] AMD Zen 3: same-reg SSE XMM ANDNPS is a 1-cycle(!) dep-breaking zero-idiom
Same as SSE XMM XORPS/XORPD, it is not zero-cycle, even though it breaks the deps.
As confirmed by the exegesis measurements, and ref docs.
2021-05-14 14:06:23 +03:00
Roman Lebedev c79c7bb980
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPS tests 2021-05-14 14:06:23 +03:00
Roman Lebedev a57006d627
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPS tests 2021-05-14 14:06:23 +03:00
Roman Lebedev a657808948
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPS tests 2021-05-14 14:06:23 +03:00
Roman Lebedev 43a7f130a7
[X86] AMD Zen 3: same-reg AVX YMM VXORPD is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 11:56:07 +03:00
Roman Lebedev 336b9dbe88
[X86] AMD Zen 3: same-reg AVX XMM VXORPD is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis measurements, and ref docs.
2021-05-14 11:56:07 +03:00
Roman Lebedev 9c596bc541
[X86] AMD Zen 3: same-reg SSE XMM XORPD is a 1-cycle(!) dep-breaking zero-idiom
Same as with it's float friend, unlike their AVX versions.
As confirmed by exegesis, and ref docs.
2021-05-14 11:56:07 +03:00
Roman Lebedev 3567c7eda1
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VXORPD tests 2021-05-14 11:56:07 +03:00
Roman Lebedev 57eee56d0a
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VXORPD tests 2021-05-14 11:56:06 +03:00
Roman Lebedev fdc65e46b6
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM XORPD tests 2021-05-14 11:56:06 +03:00
Roman Lebedev 59554c01ab
[X86] AMD Zen 3: same-reg AVX YMM VXORPS is a zero-cycle(!) dep-breaking zero-idiom
As confirmed by exegesis, and ref docs.
2021-05-14 11:56:06 +03:00
Roman Lebedev 2a7c52ff7f
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VXORPS tests 2021-05-14 11:56:06 +03:00
Roman Lebedev 26c1bffe67
[X86] AMD Zen 3: same-reg AVX XMM VXORPS is a zero-cycle(!) dep-breaking zero-idiom
Unlike it's legacy SSE XMM XORPS version, which measures as being 1-cycle,
this one is certainly a zero-cycle instruction, in addition to both of them
being dependency breaking.

As confirmed by exegesis measurements, and ref docs.
2021-05-14 11:56:06 +03:00
Roman Lebedev a9fb321a67
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VXORPS tests 2021-05-14 11:56:06 +03:00
Roman Lebedev aa0dcb3ba4
[X86] AMD Zen 3: same-reg SSE XMM XORPS is a 1-cycle(!) dep-breaking one-idiom
While both the SOG and Agner insist that it is zero-cycle,
i can not confirm that claim. While it clearly breaks the dependency,
i can not come up with a snippet, or measurement approach,
to end up with IPC bigger than 4, which, to me, means that it actually
consumes execution resource of an FP unit for a cycle.
2021-05-14 00:03:36 +03:00
Roman Lebedev 6c4596793d
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM XORPS test 2021-05-14 00:03:36 +03:00
Martin Storsjö b42fb6811e [llvm-nm] Support the -V option, print that the tool is compatible with GNU nm
This unlocks some codepaths in libtool.

Differential Revision: https://reviews.llvm.org/D102321
2021-05-13 22:36:25 +03:00
Aakanksha Patil 464e4dc50f [AMDGPU] Add gfx1034 target
Differential Revision: https://reviews.llvm.org/D102306
2021-05-13 14:25:18 -04:00
Jordan Rupprecht 1336c5ae2f [llvm-cov][test] Add test coverage for "gcov" implying "llvm-cov gcov" compatibility.
Much like other LLVM binary utilities, `llvm-cov` has a symlink compatibility feature where it runs in `gcov` compatibility mode if the binary name ends in `gcov`. This is identical to invoking `llvm-cov gcov ...`.

Differential Revision: https://reviews.llvm.org/D102299
2021-05-12 08:21:42 -07:00
Greg McGary 5a43901539 [llvm-objdump] Exclude __mh_*_header symbols during MachO disassembly
`__mh_(execute|dylib|dylinker|bundle|preload|object)_header` are special symbols whose values hold the VMA of the Mach header to support introspection. They are attached to the first section in `__TEXT`, even though their addresses are outside `__TEXT`, and they do not refer to code.

It is normally harmless, but when the first section of `__TEXT` has no other symbols, `__mh_*_header` is considered by the disassembler when determing function boundaries. Since `__mh_*_header` refers to an address outside `__TEXT`, the boundary determination fails and disassembly quits.

Since `__TEXT,__text` normally has symbols, this bug is obscured. Experiments placing `__stubs` and `__stub_helper` first exposed the bug, since neither has symbols.

Differential Revision: https://reviews.llvm.org/D101786
2021-05-12 06:39:14 -07:00
Alex Orlov d8e65585f7 Fixed llvm-objcopy to add correct symbol table for ELF with program headers.
This fixes the following bugs:
https://bugs.llvm.org/show_bug.cgi?id=43935

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D102258
2021-05-12 12:39:30 +04:00
Petr Hosek 8280ece0c9 [Coverage] Support overriding compilation directory
When making compilation relocatable, for example in distributed
compilation scenarios, we want to set compilation dir to a relative
value like `.` but this presents a problem when generating reports
because if the file path is relative as well, for example `..`, you
may end up writing files outside of the output directory.

This change introduces a flag that allows overriding the compilation
directory that's stored inside the profile with a different value that
is absolute.

Differential Revision: https://reviews.llvm.org/D100232
2021-05-11 15:26:45 -07:00
Alan Phipps eccb925147 Reland "[Coverage] Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation""
Originally landed in: 6400905a61
Reverted in: 668dccc396

Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation
groups.

This change corrects the implementation for the branch coverage summary to do
the same thing for branches that is done for lines and regions.  That is,
across function instantiations in an instantiation group, the maximum branch
coverage found in any of those instantiations is returned, with the total
number of branches being the same across instantiations.

Differential Revision: https://reviews.llvm.org/D102193
2021-05-11 11:48:23 -05:00
Alan Phipps 668dccc396 Revert "Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation"
This reverts commit 6400905a61.
2021-05-11 11:26:19 -05:00
Alan Phipps 6400905a61 Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation
groups.

This change corrects the implementation for the branch coverage
summary to do the same thing for branches that is done for lines and regions.
That is, across function instantiations in an instantiation group, the maximum
branch coverage found in any of those instantiations is returned, with the
total number of branches being the same across instantiations.

Differential Revision: https://reviews.llvm.org/D102193
2021-05-11 10:42:40 -05:00
Alex Orlov 05d1ae4e18 * Add support for JSON output style to llvm-symbolizer
This patch adds JSON output style to llvm-symbolizer to better support CLI automation by providing a machine readable output.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96883
2021-05-11 13:10:54 +04:00
Djordje Todorovic 1ed2963600 [llvm-dwarfdump] Fix abstract origin vars location stats calculation
There are cases where a concrete DIE with DW_TAG_subprogram can have
abstract_origin attribute, so we handle that situation as well.

Differential Revision: https://reviews.llvm.org/D101025
2021-05-11 01:04:51 -07:00
Roman Lebedev 6a64c462eb
[X86] AMD Zen 3: same-reg AVX YMM VPCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs.
Still not zero-cycle :)
2021-05-10 23:49:27 +03:00
Roman Lebedev 5864e7b86b
[NFC][X86][MCA] AMD Zen 3: add tests for same-re AVX YMM VPCMP 2021-05-10 23:49:27 +03:00
Roman Lebedev 2953245337
[X86] AMD Zen 3: same-reg AVX XMM VPCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs.
Again, it's not zero-cycle.
2021-05-10 23:49:26 +03:00
Roman Lebedev f59db6c4f8
[NFC][X86][MCA] AMD Zen 3: add tests for same-re AVX XMM VPCMP 2021-05-10 23:49:26 +03:00
Roman Lebedev 0f3bcb97ef
[X86] AMD Zen 3: same-reg SSE XMM PCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs.
Much like with MMX PCMP, it does actually have to execute, though.
2021-05-10 23:49:26 +03:00
Roman Lebedev 0e538f937a
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg XMM SSE PCMP 2021-05-10 23:49:26 +03:00
Roman Lebedev b24edfff4f
[X86] AMD Zen 3: same-reg PCMPEQ is an MMX all-ones dep breaking idiom
They are, however, not zero-cycle, and do actually execute.

As measured by exegesis, and confirmed by ref docs.
2021-05-10 23:49:26 +03:00
Roman Lebedev ba225ce961
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg MMX PCMPEQ 2021-05-10 23:49:25 +03:00
Roman Lebedev 08cf2776ac
[X86] AMD Zen 3: sub-32-bit CMP also break dependencies
They measure as having the same effect as 32-bit CMP.
2021-05-10 20:57:38 +03:00
Roman Lebedev ecff974b66
[NFC][X86][MCA] AMD Zen 3: add tests for sub-32-bit CMP dep breaking 2021-05-10 20:57:37 +03:00
Fangrui Song 7a0231ae59 [llvm-objdump][MachO] Print a newline before lazy bind/bind/weak/exports trie
This adds a separator between two pieces of information.

Reviewed By: #lld-macho, alexshap

Differential Revision: https://reviews.llvm.org/D102114
2021-05-10 09:16:18 -07:00
Roman Lebedev be23d5e814
[X86] AMD Zen 3: same-reg CMP is a zero-cycle dependency-breaking instruction
As measured by exegesis, and confirmed by ref docs.
2021-05-10 00:03:20 +03:00
Roman Lebedev 9a31efa2f5
[NFC][X86][MCA] AMD Zen 3: add tests for CMP dependency breaking 2021-05-10 00:03:20 +03:00
Roman Lebedev 11b0568dce
[X86] AMD Zen 3: same-reg SBB is a dependency-breaking instruction
As confirmed by exegesis measurements, and ref docs.
It does actually execute.

While there, bump latency for MULX32rr, that seems to match measurements.
2021-05-10 00:03:20 +03:00
Roman Lebedev 8d0e2d2b0f
[NFC][X86][MCA] AMD Zen 3: add tests for SBB dependency breaking 2021-05-10 00:03:20 +03:00
Roman Lebedev eed8552787
[X86] AMD Zen 3: same-register XOR/SUB are GPR dependency breaking zero-idioms
As measured by exegesis and confirmed in reference docs.
2021-05-10 00:03:20 +03:00
Roman Lebedev ab794852ed
[NFC][X86][MCA] AMD Zen3: add GPR zero-idiom dependency breaking tests 2021-05-10 00:03:20 +03:00
Roman Lebedev a21df76db6
[X86] AMD Zen 3: XCHG is a zero-cycle instruction
As measured by exegesis and confirmed by reference docs.
2021-05-09 20:37:57 +03:00
Fangrui Song 492173d42b [test] Fix tools/gold/X86/new-pm.ll after D101797 2021-05-08 13:41:36 -07:00
Arthur Eubanks 34a8a437bf [NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose
Printing pass manager invocations is fairly verbose and not super
useful.

This allows us to remove DebugLogging from pass managers and PassBuilder
since all logging (aside from analysis managers) goes through
instrumentation now.

This has the downside of never being able to print the top level pass
manager via instrumentation, but that seems like a minor downside.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D101797
2021-05-07 21:51:47 -07:00
Roman Lebedev 2819009b5a
[X86] AMD Zen 3: _REV variants of zero-cycles moves are also zero-cycles (PR50261)
Sometimes disassembler picks _REV variants of instructions
over the plain ones, which in this case exposed an issue
that the _REV variants aren't being modelled as optimizable moves.
2021-05-07 18:27:40 +03:00
Roman Lebedev a8e30e63ac
[NFC][X86][MCA] AMD Zen3: add test for zero-cycle X87 move 2021-05-07 18:27:40 +03:00
Roman Lebedev 34de155f7e
[NFC][X86][MCA] AMD Zen3 Decrease iteration count in reg-move-elimination tests
Drop it just enough so it still produces the right IPC.
2021-05-07 17:06:45 +03:00
Roman Lebedev 758c173309
[X86] AMD Zen 3: throughput for renameable XMM/YMM moves is 6
They are resolved at the register rename stage without
using any execution units.
2021-05-07 17:06:45 +03:00
Roman Lebedev 715c0d0bd4
[X86] AMD Zen 3: AVX YMM moves are zero-cycle
I've verified this with llvm-exegesis.
This is not limited to zero registers.
2021-05-07 17:06:45 +03:00
Roman Lebedev ee020b930d
[X86] AMD Zen 3: AVX XMM moves are zero-cycle
I've verified this with llvm-exegesis.
This is not limited to zero registers.
2021-05-07 17:06:44 +03:00
Roman Lebedev 9db4203883
[X86] AMD Zen 3: SSE XMM moves are zero-cycle
I've verified this with llvm-exegesis.
This is not limited to zero registers.

Refs:
AMD SOG 19h, 2.9.4 Zero Cycle Move
The processor is able to execute certain register to register
mov operations with zero cycle delay.

Agner,
22.13 Instructions with no latency
Register-to-register move instructions are resolved at
the register rename stage without using any execution units.
These instructions have zero latency. It is possible to do six such
register renamings per clock cycle, and it is even possible to
rename the same register multiple times in one clock cycle.
2021-05-07 17:06:44 +03:00
Roman Lebedev 0d961fbd52
[NFC][X86][MCA] AMD Zen 3: Add tests for renameable AVX YMM moves 2021-05-07 17:06:44 +03:00
Roman Lebedev bcbfc22ff9
[NFC][X86][MCA] AMD Zen 3: Add tests for renameable AVX XMM moves 2021-05-07 17:06:44 +03:00
Roman Lebedev cbabe4f4d6
[NFC][X86][MCA] AMD Zen 3: Add tests for renameable SSE XMM moves 2021-05-07 17:06:44 +03:00
Roman Lebedev d8c6202576
[X86] AMD Zen 3: throughput for renameable GPR moves is 6
They are resolved at the register rename stage without
using any execution units.
2021-05-07 17:06:43 +03:00
Roman Lebedev e6d688ec96
[NFC][X86][MCA] Increase iteration count in reg move elimination tests
So the IPC actually stabilizes at 6.
2021-05-07 17:06:43 +03:00
Roman Lebedev bda9ca3e44
[NFC][X86][MCA] AMD Zen 3: add tests with non-eliminatible MMX moves
In Zen3, MMX moves are *not* eliminated,
i've verified this with llvm-exegesis.
2021-05-07 13:56:07 +03:00
Roman Lebedev 7059b28d5d
[X86] AMD Zen 3: 32/64 -bit GPR register moves are zero-cycle
I've verified this with llvm-exegesis.
This is not limited to zero registers.

Refs:
AMD SOG 19h, 2.9.4 Zero Cycle Move
The processor is able to execute certain register to register
mov operations with zero cycle delay.

Agner,
22.13 Instructions with no latency
Register-to-register move instructions are resolved at
the register rename stage without using any execution units.
These instructions have zero latency. It is possible to do six such
register renamings per clock cycle, and it is even possible to
rename the same register multiple times in one clock cycle.
2021-05-07 13:56:07 +03:00
Roman Lebedev 227678089c
[NFC][X86][MCA] AMD Zen 3: add tests with eliminatible GPR moves 2021-05-07 13:56:07 +03:00
gbreynoo f0762fc42f [llvm-dwarfdump] Help option output should be consistent with the command guide
The dwarfdump command guide shows the short options used as aliases but
these are not found in the help text unless --show-hidden is used.
Investigating other tools some follow this pattern, others like
llvm-objdump show aliases with --help. This change fixes the help output
to be consistent with the command guide. This includes updating alias
descriptions in the help output to use "--".

As part of this change I updated cmdline.test, including some options
that were missing testing.

Differential Revision: https://reviews.llvm.org/D101646
2021-05-07 11:23:05 +01:00
Matthew Voss 22aece57be Allow llvm-dis to disassemble multiple files
Differential Revision: https://reviews.llvm.org/D101110
2021-05-06 11:08:55 -07:00
Fangrui Song b3336bfa2e [llvm-objcopy][ELF] --only-keep-debug: set offset/size of segments with no sections to zero
PR50160: we currently ignore non-PT_PHDR segments with no sections, not
accounting for its p_offset and p_filesz: this can cause an out-of-bounds write
in `writeSegmentData` if the p_offset+p_filesz is larger than the total file
size.

This can be fixed by setting p_offset=p_filesz=0. The logic nicely unifies with
the logic added in D90897.

Reviewed By: jhenderson, rupprecht

Differential Revision: https://reviews.llvm.org/D101560
2021-05-05 10:26:57 -07:00
Fangrui Song e510860656 [llvm-objdump] Add -M {att,intel} & deprecate --x86-asm-syntax={att,intel}
The internal `cl::opt` option --x86-asm-syntax sets the AsmParser and AsmWriter
dialect. The option is used by llc and llvm-mc tests to set the AsmWriter dialect.

This patch adds -M {att,intel} as GNU objdump compatible aliases (PR43413).

Note: the dialect is initialized when the MCAsmInfo is constructed.
`MCInstPrinter::applyTargetSpecificCLOption` is called too late and its MCAsmInfo
reference is const, so changing the `cl::opt` in
`MCInstPrinter::applyTargetSpecificCLOption` is not an option, at least without
large amount of refactoring.

Reviewed By: hoy, jhenderson, thakis

Differential Revision: https://reviews.llvm.org/D101695
2021-05-05 00:20:41 -07:00
Fangrui Song 96f3a63076 [llvm-objcopy] --dump-section: error if '=' is missing or filename is empty
Fix PR45416: the diagnostic when '=' is missing is misleading.
`FileOutputBuffer::create` returns successfully when the filename is empty
(the temporary file is `.tmp%%%%%%%`), but `FileOutputBuffer::commit` will error when
renaming `.tmp%%%%%%%` to the empty name).

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D101697
2021-05-04 17:30:57 -07:00
Martin Storsjö 70c4930637 [llvm-readobj] [ARMWinEH] Try to resolve label symbols into regular ones
Unwind info generated by MSVC tends to have relocations pointing at
static "label" symbols like "$LN4" instead of regular ones based on
the actual function's name. Try to resolve such symbols to a non-label
symbol if possible (ideally to an external symbol), to improve
the readability.

Differential Revision: https://reviews.llvm.org/D101567
2021-05-04 22:22:18 +03:00
Fangrui Song dcf6d0d389 [llvm-objdump] Fix -a after D100433
-a is alias for --archive-headers, not --all-headers
2021-05-04 10:17:36 -07:00
Fangrui Song 0c2e2f88fb [llvm-objdump] Improve newline consistency between different pieces of information
When dumping multiple pieces of information (e.g. --all-headers),
there is sometimes no separator between two pieces.
This patch uses the "\nheader:\n" style, which generally improves
compatibility with GNU objdump.

Note: objdump -t/-T does not add a newline before "SYMBOL TABLE:" and "DYNAMIC SYMBOL TABLE:".
We add a newline to be consistent with other information.

`objdump -d` prints two empty lines before the first 'Disassembly of section'.
We print just one with this patch.

Differential Revision: https://reviews.llvm.org/D101796
2021-05-04 09:56:07 -07:00
gbreynoo a617e2064d [llvm-objdump] Remove Generic Options group from help text output
Reapply 7368624 after revert and fix

Looking at other tools using tablegen for help output, general options
like --help are not separated from other options. This change removes
the "Generic Options" option group so the options are listed together.
the macho specific option group is left unaffected.

The test help.test was modified to reflect this change.

Differential Revision: https://reviews.llvm.org/D101652
2021-05-04 17:42:20 +01:00
Dimitry Andric dffddde73a Revert "[llvm-objdump] Remove Generic Options group from help text output"
This reverts commit 73686247ac, as there
were git stash conflict markers left unresolved.
2021-05-04 18:28:31 +02:00
gbreynoo 73686247ac [llvm-objdump] Remove Generic Options group from help text output
Looking at other tools using tablegen for help output, general options
like --help are not separated from other options. This change removes
the "Generic Options" option group so the options are listed together.
the macho specific option group is left unaffected.

The test help.test was modified to reflect this change.

Differential Revision: https://reviews.llvm.org/D101652
2021-05-04 16:48:03 +01:00
Giorgis Georgakoudis 404fa9a6cf [Utils] Add prof metadata to matched unnamed values
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101742
2021-05-03 15:15:34 -07:00
Konstantin Zhuravlyov 2055cc8ef4 AMDGPU: XFAIL LLVM::note-amd-valid-v2.test for big endian 2021-05-03 09:45:19 -04:00
Konstantin Zhuravlyov 94aaf3ddd9 Reland "AMDGPU/llvm-readobj: Add missing tests for note parsing/displaying"
This reverts commit 54aad63659.

Includes fix for note-amd-valid-v3.s test.
2021-05-02 22:56:17 -04:00
Sergio Perez Gonzalez 761d5614a1 [Object] Fix e_machine description for EM_CR16 and add EM_MICROBLAZE
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D101133
2021-05-02 19:25:39 -07:00
Roman Lebedev 2b93c9c16c
[X86] AMD Zen 3 Scheduler Model
Introduce basic schedule model for AMD Zen 3 CPU's, a.k.a `znver3`.

This is fully built from scratch, from llvm-mca measurements
and documented reference materials.
Nothing was copied from `znver2`/`znver1`.

I believe this is in a reasonable state of completion for inclusion,
probably better than D52779 `bdver2` was :)

Namely:
* uops are pretty spot-on (at least what llvm-mca can measure)
  {F16422596}
* latency is also pretty spot-on (at least what llvm-mca can measure)
  {F16422601}
* throughput is within reason
  {F16422607}

I haven't run much benchmarks with this,
however RawSpeed benchmarks says this is beneficial:
{F16603978}
{F16604029}

I'll call out the obvious problems there:
* i didn't really bother with X87 instructions
* i didn't really bother with obviously-microcoded/system instructions
* There are large discrepancy in throughput for `mr` and `rm` instructions.
  I'm not really sure if it's a modelling defect that needs to be fixed,
  or it's a defect of measurments.
* Pipe distributions are probably bad :)
  I can't do much here until AMD allows that to be fixed
  by documenting the appropriate counters and updating libpfm

That being said, as @RKSimon notes:
>>! In D94395#2647381, @RKSimon wrote:
> I'll mention again that all the znver* models appear to be very inaccurate wrt SIMD/FPU instructions <...>
so how much worse this could possibly be?!

Things that aren't there:
* Various tunings: zero idioms, etc. That is follow-ups.

Differential Revision: https://reviews.llvm.org/D94395
2021-05-01 22:08:13 +03:00
Jez Ng c00fc180ec [llvm-readobj] Recognize N_THUMB_DEF as a symbol flag
The right symbol flag mask is ~0x7, not ~0xf.

Also emit string names for the other flags (we were missing some).

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D101548
2021-04-30 17:39:56 -04:00
Arthur Eubanks 511f2cecf7 [llvm-reduce] Don't unset dso_local on implicitly dso_local GVs
This introduces a flag that aborts if we ever reduce to IR that fails
the verifier.

Reviewed By: swamulism, arichardson

Differential Revision: https://reviews.llvm.org/D101279
2021-04-30 11:57:22 -07:00
Arthur Eubanks 545a8177ea [llvm-reduce] Add flag to only run specific passes
Reviewed By: fhahn, hans

Differential Revision: https://reviews.llvm.org/D101278
2021-04-30 11:51:01 -07:00
Konstantin Zhuravlyov 54aad63659 Revert "AMDGPU/llvm-readobj: Add missing tests for note parsing/displaying"
This reverts commit c9c4676a45.

Reason for revert: note-amd-valid-v3.s test fails if AMDGPU is not built.
2021-04-30 14:45:52 -04:00
Nick Desaulniers dde24a87c5 [llvm-objdump] add -v alias for --version
Used by the Linux kernel's CONFIG_X86_DECODER_SELFTEST.

Link: https://github.com/ClangBuiltLinux/linux/issues/1130

Reviewed By: MaskRay, jhenderson, rupprecht

Differential Revision: https://reviews.llvm.org/D101483
2021-04-30 11:26:36 -07:00
Konstantin Zhuravlyov c9c4676a45 AMDGPU/llvm-readobj: Add missing tests for note parsing/displaying
This is a follow up review/change for https://reviews.llvm.org/D95638

Add valid note tests for code object v2 notes:
  - NT_AMD_HSA_CODE_OBJECT_VERSION (required yaml2obj update)
  - NT_AMD_HSA_HSAIL (required yaml2obj update)
  - NT_AMD_HSA_ISA_VERSION (required yaml2obj update)
  - NT_AMD_HSA_METADATA
  - NT_AMD_HSA_ISA_NAME
  - NT_AMD_PAL_METADATA

Add valid note tests for code object v3 notes:
  - NT_AMDGPU_METADATA

Add invalid note tests for code object v2 notes:
  - NT_AMD_HSA_CODE_OBJECT_VERSION (required yaml2obj update)
  - NT_AMD_HSA_HSAIL (required yaml2obj update)
  - NT_AMD_HSA_ISA_VERSION (required yaml2obj update)

Add invalid note tests for code object v3 notes:
  - NT_AMDGPU_METADATA

Differential Revision: https://reviews.llvm.org/D101304
2021-04-30 11:19:16 -04:00
Andrea Di Biagio 8bd4f3d547 [MCA] Fix CarryOver check in the DispatchStage (PR50174).
Early exit from method DispatchStage::isAvailable() if the dispatch group is
already full. Not all instructions declare at least one uOP.
Fixes PR50174.
2021-04-30 14:26:46 +01:00
Martin Storsjö 4750a8b1bc Reapply [llvm-readobj] [ARMWinEH] Fix handling of relocations and symbol offsets
When looking up data referenced from pdata/xdata structures, the
referenced data can be found in two different ways:
- For an unrelocated object file, it's located via a relocation
- For a relocated, linked image, the data is referenced with an
  (image relative) absolute address

For the latter case, the absolute address can optionally be
described with a symbol.

For the case of an object file, there's two offsets involved; one
immediate offset encoded in the data location that is modified by
the relocation, and a section offset in the symbol.

Previously, for the ExceptionRecord field, we printed the offset
from the symbol (only) but used the immediate offset ignoring
the symbol's address (using only the symbol's section) for printing
the exception data.

Add a helper method for doing the lookup and address calculation,
for simplifying the calling code and making all the cases consistent.

This addresses an existing FIXME comment, fixing printing of the
exception data for cases where relocations point at individual
symbols in the xdata section (which is what MSVC generates) instead of
all relocations pointing at the start of the xdata section (which is
what LLVM generates).

This also fixes printing of the function name for packed entries in
linked images.

Relanded with a format string fix in the formatSymbol function; one
can't use %X as format string for an uint64_t. That bug has been
present since this code was added in e6971cab30.

Differential Revision: https://reviews.llvm.org/D100305
2021-04-30 09:51:23 +03:00
Martin Storsjö 5bf2ef9d86 Revert "[llvm-readobj] [ARMWinEH] Fix handling of relocations and symbol offsets"
This reverts commit 3778924088.

The added test fails on at least one buildbot, by printing a reversed
combination, printing "func3_xdata +0x18 (0x8)" while it's supposed to
be "func3_xdata +0x8 (0x18)", see e.g.
https://lab.llvm.org/buildbot/#/builders/107/builds/7269. Currently
no idea how that could happen, but reverting until it can be figured
out.
2021-04-30 00:06:16 +03:00
Martin Storsjö 3778924088 [llvm-readobj] [ARMWinEH] Fix handling of relocations and symbol offsets
When looking up data referenced from pdata/xdata structures, the
referenced data can be found in two different ways:
- For an unrelocated object file, it's located via a relocation
- For a relocated, linked image, the data is referenced with an
  (image relative) absolute address

For the latter case, the absolute address can optionally be
described with a symbol.

For the case of an object file, there's two offsets involved; one
immediate offset encoded in the data location that is modified by
the relocation, and a section offset in the symbol.

Previously, for the ExceptionRecord field, we printed the offset
from the symbol (only) but used the immediate offset ignoring
the symbol's address (using only the symbol's section) for printing
the exception data.

Add a helper method for doing the lookup and address calculation,
for simplifying the calling code and making all the cases consistent.

This addresses an existing FIXME comment, fixing printing of the
exception data for cases where relocations point at individual
symbols in the xdata section (which is what MSVC generates) instead of
all relocations pointing at the start of the xdata section (which is
what LLVM generates).

This also fixes printing of the function name for packed entries in
linked images.

Differential Revision: https://reviews.llvm.org/D100305
2021-04-29 23:35:10 +03:00
Alexander Shaposhnikov 86f291ebb2 [llvm-objcopy][MachO] Add support for LC_THREAD/LC_UNIXTHREAD
Add support for LC_THREAD/LC_UNIXTHREAD
(these load commands can be copied over without any modifications).

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D101384
2021-04-28 16:29:33 -07:00
Jonas Devlieghere 625bd94c6d [dsymutil] Add flag to force a static variable to keep its enclosing function
Add a flag to change dsymutil's behavior and force a static variable to
keep its enclosing function. The test shows a situation where that could
be useful. I'm not convinced this behavior makes sense as a default,
which is why it's behind a flag.

rdar://74918374

Differential revision: https://reviews.llvm.org/D101337
2021-04-28 11:33:04 -07:00
Alex Richardson 79030a22cc [llvm-objdump] Fix dumping dynamic relative relocations for SHT_REL
Previously printing R_386_RELATIVE relocations would trigger
`error: can't read an entry at 0x40: it goes past the end of the section (0x40)`
I found this while writing a test case for LLD (D100490).
This also includes some minor cleanup in the elf-dynamic-relcos.test
llvm-objdump test based on the newly added test.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D100489
2021-04-28 12:23:00 +01:00
Alex Richardson 9692811b26 [update_(llc_)test_checks.py] Support pre-processing commands
This has been rather useful in our downstream CHERI target where we want
to run tests both with addrspace(0) and addrspace(200) pointers.
With this patch we can prefix the opt command with
`sed -e 's/addrspace(200)/addrspace(0)/g' -e 's/-A200-P200-G200//g'` to
test both cases using the same IR input.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95137
2021-04-28 12:19:19 +01:00
Alexander Shaposhnikov 412437aec0 Revert "[llvm-objcopy][MachO] Add support for LC_THREAD/LC_UNIXTHREAD"
This reverts commit 4dfddf715b
since it breaks some build bots (e.g. clang-ppc64be-linux)
2021-04-27 16:19:59 -07:00
Alexander Shaposhnikov 4dfddf715b [llvm-objcopy][MachO] Add support for LC_THREAD/LC_UNIXTHREAD
Add support for LC_THREAD/LC_UNIXTHREAD
(these load commands can be copied over without any modifications).

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D101384
2021-04-27 15:54:51 -07:00
Fangrui Song a41f076ef1 [test] Fix tools/gold/X86/weak.ll after D94202
The order regressed after D94202: after `a b`, when adding `a c`, we may reorder
`a` to the right of `b`, causing the final order to be `b a c`.
2021-04-26 16:04:22 -07:00
Ali Tamur 51b4610743 Support DW_FORM_strx* in llvm-dwp.
Currently llvm-dwp only handled DW_FORM_string and DW_FORM_GNU_str_index; with this patch it also starts to handle DW_FORM_strx[1-4]?

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D75485
2021-04-26 12:32:45 -07:00
Martin Storsjö f8de9aaef2 [llvm-rc] Add a GNU windres-like frontend to llvm-rc
This primarily parses a different set of options and invokes the same
resource compiler as llvm-rc normally. Additionally, it can convert
directly to an object file (which in MSVC style setups is done with the
separate cvtres tool, or by the linker).

(GNU windres also supports other conversions; from coff object file back
to .res, and from .res or object file back to .rc form; that's not yet
implemented.)

The other bigger complication lies in being able to imply or pass the
intended target triple, to let clang find the corresponding mingw sysroot
for finding include files, and for specifying the default output object
machine format.

It can be implied from the tool triple prefix, like
`<triple>-[llvm-]windres` or picked up from the windres option e.g.
`-F pe-x86-64`. In GNU windres, that option takes BFD style format names
such as pe-i386 or pe-x86-64. As libbfd in binutils doesn't support
Windows on ARM, there's no such canonical name for the ARM targets.
Therefore, as an LLVM specific extension, this option is extended to
allow passing full triples, too.

Differential Revision: https://reviews.llvm.org/D100756
2021-04-26 22:04:29 +03:00
Tim Renouf 18adf4bb0d [AMDGPU][llvm-objdump] Add lit.local.cfg missing from recent commit
Stops llvm-objdump tests failing when AMDGPU target is not supported.

Change-Id: Ic4ae443958c41c303ff6bee0966e5f21ab7a1851
2021-04-26 14:07:04 +01:00
Tim Renouf 8710eff6c3 [MC][AMDGPU][llvm-objdump] Synthesized local labels in disassembly
1. Add an accessor function to MCSymbolizer to retrieve addresses
   referenced by a symbolizable operand, but not resolved to a symbol.
   That way, the caller can synthesize labels at those addresses and
   then retry disassembling the section.

2. Implement that in AMDGPU -- a failed symbol lookup results in the
   address being added to a vector returned by the new function.

3. Use that in llvm-objdump when using MCSymbolizer (which only happens
   on AMDGPU) and SymbolizeOperands is on.

Differential Revision: https://reviews.llvm.org/D101145

Change-Id: I19087c3bbfece64bad5a56ee88bcc9110d83989e
2021-04-26 13:56:36 +01:00
Djordje Todorovic 6ba150dbb4 [llvm-dwarfdump] Fix split-dwarf bug in stats for inlined var loc cov
Initial (D96045) patch didn't handle split dwarf cases,
so this fixes that bug.

In addition, before applying this patch, we had a slowdown
that happened after the D96045. With this patch,
the slowdown will be fixed as well.

Differential Revision: https://reviews.llvm.org/D100951
2021-04-26 01:56:15 -07:00
Keith Smiley 86b98c60c5 llvm-objdump: add --rpaths to macho support
This prints the rpaths for the given binary

Reviewed By: kastiglione

Differential Revision: https://reviews.llvm.org/D100681
2021-04-22 16:01:10 -07:00
Hongtao Yu aaf120b528 [llvm-profgen] A couple tweaks to the testing harness.
1. Remove unnecessary filtering code.
2. Add llvm-profgen to tool substitutions.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D101006
2021-04-22 08:57:14 -07:00
Wenlei He dff8315892 [CSSPGO][llvm-profdata] Support trimming cold context when merging profiles
The change adds support for triming and merging cold context when mergine CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen, however the flexibility to trim cold context after profile is generated can be useful.

Differential Revision: https://reviews.llvm.org/D100528
2021-04-22 00:42:37 -07:00
Hongtao Yu 1a719089a8 [CSSPGO][llvm-profgen] Always report dangling probes for frames with real samples.
Report dangling probes for frames that have real samples collected. Dangling probes are the probes associated to an empty block. When reported, sample count on a dangling probe will not be trusted by the compiler and we will rely on the counts inference algorithm to get the probe a reasonable count. This actually fixes a bug where previously only those dangling probes with samples collected were reported.

This patch also fixes two existing issues. Pseudo probes are stored in `Address2ProbesMap` and their pointers are used in `PseudoProbeInlineTree`. Previously `std::vector` was used to store probes and the pointers to probes may get obsolete as the vector grows. I'm changing `std::vector` to `std::list` instead.

The other issue is that all outlined functions shared the same inline frame previously due to the unchanged `Index` value as the dummy inlineSite identifier.

Good results seen for SPEC2017 in general regarding profile quality.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D100235
2021-04-21 18:07:58 -07:00
Martin Storsjö 64bc44f5dd [llvm-rc] Run clang to preprocess input files
Allow opting out from preprocessing with a command line argument.

Update tests to pass -no-preprocess to make it not try to use clang
(which isn't a build level dependency of llvm-rc), but add a test that
does preprocessing under clang/test/Preprocessor.

Update a few options to allow them both joined (as -DFOO) and separate
(-D BR), as rc.exe allows both forms of them.

With the verbose flag set, this prints the preprocessing command
used (which differs from what rc.exe does).

Tests under llvm/test/tools/llvm-rc only test constructing the
preprocessor commands, while tests under clang/test/Preprocessor test
actually running the preprocessor.

Differential Revision: https://reviews.llvm.org/D100755
2021-04-21 11:50:10 +03:00
Sebastian Neubauer 4897effb14 [AMDGPU] Add TransVALU to gfx10
Instructions on the transcendental unit are executed in parallel to the
normal VALU, so add this as an extra resource.

This doesn't seem to have any effect, but it should be more correct.

Differential Revision: https://reviews.llvm.org/D100123
2021-04-20 15:34:43 +02:00
Nico Weber 1a3f88658a [llvm-objdump] Add an llvm-otool tool
This implements an LLVM tool that's flag- and output-compatible
with macOS's `otool` -- except for bugs, but from testing with both
`otool` and `xcrun otool-classic`, llvm-otool matches vanilla
otool's behavior very well already. It's not 100% perfect, but
it's a very solid start.

This uses the same approach as llvm-objcopy: llvm-objdump uses
a different OptTable when it's invoked as llvm-otool. This
is possible thanks to D100433.

Differential Revision: https://reviews.llvm.org/D100583
2021-04-20 08:24:58 -04:00
Martin Storsjö 73cda4d183 [llvm-rc] Fix handling of the /X option to match its documentation and rc.exe
This matches how it's documented in the option listing.

Differential Revision: https://reviews.llvm.org/D100754
2021-04-20 09:22:43 +03:00
David Penry 78a871abf7 [ARM] Use ProcResGroup in Cortex-M7 scheduling model
Used to model structural hazards on FP issue, where some
instructions take up 2 issue slots and others one as well
as similar structural hazards on load issue, where some
instructions take up two load lanes and others one.

Differential Revision: https://reviews.llvm.org/D98977
2021-04-19 21:23:05 +01:00
LemonBoy 24185541ca [yaml2obj/obj2yaml/llvm-readobj] Support printing and parsing AVR-specific e_flags
The `e_flags` contains a mixture of bitfields and regular ones, ensure all of them can be serialized and deserialized.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D100250
2021-04-15 15:54:28 +02:00
Alex Orlov 49cbf4cd85 Fix bug in .eh_frame/.debug_frame PC offset calculation for DW_EH_PE_pcrel
This fixes the following bugs:
https://bugs.llvm.org/show_bug.cgi?id=27249
https://bugs.llvm.org/show_bug.cgi?id=46414

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D100328
2021-04-15 15:06:20 +04:00
Nico Weber 5a625e5303 [llvm-objdump] try to fix section-filter.test in full builds after 51aa61e74b 2021-04-14 20:58:51 -04:00
Nico Weber 1035123ac5 [llvm-objdump] Switch command-line parsing from llvm::cl to OptTable
This is similar to D83530, but for llvm-objdump.

The motivation is the desire to add an `llvm-otool` symlink to
llvm-objdump that behaves like macOS's `otool`, using the same
technique the at llvm-objcopy uses to behave like `strip` (etc).

This change for the most part preserves behavior. In some cases,
it increases compatibility with GNU objdump a bit. For example,
the long options now require two dashes, and the long options
taking arguments for the most part now require a `=` in front
of the value. Exceptions are flags where tests passed the
value separately, for these the separate form is kept as
an alias to the = form.

The one-letter short form args are now joined or separate
and long longer accept a =, which also matches GNU objdump.

cl::opt<>s in libraries now have to be explicitly plumbed
through. This patch does that for --x86-asm-syntax=, but
there's hope that we can remove that again.

Differential Revision: https://reviews.llvm.org/D100433
2021-04-14 20:12:24 -04:00
Wenlei He 00ef28ef21 [CSSPGO] Fix dangling context strings and improve profile order consistency and error handling
This patch fixed the following issues along side with some refactoring:

1. Fix bugs where StringRef for context string out live the underlying std::string. We now keep string table in profile generator to hold std::strings. We also do the same for bracketed context strings in profile writer.
2. Make sure profile output strictly follow (total sample, name) order. Previously, there's inconsistency between ProfileMap's key and FunctionSamples's name, leading to inconsistent ordering. This is now fixed by introducing context profile canonicalization. Assertions are also added to make sure ProfileMap's key and FunctionSamples's name are always consistent.
3. Enhanced error handling for profile writing to make sure we bubble up errors properly for both llvm-profgen and llvm-profdata when string table is not populated correctly for extended binary profile.
4. Keep all internal context representation bracket free. This avoids creating new strings for context trimming, merging and preinline. getNameWithContext API is now simplied accordingly.
5. Factor out the code for context trimming and merging into SampleContextTrimmer in SampleProf.cpp. This enables llvm-profdata to use the trimmer when merging profiles. Changes in llvm-profgen will be in separate patch.

Differential Revision: https://reviews.llvm.org/D100090
2021-04-10 12:39:10 -07:00
Alex Orlov f47a4c0713 [lld] Fixed CodeView GuidAdapter::format to handle GUID bytes in the right order.
This fixes https://bugs.llvm.org/show_bug.cgi?id=41712 bug.

Reviewed By: aganea

Differential Revision: https://reviews.llvm.org/D99978
2021-04-09 05:29:14 +04:00
Andrew Savonichev f08a2fc09e [MCA] Add tests for IPC on Cortex-A55
The tests compare IPC statistics that MCA provides with IPC values
measured on Cortex-A55 hardware. For hardware tests, each snippet is
run in a loop unrolled by 1000, and IPC is measured by linux-perf.

Several tests do not match the hardware: the skewed ALU is not
supported, LDR seem to be missing a forwarding path.

Differential Revision: https://reviews.llvm.org/D98174
2021-04-08 19:37:07 +03:00
Nico Weber c22b09debd Revert "[clang] Speedup line offset mapping computation"
This reverts commit 6951b72334.
Breaks several bots, see comments on https://reviews.llvm.org/D99409
2021-04-07 09:42:11 -04:00
serge-sans-paille 6951b72334 [clang] Speedup line offset mapping computation
Clang spends a decent amount of time in the LineOffsetMapping::get(...)
function. This function used to be vectorized (through SSE2) then the
optimization got dropped because the sequential version was on-par performance
wise.

This provides an optimization of the sequential version that works on a word at
a time, using (documented) bithacks to provide a portable vectorization.

When preprocessing the sqlite amalgamation, this yields a sweet 3% speedup.

Differential Revision: https://reviews.llvm.org/D99409
2021-04-07 14:04:32 +02:00
Jonas Devlieghere 162c2759b6 [dsymutil] Stop emulating dsymutil-classic CIE caching behavior
Stop emulating dsymutil-classic which only cached the last used CIE for
reuse.
2021-04-06 20:15:41 -07:00
Jonas Devlieghere 5d07dc8977 [dsymutil] Don't emit .debug_pubnames and .debug_pubtypes
Consider the .debug_pubnames and .debug_pubtypes their own kind of
accelerator and stop emitting them together with the Apple-style
accelerator tables. The only reason we were still emitting both was for
(byte-for-byte) compatibility with dsymutil-classic.

 - This patch adds a new accelerator table kind "Pub" which can be
   specified with --accelerator=Pub.
 - This patch removes the ability to emit both pubnames/types and apple
   style accelerator tables. I don't think anyone is relying on that but
   it's worth pointing out.
 - This patch removes the --minimize option and makes this behavior the
   default. Specifying the flag will result in a warning but won't abort
   the program.

Differential revision: https://reviews.llvm.org/D99907
2021-04-06 19:01:45 -07:00
Arthur Eubanks 9c8b28a69b [llvm-reduce] Remove unwanted module inline asm
We can clear line by line, but that's likely not very important.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D99921
2021-04-06 09:35:37 -07:00
Jay Foad 94d0fc32f5 [AMDGPU] Add some missing testing for new subtargets gfx90a and gfx90c
Differential Revision: https://reviews.llvm.org/D99647
2021-04-06 08:38:59 +01:00
Roman Lebedev d094f3c3c5
[llvm-exegesis] SnippetFile: do create source manager in MCContext
This way, once there's an error in the snippet file (like in the test),
llvm-exegesis won't crash with an assertion failure,
but print a nice diagnostic about the problem.
2021-04-04 15:58:39 +03:00
Roman Lebedev 64a52e1e32
[llvm-exegesis] Don't erroneously refuse to measure POPCNT instruction 2021-04-04 14:38:26 +03:00
Eric Astor 0499a9d688 [ms] [llvm-ml] Accept /WX to signal that warnings should be fatal.
Define -fatal-warnings to make warnings fatal, and accept /WX as an ML.EXE compatible alias for it.

Also make sure that if Warning() returns true, we always treat it as an error.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D92504
2021-04-02 15:13:20 -04:00
Eric Astor 15ec0ad77a [ms] [llvm-ml] Fix case-sensitivity for variables and textmacros
Make variables and text-macro references case-insensitive, to match ml.exe.

Also improve error handling for text-macro expansion.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D92503
2021-04-02 14:08:02 -04:00
Samuel 0bc5436ae8 [llvm-reduce] Move tests to tools folder
Move tests for llvm-reduce to tools folder

Reviewed By: fhahn, lebedev.ri

Differential Revision: https://reviews.llvm.org/D99632
2021-04-01 10:04:10 -07:00
Douglas Yung c5109d3c79 Fix path in test added in e0577b3130 to work with both Linux/Windows paths.
Patch by Ying Yi!
2021-03-30 04:17:29 -07:00
Markus Böck 142d522ded [llvm-profdata] Make sure to consume Error on the error path of setIsIRLevelProfile
Encountered a crash while running a debug build, where this code path would be taken due to a mismatch in profile coverage data versions. Without consuming the error, an assert would be triggered inside the destructor of Error.

Differential Revision: https://reviews.llvm.org/D99457
2021-03-30 08:52:58 +02:00
Jonas Devlieghere b19a9efbc9 [dsymutil] s/dwarfdump/llvm-dwarfdump/ in test 2021-03-29 17:14:35 -07:00
Jonas Devlieghere e0577b3130 [dsymutil] Relocate DW_TAG_label
dsymutil is not relocating the DW_AT_low_pc for a DW_TAG_label. This
patch fixes that and adds a test.

Differential revision: https://reviews.llvm.org/D99534
2021-03-29 15:45:48 -07:00
Wenlei He 30b0232336 [CSSPGO][llvm-profgen] Context-sensitive global pre-inliner
This change sets up a framework in llvm-profgen to estimate inline decision and adjust context-sensitive profile based on that. We call it a global pre-inliner in llvm-profgen.

It will serve two purposes:
  1) Since context profile for not inlined context will be merged into base profile, if we estimate a context will not be inlined, we can merge the context profile in the output to save profile size.
  2) For thinLTO, when a context involving functions from different modules is not inined, we can't merge functions profiles across modules, leading to suboptimal post-inline count quality. By estimating some inline decisions, we would be able to adjust/merge context profiles beforehand as a mitigation.

Compiler inline heuristic uses inline cost which is not available in llvm-profgen. But since inline cost is closely related to size, we could get an estimate through function size from debug info. Because the size we have in llvm-profgen is the final size, it could also be more accurate than the inline cost estimation in the compiler.

This change only has the framework, with a few TODOs left for follow up patches for a complete implementation:
  1) We need to retrieve size for funciton//inlinee from debug info for inlining estimation. Currently we use number of samples in a profile as place holder for size estimation.
  2) Currently the thresholds are using the values used by sample loader inliner. But they need to be tuned since the size here is fully optimized machine code size, instead of inline cost based on not yet fully optimized IR.

Differential Revision: https://reviews.llvm.org/D99146
2021-03-29 09:46:14 -07:00
Andrew Savonichev bba25a9cd8 [MCA] Support carry-over instructions for in-order processors
Instructions that have more uops than the processor's IssueWidth are
issued in multiple cycles.

The patch fixes PR49712.

Differential Revision: https://reviews.llvm.org/D99339
2021-03-26 00:06:19 +03:00
Konstantin Zhuravlyov f4ace63737 AMDGPU: Add target id and code object v4 support
- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id)
  - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object)
    - Add kernarg_size to kernel descriptor
    - Change trap handler ABI to no longer move queue pointer into s[0:1]
  - Cleanup ELF definitions
    - Add V2, V3, V4 suffixes to make a clear distinction for code object version
    - Consolidate note names

Differential Revision: https://reviews.llvm.org/D95638
2021-03-24 11:54:05 -04:00
Vinicius Tinti 804ff7f293 [llvm-objdump] Implement --prefix-strip option
The option `--prefix-strip` is only used when `--prefix` is not empty.
It removes N initial directories from absolute paths before adding the
prefix.

This matches GNU's objdump behavior.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96679
2021-03-24 13:22:35 +00:00
Andrew Savonichev 292da93d59 [MCA] Disable RCU for InOrderIssueStage
This is a follow-up for:
D98604 [MCA] Ensure that writes occur in-order

When instructions are aligned by the order of writes, they retire
in-order naturally. There is no need for an RCU, so it is disabled.

Differential Revision: https://reviews.llvm.org/D98628
2021-03-24 13:54:04 +03:00
Andy Wingo 9ac5620cb8 [WebAssembly] Rename WasmLimits::Initial to ::Minimum. NFC.
This patch renames the "Initial" member of WasmLimits to the name used
in the spec, "Minimum".

In the core WebAssembly specification, the Limits data type has one
required "min" member and one optional "max" member, indicating the
minimum required size of the corresponding table or memory, and the
maximum size, if any.

Although the WebAssembly spec does instantiate locally-defined tables
and memories with the initial size being equal to the minimum size, it
can't impose such a requirement for imports.  It doesn't make sense to
require an initial size for a memory import, for example.  The compiler
can only sensibly express the minimum and maximum sizes.

See
https://github.com/WebAssembly/js-types/blob/master/proposals/js-types/Overview.md#naming-of-size-limits
for a related discussion that agrees that the right name of "initial" is
"minimum" when querying the type of a table or memory from JavaScript.
(Of course it still makes sense for JS to speak in terms of an initial
size when it explicitly instantiates memories and tables.)

Differential Revision: https://reviews.llvm.org/D99186
2021-03-24 09:10:11 +01:00
Jay Foad fc7e3e7dd9 [AMDGPU] Set SchedRW on real instructions
Coyp SchedRW from pseudos to real instructions so that llvm-mca has
access to it. This is NFC for normal compiler codegen, which schedules
pseudos not real instructions.

Add an llvm-mca test for some high latency double-precision instructions
as a smoke test.

Differential Revision: https://reviews.llvm.org/D99187
2021-03-23 15:38:11 +00:00
Andrea Di Biagio f5bdc88e4d [MCA] Improved handling of negative read-advance cycles.
Before this patch, register writes were always invalidated by the
RegisterFile at instruction commit stage. So,
the RegisterFile was often losing the knowledge about the `execute
cycle` of writes already committed. While this was not problematic
for non-delayed reads, this was sometimes leading to inaccurate read
latency computations in the presence of negative read-advance cycles.

This patch fixes the issue by changing how the RegisterFile component
internally keeps track of the `execute cycle` information of each
write. On every instruction executed, the RegisterFile gets notified
by the RetireStage, so that it can internally record the execute
cycle of each executed write.
The `execute cycle` information is stored within WriteRef itself, and
it is not invalidated when the write is committed.
2021-03-23 14:47:23 +00:00
Yvan Roux 241032a205 [llvm-symbolizer][llvm-nm] Fix AArch64 and ARM mapping symbols handling.
Exclude AArch64 mapping symbols ($x and $d) for symtab symbolization as
it was done for ARM since D95916 tom bring bots back to green state.

This is implemented by setting SF_FormatSpecific such that
llvm-symbolizer will ignore them, and use this flag to re-implement
llvm-nm --special-syms option which make it work for both targets.

Differential Revision: https://reviews.llvm.org/D98803
2021-03-23 14:17:12 +01:00
Rahman Lavaee 949abf7d6a [llvm-readelf, propeller] Add fallthrough bit to basic block metadata in BB-Address-Map section.
This patch adds a fallthrough bit to basic block metadata, indicating whether the basic block can fallthrough without taking any branches. The bit will help us avoid an intel LBR bug which results in occasional duplicate entries at the beginning of the LBR stack.

This patch uses `MachineBasicBlock::canFallThrough()` to set the bit. This is not a const method because it eventually calls `TargetInstrInfo::analyzeBranch`, but it calls this function with the default `AllowModify=false`. So we can either make the argument to the `getBBAddrMapMetadata` non-const, or we can use `const_cast` when calling `canFallThrough`. I decide to go with the latter since this is purely due to legacy code, and in general we should not allow the BasicBlock to be mutable during `getBBAddrMapMetadata`.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D96918
2021-03-22 21:38:05 -07:00
Jonas Devlieghere 3d6c7d6e8e [dsymutil] Fix spurious warnings for missing symbols with thinLTO
Fix spurious warnings for missing symbols with thinLTO. The latter
appends a unique suffix to avoid collisions for exported private
symbols, resulting in dsymutil complaining it couldn't find the symbol
in the object file.

rdar://75434058

Differential revision: https://reviews.llvm.org/D99125
2021-03-22 18:36:39 -07:00
Wenlei He ce6bfe9411 [CSSPGO][llvm-profgen] Use profile summary based threshold for context trimming and merging
Switch to use cold threshold from profile summary for cold context merging and trimming, instead of relying on hard coded values. Minor refactoring included for switch names, etc.

Differential Revision: https://reviews.llvm.org/D98921
2021-03-22 08:56:59 -07:00
Andrew Litteken 0776eca7a4 Revert "[IRSim] Adding basic implementation of llvm-sim."
Causing build errors on the Windows Buildbots.

This reverts commit 5155dff278.
2021-03-20 18:03:09 -05:00
Andrew Litteken 5155dff278 [IRSim] Adding basic implementation of llvm-sim.
This is a similarity visualization tool that accepts a Module and
passes it to the IRSimilarityIdentifier.  The resulting SimilarityGroups
are output in a JSON file.

Tests are found in test/tools/llvm-sim and check for the file not found,
a bad module, and that the JSON is created correctly.

Reviewers: paquette, jroelofs, MaskRay

Recommit of: 15645d044b to fix linking
errors.

Differential Revision: https://reviews.llvm.org/D86974
2021-03-20 16:47:50 -05:00
Fangrui Song 948be862d6 [llvm-readobj] Remove legacy GNU_PROPERTY_X86_ISA_1_{NEEDED,USED} and dump new GNU_PROPERTY_X86_ISA_1_{NEEDED,USED}
https://sourceware.org/bugzilla/show_bug.cgi?id=26703 deprecated the
previous GNU_PROPERTY_X86_ISA_1_{CMOV,SSE,*} values (renamed to `COMPAT`)
and added new values.

Since the legacy values are not used by compilers, having dumping support in
llvm-readobj is unnecessary. So just drop the legacy feature.

The new values are used by GCC 11
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250) `-march=x86-64-v[234]` to
indicate the micro-architecture ISA levels.

Differential Revision: https://reviews.llvm.org/D98818
2021-03-19 14:35:22 -07:00
Wenlei He 1410db70b9 [CSSPGO] Add attribute metadata for context profile
This changes adds attribute field for metadata of context profile. Currently we have an inline attribute that indicates whether the leaf frame corresponding to a context profile was inlined in previous build.

This will be used to help estimating inlining and be taken into account when trimming context. Changes for that in llvm-profgen will follow. It will also help tuning.

Differential Revision: https://reviews.llvm.org/D98823
2021-03-18 22:00:56 -07:00
Andrew Savonichev e6ce0db378 [MCA] Ensure that writes occur in-order
Delay the issue of a new instruction if that leads to out-of-order
commits of writes.

This patch fixes the problem described in:
https://bugs.llvm.org/show_bug.cgi?id=41796#c3

Differential Revision: https://reviews.llvm.org/D98604
2021-03-18 17:10:20 +03:00
Chen Zheng 12824266c7 [NFC] make XCOFF dwarf dump test run only on PowerPC target. 2021-03-17 21:59:47 -04:00
Chen Zheng d33b016ada [XCOFF][llvm-dwarfdump] llvm-dwarfdump support for XCOFF
Author: hubert.reinterpretcast, shchenz

Reviewed By: jasonliu, echristo

Differential Revision: https://reviews.llvm.org/D97186
2021-03-17 21:21:51 -04:00
Eric Astor 1236dbc2fa [ms] [llvm-ml] Allow the /Zs parameter as a synonym for -filetype=null
For ml.exe, /Zs implies a syntax check with no output files.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D90061
2021-03-17 12:18:43 -04:00
Fangrui Song 8fbedb6b90 [llvm-nm] Add --format=just-symbols and make --just-symbol-name its alias
https://sourceware.org/bugzilla/show_bug.cgi?id=27487 binutils will have
--format=just-symbols/-j as well.

Arbitrarily prefer `-j` to `--format=sysv`. Previously `--format=sysv -j` prints
in the sysv format while `-j` takes precedence over other formats.

Differential Revision: https://reviews.llvm.org/D98569
2021-03-16 10:07:01 -07:00
David Zarzycki 0fda5e8441 [llvm-exegesis testing] Workaround unreliable test
Picking an instruction at random is not perfectly reliable.
2021-03-16 08:00:14 -04:00
wlei dddd590fd0 [CSSPGO][llvm-profgen] Fix getCanonicalFnName usage in llvm-profgen
Previously we didn't support to keep the unique linkage name(-funique-internal-linkage-name) in llvm-profgen. As discussed in https://reviews.llvm.org/D96932, we choose to do canonicalization for it.

Now since "selected" is set as the default parameter of getCanonicalFnName in `D96932`, we don't need to add any attribute here for the previous usage and only fix the missing usage in the pseudo probe decoding.

Differential Revision: https://reviews.llvm.org/D98226
2021-03-15 21:00:42 -07:00
Wael Yehia c05990a0cc [PATCH] fix location of test case
from D97507.
2021-03-15 09:34:24 -04:00
Johannes Doerfert cd1bd6e587 [Utils] Check for more global information in update_test_checks
This allows to check for various globals (metadata/attributes/...) and
also resolves problems with globals (metadata/attributes/...) being
reused across different prefixes.

Reviewed By: sstefan1

Differential Revision: https://reviews.llvm.org/D94741
2021-03-11 23:31:16 -06:00
David Blaikie 7906c0309b Move (llvm-original-di-preservation) test example output into the Inputs directory (since it's an input to the test execution)
The "Inputs" subdirectory is used for all files read by the test, not
only those used as input to the execution - so even though this file is
used as a golden reference for the output of the test, it's still an
input to the test execution (it is read in the process of executing the
test).
2021-03-11 17:36:33 -08:00
Jay Foad 7340fd6886 [MCA] Support in-order CPUs with MicroOpBufferSize=1
Differential Revision: https://reviews.llvm.org/D98356
2021-03-11 10:12:54 +00:00
Djordje Todorovic 9f41c03f82 [Debugify][OriginalDIMode] Export the report into JSON file
By using the original-di check with debugify in the combination with
the llvm/utils/llvm-original-di-preservation.py it becomes very user
friendly tool. An example of the HTML page with the issues
related to debug info can be found at [0].

[0] https://djolertrk.github.io/di-checker-html-report-example/

Differential Revision: https://reviews.llvm.org/D82546
2021-03-11 01:11:13 -08:00
Zequan Wu 8d5c3ae357 Revert "[llvm-cov] reset executation count to 0 after wrapped segment"
This reverts D85036

Differential Revision: https://reviews.llvm.org/D98084
2021-03-09 14:47:32 -08:00
Alexander Shaposhnikov ede56e5127 [llvm-objcopy][MachO] Add support for --keep-undefined
This diff introduces --keep-undefined in llvm-objcopy/llvm-strip for Mach-O
which makes the tools preserve undefined symbols.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D97040
2021-03-08 18:57:25 -08:00
Alexander Shaposhnikov 5f2f84a68a [llvm-objdump][MachO] Add support for dumping function starts
Add support for dumping function starts for Mach-O binaries.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D97027
2021-03-08 18:44:44 -08:00
Rahman Lavaee c245c21c43 [llvm-readelf] Support dumping the BB address map section with --bb-addr-map.
This patch lets llvm-readelf dump the content of the BB address map
section in the following format:
```
Function {
  At: <address>
  BB entries [
    {
      Offset:   <offset>
      Size:     <size>
      Metadata: <metadata>
    },
    ...
  ]
}
...
```

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D95511
2021-03-08 16:20:11 -08:00
wlei c460ef61d6 [CSSPGO][llvm-profgen] Change sample count of dangling probe in llvm-profgen
Differential Revision: https://reviews.llvm.org/D96811
2021-03-08 14:36:02 -08:00
Hongtao Yu e68fafa49f [CSSPGO] llvm-profdata support for CS profile.
Context-sensitive AutoFDO profile has a different name scheme where full calling contexts are encoded as function names. When processing CS proifle, llvm-profdata should use full contexts instead of leaf function names.

Reviewed By: wmi, wenlei, wlei

Differential Revision: https://reviews.llvm.org/D97998
2021-03-08 09:04:40 -08:00
Keith Smiley 64240f8138 llvm-nm: add flag to suppress no symbols warning
This spelling matches binutils https://sourceware.org/bugzilla/show_bug.cgi?id=27408

Differential Revision: https://reviews.llvm.org/D83152
2021-03-07 16:20:13 -08:00
Abhina Sreeskantharajan c52fe0b021 [test] Use host platform specific error message substitution in lit tests
This patch uses the errno python library to print out the correct error messages instead of hardcoding the error message per platform.

Reviewed By: jhenderson, ASDenysPetrov

Differential Revision: https://reviews.llvm.org/D97472
2021-03-05 07:21:53 -05:00
James Henderson 076698154a [llvm-objcopy] Fix crash for binary input files with non-ascii names
The code was using the standard isalnum function which doesn't handle
values outside the non-ascii range. Switching to using llvm::isAlnum
instead ensures we don't provoke undefined behaviour, which can in some
cases result in crashes.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D97663
2021-03-05 08:57:40 +00:00
James Henderson 47c343d768 [llvm-objcopy][test] Fix test that could have passed spuriously
The test was showing that when --strip-unneeded is specified for an
executable, all the symbols are stripped. However, the set of symbols
used in the test would be stripped by --strip-unneeded for an ET_REL
object too. Fix this by adding additional symbols that aren't normally
stripped by --strip-unneeded.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D97664
2021-03-05 08:57:39 +00:00
Haowei Wu db06088d63 [llvm-ifs] Add option to use InterfaceStub library
This change adds '-use-interfacestub' option to allow llvm-ifs
to use InterfaceStub lib when generating ELF binary.

Differential Revision: https://reviews.llvm.org/D94461
2021-03-04 11:28:49 -08:00
James Henderson 1562e4552c [llvm-objcopy][llvm-strip][test] Improve testing
This patch adds a number of new test cases that cover various
llvm-objcopy and llvm-strip features that had missing test coverage of
various descriptions:
* --add-section - checked the shdr properties, not just the content.
* Dedicated test case for --add-symbol when there are many sections.
* Show that --change-start accepts negative values without overflow.
  This was previously present but got lost between review versions.
* --dump-section - show that multiple sections can be dumped
  simultaneously to different files, and that an error is reported when
  a section cannot be found.
* --globalize-symbol(s) - show that symbols that are not mentioned are
  not globalized, if they would otherwise be, and that missing symbols
  from the list do not cause problems.
* --keep-global-symbol - show that the --regex option can be used in
  conjunction with this option.
* --keep-symbol - show that the --regex option can be used in
  conjunction with this option.
* --localize-symbol(s) - show that symbols that are not mentioned are
  not localized, if they would otherwise be, and that missing symbols
  from the list do not cause problems.
* --prefix-alloc-sections - show the behaviour of an empty string
  argument and multiple arguments.
* --prefix-symbols - show the behaviour of an empty string argument and
  multiple arguments. Also show the option applies to undefined symbols.
* --redefine-symbol - show that symbols with no name can be renamed,
  that it is not an error if a symbol is not specified, and that the
  option doesn't chain (i.e. --redefine-sym a=b --redefine-sym b=c does
  not redefine a as c).
* --rename-section - show that all section flags are preserved if none
  are specified. Also show that the option does not chain.
* --set-section-alignment - show that only specified sections have
  their alignments changed.
* --set-section-flags - show which section flags are preserved when this
  option is used. Also show that unspecified sections are not affected.
* --preserve-dates - show that -p is an alias of --preserve-dates.
* --strip-symbol - show that --regex works with this option for
  llvm-objcopy as well as llvm-strip.
* --strip-unneeded-symbol(s) - show more clearly that needed symbols are
  not stripped even if requested by this option.
* --allow-broken-links - show the sh_link of a symbol table is set to 0
  when its string table has been removed when this option is specified.
* --weaken-symbol(s) - show that symbols that are not mentioned are not
  weakened, if they would otherwise be, and that missing symbols from
  the list do not cause problems.
* --wildcard - show the wildcard behaviour for several options that were
  previously unchecked.

Reviewed by: alexshap

Differential Revision: https://reviews.llvm.org/D97666
2021-03-04 11:32:27 +00:00
Oliver Stannard aac056c528 [objdump][ARM] Use correct offset when printing ARM/Thumb branch targets
llvm-objdump only uses one MCInstrAnalysis object, so if ARM and Thumb
code is mixed in one object, or if an object is disassembled without
explicitly setting the triple to match the ISA used, then branch and
call targets will be printed incorrectly.

This could be fixed by creating two MCInstrAnalysis objects in
llvm-objdump, like we currently do for SubtargetInfo. However, I don't
think there's any reason we need two separate sub-classes of
MCInstrAnalysis, so instead these can be merged into one, and the ISA
determined by checking the opcode of the instruction.

Differential revision: https://reviews.llvm.org/D97766
2021-03-04 11:15:57 +00:00
Andrew Savonichev d791695cb5 [MCA] Add support for in-order CPUs
This patch adds a pipeline to support in-order CPUs such as ARM
Cortex-A55.

In-order pipeline implements a simplified version of Dispatch,
Scheduler and Execute stages as a single stage. Entry and Retire
stages are common for both in-order and out-of-order pipelines.

Differential Revision: https://reviews.llvm.org/D94928
2021-03-04 14:08:19 +03:00
James Henderson 8bb74d16ef [llvm-objcopy/strip] Fix off-by-one error in SYMTAB_SHNDX need check
The check for whether an extended symbol index table was required
dropped the first SHN_LORESERVE sections from the sections array before
checking whether the remaining sections had symbols. Unfortunately, the
null section header is not present in this list, so the check was
skipping the first section that might be important. If that section
contained a symbol, and no subsequent ones did, the .symtab_shndx
section would not be emitted, leading to a corrupt object.

Also consolidate and expand test coverage in the area to cover this bug
and other aspects of the SYMTAB_SHNDX section.

Reviewed by: alexshap, MaskRay

Differential Revision: https://reviews.llvm.org/D97661
2021-03-04 10:23:45 +00:00
James Henderson 49c91a64fd [llvm-objcopy][test] Improve many-sections object and test case
Additionally do some test tidy-ups and improve coverage of symbol
section indexes where the logical section index >= SHN_LORESERVE.

The symbol and section names in the many-section input object were
mostly shared. This patch changes them to be distinct, enabling
different operations such as --add-symbol, to be more targeted, when
using the object. It also makes the test less confusing and removes some
oddness in the symbol table order, presumably caused by the duplicate
names.

The input object was built from assembly that was of the form:
.section s1
sym1:
.section s2
sym2:
...
with a total of 65536 such occurrences. llvm-objcopy was then used to
remove the empty .text section automatically generated by MC, and
incidentally to move .strtab to the end of the object. This ensured that
the section/symbol indexes matched their name (i.e. section index 1 was
s1, section index 2 was s2 etc, and sym1 was in s1, sym2 in s2 etc).

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D97660
2021-03-04 09:42:43 +00:00
Hongtao Yu ad2a59f584 [CSSPGO] Introducing dangling pseudo probes.
Dangling probes are the probes associated to an empty block. This usually happens when all real instructions are optimized away from the block. There is a problem with dangling probes during the offline counts processing. The way the sample profiler works is that samples collected on the first physical instruction following a probe will be counted towards the probe. This logically equals to treating the instruction next to a probe as if it is from the same block of the probe. In the dangling probe case, the real instruction following a dangling probe actually starts a new block, and samples collected on the new block may cause issues when counted towards the empty block.

To mitigate this issue, we first try to move around a dangling probe inside its owning block. If there are still native instructions preceding the probe in the same block, we can then use them as a place holder to collect samples for the probe. A pass is added to walk each block backwards looking for probes not followed by any real instruction and moving them before the first real instruction. This is done right before the object emission.

If we are unlucky to find such in-block preceding instructions for a probe, the solution we are taking is to tag such probe as dangling so that the samples reported for them will not be trusted by the compiler. We leave it up to the counts inference algorithm to get such probes a reasonable count. The number `UINT64_MAX` is used to mark sample count as collected for a dangling probe.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D95962
2021-03-03 22:44:41 -08:00
Fangrui Song 584cb67d2d [IRSymTab] Set FB_used on llvm.compiler.used symbols
IR symbol table does not parse inline asm. A symbol only referenced by inline
asm is not in the IR symbol table, so LTO does not know that the definition (in
another translation unit) is referenced and may internalize it, even if that
definition has `__attribute__((used))` (which lowers to `llvm.compiler.used` on
ELF targets since D97446).

```
// cabac.c
__attribute__((used)) const uint8_t ff_h264_cabac_tables[...] = {...};

// h264_cabac.c
  asm("lea ff_h264_cabac_tables(%rip), %0" : ...);
```

`__attribute__((used))` is the recommended way to tell the compiler there may
be inline asm references, so the usage is perfectly fine. This patch
conservatively sets the `FB_used` bit on `llvm.compiler.used` symbols to work
around the IR symbol table limitation. Note: before D97446, Clang never emitted
symbols in the `llvm.compiler.used` list, so this change does not punish any
Clang emitted global object.

Without the patch, `ff_h264_cabac_tables` may be assigned to a non-external
partition and get internalized. Then we will get a linker error because the
`cabac.c` definition is not exposed.

Differential Revision: https://reviews.llvm.org/D97755
2021-03-03 16:22:30 -08:00
serge-sans-paille 7b319df29b Revert "Use the default seed value for djb hash for StringMap"
This reverts commit d84440ec91.

It breaks (at least) lldb and lld validation

https://lab.llvm.org/buildbot/#/builders/68/builds/7837
https://lab.llvm.org/buildbot/#/builders/36/builds/5495
2021-03-01 14:00:39 +01:00
serge-sans-paille d84440ec91 Use the default seed value for djb hash for StringMap
See original comment in 560ce2c70f
Baiscally the default seed value results in less collision, but changes the
iteration order, which matters for a few test cases.

Differential Revision: https://reviews.llvm.org/D97396
2021-03-01 13:21:27 +01:00
Fangrui Song 599711dce5 [llvm-dwarfdump] StringMap -> MapVector to make iteration order stable
Exposed by D97396
2021-02-25 20:05:05 -08:00
Fangrui Song 17b4e695ce [llvm-objcopy] If input=output, preserve umask bits, otherwise drop S_ISUID/S_ISGID bits
This makes the behavior similar to cp

```
chmod u+s,g+s,o+x a
sudo llvm-strip a -o b
// With this patch, b drops set-user-ID and set-group-ID bits.
// sudo cp a b => b does not have set-user-ID or set-group-ID bits.
```

This also changes the behavior for the following case:

```
chmod u+s,g+s,o+x a
llvm-strip a
// a preserves set-user-ID and set-group-ID bits.
// This matches binutils<2.36 and probably >=2.37.  2.36 and 2.36.1 have some compatibility issues.
```

Differential Revision: https://reviews.llvm.org/D97253
2021-02-24 11:10:09 -08:00
Matthew Voss 6da7d31416 [llvm-profdata] Emit Error when Invalid MemOpSize Section is Created by llvm-profdata
Under certain (currently unknown) conditions, llvm-profdata is outputting
profiles that have two consecutive entries in the MemOPSize section for the
value 0. This causes the PGOMemOPSizeOpt pass to output an invalid switch
instruction with two cases for 0. As mentioned, we’re not quite sure what’s
causing this to happen, but this patch prevents llvm-profdata from outputting a
profile that has this problem and gives an error with a request for a
reproducible.

Differential Revision: https://reviews.llvm.org/D92074
2021-02-23 12:51:54 -08:00
Rahman Lavaee 9f52708660 [obj2yaml,yaml2obj] Add NumBlocks to the BBAddrMapEntry yaml field.
As discussed in D95511, this allows us to encode invalid BBAddrMap
sections to be used in more rigorous testing.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96831
2021-02-22 18:08:26 -08:00
Tim Shen a0757d8ebd Patch by @wecing (Chenguang Wang).
The current getFoldedSizeOf() implementation uses naive recursion, which
could be really slow when the input structure type is too complex.

This issue was first brought up in
http://llvm.org/bugs/show_bug.cgi?id=8281; this change fixes it by
adding memoization.

Differential Revision: https://reviews.llvm.org/D6594
2021-02-19 12:44:17 -08:00
Haowei Wu 784c7debb2 [elfabi] Fix a bug when .dynsym contains no non-local symbol
This patch fixed a bug when elbabi was supplied with a tbe file
contains no non-local symbol. Before this patch, it wrote 0 to
sh_info of the .dynsym section, making the ELF stub file invalid.
This patch fixed this issue.

Differential Revision: https://reviews.llvm.org/D96930
2021-02-19 11:36:53 -08:00
Djordje Todorovic b6db47d7e0 [llvm-dwarfdump][locstats] Unify handling of inlined vars with no loc
The presence or absence of an inline variable (as well as formal
parameter) with only an abstract_origin ref (without DW_AT_location)
should not change the location coverage.

It means, for both:

DW_TAG_inlined_subroutine
  DW_AT_abstract_origin (0x0000004e "f")
  DW_AT_low_pc  (0x0000000000000010)
  DW_AT_high_pc (0x0000000000000013)
  DW_TAG_formal_parameter
    DW_AT_abstract_origin       (0x0000005a "b")

and,

DW_TAG_inlined_subroutine
   DW_AT_abstract_origin (0x0000004e "f")
   DW_AT_low_pc  (0x0000000000000010)
   DW_AT_high_pc (0x0000000000000013)

we should report 0% location coverage. If we add DW_AT_location,
for both cases the coverage should be improved.

Differential Revision: https://reviews.llvm.org/D96045
2021-02-19 05:38:01 -08:00
Nikita Popov 2f17ed294f [DCE] Don't remove non-willreturn calls
In both ADCE and BDCE (via DemandedBits) we should not remove
instructions that are not guaranteed to return. This issue was
pointed out by fhahn in the recent llvm-dev thread.

Differential Revision: https://reviews.llvm.org/D96993
2021-02-19 12:35:40 +01:00
Qiu Chaofan 9d2f06445f [llvm-exegesis] Ignore instructions using custom inserter
Some instructions defined in table-gen files sets usesCustomInserter
bit, which means it has to be lowered by target code and isn't actually
valid instruction at MC level. So we should treat them like pseudo
instructions.

Reviewed By: gchatelet

Differential Revision: https://reviews.llvm.org/D94898
2021-02-19 17:04:27 +08:00
Qiu Chaofan d7d4dca15f [llvm-exegesis] [PowerPC] Add basic LIT test
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D94897
2021-02-19 17:01:05 +08:00
Petr Hosek 5fbd1a333a [Coverage] Store compilation dir separately in coverage mapping
We currently always store absolute filenames in coverage mapping.  This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines.  We are also
duplicating the path prefix potentially wasting space.

This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.

Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.

Differential Revision: https://reviews.llvm.org/D95753
2021-02-18 14:34:39 -08:00
Petr Hosek fbf8b957fd Revert "[Coverage] Store compilation dir separately in coverage mapping"
This reverts commit 97ec8fa5bb since
the test is failing on some bots.
2021-02-18 12:50:24 -08:00
Petr Hosek 97ec8fa5bb [Coverage] Store compilation dir separately in coverage mapping
We currently always store absolute filenames in coverage mapping.  This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines.  We are also
duplicating the path prefix potentially wasting space.

This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.

Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.

Differential Revision: https://reviews.llvm.org/D95753
2021-02-18 12:27:42 -08:00
Fangrui Song 018a484cd2 [llvm-objdump] Map STT_TLS to ST_Other (previously ST_Data)
ST_Data is used to model BFD `BFD_OBJECT`.
A STT_TLS symbol does not have the `BFD_OBJECT` flag in BFD.
This makes sense because a STT_TLS symbol is like in a different address space,
normal data/object properties do not apply on them.

With this change, a STT_TLS symbol will not be displayed as 'O'.
This new behavior matches objdump.

Differential Revision: https://reviews.llvm.org/D96735
2021-02-17 23:17:20 -08:00
Stanislav Mekhanoshin a8d9d50762 [AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
2021-02-17 16:01:32 -08:00
Rahman Lavaee 0252e6ead1 [obj2yaml,yaml2obj] Add NumBlocks to the BBAddrMapEntry yaml field.
As discussed in D95511, this allows us to encode invalid BBAddrMap
sections to be used in more rigorous testing.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96831
2021-02-17 15:45:13 -08:00
Teresa Johnson 50ac3b1d78 [gold] Match lld WPD behavior for shared library symbols and add test
lld already marks shared library defs as ExportDynamic, which prevents
potentially unsafe devirtualization of symbols defined in shared
libraries. Match that behavior in the gold plugin, and add the same
test.

Depends on D96721.

Differential Revision: https://reviews.llvm.org/D96722
2021-02-17 15:28:49 -08:00
Alexander Shaposhnikov cdcb60a820 [llvm-libtool] Emit warnings for files without symbols
1. Emit warnings for files without symbols.
2. Add -no_warning_for_no_symbols.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D95843
2021-02-16 17:52:12 -08:00
Simonas Kazlauskas 6ffcb2937c [llvm-dwp] Join dwo paths correctly when DWOPath is absolute
When the `DWOPath` is absolute, we want to use `DWOPath` as is, without prepending any other
components to the path. The `sys::path::append` does not join, but rather unconditionally appends
the paths, so something like `sys::path::append("/tmp", "/tmp/banana")` will result in
`/tmp/tmp/banana` rather than the desired `/tmp/banana`.

This then causes `llvm-dwp` to fail in a following situation:

```
$ clang -gsplit-dwarf /tmp/banana/test.c -c -o /tmp/outdir/foo.o
$ clang outdir/foo.o -o outdir/hm
$ llvm-dwarfdump outdir/hm | grep -C2 foo.dwo
                  DW_AT_comp_dir    ("/tmp")
                  DW_AT_GNU_pubnames  (true)
                  DW_AT_GNU_dwo_name    ("/tmp/outdir/foo.dwo")
                                DW_AT_GNU_dwo_id    (0xde4d396f3bf0e257)
                  DW_AT_low_pc  (0x0000000000401100)
$ strace -o trace llvm-dwp -e outdir/hm -o outdir/hm.dwp
error: No such file or directory
$ cat trace | grep foo.dwo
openat(AT_FDCWD, "/tmp/tmp/outdir/foo.dwo", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
```

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D96678
2021-02-16 13:38:35 -08:00
James Henderson c96fee98db [llvm-symbolizer][test] Add explicit tests for CODE and DATA
These directives force the associated address to be interpreted as a
function or data respectively. CODE is the default when not specified.

Differential Revision: https://reviews.llvm.org/D96712

Reviewed by: MaskRay
2021-02-16 10:59:25 +00:00
Fangrui Song c465429f28 [llvm-objcopy] Delete --build-id-link-{dir,input,output}
The few options are niche. They solved a problem which was traditionally solved
with more shell commands (`llvm-readelf -n` fetches the Build ID. Then
`ln` is used to hard link the file to a directory derived from the Build ID.)

Due to limitation, they are no longer used by Fuchsia and they don't appear to
be used elsewhere (checked with Google Search and Debian Code Search). So delete
them without a transition period.

Announcement: https://lists.llvm.org/pipermail/llvm-dev/2021-February/148446.html

Differential Revision: https://reviews.llvm.org/D96310
2021-02-15 11:17:32 -08:00
James Henderson 37c89803d8 [llvm-nm][test] Add additional test coverage for llvm-nm options
Some of these options have a degree of incidental coverage, or are for
Mach-O only. This patch adds dedicated ELF (where applicable) coverage.

Differential Revision: https://reviews.llvm.org/D96602

Reviewed by: rupprecht, Higuoxing
2021-02-15 13:54:08 +00:00
James Henderson 94828afd0a [llvm-nm] Tidy up error messages
This adds colons to separate the file name from the message, removes a
duplicate space, and removes a trailing full stop from some messages.
These help bring the error messages into line with other tools, as well
as making all llvm-nm message more self-consistent.

Differential Revision: https://reviews.llvm.org/D96601

Reviewed by: Higuoxing, rupprecht, MaskRay
2021-02-15 13:54:08 +00:00
Florian Hahn c70737ba1d Recommit "[LTO] Use lto::backend for code generation."
This version of the patch includes a fix for the cfi failures.

(undoes the revert commit 7db390cc77)

It also undoes reverts of follow-up patches that also needed reverting
originally:

  * [LTO] Add option enable NewPM with LTOCodeGenerator.
    (undoes revert commit 0a17664b47)

  * [LTOCodeGenerator] Use lto::Config for options (NFC)."
    (undoes revert commit b0a8e41cff)
2021-02-15 10:05:42 +00:00
Teresa Johnson 95a695bea4 [gold] Add case being tested by equivalent lld test
The new tests added by 1487747e99 for lld
and gold plugin were largely equivalent, but the gold one was missing
one of the cases added to lld. Add that test to the gold plugin version.
2021-02-13 17:09:56 -08:00
Fangrui Song 39db16e75b [test] Make ELF tests less reliant on the lexicographical order of non-local symbols 2021-02-13 01:01:06 -08:00
wlei afd8bd601e [CSSPGO][llvm-profgen] Filter out the instructions without location info for symbolizer
It appears some instructions doesn't have the debug location info and the symbolizer will return an empty call stack for them which will cause some crash later in profile unwinding. Actually we do not record the sample info for them, so this change just filter out those instruction.

As those instruction would appears at the begin and end of the instruction list, without them we need to add the boundary check for IP `advance` and `backward`.

Also for pseudo probe based profile, we actually don't need the symbolized location info, so here just change to use an empty stack for it. This could save half of the binary loading time.

Differential Revision: https://reviews.llvm.org/D96434
2021-02-12 16:47:49 -08:00
wlei 426e326a19 [CSSPGO][llvm-profgen] Renovate perfscript check and command line input validation
This include some changes related with PerfReader's the input check and command line change:

1) It appears there might be thousands of leading MMAP-Event line in the perfscript for large workload. For this case, the 4k threshold is not eligible to determine it's a hybrid sample. This change renovated the `isHybridPerfScript` by going through the script without threshold limitation checking whether there is a non-empty call stack immediately followed by a LBR sample. It will stop once it find a valid one.

2) Added several input validations for the command line switches in PerfReader.

3) Changed the command line `show-disassembly` to `show-disassembly-only`, it will print to stdout and exit early which leave an empty output profile.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D96387
2021-02-12 15:18:50 -08:00
Hongtao Yu 0b1914e83a [ThinLTO][gold] Fix filenaming scheme for tasks.
The gold LTO plugin uses a set of hooks to implements emit-llvm and capture intermediate file generated during LTO. The hooks are called by each lto backend thread with a taskID as argument to differentiate between threads and tasks. Currently, all threads are overwriting the same file which results into only the intermediate output of the last backend thread to be preserved. This diff encodes the taskID into the filename.

Reviewed By: tejohnson, wenlei

Differential Revision: https://reviews.llvm.org/D96173
2021-02-12 09:40:08 -08:00
wlei c3aeabaea1 [CSSPGO][llvm-profgen] Add brackets for context id to support extended binary format
To align with https://reviews.llvm.org/D95547, we need to add brackets for context id before initializing the `SampleContext`.

Also added test cases for extended binary format from llvm-profgen side.

Differential Revision: https://reviews.llvm.org/D95929
2021-02-12 01:14:53 -08:00
Adrian Prantl 97dbab8797 llvm-dwarfdump: fix the counting when printing DW_OP_entry_value
The block size is in bytes, and not number of operands.

Differential Revision: https://reviews.llvm.org/D96472
2021-02-11 11:17:04 -08:00
Todd Lipcon 747c450e6f
Fix JSON formatting when converting to trace event format
Reviewed By: dberris

Differential Revision: https://reviews.llvm.org/D96384
2021-02-10 13:00:28 +11:00
Vinicius Tinti 8b4a727281 [llvm-objdump][test] Fix --prefix tests for system-windows
Merging directories and files may produce different results on different
platforms.

Merging "./Inputs" and "source-interleave-x86_64.c" will use different
separators in POSIX and Windows.

Dedicated tests are needed for dealing with removing trailing separators
for POSIX (consider only '/') and Windows (consider '/' and '\').

Fixes D85024.
Fixes PR46368.

Reviewed By: jhenderson, MaskRay

Differential revision: https://reviews.llvm.org/D95513
2021-02-09 21:54:51 +00:00
Alex Richardson 7dc3136033 [llvm-readobj] Add support for decoding FreeBSD ELF notes
The current support only printed coredump notes, but most binaries also
contain notes. This change adds names for four FreeBSD-specific notes and
pretty-prints three of them:

NT_FREEBSD_ABI_TAG:
This note holds a 32-bit (decimal) integer containing the value of the
__FreeBSD_version macro, which is defined in crt1.o and will hold a value
such as 1300076 for a binary build on a FreeBSD 13 system.

NT_FREEBSD_ARCH_TAG:
A string containing the value of the build-time MACHINE_ARCH

NT_FREEBSD_FEATURE_CTL: A 32-bit flag that indicates to the kernel that
the binary wants certain bevahiour. Examples include setting
NT_FREEBSD_FCTL_ASLR_DISABLE which tells the kernel to disable ASLR.

After this change llvm-readobj also no longer decodes coredump-only
FreeBSD notes in non-coredump files. I've also converted the
note-freebsd.s test to use yaml2obj instead of llvm-mc.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D74393
2021-02-09 16:59:22 +00:00
Alex Richardson 135df21248 [llvm-readelf] Print raw ELF note contents if we can't parse it
Currently, if the note name is known, but the value isn't we don't print
the contents.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D74367
2021-02-09 16:59:22 +00:00
Alex Richardson d613d8eb0e [yaml2obj] Handle NT_* string values in for ELF note types
This is required for D74393.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D95953
2021-02-09 16:59:22 +00:00
Alex Richardson f4670fbfff [llvm-readobj] Print empty line between note sections in GNU mode
This matches GNU binutils.

Reviewed By: rupprecht, jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D96010
2021-02-09 16:59:21 +00:00
Fangrui Song 3e837e1735 [llvm-objcopy][test] Stablize build-id-link-dir.test 2021-02-08 17:22:22 -08:00
Mircea Trofin f31ea86c80 Revert "[Utils] Add a switch controlling prefix warnings in UpdateTestChecks"
This reverts commit 87f8a08ce3.
2021-02-08 11:21:56 -08:00
Fangrui Song 157ac423e0 [llvm-objdump] Support PLT decoding for aarch64_be
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96211
2021-02-08 08:50:26 -08:00
Fangrui Song 50e4e385a4 LLVMgold.so: Fix tests after D95380 2021-02-04 21:14:36 -08:00
wlei dd9e219014 [CSSPGO][llvm-profgen] Fix bug with parsing hybrid sample trace line
when we skip the call stack starting with an external address, we should also skip the bottom LBR entry, otherwise it will cause a truncated context issue.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D95480
2021-02-04 16:15:05 -08:00
wlei e10b73f646 [CSSPGO][llvm-profgen] Merge and trim profile for cold context to reduce profile size
This change allows merging and trimming cold context profile in llvm-profgen to solve profile size bloat problem. Currently when the profile's total sample is below threshold(supported by a switch), it will be considered cold and merged into a base context-less profile, which will at least keep the profile quality as good as the baseline(non-cs).

For example, two input profiles:
 [main @ foo @ bar]:60
 [main @ bar]:50
Under threshold = 100, the two profiles will be merge into one with the base context, get result:
 [bar]:110

Added two switches:
`--csprof-cold-thres=<value>`: Specified the total samples threshold for a context profile to be considered cold, with 100 being the default. Any cold context profiles will be merged into context-less base profile by default.
`--csprof-keep-cold`: Force profile generation to keep cold context profiles instead of dropping them. By default, any cold context will not be written to output profile.

Results:
Though not yet evaluating it with the latest CSSPGO, our internal branch shows neutral on performance but significantly reduce the profile size. Detailed evaluation on llvm-profgen with CSSPGO will come later.

Differential Revision: https://reviews.llvm.org/D94111
2021-02-04 11:05:03 -08:00
Fangrui Song eecbb1c776 [llvm-objdump] --source: drop the warning when there is no debug info
Warnings have been added for three cases (PR41905): (1) missing debug info, (2)
the source file cannot be found, (3) the debug info points at a line beyond the
end of the file.

(1) is probably less useful. This was brought up once on
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141264.html and two
internal users mentioned it to me that it was annoying. (I personally
find the warning confusing, too.)

Users specify --source to get additional information if sources happen to be
available.  If sources are not available, it should be obvious as the output
will have no interleaved source lines. The warning can be especially annoying
when using llvm-objdump -S on a bunch of files.

This patch drops the warning when there is no debug info.
(If LLVMSymbolizer::symbolizeCode returns an `Error`, there will still be
an error. There is currently no test for an `Error` return value.
The only code path is probably a broken symbol table, but we probably already emit a warning
in that case)

`source-interleave-prefix.test` has an inappropriate "malformed" test - the test simply has no
.debug_* because new llc does not produce debug info when the filename is empty (invalid).
I have tried tampering the header of .debug_info/.debug_line but llvm-symbolizer does not warn.
This patch does not intend to add the missing test coverage.

Differential Revision: https://reviews.llvm.org/D88715
2021-02-04 09:07:44 -08:00
wlei ac14bb14e7 [CSSPGO][llvm-profgen] Compress recursive cycles in calling context
This change compresses the context string by removing cycles due to recursive function for CS profile generation. Removing recursion cycles is a way to normalize the calling context which will be better for the sample aggregation and also make the context promoting deterministic.
Specifically for implementation, we recognize adjacent repeated frames as cycles and deduplicated them through multiple round of iteration.
For example:
Considering a input context string stack:
[“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For first iteration,, it removed all adjacent repeated frames of size 1:
[“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For second iteration, it removed all adjacent repeated frames of size 2:
[“a”, “b”, “c”, “a”, “b”, “c”, “d”]
So in the end, we get compressed output:
[“a”, “b”, “c”, “d”]

Compression will be called in two place: one for sample's context key right after unwinding, one is for the eventual context string id in the ProfileGenerator.
Added a switch `compress-recursion` to control the size of duplicated frames, default -1 means no size limit.
Added unit tests and regression test for this.

Differential Revision: https://reviews.llvm.org/D93556
2021-02-03 22:16:07 -08:00
wlei 6bccdcdb35 Revert "[CSSPGO][llvm-profgen] Compress recursive cycles in calling context"
This reverts commit 0609f257dc.
2021-02-03 22:16:05 -08:00
wlei 0609f257dc [CSSPGO][llvm-profgen] Compress recursive cycles in calling context
This change compresses the context string by removing cycles due to recursive function for CS profile generation. Removing recursion cycles is a way to normalize the calling context which will be better for the sample aggregation and also make the context promoting deterministic.
Specifically for implementation, we recognize adjacent repeated frames as cycles and deduplicated them through multiple round of iteration.
For example:
Considering a input context string stack:
[“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For first iteration,, it removed all adjacent repeated frames of size 1:
[“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For second iteration, it removed all adjacent repeated frames of size 2:
[“a”, “b”, “c”, “a”, “b”, “c”, “d”]
So in the end, we get compressed output:
[“a”, “b”, “c”, “d”]

Compression will be called in two place: one for sample's context key right after unwinding, one is for the eventual context string id in the ProfileGenerator.
Added a switch `compress-recursion` to control the size of duplicated frames, default -1 means no size limit.
Added unit tests and regression test for this.

Differential Revision: https://reviews.llvm.org/D93556
2021-02-03 18:50:14 -08:00
wlei c82b24f475 [CSSPGO][llvm-profgen] Pseudo probe based CS profile generation
This change implements profile generation infra for pseudo probe in llvm-profgen. During virtual unwinding, the raw profile is extracted into range counter and branch counter and aggregated to sample counter map indexed by the call stack context. This change introduces the last step and produces the eventual profile. Specifically, the body of function sample is recorded by going through each probe among the range and callsite target sample is recorded by extracting the callsite probe from branch's source.

Please refer https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.

**Implementation**

- Extended `PseudoProbeProfileGenerator` for pseudo probe based profile generation.
- `populateBodySamplesWithProbes` reading range counter is responsible for recording function body samples and inferring caller's body samples.
- `populateBoundarySamplesWithProbes` reading branch counter is responsible for recording call site target samples.
- Each sample is recorded with its calling context(named `ContextId`). Remind that the probe based context key doesn't include the leaf frame probe info, so the `ContextId` string is created from two part: one from the probe stack strings' concatenation and other one from the leaf frame probe.
- Added regression test

Test Plan:

ninja & ninja check-llvm

Differential Revision: https://reviews.llvm.org/D92998
2021-02-03 16:21:53 -08:00
Florian Hahn 7db390cc77
Revert "[LTO] Use lto::backend for code generation."
This reverts commit 6a59f05606, because
it is causing failures on green dragon.
2021-02-03 22:49:30 +00:00
Abhina Sreeskantharajan e59d336e75 [test] Use host platform specific error message substitution in lit tests - continued
On z/OS, other error messages are not matched correctly in lit tests.

```
EDC5121I Invalid argument.
EDC5111I Permission denied.
```

This patch adds a lit substitution to fix it.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D95808
2021-02-03 09:53:22 -05:00
Fangrui Song 1560a00032 [yaml2obj/obj2yaml/llvm-readobj] Support SHF_GNU_RETAIN
In binutils, the flag is defined for ELFOSABI_GNU and ELFOSABI_FREEBSD.
It can be used to mark a section as a GC root.

In practice, the flag has generic semantics and can be applied to many
EI_OSABI values, so we consider it generic.

Differential Revision: https://reviews.llvm.org/D95728
2021-02-02 09:19:53 -08:00
Andrew Ng 94fedd2661 [X86] Fix disassembly of x86-64 GDTLS code sequence
For x86-64 the REX.w prefix takes precedence over any other size
override (i.e. 0x66). Therefore, for x86-64 when REX.w is present set
'hasOpSize' to false to ensure that any size override is ignored.

Fixes PR48901.

Differential Revision: https://reviews.llvm.org/D95682
2021-02-02 11:35:00 +00:00
Mircea Trofin 87f8a08ce3 [Utils] Add a switch controlling prefix warnings in UpdateTestChecks
The switch controls both unused prefix warnings, and warnings about
functions which differ under different runs for a prefix, and, thus, end
up not having asserts for that prefix.

(If the latter case spans to all functions, then the former case kicks
in)

The switch is on by default, and can be disabled.

Differential Revision: https://reviews.llvm.org/D95829
2021-02-01 18:04:18 -08:00
Rahman Lavaee f1ff6d210a [obj2yaml, yaml2obj] Use Hex64 for BBAddressMap fields.
This patch let the yaml encoding use Hex64 values for NumBlocks, BB AddressOffset, BB Size, and BB Metadata.
Additionally, it changes the decoded values in elf2yaml to uint64_t to match DataExtractor::getULEB128 return type.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D95767
2021-02-01 15:37:30 -08:00
Patrick Oppenlander 93345e825a [llvm-objcopy] -O binary: consider SHT_NOBITS sections to be empty
This is consistent with BFD objcopy.

Previously llvm objcopy would allocate space for SHT_NOBITS sections
often resulting in enormous binary files.

New test case (binary-paddr.test %t6).

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D95569
2021-02-01 15:01:25 -08:00
Haowei Wu 771b359654 [elfabi] Fix tests which failed on different timezones
This patch fixes elfabi tests on machines using a GMT+X timezone
settings.

Differential Revision: https://reviews.llvm.org/D95641
2021-02-01 10:58:55 -08:00
Teresa Johnson 7f2e0879b5 [LTO] Move part of gold devirt test to v1.16 directory
Part of the gold test added in 1487747e99
relies on more recent fixes to gold that fix the plugin behavior with
--export-dynamic-symbol and --dynamic-list. Extract those parts of the
new test into a v1.16 test.
2021-02-01 09:53:11 -08:00
Alexey Lapshin fb244ffb9f [dsymutil][DWARFLinker][NFC] make AddressManager not depending on the order of checks for relocations.
Current dsymutil implementation of hasLiveMemoryLocation()/hasLiveAddressRange()
and applyValidRelocs() assume that calls should be done in certain order
(from first Dies to last). Multi-thread implementation might call these methods
in other order(it might process compilation units in order other than they are physically
located), so we remove restriction that searching for relocations should be done
in ascending order. This change does not introduce noticable performance degradation.
The testing results for clang binary:

golden-dsymutil/dsymutil  23787992
clang MD5: 5efa8fd9355ebf81b65f24db5375caa2
elapsed time=91sec

build-Release/bin/dsymutil 23855616
clang MD5: 5efa8fd9355ebf81b65f24db5375caa2
elapsed time=91sec

Differential Revision: https://reviews.llvm.org/D93106
2021-01-31 16:34:10 +03:00
Georgii Rymar d221406875 [llvm-symbolizer] - Fix the crash in GNU output style with --no-inlines and missing input file.
Fixes https://bugs.llvm.org/show_bug.cgi?id=48882.

If the input file does not exist (or has a reading error), the
following code will crash if there are two or more input addresses.

```
auto ResOrErr = Symbolizer.symbolizeInlinedCode(
  ModuleName, {Offset, object::SectionedAddress::UndefSection});
Printer << (error(ResOrErr) ? DILineInfo() : ResOrErr.get().getFrame(0));
```

For the first address, `symbolizeInlinedCode` returns an error.
For the second address, `symbolizeInlinedCode` returns an empty result
(not an error) and `.getFrame(0)` will crash.

Differential revision: https://reviews.llvm.org/D95609
2021-01-30 18:36:38 +03:00
Florian Hahn 6a59f05606 [LTO] Use lto::backend for code generation.
This patch updates LTOCodeGenerator to use the utilities provided by
LTOBackend to run middle-end optimizations and backend code generation.

This is a first step towards unifying the code used by libLTO's C API
and the newer, C++ interface (see PR41541).

The immediate motivation is to allow using the new pass manager when
doing LTO using libLTO's C API, which is used on Darwin, among others.

With the changes, there are no codegen/stats differences when building
MultiSource/SPEC2000/SPEC2006 on Darwin X86 with LTO, compared
to without the patch.

Reviewed By: steven_wu

Differential Revision: https://reviews.llvm.org/D94487
2021-01-30 10:09:55 +00:00
Greg McGary 61a5502a93 [llvm-objdump-macho] print per-second-level-page encodings for option --unwind-info
Compact unwind entries have 8 bits for the encoding-table offset:
* offsets 0..126 reference the global commmon-encodings table, while
* offsets 127..255 reference a per-second-level-page table.
This diff teaches `llvm-objdump` to print this per-page encodings table.

Differential Revision: https://reviews.llvm.org/D93265
2021-01-29 21:59:07 -07:00
Abhina Sreeskantharajan 42a21778f6 [test] Use host platform specific error message substitution in lit tests
On z/OS, the following error message is not matched correctly in lit tests.

```
EDC5129I No such file or directory.
```

This patch uses a lit config substitution to check for platform specific error messages.

Reviewed By: muiez, jhenderson

Differential Revision: https://reviews.llvm.org/D95246
2021-01-29 07:16:30 -05:00
Georgii Rymar a5154ab9b0 [llvm-readobj/elf] - Report "bitcode files are not supported" warning for bitcode files.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43543

Currently we report "The file was not recognized as a valid object file" for BC files.
Also, we terminate dumping.

Instead we could report a better warning and try to continue dumping other files.
This is what this patch implements.

Differential revision: https://reviews.llvm.org/D95605
2021-01-29 12:04:41 +03:00
Greg Clayton f8122d3532 Add the ability to extract the unwind rows from DWARF Call Frame Information.
This patch adds the ability to evaluate the state machine for CIE and FDE unwind objects and produce a UnwindTable with all UnwindRow objects needed to unwind registers. It will also dump the UnwindTable for each CIE and FDE when dumping DWARF .debug_frame or .eh_frame sections in llvm-dwarfdump or llvm-objdump. This allows users to see what the unwind rows actually look like for a given CIE or FDE instead of just seeing a list of opcodes.

This patch adds new classes: UnwindLocation, RegisterLocations, UnwindRow, and UnwindTable.

UnwindLocation is a class that describes how to unwind a register or Call Frame Address (CFA).

RegisterLocations is a class that tracks registers and their UnwindLocations. It gets populated when parsing the DWARF call frame instruction opcodes for a unwind row. The registers are mapped from their register numbers to the UnwindLocation in a map.

UnwindRow contains the result of evaluating a row of DWARF call frame instructions for the CIE, or a row from a FDE. The CIE can produce a set of initial instructions that each FDE that points to that CIE will use as the seed for the state machine when parsing FDE opcodes. A UnwindRow for a CIE will not have a valid address, whille a UnwindRow for a FDE will have a valid address.

The UnwindTable is a class that contains a sorted (by address) vector of UnwindRow objects and is the result of parsing all opcodes in a CIE, or FDE. Parsing a CIE should produce a UnwindTable with a single row. Parsing a FDE will produce a UnwindTable with one or more UnwindRow objects where all UnwindRow objects have valid addresses. The rows in the UnwindTable will be sorted from lowest Address to highest after parsing the state machine, or an error will be returned if the table isn't sorted. To parse a UnwindTable clients can use the following methods:

    static Expected<UnwindTable> UnwindTable::create(const CIE *Cie);
    static Expected<UnwindTable> UnwindTable::create(const FDE *Fde);

A valid table will be returned if the DWARF call frame instruction opcodes have no encoding errors. There are a few things that can go wrong during the evaluation of the state machine and these create functions will catch and return them.

Differential Revision: https://reviews.llvm.org/D89845
2021-01-28 13:39:17 -08:00
Fangrui Song b3af96d07b [llvm-nm] Display defined weak STT_GNU_IFUNC symbols as 'i'
This patch makes the behavior match GNU nm.
Note: undefined STT_GNU_IFUNC symbols use 'U'.

Differential Revision: https://reviews.llvm.org/D95461
2021-01-28 09:46:05 -08:00
Rahman Lavaee 3ca502a7d6 Use DataExtractor to decode SLEB128 in android_relas.
A simple refactoring patch which let us use `DataExtractor::getSLEB128` rather than using a lambda function.

Differential Revision: https://reviews.llvm.org/D95158
2021-01-28 01:35:18 -08:00
Georgii Rymar 68195b15a3 [yaml2obj] - Allow empty SectionHeaderTable definitions.
Currently we don't allow the following definition:

```
Sections:
  - Type: SectionHeaderTable
  - Name: .foo
    Type: SHT_PROGBITS
```

We report an error: "SectionHeaderTable can't be empty. Use 'NoHeaders' key to drop the section header table".

It was implemented in this way earlier, when `SectionHeaderTable`
was a dedicated key outside of the `Sections` list. And we did not
allow to select where the table is written.

Currently it makes sense to allow it, because a user might
want to place the default section header table at an arbitrary position,
e.g. before other sections. In this case it is not convenient and error prone
to require specifying all sections:

```
Sections:
  - Type: SectionHeaderTable
    Sections:
      - Name: .foo
      - Name: .strtab
      - Name: .shstrtab
  - Name: .foo
    Type: SHT_PROGBITS
```

This patch allows empty SectionHeaderTable definitions.

Differential revision: https://reviews.llvm.org/D95341
2021-01-28 10:51:52 +03:00
Hongtao Yu 7e99bddfea [CSSPGO] Support of CS profiles in extended binary format.
This change brings up support of context-sensitive profiles in the format of extended binary. Existing sample profile reader/writer/merger code is being tweaked to reflect the fact of bracketed input contexts, like (`[...]`). The paired brackets are also needed in extbinary profiles because we don't yet have an otherwise good way to tell calling contexts apart from regular function names since the context delimiter `@` can somehow serve as a part of the C++ mangled names.

Reviewed By: wmi, wenlei

Differential Revision: https://reviews.llvm.org/D95547
2021-01-27 21:29:46 -08:00
Teresa Johnson 1487747e99 [LTO] Prevent devirtualization for symbols dynamically exported
Identify dynamically exported symbols (--export-dynamic[-symbol=],
--dynamic-list=, or definitions needed to preempt shared objects) and
prevent their LTO visibility from being upgraded.
This helps avoid use of whole program devirtualization when there may
be overrides in dynamic libraries.

Differential Revision: https://reviews.llvm.org/D91583
2021-01-27 15:54:13 -08:00
Fangrui Song 54fb3ca96e [ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlags
Imported functions and variable get the visibility from the module supplying the
definition.  However, non-imported definitions do not get the visibility from
(ELF) the most constraining visibility among all modules (Mach-O) the visibility
of the prevailing definition.

This patch

* adds visibility bits to GlobalValueSummary::GVFlags
* computes the result visibility and propagates it to all definitions

Protected/hidden can imply dso_local which can enable some optimizations (this
is stronger than GVFlags::DSOLocal because the implied dso_local can be
leveraged for ELF -shared while default visibility dso_local has to be cleared
for ELF -shared).

Note: we don't have summaries for declarations, so for ELF if a declaration has
the most constraining visibility, the result visibility may not be that one.

Differential Revision: https://reviews.llvm.org/D92900
2021-01-27 10:43:51 -08:00
Fangrui Song 4d28f0a6a4 [llc] Add reportError helper and canonicalize error messages 2021-01-26 15:33:37 -08:00
Fangrui Song 79ce46e275 [llvm-elfabi] Fix test after D95140 2021-01-26 12:45:45 -08:00
Haowei Wu 15313f64be [llvm-elfabi] Support ELF file that lacks .gnu.hash section
Before this change, when reading ELF file, elfabi determines number of
entries in .dynsym by reading the .gnu.hash section. This change makes
elfabi read section headers directly first. This change allows elfabi
works on ELF files which do not have .gnu.hash sections.

Differential Revision: https://reviews.llvm.org/D93362
2021-01-26 12:31:52 -08:00
Fangrui Song 34b60d8a56 Add -fbinutils-version= to gate ELF features on the specified binutils version
There are two use cases.

Assembler
We have accrued some code gated on MCAsmInfo::useIntegratedAssembler().  Some
features are supported by latest GNU as, but we have to use
MCAsmInfo::useIntegratedAs() because the newer versions have not been widely
adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26).

Linker
We want to use features supported only by LLD or very new GNU ld, or don't want
to work around older GNU ld. We currently can't represent that "we don't care
about old GNU ld".  You can find such workarounds in a few other places, e.g.
Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp
AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276),
R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969)

Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001;
GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available).
This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table).

This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc.
It changes one codegen place in SHF_MERGE to demonstrate its usage.
`-fbinutils-version=2.35` means the produced object file does not care about GNU
ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced
assembly can be consumed by GNU as>=2.35, but older versions may not work.

`-fbinutils-version=none` means that we can use all ELF features, regardless of
GNU as/ld support.

Both clang and llc need `parseBinutilsVersion`. Such command line parsing is
usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen),
however, ClangCodeGen does not depend on LLVMCodeGen. So I add
`parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget).

Differential Revision: https://reviews.llvm.org/D85474
2021-01-26 12:28:23 -08:00
Georgii Rymar d5e48f1347 [yaml2obj][obj2yaml] - Improve how we set/dump the sh_entsize field.
We already set the `sh_entsize` field in a single place
for all non-implicit sections.

This patch reorders the logic slightly and with it
we finally have the only one place where the `sh_entsize` is set.

obj2yaml will not dump the `EntSize` key for `SHT_DYNSYM/SHT_SYMTAB` sections anymore,
when the value of `sh_entsize` is equal to `sizeof(Elf_Sym)`

Note that this also seems revealed an issue in llvm-objcopy:
Previously yaml2obj set the `sh_entsize` for the `.symtab` section to 0x18,
now we it sets it for `SHT_SYMTAB` sections, i.e. by type.
But the `llvm-objcopy/ELF/only-keep-debug.test` has a `.symtab` section of type `SHT_STRTAB`,
and now yaml2obj sets the `sh_entsize` to 0 for it.
I had to update the corresponding check lines for `ES`, but the behavior of
`llvm-objcopy` should be fixed instead I think.
I've added a TODO and a comment.

Differential revision: https://reviews.llvm.org/D95364
2021-01-26 13:33:02 +03:00
Ben Shi 2d7aa149a4 [update_llc_test_checks] Support AVR
Reviewed By: arichardson

Differential Revision: https://reviews.llvm.org/D95240
2021-01-26 17:50:56 +08:00
Georgii Rymar db92d47cf7 [llvm-nm][ELF] - Use @@ prefix when printing default versions.
llvm-readelf prints default versions with `@@` prefix.
This patch does the same for llvm-nm.

Differential revision: https://reviews.llvm.org/D94912
2021-01-26 12:16:38 +03:00
Georgii Rymar e98d5c3192 [libObject,llvm-readelf/obj] - Don't use @@ when printing versions of undefined symbols.
A default version (@@) is only available for defined symbols.

Currently we use "@@" for undefined symbols too.
This patch fixes the issue and improves our test case.

Differential revision: https://reviews.llvm.org/D95219
2021-01-26 12:05:59 +03:00
Sam Clegg 84c6f32584 [Object][WebAssembly] Update format of error messages
Error message should start with lowercase in accordance with
https://llvm.org/docs/CodingStandards.html#error-and-warning-messages

Differential Revision: https://reviews.llvm.org/D95239
2021-01-25 21:12:53 -08:00
Abhina Sreeskantharajan 978444d531 Revert "[SystemZ][z/OS] Fix No such file or directory expression error"
This reverts commit 06f8a49693.
2021-01-25 08:29:38 -05:00
Abhina Sreeskantharajan 84851a274e Revert "[SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests - continued"
This reverts commit 520b5ecf85.
2021-01-25 08:29:38 -05:00
Philip Pfaffe da489946a9 [llvm-dwp] Automatically set the target triple
The llvm-dwp tool hard-codes the target triple to x86. Instead, deduce the
target triple from the object files being read.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D93749
2021-01-25 11:58:54 +01:00
Georgii Rymar 9c89dcf807 [yaml2obj, obj2yaml] - Implement section header table as a special Chunk.
This was discussed in D93678 thread.
Currently we have one special chunk - Fill.

This patch re implements the "SectionHeaderTable" key to become a special chunk too.
With that we are able to place the section header table at any location,
just like we place sections.

Differential revision: https://reviews.llvm.org/D95140
2021-01-25 13:08:08 +03:00
David Zarzycki 1bc8daba4f Fix x86 exegesis tests after c042aff886
In c042aff886, unused FileCheck prefixes became an error, which exposed some testing bugs in four exegesis tests. I've tried my best to either fix the testing bugs, or expand the testing to cover more scenarios.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D95287
2021-01-24 08:51:06 -05:00
Abhina Sreeskantharajan 520b5ecf85 [SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests - continued
This is a continuation of https://reviews.llvm.org/D94239. I missed some other spellings of the same error.

Reviewed By: muiez

Differential Revision: https://reviews.llvm.org/D95246
2021-01-22 13:54:25 -05:00
Wolfgang Pieb 7143b63017 [llvm-mca] Adding local lit config file for X86 targets 2021-01-22 09:52:57 -08:00
Abhina Sreeskantharajan 06f8a49693 [SystemZ][z/OS] Fix No such file or directory expression error
On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match the end period successfully.
```
EDC5129I No such file or directory.
```

Differential Revision: https://reviews.llvm.org/D94239
2021-01-22 11:41:40 -05:00
Wolfgang Pieb 020c00b5d3 [llvm-mca] Test case was missing a triple. 2021-01-21 16:19:32 -08:00
Wolfgang Pieb d38be2ba0e [llvm-mca] Initial implementation of serialization using JSON. The views
implemented at this time are Summary, Timeline, ResourcePressure and InstructionInfo.
Use --json on the command line to obtain JSON output.
2021-01-21 15:15:54 -08:00
Georgii Rymar dd5c982804 [llvm-nm][ELF] - Make -D display symbol versions.
This fixes https://bugs.llvm.org/show_bug.cgi?id=48670.

Since binutils 2.35, nm -D displays symbol versions by default.
This patch teaches llvm-nm to do the same.

Differential revision: https://reviews.llvm.org/D94907
2021-01-21 11:23:45 +03:00
Georgii Rymar 51f4958057 [yaml2obj/obj2yaml] - Improve dumping/creating of ELF versioning sections.
This makes the following improvements.

For `SHT_GNU_versym`:
 * yaml2obj: set `sh_link` to index of `.dynsym` section automatically.
For `SHT_GNU_verdef`:
 * yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
 * yaml2obj: set `sh_info` field automatically.
 * obj2yaml: don't dump the `Info` field when its value matches the number of version definitions.
For `SHT_GNU_verneed`:
 * yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
 * yaml2obj: set `sh_info` field automatically.
 * obj2yaml: don't dump the `Info` field when its value matches the number of version dependencies.

Also, simplifies few test cases.

Differential revision: https://reviews.llvm.org/D94956
2021-01-21 10:36:48 +03:00
wlei daeea961a6 [llvm-profgen][NFC] Fix the incorrect computation of callsite sample count
Differential Revision: https://reviews.llvm.org/D95009
2021-01-19 17:50:48 -08:00
Abhina Sreeskantharajan 88e7c3498c [SystemZ][z/OS] Fix Permission denied pattern matching
On z/OS, the error message "EDC5111I Permission denied." is not matched correctly in lit tests. This patch updates the check expression to match successfully.

Differential Revision: https://reviews.llvm.org/D94432
2021-01-19 13:05:52 -05:00
Yvan Roux 244ad228f3 [ARM][MachineOutliner] Add stack fixup feature
This patch handles cases where we have to save/restore the link register
into the stack and and load/store instruction which use the stack are
part of the outlined region. It checks that there will be no overflow
introduced by the new offset and fixup these instructions accordingly.

Differential Revision: https://reviews.llvm.org/D92934
2021-01-19 10:59:09 +01:00
Abhina Sreeskantharajan 689aaba7ac [SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests
On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match successfully.
```
EDC5129I No such file or directory.
```

Reviewed By: muiez

Differential Revision: https://reviews.llvm.org/D94239
2021-01-18 07:14:37 -05:00