The prefix given to --prefix will be added to GNU absolute paths when
used with --source option (source interleaved with the disassembly).
This matches GNU's objdump behavior.
GNU and C++17 rules for absolute paths are different.
Differential Revision: https://reviews.llvm.org/D85024
Fixes PR46368.
Differential Revision: https://reviews.llvm.org/D85024
Add printf-style precision specifier to pad numbers to a given number of
digits when matching them if the value is smaller than the given
precision. This works on both empty numeric expression (e.g. variable
definition from input) and when matching a numeric expression. The
syntax is as follows:
[[#%.<precision><format specifier>, ...]
where <format specifier> is optional and ... can be a variable
definition or not with an empty expression or not. In the absence of a
precision specifier, a variable definition will accept leading zeros.
Reviewed By: jhenderson, grimar
Differential Revision: https://reviews.llvm.org/D81667
When diffing disassembly dump of two binaries, I see lots of noises from mismatched jump target addresses and global data references, which unnecessarily causes diffs on every function, making it impractical. I'm trying to symbolize the raw binary addresses to minimize the diff noise.
In this change, a local branch target is modeled as a label and the branch target operand will simply be printed as a label. Local labels are collected by a separate pre-decoding pass beforehand. A global data memory operand will be printed as a global symbol instead of the raw data address. Unfortunately, due to the way the disassembler is set up and to be less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable from checking code quality point of view since on most targets an instruction can have at most one memory operand.
So far only the X86 disassemblers are supported.
Test Plan:
llvm-objdump -d --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:
<_start>:
push rax
mov dword ptr [rsp + 4], 0
mov dword ptr [rsp], 0
mov eax, dword ptr [rsp]
cmp eax, dword ptr [rip + 4112] # 202182 <g>
jge 0x20117e <_start+0x25>
call 0x201158 <foo>
inc dword ptr [rsp]
jmp 0x201169 <_start+0x10>
xor eax, eax
pop rcx
ret
```
llvm-objdump -d **--symbolize-operands** --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:
<_start>:
push rax
mov dword ptr [rsp + 4], 0
mov dword ptr [rsp], 0
<L1>:
mov eax, dword ptr [rsp]
cmp eax, dword ptr <g>
jge <L0>
call <foo>
inc dword ptr [rsp]
jmp <L1>
<L0>:
xor eax, eax
pop rcx
ret
```
Note that the jump instructions like `jge 0x20117e <_start+0x25>` without this work is printed as a real target address and an offset from the leading symbol. With a change in the optimizer that adds/deletes an instruction, the address and offset may shift for targets placed after the instruction. This will be a problem when diffing the disassembly from two optimizers where there are unnecessary false positives due to such branch target address changes. With `--symbolize-operand`, a label is printed for a branch target instead to reduce the false positives. Similarly, the disassemble of PC-relative global variable references is also prone to instruction insertion/deletion.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D84191
Add support for passing in libraries via `-l` and `-L` options to
`llvm-libtool-darwin`.
Reviewed by jhenderson, smeenai
Differential Revision: https://reviews.llvm.org/D85540
Add support for -arch_only option for llvm-libtool-darwin. This diff
also adds support for accepting universal files as input and flattening
them to create the required static library. Supports input universal
files contaning both Mach-O object files or archives.
Differences from cctools' libtool:
- `-arch_only` can be specified multiple times
- archives containing universal files are considered invalid (libtool
allows such archives)
Reviewed by jhenderson, smeenai
Differential Revision: https://reviews.llvm.org/D84770
Add documentation for the remaining options of
`llvm-install-name-tool`.
Reviewed by jhenderson, smeenai
Differential Revision: https://reviews.llvm.org/D85655
The switch from llvm::cl to OptTable (D83530) dropped --version, which
is needed by some users.
This patch also adds a -v alias, which is available in GNU addr2line.
The version dumping is similar to llvm-objcopy --version (exotic):
```
llvm-symbolizer
LLVM (http://llvm.org/):
LLVM version 12.0.0git
Optimized build with assertions.
Default target: x86_64-unknown-linux-gnu
Host CPU: skylake-avx512
```
Reviewed By: dyung, jhenderson
Differential Revision: https://reviews.llvm.org/D85624
Add support for `-D` and `-U` options for llvm-libtool-darwin. `-D`
allows for using zero for timestamps and UIDs/GIDs. `-U` allows for
using actual timestamps and UIDs/GIDs.
Reviewed by jhenderson, smeenai
Differential Revision: https://reviews.llvm.org/D84209
Add support for `-filelist` option for llvm-libtool-darwin. `-filelist`
option allows for passing in a file containing a list of filenames.
Reviewed by jhenderson, smeenai
Differential Revision: https://reviews.llvm.org/D84206
This diff adds documentation for `allow-empty` flag under FileCheck
docs.
Reviewed by jhenderson, smeenai, thopre
Differential Revision: https://reviews.llvm.org/D83682
This came up during the review for D67656. It's nice but also subtle, so documenting it as an idiom will make tests easier to understand.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D68061
for the advantage outlined by D83639 ([OptTable] Support grouped short options)
Some behavior changes:
* -i={0,false} is removed. Use --no-inlines instead.
* --demangle={0,false} is removed. Use --no-demangle instead
* -untag-addresses={0,false} is removed. Use --no-untag-addresses instead
Added a higher level API OptTable::parseArgs which handles optional
initial options populated from an environment variable, expands response
files recursively, and parses options.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D83530
PGO profile is usually more precise than sample profile. However, PGO profile
needs to be collected from loadtest and loadtest may not be representative
enough to the production workload. Sample profile collected from production
can be used as a supplement -- for functions cold in loadtest but warm/hot
in production, we can scale up the related function in PGO profile if the
function is warm or hot in sample profile.
The implementation contains changes in compiler side and llvm-profdata side.
Given an instr profile and a sample profile, for a function cold in PGO
profile but warm/hot in sample profile, llvm-profdata will either mark
all the counters in the profile to be -1 or scale up the max count in the
function to be above hot threshold, depending on the zero counter ratio in
the profile. The assumption is if there are too many counters being zero
in the function profile, the profile is more likely to cause harm than good,
then llvm-profdata will mark all the counters to be -1 indicating the
function is hot but the profile is unaccountable. In compiler side, if a
function profile with all -1 counters is seen, the function entry count will
be set to be above hot threshold but its internal profile will be dropped.
In the long run, it may be useful to let compiler support using PGO profile
and sample profile at the same time, but that requires more careful design
and more substantial changes to make two profiles work seamlessly. The patch
here serves as a simple intermediate solution.
Differential Revision: https://reviews.llvm.org/D81981
Starting with Skylake, the LBR contains the precise number of cycles between the two
consecutive branches.
Making use of this will hopefully make the measurements more precise than the
existing methods of using RDTSC.
Differential Revision: https://reviews.llvm.org/D77422
New change: check for existence of field `cycles` in perf_branch_entry before enabling this mode.
This should prevent compilation errors when building for older kernel whose headers don't support it.
Add support for creating static libraries when the input includes only
Mach-O binaries (and not libraries/archives themselves).
Reviewed by alexshap, Ktwu, smeenai, jhenderson, MaskRay, mtrent
Differential Revision: https://reviews.llvm.org/D83002
Its effect could be achieved by
`-stop-after`,`-print-after`,`-print-after-all`. But a few tests need to
print MIR after ISel which could not be done with
`-print-after`/`-stop-after` since isel pass does not have commandline name.
That's the reason `--print-machineinstrs` is downgraded to
`--print-after-isel` in this patch. `--print-after-isel` could be
removed after we switch to new pass manager since isel pass would have a
commandline text name to use `print-after` or equivalent switches.
The motivation of this patch is to reduce tests dependency on
would-be-deprecated feature.
Reviewed By: arsenm, dsanders
Differential Revision: https://reviews.llvm.org/D83275
This patch changes llvm-readelf (and llvm-readobj for consistency)
behavior to print an error when executed with no input files.
Reading from stdin can be achieved via a '-' for the input
object.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46400
Differential Revision: https://reviews.llvm.org/D83704
Reviewed by: jhenderson, MaskRay, sbc, jyknight
This diff starts the implementation of llvm-libtool-darwin
(an llvm based replacement of cctool's libtool).
Libtool is used for creating static and dynamic libraries
from a bunch of object files given as input.
Reviewed by alexshap, smeenai, jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D82923
From @erichkeane:
```
This patch doesn't seem to build for me:
/iusers/ekeane1/workspaces/llvm-project/llvm/tools/llvm-exegesis/lib/X86/X86Counter.cpp: In function ‘llvm::Error llvm::exegesis::parseDataBuffer(const char*, size_t, const void*, const void*, llvm::SmallVector<long int, 4>*)’:
/iusers/ekeane1/workspaces/llvm-project/llvm/tools/llvm-exegesis/lib/X86/X86Counter.cpp:99:37: error: ‘struct perf_branch_entry’ has no member named ‘cycles’
CycleArray->push_back(Entry.cycles);
I'm on RHEL7, so I have kernel 3.10, so it doesn't have 'cycles'.
According ot this: https://elixir.bootlin.com/linux/v4.3/source/include/uapi/linux/perf_event.h#L963 kernel 4.3 is the first time that 'cycles' appeared in this structure.
```
Starting with Skylake, the LBR contains the precise number of cycles between the two
consecutive branches.
Making use of this will hopefully make the measurements more precise than the
existing methods of using RDTSC.
Differential Revision: https://reviews.llvm.org/D77422
In FileCheck.rst, add `-dump-input-context` and `-dump-input-filter`,
and fix some `-dump-input` documentation.
In `FileCheck -help`, `cl::value_desc("kind")` is being ignored for
`-dump-input-filter`, so just drop it.
Extend `-dump-input=help` to mention FILECHECK_OPTS.
This adds the --debug-vars option to llvm-objdump, which prints
locations (registers/memory) of source-level variables alongside the
disassembly based on DWARF info. A vertical line is printed for each
live-range, with a label at the top giving the variable name and
location, and the position and length of the line indicating the program
counter range in which it is valid.
Differential revision: https://reviews.llvm.org/D70720
Before the fix the build of docs-llvm-html would fail.
The D80959 introduced options that are not recognized, so we have
warning as:
llvm-project/llvm/docs/CommandGuide/llvm-dwarfdump.rst:40\
:unknown option: --debug-info
Differential Revision: https://reviews.llvm.org/D82460
Summary: Rename --elf-cg-profile to --cg-profile and keep --elf-cg-profile as an alias of --cg-profile.
Reviewers: jhenderson, MaskRay, espindola, hans
Reviewed By: jhenderson, MaskRay
Subscribers: emaste, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81855
Summary:
This completes the needed glueing to support reading tbd files from nm.
This includes specifying which slice filtering with `--arch` and a new
option specifically for tbd files `--add-inlinedinfo` which will show
the reexported libraries that are appended in the tbd file.
Reviewers: ributzka, steven_wu, JDevlieghere, jhenderson
Reviewed By: JDevlieghere
Subscribers: hiraditya, MaskRay, dexonsmith, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81614
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch adds support for specifying the
matching constraint for a numeric expression, ie. how the value being
matched should relate to the numeric expression.
This commit only adds the equality constraint where the numeric value
matched must be equal to the numeric expression. It is the default
matching constraint used when not specified. It is added to provision
other matching constraint (e.g. inequality relations).
Copyright:
- Linaro (changes up to diff 183612 of revision D55940)
- GraphCore (changes in later versions of revision D55940 and
in new revision created off D55940)
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D60391
This patch extends numerical expressions to allow calls to
predefined functions. These calls can be combined with the
existing numerical operators, which includes nesting calls.
The call syntax is:
<func>(<args>)
Where <func> is a predefined string literal, currently limited to
one of add, max, min and sub. <arg> is a comma seperated list of
numerical expressions.
Subscribers: arichardson, hiraditya, thopre, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79936
Having the input dumped on failure seems like a better
default: I debugged FileCheck tests for a while without knowing
about this option, which really helps to understand failures.
Remove `-dump-input-on-failure` and the environment variable
FILECHECK_DUMP_INPUT_ON_FAILURE which are now obsolete.
Differential Revision: https://reviews.llvm.org/D81422
Some of the --debug-* options can take an optional offset. Although the
man page does a good job of making that clear, it's much harder to
discover from the help output.
Currently the only reference to this is the following sentence:
> Where applicable these parameters take an optional =<offset> argument
> to dump only the entry at the specified offset.
This patch changes the help output from to print [=<offset>] after the
options that take an offset.
--debug-info[=<offset>] - Dump the .debug_info section
rdar://problem/63150066
Differential revision: https://reviews.llvm.org/D80959
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch adds support signed numeric
values, thus allowing negative numeric values.
As such, the patch adds a new class to represent a signed or unsigned
value and add the logic for type promotion and type conversion in
numeric expression mixing signed and unsigned values. It also adds
the %d format specifier to represent signed value.
Finally, it also adds underflow and overflow detection when performing a
binary operation.
Copyright:
- Linaro (changes up to diff 183612 of revision D55940)
- GraphCore (changes in later versions of revision D55940 and
in new revision created off D55940)
Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson
Reviewed By: jhenderson, arichardson
Subscribers: MaskRay, hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, kristina, hfinkel, rogfer01, JonChesterfield
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60390
With this change it is be possible to write FileCheck expressions such
as [[#(VAR+1)-2]]. Currently, the only supported arithmetic operators are
plus and minus, so this is not particularly useful yet. However, it our
CHERI fork we have tests that benefit from having multiplication in
FileCheck expressions. Allowing parenthesized expressions is the simplest
way for us to work around the current lack of operator precedence in
FileCheck expressions.
Reviewed By: thopre, jhenderson
Differential Revision: https://reviews.llvm.org/D77383
cctools strip has the option "-T" which removes Swift symbols.
This diff implements this option in llvm-strip for MachO.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D80099
llvm-extract get serveral new options, but we forgot to update doc.
This patch update the doc.
Reviewed By: volkan
Differential Revision: https://reviews.llvm.org/D80413
Add support for generating a dsymutil reproducer. The result is a folder
containing all the object files for linking.
When --gen-reproducer is passed, dsymutil uses a FileCollectorFileSystem
which keeps track of all the files used by dsymutil. These files are
copied into a temporary directory when dsymutil exists.
When this path is passed to --use-reproducer, dsymutil uses a
RedirectingFileSystem that will use the files from the reproducer
directory instead of the actual paths. This means you don't need to mess
with the OSO path prefix.
Differential revision: https://reviews.llvm.org/D79398
Sometimes you want to disable a FileCheck directive without removing
it entirely, or you want to write comments that mention a directive by
name. The `COM:` directive makes it easy to do this. For example,
you might have:
```
; X32: pinsrd_1:
; X32: pinsrd $1, 4(%esp), %xmm0
; COM: FIXME: X64 isn't working correctly yet for this part of codegen, but
; COM: X64 will have something similar to X32:
; COM:
; COM: X64: pinsrd_1:
; COM: X64: pinsrd $1, %edi, %xmm0
```
Without this patch, you need to use some combination of rewording and
directive syntax mangling to prevent FileCheck from recognizing the
commented occurrences of `X32:` and `X64:` above as directives.
Moreover, FileCheck diagnostics have been proposed that might complain
about the occurrences of `X64` that don't have the trailing `:`
because they look like directive typos:
<http://lists.llvm.org/pipermail/llvm-dev/2020-April/140610.html>
I think dodging all these problems can prove tedious for test authors,
and directive syntax mangling already makes the purpose of existing
test code unclear. `COM:` can avoid all these problems.
This patch also updates the small set of existing tests that define
`COM` as a check prefix:
- clang/test/CodeGen/default-address-space.c
- clang/test/CodeGenOpenCL/addr-space-struct-arg.cl
- clang/test/Driver/hip-device-libs.hip
- llvm/test/Assembler/drop-debug-info-nonzero-alloca.ll
I think lit should support `COM:` as well. Perhaps `clang -verify`
should too.
Reviewed By: jhenderson, thopre
Differential Revision: https://reviews.llvm.org/D79276
Sometimes you want to disable a FileCheck directive without removing
it entirely, or you want to write comments that mention a directive by
name. The `COM:` directive makes it easy to do this. For example,
you might have:
```
; X32: pinsrd_1:
; X32: pinsrd $1, 4(%esp), %xmm0
; COM: FIXME: X64 isn't working correctly yet for this part of codegen, but
; COM: X64 will have something similar to X32:
; COM:
; COM: X64: pinsrd_1:
; COM: X64: pinsrd $1, %edi, %xmm0
```
Without this patch, you need to use some combination of rewording and
directive syntax mangling to prevent FileCheck from recognizing the
commented occurrences of `X32:` and `X64:` above as directives.
Moreover, FileCheck diagnostics have been proposed that might complain
about the occurrences of `X64` that don't have the trailing `:`
because they look like directive typos:
<http://lists.llvm.org/pipermail/llvm-dev/2020-April/140610.html>
I think dodging all these problems can prove tedious for test authors,
and directive syntax mangling already makes the purpose of existing
test code unclear. `COM:` can avoid all these problems.
This patch also updates the small set of existing tests that define
`COM` as a check prefix:
- clang/test/CodeGen/default-address-space.c
- clang/test/CodeGenOpenCL/addr-space-struct-arg.cl
- clang/test/Driver/hip-device-libs.hip
- llvm/test/Assembler/drop-debug-info-nonzero-alloca.ll
I think lit should support `COM:` as well. Perhaps `clang -verify`
should too.
Reviewed By: jhenderson, thopre
Differential Revision: https://reviews.llvm.org/D79276
This patch adds statistics about the contribution of each object file to
the linked debug info. When --statistics is passed to dsymutil, it
prints a table after linking as illustrated below.
It lists the object file name, the size of the debug info in the object
file in bytes, and the absolute size contribution to the linked dSYM and
the percentage difference. The table is sorted by the output size, so
the object files contributing the most to the link are listed first.
.debug_info section size (in bytes)
-------------------------------------------------------------------------------
Filename Object dSYM Change
-------------------------------------------------------------------------------
basic2.macho.x86_64.o 210b 165b -24.00%
basic3.macho.x86_64.o 177b 150b -16.51%
basic1.macho.x86_64.o 125b 129b 3.15%
-------------------------------------------------------------------------------
Total 512b 444b -14.23%
-------------------------------------------------------------------------------
Differential revision: https://reviews.llvm.org/D79513
The --output-target documentation has slightly rotted, as the default is
no longer purely based on the input file format, but also the value of
--input-target. This patch updates the documentation to make this
explicit.
Reviewed by: MaskRay, alexshap
Differential Revision: https://reviews.llvm.org/D79318
This addresses:
-Clean up the source code
-Refactor the JSON fields
-Fix the test cases
-Improve the docs for the stats output
Differential Revision: https://reviews.llvm.org/D77789
Summary:
FileCheck documentation contains an example of a numeric variable
defined and used on the same line. This is not currently supported by
FileCheck so this commit fixes the example to use CHECK-SAME for the
variable use.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D79253
This option was added several months ago in commit e84468c1.
Reviewed by: MaskRay, erik.pilkington, steven_wu
Differential Revision: https://reviews.llvm.org/D79166
This change helps improve `dsymutil` documentation.
- Add missing options
- Re-arrange options in alphabetical order
- Wrap inline options in double-back-quote
- `-v` is for `--version` not `--verbose`
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D78479
Summary:
This matches the behavior of GNU addr2line. We previously treated
hexadecimal addresses as binary if they started with 0b, otherwise as
octal if they started with 0, otherwise as decimal.
This only affects llvm-addr2line; the behavior of llvm-symbolize is
unaffected.
Reviewers: ikudrin, rupprecht, jhenderson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73306
LanguageExtensions.rst:2191: WARNING: Title underline too short.
llvm-symbolizer.rst:157: Error in "code-block" directive: maximum 1 argument(s) allowed, 30 supplied.
This patch adds missing description of enable-no-signed-zeros-fp-math
and enable-no-trapping-fp-math options of llc.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D77713
The LitConfig is shared across the whole test suite. However, since
enabling recursive expansion can be a breaking change for some test
suites, it's important to confine the setting to test suites that
enable it explicitly.
Note that other issues were raised with the way recursiveExpansionLimit
operates. However, this commit simply moves the setting to the right
place -- the mechanism by which it works can be improved independently.
Differential Revision: https://reviews.llvm.org/D77415
SUMMARY:
For the llvm-objdump -D, the symbol name is used as a label in the disassembly for the specific address (when a symbol address is equal to the virtual address in the dump).
In XCOFF, multiple symbols may have the same name, being differentiated by their storage mapping class. It is helpful to print the QualName and not just the name when forming the output label for a csect symbol. The symbol index further removes any ambiguity caused by duplicate names.
To maintain compatibility with the binutils objdump, the XCOFF-specific --symbol-description option is added to enable the enhanced format.
Reviewers: hubert.reinterpretcast, James Henderson, Jason Liu ,daltenty
Subscribers: wuzish, nemanjai, hiraditya
Differential Revision: https://reviews.llvm.org/D72973
Summary:
This patch is to teach `llvm-objdump` dump dynamic symbols (`-T` and `--dynamic-syms`). Currently, this patch is not fully compatible with `gnu-objdump`, but I would like to continue working on this in next few patches. It has two issues.
1. Some symbols shouldn't be marked as global(g). (`-t/--syms` has same issue as well) (Fixed by D75659)
2. `gnu-objdump` can dump version information and *dynamically* insert before symbol name field.
`objdump -T a.out` gives:
```
DYNAMIC SYMBOL TABLE:
0000000000000000 w D *UND* 0000000000000000 _ITM_deregisterTMCloneTable
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.2.5 printf
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.2.5 __libc_start_main
0000000000000000 w D *UND* 0000000000000000 __gmon_start__
0000000000000000 w D *UND* 0000000000000000 _ITM_registerTMCloneTable
0000000000000000 w DF *UND* 0000000000000000 GLIBC_2.2.5 __cxa_finalize
```
`llvm-objdump -T a.out` gives:
```
DYNAMIC SYMBOL TABLE:
0000000000000000 w D *UND* 0000000000000000 _ITM_deregisterTMCloneTable
0000000000000000 g DF *UND* 0000000000000000 printf
0000000000000000 g DF *UND* 0000000000000000 __libc_start_main
0000000000000000 w D *UND* 0000000000000000 __gmon_start__
0000000000000000 w D *UND* 0000000000000000 _ITM_registerTMCloneTable
0000000000000000 w DF *UND* 0000000000000000 __cxa_finalize
```
Reviewers: jhenderson, grimar, MaskRay, espindola
Reviewed By: jhenderson, grimar
Subscribers: emaste, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75756
Add an option to llvm-dwarfdump to calculate the bytes within
the debug sections. Dump this numbers when using --statistics
option as well.
This is an initial patch (e.g. we should support other units,
since we only support 'bytes' now).
Differential Revision: https://reviews.llvm.org/D74205
Summary:
As noted in documentation, different repetition modes have different trade-offs:
> .. option:: -repetition-mode=[duplicate|loop]
>
> Specify the repetition mode. `duplicate` will create a large, straight line
> basic block with `num-repetitions` copies of the snippet. `loop` will wrap
> the snippet in a loop which will be run `num-repetitions` times. The `loop`
> mode tends to better hide the effects of the CPU frontend on architectures
> that cache decoded instructions, but consumes a register for counting
> iterations.
Indeed. Example:
>>! In D74156#1873657, @lebedev.ri wrote:
> At least for `CMOV`, i'm seeing wildly different results
> | | Latency | RThroughput |
> | duplicate | 1 | 0.8 |
> | loop | 2 | 0.6 |
> where latency=1 seems correct, and i'd expect the througput to be close to 1/2 (since there are two execution units).
This isn't great for analysis, at least for schedule model development.
As discussed in excruciating detail in
>>! In D74156#1924514, @gchatelet wrote:
>>>! In D74156#1920632, @lebedev.ri wrote:
>> ... did that explanation of the question i'm having made any sense?
>
> Thx for digging in the conversation !
> Ok it makes more sense now.
>
> I discussed it a bit with @courbet:
> - We want the analysis tool to stay simple so we'd rather not make it knowledgeable of the repetition mode.
> - We'd like to still be able to select either repetition mode to dig into special cases
>
> So we could add a third `min` repetition mode that would run both and take the minimum. It could be the default option.
> Would you have some time to look what it would take to add this third mode?
there appears to be an agreement that it is indeed sub-par,
and that we should provide an optional, measurement (not analysis!) -time
way to rectify the situation.
However, the solutions isn't entirely straight-forward.
We can just add an actual 'multiplexer' `MinSnippetRepetitor`, because
if we just concatenate snippets produced by `DuplicateSnippetRepetitor`
and `LoopSnippetRepetitor` and run+measure that, the measurement will
naturally be different from what we'd get by running+measuring
them separately and taking the min.
([[ https://www.wolframalpha.com/input/?i=%28x%2By%29%2F2+%21%3D+min%28x%2C+y%29 | `time(D+L)/2 != min(time(D), time(L))` ]])
Also, it seems best to me to have a single snippet instead of generating
a snippet per repetition mode, since the only difference here is that the
loop repetition mode reserves one register for loop counter.
As far as i can tell, we can either teach `BenchmarkRunner::runConfiguration()`
to produce a single report given multiple repetitors (as in the patch),
or do that one layer higher - don't modify `BenchmarkRunner::runConfiguration()`,
produce multiple reports, don't actually print each one, but aggregate them somehow
and only print the final one.
Initially i've gone ahead with the latter approach, but it didn't look like a natural fit;
the former (as in the diff) does seem like a better fit to me.
There's also a question of the test coverage. It sure currently does work here:
```
$ ./bin/llvm-exegesis --opcode-name=CMOV64rr --mode=inverse_throughput --repetition-mode=duplicate
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-8fb949.o
---
mode: inverse_throughput
key:
instructions:
- 'CMOV64rr RAX RAX R11 i_0x0'
- 'CMOV64rr RBP RBP R15 i_0x0'
- 'CMOV64rr RBX RBX RBX i_0x0'
- 'CMOV64rr RCX RCX RBX i_0x0'
- 'CMOV64rr RDI RDI R10 i_0x0'
- 'CMOV64rr RDX RDX RAX i_0x0'
- 'CMOV64rr RSI RSI RAX i_0x0'
- 'CMOV64rr R8 R8 R8 i_0x0'
- 'CMOV64rr R9 R9 RDX i_0x0'
- 'CMOV64rr R10 R10 RBX i_0x0'
- 'CMOV64rr R11 R11 R14 i_0x0'
- 'CMOV64rr R12 R12 R9 i_0x0'
- 'CMOV64rr R13 R13 R12 i_0x0'
- 'CMOV64rr R14 R14 R15 i_0x0'
- 'CMOV64rr R15 R15 R13 i_0x0'
config: ''
register_initial_values:
- 'RAX=0x0'
- 'R11=0x0'
- 'EFLAGS=0x0'
- 'RBP=0x0'
- 'R15=0x0'
- 'RBX=0x0'
- 'RCX=0x0'
- 'RDI=0x0'
- 'R10=0x0'
- 'RDX=0x0'
- 'RSI=0x0'
- 'R8=0x0'
- 'R9=0x0'
- 'R14=0x0'
- 'R12=0x0'
- 'R13=0x0'
cpu_name: bdver2
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: inverse_throughput, value: 0.819, per_snippet_value: 12.285 }
error: ''
info: instruction has tied variables, using static renaming.
assembled_snippet: 5541574156415541545348B8000000000000000049BB00000000000000004883EC08C7042400000000C7442404000000009D48BD000000000000000049BF000000000000000048BB000000000000000048B9000000000000000048BF000000000000000049BA000000000000000048BA000000000000000048BE000000000000000049B8000000000000000049B9000000000000000049BE000000000000000049BC000000000000000049BD0000000000000000490F40C3490F40EF480F40DB480F40CB490F40FA480F40D0480F40F04D0F40C04C0F40CA4C0F40D34D0F40DE4D0F40E14D0F40EC4D0F40F74D0F40FD490F40C35B415C415D415E415F5DC3
...
$ ./bin/llvm-exegesis --opcode-name=CMOV64rr --mode=inverse_throughput --repetition-mode=loop
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-051eb3.o
---
mode: inverse_throughput
key:
instructions:
- 'CMOV64rr RAX RAX R11 i_0x0'
- 'CMOV64rr RBP RBP RSI i_0x0'
- 'CMOV64rr RBX RBX R9 i_0x0'
- 'CMOV64rr RCX RCX RSI i_0x0'
- 'CMOV64rr RDI RDI RBP i_0x0'
- 'CMOV64rr RDX RDX R9 i_0x0'
- 'CMOV64rr RSI RSI RDI i_0x0'
- 'CMOV64rr R9 R9 R12 i_0x0'
- 'CMOV64rr R10 R10 R11 i_0x0'
- 'CMOV64rr R11 R11 R9 i_0x0'
- 'CMOV64rr R12 R12 RBP i_0x0'
- 'CMOV64rr R13 R13 RSI i_0x0'
- 'CMOV64rr R14 R14 R14 i_0x0'
- 'CMOV64rr R15 R15 R10 i_0x0'
config: ''
register_initial_values:
- 'RAX=0x0'
- 'R11=0x0'
- 'EFLAGS=0x0'
- 'RBP=0x0'
- 'RSI=0x0'
- 'RBX=0x0'
- 'R9=0x0'
- 'RCX=0x0'
- 'RDI=0x0'
- 'RDX=0x0'
- 'R12=0x0'
- 'R10=0x0'
- 'R13=0x0'
- 'R14=0x0'
- 'R15=0x0'
cpu_name: bdver2
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: inverse_throughput, value: 0.6083, per_snippet_value: 8.5162 }
error: ''
info: instruction has tied variables, using static renaming.
assembled_snippet: 5541574156415541545348B8000000000000000049BB00000000000000004883EC08C7042400000000C7442404000000009D48BD000000000000000048BE000000000000000048BB000000000000000049B9000000000000000048B9000000000000000048BF000000000000000048BA000000000000000049BC000000000000000049BA000000000000000049BD000000000000000049BE000000000000000049BF000000000000000049B80200000000000000490F40C3480F40EE490F40D9480F40CE480F40FD490F40D1480F40F74D0F40CC4D0F40D34D0F40D94C0F40E54C0F40EE4D0F40F64D0F40FA4983C0FF75C25B415C415D415E415F5DC3
...
$ ./bin/llvm-exegesis --opcode-name=CMOV64rr --mode=inverse_throughput --repetition-mode=min
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-c7a47d.o
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-2581f1.o
---
mode: inverse_throughput
key:
instructions:
- 'CMOV64rr RAX RAX R11 i_0x0'
- 'CMOV64rr RBP RBP R10 i_0x0'
- 'CMOV64rr RBX RBX R10 i_0x0'
- 'CMOV64rr RCX RCX RDX i_0x0'
- 'CMOV64rr RDI RDI RAX i_0x0'
- 'CMOV64rr RDX RDX R9 i_0x0'
- 'CMOV64rr RSI RSI RAX i_0x0'
- 'CMOV64rr R9 R9 RBX i_0x0'
- 'CMOV64rr R10 R10 R12 i_0x0'
- 'CMOV64rr R11 R11 RDI i_0x0'
- 'CMOV64rr R12 R12 RDI i_0x0'
- 'CMOV64rr R13 R13 RDI i_0x0'
- 'CMOV64rr R14 R14 R9 i_0x0'
- 'CMOV64rr R15 R15 RBP i_0x0'
config: ''
register_initial_values:
- 'RAX=0x0'
- 'R11=0x0'
- 'EFLAGS=0x0'
- 'RBP=0x0'
- 'R10=0x0'
- 'RBX=0x0'
- 'RCX=0x0'
- 'RDX=0x0'
- 'RDI=0x0'
- 'R9=0x0'
- 'RSI=0x0'
- 'R12=0x0'
- 'R13=0x0'
- 'R14=0x0'
- 'R15=0x0'
cpu_name: bdver2
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: inverse_throughput, value: 0.6073, per_snippet_value: 8.5022 }
error: ''
info: instruction has tied variables, using static renaming.
assembled_snippet: 5541574156415541545348B8000000000000000049BB00000000000000004883EC08C7042400000000C7442404000000009D48BD000000000000000049BA000000000000000048BB000000000000000048B9000000000000000048BA000000000000000048BF000000000000000049B9000000000000000048BE000000000000000049BC000000000000000049BD000000000000000049BE000000000000000049BF0000000000000000490F40C3490F40EA490F40DA480F40CA480F40F8490F40D1480F40F04C0F40CB4D0F40D44C0F40DF4C0F40E74C0F40EF4D0F40F14C0F40FD490F40C3490F40EA5B415C415D415E415F5DC35541574156415541545348B8000000000000000049BB00000000000000004883EC08C7042400000000C7442404000000009D48BD000000000000000049BA000000000000000048BB000000000000000048B9000000000000000048BA000000000000000048BF000000000000000049B9000000000000000048BE000000000000000049BC000000000000000049BD000000000000000049BE000000000000000049BF000000000000000049B80200000000000000490F40C3490F40EA490F40DA480F40CA480F40F8490F40D1480F40F04C0F40CB4D0F40D44C0F40DF4C0F40E74C0F40EF4D0F40F14C0F40FD4983C0FF75C25B415C415D415E415F5DC3
...
```
but i open to suggestions as to how test that.
I also have gone with the suggestion to default to this new mode.
This was irking me for some time, so i'm happy to finally see progress here.
Looking forward to feedback.
Reviewers: courbet, gchatelet
Reviewed By: courbet, gchatelet
Subscribers: mstojanovic, RKSimon, llvm-commits, courbet, gchatelet
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76921
This allows defining substitutions in terms of other substitutions. For
example, a %build substitution could be defined in terms of a %cxx
substitution as '%cxx %s -o %t.exe' and the script would be properly
expanded.
Differential Revision: https://reviews.llvm.org/D76178
to remap object file paths (but no source paths) before
processing. This is meant to be used for Clang objects where the
module cache location was remapped using ``-fdebug-prefix-map``; to
help dsymutil find the Clang module cache.
<rdar://problem/55685132>
Differential Revision: https://reviews.llvm.org/D76391
This adds the --debug-vars option to llvm-objdump, which prints
locations (registers/memory) of source-level variables alongside the
disassembly based on DWARF info. A vertical line is printed for each
live-range, with a label at the top giving the variable name and
location, and the position and length of the line indicating the program
counter range in which it is valid.
Currently, this only works for object files, not executables or shared
libraries.
Differential revision: https://reviews.llvm.org/D70720
The examples for different options were inconsistently indented in
the HTML display. As they are tied to the options, this change
normalises to indent them the same as the option description body.
"--functions none" and "--functions=none" are not the same. One is the
option "--functions" with its default value of "linkage", followed by an
input address of "none", and the other is "--functions" with the value
"none". This patch fixes the doc to match the actual behaviour by adding
an extra '=' sign in the allowed values description.
Summary:
gnu addr2line prints DWARF line table discriminators like so:
<file>:<line> (discriminator <Number>)
This matches that behavior.
Document how and when --output-style=GNU prints discriminators
Add test for new GNU-style discriminator printing.
Reviewers: rupprecht, labath, jhenderson
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73318
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch adds support for selecting a
matching format to match a numeric value against (ie. decimal, hex lower
case letters or hex upper case letters).
This commit allows to select what format a numeric value should be
matched against. The following formats are supported: decimal value,
lower case hex value and upper case hex value. Matching formats impact
both the format of numeric value to be matched as well as the format of
accepted numbers in a definition with empty numeric expression
constraint.
Default for absence of format is decimal value unless the numeric
expression constraint is non null and use a variable in which case the
format is the one used to define that variable. Conclict of format in
case of several variable being used is diagnosed and forces the user to
select a matching format explicitely.
This commit also enables immediates in numeric expressions to be in any
radix known to StringRef's GetAsInteger method, except for legacy
numeric expressions (ie [[@LINE+<offset>]] which only support decimal
immediates.
Copyright:
- Linaro (changes up to diff 183612 of revision D55940)
- GraphCore (changes in later versions of revision D55940 and
in new revision created off D55940)
Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson
Reviewed By: jhenderson, arichardson
Subscribers: daltenty, MaskRay, hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, kristina, hfinkel, rogfer01, JonChesterfield
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60389
The sub-heading used to contain the --only-keep-debug switch as that
switch wasn't implemented for ELF at one point. Since the switch is now
in the generic options section, and there are no other options in this
sub-heading, it is pointless and can be deleted.