Commit Graph

5669 Commits

Author SHA1 Message Date
Nico Weber e23c6cc54e [aarch64/mac] Correctly disassemble @TLVPPAGE(OFF) relocs
`llvm-otool -tV foo.o` and `llvm-objdump --macho -d foo.o` would
previously fail on object files containing @TLVPPAGE or @TLVPPAGEOFF relocs.

Move llvm-objdump-specific test from
llvm/test/MC/AArch64/arm64-tls-modifiers-darwin.s to new
llvm/test/tools/llvm-objdump/MachO/disassemble-arm64-tlv-modifers.test
and add a test for this fix to that new file.

Fixes PR52356.

Differential Revision: https://reviews.llvm.org/D112843
2021-11-10 10:41:18 -05:00
Esme-Yi ab97ffb96a Reland [XCOFF][yaml2obj] support for the auxiliary file header.
Summary: Fix the build failure on MSVC by making the `T` and `U` of the function
'T llvm::Optional<T>::getValueOr<llvm::yaml::Hex32>(U &&) const &' the same.

Differential Revision: https://reviews.llvm.org/D111487
2021-11-10 07:23:56 +00:00
David Blaikie 58b1b6414b llvm-dwarfdump: Lookup type units when prettyprinting types
This handles DWARFv4 and DWARFv5 type units, but not Split DWARF type
units. That'll come in a follow-up patch.
2021-11-09 16:58:22 -08:00
Gulfem Savrun Yeniceri 126e7611c7 [compiler-rt] Fix diagnostic in InstrProfError
This patch fixes some issues introduced in
https://reviews.llvm.org/D108942:

1) Remove the default label to fix the bots that use
-Werror,-Wcovered-switch-default
2) Modify the malformed test to fix the bots that are
built without zlib support
3) Modify some error messages in malformed profiles
2021-11-09 20:30:03 +00:00
Dwight Guth 16c3db8def [llvm-reduce] Fix invalid reduction in basic-blocks delta pass
Previously, if the basic-blocks delta pass tried to remove a basic block
that was the last basic block in a function that did not have external
or weak linkage, the resulting IR would become invalid. Since removing
the last basic block in a function is effectively identical to removing
the function body itself, we check explicitly for this case and if we
detect it, we run the same logic as in ReduceFunctionBodies.cpp

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D113486
2021-11-09 10:43:38 -08:00
Dwight Guth fbfd327fdf [llvm-reduce] Add flag to start at finer granularity
Sometimes if llvm-reduce is interrupted in the middle of a delta pass on
a large file, it can take quite some time for the tool to start actually
doing new work if it is restarted again on the partially-reduced file. A
lot of time ends up being spent testing large chunks when these large
chunks are very unlikely to actually pass the interestingness test. In
cases like this, the tool will complete faster if the starting
granularity is reduced to a finer amount. Thus, we introduce a command
line flag that automatically divides the chunks into smaller subsets a
fixed, user-specified number of times prior to beginning the core loop.
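
The pre-splitting described above can be sketched roughly as follows. This is a minimal Python model, not the tool's C++ code: the function name, the parameter name, and the 1-based inclusive chunk representation are all assumptions made for illustration.

```python
def initial_chunks(num_targets, starting_granularity_level=0):
    """Split [1, num_targets] into chunks before the core loop starts.

    Each level halves every chunk that still holds more than one target.
    Names and the (begin, end) inclusive representation are hypothetical.
    """
    chunks = [(1, num_targets)]
    for _ in range(starting_granularity_level):
        next_chunks = []
        for begin, end in chunks:
            if begin == end:
                # A single-target chunk cannot be divided any further.
                next_chunks.append((begin, end))
                continue
            mid = begin + (end - begin) // 2
            next_chunks.append((begin, mid))
            next_chunks.append((mid + 1, end))
        if next_chunks == chunks:
            break  # Nothing left to split; stop early.
        chunks = next_chunks
    return chunks


if __name__ == "__main__":
    print(initial_chunks(8, 2))
```

With a level of 0 the reduction starts from one chunk covering everything, which matches the behaviour described as slow on a partially-reduced file.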

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D112651
2021-11-09 10:14:08 -08:00
Fangrui Song 5f1e509579 [llvm-objdump] -p: Dump PE header for PE/COFF
For a trivial DLL built with `clang --target=x86_64-windows -O2 -c a.c; lld-link -subsystem:console -dll a.o -out:a.dll`,
`objdump -p` vs `llvm-objdump -p`:

```
-a.dll:     file format pei-x86-64
-
+a.dll: file format coff-x86-64
 Characteristics 0x2022
        executable
        large address aware
@@ -57,4 +56,4 @@
 Entry d 0000000000000000 00000000 Delay Import Directory
 Entry e 0000000000000000 00000000 CLR Runtime Header
 Entry f 0000000000000000 00000000 Reserved
-
+Export Table:
```

For a Linux image (`vmlinuz-5.10.76-gentoo-r1`) built with `CONFIG_EFI_STUB=y`

```
-vmlinuz-5.10.76-gentoo-r1:     file format pei-x86-64
-
-Characteristics 0x20e
+vmlinuz-5.10.76-gentoo-r1:     file format coff-x86-64
+Characteristics 0x206
        executable
        line numbers stripped
-       symbols stripped
        debugging information removed

 Time/Date              Wed Dec 31 16:00:00 1969
@@ -55,10 +53,4 @@
 Entry d 0000000000000000 00000000 Delay Import Directory
 Entry e 0000000000000000 00000000 CLR Runtime Header
 Entry f 0000000000000000 00000000 Reserved
-
-
-PE File Base Relocations (interpreted .reloc section contents)
-
-Virtual Address: 000037ca Chunk size 10 (0xa) Number of fixups 1
-       reloc    0 offset    0 [37ca] ABSOLUTE
-
+Export Table:
```

`symbols stripped` looks like a GNU objdump problem.

Reviewed By: jhenderson, alexander-shaposhnikov

Differential Revision: https://reviews.llvm.org/D113356
2021-11-09 10:08:41 -08:00
Gulfem Savrun Yeniceri ee88b8d63e [compiler-rt] Add more diagnostic to InstrProfError
If profile data is malformed for any kind of reason, we generate
an error that only reports "malformed instrumentation profile data"
without any further information. This patch extends InstrProfError
class to receive an optional error message argument, so that we can
do better error reporting.

Differential Revision: https://reviews.llvm.org/D108942
2021-11-09 18:04:12 +00:00
Alexey Lapshin c8ae08987d [llvm-dwarfdump] dump link to the immediate parent.
It is often useful to know which die is the parent of the current die.
This patch adds information about parent offset into the dump:

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer    ("by_hand")

0x00000014:   DW_TAG_base_type (0x0000000b)  <<<<<<<<<<<<<<
                DW_AT_name      ("int")

Now it is easy to see which die is the parent of the current die.
This patch makes that behaviour the default.
We can make it opt-in if necessary.

This functionality differs from the existing "--show-parents" option
in that parent information is shown for all dies and
only the link to the immediate parent is shown.

Differential Revision: https://reviews.llvm.org/D113406
2021-11-09 14:14:06 +03:00
Simon Pilgrim 32a4a883f6 Revert rGe1eec7601b6988b35ae3cdc8d67cf3cf4e1361dd "[XCOFF][yaml2obj] support for the auxiliary file header."
This is failing on MSVC builds: https://lab.llvm.org/buildbot/#/builders/86/builds/23436
2021-11-09 11:02:13 +00:00
Esme-Yi e1eec7601b [XCOFF][yaml2obj] support for the auxiliary file header.
Summary:
  This patch adds yaml2obj support for the auxiliary
  file header of XCOFF.

Reviewed By: DiggerLin, jhenderson

Differential Revision: https://reviews.llvm.org/D111487
2021-11-09 09:48:40 +00:00
Paul Robinson 38be8f4057 Add llvm-tli-checker
A new tool that compares TargetLibraryInfo's opinion of the availability
of library function calls against the functions actually exported by a
specified set of libraries. Can be helpful in verifying the correctness
of TLI for a given target, and avoid mishaps such as had to be addressed
in D107509 and 94b4598d.

The tool currently supports ELF object files only, although it's unlikely
to be hard to add support for other formats.

Re-commits 62dd488 with changes to use pre-generated objects, as not all
bots have ld.lld available.

Differential Revision: https://reviews.llvm.org/D111358
2021-11-08 16:29:28 -08:00
Paul Robinson 1297c21406 Revert "Add llvm-tli-checker"
Not all bots have ld.lld available.
This reverts commit 62dd488164.
2021-11-08 15:48:29 -08:00
Paul Robinson 62dd488164 Add llvm-tli-checker
A new tool that compares TargetLibraryInfo's opinion of the availability
of library function calls against the functions actually exported by a
specified set of libraries. Can be helpful in verifying the correctness
of TLI for a given target, and avoid mishaps such as had to be addressed
in D107509 and 94b4598d.

The tool currently supports ELF object files only, although it's unlikely
to be hard to add support for other formats.

Differential Revision: https://reviews.llvm.org/D111358
2021-11-08 14:59:13 -08:00
Adrian Prantl 8bd8dd16e2 Extend obj2yaml to optionally preserve raw __LINKEDIT/__DATA segments.
I am planning to upstream MachOObjectFile code to support Darwin
chained fixups. In order to test the new parser features we need a way
to produce correct (and incorrect) chained fixups. Right now the only
tool that can produce them is the Darwin linker. To avoid having to
check in binary files, this patch allows obj2yaml to print a hexdump
of the raw LINKEDIT and DATA segments, which both allows us to
bootstrap the parser and enables us to easily create malformed inputs
to test error handling in the parser.

This patch adds two new options to obj2yaml:

  -raw-data-segment
  -raw-linkedit-segment

Differential Revision: https://reviews.llvm.org/D113234
2021-11-08 11:30:12 -08:00
Zarko Todorovski c4396b77ae [LLVM][llvm-cfi] Inclusive language: replace uses of blacklist with ignorelist
Replace the description and file names for this argument. As far as I understand,
this is a positional argument, and I don't believe this change breaks any existing
interfaces.

Reviewed By: hctim, MaskRay

Differential Revision: https://reviews.llvm.org/D113316
2021-11-08 10:05:52 -05:00
Esme-Yi 9b6f264d2b [XCOFF][llvm-readobj] improve the relocation output.
Summary:
	1. implemented the unexpanded relocations output.
	2. modified the expanded output format to align.

Reviewed By: shchenz, jhenderson

Differential Revision: https://reviews.llvm.org/D111700
2021-11-08 03:15:52 +00:00
David Blaikie 0a5c26f2ef DebugInfo: Simplified Template Names: drop unneeded space in arrays
Matching a recent clang change I've made, now 'int[3]' is formatted
without the space between the type and array bound. This commit updates
libDebugInfoDWARF/llvm-dwarfdump to match that formatting.
2021-11-05 22:50:57 -07:00
wlei 5bf191a381 [llvm-profgen] Fix index out of bounds error while using ip.advance
Previously we assumed there were some non-executing sections at the bottom of the text section so that we wouldn't hit the array's bound. But on a BOLTed binary, it turned out that the .bolt section, which can be profiled, is at the bottom of the text section, and that crashed llvm-profgen. This change tries to fix it.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D113238
2021-11-05 18:38:40 -07:00
David Blaikie f57d0e2726 DWARF Simplified Template Names: Narrow down the handling for operator overloads
Actually we can, for now, remove the explicit "operator" handling
entirely - since clang currently won't try to flag any of these as
rebuildable. That seems like a reasonable state for now, but it could be
narrowed down to only apply to conversion operators, most likely - but
would need more nuance for op> and op>> since they would be incorrectly
flagged as already having their template arguments (due to the trailing
'>').
2021-11-05 15:41:56 -07:00
Fangrui Song 26a8ceba3e [llvm-readobj] Display DT_RELRSZ/DT_RELRENT as " (bytes)"
to match RELSZ/RELENT.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D113206
2021-11-05 10:02:49 -07:00
gbreynoo ced9287c2d [llvm-objdump] Fix the Assertion failure when providing invalid --debug-vars or --dwarf values
As seen in https://bugs.llvm.org/show_bug.cgi?id=52213 llvm-objdump
asserts if either the --debug-vars or the --dwarf options are provided
with invalid values. As suggested, this fix adds use of a default value
to these options and errors when given bad input.

Differential Revision: https://reviews.llvm.org/D112183
2021-11-04 11:01:32 +00:00
wlei 138202a8c3 [llvm-profgen] Warn on invalid range and show warning summary
Two things in this diff:

1) Warn on invalid ranges. There are currently three kinds of checks; see the detailed messages in the code.

2) In some situations, llvm-profgen gives lots of warnings on truncated stacks, which is noisy. This change adds a `--show-detailed-warning` switch and skips those warnings unless it is given. In their place, we show a summary of those warnings with the percentage of cases that have each issue.

Example of warning summary.
```
warning: 0.05%(1120/2428958) cases with issue: Profile context truncated due to missing probe for call instruction.
warning: 0.00%(2/178637) cases with issue: Range does not belong to any functions, likely from external function.
```
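
The summary lines above appear to follow a `percent(count/total)` pattern. A minimal Python sketch of that formatting, assuming the percentage is printed with two decimals (the helper name is hypothetical and this is not the tool's actual code):

```python
def format_warning_summary(count, total, message):
    """Build one summary line: share of affected cases plus the raw counts."""
    # Guard against an empty population so the division cannot fail.
    percent = 100.0 * count / total if total else 0.0
    return f"warning: {percent:.2f}%({count}/{total}) cases with issue: {message}"


if __name__ == "__main__":
    print(format_warning_summary(
        1120, 2428958,
        "Profile context truncated due to missing probe for call instruction."))
```

Printing the raw counts next to the rounded percentage keeps very rare issues visible even when the percentage rounds to 0.00%.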

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D111902
2021-11-02 19:55:55 -07:00
Hongtao Yu d0eb472f33 [llvm-profdata] Print out section flags for FunctionMetadata section
As titled.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D113064
2021-11-02 17:59:22 -07:00
Arthur Eubanks f54a8759f0 [llvm-reduce] Reduce more GlobalValue properties
Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D112885
2021-11-02 08:47:41 -07:00
Arthur Eubanks 80ba72b07b [llvm-reduce] Reduce some GlobalObject properties
Specifically, the section and the alignment.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D112884
2021-11-02 08:47:32 -07:00
Frederic Cambus 650311737e [llvm-readobj] Add support for reading OpenBSD ELF core notes.
Notes generated in OpenBSD core files provide additional information
about the kernel state and CPU registers. These notes are described
in core.5, which can be viewed here: https://man.openbsd.org/core.5

Differential Revision: https://reviews.llvm.org/D111966
2021-11-02 10:18:54 +01:00
Markus Lavin fd41738e2c Recommit "[llvm-reduce] Add MIR support"
(Second try. Need to link against CodeGen and MC libs.)

The llvm-reduce tool has been extended to operate on MIR (import, clone and
export). Current limitation is that only a single machine function is
supported. A single reducer pass that operates on machine instructions (while
on SSA-form) has been added. Additional MIR specific reducer passes can be
added later as needed.

Differential Revision: https://reviews.llvm.org/D110527
2021-11-02 10:16:42 +01:00
Markus Lavin aee7f3384b Revert "[llvm-reduce] Add MIR support"
This reverts commit bc2773cb1b.

Broke the clang-ppc64le-linux-multistage build. Reverting while I
investigate.
2021-11-02 09:41:02 +01:00
Markus Lavin bc2773cb1b [llvm-reduce] Add MIR support
The llvm-reduce tool has been extended to operate on MIR (import, clone and
export). Current limitation is that only a single machine function is
supported. A single reducer pass that operates on machine instructions (while
on SSA-form) has been added. Additional MIR specific reducer passes can be
added later as needed.

Differential Revision: https://reviews.llvm.org/D110527
2021-11-02 09:14:56 +01:00
wlei 3f3103c6a9 [llvm-profgen] Fill zero count for all function ranges
Allow filling zero counts for all the function ranges even if no samples hit that function. Add a switch for this.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D112858
2021-11-01 09:57:05 -07:00
Esme-Yi 81441cf44c [XCOFF] [llvm-readobj] replace tests using binary as input
with tests generated by yaml2obj.

Summary: Because yaml2obj supports basic transforming for XCOFF,
         some of the binary inputs used in the tests of llvm-readobj
         can be replaced with yaml files.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D111699
2021-11-01 08:43:32 +00:00
wlei f5537643b8 [llvm-profgen] Update total samples by accumulating all its body samples
As in a probe-based profile, the total samples should be the sum of all body samples. This patch fixes that with a post-processing update for the line-number based profile. We tested it on our internal services; results showed no performance change.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D112672
2021-10-29 10:36:57 -07:00
wlei 2f8196db92 [llvm-profgen] Fix bug of populating profile symbol list
The previous implementation of populating the profile symbol list was wrong: it only included the profiled symbols. It should use all symbols, so this switches to using the symbols from debug info. Also turn the flag off by default.

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D111824
2021-10-29 09:59:12 -07:00
Arthur Eubanks 177a703710 [llvm-reduce] Actually skip invalid candidates in operands-to-args
This was checked while counting but not actually when doing the reduction, resulting in crashes.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D112766
2021-10-29 09:14:18 -07:00
David Blaikie b65f24a74c llvm-dwarfdump --verify: Don't diagnose functions in different sections as overlapping
Functions in different sections (common in object files - inline
functions, -ffunction-sections, etc) can't overlap, so factor in the
section when diagnosing overlapping address ranges.

This removes a major false-positive when running llvm-dwarfdump on
unlinked code.
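
The idea of factoring the section into the overlap check can be sketched as below. This is an illustrative Python model only: the tuple layout, the half-open ranges, and the function name are assumptions, not the verifier's data structures.

```python
from collections import defaultdict


def find_overlaps(functions):
    """Report overlapping address ranges, comparing only within one section.

    `functions` is a list of (name, section_index, low_pc, high_pc) tuples
    with half-open ranges [low_pc, high_pc). The layout is hypothetical.
    """
    by_section = defaultdict(list)
    for name, section, low, high in functions:
        by_section[section].append((low, high, name))

    overlaps = []
    for ranges in by_section.values():
        ranges.sort()
        widest = None  # (high, name) of the range reaching furthest so far
        for low, high, name in ranges:
            if widest is not None and low < widest[0]:
                overlaps.append((widest[1], name))
            if widest is None or high > widest[0]:
                widest = (high, name)
    return overlaps


if __name__ == "__main__":
    # f and g both start at address 0 but live in different sections.
    print(find_overlaps([("f", 1, 0, 16), ("g", 2, 0, 16)]))
```

In an unlinked object file every section typically starts at address 0, so comparing raw addresses across sections would flag almost every pair of functions.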
2021-10-28 17:13:57 -07:00
Hongtao Yu 259e4c5658 [CSSPGO] Trim cold base profiles for the CS preinliner.
Adding support to the CS preinliner to trim cold base profiles. This makes trimming consistent with the inline decision made by the preinliner. Also disable the existing profile merger when preinliner is on unless explicitly specified.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D112489
2021-10-27 22:50:27 -07:00
Djordje Todorovic 40c2bdf6d1 [llvm-locstats] Move the test from D110621 into test/llvm-locstats/ dir 2021-10-27 17:36:19 +02:00
djtodoro 30a3652b6a [llvm-locstats] Report a warning if overflow was detected by llvm-dwarfdump
Catch that llvm-dwarfdump detected an overflow in statistics.

Differential Revision: https://reviews.llvm.org/D110621
2021-10-27 14:35:29 +02:00
Nico Weber 3c0cf7e1a9 Unbreak code_signature_lc.test on macOS after 911be05743 2021-10-26 21:05:48 -04:00
Daniel Rodríguez Troitiño 911be05743 [test][objcopy] Replace GNU sed extension with BSD compatible syntax.
GNU sed offers `,+4d` to delete a line and the next four lines, but BSD
sed doesn't seem to support it (at least in macOS 10.15, but it seems to
in my 11.6 version).

Replace the usage of the extension with the equivalent syntax that works
both in BSD and GNU sed. I don't have a macOS 10.15 to check, but this
works in both my macOS 11.6 and Linux machines.

Differential Revision: https://reviews.llvm.org/D112583
2021-10-26 17:35:56 -07:00
David Blaikie 3ac709b6ce llvm-dwarfdump --verify: Exit non-zero on simplified template name rebuilding failures 2021-10-26 15:57:16 -07:00
Nuri Amari a299b24712 Regenerate LC_CODE_SIGNATURE during llvm-objcopy operations
**Context:**

This is a second attempt at introducing signature regeneration to llvm-objcopy. In this diff: https://reviews.llvm.org/D109840, a script was introduced to test
the validity of a code signature. In this diff: https://reviews.llvm.org/D109803 (now reverted), an effort was made to extract the signature generation behavior out of LLD into a common location for use in llvm-objcopy. In this diff: https://reviews.llvm.org/D109972 it was decided that there was no appropriate common location and that a small amount of duplication to bring signature generation to llvm-objcopy would be better. This diff introduces this duplication.

**Summary**

Prior to this change, if a LC_CODE_SIGNATURE load command
was included in the binary passed to llvm-objcopy, the command and
associated section were simply copied and included verbatim in the
new binary. If the rest of the binary was modified at all, this results
in an invalid Mach-O file. This change regenerates the signature
rather than copying it.

The code_signature_lc.test test was modified to include the yaml
representation of a small signed MachO executable in order to
effectively test the signature generation.

Reviewed By: alexander-shaposhnikov, #lld-macho

Differential Revision: https://reviews.llvm.org/D111164
2021-10-26 14:51:13 -07:00
zhijian c2d2fb5093 Address a test error on Windows: exclude the test llvm/test/tools/llvm-readobj/XCOFF/xcoff-auxiliary-header.test from
Windows.
http://45.33.8.238/win/47662/step_11.txt
for
https://reviews.llvm.org/D82549
2021-10-26 13:56:52 -04:00
zhijian 158083f0de [AIX][XCOFF] parsing xcoff object file auxiliary header
Summary:

The patch adds support for parsing the XCOFF object file auxiliary header with llvm-readobj via the "auxiliary-headers" option.

The format of the auxiliary header is described at
https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/filesreference/XCOFF.html#XCOFF__fyovh386shar

Reviewers: James Henderson, Jason Liu, Hubert Tong, Esme yi, Sean Fertile.

Differential Revision: https://reviews.llvm.org/D82549
2021-10-26 10:40:25 -04:00
wlei a5f411b7f8 [llvm-profgen] Allow unsymbolized profile as perf input
This change allows an unsymbolized profile as input. The unsymbolized profile is created by `llvm-profgen` with `--skip-symbolization`; it is produced after sample aggregation but before symbolization, so it has a much smaller file size. It can be used for sample merging and trimming, and is also useful for debugging or adding test cases. A switch `--unsymbolized-profile=file-patch` is added for this.

Format of unsymbolized profile:
```

   [context stack1]    # If it's a CS profile
      number of entries in RangeCounter
      from_1-to_1:count_1
      from_2-to_2:count_2
      ......
      from_n-to_n:count_n
      number of entries in BranchCounter
      src_1->dst_1:count_1
      src_2->dst_2:count_2
      ......
      src_n->dst_n:count_n
    [context stack2]
      ......
```
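
A reader for the layout above could look roughly like this. It is a Python sketch under stated assumptions: addresses are taken to be hexadecimal without a `0x` prefix, counts decimal, and the function name and the dictionary shape are hypothetical rather than anything llvm-profgen defines.

```python
def parse_unsymbolized_profile(text):
    """Parse records of: optional [context], range counter, branch counter."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    records = []
    i = 0
    while i < len(lines):
        context = None
        if lines[i].startswith("["):
            # Context line, present only for context-sensitive profiles.
            context = lines[i][1:lines[i].index("]")]
            i += 1

        num_ranges = int(lines[i])
        i += 1
        ranges = []
        for _ in range(num_ranges):
            span, count = lines[i].rsplit(":", 1)
            start, end = span.split("-")
            ranges.append((int(start, 16), int(end, 16), int(count)))
            i += 1

        num_branches = int(lines[i])
        i += 1
        branches = []
        for _ in range(num_branches):
            edge, count = lines[i].rsplit(":", 1)
            src, dst = edge.split("->")
            branches.append((int(src, 16), int(dst, 16), int(count)))
            i += 1

        records.append({"context": context, "ranges": ranges, "branches": branches})
    return records


if __name__ == "__main__":
    sample = "[main:1 @ foo]\n  1\n  400500-400520:10\n  1\n  400520->400530:7\n"
    print(parse_unsymbolized_profile(sample))
```

Because each counter is preceded by its entry count, the reader never has to guess where one record ends and the next begins.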

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D111750
2021-10-25 23:58:08 -07:00
Jack Anderson d7733f8422 [DebugInfo] Expand ability to load 2-byte addresses in dwarf sections
Some dwarf loaders in LLVM are hard-coded to only accept 4-byte and 8-byte address sizes. This patch generalizes acceptance into `DWARFContext::isAddressSizeSupported` and provides a common way to generate rejection errors.

The MSP430 target has been given new tests to cover dwarf loading cases that previously failed due to 2-byte addresses.
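
The shape of a shared address-size check with a common rejection error might look like the sketch below. It is written in Python for brevity, the set of accepted sizes (2, 4, 8) is an assumption drawn only from this message, and the helper names are hypothetical, not the `DWARFContext` API.

```python
# Assumption: sizes taken from this commit message, not from the LLVM source.
SUPPORTED_ADDRESS_SIZES = (2, 4, 8)


def is_address_size_supported(address_size):
    """Single place that decides which address sizes loaders accept."""
    return address_size in SUPPORTED_ADDRESS_SIZES


def check_address_size(address_size, what):
    """Return the size if accepted, else raise one uniformly worded error."""
    if not is_address_size_supported(address_size):
        supported = ", ".join(str(s) for s in SUPPORTED_ADDRESS_SIZES)
        raise ValueError(
            f"{what}: address size {address_size} is not supported "
            f"(supported are {supported})")
    return address_size


if __name__ == "__main__":
    print(check_address_size(2, ".debug_aranges"))
```

Routing every loader through one predicate means a newly supported size only has to be added in one place.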

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D111953
2021-10-21 17:31:00 -07:00
Wenlei He e8c245dcd3 [llvm-profgen] Skip duplication factor outside of body sample computation
We incorrectly used the duplication factor for total samples even though we already accumulate samples instead of taking MAX. This causes profiles to have bloated total samples for functions with loops unrolled or vectorized. The change fixes the issue for total samples, head samples and call target samples.

Differential Revision: https://reviews.llvm.org/D112042
2021-10-19 23:10:45 -07:00
Arthur Eubanks 9660563950 [llvm-reduce] Add reduction passes to reduce operands to undef/1/0
Having non-undef constants in a final llvm-reduce output is nicer than
having undefs.

This splits the existing reduce-operands pass into three, one which does
the same as the current pass of reducing to undef, and two more to
reduce to the constant 1 and the constant 0. Do not reduce to undef if
the operand is a ConstantData, and do not reduce 0s to 1s.

Reducing GEP operands very frequently causes invalid IR (since types may
not match up if we index differently into a struct), so don't touch GEPs.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D111765
2021-10-19 15:25:21 -07:00
Simon Pilgrim 0bb32b1b21 [X86][SLM] Fix BitTest+Set uops + port usage
Both ports are required for BitTest ops. Update the uops counts + port usage based off the most recent llvm-exegesis captures and what Intel AoM / Agner reports as well.
2021-10-17 18:13:15 +01:00
Simon Pilgrim 5ed5df4802 [X86][SLM] Fix uops for PCMPISTR/PCMPISTR instructions
Based off a recent llvm-exegesis capture and what Intel AoM / Agner reports as well.
2021-10-17 18:13:14 +01:00
Simon Pilgrim 680afaaa5d [X86][SLM] Fix uops for PCLMULQDQ
Based off a recent llvm-exegesis capture and what Intel AoM / Agner reports as well.
2021-10-17 18:13:14 +01:00
Simon Pilgrim 498c7236bc [X86][SLM] +1uop for PSHUFBrm xmm
Extra 1uop for folded pshufb ops, based off a recent llvm-exegesis capture and what Intel AoM / Agner reports as well.
2021-10-17 18:13:14 +01:00
djtodoro c450e47a8c [llvm-dwarfdump] Fix unsigned overflow when calculating stats
This fixes https://bugs.llvm.org/show_bug.cgi?id=51652.

The idea is to bump all the stat fields to 64-bit wide
unsigned integers. I've confirmed this resolves
the use case for chromium.
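
The effect of widening the counters can be illustrated with a small Python model of fixed-width unsigned addition. The helper name is hypothetical and the values are made up; this only demonstrates the wraparound, not the statistics code itself.

```python
def add_wrapping(total, value, bits):
    """Add two unsigned integers the way a `bits`-wide counter would."""
    return (total + value) & ((1 << bits) - 1)


if __name__ == "__main__":
    near_limit = 0xFFFFFFF0
    # A 32-bit counter silently wraps around to a small number.
    print(hex(add_wrapping(near_limit, 0x20, 32)))
    # A 64-bit counter keeps the true sum.
    print(hex(add_wrapping(near_limit, 0x20, 64)))
```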

Differential Revision: https://reviews.llvm.org/D109217
2021-10-15 12:15:58 +02:00
Craig Topper 3ff9cc01f2 [X86] Use CMOVNS for abs instead of CMOVGE.
CMOVGE reads SF and OF. CMOVNS only reads SF. This matches with
other recent changes to use a single flag where possible. It also
matches gcc codegen.

I believe this technically changes whether the conditional move happens
on INT_MIN, but for INT_MIN both registers are the same so it doesn't
matter.
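
The INT_MIN remark can be checked with a small flag model. This Python sketch assumes the abs sequence is "negate, then conditionally move based on the flags of the negation"; that instruction sequence and all helper names are assumptions for illustration, not taken from the patch.

```python
INT_MIN = -(1 << 31)


def to_signed32(value):
    """Wrap an integer to a signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def neg_with_flags(x):
    """Model NEG on a 32-bit register: result plus the SF and OF flags."""
    result = to_signed32(-x)
    sign_flag = result < 0
    overflow_flag = x == INT_MIN  # NEG overflows only for the minimum value
    return result, sign_flag, overflow_flag


def abs_with_ge(x):
    """Pick the negated value when the GE condition (SF == OF) holds."""
    neg, sf, of = neg_with_flags(x)
    return neg if sf == of else x


def abs_with_ns(x):
    """Pick the negated value when the NS condition (SF == 0) holds."""
    neg, sf, _ = neg_with_flags(x)
    return neg if not sf else x


if __name__ == "__main__":
    for value in (5, -5, 0, INT_MIN):
        print(value, abs_with_ge(value), abs_with_ns(value))
```

For INT_MIN the two conditions disagree (GE holds, NS does not), but the negated value equals the original, so both variants return the same result.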

Differential Revision: https://reviews.llvm.org/D111826
2021-10-14 12:28:28 -07:00
Kai Nacke b050564d3e [AIX] Ignore case when comparing output from od
POSIX does not define the exact output from od tool.
While most implementations use lower case characters in hex output,
the z/OS USS implementation uses upper case characters.
To avoid LIT failures, the FileCheck option to ignore the case must
be used when checking hex bytes.

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D111427
2021-10-14 13:51:02 -04:00
Wenlei He a316343e19 [llvm-profgen] Allow generating AutoFDO profile from CSSPGO binary
Add `-use-dwarf-correlation` switch to allow llvm-profgen to generate AutoFDO profile for binaries built with CSSPGO (pseudo-probe).

Differential Revision: https://reviews.llvm.org/D111776
2021-10-14 09:11:56 -07:00
wlei 30ca33eab0 [llvm-profgen] Ignore the whole trace with the leading external branch
The first LBR entry can be an external branch, we should ignore the whole trace.

```
     7f7448e889e4 0x7f7448e889e4/0x7f7448e88826/P/-/-/1  0x7f7448e8899f/0x7f7448e889d8/P/-/-/4  ...
```

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D111749
2021-10-13 16:52:29 -07:00
Michael Kruse dd71b65ca8 [llvm-reduce] Introduce operands-to-args pass.
Instead of setting operands to undef as the "operands" pass does,
convert the operands to a function argument. This avoids having to
introduce undef values into the IR which have some unpredictability
during optimizations.

For instance,

    define void @func() {
    entry:
      %val = add i32 32, 21
      store i32 %val, i32* null
      ret void
    }

is reduced to

    define void @func(i32 %val) {
    entry:
      %val1 = add i32 32, 21
      store i32 %val, i32* null
      ret void
    }

(note that the instruction %val is renamed to %val1 when printing
the IR to avoid ambiguity; ideally %val1 would be removed by dce or the
instruction reduction pass)

Any call to @func is replaced with a call to the function with the
new signature and filled with undef. This is not ideal for IPA passes,
but those are out of scope for now.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D111503
2021-10-13 09:54:03 -05:00
Arthur Eubanks 337cf0a5ab [llc] Support -time-trace in llc
Mostly copied from opt.cpp.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D111466
2021-10-11 10:16:46 -07:00
Esme-Yi a00ff71668 [XCOFF] Improve error message context.
Summary: This patch improves the error message context of the
XCOFF interfaces by providing more details.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D110320
2021-10-11 02:52:20 +00:00
David Green adec922361 [AArch64] Make -mcpu=generic schedule for an in-order core
We would like to start pushing -mcpu=generic towards enabling the set of
features that improves performance for some CPUs, without hurting any
others. A blend of the performance options hopefully beneficial to all
CPUs. The largest part of that is enabling in-order scheduling using the
Cortex-A55 schedule model. This is similar to the Arm backend change
from eecb353d0e which made -mcpu=generic perform in-order scheduling
using the cortex-a8 schedule model.

The idea is that in-order cpus require the most help in instruction
scheduling, whereas out-of-order cpus can for the most part out-of-order
schedule around different codegen. Our benchmarking suggests that
hypothesis holds. When running on an in-order core this improved
performance by 3.8% geomean on a set of DSP workloads, 2% geomean on
some other embedded benchmark and between 1% and 1.8% on a set of
singlecore and multicore workloads, all running on a Cortex-A55 cluster.

On an out-of-order cpu the results are a lot more noisy but show flat
performance or an improvement. On the set of DSP and embedded
benchmarks, run on a Cortex-A78 there was a very noisy 1% speed
improvement. Using the most detailed results I could find, SPEC2006 runs
on a Neoverse N1 show a small increase in instruction count (+0.127%),
but a decrease in cycle counts (-0.155%, on average). The instruction
count is very low noise, the cycle count is more noisy with a 0.15%
decrease not being significant. SPEC2k17 shows a small decrease (-0.2%)
in instruction count leading to a -0.296% decrease in cycle count. These
results are within noise margins but tend to show a small improvement in
general.

When specifying an Apple target, clang will set "-target-cpu apple-a7"
on the command line, so should not be affected by this change when
running from clang. This also doesn't enable more runtime unrolling like
-mcpu=cortex-a55 does, only changing the schedule used.

A lot of existing tests have been updated. This is a summary of the important
differences:
 - Most changes are the same instructions in a different order.
 - Sometimes this leads to very minor inefficiencies, such as requiring
   an extra mov to move variables into r0/v0 for the return value of a test
   function.
 - misched-fusion.ll was no longer fusing the pairs of instructions it
   should, as per D110561. I've changed the schedule used in the test
   for now.
 - neon-mla-mls.ll now uses "mul; sub" as opposed to "neg; mla" due to
   the different latencies. This seems fine to me.
 - Some SVE tests do not always remove movprfx where they did before due
   to different register allocation giving different destructive forms.
 - The tests argument-blocks-array-of-struct.ll and arm64-windows-calls.ll
   produce two LDR where they previously produced an LDP due to
   store-pair-suppress kicking in.
 - arm64-ldp.ll and arm64-neon-copy.ll are missing pre/postinc on LDP.
 - Some tests such as arm64-neon-mul-div.ll and
   ragreedy-local-interval-cost.ll have more, less or just different
   spilling.
 - In aarch64_generated_funcs.ll.generated.expected one part of the
   function is no longer outlined. Interestingly if I switch this to use
   any other schedule, even less is outlined.

Some of these are expected to happen, such as differences in outlining
or register spilling. There will be places where these result in worse
codegen, places where they are better, with the SPEC instruction counts
suggesting it is not a decrease overall, on average.

Differential Revision: https://reviews.llvm.org/D110830
2021-10-09 15:58:31 +01:00
Qiu Chaofan 573531fb1f Fix typo of colon to semicolon in lit tests 2021-10-09 10:03:50 +08:00
Abhina Sreeskantharajan 7d7b139042 [test] Use host platform specific error message substitution
This patch modifies the testcase to use error substitution so it will pass on all platforms.

Reviewed By: fanbo-meng, muiez

Differential Revision: https://reviews.llvm.org/D111320
2021-10-08 13:52:31 -04:00
wlei b1a45c62f0 [llvm-profgen] Ignore branch count against outline function
Some transformations, such as hot-cold split or coro split, can outline part of a function's ranges. Since the sample loader runs at an early stage of the backend and no split has happened at that time, the compiler can't recognize those functions, so in llvm-profgen we should attribute the samples to the original function. This is already done for the body range samples since we use the symbols from dwarf, which is created before the split.

But for branch samples, the call from the master function to its outlined function is not actually a call to the original function, so we shouldn't add head/callsite samples for it. So instead of the dwarf symbols, we use the symbols from the symbol table and ignore functions with special suffixes (like `.cold`, `.resume`) when accumulating the callsite/head samples.
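
The suffix filtering described above might be sketched as follows. This is an illustrative Python model: the suffix list contains only the two examples named in the message, and the helper names and the (symbol, count) input shape are hypothetical.

```python
# Only the two suffixes named in the message; the real list may differ.
OUTLINED_SUFFIXES = (".cold", ".resume")


def base_function_name(symbol):
    """Strip an outlining suffix (and anything after it) from a symbol."""
    for suffix in OUTLINED_SUFFIXES:
        idx = symbol.find(suffix)
        if idx > 0:
            rest = symbol[idx + len(suffix):]
            # Accept "foo.cold" and "foo.cold.1", but not "foo.colder".
            if rest == "" or rest.startswith("."):
                return symbol[:idx]
    return symbol


def count_head_samples(branch_targets):
    """Sum branch counts per target, ignoring outlined function parts."""
    head = {}
    for symbol, count in branch_targets:
        if base_function_name(symbol) != symbol:
            continue  # A jump into an outlined part is not a real call.
        head[symbol] = head.get(symbol, 0) + count
    return head


if __name__ == "__main__":
    print(count_head_samples([("foo", 5), ("foo.cold", 3), ("bar", 2)]))
```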

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D110864
2021-10-07 14:03:34 -07:00
gbreynoo 9072183cb6 [llvm-objdump] Fix --prefix and --prefix-strip
In the command guide, --prefix and --prefix-strip are used in the form
--prefix=<prefix>; however, they are currently accepted in the form --prefix
<prefix>. This change fixes these options to match the command guide.

Differential Revision: https://reviews.llvm.org/D110551
2021-10-07 15:53:45 +01:00
wlei 16516f8925 [llvm-profgen] Support symbol list for accurate profile
Differential Revision: https://reviews.llvm.org/D110859
2021-10-06 11:41:39 -07:00
Petr Hosek 24c615fa6b [InstrProfData] Bump the raw profile version to 8
This is to account for the change that made CountersPtr in __profd_
relative which landed in a1532ed275.
That change hasn't updated the raw profile version, and while the
profile layout stayed the same, profiles generated by tip-of-tree
LLVM are incompatible with 13.x tooling.

Differential Revision: https://reviews.llvm.org/D111123
2021-10-05 09:57:56 -07:00
gbhyamso 02895eede1 [llvm-cxxfilt][NFC] Fix test for running in Windows cmd
The test llvm\test\tools\llvm-cxxfilt\delimiters.test started failing when run
from cmd.exe on Windows after D110986 which added a unicode character (⦙) to it.
Piping the unicode character in cmd.exe causes it to be converted to a '?'.
That causes the test to fail because the llvm-cxxfilt output becomes Foo?Bar
rather than the expected Foo⦙Bar.

Redirect the echo output to and from a temporary file to get around this
problem.

It's not entirely clear what the root cause is, but two separate downstream
builders are tripping up on this, so we are landing the workaround for the
time being.

Differential Revision: https://reviews.llvm.org/D111072
2021-10-05 12:10:06 +01:00
wlei 31a5cb3292 [llvm-profgen] Filter out invalid debug line
Differential Revision: https://reviews.llvm.org/D110081
2021-10-04 19:09:06 -07:00
wlei 46cf7d75d9 [llvm-profgen] Add duplication factor for line-number based profile
This change adds a duplication factor multiplier when accumulating body samples for line-number based profiles. The body sample count will be `duplication-factor * count`. The base discriminator and duplication factor are decoded from the raw discriminator, which requires some refactoring.
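
As a rough illustration of the accumulation rule, here is a minimal Python sketch. It assumes the base discriminator and duplication factor have already been decoded from the raw discriminator; the record shape and the function name are hypothetical and do not mirror the llvm-profgen implementation.

```python
from collections import defaultdict


def accumulate_body_samples(records):
    """Accumulate body samples as duplication_factor * count.

    records: iterable of (line_offset, base_discriminator,
    duplication_factor, count) tuples (hypothetical shape).
    """
    body = defaultdict(int)
    for line, base_disc, dup_factor, count in records:
        # Each sampled instruction stands in for dup_factor copies.
        body[(line, base_disc)] += dup_factor * count
    return dict(body)
```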

Differential Revision: https://reviews.llvm.org/D109934
2021-10-04 19:08:55 -07:00
Simon Pilgrim 7cae0daee6 [X86][Atom] Fix BSR/BSF uops + port usage
Both ports are required for BitScan ops. Update the uops counts + port usage based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner reports as well.
2021-10-02 19:09:44 +01:00
Simon Pilgrim 8e7f6039fa [X86] Atom SSE shift-by-variable take 2uops/3uops not 1uop
Based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner / InstLatX64 reports as well.
2021-10-02 12:28:41 +01:00
Tomasz Miąsko f33274c7bf [llvm-cxxfilt] Replace isalnum with isAlnum from StringExtras
D104366 introduced a new llvm-cxxfilt test with non-ASCII characters,
which caused a failure on llvm-clang-x86_64-expensive-checks-win
builder, with a stack trace suggesting issue in a call to isalnum.

The argument to isalnum should be either EOF or a value that is
representable in the type unsigned char. The llvm-cxxfilt does not
perform a cast from char to unsigned char before the call, so the
value might be out of valid range.

Replace the call to isalnum with isAlnum from StringExtras, which takes
a char as the argument. This also makes the check independent of the
current locale.
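
To make the range problem concrete, the following Python sketch models it under stated assumptions: `as_signed_char` shows what a non-ASCII byte looks like when held in a signed 8-bit char, and `is_alnum_ascii` is a hypothetical stand-in for an ASCII-only, locale-independent check (it is not the StringExtras implementation).

```python
def as_signed_char(byte: int) -> int:
    """Value of a byte (0..255) when stored in a signed 8-bit char."""
    return int.from_bytes(bytes([byte]), "little", signed=True)


def is_alnum_ascii(ch: str) -> bool:
    """ASCII-only alphanumeric check for a single character.

    Unlike str.isalnum(), this never depends on Unicode categories
    or on the current locale.
    """
    return ("0" <= ch <= "9") or ("a" <= ch <= "z") or ("A" <= ch <= "Z")
```

A byte such as 0xE2 becomes a negative value as a signed char, which is neither EOF nor representable as unsigned char, so passing it to the C `isalnum` is outside the function's valid argument range.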

Differential Revision: https://reviews.llvm.org/D110986
2021-10-02 08:54:04 +02:00
zhijian 5b44c716ee [AIX]implement the --syms and using "symbol index and qualname" for --sym --symbol-description for llvm-objdump for xcoff
Summary:

For XCOFF:

Implement getSymbolFlag and getSymbolType() for the --syms option.
With llvm-objdump --sym, if the symbol is a label, also print the section containing the symbol.
With llvm-objdump --sym --symbol-description, print the symbol index and qualname for the symbol.
For example, with
--symbol-description
00000000000000c0 l .text (csect: (idx: 2) .foov[PR]) (idx: 3) .foov

and without --symbol-description
00000000000000c0 l .text (csect: .foov) .foov

Reviewers: James Henderson,Esme Yi

Differential Revision: https://reviews.llvm.org/D109452
2021-10-01 12:37:51 -04:00
Florian Hahn 57fbb9ed0e [llvm-reduce] Skip updating calls where OldF isn't the called fn.
When replacing function calls, skip call instructions where the old
function is not the called function, but e.g. the old function is passed
as an argument.

This fixes a crash due to trying to construct invalid IR for the test
case.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D109759
2021-10-01 10:52:48 +01:00
Fangrui Song 8971b99c83 [llvm-objdump/llvm-readobj/obj2yaml/yaml2obj] Support STO_RISCV_VARIANT_CC and DT_RISCV_VARIANT_CC
STO_RISCV_VARIANT_CC marks that a symbol uses a non-standard calling
convention or the vector calling convention.

See https://github.com/riscv/riscv-elf-psabi-doc/pull/190

Differential Revision: https://reviews.llvm.org/D107949
2021-09-29 16:56:52 -07:00
Wael Yehia 8b8da01d88 Revert "[LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace."
This reverts commit a60405cf03.
2021-09-29 19:43:35 +00:00
Michael Kruse d9562a8e45 [llvm-reduce] Reduce metadata references.
Before this patch, the ReduceMetadata pass removed metadata on a per-MDNode (or NamedMDNode) basis: either all references to an MDNode were kept, or all of them were removed. However, MDNodes are uniqued, meaning that references to MDNodes with the same data become references to the same MDNode. As a consequence, e.g. tbaa references to the same type all share the same MDNode reference, which makes it impossible to reduce to only the metadata on those memory accesses for which it is interesting.
Moreover, MDNodes can also be referenced by some intrinsics or by other MDNodes. These references were not considered for removal, so MDNodes might not actually be removed even if the oracle selected them for removal.

This patch changes ReduceMetadata to reduce based on removable metadata references instead. MDNodes without references are implicitly dropped anyway. References by intrinsic calls should be removed by ReduceOperands or ReduceInstructions. References in other MDNodes cannot be removed, as that would violate the immutability of MDNodes.

Additionally, ReduceMetadata pass before this patch used `setMetadata(I, NULL)` to remove references, where `I` is the index in the array returned by `getAllMetadata`. However, `setMetadata` expects a MDKind (such as `MD_tbaa`) as first argument. `getAllMetadata` does not return those in consecutive order (otherwise it would not need to be a `std::pair` with `first` representing the MDKind).

Reviewed By: aeubanks, swamulism

Differential Revision: https://reviews.llvm.org/D110534
2021-09-29 11:25:35 -05:00
David Green e9adcbde31 [AArch64] Model Cortex-A55 Q register NEON instructions
Cortex-A55 has 2 64bit NEON vector units, meaning a 128bit instruction
requires taking both units (and can only be issued as the first
instruction in a dual issue pair). This patch models that by splitting
the WriteV SchedWrite into two - the WriteVd that reads/writes only
64bit operands, and the WriteVq that reads/writes 128bit registers. The
A55 schedule then uses this distinction to model the WriteVq as taking
both resource units, and starting a Schedule Group and WriteVd as taking
one as before.

I believe this is more correct, even if it does not lead to much better
performance.

Differential Revision: https://reviews.llvm.org/D108766
2021-09-29 16:55:31 +01:00
Wael Yehia a60405cf03 [LTO][Legacy] Add -debug-pass-manager option to enable pass run/skip trace.
Reviewed by: steven_wu, fhahn, tejohnson

Differential Revision: https://reviews.llvm.org/D110075
2021-09-29 12:17:53 +00:00
Igor Kudrin 7b424b9333 [llvm-objcopy] Rename relocation sections together with their targets.
For now, llvm-objcopy renames only sections that are specified
explicitly in --rename-section, while GNU objcopy keeps names of
relocation sections in sync with their targets. For example:

> readelf -S test.o
...
  [ 1] .foo      PROGBITS
  [ 2] .rela.foo RELA

> objcopy --rename-section .foo=.bar test.o gnu.o
> readelf -S gnu.o
...
  [ 1] .bar      PROGBITS
  [ 2] .rela.bar RELA

> llvm-objcopy --rename-section .foo=.bar test.o llvm.o
> readelf -S llvm.o
...
  [ 1] .bar      PROGBITS
  [ 2] .rela.foo RELA

This patch makes llvm-objcopy match the behavior of GNU objcopy better.
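
The intended renaming behavior can be modeled with a small Python sketch. This is a name-based simplification for illustration only: it assumes relocation sections are recognizable by a `.rel`/`.rela` prefix, whereas a real tool would identify them by section type and target link. The function name is hypothetical.

```python
def rename_sections(names, renames):
    """Apply --rename-section style renames and keep relocation
    section names in sync with their targets.

    names:   list of section names
    renames: dict mapping old name -> new name
    """
    result = []
    for name in names:
        if name in renames:
            result.append(renames[name])
            continue
        # Check the longer prefix first so ".rela.foo" is not read as ".rel" + "a.foo".
        for prefix in (".rela", ".rel"):
            target = name[len(prefix):]
            if name.startswith(prefix) and target in renames:
                name = prefix + renames[target]
                break
        result.append(name)
    return result
```

With the sections from the example above, `rename_sections([".foo", ".rela.foo"], {".foo": ".bar"})` yields `[".bar", ".rela.bar"]`.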

Differential Revision: https://reviews.llvm.org/D110352
2021-09-29 16:36:37 +07:00
wlei a03cf331e1 [llvm-profgen] Strip context to support non-CS profile generation for hybrid sample
Differential Revision: https://reviews.llvm.org/D109769
2021-09-28 12:20:23 -07:00
Leonard Chan b9f547e8e5 [llvm][profile] Add padding after binary IDs
Some tests with binary IDs would fail with error: no profile can be merged.
This is because raw profiles could have unaligned headers when emitting binary
IDs. This means padding should be emitted after binary IDs are emitted to
ensure everything else is aligned. This patch adds padding after each binary ID
to ensure the next binary ID size is 8-byte aligned. This also adds extra
checks to ensure we aren't reading corrupted data when printing binary IDs.
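
The padding computation itself is simple; a minimal Python sketch follows. The record layout in `emit_binary_id` (an 8-byte little-endian length followed by the ID bytes and zero padding) is an assumption made for illustration, not a statement of the raw profile format.

```python
def padding_for(size: int, alignment: int = 8) -> int:
    """Number of padding bytes needed to round size up to alignment."""
    return (alignment - size % alignment) % alignment


def emit_binary_id(data: bytes) -> bytes:
    """Hypothetical record: 8-byte length, ID bytes, zero padding."""
    return len(data).to_bytes(8, "little") + data + b"\x00" * padding_for(len(data))
```

With this layout, every record has a length that is a multiple of 8, so the size field of the next record always starts on an 8-byte boundary.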

Differential Revision: https://reviews.llvm.org/D110365
2021-09-28 11:50:50 -07:00
Fangrui Song 74a47e54be [llvm-objdump] Fix -R display and support ET_EXEC
* Add a newline before `DYNAMIC RELOCATION RECORDS` (see D101796)
* Add the missing `OFFSET TYPE VALUE` line
* Align columns

Note: llvm-readobj/ELFDumper.cpp `loadDynamicTable` has sophisticated PT_DYNAMIC
code which is unavailable in llvm-objdump.

Reviewed By: jhenderson, Higuoxing

Differential Revision: https://reviews.llvm.org/D110595
2021-09-28 09:58:27 -07:00
Alex Richardson 547e5e4ae6 [update_llc_test_checks.py] Fix MIPS ASM regex for functions with EH
On MIPS, functions with exception handling code emit an additional
temporary label at the start of the function (due to UseAssignmentForEHBegin):

    _Z8do_catchv:                           # @_Z8do_catchv
    .Ltmp3:
    .set .Lfunc_begin0, .Ltmp3
    .cfi_startproc
    .cfi_personality 128, DW.ref.__gxx_personality_v0
    .cfi_lsda 0, .Lexception0
    .frame	$c11,48,$c17
    .mask 	0x00000000,0
    .fmask	0x00000000,0
    .set	noreorder
    .set	nomacro
    .set	noat
    # %bb.0:                                # %entry

The `[^:]*` regex was terminating the search after .Ltmp<N>: and therefore
not detecting functions with exception handling.
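
The failure mode can be reproduced with a deliberately simplified pattern. The two regexes below are hypothetical reductions written for this illustration; they are not the patterns used in update_llc_test_checks.py, and the assembly is a trimmed copy of the listing above.

```python
import re

ASM = """\
_Z8do_catchv:                           # @_Z8do_catchv
.Ltmp3:
.set .Lfunc_begin0, .Ltmp3
.cfi_startproc
# %bb.0:                                # %entry
"""

# `[^:]*` cannot cross the colon in `.Ltmp3:`, so the match never
# reaches the first basic block marker.
STRICT = re.compile(r"^(?P<func>\w+):[^:]*# %bb\.0:", re.MULTILINE)

# Allowing one optional temporary label line lets the search continue.
RELAXED = re.compile(
    r"^(?P<func>\w+):[^:\n]*\n(?:\.Ltmp\d+:\n)?[^:]*# %bb\.0:",
    re.MULTILINE,
)

strict_match = STRICT.search(ASM)
relaxed_match = RELAXED.search(ASM)
```

Here `strict_match` is `None`, while `relaxed_match` captures `_Z8do_catchv` as the function name.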

Reviewed By: atanasyan, MaskRay

Differential Revision: https://reviews.llvm.org/D100027
2021-09-28 17:57:36 +01:00
Alex Richardson ee3109b044 [update_llc_test_checks] Baseline test for D100027
Show that we fail to generate CHECK lines for MIPS64 functions with EH.

Differential Revision: https://reviews.llvm.org/D110408
2021-09-28 17:57:36 +01:00
Jozef Lawrynowicz 6cfb4d46ba [llvm-readobj] Support dumping of MSP430 ELF attributes
The MSP430 ABI supports build attributes for specifying
the ISA, code model, data model and enum size in ELF object files.

Differential Revision: https://reviews.llvm.org/D107969
2021-09-28 00:56:11 +03:00
modimo ce6ed64a69 [llvm-profdata] Extend support of --topn to sample profiles
Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110449
2021-09-24 16:42:46 -07:00
Wei Mi 80865f7579 Add "REQUIRES: zlib" in forward-compatible.test since it handles compressed file. 2021-09-24 15:35:07 -07:00
Wei Mi e8b376547b Fixed a bug in https://reviews.llvm.org/rG8eb617d719bdc6a4ed7773925d2421b9bbdd4b7a.
For compressed profile when reading an unknown section, the data reader pointer
adjustment was incorrect. This patch fixed that.
2021-09-24 15:23:45 -07:00
Jonas Devlieghere d0649320bf [dsymutil] Update union-fwd-decl.test for Windows
Remove path separators from CHECK-lines in union-fwd-decl.test
2021-09-24 15:07:22 -07:00
David Blaikie 9911af4b91 WIP: Verify -gsimple-template-names=mangled values
Clang will encode names that should be able to be simplified as
"_STNname|<template, args>" (eg: "_STNt1|<int>") - this verification
mode will detect these names, decode them, create the original name
("t1<int>") and the simple name ("t1") - letting the simple name run
through the usual rebuilding logic - then compare the two sources of the
full name - the rebuilt and the _STN encoding.
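
The decoding step can be sketched in a few lines of Python, based only on the encoding described above. The function name is hypothetical and the sketch ignores any edge cases the real verifier may handle.

```python
def decode_stn(name: str):
    """Split an `_STN`-encoded name into (full_name, simple_name).

    Returns None if the name does not use the `_STNname|<args>` form.
    """
    prefix = "_STN"
    if not name.startswith(prefix):
        return None
    simple, sep, args = name[len(prefix):].partition("|")
    if not sep:
        return None
    # "t1" + "<int>" reconstructs the original full name.
    return simple + args, simple
```

For the example in the message, `decode_stn("_STNt1|<int>")` gives `("t1<int>", "t1")`; the simple name would then go through the usual rebuilding logic and the result would be compared against the full name.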

This helps ensure that -gsimple-template-names is lossless.
2021-09-24 14:28:18 -07:00
Jonas Devlieghere 62d6ff5e9e [dsymutil] Track incompleteness across unions
When determining the incompleteness of a DIE based on its children, make
sure we propagate it across union types. See test case for an example.
Without this patch we never emit the definition of Container_ivars.

Differential revision: https://reviews.llvm.org/D110443
2021-09-24 14:26:37 -07:00
wlei 1422fa5fab [llvm-profgen] Unify output format of different unsymbolized profiles
Differential Revision: https://reviews.llvm.org/D110080
2021-09-24 14:18:00 -07:00
wlei 28277e9b48 [AutoFDO][llvm-profgen] Report zero count for unexecuted part of function code
In order to be consistent with the compiler, which interprets a zero count as unexecuted (cold), this change reports a zero-value count for the unexecuted part of function code. For the implementation, it leverages the range counter and initializes every executed function's range with a zero value. After all ranges are merged and converted into disjoint ranges, the remaining zero counts indicate the unexecuted (cold) part of the function.

This change also extends the current `findDisjointRanges` method, which can now support adding zero-value ranges.
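
The idea can be illustrated with a small Python sketch. It is a simplified model rather than the llvm-profgen algorithm: ranges are half-open `[start, end)`, and overlapping counts are assumed to be summed. Only the name mirrors `findDisjointRanges`.

```python
def find_disjoint_ranges(ranges):
    """Split overlapping (start, end, count) ranges into disjoint pieces.

    Ranges are half-open [start, end). Counts of overlapping ranges
    are summed (an assumption of this sketch).
    """
    ranges = list(ranges)
    points = sorted({p for s, e, _ in ranges for p in (s, e)})
    result = []
    for lo, hi in zip(points, points[1:]):
        covering = [c for s, e, c in ranges if s <= lo and hi <= e]
        if covering:
            result.append((lo, hi, sum(covering)))
    return result
```

Seeding the input with a zero-count range for the whole function, e.g. `(0, 100, 0)` next to `(10, 30, 5)` and `(20, 40, 3)`, leaves `(0, 10, 0)` and `(40, 100, 0)` in the output, which marks those parts as unexecuted.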

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D109713
2021-09-24 14:15:05 -07:00
wlei d5f2013004 [AutoFDO][llvm-profgen] Profile generation for LBR(non-CS) sample
This patch introduces non-CS AutoFDO profile generation into LLVM. The profile is intended to be consumed by the compiler using `-fprofile-sample-use=[profile]`.

After range and branch counters are extracted from the LBR sample, we go through each address for symbolization, create FunctionSamples and populate its fields such as TotalSamples, BodySamples and HeadSamples. For inlined code, since we need to map back to the original code, we always add body samples to the leaf frame's function sample.

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D109551
2021-09-24 13:55:34 -07:00
wlei a7cdcf25c1 [llvm-profgen] Ignore invalid perf line in LBR record
Similar to https://reviews.llvm.org/D109637, the perf script can contain an entire line that is an invalid message.

```
warning: Invalid address in LBR record at line 14118674: Processed 14138923 events and lost 1 chunks!
warning: Invalid address in LBR record at line 14118676: Check IO/CPU overload!
```

This only happened for LBR-only perf scripts; hybrid perf scripts have a check for " 0x" to make sure a line is an LBR perf line.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D110424
2021-09-24 13:44:57 -07:00
Simon Pilgrim dade83c02a [X86][SLM] Fix ADDQ/SUBQ/CMPEQQ throughput to account for running on either port.
Testing on a SLM box suggests these can run on either port, but the throughput is 4cy on either (inc MMX versions). Confirmed with Intel AoM / Agner / InstLatX64.
2021-09-24 10:06:14 +01:00
Wenlei He 81c249784f [llvm-profgen] Use hot threshold for context merging and trimming
Without preinliner, we need to tune down the cold count cutoff to merge/trim more context to limit profile size for large components. However it doesn't make sense for cold threshold to be higher than hot threshold, so we now change to use hot threshold as merging/trimming cut off instead.

Differential Revision: https://reviews.llvm.org/D110212
2021-09-22 15:01:51 -07:00
Hongtao Yu 734f4d832c [llvm-profgen] An option to dump disasm of specified symbols
For large apps, dumping the disasm of the whole program can be slow and result in giant output. This adds a switch to dump specific symbols only.

Reviewed By: wlei

Differential Revision: https://reviews.llvm.org/D110079
2021-09-22 10:32:59 -07:00
Hongtao Yu d9b511d8e8 [CSSPGO] Set PseudoProbeInserter as a default pass.
Currently PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e., through the linker or llc), where the user always has to pass -pseudo-probe-for-profiling explicitly. I'm making the pass a default pass that requires no command line arg to trigger, but it will only actually run if the CU comes with `llvm.pseudo_probe_desc` metadata.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110209
2021-09-22 09:09:48 -07:00
Sebastian Neubauer ecd5145c27 [Utils] Replace llc with cat for tests
Make the update_llc_test_checks script test independent of llc behavior
by using cat with static files to simulate llc output.

This allows changing llc without breaking the script test case.

The update script is executed in a temporary directory, so the
llc-generated assembly files are copied there. %T is deprecated, but it
allows copying a file with a predictable filename.

Differential Revision: https://reviews.llvm.org/D110143
2021-09-22 10:10:35 +02:00
David Blaikie 49c519a848 DebugInfo: Rebuild decltype(nullptr) as 'std::nullptr_t'
Now that Clang's been changed to render nullptr types/template
parameters as 'std::nullptr_t' do the same thing down here.

(Clang commit: 131e878664 )
2021-09-21 11:37:30 -07:00
Paul Robinson fa822a2ee5 [DebugInfo] Add test for dumping DW_AT_defaulted 2021-09-20 16:43:53 -04:00
Alex Richardson 817e23d481 [update_mir_test_checks.py] Use -NEXT FileCheck directives
Previously the script emitted output using plain CHECK directives. This
can result in a test passing even if there are some instructions between
CHECK directives that should have been removed. It also makes debugging
tests that have the output in a different order more difficult since
FileCheck can match with a later line and then complain about the "wrong"
directive not being found.

This will cause quite large diffs when updating existing tests, but I'm not sure we need an opt-in flag here.

Depends on D109765 (pre-commit tests)

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D109767
2021-09-20 12:55:56 +01:00
Alex Richardson 7b68c0725d pre-commit test for D109767
Differential Revision: https://reviews.llvm.org/D109765
2021-09-20 12:55:56 +01:00
David Blaikie cb42bb3550 llvm-dwarfdump: pretty type printing: print fully qualified names in function type parameter types 2021-09-19 18:49:15 -07:00
David Blaikie 606ea0dd2a llvm-dwarfdump: support for type printing "decltype(nullptr)" as "nullptr_t"
This should probably be rendered as "std::nullptr_t" but for now clang
uses the unqualified name (which is ambiguous with possible user defined
name in the global namespace), so match that here.
2021-09-19 17:33:56 -07:00
David Blaikie 11e0b79b05 llvm-dwarfdump: Don't print even an empty string when a type is unprintable 2021-09-19 17:03:10 -07:00
David Blaikie 5bfe5207ef llvm-dwarfdump: Pretty print names qualified/with scopes 2021-09-19 16:36:01 -07:00
David Blaikie 372e2c24b6 llvm-dwarfdump: Pretty printing types including a space between const and parenthesized references/pointers to arrays 2021-09-19 13:32:53 -07:00
David Blaikie f09ca5c646 DWARFDie: Improve type printing for function and array types - with qualifiers (cv/reference) and pointers to them 2021-09-19 12:59:31 -07:00
Simon Pilgrim f855ef2601 [X86][Atom] Fix FP uops + port usage
Both ports are required in most cases. Update the uops counts + port usage based off the most recent llvm-exegesis captures (PR36895) and what Intel AoM / Agner / InstLatX64 reports as well.

Noticed while trying to improve fp costs for vectorization via the D103695 helper script.
2021-09-19 20:39:20 +01:00
David Blaikie 2ca637c976 llvm-dwarfdump: Refactor type pretty printing tests
Move most type tests to a pre-generated assembly file to make it easier
to add more weird cases without having to hand craft more DWARF.

Move the novel array types that aren't reachable via clang-generated
DWARF to a separate file for easy maintenance.
2021-09-19 09:30:38 -07:00
Simon Pilgrim cf8fac7d07 [X86][Atom] Specific uops for all IMUL/IDIV instructions
Based off a mixture of llvm-exegesis captures (PR36895) and Intel AoM / Agner / InstLatX64 reports.
2021-09-19 16:58:52 +01:00
Simon Pilgrim e381d8b243 [X86][Atom] Fix (U)COMISS/SD uops, latency and throughput
Both ports are required, for reg and mem variants - we can also use the WriteFComX class directly and remove the unnecessary InstRW overrides. Matches what Intel AoM / Agner / InstLatX64 report as well.
2021-09-19 12:44:44 +01:00
Samuel f18c0739b3 [llvm-reduce] Add reduce operands pass
Add reduction to set operands to default values

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D108903
2021-09-17 12:32:15 -07:00
Simon Pilgrim 5ebe95e256 [X86][Atom] Fix integer shuffles uops, latency and throughput
The MMX pack/unpck shuffles don't need an override - they have the same behaviour as other shuffles (Port0 only).
The SSE pslldq/psrldq shuffles don't need an override - they have the same behaviour as other shuffles (Port0 only).
The SSE pshufb shuffles use 4uops (+1 load).

Noticed the pslldq/psrldq issue while trying to improve reduction costs via the D103695 helper script, and fixed the others while reviewing. Confirmed with Intel AoM / Agner / InstLatX64.
2021-09-17 12:11:54 +01:00
Wenlei He 446e21623c [llvm-profgen] Use context-sensitive byte size cost for preinliner decisions by default
Turn on `use-context-cost-for-preinliner` to use context-sensitive byte size cost for preinliner decisions by default.

This is a more accurate proxy of inline cost than profile size. Testing on our large workload showed that it delivers a measurable CPU improvement.

Differential Revision: https://reviews.llvm.org/D109893
2021-09-16 10:36:12 -07:00
serge-sans-paille 85f2ae57f7 Be more flexible on the storage type allowed for llvm::Any::TypeId::Id
This is a follow-up to 2c42a73d6c.
2021-09-16 11:01:53 +02:00
Arthur Eubanks 5d78e33ce5 [test] Move some llvm-extract tests into the proper directory 2021-09-15 15:42:04 -07:00
serge-sans-paille 2c42a73d6c Add extra check for llvm::Any::TypeId visibility
This check should ensure we don't reproduce the problem fixed by
02df443d28

More precisely, it checks every llvm::Any::TypeId symbol in libLLVM-x.so and
makes sure they have weak linkage and are not local to the library, which would
lead to duplicate definitions if another weak version of the symbol is defined in
another linked library.

Differential Revision: https://reviews.llvm.org/D109252
2021-09-15 08:32:55 +02:00
Esme-Yi 945df8bc4c [obj2yaml][XCOFF] Dump sections
Summary: This patch implements parsing sections for obj2yaml on AIX.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D98003
2021-09-15 05:16:33 +00:00
Hongtao Yu 0057c7185d [CSSPGO][llvm-profgen] Truncate stack samples with invalid return address.
Invalid frame addresses exist in call stack samples due to bad unwinding. This can happen with frame-pointer-based unwinding when callee functions do not have the frame pointer chain set up. It isn't common when the program is built with frame pointer omission disabled, but can still happen with third-party static libs built with the frame pointer omitted.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D109638
2021-09-14 21:56:22 -07:00
Martin Storsjö 63784b9a75 [llvm-readobj] [COFF] Resolve relocations pointing at section symbols for arm64 too
This syncs parts from the x86 implementation to the ARMWinEH
implementation.

Currently, neither of the compilers targeting COFF/arm64 (MSVC, LLVM)
produces such relocations, but LLVM might after a later patch.

Differential Revision: https://reviews.llvm.org/D109650
2021-09-14 11:04:46 +03:00
Martin Storsjö 197084fcee [llvm-readobj] [COFF] Try to resolve symbols in unwind info on x86
This is the same as we do on arm64 already for the MSVC style label
symbols, but also handle the way GCC produces it - with all relocations
pointing at the .text section symbol, with various offsets.

Differential Revision: https://reviews.llvm.org/D109649
2021-09-14 11:04:46 +03:00
Esme-Yi b98c3e957f [yaml2obj][XCOFF] add the SectionIndex field for symbol.
Summary: Add the SectionIndex field for symbol.
1: a symbol can reference a section by SectionName or SectionIndex.
2: a symbol can reference a section by both SectionName and SectionIndex.
3: if both Section and SectionIndex are specified, but the two values refer
   to different sections, an error will be reported.
4: an invalid SectionIndex is allowed.
5: if a symbol references a non-existent section by SectionName, an error will be reported.

Reviewed By: jhenderson, Higuoxing

Differential Revision: https://reviews.llvm.org/D109566
2021-09-14 06:18:03 +00:00
Esme-Yi 909f3d7380 [yaml2obj][XCOFF] customize the string table
Summary: The patch adds support for yaml2obj customizing the string table.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D107421
2021-09-13 09:24:38 +00:00
Simon Pilgrim 65ad09da0e [X86][SLM] Fix DIVPD/DIVPS/RCPPS/RSQRTPS/SQRTPD/SQRTPS/DPPD/DPPS uops, latency and throughput
The packed variants of the instructions had been modelled as the same as the scalar variants.

Reported during a run of llvm-exegesis on a cheap SLM box and matches what Agner / InstLatX64 report as well.
2021-09-13 08:36:43 +01:00
Simon Pilgrim df975e4590 [X86][SLM] Fix PSAD/MPSAD uops, latency and throughput
Noticed while trying to improve generic reduction costs via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.
2021-09-11 11:44:09 +01:00
Simon Pilgrim 484944ac3b [X86][SLM] Fix HADD/HSUB uops, latency and throughput
Noticed while trying to improve generic reduction costs via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.
2021-09-11 11:44:09 +01:00
Keith Smiley e972e49b11 [llvm-cov] Add error for invalid -path-equivalence format
Differential Revision: https://reviews.llvm.org/D109042
2021-09-10 18:34:37 -07:00
Sam Clegg e4b2f3054a [WebAssembly][libObject] Avoid re-use of Section object during parsing
The re-use of this struct across iterations of the loop was causing
fields (specifically Name) to be incorrectly shared between multiple
sections.

Differential Revision: https://reviews.llvm.org/D108984
2021-09-10 09:30:50 -04:00
Serge Bazanski 231bfaab31 [Lanai] fix MC / objdump
D78776 removed is{Call,Branch,UnconditionalBranch} guards in objdump
before calling MCInstrAnalysis::evaluateBranch. This is fine for other
architectures as they gracefully handle evaluateBranch being called on
non-branches. However, the Lanai MCInstrAnalysis implementation didn't
and that change caused it to crash.

This inserts the same guards back into Lanai's evaluateBranch
implementation and adds a smoke test that exercises `llc | objdump` so
this kind of regression is hopefully caught next time.

Reviewed By: jpienaar, MaskRay

Differential Revision: https://reviews.llvm.org/D107593
2021-09-10 10:46:13 +00:00
Alfonso Sánchez-Beato b25ab4f313 [llvm-objcopy][COFF] Fix test for debug dir presence
If the number of directories was 6 (equal to the DEBUG_DIRECTORY
index), patchDebugDirectory() was run even though the debug directory
is actually the 7th entry. Use <= in the comparison to fix that.
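
The off-by-one can be shown with a tiny Python model. The index value 6 comes from the message above; the two predicate functions are a hypothetical reconstruction of the before/after comparison, not the llvm-objcopy source.

```python
# Index of the debug entry in the data directory table (0-based),
# so the debug directory is the 7th entry.
DEBUG_DIRECTORY = 6


def should_patch_before_fix(num_directories: int) -> bool:
    """Old check: bail out only when count < index (wrong for count == 6)."""
    return not (num_directories < DEBUG_DIRECTORY)


def should_patch_after_fix(num_directories: int) -> bool:
    """Fixed check: bail out when count <= index."""
    return not (num_directories <= DEBUG_DIRECTORY)
```

With exactly 6 directories, only indices 0 through 5 exist, so the old predicate would patch an entry that is not there while the fixed one does not.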

This fixes https://llvm.org/PR51243

Differential Revision: https://reviews.llvm.org/D106940

Reviewed by: jhenderson
2021-09-10 09:57:18 +01:00
Alfonso Sánchez-Beato b33fd31772 [yaml2obj][COFF] Allow variable number of directories
Allow variable number of directories, as allowed by the
specification. NumberOfRvaAndSize will default to 16 if not specified,
as in the past.

Reviewed by: jhenderson

Differential Revision: https://reviews.llvm.org/D108825
2021-09-09 11:16:56 +01:00
Wei Mi 8eb617d719 [SampleFDO] Allow forward compatibility when adding a new section for extbinary
format.

Currently when we add a new section in the profile format and generate a profile
containing the new section, an older compiler that reads the new profile will
issue an error. The forward incompatibility can cause unnecessary churn when
extending the profile. This patch removes the incompatibility when adding a new
section for extbinary format.

Differential Revision: https://reviews.llvm.org/D109398
2021-09-07 19:38:43 -07:00
Maksim Panchenko 6300e4ac58 [llvm-objdump] Fix 'llvm-objdump -dr' for executables with relocations
Print relocations interleaved with disassembled instructions for
executables with relocatable sections, e.g. those built with "-Wl,-q".

Differential Revision: https://reviews.llvm.org/D109016
2021-09-07 11:24:24 -07:00
Roman Lebedev e030f808ec [Exegesis] Native clusterization: sub-partition by sched class id
Currently native clusterization simply groups all benchmarks
by the opcode of key instruction, but that is suboptimal in certain cases,
e.g. where we can already tell that the particular instructions
already resolve into different sched classes.
2021-09-07 17:54:37 +03:00
Roman Lebedev b3b9b297a0 [NFC][exegesis] Add test for the following patch 2021-09-07 17:54:36 +03:00
Simon Pilgrim 056b409ceb [llvm-exegesis][x86] Limit llvm-exegesis analysis tests to x86_64 triple hosts
Attempting to fix an issue with test failures on arm m1 apple macintoshes reported on D109353
2021-09-07 14:35:52 +01:00
Simon Pilgrim 6a9e2764f6 [llvm-exegesis] Analysis tests should run even without libpfm (PR51687)
Move inverse_throughput, latency and uops to sub-directories (like we already do for lbr), which require libpfm, so we can relax the lit limits for analysis tests in the x86 root directory.

Differential Revision: https://reviews.llvm.org/D109353
2021-09-07 13:58:05 +01:00
Andrew Litteken bd4b1b5f6d [IRSim] Adding support for recognizing branch similarity
The current IRSimilarityIdentifier does not try to find similarity across blocks, this patch provides a mechanism to compare two branches against one another, to find similarity across basic blocks, rather than just within them.

This adds a step in the similarity identification process that labels all of the basic blocks so that we can identify the relative branching locations. Within an IRSimilarityCandidate we use these relative locations to determine whether the branching to other relative locations in the same region is the same between branches. If it is, we consider them similar.

We do not consider the relative location of the branch if the target branch is outside of the region. In this case, both branches must exit to a location outside the region, but the exact relative location does not matter.

Reviewers: paquette, yroux

Differential Revision: https://reviews.llvm.org/D106989
2021-09-06 11:55:38 -07:00
Simon Pilgrim 2005ae15a6 [X86][SLM] WriteVecIMul instructions only take 1uop (REAPPLIED)
The xmm variants have half the throughput (and +1cy latency) of the mmx variants, but are still 1uop.

I still need to do more thorough testing of SLM on test-suite before fixing the obvious bad numbers for WritePMULLD.

But this helps the D103695 helper script get to more accurate numbers for vXi32 multiplies of extended operands (i.e. we can use PMADDWD, PMULLW/PMULHW etc). Matches what Intel AoM / Agner / llvm-exegesis reports.
2021-09-04 15:03:56 +01:00
Simon Pilgrim ac51d69208 Revert rG994da657076900f5ad7fe593c3b5e5f89ab3d53d "[X86][SLM] WriteVecIMul instructions only take 1uop"
This changed some codegen tests that I forgot about in my rebase, I'll recommit shortly with a fix.
2021-09-04 13:39:10 +01:00
Simon Pilgrim 994da65707 [X86][SLM] WriteVecIMul instructions only take 1uop
The xmm variants have half the throughput (and +1cy latency) of the mmx variants, but are still 1uop.

I still need to do more thorough testing of SLM on test-suite before fixing the obvious bad numbers for WritePMULLD.

But this helps the D103695 helper script get to more accurate numbers for vXi32 multiplies of extended operands (i.e. we can use PMADDWD, PMULLW/PMULHW etc). Matches what Intel AoM / Agner / llvm-exegesis reports.
2021-09-04 13:21:34 +01:00
Simon Pilgrim c6371020a8 [X86][SLM] RMW instructions don't require an extra uop
For RMW instructions, the load and store hold the MEC for an extra cycle, but within the same single uop. This is alluded to in the Intel AOM:

"The MEC also owns the MEC RSV, which is responsible for scheduling of all loads and stores. Load and
store instructions go through addresses generation phase in program order to avoid on-the-fly memory
ordering later in the pipeline. Therefore, an unknown address will stall younger memory instructions."

Noticed while trying to get a cheap SLM test box up and running with llvm-exegesis - RMW arithmetic is always 1uop - and matches what Agner / InstLatX64 report as well.
2021-09-04 13:21:34 +01:00
Simon Pilgrim da965a77d5 [X86][SLM] Fix MUL uops, latency and throughput
These were all set to the same best case mul i32 values (which seems to be the only version of MUL that SLM actually performs well with).

Noticed while trying to improve multiplication costs for vectorization via the D103695 helper script. Confirmed with Intel AoM / Agner / InstLatX64.
2021-09-04 13:21:34 +01:00
Simon Pilgrim 7d062d2c47 [X86][Atom] MUL/DIV instructions require both ports, not either.
Noticed while trying to improve multiplication costs for vectorization via the D103695 helper script. Confirmed with Intel AoM.
2021-09-04 11:58:09 +01:00
Richard Smith 02fe58d628 DebugInfo: additional fix missed in bc066e2. 2021-09-03 15:28:00 -07:00
David Blaikie bc066e26c9 DebugInfo: Fix a few bot failures for type dumping fixes 2021-09-03 14:08:58 -07:00
David Blaikie 40f1593558 DebugInfo: Correct/improve type formatting (pointers to function types especially)
This does add some extra superfluous whitespace (eg: "int *") intended
to make the Simplified Template Names work easier - this makes the
DIE-based names match more exactly the clang-generated names, so it's
easier to identify cases that don't generate matching names.

(arguably we could change clang to skip that whitespace or add some
fuzzy matching to accommodate differences in certain whitespace - but
this seemed easier and fairly low-impact)
2021-09-03 12:22:28 -07:00
Jinsong Ji 343a72a24d [NFC][CSSPGO] Add end of file newline to test input
On some platforms (e.g. AIX), diff will complain about the missing newline.

diff: Missing newline at the end of file
.../llvm/test/tools/llvm-profdata/Inputs/cs-sample.proftext.
2021-09-03 17:42:32 +00:00
Simon Pilgrim 6ba0b9f68a [X86][SLM] Fix PBLENDVB uops and throughput
SLM PBLENDVB is just as bad as BLENDVPD/PS - so model it as such, fixing the rr vs rm uops diff as well. The Intel AoM appears to have a copy+paste typo with PBLENDW, it doesn't match Agner or InstLatX64.

Noticed while investigating some of the weird discrepancies reported by the D103695 helper script (SLM had much better vector shift throughputs than it should).
2021-09-03 11:31:29 +01:00
gbreynoo e28cd75a50 [OptTable] Reapply Improve error message output for grouped short options
This reapplies 71d7fed3bc which was
reverted by 3e2bd82f02. This change
includes the fix for breaking the sanitizer bots.

As seen in https://bugs.llvm.org/show_bug.cgi?id=48880 the current
implementation for parsing grouped short options can return unclear
error messages. This change fixes the example given in the ticket in
which a flag is incorrectly given an argument. Also when parsing a
group we now keep reading past the first incorrect option and output
errors for all incorrect options in the group.

Differential Revision: https://reviews.llvm.org/D108770
2021-09-03 11:13:52 +01:00
Hongtao Yu 7ca8030030 [CSSPGO] Enable loading MD5 CS profile.
Adding compiler support for MD5 CS profiles, based on the previous context split work D107299. An MD5 CS profile is about 40% smaller than the string-based extbinary profile. As a result, the compilation is 15% faster.

A few conversions from real names to MD5 names have been made on the sample loader and context tracker side to get it to work.

Reviewed By: wenlei, wmi

Differential Revision: https://reviews.llvm.org/D108342
2021-09-01 09:19:47 -07:00
Kevin Athey 3e2bd82f02 Revert "[OptTable] Improve error message output for grouped short options"
This reverts commit 71d7fed3bc.

Reason: broke sanitizer bots
more info: https://reviews.llvm.org/D108770
2021-08-31 14:06:11 -07:00
wlei 964053d56f [llvm-profgen] Support LBR only perf script
This change aims at supporting LBR-only sample perf scripts, which are used for regular (non-CS) profile generation. An LBR perf script includes a batch of LBR samples, each of which starts with an instruction pointer followed by a group of 32 LBR entries. The FROM/TO LBR pair and the range between two consecutive entries (the former entry's TO and the latter entry's FROM) will be used to infer function profile info.

An example of an LBR perf script (created by `perf script -F ip,brstack -i perf.data`):
```
           40062f 0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1 ...
           4005d7 0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1 ...
           ...
```
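
As a rough illustration of the format above, the following Python sketch parses one line into its leading instruction pointer and FROM/TO pairs, then derives the linear ranges between consecutive entries. It is not the llvm-profgen implementation; the function names are hypothetical, and it assumes that entries are listed most recent first, so that the range runs from the older entry's TO to the newer entry's FROM.

```python
from typing import List, Tuple

Branch = Tuple[int, int]  # (FROM address, TO address)


def parse_lbr_line(line: str) -> Tuple[int, List[Branch]]:
    """Parse one line of `perf script -F ip,brstack` output (hypothetical helper)."""
    fields = line.split()
    ip = int(fields[0], 16)  # leading instruction pointer
    branches = []
    for entry in fields[1:]:
        if "/" not in entry:  # skip elision markers such as "..."
            continue
        src, dst = entry.split("/")[:2]
        branches.append((int(src, 16), int(dst, 16)))
    return ip, branches


def linear_ranges(branches: List[Branch]) -> List[Tuple[int, int]]:
    """Ranges executed between consecutive branches.

    Assumption: entries are ordered most recent first, so the code that ran
    between two adjacent entries spans [older TO, newer FROM].
    """
    return [(older[1], newer[0]) for newer, older in zip(branches, branches[1:])]


line = ("40062f 0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  "
        "0x400637/0x400645/P/-/-/1")
ip, branches = parse_lbr_line(line)
ranges = linear_ranges(branches)
```
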

For implementation:
 - Added a new child class `LBRPerfReader` for sample parsing. It reuses all the functionality in `extractLBRStack`, extended only to parse the leading instruction pointer.
 - `HybridSample` is reused (the call stack is just left empty) and the parsed samples are still aggregated in `AggregatedSamples`. After that, range samples, branch samples and address samples are computed and recorded.
 - Reused `ContextSampleCounterMap` to store the raw profile. Since there is no need to aggregate by context, it just registers one sample counter with a fake context key.
 - Unified on `show-raw-profile` instead of `show-unwinder-output` to dump the intermediate raw profile; see the comments for the format of the raw profile. For CS profiles, it still prints the unwinder output.

Profile generation part will come soon.

Differential Revision: https://reviews.llvm.org/D108153
2021-08-31 13:28:17 -07:00
gbreynoo 71d7fed3bc [OptTable] Improve error message output for grouped short options
As seen in https://bugs.llvm.org/show_bug.cgi?id=48880 the current
implementation for parsing grouped short options can return unclear
error messages. This change fixes the example given in the ticket in
which a flag is incorrectly given an argument. Also when parsing a
group we now keep reading past the first incorrect option and output
errors for all incorrect options in the group.

Differential Revision: https://reviews.llvm.org/D108770
2021-08-31 16:41:08 +01:00
Simon Pilgrim 7ec7272b80 [MCA][X86] Add basic coverage for icelake arch
Copy the skylake-avx512 tests for icelake-server coverage.

Add icelake/rocketlake/tigerlake test coverage to the relevant generic tests as well.
2021-08-31 12:20:09 +01:00
Hongtao Yu b9db70369b [CSSPGO] Split context string to deduplicate function name used in the context.
Currently context strings contain a lot of duplicated function names, which significantly increases the profile size. This change splits the context into a series of {name, offset, discriminator} tuples so that function names used in the context can be replaced by an index into the name table, which significantly reduces the size consumed by contexts.

A follow-up improvement made in the compiler and profiling tools is to avoid reconstructing full context strings, which is time- and memory-consuming. Instead a context vector of `StringRef` is adopted to represent the full context in all scenarios. As a result, the previous prevalent profile map which was implemented as a `StringRef` is now engineered as an unordered map keyed by `SampleContext`. `SampleContext` is reshaped to use an `ArrayRef` to represent a full context for CS profile. For non-CS profile, it falls back to using `StringRef` to represent a contextless function name. Both the `ArrayRef` and `StringRef` objects are underpinned by real array and string objects that are stored in producer buffers. For the compiler, they are maintained by the sample reader. For llvm-profgen, they are maintained in `ProfiledBinary` and `ProfileGenerator`. Full context strings are generated only for debugging and printing.

When it comes to profile format, nothing has changed in the text format, though internally the CS context is implemented as a vector. The extbinary format is only changed for CS profile, with an additional `SecCSNameTable` section which stores all full contexts logically in the form of `vector<int>`, with each element being an offset that points into `SecNameTable`. All occurrences of contexts elsewhere are redirected to use the offset into `SecCSNameTable`.
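
The deduplication idea can be sketched in a few lines of Python. This is only an illustration, not the sample reader's code: the helper name is hypothetical, and the textual context layout (`name:offset.discriminator` frames joined by ` @ `) is an assumption made for the example.

```python
from typing import Dict, List, Tuple

Frame = Tuple[int, int, int]  # (index into name table, line offset, discriminator)


def split_context(context: str, name_table: Dict[str, int]) -> List[Frame]:
    """Split a context string into frames that refer to a shared name table.

    Assumed (hypothetical) input layout: "main:3 @ foo:2.1 @ bar".
    """
    frames = []
    for part in context.split(" @ "):
        name, _, location = part.partition(":")
        offset, _, discriminator = location.partition(".")
        # Each distinct function name is stored once; frames keep only its index.
        index = name_table.setdefault(name, len(name_table))
        frames.append((index, int(offset or 0), int(discriminator or 0)))
    return frames


names: Dict[str, int] = {}
first = split_context("main:3 @ foo:2.1 @ bar", names)
second = split_context("main:3 @ foo:5 @ baz", names)
```

Both contexts share the entries for `main` and `foo`, so the name table grows by only one entry for the second context.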

Testing
This is a no-diff change in terms of code quality and profile content (for text profile).

For our internal large service (aka ads), the profile generation is cut in half, with a 20x smaller string-based extbinary format generated.

The compile time of ads dropped by 25%.

Differential Revision: https://reviews.llvm.org/D107299
2021-08-30 20:09:29 -07:00
Keith Smiley b5da3120b8 [llvm-cov][NFC] Add test for coverage-prefix-map remappings
This test acts as a regression test for these fixes:

c75a0a1e9d
dd388ba3e0

Differential Revision: https://reviews.llvm.org/D108805
2021-08-30 17:19:57 -07:00
Haowei Wu 31e61c58b0 [ifs] Add option to hide undefined symbols
This change adds an option to llvm-ifs to hide undefined symbols from
its output.

Differential Revision: https://reviews.llvm.org/D108428
2021-08-27 11:15:56 -07:00
Roman Lebedev d4d459e747 [X86] AMD Zen 3: MULX w/ mem operand has the same throughput as with reg op
Exegesis is faulty and sometimes when measuring throughput^-1
produces snippets that have loop-carried dependencies,
which must be what caused me to incorrectly measure it originally.

After looking much more carefully, the inverse throughput should match
that of the MULX w/ reg op.

As per llvm-exegesis measurements.
2021-08-27 13:27:05 +03:00
Roman Lebedev 0f04936a2d [X86] AMD Zen 3: MULX produces low part of the result in 3cy, +1cy for high part
As per llvm-exegesis measurements.
2021-08-27 13:27:05 +03:00
Roman Lebedev db2c6cd99c [NFC][X86][MCA] AMD Zen 3: improve MULX test coverage
Latency for MULX isn't right
2021-08-27 13:27:05 +03:00
Andrea Di Biagio 4a5b191703 [X86][MCA] Address the latest issues with MULX reported in PR51495.
It turns out that SchedWrite WriteIMulH was always assigned to the low half of
the result of a MULX (rather than to the high half).

To avoid confusion, this patch swaps the two MULX writes in the tablegen
definition of MULX32/64.  That way, write names better describe what they
actually refer to; this also avoids further complications if in future we decide
to reuse the same MulH writes to also model other scalar integer multiply
instructions.  I also had to swap the latency values for the two MULX writes to
make sure that the change is effectively an NFC. In fact, none of the existing
x86 tests were affected by this small refactoring.

This patch also fixes a bug in MCA: a wrong latency value was propagated for
instructions that perform multiple writes to a same register.  This last issue
was found by Roman while testing MULX on targets that define a different latency
for the Low/High part of the result.

Differential Revision: https://reviews.llvm.org/D108727
2021-08-26 12:08:20 +01:00
David Green 6ffc6951a3 [AArch64] Remove unpredictable from narrowing instructions.
Like other similar instructions the xtn2 family do not have side
effects, and explicitly marking them as such can help improve scheduling
freedom.
2021-08-26 09:43:44 +01:00
David Green 9474b03d41 [AArch64] Add a Cortex-A55 NEON scheduler test case. 2021-08-26 09:43:44 +01:00
Esme-Yi b21ed75e10 [llvm-readobj][XCOFF] Add support for `--needed-libs` option.
Summary: This patch adds support for the llvm-readobj
--needed-libs option under XCOFF.
For XCOFF, the needed libraries can be found in the Import
File ID Name Table of the Loader Section.
Currently, I am using binary inputs in the test since yaml2obj
does not yet support writing the Loader Section and the
import file table.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D106643
2021-08-26 07:17:06 +00:00
Fangrui Song 4a66a11286 [LLVMgold.so][test] Make comdat-nodeduplicate.ll work with binutils<2.27 2021-08-25 16:59:06 -07:00
Andrea Di Biagio 6181427bb9 [X86][MCA] Add more tests for MULX (PR51495).
llvm-mca still reports a wrong latency for the case where
the two destination registers of MULX are the same.
2021-08-25 21:28:21 +01:00
Alfonso Sánchez-Beato cdd407286a [llvm-objcopy] [COFF] Consider section flags when adding section
The --set-section-flags option was being ignored when adding a new
section. Take it into account if present.

Fixes https://llvm.org/PR51244

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D106942
2021-08-25 23:11:41 +03:00
Rong Xu 24201b6437 [SampleFDO] Set ProfileIsFS bit properly from the internal option
We have "-profile-isfs" internal option for text, binary, and
compactbinary format (mostly for debug and test purpose). We
need to set the related flag in FunctionSamples so that ProfileIsFS
is written to the header in extbinary format.

Differential Revision: https://reviews.llvm.org/D108707
2021-08-25 09:07:34 -07:00
Wenlei He a6f15e9a49 [CSSPGO] Use probe inline tree to track zero size fully optimized context for pre-inliner
This is a follow-up diff for BinarySizeContextTracker to track zero size for fully optimized inlinees. When an inlinee is fully optimized away, we won't be able to get its size through symbolizing instructions, hence we will treat the corresponding context size as unknown. However, by traversing the inlined probe forest, we know what the original inlinees are regardless of optimization. If a context shows up in inlined probes, but not during symbolization, we know that it's fully optimized away, hence its size is zero instead of unknown. It should provide more accurate size cost estimation for the pre-inliner to make better inline decisions in llvm-profgen.

Differential Revision: https://reviews.llvm.org/D108350
2021-08-25 09:01:11 -07:00
Andrea Di Biagio 5f848b311f [X86][SchedModel] Fix latency of the Hi register write of MULX (PR51495).
Before this patch, WriteIMulH reported a latency value which is correct for the
RR variant of MULX, but not for the RM variant.

This patch fixes the issue by introducing a new WriteIMulHLd, which is meant to
be used only by the RM variant of MULX.

Differential Revision: https://reviews.llvm.org/D108701
2021-08-25 16:12:09 +01:00
Vyacheslav Zakharin 2e192ab1f4 [CodeExtractor] Preserve topological order for the return blocks.
Differential Revision: https://reviews.llvm.org/D108673
2021-08-25 08:09:01 -07:00
Andrea Di Biagio fe13b81ed9 [X86][NFC] Pre-commit llvm-mca tests for PR51495.
WriteIMulH reports an incorrect latency for RM variants of MULX.
2021-08-25 14:17:17 +01:00
Fangrui Song 9b96b0865d llvm-xray {convert,extract}: Add --demangle
No demangling may be a better default in the future.
Add `--demangle` for migration convenience.

Reviewed By: Enna1

Differential Revision: https://reviews.llvm.org/D108100
2021-08-24 13:35:19 -07:00
Patrick Holland e4ebfb5786 [MCA] Adding an AMDGPUCustomBehaviour implementation.
This implementation allows mca to model the desired behaviour of the s_waitcnt
instruction. This patch also adds the RetireOOO flag to the AMDGPU instructions
within the scheduling model. This flag is only used by mca and allows
instructions to finish out-of-order which helps mca's simulations more closely
model the actual device.

Differential Revision: https://reviews.llvm.org/D104730
2021-08-24 13:33:58 -07:00
Arthur Eubanks d2e103644b [llvm-reduce] Remove various module data
This removes the data layout, target triple, source filename, and module
identifier when possible.

Reviewed By: swamulism

Differential Revision: https://reviews.llvm.org/D108568
2021-08-24 09:45:31 -07:00
David Green 50f4ae58eb [AArch64] Correct store ReadAdrBase operand
It appears that the Read operand for stores was being placed on the
first operand (the stored value) not the address base. This adds a
ReadST for the stored value operand, allowing the ReadAdrBase to
correctly act upon the address.

Differential Revision: https://reviews.llvm.org/D108287
2021-08-23 21:07:55 +01:00
David Green 955c9437fd [AArch64] Add Scheduling tests for Load/Store ReadAdv operands. 2021-08-23 21:07:55 +01:00
Alexey Lapshin 07d44cc0b1 [DWARF][Verifier] Do not add child DieRangeInfo with empty address range to the parent.
The verifyDieRanges function checks for intersecting address ranges.
It adds child DieRangeInfo into parent DieRangeInfo to check
whether children have overlapping address ranges. It is safe to not add
DieRangeInfo with empty address range into parent's children list.
This decreases the number of children which should be navigated and as a result
decreases execution time (parents having a lot of children with empty ranges
spend much time navigating them). For this command: "llvm-dwarfdump --verify clang-repl"
execution time decreased from 220 sec to 75 sec.
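
A simplified Python model of the idea follows. It is a sketch rather than the verifier's code: the class mirrors the `DieRangeInfo` name from the message, but the method names and the half-open range representation are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Range = Tuple[int, int]  # assumed half-open interval [low, high)


@dataclass
class DieRangeInfo:
    ranges: List[Range]
    children: List["DieRangeInfo"] = field(default_factory=list)

    def is_empty(self) -> bool:
        # A DIE whose ranges cover no addresses cannot overlap anything.
        return all(low >= high for low, high in self.ranges)

    def intersects(self, other: "DieRangeInfo") -> bool:
        return any(
            a_low < a_high and b_low < b_high and a_low < b_high and b_low < a_high
            for a_low, a_high in self.ranges
            for b_low, b_high in other.ranges
        )

    def add_child(self, child: "DieRangeInfo") -> bool:
        """Record a child; return True if it overlaps an existing sibling."""
        if child.is_empty():
            return False  # skipped: never stored, never scanned later
        overlap = any(sibling.intersects(child) for sibling in self.children)
        self.children.append(child)
        return overlap


parent = DieRangeInfo([(0, 100)])
results = [
    parent.add_child(DieRangeInfo([(0, 10)])),
    parent.add_child(DieRangeInfo([(5, 5)])),   # empty range, not stored
    parent.add_child(DieRangeInfo([(5, 20)])),  # overlaps the first child
]
```

Because each new child is compared against every stored sibling, leaving out the empty ones shrinks the list that every later insertion has to scan.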

Differential Revision: https://reviews.llvm.org/D107554
2021-08-22 19:39:21 +03:00
Christian Fetzer 9116211d18 [Coverage][llvm-cov] Correctly export branch coverage in LCOV format
Commit 9f2967bcfe introduced support for
branch coverage including export to the LCOV format.

This commit corrects the LCOV field name for branches from BFH to BRH.
The mistake seems to have slipped in as a typo because the correct field
name BRH is used in the comment section at the beginning of the file.
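
For reference, LCOV tracefiles summarize branch coverage with a `BRF` record (branches found) and a `BRH` record (branches hit). The sketch below shows how such a summary might be emitted; the function name is hypothetical, and counting two outcomes (true and false) per branch is an assumption for this example, not a description of llvm-cov's exporter.

```python
from typing import List, Tuple


def lcov_branch_summary(branches: List[Tuple[int, int]]) -> List[str]:
    """Build the LCOV branch summary records.

    `branches` holds (true_count, false_count) execution counts per branch.
    """
    found = 2 * len(branches)  # assumed: each branch has two outcomes
    hit = sum((taken > 0) + (not_taken > 0) for taken, not_taken in branches)
    return [f"BRF:{found}", f"BRH:{hit}"]


summary = lcov_branch_summary([(3, 0), (1, 2)])
```
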

Differential Revision: https://reviews.llvm.org/D108358
2021-08-20 13:44:25 -05:00
Andrea Di Biagio 35d4292a73 [X86][SchedModels] Fix missing ReadAdvance for MULX and ADCX/ADOX (PR51494)
Before this patch, instructions MULX32rm and MULX64rm were missing a ReadAdvance
for the implicit read of register EDX/RDX.  This patch fixes the issue, and it
also introduces a new SchedWrite for the two variants of MULX. The general idea
behind this last change is to eventually decrease the number of InstRW in the
scheduling models.

This patch also adds a ReadAdvance for the implicit read of EFLAGS in ADCX/ADOX.

Differential Revision: https://reviews.llvm.org/D108372
2021-08-20 17:39:51 +01:00
Maryam Benimmar 2cdfd0b259 [AIX][XCOFF] 64-bit relocation reading support
Add XCOFFDumper relocation reading support.
This patch is part of the D103696 partition.

Reviewed By: daltenty, Helflym

Differential Revision: https://reviews.llvm.org/D104646
2021-08-19 21:56:57 -04:00
Andrzej Warzynski dcc6b7b1d5 [OptTable] Refine how `printHelp` treats empty help texts
Currently, `printHelp` behaves differently for options that:
  * do not define `HelpText` (such options _are not printed_), and
  * define its `HelpText` as `HelpText<"">` (such options _are printed_).
In practice, both approaches lead to no help text and `printHelp` should
treat them consistently. This patch addresses that by making
`printHelp` check the length of the help text to be printed.
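
The behavior described above can be modeled with a small Python sketch. The function and the option tuples are hypothetical stand-ins, not the OptTable API; the point is only that a missing help text and an empty one take the same path.

```python
from typing import List, Optional, Tuple


def print_help(options: List[Tuple[str, Optional[str]]]) -> List[str]:
    """Return help lines, skipping options with no usable help text."""
    lines = []
    for name, help_text in options:
        # Checking the length treats None (undefined) and "" identically.
        if not help_text:
            continue
        lines.append(f"  {name:<12} {help_text}")
    return lines


help_lines = print_help([("--foo", "Do foo"), ("--bar", None), ("--baz", "")])
```
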

All affected tests have been updated accordingly. The option definitions
for llvm-cvtres have been updated with a short description or "Not
  implemented" for options that are ignored by the tool.

Differential Revision: https://reviews.llvm.org/D107557
2021-08-19 09:30:15 +00:00
Wenlei He eca03d2768 [CSSPGO] Track and use context-sensitive post-optimization function size to drive global pre-inliner in llvm-profgen
This change enables llvm-profgen to use accurate context-sensitive post-optimization function byte size as a cost proxy to drive global preinline decisions.

To do this, BinarySizeContextTracker is introduced to track function byte size under different inline contexts during disassembling. In the preinliner, we can now query context byte size under the switch `context-cost-for-preinliner`. The tracker uses a reverse trie to keep the size of functions under different contexts (callee as parent, caller as child), and it can give the best/longest possible matching context size for a given input context.
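
A reverse trie with longest-match lookup can be sketched as follows. This is an illustrative Python model under stated assumptions, not the llvm-profgen data structure: the class and method names are hypothetical, and contexts are represented as plain lists of function names ordered caller first.

```python
from typing import Dict, List, Optional


class SizeTrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "SizeTrieNode"] = {}
        self.size: Optional[int] = None  # None means size unknown


class ContextSizeTracker:
    """Reverse trie: the callee is the parent, its callers are children."""

    def __init__(self) -> None:
        self.root = SizeTrieNode()

    def add_size(self, context: List[str], size: int) -> None:
        # Walk from the leaf function outwards towards its callers.
        node = self.root
        for frame in reversed(context):
            node = node.children.setdefault(frame, SizeTrieNode())
        node.size = (node.size or 0) + size

    def lookup(self, context: List[str]) -> Optional[int]:
        """Return the size recorded for the longest matching context suffix."""
        node, best = self.root, None
        for frame in reversed(context):
            node = node.children.get(frame)
            if node is None:
                break
            if node.size is not None:
                best = node.size
        return best


tracker = ContextSizeTracker()
tracker.add_size(["bar"], 40)         # bar as a standalone function
tracker.add_size(["foo", "bar"], 24)  # bar when inlined into foo
```

Storing the callee at the top lets a query fall back to a shorter, less specific context when the exact caller chain was never seen.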

The new size cost is off by default. There are a few TODOs that need to be addressed: 1) avoid dangling strings from `Offset2LocStackMap`, which will be addressed in the split context work; 2) use the inlinee's entry probe to make sure we have the correct zero size for an inlinee that's completely optimized away after inlining. Some tuning is also needed.

Differential Revision: https://reviews.llvm.org/D108180
2021-08-18 22:50:57 -07:00
Andrea Di Biagio 2d53e54f0e [X86][NFC] Pre-commit tests for PR51494 2021-08-18 19:55:21 +01:00
Maryam Benimmar 7151a8aada [PowerPC][AIX] llvm-readobj: Convert some errors to warnings.
Report warnings rather than errors, so that llvm-readobj doesn't bail
out on malformed inputs.

Differential Revision: https://reviews.llvm.org/D106783
2021-08-18 11:04:08 -04:00
Xu Mingjie 168ee72718 [NFC][llvm-xray] add a llvm-xray convert option `no-demangle`
When option `--symbolize` is true, llvm-xray convert will demangle function
names by default. This patch adds an llvm-xray convert option `no-demangle` to
determine whether to demangle function name when symbolizing function ids from
the input log.

Reviewed By: MaskRay, smeenai

Differential Revision: https://reviews.llvm.org/D108019
2021-08-18 12:22:04 +08:00
Jozef Lawrynowicz 108ba4f4a4 [llvm-readobj] Refactor ELFDumper::printAttributes()
The current implementation of printAttributes makes it fiddly to extend
attribute support for new targets.

By refactoring the code so all target specific variables are
initialized in a switch/case statement, it becomes simpler to extend
attribute support for new targets.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D107968
2021-08-17 13:28:31 -07:00
Tozer 6d5e31baaa Fix 2: [MCParser] Correctly handle CRLF line ends when consuming line comments
Fixes an issue with revision 5c6f748c and ad40cb88.

Adds an mcpu argument to the test command, preventing an invalid default
CPU from being used on some platforms.
2021-08-17 17:13:21 +01:00
Fangrui Song c56b4cfd4b [llvm-objdump] -T: print symbol versions
Similar to D94907 (llvm-nm -D).

The output will match GNU objdump 2.37.
Older versions don't use ` (version)` for undefined symbols.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D108097
2021-08-17 09:10:50 -07:00
Tozer ad40cb8821 Fix: [MCParser] Correctly handle CRLF line ends when consuming line comments
Fixes an issue with revision 5c6f748c.

Move the test added in the above commit into the X86 folder, ensuring
that it is only run on targets where its triple is valid.
2021-08-17 16:16:19 +01:00
Tozer 5c6f748cbc [MCParser] Correctly handle CRLF line ends when consuming line comments
Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=47983

The AsmLexer currently has an issue with lexing line comments in files
with CRLF line endings, in which it reads the carriage return as being
part of the line comment. This causes an error for certain valid comment
layouts; this patch fixes this by excluding the carriage return from the
line comment.
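
The fix can be pictured with a minimal Python sketch of a line-comment scanner. It is not the AsmLexer code; the function name is hypothetical, and stopping at either `\r` or `\n` is a simplification chosen for the example.

```python
from typing import Tuple


def lex_line_comment(source: str, start: int) -> Tuple[str, int]:
    """Return the comment text and the index where the line ending begins."""
    end = start
    # Stop at a carriage return as well as a newline, so that "\r" from a
    # CRLF line ending is never swallowed into the comment text.
    while end < len(source) and source[end] not in "\r\n":
        end += 1
    return source[start:end], end


src = "# note\r\nnop\r\n"
text, end = lex_line_comment(src, 0)
```
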

Differential Revision: https://reviews.llvm.org/D90234
2021-08-17 15:52:51 +01:00
Fangrui Song 54e76cb17a [split-file] Default to --no-leading-lines
It turns out that the --leading-lines may be a bad default.
[[#@LINE+-num]] is rarely used.
2021-08-16 19:23:11 -07:00
Hongtao Yu f27fee623d [SamplePGO][NFC] Dump function profiles in order
Sample profiles are stored in a string map which is basically an unordered map. Printing out profiles by simply walking the string map doesn't enforce an order. I'm sorting the map in the decreasing order of total samples to enable a more stable dump, which is good for comparing two dumps.
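
A small Python sketch of such a stable dump order is shown below. The function name is hypothetical, and breaking ties by function name is an assumption added here so that the example output is fully deterministic; the message itself only mentions sorting by total samples.

```python
from typing import Dict, List, Tuple


def sorted_profiles(total_samples: Dict[str, int]) -> List[Tuple[str, int]]:
    """Order profiles by decreasing total samples, then by name (assumed tie-break)."""
    return sorted(total_samples.items(), key=lambda item: (-item[1], item[0]))


ordered = sorted_profiles({"b": 10, "a": 10, "c": 50})
```
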

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D108147
2021-08-16 17:22:30 -07:00
Fangrui Song 935a6d4024 [test] Change llvm-xray options to use the preferred double-dash forms and change -f= to -f 2021-08-15 21:19:04 -07:00
David Blaikie 44d0a99a12 Add missing triple for test 2021-08-15 12:32:12 -07:00
David Blaikie 62a4c2c10e DWARFVerifier: Check section-relative references at the end of the section
This ensures that debug_types references aren't looked for in
debug_info section.

Behavior is still going to be questionable in an unlinked object file -
since cross-cu references could refer to symbols in another .debug_info
(or, in theory, .debug_types) chunk - but if a producer only uses
ref_addr to refer to things within the same .debug_info chunk in an
object file (eg: whole program optimization/LTO - producing two CUs into
a single .debug_info section in an object file - the ref_addrs there
could be resolved relative to that .debug_info chunk, not needing to
consider comdat  (DWARFv5 type units or other creatures) chunks of
.debug_info, etc)
2021-08-15 11:40:24 -07:00
David Blaikie 2af4db7d5c Migrate DWARFVerifier tests to lit-based yaml instead of gtest with embedded yaml
Improves maintainability (edit/modify the tests without recompiling) and
error messages (previously the failure would be a gtest failure
mentioning nothing of the input or desired text) and the option to
improve tests with more checks.

(maybe these tests shouldn't all be in separate files - we could
probably have DWARF yaml that contains multiple errors while still being
fairly maintainable - the various invalid offsets (ref_addr, rnglists,
ranges, etc) could probably be all in one test, but for the simple sake
of the migration I just did the mechanical thing here)
2021-08-13 19:09:41 -07:00
Vyacheslav Zakharin 15497e62f6 [openmp][ELF] Recognize LLVM OpenMP offload specific notes
The new ELF notes are added in clang-offload-wrapper, and llvm-readobj has to visualize them properly.

Differential Revision: https://reviews.llvm.org/D99552
2021-08-12 13:47:48 -07:00
Igor Kudrin 68616584c3 [llvm-objcopy][ELF] Avoid reordering section headers
Currently, llvm-objcopy sorts section headers according to the offsets
of the sections in the input file. That can corrupt section references
in the dynamic symbol table because it is a loadable section and as such
is not updated by the tool. Even though the section references are not
required for loading the binary correctly, they are still handy for a
user who analyzes the file.

While the patch removes global reordering of section headers, it lays out
the sections in the same way as before, i.e. according to their original
offsets. All that helps the output file to resemble the input better.

Note that the patch removes sorting SHT_GROUP sections to the start of
the list, which was introduced in D62620 in order to ensure that they
come before the group members, along with the corresponding test. The
original issue was caused by the sorting of section headers, so dropping
the sorting also resolves the issue.

Differential Revision: https://reviews.llvm.org/D107653
2021-08-12 17:12:09 +07:00
wlei 856a6a5041 [CSSPGO][llvm-profgen] Trim and merge context beforehand to reduce memory usage
Currently we use a centralized string map (StringMap<FunctionSamples> ProfileMap) to store the profile while populating the samples, which can become a memory usage bottleneck. In an extreme case I saw, there were thousands of samples whose context stack depth is >= 100, and the memory consumption can be greater than 100GB.

As the context here is used for inlining, we can assume we won't have so many inlinees being inlined into the same root function, so this change caps the context stack depth and merges the samples to reduce peak memory. This is done after recursion compression.

The default value is -1 meaning no depth limit, in the future we can tune to a smaller one.
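
The trimming and merging step might look like the Python sketch below. It is only a model of the idea: the function name is hypothetical, contexts are tuples of function names ordered caller first, and keeping the leaf-most frames when trimming is an assumption made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Context = Tuple[str, ...]  # caller first, leaf function last


def trim_and_merge(samples: Iterable[Tuple[Context, int]],
                   max_depth: int = -1) -> Dict[Context, int]:
    """Cap each context at `max_depth` frames and merge identical results.

    A negative `max_depth` means no limit, matching the default of -1.
    """
    merged: Counter = Counter()
    for context, count in samples:
        if 0 <= max_depth < len(context):
            # Assumed policy: keep the frames closest to the leaf.
            context = context[len(context) - max_depth:]
        merged[tuple(context)] += count
    return dict(merged)


samples = [
    (("main", "a", "b", "leaf"), 3),
    (("start", "x", "b", "leaf"), 4),
]
capped = trim_and_merge(samples, max_depth=2)
uncapped = trim_and_merge(samples)
```
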

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D107800
2021-08-11 16:02:35 -07:00
Fangrui Song 76093b1739 [InlineAdvisor] Add single quotes around caller/callee names
Clang diagnostics refer to identifier names in quotes.
This patch makes inline remarks conform to the convention.
New behavior:

```
% clang -O2 -Rpass=inline -Rpass-missed=inline -S a.c
a.c:4:25: remark: 'foo' inlined into 'bar' with (cost=-30, threshold=337) at callsite bar:0:25; [-Rpass=inline]
int bar(int a) { return foo(a); }
                        ^
```

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D107791
2021-08-10 11:51:31 -07:00
Ben Dunbobbin 9e4d2b193a [llvm-ar] Add some test-cases for empty archives
We had coverage of empty archives in our downstream testsuite.
This adds those cases upstream.

Differential Revision: https://reviews.llvm.org/D107471
2021-08-10 10:34:50 +01:00
Esme-Yi f49c3a6882 [llvm-readobj][XCOFF] Print the length of the string table.
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D107333
2021-08-09 06:47:15 +00:00
Pirama Arumuga Nainar 16ebb7ab5c [llvm-objcopy] [COFF] Do not patch debug entries if PointerToRawData is zero
Fix an edge case missed by https://reviews.llvm.org/D78921.  For example,
the Repro debug entry (generated with the /Brepro linker flag) does not
have a debug-directory payload.  Do not attempt to patch Debug entries
without a payload.

Differential Revision: https://reviews.llvm.org/D107324
2021-08-06 09:23:25 -07:00
Martin Storsjö 46020f6f0c [llvm-rc] Allow specifying language with a leading 0x prefix
This option is always interpreted strictly as a hexadecimal string,
even if it has no prefix that indicates the number format, hence
the existing call to StringRef::getAsInteger(16, ...).

StringRef::getAsInteger(0, ...) consumes a leading "0x" prefix if
present, but when the radix is specified, the radix prefix shouldn't
be included.
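
The intended parsing rule (always hexadecimal, with an optional "0x" prefix) can be illustrated in Python. The function name is hypothetical and this is not the llvm-rc code; Python's `int(value, 16)` would accept the prefix on its own, so the explicit strip is there only to make the step visible.

```python
def parse_language(value: str) -> int:
    """Parse a language ID that is always hexadecimal, prefix optional."""
    # Drop a leading "0x"/"0X" so the remainder can be read with a fixed radix.
    digits = value[2:] if value[:2].lower() == "0x" else value
    return int(digits, 16)


plain = parse_language("409")
prefixed = parse_language("0x409")
```
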

Both MS rc.exe and GNU windres accept the language with that
prefix.

Also allow specifying the codepage to llvm-windres with a different
radix, as GNU windres allows that (but MS rc.exe doesn't).

This fixes https://llvm.org/PR51295.

Differential Revision: https://reviews.llvm.org/D107263
2021-08-05 10:19:55 +03:00
Igor Kudrin 2c14798ead [ARM][llvm-objdump] Annotate PC-relative memory operands of VLDR instructions
This extends D105979 and adds support for VLDR instructions.

Differential Revision: https://reviews.llvm.org/D105980
2021-08-05 14:11:11 +07:00
Igor Kudrin ddbe812bcc [ARM][llvm-objdump] Annotate PC-relative memory operands
This implements `MCInstrAnalysis::evaluateMemoryOperandAddress()` for
Arm so that the disassembler can print the target address of memory
operands that use PC+immediate addressing.

Differential Revision: https://reviews.llvm.org/D105979
2021-08-05 14:11:11 +07:00
Andrea Di Biagio 7a1a35a1d1 [X86][SchedModel] Add missing ReadAdvance for some arithmetic ops (PR51318 and PR51322).
This fixes a bug where implicit uses of EFLAGS were not marked as ReadAdvance in
the RM/MR variants of ADC/SBB (PR51318)

This also fixes the absence of ReadAdvance for the register operand of
RMW arithmetic instructions (PR51322).

Differential Revision: https://reviews.llvm.org/D107367
2021-08-04 17:50:22 +01:00
Esme-Yi 737e27f623 [llvm-readobj][XCOFF] dump the string table only if the size is bigger than 4. 2021-08-04 06:28:26 +00:00
Vitaly Buka 3df1e7e6f0 [llvm-readobj][XCOFF] Warn about invalid offset
Followup for D105522

Differential Revision: https://reviews.llvm.org/D107398
2021-08-03 20:11:26 -07:00
wlei f1affe8dc8 [llvm-profgen][CSSPGO] Support count based aggregated type of hybrid perf script
This change integrates a new count-based aggregated type of perf script. The only difference in the format is that an aggregated count is added at the head of the original sample, meaning that the same sample is repeated that many times. This is used to reduce the perf script size.
e.g.
```
2
	          4005dc
	          400634
	          400684
	    7f68c5788793
 0x4005c8/0x4005dc/P/-/-/0  ....
```
Implemented by a dedicated PerfReader `AggregatedHybridPerfReader`.
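
A rough sketch of how such a script could be folded back into per-sample
totals. This is not the `AggregatedHybridPerfReader` implementation; it
assumes, based only on the example above, that the count sits on an
unindented line and that the lines of the sample itself are indented.

```python
from collections import Counter


def aggregate_samples(script):
    """Sum the leading counts of identical samples in an aggregated script."""
    samples = []  # list of (count, [sample lines])
    for line in script.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Assumption: an unindented line holds the aggregated count.
            samples.append((int(line), []))
        elif samples:
            samples[-1][1].append(line.strip())

    totals = Counter()
    for count, body in samples:
        totals[tuple(body)] += count
    return totals
```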

Differential Revision: https://reviews.llvm.org/D107192
2021-08-03 17:56:35 -07:00
wlei fe3ba90830 [llvm-profgen] Support perf script without parsing MMap events
This change supports running without parsing MMap binary loading events; instead, it always assumes the binary is loaded at its preferred address. This is used when we are sure the binary load address does not change, or when address resolution has already been pre-processed. Warn if there is an interior mmap event without a leading mmap event.

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D107097
2021-08-03 10:01:07 -07:00
Andrea Di Biagio f0658c7a42 [MCA][NFC] Add tests for PR51318 and PR51322.
Also, regenerate existing X86 tests using update_mca_test.py.
2021-08-03 17:06:34 +01:00
Jason Molenda 0d8cd4e2d5 [AArch64InstPrinter] Change printAddSubImm to comment imm value when shifted
Add a comment when there is a shifted value,
    add x9, x0, #291, lsl #12 ; =1191936
but not when the immediate value is unshifted,
    subs x9, x0, #256 ; =256
when the comment adds nothing additional to the reader.
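
The rule can be sketched as follows. This is a Python illustration, not the
actual `printAddSubImm` code; the function name and the `comment_char`
parameter are assumptions made for the example.

```python
def format_add_sub_imm(imm, shift, comment_char=";"):
    """Format an ADD/SUB immediate operand, annotating only shifted values."""
    if shift == 0:
        # The printed immediate already is the effective value.
        return f"#{imm}"
    # For a shifted immediate, show the effective value as a comment.
    return f"#{imm}, lsl #{shift} {comment_char} ={imm << shift}"


if __name__ == "__main__":
    print("add x9, x0, " + format_add_sub_imm(291, 12))
    print("subs x9, x0, " + format_add_sub_imm(256, 0))
```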

Differential Revision: https://reviews.llvm.org/D107196
2021-08-03 02:28:46 -07:00
Esme-Yi 69396896fb [llvm-readobj][XCOFF] Fix the error dumping for the first
item of StringTable.

Summary: For the string table in XCOFF, the first 4 bytes
contain the length of the string table, so we should
print the string entries starting from the fifth byte. This patch
also adds tests for llvm-readobj dumping the string
table.
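
A small Python sketch of the layout described above, not the llvm-readobj
code. It assumes the big-endian length field counts its own 4 bytes and that
entries are NUL-terminated; the function name is hypothetical.

```python
import struct


def read_xcoff_string_table(data):
    """Return {offset: string} for an XCOFF-style string table."""
    if len(data) < 4:
        return {}
    # The first 4 bytes hold the table length (big-endian).
    (size,) = struct.unpack_from(">I", data, 0)
    size = min(size, len(data))

    entries = {}
    offset = 4  # string entries start at the fifth byte
    while offset < size:
        end = data.find(b"\0", offset, size)
        if end == -1:
            end = size  # tolerate a missing final terminator
        entries[offset] = data[offset:end].decode("ascii", errors="replace")
        offset = end + 1
    return entries
```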

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D105522
2021-08-03 09:08:58 +00:00
Eli Friedman 2a2847823f [ConstantFold] Get rid of special cases for sizeof etc.
Target-dependent constant folding will fold these down to simple
constants (or at least, expressions that don't involve a GEP).  We don't
need heroics to try to optimize the form of the expression before that
happens.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51232 .

Differential Revision: https://reviews.llvm.org/D107116
2021-07-31 13:20:47 -07:00
Petr Hosek 83302c8489 [profile] Fix profile merging with binary IDs
This fixes support for merging profiles which broke as a consequence
of e50a38840d. The issue was a missing
adjustment in the merge logic to account for the binary IDs, which are
now included in the raw profile just after the header.

In addition, this change also:
* Includes the version in module signature that's used for merging
to avoid accidental attempts to merge incompatible profiles.
* Moves the binary IDs size field after version field in the header
as was suggested in the review.

Differential Revision: https://reviews.llvm.org/D107143
2021-07-30 18:54:27 -07:00
Petr Hosek d3dd07e3d0 Revert "[profile] Fix profile merging with binary IDs"
This reverts commit dcadd64986.
2021-07-30 18:53:48 -07:00
Petr Hosek dcadd64986 [profile] Fix profile merging with binary IDs
This fixes support for merging profiles which broke as a consequence
of e50a38840d. The issue was a missing
adjustment in the merge logic to account for the binary IDs, which are
now included in the raw profile just after the header.

In addition, this change also:
* Includes the version in module signature that's used for merging
to avoid accidental attempts to merge incompatible profiles.
* Moves the binary IDs size field after version field in the header
as was suggested in the review.

Differential Revision: https://reviews.llvm.org/D107143
2021-07-30 17:38:53 -07:00
Fangrui Song a1532ed275 [InstrProfiling] Make CountersPtr in __profd_ relative
Change `CountersPtr` in `__profd_` to a label difference, which is a link-time
constant. On ELF, when linking a shared object, this requires that `__profc_` is
either private or linkonce/linkonce_odr hidden. On COFF, we need D104564 so that
`.quad a-b` (64-bit label difference) can lower to a 32-bit PC-relative relocation.

```
# ELF: R_X86_64_PC64 (PC-relative)
.quad .L__profc_foo-.L__profd_foo

# Mach-O: a pair of 8-byte X86_64_RELOC_UNSIGNED and X86_64_RELOC_SUBTRACTOR
.quad l___profc_foo-l___profd_foo

# COFF: we actually use IMAGE_REL_AMD64_REL32/IMAGE_REL_ARM64_REL32 so
# the high 32-bit value is zero even if .L__profc_foo < .L__profd_foo
# As compensation, we truncate CountersDelta in the header so that
# __llvm_profile_merge_from_buffer and llvm-profdata reader keep working.
.quad .L__profc_foo-.L__profd_foo
```

(Note: link.exe sorts `.lprfc` before `.lprfd` even if the object writer
has `.lprfd` before `.lprfc`, so we cannot work around by reordering
`.lprfc` and `.lprfd`.)

With this change, a stage 2 (`-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_BUILD_INSTRUMENTED=IR`)
`ld -pie` linked clang is 1.74% smaller due to fewer R_X86_64_RELATIVE relocations.
```
% readelf -r pie | awk '$3~/R.*/{s[$3]++} END {for (k in s) print k, s[k]}'
R_X86_64_JUMP_SLO 331
R_X86_64_TPOFF64 2
R_X86_64_RELATIVE 476059  # was: 607712
R_X86_64_64 2616
R_X86_64_GLOB_DAT 31
```
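
As a back-of-the-envelope check on the numbers above: the only figures taken
from this message are the two R_X86_64_RELATIVE counts. The 24-byte entry size
is the standard size of an Elf64_Rela record; the rest is plain arithmetic and
says nothing about other size changes in the binary.

```python
ELF64_RELA_SIZE = 24  # r_offset + r_info + r_addend, 8 bytes each

before = 607712  # R_X86_64_RELATIVE count before the change
after = 476059   # R_X86_64_RELATIVE count after the change

removed = before - after
saved_bytes = removed * ELF64_RELA_SIZE

print(f"{removed} fewer relocations, {saved_bytes} bytes of relocation entries")
```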

The absolute function address (used by llvm-profdata to collect indirect call
targets) can be converted to relative as well, but is not done in this patch.

Differential Revision: https://reviews.llvm.org/D104556
2021-07-30 11:52:18 -07:00
Esme-Yi 8011fc1953 [yaml2obj] Enable support for parsing 64-bit XCOFF.
Summary: Add support for yaml2obj to parse 64-bit XCOFF.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D100375
2021-07-30 02:06:04 +00:00
Andrew Savonichev bcc83a2e83 [MCA] Use LSU for the in-order pipeline
The load/store unit (LSU) is used to enforce the order of loads and stores if
they alias (controlled by the --noalias=false option).

Fixes PR50483 - [MCA] In-order pipeline doesn't track memory
load/store dependencies.

Differential Revision: https://reviews.llvm.org/D103955
2021-07-29 14:40:23 +03:00
Sebastian Neubauer 4864893127 [Utils] Do not remove comments in llc test script
When checking if two prefixes can be merged for a function,
update_llc_test_checks.py removed IR comments before comparing
llc outputs of different RUN lines.
This means, if one RUN line emitted lines starting with ';' and another
RUN line emitted the same lines except the ones starting with ';', both
RUNs would be merged (if they share a prefix).

However, CHECK-NEXT lines check the comments, otherwise they fail, so
the script should not merge RUNs if they contain different comments.
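
The difference can be illustrated with a toy comparison. This is not code from
update_llc_test_checks.py; both helpers are hypothetical and only show why
scrubbing ';' lines before comparing hides a real difference.

```python
def strip_comment_lines(asm):
    """Drop lines whose first non-blank character is ';'."""
    return "\n".join(
        line for line in asm.splitlines() if not line.lstrip().startswith(";")
    )


def can_merge(out_a, out_b, ignore_comments):
    """Decide whether two RUN-line outputs may share one set of CHECK lines."""
    if ignore_comments:
        out_a, out_b = strip_comment_lines(out_a), strip_comment_lines(out_b)
    return out_a == out_b


run1 = "; %bb.0:\n  mov r0, r1\n  ret"
run2 = "  mov r0, r1\n  ret"
print(can_merge(run1, run2, ignore_comments=True))   # old behaviour
print(can_merge(run1, run2, ignore_comments=False))  # new behaviour
```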

Differential Revision: https://reviews.llvm.org/D101312
2021-07-29 13:03:05 +02:00
Nathan Chancellor 5060224d9e [test] Fix tools/gold/X86/comdat-nodeduplicate.ll on non-X86 hosts
When running this test on an aarch64 machine, it fails:

```
/usr/bin/ld.gold: error: .../test/tools/gold/X86/Output/comdat-nodeduplicate.ll.tmp/ab.lto.o: incompatible target
```

Specify the elf_x86_64 emulation as all of the other gold plugin tests
do.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D107020
2021-07-28 21:56:23 -07:00
Daniel Rodríguez Troitiño d6704e5ed9 [llvm-objcopy][MachO] Ignore all LC_SUB_* commands.
The LC_SUB_FRAMEWORK, LC_SUB_UMBRELLA, LC_SUB_CLIENT, and LC_SUB_LIBRARY
are used to indicate related libraries, binaries or framework names.
Their only payload is the string with the name of the object. Adding
those commands to the list of ignored/skipped load commands will avoid
an error that stops the process of copying/stripping and will copy their
contents verbatim.

Additionally, in order to have a test for this case, `yaml2obj` now
allows those four commands to contain a `Content`.

Differential Revision: https://reviews.llvm.org/D106412
2021-07-28 17:35:26 -07:00
Eli Friedman 4adcff0b70 [ARM] Fix llvm-objdump disassembly of armv7m object files.
Apparently, the features were getting mixed up, so we'd try to
disassemble in ARM mode. Fix sub-architecture detection to compute the
correct triple if we're detecting it automatically, so the user doesn't
need to pass --triple=thumb etc.

It's possible we should be somehow tying the "+thumb-mode" target
feature more directly to Tag_CPU_arch_profile? But this seems to work
reasonably well, anyway.

While I'm here, fix up the other llvm-objdump tests that were explicitly
specifying an ARM triple; that shouldn't be necessary.

Differential Revision: https://reviews.llvm.org/D106912
2021-07-28 11:41:54 -07:00
Wael Yehia 9559bd1990 [LTO][Legacy] Add new API to check presence of ctor/dtor functions.
On AIX, the linker needs to check whether a given lto_module_t contains
any constructor/destructor functions, in order to implement the behavior
of the -bcdtors:all flag. See
https://www.ibm.com/docs/en/aix/7.2?topic=l-ld-command for the flag's
documentation.
In llvm IR, constructor (destructor) functions are added to a special
global array @llvm.global_ctors (@llvm.global_dtors).
However, because these two symbols are artificial, they are not visited
during the symbol traversal (using the
lto_module_get_[num_symbols|symbol_name|symbol_attribute] API).

This patch adds a new function to the libLTO interface that checks the
presence of one or both of these two symbols.

Reviewed By: steven_wu

Differential Revision: https://reviews.llvm.org/D106887
2021-07-28 12:41:56 +00:00
Esme-Yi 14f6cfcf3c [Debug-Info][llvm-dwarfdump] Don't try to dump location
list for attributes that don't have the loclist class.

Summary: The overflow error occurs when we try to dump
location list for those attributes that do not have the
loclist class, like DW_AT_count and DW_AT_byte_size.
After re-reviewing the entire list, I sorted those
attributes into two parts, one for dumping location list
and one for dumping the location expression.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D105613
2021-07-27 07:28:59 +00:00
Fangrui Song 792c206e2b [llvm-objcopy] Drop GRP_COMDAT if the group signature is localized
See [GRP_COMDAT group with STB_LOCAL signature](https://groups.google.com/g/generic-abi/c/2X6mR-s2zoc)
objcopy PR: https://sourceware.org/bugzilla/show_bug.cgi?id=27931

GRP_COMDAT deduplication is purely based on the signature symbol name in
ld.lld/GNU ld/gold. The local/global status is not part of the equation.

If the signature symbol is localized by --localize-hidden or
--keep-global-symbol, the intention is likely to make the group fully
localized. Drop GRP_COMDAT to suppress deduplication.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D106782
2021-07-26 09:05:18 -07:00
Fangrui Song c0da287c30 [yaml2obj][MachO] Rename PayloadString to Content
The new name is more concise and matches yaml2obj ELF & DWARF.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D106759
2021-07-26 09:04:51 -07:00
gbreynoo 87ed73fe6e [llvm-readobj] Display multiple function names for stack size entries
The current implementation of displaying .stack_size information
presumes that each entry represents a single function but this is not
always the case. For example with the use of ICF multiple functions can
be represented with the same code, meaning that the address found in a
.stack_size entry corresponds to multiple function symbols.
This change allows multiple function names to be displayed when
appropriate.
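
A minimal sketch of the lookup this implies, written in Python rather than
taken from llvm-readobj. The function names and the "?" placeholder for
unknown addresses are assumptions made for the example.

```python
from collections import defaultdict


def names_by_address(symbols):
    """Group function symbol names by address; ICF can fold several into one."""
    table = defaultdict(list)
    for name, address in symbols:
        table[address].append(name)
    return table


def describe_stack_sizes(entries, symbols):
    """Pair each (address, size) entry with every function name at that address."""
    table = names_by_address(symbols)
    return [(size, sorted(table.get(address, ["?"]))) for address, size in entries]


symbols = [("foo", 0x1000), ("bar", 0x1000), ("baz", 0x2000)]
entries = [(0x1000, 16), (0x2000, 32)]
print(describe_stack_sizes(entries, symbols))
```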

Differential Revision: https://reviews.llvm.org/D105884
2021-07-26 14:49:53 +01:00
Martin Storsjö 0a1683f8cc [llvm-rc] Allow dashes as part of resource name strings
This matches what MS rc.exe allows in practice. I'm not aware of
any legal syntax cases that are broken by allowing dashes as part
of what the tokenizer considers an Identifier - but I'm not
very well versed in the RC syntax either, can @amccarth think of
any case that would be broken by this?

This fixes downstream bug
https://github.com/msys2/MINGW-packages/issues/9180.

Additionally, rc.exe allows such resource name strings to be surrounded
by quotes, ending up with e.g.

    Resource name (string): "QUOTEDNAME"

(i.e., the quotes end up as part of the string), which llvm-rc doesn't
support yet either. (I'm not aware of such cases in the wild though,
but resource string names with dashes do exist.)

This also allows including files with unquoted paths, with filenames
containing dashes (which fixes
https://github.com/msys2/MINGW-packages/issues/9130, which has been
worked around differently so far).

Differential Revision: https://reviews.llvm.org/D106598
2021-07-23 23:05:20 +03:00
Fangrui Song 31677c6481 [llvm-symbolizer] Remove one-dash long options
Most modern tools only accept two-dash long options. Remove one-dash
long options which are not recognized by GNU style `getopt_long`.
This ensures long options cannot collide with grouped short options.

Note: llvm-symbolizer has `-demangle={true,false}` for pprof compatibility
(for a while). They are kept.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D106377
2021-07-23 08:35:45 -07:00
Gulfem Savrun Yeniceri e50a38840d [profile] Add binary id into profiles
This patch adds binary id into profiles to easily associate binaries
with the corresponding profiles. There is an RFC that discusses
the motivation, design and implementation in more detail:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html

Differential Revision: https://reviews.llvm.org/D102039
2021-07-23 00:19:12 +00:00
Eric Astor a4e964a282 [ms] [llvm-ml] Fix macro case-insensitivity
We previously had issues identifying macros not registered with a lowercase name.

Reviewed By: mstorsjo, thakis

Differential Revision: https://reviews.llvm.org/D106453
2021-07-22 15:50:52 -04:00
Simon Pilgrim d073b19dbf [X86] Fix SLM FP<->INT throughputs.
Noticed while trying to clean up the shift costs model for SSE4 targets using the script in D10369. SLM double-pumps all the 128-bit vector conversion ops and only uses the FP0 pipe. Numbers taken from Intel AOM + Agner.
2021-07-22 19:39:04 +01:00
Timm Bäder 924d62ca4a [llvm][tools] Hide remaining unrelated llvm- tool options
Differential Revision: https://reviews.llvm.org/D106430
2021-07-22 09:47:55 +02:00
Bill Wendling 635288d215 [llvm-diff] Check for recursive initializers
We need to check for recursive initializers in the "ConstantStruct"
case.

Differential Revision: https://reviews.llvm.org/D105616
2021-07-21 14:21:21 -07:00
Gulfem Savrun Yeniceri fd895bc81b Revert "[profile] Add binary id into profiles"
Revert "[profile] Change linkage type of a compiler-rt func"
This reverts commits f984ac2715 and
467c719124 because they broke some builds.
2021-07-21 19:15:18 +00:00
Gulfem Savrun Yeniceri f984ac2715 [profile] Add binary id into profiles
This patch adds binary id into profiles to easily associate binaries
with the corresponding profiles. There is an RFC that discusses
the motivation, design and implementation in more detail:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html

Differential Revision: https://reviews.llvm.org/D102039
2021-07-21 17:55:43 +00:00
Eric Astor 69551486fd [ms] [llvm-ml] Restrict implicit RIP-relative addressing to named-variable references
ML64.EXE applies implicit RIP-relative addressing only to memory references that include a named-variable reference.

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D105372
2021-07-21 11:49:58 -04:00
Eric Astor 5fba605896 [ms] [llvm-ml] Support built-in text macros
Add support for all built-in text macros supported by ML64:
@Date, @Time, @FileName, @FileCur, and @CurSeg.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D104965
2021-07-21 11:44:09 -04:00
Eric Astor 4cbb912d75 [ms] [llvm-ml] Add support for numeric built-in symbols
Support @Version and @Line as built-in symbols. For now, resolves @Version to 1427 (the same as for the VS 2019 release of ML.EXE).

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D104964
2021-07-21 11:43:07 -04:00