Commit Graph

2368 Commits

Author SHA1 Message Date
Andrea Di Biagio 6eebbe0a97 [tblgen][llvm-mca] Add the ability to describe move elimination candidates via tablegen.
This patch adds the ability to identify instructions that are "move elimination
candidates". It also allows scheduling models to describe processor register
files that allow move elimination.

A move elimination candidate is an instruction that can be eliminated at
register renaming stage.
Each subtarget can specify which instructions are move elimination candidates
with the help of tablegen class "IsOptimizableRegisterMove" (see
llvm/Target/TargetInstrPredicate.td).

For example, on X86, BtVer2 allows both GPR and MMX/SSE moves to be eliminated.
The definition of 'IsOptimizableRegisterMove' for BtVer2 looks like this:

```
def : IsOptimizableRegisterMove<[
  InstructionEquivalenceClass<[
    // GPR variants.
    MOV32rr, MOV64rr,

    // MMX variants.
    MMX_MOVQ64rr,

    // SSE variants.
    MOVAPSrr, MOVUPSrr,
    MOVAPDrr, MOVUPDrr,
    MOVDQArr, MOVDQUrr,

    // AVX variants.
    VMOVAPSrr, VMOVUPSrr,
    VMOVAPDrr, VMOVUPDrr,
    VMOVDQArr, VMOVDQUrr
  ], CheckNot<CheckSameRegOperand<0, 1>> >
]>;
```

Definitions of IsOptimizableRegisterMove from processor models of a same
Target are processed by the SubtargetEmitter to auto-generate a target-specific
override for each of the following predicate methods:

```
bool TargetSubtargetInfo::isOptimizableRegisterMove(const MachineInstr *MI)
const;
bool MCInstrAnalysis::isOptimizableRegisterMove(const MCInst &MI, unsigned
CPUID) const;
```

By default, those methods return false (i.e. conservatively assume that there
are no move elimination candidates).

Tablegen class RegisterFile has been extended with the following information:
 - The set of register classes that allow move elimination.
 - Maxium number of moves that can be eliminated every cycle.
 - Whether move elimination is restricted to moves from registers that are
   known to be zero.

This patch is structured in three part:

A first part (which is mostly boilerplate) adds the new
'isOptimizableRegisterMove' target hooks, and extends existing register file
descriptors in MC by introducing new fields to describe properties related to
move elimination.

A second part, uses the new tablegen constructs to describe move elimination in
the BtVer2 scheduling model.

A third part, teaches llm-mca how to query the new 'isOptimizableRegisterMove'
hook to mark instructions that are candidates for move elimination. It also
teaches class RegisterFile how to describe constraints on move elimination at
PRF granularity.

llvm-mca tests for btver2 show differences before/after this patch.

Differential Revision: https://reviews.llvm.org/D53134

llvm-svn: 344334
2018-10-12 11:23:04 +00:00
Jordan Rupprecht bb4588e9c1 [llvm-objcopy] Add -F|--target compatibility
Summary:
This change adds support for the GNU --target flag, which sets both --input-target and --output-target.

GNU objcopy doesn't do any checking for whether both --target and --{input,output}-target are used, and so it allows both, e.g. "--target A --output-target B" is equivalent to "--input-target A --output-target B" since the later command line flag would override earlier ones. This may be error prone, so I chose to implement it as an error if both are used. I'm not sure if anyone is actually using both.

Reviewers: jakehehrlich, jhenderson, alexshap

Reviewed By: jakehehrlich, alexshap

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53029

llvm-svn: 344321
2018-10-12 00:36:01 +00:00
Aaron Smith c5cd5911ec [llvm-pdbutil] Add missing pdb for test
llvm-svn: 344306
2018-10-11 22:25:55 +00:00
Aaron Smith c66838aee9 [llvm-pdbutil] Pretty print PDBSymbolUsingNamespace symbols
Reviewers: rnk, zturner, llvm-commits

Differential Revision: https://reviews.llvm.org/D52799

llvm-svn: 344298
2018-10-11 21:37:18 +00:00
Andrea Di Biagio 6a0b319549 [llvm-mca][BtVer2] Add tests for optimizable GPR register moves. NFC
llvm-svn: 344253
2018-10-11 14:54:54 +00:00
Martin Storsjo 5239041338 [llvm-nm] Include the text "@FILE" in the output of --help
libtool requires this text to be present, in order to conclude that
the tool supports response files. Also add an explicit test of using
response files with llvm-nm.

Differential Revision: https://reviews.llvm.org/D53064

llvm-svn: 344222
2018-10-11 06:53:38 +00:00
Craig Topper df17244ffa Change the timestamp of llvmcache-foo file to meet the thinLTO prune policy
The case will randomly fail if we test it with command "
while llvm-lit test/tools/gold/X86/cache.ll ; do true; done". It is because the llvmcache-foo file is younger than llvmcache-349F039B8EB076D412007D82778442BED3148C4E and llvmcache-A8107945C65C2B2BBEE8E61AA604C311D60D58D6. But due to timestamp precision reason their timestamp is the same. Given the same timestamp, the file prune policy is to remove bigger size file first, so mostly foo file is removed for its bigger size. And the files size is under threshold after deleting foo file. That's what test case expect.

However sometimes, the precision is enough to measure that timestamp of llvmcache-349F039B8EB076D412007D82778442BED3148C4E and llvmcache-A8107945C65C2B2BBEE8E61AA604C311D60D58D6 are smaller than foo, so llvmcache-349F039B8EB076D412007D82778442BED3148C4E and llvmcache-A8107945C65C2B2BBEE8E61AA604C311D60D58D6 are deleted first. Since the files size is still above the file size threshold after deleting the 2 files, the foo file is also deleted. And then the test case fails, because it expect only one file should be deleted instead of 3.

The fix is to change the timestamp of llvmcache-foo file to meet the thinLTO prune policy.

Patch by Luo Yuanke.

Differential Revision: https://reviews.llvm.org/D52452

llvm-svn: 344158
2018-10-10 17:37:32 +00:00
Andrea Di Biagio 1b29ec6531 [llvm-mca][BtVer2] Add two more move-elimination tests. NFC
These should test all the optimizable moves on Jaguar.
A follow-up patch will teach how to recognize these optimizable register moves.

llvm-svn: 344144
2018-10-10 14:46:54 +00:00
John Brawn c616a7236c [llvm-exegesis] Fix function return generation so it doesn't return register 0
When fillMachineFunction generates a return on targets without a return opcode
(such as AArch64) it should pass an empty set of registers as the return
registers, not 0 which means register number zero.

Differential Revision: https://reviews.llvm.org/D53074

llvm-svn: 344139
2018-10-10 13:03:23 +00:00
Fangrui Song 88478bbc60 [opt] Change the parameter of OptTable::PrintHelp from Name to Usage and don't append "[options] <inputs>"
Summary:
Before, "[options] <inputs>" is unconditionally appended to the `Name` parameter. It is more flexible to change its semantic to `Usage` and let user customize the usage line.

% llvm-objcopy
...
USAGE: llvm-objcopy <input> [ <output> ] [options] <inputs>

With this patch:

% llvm-objcopy
...
USAGE: llvm-objcopy input [output]

Reviewers: rupprecht, alexshap, jhenderson

Reviewed By: rupprecht

Subscribers: jakehehrlich, mehdi_amini, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51009

llvm-svn: 344097
2018-10-10 00:15:31 +00:00
Jake Ehrlich 5e49846ca6 [llvm-objcopy] Make -S an alias for --strip-all
-S should be an alias for --strip-all not --strip-all-gnu

llvm-svn: 344080
2018-10-09 21:14:09 +00:00
Adrian Prantl 8aa69e9927 llvm-dwarfdump: Extend --name to also search DW_AT_linkage_name.
rdar://problem/45132695

llvm-svn: 344079
2018-10-09 20:51:33 +00:00
Wolfgang Pieb a9ea9c5034 [DWARF] Make llvm-dwarfdump display the .debug_loc.dwo section. Fixes PR38991.
Reviewer: dblaikie

Differential Revision: https://reviews.llvm.org/D52444

llvm-svn: 344068
2018-10-09 18:38:55 +00:00
Jordan Rupprecht 34c0e470ae [llvm-ar] Use POSIX-specified timestamps for 'tv'.
Summary:
The POSIX spec says:

```
If the −t option is used with the −v option, the standard output format shall be:
"%s %u/%u %u %s %d %d:%d %d %s\n", <member mode>, <user ID>,
<group ID>, <number of bytes in member>,
<abbreviated month>, <day-of-month>, <hour>,
<minute>, <year>, <file>

where:

...
<abbreviated month>
Equivalent to the format of the %b conversion specification format in date.
<day-of-month>
Equivalent to the format of the %e conversion specification format in date.
<hour> Equivalent to the format of the %H conversion specification format in date.
<minute> Equivalent to the format of the %M conversion specification format in date.
<year> Equivalent to the format of the %Y conversion specification format in date.
```

This actually used to be the format printed by llvm-ar. It was apparently accidentally changed (see r207385 followed by comments in r207387). This makes it conform to GNU ar for easier replacement.

Reviewers: MaskRay

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52940

llvm-svn: 343901
2018-10-05 23:25:39 +00:00
Petr Hosek 227f25420a [llvm-nm] Update all tests to redirect stderr to stdout
This addresses the breakage introduced in r343887.

llvm-svn: 343896
2018-10-05 22:16:37 +00:00
Petr Hosek 62f6462bf9 [llvm-nm] Write "no symbol" output to stderr
This matches the output of binutils' nm and ensures that any scripts
or tools that use nm and expect empty output in case there no symbols
don't break.

Differential Revision: https://reviews.llvm.org/D52943

llvm-svn: 343887
2018-10-05 21:10:03 +00:00
Vedant Kumar 5931b4e5b5 [DebugInfo] Add support for DWARF5 call site-related attributes
DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which
indicates that all calls (both regular and tail) within the subprogram
have call site entries. The information within these call site entries
can be used by a debugger to populate backtraces with synthetic tail
call frames.

Tail calling frames go missing in backtraces because the frame of the
caller is reused by the callee. Call site entries allow a debugger to
reconstruct a sequence of (tail) calls which led from one function to
another. This improves backtrace quality. There are limitations: tail
recursion isn't handled, variables within synthetic frames may not
survive to be inspected, etc. This approach is not novel, see:

  https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf

This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers
to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation
support needed to emit standards-compliant call site entries. For easier
deployment, when the debugger tuning is LLDB, the DWARF requirement is
adjusted to v4.

Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo
clang binary. Its dSYM passed verification and grew by 1.4% compared to
the baseline. 151,879 call site entries were added.

rdar://42001377

Differential Revision: https://reviews.llvm.org/D49887

llvm-svn: 343883
2018-10-05 20:37:17 +00:00
Simon Pilgrim f09fc3bc12 [X86] Move ReadAfterLd functionality into X86FoldableSchedWrite (PR36957)
Currently we hardcode instructions with ReadAfterLd if the register operands don't need to be available until the folded load has completed. This doesn't take into account the different load latencies of different memory operands (PR36957).

This patch adds a ReadAfterFold def into X86FoldableSchedWrite to replace ReadAfterLd, allowing us to specify the load latency at a scheduler class level.

I've added ReadAfterVec*Ld classes that match the XMM/Scl, XMM and YMM/ZMM WriteVecLoad classes that we currently use, we can tweak these values in future patches once this infrastructure is in place.

Differential Revision: https://reviews.llvm.org/D52886

llvm-svn: 343868
2018-10-05 17:57:29 +00:00
Adrian Prantl 7875142b5c Format the dwarfdump --statistics version as an integer instead of a string.
llvm-svn: 343864
2018-10-05 17:41:30 +00:00
Simon Pilgrim 6ad03ad34b [llvm-mca][x86] Add PR36951 ReadAfterLd test case
llvm-svn: 343795
2018-10-04 16:26:56 +00:00
Greg Bedwell dee7bfdb9f [utils] Ensure that update_mca_test_checks.py writes prefixes in alphabetical order
llvm-svn: 343783
2018-10-04 14:42:19 +00:00
Simon Pilgrim 82a3b1c687 [llvm-mca][x86] Add tests demonstrating ReadAfterLd delay
llvm-svn: 343773
2018-10-04 13:05:42 +00:00
Clement Courbet 217ed1ffff [llvm-exegesis][NFC] Test sched class names only in !NDEBUG mode.
Sched classes have no names in NDEBUG.

llvm-svn: 343755
2018-10-04 07:07:16 +00:00
Fangrui Song 5fbdce131d [llvm-exegesis] Unbreak analysis-uops-variant.test introduced in D52825
A `defined(NDEBUG) && !defined(LLVM_ENABLE_DUMP)` build does not call
writeEscaped and there will be no `SBWriteZeroLatency` in the output.

llvm-svn: 343751
2018-10-04 03:32:47 +00:00
Jordan Rupprecht 53cb573564 [llvm-nm] Print an explicit "no symbols" message when an object file has no symbols
Summary:
GNU nm (and other nm implementations, such as "go tool nm") prints an explicit "no symbols" message when an object file has no symbols. Currently llvm-nm just doesn't print anything. Adding an explicit "no symbols" message will allow llvm-nm to be used in place of nm: some scripts and build processes use `nm <file> | grep "no symbols"` as a test to see if a file has no symbols. It will also be more familiar to anyone used to nm.

That said, the format implemented here is slightly different, in that it doesn't print the tool name in the message (which IMHO is not useful to include).

Demo:
```
$ for nm in nm bin/llvm-nm ; do echo "nm implementation: $nm"; $nm /tmp/foo{1,2}.o; echo; done
nm implementation: nm

/tmp/foo1.o:
nm: /tmp/foo1.o: no symbols

/tmp/foo2.o:
0000000000000000 T foo2

nm implementation: bin/llvm-nm

/tmp/foo1.o:
no symbols

/tmp/foo2.o:
0000000000000000 T foo2
```

Reviewers: MaskRay

Reviewed By: MaskRay

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52810

llvm-svn: 343742
2018-10-03 23:39:49 +00:00
Simon Pilgrim 0b451a2983 [X86][Btver2] Fix MMX PSHUFB schedule
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343701
2018-10-03 18:18:50 +00:00
Andrea Di Biagio 207e0217f9 [llvm-mca] Add support for move elimination in class RegisterFile.
This patch teaches class RegisterFile how to analyze register writes from
instructions that are move elimination candidates.
In particular, it teaches it how to check if a move can be effectively eliminated
by the underlying PRF, and (if necessary) how to perform move elimination.

The long term goal is to allow processor models to describe instructions that
are valid move elimination candidates.
The idea is to let register file definitions in tablegen declare if/when moves
can be eliminated.

This patch is a non functional change.
The logic that performs move elimination is currently disabled.  A future patch
will add support for move elimination in the processor models, and enable this
new code path.

llvm-svn: 343691
2018-10-03 15:02:44 +00:00
Clement Courbet d5a39553ff [llvm-exegesis] Resolve variant classes in analysis.
Summary: See PR38884.

Reviewers: gchatelet

Subscribers: tschuett, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D52825

llvm-svn: 343680
2018-10-03 11:50:25 +00:00
Simon Pilgrim c68cc4efbe [X86][Btver2] Most RMW instructions don't require an additional uop
Remove uop on WriteRMW and move it into the few instructions that need it.

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343671
2018-10-03 10:28:43 +00:00
Simon Pilgrim d11015861c [X86] ALU/ADC RMW instructions should use the WriteRMW sequence class
I was expecting this to be a nfc but Silvermont seems to be setup a little differently:

// A folded store needs a cycle on MEC_RSV for the store data, but it does not need an extra port cycle to recompute the address.
def : WriteRes<WriteRMW, [SLM_MEC_RSV]>;

So moving from WriteStore to WriteRMW reduces predicted port pressure, confirmed by @craig.topper that this is correct.

Differential Revision: https://reviews.llvm.org/D52740

llvm-svn: 343670
2018-10-03 10:01:13 +00:00
Simon Pilgrim 860cb5c071 [X86][Btver2] Fix BLENDV and AESDEC schedules
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343597
2018-10-02 15:13:18 +00:00
Zachary Turner a5e3e02602 [PDB] Add support for dumping Typedef records.
These work a little differently because they are actually in
the globals stream and are treated as symbol records, even though
DIA presents them as types.  So this also adds the necessary
infrastructure to cache records that live somewhere other than
the TPI stream as well.

llvm-svn: 343507
2018-10-01 17:55:38 +00:00
Simon Pilgrim e0d2019052 [X86][Btver2] Fix BT(C|R|S)mr & BT(C|R|S)mi schedule latency + uop counts
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343494
2018-10-01 16:31:30 +00:00
Simon Pilgrim 6ddc4e821c [X86][Btver2] Fix BTmr schedule uop counts
Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343484
2018-10-01 14:42:16 +00:00
Simon Pilgrim a982236e59 [X86][Btver2] Fix masked load schedule
JFPU01 resource usage should match JFPX

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343468
2018-10-01 13:12:05 +00:00
Puyan Lotfi 06e65cae4a [NFC] Adding "REQUIRES: zlib" to a llvm-objcopy test for bots without zlib.
M    test/tools/llvm-objcopy/compress-and-decompress-debug-sections-error.test

llvm-svn: 343454
2018-10-01 10:50:23 +00:00
Andrea Di Biagio 24ea163007 [X86][BtVer2] Teach how to identify zero-idiom VPERM2F128rr instructions.
This patch adds another variant class to identify zero-idiom VPERM2F128rr
instructions.

On Jaguar, a VPERM wih bit 3 and 7 of the mask set, is a zero-idiom.

Differential Revision: https://reviews.llvm.org/D52663

llvm-svn: 343452
2018-10-01 10:35:13 +00:00
Puyan Lotfi af048648d3 [llvm-objcopy] Adding support for decompressing zlib compressed dwarf sections.
Summary: I had added support for compressing dwarf sections in a prior commit,
         this one adds support for decompressing. Usage is:

         llvm-objcopy --decompress-debug-sections input.o output.o

Reviewers: jakehehrlich, jhenderson, alexshap	

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D51841

llvm-svn: 343451
2018-10-01 10:29:41 +00:00
Clement Courbet a933fb237e [X86][Sched] Update scheduling information for VZEROALL on HWS, BDW, SKX, SNB.
Summary:
    While looking at PR35606, I found out that the scheduling info is incorrect.

    One can check that it's really a P5+P6 and not a 2*P56 with:
    echo -e 'vzeroall\nvandps %xmm1, %xmm2, %xmm3' | ./bin/llvm-exegesis -mode=uops -snippets-file=-
    (vandps executes on P5 only)

    Reviewers: craig.topper, RKSimon

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D52541

llvm-svn: 343447
2018-10-01 08:37:48 +00:00
Simon Pilgrim f21083870d [X86] Fix scheduler class for BTmi instructions
This wasn't treated as a folded load instruction

llvm-svn: 343424
2018-09-30 20:19:16 +00:00
Simon Pilgrim b1108399bd [LLVM-MCA][X86] Add missing VCMPESTR/VCMPESTR tests
llvm-svn: 343421
2018-09-30 18:19:00 +00:00
Simon Pilgrim 20623f2343 [LLVM-MCA][X86] Add some AVX512 tests
These are going to be necessary to check I don't mess up when I start cleaning up all the remaining vector integer overrides

llvm-svn: 343414
2018-09-30 17:01:59 +00:00
Simon Pilgrim 4f5693ac8d [X86][Btver2] Fix PCmpIStrI/PCmpIStrM schedules
Missing JFPU0 pipe and double JFPU1 pipe (to match JVALU1) resources

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343413
2018-09-30 16:38:38 +00:00
Zachary Turner a1e79e326a Fix some tests on Windows.
I don't actually have a Windows machine at the present moment,
so hopefully this fixes it.

llvm-svn: 343397
2018-09-30 00:22:21 +00:00
Andrea Di Biagio 6e218d0a57 [llvm-mca] Add a test for zero-idiom VPERM2F128rr. NFC
We don't correctly model the latency and resource usage information for
zero-idiom VPERM2F128rr on Jaguar.

This is demonstrated by the incorrect numbers in the resource pressure view, and
the timeline view.
A follow up patch will fix this problem.

llvm-svn: 343346
2018-09-28 17:47:09 +00:00
Luke Cheeseman 10981cc884 Revert r343317
- asan buildbots are breaking and I need to investigate the issue

llvm-svn: 343341
2018-09-28 17:01:50 +00:00
Greg Bedwell becbbe0383 [utils] Stricter checking from update_mca_test_checks.py
If any prefixes have been specified on the RUN lines that do not end up
ever actually getting printed, raise an Error. This is either an
indication that the run lines just need cleaning up, or that something
is more fundamentally wrong with the test.

Also raise an Error if there are any blocks which cannot be checked
because they are not uniquely covered by a prefix.

Fixed up a couple of tests where the extra checking flagged up issues.

Differential Revision: https://reviews.llvm.org/D48276

llvm-svn: 343332
2018-09-28 15:39:09 +00:00
Greg Bedwell 2f528f8c1e [utils] Allow better identification of matching blocks in update_mca_test_checks.py
Insert empty blocks to cause the positions of matching blocks to match
across lists where possible so that later stages of the algorithm can
actually identify them as being identical.

Regenerated all tests with this change.

Differential Revision: https://reviews.llvm.org/D52560

llvm-svn: 343331
2018-09-28 15:38:56 +00:00
Simon Pilgrim 428c1196d8 [X86][Btver2] PSUBS/PSUBUS instructions are zero-idioms
Noticed during llvm-exegesis tests, the PSUBS/PSUBUS instructions have the same zero-idiom behaviour to PSUB

llvm-svn: 343321
2018-09-28 14:20:42 +00:00
Simon Pilgrim 3216fd3602 [X86][Btver2] Add zero-idiom tests for PSUBS/PSUBUS instructions
Noticed during llvm-exegesis tests, the PSUBS/PSUBUS instructions have the same zero-idiom behaviour to PSUB

llvm-svn: 343319
2018-09-28 13:53:11 +00:00