Commit Graph

21 Commits

Author SHA1 Message Date
Francis Visoiu Mistrih 25528d6de7 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665
2017-12-04 17:18:51 +00:00
Florian Hahn ceb4494786 Recommit [MachineCombiner] Update instruction depths incrementally for large BBs.
This version of the patch fixes an off-by-one error causing PR34596. We
do not need to use std::next(BlockIter) when calling updateDepths, as
BlockIter already points to the next element.

Original commit message:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.

> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313751
2017-09-20 11:54:37 +00:00
Hans Wennborg 06e2a384c2 Revert r312719 "[MachineCombiner] Update instruction depths incrementally for large BBs."
This caused PR34596.

> [MachineCombiner] Update instruction depths incrementally for large BBs.
>
> Summary:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.
>
> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313213
2017-09-13 23:23:09 +00:00
Florian Hahn d39b8a3533 [MachineCombiner] Update instruction depths incrementally for large BBs.
Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and it's relevant successors/predecessors.

In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that AFAIK, computing the critical path
length cannot be done incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold and allows for easier
experimenting and checking if using incremental updates for all basic
blocks has any impact on the performance.

Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn

Reviewed By: fhahn

Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 312719
2017-09-07 12:49:39 +00:00
Sanjay Patel 766589efdc add 'MustReduceDepth' as an objective/cost-metric for the MachineCombiner
This is one of the problems noted in PR25016:
https://llvm.org/bugs/show_bug.cgi?id=25016
and:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html

The spilling problem is independent and not addressed by this patch.

The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. 
This is caused by inclusion of the "slack" factor when calculating the critical path of the original
code sequence. If we don't add that, then we have a more conservative cost comparison of the old code
sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64
MULADD patterns because benchmark regressions were observed without that.

The two failing test cases now have identical asm that does what we want:
a + b + c + d ---> (a + b) + (c + d)

Differential Revision: http://reviews.llvm.org/D13417

llvm-svn: 252616
2015-11-10 16:48:53 +00:00
Sanjay Patel 004ea240ad add test cases that demonstrate bad behavior
These are based on PR25016 and likely caused by a bug in
MachineCombiner's definition of improvesCriticalPathLen().

llvm-svn: 249249
2015-10-03 20:52:55 +00:00
Sanjay Patel f0bc07f7a5 [x86] enable machine combiner reassociations for 256-bit vector min/max
llvm-svn: 245735
2015-08-21 21:04:21 +00:00
Sanjay Patel cf942fa905 [x86] enable machine combiner reassociations for 128-bit vector min/max
llvm-svn: 245715
2015-08-21 18:06:49 +00:00
Sanjay Patel 9e5927fdc3 [x86] enable machine combiner reassociations for scalar double-precision min/max
llvm-svn: 245506
2015-08-19 21:27:27 +00:00
Sanjay Patel 4e3ee1e548 [x86] enable machine combiner reassociations for scalar single-precision maximums
llvm-svn: 245504
2015-08-19 21:18:46 +00:00
Sanjay Patel 40d4eb40f6 [x86] enable machine combiner reassociations for scalar single-precision minimums
llvm-svn: 245166
2015-08-15 17:01:54 +00:00
Sanjay Patel 3b7e3677e3 fix typos; NFC
llvm-svn: 245164
2015-08-15 16:53:08 +00:00
Sanjay Patel 9f6c7dddd2 add test case to show current codegen
llvm-svn: 245163
2015-08-15 16:49:50 +00:00
Sanjay Patel 260b6d36f4 [x86] enable machine combiner reassociations for 256-bit vector FP mul/add
llvm-svn: 244705
2015-08-12 00:29:10 +00:00
Sanjay Patel 2c6a01570d [x86] enable machine combiner reassociations for 128-bit vector single/double multiplies
llvm-svn: 244657
2015-08-11 20:19:23 +00:00
Sanjay Patel e0178262d4 [x86] enable machine combiner reassociations for 128-bit vector single/double adds
llvm-svn: 244403
2015-08-08 19:08:20 +00:00
Sanjay Patel 81beefc541 [x86] enable machine combiner reassociations for scalar double-precision multiplies
llvm-svn: 241873
2015-07-09 22:58:39 +00:00
Sanjay Patel ea81edf351 [x86] enable machine combiner reassociations for scalar double-precision adds
llvm-svn: 241871
2015-07-09 22:48:54 +00:00
Sanjay Patel 093fb170a6 [x86] enable machine combiner reassociations for scalar single-precision multiplies
llvm-svn: 241752
2015-07-08 22:35:20 +00:00
Sanjay Patel 681a56ac58 [x86] extend machine combiner reassociation optimization to SSE scalar adds
Extend the reassociation optimization of http://reviews.llvm.org/rL240361 (D10460)
to SSE scalar FP SP adds in addition to AVX scalar FP SP adds.

With the 'switch' in place, we can trivially add other opcodes and test cases in
future patches.

Differential Revision: http://reviews.llvm.org/D10975

llvm-svn: 241515
2015-07-06 22:35:29 +00:00
Sanjay Patel e79b43a01f [x86] generalize reassociation optimization in machine combiner to 2 instructions
Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass
to reassociate the following sequence to reduce the critical path:

A = ? op ?
B = A op X
C = B op Y
-->
A = ? op ?
B = X op Y
C = A op B

'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could
be any associative math/logic op (see TODO in code comment).

This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of
a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth
reassociating them.

This generalization has a compile-time cost because we can now match more instruction sequences
and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't
improve the critical path.

For example, in the new test case:

A = M div N
B = A add X
C = B add Y

We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path:

A = M div N
B = A add Y
C = B add X

We need the combiner to reject that pattern but select this:

A = M div N
B = X add Y
C = B add A

Differential Revision: http://reviews.llvm.org/D10460

llvm-svn: 240361
2015-06-23 00:39:40 +00:00