Commit Graph

143 Commits

Author SHA1 Message Date
Congzhe Cao eac3487510 [LoopInterchange] Try to achieve the most optimal access pattern after interchange
Motivated by pr43326 (https://bugs.llvm.org/show_bug.cgi?id=43326), where a slightly
modified case is as follows.

 void f(int e[10][10][10], int f[10][10][10]) {
   for (int a = 0; a < 10; a++)
     for (int b = 0; b < 10; b++)
       for (int c = 0; c < 10; c++)
         f[c][b][a] = e[c][b][a];
 }

The ideal optimal access pattern after running interchange is supposed to be the following

 void f(int e[10][10][10], int f[10][10][10]) {
   for (int c = 0;  c < 10; c++)
     for (int b = 0; b < 10; b++)
       for (int a = 0; a < 10; a++)
         f[c][b][a] = e[c][b][a];
 }

Currently loop interchange is limited to picking up the innermost loop and finding an order
that is locally optimal for it. However, the pass failed to produce the globally optimal
loop access order. For more complex examples what we get could be quite far from the
globally optimal ordering.

What is proposed in this patch is to do a "bubble-sort" fashion when doing interchange.
By comparing neighbors in `LoopList` in each iteration, we would be able to move each loop
onto a most appropriate place, hence this is an approach that tries to achieve the
globally optimal ordering.

The motivating example above is added as a test case.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D120386
2022-04-06 15:31:56 -04:00
Congzhe Cao abc8ca65c3 [LoopInterchange] Detect output dependency of a store instruction with itself
This patch is motivated by pr48057 where an output dependency is not detected
since loop interchange did not check a store instruction with itself.
Fixed that deficiency.

Reviewed By: bmahjour, Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D118102
2022-03-09 15:50:27 -05:00
serge-sans-paille 59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Congzhe Cao 1ef04326ec [LoopInterchange] Support loop interchange with floating point reductions
Enabled loop interchange support for floating point reductions
if it is allowed to reorder floating point operations.

Previously when we encouter a floating point PHI node in the
outer loop exit block, we bailed out since we could not detect
floating point reductions in the early days. Now we remove this
limiation since we are able to detect floating point reductions.

Reviewed By: #loopoptwg, Meinersbur

Differential Revision: https://reviews.llvm.org/D117450
2022-02-06 17:04:47 -05:00
Congzhe Cao fa6a2876c7 [LoopInterchange] Enable interchange with multiple inner loop indvars
Currently loop interchange only supports loops with one inner loop
induction variable. This patch adds support for transformation with
more than one inner loop induction variables. The induction PHIs and
induction increment instructions are moved/duplicated properly to the
new outer header and the new outer latch, respectively.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D114917
2022-01-14 16:28:41 -05:00
Congzhe Cao 37e34b74e9 [LoopInterchange] Enable interchange with multiple outer loop indvars
This patch enables loop interchange with multiple outer loop
induction variables, and hence removes the limitation that only
a single outer loop induction variable is supported. In fact, it
turns out that the current pass already trivially supports multiple
outer indvars, which is the result of a previous patch
`https://reviews.llvm.org/D102743`. Therefore, this patch removed that
limitation and provides test cases for multiple outer indvars.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D114916
2022-01-13 16:51:32 -05:00
Congzhe Cao c251bfc3b9 [LoopInterchange] Remove a limitation in LoopInterchange legality
There was a limitation in legality that in the original inner loop latch,
no instruction was allowed between the induction variable increment
and the branch instruction. This is because we used to split the
inner latch at the induction variable increment instruction. Since
now we have split at the inner latch branch instruction and have
properly duplicated instructions over to the split block, we remove
this limitation.

Please refer to the test case updates to see how we now interchange
loops where instructions exist between the induction variable
increment and the branch instruction.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D115238
2022-01-06 15:56:32 -05:00
David Blaikie 31b79b86ee Revert "Remove unused variable (-Wunused)"
Patch that removed the use of this variable was  reverted in
8ade3d43a3

This reverts commit 3988a06d86.
2022-01-05 20:43:30 -08:00
Congzhe Cao 8ade3d43a3 Revert "[LoopInterchange] Remove a limitation in LoopInterchange legality"
This reverts commit 15702ff9ce while I
investigate a ppc build bot failure at
https://lab.llvm.org/buildbot#builders/36/builds/16051.
2022-01-05 23:34:36 -05:00
David Blaikie 3988a06d86 Remove unused variable (-Wunused) 2022-01-05 20:29:35 -08:00
Congzhe Cao 15702ff9ce [LoopInterchange] Remove a limitation in LoopInterchange legality
There was a limitation in legality that in the original inner loop latch,
no instruction was allowed between the induction variable increment
and the branch instruction. This is because we used to split the
inner latch at the induction variable increment instruction. Since
now we have split at the inner latch branch instruction and have
properly duplicated instructions over to the split block, we remove
this limitation.

Please refer to the test case updates to see how we now interchange
loops where instructions exist between the induction variable increment
and the branch instruction.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D115238
2022-01-05 22:37:54 -05:00
Philip Reames c16fd6a376 Rename doesNotReadMemory to onlyWritesMemory globally [NFC]
The naming has come up as a source of confusion in several recent reviews.  onlyWritesMemory is consist with onlyReadsMemory which we use for the corresponding readonly case as well.
2022-01-05 08:52:55 -08:00
Benjamin Kramer 9b8b16457c Put implementation details into anonymous namespaces. NFCI. 2021-11-07 15:18:30 +01:00
Kazu Hirata c714da2ceb [Transforms] Use {DenseSet,SetVector,SmallPtrSet}::contains (NFC) 2021-10-31 07:57:32 -07:00
Congzhe Cao a7b7d22d6e [LoopInterchange] Check lcssa phis in the inner latch in scenarios of multi-level nested loops
We already know that we need to check whether lcssa
phis are supported in inner loop exit block or in
outer loop exit block, and we have logic to check
them already. Presumably the inner loop latch does
not have lcssa phis and there is no code that deals
with lcssa phis in the inner loop latch. However,
that assumption is not true, when we have loops
with more than two-level nesting. This patch adds
checks for lcssa phis in the inner latch.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D102300
2021-07-16 11:59:20 -04:00
Congzhe Cao bfefde22b6 [LoopInterhcange] Handle movement of reduction phis appropriately
This patch fixes pr43326 and pr48212.

Currently when we move reduction phis to the right place,
loop interchange assumes the first phi in loop headers is
an induction phi, skips the first phi and assumes the rest
of phis are candidate reduction phis to move. However, it
may not always be the case.

This patch loops over all phis in loop headers and considers
a phi node as a candidate reduction phi to move only when it
is indeed a reduction phi across outer and inner loop.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D102743
2021-05-31 16:27:38 -04:00
Congzhe Cao 3f8be15f29 [LoopInterchange] Handle lcssa PHIs with multiple predecessors
This is a bugfix in the transformation phase.

If the original outer loop header branches to both the inner loop
(header) and the outer loop latch, and if there is an lcssa PHI
node outside the loop nest, then after interchange the new outer latch
will have an lcssa PHI node inserted which has two predecessors, i.e.,
the original outer header and the original outer latch. Currently
the transformation assumes it has only one predecessor (the original
outer latch) and crashes, since the inserted lcssa PHI node does
not take both predecessors as incoming BBs.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D100792
2021-05-11 21:30:54 -04:00
Congzhe Cao 40e3aa39bd [LoopInterchange] Fix legality for triangular loops
This is a bug fix in legality check.

When we encounter triangular loops such as the following form:
    for (int i = 0; i < m; i++)
      for (int j = 0; j < i; j++), or

    for (int i = 0; i < m; i++)
      for (int j = 0; j*i < n; j++),

we should not perform interchange since the number of executions
of the loop body will be different before and after interchange,
resulting in incorrect results.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D101305
2021-05-11 18:36:53 -04:00
Congzhe Cao d3f89d4d16 Revert "[LoopInterchange] Fix legality for triangular loops"
This reverts commit 29342291d2.

The test case requires an assert build. Will add REQUIRES and re-commit.
2021-05-11 18:10:58 -04:00
Congzhe Cao 29342291d2 [LoopInterchange] Fix legality for triangular loops
This is a bug fix in legality check.

When we encounter triangular loops such as the following form:
    for (int i = 0; i < m; i++)
      for (int j = 0; j < i; j++), or

    for (int i = 0; i < m; i++)
      for (int j = 0; j*i < n; j++),

we should not perform interchange since the number of executions of the loop body
will be different before and after interchange, resulting in incorrect results.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D101305
2021-05-11 11:00:46 -04:00
Congzhe Cao ce2db9005d [LoopInterchange] Fix transformation bugs in loop interchange
After loop interchange, the (old) outer loop header should not jump to
the `LoopExit`. Note that the old outer loop becomes the new inner loop
after interchange. If we branched to `LoopExit` then after interchange
we would jump directly from the (new) inner loop header to `LoopExit`
without executing the rest of outer loop.

This patch modifies adjustLoopBranches() such that the old outer
loop header (which becomes the new inner loop header) jumps to the
old inner loop latch which becomes the new outer loop latch after
interchange.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D98475
2021-04-08 14:58:13 -04:00
Congzhe Cao 593cb46550 Revert "[LoopInterchange] Fix transformation bugs in loop interchange"
This reverts commit 6ec68bd815d00c1eec2a6b9766452554f0e6cb61.
2021-04-07 21:17:30 -04:00
CongzheUalberta f5645ea65f [LoopInterchange] Fix transformation bugs in loop interchange
After loop interchange, the (old) outer loop header should not jump to
`LoopExit`. Note that the old outer loop becomes the new inner loop
after interchange. If we branched to `LoopExit` then after interchange
we would jump directly from the (new) inner loop header to `LoopExit`
without executing the rest of (new) outer loop.

This patch modifies adjustLoopBranches() such that the old outer
loop header (which becomes the new inner loop header) jumps to the
old inner loop latch which becomes the new outer loop latch after
interchange.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D98475
2021-04-07 20:55:44 -04:00
Congzhe Cao 829c1b6443 [LoopInterchange] fix tightlyNested() in LoopInterchange legality
This is yet another attempt to fix tightlyNested().

Add checks in tightlyNested() for the inner loop exit block,
such that 1) if there is control-flow divergence in between the inner
loop exit block and the outer loop latch, or 2) if the inner loop exit
block contains unsafe instructions, tightlyNested() returns false.

The reasoning behind is that after interchange, the original inner loop
exit block, which was part of the outer loop, would be put into the new
inner loop, and will be executed different number of times before and
after interchange. Thus it should be dealt with appropriately.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D98263
2021-03-24 15:49:25 -04:00
Ta-Wei Tu 7ff2768be1 Revert "[LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest`"
This reverts commit df9158c9a4.
2021-03-11 01:24:43 +08:00
Ta-Wei Tu df9158c9a4 [LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest`
The check `tightlyNested()` in `LoopInterchange` is similar to the one in `LoopNest`.
In fact, the former misses some cases where loop-interchange is not feasible and results in incorrect behaviour.
Replacing it with the much robust version provided by `LoopNest` reduces code duplications and fixes https://bugs.llvm.org/show_bug.cgi?id=48113.

`LoopInterchange` has a weaker definition of tightly or perfectly nesting-ness than the one implemented in `LoopNest::arePerfectlyNested()`.
Therefore, `tightlyNested()` is instead implemented with `LoopNest::checkLoopsStructure` and additional checks for unsafe instructions.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D97290
2021-03-08 11:36:08 +08:00
Ta-Wei Tu ea1a1ebbc6 [NFC] Use std::swap in LoopInterchange 2021-03-02 11:42:48 +08:00
Kazu Hirata 5fc9e30985 [Scalar] Use range-based for loops (NFC) 2021-02-25 19:54:38 -08:00
Ta-Wei Tu 0eeaec2a6d [NFC] Refactor LoopInterchange into a loop-nest pass
This is the preliminary patch of converting `LoopInterchange` pass to a loop-nest pass and has no intended functional change.
Changes that are not loop-nest related are split to D96650.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D96644
2021-02-18 00:55:38 +08:00
Ta-Wei Tu 6b612a7baf [NFC][LoopInterchange] Explicitly pass both `InnerLoop` and `OuterLoop` to `processLoop`
This is a split patch of D96644.

Explicitly pass both `InnerLoop` and `OuterLoop` to function `processLoop` to remove the need to swap elements in loop list and allow making loop list an `ArrayRef`.
Also, fix inconsistent spellings of `OuterLoopId` and `Inner Loop Id` in debug log.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D96650
2021-02-16 22:17:44 +08:00
Anton Rapetov bfec9148a0 Scalar: Don't visit constants in findInnerReductionPhi in LoopInterchange
In LoopInterchange, `findInnerReductionPhi()` looks for reduction
variables, which cannot be constants. Update it to return early in that
case.

This also addresses a blocker for removing use-lists from ConstantData,
whose users could be spread across arbitrary modules in the same
LLVMContext.

Differential Revision: https://reviews.llvm.org/D94712
2021-01-21 12:33:06 -08:00
Kazu Hirata 23b0ab2acb [llvm] Use the default value of drop_begin (NFC) 2021-01-18 10:16:36 -08:00
Philip Reames 10ddb927c1 [SCEV] Use isa<> pattern for testing for CouldNotCompute [NFC]
Some older code - and code copied from older code - still directly tested against the singelton result of SE::getCouldNotCompute.  Using the isa<SCEVCouldNotCompute> form is both shorter, and more readable.
2020-11-24 18:47:49 -08:00
Kazu Hirata 43c0e4f665 [Transforms] Use llvm::is_contained (NFC) 2020-11-18 20:42:22 -08:00
Florian Hahn e8dc17a2b7
[LoopInterchange] Skip non SCEV-able operands in cost function.
This fixes a crash when trying to get a SCEV expression for operands
that are not SCEV-able.
2020-11-08 11:41:19 +00:00
Arthur Eubanks 9c21c6c966 [LoopInterchange][NewPM] Port -loop-interchange to NPM
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D89058
2020-10-09 09:21:31 -07:00
Stefanos Baziotis 89c1e35f3c [LoopInfo] empty() -> isInnermost(), add isOutermost()
Differential Revision: https://reviews.llvm.org/D82895
2020-09-22 23:28:51 +03:00
Florian Hahn 8393b9fd1f [LoopInterchange] Move instructions from preheader to outer loop header.
Instructions defined in the original inner loop preheader may depend on
values defined in the outer loop header, but the inner loop header will
become the entry block in the loop nest. Move the instructions from the
preheader to the outer loop header, so we do not break dominance. We
also have to check for unsafe instructions in the preheader. If there
are no unsafe instructions, all instructions should be movable.

Currently we move all instructions except the terminator and rely on
LICM to hoist out invariant instructions later.

Fixes PR45743
2020-08-10 12:41:33 +01:00
Florian Hahn 54cb552b96 [LoopInterchange] Form LCSSA phis for values in orig outer loop header.
Values defined in the outer loop header could be used in the inner loop
latch. In that case, we need to create LCSSA phis for them, because after
interchanging they will be defined in the new inner loop and used in the
new outer loop.
2020-08-10 11:33:19 +01:00
Kazu Hirata 60434989e5 Use llvm::is_contained where appropriate (NFC)
Use llvm::is_contained where appropriate (NFC)

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D85083
2020-08-01 21:51:06 -07:00
Benjamin Kramer 3badd17b69 SmallPtrSet::find -> SmallPtrSet::count
The latter is more readable and more efficient. While there clean up
some double lookups. NFCI.
2020-06-07 22:38:08 +02:00
Alexey Zhikhartsev f71abec661 [LoopInterchange] Fix interchanging contents of preheader BBs
Summary:
Previously LCSSA was getting broken by placing instructions into the
(newly) inner *header* instead of the *pre*header.

Fixes PR43474

Reviewers: fhahn

Reviewed By: fhahn

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75943
2020-03-13 15:59:37 -04:00
Florian Hahn e8a5c17211 [LoopInterchange] Improve inner exit loop safety checks.
The PHI node checks for inner loop exits are too permissive currently.
As indicated by an existing comment, we should only allow LCSSA PHI
nodes that are part of reductions or are only used outside of the loop
nest. We ensure this by checking the users of the LCSSA PHIs.
Specifically, it is not safe to use an exiting value from the inner loop in the latch of the outer
loop.

It also moves the inner loop exit check before the outer loop exit
check.

Fixes PR43473.

Reviewers: efriedma, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D68144
2019-12-04 17:46:01 +00:00
Florian Hahn 9a432161c6 [LoopInterchange] Adjust assertions when updating successors.
Currently the assertion in updateSuccessor is overly strict in some
cases and overly relaxed in other cases. For branches to the inner and
outer loop preheader it is too strict, because they can either be
unconditional branches or conditional branches with duplicate targets.
Both cases are fine and we can allow updating multiple successors.

On the other hand, we have to at least update one successor. This patch
adds such an assertion.
2019-11-24 19:37:16 +00:00
Reid Kleckner 05da2fe521 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Florian Hahn 1ee93240c0 [LoopInterchange] Only skip PHIs with incoming values from the inner loop.
Currently we have limited support for outer loops with multiple basic
blocks after the inner loop exit. But the current checks for creating
PHIs for loop exit values only assumes the header and latches of the
outer loop. It is better to just skip incoming values defined in the
original inner loops. Those are handled earlier.

Reviewers: efriedma, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70059
2019-11-12 10:30:51 +00:00
Florian Hahn e79381c3f7 [LoopInterchange] Drop unused splitInnerLoopHeader declaration.
llvm-svn: 371601
2019-09-11 10:32:15 +00:00
Florian Hahn e4961218fd [LoopInterchange] Properly move condition, induction increment and ops to latch.
Currently we only rely on the induction increment to come before the
condition to ensure the required instructions get moved to the new
latch.

This patch duplicates and moves the required instructions to the
newly created latch. We move the condition to the end of the new block,
then process its operands. We stop at operands that are defined
outside the loop, or are the induction PHI.

We duplicate the instructions and update the uses in the moved
instructions, to ensure other users remain intact. See the added
test2 for such an example.

Reviewers: efriedma, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D67367

llvm-svn: 371595
2019-09-11 08:23:23 +00:00
Fangrui Song b251cc0d91 Delete dead stores
llvm-svn: 365903
2019-07-12 14:58:15 +00:00
Florian Hahn 11b2f4fe50 [LoopInterchange] Fix handling of LCSSA nodes defined in headers and latches.
The code to preserve LCSSA PHIs currently only properly supports
reduction PHIs and PHIs for values defined outside the latches.

This patch improves the LCSSA PHI handling to cover PHIs for values
defined in the latches.

Fixes PR41725.

Reviewers: efriedma, mcrosier, davide, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D61576

llvm-svn: 361743
2019-05-26 23:38:25 +00:00