Commit Graph

21 Commits

Author SHA1 Message Date
Roman Lebedev 9c4c2f2472
[SimplifyCFG] Tail-merging all blocks with `ret` terminator
Based ontop of D104598, which is a NFCI-ish refactoring.
Here, a restriction, that only empty blocks can be merged, is lifted.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D104597
2021-06-24 13:15:39 +03:00
David Green 8cfc080132 [ARM] Limit v6m unrolling with multiple live outs
v6m cores only have a limited number of registers available. Unrolling
can mean we spend more on stack spills and reloads than we save from the
unrolling. This patch adds an extra heuristic to put a limit on the
unroll count for loops with multiple live out values, as measured from
the LCSSA phi nodes.

Differential Revision: https://reviews.llvm.org/D104659
2021-06-23 16:36:37 +01:00
Roman Lebedev ff4b1d379f
[NFCI-ish][SimplifyCFGPass] Rework and generalize `ret` block tail-merging
This changes the approach taken to tail-merge the blocks
to always create a new block instead of trying to reuse some block,
and generalizes it to support dealing not with just the `ret` in the future.

This effectively lifts the CallBr restriction, although this isn't really intentional.
That is the only non-NFC change here, i'm not sure if it's reasonable/feasible to temporarily retain it.

Other restrictions of the transform remain.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D104598
2021-06-23 14:33:18 +03:00
David Green da98177cda [ARM] Allow v6m runtime loop unrolling
This removes the restriction that only Thumb2 targets enable runtime
loop unrolling, allowing it for Thumb1 only cores as well. The existing
T2 heuristics are used (for the time being) to control when and how
unrolling is performed.

Differential Revision: https://reviews.llvm.org/D99588
2021-04-01 21:21:40 +01:00
David Green 14b2ec934e [ARM] Enable UpperBound unrolling for all loops
This UpperBound unrolling was already enabled so long as a series of
conditions in ARMTTIImpl::getUnrollingPreferences pass. This just always
enables it as it can help fully unroll loops that would not otherwise
pass those tests.

Differential Revision: https://reviews.llvm.org/D99174
2021-03-24 16:39:21 +00:00
David Green 003fab9e8d [ARM] Additional Upper bound unrolling test. NFC 2021-03-23 12:00:40 +00:00
David Green c7e275388e [ARM] Don't aggressively unroll vector remainder loops
We already do not unroll loops with vector instructions under MVE, but
that does not include the remainder loops that the vectorizer produces.
These remainder loops will be rarely executed and are not worth
unrolling, as the trip count is likely to be low if they get executed at
all. Luckily they get llvm.loop.isvectorized to make recognizing them
simpler.

We have wanted to do this for a while but hit issues with low overhead
loops being reverted due to difficult registry allocation. With recent
changes that seems to be less of an issue now.

Differential Revision: https://reviews.llvm.org/D90055
2020-11-10 17:01:31 +00:00
David Green 44c1a56869 [ARM] Add extra MVE tests for various patches. NFC 2020-11-01 16:24:23 +00:00
Sam Parker ea8448e361 [LoopUnroll] Adjust CostKind query
When TTI was updated to use an explicit cost, TCK_CodeSize was used
although the default implicit cost would have been the hand-wavey
cost of size and latency. So, revert back to this behaviour. This is
not expected to have (much) impact on targets since most (all?) of
them return the same value for SizeAndLatency and CodeSize.

When optimising for size, the logic has been changed to query
CodeSize costs instead of SizeAndLatency.

This patch also adds a testing option in the unroller so that
OptSize thresholds can be specified.

Differential Revision: https://reviews.llvm.org/D85723
2020-08-12 12:56:09 +01:00
Sjoerd Meijer 356685a1d8 Follow up of 67bf9a6154, minor fix in test case, removed duplicate option 2020-01-10 09:41:41 +00:00
Sjoerd Meijer 67bf9a6154 [SVEV] Recognise hardware-loop intrinsic loop.decrement.reg
Teach SCEV about the @loop.decrement.reg intrinsic, which has exactly the same
semantics as a sub expression. This allows us to query hardware-loops, which
contain this @loop.decrement.reg intrinsic, so that we can calculate iteration
counts, exit values, etc. of hardwareloops.

This "int_loop_decrement_reg" intrinsic is defined as "IntrNoDuplicate". Thus,
while hardware-loops and tripcounts now become analysable by SCEV, this
prevents the usual loop transformations from applying transformations on
hardware-loops, which is what we want at this point, for which I have added
test cases for loopunrolling and IndVarSimplify and LFTR.

Differential Revision: https://reviews.llvm.org/D71563
2020-01-10 09:35:00 +00:00
Sam Parker 15c7fa4d11 [ARM][MVE] Don't unroll intrinsic loops.
We don't unroll vector loops for MVE targets, but we miss the case
when loops only contain intrinsic calls. So just move the logic a
bit to catch this case.

Differential Revision: https://reviews.llvm.org/D72440
2020-01-09 11:57:34 +00:00
David Green 11c4602fce [MVE] Don't try to unroll vectorised MVE loops
Due to the nature of the beat system in the MVE architecture, along with tail
predication and low-overhead loops, unrolling has less benefit compared to
normal loops. You can not, for example, hide the latency of a load with other
instructions as you can for scalar code. Preventing unrolling also makes the
code easier to read and reason about.

So if a loop contains vector code, don't enable the runtime unrolling. At least
for the time being.

Differential Revision: https://reviews.llvm.org/D65803

llvm-svn: 368530
2019-08-11 08:53:18 +00:00
Fangrui Song ac14f7b10c [lit] Delete empty lines at the end of lit.local.cfg NFC
llvm-svn: 363538
2019-06-17 09:51:07 +00:00
David Green d847aa573b [ARM] Enable Unroll UpperBound
This option allows loops with small max trip counts to be fully unrolled. This
can help with code like the remainder loops from manually unrolled loops like
those that appear in the cmsis dsp library. We would apparently previously
runtime unroll them with the default unroll count (4).

Differential Revision: https://reviews.llvm.org/D63064

llvm-svn: 362928
2019-06-10 10:22:14 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Sam Parker a16667e79b [ARM] Use Cortex-A57 sched model for Cortex-A72
This mirrors what we already do for AArch64 as the cores are similar.
As discussed in the review, enabling the machine scheduler causes
more variations in performance changes so it is not enabled for now.
This patch improves LNT scores by a geomean of 1.57% at -O3.

Differential Revision: https://reviews.llvm.org/D53562

llvm-svn: 345272
2018-10-25 15:08:29 +00:00
Sam Parker 487ab86942 [ARM] Allow unrolling of multi-block loops.
Before, loop unrolling was only enabled for loops with a single
block. This restriction has been removed and replaced by:
- allow a maximum of two exiting blocks,
- a four basic block limit for cores with a branch predictor.

Differential Revision: https://reviews.llvm.org/D38952

llvm-svn: 316313
2017-10-23 08:05:14 +00:00
Sam Parker 84fd0c3bf2 [ARM] Improve loop unrolling for Cortex-M
- Set the default runtime unroll count to 4 and use the newly added
  UnrollRemainder option.
- Create loop cost and force unroll for a cost less than 12.
- Disable unrolling on Thumb1 only targets.

Differential Revision: https://reviews.llvm.org/D36134

llvm-svn: 310997
2017-08-16 07:42:44 +00:00
Sam Parker 19a08e42a8 [ARM] Enable partial and runtime unrolling
Enable runtime and partial loop unrolling of simple loops without
calls on M-class cores. The thresholds are calculated based on
whether the target is Thumb or Thumb-2.

Differential Revision: https://reviews.llvm.org/D34619

llvm-svn: 308956
2017-07-25 08:51:30 +00:00