Commit Graph

8 Commits

Author SHA1 Message Date
David Green bee2f618d5 [ARM] Introduce t2WhileLoopStartTP
This adds t2WhileLoopStartTP, similar to the t2DoLoopStartTP added in
D90591. It keeps a reference to both the tripcount register and the
element count register, so that the ARMLowOverheadLoops pass in the
backend can pick the correct one without having to search for it from
the operand of a VCTP.

Differential Revision: https://reviews.llvm.org/D103236
2021-06-13 13:55:34 +01:00
David Green ce76093c3c [ARM] Expand predecessor search to multiple blocks when reverting WhileLoopStarts
We were previously only searching a single preheader for call
instructions when reverting WhileLoopStarts to DoLoopStarts. This
extends that to multiple blocks that can come up when, for example a
loop is expanded from a memcpy. It also expends the instructions from
just Call's to also include other LoopStarts, to catch other low
overhead loops in the preheader.

Differential Revision: https://reviews.llvm.org/D102269
2021-05-14 15:08:14 +01:00
Malhar Jajoo dfe3ffaa4a [ARM] Transforming memset to Tail predicated Loop
This patch converts llvm.memset intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

The llvm.memset is converted to a TP loop for both
constant and non-constant input sizes (of llvm.memset).

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100435
2021-05-07 13:35:53 +01:00
Malhar Jajoo 9ff38e2d9d [ARM] Transforming memcpy to Tail predicated Loop
This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

From an implementation point of view, the patch

- adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered to, during first phase of ISel)
- adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter,
  on matching the above node.
- Adds a custom inserter function that expands the pseudo instruction into MIR suitable
   to be (by later passes) into a WLSTP loop.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D99723
2021-05-06 23:21:28 +01:00
Malhar Jajoo fc690777fc Revert "[ARM] Transforming memcpy to Tail predicated Loop"
Reverting commit since it causes failure (10462).
This reverts commit b856f4a232.
2021-05-06 12:39:08 +01:00
Malhar Jajoo b856f4a232 [ARM] Transforming memcpy to Tail predicated Loop
This patch converts llvm.memcpy intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

From an implementation point of view, the patch

- adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered to, during first phase of ISel)
- adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter,
  on matching the above node.
- Adds a custom inserter function that expands the pseudo instruction into MIR suitable
   to be (by later passes) into a WLSTP loop.

Note: A cli option is used to control the conversion of memcpy to TP
loop and this option is currently disabled by default. It may be enabled
in the future after further downstream testing.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D99723
2021-05-06 09:34:09 +01:00
David Green e474499402 [ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops
If an instruction will be lowered to a call there is no advantage of
using a low overhead loop as the LR register will need to be spilled and
reloaded around the call, and the low overhead will end up being
reverted. This teaches our hardware loop lowering that these memory
intrinsics will be calls under certain situations.

Differential Revision: https://reviews.llvm.org/D90439
2020-11-03 11:53:09 +00:00
David Green 785080e3fa [ARM] Low overhead loop memcpy lowering test. NFC 2020-11-03 11:44:50 +00:00