Commit Graph

17 Commits

Author SHA1 Message Date
Pierre-vh 24bf8063d6 [Target][ARM] Replace outdated getARMVPTBlockMask function
getARMVPTBlockMask was an outdated function that only handled basic
block masks: T, TT, TTT and TTTT. This worked fine before the MVE
VPT Block Insertion Pass improvements as it was the only kind of
masks that it could generate, but now it can generate more complex
masks that uses E predicates, so it's dangerous to use that function
to calculate VPT/VPST block masks.

I replaced it with 2 different functions:
  - expandPredBlockMask, in ARMBaseInfo. This adds an "E" or "T" at
    the end of an existing PredBlockMask.
  - recomputeVPTBlockMask, in Thumb2InstrInfo. This takes an iterator
    to a VPT/VPST instruction and recomputes its block mask by looking
    at the predicated instructions that follows it. This should be
    used to recompute a block mask after removing/adding a predicated
    instruction to the block.

The expandPredBlockMask function is pretty much imported from the MVE
VPT Blocks pass.

I had to change the ARMLowOverheadLoops and MVEVPTBlocks passes as well
so they could use these new functions.

Differential Revision: https://reviews.llvm.org/D78201
2020-05-12 12:10:15 +01:00
Pierre-vh 13eb890139 [Target][ARM] Fix VPT Block Pass miscompilation
The pass was incorrectly reverting back to a "T" when something wrote
to VPR inside a "E" block. This is not the correct behaviour, the
predicate should stay the same.

Differential Revision: https://reviews.llvm.org/D77798
2020-04-14 15:16:27 +01:00
Matt Arsenault 6011627f51 CodeGen: More conversions to use Register 2020-04-07 18:54:36 -04:00
Benjamin Kramer b605c56b0f [ARM] Silence warning in Release builds
llvm/lib/Target/ARM/MVEVPTBlockPass.cpp:175:37: error: unused variable 'BlockBeg' [-Werror,-Wunused-variable]
  MachineBasicBlock::instr_iterator BlockBeg = Iter;
                                    ^
2020-04-01 15:29:19 +02:00
Pierre-vh 2effe8f5e7 [Target][ARM] Improvements to the VPT Block Insertion Pass
This allows the MVE VPT Block insertion pass to remove VPNOTs in
order to create more complex VPT blocks such as TE, TEET, TETE, etc.

Differential Revision: https://reviews.llvm.org/D75993
2020-04-01 12:34:20 +01:00
David Green d50e188a07 Revert "[ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS"
This reverts commit e34801c8e6 and the followup due to multiple
problems.

I've tried to keep the tests and RDA parts where possible, as those
still seem useful.
2020-02-02 13:24:05 +00:00
Sjoerd Meijer a08c0adee0 [ARM][MVE] VTP Block Pass fix
Fix a missing and broken test: 2 VPT blocks predicated on the same VCMP
instruction that can be folded. The problem was that for each VPT block, we
record the predicate statements with a list, but the same instruction was added
twice. Thus, we were running in an assert trying to remove the same instruction
twice. To avoid this the instructions are now recorded with a set.

Differential Revision: https://reviews.llvm.org/D72699
2020-01-14 16:10:55 +00:00
Sjoerd Meijer e34801c8e6 [ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS
This is a recommit of D71330, but with a few things fixed and changed:

1) ReachingDefAnalysis: this was not running with optnone as it was checking
skipFunction(), which other analysis passes don't do. I guess this is a
copy-paste from a codegen pass.
2) VPTBlockPass: here I've added skipFunction(), because like most/all
optimisations, we don't want to run this with optnone.

This fixes the issues with the initial/previous commit: the VPTBlockPass was
running with optnone, but ReachingDefAnalysis wasn't, and so VPTBlockPass was
crashing querying ReachingDefAnalysis.

I've added test case mve-vpt-block-optnone.mir to check that we don't run
VPTBlock with optnone.

Differential Revision: https://reviews.llvm.org/D71470
2020-01-07 13:54:47 +00:00
Sam Parker 4042518335 [ARM][MVE] Tail predicate in the presence of vcmp
Record the discovered VPT blocks while checking for validity and, for
now, only handle blocks that begin with VPST and not VPT. We're now
allowing more than one instruction to define vpr, but each block must
somehow be predicated using the vctp. This leaves us with several
scenarios which need fixing up:
1) A VPT block with is only predicated by the vctp and has no
   internal vpr defs.
2) A VPT block which is only predicated by the vctp but has an
   internal vpr def.
3) A VPT block which is predicated upon the vctp as well as another
   vpr def.
4) A VPT block which is not predicated upon a vctp, but contains it
   and all instructions within the block are predicated upon in.

The changes needed are, for:
1) The easy one, just remove the vpst and unpredicate the
   instructions in the block.
2) Remove the vpst and unpredicate the instructions up to the
   internal vpr def. Need insert a new vpst to predicate the
   remaining instructions.
3) No nothing.
4) The vctp will be inside a vpt and the instruction will be removed,
   so adjust the size of the mask on the vpst.

Differential Revision: https://reviews.llvm.org/D71107
2019-12-20 08:42:11 +00:00
Sjoerd Meijer 049f9672d8 [ARM] Move MVE opcode helper functions to ARMBaseInstrInfo. NFC.
In ARMLowOverheadLoops.cpp, MVETailPredication.cpp, and MVEVPTBlock.cpp we have
quite a few helper functions all looking at the opcodes of MVE instructions.
This moves all these utility functions to ARMBaseInstrInfo.

Diferential Revision: https://reviews.llvm.org/D71426
2019-12-16 09:13:59 +00:00
Sjoerd Meijer e91420e17d Revert "[ARM][MVE] findVCMPToFoldIntoVPS. NFC."
This reverts commit 9468e3334b.

There's a test that doesn't like this change. The RDA analysis
gets invalided by changes in the block, which is not taken into
account. Revert while I work on a fix for this.
2019-12-13 11:56:44 +00:00
Sjoerd Meijer 9468e3334b [ARM][MVE] findVCMPToFoldIntoVPS. NFC.
This adds ReachingDefAnalysis (RDA) to the VPTBlock pass, so that we can
reimplement findVCMPToFoldIntoVPS with just a few calls to RDA.

Differential Revision: https://reviews.llvm.org/D71330
2019-12-12 15:41:20 +00:00
Reid Kleckner 904cd3e06b Prune a LegacyDivergenceAnalysis and MachineLoopInfo include each
Now X86ISelLowering doesn't depend on many IR analyses.

llvm-svn: 375320
2019-10-19 01:31:09 +00:00
Benjamin Kramer df4b9a3f4f Hide implementation details in namespaces.
llvm-svn: 372113
2019-09-17 12:56:29 +00:00
David Green ce7328cb61 [ARM] Fold VCMP into VPT
MVE has VPT instructions, which perform the duties of both a VCMP and a VPST in
a single instruction, performing the compare and starting the VPT block in one.
This teaches the MVEVPTBlockPass to fold them, searching back through the
basicblock for a valid VCMP and creating the VPT from its operands.

There are some changes to the VPT instructions to accommodate this, altering
the order of the operands to match the VCMP better, and changing P0 register
defs to be VPR defs, as is used in other places.

Differential Revision: https://reviews.llvm.org/D66577

llvm-svn: 371982
2019-09-16 13:02:41 +00:00
David Green 83a3341246 [ARM] Fixup the creation of VPT blocks
This attempts to just fix the creation of VPT blocks, fixing up the iterating,
which instructions are considered in the bundle, and making sure that we do not
overrun the end of the block.

Differential Revision: https://reviews.llvm.org/D67219

llvm-svn: 371064
2019-09-05 13:37:04 +00:00
David Green 379f6186dd [ARM] Move MVEVPTBlockPass to a separate file. NFC
This just pulls the MVEVPTBlockPass into a separate file, as opposed to being
wrapped up in Thumb2ITBlockPass.

Differential revision: https://reviews.llvm.org/D66579

llvm-svn: 370187
2019-08-28 11:37:31 +00:00