4 Commits

Author SHA1 Message Date
Qiu Chaofan a08fc1361a [PowerPC] Change VSRpRC allocation order
On PowerPC, VSRpRC represents pairs of even and odd VSX registers,
and VRRC corresponds to the upper 32 VSX registers. In some cases, extra
copies are produced when handling incoming VRRC arguments with VSRpRC.

This patch changes the allocation order of VSRpRC to eliminate this kind of
copy.

Stack frame sizes may increase when non-volatile registers are allocated, and
some other vector copies may appear. These need to be fixed in future changes.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D104855
2021-06-25 16:04:41 +08:00
Fangrui Song 88cadb894c [PowerPC][test] Add explicit dso_local to definitions in ELF static relocation model tests
TargetMachine::shouldAssumeDSOLocal currently implies dso_local for such definitions.

Adding explicit dso_local makes these tests align with the clang -fpic behavior
and allows the removal of the TargetMachine::shouldAssumeDSOLocal special case.

Rewrite preemption.ll to dsolocal-static.ll and dsolocal-pic.ll, and add
"PIC Level" metadata.
2020-12-30 10:32:34 -08:00
Baptiste Saleil 18db29ea6f [PowerPC] Add peephole to remove redundant accumulator prime/unprime instructions
In some situations, the compiler may insert an accumulator prime instruction and
an accumulator unprime instruction with no use of that accumulator between the two.
This happens, for example, when we store an accumulator after assembling or
restoring it. This patch adds a peephole to remove these prime and unprime instructions.
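
As a rough illustration of the idea (not the actual LLVM pass), the peephole can be
modeled on a flat instruction list. The Instr type, the opcode names, and the function
name below are hypothetical and exist only for this sketch; the real pass works on
machine instructions and register classes.

```python
from typing import List, NamedTuple


class Instr(NamedTuple):
    op: str   # "prime", "unprime", or any other (hypothetical) opcode name
    acc: int  # index of the accumulator the instruction touches


def remove_redundant_prime_unprime(code: List[Instr]) -> List[Instr]:
    """Drop prime/unprime pairs on one accumulator with no use in between."""
    dead = set()
    for i, ins in enumerate(code):
        if ins.op != "prime":
            continue
        # Find the next instruction that touches the same accumulator.
        for j in range(i + 1, len(code)):
            if code[j].acc != ins.acc:
                continue  # unrelated accumulator, keep scanning
            if code[j].op == "unprime":
                dead.update((i, j))  # nothing used the primed value
            break  # stop at the first instruction on this accumulator
    return [ins for k, ins in enumerate(code) if k not in dead]
```

In this toy model, a prime directly followed by an unprime of the same accumulator is
removed, while a prime whose accumulator is used before the unprime is kept.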

Differential Revision: https://reviews.llvm.org/D91386
2020-11-18 15:01:07 -06:00
Baptiste Saleil 0156914275 [PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types
This patch legalizes the v256i1 and v512i1 types that will be used for MMA.

It implements loads and stores of these types.
v256i1 is a pair of VSX registers, so for this type, we load/store the two
underlying registers. v512i1 is used for MMA accumulators. So in addition to
loading and storing the 4 associated VSX registers, we generate instructions to
prime (copy the VSX registers to the accumulator) after loading and unprime
(copy the accumulator back to the VSX registers) before storing.
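
The ordering described above (load then prime, unprime then store) can be sketched
with a toy register model. This is only an illustration: the ToyMachine class, the
mapping of accumulator a to VSX registers 4a..4a+3, and the memory layout are
assumptions of the sketch, and endianness is ignored.

```python
VSR_BYTES = 16  # one VSX register holds 128 bits


class ToyMachine:
    """Hypothetical model of VSX registers and MMA accumulators."""

    def __init__(self) -> None:
        self.vsr = [bytes(VSR_BYTES)] * 64
        self.acc = [None] * 8  # None means the accumulator is unprimed

    def prime(self, a: int) -> None:
        # Copy the 4 associated VSX registers into the accumulator.
        self.acc[a] = b"".join(self.vsr[4 * a:4 * a + 4])

    def unprime(self, a: int) -> None:
        # Copy the accumulator back to its 4 VSX registers.
        data = self.acc[a]
        assert data is not None, "accumulator must be primed first"
        for k in range(4):
            self.vsr[4 * a + k] = data[VSR_BYTES * k:VSR_BYTES * (k + 1)]
        self.acc[a] = None

    def load_v512(self, mem: bytearray, addr: int, a: int) -> None:
        # Load the 4 VSX registers, then prime.
        for k in range(4):
            lo = addr + VSR_BYTES * k
            self.vsr[4 * a + k] = bytes(mem[lo:lo + VSR_BYTES])
        self.prime(a)

    def store_v512(self, mem: bytearray, addr: int, a: int) -> None:
        # Unprime, then store the 4 VSX registers.
        self.unprime(a)
        for k in range(4):
            lo = addr + VSR_BYTES * k
            mem[lo:lo + VSR_BYTES] = self.vsr[4 * a + k]
```

A round trip through load_v512 and store_v512 copies 64 bytes from one address to
another and leaves the accumulator unprimed in this model.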

This patch also adds the UACC register class that is necessary to implement the
loads and stores. This class represents accumulators in their unprimed form and
allows the distinction between primed and unprimed accumulators to avoid invalid
copies of the VSX registers associated with primed accumulators.

Differential Revision: https://reviews.llvm.org/D84968
2020-09-28 14:39:37 -05:00