Benchmarks have shown that when zero extending vectors (e.g. v2i16 -> v2i64),
it is better to do a single VPERM (vector permute) than multiple unpacks,
since that puts only one instruction on the critical path.
This patch achieves this by:
1. Expanding ZERO_EXTEND_VECTOR_INREG into a vector shuffle with a zero vector
instead of (multiple) unpacks.
2. Improving SystemZ::GeneralShuffle to perform a single unpack as the last
operation if Bytes matches it.
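As an illustration only (a simplified Python byte model, not the LLVM code),
the sketch below shows why one permute with a zero vector can replace a chain
of unpacks. It assumes a big-endian 16-byte vector register; the helper names
vperm, unpack_logical_high and zext_mask are hypothetical and exist only in
this example.

```python
def vperm(a, b, mask):
    """Byte permute: each mask entry picks one byte out of the 32 bytes a + b."""
    src = a + b
    return bytes(src[i & 31] for i in mask)


def unpack_logical_high(v, elem_size):
    """Zero extend the leftmost half of the elements to twice their width."""
    out = bytearray()
    for i in range(0, 8, elem_size):
        # Big-endian: the zero padding goes in front of the original bytes.
        out += bytes(elem_size) + v[i:i + elem_size]
    return bytes(out)


def zext_mask(src_size, dst_size, zero_index=16):
    """Permute mask that zero extends src_size-byte lanes to dst_size bytes.

    Indices 0..15 select from the source vector; zero_index selects a byte
    of the zero vector passed as the second permute operand.
    """
    mask = []
    for elem in range(16 // dst_size):
        mask += [zero_index] * (dst_size - src_size)
        mask += range(elem * src_size, (elem + 1) * src_size)
    return mask


# v8i16 input; only the first two lanes are extended to i64.
src = b"".join(x.to_bytes(2, "big") for x in [0x1234, 0xABCD, 3, 4, 5, 6, 7, 8])
zero = bytes(16)

# Two dependent unpacks (i16 -> i32 -> i64) versus one permute.
via_unpacks = unpack_logical_high(unpack_logical_high(src, 2), 4)
via_vperm = vperm(src, zero, zext_mask(2, 8))
```

In this model both paths produce the same bytes, but the unpack path needs two
dependent steps while the permute needs one. When the extension only doubles
the element width (e.g. i16 -> i32), the same mask describes exactly one
unpack, which corresponds to the case handled in item 2.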
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D78486
The type legalizer will scalarize vector conversions from integer to floating
point if the source element size is less than that of the result.
This is avoided now by inserting a zero/sign-extension of the source vector
before type legalization.
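A rough Python model of the idea, for illustration only: the lanes are widened
as a whole vector first, so the conversion itself can stay a vector operation
instead of being split into scalars. The function names are hypothetical, and
the choice of sign extension for signed conversions and zero extension for
unsigned ones is an assumption based on the usual semantics of such
conversions.

```python
def extend(value, src_bits, signed):
    """Read the low src_bits of value as a signed or unsigned integer."""
    value &= (1 << src_bits) - 1
    if signed and value >> (src_bits - 1):
        value -= 1 << src_bits  # sign extension of a negative lane
    return value


def convert_vector(raw_elems, src_bits, signed):
    """Widen every lane, then convert the widened vector lane by lane."""
    # Step 1: one vector-wide zero/sign extension of the source.
    widened = [extend(v, src_bits, signed) for v in raw_elems]
    # Step 2: one vector-wide integer to floating point conversion.
    return [float(v) for v in widened]


signed_result = convert_vector([0xFFFF, 0x0001], 16, signed=True)
unsigned_result = convert_vector([0xFFFF, 0x0001], 16, signed=False)
```

The same raw 16-bit lanes give different results depending on the extension
kind, which is why the extension inserted has to match the conversion.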
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D75978