With regards to overrunning, the langref (llvm/docs/LangRef.rst)
specifies:
(llvm.experimental.vector.insert)
Elements ``idx`` through (``idx`` + num_elements(``subvec``) - 1)
must be valid ``vec`` indices. If this condition cannot be determined
statically but is false at runtime, then the result vector is
undefined.
(llvm.experimental.vector.extract)
Elements ``idx`` through (``idx`` + num_elements(result_type) - 1)
must be valid vector indices. If this condition cannot be determined
statically but is false at runtime, then the result vector is
undefined.
For the non-mixed cases (e.g. inserting/extracting a scalable into/from
another scalable, or inserting/extracting a fixed into/from another
fixed), it is possible to statically check whether or not the above
conditions are met. This was previously missing from the verifier, and
if the conditions were found to be false, the result of the
insertion/extraction would be replaced with an undef.
With regards to invalid indices, the langref (llvm/docs/LangRef.rst)
specifies:
(llvm.experimental.vector.insert)
``idx`` represents the starting element number at which ``subvec``
will be inserted. ``idx`` must be a constant multiple of
``subvec``'s known minimum vector length.
(llvm.experimental.vector.extract)
The ``idx`` specifies the starting element number within ``vec``
from which a subvector is extracted. ``idx`` must be a constant
multiple of the known-minimum vector length of the result type.
Similarly, these conditions were not previously enforced in the
verifier. In some circumstances, invalid indices were permitted
silently, and in other circumstances, an undef was spawned where a
verifier error would have been preferred.
This commit adds verifier checks to enforce the constraints above.
Differential Revision: https://reviews.llvm.org/D104468
In most of cases, it has a single space after comma in assembly operands.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103790
Compares are considered a narrowing operation for register overlap.
I believe for LMUL<=1 they meet this exception to allow overlap
"The destination EEW is smaller than the source EEW and the overlap is in the
lowest-numbered part of the source register group"
Both the result and the sources will occupy a single register for
LMUL<=1 so the overlap would always be in the "lowest-numbered part".
Reviewed By: frasercrmck, HsiangKai
Differential Revision: https://reviews.llvm.org/D103336
This can help avoid needing a virtual register for the vsetvl output
when the AVL is X0. For other register AVLs it can shorter the live
range of the AVL register if it isn't needed later.
There's probably no advantage when AVL is a 5 bit immediate that
can use vsetivli. But do it anyway for consistency.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D103215
We aren't going to connect the result to anything so we might
as well avoid allocating a register.
Reviewed By: frasercrmck, HsiangKai
Differential Revision: https://reviews.llvm.org/D102031
My thought process is that if v2i64 is an LMUL=1 type then v2i32
should be an LMUL=1/2 type. We limit the fractional LMUL so that
SEW=64 clips to LMUL=1, SEW=32 clips to LMUL=1/2, etc. This
ensures there's always a fractional LMUL available to truncate a type.
This does reduce the number of vsetvlis in some cases.
Some tests increase vsetvlis because the best container type for a
mask type is dependent on the LMUL+SEW that the mask was produced
from, but you can't tell that from the type. I think this is
something we need to solve this in the machine IR when optimizing
vsetvlis.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D101215
When trying to clamp a constant index into a scalable vector we can
test if the index is less than the minimum number of elements in the
vector. If so, we can simply return the index because we know it is
guaranteed to fit inside the vector.
Differential Revision: https://reviews.llvm.org/D100639
RV32 is able to use the llvm.experimental.vector.insert intrinsics too.
This patch ensures they're tested.
Reviewed By: khchen, asb
Differential Revision: https://reviews.llvm.org/D99655
This patch addresses a few issues when dealing with scalable-vector
INSERT_SUBVECTOR and EXTRACT_SUBVECTOR nodes.
When legalizing in DAGTypeLegalizer::SplitVecRes_INSERT_SUBVECTOR, we
store the low and high halves to the stack separately. The offset for
the high half was calculated incorrectly.
Additionally, we can optimize this process when we can detect that the
subvector is contained entirely within the low/high split vector type.
While this optimization is valid on scalable vectors, when performing
the 'high' optimization, the subvector must also be a scalable vector.
Note that the 'low' optimization is still conservative: it may be
possible to insert v2i32 into the low half of a split nxv1i32/nxv1i32,
but we can't guarantee it. It is always possible to insert v2i32 into
nxv2i32 or v2i32 into nxv4i32+2 as we know vscale is at least 1.
Lastly, in SelectionDAG::isSplatValue, we early-exit on the extracted subvector value
type being a scalable vector, forgetting that we can also extract a
fixed-length vector from a scalable one.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D98495
This function isn't exercised in lit tests today today according to
the code coverage report. But will be after the tests in D97543 and
D97559.
Posting this patch to help a crash that Fraser hit.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D97582
Like with EXTRACT_SUBVECTOR, INSERT_SUBVECTOR poses a problem
for vector masks as RVV isn't able to slide mask types around. We choose
instead to bitcast to equivalently-sized i8 types where we can, else we
zero-extend, perform the operation, and truncate back down.
One test was left disabled due to a crash in the legalizer.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97559
This patch fixes a bug where the lowering for INSERT_SUBVECTOR and
EXTRACT_SUBVECTOR would insist on first extracting a register-aligned
LMUL1 vector type before perfoming the slide up/down. This was even if
the vector was a fractional LMUL type, in which case the aligned
EXTRACT_SUBVECTOR was invalid.
This issue only occurred for scalable vector types, but a variety of
tests for both scalable and fixed-length vectors have been added to
ensure this does not regress in the future.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97556
This patch unifies the two disparate paths for lowering INSERT_SUBVECTOR
operations under one roof. Consequently, with this patch it is possible to
support any fixed-length subvector insertion, not just "cast-like" ones.
As before, support for the insertion of mask vectors will come in a
separate patch.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97543