Commit Graph

126 Commits

Author SHA1 Message Date
jacquesguan e60eb7053d recommit "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat."
With fix for AArch64 and Hexagon test cases.
2022-07-21 17:34:34 +08:00
Krzysztof Parzyszek 0278dee1e5 [Hexagon] Generate TargetConstant in SelectAnyInt
At some point in instruction selection, A2_tfrsi Constant:i32<...> was
created, where the "Constant" came from SelectAnyInt. Since it wasn't
a TargetConstant, it was selected again, leading to
  %vreg = A2_tfrsi ...
  ...   = A2_tfrsi %vreg
which is not valid code.
2022-04-22 10:36:37 -07:00
Arthur Eubanks 16823adf2a [test] Modify some tests to remove implicit -basic-aa in legacy PM RUN lines 2022-03-08 14:35:06 -08:00
Krzysztof Parzyszek 0792161c00 [Hexagon] Fix operation actions for v128f16
There were more cases of operations that should have been "Custom" for
v128f16, but ended up "Legal" (e.g. load and store).
2022-02-08 15:28:37 -08:00
Brendon Cahoon db5b791595 [Hexagon] Fix an instruction move in HexagonVectorCombine
The HexagonVectorCombine pass was moving an instruction
incorrectly, which caused a use in a GEP that was not yet
defined.

HexagonVectorCombine removes a load from a group due to its
dependences, but in realignGroup, the load is processed anyway.
In realignGroup, when determining the maximum alignment, only
those instructions still in the group should be considered.
2022-01-04 11:41:42 -08:00
Krzysztof Parzyszek 78f5014fea [Hexagon] Conversions to/from FP types, HVX and scalar
Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>
Co-authored-by: Sumanth Gundapaneni <sgundapa@quicinc.com>
2022-01-04 11:03:51 -08:00
Krzysztof Parzyszek db83e3e507 [Hexagon] Generate HVX/FP arithmetic instructions
Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>
Co-authored-by: Sumanth Gundapaneni <sgundapa@quicinc.com>
Co-authored-by: Joshua Herrera <joshherr@quicinc.com>
2021-12-30 12:47:30 -08:00
Krzysztof Parzyszek 9e6afbedb0 [Hexagon] Generate HVX/FP compare instructions
Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>
2021-12-30 12:17:22 -08:00
Krzysztof Parzyszek e107374e40 [Hexagon] Explicitly use integer types when rescaling a mask 2021-12-30 10:14:00 -08:00
Krzysztof Parzyszek eb574259b6 [Hexagon] Handle HVX/FP {masked,wide} loads/stores
Co-authored-by: Rahul Utkoor <quic_rutkoor@quicinc.com>
Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>
2021-12-30 10:14:00 -08:00
Krzysztof Parzyszek cd997689f2 [Hexagon] Fix isTypeForHVX to recognize floating point types
Co-authored-by: Sumanth Gundapaneni <sgundapa@quicinc.com>
2021-12-30 10:01:05 -08:00
Krzysztof Parzyszek 23423638cc [Hexagon] Handle HVX/FP shuffles, insertion and extraction
Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>
2021-12-30 08:44:10 -08:00
Krzysztof Parzyszek 95c7dd8810 Revert "[Hexagon] Don't build two halves of HVX vector in parallel"
This reverts commit ba07f300c6.

A build-vector sequence is made of pairs: rotate+insert. When constructing
a single vector, this results in a chain of 2*N instructions. The rotate
operation is a permute operation, but the insert uses a multiplication
resource: insert and rotate can execute in the same cycle, but obviously
they cannot operate on the same vector. The original halving idea is still
beneficial since it does allow for insert/rotate overlap, and for hiding
insert's latency.
2021-12-30 07:57:11 -08:00
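
As a rough illustration of the trade-off described in the revert above, the sketch below counts the instructions on the longest dependent chain for the two strategies. It is a simplified model, not the backend's cost logic: it assumes one rotate+insert pair per element, an even element count, and a single combining "or" for the halved form, and it ignores packet scheduling and any instruction needed to position the halves. The function names are hypothetical.

```python
def single_chain_length(n):
    """Dependent chain when all n elements go into one vector:
    n rotate+insert pairs, i.e. 2*n instructions."""
    return 2 * n


def halved_chain_length(n):
    """Longest dependent chain when two halves are built independently
    and then combined. Assumes n is even and that the combine costs
    exactly one extra instruction (the "or")."""
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even element count")
    per_half = 2 * (n // 2)  # rotate+insert pairs for one half
    return per_half + 1      # plus the combining "or"


if __name__ == "__main__":
    for n in (8, 32, 64):
        print(n, single_chain_length(n), halved_chain_length(n))
```

Under these assumptions the halved form roughly halves the chain length, which matches the message's point that the two independent chains leave room for insert/rotate overlap.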
Krzysztof Parzyszek ba07f300c6 [Hexagon] Don't build two halves of HVX vector in parallel
There can be only one permute operation per packet, so this actually
pessimizes the code (due to the extra "or").
2021-12-29 11:00:01 -08:00
Joshua Herrera 505d57486e [Hexagon] Improve BUILD_VECTOR codegen
For vectors with repeating values, the old codegen would rotate and insert
every duplicate element. This patch replaces that behavior with a splat
of the most common element; vinsert/vror occur only when needed.
2021-12-29 10:18:21 -08:00
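
The idea in the BUILD_VECTOR change above can be sketched on plain Python lists. This is only an illustration of "splat the most common value, then patch the lanes that differ"; it does not model the actual vinsert/vror sequence, and `plan_build_vector` and `apply_plan` are hypothetical names.

```python
from collections import Counter


def plan_build_vector(elems):
    """Return (splat_value, inserts) for a list of lane values.

    inserts holds (lane_index, value) pairs for the lanes that differ
    from the splat value; lanes equal to the splat need no extra work.
    """
    if not elems:
        raise ValueError("need at least one element")
    splat_value, _ = Counter(elems).most_common(1)[0]
    inserts = [(i, v) for i, v in enumerate(elems) if v != splat_value]
    return splat_value, inserts


def apply_plan(n, splat_value, inserts):
    """Rebuild the vector from the plan, to check that the plan is complete."""
    vec = [splat_value] * n
    for lane, value in inserts:
        vec[lane] = value
    return vec


if __name__ == "__main__":
    lanes = [7, 7, 3, 7, 9, 7]
    splat, inserts = plan_build_vector(lanes)
    print(splat, inserts)
    print(apply_plan(len(lanes), splat, inserts) == lanes)
```

For a vector with many repeated values, the number of patched lanes is the number of elements that differ from the most common one, rather than the full element count.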
Krzysztof Parzyszek 4df2aba294 [Hexagon] Calling conventions for floating point vectors
They are the same as for the other HVX vectors, but types need to be
listed explicitly. Also, add a detailed codegen testcase.

Co-authored-by: Abhikrant Sharma <quic_abhikran@quicinc.com>
2021-12-29 09:01:07 -08:00
Krzysztof Parzyszek 2ce586bc49 [Hexagon] Handle floating point splats
Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>
2021-12-29 06:52:24 -08:00
Krzysztof Parzyszek 33fc675e16 [Hexagon] Handle floating point vector loads/stores 2021-12-29 05:52:39 -08:00
Krzysztof Parzyszek 6a6ac3b36f [Hexagon] Support BUILD_VECTOR of floating point HVX vectors
Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>
Co-authored-by: Ankit Aggarwal <aankit@quicinc.com>
2021-12-28 14:59:08 -08:00
Jay Foad 9951d437d3 [Hexagon] Add machine verification to some tests 2021-11-02 15:41:30 +00:00
Brendon Cahoon 42dace9c5b [Hexagon] Use getTypeAllocSize to compute difference between objects
The code was using getTypeStoreSize to calculate the difference
between consecutive objects. The calculation was incorrect due
to padding that is added between consecutive objects. The
getTypeAllocSize includes the padding amount. For example,
if the type is [19 x i8], the difference between consecutive
objects is 32 bytes, not 19 bytes.

A second case for getTypeAllocSize is needed when computing
the pointer values for the vector accesses. The calculation needs
to account for the padding as well.

Differential Revision: https://reviews.llvm.org/D109403
2021-09-13 19:04:59 -05:00
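
The arithmetic behind the store-size versus alloc-size distinction can be sketched as follows. This is a toy model in which the padded size is the stored size rounded up to an alignment. The alignment of 32 is an assumption chosen only so that the numbers reproduce the 19-byte and 32-byte figures from the message, which does not state the alignment; the helper names are hypothetical.

```python
def align_to(size, align):
    """Round size up to the next multiple of align."""
    if align <= 0:
        raise ValueError("alignment must be positive")
    return (size + align - 1) // align * align


def object_offsets(count, stride):
    """Byte offsets of `count` consecutive objects placed `stride` bytes apart."""
    return [i * stride for i in range(count)]


STORE_SIZE = 19      # bytes actually written for [19 x i8]
ASSUMED_ALIGN = 32   # assumption, not taken from the commit message

if __name__ == "__main__":
    alloc_size = align_to(STORE_SIZE, ASSUMED_ALIGN)
    print(alloc_size)
    # Using the store size as the stride misplaces every object after the first.
    print(object_offsets(3, STORE_SIZE))
    print(object_offsets(3, alloc_size))
```

The two offset lists differ from the second object onward, which is the kind of miscalculation the message describes.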
Krzysztof Parzyszek 002f5e158d [Hexagon] Restore handling of expanding shuffles
Fixed bugs, added testcases.  The byte-unpack is actually recognized by
the DAG combiner, but the halfword-unpack is not.
2021-05-26 18:04:15 -05:00
Krzysztof Parzyszek e7c839b192 [Hexagon] Improve argument packing in vector shuffle selection 2021-05-25 12:48:14 -05:00
Krzysztof Parzyszek 561026936b [Hexagon] Propagate metadata in Hexagon Vector Combine 2021-05-08 14:35:55 -05:00
Krzysztof Parzyszek ab9521aaeb [Hexagon] Use 'vnot' instead of 'not' in patterns with vectors
'not' expands to checking for an xor with a -1 constant. Since
this looks for a ConstantSDNode it will never match for a vector.

Co-authored-by: Craig Topper <craig.topper@sifive.com>

Differential Revision: https://reviews.llvm.org/D100687
2021-04-22 15:36:20 -05:00
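
A toy matcher can illustrate why a pattern that looks for a scalar -1 constant never fires on a vector operand. The tuple-based node encoding below is hypothetical and stands in for the real DAG nodes only loosely: `("const", -1)` plays the role of a scalar constant node, and `("splat", -1)` plays the role of an all-ones vector.

```python
ALL_ONES = -1


def const(value):
    """Scalar constant node (stand-in for a scalar constant in the DAG)."""
    return ("const", value)


def splat(value):
    """Vector whose lanes all hold `value`."""
    return ("splat", value)


def xor(lhs, rhs):
    return ("xor", lhs, rhs)


def matches_scalar_not(node):
    """True for (xor x, <scalar constant -1>), the shape a scalar 'not' checks."""
    return node[0] == "xor" and len(node) == 3 and node[2] == const(ALL_ONES)


def matches_vector_not(node):
    """True for (xor x, <all-ones vector>), the shape a vector 'not' checks."""
    return node[0] == "xor" and len(node) == 3 and node[2] == splat(ALL_ONES)


if __name__ == "__main__":
    vector_not = xor(("reg", "v0"), splat(ALL_ONES))
    print(matches_scalar_not(vector_not), matches_vector_not(vector_not))
```

In this model the vector negation is only recognized by the matcher that looks for an all-ones vector, which mirrors the reason given above for switching the patterns to 'vnot'.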
Krzysztof Parzyszek deda60fcaf [Hexagon] Add HVX intrinsics for conditional vector loads/stores
Intrinsics for the following instructions are added. The intrinsic
name is "int_hexagon_<inst>[_128B]", e.g.
  int_hexagon_V6_vL32b_pred_ai        for 64-byte version
  int_hexagon_V6_vL32b_pred_ai_128B   for 128-byte version

V6_vL32b_pred_ai        if (Pv4) Vd32 = vmem(Rt32+#s4)
V6_vL32b_pred_pi        if (Pv4) Vd32 = vmem(Rx32++#s3)
V6_vL32b_pred_ppu       if (Pv4) Vd32 = vmem(Rx32++Mu2)
V6_vL32b_npred_ai       if (!Pv4) Vd32 = vmem(Rt32+#s4)
V6_vL32b_npred_pi       if (!Pv4) Vd32 = vmem(Rx32++#s3)
V6_vL32b_npred_ppu      if (!Pv4) Vd32 = vmem(Rx32++Mu2)

V6_vL32b_nt_pred_ai     if (Pv4) Vd32 = vmem(Rt32+#s4):nt
V6_vL32b_nt_pred_pi     if (Pv4) Vd32 = vmem(Rx32++#s3):nt
V6_vL32b_nt_pred_ppu    if (Pv4) Vd32 = vmem(Rx32++Mu2):nt
V6_vL32b_nt_npred_ai    if (!Pv4) Vd32 = vmem(Rt32+#s4):nt
V6_vL32b_nt_npred_pi    if (!Pv4) Vd32 = vmem(Rx32++#s3):nt
V6_vL32b_nt_npred_ppu   if (!Pv4) Vd32 = vmem(Rx32++Mu2):nt

V6_vS32b_pred_ai        if (Pv4) vmem(Rt32+#s4) = Vs32
V6_vS32b_pred_pi        if (Pv4) vmem(Rx32++#s3) = Vs32
V6_vS32b_pred_ppu       if (Pv4) vmem(Rx32++Mu2) = Vs32
V6_vS32b_npred_ai       if (!Pv4) vmem(Rt32+#s4) = Vs32
V6_vS32b_npred_pi       if (!Pv4) vmem(Rx32++#s3) = Vs32
V6_vS32b_npred_ppu      if (!Pv4) vmem(Rx32++Mu2) = Vs32

V6_vS32Ub_pred_ai       if (Pv4) vmemu(Rt32+#s4) = Vs32
V6_vS32Ub_pred_pi       if (Pv4) vmemu(Rx32++#s3) = Vs32
V6_vS32Ub_pred_ppu      if (Pv4) vmemu(Rx32++Mu2) = Vs32
V6_vS32Ub_npred_ai      if (!Pv4) vmemu(Rt32+#s4) = Vs32
V6_vS32Ub_npred_pi      if (!Pv4) vmemu(Rx32++#s3) = Vs32
V6_vS32Ub_npred_ppu     if (!Pv4) vmemu(Rx32++Mu2) = Vs32

V6_vS32b_nt_pred_ai     if (Pv4) vmem(Rt32+#s4):nt = Vs32
V6_vS32b_nt_pred_pi     if (Pv4) vmem(Rx32++#s3):nt = Vs32
V6_vS32b_nt_pred_ppu    if (Pv4) vmem(Rx32++Mu2):nt = Vs32
V6_vS32b_nt_npred_ai    if (!Pv4) vmem(Rt32+#s4):nt = Vs32
V6_vS32b_nt_npred_pi    if (!Pv4) vmem(Rx32++#s3):nt = Vs32
V6_vS32b_nt_npred_ppu   if (!Pv4) vmem(Rx32++Mu2):nt = Vs32
2021-04-22 11:49:29 -05:00
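
The naming scheme "int_hexagon_<inst>[_128B]" and the instruction list above follow a regular pattern, which the sketch below reproduces. The helper names are hypothetical; the instruction families, predicate forms, and addressing-mode suffixes are taken from the list in the message.

```python
from itertools import product

# Instruction families, predicate senses, and addressing modes as listed above.
FAMILIES = ["V6_vL32b", "V6_vL32b_nt", "V6_vS32b", "V6_vS32Ub", "V6_vS32b_nt"]
PREDICATES = ["pred", "npred"]
ADDR_MODES = ["ai", "pi", "ppu"]


def instruction_names():
    """All <family>_<pred>_<mode> combinations from the list above."""
    return [f"{fam}_{pred}_{mode}"
            for fam, pred, mode in product(FAMILIES, PREDICATES, ADDR_MODES)]


def intrinsic_name(inst, hvx_128b=False):
    """Build "int_hexagon_<inst>" with the optional "_128B" suffix."""
    return f"int_hexagon_{inst}" + ("_128B" if hvx_128b else "")


if __name__ == "__main__":
    for inst in instruction_names():
        print(intrinsic_name(inst), intrinsic_name(inst, hvx_128b=True))
```

Each listed instruction therefore yields two intrinsic names, one for the 64-byte version and one for the 128-byte version.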
Brendon Cahoon 57443bfb4a [Hexagon] Fix segment start to adjust for gaps between segments
The Hexagon Vector Combine pass generates stores for a complete
aligned vector. The start of each section is a multiple of the
vector size, so that value is passed to normalize to compute
the offset of the stores in the section.  The first store may
not occur at offset 0 when there is a gap between sections.
2021-01-19 12:49:39 -06:00
Krzysztof Parzyszek a90214760d [Hexagon] Custom-widen SETCC's operands
The result cannot be widened, unfortunately, because widening vNi1
would depend on the context in which it appears (i.e. the type alone
is not sufficient to tell if it needs to be widened).
2021-01-11 12:21:49 -06:00
Krzysztof Parzyszek 0f903015c7 [Hexagon] Rename test case, NFC 2020-12-15 19:05:31 -06:00
Krzysztof Parzyszek 16385643bb [Hexagon] Emit enough stores when aligning vector addresses 2020-12-15 18:59:53 -06:00
Krzysztof Parzyszek 2cf5310471 [Hexagon] Create vector masks for scalar loads/stores
AlignVectors treats all loaded/stored values as vectors of bytes,
and masks as corresponding vectors of booleans, so make getMask
produce a 1-element vector for scalars from the start.
2020-12-12 11:12:17 -06:00
Krzysztof Parzyszek f5d07a05bb [Hexagon] Realign HVX vectors wherever possible
Introduce HexagonVectorCombine as a helper class for vector-related
optimizations.
2020-12-09 17:11:25 -06:00
Krzysztof Parzyszek b7bde0e4f3 [Hexagon] Improve check for HVX types
Allow non-simple types, like <17 x i32> to be treated as HVX vector
types.
2020-11-27 13:33:10 -06:00
Simon Pilgrim c4628460b7 [Hexagon] Add HVX support for ISD::SMAX/SMIN/UMAX/UMIN instead of custom dag patterns
Followup to D92112 now that I've learnt about HVX type splitting.

This is some necessary cleanup work for min/max ops to eventually help us move the add/sub sat patterns into DAGCombine - D91876.

Differential Revision: https://reviews.llvm.org/D92169
2020-11-27 15:46:11 +00:00
Krzysztof Parzyszek db60e64036 [Hexagon] Handle additional shuffles that can be made perfect 2020-10-29 19:09:00 -05:00
Krzysztof Parzyszek 1b5baa42bc [Hexagon] Handle selection between HVX vector predicates
Make sure that (select i1 q0 q1) is handled properly.
2020-10-23 18:22:03 -05:00
Krzysztof Parzyszek 670cd3c6e3 [Hexagon] Generate better splat code on v62+ 2020-10-14 12:55:20 -05:00
Krzysztof Parzyszek f528816d58 [Hexagon] Move selection of HVX multiply from lowering to patterns
Also, change i32*i32 to V6_vmpyieoh + V6_vmpyiewuh_acc, which works
on V60 as well.
2020-10-02 16:04:34 -05:00
Krzysztof Parzyszek db04bec5f1 [SDAG] Do not convert undef to 0 when folding CONCAT/BUILD_VECTOR
Differential Revision: https://reviews.llvm.org/D88273
2020-09-29 09:12:26 -05:00
Krzysztof Parzyszek 3185839bcf [Hexagon] Avoid crash on CONCAT_VECTORS with illegal element types
Legal vector element types may not be legal as scalar types. When
CONCAT_VECTORS is converted to BUILD_VECTOR, the individual vector
elements become standalone operands to the build operation. If they
have illegal (scalar) types, they need to be made legal. In doing
so, the case of TRUNCATE was not handled, causing an assertion to
fail.
2020-09-24 20:05:23 -05:00
Krzysztof Parzyszek 5f4abb7fab [Hexagon] Replace incorrect pattern for vpackl HWI32 -> HVi8
V6_vdealb4w is not correct for pairs, use V6_vpackeh/V6_vpackeb instead.
2020-09-15 20:34:50 -05:00
Krzysztof Parzyszek f35617ad80 [Hexagon] Add more detailed testcase for widening truncates 2020-09-14 18:10:23 -05:00
Krzysztof Parzyszek bb877d1af2 [Hexagon] Widen loads and handle any-/sign-/zero-extensions 2020-09-14 18:10:23 -05:00
Krzysztof Parzyszek 9d300bc8d2 [Hexagon] Avoid widening vectors with non-HVX element types 2020-09-12 20:26:54 -05:00
Krzysztof Parzyszek 783e28a508 [Hexagon] Split pair-based masked memops 2020-09-10 14:24:42 -05:00
Krzysztof Parzyszek 0ee54cf883 [Hexagon] Account for truncating pairs to non-pairs when widening truncates
Added missing selection patterns for vpackl.
2020-09-09 14:31:52 -05:00
Krzysztof Parzyszek d183f47261 [Hexagon] Handle widening of truncation's operand with legal result
Failing example: v8i8 = truncate v8i32. v8i8 is legal, but v8i32 was
widened to HVX. Make sure that v8i8 does not get altered (even if it's
changed to another legal type).
2020-09-08 16:07:39 -05:00
Krzysztof Parzyszek 9518f032e4 [Hexagon] When widening truncate result, also widen operand if necessary 2020-09-05 18:19:32 -05:00
Krzysztof Parzyszek 8789f2bbde [Hexagon] Resize the mem operand when widening loads and stores 2020-09-05 18:17:48 -05:00
Krzysztof Parzyszek 1387f96ab3 [Hexagon] Handle widening of vector truncate 2020-09-05 15:07:38 -05:00