Commit Graph

102 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can return if a misaligned access is 'fast' as defined
by the target or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.

A target can still define it as it wants and the direct translation
of the current code uses 0 and 1 for current false and true. This
makes the change an NFC.

Subsequent patch will start using an actual value of speed in
the load/store vectorizer to compare if a vectorized access going
to be not just fast, but not slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
Kazushi (Jam) Marukawa 33dda45dde [VE] Change the way to lower selectcc
Change to use VEISD::CMPI/CMPU/CMPF/CMPQ and VEISD::CMOV in combineSelectCC
for better optimization.  Support VEISD::CMPI/CMPU in combineTRUNCATE also
to optimize truncate.  Remove obsolete lower patterns from VEInstrInfo.td.
Update regression tests also.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D136049
2022-10-20 08:08:59 +09:00
Kazushi (Jam) Marukawa 0278c9ceb6 [VE] Change the way to lower select
Change to use VEISD::CMOV in combineSelect for better optimization.
Support VEISD::CMOV in combineTRUNCATE also to optimize trancate.
Merge functions to handle condition codes to VE.h.  And add basic
CMOV patterns to VEInstrInfo.td.  Update regression tests also.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D135878
2022-10-15 08:49:36 +09:00
Kazushi (Jam) Marukawa de8013201f [VE] Change to expand FPOW
VE doesn't have FPOW instruction, so this patch makes llvm expand it.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D134695
2022-09-27 20:03:10 +09:00
Kazushi (Jam) Marukawa 76c76e9ab4 [VE] Support smax/smin
Support smax/smin in VEInstrInfo.td.  Remove obsolete patterns for
smax/smin.  Add regression tests for smax/smin/umax/umin.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D134583
2022-09-26 22:02:57 +09:00
Kazushi (Jam) Marukawa 337e54ec95 [VE] Add maxnum and minnum
Add maxnum and minnum for float and double.  Lowering is already
implemented, so this patch changes them legal and adds regression
tests.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D134108
2022-09-21 18:03:49 +09:00
Kazushi (Jam) Marukawa 3ee64ea5cf [VE] Change to expand FMA
VE has fused multiply-add instruction for only vector calculations.  This
patch forces to expand scalar FMA to multiply and add instructions.
This patch also adds regression test.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D134107
2022-09-21 18:02:55 +09:00
Sergei Barannikov c6acb4eb0f [SDAG] Add `getCALLSEQ_END` overload taking `uint64_t`s
All in-tree targets pass pointer-sized ConstantSDNodes to the
method. This overload reduced amount of boilerplate code a bit.  This
also makes getCALLSEQ_END consistent with getCALLSEQ_START, which
already takes uint64_ts.
2022-09-15 14:02:12 -04:00
Kazu Hirata 2833760c57 [Target] Qualify auto in range-based for loops (NFC) 2022-08-28 17:35:09 -07:00
Eli Friedman cfd2c5ce58 Untangle the mess which is MachineBasicBlock::hasAddressTaken().
There are two different senses in which a block can be "address-taken".
There can be a BlockAddress involved, which means we need to map the
IR-level value to some specific block of machine code.  Or there can be
constructs inside a function which involve using the address of a basic
block to implement certain kinds of control flow.

Mixing these together causes a problem: if target-specific passes are
marking random blocks "address-taken", if we have a BlockAddress, we
can't actually tell which MachineBasicBlock corresponds to the
BlockAddress.

So split this into two separate bits: one for BlockAddress, and one for
the machine-specific bits.

Discovered while trying to sort out related stuff on D102817.

Differential Revision: https://reviews.llvm.org/D124697
2022-08-16 16:15:44 -07:00
Fangrui Song de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Kazushi (Jam) Marukawa de690a6438 [VE][NFC] Correct comment 2022-07-01 19:24:57 +09:00
Kazushi (Jam) Marukawa adbb46ea65 [VE] Support load/store vm regsiters
Support load/store vm registers to memory location as a first step.
As a next step, support load/store vm registers to stack location.
This patch also adds several regression tests for not only load/store
vm registers but also missing load/store for vr registers.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D128610
2022-07-01 08:25:24 +09:00
Simon Moll 6ac3d8ef9c [VE] strided v256.23 isel and tests
ISel for experimental.vp.strided.load|store for v256.32 types via
lowering to vvp_load|store SDNodes.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D121616
2022-03-15 15:29:19 +01:00
Simon Moll f318d1e26b [VE] v256i32|64 reduction isel and tests
and|add|or|xor|smax v256i32|64 isel and tests for vp and vector.reduce
intrinsics

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D121469
2022-03-14 11:10:38 +01:00
Simon Moll 9ebaec461a [VE] (masked) load|store v256.32|64 isel
Add `vvp_load|store` nodes. Lower to `vld`, `vst` where possible. Use
`vgt` for masked loads for now.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D120413
2022-03-02 13:31:29 +01:00
Simon Moll 606320ed30 [VE][NFC] Move functions to VVP module
Separate vector isel functions to the module they belong to. Keep scalar
stuff and calls into vector isel in the VEISelLowering.
2022-02-23 10:44:56 +01:00
Simon Moll 4fd77129f2 [VE] Split unsupported v512.32 ops
Split v512.32 binary ops into two v256.32 ops using packing support
opcodes (vec_unpack_lo|hi, vec_pack).

Depends on D120053 for packing opcodes.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D120146
2022-02-22 14:29:41 +01:00
Simon Moll cf964eb5bd [VE] v512i1 mask arithmetic isel
Packed vector and mask registers (v512) are composed of two v256
subregisters that occupy the even and odd element positions.  We add
packing support SDNodes (vec_unpack_lo|hi and vec_pack) and splitting of
v512i1 mask arithmetic ops with those.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D120053
2022-02-21 10:38:11 +01:00
Simon Moll de42307e44 [VE] Fix breakage after D118981
VE backend code expected all VP SDNode to have a mask parameter.  This
is not the case with vp.select|merge after D118981.
2022-02-15 18:56:20 +01:00
Simon Moll 53efbc15cb [VE] v256i1 broadcast isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D119241
2022-02-15 12:40:51 +01:00
Simon Moll ae1bb44ed8 [VE] v256.32|64 setcc isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D119223
2022-02-08 13:20:55 +01:00
Simon Moll 7d926b7177 [VE] LEGALAVL and staged VVP legalization
The new LEGALAVL node annotates that the AVL refers to packs of 64bit.
We use a two-stage lowering approach with LEGALAVL:

First, standard SDNodes are translated into illegal VVP layer nodes.
Regardless of source (VP or standard), all VVP nodes have a mask and AVL
parameter. The AVL parameter refers to the element position (just as in
VP intrinsics).

Second, we legalize the AVL usage in VVP layer nodes. If the element
size is < 64bit, the EVL parameter has to be adjusted to refer to packs
of 64bits.  We wrap the legalized AVL in a LEGALAVL node to track this.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D118321
2022-02-02 09:11:41 +01:00
Simon Moll 5ceb0bc7ea [VE] Packed 32/64bit broadcast isel and tests
Packed-mode broadcast of f32/i32 requires the subregister to be
replicated to the full I64 register prior. Add repl_i32 and repl_f32 to
faciliate this.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D117878
2022-01-26 14:16:06 +01:00
Simon Moll 7950010e49 [VE][NFC] Factor out helper functions
Factor out some helper functions to cleanup VEISelLowering.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D117683
2022-01-21 09:15:59 +01:00
Jim Lin d6b0734837 [NFC] Use Register instead of unsigned 2022-01-19 20:17:04 +08:00
Simon Moll 1b09d0c42b [VE] VECustomDAG builder class
VECustomDAG's functions simplify emitting VE custom ISD nodes. The class
is just a stub now. We add more functions, in particular for the
VP->VVP->VE lowering, to VECustomDAG as we build up vector isel.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D116103
2022-01-18 12:08:07 +01:00
Simon Moll 95bf5ac8a8 [VE] select|vp.merge|vp.select v256 isel and tests
Use the `VMRG` for all three operations for now. `vp_select` will be
used in passthru patterns.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D117206
2022-01-17 15:58:54 +01:00
Kazu Hirata 2aed08131d [llvm] Use true/false instead of 1/0 (NFC)
Identified with modernize-use-bool-literals.
2022-01-07 00:39:14 -08:00
Simon Moll 676af1272b [VE] SHL,SRA,SRL v256i32|64 isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D115734
2021-12-15 11:32:18 +01:00
Simon Moll 7cf887b950 [VE] Fix SDNode user loop after efa896e5f7
Rewriting SDNode user loops broke VEISelLowering (commit efa896e5f7).
This fixes it.
2021-11-15 09:53:09 +01:00
Kazu Hirata efa896e5f7 [Target] Use SDNode::uses (NFC) 2021-11-12 21:23:04 -08:00
Simon Moll 2b626aba44 [VE][NFC] IRBuilder<> -> IRBuilderBase
VE's TTI broke with the switch from IRBuilder<> to IRBuilderBase.
Following that change to compile again.
2021-06-08 13:55:49 +02:00
Fangrui Song dc6a5e070d [VE] Fix allowsMisalignedMemoryAccesses after D96097 2021-02-04 20:46:18 -08:00
Kazu Hirata 177b8d1ad3 [VE] Fix compiler warnings (NFC) 2021-01-31 10:23:39 -08:00
Simon Moll 611d3c63f3 [VP] ISD helper functions [VE] isel for vp_add, vp_and
This implements vp_add, vp_and for the VE target by lowering them to the
VVP_* layer. We also add helper functions for VP SDNodes (isVPSDNode,
getVPMaskIdx, getVPExplicitVectorLengthIdx).

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D93766
2021-01-08 14:29:45 +01:00
Simon Moll eeba70a463 [VE] Expand single-element BUILD_VECTOR to INSERT_VECTOR_ELT
We do this mostly to be able to test the insert_vector_elt isel
patterns. As long as we don't, most single element insertions show up as
`BUILD_VECTOR` in the backend.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D93759
2021-01-08 11:48:01 +01:00
Simon Moll d1b606f897 [VE] Extract & insert vector element isel
Isel and tests for extract_vector_elt and insert_vector_elt.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D93687
2021-01-08 11:46:59 +01:00
Kazushi (Jam) Marukawa f784be0777 [VE] Support SJLJ exception related instructions
Support EH_SJLJ_LONGJMP, EH_SJLJ_SETJMP, and EH_SJLJ_SETUP_DISPATCH
for SjLj exception handling.  NC++ uses SjLj exception handling, so
implement it first.  Add regression tests also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D94071
2021-01-05 20:19:15 +09:00
Kazushi (Jam) Marukawa 53a341a61d [VE][NFC] Fix typo in comments 2021-01-05 18:55:28 +09:00
Kazushi (Jam) Marukawa 2654f33c47 [VE] Support llvm.eh.sjlj.lsda
In order to support SJLJ exception, implement llvm.eh.sjlj.lsda first.
Add regression test also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D93811
2021-01-05 18:06:14 +09:00
Kazushi (Jam) Marukawa 74e7cb26b9 [VE] Remove VA.needsCustom checks
Remove VA.needsCustom checks which are copied from Sparc implementation
at the very beginning of VE implementation.  Add assert to sanity-check
VA.needsCustom flag, also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D93847
2021-01-04 18:19:18 +09:00
Kazushi (Jam) Marukawa 5e273b845b [VE] Support STACKSAVE and STACKRESTORE
Change to use default expanded code.  Add regression tests also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D93539
2020-12-21 20:15:50 +09:00
Kazushi (Jam) Marukawa d99e4a4840 [VE] Support RETURNADDR
Implement RETURNADDR for VE.  Add a regression test also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D93545
2020-12-21 20:06:03 +09:00
Kazushi (Jam) Marukawa 697226550e [VE] Support FRAMEADDR
Implement FRAMEADDR for VE.  Add a regression test also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D93295
2020-12-15 23:31:19 +09:00
Kazushi (Jam) Marukawa 2a2268a6db [VE][NFC] Sort VEISD operations
Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D93294
2020-12-15 23:29:16 +09:00
Kazushi (Jam) Marukawa a2eb07aa55 [VE] Support atomic exchange instructions
Support atomic exchange and atomic compare and exchange instructions.
Change CAS and TS1AM instructions for ISel patterns.  Add selectADDRzi
pattern for them.  Add TS1AM pseudo instruction also for better ISel.
Add shouldExpandAtomicRMWInIR() function to expand all atomicrmw
instructions except atomicrmw xchg.  Add custom lower for i8/i16
atomicrmw xchg.  Modify replaceFI to support CAS/TS1AM instructions
which use "reg+disp" operands instead of "reg+imm+disp" operands.
And, add several regression tests to check the correctness.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D93161
2020-12-15 17:43:11 +09:00
Kazushi (Jam) Marukawa c9213e1b29 [VE] Correct addRegisterClass calls
Correct addRegisterClass calls for vector mask registers.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D93212
2020-12-15 01:16:56 +09:00
Kazushi (Jam) Marukawa 6834b3d6d5 [VE] Optimize prologue/epilogue instructions about GOT
Optimize prologue/epilogue instructions if a given function use GOT but
do not call other functions by eliminating FP.  Previously, we had wrong
implementations taken from other architectures.  Update regression tests
also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92313
2020-12-01 02:22:31 +09:00
Simon Moll b955c7e630 [VE] VE Vector Predicated SDNode, vector add isel and tests
VE Vector Predicated (VVP) SDNodes form an intermediate layer between VE
vector instructions and the initial SDNodes.

We introduce 'vvp_add' with isel and tests as the first of these VVP
nodes. VVP nodes have a mask and explicit vector length operand, which
we will make proper use of later.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D91802
2020-11-23 17:17:07 +01:00