Commit Graph

178 Commits

Author SHA1 Message Date
Matt Arsenault a61c3455c0 AtomicExpand: Use llvm.ptrmask instead of ptrtoint
This removes the ptrtoint from the load's pointer operand, although we
can't entirely eliminate these to get the LSB shift. In a future
patch, this will avoid ptrtoint in the case where the atomic is
overaligned to the word size.
2022-09-28 12:51:30 -04:00
Kazushi (Jam) Marukawa de8013201f [VE] Change to expand FPOW
VE doesn't have FPOW instruction, so this patch makes llvm expand it.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D134695
2022-09-27 20:03:10 +09:00
Kazushi (Jam) Marukawa 1cef30b9d3 [VE] Disable automatic maxnum/minnum selection
Disable FMAX/FMIN selection from select_cc in VEInstrInfo.td because of
the lack of NaN consideration.  This patch removes such selection from
VEInstrInfo.td and lets llvm work on it in combineMinNumMaxNum.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D134595
2022-09-26 22:04:02 +09:00
Kazushi (Jam) Marukawa 76c76e9ab4 [VE] Support smax/smin
Support smax/smin in VEInstrInfo.td.  Remove obsolete patterns for
smax/smin.  Add regression tests for smax/smin/umax/umin.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D134583
2022-09-26 22:02:57 +09:00
Kazushi (Jam) Marukawa 337e54ec95 [VE] Add maxnum and minnum
Add maxnum and minnum for float and double.  Lowering is already
implemented, so this patch changes them legal and adds regression
tests.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D134108
2022-09-21 18:03:49 +09:00
Kazushi (Jam) Marukawa 3ee64ea5cf [VE] Change to expand FMA
VE has fused multiply-add instruction for only vector calculations.  This
patch forces to expand scalar FMA to multiply and add instructions.
This patch also adds regression test.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D134107
2022-09-21 18:02:55 +09:00
Matt Arsenault 230dbe0857 VE: Use generated checks for a copy-pasted output test 2022-09-20 16:51:04 -04:00
Craig Topper 1121eca685 [VP][VE] Default VP_SREM/UREM to Expand and add generic expansion using VP_SDIV/UDIV+VP_MUL+VP_SUB.
I want to default all VP operations to Expand. These 2 were blocking
because VE doesn't support them and the tests were expecting them
to fail a specific way. Using Expand caused them to fail differently.

Seemed better to emulate them using operations that are supported.

@simoll mentioned on Discord that VE has some expansion downstream. Not
sure if its done like this or in the VE target.

Reviewed By: frasercrmck, efocht

Differential Revision: https://reviews.llvm.org/D133514
2022-09-16 13:19:02 -07:00
Craig Topper 38ffa2bb96 [LegalizeTypes] Improve splitting for urem/udiv by constant for some constants.
For remainder:
If (1 << (Bitwidth / 2)) % Divisor == 1, we can add the high and low halves
together and use a (Bitwidth / 2) urem. If (BitWidth /2) is a legal integer
type, this urem will be expand by DAGCombiner using multiply by magic
constant. We do have to take into account that adding high and low
together can produce a carry, making it a (BitWidth / 2)+1 bit number.
So we need to also add back in the carry from the first addition.

For division:
We can use the above trick to compute the remainder, subtract that
remainder from the dividend, then multiply by the multiplicative
inverse of the Divisor modulo (1 << BitWidth).

This is based on the section "Remainder by Summing Digits" in
Hacker's delight.

The remainder trick is similar to a trick you may have learned for
determining if a decimal number is divisible by 3. You can add all the
digits together and see if the sum is divisible by 3. If you're not sure
if the sum is divisible by 3, you can add its digits together. This
can be repeated until you have a single decimal digit. If that digit
is 3, 6, or 9, then the original number is divisible by 3. This works
because 10 % 3 == 1.

gcc already does this same trick. There are additional tricks gcc
does urem as well as srem, udiv, and sdiv that I plan to add in
future patches.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130862
2022-09-12 10:34:52 -07:00
Alex Richardson 2616e00949 [update_llc_test_checks][VE] Handle .Lfoo$local in function regex
While working on https://reviews.llvm.org/D131429, I got a test diff in
one of the VE tests and running update_llc_test_checks.py deleted all the
code for that function. This updates the regex to handle this new output.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D131431
2022-08-24 14:16:20 +00:00
Kazushi (Jam) Marukawa b88aba9d7d [VE] Support inlineasm memory operand
Support inline asm memory operand for VE.  Add regression tests also.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D132380
2022-08-23 13:44:03 +09:00
Amaury Séchet 06da353748 [NFC] Automatically generate CodeGen/VE/Scalar/atomic.ll 2022-07-27 23:52:00 +00:00
Kazushi (Jam) Marukawa 469044cfd3 [VE] Support load/store/spill of vector mask registers
Support load/store/spill of vector mask registers and add regression
tests.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D129415
2022-07-19 10:29:21 +09:00
Kazushi (Jam) Marukawa da5a6b2bf5 [VE] Restructure eliminateFrameIndex
Restructure the current implementation of eliminateFrameIndex function
in order to support more instructions.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D129034
2022-07-05 20:00:19 +09:00
Kazushi (Jam) Marukawa 9ad38e5288 Revert "[VE] Restructure eliminateFrameIndex"
This reverts commit 98e52e8bff.
2022-07-05 19:35:12 +09:00
Kazushi (Jam) Marukawa 98e52e8bff [VE] Restructure eliminateFrameIndex
Restructure the current implementation of eliminateFrameIndex function
in order to support more instructions.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D129034
2022-07-05 19:28:11 +09:00
Kazushi (Jam) Marukawa adbb46ea65 [VE] Support load/store vm regsiters
Support load/store vm registers to memory location as a first step.
As a next step, support load/store vm registers to stack location.
This patch also adds several regression tests for not only load/store
vm registers but also missing load/store for vr registers.

Reviewed By: efocht

Differential Revision: https://reviews.llvm.org/D128610
2022-07-01 08:25:24 +09:00
Daniil Kovalev 62a983ebc5 Revert "[CodeGen] Place SDNode debug ID declaration under appropriate #if"
This reverts commit 83a798d4b0.

As discussed in D120714 with @thakis, the patch added unneeded complexity
without noticeable benefits.
2022-04-06 20:32:53 +03:00
Daniil Kovalev 83a798d4b0 [CodeGen] Place SDNode debug ID declaration under appropriate #if
Place PersistentId declaration under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to
reduce memory usage when it is not needed.

Differential Revision: https://reviews.llvm.org/D120714
2022-04-06 14:09:32 +03:00
Craig Topper 49c2206b3b [VP] Preserve address space of pointer for strided load/store intrinsics.
This adds LLVMAnyPointerToElt to use instead of LLVMPointerToElt.
This allows us to preserve the address space as part of the type
overload for the intrinsic, but still require the vector element
type to match the pointer type.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D122042
2022-03-22 09:52:54 -07:00
Jake Egan c7dc9dbaff [VE] Remove output to /dev/stdout
Sending output to /dev/stdout on AIX gets an llc permission denied error, so this patch removes this from the tests.

Reviewed By: simoll, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D121799
2022-03-16 11:42:09 -04:00
Simon Moll 91fad1167a [VE] v512|256 f32|64 fneg isel and tests
fneg instruction isel and tests. We do this also in preparation of fused
negatate-multiple-add fp operations.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D121620
2022-03-16 11:31:26 +01:00
Simon Moll 6ac3d8ef9c [VE] strided v256.23 isel and tests
ISel for experimental.vp.strided.load|store for v256.32 types via
lowering to vvp_load|store SDNodes.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D121616
2022-03-15 15:29:19 +01:00
Simon Moll 3297571e32 [VE] v256f32|64 fma isel
llvm.fma|fmuladd vp.fma isel and tests

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D121477
2022-03-14 15:59:13 +01:00
Kazushi (Jam) Marukawa 9260592141 [VE] Support more intrinsics
Support new intrinsics for following instrauctions.
  - VLDZ, VPCNT, VBRV
  - LCR, SCR, TSCR, FIDCR
  - FENCE
Also clean the intrinsics implementation of a following instruction.
  - SVOB

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D121509
2022-03-14 19:17:15 +09:00
Simon Moll f318d1e26b [VE] v256i32|64 reduction isel and tests
and|add|or|xor|smax v256i32|64 isel and tests for vp and vector.reduce
intrinsics

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D121469
2022-03-14 11:10:38 +01:00
Simon Moll a5f1262332 [VE] v256.32|64 gather|scatter isel and tests
This adds support for v256.32|64 scatter|gather isel.  vp.gather|scatter
and regular gather|scatter intrinsics are both lowered to the internal
VVP layer.  Splitting these ops on v512.32 is the subject of future
patches.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D121288
2022-03-14 10:38:56 +01:00
Sanjay Patel c2592c374e [SDAG] simplify bitwise logic with repeated operand
We do not have general reassociation here (and probably
do not need it), but I noticed these were missing in
patches/tests motivated by D111530, so we can at
least handle the simplest patterns.

The VE test diff looks correct, but we miss that
pattern in IR currently:
https://alive2.llvm.org/ce/z/u66_PM
2022-03-13 11:12:30 -04:00
Simon Moll c574c54ebf [VE] Split v512.32 load store into interleaved v256.32 ops
Without passthru for now. Support for packed passthru requires
evl-into-mask folding.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D120818
2022-03-07 17:38:38 +01:00
Simon Moll 9ebaec461a [VE] (masked) load|store v256.32|64 isel
Add `vvp_load|store` nodes. Lower to `vld`, `vst` where possible. Use
`vgt` for masked loads for now.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D120413
2022-03-02 13:31:29 +01:00
Simon Moll 4fd77129f2 [VE] Split unsupported v512.32 ops
Split v512.32 binary ops into two v256.32 ops using packing support
opcodes (vec_unpack_lo|hi, vec_pack).

Depends on D120053 for packing opcodes.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D120146
2022-02-22 14:29:41 +01:00
Simon Moll cf964eb5bd [VE] v512i1 mask arithmetic isel
Packed vector and mask registers (v512) are composed of two v256
subregisters that occupy the even and odd element positions.  We add
packing support SDNodes (vec_unpack_lo|hi and vec_pack) and splitting of
v512i1 mask arithmetic ops with those.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D120053
2022-02-21 10:38:11 +01:00
Simon Moll f27423027d [VE] Enable v256 fcmp true|false tests
The broadcast patterns for all-true|false masks are available now.
Enable the true|fast fcmp predicate tests that use them.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D119936
2022-02-18 13:26:18 +01:00
Simon Moll d46e49838e [VE] Fix vmp0 subregister mapping
vmp0 is the all-ones v512i1 register and does not break down into
subregisters.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D120054
2022-02-18 13:17:10 +01:00
Simon Moll 53efbc15cb [VE] v256i1 broadcast isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D119241
2022-02-15 12:40:51 +01:00
Simon Moll ce48fe47af [VE] v256i1 and|or|xor isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D119239
2022-02-14 08:47:06 +01:00
Simon Moll ae1bb44ed8 [VE] v256.32|64 setcc isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D119223
2022-02-08 13:20:55 +01:00
Simon Moll 73ac3b1371 [VE] Packed v512i32 isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D118332
2022-02-03 11:01:54 +01:00
Simon Moll 31cca9e6ba [VE] Packed v512f32 binop isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D118335
2022-02-02 10:09:39 +01:00
Simon Moll 5ceb0bc7ea [VE] Packed 32/64bit broadcast isel and tests
Packed-mode broadcast of f32/i32 requires the subregister to be
replicated to the full I64 register prior. Add repl_i32 and repl_f32 to
faciliate this.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D117878
2022-01-26 14:16:06 +01:00
Simon Moll 43994e9a4a [VE] vp_select+vectorBinOp passthru isel and tests
Extend the VE binaryop vector isel patterns to use passthru when the
result of a SDNode is used in a vector select or merge.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D117495
2022-01-18 11:31:14 +01:00
Simon Moll 95bf5ac8a8 [VE] select|vp.merge|vp.select v256 isel and tests
Use the `VMRG` for all three operations for now. `vp_select` will be
used in passthru patterns.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D117206
2022-01-17 15:58:54 +01:00
Simon Moll b2cea573c9 [VE] FADD,FSUB,FMUL,FDIV v256f32|f64 isel and tests
Depends on D115940 for the `Binary_rv_vr_vv` pattern class op isel
fragment used for divisions.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D116035
2021-12-21 09:15:31 +01:00
Simon Moll 8c51812913 [VE] U|SDIV v256i32|64 isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D115940
2021-12-21 08:51:01 +01:00
Simon Moll 676af1272b [VE] SHL,SRA,SRL v256i32|64 isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D115734
2021-12-15 11:32:18 +01:00
Simon Moll 6847379e89 [VE] MUL,SUB,OR,XOR v256i32|64 isel
v256i32|i64 isel patterns and tests.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D115643
2021-12-14 13:23:48 +01:00
Simon Moll 444013d324 [VE][NFC] Use POSIX-compatible stream redirection
Drop Bash-style stream redirect in favor of POSIX stream redirection to
fix spurious test failures on Windows.

Failure:
https://lab.llvm.org/buildbot/#/builders/123/builds/7509/steps/8/logs/stdio
2021-12-01 17:28:57 +01:00
Simon Moll bb5e35833f [VE][NFC] correct bitmasking in popcnt expansion test 2021-10-25 13:55:58 +02:00
Simon Moll 4e9dbee1a3 [VE][Test] Make Scalar/va_arg test generic
Make match patterns more permissive to be invariant to register
allocation choices.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D111312
2021-10-08 08:07:51 +02:00
Craig Topper 9132299836 [LegalizeTypes][VE] Don't Expand BITREVERSE/BSWAP during type legalization promotion if they will be promoted for NVT in op legalization.
We were trying to expand these if they were going to be expanded
in op legalization so that we generated the minimum number of
operations. We failed to take into account that NVT could be
promoted to another legal type in op legalization.

Hoping this fixes the issue on the VE target reported as a follow
up to D96681. The check line changes were taken from before
1e46b6f401 so this patch does
appear to improve some cases that had previously regressed.
2021-06-29 11:00:11 -07:00