Commit Graph

60633 Commits

Author SHA1 Message Date
Thomas Lively 5e09e9979b [WebAssembly] Prototype extending pairwise add instructions
As proposed in https://github.com/WebAssembly/simd/pull/380. This commit makes
the new instructions available only via clang builtins and LLVM intrinsics to
make their use opt-in while they are still being evaluated for inclusion in the
SIMD proposal.

Depends on D93771.

Differential Revision: https://reviews.llvm.org/D93775
2020-12-28 14:11:14 -08:00
Thomas Lively 44ee14f993 [WebAssembly][NFC] Finish cleaning up SIMD tablegen
This commit is a follow-on to c2c2e9119e73, using the `Vec` records introduced
in that commit in the rest of the SIMD instruction definitions. Also removes
unnecessary types in output patterns.

Differential Revision: https://reviews.llvm.org/D93771
2020-12-28 13:59:23 -08:00
Fangrui Song f931290308 [PowerPC] Parse and ignore .machine
glibc/sysdeps/powerpc/powerpc64 has .machine
{altivec,power4,power5,power6,power7,power8} (.machine power9 is planned in
sysdeps/powerpc/powerpc64/power9/strcmp.S).
The diagnostic is not useful anyway so just delete it.
2020-12-28 12:20:40 -08:00
Arthur Eubanks 9abc457724 [NewPM][AMDGPU] Port amdgpu-simplifylib/amdgpu-usenative
And add them to the pipeline via
AMDGPUTargetMachine::registerPassBuilderCallbacks(), which mirrors
AMDGPUTargetMachine::adjustPassManager().

These passes can't be unconditionally added to PassRegistry.def since
they are only present when the AMDGPU backend is enabled. And there are
no target-specific headers in llvm/include, so parsing these pass names
must occur somewhere in the AMDGPU directory. I decided the best place
was inside the TargetMachine, since the PassBuilder invokes
TargetMachine::registerPassBuilderCallbacks() anyway. If we come up with
a cleaner solution for target-specific passes in the future that's fine,
but there aren't too many target-specific IR passes living in
target-specific directories so it shouldn't be too bad to change in the
future.

Reviewed By: ychen, arsenm

Differential Revision: https://reviews.llvm.org/D93863
2020-12-28 10:38:51 -08:00
Nemanja Ivanovic e73f885c98 [PowerPC] Remove redundant COPY_TO_REGCLASS introduced by 8a58f21f5b 2020-12-28 09:26:51 -06:00
alex-t 644da789e3 [AMDGPU] Split edge to make si_if dominate end_cf
Basic block containing "if" not necessarily dominates block that is the "false" target for the if.

That "false" target block may have another predecessor besides the "if" block. IR value corresponding to the Exec mask is generated by the

si_if intrinsic and then used by the end_cf intrinsic. In this case IR verifier complains that 'Def does not dominate all uses'.

This change split the edge between the "if" block and "false" target block to make it dominated by the "if" block.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D91435
2020-12-28 17:14:02 +03:00
Zakk Chen e673d40199 [RISCV] Define vmsbf.m/vmsif.m/vmsof.m/viota.m/vid.v intrinsics.
Define those intrinsics and lower to V instructions.

Use update_llc_test_checks.py for viota.m tests to check
earlyclobber is applied correctly.
mask viota.m tests uses the same argument as input and mask for
avoid dependency of D93364.

We work with @rogfer01 from BSC to come out this patch.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D93823
2020-12-28 05:54:18 -08:00
Dmitry Preobrazhensky 8c25bb3d0d [AMDGPU][MC] Improved errors handling for v_interp* operands
See bug 48596 (https://bugs.llvm.org/show_bug.cgi?id=48596)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D93757
2020-12-28 16:15:48 +03:00
Dmitry Preobrazhensky 5b17263b6b [AMDGPU][MC][NFC] Parser refactoring
See bug 48515 (https://bugs.llvm.org/show_bug.cgi?id=48515)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D93756
2020-12-28 14:59:49 +03:00
Fraser Cormack d85a198e85 [RISCV] Pattern-match more vector-splatted constants
This patch extends the pattern-matching capability of vector-splatted
constants. When illegally-typed constants are legalized they are
canonically sign-extended to XLenVT. This preserves the sign and allows
us to match simm5. If they were zero-extended for whatever reason we'd
lose that ability: e.g. `(i8 -1) -> (XLenVT 255)` would not be matched
under the current logic.

To address this we first manually sign-extend the splatted constant from
the vector element type to int64_t. This preserves the semantics while
removing any implicitly-truncated bits.

The corresponding logic for uimm5 was not updated, the rationale being
that neither sign- nor zero-extending a legal uimm5 immediate should
change that (unless we expect actual "garbage" upper bits).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93837
2020-12-28 07:11:10 +00:00
Nikita Popov fb77d95022 [AArch64] Fix legalization of i128 ctpop without neon
If neon is disabled, LowerCTPOP will return SDValue() to indicate
that normal legalization should be used. However, ReplaceNodeResults
does not check for this and pushes the empty SDValue() onto the
result vector, which will subsequently result in a crash.

Differential Revision: https://reviews.llvm.org/D93825
2020-12-27 17:24:41 +01:00
Craig Topper 33051d5d61 [X86] Remove X86Fmadd SDNode from tablegen. Use standard fma instead. NFC
I guess I missed this in 4252f7773a
when I modified most patterns.
2020-12-26 23:41:46 -08:00
Craig Topper 76202f09b5 [RISCV] Improve VMConstraint checking on more unary and nullary instructions.
We weren't consistently marking unary instructions as OneInput
and vid.v is really ZeroInput but we had no way to mark that.

This patch improves this by removing the error prone OneInput constraint.
Instead we just always look for the mask in the last operand.

It appears that the "CheckReg" variable used for the check on the broken
instruction was unitialized or garbage because it was also used for
VS1/VS2 constraints. I've scoped the variable locally to each check now.

I've gone through and set NoConstraint on instructions that don't have
a real VMConstraint and don't have a mask as the last operand.

I've also removed the unused enum values in RISCVBaseInfo.h. We
never use them in C++ and we have separate versions in a td file.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D93784
2020-12-26 18:47:59 -08:00
Monk Chiang 622ea9cf74 [RISCV] Define vector widening reduction intrinsic.
Define vwredsumu/vwredsum/vfwredosum/vfwredsum

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Differential Revision: https://reviews.llvm.org/D93807
2020-12-26 21:42:30 +08:00
Amara Emerson e0721a0992 [AArch64][GlobalISel] Notify observer of mutated instruction for shift custom legalization.
No test for this because it's a CSE verifier failure that's only exposed in a
WIP patch for enabling CSE throughout the AArch64 GISel pipeline.
2020-12-25 00:31:47 -08:00
Zakk Chen da4a637e99 [RISCV] Define vpopc/vfirst intrinsics.
Define vpopc/vfirst intrinsics and lower to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93795
2020-12-24 19:44:34 -08:00
Kazu Hirata d6ff5cf995 [Target] Use llvm::any_of (NFC) 2020-12-24 19:43:26 -08:00
Zakk Chen 351c216f36 [RISCV] Define vector mask-register logical intrinsics.
Define vector mask-register logical intrinsics and lower them
to V instructions. Also define pseudo instructions vmmv.m
and vmnot.m.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Differential Revision: https://reviews.llvm.org/D93705
2020-12-24 18:59:05 -08:00
ShihPo Hung 912740a864 [RISCV] Add intrinsics for vrgather instruction
This patch defines vrgather intrinsics and lower to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential revision: https://reviews.llvm.org/D93797
2020-12-24 18:16:02 -08:00
Monk Chiang afd03cd335 [RISCV] Define vector single-width reduction intrinsic.
integer group:
vredsum/vredmaxu/vredmax/vredminu/vredmin/vredand/vredor/vredxor
float group:
vfredosum/vfredsum/vfredmax/vfredmin

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Differential Revision: https://reviews.llvm.org/D93746
2020-12-25 09:56:01 +08:00
Praveen Velliengiri 61177943c9 [AMDGPU] Use MUBUF instructions for global address space access
Currently, the compiler crashes in instruction selection of global
load/stores in gfx600 due to the lack of FLAT instructions. This patch
fix the crash by selecting MUBUF instructions for global load/stores
in gfx600.

Authored-by: Praveen Velliengiri <Praveen.Velliengiri@amd.com>

Reviewed by: t-tye

Differential revision: https://reviews.llvm.org/D92483
2020-12-24 10:13:04 +00:00
Stanislav Mekhanoshin 747f67e034 [AMDGPU] Fix adjustWritemask subreg handling
If we happen to extract a non-dword subreg that breaks the
logic of the function and it may shrink the dmask because
it does not recognize the use of a lane(s).

This bug is next to impossible to trigger with the current
lowering in the BE, but it breaks in one of my future patches.

Differential Revision: https://reviews.llvm.org/D93782
2020-12-23 14:43:31 -08:00
Fraser Cormack 1a7ac29a89 [RISCV] Add ISel support for RVV vector/scalar forms
This patch extends the SDNode ISel support for RVV from only the
vector/vector instructions to include the vector/scalar and
vector/immediate forms.

It uses splat_vector to carry the scalar in each case, except when
XLEN<SEW (RV32 SEW=64) when a custom node `SPLAT_VECTOR_I64` is used for
type-legalization and to encode the fact that the value is sign-extended
to SEW. When the scalar is a full 64-bit value we use a sequence to
materialize the constant into the vector register.

The non-intrinsic ISel patterns have also been split into their own
file.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93312
2020-12-23 20:16:18 +00:00
Craig Topper e0110a4740 [RISCV] Add intrinsics for vfmv.v.f
Also include a special case pattern to use vmv.v.x vd, zero when
the argument is 0.0.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D93672
2020-12-23 10:50:48 -08:00
David Penry a9f14cdc62 [ARM] Add bank conflict hazarding
Adds ARMBankConflictHazardRecognizer. This hazard recognizer
looks for a few situations where the same base pointer is used and
then checks whether the offsets lead to a bank conflict. Two
parameters are also added to permit overriding of the target
assumptions:

arm-data-bank-mask=<int> - Mask of bits which are to be checked for
conflicts.  If all these bits are equal in the offsets, there is a
conflict.
arm-assume-itcm-bankconflict=<bool> - Assume that there will be bank
conflicts on any loads to a constant pool.

This hazard recognizer is enabled for Cortex-M7, where the Technical
Reference Manual states that there are two DTCM banks banked using bit
2 and one ITCM bank.

Differential Revision: https://reviews.llvm.org/D93054
2020-12-23 14:00:59 +00:00
Simon Moll c3acda0798 [VE] Vector 'and' isel and tests
Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D93709
2020-12-23 13:29:29 +01:00
Sebastian Neubauer 221fdedc69 [AMDGPU][GlobalISel] Fold flat vgpr + constant addresses
Use getPtrBaseWithConstantOffset in selectFlatOffsetImpl to fold more
vgpr+constant addresses.

Differential Revision: https://reviews.llvm.org/D93692
2020-12-23 10:40:30 +01:00
ShihPo Hung 6301871d06 [RISCV] Add intrinsics for vfwmacc, vfwnmacc, vfwmsac, vfwnmsac instructions
This patch defines vfwmacc, vfwnmacc, vfwmsc, vfwnmsac intrinsics
and lower to V instructions.
We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential Revision: https://reviews.llvm.org/D93693
2020-12-23 00:42:04 -08:00
Zakk Chen 032600b9ae [RISCV] Define vmerge/vfmerge intrinsics.
Define vmerge/vfmerge intrinsics and lower to V instructions.

Include support for vector-vector vfmerge by vmerge.vvm.

We work with @rogfer01 from BSC to come out this patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93674
2020-12-23 00:07:09 -08:00
Evandro Menezes 4d47944393 [RISCV] Define the vfmin, vfmax RVV intrinsics
Define the vfmin, vfmax IR intrinsics for the respective V instructions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Evandro Menezes <evandro.menezes@sifive.com>

Differential Revision: https://reviews.llvm.org/D93673
2020-12-23 00:27:38 -06:00
Thomas Lively efe7f5ede0 [WebAssembly][NFC] Refactor SIMD load/store tablegen defs
Introduce `Vec` records, each bundling all information related to a single SIMD
lane interpretation. This lets TableGen definitions take a single Vec parameter
from which they can extract information rather than taking multiple redundant
parameters. This commit refactors all of the SIMD load and store instruction
definitions to use the new `Vec`s. Subsequent commits will similarly refactor
additional instruction definitions.

Differential Revision: https://reviews.llvm.org/D93660
2020-12-22 20:06:12 -08:00
Matt Arsenault 581d13f8ae GlobalISel: Return APInt from getConstantVRegVal
Returning int64_t was arbitrarily limiting for wide integer types, and
the functions should handle the full generality of the IR.

Also changes the full form which returns the originally defined
vreg. Add another wrapper for the common case of just immediately
converting to int64_t (arguably this would be useful for the full
return value case as well).

One possible issue with this change is some of the existing uses did
break without conversion to getConstantVRegSExtVal, and it's possible
some without adequate test coverage are now broken.
2020-12-22 22:23:58 -05:00
Matt Arsenault 8bf9cdeaee AMDGPU: Use Register 2020-12-22 21:55:59 -05:00
Matt Arsenault bac54639c7 AMDGPU: Add spilled CSR SGPRs to entry block live ins 2020-12-22 21:55:59 -05:00
ShihPo Hung ad0a7ad950 [RISCV] Add intrinsics for vf[n]macc/vf[n]msac/vf[n]madd/vf[n]msub instructions
This patch defines vfmadd/vfnmacc, vfmsac/vfnmsac, vfmadd/vfnmadd,
and vfmsub/vfnmsub lower to V instructions.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential Revision: https://reviews.llvm.org/D93691
2020-12-22 18:34:00 -08:00
ShihPo Hung 4268783998 [RISCV] Add intrinsics for vwmacc[u|su|us] instructions
This patch defines vwmacc[u|su|us] intrinsics and lower to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential Revision: https://reviews.llvm.org/D93675
2020-12-22 18:17:39 -08:00
ShihPo Hung c8874464b5 [RISCV] Add intrinsics for vslide1up/down, vfslide1up/down instruction
This patch adds intrinsics for vslide1up, vslide1down, vfslide1up, vfslide1down.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: ShihPo Hung <shihpo.hung@sifive.com>

Differential Revision: https://reviews.llvm.org/D93608
2020-12-22 18:14:22 -08:00
Matt Arsenault 29ed846d67 AMDGPU: Fix assert when checking for implicit operand legality 2020-12-22 20:56:24 -05:00
Arthur O'Dwyer 22cf54a7fb Replace `T(x)` with `reinterpret_cast<T>(x)` everywhere it means reinterpret_cast. NFC.
Differential Revision: https://reviews.llvm.org/D76572
2020-12-22 19:54:29 -05:00
Stanislav Mekhanoshin d15119a02d [AMDGPU][GlobalISel] GlobalISel for flat scratch
It does not seem to fold offsets but this is not specific
to the flat scratch as getPtrBaseWithConstantOffset() does
not return the split for these tests unlike its SDag
counterpart.

Differential Revision: https://reviews.llvm.org/D93670
2020-12-22 16:33:06 -08:00
Stanislav Mekhanoshin ca4bf58e4e [AMDGPU] Support unaligned flat scratch in TLI
Adjust SITargetLowering::allowsMisalignedMemoryAccessesImpl for
unaligned flat scratch support. Mostly needed for global isel.

Differential Revision: https://reviews.llvm.org/D93669
2020-12-22 16:12:31 -08:00
Thomas Lively a781a706b9 [WebAssembly][SIMD] Rename shuffle, swizzle, and load_splats
These instructions previously used prefixes like v8x16 to signify that they were
agnostic between float and int interpretations. We renamed these instructions to
remove this form of prefix in https://github.com/WebAssembly/simd/issues/297 and
https://github.com/WebAssembly/simd/issues/316 and this commit brings the names
in LLVM up to date.

Differential Revision: https://reviews.llvm.org/D93722
2020-12-22 14:29:06 -08:00
Craig Topper 53deef9e0b [RISCV] Remove unneeded !eq comparing a single bit value to 0/1 in RISCVInstrInfoVPseudos.td. NFC
Instead we can either use the bit directly. If it was checking for
0 we need to swap the operands or use !not.
2020-12-22 11:57:16 -08:00
Stanislav Mekhanoshin ae8f4b2178 [AMDGPU] Folding of FI operand with flat scratch
Differential Revision: https://reviews.llvm.org/D93501
2020-12-22 10:48:04 -08:00
Kamau Bridgeman 8a58f21f5b [PowerPC][Power10] Exploit store rightmost vector element instructions
Using the store rightmost vector element instructions to do vector
element extraction and store. The rightmost vector element on little
endian is the zeroth vector element, with these patterns that element
can be extracted and stored in one instruction for all vector types.

Differential Revision: https://reviews.llvm.org/D89195
2020-12-22 12:06:43 -05:00
Paul Walker 8eec7294fe [SVE] Lower vector BITREVERSE and BSWAP operations.
These operations are lowered to RBIT and REVB instructions
respectively.  In the case of fixed-length support using SVE we
also lower BITREVERSE operating on NEON sized vectors as this
results in fewer instructions.

Differential Revision: https://reviews.llvm.org/D93606
2020-12-22 16:49:50 +00:00
Nandor Licker 0586f048d7 [RISCV] Basic jump table lowering
This patch enables jump table lowering in the RISC-V backend.

In addition to the test case included, the new lowering was
tested by compiling the OCaml runtime and running it under qemu.

Differential Revision: https://reviews.llvm.org/D92097
2020-12-22 15:05:54 +00:00
Nemanja Ivanovic ba1202a1e4 [PowerPC] Restore stack ptr from base ptr when available
On subtargets that have a red zone, we will copy the stack pointer to the base
pointer in the prologue prior to updating the stack pointer. There are no other
updates to the base pointer after that. This suggests that we should be able to
restore the stack pointer from the base pointer rather than loading it from the
back chain or adding the frame size back to either the stack pointer or the
frame pointer.
This came about because functions that call setjmp need to restore the SP from
the FP because the back chain might have been clobbered
(see https://reviews.llvm.org/D92906). However, if the stack is realigned, the
restored SP might be incorrect (which is what caused the failures in the two
ASan test cases).

This patch was tested quite extensivelly both with sanitizer runtimes and
general code.

Differential revision: https://reviews.llvm.org/D93327
2020-12-22 05:44:03 -06:00
Hsiangkai Wang 9a8ef927df [RISCV] Define vector compare intrinsics.
Define vector compare intrinsics and lower them to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93368
2020-12-22 14:08:18 +08:00
Zakk Chen 7a2c8be641 [RISCV] Define vleff intrinsics.
Define vleff intrinsics and lower to V instructions.

We work with @rogfer01 from BSC to come out this patch.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93516
2020-12-21 22:05:38 -08:00