Commit Graph

883 Commits

Author SHA1 Message Date
Aries 438f1c92c4 Fix some build warnings 2023-01-19 09:45:27 +08:00
zhoujing 7e701d4ba1 Add support for floating-point trunc instruction matching 2023-01-09 18:06:39 +08:00
Aries 0b43b70327 Fix bug in addressing space mapping 2023-01-03 10:45:58 +08:00
zhoujing 1fab7b80f3 Legalize operation for SETCC 2022-12-29 17:13:49 +08:00
Aries 17adb707e6 Fix bug in kernel arg memory offset calculation 2022-12-29 11:53:29 +08:00
Aries 424ea45e4f Update Ventus GPGPU ABI: X4 as stack pointer, V0-V31 as arguments registers etc 2022-12-28 13:11:22 +08:00
Aries e8368c07e1 Fix kernel argument lowering alignment bug. 2022-12-27 17:00:46 +08:00
Aries 3a9c32a024 Add initial vector support(calling convention fix). 2022-12-27 16:35:12 +08:00
Aries da5006ca8d Add support to lowering BITCAST and Constant Pool for zfinx etc 2022-12-27 13:39:46 +08:00
Aries 9be2c54215 Add initial vGPR + sGPRF32 (zfinx) support 2022-12-27 12:00:30 +08:00
Aries 7d7ef235fd Support f32 return type in VGPR 2022-12-27 11:21:08 +08:00
Aries 2f946d86ad Fix basicblock insertion ordering for ISD::SELECT lowering. 2022-12-22 17:47:03 +08:00
Aries cb6f30fbd7 Add initial support to lower ISD::SELECT into branch instructions in divergent execution path. 2022-12-22 17:17:02 +08:00
Aries b9da010dd5 [NFC] Refactor messy switch...case 2022-12-22 14:50:13 +08:00
Aries beb878e97c Add OpenCL addressing space mapping to RISCVAS.
Add kernel argument lowering.
Clean up a few unrelated RVV code.
2022-12-20 17:08:08 +08:00
Aries dee3135130 Drafting divergent related code, not working yet. 2022-12-19 18:11:34 +08:00
Aries c6b68cbedb Support move between vGPR and sGPR.
Fix a few bugs in calling convention related lowering functions.
2022-12-19 14:21:26 +08:00
Aries 4e0cd22745 Add vALU conditional branch instructions 2022-12-19 13:09:00 +08:00
Aries 894931f522 More clean up and fix build error. 2022-12-19 10:10:28 +08:00
Aries 521e83631d Roughly cleaned RVV instruction selection. 2022-12-19 09:40:05 +08:00
Aries 35633e31e3 In the middle of removing RVV code. 2022-12-16 18:04:43 +08:00
Aries f1eff7fcfe Very very early step to remove RVV features from code base. 2022-12-16 17:33:54 +08:00
Kazu Hirata 3c09ed006a [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 17:12:44 -08:00
Fangrui Song b0df70403d [Target] llvm::Optional => std::optional
The updated functions are mostly internal with a few exceptions (virtual functions in
TargetInstrInfo.h, TargetRegisterInfo.h).
To minimize changes to LLVMCodeGen, GlobalISel files are skipped.

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 22:43:14 +00:00
Kazu Hirata 20cde15415 [Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:06 -08:00
Krzysztof Parzyszek 864aaa21b4 TargetLowering: convert Optional to std::optional 2022-12-01 16:19:10 -08:00
Philip Reames 7d82c99403 [RISCV][TTI] Account for constant materialization cost when costing arithmetic operations
At the IR level, we generally assume that constants are free to materialize. However, for RISCV due to some quirks of the ISA, materializing arbitrary constants can be rather expensive. We frequently fall back to constant pool loads.

We've been slowly moving in the direction of modeling the cost of the remat as part of the instruction cost. This has the effect of disincentivizing vectorization - mostly SLP - when we'd have to materialize an expensive constant.

We need better modeling of which constants are expensive and which are not, but for the moment let's be consistent with how we model arithmetic and memory instructions. The difference between the two is that arithmetic can sometimes fold a splat operation, which stores cannot.

Differential Revision: https://reviews.llvm.org/D138941
2022-11-30 07:20:51 -08:00
Philip Reames b25672ba82 [RISCV] Separate out helper for checking if vector splat supported for operand [nfc] 2022-11-29 11:05:46 -08:00
Kazu Hirata 2f61c6c639 [RISCV] Use std::optional in RISCVISelLowering.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-25 23:04:58 -08:00
LiaoChunyu aa14f002d5 [RISCV] Branchless lowering for (select (x < 0), TrueConstant, FalseConstant) and (select (x >= 0), TrueConstant, FalseConstant)
This patch reduces the number of unpredictable branches

(select (x < 0), y, z)  -> ((x >> (XLEN - 1)) & (y - z)) + z
(select (x >= 0), y, z) -> ((x >> (XLEN - 1)) & (z - y)) + y

Reviewed By: craig.topper, reames

Differential Revision: https://reviews.llvm.org/D137949
2022-11-25 20:18:30 +08:00
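
The branchless select identities in aa14f002d5 can be checked with a small standalone sketch. This is not the LLVM lowering code itself: the function names `selectIfNegative` and `selectIfNonNegative` are hypothetical, XLEN is assumed to be 64, and the arithmetic is done on unsigned values so that `y - z` wraps instead of overflowing.

```cpp
#include <cstdint>

// Assumption for this sketch: a 64-bit register width.
constexpr int XLEN = 64;

// (select (x < 0), y, z) -> ((x >> (XLEN - 1)) & (y - z)) + z
inline int64_t selectIfNegative(int64_t x, int64_t y, int64_t z) {
  // Arithmetic shift: all ones when x is negative, zero otherwise.
  // (Guaranteed by C++20; implementation-defined but universal before that.)
  const uint64_t mask = static_cast<uint64_t>(x >> (XLEN - 1));
  const uint64_t uy = static_cast<uint64_t>(y);
  const uint64_t uz = static_cast<uint64_t>(z);
  // mask == ~0: (y - z) + z == y.  mask == 0: 0 + z == z.
  return static_cast<int64_t>((mask & (uy - uz)) + uz);
}

// (select (x >= 0), y, z) -> ((x >> (XLEN - 1)) & (z - y)) + y
inline int64_t selectIfNonNegative(int64_t x, int64_t y, int64_t z) {
  const uint64_t mask = static_cast<uint64_t>(x >> (XLEN - 1));
  const uint64_t uy = static_cast<uint64_t>(y);
  const uint64_t uz = static_cast<uint64_t>(z);
  // mask == 0: 0 + y == y.  mask == ~0: (z - y) + y == z.
  return static_cast<int64_t>((mask & (uz - uy)) + uy);
}
```

Both functions compile to a shift, a mask, and an add, with no data-dependent branch, which is the property the commit message is after.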
wangpc 241accea2a [RISCV] Lower unmasked zero-stride vector load to (scalar load + splat)
This gives us the opportunity to fold the splat into a .vx instruction, as
D101138 does. If that fails, we can still select a zero-stride vector
load.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D138101
2022-11-24 11:09:45 +08:00
WuXinlong 219417b2c6 [RISCV] Add CodeGen support and MC testcase of RISCV Zca Extension
This patch adds support for the RISCV Zca extension.

`Zca` is a subset of C extension instructions that are compatible with the Zc extension.

So this patch implements Zca code generation with reference to the C extension and sets the 2-byte alignment for the Zca extension, just like C extension does.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D130483
2022-11-22 17:22:26 +08:00
Han-Kuan Chen 7e6dbfcd9d [RISCV] Make lowerVECTOR_SHUFFLEAsVSlidedown follow source until not EXTRACT_SUBVECTOR.
Currently, lowerVECTOR_SHUFFLEAsVSlidedown only checks whether the inputs
are EXTRACT_SUBVECTOR nodes with the same source. This commit makes the
function follow each input to its source until the node is no longer an
EXTRACT_SUBVECTOR.

Differential Revision: https://reviews.llvm.org/D138025
2022-11-17 22:32:53 -08:00
Stanislav Mekhanoshin bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can return if a misaligned access is 'fast' as defined
by the target or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.

A target can still define it as it wants and the direct translation
of the current code uses 0 and 1 for current false and true. This
makes the change an NFC.

A subsequent patch will start using an actual speed value in
the load/store vectorizer to check whether a vectorized access is going
to be not just fast, but no slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
Craig Topper 7e15ea102f [RISCV] Add a DAG combine to pre-promote (i1 (truncate (i32 (srl X, Y)))) with Zbs on RV64.
Type legalization will want to turn (srl X, Y) into RISCVISD::SRLW,
which will prevent us from using a BEXT instruction.

This is similar to what we do for (i32 (and (srl X, Y), 1)).
2022-11-16 19:07:33 -08:00
Craig Topper 5c9b03faef [RISCV] Remove duplicate setOperationAction. NFC 2022-11-16 16:54:27 -08:00
Yeting Kuo ed9638c44b [VP][RISCV] Add vp.nearbyint and RISC-V support.
nearbyint must execute without raising exceptions.
To avoid modifying fflags, the patch adds a new machine opcode
PseudoVFROUND_NOEXCEPT_V that expands to vfcvt.x.f.v and vfcvt.f.x.v between a pair
of frflags and fsflags.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137685
2022-11-16 14:05:35 +08:00
Yeting Kuo 5c3ca10b09 [VP][RISCV] Add vp.bswap and RISC-V support.
The patch also added function expandVPBSWAP to expand ISD::VP_BSWAP nodes.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137928
2022-11-16 11:36:38 +08:00
wangpc a214c521f8 [RISCV] Don't use zero-stride vector load for gather if not optimized
We may form a zero-stride vector load when lowering gather to strided
load. As in D137699, we use `load+splat` for this form if
there is no optimized implementation.
We currently restrict this to unmasked loads because of the
complexity of handling all-false masks.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D137931
2022-11-16 10:43:10 +08:00
Han-Kuan Chen aa47bfa9bc [RISCV] Refactor getDefaultVLOps. NFC.
Currently, getDefaultVLOps can only deduce VL from an MVT. However,
sometimes callers already know the VL value. This commit provides a
uniform interface to get VL instead of calling DAG.getConstant.

Differential Revision: https://reviews.llvm.org/D138003
2022-11-15 18:11:11 -08:00
Craig Topper 25dcca60f4 [RISCV] Teach shouldSinkOperands that vp.add and friends are commutative.
We previously had a bug that our isel patterns weren't commutative,
but that has been fixed for a while.
2022-11-14 22:01:59 -08:00
Craig Topper dde8423f21 [RISCV] Expand i32 abs to negw+max at isel.
This adds a RISCVISD::ABSW to remember that we started with an i32
abs. Previously we used a DAG combine of (sext_inreg (abs)) to
delay emitting a freeze from type legalization in order to make
ComputeNumSignBits optimizations work on other promoted nodes.

This new approach always uses negw+max even if the result doesn't
need to be sign extended. This helps the RISCVSExtWRemoval pass
if the sext.w is in another basic block.
2022-11-14 19:44:05 -08:00
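
The negw+max expansion in dde8423f21 relies on the identity abs(x) = max(x, -x) with a wrapping 32-bit negate. A minimal sketch of that identity follows; the function name `absViaNegMax` is hypothetical and the code models only the arithmetic, not the instruction selection.

```cpp
#include <algorithm>
#include <cstdint>

// abs(x) computed as max(x, -x), mirroring a negw followed by max.
inline int32_t absViaNegMax(int32_t x) {
  // Model negw: negate in unsigned arithmetic so that INT32_MIN wraps
  // to itself instead of triggering signed-overflow undefined behavior.
  const int32_t neg = static_cast<int32_t>(0u - static_cast<uint32_t>(x));
  // Signed max picks whichever of x and -x is non-negative.
  return std::max(x, neg);
}
```

As with the usual two's-complement abs, `INT32_MIN` maps to itself, because its negation wraps back to `INT32_MIN`.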
Yeting Kuo 0c0681b741 [RISCV][NFC] Remove dead code.
No ISD::BSWAP nodes are custom lowered in RISC-V now, so the patch
removes the dead code for ISD::BSWAP in LowerOperation.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137907
2022-11-14 10:08:48 +08:00
Yeting Kuo 06a7e04be4 [RISCV][NFC] Fix unused variable warning.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D137633
2022-11-10 20:23:09 +08:00
melonedo f4f6c63f0d [RISCV] Add support for static chain
The static chain parameter is a special parameter that is not passed in the usual argument registers or stack space. For example, in the x64 System V ABI it is always passed in R10. Although the RISCV ABI does not assign a register for this purpose, GCC has long supported it on RISC-V, exposing it via the `__builtin_call_with_static_chain` intrinsic and assigning t2 to static chain parameters. This patch also chooses t2 for compatibility.

In LLVM, static chain parameters are handled by the `nest` attribute of an argument to a function ([D6332](https://reviews.llvm.org/D6332)), so tests are added to ensure `nest` arguments are handled correctly.

Reviewed By: kito-cheng, MaskRay

Differential Revision: https://reviews.llvm.org/D129106
2022-11-09 16:10:32 +08:00
Yeting Kuo 71e4e35581 [VP][RISCV] Add vp.rint and RISC-V support.
FRINT uses dynamic rounding mode instead of static rounding mode. The patch
renames VFCVT_X_F_VL to VFCVT_RM_X_F_VL for static rounding mode uses and adds a
new ISD node VFCVT_X_F_VL that is selected directly to PseudoVFCVT_X_F_V.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D136662
2022-11-01 14:52:47 +08:00
Craig Topper 2a827e4a98 [RISCV] Fix crash when a vector add has a 4x sext and zext operand.
We can narrow one of the extends and keep the other original by
using a vwaddu.wv or vwadd.wv.

We were previously forgetting to keep the original operand and
instead took the source of its extend. This resulted in a type
mismatch that later failed with an impossible physical register copy.

To fix this I've refactored some code to maintain information about
whether the source needs to be extended at all for longer so we could
use it in materialize.

Differential Revision: https://reviews.llvm.org/D137106
2022-10-31 15:10:27 -07:00
Craig Topper 6a794419cd [RISCV] Optimize i64 insertelt on RV32.
We can use tail undisturbed vslide1down to insert into the vector.

This should make D136640 unneeded.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136738
2022-10-28 10:23:19 -07:00
Craig Topper e94dc58dff [RISCV] Inline scalar ceil/floor/trunc/rint/round/roundeven.
This avoids the call overhead as well as the save/restore of
fflags and the snan handling in the libm function.

The save/restore of fflags and snan handling are needed to be
correct for -ftrapping-math. I think we can ignore them in the
default environment.

The inline sequence will generate an invalid exception for nan
and an inexact exception if fractional bits are discarded.

I've used a custom inserter to explicitly create the control flow
around the float->int->float conversion.

We can probably avoid the final fsgnj after the conversion for
no signed zeros FMF, but I'll leave that for future work.

Note the comparison constant is slightly different than glibc uses.
They use 1<<53 for double, I'm using 1<<52. I believe either is valid.
Numbers >= 1<<52 can't have any fractional bits. It's ok to do the
float->int->float conversion on numbers between 1<<53 and 1<<52 since
they will all fit in 64 bits. We only have a problem if the double can't fit
in i64.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136508
2022-10-26 14:36:49 -07:00
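
The inline rounding sequence described in e94dc58dff can be illustrated for the trunc case with a standalone sketch. This is an approximation under stated assumptions, not the custom inserter from the patch: the name `truncInline` is hypothetical, exception flags are not modeled, and NaN is simply returned unchanged.

```cpp
#include <cmath>
#include <cstdint>

// Truncate toward zero via a float -> int -> float round trip.
inline double truncInline(double x) {
  // 2^52: doubles at or above this magnitude have no fractional bits.
  constexpr double kNoFractionBits = 4503599627370496.0;
  // Large values are already integral; NaN fails the comparison too.
  // Both are returned as-is in this sketch.
  if (!(std::fabs(x) < kNoFractionBits))
    return x;
  // |x| < 2^52, so the value always fits in int64_t. The cast truncates.
  const double rounded = static_cast<double>(static_cast<int64_t>(x));
  // Restore the sign so that inputs in (-1, 0) yield -0.0; this plays
  // the role of the final fsgnj mentioned in the commit message.
  return std::copysign(rounded, x);
}
```

The threshold check is what keeps the integer conversion safe: any value that reaches the cast has magnitude below 1<<52 and therefore fits in a 64-bit integer.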
Craig Topper a61b74889f [RISCV] Use vslide1down for i64 insertelt on RV32.
Instead of using vslide1up, use vslide1down and build the other
direction. This avoids the overlap constraint early clobber of
vslide1up.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136735
2022-10-26 09:43:12 -07:00