llvm-project

Commit Graph

Author	SHA1	Message	Date
Aries	cb6f30fbd7	Add initial support to lower ISD::SELECT into branch instructions in divergent execution path.	2022-12-22 17:17:02 +08:00
Aries	b9da010dd5	[NFC] Refactor messy switch...case	2022-12-22 14:50:13 +08:00
Aries	beb878e97c	Add OpenCL addressing space mapping to RISCVAS. Add kernel argument lowering. Clean up a few unrelated RVV code.	2022-12-20 17:08:08 +08:00
Aries	dee3135130	Drafting divergent related code, not working yet.	2022-12-19 18:11:34 +08:00
Aries	c6b68cbedb	Support move between vGPR and sGPR. Fix a few bugs in calling convention related lowering functions.	2022-12-19 14:21:26 +08:00
Aries	4e0cd22745	Add vALU conditional branch instructions	2022-12-19 13:09:00 +08:00
Aries	894931f522	More clean up and fix build error.	2022-12-19 10:10:28 +08:00
Aries	521e83631d	Roughly cleaned RVV instruction selection.	2022-12-19 09:40:05 +08:00
Aries	35633e31e3	In the middle of removing RVV code.	2022-12-16 18:04:43 +08:00
Aries	f1eff7fcfe	Very very early step to remove RVV features from code base.	2022-12-16 17:33:54 +08:00
Kazu Hirata	3c09ed006a	[llvm] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 17:12:44 -08:00
Fangrui Song	b0df70403d	[Target] llvm::Optional => std::optional The updated functions are mostly internal with a few exceptions (virtual functions in TargetInstrInfo.h, TargetRegisterInfo.h). To minimize changes to LLVMCodeGen, GlobalISel files are skipped. https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 22:43:14 +00:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Krzysztof Parzyszek	864aaa21b4	TargetLowering: convert Optional to std::optional	2022-12-01 16:19:10 -08:00
Philip Reames	7d82c99403	[RISCV][TTI] Account for constant materialization cost when costing arithmetic operations At the IR level, we generally assume that constants are free to materialize. However, for RISCV due to some quirks of the ISA, materializing arbitrary constants can be rather expensive. We frequently fall back to constant pool loads. We've been slowly moving in the direction of modeling the cost of the remat as part of the instruction cost. This has the effect of disincentivizing vectorization - mostly SLP - when we'd have to materialize an expensive constant. We need better modeling of which constants are expensive and not, but for the moment let's be consistent with how we model arithmetic and memory instructions. The difference between the two is that arithmetic can sometimes fold a splat operation which stores cannot. Differential Revision: https://reviews.llvm.org/D138941	2022-11-30 07:20:51 -08:00
Philip Reames	b25672ba82	[RISCV] Separate out helper for checking if vector splat supported for operand [nfc]	2022-11-29 11:05:46 -08:00
Kazu Hirata	2f61c6c639	[RISCV] Use std::optional in RISCVISelLowering.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-25 23:04:58 -08:00
LiaoChunyu	aa14f002d5	[RISCV] Branchless lowering for (select (x < 0), TrueConstant, FalseConstant) and (select (x >= 0), TrueConstant, FalseConstant) This patch reduces the number of unpredictable branches (select (x < 0), y, z) -> x >> (XLEN - 1) & (y - z) + z (select (x >= 0), y, z) -> x >> (XLEN - 1) & (z - y) + y Reviewed By: craig.topper, reames Differential Revision: https://reviews.llvm.org/D137949	2022-11-25 20:18:30 +08:00
wangpc	241accea2a	[RISCV] Lower unmasked zero-stride vector load to (scalar load + splat) This gives us the opportunity to fold the splat into a .vx instruction, as D101138 did. If that fails, we can select a zero-stride vector load again. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D138101	2022-11-24 11:09:45 +08:00
WuXinlong	219417b2c6	[RISCV] Add CodeGen support and MC testcase of RISCV Zca Extension This patch adds support for the RISCV Zca extension. `Zca` is a subset of C extension instructions that are compatible with the Zc extension. So this patch implements Zca code generation with reference to the C extension and sets the 2-byte alignment for the Zca extension, just like the C extension does. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D130483	2022-11-22 17:22:26 +08:00
Han-Kuan Chen	7e6dbfcd9d	[RISCV] Make lowerVECTOR_SHUFFLEAsVSlidedown follow source until not EXTRACT_SUBVECTOR. Current lowerVECTOR_SHUFFLEAsVSlidedown only seeks whether input are EXTRACT_SUBVECTOR and their source are same. The commit will make the function seek input and their source until they are not EXTRACT_SUBVECTOR. Differential Revision: https://reviews.llvm.org/D138025	2022-11-17 22:32:53 -08:00
Stanislav Mekhanoshin	bcaf31ec3f	[AMDGPU] Allow finer grain control of an unaligned access speed A target can return if a misaligned access is 'fast' as defined by the target or not. In reality there can be different levels of 'fast' and 'slow'. This patch changes the boolean 'Fast' argument of the allowsMisalignedMemoryAccesses family of functions to an unsigned representing its speed. A target can still define it as it wants and the direct translation of the current code uses 0 and 1 for current false and true. This makes the change an NFC. Subsequent patch will start using an actual value of speed in the load/store vectorizer to compare if a vectorized access going to be not just fast, but not slower than before. Differential Revision: https://reviews.llvm.org/D124217	2022-11-17 09:23:53 -08:00
Craig Topper	7e15ea102f	[RISCV] Add a DAG combine to pre-promote (i1 (truncate (i32 (srl X, Y)))) with Zbs on RV64. Type legalization will want to turn (srl X, Y) into RISCVISD::SRLW, which will prevent us from using a BEXT instruction. This is similar to what we do for (i32 (and (srl X, Y), 1)).	2022-11-16 19:07:33 -08:00
Craig Topper	5c9b03faef	[RISCV] Remove duplicate setOperationAction. NFC	2022-11-16 16:54:27 -08:00
Yeting Kuo	ed9638c44b	[VP][RISCV] Add vp.nearbyint and RISC-V support. nearbyint has the property to execute without exception. For not modifying fflags, the patch added new machine opcode PseudoVFROUND_NOEXCEPT_V that expands vfcvt.x.f.v and vfcvt.f.x.v between a pair of frflags and fsflags. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137685	2022-11-16 14:05:35 +08:00
Yeting Kuo	5c3ca10b09	[VP][RISCV] Add vp.bswap and RISC-V support. The patch also added function expandVPBSWAP to expand ISD::VP_BSWAP nodes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137928	2022-11-16 11:36:38 +08:00
wangpc	a214c521f8	[RISCV] Don't use zero-stride vector load for gather if not optimized We may form a zero-stride vector load when lowering gather to strided load. As D137699 did, we use `load+splat` for this form if there is no optimized implementation. We currently restrict this to unmasked loads because of the complexity of handling all-false masks. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D137931	2022-11-16 10:43:10 +08:00
Han-Kuan Chen	aa47bfa9bc	[RISCV] Refactor getDefaultVLOps. NFC. Current getDefaultVLOps can only deduce VL from a MVT. However, sometimes users have already known VL value. This commit will provide a uniform interface to get VL instead of calling DAG.getConstant. Differential Revision: https://reviews.llvm.org/D138003	2022-11-15 18:11:11 -08:00
Craig Topper	25dcca60f4	[RISCV] Teach shouldSinkOperands that vp.add and friends are commutative. We previously had a bug that our isel patterns weren't commutative, but that has been fixed for a while.	2022-11-14 22:01:59 -08:00
Craig Topper	dde8423f21	[RISCV] Expand i32 abs to negw+max at isel. This adds a RISCVISD::ABSW to remember that we started with an i32 abs. Previously we used a DAG combine of (sext_inreg (abs)) to delay emitting a freeze from type legalization in order to make ComputeNumSignBits optimizations work on other promoted nodes. This new approach always uses negw+max even if the result doesn't need to be sign extended. This helps the RISCVSExtWRemoval pass if the sext.w is in another basic block.	2022-11-14 19:44:05 -08:00
Yeting Kuo	0c0681b741	[RISCV][NFC] Remove dead code. All ISD::BSWAP nodes are not customized lowered in RISC-V now, so the patch removed dead code for ISD::BSWAP in LowerOperation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137907	2022-11-14 10:08:48 +08:00
Yeting Kuo	06a7e04be4	[RISCV][NFC] Fix unused variable warning. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D137633	2022-11-10 20:23:09 +08:00
melonedo	f4f6c63f0d	[RISCV] Add support for static chain The static chain parameter is a special parameter that is not passed in the usual argument registers or stack space. For example, in x64 System V ABI it is always passed in R10. Although the ABI of RISCV does not assign a register for this purpose, GCC had support for it on RISC-V a long time ago, and it is exposed via `__builtin_call_with_static_chain` intrinsic, and assign t2 for static chain parameters. This patch also chose t2 for compatibility. In LLVM, static chain parameters are handled by the `nest` attribute of an argument to a function ([D6332](https://reviews.llvm.org/D6332)), so tests are added to ensure `nest` arguments are handled correctly. Reviewed By: kito-cheng, MaskRay Differential Revision: https://reviews.llvm.org/D129106	2022-11-09 16:10:32 +08:00
Yeting Kuo	71e4e35581	[VP][RISCV] Add vp.rint and RISC-V support. FRINT uses dynamic rounding mode instead of static rounding mode. The patch rename VFCVT_X_F_VL to VFCVT_RM_X_F_VL for static rounding mode uses and added new ISDNode VFCVT_X_F_VL directly selected to PseudoVFCVT_X_F_V. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136662	2022-11-01 14:52:47 +08:00
Craig Topper	2a827e4a98	[RISCV] Fix crash when a vector add has a 4x sext and zext operand. We can narrow one of the extends and keep the other original by using a vwaddu.wv or vwadd.wv. We were previously forgetting to keep the original operand and instead took the source of its extend. This resulted in a type mismatch that later failed with an impossible physical register copy. To fix this I've refactored some code to maintain information about whether the source needs to be extended at all for longer so we could use it in materialize. Differential Revision: https://reviews.llvm.org/D137106	2022-10-31 15:10:27 -07:00
Craig Topper	6a794419cd	[RISCV] Optimize i64 insertelt on RV32. We can use tail undisturbed vslide1down to insert into the vector. This should make D136640 unneeded. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D136738	2022-10-28 10:23:19 -07:00
Craig Topper	e94dc58dff	[RISCV] Inline scalar ceil/floor/trunc/rint/round/roundeven. This avoids the call overhead as well as the save/restore of fflags and the snan handling in the libm function. The save/restore of fflags and snan handling are needed to be correct for -ftrapping-math. I think we can ignore them in the default environment. The inline sequence will generate an invalid exception for nan and an inexact exception if fractional bits are discarded. I've used a custom inserter to explicitly create the control flow around the float->int->float conversion. We can probably avoid the final fsgnj after the conversion for no signed zeros FMF, but I'll leave that for future work. Note the comparison constant is slightly different than glibc uses. They use 1<<53 for double, I'm using 1<<52. I believe either is valid. Numbers >= 1<<52 can't have any fractional bits. It's ok to do the float->int->float conversion on numbers between 1<<53 and 1<<52 since they will all fit in 64. We only have a problem if the double can't fit in i64. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D136508	2022-10-26 14:36:49 -07:00
Craig Topper	a61b74889f	[RISCV] Use vslide1down for i64 insertelt on RV32. Instead of using vslide1up, use vslide1down and build the other direction. This avoids the overlap constraint early clobber of vslide1up. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D136735	2022-10-26 09:43:12 -07:00
Craig Topper	63ed3d0eeb	[RISCV] Rename lowerFTRUNC_FCEIL_FFLOOR_FROUND to lowerVectorFTRUNC_FCEIL_FFLOOR_FROUND. NFC Extracted from D136508.	2022-10-24 20:32:22 -07:00
Craig Topper	ef72ff7b15	[RISCV] Fix unused variable warning. NFC	2022-10-22 22:29:03 -07:00
Craig Topper	7a4e56acac	[RISCV] Add an early out to lowerVECTOR_SHUFFLEAsVSlidedown. NFC If Mask[0] is 0, then we're never going to match a slidedown. If we get through the for loop, then it's an identity mask which should have already been optimized out. Otherwise it's some non-contiguous mask that will fail out of the loop. Might as well not bother entering the loop.	2022-10-18 21:35:15 -07:00
Han-Kuan Chen	615af94dc2	[RISCV] Lower VECTOR_SHUFFLE to VSLIDEDOWN_VL. Differential Revision: https://reviews.llvm.org/D136136	2022-10-18 08:58:39 -07:00
LiaoChunyu	7b970290c0	[RISCV] Optimize SELECT_CC when the true value of select is Constant (select (setcc lhs, rhs, CC), constant, falsev) -> (select (setcc lhs, rhs, InverseCC), falsev, constant) This patch removes unnecessary copies Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D129757	2022-10-18 09:24:17 +08:00
Craig Topper	2b32e4f98b	[RISCV] Add basic support for the sifive-7-series short forward branch optimization. sifive-7-series has macrofusion support to convert a branch over a single instruction into a conditional instruction. This can be an improvement if the branch is hard to predict. This patch adds support for the most basic case, a branch over a move instruction. This is implemented as a pseudo instruction so we can hide the control flow until all code motion passes complete. I've disabled a recent select optimization if this feature is enabled in the subtarget. Related gcc patch for the same optimization https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg211045.html Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135814	2022-10-17 13:56:22 -07:00
Craig Topper	e68b0d5875	[RISCV] Match (select C, -1, X)->(or -C, X) during lowerSelect Same with (select C, X, -1), (select C, 0, X), and (select C, X, 0). There's a DAGCombine after we turn the select into select_cc, but that may introduce a setcc that didn't previously exist. We could add more DAGCombines to remove the extra setcc, but this seemed lower effort. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135833	2022-10-13 09:06:12 -07:00
Philip Reames	1c41d0cb62	[RISCV] Use branchless form for selects with 0 in either arm Continuing the theme of adding branchless lowerings for simple selects, this time handle the 0 arm case. This is very common for various umin idioms, etc.. Differential Revision: https://reviews.llvm.org/D135600	2022-10-12 13:51:52 -07:00
Yeting Kuo	7329dc0cc3	[RISCV][NFC] Fix unused variable warning. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135365	2022-10-09 21:39:30 +08:00
Craig Topper	f749b2d9a5	[RISCV] Fix incorrect parenthesis placement in comment. NFC	2022-10-07 17:16:38 -07:00
Craig Topper	9f67047cf0	[VP][RISCV] Add vp.smax/smin/umax/umin intrinsics Differential Revision: https://reviews.llvm.org/D135418	2022-10-07 17:14:31 -07:00
eopXD	dbc681c98e	[VP][RISCV] Add vp.roundtozero and its RISC-V support The scalar instruction of this is `llvm.trunc`. However the naming of ISD::VP_TRUNC is already taken by `trunc` of the LLVM IR. Naming this as `vp.ftrunc` would likely cause confusion with `vp.fptrunc`. So adding `vp.roundtozero` that will look similar to `vp.roundeven`. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D135233	2022-10-07 02:15:23 -07:00
Philip Reames	79f0413e5e	[RISCV] Use branchless form for selects with -1 in either arm We can lower these as an or with the negative of the condition value. This appears to result in significantly less branch-y code on multiple common idioms (as seen in tests). Differential Revision: https://reviews.llvm.org/D135316	2022-10-06 15:18:43 -07:00
Philip Reames	04bb32e58a	[DAG] Extract helper for (neg x) [nfc] This is a frequently reoccurring pattern, let's factor it out. Differential Revision: https://reviews.llvm.org/D135301	2022-10-06 13:23:52 -07:00
Quentin Colombet	6e440ee2aa	[RISCV][ISel] Finally fix the UBSan error Forgot another SDValue check and a boolean initialization.	2022-10-05 21:43:09 +00:00
Quentin Colombet	6bbe7d376e	[RISCV][ISel] Attempt to fix UBSan error Explicitly check an SDValue with the invalid SDValue. UBSan reports: runtime error: load of value 36, which is not a valid value for type 'bool' https://lab.llvm.org/buildbot/#/builders/85/builds/11231	2022-10-05 20:59:28 +00:00
Quentin Colombet	c5c2de287e	[RISCV][ISel] Fold extensions when all the users can consume them This patch allows the combines that fold extensions in binary operations to have more than one use. The approach here is pretty conservative: if all the users of an extension can fold the extension, then the folding is done, otherwise we don't fold. This is the first step towards avoiding the one-use limitation. As a result, we make a decision to fold/don't fold for a web of instructions. An instruction is part of the web of instructions as soon as it consumes an extension that needs to be folded for all its users. Because of how SDISel works a web of instructions can be visited over and over. More precisely, if the folding happens, it happens for the whole web and that's the end of it, but if the folding fails, the whole web may be revisited when another member of the web is visited. To avoid a compile time explosion in pathological cases, we bail out earlier for webs that are bigger than a given threshold (arbitrarily set at 18 for now.) This size can be changed using `--riscv-lower-ext-max-web-size=<maxWebSize>`. At the current time, I didn't see a better scheme for that. Assuming we want to stick with doing that in SDISel. Differential Revision: https://reviews.llvm.org/D133739	2022-10-05 20:49:21 +00:00
Quentin Colombet	4852f26acd	[RISCV][ISel] Refactor the formation of VW operations This patch centralizes all the combines of add\|sub\|mul with extended operands in one "framework". The rationale for this change is to offer a one-stop-shop for all these transformations so that, in the future, it is easier to make combine decisions for a web of instructions (i.e., instructions connected through s\|zext operands). Technically this patch is not NFC because the new version is more powerful than the previous version. In particular, it diverges in two cases: - VWMULSU can now also be produced from `mul(splat, zext)`, whereas previously only `mul(sext, splat)` were supported when `splat`s were involved. (As demonstrated in rvv/fixed-vectors-vwmulsu.ll) - VWSUB(U) can now also be produced from `sub(splat, ext)`, whereas previously only `sub(ext, splat)` were supported when `splat`s were involved. (As demonstrated in rvv/fixed-vectors-vwsub.ll) If we wanted, we could block these transformations to make this patch really NFC. For instance, we could do something similar to `AllowSplatInVW_W`, which prevents the combines to form vw(add\|sub)(u)_w when the RHS is a splat. Regarding the "framework" itself, the bulk of the patch is some boilerplate code that abstracts away the actual extensions that are present in the DAG. This allows us to handle `vwadd_w(ext a, b)` as if it was a regular `add(ext a, ext b)`. Since the node `ext b` doesn't actually exist in the DAG, we have a bunch of methods (all in the NodeExtensionHelper class) that fake all that for us. The other half of the change is around `CombineToTry` and `CombineResult`. These helper structures respectively: - Represent the kind of combines that can be applied to a node, and - Store what needs to happen to do that combine. This can be viewed as a two step approach: - First, check if a pattern applies, and - Second apply it. The checks and the materialization of the combines are decoupled so that in the future we can perform several checks and do all the related applies in one go. Differential Revision: https://reviews.llvm.org/D134703	2022-10-05 17:43:48 +00:00
Craig Topper	ece4bb5ab8	[RISCV] Teach SExtWRemoval to recognize sign extended values that come from arguments. This information is not preserved in MIR today. So this patch adds information to RISCVMachineFunctionInfo when the vreg is created for the argument. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D134621	2022-10-04 15:39:10 -07:00
Craig Topper	b41fe90dc3	[RISCV] Correct the setcc in vp.floor/ceil/round/roundeven lowering. We want to emit a masked setcc that preserves zeros in all of the bits where the original mask is zero. To do this we need to pass the original mask as the passthru operand as well. Otherwise, we'll use the mask agnostic policy and replace the zeros with 1s on some CPUs. Differential Revision: https://reviews.llvm.org/D135122	2022-10-03 20:58:05 -07:00
Philip Reames	e884324145	[RISCV] Generalize select (and (x , 0x1) == 0), y, (z ^ y) ) and select (and (x , 0x1) == 0), y, (z \| y) ) transforms by removing and-clause These transforms were recently added (by me) in D134881. Looking at the code again, I realized we don't need the (and x, 0x1) portion of the pattern, we just need to know that the result of that sub-tree is either 0 or 1. Checking for this directly allows us to match slightly more broadly. The test changes are zext i1 arguments, but this could also kick in for e.g. shifts of high bits, or any other source of known bits. Differential Revision: https://reviews.llvm.org/D135081	2022-10-03 13:57:38 -07:00
Philip Reames	a200b0fc25	[DAG] Introduce getSplat utility for common dispatch pattern [nfc] We have a very common pattern of dispatching between BUILD_VECTOR and SPLAT_VECTOR creation repeated in many cases in code. Common the pattern into a utility function.	2022-10-03 12:49:39 -07:00
Craig Topper	f3e87a63e5	[RISCV] Add missing VL arguments to the creation of RISCVISD::VMV_V_X_VL nodes. VMV_V_X_VL nodes should always have a passthru, a splat, and a VL. We were sometimes missing the VL. This went unnoticed because these cases were all selected into the following node to form a .vx or .vi instruction. The ComplexPattern that does this, doesn't check the VL operand. I've added an assert to the ComplexPattern to catch if the operand is missing. @qcolombet spotted some of these in D134703.	2022-10-03 12:21:05 -07:00
Craig Topper	5b06ccb611	Revert "foo" This reverts commit `2138ef354a`. Forgot to squash	2022-10-03 12:15:41 -07:00
Craig Topper	a55cdcae3e	Revert "[RISCV] Add missing VL arguments to the creation of RISCVISD::VMV_V_X_VL nodes." This reverts commit `4c03c9f375`. Forgot to squash	2022-10-03 12:15:28 -07:00
Craig Topper	4c03c9f375	[RISCV] Add missing VL arguments to the creation of RISCVISD::VMV_V_X_VL nodes. VMV_V_X_VL nodes should always have a passthru, a splat, and a VL. We were sometimes missing the VL. This went unnoticed because these cases were all selected into the following node to form a .vx or .vi instruction. The ComplexPattern that does this, doesn't check the VL operand. I've added an assert to the ComplexPattern to catch if the operand is missing. @qcolombet spotted some of these in D134703.	2022-10-03 12:13:21 -07:00
Craig Topper	2138ef354a	foo	2022-10-03 12:13:20 -07:00
Yeting Kuo	cefb7aab61	[VP][RISCV] Add vp.copysign and RISC-V support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134935	2022-10-01 10:19:10 +08:00
Philip Reames	1e3c179519	[RISCV] Address post commit review comments from D134881	2022-09-30 08:31:40 -07:00
Philip Reames	2b5960028e	[RISCV] Branchless lowering for select (and (x , 0x1) == 0), y, (z ^ y) ) and select (and (x , 0x1) == 0), y, (z \| y) ) This code is directly ported from the X86 backend which applies the same rewrite (along with several others). Planning on looking more closely at the other branchless variants from x86 to see if any are worth porting in future changes. Motivation here is the coremark crc8 routine from https://github.com/eembc/coremark/blob/main/core_util.c#L165. This patch significantly reduces the number of unpredictable branches in the workload. Differential Revision: https://reviews.llvm.org/D134881	2022-09-30 08:24:32 -07:00
Ray Wang	4c786c9747	[RISCV] Remove some unused var decl. NFC Differential Revision: https://reviews.llvm.org/D134707	2022-09-30 08:08:15 -07:00
Yeting Kuo	1cc02b05b7	[SelectionDAG] Add helper function to check whether a SDValue is neutral element. NFC. Using this helper makes working with neutral elements easier. Although I only find one case now, I think it will have more chances to be used since so many combines are related to neutral elements. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D133866	2022-09-30 11:29:11 +08:00
eopXD	02a982829c	[RISCV] Add lowering for llvm.roundeven Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134785	2022-09-29 06:08:14 -07:00
eopXD	9677d70eb2	[VP][RISCV] Add vp.floor, vp.round, vp.roundeven and their RISC-V support Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134759	2022-09-27 19:45:58 -07:00
Han-Kuan Chen	c595c874cb	[RISCV] Lower BUILD_VECTOR to RISCVISD::VID_VL if it is floating-point type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D133688	2022-09-27 17:25:34 -07:00
eopXD	163cb33854	[VP][RISCV] Add vp.ceil and RISC-V support Previous commit `8b00b24f85` missed to add `int_ceil` anchor for the llvm.ceil.* section under LangRef.rst Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134586	2022-09-27 12:04:09 -07:00
eopXD	384b8b3da7	Revert "[VP][RISCV] Add vp.ceil and RISC-V support" This reverts commit `8b00b24f85`.	2022-09-27 11:12:57 -07:00
eopXD	8b00b24f85	[VP][RISCV] Add vp.ceil and RISC-V support Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134586	2022-09-27 11:08:27 -07:00
Yeting Kuo	04e1301f3d	[VP][RISCV] Add vp.maxnum and vp.minnum intrinsics and RISC-V support. Add vp.maxnum and vp.minnum, which are the vector-predicated intrinsics of llvm.maxnum and llvm.minnum. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134639	2022-09-27 13:36:45 +08:00
Yeting Kuo	43c5fbdd3a	[VP][RISCV] Add vp.sqrt intrinsic and RISC-V support. The patch modeled vp.fabs patch D132793. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D133690	2022-09-26 10:47:40 +08:00
Philip Reames	6e7c54ecaf	[RISCV] Add lowering for scalable @llvm.riscv.masked.strided.load/store The code previously assumed fixed length vectors; make the relevant code conditional. Having the lowering in place is necessary for an upcoming change to generalize scatter/gather matching to scalable vectors. Differential Revision: https://reviews.llvm.org/D134489	2022-09-24 17:41:57 -07:00
Craig Topper	19850cc2d8	Revert "[RISCV] Lower BUILD_VECTOR to RISCVISD::VID_VL if it is floating-point type." This reverts commit `dd53a0bb30`. We have seen crashes from this internally. Probably due to the use of RoundingMode::Dynamic.	2022-09-23 18:41:41 -07:00
Craig Topper	90a5d8499a	[RISCV] Promote f16 STRICT_FCEIL/FLOOR/TRUNC/NEARBYINT/RINT/ROUND,ROUNDEVEN to f32.	2022-09-23 14:01:51 -07:00
Philip Reames	60c91fd364	[RISCV] Disallow scale for scatter/gather RISCV doesn't actually support a scaled form of indexed load and store. We previously handled this by forming the scaled SDNode, and then doing custom legalization during lowering. This patch instead adds a callback via TLI to prevent formation entirely. This has two effects: * First, the GEP gets expanded (and used). Instead of the shift being created with an SDLoc of the memory operation, it has the SDLoc of the GEP instruction. This avoids the scheduler perturbing IR order when there's no reason to. * Second, we fix what appears to be a bug in index calculation with RV32. The rules for GEPs require index calculation be done in particular bitwidth, and it appears the custom legalization code got this wrong for the case where index type exceeds pointer width. (Or at least, I trust the generic GEP lowering to be correct a lot more.) The DAGCombiner change to handle VPScatter/VPGather is technically separate, but is required to prevent a regression on those intrinsics. Differential Revision: https://reviews.llvm.org/D134382	2022-09-22 15:31:26 -07:00
Craig Topper	52708be182	[RISCV] Remove support for the unratified Zbe, Zbf, and Zbm extensions. These extensions do not appear to be on their way to ratification.	2022-09-22 13:04:41 -07:00
Fraser Cormack	92d71c615d	[RISCV] Use structured bindings in common RVV lowering code This patch uses structured bindings to simplify a couple of specific cases when lowering RVV operations where we commonly declare two SDValues and immediately 'tie' them to the mask and vector length. There's also a couple places where we split vectors that structured bindings make sense to use. This patch tries to keep these sorts of changes minimal and to cases where the returned types are commonly understood, rather than applying this wholesale to the RISCV backend. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D134442	2022-09-22 16:38:40 +01:00
Craig Topper	bf7c7696fe	[RISCV] Improve support for vector fp_to_sint_sat/uint_sat. The default fixed vector legalization is to unroll. The default scalable vector legalization is to clamp in the FP domain. The RVV vfcvt instructions have saturating behavior so we can use them directly. The only difference is that RVV instructions turn nan into the max value, but the _SAT intrinsics want 0. I'm only supporting 1 step of narrowing for now. I think we can support more steps by using VNCLIP to saturate and narrow. The only case that needs 2 steps of widening is f16->i64 which we can do as f16->f32->i64. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D134400	2022-09-22 08:13:48 -07:00
Craig Topper	8b8e18e11f	[RISCV] Replace RISCVISD::GREV/GORC/SHFL/UNSHFL with BREV8/ORC_B/ZIP/UNZIP. With Zbp removed, we no longer need the generalized forms. The computeKnownBitsForTargetNode code brev8/orc.b is still based on the general form with the shift amount forced to 7.	2022-09-21 21:57:59 -07:00
Craig Topper	182aa0cbe0	[RISCV] Remove support for the unratified Zbp extension. This extension does not appear to be on its way to ratification. Still need some follow up to simplify the RISCVISD nodes.	2022-09-21 21:22:42 -07:00
Craig Topper	1d8a7adca6	[RISCV] Rename RISCVISD::SINT_TO_FP_VL/UINT_TO_FP_VL. NFC Name them after the instructions VFCVT_RTZ_X(U)_F_VL to make it clear that the ISD nodes don't have the poison semantics of ISD::SINT_TO_FP/UINT_TO_FP. I plan to reuse this node for a FP_TO_SINT_SAT/FP_TO_UINT_SAT patch and need the instruction semantics.	2022-09-21 15:33:04 -07:00
Craig Topper	70a64fe7b1	[RISCV] Remove support for the unratified Zbt extension. This extension does not appear to be on its way to ratification. Out of the unratified bitmanip extensions, this one had the largest impact on the compiler. Posting this patch to start a discussion about whether we should remove these extensions. We'll talk more at the RISC-V sync meeting this Thursday. Reviewed By: asb, reames Differential Revision: https://reviews.llvm.org/D133834	2022-09-20 20:26:48 -07:00
LiaoChunyu	2e74157ad4	[RISCV] Preserve (and X, 0xffff) in targetShrinkDemandedConstant. shrinkdemandedconstant does some optimizations but is not very friendly to RISC-V, so use targetShrinkDemandedConstant to limit the damage. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134155	2022-09-19 14:19:38 +08:00
LiaoChunyu	8fee91c435	[RISCV][NFC]Remove outdated comment from targetShrinkDemandedConstant Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134154	2022-09-19 10:23:06 +08:00
Craig Topper	61595c45af	[RISCV] Simplify some code in vector fp<->int handling. NFC We changed the way container types are selected since this code was written. We no longer need to use the largest type.	2022-09-16 12:56:42 -07:00
Sergei Barannikov	c6acb4eb0f	[SDAG] Add `getCALLSEQ_END` overload taking `uint64_t`s All in-tree targets pass pointer-sized ConstantSDNodes to the method. This overload reduces the amount of boilerplate code a bit. This also makes getCALLSEQ_END consistent with getCALLSEQ_START, which already takes uint64_ts.	2022-09-15 14:02:12 -04:00
Han-Kuan Chen	dd53a0bb30	[RISCV] Lower BUILD_VECTOR to RISCVISD::VID_VL if it is floating-point type. Differential Revision: https://reviews.llvm.org/D133688	2022-09-13 18:50:20 -07:00
Craig Topper	8d7e73effe	[RISCV] Teach lowerVECTOR_SHUFFLE to recognize some shuffles as vnsrl. Unary shuffles such as <0,2,4,6,8,10,12,14> or <1,3,5,7,9,11,13,15>, where half the elements are returned, can be lowered using vnsrl. SelectionDAGBuilder lowers such shuffles as a build_vector of extract_elements since the mask has fewer elements than the source. To fix this, I've enabled the extractSubvectorIsCheapHook to allow DAGCombine to rebuild the shuffle using 2 extract_subvectors preceding the shuffle. I've gone very conservative on extractSubvectorIsCheapHook to minimize test impact and match what we have test coverage for. This can be improved in the future. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D133736	2022-09-13 11:07:11 -07:00
Alex Bradbury	c44c1e9d3e	[RISCV] Implement isMaskAndCmp0FoldingBeneficial hook This hook is currently only used by CodeGenPrepare, which will sink and duplicate an 'and' into a block that has an 'icmp 0' user of it if the hook returns true. This hook is less useful for RISC-V than for targets like AArch64 that have a TBZ (test bit and branch if zero instruction), but may still be profitable if Zbs is available and a BEXTI can be selected. Conservatively, we return false even if Zbs is enabled for any masks that fit in the ANDI immediate because it's possible the only use is a branch on the result, and ANDI+BNEZ => BEXTI+BNEZ isn't a profitable transformation. Differential Revision: https://reviews.llvm.org/D131492	2022-09-13 18:54:00 +01:00
Alex Bradbury	547160848c	[RISCV] Return true in hasBitTest when Zbs is enabled and update BEXTI pattern for resulting canonicalisation As the Zbs extension includes bext[i] for bit extract, we can unconditionally return true from this hook. This hook causes the DAG combiner to perform the following canonicalisation: `and (not (srl X, C)), 1 --> (and X, 1<<C) == 0`; `and (srl (not X), C), 1 --> (and X, 1<<C) == 0`. As simply changing the hook causes a codegen regression, this patch also modifies a BEXTI pattern to match this canonicalised form. As BSETINVMask is now used for BEXT as well as BSET and BINV, it has been renamed to the more generic SingleBitSetMask. There is one codegen change in bittest.ll for bittest_31_i64 (NOT+BEXTI rather than NOT+SRLIW). This is neutral in terms of code quality. Differential Revision: https://reviews.llvm.org/D131482	2022-09-13 16:51:47 +01:00
Craig Topper	5224bae613	[RISCV] Fix a bug in i32 FP_TO_UINT_SAT lowering on RV64. We use the saturating behavior of fcvt.wu.h/s/d but forgot to take into account that fcvt.wu will sign extend the saturated result. According to computeKnownBits a promoted FP_TO_UINT_SAT is expected to zero extend the saturated value. In many cases the upper bits aren't demanded so this wouldn't be an issue. But if computeKnownBits caused an AND to be removed it would be a bug. This patch inserts an AND to zero the upper bits. Unfortunately, this pessimizes code if we aren't able to tell if the upper bits are demanded. To fix that we could custom type promote the FP_TO_UINT_SAT with SEXT_INREG after it, but I'll leave that for future work. I haven't found a failure from this; I was revisiting the code to add vector support and spotted it. Differential Revision: https://reviews.llvm.org/D133746	2022-09-13 08:41:32 -07:00
Craig Topper	4186a49d79	[RISCV] Custom type legalize i32 loads by sign extending. The default is to use extload which can become a zextload or sextload if it is followed by an 'and' or sext_inreg. Sometimes type legalization will introduce an 'and' from promoting something like 'srl X, C' and a sext_inreg from a setcc. The 'and' could be freely folded with the promoted 'srl' by using srliw, but the sext_inreg can't be folded into a compare. DAG combiner will see both of these choices and may decide to fold the 'and' instead of the 'sext_inreg'. This forces the sext_inreg to become a sext.w. By picking sextload in the type legalizer we take this choice away. Looking at spec2006 compiled with Zba and Zbb this appeared to be a net reduction in lines of code in the objdump disassembly output. This is similar to what we do with i32 add/sub/mul/shl in type legalization where we always emit a sext_inreg. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D130397	2022-09-12 09:13:07 -07:00
Joe Loser	5e96cea1db	[llvm] Use std::size instead of llvm::array_lengthof LLVM contains a helpful function for getting the size of a C-style array: `llvm::array_lengthof`. This is useful prior to C++17, but not as helpful for C++17 or later: `std::size` already has support for C-style arrays. Change call sites to use `std::size` instead. Differential Revision: https://reviews.llvm.org/D133429	2022-09-08 09:01:53 -06:00

921 Commits