llvm-project

Commit Graph

Author	SHA1	Message	Date
Fraser Cormack	3f0df4d7b0	[RISCV] Expand scalable-vector truncstores and extloads Caught in internal testing, these operations are assumed legal by default, even for scalable vector types. Expand them back into separate truncations and stores, or loads and extensions. Also add explicit fixed-length vector tests for these operations, even though they should have been correct already. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99654	2021-04-05 17:03:45 +01:00
Craig Topper	4708a05da0	[RISCV] Use gorciw for i32 orc.b intrinsic when Zbp is enabled. The W version of orc.b does not exist in Zbp so we need to use gorci encoding. If we have Zbp, we can use gorciw which can avoid a sext.w in some cases.	2021-04-04 17:14:28 -07:00
Craig Topper	98d5db3e3a	[RISCV] Lower orc.b intrinsic to RISCVISD::GORCI. This will allow us to share any future known bits, demanded bits, or sign bits improvements.	2021-04-04 12:31:41 -07:00
Craig Topper	a2ea003fcb	[RISCV] Don't convert fshr/fshl to target specific FSL/FSR node if shift amount is a constant. As long as it's a constant we can directly pattern match it without any problems. It's only when it isn't a constant that we need to add an AND. In theory this should allow more target independent optimizations to remain active.	2021-04-03 23:13:30 -07:00
Levy Hsu	f78d932cf2	[RISCV] Add IR intrinsics for Zbc extension Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: clmul clmulh clmulr Differential Revision: https://reviews.llvm.org/D99711	2021-04-02 12:09:13 -07:00
Levy Hsu	944adbf285	Recommit "[RISCV] Add IR intrinsic for Zbb extension" Forgot to amend the Author. Original commit message: Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b Differential Revision: https://reviews.llvm.org/D99320	2021-04-02 11:50:19 -07:00
Craig Topper	1f0b309f24	Revert "[RISCV] Add IR intrinsic for Zbb extension" This reverts commit `1808194590`. I forgot to change the author.	2021-04-02 11:47:02 -07:00
Craig Topper	1808194590	[RISCV] Add IR intrinsic for Zbb extension Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b	2021-04-02 11:23:57 -07:00
Levy Hsu	b001d574d7	[RISCV] Add IR intrinsic for Zbr extension Implementation for RISC-V Zbr extension intrinsic. Header files are included in separate patch in case the name needs to be changed RV32 / 64: crc32b crc32h crc32w crc32cb crc32ch crc32cw RV64 Only: crc32d crc32cd Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99009	2021-04-02 10:58:45 -07:00
Craig Topper	d7ffa82a8e	[RISCV] Improve 64-bit integer constant materialization for more cases. For positive constants we try shifting left to remove leading zeros and fill the bottom bits with 1s. We then materialize that constant and shift it right. This patch adds a new strategy to try filling the bottom bits with zeros instead. This catches some additional cases.	2021-04-02 10:18:08 -07:00
Fraser Cormack	3b48d849d4	[RISCV] Optimize more redundant VSETVLIs D99717 introduced some test cases which showed that the output of one vsetvli into another would not be picked up by the RISCVCleanupVSETVLI pass. This patch teaches the optimization about such a pattern. The pattern is quite common when using the RVV vsetvli intrinsic to pass the VL onto other intrinsics. The second test case introduced by D99717 is left unoptimized by this patch. It is a rarer case and will require us to rewire any uses of the redundant vset[i]vli's output to the previous one's. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99730	2021-04-02 10:04:07 +01:00
Craig Topper	766d27dc85	[RISCV] Add isel patterns to handle vrsub intrinsic with 2 vector operands. This occurs when we type legalize an i64 scalar input on RV32. We need to manually splat, which requires a vector input. Rather than special case this in lowering just pattern match it.	2021-04-01 14:10:21 -07:00
Craig Topper	dbbc95e3e5	[RISCV] Use softPromoteHalf legalization for fp16 without Zfh rather than PromoteFloat. The default legalization strategy is PromoteFloat which keeps half in single precision format through multiple floating point operations. Conversion to/from float is done at loads, stores, bitcasts, and other places that care about the exact size being 16 bits. This patch switches to the alternative method softPromoteHalf. This aims to keep the type in 16-bit format between every operation. So we promote to float and immediately round for any arithmetic operation. This should be closer to the IR semantics since we are rounding after each operation and not accumulating extra precision across multiple operations. X86 is the only other target that enables this today. See https://reviews.llvm.org/D73749 I had to update getRegisterTypeForCallingConv to force f16 to use f32 when the F extension is enabled. This way we can still pass it in the lower bits of an FPR for ilp32f and lp64f ABIs. The softPromoteHalf would otherwise always give i16 as the argument type. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D99148	2021-04-01 12:41:57 -07:00
Craig Topper	d157e3f387	[RISCV] Fix handling of nxvXi64 vmsgt(u).vx intrinsics on RV32. We need to splat the scalar separately and use .vv, but there is no vmsgt(u).vv. So add isel patterns to select vmslt(u).vv with swapped operands. We also need to get VT to use for the splat from an operand rather than the result since the result VT is nxvXi1. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D99704	2021-04-01 10:38:05 -07:00
Craig Topper	b7c2e577cc	[RISCV] Add custom type legalization to form MULHSU when possible. There's no target independent ISD opcode for MULHSU, so custom legalize 2*XLen multiplies ourselves. We have to be a little careful to prefer MULHU or MULHSU. I thought about doing this in isel by pattern matching the (add (mul X, (srai Y, XLen-1)), (mulhu X, Y)) pattern. I decided against this because the add might become part of a chain of adds. I don't trust DAG combine not to reassociate with other adds making it difficult to find both pieces again. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D99479	2021-04-01 10:15:55 -07:00
Craig Topper	d61b40ed27	[RISCV] Improve 64-bit integer materialization for some cases. This adds a new integer materialization strategy mainly targeted at 64-bit constants like 0xffffffff where there are 32 or more trailing ones with leading zeros. We can materialize these by using an addi -1 and srli to restore the leading zeros. This matches what gcc does. I haven't limited to just these cases though. The implementation here takes the constant, shifts out all the leading zeros and shifts ones into the LSBs, creates the new sequence, adds an srli, and checks if this is shorter than our original strategy. I've separated the recursive portion into a standalone function so I could append the new strategy outside of the recursion. Since external users are no longer using the recursive function, I've cleaned up the external interface to return the sequence instead of taking a vector by reference. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D98821	2021-04-01 09:12:52 -07:00
Craig Topper	c88ee1a094	[RISCV] Add UnsupportedSchedZfh multiclass to reduce duplicate lines from RISCVSchedRocket.td and RISCVSchedSiFive7.td. NFC	2021-03-31 15:06:14 -07:00
Craig Topper	2a8b7cab6a	[RISCV] Add RISCVISD opcodes for CLZW and CTZW. Our CLZW isel pattern is quite easily broken by surrounding code preventing it from matching sometimes. This usually results in failing to remove the and X, 0xffffffff inserted by type legalization. The add with -32 that type legalization also inserts will often get combined into other add/sub nodes. That doesn't usually result in extra code when we don't use clzw. CTTZ seems to be less fragile, but I wanted to keep it consistent with CTLZ. Reviewed By: asb, HsiangKai Differential Revision: https://reviews.llvm.org/D99317	2021-03-31 09:40:07 -07:00
Craig Topper	04f10ab367	[RISCV] Add isel patterns to select vsub_vx intrinsic to vadd.vi if it uses a small enough immediate Also modify the simm5_plus1 check because Imm-1 is UB if Imm happens to be INT64_MIN. I don't think the compiler would optimize based on that in this usage, but it could fail UBSan or -ftrapv. Reviewed By: HsiangKai, frasercrmck Differential Revision: https://reviews.llvm.org/D99637	2021-03-31 09:26:41 -07:00
Fraser Cormack	10fc6e4358	[RISCV] Add support for the stepvector intrinsic This adds almost everything required for supporting the new stepvector intrinsic on RVV. It is lowered to the existing VID_VL SDNode. The only exception is a limitation that RV32 cannot yet lower the intrinsic on i64 vectors. This is because the step operand is (currently) required to be at least as large as the vector element type. I will look into patching that out and loosening the requirement to only an integer pointer type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99594	2021-03-31 11:41:17 +01:00
Craig Topper	5db19cc010	[RISCV] simm12_plus1 should not inherit from Operand. NFC We only use this in Pat patterns, so it just needs to be an ImmLeaf. If we did need it as an instruction operand, the ParserMatchClass, EncoderMethod, and DecoderMethod were probably wrong.	2021-03-30 19:02:11 -07:00
Craig Topper	05998701b9	[RISCV] Remove some unused ImmLeafs. NFC These got left behind when we switched RV32 to use selectImm to match RV64.	2021-03-30 18:54:11 -07:00
Craig Topper	a33fcafaf0	[RISCV] Pass 'half' in the lower 16 bits of an f32 value when F extension is enabled, but Zfh is not. Without Zfh the half type isn't legal, but it could still be used as an argument/return in IR. Clang will not generate this today. Previously we promoted the half value to float for arguments and returns if the F extension is enabled but Zfh isn't. Then depending on which ABI is enabled we would pass it in either an FPR or a GPR in float format. If the F extension isn't enabled, it would get passed in the lower 16 bits of a GPR in half format. With this patch the value will always be in half format and will be in the lower bits of a GPR or FPR. This should be consistent with where the bits are located when Zfh is enabled. I've based this implementation off of how this is done on ARM. I've manually nan-boxed the value to 32 bits using integer ops. It looks like flw, fsw, fmv.s, fmv.w.x, fmv.x.w won't canonicalize nans so should leave the value alone. I think those are the instructions that could get used on this value. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D98670	2021-03-30 09:47:54 -07:00
Tomas Matheson	a9968c0a33	[NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions Currently needsStackRealignment returns false if canRealignStack returns false. This means that the behavior of needsStackRealignment does not correspond to its name and description; a function might need stack realignment, but if it is not possible then this function returns false. Furthermore, needsStackRealignment is not virtual and therefore some backends have made use of canRealignStack to indicate whether a function needs stack realignment. This patch attempts to clarify the situation by separating them and introducing new names: - shouldRealignStack - true if there is any reason the stack should be realigned - canRealignStack - true if we are still able to realign the stack (e.g. we can still reserve/have reserved a frame pointer) - hasStackRealignment = shouldRealignStack && canRealignStack (not target customisable) Targets can now override shouldRealignStack to indicate that stack realignment is required. This change will make it easier in a future change to handle the case where we need to realign the stack but can't do so (for example when the register allocator creates an aligned spill after the frame pointer has been eliminated). Differential Revision: https://reviews.llvm.org/D98716 Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87	2021-03-30 17:31:39 +01:00
Craig Topper	f069000b43	[RISCV] Remove floating point condition code legalization from lowerFixedLengthVectorSetccToRVV. After D98939, this is done by LegalizeVectorOps making this code dead. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D99519	2021-03-30 09:11:56 -07:00
Evandro Menezes	fd94cfeeb5	[RISCV] Move scheduling resources for B into a separate file (NFC) Differential Revision: https://reviews.llvm.org/D99557	2021-03-29 20:37:22 -05:00
Craig Topper	3dd4aa7d09	[RISCV] When custom iseling masked loads/stores, copy the mask into V0 instead of virtual register. This matches what we do in our isel patterns. In our internal testing we've found this is needed to make the fast register allocator happy at -O0. Otherwise it may assign V0 to an earlier operand and find itself with no registers left when it reaches the mask operand. By using V0 explicitly, the fast register allocator will see it when it checks for phys register usages before it starts allocating vregs. I'll try to update this with a test case. Unfortunately, this does appear to prevent some instruction reordering by the pre-RA scheduler which leads to the increased spills seen in some tests. I suspect that problem could already occur for other instructions that already used V0 directly. There's a lot of repeated code here that could do with some wrapper functions. Not sure if that should be at the level of the new code that deals with V0. That would require multiple output parameters to pass the glue, chain and register back. Maybe it should be at a higher level over the entire set of push_backs. Reviewed By: frasercrmck, HsiangKai Differential Revision: https://reviews.llvm.org/D99367	2021-03-29 10:20:43 -07:00
Roger Ferrer Ibanez	ef76a333fa	[RISCV] Fix offset computation for RVV In D97111 we changed the RVV frame layout when using sp or bp to address the stack slots so we could address the emergency stack slot. The idea is to put the RVV objects as far as possible (in offset terms) from the frame reference register (sp / fp / bp). When using fp this happens naturally because the RVV objects are already the top of the stack and due to the constraints of RVV (VLENB being a power of two >= 128) the stack remains aligned. The rest of this summary does not apply to this case. When using sp / bp we need to skip the non-RVV stack slots. The size of the non-RVV objects is computed by subtracting the callee saved register size (whose computation is added in D97111 itself) from the total size of the stack (which does not account for RVV stack slots). However, when doing so we round to 16 bytes when computing that size and we end up emitting a smaller offset that may belong to a scalar stack slot (see D98801). So this change removes that rounding. Also, because we want the RVV objects to be between the non-RVV stack slots and the callee-saved register slots, we need to make sure the RVV objects are properly aligned to 8 bytes. Adding a padding of 8 would render the stack unaligned. So when allocating space for RVV (only when we don't use fp) we need to have extra padding that preserves the stack alignment. This way we can round to 8 bytes the offset that skips the non-RVV objects and we do not misalign the whole stack in the way. In some circumstances this means that the RVV objects may have padding before (=lower offsets from sp/bp) and after (before the CSR stack slots). Differential Revision: https://reviews.llvm.org/D98802	2021-03-29 17:03:49 +00:00
Hsiangkai Wang	bc82e9bf25	[RISCV] Add vfabs.v pseudo instruction. Differential Revision: https://reviews.llvm.org/D99454	2021-03-28 10:24:05 +08:00
Craig Topper	5692fc38e0	[RISCV] Add a pattern for (sext_inreg (mul (and X, 0xffffffff), (and Y, 0xffffffff)), i32) to suppress MULW formation We have a special pattern for (mul (and X, 0xffffffff), (and Y, 0xffffffff)), to optimize the ANDs to shift. But if a sext_inreg comes first, we'll form a MULW and limit the effectiveness of the special match. So this patch adds a larger pattern to suppress the MULW formation by emitting a sext.w and then the same output we use for the (mul (and X, 0xffffffff), (and Y, 0xffffffff)). This should all get CSEd. This is the issue I was trying to fix with D99029, but that affected many more tests.	2021-03-27 15:37:18 -07:00
Craig Topper	4d5ee71b52	[RISCV] Merge FMulAdd and FMulSub scheduler classes to a single FMA scheduler class. NFC It's unlikely that FMADD and FMSUB would have different scheduling information so merge them. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D99140	2021-03-26 16:37:20 -07:00
Craig Topper	c41f2f6492	[RISCV] Add scheduler classes for the Zba and Zbb extensions. I've used IALU for the simplest operations from Zbb: min, minu, max, maxu, sext.b, sext.h, zext.h, andn, orn, xnor I've put add.uw in IALU32 and slli.uw in ShiftImm32. Remaining instructions have received new classes. All 3 shadd are grouped together. shadd.uw are grouped together. Rotate left and right are together. Everything else got their own class containing one instruction. I think what I have here is the minimum granularity we need. I could be convinced that we need more classes. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D99040	2021-03-26 14:15:29 -07:00
Zakk Chen	9049cf77e3	[RISCV] Add constraint for RVV indexed loads. Add the constraint when the destination EEW does not equal the source EEW for correctness. The RVV spec has three register overlap rules and I implement the first stricter constraint because the others are difficult to enforce. Reviewed By: frasercrmck, craig.topper Differential Revision: https://reviews.llvm.org/D98920	2021-03-26 07:23:24 -07:00
Craig Topper	8f62a80328	[RISCV] Optimize (and (shl GPR:, uimm5:), 0xffffffff) to use 2 shifts instead of 3. The and would normally become SLLI+SRLI, giving us 2 SLLI+SRLI. We can detect this and combine the 2 SLLIs into 1.	2021-03-25 23:31:01 -07:00
Craig Topper	5a18c576c4	[RISCV] Don't call CheckAndMask from selectZExti32. Now that targetShrinkDemandedConstant preserves 0xffffffff masks we shouldn't need to call computeKnownBits here.	2021-03-25 22:07:41 -07:00
Craig Topper	5797feaa55	[RISCV] Reorder checks in RISCVTTIImpl::getGatherScatterOpCost to avoid calling getMinRVVVectorSizeInBits() when V extension is not enabled. getMinRVVVectorSizeInBits() asserts if the V extension isn't enabled. So check that gather/scatter is legal first since it already contains a check for V extension being enabled. It also already checks getMinRVVVectorSizeInBits for fixed length vectors so we don't need a check in getGatherScatterOpCost.	2021-03-25 14:20:47 -07:00
Craig Topper	c40cea6f08	[RISCV] Teach targetShrinkDemandedConstant to preserve (and X, 0xffffffff). We look for this pattern frequently in isel patterns so it's a good idea to try to preserve it. This also lets us remove our special isel handling for srliw and use a direct pattern match of (srl (and X, 0xffffffff), C) since no bits will be removed from the and mask. Differential Revision: https://reviews.llvm.org/D99042	2021-03-25 09:03:25 -07:00
Fraser Cormack	99211352c1	[RISCV] Optimize select-like vector shuffles This patch adds a small optimization for vector shuffle lowering, detecting shuffles which can be re-expressed as vector selects. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99270	2021-03-25 11:39:57 +00:00
Fraser Cormack	321a71a772	[RISCV] Optimize BUILD_VECTOR sequences that reveal hidden splats This patch adds further optimization techniques to RVV BUILD_VECTOR lowering. It teaches the compiler to find splats of larger vector element types "hidden" in smaller ones. For example, a v4i8 build_vector (0x1, 0x2, 0x1, 0x2) could be splat as v2i16 0x0201. This is generally more optimal than the dominant-element BUILD_VECTORs and so takes priority. This optimization is currently limited to all-constant-or-undef BUILD_VECTORs as those were found to be the most common. There's no reason this couldn't be extended to other BUILD_VECTORs, but the additional bit-manipulation instructions may require more sophisticated heuristics. There are some cases where the materialization of the larger constant takes more scalar instructions than it does to build the vector with vector instructions. We could add heuristics to try and catch this. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99195	2021-03-25 10:35:31 +00:00
Serge Pavlov	ddb0bcbdff	Add missing cases in RISCVMCExpr::getVariantKindName Differential Revision: https://reviews.llvm.org/D98929	2021-03-25 12:57:05 +07:00
Craig Topper	0f99c6c56e	[RISCV] Remove duplicate DebugLoc variables from cases in ReplaceNodeResults. NFC We already created a DebugLoc at the top of the function. We can just use that one.	2021-03-24 20:23:03 -07:00
Craig Topper	512bae81cc	[RISCV] Add basic cost modelling for fixed vector gather/scatter. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D99142	2021-03-24 11:14:14 -07:00
Craig Topper	f24f09d256	[RISCV] Add TTI support for cpop with Zbb This will tell loop idiom recognize that it can make popcount loops countable using the ctpop intrinsic. I didn't bother checking for illegal types. Type legalization knows how to split a ctpop into multiple ctpops added together. Assuming we only receive reasonable integer bit widths, a few cpop instructions added together is probably better than the loop. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D99203	2021-03-24 10:58:42 -07:00
Sander de Smalen	55d18b3cc2	[TTI] Return a TypeSize from getRegisterBitWidth. This patch changes the interface to take a RegisterKind, to indicate whether the register bitwidth of a scalar register, fixed-width vector register, or scalable vector register must be returned. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D98874	2021-03-24 14:45:13 +00:00
Jim Lin	503f1d845f	[RISCV] Add HasStdExtD predicate to copysign from double and to double patterns The copysign from double and to double patterns lack the HasStdExtD predicate. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99234	2021-03-24 14:29:23 +08:00
Craig Topper	839a46d88f	[RISCV] Use selectImm for RV32. NFC Previously we used selectImm for RV64 and isel patterns for RV32. This should be NFC, but will allow RV32 and RV64 to share improvements in the future. For example, it might be useful to use BSETI from Zbs to make single bit constants. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98877	2021-03-23 08:57:15 -07:00
Fraser Cormack	feff66a082	[RISCV] Further optimize BUILD_VECTORs with repeated elements This patch builds upon the initial BUILD_VECTOR work introduced in D98700. It further optimizes the lowering of BUILD_VECTOR by using VSELECT operations to effectively insert repeated elements into the vector with relatively few instructions. This allows us to optimize more BUILD_VECTORs without significantly increasing the size of the generated code. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98969	2021-03-23 14:14:48 +00:00
Fraser Cormack	5bfbd9d938	[RISCV] Optimize all-constant mask BUILD_VECTORs This patch adds an optimization for mask-vector BUILD_VECTOR nodes whose elements are all constants or undef. It lowers such operations by building up the vector via a series of integer operations, in which multiple mask elements are inserted into a vector at a time via i8/i16/i32/i64 element types. The final result is then bitcast from that integer vector. We restrict this optimization in certain circumstances when optimizing for size. If we are required to use more than one integer insert operation, then it will likely increase code size compared with using a load from a constant pool. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98860	2021-03-23 10:11:19 +00:00
Craig Topper	d7b0c19823	[RISCV] Add scheduler classes to Zfh instructions. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D99053	2021-03-22 20:30:09 -07:00
Craig Topper	8db4804da7	[RISCV] Remove unused SchedWrites WriteFConv32/WriteFConv64/WriteFMov32/WriteFMov64. It doesn't look like any instructions have ever been assigned to these classes. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D99050	2021-03-22 20:29:18 -07:00
Craig Topper	294efcd6f7	[RISCV] Add support for fixed vector masked gather/scatter. I've split the gather/scatter custom handler to avoid complicating it with even more differences between gather/scatter. Tests are the scalable vector tests with the vscale removed and dropped the tests that used vector.insert. We're probably not as thorough on the splitting cases since we use 128 for VLEN here but scalable vectors use a known min size of 64. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98991	2021-03-22 10:17:30 -07:00
luxufan	02ffbac844	[RISCV] remove redundant instruction when eliminating frame index The mv a0, a0 instruction is generated when the stack object offset is larger than int<12>. To deal with this situation, the eliminateFrameIndex function creates a virtual register, which needs the register scavenger to scavenge it. If the machine instruction that contains the stack object has the opcode ADDI (the addi was generated by a frame index node), and this instruction's destination register is the same as the register chosen by the register scavenger, then mv a0, a0 is generated. So to eliminate this instruction, in the eliminateFrameIndex function, if the instruction opcode is ADDI, then the virtual register is not created. Differential Revision: https://reviews.llvm.org/D92479	2021-03-21 18:54:00 +08:00
Jessica Clarke	b2bb003774	[RISCV] Update comment in RISCVInstrInfoM.td Missed in `07ed62b7d5`.	2021-03-20 22:35:40 +00:00
Craig Topper	07ed62b7d5	[RISCV] Disable (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization when Zba is enabled. This optimization is trying to save SRLI instructions needed to implement the ANDs. If we have zext.w we won't save anything. Because we don't check that the multiply is the only user of the AND we might even increase instruction count.	2021-03-20 15:31:45 -07:00
Craig Topper	b0d8823a8a	[RISCV] Add isel pattern to optimize (mul (and X, 0xffffffff), (and Y, 0xffffffff)) on RV64 This pattern computes the full 64 bit product of a 32x32 unsigned multiply. This requires two pairs of SLLI+SRLI to zero the upper 32 bits of the inputs. We can do better than this by using two SLLI to move the lower bits to the upper bits then use MULHU to compute the product. This is the high half of a full 64x64 product. Since we put 32 0s in the lower bits of the inputs we know the 128-bit product will have zeros in the lower 64 bits. So the upper 64 bits, which MULHU computes, will contain the original 64 bit product we were after. The same trick would work for (mul (sext_inreg X, i32), (sext_inreg Y, i32)) using MULHS, but sext_inreg is sext.w which is already one instruction so we wouldn't save anything. Differential Revision: https://reviews.llvm.org/D99026	2021-03-20 14:55:46 -07:00
Craig Topper	d5c1d305b3	[RISCV] Rename WriteShift/ReadShift scheduler classes to WriteShiftImm/ReadShiftImm. Move variable shifts from WriteIALU/ReadIALU to new WriteShiftReg/ReadShiftReg. Previously only immediate shifts were in WriteShift. Register shifts were grouped with IALU. Seems likely that immediate shifts would be as fast or faster than register shifts. And that immediate shifts wouldn't be any faster than IALU. So if any deserved to be in their own group it should be register shifts not immediate shifts. Rather than try to flip them let's just add more granularity and give each kind their own class. I've used new names for both to make them unambiguous and to force any downstream implementations to put correct information in their scheduler models. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D98911	2021-03-19 20:39:49 -07:00
Craig Topper	5d315691c4	[RISCV] Add missing bitcasts to the results of lowerINSERT_SUBVECTOR and lowerEXTRACT_SUBVECTOR when handling mask vectors. Found by adding asserts to LegalizeDAG to catch incorrect result types being returned. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98964	2021-03-19 10:54:33 -07:00
Craig Topper	85f3f6b3cc	[RISCV] Lower scalable vector masked loads to intrinsics to match fixed vectors and reduce isel patterns. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98840	2021-03-19 10:39:35 -07:00
Fraser Cormack	d399b82e2a	[RISCV] Maintain fixed-length info when optimizing BUILD_VECTORs I'm not sure how I failed to notice this before, but when optimizing dominant-element BUILD_VECTORs we would lower via the scalable container type, which lost us the information about the fixed length of the vector types. By lowering via the fixed-length type we can preserve that information and eliminate redundant vsetvli instructions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98938	2021-03-19 17:21:06 +00:00
Fraser Cormack	550292ecb1	[RISCV] Fix missing scalable->fixed-length vector conversion Returning the scalable-vector container type would present problems when the fixed-length INSERT_VECTOR_ELT was used by later operations. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98776	2021-03-19 16:49:47 +00:00
Hsiangkai Wang	aa8d33a6d6	[RISCV] Spilling for Zvlsseg registers. For Zvlsseg, we create several tuple register classes. When spilling for these tuple register classes, we need to iterate NF times to load/store these tuple registers. Differential Revision: https://reviews.llvm.org/D98629	2021-03-19 07:46:16 +08:00
Craig Topper	c9861f722e	[RISCV] Correct the output chain in lowerFixedLengthVectorMaskedLoadToRVV We returned the input chain instead of the output chain from the new load. This bypasses the load in the chain. I haven't found a good way to test this yet. IR order prevents my initial attempts at causing reordering.	2021-03-18 16:34:35 -07:00
Fraser Cormack	3495031a39	[RISCV] Support scalable-vector masked scatter operations This patch adds support for masked scatter intrinsics on scalable vector types. It is mostly an extension of the earlier masked gather support introduced in D96263, since the addressing mode legalization is the same. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96486	2021-03-18 10:17:50 +00:00
Fraser Cormack	0331399dc9	[RISCV] Support scalable-vector masked gather operations This patch supports the masked gather intrinsics in RVV. The RVV indexed load/store instructions only support the "unsigned unscaled" addressing mode; indices are implicitly zero-extended or truncated to XLEN and are treated as byte offsets. This ISA supports the intrinsics directly, but not the majority of various forms of the MGATHER SDNode that LLVM combines to. Any signed or scaled indexing is extended to the XLEN value type and scaled accordingly. This is done during DAG combining as widening the index types to XLEN may produce illegal vectors that require splitting, e.g. nxv16i8->nxv16i64. Support for scalable-vector CONCAT_VECTORS was added to avoid spilling via the stack when lowering split legalized index operands. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96263	2021-03-18 09:26:18 +00:00
Fraser Cormack	c2b4600ec8	[RISCV] Support bitcasts of fixed-length mask vectors Without this patch, bitcasts of fixed-length mask vectors would go through the stack. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98779	2021-03-18 08:52:42 +00:00
ShihPo Hung	fca5d63aa8	[RISCV] Fix isel pattern of masked vmslt[u] This patch changes the operand order of masked vmslt[u] from (mask, rs1, scalar, maskedoff, vl) to (maskedoff, rs1, scalar, mask, vl). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98839	2021-03-17 20:18:11 -07:00
Craig Topper	92b39c6907	[RISCV] Use getTargetExtractSubreg and getTargetInsertSubreg to simplify some code. NFCI	2021-03-17 12:10:19 -07:00
Craig Topper	696ddef569	[RISCV] Support masked load/store for fixed vectors. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98561	2021-03-17 10:26:15 -07:00
Fraser Cormack	70251759a2	[RISCV] Optimize "dominant element" BUILD_VECTORs This patch adds an optimization path for BUILD_VECTOR nodes where the majority of the elements are identical. These can be splatted, with the remaining elements patched up with INSERT_VECTOR_ELTs. The threshold can be tweaked as required - it is currently conservative. Undef elements are disregarded when judging the dominance of a particular element. This allows them to be covered by the splat value. In addition, vectors of 2 elements are always optimized to a splat (for the upper element) and an insert at element zero. This optimization is disabled when optimizing for size. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98700	2021-03-17 10:09:04 +00:00
Fangrui Song	6ab8927931	[RISCV] Support clang -fpatchable-function-entry && GNU function attribute 'patchable_function_entry' Similar to D72215 (AArch64) and D72220 (x86). ``` % clang -target riscv32 -march=rv64g -c -fpatchable-function-entry=2 a.c && llvm-objdump -dr a.o ... 0000000000000000 <main>: 0: 13 00 00 00 nop 4: 13 00 00 00 nop % clang -target riscv32 -march=rv64gc -c -fpatchable-function-entry=2 a.c && llvm-objdump -dr a.o ... 00000002 <main>: 2: 01 00 nop 4: 01 00 nop ``` Recently the mainline kernel started to use -fpatchable-function-entry=8 for riscv (https://git.kernel.org/linus/afc76b8b80112189b6f11e67e19cf58301944814). Differential Revision: https://reviews.llvm.org/D98610	2021-03-16 10:02:35 -07:00
Craig Topper	229eeb187d	[RISCV] Look through copies when trying to find an implicit def in addVSetVL. The InstrEmitter can sometimes insert a copy after an IMPLICIT_DEF before connecting it to the vector instruction. This occurs when constrainRegClass reduces to a class with less than 4 registers. I believe LMUL8 on masked instructions triggers this since the result can only use the v8, v16, or v24 register group as the mask is using v0. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98567	2021-03-16 07:59:09 -07:00
Craig Topper	a33ce06cf5	[RISCV] Improve i32 UADDSAT/USUBSAT on RV64. The default promotion uses zero extends that become shifts. We can use sign extend instead which is better for RISCV. I've used two different implementations based on whether we have minu/maxu instructions. Differential Revision: https://reviews.llvm.org/D98683	2021-03-16 07:44:06 -07:00
Craig Topper	41759c3d92	[RISCV] Add RISCVISD::BR_CC similar to RISCVISD::SELECT_CC. This allows me to introduce similar combines for branches as we have recently added for SELECT_CC. Some of them are less useful for standalone setccs and only help branch instructions. By having a BR_CC node it's easier to only affect branches. I'm using CondCodeSDNode to make isel patterns easier to write so we can refer to the codes by name. SELECT_CC uses a constant instead. I've translated the condition code just like SELECT_CC so we need fewer patterns for the swapped conditions. This includes special cases for X < 1 and X > -1 that get translated to blez and bgez by using a 0 constant. computeKnownBitsForTargetNode support for SELECT_CC is added to allow MaskedValueIsZero to work for cases where the true and false values of the SELECT_CC are setccs and the result of the SELECT_CC is used by a BR_CC. This was needed to avoid regressions in some of the overflow tests. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98159	2021-03-15 11:54:01 -07:00
Philipp Tomsich	018e96f71f	[RISCV] Add isel-patterns to optimize (a < 1) into blez (a <= 0) The following code-sequence showed up in a testcase (isolated from SPEC2017) for if-conversion and vectorization when searching for the maximum in an array: addi a2, zero, 1 blt a1, a2, .LBB0_5 which can be expressed as `bge zero,a1,.LBB0_5`/`blez a1,.LBB0_5`. More generally, we want to express (a < 1) as (a <= 0). This adds the required isel-pattern and updates the testcases. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98449	2021-03-15 11:32:43 -07:00
Craig Topper	3dc5b533e0	[RISCV] Improve legalization of i32 UADDO/USUBO on RV64. The default legalization uses zero extends that require pair of shifts on RISCV. Instead we can take advantage of the fact that unsigned compares work equally well on sign extended inputs. This allows us to use addw/subw and sext.w. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98233	2021-03-15 09:30:23 -07:00
Fraser Cormack	0c5b789c73	[RISCV] Support fixed-length vectors in the calling convention This patch adds fixed-length vector support to the calling convention when RVV is used to lower fixed-length vectors. The scheme follows the regular vector calling convention for the argument/return registers, but uses scalable vector container types as the LocVTs, and converts to/from the fixed-length vector value types as required. Fixed-length vector types may be split when the combination of minimum VLEN and the maximum allowable LMUL is not large enough to fully contain the vector. In this case the behaviour differs between fixed-length vectors passed as parameters and as return values: 1. For return values, vectors must be passed entirely via registers or via the stack. 2. For parameters, unlike scalar values, split vectors continue to be passed by value, and are split across multiple registers until there are no remaining registers. Thus vector parameters may be found partly in registers and partly on the stack. As with scalable vectors, the first fixed-length mask vector is passed via v0. Split mask fixed-length vectors are passed first via v0 and then via the next available vector register: v8,v9,etc. The handling of vector return values uses all available argument registers v8-v23 which does not adhere to the calling convention we're supposedly implementing, but since this issue affects both fixed-length and scalable-vector values, it was left as-is. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97954	2021-03-15 10:43:51 +00:00
Hsiangkai Wang	a81dff1e58	[RISCV] Support inline asm for vector instructions. Types of fractional LMUL and LMUL=1 are all using VR register class. When using inline asm, it will use the first type in the register class as the type for the register. It is not necessarily the same as the value type. We need to use INSERT_SUBVECTOR/EXTRACT_SUBVECTOR/BITCAST to make it legal to put the value in the corresponding register class. Differential Revision: https://reviews.llvm.org/D97480	2021-03-15 11:02:18 +08:00
Craig Topper	fcdf7f6224	[RISCV] Give an explicit error if 'generic' CPU is passed instead of 'generic-rv32' or 'generic-rv64'. Validate 64Bit feature against the triple. I encountered a project that uses llvm that passes "generic" by default. While I could fix that project, I wouldn't be surprised if other projects did something similar. So it seems like a good idea to provide a better error here. I've also added validation of the 64Bit feature against the triple so that we can catch a mismatched CPU before failing in a mysterious way. We can make it pretty far in isel because we calculate XLenVT from the triple and use that to set up the legal integer type. Reviewed By: luismarques, khchen Differential Revision: https://reviews.llvm.org/D98307	2021-03-14 17:21:31 -07:00
luxufan	a9b9c64fd4	change rvv frame layout This patch changes the rvv frame layout that was proposed in D94465. In D94465, eliminating an rvv frame index in the eliminateFrameIndex function requires creating a temporary virtual register. This virtual register should be scavenged by the register scavenger (RegScavenger). If the machine function has other unused registers, there is no problem. But if there are no unused registers, we need an emergency spill slot. Because the emergency spill slot belongs to the scalar local variables field, accessing it requires a temporary virtual register again. This makes the compiler report the "Incomplete scavenging after 2nd pass" error. So I change the rvv frame layout as follows: ``` \|--------------------------------------\| \| arguments passed on the stack \| \|--------------------------------------\|<--- fp \| callee saved registers \| \|--------------------------------------\| \| rvv vector objects(local variables \| \| and outgoing arguments \| \|--------------------------------------\| \| realignment field \| \|--------------------------------------\| \| scalar local variable(also contains\| \| emergency spill slot) \| \|--------------------------------------\|<--- bp \| variable-sized local variables \| \|--------------------------------------\|<--- sp ``` Differential Revision: https://reviews.llvm.org/D97111	2021-03-13 16:05:55 +08:00
Craig Topper	51151828ac	[RISCV] Teach normaliseSetCC to canonicalize X > -1 to X >= 0 and X < 1 to 0 >= X. This allows the use of BGE with X0 instead of putting -1/1 in a register. Reviewed By: jrtc27 Differential Revision: https://reviews.llvm.org/D98542	2021-03-12 11:50:10 -08:00
Craig Topper	45d3ed0304	[RISCV] Add support for scalable vector masked load/store. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98460	2021-03-12 10:32:33 -08:00
Fraser Cormack	641f5700f9	[RISCV] Optimize INSERT_VECTOR_ELT sequences This patch optimizes the codegen for INSERT_VECTOR_ELT in various ways. Primarily, it removes the use of vslidedown during lowering, and the vector element is inserted entirely using vslideup with a custom VL and slide index. Additionally, lowering of i64-element vectors on RV32 has been optimized in several ways. When the 64-bit value to insert is the same as the sign-extension of the lower 32-bits, the codegen can follow the regular path. When this is not possible, a new sequence of two i32 vslide1up instructions is used to get the vector element into a vector. This sequence was suggested by @craig.topper. From there, the value is slid into the final position for more consistent lowering across RV32 and RV64. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98250	2021-03-12 09:13:38 +00:00
Fraser Cormack	4d2d5855c7	[RISCV] Fix up stale VECREDUCE comments. NFC. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98399	2021-03-12 08:49:46 +00:00
Craig Topper	1d26bbcf9b	[RISCV] Return false from isShuffleMaskLegal except for splats. We don't support any other shuffles currently. This changes the bswap/bitreverse tests that check for this in their expansion code. Previously we expanded a byte swapping shuffle through memory. Now we're scalarizing and doing bit operations on scalars to swap bytes. In the future we can probably use vrgather.vx to do a byte swap shuffle.	2021-03-11 20:02:49 -08:00
Craig Topper	c82f442954	[RISCV] Support fixed vector copysign. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98394	2021-03-11 09:57:24 -08:00
Craig Topper	0dff8a9627	[RISCV] Handle vmv.x.s intrinsic for i64 vectors on RV32. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98372	2021-03-11 09:39:50 -08:00
Craig Topper	9c841cb8e8	[RISCV] Support extract_vector_elt for fixed and scalable masked registers. This uses a really simple approach of converting to an i8 vector and extracting. This is probably not the best approach especially if you know the index is constant. Other ideas: -Store to stack temporary using vse1, load as scalar and shift. -Sort of bitcast the vector to a vector of i8, slide down the appropriate 8 bit element, copy to scalar, shift down the correct bit within the 8 bits we extracted. Not exactly sure how to describe such a bitcast from i1 vector to i8 vector within the type system for elements less than 8. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98310	2021-03-11 09:26:44 -08:00
Craig Topper	9106d04554	[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32. On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64. This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization. For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've removed the special case I previously put in for ABS for D97991 as the default expansion is now able to successfully use getConstant. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98004	2021-03-10 09:46:18 -08:00
Craig Topper	0c73a506e8	[RISCV] Start fixing issues that prevent us from testing vXi64 intrinsics on RV32. Currently we crash in type legalization any time an intrinsic uses a scalar i64 on RV32. This patch adds support for type legalizing this to prevent crashing. I don't promise that it uses the best possible codegen, just that it is functional. This first version handles 3 cases: the vmv.v.x intrinsic, the vmv.s.x intrinsic, and intrinsics that take a scalar input, splat it and then do some operation. For vmv.v.x we'll either rely on hardware sign extension for constants or we'll convert it to multiple splats and bit manipulation. For vmv.s.x we use a really suboptimal sequence inspired by what we do for an INSERT_VECTOR_ELT. For the third case we'll either try to use the .vi form for constants or convert to a complicated splat and bitmanip and use the .vv form of the operation. I've renamed the ExtendOperand field to SplatOperand and now use it specifically for the third case. The first two cases are handled by custom lowering specifically for those intrinsics. I haven't updated all tests yet, but I tried to cover a subset that includes single-width, widening, and narrowing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97895	2021-03-10 09:45:38 -08:00
Craig Topper	1e39118638	[RISCV] Manually split vector operands to VECREDUCE when handling vXi64 vectors on RV32. The type legalizer will visit the result before the operands. To avoid creating an illegal target specific node or falling back to scalarization, we need to manually split vector operands. This still doesn't handle the case of non-power of 2 operands which need to be widened. I'm not sure the type legalizer is ready for it. I think we would need to insert an INSERT_SUBVECTOR with the power of 2 type we want, with an undef first operand, and the non-power of 2 original operand as the vector to insert. Then fill the neutral elements into the padded elements. Alternatively we INSERT_SUBVECTOR into a neutral vector. From there we carry on splitting if needed to get to a legal type then do the target specific code. The problem with this is the type legalizer doesn't know how to widen an insert_subvector yet. We would need to add that including the handling for a non-undef first vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98292	2021-03-10 09:27:38 -08:00
Craig Topper	351844edf1	[RISCV] Add support for VECTOR_REVERSE for scalable vector types. I've left mask registers to a future patch as we'll need to convert them to full vectors, shuffle, and then truncate. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97609	2021-03-09 10:03:45 -08:00
Craig Topper	77ac3166e5	[RISCV] Add support for fixed vector reductions. I've included tests that require type legalization to split the vector. The i64 version of these scalarizes on RV32 due to type legalization visiting the result before the vector type. So we have to abort our custom expansion to avoid creating target specific nodes with an illegal type. Then type legalization ends up scalarizing. We might be able to fix this by doing custom splitting for large vectors in our handler to get down to a legal type. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98102	2021-03-09 09:39:59 -08:00
Craig Topper	1c7ad4dd88	[RISCV] Don't modify the SEW immediate on the V extension pseudo instructions after inserting VSETVLI. Previously we set the value to -1, but the SEW information could be useful for scheduling. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D98062	2021-03-09 09:02:19 -08:00
Craig Topper	72ecf2f43f	[RISCV] Optimize fixed vector ABS. Fix crash on scalable vector ABS for SEW=64 with RV32. The default fixed vector expansion uses sra+xor+add since it can't see that smax is legal due to our custom handling. So we select smax(X, sub(0, X)) manually. Scalable vectors are able to use the smax expansion automatically for most cases. It crashes in one case because getConstant can't build a SPLAT_VECTOR for nxvXi64 when i64 scalars aren't legal. So we manually emit a SPLAT_VECTOR_I64 for that case. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97991	2021-03-09 08:51:03 -08:00
Craig Topper	478317fbb7	[RISCV] Make the hasStdExtM() check in RISCVInstrInfo::getVLENFactoredAmount emit a diagnostic rather than an assert. As far as I know we're not enforcing that StdExtM must be enabled to use the V extension. If we use an assert here and hit this code in a release build we'll silently emit an invalid instruction. By using a diagnostic we report the error to the user in release builds. I think there may still be a later fatal error from the code emitter though. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97970	2021-03-09 08:50:02 -08:00
ShihPo Hung	5cdb2e9860	[RISCV][MC] Fix nf encoding for vector ld/st whole register The three bit nf is one less than the number of NFIELDS, so we manually decrement 1 for VS1/2/4/8R & VL1/2/4/8R. Reviewed By: craig.topper Differential revision: https://reviews.llvm.org/D98185	2021-03-08 19:30:24 -08:00
Craig Topper	7a64cc4a76	[RISCV] Make use of DAG.getNeutralElement in lowerVECREDUCE to avoid repeating the same list of constants. NFC Reviewed By: frasercrmck, khchen Differential Revision: https://reviews.llvm.org/D98091	2021-03-08 09:11:10 -08:00
Craig Topper	a2651266c5	[RISCV] Add explicit i64 types to RV64 isel patterns to stop tablegen from generating unneeded i32 patterns for RV32 HwMode.	2021-03-08 09:06:56 -08:00
Fraser Cormack	18173c57bd	[RISCV] Add new entry points to getContainerForFixedLengthVector While working on adding fixed-length vectors to the calling convention, it was necessary to be able to query for a fixed-length vector container type without access to an instance of SelectionDAG. This patch modifies the "main" getContainerForFixedLengthVector function to use an instance of TargetLowering rather than SelectionDAG, and preserves the SelectionDAG overload as a wrapper. An additional non-static version of the function was also added to simplify the common case in RISCVTargetLowering. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97925	2021-03-08 09:26:19 +00:00
Craig Topper	c91b3c9e63	[RISCV] Fold (select_cc (setlt X, Y), 0, ne, trueV, falseV) -> (select_cc X, Y, lt, trueV, falseV) A setcc can be created during LegalizeDAG after select_cc has been created. This combine will enable us to fold these late setccs. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98132	2021-03-07 09:44:56 -08:00
Craig Topper	fdbd5d3206	[RISCV] Fold (select_cc (xor X, Y), 0, eq/ne, trueV, falseV) -> (select_cc X, Y, eq/ne, trueV, falseV) This pattern occurs when lowering for overflow operations introduce an xor after select_cc has already been formed. I had to rework another combine that looked for select_cc of an xor with 1. That xor will now get combined away so we just need to look for the RHS of the select_cc being 1. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D98130	2021-03-07 09:29:55 -08:00
Fangrui Song	2d922de3af	[MC][RISCV] Support .reloc , BFD_RELOC_{NONE,32,64}, BFD_RELOC_NONE is useful for ld --gc-sections: it provides a generic way indicating a dependency between two sections.	2021-03-05 21:45:11 -08:00
Luke	d28297ff68	[RISCV] Enable fixed-length vectorization of LoopVectorizer for RISC-V Vector By implementing the method "unsigned RISCVTTIImpl::getRegisterBitWidth(bool Vector)", fixed-length vectorization is enabled when possible. Without this method, the "#pragma clang loop" directive is needed to enable vectorization(or the cost model may inform LLVM that "Vectorization is possible but not beneficial"). Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97549	2021-03-05 10:54:51 +08:00
Fraser Cormack	8e7ceffd0b	[RISCV] Fix crash when inserting large fixed-length subvectors This patch addresses a compiler crash resulting from passing a fixed-length type to one that expects scalable vector types. An assertion was added to prevent this regressing in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97868	2021-03-04 09:27:16 +00:00
Fraser Cormack	d8e1d2ebf4	[RISCV] Preserve fixed-length VL on insert_vector_elt in more cases This patch fixes up one case where the fixed-length-vector VL was dropped (falling back to VLMAX) when inserting vector elements, as the code would lower via ISD::INSERT_VECTOR_ELT (at index 0) which loses the fixed-length vector information. To this end, a custom node, VMV_S_XF_VL, was introduced to carry the VL operand through to the final instruction. This node wraps the RVV vmv.s.x and vmv.s.f instructions, which were being selected by insert_vector_elt anyway. There should be no observable difference in scalable-vector codegen. There is still one outstanding drop from fixed-length VL to VLMAX, when an i64 element is inserted into a vector on RV32; the splat (which is custom legalized) has no notion of the original fixed-length vector type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97842	2021-03-04 09:21:10 +00:00
Amara Emerson	8a316045ed	[AArch64][GlobalISel] Enable use of the optsize predicate in the selector. To do this while supporting the existing functionality in SelectionDAG of using PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis dependencies to the instruction selector pass. Then, use the predicate to generate constant pool loads for f32 materialization, if we're targeting optsize/minsize. Differential Revision: https://reviews.llvm.org/D97732	2021-03-02 12:55:51 -08:00
Fraser Cormack	c1695ddf7d	[RISCV] Support fixed-length INSERT_VECTOR_ELT This patch enables support for lowering INSERT_VECTOR_ELT on fixed-length vector types. The strategy follows that for scalable vector types. This patch also includes a quick fix to prevent the compiler infinitely looping between lowering BUILD_VECTOR as VECTOR_SHUFFLE and back again. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97698	2021-03-02 16:48:38 +00:00
Fraser Cormack	de2b70010a	[RISCV] Lower CONCAT_VECTORS to INSERT_SUBVECTOR nodes The default expansion of CONCAT_VECTORS goes through the stack. This patch avoids that penalty by custom-lowering CONCAT_VECTORS to a series of INSERT_SUBVECTOR nodes. Further optimizations are possible, but this is a good start. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97692	2021-03-02 11:13:59 +00:00
Fraser Cormack	3fea9226ee	[RISCV] Support INSERT_SUBVECTOR on vector masks Like with EXTRACT_SUBVECTOR, INSERT_SUBVECTOR poses a problem for vector masks as RVV isn't able to slide mask types around. We choose instead to bitcast to equivalently-sized i8 types where we can, else we zero-extend, perform the operation, and truncate back down. One test was left disabled due to a crash in the legalizer. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97559	2021-03-01 12:04:11 +00:00
Fraser Cormack	e80ca3af82	[RISCV] Fix INSERT/EXTRACT_SUBVECTOR on fractional LMUL types This patch fixes a bug where the lowering for INSERT_SUBVECTOR and EXTRACT_SUBVECTOR would insist on first extracting a register-aligned LMUL1 vector type before performing the slide up/down. This was even if the vector was a fractional LMUL type, in which case the aligned EXTRACT_SUBVECTOR was invalid. This issue only occurred for scalable vector types, but a variety of tests for both scalable and fixed-length vectors have been added to ensure this does not regress in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97556	2021-03-01 11:51:05 +00:00
Fraser Cormack	4ea734e6ec	[RISCV] Unify scalable- and fixed-vector INSERT_SUBVECTOR lowering This patch unifies the two disparate paths for lowering INSERT_SUBVECTOR operations under one roof. Consequently, with this patch it is possible to support any fixed-length subvector insertion, not just "cast-like" ones. As before, support for the insertion of mask vectors will come in a separate patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97543	2021-03-01 11:38:47 +00:00
Fraser Cormack	bd4d421688	[RISCV] Support EXTRACT_SUBVECTOR on vector masks This patch adds support for extracting subvectors from vector masks. This can be either extracting a scalable vector from another, or a fixed-length vector from a fixed-length or scalable vector. Since RVV lacks a way to slide vector masks down on an element-wise basis and we don't know the true length of the vector registers, in many cases we must resort to using equivalently-sized i8 vectors to perform the operation. When this is not possible we fall back and extend to a suitable i8 vector. Support was also added for fixed-length truncation to mask types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97475	2021-03-01 11:20:09 +00:00
Fangrui Song	47c5576d7d	ELF: Create unique SHF_GNU_RETAIN sections for llvm.used global objects If a global object is listed in `@llvm.used`, place it in a unique section with the `SHF_GNU_RETAIN` flag. The section is a GC root under `ld --gc-sections` with LLD>=13 or GNU ld>=2.36. For front ends which do not expect to see multiple sections of the same name, consider emitting `@llvm.compiler.used` instead of `@llvm.used`. SHF_GNU_RETAIN is restricted to ELFOSABI_GNU and ELFOSABI_FREEBSD in binutils. We don't do the restriction - see the rationale in D95749. The integrated assembler has supported SHF_GNU_RETAIN since D95730. GNU as>=2.36 supports section flag 'R'. We don't need to worry about GNU ld support because older GNU ld just ignores the unknown SHF_GNU_RETAIN. With this change, `__attribute__((retain))` functions/variables emitted by clang will get the SHF_GNU_RETAIN flag. Differential Revision: https://reviews.llvm.org/D97448	2021-02-26 16:38:44 -08:00
Craig Topper	b183cbfacd	[RISCV] Call SelectBaseAddr on the base pointer in the custom isel for vector loads and stores. This will allow FrameIndex as the base address instead of emitting a separate ADDI from isel. eliminateFrameIndex will likely turn it back into an ADDI, but this makes things consistent with the SDPatterns and VLPatterns. I only tested one case for simplicity. I can test more if reviewers want. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97221	2021-02-26 11:38:23 -08:00
Fraser Cormack	37014db013	[RISCV] Use existing method for the LMUL1 type. NFCI. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97467	2021-02-26 09:44:05 +00:00
Craig Topper	d7fca3f0bf	[RISCV] Support fixed vector extract_element for FP types.	2021-02-25 16:30:28 -08:00
Craig Topper	95c6824995	[RISCV] Teach CleanupVSETVLI to remove 'vsetvli zero, zero, vtype' when the vtype matches the previous vsetvli or vsetivli Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D97408	2021-02-25 07:51:19 -08:00
Craig Topper	25c6b7ddd2	[RISCV] Add isel pattern to match X > -1 to bgez. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D97262	2021-02-25 07:42:22 -08:00
Fraser Cormack	0ad86f879f	[RISCV] Update RVV ISA section-header comments. NFC. Some of the section headers had become stale with the transition from RVV specification version 0.9 to 0.10. This patch brings them up to date.	2021-02-25 14:15:28 +00:00
Fraser Cormack	02f435db0b	[RISCV] Support fixed-length vector i2fp/fp2i conversions This patch extends the support for scalable-vector int->fp and fp->int conversions by additionally handling fixed-length vectors. The existing scalable-vector lowering re-expresses widening/narrowing by x4+ conversions as standard nodes. The fixed-length vector support slots in at "the end" of this process by lowering the now equally-sized and widening/narrowing by x2 nodes to our custom VL versions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97374	2021-02-25 13:47:58 +00:00
Fraser Cormack	9620ce90d7	[RISCV] Support fixed-length vector FP_ROUND & FP_EXTEND This patch extends the support for vector FP_ROUND and FP_EXTEND by including support for fixed-length vector types. Since fixed-length vectors use "VL" nodes and scalable vectors can use the standard nodes, there is slightly more to do in the fixed-length case. A helper function was introduced to try and reduce the divergent paths. It is expected that this function will similarly come in useful for lowering the int-to-fp and fp-to-int operations for fixed-length vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97301	2021-02-25 12:16:06 +00:00
Fraser Cormack	84413e1947	[RISCV] Support fixed-length vector truncates This patch extends support for our custom-lowering of scalable-vector truncates to include those of fixed-length vectors. It does this by co-opting the custom RISCVISD::TRUNCATE_VECTOR node and adding mask and VL operands. This avoids unnecessary duplication of patterns and inflation of the ISel table. Some truncates go through CONCAT_VECTORS which currently isn't efficiently handled, as it goes through the stack. This can be improved upon in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97202	2021-02-25 12:11:34 +00:00
Fraser Cormack	3bc5ed3875	[RISCV] Support fixed-length vector sign/zero extension This patch adds support for the custom lowering sign- and zero-extension of fixed-length vector types. It does so through custom nodes. Since the source and destination types are (necessarily) of different sizes, it is possible that the source type is legal whilst the larger destination type isn't. In this case the legalization makes heavy use of EXTRACT_SUBVECTOR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97194	2021-02-25 12:05:17 +00:00
Fraser Cormack	821f8bb29a	[RISCV] Unify scalable- and fixed-vector EXTRACT_SUBVECTOR lowering This patch unifies the two disparate paths for lowering EXTRACT_SUBVECTOR operations under one roof. Consequently, with this patch it is possible to support any fixed-length subvector extraction, not just "cast-like" ones. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97192	2021-02-25 11:46:57 +00:00
Craig Topper	159f78fc2f	[RISCV] Reuse existing SDLoc and XLenVT in the switch in RISCVISelDAGToDAG::Select. NFC A SDLoc and XLenVT were already created above the switch.	2021-02-24 21:39:00 -08:00
Craig Topper	efcdd598b7	[RISCV] Teach VSETVLI inserter to use VSETIVLI when possible. We always create the VL operand using a register, but if we can determine that it came from an ADDI X0, imm with a sufficiently small immediate, we can use VSETIVLI. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97332	2021-02-24 16:07:33 -08:00
Craig Topper	9bde29629d	[RISCV] Use a ComplexPattern for zexti32 to match sexti32. We just started using a ComplexPattern for sexti32. This updates zexti32 to match. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D97231	2021-02-24 16:06:29 -08:00
Craig Topper	086670d367	[RISCV] Support fixed vector extract element. Use VL=1 for scalable vector extract element. I've changed to use VL=1 for slidedown and shifts to avoid extra element processing that we don't need. The i64 fixed vector handling on i32 isn't great if the vector type isn't legal due to an ordering issue in type legalization. If the vector type isn't legal, we fall back to default legalization which will bitcast the vector to vXi32 and use two independent extracts. Doing better will require handling several different cases by manually inserting insert_subvector/extract_subvector to adjust the type to a legal vector before emitting custom nodes. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97319	2021-02-24 10:17:00 -08:00
Hsiangkai Wang	53c4c2b9f7	[RISCV] vle1.v/vse1.v should be unmasked instructions. vle1.v/vse1.v should be unmasked instructions. The vm encoding is 1 for unmasked instructions. Differential Revision: https://reviews.llvm.org/D97237	2021-02-23 19:59:22 +08:00
Fraser Cormack	dd68f3cf28	[RISCV] Support insertion of misaligned subvectors This patch extends the support for RVV INSERT_SUBVECTOR to cover those which don't align to a vector register boundary. Like the support for EXTRACT_SUBVECTOR in D96959, it accomplishes this by extracting the nearest register-sized subvector (a subregister operation), then sliding the vector down with VSLIDEDOWN, inserting the subvector to the first position, and sliding the vector back up again afterwards. Unlike subvector extraction, for vectors that occupy less than a full vector register we must preserve the untouched elements. We do this by lowering to an LMUL=1 INSERT_SUBVECTOR using the above method and lowering that to a VSLIDEUP with a zero offset. This uses a tail-undisturbed policy and so has the effect of "sliding in" the subvector elements while preserving the surrounding ones. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96972	2021-02-23 10:31:06 +00:00
Craig Topper	3231607ce9	[RISCV] Have sexti32 also recognize AssertZExt from types smaller than i32. An i64 AssertZExt from a type smaller than i32 has at least 33 leading zeros, which means it has at least 33 sign bits. Since we have a couple of patterns that use two sexti32, I've switched to a ComplexPattern so tablegen didn't have to generate 9 different permutations. As noted in the FIXME, maybe we should just call computeNumSignBits, but we don't have tests that benefit from that yet. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D97130	2021-02-22 14:56:22 -08:00
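The sign-bit argument in the entry above can be checked with a small standalone sketch. This is not LLVM code: `zeroExtendedFrom`, `numSignBits`, and `isSignExtendedFromI32` are hypothetical helpers that model, on plain 64-bit integers, what an AssertZExt guarantee and a sexti32-style check amount to.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an AssertZExt guarantee: only the low `bits` bits
// of the 64-bit value may be set.
int64_t zeroExtendedFrom(uint64_t raw, unsigned bits) {
  assert(bits >= 1 && bits < 64 && "source type must be narrower than i64");
  return static_cast<int64_t>(raw & ((uint64_t{1} << bits) - 1));
}

// Number of leading bits equal to the sign bit, counting the sign bit
// itself, so the result is between 1 and 64.
unsigned numSignBits(int64_t v) {
  uint64_t u = static_cast<uint64_t>(v);
  uint64_t sign = u >> 63;
  unsigned n = 1;
  for (int bit = 62; bit >= 0 && ((u >> bit) & 1) == sign; --bit)
    ++n;
  return n;
}

// An i64 value is already sign-extended from i32 when bits 63..31 all
// agree, i.e. when it has at least 33 sign bits.
bool isSignExtendedFromI32(int64_t v) { return numSignBits(v) >= 33; }
```

A value zero-extended from 31 bits or fewer always passes the check, while one zero-extended from a full 32 bits may not, because bit 31 can be set while bit 63 is clear.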
Craig Topper	1cd2a5a7da	[RISCV] Add isel support for bitcasts between fixed vector types. This should fix the issue reported in D96972. I don't have a good test case for this without those changes. Differential Revision: https://reviews.llvm.org/D97082	2021-02-22 12:05:46 -08:00
Craig Topper	1aeb927fed	[RISCV] Custom isel the rest of the vector load/store intrinsics. A previous patch moved the index versions. This moves the rest. I also removed the custom lowering for VLEFF since we can now do everything directly in the isel handling. I had to update getLMUL to handle mask registers to index the pseudo table correctly for VLE1/VSE1. This is good for another 15K reduction in llc size. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97097	2021-02-22 09:53:46 -08:00
Craig Topper	1a6c1ac686	[SelectionDAG][RISCV] Teach ComputeNumSignBits to handle SREM. This also removes a pattern from RISCV that is no longer needed since the sexti32 on the LHS of the srem in the pattern implies the result is sign extended so the sign_extend_inreg should be removed in DAG combine now. Reviewed By: luismarques, RKSimon Differential Revision: https://reviews.llvm.org/D97133	2021-02-21 11:13:36 -08:00
Fraser Cormack	3e1317fd32	[RISCV] Support extraction of misaligned subvectors This patch extends the support for RVV EXTRACT_SUBVECTOR to cover those which don't align to a vector register boundary. It accomplishes this by extracting the nearest register-sized subvector (a subregister operation), then sliding the vector down with VSLIDEDOWN and extracting the subvector from the first position (a COPY operation). Since this procedure involves the use of VSCALE and multiplication, the handling of such operations is done during lowering to simplify the implementation and make use of DAG combining. This necessitated moving some helper functions from RISCVISelDAGToDAG to RISCVTargetLowering. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96959	2021-02-20 15:43:54 +00:00
Fraser Cormack	9aa20caee6	[RISCV] Improve register allocation around vector masks With vector mask registers only allocatable to V0 (VMV0Regs) it is relatively simple to generate code which uses multiple masks and naively requires spilling. This patch aims to improve codegen in such cases by telling LLVM it can use VRRegs to hold masks. This will prevent spilling in many cases by having LLVM copy to an available VR register. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97055	2021-02-20 14:47:51 +00:00
Craig Topper	71b68fe532	[RISCV] Teach our custom vector load/store intrinsic isel code to propagate memory operands if we have them. We don't currently create memory operands for these intrinsics, but there was a suggestion of using the indexed load/store intrinsics to implement isel for scalable vector gather/scatter. That may propagate the memory operand from the gather/scatter ISD nodes.	2021-02-19 19:12:20 -08:00
Craig Topper	7e54d7304b	[RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC	2021-02-19 13:23:08 -08:00
Craig Topper	e7c86f4ac4	[RISCV] Use inheritance to reduce some repeated code in tablegen. NFC The VLX and VSX searchable tables, share the same format so we can have a common base class for them.	2021-02-19 10:42:18 -08:00
Craig Topper	7f5b3886e4	[RISCV] Remove unneeded indexed segment load/store vector pseudo instruction. We had more combinations of data and index lmuls than we needed. Also add some asserts to verify that the IndexVT and data VT have the same element count when we isel these pseudo instructions.	2021-02-19 10:28:48 -08:00
Craig Topper	d056d5decf	[RISCV] Use custom isel for vector indexed load/store intrinsics. There are many legal combinations of index and data VTs supported for these intrinsics. This results in a lot of isel patterns in RISCVGenDAGISel.inc. By adding a separate table similar to what we use for segment load/stores, we can more efficiently manually select these intrinsics. We should also be able to reuse this table for scalable vector gather/scatter. This reduces the llc binary size by ~56K. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D97033	2021-02-19 10:10:06 -08:00
Craig Topper	dbf910f0d9	[RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics. Just like we do for isel patterns, we need to call selectVLOp to prevent 0 from being selected to X0 by the default isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97021	2021-02-19 10:07:12 -08:00
Craig Topper	98dff5e804	[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64 We previously used isel patterns for this, but that used quite a bit of space in the isel table due to OR being associative and commutative. It also wouldn't handle shifts/ands being in reversed order. This generalizes the shift/and matching from GREVI to take the expected mask table as input so we can reuse it for SHFLI. There is no SHFLIW instruction, but we can promote a 32-bit SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't set, a 64-bit SHFLI will preserve 33 sign bits if the input had at least 33 sign bits. ComputeNumSignBits has been updated to account for that to avoid sext.w in the tests. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96661	2021-02-19 10:07:12 -08:00
Fraser Cormack	d9531a3097	[RISCV] Address some clang-tidy warnings. NFCI.	2021-02-19 12:10:28 +00:00
Craig Topper	cd4051ac80	[RISCV] Prune unneeded indexed load/store pseudo instructions. We were creating more combinations of value and index lmul than we needed. I've copied the loop structure used here from VPseudoAMOEI with all data sew values instead of just 32/64. Similar can be done for segment loads/store. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D97008	2021-02-18 23:08:39 -08:00
Craig Topper	8ed3bbbcc3	[RISCV] Split zvlsseg searchable table into 4 separate tables. Index by properties rather than intrinsic ID. Intrinsic ID is a 32-bit value which made each row of the table 4 byte aligned. The remaining fields used 5 bytes. This meant 3 bytes of padding per row. This patch breaks the table into 4 separate tables and indexes them by properties we know about the intrinsic: NF, masked, strided, ordered, etc. The indexed load/store tables have no padding in their rows now. Altogether this reduces the size of the llc binary by ~28K. I'm considering adding similar tables for isel of non-segment load/store as well to cut down the size of the isel table and probably improve our isel performance. Those tables would need to be indexed from intrinsics, IR loads/stores, gathers/scatters, and RISCVISD opcodes. So having a table that can be indexed without using intrinsic ID is more flexible. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D96894	2021-02-18 19:00:49 -08:00
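The padding arithmetic in the entry above (a 4-byte key plus 5 bytes of fields rounding up to 12 bytes per row) can be reproduced with a minimal sketch. The two row types and their field names are hypothetical stand-ins, not the actual tablegen-generated structs, and the sizes assume a mainstream ABI where `uint32_t` is 4-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical row keyed by a 32-bit intrinsic ID. The 4-byte key forces
// 4-byte alignment, so 4 + 5 = 9 bytes of fields round up to 12.
struct RowKeyedByID {
  uint32_t IntrinsicID;
  uint8_t Payload[5]; // stand-in for the remaining 5 bytes of fields
};

// Hypothetical row keyed by one-byte properties instead. Every field has
// alignment 1, so no padding is needed.
struct RowKeyedByProperties {
  uint8_t NF;
  uint8_t Masked;
  uint8_t Strided;
  uint8_t Ordered;
  uint8_t Payload[5];
};

// Bytes of padding added to a row whose fields occupy `fieldBytes` bytes.
template <typename Row>
constexpr std::size_t paddingBytes(std::size_t fieldBytes) {
  return sizeof(Row) - fieldBytes;
}

// Usage example; the expected values hold on typical 32- and 64-bit ABIs.
void checkLayouts() {
  assert(paddingBytes<RowKeyedByID>(4 + 5) == 3);
  assert(paddingBytes<RowKeyedByProperties>(4 + 5) == 0);
}
```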
Craig Topper	cf34559104	[RISCV] Enable PrimaryKeyEarlyOut on RISCVVPseudosTable. This table is queried in RISCVMCInstLower without knowing whether the instruction is a vector pseudo. Due to the way the binary search works, we have to do log2(tablesize) checks just to determine a non-vector instruction isn't in the table. Conveniently, all the vector pseudos are pretty tightly packed within the internal instruction enum. By enabling the PrimaryKeyEarlyOut, tablegen will emit a check against the beginning and end of the table before doing the binary search. This gives a quick early out on the search for the majority of non-vector instructions. Differential Revision: https://reviews.llvm.org/D97016	2021-02-18 18:59:32 -08:00
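The early-out described in the entry above can be sketched as a range check placed in front of a binary search. This is an illustration of the idea only, not the code tablegen emits; `PseudoInfo`, its fields, and `lookupPseudo` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical table row: a pseudo opcode (the primary key) and the real
// instruction it maps to.
struct PseudoInfo {
  uint16_t Opcode;
  uint16_t BaseInstr;
};

// Looks up `opcode` in a table sorted by Opcode. Keys outside the
// [first, last] range are rejected with two comparisons instead of
// roughly log2(table size) binary-search steps.
const PseudoInfo *lookupPseudo(const std::vector<PseudoInfo> &table,
                               uint16_t opcode) {
  assert(std::is_sorted(table.begin(), table.end(),
                        [](const PseudoInfo &a, const PseudoInfo &b) {
                          return a.Opcode < b.Opcode;
                        }));
  // Early out: the key cannot be in the table.
  if (table.empty() || opcode < table.front().Opcode ||
      opcode > table.back().Opcode)
    return nullptr;
  auto it = std::lower_bound(
      table.begin(), table.end(), opcode,
      [](const PseudoInfo &row, uint16_t key) { return row.Opcode < key; });
  if (it == table.end() || it->Opcode != opcode)
    return nullptr;
  return &*it;
}
```

The range check pays off when most queried keys fall outside a tightly packed key range, which is the situation the commit message describes for non-vector instructions.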
Craig Topper	0db938312a	[RISCV] Simplify VPseudoAMOEI multiclass. NFC lmul was already iterated in one of the loops. We don't need to recreate it from a string.	2021-02-18 12:40:51 -08:00
Jessica Clarke	74df1ffaad	[RISCV] Use XLenRI alias for RegInfoByHwMode instances This avoids tedious repetition and matches what we do for the ValueTypeByHwMode uses. Reviewed By: craig.topper, luismarques Differential Revision: https://reviews.llvm.org/D96649	2021-02-18 19:38:36 +00:00
Craig Topper	156fc07e19	[RISCV] Add support for fixed vector MULHU/MULHS. This allows the division by constant optimization to use MULHU/MULHS. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D96934	2021-02-18 09:15:08 -08:00
Craig Topper	792627be35	[RISCV] Add support for fixed vector sign/zero extend from mask types. Due to vXi64 on RV32, I've directly emitted this using _VL ISD opcodes. If it wasn't for that we could just use fixed vector BUILD_VECTOR and VSELECT and let those each be legalized. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96910	2021-02-18 09:08:10 -08:00
Craig Topper	c7dd92e8a5	[RISCV] Support isel of scalable vector bitcasts These should be NOPs so we can just replace with the input. This matches what SVE does with isel patterns for all permutations. Custom isel saves us from having to list all permutations for all LMULs. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96921	2021-02-18 09:01:13 -08:00
Hsiangkai Wang	065a187f33	[RISCV] Fix typo. Use ValueType instead of LLVMType.	2021-02-18 23:21:27 +08:00
Hsiangkai Wang	f1efa8abaf	[RISCV] Fix bugs in pseudo instructions for masked segment load. For masked segment load, the destination register should not overlap with mask register. It could not be V0. In the original implementation, there is no segment load/store register class without V0. In this patch, I added these register classes and modify `GetVRegNoV0` to get the correct one. Differential Revision: https://reviews.llvm.org/D96937	2021-02-18 22:17:00 +08:00
Hsiangkai Wang	b97d8b32c3	[NFC][RISCV] Use concise way to describe load/store instructions. Differential Revision: https://reviews.llvm.org/D96923	2021-02-18 22:17:00 +08:00
Benjamin Kramer	ae1e6c3557	[RISCV] Rewrite assert to not give unused variable warnings in Release builds NFCI	2021-02-18 11:42:36 +01:00
Fraser Cormack	d876214990	[RISCV] Begin to support more subvector inserts/extracts This patch adds support for INSERT_SUBVECTOR and EXTRACT_SUBVECTOR (nominally where both operands are scalable vector types) where the vector, subvector, and index align sufficiently to allow decomposition to subregister manipulation: * For extracts, the extracted subvector must correctly align with the lower elements of a vector register. * For inserts, the inserted subvector must be at least one full vector register, and correctly align as above. This approach should work for fixed-length vector insertion/extraction too, but that will come later. Reviewed By: craig.topper, khchen, arcbbb Differential Revision: https://reviews.llvm.org/D96873	2021-02-18 10:18:27 +00:00
Craig Topper	016eca8f90	[RISCV] Guard LowerINSERT_VECTOR_ELT against fixed vectors. The type legalizer can call this code based on the scalar type so we need to verify the vector type is a scalable vector. I think due to how type legalization visits nodes, the vector type will have already been legalized so we don't have an issue with using MVT here like we did for EXTRACT_VECTOR_ELT. I've added a test just in case.	2021-02-17 19:27:08 -08:00
Craig Topper	00c4e0a8f6	[RISCV] Guard the ISD::EXTRACT_VECTOR_ELT handling in ReplaceNodeResults against fixed vectors and non-MVT types. The type legalizer is calling this code based on the scalar type so we need to verify the input type is a scalable vector. The vector type has also not been legalized yet when this is called so we need to use EVT for it.	2021-02-17 18:25:38 -08:00
Craig Topper	3bdd02735b	[RISCV] Localize RISCVZvlssegTable to RISCVISelDAGToDAG.cpp, the only place it is used.	2021-02-17 11:37:28 -08:00
Craig Topper	799f7865c8	[RISCV] Use bits<7> instead of bits<11> for the EEW field size in the RISCVZvlsseg searchable table. NFCI We only support 8, 16, 32, and 64 for EEW. These only need 7 bits to represent.	2021-02-17 11:12:36 -08:00
Craig Topper	d4353a3101	[RISCV] Merge the handlers for masked and unmasked segment loads/stores. A lot of the code for the masked and unmasked is the same. This patch adds a boolean to handle the differences so we can share the code. Differential Revision: https://reviews.llvm.org/D96841	2021-02-17 10:08:33 -08:00
Craig Topper	6f30d0035a	[RISCV] Merge the vsetvli and vsetvlimax intrinsic selection These have very similar code just with a different number of operands and handling for vsetivl. Differential Revision: https://reviews.llvm.org/D96834	2021-02-17 10:08:33 -08:00
luxufan	709ea8bc87	[RISCV] Simplify BP initialisation We can re-use copyPhysReg rather than writing a specialised copy. Differential Revision: https://reviews.llvm.org/D95227	2021-02-17 20:33:20 +08:00
Fraser Cormack	d81161646a	[RISCV] Add support for fixed vector vselect This patch adds support for fixed-length vector vselect. It does so by lowering them to a custom unmasked VSELECT_VL node with a vector length operand. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96768	2021-02-17 10:59:00 +00:00
Hsiangkai Wang	a3c783dbf2	[RISCV] Spilling for RISC-V V extension. (2nd version) Differential Revision: https://reviews.llvm.org/D95148	2021-02-17 14:05:19 +08:00
Hsiangkai Wang	5a31a67385	[RISCV] Frame handling for RISC-V V extension. This patch proposes how to deal with RISC-V vector frame objects. The layout of RISC-V vector frame will look like

|---------------------------------|
| scalar callee-saved registers   |
|---------------------------------|
| scalar local variables          |
|---------------------------------|
| scalar outgoing arguments       |
|---------------------------------|
| RVV local variables &&          |
| RVV outgoing arguments          |
|---------------------------------| <- end of frame (sp)

If there is realignment or variable length array in the stack, we will use frame pointer to access fixed objects and stack pointer to access non-fixed objects.

|---------------------------------| <- frame pointer (fp)
| scalar callee-saved registers   |
|---------------------------------|
| scalar local variables          |
|---------------------------------|
|     ///// realignment /////     |
|---------------------------------|
| scalar outgoing arguments       |
|---------------------------------|
| RVV local variables &&          |
| RVV outgoing arguments          |
|---------------------------------| <- end of frame (sp)

If there are both realignment and variable length array in the stack, we will use frame pointer to access fixed objects and base pointer to access non-fixed objects.

|---------------------------------| <- frame pointer (fp)
| scalar callee-saved registers   |
|---------------------------------|
| scalar local variables          |
|---------------------------------|
|     ///// realignment /////     |
|---------------------------------| <- base pointer (bp)
| RVV local variables &&          |
| RVV outgoing arguments          |
|---------------------------------|
| /////////////////////////////// |
|      variable length array      |
| /////////////////////////////// |
|---------------------------------| <- end of frame (sp)
| scalar outgoing arguments       |
|---------------------------------|

In this version, we do not save the addresses of RVV objects in the stack. We access them directly through the polynomial expression (a x VLENB + b). We do not reserve frame pointer when there is any RVV object in the stack. So, we also access the scalar frame objects through the polynomial expression (a x VLENB + b) if the access across RVV stack area. Differential Revision: https://reviews.llvm.org/D94465	2021-02-17 14:05:19 +08:00
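The polynomial addressing mentioned in the entry above, (a x VLENB + b), can be illustrated with a minimal sketch. `ScalableOffset` and `resolve` are hypothetical names for this example and do not correspond to the patch's actual data structures; the only assumption taken from the message is that an offset has a part scaled by VLENB and a fixed part.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical frame offset of the form a * VLENB + b, where VLENB is the
// vector register length in bytes and is only known at run time.
struct ScalableOffset {
  int64_t Scalable; // a: multiples of VLENB
  int64_t Fixed;    // b: plain bytes
};

// Offsets compose component-wise, so crossing the RVV area simply adds its
// scalable size to the scalar offset.
ScalableOffset operator+(ScalableOffset lhs, ScalableOffset rhs) {
  return {lhs.Scalable + rhs.Scalable, lhs.Fixed + rhs.Fixed};
}

// Evaluates the polynomial for a concrete VLENB.
int64_t resolve(ScalableOffset off, int64_t vlenb) {
  assert(vlenb > 0 && "VLENB must be positive");
  return off.Scalable * vlenb + off.Fixed;
}
```

For instance, a scalar object 16 bytes beyond an RVV area of two vector registers sits at offset 2 x VLENB + 16, which resolves to 48 when VLENB is 16 and to 80 when VLENB is 32.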
Craig Topper	61a238e6e1	[RISCV] Add isel patterns for fixed vector fmsub/fnmadd/fnmsub.	2021-02-16 12:03:33 -08:00
Craig Topper	07ca13fe07	[RISCV] Add support for fixed vector mask logic operations. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96741	2021-02-16 09:34:00 -08:00
Fraser Cormack	04977ce5ce	[RISCV] Fix a crash in fixed-length build_vector lowering Non-splatted non-integer build_vector nodes were mistakenly being lowered as VID expressions, which should not happen. VID can only be used to select integer build_vector nodes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96718	2021-02-16 10:25:15 +00:00
Fraser Cormack	b870199020	[RISCV] Add patterns for scalable-vector fabs & fcopysign The patterns mostly follow the scalar counterparts, save for some extra optimizations to match the vector/scalar forms. The patch adds a DAGCombine for ISD::FCOPYSIGN to try and reorder ISD::FNEG around any ISD::FP_EXTEND or ISD::FP_TRUNC of the second operand. This helps us achieve better codegen to match vfsgnjn. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96028	2021-02-16 10:21:09 +00:00
Craig Topper	29b894a8d3	[RISCV] Add explicit i32/i64 types to RV32 or RV64 only isel patterns. NFC This stops tablegen from generating patterns with the opposite type in the opposite HwMode. This just adds wasted bytes to the isel table. This reduces the isel table by about 1800 bytes.	2021-02-15 14:36:05 -08:00
Craig Topper	7ba2e1c601	[RISCV] Add support for fixed vector floating point setcc. This is annoying because the condition code legalization belongs to LegalizeDAG, but our custom handler runs in Legalize vector ops which occurs earlier. This adds some of the mask binary operations so that we can combine multiple compares that we need for expansion. I've also fixed up RISCVISelDAGToDAG.cpp to handle copies of masks. This patch contains a subset of the integer setcc patch as well. That patch is dependent on the integer binary ops patch. I'll rebase based on what order the patches go in. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96567	2021-02-15 12:52:25 -08:00
Fraser Cormack	4bd5bd4009	[RISCV] Convert VSLIDE(UP\|DOWN) nodes to "VL" versions (NFC) This patch prepares the RISCV VSLIDEUP and VSLIDEDOWN custom nodes to ones carrying additional mask and vector-length operands. This is primarily so they can be used by both systems. This also takes the opportunity to create some helper functions to deal with the common task of getting the default (unmasked) VL operands. Reviewed By: craig.topper, arcbbb Differential Revision: https://reviews.llvm.org/D96505	2021-02-15 10:32:56 +00:00
Craig Topper	3520371ddb	[RISCV] Rename the RVVBaseAddr ComplexPattern to just BaseAddr and use it to merge some scalar load/store patterns too.	2021-02-13 12:01:51 -08:00
Craig Topper	532d4bf025	[RISCV] Move riscv_vfmv_v_f_vl patterns to RISCVInstrInfoVVLPatterns.td for consistency with riscv_vmv_v_x_vl. NFC	2021-02-12 16:08:27 -08:00
Craig Topper	4220a81c84	[RISCV] Add support for fixed vector fabs	2021-02-12 15:33:36 -08:00
Craig Topper	36658376d5	[RISCV] Add support for fixed vector sqrt.	2021-02-12 15:33:29 -08:00
Craig Topper	d32ed9b27e	[RISCV] Use a ComplexPattern to merge the PatFrags for removing unneeded masks on shift amounts. Rather than having patterns with and without an AND, use a ComplexPattern to handle both cases. Reduces the isel table by about 700 bytes.	2021-02-12 14:03:23 -08:00
Craig Topper	1697cc78b1	[RISCV] Add support for integer fixed vector setcc I believe I've covered all orderings of splat operands here. Better canonicalization in lowering might help reduce this. I did not handle the immediate adjustments needed for set(u)gt/set(u)lt. Testing here is limited to byte types because the scalable vector type used for masks for the store is calculated assuming 8 byte elements. But for the setcc it's based on the element count of the container type for the setcc input. So they don't agree. We'll need to enhance D96352 to handle this I think. Differential Revision: https://reviews.llvm.org/D96443	2021-02-12 09:29:41 -08:00
Craig Topper	875c76de2b	[RISCV] Add support for matching .vx and .vi forms of binary instructions for fixed vectors. Unlike scalable vectors, I'm only using a ComplexPattern for the immediate itself. The vmv_v_x is matched explicitly. We ignore the VL argument when matching a binary operator, but we do check it when matching splat directly. I left out tests for vXi64 as they fail on rv32 right now. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96365	2021-02-12 09:18:10 -08:00
luxufan	feaf1d81e3	[RISCV] Change parseVTypeI function Change the parseVTypeI function to make the added vset instruction test cases report more concrete error messages. Differential Revision: https://reviews.llvm.org/D96218	2021-02-12 19:38:34 +08:00
Fraser Cormack	e88da1d677	[RISCV] Add support for integer fixed min/max This patch extends the initial fixed-length vector support to include smin, smax, umin, and umax. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96491	2021-02-12 09:19:45 +00:00
Craig Topper	7a7836b4d8	[RISCV] Add a pattern for a scalable vector mask vnot. We can use a vnand.mm with the same register for both inputs. This avoids materializing an all ones constant with vmset.mm.	2021-02-11 15:34:58 -08:00
ShihPo Hung	9e62c9146d	[RISCV] Initial support for insert/extract subvector This patch handles cast-like insert_subvector & extract_subvector in which case: 1. index starts from 0. 2. inserting a fixed-width vector into a scalable vector, or extracting a fixed-width vector from a scalable vector. Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D96352	2021-02-11 14:35:49 -08:00
Craig Topper	033b1bd185	[RISCV] Add support for loads, stores, and splats of vXi1 fixed vectors. This refines how we determine which mask types are legal and adds support for loads, stores, and all ones/zeros splats. I left a fixme in store handling where I think we need to zero extra bits if the type isn't a multiple of a byte. If I remember right from X86 there was some case we could have a store of a 1, 2, or 4 bit mask and have a scalar zextload that then expected the bits to be 0. It's tricky to zero the bits with RVV. We need to do something like round VL up, zero a register, lower the VL back down, then do a tail undisturbed move into the zero register. Another option might be to generate a mask of 1/2/4 bits set with a VL of 8 and use that to mask off the bits. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96468	2021-02-11 09:13:16 -08:00
Jessica Clarke	ca606dc988	[RISCV] More whitespace and comment typo fixes in RISCVInstrInfoC.td	2021-02-11 02:32:36 +00:00
Jessica Clarke	0973ce8596	[RISCV] Fix whitespace in RISCVInstrInfoC.td	2021-02-11 02:23:09 +00:00
Craig Topper	350ab4e617	[RISCV] Use OperandTransform field of ImmLeaf to slightly simplify a couple bitmanip patterns. NFC This binds the SDNodeXForm to the ImmLeaf so we only need to mention the ImmLeaf in both the input and output pattern.	2021-02-10 17:52:07 -08:00
Craig Topper	fc4d780eaf	[RISCV] Remove superfluous semicolon. NFC	2021-02-10 11:20:29 -08:00
Craig Topper	cb161b3a88	[RISCV] Add support for matching .vf forms of fadd/fsub/fmul/fdiv/fma for fixed vectors. fma+neg will come in a different patch since I haven't done it for .vv yet either. Differential Revision: https://reviews.llvm.org/D96375	2021-02-10 10:16:27 -08:00
Craig Topper	0c254b4a69	[RISCV] Add support for selecting vrgather.vx/vi for fixed vector splat shuffles. The test cases extract a fixed element from a vector and splat it into a vector. This gets DAG combined into a splat shuffle. I've used some very wide vectors in the test to make sure we have at least a couple tests where the element doesn't fit into the uimm5 immediate of vrgather.vi so we fall back to vrgather.vx. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96186	2021-02-10 10:01:56 -08:00
Fraser Cormack	a3c74d6d53	[RISCV] Add support for selecting vid.v from build_vector This patch optimizes a build_vector "index sequence" and lowers it to the existing custom RISCVISD::VID node. This pattern is common in autovectorized code. The custom node was updated to allow it to be used by both scalable and fixed-length vectors, thus avoiding pattern duplication. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96332	2021-02-10 10:58:40 +00:00
Craig Topper	18ff7e045a	[RISCV] Make the min and max vector width command line options more consistent and check their relationship to each other.	2021-02-09 10:47:23 -08:00
Craig Topper	fd5adae02c	[RISCV] Remove SRO* and SLO* instructions from bitmanip. As of the current draft these are no longer being considered for the bitmanip spec. It wasn't clear what sub extension they belonged in in the 0.93 spec. So remove them. They can always be added back if something changes. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96157	2021-02-09 09:35:05 -08:00
Nemanja Ivanovic	f6e4b9fc06	[RISCV] Fix shared libs build Commit `a2d19bad07` introduced a dependency in the RISCV disassembler on two additional libraries (MC, RISCVDesc) which wasn't added to the CMakeLists.txt. This causes shared library builds to break. This patch just adds them to fix failures seen on some bots, such as the PPC64LE Multistage.	2021-02-09 06:14:25 -06:00
Hsiangkai Wang	a2d19bad07	[RISCV] Use whole register load/store for generic load/store. In vector v0.10, there are whole vector register load/store instructions. I suggest to use the whole register load/store instructions for generic load/store for scalable vector types. It could save up vset{i}vl{i} for these load/store. For fractional LMUL, I keep to use vle{eew}.v/vse{eew}.v instructions to load/store partial vector registers. Differential Revision: https://reviews.llvm.org/D95853	2021-02-09 15:52:04 +08:00
Hsiangkai Wang	a5b07a221a	[RISCV] Initial support of LoopVectorizer for RISC-V Vector. Define an option -riscv-vector-bits-max to specify the maximum vector bits for the vectorizer. The loop vectorizer will use the value to check whether it is safe to use whole vector registers to vectorize the loop. This is not the optimal solution for loop vectorization with scalable vectors: it assumes that whole vector registers will be used to vectorize the code. If possible, we should configure vl for vectorization instead of using whole vector registers. We only consider LMUL = 1 in this patch. This patch is just initial work on the loop vectorizer for RISC-V Vector. Differential Revision: https://reviews.llvm.org/D95659	2021-02-09 06:32:18 +08:00
Craig Topper	b49aaed8c7	[RISCV] Use _COMMUTABLE fma pseudos for fixed vectors. This matches what we do in the VLMAX SDNode patterns.	2021-02-08 11:27:23 -08:00
Craig Topper	8d8cafa32e	[RISCV] Add support for splat fixed length build_vectors using RVV. Building on the fixed vector support from D95705 I've added ISD nodes for vmv.v.x and vfmv.v.f and switched to lowering the intrinsics to it. This allows us to share the same isel patterns for both. This doesn't handle splats of i64 on RV32 yet. The build_vector gets converted to a vXi32 build_vector+bitcast during type legalization. Not sure the best way to handle this at the moment. Differential Revision: https://reviews.llvm.org/D96108	2021-02-08 11:12:56 -08:00
Craig Topper	b8d719fbe8	[RISCV] Add support for fixed vector FMA. Follow up to D95705. Does not include the commuting support from D95800. Differential Revision: https://reviews.llvm.org/D96103	2021-02-08 11:12:56 -08:00
Craig Topper	a719b667a9	[RISCV] Add initial support for converting fixed vectors to scalable vectors during lowering to use RVV instructions. This is an alternative to D95563. This is modeled after a similar feature for AArch64's SVE that uses predicated scalable vector instructions. Rather than use predication, this patch uses an explicit VL operand. I've limited it to always use LMUL=1 for now, but we can improve this in the future. This requires a bunch of new ISD opcodes to carry the VL operand. I think we can probably lower intrinsics to these ISD opcodes to cut down on the size of the isel table. Which is why I've added patterns for all integer/float types and not just LMUL=1. I'm only testing one vector width right now, but the width is programmable via the command line. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95705	2021-02-08 10:41:30 -08:00
Craig Topper	b7b4f4cbc3	[RISCV] Make scalable vector FMA commutable for register allocation. This adds support for commuting operands and converting between vfmadd and vfmacc to avoid register copies. To avoid messing up intrinsic behavior, I've added new pseudo instructions that have the isCommutable flag set. These pseudos also force a tail agnostic policy. The intrinsic versions still use the tail undisturbed policy. For best results it looks like we need to start with fmadd and only pick fmacc if it's beneficial. MachineCSE commutes without constraining the operands and then commutes back if it didn't help with CSE. So I've made sure that when the operand choice isn't constrained, we will keep fmadd for MachineCSE and when it does the second commute, we get back the original instruction. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95800	2021-02-08 10:05:33 -08:00
Craig Topper	cc2c45dc54	[RISCV] Use SplatPat/SplatPat_simm5 to handle PseudoVMV_V_X_/PseudoVMV_V_I_ selection as well. This ensures that we'll match immediates consistently regardless of whether we match them as a standalone splat or as part of another operation. While I was there I added complexities to the simm5/uimm5 patterns so we didn't have to assume that the 1 on the non-immediate was lower than what tablegen inferred. I had to make a minor tweak to tablegen to fix one place that didn't expect to see a ComplexPattern that wasn't a "leaf". Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D96199	2021-02-08 09:48:27 -08:00
Mikael Holmen	eb8c27c60c	[RISCV] Use std::make_tuple to make some toolchains happy again My toolchain (LLVM 8.0, libstdc++ 5.4.0) complained with: 12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1717:12: error: chosen constructor is explicit in copy-initialization 12:38:19 return {RISCVISD::VECREDUCE_FADD, Op.getOperand(0), 12:38:19 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here 12:38:19 constexpr tuple(_UElements&&... __elements) 12:38:19 ^ 12:38:19 ../lib/Target/RISCV/RISCVISelLowering.cpp:1720:12: error: chosen constructor is explicit in copy-initialization 12:38:19 return {RISCVISD::VECREDUCE_SEQ_FADD, Op.getOperand(1), Op.getOperand(0)}; 12:38:19 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 12:38:19 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here 12:38:19 constexpr tuple(_UElements&&... __elements) 12:38:19 ^ 12:38:19 2 errors generated. This commit adds explicit calls to std::make_tuple to work around the problem.	2021-02-08 14:37:25 +01:00
Fraser Cormack	b46aac125d	[RISCV] Support the scalable-vector fadd reduction intrinsic This patch adds support for the fadd reduction intrinsic, in both the ordered and unordered modes. The fmin and fmax intrinsics are not currently supported due to a discrepancy between the LLVM semantics and the RVV ISA behaviour with regards to signaling NaNs. This behaviour is likely fixed in version 2.3 of the RISC-V F/D/Q extension, but until then the intrinsics can be left unsupported. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95870	2021-02-08 09:52:27 +00:00
Craig Topper	3c767b96dc	[RISCV] Correct types in tablegen multiclasses found by D95874.	2021-02-05 11:55:58 -08:00
Fraser Cormack	e046c0c28b	[RISCV] Support scalable-vector integer reduction intrinsics This patch adds support for the integer reduction intrinsics supported by RVV. This excludes "mul" which has no corresponding instruction. The reduction instructions in RVV have slightly complicated type constraints given they always produce a single "M1" vector register. They are lowered to custom nodes including the second "scalar" reduction operand to simplify the patterns and in the hope that they can be useful for future DAG combines. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95620	2021-02-05 10:10:08 +00:00
Fraser Cormack	c3eb2da6c4	[RISCV] Optimize sign-extended EXTRACT_VECTOR_ELT nodes This patch custom-legalizes all integer EXTRACT_VECTOR_ELT nodes where SEW < XLEN to VMV_S_X nodes to help the compiler infer sign bits from the result. This allows us to eliminate redundant sign extensions. For parity, all integer EXTRACT_VECTOR_ELT nodes are legalized this way so that we don't need TableGen patterns for some and not others. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95741	2021-02-05 10:05:22 +00:00
Fraser Cormack	af48d2bfc2	[RISCV] Add patterns for scalable-vector fsqrt This patch adds support for lowering the sqrt intrinsic to the RVV vfsqrt instruction. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96012	2021-02-05 09:39:19 +00:00
Craig Topper	6b280ce34c	[RISCV] Use LLVMScalarOrSameVectorWidth to avoid needing to mention the index type for vrgatherei16 intrinsics. Add .vv to the intrinsic name to be consistent with D95979. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D95981	2021-02-04 20:26:45 -08:00
Craig Topper	25ff302a79	[RISCV] Split vrgather intrinsics into separate vrgather.vv and vrgather.vx intrinsics. The vrgather.vv instruction uses a vector of indices with the same SEW as operand 0. The vrgather.vx instructions use a scalar index operand of XLen bits. By splitting this into 2 intrinsics we are able to use LLVMatchType in the definition to avoid specifying the type for the index operand when creating the IR for the intrinsic. For .vv it will match the operand 0 type. And for .vx it will match the type of the vl operand we already needed to specify a type for. I'm considering splitting more intrinsics. This was a somewhat odd one because the .vx doesn't use the element type, it always use XLen. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D95979	2021-02-04 19:50:12 -08:00
Hsiangkai Wang	63baeec66e	[RISCV] Load/store vector mask types. Use vle1.v/vse1.v to load/store vector mask types. Differential Revision: https://reviews.llvm.org/D93364	2021-02-03 13:44:15 +08:00
Hsiangkai Wang	c7189ba785	[RISCV] Add new vector instructions in v0.10. * Add new vector instructions in v0.10. - load/store for mask value vle1.v vse1.v - vsetivli for 0-31 immediate vector length. * Rename vector instructions in v0.10. - vfrsqrte7 -> vfrsqrt7 - vfrece7 -> vfrec7 * Reserve memory width encodings for EEW>128b. Differential Revision: https://reviews.llvm.org/D95781	2021-02-03 13:28:58 +08:00
Fraser Cormack	b4106f9c7b	[RISCV] Fix incorrect RVV sdiv/udiv lowering Due to a clerical error, the sdiv operation was mapping to vdivu and udiv to vdiv, when the opposite mapping is the correct one. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95869	2021-02-02 18:35:53 +00:00
Craig Topper	c4fd1981a7	[RISCV] Correct types in tablegen multiclasses found by D95874.	2021-02-02 10:39:47 -08:00
Craig Topper	912306ef21	[RISCV] Use a ComplexPattern to merge isel patterns for vector load/store with GPR and FrameIndex addresses. This reduces the isel table size by about 3000 bytes. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95844	2021-02-02 10:20:52 -08:00
Craig Topper	e7f9a83499	[RISCV] Replace NoX0 SDNodeXForm with a ComplexPattern to do the selection of the VL operand. I think this is a more standard way of doing this. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D95833	2021-02-02 00:08:58 -08:00
Craig Topper	72b31ad4b8	[RISCV] Add scalable vector support for floating point FMA instructions A follow up patch will add support for commuting operands or changing opcode to vfmacc and friends. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95662	2021-02-01 09:52:43 -08:00
Craig Topper	6a3ab66625	[RISCV] Update comment text from D95774. NFC	2021-02-01 09:52:43 -08:00
Craig Topper	1097ee61bf	[RISCV] Optimize (srl (and X, 0xffff), C) -> (srli (slli X, 16), 16 + C). Rather than materializing the 0xffff immediate for the AND, use a shift left to remove the upper bits and then shift in zeros from the right. This pattern occurs when type legalizing an i16 right shift. I've implemented this with custom selection code for a number of reasons. I've limited this to the AND having a single use. We need to compensate for SimplifyDemandedBits altering the AND mask. I'm using *W opcodes on RV64. We may want to generalize this in the future. For all these reasons it seemed easiest to do it this way. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95774	2021-02-01 09:37:55 -08:00
Craig Topper	44cc5abbf9	[RISCV] Custom lower fshl/fshr with Zbt extension. We need to add a mask to the shift amount for these operations to use the FSR/FSL instructions. We were previously doing this in isel patterns, but custom lowering will make the mask visible to optimizations earlier.	2021-01-31 17:49:15 -08:00
Craig Topper	3fdf2a56dd	[RISCV] Use MVT instead of EVT in RISCVISelDAGToDAG.cpp All this code runs post type legalization so we should have exclusively legal types. The methods on MVT should be more efficient than EVT.	2021-01-30 15:57:15 -08:00
Hsiangkai Wang	9847023660	[RISCV] Update the version number to v0.10 for vector.	2021-01-30 07:55:58 +08:00
Hsiangkai Wang	282aca10ae	[RISCV] Update the version number to v0.10 for vector. v0.10 is tagged in V specification. Update the version to v0.10. Differential Revision: https://reviews.llvm.org/D95680	2021-01-30 07:20:05 +08:00
Hsiangkai Wang	e08b67f3a8	[NFC][RISCV] Remove redundant pseudo instructions for vector load/store. We do not need to support all combinations of SEW and LMUL. For example, we only need to support [M1, M2, M4, M8] for SEW = 64. There is no need to define pseudos for PseudoVLSE64MF8, PseudoVLSE64MF4, and PseudoVLSE64MF2. Differential Revision: https://reviews.llvm.org/D95667	2021-01-30 07:20:05 +08:00
Kazu Hirata	046cfb8565	[llvm] Forward-declare formatted_raw_ostream (NFC) Various TargetStreamer.h need formatted_raw_ostream but rely on a forward declaration of formatted_raw_ostream in MCStreamer.h. This patch adds forward declarations right in TargetStreamer.h. While we are at it, this patch removes the one in MCStreamer.h, where it is unnecessary.	2021-01-28 22:21:13 -08:00
Christudasan Devadasan	892e4567e1	Support a list of CostPerUse values This patch allows targets to define multiple cost values for each register so that the cost model can be more flexible and better used during the register allocation as per the target requirements. For AMDGPU the VGPR allocation will be more efficient if the register cost can be associated dynamically based on the calling convention. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D86836	2021-01-29 10:14:52 +05:30
Craig Topper	c5d4b77b17	[RISCV] Remove isel patterns for Zbs *W instructions. These instructions have been removed from the 0.94 bitmanip spec. We should focus on optimizing the codegen without using them. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D95302	2021-01-28 09:33:56 -08:00
Craig Topper	ae82a8c863	[RISCV] Add support for scalable vector fneg using vfsgnjn.vv Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95568	2021-01-28 09:11:49 -08:00
Simon Pilgrim	aa76cebab5	Fix "32-bit shift result used in 64-bit comparison" MSVC warning. NFCI.	2021-01-28 11:21:36 +00:00
Fraser Cormack	fc2f27ccf3	[RISCV] Add support for RVV int<->fp & fp<->fp conversions This patch adds support for the full range of vector int-to-float, float-to-int, and float-to-float conversions on legal types. Many conversions are supported natively in RVV so are lowered with patterns. These include conversions between (element) types of the same size, and those that are half/double the size of the input. When conversions take place between types that are less than half or more than double the size we must lower them using sequences of instructions which go via intermediate types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95447	2021-01-28 09:50:32 +00:00
Craig Topper	5d05cdf55c	[RISCV] Copy isUnneededShiftMask from X86. In `d2927f786e`, I added patterns to remove (and X, 31) from sllw/srlw/sraw shift amounts. There is code in SelectionDAGISel.cpp that knows to use computeKnownBits to fill in bits of the mask that were removed by SimplifyDemandedBits based on bits being known zero. The non-W shift patterns use immbottomxlenset which allows the mask to have more than log2(xlen) trailing ones, but doesn't have a call to computeKnownBits to fill in bits of the mask that may have been cleared by SimplifyDemandedBits. This patch copies code from X86 to handle more than log2(xlen) bottom bits set and uses computeKnownBits to fill in missing bits before counting. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95422	2021-01-27 20:46:10 -08:00
Craig Topper	58aa049b9b	[RISCV] Move RISCVVPseudosTable from RISCVBaseInfo.h to RISCVInstrInfo.h. NFC RISCVBaseInfo.h belongs to the MC layer, but the Pseudo instructions are only used by the CodeGen layer. So it makes sense to keep this table in the CodeGen layer.	2021-01-27 13:38:26 -08:00
Craig Topper	ff038b316d	[RISCV] Reduce field sizes in searchable tables to reduce binary size.	2021-01-27 12:24:01 -08:00
Craig Topper	a40e01e442	[RISCV] Rework fault first only load isel. -Remove the ISD opcode for READ_VL. Just emit the MachineSDNode directly. -Move segmented fault first only load intrinsic handling completely to RISCVISelDAGToDAG.cpp and emit the ReadVL MachineSDNode there instead of lowering to ISD opcodes first.	2021-01-27 11:51:41 -08:00
Craig Topper	04570e98c8	[RISCV] Group the legal vector types into lists we can iterate over in the RISCVISelLowering constructor Remove the RISCVVMVTs namespace because I don't think it provides a lot of value. If we change the mappings we'd likely have to add or remove things from the list anyway. Add a wrapper around addRegisterClass that can determine the register class from the fixed size of the type. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D95491	2021-01-27 10:20:12 -08:00
Fraser Cormack	9a75a808c2	[RISCV] Fix a codegen crash in getSetCCResultType This patch fixes some crashes coming from `RISCVISelLowering::getSetCCResultType`, which would occasionally return an EVT constructed from an invalid MVT, which has a null Type pointer. The attached test shows this happening currently for some fixed-length vectors, which hit this issue when the V extension was enabled, even though they're not legal types under the V extension. The fix was also pre-emptively extended to scalable vectors which can't be represented as an MVT, even though a test case couldn't be found for them. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95434	2021-01-27 10:22:54 +00:00
Craig Topper	f9d7f77267	[RISCV] Have customLegalizeToWOp truncate to the original type instead of i32 now that we use it for i8/i16 as well. `239cfbccb0` added support for legalizing i8/i16 UDIV/UREM/SDIV to use *W instructions. So we need to truncate to i8/i16 if we're legalizing one of those.	2021-01-26 10:50:03 -08:00
Craig Topper	bfc60acd98	[RISCV] Adjust RISCVInstrInfoVSDPatterns.td for different pseudo instructions for different FPR. Move the Suffix string into the VTypeInfo class so we don't need a helper class to get to it. Adjust pseudo naming scheme for FPRs to put F16/F32/F64 in place of F in the pseudo instruction name rather than as a suffix. This avoids special cases like VFMERGE from the original patch. Differential Revision: https://reviews.llvm.org/D95404	2021-01-26 01:00:50 -08:00
Hsiangkai Wang	e72b22a40b	[RISCV] Define different pseudo instructions for different FPR. When spilling, the spill size will depend on the size of the register class. For .vf vector instructions, it may spill the floating point scalar argument. In order to use the correct load/store instructions for spilling, we need to provide the correct floating point register class for the .vf vector pseudo instructions. In this commit, we define the .vf pseudo instructions as three different kinds of pseudo instructions for half/float/double. For example, PseudoVFADD_M1 will become PseudoVFADD_F16_M1, PseudoVFADD_F32_M1, and PseudoVFADD_F64_M1. Differential Revision: https://reviews.llvm.org/D95234	2021-01-26 15:48:35 +08:00
Hsiangkai Wang	f19849a07b	[RISCV] Update V extension to v1.0-draft 08a0b464. Differential Revision: https://reviews.llvm.org/D94583	2021-01-26 12:02:43 +08:00
Hsiangkai Wang	b69932b550	[RISCV] Implement vlsegff intrinsics. Differential Revision: https://reviews.llvm.org/D95303	2021-01-26 12:02:43 +08:00
Craig Topper	15f66cf749	[RISCV] Add isel patterns to optimize slli.uw patterns without Zba extension. This pattern can occur when an unsigned is used to index an array on RV64. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95290	2021-01-25 16:12:08 -08:00
Fraser Cormack	15141cd115	[RISCV] Add RVV insertelt/extractelt scalable-vector patterns Original patch by @rogfer01. This patch adds support for insertelt and extractelt operations on scalable vectors. Special care must be taken on RV32 when dealing with i64 vectors as there are no straightforward ways to insert a 64-bit element without a register of that size. To that end, both are custom-lowered to different sequences. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94615	2021-01-25 22:03:52 +00:00
Craig Topper	239cfbccb0	[RISCV] Custom type legalize i8/i16 UDIV/UREM/SDIV on RV64 so we can use divuw/remuw/divw. This makes our i8/i16 codegen more similar to the i32 codegen. I've also added computeKnownBits support for DIVUW/REMUW so that we can remove zero extending ANDs from the output. Without this we end up turning DIVUW/REMUW back into DIVU/REMU via some isel patterns. Reviewed By: frasercrmck, luismarques Differential Revision: https://reviews.llvm.org/D95322	2021-01-25 10:47:22 -08:00
Craig Topper	4eb4f8963f	[RISCV] Use sign extend for i32 arguments and returns in makeLibCall on RV64. As far as I know 32 bits arguments and returns on RV64 are always sign extended to i64. So I think we should be taking this into account around libcalls. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95285	2021-01-25 09:33:48 -08:00
Fraser Cormack	fde2466171	[SelectionDAG] Support scalable-vector splats in more cases This patch adds support for scalable-vector splats in DAGCombiner's `isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions, which enable the SelectionDAG div/rem-by-constant optimizations for scalable vector types. It also fixes up one case where the UDIV optimization was generating a SETCC without first consulting the target for its preferred SETCC result type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94501	2021-01-25 10:58:15 +00:00
Simon Cook	a7c1239f37	[RISCV] Add attribute support for all supported extensions This adds support for ".attribute arch" for all extensions that are currently supported by the compiler. Differential Revision: https://reviews.llvm.org/D94931	2021-01-25 08:58:53 +00:00
Craig Topper	12d0753aca	[RISCV] Use bitsLE instead of strict == MVT::i32 in assertsexti32 and assertzexti32. The patterns that use this really want to know if the operand has at least 32 sign/zero bits. This increases opportunities to use W instructions when the original source used i8/i16. Not sure how much this matters for performance, but it makes i8/i16 code more consistent with i32.	2021-01-24 13:58:14 -08:00
Simon Cook	f3f3c9c254	[RISCV] Fix name of Zba extension (NFC)	2021-01-24 21:02:34 +00:00
Craig Topper	116177afcc	[RISCV] Use SRLIWPat in the PACKUW pattern. This makes the code more tolerant if we ever change SimplifyDemandedBits to not remove 1s from the lsbs of a contiguous mask.	2021-01-24 10:41:58 -08:00
Craig Topper	c50457f3e4	[RISCV] Make the code in MatchSLLIUW ignore the lower bits of the AND mask where the shift has guaranteed zeros. This avoids being dependent on SimplifyDemandedBits having cleared those bits. It could make sense to teach SimplifyDemandedBits to keep all lower bits 1 in an AND mask when possible. This could be implemented with slli+srli in the general case rather than needing to materialize the constant.	2021-01-24 00:34:45 -08:00
Craig Topper	c7d5d8fa33	[RISCV] Group some Zbs isel patterns together and remove a stale comment. NFC	2021-01-23 16:45:05 -08:00
Craig Topper	998057ec06	[RISCV] Add isel patterns to remove masks on SLO/SRO shift amounts.	2021-01-23 15:57:41 -08:00
Craig Topper	d2927f786e	[RISCV] Add isel patterns to remove (and X, 31) from sllw/srlw/sraw shift amounts. We try to do this during DAG combine with SimplifyDemandedBits, but it fails if there are multiple nodes using the AND. For example, multiple shifts using the same shift amount.	2021-01-23 15:08:18 -08:00
Hsiangkai Wang	66a49aef69	[RISCV] Implement vsoxseg/vsuxseg intrinsics. Define vsoxseg/vsuxseg intrinsics and pseudo instructions. Lower vsoxseg/vsuxseg intrinsics to pseudo instructions in RISCVDAGToDAGISel. Differential Revision: https://reviews.llvm.org/D94940	2021-01-23 08:54:56 +08:00
Hsiangkai Wang	97e33feb08	[RISCV] Implement vloxseg/vluxseg intrinsics. Define vloxseg/vluxseg intrinsics and pseudo instructions. Lower vloxseg/vluxseg intrinsics to pseudo instructions in RISCVDAGToDAGISel. Differential Revision: https://reviews.llvm.org/D94903	2021-01-23 08:54:56 +08:00
Craig Topper	d65e8ee507	[RISCV] Add more cmov isel patterns to handle seteq/ne with a small non-zero immediate. Similar to our free standing setcc patterns, we can use ADDI to subtract the immediate from the other operand. Then the cmov can check if the result is zero or non-zero. Reviewed By: mundaym Differential Revision: https://reviews.llvm.org/D95169	2021-01-22 14:51:22 -08:00
Craig Topper	095e245e16	[RISCV] Add isel patterns for SHADD(.UW) This adds an initial set of patterns for these instructions. It's more complicated than I would like for the shadd.uw instructions because there is no guaranteed canonicalization for shl/and with constants. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D95106	2021-01-22 13:28:41 -08:00
Craig Topper	20f2e32d2c	[RISCV] Update B extension version to 0.93. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D95002	2021-01-22 12:49:10 -08:00
Craig Topper	f25f7e8ecd	[RISCV] Add xperm.* instructions to Zbp extension. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94999	2021-01-22 12:49:10 -08:00
Craig Topper	4d5aa760a7	[RISCV] Add support for rev8 and orc.b to Zbb. These instructions use a portion of the encodings for grevi and gorci. The full encodings are only supported with Zbp. Note, rev8 has a different encoding between rv32 and rv64. Zbb is closer to being finalized than Zbp, which has motivated some decisions in this patch. I'm treating rev8 and orc.b as separate instructions when either Zbb or Zbp is enabled. This allows us to print a diagnostic suggesting that either feature needs to be enabled to support these mnemonics. I had tried to put HasStdExtZbbAndNotZbp on the Zbb instructions, but that caused a diagnostic that said Zbp is required if neither feature is enabled. We should really mention Zbb since it's closer to final. This does require extra isel patterns for the different cases so that bswap will always print as rev8 in assembly listing since we can't use an InstAlias. llvm-objdump disassembling should always pick the rev8 or orc.b instructions. llvm-mc parsing and printing text will not convert the grevi/gorci spellings to rev8/gorc.b. We could probably fix this with a special case in processInstruction in the assembly parser if it is important. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94944	2021-01-22 12:49:10 -08:00
Craig Topper	3c94cee63b	[RISCV] Add zext.h instruction to Zbb. zext.h uses the same encoding as pack rd, rs, x0 in rv32 and packw rd, rs, x0 in rv64. Encodings without x0 as the second source are not valid in Zbb. I've added two new instructions with these specific encodings with predicates that enable them when either Zbb or Zbp is enabled. The pack spelling will only be accepted with Zbp. The disassembler will use the zext.h instruction when either feature is enabled. Using the pack spelling will print as pack when llvm-mc is emitting text. We could fix this with some custom code in processInstruction if this is important, but I'm not sure it is. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94818	2021-01-22 12:49:10 -08:00
Craig Topper	83c92fdeda	[RISCV] Move pack instructions to Zbp extension only. Zext.h will need to come back to Zbb, but that only uses specific encodings of pack. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94742	2021-01-22 12:49:10 -08:00
Craig Topper	5ae92f1e11	[RISCV] Change zext.w to be an alias of add.uw rd, rs1, x0 instead of pack. This didn't make it into the published 0.93 spec, but it was the intention. But it is in the tex source as of this commit `d172f029c0`. This means zext.w now requires Zba. Not sure if we should still use pack if Zbp is enabled and Zba isn't. I'll leave that for the future when pack is closer to being final. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94736	2021-01-22 12:49:10 -08:00
Craig Topper	9d499e037e	[RISCV] Modify add.uw patterns to put the masked operand in rs1 to match 0.93 bitmanip spec. The 0.93 spec has this implementation for add.uw uint_xlen_t adduw(uint_xlen_t rs1, uint_xlen_t rs2) { uint_xlen_t rs1u = (uint32_t)rs1; return rs1u + rs2; } The 0.92 spec had the usages of rs1 and rs2 swapped. Reviewed By: frasercrmck, asb Differential Revision: https://reviews.llvm.org/D95090	2021-01-22 12:49:10 -08:00
Craig Topper	efbcd66861	[RISCV] Rename Zbs instructions to start with just 'b' instead of 'sb' to match 0.93 bitmanip spec. Also renamed Zbe instructions to resolve name conflict even though that change is in the 0.94 draft. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94653	2021-01-22 12:49:10 -08:00
Craig Topper	1355458ef6	[RISCV] Move Shift Ones instructions from Zbb to Zbp to match 0.93 bitmanip spec. It's not really clear in the spec that these are in Zbp now, but that's what I've gathered from previous commits to the spec. I've filed an issue to get it documented properly. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94652	2021-01-22 12:49:10 -08:00
Craig Topper	83a93ae63b	[RISCV] Add SH*ADD(.UW) instructions to Zba extension based on 0.93 bitmanip spec. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94637	2021-01-22 12:49:10 -08:00
Craig Topper	4e6ad11bc6	[RISCV] Add Zba feature and move add.uw and slli.uw to it. Still need to add SH*ADD instructions. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94617	2021-01-22 12:49:10 -08:00
Craig Topper	b825278364	[RISCV] Rename mnemonics slliu.w->slli.uw and addu.w->add.uw to match 0.93 bitmanip spec. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94582	2021-01-22 12:49:10 -08:00
Craig Topper	d985c7321f	[RISCV] Swap encodings of max and minu to match 0.93 bitmanip spec. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94580	2021-01-22 12:49:10 -08:00
Craig Topper	b2f859500f	[RISCV] Remove addiwu, addwu, subwu, subuw, clmulw, clmulrw, clmulhw to match 0.93 bitmanip spec. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94577	2021-01-22 12:49:10 -08:00
Craig Topper	6aced6bf39	[RISCV] Rename pcnt->cpop to match 0.93 bitmanip spec. This is the first of multiple patches to bring our 0.92 implementation up to 0.93. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D94568	2021-01-22 12:49:10 -08:00
Kazu Hirata	cfa241680f	[llvm] Don't include StringSwitch.h where unnecessary (NFC)	2021-01-21 19:59:48 -08:00
Hsiangkai Wang	5d354220d4	[RISCV] Correct DWARF number for vector registers. The DWARF numbers of vector registers are already defined in riscv-elf-psabi. The DWARF numbers for vector registers start from 96. Correct the DWARF numbers of vector registers. Differential Revision: https://reviews.llvm.org/D94749	2021-01-22 11:33:42 +08:00
Craig Topper	f8f1b20e6b	[RISCV] Don't create LMUL=8 pseudo instructions for ternary widening arithmetic instructions These instructions produce a 2*SEW result so the input can't have an LMUL=8 or the result would need a nonexistent LMUL=16. So only create pseudos for LMUL up to 4. Differential Revision: https://reviews.llvm.org/D95189	2021-01-21 19:29:02 -08:00
ShihPo Hung	9667750331	[RISCV] Add intrinsics for RVV1.0 VFRSQRTE7 & VFRECE7 Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D95113	2021-01-21 18:38:49 -08:00
ShihPo Hung	976cf53cc7	[RISCV] Add intrinsics for vector unordered indexed load in RVV 1.0 Add unordered indexed load: vluxei Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95028	2021-01-21 18:38:49 -08:00
ShihPo Hung	bea661d9a5	[RISCV] Add intrinsics for RVV 1.0 vrgatherei16 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95014	2021-01-21 18:38:49 -08:00
Craig Topper	3b5430eb0d	[RISCV] Add a VL output to vleff intrinsics. The fault-only-first-load instructions can reduce VL if an element other than element 0 triggers a memory fault. This can be used to vectorize loops with data dependent exit conditions like strcmp or strlen. This patch adds a VL output to these intrinsics so that the new VL value can be captured by software. This will be expanded to 'csrr gpr, vl' after the vleff instruction during SelectionDAG. By doing this with one intrinsic we are able to guarantee that the csrr reads the VL value produced by the vleff instruction. Having it as a separate intrinsic would make it impossible to guarantee ordering without making every other vector intrinsic have side effects. The intrinsics are expanded during lowering into two ISD nodes that are glued together. These ISD nodes will go through isel separately, but should maintain the glue so that they get emitted adjacently by InstrEmitter. I've only run the chain through the vleff instruction, allowing the READ_VL to be deleted if it is unused. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D94286	2021-01-21 17:19:58 -08:00
Hsiangkai Wang	6e360460f1	[RISCV] Use v8-v23 as argument registers to conform to the proposal. The maximum LMUL is 8. We need 16 vector registers for two LMUL-8 arguments. The modification follows the proposal of psABI in https://github.com/riscv/riscv-elf-psabi-doc/pull/171 Differential Revision: https://reviews.llvm.org/D95134	2021-01-22 07:55:24 +08:00
Hsiangkai Wang	b7ab6726b6	[RISCV] New vector load/store in V extension v1.0 Upgrade RISC-V V extension to v1.0-08a0b46. Indexed load/store have ordered and unordered form. New whole vector load/store. Differential Revision: https://reviews.llvm.org/D93614	2021-01-22 07:30:09 +08:00
Michael Munday	4ab0f51a75	Recommit "[RISCV] Legalize select when Zbt extension available" This recommits `71ed4b6ce5` with the polarity of some of the pattern corrected. Original commit message: The custom expansion of select operations in the RISC-V backend interferes with the matching of cmov instructions. Legalizing select when the Zbt extension is available solves that problem. Reviewed By: luismarques, craig.topper Differential Revision: https://reviews.llvm.org/D93767	2021-01-21 12:07:44 -08:00
Hsiangkai Wang	b8921af63b	[RISCV] Update V instructions constraints to conform to v1.0 Upgrade RISC-V V extension to v1.0-08a0b46. Update instruction constraints to conform to v1.0. Differential Revision: https://reviews.llvm.org/D93612	2021-01-22 01:15:55 +08:00
Hsiangkai Wang	266820be35	[RISCV] Add new V instructions in v1.0-08a0b46. Add new V instructions. vfrsqrte7.v vfrece7.v vrgatherei16.vv vneg.v vncvt.x.x.w vfneg.v	2021-01-22 00:59:58 +08:00
Hsiangkai Wang	9dd5aea1e0	[RISCV] Make LMUL field in VTYPE continuous. Upgrade RISC-V V extension to v1.0-08a0b46. Update the VTYPE encoding. Make LMUL encoding in a continuous field.	2021-01-22 00:47:32 +08:00
Hsiangkai Wang	a8b96eadfd	[RISCV] Implement vssseg intrinsics. Define vssseg intrinsics and pseudo instructions. Lower vssseg intrinsics to pseudo instructions in RISCVDAGToDAGISel. Differential Revision: https://reviews.llvm.org/D94863	2021-01-21 11:51:35 +08:00
Hsiangkai Wang	e5e329023b	[RISCV] Implement vlsseg intrinsics. Define vlsseg intrinsics and pseudo instructions. Lower vlsseg intrinsics to pseudo instructions in RISCVDAGToDAGISel. Differential Revision: https://reviews.llvm.org/D94763	2021-01-21 11:51:35 +08:00
Hsiangkai Wang	47228f7854	[RISCV] Implement vsseg intrinsics. Define vsseg intrinsics and pseudo instructions. Lower vsseg intrinsics to pseudo instructions in RISCVDAGToDAGISel. Differential Revision: https://reviews.llvm.org/D94688	2021-01-21 11:51:35 +08:00
Craig Topper	e996f1d419	[RISCV] Add another isel pattern for slliu.w. Previously we only matched (and (shl X, C1), 0xffffffff << C1) which matches the InstCombine canonicalization order. But it's possible to see (shl (and X, 0xffffffff), C1) if the pattern is introduced in SelectionDAG. For example, through expansion of a GEP.	2021-01-20 14:54:40 -08:00
Craig Topper	9d792fef57	[RISCV] Remove unnecessary APInt copy. NFC getAPIntValue returns a const APInt& so keep it as a reference.	2021-01-20 10:33:09 -08:00
Craig Topper	b11b6ab3e0	[RISCV] Add way to mark CompressPats that should only be used for compressing. There can be multiple patterns that map to the same compressed instruction. Reversing those leads to multiple ways to uncompress an instruction, but it's not easily controllable which one will be chosen by the tablegen backend. This patch adds a flag to mark patterns that should only be used for compressing. This allows us to leave one canonical pattern for uncompressing. The obvious benefit of this is getting c.mv to uncompress to the addi pattern that is aliased to the mv pseudoinstruction. For the add/and/or/xor/li patterns it just removes some unreachable code from the generated code. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D94894	2021-01-20 09:20:15 -08:00
Hsiangkai Wang	8ca4b174d7	[RISCV] Implement vlseg intrinsics. For Zvlsseg, we need continuous vector registers for the values. We need to define new register classes for the different combinations of (number of fields and LMUL). For example, when the number of fields (NF) = 3, LMUL = 2, the values will be assigned to (V0M2, V2M2, V4M2), (V2M2, V4M2, V6M2), (V4M2, V6M2, V8M2), ... We define the vlseg intrinsics with multiple outputs. There is no way to describe the codegen patterns with multiple outputs in the tablegen files. We do the codegen in RISCVISelDAGToDAG and use EXTRACT_SUBREG to extract the values of output. The multiple scalable vector values will be put into a struct. This patch depends on the support for scalable vector struct. Differential Revision: https://reviews.llvm.org/D94229	2021-01-20 14:26:04 +08:00
ShihPo Hung	4dae2247fd	[RISCV] refactor VPatBinary (NFC) Make it easier to reuse for intrinsic vrgatherei16 which needs to encode both LMUL & EMUL in the instruction name, like PseudoVRGATHEREI16_VV_M1_M1 and PseudoVRGATHEREI16_VV_M1_M2. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94951	2021-01-19 19:09:56 -08:00
Craig Topper	e75a4b6ea9	[RISCV] Remove NotHasStdExtZbb predicate from zext.h/sext.b/sext.h InstAliases. NFC NotHasStdExtZbb doesn't have an AssemblerPredicate associated with it so it didn't do anything. We don't need it either because the sorting rules in tablegen prioritize by number of predicates. So the dedicated instructions in the B extension that have predicates will be prioritized automatically.	2021-01-19 14:31:48 -08:00
Craig Topper	ce8b3937dd	[RISCV] Add DAG combine to turn (setcc X, 1, setne) -> (setcc X, 0, seteq) if we can prove X is 0/1. If we are able to compare with 0 instead of 1, we might be able to fold the setcc into a beqz/bnez. Often these setccs start life as an xor that gets converted to a setcc by DAG combiner's rebuildSetcc. I looked into detecting (xor X, 1) and converting to (seteq X, 0) based on boolean contents being 0/1 in rebuildSetcc instead of using computeKnownBits. It was very perturbing to AMDGPU tests which I didn't look closely at. It had a few changes on a couple other targets, but didn't seem to be much if any improvement. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D94730	2021-01-19 11:21:48 -08:00
Fraser Cormack	9c6a00fe99	[RISCV] Add ISel patterns for scalable mask exts & truncs Original patch by @rogfer01. This patch adds support for sign-, zero-, and any-extension from scalable mask vector types to integer vector types, as well as truncation in the opposite direction. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94590	2021-01-19 18:13:15 +00:00
Fraser Cormack	15fd6bae0e	[RISCV] Extend RVV VType info with the type's AVL (NFC) This patch factors out the "VLMax" operand passed to most scalable-vector ISel patterns into a property of each VType. This is seen as a preparatory change to allow RVV in the future to more easily support fixed-length vector types with constrained vector lengths, with the AVL operand set to the length of the fixed-length vector. It has no effect on the scalable code generation path. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D94594	2021-01-19 15:46:56 +00:00

1330 Commits