llvm-project

Commit Graph

Author	SHA1	Message	Date
Philip Reames	f8c63a7fbf	[SDAG] Allow scalable vectors in ComputeNumSignBits This is a continuation of the series of patches adding lane-wise support for scalable vectors in various knownbits-esque routines. The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane-wise reasoning on many arithmetic operations. Differential Revision: https://reviews.llvm.org/D137141	2022-11-18 10:50:06 -08:00
Philip Reames	bc0fea0d55	[SDAG] Allow scalable vectors in ComputeKnownBits This is the SelectionDAG equivalent of D136470, and is thus an alternate patch to D128159. The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane-wise reasoning on many arithmetic operations. This patch also includes an implementation for SPLAT_VECTOR as without it, the lane-wise reasoning has no base case. The original patch which inspired this (D128159) also included STEP_VECTOR. I plan to do that as a separate patch. Differential Revision: https://reviews.llvm.org/D137140	2022-11-18 07:40:32 -08:00
Benjamin Maxwell	34d88cf6cf	[DAG] Allow folding AND of anyext masked_load with >1 user to zext version This now allows folding an AND of an anyext masked_load to a zext_masked_load even if the masked load has multiple users. Doing this eliminates some redundant ANDs/MOVs for certain AArch64 SVE code. I'm not sure if there are any cases where doing this could negatively affect the other users of the masked_load. Looking at other optimizations of masked loads, most don't apply if the load is used more than once, so it doesn't look like this would interfere. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D137844	2022-11-18 10:38:09 +00:00
YingChi Long	7a715bf317	[VP] Add support for vp.inttoptr & vp.ptrtoint Add vp.inttoptr & vp.ptrtoint support by lowering them into vp.zext / vp.truncate within SelectionDAGBuilder. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137169	2022-11-18 10:42:24 +08:00
Stanislav Mekhanoshin	bcaf31ec3f	[AMDGPU] Allow finer grain control of an unaligned access speed A target can return whether a misaligned access is 'fast', as defined by the target, or not. In reality there can be different levels of 'fast' and 'slow'. This patch changes the boolean 'Fast' argument of the allowsMisalignedMemoryAccesses family of functions to an unsigned representing its speed. A target can still define it as it wants, and the direct translation of the current code uses 0 and 1 for the current false and true. This makes the change an NFC. A subsequent patch will start using an actual value of speed in the load/store vectorizer to check whether a vectorized access is going to be not just fast, but not slower than before. Differential Revision: https://reviews.llvm.org/D124217	2022-11-17 09:23:53 -08:00
Philip Reames	4105794e66	[SDAG] Assert we don't see scalable VECTOR_SHUFFLES It was pointed out in review of D137140 that this case should be impossible. This patch converts an existing bailout into an assert instead.	2022-11-17 08:18:51 -08:00
zhongyunde	8fbb6f8678	[NFC] Fix typo in comment Address comment in https://reviews.llvm.org/D137936 Differential Revision: https://reviews.llvm.org/D138124	2022-11-16 23:35:53 +08:00
Simon Pilgrim	a92f5a08a1	[DAG] simplifySelect - add support for vselect(0, T, F) -> F fold We still need to add handling for the non-zero T fold (which requires getBooleanContents handling)	2022-11-16 13:11:14 +00:00
OCHyams	a1ac6efcb0	[NFC][SelectionDAG][DebugInfo] Refactor DanglingDebugInfo class Hide the underlying DbgValueInst by adding methods to extract the necessary information and by adding a raw_ostream &operator<< overload to print it. Remove the DebugLoc field as this is always the same as the DbgValueInst's DebugLoc (see D136247). Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D136249	2022-11-16 10:10:24 +00:00
OCHyams	9792744650	[NFC][SelectionDAG][DebugInfo] Remove duplicate parameter from handleDebugValue handleDebugValue has two DebugLoc parameters that appear to always take the same value. Remove one of the duplicate parameters. See phabricator review for more detail. Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D136247	2022-11-16 09:59:35 +00:00
Matt Arsenault	116c894d72	DAG: Fix assert on load casted to vector with attached range metadata AMDGPU legalizes i64 loads to loads of <2 x i32>, leaving the i64 MMO with attached range metadata alone. The known bit width was using the scalar element type, and asserting on a mismatch.	2022-11-15 23:28:55 -08:00
Yeting Kuo	ed9638c44b	[VP][RISCV] Add vp.nearbyint and RISC-V support. nearbyint has the property of executing without raising exceptions. To avoid modifying fflags, the patch adds a new machine opcode PseudoVFROUND_NOEXCEPT_V that expands to vfcvt.x.f.v and vfcvt.f.x.v between a pair of frflags and fsflags. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137685	2022-11-16 14:05:35 +08:00
Yeting Kuo	5c3ca10b09	[VP][RISCV] Add vp.bswap and RISC-V support. The patch also added function expandVPBSWAP to expand ISD::VP_BSWAP nodes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137928	2022-11-16 11:36:38 +08:00
Craig Topper	f387918dd8	[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion. We can reuse constants if we use SRL followed by AND and AND followed by SHL. Similar was done to bitreverse previously. Differential Revision: https://reviews.llvm.org/D138045	2022-11-15 14:36:01 -08:00
Sanjay Patel	fe05a0a3dd	[SDAG] avoid udiv/urem transform for vector/scalar type mismatches This solves the crashing from issue #58994. I don't know anything about VE, so I don't know if the output is as expected or even correct.	2022-11-15 11:01:18 -05:00
Nick Desaulniers	f2981a3bc9	[SelectDagISEL] refactor HandlePHINodesInSuccessorBlocks NFC. While working on this code to support outputs from callbr along indirect branches, I kept making these changes again and again. Precommit these. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137445	2022-11-10 14:34:23 -08:00
Amaury Séchet	82209fd96e	[NFC] Refactor DAGCombiner::foldSelectOfConstants to reduce nesting 2.0	2022-11-05 17:10:06 +00:00
Amaury Séchet	7c05f092c9	[NFC] Refactor DAGCombiner::foldSelectOfConstants to reduce nesting	2022-11-05 16:17:58 +00:00
Nick Desaulniers	0589038a6f	[StatepointLowering] remove unused parameter. NFC Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136885	2022-11-03 15:43:20 -07:00
Henry Yu	ef0d689e8b	[SelectionDAGBuilder] use bitcast instead of AnyExtOrTrunc if copy parts from an int vector to a float vector to fix issue #58615 The getCopyFromPartsVector doesn't work correctly when PartEVT and ValueVT have both different element types and different sizes. This patch 1) removes the part of a comment that contains the incorrect assumption that the element types are the same 2) uses bitcast when copying parts of an int vector to a float vector after the subvector extraction Reviewed By: Peter, efriedma Differential Revision: https://reviews.llvm.org/D136726	2022-11-03 15:35:13 -07:00
Matt Arsenault	cbce11c422	WebAssembly: Move exception handling code together	2022-11-02 16:05:34 -07:00
Matt Arsenault	4fed59ed41	FunctionLoweringInfo: Use TLI member instead of finding it	2022-11-02 16:05:34 -07:00
Yeting Kuo	71e4e35581	[VP][RISCV] Add vp.rint and RISC-V support. FRINT uses dynamic rounding mode instead of static rounding mode. The patch renames VFCVT_X_F_VL to VFCVT_RM_X_F_VL for static rounding mode uses and adds a new ISDNode VFCVT_X_F_VL directly selected to PseudoVFCVT_X_F_V. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136662	2022-11-01 14:52:47 +08:00
Simon Pilgrim	55a11b542e	[VectorUtils] Add getShuffleDemandedElts helper We have similar code to translate a demanded elements mask for a shuffle's operands in multiple places - this patch adds a helper function to VectorUtils and updates a number of locations to use it directly. Differential Revision: https://reviews.llvm.org/D136832	2022-10-30 17:03:55 +00:00
Simon Pilgrim	78739fdb4d	[DAG] Enable combineShiftOfShiftedLogic folds after type legalization This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides. Fixes #57872 Differential Revision: https://reviews.llvm.org/D136042	2022-10-29 12:30:04 +01:00
Simon Pilgrim	69d117edc2	[DAG] ExpandIntRes_MINMAX - simplify cases with sufficient number of sign bits When legalizing a smax/smin/umax/umin op, if we know that the upper half is all sign bits, then we can perform the op on the lower half and then sign extend the result to the upper half. Alive2: https://alive2.llvm.org/ce/z/rk8Rfd Fixes #58630	2022-10-28 17:10:45 +01:00
Sanjay Patel	1e7c1dd67c	[SDAG] avoid crash from mismatched types in scalar-to-vector fold This bug was introduced with D136713 / `54eeadcf44` . As an enhancement, we could cast operands to the expected type, but we need to make sure that is done correctly (zext vs. sext). It's also possible (but seems unlikely) that an operand can have a type larger than the result type. Fixes #58661	2022-10-28 09:14:08 -04:00
Simon Pilgrim	d47f056cd2	[DAG] visitXOR - fold XOR(A,B) -> OR(A,B) iff A and B have no common bits Alive2: https://alive2.llvm.org/ce/z/7wvfns Part of Issue #58624	2022-10-28 12:11:12 +01:00
Simon Pilgrim	28bfd853ab	[DAG] visitFSUBForFMACombine - pass callbacks by reference in isContractableAndReassociableFMUL lambda capture. NFC. Fixes a coverity remark about large copies by value	2022-10-28 11:48:45 +01:00
Pierre van Houtryve	088a816824	[DAGCombiner] Use `getAnyExtOrTrunc` instead of TRUNCATE in ExtractVectorElt combine ScalarVT isn't guaranteed to be smaller than the BCSrc. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D136849	2022-10-28 06:33:29 +00:00
Craig Topper	00d93def77	[LegalizeVectorOps][X86][RISCV] Expand vector S/USHLSAT instead of unrolling. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D136478	2022-10-27 09:09:36 -07:00
Sanjay Patel	54eeadcf44	[SDAG] avoid vector extract/insert around binop scalar-to-vector (scalar binop (extractelt V, Idx), C) --> shuffle (vector binop V, C'), {Idx, -1, -1...} We generally try to avoid ad-hoc vectorization in SDAG, but the motivating case from issue #39482 escapes our normal vectorization folds in IR. It seems like it should always be a win to transform this pattern in cases where we have the same vector type for input and output and the target supports the vector operation. That avoids transfers from vector to scalar and back. In the x86 shift examples, we create the scalar-to-vector node during legalization. I'm not sure if there's a more general way to create the pattern for testing. (If so, I could add tests for other targets.) Differential Revision: https://reviews.llvm.org/D136713	2022-10-26 14:04:46 -04:00
Sanjay Patel	3aec021118	[SDAG] add helper for opcodes that are not speculatable This is not quite NFC because one of the users should now avoid the DIVREM opcodes too, but I'm not sure how to test that. I used the same name as an analysis function in IR in case we want to expand this to include other operations. Another potential use is proposed in D136713.	2022-10-26 11:20:14 -04:00
Haohai Wen	21f23a37c6	[SelectionDAG] Clamp stack alignment for memset, memmove memcpy has clamped dst stack alignment to NaturalStackAlignment if hasStackRealignment is false. We should also clamp stack alignment for memset and memmove. If we don't clamp, SelectionDAG may first do tail call optimization which requires no stack realignment. Then memmove, memset in same function may be lowered to load/store with larger alignment leading to PEI emit stack realignment code which is absolutely not correct. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D136456	2022-10-26 16:45:31 +08:00
Craig Topper	8c42b5e89e	[SelectionDAG] Add missing semicolon after return. I'm unsure what the code does without the semicolon. On the surface it seems like the assert below it would be considered part of the if and thus the assert would only execute if DestReg is 0. But 0 isn't considered a virtual register so the assert should fail. Found by PVS Studio. Reported https://pvs-studio.com/en/blog/posts/cpp/1003/ (N7)	2022-10-25 10:24:01 -07:00
Sanjay Patel	b179351ad4	[SDAG] refactor folds for scalar-to-vector; NFCI Fix typos, add comments, improve variable names, rearrange code, add early exits.	2022-10-25 12:53:46 -04:00
Roman Lebedev	377f27be87	[X86] `DAGTypeLegalizer::ModifyToType()`: when widening w/ zeros, insert into undef and `and`-mask the padding away We can expect that the sequence of inserting-of-extracts-into-undef will be successfully lowered back into widening of the source vector, but it seems that at least for X86 mask vectors, we have a really hard time recovering from inserting-into-zero. I've looked into alternative fix injection points, and they are much more involved, by the time of `LowerBUILD_VECTORvXi1()`/`LowerINSERT_VECTOR_ELT()` the constants might be obscured, so it does not seem like we can easily deal with this by lowering into bit math later on, some other pieces are missing. Instead, it seems like just clearing the padding away via an `AND`-mask is at least not a worse choice. Why create a problem where there wasn't one. Though yes, it is possible that there are cases where constants originate from the source IR, so some other fix may still be needed. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D136046	2022-10-24 20:27:02 +03:00
Craig Topper	1fa8fd4c33	Recommit "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant." This reverts commit `65aaecca88`. There was an ordering problem in the calculation of the partial remainder. Original commit message: If the divisor is even, we can first shift the dividend and divisor right by the number of trailing zeros. Now the divisor is odd and we can do the original algorithm to calculate a remainder. Then we shift that remainder left by the number of trailing zeros and add the bits that were shifted out of the dividend. Differential Revision: https://reviews.llvm.org/D135541	2022-10-24 10:08:50 -07:00
Craig Topper	65aaecca88	Revert "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant." This reverts commit `f6a7b47820`. I received a report that this fails on 32-bit X86.	2022-10-24 07:12:54 -07:00
Simon Pilgrim	fd5f3abb07	[DAG] Fold (abs (sign_extend_inreg x)) -> (zero_extend (abs (truncate x))) (PR43370) If the upper half of an abs() is all sign bits, then we can perform the abs() using just the lower half and then zero extend. I've limited the DAG combine to only sign_extend_inreg (and free truncate/zero_extend) to minimise any later promotion issues, but for legalization a similar fold can use ComputeNumSignBits to be more aggressive. Alive2: https://alive2.llvm.org/ce/z/y32fS4 Fixes #43370 Differential Revision: https://reviews.llvm.org/D136559	2022-10-24 10:27:08 +01:00
Kazu Hirata	a1317be28d	[SelectionDAG] Use std::clamp (NFC)	2022-10-24 00:23:51 -07:00
Simon Pilgrim	913f08b74c	[DAG] Add freeze(sign/zero_extend_vector_inreg(x)) -> sign/zero_extend_vector_inreg(freeze(x)) folding	2022-10-23 12:19:42 +01:00
Craig Topper	f6a7b47820	[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant. If the divisor is even, we can first shift the dividend and divisor right by the number of trailing zeros. Now the divisor is odd and we can do the original algorithm to calculate a remainder. Then we shift that remainder left by the number of trailing zeros and add the bits that were shifted out of the dividend. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135541	2022-10-22 23:35:33 -07:00
Craig Topper	db25f51e37	Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))" This reverts commit `e8b3ffa532`. The AMDGPU/mad_64_32.ll seems to fail on some of the build bots but passes locally. I'm really confused.	2022-10-22 22:50:43 -07:00
Craig Topper	e8b3ffa532	[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y)) (sra X, BW-1) is either 0 or -1. So the multiply is a conditional negate of Y. This pattern shows up when type legalizing wide multiplies involving a sign extended value. Fixes PR57549. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D133399	2022-10-22 21:51:45 -07:00
Craig Topper	00816714f9	[DAGCombiner][RISCV] Make foldBinOpIntoSelect work correctly with opaque constants. The CanFoldNonConst doesn't work correctly with opaque constants because getNode won't constant fold constants if one is opaque. Even if the operation is AND/OR. This can lead to infinite loops. This patch does the folding manually in the DAGCombine. Alternatively, we could improve getNode but that seemed likely to have bigger impact and possibly increase compile time for the additional checks. We wouldn't want to directly constant fold because we need to preserve the opaque flag. Fixes PR58511. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D136472	2022-10-22 19:10:33 -07:00
Simon Pilgrim	b24a9f0cef	[DAG] visitFREEZE - pull out Operands array. NFCI. Initial tidyup and it will make it easier to adjust additional Operands in a future patch.	2022-10-22 20:14:56 +01:00
Simon Pilgrim	7511303c4f	[DAG] canCreateUndefOrPoison - add freeze(fsh(x,y,z)) -> fsh(freeze(x),freeze(y),freeze(z)) support The funnel-shift amount is always modulo, so won't introduce poison/undef	2022-10-22 18:39:52 +01:00
Simon Pilgrim	89111707ec	[DAG] canCreateUndefOrPoison - add freeze(rot(x,y)) -> rot(freeze(x),freeze(y)) support The rotation amount is always modulo, so won't introduce poison/undef	2022-10-22 17:24:53 +01:00
Paul Walker	ab8257ca0e	[NFC] Fix a few whitespace inconsistencies.	2022-10-20 14:52:25 +00:00
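
The even-divisor handling described in the expandDIVREMByConstant entries above (`f6a7b47820`, recommitted as `1fa8fd4c33`) can be illustrated with plain integer arithmetic. The following is a minimal sketch, not the LLVM implementation: the function name `remByConstantViaOddDivisor` is hypothetical, it works on `uint64_t` rather than on DAG nodes, and it uses the `%` operator as a stand-in for the odd-divisor remainder algorithm that the commit message refers to.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM code): computes N % D for a nonzero D by
// first reducing the problem to an odd divisor, as the commit message
// describes. D must not be zero.
inline uint64_t remByConstantViaOddDivisor(uint64_t N, uint64_t D) {
  // K = number of trailing zero bits in the divisor.
  unsigned K = 0;
  while (((D >> K) & 1) == 0)
    ++K;

  // Shift the divisor right by K so that it becomes odd.
  uint64_t OddD = D >> K;

  // Remember the K low bits that are shifted out of the dividend.
  uint64_t LowBits = N & ((uint64_t{1} << K) - 1);

  // Remainder of the shifted dividend by the odd divisor. The real
  // expansion would use the existing odd-divisor algorithm here; plain %
  // is an assumption made to keep the sketch short.
  uint64_t OddRem = (N >> K) % OddD;

  // Shift the remainder back left and add the bits that were shifted out.
  return (OddRem << K) + LowBits;
}
```

For example, with N = 101 and D = 12 the sketch uses K = 2 and an odd divisor of 3: 101 >> 2 is 25, 25 % 3 is 1, and (1 << 2) + (101 & 3) gives 5, which matches 101 % 12. The result stays below D because the odd remainder is at most OddD - 1 and the low bits are below 2^K.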

12483 Commits