llvm-project

Author	SHA1	Message	Date
Nick Desaulniers	f2981a3bc9	[SelectDagISEL] refactor HandlePHINodesInSuccessorBlocks NFC. While working on this code to support outputs from callbr along indirect branches, I kept making these changes again and again. Precommit these. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137445	2022-11-10 14:34:23 -08:00
Amaury Séchet	82209fd96e	[NFC] Refactor DAGCombiner::foldSelectOfConstants to reduce nesting 2.0	2022-11-05 17:10:06 +00:00
Amaury Séchet	7c05f092c9	[NFC] Refactor DAGCombiner::foldSelectOfConstants to reduce nesting	2022-11-05 16:17:58 +00:00
Nick Desaulniers	0589038a6f	[StatepointLowering] remove unused parameter. NFC Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136885	2022-11-03 15:43:20 -07:00
Henry Yu	ef0d689e8b	[SelectionDAGBuilder] use bitcast instead of AnyExtOrTrunc if copy parts from an int vector to a float vector to fix issue #58615 The getCopyFromPartsVector doesn't work correctly when PartEVT and ValueVT have both different element type and different size. This patch 1) removes the part of a comment that contains the incorrect assumption that element type are the same 2) use bitcast when copy parts of int vector to a float vector after the subvector extraction Reviewed By: Peter, efriedma Differential Revision: https://reviews.llvm.org/D136726	2022-11-03 15:35:13 -07:00
Matt Arsenault	cbce11c422	WebAssembly: Move exception handling code together	2022-11-02 16:05:34 -07:00
Matt Arsenault	4fed59ed41	FunctionLoweringInfo: Use TLI member instead of finding it	2022-11-02 16:05:34 -07:00
Yeting Kuo	71e4e35581	[VP][RISCV] Add vp.rint and RISC-V support. FRINT uses dynamic rounding mode instead of static rounding mode. The patch rename VFCVT_X_F_VL to VFCVT_RM_X_F_VL for static rounding mode uses and added new ISDNode VFCVT_X_F_VL directly selected to PseudoVFCVT_X_F_V. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136662	2022-11-01 14:52:47 +08:00
Simon Pilgrim	55a11b542e	[VectorUtils] Add getShuffleDemandedElts helper We have similar code to translate a demanded elements mask for a shuffle's operands in multiple places - this patch adds a helper function to VectorUtils and updates a number of locations to use it directly. Differential Revision: https://reviews.llvm.org/D136832	2022-10-30 17:03:55 +00:00
Simon Pilgrim	78739fdb4d	[DAG] Enable combineShiftOfShiftedLogic folds after type legalization This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides. Fixes #57872 Differential Revision: https://reviews.llvm.org/D136042	2022-10-29 12:30:04 +01:00
Simon Pilgrim	69d117edc2	[DAG] ExpandIntRes_MINMAX - simplify cases with sufficient number of sign bits When legalizing a smax/smin/umax/umin op, if we know that the upper half is all sign bits, then we can perform the op on the lower half and then sign extend the result to the upper half. Alive2: https://alive2.llvm.org/ce/z/rk8Rfd Fixes #58630	2022-10-28 17:10:45 +01:00
Sanjay Patel	1e7c1dd67c	[SDAG] avoid crash from mismatched types in scalar-to-vector fold This bug was introduced with D136713 / `54eeadcf44` . As an enhancement, we could cast operands to the expected type, but we need to make sure that is done correctly (zext vs. sext). It's also possible (but seems unlikely) that an operand can have a type larger than the result type. Fixes #58661	2022-10-28 09:14:08 -04:00
Simon Pilgrim	d47f056cd2	[DAG] visitXOR - fold XOR(A,B) -> OR(A,B) iff A and B have no common bits Alive2: https://alive2.llvm.org/ce/z/7wvfns Part of Issue #58624	2022-10-28 12:11:12 +01:00
Simon Pilgrim	28bfd853ab	[DAG] visitFSUBForFMACombine - pass callbacks by reference in isContractableAndReassociableFMUL lambda capture. NFC. Fixes a coverity remark about large copies by value	2022-10-28 11:48:45 +01:00
Pierre van Houtryve	088a816824	[DAGCombiner] Use `getAnyExtOrTrunc` instead of TRUNCATE in ExtractVectorElt combine ScalarVT isn't guaranteed to be smaller than the BCSrc. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D136849	2022-10-28 06:33:29 +00:00
Craig Topper	00d93def77	[LegalizeVectorOps][X86][RISCV] Expand vector S/USHLSAT instead of unrolling. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D136478	2022-10-27 09:09:36 -07:00
Sanjay Patel	54eeadcf44	[SDAG] avoid vector extract/insert around binop scalar-to-vector (scalar binop (extractelt V, Idx), C) --> shuffle (vector binop V, C'), {Idx, -1, -1...} We generally try to avoid ad-hoc vectorization in SDAG, but the motivating case from issue #39482 escapes our normal vectorization folds in IR. It seems like it should always be a win to transform this pattern in cases where we have the same vector type for input and output and the target supports the vector operation. That avoids transfers from vector to scalar and back. In the x86 shift examples, we create the scalar-to-vector node during legalization. I'm not sure if there's a more general way to create the pattern for testing. (If so, I could add tests for other targets.) Differential Revision: https://reviews.llvm.org/D136713	2022-10-26 14:04:46 -04:00
Sanjay Patel	3aec021118	[SDAG] add helper for opcodes that are not speculatable This is not quite NFC because one of the users should now avoid the DIVREM opcodes too, but I'm not sure how to test that. I used the same name as an analysis function in IR in case we want to expand this to include other operations. Another potential use is proposed in D136713.	2022-10-26 11:20:14 -04:00
Haohai Wen	21f23a37c6	[SelectionDAG] Clamp stack alignment for memset, memmove memcpy has clamped dst stack alignment to NaturalStackAlignment if hasStackRealignment is false. We should also clamp stack alignment for memset and memmove. If we don't clamp, SelectionDAG may first do tail call optimization which requires no stack realignment. Then memmove, memset in same function may be lowered to load/store with larger alignment leading to PEI emit stack realignment code which is absolutely not correct. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D136456	2022-10-26 16:45:31 +08:00
Craig Topper	8c42b5e89e	[SelectionDAG] Add missing semicolon after return. I'm unsure what the code does without the semicolon. On the surface it seems like the assert below it would be considered part of the if and thus the assert would only execute if DestReg is 0. But 0 isn't considered a virtual register so the assert should fail. Found by PVS Studio. Reported https://pvs-studio.com/en/blog/posts/cpp/1003/ (N7)	2022-10-25 10:24:01 -07:00
Sanjay Patel	b179351ad4	[SDAG] refactor folds for scalar-to-vector; NFCI Fix typos, add comments, improve variable names, rearrange code, add early exits.	2022-10-25 12:53:46 -04:00
Roman Lebedev	377f27be87	[X86] `DAGTypeLegalizer::ModifyToType()`: when widening w/ zeros, insert into undef and `and`-mask the padding away We can expect that the sequence of inserting-of-extracts-into-undef will be successfully lowered back into widening of the source vector, but it seems that at least for X86 mask vectors, we have a really hard time recovering from inserting-into-zero. I've looked into alternative fix injection points, and they are much more involved, by the time of `LowerBUILD_VECTORvXi1()`/`LowerINSERT_VECTOR_ELT()` the constants might be obscured, so it does not seem like we can easily deal with this by lowering into bit math later on, some other pieces are missing. Instead, it seems like just clearing the padding away via an `AND`-mask is at least not a worse choice. Why create a problem where there wasn't one. Though yes, it is possible that there are cases where constants originate from the source IR, so some other fix may still be needed. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D136046	2022-10-24 20:27:02 +03:00
Craig Topper	1fa8fd4c33	Recommit "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant." This reverts commit `65aaecca88`. There was an ordering problem in the calculation of the partial remainder. Original commit message: If the divisor is even, we can first shift the dividend and divisor right by the number of trailing zeros. Now the divisor is odd and we can do the original algorithm to calculate a remainder. Then we shift that remainder left by the number of trailing zeros and add the bits that were shifted out of the dividend. Differential Revision: https://reviews.llvm.org/D135541	2022-10-24 10:08:50 -07:00
Craig Topper	65aaecca88	Revert "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant." This reverts commit `f6a7b47820`. I received a report that this fails on 32-bit X86.	2022-10-24 07:12:54 -07:00
Simon Pilgrim	fd5f3abb07	[DAG] Fold (abs (sign_extend_inreg x)) -> (zero_extend (abs (truncate x))) (PR43370) If the upper half of an abs() is all sign bits, then we can perform the abs() using just the lower half and then zero extend. I've limited the DAG combine to only sign_extend_inreg (and free truncate/zero_extend) to minimise any later promotion issues, but for legalization a similar fold can use ComputeNumSignBits to be more aggressive. Alive2: https://alive2.llvm.org/ce/z/y32fS4 Fixes #43370 Differential Revision: https://reviews.llvm.org/D136559	2022-10-24 10:27:08 +01:00
Kazu Hirata	a1317be28d	[SelectionDAG] Use std::clamp (NFC)	2022-10-24 00:23:51 -07:00
Simon Pilgrim	913f08b74c	[DAG] Add freeze(sign/zero_extend_vector_inreg(x)) -> sign/zero_extend_vector_inreg(freeze(x)) folding	2022-10-23 12:19:42 +01:00
Craig Topper	f6a7b47820	[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant. If the divisor is even, we can first shift the dividend and divisor right by the number of trailing zeros. Now the divisor is odd and we can do the original algorithm to calculate a remainder. Then we shift that remainder left by the number of trailing zeros and add the bits that were shifted out of the dividend. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135541	2022-10-22 23:35:33 -07:00
Craig Topper	db25f51e37	Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))" This reverts commit `e8b3ffa532`. The AMDGPU/mad_64_32.ll seems to fail on some of the build bots but passes locally. I'm really confused.	2022-10-22 22:50:43 -07:00
Craig Topper	e8b3ffa532	[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y)) (sra X, BW-1) is either 0 or -1. So the multiply is a conditional negate of Y. This pattern shows up when type legalizing wide multiplies involving a sign extended value. Fixes PR57549. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D133399	2022-10-22 21:51:45 -07:00
Craig Topper	00816714f9	[DAGCombiner][RISCV] Make foldBinOpIntoSelect work correctly with opaque constants. The CanFoldNonConst doesn't work correctly with opaque constants because getNode won't constant fold constants if one is opaque. Even if the operation is AND/OR. This can lead to infinite loops. This patch does the folding manually in the DAGCombine. Alternatively, we could improve getNode but that seemed likely to have bigger impact and possibly increase compile time for the additional checks. We wouldn't want to directly constant fold because we need to preserve the opaque flag. Fixes PR58511. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D136472	2022-10-22 19:10:33 -07:00
Simon Pilgrim	b24a9f0cef	[DAG] visitFREEZE - pull out Operands array. NFCI. Initial tidyup and it will make it easier to adjust additional Operands in a future patch.	2022-10-22 20:14:56 +01:00
Simon Pilgrim	7511303c4f	[DAG] canCreateUndefOrPoison - add freeze(fsh(x,y,z)) -> fsh(freeze(x),freeze(y),freeze(z)) support The funnel-shift amount is always modulo, so won't introduce poison/undef	2022-10-22 18:39:52 +01:00
Simon Pilgrim	89111707ec	[DAG] canCreateUndefOrPoison - add freeze(rot(x,y)) -> rot(freeze(x),freeze(y)) support The rotation amount is always modulo, so won't introduce poison/undef	2022-10-22 17:24:53 +01:00
Paul Walker	ab8257ca0e	[NFC] Fix a few whitespace inconsistencies.	2022-10-20 14:52:25 +00:00
Simon Pilgrim	9708d88017	Revert rG42230efccf8fe1185be5fa6c23dce0a8183d6ec9 "[DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1" @foad was right - this isn't actually going to help with D136042 as much as hoped, we need a better AMDGPU-specific solution as other targets are likely to make use of it	2022-10-19 12:07:41 +01:00
Simon Pilgrim	42230efccf	[DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1 Helps with some of the AMDGPU regressions identified in D136042 where we were losing signed BFE patterns after sinking shifts behind logic ops. Differential Revision: https://reviews.llvm.org/D136081	2022-10-19 11:18:49 +01:00
Koakuma	d3fcbee10d	[SPARC] Make calls to function with big return values work Implement CanLowerReturn and associated CallingConv changes for SPARC/SPARC64. In particular, for SPARC64 there's new `RetCC_Sparc64_` functions that handle the return case of the calling convention. It uses the same analysis as `CC_Sparc64_` family of functions, but fails if the return value doesn't fit into the return registers. This makes calls to functions with big return values converted to an sret function as expected, instead of crashing LLVM. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D132465	2022-10-18 00:01:55 +00:00
Craig Topper	30305d7948	[TargetLowering][RISCV][Sparc] Don't emit zero check in CTTZTableLookup for CTTZ_ZERO_UNDEF. The code incorrectly checked for CTLZ_ZERO_UNDEF instead of CTTZ_ZERO_UNDEF. While I was there I flipped the condition into an early out. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D136010	2022-10-17 10:15:39 -07:00
Kazu Hirata	ef9956f434	[IR] Rename FuncletPadInst::getNumArgOperands to arg_size (NFC) This patch renames FuncletPadInst::getNumArgOperands to arg_size for consistency with CallBase, where getNumArgOperands was removed in favor of arg_size in commit `3e1c787b31` Differential Revision: https://reviews.llvm.org/D136048	2022-10-17 10:15:10 -07:00
Simon Pilgrim	8e77458578	[DAG] visitShiftByConstant - replace constant detection with FoldConstantArithmetic Instead of checking that an operand is constant/opaque before calling getNode() and then checking that the result is a constant, just use FoldConstantArithmetic which will just early-out if the operands are not constant foldable.	2022-10-17 16:19:10 +01:00
Simon Pilgrim	af5942cc09	Remove trailing whitespace. NFC.	2022-10-17 15:20:26 +01:00
Peter Rong	c2e7c9cb33	[CodeGen] Using ZExt for extractelement indices. In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`. This is because IRTranslator uses SExt for indices. In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt. This change includes both documentation, SelectionDAG and IRTranslator. We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86 This patch fixes issue #57452. Differential Revision: https://reviews.llvm.org/D132978	2022-10-15 15:45:35 -07:00
Filipp Zhinkin	ef774bec63	[AArch64] Support SETCCCARRY lowering Support SETCCCARRY lowering to SBCS instruction. Related issue: https://github.com/llvm/llvm-project/issues/44629 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D135302	2022-10-14 22:29:31 +03:00
chenglin.bi	c1909d7337	[DAGCombiner] Fix crash for the merge stores with different value type The crash case comes from #58350. It has two stores: one store is type f32 and the other is v1f32. When we try to merge these two stores on v1f32, the memVT is a vector type, so the old code also uses ISD::EXTRACT_SUBVECTOR for the f32 value and the compiler crashes. This patch inserts a build_vector for the f32 store so that it also generates v1f32 when memVT is v1f32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135954	2022-10-15 01:16:35 +08:00
Nicola Lancellotti	ce1a2ccf94	[NFC] Fix typo in DAGCombiner	2022-10-14 17:47:25 +01:00
Sander de Smalen	02df03c5b7	[AArch64][SME] Add support for arm_locally_streaming functions. Functions with `aarch64_sme_pstatesm_body` will emit a SMSTART at the start of the function, and a SMSTOP at the end of the function, such that all operations use the right value for vscale. Because the placement of these nodes is critically important (i.e. no vscale-dependent operations should be done before SMSTART has been issued), we require glueing the CopyFromReg to the Entry node such that we can insert the SMSTART as part of that glued chain. More details about the SME attributes and design can be found in D131562. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D131582	2022-10-14 13:47:53 +00:00
Xiang1 Zhang	aad013de41	[InlineAsm][bugfix] Correct function addressing in inline asm In Linux PIC model, there are 4 cases about value/label addressing: Case 1: Function call or Label jmp inside the module. Case 2: Data access (such as global variable, static variable) inside the module. Case 3: Function call or Label jmp outside the module. Case 4: Data access (such as global variable) outside the module. Due to current llvm inline asm architecture designed to not "recognize" the asm code, there are quite troubles for us to treat mem addressing differently for same value/address used in different instructions. For example, in pic model, call a func may in plt way or directly pc-related, but lea/mov a function address may use got. This patch fix/refine the case 1 and case 2 in inline asm. Due to currently inline asm didn't support jmp the outsider label, this patch mainly focus on fix the function call addressing bugs in inline asm. Reviewed By: Pengfei, RKSimon Differential Revision: https://reviews.llvm.org/D133914	2022-10-14 09:47:26 +08:00
Mirko Brkusanin	8b8463ef6c	[SelectionDAG] Use consistent type sizes for opcode	2022-10-12 17:33:04 +02:00
Craig Topper	ac9209751a	Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))" This reverts commit `0148df8157`. Getting a lit test failures on AMDGPU but I can't reproduce it so far. Reverting to investigate.	2022-10-11 16:30:40 -07:00
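
Several messages in the log above justify a fold with a short arithmetic argument: the conditional-negate fold for `(mul (sra X, BW-1), Y)`, the `XOR(A,B) -> OR(A,B)` fold when A and B share no bits, and the even-divisor handling in expandDIVREMByConstant. The sketch below checks those identities numerically in Python. It is not LLVM code: the 8-bit width, the helper names (`to_signed`, `sra`, `mul_by_sign`, `neg_and_sign`, `xor_equals_or`, `urem_even`) and the use of plain `%` for the odd-divisor step are all assumptions made for illustration.

```python
BW = 8                      # illustrative bit width (assumption, not from the log)
MASK = (1 << BW) - 1


def to_signed(v: int) -> int:
    """Interpret the low BW bits of v as a two's-complement value."""
    v &= MASK
    return v - (1 << BW) if v & (1 << (BW - 1)) else v


def sra(x: int, amt: int) -> int:
    """Arithmetic shift right of a BW-bit value, returned as BW raw bits."""
    return (to_signed(x) >> amt) & MASK


def mul_by_sign(x: int, y: int) -> int:
    """Left side of the fold: (mul (sra X, BW-1), Y)."""
    return (sra(x, BW - 1) * y) & MASK


def neg_and_sign(x: int, y: int) -> int:
    """Right side of the fold: (neg (and (sra X, BW-1), Y))."""
    return (-(sra(x, BW - 1) & y)) & MASK


def xor_equals_or(a: int, b: int) -> bool:
    """True when a ^ b and a | b agree; guaranteed if a and b share no bits."""
    return (a ^ b) == (a | b)


def urem_even(n: int, d: int) -> int:
    """Unsigned n % d via the even-divisor decomposition from the log.

    Shift dividend and divisor right by the divisor's trailing zeros, take
    the remainder by the now-odd divisor (plain % stands in for the real
    expansion), shift it back left, and add the dividend bits shifted out.
    """
    k = (d & -d).bit_length() - 1        # number of trailing zeros of d
    d_odd = d >> k
    r_odd = (n >> k) % d_odd
    return (r_odd << k) + (n & ((1 << k) - 1))


if __name__ == "__main__":
    for x in range(1 << BW):
        for y in range(1 << BW):
            assert mul_by_sign(x, y) == neg_and_sign(x, y)
            if x & y == 0:
                assert xor_equals_or(x, y)
    for n in range(1 << BW):
        for d in range(1, 1 << BW):
            assert urem_even(n, d) == n % d
    print("all identities hold for BW =", BW)
```

Exhaustive checking is only practical at small widths like this one. The log itself cites Alive2 proofs for several of the folds, and those cover the general case.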

12468 Commits