Commit Graph

263 Commits

Author SHA1 Message Date
Roman Lebedev 046bb8ea7c
[NFC][InstCombine] Autogenerate some check lines affected by an upcoming change 2021-03-23 00:51:18 +03:00
Juneyoung Lee 06829034ca Revert "[ConstantFold] Fold more operations to poison"
This reverts commit 53040a968d due to its
bad interaction with the select i1 -> and/or i1 transformation.

This fixes:
https://bugs.llvm.org/show_bug.cgi?id=49005
https://bugs.llvm.org/show_bug.cgi?id=48435
2021-02-04 00:24:02 +09:00
Nikita Popov fb063c933f [InstCombine] Duplicate tests for logical and/or (NFC)
This replicates existing and/or tests to also test variants using
select. This should help us get a more accurate view of which
optimizations we're missing if we disable the select -> and/or
fold.
2021-01-12 21:50:41 +01:00
Roman Lebedev 897c985e1e
[InstCombine] Canonicalize SPF to abs intrinsic
This patch enables canonicalization of SPF_ABS and SPF_NABS
to the abs intrinsic.
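
As a sketch (not taken from the patch's tests), the select-based abs
pattern and its canonical intrinsic form look roughly like:

  %neg = sub i32 0, %x
  %cmp = icmp sgt i32 %x, -1
  %abs = select i1 %cmp, i32 %x, i32 %neg
  =>
  ; the 'i1 false' argument means an INT_MIN input is not poison
  %abs = call i32 @llvm.abs.i32(i32 %x, i1 false)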

This is a recommit; the original attempt was
05d4c4ebc2,
but it was reverted due to an apparent miscompile,
which has since been fixed by the previous commit.

Differential Revision: https://reviews.llvm.org/D87188
2020-12-18 21:18:14 +03:00
Juneyoung Lee 53040a968d [ConstantFold] Fold more operations to poison
This patch folds more operations to poison.

Alive2 proof: https://alive2.llvm.org/ce/z/mxcb9G (it does not contain tests for div/rem because they fold to poison when raising UB)
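
For illustration only (an assumed example of the general idea, not
necessarily one of the cases this particular patch adds), an instruction
with a poison operand can fold away entirely:

  define i32 @src(i32 %x) {
    %y = add i32 %x, poison
    ret i32 %y
  }
  =>
  define i32 @tgt(i32 %x) {
    ; poison propagates through the add, so the result is poison
    ret i32 poison
  }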

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92270
2020-11-29 21:19:48 +09:00
Roman Lebedev e465f9c303
[InstCombine] Negator: - (C - %x) --> %x - C (PR47997)
This relaxes the one-use restriction on that `sub` fold,
since the addition of Negator apparently broke the
preexisting `C-(C2-X) --> X+(C-C2)` (with C=0) fold.
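
A minimal sketch of the fold with an arbitrary constant (C = 42):

  %s = sub i8 42, %x
  %n = sub i8 0, %s
  =>
  %n = sub i8 %x, 42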
2020-11-03 16:06:51 +03:00
Roman Lebedev 482d65331b
[NFC][InstCombine] Add test coverage for PR47997 2020-11-03 16:06:50 +03:00
Simon Pilgrim de885f1b2a [InstCombine] Add (icmp ne A, 0) | (icmp ne B, 0) --> (icmp ne (A|B), 0) vector support
Scalar cases were already being handled by foldLogOpOfMaskedICmps (so this was dead code), but refactoring to support non-uniform vectors will take some time, so tweak this fold in the meantime.
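
A sketch of the fold in vector form (operand types chosen for
illustration, not from the commit's tests):

  %c1 = icmp ne <2 x i8> %a, zeroinitializer
  %c2 = icmp ne <2 x i8> %b, zeroinitializer
  %r = or <2 x i1> %c1, %c2
  =>
  %ab = or <2 x i8> %a, %b
  %r = icmp ne <2 x i8> %ab, zeroinitializer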
2020-10-19 15:41:21 +01:00
Simon Pilgrim ecd25086d1 [InstCombine] Add (icmp eq B, 0) | (icmp ult/gt A, B) -> (icmp ule A, B-1) vector support 2020-10-19 15:23:48 +01:00
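
A sketch of that fold in vector form (types chosen for illustration): if
a lane of %b is zero, %b-1 wraps to the unsigned maximum, making the ule
compare trivially true, which matches the eq-zero branch:

  %c1 = icmp eq <2 x i8> %b, zeroinitializer
  %c2 = icmp ult <2 x i8> %a, %b
  %r = or <2 x i1> %c1, %c2
  =>
  %bm1 = add <2 x i8> %b, <i8 -1, i8 -1>
  %r = icmp ule <2 x i8> %a, %bm1
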
Simon Pilgrim aba7275bb3 [InstCombine] Add (icmp ne A, 0) | (icmp ne B, 0) --> (icmp ne (A|B), 0) tests 2020-10-19 13:42:53 +01:00
Simon Pilgrim 3dd2f02bb0 [InstCombine] Add (icmp eq B, 0) | (icmp ult A, B) -> (icmp ule A, B-1) vector tests 2020-10-19 11:48:32 +01:00
Nikita Popov 13e19d2e7c Revert "[InstCombine] Canonicalize SPF_ABS to abs intrinsic"
This reverts commit 05d4c4ebc2.

mstorsjo reports a miscompile after this change in
https://reviews.llvm.org/D87188#2281093. Reverting until I can
investigate this.
2020-09-18 09:38:26 +02:00
Nikita Popov 05d4c4ebc2 [InstCombine] Canonicalize SPF_ABS to abs intrinsic
Enable canonicalization of SPF_ABS and SPF_NABS to the abs intrinsic.

To be conservative, the one-use check on the comparison is retained,
this may be relaxed if all goes well.

It's pretty likely that this will uncover places that are missing
handling for the abs() intrinsic. Please report any performance
regressions you see.

Differential Revision: https://reviews.llvm.org/D87188
2020-09-17 22:28:34 +02:00
Sanjay Patel 11d8eedfa5 [InstCombine] move/add tests for icmp with mul operands; NFC 2020-09-07 12:40:37 -04:00
Nikita Popov ada8a17d94 [InstCombine] Fold abs intrinsic eq zero
Following the same transform for the select version of abs.
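
A sketch of the fold (abs(x) is zero exactly when x is zero, so the
intrinsic call drops out):

  %a = call i32 @llvm.abs.i32(i32 %x, i1 false)
  %c = icmp eq i32 %a, 0
  =>
  %c = icmp eq i32 %x, 0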
2020-09-05 15:11:38 +02:00
Nikita Popov 94c71d6aa1 [InstCombine] Add tests for abs intrinsic eq zero (NFC) 2020-09-05 15:11:38 +02:00
Sanjay Patel ec06b38130 [InstCombine] canonicalize 'not' ops before logical shifts
This reverses the existing transform that would uniformly canonicalize any 'xor' after any shift. In the case of logical shifts, that turns a 'not' into an arbitrary 'xor' with a constant, and that's probably not as good for analysis, SCEV, or codegen.

The SCEV motivating case is discussed in:
http://bugs.llvm.org/PR47136

There's an analysis motivating case at:
http://bugs.llvm.org/PR38781

I did draft a patch that would do the same for 'ashr', but that's questionable: it just swaps the position of a 'not', and it uncovers at least 2 missing folds that we would probably need to address as preliminary steps.

Alive proofs:
https://rise4fun.com/Alive/BBV

  Name: shift right of 'not'
  Pre: C2 == (-1 u>> C1)
  %a = lshr i8 %x, C1
  %r = xor i8 %a, C2
  =>
  %n = xor i8 %x, -1
  %r = lshr i8 %n, C1

  Name: shift left of 'not'
  Pre: C2 == (-1 << C1)
  %a = shl i8 %x, C1
  %r = xor i8 %a, C2
  =>
  %n = xor i8 %x, -1
  %r = shl i8 %n, C1

  Name: ashr of 'not'
  %a = ashr i8 %x, C1
  %r = xor i8 %a, -1
  =>
  %n = xor i8 %x, -1
  %r = ashr i8 %n, C1

Differential Revision: https://reviews.llvm.org/D86243
2020-08-22 09:38:13 -04:00
Roman Lebedev f5df5cd558
Recommit "[InstCombine] Negator: -(X << C) --> X * (-1 << C)"
This reverts commit ac70b37a00,
which reverted commit 8aeb2fe13a
because codegen tests were broken and I needed time to investigate.

This shows some regressions in tests, but they are all around GEPs,
so I'm not really sure how important those are.

https://rise4fun.com/Alive/1Gn
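
A minimal sketch of the negation fold with C = 3 (so -1 << C is -8):

  %s = shl i8 %x, 3
  %n = sub i8 0, %s
  =>
  %n = mul i8 %x, -8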
2020-08-05 15:59:13 +03:00
Roman Lebedev ac70b37a00
Revert "[InstCombine] Negator: -(X << C) --> X * (-1 << C)"
Breaks codegen tests, will recommit later.

This reverts commit 8aeb2fe13a.
2020-08-05 03:19:38 +03:00
Roman Lebedev 8aeb2fe13a
[InstCombine] Negator: -(X << C) --> X * (-1 << C)
This shows some regressions in tests, but they are all around GEPs,
so I'm not really sure how important those are.

https://rise4fun.com/Alive/1Gn
2020-08-05 03:13:14 +03:00
Roman Lebedev 8fd57b06a4
[NFC][InstCombine] Fix value names (s/%tmp/%i/) and autogenerate a few tests being affected by negator change 2020-08-05 03:12:14 +03:00
Simon Pilgrim 769b979930 [InstCombine] Add (vXi1 trunc(lshr(x,c))) -> icmp_eq(and(x,c')) support for non-uniform vectors
As noted on PR46531, we were only performing this transform on uniform vectors as we were using the m_APInt pattern matcher to extract the shift amount.
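
A sketch of the equivalence for a non-uniform shift amount (the exact
canonical form InstCombine emits may differ in predicate and constants):

  %s = lshr <2 x i8> %x, <i8 3, i8 7>
  %t = trunc <2 x i8> %s to <2 x i1>
  =>
  %m = and <2 x i8> %x, <i8 8, i8 -128>
  %t = icmp ne <2 x i8> %m, zeroinitializer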

Differential Revision: https://reviews.llvm.org/D83035
2020-07-02 16:56:33 +01:00
Simon Pilgrim 103d62e131 [InstCombine] Add some (vXi1 trunc(lshr(x,c))) -> icmp_eq(and(x,c')) tests for vectors with undef elements
Suggested on D83035
2020-07-02 16:04:30 +01:00
Simon Pilgrim 421c02e5c6 [InstCombine] Add some (vXi1 trunc(lshr(x,c))) -> icmp_eq(and(x,c')) tests for non-uniform vectors
As noticed on PR46531
2020-07-02 11:56:51 +01:00
Sanjay Patel 4abab5c5ca [InstCombine] generalize canonicalization of masked equality comparisons
  (X | MaskC) == C --> (X & ~MaskC) == C ^ MaskC
  (X | MaskC) != C --> (X & ~MaskC) != C ^ MaskC

We have more analysis for 'and' patterns and already lean this way
in the existing code, so this should be neutral or better in IR.

If this does not do as well in codegen, the problem already exists
and we should fix that based on target costs/heuristics.

http://volta.cs.utah.edu:8080/z/oP3ecL

define void @src(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) {
  %or = or i8 %x, %OrC
  %eq = icmp eq i8 %or, %C
  store i1 %eq, i1* %p0

  %ne = icmp ne i8 %or, %C
  store i1 %ne, i1* %p1
  ret void
}

define void @tgt(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) {
  %NotOrC = xor i8 %OrC, -1
  %a = and i8 %x, %NotOrC
  %NewC = xor i8 %C, %OrC
  %eq = icmp eq i8 %a, %NewC
  store i1 %eq, i1* %p0

  %ne = icmp ne i8 %a, %NewC
  store i1 %ne, i1* %p1
  ret void
}
2020-04-25 11:31:57 -04:00
Eric Christopher 45dca04395 Exclude bitcast and ext/trunc signbit optimization on ppc_fp128
Revision a1c05fe <https://reviews.llvm.org/rGa1c05fe20f3def1f1be9f50d2adefc6b6f1578ad>
removed bitcast from the list of problematic transformations; however:

  %97 = fptrunc ppc_fp128 %2 to double            // we need to check ppc_fp128 here to prevent the transformation
  %98 = bitcast double %97 to i64                 // a1c05fe checks ppc_fp128 here
  %99 = icmp slt i64 %98, 0
  %100 = zext i1 %99 to i8
  store i8 %100, i8* %7, align 1

so this patch does that. I'm also disabling it in the presence of extend just in case.

I verified separately that the hashes of -std::infinity and std::infinity no longer match.

Differential Revision: https://reviews.llvm.org/D77911
2020-04-10 17:07:55 -07:00
Sanjay Patel a1c05fe20f [InstCombine] exclude bitcast of ppc_fp128 in icmp signbit fold
Based on the post-commit comments for rG0f56bbc, there might
be a problem with this transform:

(bitcast (fpext/fptrunc X) to iX) < 0 --> (bitcast X to iY) < 0

...and the ppc_fp128 data type, so conservatively bypass if we
are bitcasting a ppc_fp128.

We might be able to account for endian or other differences to
enable this for PowerPC again if that is useful.

Differential Revision: https://reviews.llvm.org/D77642
2020-04-08 08:56:19 -04:00
Sanjay Patel e268ec8e0d [InstCombine] add icmp+cast tests for ppc_fp128; NFC
See post-commit comments for rG0f56bbc.
2020-04-07 07:35:01 -04:00
Sanjay Patel 0f56bbc1a5 [InstCombine] reduce FP-casted and bitcasted signbit check
PR45305:
https://bugs.llvm.org/show_bug.cgi?id=45305

Alive2 proofs:
http://volta.cs.utah.edu:8080/z/bVyrko
http://volta.cs.utah.edu:8080/z/Vxpz9q
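
A sketch of the fpext case of this fold (the sign bit survives the
cast, so the check can be done on the narrower type):

  %e = fpext float %x to double
  %b = bitcast double %e to i64
  %c = icmp slt i64 %b, 0
  =>
  %b2 = bitcast float %x to i32
  %c = icmp slt i32 %b2, 0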
2020-03-27 17:33:59 -04:00
Sanjay Patel e72730ee3a [InstCombine] add tests for FP cast+bitcast signbit checks; NFC
PR45305:
https://bugs.llvm.org/show_bug.cgi?id=45305
2020-03-27 17:25:25 -04:00
Nikita Popov 9adedd146d [InstCombine] Relax preconditions for ashr+and+icmp fold (PR44754)
Fix for https://bugs.llvm.org/show_bug.cgi?id=44754. We already have
a fold that converts icmp (and (ashr X, C3), C2), C1 into
icmp (and X, C2'), C1', but it imposed overly strict requirements on the
transform.

Relax this by checking that both C2 and C1 don't shift out bits
(in a signed sense) when forming the new constants.

Alive proofs (https://rise4fun.com/Alive/PTz0):

    Name: ashr_legal
    Pre: ((C2 << C3) >> C3) == C2 && ((C1 << C3) >> C3) == C1
    %a = ashr i16 %x, C3
    %b = and i16 %a, C2
    %c = icmp i16 %b, C1
    =>
    %d = and i16 %x, C2 << C3
    %c = icmp i16 %d, C1 << C3

    Name: ashr_shiftout_eq
    Pre: ((C2 << C3) >> C3) == C2 && ((C1 << C3) >> C3) != C1
    %a = ashr i16 %x, C3
    %b = and i16 %a, C2
    %c = icmp eq i16 %b, C1
    =>
    %c = false

Note that >> corresponds to ashr here. The case of an equality
comparison has some special handling in this transform, because
it will fold to a true/false result if the condition on the comparison
constant is violated.

Differential Revision: https://reviews.llvm.org/D74294
2020-02-18 17:49:46 +01:00
Nikita Popov 9bc6bc2d8c [InstCombine] Add more tests for icmp+and+ashr; NFC 2020-02-18 17:47:48 +01:00
Nikita Popov d4627b90a0 [InstCombine] Avoid modifying instructions in-place
As discussed on D73919, this replaces a few cases where we were
modifying multiple operands of instructions in-place with the
creation of a new instruction, which we generally prefer nowadays.

This tends to be more readable and less prone to worklist management
bugs.

Test changes are only superficial (instruction naming and order).
2020-02-08 17:05:56 +01:00
Mikhail Maltsev b6534b2a26 [Analysis] Don't assume that unsigned overflow can't happen in EmitGEPOffset (PR42699)
Summary:
Currently when computing a GEP offset using the function EmitGEPOffset
for the following instruction

  getelementptr inbounds i32, i32* %p, i64 %offs

we get

  mul nuw i64 %offs, 4

Unfortunately we cannot assume that unsigned wrapping won't happen
here because %offs is allowed to be negative.

Making such assumptions can lead to miscompilations: see the new test
test24_neg_offs in InstCombine/icmp.ll. Without the patch InstCombine
would generate the following comparison:

   icmp eq i64 %offs, 4611686018427387902; 0x3ffffffffffffffe

Whereas the correct value to compare with is -2.

This patch replaces the NUW flag with NSW in the multiplication
instructions generated by EmitGEPOffset and adjusts the test suite.

https://bugs.llvm.org/show_bug.cgi?id=42699

Reviewers: chandlerc, craig.topper, ostannard, lebedev.ri, spatel, efriedma, nlopes, aqjune

Reviewed By: lebedev.ri

Subscribers: reames, lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68342

llvm-svn: 375089
2019-10-17 08:59:06 +00:00
Sanjay Patel eb8d39e113 [InstCombine] allow icmp+binop folds before min/max bailout (PR43310)
This has the potential to uncover missed analysis/folds as shown in the
min/max code comment/test, but fewer restrictions on icmp folds should
be better in general to solve cases like:
https://bugs.llvm.org/show_bug.cgi?id=43310

llvm-svn: 372510
2019-09-22 14:31:53 +00:00
Sanjay Patel d2a524288d [InstCombine] add tests for icmp fold hindered by min/max; NFC
llvm-svn: 372509
2019-09-22 14:23:22 +00:00
Sanjay Patel f201b1c918 [InstCombine] add/move tests for icmp with add operand; NFC
llvm-svn: 371988
2019-09-16 14:05:19 +00:00
Sanjay Patel c5cd808156 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734 <https://reviews.llvm.org/rL125734>
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...i.e., if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 is a related patch in this series.

llvm-svn: 371981
2019-09-16 12:54:34 +00:00
Sanjay Patel 14ce3fde04 [InstCombine] add icmp tests with extra uses; NFC
llvm-svn: 371979
2019-09-16 12:19:18 +00:00
Sanjay Patel 3daf168fa9 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

There are similar checks as noted with the TODO comments. I'm
hoping to remove those restrictions too, but if any of these
does cause a regression, it should be easier to correct by making
small, individual commits.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...i.e., if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

llvm-svn: 371940
2019-09-15 20:56:34 +00:00
Sanjay Patel c77ad16f8e [InstCombine] add icmp tests with extra uses; NFC
llvm-svn: 371939
2019-09-15 20:13:27 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Simon Pilgrim 8ee477a2ab [InstSimplify] SimplifyICmpInst - icmp eq/ne %X, undef -> undef
As discussed on PR41125 and D59363, we have a mismatch between icmp eq/ne cases with an undef operand:

When the other operand is constant, we fold to undef (handled in ConstantFoldCompareInstruction).
When the other operand is non-constant, we fold to a bool constant based on isTrueWhenEqual (handled in SimplifyICmpInst).

Neither is really wrong, but this patch changes the logic in SimplifyICmpInst to consistently fold to undef.

The NewGVN test change is annoying (as with most heavily reduced tests) but AFAICT I have kept the purpose of the test based on rL291968.
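
A minimal sketch of the now-consistent folding:

  define i1 @src(i32 %x) {
    %c = icmp eq i32 %x, undef
    ret i1 %c
  }
  =>
  define i1 @tgt(i32 %x) {
    ret i1 undef
  }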

Differential Revision: https://reviews.llvm.org/D59541

llvm-svn: 356456
2019-03-19 14:08:23 +00:00
Simon Pilgrim 9497b2b2f7 [InstCombine] Regenerate + add icmp with undef tests
Better test coverage for PR41125 and D59363

llvm-svn: 356448
2019-03-19 11:44:22 +00:00
Sanjay Patel 68bc5fb0ad [InstCombine] X | C == C --> (X & ~C) == 0
We should canonicalize to one of these forms,
and compare-with-zero could be more conducive
to follow-on transforms. This also leads to
generally better codegen as shown in PR40611:
https://bugs.llvm.org/show_bug.cgi?id=40611
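
A minimal sketch with C = 8 (so ~C is -9 as i8):

  %o = or i8 %x, 8
  %c = icmp eq i8 %o, 8
  =>
  %a = and i8 %x, -9
  %c = icmp eq i8 %a, 0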

llvm-svn: 353313
2019-02-06 16:43:54 +00:00
Sanjay Patel 51abb86f09 [InstCombine] add tests for PR40611 and regenerate checks; NFC
Lots of unrelated diffs here from the newer version of the script.

llvm-svn: 353312
2019-02-06 16:17:05 +00:00
Sanjay Patel 320cf5dde5 [InstCombine] auto-generate full checks for icmp dominator tests; NFC
llvm-svn: 348270
2018-12-04 15:00:35 +00:00
Sanjay Patel 05aadf885d [InstCombine] reverse 'trunc X to <N x i1>' canonicalization; 2nd try
Re-trying r344082 because it unintentionally included extra diffs.

Original commit message:
icmp ne (and X, 1), 0 --> trunc X to N x i1

Ideally, we'd do the same for scalars, but there will likely be
regressions unless we add more trunc folds as we're doing here
for vectors.

The motivating vector case is from PR37549:
https://bugs.llvm.org/show_bug.cgi?id=37549

define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {
  %c = fcmp ole <4 x float> %x, %y
  %s = sext <4 x i1> %c to <4 x i32>
  %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
  %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
  %cond = or <4 x i32> %s1, %s2
  %condtr = trunc <4 x i32> %cond to <4 x i1>
  %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
  ret <4 x float> %r
}

Here's a sampling of the vector codegen for that case using
mask+icmp (current behavior) vs. trunc (with this patch):

AVX before:

vcmpleps        %xmm1, %xmm0, %xmm0
vpermilps       $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps       $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps   %xmm0, %xmm1, %xmm0
vandps  LCPI0_0(%rip), %xmm0, %xmm0
vxorps  %xmm1, %xmm1, %xmm1
vpcmpeqd        %xmm1, %xmm0, %xmm0
vblendvps       %xmm0, %xmm3, %xmm2, %xmm0

AVX after:

vcmpleps        %xmm1, %xmm0, %xmm0
vpermilps       $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps       $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps   %xmm0, %xmm1, %xmm0
vblendvps       %xmm0, %xmm2, %xmm3, %xmm0

AVX512f before:

vcmpleps        %xmm1, %xmm0, %xmm0
vpermilps       $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps       $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps   %xmm0, %xmm1, %xmm0
vpbroadcastd    LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
vptestnmd       %zmm1, %zmm0, %k1
vblendmps       %zmm3, %zmm2, %zmm0 {%k1}

AVX512f after:

vcmpleps        %xmm1, %xmm0, %xmm0
vpermilps       $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps       $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps   %xmm0, %xmm1, %xmm0
vpslld  $31, %xmm0, %xmm0
vptestmd        %zmm0, %zmm0, %k1
vblendmps       %zmm2, %zmm3, %zmm0 {%k1}

AArch64 before:

fcmge   v0.4s, v1.4s, v0.4s
zip1    v1.4s, v0.4s, v0.4s
zip2    v0.4s, v0.4s, v0.4s
orr     v0.16b, v1.16b, v0.16b
movi    v1.4s, #1
and     v0.16b, v0.16b, v1.16b
cmeq    v0.4s, v0.4s, #0
bsl     v0.16b, v3.16b, v2.16b

AArch64 after:

fcmge   v0.4s, v1.4s, v0.4s
zip1    v1.4s, v0.4s, v0.4s
zip2    v0.4s, v0.4s, v0.4s
orr     v0.16b, v1.16b, v0.16b
bsl     v0.16b, v2.16b, v3.16b

PowerPC-le before:

xvcmpgesp 34, 35, 34
vspltisw 0, 1
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxlxor 35, 35, 35
xxland 34, 0, 32
vcmpequw 2, 2, 3
xxsel 34, 36, 37, 34

PowerPC-le after:

xvcmpgesp 34, 35, 34
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxsel 34, 37, 36, 0

Differential Revision: https://reviews.llvm.org/D52747

llvm-svn: 344181
2018-10-10 20:47:46 +00:00
Sanjay Patel 58fc00d0bc revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization
This commit accidentally included the diffs from D53057.

llvm-svn: 344178
2018-10-10 20:39:39 +00:00