llvm-project

Commit Graph

Author	SHA1	Message	Date
David Green	9020e22a87	[InstCombine] Convert xor (ashr X, BW-1), C -> select(X >=s 0, C, ~C) The sequence of instructions `xor (ashr X, BW-1), C` (or with a truncation `xor (trunc (ashr X, BW-1)), C)` takes a value, produces all zeros or all ones and with it optionally inverts a constant depending on whether the original input was positive or negative. This is the same as checking if the value is positive, and selecting between the constant and ~constant. https://alive2.llvm.org/ce/z/NJ85qY This is a fairly general version of a fold that helps pull saturating arithmetic into a canonical form. Differential Revision: https://reviews.llvm.org/D109151	2021-10-29 11:19:20 +01:00
Stanislav Mekhanoshin	f7f430c913	[InstCombine] Fixed non-deterministic order of new instructions Fixes non-deterministic order of XOR instructions created after `5a7a458306`. The order of call argument evaluation is not defined, so create one Value before the call.	2021-10-28 12:14:02 -07:00
Stanislav Mekhanoshin	5a7a458306	[InstCombine] Fold `(c & ~(a \| b)) \| (b & ~(a \| c))` to `~a & (b ^ c)` ``` ---------------------------------------- define i4 @src(i4 %a, i4 %b, i4 %c) { %0: %or1 = or i4 %a, %b %not1 = xor i4 %or1, 15 %and1 = and i4 %not1, %c %or2 = or i4 %a, %c %not2 = xor i4 %or2, 15 %and2 = and i4 %not2, %b %or3 = or i4 %and1, %and2 ret i4 %or3 } => define i4 @tgt(i4 %a, i4 %b, i4 %c) { %0: %xor = xor i4 %b, %c %not = xor i4 %a, 15 %or3 = and i4 %xor, %not ret i4 %or3 } Transformation seems to be correct! ``` Differential Revision: https://reviews.llvm.org/D112276	2021-10-28 11:54:30 -07:00
Sanjay Patel	3888de9507	[InstCombine] generalize reassociated Demorgan folds This updates the recent D112108 / `b92412fb28` to handle the flipped logic ('or') sibling: https://alive2.llvm.org/ce/z/Y2L6Ch	2021-10-21 10:39:29 -04:00
Stanislav Mekhanoshin	b92412fb28	[InstCombine] Fold `(a & ~b) & ~c` to `a & ~(b \| c)` %not1 = xor i32 %b, -1 %not2 = xor i32 %c, -1 %and1 = and i32 %a, %not1 %and2 = and i32 %and1, %not2 => %i1 = or i32 %b, %c %i2 = xor i32 %i1, -1 %and2 = and i32 %i2, %a Differential Revision: https://reviews.llvm.org/D112108	2021-10-20 13:05:46 -07:00
Sanjay Patel	a49f5386ce	[InstCombine] generalize fold for mask-with-signbit-splat, part 2 This removes an over-specified fold. The more general transform was added with: `727e642e97` There's a difference on an existing test that shows a potentially unnecessary use limit on an icmp fold. That fold is in InstCombinerImpl::foldICmpSubConstant(), and IIRC there was some back-and-forth on it and similar folds because they could cause analysis/passes (SCEV, LSR?) to miss optimizations. Differential Revision: https://reviews.llvm.org/D111410	2021-10-15 17:11:29 -04:00
Sanjay Patel	727e642e97	[InstCombine] generalize fold for mask-with-signbit-splat (iN X s>> (N-1)) & Y --> (X < 0) ? Y : 0 https://alive2.llvm.org/ce/z/qeYhdz I was looking at a missing abs() transform and found my way to this generalization of an existing fold that was added with D67799. As discussed in that review, we want to make sure codegen handles this difference well, and for all of the targets/types that I spot-checked, it looks good. I am leaving the existing fold in place in this commit because it covers a potentially missing icmp fold, but I plan to remove that as a follow-up commit as suggested during review. Differential Revision: https://reviews.llvm.org/D111410	2021-10-15 16:25:48 -04:00
Sanjay Patel	905d170803	[InstCombine] allow matching vector splat constants in foldLogOpOfMaskedICmps() This is NFC-intended for scalar code. There are still unnecessary m_ConstantInt restrictions in surrounding code, so this is not a complete fix. This prevents regressions seen with a planned follow-on to D111410.	2021-10-13 10:15:26 -04:00
Sanjay Patel	bc72baa047	[InstCombine] add folds for logical nand/nor This is noted as a regression in: https://llvm.org/PR52077	2021-10-05 18:31:20 -04:00
Sanjay Patel	668beb8ae8	[InstCombine] refactor folds of 'not' instructions; NFC This removes repeated calls to m_Not, so hopefully a little more efficient. Also, we may need to enhance some of these blocks to allow logical and/or (select of bools).	2021-10-05 16:36:57 -04:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Sanjay Patel	41ff7612b3	[InstCombine] allow splat vectors for narrowing masked fold Mostly cosmetic diffs, but the use of m_APInt matches splat constants.	2021-09-17 11:24:16 -04:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Nikita Popov	fafe5a6f44	[InstCombine] Perform "eq of parts" fold with logical ops The pattern matched here is too complex for the general logical and/or to bitwise and/or conversion to trigger. However, the fold is poison-safe, so match it with a select root as well: https://alive2.llvm.org/ce/z/vNzzSg https://alive2.llvm.org/ce/z/Beyumt	2021-08-22 16:55:53 +02:00
Sanjay Patel	eee0ded337	[InstCombine] add min/max intrinsics as freely invertible candidates In the optimized test, we are able to peek through the min/max that has 2 min/max operands and invert them all: https://alive2.llvm.org/ce/z/7gYMN5	2021-08-19 08:41:38 -04:00
Sanjay Patel	e10c3beca5	[InstCombine] add one-use check for min/max fold with not operands; NFC This makes the intrinsic logic match the cmp+select idiom folds just below. It's not clearly a win either way unless we think that a 'not' op costs more than min/max. The cmp+select folds on these patterns are more extensive than the intrinsics currently and may have some complicated interactions, so I'm trying to make those line up and bring the optimizations for intrinsics up to parity.	2021-08-19 08:41:38 -04:00
Simon Pilgrim	afc6b09dee	[InstCombine] getMaskedTypeForICmpPair - remove dead code. NFCI. Ok should be true at this point, so the early-out is dead - replace with an assert.	2021-07-30 19:23:05 +01:00
Simon Pilgrim	401d6685c0	[InstCombine] InstCombinerImpl::visitOr - enable bitreverse matching Currently we only match bswap intrinsics from or(shl(),lshr()) style patterns when we could often match bitreverse intrinsics almost as cheaply. Differential Revision: https://reviews.llvm.org/D90170	2021-05-15 13:39:09 +01:00
Nikita Popov	a8f7dee1df	[InstCombine] Support one-hot merge for logical and/or If a logical and/or is used, we need to be careful not to propagate a potential poison value from the RHS by inserting a freeze instruction. Otherwise it works the same way as bitwise and/or. This is intended to address the regression reported at https://reviews.llvm.org/D101191#2751002. Differential Revision: https://reviews.llvm.org/D102279	2021-05-12 21:01:18 +02:00
Roman Lebedev	554b1bced3	[InstCombine] ~(C + X) --> ~C - X (PR50308) We can not rely on (C+X)-->(X+C) already happening, because we might not have visited that `add` yet. The added testcase would get stuck in an endless combine loop.	2021-05-12 16:10:55 +03:00
Nikita Popov	1556540372	[InstCombine] Clean up one-hot merge optimization (NFC) Remove the requirement that the instruction is a BinaryOperator, make the predicate check more compact and use slightly more meaningful naming for the and operands.	2021-05-11 23:22:11 +02:00
Nikita Popov	463ea28e96	[InstCombine] Fold comparison of integers by parts Let's say you represent (i32, i32) as an i64 from which the parts are extracted with lshr/trunc. Then, if you compare two tuples by parts you get something like A[0] == B[0] && A[1] == B[1], just that the part extraction happens by lshr/trunc and not a narrow load or similar. The fold implemented here reduces such equality comparisons by converting them into a comparison on a larger part of the integer (which might be the whole integer). It handles both the "and of eq" and the conjugated "or of ne" case. I'm being conservative with one-use for now, though this could be relaxed if profitable (the base pattern converts 11 instructions into 5 instructions, but there's quite a few variations on how it can play out). Differential Revision: https://reviews.llvm.org/D101232	2021-05-10 22:22:39 +02:00
Juneyoung Lee	24ce194cfe	[InstCombine] generalize select + select/and/or folding using implied conditions This patch optimizes the remaining possible cases in D101191 by generalizing isImpliedCondition()-based foldings. Assume that there is `op a, (select b, _, _)` where op is one of `and i1`, `or i1` or their select forms. We can do the following optimization based on the result of `isImpliedCondition(a, b)`: If a = true implies… - b = true: - select a, (select b, A, B), false => select a, A, false : https://alive2.llvm.org/ce/z/WCnZYh - and a, (select b, A, B) => select a, A, false : https://alive2.llvm.org/ce/z/uZhcMG - b = false: - select a, (select b, A, B), false => select a, B, false : https://alive2.llvm.org/ce/z/c2hJpV - and a, (select b, A, B) => select a, B, false : https://alive2.llvm.org/ce/z/5ggwMM If a = false implies… - b = true: - select a, true, (select b, A, B) => select a, true, A : https://alive2.llvm.org/ce/z/tidKvH - or a, (select b, A, B) => select a, true, A : https://alive2.llvm.org/ce/z/cC-uyb - b = false: - select a, true, (select b, A, B) => select a, true, B : https://alive2.llvm.org/ce/z/ZXpJq9 - or a, (select b, A, B) => select a, true, B : https://alive2.llvm.org/ce/z/hnDrJj Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101720	2021-05-04 09:42:06 +09:00
Roman Lebedev	91248e2db9	[InstCombine] Improve "get low bit mask upto and including bit X" pattern https://alive2.llvm.org/ce/z/3u-48R	2021-04-11 18:08:08 +03:00
Sanjay Patel	c0bbd0cc35	[InstCombine] fold not ops around min/max intrinsics This is another step towards parity with the existing cmp+select folds (see D98152).	2021-04-07 17:31:36 -04:00
Sanjay Patel	0333ed8e0c	[InstCombine] move abs transform to helper function; NFC The swap of the operands can affect later transforms that are expecting a constant as operand 1. I don't think we can trigger a bug with the current code, but I hit that problem while drafting a new transform for min/max intrinsics.	2021-04-07 08:35:07 -04:00
Philip Reames	4bf8985f4f	Replace calls to IntrinsicInst::Create with CallInst::Create [nfc] There is no IntrinsicInst::Create. These are binding to the method in the super type. Be explicit about which method is being called.	2021-04-06 13:23:58 -07:00
Sanjay Patel	412fc74140	[InstCombine] fold not+or+neg ~((-X) \| Y) --> (X - 1) & (~Y) We generally prefer 'add' over 'sub', this reduces the dependency chain, and this looks better for codegen on x86, ARM, and AArch64 targets. https://llvm.org/PR45755 https://alive2.llvm.org/ce/z/cxZDSp	2021-04-02 13:16:36 -04:00
Philip Reames	ebc61f9d3c	[instcombine] Collapse trivial or recurrences If we have a recurrence of the form <Start, Or, Step> we know that the value taken by the recurrence stabilizes on the first iteration (provided step is loop invariant). We can exploit that fact to remove the loop carried dependence in the recurrence. Differential Revision: https://reviews.llvm.org/D97578 (or part)	2021-03-08 09:21:38 -08:00
Philip Reames	239a618180	[instcombine] Collapse trivial and recurrences If we have a recurrence of the form <Start, And, Step> we know that the value taken by the recurrence stabilizes on the first iteration (provided step is loop invariant). We can exploit that fact to remove the loop carried dependence in the recurrence. Differential Revision: https://reviews.llvm.org/D97578 (and part)	2021-03-08 09:21:38 -08:00
Simon Pilgrim	609d0c9772	[InstCombine] matchBSwapOrBitReverse - remove pattern matching early-out. NFCI. recognizeBSwapOrBitReverseIdiom + collectBitParts have pattern matching to bail out early if a bswap/bitreverse pattern isn't possible - we should be able to rely on this instead without any notable change in compile time. This is part of a cleanup towards letting matchBSwapOrBitReverse /recognizeBSwapOrBitReverseIdiom use 'root' instructions that aren't ORs (FSHL/FSHRs in particular which can be prematurely created). Differential Revision: https://reviews.llvm.org/D97056	2021-02-20 13:15:34 +00:00
Roman Lebedev	d1a6f92fd5	[InstCombine] Fold `(~x) \| y` --> `~(x & (~y))` iff it is free to do so Iff we know we can get rid of the inversions in the new pattern, we can thus get rid of the inversion in the old pattern, thus decreasing instruction count. Note that we could position this transformation as just hoisting of the `not` (still, iff y is freely negatible), but the test changes show a number of regressions, so let's not do that.	2021-01-22 17:23:54 +03:00
Roman Lebedev	79b0d21ce9	[InstCombine] Fold `(~x) & y` --> `~(x \| (~y))` iff it is free to do so Iff we know we can get rid of the inversions in the new pattern, we can thus get rid of the inversion in the old pattern, thus decreasing instruction count.	2021-01-22 17:23:54 +03:00
Juneyoung Lee	2d89ebd5d1	Address unused variable warning	2021-01-19 09:30:16 +09:00
Juneyoung Lee	0441df94ad	[InstCombine,InstSimplify] Optimize select followed by and/or/xor This patch adds `A & (A && B)` -> `A && B` (similarly for or + logical or) Also, this patch adds `~(select C, (icmp pred X, Y), const)` -> `select C, (icmp pred' X, Y), ~const`. Alive2 proof: merge_and: https://alive2.llvm.org/ce/z/teMR97 merge_or: https://alive2.llvm.org/ce/z/b4yZUp xor_and: https://alive2.llvm.org/ce/z/_-TXHi xor_or: https://alive2.llvm.org/ce/z/2uYx_a Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D94861	2021-01-19 09:14:17 +09:00
Dávid Bolvanský	0529946b5b	[instCombine] Add (A ^ B) \| ~(A \| B) -> ~(A & B) define i32 @src(i32 %x, i32 %y) { %0: %xor = xor i32 %y, %x %or = or i32 %y, %x %neg = xor i32 %or, 4294967295 %or1 = or i32 %xor, %neg ret i32 %or1 } => define i32 @tgt(i32 %x, i32 %y) { %0: %and = and i32 %x, %y %neg = xor i32 %and, 4294967295 ret i32 %neg } Transformation seems to be correct! https://alive2.llvm.org/ce/z/Cvca4a	2021-01-12 19:29:17 +01:00
Roman Lebedev	374ef57f13	[InstCombine] 'hoist xor-by-constant from xor-by-value': completely give up on constant exprs As Mikael Holmén is noting in the post-commit review for the first fix https://reviews.llvm.org/rGd4ccef38d0bb#967466 not hoisting constantexprs is not enough, because if the xor originally was a constantexpr (i.e. X is a constantexpr), `SimplifyAssociativeOrCommutative()` in `visitXor()` will immediately undo this transform, thus again causing an infinite combine loop. This transform has resulted in a surprising number of constantexpr failures.	2020-12-29 16:28:18 +03:00
Roman Lebedev	d4ccef38d0	[InstCombine] 'hoist xor-by-constant from xor-by-value': ignore constantexprs As it is being reported (in post-commit review) in https://reviews.llvm.org/D93857 this fold (as i expected, but failed to come up with test coverage despite trying) has issues with constant expressions. Since we only care about true constants, which constantexprs are not, don't perform such hoisting for constant expressions.	2020-12-28 20:15:20 +03:00
Roman Lebedev	d9ebaeeb46	[InstCombine] Hoist xor-by-constant from xor-by-value This is one of the deficiencies that can be observed in https://godbolt.org/z/YPczsG after D91038 patch set. This exposed two missing folds, one was fixed by the previous commit, another one is `(A ^ B) \| ~(A ^ B) --> -1` / `(A ^ B) & ~(A ^ B) --> 0`. `-early-cse` will catch it: https://godbolt.org/z/4n1T1v, but it isn't meaningful to fix it in InstCombine, because we'd need to essentially do our own CSE, and we can't even rely on `Instruction::isIdenticalTo()`, because there are no guarantees that the order of operands matches. So let's just accept it as a loss.	2020-12-24 21:20:50 +03:00
Roman Lebedev	5b78303433	[InstCombine] Fold `a & ~(a ^ b)` to `a & b` ``` ---------------------------------------- define i32 @and_xor_not_common_op(i32 %a, i32 %b) { %0: %b2 = xor i32 %b, 4294967295 %t2 = xor i32 %a, %b2 %t4 = and i32 %t2, %a ret i32 %t4 } => define i32 @and_xor_not_common_op(i32 %a, i32 %b) { %0: %t4 = and i32 %a, %b ret i32 %t4 } Transformation seems to be correct! ```	2020-12-24 21:20:49 +03:00
Roman Lebedev	a91e96702a	[InstCombine] Fold `and(shl(zext(x), width(SIGNMASK) - width(%x)), SIGNMASK)` to `and(sext(%x), SIGNMASK)` One less instruction and reducing use count of zext. As alive2 confirms, we're fine with all the weird combinations of undef elts in constants, but unless the shift amount was undef for a lane, we must sanitize undef mask to zero, since sign bits are no longer zeros. https://rise4fun.com/Alive/d7r ``` ---------------------------------------- Optimization: zz Precondition: ((C1 == (width(%r) - width(%x))) && isSignBit(C2)) %o0 = zext %x %o1 = shl %o0, C1 %r = and %o1, C2 => %n0 = sext %x %r = and %n0, C2 Done: 2016 Optimization is correct! ```	2020-11-20 00:31:27 +03:00
Sanjay Patel	4a66a1d17a	[InstCombine] allow vectors for masked-add -> xor fold https://rise4fun.com/Alive/I4Ge Name: add with pow2 mask Pre: isPowerOf2(C2) && (C1 & C2) != 0 && (C1 & (C2-1)) == 0 %a = add i8 %x, C1 %r = and i8 %a, C2 => %n = and i8 %x, C2 %r = xor i8 %n, C2	2020-11-17 13:36:08 -05:00
Simon Pilgrim	f7ebdec987	[InstCombine] visitAnd - remove unnecessary Value X, Y shadow variables. NFCI. Fixes a number of Wshadow warnings.	2020-11-17 17:59:21 +00:00
Simon Pilgrim	abf29d9862	[InstCombine] visitAnd - use m_SpecificInt instead of m_APInt + comparison. NFCI. m_SpecificInt has the same 'no undef element' behaviour as m_APInt so no change there, and anyway we have test coverage for undef elements in the fold. Noticed while fixing a Wshadow warning about shadow Value X, Y variables.	2020-11-17 17:37:10 +00:00
Sanjay Patel	f791ad7e1e	[InstCombine] remove scalar constraint for mask-of-add fold https://rise4fun.com/Alive/V6fP Name: add with low mask Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0 %a = add i8 %x, C1 %r = and i8 %a, C2 => %r = and i8 %x, C2	2020-11-17 12:13:45 -05:00
Sanjay Patel	433696911a	[InstCombine] relax constraints on mask-of-add There are 2 changes: 1. Remove the unnecessary one-use check. 2. Remove the unnecessary power-of-2 check. https://rise4fun.com/Alive/V6fP Name: add with low mask Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0 %a = add i8 %x, C1 %r = and i8 %a, C2 => %r = and i8 %x, C2	2020-11-17 12:13:44 -05:00
Sanjay Patel	6ddc237766	[InstCombine] reduce code for flip of masked bit; NFC There are 1-2 potential follow-up NFC commits to reduce this further on the way to generalizing this for vectors. The operand replacing path should be dead code because demanded bits handles that more generally (D91415).	2020-11-15 15:43:34 -05:00
Simon Pilgrim	6b2eb31e1e	[InstCombine] Add support for zext(and(neg(amt),width-1)) rotate shift amount patterns Alive2: https://alive2.llvm.org/ce/z/bCvvHd	2020-10-26 11:22:41 +00:00
Simon Pilgrim	3052e474ec	[InstCombine] matchBSwapOrBitReverse - recognise or(fshl(),fshl()) bswap patterns. I'm not certain InstCombinerImpl::matchBSwapOrBitReverse needs to filter the or(op0(),op1()) ops - there are just too many cases that recognizeBSwapOrBitReverseIdiom/collectBitParts handle now (and quickly).	2020-10-25 10:17:45 +00:00
Simon Pilgrim	1cab3bf004	[InstCombine] matchBSwapOrBitReverse - expose bswap/bitreverse matching flags. matchBSwapOrBitReverse was hardcoded to just match bswaps - we're going to need to expose the ability to match bitreverse as well, so make this part of the function call.	2020-10-23 12:35:28 +01:00
Simon Pilgrim	19a13bf538	[InstCombine] Rename InstCombinerImpl::matchBSwap to matchBSwapOrBitReverse. NFCI. This matches bswap and bitreverse intrinsics, so we should make that clear in the function name.	2020-10-23 12:35:27 +01:00
Simon Pilgrim	7b4a828452	[InstCombine] foldOrOfICmps - use m_Specific instead of explicit comparisons. NFCI.	2020-10-21 11:53:45 +01:00
Martin Storsjö	4de215ff18	Revert "[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support" Also revert "[InstCombine] foldOrOfICmps - use m_Specific instead of explicit comparisons. NFCI." to make the primarily intended revert work. This reverts commits `ce13549761` and `e372a5f86f`. This commit caused failed asserts e.g. like this: $ cat repro.cpp bool a(char b) { return b >= '0' && b <= '9' \|\| (b \| 32) >= 'a' && (b \| 32) <= 'z'; } $ clang++ -target x86_64-linux-gnu -c -O2 repro.cpp clang++: ../include/llvm/ADT/APInt.h:1151: bool llvm::APInt::operator==(const llvm::APInt&) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed.	2020-10-21 09:47:18 +03:00
Simon Pilgrim	ce13549761	[InstCombine] foldOrOfICmps - use m_Specific instead of explicit comparisons. NFCI.	2020-10-20 16:26:41 +01:00
Simon Pilgrim	e372a5f86f	[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support Reapplied rGa704d8238c86 with a check for integer/integervector types to prevent matching with pointer types	2020-10-20 14:14:26 +01:00
Simon Pilgrim	adb52e5f9e	[InstCombine] foldOrOfICmps - only fold (icmp_eq B, 0) \| (icmp_ult/gt A, B) for integer types Fixes a number of stage2 buildbots that were failing when I generalized the m_ConstantInt() logic - that didn't match for pointer types but m_Zero() does......	2020-10-19 17:05:38 +01:00
Simon Pilgrim	482e6f0041	Revert rGa704d8238c86bac: "[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support" This reverts commit `a704d8238c`. Causing stage2 build failures on some bots.	2020-10-19 16:03:36 +01:00
Simon Pilgrim	de885f1b2a	[InstCombine] Add (icmp ne A, 0) \| (icmp ne B, 0) --> (icmp ne (A\|B), 0) vector support Scalar cases were already being handled by foldLogOpOfMaskedICmps (so this was dead code), but refactoring to support non-uniform vectors will take some time, so tweak this fold in the meantime.	2020-10-19 15:41:21 +01:00
Simon Pilgrim	ecd25086d1	[InstCombine] Add (icmp eq B, 0) \| (icmp ult/gt A, B) -> (icmp ule A, B-1) vector support	2020-10-19 15:23:48 +01:00
Simon Pilgrim	a704d8238c	[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support	2020-10-19 14:55:18 +01:00
Simon Pilgrim	1d90e53044	[InstCombine] foldOrOfICmps - pull out repeated getOperand() calls. NFCI.	2020-10-19 14:28:08 +01:00
Simon Pilgrim	0b7b446a40	[InstCombine] Support vectors-with-undef in and(logicalshift(1,X),1) --> zext(X == 0) fold	2020-10-19 11:10:32 +01:00
Sanjay Patel	53e92b4c0e	[InstCombine] (~A & B) ^ A -> A \| B Differential Revision: https://reviews.llvm.org/D86395	2020-10-17 12:20:18 -04:00
Simon Pilgrim	83ae625f0c	[InstCombine] visitAnd - pull out repeated I.getType() calls. NFCI.	2020-10-16 15:43:11 +01:00
Simon Pilgrim	253f24cf4c	[InstCombine] Remove custom and(trunc(and(x,c1)),c2) fold This is more correctly handled by canEvaluateTruncated (one use checks etc.) and covers all the tests cases that were added for this fold.	2020-10-16 15:43:10 +01:00
Simon Pilgrim	55991b44b7	[InstCombine] foldAndOrOfICmpsOfAndWithPow2 - add vector support Support vector cases for folding: (iszero(A & K1) \| iszero(A & K2)) -> (A & (K1 \| K2)) != (K1 \| K2) (!iszero(A & K1) & !iszero(A & K2)) -> (A & (K1 \| K2)) == (K1 \| K2)	2020-10-16 10:41:40 +01:00
Simon Pilgrim	23f1616626	[InstCombine] Use m_SpecificInt instead of m_APInt + comparison. NFCI.	2020-10-15 16:06:27 +01:00
Simon Pilgrim	2b45639ea0	[InstCombine] InstCombineAndOrXor - refactor cast<ConstantInt> usages to PatternMatch. NFCI. First step towards replacing these to add full vector support.	2020-10-15 16:06:17 +01:00
Simon Pilgrim	09be7623e4	[InstCombine] visitXor - refactor ((X^C1)>>C2)^C3 -> (X>>C2)^((C1>>C2)^C3) fold. NFCI. This is still ConstantInt-only (scalar) but is refactored to use PatternMatch to make adding vector support in the future relatively trivial.	2020-10-15 14:38:15 +01:00
Simon Pilgrim	89a2a47870	[InstCombine] Add m_SpecificIntAllowUndef pattern matcher m_SpecificInt doesn't accept undef elements in a vector splat value - tweak specific_intval to optionally allow undefs and add the m_SpecificIntAllowUndef variants. Allows us to remove the m_APIntAllowUndef + comparison hack inside matchFunnelShift	2020-10-14 16:15:53 +01:00
Simon Pilgrim	1e4d882f9a	[InstCombine] matchFunnelShift - add support for non-uniform vectors containing undefs. Replace m_SpecificInt with m_APIntAllowUndef to match splats containing undefs, then use ConstantExpr::mergeUndefsWith to merge the undefs together in the result. The undef funnel shift amounts are getting replaced with zero later on - I'll address this in a later patch, otherwise we lose potential shift by splat value patterns.	2020-10-14 10:42:27 +01:00
Simon Pilgrim	bbf3925879	[InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw (REAPPLIED) If value tracking can confirm that a shift value is less than the type bitwidth then we can more confidently fold general or(shl(a,x),lshr(b,sub(bw,x))) patterns to a funnel/rotate intrinsic pattern without causing bad codegen regressions in the backend (see D89139). Reapplied after the shift canonicalization in rG02295e6d1a15 which removed the need to flip the shift values. Differential Revision: https://reviews.llvm.org/D88783	2020-10-12 16:06:41 +01:00
Simon Pilgrim	fa56623370	[InstCombine] matchFunnelShift - remove shift value commutation. NFCI. After rG02295e6d1a15 we no longer need to invert the shift values for fshr - this is just hidden at the moment as funnel shifts only ever match for constant values so never use the fshr "Sub on SHL" path.	2020-10-12 15:55:18 +01:00
Simon Pilgrim	02295e6d1a	[InstCombine] matchFunnelShift - canonicalize to OR(SHL,LSHR). NFCI. Simplify the shift amount matching code by canonicalizing the shift ops first.	2020-10-12 15:10:59 +01:00
Simon Pilgrim	45d785e22b	Revert rGb97093e520036f8 - "[InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw" This reverts commit `b97093e520`. Funnel shift argument commutation isn't working correctly	2020-10-12 11:38:52 +01:00
Simon Pilgrim	b97093e520	[InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw If value tracking can confirm that a shift value is less than the type bitwidth then we can more confidently fold general or(shl(a,x),lshr(b,sub(bw,x))) patterns to a funnel/rotate intrinsic pattern without causing bad codegen regressions in the backend (see D89139). Differential Revision: https://reviews.llvm.org/D88783	2020-10-11 10:37:20 +01:00
Simon Pilgrim	5415fef3ab	[InstCombine] matchFunnelShift - support non-uniform constant vector shift amounts (PR46895) Complete basic PR46895 fixes by refactoring D87452/D88402 to allow us to match non-uniform constant values. We still don't handle non-uniform vectors that contain undef elements, but that can wait until we have a decent generic mechanism for this. Differential Revision: https://reviews.llvm.org/D88420	2020-10-08 12:56:27 +01:00
Simon Pilgrim	e1d4ca0009	[InstCombine] matchRotate - add support for matching general funnel shifts with constant shift amounts (PR46896) First step towards extending the existing rotation support to full funnel shift handling now that the backend legalization support has improved. This enables us to match the shift by constant cases, which are pretty trivial to expand again if necessary. D88420 will add non-uniform support for funnel shifts as well once it's been finalized. Differential Revision: https://reviews.llvm.org/D88834	2020-10-08 11:05:14 +01:00
Simon Pilgrim	aa47962cc9	[InstCombine] canNarrowShiftAmt - replace custom Constant matching with m_SpecificInt_ICMP The existing code ignores undef values which matches m_SpecificInt_ICMP, although m_SpecificInt_ICMP returns false for an all-undef constant, I've added test coverage at rGfe0197e194a64f9 to show that undef folding should already have dealt with that case.	2020-10-08 10:53:32 +01:00
Simon Pilgrim	3aa93f690b	[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) (Reapplied) If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern. Reapplied with early-out if recognizeBSwapOrBitReverseIdiom collects a source wider than the result type. Differential Revision: https://reviews.llvm.org/D88578	2020-10-03 14:52:42 +01:00
Simon Pilgrim	0364721e3e	Revert rG3d14a1e982ad27 - "[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191)" This reverts commit `3d14a1e982`. This is breaking on some 2stage clang buildbots	2020-10-02 18:17:14 +01:00
Simon Pilgrim	3d14a1e982	[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern. Differential Revision: https://reviews.llvm.org/D88578	2020-10-02 17:25:12 +01:00
Simon Pilgrim	63ee42a06b	[InstCombine] matchRotate - force splat of uniform constant rotation amounts (PR46895) Fixes minor bug in D88402 where we were using the original shift constant (with undefs) instead of one with the splat values (re)splatted to all elements.	2020-09-28 15:12:41 +01:00
Simon Pilgrim	dabb14cadd	[InstCombine] matchRotate - allow undef in uniform constant rotation amounts (PR46895) An extension to D87452, we can safely permit undefs in the uniform/splat detection https://alive2.llvm.org/ce/z/nT-ptN Differential Revision: https://reviews.llvm.org/D88402	2020-09-28 13:36:13 +01:00
Simon Pilgrim	9ff9c1d8ee	[InstCombine] matchRotate - support (uniform) constant rotation amounts (PR46895) This patch adds handling of rotation patterns with constant shift amounts - the next bit will be how we want to support non-uniform constant vectors. Differential Revision: https://reviews.llvm.org/D87452	2020-09-25 22:03:10 +01:00
Christopher Tetreault	640f20b0c7	[SVE] Remove calls to VectorType::getNumElements from InstCombine Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82237	2020-08-31 12:59:10 -07:00
Sanjay Patel	ec06b38130	[InstCombine] canonicalize 'not' ops before logical shifts This reverses the existing transform that would uniformly canonicalize any 'xor' after any shift. In the case of logical shifts, that turns a 'not' into an arbitrary 'xor' with constant, and that's probably not as good for analysis, SCEV, or codegen. The SCEV motivating case is discussed in: http://bugs.llvm.org/PR47136 There's an analysis motivating case at: http://bugs.llvm.org/PR38781 I did draft a patch that would do the same for 'ashr' but that's questionable because it's just swapping the position of a 'not' and uncovers at least 2 missing folds that we would probably need to deal with as preliminary steps. Alive proofs: https://rise4fun.com/Alive/BBV Name: shift right of 'not' Pre: C2 == (-1 u>> C1) %a = lshr i8 %x, C1 %r = xor i8 %a, C2 => %n = xor i8 %x, -1 %r = lshr i8 %n, C1 Name: shift left of 'not' Pre: C2 == (-1 << C1) %a = shl i8 %x, C1 %r = xor i8 %a, C2 => %n = xor i8 %x, -1 %r = shl i8 %n, C1 Name: ashr of 'not' %a = ashr i8 %x, C1 %r = xor i8 %a, -1 => %n = xor i8 %x, -1 %r = ashr i8 %n, C1 Differential Revision: https://reviews.llvm.org/D86243	2020-08-22 09:38:13 -04:00
Sanjay Patel	c8d711adae	[InstCombine] reduce code duplication; NFC	2020-08-19 12:05:12 -04:00
Dávid Bolvanský	c2f0101310	[InstCombine] ~(~X + Y) -> X - Y Proof: https://alive2.llvm.org/ce/z/4xharr Solves PR47051 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D85593	2020-08-11 11:05:42 +02:00
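The identity behind `c2f0101310` follows from `~a == -a - 1` in two's complement, so `~(~X + Y) == -(-X - 1 + Y) - 1 == X - Y`. A minimal sketch that checks this exhaustively at 8 bits (the function name is an assumption for illustration, not part of the patch):

```cpp
#include <cstdint>

// Exhaustively compare both sides of the fold for every pair of 8-bit values.
// Returns true when ~(~x + y) and x - y agree modulo 2^8 for all inputs.
bool notOfNotAddMatchesSub() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 256; ++y) {
      // Compute in unsigned arithmetic, then truncate to 8 bits.
      const uint8_t lhs = static_cast<uint8_t>(~(~x + y));
      const uint8_t rhs = static_cast<uint8_t>(x - y);
      if (lhs != rhs)
        return false;
    }
  }
  return true;
}
```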
Sanjay Patel	23693ffc3b	[InstCombine] reduce xor-of-or's bitwise logic (PR46955); 2nd try The 1st try at this (rG2265d01f2a5b) exposed what looks like unspecified behavior in C/C++ resulting in test variations. The arguments to BinaryOperator::CreateAnd() were both IRBuilder function calls, and the order in which they execute determines the order of the new instructions in the IR. But the order of function arg evaluation is not fixed by the rules of C/C++, so depending on compiler config, the test would fail because the test expected a single fixed ordering of instructions. Original commit message: I tried to use m_Deferred() on this, but didn't find a clean way to do that. http://bugs.llvm.org/PR46955 https://alive2.llvm.org/ce/z/2h6QTq	2020-08-03 10:21:56 -04:00
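The ordering problem described in `23693ffc3b` can be sketched outside LLVM. In the toy below, `ToyBuilder` is a hypothetical stand-in for an IRBuilder that only logs creation order. Writing `combine(B.create("not"), B.create("xor"))` would leave the log order up to the compiler, whereas naming each value first pins it down.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an IRBuilder: it only records the order in
// which "instructions" are created.
struct ToyBuilder {
  std::vector<std::string> Log;
  int create(const std::string &Name) {
    Log.push_back(Name);
    return static_cast<int>(Log.size());
  }
};

// Stand-in for a call such as BinaryOperator::CreateAnd(LHS, RHS).
int combine(int LHS, int RHS) { return LHS * 10 + RHS; }

// Deterministic version: each operand is created in its own statement, so
// the creation order no longer depends on argument evaluation order.
std::vector<std::string> buildInFixedOrder() {
  ToyBuilder B;
  const int Not = B.create("not");
  const int Xor = B.create("xor");
  B.Log.push_back("and#" + std::to_string(combine(Not, Xor)));
  return B.Log;
}
```

With the operands hoisted into locals, the log is always `not`, `xor`, then the combined entry, regardless of compiler configuration.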
Sanjay Patel	f19a9be385	Revert "[InstCombine] reduce xor-of-or's bitwise logic (PR46955)" This reverts commit `2265d01f2a`. Seeing bot failures after this change like: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/42586	2020-08-03 08:58:41 -04:00
Sanjay Patel	2265d01f2a	[InstCombine] reduce xor-of-or's bitwise logic (PR46955) I tried to use m_Deferred() on this, but didn't find a clean way to do that. http://bugs.llvm.org/PR46955 https://alive2.llvm.org/ce/z/2h6QTq	2020-08-03 08:31:43 -04:00
Sebastian Neubauer	2a6c871596	[InstCombine] Move target-specific inst combining For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time. D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target). This introduces three new functions: TargetTransformInfo::instCombineIntrinsic TargetTransformInfo::simplifyDemandedUseBitsIntrinsic TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64. This allows to move about 3000 lines out from InstCombine to the targets. Differential Revision: https://reviews.llvm.org/D81728	2020-07-22 15:59:49 +02:00
Sanjay Patel	d8b268680d	[InstCombine] prevent infinite looping in or-icmp fold (PR46712) I'm not sure if the test is truly minimal, but we need to induce a situation where a value becomes a constant but is not immediately folded before getting to the 'or' transform.	2020-07-15 14:12:12 -04:00
Sanjay Patel	2552f65183	[InstCombine] fold mask op into casted shift (PR46013) https://rise4fun.com/Alive/Qply8 Pre: C2 == (-1 u>> zext(C1)) %a = ashr %x, C1 %s = sext %a to i16 %r = and i16 %s, C2 => %s2 = sext %x to i16 %r = lshr i16 %s2, zext(C1) https://bugs.llvm.org/show_bug.cgi?id=46013	2020-06-07 09:33:18 -04:00
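The Alive precondition in `2552f65183` says the mask must be exactly the all-ones value shifted right by the same amount. A minimal sketch, assuming i8 widened to i16 as in the proof, checks the fold exhaustively; the helper names are made up for illustration.

```cpp
#include <cstdint>

// Arithmetic shift right that avoids relying on the behaviour of >> for
// negative operands: for v < 0, ~v is non-negative, so the shift is
// well-defined and the outer ~ restores the sign fill.
static int ashr(int v, unsigned c) { return v >= 0 ? (v >> c) : ~((~v) >> c); }

// Checks: and (sext (ashr i8 x, C1) to i16), C2  ==  lshr (sext x to i16), C1
// under the precondition C2 == (-1 u>> C1) at i16.
bool maskOfCastedShiftHolds() {
  for (unsigned c1 = 0; c1 < 8; ++c1) {
    const uint16_t c2 = static_cast<uint16_t>(0xFFFFu >> c1);
    for (int x = -128; x <= 127; ++x) {
      // Source: shift in i8, sign-extend to i16, then mask.
      const uint16_t src =
          static_cast<uint16_t>(static_cast<uint16_t>(ashr(x, c1)) & c2);
      // Target: sign-extend to i16 first, then logical shift right.
      const uint16_t tgt =
          static_cast<uint16_t>(static_cast<uint16_t>(x) >> c1);
      if (src != tgt)
        return false;
    }
  }
  return true;
}
```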
Roman Lebedev	fde8eb00e1	[InstCombine] visitMaskedMerge(): when unfolding, sanitize undef constants (PR45955) We can't leave undef vector element constants as-is, it is a miscompile, so we need to sanitize them. We have two vectors (C and ~C): * We can't replace undef with 0 in both of them * We can't replace undef with 0 in only one of them * We could replace undef with -1 in both of them * We could replace undef with -1 in only one(!) of them * We could replace undef with -1 in one and 0 in another one of them. Therefore, it seems best to go with the last option, since otherwise we'd lose knowledge that C and ~C have no common bits set, which seems more important than preserving partial undef knowledge. Fixes https://bugs.llvm.org/show_bug.cgi?id=45955	2020-05-17 22:53:03 +03:00
Simon Pilgrim	bab44a698e	[InstCombine] matchOrConcat - match BITREVERSE Fold or(zext(bitreverse(x)),shl(zext(bitreverse(y)),bw/2) -> bitreverse(or(zext(x),shl(zext(y),bw/2)) Practically this is the same as the BSWAP pattern so we might as well handle it.	2020-05-10 16:00:29 +01:00
Simon Pilgrim	5c91aa6603	[InstCombine] Fold or(zext(bswap(x)),shl(zext(bswap(y)),bw/2)) -> bswap(or(zext(x),shl(zext(y), bw/2)) This adds a general combine that can be used to fold: or(zext(OP(x)), shl(zext(OP(y)),bw/2)) --> OP(or(zext(x), shl(zext(y),bw/2))) Allowing us to widen 'concat-able' style or+zext patterns - I've just set this up for BSWAP but we could use this for other similar ops (BITREVERSE for instance). We already do something similar for bitop(bswap(x),bswap(y)) --> bswap(bitop(x,y)) Fixes PR45715 Reviewed By: @lebedev.ri Differential Revision: https://reviews.llvm.org/D79041	2020-05-05 12:30:10 +01:00
Michael Liao	495bb8feb9	Fix `-Wparentheses` warnings. NFC.	2020-04-24 15:04:01 -04:00
Sanjay Patel	62da6ecea2	[InstCombine] substitute equivalent constant to reduce logic-of-icmps (X == C) && (Y Pred1 X) --> (X == C) && (Y Pred1 C) (X != C) \|\| (Y Pred1 X) --> (X != C) \|\| (Y Pred1 C) This cooperates/overlaps with D78430, but it is a more general transform that gets us most of the expected simplifications and several other improvements. http://volta.cs.utah.edu:8080/z/5gxjjc PR45618: https://bugs.llvm.org/show_bug.cgi?id=45618 Differential Revision: https://reviews.llvm.org/D78582	2020-04-23 10:19:16 -04:00
Michael Liao	21529355e1	Fix `-Wparentheses` warnings. NFC.	2020-04-21 15:02:59 -04:00
Sanjay Patel	978166f209	[InstCombine] improve types/names for logic-of-icmp helper function; NFC	2020-04-21 10:16:45 -04:00
Sanjay Patel	ba72389269	[InstCombine] improve types/names for logic-of-icmp helper functions; NFC	2020-04-21 09:18:22 -04:00
Sanjay Patel	812970edda	[InstCombine] replace undef in vector constant for safe shift transform (PR45447) As noted in PR45447, we have a vector-constant-with-undef-element transform bug: https://bugs.llvm.org/show_bug.cgi?id=45447 We replace undefs with a safe constant (0 or -1) based on the (non-)negative predicate constraint. So this is correct: http://volta.cs.utah.edu:8080/z/WZE36H ...but this is not: http://volta.cs.utah.edu:8080/z/boj8gJ Previously, we were relying on getSafeVectorConstantForBinop() in the related fold (D76800). But that's making an assumption about what qualifies as "safe", and that assumption may not always hold. Differential Revision: https://reviews.llvm.org/D77739	2020-04-09 08:00:46 -04:00
Christopher Tetreault	155740cc33	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77263	2020-04-08 15:15:41 -07:00
Nikita Popov	672e8bfbfc	[InstCombine] Fix worklist management in foldXorOfICmps() Because this code does not use the IC-aware replaceInstUsesWith() helper, we need to manually push users to the worklist. This is NFC-ish, in that it may only change worklist order.	2020-03-28 18:25:21 +01:00
Jonathan Roelofs	7a89a5d81b	[InstCombine] Fix Incorrect fold of ashr+xor -> lshr w/ vectors Fixes https://bugs.llvm.org/show_bug.cgi?id=43665	2020-03-26 12:09:36 -06:00
Florian Hahn	9063022573	[InstCombine] Avoid nested Create calls, to guarantee order. The original code allowed creating the != checks in unpredictable order, causing http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/34014 to fail.	2020-02-18 09:44:11 +01:00
Florian Hahn	6c85e92bcf	[InstCombine] Simplify a umul overflow check to a != 0 && b != 0. This patch adds a simplification if an OR weakens the overflow condition for umul.with.overflow by treating any non-zero result as overflow. In that case, we overflow if both umul.with.overflow operands are != 0, as in that case the result can only be 0, iff the multiplication overflows. Code like this is generated by code using __builtin_mul_overflow with negative integer constants, e.g. bool test(unsigned long long v, unsigned long long *res) { return __builtin_mul_overflow(v, -4775807LL, res); } ``` ---------------------------------------- Name: D74141 %res = umul_overflow {i8, i1} %a, %b %mul = extractvalue {i8, i1} %res, 0 %overflow = extractvalue {i8, i1} %res, 1 %cmp = icmp ne %mul, 0 %ret = or i1 %overflow, %cmp ret i1 %ret => %t0 = icmp ne i8 %a, 0 %t1 = icmp ne i8 %b, 0 %ret = and i1 %t0, %t1 ret i1 %ret %res = umul_overflow {i8, i1} %a, %b %mul = extractvalue {i8, i1} %res, 0 %cmp = icmp ne %mul, 0 %overflow = extractvalue {i8, i1} %res, 1 Done: 1 Optimization is correct! ``` Reviewers: nikic, lebedev.ri, spatel, Bigcheese, dexonsmith, aemerson Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D74141	2020-02-18 09:11:55 +01:00
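The umul fold in `6c85e92bcf` rests on a simple observation: if both operands are non-zero, the full product is non-zero, so either the truncated result is non-zero or the multiplication overflowed. A minimal sketch that checks this at i8 (the function name is an assumption for illustration):

```cpp
#include <cstdint>

// Checks, for all i8 pairs, that
//   overflow(a * b) || (trunc(a * b) != 0)   ==   (a != 0) && (b != 0)
bool umulOverflowCheckHolds() {
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      const unsigned wide = a * b;                 // exact product
      const bool overflow = wide > 0xFFu;          // does not fit in 8 bits
      const uint8_t mul = static_cast<uint8_t>(wide);
      const bool src = overflow || (mul != 0);
      const bool tgt = (a != 0) && (b != 0);
      if (src != tgt)
        return false;
    }
  }
  return true;
}
```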
Nikita Popov	5b2b67be8e	[InstCombine] Remove unnecessary worklist push; NFCI This is no longer needed after `d4627b90a0`, should have dropped it there...	2020-02-08 17:09:28 +01:00
Nikita Popov	d4627b90a0	[InstCombine] Avoid modifying instructions in-place As discussed on D73919, this replaces a few cases where we were modifying multiple operands of instructions in-place with the creation of a new instruction, which we generally prefer nowadays. This tends to be more readable and less prone to worklist management bugs. Test changes are only superficial (instruction naming and order).	2020-02-08 17:05:56 +01:00
Nikita Popov	878cb38a5c	[InstCombine] Add replaceOperand() helper Adds a replaceOperand() helper, which is like Instruction.setOperand() but adds the old operand to the worklist. This reduces the amount of missing or incorrect worklist management. This only applies the helper to a relatively small subset of setOperand() calls in InstCombine, namely those of the pattern `I.setOperand(); return &I;`, where it is most obviously applicable. Differential Revision: https://reviews.llvm.org/D73803	2020-02-03 19:00:17 +01:00
Nikita Popov	e6c9ab4fb7	[InstCombine] Rename worklist methods; NFC This renames Worklist.AddDeferred() to Worklist.add() and Worklist.Add() to Worklist.push(). The intention here is that Worklist.add() should be the go-to method for explicit worklist management, while the raw Worklist.push() is mostly for InstCombine internals. I will then migrate uses of Worklist.push() to Worklist.add() in followup changes. As suggested by spatel on D73411 I'm also changing the remaining method names to lowercase first character, in line with current coding standards. Differential Revision: https://reviews.llvm.org/D73745	2020-02-03 18:56:51 +01:00
Nikita Popov	efba7ed05e	[PatternMatch] Make m_c_ICmp swap the predicate (PR42801) This addresses https://bugs.llvm.org/show_bug.cgi?id=42801. The m_c_ICmp() matcher is changed to provide the swapped predicate if the operands are swapped. Existing uses of m_c_ICmp() fall in one of two categories: Working on equality predicates only, where swapping is irrelevant. Or performing a manual swap, in which case this patch removes it. The only exception is the foldICmpWithLowBitMaskedVal() fold, which does not swap the predicate, and instead reasons about whether a swap occurred or not for each predicate. Getting the swapped predicate allows us to merge the logic for pairs of predicates, instead of duplicating it. Differential Revision: https://reviews.llvm.org/D72976	2020-01-22 22:56:26 +01:00
Sanjay Patel	f8962571f7	[InstCombine] try to pull 'not' of select into compare operands not (select ?, (cmp TPred, ?, ?), (cmp FPred, ?, ?)) --> select ?, (cmp TPred', ?, ?), (cmp FPred', ?, ?) If both sides of the select are cmps, we can remove an instruction. The case where only one side is a cmp is deferred to a possible follow-on patch. We have a more general 'isFreeToInvert' analysis, but I'm not seeing a way to use that more widely without inducing infinite looping (opposing transforms). Here, we flip the compare predicates directly, so we should not have any danger by creating extra intermediate 'not' ops. Alive proofs: https://rise4fun.com/Alive/jKa Name: both select values are compares - invert predicates %tcmp = icmp sle i32 %x, %y %fcmp = icmp ugt i32 %z, %w %sel = select i1 %cond, i1 %tcmp, i1 %fcmp %not = xor i1 %sel, true => %tcmp_not = icmp sgt i32 %x, %y %fcmp_not = icmp ule i32 %z, %w %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not Name: false val is compare - invert/not %fcmp = icmp ugt i32 %z, %w %sel = select i1 %cond, i1 %tcmp, i1 %fcmp %not = xor i1 %sel, true => %tcmp_not = xor i1 %tcmp, -1 %fcmp_not = icmp ule i32 %z, %w %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not Differential Revision: https://reviews.llvm.org/D72007	2020-01-07 10:44:23 -05:00
Roman Lebedev	7015a5c54b	[InstCombine] conditional sign-extend of high-bit-extract: 'or' pattern. In this pattern, all the "magic" bits that we'd `add` are all high sign bits, and in the value we'd be adding to they are all unset, not unexpectedly, so we can have an `or` there: https://rise4fun.com/Alive/ups It is possible that `haveNoCommonBitsSet()` should be taught about this pattern so that we never have an `add` variant, but the reasoning would need to be recursive (because of that `select`), so i'm not really sure that would be worth it just yet. llvm-svn: 375378	2019-10-20 20:52:06 +00:00
Roman Lebedev	a2fa03af3a	[InstCombine] foldUnsignedUnderflowCheck(): one last pattern with 'sub' (PR43251) https://rise4fun.com/Alive/0j9 llvm-svn: 372930	2019-09-25 22:59:59 +00:00
Roman Lebedev	23646952e2	[InstCombine] Fold (A - B) u>=/u< A --> B u>/u<= A iff B != 0 https://rise4fun.com/Alive/KtL This also shows that the fold added in D67412 / r372257 was too specific, and the new fold allows those test cases to be handled more generically, therefore i delete now-dead code. This is yet again motivated by D67122 "[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour" llvm-svn: 372912	2019-09-25 19:06:40 +00:00
Roman Lebedev	45fd1e9d50	[InstCombine] (a+b) < a && (a+b) != 0 -> (0-b) < a iff a/b != 0 (PR43259) Summary: This is again motivated by D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds. For ``` #include <cassert> char* test(char& base, signed long offset) { __builtin_assume(offset < 0); return &base + offset; } ``` We produce https://godbolt.org/z/r40U47 and again those two icmp's can be merged: ``` Name: 0 Pre: C != 0 %adjusted = add i8 %base, C %not_null = icmp ne i8 %adjusted, 0 %no_underflow = icmp ult i8 %adjusted, %base %r = and i1 %not_null, %no_underflow => %neg_offset = sub i8 0, C %r = icmp ugt i8 %base, %neg_offset ``` https://rise4fun.com/Alive/ALap https://rise4fun.com/Alive/slnN There are 3 other variants of this pattern, i believe they all will go into InstSimplify. https://bugs.llvm.org/show_bug.cgi?id=43259 Reviewers: spatel, xbolva00, nikic Reviewed By: spatel Subscribers: efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67849 llvm-svn: 372768	2019-09-24 16:10:50 +00:00
Roman Lebedev	5b881f356c	[InstCombine] (a+b) <= a && (a+b) != 0 -> (0-b) < a (PR43259) Summary: This is again motivated by D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds. This pattern isn't exactly what we get there (strict vs. non-strict predicate), but this pattern does not require known-bits analysis, so it is best to handle it first. ``` Name: 0 %adjusted = add i8 %base, %offset %not_null = icmp ne i8 %adjusted, 0 %no_underflow = icmp ule i8 %adjusted, %base %r = and i1 %not_null, %no_underflow => %neg_offset = sub i8 0, %offset %r = icmp ugt i8 %base, %neg_offset ``` https://rise4fun.com/Alive/knp There are 3 other variants of this pattern, they all will go into InstSimplify: https://rise4fun.com/Alive/bIDZ https://bugs.llvm.org/show_bug.cgi?id=43259 Reviewers: spatel, xbolva00, nikic Reviewed By: spatel Subscribers: hiraditya, majnemer, vsk, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67846 llvm-svn: 372767	2019-09-24 16:10:38 +00:00
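The add-based check in `5b881f356c`, and its strict-predicate sibling `45fd1e9d50` with the `C != 0` precondition, can be verified by brute force at i8. This is a minimal sketch; the function name and the `strict` flag are assumptions made for illustration.

```cpp
#include <cstdint>

// Checks (base + offset) != 0 && (base + offset) u<= base  ==  base u> (0 - offset).
// With strict == true the unsigned predicate is u< and offset == 0 is
// skipped, mirroring the "Pre: C != 0" condition of the sibling fold.
bool addUnderflowCheckHolds(bool strict) {
  for (unsigned b = 0; b < 256; ++b) {
    for (unsigned o = 0; o < 256; ++o) {
      if (strict && o == 0)
        continue; // precondition of the strict variant
      const uint8_t base = static_cast<uint8_t>(b);
      const uint8_t offset = static_cast<uint8_t>(o);
      const uint8_t adjusted = static_cast<uint8_t>(base + offset);
      const bool notNull = adjusted != 0;
      const bool noUnderflow = strict ? (adjusted < base) : (adjusted <= base);
      const uint8_t negOffset = static_cast<uint8_t>(0 - offset);
      const bool src = notNull && noUnderflow;
      const bool tgt = base > negOffset;
      if (src != tgt)
        return false;
    }
  }
  return true;
}
```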
Huihui Zhang	a4dd98f2e9	[InstCombine] Fold a shifty implementation of clamp-to-allones. Summary: Fold or(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X) into X s> Y ? -1 : X https://rise4fun.com/Alive/d8Ab clamp255 is a common operator in image processing, can be implemented in a shifty way "(255 - X) >> 31 \| X & 255". Fold shift into select enables more optimization, e.g., vmin generation for ARM target. Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon Reviewed By: lebedev.ri Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67800 llvm-svn: 372678	2019-09-24 00:30:09 +00:00
Huihui Zhang	8952199715	[InstCombine] Fold a shifty implementation of clamp-to-zero. Summary: Fold and(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X) into X s> Y ? X : 0 https://rise4fun.com/Alive/lFH Fold shift into select enables more optimization, e.g., vmax generation for ARM target. Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon Reviewed By: lebedev.ri Subscribers: xbolva00, andreadb, craig.topper, RKSimon, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67799 llvm-svn: 372676	2019-09-24 00:15:03 +00:00
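Both shifty clamp folds (`a4dd98f2e9` for 'or' and `8952199715` for 'and') rely on the shifted difference being all-ones exactly when `X s> Y`, provided the subtraction does not overflow. A minimal i8 sketch, with helper names made up for illustration:

```cpp
#include <cstdint>

// Models ashr(d, BW-1): all-ones when d is negative, zero otherwise.
static int signMask(int d) { return d < 0 ? -1 : 0; }

// Checks, for all i8 pairs where Y - X does not overflow (the nsw flag):
//   and(ashr(Y - X, 7), X)  ==  X s> Y ? X  : 0
//   or (ashr(Y - X, 7), X)  ==  X s> Y ? -1 : X
bool shiftyClampHolds() {
  for (int x = -128; x <= 127; ++x) {
    for (int y = -128; y <= 127; ++y) {
      const int d = y - x;
      if (d < -128 || d > 127)
        continue; // nsw would be violated, so the fold makes no claim here
      const int mask = signMask(d);
      const int8_t andSrc = static_cast<int8_t>(mask & x);
      const int8_t andTgt = static_cast<int8_t>(x > y ? x : 0);
      const int8_t orSrc = static_cast<int8_t>(mask | x);
      const int8_t orTgt = static_cast<int8_t>(x > y ? -1 : x);
      if (andSrc != andTgt || orSrc != orTgt)
        return false;
    }
  }
  return true;
}
```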
Roman Lebedev	23aac95a32	[InstCombine] foldOrOfICmps(): Acquire SimplifyQuery with set CxtI Extracted from https://reviews.llvm.org/D67849#inline-610377 llvm-svn: 372654	2019-09-23 20:40:47 +00:00
Roman Lebedev	595cfda059	[InstCombine] foldAndOfICmps(): Acquire SimplifyQuery with set CxtI Extracted from https://reviews.llvm.org/D67849#inline-610377 llvm-svn: 372653	2019-09-23 20:40:40 +00:00
Roman Lebedev	01ac23ca62	[InstCombine] foldUnsignedUnderflowCheck(): s/Subtracted/ZeroCmpOp/ llvm-svn: 372625	2019-09-23 16:04:32 +00:00
Roman Lebedev	7a67ed5795	[InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251) Summary: This is again motivated by D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds. In this particular case, given ``` char* test(char& base, unsigned long offset) { return &base - offset; } ``` it will end up producing something like https://godbolt.org/z/luGEju which after optimizations reduces down to roughly ``` declare void @use64(i64) define i1 @test(i8* dereferenceable(1) %base, i64 %offset) { %base_int = ptrtoint i8* %base to i64 %adjusted = sub i64 %base_int, %offset call void @use64(i64 %adjusted) %not_null = icmp ne i64 %adjusted, 0 %no_underflow = icmp ule i64 %adjusted, %base_int %no_underflow_and_not_null = and i1 %not_null, %no_underflow ret i1 %no_underflow_and_not_null } ``` Without D67122 there was no `%not_null`, and in this particular case we can "get rid of it", by merging two checks: Here we are checking: `Base u>= Offset && (Base u- Offset) != 0`, but that is simply `Base u> Offset` Alive proofs: https://rise4fun.com/Alive/QOs The `@llvm.usub.with.overflow` pattern itself is not handled here because this is the main pattern, that we currently consider canonical. https://bugs.llvm.org/show_bug.cgi?id=43251 Reviewers: spatel, nikic, xbolva00, majnemer Reviewed By: xbolva00, majnemer Subscribers: vsk, majnemer, xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67356 llvm-svn: 372341	2019-09-19 17:25:19 +00:00
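The merged check from `7a67ed5795`, `Base u>= Offset && (Base u- Offset) != 0` becoming `Base u> Offset`, can be confirmed exhaustively at i8. This is a minimal sketch with a made-up function name; it models the IR form where the no-underflow test is `adjusted u<= base`.

```cpp
#include <cstdint>

// Checks (base - offset) != 0 && (base - offset) u<= base  ==  base u> offset.
bool usubNonZeroCheckHolds() {
  for (unsigned b = 0; b < 256; ++b) {
    for (unsigned o = 0; o < 256; ++o) {
      const uint8_t base = static_cast<uint8_t>(b);
      const uint8_t offset = static_cast<uint8_t>(o);
      const uint8_t adjusted = static_cast<uint8_t>(base - offset);
      const bool notNull = adjusted != 0;
      const bool noUnderflow = adjusted <= base; // true iff base u>= offset
      const bool src = notNull && noUnderflow;
      const bool tgt = base > offset;
      if (src != tgt)
        return false;
    }
  }
  return true;
}
```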
Roman Lebedev	b646dd92c2	[InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251) Summary: I don't have a direct motivational case for this, but it would be good to have this for completeness/symmetry. This pattern is basically the motivational pattern from https://bugs.llvm.org/show_bug.cgi?id=43251 but with different predicate that requires that the offset is non-zero. The completeness bit comes from the fact that a similar pattern (offset != zero) will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259, so it'd seem to be good to not overlook very similar patterns.. Proofs: https://rise4fun.com/Alive/21b Also, there is something odd with `isKnownNonZero()`, if the non-zero knowledge was specified as an assumption, it didn't pick it up (PR43267) With this, i see no other missing folds for https://bugs.llvm.org/show_bug.cgi?id=43251 Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67412 llvm-svn: 372257	2019-09-18 20:10:07 +00:00
Roman Lebedev	0410489a34	[InstCombine][NFC] Rename IsFreeToInvert() -> isFreeToInvert() for consistency As per https://reviews.llvm.org/D65530#inline-592325 llvm-svn: 368686	2019-08-13 12:49:16 +00:00
Roman Lebedev	2635c324da	[InstCombine] foldXorOfICmps(): don't give up on non-single-use ICmp's if all users are freely invertible Summary: This is rather unconventional.. As the comment there says, we don't have many folds for xor-of-icmps, we try to turn them into an and-of-icmps, for which we have plenty of folds. But if the ICmp we need to invert is not single-use - we give up. As discussed in https://reviews.llvm.org/D65148#1603922, we may have a non-canonical CLAMP pattern, with bit math and select-of-threshold that we'll potentially clamp. As it can be seen in `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`, out of all 8 variations of the pattern, only two are not canonicalized into the variant with and+icmp instead of bit math. The reason is because the ICmp we need to invert is not single-use - we give up. We indeed can't perform this fold at will, the general rule is that we should not increase instruction count in InstCombine, But we wouldn't end up increasing instruction count if we can adapt every other user to the inverted value. This way the `not` we create will get folded, and in the end the instruction count did not increase. For that, of course, we need to look at the users of a Value, which is again rather unconventional for InstCombine :S Thus i'm proposing to be a little bit more insistent in `foldXorOfICmps()`. The alternatives would be to not create that `not`, but add duplicate code to manually invert all users; or to add some even less general combine to handle some more specific pattern[s]. Reviewers: spatel, nikic, RKSimon, craig.topper Reviewed By: spatel Subscribers: hiraditya, jdoerfert, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65530 llvm-svn: 368685	2019-08-13 12:49:06 +00:00
Craig Topper	e9abc8177a	[InstCombine] Teach foldOrOfICmps to allow icmp eq MIN_INT/MAX to be part of a range comparison. Similar for foldAndOfICmps We can treat icmp eq X, MIN_UINT as icmp ule X, MIN_UINT and allow it to merge with icmp ugt X, C. Similar for the other constants. We can do similar for icmp ne X, (U)INT_MIN/MAX in foldAndOfICmps. And we already handled UINT_MIN there. Fixes PR42691. Differential Revision: https://reviews.llvm.org/D65017 llvm-svn: 366945	2019-07-24 20:57:29 +00:00
Craig Topper	e6cd20ba53	[InstCombine] Update comment I missed in r366649. NFC llvm-svn: 366658	2019-07-21 16:15:03 +00:00
Craig Topper	1d149d08d3	[InstCombine] Remove insertRangeTest code that handles the equality case. For equality, the function called getTrue/getFalse with the VT of the comparison input. But getTrue/getFalse need the boolean VT. So if this code ever executed, it would assert. I believe these cases are removed by InstSimplify so we don't get here. So this patch just fixes up an assert to exclude the equality possibility and removes the broken code. llvm-svn: 366649	2019-07-21 06:43:38 +00:00
Craig Topper	8fabdfe9fc	[InstCombine] Don't use AddOne/SubOne to see if two APInts are 1 apart. Use APInt operations instead. NFCI AddOne/SubOne create new Constant objects. That seems heavy for comparing ConstantInts which wrap APInts. Just do the math on the APInts and compare them. llvm-svn: 366648	2019-07-21 05:26:05 +00:00
Rui Ueyama	49a3ad21d6	Fix parameter name comments using clang-tidy. NFC. This patch applies clang-tidy's bugprone-argument-comment tool to LLVM, clang and lld source trees. Here is how I created this patch: $ git clone https://github.com/llvm/llvm-project.git $ cd llvm-project $ mkdir build $ cd build $ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \ -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \ -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \ -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm $ ninja $ parallel clang-tidy -checks='-*,bugprone-argument-comment' \ -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \ ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h} llvm-svn: 366177	2019-07-16 04:46:31 +00:00
Sanjay Patel	2675b0c8ab	[InstCombine] squash is-not-power-of-2 using ctpop This is the Demorgan'd 'not' of the pattern handled in: D63660 / rL364153 This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 We can test if a value is not a power-of-2 using ctpop(X) > 1, so combining that with an is-zero check of the input is the same as testing if not exactly 1 bit is set: (X == 0) \|\| (ctpop(X) u> 1) --> ctpop(X) != 1 llvm-svn: 364246	2019-06-24 22:35:26 +00:00
Sanjay Patel	13a5ae58fc	[InstCombine] squash is-power-of-2 that uses ctpop This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 We can test if a value is power-of-2-or-0 using ctpop(X) < 2, so combining that with a non-zero check of the input is the same as testing if exactly 1 bit is set: (X != 0) && (ctpop(X) u< 2) --> ctpop(X) == 1 Differential Revision: https://reviews.llvm.org/D63660 llvm-svn: 364153	2019-06-23 14:22:37 +00:00
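The two ctpop folds (`13a5ae58fc` and its De Morgan'd counterpart `2675b0c8ab`) both reduce to "exactly one bit set". A minimal sketch that checks them over all 8-bit values, using a hand-written population count so it does not depend on compiler builtins (names are assumptions for illustration):

```cpp
// Population count by clearing the lowest set bit until none remain.
static unsigned ctpop(unsigned v) {
  unsigned n = 0;
  for (; v != 0; v &= v - 1)
    ++n;
  return n;
}

// Checks, for all 8-bit x:
//   (x != 0) && (ctpop(x) u< 2)  ==  ctpop(x) == 1
//   (x == 0) || (ctpop(x) u> 1)  ==  ctpop(x) != 1
bool ctpopFoldsHold() {
  for (unsigned x = 0; x < 256; ++x) {
    const unsigned pop = ctpop(x);
    const bool isPow2Src = (x != 0) && (pop < 2);
    const bool isPow2Tgt = (pop == 1);
    const bool notPow2Src = (x == 0) || (pop > 1);
    const bool notPow2Tgt = (pop != 1);
    if (isPow2Src != isPow2Tgt || notPow2Src != notPow2Tgt)
      return false;
  }
  return true;
}
```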
Sanjay Patel	760f61ab36	[InstCombine] try harder to form rotate (funnel shift) (PR20750) We have a similar match for patterns ending in a truncate. This should be ok for all targets because the default expansion would still likely be better from replacing 2 'and' ops with 1. Attempt to show the logic equivalence in Alive (which doesn't currently have funnel-shift in its vocabulary AFAICT): %shamt = zext i8 %i to i32 %m = and i32 %shamt, 31 %neg = sub i32 0, %shamt %and4 = and i32 %neg, 31 %shl = shl i32 %v, %m %shr = lshr i32 %v, %and4 %or = or i32 %shr, %shl => %a = and i8 %i, 31 %shamt2 = zext i8 %a to i32 %neg2 = sub i32 0, %shamt2 %and4 = and i32 %neg2, 31 %shl = shl i32 %v, %shamt2 %shr = lshr i32 %v, %and4 %or = or i32 %shr, %shl https://rise4fun.com/Alive/V9r llvm-svn: 360605	2019-05-13 17:28:19 +00:00
Craig Topper	16dc165046	[InstCombine] Don't transform ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2) if either zext or OP has another use. If they have other users we'll just end up increasing the instruction count. We might be able to weaken this to only one of them having a single use if we can prove that the and will be removed. Fixes PR41164. Differential Revision: https://reviews.llvm.org/D59630 llvm-svn: 356690	2019-03-21 17:50:49 +00:00
Sanjay Patel	5b820323ca	[InstCombine] fold logic-of-nan-fcmps (PR41069) Combine 2 fcmps that are checking for nan-ness: and (fcmp ord X, 0), (and (fcmp ord Y, 0), Z) --> and (fcmp ord X, Y), Z or (fcmp uno X, 0), (or (fcmp uno Y, 0), Z) --> or (fcmp uno X, Y), Z This is an exact match for a minimal reassociation pattern. If we want to handle this more generally that should go in the reassociate pass and allow removing this code. This should fix: https://bugs.llvm.org/show_bug.cgi?id=41069 llvm-svn: 356471	2019-03-19 16:39:17 +00:00
Sanjay Patel	587fd849f0	[InstCombine] Fix matchRotate bug when one operand is a ConstantExpr shift This bug seems to be harmless in release builds, but will cause an error in UBSAN builds or an assertion failure in debug builds. When it gets to this opcode comparison, it assumes both of the operands are BinaryOperators, but the prior m_LogicalShift will also match a ConstantExpr. The cast<BinaryOperator> will assert in a debug build, or reading an invalid value for BinaryOp from memory with ((BinaryOperator*)constantExpr)->getOpcode() will cause an error in a UBSAN build. The test I added will fail without this change in debug/UBSAN builds, but not in release. Patch by: @AndrewScheidecker (Andrew Scheidecker) Differential Revision: https://reviews.llvm.org/D58049 llvm-svn: 353736	2019-02-11 19:26:27 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Sanjay Patel	d023dd60e9	[InstCombine] canonicalize another raw IR rotate pattern to funnel shift This is matching the equivalent of the DAG expansion, so it should never end up with worse perf than the original code even if the target doesn't have a rotate instruction. llvm-svn: 350672	2019-01-08 22:39:55 +00:00
Sanjay Patel	3d5bb15a1d	[CmpInstAnalysis] fix function signature for ICmp code to predicate; NFC The old function underspecified the return type, took an unused parameter, and had a misleading name. llvm-svn: 348292	2018-12-04 18:53:27 +00:00
Sanjay Patel	472652ef68	[CmpInstAnalysis] fix formatting; NFC There are potential improvements to the structure of this API raised by D54994, but remove some cosmetic blemishes before making any functional changes. llvm-svn: 348149	2018-12-03 15:48:30 +00:00
Sanjay Patel	6072842770	[InstCombine] fix formatting for matchBSwap(); NFC We should have a similar function for matching rotate and/or funnel shift, so tidy up the related existing call. llvm-svn: 346871	2018-11-14 16:03:36 +00:00
Sanjay Patel	3b206305fd	[InstCombine] try harder to form select from logic ops (2nd try) The original patch was committed here: rL344609 ...and reverted: rL344612 ...because it did not properly check/test data types before calling ComputeNumSignBits(). The tests that caused bot failures for the previous commit are over-reaching front-end tests that run the entire -O optimizer pipeline: Clang :: CodeGen/builtins-systemz-zvector.c Clang :: CodeGen/builtins-systemz-zvector2.c I've added a negative test here to ensure coverage for that case. The new early exit check also tests the type of the 'B' parameter, so we don't waste time on matching if either value is unsuitable. Original commit message: This is part of solving PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 The patterns shown here are a special case of something that we already convert to select. Using ComputeNumSignBits() catches that case (but not the more complicated motivating patterns yet). The backend has hooks/logic to convert back to logic ops if that's better for the target. llvm-svn: 345149	2018-10-24 15:17:56 +00:00
Sanjay Patel	bb3dd34e62	revert rL344609: [InstCombine] try harder to form select from logic ops I noticed a missing check and added it at rL344610, but there actually are codegen tests that will fail without that, so I'll edit those and submit a fixed patch with more tests. llvm-svn: 344612	2018-10-16 15:26:08 +00:00
Sanjay Patel	f6a7c8b1fc	[InstCombine] make sure type is integer before calling ComputeNumSignBits llvm-svn: 344610	2018-10-16 14:44:50 +00:00
Sanjay Patel	0c48c977b8	[InstCombine] try harder to form select from logic ops This is part of solving PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 The patterns shown here are a special case of something that we already convert to select. Using ComputeNumSignBits() catches that case (but not the more complicated motivating patterns yet). The backend has hooks/logic to convert back to logic ops if that's better for the target. llvm-svn: 344609	2018-10-16 14:35:21 +00:00
Sanjay Patel	79dceb2903	[InstCombine] name change: foldShuffledBinop -> foldVectorBinop; NFC This function will deal with more than shuffles with D50992, and I have another potential per-element fold that could live here. llvm-svn: 343692	2018-10-03 15:20:58 +00:00

632 Commits