llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	c6b88cb918	[InstCombine] Push freeze through recurrence phi We really want to push freezes through recurrence phis, so that we freeze only the start value, rather than the IV value on every iteration. foldOpIntoPhi() already handles this for the case where the transfer function doesn't produce poison, e.g. %iv.next = add %iv, 1. However, this does not work if nowrap flags are present, e.g. the very common %iv.next = add nuw %iv, 1 case. This patch adds a fold that pushes freeze instructions to the start value by checking whether all backedge values will be non-poison after poison generating flags have been dropped. This allows pushing freezes out of loops in most cases. I suspect that this also obsoletes the CanonicalizeFreezeInLoops pass, and we can probably drop it. Fixes https://github.com/llvm/llvm-project/issues/56048. Differential Revision: https://reviews.llvm.org/D127960	2022-06-17 15:01:41 +02:00
Nuno Lopes	e5c5f92e12	[InstCombine] switch synthetic unreachable to use undef instead of poison (NFC)	2022-06-10 21:54:09 +01:00
Nikita Popov	45226d04f0	[InstCombine] Reuse icmp of and/or folds for logical and/or Similarly to a change recently done for fcmps, add a flag that indicates whether the and/or is logical to foldAndOrOfICmps, and reuse the function when folding logical and/or. We were already calling some parts of it, but this gives us a clearer indication of which parts may need poison-safe variants, and would also allow to fold combinations of bitwise and logical and/or. This change should be close to NFC, because all folds this enables were either already called previously, or can make use of implied poison reasoning.	2022-05-23 15:37:07 +02:00
Chenbing Zheng	ffaaf2498b	[InstCombine] (rot X, ?) == 0/-1 --> X == 0/-1 In this patch we add a function foldICmpInstWithConstantAllowUndef to fold integer comparisons with a constant operand: icmp Pred X, C where X is some kind of instruction and C is AllowUndef. We move this fold to the new function, so that it can solve undef elts in a vector. Reviewed By: spatel, RKSimon Differential Revision: https://reviews.llvm.org/D125220	2022-05-19 11:22:26 +08:00
Chenbing Zheng	acbad5086a	[InstCombine] [NFC] separate a function foldICmpBinOpWithConstant There is a long function foldICmpInstWithConstant, we can separate a function foldICmpBinOpWithConstant from it. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D125457	2022-05-14 10:54:15 +08:00
Nikita Popov	6001bfcedc	[InstCombine] Freeze other uses of frozen value If there is a freeze %x, we currently replace all other uses of %x with freeze %x -- as long as they are dominated by the freeze instruction. This patch extends this behavior to cases where we did not originally dominate the use by moving the freeze instruction directly after the definition of the frozen value. The motivation can be seen in test @combine_and_after_freezing_uses: Canonicalizing everything to freeze %x allows folds that are based on value identity (i.e. same operand occurring in two places) to trigger. This also covers the case from D125248. Differential Revision: https://reviews.llvm.org/D125321	2022-05-11 16:47:12 +02:00
Nikita Popov	b457ac4240	[InstCombine] Extract icmp of select transform (NFC) To make it easier to extend to the case where the other operand is not a constant.	2022-05-06 14:46:44 +02:00
Nikita Popov	982cbed819	[InstCombine] Fold logical and/or of range icmps with nowrap flags This is an edge-case where we don't convert to bitwise and/or based on implies poison reasoning, so explicitly try to perform the fold in logical form. The transform itself is poison-safe, as both icmps are based on the same value and any nowrap flags are discarded as part of the fold (https://alive2.llvm.org/ce/z/aCwC8b for the used example).	2022-04-29 14:42:42 +02:00
Sanjay Patel	903aa5e0f8	[InstCombine] try to fold icmp with mismatched extended operands If a value is known to be non-negative and zexted, that's the same thing as sexted. So for the purpose of looking past the casts with an icmp, treat it as if it was a sext: https://alive2.llvm.org/ce/z/_BDsGV This is necessary, but not enough to solve the motivating problem: https://github.com/llvm/llvm-project/issues/55013 Differential Revision: https://reviews.llvm.org/D124419	2022-04-26 14:26:36 -04:00
Nikita Popov	ba46ae7bd8	[InstCombine] Merge foldAndOfICmps() and foldOrOfICmps() (NFCI) Folds are supposed to always be added in conjugated pairs for and and or. Merge the two functions to make folds for which this is currently not the case more obvious.	2022-04-22 12:48:03 +02:00
Chenbing Zheng	467cbb6249	[InstCombine] fold more constant divisor to select-of-constants divisor By adding a parameter to function FoldOpIntoSelect, we can fold more Ops to Select. For this example, we tend to fold the division instruction, so we no longer care whether SelectInst has one use. This patch solves the TODO left in InstCombine/div.ll. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122967	2022-04-08 10:19:24 +08:00
Augie Fackler	f3c702fbd1	InstCombineCalls: fix annotateAnyAllocCallSite to report changes Spotted during review of D123052. Differential Revision: https://reviews.llvm.org/D123232	2022-04-07 13:49:09 -04:00
Craig Topper	ce78e68261	[InstCombine] Fold select based logic of fcmps with same operands when FMF is present. If we have a logical and/or in select form and the true/false operand is an fcmp with poison generating FMF, we won't be able to fold it to an and/or instruction. This prevents us from optimizing the case where it is a logical operation of two fcmps with identical operands. This patch adds explicit checks for this case that don't rely on converting to and/or to do the optimization. It reuses the existing foldLogicOfFCmps, but adds a new flag to disable the other combine that is inside that function. FMF flags from the two FCmps are intersected using the logic added in D121243. The FIXME has been updated to indicate that we can only use a union for the non-select form. This allows us to optimize cases like this from compare-fp-3.c in the gcc torture suite with fast math. void test1 (float x, float y) { if ((x==y) && (x!=y)) link_error0(); } Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D121323	2022-03-14 14:45:07 -07:00
Nikita Popov	9353ed6a53	[InstCombine] Don't call matchSAddSubSat() for SPF (NFC) Only call it for intrinsic min/max. The moved implementation is unchanged apart from the one-use check: It is now hardcoded to one-use, without the two-use special case for SPF.	2022-02-28 10:41:56 +01:00
Philip Reames	6f9d557e08	[instcombine] Cleanup foldAllocaCmp slightly [NFC]	2022-02-18 18:49:39 -08:00
Nikita Popov	e714b98fff	[InstCombine] Check type compatibility in indexed load fold This fold could use a rewrite to an offset-based implementation, but for now make sure it doesn't crash with opaque pointers.	2022-02-11 10:16:27 +01:00
Kazu Hirata	3a3cb929ab	[llvm] Use = default (NFC)	2022-02-06 22:18:35 -08:00
Nikita Popov	648faa3b5d	[InstCombine] Mark element type access as non-opaque (NFC) Also make the function static to make it more obvious that it is only used in the one place.	2022-01-27 11:40:29 +01:00
Nikita Popov	b7179d9279	[InstCombine] Extract GEP of bitcast folds into separate function (NFC)	2022-01-27 10:48:00 +01:00
Sanjay Patel	39e602b6c4	[InstCombine] try to fold binop with phi operands This is an alternate version of D115914 that handles/tests all binary opcodes. I suspect that we don't see these patterns too often because -simplifycfg would convert the minimal cases into selects rather than leave them in phi form (note: instcombine has logic holes for combining the select patterns too though, so that's another potential patch). We only create a new binop in a predecessor that unconditionally branches to the final block. https://alive2.llvm.org/ce/z/C57M2F https://alive2.llvm.org/ce/z/WHwAoU (not safe to speculate an sdiv for example) https://alive2.llvm.org/ce/z/rdVUvW (but it is ok on this path) Differential Revision: https://reviews.llvm.org/D117110	2022-01-22 15:00:06 -05:00
Nikita Popov	de2ed8e38e	[InstCombine] Extract GEP of GEP fold into separate function This change may not be entirely NFC, because a number of early returns will now only early return from this particular fold, rather than the whole visitGetElementPtr() implementation. This is also the reason why I'm doing this change, as I don't think this was intended.	2021-12-27 14:52:11 +01:00
Sanjay Patel	3db974face	[InstCombine] convert static function to internal class function; NFC The transform can require an optional shuffle instruction to be sound, so we need to use Builder to create all values and then replace the original instruction with whatever that final value is.	2021-12-14 11:18:35 -05:00
Sanjay Patel	337948ac6e	[InstCombine] add folds for binop with sexted bool and constant operands This is a generalization/extension of the existing and/or folds noted with TODO comments. Those have a one-use constraint that is not necessary. Potential follow-ups are noted by the TODO comments in the new function. We can also call this function from other binop visit* functions, but we need to add tests first. This solves: https://llvm.org/PR52543 https://alive2.llvm.org/ce/z/NWuCR5	2021-11-20 12:33:00 -05:00
Sanjay Patel	db231ebdb0	[InstCombine] fold fake vector extract to shift+trunc We already handle more complicated cases like: extelt (bitcast (inselt poison, X, 0)) --> trunc (lshr X) But we missed this simpler pattern: https://alive2.llvm.org/ce/z/D55h64 / https://alive2.llvm.org/ce/z/GKzzRq This is part of solving: https://llvm.org/PR52057 I made the transform depend on legal/desirable int type to avoid creating a shift of an illegal type (for example i128). I'm not sure if that restriction is actually necessary, but we can change that as a follow-up if the backend can deal with integer ops on too-wide illegal types. The pile of AVX512 test changes are all neutral AFAICT - the x86 backend seems to know how to turn that into the expected "kmov" instructions. Differential Revision: https://reviews.llvm.org/D111082	2021-10-06 08:12:05 -04:00
Sanjay Patel	668beb8ae8	[InstCombine] refactor folds of 'not' instructions; NFC This removes repeated calls to m_Not, so hopefully a little more efficient. Also, we may need to enhance some of these blocks to allow logical and/or (select of bools).	2021-10-05 16:36:57 -04:00
Sanjay Patel	6a2a84c253	[InstCombine] add helper for "is desirable int type"; NFC This splits out the logic from shouldChangeType() that currently allows 8/16/32-bit transforms even if those types are not listed as legal in the data layout. This could be useful as a predicate for vector insert/extract transforms. Note that this leaves the subsequent checks in shouldChangeType() unchanged. We may want to merge the checks for i1 and/or "ToLegal" into "isDesirable", but that may alter existing transforms.	2021-10-04 14:30:18 -04:00
Florian Hahn	e08a5dc86f	[InstCombine] Move InstCombineWorklist to Utils to allow reuse (NFC). InstCombine's worklist can be re-used by other passes like VectorCombine. Move it to llvm/Transforms/Utils and rename it to InstructionWorklist. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D110181	2021-09-22 08:47:21 +01:00
Sanjay Patel	97a4e7b7ff	[InstCombine] remove a buggy set of zext-icmp transforms The motivating case is an infinite loop shown with a reduced test from: https://llvm.org/PR51762 To solve this, I'm proposing we delete the most obviously broken part of this code. The bug example shows a fundamental problem: we ask computeKnownBits if a transform will be profitable, alter the code by creating new instructions, then rely on computeKnownBits to return the same answer to actually eliminate instructions. But there's no guarantee that the results will be the same between the 1st and 2nd calls. In the infinite loop example, we get different answers, so we add instructions that conflict with some other transform, and we're stuck. There's at least one other problem visible in the test diff for `@zext_or_masked_bit_test_uses`: the code doesn't check uses properly, so we can end up with extra instructions created. Last, it's not clear if this set of transforms actually improves analysis or codegen. I spot-checked a few targets and don't see a clear win: https://godbolt.org/z/x87EWovso If we do see a regression from this change, codegen seems like the right place to add a cmp -> bit-hack fold. If this is too big of a step, we could limit the computeKnownBits calls by not passing a context instruction and/or limiting the recursion. I checked that those would stop the infinite loop for PR51762, but that won't guarantee that some other example does not fall into the same loop. Differential Revision: https://reviews.llvm.org/D109440	2021-09-09 08:49:39 -04:00
Roman Lebedev	35fa7b8ad8	Reland "[InstCombine] Recognize `((x * y) s/ x) !=/== y` as an signed multiplication overflow check (PR48769)" This reverts commit `91f7a4fff7`, relanding commit `13ec913bdf`. The original commit was reverted because of (essentially) https://bugs.llvm.org/show_bug.cgi?id=35922 which has now been addressed by `d0eeb64be5`.	2021-09-07 21:03:52 +03:00
Nikita Popov	fafe5a6f44	[InstCombine] Perform "eq of parts" fold with logical ops The pattern matched here is too complex for the general logical and/or to bitwise and/or conversion to trigger. However, the fold is poison-safe, so match it with a select root as well: https://alive2.llvm.org/ce/z/vNzzSg https://alive2.llvm.org/ce/z/Beyumt	2021-08-22 16:55:53 +02:00
David Green	c6b7db015f	[InstCombine] Add call to matchSAddSubSat from min/max This adds a call to matchSAddSubSat from smin/smax intrinsics, allowing the same patterns to match if the canonical form of a min/max is an intrinsic, not an icmp/select. Differential Revision: https://reviews.llvm.org/D108077	2021-08-15 17:25:16 +01:00
Krishna	d99260641b	[InstCombine] Fold phi ( inttoptr/ptrtoint x ) to phi (x) The inttoptr/ptrtoint roundtrip optimization is not always correct. We are working towards removing this optimization and adding support to specific cases where this optimization works. In this patch, we focus on phi-node operands with inttoptr casts. We know that ptrtoint( inttoptr( ptrtoint x) ) is same as ptrtoint (x). So, we want to remove this roundtrip cast which goes through phi-node. Reviewed By: aqjune Differential Revision: https://reviews.llvm.org/D106289	2021-08-03 17:52:59 +05:30
Sanjay Patel	a22c99c3c1	[InstCombine] canonicalize cmp-of-bitcast-of-vector-cmp to use zero constant We can invert a compare constant and preserve the logic as shown in this sampling: https://alive2.llvm.org/ce/z/YAXbfs (In theory, we could deal with non-all-ones/zero as well, but it doesn't seem worthwhile.) I noticed this as a part of the x86 codegen difference in https://llvm.org/PR51259 - it ends up using "test" instead of "not + cmp" in that example. This pattern also shows up in https://llvm.org/PR41312 and https://llvm.org/PR50798 . Differential Revision: https://reviews.llvm.org/D107170	2021-07-31 13:31:12 -04:00
hyeongyu kim	aca5aeb752	[InstCombine] Add freezeAllUsesOfArgument to visitFreeze In D106041, a freeze was added before the branch condition to solve the miscompilation problem of SimpleLoopUnswitch. However, I found that the added freeze disturbed other optimizations in the following situations. ``` arg.fr = freeze(arg) use(arg.fr) ... use(arg) ``` It is a problem that occurred when arg and arg.fr were recognized as different values. Therefore, changing to use arg.fr instead of arg throughout the function eliminates the above problem. Thus, I add a function that changes all uses of arg to freeze(arg) to visitFreeze of InstCombine. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D106233	2021-07-24 18:08:58 +09:00
Kazu Hirata	ba2dd12d4f	[InstCombine] Remove CreateOverflowTuple (NFC) The last use was removed on Jun 3, 2020 in commit `2a6c871596`.	2021-07-21 07:07:53 -07:00
Krishna Kariya	da92e86263	[InstCombine] Fold IntToPtr/PtrToInt to bitcast The inttoptr/ptrtoint roundtrip optimization is not always correct. We are working towards removing this optimization and adding support to specific cases where this optimization works. This patch is the first one on this line. Consider the example: %i = ptrtoint i8* %X to i64 %p = inttoptr i64 %i to i16* %cmp = icmp eq i8* %load, %p In this specific case, the inttoptr/ptrtoint optimization is correct as it only compares the pointer values. In this patch, we fold inttoptr/ptrtoint to a bitcast (if src and dest types are different). Differential Revision: https://reviews.llvm.org/D105088	2021-07-18 23:13:25 +02:00
hyeongyu kim	1a5f4cbe1b	[InstCombine] Add optimization to prevent poison from being propagated. In D104569, a freeze was inserted just before br to solve the `branching on undef` miscompilation problem. But value analysis was disturbed by the added freeze. ``` v = load ptr cond = freeze(icmp (and v, const), const') br cond, ... ``` The case above is one in which value analysis is disturbed. Moving the freeze to immediately after the load makes value analysis succeed again. ``` v = load ptr freeze(icmp (and v, const), const') => v = load ptr v' = freeze v icmp (and v', const), const' ``` In this patch, I propose the above optimization. With this patch, the poison will not spread, as the freeze is performed early. Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D105392	2021-07-11 12:40:43 +09:00
Nikita Popov	a8f7dee1df	[InstCombine] Support one-hot merge for logical and/or If a logical and/or is used, we need to be careful not to propagate a potential poison value from the RHS by inserting a freeze instruction. Otherwise it works the same way as bitwise and/or. This is intended to address the regression reported at https://reviews.llvm.org/D101191#2751002. Differential Revision: https://reviews.llvm.org/D102279	2021-05-12 21:01:18 +02:00
Nikita Popov	1556540372	[InstCombine] Clean up one-hot merge optimization (NFC) Remove the requirement that the instruction is a BinaryOperator, make the predicate check more compact and use slightly more meaningful naming for the and operands.	2021-05-11 23:22:11 +02:00
Juneyoung Lee	24ce194cfe	[InstCombine] generalize select + select/and/or folding using implied conditions This patch optimizes the remaining possible cases in D101191 by generalizing isImpliedCondition()-based foldings. Assume that there is `op a, (select b, _, _)` where op is one of `and i1`, `or i1` or their select forms. We can do the following optimization based on the result of `isImpliedCondition(a, b)`: If a = true implies… - b = true: - select a, (select b, A, B), false => select a, A, false : https://alive2.llvm.org/ce/z/WCnZYh - and a, (select b, A, B) => select a, A, false : https://alive2.llvm.org/ce/z/uZhcMG - b = false: - select a, (select b, A, B), false => select a, B, false : https://alive2.llvm.org/ce/z/c2hJpV - and a, (select b, A, B) => select a, B, false : https://alive2.llvm.org/ce/z/5ggwMM If a = false implies… - b = true: - select a, true, (select b, A, B) => select a, true, A : https://alive2.llvm.org/ce/z/tidKvH - or a, (select b, A, B) => select a, true, A : https://alive2.llvm.org/ce/z/cC-uyb - b = false: - select a, true, (select b, A, B) => select a, true, B : https://alive2.llvm.org/ce/z/ZXpJq9 - or a, (select b, A, B) => select a, true, B : https://alive2.llvm.org/ce/z/hnDrJj Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101720	2021-05-04 09:42:06 +09:00
Reid Kleckner	91f7a4fff7	Revert "[InstCombine] Recognize `((x * y) s/ x) !=/== y` as an signed multiplication overflow check (PR48769)" This reverts commit `13ec913bdf`. This commit introduces new uses of the overflow checking intrinsics that depend on implementations in compiler-rt, which Windows users generally do not link against. I filed an issue (somewhere) to make clang auto-link the builtins library to resolve this situation, but until that happens, it isn't reasonable for the optimizer to introduce new link time dependencies.	2021-04-20 15:53:34 -07:00
Philip Reames	4824d876f0	Revert "Allow invokable sub-classes of IntrinsicInst" This reverts commit `d87b9b81cc`. Post commit review raised concerns, reverting while discussion happens.	2021-04-20 15:38:38 -07:00
Philip Reames	d87b9b81cc	Allow invokable sub-classes of IntrinsicInst It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked. This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke. After this lands and has baked for a couple days, planned cleanups: Make GCStatepointInst a IntrinsicInst subclass. Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor. Do the same in SelectionDAG. Do the same in FastISEL. Differential Revision: https://reviews.llvm.org/D99976	2021-04-20 15:03:49 -07:00
Roman Lebedev	13ec913bdf	[InstCombine] Recognize `((x * y) s/ x) !=/== y` as an signed multiplication overflow check (PR48769) We already had support for its unsigned variant, so simply extend it to also handle the signed variant. Fixes https://bugs.llvm.org/show_bug.cgi?id=48769	2021-04-20 21:29:43 +03:00
Dávid Bolvanský	324d641b75	[InstCombine] Enhance deduction of alignment for aligned_alloc This patch improves https://reviews.llvm.org/D76971 (Deduce attributes for aligned_alloc in InstCombine) and implements "TODO" item mentioned in the review of that patch. > The function aligned_alloc() is the same as memalign(), except for the added restriction that size should be a multiple of alignment. Currently, we simply bail out if we see a non-constant size - change that. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D100785	2021-04-20 02:04:18 +02:00
Simon Pilgrim	609d0c9772	[InstCombine] matchBSwapOrBitReverse - remove pattern matching early-out. NFCI. recognizeBSwapOrBitReverseIdiom + collectBitParts have pattern matching to bail out early if a bswap/bitreverse pattern isn't possible - we should be able to rely on this instead without any notable change in compile time. This is part of a cleanup towards letting matchBSwapOrBitReverse /recognizeBSwapOrBitReverseIdiom use 'root' instructions that aren't ORs (FSHL/FSHRs in particular which can be prematurely created). Differential Revision: https://reviews.llvm.org/D97056	2021-02-20 13:15:34 +00:00
Florian Hahn	d60b74c28a	[InstCombine] Set MadeIRChange in replaceInstUsesWith. Some utilities used by InstCombine, like SimplifyLibCalls, may add new instructions and replace the uses of a call, but return nullptr because the inserted call produces multiple results. Previously, the replaced library calls would get removed by InstCombine's deleter, but after `292077072e` this may not happen, if the willreturn attribute is missing. As a work-around, update replaceInstUsesWith to set MadeIRChange, if it replaces any uses. This catches the cases where it is used as replacer by utilities used by InstCombine and seems useful in general; updating uses will modify the IR. This fixes an expensive-check failure when replacing @__sinpif/@__cospifi with @__sincospif_sret.	2021-01-23 17:52:59 +00:00
Roman Lebedev	d1a6f92fd5	[InstCombine] Fold `(~x) | y` --> `~(x & (~y))` iff it is free to do so Iff we know we can get rid of the inversions in the new pattern, we can thus get rid of the inversion in the old pattern, decreasing instruction count. Note that we could position this transformation as just hoisting of the `not` (still, iff y is freely negatible), but the test changes show a number of regressions, so let's not do that.	2021-01-22 17:23:54 +03:00
Roman Lebedev	79b0d21ce9	[InstCombine] Fold `(~x) & y` --> `~(x | (~y))` iff it is free to do so Iff we know we can get rid of the inversions in the new pattern, we can thus get rid of the inversion in the old pattern, decreasing instruction count.	2021-01-22 17:23:54 +03:00
Roman Lebedev	4ed0d8f2f0	[NFC][InstCombine] Extract freelyInvertAllUsersOf() out of canonicalizeICmpPredicate() I'd like to use it in an upcoming fold.	2021-01-22 17:23:53 +03:00
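
The two folds in `d1a6f92fd5` and `79b0d21ce9` rest on De Morgan's laws. As a rough sanity check (not taken from either patch), the sketch below compares both sides of each rewrite for every pair of 8-bit values. The helper names are hypothetical and are not LLVM APIs, and the sketch says nothing about the "free to invert" profitability condition that the commits require.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these exist in LLVM.

// Left-hand side of the `(~x) | y` fold.
static uint8_t orOfNot(uint8_t x, uint8_t y) {
  return static_cast<uint8_t>(~x | y);
}

// Right-hand side: `~(x & (~y))`.
static uint8_t notOfAndNot(uint8_t x, uint8_t y) {
  return static_cast<uint8_t>(~(x & ~y));
}

// Left-hand side of the `(~x) & y` fold.
static uint8_t andOfNot(uint8_t x, uint8_t y) {
  return static_cast<uint8_t>(~x & y);
}

// Right-hand side: `~(x | (~y))`.
static uint8_t notOfOrNot(uint8_t x, uint8_t y) {
  return static_cast<uint8_t>(~(x | ~y));
}

// Exhaustively compares both rewrites over all 256 * 256 input pairs.
// Returns true only if every pair agrees for both folds.
static bool foldsAgreeOnAllI8() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 256; ++y) {
      const uint8_t a = static_cast<uint8_t>(x);
      const uint8_t b = static_cast<uint8_t>(y);
      if (orOfNot(a, b) != notOfAndNot(a, b))
        return false;
      if (andOfNot(a, b) != notOfOrNot(a, b))
        return false;
    }
  }
  return true;
}
```

Calling `foldsAgreeOnAllI8()` from a `main` function is enough to run the check. It only confirms value equivalence on plain integers; the poison and undef semantics of LLVM IR are outside what this sketch can model.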

301 Commits