InstCombine converts range tests of the form (X > C1 && X < C2) or
(X < C1 || X > C2) into checks of the form (X + C3 < C4) or
(X + C3 > C4). It is possible to express all range tests in either
of these forms (with different choices of constants), but currently
neither of them is considered canonical. We may have equivalent
range tests using either ult or ugt.
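For illustration (constants chosen here for the example, not taken from the
patch), the test "X in [6, 9]" on i8 can be written in both forms:
define i1 @range_ult(i8 %x) {
%add = add i8 %x, -6       ; X - 6
%r = icmp ult i8 %add, 4   ; true iff X is in [6, 9]
ret i1 %r
}
define i1 @range_ugt(i8 %x) {
%add = add i8 %x, -10      ; X - 10
%r = icmp ugt i8 %add, -5  ; also true iff X is in [6, 9]
ret i1 %r
}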
This proposes to canonicalize all range tests to use ult. An
alternative would be to canonicalize to either ult or ugt depending
on the specific constants involved -- e.g. in practice we currently
generate ult for && style ranges and ugt for || style ranges when
going through the insertRangeTest() helper. In fact, the "clamp like"
fold was relying on this, which is why I had to tweak it so it does
not assume whether inversion is needed based on the predicate alone.
Proof: https://alive2.llvm.org/ce/z/_SP_rQ
Differential Revision: https://reviews.llvm.org/D113366
This introduces a new ComputeMinSignedBits method for ValueTracking that
returns BitWidth - SignBits + 1 from ComputeNumSignBits, and represents
the minimum number of bits needed to represent the value as a signed
integer. Similar to the existing APInt::getMinSignedBits method, this can
make some of the reasoning around ComputeNumSignBits more natural.
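For example (an illustrative value, not taken from the patch), consider an
ashr whose sign bits are known:
; ComputeNumSignBits(%x) == 7 (ashr copies the sign bit into the top bits),
; so ComputeMinSignedBits(%x) == 8 - 7 + 1 == 2: %x always fits in 2 bits
; as a signed integer (its range is [-2, 1]).
define i8 @example(i8 %y) {
%x = ashr i8 %y, 6
ret i8 %x
}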
See https://reviews.llvm.org/D112298
Fixes a crash observed by oss-fuzz in 39934. The issue at hand is that the code expects a pattern match on m_Mul to imply the operand is a mul instruction; however, mul constant expressions are also valid here.
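A hypothetical sketch of the kind of input involved (not the exact fuzzer
reproducer): both icmp operands below match m_Mul, but the second is a
constant expression rather than a mul instruction:
@g = global i8 0
define i1 @example(i64 %x) {
%m = mul i64 %x, 2
%r = icmp eq i64 %m, mul (i64 ptrtoint (i8* @g to i64), i64 2)
ret i1 %r
}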
Inspired by D111968, provide an isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos.
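For instance (an illustrative constant, not from the patch), -8 is a negated
power of 2 because -(-8) == 8 is a power of 2; such values are the contiguous
high-bit masks:
define i8 @high_mask(i8 %x) {
%m = and i8 %x, -8   ; -8 == 0b11111000; isNegatedPowerOf2() would be true
ret i8 %m
}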
Differential Revision: https://reviews.llvm.org/D111998
There may be some other patterns like this or a generalization,
but this is an example that I noticed would definitely regress
with a planned follow-up to D111410.
https://alive2.llvm.org/ce/z/GVpQDb
The test diffs show that we have better analysis/folds for 'add'
(although we should at least have the simplifications
independently, so we don't have the one-use restriction).
This is related to solving regressions that would appear in
transforms related to D111410, and that is part of a series
of enhancements that may eventually help solve PR34047.
https://alive2.llvm.org/ce/z/3tB9KG
define i1 @src(i8 %x, i8 %C, i8 %C2) {
%sub = sub nuw i8 %C2, %x
%r = icmp slt i8 %sub, %C
ret i1 %r
}
define i1 @tgt(i8 %x, i8 %C, i8 %C2) {
%Cnot = xor i8 %C, -1
%C2not = xor i8 %C2, -1
%add = add nuw i8 %x, %C2not
%r = icmp sgt i8 %add, %Cnot
ret i1 %r
}
There were 2 related but over-specified folds for:
C1 - X == C
One allowed multi-use but was limited to equal constants.
The other allowed different constants but disallowed multi-use.
This combines the 2 folds into a more general match.
The test diffs show the multi-use cases that were falling
through the cracks.
https://alive2.llvm.org/ce/z/4_hEt2
define i1 @src(i8 %x, i8 %subC, i8 %C) {
%s = sub i8 %subC, %x
%r = icmp eq i8 %s, %C
ret i1 %r
}
define i1 @tgt(i8 %x, i8 %subC, i8 %C) {
%newC = sub i8 %subC, %C
%isneg = icmp eq i8 %x, %newC
ret i1 %isneg
}
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.
Differential Revision: https://reviews.llvm.org/D110807
This patch fixes potential shufflevector-related bugs like D93818.
As in D93818, this patch changes shufflevector's default placeholder to poison.
To reduce risk, the work was divided into several patches; this one covers InstCombineCompares and InstructionCombining.
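A sketch of the effect (hypothetical output, not a test from the patch): a
one-input shuffle created by these files now uses poison instead of undef for
the unused second operand:
define <2 x i32> @example(<2 x i32> %v) {
%s = shufflevector <2 x i32> %v, <2 x i32> poison, <2 x i32> <i32 1, i32 0>
ret <2 x i32> %s
}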
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D110227
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`. This achieves two things:
1) This starts standardizing predicates across the LLVM codebase,
following (in this case) ConstantInt. The word "Value" doesn't
convey anything of merit, and is missing from some of the other methods.
2) Calling an integer "null" doesn't make any sense. The original sin
here is mine and I've regretted it for years. This moves us to calling
it "zero" instead, which is correct!
APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go. As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.
Included in this patch are changes to a bunch of the codebase, but there are
more. We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.
Differential Revision: https://reviews.llvm.org/D109483
This could go either direction since the instruction
count is the same either way, but there are a few
reasons to prefer this:
1. We already do the related transform with 'and'
(see just above the new code).
2. We try (too hard) to compensate for not having this
and possibly other folds in transformZExtICmp(),
and that leads to bugs like https://llvm.org/PR51762 .
3. Codegen looks better across a variety of targets.
https://alive2.llvm.org/ce/z/uEgn4P
We cannot leak any equivalency information by comparing against null
since null never has virtual metadata associated with it (when null is
not a valid dereferenceable pointer).
InstCombine seems to make sure that a null will be on the RHS, so we
don't have to check both operands.
This fixes a missed optimization in llvm-test-suite's MultiSource lambda
benchmark under -fstrict-vtable-pointers.
Reviewed By: Prazek
Differential Revision: https://reviews.llvm.org/D108734
This is a quick fix for a motivating case that looks like this:
https://godbolt.org/z/GeMqzMc38
As noted, we might be able to restore the min/max patterns
with select folds, or we just wait for this to become easier
with canonicalization to min/max intrinsics.
The intrinsics have an extra chunk of known bits logic
compared to the normal cmp+select idiom. That allows
folding the icmp in each case to something better, but
that then opposes the canonical form of min/max that
we try to form for a select.
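For reference, the two forms in question (a generic smax sketch, not the
exact motivating case):
define i8 @smax_select(i8 %x, i8 %y) {
%c = icmp sgt i8 %x, %y
%m = select i1 %c, i8 %x, i8 %y
ret i8 %m
}
define i8 @smax_intrinsic(i8 %x, i8 %y) {
%m = call i8 @llvm.smax.i8(i8 %x, i8 %y)
ret i8 %m
}
declare i8 @llvm.smax.i8(i8, i8)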
I'm carving out a narrow exception to preserve all
existing regression tests while avoiding the inf-loop.
It seems unlikely that this is the only bug like this
left, but this should fix:
https://llvm.org/PR51419
There may be some generalizations (see test comments) of these patterns,
but this should handle the cases motivated by:
https://llvm.org/PR51315
https://llvm.org/PR51259
The backend may want to transform differently, but at least for
the x86 examples that I looked at, there does not appear to be
any significant perf diff either way.
The inttoptr/ptrtoint roundtrip optimization is not always correct.
We are working towards removing this optimization and adding support
for the specific cases where it works. This patch is the
first one along this line.
Consider the example:
%i = ptrtoint i8* %X to i64
%p = inttoptr i64 %i to i16*
%cmp = icmp eq i16* %load, %p
In this specific case, the inttoptr/ptrtoint optimization is correct
as it only compares the pointer values. In this patch, we fold
inttoptr/ptrtoint to a bitcast (if src and dest types are different).
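So the example above becomes (a sketch of the folded result):
%p = bitcast i8* %X to i16*
%cmp = icmp eq i16* %load, %p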
Differential Revision: https://reviews.llvm.org/D105088
This set of folds was added recently with:
c7b658aeb5
0c400e8953
40b752d28d
...and I noted that this wasn't likely to fire in code derived
from C/C++ source because of nsw in particular. But I didn't
notice that I had placed the code above the no-wrap block
of transforms.
This is likely the cause of regressions noted from the previous
commit because -- as shown in the test diffs -- we may have
transformed into a compare with an arbitrary constant rather
than a simpler signbit test.
This is the pattern from the description of:
https://llvm.org/PR50816
There might be a way to generalize this to a smaller or more
generic pattern, but I have not found it yet.
https://alive2.llvm.org/ce/z/ShzJoF
define i1 @src(i8 %x) {
%add = add i8 %x, -1
%xor = xor i8 %x, -1
%and = and i8 %add, %xor
%r = icmp slt i8 %and, 0
ret i1 %r
}
define i1 @tgt(i8 %x) {
%r = icmp eq i8 %x, 0
ret i1 %r
}
This follows up patches for the unsigned siblings:
0c400e8953
c7b658aeb5
We are translating an offset signed compare to its
unsigned equivalent when one end of the range is
at the limit (zero or unsigned max).
(X + C2) >s C --> X <u (SMAX - C) (if C == C2 - 1)
(X + C2) <s C --> X >u (C ^ SMAX) (if C == C2)
This probably does not show up much in IR derived
from C/C++ source because that would likely have
'nsw', and we have folds for that already.
As with the previous unsigned transforms, the folds
could be generalized to handle non-constant patterns:
https://alive2.llvm.org/ce/z/Y8Xrrm
; sgt
define i1 @src(i8 %a, i8 %c) {
%c2 = add i8 %c, 1
%t = add i8 %a, %c2
%ov = icmp sgt i8 %t, %c
ret i1 %ov
}
define i1 @tgt(i8 %a, i8 %c) {
%c_off = sub i8 127, %c ; SMAX
%ov = icmp ult i8 %a, %c_off
ret i1 %ov
}
https://alive2.llvm.org/ce/z/c8uhnk
; slt
define i1 @src(i8 %a, i8 %c) {
%t = add i8 %a, %c
%ov = icmp slt i8 %t, %c
ret i1 %ov
}
define i1 @tgt(i8 %a, i8 %c) {
%c_offnot = xor i8 %c, 127 ; SMAX
%ov = icmp ugt i8 %a, %c_offnot
ret i1 %ov
}
This is one sibling of the fold added with c7b658aeb5 .
(X + C2) <u C --> X >s ~C2 (if C == C2 + SMIN)
I'm still not sure how to describe it best, but we're
translating 2 constants from an unsigned range comparison
to signed because that eliminates the offset (add) op.
This could be extended to handle the more general (non-constant)
pattern too:
https://alive2.llvm.org/ce/z/K-fMBf
define i1 @src(i8 %a, i8 %c2) {
%t = add i8 %a, %c2
%c = add i8 %c2, 128 ; SMIN
%ov = icmp ult i8 %t, %c
ret i1 %ov
}
define i1 @tgt(i8 %a, i8 %c2) {
%not_c2 = xor i8 %c2, -1
%ov = icmp sgt i8 %a, %not_c2
ret i1 %ov
}
There must be a better way to describe this pattern in words?
(X + C2) >u C --> X <s -C2 (if C == C2 + SMAX)
This could be extended to handle the more general (non-constant)
pattern too:
https://alive2.llvm.org/ce/z/rdfNFP
define i1 @src(i8 %a, i8 %c1) {
%t = add i8 %a, %c1
%c2 = add i8 %c1, 127 ; SMAX
%ov = icmp ugt i8 %t, %c2
ret i1 %ov
}
define i1 @tgt(i8 %a, i8 %c1) {
%neg_c1 = sub i8 0, %c1
%ov = icmp slt i8 %a, %neg_c1
ret i1 %ov
}
The pattern was noticed as a by-product of D104932.
We could use a bigger hammer and bail out on any constant
expression, but there's a regression test that appears to
validly do the transform (although it may not have been
intending to check that optimization).
Rather than relying on pointer type equality (which, for a change,
is silently incorrect with opaque pointers) check that the GEP
source element types match.
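A hypothetical illustration: with opaque pointers, both GEPs below have
operands of the same pointer type, so a pointer type equality check passes
even though the source element types differ:
define void @example(ptr %p) {
%a = getelementptr i32, ptr %p, i64 1   ; source element type is i32
%b = getelementptr i64, ptr %p, i64 1   ; source element type is i64
ret void
}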
As noted in PR45210: https://bugs.llvm.org/show_bug.cgi?id=45210
...the bug is triggered, as Eli says, when sext(idx) * ElementSize overflows.
```
// assume that GV is an array of 4-byte elements
GEP = gep GV, 0, Idx // this is accessing Idx * 4
L = load GEP
ICI = icmp eq L, value
=>
ICI = icmp eq Idx, NewIdx
```
The foldCmpLoadFromIndexedGlobal function simplifies the GEP+load operation to an icmp,
and there is a problem because Idx * ElementSize can overflow.
Let's assume that the wanted value is at offset 0.
Then, there are actually four possible values for Idx to match offset 0: 0x00..00, 0x40..00, 0x80..00, 0xC0..00.
We should return true for all these values, but currently, the new icmp only returns true for 0x00..00.
This problem can be solved by masking off the top (trailing zeros of ElementSize) bits of Idx.
```
...
=>
Idx' = and Idx, 0x3F..FF
ICI = icmp eq Idx', NewIdx
```
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D99481
This reverts commit 4f2fd3818b.
The Linux kernel fails to build after this commit. See
https://reviews.llvm.org/D99481 for a reproducer.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
As noted in PR45210: https://bugs.llvm.org/show_bug.cgi?id=45210
...the bug is triggered, as Eli says, when sext(idx) * ElementSize overflows.
```
// assume that GV is an array of 4-byte elements
GEP = gep GV, 0, Idx // this is accessing Idx * 4
L = load GEP
ICI = icmp eq L, value
=>
ICI = icmp eq Idx, NewIdx
```
The foldCmpLoadFromIndexedGlobal function simplifies the GEP+load operation to an icmp,
and there is a problem because Idx * ElementSize can overflow.
Let's assume that the wanted value is at offset 0.
Then, there are actually four possible values for Idx to match offset 0: 0x00..00, 0x40..00, 0x80..00, 0xC0..00.
We should return true for all these values, but currently, the new icmp only returns true for 0x00..00.
This problem can be solved by masking off the top (trailing zeros of ElementSize) bits of Idx.
```
...
=>
Idx' = and Idx, 0x3F..FF
ICI = icmp eq Idx', NewIdx
```
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D99481
This was reverted due to performance regressions in ARM benchmarks,
which have since been addressed by D101196 (SCEV analysis improvement)
and D101778 (CGP reverse transform).
-----
The single-use case is handled implicitly by converting the icmp
into a mask check first. When comparing with zero in particular,
we don't need the one-use restriction, as we only produce a single
icmp.
https://alive2.llvm.org/ce/z/MSixcm
https://alive2.llvm.org/ce/z/GwpG0M
This fixes https://llvm.org/PR48900 , but as seen in the
regression tests, it prevents some optimizations.
There are a few options to restore those (switch to min/max
intrinsics, add larger pattern matching for select with
dominating condition, improve CVP), but we need to prevent
the bug first.
This reverts commit 13ec913bdf.
This commit introduces new uses of the overflow checking intrinsics that
depend on implementations in compiler-rt, which Windows users generally
do not link against. I filed an issue (somewhere) to make clang
auto-link the builtins library to resolve this situation, but until that
happens, it isn't reasonable for the optimizer to introduce new link
time dependencies.
Currently, InstCombineCompares combines two add operations
into a single add operation which always has the nsw flag, without
checking the conditions to see whether this flag is justified
by the original two add operations.
This patch changes InstCombineCompares to emit nsw or
nuw only when these flags are allowed to be generated according to
the original add operations, and removes the possibility of a
wrong optimization being applied by passes that run on the IR later
in the pipeline.
To confirm that the current results are buggy and that the results after
the proposed patch are correct, the following examples from Alive2
are attached; the same results can be seen in the case of the nuw flag,
and nsw is just used as an example. The following link shows that
the IR generated by current LLVM is buggy when neither of the
original add operations has the nsw flag:
https://alive2.llvm.org/ce/z/WGaDrm
The following link shows that the IR generated after the patch in
the same case is correct:
https://alive2.llvm.org/ce/z/wQ7G_e
Differential Revision: https://reviews.llvm.org/D100095