Commit Graph

832 Commits

Author SHA1 Message Date
Nikita Popov 8ee913d83b [IR] Remove Constant::canTrap() (NFC)
As integer div/rem constant expressions are no longer supported,
constants can no longer trap and are always safe to speculate.
Remove the Constant::canTrap() method and its usages.
2022-07-06 10:36:47 +02:00
Chenbing Zheng b43dd2f6c4 [InstCombine] improve fold for icmp_eq_and to icmp_ult
In D95959, the improved analysis for "C >> X" broke the fold
((%x & C) == 0) --> %x u< (-C) iff (-C) is power of two.

It simplifies C, but fails to satisfy the fold condition.
This patch tries to restore C before the fold.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D128790
2022-07-05 17:18:23 +08:00
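The fold restored by b43dd2f6c4 can be sanity-checked by brute force. The sketch below enumerates every 8-bit value and confirms that `(x & C) == 0` and `x u< -C` agree whenever `-C` is a power of two. The 8-bit width and the helper names are assumptions made for illustration; they are not part of the patch.

```python
BITS = 8
MASK = (1 << BITS) - 1  # all-ones value for a BITS-wide integer


def is_pow2(v):
    """True if v has exactly one bit set."""
    return v != 0 and (v & (v - 1)) == 0


def and_eq_zero_counterexamples():
    """Return every (x, c) where the original and folded forms disagree."""
    bad = []
    for c in range(MASK + 1):
        neg_c = (-c) & MASK  # two's-complement negation in BITS bits
        if not is_pow2(neg_c):
            continue  # fold precondition: -C must be a power of two
        for x in range(MASK + 1):
            if ((x & c) == 0) != (x < neg_c):
                bad.append((x, c))
    return bad
```

An empty result means the two forms agree for every 8-bit input that satisfies the precondition.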
Chenbing Zheng b66220f25a [InstCombine] [NFC] use C.isNegatedPowerOf2() instead of (~C + 1).isPowerOf2()
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D129103
2022-07-05 17:04:59 +08:00
Sanjay Patel ab372cdd6f [InstCombine] add code comment for icmp transform; NFC
This was accidentally left out of cc88445a91
2022-07-01 08:21:55 -04:00
Craig Topper e633f8cd14 [InstCombine] Fix a Wparentheses warning in an assert. NFC 2022-06-30 13:03:32 -07:00
Sanjay Patel cc88445a91 [InstCombine] canonicalize 'icmp (trunc X), C' to 'icmp (X & Mask), C'
I looked at canonicalizing in the other direction, but that causes
many potential regressions and infinite loops because we already
(possibly wrongly) canonicalize "trunc X to i1" into an and+icmp.

This has a data layout restriction to avoid creating illegal
mask instructions, but we could remove that if we can show
that the backend can undo this when needed.

The motivating example from issue #56119 is modeled by the
PhaseOrdering test.
2022-06-30 15:51:39 -04:00
Sanjay Patel 7c4b90a98d [InstCombine] fix overzealous assert in icmp-shr fold
The assert was added with 0399473de8 and is correct for that
pattern, but it is off-by-1 with the enhancement in d4f39d8333.

The transforms are still correct with the new pre-condition:
https://alive2.llvm.org/ce/z/6_6ghm
https://alive2.llvm.org/ce/z/_GTBUt

And as shown in the new test, the transform is expected with
'ult' - in that case, the icmp reduces to test if the shift
amount is 0.
2022-06-30 06:28:48 -04:00
Sanjay Patel d4f39d8333 [InstCombine] add fold for (ShiftC >> X) >u C
This is the 'ugt' sibling to:
0399473de8

Decrement the input compare constant (and implicitly
decrement the new compare constant):
https://alive2.llvm.org/ce/z/iELmct
2022-06-29 12:30:01 -04:00
Nikita Popov 5548e807b5 [IR] Remove support for extractvalue constant expression
This removes the extractvalue constant expression, as part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
extractvalue is already not supported in bitcode, so we do not need
to worry about bitcode auto-upgrade.

Uses of ConstantExpr::getExtractValue() should be replaced with
IRBuilder::CreateExtractValue() (if the fact that the result is
constant is not important) or ConstantFoldExtractValueInstruction()
(if it is). Though for this particular case, it is also possible
and usually preferable to use getAggregateElement() instead.

The C API function LLVMConstExtractValue() is removed, as the
underlying constant expression no longer exists. Instead,
LLVMBuildExtractValue() should be used (which will constant fold
or create an instruction). Depending on the use-case,
LLVMGetAggregateElement() may also be used instead.

Differential Revision: https://reviews.llvm.org/D125795
2022-06-28 10:40:17 +02:00
chenglin.bi 30e49a3794 [InstCombine] Optimise shift+and+boolean conversion pattern to simple comparison
if (`C1` is pow2) & (`(C2 & ~(C1-1)) + C1` is pow2):
    ((C1 << X) & C2) == 0 -> X >= (Log2(C2+C1) - Log2(C1));
https://alive2.llvm.org/ce/z/EJAl1R
    ((C1 << X) & C2) != 0 -> X  < (Log2(C2+C1) - Log2(C1));
https://alive2.llvm.org/ce/z/3bVRVz

And remove dead code.

Fix: https://github.com/llvm/llvm-project/issues/56124

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D126591
2022-06-23 21:53:07 +08:00
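A small exhaustive check of the rule in 30e49a3794, as a sketch only. It assumes 8-bit integers, shift amounts below the bit width (larger shifts are poison in LLVM IR), and takes the logarithm of the masked sum `(C2 & ~(C1-1)) + C1` that appears in the precondition. All helper names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1


def is_pow2(v):
    return v != 0 and (v & (v - 1)) == 0


def log2(v):
    """Floor of log2 for a positive integer."""
    return v.bit_length() - 1


def shl_and_counterexamples():
    """Return every (c1, c2, x) where the compare and its folded form differ."""
    bad = []
    for k in range(BITS):
        c1 = 1 << k  # C1 must be a power of two
        for c2 in range(MASK + 1):
            total = ((c2 & ~(c1 - 1)) + c1) & MASK
            if not is_pow2(total):
                continue  # second precondition of the fold
            threshold = log2(total) - log2(c1)
            for x in range(BITS):
                shifted = (c1 << x) & MASK  # shl wraps at BITS bits
                if ((shifted & c2) == 0) != (x >= threshold):
                    bad.append((c1, c2, x))
    return bad
```

The `!= 0` form is the logical negation, so the same check covers `X < threshold`.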
Chenbing Zheng 0eff6c6ba8 [InstCombine] add vector support for (A >> C) == (B >> C) --> (A^B) u< (1 << C)
Reviewed By: spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D127398
2022-06-20 10:55:47 +08:00
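The scalar identity behind 0eff6c6ba8 can be confirmed lane by lane. The following sketch assumes a logical right shift on 6-bit values (chosen only to keep the enumeration small) and shift amounts below the bit width; the function name is illustrative.

```python
BITS = 6
MASK = (1 << BITS) - 1


def lshr_eq_counterexamples():
    """Return every (a, b, c) where (A >> C) == (B >> C) differs from (A ^ B) u< (1 << C)."""
    bad = []
    for c in range(BITS):
        for a in range(MASK + 1):
            for b in range(MASK + 1):
                before = (a >> c) == (b >> c)
                after = (a ^ b) < (1 << c)
                if before != after:
                    bad.append((a, b, c))
    return bad
```

Both sides say the same thing: A and B may differ only in the low C bits that the shift discards.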
Sanjay Patel 0399473de8 [InstCombine] add fold for (ShiftC >> X) <u C
https://alive2.llvm.org/ce/z/RcdzM-

This fixes a regression noted in issue #56046.
2022-06-19 11:03:28 -04:00
Simon Moll b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Chenbing Zheng 38992d2c5e [InstCombine] improve fold for icmp-ugt-ashr
The existing condition for the fold
icmp ugt (ashr X, ShAmtC), C --> icmp ugt X, ((C + 1) << ShAmtC) - 1
missed some boundary cases, so the fold did not fire in some cases.
The cause is signed overflow.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D127188
2022-06-09 16:22:12 +08:00
Sanjay Patel 2bf6123f22 [InstCombine] fold icmp of sext bool based on limited range
X <=u (sext i1 Y) --> (X == 0) | Y

https://alive2.llvm.org/ce/z/W_tZzo

This is the conjugate/sibling pattern suggested with D126171
for a sign-extended bool value.
2022-05-31 12:37:56 -04:00
Sanjay Patel 49f8b05137 [InstCombine] fold icmp equality with sdiv and SMIN
This extends the fold from D126410 / 3952c905ef
to allow for the only case where it works with signed
division:
https://alive2.llvm.org/ce/z/k7_ypu

(X s/ Y) == SMIN --> (X == SMIN) && (Y == 1)
(X s/ Y) != SMIN --> (X != SMIN) || (Y != 1)

This is another improvement based on #55695.
2022-05-26 16:19:15 -04:00
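As a rough check of the signed-division fold in 49f8b05137, the sketch below enumerates all 8-bit signed pairs. It skips division by zero and `SMIN s/ -1`, which are undefined behavior for `sdiv` in LLVM IR. The width and the `sdiv` helper are assumptions for illustration.

```python
BITS = 8
SMIN = -(1 << (BITS - 1))
SMAX = (1 << (BITS - 1)) - 1


def sdiv(x, y):
    """Signed division that truncates toward zero, like LLVM's sdiv."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q


def sdiv_smin_counterexamples():
    """Return every (x, y) where (X s/ Y) == SMIN differs from (X == SMIN) && (Y == 1)."""
    bad = []
    for x in range(SMIN, SMAX + 1):
        for y in range(SMIN, SMAX + 1):
            if y == 0 or (x == SMIN and y == -1):
                continue  # undefined behavior, not modeled
            before = sdiv(x, y) == SMIN
            after = x == SMIN and y == 1
            if before != after:
                bad.append((x, y))
    return bad
```

The `!=` variant follows by negating both sides.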
Sanjay Patel ed5be1523f [InstCombine] reduce code duplication in icmp+div folds; NFC 2022-05-26 16:19:15 -04:00
Sanjay Patel 3952c905ef [InstCombine] fold icmp equality with udiv and large constant
With large compare constant:
(X u/ Y) == C --> (X == C) && (Y == 1)
(X u/ Y) != C --> (X != C) || (Y != 1)

https://alive2.llvm.org/ce/z/EhKwh6

There are various potential missing icmp (div) transforms shown here:
https://github.com/llvm/llvm-project/issues/55695

This is a generalization for part of the udiv + equality.
I didn't check in detail, but some of those may only make sense as
codegen transforms.

This results in one extra instruction in IR, but it is better for
analysis, and looks much better in codegen on all targets that I tried.

Differential Revision: https://reviews.llvm.org/D126410
2022-05-26 09:08:47 -04:00
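The commit message for 3952c905ef does not define "large". The sketch below assumes it means `C u> UMAX / 2`: with such a constant, any divisor of 2 or more would need `X >= 2 * C`, which does not fit in the type. It uses 6-bit values to keep the enumeration small and skips division by zero; the names are hypothetical.

```python
BITS = 6
UMAX = (1 << BITS) - 1


def udiv_large_const_counterexamples():
    """Return every (x, y, c) where (X u/ Y) == C differs from (X == C) && (Y == 1)."""
    bad = []
    for c in range(UMAX // 2 + 1, UMAX + 1):  # assumed meaning of "large"
        for x in range(UMAX + 1):
            for y in range(1, UMAX + 1):  # y == 0 is undefined behavior
                before = (x // y) == c
                after = x == c and y == 1
                if before != after:
                    bad.append((x, y, c))
    return bad
```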
Sanjay Patel 1ebad988b1 [InstCombine] fold icmp of zext bool based on limited range
X <u (zext i1 Y) --> (X == 0) && Y

https://alive2.llvm.org/ce/z/avQDRY

This is a generalization of 4069cccf3b based on the post-commit suggestion.
This also adds the i1 type check and tests that were missing from the earlier
attempt; that commit caused several bot fails and was reverted.

Differential Revision: https://reviews.llvm.org/D126171
2022-05-23 09:59:21 -04:00
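The limited-range argument in 1ebad988b1 is easy to verify: a zero-extended bool is either 0 or 1, so `X u< zext(Y)` can only hold when `Y` is true and `X` is zero. A minimal sketch, assuming 8-bit `X` and an illustrative function name:

```python
BITS = 8
UMAX = (1 << BITS) - 1


def zext_bool_counterexamples():
    """Return every (x, y) where X u< zext(Y) differs from (X == 0) && Y."""
    bad = []
    for x in range(UMAX + 1):
        for y in (False, True):
            before = x < int(y)  # zext i1 Y is 0 or 1
            after = (x == 0) and y
            if before != after:
                bad.append((x, y))
    return bad
```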
Sanjay Patel cba0ebd576 Revert "[InstCombine] fold icmp with sub and bool"
This reverts commit 4069cccf3b.
This causes bot failures, and there's possibly a better way to get this and other patterns.
2022-05-22 12:13:20 -04:00
Sanjay Patel 4069cccf3b [InstCombine] fold icmp with sub and bool
This is the specific pattern seen in #53432, but it can be extended
in multiple ways:
1. The 'zext' could be an 'and'
2. The 'sub' could be some other binop with a similar ==0 property (udiv).

There might be some way to generalize using knownbits, but that
would require checking that the 'bool' value is created with
some instruction that can be replaced with new icmp+logic.

https://alive2.llvm.org/ce/z/-KCfpa
2022-05-22 11:51:07 -04:00
Chenbing Zheng ffaaf2498b [InstCombine] (rot X, ?) == 0/-1 --> X == 0/-1
In this patch we add a function foldICmpInstWithConstantAllowUndef
to fold integer comparisons with a constant operand: icmp Pred X, C
where X is some kind of instruction and C is AllowUndef.

We move this fold to the new function, so that it can handle undef elements in a vector.

Reviewed By: spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D125220
2022-05-19 11:22:26 +08:00
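The rotate fold in ffaaf2498b holds because a rotate only permutes bits, so all-zero and all-one values are unchanged by it. A brute-force sketch, assuming 8-bit values and a hypothetical `rotl` helper that models a rotate-left:

```python
BITS = 8
MASK = (1 << BITS) - 1


def rotl(x, amount):
    """Rotate x left by amount within BITS bits."""
    amount %= BITS
    return ((x << amount) | (x >> (BITS - amount))) & MASK


def rotate_cmp_counterexamples():
    """Return every (x, amount, c) where (rot X, amount) == c differs from X == c, for c in {0, -1}."""
    bad = []
    for x in range(MASK + 1):
        for amount in range(BITS):
            for c in (0, MASK):  # MASK is -1 as an unsigned bit pattern
                if (rotl(x, amount) == c) != (x == c):
                    bad.append((x, amount, c))
    return bad
```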
Sanjay Patel 990cc49ca0 [InstCombine] avoid crash on fold of icmp with cast operand
We could do better by inserting a bitcast from scalar int
to vector int or using an insertelement (the alternate test
does not crash because there's an independent fold like that).

But this doesn't seem like a likely pattern, so just bail out
for now.

Fixes issue #55516.
2022-05-18 09:16:30 -04:00
Sanjay Patel be6d7cc93c [InstCombine] reduce code duplication for checking types; NFC 2022-05-18 09:16:30 -04:00
Chenbing Zheng acbad5086a [InstCombine] [NFC] separate a function foldICmpBinOpWithConstant
foldICmpInstWithConstant is a long function, so this patch splits
foldICmpBinOpWithConstant out of it.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D125457
2022-05-14 10:54:15 +08:00
Nikita Popov d222bab672 [InstCombine] Handle GEP scalar/vector base mismatch (PR55363)
30a12f3f63 switched the type check
to use the GEP result type rather than the GEP operand type.
However, the GEP result types may match even if the operand types
don't, in case GEPs with scalar/vector base and vector index
are compared.

Fixes https://github.com/llvm/llvm-project/issues/55363.
2022-05-10 11:26:43 +02:00
Nikita Popov 82190f917a [InstCombine] Fold icmp of select with implied condition
When threading the icmp over the select, check whether the
condition can be folded when taking into account the select
condition.
2022-05-06 17:13:32 +02:00
Nikita Popov 0863abe3ac [InstCombine] Fold icmp of select with non-constant operand
Try to push an icmp into a select even if the icmp operand isn't
constant - perform a generic SimplifyICmpInst instead.

This doesn't appear to impact compile-time much, and forming
logical and/or is generally profitable, as we have very good
support for them.
2022-05-06 16:04:39 +02:00
Nikita Popov b457ac4240 [InstCombine] Extract icmp of select transform (NFC)
To make it easier to extend to the case where the other operand
is not a constant.
2022-05-06 14:46:44 +02:00
Nikita Popov 95fedfab6c [InstCombine] Handle non-canonical GEP index in indexed compare fold (PR55228)
Normally the index type will already be canonicalized here, but
this is not guaranteed depending on visitation order. The code
was already accounting for a potentially needed sext, but a trunc
may also be needed.

Add a ConstantExpr::getSExtOrTrunc() helper method to make this
simpler. This matches the corresponding IRBuilder method in behavior.

Fixes https://github.com/llvm/llvm-project/issues/55228.
2022-05-02 17:56:01 +02:00
Sanjay Patel 903aa5e0f8 [InstCombine] try to fold icmp with mismatched extended operands
If a value is known to be non-negative and zexted,
that's the same thing as sexted.

So for the purpose of looking past the casts with
an icmp, treat it as if it was a sext:
https://alive2.llvm.org/ce/z/_BDsGV

This is necessary, but not enough to solve the
motivating problem:
https://github.com/llvm/llvm-project/issues/55013

Differential Revision: https://reviews.llvm.org/D124419
2022-04-26 14:26:36 -04:00
Nikita Popov 2bec8d6d59 [InstCombine] Fold X + Y + C u< X
This is a variation on the X + Y u< X fold with an extra constant.
Proof: https://alive2.llvm.org/ce/z/VNb8pY
2022-04-25 12:53:39 +02:00
Alexander Shaposhnikov 6cf10b7e6e [InstCombine] Fold srem(X, PowerOf2) == C into (X & Mask) == C for positive C
This diff extends InstCombinerImpl::foldICmpSRemConstant to handle the cases
srem(X, PowerOf2) == C and
srem(X, PowerOf2) != C
for positive C.
This addresses the issue https://github.com/llvm/llvm-project/issues/54650

Differential revision: https://reviews.llvm.org/D122942

Test plan: make check-all
2022-04-03 03:57:05 +00:00
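The message for 6cf10b7e6e does not spell out the mask. The sketch below assumes it is the sign bit combined with the low `PowerOf2 - 1` bits, so that the masked compare rejects negative `X` and matches the low bits otherwise. It enumerates 8-bit values with positive power-of-two divisors; all names are illustrative.

```python
BITS = 8
SIGN_BIT = 1 << (BITS - 1)
SMIN = -SIGN_BIT
SMAX = SIGN_BIT - 1


def srem(x, d):
    """Signed remainder whose sign follows the dividend, like LLVM's srem."""
    r = abs(x) % abs(d)
    return -r if x < 0 else r


def srem_pow2_counterexamples():
    """Return every (x, d, c) where srem(X, d) == C differs from (X & Mask) == C."""
    bad = []
    for k in range(1, BITS - 1):
        d = 1 << k  # positive power-of-two divisor
        mask = SIGN_BIT | (d - 1)  # assumed mask: sign bit plus low bits
        for c in range(1, SMAX + 1):  # the fold is stated for positive C only
            for x in range(SMIN, SMAX + 1):
                x_bits = x & ((1 << BITS) - 1)  # two's-complement bit pattern
                if (srem(x, d) == c) != ((x_bits & mask) == c):
                    bad.append((x, d, c))
    return bad
```

With `C == 0` the forms differ for negative multiples of the divisor, which is consistent with the restriction to positive `C`.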
Sanjay Patel 5f8c2b884d [InstCombine] limit icmp fold with sub if other sub user is a phi
This is a hacky fix for:
https://github.com/llvm/llvm-project/issues/54558

As discussed there, codegen regressed when we opened up this transform
to allow extra uses (61580d0949), and it's not clear how to
undo the transforms at the later stage of compilation.

As noted in the code comments, there's a set of remaining folds that
are still limited to one-use, so we can try harder to refine and
expand the limitations on these folds, but it's likely to be an
up-and-down battle as we find and overcome similar regressions.

Differential Revision: https://reviews.llvm.org/D122909
2022-04-02 19:23:42 -04:00
Sanjay Patel 97ac0cd6c4 [InstCombine] fold fcmp with lossy casted constant (2nd try)
This is a retry of 9397bdc67e - that was reverted until
we had a clang warning in place to alert users about a
possible mistake in source. The warning was added with
ab982eace6.

This is noted as a missing clang warning in #54222,
but it is also a missing optimization opportunity.

Alive2 proofs:
https://alive2.llvm.org/ce/z/Q8drDq
https://alive2.llvm.org/ce/z/pE6LRt

I don't see a single conversion for all predicates
using "getFCmpCode" logic, so other predicates are
left as a TODO item.
2022-04-02 19:23:01 -04:00
Simon Pilgrim 7e4cf582cf [InstCombine] Add general constant support to eq/ne icmp(add(X,C1),add(Y,C2)) -> icmp(add(X,C1-C2),Y) fold
A further extension for Issue #32161

For eq/ne comparisons - the sign mismatch and bounds constraints are redundant, so if that fold fails, fall back and just fold the constants directly.

https://alive2.llvm.org/ce/z/cdodNQ

The loop rotation test change looks mostly benign - the backend doesn't seem to suffer? https://gcc.godbolt.org/z/dErMY78To

Differential Revision: https://reviews.llvm.org/D121551
2022-03-15 14:17:38 +00:00
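For equality, the constant merge in 7e4cf582cf is plain modular arithmetic: subtracting `C2` from both sides never changes whether they are equal. A sketch that checks this exhaustively, assuming 4-bit wrapping integers to keep the search small:

```python
BITS = 4
MASK = (1 << BITS) - 1


def add_const_eq_counterexamples():
    """Return every (x, y, c1, c2) where (X + C1) == (Y + C2) differs from (X + (C1 - C2)) == Y."""
    bad = []
    values = range(MASK + 1)
    for c1 in values:
        for c2 in values:
            for x in values:
                for y in values:
                    before = ((x + c1) & MASK) == ((y + c2) & MASK)
                    after = ((x + c1 - c2) & MASK) == y
                    if before != after:
                        bad.append((x, y, c1, c2))
    return bad
```

The `ne` predicate is the negation of `eq`, so it is covered by the same check.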
Sanjay Patel 3491f2f4b0 [InstCombine] replace negated operand in fcmp with 0.0
X (any pred) -X --> X (any pred) 0.0

This works with all FP values and preserves FMF.
Alive2 examples:
https://alive2.llvm.org/ce/z/dj6jhp

This can also create one of the patterns that we match as "fabs"
as shown in one of the test diffs.
2022-03-10 12:53:32 -05:00
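The fcmp rewrite in 3491f2f4b0 can be spot-checked with ordinary IEEE-754 doubles. The sketch below covers only the six predicates that Python's comparison operators express (five ordered ones plus `une`), over a hand-picked sample set that includes signed zeros, infinities, and NaN. The sample list and names are assumptions, and this is not a proof for all values or predicates.

```python
import math
import operator

# Python float comparisons follow IEEE-754: != is "unordered or not equal",
# the other operators are ordered and return False when either side is NaN.
PREDICATES = {
    "olt": operator.lt,
    "ole": operator.le,
    "ogt": operator.gt,
    "oge": operator.ge,
    "oeq": operator.eq,
    "une": operator.ne,
}

SAMPLES = [0.0, -0.0, 1.5, -1.5, 5e-324, 1e308, math.inf, -math.inf, math.nan]


def fcmp_neg_counterexamples():
    """Return every (predicate, x) where fcmp X, -X differs from fcmp X, 0.0."""
    bad = []
    for name, pred in PREDICATES.items():
        for x in SAMPLES:
            if pred(x, -x) != pred(x, 0.0):
                bad.append((name, x))
    return bad
```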
Sanjay Patel 9fac110bf7 Revert "[InstCombine] fold fcmp with lossy casted constant"
This reverts commit 9397bdc67e.

This optimization is likely to surprise programmers as seen
in post-commit comments, so we should add a clang warning
first (that is proposed in D121306).
2022-03-10 10:22:22 -05:00
Simon Pilgrim 808d9d260b [InstCombine] Add vector support to icmp(add(X,C1),add(Y,C2)) -> icmp(add(X,C1-C2),Y) fold
As discussed on Issue #32161 this fold can be generalized a lot more than it currently is, but this patch at least adds vector support.

Differential Revision: https://reviews.llvm.org/D121358
2022-03-10 13:30:48 +00:00
Sanjay Patel 9397bdc67e [InstCombine] fold fcmp with lossy casted constant
This is noted as a missing clang warning in #54222
(and we should still make that enhancement).

Alive2 proofs:
https://alive2.llvm.org/ce/z/Q8drDq
https://alive2.llvm.org/ce/z/pE6LRt

I don't see a single conversion for all predicates
using "getFCmpCode" logic, so other predicates are
left as a TODO item.
2022-03-08 12:41:12 -05:00
serge-sans-paille 59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Nikita Popov 61580d0949 Reapply [InstCombine] Remove one-use limitation from X-Y==0 fold
This is a recommit without changes. I originally reverted this
due to a significant code-size regression on tramp3d-v4, however
further investigation showed that in the tramp3d-v4 case this
change enables additional optimizations (in particular more
jump threading), which happens to reduce the size of a function
just enough to be eligible for inlining at hot callsites, which
results in the code size increase. As such, this was just bad
luck.

-----

This one-use limitation is artificial: we do not increase
instruction count if we perform the fold with multiple uses. The
motivating case is shown in @sub_eq_zero_select, where the one-use
limitation causes us to miss a subsequent select fold.

I believe the backend is pretty good about reusing flag-producing
subs for cmps with same operands, so I think doing this is fine.

Differential Revision: https://reviews.llvm.org/D120337
2022-03-02 16:43:33 +01:00
Nikita Popov aa551ad198 Revert "[InstCombine] Remove one-use limitation from X-Y==0 fold"
This reverts commit 65dc78d63e.

This caused a major code-size regression on tramp3d-v4, revert
until I can investigate.
2022-02-24 08:50:40 +01:00
Nikita Popov 65dc78d63e [InstCombine] Remove one-use limitation from X-Y==0 fold
This one-use limitation is artificial: we do not increase
instruction count if we perform the fold with multiple uses. The
motivating case is shown in @sub_eq_zero_select, where the one-use
limitation causes us to miss a subsequent select fold.

I believe the backend is pretty good about reusing flag-producing
subs for cmps with same operands, so I think doing this is fine.

Differential Revision: https://reviews.llvm.org/D120337
2022-02-23 09:37:30 +01:00
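The fold discussed here rewrites `X - Y == 0` as `X == Y`. A minimal sketch confirming that the two agree under wrapping subtraction, assuming 6-bit integers and an illustrative function name:

```python
BITS = 6
MASK = (1 << BITS) - 1


def sub_eq_zero_counterexamples():
    """Return every (x, y) where (X - Y) == 0 differs from X == Y."""
    bad = []
    for x in range(MASK + 1):
        for y in range(MASK + 1):
            before = ((x - y) & MASK) == 0  # sub wraps at BITS bits
            after = x == y
            if before != after:
                bad.append((x, y))
    return bad
```

The check says nothing about the one-use question, which concerns instruction count rather than correctness.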
Philip Reames 6f9d557e08 [instcombine] Cleanup foldAllocaCmp slightly [NFC] 2022-02-18 18:49:39 -08:00
Nikita Popov e714b98fff [InstCombine] Check type compatibility in indexed load fold
This fold could use a rewrite to an offset-based implementation,
but for now make sure it doesn't crash with opaque pointers.
2022-02-11 10:16:27 +01:00
Nikita Popov 3571bdb4f3 [InstCombine] Require equal source element type in icmp of gep fold
Without opaque pointers, this is implicitly enforced. This previously
resulted in a miscompile.
2022-02-11 09:38:28 +01:00
Simon Pilgrim aca355a3bb [InstCombine] Extend fold (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0) to support smin intrinsic
Replace matchSelectPattern pattern match with the more general m_SMin so that it can handle smin intrinsics as well as the icmp+select pattern

Noticed while reviewing regressions from D98152
2022-02-10 13:28:15 +00:00
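The arithmetic behind the smin fold in aca355a3bb is simple to check. The sketch assumes that `PosA` means a value known to be strictly greater than zero (with `PosA == 0` the fold would not hold) and enumerates 8-bit signed values; the function name is hypothetical.

```python
BITS = 8
SMIN = -(1 << (BITS - 1))
SMAX = (1 << (BITS - 1)) - 1


def smin_sgt_zero_counterexamples():
    """Return every (a, b) where smin(PosA, B) s> 0 differs from B s> 0."""
    bad = []
    for a in range(1, SMAX + 1):  # assumption: PosA is strictly positive
        for b in range(SMIN, SMAX + 1):
            before = min(a, b) > 0
            after = b > 0
            if before != after:
                bad.append((a, b))
    return bad
```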
Max Kazantsev 70b3beb0e2 [InstCombine] Generalize and-reduce pattern to handle `ne` case as well as `eq`
Following Sanjay's proposal from discussion in D118317, this patch
generalizes and-reduce handling to fold the following pattern
```
  icmp ne (bitcast(icmp ne (lhs, rhs)), 0)
```
into
```
  icmp ne (bitcast(lhs), bitcast(rhs))
```

https://alive2.llvm.org/ce/z/WDcuJ_

Differential Revision: https://reviews.llvm.org/D118431
Reviewed By: lebedev.ri
2022-01-31 12:14:08 +07:00
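The `ne` pattern in 70b3beb0e2 can be modeled with small vectors. In the sketch below a vector is a tuple of lanes and a bitcast packs the lanes into one integer. The lane count, lane width, and packing order are assumptions for illustration; the packing order does not affect an equality test.

```python
from itertools import product

LANES = 3
LANE_BITS = 2


def bitcast_to_int(lanes, lane_bits):
    """Pack a tuple of lanes into one integer, lane 0 in the low bits."""
    value = 0
    for index, lane in enumerate(lanes):
        value |= lane << (index * lane_bits)
    return value


def and_reduce_ne_counterexamples():
    """Return every (lhs, rhs) where the original and folded patterns disagree."""
    vectors = list(product(range(1 << LANE_BITS), repeat=LANES))
    bad = []
    for lhs in vectors:
        for rhs in vectors:
            # icmp ne (bitcast(icmp ne (lhs, rhs)), 0)
            lane_ne = tuple(int(l != r) for l, r in zip(lhs, rhs))
            before = bitcast_to_int(lane_ne, 1) != 0
            # icmp ne (bitcast(lhs), bitcast(rhs))
            after = bitcast_to_int(lhs, LANE_BITS) != bitcast_to_int(rhs, LANE_BITS)
            if before != after:
                bad.append((lhs, rhs))
    return bad
```

Both forms ask whether any lane differs, which is why the lane-wise compare can be replaced by one wide integer compare.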
Max Kazantsev 3b194ca7ab Recommit "[InstCombine] Fold and-reduce idiom"
Checks of original vector types made more thorough.

Differential Revision: https://reviews.llvm.org/D118317
2022-01-29 11:27:48 +07:00