Commit Graph

125 Commits

Author SHA1 Message Date
Sanjay Patel 09c575e728 [InstCombine] add/move tests for shl with binop; NFC 2021-09-28 14:46:27 -04:00
Sanjay Patel 21429cf43a [InstCombine] generalize fold for (trunc (X u>> C1)) u>> C
This is another step towards trying to re-apply D110170
by eliminating conflicting transforms that cause infinite loops.
a47c8e40c7 was a previous patch in this direction.

The diffs here are mostly cosmetic, but intentional:
1. The existing code that would handle this pattern in FoldShiftByConstant()
   is limited to 'shl' only now. The formatting change to IsLeftShift shows
   that we could move several transforms into visitShl() directly for
   efficiency because they are not common shift transforms.

2. The tests are regenerated to show new instruction names to prove that
   we are getting (almost) identical logic results.

3. The one case where we differ ("trunc_sandwich_small_shift1") shows that
   we now use a narrow 'and' instruction. Previously, we relied on another
   transform to do that, but it is limited to legal types. That seems to
   be a legacy constraint from when IR analysis and codegen were less robust.

https://alive2.llvm.org/ce/z/JxyGA4

  declare void @llvm.assume(i1)

  define i8 @src(i32 %x, i32 %c0, i8 %c1) {
    ; The sum of the shifts must not overflow the source width.
    %z1 = zext i8 %c1 to i32
    %sum = add i32 %c0, %z1
    %ov = icmp ult i32 %sum, 32
    call void @llvm.assume(i1 %ov)

    %sh1 = lshr i32 %x, %c0
    %tr = trunc i32 %sh1 to i8
    %sh2 = lshr i8 %tr, %c1
    ret i8 %sh2
  }

  define i8 @tgt(i32 %x, i32 %c0, i8 %c1) {
    %z1 = zext i8 %c1 to i32
    %sum = add i32 %c0, %z1
    %maskc = lshr i8 -1, %c1

    %s = lshr i32 %x, %sum
    %t = trunc i32 %s to i8
    %a = and i8 %t, %maskc
    ret i8 %a
  }
2021-09-27 10:57:31 -04:00
Simon Pilgrim 5a14edd8ed [InstCombine] Ensure shifts are in range for (X << C1) / C2 -> X fold.
We can get here before out-of-range shift amounts have been handled, so limit to BW-2 for sdiv and BW-1 for udiv.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38078
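
For illustration, a hedged IR sketch of the guarded fold (constants chosen arbitrarily; the fold applies when C2 == 1 << C1 and the shift cannot overflow):

  define i8 @udiv_of_shl(i8 %x) {
    %s = shl nuw i8 %x, 3
    %d = udiv i8 %s, 8      ; 8 == 1 << 3, and 3 <= BW-1 for udiv
    ret i8 %d               ; folds to %x
  }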
2021-09-25 12:57:43 +01:00
Simon Pilgrim 10c982e0b3 Revert rG1c9bec727ab5c53fa060560dc8d346a911142170 : [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069)
Reverted (manually due to merge conflicts) while regressions reported on PR51540 are investigated

As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))", if the inner Ptr was also a (now one-use) gep, we could then merge the geps, using the sum of the indices instead.

I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst.

https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!)

Differential Revision: https://reviews.llvm.org/D106450
2021-08-23 21:09:26 +01:00
Simon Pilgrim 1c9bec727a [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069)
As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))", if the inner Ptr was also a (now one-use) gep, we could then merge the geps, using the sum of the indices instead.

I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst.

https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!)

Differential Revision: https://reviews.llvm.org/D106450
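
For illustration, a minimal IR sketch of the merge (types and value names chosen arbitrarily):

  %g0 = getelementptr i32, i32* %ptr, i64 %idx0   ; single use
  %g1 = getelementptr i32, i32* %g0, i64 %idx1
  ; -->
  %sum = add i64 %idx0, %idx1
  %g1 = getelementptr i32, i32* %ptr, i64 %sum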
2021-07-22 10:58:51 +01:00
Sanjay Patel 0be0a1237c [ValueTracking] improve analysis for "C << X" and "C >> X"
This is based on the example/comments in:
https://llvm.org/PR48984

I tried just lifting the restriction in computeKnownBitsFromShiftOperator()
as suggested in the bug report, but that doesn't catch all of the cases
shown here. I didn't step through to see exactly why that happened. But it
seems like a reasonable compromise to cheaply check the special case of
shifting a constant.

There's a slight regression on a cmp transform as noted, but this is likely
the more important/common pattern, so we can fix that icmp pattern later if
needed.

Differential Revision: https://reviews.llvm.org/D95959
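
For illustration, a hedged sketch of the kind of case this catches (constants chosen arbitrarily): with C = 2, every in-range shift leaves bit 0 clear, so the low bit of "C << X" is known zero:

  define i1 @shl_const_low_bit(i8 %x) {
    %s = shl i8 2, %x
    %b = and i8 %s, 1       ; known to be 0
    %r = icmp eq i8 %b, 0
    ret i1 %r               ; can fold to true
  }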
2021-02-09 12:38:06 -05:00
Sanjay Patel 9d230295d9 [InstCombine] add tests for demanded/known bits of shifted constant; NFC
These are variations of a missed analysis noted in:
https://llvm.org/PR48984
2021-02-04 10:31:22 -05:00
Nikita Popov a6df39236f [InstSimplify] Fold out-of-bounds shift to poison
Make InstSimplify return poison rather than undef for out-of-bounds
shifts, as specified by the LangRef:

> If op2 is (statically or dynamically) equal to or larger than the
> number of bits in op1, this instruction returns a poison value.

Differential Revision: https://reviews.llvm.org/D93998
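
For example, a statically out-of-range shift now simplifies to poison:

  define i32 @oob_shift(i32 %x) {
    %r = shl i32 %x, 33     ; 33 >= the 32-bit width of the operand
    ret i32 %r              ; simplifies to poison (previously undef)
  }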
2021-01-06 20:41:37 +01:00
Nikita Popov 766cf7f32e [InstSimplify] Fold division by zero to poison
Div/rem by zero is immediate undefined behavior and anything goes.
Currently we fold it to undef, this patch changes it to fold to
poison instead, which is slightly stronger.

Differential Revision: https://reviews.llvm.org/D93995
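
For example:

  define i8 @div_by_zero(i8 %x) {
    %r = udiv i8 %x, 0      ; immediate undefined behavior
    ret i8 %r               ; now folds to poison instead of undef
  }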
2021-01-03 20:52:45 +01:00
Roman Lebedev 0e76a9bc58 [NFC][InstCombine] Update a few comments I missed in 0ac56e8eaa
As pointed out in post-commit review of that commit
2020-11-06 17:38:00 +03:00
Simon Pilgrim dcb3dc101d [InstCombine] visitShl - ensure inner shifts have in-range amounts
Noticed when fixing OSS Fuzz #26716
2020-10-29 15:28:15 +00:00
Roman Lebedev 0ac56e8eaa [InstCombine] Fold `(X >>? C1) << C2` patterns to shift+bitmask (PR37872)
This essentially finalizes a revert of rL155136:
nowadays the situation has improved, SCEV can model
all these patterns well, and we canonicalize rotate-like patterns
into funnel shift intrinsics in InstCombine.
So this should not cause any pessimization.

I've verified the canonicalize-{a,l}shr-shl-to-masking.ll transforms
with Alive, which confirms that we can freely preserve exactness
and no-wrap flags.

Proofs:
* base: https://rise4fun.com/Alive/gPQ
* exact-ness preservation: https://rise4fun.com/Alive/izi
* nuw preservation: https://rise4fun.com/Alive/DmD
* nsw preservation: https://rise4fun.com/Alive/SLN6N
* nuw nsw preservation: https://rise4fun.com/Alive/Qp7

Refs. https://reviews.llvm.org/D46760
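
For illustration, a minimal IR sketch of the C1 > C2 case (constants chosen arbitrarily):

  %s = lshr i8 %x, 3
  %r = shl i8 %s, 1
  ; -->
  %t = lshr i8 %x, 2       ; shift by C1 - C2
  %r = and i8 %t, -2       ; -2 == -1 << 1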
2020-10-27 14:42:53 +03:00
Simon Pilgrim 8a836daaa9 [InstCombine] Support lshr(trunc(lshr(x,c1)), c2) -> trunc(lshr(lshr(x,c1),c2)) uniform vector tests
FoldShiftByConstant is currently hardcoded for scalar/uniform outer shift amounts, so that needs to be fixed first to support non-uniform cases
2020-10-09 16:54:46 +01:00
Simon Pilgrim af1f016436 [InstCombine] Add lshr(trunc(lshr(x,c1)), c2) -> trunc(lshr(lshr(x,c1),c2)) vector tests 2020-10-09 16:54:46 +01:00
Simon Pilgrim 1c040a3e56 [InstCombine] commonShiftTransforms - add support for pow2 nonuniform constant vectors in srem fold
Note: we already fold srem to undef if any denominator vector element is undef.
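
For illustration, a scalar sketch of the underlying fold (a negative srem result would be an out-of-range shift amount, i.e. poison, so the replacement is a refinement):

  %m = srem i8 %a, 8
  %r = shl i8 %x, %m
  ; -->
  %m2 = and i8 %a, 7       ; B - 1
  %r = shl i8 %x, %m2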
2020-10-09 15:59:33 +01:00
Simon Pilgrim ccf1260792 [InstCombine] Add tests for X shift (A srem B) -> X shift (A and B-1) pow2 nonuniform constant vectors 2020-10-09 15:33:06 +01:00
Simon Pilgrim 2cd7b0e130 [ValueTracking] canCreateUndefOrPoison - use APInt to check bounds instead of getZExtValue().
Fixes OSS Fuzz #26135
2020-10-05 13:45:27 +01:00
Roman Lebedev c3b8bd1eea [InstCombine] Always try to invert non-canonical predicate of an icmp
Summary:
The actual transform I was going after was:
https://rise4fun.com/Alive/Tp9H
```
Name: zz
Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0
%t0 = and i8 %x, C0
%r = icmp eq i8 %t0, C1
  =>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1

Name: zz
Pre: isPowerOf2(C0)
%t0 = and i8 %x, C0
%r = icmp ne i8 %t0, 0
  =>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1
```
but as can be seen from the current tests, we already canonicalize most of it,
and we are only missing the handling of multi-use non-canonical icmp predicates.

If we have both `!=0` and `==0`, even though we can CSE them,
we end up being stuck with them. We should canonicalize to the `==0`.

I believe this is one of the cleanup steps I'll need after `-scalarizer`
if I end up proceeding with my WIP alloca promotion helper pass.

Reviewers: spatel, jdoerfert, nikic

Reviewed By: nikic

Subscribers: zzheng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83139
2020-07-04 18:12:04 +03:00
Roman Lebedev 341ab51149 [NFCI][InstCombine] shift.ll: s/%tmp/%i/ to silence update script warning 2020-07-04 00:39:35 +03:00
Roman Lebedev 796fa662f1 [InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to `sub A, zext B -> add A, sext B`)
Summary:
D68408 proposes to greatly improve our negation sinking abilities.
But with the current canonicalization, we produce `sub A, zext(B)`,
which we would then consider non-canonical and try to sink that negation,
undoing the existing canonicalization.
So unless we explicitly stop producing the previous canonicalization,
we will have two conflicting folds and will end up endlessly looping.

This inverts canonicalization, and adds back the obvious fold
that we'd miss:
* `sub [nsw] Op0, sext/zext (bool Y) -> add [nsw] Op0, zext/sext (bool Y)`
  https://rise4fun.com/Alive/xx4
* `sext(bool) + C -> bool ? C - 1 : C`
  https://rise4fun.com/Alive/fBl
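
Illustrating the first fold in IR (i8 chosen arbitrarily):

  %z = zext i1 %y to i8
  %r = sub i8 %x, %z
  ; -->
  %s = sext i1 %y to i8
  %r = add i8 %x, %s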

It is obvious that `@ossfuzz_9880()` / `@lshr_out_of_range()` / `@ashr_out_of_range()`
(oss-fuzz 4871) are no longer folded as much, though those aren't really worrying.

Reviewers: spatel, efriedma, t.p.northover, hfinkel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71064
2019-12-05 21:21:30 +03:00
Sanjay Patel 455ce7816c [InstCombine] fold a shifted bool zext to a select (2nd try)
The first attempt, rL374828, inserted the code
at the wrong position (outside of the constant-shift-amount
block). Trying again with an additional test to verify
const-ness.

For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)

https://rise4fun.com/Alive/IZ9
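
For illustration (ShAmt = 3 chosen arbitrarily):

  %z = zext i1 %x to i8
  %r = shl i8 %z, 3
  ; -->
  %r = select i1 %x, i8 8, i8 0    ; 8 == 1 << 3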

Fixes PR42257.

Based on original patch by @zvi (Zvi Rackover)

Differential Revision: https://reviews.llvm.org/D63382

llvm-svn: 374886
2019-10-15 13:12:44 +00:00
Sanjay Patel 4335d8f0e8 Revert [InstCombine] fold a shifted bool zext to a select
This reverts r374828 (git commit 1f40f15d54) due to bot breakage

llvm-svn: 374851
2019-10-14 23:55:39 +00:00
Sanjay Patel 1f40f15d54 [InstCombine] fold a shifted bool zext to a select
For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)

https://rise4fun.com/Alive/IZ9

Fixes PR42257.

Based on original patch by @zvi (Zvi Rackover)

Differential Revision: https://reviews.llvm.org/D63382

llvm-svn: 374828
2019-10-14 21:56:40 +00:00
Sanjay Patel bfaa1082e1 [InstCombine] add tests for select/shift transforms; NFC
A transform proposal for the shift form is in D63382.

llvm-svn: 374818
2019-10-14 20:28:03 +00:00
Roman Lebedev fb5af8b9b9 [InstCombine] Fold 'icmp eq/ne (?trunc (lshr/ashr %x, bitwidth(x)-1)), 0' -> 'icmp sge/slt %x, 0'
We do indeed already get it right in some cases, but only transitively,
with one-use restrictions. Since we only need to produce a single
comparison, it makes sense to match the pattern directly:
  https://rise4fun.com/Alive/kPg
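
For illustration, at i32 (the trunc to i1 keeps only the shifted-down sign bit):

  %s = lshr i32 %x, 31
  %t = trunc i32 %s to i1
  %r = icmp eq i1 %t, false
  ; -->
  %r = icmp sge i32 %x, 0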

llvm-svn: 373802
2019-10-04 22:16:22 +00:00
Roman Lebedev ae738641d5 [NFC][InstCombine] Autogenerate shift.ll test
llvm-svn: 373800
2019-10-04 22:15:57 +00:00
Sanjay Patel a53ad0e157 Revert r367891 - "[InstCombine] combine mul+shl separated by zext"
This reverts commit 5dbb90bfe1.

As noted in the post-commit thread for r367891, this can create
a multiply that is lowered to a libcall that may not exist.

We need to improve the backend decomposition for integer multiply
before trying to re-land this (if it's still worthwhile after
doing the backend work).

llvm-svn: 369174
2019-08-16 23:36:28 +00:00
Sanjay Patel 5dbb90bfe1 [InstCombine] combine mul+shl separated by zext
This appears to slightly help patterns similar to what's
shown in PR42874:
https://bugs.llvm.org/show_bug.cgi?id=42874
...but not in the way requested.

That fix will require some later IR and/or backend pass to
decompose multiply/shifts into something more optimal per
target. Those transforms already exist in some basic forms,
but probably need enhancing to catch more cases.

https://rise4fun.com/Alive/Qzv2
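
A hedged sketch of the kind of pattern involved (widths, constants, and flags are illustrative, not taken from the patch):

  %m = mul nuw i8 %x, 3
  %z = zext i8 %m to i32
  %r = shl i32 %z, 2
  ; -->
  %zx = zext i8 %x to i32
  %r = mul i32 %zx, 12     ; 3 << 2; the wider mul is what later
                           ; proved problematic to lower (see revert)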

llvm-svn: 367891
2019-08-05 16:59:58 +00:00
Sanjay Patel 4b9d66cf41 [InstCombine] add tests for shl+mul; NFC
llvm-svn: 367883
2019-08-05 16:17:07 +00:00
Sanjay Patel 1a29823b9c [InstCombine] add extra use constraint for shl-zext fold
As the test shows, we can end up with more instructions than
we started with if we don't include the extra-use check.

llvm-svn: 367880
2019-08-05 16:04:07 +00:00
Sanjay Patel d1c5d13470 [InstCombine] add test for shl-zext with extra use; NFC
llvm-svn: 367876
2019-08-05 15:25:07 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Sanjay Patel 5f845732ed [InstSimplify] move tests for shifts; NFC
llvm-svn: 330516
2018-04-21 16:58:00 +00:00
Simon Pilgrim 5d909be91b [InstCombine] Check for out of range ashr values using APInt before calling getZExtValue
Reduced from oss-fuzz #5032 test case

llvm-svn: 322078
2018-01-09 14:23:46 +00:00
Simon Pilgrim 3bf2d64589 [InstCombine] Check for out of range shift values using APInt before calling getZExtValue
Reduced from oss-fuzz #4871 test case

llvm-svn: 321748
2018-01-03 18:28:20 +00:00
Craig Topper 7dd4d32431 Recommit r317510 "[InstCombine] Pull shifts through a select plus binop with constant"
The Hexagon test should be fixed now.

Original commit message:

This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.

This can allow us to get the select closer to other selects to enable removing one.
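
For illustration, a minimal IR sketch (binop and constants chosen arbitrarily):

  %a = add i8 %x, 5
  %sel = select i1 %c, i8 %a, i8 %x
  %r = shl i8 %sel, 1
  ; -->
  %t = shl i8 %x, 1
  %b = add i8 %t, 10       ; 5 << 1
  %r = select i1 %c, i8 %b, i8 %t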

Differential Revision: https://reviews.llvm.org/D39222

llvm-svn: 317600
2017-11-07 18:47:24 +00:00
Hans Wennborg 8c4b10e84a Revert r317510 "[InstCombine] Pull shifts through a select plus binop with constant"
This broke the CodeGen/Hexagon/loop-idiom/pmpy-mod.ll test on a bunch of buildbots.

> This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
>
> This can allow us to get the select closer to other selects to enable removing one.
>
> Differential Revision: https://reviews.llvm.org/D39222
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317510 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-svn: 317518
2017-11-06 22:28:02 +00:00
Craig Topper 8917647333 [InstCombine] Pull shifts through a select plus binop with constant
This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.

This can allow us to get the select closer to other selects to enable removing one.

Differential Revision: https://reviews.llvm.org/D39222

llvm-svn: 317510
2017-11-06 21:07:22 +00:00
Amjad Aboud 0464c5d958 [InstCombine] Added support for (X >>s C) << C --> X & (-1 << C)
Differential Revision: https://reviews.llvm.org/D36743
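
For example, with C = 5 at i8:

  %s = ashr i8 %x, 5
  %r = shl i8 %s, 5
  ; -->
  %r = and i8 %x, -32      ; -32 == -1 << 5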

llvm-svn: 310949
2017-08-15 19:33:14 +00:00
Craig Topper f93b7b1c1f [ValueTracking] Correct early out in computeKnownBitsFromOperator to work with non power of 2 bit widths
There's an early out that's trying to detect when we don't know any bits that make up the legal range of a shift. The code subtracts one from BitWidth, which creates a mask in the lower bits for power-of-2 bit widths. This is then ANDed with the known bits to see if any of those bits are known. If the bit width isn't a power of 2, this creates a nonsensical mask: for example, with a bit width of 12, BitWidth - 1 = 11 = 0b1011, which skips bit 2.

This patch corrects this by rounding up to a power of 2 before doing the subtract and mask.

Differential Revision: https://reviews.llvm.org/D34165

llvm-svn: 305400
2017-06-14 17:04:59 +00:00
Sanjay Patel c9485ca895 [InstCombine] allow shl+shr demanded bits folds with splat constants
llvm-svn: 300911
2017-04-20 22:33:54 +00:00
Sanjay Patel be2dcaf45a [InstCombine] add tests for shl+shr demanded bits splat vector folds; NFC
llvm-svn: 300907
2017-04-20 22:18:47 +00:00
Sanjay Patel 3e1ae72fcf [InstCombine] allow shl demanded bits folds with splat constants
More fixes are needed to enable the helper SimplifyShrShlDemandedBits().

llvm-svn: 300898
2017-04-20 21:33:02 +00:00
Sanjay Patel fb5b3e773a [InstCombine] allow ashr/lshr demanded bits folds with splat constants
llvm-svn: 300888
2017-04-20 20:59:02 +00:00
Sanjay Patel 7e77bed813 [InstCombine] add tests for demanded bits ashr/lshr splat constants; NFC
llvm-svn: 300884
2017-04-20 20:44:54 +00:00
Sanjay Patel 8c5f236197 [InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants
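
For illustration, a splat-vector sketch of the fold (width and constants chosen arbitrarily):

  %s = shl nsw <2 x i8> %x, <i8 5, i8 5>
  %r = ashr <2 x i8> %s, <i8 3, i8 3>
  ; -->
  %r = shl nsw <2 x i8> %x, <i8 2, i8 2>
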
llvm-svn: 293570
2017-01-30 23:35:52 +00:00
Sanjay Patel abbb118a78 [InstCombine] add vector test for (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2); NFC
llvm-svn: 293566
2017-01-30 23:26:17 +00:00
Sanjay Patel 0c39d56a60 [InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants
llvm-svn: 293562
2017-01-30 23:01:05 +00:00
Sanjay Patel 98cc841421 [InstCombine] add tests for more shift-shift patterns; NFC
llvm-svn: 293555
2017-01-30 22:24:36 +00:00