Commit Graph

615 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin e6f6942296 [InstCombine] (~a & b & c) | ~(a | b) -> (c | ~b) & ~a
Transform
```
(~a & b & c) | ~(a | b) -> (c | ~b) & ~a
```
and swapped case
```
(~a | b | c) & ~(a & b) -> (c & ~b) | ~a
```

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %or1 = or i4 %b, %a
  %not1 = xor i4 %or1, 15
  %not2 = xor i4 %a, 15
  %and1 = and i4 %b, %not2
  %and2 = and i4 %and1, %c
  %or2 = or i4 %and2, %not1
  ret i4 %or2
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %notb = xor i4 %b, 15
  %or = or i4 %notb, %c
  %nota = xor i4 %a, 15
  %and = and i4 %or, %nota
  ret i4 %and
}
Transformation seems to be correct!
```

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %b, %a
  %not1 = xor i4 %and1, 15
  %not2 = xor i4 %a, 15
  %or1 = or i4 %b, %not2
  %or2 = or i4 %or1, %c
  %and2 = and i4 %or2, %not1
  ret i4 %and2
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %notb = xor i4 %b, 15
  %and = and i4 %notb, %c
  %nota = xor i4 %a, 15
  %or = or i4 %and, %nota
  ret i4 %or
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D113037
2021-12-15 09:36:59 -08:00
Sanjay Patel 1a60ae02c6 [InstCombine] fold mask-with-signbit-splat to icmp+select
~(iN X s>> (N-1)) & Y --> (X s< 0) ? 0 : Y

https://alive2.llvm.org/ce/z/JKlQ9x

This is similar to D111410 / 727e642e97 ,
but it includes a 'not' of the signbit and so it
saves an instruction in the basic pattern.

DAGCombiner or target-specific folds can expand
this back into bit-hacks.

The diffs in the logical-select tests are not true
regressions - running early-cse and another round
of instcombine is expected in a normal opt pipeline,
and that reduces back to a minimal form as shown
in the duplicated PhaseOrdering test.

I have no understanding of the SystemZ diffs, so
I made the minimal edits suggested by FileCheck to
make that test pass again. That whole test file is
wrong though. It is running the entire optimizer (-O2)
to check IR, and then topping that by even running
codegen and checking asm. It needs to be split up.

Fixes #52631
2021-12-14 16:00:42 -05:00
Stanislav Mekhanoshin 06ca0a2733 [InstCombine] (~a & b & c) | ~(a | b | c) -> ~(a | (b ^ c))
Transform
```
(~a & b & c) | ~(a | b | c) -> ~(a | (b ^ c))
```
And swapped case:
```
(~a | b | c) & ~(a & b & c) -> ~a | (b ^ c)
```

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %or1 = or i4 %b, %a
  %or2 = or i4 %or1, %c
  %not1 = xor i4 %or2, 15
  %not2 = xor i4 %a, 15
  %and1 = and i4 %b, %not2
  %and2 = and i4 %and1, %c
  %or3 = or i4 %and2, %not1
  ret i4 %or3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %1 = xor i4 %c, %b
  %2 = or i4 %1, %a
  %or3 = xor i4 %2, 15
  ret i4 %or3
}
Transformation seems to be correct!
```
```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %b, %a
  %and2 = and i4 %and1, %c
  %not1 = xor i4 %and2, 15
  %not2 = xor i4 %a, 15
  %or1 = or i4 %not2, %b
  %or2 = or i4 %or1, %c
  %and3 = and i4 %or2, %not1
  ret i4 %and3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %xor = xor i4 %b, %c
  %not = xor i4 %a, 15
  %or = or i4 %xor, %not
  ret i4 %or
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D112966
2021-12-09 09:48:55 -08:00
Sanjay Patel 99f8b795cc [InstCombine] try to fold 'or' into 'mul' operand
or (mul X, Y), X --> mul X, (add Y, 1) (when the multiply has no common bits with X)

We already have this fold if the pattern ends in 'add', but we can miss it if the
'add' becomes 'or' via another no-common-bits transform.

This is part of fixing:
http://llvm.org/PR49055
...but it won't make a difference on that example yet.

https://alive2.llvm.org/ce/z/Vrmoeb

Differential Revision: https://reviews.llvm.org/D114729
2021-11-29 17:03:08 -05:00
Stanislav Mekhanoshin 5c6b9e1622 [InstCombine] (~(a | b) & c) | ~(c | (a ^ b)) -> ~((a | b) & (c | (b ^ a)))
```
----------------------------------------
define i3 @src(i3 %a, i3 %b, i3 %c) {
%0:
  %or1 = or i3 %b, %c
  %not1 = xor i3 %or1, 7
  %and1 = and i3 %a, %not1
  %xor1 = xor i3 %b, %c
  %or2 = or i3 %xor1, %a
  %not2 = xor i3 %or2, 7
  %or3 = or i3 %and1, %not2
  ret i3 %or3
}
=>
define i3 @tgt(i3 %a, i3 %b, i3 %c) {
%0:
  %obc = or i3 %b, %c
  %xbc = xor i3 %b, %c
  %o = or i3 %a, %xbc
  %and = and i3 %obc, %o
  %r = xor i3 %and, 7
  ret i3 %r
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D112955
2021-11-29 11:20:34 -08:00
Mehrnoosh Heidarpour c572eb1ad9 [InstCombine] Fold (~A | B) ^ A --> ~(A & B)
https://alive2.llvm.org/ce/z/gLrYPk

Fixes:
https://llvm.org/PR52518

Reviewed by: spatel

Differential revision: https://reviews.llvm.org/D114339
2021-11-29 11:29:21 -05:00
Sanjay Patel 97755ab1c6 [InstCombine] reduce code duplication; NFC 2021-11-28 09:27:20 -05:00
Stanislav Mekhanoshin 9300b133c8 Revert "[InstCombine] (~(a | b) & c) | ~(c | (a ^ b)) -> ~((a | b) & (c | (b ^ a)))"
This reverts commit c407769f5e.
2021-11-24 11:14:52 -08:00
Sanjay Patel 430ad9697d [InstCombine] enhance bitwise select matching
I noticed that adding a seemingly unrelated fold for xor caused
regressions on similar patterns, and this is one of the
underlying causes.

This could also be a variation for code as seen in:
https://llvm.org/PR34047
...although that exact example should be fixed after:
D113035 / c36b7e21bd

The vector test shows that we are actually missing a potential
canonicalization for bitcast-of-sext-of-not or the inverse.
The scalar test shows that even if we had that canonicalization,
it would still be possible to see this pattern due to extra uses.

https://alive2.llvm.org/ce/z/y2BAgi
2021-11-23 09:57:44 -05:00
Stanislav Mekhanoshin c407769f5e [InstCombine] (~(a | b) & c) | ~(c | (a ^ b)) -> ~((a | b) & (c | (b ^ a)))
Transform
```
(~(a | b) & c) | ~(c | (a ^ b)) -> ~((a | b) & (c | (b ^ a)))
```
And swapped case:
```
(a | ~(b & c)) & ~(a & (b ^ c)) --> ~(a | b) | (a ^ b ^ c)
```

```
----------------------------------------
define i3 @src(i3 %a, i3 %b, i3 %c) {
%0:
  %or1 = or i3 %b, %c
  %not1 = xor i3 %or1, 7
  %and1 = and i3 %a, %not1
  %xor1 = xor i3 %b, %c
  %or2 = or i3 %xor1, %a
  %not2 = xor i3 %or2, 7
  %or3 = or i3 %and1, %not2
  ret i3 %or3
}
=>
define i3 @tgt(i3 %a, i3 %b, i3 %c) {
%0:
  %obc = or i3 %b, %c
  %xbc = xor i3 %b, %c
  %o = or i3 %a, %xbc
  %and = and i3 %obc, %o
  %r = xor i3 %and, 7
  ret i3 %r
}
Transformation seems to be correct!
```
```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %b, %c
  %not1 = xor i4 %and1, 15
  %or1 = or i4 %not1, %a
  %xor1 = xor i4 %b, %c
  %and2 = and i4 %xor1, %a
  %not2 = xor i4 %and2, 15
  %and3 = and i4 %or1, %not2
  ret i4 %and3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %xor1 = xor i4 %b, %c
  %xor2 = xor i4 %xor1, %a
  %or1 = or i4 %a, %b
  %not1 = xor i4 %or1, 15
  %or2 = or i4 %xor2, %not1
  ret i4 %or2
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D112955
2021-11-22 10:49:21 -08:00
Sanjay Patel 337948ac6e [InstCombine] add folds for binop with sexted bool and constant operands
This is a generalization/extension of the existing and/or
folds noted with TODO comments. Those have a one-use
constraint that is not necessary.

Potential follow-ups are noted by the TODO comments in
the new function. We can also call this function from
other binop visit* functions, but we need to add tests
first.

This solves:
https://llvm.org/PR52543

https://alive2.llvm.org/ce/z/NWuCR5
2021-11-20 12:33:00 -05:00
Stanislav Mekhanoshin 6d3db28088 [InstCombine] Generalize complex OR patterns to AND
For every pattern with only NOT, OR, and AND operations there is
always a symmetrical attern with AND and OR swapped.

This adds 2 transformations: https://reviews.llvm.org/D113526

```
(~(a & b) | c) & (~(a & c) | b) --> ~((b ^ c) & a)
(~(a & b) | c) & ~(a & c) --> ~((b | c) & a)
```

```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %b, %a
  %not1 = xor i4 %and1, 15
  %and2 = and i4 %a, %c
  %not2 = xor i4 %and2, 15
  %or = or i4 %not2, %b
  %r = and i4 %or, %not1
  ret i4 %r
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %or = or i4 %b, %c
  %and = and i4 %or, %a
  %r = xor i4 %and, 15
  ret i4 %r
}
Transformation seems to be correct!

----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %a, %b
  %not1 = xor i4 %and1, 15
  %or1 = or i4 %not1, %c
  %and2 = and i4 %a, %c
  %not2 = xor i4 %and2, 15
  %or2 = or i4 %not2, %b
  %and3 = and i4 %or1, %or2
  ret i4 %and3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %xor = xor i4 %b, %c
  %and = and i4 %xor, %a
  %not = xor i4 %and, 15
  ret i4 %not
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D113526
2021-11-17 10:47:36 -08:00
Stanislav Mekhanoshin e785f4ab6a [PatternMatch] Add m_BinOp/m_c_BinOp with specific opcode
Differential Revision: https://reviews.llvm.org/D113508
2021-11-15 11:24:27 -08:00
Mehrnoosh Heidarpour 7daa95c8fa [InstCombine] Fold (A^B)|~A-->~(A&B)
https://alive2.llvm.org/ce/z/2v6rhF

Fixes:
https://llvm.org/PR52478

Differential Revision: https://reviews.llvm.org/D113783
2021-11-15 12:29:37 -05:00
Nikita Popov 986416251b [InstCombine] Drop redundant fold for and/or of icmp eq/ne (NFCI)
This handles a special case of foldAndOrOfICmpsUsingRanges()
with two equality predicates.
2021-11-11 20:25:40 +01:00
Nikita Popov 84e273cced [InstCombine] Handle undefs in and of icmp eq zero fold
For the scalar/splat case, this fold is subsumed by
foldLogOpOfMaskedICmps(). However, the conjugated fold for "or"
also supports splats with undef. Make both code paths consistent
by using m_ZeroInt() for the "and" implementation as well.

https://alive2.llvm.org/ce/z/tN63cu
https://alive2.llvm.org/ce/z/ufB_Ue
2021-11-11 19:07:07 +01:00
Nikita Popov 0242a6adf7 [InstCombine] Support splat vectors in some or of icmp folds
Replace m_ConstantInt() with m_APInt() in order to support splat
constants in addition to scalar integers.
2021-11-10 22:59:09 +01:00
Nikita Popov 861adaf2ad [InstCombine] Support splat vectors in some and of icmp folds
Replace m_ConstantInt() with m_APInt() to support splat vectors
in addition to scalar integers.
2021-11-10 22:37:54 +01:00
Nikita Popov 58ebc79a64 [InstCombine] Strip offset when folding and/or of icmps
When folding and/or of icmps, look through add of a constant and
adjust the icmp range instead. Effectively, this decomposes
X + C1 < C2 style range checks back into a normal range. This allows
us to fold comparisons involving two range checks or one range check
and some other condition. We had a fold for a really specific case
of this (or of range check and eq, and only one one side!) while
this handles it in fully generality.

Differential Revision: https://reviews.llvm.org/D113510
2021-11-10 22:01:52 +01:00
Stanislav Mekhanoshin 5731381594 [InstCombine] Relax and reorganize one use checks in the ~(a | b) & c
Since there is just a single check for LHS in ~(A | B) & C | ...
transforms and multiple RHS checks inside with more coming I am
removing m_OneUse checks for LHS and adding new checks for RHS.
This is non essential as long as there is total benefit.

In addition (~(A | B) & C) | (~(A | C) & B) --> (B ^ C) & ~A
checks were overly restrictive, it should be good without any
additional checks.

Differential Revision: https://reviews.llvm.org/D113141
2021-11-10 10:14:12 -08:00
Sanjay Patel 67299aa84f [InstCombine] add check for integer source type from cast to prevent crash
A problem was noted in the post-commit review for
c36b7e21bd / D113035 :

If the source type is not integer or integer vector,
then we could crash when trying to ComputeNumSignBits().
2021-11-10 09:44:55 -05:00
Nikita Popov 0aabdad1ef [InstCombine] Combine code for and/or of icmps (NFC)
The implementation for and/or is the same, apart from the choice
of exactIntersectWith() vs exactUnionWith(). Extract a common
function to make future extension easier.
2021-11-09 21:18:31 +01:00
Nikita Popov bb12dedede [InstCombine] Refactor and/or of icmp with constant (NFCI)
Rather than testing for many specific combinations of predicates
and values, compute the exact icmp regions for both comparisons
and check whether they union/intersect exactly. If they do,
construct the equivalent icmp for the new range. Assuming that the
existing code handled all possible cases, this should be NFC.

Differential Revision: https://reviews.llvm.org/D113367
2021-11-09 21:05:46 +01:00
Stanislav Mekhanoshin 791baf38e1 [InstCombine] Fuse checks for LHS (~(A | B) & C) | ... NFC.
Differential Revision: https://reviews.llvm.org/D113132
2021-11-09 11:31:22 -08:00
Sanjay Patel c36b7e21bd [InstCombine] enhance vector bitwise select matching
(Cond & C) | (~bitcast(Cond) & D) --> bitcast (select Cond, (bc C), (bc D))

This is part of fixing:
https://llvm.org/PR34047

That report shows a case where a bitcast is sitting between the select condition
candidate and its 'not' value due to current cast canonicalization rules.

There's a bitcast type restriction that might be violated in existing matching,
but I still need to investigate if that is possible -
Alive2 shows we can only do this transform safely when the bitcast is from
narrow to wide vector elements (otherwise poison could leak into elements
that were safe in the original code):
https://alive2.llvm.org/ce/z/Hf66qh

Differential Revision: https://reviews.llvm.org/D113035
2021-11-09 08:54:59 -05:00
David Green 1e5f814302 [InstCombine] Fix infinite recursion in ashr/xor vector fold.
The added test has poison lanes due to the vector shuffle. This can
cause an infinite loop of combines in instcombine where it folds
xor(ashr, -1) -> select (icmp slt 0), -1, 0 -> sext (icmp slt 0) -> xor(ashr, -1).
We usually prevent this by checking that the xor constant is not -1,
but with vectors some of the lanes may be -1, some may be poison. So
this changes the way we detect that from "!C1->isAllOnesValue()" to
"!match(C1, m_AllOnes())", which is more able to detect that some of the
lanes are poison.

Fixes PR52397
2021-11-04 09:24:27 +00:00
Sanjay Patel 829146164f [InstCombine] change 'not' match for bitwise select
The tests diffs are logically equivalent, and so this is
generally NFC, but this makes the code match the code
comment.

It should also be more efficient. If we choose the 'not'
operand (rather than the 'not' instruction) as the select
condition, then we don't have to invert the select
condition/operands as a subsequent transform.
2021-11-02 10:16:01 -04:00
Sanjay Patel 42c94bc1ab [InstCombine] allow vector splat matching for bitwise logic fold
Similar to 54e969cffd (and with cosmetic updates to hopefully
make that easier to read), this fold has been around since early
in LLVM history.

Intermediate folds have been added subsequently, so extra uses
are required to exercise this code.

The test example actually shows an unintended consequence with
extra uses - we end up with an extra instruction compared to what
we started with. But this at least makes scalar/vector consistent.

General proof:
https://alive2.llvm.org/ce/z/tmuBza
2021-11-01 11:39:48 -04:00
Sanjay Patel 54e969cffd [InstCombine] allow vector splat matching for bitwise logic folds
This fold was added long ago (part of fixing PR4216),
and it matched scalars only. Intermediate folds have
been added subsequently, so extra uses are required
to exercise this code.

General proof:
https://alive2.llvm.org/ce/z/G6BBhB

One of the specific tests:
https://alive2.llvm.org/ce/z/t0JhEB
2021-11-01 08:26:42 -04:00
Sanjay Patel 511ee8759f [InstCombine] reduce code duplication with commutative matcher; NFC 2021-11-01 08:26:41 -04:00
Sanjay Patel 8f786b4618 [InstCombine] fix comments to match code; NFC 2021-10-29 15:48:35 -04:00
Sanjay Patel d0e9879d96 [InstCombine] allow vector splat matching for bitwise logic folds
These transforms are also likely missing a one-use check,
but that's another patch.
2021-10-29 14:22:50 -04:00
Stanislav Mekhanoshin a905c54b76 [InstCombine] Fold `(~(a | b) & c) | ~(a | c)` into `~((b & c) | a)`
```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
  %or1 = or i4 %b, %a
  %not1 = xor i4 %or1, -1
  %or2 = or i4 %a, %c
  %not2 = xor i4 %or2, -1
  %and = and i4 %not2, %b
  %or3 = or i4 %and, %not1
  ret i4 %or3
}

define i4 @tgt(i4 %a, i4 %b, i4 %c) {
  %and = and i4 %c, %b
  %or = or i4 %and, %a
  %or3 = xor i4 %or, -1
  ret i4 %or3
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D112338
2021-10-29 10:58:09 -07:00
David Green 9020e22a87 [InstCombine] Convert xor (ashr X, BW-1), C -> select(X >=s 0, C, ~C)
The sequence of instructions `xor (ashr X, BW-1), C` (or with a truncation
`xor (trunc (ashr X, BW-1)), C)` takes a value, produces all zeros or all
ones and with it optionally inverts a constant depending on whether the
original input was positive or negative. This is the same as checking if
the value is positive, and selecting between the constant and ~constant.
https://alive2.llvm.org/ce/z/NJ85qY

This is a fairly general version of a fold that helps pull saturating
arithmetic into a canonical form.

Differential Revision: https://reviews.llvm.org/D109151
2021-10-29 11:19:20 +01:00
Stanislav Mekhanoshin f7f430c913 [InstCombine] Fixed non-determinisctic order of new instructions
Fixes non-determinisctic order of XOR instructions created after
5a7a458306. The order of call argument evaluation is not
defined, so create one Value before the call.
2021-10-28 12:14:02 -07:00
Stanislav Mekhanoshin 5a7a458306 [InstCombine] Fold `(c & ~(a | b)) | (b & ~(a | c))` to `~a & (b ^ c)`
```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %or1 = or i4 %a, %b
  %not1 = xor i4 %or1, 15
  %and1 = and i4 %not1, %c
  %or2 = or i4 %a, %c
  %not2 = xor i4 %or2, 15
  %and2 = and i4 %not2, %b
  %or3 = or i4 %and1, %and2
  ret i4 %or3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %xor = xor i4 %b, %c
  %not = xor i4 %a, 15
  %or3 = and i4 %xor, %not
  ret i4 %or3
}
Transformation seems to be correct!
```

Differential Revision: https://reviews.llvm.org/D112276
2021-10-28 11:54:30 -07:00
Sanjay Patel 3888de9507 [InstCombine] generalize reassociated Demorgan folds
This updates the recent D112108 / b92412fb28
to handle the flipped logic ('or') sibling:
https://alive2.llvm.org/ce/z/Y2L6Ch
2021-10-21 10:39:29 -04:00
Stanislav Mekhanoshin b92412fb28 [InstCombine] Fold `(a & ~b) & ~c` to `a & ~(b | c)`
%not1 = xor i32 %b, -1
  %not2 = xor i32 %c, -1
  %and1 = and i32 %a, %not1
  %and2 = and i32 %and1, %not2
=>
  %i1 = or i32 %b, %c
  %i2 = xor i32 %1, -1
  %and2 = and i32 %i2, %a

Differential Revision: https://reviews.llvm.org/D112108
2021-10-20 13:05:46 -07:00
Sanjay Patel a49f5386ce [InstCombine] generalize fold for mask-with-signbit-splat, part 2
This removes an over-specified fold. The more general transform
was added with:
727e642e97

There's a difference on an existing test that shows a potentially
unnecessary use limit on an icmp fold.

That fold is in InstCombinerImpl::foldICmpSubConstant(), and IIRC
there was some back-and-forth on it and similar folds because they
could cause analysis/passes (SCEV, LSR?) to miss optimizations.

Differential Revision: https://reviews.llvm.org/D111410
2021-10-15 17:11:29 -04:00
Sanjay Patel 727e642e97 [InstCombine] generalize fold for mask-with-signbit-splat
(iN X s>> (N-1)) & Y --> (X < 0) ? Y : 0

https://alive2.llvm.org/ce/z/qeYhdz

I was looking at a missing abs() transform and found my way to this
generalization of an existing fold that was added with D67799.
As discussed in that review, we want to make sure codegen handles
this difference well, and for all of the targets/types that I
spot-checked, it looks good.

I am leaving the existing fold in place in this commit because
it covers a potentially missing icmp fold, but I plan to remove
that as a follow-up commit as suggested during review.

Differential Revision: https://reviews.llvm.org/D111410
2021-10-15 16:25:48 -04:00
Sanjay Patel 905d170803 [InstCombine] allow matching vector splat constants in foldLogOpOfMaskedICmps()
This is NFC-intended for scalar code. There are still unnecessary
m_ConstantInt restrictions in surrounding code, so this is not a
complete fix.

This prevents regressions seen with a planned follow-on to D111410.
2021-10-13 10:15:26 -04:00
Sanjay Patel bc72baa047 [InstCombine] add folds for logical nand/nor
This is noted as a regression in:
https://llvm.org/PR52077
2021-10-05 18:31:20 -04:00
Sanjay Patel 668beb8ae8 [InstCombine] refactor folds of 'not' instructions; NFC
This removes repeated calls to m_Not, so hopefully a little
more efficient.

Also, we may need to enhance some of these blocks to allow
logical and/or (select of bools).
2021-10-05 16:36:57 -04:00
Jay Foad a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Sanjay Patel 41ff7612b3 [InstCombine] allow splat vectors for narrowing masked fold
Mostly cosmetic diffs, but the use of m_APInt matches splat constants.
2021-09-17 11:24:16 -04:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Nikita Popov fafe5a6f44 [InstCombine] Perform "eq of parts" fold with logical ops
The pattern matched here is too complex for the general logical
and/or to bitwise and/or conversion to trigger. However, the
fold is poison-safe, so match it with a select root as well:

https://alive2.llvm.org/ce/z/vNzzSg
https://alive2.llvm.org/ce/z/Beyumt
2021-08-22 16:55:53 +02:00
Sanjay Patel eee0ded337 [InstCombine] add min/max intrinsics as freely invertible candidates
In the optimized test, we are able to peak through the
min/max that has 2 min/max operands and invert them all:
https://alive2.llvm.org/ce/z/7gYMN5
2021-08-19 08:41:38 -04:00
Sanjay Patel e10c3beca5 [InstCombine] add one-use check for min/max fold with not operands; NFC
This makes the intrinsic logic match the cmp+select idiom folds
just below. It's not clearly a win either way unless we think
that a 'not' op costs more than min/max.

The cmp+select folds on these patterns are more extensive than
the intrinsics currently and may have some complicated interactions,
so I'm trying to make those line up and bring the optimizations
for intrinsics up to parity.
2021-08-19 08:41:38 -04:00
Simon Pilgrim afc6b09dee [InstCombine] getMaskedTypeForICmpPair - remove dead code. NFCI.
Ok should be true at this point, so the early-out is dead - replace with an assert.
2021-07-30 19:23:05 +01:00
Simon Pilgrim 401d6685c0 [InstCombine] InstCombinerImpl::visitOr - enable bitreverse matching
Currently we only match bswap intrinsics from or(shl(),lshr()) style patterns when we could often match bitreverse intrinsics almost as cheaply.

Differential Revision: https://reviews.llvm.org/D90170
2021-05-15 13:39:09 +01:00
Nikita Popov a8f7dee1df [InstCombine] Support one-hot merge for logical and/or
If a logical and/or is used, we need to be careful not to propagate
a potential poison value from the RHS by inserting a freeze
instruction. Otherwise it works the same way as bitwise and/or.

This is intended to address the regression reported at
https://reviews.llvm.org/D101191#2751002.

Differential Revision: https://reviews.llvm.org/D102279
2021-05-12 21:01:18 +02:00
Roman Lebedev 554b1bced3
[InstCombine] ~(C + X) --> ~C - X (PR50308)
We can not rely on (C+X)-->(X+C) already happening,
because we might not have visited that `add` yet.
The added testcase would get stuck in an endless combine loop.
2021-05-12 16:10:55 +03:00
Nikita Popov 1556540372 [InstCombine] Clean up one-hot merge optimization (NFC)
Remove the requirement that the instruction is a BinaryOperator,
make the predicate check more compact and use slightly more
meaningful naming for the and operands.
2021-05-11 23:22:11 +02:00
Nikita Popov 463ea28e96 [InstCombine] Fold comparison of integers by parts
Let's say you represent (i32, i32) as an i64 from which the parts
are extracted with lshr/trunc. Then, if you compare two tuples by
parts you get something like A[0] == B[0] && A[1] == B[1], just
that the part extraction happens by lshr/trunc and not a narrow
load or similar.

The fold implemented here reduces such equality comparisons by
converting them into a comparison on a larger part of the integer
(which might be the whole integer). It handles both the "and of eq"
and the conjugated "or of ne" case.

I'm being conservative with one-use for now, though this could be
relaxed if profitable (the base pattern converts 11 instructions
into 5 instructions, but there's quite a few variations on how it
can play out).

Differential Revision: https://reviews.llvm.org/D101232
2021-05-10 22:22:39 +02:00
Juneyoung Lee 24ce194cfe [InstCombine] generalize select + select/and/or folding using implied conditions
This patch optimizes the remaining possible cases in D101191 by generalizing isImpliedCondition()-based
foldings.

Assume that there is `op a, (select b, _, _)` where op is one of `and i1`, `or i1` or their select forms.

We can do the following optimization based on the result of `isImpliedCondition(a, b)`:

If a = true implies…
- b = true:
    - select a, (select b, A, B), false => select a, A, false : https://alive2.llvm.org/ce/z/WCnZYh
    - and a, (select b, A, B) => select a, A, false : https://alive2.llvm.org/ce/z/uZhcMG
- b = false:
    - select a, (select b, A, B), false => select a, B, false : https://alive2.llvm.org/ce/z/c2hJpV
    - and a, (select b, A, B) => select a, B, false : https://alive2.llvm.org/ce/z/5ggwMM

If a = false implies…
- b = true:
    - select a, true, (select b, A, B) => select a, true, A : https://alive2.llvm.org/ce/z/tidKvH
    - or a, (select b, A, B) =>  select a, true, A : https://alive2.llvm.org/ce/z/cC-uyb
- b = false:
    - select a, true, (select b, A, B) => select a, true, B : https://alive2.llvm.org/ce/z/ZXpJq9
    - or a, (select b, A, B) => select a, true, B : https://alive2.llvm.org/ce/z/hnDrJj

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101720
2021-05-04 09:42:06 +09:00
Roman Lebedev 91248e2db9
[InstCombine] Improve "get low bit mask upto and including bit X" pattern
https://alive2.llvm.org/ce/z/3u-48R
2021-04-11 18:08:08 +03:00
Sanjay Patel c0bbd0cc35 [InstCombine] fold not ops around min/max intrinsics
This is another step towards parity with the existing
cmp+select folds (see D98152).
2021-04-07 17:31:36 -04:00
Sanjay Patel 0333ed8e0c [InstCombine] move abs transform to helper function; NFC
The swap of the operands can affect later transforms that
are expecting a constant as operand 1. I don't think we
can trigger a bug with the current code, but I hit that
problem while drafting a new transform for min/max intrinsics.
2021-04-07 08:35:07 -04:00
Philip Reames 4bf8985f4f Replace calls to IntrinsicInst::Create with CallInst::Create [nfc]
There is no IntrinsicInst::Create.  These are binding to the method in the super type.  Be explicitly about which method is being called.
2021-04-06 13:23:58 -07:00
Sanjay Patel 412fc74140 [InstCombine] fold not+or+neg
~((-X) | Y) --> (X - 1) & (~Y)

We generally prefer 'add' over 'sub', this reduces the
dependency chain, and this looks better for codegen on
x86, ARM, and AArch64 targets.

https://llvm.org/PR45755

https://alive2.llvm.org/ce/z/cxZDSp
2021-04-02 13:16:36 -04:00
Philip Reames ebc61f9d3c [instcombine] Collapse trivial or recurrences
If we have a recurrence of the form <Start, Or, Step> we know that the value taken by the recurrence stabilizes on the first iteration (provided step is loop invariant). We can exploit that fact to remove the loop carried dependence in the recurrence.

Differential Revision: https://reviews.llvm.org/D97578 (or part)
2021-03-08 09:21:38 -08:00
Philip Reames 239a618180 [instcombine] Collapse trivial and recurrences
If we have a recurrence of the form <Start, And, Step> we know that the value taken by the recurrence stabilizes on the first iteration (provided step is loop invariant). We can exploit that fact to remove the loop carried dependence in the recurrence.

Differential Revision: https://reviews.llvm.org/D97578 (and part)
2021-03-08 09:21:38 -08:00
Simon Pilgrim 609d0c9772 [InstCombine] matchBSwapOrBitReverse - remove pattern matching early-out. NFCI.
recognizeBSwapOrBitReverseIdiom + collectBitParts have pattern matching to bail out early if a bswap/bitreverse pattern isn't possible - we should be able to rely on this instead without any notable change in compile time.

This is part of a cleanup towards letting matchBSwapOrBitReverse /recognizeBSwapOrBitReverseIdiom use 'root' instructions that aren't ORs (FSHL/FSHRs in particular which can be prematurely created).

Differential Revision: https://reviews.llvm.org/D97056
2021-02-20 13:15:34 +00:00
Roman Lebedev d1a6f92fd5
[InstCombine] Fold `(~x) | y` --> `~(x & (~y))` iff it is free to do so
Iff we know we can get rid of the inversions in the new pattern,
we can thus get rid of the inversion in the old pattern,
this decreasing instruction count.

Note that we could position this transformation as just hoisting
of the `not` (still, iff y is freely negatible), but the test changes
show a number of regressions, so let's not do that.
2021-01-22 17:23:54 +03:00
Roman Lebedev 79b0d21ce9
[InstCombine] Fold `(~x) & y` --> `~(x | (~y))` iff it is free to do so
Iff we know we can get rid of the inversions in the new pattern,
we can thus get rid of the inversion in the old pattern,
this decreasing instruction count.
2021-01-22 17:23:54 +03:00
Juneyoung Lee 2d89ebd5d1 Address unused variable warning 2021-01-19 09:30:16 +09:00
Juneyoung Lee 0441df94ad [InstCombine,InstSimplify] Optimize select followed by and/or/xor
This patch adds `A & (A && B)` -> `A && B`  (similarly for or + logical or)

Also, this patch adds `~(select C, (icmp pred X, Y), const)` -> `select C, (icmp pred' X, Y), ~const`.

Alive2 proof:
merge_and: https://alive2.llvm.org/ce/z/teMR97
merge_or: https://alive2.llvm.org/ce/z/b4yZUp
xor_and: https://alive2.llvm.org/ce/z/_-TXHi
xor_or: https://alive2.llvm.org/ce/z/2uYx_a

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94861
2021-01-19 09:14:17 +09:00
Dávid Bolvanský 0529946b5b [instCombine] Add (A ^ B) | ~(A | B) -> ~(A & B)
define i32 @src(i32 %x, i32 %y) {
%0:
  %xor = xor i32 %y, %x
  %or = or i32 %y, %x
  %neg = xor i32 %or, 4294967295
  %or1 = or i32 %xor, %neg
  ret i32 %or1
}
=>
define i32 @tgt(i32 %x, i32 %y) {
%0:
  %and = and i32 %x, %y
  %neg = xor i32 %and, 4294967295
  ret i32 %neg
}
Transformation seems to be correct!

https://alive2.llvm.org/ce/z/Cvca4a
2021-01-12 19:29:17 +01:00
Roman Lebedev 374ef57f13
[InstCombine] 'hoist xor-by-constant from xor-by-value': completely give up on constant exprs
As Mikael Holmén is noting in the post-commit review for the first fix
https://reviews.llvm.org/rGd4ccef38d0bb#967466
not hoisting constantexprs is not enough,
because if the xor originally was a constantexpr (i.e. X is a constantexpr).
`SimplifyAssociativeOrCommutative()` in `visitXor()` will immediately
undo this transform, thus again causing an infinite combine loop.

This transform has resulted in a surprising number of constantexpr failures.
2020-12-29 16:28:18 +03:00
Roman Lebedev d4ccef38d0
[InstCombine] 'hoist xor-by-constant from xor-by-value': ignore constantexprs
As it is being reported (in post-commit review) in
https://reviews.llvm.org/D93857
this fold (as i expected, but failed to come up with test coverage
despite trying) has issues with constant expressions.
Since we only care about true constants, which constantexprs are not,
don't perform such hoisting for constant expressions.
2020-12-28 20:15:20 +03:00
Roman Lebedev d9ebaeeb46
[InstCombine] Hoist xor-by-constant from xor-by-value
This is one of the deficiencies that can be observed in
https://godbolt.org/z/YPczsG after D91038 patch set.

This exposed two missing folds, one was fixed by the previous commit,
another one is `(A ^ B) | ~(A ^ B) --> -1` / `(A ^ B) & ~(A ^ B) --> 0`.

`-early-cse` will catch it: https://godbolt.org/z/4n1T1v,
but isn't meaningful to fix it in InstCombine,
because we'd need to essentially do our own CSE,
and we can't even rely on `Instruction::isIdenticalTo()`,
because there are no guarantees that the order of operands matches.
So let's just accept it as a loss.
2020-12-24 21:20:50 +03:00
Roman Lebedev 5b78303433
[InstCombine] Fold `a & ~(a ^ b)` to `x & y`
```
----------------------------------------
define i32 @and_xor_not_common_op(i32 %a, i32 %b) {
%0:
  %b2 = xor i32 %b, 4294967295
  %t2 = xor i32 %a, %b2
  %t4 = and i32 %t2, %a
  ret i32 %t4
}
=>
define i32 @and_xor_not_common_op(i32 %a, i32 %b) {
%0:
  %t4 = and i32 %a, %b
  ret i32 %t4
}
Transformation seems to be correct!
```
2020-12-24 21:20:49 +03:00
Roman Lebedev a91e96702a
[InstCombine] Fold `and(shl(zext(x), width(SIGNMASK) - width(%x)), SIGNMASK)` to `and(sext(%x), SIGNMASK)`
One less instruction and reducing use count of zext.
As alive2 confirms, we're fine with all the weird combinations of
undef elts in constants, but unless the shift amount was undef
for a lane, we must sanitize undef mask to zero, since sign bits
are no longer zeros.

https://rise4fun.com/Alive/d7r
```
----------------------------------------
Optimization: zz
Precondition: ((C1 == (width(%r) - width(%x))) && isSignBit(C2))
  %o0 = zext %x
  %o1 = shl %o0, C1
  %r = and %o1, C2
=>
  %n0 = sext %x
  %r = and %n0, C2

Done: 2016
Optimization is correct!
```
2020-11-20 00:31:27 +03:00
Sanjay Patel 4a66a1d17a [InstCombine] allow vectors for masked-add -> xor fold
https://rise4fun.com/Alive/I4Ge

  Name: add with pow2 mask
  Pre: isPowerOf2(C2) && (C1 & C2) != 0 && (C1 & (C2-1)) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %n = and i8 %x, C2
  %r = xor i8 %n, C2
2020-11-17 13:36:08 -05:00
Simon Pilgrim f7ebdec987 [InstCombine] visitAnd - remove unnecessary Value *X, *Y shadow variables. NFCI.
Fixes a number of Wshadow warnings.
2020-11-17 17:59:21 +00:00
Simon Pilgrim abf29d9862 [InstCombine] visitAnd - use m_SpecificInt instead of m_APInt + comparison. NFCI.
m_SpecificInt has the same 'no undef element' behaviour as m_APInt so no change there, and anyway we have test coverage for undef elements in the fold.

Noticed while fixing a Wshadow warning about shadow Value *X, *Y variables.
2020-11-17 17:37:10 +00:00
Sanjay Patel f791ad7e1e [InstCombine] remove scalar constraint for mask-of-add fold
https://rise4fun.com/Alive/V6fP

  Name: add with low mask
  Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %r = and i8 %x, C2
2020-11-17 12:13:45 -05:00
Sanjay Patel 433696911a [InstCombine] relax constraints on mask-of-add
There are 2 changes:
1. Remove the unnecessary one-use check.
2. Remove the unnecessary power-of-2 check.

https://rise4fun.com/Alive/V6fP

  Name: add with low mask
  Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %r = and i8 %x, C2
2020-11-17 12:13:44 -05:00
Sanjay Patel 6ddc237766 [InstCombine] reduce code for flip of masked bit; NFC
There are 1-2 potential follow-up NFC commits to reduce
this further on the way to generalizing this for vectors.

The operand replacing path should be dead code because demanded
bits handles that more generally (D91415).
2020-11-15 15:43:34 -05:00
Simon Pilgrim 6b2eb31e1e [InstCombine] Add support for zext(and(neg(amt),width-1)) rotate shift amount patterns
Alive2: https://alive2.llvm.org/ce/z/bCvvHd
2020-10-26 11:22:41 +00:00
Simon Pilgrim 3052e474ec [InstCombine] matchBSwapOrBitReversem - recognise or(fshl(),fshl()) bswap patterns.
I'm not certain InstCombinerImpl::matchBSwapOrBitReverse needs to filter the or(op0(),op1()) ops - there are just too many cases that recognizeBSwapOrBitReverseIdiom/collectBitParts handle now (and quickly).
2020-10-25 10:17:45 +00:00
Simon Pilgrim 1cab3bf004 [InstCombine] matchBSwapOrBitReverse - expose bswap/bitreverse matching flags.
matchBSwapOrBitReverse was hardcoded to just match bswaps - we're going to need to expose the ability to match bitreverse as well, so make this part of the function call.
2020-10-23 12:35:28 +01:00
Simon Pilgrim 19a13bf538 [InstCombine] Rename InstCombinerImpl::matchBSwap to matchBSwapOrBitReverse. NFCI.
This matches bswap and bitreverse intrinsics, so we should make that clear in the function name.
2020-10-23 12:35:27 +01:00
Simon Pilgrim 7b4a828452 [InstCombine] foldOrOfICmps - use m_Specific instead of explicit comparisons. NFCI. 2020-10-21 11:53:45 +01:00
Martin Storsjö 4de215ff18 Revert "[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support"
Also revert "[InstCombine] foldOrOfICmps - use m_Specific instead of
explicit comparisons. NFCI." to make the primarily intended revert
work.

This reverts commits ce13549761 and
e372a5f86f.

This commit caused failed asserts e.g. like this:

$ cat repro.cpp
bool a(char b) {
  return b >= '0' && b <= '9' || (b | 32) >= 'a' && (b | 32) <= 'z';
$ clang++ -target x86_64-linux-gnu -c -O2 repro.cpp
clang++: ../include/llvm/ADT/APInt.h:1151: bool llvm::APInt::operator==(const
llvm::APInt&) const: Assertion `BitWidth == RHS.BitWidth && "Comparison
requires equal bit widths"' failed.
2020-10-21 09:47:18 +03:00
Simon Pilgrim ce13549761 [InstCombine] foldOrOfICmps - use m_Specific instead of explicit comparisons. NFCI. 2020-10-20 16:26:41 +01:00
Simon Pilgrim e372a5f86f [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support
Reapplied rGa704d8238c86 with a check for integer/integervector types to prevent matching with pointer types
2020-10-20 14:14:26 +01:00
Simon Pilgrim adb52e5f9e [InstCombine] foldOrOfICmps - only fold (icmp_eq B, 0) | (icmp_ult/gt A, B) for integer types
Fixes a number of stage2 buildbots that were failing when I generalized the m_ConstantInt() logic - that didn't match for pointer types but m_Zero() does......
2020-10-19 17:05:38 +01:00
Simon Pilgrim 482e6f0041 Revert rGa704d8238c86bac: "[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support"
This reverts commit a704d8238c.

Causing stage2 build failures on some bots.
2020-10-19 16:03:36 +01:00
Simon Pilgrim de885f1b2a [InstCombine] Add (icmp ne A, 0) | (icmp ne B, 0) --> (icmp ne (A|B), 0) vector support
Scalar cases were already being handled by foldLogOpOfMaskedICmps (so this was dead code), but refactoring to support non-uniform vectors will take some time, so tweak this fold in the meantime.
2020-10-19 15:41:21 +01:00
Simon Pilgrim ecd25086d1 [InstCombine] Add (icmp eq B, 0) | (icmp ult/gt A, B) -> (icmp ule A, B-1) vector support 2020-10-19 15:23:48 +01:00
Simon Pilgrim a704d8238c [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support 2020-10-19 14:55:18 +01:00
Simon Pilgrim 1d90e53044 [InstCombine] foldOrOfICmps - pull out repeated getOperand() calls. NFCI. 2020-10-19 14:28:08 +01:00
Simon Pilgrim 0b7b446a40 [InstCombine] Support vectors-with-undef in and(logicalshift(1,X),1) --> zext(X == 0) fold 2020-10-19 11:10:32 +01:00
Sanjay Patel 53e92b4c0e [InstCombine] (~A & B) ^ A -> A | B
Differential Revision: https://reviews.llvm.org/D86395
2020-10-17 12:20:18 -04:00
Simon Pilgrim 83ae625f0c [InstCombine] visitAnd - pull out repeated I.getType() calls. NFCI. 2020-10-16 15:43:11 +01:00
Simon Pilgrim 253f24cf4c [InstCombine] Remove custom and(trunc(and(x,c1)),c2) fold
This is more correctly handled by canEvaluateTruncated (one use checks etc.) and covers all the tests cases that were added for this fold.
2020-10-16 15:43:10 +01:00
Simon Pilgrim 55991b44b7 [InstCombine] foldAndOrOfICmpsOfAndWithPow2 - add vector support
Support vector cases for folding:

 (iszero(A & K1) | iszero(A & K2)) -> (A & (K1 | K2)) != (K1 | K2)
 (!iszero(A & K1) & !iszero(A & K2)) -> (A & (K1 | K2)) == (K1 | K2)
2020-10-16 10:41:40 +01:00
Simon Pilgrim 23f1616626 [InstCombine] Use m_SpecificInt instead of m_APInt + comparison. NFCI. 2020-10-15 16:06:27 +01:00