Commit Graph

615 Commits

Author SHA1 Message Date
Chenbing Zheng 1a0187c9e7 [InstCombine] remove useless ‘InstCombiner::’. nfc
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130220
2022-07-22 09:24:24 +08:00
Chenbing Zheng 8075f680c8 [InstCombine] add fold (X > C - 1) ^ (X < C + 1) --> X != C
Considering the correctness of this pattern, we should avoid that C - 1
is non-negative and C + 1 is negative.

Alive2: https://alive2.llvm.org/ce/z/c_rBaq

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D129622
2022-07-21 10:08:21 +08:00
Sanjay Patel 26fbb79c33 [InstCombine] reduce code for signbit folds; NFC 2022-07-18 11:04:58 -04:00
Daniel Bertalan ef7aed3e11 [InstCombine] Do not fold 'and (sext (ashr X, Shift)), C' if Shift < 0
The 'and (sext (ashr X, ShiftC)), C' --> 'lshr (sext X), ShiftC'
transformation would access out of bounds bits in APInt::getLowBitsSet
if the shift count was larger than X's bit width or if it was negative.

Fixes #56424
2022-07-07 19:13:55 +02:00
Sanjay Patel f9f40aa10d [InstCombine] fold negated low-bit-mask to cmp+select
(-(X & 1)) & Y --> (X & 1) == 0 ? 0 : Y
https://alive2.llvm.org/ce/z/rhpH3i

This is noted as a missing IR canonicalization in issue #55618.
We already managed to fix codegen to the expected form.
2022-07-03 12:25:26 -04:00
Eric Gullufsen 73202130e5 [InstCombine] Optimize test for same-sign of values
(icmp slt (X & Y), 0) | (icmp sgt (X | Y), -1) -> (icmp sgt (X ^ Y), -1)
(icmp slt (X | Y), 0) & (icmp sgt (X & Y), -1) -> (icmp slt (X ^ Y), 0)

[[ https://alive2.llvm.org/ce/z/qXxEFP | alive2 example ]]
[[ https://godbolt.org/z/aWf9c6j74 | godbolt  ]]

[[ https://godbolt.org/z/5Ydn5TehY | godbolt for inverted form ]]
[[ https://alive2.llvm.org/ce/z/93AODr | alive2 for inverted form ]]
[[ https://github.com/llvm/llvm-project/issues/55988 | issue #55988 ]]

Differential Revision: https://reviews.llvm.org/D127903
2022-06-19 16:18:19 -04:00
Sanjay Patel bfde861935 [InstCombine] convert mask and shift of power-of-2 to cmp+select
When the mask is a power-of-2 constant and op0 is a shifted-power-of-2
constant, test if the shift amount equals the offset bit index:

(ShiftC << X) & C --> X == (log2(C) - log2(ShiftC)) ? C : 0
(ShiftC >> X) & C --> X == (log2(ShiftC) - log2(C)) ? C : 0

This is an alternate to D127610 with a more general pattern.
We match only shift+and instead of the trailing xor, so we see a few
more tests diffs. I think we discussed this initially in D126617.

Here are proofs for shifts in both directions:
https://alive2.llvm.org/ce/z/CFrLs4

The test diffs look equal or better for IR, and this makes the
patterns more uniform in IR. The backend can partially invert this
in both cases if that is profitable. It is not trivially reversible,
however, so if we find perf regressions that are not easy to undo,
then we may want to revert this.

Differential Revision: https://reviews.llvm.org/D127801
2022-06-17 10:51:57 -04:00
chenglin.bi 286198ff04 [InstCombine] Optimize lshr+shl+and conversion pattern
if `C1` and `C3` are pow2 and `Log2(C3) >= C2`:
    ((C1 >> X) << C2) & C3 -> X == (Log2(C1)+C2-Log2(C3)) ? C3 : 0
https://alive2.llvm.org/ce/z/zvrkKF

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D127469
2022-06-14 11:06:10 +08:00
Sanjay Patel 310adb658c [InstCombine] reorder mask folds for efficiency
This shows narrowing improvements on the logic tests
(transforms recently added with e247b0e5c9).

This is not a complete fix. That would require adding
folds to visitOr/visitXor. But it enables the expected
transforms for the basic patterns in the affected tests.
2022-06-13 09:49:57 -04:00
Sanjay Patel e247b0e5c9 [InstCombine] add narrowing transform for low-masked binop with zext operand (2nd try)
The 1st try ( afa192cfb6 ) was reverted because it could
cause an infinite loop with constant expressions.

A test for that and an extra condition to enable the transform
are added now. I also added code comments to better describe
the transform and the existing, related transform.

Original commit message:
https://alive2.llvm.org/ce/z/hRy3rE

As shown in D123408, we can produce this pattern when moving
casts around, and we already have a related fold for a binop
with a constant operand.
2022-06-10 12:42:27 -04:00
Sanjay Patel 6fedc6a2b4 Revert "[InstCombine] add narrowing transform for low-masked binop with zext operand"
This reverts commit afa192cfb6.
This can cause an infinite loop as shown with an example in the
post-commit thread.
2022-06-10 08:25:10 -04:00
chenglin.bi de7a6ae1ff [InstCombine] Optimize shl+lshr+and conversion pattern
if `C1` and `C3` are pow2 and `Log2(C3)+C2 < BitWidth`:
    ((C1 << X) >> C2) & C3 -> X == (Log2(C3)+C2-Log2(C1)) ? C3 : 0;

https://alive2.llvm.org/ce/z/Pus5bd

Fix issue https://github.com/llvm/llvm-project/issues/55739

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D126617
2022-06-10 09:36:58 +08:00
Sanjay Patel afa192cfb6 [InstCombine] add narrowing transform for low-masked binop with zext operand
https://alive2.llvm.org/ce/z/hRy3rE

As shown in D123408, we can produce this pattern when moving
cast around, and we already have a related fold for a binop
with a constant operand.
2022-06-09 16:59:26 -04:00
Simon Moll b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Biplob Mishra d87bfa9ad0 [InstCombine] Combine instructions of type or/and where AND masks can be combined.
The patch simplifies some of the patterns as below

(A | (B & C0)) | (B & C1) -> A | (B & C0|C1)
((B & C0) | A) | (B & C1) -> (B & C0|C1) | A

In some scenarios like byte reverse on half word, we can see this pattern multiple times and this conversion can optimize these patterns.

Additionally this commit fixes the issue reported with the test case.
int f(int a, int b) {
  int c = ((unsigned char)(a >> 23) & 925);
  if (a)
    c = (a >> 23 & b) | ((unsigned char)(a >> 23) & 925) | (b >> 23 & 157);
  return c;
}

The previous revision/commit did not check one-use of an intermediate value that this transform re-uses.
When that value has another use, an existing transform will try to invert the transform here.
By adding one-use checks, we avoid the infinite loops seen with the earlier commit.

Differential Revision: https://reviews.llvm.org/D124119
2022-06-09 10:58:30 +01:00
Alexander Kornienko aa98e7e1eb Revert "[InstCombine] Combine instructions of type or/and where AND masks can be combined."
This reverts commit ec4adf1f6c. The commit causes
clang to hang on a certain input:
```
$ cat q.cc
int f(int a, int b) {
  int c = ((unsigned char)(a >> 23) & 925);
  if (a)
    c = (a >> 23 & b) | ((unsigned char)(a >> 23) & 925) | (b >> 23 & 157);
  return c;
}

$ time ./clang-15-10515 --target=x86_64--linux-gnu -O1  -c q.cc
^C

real    0m45.072s
user    0m0.025s
sys     0m0.099s
```
2022-06-01 14:20:00 +02:00
Chenbing Zheng 1486a9c9fe [InstCombine] [NFC] refector foldXorOfICmps
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126268
2022-05-26 11:07:18 +08:00
Nikita Popov a7c079aaa2 [InstCombine] Support logical and in masked icmp fold
Most of the folds implemented in this function work fine with
logical operations. We only need to be careful for the cases that
work on non-constant masks, where the RHS operand shouldn't be
poison.

This is a conservative implementation that bails out of illegal
transforms, but we could also change these to insert freeze instead.
2022-05-24 11:16:33 +02:00
Nikita Popov 5abaabed22 [InstCombine] Use m_APInt() in asymmetric masked icmp fold
This is mostly intended as code cleanup, but it does also add
support for splat vectors to this fold.
2022-05-24 10:57:28 +02:00
Nikita Popov c0e06c7448 [InstCombine] Handle logical and/or in recursive and/or of icmps fold
The and/or of icmps fold is also applied in reassociated form.
However, this currently only happens for bitwise and of bitwise
and, but not for bitwise and of logical and (or other combinations,
but this is the one being addressed here).

We can do this for bitwise+logical combinations as well, but need
to be a bit careful about which of the resulting ands are logical:
https://alive2.llvm.org/ce/z/WYSjGh
https://alive2.llvm.org/ce/z/guxYnz
https://alive2.llvm.org/ce/z/S5SYxY
https://alive2.llvm.org/ce/z/2rAWeW
2022-05-24 10:13:10 +02:00
Nikita Popov f45c1e436e [InstCombine] Change operand order in recursive and/or of icmps fold
The order obviously doesn't matter for bitwise and/or, but would
matter for logical and/or, so change it to preserve the original
order.
2022-05-23 17:29:33 +02:00
Nikita Popov 45226d04f0 [InstCombine] Reuse icmp of and/or folds for logical and/or
Similarly to a change recently done for fcmps, add a flag that
indicates whether the and/or is logical to foldAndOrOfICmps, and
reuse the function when folding logical and/or.

We were already calling some parts of it, but this gives us a
clearer indication of which parts may need poison-safe variants,
and would also allow to fold combinations of bitwise and logical
and/or.

This change should be close to NFC, because all folds this enables
were either already called previously, or can make use of implied
poison reasoning.
2022-05-23 15:37:07 +02:00
Sanjay Patel f0071d43e4 [InstCombine] add use check to fold of bitwise logic with cast ops
This was shown as a potential regression in D126040.
2022-05-20 09:08:53 -04:00
Sanjay Patel be7f09f7b2 [IR] create and use helper functions that test the signbit; NFCI 2022-05-16 11:26:23 -04:00
Biplob Mishra ec4adf1f6c [InstCombine] Combine instructions of type or/and where AND masks can be combined.
The patch simplifies some of the patterns as below

(A | (B & C0)) | (B & C1) -> A | (B & C0|C1)
((B & C0) | A) | (B & C1) -> (B & C0|C1) | A

In some scenarios like byte reverse on half word, we can see this pattern multiple times and this conversion can optimize these patterns.

Differential Revision: https://reviews.llvm.org/D124119
2022-05-16 12:43:33 +01:00
Fraser Cormack bafab9c09f [InstCombine] Fix scalable-vector bitwise select matching
D113035 enhanced the matching of bitwise selects from vector types. This
change unfortunately introduced crashes as it tries to cast scalable
vector types to integers.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D124997
2022-05-06 12:59:39 +01:00
Alexander Shaposhnikov ec7122f64b [InstCombine] Fold ((A&B)^C)|B
Fold ((A&B)^C)|B into C|B.

https://alive2.llvm.org/ce/z/zSGSor

This addresses the issue https://github.com/llvm/llvm-project/issues/55169

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D124710
2022-05-05 00:56:20 +00:00
Nikita Popov 982cbed819 [InstCombine] Fold logical and/or of range icmps with nowrap flags
This is an edge-case where we don't convert to bitwise and/or based
on implies poison reasoning, so explicitly try to perform the fold
in logical form. The transform itself is poison-safe, as both icmps
are based on the same value and any nowrap flags are discarded as
part of the fold (https://alive2.llvm.org/ce/z/aCwC8b for the used
example).
2022-04-29 14:42:42 +02:00
Nikita Popov 57aaeefc18 [InstCombine] Pass ICmpInsts to foldAndOrOfICmpsUsingRanges() (NFC)
Pass the whole instruction rather than unpacking it. This makes it
easier to reuse the function in another place, as the entire
logic is encapsulated.
2022-04-29 12:46:31 +02:00
Nikita Popov 1f53932a95 [InstCombine] Remove foldAndOrOfEqualityCmpsWithConstants() fold
This fold handles a special subset of foldAndOrOfICmpsUsingRanges(),
use the more generic implementation instead.

The result can differ if a representation using a range comparison
is possible, in which case that is preferred over masking. There is
a canonicalization opportunity here.
2022-04-29 12:23:00 +02:00
Nikita Popov 5515263e44 [InstCombine] Fold and of two ranges differing by mask
This is the de Morgan conjugated variant of the existing fold for
ors. Implement this by switching the range code to always work
on ors and perform invert operands at the start and end. This makes
reasoning easier and makes the extension more obviosuly correct.
2022-04-29 12:01:38 +02:00
Nikita Popov d5ee20fcc9 [InstCombine] Switch an or of icmps fold to use constant ranges
We can express this fold more naturally when working on the constant
range implementation. This change is not entirely NFC, because the
code now also handles cases that don't match the precise pattern
this previously looked for, e.g. we can omit an add on one of the
ranges.
2022-04-29 11:15:54 +02:00
Nikita Popov 90dba831ae [InstCombine] Fold or of icmp ne trunc/and
This adds the de Morgan conjugated variant for the existing
"and eq" style fold.

Proof: https://alive2.llvm.org/ce/z/tkNAcG
2022-04-28 15:07:16 +02:00
Nikita Popov e8945110d2 [InstCombine] Remove redundant unsigned underflow fold (NFCI)
This is now handled as a combination of two other folds:
(A+B) <= A & (A+B) != 0  -->  (A+B)-1 < A
(A+B)-1 < A  -->  -B < A
2022-04-25 14:22:43 +02:00
Nikita Popov ee50925894 [InstCombine] Fold (X != 0) & (Y u>= X)
This adds the De Morgan conjugated fold for the existing
(X == 0) | (Y u< X) fold.

Proof: https://alive2.llvm.org/ce/z/3Me3JQ
2022-04-25 13:16:47 +02:00
Nikita Popov 369ef9bf60 [InstCombine] Extract code for or of icmp eq zero and icmp fold (NFC)
To make it easier to extend this to the congruent and case.
2022-04-22 16:48:59 +02:00
Nikita Popov ba46ae7bd8 [InstCombine] Merge foldAndOfICmps() and foldOrOfICmps() (NFCI)
Folds are supposed to always be added in conjugated pairs for and
and or. Merge the two functions to make folds for which this is
currently not the case more obvious.
2022-04-22 12:48:03 +02:00
Nikita Popov 3e1d2c352c [InstCombine] Fix or of commuted foldable predicates
1d90e53044 switch this code to store
the predicates and operands in variables, but retained a
swapOperands() call here. Thus the commuted cases were no longer
folded. Additionally, as the change was not reported, the next
InstCombine iteration would not pick it up either.
2022-04-22 12:31:26 +02:00
Sanjay Patel 7783db55af [InstCombine] try to fold low-mask of ashr to lshr
With one-use, we handle this via demanded-bits.
But We need to handle extra uses to improve issue #54750.

https://alive2.llvm.org/ce/z/aDYkPv
2022-04-11 11:56:40 -04:00
Roman Lebedev 308ca349cb
[InstCombine] Fold `(X | C2) ^ C1 --> (X & ~C2) ^ (C1^C2)`
These two are equivalent,
and i *think* the `and` form is more-ish canonical.

General proof: https://alive2.llvm.org/ce/z/RrF5s6

If constant on the (outer) `xor` is an `undef`,
the whole lane is dead: https://alive2.llvm.org/ce/z/mu4Sh2

However, if the constant on the (inner) `or` is an `undef`,
we must sanitize it first: https://alive2.llvm.org/ce/z/MHYJL7
I guess, producing a zero `and`-mask is optimal in that case.

alive-tv is happy about the entirety of `xor-of-or.ll`.
2022-04-03 00:12:56 +03:00
Hirochika Matsumoto a3cffc1150 [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2
https://alive2.llvm.org/ce/z/94yRMN

Fixes #54177

Differential Revision: https://reviews.llvm.org/D122077
2022-03-29 11:30:06 -04:00
Craig Topper ce78e68261 [InstCombine] Fold select based logic of fcmps with same operands when FMF is present.
If we have a logical and/or in select form and the true/false operand
is an fcmp with poison generating FMF, we won't be able to fold it
to an and/or instruction. This prevents us from optimizing the case
where it is a logical operation of two fcmps with identical operands.

This patch adds explicit checks for this case that doesn't rely on
converting to and/or to do the optimization. It reuses the existing
foldLogicOfFCmps, but adds a new flag to disable the other combine
that is inside that function.

FMF flags from the two FCmps are intersected using the logic added in
D121243. The FIXME has been updated to indicate that we can only use
a union for the non-select form.

This allows us to optimize cases like this from compare-fp-3.c in the
gcc torture suite with fast math.

void
test1 (float x, float y)
{
  if ((x==y) && (x!=y))
    link_error0();
}

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D121323
2022-03-14 14:45:07 -07:00
Craig Topper f72fe2ef67 [InstCombine] Preserve FMF in foldLogicOfFCmps.
This patch intersects the fast math flags from the two fcmps instead
of dropping them.

I poked at this a bunch with Alive2 for nnan and ninf flags and it seemed
to check out. With the other flags it told me "Couldn't prove the
correctness of the transformation". Not sure if I should just preserve
nnan and ninf?

Reviewed By: spatel, lebedev.ri

Differential Revision: https://reviews.llvm.org/D121243
2022-03-09 09:17:09 -08:00
Craig Topper 608161225e [InstCombine][Analysis] Move getFCmpCode and getPredForFCmpCode to CmpInstAnalysis. NFC
The similar getICmpCode and getPredForICmpCode are already there.
This moves FP for consistency.

I think InstCombine is currently the only user of both.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120754
2022-03-03 09:33:24 -08:00
Craig Topper 7bc6667845 [Analysis] Simplify the interface to llvm::getICmpCode. NFC
Instead of passing an InstCmpInt * and a bool just pass the predicate
from the caller.

I'm considering moving the similar FCmp functions from InstCombine
over here and this makes the interface consistent with what is used
for FCmp.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D120609
2022-03-01 09:53:27 -08:00
Nikita Popov 5423b0a525 [InstCombine] Remove not of SPF min/max fold (NFCI)
This should no longer be necessary now that we canonicalize to
intrinsics. Might not be strictly NFC due to worklist order.
2022-02-28 11:02:31 +01:00
Nikita Popov efece08ae2 [InstCombine] Remove manual debug loc transfer
While this might be marginally more precise, we generally don't
bother with this in InstCombine, and let the IRBuilder assign the
debug location. I don't see why this one fold, out of the thousands
done in InstCombine, should be treated specially.
2022-02-14 11:07:05 +01:00
Sanjay Patel 39e602b6c4 [InstCombine] try to fold binop with phi operands
This is an alternate version of D115914 that handles/tests all binary opcodes.

I suspect that we don't see these patterns too often because -simplifycfg
would convert the minimal cases into selects rather than leave them in phi form
(note: instcombine has logic holes for combining the select patterns too though,
so that's another potential patch).

We only create a new binop in a predecessor that unconditionally branches to
the final block.
https://alive2.llvm.org/ce/z/C57M2F
https://alive2.llvm.org/ce/z/WHwAoU (not safe to speculate an sdiv for example)
https://alive2.llvm.org/ce/z/rdVUvW (but it is ok on this path)

Differential Revision: https://reviews.llvm.org/D117110
2022-01-22 15:00:06 -05:00
Sanjay Patel 1d21667ce2 [InstCombine] (~A | B) & (A ^ B) -> ~A & B
This is part of a set of 2-variable logic optimizations
suggested here:
https://lists.llvm.org/pipermail/llvm-dev/2021-December/154470.html

The 'not' op must not propagate undef elements of a vector,
so this patch creates a new 'full' not, but I am not counting
that as an extra-use restriction because it should get folded
with the existing value by CSE.

https://alive2.llvm.org/ce/z/7v65im
2022-01-09 06:23:51 -05:00
Stanislav Mekhanoshin 0b5340acb7 [InstCombine] Factor out a common pattern match used 3 times. NFC.
This is needed for the next patch which will add more patterns
to the same match.

Differential Revision: https://reviews.llvm.org/D116194
2022-01-06 10:23:50 -08:00