Commit Graph

295 Commits

Author SHA1 Message Date
Kazu Hirata a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Sanjay Patel cae993d4c8 [InstCombine] [InstCombine] reduce left-shift-of-right-shifted constant via demanded bits
If we don't demand low bits and it is valid to pre-shift a constant:
(C2 >> X) << C1 --> (C2 << C1) >> X

https://alive2.llvm.org/ce/z/_UzTMP

This is the reverse-order shift sibling to 82040d414b ( D127122 ).
It seems likely that we would want to add this to the SDAG version of
the code too to keep it on par with IR.
2022-06-07 18:43:27 -04:00
Sanjay Patel a4d2c5ecaa [InstCombine] reduce code duplication for accessing type; NFC 2022-06-07 18:43:27 -04:00
Sanjay Patel 82040d414b [InstCombine] reduce right-shift-of-left-shifted constant via demanded bits
If we don't demand high bits (zeros) and it is valid to pre-shift a constant:
(C2 << X) >> C1 --> (C2 >> C1) << X

https://alive2.llvm.org/ce/z/P3dWDW

There are a variety of related patterns, but I haven't found a single solution
that gets all of the motivating examples - so pulling this piece out of
D126617 along with more tests.

We should also handle the case where we shift-right followed by shift-left,
but I'll make that a follow-on patch assuming this one is ok. It seems likely
that we would want to add this to the SDAG version of the code too to keep it
on par with IR.

Differential Revision: https://reviews.llvm.org/D127122
2022-06-07 13:28:18 -04:00
Simon Pilgrim afa1ae9e0c [InstCombine] SimplifyDemandedUseBits - allow and(srem(X,Pow2),C) -> and(X,C) to work on vector types
Replace m_ConstantInt with m_APInt to match uniform (no-undef) vector remainder amounts.
2022-04-07 15:24:45 +01:00
Simon Pilgrim 5909c67883 [InstCombine] SimplifyDemandedUseBits - add TODO to remove shl node if we only demand known sign bits of the shift source
Similar to what we already perform for ashr/lshr
2022-04-07 14:35:11 +01:00
Simon Pilgrim 5e90224839 [InstCombine] SimplifyDemandedUseBits - remove lshr node if we only demand known sign bit
This is a lshr equivalent to D122340 - if we don't demand any of the additional sign bits introduced by the ashr, the lshr can be treated as an ashr and we can remove the shift entirely if we only demand already known sign bits.

Another step towards PR21929

https://alive2.llvm.org/ce/z/6f3kjq

Differential Revision: https://reviews.llvm.org/D123118
2022-04-07 14:33:31 +01:00
Simon Pilgrim 6a094a6264 [InstCombine] SimplifyDemandedUseBits - remove ashr node if we only demand known sign bits
We already do this for SelectionDAG, but we're missing it here.

Noticed while re-triaging PR21929

Differential Revision: https://reviews.llvm.org/D122340
2022-03-25 15:39:08 +00:00
serge-sans-paille 59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output line:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
serge-sans-paille a494ae43be Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output:
before: 1065307662
after:  1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
2022-03-01 21:00:07 +01:00
Nikita Popov c2428a4fad [InstCombine] Remove SPF min/max check from select demanded bits (NFCI)
This should no longer be necessary now that we canonicalize to
intrinsics. This may not be entirely NFC in practice if worklist
order gets inverted and we perform demanded bits simplification
of a select user before the select is canonicalized.
2022-03-01 14:50:37 +01:00
Sanjay Patel 995d400f3a [InstCombine] reduce mul operands based on undemanded high bits
We already do this in SDAG, but mul was left out of the fold
for unused high bits in IR.

The high bits of a mul's operands do not change the low bits
of the result:
https://alive2.llvm.org/ce/z/XRj5Ek

Verify some test diffs to confirm that they are correct:
https://alive2.llvm.org/ce/z/y_W8DW
https://alive2.llvm.org/ce/z/7DM5uf
https://alive2.llvm.org/ce/z/GDiHCK

This gets a fold that was presumed not possible in D114272:
https://alive2.llvm.org/ce/z/tAN-WY

Removing nsw/nuw is needed for general correctness (and is
also done in the codegen version), but we might be able to
recover more of those with better analysis.

Differential Revision: https://reviews.llvm.org/D119369
2022-02-10 08:10:22 -05:00
Sanjay Patel 897d92faef [InstCombine] generalize 2 LSB of demanded bits for X*X
This is a follow-up suggested in D119060.
Instead of checking each of the bottom 2 bits individually,
we can check them together and handle the possibility that
we demand both together.

https://alive2.llvm.org/ce/z/C2ihC2

Differential Revision: https://reviews.llvm.org/D119139
2022-02-07 11:33:55 -05:00
Sanjay Patel 79b3fe8070 [InstCombine] SimplifyDemandedBits - mul(x,x) is odd iff x is odd
https://alive2.llvm.org/ce/z/AXPr3k
2022-02-07 08:43:12 -05:00
Sanjay Patel 5372160a18 [InstCombine] SimplifyDemandedBits - mul(x,x) - if only demand bit[1] then fold to zero
This is a translation of the fold added to codegen with:
2d1390efbe

Part of solving issue #48027
2022-02-05 09:51:38 -05:00
Sanjay Patel 0236c57181 [InstCombine] try to fold one-demanded-bit-of-multiply
This is a generalization of the icmp fold in D118061 (and that can be abandoned).
We're looking for a disguised form of "odd * odd must be odd".
Some Alive2 proofs to show correctness:
https://alive2.llvm.org/ce/z/60Y8hz
https://alive2.llvm.org/ce/z/HfAP6R

Differential Revision: https://reviews.llvm.org/D118539
2022-02-04 11:40:54 -05:00
Nuno Lopes dd995aceda [InstCombine] remove incorrect gep(x, undef) -> undef optimization
gep(x, undef) carries the provenance of x, so we can't replace it with any
pointer like undef.
This leaves room for improvement for the poison case, but that's currently
not possible as the demanded bits API doesn't distinguish between undef &
poison bits.

Fixes #44790
2022-01-30 11:34:32 +00:00
Craig Topper 9abc593e98 [TargetLowering][InstCombine] Simplify BSwap demanded bits code a little. NFC
Use alignDown instead of &= ~7.

Replace ResultBit with NLZ. (BitWidth - NLZ - NTZ == 8) so
(BitWidth - NTZ - 8 == NLZ).

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D117804
2022-01-20 10:45:17 -08:00
Sanjay Patel f46a9c8edd [InstCombine] don't automatically drop poison-generating flags in SimplifyVectorDemandedElts
I noticed this while reviewing the test diffs in D115460
(and so the diffs in that patch will be reduced if this one is applied first).

This is effectively a revert of 3436dc2923 ( https://reviews.llvm.org/rG3436dc29239d ) -
since that commit, we've made several enhancements, so the reasoning there is no longer
valid. Specifically, we added a poison value to IR, and we clarified the behavior of
undef/poison elements in a shuffle mask:
https://llvm.org/docs/LangRef.html#shufflevector-instruction

Alive2 seems to agree that the propagation of flags in the test diffs shown here are valid:
https://alive2.llvm.org/ce/z/UuY-jr
https://alive2.llvm.org/ce/z/GXoMD9
https://alive2.llvm.org/ce/z/nVCyVH

Differential Revision: https://reviews.llvm.org/D115526
2021-12-13 10:12:19 -05:00
Jay Foad a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Sanjay Patel f32c0fe8e5 [InstCombine] fold cast of right-shift if high bits are not demanded (3rd try)
The first two tries at this were reverted because they caused an
infinite loop in instcombine.
That should be fixed after a series of patches that ended with
removing the faulty opposing transform:
3fabd98e5b

Original commit message:
(masked) trunc (lshr X, C) --> (masked) lshr (trunc X), C

Narrowing the shift should be better for analysis and can lead
to follow-on transforms as shown.

Attempt at a general proof in Alive2:
https://alive2.llvm.org/ce/z/tRnnSF

Here are a couple of the specific tests:
https://alive2.llvm.org/ce/z/bCnTp-
https://alive2.llvm.org/ce/z/TfaHnb

Differential Revision: https://reviews.llvm.org/D110170
2021-10-03 10:37:22 -04:00
Sanjay Patel 3c5500907b Revert "[InstCombine] fold cast of right-shift if high bits are not demanded (2nd try)"
This reverts commit bb9333c350.

This exposes another existing bug that causes an infinite loop as shown in
D110170
...so reverting while I look at another fix.
2021-09-24 10:47:35 -04:00
Sanjay Patel bb9333c350 [InstCombine] fold cast of right-shift if high bits are not demanded (2nd try)
The 1st try at this was reverted because it caused an infinite loop in instcombine.
That should be fixed after:
1cd6b44f26

(masked) trunc (lshr X, C) --> (masked) lshr (trunc X), C

Narrowing the shift should be better for analysis and can lead
to follow-on transforms as shown.

Attempt at a general proof in Alive2:
https://alive2.llvm.org/ce/z/tRnnSF

Here are a couple of the specific tests:
https://alive2.llvm.org/ce/z/bCnTp-
https://alive2.llvm.org/ce/z/TfaHnb

Differential Revision: https://reviews.llvm.org/D110170
2021-09-23 09:41:37 -04:00
Sanjay Patel c6013f71a4 Revert "[InstCombine] fold cast of right-shift if high bits are not demanded"
This reverts commit 2f6b07316f.

This caused several bots to hit an infinite loop at stage 2,
so it needs to be reverted while figuring out how to fix that.
2021-09-22 07:45:21 -04:00
Sanjay Patel 2f6b07316f [InstCombine] fold cast of right-shift if high bits are not demanded
(masked) trunc (lshr X, C) --> (masked) lshr (trunc X), C

Narrowing the shift should be better for analysis and can lead
to follow-on transforms as shown.

Attempt at a general proof in Alive2:
https://alive2.llvm.org/ce/z/tRnnSF

Here are a couple of the specific tests:
https://alive2.llvm.org/ce/z/bCnTp-
https://alive2.llvm.org/ce/z/TfaHnb

Differential Revision: https://reviews.llvm.org/D110170
2021-09-21 16:09:08 -04:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Sanjay Patel 790c29ab86 [InstCombine] fold umax/umin intrinsics based on demanded bits
This is a direct translation of the select folds added with
D53033 / D53036 and another step towards canonicalization
using the intrinsics (see D98152).
2021-08-12 12:37:45 -04:00
Juneyoung Lee 7161bb87c9 [InsCombine] Fix a few remaining vec transforms to use poison instead of undef
This is a patch that replaces shufflevector and insertelement's placeholder value with poison.

Underlying motivation is to fix the semantics of shufflevector with undef mask to return poison instead
(D93818)
The consensus has been made in the late 2020 via mailing list as well as the thread in https://bugs.llvm.org/show_bug.cgi?id=44185 .

This patch is a simple syntactic change to the existing code, hence directly pushed as a commit.
2021-05-31 18:47:09 +09:00
Sanjay Patel e82db87fb1 [InstCombine] drop poison flags when simplifying 'shl' based on demanded bits
As with other transforms in demanded bits, we must be careful not to
wrongly propagate nsw/nuw if we are reducing values leading up to the shift.

This bug was introduced with 1b24f35f84 and leads to the miscompile
shown in:
https://llvm.org/PR50341
2021-05-14 13:54:13 -04:00
Dávid Bolvanský 80b897e21b [InstCombine] ctpop(X) ^ ctpop(Y) & 1 --> ctpop(X^Y) & 1 (PR50094)
Original pattern: (__builtin_parity(x) ^ __builtin_parity(y))

LLVM rewrites it as: (__builtin_popcount(x) ^ __builtin_popcount(y)) & 1

Optimized form:  __builtin_popcount(X^Y) & 1

Alive proof: https://alive2.llvm.org/ce/z/-GdWFr

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D101802
2021-05-04 13:16:18 +02:00
Sanjay Patel 1b24f35f84 [InstCombine] improve demanded bits analysis of left-shifted operand
If we don't demand high bits, then we also don't care about those
high bits of a left-shift operand regardless of shift amount.
I noticed the sext/trunc pattern in a motivating example.
It seems like there should be a low-bits with right-shift sibling,
but I haven't looked at that yet.

https://alive2.llvm.org/ce/z/JuS6jc
https://rise4fun.com/Alive/Trm (not sure how to use 'width' with Alive1)
https://alive2.llvm.org/ce/z/gRadbF

Differential Revision: https://reviews.llvm.org/D101489
2021-05-03 08:39:20 -04:00
Sanjay Patel e10d7d455d [InstCombine] fold 'not' of ctpop in parity pattern
As discussed in https://llvm.org/PR50096 , we could
convert the 'not' into a 'sub' and see the same
fold. That's because we already have another demanded
bits optimization for 'sub'.

We could add a related transform for
odd-number-of-type-bits, but that seems unlikely
to be practical.

https://alive2.llvm.org/ce/z/TWJZXr
2021-04-23 13:23:24 -04:00
Juneyoung Lee 1c10201d96 Update InstCombine to use undef matcher instead
This is a patch to use m_Undef() matcher instead of isa<UndefValue>().

As suggested in D100122, this update is separately committed.
2021-04-18 11:05:36 +09:00
Jeroen Dobbelaere b82b305cf9 [InstCombine] Fix out-of-bounds ashr(shl) optimization
This fixes a crash found by the oss fuzzer and reported by @fhahn.
The suggestion of @RKSimon seems to be the correct fix here. (See D91343).

The oss fuzz report can be found here: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32759

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D99792
2021-04-02 13:45:11 +02:00
Sanjay Patel c2ebad8d55 [InstCombine] add fold for demand of low bit of abs()
This is one problem shown in https://llvm.org/PR49763

https://alive2.llvm.org/ce/z/cV6-4K
https://alive2.llvm.org/ce/z/9_3g-L
2021-03-30 15:14:37 -04:00
Sanjay Patel 9502061bcc [InstCombine] avoid infinite loop in demanded bits for select
https://llvm.org/PR49205
2021-02-28 10:17:53 -05:00
Juneyoung Lee 9d70dbdc2b [InstCombine] use poison as placeholder for undemanded elems
Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement.
However, this is problematic because undef isn’t undefined enough.
Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison.
This makes a few straightforward optimizations incorrect, such as:

```
;  https://bugs.llvm.org/show_bug.cgi?id=44185

define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %xv = insertelement <4 x float> %q, float %x, i32 2
  %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef }
  ret <4 x float> %r ; %r[3] is undef
}
=>
define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %r = insertelement <4 x float> %y, float %x, i32 1
  ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison
}

Transformation doesn't verify!
ERROR: Target is more poisonous than source
```

I’d like to suggest
1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too)
2. Updating shufflevector’s semantics to return poison element if mask is undef

Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay.
m_Undef() matches PoisonValue as well, so existing optimizations will still fire.

The only concern is hidden miscompilations that will go incorrect when poison constant is given.
A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :(

Instead, I’ll simply locally maintain the tests and run Alive2.
If there is any bug found, I’ll report it.

Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93586
2020-12-28 08:58:15 +09:00
Bhramar Vatsa fd679107d6
[InstCombine] Optimize away the unnecessary multi-use sign-extend
C.f. https://bugs.llvm.org/show_bug.cgi?id=47765

Added a case for handling the sign-extend (Shl+AShr) for multiple uses,
to optimize it away for an individual use,
when the demanded bits aren't affected by sign-extend.

https://rise4fun.com/Alive/lgf

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D91343
2020-12-01 16:54:00 +03:00
Sanjay Patel 4e68bc0999 Revert "[InstCombine] add multi-use demanded bits fold for add with low-bit mask"
This reverts commit e56103d250.
There is a stage2 msan failure blamed on this commit:
http://lab.llvm.org:8011/#/builders/74/builds/888/steps/9/logs/stdio
2020-11-16 14:48:09 -05:00
Sanjay Patel e56103d250 [InstCombine] add multi-use demanded bits fold for add with low-bit mask
I noticed an add example like the one from D91343, so here's a similar patch.
The logic is based on existing code for the single-use demanded bits fold.
But I only matched a constant instead of using compute known bits on the
operands because that was the motivating patterni that I noticed.

I think this will allow removing a special-case (but incomplete) dedicated
fold within visitAnd(), but I need to untangle the existing code to be sure.

https://rise4fun.com/Alive/V6fP

  Name: add with low mask
  Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0
  %a = add i8 %x, C1
  %r = and i8 %a, C2
  =>
  %r = and i8 %x, C2

Differential Revision: https://reviews.llvm.org/D91415
2020-11-15 15:09:49 -05:00
Simon Pilgrim 1a62ca65c1 [KnownBits] Add KnownBits::commonBits helper. NFCI.
We have a frequent pattern where we're merging two KnownBits to get the common/shared bits, and I just fell for the gotcha where I tried to use the & operator to merge them........
2020-11-11 12:15:54 +00:00
Simon Pilgrim ec228fbfc0 [InstCombine] SimplifyDemandedUseBits - replace dyn_cast<ConstantInt> with m_ConstantInt. NFCI. 2020-10-20 16:45:16 +01:00
Simon Pilgrim e346ea9905 [InstCombine] SimplifyDemandedUseBits - pass APInt by const reference. NFCI. 2020-10-20 12:13:08 +01:00
Simon Pilgrim b3330ae42c [InstCombine] SimplifyDemandedUseBits - xor - refactor cast<ConstantInt> usage to PatternMatch. NFCI.
First step towards replacing these to add full vector support.
2020-10-15 16:06:23 +01:00
Christopher Tetreault 640f20b0c7 [SVE] Remove calls to VectorType::getNumElements from InstCombine
Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D82237
2020-08-31 12:59:10 -07:00
Sanjay Patel c4f0a0896f [InstCombine] improve demanded element analysis for vector insert-of-extract (2nd try)
The 1st attempt (rG557b890) was reverted because it caused miscompiles.
That bug is avoided here by changing the order of folds and as verified
in the new tests.

Original commit message:
InstCombine currently has odd rules for folding insert-extract chains to shuffles,
so we miss collapsing seemingly simple cases as shown in the tests here.

But poison makes this not quite as easy as we might have guessed. Alive2 tests to
show the subtle difference (similar to the regression tests):
https://alive2.llvm.org/ce/z/hp4hv3 (this is ok)
https://alive2.llvm.org/ce/z/ehEWaN (poison leakage)

SLP tends to create these patterns (as shown in the SLP tests), and this could
help with solving PR16739.

Differential Revision: https://reviews.llvm.org/D86460
2020-08-25 11:19:36 -04:00
Benjamin Kramer c6fb72de4f Revert "[InstCombine] improve demanded element analysis for vector insert-of-extract"
This reverts commit 557b890ff4. Causing
miscompiles, test case is on llvm-commits.
2020-08-25 11:31:31 +02:00
David Sherwood 7b64765cd1 [SVE] Fix TypeSize related warnings with IR truncates of scalable vectors
In getCastInstrCost when the instruction is a truncate we were relying
upon the implicit TypeSize -> uint64_t cast when asking if a given type
has the same size as a legal integer. I've changed the code to only
ask the question if the type is fixed length.

I have also changed InstCombinerImpl::SimplifyDemandedUseBits to bail
out for now if the type is a scalable vector.

I've added the following new tests:

  Analysis/CostModel/AArch64/sve-trunc.ll
  Transforms/InstCombine/AArch64/sve-trunc.ll

for both of these fixes.

Differential revision: https://reviews.llvm.org/D86432
2020-08-25 09:17:56 +01:00