Commit Graph

315 Commits

Author SHA1 Message Date
Sanjay Patel 9ce5f41851 [InstCombine] fold cmp+select using select operand equivalence
As discussed in PR42696:
https://bugs.llvm.org/show_bug.cgi?id=42696
...but won't help that case yet.

We have an odd situation where a select operand equivalence fold was
implemented in InstSimplify when it could have been done more generally
in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand.

Here's an example:
https://rise4fun.com/Alive/Xplr

  %cmp = icmp eq i32 %x, 2147483647
  %add = add nsw i32 %x, 1
  %sel = select i1 %cmp, i32 -2147483648, i32 %add
  =>
  %sel = add i32 %x, 1

I've left the InstSimplify code in place for now, but my guess is that we'd
prefer to remove that as a follow-up to save on code duplication and
compile-time.

Differential Revision: https://reviews.llvm.org/D65576

llvm-svn: 367695
2019-08-02 17:39:32 +00:00
Roman Lebedev 0efeaa8162 [IR] SelectInst: add swapValues() utility
Summary:
Sometimes we need to swap true-val and false-val of a `SelectInst`.
Having a function for that is nicer than hand-writing it each time.

Reviewers: spatel, RKSimon, craig.topper, jdoerfert

Reviewed By: jdoerfert

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65520

llvm-svn: 367547
2019-08-01 12:31:35 +00:00
David Bolvansky 9f0d718c66 [InstCombine] Disable fold from D64285 for non-integer types
llvm-svn: 365959
2019-07-12 21:14:21 +00:00
David Bolvansky af1b3185f5 [InstCombine] Fold select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y) to ashr (X, Y))
Summary:
(select (icmp sgt x, -1), lshr (X, Y), ashr (X, Y)) -> ashr (X, Y))
(select (icmp slt x, 1), ashr (X, Y), lshr (X, Y)) -> ashr (X, Y))

Fixes PR41173

Alive proof by @lebedev.ri (thanks)
Name: PR41173
  %cmp = icmp slt i32 %x, 1
  %shr = lshr i32 %x, %y
  %shr1 = ashr i32 %x, %y
  %retval.0 = select i1 %cmp, i32 %shr1, i32 %shr
  =>
  %retval.0 = ashr i32 %x, %y

Optimization: PR41173
Done: 1
Optimization is correct!

Reviewers: lebedev.ri, spatel

Reviewed By: lebedev.ri

Subscribers: nikic, craig.topper, llvm-commits, lebedev.ri

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64285

llvm-svn: 365893
2019-07-12 11:31:16 +00:00
Sanjay Patel 706b48251f [InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics
This is the opposite direction of D62158 (we have to choose 1 form or the other).
Now that we have FMF on the select, this becomes more palatable. And the benefits
of having a single IR instruction for this operation (less chances of missing folds
based on extra uses, etc) overcome my previous comments about the potential advantage
of larger pattern matching/analysis.

Differential Revision: https://reviews.llvm.org/D62414

llvm-svn: 364721
2019-06-30 13:40:31 +00:00
Sanjay Patel 9650c95b7e [InstCombine] allow unordered preds when canonicalizing to fabs()
We have a known-never-nan value via 'nnan', so an unordered predicate
is the same as its ordered sibling.

Similar to:
rL362937

llvm-svn: 362954
2019-06-10 15:39:00 +00:00
Sanjay Patel 85de9634e6 [InstCombine] fix bug in canonicalization to fabs()
Forgot to translate the predicate clauses in rL362943.

llvm-svn: 362945
2019-06-10 14:57:45 +00:00
Sanjay Patel 8b6d9f60ed [InstCombine] change canonicalization to fabs() to use FMF on fsub
Similar to rL362909:
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fsub because they have the
same operand.

llvm-svn: 362943
2019-06-10 14:46:36 +00:00
Sanjay Patel 8cd8c5784b [InstCombine] allow unordered preds when canonicalizing to fabs()
PR42179:
https://bugs.llvm.org/show_bug.cgi?id=42179

llvm-svn: 362937
2019-06-10 14:14:51 +00:00
Sanjay Patel 87cd16a86e [InstCombine] change canonicalization to fabs() to use FMF on fneg
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fneg (fsub) because they
have the same operand.

This works around the most glaring FMF logical inconsistency cited
in PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086

llvm-svn: 362909
2019-06-09 16:22:01 +00:00
Sanjay Patel a6019d5164 [InstCombine] sink FP negation of operands through select
We don't always get this:

Cond ? -X : -Y --> -(Cond ? X : Y)

...even with the legacy IR form of fneg in the case with extra uses,
and we miss matching with the newer 'fneg' instruction because we
are expecting binops through the rest of the path.

Differential Revision: https://reviews.llvm.org/D61604

llvm-svn: 360075
2019-05-06 20:34:05 +00:00
Sanjay Patel a64bd09ec4 [InstCombine] reduce code duplication; NFC
llvm-svn: 360059
2019-05-06 17:39:18 +00:00
Nikita Popov 7462303e06 [InstCombine] Use uadd.sat and usub.sat for canonicalization
Start using the uadd.sat and usub.sat intrinsics for the existing
canonicalizations. These intrinsics should optimize better than
expanded IR, have better handling in the X86 backend and should
be no worse than expanded IR in other backends, as far as we know.

rL357012 already introduced use of uadd.sat for the add+umin pattern.

Differential Revision: https://reviews.llvm.org/D58872

llvm-svn: 357103
2019-03-27 17:56:15 +00:00
Sanjay Patel 1f65903dc1 [InstCombine] move add after smin/smax
Follow-up to rL355221.
This isn't specifically called for within PR14613,
but we'll get there eventually if it's not already
requested in some other bug report.

https://rise4fun.com/Alive/5b0

  Name: smax
  Pre: WillNotOverflowSignedSub(C1,C0)
  %a = add nsw i8 %x, C0
  %cond = icmp sgt i8 %a, C1
  %r = select i1 %cond, i8 %a, i8 C1
  =>
  %c2 = icmp sgt i8 %x, C1-C0
  %u2 = select i1 %c2, i8 %x, i8 C1-C0
  %r = add nsw i8 %u2, C0

  Name: smin
  Pre: WillNotOverflowSignedSub(C1,C0)
  %a = add nsw i32 %x, C0
  %cond = icmp slt i32 %a, C1
  %r = select i1 %cond, i32 %a, i32 C1
  =>
  %c2 = icmp slt i32 %x, C1-C0
  %u2 = select i1 %c2, i32 %x, i32 C1-C0
  %r = add nsw i32 %u2, C0

llvm-svn: 355272
2019-03-02 16:45:10 +00:00
Sanjay Patel 6e1e7e1c3e [InstCombine] move add after umin/umax
In the motivating cases from PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
...moving the add enables us to narrow the
min/max which eliminates zext/trunc which
enables signficantly better vectorization.
But that bug is still not completely fixed.

https://rise4fun.com/Alive/5KQ

  Name: umax
  Pre: C1 u>= C0
  %a = add nuw i8 %x, C0
  %cond = icmp ugt i8 %a, C1
  %r = select i1 %cond, i8 %a, i8 C1
  =>
  %c2 = icmp ugt i8 %x, C1-C0
  %u2 = select i1 %c2, i8 %x, i8 C1-C0
  %r = add nuw i8 %u2, C0

  Name: umin
  Pre: C1 u>= C0
  %a = add nuw i32 %x, C0
  %cond = icmp ult i32 %a, C1
  %r = select i1 %cond, i32 %a, i32 C1
  =>
  %c2 = icmp ult i32 %x, C1-C0
  %u2 = select i1 %c2, i32 %x, i32 C1-C0
  %r = add nuw i32 %u2, C0

llvm-svn: 355221
2019-03-01 19:42:40 +00:00
Sanjay Patel e8bf0f79bd [InstCombine] canonicalize more unsigned saturated add with 'not'
Yet another pattern variation suggested by:
https://bugs.llvm.org/show_bug.cgi?id=14613

There are 8 more potential commuted patterns here on top of the
8 that were already handled (rL354221, rL354276, rL354393).
We have the obvious commute of the 'add' + commute of the cmp
predicate/operands (ugt/ult) + commute of the select operands:

Name: base
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ult i32 %x, %y
%r = select i1 %c, i32 -1, i32 %a
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a

Name: ugt
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ugt i32 %y, %x
%r = select i1 %c, i32 -1, i32 %a
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a

Name: commute select
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ult i32 %y, %x
%r = select i1 %c, i32 %a, i32 -1
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a

Name: ugt + commute select
%notx = xor i32 %x, -1
%a = add i32 %notx, %y
%c = icmp ugt i32 %x, %y
%r = select i1 %c, i32 %a, i32 -1
=>
%c2 = icmp ult i32 %a, %y
%r = select i1 %c2, i32 -1, i32 %a

https://rise4fun.com/Alive/den

llvm-svn: 354887
2019-02-26 15:18:49 +00:00
Sanjay Patel c1e0184317 [InstCombine] reduce even more unsigned saturated add with 'not' op
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

  Name: uaddsat, -1 fval
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %notx, %y
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

  Name: uaddsat, -1 fval + ult
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ult i32 %y, %notx
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

https://rise4fun.com/Alive/nTp

llvm-svn: 354393
2019-02-19 22:14:21 +00:00
Sanjay Patel dcb93c0dda [InstCombine] rearrange saturated add folds; NFC
This is no-functional-change-intended, but that was also
true when it was part of rL354276, and I managed to lose
2 predicates for the fold with constant...causing much bot
distress. So this time I'm adding a couple of negative tests
to avoid that.

llvm-svn: 354384
2019-02-19 21:46:13 +00:00
Sanjay Patel 8a35d339c9 Revert "[InstCombine] reduce even more unsigned saturated add with 'not' op"
This reverts commit 079b610c29.
Bots are failing after this change on a stage 2 compile of clang.

llvm-svn: 354277
2019-02-18 16:04:22 +00:00
Sanjay Patel 079b610c29 [InstCombine] reduce even more unsigned saturated add with 'not' op
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

  Name: uaddsat, -1 fval
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %notx, %y
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

  Name: uaddsat, -1 fval + ult
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ult i32 %y, %notx
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

https://rise4fun.com/Alive/nTp

llvm-svn: 354276
2019-02-18 15:21:39 +00:00
Sanjay Patel b341ee7071 [InstCombine] reduce more unsigned saturated add with 'not' op
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

  Name: not op
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ult i32 %notx, %y
  %r = select i1 %c, i32 -1, i32 %a
  =>
  %a = add i32 %x, %y
  %c2 = icmp ult i32 %a, %y
  %r = select i1 %c2, i32 -1, i32 %a

  Name: not op ugt
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %y, %notx
  %r = select i1 %c, i32 -1, i32 %a
  =>
  %a = add i32 %x, %y
  %c2 = icmp ult i32 %a, %y
  %r = select i1 %c2, i32 -1, i32 %a

https://rise4fun.com/Alive/niom

(The matching here is still incomplete.)

llvm-svn: 354224
2019-02-17 16:48:50 +00:00
Sanjay Patel bee2073542 [InstCombine] reduce unsigned saturated add with 'not' op
We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

(The matching here is incomplete. Trying to take minimal steps
to make sure we don't induce infinite looping from existing
canonicalizations of the 'select'.)

llvm-svn: 354221
2019-02-17 15:58:48 +00:00
Sanjay Patel 18db56209c [InstCombine] canonicalize cmp/select form of uadd saturate with constant
I'm circling back around to a loose end from D51929.

The backend (either CGP or DAG) doesn't recognize this pattern, so we end up with different asm for these IR variants.

Regardless of any future changes to canonicalize to saturation/overflow intrinsics, we want to get raw IR variations 
into the minimal number of raw IR forms. If/when we can canonicalize to intrinsics, that will make that step easier.

  Pre: C2 == ~C1
  %a = add i32 %x, C1
  %c = icmp ugt i32 %x, C2
  %r = select i1 %c, i32 -1, i32 %a
  =>
  %a = add i32 %x, C1
  %c2 = icmp ult i32 %x, C2
  %r = select i1 %c2, i32 %a, i32 -1

  https://rise4fun.com/Alive/pkH

Differential Revision: https://reviews.llvm.org/D57352

llvm-svn: 352536
2019-01-29 20:02:45 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Sanjay Patel d023dd60e9 [InstCombine] canonicalize another raw IR rotate pattern to funnel shift
This is matching the equivalent of the DAG expansion, 
so it should never end up with worse perf than the 
original code even if the target doesn't have a rotate
instruction.

llvm-svn: 350672
2019-01-08 22:39:55 +00:00
Nikita Popov 65038515ee [InstCombine] Relax cttz/ctlz with select on zero
The cttz/ctlz intrinsics have a parameter specifying whether the
result is undefined for zero. cttz(x, false) can be relaxed to
cttz(x, true) if x is known non-zero, and in fact such an optimization
is already performed. However, this currently doesn't work if x is
non-zero as a result of a select rather than an explicit branch.
This patch adds handling for this case, thus allowing
x != 0 ? cttz(x, false) : y to simplify to x != 0 ? cttz(x, true) : y.

Differential Revision: https://reviews.llvm.org/D55786

llvm-svn: 350463
2019-01-05 09:48:16 +00:00
Sanjay Patel 654e6aabb9 [InstCombine] canonicalize raw IR rotate patterns to funnel shift
The final piece of IR-level analysis to allow this was committed with:
rL350188

Using the intrinsics should improve transforms based on cost models
like vectorization and inlining.

The backend should be prepared too, so we can now canonicalize more
sequences of shift/logic to the intrinsics and know that the end
result should be equal or better to the original code even if the
target does not have an actual rotate instruction.

llvm-svn: 350199
2019-01-01 21:51:39 +00:00
Sanjay Patel d802270808 [InstSimplify] fold select with implied condition
This is an almost direct move of the functionality from InstCombine to 
InstSimplify. There's no reason not to do this in InstSimplify because 
we never create a new value with this transform.

(There's a question of whether any dominance-based transform belongs in
either of these passes, but that's a separate issue.)

I've changed 1 of the conditions for the fold (1 of the blocks for the 
branch must be the block we started with) into an assert because I'm not 
sure how that could ever be false.

We need 1 extra check to make sure that the instruction itself is in a
basic block because passes other than InstCombine may be using InstSimplify
as an analysis on values that are not wired up yet.

The 3-way compare changes show that InstCombine has some kind of 
phase-ordering hole. Otherwise, we would have already gotten the intended
final result that we now show here.

llvm-svn: 347896
2018-11-29 18:44:39 +00:00
Mandeep Singh Grang 0905fc77c1 [InstCombine] Remove a couple of asserts based on incorrect assumptions
Summary:
These asserts are based on the assumption that the order of true/false operands in a select and those in the compare would always be the same.
This fixes PR39595.

Reviewers: craig.topper, spatel, dmgreen

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54359

llvm-svn: 346874
2018-11-14 17:55:07 +00:00
Sanjay Patel f8f12272e8 [InstCombine] canonicalize rotate patterns with cmp/select
The cmp+branch variant of this pattern is shown in:
https://bugs.llvm.org/show_bug.cgi?id=34924
...and as discussed there, we probably can't transform
that without a rotate intrinsic. We do have that now
via funnel shift, but we're not quite ready to 
canonicalize IR to that form yet. The case with 'select'
should already be transformed though, so that's this patch.

The sequence with negation followed by masking is what we
use in the backend and partly in clang (though that part 
should be updated).

https://rise4fun.com/Alive/TplC
  %cmp = icmp eq i32 %shamt, 0
  %sub = sub i32 32, %shamt
  %shr = lshr i32 %x, %shamt
  %shl = shl i32 %x, %sub
  %or = or i32 %shr, %shl
  %r = select i1 %cmp, i32 %x, i32 %or
  =>
  %neg = sub i32 0, %shamt
  %masked = and i32 %shamt, 31
  %maskedneg = and i32 %neg, 31
  %shl2 = lshr i32 %x, %masked
  %shr2 = shl i32 %x, %maskedneg
  %r = or i32 %shl2, %shr2

llvm-svn: 346807
2018-11-13 22:47:24 +00:00
Sanjay Patel 1440107821 [InstSimplify] fold select (fcmp X, Y), X, Y
This is NFCI for InstCombine because it calls InstSimplify, 
so I left the tests for this transform there. As noted in
the code comment, we can allow this fold more often by using
FMF and/or value tracking.

llvm-svn: 346169
2018-11-05 21:51:39 +00:00
Sanjay Patel 87aa10062c [InstCombine] loosen FP 0.0 constraint for fcmp+select substitution
It looks like we correctly removed edge cases with 0.0 from D50714,
but we were a bit conservative because getBinOpIdentity() doesn't
distinguish between +0.0 and -0.0 and 'nsz' is effectively always
true for fcmp (see discussion in:
https://bugs.llvm.org/show_bug.cgi?id=38086

Without this change, we would get regressions by canonicalizing
to +0.0 in all fcmp, and that's a step towards solving:
https://bugs.llvm.org/show_bug.cgi?id=39475

llvm-svn: 346143
2018-11-05 16:50:44 +00:00
Sanjay Patel 747feb28e4 [InstCombine] use 'match' to handle vectors and simplify code
This is another step towards completely removing the fake 
binop queries for not/neg/fneg.

llvm-svn: 345036
2018-10-23 15:05:12 +00:00
Sanjay Patel ad76c682c7 [InstCombine] swap select profile metadata when swapping select ops
llvm-svn: 345034
2018-10-23 14:43:31 +00:00
Neil Henning 57f5d0a885 [IRBuilder] Fixup CreateIntrinsic to allow specifying Types to Mangle.
The IRBuilder CreateIntrinsic method wouldn't allow you to specify the
types that you wanted the intrinsic to be mangled with. To fix this
I've:

- Added an ArrayRef<Type *> member to both CreateIntrinsic overloads.
- Used that array to pass into the Intrinsic::getDeclaration call.
- Added a CreateUnaryIntrinsic to replace the most common use of
  CreateIntrinsic where the type was auto-deduced from operand 0.
- Added a bunch more unit tests to test Create*Intrinsic calls that
  weren't being tested (including the FMF flag that wasn't checked).

This was suggested as part of the AMDGPU specific atomic optimizer
review (https://reviews.llvm.org/D51969).

Differential Revision: https://reviews.llvm.org/D52087

llvm-svn: 343962
2018-10-08 10:32:33 +00:00
Craig Topper 2b3f5df73a [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible
Summary: This restores the combine that was reverted in r341883. The infinite loop from the failing test no longer occurs due to changes from r342163.

Reviewers: spatel, dmgreen

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52070

llvm-svn: 342797
2018-09-22 05:53:27 +00:00
Craig Topper 8fc05ce340 [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible.
This allows the xor to be removed completely.

This might help with recomitting r341674, but seems good regardless.

Coincidentally fixes PR38915.

Differential Revision: https://reviews.llvm.org/D51964

llvm-svn: 342163
2018-09-13 18:52:58 +00:00
Sanjay Patel 37e464876b [InstCombine] remove checks for IsFreeToInvert()
I accidentally committed this diff with rL342147 because
I had applied D51964. We probably do need those checks,
but D51964 has tests and more discussion/motivation,
so they should be re-added with that patch.

llvm-svn: 342149
2018-09-13 16:18:12 +00:00
Sanjay Patel 6f00fc3317 [InstCombine] reorder folds to reduce chance of infinite loops
I don't have a test case for this, but it's motivated by
the discussion in D51964, and I've added TODO comments for
the better fix - move simplifications into instsimplify
because that's more efficient and reduces risk of infinite
loops in instcombine caused by transforms trying to do the
opposite folds.

In this case, we know that the transform that tries to move
'not' through min/max can be fooled by the multiple uses
of a value in another min/max, so try to squash the 
foldSPFofSPF() patterns first.

llvm-svn: 342147
2018-09-13 16:04:06 +00:00
Alina Sbirlea 116caa2920 [InstCombine] Partially revert rL341674 due to PR38897.
Summary:
Revert min/max changes in rL341674 dues to high compile times causing timeouts (PR38897).
Checking in to unblock failing builds. Patch available for post-commit review and re-revert once resolved.
Working on a smaller reproducer for PR38897.

Reviewers: craig.topper, spatel

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D51897

llvm-svn: 341883
2018-09-10 23:47:21 +00:00
Craig Topper 040c2b0acf [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible
If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min.

I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not.

Differential Revision: https://reviews.llvm.org/D51398

llvm-svn: 341674
2018-09-07 16:19:50 +00:00
Florian Hahn e32ff4b28a [InstCombine] Do not fold scalar ops over select with vector condition.
If OtherOpT or OtherOpF have scalar types and the condition is a vector,
we would create an invalid select.

Reviewers: spatel, john.brawn, mssimpso, craig.topper

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D51781

llvm-svn: 341666
2018-09-07 14:40:06 +00:00
Craig Topper 2bcb1eeee1 [InstCombine] Replace two calls to getNumUses() with !hasNUsesOrMore
We were calling getNumUses to check for 1 or 2 uses. But getNumUses is linear in the number of uses. We can instead use !hasNUsesOrMore(3) which will stop the linear scan as soon as it determines there are at least 3 uses even if there are more.

llvm-svn: 340939
2018-08-29 17:09:21 +00:00
Sanjay Patel c615910be5 [InstCombine] fix formatting; NFC
llvm-svn: 340790
2018-08-27 23:01:10 +00:00
David Bolvansky 43b0e25847 [InstCombine] Fold Select with binary op - FP opcodes
Summary:
Follow up for https://reviews.llvm.org/rL339520 and https://reviews.llvm.org/rL338300

Alive:

```
%A = fcmp oeq float %x, 0.0
%B = fadd nsz float %x, %z
%C = select i1 %A, float %B, float %y
=>
%C = select i1 %A, float %z, float %y
----------                                                                      
  %A = fcmp oeq float %x, 0.0
  %B = fadd nsz float %x, %z
  %C = select %A, float %B, float %y
=>
  %C = select %A, float %z, float %y

Done: 1                                                                         
Optimization is correct

%A = fcmp une float %x, -0.0
%B = fadd nsz float %x, %z
%C = select i1 %A, float %y, float %B
=>
%C = select i1 %A, float %y, float %z
----------                                                                      
  %A = fcmp une float %x, -0.0
  %B = fadd nsz float %x, %z
  %C = select %A, float %y, float %B
=>
  %C = select %A, float %y, float %z

Done: 1                                                                         
Optimization is correct
```


Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50714

llvm-svn: 340538
2018-08-23 15:22:15 +00:00
Michael Berg 0b838deddc extend binop folds for selects to include true and false binops flag intersection
Summary: This change address bug 38641

Reviewers: spatel, wristow

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D50996

llvm-svn: 340222
2018-08-20 22:26:58 +00:00
Craig Topper 24674ca773 [InstCombine] Move some variable declarations into a more appropriate scope. NFC
llvm-svn: 340150
2018-08-20 05:35:12 +00:00
Michael Berg ed89d069f4 add a missed case for binary op FMF propagation under select folds
llvm-svn: 339938
2018-08-16 20:59:45 +00:00
David Bolvansky 01d98cc03f [InstCombine] Fold Select with binary op - non-commutative opcodes
Summary:
Basic version was merged - https://reviews.llvm.org/D49954

This adds support for FP & non-commutative opcodes

Precommited tests: https://reviews.llvm.org/rL338727

Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: jfb

Differential Revision: https://reviews.llvm.org/D50190

llvm-svn: 339520
2018-08-12 17:30:07 +00:00
Sanjay Patel 85e17bb195 [InstCombine] rearrange code for foldSelectBinOpIdentity; NFCI
This is a retry of rL339439 with a fix for the problem that
caused the original commit to be reverted at rL339446. 

That problem was that the compare can be integer while
the binop is FP or vice-versa, so we need to use the binop 
type when we ask for the identity constant.

A test to guard against the problem was added at rL339453.

llvm-svn: 339469
2018-08-10 20:30:35 +00:00
Sanjay Patel c9cc86a5b3 [InstCombine] revert r339439 - rearrange code for foldSelectBinOpIdentity
That was supposed to be NFC, but it exposed a logic hole somewhere that
caused bots to fail.

llvm-svn: 339446
2018-08-10 16:12:19 +00:00
Sanjay Patel 3b92a17526 [InstCombine] rearrange code for foldSelectBinOpIdentity; NFCI
This should make it easier to folow and to add the planned enhancements
such as D50190.

llvm-svn: 339439
2018-08-10 15:11:26 +00:00
David Bolvansky 6737b3a6a1 [InstCombine] Fold Select with binary op
Summary:
Fold
  %A = icmp eq i8 %x, 0
  %B = xor i8 %x, %z
  %C = select i1 %A, i8 %B, i8 %y
To
  %C = select i1 %A, i8 %z, i8 %y

Fixes https://bugs.llvm.org/show_bug.cgi?id=38345
Proof: https://rise4fun.com/Alive/43J

Reviewers: lebedev.ri, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49954

llvm-svn: 338300
2018-07-30 20:38:53 +00:00
Chen Zheng 567485a72f [InstCombine] canonicalize abs pattern
Differential Revision: https://reviews.llvm.org/D48754

llvm-svn: 338092
2018-07-27 01:49:51 +00:00
Chen Zheng ccc8422464 [InstCombine] add more SPFofSPF folding
Differential Revision: https://reviews.llvm.org/D49238

llvm-svn: 337143
2018-07-16 02:23:00 +00:00
John Brawn e4ff0bd401 [InstCombine] Correct the cmp operand type used when canonicalizing abs/nabs
When adjusting a cmp in order to canonicalize an abs/nabs select pattern we need
to use the type of the existing operand when creating a new operand not the
type of a select operand, as the two may be different.

This fixes PR37686.

llvm-svn: 334019
2018-06-05 14:10:55 +00:00
Sanjay Patel 26368cd5d9 [InstCombine] narrow select to match condition operands' size
This is the planned enhancement to D47163 / rL333611.
We want to match cmp/select sizes because that will be recognized
as min/max more easily and lead to better codegen (especially for
vector types).

As mentioned in D47163, this improves some of the tests that would
also be folded by D46380, so we may want to adjust that patch to
match the new patterns where the extend op occurs after the select.

llvm-svn: 333689
2018-05-31 19:55:27 +00:00
Sanjay Patel a003c728a5 [InstCombine] choose 1 form of abs and nabs as canonical
We already do this for min/max (see the blob above the diff), 
so we should do the same for abs/nabs.
A sign-bit check (<s 0) is used as a predicate for other IR 
transforms and it's likely the best for codegen.

This might solve the motivating cases for D47037 and D47041, 
but I think those patches still make sense. We can't guarantee 
this canonicalization if the icmp has more than one use.

Differential Revision: https://reviews.llvm.org/D47076

llvm-svn: 332819
2018-05-20 14:23:23 +00:00
Craig Topper 0198b73769 [InstCombine] Qualify a select pattern based transform to restrct to only min/max and ignore abs/nabs.
llvm-svn: 332770
2018-05-18 21:21:56 +00:00
Sanjay Patel e7b6654711 [InstCombine] refine select-of-constants to bitwise ops
Add logic for the special case when a cmp+select can clearly be
reduced to just a bitwise logic instruction, and remove an 
over-reaching chunk of general purpose bit magic. The primary goal 
is to remove cases where we are not improving the IR instruction 
count when doing these select transforms, and in all cases here that 
is true.

In the motivating 3-way compare tests, there are further improvements
because we can combine/propagate select values (not sure if that
belongs in instcombine, but it's there for now).

DAGCombiner has folds to turn some of these selects into bit magic,
so there should be no difference in the end result in those cases.
Not all constant combinations are handled there yet, however, so it
is possible that some targets will see more cmov/csel codegen with
this change in IR canonicalization. 

Ideally, we'll go further to *not* turn selects into multiple 
logic/math ops in instcombine, and we'll canonicalize to selects.
But we should make sure that this step does not result in regressions
first (and if it does, we should fix those in the backend).

The general direction for this change was discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.html
http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html

Alive proofs for the new bit magic:
https://rise4fun.com/Alive/XG7

Differential Revision: https://reviews.llvm.org/D46086

llvm-svn: 331486
2018-05-03 21:58:44 +00:00
Sanjay Patel 807ddee1bf [InstCombine] clean up foldSelectICmpAnd(); NFC
As discussed in D45862, we want to delete parts of
this code because it can create more instructions
than it removes. But we also want to preserve some 
folds that are winners, so tidy up what's here to
make splitting the good from bad a bit easier.

llvm-svn: 330841
2018-04-25 16:34:01 +00:00
Roman Lebedev c00659328a [InstCombine]: foldSelectICmpAndAnd(): and is commutative
Summary:
The fold added in D45108 did not account for the fact that
the and instruction is commutative, and if the mask is a variable,
the mask variable and the fold variable may be swapped.

I have noticed this by accident when looking into [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]]

This extends/generalizes that fold, so it is handled too.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45539

llvm-svn: 330001
2018-04-13 09:57:57 +00:00
Roman Lebedev 41922f1a6d [InstCombine] Get rid of select of bittest (PR36950 / PR17564)
Summary:
See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 | PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 | PR17564 ]], D45065, D45107
https://godbolt.org/g/iAYRup

Alive proof: https://rise4fun.com/Alive/uiH

Testing: `ninja check-llvm`

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45108

llvm-svn: 329492
2018-04-07 10:37:24 +00:00
Sanjay Patel 93e64dd9a1 [PatternMatch] allow undef elements when matching vector FP +0.0
This continues the FP constant pattern matching improvements from:
https://reviews.llvm.org/rL327627
https://reviews.llvm.org/rL327339
https://reviews.llvm.org/rL327307

Several integer constant matchers also have this ability. I'm
separating matching of integer/pointer null from FP positive zero
and renaming/commenting to make the functionality clearer.

llvm-svn: 328461
2018-03-25 21:16:33 +00:00
Sanjay Patel 0ce3086777 [InstCombine] canonicalize fcmp+select to fabs
This is complicated by -0.0 and nan. This is based on the DAG patterns 
as shown in D44091. I'm hoping that we can just remove those DAG folds 
and always rely on IR canonicalization to handle the matching to fabs.

We would still need to delete the broken code from DAGCombiner to fix 
PR36600:
https://bugs.llvm.org/show_bug.cgi?id=36600

Differential Revision: https://reviews.llvm.org/D44550

llvm-svn: 327858
2018-03-19 15:14:30 +00:00
Craig Topper ee99aa4dd0 [InstCombine] Replace calls to getNumUses with hasNUses or hasNUsesOrMore
getNumUses is a linear time operation. It traverses the user linked list to the end and counts as it goes. Since we are only interested in small constant counts, we should use hasNUses or hasNUsesMore more that terminate the traversal as soon as it can provide the answer.

There are still two other locations in InstCombine, but changing those would force a rebase of D44266 which if accepted would remove them.

Differential Revision: https://reviews.llvm.org/D44398

llvm-svn: 327315
2018-03-12 18:46:05 +00:00
Sanjay Patel 1f2f5d18d3 [InstCombine] simplify min/max canonicalization; NFCI
llvm-svn: 326828
2018-03-06 19:01:18 +00:00
Sanjay Patel 7ed0bc26ac [ValueTracking] move helpers for SelectPatterns from InstCombine to ValueTracking
Most of the folds based on SelectPatternResult belong in InstSimplify rather than
InstCombine, so the helper code should be available to other passes/analysis.

llvm-svn: 326812
2018-03-06 16:57:55 +00:00
Craig Topper 1c19cc1745 [InstCombine] Don't fold select(C, Z, binop(select(C, X, Y), W)) -> select(C, Z, binop(Y, W)) if the binop is rem or div.
The select may have been preventing a division by zero or INT_MIN/-1 so removing it might not be safe.

Fixes PR36362.

Differential Revision: https://reviews.llvm.org/D43276

llvm-svn: 325148
2018-02-14 18:08:33 +00:00
Sanjay Patel e9a153f414 [InstCombine] add unsigned saturation subtraction canonicalizations
This is the instcombine part of unsigned saturation canonicalization.
Backend patches already commited: 
https://reviews.llvm.org/D37510
https://reviews.llvm.org/D37534

It converts unsigned saturated subtraction patterns to forms recognized 
by the backend:
(a > b) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b < a) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b > a) ? 0 : a - b -> ((a > b) ? a : b) - b)
(a < b) ? 0 : a - b -> ((a > b) ? a : b) - b)
((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b)

Patch by Yulia Koval!

Differential Revision: https://reviews.llvm.org/D41480

llvm-svn: 324255
2018-02-05 17:53:29 +00:00
John Brawn 2867bd72c0 [InstCombine] Make foldSelectOpOp able to handle two-operand getelementptr
Three (or more) operand getelementptrs could plausibly also be handled, but
handling only two-operand fits in easily with the existing BinaryOperator
handling.

Differential Revision: https://reviews.llvm.org/D39958

llvm-svn: 322930
2018-01-19 10:05:15 +00:00
Sanjay Patel 31b4b76f99 [InstCombine] fold min/max tree with common operand (PR35717)
There is precedence for factorization transforms in instcombine for FP ops with fast-math. 
We also have similar logic in foldSPFofSPF().

It would take more work to add this to reassociate because that's specialized for binops, 
and min/max are not binops (or even single instructions). Also, I don't have evidence that 
larger min/max trees than this exist in real code, but if we find that's true, we might
want to reorganize where/how we do this optimization.

In the motivating example from https://bugs.llvm.org/show_bug.cgi?id=35717 , we have:

int test(int xc, int xm, int xy) {
  int xk;
  if (xc < xm)
    xk = xc < xy ? xc : xy;
  else
    xk = xm < xy ? xm : xy;
  return xk;
}

This patch solves that problem because we recognize more min/max patterns after rL321672

https://rise4fun.com/Alive/Qjne
https://rise4fun.com/Alive/3yg

Differential Revision: https://reviews.llvm.org/D41603

llvm-svn: 321998
2018-01-08 15:05:34 +00:00
Sanjay Patel 26a6fcde83 [InstCombine] relax use constraint for min/max (~a, ~b) --> ~min/max(a, b)
In the minimal case, this won't remove instructions, but it still improves
uses of existing values.

In the motivating example from PR35834, it does remove instructions, and
sets that case up to be optimized by something like D41603:
https://reviews.llvm.org/D41603

llvm-svn: 321936
2018-01-06 17:34:22 +00:00
Sanjay Patel 5b6aacf2c1 [InstCombine] add folds for min(~a, b) --> ~max(a, b)
Besides the bug of omitting the inverse transform of max(~a, ~b) --> ~min(a, b),
the use checking and operand creation were off. We were potentially creating 
repeated identical instructions of existing values. This led to infinite
looping after I added the extra folds.

By using the simpler m_Not matcher and not creating new 'not' ops for a and b,
we avoid that problem. It's possible that not using IsFreeToInvert() here is
more limiting than the simpler matcher, but there are no tests for anything
more exotic. It's also possible that we should relax the use checking further
to handle a case like PR35834:
https://bugs.llvm.org/show_bug.cgi?id=35834
...but we can make that a follow-up if it is needed. 

llvm-svn: 321882
2018-01-05 19:01:17 +00:00
Craig Topper f7b86728fa [InstCombine] Simplify binops that are only used by a select and are fed by a select with the same condition.
Summary:
This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know its only used by the select we can propagate the appropriate input value from the earlier select.

As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero?

Reviewers: spatel, majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39999

llvm-svn: 318267
2017-11-15 05:23:02 +00:00
Matthew Simpson b6915fbfa2 [InstCombine] Simplify selects that test cmpxchg instructions
If a select instruction tests the returned flag of a cmpxchg instruction and
selects between the returned value of the cmpxchg instruction and its compare
operand, the result of the select will always be equal to its false value.

Differential Revision: https://reviews.llvm.org/D39383

llvm-svn: 316994
2017-10-31 12:34:02 +00:00
Eugene Zelenko 7f0f9bc5ab [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316503
2017-10-24 21:24:53 +00:00
Craig Topper 28d6d962d5 [InstCombine] Move foldSelectICmpAnd helper function earlier in the file to enable reuse in a future patch.
llvm-svn: 312518
2017-09-05 05:26:37 +00:00
Craig Topper 4c766a0559 [InstCombine] In foldSelectIntoOp, avoid creating a Constant before we know for sure we're going to use it and avoid an unnecessary call to m_APInt.
Instead of creating a Constant and then calling m_APInt with it (which will always return true). Just create an APInt initially, and use that for the checks in isSelect01 function. If it turns out we do need the Constant, create it from the APInt.

This is a refactor for a future patch that will do some more checks of the constant values here.

llvm-svn: 312517
2017-09-05 05:26:36 +00:00
Craig Topper 924f20262b [InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions
This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask.

This allows some code to be removed from InstSimplify that was doing this functionality.

This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out.

There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary.

Differential Revision: https://reviews.llvm.org/D37158

llvm-svn: 312382
2017-09-01 21:27:34 +00:00
Sanjay Patel 6f7ac7e402 [InstCombine] remove unnecessary vector select fold; NFCI
This code is double-dead:
1. We simplify all selects with constant true/false condition in InstSimplify.
   I've minimized/moved the tests to show that works as expected.
2. All remaining vector selects with a constant condition are canonicalized to
   shufflevector, so we really can't see this pattern.

llvm-svn: 312123
2017-08-30 14:04:57 +00:00
Craig Topper 5d6ddda92d [InstCombine] Teach foldSelectICmpAndOr to handle vector splats
This was pretty close to working already. While I was here I went ahead and passed the ICmpInst pointer from the caller instead of doing a dyn_cast that can never fail.

Differential Revision: https://reviews.llvm.org/D37237

llvm-svn: 311960
2017-08-29 00:13:49 +00:00
Craig Topper 516e39cd38 [InstCombine] Teach select01 helper of foldSelectIntoOp to handle vector splats
We were handling some vectors in foldSelectIntoOp, but not if the operand of the bin op was any kind of vector constant. This patch fixes it to treat vector splats the same as scalars.

Differential Revision: https://reviews.llvm.org/D37232

llvm-svn: 311940
2017-08-28 22:00:27 +00:00
Craig Topper 74177e1ed1 [InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt X, 0) and (icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type
This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely.

Differential Revision: https://reviews.llvm.org/D36498

llvm-svn: 311362
2017-08-21 19:02:06 +00:00
Craig Topper 882f29630b [InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors
This also uses decomposeBitTestICmp to decode the compare.

Differential Revision: https://reviews.llvm.org/D36781

llvm-svn: 311044
2017-08-16 21:52:07 +00:00
Craig Topper 8e351e9018 [InstCombine] Cast to BinaryOperator earlier in foldSelectIntoOp to simplify the code.
We no longer need the explicit operand count check or the later dynamic cast.

llvm-svn: 310339
2017-08-08 06:19:24 +00:00
Craig Topper 1bbcab9ca5 [InstCombine] Support vector splats in foldSelectICmpAnd.
Unfortunately, it looks like there's some other missed optimizations in the generated code for some of these cases. I'll try to look at some of those next.

llvm-svn: 310184
2017-08-05 20:00:41 +00:00
Craig Topper fc5283092b [InstCombine] In foldSelectICmpAnd, if we need to to truncate from the 'and' type to the 'select' type, do it after shifting right instead of just bailing.
Previously we were always trying to emit the zext or truncate before any shift. This meant if the 'and' mask was larger than the size of the truncate we would skip the transformation.

Now we shift the result of the and right first leaving the bit within the range of the truncate.

This matches what we are doing in foldSelectICmpAndOr for the same problem.

llvm-svn: 310159
2017-08-05 01:45:17 +00:00
Craig Topper 3b74a68cc7 [InstCombine] Use ConstantInt::getFalse to reduce some code. NFC
llvm-svn: 310062
2017-08-04 16:07:18 +00:00
Nikolai Bozhenov 1545eb3408 [InstCombine] Canonicalize clamp of float types to minmax in fast mode.
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.

This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.

Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper

Reviewed By: efriedma

Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D33186

llvm-svn: 310054
2017-08-04 12:22:17 +00:00
Craig Topper c2d3c631ff [InstCombine] Move the call to foldSelectICmpAnd into foldSelectInstWithICmp. NFCI
llvm-svn: 310025
2017-08-04 05:12:37 +00:00
Chad Rosier dfd1de687d [Value Tracking] Default argument to true and rename accordingly. NFC.
IMHO this is a bit more readable.

llvm-svn: 309739
2017-08-01 20:18:54 +00:00
Wei Mi fc0e245464 Disable loop unswitching for some patterns containing equality comparison with undef.
This is a workaround for the bug described in PR31652 and
http://lists.llvm.org/pipermail/llvm-dev/2017-July/115497.html. The temporary
solution is to add a function EqualityPropUnSafe. In EqualityPropUnSafe, for
some simple patterns we can know the equality comparison may contains undef,
so we regard such comparison as unsafe and will not do loop-unswitching for
them. We also need to disable the select simplification when one of select
operand is undef and its result feeds into equality comparison.

The patch cannot clear the safety issue caused by the bug, but it can suppress
the issue from happening to some extent.

Differential Revision: https://reviews.llvm.org/D35811

llvm-svn: 309059
2017-07-25 23:37:17 +00:00
Craig Topper fde4723ebe [IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing isIntegerTy(unsigned), but also works for vectors.
llvm-svn: 307492
2017-07-09 07:04:03 +00:00
Craig Topper e79b3e7d9a [InstCombine] Speculatively implement a fix for what might be the root cause of PR33721 by making sure that we have integer types before doing select C, -1, 0 -> sext C to int
I recently changed m_One and m_AllOnes to use Constant::isOneValue/isAllOnesValue which work on floating point values too. The original implementation looked specifically for ConstantInt scalars and splats. So I'm guessing we are accidentally trying to issue sext/zexts on floating point types now.

Hopefully I figure out how to reproduce the failure from the PR soon.

llvm-svn: 307486
2017-07-09 03:25:17 +00:00
Craig Topper bb4069e439 [InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere
Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference.

This patch makes it a reference everywhere including the InstCombiner class itself so there is more inconsistency. This a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do.

llvm-svn: 307451
2017-07-07 23:16:26 +00:00
Craig Topper 79ab643da8 [Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI
Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne.

llvm-svn: 307292
2017-07-06 18:39:47 +00:00
Craig Topper 3e1909d797 [InstCombine] Don't create extra ConstantInt objects in foldSelectICmpAnd. NFCI
Instead just use APInt objects and only create a ConstantInt at the end if we need it for the Offset.

llvm-svn: 307270
2017-07-06 15:58:54 +00:00
Nikolai Bozhenov bde9b14c6f Revert of r306525: "Canonicalize clamp of float types to minmax"
llvm-svn: 306815
2017-06-30 10:39:09 +00:00
Nikolai Bozhenov b01e6b5a52 [InstCombine] Canonicalize clamp of float types to minmax in fast mode.
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.

This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.

Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper

Reviewed By: efriedma

Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D33186

llvm-svn: 306525
2017-06-28 09:26:20 +00:00