Commit Graph

325 Commits

Author SHA1 Message Date
Sanjay Patel 2f7c24fe30 [InstCombine] (A + B) + B --> A + (B << 1)
This eliminates a use of 'B', so it can enable follow-on transforms
as well as improve analysis/codegen.

The PhaseOrdering test was added for D61726, and that shows
the limits of instcombine vs. real reassociation. We would
need to run some form of CSE to collapse that further.

The intermediate variable naming here is intentional because
there's a test at llvm/test/Bitcode/value-with-long-name.ll
that would break with the usual nameless value. I'm not sure
how to improve that test to be more robust.

The naming may also be helpful to debug regressions if this
change exposes weaknesses in the reassociation pass for example.
2020-05-22 11:46:59 -04:00
Roman Lebedev 352fef3f11
[InstCombine] Negator - sink sinkable negations
Summary:
As we have discussed previously (e.g. in D63992 / D64090 / [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]), `sub` instruction
can almost be considered non-canonical. While we do convert `sub %x, C` -> `add %x, -C`,
we sparsely do that for non-constants. But we should.

Here, i propose to interpret `sub %x, %y` as `add (sub 0, %y), %x` IFF the negation can be sinked into the `%y`

This has some potential to cause endless combine loops (either around PHI's, or if there are some opposite transforms).
For former there's `-instcombine-negator-max-depth` option to mitigate it, should this expose any such issues
For latter, if there are still any such opposing folds, we'd need to remove the colliding fold.
In any case, reproducers welcomed!

Reviewers: spatel, nikic, efriedma, xbolva00

Reviewed By: spatel

Subscribers: xbolva00, mgorny, hiraditya, reames, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68408
2020-04-21 22:00:23 +03:00
Sanjay Patel 01bcc3e937 [InstCombine] prevent infinite loop with sub/abs of constant expression
PR45539:
https://bugs.llvm.org/show_bug.cgi?id=45539
2020-04-15 09:19:16 -04:00
Serge Pavlov c7ff5b38f2 [FPEnv] Use single enum to represent rounding mode
Now compiler defines 5 sets of constants to represent rounding mode.
These are:

1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.

2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.

3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.

4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.

5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.

The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.

This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.

Differential Revision: https://reviews.llvm.org/D77379
2020-04-09 13:26:47 +07:00
Simon Moll d871ef4e6a [instcombine] remove fsub to fneg hacks; only emit fneg
Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp
negation. This also extends the scalarization cost in instcombine for unary
operators to result in the same IR rewrites for fneg as for the idiom.

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D75467
2020-03-10 16:57:02 +01:00
Florian Hahn c8c14d979a [InstCombine] Support vectors in SimplifyAddWithRemainder.
SimplifyAddWithRemainder currently also matches for vector types, but
tries to create an integer constant, which causes a crash.

By using Constant::getIntegerValue() we can support both the scalar and
vector cases.

The 2 added test cases crash without the fix.

Reviewers: spatel, lebedev.ri

Reviewed By: spatel, lebedev.ri

Differential Revision: https://reviews.llvm.org/D75906
2020-03-10 14:29:40 +00:00
Roman Lebedev 1badf7c33a
[InstComine] Forego of one-use check in `(X - (X & Y)) --> (X & ~Y)` if Y is a constant
Summary:
This is potentially more friendly for further optimizations,
analysies, e.g.: https://godbolt.org/z/G24anE

This resolves phase-ordering bug that was introduced
in D75145 for https://godbolt.org/z/2gBwF2
https://godbolt.org/z/XvgSua

Reviewers: spatel, nikic, dmgreen, xbolva00

Reviewed By: nikic, xbolva00

Subscribers: hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75757
2020-03-06 21:39:07 +03:00
Simon Moll ddd11273d9 Remove BinaryOperator::CreateFNeg
Use UnaryOperator::CreateFNeg instead.

Summary:
With the introduction of the native fneg instruction, the
fsub -0.0, %x idiom is obsolete. This patch makes LLVM
emit fneg instead of the idiom in all places.

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D75130
2020-02-27 09:06:03 -08:00
Nikita Popov 5a8819b216 [InstCombine] Use replaceOperand() in more places
This is a followup to D73803, which uses the replaceOperand()
helper in more places.

This should be NFC apart from changes to worklist order.

Differential Revision: https://reviews.llvm.org/D73919
2020-02-11 17:38:23 +01:00
Sanjay Patel 242fed9d7f [InstCombine] convert fsub nsz with fneg operand to -(X + Y)
This was noted in D72521 - we need to match fneg specifically to
consistently handle that pattern along with (-0.0 - X).
2020-01-27 14:49:15 -05:00
Nikita Popov bcfa0f592f [InstCombine] Move negation handling into freelyNegateValue()
Followup to D72978. This moves existing negation handling in
InstCombine into freelyNegateValue(), which make it composable.
In particular, root negations of div/zext/sext/ashr/lshr/sub can
now always be performed through a shl/trunc as well.

Differential Revision: https://reviews.llvm.org/D73288
2020-01-27 20:46:23 +01:00
Nikita Popov 0b83c5a78f [InstCombine] Combine neg of shl of sub (PR44529)
Fixes https://bugs.llvm.org/show_bug.cgi?id=44529. We already have
a combine to sink a negation through a left-shift, but it currently
only works if the shift operand is negatable without creating any
instructions. This patch introduces freelyNegateValue() as a more
powerful extension of dyn_castNegVal(), which allows negating a
value as long as this doesn't end up increasing instruction count.
Specifically, this patch adds support for negating A-B to B-A.

This mechanism could in the future be extended to handle general
negation chains that a) start at a proper 0-X negation and b) only
require one operand to be freely negatable. This would end up as a
weaker form of D68408 aimed at the most obviously profitable subset
that eliminates a negation entirely.

Differential Revision: https://reviews.llvm.org/D72978
2020-01-22 23:03:58 +01:00
Sanjay Patel 0ade2abdb0 [InstCombine] fneg(X + C) --> -C - X
This is 1 of the potential folds uncovered by extending D72521.

We don't seem to do this in the backend either (unless I'm not
seeing some target-specific transform).

icc and gcc (appears to be target-specific) do this transform.

Differential Revision: https://reviews.llvm.org/D73057
2020-01-22 09:48:43 -05:00
Sanjay Patel 3180af4362 [InstCombine] reassociate fsub+fsub into fsub+fadd
As discussed in the motivating PR44509:
https://bugs.llvm.org/show_bug.cgi?id=44509

...we can end up with worse code using fast-math than without.
This is because the reassociate pass greedily transforms fsub
into fneg/fadd and apparently (based on the regression tests
seen here) expects instcombine to clean that up if it wasn't
profitable. But we were missing this fold:

(X - Y) - Z --> X - (Y + Z)

There's another, more specific case that I think we should
handle as shown in the "fake" fneg test (but missed with a real
fneg), but that's another patch. That may be tricky to get
right without conflicting with existing transforms for fneg.

Differential Revision: https://reviews.llvm.org/D72521
2020-01-15 11:14:13 -05:00
Nikita Popov 0e322c8a1f [InstCombine] Preserve nuw on sub of geps (PR44419)
Fix https://bugs.llvm.org/show_bug.cgi?id=44419 by preserving the
nuw on sub of geps. We only do this if the offset has a multiplication
as the final operation, as we can't be sure the operations is nuw
in the other cases without more thorough analysis.

Differential Revision: https://reviews.llvm.org/D72048
2020-01-11 11:01:12 +01:00
Roman Lebedev 6d05bc2e3a
[NFCI][InstCombine] Refactor 'sink negation into select if that folds one hand of select to 0' fold
I would think it's better than having two practically identical folds
next to eachother, but then generalization isn't all that pretty
due to the fact that we need to produce different `sub` each time..

This change is no-functional-changes-intended refactoring.
2020-01-04 17:30:51 +03:00
Roman Lebedev 772ede3d5d
[InstCombine] Sink sub into hands of select if one hand becomes zero. Part 2 (PR44426)
This decreases use count of %Op0, makes one hand of select to be 0,
and possibly exposes further folding potential.

Name: sub %Op0, (select %Cond, %Op0, %FalseVal) -> select %Cond, 0, (sub %Op0, %FalseVal)
  %Op0 = %TrueVal
  %o = select i1 %Cond, i8 %Op0, i8 %FalseVal
  %r = sub i8 %Op0, %o
=>
  %n = sub i8 %Op0, %FalseVal
  %r = select i1 %Cond, i8 0, i8 %n

Name: sub %Op0, (select %Cond, %TrueVal, %Op0) -> select %Cond, (sub %Op0, %TrueVal), 0
  %Op0 = %FalseVal
  %o = select i1 %Cond, i8 %TrueVal, i8 %Op0
  %r = sub i8 %Op0, %o
=>
  %n = sub i8 %Op0, %TrueVal
  %r = select i1 %Cond, i8 %n, i8 0

https://rise4fun.com/Alive/aHRt

https://bugs.llvm.org/show_bug.cgi?id=44426
2020-01-04 17:30:51 +03:00
Roman Lebedev 4d8e47ca18
[InstCombine] Sink sub into hands of select if one hand becomes zero (PR44426)
This decreases use count of %Op1, makes one hand of select to be 0,
and possibly exposes further folding potential.

Name: sub (select %Cond, %Op1, %FalseVal), %Op1 -> select %Cond, 0, (sub %FalseVal, %Op1)
  %Op1 = %TrueVal
  %o = select i1 %Cond, i8 %Op1, i8 %FalseVal
  %r = sub i8 %o, %Op1
=>
  %n = sub i8 %FalseVal, %Op1
  %r = select i1 %Cond, i8 0, i8 %n

Name: sub (select %Cond, %TrueVal, %Op1), %Op1 -> select %Cond, (sub %TrueVal, %Op1), 0
  %Op1 = %FalseVal
  %o = select i1 %Cond, i8 %TrueVal, i8 %Op1
  %r = sub i8 %o, %Op1
=>
  %n = sub i8 %TrueVal, %Op1
  %r = select i1 %Cond, i8 %n, i8 0

https://rise4fun.com/Alive/avL

https://bugs.llvm.org/show_bug.cgi?id=44426
2020-01-04 17:30:51 +03:00
Roman Lebedev 7973aa05f6
[NFC][InstCombine] '(Op1 & С) - Op1' -> '-(Op1 & ~C)' fold (PR44427)
This decreases use count of Op1, potentially allows
us to further hoist said 'neg' later on,
and results in marginally better X86 codegen.

Name: (Op1 & С) - Op1 -> -(Op1 & ~C)
  %o = and i64 %Op1, C1
  %r = sub i64 %o, %Op1
=>
  %n = and i64 %Op1, ~C1
  %r = sub i64 0, %n

https://rise4fun.com/Alive/rwgA

https://godbolt.org/z/R_RMfM

https://bugs.llvm.org/show_bug.cgi?id=44427
2020-01-03 21:25:48 +03:00
Roman Lebedev cc0216bedb
[NFC][InstCombine] '(X & (- Y)) - X' -> '- (X & (Y - 1))' fold (PR44448)
Name: (X & (- Y)) - X  ->  - (X & (Y - 1))  (PR44448)
  %negy = sub i8 0, %y
  %unbiasedx = and i8 %negy, %x
  %r = sub i8 %unbiasedx, %x
=>
  %ymask = add i8 %y, -1
  %xmasked = and i8 %ymask, %x
  %r = sub i8 0, %xmasked

https://rise4fun.com/Alive/OIpla

This decreases use count of %x, may allow us to
later hoist said negation even further,
and results in marginally nicer X86 codegen.

See
  https://bugs.llvm.org/show_bug.cgi?id=44448
  https://reviews.llvm.org/D71499
2020-01-03 20:27:29 +03:00
Roman Lebedev 796fa662f1
[InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to `sub A, zext B -> add A, sext B`)
Summary:
D68408 proposes to greatly improve our negation sinking abilities.
But in current canonicalization, we produce `sub A, zext(B)`,
which we will consider non-canonical and try to sink that negation,
undoing the existing canonicalization.
So unless we explicitly stop producing previous canonicalization,
we will have two conflicting folds, and will end up endlessly looping.

This inverts canonicalization, and adds back the obvious fold
that we'd miss:
* `sub [nsw] Op0, sext/zext (bool Y) -> add [nsw] Op0, zext/sext (bool Y)`
  https://rise4fun.com/Alive/xx4
* `sext(bool) + C -> bool ? C - 1 : C`
  https://rise4fun.com/Alive/fBl

It is obvious that `@ossfuzz_9880()` / `@lshr_out_of_range()`/`@ashr_out_of_range()`
(oss-fuzz 4871) are no longer folded as much, though those aren't really worrying.

Reviewers: spatel, efriedma, t.p.northover, hfinkel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71064
2019-12-05 21:21:30 +03:00
Roman Lebedev 09311459e3
[InstCombine] Extend `0 - (X sdiv C) -> (X sdiv -C)` fold to non-splat vectors
Split off from https://reviews.llvm.org/D68408
2019-12-05 15:48:29 +03:00
Roman Lebedev 9948fac6c1 [NFC][InstCombine] Fixup comments
As noted in post-commit review of rL375378375378.

llvm-svn: 375397
2019-10-21 08:21:54 +00:00
Roman Lebedev 7015a5c54b [InstCombine] conditional sign-extend of high-bit-extract: 'or' pattern.
In this pattern, all the "magic" bits that we'd `add` are all
high sign bits, and in the value we'd be adding to they are all unset,
not unexpectedly, so we can have an `or` there:
https://rise4fun.com/Alive/ups

It is possible that `haveNoCommonBitsSet()` should be taught about this
pattern so that we never have an `add` variant, but the reasoning would
need to be recursive (because of that `select`), so i'm not really sure
that would be worth it just yet.

llvm-svn: 375378
2019-10-20 20:52:06 +00:00
Roman Lebedev 7cdeac43e5 [InstCombine] Fold conditional sign-extend of high-bit-extract into high-bit-extract-with-signext (PR42389)
This can come up in Bit Stream abstractions.

The pattern looks big/scary, but it can't be simplified any further.
It only is so simple because a number of my preparatory folds had
happened already (shift amount reassociation / shift amount
reassociation in bit test, sign bit test detection).

Highlights:
* There are two main flavors: https://rise4fun.com/Alive/zWi
  The difference is add vs. sub, and left-shift of -1 vs. 1
* Since we only change the shift opcode,
  we can preserve the exact-ness: https://rise4fun.com/Alive/4u4
* There can be truncation after high-bit-extraction:
  https://rise4fun.com/Alive/slHc1   (the main pattern i'm after!)
  Which means that we need to ignore zext of shift amounts and of NBits.
* The sign-extending magic can be extended itself (in add pattern
  via sext, in sub pattern via zext. not the other way around!)
  https://rise4fun.com/Alive/NhG
  (or those sext/zext can be sinked into `select`!)
  Which again means we should pay attention when matching NBits.
* We can have both truncation of extraction and widening of magic:
  https://rise4fun.com/Alive/XTw
  In other words, i don't believe we need to have any checks on
  bitwidths of any of these constructs.

This is worsened in general by the fact that we may have `sext` instead
of `zext` for shift amounts, and we don't yet canonicalize to `zext`,
although we should. I have not done anything about that here.

Also, we really should have something to weed out `sub` like these,
by folding them into `add` variant.

https://bugs.llvm.org/show_bug.cgi?id=42389

llvm-svn: 373964
2019-10-07 20:53:27 +00:00
Roman Lebedev 053014f8f9 [InstCombine] Deal with -(trunc(X >>u 63)) -> trunc(X >>s 63)
Identical to it's trunc-less variant, just pretent-to hoist
trunc, and everything else still holds:
https://rise4fun.com/Alive/JRU

llvm-svn: 373364
2019-10-01 17:50:20 +00:00
Roman Lebedev 65144149d0 [InstCombine] Preserve 'exact' in -(X >>u 31) -> (X >>s 31) fold
https://rise4fun.com/Alive/yR4

llvm-svn: 373363
2019-10-01 17:50:09 +00:00
David Bolvansky 420cbb6190 [InstCombine] sub(xor(x, y), or(x, y)) -> neg(and(x, y))
Summary:
```
Name: sub(xor(x, y), or(x, y)) -> neg(and(x, y))
%or = or i32 %y, %x
%xor = xor i32 %x, %y
%sub = sub i32 %xor, %or
  =>
%sub1 = and i32 %x, %y
%sub = sub i32 0, %sub1

Optimization: sub(xor(x, y), or(x, y)) -> neg(and(x, y))
Done: 1
Optimization is correct!
```

https://rise4fun.com/Alive/8OI

Reviewers: lebedev.ri

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67188

llvm-svn: 370945
2019-09-04 18:03:21 +00:00
David Bolvansky 0e07248704 [InstCombine] Fold sub (and A, B) (or A, B)) to neg (xor A, B)
Summary:
```
Name: sub(and(x, y), or(x, y)) -> neg(xor(x, y))
%or = or i32 %y, %x
%and = and i32 %x, %y
%sub = sub i32 %and, %or
  =>
%sub1 = xor i32 %x, %y
%sub = sub i32 0, %sub1

Optimization: sub(and(x, y), or(x, y)) -> neg(xor(x, y))
Done: 1
Optimization is correct!
```

https://rise4fun.com/Alive/VI6

Found by @lebedev.ri. Also author of the proof.

Reviewers: lebedev.ri, spatel

Reviewed By: lebedev.ri

Subscribers: llvm-commits, lebedev.ri

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67155

llvm-svn: 370934
2019-09-04 17:30:53 +00:00
David Bolvansky 358b80b340 [InstCombine] Fold sub (or A, B) (and A, B) to (xor A, B)
Summary:
```
Name: sub or and to xor
%or = or i32 %y, %x
%and = and i32 %x, %y
%sub = sub i32 %or, %and
  =>
%sub = xor i32 %x, %y

Optimization: sub or and to xor
Done: 1
Optimization is correct!
```
https://rise4fun.com/Alive/eJu

Reviewers: spatel, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67153

llvm-svn: 370883
2019-09-04 12:00:33 +00:00
Roman Lebedev 0410489a34 [InstCombine][NFC] Rename IsFreeToInvert() -> isFreeToInvert() for consistency
As per https://reviews.llvm.org/D65530#inline-592325

llvm-svn: 368686
2019-08-13 12:49:16 +00:00
Roman Lebedev 0efeaa8162 [IR] SelectInst: add swapValues() utility
Summary:
Sometimes we need to swap true-val and false-val of a `SelectInst`.
Having a function for that is nicer than hand-writing it each time.

Reviewers: spatel, RKSimon, craig.topper, jdoerfert

Reviewed By: jdoerfert

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65520

llvm-svn: 367547
2019-08-01 12:31:35 +00:00
Sanjay Patel 435cdecdf7 [InstCombine] canonicalize fneg before fmul/fdiv
Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it
easier to implement the transforms (and possibly other fneg transforms) in
1 place because we can always start the pattern match from fneg (either the
legacy binop or the new unop).

There's a secondary practical benefit seen in PR21914 and PR42681:
https://bugs.llvm.org/show_bug.cgi?id=21914
https://bugs.llvm.org/show_bug.cgi?id=42681
...hoisting fneg rather than sinking seems to play nicer with LICM in IR
(although this change may expose analysis holes in the other direction).

1. The instcombine test changes show the expected neutral IR diffs from
   reversing the order.

2. The reassociation tests show that we were missing an optimization
   opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says
   that all of these transforms are allowed (regardless of binop/unop
   fneg version) because:

   "For all other operations [besides copy/abs/negate/copysign], this
   standard does not specify the sign bit of a NaN result."
   In all of these transforms, we always have some other binop
   (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a
   potential intermediate NaN operand.
   (If that interpretation is wrong, then we must already have a bug in
   the existing transforms?)

3. The clang tests shouldn't exist as-is, but that's effectively a
   revert of rL367149 (the test broke with an extension of the
   pre-existing fneg canonicalization in rL367146).

Differential Revision: https://reviews.llvm.org/D65399

llvm-svn: 367447
2019-07-31 16:53:22 +00:00
Sanjay Patel e9ee7b47d4 [InstCombine] fold fadd+fneg with fdiv/fmul betweena
The backend already does this via isNegatibleForFree(),
but we may want to alter the fneg IR canonicalizations
that currently exist, so we need to try harder to fold
fneg in IR to avoid regressions.

llvm-svn: 367227
2019-07-29 13:50:25 +00:00
Sanjay Patel 5483f4225e [InstCombine] reduce code for fadd with fneg operand; NFC
llvm-svn: 367224
2019-07-29 13:20:46 +00:00
Sanjay Patel 99c57c6daf [InstCombine] fold fsub+fneg with fdiv/fmul between
The backend already does this via isNegatibleForFree(),
but we may want to alter the fneg IR canonicalizations
that currently exist, so we need to try harder to fold
fneg in IR to avoid regressions.

llvm-svn: 367194
2019-07-28 17:10:06 +00:00
Sanjay Patel c229cfeb7a [InstCombine] remove flop from lerp patterns
(Y * (1.0 - Z)) + (X * Z) -->
Y - (Y * Z) + (X * Z) -->
Y + Z * (X - Y)

This is part of solving:
https://bugs.llvm.org/show_bug.cgi?id=42716

Factoring eliminates an instruction, so that should be a good canonicalization.
The potential conversion to FMA would be handled by the backend based on target
capabilities.

Differential Revision: https://reviews.llvm.org/D65305

llvm-svn: 367101
2019-07-26 11:19:18 +00:00
Roman Lebedev 9f0c83902d [InstCombine] Y - ~X --> X + Y + 1 fold (PR42457)
Summary:
I *think* we'd want this new variant, because we obviously
have better handling for `add` as compared to `sub`/`not`.

https://rise4fun.com/Alive/WMn

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]

Reviewers: spatel, nikic, huihuiz, efriedma

Reviewed By: spatel

Subscribers: RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63992

llvm-svn: 365011
2019-07-03 09:41:50 +00:00
Roman Lebedev 04d3d3bbff [InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459)
Summary:
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

This does address the regression being exposed in D63992.

https://godbolt.org/z/7DGltU
https://rise4fun.com/Alive/EPO0

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]]

Reviewers: spatel, nikic, huihuiz

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63993

llvm-svn: 364792
2019-07-01 15:55:24 +00:00
Cameron McInally 08200d6d26 [InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ
Differential Revision: https://reviews.llvm.org/D62612

llvm-svn: 363082
2019-06-11 16:21:21 +00:00
Roman Lebedev 39390d8317 [InstCombine] 'C-(C2-X) --> X+(C-C2)' constant-fold
It looks this fold was already partially happening, indirectly
via some other folds, but with one-use limitation.
No other fold here has that restriction.

https://rise4fun.com/Alive/ftR

llvm-svn: 362217
2019-05-31 09:47:16 +00:00
Roman Lebedev 886c4ef35a [InstCombine] 'add (sub C1, X), C2 --> sub (add C1, C2), X' constant-fold
https://rise4fun.com/Alive/qJQ

llvm-svn: 362216
2019-05-31 09:47:04 +00:00
Cameron McInally 8bec58d5f7 [NFC][InstCombine] Add FIXME for one-use check on constant negation transforms.
llvm-svn: 361197
2019-05-20 21:00:42 +00:00
Cameron McInally 2557ca296a [InstCombine] Add visitFNeg(...) visitor for unary Fneg
Also, break out a helper function, namely foldFNegIntoConstant(...), which performs transforms common between visitFNeg(...) and visitFSub(...).

Differential Revision: https://reviews.llvm.org/D61693

llvm-svn: 361188
2019-05-20 19:10:30 +00:00
Cameron McInally e75412ab47 Add InstCombine::visitFNeg(...)
Differential Revision: https://reviews.llvm.org/D61784

llvm-svn: 360461
2019-05-10 20:01:04 +00:00
Robert Lougher 8681ef8f41 [InstCombine] Add new combine to add folding
(X | C1) + C2 --> (X | C1) ^ C1 iff (C1 == -C2)

I verified the correctness using Alive:
https://rise4fun.com/Alive/YNV

This transform enables the following transform that already exists in
instcombine:

(X | Y) ^ Y --> X & ~Y

As a result, the full expected transform is:

(X | C1) + C2 --> X & ~C1 iff (C1 == -C2)

There already exists the transform in the sub case:

(X | Y) - Y --> X & ~Y

However this does not trigger in the case where Y is constant due to an earlier
transform:

X - (-C) --> X + C

With this new add fold, both the add and sub constant cases are handled.

Patch by Chris Dawson.

Differential Revision: https://reviews.llvm.org/D61517

llvm-svn: 360185
2019-05-07 19:36:41 +00:00
Sanjay Patel f62dcea7ed [InstCombine] prevent possible miscompile with negate+sdiv of vector op
// 0 - (X sdiv C)  -> (X sdiv -C)  provided the negation doesn't overflow.

This fold has been around for many years and nobody noticed the potential
vector miscompile from overflow until recently...
So it seems unlikely that there's much demand for a vector sdiv optimization
on arbitrary vector constants, so just limit the matching to splat constants
to avoid the possible bug.

Differential Revision: https://reviews.llvm.org/D60426

llvm-svn: 358005
2019-04-09 14:09:06 +00:00
Chen Zheng 923c7c9daa [InstCombine] sdiv exact flag fixup.
Differential Revision: https://reviews.llvm.org/D60396

llvm-svn: 357904
2019-04-08 12:08:03 +00:00
Sanjay Patel 81e8d76f5b [InstCombine] form uaddsat from add+umin (PR14613)
This is the last step towards solving the examples shown in:
https://bugs.llvm.org/show_bug.cgi?id=14613

With this change, x86 should end up with psubus instructions
when those are available.

All known codegen issues with expanding the saturating intrinsics
were resolved with:
D59006 / rL356855

We also have some early evidence in D58872 that using the intrinsics
will lead to better perf. If some target regresses from this, custom
lowering of the intrinsics (as in the above for x86) may be needed.

llvm-svn: 357012
2019-03-26 17:50:08 +00:00
Sanjay Patel 4a47f5f550 [InstCombine] fold adds of constants separated by sext/zext
This is part of a transform that may be done in the backend:
D13757
...but it should always be beneficial to fold this sooner in IR
for all targets.

https://rise4fun.com/Alive/vaiW

  Name: sext add nsw
  %add = add nsw i8 %i, C0
  %ext = sext i8 %add to i32
  %r = add i32 %ext, C1
  =>
  %s = sext i8 %i to i32
  %r = add i32 %s, sext(C0)+C1

  Name: zext add nuw
  %add = add nuw i8 %i, C0
  %ext = zext i8 %add to i16
  %r = add i16 %ext, C1
  =>
  %s = zext i8 %i to i16
  %r = add i16 %s, zext(C0)+C1

llvm-svn: 355118
2019-02-28 19:05:26 +00:00
Sanjay Patel 9907d3c8b4 [InstCombine] canonicalize add/sub with bool
add A, sext(B) --> sub A, zext(B)

We have to choose 1 of these forms, so I'm opting for the
zext because that's easier for value tracking.

The backend should be prepared for this change after:
D57401
rL353433

This is also a preliminary step towards reducing the amount
of bit hackery that we do in IR to optimize icmp/select.
That should be waiting to happen at a later optimization stage.

The seeming regression in the fuzzer test was discussed in:
D58359

We were only managing that fold in instcombine by luck, and
other passes should be able to deal with that better anyway.

llvm-svn: 354748
2019-02-24 16:57:45 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Florian Hahn 4094f34f78 [InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs.
Otherwise instcombine gets stuck in a cycle. The canonicalization was
added in D55961.

This patch fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12400

llvm-svn: 351187
2019-01-15 11:18:21 +00:00
Sanjay Patel 79dceb2903 [InstCombine] name change: foldShuffledBinop -> foldVectorBinop; NFC
This function will deal with more than shuffles with D50992, and I 
have another potential per-element fold that could live here.

llvm-svn: 343692
2018-10-03 15:20:58 +00:00
David Green 1e44c3b62c [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A
This is an attempt to get out of a local-minimum that instcombine currently
gets stuck in. We essentially combine two optimisations at once, ~a - ~b = b-a
and min(~a, ~b) = ~max(a, b), only doing the transform if the result is at
least neutral. This involves using IsFreeToInvert, which has been expanded a
little to include selects that can be easily inverted.

This is trying to fix PR35875, using the ideas from Sanjay. It is a large
improvement to one of our rgb to cmy kernels.

Differential Revision: https://reviews.llvm.org/D52177

llvm-svn: 343569
2018-10-02 09:48:34 +00:00
Craig Topper 2da7381678 [InstCombine] Support (sub (sext x), (sext y)) --> (sext (sub x, y)) and (sub (zext x), (zext y)) --> (zext (sub x, y))
Summary:
If the sub doesn't overflow in the original type we can move it above the sext/zext.

This is similar to what we do for add. The overflow checking for sub is currently weaker than add, so the test cases are constructed for what is supported.

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52075

llvm-svn: 342335
2018-09-15 18:54:10 +00:00
Sanjay Patel 90a36346bc [InstCombine] refactor mul narrowing folds; NFCI
Similar to rL342278:
The test diffs are all cosmetic due to the change in
value naming, but I'm including that to show that the
new code does perform these folds rather than something
else in instcombine.

D52075 should be able to use this code too rather than
duplicating all of the logic.

llvm-svn: 342292
2018-09-14 22:23:35 +00:00
Sanjay Patel 46945b9e9d [InstCombine] add/use overflowing math helper functions; NFC
The mul case can already be refactored to use this similar to
rL342278.
The sub case is proposed in D52075.

llvm-svn: 342289
2018-09-14 21:30:07 +00:00
Sanjay Patel 2426eb46dd [InstCombine] refactor add narrowing folds; NFCI
The test diffs are all cosmetic due to the change in
value naming, but I'm including that to show that the
new code does perform these folds rather than something
else in instcombine.

llvm-svn: 342278
2018-09-14 20:40:46 +00:00
Craig Topper 12fd6bd4ad [InstCombine] Use dyn_cast instead of match(m_Constant). NFC
llvm-svn: 341962
2018-09-11 16:51:26 +00:00
Craig Topper a6cd4b9bce [InstCombine] Extend (add (sext x), cst) --> (sext (add x, cst')) and (add (zext x), cst) --> (zext (add x, cst')) to work for vectors
Differential Revision: https://reviews.llvm.org/D51236

llvm-svn: 340796
2018-08-28 02:02:29 +00:00
Andrea Di Biagio f874607f32 [InstCombine] Remove unused method FAddCombine::createFDiv(). NFC
This commit fixes a (gcc 7.3.0) [-Wunused-function] warning caused by the
presence of unused method FaddCombine::createFDiv().
The last use of that method was removed at r339519.

llvm-svn: 340014
2018-08-17 11:33:48 +00:00
Sanjay Patel dc185ee275 [InstCombine] fix/enhance fadd/fsub factorization
(X * Z) + (Y * Z) --> (X + Y) * Z
  (X * Z) - (Y * Z) --> (X - Y) * Z
  (X / Z) + (Y / Z) --> (X + Y) / Z
  (X / Z) - (Y / Z) --> (X - Y) / Z

The existing code that implemented these folds failed to 
optimize vectors, and it transformed code with multiple 
uses when it should not have.

llvm-svn: 339519
2018-08-12 15:48:26 +00:00
Sanjay Patel 55accd7dd3 [InstCombine] allow fsub+fmul FMF folds for vectors
llvm-svn: 339368
2018-08-09 18:42:12 +00:00
Sanjay Patel ebec4204da [InstCombine] reduce code duplication; NFC
llvm-svn: 339349
2018-08-09 15:07:13 +00:00
Sanjay Patel fe839695a8 [InstCombine] fold fadd+fsub with common operand
This is a sibling to the simplify from:
https://reviews.llvm.org/rL339174

llvm-svn: 339267
2018-08-08 16:19:22 +00:00
Sanjay Patel 2054dd79c2 [InstCombine] fold fsub+fsub with common operand
This is a sibling to the simplify from:
rL339171

llvm-svn: 339266
2018-08-08 16:04:48 +00:00
Sanjay Patel a194b2d2ff [InstCombine] fold fneg into constant operand of fmul/fdiv
This accounts for the missing IR fold noted in D50195. We don't need any fast-math to enable the negation transform. 
FP negation can always be folded into an fmul/fdiv constant to eliminate the fneg.

I've limited this to one-use to ensure that we are eliminating an instruction rather than replacing fneg by a 
potentially expensive fdiv or fmul.

Differential Revision: https://reviews.llvm.org/D50417

llvm-svn: 339248
2018-08-08 14:29:08 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Sanjay Patel 577c705752 [InstCombine] try to fold 'add+sub' to 'not+add'
These are reassociated versions of the same pattern and
similar transforms as in rL338200 and rL338118.

The motivation is identical to those commits:
Patterns with add/sub combos can be improved using
'not' ops. This is better for analysis and may lead
to follow-on transforms because 'xor' and 'add' are
commutative/associative. It can also help codegen.

llvm-svn: 338221
2018-07-29 18:13:16 +00:00
Sanjay Patel 818b253d3a [InstCombine] try to fold 'sub' to 'not'
https://rise4fun.com/Alive/jDd

Patterns with add/sub combos can be improved using
'not' ops. This is better for analysis and may lead
to follow-on transforms because 'xor' and 'add' are 
commutative/associative. It can also help codegen.  

llvm-svn: 338200
2018-07-28 16:48:44 +00:00
Sanjay Patel 70043b7e9a [InstCombine] return when SimplifyAssociativeOrCommutative makes a change
This bug was created by rL335258 because we used to always call instsimplify
after trying the associative folds. After that change it became possible
for subsequent folds to encounter unsimplified code (and potentially assert
because of it). 

Instead of carrying changed state through instcombine, we can just return 
immediately. This allows instsimplify to run, so we can continue assuming
that easy folds have already occurred.

llvm-svn: 336965
2018-07-13 01:18:07 +00:00
Gil Rapaport da2e2caa6c [InstCombine] (A + 1) + (B ^ -1) --> A - B
Turn canonicalized subtraction back into (-1 - B) and combine it with (A + 1) into (A - B).
This is similar to the folding already done for (B ^ -1) + Const into (-1 + Const) - B.

Differential Revision: https://reviews.llvm.org/D48535

llvm-svn: 335579
2018-06-26 05:31:18 +00:00
Sanjay Patel 7b0fc75f73 [InstCombine] simplify binops before trying other folds
This is outwardly NFC from what I can tell, but it should be more efficient 
to simplify first (despite the name, SimplifyAssociativeOrCommutative does
not actually simplify as InstSimplify does - it creates/morphs instructions).

This should make it easier to refactor duplicated code that runs for all binops.

llvm-svn: 335258
2018-06-21 17:06:36 +00:00
Sanjay Patel 3cd1aa88f9 [InstCombine] fold another shifty abs pattern to cmp+sel (PR36036)
The bug report:
https://bugs.llvm.org/show_bug.cgi?id=36036

...requests a DAG change for this, but an IR canonicalization
probably handles most cases. If we still want to match this
pattern in the backend, there's a proposal for that too:
D47831

Alive proofs including nsw/nuw cases that were first noted in:
D46988

https://rise4fun.com/Alive/Kmp

This patch is largely copied from the existing code that was
initially added with:
D40984
...but I didn't see much gain from trying to share code.

llvm-svn: 334137
2018-06-06 21:58:12 +00:00
Roman Lebedev cbf8446359 [InstCombine] PR37603: low bit mask canonicalization
Summary:
This is [[ https://bugs.llvm.org/show_bug.cgi?id=37603 | PR37603 ]].

https://godbolt.org/g/VCMNpS
https://rise4fun.com/Alive/idM

When doing bit manipulations, it is quite common to calculate some bit mask,
and apply it to some value via `and`.

The typical C code looks like:
```
int mask_signed_add(int nbits) {
    return (1 << nbits) - 1;
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_add(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 1, %0
  %3 = add nsw i32 %2, -1
  ret i32 %3
}
```

But there is a second, less readable variant:
```
int mask_signed_xor(int nbits) {
    return ~(-(1 << nbits));
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_xor(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 -1, %0
  %3 = xor i32 %2, -1
  ret i32 %3
}
```

Since we created such a mask, it is quite likely that we will use it in `and` next.
And then we may get rid of `not` op by folding into `andn`.

But now that i have actually looked:
https://godbolt.org/g/VTUDmU
_some_ backend changes will be needed too.
We clearly loose `bzhi` recognition.

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47428

llvm-svn: 334127
2018-06-06 19:38:27 +00:00
Sanjay Patel 3bd957b7ae [InstCombine] improve sub with bool folds
There's a patchwork of existing transforms trying to handle
these cases, but as seen in the changed test, we weren't
catching them all.

llvm-svn: 333845
2018-06-03 16:35:26 +00:00
Sanjay Patel bbc6d60677 [InstCombine] call simplify before trying vector folds
As noted in the review thread for rL333782, we could have
made a bug harder to hit if we were simplifying instructions
before trying other folds. 

The shuffle transform in question isn't ever a simplification;
it's just a canonicalization. So I've renamed that to make that 
clearer.

This is NFCI at this point, but I've regenerated the test file 
to show the cosmetic value naming difference of using 
instcombine's RAUW vs. the builder.

Possible follow-ups:
1. Move reassociation folds after simplifies too.
2. Refactor common code; we shouldn't have so much repetition.

llvm-svn: 333820
2018-06-02 16:27:44 +00:00
Sanjay Patel ceb595b04e [InstCombine] don't negate constant expression with fsub (PR37605)
X + (-C) would be transformed back into X - C, so infinite loop:
https://bugs.llvm.org/show_bug.cgi?id=37605

llvm-svn: 333610
2018-05-30 23:55:12 +00:00
Craig Topper 3b768e8602 [InstCombine] Negate ABS/NABS patterns by swapping the select operands to remove the negation
Differential Revision: https://reviews.llvm.org/D47236

llvm-svn: 333101
2018-05-23 17:29:03 +00:00
Omer Paparo Bivas fbb83deef7 [InstCombine] Moving overflow computation logic from InstCombine to ValueTracking; NFC
Differential Revision: https://reviews.llvm.org/D46704

Change-Id: Ifabcbe431a2169743b3cc310f2a34fd706f13f02
llvm-svn: 332026
2018-05-10 19:46:19 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Roman Lebedev 6959b8e76f [PatternMatch] Stabilize the matching order of commutative matchers
Summary:
Currently, we
1. match `LHS` matcher to the `first` operand of binary operator,
2. and then match `RHS` matcher to the `second` operand of binary operator.
If that does not match, we swap the `LHS` and `RHS` matchers:
1. match `RHS` matcher to the `first` operand of binary operator,
2. and then match `LHS` matcher to the `second` operand of binary operator.

This works ok.
But it complicates writing of commutative matchers, where one would like to match
(`m_Value()`) the value on one side, and use (`m_Specific()`) it on the other side.

This is additionally complicated by the fact that `m_Specific()` stores the `Value *`,
not `Value **`, so it won't work at all out of the box.

The last problem is trivially solved by adding a new `m_c_Specific()` that stores the
`Value **`, not `Value *`. I'm choosing to add a new matcher, not change the existing
one because i guess all the current users are ok with existing behavior,
and this additional pointer indirection may have performance drawbacks.
Also, i'm storing pointer, not reference, because for some mysterious-to-me reason
it did not work with the reference.

The first one appears trivial, too.
Currently, we
1. match `LHS` matcher to the `first` operand of binary operator,
2. and then match `RHS` matcher to the `second` operand of binary operator.
If that does not match, we swap the ~~`LHS` and `RHS` matchers~~ **operands**:
1. match ~~`RHS`~~ **`LHS`** matcher to the ~~`first`~~ **`second`** operand of binary operator,
2. and then match ~~`LHS`~~ **`RHS`** matcher to the ~~`second`~ **`first`** operand of binary operator.

Surprisingly, `$ ninja check-llvm` still passes with this.
But i expect the bots will disagree..

The motivational unittest is included.
I'd like to use this in D45664.

Reviewers: spatel, craig.topper, arsenm, RKSimon

Reviewed By: craig.topper

Subscribers: xbolva00, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D45828

llvm-svn: 331085
2018-04-27 21:23:20 +00:00
Sanjoy Das 6f1937b10f [InstCombine] Simplify Add with remainder expressions as operands.
Summary:
Simplify integer add expression X % C0 + (( X / C0 ) % C1) * C0 to
X % (C0 * C1).  This is a common pattern seen in code generated by the XLA
GPU backend.

Add test cases for this new optimization.

Patch by Bixia Zheng!

Reviewers: sanjoy

Reviewed By: sanjoy

Subscribers: efriedma, craig.topper, lebedev.ri, llvm-commits, jlebar

Differential Revision: https://reviews.llvm.org/D45976

llvm-svn: 330992
2018-04-26 20:52:28 +00:00
Sanjay Patel 1170daa277 [InstCombine] simplify fneg+fadd folds; NFC
Two cleanups:
1. As noted in D45453, we had tests that don't need FMF that were misplaced in the 'fast-math.ll' test file.
2. This removes the final uses of dyn_castFNegVal, so that can be deleted. We use 'match' now.

llvm-svn: 330126
2018-04-16 14:13:57 +00:00
Warren Ristow 8b2f27ce3a [InstCombine] Enable Add/Sub simplifications with only 'reassoc' FMF
These simplifications were previously enabled only with isFast(), but that
is more restrictive than required. Since r317488, FMF has 'reassoc' to
control these cases at a finer level.

llvm-svn: 330089
2018-04-14 19:18:28 +00:00
Sanjay Patel ff98682c9c [InstCombine] limit X - (cast(-Y) --> X + cast(Y) with hasOneUse()
llvm-svn: 329821
2018-04-11 15:57:18 +00:00
Sanjay Patel a9ca709011 [InstCombine] limit nsz: -(X - Y) --> Y - X to hasOneUse()
As noted in the post-commit discussion for r329350, we shouldn't
generally assume that fsub is the same cost as fneg.

llvm-svn: 329429
2018-04-06 17:24:08 +00:00
Sanjay Patel 04683de82f [InstCombine] FP: Z - (X - Y) --> Z + (Y - X)
This restores what was lost with rL73243 but without
re-introducing the bug that was present in the old code.

Note that we already have these transforms if the ops are 
marked 'fast' (and I assume that's happening somewhere in
the code added with rL170471), but we clearly don't need 
all of 'fast' for these transforms.

llvm-svn: 329362
2018-04-05 23:21:15 +00:00
Sanjay Patel 03e2526728 [InstCombine] nsz: -(X - Y) --> Y - X
This restores part of the fold that was removed with rL73243 (PR4374).

llvm-svn: 329350
2018-04-05 21:37:17 +00:00
Sanjay Patel deaf4f354e [InstCombine] use pattern matchers for fsub --> fadd folds
This allows folding for vectors with undef elements.

llvm-svn: 329316
2018-04-05 17:06:45 +00:00
Sanjay Patel 93e64dd9a1 [PatternMatch] allow undef elements when matching vector FP +0.0
This continues the FP constant pattern matching improvements from:
https://reviews.llvm.org/rL327627
https://reviews.llvm.org/rL327339
https://reviews.llvm.org/rL327307

Several integer constant matchers also have this ability. I'm
separating matching of integer/pointer null from FP positive zero
and renaming/commenting to make the functionality clearer.

llvm-svn: 328461
2018-03-25 21:16:33 +00:00
Sanjay Patel 1a8d5c3d1f [InstCombine] (~X) - (~Y) --> Y - X
llvm-svn: 326660
2018-03-03 17:53:25 +00:00
Sanjay Patel 8fdd87f929 [InstCombine] move constant check into foldBinOpIntoSelectOrPhi; NFCI
Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly 
vague. The constant check is redundant in some cases, but it allows 
removing duplication for most of the calls.

llvm-svn: 326329
2018-02-28 16:36:24 +00:00
Sanjay Patel 4a9116e897 [InstCombine] use FMF-copying functions to reduce code; NFCI
llvm-svn: 325923
2018-02-23 17:07:29 +00:00
Sanjay Patel b6404a8ca6 [InstCombine] canonicalize constant-minus-boolean to select-of-constants
This restores the half of:
https://reviews.llvm.org/rL75531
that was reverted at:
https://reviews.llvm.org/rL159230

For the x86 case mentioned there, we now produce:
leal 1(%rdi), %eax
subl %esi, %eax

We have target hooks to invert this in DAGCombiner (and x86 is enabled) with:
https://reviews.llvm.org/rL296977
https://reviews.llvm.org/rL311731

AArch64 and possibly other targets would probably benefit from enabling those hooks too. 
See PR30327:
https://bugs.llvm.org/show_bug.cgi?id=30327#c2

Differential Revision: https://reviews.llvm.org/D40612

llvm-svn: 319964
2017-12-06 21:22:57 +00:00
Sanjay Patel 629c411538 [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.

As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic 
reassociation - 'AllowReassoc'.

We're also adding a bit to allow approximations for library functions called 'ApproxFunc' 
(this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did 
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), 
but that's apparently already used for other purposes. Also, I don't think we can just 
add a field to FPMathOperator because Operator is not intended to be instantiated. 
We'll defer movement of FMF to another day.

We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.

Finally, this change is binary incompatible with existing IR as seen in the 
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile 
them. For example, if nsw is ever replaced with something else, dropping it would be 
a valid way to upgrade the IR." 
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR 
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will 
fail to optimize some previously 'fast' code because it's no longer recognized as 
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.

Note: an inter-dependent clang commit to use the new API name should closely follow 
commit.

Differential Revision: https://reviews.llvm.org/D39304

llvm-svn: 317488
2017-11-06 16:27:15 +00:00
Eugene Zelenko 7f0f9bc5ab [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316503
2017-10-24 21:24:53 +00:00
Sanjay Patel b869f76d85 [InstCombine] use m_Neg() to reduce code; NFCI
llvm-svn: 315762
2017-10-13 21:28:50 +00:00
Sanjay Patel f0242de143 [InstCombine] move code to remove repeated constant check; NFCI
Also, consolidate tests for this fold in one place.

llvm-svn: 315745
2017-10-13 20:29:11 +00:00