Commit Graph

913 Commits

Author SHA1 Message Date
Roman Lebedev ea1a0d7c9a
[InstSimplify] Bypass no-op `and`-mask, using known bits (PR49543)
We already special-cased a few interesting patterns,
but that is strictly less powerful than using KnownBits.

So instead get the known bits for the operand of `and`,
and iff all the unset bits of the `and`-mask are known to be zeros
in the operand, we can omit said `and`.
2021-04-21 00:31:46 +03:00
Sanjay Patel 7ef2c68a3d [InstSimplify] improve efficiency for detecting non-zero value
Stepping through callstacks in the example from D99759 reveals
this potential compile-time improvement.

The savings come from avoiding ValueTracking's computing known
bits if we have already dealt with special-case patterns.

Further improvements in this direction seem possible.

This makes a degenerate test based on PR49785 about 40x faster
(25 sec -> 0.6 sec), but it does not address the larger question
of how to limit computeKnownBitsFromAssume(). Ie, the original
test there is still infinite-time for all practical purposes.

Differential Revision: https://reviews.llvm.org/D100408
2021-04-14 09:04:15 -04:00
Roman Lebedev e8c7f43e2c
[NFC][ConstantRange] Add 'icmp' helper method
"Does the predicate hold between two ranges?"

Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, cleanup them all.
2021-04-10 19:38:55 +03:00
Roman Lebedev 7b12c8c59d
Revert "[NFC][ConstantRange] Add 'icmp' helper method"
This reverts commit 17cf2c9423.
2021-04-10 19:37:53 +03:00
Roman Lebedev 17cf2c9423
[NFC][ConstantRange] Add 'icmp' helper method
"Does the predicate hold between two ranges?"

Not very surprisingly, some places were already doing this check,
without explicitly naming the algorithm, cleanup them all.
2021-04-10 19:09:52 +03:00
Florian Hahn 4059c1c32d [SimplifyInst] Use correct type for GEPs with vector indices.
The current code does not properly handle vector indices unless they are
the first index.

At the moment LangRef gives the impression that the vector index must be
the one and only index (https://llvm.org/docs/LangRef.html#getelementptr-instruction).

But vector indices can appear at any position and according to the
verifier there may be multiple vector indices. If that's the case, the
number of elements must match.

This patch updates SimplifyGEPInst to properly handle those additional
cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D99961
2021-04-06 17:56:10 +01:00
Sanjay Patel e2a0f512ea [InstSimplify] fix potential miscompile in select value equivalence
This is the sibling fix to c590a9880d -
as there, we can't subsitute a vector value the equality
compare replacement that we are trying requires that the
comparison is true for the entire value. Vector select
can be partly true/false.
2021-04-05 16:52:34 -04:00
Sander de Smalen 0f7bbbc481 Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed.
In order to bring up scalable vector support in LLVM incrementally,
we introduced behaviour to emit a warning, instead of an error, when
asking the wrong question of a scalable vector, like asking for the
fixed number of elements.

This patch puts that behaviour under a flag. The default behaviour is
that the compiler will always error, which means that all LLVM unit
tests and regression tests will now fail when a code-path is taken that
still uses the wrong interface.

The behaviour to demote an error to a warning can be individually enabled
for tools that want to support experimental use of scalable vectors.
This patch enables that behaviour when driving compilation from Clang.
This means that for users who want to try out scalable-vector support,
fixed-width codegen support, or build user-code with scalable vector
intrinsics, Clang will not crash and burn when the compiler encounters
such a case.

This allows us to do away with the following pattern in many of the SVE tests:
  RUN: .... 2>%t
  RUN: cat %t | FileCheck --check-prefix=WARN
  WARN-NOT: warning: ...

The behaviour to emit warnings is only temporary and we expect this flag
to be removed in the future when scalable vector support is more stable.

This patch also has fixes the following tests:
 unittests:
   ScalableVectorMVTsTest.SizeQueries
   SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects
   AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG

 regression tests:
   Transforms/InstCombine/vscale_gep.ll

Reviewed By: paulwalker-arm, ctetreau

Differential Revision: https://reviews.llvm.org/D98856
2021-04-02 10:55:22 +01:00
Yang Fan 279d74ffd1
[InstSimplify] Fix unused variable warning (NFC)
GCC warning:
```
/llvm-project/llvm/lib/Analysis/InstructionSimplify.cpp: In function ‘llvm::Value* SimplifyWithOpReplaced(llvm::Value*, llvm::Value*, llvm::Value*, const llvm::SimplifyQuery&, bool, unsigned int)’:
/llvm-project/llvm/lib/Analysis/InstructionSimplify.cpp:3993:15: warning: unused variable ‘SI’ [-Wunused-variable]
 3993 |     if (auto *SI = dyn_cast<SelectInst>(I))
      |               ^~
```
2021-03-24 09:56:36 +08:00
Juneyoung Lee 960a767368 Reland "[InstCombine] Add simplification of two logical and/ors"
This relands 07c3b97e18 (D96945) which was reverted by
commit f49354838e.
The two-stage compilation successfully tests passes on my machine.
2021-03-23 16:24:50 +09:00
Nikita Popov 7e18cd887c [InstCombine] Whitelist non-refining folds in SimplifyWithOpReplaced
This is an alternative to D98391/D98585, playing things more
conservatively. If AllowRefinement == false, then we don't use
InstSimplify methods at all, and instead explicitly implement a
small number of non-refining folds. Most cases are handled by
constant folding, and I only had to add three folds to cover
our unit tests / test-suite. While this may lose some optimization
power, I think it is safer to approach from this direction, given
how many issues this code has already caused.

Differential Revision: https://reviews.llvm.org/D99027
2021-03-22 22:12:56 +01:00
Nikita Popov daae927f9c [InstSimplify] Clean up SimplifyReplacedWithOp implementation (NFCI)
Replace Op with RepOp up-front, and then always work with the new
operands, rather than checking for replacement in various places.
2021-03-21 15:30:30 +01:00
Simonas Kazlauskas 6513995be3 [InstSimplify] Restrict a GEP transform to avoid provenance changes
This is a follow-up to D98588, and fixes the inline `FIXME` about a GEP-related simplification not
preserving the provenance.

https://alive2.llvm.org/ce/z/qbQoAY

Additional tests were added in {rGf125f28afdb59eba29d2491dac0dfc0a7bf1b60b}

Depends on D98672

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D98611
2021-03-16 18:53:05 +02:00
Simonas Kazlauskas a977324800 [InstSimplify] Match PtrToInt more directly in a GEP transform (NFC)
In preparation for D98611, the upcoming change will need to apply additional checks to `P` and `V`,
and so this refactor paves the way for adding additional checks in a less awkward way.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D98672
2021-03-16 15:45:19 +02:00
Sanjay Patel 660728acd4 [InstSimplify] ctlz({signbit} >>u x) --> x
The motivating pattern was handled in 0a2d69480d ,
but we should have this for symmetry.

But this really highlights that we could generalize for
any shifted constant if we match this in instcombine.

https://alive2.llvm.org/ce/z/MrmVNt
2021-03-15 12:03:35 -04:00
Bjorn Pettersson 529c8e8dc6 [InstSimplify] Simplify smul.fix and smul.fix.sat
Add simplification of smul.fix and smul.fix.sat according to
  X * 0 -> 0
  X * undef -> 0
  X * (1 << scale) -> X

This includes the commuted patterns and splatted vectors.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D98299
2021-03-12 09:09:58 +01:00
Juneyoung Lee 720a828045 Resolve unused variable warning (NFC) 2021-03-11 12:03:03 +09:00
Juneyoung Lee 8652c3e1a3 [InstSimplify] Pass SimplifyQuery to computePointerICmp (NFC) 2021-03-11 11:13:46 +09:00
Juneyoung Lee f49354838e Revert "[InstCombine] Add simplification of two logical and/ors"
This reverts commit 07c3b97e18 due to a reported failure in two-stage build.
2021-03-10 05:48:31 +09:00
Sanjay Patel 34d0d644ff [ValueTracking] move/add helper to get inverse min/max; NFC
We will need to this functionality to improve min/max folds
in instcombine when we canonicalize to intrinsics.
2021-03-08 17:38:22 -05:00
Sanjay Patel 0a2d69480d [InstSimplify] cttz(1<<x) --> x
https://alive2.llvm.org/ce/z/TDacYu
https://alive2.llvm.org/ce/z/KF84S3
2021-03-08 16:30:14 -05:00
Juneyoung Lee 07c3b97e18 [InstCombine] Add simplification of two logical and/ors
This is a patch that adds folding of two logical and/ors that share one variable:

a && (a && b) -> a && b
a && (a & b)  -> a && b
...

This is towards removing the poison-unsafe select optimization (D93065 has more context).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96945
2021-03-08 02:38:43 +09:00
Simon Pilgrim 1020d16156 [InstSimplify] Handle nsw shl -> poison patterns
Pulled out from D90479 - this recognises invalid nsw shl patterns with signbit changes that result in poison.

Differential Revision: https://reviews.llvm.org/D97305
2021-02-23 18:26:56 +00:00
Simon Pilgrim 18b9fc48f1 [InstructionSimplify] SimplifyShift - rename shift amount KnownBits. NFCI.
As suggested on D97305.
2021-02-23 18:12:59 +00:00
Simon Pilgrim 476ff0327b [InstSimplify] Cleanup out-of-range shift amount handling.
Use APInt::uge() direct instead of getLimitedValue().

Use KnownBits::getMinValue() to make the bounds check more obvious.
2021-02-22 17:00:49 +00:00
Caroline Concatto 2d728bbff5 [CodeGen][SelectionDAG]Add new intrinsic experimental.vector.reverse
This patch adds  a new intrinsic experimental.vector.reduce that takes a single
vector and returns a vector of matching type but with the original lane order
 reversed. For example:

```
vector.reverse(<A,B,C,D>) ==> <D,C,B,A>
```

The new intrinsic supports fixed and scalable vectors types.
The fixed-width vector relies on shufflevector to maintain existing behaviour.
Scalable vector uses the new ISD node - VECTOR_REVERSE.

This new intrinsic is one of the named shufflevector intrinsics proposed on the
mailing-list in the RFC at [1].

Patch by Paul Walker (@paulwalker-arm).

[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html

Differential Revision: https://reviews.llvm.org/D94883
2021-02-15 13:39:43 +00:00
Juneyoung Lee 0441df94ad [InstCombine,InstSimplify] Optimize select followed by and/or/xor
This patch adds `A & (A && B)` -> `A && B`  (similarly for or + logical or)

Also, this patch adds `~(select C, (icmp pred X, Y), const)` -> `select C, (icmp pred' X, Y), ~const`.

Alive2 proof:
merge_and: https://alive2.llvm.org/ce/z/teMR97
merge_or: https://alive2.llvm.org/ce/z/b4yZUp
xor_and: https://alive2.llvm.org/ce/z/_-TXHi
xor_or: https://alive2.llvm.org/ce/z/2uYx_a

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94861
2021-01-19 09:14:17 +09:00
Nikita Popov a13c0f62c3 [InstSimplify] Fold x*C1/C2 <= x (PR48744)
We can fold x*C1/C2 <= x to true if C1 <= C2. This is valid even
if the multiplication is not nuw: https://alive2.llvm.org/ce/z/vULors

The multiplication or division can be replaced by shifts. We don't
handle the case where both are shifts, as that should get folded
away by InstCombine.
2021-01-17 16:02:55 +01:00
Dávid Bolvanský bfd75bdf3f [NFC] Removed extra text in comments 2021-01-16 22:48:56 +01:00
Dávid Bolvanský 63bedc80da [InstSimplify] Handle commutativity for 'and' and 'outer or' for (~A & B) | ~(A | B) --> ~A
Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D94870
2021-01-16 19:42:50 +01:00
Dávid Bolvanský bdd4dda58b [InstSimplify] Update comments, remove redundant tests 2021-01-16 16:31:23 +01:00
Dávid Bolvanský a4e2a5145a [InstSimplify] Add (~A & B) | ~(A | B) --> ~A 2021-01-16 15:43:34 +01:00
Nikita Popov 7ecad2e4ce [InstSimplify] Don't fold gep p, -p to null
This is a partial fix for https://bugs.llvm.org/show_bug.cgi?id=44403.
Folding gep p, q-p to q is only legal if p and q have the same
provenance. This fold should probably be guarded by something like
getUnderlyingObject(p) == getUnderlyingObject(q).

This patch is a partial fix that removes the special handling for
gep p, 0-p, which will fold to a null pointer, which would certainly
not pass an underlying object check (unless p is also null, in which
case this would fold trivially anyway). Folding to a null pointer
is particularly problematic due to the special handling it receives
in many places, making end-to-end miscompiles more likely.

Differential Revision: https://reviews.llvm.org/D93820
2021-01-12 20:24:23 +01:00
Juneyoung Lee 3a60a1f165 [InstSimplify] Fold insertelement vec, poison, idx into vec
This is a simple patch that adds folding from `insertelement vec, poison, idx` into `vec`.

Alive2 proof: https://alive2.llvm.org/ce/z/2y2vbC

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93994
2021-01-07 10:10:14 +09:00
Nikita Popov 221c3b174b [InstSimplify] Canonicalize non-demanded shuffle op to poison (NFCI)
I don't believe this has an observable effect, because the only
thing we care about here is replacing the operand with a constant
so following folds can apply. This change is just to make the
representation follow canonical unary shuffle form.
2021-01-06 21:22:27 +01:00
Nikita Popov d042f2db5b [InstSimplify] Fold call null/undef to poison
Calling null or undef results in immediate undefined behavior.
Return poison instead of undef in this case, similar to what
we do for immediate UB due to division by zero.
2021-01-06 21:09:30 +01:00
Nikita Popov a6df39236f [InstSimplify] Fold out-of-bounds shift to poison
Make InstSimplify return poison rather than undef for out-of-bounds
shifts, as specified by LandRef:

> If op2 is (statically or dynamically) equal to or larger than the
> number of bits in op1, this instruction returns a poison value.

Differential Revision: https://reviews.llvm.org/D93998
2021-01-06 20:41:37 +01:00
Juneyoung Lee f665a8c5b8 [InstSimplify] gep with poison operand is poison
This is a tiny update to fold gep poison into poison. :)

Alive2 proofs:
https://alive2.llvm.org/ce/z/7Nwdri
https://alive2.llvm.org/ce/z/sDP4sC
2021-01-05 11:07:49 +09:00
Kazu Hirata 848e8f938f [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-04 11:42:44 -08:00
Nikita Popov 3715c99be9 [InstSimplify] Fold nnan/ninf violation to poison
As the comment already indicates, performing an operation with
nnan/ninf flags on a nan/inf or undef results in poison. Now that
we have a proper poison value, we no longer need to relax it to
undef.
2021-01-03 22:05:40 +01:00
Nikita Popov 766cf7f32e [InstSimplify] Fold division by zero to poison
Div/rem by zero is immediate undefined behavior and anything goes.
Currently we fold it to undef, this patch changes it to fold to
poison instead, which is slightly stronger.

Differential Revision: https://reviews.llvm.org/D93995
2021-01-03 20:52:45 +01:00
Nikita Popov f094d65bea [InstSimplify] Fix addo/subo with undef (PR43188)
We can't fold the first result to undef, because not all values
may be reachable under the constraint that no overflow occurred.
Use the same folds we do for saturated math instead.

Proofs:
uaddo: https://alive2.llvm.org/ce/z/zf55N_
saddo: https://alive2.llvm.org/ce/z/a_xPgS
usubo: https://alive2.llvm.org/ce/z/DmRqwt
ssubo: https://alive2.llvm.org/ce/z/8ag7U-
2021-01-03 18:51:49 +01:00
Nikita Popov c6ad00d709 [InstSimplify] Return poison for out of bounds extractelement
This is the same change as D93990, but for extractelement rather
than insertelement.

> If idx exceeds the length of val for a fixed-length vector, the
> result is a poison value. For a scalable vector, if the value of
> idx exceeds the runtime length of the vector, the result is a
> poison value.
2021-01-03 18:15:58 +01:00
Juneyoung Lee 2139958b53 [InstSimplify] Return poison if insertelement touches out of bounds
This is a simple patch that updates InstSimplify to return poison if the index is/can be out-of-bounds

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93990
2021-01-04 00:43:02 +09:00
Kazu Hirata a87c7003ac [Analysis] Remove unused code recursivelySimplifyInstruction (NFC)
The last use of the function, located in RemovePredecessorAndSimplify,
was removed on Dec 25, 2020 in commit
46bea9b297.

The last use of RemovePredecessorAndSimplify was removed on Sep 29,
2010 in commit 99c985c37d.
2020-12-30 17:45:40 -08:00
Sanjay Patel 236c4524a7 [InstSimplify] remove ctpop of 1 (low) bit
https://llvm.org/PR48608

As noted in the test comment, we could handle a more general
case in instcombine and remove this, but I don't have evidence
that we need to do that.

https://alive2.llvm.org/ce/z/MRW9gD
2020-12-28 16:06:20 -05:00
Sanjay Patel 38ca7face6 [InstSimplify] reduce logic with inverted add/sub ops
https://llvm.org/PR48559
This could be part of a larger ValueTracking API,
but I don't see that currently.

https://rise4fun.com/Alive/gR0

  Name: and
  Pre: C1 == ~C2
  %sub = add i8 %x, C1
  %sub1 = sub i8 C2, %x
  %r = and i8 %sub, %sub1
  =>
  %r = 0

  Name: or
  Pre: C1 == ~C2
  %sub = add i8 %x, C1
  %sub1 = sub i8 C2, %x
  %r = or i8 %sub, %sub1
  =>
  %r = -1

  Name: xor
  Pre: C1 == ~C2
  %sub = add i8 %x, C1
  %sub1 = sub i8 C2, %x
  %r = xor i8 %sub, %sub1
  =>
  %r = -1
2020-12-21 08:51:43 -05:00
Roman Lebedev e9289dc25f
[InstSimplify] Don't miscompile `X == 0 ? abs(X) : -abs(X) --> -abs(X)` xform
The transform wasn't checking that the LHS of the comparison
*is* the `X` in question...
This is the miscompile that was holding up D87188.

Thanks to Dave Green for producing an actionable reproducer!
2020-12-18 21:18:13 +03:00
Kazu Hirata eb44682d67 [Analysis] Use is_contained (NFC) 2020-12-11 21:19:31 -08:00
Cullen Rhodes 7b8d50b141 [InstSimplify] Clarify use of FixedVectorType in SimplifySelectInst
Folding a select of vector constants that include undef elements only
applies to fixed vectors, but there's no earlier check the type is not
scalable so it crashes for scalable vectors. This adds a check so this
optimization is only attempted for fixed vectors.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D92046
2020-11-27 09:55:29 +00:00
Sanjay Patel 00808e321c [InstSimplify] allow vector folds for (Pow2C << X) == NonPow2C
Existing pre-conditions seem to be correct:
https://rise4fun.com/Alive/lCLB

  Name: non-zero C1
  Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C1 != 0
  %sub = shl i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: one == C2
  Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C2 == 1
  %sub = shl i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: nuw
  Pre: !isPowerOf2(C1) && isPowerOf2(C2)
  %sub = shl nuw i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: nsw
  Pre: !isPowerOf2(C1) && isPowerOf2(C2)
  %sub = shl nsw i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false
2020-11-08 09:52:05 -05:00
Sanjay Patel c74db55ff5 [InstSimplify] allow vector folds for icmp Pred (1 << X), 0x80 2020-11-04 08:12:48 -05:00
Sanjay Patel e77ba263fe [InstSimplify] peek through 'not' operand in logic-of-icmps fold
This extends D78430 to solve cases like:
https://llvm.org/PR47858

There are still missed opportunities shown in the tests,
and as noted in the earlier patches, we have related
functionality in InstCombine, so we may want to extend
other folds in a similar way.

A semi-random sampling of test diff proofs in this patch:
https://rise4fun.com/Alive/sS4C
2020-10-25 11:13:30 -04:00
Sjoerd Meijer 51d7df3fa1 [InstructionSimplify] icmp (X+Y), (X+Z) simplification
This improves simplifications for pattern `icmp (X+Y), (X+Z)` -> `icmp Y,Z`
if only one of the operands has NSW set, e.g.:

    icmp slt (x + 0), (x +nsw 1)

We can still safely rewrite this to:

    icmp slt 0, 1

because we know that the LHS can't overflow if the RHS has NSW set and
C1 < C2 && C1 >= 0, or C2 < C1 && C1 <= 0

This simplification is useful because ScalarEvolutionExpander which is used to
generate code for SCEVs in different loop optimisers is not always able to put
back NSW flags across control-flow, thus inhibiting CFG simplifications.

Differential Revision: https://reviews.llvm.org/D89317
2020-10-22 08:55:52 +01:00
Sanjay Patel 7c516504a1 [InstSimplify] allow vector splats for icmp-of-neg folds 2020-10-20 09:24:36 -04:00
Juneyoung Lee 9b3c2a72e4 [ValueTracking] Use assume's noundef operand bundle
This patch updates `isGuaranteedNotToBeUndefOrPoison` to use `llvm.assume`'s `noundef` operand bundle.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89219
2020-10-14 20:16:33 +09:00
Simon Pilgrim 95a440b936 [IR] PatternMatch - add m_FShl/m_FShr funnel shift intrinsic matchers. NFCI. 2020-10-01 14:42:34 +01:00
Sanjay Patel 3f100e64b4 [InstSimplify] fix fmin/fmax miscompile for partial undef vectors (PR47567)
It would also be correct to return the variable operand in these cases,
but eliminating a variable use is probably better for optimization.
2020-09-18 10:05:44 -04:00
Nikita Popov 0bb06f297f [InstSimplify] Clarify SimplifyWithOpReplaced() return value
If SimplifyWithOpReplaced() cannot simplify the value, null should
be returned. Make sure this really does happen in all cases,
including those where SimplifyBinOp() returns the original value.

This does not matter for existing users, but does mattter for
D87480, which would go into an infinite loop otherwise.
2020-09-16 20:53:26 +02:00
Sanjay Patel 8985755762 [InstSimplify] add limit folds for fmin/fmax
If the constant operand is the opposite of the min/max value,
then the result must be the other value.

This is based on the similar codegen transform proposed in:
D87571
2020-09-15 10:58:44 -04:00
Sanjay Patel 55d371abd7 [InstSimplify] add folds for fmin/fmax with 'nnan'
maximum(nnan X, +INF) --> +INF
minimum(nnan X, -INF) --> -INF

This is based on the similar codegen transform proposed in:
D87571
2020-09-14 11:46:11 -04:00
Sanjay Patel 7526376164 [InstSimplify] allow folds for fmin/fmax with 'ninf'
maxnum(ninf X, +FLT_MAX) --> +FLT_MAX
minnum(ninf X, -FLT_MAX) --> -FLT_MAX

This is based on the similar codegen transform proposed in:
D87571
2020-09-14 11:18:08 -04:00
Sanjay Patel 22c583c3d0 [InstSimplify] reduce code duplication for fmin/fmax folds; NFC
We use the same code structure for folding integer min/max.
2020-09-14 10:32:11 -04:00
Sanjay Patel 7bb9a2f996 [InstSimplify] fix miscompiles with maximum/minimum intrinsics
As discussed in the sibling codegen functionality patch D87571,
this transform was created with D52766, but it is not correct.

The incorrect test diffs were missed during review, but the
'TODO' comment about this functionality was still in the code -
we need 'nnan' to enable this fold.
2020-09-14 09:06:41 -04:00
Nikita Popov 36e2e2e12e [InstCombine] Fix incorrect SimplifyWithOpReplaced transform (PR47322)
This is a followup to D86834, which partially fixed this issue in
InstSimplify. However, InstCombine repeats the same transform while
dropping poison flags -- which does not cover cases where poison is
introduced in some other way.

The fix here is a bit more comprehensive, because things are quite
entangled, and it's hard to only partially address it without
regressing optimization. There are really two changes here:

 * Export the SimplifyWithOpReplaced API from InstSimplify, with an
   added AllowRefinement flag. For replacements inside the TrueVal
   we don't actually care whether refinement occurs or not, the
   replacement is always legal. This part of the transform is now
   done in InstSimplify only. (It should be noted that the current
   AllowRefinement check is not sufficient -- that's an issue we
   need to address separately.)
 * Change the InstCombine fold to work by temporarily dropping
   poison generating flags, running the fold and then restoring the
   flags if it didn't work out. This will ensure that the InstCombine
   fold is correct as long as the InstSimplify fold is correct.

Differential Revision: https://reviews.llvm.org/D87445
2020-09-12 14:45:06 +02:00
Nikita Popov e97f3b1b43 [InstCombine] Fold abs of known negative operand
If we know that the abs operand is known negative, we can replace
it with a neg.

To avoid computing known bits twice, I've removed the fold for the
non-negative case from InstSimplify. Both the non-negative and the
negative case are handled by InstCombine now, with one known bits call.

Differential Revision: https://reviews.llvm.org/D87196
2020-09-08 20:14:35 +02:00
Nikita Popov ff218cbc84 [InstSimplify] Fold degenerate abs of abs form
This addresses the remaining issue from D87188. Due to a series of
folds, we may end up with abs-of-abs represented as
x == 0 ? -abs(x) : abs(x). Rather than recognizing this as a special
abs pattern and doing an abs-of-abs fold on it afterwards,
I'm directly folding this to one of the select operands in InstSimplify.

The general pattern falls into the "select with operand replaced"
category, but that fold is not powerful enough to recognize that
both hands of the select are the same for value zero.

Differential Revision: https://reviews.llvm.org/D87197
2020-09-06 09:43:08 +02:00
Nikita Popov 73104b0751 [InstSimplify] Fold min/max based on dominating condition
If we have a dominating condition that x >= y, then umax(x, y) is x,
etc. I'm doing this in InstSimplify as the corresponding transform
for the select form is also done there.

Differential Revision: https://reviews.llvm.org/D87168
2020-09-05 16:16:40 +02:00
Nikita Popov 88b310f64b [InstSimplify] Reduce code duplication in simplifySelectWithICmpCond (NFC)
Canonicalize icmp ne to icmp eq and implement all the folds only once.
2020-08-29 22:38:49 +02:00
Nikita Popov a5be86fde5 [InstSimplify] Protect against more poison in SimplifyWithOpReplaced (PR47322)
Replace the check for poison-producing instructions in
SimplifyWithOpReplaced() with the generic helper canCreatePoison()
that properly handles poisonous shifts and thus avoids the problem
from PR47322.

This additionally fixes a bug in IIQ.UseInstrInfo=false mode, which
previously could have caused this code to ignore poison flags.
Setting UseInstrInfo=false should reduce the possible optimizations,
not increase them.

This is not a full solution to the problem, as poison could be
introduced more indirectly. This is just a minimal, easy to backport
fix.

Differential Revision: https://reviews.llvm.org/D86834
2020-08-29 21:59:39 +02:00
Roman Lebedev c1b3e32118
[NFC][InstructionSimplify] Add a warning about not simplifying to not def-reachable
See
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html
and
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824967.html

InstSimply is not allowed to perform simplifications to instructions
that are not def-reachable from the original instruction.
2020-08-29 09:58:08 +03:00
Owen Anderson ed90f15efb Revert "[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block"
This reverts commit 6102310d81.  It
appears to cause compilation non-determinism and caused stage3
mismatches.
2020-08-28 23:43:42 +00:00
David Sherwood f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
Roman Lebedev b85f91fdce
[InstSimplify] SimplifyPHINode(): check that instruction is in basic block first
As pointed out in post-commit review, this can legally be called
on instructions that are not inserted into basic blocks,
so don't blindly assume that there is basic block.
2020-08-27 22:32:03 +03:00
Roman Lebedev 6102310d81
[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block
Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify,
nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places..

While i could teach EarlyCSE how to hash PHI nodes,
we can't really do much (anything?) even if we find two identical
PHI nodes in different basic blocks, same-BB case is the interesting one,
and if we teach InstSimplify about it (which is what i wanted originally,
https://reviews.llvm.org/D86530), we get EarlyCSE support for free.

So i would think this is pretty uncontroversial.

On vanilla llvm test-suite + RawSpeed, this has the following effects:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instsimplify.NumPHICSE                             | 0         | 23779     |  23779 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942328   | 7942392   |     64 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 273069192 | 273084704 |  15512 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18539     |    127 |    0.69% |    0.69% |
| early-cse.NumCSE                                   | 2183283   | 2183227   |    -56 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 542090    |  -8015 |   -1.46% |    1.46% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640264   | 3664769   |  24505 |    0.67% |    0.67% |
| instcombine.NumDeadInst                            | 1778193   | 1783183   |   4990 |    0.28% |    0.28% |
| instcount.NumCallInst                              | 1758401   | 1758799   |    398 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330533    |    -24 |   -0.01% |    0.01% |
| instcount.TotalInsts                               | 8831952   | 8832286   |    334 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019808   | 999607    | -20201 |   -1.98% |    1.98% |
```
I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call`
transforms, and counter-intuitively results in *more* instructions total.

That being said, the PHI count doesn't decrease that much,
and looking at some examples, it seems at least some of them
were previously getting PHI CSE'd in SimplifyCFG of all places..

I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time.
As a comment in `InstCombinerImpl::visitPHINode()` already stated,
there are no guarantees on the ordering of the operands of a PHI node,
so if we just naively compare them, we may false-negatively say that
the nodes are not equal when the only difference is operand order,
which is especially important since the fold is in InstSimplify,
so we can't rely on InstCombine sorting them beforehand.

Fixing this for the general case is costly (geomean +0.02%),
and does not appear to catch anything in test-suite, but for
the same-BB case, it's trivial, so let's fix at least that.

As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions
this appears to cause geomean +0.03% compile time increase (regression),
but geomean -0.01%..-0.04% code size decrease (improvement).
2020-08-27 18:47:04 +03:00
Nikita Popov d7c119d89c [InstSimplify] Fold min/max intrinsic based on icmp of operands
This is a reboot of D84655, now performing the inner icmp
simplification query without undef folds.

It should be possible to handle the current foldMinMaxSharedOp()
fold based on this, by moving the logic into icmp of min/max instead,
making it more general. We can't drop the folds for constant operands,
because those also allow undef, which we exclude here.

The tests use assumes for exhaustive coverage, and have a few
more examples of misc folds we get based on icmp simplification.

Differential Revision: https://reviews.llvm.org/D85929
2020-08-26 22:02:57 +02:00
Arthur Eubanks 098d3f9827 [InstSimplify] Simplify to vector constants when possible
InstSimplify should do all transformations that ConstProp does, but
one thing that ConstProp does that InstSimplify wouldn't is inline
vector instructions that are constants, e.g. into a ret.

Previously vector instructions wouldn't be inlined in InstSimplify
because llvm::Simplify*Instruction() would return nullptr for specific
instructions, such as vector instructions that were actually constants,
if it couldn't simplify them.

This changes SimplifyInsertElementInst, SimplifyExtractElementInst, and
SimplifyShuffleVectorInst to return a vector constant when possible.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85946
2020-08-26 11:40:36 -07:00
Craig Topper a7a06ded8b Recommit "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and its follow up patches
This recommits the following patches now that D85684 has landed

1cf6f210a2 [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
469da663f2 [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
122b0640fc [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
ac0af12ed2 [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison
9b1e95329a [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
2020-08-12 10:45:27 -07:00
Nikita Popov 06d567059e [InstSimplify] Respect CanUseUndef in more places
Similar to what we do in IIQ, add an isUndefValue() helper that
checks for undef values while respective CanUseUndef. This makes
it much easier to search for places that don't respect the flag
yet.
2020-08-11 21:53:33 +02:00
Nikita Popov d110d4aaff [InstSimplify] Forbid undef folds in expandBinOp
This is the replacement for D84250 based on D84792. As we recursively
fold with the same value twice, we need to disable undef folds,
to prevent an undef from being folded to two different values.

Reverting rG00f3579aea6e3d4a4b7464c3db47294f71cef9e4 and using the
test case from https://reviews.llvm.org/D83360#2145793, it no longer
performs the incorrect fold.

Differential Revision: https://reviews.llvm.org/D85684
2020-08-11 18:39:24 +02:00
Sanjay Patel 1470ce4a76 [InstSimplify] fold min/max with matching min/max operands
I think this is the last remaining translation of an existing
instcombine transform for the corresponding cmp+sel idiom.

This interpretation is more general though - we can remove
mismatched signed/unsigned combinations in addition to the
more obvious cases.

min/max(X, Y) must produce X or Y as the result, so this is
just another clause in the existing transform that was already
matching a min/max of min/max.
2020-08-11 11:23:15 -04:00
Florian Hahn d236e1c7b6 [InstSimplify/NewGVN] Add option to control the use of undef.
Making use of undef is not safe if the simplification result is not used
to replace all uses of the result. This leads to problems in NewGVN,
which does not replace all uses in the IR directly. See PR33165 for more
details.

This patch adds an option to SimplifyQuery to disable the use of undef.

Note that I've only guarded uses if isa<UndefValue>/m_Undef where
SimplifyQuery is currently available. If we agree on the general
direction, I'll update the remaining uses.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84792
2020-08-09 19:16:56 +01:00
Sanjay Patel 250a167c41 [InstSimplify] avoid crashing by trying to rem-by-zero
Bug was noted in the post-commit comments for:
rGe8760bb9a8a3
2020-08-06 16:06:31 -04:00
Sanjay Patel e8760bb9a8 [InstSimplify] fold icmp with mul nsw and constant operands
https://rise4fun.com/Alive/slvl

  Name: mul nsw with icmp eq
  Pre: (C2 % C1) != 0
  %a = mul nsw i8 %x, C1
  %r = icmp eq i8 %a, C2
    =>
  %r = false

  Name: mul nsw with icmp ne
  Pre: (C2 % C1) != 0
  %a = mul nsw i8 %x, C1
  %r = icmp ne i8 %a, C2
    =>
  %r = true

Follow-up to the 'nuw' variation added with:
rGf879c9b79621
2020-08-05 14:38:39 -04:00
Sanjay Patel f879c9b796 [InstSimplify] fold icmp with mul nuw and constant operands
https://rise4fun.com/Alive/pZEr

  Name: mul nuw with icmp eq
  Pre: (C2 %u C1) != 0
  %a = mul nuw i8 %x, C1
  %r = icmp eq i8 %a, C2
    =>
  %r = false

  Name: mul nuw with icmp ne
  Pre: (C2 %u C1) != 0
  %a = mul nuw i8 %x, C1
  %r = icmp ne i8 %a, C2
    =>
  %r = true

There are potentially several other transforms we need to add based on:
D51625
...but it doesn't look like there was follow-up to that patch.
2020-08-05 14:32:17 -04:00
Sanjay Patel bd2c88b253 [InstSimplify] reduce code duplication in simplifyICmpWithMinMax(); NFC 2020-08-05 11:39:28 -04:00
Xavier Denis 29fe3fe615 [InstSimplify] Peephole optimization for icmp (urem X, Y), X
This revision adds the following peephole optimization
and it's negation:

    %a = urem i64 %x, %y
    %b = icmp ule i64 %a, %x
    ====>
    %b = true

With John Regehr's help this optimization was checked with Alive2
which suggests it should be valid.

This pattern occurs in the bound checks of Rust code, the program

    const N: usize = 3;
    const T = u8;

    pub fn split_mutiple(slice: &[T]) -> (&[T], &[T]) {
        let len = slice.len() / N;
        slice.split_at(len * N)
    }

the method call slice.split_at will check that len * N is within
the bounds of slice, this bounds check is after some transformations
turned into the urem seen above and then LLVM fails to optimize it
any further. Adding this optimization would cause this bounds check
to be fully optimized away.

ref: https://github.com/rust-lang/rust/issues/74938

Differential Revision: https://reviews.llvm.org/D85092
2020-08-04 20:48:37 +02:00
Sanjay Patel a16882047a [InstSimplify] refactor min/max folds with shared operand; NFC 2020-08-04 12:21:05 -04:00
Sanjay Patel 04e45ae1c6 [InstSimplify] fold nested min/max intrinsics with constant operands
This is based on the existing code for the non-intrinsic idioms
in InstCombine.

The vector constant constraint is non-obvious: undefs should be
ok in the outer call, but they can't propagate safely from the
inner call in all cases. Example:

https://alive2.llvm.org/ce/z/-2bVbM
  define <2 x i8> @src(<2 x i8> %x) {
  %0:
    %m = umin <2 x i8> %x, { 7, undef }
    %m2 = umin <2 x i8> { 9, 9 }, %m
    ret <2 x i8> %m2
  }
  =>
  define <2 x i8> @tgt(<2 x i8> %x) {
  %0:
    %m = umin <2 x i8> %x, { 7, undef }
    ret <2 x i8> %m
  }
  Transformation doesn't verify!
  ERROR: Value mismatch

  Example:
  <2 x i8> %x = < undef, undef >

  Source:
  <2 x i8> %m = < #x00 (0)	[based on undef value], #x00 (0) >
  <2 x i8> %m2 = < #x00 (0), #x00 (0) >

  Target:
  <2 x i8> %m = < #x07 (7), #x10 (16) >
  Source value: < #x00 (0), #x00 (0) >
  Target value: < #x07 (7), #x10 (16) >
2020-08-04 08:44:48 -04:00
Sanjay Patel 20c71e55aa [InstSimplify] reduce code for min/max analysis; NFC
This should probably be moved up to some common area eventually
when there's another user.
2020-08-04 08:02:33 -04:00
Sanjay Patel 9e5cf6bde5 [InstSimplify] fold variations of max-of-min with common operand
https://alive2.llvm.org/ce/z/ZtxpZ3
2020-08-03 15:02:46 -04:00
Sanjay Patel 4abc69c6f5 [InstSimplify] fold max (max X, Y), X --> max X, Y
https://alive2.llvm.org/ce/z/VGgG3M
2020-08-02 11:50:58 -04:00
Nikita Popov a0addbb4ec [InstSimplify] Reduce code duplication in icmp of binop folds (NFC)
For folds where we check for the binop on both the LHS and RHS,
extract a function that expects it on the LHS and call it with
swapped order.
2020-08-02 15:47:18 +02:00
Craig Topper 85b5315dbe [InstSimplify] Fold abs(abs(x)) -> abs(x)
It's always safe to pick the earlier abs regardless of the nsw flag. We'll just lose it if it is on the outer abs but not the inner abs.

Differential Revision: https://reviews.llvm.org/D85053
2020-08-01 13:25:00 -07:00
Sanjay Patel 04b99a4d18 [InstSimplify] simplify abs if operand is known non-negative
abs() should be rare enough that using value tracking is not going
to be a compile-time cost burden, so use it to reduce a variety of
potential patterns. We do this in DAGCombiner too.

Differential Revision: https://reviews.llvm.org/D85043
2020-08-01 07:47:06 -04:00
Vitaly Buka b0eb40ca39 [NFC] Remove unused GetUnderlyingObject paramenter
Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621
2020-07-31 02:10:03 -07:00
Vitaly Buka 89051ebace [NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
2020-07-30 21:08:24 -07:00
Sanjay Patel fef513f5cc [InstSimplify] fold min/max intrinsic with undef operand 2020-07-29 17:03:50 -04:00
Sanjay Patel 5cd695dd7f [InstSimplify] fold min/max with opposite of limit value 2020-07-29 17:03:50 -04:00
Sanjay Patel ee9617e96b [InstSimplify] try constant folding intrinsics before general simplifications
This matches the behavior of simplify calls for regular opcodes -
rely on ConstantFolding before spending time on folds with variables.

I am not aware of any diffs from this re-ordering currently, but there was
potential for unintended behavior from the min/max intrinsics because that
code is implicitly assuming that only 1 of the input operands is constant.
2020-07-29 13:18:40 -04:00
Sanjay Patel 3e8534fbc6 [InstSimplify] allow partial undef constants for vector min/max folds 2020-07-29 11:53:41 -04:00
Sanjay Patel 3c20ede18b [InstSimplify] fold integer min/max intrinsic with same args 2020-07-29 11:53:41 -04:00
Sanjay Patel 3fb13b8484 [InstSimplify] allow undefs in icmp with vector constant folds
This is the main icmp simplification shortcoming seen in D84655.

Alive2 agrees that the basic examples are correct at least:

define <2 x i1> @src(<2 x i8> %x) {
%0:
  %r = icmp sle <2 x i8> { undef, 128 }, %x
  ret <2 x i1> %r
}
=>
define <2 x i1> @tgt(<2 x i8> %x) {
%0:
  ret <2 x i1> { 1, 1 }
}
Transformation seems to be correct!

define <2 x i1> @src(<2 x i32> %X) {
%0:
  %A = or <2 x i32> %X, { 63, 63 }
  %B = icmp ult <2 x i32> %A, { undef, 50 }
  ret <2 x i1> %B
}
=>
define <2 x i1> @tgt(<2 x i32> %X) {
%0:
  ret <2 x i1> { 0, 0 }
}
Transformation seems to be correct!

https://alive2.llvm.org/ce/z/omt2ee
https://alive2.llvm.org/ce/z/GW4nP_

Differential Revision: https://reviews.llvm.org/D84762
2020-07-28 15:13:53 -04:00
Sanjay Patel 0481e1ae3c [InstSimplify] fold integer min/max intrinsics with limit constant 2020-07-26 09:41:54 -04:00
Sanjay Patel b89ae102e6 [InstSimplify] fold fcmp using isKnownNeverInfinity + isKnownNeverNaN
Follow-up to D84035 / rG7393d7574c09.
This sidesteps a question of FMF/poison on fcmp raised in PR46077:
http://bugs.llvm.org/PR46077

https://alive2.llvm.org/ce/z/TCsyzD
  define i1 @src(float %x) {
  %0:
    %x42 = fadd nnan ninf float %x, 42.000000
    %r = fcmp ueq float %x42, inf
    ret i1 %r
  }
  =>
  define i1 @tgt(float %x) {
  %0:
    ret i1 0
  }
  Transformation seems to be correct!

https://alive2.llvm.org/ce/z/FQaH7a
  define i1 @src(i8 %x) {
  %0:
    %cast = uitofp i8 %x to float
    %r = fcmp one float inf, %cast
    ret i1 %r
  }
  =>
  define i1 @tgt(i8 %x) {
  %0:
    ret i1 1
  }
  Transformation seems to be correct!
2020-07-26 09:04:37 -04:00
Sanjay Patel 7485e92412 [InstSimplify] reduce code duplication for binop expansion; NFC
D84250 proposes to extend this code, so the duplication for
the commuted case would continue to grow.
2020-07-23 08:35:21 -04:00
Christopher Tetreault 23c5e59d9f [SVE] Remove calls to VectorType::getNumElements from Analysis
Reviewers: efriedma, fpetrogalli, c-rhodes, asbirlea, RKSimon

Reviewed By: RKSimon

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81504
2020-07-22 15:19:05 -07:00
Sanjay Patel 7393d7574c [InstSimplify] fold fcmp with infinity constant using isKnownNeverInfinity
This is a step towards trying to remove unnecessary FP compares
with infinity when compiling with -ffinite-math-only or similar.
I'm intentionally not checking FMF on the fcmp itself because
I'm assuming that will go away eventually.
The analysis part of this was added with rGcd481136 for use with
isKnownNeverNaN. Similarly, that could be an enhancement here to
get predicates like 'one' and 'ueq'.

Differential Revision: https://reviews.llvm.org/D84035
2020-07-19 09:24:52 -04:00
Craig Topper 00f3579aea Revert "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and subsequent patches
This reverts most of the following patches due to reports of miscompiles.
I've left the added test cases with comments updated to be FIXMEs.

1cf6f210a2 [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
469da663f2 [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
122b0640fc [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
ac0af12ed2 [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison
9b1e95329a [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
2020-07-15 22:02:33 -07:00
Craig Topper 469da663f2 [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
Follow up from the transform being removed in D83360. If X is probably not poison, then the transform is safe.

Still plan to remove or adjust the code from ConstantFolding after this.

Differential Revision: https://reviews.llvm.org/D83440
2020-07-09 12:21:03 -07:00
Craig Topper 122b0640fc [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
We can't fold to the non-undef value unless we know it isn't poison. So check each element with isGuaranteedNotToBeUndefOrPoison. This currently rules out all constant expressions.

Differential Revision: https://reviews.llvm.org/D83442
2020-07-09 11:01:12 -07:00
Craig Topper 9b1e95329a [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
As noted here https://lists.llvm.org/pipermail/llvm-dev/2016-October/106182.html and by alive2, this transform isn't valid. If X is poison this potentially propagates poison when it shouldn't.

This same transform still exists in DAGCombiner.

Differential Revision: https://reviews.llvm.org/D83360
2020-07-08 12:53:05 -07:00
Nikita Popov a48cf72238 [InstSimplify] Handle not inserted instruction gracefully (PR46638)
When simplifying comparisons using a dominating assume, bail out
if the context instruction is not inserted.
2020-07-08 21:43:32 +02:00
Craig Topper d92bf71a07 Revert "[X86] Merge the FEATURE_64BIT and FEATURE_EM64T bits in X86TargetParser.def."
An accidental change snuck in here

This reverts commit f1d290d812.
2020-07-07 18:20:07 -07:00
Craig Topper f1d290d812 [X86] Merge the FEATURE_64BIT and FEATURE_EM64T bits in X86TargetParser.def.
These represent the same thing but 64BIT only showed up from
getHostCPUFeatures providing a list of featuers to clang. While
EM64T showed up from getting the features for a named CPU.

EM64T didn't have a string specifically so it would not be passed
up to clang when getting features for a named CPU. While 64bit
needed a name since that's how it is index.

Merge them by filtering 64bit out before sending features to clang
for named CPUs.
2020-07-07 17:59:54 -07:00
Nikita Popov 3b671022e4 [InstSimplify] Simplify comparison between zext(x) and sext(x)
This is picking up a loose thread from D69006: We can simplify
(zext x) ule (sext x) and (zext x) sge (sext x) to true, with
various permutations. Oddly, SCEV knows about this identity,
but nothing on the IR level does.

Differential Revision: https://reviews.llvm.org/D83081
2020-07-04 11:03:00 +02:00
Nikita Popov cf1d9f9f49 [InstSimplify] Fold icmp with dominating assume
If we assume(x > y), then we should be able to fold the basic
implications of that, like x >= y. This already happens if either
one of the operands is constant (LVI) or if the conditions are
exactly the same (GVN), but not if we have an implication with
non-constant operands. Support this by querying AssumptionCache.

Fixes https://bugs.llvm.org/show_bug.cgi?id=40149.

Differential Revision: https://reviews.llvm.org/D82717
2020-07-03 18:53:58 +02:00
Christopher Tetreault 747486991c [SVE] Fix bad FixedVectorType cast in simplifyDivRem
Summary:
simplifyDivRem attempts to walk a VectorType elementwise. Ensure that it
only does so for FixedVectorType

Reviewers: efriedma, spatel, lebedev.ri, david-arm, kmclaughlin

Reviewed By: spatel, david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81856
2020-06-16 13:17:05 -07:00
Dorit Nuzman a9fe69c359 [InstSimplify] fix bug in matching or-with-not op (PR46083) 2020-06-03 13:44:29 -04:00
Serge Pavlov 4d20e31f73 [FPEnv] Intrinsic llvm.roundeven
This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven,
and performs rounding to the nearest integer value, rounding halfway
cases to even. The intrinsic represents the missed case of IEEE-754
rounding operations and now llvm provides full support of the rounding
operations defined by the standard.

Differential Revision: https://reviews.llvm.org/D75670
2020-05-26 19:24:58 +07:00
Sanjay Patel 7eed772a27 [PatternMatch] abbreviate vector inst matchers; NFC
Readability is not reduced with these opcodes/match lines,
so reduce odds of awkward wrapping from 80-col limit.
2020-05-24 09:19:47 -04:00
Nikita Popov 5a2265647e Reapply [InstSimplify] Remove known bits constant folding
No changes relative to last time, but after a mitigation for
an AMDGPU regression landed.

---

If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.

On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):

    instsimplify.NumKnownBits: 216
    instsimplify.NumKnownBitsComputed: 13828375
    valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)

There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.

Differential Revision: https://reviews.llvm.org/D79294
2020-05-08 10:24:53 +02:00
Nikita Popov 46ee652c70 Revert "[InstSimplify] Remove known bits constant folding"
This reverts commit 08556afc54.

This breaks some AMDGPU tests.
2020-05-03 20:45:10 +02:00
Nikita Popov 08556afc54 [InstSimplify] Remove known bits constant folding
If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.

On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):

    instsimplify.NumKnownBits: 216
    instsimplify.NumKnownBitsComputed: 13828375
    valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)

There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.

Differential Revision: https://reviews.llvm.org/D79294
2020-05-03 20:26:58 +02:00
Sanjay Patel 57f0eed98d [InstSimplify] allow insertelement-with-undef fold if poison-safe
The more general fold was not poison-safe, so it was removed:
rG5486e00
...but it is ok to have this transform if analysis can determine
the vector contains no poison. The test shows a simple example
of that: constant integer elements are not poison.
2020-05-01 10:34:29 -04:00
Sanjay Patel 5486e00dc3 [InstSimplify] remove poison-unsafe insertelement of undef value
PR45481:
https://bugs.llvm.org/show_bug.cgi?id=45481

SDAG has an identical transform to this, so there's little
chance of any real-world impact. OTOH, that means we are
effectively sweeping the bug out of sight because poison
exists in codegen too.
2020-05-01 09:22:05 -04:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
James Y Knight 248a5db3f2 Change callbr to only define its output SSA variable on the normal
path, not the indirect targets.

Fixes: PR45565.

Differential Revision: https://reviews.llvm.org/D78341
2020-04-23 19:36:44 -04:00
Christopher Tetreault 9174e0229f [SVE] Remove calls to VectorType::isScalable from analysis
Reviewers: efriedma, sdesmalen, chandlerc, sunfish

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77692
2020-04-23 12:44:22 -07:00
Sanjay Patel e86eff0e82 [InstSimplify] fold and/or of compares with equality to min/max constant
I found 12 (6 if we compress the DeMorganized forms) patterns for logic-of-compares
with a min/max constant while looking at PR45510:
https://bugs.llvm.org/show_bug.cgi?id=45510

The variations on those forms multiply the test cases by 8 (unsigned/signed, swapped
compare operands, commuted logic operands).
We have partial logic to deal with these for the unsigned min (zero) case, but
missed everything else.

We are deferring the majority of these patterns to InstCombine to allow more general
handling (see D78582).

We could use ConstantRange instead of predicate+constant matching here. I don't
expect there's any noticeable compile-time impact for either form.

Here's an abuse of Alive2 to show the 12 basic signed variants of the patterns in
one function:
http://volta.cs.utah.edu:8080/z/5Vpiyg

declare void @use(i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1)
define void @src(i8 %x, i8 %y)  {
  %m1 = icmp eq i8 %x, 127
  %c1 = icmp slt i8 %x, %y
  %r1 = and i1 %m1, %c1   ; (X == MAX) && (X < Y) --> false

  %m2 = icmp ne i8 %x, 127
  %c2 = icmp sge i8 %x, %y
  %r2 = or i1 %m2, %c2    ; (X != MAX) || (X >= Y) --> true

  %m3 = icmp eq i8 %x, -128
  %c3 = icmp sgt i8 %x, %y
  %r3 = and i1 %m3, %c3   ; (X == MIN) && (X > Y) --> false

  %m4 = icmp ne i8 %x, -128
  %c4 = icmp sle i8 %x, %y
  %r4 = or i1 %m4, %c4    ; (X != MIN) || (X <= Y) --> true

  %m5 = icmp eq i8 %x, 127
  %c5 = icmp sge i8 %x, %y
  %r5 = and i1 %m5, %c5   ; (X == MAX) && (X >= Y) --> X == MAX

  %m6 = icmp ne i8 %x, 127
  %c6 = icmp slt i8 %x, %y
  %r6 = or i1 %m6, %c6   ; (X != MAX) || (X < Y) --> X != MAX

  %m7 = icmp eq i8 %x, -128
  %c7 = icmp sle i8 %x, %y
  %r7 = and i1 %m7, %c7   ; (X == MIN) && (X <= Y) --> X == MIN

  %m8 = icmp ne i8 %x, -128
  %c8 = icmp sgt i8 %x, %y
  %r8 = or i1 %m8, %c8   ; (X != MIN) || (X > Y) --> X != MIN

  %m9 = icmp ne i8 %x, 127
  %c9 = icmp slt i8 %x, %y
  %r9 = and i1 %m9, %c9    ; (X != MAX) && (X < Y) --> X < Y

  %m10 = icmp eq i8 %x, 127
  %c10 = icmp sge i8 %x, %y
  %r10 = or i1 %m10, %c10    ; (X == MAX) || (X >= Y) --> X >= Y

  %m11 = icmp ne i8 %x, -128
  %c11 = icmp sgt i8 %x, %y
  %r11 = and i1 %m11, %c11    ; (X != MIN) && (X > Y) --> X > Y

  %m12 = icmp eq i8 %x, -128
  %c12 = icmp sle i8 %x, %y
  %r12 = or i1 %m12, %c12    ; (X == MIN) || (X <= Y) --> X <= Y

  call void @use(i1 %r1, i1 %r2, i1 %r3, i1 %r4, i1 %r5, i1 %r6, i1 %r7, i1 %r8, i1 %r9, i1 %r10, i1 %r11, i1 %r12)
  ret void
}

define void @tgt(i8 %x, i8 %y)  {
  %m5 = icmp eq i8 %x, 127
  %m6 = icmp ne i8 %x, 127
  %m7 = icmp eq i8 %x, -128
  %m8 = icmp ne i8 %x, -128
  %c9 = icmp slt i8 %x, %y
  %c10 = icmp sge i8 %x, %y
  %c11 = icmp sgt i8 %x, %y
  %c12 = icmp sle i8 %x, %y
  call void @use(i1 0, i1 1, i1 0, i1 1, i1 %m5, i1 %m6, i1 %m7, i1 %m8, i1 %c9, i1 %c10, i1 %c11, i1 %c12)
  ret void
}

Differential Revision: https://reviews.llvm.org/D78430
2020-04-23 09:16:10 -04:00
Christopher Tetreault b96558f5e5 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sunfish, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77273
2020-04-09 12:41:28 -07:00
Eli Friedman 1ee6ec2bf3 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
Serge Pavlov f398739152 [FEnv] Constfold some unary constrained operations
This change implements constant folding to constrained versions of
intrinsics, implementing rounding: floor, ceil, trunc, round, rint and
nearbyint.

Differential Revision: https://reviews.llvm.org/D72930
2020-03-28 12:28:33 +07:00
Nikita Popov 417d69595f [InstSimplify] Reorder checks to be more efficient; NFC
First check whether the RHS is a null pointer, and only then
perform a potentially expensive non-zero query.
2020-03-20 22:05:38 +01:00
Nico Weber 623cb95eb3 Revert "[InstSimplify] Simplify calls with "returned" attribute"
This reverts commit 45555c3819.
Causes clang crashes in some causes, see comments on
https://reviews.llvm.org/D75815 for details (including
repro steps).
2020-03-16 15:21:30 -04:00
Huihui Zhang 0616e9964b [InstSimplify][SVE] Fix SimplifyGEPInst for scalable vector.
Summary:
Skip folds that rely on DataLayout::getTypeAllocSize(). For scalable
vector, only minimal type alloc size is known at compile-time.

Reviewers: sdesmalen, efriedma, spatel, apazos

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75892
2020-03-16 11:46:12 -07:00
Huihui Zhang 118abf2017 [SVE] Update API ConstantVector::getSplat() to use ElementCount.
Summary:
Support ConstantInt::get() and Constant::getAllOnesValue() for scalable
vector type, this requires ConstantVector::getSplat() to take in 'ElementCount',
instead of 'unsigned' number of element count.

This change is needed for D73753.

Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74386
2020-03-12 13:22:41 -07:00
Sanjay Patel a66dc755db [InstSimplify] simplify FP ops harder with FMF (part 2)
This is part of the IR sibling for:
D75576

Related transform committed with:
rG8ec71585719d
2020-03-12 09:53:20 -04:00
Sanjay Patel 8ec7158571 [InstSimplify] simplify FP ops harder with FMF
This is part of the IR sibling for:
D75576

(I'm splitting part of the transform as a separate commit
to reduce risk. I don't know of any bugs that might be
exposed by this improved folding, but it's hard to see
those in advance...)
2020-03-12 09:13:28 -04:00
Sanjay Patel dea2b93a7b [InstSimplify] reduce code for FP undef/nan folding; NFC 2020-03-12 08:46:15 -04:00
Huihui Zhang 8f52573962 [InstSimplify][SVE] Fix SimplifyInsert/ExtractElementInst for scalable vector.
Summary:
For scalable vector, index out-of-bound can not be determined at compile-time.
The same apply for VectorUtil findScalarElement().

Add test cases to check the functionality of SimplifyInsert/ExtractElementInst for scalable vector.

Reviewers: sdesmalen, efriedma, spatel, apazos

Reviewed By: efriedma

Subscribers: cameron.mcinally, tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75782
2020-03-11 15:09:56 -07:00
Nikita Popov 45555c3819 [InstSimplify] Simplify calls with "returned" attribute
If a call argument has the "returned" attribute, we can simplify
the call to the value of that argument. The "-inst-simplify" pass
already handled this for the constant integer argument case via
known bits, which is invoked in SimplifyInstruction. However,
non-constant (or non-int) arguments are not handled at all right now.

This addresses one of the regressions from D75801.

Differential Revision: https://reviews.llvm.org/D75815
2020-03-09 18:53:47 +01:00
Nikita Popov 829d377a98 [InstSimplify] Don't simplify musttail calls
As pointed out by jdoerfert on D75815, we must be careful when
simplifying musttail calls: We can only replace the return value
if we can eliminate the call entirely. As we can't make this
guarantee for all consumers of InstSimplify, this patch disables
simplification of musttail calls. Without this patch, musttail
simplification currently results in module verification errors.

Differential Revision: https://reviews.llvm.org/D75824
2020-03-09 18:46:56 +01:00
Jay Foad 11d1573bb6 [APFloat] Make use of new overloaded comparison operators. NFC.
Reviewers: ekatz, spatel, jfb, tlively, craig.topper, RKSimon, nikic, scanon

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, dexonsmith, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75744
2020-03-06 16:42:53 +00:00
Juneyoung Lee d7267ee194 [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators
Summary:
```
  br i1 c, BB1, BB2:
BB1:
  use1(c)
BB2:
  use2(c)
```

In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB.

This is a resubmission of 952ad47 with crash fix of llvm/test/Transforms/LoopRotate/freeze-crash.ll.

Checked with Alive2

Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy

Reviewed By: reames

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75401
2020-03-06 01:08:35 +09:00
Daniil Suchkov 3db48f9324 Revert "[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators"
That commit causes SIGSEGV on some simple tests.
This reverts commit 952ad4701c.
2020-03-05 16:32:36 +07:00
Nikita Popov c6ff3c9bad [InstSimplify] Constant fold icmp of gep
InstSimplify can fold icmps of gep where the base pointers are the
same and the offsets are constant. It does so by constructing a
constant expression icmp and assumes that it gets folded -- but
this doesn't actually happen, because GEP expressions can usually
only be folded by the target-dependent constant folding layer.
As such, we need to explicitly invoke it here.

Differential Revision: https://reviews.llvm.org/D75407
2020-03-04 23:16:52 +01:00
Nikita Popov 0e890cd4d4 [ConstantFolding] Always return something from ConstantFoldConstant
Spin-off from D75407. As described there, ConstantFoldConstant()
currently returns null for non-ConstantExpr/ConstantVector inputs,
but otherwise always returns non-null, independently of whether
any folding has happened or not.

This is confusing and makes consumer code more complicated.
I would expect either that ConstantFoldConstant() returns only if
it actually folded something, or that it always returns non-null.
I'm going to the latter possibility here, which appears to be more
useful considering existing usage.

Differential Revision: https://reviews.llvm.org/D75543
2020-03-04 18:24:47 +01:00
Juneyoung Lee 952ad4701c [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators
Summary:
```
  br i1 c, BB1, BB2:
BB1:
  use1(c)
BB2:
  use2(c)
```

In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB.

Checked with Alive2

Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy

Reviewed By: reames

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75401
2020-03-04 11:43:31 +09:00
Christopher Tetreault b03f3fbd6a Reapply: [SVE] Fix bug in simplification of scalable vector instructions
This reverts commit a05441038a, reapplying
commit 31574d38ac
2020-02-05 10:00:09 -08:00
Reid Kleckner a05441038a Revert "[SVE] Fix bug in simplification of scalable vector instructions"
This reverts commit 31574d38ac.

The newly added shufflevector test does not pass locally on either of my
workstations.
2020-02-03 11:12:09 -08:00
Christopher Tetreault 31574d38ac [SVE] Fix bug in simplification of scalable vector instructions
Summary:
* Most of the simplifications in SimplifyShuffleVectorInst depend on the
concrete value of, or the length of the mask vector. For scalable
vectors, this cannot be known at compile time.
** for these tests, detect if the vector is scalable before attempting
the transformation
* The functions ShuffleVectorInst::getMaskValue and
ShuffleVectorInst::getShuffleMask access the value of the constant mask.
However, since the length of the mask is unknown at compile time, these
function do not work for scalable vectors. Add asserts to ensure that
the input mask is not scalable

Reviewers: efriedma, sdesmalen, apazos, chrisj, huihuiz

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73555
2020-02-03 10:15:56 -08:00
Nikita Popov efba7ed05e [PatternMatch] Make m_c_ICmp swap the predicate (PR42801)
This addresses https://bugs.llvm.org/show_bug.cgi?id=42801.
The m_c_ICmp() matcher is changed to provide the swapped predicate
if the operands are swapped.

Existing uses of m_c_ICmp() fall in one of two categories: Working
on equality predicates only, where swapping is irrelevant.
Or performing a manual swap, in which case this patch removes it.

The only exception is the foldICmpWithLowBitMaskedVal() fold, which
does not swap the predicate, and instead reasons about whether
a swap occurred or not for each predicate. Getting the swapped
predicate allows us to merge the logic for pairs of predicates,
instead of duplicating it.

Differential Revision: https://reviews.llvm.org/D72976
2020-01-22 22:56:26 +01:00
Sanjay Patel da9c93f330 [InstSimplify] fold select of vector constants that include undef elements
As mentioned in D72643, we'd like to be able to assert that any select
of equivalent constants has been removed before we're deep into InstCombine.

But there's a loophole in that assertion for vectors with undef elements
that don't match exactly.

This patch should close that gap. If we have undefs, we can't safely
propagate those unless both constants elements for that lane are undef.

Differential Revision: https://reviews.llvm.org/D72958
2020-01-20 08:48:32 -05:00
Sanjay Patel f53b38d12a [InstSimplify] select Cond, true, false --> Cond
This is step 1 of damage control assuming that we need to remove several
over-reaching folds for select-of-booleans because they can cause
miscompiles as shown in D72396.

The scalar case seems obviously safe:
https://rise4fun.com/Alive/jSj

And I don't think there's any danger for vectors either - if the
condition is poisoned, then the select must be poisoned too, so undef
elements don't make any difference.

Differential Revision: https://reviews.llvm.org/D72412
2020-01-09 09:04:20 -05:00
Sanjay Patel 6080387f13 [InstSimplify] fold splat of inserted constant to vector constant
shuf (inselt ?, C, IndexC), undef, <IndexC, IndexC...> --> <C, C...>

This is another missing shuffle fold pattern uncovered by the
shuffle correctness fix from D70246.

The problem was visible in the post-commit thread example, but
we managed to overcome the limitation for that particular case
with D71220.

This is something like the inverse of the previous fix - there
we didn't demand the inserted scalar, and here we are only
demanding an inserted scalar.

Differential Revision: https://reviews.llvm.org/D71488
2019-12-15 09:32:03 -05:00
Nicola Zaghen 97572775d2 Reland [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

This fixes the buildbot failures.

Differential Revision: https://reviews.llvm.org/D68328

Patch by Joseph Faulls!
2019-12-13 14:30:21 +00:00
Denis Bakhvalov 7081c92241 [NFC][InstSimplify] Refactoring ThreadCmpOverSelect function
Removed code duplication in ThreadCmpOverSelect and broke it
into several smaller functions for reusing them.

Differential Revision: https://reviews.llvm.org/D71158
2019-12-12 22:45:58 +01:00
Nicola Zaghen f798eb21ec Temporarily Revert "[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same."
This reverts commit 5f6208778f.

This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll
const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.
2019-12-12 10:29:54 +00:00
Nicola Zaghen 5f6208778f [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

Differential Revision: https://reviews.llvm.org/D68328

Patch by Joseph Faulls!
2019-12-12 10:07:01 +00:00
Johannes Doerfert a7d992c0f2 [ValueTracking] Allow context-sensitive nullness check for non-pointers
Summary:
Same as D60846 and D69571 but with a fix for the problem encountered
after them. Both times it was a missing context adjustment in the
handling of PHI nodes.

The reproducers created from the bugs that caused the old commits to be
reverted are included.

Reviewers: nikic, nlopes, mkazantsev, spatel, dlrobertson, uabelho, hakzsam, hans

Subscribers: hiraditya, bollu, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71181
2019-12-09 15:15:52 -06:00
Sanjay Patel 1c4dd3ae2f [InstSimplify] fold copysign with negated operand, part 2
This is another transform suggested in PR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153

Unlike rG12f39e0fede9, it doesn't look like the
backend matches this variant.
2019-12-08 10:16:29 -05:00
Sanjay Patel 12f39e0fed [InstSimplify] fold copysign with negated operand
This is another transform suggested in PR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153

The backend for some targets already manages to get
this if it converts copysign to bitwise logic.
2019-12-08 10:08:02 -05:00
Sanjay Patel e177c5a00d [InstSimplify] fold copysign with same args to the arg
This is correct for any value including NaN/inf.

We don't have this fold directly in the backend either,
but x86 manages to get it after converting things to bitops.
2019-11-26 17:35:10 -05:00
Hans Wennborg 6ea4775900 Revert 57dd4b0 "[ValueTracking] Allow context-sensitive nullness check for non-pointers"
This caused miscompiles of Chromium (https://crbug.com/1023818). The reduced
repro is small enough to fit here:

  $ cat /tmp/a.c
  unsigned char f(unsigned char *p) {
    unsigned char result = 0;
    for (int shift = 0; shift < 1; ++shift)
      result |= p[0] << (shift * 8);
    return result;
  }
  $ bin/clang -O2 -S -o - /tmp/a.c | grep -A4 f:
  f:                                      # @f
          .cfi_startproc
  # %bb.0:                                # %entry
          xorl    %eax, %eax
          retq

That's nicely optimized, but I don't think it's the right result :-)

> Same as D60846 but with a fix for the problem encountered there which
> was a missing context adjustment in the handling of PHI nodes.
>
> The test that caused D60846 to be reverted was added in e15ab8f277.
>
> Reviewers: nikic, nlopes, mkazantsev,spatel, dlrobertson, uabelho, hakzsam
>
> Subscribers: hiraditya, bollu, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D69571

This reverts commit 57dd4b03e4.
2019-11-13 12:19:02 +01:00
aqjune 4187cb138b Add InstCombine/InstructionSimplify support for Freeze Instruction
Summary:
- Add llvm::SimplifyFreezeInst
- Add InstCombiner::visitFreeze
- Add llvm tests

Reviewers: majnemer, sanjoy, reames, lebedev.ri, spatel

Reviewed By: reames, lebedev.ri

Subscribers: reames, lebedev.ri, filcab, regehr, trentxintong, llvm-commits

Differential Revision: https://reviews.llvm.org/D29013
2019-11-12 12:13:26 +09:00
Sanjay Patel 659bd73d13 [InstSimplify] use FMF to improve fcmp+select fold
This is part of a series of patches needed to solve PR39535:
https://bugs.llvm.org/show_bug.cgi?id=39535
2019-11-04 08:29:56 -05:00
Johannes Doerfert 57dd4b03e4 [ValueTracking] Allow context-sensitive nullness check for non-pointers
Same as D60846 but with a fix for the problem encountered there which
was a missing context adjustment in the handling of PHI nodes.

The test that caused D60846 to be reverted was added in e15ab8f277.

Reviewers: nikic, nlopes, mkazantsev,spatel, dlrobertson, uabelho, hakzsam

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69571
2019-10-31 14:37:38 -05:00
Florian Hahn 067ed96e8e [InstCombine] Simplify fma multiplication to nan for undef or nan operands.
In similar fashion to D67721, we can simplify FMA multiplications if any
of the operands is NaN or undef. In instcombine, we will simplify the
FMA to an fadd with a NaN operand, which in turn gets folded to NaN.

Note that this just changes SimplifyFMAFMul, so we still not catch the
case where only the Add part of the FMA is Nan/Undef.

Reviewers: cameron.mcinally, mcberg2017, spatel, arsenm

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D68265

llvm-svn: 373459
2019-10-02 12:32:52 +00:00
Sanjay Patel be21ceb565 [InstSimplify] fold fma/fmuladd with a NaN or undef operand
This is intended to be similar to the constant folding results from
D67446
and earlier, but not all operands are constant in these tests, so the
responsibility for folding is left to InstSimplify.

Differential Revision: https://reviews.llvm.org/D67721

llvm-svn: 373455
2019-10-02 12:12:02 +00:00
Sanjay Patel 8cecc30c99 [InstSimplify] generalize FP folds with undef/NaN; NFC
We can reuse this logic for things like fma.

llvm-svn: 373119
2019-09-27 20:09:09 +00:00
Roman Lebedev 914a3d1cf2 [InstSimplify] Handle more 'A </>/>=/<= B &&/|| (A - B) !=/== 0' patterns (PR43251)
https://rise4fun.com/Alive/sl9s
https://rise4fun.com/Alive/2plN

https://bugs.llvm.org/show_bug.cgi?id=43251

llvm-svn: 372928
2019-09-25 22:59:41 +00:00
Florian Hahn d663efe23a [InstSimplify] Match 1.0 and 0.0 for both operands in SimplifyFMAMul
Because we do not constant fold multiplications in SimplifyFMAMul,
we match 1.0 and 0.0 for both operands, as multiplying by them
is guaranteed to produce an exact result (if it is allowed to do so).

Note that it is not enough to just swap the operands to ensure a
constant is on the RHS, as we want to also cover the case with
2 constants.

Reviewers: lebedev.ri, spatel, reames, scanon

Reviewed By: lebedev.ri, reames

Differential Revision: https://reviews.llvm.org/D67553

llvm-svn: 372915
2019-09-25 19:33:26 +00:00
Florian Hahn f3ab99dcf8 [InstCombine] Limit FMul constant folding for fma simplifications.
As @reames pointed out post-commit, rL371518 adds additional rounding
in some cases, when doing constant folding of the multiplication.
This breaks a guarantee llvm.fma makes and must be avoided.

This patch reapplies rL371518, but splits off the simplifications not
requiring rounding from SimplifFMulInst as SimplifyFMAFMul.

Reviewers: spatel, lebedev.ri, reames, scanon

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D67434

llvm-svn: 372899
2019-09-25 17:03:20 +00:00
Roman Lebedev baf809811b [InstSimplify] simplifyUnsignedRangeCheck(): X >= Y && Y == 0 --> Y == 0
https://rise4fun.com/Alive/v9Y4

llvm-svn: 372491
2019-09-21 22:27:39 +00:00
Roman Lebedev e94f156f77 [InstSimplify][NFC] Reorganize simplifyUnsignedRangeCheck() to emphasize and/or symmetry
Only a single `X >= Y && Y == 0  -->  Y == 0` fold appears to be missing.

llvm-svn: 372490
2019-09-21 22:27:28 +00:00
Roman Lebedev 9c5a4a4527 [InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases (PR43251)
Summary:
This is split off from D67356, since these cases produce a constant,
no real need to keep them in instcombine.

Alive proofs:
https://rise4fun.com/Alive/u7Fk
https://rise4fun.com/Alive/4lV

https://bugs.llvm.org/show_bug.cgi?id=43251

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67498

llvm-svn: 371921
2019-09-14 13:47:27 +00:00
Roman Lebedev f1286621eb [InstSimplify] simplifyUnsignedRangeCheck(): handle more cases (PR43251)
Summary:
I don't have a direct motivational case for this,
but it would be good to have this for completeness/symmetry.

This pattern is basically the motivational pattern from
https://bugs.llvm.org/show_bug.cgi?id=43251
but with different predicate that requires that the offset is non-zero.

The completeness bit comes from the fact that a similar pattern (offset != zero)
will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259,
so it'd seem to be good to not overlook very similar patterns..

Proofs: https://rise4fun.com/Alive/21b

Also, there is something odd with `isKnownNonZero()`, if the non-zero
knowledge was specified as an assumption, it didn't pick it up (PR43267)

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67411

llvm-svn: 371718
2019-09-12 09:26:17 +00:00
Roman Lebedev 00c1ee48e4 [InstSimplify] Pass SimplifyQuery into simplifyUnsignedRangeCheck() and use it for isKnownNonZero()
This was actually the original intention in D67332,
but i messed up and forgot about it.
This patch was originally part of D67411, but precommitting this.

llvm-svn: 371630
2019-09-11 15:32:46 +00:00
Roman Lebedev 6e2c5c8710 [InstSimplify] simplifyUnsignedRangeCheck(): if we know that X != 0, handle more cases (PR43246)
Summary:
This is motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.

In this particular case, given
```
char* test(char& base, unsigned long offset) {
  return &base + offset;
}
```
it will end up producing something like
https://godbolt.org/z/LK5-iH
which after optimizations reduces down to roughly
```
define i1 @t0(i8* nonnull %base, i64 %offset) {
  %base_int = ptrtoint i8* %base to i64
  %adjusted = add i64 %base_int, %offset
  %non_null_after_adjustment = icmp ne i64 %adjusted, 0
  %no_overflow_during_adjustment = icmp uge i64 %adjusted, %base_int
  %res = and i1 %non_null_after_adjustment, %no_overflow_during_adjustment
  ret i1 %res
}
```
Without D67122 there was no `%non_null_after_adjustment`,
and in this particular case we can get rid of the overhead:

Here we add some offset to a non-null pointer,
and check that the result does not overflow and is not a null pointer.
But since the base pointer is already non-null, and we check for overflow,
that overflow check will already catch the null pointer,
so the separate null check is redundant and can be dropped.

Alive proofs:
https://rise4fun.com/Alive/WRzq

There are more patterns of "unsigned-add-with-overflow", they are not handled here,
but this is the main pattern, that we currently consider canonical,
so it makes sense to handle it.

https://bugs.llvm.org/show_bug.cgi?id=43246

Reviewers: spatel, nikic, vsk

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits, reames

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67332

llvm-svn: 371349
2019-09-08 20:14:15 +00:00
Teresa Johnson 9c27b59cec Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.

This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.

Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.

There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.

Reviewers: chandlerc, hfinkel

Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66428

llvm-svn: 371284
2019-09-07 03:09:36 +00:00
Joerg Sonnenberger 799c96693f Allow replaceAndRecursivelySimplify to list unsimplified visitees.
This is part of D65280 and split it to avoid ABI changes on the 9.0
release branch.

llvm-svn: 370355
2019-08-29 13:22:30 +00:00
Roman Lebedev c584786854 [InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` inverted overflow bit
Summary:
Now that with D65143/D65144 we've produce `@llvm.umul.with.overflow`,
and with D65147 we've flattened the CFG, we now can see that
the guard may have been there to prevent division by zero is redundant.
We can simply drop it:
```
----------------------------------------
Name: no overflow or zero
  %iszero = icmp eq i4 %y, 0
  %umul = smul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %umul.ov.not = xor %umul.ov, -1
  %retval.0 = or i1 %iszero, %umul.ov.not
  ret i1 %retval.0
=>
  %iszero = icmp eq i4 %y, 0
  %umul = smul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %umul.ov.not = xor %umul.ov, -1
  %retval.0 = or i1 %iszero, %umul.ov.not
  ret i1 %umul.ov.not

Done: 1
Optimization is correct!
```
Note that this is inverted from what we have in a previous patch,
here we are looking for the inverted overflow bit.
And that inversion is kinda problematic - given this particular
pattern we neither hoist that `not` closer to `ret` (then the pattern
would have been identical to the one without inversion,
and would have been handled by the previous patch), neither
do the opposite transform. But regardless, we should handle this too.
I've filled [[ https://bugs.llvm.org/show_bug.cgi?id=42720 | PR42720 ]].

Reviewers: nikic, spatel, xbolva00, RKSimon

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65151

llvm-svn: 370351
2019-08-29 12:48:04 +00:00
Roman Lebedev aaf6ab4410 [InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` overflow bit
Summary:
Now that with D65143/D65144 we've produce `@llvm.umul.with.overflow`,
and with D65147 we've flattened the CFG, we now can see that
the guard may have been there to prevent division by zero is redundant.
We can simply drop it:
```
----------------------------------------
Name: no overflow and not zero
  %iszero = icmp ne i4 %y, 0
  %umul = umul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %retval.0 = and i1 %iszero, %umul.ov
  ret i1 %retval.0
=>
  %iszero = icmp ne i4 %y, 0
  %umul = umul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %retval.0 = and i1 %iszero, %umul.ov
  ret %umul.ov

Done: 1
Optimization is correct!
```

Reviewers: nikic, spatel, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65150

llvm-svn: 370350
2019-08-29 12:47:50 +00:00
Sanjay Patel 9ce5f41851 [InstCombine] fold cmp+select using select operand equivalence
As discussed in PR42696:
https://bugs.llvm.org/show_bug.cgi?id=42696
...but won't help that case yet.

We have an odd situation where a select operand equivalence fold was
implemented in InstSimplify when it could have been done more generally
in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand.

Here's an example:
https://rise4fun.com/Alive/Xplr

  %cmp = icmp eq i32 %x, 2147483647
  %add = add nsw i32 %x, 1
  %sel = select i1 %cmp, i32 -2147483648, i32 %add
  =>
  %sel = add i32 %x, 1

I've left the InstSimplify code in place for now, but my guess is that we'd
prefer to remove that as a follow-up to save on code duplication and
compile-time.

Differential Revision: https://reviews.llvm.org/D65576

llvm-svn: 367695
2019-08-02 17:39:32 +00:00
Jay Foad 565c54320e [InstSimplify] Rename SimplifyFPUnOp and SimplifyFPBinOp
Summary:
SimplifyFPBinOp is a variant of SimplifyBinOp that lets you specify
fast math flags, but the name is misleading because both functions
can simplify both FP and non-FP ops. Instead, overload SimplifyBinOp
so that you can optionally specify fast math flags.

Likewise for SimplifyFPUnOp.

Reviewers: spatel

Reviewed By: spatel

Subscribers: xbolva00, cameron.mcinally, eraman, hiraditya, haicheng, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64902

llvm-svn: 366902
2019-07-24 12:50:10 +00:00
Michael Liao 543ba4e9e0 [InstructionSimplify] Apply sext/trunc after pointer stripping
Summary:
- As the pointer stripping could trace through `addrspacecast` now, need
  to sext/trunc the offset to ensure it has the same width as the
  pointer after stripping.

Reviewers: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64768

llvm-svn: 366162
2019-07-16 01:03:06 +00:00
Tim Northover 030bb3d363 InstructionSimplify: Simplify InstructionSimplify. NFC.
The interface predates CallBase, so both it and implementation were
significantly more complicated than they needed to be. There was even
some redundancy that could be eliminated.

Should also help with OpaquePointers by not trying to derive a
function's type from it's PointerType.

llvm-svn: 365767
2019-07-11 13:11:44 +00:00
Johannes Doerfert 3ed286a388 Replace three "strip & accumulate" implementations with a single one
This patch replaces the three almost identical "strip & accumulate"
implementations for constant pointer offsets with a single one,
combining the respective functionalities. The old interfaces are kept
for now.

Differential Revision: https://reviews.llvm.org/D64468

llvm-svn: 365723
2019-07-11 01:14:48 +00:00
Sanjay Patel b342f026a4 [InstSimplify] simplify power-of-2 (single bit set) sequences
As discussed in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

Improving the canonicalization for these patterns:
rL363956
...means we should adjust/enhance the related simplification.

https://rise4fun.com/Alive/w1cp

  Name: isPow2 or zero
  %x = and i32 %xx, 2048
  %a = add i32 %x, -1
  %r = and i32 %a, %x
  =>
  %r = i32 0

llvm-svn: 363997
2019-06-20 22:55:28 +00:00
Roman Lebedev 5a663bd77a [InstSimplify] Fix addo/subo undef folds (PR42209)
Fix folds of addo and subo with an undef operand to be:

`@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`,
 as per LLVM undef rules.
Same for commuted variants.

Based on the original version of the patch by @nikic.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 | PR42209 ]]

Differential Revision: https://reviews.llvm.org/D63065

llvm-svn: 363522
2019-06-16 20:39:45 +00:00
Sanjay Patel 866db10228 [InstSimplify] reduce code duplication for fcmp folds; NFC
llvm-svn: 362904
2019-06-09 13:58:46 +00:00
Sanjay Patel 73f5a855b3 [InstSimplify] enhance fcmp fold with never-nan operand
This is another step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

This is a continuation of D62979 / rL362879.

llvm-svn: 362903
2019-06-09 13:48:59 +00:00
Sanjay Patel 4329c15f11 [InstSimplify] enhance fcmp fold with never-nan operand
This is 1 step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

I'll update the 'ult' case below here as a follow-up assuming no problems here.

Differential Revision: https://reviews.llvm.org/D62979

llvm-svn: 362879
2019-06-08 15:12:33 +00:00
Craig Topper b457e430f3 [InstructionSimplify] Add missing implementation of llvm::SimplifyUnOp. NFC
There are no callers currently, but the function is declared so we should at
least implement it.

llvm-svn: 362205
2019-05-31 08:10:23 +00:00
Sanjay Patel 8869a98e82 [InstSimplify] fold insertelement-of-extractelement
This was partly handled in InstCombine (only the constant
index case), so delete that and zap it more generally in
InstSimplify.

llvm-svn: 361576
2019-05-24 00:13:58 +00:00
Sanjay Patel e60cb7d1be [InstSimplify] insertelement V, undef, ? --> V
This was part of InstCombine, but it's better placed in
InstSimplify. InstCombine also had an unreachable but weaker
fold for insertelement with undef index, so that is deleted.

llvm-svn: 361559
2019-05-23 21:49:47 +00:00
Sanjay Patel 63fa690617 [InstSimplify] update stale comment; NFC
Missed this diff with rL361118.

llvm-svn: 361180
2019-05-20 17:52:18 +00:00
Cameron McInally 2d2a46db8e [InstSimplify] Teach fsub -0.0, (fneg X) ==> X about unary fneg
Differential Revision: https://reviews.llvm.org/D62077

llvm-svn: 361151
2019-05-20 13:13:35 +00:00
Sanjay Patel 9ef99b4b11 [InstSimplify] fold fcmp (maxnum, X, C1), C2
This is the sibling transform for rL360899 (D61691):

  maxnum(X, GreaterC) == C --> false
  maxnum(X, GreaterC) <= C --> false
  maxnum(X, GreaterC) <  C --> false
  maxnum(X, GreaterC) >= C --> true
  maxnum(X, GreaterC) >  C --> true
  maxnum(X, GreaterC) != C --> true

llvm-svn: 361118
2019-05-19 14:26:39 +00:00