Commit Graph

853 Commits

Author SHA1 Message Date
Nikita Popov 7e18cd887c [InstCombine] Whitelist non-refining folds in SimplifyWithOpReplaced
This is an alternative to D98391/D98585, playing things more
conservatively. If AllowRefinement == false, then we don't use
InstSimplify methods at all, and instead explicitly implement a
small number of non-refining folds. Most cases are handled by
constant folding, and I only had to add three folds to cover
our unit tests / test-suite. While this may lose some optimization
power, I think it is safer to approach from this direction, given
how many issues this code has already caused.

Differential Revision: https://reviews.llvm.org/D99027
2021-03-22 22:12:56 +01:00
Nikita Popov daae927f9c [InstSimplify] Clean up SimplifyReplacedWithOp implementation (NFCI)
Replace Op with RepOp up-front, and then always work with the new
operands, rather than checking for replacement in various places.
2021-03-21 15:30:30 +01:00
Simonas Kazlauskas 6513995be3 [InstSimplify] Restrict a GEP transform to avoid provenance changes
This is a follow-up to D98588, and fixes the inline `FIXME` about a GEP-related simplification not
preserving the provenance.

https://alive2.llvm.org/ce/z/qbQoAY

Additional tests were added in {rGf125f28afdb59eba29d2491dac0dfc0a7bf1b60b}

Depends on D98672

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D98611
2021-03-16 18:53:05 +02:00
Simonas Kazlauskas a977324800 [InstSimplify] Match PtrToInt more directly in a GEP transform (NFC)
In preparation for D98611, the upcoming change will need to apply additional checks to `P` and `V`,
and so this refactor paves the way for adding additional checks in a less awkward way.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D98672
2021-03-16 15:45:19 +02:00
Sanjay Patel 660728acd4 [InstSimplify] ctlz({signbit} >>u x) --> x
The motivating pattern was handled in 0a2d69480d ,
but we should have this for symmetry.

But this really highlights that we could generalize for
any shifted constant if we match this in instcombine.

https://alive2.llvm.org/ce/z/MrmVNt
2021-03-15 12:03:35 -04:00
Bjorn Pettersson 529c8e8dc6 [InstSimplify] Simplify smul.fix and smul.fix.sat
Add simplification of smul.fix and smul.fix.sat according to
  X * 0 -> 0
  X * undef -> 0
  X * (1 << scale) -> X

This includes the commuted patterns and splatted vectors.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D98299
2021-03-12 09:09:58 +01:00
Juneyoung Lee 720a828045 Resolve unused variable warning (NFC) 2021-03-11 12:03:03 +09:00
Juneyoung Lee 8652c3e1a3 [InstSimplify] Pass SimplifyQuery to computePointerICmp (NFC) 2021-03-11 11:13:46 +09:00
Juneyoung Lee f49354838e Revert "[InstCombine] Add simplification of two logical and/ors"
This reverts commit 07c3b97e18 due to a reported failure in two-stage build.
2021-03-10 05:48:31 +09:00
Sanjay Patel 34d0d644ff [ValueTracking] move/add helper to get inverse min/max; NFC
We will need to this functionality to improve min/max folds
in instcombine when we canonicalize to intrinsics.
2021-03-08 17:38:22 -05:00
Sanjay Patel 0a2d69480d [InstSimplify] cttz(1<<x) --> x
https://alive2.llvm.org/ce/z/TDacYu
https://alive2.llvm.org/ce/z/KF84S3
2021-03-08 16:30:14 -05:00
Juneyoung Lee 07c3b97e18 [InstCombine] Add simplification of two logical and/ors
This is a patch that adds folding of two logical and/ors that share one variable:

a && (a && b) -> a && b
a && (a & b)  -> a && b
...

This is towards removing the poison-unsafe select optimization (D93065 has more context).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96945
2021-03-08 02:38:43 +09:00
Simon Pilgrim 1020d16156 [InstSimplify] Handle nsw shl -> poison patterns
Pulled out from D90479 - this recognises invalid nsw shl patterns with signbit changes that result in poison.

Differential Revision: https://reviews.llvm.org/D97305
2021-02-23 18:26:56 +00:00
Simon Pilgrim 18b9fc48f1 [InstructionSimplify] SimplifyShift - rename shift amount KnownBits. NFCI.
As suggested on D97305.
2021-02-23 18:12:59 +00:00
Simon Pilgrim 476ff0327b [InstSimplify] Cleanup out-of-range shift amount handling.
Use APInt::uge() direct instead of getLimitedValue().

Use KnownBits::getMinValue() to make the bounds check more obvious.
2021-02-22 17:00:49 +00:00
Caroline Concatto 2d728bbff5 [CodeGen][SelectionDAG]Add new intrinsic experimental.vector.reverse
This patch adds  a new intrinsic experimental.vector.reduce that takes a single
vector and returns a vector of matching type but with the original lane order
 reversed. For example:

```
vector.reverse(<A,B,C,D>) ==> <D,C,B,A>
```

The new intrinsic supports fixed and scalable vectors types.
The fixed-width vector relies on shufflevector to maintain existing behaviour.
Scalable vector uses the new ISD node - VECTOR_REVERSE.

This new intrinsic is one of the named shufflevector intrinsics proposed on the
mailing-list in the RFC at [1].

Patch by Paul Walker (@paulwalker-arm).

[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html

Differential Revision: https://reviews.llvm.org/D94883
2021-02-15 13:39:43 +00:00
Juneyoung Lee 0441df94ad [InstCombine,InstSimplify] Optimize select followed by and/or/xor
This patch adds `A & (A && B)` -> `A && B`  (similarly for or + logical or)

Also, this patch adds `~(select C, (icmp pred X, Y), const)` -> `select C, (icmp pred' X, Y), ~const`.

Alive2 proof:
merge_and: https://alive2.llvm.org/ce/z/teMR97
merge_or: https://alive2.llvm.org/ce/z/b4yZUp
xor_and: https://alive2.llvm.org/ce/z/_-TXHi
xor_or: https://alive2.llvm.org/ce/z/2uYx_a

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94861
2021-01-19 09:14:17 +09:00
Nikita Popov a13c0f62c3 [InstSimplify] Fold x*C1/C2 <= x (PR48744)
We can fold x*C1/C2 <= x to true if C1 <= C2. This is valid even
if the multiplication is not nuw: https://alive2.llvm.org/ce/z/vULors

The multiplication or division can be replaced by shifts. We don't
handle the case where both are shifts, as that should get folded
away by InstCombine.
2021-01-17 16:02:55 +01:00
Dávid Bolvanský bfd75bdf3f [NFC] Removed extra text in comments 2021-01-16 22:48:56 +01:00
Dávid Bolvanský 63bedc80da [InstSimplify] Handle commutativity for 'and' and 'outer or' for (~A & B) | ~(A | B) --> ~A
Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D94870
2021-01-16 19:42:50 +01:00
Dávid Bolvanský bdd4dda58b [InstSimplify] Update comments, remove redundant tests 2021-01-16 16:31:23 +01:00
Dávid Bolvanský a4e2a5145a [InstSimplify] Add (~A & B) | ~(A | B) --> ~A 2021-01-16 15:43:34 +01:00
Nikita Popov 7ecad2e4ce [InstSimplify] Don't fold gep p, -p to null
This is a partial fix for https://bugs.llvm.org/show_bug.cgi?id=44403.
Folding gep p, q-p to q is only legal if p and q have the same
provenance. This fold should probably be guarded by something like
getUnderlyingObject(p) == getUnderlyingObject(q).

This patch is a partial fix that removes the special handling for
gep p, 0-p, which will fold to a null pointer, which would certainly
not pass an underlying object check (unless p is also null, in which
case this would fold trivially anyway). Folding to a null pointer
is particularly problematic due to the special handling it receives
in many places, making end-to-end miscompiles more likely.

Differential Revision: https://reviews.llvm.org/D93820
2021-01-12 20:24:23 +01:00
Juneyoung Lee 3a60a1f165 [InstSimplify] Fold insertelement vec, poison, idx into vec
This is a simple patch that adds folding from `insertelement vec, poison, idx` into `vec`.

Alive2 proof: https://alive2.llvm.org/ce/z/2y2vbC

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93994
2021-01-07 10:10:14 +09:00
Nikita Popov 221c3b174b [InstSimplify] Canonicalize non-demanded shuffle op to poison (NFCI)
I don't believe this has an observable effect, because the only
thing we care about here is replacing the operand with a constant
so following folds can apply. This change is just to make the
representation follow canonical unary shuffle form.
2021-01-06 21:22:27 +01:00
Nikita Popov d042f2db5b [InstSimplify] Fold call null/undef to poison
Calling null or undef results in immediate undefined behavior.
Return poison instead of undef in this case, similar to what
we do for immediate UB due to division by zero.
2021-01-06 21:09:30 +01:00
Nikita Popov a6df39236f [InstSimplify] Fold out-of-bounds shift to poison
Make InstSimplify return poison rather than undef for out-of-bounds
shifts, as specified by LandRef:

> If op2 is (statically or dynamically) equal to or larger than the
> number of bits in op1, this instruction returns a poison value.

Differential Revision: https://reviews.llvm.org/D93998
2021-01-06 20:41:37 +01:00
Juneyoung Lee f665a8c5b8 [InstSimplify] gep with poison operand is poison
This is a tiny update to fold gep poison into poison. :)

Alive2 proofs:
https://alive2.llvm.org/ce/z/7Nwdri
https://alive2.llvm.org/ce/z/sDP4sC
2021-01-05 11:07:49 +09:00
Kazu Hirata 848e8f938f [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-04 11:42:44 -08:00
Nikita Popov 3715c99be9 [InstSimplify] Fold nnan/ninf violation to poison
As the comment already indicates, performing an operation with
nnan/ninf flags on a nan/inf or undef results in poison. Now that
we have a proper poison value, we no longer need to relax it to
undef.
2021-01-03 22:05:40 +01:00
Nikita Popov 766cf7f32e [InstSimplify] Fold division by zero to poison
Div/rem by zero is immediate undefined behavior and anything goes.
Currently we fold it to undef, this patch changes it to fold to
poison instead, which is slightly stronger.

Differential Revision: https://reviews.llvm.org/D93995
2021-01-03 20:52:45 +01:00
Nikita Popov f094d65bea [InstSimplify] Fix addo/subo with undef (PR43188)
We can't fold the first result to undef, because not all values
may be reachable under the constraint that no overflow occurred.
Use the same folds we do for saturated math instead.

Proofs:
uaddo: https://alive2.llvm.org/ce/z/zf55N_
saddo: https://alive2.llvm.org/ce/z/a_xPgS
usubo: https://alive2.llvm.org/ce/z/DmRqwt
ssubo: https://alive2.llvm.org/ce/z/8ag7U-
2021-01-03 18:51:49 +01:00
Nikita Popov c6ad00d709 [InstSimplify] Return poison for out of bounds extractelement
This is the same change as D93990, but for extractelement rather
than insertelement.

> If idx exceeds the length of val for a fixed-length vector, the
> result is a poison value. For a scalable vector, if the value of
> idx exceeds the runtime length of the vector, the result is a
> poison value.
2021-01-03 18:15:58 +01:00
Juneyoung Lee 2139958b53 [InstSimplify] Return poison if insertelement touches out of bounds
This is a simple patch that updates InstSimplify to return poison if the index is/can be out-of-bounds

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93990
2021-01-04 00:43:02 +09:00
Kazu Hirata a87c7003ac [Analysis] Remove unused code recursivelySimplifyInstruction (NFC)
The last use of the function, located in RemovePredecessorAndSimplify,
was removed on Dec 25, 2020 in commit
46bea9b297.

The last use of RemovePredecessorAndSimplify was removed on Sep 29,
2010 in commit 99c985c37d.
2020-12-30 17:45:40 -08:00
Sanjay Patel 236c4524a7 [InstSimplify] remove ctpop of 1 (low) bit
https://llvm.org/PR48608

As noted in the test comment, we could handle a more general
case in instcombine and remove this, but I don't have evidence
that we need to do that.

https://alive2.llvm.org/ce/z/MRW9gD
2020-12-28 16:06:20 -05:00
Sanjay Patel 38ca7face6 [InstSimplify] reduce logic with inverted add/sub ops
https://llvm.org/PR48559
This could be part of a larger ValueTracking API,
but I don't see that currently.

https://rise4fun.com/Alive/gR0

  Name: and
  Pre: C1 == ~C2
  %sub = add i8 %x, C1
  %sub1 = sub i8 C2, %x
  %r = and i8 %sub, %sub1
  =>
  %r = 0

  Name: or
  Pre: C1 == ~C2
  %sub = add i8 %x, C1
  %sub1 = sub i8 C2, %x
  %r = or i8 %sub, %sub1
  =>
  %r = -1

  Name: xor
  Pre: C1 == ~C2
  %sub = add i8 %x, C1
  %sub1 = sub i8 C2, %x
  %r = xor i8 %sub, %sub1
  =>
  %r = -1
2020-12-21 08:51:43 -05:00
Roman Lebedev e9289dc25f
[InstSimplify] Don't miscompile `X == 0 ? abs(X) : -abs(X) --> -abs(X)` xform
The transform wasn't checking that the LHS of the comparison
*is* the `X` in question...
This is the miscompile that was holding up D87188.

Thanks to Dave Green for producing an actionable reproducer!
2020-12-18 21:18:13 +03:00
Kazu Hirata eb44682d67 [Analysis] Use is_contained (NFC) 2020-12-11 21:19:31 -08:00
Cullen Rhodes 7b8d50b141 [InstSimplify] Clarify use of FixedVectorType in SimplifySelectInst
Folding a select of vector constants that include undef elements only
applies to fixed vectors, but there's no earlier check the type is not
scalable so it crashes for scalable vectors. This adds a check so this
optimization is only attempted for fixed vectors.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D92046
2020-11-27 09:55:29 +00:00
Sanjay Patel 00808e321c [InstSimplify] allow vector folds for (Pow2C << X) == NonPow2C
Existing pre-conditions seem to be correct:
https://rise4fun.com/Alive/lCLB

  Name: non-zero C1
  Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C1 != 0
  %sub = shl i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: one == C2
  Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C2 == 1
  %sub = shl i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: nuw
  Pre: !isPowerOf2(C1) && isPowerOf2(C2)
  %sub = shl nuw i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: nsw
  Pre: !isPowerOf2(C1) && isPowerOf2(C2)
  %sub = shl nsw i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false
2020-11-08 09:52:05 -05:00
Sanjay Patel c74db55ff5 [InstSimplify] allow vector folds for icmp Pred (1 << X), 0x80 2020-11-04 08:12:48 -05:00
Sanjay Patel e77ba263fe [InstSimplify] peek through 'not' operand in logic-of-icmps fold
This extends D78430 to solve cases like:
https://llvm.org/PR47858

There are still missed opportunities shown in the tests,
and as noted in the earlier patches, we have related
functionality in InstCombine, so we may want to extend
other folds in a similar way.

A semi-random sampling of test diff proofs in this patch:
https://rise4fun.com/Alive/sS4C
2020-10-25 11:13:30 -04:00
Sjoerd Meijer 51d7df3fa1 [InstructionSimplify] icmp (X+Y), (X+Z) simplification
This improves simplifications for pattern `icmp (X+Y), (X+Z)` -> `icmp Y,Z`
if only one of the operands has NSW set, e.g.:

    icmp slt (x + 0), (x +nsw 1)

We can still safely rewrite this to:

    icmp slt 0, 1

because we know that the LHS can't overflow if the RHS has NSW set and
C1 < C2 && C1 >= 0, or C2 < C1 && C1 <= 0

This simplification is useful because ScalarEvolutionExpander which is used to
generate code for SCEVs in different loop optimisers is not always able to put
back NSW flags across control-flow, thus inhibiting CFG simplifications.

Differential Revision: https://reviews.llvm.org/D89317
2020-10-22 08:55:52 +01:00
Sanjay Patel 7c516504a1 [InstSimplify] allow vector splats for icmp-of-neg folds 2020-10-20 09:24:36 -04:00
Juneyoung Lee 9b3c2a72e4 [ValueTracking] Use assume's noundef operand bundle
This patch updates `isGuaranteedNotToBeUndefOrPoison` to use `llvm.assume`'s `noundef` operand bundle.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89219
2020-10-14 20:16:33 +09:00
Simon Pilgrim 95a440b936 [IR] PatternMatch - add m_FShl/m_FShr funnel shift intrinsic matchers. NFCI. 2020-10-01 14:42:34 +01:00
Sanjay Patel 3f100e64b4 [InstSimplify] fix fmin/fmax miscompile for partial undef vectors (PR47567)
It would also be correct to return the variable operand in these cases,
but eliminating a variable use is probably better for optimization.
2020-09-18 10:05:44 -04:00
Nikita Popov 0bb06f297f [InstSimplify] Clarify SimplifyWithOpReplaced() return value
If SimplifyWithOpReplaced() cannot simplify the value, null should
be returned. Make sure this really does happen in all cases,
including those where SimplifyBinOp() returns the original value.

This does not matter for existing users, but does mattter for
D87480, which would go into an infinite loop otherwise.
2020-09-16 20:53:26 +02:00
Sanjay Patel 8985755762 [InstSimplify] add limit folds for fmin/fmax
If the constant operand is the opposite of the min/max value,
then the result must be the other value.

This is based on the similar codegen transform proposed in:
D87571
2020-09-15 10:58:44 -04:00
Sanjay Patel 55d371abd7 [InstSimplify] add folds for fmin/fmax with 'nnan'
maximum(nnan X, +INF) --> +INF
minimum(nnan X, -INF) --> -INF

This is based on the similar codegen transform proposed in:
D87571
2020-09-14 11:46:11 -04:00
Sanjay Patel 7526376164 [InstSimplify] allow folds for fmin/fmax with 'ninf'
maxnum(ninf X, +FLT_MAX) --> +FLT_MAX
minnum(ninf X, -FLT_MAX) --> -FLT_MAX

This is based on the similar codegen transform proposed in:
D87571
2020-09-14 11:18:08 -04:00
Sanjay Patel 22c583c3d0 [InstSimplify] reduce code duplication for fmin/fmax folds; NFC
We use the same code structure for folding integer min/max.
2020-09-14 10:32:11 -04:00
Sanjay Patel 7bb9a2f996 [InstSimplify] fix miscompiles with maximum/minimum intrinsics
As discussed in the sibling codegen functionality patch D87571,
this transform was created with D52766, but it is not correct.

The incorrect test diffs were missed during review, but the
'TODO' comment about this functionality was still in the code -
we need 'nnan' to enable this fold.
2020-09-14 09:06:41 -04:00
Nikita Popov 36e2e2e12e [InstCombine] Fix incorrect SimplifyWithOpReplaced transform (PR47322)
This is a followup to D86834, which partially fixed this issue in
InstSimplify. However, InstCombine repeats the same transform while
dropping poison flags -- which does not cover cases where poison is
introduced in some other way.

The fix here is a bit more comprehensive, because things are quite
entangled, and it's hard to only partially address it without
regressing optimization. There are really two changes here:

 * Export the SimplifyWithOpReplaced API from InstSimplify, with an
   added AllowRefinement flag. For replacements inside the TrueVal
   we don't actually care whether refinement occurs or not, the
   replacement is always legal. This part of the transform is now
   done in InstSimplify only. (It should be noted that the current
   AllowRefinement check is not sufficient -- that's an issue we
   need to address separately.)
 * Change the InstCombine fold to work by temporarily dropping
   poison generating flags, running the fold and then restoring the
   flags if it didn't work out. This will ensure that the InstCombine
   fold is correct as long as the InstSimplify fold is correct.

Differential Revision: https://reviews.llvm.org/D87445
2020-09-12 14:45:06 +02:00
Nikita Popov e97f3b1b43 [InstCombine] Fold abs of known negative operand
If we know that the abs operand is known negative, we can replace
it with a neg.

To avoid computing known bits twice, I've removed the fold for the
non-negative case from InstSimplify. Both the non-negative and the
negative case are handled by InstCombine now, with one known bits call.

Differential Revision: https://reviews.llvm.org/D87196
2020-09-08 20:14:35 +02:00
Nikita Popov ff218cbc84 [InstSimplify] Fold degenerate abs of abs form
This addresses the remaining issue from D87188. Due to a series of
folds, we may end up with abs-of-abs represented as
x == 0 ? -abs(x) : abs(x). Rather than recognizing this as a special
abs pattern and doing an abs-of-abs fold on it afterwards,
I'm directly folding this to one of the select operands in InstSimplify.

The general pattern falls into the "select with operand replaced"
category, but that fold is not powerful enough to recognize that
both hands of the select are the same for value zero.

Differential Revision: https://reviews.llvm.org/D87197
2020-09-06 09:43:08 +02:00
Nikita Popov 73104b0751 [InstSimplify] Fold min/max based on dominating condition
If we have a dominating condition that x >= y, then umax(x, y) is x,
etc. I'm doing this in InstSimplify as the corresponding transform
for the select form is also done there.

Differential Revision: https://reviews.llvm.org/D87168
2020-09-05 16:16:40 +02:00
Nikita Popov 88b310f64b [InstSimplify] Reduce code duplication in simplifySelectWithICmpCond (NFC)
Canonicalize icmp ne to icmp eq and implement all the folds only once.
2020-08-29 22:38:49 +02:00
Nikita Popov a5be86fde5 [InstSimplify] Protect against more poison in SimplifyWithOpReplaced (PR47322)
Replace the check for poison-producing instructions in
SimplifyWithOpReplaced() with the generic helper canCreatePoison()
that properly handles poisonous shifts and thus avoids the problem
from PR47322.

This additionally fixes a bug in IIQ.UseInstrInfo=false mode, which
previously could have caused this code to ignore poison flags.
Setting UseInstrInfo=false should reduce the possible optimizations,
not increase them.

This is not a full solution to the problem, as poison could be
introduced more indirectly. This is just a minimal, easy to backport
fix.

Differential Revision: https://reviews.llvm.org/D86834
2020-08-29 21:59:39 +02:00
Roman Lebedev c1b3e32118
[NFC][InstructionSimplify] Add a warning about not simplifying to not def-reachable
See
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824235.html
and
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/824967.html

InstSimply is not allowed to perform simplifications to instructions
that are not def-reachable from the original instruction.
2020-08-29 09:58:08 +03:00
Owen Anderson ed90f15efb Revert "[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block"
This reverts commit 6102310d81.  It
appears to cause compilation non-determinism and caused stage3
mismatches.
2020-08-28 23:43:42 +00:00
David Sherwood f4257c5832 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
Roman Lebedev b85f91fdce
[InstSimplify] SimplifyPHINode(): check that instruction is in basic block first
As pointed out in post-commit review, this can legally be called
on instructions that are not inserted into basic blocks,
so don't blindly assume that there is basic block.
2020-08-27 22:32:03 +03:00
Roman Lebedev 6102310d81
[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block
Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify,
nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places..

While i could teach EarlyCSE how to hash PHI nodes,
we can't really do much (anything?) even if we find two identical
PHI nodes in different basic blocks, same-BB case is the interesting one,
and if we teach InstSimplify about it (which is what i wanted originally,
https://reviews.llvm.org/D86530), we get EarlyCSE support for free.

So i would think this is pretty uncontroversial.

On vanilla llvm test-suite + RawSpeed, this has the following effects:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instsimplify.NumPHICSE                             | 0         | 23779     |  23779 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942328   | 7942392   |     64 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 273069192 | 273084704 |  15512 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18539     |    127 |    0.69% |    0.69% |
| early-cse.NumCSE                                   | 2183283   | 2183227   |    -56 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 542090    |  -8015 |   -1.46% |    1.46% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640264   | 3664769   |  24505 |    0.67% |    0.67% |
| instcombine.NumDeadInst                            | 1778193   | 1783183   |   4990 |    0.28% |    0.28% |
| instcount.NumCallInst                              | 1758401   | 1758799   |    398 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330533    |    -24 |   -0.01% |    0.01% |
| instcount.TotalInsts                               | 8831952   | 8832286   |    334 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019808   | 999607    | -20201 |   -1.98% |    1.98% |
```
I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call`
transforms, and counter-intuitively results in *more* instructions total.

That being said, the PHI count doesn't decrease that much,
and looking at some examples, it seems at least some of them
were previously getting PHI CSE'd in SimplifyCFG of all places..

I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time.
As a comment in `InstCombinerImpl::visitPHINode()` already stated,
there are no guarantees on the ordering of the operands of a PHI node,
so if we just naively compare them, we may false-negatively say that
the nodes are not equal when the only difference is operand order,
which is especially important since the fold is in InstSimplify,
so we can't rely on InstCombine sorting them beforehand.

Fixing this for the general case is costly (geomean +0.02%),
and does not appear to catch anything in test-suite, but for
the same-BB case, it's trivial, so let's fix at least that.

As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions
this appears to cause geomean +0.03% compile time increase (regression),
but geomean -0.01%..-0.04% code size decrease (improvement).
2020-08-27 18:47:04 +03:00
Nikita Popov d7c119d89c [InstSimplify] Fold min/max intrinsic based on icmp of operands
This is a reboot of D84655, now performing the inner icmp
simplification query without undef folds.

It should be possible to handle the current foldMinMaxSharedOp()
fold based on this, by moving the logic into icmp of min/max instead,
making it more general. We can't drop the folds for constant operands,
because those also allow undef, which we exclude here.

The tests use assumes for exhaustive coverage, and have a few
more examples of misc folds we get based on icmp simplification.

Differential Revision: https://reviews.llvm.org/D85929
2020-08-26 22:02:57 +02:00
Arthur Eubanks 098d3f9827 [InstSimplify] Simplify to vector constants when possible
InstSimplify should do all transformations that ConstProp does, but
one thing that ConstProp does that InstSimplify wouldn't is inline
vector instructions that are constants, e.g. into a ret.

Previously vector instructions wouldn't be inlined in InstSimplify
because llvm::Simplify*Instruction() would return nullptr for specific
instructions, such as vector instructions that were actually constants,
if it couldn't simplify them.

This changes SimplifyInsertElementInst, SimplifyExtractElementInst, and
SimplifyShuffleVectorInst to return a vector constant when possible.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85946
2020-08-26 11:40:36 -07:00
Craig Topper a7a06ded8b Recommit "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and its follow up patches
This recommits the following patches now that D85684 has landed

1cf6f210a2 [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
469da663f2 [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
122b0640fc [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
ac0af12ed2 [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison
9b1e95329a [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
2020-08-12 10:45:27 -07:00
Nikita Popov 06d567059e [InstSimplify] Respect CanUseUndef in more places
Similar to what we do in IIQ, add an isUndefValue() helper that
checks for undef values while respective CanUseUndef. This makes
it much easier to search for places that don't respect the flag
yet.
2020-08-11 21:53:33 +02:00
Nikita Popov d110d4aaff [InstSimplify] Forbid undef folds in expandBinOp
This is the replacement for D84250 based on D84792. As we recursively
fold with the same value twice, we need to disable undef folds,
to prevent an undef from being folded to two different values.

Reverting rG00f3579aea6e3d4a4b7464c3db47294f71cef9e4 and using the
test case from https://reviews.llvm.org/D83360#2145793, it no longer
performs the incorrect fold.

Differential Revision: https://reviews.llvm.org/D85684
2020-08-11 18:39:24 +02:00
Sanjay Patel 1470ce4a76 [InstSimplify] fold min/max with matching min/max operands
I think this is the last remaining translation of an existing
instcombine transform for the corresponding cmp+sel idiom.

This interpretation is more general though - we can remove
mismatched signed/unsigned combinations in addition to the
more obvious cases.

min/max(X, Y) must produce X or Y as the result, so this is
just another clause in the existing transform that was already
matching a min/max of min/max.
2020-08-11 11:23:15 -04:00
Florian Hahn d236e1c7b6 [InstSimplify/NewGVN] Add option to control the use of undef.
Making use of undef is not safe if the simplification result is not used
to replace all uses of the result. This leads to problems in NewGVN,
which does not replace all uses in the IR directly. See PR33165 for more
details.

This patch adds an option to SimplifyQuery to disable the use of undef.

Note that I've only guarded uses if isa<UndefValue>/m_Undef where
SimplifyQuery is currently available. If we agree on the general
direction, I'll update the remaining uses.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84792
2020-08-09 19:16:56 +01:00
Sanjay Patel 250a167c41 [InstSimplify] avoid crashing by trying to rem-by-zero
Bug was noted in the post-commit comments for:
rGe8760bb9a8a3
2020-08-06 16:06:31 -04:00
Sanjay Patel e8760bb9a8 [InstSimplify] fold icmp with mul nsw and constant operands
https://rise4fun.com/Alive/slvl

  Name: mul nsw with icmp eq
  Pre: (C2 % C1) != 0
  %a = mul nsw i8 %x, C1
  %r = icmp eq i8 %a, C2
    =>
  %r = false

  Name: mul nsw with icmp ne
  Pre: (C2 % C1) != 0
  %a = mul nsw i8 %x, C1
  %r = icmp ne i8 %a, C2
    =>
  %r = true

Follow-up to the 'nuw' variation added with:
rGf879c9b79621
2020-08-05 14:38:39 -04:00
Sanjay Patel f879c9b796 [InstSimplify] fold icmp with mul nuw and constant operands
https://rise4fun.com/Alive/pZEr

  Name: mul nuw with icmp eq
  Pre: (C2 %u C1) != 0
  %a = mul nuw i8 %x, C1
  %r = icmp eq i8 %a, C2
    =>
  %r = false

  Name: mul nuw with icmp ne
  Pre: (C2 %u C1) != 0
  %a = mul nuw i8 %x, C1
  %r = icmp ne i8 %a, C2
    =>
  %r = true

There are potentially several other transforms we need to add based on:
D51625
...but it doesn't look like there was follow-up to that patch.
2020-08-05 14:32:17 -04:00
Sanjay Patel bd2c88b253 [InstSimplify] reduce code duplication in simplifyICmpWithMinMax(); NFC 2020-08-05 11:39:28 -04:00
Xavier Denis 29fe3fe615 [InstSimplify] Peephole optimization for icmp (urem X, Y), X
This revision adds the following peephole optimization
and it's negation:

    %a = urem i64 %x, %y
    %b = icmp ule i64 %a, %x
    ====>
    %b = true

With John Regehr's help this optimization was checked with Alive2
which suggests it should be valid.

This pattern occurs in the bound checks of Rust code, the program

    const N: usize = 3;
    const T = u8;

    pub fn split_mutiple(slice: &[T]) -> (&[T], &[T]) {
        let len = slice.len() / N;
        slice.split_at(len * N)
    }

the method call slice.split_at will check that len * N is within
the bounds of slice, this bounds check is after some transformations
turned into the urem seen above and then LLVM fails to optimize it
any further. Adding this optimization would cause this bounds check
to be fully optimized away.

ref: https://github.com/rust-lang/rust/issues/74938

Differential Revision: https://reviews.llvm.org/D85092
2020-08-04 20:48:37 +02:00
Sanjay Patel a16882047a [InstSimplify] refactor min/max folds with shared operand; NFC 2020-08-04 12:21:05 -04:00
Sanjay Patel 04e45ae1c6 [InstSimplify] fold nested min/max intrinsics with constant operands
This is based on the existing code for the non-intrinsic idioms
in InstCombine.

The vector constant constraint is non-obvious: undefs should be
ok in the outer call, but they can't propagate safely from the
inner call in all cases. Example:

https://alive2.llvm.org/ce/z/-2bVbM
  define <2 x i8> @src(<2 x i8> %x) {
  %0:
    %m = umin <2 x i8> %x, { 7, undef }
    %m2 = umin <2 x i8> { 9, 9 }, %m
    ret <2 x i8> %m2
  }
  =>
  define <2 x i8> @tgt(<2 x i8> %x) {
  %0:
    %m = umin <2 x i8> %x, { 7, undef }
    ret <2 x i8> %m
  }
  Transformation doesn't verify!
  ERROR: Value mismatch

  Example:
  <2 x i8> %x = < undef, undef >

  Source:
  <2 x i8> %m = < #x00 (0)	[based on undef value], #x00 (0) >
  <2 x i8> %m2 = < #x00 (0), #x00 (0) >

  Target:
  <2 x i8> %m = < #x07 (7), #x10 (16) >
  Source value: < #x00 (0), #x00 (0) >
  Target value: < #x07 (7), #x10 (16) >
2020-08-04 08:44:48 -04:00
Sanjay Patel 20c71e55aa [InstSimplify] reduce code for min/max analysis; NFC
This should probably be moved up to some common area eventually
when there's another user.
2020-08-04 08:02:33 -04:00
Sanjay Patel 9e5cf6bde5 [InstSimplify] fold variations of max-of-min with common operand
https://alive2.llvm.org/ce/z/ZtxpZ3
2020-08-03 15:02:46 -04:00
Sanjay Patel 4abc69c6f5 [InstSimplify] fold max (max X, Y), X --> max X, Y
https://alive2.llvm.org/ce/z/VGgG3M
2020-08-02 11:50:58 -04:00
Nikita Popov a0addbb4ec [InstSimplify] Reduce code duplication in icmp of binop folds (NFC)
For folds where we check for the binop on both the LHS and RHS,
extract a function that expects it on the LHS and call it with
swapped order.
2020-08-02 15:47:18 +02:00
Craig Topper 85b5315dbe [InstSimplify] Fold abs(abs(x)) -> abs(x)
It's always safe to pick the earlier abs regardless of the nsw flag. We'll just lose it if it is on the outer abs but not the inner abs.

Differential Revision: https://reviews.llvm.org/D85053
2020-08-01 13:25:00 -07:00
Sanjay Patel 04b99a4d18 [InstSimplify] simplify abs if operand is known non-negative
abs() should be rare enough that using value tracking is not going
to be a compile-time cost burden, so use it to reduce a variety of
potential patterns. We do this in DAGCombiner too.

Differential Revision: https://reviews.llvm.org/D85043
2020-08-01 07:47:06 -04:00
Vitaly Buka b0eb40ca39 [NFC] Remove unused GetUnderlyingObject paramenter
Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621
2020-07-31 02:10:03 -07:00
Vitaly Buka 89051ebace [NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
2020-07-30 21:08:24 -07:00
Sanjay Patel fef513f5cc [InstSimplify] fold min/max intrinsic with undef operand 2020-07-29 17:03:50 -04:00
Sanjay Patel 5cd695dd7f [InstSimplify] fold min/max with opposite of limit value 2020-07-29 17:03:50 -04:00
Sanjay Patel ee9617e96b [InstSimplify] try constant folding intrinsics before general simplifications
This matches the behavior of simplify calls for regular opcodes -
rely on ConstantFolding before spending time on folds with variables.

I am not aware of any diffs from this re-ordering currently, but there was
potential for unintended behavior from the min/max intrinsics because that
code is implicitly assuming that only 1 of the input operands is constant.
2020-07-29 13:18:40 -04:00
Sanjay Patel 3e8534fbc6 [InstSimplify] allow partial undef constants for vector min/max folds 2020-07-29 11:53:41 -04:00
Sanjay Patel 3c20ede18b [InstSimplify] fold integer min/max intrinsic with same args 2020-07-29 11:53:41 -04:00
Sanjay Patel 3fb13b8484 [InstSimplify] allow undefs in icmp with vector constant folds
This is the main icmp simplification shortcoming seen in D84655.

Alive2 agrees that the basic examples are correct at least:

define <2 x i1> @src(<2 x i8> %x) {
%0:
  %r = icmp sle <2 x i8> { undef, 128 }, %x
  ret <2 x i1> %r
}
=>
define <2 x i1> @tgt(<2 x i8> %x) {
%0:
  ret <2 x i1> { 1, 1 }
}
Transformation seems to be correct!

define <2 x i1> @src(<2 x i32> %X) {
%0:
  %A = or <2 x i32> %X, { 63, 63 }
  %B = icmp ult <2 x i32> %A, { undef, 50 }
  ret <2 x i1> %B
}
=>
define <2 x i1> @tgt(<2 x i32> %X) {
%0:
  ret <2 x i1> { 0, 0 }
}
Transformation seems to be correct!

https://alive2.llvm.org/ce/z/omt2ee
https://alive2.llvm.org/ce/z/GW4nP_

Differential Revision: https://reviews.llvm.org/D84762
2020-07-28 15:13:53 -04:00
Sanjay Patel 0481e1ae3c [InstSimplify] fold integer min/max intrinsics with limit constant 2020-07-26 09:41:54 -04:00
Sanjay Patel b89ae102e6 [InstSimplify] fold fcmp using isKnownNeverInfinity + isKnownNeverNaN
Follow-up to D84035 / rG7393d7574c09.
This sidesteps a question of FMF/poison on fcmp raised in PR46077:
http://bugs.llvm.org/PR46077

https://alive2.llvm.org/ce/z/TCsyzD
  define i1 @src(float %x) {
  %0:
    %x42 = fadd nnan ninf float %x, 42.000000
    %r = fcmp ueq float %x42, inf
    ret i1 %r
  }
  =>
  define i1 @tgt(float %x) {
  %0:
    ret i1 0
  }
  Transformation seems to be correct!

https://alive2.llvm.org/ce/z/FQaH7a
  define i1 @src(i8 %x) {
  %0:
    %cast = uitofp i8 %x to float
    %r = fcmp one float inf, %cast
    ret i1 %r
  }
  =>
  define i1 @tgt(i8 %x) {
  %0:
    ret i1 1
  }
  Transformation seems to be correct!
2020-07-26 09:04:37 -04:00
Sanjay Patel 7485e92412 [InstSimplify] reduce code duplication for binop expansion; NFC
D84250 proposes to extend this code, so the duplication for
the commuted case would continue to grow.
2020-07-23 08:35:21 -04:00
Christopher Tetreault 23c5e59d9f [SVE] Remove calls to VectorType::getNumElements from Analysis
Reviewers: efriedma, fpetrogalli, c-rhodes, asbirlea, RKSimon

Reviewed By: RKSimon

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81504
2020-07-22 15:19:05 -07:00
Sanjay Patel 7393d7574c [InstSimplify] fold fcmp with infinity constant using isKnownNeverInfinity
This is a step towards trying to remove unnecessary FP compares
with infinity when compiling with -ffinite-math-only or similar.
I'm intentionally not checking FMF on the fcmp itself because
I'm assuming that will go away eventually.
The analysis part of this was added with rGcd481136 for use with
isKnownNeverNaN. Similarly, that could be an enhancement here to
get predicates like 'one' and 'ueq'.

Differential Revision: https://reviews.llvm.org/D84035
2020-07-19 09:24:52 -04:00
Craig Topper 00f3579aea Revert "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and subsequent patches
This reverts most of the following patches due to reports of miscompiles.
I've left the added test cases with comments updated to be FIXMEs.

1cf6f210a2 [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
469da663f2 [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
122b0640fc [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
ac0af12ed2 [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison
9b1e95329a [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
2020-07-15 22:02:33 -07:00
Craig Topper 469da663f2 [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
Follow up from the transform being removed in D83360. If X is probably not poison, then the transform is safe.

Still plan to remove or adjust the code from ConstantFolding after this.

Differential Revision: https://reviews.llvm.org/D83440
2020-07-09 12:21:03 -07:00
Craig Topper 122b0640fc [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
We can't fold to the non-undef value unless we know it isn't poison. So check each element with isGuaranteedNotToBeUndefOrPoison. This currently rules out all constant expressions.

Differential Revision: https://reviews.llvm.org/D83442
2020-07-09 11:01:12 -07:00
Craig Topper 9b1e95329a [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
As noted here https://lists.llvm.org/pipermail/llvm-dev/2016-October/106182.html and by alive2, this transform isn't valid. If X is poison this potentially propagates poison when it shouldn't.

This same transform still exists in DAGCombiner.

Differential Revision: https://reviews.llvm.org/D83360
2020-07-08 12:53:05 -07:00
Nikita Popov a48cf72238 [InstSimplify] Handle not inserted instruction gracefully (PR46638)
When simplifying comparisons using a dominating assume, bail out
if the context instruction is not inserted.
2020-07-08 21:43:32 +02:00
Craig Topper d92bf71a07 Revert "[X86] Merge the FEATURE_64BIT and FEATURE_EM64T bits in X86TargetParser.def."
An accidental change snuck in here

This reverts commit f1d290d812.
2020-07-07 18:20:07 -07:00
Craig Topper f1d290d812 [X86] Merge the FEATURE_64BIT and FEATURE_EM64T bits in X86TargetParser.def.
These represent the same thing but 64BIT only showed up from
getHostCPUFeatures providing a list of featuers to clang. While
EM64T showed up from getting the features for a named CPU.

EM64T didn't have a string specifically so it would not be passed
up to clang when getting features for a named CPU. While 64bit
needed a name since that's how it is index.

Merge them by filtering 64bit out before sending features to clang
for named CPUs.
2020-07-07 17:59:54 -07:00
Nikita Popov 3b671022e4 [InstSimplify] Simplify comparison between zext(x) and sext(x)
This is picking up a loose thread from D69006: We can simplify
(zext x) ule (sext x) and (zext x) sge (sext x) to true, with
various permutations. Oddly, SCEV knows about this identity,
but nothing on the IR level does.

Differential Revision: https://reviews.llvm.org/D83081
2020-07-04 11:03:00 +02:00
Nikita Popov cf1d9f9f49 [InstSimplify] Fold icmp with dominating assume
If we assume(x > y), then we should be able to fold the basic
implications of that, like x >= y. This already happens if either
one of the operands is constant (LVI) or if the conditions are
exactly the same (GVN), but not if we have an implication with
non-constant operands. Support this by querying AssumptionCache.

Fixes https://bugs.llvm.org/show_bug.cgi?id=40149.

Differential Revision: https://reviews.llvm.org/D82717
2020-07-03 18:53:58 +02:00
Christopher Tetreault 747486991c [SVE] Fix bad FixedVectorType cast in simplifyDivRem
Summary:
simplifyDivRem attempts to walk a VectorType elementwise. Ensure that it
only does so for FixedVectorType

Reviewers: efriedma, spatel, lebedev.ri, david-arm, kmclaughlin

Reviewed By: spatel, david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81856
2020-06-16 13:17:05 -07:00
Dorit Nuzman a9fe69c359 [InstSimplify] fix bug in matching or-with-not op (PR46083) 2020-06-03 13:44:29 -04:00
Serge Pavlov 4d20e31f73 [FPEnv] Intrinsic llvm.roundeven
This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven,
and performs rounding to the nearest integer value, rounding halfway
cases to even. The intrinsic represents the missed case of IEEE-754
rounding operations and now llvm provides full support of the rounding
operations defined by the standard.

Differential Revision: https://reviews.llvm.org/D75670
2020-05-26 19:24:58 +07:00
Sanjay Patel 7eed772a27 [PatternMatch] abbreviate vector inst matchers; NFC
Readability is not reduced with these opcodes/match lines,
so reduce odds of awkward wrapping from 80-col limit.
2020-05-24 09:19:47 -04:00
Nikita Popov 5a2265647e Reapply [InstSimplify] Remove known bits constant folding
No changes relative to last time, but after a mitigation for
an AMDGPU regression landed.

---

If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.

On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):

    instsimplify.NumKnownBits: 216
    instsimplify.NumKnownBitsComputed: 13828375
    valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)

There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.

Differential Revision: https://reviews.llvm.org/D79294
2020-05-08 10:24:53 +02:00
Nikita Popov 46ee652c70 Revert "[InstSimplify] Remove known bits constant folding"
This reverts commit 08556afc54.

This breaks some AMDGPU tests.
2020-05-03 20:45:10 +02:00
Nikita Popov 08556afc54 [InstSimplify] Remove known bits constant folding
If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.

On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):

    instsimplify.NumKnownBits: 216
    instsimplify.NumKnownBitsComputed: 13828375
    valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)

There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.

Differential Revision: https://reviews.llvm.org/D79294
2020-05-03 20:26:58 +02:00
Sanjay Patel 57f0eed98d [InstSimplify] allow insertelement-with-undef fold if poison-safe
The more general fold was not poison-safe, so it was removed:
rG5486e00
...but it is ok to have this transform if analysis can determine
the vector contains no poison. The test shows a simple example
of that: constant integer elements are not poison.
2020-05-01 10:34:29 -04:00
Sanjay Patel 5486e00dc3 [InstSimplify] remove poison-unsafe insertelement of undef value
PR45481:
https://bugs.llvm.org/show_bug.cgi?id=45481

SDAG has an identical transform to this, so there's little
chance of any real-world impact. OTOH, that means we are
effectively sweeping the bug out of sight because poison
exists in codegen too.
2020-05-01 09:22:05 -04:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
James Y Knight 248a5db3f2 Change callbr to only define its output SSA variable on the normal
path, not the indirect targets.

Fixes: PR45565.

Differential Revision: https://reviews.llvm.org/D78341
2020-04-23 19:36:44 -04:00
Christopher Tetreault 9174e0229f [SVE] Remove calls to VectorType::isScalable from analysis
Reviewers: efriedma, sdesmalen, chandlerc, sunfish

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77692
2020-04-23 12:44:22 -07:00
Sanjay Patel e86eff0e82 [InstSimplify] fold and/or of compares with equality to min/max constant
I found 12 (6 if we compress the DeMorganized forms) patterns for logic-of-compares
with a min/max constant while looking at PR45510:
https://bugs.llvm.org/show_bug.cgi?id=45510

The variations on those forms multiply the test cases by 8 (unsigned/signed, swapped
compare operands, commuted logic operands).
We have partial logic to deal with these for the unsigned min (zero) case, but
missed everything else.

We are deferring the majority of these patterns to InstCombine to allow more general
handling (see D78582).

We could use ConstantRange instead of predicate+constant matching here. I don't
expect there's any noticeable compile-time impact for either form.

Here's an abuse of Alive2 to show the 12 basic signed variants of the patterns in
one function:
http://volta.cs.utah.edu:8080/z/5Vpiyg

declare void @use(i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1, i1)
define void @src(i8 %x, i8 %y)  {
  %m1 = icmp eq i8 %x, 127
  %c1 = icmp slt i8 %x, %y
  %r1 = and i1 %m1, %c1   ; (X == MAX) && (X < Y) --> false

  %m2 = icmp ne i8 %x, 127
  %c2 = icmp sge i8 %x, %y
  %r2 = or i1 %m2, %c2    ; (X != MAX) || (X >= Y) --> true

  %m3 = icmp eq i8 %x, -128
  %c3 = icmp sgt i8 %x, %y
  %r3 = and i1 %m3, %c3   ; (X == MIN) && (X > Y) --> false

  %m4 = icmp ne i8 %x, -128
  %c4 = icmp sle i8 %x, %y
  %r4 = or i1 %m4, %c4    ; (X != MIN) || (X <= Y) --> true

  %m5 = icmp eq i8 %x, 127
  %c5 = icmp sge i8 %x, %y
  %r5 = and i1 %m5, %c5   ; (X == MAX) && (X >= Y) --> X == MAX

  %m6 = icmp ne i8 %x, 127
  %c6 = icmp slt i8 %x, %y
  %r6 = or i1 %m6, %c6   ; (X != MAX) || (X < Y) --> X != MAX

  %m7 = icmp eq i8 %x, -128
  %c7 = icmp sle i8 %x, %y
  %r7 = and i1 %m7, %c7   ; (X == MIN) && (X <= Y) --> X == MIN

  %m8 = icmp ne i8 %x, -128
  %c8 = icmp sgt i8 %x, %y
  %r8 = or i1 %m8, %c8   ; (X != MIN) || (X > Y) --> X != MIN

  %m9 = icmp ne i8 %x, 127
  %c9 = icmp slt i8 %x, %y
  %r9 = and i1 %m9, %c9    ; (X != MAX) && (X < Y) --> X < Y

  %m10 = icmp eq i8 %x, 127
  %c10 = icmp sge i8 %x, %y
  %r10 = or i1 %m10, %c10    ; (X == MAX) || (X >= Y) --> X >= Y

  %m11 = icmp ne i8 %x, -128
  %c11 = icmp sgt i8 %x, %y
  %r11 = and i1 %m11, %c11    ; (X != MIN) && (X > Y) --> X > Y

  %m12 = icmp eq i8 %x, -128
  %c12 = icmp sle i8 %x, %y
  %r12 = or i1 %m12, %c12    ; (X == MIN) || (X <= Y) --> X <= Y

  call void @use(i1 %r1, i1 %r2, i1 %r3, i1 %r4, i1 %r5, i1 %r6, i1 %r7, i1 %r8, i1 %r9, i1 %r10, i1 %r11, i1 %r12)
  ret void
}

define void @tgt(i8 %x, i8 %y)  {
  %m5 = icmp eq i8 %x, 127
  %m6 = icmp ne i8 %x, 127
  %m7 = icmp eq i8 %x, -128
  %m8 = icmp ne i8 %x, -128
  %c9 = icmp slt i8 %x, %y
  %c10 = icmp sge i8 %x, %y
  %c11 = icmp sgt i8 %x, %y
  %c12 = icmp sle i8 %x, %y
  call void @use(i1 0, i1 1, i1 0, i1 1, i1 %m5, i1 %m6, i1 %m7, i1 %m8, i1 %c9, i1 %c10, i1 %c11, i1 %c12)
  ret void
}

Differential Revision: https://reviews.llvm.org/D78430
2020-04-23 09:16:10 -04:00
Christopher Tetreault b96558f5e5 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sunfish, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77273
2020-04-09 12:41:28 -07:00
Eli Friedman 1ee6ec2bf3 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
Serge Pavlov f398739152 [FEnv] Constfold some unary constrained operations
This change implements constant folding to constrained versions of
intrinsics, implementing rounding: floor, ceil, trunc, round, rint and
nearbyint.

Differential Revision: https://reviews.llvm.org/D72930
2020-03-28 12:28:33 +07:00
Nikita Popov 417d69595f [InstSimplify] Reorder checks to be more efficient; NFC
First check whether the RHS is a null pointer, and only then
perform a potentially expensive non-zero query.
2020-03-20 22:05:38 +01:00
Nico Weber 623cb95eb3 Revert "[InstSimplify] Simplify calls with "returned" attribute"
This reverts commit 45555c3819.
Causes clang crashes in some causes, see comments on
https://reviews.llvm.org/D75815 for details (including
repro steps).
2020-03-16 15:21:30 -04:00
Huihui Zhang 0616e9964b [InstSimplify][SVE] Fix SimplifyGEPInst for scalable vector.
Summary:
Skip folds that rely on DataLayout::getTypeAllocSize(). For scalable
vector, only minimal type alloc size is known at compile-time.

Reviewers: sdesmalen, efriedma, spatel, apazos

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75892
2020-03-16 11:46:12 -07:00
Huihui Zhang 118abf2017 [SVE] Update API ConstantVector::getSplat() to use ElementCount.
Summary:
Support ConstantInt::get() and Constant::getAllOnesValue() for scalable
vector type, this requires ConstantVector::getSplat() to take in 'ElementCount',
instead of 'unsigned' number of element count.

This change is needed for D73753.

Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74386
2020-03-12 13:22:41 -07:00
Sanjay Patel a66dc755db [InstSimplify] simplify FP ops harder with FMF (part 2)
This is part of the IR sibling for:
D75576

Related transform committed with:
rG8ec71585719d
2020-03-12 09:53:20 -04:00
Sanjay Patel 8ec7158571 [InstSimplify] simplify FP ops harder with FMF
This is part of the IR sibling for:
D75576

(I'm splitting part of the transform as a separate commit
to reduce risk. I don't know of any bugs that might be
exposed by this improved folding, but it's hard to see
those in advance...)
2020-03-12 09:13:28 -04:00
Sanjay Patel dea2b93a7b [InstSimplify] reduce code for FP undef/nan folding; NFC 2020-03-12 08:46:15 -04:00
Huihui Zhang 8f52573962 [InstSimplify][SVE] Fix SimplifyInsert/ExtractElementInst for scalable vector.
Summary:
For scalable vector, index out-of-bound can not be determined at compile-time.
The same apply for VectorUtil findScalarElement().

Add test cases to check the functionality of SimplifyInsert/ExtractElementInst for scalable vector.

Reviewers: sdesmalen, efriedma, spatel, apazos

Reviewed By: efriedma

Subscribers: cameron.mcinally, tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75782
2020-03-11 15:09:56 -07:00
Nikita Popov 45555c3819 [InstSimplify] Simplify calls with "returned" attribute
If a call argument has the "returned" attribute, we can simplify
the call to the value of that argument. The "-inst-simplify" pass
already handled this for the constant integer argument case via
known bits, which is invoked in SimplifyInstruction. However,
non-constant (or non-int) arguments are not handled at all right now.

This addresses one of the regressions from D75801.

Differential Revision: https://reviews.llvm.org/D75815
2020-03-09 18:53:47 +01:00
Nikita Popov 829d377a98 [InstSimplify] Don't simplify musttail calls
As pointed out by jdoerfert on D75815, we must be careful when
simplifying musttail calls: We can only replace the return value
if we can eliminate the call entirely. As we can't make this
guarantee for all consumers of InstSimplify, this patch disables
simplification of musttail calls. Without this patch, musttail
simplification currently results in module verification errors.

Differential Revision: https://reviews.llvm.org/D75824
2020-03-09 18:46:56 +01:00
Jay Foad 11d1573bb6 [APFloat] Make use of new overloaded comparison operators. NFC.
Reviewers: ekatz, spatel, jfb, tlively, craig.topper, RKSimon, nikic, scanon

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, dexonsmith, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75744
2020-03-06 16:42:53 +00:00
Juneyoung Lee d7267ee194 [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators
Summary:
```
  br i1 c, BB1, BB2:
BB1:
  use1(c)
BB2:
  use2(c)
```

In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB.

This is a resubmission of 952ad47 with crash fix of llvm/test/Transforms/LoopRotate/freeze-crash.ll.

Checked with Alive2

Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy

Reviewed By: reames

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75401
2020-03-06 01:08:35 +09:00
Daniil Suchkov 3db48f9324 Revert "[ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators"
That commit causes SIGSEGV on some simple tests.
This reverts commit 952ad4701c.
2020-03-05 16:32:36 +07:00
Nikita Popov c6ff3c9bad [InstSimplify] Constant fold icmp of gep
InstSimplify can fold icmps of gep where the base pointers are the
same and the offsets are constant. It does so by constructing a
constant expression icmp and assumes that it gets folded -- but
this doesn't actually happen, because GEP expressions can usually
only be folded by the target-dependent constant folding layer.
As such, we need to explicitly invoke it here.

Differential Revision: https://reviews.llvm.org/D75407
2020-03-04 23:16:52 +01:00
Nikita Popov 0e890cd4d4 [ConstantFolding] Always return something from ConstantFoldConstant
Spin-off from D75407. As described there, ConstantFoldConstant()
currently returns null for non-ConstantExpr/ConstantVector inputs,
but otherwise always returns non-null, independently of whether
any folding has happened or not.

This is confusing and makes consumer code more complicated.
I would expect either that ConstantFoldConstant() returns only if
it actually folded something, or that it always returns non-null.
I'm going to the latter possibility here, which appears to be more
useful considering existing usage.

Differential Revision: https://reviews.llvm.org/D75543
2020-03-04 18:24:47 +01:00
Juneyoung Lee 952ad4701c [ValueTracking] Let isGuaranteedNotToBeUndefOrPoison look into branch conditions of dominating blocks' terminators
Summary:
```
  br i1 c, BB1, BB2:
BB1:
  use1(c)
BB2:
  use2(c)
```

In BB1 and BB2, c is never undef or poison because otherwise the branch would have triggered UB.

Checked with Alive2

Reviewers: xbolva00, spatel, lebedev.ri, reames, jdoerfert, nlopes, sanjoy

Reviewed By: reames

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75401
2020-03-04 11:43:31 +09:00
Christopher Tetreault b03f3fbd6a Reapply: [SVE] Fix bug in simplification of scalable vector instructions
This reverts commit a05441038a, reapplying
commit 31574d38ac
2020-02-05 10:00:09 -08:00
Reid Kleckner a05441038a Revert "[SVE] Fix bug in simplification of scalable vector instructions"
This reverts commit 31574d38ac.

The newly added shufflevector test does not pass locally on either of my
workstations.
2020-02-03 11:12:09 -08:00
Christopher Tetreault 31574d38ac [SVE] Fix bug in simplification of scalable vector instructions
Summary:
* Most of the simplifications in SimplifyShuffleVectorInst depend on the
concrete value of, or the length of the mask vector. For scalable
vectors, this cannot be known at compile time.
** for these tests, detect if the vector is scalable before attempting
the transformation
* The functions ShuffleVectorInst::getMaskValue and
ShuffleVectorInst::getShuffleMask access the value of the constant mask.
However, since the length of the mask is unknown at compile time, these
function do not work for scalable vectors. Add asserts to ensure that
the input mask is not scalable

Reviewers: efriedma, sdesmalen, apazos, chrisj, huihuiz

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73555
2020-02-03 10:15:56 -08:00
Nikita Popov efba7ed05e [PatternMatch] Make m_c_ICmp swap the predicate (PR42801)
This addresses https://bugs.llvm.org/show_bug.cgi?id=42801.
The m_c_ICmp() matcher is changed to provide the swapped predicate
if the operands are swapped.

Existing uses of m_c_ICmp() fall in one of two categories: Working
on equality predicates only, where swapping is irrelevant.
Or performing a manual swap, in which case this patch removes it.

The only exception is the foldICmpWithLowBitMaskedVal() fold, which
does not swap the predicate, and instead reasons about whether
a swap occurred or not for each predicate. Getting the swapped
predicate allows us to merge the logic for pairs of predicates,
instead of duplicating it.

Differential Revision: https://reviews.llvm.org/D72976
2020-01-22 22:56:26 +01:00
Sanjay Patel da9c93f330 [InstSimplify] fold select of vector constants that include undef elements
As mentioned in D72643, we'd like to be able to assert that any select
of equivalent constants has been removed before we're deep into InstCombine.

But there's a loophole in that assertion for vectors with undef elements
that don't match exactly.

This patch should close that gap. If we have undefs, we can't safely
propagate those unless both constants elements for that lane are undef.

Differential Revision: https://reviews.llvm.org/D72958
2020-01-20 08:48:32 -05:00
Sanjay Patel f53b38d12a [InstSimplify] select Cond, true, false --> Cond
This is step 1 of damage control assuming that we need to remove several
over-reaching folds for select-of-booleans because they can cause
miscompiles as shown in D72396.

The scalar case seems obviously safe:
https://rise4fun.com/Alive/jSj

And I don't think there's any danger for vectors either - if the
condition is poisoned, then the select must be poisoned too, so undef
elements don't make any difference.

Differential Revision: https://reviews.llvm.org/D72412
2020-01-09 09:04:20 -05:00
Sanjay Patel 6080387f13 [InstSimplify] fold splat of inserted constant to vector constant
shuf (inselt ?, C, IndexC), undef, <IndexC, IndexC...> --> <C, C...>

This is another missing shuffle fold pattern uncovered by the
shuffle correctness fix from D70246.

The problem was visible in the post-commit thread example, but
we managed to overcome the limitation for that particular case
with D71220.

This is something like the inverse of the previous fix - there
we didn't demand the inserted scalar, and here we are only
demanding an inserted scalar.

Differential Revision: https://reviews.llvm.org/D71488
2019-12-15 09:32:03 -05:00
Nicola Zaghen 97572775d2 Reland [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

This fixes the buildbot failures.

Differential Revision: https://reviews.llvm.org/D68328

Patch by Joseph Faulls!
2019-12-13 14:30:21 +00:00
Denis Bakhvalov 7081c92241 [NFC][InstSimplify] Refactoring ThreadCmpOverSelect function
Removed code duplication in ThreadCmpOverSelect and broke it
into several smaller functions for reusing them.

Differential Revision: https://reviews.llvm.org/D71158
2019-12-12 22:45:58 +01:00
Nicola Zaghen f798eb21ec Temporarily Revert "[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same."
This reverts commit 5f6208778f.

This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll
const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.
2019-12-12 10:29:54 +00:00
Nicola Zaghen 5f6208778f [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same.
GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

Differential Revision: https://reviews.llvm.org/D68328

Patch by Joseph Faulls!
2019-12-12 10:07:01 +00:00