Commit Graph

832 Commits

Author SHA1 Message Date
Sanjay Patel 236c4524a7 [InstSimplify] remove ctpop of 1 (low) bit
https://llvm.org/PR48608

As noted in the test comment, we could handle a more general
case in instcombine and remove this, but I don't have evidence
that we need to do that.

https://alive2.llvm.org/ce/z/MRW9gD
2020-12-28 16:06:20 -05:00
Sanjay Patel 1351f719d4 [InstSimplify] add tests for ctpop; NFC (PR48608) 2020-12-28 16:06:19 -05:00
Juneyoung Lee db7a2f347f Precommit transform tests that have poison as insertelement's placeholder
This commit copies existing tests at llvm/Transforms and replaces
'insertelement undef' in those files with 'insertelement poison'.
(see https://reviews.llvm.org/D93586)

Tests listed using this script:

grep -R -E '^[^;]*insertelement <.*> undef,' . | cut -d":" -f1 | uniq |
wc -l

Tests updated:

file_org=llvm/test/Transforms/$1
file=${file_org%.ll}-inseltpoison.ll
cp $file_org $file
sed -i -E 's/^([^;]*)insertelement <(.*)> undef/\1insertelement <\2> poison/g' $file
head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q
if [ "$?" == 1 ]; then
  echo "$file : should be manually updated"
  # I manually updated the script
  exit 1
fi
python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file
2020-12-24 11:46:17 +09:00
Sanjay Patel 38ca7face6 [InstSimplify] reduce logic with inverted add/sub ops
https://llvm.org/PR48559
This could be part of a larger ValueTracking API,
but I don't see that currently.

https://rise4fun.com/Alive/gR0

  Name: and
  Pre: C1 == ~C2
  %sub = add i8 %x, C1
  %sub1 = sub i8 C2, %x
  %r = and i8 %sub, %sub1
  =>
  %r = 0

  Name: or
  Pre: C1 == ~C2
  %sub = add i8 %x, C1
  %sub1 = sub i8 C2, %x
  %r = or i8 %sub, %sub1
  =>
  %r = -1

  Name: xor
  Pre: C1 == ~C2
  %sub = add i8 %x, C1
  %sub1 = sub i8 C2, %x
  %r = xor i8 %sub, %sub1
  =>
  %r = -1
2020-12-21 08:51:43 -05:00
Sanjay Patel d6118759f3 [InstSimplify] add tests for inverted logic operands; NFC 2020-12-21 08:51:42 -05:00
Roman Lebedev e9289dc25f
[InstSimplify] Don't miscompile `X == 0 ? abs(X) : -abs(X) --> -abs(X)` xform
The transform wasn't checking that the LHS of the comparison
*is* the `X` in question...
This is the miscompile that was holding up D87188.

Thanks to Dave Green for producing an actionable reproducer!
2020-12-18 21:18:13 +03:00
Roman Lebedev 9b183a1452
[NFC][InstSimplify] Add miscompiled testcase from D87188/D87197
Thanks to Dave Green for producing an actionable reproducer!
It is (obviously) a miscompile:
```
----------------------------------------
define i32 @select_abs_of_abs_eq_wrong(i32 %x, i32 %y) {
%0:
  %abs = abs i32 %x, 0
  %neg = sub i32 0, %abs
  %cmp = icmp eq i32 %y, 0
  %sel = select i1 %cmp, i32 %neg, i32 %abs
  ret i32 %sel
}
=>
define i32 @select_abs_of_abs_eq_wrong(i32 %x, i32 %y) {
%0:
  %abs = abs i32 %x, 0
  ret i32 %abs
}
Transformation doesn't verify!
ERROR: Value mismatch

Example:
i32 %x = #xe0000000 (3758096384, -536870912)
i32 %y = #x00000000 (0)

Source:
i32 %abs = #x20000000 (536870912)
i32 %neg = #xe0000000 (3758096384, -536870912)
i1 %cmp = #x1 (1)
i32 %sel = #xe0000000 (3758096384, -536870912)

Target:
i32 %abs = #x20000000 (536870912)
Source value: #xe0000000 (3758096384, -536870912)
Target value: #x20000000 (536870912)

Alive2: Transform doesn't verify!

```
2020-12-18 21:18:13 +03:00
Juneyoung Lee 864dda5fd5 [InstSimplify] Add tests that fold instructions with poison operands (NFC) 2020-12-02 01:01:59 +09:00
Juneyoung Lee 9c49dcc356 [ConstantFold] Don't fold and/or i1 poison to poison (NFC)
.. because it causes miscompilation when combined with select i1 -> and/or.

It is the select fold which is incorrect; but it is costly to disable the fold, so hack this one.

D92270
2020-11-30 22:58:31 +09:00
Juneyoung Lee 53040a968d [ConstantFold] Fold more operations to poison
This patch folds more operations to poison.

Alive2 proof: https://alive2.llvm.org/ce/z/mxcb9G (it does not contain tests about div/rem because they fold to poison when raising UB)

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92270
2020-11-29 21:19:48 +09:00
Juneyoung Lee c6b62efb91 [ConstantFold] Fold operations to poison if possible
This patch updates ConstantFold, so operations are folded into poison if possible.

<alive2 proofs>
casts: https://alive2.llvm.org/ce/z/WSj7rw
binary operations (arithmetic): https://alive2.llvm.org/ce/z/_7dEyJ
binary operations (bitwise): https://alive2.llvm.org/ce/z/cezjVN
vector/aggregate operations: https://alive2.llvm.org/ce/z/BQ7hWz
unary ops: https://alive2.llvm.org/ce/z/yBRs4q
other ops: https://alive2.llvm.org/ce/z/iXbcFD

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92203
2020-11-29 02:28:40 +09:00
Cullen Rhodes 7b8d50b141 [InstSimplify] Clarify use of FixedVectorType in SimplifySelectInst
Folding a select of vector constants that include undef elements only
applies to fixed vectors, but there's no earlier check the type is not
scalable so it crashes for scalable vectors. This adds a check so this
optimization is only attempted for fixed vectors.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D92046
2020-11-27 09:55:29 +00:00
Christopher Tetreault 792f8e1114 [SVE] Take constant fold fast path for splatted vscale vectors
This should be a perfectly reasonable operation for scalable vectors.
Currently, it only works for zeroinitializer values of
ScalableVectorType, but the fundamental operation is sound and it should
be possible to make it work for other splats

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D77442
2020-11-17 12:45:31 -08:00
Simon Pilgrim 6fa7030a76 [ConstProp] Remove unused check-prefixes
Just use default CHECK and remove duplicate RUN
2020-11-09 13:12:40 +00:00
Sanjay Patel 00808e321c [InstSimplify] allow vector folds for (Pow2C << X) == NonPow2C
Existing pre-conditions seem to be correct:
https://rise4fun.com/Alive/lCLB

  Name: non-zero C1
  Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C1 != 0
  %sub = shl i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: one == C2
  Pre: !isPowerOf2(C1) && isPowerOf2(C2) && C2 == 1
  %sub = shl i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: nuw
  Pre: !isPowerOf2(C1) && isPowerOf2(C2)
  %sub = shl nuw i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false

  Name: nsw
  Pre: !isPowerOf2(C1) && isPowerOf2(C2)
  %sub = shl nsw i8 C2, %X
  %cmp = icmp eq i8 %sub, C1
  =>
  %cmp = false
2020-11-08 09:52:05 -05:00
Sanjay Patel 73a5f0b614 [InstSimplify] add tests for icmp with power-of-2 operand; NFC 2020-11-08 09:52:05 -05:00
Sanjay Patel c74db55ff5 [InstSimplify] allow vector folds for icmp Pred (1 << X), 0x80 2020-11-04 08:12:48 -05:00
Sanjay Patel 5765edbf9e [InstSimplify] add vector cmp tests; NFC 2020-11-04 08:12:47 -05:00
Sanjay Patel e77ba263fe [InstSimplify] peek through 'not' operand in logic-of-icmps fold
This extends D78430 to solve cases like:
https://llvm.org/PR47858

There are still missed opportunities shown in the tests,
and as noted in the earlier patches, we have related
functionality in InstCombine, so we may want to extend
other folds in a similar way.

A semi-random sampling of test diff proofs in this patch:
https://rise4fun.com/Alive/sS4C
2020-10-25 11:13:30 -04:00
Sanjay Patel 7de2add829 [InstSimplify] add tests for logic-of-cmps with not op; NFC
One variant of this is shown in:
https://llvm.org/PR47858
2020-10-25 11:13:30 -04:00
Sanjay Patel c72198079d [ValueTracking] add range limits for cttz
As discussed in D89952,
instcombine can sometimes find a way to reduce similar patterns,
but it is incomplete.
InstSimplify uses the computeConstantRange() ValueTracking analysis
via simplifyICmpWithConstant(), so we just need to fill in the max
value of cttz to process any "icmp pred cttz(X), C" pattern (the
min value is initialized to zero automatically).

https://alive2.llvm.org/ce/z/Z_SLWZ

Follow-up to D89976.
2020-10-23 08:43:45 -04:00
Sanjay Patel 3fb0d6b0d5 [ValueTracking] add range limits for ctlz
As discussed in D89952,
instcombine can sometimes find a way to reduce similar patterns,
but it is incomplete.
InstSimplify uses the computeConstantRange() ValueTracking analysis
via simplifyICmpWithConstant(), so we just need to fill in the max
value of ctlz to process any "icmp pred ctlz(X), C" pattern (the
min value is initialized to zero automatically).

Follow-up to D89976.
2020-10-23 08:43:45 -04:00
Sanjay Patel 0351bd959f [InstSimplify] add tests for cttz constant range; NFC
This is a search-and-replace of f6cb7f3
2020-10-23 08:43:45 -04:00
Sanjay Patel 9bcb437f46 [InstSimplify] add tests for ctlz constant range; NFC
This is a search-and-replace of f6cb7f3.
2020-10-23 08:43:45 -04:00
Sanjay Patel 748ecc6b32 [ValueTracking] add range limits for ctpop
As discussed in D89952,
instcombine can sometimes find a way to reduce similar patterns,
but it is incomplete.
InstSimplify uses the computeConstantRange() ValueTracking analysis
via simplifyICmpWithConstant(), so we just need to fill in the max
value of ctpop to process any "icmp pred ctpop(X), C" pattern (the
min value is initialized to zero automatically).

Differential Revision: https://reviews.llvm.org/D89976
2020-10-23 08:17:54 -04:00
Sanjay Patel f6cb7f37ff [InstSimplify] add tests for ctpop constant range; NFC 2020-10-22 14:16:48 -04:00
Sjoerd Meijer 51d7df3fa1 [InstructionSimplify] icmp (X+Y), (X+Z) simplification
This improves simplifications for pattern `icmp (X+Y), (X+Z)` -> `icmp Y,Z`
if only one of the operands has NSW set, e.g.:

    icmp slt (x + 0), (x +nsw 1)

We can still safely rewrite this to:

    icmp slt 0, 1

because we know that the LHS can't overflow if the RHS has NSW set and
C1 < C2 && C1 >= 0, or C2 < C1 && C1 <= 0

This simplification is useful because ScalarEvolutionExpander which is used to
generate code for SCEVs in different loop optimisers is not always able to put
back NSW flags across control-flow, thus inhibiting CFG simplifications.

Differential Revision: https://reviews.llvm.org/D89317
2020-10-22 08:55:52 +01:00
Sjoerd Meijer e86a70ce3d [InstructionSimplify] And precommit more tests for D89317. NFC. 2020-10-21 11:02:25 +01:00
Sjoerd Meijer 782b8f0d38 [InstructionSimplify] Precommit more tests for D89317. NFC. 2020-10-21 10:14:39 +01:00
Sanjay Patel 7c516504a1 [InstSimplify] allow vector splats for icmp-of-neg folds 2020-10-20 09:24:36 -04:00
Sanjay Patel b11588b18e [InstSimplify] add vector icmp tests; NFC 2020-10-20 09:24:35 -04:00
Juneyoung Lee 62a0ec1612 Add support for !noundef metatdata on loads
This patch adds metadata !noundef and makes load instructions can optionally have it.
A load with !noundef always return a well-defined value (has no undef bit or isn't poison).
If the loaded value isn't well defined, the behavior is undefined.

This metadata can be used to encode the assumption from C/C++ that certain reads of variables should have well-defined values.
It is helpful for optimizing freeze instructions away, because freeze can be removed when its operand has well-defined value, and showing that a load from arbitrary location is well-defined is usually hard otherwise.

The same information can be encoded with llvm.assume with operand bundle; using metadata is chosen because I wasn't sure whether code motion can be freely done when llvm.assume is inserted from clang instead.
The existing codebase already is stripping unknown metadata when doing code motion, so using metadata is UB-safe as well.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89050
2020-10-17 13:50:10 +09:00
Jay Foad 1417abe54c [AMDGPU] Add new llvm.amdgcn.fma.legacy intrinsic
Differential Revision: https://reviews.llvm.org/D89558
2020-10-16 17:10:21 +01:00
Sjoerd Meijer 66f22411e1 [InstructionSimplify] Precommit tests for D89317. NFC. 2020-10-13 15:40:33 +01:00
Amara Emerson 322d0afd87 [llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.

The autoupgrader will handle legacy intrinsics.

Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html

Differential Revision: https://reviews.llvm.org/D88787
2020-10-07 10:36:44 -07:00
Sanjay Patel 149f5b573c [APFloat] convert SNaN to QNaN in convert() and raise Invalid signal
This is an alternate fix (see D87835) for a bug where a NaN constant
gets wrongly transformed into Infinity via truncation.
In this patch, we uniformly convert any SNaN to QNaN while raising
'invalid op'.
But we don't have a way to directly specify a 32-bit SNaN value in LLVM IR,
so those are always encoded/decoded by calling convert from/to 64-bit hex.

See D88664 for a clang fix needed to allow this change.

Differential Revision: https://reviews.llvm.org/D88238
2020-10-01 14:37:38 -04:00
Sanjay Patel 645c53a9d9 [ValueTracking] enhance isKnownNeverInfinity to understand sitofp
As discussed in D87877, instcombine already has this fold,
but it was missing from the more general ValueTracking logic.

https://alive2.llvm.org/ce/z/PumYZP
2020-09-27 08:40:31 -04:00
Sanjay Patel 71f25ac8ca [InstSimplify] add tests for fcmp with casted op; NFC
This shows missing analysis in ValueTracking's isKnownNeverInfinity().
2020-09-27 08:36:57 -04:00
Sanjay Patel e34bd1e0b0 [APFloat] prevent NaN morphing into Inf on conversion (PR43907)
We shift the significand right on a truncation, but that needs to be made NaN-safe:
always set at least 1 bit in the significand.
https://llvm.org/PR43907

See D88238 for the likely follow-up (but needs some plumbing fixes before it can proceed).

Differential Revision: https://reviews.llvm.org/D87835
2020-09-24 14:02:19 -04:00
Arthur Eubanks 61ac58e10a [NewPM] Pin tests with -debug-pass to legacy PM
-debug-pass is a legacy PM only option.

Some tests checks that the pass returned that it made a change,
which is not relevant to the NPM, since passes return PreservedAnalyses.

Some tests check that passes are freed at the proper time, which is also
not relevant to the NPM.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87945
2020-09-22 17:54:25 -07:00
Arthur Eubanks f4f7df037e [DIE] Remove DeadInstEliminationPass
This pass is like DeadCodeEliminationPass, but only does one pass
through a function instead of iterating on users of eliminated
instructions.

DeadCodeEliminationPass should be used in all cases.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87933
2020-09-21 12:12:25 -07:00
Sanjay Patel f74a334fe3 [ConstantFolding] add undef handling for fmin/fmax intrinsics
The output here may not be optimal (yet), but it should be
consistent for commuted operands (it was not before) and
correct. We can do better by checking FMF and NaN if needed.

Code in InstSimplify generally assumes that we have already
folded code like this, so it was not handling 2 constant
inputs by commuting consistently.
2020-09-19 10:31:01 -04:00
Sanjay Patel d3b0644e22 [InstSimplify] add tests for constant folding fmin/fmax with undef op; NFC 2020-09-18 16:09:44 -04:00
Sanjay Patel 3f100e64b4 [InstSimplify] fix fmin/fmax miscompile for partial undef vectors (PR47567)
It would also be correct to return the variable operand in these cases,
but eliminating a variable use is probably better for optimization.
2020-09-18 10:05:44 -04:00
Sanjay Patel 6690de098e [InstSimplify] add another test for NaN propagation; NFC 2020-09-18 09:20:26 -04:00
Sanjay Patel c6ebe3fd00 [InstSimplify] add tests for FP constant miscompile; NFC (PR43907) 2020-09-17 12:04:39 -04:00
Sanjay Patel 8985755762 [InstSimplify] add limit folds for fmin/fmax
If the constant operand is the opposite of the min/max value,
then the result must be the other value.

This is based on the similar codegen transform proposed in:
D87571
2020-09-15 10:58:44 -04:00
Sanjay Patel 55d371abd7 [InstSimplify] add folds for fmin/fmax with 'nnan'
maximum(nnan X, +INF) --> +INF
minimum(nnan X, -INF) --> -INF

This is based on the similar codegen transform proposed in:
D87571
2020-09-14 11:46:11 -04:00
Sanjay Patel 7526376164 [InstSimplify] allow folds for fmin/fmax with 'ninf'
maxnum(ninf X, +FLT_MAX) --> +FLT_MAX
minnum(ninf X, -FLT_MAX) --> -FLT_MAX

This is based on the similar codegen transform proposed in:
D87571
2020-09-14 11:18:08 -04:00
Sanjay Patel dae68fdf9e [InstSimplify] add/move tests for fmin/fmax; NFC
The new tests are duplicated from the sibling patch for codegen:
D87571
2020-09-14 10:24:19 -04:00