Reduce code duplication for commutative pattern matching
and fix a miscompile.
We can't safely propagate an undef element in this transform:
https://alive2.llvm.org/ce/z/s5xy55
https://alive2.llvm.org/ce/z/4PaPDy
There's a related fold where the inner 'or' is replaced by 'and',
but that needs to be more careful about matching a 'not'.
These are similar to the rotate pattern added with:
dcf659e821
...but we don't have guard ops on the shift amount,
so we don't canonicalize to the intrinsic.
declare void @llvm.assume(i1)
define i32 @src(i32 %shamt, i32 %bitwidth) {
; subtract must be in range of bitwidth
%lt = icmp ule i32 %bitwidth, 32
call void @llvm.assume(i1 %lt)
%r = lshr i32 -1, %shamt
%s = sub i32 %bitwidth, %shamt
%l = shl i32 -1, %s
%o = or i32 %r, %l
ret i32 %o
}
define i32 @tgt(i32 %shamt, i32 %bitwidth) {
ret i32 -1
}
https://alive2.llvm.org/ce/z/aF7WHx
This adds more poison folding optimizations to InstSimplify.
Since all binary operators propagate poison, these are fine.
Also, the precondition of `select cond, undef, x` -> `x` is relaxed to allow the case when `x` is undef.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D104661
No changes relative to last time, but after a mitigation for
an AMDGPU regression landed.
---
If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.
On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):
instsimplify.NumKnownBits: 216
instsimplify.NumKnownBitsComputed: 13828375
valuetracking.NumKnownBitsComputed: 45860806
Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)
There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.
Differential Revision: https://reviews.llvm.org/D79294
If SimplifyInstruction() does not succeed in simplifying the
instruction, it will compute the known bits of the instruction
in the hope that all bits are known and the instruction can be
folded to a constant. I have removed a similar optimization
from InstCombine in D75801, and would like to drop this one as well.
On average, we spend ~1% of total compile-time performing this
known bits calculation. However, if we introduce some additional
statistics for known bits computations and how many of them succeed
in simplifying the instruction we get (on test-suite):
instsimplify.NumKnownBits: 216
instsimplify.NumKnownBitsComputed: 13828375
valuetracking.NumKnownBitsComputed: 45860806
Out of ~14M known bits calculations (accounting for approximately
one third of all known bits calculations), only 0.0015% succeed in
producing a constant. Those cases where we do succeed to compute
all known bits will get folded by other passes like InstCombine
later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test
show a hash difference after this change. On lencod we see an
improvement (a loop phi is optimized away), on the GCC torture
test a regression (a function return value is determined only
after IPSCCP, preventing propagation from a noinline function.)
There are various regressions in InstSimplify tests. However, all
of these cases are already handled by InstCombine, and corresponding
tests have already been added there.
Differential Revision: https://reviews.llvm.org/D79294
As it's causing some bot failures (and per request from kbarton).
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.
llvm-svn: 358546
Loosening the matcher definition reveals a subtle bug in InstSimplify (we should not
assume that because an operand constant matches that it's safe to return it as a result).
So I'm making that change here too (that diff could be independent, but I'm not sure how
to reveal it before the matcher change).
This also seems like a good reason to *not* include matchers that capture the value.
We don't want to encourage the potential misstep of propagating undef values when it's
not allowed/intended.
I didn't include the capture variant option here or in the related rL325437 (m_One),
but it already exists for other constant matchers.
llvm-svn: 325466
The tests here are have operands commuted to provide more coverage. I also commuted one of the instructions in the scalar tests so the 4 tests cover the 4 commuted variations
Differential Revision: https://reviews.llvm.org/D33599
llvm-svn: 304021