Add folds to instcombine to support the removal of select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0).
Patch originally authored by @paulwalker-arm
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D106376
Bug Fix for PR: https://llvm.org/PR47960
This patch makes sure that the fast math flag used in the 'select'
instruction is the same as the 'fabs' instruction after the transformation.
Differential Revision: https://reviews.llvm.org/D101727
As discussed on PR50183, we already fold to prefer 'select-of-idx' vs 'select-of-gep':
define <4 x i32>* @select0a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
%gep0 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
%gep1 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a3
%sel = select i1 %a2, <4 x i32>* %gep0, <4 x i32>* %gep1
ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
%sel = select i1 %a2, i64 %a1, i64 %a3
%gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
ret <4 x i32>* %gep
}
This patch adds basic handling for the 'fallthrough' cases where the gep idx == 0 has been folded away to the base address:
define <4 x i32>* @select0(<4 x i32>* %a0, i64 %a1, i1 %a2) {
%gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
%sel = select i1 %a2, <4 x i32>* %a0, <4 x i32>* %gep
ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1(<4 x i32>* %a0, i64 %a1, i1 %a2) {
%sel = select i1 %a2, i64 0, i64 %a1
%gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
ret <4 x i32>* %gep
}
Reapplied with a fix for the bpf "-bpf-disable-avoid-speculation" tests
Differential Revision: https://reviews.llvm.org/D105901
We canonicalized to these select patterns (poison-safe logic)
with D101191, so we need to reduce 'not' ops when possible
as we would with 'and'/'or' instructions.
This is shown in a secondary example in:
https://llvm.org/PR50389https://alive2.llvm.org/ce/z/BvsESh
This is similar to the fix in c590a9880d ( PR49832 ), but
we missed handling the pattern for select of bools (no compare
inst).
We can't substitute a vector value because the equality condition
replacement that we are attempting requires that the condition
is true/false for the entire value. Vector select can be partly
true/false.
I added an assert for vector types, so we shouldn't hit this again.
Fixed formatting while auditing the callers.
https://llvm.org/PR50500
The 2nd test is based on the fuzzer example in post-commit
comments of D101191 -
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34661
The 1st test shows that we don't deal with this symmetrically.
We should be able to reduce both examples (possibly in
instsimplify instead of instcombine).
If a logical and/or is used, we need to be careful not to propagate
a potential poison value from the RHS by inserting a freeze
instruction. Otherwise it works the same way as bitwise and/or.
This is intended to address the regression reported at
https://reviews.llvm.org/D101191#2751002.
Differential Revision: https://reviews.llvm.org/D102279
This is a patch that disables the poison-unsafe select -> and/or i1 folding.
It has been blocking D72396 and also has been the source of a few miscompilations
described in llvm.org/pr49688 .
D99674 conditionally blocked this folding and successfully fixed the latter one.
The former one was still blocked, and this patch addresses it.
Note that a few test functions that has `_logical` suffix are now deoptimized.
These are created by @nikic to check the impact of disabling this optimization
by copying existing original functions and replacing and/or with select.
I can see that most of these are poison-unsafe; they can be revived by introducing
freeze instruction. I left comments at fcmp + select optimizations (or-fcmp.ll, and-fcmp.ll)
because I think they are good targets for freeze fix.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101191
This fixes the examples from
D99674 and
https://llvm.org/PR49878
The matchers succeed on partial undef/poison vector constants,
but the transform creates a full 'not' (-1) constant, so it
would undo a demanded vector elements change triggered by the
extractelement.
Differential Revision: https://reviews.llvm.org/D100044
As shown in the example based on:
https://llvm.org/PR49832
...and the existing test, we can't substitute
a vector value because the equality compare
replacement that we are attempting requires
that the comparison is true for the entire
value. Vector select can be partly true/false.
This patch fixes llvm.org/pr49688 by conditionally folding select i1 into and/or:
```
select cond, cond2, false
->
and cond, cond2
```
This is not safe if cond2 is poison whereas cond isn’t.
Unconditionally disabling this transformation affects later pipelines that depend on and/or i1s.
To minimize its impact, this patch conservatively checks whether cond2 is an instruction that
creates a poison or its operand creates a poison.
This approach is similar to what InstSimplify's SimplifyWithOpReplaced is doing.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D99674
This is a patch that adds folding of two logical and/ors that share one variable:
a && (a && b) -> a && b
a && (a & b) -> a && b
...
This is towards removing the poison-unsafe select optimization (D93065 has more context).
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D96945
This fixes another unsafe select folding by disabling it if
EnableUnsafeSelectTransform is set to false.
EnableUnsafeSelectTransform's default value is true, hence it won't
affect generated code (unless the flag is explicitly set to false).
Relative to the original change, this adds a check that the
instruction on which we're replacing operands is safe to speculatively
execute, because that's what we're effectively doing. We're executing
the instruction with the replaced operand, which is fine if it's pure,
but not fine if can cause side-effects or UB (aka is not speculatable).
Additionally, we cannot (generally) replace operands in phi nodes,
as these may refer to a different loop iteration. This is also covered
by the speculation check.
-----
InstCombine already performs a fold where X == Y ? f(X) : Z is
transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However,
if f(X) only has one use, then we can always directly replace the
use inside the instruction. To actually be profitable, limit it to
the case where Y is a non-expr constant.
This could be further extended to replace uses further up a one-use
instruction chain, but for now this only looks one level up.
Among other things, this also subsumes D94860.
Differential Revision: https://reviews.llvm.org/D94862
This caused a miscompile in Chromium, see comments on the codereview for
discussion and pointer to a reproducer.
> InstCombine already performs a fold where X == Y ? f(X) : Z is
> transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However,
> if f(X) only has one use, then we can always directly replace the
> use inside the instruction. To actually be profitable, limit it to
> the case where Y is a non-expr constant.
>
> This could be further extended to replace uses further up a one-use
> instruction chain, but for now this only looks one level up.
>
> Among other things, this also subsumes D94860.
>
> Differential Revision: https://reviews.llvm.org/D94862
This also reverts the follow-up
a003f26539cf4db744655e76c41f4c4a8913f116:
> [llvm] Prevent infinite loop in InstCombine of select statements
>
> This fixes an issue where the RHS and LHS the comparison operation
> creating the predicate were swapped back and forth forever.
>
> Differential Revision: https://reviews.llvm.org/D94934
This fixes an issue where the RHS and LHS the comparison operation
creating the predicate were swapped back and forth forever.
Differential Revision: https://reviews.llvm.org/D94934
InstCombine already performs a fold where X == Y ? f(X) : Z is
transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However,
if f(X) only has one use, then we can always directly replace the
use inside the instruction. To actually be profitable, limit it to
the case where Y is a non-expr constant.
This could be further extended to replace uses further up a one-use
instruction chain, but for now this only looks one level up.
Among other things, this also subsumes D94860.
Differential Revision: https://reviews.llvm.org/D94862
We can fold a ? b : false to a & b if is_poison(b) implies that
is_poison(a), at which point we're able to reuse all the usual fold
on ands. In particular, this covers the very common case of
icmp X, C && icmp X, C'. The same applies to ors.
This currently only has an effect if the
-instcombine-unsafe-select-transform=0 option is set.
Differential Revision: https://reviews.llvm.org/D94550
This disables the poison-unsafe select -> and/or transform behind
a flag (we continue to perform the fold by default). This is intended
to simplify evaluation and testing while we teach various passes
to directly recognize the select pattern.
This only disables the main select -> and/or transform. A number of
related ones are instead changed to canonicalize to the a ? b : false
and a ? true : b forms which represent and/or respectively. This
requires a bit of care to avoid infinite loops, as we do not want
!a ? b : false to be converted into a ? false : b.
The basic idea here is the same as D93065, but keeps the change
behind a flag for now.
Differential Revision: https://reviews.llvm.org/D93840
When doing select-to-zext/sext transformations, we should
not handle TrueVal and FalseVal of i1 type otherwise it
would result in zext/sext i1 to i1.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D93272
This patch enables canonicalization of SPF_ABS and SPF_ABS
to the abs intrinsic.
This is a recommit, the original try was
05d4c4ebc2,
but it was reverted due to an apparent miscompile,
which since then has just been fixed by the previous commit.
Differential Revision: https://reviews.llvm.org/D87188
As raised by @nlopes on D90382 - if this is not a rotate then the select was blocking poison from the 'shift-by-zero' non-TVal, but a funnel shift won't - so freeze it.
This is the last of the rotate->funnel shift InstCombine generalizations for PR46896
We still have foldGuardedRotateToFunnelShift to deal with in AggressiveInstCombine
Differential Revision: https://reviews.llvm.org/D90382
When retrying the "simplify with operand replaced" select
optimization without poison flags, also handle inbounds on GEPs.
Of course, this particular example would also be safe to transform
while keeping inbounds, but the underlying machinery does not
know this (yet).
When replacing X == Y ? f(X) : Z with X == Y ? f(Y) : Z, make sure
that Y cannot be undef. If it may be undef, we might end up picking
a different value for undef in the comparison and the select
operand.
Enable canonicalization of SPF_ABS and SPF_NABS to the abs intrinsic.
To be conservative, the one-use check on the comparison is retained,
this may be relaxed if all goes well.
It's pretty likely that this will uncover places that missing
handling for the abs() intrinsic. Please report any seen performance
regressions.
Differential Revision: https://reviews.llvm.org/D87188
Reapply after fixing SimplifyWithOpReplaced() to never return
the original value, which would lead to an infinite loop in this
transform.
-----
For selects of the type X == Y ? A : B, check if we can simplify A
by using the X == Y equality and replace the operand if that's
possible. We already try to do this in InstSimplify, but will only
fold if the result of the simplification is the same as B, in which
case the select can be dropped entirely. Here the select will be
retained, just one operand simplified.
As we are performing an actual replacement here, we don't have
problems with refinement / poison values.
Differential Revision: https://reviews.llvm.org/D87480
For selects of the type X == Y ? A : B, check if we can simplify A
by using the X == Y equality and replace the operand if that's
possible. We already try to do this in InstSimplify, but will only
fold if the result of the simplification is the same as B, in which
case the select can be dropped entirely. Here the select will be
retained, just one operand simplified.
As we are performing an actual replacement here, we don't have
problems with refinement / poison values.
Differential Revision: https://reviews.llvm.org/D87480
This is a followup to D86834, which partially fixed this issue in
InstSimplify. However, InstCombine repeats the same transform while
dropping poison flags -- which does not cover cases where poison is
introduced in some other way.
The fix here is a bit more comprehensive, because things are quite
entangled, and it's hard to only partially address it without
regressing optimization. There are really two changes here:
* Export the SimplifyWithOpReplaced API from InstSimplify, with an
added AllowRefinement flag. For replacements inside the TrueVal
we don't actually care whether refinement occurs or not, the
replacement is always legal. This part of the transform is now
done in InstSimplify only. (It should be noted that the current
AllowRefinement check is not sufficient -- that's an issue we
need to address separately.)
* Change the InstCombine fold to work by temporarily dropping
poison generating flags, running the fold and then restoring the
flags if it didn't work out. This will ensure that the InstCombine
fold is correct as long as the InstSimplify fold is correct.
Differential Revision: https://reviews.llvm.org/D87445
This patch adds an optimization that folds select(freeze(icmp eq/ne x, y), x, y)
to x or y.
This was needed to resolve slowdown after D84940 is applied.
I tried to bake this logic into foldSelectInstWithICmp, but it wasn't clear.
This patch conservatively writes the pattern in a separate function,
foldSelectWithFrozenICmp.
The output does not need freeze; https://alive2.llvm.org/ce/z/X49hNE (from @nikic)
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D85533
For a long time, the InstCombine pass handled target specific
intrinsics. Having target specific code in general passes was noted as
an area for improvement for a long time.
D81728 moves most target specific code out of the InstCombine pass.
Applying the target specific combinations in an extra pass would
probably result in inferior optimizations compared to the current
fixed-point iteration, therefore the InstCombine pass resorts to newly
introduced functions in the TargetTransformInfo when it encounters
unknown intrinsics.
The patch should not have any effect on generated code (under the
assumption that code never uses intrinsics from a foreign target).
This introduces three new functions:
TargetTransformInfo::instCombineIntrinsic
TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic
A few target specific parts are left in the InstCombine folder, where
it makes sense to share code. The largest left-over part in
InstCombineCalls.cpp is the code shared between arm and aarch64.
This allows to move about 3000 lines out from InstCombine to the targets.
Differential Revision: https://reviews.llvm.org/D81728
```
define i32 @test(i1 %cond) {
entry:
br i1 %cond, label %exit, label %exit
exit:
%result = select i1 %cond, i32 123, i32 456
ret i32 %result
}
```
In this test, after applying transformation of replacing select with Phis,
the result will be:
```
define i32 @test(i1 %cond) {
entry:
br i1 %cond, label %exit, label %exit
exit:
%result = i32 phi [123, %exit], [123, %exit]
ret i32 %result
}
```
That is, select is transformed into an invalid Phi, which will then be
reduced to 123 and the second value will be lost. But it is worth
noting that this problem will arise only if select is in the InstCombine
worklist will be before the branch. Otherwise, InstCombine will replace
the branch condition with false and transformation will not be applied.
The fix is to check the target labels in the branch condition for equality.
Patch By: Kirill Polushin
Differential Revision: https://reviews.llvm.org/D84003
Reviewed By: mkazantsev
We can try to replace select with a Phi not in its parent block alone,
but also in blocks of its arguments. We benefit from it when select's
argument is a Phi.
Differential Revision: https://reviews.llvm.org/D83284
Reviewed By: nikic
Summary:
The actual transform i was going after was:
https://rise4fun.com/Alive/Tp9H
```
Name: zz
Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0
%t0 = and i8 %x, C0
%r = icmp eq i8 %t0, C1
=>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1
Name: zz
Pre: isPowerOf2(C0)
%t0 = and i8 %x, C0
%r = icmp ne i8 %t0, 0
=>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1
```
but as it can be seen from the current tests, we already canonicalize most of it,
and we are only missing handling multi-use non-canonical icmp predicates.
If we have both `!=0` and `==0`, even though we can CSE them,
we end up being stuck with them. We should canonicalize to the `==0`.
I believe this is one of the cleanup steps i'll need after `-scalarizer`
if i end up proceeding with my WIP alloca promotion helper pass.
Reviewers: spatel, jdoerfert, nikic
Reviewed By: nikic
Subscribers: zzheng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83139
This patch transforms
```
p = phi [x, y]
s = select cond, z, p
```
with
```
s = phi[x, z]
```
if we can prove that the Phi node takes values basing on select's condition.
Differential Revision: https://reviews.llvm.org/D82072
Reviewed By: nikic
We can sometimes replace a select with a Phi node if all of its values
are available on respective incoming edges.
Differential Revision: https://reviews.llvm.org/D82005
Reviewed By: nikic
As mentioned in the post-commit comments of D81013 -
the mask check API has to assume the shuffle is
not length-changing, but we have not ruled that out
in this code. Use the ShuffleVectorInst call instead.
SimplifyDemandedVectorElts() bails out on ScalableVectorType
anyway, but we can exit faster with the external check.
Move this to a helper function because there are likely other
vector folds that we can try here.
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: sdesmalen, rriddle, efriedma
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77263
Rather than converting to a dummy select with equal true and false
ops, just directly return the resulting value.
As a side-effect, this fixes missing DCE of the previously replaced
operand.
Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp
negation. This also extends the scalarization cost in instcombine for unary
operators to result in the same IR rewrites for fneg as for the idiom.
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75467
Use UnaryOperator::CreateFNeg instead.
Summary:
With the introduction of the native fneg instruction, the
fsub -0.0, %x idiom is obsolete. This patch makes LLVM
emit fneg instead of the idiom in all places.
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D75130
The select-of-cttz transform can currently duplicate cttz intrinsics
and zext/trunc ops. The cause is that it unnecessarily duplicates
the intrinsic and the zext/trunc when setting the "undef_on_zero"
flag to false. However, it's always legal to set the flag from true
to false, so we can make this replacement even if there are extra users.
Differential Revision: https://reviews.llvm.org/D74685
This is a followup to D73803, which uses the replaceOperand()
helper in more places.
This should be NFC apart from changes to worklist order.
Differential Revision: https://reviews.llvm.org/D73919
We were checking for extra uses of the negated operand even
if we were not going to create it as part of this canonicalization.
This was showing up as a regression when we limit EarlyCSE as
proposed in D74285.
While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541,
it does so in a more roundabout manner and there might be other
loopholes to trigger the same issue. This is a more direct fix,
that prevents the transform if the min/max is based on a
non-canonical sub X, 0 instruction.
Differential Revision: https://reviews.llvm.org/D73849
This renames Worklist.AddDeferred() to Worklist.add() and
Worklist.Add() to Worklist.push(). The intention here is that
Worklist.add() should be the go-to method for explicit worklist
management, while the raw Worklist.push() is mostly for
InstCombine internals. I will then migrate uses of Worklist.push()
to Worklist.add() in followup changes.
As suggested by spatel on D73411 I'm also changing the remaining
method names to lowercase first character, in line with current
coding standards.
Differential Revision: https://reviews.llvm.org/D73745
This should be the last step needed to solve the problem in the
description of PR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153
If we're casting an FP value to int, testing its signbit, and then
choosing between a value and its negated value, that's a
complicated way of saying "copysign":
(bitcast X) < 0 ? -TC : TC --> copysign(TC, X)
Differential Revision: https://reviews.llvm.org/D72643
The constants come through as add %x, -C, not a sub as would be
expected. They need some extra matchers to canonicalise them towards
usub_sat.
Differential Revision: https://reviews.llvm.org/D69514
This adjusts the one use checks in the the usub_sat fold code to not
increase instruction count, but otherwise do the fold. Reviewed as a
part of D69514.
Working on top of D69252, this adds canonicalisation patterns for ssub.with.overflow to ssub.sats.
Differential Revision: https://reviews.llvm.org/D69753
This adds to D69245, adding extra signed patterns for folding from a
sadd_with_overflow to a sadd_sat. These are more complex than the
unsigned patterns, as the overflow can occur in either direction.
For the add case, the positive overflow can only occur if both of the
values are positive (same for both the values being negative). So there
is an extra select on whether to use the positive or negative overflow
limit.
Differential Revision: https://reviews.llvm.org/D69252
As noted by the FIXME comment, this is not correct based on our current FMF semantics.
We should be propagating FMF from the final value in a sequence (in this case the
'select'). So the behavior even without this patch is wrong, but we did not allow FMF
on 'select' until recently.
But if we do the correct thing right now in this patch, we'll inevitably introduce
regressions because we have not wired up FMF propagation for 'phi' and 'select' in
other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a
better incremental way to make progress.
That said, the potential extra damage over the existing wrong behavior from this
patch is very limited. AFAIK, the only way to have different FMF on IR in the same
function is if we have LTO inlined IR from 2 modules that were compiled using
different fast-math settings.
As seen in the tests, we may actually see some improvements with this patch because
adding the FMF to the 'select' allows matching to min/max intrinsics that were
previously missed (in the common case, the 'fcmp' and 'select' should have identical
FMF to begin with).
Next steps in the transition:
Make similar changes in instcombine as needed.
Enable phi-to-select FMF propagation in SimplifyCFG.
Remove dependencies on fcmp with FMF.
Deprecate FMF on fcmp.
Differential Revision: https://reviews.llvm.org/D69720
This adds some patterns to transform uadd.with.overflow to uadd.sat
(with usub.with.overflow to usub.sat too). The patterns selects from
UINTMAX (or 0 for subs) depending on whether the operation overflowed.
Signed patterns are a little more involved (they can wrap in two
directions), but can be added here in a followup patch too.
Differential Revision: https://reviews.llvm.org/D69245
This is an extra fold for a canonical form of uadd_sat, as shown in
D68651. It essentially selects uadd from an add and a select.
Differential Revision: https://reviews.llvm.org/D69244
This adds an instcombine matcher for code that attempts to perform signed
saturating arithmetic by casting to a higher type. Unsigned cases are already
matched, this adds extra matches for the more complex signed cases, which
involves matching the min(max(add a b)) nodes with proper extends to ensure
legality.
Differential Revision: https://reviews.llvm.org/D68651
llvm-svn: 375505
This has the potential to uncover missed analysis/folds as shown in the
min/max code comment/test, but fewer restrictions on icmp folds should
be better in general to solve cases like:
https://bugs.llvm.org/show_bug.cgi?id=43310
llvm-svn: 372510
Promoting it from InstCombine's tryToReuseConstantFromSelectInComparison().
Return true if this constant and a constant 'Y' are element-wise equal.
This is identical to just comparing the pointers, with the exception that
for vectors, if only one of the constants has an `undef` element in some
lane, the constants still match.
llvm-svn: 369842
Summary:
If we have e.g.:
```
%t = icmp ult i32 %x, 65536
%r = select i1 %t, i32 %y, i32 65535
```
the constants `65535` and `65536` are suspiciously close.
We could perform a transformation to deduplicate them:
```
Name: ult
%t = icmp ult i32 %x, 65536
%r = select i1 %t, i32 %y, i32 65535
=>
%t.inv = icmp ugt i32 %x, 65535
%r = select i1 %t.inv, i32 65535, i32 %y
```
https://rise4fun.com/Alive/avb
While this may seem esoteric, this should certainly be good for vectors
(less constant pool usage) and for opt-for-size - need to have only one constant.
But the real fun part here is that it allows further transformation,
in particular it finishes cleaning up the `clamp` folding,
see e.g. `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`.
We start with e.g.
```
%dont_need_to_clamp_positive = icmp sle i32 %X, 32767
%dont_need_to_clamp_negative = icmp sge i32 %X, -32768
%clamp_limit = select i1 %dont_need_to_clamp_positive, i32 -32768, i32 32767
%dont_need_to_clamp = and i1 %dont_need_to_clamp_positive, %dont_need_to_clamp_negative
%R = select i1 %dont_need_to_clamp, i32 %X, i32 %clamp_limit
```
without this patch we currently produce
```
%1 = icmp slt i32 %X, 32768
%2 = icmp sgt i32 %X, -32768
%3 = select i1 %2, i32 %X, i32 -32768
%R = select i1 %1, i32 %3, i32 32767
```
which isn't really a `clamp` - both comparisons are performed on the original value,
this patch changes it into
```
%1.inv = icmp sgt i32 %X, 32767
%2 = icmp sgt i32 %X, -32768
%3 = select i1 %2, i32 %X, i32 -32768
%R = select i1 %1.inv, i32 32767, i32 %3
```
and then the magic happens! Some further transform finishes polishing it and we finally get:
```
%t1 = icmp sgt i32 %X, -32768
%t2 = select i1 %t1, i32 %X, i32 -32768
%t3 = icmp slt i32 %t2, 32767
%R = select i1 %t3, i32 %t2, i32 32767
```
which is beautiful and just what we want.
Proofs for `getFlippedStrictnessPredicateAndConstant()` for de-canonicalization:
https://rise4fun.com/Alive/THl
Proofs for the fold itself: https://rise4fun.com/Alive/THl
Reviewers: spatel, dmgreen, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66232
llvm-svn: 369840