After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...
llvm-svn: 289756
There are other spots where we can use this; we're currently dropping
metadata in some places, and there are proposed changes where we will
want to propagate metadata.
IRBuilder's CreateSelect() already has a parameter like this, so this
change makes the regular 'Create' API line up with that.
llvm-svn: 287976
This is a first step towards canonicalization and improved folding/codegen
for integer min/max as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html
Here, we're just matching the simplest min/max patterns and adjusting the
icmp predicate while swapping the select operands.
I've included FIXME tests in test/Transforms/InstCombine/select_meta.ll
so it's easier to see how this might be extended (corresponds to the TODO
comment in the code). That's also why I'm using matchSelectPattern()
rather than a simpler check; once the backend is patched, we can just
remove some of the restrictions to allow the obfuscated min/max patterns
in the FIXME tests to be matched.
Differential Revision: https://reviews.llvm.org/D26525
llvm-svn: 287585
As the test change shows, we can increase the critical path by adding
a 'not' instruction, so make sure that we're actually removing an
instruction if we do this transform.
This transform could also cause us to miss folds of min/max pairs.
llvm-svn: 286315
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.
llvm-svn: 286113
This reverts commit r285732.
This change introduced a new assertion failure in the following
testcase at -O2:
typedef short __v8hi __attribute__((__vector_size__(16)));
__v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
__v8hi Result = V1;
if (mask & 0x80)
Result[0] = V2[0];
return Result;
}
llvm-svn: 285866
If we're going to canonicalize IR towards select of constants, try harder to create those.
Also, don't lose the metadata.
This is actually 4 related transforms in one patch:
// select X, (sext X), C --> select X, -1, C
// select X, (zext X), C --> select X, 1, C
// select X, C, (sext X) --> select X, C, 0
// select X, C, (zext X) --> select X, C, 0
Differential Revision: https://reviews.llvm.org/D25126
llvm-svn: 283575
Also, make foldSelectExtConst() a member of InstCombiner, remove
unnecessary parameters from its interface, and group visitSelectInst
helpers together in the header file.
llvm-svn: 282796
Summary:
Turn (select C, (sext A), B) into (sext (select C, A, B')) when A is i1 and
B is a compatible constant, also for zext instead of sext. This will then be
further folded into logical operations.
The transformation would be valid for non-i1 types as well, but other parts of
InstCombine prefer to have sext from non-i1 as an operand of select.
Motivated by the shader compiler frontend in Mesa for AMDGPU, which emits i32
for boolean operations. With this change, the boolean logic is fully
recovered.
Reviewers: majnemer, spatel, tstellarAMD
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22747
llvm-svn: 277801
By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we
allow these transforms for splat vectors.
Differential Revision: http://reviews.llvm.org/D21899
llvm-svn: 274696
Follow-up from r274465: we don't need to capture the value in these cases,
so just match the constant that we're looking for. m_One/m_Zero work with
vector splats as well as scalars.
llvm-svn: 274670
The motivating example for this transform is similar to D20774 where bitcasts interfere
with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to
produce min and max ops:
define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
%cmp = fcmp olt <4 x float> %a, %b
%bc1 = bitcast <4 x float> %a to <4 x i32>
%bc2 = bitcast <4 x float> %b to <4 x i32>
%sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2
%sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1
%bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>*
store <4 x i32> %sel1, <4 x i32>* %bc3
%bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>*
store <4 x i32> %sel2, <4 x i32>* %bc4
ret void
}
With this patch, we move the selects up to use the input args which allows getting rid of
all of the bitcasts:
define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
%cmp = fcmp olt <4 x float> %a, %b
%sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b
%sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a
store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16
store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16
ret void
}
The asm for x86 SSE then improves from:
movaps %xmm0, %xmm2
cmpltps %xmm1, %xmm2
movaps %xmm2, %xmm3
andnps %xmm1, %xmm3
movaps %xmm2, %xmm4
andnps %xmm0, %xmm4
andps %xmm2, %xmm0
orps %xmm3, %xmm0
andps %xmm1, %xmm2
orps %xmm4, %xmm2
movaps %xmm0, (%rdi)
movaps %xmm2, (%rsi)
To:
movaps %xmm0, %xmm2
minps %xmm1, %xmm2
maxps %xmm0, %xmm1
movaps %xmm2, (%rdi)
movaps %xmm1, (%rsi)
The TODO comments show that we're limiting this transform only to vectors and only to bitcasts
because we need to improve other transforms or risk creating worse codegen.
Differential Revision: http://reviews.llvm.org/D21190
llvm-svn: 273011
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.
llvm-svn: 272126
We had a select of a cast of a select but attempted to replace the outer
select with the inner select dispite their incompatible types.
Patch by Anton Korobeynikov!
This fixes PR27236.
llvm-svn: 265805
Summary:
There are `SelectPatternFlavor`s that don't represent min or max idioms,
and we should not be passing those to `getCmpPredicateForMinMax`.
Fixes PR25745.
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15249
llvm-svn: 254869
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.
matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.
C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.
ARM and AArch64 at least have different instructions for these different cases.
llvm-svn: 244580
There were several SelectInst combines that always returned an existing
instruction instead of modifying an old one or creating a new one.
These are prime candidates for moving to InstSimplify.
llvm-svn: 239229
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand. However, doing so is only valid if the
computation doesn't inject poison into the computation.
It might be helpful to consider the following example:
(select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)
The select is equivalent to (add %i, 1) but not (add nsw %i, 1).
Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.
llvm-svn: 239215
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least on ARM, increasing the
compilation time (stage 2, clang's) by 5x.
llvm-svn: 239175
I don't have the IR which is causing the build bot breakage but I can
postulate as to why they are timing out:
1. SimplifyWithOpReplaced was stripping flags from the simplified value.
2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because
it's simplification wasn't correct.
3. InstCombine would revisit the add instruction and note that it can
rederive the flags.
4. By modifying the value, we chose to revisit instructions which reuse
the value. One of the instructions is the original select, causing
LLVM to never reach fixpoint.
Instead, strip the flags only when we are sure we are going to perform
the simplification.
llvm-svn: 239141
We cleverly handle cases where computation done in one argument of a select
instruction is suitable for the other operand, thus obviating the need
of the select and the comparison. However, the other operand cannot
have flags.
This fixes PR23757.
llvm-svn: 239115