Commit Graph

755 Commits

Author SHA1 Message Date
Chris Lattner c86e67e110 fix an off-by-one bug that caused a crash analyzing
ashr's with huge shift amounts, PR8896

llvm-svn: 122814
2011-01-04 18:19:15 +00:00
Owen Anderson 226ac14afb When determining if we can fold (x >> C1) << C2, the bits that we need to verify are zero
are not the low bits of x, but the bits that WILL be the low bits after the operation completes.

llvm-svn: 122529
2010-12-23 23:56:24 +00:00
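An illustrative IR sketch of the point above, for the case where the right
shift is larger than the left shift (value names invented, not from the
commit's testcase):

  ; (x >> 2) << 1 can only be folded to x >> 1 if bit 1 of x is zero:
  ; after the net right shift by one, bit 1 of x is what lands in bit 0
  ; of the result, so that is the bit that must be known zero.
  %x  = and i32 %a, -3        ; clear bit 1
  %s1 = lshr i32 %x, 2
  %s2 = shl i32 %s1, 1        ; equivalent to lshr i32 %x, 1 here
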
Benjamin Kramer 8ef5001b27 InstCombine: creating selects from -1 and 0 is fine; they combine into a sext from i1.
llvm-svn: 122453
2010-12-22 23:12:15 +00:00
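For illustration, the kind of pattern the commit above refers to (a minimal
sketch, not the commit's testcase):

  %c = icmp sgt i32 %a, %b
  %r = select i1 %c, i32 -1, i32 0   ; equivalent to: %r = sext i1 %c to i32
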
Duncan Sands bab9456f81 Make this test not depend on how the variable is named.
llvm-svn: 122413
2010-12-22 17:08:04 +00:00
Duncan Sands fbb9ac3cca Add a generic expansion transform: A op (B op' C) -> (A op B) op' (A op C)
if both A op B and A op C simplify.  This fires fairly often but doesn't
make that much difference.  On gcc-as-one-file it removes two "and"s and
turns one branch into a select.

llvm-svn: 122399
2010-12-22 13:36:08 +00:00
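A small sketch of the expansion rule above, using invented values chosen so
that both partial results simplify:

  %ab = and i32 %x, %y
  %xy = or  i32 %x, %y
  %r  = and i32 %ab, %xy
  ; expanding A op (B op' C) with A = %ab, B = %x, C = %y gives
  ; (%ab & %x) | (%ab & %y); both arms simplify to %ab, so %r becomes %ab
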
Benjamin Kramer 68531baea9 Teach InstCombine to merge (icmp ult (X + CA), C1) | (icmp eq X, C2) into (icmp ult (X + CA), C1 + 1) if C2 + CA == C1.
InstCombine creates these, so now we compile x == 23 || x == 24 || x == 25 to
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 3
instead of
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 2
  %cmp3 = icmp eq i32 %x, 25
  %ret2 = or i1 %1, %cmp3

llvm-svn: 122248
2010-12-20 16:18:51 +00:00
Duncan Sands ed6d6c33dd Have SimplifyBinOp dispatch Xor, Add and Sub to the corresponding methods
(they had just been forgotten before).  Adding Xor causes "main" in the
existing testcase 2010-11-01-lshr-mask.ll to be hugely more simplified.

llvm-svn: 122245
2010-12-20 14:47:04 +00:00
Chris Lattner 27ca8ebd4b fix PR8807 by making transformConstExprCastCall aware of byval arguments.
llvm-svn: 122238
2010-12-20 08:36:38 +00:00
Mon P Wang 7bcead020d Test case for r122215 when InstCombine optimizes memset
llvm-svn: 122216
2010-12-20 01:06:23 +00:00
Chris Lattner 1e8c032a6e X86 supports i8/i16 overflow ops (except i8 multiplies), so we should
generate them.

Now we compile:

define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp {
entry:
  %0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b)
  %cmp = extractvalue %0 %0, 1
  br i1 %cmp, label %if.then, label %if.end

into:

_X:                                     ## @X
## BB#0:                                ## %entry
	subl	$12, %esp
	movb	16(%esp), %al
	addb	20(%esp), %al
	jo	LBB0_2

Before we were generating:

_X:                                     ## @X
## BB#0:                                ## %entry
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movb	12(%ebp), %al
	testb	%al, %al
	setge	%cl
	movb	8(%ebp), %dl
	testb	%dl, %dl
	setge	%ah
	cmpb	%cl, %ah
	sete	%cl
	addb	%al, %dl
	testb	%dl, %dl
	setge	%al
	cmpb	%al, %ah
	setne	%al
	andb	%cl, %al
	testb	%al, %al
	jne	LBB0_2

llvm-svn: 122186
2010-12-19 20:03:11 +00:00
Chris Lattner 5e0c0c72e9 recognize an unsigned add-with-overflow idiom and turn it into uadd.
This resolves a README entry and technically resolves PR4916,
but we still get poor code for the testcase in that PR because
GVN isn't CSE'ing uadd with add, filed as PR8817.

Previously we got:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	cmpq	%rdi, %rsi
	movl	$42, %eax
	cmovaq	%rsi, %rax
	ret

Now we get:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	movl	$42, %eax
	cmovbq	%rsi, %rax
	ret

llvm-svn: 122182
2010-12-19 19:37:52 +00:00
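The idiom in question has roughly this shape (a sketch; the actual testcase
comes from the README entry mentioned above):

  %sum = add i64 %a, %b
  %ov  = icmp ult i64 %sum, %a     ; sum wrapped below %a => unsigned overflow
  ; recognized as:
  ;   %t   = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
  ;   %sum = extractvalue { i64, i1 } %t, 0
  ;   %ov  = extractvalue { i64, i1 } %t, 1
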
Chris Lattner 33dc3f0cfa optimize uadd(x, cst) into a comparison when the normal
result is dead.  This is required for my next patch to not
regress the testsuite.

llvm-svn: 122181
2010-12-19 19:35:32 +00:00
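For example (a sketch with an invented constant), when only the overflow bit
of the intrinsic is used:

  %t  = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %x, i32 4)
  %ov = extractvalue { i32, i1 } %t, 1   ; the sum itself is never used
  ; with the add result dead, the overflow bit is just a range check:
  ;   %ov = icmp ugt i32 %x, -5          ; x + 4 wraps iff x > UINT_MAX - 4
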
Chris Lattner 79874566ce generalize the sadd creation code to not require that the
sadd formed is half the size of the original type. We can
now compile this into a sadd.i8:

unsigned char X(char a, char b) {
  int res = a+b;
  if ((unsigned)(res+128) > 255U)
    abort();
  return res;
}

llvm-svn: 122178
2010-12-19 18:35:09 +00:00
Chris Lattner c56c845377 fix another miscompile in the llvm.sadd formation logic: it wasn't
checking to see if the high bits of the original add result were dead.
Inserting a smaller add and zexting back to that size is not good enough.

This is likely to be the fix for PR8816.

llvm-svn: 122177
2010-12-19 18:22:06 +00:00
Chris Lattner f29562db25 fix a bug (possibly PR8816) in the sadd forming xform: it isn't
profitable (or safe) to promote code when the add-with-constant
has other uses.

llvm-svn: 122175
2010-12-19 17:59:02 +00:00
Nate Begeman 7aa18bf46a Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms.
llvm-svn: 122105
2010-12-17 23:12:19 +00:00
Owen Anderson 1294ea7d53 Reapply r121905 (automatic synthesis of @llvm.sadd.with.overflow) with a fix for a bug that manifested itself
on the DragonEgg self-host bot.  Unfortunately, the testcase is pretty messy and doesn't reduce well due to
interactions with other parts of InstCombine.

llvm-svn: 122072
2010-12-17 18:08:00 +00:00
Duncan Sands 8d1ab6f6e1 Speculatively revert commit 121905 since it looks like it might have broken the
dragonegg self-host buildbot.  Original commit message:

Add an InstCombine transform to recognize instances of manual overflow-safe addition
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics.  This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.

llvm-svn: 121965
2010-12-16 09:40:54 +00:00
Owen Anderson 1cf8881299 Add an InstCombine transform to recognize instances of manual overflow-safe addition
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics.  This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.

Fixes <rdar://problem/8558713>.

llvm-svn: 121905
2010-12-15 22:32:38 +00:00
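One representative shape of the idiom (a sketch only; the commit mentions
magic constant formulas, so the form actually matched may be an equivalent
unsigned range test):

  %a64 = sext i32 %a to i64
  %b64 = sext i32 %b to i64
  %s   = add i64 %a64, %b64
  %lo  = icmp sge i64 %s, -2147483648    ; s >= INT_MIN
  %hi  = icmp sle i64 %s, 2147483647     ; s <= INT_MAX
  %ok  = and i1 %lo, %hi
  ; folds, conceptually, to the negation of the overflow bit of
  ; @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
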
Benjamin Kramer c4169cebe3 Generalize the and-icmp-select instcombine further by allowing selects of the form
(x & 2^n) ? 2^m+C : C

we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the
select to a shift and apply the offset afterwards.

llvm-svn: 121609
2010-12-11 10:49:22 +00:00
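A worked sketch with invented constants (n = 3, m = 5, C = 4):

  %bit = and i32 %x, 8                   ; 2^n
  %tst = icmp ne i32 %bit, 0
  %sel = select i1 %tst, i32 36, i32 4   ; 2^m + C : C
  ; offset both arms by C = 4, shift bit 3 up to bit 5, then add C back:
  ;   %shl = shl i32 %bit, 2
  ;   %sel = add i32 %shl, 4
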
Benjamin Kramer c8b035d006 Factor the (x & 2^n) ? 2^m : 0 instcombine into its own method and generalize it
to catch cases where n != m with a shift.

llvm-svn: 121608
2010-12-11 09:42:59 +00:00
Dan Gohman a32986e899 Really check that the bits that will become zero are actually already zero
before eliminating the operation that zeros them. This fixes rdar://8739316.

llvm-svn: 121353
2010-12-09 02:52:17 +00:00
Chris Lattner 6c7f64e0bc remove a use of llvm-dis
llvm-svn: 120383
2010-11-30 02:04:15 +00:00
Frits van Bommel 28218aa8f1 Transform (extractvalue (load P), ...) to (load (gep P, 0, ...)) if the load has no other uses, shrinking the load.
llvm-svn: 120323
2010-11-29 21:56:20 +00:00
Frits van Bommel 40a80ac963 Update this test to keep testing the -instcombine transform it's supposed to be testing instead of triggering the improved constant folding for insertvalue and extractvalue.
llvm-svn: 120319
2010-11-29 20:55:40 +00:00
Benjamin Kramer 94a622af4c The srem -> urem transform is not safe for any divisor that's not a power of two.
E.g. -5 % 5 is 0 with srem and 1 with urem.

Also addresses Frits van Bommel's comments.

llvm-svn: 120049
2010-11-23 20:33:57 +00:00
Benjamin Kramer b5afa65b0a InstCombine: Reduce "X shift (A srem B)" to "X shift (A urem B)" iff B is positive.
This allows the rem in "1 << ((int)x % 8);" to be transformed into an and.

llvm-svn: 120028
2010-11-23 18:52:42 +00:00
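A sketch of the shift-amount case (invented names):

  %r = srem i32 %x, 8
  %s = shl i32 1, %r
  ; a negative remainder would make the shift amount out of range, which is
  ; undefined, so the srem can be treated as urem, and urem by a power of two
  ; is a mask:
  ;   %r = and i32 %x, 7
  ;   %s = shl i32 1, %r
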
Duncan Sands adc7771f18 Exploit distributive laws (eg: And distributes over Or, Mul over Add, etc) in a
fairly systematic way in instcombine.  Some of these cases were already dealt
with, in which case I removed the existing code.  The case of Add has a bunch of
funky logic which covers some of this plus a few variants (considers shifts to be
a form of multiplication), which I didn't touch.  The simplification performed is:
A*B+A*C -> A*(B+C).  The improvement is to do this in cases that were not already
handled [such as A*B-A*C -> A*(B-C), which was reported on the mailing list], and
also to do it more often by not checking for "only one use" if "B+C" simplifies.

llvm-svn: 120024
2010-11-23 14:23:47 +00:00
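A minimal sketch of the factorization direction described above, subject to
the use checks the commit mentions:

  %m1 = mul i32 %a, %b
  %m2 = mul i32 %a, %c
  %d  = sub i32 %m1, %m2
  ; factors to:
  ;   %bc = sub i32 %b, %c
  ;   %d  = mul i32 %a, %bc
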
Chris Lattner e5afa15b77 Duncan's spider sense was right; I completely reversed the condition
on this instcombine xform.  This fixes a miscompilation of 403.gcc.

llvm-svn: 119988
2010-11-23 02:42:04 +00:00
Benjamin Kramer f1ebb63161 InstCombine: Implement X - A*-B -> X + A*B.
llvm-svn: 119984
2010-11-22 20:31:27 +00:00
Duncan Sands c133c54426 If a GEP index simply advances by multiples of a type of zero size,
then replace the index with zero.

llvm-svn: 119974
2010-11-22 16:32:50 +00:00
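A sketch using the empty struct {} as a zero-sized element type (names
invented):

  %q = getelementptr {}* %p, i64 %n
  ; stepping by multiples of a zero-sized element never moves the pointer,
  ; so the variable index can be replaced with zero:
  ;   %q = getelementptr {}* %p, i64 0
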
Duncan Sands cf4bceba49 Add a rather pointless InstructionSimplify transform, inspired by recent constant
folding improvements: if P points to a type of size zero, turn "gep P, N" into "P".
More generally, if a gep index type has size zero, instcombine could replace the
index with zero, but that is not done here.

llvm-svn: 119942
2010-11-21 13:53:09 +00:00
Chris Lattner f7e896138e optimize:
void a(int x) { if (((1<<x)&8)==0) b(); }

into "x != 3", which occurs over 100 times in 403.gcc but in no
other program in llvm-test.

llvm-svn: 119922
2010-11-21 06:44:42 +00:00
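In IR terms this is roughly (a sketch):

  %shl = shl i32 1, %x
  %and = and i32 %shl, 8
  %tst = icmp eq i32 %and, 0   ; becomes: icmp ne i32 %x, 3
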
Benjamin Kramer 07726c7d52 InstCombine: Add a missing irem identity (X % X -> 0).
llvm-svn: 119538
2010-11-17 19:11:46 +00:00
Duncan Sands 5ffc298bc7 In which I discover the existence of loops. Threading an operation
over a phi node by applying it to each operand may be wrong if the
operation and the phi node are mutually interdependent (the testcase
has a simple example of this).  So only do this transform if it would
be correct to perform the operation in each predecessor of the block
containing the phi, i.e. if the other operands all dominate the phi.
This should fix the FFMPEG snow.c regression reported by İsmail Dönmez.

llvm-svn: 119347
2010-11-16 12:16:38 +00:00
Duncan Sands f12ba1dfe1 Teach InstructionSimplify the trick of skipping incoming phi
values that are equal to the phi itself.

llvm-svn: 119161
2010-11-15 17:52:45 +00:00
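The trick applies to phis like this one (a minimal sketch):

  loop:
    %p = phi i32 [ %x, %entry ], [ %p, %loop ]
    ; the backedge value is the phi itself, so it can be ignored and %p
    ; simplifies to %x
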
Duncan Sands 01157b027b Move PHI tests to phi.ll, out of select.ll.
llvm-svn: 119153
2010-11-15 16:43:28 +00:00
Duncan Sands 641baf1646 Generalize the reassociation transform in SimplifyCommutative (now renamed to
SimplifyAssociativeOrCommutative) "(A op C1) op C2" -> "A op (C1 op C2)",
which previously was only done if C1 and C2 were constants, to occur whenever
"C1 op C2" simplifies (a la InstructionSimplify).  Since the simplifying operand
combination can no longer be assumed to be the right-hand terms, consider all of
the possible permutations.  When compiling "gcc as one big file", transform 2
(i.e. using right-hand operands) fires about 4000 times but it has to be said
that most of the time the simplifying operands are both constants.  Transforms
3, 4 and 5 each fired once.  Transform 6, which is an existing transform that
I didn't change, never fired.  With this change, the testcase is now optimized
perfectly with one run of instcombine (previously it required instcombine +
reassociate + instcombine, and it may just have been luck that this worked).

llvm-svn: 119002
2010-11-13 15:10:37 +00:00
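A small non-constant example of the generalized rule (invented names):

  %nb = xor i32 %b, -1
  %t  = and i32 %a, %b
  %r  = and i32 %t, %nb   ; (A & B) & ~B  ->  A & (B & ~B)  ->  A & 0  ->  0
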
Duncan Sands f3b1bf1606 Teach InstructionSimplify how to look through PHI nodes. Since PHI
nodes can be used in loops, this could result in infinite looping
if there is no recursion limit, so add such a limit.  It is also
used for the SelectInst case because in theory there could be an
infinite loop there too if the basic block is unreachable.

llvm-svn: 118694
2010-11-10 18:23:01 +00:00
Dale Johannesen 0171dc30ff When checking that the necessary bits are zero in
order to reduce ((x<<30)>>24) to x<<6, check the
correct bits.  PR 8547.

llvm-svn: 118665
2010-11-10 01:30:56 +00:00
Duncan Sands 52be6b41d3 Add an additional test for icmp of select folding.
llvm-svn: 118441
2010-11-08 20:56:28 +00:00
Duncan Sands a620bd1fa3 Add simplification of floating point comparisons with the result
of a select instruction, matching what already exists for integer
comparisons.

llvm-svn: 118379
2010-11-07 16:46:25 +00:00
Duncan Sands f532d31198 Fix a README item: when doing a comparison with the result
of a select instruction, see if doing the compare with the
true and false values of the select gives the same result.
If so, that can be used as the value of the comparison.

llvm-svn: 118378
2010-11-07 16:12:23 +00:00
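For example (a sketch with invented constants):

  %s = select i1 %c, i32 7, i32 9
  %r = icmp sgt i32 %s, 3
  ; icmp sgt 7, 3 and icmp sgt 9, 3 are both true, so %r folds to true
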
Owen Anderson 6186c96765 When folding away a (shl (shr)) pair, we need to check that the bits that will BECOME the low
bits are zero, not that the current low bits are zero.  Fixes <rdar://problem/8606771>.

llvm-svn: 117953
2010-11-01 21:08:20 +00:00
Bob Wilson 11ee456e23 Change instcombine's getShuffleMask to represent undef with negative values.
This code had previously used 2*N, where N is the mask length, to represent
undef.  That is not safe because the shufflevector operands may have more
than N elements -- they don't have to match the result type.

llvm-svn: 117721
2010-10-29 22:03:05 +00:00
Bob Wilson cb11b48e7a Make instcombine a little more aggressive in combining vector shuffles.
Allow splats even if they don't match either of the original shuffles,
possibly due to undef entries in the shuffle masks.  Radar 8597790.
Also fix some 80-column violations.

llvm-svn: 117719
2010-10-29 22:02:50 +00:00
Dale Johannesen 16bb87a90e Teach InstCombine not to use Add and Neg on FP. PR 8490.
llvm-svn: 117510
2010-10-27 23:45:18 +00:00
Dan Gohman 2e20dfb0f2 Fix a case where instcombine was stripping metadata (and alignment)
from stores when folding in bitcasts.

llvm-svn: 117265
2010-10-25 16:16:27 +00:00
Bob Wilson a4e231c880 Teach instcombine to set the alignment arguments for NEON load/store intrinsics.
llvm-svn: 117154
2010-10-22 21:41:48 +00:00
Chris Lattner c663a67384 fix PR8267 - Instcombine shouldn't optimize away volatile memcpy's.
llvm-svn: 115296
2010-10-01 05:51:02 +00:00