
2314 Commits

Author SHA1 Message Date
Galina Kistanova a335f5aeeb Move a few target-dependent tests to appropriate directories.
llvm-svn: 131002
2011-05-06 18:24:46 +00:00
Duncan Sands a071c82900 Fix PR9820: a read-only call differs from a load in that a load doesn't
return the pointer being dereferenced, it returns the pointee, but a call
might return the pointer itself.

llvm-svn: 130979
2011-05-06 10:30:37 +00:00
Eli Friedman 8a20e66926 PR9838: Fix transform introduced in r127064 to not trigger when only one side of the icmp is an exact shift.
llvm-svn: 130954
2011-05-05 21:59:18 +00:00
Duncan Sands a228785526 Add variations on: max(x,y) >= min(x,z) folds to true. This isn't that common,
but according to my super-optimizer there are only two missed simplifications
of -instsimplify kind when compiling bzip2, and this is one of them.  It amuses
me to have bzip2 be perfectly optimized as far as instsimplify goes!

llvm-svn: 130840
2011-05-04 16:05:05 +00:00
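
The fold above holds because max(x,y) >= x and x >= min(x,z), so the comparison is true for any y and z. A minimal brute-force sketch of that reasoning over a small signed range; the helper names are hypothetical and are not taken from the LLVM sources:

```c
/* Hypothetical helpers, for illustration only. */
static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Counts triples in [-range, range]^3 for which
   max(x, y) >= min(x, z) does not hold. */
int count_max_min_failures(int range) {
    int failures = 0;
    for (int x = -range; x <= range; ++x)
        for (int y = -range; y <= range; ++y)
            for (int z = -range; z <= range; ++z)
                if (!(max_int(x, y) >= min_int(x, z)))
                    ++failures;
    return failures;
}
```

A count of zero for every range is what lets the comparison be replaced by the constant true.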
Duncan Sands 0a9c1246d7 Implement some basic simplifications involving min/max, for example
max(a,b) >= a -> true.  According to my super-optimizer, these are
by far the most common simplifications (of the -instsimplify kind)
that occur in the testsuite and aren't caught by -std-compile-opts.

llvm-svn: 130780
2011-05-03 19:53:10 +00:00
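
As a rough illustration of the example in this message, max(a,b) >= a can be checked exhaustively over a small range. The function names below are assumptions made for this sketch only:

```c
/* Hypothetical helper: signed maximum of two ints. */
static int max_of(int a, int b) { return a > b ? a : b; }

/* Counts pairs in [-range, range]^2 for which max(a, b) >= a fails. */
int count_max_ge_failures(int range) {
    int failures = 0;
    for (int a = -range; a <= range; ++a)
        for (int b = -range; b <= range; ++b)
            if (!(max_of(a, b) >= a))
                ++failures;
    return failures;
}
```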
Duncan Sands f91c5ab341 Fix PR9579: when simplifying a compare to "true" or "false", and it was
a vector compare, generate a vector result rather than i1 (and crashing).

llvm-svn: 130706
2011-05-02 18:51:41 +00:00
Duncan Sands a3e3699c88 Move some rem transforms out of instcombine and into instsimplify.
This automagically provides a transform noticed by my super-optimizer
as occurring quite often: "rem x, (select cond, x, 1)" -> 0.

llvm-svn: 130694
2011-05-02 16:27:02 +00:00
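
The quoted transform follows from two simpler facts: x rem x is 0 and x rem 1 is 0, so both arms of the select give 0. A small C sketch of the same pattern, using unsigned values and skipping x == 0 (division by zero is undefined in C, as it is in the IR); the names are hypothetical:

```c
/* C rendering of "rem x, (select cond, x, 1)". */
unsigned rem_select(unsigned x, int cond) {
    return x % (cond ? x : 1u);
}

/* Counts inputs with 1 <= x <= limit where the remainder is not 0. */
int count_nonzero_rem(unsigned limit) {
    int nonzero = 0;
    for (unsigned x = 1; x <= limit; ++x)
        for (int cond = 0; cond <= 1; ++cond)
            if (rem_select(x, cond) != 0)
                ++nonzero;
    return nonzero;
}
```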
Benjamin Kramer 9aa91b1f4e InstCombine: Turn (zext A) udiv (zext B) into (zext (A udiv B)). Same for urem or constant B.
This obviously helps a lot if the division would be turned into a libcall
(think i64 udiv on i386), but div is also one of the few remaining instructions
on modern CPUs that become more expensive when the bitwidth gets bigger.

This also helps register pressure on i386 when dividing chars, divb needs
two 8-bit parts of a 16 bit register as input where divl uses two registers.

int foo(unsigned char a) { return a/10; }
int bar(unsigned char a, unsigned char b) { return a/b; }

compiles into (x86_64)
_foo:
  imull $205, %edi, %eax
  shrl  $11, %eax
  ret
_bar:
  movzbl        %dil, %eax
  divb  %sil, %al
  movzbl        %al, %eax
  ret

llvm-svn: 130615
2011-04-30 18:16:07 +00:00
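
The transform relies on the quotient and remainder of two zero-extended 8-bit values always fitting in 8 bits. A minimal sketch that checks this for every pair of bytes; the function names are hypothetical, and because C promotes the operands to int, the (uint8_t) cast only models the narrow result:

```c
#include <stdint.h>

/* Wide form: extend both operands first, then divide. */
uint32_t udiv_wide(uint8_t a, uint8_t b) { return (uint32_t)a / (uint32_t)b; }
uint32_t urem_wide(uint8_t a, uint8_t b) { return (uint32_t)a % (uint32_t)b; }

/* Narrow form: keep only 8 bits of the result, then extend. */
uint32_t udiv_narrow(uint8_t a, uint8_t b) { return (uint32_t)(uint8_t)(a / b); }
uint32_t urem_narrow(uint8_t a, uint8_t b) { return (uint32_t)(uint8_t)(a % b); }

/* Counts byte pairs (b != 0) where the two forms disagree. */
int count_zext_div_mismatches(void) {
    int mismatches = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 1; b < 256; ++b) {
            uint8_t x = (uint8_t)a, y = (uint8_t)b;
            if (udiv_wide(x, y) != udiv_narrow(x, y)) ++mismatches;
            if (urem_wide(x, y) != urem_narrow(x, y)) ++mismatches;
        }
    }
    return mismatches;
}
```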
Benjamin Kramer 57b3df59b9 Use SimplifyDemandedBits on div instructions.
This folds away silly stuff like (a&255)/1000 -> 0.

llvm-svn: 130614
2011-04-30 18:16:00 +00:00
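
The example fold works because the mask bounds the dividend: a & 255 is at most 255, which is below 1000, so the unsigned quotient is always 0. A tiny sketch, with a hypothetical function name:

```c
/* (a & 255) / 1000: the dividend never exceeds 255, so the result is 0. */
unsigned masked_div(unsigned a) {
    return (a & 255u) / 1000u;
}

/* Counts values in [0, limit] for which the quotient is not 0. */
int count_nonzero_masked_div(unsigned limit) {
    int nonzero = 0;
    for (unsigned a = 0; a <= limit; ++a)
        if (masked_div(a) != 0)
            ++nonzero;
    return nonzero;
}
```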
Benjamin Kramer 6a50bbd284 FileCheckize.
llvm-svn: 130613
2011-04-30 18:15:53 +00:00
Peter Collingbourne 616044acd5 SimplifyCFG: Expose phi node folding cost threshold as command line parameter
llvm-svn: 130528
2011-04-29 18:47:38 +00:00
Peter Collingbourne e3511e15e0 SimplifyCFG: Add CostRemaining parameter to DominatesMergePoint
llvm-svn: 130527
2011-04-29 18:47:31 +00:00
Peter Collingbourne 61f6602acd SimplifyCFG: Add Trunc, ZExt and SExt to the list of cheap instructions for phi node folding
llvm-svn: 130526
2011-04-29 18:47:25 +00:00
Benjamin Kramer 16f18ed7b5 InstCombine: turn ((C1 << A) << C2) into ((C1 << C2) << A)
Fixes PR9809.

llvm-svn: 130485
2011-04-29 08:15:41 +00:00
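
Reordering the shifts is presumably useful because C1 << C2 involves only constants and can be folded, leaving a single variable shift. A sketch that checks the two orders agree for 32-bit values when each shift amount is below 32; the function names and sample constants are assumptions for illustration:

```c
#include <stdint.h>

uint32_t shift_original(uint32_t c1, unsigned a, unsigned c2) {
    return (c1 << a) << c2;
}

uint32_t shift_reordered(uint32_t c1, unsigned a, unsigned c2) {
    return (c1 << c2) << a;
}

/* Counts (constant, a, c2) combinations where the two orders differ.
   Shift amounts stay in [0, 31] to avoid undefined behaviour. */
int count_shift_mismatches(void) {
    const uint32_t constants[] = {1u, 3u, 0x80u, 0xDEADBEEFu};
    int mismatches = 0;
    for (unsigned i = 0; i < sizeof constants / sizeof constants[0]; ++i)
        for (unsigned a = 0; a < 32; ++a)
            for (unsigned c2 = 0; c2 < 32; ++c2)
                if (shift_original(constants[i], a, c2) !=
                    shift_reordered(constants[i], a, c2))
                    ++mismatches;
    return mismatches;
}
```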
Chris Lattner 1777601a74 final step needed to resolve PR6627, which allows us to flatten the code down to
a nice and tidy:
  %x1 = load i32* %0, align 4
  %1 = icmp eq i32 %x1, 1179403647
  br i1 %1, label %if.then, label %if.end

instead of doing lots of loads and branches.  May the FreeBSD bootloader
long fit in its allocated space.

llvm-svn: 130416
2011-04-28 18:15:47 +00:00
Benjamin Kramer 4145c0d3b1 InstCombine: Merge "(trunc x) == C1 & (and x, CA) == C2" into a single and+icmp.
This happens when GVN widens loads. Part of PR6627.

llvm-svn: 130405
2011-04-28 16:58:40 +00:00
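
A sketch of the merge for one concrete case, assuming an 8-bit truncation and a mask CA that does not overlap the low byte. The constants below are illustrative choices, not values from the patch:

```c
#include <stdint.h>

/* Illustrative constants (assumptions for this sketch). */
#define C1 0x7Fu    /* expected value of the truncated low byte */
#define CA 0xFF00u  /* mask applied by the 'and' */
#define C2 0x4500u  /* expected value of the masked bits */

/* Original form: two compares joined by a logical and. */
int two_compares(uint32_t x) {
    return (uint8_t)x == C1 && (x & CA) == C2;
}

/* Merged form: a single and + compare over the union of both masks. */
int one_compare(uint32_t x) {
    return (x & (0xFFu | CA)) == (C1 | C2);
}

/* Counts values in [0, limit] where the two forms disagree. */
int count_merge_mismatches(uint32_t limit) {
    int mismatches = 0;
    for (uint32_t x = 0; x <= limit; ++x)
        if (two_compares(x) != one_compare(x))
            ++mismatches;
    return mismatches;
}
```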
Chris Lattner 827a270a2a teach GVN to widen integer loads when they are overaligned, when doing a
wider load would allow elimination of subsequent loads, and when the wider
load is still a native integer type.  This eliminates a ton of loads on 
various benchmarks involving struct fields, though it is somewhat hobbled
by clang not being very aggressive about field alignment.

This is yet another step along the way towards resolving PR6627.

llvm-svn: 130390
2011-04-28 07:29:08 +00:00
Andrew Trick 29ac7b8858 Fixes PR9730: indvars: An asserting value handle still pointed to this value
Modified LinearFunctionTestReplace to push the condition on the dead
list instead of eagerly deleting it. This can cause unnecessary
IV rewrites, which should have no effect on codegen and will not be an
issue once we stop generating canonical IVs.

llvm-svn: 130340
2011-04-27 23:00:03 +00:00
Devang Patel 12bf0ab4b5 Simplify cfg inserts a call to trap when unreachable code is detected. Assign DebugLoc to this new trap instruction.
llvm-svn: 130315
2011-04-27 17:59:27 +00:00
Chris Lattner 6b96621a8a remove support for llvm.invariant.end from memdep. It is a
work-in-progress that is not progressing, and it has issues.

llvm-svn: 130247
2011-04-26 21:50:51 +00:00
Chris Lattner 029afe4787 make a couple of changes to the standard pass pipeline:
1. Only run the early (in the module pass pipe) instcombine/simplifycfg
   if the "unit at a time" passes they are cleaning up after run.

2. Move the "clean up after the unroller" pass to the very end of the
   function-level pass pipeline.  Loop unroll uses instsimplify now,
   so it doesn't create a ton of trash.  Moving instcombine later allows
   it to clean up after opportunities are exposed by GVN, DSE, etc.

3. Introduce some phase ordering tests for things that are specifically
   intended to be simplified by the full optimizer as a whole.

This resolves PR2338, and is progress towards PR6627, which will be 
generating code that looks similar to test2.

llvm-svn: 130241
2011-04-26 20:45:33 +00:00
Chris Lattner 1b06c71668 Transform: "icmp eq (trunc (lshr(X, cst1))), cst" to "icmp (and X, mask), cst"
when X has multiple uses.  This is useful for exposing secondary optimizations,
but the X86 backend isn't ready for this when X has a single use.  For example,
this can disable load folding.

This is inching towards resolving PR6627.

llvm-svn: 130238
2011-04-26 20:18:20 +00:00
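
One concrete instance of this rewrite, assuming a shift by 8 and a truncation to 8 bits, so that the mask is 0xFF00 and the constant is shifted left by 8. The specific numbers are illustrative only:

```c
#include <stdint.h>

/* Original form: shift right, truncate to 8 bits, compare. */
int shifted_compare(uint32_t x) {
    return (uint8_t)(x >> 8) == 0x45;
}

/* Rewritten form: mask the same bits in place and compare
   against the constant moved to the same bit position. */
int masked_compare(uint32_t x) {
    return (x & 0xFF00u) == 0x4500u;
}

/* Counts values in [0, limit] where the two forms disagree. */
int count_trunc_lshr_mismatches(uint32_t limit) {
    int mismatches = 0;
    for (uint32_t x = 0; x <= limit; ++x)
        if (shifted_compare(x) != masked_compare(x))
            ++mismatches;
    return mismatches;
}
```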
Chris Lattner eb045f9c02 Improve the bail-out predicate to really only kick in when phi
translation fails.  We were bailing out in some cases that would
cause us to miss GVN'ing some non-local cases away.

llvm-svn: 130206
2011-04-26 17:41:02 +00:00
Chris Lattner 6f83d06ffa Enhance MemDep: When alias analysis returns a partial alias result,
return it as a clobber.  This allows GVN to do smart things.

Enhance GVN to be smart about the case when a small load is clobbered
by a larger overlapping load.  In this case, forward the value.  This
allows us to compile stuff like this:

int test(void *P) {
  int tmp = *(unsigned int*)P;
  return tmp+*((unsigned char*)P+1);
}

into:

_test:                                  ## @test
	movl	(%rdi), %ecx
	movzbl	%ch, %eax
	addl	%ecx, %eax
	ret

which has one load.  We already handled the case where the smaller
load was from a must-aliased base pointer.

llvm-svn: 130180
2011-04-26 01:21:15 +00:00
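
At the source level, the forwarding described above amounts to taking the byte out of the word that was already loaded instead of loading it again. A sketch under stated assumptions: memcpy stands in for the pointer casts, the byte position is chosen by a runtime endianness probe, and all names are hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise. */
static int is_little_endian(void) {
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Two loads: the 32-bit word, then the byte at offset 1. */
uint32_t sum_two_loads(const void *p) {
    uint32_t word;
    uint8_t byte1;
    memcpy(&word, p, sizeof word);
    memcpy(&byte1, (const uint8_t *)p + 1, 1);
    return word + byte1;
}

/* One load: the byte at offset 1 is extracted from the word. */
uint32_t sum_one_load(const void *p) {
    uint32_t word;
    memcpy(&word, p, sizeof word);
    const unsigned shift = is_little_endian() ? 8u : 16u;
    return word + ((word >> shift) & 0xFFu);
}

/* Compares both versions on a fixed sample buffer. */
int loads_agree(void) {
    const uint8_t sample[4] = {0x11, 0x22, 0x33, 0x44};
    return sum_two_loads(sample) == sum_one_load(sample);
}
```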
Cameron Zwarich ca4c633489 Fix another case of <rdar://problem/9184212> that only occurs with code
generated by llvm-gcc, since llvm-gcc uses 2 i64s for passing a 4 x float
vector on ARM rather than an i64 array like Clang.

llvm-svn: 129878
2011-04-20 21:48:38 +00:00
Frits van Bommel d097212a08 Add test cases for Jay's r129641 and fix a 32-bit-centric testcase in a file with a 64-bit datalayout.
llvm-svn: 129643
2011-04-16 14:31:50 +00:00
Chris Lattner 0ab5e2cded Fix a ton of comment typos found by codespell. Patch by
Luis Felipe Strano Moraes!

llvm-svn: 129558
2011-04-15 05:18:47 +00:00
Eli Friedman 2395626605 Add an instcombine for constructs like a | -(b != c); a select is more
canonical, and generally leads to better code.  Found while looking at
an article about saturating arithmetic.

llvm-svn: 129545
2011-04-14 22:41:27 +00:00
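
The two forms are equivalent because -(b != c) is either all ones or zero, so the or yields all ones or a unchanged. A sketch using unsigned 32-bit values, with hypothetical function names:

```c
#include <stdint.h>

/* Original form: a | -(b != c). */
uint32_t or_neg_form(uint32_t a, uint32_t b, uint32_t c) {
    return a | (0u - (uint32_t)(b != c));
}

/* Select form: all ones when b != c, otherwise a. */
uint32_t select_form(uint32_t a, uint32_t b, uint32_t c) {
    return (b != c) ? UINT32_MAX : a;
}

/* Counts triples in [0, limit)^3 where the two forms disagree. */
int count_or_neg_mismatches(uint32_t limit) {
    int mismatches = 0;
    for (uint32_t a = 0; a < limit; ++a)
        for (uint32_t b = 0; b < limit; ++b)
            for (uint32_t c = 0; c < limit; ++c)
                if (or_neg_form(a, b, c) != select_form(a, b, c))
                    ++mismatches;
    return mismatches;
}
```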
Owen Anderson 92651ec374 Fix an infinite alternation in JumpThreading where two transforms would repeatedly undo each other. The solution is to perform more aggressive constant folding to make one of the edges just folded away rather than trying to thread it.
Fixes <rdar://problem/9284786>.

Discovered with CSmith.

llvm-svn: 129538
2011-04-14 21:35:50 +00:00
Mon P Wang 2e5528f0b2 Vectors with different number of elements of the same element type can have
the same allocation size but different primitive sizes (e.g., <3xi32> and
<4xi32>).  When ScalarRepl promotes them, it can't use a bit cast but
should use a shuffle vector instead.

llvm-svn: 129472
2011-04-13 21:40:02 +00:00
Dan Gohman 1c6c34834b Fix reassociate to use a worklist instead of recursing when new
reassociation opportunities are exposed. This fixes a bug where
the nested reassociation expects the IR to be consistent,
but it isn't, because the outer reassociation has disconnected
some of the operands.  rdar://9167457

llvm-svn: 129324
2011-04-12 00:11:56 +00:00
Chris Lattner e81d045d94 remove the StructRetPromotion pass. It is unused, not maintained and
has some bugs.  If this is interesting functionality, it should be 
reimplemented in the argpromotion pass.

llvm-svn: 129314
2011-04-11 23:09:44 +00:00
Eli Friedman 9cca0715aa Add back a couple checks removed by r129128; the fact that an initializer
is an array of structures doesn't imply it's a ConstantArray of
ConstantStruct.

llvm-svn: 129207
2011-04-09 09:11:09 +00:00
Chris Lattner 88974f4625 fix PR9523, a crash in looprotate on a non-canonical loop made out of indirectbr.
llvm-svn: 129203
2011-04-09 07:25:58 +00:00
Eli Friedman 17822fcde9 PR9604; try to deal with RAUW updates correctly in the AST. I'm not convinced
it's completely safe to cache the AST across LICM runs even with this fix,
but this fix can't hurt.

llvm-svn: 129198
2011-04-09 06:55:46 +00:00
Eli Friedman 4db39cefdb Test for r129190.
llvm-svn: 129197
2011-04-09 06:39:43 +00:00
Devang Patel bc3d8b212f Do not let debug info interfere with branch folding.
llvm-svn: 129114
2011-04-07 23:11:25 +00:00
Devang Patel 197c35298a While hoisting common code from if/else, hoist debug info intrinsics if they match.
llvm-svn: 129078
2011-04-07 17:27:36 +00:00
Eli Friedman c5f22a7815 PR9634: Don't unconditionally tell the AliasSetTracker that the PreheaderLoad
is equivalent to any other relevant value; it isn't true in general.
If it is equivalent, the LoopPromoter will tell the AST the equivalence.
Also, delete the PreheaderLoad if it is unused.

Chris, since you were the last one to make major changes here, can you check
that this is sane?

llvm-svn: 129049
2011-04-07 01:35:06 +00:00
Nadav Rotem cc771acd77 This testcase passed even without the fix. Added the target info to make the
test fail (without the fix). Thanks Dan.

llvm-svn: 128999
2011-04-06 11:18:29 +00:00
Nadav Rotem a069c6ce05 InstCombine optimizes gep(bitcast(x)) even when the bitcast casts away address
space info. We crash with an assert in this case. This change checks that the
address space of the bitcasted pointer is the same as the gep ptr.

llvm-svn: 128884
2011-04-05 14:29:52 +00:00
Eli Friedman 17bf4922c9 PR9446: RecursivelyDeleteTriviallyDeadInstructions can delete the instruction
after the given instruction; make sure to handle that case correctly.
(It's difficult to trigger; the included testcase involves a dead 
block, but I don't think that's a requirement.) 

While I'm here, get rid of the unnecessary warning about
SimplifyInstructionsInBlock, since it should work correctly as far as I know.

llvm-svn: 128782
2011-04-02 22:45:17 +00:00
Benjamin Kramer d121765e64 InstCombine: Turn icmp + sext into bitwise/integer ops when the input has only one unknown bit.
int test1(unsigned x) { return (x&8) ? 0 : -1; }
int test3(unsigned x) { return (x&8) ? -1 : 0; }

before (x86_64):
_test1:
	andl	$8, %edi
	cmpl	$1, %edi
	sbbl	%eax, %eax
	ret
_test3:
	andl	$8, %edi
	cmpl	$1, %edi
	sbbl	%eax, %eax
	notl	%eax
	ret

after:
_test1:
	shrl	$3, %edi
	andl	$1, %edi
	leal	-1(%rdi), %eax
	ret
_test3:
	shll	$28, %edi
	movl	%edi, %eax
	sarl	$31, %eax
	ret

llvm-svn: 128732
2011-04-01 20:09:10 +00:00
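
The "after" code for test1 corresponds to ((x >> 3) & 1) - 1 at the source level. For test3 the sketch below uses a negation of the extracted bit rather than the shl/sar pair shown in the assembly, since right-shifting a negative int is implementation-defined in C; that substitution and the function names are assumptions of this sketch:

```c
/* Select forms, as written in the commit message. */
int test1_select(unsigned x) { return (x & 8u) ? 0 : -1; }
int test3_select(unsigned x) { return (x & 8u) ? -1 : 0; }

/* Branch-free forms built from the single unknown bit (bit 3). */
int test1_bitops(unsigned x) { return (int)((x >> 3) & 1u) - 1; }
int test3_bitops(unsigned x) { return -(int)((x >> 3) & 1u); }

/* Counts values in [0, limit] where a select form and its
   bit-operation form disagree. */
int count_sext_icmp_mismatches(unsigned limit) {
    int mismatches = 0;
    for (unsigned x = 0; x <= limit; ++x) {
        if (test1_select(x) != test1_bitops(x)) ++mismatches;
        if (test3_select(x) != test3_bitops(x)) ++mismatches;
    }
    return mismatches;
}
```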
Nadav Rotem d74b72b8a9 InstCombine optimization: extractelement(cast) -> cast(extractelement)
llvm-svn: 128683
2011-03-31 22:57:29 +00:00
Benjamin Kramer 5291054ef1 InstCombine: APFloat can't perform arithmetic on PPC double doubles, don't even try.
Thanks Eli!

llvm-svn: 128676
2011-03-31 21:35:49 +00:00
Benjamin Kramer be209ab8a2 InstCombine: Fix transform to use the swapped predicate.
Thanks Frits!

llvm-svn: 128628
2011-03-31 10:46:03 +00:00
Benjamin Kramer d159d94644 InstCombine: fold fcmp (fneg x), (fneg y) -> fcmp x, y
llvm-svn: 128627
2011-03-31 10:12:22 +00:00
Benjamin Kramer a8c5d0872d InstCombine: fold fcmp pred (fneg x), C -> fcmp swap(pred) x, -C
llvm-svn: 128626
2011-03-31 10:12:15 +00:00
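
For the less-than predicate, this fold says that -x < C is the same test as x > -C. A sketch that compares the two over a handful of sample values, including signed zeros, infinities and NaN; the sample set and names are assumptions for illustration:

```c
#include <math.h>
#include <stddef.h>

/* Original form: compare the negated value against c. */
int lt_negated(double x, double c) { return -x < c; }

/* Folded form: swapped predicate, negated constant. */
int gt_swapped(double x, double c) { return x > -c; }

/* Counts sample pairs where the two forms disagree. */
int count_fneg_mismatches(void) {
    const double samples[] = {-INFINITY, -2.5, -0.0, 0.0, 1.0,
                              1.5, 3.0, INFINITY, NAN};
    const size_t n = sizeof samples / sizeof samples[0];
    int mismatches = 0;
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            if (lt_negated(samples[i], samples[j]) !=
                gt_swapped(samples[i], samples[j]))
                ++mismatches;
    return mismatches;
}
```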
Benjamin Kramer cbb18e91a8 InstCombine: Shrink "fcmp (fpext x), C" to "fcmp x, C" if C can be losslessly converted to the type of x.
Fixes PR9592.

llvm-svn: 128625
2011-03-31 10:12:07 +00:00
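
The lossless-conversion condition matters because a constant such as 0.1 changes value when rounded to float, while 0.5 does not. A sketch of the check and of the two compare forms, assuming IEEE-754 float and double without excess precision; all names are hypothetical:

```c
#include <stddef.h>

/* True when c survives a round trip through float unchanged. */
int is_lossless_as_float(double c) {
    return (double)(float)c == c;
}

/* Wide form: extend x, compare in double. */
int wide_le(float x, double c) { return (double)x <= c; }

/* Narrow form: convert the constant, compare in float. */
int narrow_le(float x, double c) { return x <= (float)c; }

/* Counts sample values where the two forms disagree for constant c. */
int count_fpext_mismatches(double c) {
    const float samples[] = {-1.0f, 0.0f, 0.1f, 0.25f, 0.5f, 0.75f, 2.0f};
    const size_t n = sizeof samples / sizeof samples[0];
    int mismatches = 0;
    for (size_t i = 0; i < n; ++i)
        if (wide_le(samples[i], c) != narrow_le(samples[i], c))
            ++mismatches;
    return mismatches;
}
```

The narrowed compare is only safe to use when is_lossless_as_float(c) holds, which is the guard the commit message describes.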
Benjamin Kramer 2ccfbc8b71 InstCombine: fold fcmp (fpext x), (fpext y) -> fcmp x, y.
llvm-svn: 128624
2011-03-31 10:11:58 +00:00