Commit Graph

6730 Commits

Author SHA1 Message Date
Chris Lattner 6aa6b1f263 Teach ConvertUsesToScalar to handle memset, allowing it to handle
crazy cases like:

struct f {  int A, B, C, D, E, F; };
short test4() {
  struct f A;
  A.A = 1;
  memset(&A.B, 2, 12);
  return A.C;
}

llvm-svn: 63596
2009-02-03 02:01:43 +00:00
Dan Gohman 7aa0c17cff Delete these two tests. They are specific to x86-64, and there's no
reliable way to do this with the current dejagnu infrastructure.
If someone can figure out how to fix these tests so that they test
what they are intended to test without spuriously failing on any
popular platforms, they are invited to reinstate them.

llvm-svn: 63592
2009-02-03 01:33:26 +00:00
Chris Lattner 09b65ab288 rearrange how SRoA handles promotion of allocas to vectors.
With the new world order, it can handle cases where the first
store into the alloca is an element of the vector, instead of
requiring the first analyzed store to have the vector type 
itself.  This allows us to un-xfail 
test/CodeGen/X86/vec_ins_extract.ll.

llvm-svn: 63590
2009-02-03 01:30:09 +00:00
Chris Lattner a0ce5f060d this test produces an undefined value, we don't care
what it is, but we do want the alloca promoted.

llvm-svn: 63587
2009-02-03 01:13:52 +00:00
Bill Wendling b0ad6f9a6c It fails on Linux. XFAIL that machine.
llvm-svn: 63582
2009-02-03 00:35:11 +00:00
Bill Wendling 423f3bc196 This is passing for us. Should it have been reenabled?
llvm-svn: 63580
2009-02-03 00:27:09 +00:00
Dan Gohman 7948ef5f87 Add explicit -march=x86 to these tests so that they don't
default to -march=x86-64 on 64-bit hosts.

llvm-svn: 63579
2009-02-03 00:20:22 +00:00
Dan Gohman f58f0cbfd5 Fix another test to not use -mcpu=yonah with 64-bit code.
llvm-svn: 63572
2009-02-02 23:43:59 +00:00
Dan Gohman e862b3dd96 Yonah does not support x86-64. Change the -mcpu value to one that does.
llvm-svn: 63561
2009-02-02 22:50:08 +00:00
Devang Patel dd5dbca59c Run dsymutil on darwin, when it is expected, before running gdb test.
llvm-svn: 63548
2009-02-02 21:09:36 +00:00
Chris Lattner c81fdd1773 xfail this for now, will fix shortly.
llvm-svn: 63533
2009-02-02 18:15:33 +00:00
Chris Lattner 64217e6a28 update test
llvm-svn: 63532
2009-02-02 18:12:58 +00:00
Chris Lattner 18eba4f211 Fix a bug which caused us to miscompile a couple of Ada
tests.  Thanks for the beautiful reduced testcase Duncan!

llvm-svn: 63529
2009-02-02 18:02:59 +00:00
Devang Patel 97ba824ad9 Do not add redundant arguments in a method definition DIE.
llvm-svn: 63527
2009-02-02 17:51:41 +00:00
Devang Patel e7a112111a Make this test case smaller.
llvm-svn: 63526
2009-02-02 17:50:43 +00:00
Duncan Sands 7e4cb0a1cf This passes on x86-32 linux at least.
llvm-svn: 63508
2009-02-02 09:10:57 +00:00
Duncan Sands dca376ff07 Make the XFAIL line actually match x86-32 targets.
llvm-svn: 63507
2009-02-02 09:07:13 +00:00
Evan Cheng 50e15bdf81 Teach LowerBRCOND to recognize (xor (setcc x), 1). The xor inverts the condition. It's normally transformed by the dag combiner, unless the condition is set by a arithmetic op with overflow.
llvm-svn: 63505
2009-02-02 08:07:36 +00:00
Chris Lattner 1f386b8ec8 Fix PR3372
llvm-svn: 63501
2009-02-02 07:24:28 +00:00
Chris Lattner c4eb63d412 reduce testcase.
llvm-svn: 63499
2009-02-02 06:55:45 +00:00
Torok Edwin c418287974 add 2 more testcases for -mattr=-sse (r63495).
--This line, and those below, will be ignaored--

A    test/CodeGen/X86/nosse-error1.ll
A    test/CodeGen/X86/nosse-error2.ll

llvm-svn: 63496
2009-02-01 18:24:20 +00:00
Torok Edwin a2d1f35e9a Implement -mno-sse: if SSE is disabled on x86-64, don't store XMM on stack for
var-args, and don't allow FP return values

llvm-svn: 63495
2009-02-01 18:15:56 +00:00
Duncan Sands 3ed768868d Fix PR3453 and probably a bunch of other potential
crashes or wrong code with codegen of large integers:
eliminate the legacy getIntegerVTBitMask and
getIntegerVTSignBit methods, which returned their
value as a uint64_t, so couldn't handle huge types.

llvm-svn: 63494
2009-02-01 18:06:53 +00:00
Nick Lewycky f23908151a Reinstate this optimization to fold icmp of xor when possible. Don't try to
turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This
may have been increasing register pressure leading to the bzip2 slowdown.

llvm-svn: 63487
2009-01-31 21:30:05 +00:00
Chris Lattner 9e2b9f3234 Fix PR3452 (an infinite loop bootstrapping) by disabling the recent
improvements to the EvaluateInDifferentType code.  This code works 
by just inserted a bunch of new code and then seeing if it is 
useful.  Instcombine is not allowed to do this: it can only insert
new code if it is useful, and only when it is converging to a more
canonical fixed point.  Now that we iterate when DCE makes progress,
this causes an infinite loop when the code ends up not being used.

llvm-svn: 63483
2009-01-31 19:05:27 +00:00
Duncan Sands 41826036b1 Fix PR3401: when using large integers, the type
returned by getShiftAmountTy may be too small
to hold shift values (it is an i8 on x86-32).
Before and during type legalization, use a large
but legal type for shift amounts: getPointerTy;
afterwards use getShiftAmountTy, fixing up any
shift amounts with a big type during operation
legalization.  Thanks to Dan for writing the
original patch (which I shamelessly pillaged).

llvm-svn: 63482
2009-01-31 15:50:11 +00:00
Chris Lattner 76a63ed099 now that all the pieces are in place, teach instcombine's
simplifydemandedbits to simplify instructions with *multiple
uses* in contexts where it can get away with it.  This allows
it to simplify the code in multi-use-or.ll into a single 'add 
double'.

This change is particularly interesting because it will cover
up for some common codegen bugs with large integers created due
to the recent SROA patch.  When working on fixing those bugs,
this should be disabled.

llvm-svn: 63481
2009-01-31 08:40:03 +00:00
Chris Lattner 94cfb281c3 make sure to set Changed=true when instcombine hacks on the code,
not doing so prevents it from properly iterating and prevents it
from deleting the entire body of dce-iterate.ll

llvm-svn: 63476
2009-01-31 07:04:22 +00:00
Mon P Wang b6080cf943 Used "-enable-unsafe-fp-math" to allow this transformation - (a * b -c) = c - a *b.
llvm-svn: 63475
2009-01-31 06:50:54 +00:00
Mon P Wang cf9ba82324 If unsafe FP optimization is not set, don't allow -(A-B) => B-A because
when A==B, -0.0 != +0.0.

llvm-svn: 63474
2009-01-31 06:07:45 +00:00
Chris Lattner ec99c46d44 Simplify and generalize the SROA "convert to scalar" transformation to
be able to handle *ANY* alloca that is poked by loads and stores of 
bitcasts and GEPs with constant offsets.  Before the code had a number
of annoying limitations and caused it to miss cases such as storing into
holes in structs and complex casts (as in bitfield-sroa) where we had
unions of bitfields etc.  This also handles a number of important cases
that are exposed due to the ABI lowering stuff we do to pass stuff by
value.

One case that is pretty great is that we compile 
2006-11-07-InvalidArrayPromote.ll into:

define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind {
	%tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1)
	%tmp105 = bitcast <4 x i32> %tmp10 to i128
	%tmp1056 = zext i128 %tmp105 to i256	
	%tmp.upgrd.43 = lshr i256 %tmp1056, 96
	%tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32	
	ret i32 %tmp.upgrd.44
}

which turns into:

_func:
	subl	$28, %esp
	cvttps2dq	%xmm1, %xmm0
	movaps	%xmm0, (%esp)
	movl	12(%esp), %eax
	addl	$28, %esp
	ret

Which is pretty good code all things considering :).

One effect of this is that SROA will start generating arbitrary bitwidth 
integers that are a multiple of 8 bits.  In the case above, we got a 
256 bit integer, but the codegen guys assure me that it can handle the 
simple and/or/shift/zext stuff that we're doing on these operations.

This addresses rdar://6532315

llvm-svn: 63469
2009-01-31 02:28:54 +00:00
Devang Patel c094970cd2 Each input file is encoded as a separate compile unit in LLVM debugging
information output. However, many target specific tool chains prefer to encode
only one compile unit in an object file. In this situation, the LLVM code
generator will include  debugging information entities in the compile unit 
that is marked as main compile unit. The code generator accepts maximum one main
compile unit per module. If a module does not contain any main compile unit 
then the code generator will emit multiple compile units in the output object 
file.

[Part 1]

Update DebugInfo APIs to accept optional boolean value while creating DICompileUnit  to mark the unit as "main" unit. By defaults all units are considered  non-main.  Update SourceLevelDebugging.html to document "main" compile unit.

Update DebugInfo APIs to not accept and encode separate source file/directory entries while creating various llvm.dbg.* entities. There was a recent, yet to be documented, change to include this additional information so no documentation changes are required here.

Update DwarfDebug to handle "main" compile unit. If "main" compile unit is seen then all DIEs are inserted into "main" compile unit. All other compile units are used to find source location for llvm.dbg.* values. If there is not any "main" compile unit then create unique compile unit DIEs for each llvm.dbg.compile_unit.

[Part 2]

Create separate llvm.dbg.compile_unit for each input file. Mark compile unit create for main_input_filename as "main" compile unit. Use appropriate compile unit, based on source location information collected from the tree node, while creating llvm.dbg.* values using DebugInfo APIs.

---

This is Part 1.

llvm-svn: 63400
2009-01-30 18:20:31 +00:00
Zhou Sheng 1e36fbb5ed This is case is to uncover the bug in IntrinsicLowering.cpp,
the LowerPartSet(). It didn't handle the situation correctly when 
the low, high argument values are in reverse order (low > high) 
with 'Val' type is i32 (a corner case).

llvm-svn: 63386
2009-01-30 08:59:51 +00:00
Devang Patel acbb381cc4 Enable target tripple.
llvm-svn: 63361
2009-01-30 01:40:58 +00:00
Devang Patel a103ee4f0d Linux and other target's encoding for DW_AT_declaration may not match.
llvm-svn: 63360
2009-01-30 01:37:30 +00:00
Devang Patel 4ba91058d2 Add DW_AT_declaration for class methods.
llvm-svn: 63356
2009-01-30 01:21:46 +00:00
Owen Anderson ad89c410e6 XFAIL this test. It only worked before because of a bug in the spill point selection code. Not deleting because
it should be possible to enhance the selection code to handle this in the future.

llvm-svn: 63340
2009-01-29 22:27:56 +00:00
Evan Cheng a160d4af82 Local register allocator shouldn't assume only the entry and landing pad basic blocks have live-ins.
llvm-svn: 63323
2009-01-29 18:37:30 +00:00
Dan Gohman ef04ed5477 In the case of an extractelement on an insertelement value,
the element indices may be equal if either one is not a
constant.

llvm-svn: 63311
2009-01-29 16:10:46 +00:00
Evan Cheng a115859df0 Add a always_inline test case.
llvm-svn: 63304
2009-01-29 09:31:54 +00:00
Evan Cheng 45799abe61 Add a test case for Chris lvalue alignment fixes.
llvm-svn: 63300
2009-01-29 08:59:46 +00:00
Evan Cheng 76a2736c74 Exit with nice warnings when register allocator run out of registers.
llvm-svn: 63267
2009-01-29 02:20:59 +00:00
Dan Gohman e58ab79f33 Make x86's BT instruction matching more thorough, and add some
dagcombines that help it match in several more cases. Add
several more cases to test/CodeGen/X86/bt.ll. This doesn't
yet include matching for BT with an immediate operand, it
just covers more register+register cases.

llvm-svn: 63266
2009-01-29 01:59:02 +00:00
Mon P Wang 9150f735fa Fixed lowering of v816 shuffles.
llvm-svn: 63252
2009-01-28 23:11:14 +00:00
Bill Wendling 42b63bc175 Make test platform agnostic.
llvm-svn: 63247
2009-01-28 22:20:56 +00:00
Dan Gohman d21775ae0e Give this test an explicit target, to make it host-independent.
llvm-svn: 63244
2009-01-28 22:14:58 +00:00
Devang Patel d7ecb3b661 Do not forget to derived type while constructing an array type.
llvm-svn: 63233
2009-01-28 21:08:20 +00:00
Chris Lattner df17987c19 Fix some issues with volatility, move "CanConvertToScalar" check
after the others.

llvm-svn: 63227
2009-01-28 20:16:43 +00:00
Chris Lattner 1498e62117 strengthen this test.
llvm-svn: 63222
2009-01-28 19:29:30 +00:00
Evan Cheng f31f288863 The memory alignment requirement on some of the mov{h|l}p{d|s} patterns are 16-byte. That is overly strict. These instructions read / write f64 memory locations without alignment requirement.
llvm-svn: 63195
2009-01-28 08:35:02 +00:00
Mon P Wang d880efc005 Added sse test patterns for r62979 and r63193.
llvm-svn: 63194
2009-01-28 08:13:56 +00:00
Mikhail Glushenkov 2115d09a10 Add three new option properties.
Adds new option properties 'multi_val', 'one_or_more' and 'zero_or_one'.

llvm-svn: 63172
2009-01-28 03:47:20 +00:00
Bill Wendling fd03bdd00c Add testcase for r63142.
llvm-svn: 63149
2009-01-27 23:00:53 +00:00
Evan Cheng 1bc8af207e Implement multiple with overflow by 2 with an add instruction.
llvm-svn: 63090
2009-01-27 03:30:42 +00:00
Evan Cheng ce95cddd0f Forgot this test case.
llvm-svn: 63089
2009-01-27 02:59:39 +00:00
Dan Gohman 52e907a780 Add a FrontendC testcase for the x86-64 Red Zone feature,
to help verify that the feature may be disabled through
the -mno-red-zone option.

llvm-svn: 63079
2009-01-27 00:59:55 +00:00
Devang Patel 45c899cd15 Assorted debug info fixes.
- DW_AT_bit_size is only suitable for bitfields.
- Encode source location info for derived types.
- Source location and type size info is not useful for subroutine_type (info is included in respective DISubprogram) and array_type.

llvm-svn: 63077
2009-01-27 00:45:04 +00:00
Dan Gohman 8738997c11 Add a regression test for x86-64 red zone usage.
llvm-svn: 63075
2009-01-27 00:40:27 +00:00
Dale Johannesen 03490f0ce1 Testcase for 6522054.
llvm-svn: 63067
2009-01-26 23:22:19 +00:00
Duncan Sands d77e476921 Fix PR3393, which amounts to a bug in the expensive
checking logic.  Rather than make the checking more
complicated, I've tweaked some logic to make things
conform to how the checking thought things ought to
be, since this results in a simpler "mental model".

llvm-svn: 63048
2009-01-26 21:54:18 +00:00
Dan Gohman ac272eaf13 At Nick Lewycky's request, rename this test with a more informative name.
llvm-svn: 63042
2009-01-26 21:36:31 +00:00
Evan Cheng 6c7e85142b Enhance logic in X86DAGToDAGISel::PreprocessForRMW which move load inside callseq_start to allow it to be folded into a call. It was not considering the cases where a token factor is between the load and the callseq_start.
llvm-svn: 63022
2009-01-26 18:43:34 +00:00
Mon P Wang 3537a62704 Fixed optimization of combining two shuffles where the first shuffle inputs
has a different number of elements than the output.

llvm-svn: 62998
2009-01-26 04:39:00 +00:00
Scott Michel 9e3e4a9219 CellSPU:
- Rename fcmp.ll test to fcmp32.ll, start adding new double tests to fcmp64.ll
- Fix select_bits.ll test
- Capitulate to the DAGCombiner and move i64 constant loads to instruction
  selection (SPUISelDAGtoDAG.cpp).

  <rant>DAGCombiner will insert all kinds of 64-bit optimizations after
  operation legalization occurs and now we have to do most of the work that
  instruction selection should be doing twice (once to determine if v2i64
  build_vector can be handled by SelectCode(), which then runs all of the
  predicates a second time to select the necessary instructions.) But,
  CellSPU is a good citizen.</rant>

llvm-svn: 62990
2009-01-26 03:31:40 +00:00
Chris Lattner 9449991c4f Handle single-entry phi nodes gracefully in condprop.
llvm-svn: 62985
2009-01-26 02:18:20 +00:00
Chris Lattner 7b6647c178 Fix PR3408 by making a non-obvious assumption very obvious, and
handling the flaw inherent in that assumption.  :)

llvm-svn: 62984
2009-01-26 02:11:30 +00:00
Nate Begeman 5eca265519 Map address space 256 to gs; similar mappings could be supported for the
other x86 segments.  address space 0 is stack/default, 1-255 are reserved for
client use.

llvm-svn: 62980
2009-01-26 01:24:32 +00:00
Torok Edwin 97be2f5840 revert this patch for now, because Codegen does still want to generate SSE code,
for example in the case of va-args. XFAIL associated tests.

llvm-svn: 62972
2009-01-25 20:21:24 +00:00
Torok Edwin 3cc1940003 testcase for llvm-gcc part of PR3402.
llvm-svn: 62969
2009-01-25 18:00:06 +00:00
Torok Edwin a23c73bbdc If user explicitly asks not to use SSE, don't force it. This fixes LLVM part of PR3402.
llvm-svn: 62967
2009-01-25 17:58:56 +00:00
Nick Lewycky cb7a10ab63 Actually run the test in this directory.
llvm-svn: 62957
2009-01-25 08:05:07 +00:00
Nick Lewycky 5647c5d1a4 The function that does nothing but call malloc is noalias return.
llvm-svn: 62956
2009-01-25 07:59:57 +00:00
Evan Cheng 1c7c019229 Private linkage support for PPC / Darwin.
llvm-svn: 62955
2009-01-25 06:32:01 +00:00
Evan Cheng abda665f5f Teach 2addr pass to be do more commuting. If both uses of a two-address instruction are killed, but the first operand has a use before and after the def, commute if the second operand does not suffer from the same issue.
%reg1028<def> = EXTRACT_SUBREG %reg1027<kill>, 1                                                                                                                                     
%reg1029<def> = MOV8rr %reg1028                                                                                                                                                      
%reg1029<def> = SHR8ri %reg1029, 7, %EFLAGS<imp-def,dead>                                                                                                                            
insert => %reg1030<def> = MOV8rr %reg1028                                                                                                                                            
%reg1030<def> = ADD8rr %reg1028<kill>, %reg1029<kill>, %EFLAGS<imp-def,dead>                                                                                                         

In this case, it might not be possible to coalesce the second MOV8rr                                                                                                                 
instruction if the first one is coalesced. So it would be profitable to                                                                                                              
commute it:                                                                                                                                                                          
%reg1028<def> = EXTRACT_SUBREG %reg1027<kill>, 1                                                                                                                                     
%reg1029<def> = MOV8rr %reg1028                                                                                                                                                      
%reg1029<def> = SHR8ri %reg1029, 7, %EFLAGS<imp-def,dead>                                                                                                                            
insert => %reg1030<def> = MOV8rr %reg1029                                                                                                                                            
%reg1030<def> = ADD8rr %reg1029<kill>, %reg1028<kill>, %EFLAGS<imp-def,dead>

llvm-svn: 62954
2009-01-25 03:53:59 +00:00
Dan Gohman 73fd5a5651 Add a PR comment to this test.
llvm-svn: 62921
2009-01-24 17:32:54 +00:00
Torok Edwin f4395ea97a testcase for PR3381.
Also it was an empty struct, not a void after all.

llvm-svn: 62920
2009-01-24 17:16:04 +00:00
Chris Lattner 72cd68fe64 Make InstCombineStoreToCast handle aggregates more aggressively,
handling the case in Transforms/InstCombine/cast-store-gep.ll, which
is a heavily reduced testcase from Clang on x86-64.

llvm-svn: 62904
2009-01-24 01:00:13 +00:00
Devang Patel 486d309b34 Fix test case. Use valid file name and directory in global variable's debug info entry.
llvm-svn: 62883
2009-01-23 21:54:18 +00:00
Chris Lattner 3f4591c89f fix two more cases where we could let the NLPDI cache get unsorted.
With this, sqlite3 now passes.

llvm-svn: 62839
2009-01-23 07:12:16 +00:00
Evan Cheng f347c3615b Update test to reflect command line option name change.
llvm-svn: 62836
2009-01-23 05:45:31 +00:00
Dan Gohman 1f3411de47 Don't create ISD::FNEG nodes after legalize if they aren't legal.
Simplify x+0 to x in unsafe-fp-math mode. This avoids a bunch of
redundant work in many cases, because in unsafe-fp-math mode,
ISD::FADD with a constant is considered free to negate, so the
DAGCombiner often negates x+0 to -0-x thinking it's free, when
in reality the end result is -x, which is more expensive than x.

Also, combine x*0 to 0.

This fixes PR3374.

llvm-svn: 62789
2009-01-22 21:58:43 +00:00
Devang Patel dec7fe2e71 Do not use buggy llvm-gcc to generate testcases.
llvm-svn: 62770
2009-01-22 18:28:11 +00:00
Duncan Sands e3a26635fb Remove no-longer relevant comment. Pointed out
by Gabor.

llvm-svn: 62765
2009-01-22 15:37:29 +00:00
Duncan Sands ac6f7eeb05 This passes on linux.
llvm-svn: 62764
2009-01-22 15:07:15 +00:00
Chris Lattner bed6be62e4 fix a testcase.
llvm-svn: 62758
2009-01-22 07:08:58 +00:00
Chris Lattner f09619d533 Fix PR3358, a really nasty bug where recursive phi translated
analyses could be run without the caches properly sorted.  This
can fix all sorts of weirdness.  Many thanks to Bill for coming
up with the 'issorted' verification idea.

llvm-svn: 62757
2009-01-22 07:04:01 +00:00
Bill Wendling 6cf1f8fd5b Now with RUN line.
llvm-svn: 62716
2009-01-21 21:28:03 +00:00
Bill Wendling ba11cd338b Run this through -simplifycfg and -mem2reg to test only what we need to test.
llvm-svn: 62714
2009-01-21 21:02:27 +00:00
Dale Johannesen 1f86498f93 Do not use host floating point types when emitting
ASCII IR; loading and storing these can change the
bits of NaNs on some hosts.  Remove or add warnings
at a few other places using host floating point;
this is a bad thing to do in general.

llvm-svn: 62712
2009-01-21 20:32:55 +00:00
Dan Gohman 7e6b932f18 Simplify ReduceLoadWidth's logic: it doesn't need several different
special cases after producing the new reduced-width load, because the
new load already has the needed adjustments built into it. This fixes
several bugs due to the special cases, including PR3317.

llvm-svn: 62692
2009-01-21 15:17:51 +00:00
Dan Gohman b43c8996f2 Fix a recent regression. ClrOpcode is not set for i8; for i8, if
we want to clear %ah to zero before a division, just use a
zero-extending mov to %al. This fixes PR3366.

llvm-svn: 62691
2009-01-21 14:50:16 +00:00
Mikhail Glushenkov bf9716e15d Allow hooks with arguments.
llvm-svn: 62685
2009-01-21 13:04:00 +00:00
Duncan Sands d56cf3025f This was causing invalid memory accesses when
generating debug info in the compiler.

llvm-svn: 62684
2009-01-21 11:51:17 +00:00
Duncan Sands 1de451d0d0 Let's try to have our cake and eat it to: move
this test into FrontendC to ensure that llvm-gcc
is available; assemble using "llvm-gcc -xassembler"
rather than "as".

llvm-svn: 62683
2009-01-21 11:37:31 +00:00
Duncan Sands 696f4a8598 Don't rely on grep -w working.
llvm-svn: 62682
2009-01-21 09:41:42 +00:00
Scott Michel ed7d79fce4 CellSPU:
- Ensure that (operation) legalization emits proper FDIV libcall when needed.
- Fix various bugs encountered during llvm-spu-gcc build, along with various
  cleanups.
- Start supporting double precision comparisons for remaining libgcc2 build.
  Discovered interesting DAGCombiner feature, which is currently solved via
  custom lowering (64-bit constants are not legal on CellSPU, but DAGCombiner
  insists on inserting one anyway.)
- Update README.

llvm-svn: 62664
2009-01-21 04:58:48 +00:00
Evan Cheng 201501995f Favors generating "not" over "xor -1". For example.
unsigned test(unsigned a) {
  return ~a;
}
llvm used to generate:
movl    $4294967295, %eax
xorl    4(%esp), %eax

Now it generates:
movl      4(%esp), %eax
notl      %eax

It's 3 bytes shorter.

llvm-svn: 62661
2009-01-21 02:09:05 +00:00
Dale Johannesen 287b4bc44e Disable on x86_64 until I figure out what's wrong.
llvm-svn: 62660
2009-01-21 02:08:30 +00:00
Dale Johannesen b5721632ee Make special cases (0 inf nan) work for frem.
Besides APFloat, this involved removing code
from two places that thought they knew the
result of frem(0., x) but were wrong.

llvm-svn: 62645
2009-01-21 00:35:19 +00:00
Owen Anderson be7a29de0b Be more aggressive about renumbering vregs after splitting them.
llvm-svn: 62639
2009-01-21 00:13:28 +00:00
Devang Patel 6bbacbe372 Appropriately mark fowrad decls.
llvm-svn: 62625
2009-01-20 22:27:02 +00:00
Devang Patel 6fbec1c230 Need compile unit to find location.
llvm-svn: 62624
2009-01-20 22:26:11 +00:00
Dale Johannesen e75fdb0510 Calls to fmod, it turns out, are constant-folded by
invoking the host fmod, not by lowering to frem and
constant-folding that.  Fix this so it tests what I
want to test.

llvm-svn: 62622
2009-01-20 21:58:13 +00:00
Chris Lattner f8a8c13c1e Don't bother running the assembler, we don't know that it will be configured
for whatever llc defaults to.  This fixes PR3363

llvm-svn: 62619
2009-01-20 21:41:53 +00:00
Evan Cheng f1e873a221 Fix PR3243: a LiveVariables bug. When HandlePhysRegKill is checking whether the last reference is also the last def (i.e. dead def), it should also check if last reference is the current machine instruction being processed. This can happen when it is processing a physical register use and setting the current machine instruction as sub-register's last ref.
llvm-svn: 62617
2009-01-20 21:25:12 +00:00
Evan Cheng 4022b7c3f4 Add test case for PR3154.
llvm-svn: 62604
2009-01-20 19:29:54 +00:00
Duncan Sands 489c5484d3 Check that the "don't barf on k8" fix is not
accidentally reverted again.

llvm-svn: 62587
2009-01-20 18:08:39 +00:00
Bill Wendling a908b60fb2 Temporarily XFAIL until this can be looked at. r62557 is what caused it to start failing.
llvm-svn: 62578
2009-01-20 10:28:39 +00:00
Bill Wendling 1d9c8e5522 Testcase for limited precision stuff.
llvm-svn: 62572
2009-01-20 06:23:59 +00:00
Chris Lattner c59945b4bd another fix for PR3354
llvm-svn: 62561
2009-01-20 01:15:41 +00:00
Dan Gohman 161b7b66ac Fix a dagcombine to not generate loads of non-round integer types,
as its comment says, even in the case where it will be generating
extending loads. This fixes PR3216.

llvm-svn: 62557
2009-01-20 01:06:45 +00:00
Evan Cheng 8f79775a66 Make linear scan's trivial coalescer slightly more aggressive.
llvm-svn: 62547
2009-01-20 00:16:18 +00:00
Chris Lattner ea9f1d3c47 Fix a problem exposed by PR3354: simplifycfg was making a potentially
trapping instruction be executed unconditionally.

llvm-svn: 62541
2009-01-19 23:03:13 +00:00
Dale Johannesen d067ecd1c7 Move & restructure test per review.
llvm-svn: 62538
2009-01-19 22:33:12 +00:00
Chris Lattner 7eeb1cc605 convert this to an unfoldable potentially trapping constant expr.
llvm-svn: 62536
2009-01-19 22:12:33 +00:00
Dan Gohman cd0b1bf0a0 Fix SelectionDAG::ReplaceAllUsesWith to behave correctly when
uses are added to the From node while it is processing From's
use list, because of automatic local CSE. The fix is to avoid
visiting any new uses.

Fix a few places in the DAGCombiner that assumed that after
a RAUW call, the From node has no users and may be deleted.

This fixes PR3018.

llvm-svn: 62533
2009-01-19 21:44:21 +00:00
Chris Lattner 6f34e317e9 Fix PR3353, infinitely jump threading an infinite loop make from switches.
llvm-svn: 62529
2009-01-19 21:20:34 +00:00
Dale Johannesen 740e98704d compile-time fmod was done incorrectly. PR 3316.
llvm-svn: 62528
2009-01-19 21:17:05 +00:00
Devang Patel 8c8aa2ac29 Verify Intrinsic::dbg_declare.
llvm-svn: 62526
2009-01-19 21:00:48 +00:00
Evan Cheng 44cc554311 DIVREM isel deficiency: If sign bit is known zero, zero out DX/EDX/RDX instead of sign extending the low part (in AX/EAX/RAX) into it.
llvm-svn: 62519
2009-01-19 19:06:11 +00:00
Nick Lewycky ee22611e33 Port this test from dejagnu to unit testing.
The way this worked before was to test APInt by running
"lli -force-interpreter=true" knowing the lli uses APInt under the hood to
store its values. Now, we test APInt directly.

llvm-svn: 62514
2009-01-19 18:08:33 +00:00
Bill Wendling 534d2e0bae Temporarily revert r62487. It's causing this error during a release bootstrap of
llvm-gcc. Most likely, it's miscompiling one of the "gen*" programs:

/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./prev-gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./prev-gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.6.0/bin/ -c -g -O2 -mdynamic-no-pic -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -mdynamic-no-pic -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/build -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include  -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include  -D_DEBUG  -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS   -o build/gencondmd.o build/gencondmd.c
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: error: expected '}' before ')' token
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: warning: excess elements in struct initializer
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: warning: (near initialization for 'insn_conditions[4]')
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: error: expected '}' before ')' token
../../llvm-gcc.src/gcc/config/i386/mmx.md:926: error: expected ',' or ';' before ')' token
../../llvm-gcc.src/gcc/config/i386/mmx.md:927: error: expected identifier or '(' before ',' token
../../llvm-gcc.src/gcc/config/i386/sse.md:3458: error: expected identifier or '(' before ',' token
...

llvm-svn: 62506
2009-01-19 08:46:20 +00:00
Evan Cheng 7e9ef4d776 Now not UINT_TO_FP is legal (it's marked custom), dag combiner won't
optimize it to a SINT_TO_FP when the sign bit is known zero. X86 isel should perform the optimization itself.

llvm-svn: 62504
2009-01-19 08:08:22 +00:00
Chris Lattner f2bb4ea39c Fix PR3016, a bug which can occur do to an invalid assumption:
we assumed a CFG structure that would be valid when all code in 
the function is reachable, but not all code is necessarily 
reachable.  Do a simple, but horrible, CFG walk to check for this
case.

llvm-svn: 62487
2009-01-19 02:46:28 +00:00
Chris Lattner 64b7bd7f9e Fix rdar://6505632, an llc crash on 483.xalancbmk
llvm-svn: 62470
2009-01-18 20:35:00 +00:00
Nick Lewycky e5be1cd635 Forgot this in the previous checkin: fopen now has nocapture, realloc is
supposed to take two arguments.

llvm-svn: 62457
2009-01-18 04:46:10 +00:00
Bill Wendling 9880a2cb2f Testcase for last commit.
llvm-svn: 62418
2009-01-17 07:42:44 +00:00
Evan Cheng bf38a5e540 Fix MatchAddress bug that's preventing negative displacement from being folded in 64-bit mode.
llvm-svn: 62413
2009-01-17 07:09:27 +00:00
Mon P Wang ca6d6dea0b Simplify extract element of a scalar to vector.
llvm-svn: 62383
2009-01-17 00:07:25 +00:00
Evan Cheng 41e9f6a854 Fix PPC ISD::Declare isel and eliminate the need for PPCTargetLowering::LowerGlobalAddress to check if isVerifiedDebugInfoDesc() is true. Given the recent changes, it would falsely return true for a lot of GlobalAddressSDNode's.
llvm-svn: 62373
2009-01-16 22:57:32 +00:00
Dan Gohman f1002495e3 Disable the post-RA scheduler on this test, since it uses a
simple %prcontext which doesn't find what it's looking for
if the scheduler has rearranged the instructions.

llvm-svn: 62363
2009-01-16 21:40:12 +00:00
Evan Cheng 968e2e7b3d CreateVirtualRegisters does trivial copy coalescing. If a node def is used by a single CopyToReg, it reuses the virtual register assigned to the CopyToReg. This won't work for SDNode that is a clone or is itself cloned. Disable this optimization for those nodes or it can end up with non-SSA machine instructions.
llvm-svn: 62356
2009-01-16 20:57:18 +00:00
Chris Lattner db2d9613d2 Fix PR3335 by not turning a store to one address space into a store to another.
llvm-svn: 62351
2009-01-16 20:12:52 +00:00
Bill Wendling e04334730e Add support for non-zero __builtin_return_address values on X86.
llvm-svn: 62338
2009-01-16 19:25:27 +00:00
Evan Cheng 2d9e40ed24 This is now passing.
llvm-svn: 62308
2009-01-16 06:59:14 +00:00
Evan Cheng beac6f8b0c Clean up previous cast optimization a bit. Also make zext elimination a bit more aggressive: if it's not necessary to emit an AND (i.e. high bits are already zero), it's profitable to evaluate the operand at a different type.
llvm-svn: 62297
2009-01-16 02:11:43 +00:00
Devang Patel fa1b408b3b Do not stumble over forward declared struct member.
llvm-svn: 62288
2009-01-16 00:50:53 +00:00
Devang Patel 76d190cf4a Validate dbg_* intrinsics before lowering them.
llvm-svn: 62286
2009-01-15 23:41:32 +00:00
Mon P Wang e248edff1b Added missing support to widen an operand from a bit convert.
llvm-svn: 62285
2009-01-15 22:43:38 +00:00
Rafael Espindola f2831d6cd1 Fix Alpha test and support for private linkage.
llvm-svn: 62282
2009-01-15 21:51:46 +00:00
Mon P Wang ebfafee903 Expand insert/extract of a <4 x i32> with a variable index.
llvm-svn: 62281
2009-01-15 21:10:20 +00:00
Rafael Espindola 6de96a1b5d Add the private linkage.
llvm-svn: 62279
2009-01-15 20:18:42 +00:00
Devang Patel 851cdaf1fd Use lightweight DebugInfo objects directly.
llvm-svn: 62276
2009-01-15 19:26:23 +00:00
Devang Patel 8bdc698336 Use variable's context to identify respective DbgScope.
Use light weight DebugInfo object directly.

llvm-svn: 62269
2009-01-15 18:25:17 +00:00
Evan Cheng 60e19a46f2 - Teach CanEvaluateInDifferentType of this xform: sext (zext ty1), ty2 -> zext ty2
- Looking at the number of sign bits of the a sext instruction to determine  whether new trunc + sext pair should be added when its source is being evaluated in a different type.

llvm-svn: 62263
2009-01-15 17:01:23 +00:00
Richard Osborne 40119780a8 Don't fold address calculations which use negative offsets into
the ADDRspii addressing mode.

llvm-svn: 62258
2009-01-15 11:32:30 +00:00
Scott Michel a292fc6d6b - Convert remaining i64 custom lowering into custom instruction emission
sequences in SPUDAGToDAGISel.cpp and SPU64InstrInfo.td, killing custom
  DAG node types as needed.
- i64 mul is now a legal instruction, but emits an instruction sequence
  that stretches tblgen and the imagination, as well as violating laws of
  several small countries and most southern US states (just kidding, but
  looking at a function with 80+ parameters is really weird and just plain
  wrong.)
- Update tests as needed.

llvm-svn: 62254
2009-01-15 04:41:47 +00:00
Chris Lattner 8fb9480ed2 Fix PR3325, a miscompilation of invokes by IPSCCP. Patch by Jay Foad!
llvm-svn: 62244
2009-01-14 21:01:16 +00:00
Devang Patel 08e5e62f98 xfail for now.
llvm-svn: 62243
2009-01-14 20:10:24 +00:00
Richard Osborne 4359325ba8 Add pseudo instructions to the XCore for (load|store|load address) of a
frame index. eliminateFrameIndex will replace these instructions with
(LDWSP|STWSP|LDAWSP) or (LDW|STW|LDAWF) if a frame pointer is in use.

This fixes PR 3324. Previously we used LDWSP, STWSP, LDAWSP before frame
pointer elimination. However since they were marked as implicitly using
SP they could not be rematerialised.

llvm-svn: 62238
2009-01-14 18:26:46 +00:00
Dale Johannesen 1f0e0e7c9c Fix the time regression I introduced in 464.h264ref with
my earlier patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.  
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.

Also, when we build an expression that involves a (possibly
non-affine) IV from a different loop as well as an IV from
the one we're interested in (containsAddRecFromDifferentLoop),
don't recurse into that.  We can't do much with it and will
get in trouble if we try to create new non-affine IVs or something.

More testcases are coming.

llvm-svn: 62212
2009-01-14 02:35:31 +00:00
Chris Lattner 2538eb664c rewrite OptimizeAwayTrappingUsesOfLoads to 1) avoid a temporary
vector and extraneous loop over it, 2) not delete globals used by
phis/selects etc which could actually be useful.  This fixes PR3321.
Many thanks to Duncan for narrowing this down.

llvm-svn: 62201
2009-01-14 00:12:58 +00:00
Dan Gohman b8f5ba6781 Disable the register+memory forms of the bt instructions for now. Thanks
to Eli for pointing out that these forms don't ignore the high bits of
their index operands, and as such are not immediately suitable for use
by isel.

llvm-svn: 62194
2009-01-13 23:23:30 +00:00
Dale Johannesen 0aeabdff57 Fix testsuite regressions from recursive inlining.
llvm-svn: 62189
2009-01-13 22:43:37 +00:00
Dan Gohman 1407484178 The list-td and list-tdrr schedulers don't yet support physreg
scheduling dependencies. Add assertion checks to help catch
this.

It appears the Mips target defaults to list-td, and it has a
regression test that uses a physreg dependence. Such code was
liable to be miscompiled, and now evokes an assertion failure.

llvm-svn: 62177
2009-01-13 20:24:13 +00:00
Dan Gohman 59af77376c Make instcombine ensure that all allocas are explicitly aligned at at
least their preferred alignment.

llvm-svn: 62176
2009-01-13 20:18:38 +00:00
Duncan Sands ffc6133318 When replacing uses and the same node is reached
via two paths, process it once not twice, d'oh!
Analysis, testcase and original patch thanks to
Mon Ping Wang.

llvm-svn: 62169
2009-01-13 15:17:14 +00:00
Duncan Sands ab2fd9e4b9 Mark this XFAIL for the moment.
llvm-svn: 62168
2009-01-13 15:15:46 +00:00
Nick Lewycky 52348300a4 Wind SCEV back in time, to Nov 18th. This 'fixes' PR3275, PR3294, PR3295,
PR3296 and PR3302.

llvm-svn: 62160
2009-01-13 09:18:58 +00:00
Evan Cheng f343168f1f FIX llvm-gcc bootstrap on x86_64 linux. If a virtual register is copied to a physical register, it's not necessarily defined by a copy. We have to watch out it doesn't clobber any sub-register that might be live during its live interval. If the live interval crosses a basic block, then it's not safe to check with the less conservative check (by scanning uses and defs) because it's possible a sub-register might be live out of the block.
llvm-svn: 62144
2009-01-13 03:57:45 +00:00
Devang Patel 76007e009e Use DebugInfo interface to lower dbg_* intrinsics.
llvm-svn: 62126
2009-01-13 00:32:17 +00:00
Dale Johannesen 433a9086c0 Enable recursive inlining. Reduce inlining threshold
back to 200; 400 seems to be too high, loses more than
it gains.

llvm-svn: 62107
2009-01-12 22:11:50 +00:00
Evan Cheng 2adb5cfb48 Second test is only valid in 32-bit mode.
llvm-svn: 62084
2009-01-12 08:05:54 +00:00
Evan Cheng 0258874607 Test for r62076.
llvm-svn: 62077
2009-01-12 03:46:55 +00:00
Evan Cheng b2c42c648d Fix PR3241: Currently EmitCopyFromReg emits a copy from the physical register to a virtual register unless it requires an expensive cross class copy. That means we are only treating "expensive to copy" register dependency as physical register dependency.
Also future proof the scheduler to handle "normal" physical register dependencies. The code is not exercised yet.

llvm-svn: 62074
2009-01-12 03:19:55 +00:00
Evan Cheng 8e7d88b916 This is a dup of pr2659.ll.
llvm-svn: 62029
2009-01-10 19:06:32 +00:00
Evan Cheng ed74d8ac2a Duplicated node may produce a non-physical register def.
llvm-svn: 62015
2009-01-09 22:44:02 +00:00
Evan Cheng c1f5a659de Add test case from PR2659.
llvm-svn: 62006
2009-01-09 21:01:31 +00:00
Chris Lattner ae0e857b98 Fix PR3304
llvm-svn: 61995
2009-01-09 18:18:43 +00:00
Dan Gohman ea1086b7f2 PR2659 was fixed by r61847. Add the testcase as a regression test.
llvm-svn: 61986
2009-01-09 08:16:12 +00:00
Chris Lattner f50aa6ae5c Implement rdar://6480391, extending of equality icmp's to avoid a truncation.
I noticed this in the code compiled for a routine using std::map, which produced
this code:
	%25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly
	%.lobit.i = lshr i32 %25, 31		; <i32> [#uses=1]
	%tmp.i = trunc i32 %.lobit.i to i8		; <i8> [#uses=1]
	%toBool = icmp eq i8 %tmp.i, 0		; <i1> [#uses=1]
	br i1 %toBool, label %bb3, label %bb4
which compiled to:

	call	L_memcmp$stub
	shrl	$31, %eax
	testb	%al, %al
	jne	LBB1_11	## 

with this change, we compile it to:

	call	L_memcmp$stub
	testl	%eax, %eax
	js	LBB1_11

This triggers all the time in common code, with patters like this:

	%169 = and i32 %ply, 1		; <i32> [#uses=1]
	%170 = trunc i32 %169 to i8		; <i8> [#uses=1]
	%toBool = icmp ne i8 %170, 0		; <i1> [#uses=1]

 	%7 = lshr i32 %6, 24		; <i32> [#uses=1]
	%9 = trunc i32 %7 to i8		; <i8> [#uses=1]
	%10 = icmp ne i8 %9, 0		; <i1> [#uses=1]

etc

llvm-svn: 61985
2009-01-09 07:47:06 +00:00
Chris Lattner 482eb70a10 Fix PR3298, a crash in Jump Threading. Apparently even
jump threading can have bugs, who knew? ;-)

llvm-svn: 61983
2009-01-09 06:08:12 +00:00
Chris Lattner d48d1ec320 this doesn't depend on the gcc early inliner anymore.
llvm-svn: 61982
2009-01-09 05:49:27 +00:00
Chris Lattner 7f88a1b512 PR3290 is now fixed.
llvm-svn: 61981
2009-01-09 05:46:19 +00:00
Chris Lattner fef138b140 Fix part 3/2 of PR3290, making instcombine zap (gep(bitcast)) when possible.
llvm-svn: 61980
2009-01-09 05:44:56 +00:00
Chris Lattner 9170731cb7 this test should not run opt -std-compile-opts, it should run
just llc.

llvm-svn: 61979
2009-01-09 05:32:00 +00:00
Dale Johannesen b48fc71fc6 Do not inline functions with (dynamic) alloca into
functions that don't already have a (dynamic) alloca.
Dynamic allocas cause inefficient codegen and we shouldn't
propagate this (behavior follows gcc).  Two existing tests
assumed such inlining would be done; they are hacked by
adding an alloca in the caller, preserving the point of
the tests.

llvm-svn: 61946
2009-01-08 21:45:23 +00:00
Chris Lattner f3e696bc5a ValueTracker can't assume that an alloca with no specified alignment
will get its preferred alignment.  It has to be careful and cautiously assume
it will just get the ABI alignment.  This prevents instcombine from rounding
up the alignment of a load/store without adjusting the alignment of the alloca.

llvm-svn: 61934
2009-01-08 19:28:38 +00:00
Chris Lattner a2ed32eb4f this testcase is huge and hasn't regressed ever, I don't think it is worth keeping.
llvm-svn: 61931
2009-01-08 19:01:45 +00:00
Chris Lattner 55927bdccd the new scalarrepl changes are optimizing away a temporary alloca in
check242, which invalidates this test.  This test is an x86-32 ABI test 
that is trying to be run in a target-independent way, which is not going
to work very well.  Just remove the test.

llvm-svn: 61921
2009-01-08 07:58:23 +00:00
Chris Lattner c518dfd11b This implements the second half of the fix for PR3290, handling
loads from allocas that cover the entire aggregate.  This handles
some memcpy/byval cases that are produced by llvm-gcc.  This triggers
a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator
<kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon).

llvm-svn: 61915
2009-01-08 05:42:05 +00:00
Misha Brukman b51cdfadda Fix off-by-one error in traversing an array; this fixes a test.
The error was reported by gcc-4.3.0 during compilation.

llvm-svn: 61896
2009-01-07 23:07:29 +00:00
Duncan Sands 289f59f233 Remove alloca tracking from nocapture analysis. Not only
was it not very helpful, it was also wrong!  The problem
is shown in the testcase: the alloca might be passed to
a nocapture callee which dereferences it and returns the
original pointer.  But because it was a nocapture call we
think we don't need to track its uses, but we do.

llvm-svn: 61876
2009-01-07 19:39:06 +00:00
Chris Lattner f2b8c82ad1 Implement the first half of PR3290: if there is a store of an
integer to a (transitive) bitcast the alloca and if that integer
has the full size of the alloca, then it clobbers the whole thing.
Handle this by extracting pieces out of the stored integer and 
filing them away in the SROA'd elements.

This triggers fairly frequently because the CFE uses integers to
pass small structs by value and the inliner exposes these.  For 
example, in kimwitu++, I see a bunch of these with i64 stores to
"%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>"

In 176.gcc I see a few i32 stores to "%struct..0anon".

In the testcase, this is a difference between compiling test1 to:

_test1:
	subl	$12, %esp
	movl	20(%esp), %eax
	movl	%eax, 4(%esp)
	movl	16(%esp), %eax
	movl	%eax, (%esp)
	movl	(%esp), %eax
	addl	4(%esp), %eax
	addl	$12, %esp
	ret

vs:

_test1:
	movl	8(%esp), %eax
	addl	4(%esp), %eax
	ret

The second half of this will be to handle loads of the same form.

llvm-svn: 61853
2009-01-07 08:11:13 +00:00
Evan Cheng f6768bd9cb The coalescer does not coalesce a virtual register to a physical register if any of the physical register's sub-register live intervals overlaps with the virtual register. This is overly conservative. It prevents a extract_subreg from being coalesced away:
v1024 = EDI  // not killed
      =
      = EDI

One possible solution is for the coalescer to examine the sub-register live intervals in the same manner as the physical register. Another possibility is to examine defs and uses (when needed) of sub-registers. Both solutions are too expensive. For now, look for "short virtual intervals" and scan instructions to look for conflict instead.

This is a small win on x86-64. e.g. It shaves 403.gcc by ~80 instructions.

llvm-svn: 61847
2009-01-07 02:08:57 +00:00
Chris Lattner 4687432d03 add a testcase.
llvm-svn: 61845
2009-01-07 01:48:08 +00:00
Dan Gohman 8e8d1da35a Add patterns to match conditional moves with loads folded
into their left operand, rather than their right. Do this
by commuting the operands and inverting the condition.

llvm-svn: 61842
2009-01-07 01:00:24 +00:00
Dan Gohman 33e6fcd56f X86_COND_C and X86_COND_NC are alternate mnemonics for
X86_COND_B and X86_COND_AE, respectively.

llvm-svn: 61835
2009-01-07 00:15:08 +00:00
Dan Gohman 44a3da6c4d Now that fold-pcmpeqd-0.ll is effectively testing that scheduling helps
avoid the need for spilling, add a new testcase that tests that the
pcmpeqd used for V_SETALLONES is changed to a constant-pool load as
needed.

llvm-svn: 61831
2009-01-06 23:48:10 +00:00
Dan Gohman beac19e299 Revert r42653 and forward-port the code that lets INC64_32r be
converted to LEA64_32r in x86's convertToThreeAddress. This
replaces code like this:
   movl  %esi, %edi
   inc   %edi
with this:
   lea   1(%rsi), %edi
which appears to be beneficial.

llvm-svn: 61830
2009-01-06 23:34:46 +00:00
Dan Gohman c7847cdb8d Fix a bug in ComputeLinearIndex computation handling multi-level
aggregate types. Don't increment the current index after reaching
the end of a struct, as it will already be pointing at
one-past-the end. This fixes PR3288.

llvm-svn: 61828
2009-01-06 22:53:52 +00:00
Scott Michel 6887caf11c CellSPU:
- Fix bugs 3194, 3195: i128 load/stores produce correct code (although, we
  need to ensure that i128 is 16-byte aligned in real life), and 128 zero-
  extends are supported.
- New td file: SPU128InstrInfo.td: this is where all new i128 support should
  be put in the future.
- Continue to hammer on i64 operations and test cases; ensure that the only
  remaining problem will be i64 mul.

llvm-svn: 61784
2009-01-06 03:36:14 +00:00
Dan Gohman 53c282cce8 Delete this test; it's a duplicate of 2006-07-03-schedulers.ll.
llvm-svn: 61781
2009-01-06 01:36:23 +00:00
Dan Gohman 79c3516912 Use a latency value of 0 for the artificial edges inserted by
AddPseudoTwoAddrDeps. This lets the scheduling infrastructure
avoid recalculating node heights. In very large testcases this
was a major bottleneck. Thanks to Roman Levenstein for finding
this!

As a side effect, fold-pcmpeqd-0.ll is now scheduled better
and it no longer requires spilling on x86-32.

llvm-svn: 61778
2009-01-06 01:19:04 +00:00
Chris Lattner 4e735eb157 make m_ConstantInt(int64_t) safely match ConstantInt's that are larger than i64.
This fixes an instcombine crash on PR3235.

llvm-svn: 61775
2009-01-05 23:45:50 +00:00
Bill Wendling 2012d84f01 Strength test.
llvm-svn: 61755
2009-01-05 21:27:59 +00:00
Duncan Sands 582c53d147 Teach the internalize pass to also internalize
global aliases.

llvm-svn: 61754
2009-01-05 21:24:45 +00:00
Evan Cheng 8804293fe9 Find loop back edges only after empty blocks are eliminated.
llvm-svn: 61752
2009-01-05 21:17:27 +00:00
Chris Lattner 84434a692b testcase for bill's patch.
llvm-svn: 61751
2009-01-05 21:07:34 +00:00
Duncan Sands f5dbbae4f4 Delete unused global aliases with internal linkage.
In fact this also deletes those with linkonce linkage,
however this is currently dead because for the moment
aliases aren't allowed to have this linkage type.

llvm-svn: 61742
2009-01-05 20:37:33 +00:00
Duncan Sands d60837f85e Don't spew bitcode to standard out if this test
fails, like it is right now.

llvm-svn: 61690
2009-01-05 10:52:29 +00:00
Torok Edwin b9905d7be3 This test passes again, unXFAIL.
llvm-svn: 61688
2009-01-05 09:30:47 +00:00
Chris Lattner 4b6a0cce5e alignment of 0 is not valid.
llvm-svn: 61682
2009-01-05 08:14:35 +00:00
Scott Michel 74f249517e CellSPU:
- Teach SPU64InstrInfo.td about the remaining signed comparisons, update tests
  accordingly.

llvm-svn: 61672
2009-01-05 04:05:53 +00:00
Scott Michel f87d41d8b9 CellSPU:
- Add an 8-bit operation test, which doesn't do much at this point.

llvm-svn: 61665
2009-01-05 01:35:22 +00:00
Scott Michel a664240476 CellSPU:
- Fix (brcond (setq ...)) bug, where BRNZ should have been used vice BRZ.
- Kill unused/unnecessary nodes in SPUNodes.td
- Beef out the i64operations.c test harness to use a lot of unaligned
  loads, test loops and LLVM loop/basic block optimizations; run the
  test harness successfully on real Cell hardware.

llvm-svn: 61664
2009-01-05 01:34:35 +00:00
Nick Lewycky 959af7ba30 Run a post-pass that marks known function declarations by name.
llvm-svn: 61632
2009-01-04 20:27:34 +00:00
Bill Wendling 0a09d2de13 XFAIL this test. The xform was removed.
llvm-svn: 61624
2009-01-04 06:32:28 +00:00
Dan Gohman b9fa1d24f8 Fix a DAGCombiner abort on an invalid shift count constant. This fixes PR3250.
llvm-svn: 61613
2009-01-03 19:22:06 +00:00
Scott Michel 6a1f6279ad CellSPU:
- Remove custom lowering for BRCOND
- Add remaining functionality for branches in SPUInstrInfo, such as branch
  condition reversal and load/store folding. Updated BrCond test to reflect
  branch reversal.

llvm-svn: 61597
2009-01-03 00:27:53 +00:00
Nick Lewycky 380292a51a Don't try to analyze this "backward" case. This is overly conservative
pending a correct solution.

llvm-svn: 61589
2009-01-02 18:54:17 +00:00
Duncan Sands b193a37cd3 When calculating 'nocapture' argument attributes, allow
the argument to be stored to an alloca by tracking uses
of the alloca.  This occurs 4 times (out of 7121, 0.05%)
in MultiSource/Applications, so may not be worth it.  On
the other hand, it is easy to do and fairly cheap.  The
functions it helps are: W_addcom and W_addlit in spiff;
process_args (argv) in d (make_dparser); ercPixConcealIMB
in JM/ldecod.

llvm-svn: 61570
2009-01-02 11:54:37 +00:00
Chris Lattner ac161bff07 Reimplement the old and horrible bison parser for .ll files with a nice
and clean recursive descent parser.

This change has a couple of ramifications:
1. The parser code is about 400 lines shorter (in what we maintain, not
   including what is autogenerated).
2. The code should be significantly faster than the old code because we 
   don't have to work around bison's poor handling of datatypes with 
   ctors/dtors.  This also makes the code much more resistant to memory 
   leaks.
3. We now get caret diagnostics from the .ll parser, woo.
4. The actual diagnostics emited from the parser are completely different
   so a bunch of testcases had to be updated.
5. I now disallow "%ty = type opaque %ty = type i32".  There was no good
   reason to support this, it was just an accident of the old 
   implementation.  I have no reason to think that anyone is actually using
   this.
6. The syntax for sticking a global variable has changed to make it 
   unambiguous.  I don't think anyone is depending on this since only clang
   supports this and it is not solid yet, so I'm not worried about anything
   breaking.
7. This gets rid of the last use of bison, and along with it the .cvs files.
   I'll prune this from the makefiles as a subsequent commit.

There are a few minor cleanups that can be done after this commit (suggestions
welcome!) but this passes dejagnu testing and is ready for its time in the
limelight.

llvm-svn: 61558
2009-01-02 07:01:27 +00:00
Evan Cheng 4c91aa3418 Do not isel load folding bt instructions for pentium m, core, core2, and AMD processors. These are significantly slower than a load followed by a bt of a register.
llvm-svn: 61557
2009-01-02 05:35:45 +00:00
Evan Cheng 1671a309fd Use movaps / movd to extract vector element 0 even with sse4.1. It's still cheaper than pextrw especially if the value is in memory.
llvm-svn: 61555
2009-01-02 05:29:08 +00:00
Nick Lewycky 0cfba9c6bf Remove the cyclic part of this test, it was passing for the wrong
reason. Two functions which mutually require each other to be nocapture 
are not currently supported.

llvm-svn: 61553
2009-01-02 03:52:27 +00:00
Nick Lewycky 7e82055e88 Make adding nocapture a bit stronger. FreeInst is nocapture. Also,
functions that don't write can't leak a pointer except through 
the return value, so a void readonly function is implicitly nocapture.

Test these, and add a test that verifies that f1 calling f2 with an 
otherwise dead pointer gets both of them marked nocapture.

llvm-svn: 61552
2009-01-02 03:46:56 +00:00
Chris Lattner 836dd95506 rename a file to follow naming conventions.
llvm-svn: 61550
2009-01-02 01:52:35 +00:00
Duncan Sands 8c03a123ce Add tests for two types of traps that escape analysis
might one day fall into.

llvm-svn: 61549
2009-01-02 00:55:51 +00:00
Misha Brukman 36daf0d1c7 * Quoted the executable 'runtest' to emphasize the binary needed;
otherwise, some unlucky souls start looking for a 'dejagnu' binary...
* Properly capitalized LLVM.

llvm-svn: 61546
2009-01-01 20:26:05 +00:00
Duncan Sands 8feb694e8f Fix PR3274: when promoting the condition of a BRCOND node,
promote from i1 all the way up to the canonical SetCC type.
In order to discover an appropriate type to use, pass
MVT::Other to getSetCCResultType.  In order to be able to
do this, change getSetCCResultType to take a type as an
argument, not a value (this is also more logical).

llvm-svn: 61542
2009-01-01 15:52:00 +00:00
Bill Wendling aedb54a947 Add transformation:
xor (or (icmp, icmp), true) -> and(icmp, icmp)

This is possible because of De Morgan's law.

llvm-svn: 61537
2009-01-01 01:18:23 +00:00
Duncan Sands 163848021b Look through phi nodes and select instructions when
calculating nocapture attributes.

llvm-svn: 61535
2008-12-31 20:21:34 +00:00
Bill Wendling 15a47ddf40 This is not failing on Darwin for some reason. XFAIL for other platforms.
llvm-svn: 61533
2008-12-31 19:26:09 +00:00
Misha Brukman bbfefd9612 Removed extra spaces.
llvm-svn: 61527
2008-12-31 17:38:27 +00:00
Duncan Sands 44c8cd97a5 Rename AddReadAttrs to FunctionAttrs, and teach it how
to work out (in a very simplistic way) which function
arguments (pointer arguments only) are only dereferenced
and so do not escape.  Mark such arguments 'nocapture'.

llvm-svn: 61525
2008-12-31 16:14:43 +00:00
Bill Wendling 80cb99575e XFAIL test caused by r61493. Apparently, this is expected?
llvm-svn: 61516
2008-12-31 08:26:55 +00:00
Scott Michel 36a494c1d1 XFAIL this for now until I can figure out what's going on.
llvm-svn: 61512
2008-12-31 00:08:25 +00:00
Scott Michel 18d756a411 Fix test erratum (which is wierd: works locally for me?)
llvm-svn: 61511
2008-12-30 23:52:05 +00:00
Scott Michel 41236c0cf3 - Start moving target-dependent nodes that could be represented by an
instruction sequence and cannot ordinarily be simplified by DAGcombine
  into the various target description files or SPUDAGToDAGISel.cpp.

  This makes some 64-bit operations legal.

- Eliminate target-dependent ISD enums.

- Update tests.

llvm-svn: 61508
2008-12-30 23:28:25 +00:00
Duncan Sands c125d6a3d3 Allow readnone functions to read (and write!) global
constants, since doing so is irrelevant for aliasing
purposes.  While this doesn't increase the total number
of functions marked readonly or readnone in MultiSource/
Applications (3089), it does result in 12 functions being
marked readnone rather than readonly.
Before:
  readnone: 820
  readonly: 2269
After:
  readnone: 832
  readonly: 2257

llvm-svn: 61469
2008-12-29 11:34:09 +00:00
Nick Lewycky d80ff135b5 Check that the function prototypes are correct before assuming that the
parameters are pointers.

llvm-svn: 61451
2008-12-27 16:20:53 +00:00
Chris Lattner 1d1087113c add testcase for type parsing.
llvm-svn: 61449
2008-12-27 08:10:46 +00:00
Scott Michel 8233527b05 - Remove Tilmann's custom truncate lowering: it completely hosed over
DAGcombine's ability to find reasons to remove truncates when they were not
  needed. Consequently, the CellSPU backend would produce correct, but _really
  slow and horrible_, code.

  Replaced with instruction sequences that do the equivalent truncation in
  SPUInstrInfo.td.

- Re-examine how unaligned loads and stores work. Generated unaligned
  load code has been tested on the CellSPU hardware; see the i32operations.c
  and i64operations.c in CodeGen/CellSPU/useful-harnesses.  (While they may be
  toy test code, it does prove that some real world code does compile
  correctly.)

- Fix truncating stores in bug 3193 (note: unpack_df.ll will still make llc
  fault because i64 ult is not yet implemented.)

- Added i64 eq and neq for setcc and select/setcc; started new instruction
  information file for them in SPU64InstrInfo.td. Additional i64 operations
  should be added to this file and not to SPUInstrInfo.td.

llvm-svn: 61447
2008-12-27 04:51:36 +00:00
Chris Lattner 3d1bce04e0 add PR #
llvm-svn: 61427
2008-12-25 05:40:38 +00:00
Chris Lattner 2a7c988627 Add a simple pattern for matching 'bt'.
llvm-svn: 61426
2008-12-25 05:34:37 +00:00
Bill Wendling f4e6356d06 Revert the changes in this testcase until Anton can fix them.
llvm-svn: 61414
2008-12-24 05:23:34 +00:00
Dan Gohman 198b8e78c3 Fix a compiler-abort on a testcase where the stack-pointer is added to
a symbolic constant. This is unlikely to be intentional, but it
shouldn't crash the compiler.

llvm-svn: 61408
2008-12-24 00:27:51 +00:00
Dale Johannesen acc84e5aa0 Add another permutation where we should get rid of a-a.
llvm-svn: 61401
2008-12-23 23:01:27 +00:00
Anton Korobeynikov cfe108a064 Update test
llvm-svn: 61399
2008-12-23 22:26:37 +00:00
Chris Lattner c183061f7c Testcase to show we can tie together integers and pointers of
the same size.

llvm-svn: 61380
2008-12-23 18:52:26 +00:00
Mon P Wang f566eea614 Added shuffle and splat test cases for r61365.
llvm-svn: 61366
2008-12-23 04:05:08 +00:00
Dale Johannesen d2a4685860 One more permutation of subtracting off a base value.
llvm-svn: 61361
2008-12-23 01:59:54 +00:00
Mikhail Glushenkov 2fe093f2b8 Use ignore & grep instead of XFAIL.
llvm-svn: 61307
2008-12-21 07:47:49 +00:00
Nick Lewycky 10eb8e533f Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2).
llvm-svn: 61297
2008-12-21 00:19:21 +00:00
Dan Gohman ab316350bf Fix fast-isel to not emit invalid assembly when presented with a
constant shift count that doesn't fit in the shift instruction's
immediate field. This fixes PR3242.

llvm-svn: 61281
2008-12-20 17:19:40 +00:00
Dan Gohman bb92a1b815 Use the correct Preds and Succs lists in setHeightDirty()
and setDepthDirty(), respectively. This fixes PR3241.

llvm-svn: 61276
2008-12-20 16:34:57 +00:00
Bill Wendling 7be667c075 More precise XFAIL.
llvm-svn: 61265
2008-12-19 22:28:23 +00:00
Bill Wendling be2a77f52c Un-XFAIL this test because it's passing and John doesn't seem interested in un-XFAILing it.
llvm-svn: 61264
2008-12-19 22:25:01 +00:00
Evan Cheng 0869f78555 Fix PR3149. If an early clobber def is a physical register and it is tied to an input operand, it effectively extends the live range of the physical register. Currently we do not have a good way to represent this.
172     %ECX<def> = MOV32rr %reg1039<kill>
180     INLINEASM <es:subl $5,$1
        sbbl $3,$0>, 10, %EAX<def>, 14, %ECX<earlyclobber,def>, 9, %EAX<kill>,
36, <fi#0>, 1, %reg0, 0, 9, %ECX<kill>, 36, <fi#1>, 1, %reg0, 0
188     %EAX<def> = MOV32rr %EAX<kill>
196     %ECX<def> = MOV32rr %ECX<kill>
204     %ECX<def> = MOV32rr %ECX<kill>
212     %EAX<def> = MOV32rr %EAX<kill>
220     %EAX<def> = MOV32rr %EAX
228     %reg1039<def> = MOV32rr %ECX<kill>

The early clobber operand ties ECX input to the ECX def.

The live interval of ECX is represented as this:
%reg20,inf = [46,47:1)[174,230:0)  0@174-(230) 1@46-(47)

The right way to represent this is something like
%reg20,inf = [46,47:2)[174,182:1)[181:230:0)  0@174-(182) 1@181-230 @2@46-(47)

Of course that won't work since that means overlapping live ranges defined by two val#.

The workaround for now is to add a bit to val# which says the val# is redefined by a early clobber def somewhere. This prevents the move at 228 from being optimized away by SimpleRegisterCoalescing::AdjustCopiesBackFrom.

llvm-svn: 61259
2008-12-19 20:58:01 +00:00
Bill Wendling dc2b987abb This test works again for Darwin because a patch was reverted.
llvm-svn: 61254
2008-12-19 19:08:13 +00:00
Evan Cheng 3b3de7c228 - CodeGenPrepare does not split loop back edges but it only knows about back edges of single block loops. It now does a DFS walk to find loop back edges.
- Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions.

llvm-svn: 61248
2008-12-19 18:03:11 +00:00
Rafael Espindola 770b4b830a Fix bug 3202.
The EH_frame and .eh symbols are now private, except for darwin9 and earlier.
The patch also fixes the definition of PrivateGlobalPrefix on pcc linux.

llvm-svn: 61242
2008-12-19 10:55:56 +00:00
Nick Lewycky 2abb108f1b Resubmit support for the 'nocapture' attribute.
The problematic part of this patch is that we were out of attribute bits,
requiring some fancy bit hacking to make it fit (by shrinking alignment)
without breaking existing users or the file format.

This change will require users to rebuild llvm-gcc to match llvm.

llvm-svn: 61239
2008-12-19 06:39:12 +00:00
Mon P Wang 308a1acaaf Fix test to account for generating some vector code for mul v2i64 instead
of incorrectly generating pmuldq

llvm-svn: 61228
2008-12-18 23:42:37 +00:00
Bill Wendling 4c13e77d49 Re-XFAIL this test until debug stuff settles down.
llvm-svn: 61219
2008-12-18 22:13:31 +00:00
Mon P Wang 6e5f4bc1e7 Added some basic test cases for r61209
llvm-svn: 61210
2008-12-18 20:05:58 +00:00
Nick Lewycky 0f0e63fe73 Make all the vector elements positive in an srem of constant vector.
llvm-svn: 61195
2008-12-18 06:31:11 +00:00
Bill Wendling 7ecf774262 XFAIL on Linux.
llvm-svn: 61176
2008-12-18 00:35:21 +00:00
Bill Wendling ede2f8098d Do not XFAIL.
llvm-svn: 61174
2008-12-18 00:27:15 +00:00
Devang Patel 980210395f XFAIL for now.
llvm-svn: 61167
2008-12-17 22:54:54 +00:00
Devang Patel fd9aa62cc6 Xfail these tests for now.
llvm-svn: 61166
2008-12-17 22:53:09 +00:00
Chris Lattner 222ef4c489 Enhance heap sra to be substantially more aggressive w.r.t PHI
nodes.  This allows it to do fairly general phi insertion if a 
load from a pointer global wants to be SRAd but the load is used
by (recursive) phi nodes.  This fixes a pessimization on ppc
introduced by Load PRE.

llvm-svn: 61123
2008-12-17 05:28:49 +00:00
Eli Friedman 6cf404f2d1 Fix for PR3225: disable a broken optimization in
DAGTypeLegalizer::ExpandShiftWithKnownAmountBit.

In terms of restoring the optimization, the best fix here isn't 
obvious... any ideas?

llvm-svn: 61119
2008-12-17 03:35:17 +00:00
Dale Johannesen f51dcef803 A new dag combine; several permutations of this
are there under ADD, this one was missing.

llvm-svn: 61107
2008-12-16 22:13:49 +00:00
Chris Lattner 56b55387fc Fix another crash found by inspection. If we have a PHI node merging
the load multiple times, make sure the check the uses of the PHI to 
ensure they are transformable.

llvm-svn: 61102
2008-12-16 21:24:51 +00:00
Chris Lattner 06a456b3f4 fix a crash found by inspection.
llvm-svn: 61101
2008-12-16 21:04:51 +00:00
Eli Friedman cb61afb546 Add a helper to remove a branch and DCE the condition, and use it
consistently for deleting branches.  In addition to being slightly 
more readable, this makes SimplifyCFG a bit better 
about cleaning up after itself when it makes conditions unused.

llvm-svn: 61100
2008-12-16 20:54:32 +00:00
Bill Wendling e38c7400c9 Remove empty test.
llvm-svn: 61095
2008-12-16 19:07:17 +00:00
Bill Wendling a397baea88 Temporarily revert r61019, r61030, and r61040. These were breaking LLVM Release
builds.

llvm-svn: 61094
2008-12-16 19:06:48 +00:00
Evan Cheng c35fc49477 We have decided not to support inline asm where an output operand with a matching input operand with incompatible type (i.e. either one is a floating point and the other is an integer or the sizes of the types differ). SelectionDAGBuild will catch these and exit with an error.
llvm-svn: 61092
2008-12-16 18:21:39 +00:00
Torok Edwin 01a1e37154 add testcase for -print-dbginfo
llvm-svn: 61086
2008-12-16 10:10:23 +00:00
Nick Lewycky 69c9aa4ce5 Generalize support for analyzing loops to include SLE/SGE loop exit conditions
and support for non-unit strides with signed exit conditions.

llvm-svn: 61082
2008-12-16 08:30:01 +00:00
Chris Lattner 8b4be37275 fix PR3217: fully cached queries need to be verified against the
visited set before they are used.  If used, their blocks need to be
added to the visited set so that subsequent queries don't use conflicting
pointer values in the cache result blocks.

llvm-svn: 61080
2008-12-16 07:10:09 +00:00
Dan Gohman 51559185f1 Enable anti-dependence breaking by default when post-RA scheduling is enabled.
llvm-svn: 61078
2008-12-16 06:21:45 +00:00
Dan Gohman dddc1ac7ea Fix some register-alias-related bugs in the post-RA scheduler liveness
computation code. Also, avoid adding output-depenency edges when both
defs are dead, which frequently happens with EFLAGS defs.

Compute Depth and Height lazily, and always in terms of edge latency
values. For the schedulers that don't care about latency, edge latencies
are set to 1.

Eliminate Cycle and CycleBound, and LatencyPriorityQueue's Latencies array.
These are all subsumed by the Depth and Height fields.

llvm-svn: 61073
2008-12-16 03:25:46 +00:00
Chris Lattner 590b10dba2 add testcase for r61051
llvm-svn: 61052
2008-12-15 21:46:23 +00:00
Mon P Wang 580f2c7b61 Added support for splitting and scalarizing vector shifts.
llvm-svn: 61050
2008-12-15 21:44:00 +00:00
Chris Lattner 3cdf0a8a2e add a basic test for heap-sra
llvm-svn: 61041
2008-12-15 19:42:05 +00:00
Chris Lattner e3401db1f3 Teach basicaa to use the nocapture attribute when possible. When the
intrinsics are properly marked nocapture, the fixme should be addressed.

llvm-svn: 61040
2008-12-15 18:59:22 +00:00
Chris Lattner 81ee731852 Add a testcase for GCC PR 23455, which lpre handles now. Add some
comments about why we're not getting other cases.

llvm-svn: 61032
2008-12-15 07:49:24 +00:00
Mon P Wang ac4e120912 Added support to LegalizeType for expanding the operands of scalar to vector
and insert vector element.  Modified extract vector element to extend the
result to match the expected promoted type.

llvm-svn: 61029
2008-12-15 06:57:02 +00:00
Chris Lattner 3c2c36b590 gvn now hoists this load out of the hot non-call path.
llvm-svn: 61028
2008-12-15 06:34:48 +00:00
Chris Lattner b2429e2d69 Adjust testcase to make it more stable across visitation order changes,
unbreaking it after r61024.

llvm-svn: 61025
2008-12-15 04:42:00 +00:00
Chris Lattner 69131fd872 make GVN try to rename inputs to the resultant replaced values, which
cleans up the generated code a bit.  This should have the added benefit of
not randomly renaming functions/globals like my previous patch did. :)

llvm-svn: 61023
2008-12-15 03:46:38 +00:00
Chris Lattner ff9f3dba12 Implement initial support for PHI translation in memdep. This means that
memdep keeps track of how PHIs affect the pointer in dep queries, which 
allows it to eliminate the load in cases like rle-phi-translate.ll, which
basically end up being:

BB1:
   X = load P
   br BB3
BB2:
   Y = load Q
   br BB3
BB3:
   R = phi [P] [Q]
   load R

turning "load R" into a phi of X/Y.  In addition to additional exposed
opportunities, this makes memdep safe in many cases that it wasn't before
(which is required for load PRE) and also makes it substantially more 
efficient.  For example, consider:


bb1:  // has many predecessors.
   P = some_operator()
   load P

In this example, previously memdep would scan all the predecessors of BB1
to see if they had something that would mustalias P.  In some cases (e.g.
test/Transforms/GVN/rle-must-alias.ll) it would actually find them and end
up eliminating something.  In many other cases though, it would scan and not
find anything useful.  MemDep now stops at a block if the pointer is defined
in that block and cannot be phi translated to predecessors.  This causes it
to miss the (rare) cases like rle-must-alias.ll, but makes it faster by not
scanning tons of stuff that is unlikely to be useful.  For example, this
speeds up GVN as a whole from 3.928s to 2.448s (60%)!.  IMO, scalar GVN 
should be enhanced to simplify the rle-must-alias pointer base anyway, which
would allow the loads to be eliminated.

In the future, this should be enhanced to phi translate through geps and 
bitcasts as well (as indicated by FIXMEs) making memdep even more powerful.

llvm-svn: 61022
2008-12-15 03:35:32 +00:00
Chris Lattner a236dc44d6 another random testcase that shouldn't crash gvn and is
good for coverage with future changes.

llvm-svn: 61011
2008-12-14 21:20:46 +00:00
Chris Lattner 9b9a145694 RLE isn't smart enough to eliminate this safely yet.
llvm-svn: 60994
2008-12-13 21:04:20 +00:00
Chris Lattner d923519cc5 rename some tests to be more uniform in naming convention.
llvm-svn: 60988
2008-12-13 18:47:40 +00:00
Chris Lattner 9e24267120 gvn should never crash on this.
llvm-svn: 60987
2008-12-13 18:39:44 +00:00
Bill Wendling 293b9181e5 Temporarily revert r60973. It's inexplicably causing a failure when self-hosting LLVM:
llvm[2]: Linking Release executable opt (without symbols)
...
Undefined symbols:
  "llvm::APFloat::IEEEsingle", referenced from:
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
  "llvm::APFloat::IEEEdouble", referenced from:
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
      __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
ld: symbol(s) not found

This is in release mode. To replicate, compile llvm and llvm-gcc in optimized
mode. Then build llvm, in optimized mode, with the newly created compiler.

llvm-svn: 60977
2008-12-13 09:28:44 +00:00
Chris Lattner 1e29f7c97d make RLE preserve the name of the load that it replaces. This is just
a pretification of the IR.

llvm-svn: 60973
2008-12-13 07:22:47 +00:00
Devang Patel 146324f99c Re-enable test.
llvm-svn: 60968
2008-12-12 22:42:35 +00:00
Bill Wendling c4499feb1a - Use patterns instead of creating completely new instruction matching patterns,
which are identical to the original patterns.

- Change the multiply with overflow so that we distinguish between signed and
  unsigned multiplication. Currently, unsigned multiplication with overflow
  isn't working!

llvm-svn: 60963
2008-12-12 21:15:41 +00:00
Devang Patel 5784ead8c7 XFAIL these tests for now.
llvm-svn: 60959
2008-12-12 19:08:08 +00:00
Nick Lewycky 729bf137a8 Revert my re-instated reverted commit, fixes the bootstrap build on x86-64 linux.
llvm-svn: 60951
2008-12-12 17:09:07 +00:00
Nick Lewycky 6a344e097c Sneaky, sneaky: move the -1 to the outside of the SMax. Reinstate the
optimization of SGE/SLE with unit stride, now that it works properly.

llvm-svn: 60881
2008-12-11 17:40:14 +00:00
Bill Wendling 0864a75ebf If ADD, SUB, or MUL have an overflow bit that's used, don't do transformation on
them. The DAG combiner expects that nodes that are transformed have one value
result.

llvm-svn: 60857
2008-12-10 22:36:00 +00:00
Duncan Sands 09ed3bba2b For amusement, implement SADDO, SSUBO, UADDO, USUBO
for promoted integer types, eg: i16 on ppc-32, or
i24 on any platform.  Complete support for arbitrary
precision integers would require handling expanded
integer types, eg: i128, but I couldn't be bothered.

llvm-svn: 60834
2008-12-10 12:30:42 +00:00
Mon P Wang 4637c3c698 Fixed a bug when trying to optimize a extract vector element of a
bit convert that changes the number of elements of a shuffle.

llvm-svn: 60829
2008-12-10 03:59:02 +00:00
Chris Lattner 2e84a548d6 Allow basicaa to walk through geps with identical indices in
parallel, allowing it to decide that P/Q must alias if A/B
must alias in things like:
 P = gep A, 0, i, 1
 Q = gep B, 0, i, 1

This allows GVN to delete 62 more instructions out of 403.gcc.

llvm-svn: 60820
2008-12-10 01:04:47 +00:00
Evan Cheng 288fbd2133 Fix a couple of Dwarf bugs.
- Emit DW_AT_byte_size for struct and union of size zero.
- Emit DW_AT_declaration for forward type declaration.

llvm-svn: 60812
2008-12-10 00:15:44 +00:00
Bill Wendling 8008cb9a77 Implement fast-isel conversion of a branch instruction that's branching on an
overflow/carry from the "arithmetic with overflow" intrinsics. It searches the
machine basic block from bottom to top to find the SETO/SETC instruction that is
its conditional. If an instruction modifies EFLAGS before it reaches the
SETO/SETC instruction, then it defaults to the normal instruction emission.

llvm-svn: 60807
2008-12-09 23:19:12 +00:00
Chris Lattner 0318b56f0e loosen up an assertion that isn't valid when called from
invalidateCachedPointerInfo.  Thanks to Bill for sending me
a testcase.

llvm-svn: 60805
2008-12-09 22:45:32 +00:00
Bill Wendling db8ec2d75a Add sub/mul overflow intrinsics. This currently doesn't have a
target-independent way of determining overflow on multiplication. It's very
tricky. Patch by Zoltan Varga!

llvm-svn: 60800
2008-12-09 22:08:41 +00:00
Duncan Sands 445071c44f Fix PR3117: not all nodes being legalized. The
essential problem was that the DAG can contain
random unused nodes which were never analyzed.
When remapping a value of a node being processed,
such a node may become used and need to be analyzed;
however due to operands being transformed during
analysis the node may morph into a different one.
Users of the morphing node need to be updated, and
this wasn't happening.  While there I added a bunch
of documentation and sanity checks, so I (or some
other poor soul) won't have to scratch their head
over this stuff so long trying to remember how it
was all supposed to work next time some obscure
problem pops up!  The extra sanity checking exposed
a few places where invariants weren't being preserved,
so those are fixed too.  Since some of the sanity
checking is expensive, I added a flag to turn it
on.  It is also turned on when building with
ENABLE_EXPENSIVE_CHECKS=1.

llvm-svn: 60797
2008-12-09 21:33:20 +00:00
Chris Lattner 702e46ed54 Teach BasicAA::getModRefInfo(CallSite, CallSite) some
tricks based on readnone/readonly functions.

Teach memdep to look past readonly calls when analyzing
deps for a readonly call.  This allows elimination of a
few more calls from 403.gcc:

before:
     63 gvn    - Number of instructions PRE'd
 153986 gvn    - Number of instructions deleted
  50069 gvn    - Number of loads deleted

after:
     63 gvn    - Number of instructions PRE'd
 153991 gvn    - Number of instructions deleted
  50069 gvn    - Number of loads deleted

5 calls isn't much, but this adds plumbing for the next change.

llvm-svn: 60794
2008-12-09 21:19:42 +00:00
Evan Cheng 058522f1da xfail this for now.
llvm-svn: 60777
2008-12-09 18:43:00 +00:00
Mikhail Glushenkov e001666100 Remove Clang tests since clang is not installed on the buildbots.
llvm-svn: 60767
2008-12-09 15:11:45 +00:00
Mikhail Glushenkov 5752117a5b Add some rudimentary tests for .
llvm-svn: 60766
2008-12-09 14:41:27 +00:00
Nick Lewycky f545749f2b It's easy to handle SLE/SGE when the loop has a unit stride.
llvm-svn: 60748
2008-12-09 07:25:04 +00:00
Scott Michel 02e2c2450e CellSPU:
- Fix call.ll and call_indirect.ll expected results, now that it's using a
  different pre-register allocation scheduler.

llvm-svn: 60741
2008-12-09 06:12:03 +00:00
Mon P Wang 4dd832d241 Fix getNode to allow a vector for the shift amount for shifts of vectors.
Fix the shift amount when unrolling a vector shift into scalar shifts.
Fix problem in getShuffleScalarElt where it assumes that the input of
a bit convert must be a vector.

llvm-svn: 60740
2008-12-09 05:46:39 +00:00
Devang Patel 5f769e2d40 Actually test something. Use PR3170 test case.
llvm-svn: 60727
2008-12-08 23:44:46 +00:00
Devang Patel 1c469d36b0 Undo previous patch.
llvm-svn: 60701
2008-12-08 17:02:37 +00:00
Dan Gohman 4c31524bec Factor out the code for sign-extending/truncating gep indices
and use it in x86 address mode folding. Also, make
getRegForValue return 0 for illegal types even if it has a
ValueMap for them, because Argument values are put in the
ValueMap. This fixes PR3181.

llvm-svn: 60696
2008-12-08 07:57:47 +00:00
Mikhail Glushenkov 7f1bef5a55 Make 'extern' an option property.
Makes (forward) work better.

llvm-svn: 60667
2008-12-07 16:47:12 +00:00
Mikhail Glushenkov 203cad7326 Add some clarifying comments.
llvm-svn: 60662
2008-12-07 16:44:15 +00:00
Mikhail Glushenkov 7429d925f0 Add tests for tblgen's LLVMC backend.
llvm-svn: 60657
2008-12-07 16:41:50 +00:00
Chris Lattner f50d7f76c6 fix a bug I introduced in simplifycfg handling single entry phi
nodes. FoldSingleEntryPHINodes deletes the PHI, so there is no
need to delete it afterward.

llvm-svn: 60653
2008-12-07 07:22:45 +00:00
Evan Cheng ab85feb91c Clean up some ARM GV asm printing out; minor fixes to match what gcc does.
llvm-svn: 60621
2008-12-06 02:00:55 +00:00
Chris Lattner 57e91eaf61 Reimplement the inner loop of DSE. It now uniformly uses getDependence(),
doesn't do its own local caching, and is slightly more aggressive about
free/store dse (see testcase).  This eliminates the last external client 
of MemDep::getDependenceFrom().

llvm-svn: 60619
2008-12-06 00:53:22 +00:00
Dale Johannesen 0733759b5a Fix test to pass on Linux.
llvm-svn: 60614
2008-12-05 22:38:21 +00:00
Dale Johannesen 9efd2ce55b Make LoopStrengthReduce smarter about hoisting things out of
loops when they can be subsumed into addressing modes.

Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)

llvm-svn: 60608
2008-12-05 21:47:27 +00:00
Evan Cheng 7a15646d69 This test also requires -mattr=+sse41.
llvm-svn: 60601
2008-12-05 19:26:37 +00:00
Evan Cheng fd8c4d5975 Effectively undo 60461 in PIC mode which simply transform V_SET0 / V_SETALLONES into a load from constpool in order to fold into restores. This is not safe to do when PIC base is being used for a number of reasons:
1. GlobalBaseReg may have been spilled.
2. It may not be live at the use.
3. Spiller doesn't know this is happening so it won't prevent GlobalBaseReg from being spilled later (That by itself is a nasty hack. It's needed because we don't insert the reload until later).

llvm-svn: 60595
2008-12-05 17:23:48 +00:00
Chris Lattner c100828026 Fix test/Transforms/GVN/pre-load.ll
llvm-svn: 60594
2008-12-05 17:04:12 +00:00
Evan Cheng 2a03c7e977 Re-did 60519. It turns out Darwin's handling of hidden visibility symbols are a bit more complicate than I expected. Both declarations and weak definitions still need a stub indirection. However, the stubs are in data section and they contain the addresses of the actual symbols.
llvm-svn: 60571
2008-12-05 01:06:39 +00:00
Scott Michel 6ce01ab378 CellSPU: Add new directory under tests/CodeGen/CellSPU to retain tests that
aren't part of the test suite but are generally useful nonetheless, and can
be expanded later to test the backend against the actual Cell SPU system.

There's basically no other good place to put this code, so put it here for
the time being.

- vecoperations.c: Vector shuffles for all supported vector types, tests
  for v16i8 add and multiply.

llvm-svn: 60566
2008-12-05 00:01:00 +00:00
Devang Patel c56423b500 Rewrite code that 1) filters loops and 2) calculates new loop bounds.
This fixes many bugs. I will add more test cases in a separate check-in.

Some day, the code that manipulates CFG and updates dom. info could use refactoring help.

llvm-svn: 60554
2008-12-04 21:38:42 +00:00
Bill Wendling 6949f6135b Temporarily revert r60519. It was causing a bootstrap failure:
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT barrier.lo -MD -MP -MF .deps/barrier.Tpo -c ../../../llvm-gcc.src/libgomp/barrier.c  -fno-common -DPIC -o .libs/barrier.o
checking for sys/file.h... /var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:non-relocatable subtraction expression, "_gomp_tls_key" minus "L1$pb"
/var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:symbol: "_gomp_tls_key" can't be undefined in a subtraction expression
make[4]: *** [barrier.lo] Error 1
make[4]: *** Waiting for unfinished jobs....
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT alloc.lo -MD -MP -MF .deps/alloc.Tpo -c ../../../llvm-gcc.src/libgomp/alloc.c -o alloc.o >/dev/null 2>&1
yes
checking for sys/param.h... make[3]: *** [all-recursive] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all-target-libgomp] Error 2
make[1]: *** Waiting for unfinished jobs....

llvm-svn: 60527
2008-12-04 04:07:00 +00:00
Evan Cheng 011c4fa8a1 Visibility hidden GVs do not require extra load of symbol address from the GOT or non-lazy-ptr.
llvm-svn: 60519
2008-12-04 01:56:50 +00:00
Evan Cheng 1339e72d97 Use mmx (punpckldq VR64, (mmx_v_set0)) to clear high 32-bits of a VR64 register.
llvm-svn: 60499
2008-12-03 19:38:05 +00:00
Rafael Espindola 74dc32d422 Fix some tests. The grep for "il" was matching "file".
llvm-svn: 60485
2008-12-03 17:14:56 +00:00
Richard Osborne feece7edab Add support for ISD::TRAP to the XCore backend
llvm-svn: 60479
2008-12-03 10:59:16 +00:00
Evan Cheng b5a97ff651 Fix test.
llvm-svn: 60476
2008-12-03 08:20:45 +00:00
Chris Lattner 350fc5721d testcase for br undef folding.
llvm-svn: 60471
2008-12-03 07:48:27 +00:00
Chris Lattner 595c7279bd Teach jump threading some more simple tricks:
1) have it fold "br undef", which does occur with
   surprising frequency as jump threading iterates.
2) teach j-t to delete dead blocks.  This removes the successor
   edges, reducing the in-edges of other blocks, allowing 
   recursive simplification.
3) Fold things like:
     br COND, BBX, BBY
  BBX:
     br COND, BBZ, BBW

   which also happens because jump threading iterates.

llvm-svn: 60470
2008-12-03 07:48:08 +00:00
Chris Lattner 50532410d1 don't spew tons of stuff to the output. This testcase is *not* for
loop deletion (it is for a ton of passes), which is very bad.

llvm-svn: 60465
2008-12-03 06:41:50 +00:00
Dan Gohman cc78cdf275 Mark x86's V_SET0 and V_SETALLONES with isSimpleLoad, and teach X86's
foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.

Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).

llvm-svn: 60461
2008-12-03 05:21:24 +00:00
Bill Wendling e3402692d8 Change label to 'carry' for unsigned adds.
llvm-svn: 60460
2008-12-03 02:43:12 +00:00
Dan Gohman 5d3d1f69e1 Fix byval arguments in the fastcc calling convention. The fastcc convention
delegates to the regular x86-32 convention which handles byval, but only
after it handles a few cases, and it's necessary to handle byval before
handling those cases. This fixes PR3122 (and rdar://6400815), llvm-gcc
miscompiling LLVM.

llvm-svn: 60453
2008-12-03 01:28:04 +00:00
Dan Gohman 971c88f3b2 Add nounwind attributes to this test.
llvm-svn: 60451
2008-12-03 01:10:18 +00:00
Dale Johannesen b43a689520 testcases for recent dag combiner changes
llvm-svn: 60449
2008-12-03 00:52:41 +00:00
Evan Cheng e62150cae4 Remove a (what appears to be) overly strict assertion. Here is what happened:
1. ppcf128 select is expanded to f64 select's.
2. f64 select operand 0 is an i1 truncate, it's promoted to i32 zero_extend.
3. f64 select is updated. It's changed back to a "NewNode" and being re-analyzed.
4. f64 select operands are being processed. Operand 0 is a "NewNode". It's being expunged out of ReplacedValues map.
5. ExpungeNode tries to remap f64 select and notice it's a "NewNode" and assert.
Duncan, please take a look. Thanks.

llvm-svn: 60443
2008-12-02 21:57:09 +00:00
Scott Michel 7364025ff8 CellSPU:
- Incorporate Tilmann Scheller's ISD::TRUNCATE custom lowering patch
- Update SPU calling convention info, even if it's not used yet (but can be
  at some point or another)
- Ensure that any-extended f32 loads are custom lowered, especially when
  they're promoted for use in printf.

llvm-svn: 60438
2008-12-02 19:53:53 +00:00
Chris Lattner 1db9bbe802 Implement PRE of loads in the GVN pass with a pretty cheap and
straight-forward implementation.  This does not require any extra
alias analysis queries beyond what we already do for non-local loads.

Some programs really really like load PRE.  For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.

The biggest limitation to the implementation is that it does not split
critical edges.  This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.

The implementation of this should incidentally speed up rejection of 
non-local loads because it avoids creating the repl densemap in cases 
when it won't be used for fully redundant loads.

This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact.  This is pretty close to ready though.

llvm-svn: 60408
2008-12-02 08:16:11 +00:00
Owen Anderson 35bd70c07a Add a test for my previous PRE fix.
llvm-svn: 60394
2008-12-02 04:25:42 +00:00
Evan Cheng 1718fd4375 Fix PR3124: overly strict assert.
llvm-svn: 60392
2008-12-02 02:15:36 +00:00
Bill Wendling 30e9dc81c8 Second stab at target-dependent lowering of everyone's favorite nodes: [SU]ADDO
- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
  EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C,
  depending on the type of ADDO requested.

- LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or
  COND_C set.

llvm-svn: 60388
2008-12-02 01:06:39 +00:00
Chris Lattner b2f131a4ab Add rdar reference, make this actually fail when the patch isn't applied.
llvm-svn: 60376
2008-12-01 22:35:31 +00:00
Dale Johannesen 069a4eee55 Consider only references to an IV within the loop when
figuring out the base of the IV.  This produces better
code in the example.  (Addresses use (IV) instead of 
(BASE,IV) - a significant improvement on low-register
machines like x86).

llvm-svn: 60374
2008-12-01 22:00:01 +00:00
Scott Michel 08a4e2045d CellSPU:
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
  (add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32

llvm-svn: 60358
2008-12-01 17:56:02 +00:00
Bill Wendling 582fe6b0ca Use m_Specific() instead of double matching.
llvm-svn: 60341
2008-12-01 08:09:47 +00:00
Chris Lattner 9e6b243428 simplify these patterns using m_Specific. No need to grep for
xor in testcase (or is a substring).

llvm-svn: 60328
2008-12-01 05:16:26 +00:00
Chris Lattner 9d02a70a7d Teach inst combine to merge GEPs through PHIs. This is really
important because it is sinking the loads using the GEPs, but
not the GEPs themselves.  This triggers 647 times on 403.gcc
and makes the .s file much much nicer.  For example before:

        je      LBB1_87 ## bb78
LBB1_62:        ## bb77
        leal    84(%esi), %eax
LBB1_63:        ## bb79
        movl    (%eax), %eax
...
LBB1_87:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
        jmp     LBB1_62 ## bb77


after:

        jne     LBB1_63 ## bb79
LBB1_62:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
LBB1_63:        ## bb79
        movl    84(%esi), %eax

The input code was (and the GEPs are merged and
the PHI is now eliminated by instcombine):

        br i1 %tmp233, label %bb78, label %bb77
bb77:           
        %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb78:           
        call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind
        %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb79:           
        %iftmp.12.0.in = phi %struct.rtx_def** [ %tmp235, %bb78 ], [ %tmp234, %bb77 ]           
        %iftmp.12.0 = load %struct.rtx_def** %iftmp.12.0.in             

llvm-svn: 60322
2008-12-01 02:34:36 +00:00
Chris Lattner 8facc59e72 testcase for my previous commit.
llvm-svn: 60315
2008-12-01 01:42:03 +00:00
Bill Wendling 5b902c5b1e Implement ((A|B)&1)|(B&-2) -> (A&1) | B transformation. This also takes care of
permutations of this pattern.

llvm-svn: 60312
2008-12-01 01:07:11 +00:00
Bill Wendling de89bc275c Add instruction combining for ((A&~B)|(~A&B)) -> A^B and all permutations.
llvm-svn: 60291
2008-11-30 13:52:49 +00:00
Bill Wendling 9eef421e12 Implement (A&((~A)|B)) -> A&B transformation in the instruction combiner. This
takes care of all permutations of this pattern.

llvm-svn: 60290
2008-11-30 13:08:13 +00:00
Bill Wendling 2d2e7861b5 getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all
APInt calls instead.

This fixes PR3144.

llvm-svn: 60288
2008-11-30 12:38:24 +00:00
Eli Friedman 09bc610945 Optimize memmove and memset into the LLVM builtins. Note that these
only show up in code from front-ends besides llvm-gcc, like clang.

llvm-svn: 60287
2008-11-30 08:32:11 +00:00
Eli Friedman c8228d263b Followup to r60283: optimize arbitrary width signed divisions as well
as unsigned divisions.  Same caveats as before.

llvm-svn: 60284
2008-11-30 06:35:39 +00:00
Eli Friedman 1b7fc154a5 Fix for PR2164: allow transforming arbitrary-width unsigned divides into
multiplies.

Some more cleverness would be nice, though. It would be nice if we 
could do this transformation on illegal types.  Also, we would 
prefer a narrower constant when possible so that we can use a narrower
multiply, which can be cheaper.

llvm-svn: 60283
2008-11-30 06:02:26 +00:00
Eli Friedman bd0f57821a APIntify a test which is potentially unsafe otherwise, and fix the
nearby FIXME.

I'm not sure what the right way to fix the Cell test was; if the 
approach I used isn't okay, please let me know.

llvm-svn: 60277
2008-11-30 04:59:26 +00:00
Bill Wendling 361c0e5f9c Strengthen check for div inst-combining.
llvm-svn: 60276
2008-11-30 04:33:53 +00:00
Bill Wendling 70635adea3 Instcombine was illegally transforming -X/C into X/-C when either X or C
overflowed on negation. This commit checks to make sure that neithe C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.

llvm-svn: 60275
2008-11-30 03:42:12 +00:00
Chris Lattner c40039c736 don't require GVN to work on dead values, just make the
test return the loaded value.

llvm-svn: 60252
2008-11-29 21:21:48 +00:00
Chris Lattner 8c5ff516c6 Fix a thinko that manifested as a crash on clamav last night.
llvm-svn: 60251
2008-11-29 20:29:04 +00:00
Chris Lattner d3d9111ede Fix PR3141 by ensuring that MemoryDependenceAnalysis::removeInstruction
properly updates the reverse dependency map when it installs updated 
dependencies for instructions that depend on the removed instruction.

llvm-svn: 60222
2008-11-28 22:51:08 +00:00
Chris Lattner 8a172daa55 don't call MergeBasicBlockIntoOnlyPred on a block whose only
predecessor is itself.  This doesn't make sense, and this is
a dead infinite loop anyway.

llvm-svn: 60210
2008-11-28 19:54:49 +00:00
Nick Lewycky 4ab50b93c8 Chris prefers icmp/select over udiv!
llvm-svn: 60187
2008-11-27 22:41:10 +00:00
Nick Lewycky 69941fd0a0 Add a couple of missed optimizations on integer vectors. Multiply and divide
by 1, as well as multiply by -1.

llvm-svn: 60182
2008-11-27 20:21:08 +00:00
Chris Lattner 5dfbfcd80d Fix PR3138: if we merge the entry block into another block, make sure to
move the other block back up into the entry position!

llvm-svn: 60179
2008-11-27 19:25:19 +00:00
Bill Wendling 077eb6fcc2 XFAil test due to reverting of patch.
llvm-svn: 60161
2008-11-27 07:34:10 +00:00
Chris Lattner 98d89d1b1b Make jump threading substantially more powerful, in the following ways:
1. Make it fold blocks separated by an unconditional branch.  This enables
   jump threading to see a broader scope.
2. Make jump threading able to eliminate locally redundant loads when they
   feed the branch condition of a block.  This frequently occurs due to
   reg2mem running.
3. Make jump threading able to eliminate *partially redundant* loads when
   they feed the branch condition of a block.  This is common in code with
   lots of loads and stores like C++ code and 255.vortex.

This implements thread-loads.ll and rdar://6402033.

Per the fixme's, several pieces of this should be moved into Transforms/Utils.

llvm-svn: 60148
2008-11-27 05:07:53 +00:00
Evan Cheng 3761143755 Avoid inserting noop's in the middle of a loop.
llvm-svn: 60141
2008-11-27 01:16:00 +00:00
Evan Cheng 83bdb38965 On x86 favors folding short immediate into some arithmetic operations (e.g. add, and, xor, etc.) because materializing an immediate in a register is expensive in turns of code size.
e.g.
movl 4(%esp), %eax
addl $4, %eax

is 2 bytes shorter than

movl $4, %eax
addl 4(%esp), %eax

llvm-svn: 60139
2008-11-27 00:49:46 +00:00
Evan Cheng d1dda5339d Add -march=x86.
llvm-svn: 60135
2008-11-27 00:37:06 +00:00
Bill Wendling a69ced6b68 Add x86-specific test for add-with-overflow intrinsics.
llvm-svn: 60125
2008-11-26 22:42:19 +00:00
Chris Lattner 397a11ccd8 Turn on my codegen prepare heuristic by default. It doesn't affect
performance in most cases on the Grawp tester, but does speed some 
things up (like shootout/hash by 15%).  This also doesn't impact 
compile time in a noticable way on the Grawp tester.

It also, of course, gets the testcase it was designed for right :)

llvm-svn: 60120
2008-11-26 22:16:44 +00:00
Duncan Sands d1ba7908cf Check that running the DAG combiner between type
and operation legalization does something useful.

llvm-svn: 60108
2008-11-26 16:44:30 +00:00
Bill Wendling 3d14916b3e Add test for rdar://6394879.
llvm-svn: 60079
2008-11-26 02:21:12 +00:00
Chris Lattner eb3e4fb6fb This adds in some code (currently disabled unless you pass
-enable-smarter-addr-folding to llc) that gives CGP a better
cost model for when to sink computations into addressing modes.
The basic observation is that sinking increases register 
pressure when part of the addr computation has to be available
for other reasons, such as having a use that is a non-memory
operation.  In cases where it works, it can substantially reduce
register pressure.

This code is currently an overall win on 403.gcc and 255.vortex
(the two things I've been looking at), but there are several 
things I want to do before enabling it by default:

1. This isn't doing any caching of results, so it is much slower 
   than it could be.  It currently slows down release-asserts llc 
   by 1.7% on 176.gcc: 27.12s -> 27.60s.
2. This doesn't think about inline asm memory operands yet.
3. The cost model botches the case when the needed value is live
   across the computation for other reasons.

I'll continue poking at this, and eventually turn it on as llcbeta.

llvm-svn: 60074
2008-11-26 02:00:14 +00:00
Chris Lattner a9ab165b08 Teach CodeGenPrepare to look through Bitcast instructions when attempting to
optimize addressing modes.  This allows us to optimize things like isel-sink2.ll
into:

	movl	4(%esp), %eax
	cmpb	$0, 4(%eax)
	jne	LBB1_2	## F
LBB1_1:	## TB
	movl	$4, %eax
	ret
LBB1_2:	## F
	movzbl	7(%eax), %eax
	ret

instead of:

_test:
	movl	4(%esp), %eax
	cmpb	$0, 4(%eax)
	leal	4(%eax), %eax
	jne	LBB1_2	## F
LBB1_1:	## TB
	movl	$4, %eax
	ret
LBB1_2:	## F
	movzbl	3(%eax), %eax
	ret

This shrinks (e.g.) 403.gcc from 1133510 to 1128345 lines of .s.

Note that the 2008-10-16-SpillerBug.ll testcase is dubious at best, I doubt
it is really testing what it thinks it is.

llvm-svn: 60068
2008-11-26 00:26:16 +00:00
Chris Lattner f0e01def8c fix an over-reduced test.
llvm-svn: 60067
2008-11-26 00:12:08 +00:00
Chris Lattner 0f98f74c74 this doesn't need EH
llvm-svn: 60066
2008-11-26 00:03:26 +00:00
Mikhail Glushenkov 98d5ed5cb7 Since the old llvmc was removed, rename llvmc2 to llvmc.
llvm-svn: 60048
2008-11-25 21:38:12 +00:00
Evan Cheng 2e5aeff676 convertToSignExtendedInteger should return opInvalidOp instead of asserting if sematics of float does not allow arithmetics.
llvm-svn: 60042
2008-11-25 19:00:29 +00:00
Scott Michel 910046d174 CellSPU:
(a) Remove conditionally removed code in SelectXAddr. Basically, hope for the
    best that the A-form and D-form address predicates catch everything before
    the code decides to emit a X-form address.
(b) Expand vector store test cases to include the usual suspects.

llvm-svn: 60034
2008-11-25 17:29:43 +00:00
Scott Michel 5149430c6e CellSPU: test should use shlqby, not shlqbyi
llvm-svn: 60001
2008-11-25 01:30:37 +00:00
Bill Wendling aec5a56446 XFAIL this test. A recent CellSPU check-in broke it.
llvm-svn: 60000
2008-11-25 00:56:34 +00:00
Dan Gohman ad2134d45d Initial support for anti-dependence breaking. Currently this code does not
introduce any new spilling; it just uses unused registers.

Refactor the SUnit topological sort code out of the RRList scheduler and
make use of it to help with the post-pass scheduler.

llvm-svn: 59999
2008-11-25 00:52:40 +00:00
Bill Wendling a307020800 Testcase for constant CFStrings.
llvm-svn: 59992
2008-11-24 23:28:09 +00:00
Chris Lattner 18065ce9fc reenable test
llvm-svn: 59986
2008-11-24 21:27:20 +00:00
Bill Wendling e6fe59df6d Temporarily XFAIL this test. r59976 and r59972 broke it.
llvm-svn: 59981
2008-11-24 20:43:33 +00:00
Chris Lattner 53d6a07869 Fix 3113: If we have a dead cyclic PHI, replace the whole thing
with an undef.

llvm-svn: 59972
2008-11-24 19:25:36 +00:00
Scott Michel 2e5df906f8 CellSPU:
(a) Slight rethink on i64 zero/sign/any extend code - use a shuffle to
    directly zero-extend i32 to i64, but use rotates and shifts for
    sign extension. Also ensure unified register consistency.
(b) Add new test harness for i64 operations: i64ops.ll

llvm-svn: 59970
2008-11-24 18:20:46 +00:00
Scott Michel efc8c7a292 CellSPU:
(a) Improve the extract element code: there's no need to do gymnastics with
    rotates into the preferred slot if a shuffle will do the same thing.
(b) Rename a couple of SPUISD pseudo-instructions for readability and better
    semantic correspondence.
(c) Fix i64 sign/any/zero extension lowering.

llvm-svn: 59965
2008-11-24 17:11:17 +00:00
Bill Wendling 411eaa5c57 Test add-with-overflow with fast ISel.
llvm-svn: 59945
2008-11-24 05:23:38 +00:00