2f82d2d58a 
								
							 
						 
						
							
							
								
								Add FSQRT, FSIN, FCOS nodes, patch contributed by Morten Ofstad  
							
							... 
							
							
							
							llvm-svn: 21605 
							
						 
						
							2005-04-28 21:44:03 +00:00  
				
					
						
							
							
								 
						
							
								4a73c2cfdc 
								
							 
						 
						
							
							
								
								Implement Value* tracking for loads and stores in the selection DAG.  This enables one to use alias analysis in the backends.  
							
							... 
							
							
							
							(TRUNC)Stores and (EXT|ZEXT|SEXT)Loads have an extra SDOperand which is a SrcValueSDNode which contains the Value*.  Note that if the operation is introduced by the backend, it will still have the operand, but the Value* will be null.
llvm-svn: 21599 
							
						 
						
							2005-04-27 20:10:01 +00:00  
				
					
						
							
							
								 
						
							
								cfa7ddd6e2 
								
							 
						 
						
							
							
								
								Fold  (X >  -1) | (Y >  -1)  -->   (X&Y >  -1)  
							
							... 
							
							
							
							llvm-svn: 21552 
							
						 
						
							2005-04-26 01:18:33 +00:00  
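The fold in r21552 works because a signed value is greater than -1 exactly when its sign bit is clear, and the sign bit of X&Y is clear whenever either input's sign bit is clear. A minimal C sketch of the equivalence, assuming 32-bit two's-complement integers (the function names are hypothetical and do not come from the LLVM source):

```c
#include <stdint.h>

/* Unfolded form: two signed compares joined by OR. */
int either_nonneg_unfolded(int32_t x, int32_t y) {
    return (x > -1) | (y > -1);
}

/* Folded form: one AND followed by a single signed compare.
   Assumes two's-complement representation for the bitwise AND. */
int either_nonneg_folded(int32_t x, int32_t y) {
    return (x & y) > -1;
}
```

Both functions return 1 when at least one argument is non-negative and 0 when both are negative.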
				
					
						
							
							
								 
						
							
								f806459d90 
								
							 
						 
						
							
							
								
								implement some more logical compares with constants, so that:  
							
							... 
							
							
							
							int foo1(int x, int y) {
  int t1 = x >= 0;
  int t2 = y >= 0;
  return t1 & t2;
}
int foo2(int x, int y) {
  int t1 = x == -1;
  int t2 = y == -1;
  return t1 & t2;
}
produces:
_foo1:
        or r2, r4, r3
        srwi r2, r2, 31
        xori r3, r2, 1
        blr
_foo2:
        and r2, r4, r3
        addic r2, r2, 1
        li r2, 0
        addze r3, r2
        blr
instead of:
_foo1:
        srwi r2, r4, 31
        xori r2, r2, 1
        srwi r3, r3, 31
        xori r3, r3, 1
        and r3, r2, r3
        blr
_foo2:
        addic r2, r4, 1
        li r2, 0
        addze r2, r2
        addic r3, r3, 1
        li r3, 0
        addze r3, r3
        and r3, r2, r3
        blr
llvm-svn: 21547 
							
						 
						
							2005-04-25 21:20:28 +00:00  
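The improved PPC sequences in r21547 correspond to combining the operands first and comparing once. The sketch below writes out what the new assembly appears to compute, assuming 32-bit two's-complement integers; `foo1_folded` and `foo2_folded` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* foo1: (x >= 0) & (y >= 0).  Both are non-negative exactly when the sign
   bit of (x | y) is clear, which mirrors the or / srwi 31 / xori 1 sequence. */
int foo1_folded(int32_t x, int32_t y) {
    return (int)(((uint32_t)(x | y) >> 31) ^ 1u);
}

/* foo2: (x == -1) & (y == -1).  Both are all-ones exactly when (x & y)
   is all-ones, so a single compare against -1 suffices. */
int foo2_folded(int32_t x, int32_t y) {
    return (x & y) == -1;
}
```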
				
					
						
							
							
								 
						
							
								d373ff64aa 
								
							 
						 
						
							
							
								
								Codegen x < 0 | y < 0 as (x|y) < 0.  This allows us to compile this to:  
							
							... 
							
							
							
							_foo:
        or r2, r4, r3
        srwi r3, r2, 31
        blr
instead of:
_foo:
        srwi r2, r4, 31
        srwi r3, r3, 31
        or r3, r2, r3
        blr
llvm-svn: 21544 
							
						 
						
							2005-04-25 21:03:25 +00:00  
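The rewrite in r21544 can be spot-checked on boundary values. The following C sketch compares the two forms over a small sample set; it assumes 32-bit two's-complement integers, and all names in it are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Original form: two sign tests joined by OR. */
int any_negative(int32_t x, int32_t y) { return (x < 0) | (y < 0); }

/* Rewritten form: OR the operands, then test the sign once. */
int any_negative_folded(int32_t x, int32_t y) { return (x | y) < 0; }

/* Counts the sample pairs on which the two forms disagree. */
int count_mismatches(void) {
    static const int32_t samples[] = { INT32_MIN, -2, -1, 0, 1, 2, INT32_MAX };
    const size_t n = sizeof samples / sizeof samples[0];
    int mismatches = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (any_negative(samples[i], samples[j]) !=
                any_negative_folded(samples[i], samples[j]))
                mismatches++;
    return mismatches;
}
```

A sample-based check like this is not a proof, but the identity follows directly from the sign bit of `x | y` being the OR of the two sign bits.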
				
					
						
							
							
								 
						
							
								774511633d 
								
							 
						 
						
							
							
								
								Convert tabs to spaces  
							
							... 
							
							
							
							llvm-svn: 21439 
							
						 
						
							2005-04-22 04:01:18 +00:00  
				
					
						
							
							
								 
						
							
								835702a094 
								
							 
						 
						
							
							
								
								Remove trailing whitespace  
							
							... 
							
							
							
							llvm-svn: 21420 
							
						 
						
							2005-04-21 22:36:52 +00:00  
				
					
						
							
							
								 
						
							
								f6302441f0 
								
							 
						 
						
							
							
								
								Improve AND elimination.  On PPC, for:  
							
							... 
							
							
							
							bool %test(int %X) {
        %Y = and int %X, 8
        %Z = setne int %Y, 0
        ret bool %Z
}
we now generate this:
        rlwinm r2, r3, 0, 28, 28
        srwi r3, r2, 3
instead of this:
        rlwinm r2, r3, 0, 28, 28
        srwi r2, r2, 3
        rlwinm r3, r2, 0, 31, 31
I'll leave it to Nate to get it down to one instruction. :)
---------------------------------------------------------------------
llvm-svn: 21391 
							
						 
						
							2005-04-21 06:28:15 +00:00  
				
					
						
							
							
								 
						
							
								ab1ed77570 
								
							 
						 
						
							
							
								
								Fold (x & 8) != 0 and (x & 8) == 8  into (x & 8) >> 3.  
							
							... 
							
							
							
							This turns this PPC code:
        rlwinm r2, r3, 0, 28, 28
        cmpwi cr7, r2, 8
        mfcr r2
        rlwinm r3, r2, 31, 31, 31
into this:
        rlwinm r2, r3, 0, 28, 28
        srwi r2, r2, 3
        rlwinm r3, r2, 0, 31, 31
Next up, nuking the extra and.
llvm-svn: 21390 
							
						 
						
							2005-04-21 06:12:41 +00:00  
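The fold in r21390 replaces a compare against a single-bit mask with a shift of the masked value. A small C sketch of the three equivalent forms for the mask 8 used in the commit message (function names are hypothetical):

```c
#include <stdint.h>

/* (x & 8) != 0, as a compare. */
uint32_t bit3_ne_zero(uint32_t x) { return (x & 8u) != 0; }

/* (x & 8) == 8, as a compare. */
uint32_t bit3_eq_mask(uint32_t x) { return (x & 8u) == 8u; }

/* Folded form: the masked bit shifted down to bit 0 is already 0 or 1. */
uint32_t bit3_shift(uint32_t x) { return (x & 8u) >> 3; }
```

The same reasoning applies to any single-bit mask `1 << k` with a shift by `k`; it does not hold for masks with more than one bit set.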
				
					
						
							
							
								 
						
							
								b61ecb5875 
								
							 
						 
						
							
							
								
								Fold setcc of MVT::i1 operands into logical operations  
							
							... 
							
							
							
							llvm-svn: 21319 
							
						 
						
							2005-04-18 04:48:12 +00:00  
				
					
						
							
							
								 
						
							
								6d40fd01fe 
								
							 
						 
						
							
							
								
								Another minor simplification: handle setcc (zero_extend x), c -> setcc(x, c')  
							
							... 
							
							
							
							llvm-svn: 21318 
							
						 
						
							2005-04-18 04:30:45 +00:00  
				
					
						
							
							
								 
						
							
								868d473009 
								
							 
						 
						
							
							
								
								Another simple xform  
							
							... 
							
							
							
							llvm-svn: 21317 
							
						 
						
							2005-04-18 04:11:19 +00:00  
				
					
						
							
							
								 
						
							
								bd22d83d15 
								
							 
						 
						
							
							
								
								Fold:  
							
							... 
							
							
							
							// (X != 0) | (Y != 0) -> (X|Y != 0)
        // (X == 0) & (Y == 0) -> (X|Y == 0)
Compiling this:
int %bar(int %a, int %b) {
        entry:
        %tmp.1 = setne int %a, 0
        %tmp.2 = setne int %b, 0
        %tmp.3 = or bool %tmp.1, %tmp.2
        %retval = cast bool %tmp.3 to int
        ret int %retval
        }
to this:
_bar:
        or r2, r3, r4
        addic r3, r2, -1
        subfe r3, r3, r2
        blr
instead of:
_bar:
        addic r2, r3, -1
        subfe r2, r2, r3
        addic r3, r4, -1
        subfe r3, r3, r4
        or r3, r2, r3
        blr
llvm-svn: 21316 
							
						 
						
							2005-04-18 03:59:53 +00:00  
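Both folds in r21316 rest on the fact that X|Y is zero only when X and Y are both zero. A minimal C sketch of the folded forms (hypothetical function names):

```c
#include <stdint.h>

/* (X != 0) | (Y != 0)  ->  (X|Y) != 0 */
int any_nonzero_folded(uint32_t x, uint32_t y) {
    return (x | y) != 0;
}

/* (X == 0) & (Y == 0)  ->  (X|Y) == 0 */
int both_zero_folded(uint32_t x, uint32_t y) {
    return (x | y) == 0;
}
```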
				
					
						
							
							
								 
						
							
								d929f8bcd3 
								
							 
						 
						
							
							
								
								Make the AND elimination operation recursive and significantly more powerful,  
							
							... 
							
							
							
							eliminating an and for Nate's testcase:
int %bar(int %a, int %b) {
        entry:
        %tmp.1 = setne int %a, 0
        %tmp.2 = setne int %b, 0
        %tmp.3 = or bool %tmp.1, %tmp.2
        %retval = cast bool %tmp.3 to int
        ret int %retval
        }
generating:
_bar:
        addic r2, r3, -1
        subfe r2, r2, r3
        addic r3, r4, -1
        subfe r3, r3, r4
        or r3, r2, r3
        blr
instead of:
_bar:
        addic r2, r3, -1
        subfe r2, r2, r3
        addic r3, r4, -1
        subfe r3, r3, r4
        or r2, r2, r3
        rlwinm r3, r2, 0, 31, 31
        blr
llvm-svn: 21315 
							
						 
						
							2005-04-18 03:48:41 +00:00  
				
					
						
							
							
								 
						
							
								80c095f422 
								
							 
						 
						
							
							
								
								Add a couple missing transforms in getSetCC that were triggering assertions  
							
							... 
							
							
							
							in the PPC Pattern ISel
llvm-svn: 21297 
							
						 
						
							2005-04-14 08:56:52 +00:00  
				
					
						
							
							
								 
						
							
								4ddd81657b 
								
							 
						 
						
							
							
								
								Disable the broken fold of shift + sz[ext] for now  
							
							... 
							
							
							
							Move the transform for select (a < 0) ? b : 0 into the dag from ppc isel
Enable the dag to fold and (setcc, 1) -> setcc for targets where setcc
  always produces zero or one.
llvm-svn: 21291 
							
						 
						
							2005-04-13 21:23:31 +00:00  
				
					
						
							
							
								 
						
							
								56d177a344 
								
							 
						 
						
							
							
								
								fix an infinite loop  
							
							... 
							
							
							
							llvm-svn: 21289 
							
						 
						
							2005-04-13 20:06:29 +00:00  
				
					
						
							
							
								 
						
							
								e3d17d8225 
								
							 
						 
						
							
							
								
								fix some serious miscompiles on ia64, alpha, and ppc  
							
							... 
							
							
							
							llvm-svn: 21288 
							
						 
						
							2005-04-13 19:53:40 +00:00  
				
					
						
							
							
								 
						
							
								8c3d409dc7 
								
							 
						 
						
							
							
								
								avoid work when possible, perhaps fix the problem nate and andrew are seeing  
							
							... 
							
							
							
							with != 0 comparisons vanishing.
llvm-svn: 21287 
							
						 
						
							2005-04-13 19:41:05 +00:00  
				
					
						
							
							
								 
						
							
								b1f25ac188 
								
							 
						 
						
							
							
								
								add back the optimization that Nate added for shl X, (zext_inreg y)  
							
							... 
							
							
							
							llvm-svn: 21273 
							
						 
						
							2005-04-13 02:58:13 +00:00  
				
					
						
							
							
								 
						
							
								39844ac337 
								
							 
						 
						
							
							
								
								Oops, remove these too.  
							
							... 
							
							
							
							llvm-svn: 21272 
							
						 
						
							2005-04-13 02:47:57 +00:00  
				
					
						
							
							
								 
						
							
								2b4e3fca38 
								
							 
						 
						
							
							
								
								Remove all foldings of ZERO_EXTEND_INREG, moving them to work for AND nodes  
							
							... 
							
							
							
							instead.  Overall, this increases the amount of folding we can do.
llvm-svn: 21265 
							
						 
						
							2005-04-13 02:38:18 +00:00  
				
					
						
							
							
								 
						
							
								ca916ba4a0 
								
							 
						 
						
							
							
								
								Fold shift x, [sz]ext(y) -> shift x, y  
							
							... 
							
							
							
							llvm-svn: 21262 
							
						 
						
							2005-04-12 23:32:28 +00:00  
				
					
						
							
							
								 
						
							
								af1c0f7a00 
								
							 
						 
						
							
							
								
								Fold shift by size larger than type size to undef  
							
							... 
							
							
							
							Make llvm undef values generate ISD::UNDEF nodes
llvm-svn: 21261 
							
						 
						
							2005-04-12 23:12:17 +00:00  
				
					
						
							
							
								 
						
							
								af5b25f139 
								
							 
						 
						
							
							
								
								Remove some redundant checks, add a couple of new ones.  This allows us to  
							
							... 
							
							
							
							compile this:
int foo (unsigned long a, unsigned long long g) {
  return a >= g;
}
To:
foo:
        movl 8(%esp), %eax
        cmpl %eax, 4(%esp)
        setae %al
        cmpl $0, 12(%esp)
        sete %cl
        andb %al, %cl
        movzbl %cl, %eax
        ret
instead of:
foo:
        movl 8(%esp), %eax
        cmpl %eax, 4(%esp)
        setae %al
        movzbw %al, %cx
        movl 12(%esp), %edx
        cmpl $0, %edx
        sete %al
        movzbw %al, %ax
        cmpl $0, %edx
        cmove %cx, %ax
        movzbl %al, %eax
        ret
llvm-svn: 21244 
							
						 
						
							2005-04-12 02:54:39 +00:00  
				
					
						
							
							
								 
						
							
								87bd69884a 
								
							 
						 
						
							
							
								
								canonicalize x <u 1 -> x == 0.  On this testcase:  
							
							... 
							
							
							
							unsigned long long g;
unsigned long foo (unsigned long a) {
  return (a >= g) ? 1 : 0;
}
It changes the ppc code from:
_foo:
.LBB_foo_0:     ; entry
        mflr r11
        stw r11, 8(r1)
        bl "L00000$pb"
"L00000$pb":
        mflr r2
        addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
        lwz r4, 0(r2)
        lwz r2, 4(r2)
        cmplw cr0, r3, r2
        li r2, 1
        li r3, 0
        bge .LBB_foo_2  ; entry
.LBB_foo_1:     ; entry
        or r2, r3, r3
.LBB_foo_2:     ; entry
        cmplwi cr0, r4, 1
        li r3, 1
        li r5, 0
        blt .LBB_foo_4  ; entry
.LBB_foo_3:     ; entry
        or r3, r5, r5
.LBB_foo_4:     ; entry
        cmpwi cr0, r4, 0
        beq .LBB_foo_6  ; entry
.LBB_foo_5:     ; entry
        or r2, r3, r3
.LBB_foo_6:     ; entry
        rlwinm r3, r2, 0, 31, 31
        lwz r11, 8(r1)
        mtlr r11
        blr
to:
_foo:
.LBB_foo_0:     ; entry
        mflr r11
        stw r11, 8(r1)
        bl "L00000$pb"
"L00000$pb":
        mflr r2
        addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
        lwz r4, 0(r2)
        lwz r2, 4(r2)
        cmplw cr0, r3, r2
        li r2, 1
        li r3, 0
        bge .LBB_foo_2  ; entry
.LBB_foo_1:     ; entry
        or r2, r3, r3
.LBB_foo_2:     ; entry
        cntlzw r3, r4
        srwi r3, r3, 5
        cmpwi cr0, r4, 0
        beq .LBB_foo_4  ; entry
.LBB_foo_3:     ; entry
        or r2, r3, r3
.LBB_foo_4:     ; entry
        rlwinm r3, r2, 0, 31, 31
        lwz r11, 8(r1)
        mtlr r11
        blr
llvm-svn: 21241 
							
						 
						
							2005-04-12 00:28:49 +00:00  
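The canonicalization in r21241 lets the PPC backend test for zero with `cntlzw` followed by `srwi ..., 5`, as seen in the second listing. The C sketch below models that trick under the assumption that count-leading-zeros returns 32 for a zero input, as `cntlzw` does; the helper names are hypothetical.

```c
#include <stdint.h>

/* Portable count-leading-zeros that returns 32 for an input of 0. */
unsigned clz32(uint32_t x) {
    unsigned n = 0;
    while (n < 32 && !(x & (0x80000000u >> n)))
        n++;
    return n;
}

/* x <u 1, the form before canonicalization. */
int ult_one(uint32_t x) { return x < 1u; }

/* x == 0 computed as clz(x) >> 5: only a count of 32 has bit 5 set. */
int is_zero_via_clz(uint32_t x) { return (int)(clz32(x) >> 5); }
```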
				
					
						
							
							
								 
						
							
								e2427c9afc 
								
							 
						 
						
							
							
								
								Don't bother sign/zext_inreg'ing the result of an and operation if we know  
							
							... 
							
							
							
							the result does not change as a result of the extend.
This improves codegen for Alpha on this testcase:
int %a(ushort* %i) {
        %tmp.1 = load ushort* %i
        %tmp.2 = cast ushort %tmp.1 to int
        %tmp.4 = and int %tmp.2, 1
        ret int %tmp.4
}
Generating:
a:
        ldgp $29, 0($27)
        ldwu $0,0($16)
        and $0,1,$0
        ret $31,($26),1
instead of:
a:
        ldgp $29, 0($27)
        ldwu $0,0($16)
        and $0,1,$0
        addl $0,0,$0
        ret $31,($26),1
btw, alpha really should switch to livein/outs for args :)
llvm-svn: 21213 
							
						 
						
							2005-04-10 23:37:16 +00:00  
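The idea behind r21213 is that an in-register extend is a no-op when the AND mask already clears every bit the extend could change, including the would-be sign bit. The C sketch below illustrates this with a 16-bit in-register extend; the width and the helper names are assumptions chosen for illustration and are not taken from the commit.

```c
#include <stdint.h>

/* Zero-extend the low 16 bits in place. */
uint32_t zext_inreg16(uint32_t v) { return v & 0xFFFFu; }

/* Sign-extend the low 16 bits in place, using only unsigned arithmetic
   so the result is well defined in C. */
uint32_t sext_inreg16(uint32_t v) {
    return ((v & 0xFFFFu) ^ 0x8000u) - 0x8000u;
}
```

For any `x`, `sext_inreg16(x & 1)` and `zext_inreg16(x & 1)` both equal `x & 1`, so the extend can be dropped. With a mask that keeps bit 15, such as `x & 0xFFFF`, the sign extend can still change the value and must stay.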
				
					
						
							
							
								 
						
							
								f74c794ccf 
								
							 
						 
						
							
							
								
								Fold zext_inreg(zextload), likewise for sext's  
							
							... 
							
							
							
							llvm-svn: 21204 
							
						 
						
							2005-04-10 04:33:08 +00:00  
				
					
						
							
							
								 
						
							
								f2bff92411 
								
							 
						 
						
							
							
								
								add a simple xform  
							
							... 
							
							
							
							llvm-svn: 21203 
							
						 
						
							2005-04-10 04:04:49 +00:00  
				
					
						
							
							
								 
						
							
								d8cbfe82ba 
								
							 
						 
						
							
							
								
								Fix a thinko.  If the operand is promoted, pass the promoted value into  
							
							... 
							
							
							
							the new zero extend, not the original operand.  This fixes cast bool -> long
on ppc.
Add an unrelated fixme
llvm-svn: 21196 
							
						 
						
							2005-04-10 01:13:15 +00:00  
				
					
						
							
							
								 
						
							
								da504741da 
								
							 
						 
						
							
							
								
								add a little peephole optimization.  This allows us to codegen:  
							
							... 
							
							
							
							int a(short i) {
        return i & 1;
}
as
_a:
        andi. r3, r3, 1
        blr
instead of:
_a:
        rlwinm r2, r3, 0, 16, 31
        andi. r3, r2, 1
        blr
on ppc.  It should also help the other risc targets.
llvm-svn: 21189 
							
						 
						
							2005-04-09 21:43:54 +00:00  
				
					
						
							
							
								 
						
							
								6a31b878f8 
								
							 
						 
						
							
							
								
								recognize some patterns as fabs operations, so that fabs at the source level  
							
							... 
							
							
							
							is deconstructed then reconstructed here.  This catches 19 fabs's in 177.mesa,
9 in 168.wupwise, 5 in 171.swim, 3 in 172.mgrid, and 14 in 173.applu out of
specfp2000.
This allows the X86 code generator to make MUCH better code than before for
each of these and saves one instr on ppc.
This depends on the previous CFE patch to expose these correctly.
llvm-svn: 21171 
							
						 
						
							2005-04-09 05:15:53 +00:00  
				
					
						
							
							
								 
						
							
								b0713c74a2 
								
							 
						 
						
							
							
								
								print and fold BRCONDTWOWAY correctly  
							
							... 
							
							
							
							llvm-svn: 21165 
							
						 
						
							2005-04-09 03:27:28 +00:00  
				
					
						
							
							
								 
						
							
								0ea81f9db4 
								
							 
						 
						
							
							
								
								canonicalize a bunch of operations involving fneg  
							
							... 
							
							
							
							llvm-svn: 21160 
							
						 
						
							2005-04-09 03:02:46 +00:00  
				
					
						
							
							
								 
						
							
								b32d9318d2 
								
							 
						 
						
							
							
								
								If a target zero or sign extends the result of its setcc, allow folding of  
							
							... 
							
							
							
							this into sign/zero extension instructions later.
On PPC, for example, this testcase:
%G = external global sbyte
implementation
void %test(int %X, int %Y) {
  %C = setlt int %X, %Y
  %D = cast bool %C to sbyte
  store sbyte %D, sbyte* %G
  ret void
}
Now codegens to:
        cmpw cr0, r3, r4
        li r3, 1
        li r4, 0
        blt .LBB_test_2 ;
.LBB_test_1:    ;
        or r3, r4, r4
.LBB_test_2:    ;
        addis r2, r2, ha16(L_G$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_G$non_lazy_ptr-"L00000$pb")(r2)
        stb r3, 0(r2)
instead of:
        cmpw cr0, r3, r4
        li r3, 1
        li r4, 0
        blt .LBB_test_2 ;
.LBB_test_1:    ;
        or r3, r4, r4
.LBB_test_2:    ;
***     rlwinm r3, r3, 0, 31, 31
        addis r2, r2, ha16(L_G$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_G$non_lazy_ptr-"L00000$pb")(r2)
        stb r3, 0(r2)
llvm-svn: 21148 
							
						 
						
							2005-04-07 19:43:53 +00:00  
				
					
						
							
							
								 
						
							
								dfed7355c9 
								
							 
						 
						
							
							
								
								Remove something I had for testing  
							
							... 
							
							
							
							llvm-svn: 21144 
							
						 
						
							2005-04-07 18:58:54 +00:00  
				
					
						
							
							
								 
						
							
								6b03a0cba1 
								
							 
						 
						
							
							
								
								This patch does two things.  First, it canonicalizes 'X >= C' -> 'X > C-1'  
							
							... 
							
							
							
							(likewise for <=, >=u, and <=u).
Second, it implements a special case hack to turn 'X gtu SINTMAX' -> 'X lt 0'
On powerpc, for example, this changes this:
        lis r2, 32767
        ori r2, r2, 65535
        cmplw cr0, r3, r2
        bgt .LBB_test_2
into:
        cmpwi cr0, r3, 0
        blt .LBB_test_2
llvm-svn: 21142 
							
						 
						
							2005-04-07 18:14:58 +00:00  
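Both rewrites in r21142 can be written out in C. This is a sketch assuming 32-bit two's-complement integers; the function names are hypothetical, and the guard against the minimum constant is an assumption about what a correct implementation needs rather than something stated in the commit message.

```c
#include <stdint.h>

/* X >= C rewritten as X > C-1.  Only valid when C is not INT32_MIN,
   because C-1 would overflow; callers must respect that precondition. */
int ge_as_gt(int32_t x, int32_t c) {
    return x > c - 1;
}

/* X >u SINTMAX: unsigned compare against 0x7FFFFFFF. */
int ugt_sintmax(uint32_t x) {
    return x > (uint32_t)INT32_MAX;
}

/* X <s 0: true exactly when the sign bit is set, which is what a signed
   compare against zero tests on a two's-complement machine. */
int slt_zero(uint32_t x) {
    return (x >> 31) != 0;
}
```

The second rewrite pays off on PowerPC because 0 fits in a compare-immediate, while 0x7FFFFFFF has to be materialized with `lis`/`ori` first.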
				
					
						
							
							
								 
						
							
								7d13eae254 
								
							 
						 
						
							
							
								
								Fix a really scary bug that Nate found where we weren't deleting the right  
							
							... 
							
							
							
							elements out of the auto-CSE maps.
llvm-svn: 21128 
							
						 
						
							2005-04-07 00:30:13 +00:00  
				
					
						
							
							
								 
						
							
								55e8625c69 
								
							 
						 
						
							
							
								
								Add MULHU and MULHS nodes for the high part of an (un)signed 32x32=64b  
							
							... 
							
							
							
							multiply.
llvm-svn: 21102 
							
						 
						
							2005-04-05 22:36:56 +00:00  
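The semantics of the MULHU and MULHS nodes added in r21102 can be modeled in C by widening to 64 bits and keeping the upper half. This is a sketch of the arithmetic only, with hypothetical function names; it says nothing about how any target lowers the nodes.

```c
#include <stdint.h>

/* High 32 bits of an unsigned 32x32 -> 64-bit multiply. */
uint32_t mulhu32(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* High 32 bits of a signed 32x32 -> 64-bit multiply.  The final cast from
   uint32_t to int32_t is implementation-defined in C for large values; it
   wraps as expected on two's-complement compilers, which is assumed here. */
int32_t mulhs32(int32_t a, int32_t b) {
    int64_t product = (int64_t)a * (int64_t)b;
    return (int32_t)(uint32_t)((uint64_t)product >> 32);
}
```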
				
					
						
							
							
								 
						
							
								c4a2046a88 
								
							 
						 
						
							
							
								
								print fneg/fabs  
							
							... 
							
							
							
							llvm-svn: 21008 
							
						 
						
							2005-04-02 04:58:41 +00:00  
				
					
						
							
							
								 
						
							
								4157c417a1 
								
							 
						 
						
							
							
								
								fix some bugs in the implementation of SHL_PARTS and friends.  
							
							... 
							
							
							
							llvm-svn: 21004 
							
						 
						
							2005-04-02 04:00:59 +00:00  
				
					
						
							
							
								 
						
							
								5b7bb56ef8 
								
							 
						 
						
							
							
								
								Print some new nodes  
							
							... 
							
							
							
							llvm-svn: 21001 
							
						 
						
							2005-04-02 03:30:42 +00:00  
				
					
						
							
							
								 
						
							
								cda9aa7fa9 
								
							 
						 
						
							
							
								
								Add ISD::UNDEF node  
							
							... 
							
							
							
							Teach the SelectionDAG code how to expand and promote it
Have PPC32 LowerCallTo generate ISD::UNDEF for int arg regs used up by fp
  arguments, but not shadowing their value.  This allows us to do the right
  thing with both fixed and vararg floating point arguments.
llvm-svn: 20988 
							
						 
						
							2005-04-01 22:34:39 +00:00  
				
					
						
							
							
								 
						
							
								dec53920b4 
								
							 
						 
						
							
							
								
								PCMarker support for DAG and Alpha  
							
							... 
							
							
							
							llvm-svn: 20965 
							
						 
						
							2005-03-31 21:24:06 +00:00  
				
					
						
							
							
								 
						
							
								85e7163947 
								
							 
						 
						
							
							
								
								Fix a bug where we would incorrectly do a sign ext instead of a zero ext  
							
							... 
							
							
							
							because we were checking the wrong thing.  Thanks to andrew for pointing
this out!
llvm-svn: 20554 
							
						 
						
							2005-03-10 20:55:51 +00:00  
				
					
						
							
							
								 
						
							
								7f26946709 
								
							 
						 
						
							
							
								
								constant fold FP_ROUND_INREG, ZERO_EXTEND_INREG, and SIGN_EXTEND_INREG  
							
							... 
							
							
							
							This allows the alpha backend to compile:
bool %test(uint %P) {
        %c = seteq uint %P, 0
        ret bool %c
}
into:
test:
        ldgp $29, 0($27)
        ZAP $16,240,$0
        CMPEQ $0,0,$0
        AND $0,1,$0
        ret $31,($26),1
instead of:
test:
        ldgp $29, 0($27)
        ZAP $16,240,$0
        ldiq $1,0
        ZAP $1,240,$1
        CMPEQ $0,$1,$0
        AND $0,1,$0
        ret $31,($26),1
... and fixes PR534.
llvm-svn: 20534 
							
						 
						
							2005-03-09 18:37:12 +00:00  
				
					
						
							
							
								 
						
							
								381dddc90c 
								
							 
						 
						
							
							
								
								Don't rely on doubles comparing identical to each other, which doesn't work  
							
							... 
							
							
							
							for 0.0 and -0.0.
llvm-svn: 20230 
							
						 
						
							2005-02-17 20:17:32 +00:00  
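The problem fixed in r20230 is that `0.0 == -0.0` is true even though the two values have different bit patterns, so an equality compare cannot tell the constants apart. One way to compare doubles for identity is to compare their raw bits; the sketch below does this, assuming a 64-bit IEEE-754 `double`. The function names are hypothetical, and the commit message does not say which mechanism LLVM adopted.

```c
#include <stdint.h>
#include <string.h>

/* Returns the raw bit pattern of a double (assumes 64-bit IEEE-754). */
uint64_t double_bits(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Bitwise identity: distinguishes 0.0 from -0.0, unlike operator ==. */
int same_double_bits(double a, double b) {
    return double_bits(a) == double_bits(b);
}
```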
				
					
						
							
							
								 
						
							
								90b7c13f3a 
								
							 
						 
						
							
							
								
								Remove the 3 HACK HACK HACKs I put in before, fixing them properly with  
							
							... 
							
							
							
							the new TLI that is available.
Implement support for handling out of range shifts.  This allows us to
compile this code (a 64-bit rotate):
unsigned long long f3(unsigned long long x) {
  return (x << 32) | (x >> (64-32));
}
into this:
f3:
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %EAX, DWORD PTR [%ESP + 8]
        ret
GCC produces this:
$ gcc t.c -masm=intel -O3 -S -o - -fomit-frame-pointer
..
f3:
        push    %ebx
        mov     %ebx, DWORD PTR [%esp+12]
        mov     %ecx, DWORD PTR [%esp+8]
        mov     %eax, %ebx
        mov     %edx, %ecx
        pop     %ebx
        ret
The Simple ISEL produces (eww gross):
f3:
        sub %ESP, 4
        mov DWORD PTR [%ESP], %ESI
        mov %EDX, DWORD PTR [%ESP + 8]
        mov %ECX, DWORD PTR [%ESP + 12]
        mov %EAX, 0
        mov %ESI, 0
        or %EAX, %ECX
        or %EDX, %ESI
        mov %ESI, DWORD PTR [%ESP]
        add %ESP, 4
        ret
llvm-svn: 19780 
							
						 
						
							2005-01-23 04:39:44 +00:00  
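The two-instruction result for `f3` in r19780 reflects the fact that rotating a 64-bit value by 32 just swaps its 32-bit halves. A C sketch of that equivalence, where `swap_halves` is a hypothetical helper written for comparison:

```c
#include <stdint.h>

/* The 64-bit rotate from the commit message. */
uint64_t f3(uint64_t x) {
    return (x << 32) | (x >> (64 - 32));
}

/* The same result expressed as an explicit swap of the two halves. */
uint64_t swap_halves(uint64_t x) {
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    return ((uint64_t)lo << 32) | (uint64_t)hi;
}
```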
				
					
						
							
							
								 
						
							
								3bc78b2e0b 
								
							 
						 
						
							
							
								
								More bugfixes for IA64 shifts.  
							
							... 
							
							
							
							llvm-svn: 19739 
							
						 
						
							2005-01-22 00:33:03 +00:00  
				
					
						
							
							
								 
						
							
								d637c96fac 
								
							 
						 
						
							
							
								
								Add a nasty hack to fix Alpha/IA64 multiplies by a power of two.  
							
							... 
							
							
							
							llvm-svn: 19737 
							
						 
						
							2005-01-22 00:20:42 +00:00  
				
					
						
							
							
								 
						
							
								d53e763f18 
								
							 
						 
						
							
							
								
								Remove unneeded line.  
							
							... 
							
							
							
							llvm-svn: 19736 
							
						 
						
							2005-01-21 23:43:12 +00:00  
				
					
						
							
							
								 
						
							
								4f987bf16d 
								
							 
						 
						
							
							
								
								test commit  
							
							... 
							
							
							
							llvm-svn: 19735 
							
						 
						
							2005-01-21 23:38:56 +00:00  
				
					
						
							
							
								 
						
							
								96e809c47d 
								
							 
						 
						
							
							
								
								Unary token factor nodes are unneeded.  
							
							... 
							
							
							
							llvm-svn: 19727 
							
						 
						
							2005-01-21 18:01:22 +00:00  
				
					
						
							
							
								 
						
							
								1fe9b40981 
								
							 
						 
						
							
							
								
								implement add_parts/sub_parts.  
							
							... 
							
							
							
							llvm-svn: 19714 
							
						 
						
							2005-01-20 18:50:55 +00:00  
				
					
						
							
							
								 
						
							
								9b75e148fd 
								
							 
						 
						
							
							
								
								Know some identities about tokenfactor nodes.  
							
							... 
							
							
							
							llvm-svn: 19699 
							
						 
						
							2005-01-19 18:01:40 +00:00  
				
					
						
							
							
								 
						
							
								32a5f02598 
								
							 
						 
						
							
							
								
								Know some simple identities.  This improves codegen for (1LL << N).  
							
							... 
							
							
							
							llvm-svn: 19698 
							
						 
						
							2005-01-19 17:29:49 +00:00  
				
					
						
							
							
								 
						
							
								a9d53f9fb9 
								
							 
						 
						
							
							
								
								Keep track of the retval type as well.  
							
							... 
							
							
							
							llvm-svn: 19670 
							
						 
						
							2005-01-18 19:26:36 +00:00  
				
					
						
							
							
								 
						
							
								b07e2d2084 
								
							 
						 
						
							
							
								
								Allow setcc operations to have nonbool types.  
							
							... 
							
							
							
							llvm-svn: 19656 
							
						 
						
							2005-01-18 02:52:03 +00:00  
				
					
						
							
							
								 
						
							
								2b4b79581d 
								
							 
						 
						
							
							
								
								Fix the completely broken FP constant folds for setcc's.  
							
							... 
							
							
							
							llvm-svn: 19651 
							
						 
						
							2005-01-18 02:11:55 +00:00  
				
					
						
							
							
								 
						
							
								16f64df93a 
								
							 
						 
						
							
							
								
								Refactor code into a new method.  
							
							... 
							
							
							
							llvm-svn: 19635 
							
						 
						
							2005-01-17 17:15:02 +00:00  
				
					
						
							
							
								 
						
							
								4e550ebb38 
								
							 
						 
						
							
							
								
								Add assertions.  
							
							... 
							
							
							
							llvm-svn: 19596 
							
						 
						
							2005-01-16 02:23:22 +00:00  
				
					
						
							
							
								 
						
							
								0fe7776da5 
								
							 
						 
						
							
							
								
								Eliminate unneeded extensions.  
							
							... 
							
							
							
							llvm-svn: 19577 
							
						 
						
							2005-01-16 00:17:20 +00:00  
				
					
						
							
							
								 
						
							
								630d1937bf 
								
							 
						 
						
							
							
								
								Print extra type for nodes with extra type info.  
							
							... 
							
							
							
							llvm-svn: 19575 
							
						 
						
							2005-01-15 21:11:37 +00:00  
				
					
						
							
							
								 
						
							
								09d1b3d01d 
								
							 
						 
						
							
							
								
								Common code factored out.  
							
							... 
							
							
							
							llvm-svn: 19572 
							
						 
						
							2005-01-15 07:14:32 +00:00  
				
					
						
							
							
								 
						
							
								1001c6e2cd 
								
							 
						 
						
							
							
								
								Add new SIGN_EXTEND_INREG, ZERO_EXTEND_INREG, and FP_ROUND_INREG operators.  
							
							... 
							
							
							
							llvm-svn: 19568 
							
						 
						
							2005-01-15 06:17:04 +00:00  
				
					
						
							
							
								 
						
							
								3b8e719d1d 
								
							 
						 
						
							
							
								
								Adjust to CopyFromReg changes, implement deletion of truncating/extending  
							
							... 
							
							
							
							stores/loads.
llvm-svn: 19562 
							
						 
						
							2005-01-14 22:38:01 +00:00  
				
					
						
							
							
								 
						
							
								39c6744c9f 
								
							 
						 
						
							
							
								
								Start implementing truncating stores and extending loads.  
							
							... 
							
							
							
							llvm-svn: 19559 
							
						 
						
							2005-01-14 22:08:15 +00:00  
				
					
						
							
							
								 
						
							
								e727af06c8 
								
							 
						 
						
							
							
								
								Add new ImplicitDef node, rename CopyRegSDNode class to RegSDNode.  
							
							... 
							
							
							
							llvm-svn: 19535 
							
						 
						
							2005-01-13 20:50:02 +00:00  
				
					
						
							
							
								 
						
							
								4b1be0dfeb 
								
							 
						 
						
							
							
								
								Print new node.  
							
							... 
							
							
							
							llvm-svn: 19526 
							
						 
						
							2005-01-13 17:59:10 +00:00  
				
					
						
							
							
								 
						
							
								4dfd2cfc0c 
								
							 
						 
						
							
							
								
								Do not fold (zero_ext (sign_ext V)) -> (sign_ext V), they are not the same.  
							
							
							
							
							This fixes llvm-test/SingleSource/Regression/C/casts.c
llvm-svn: 19519 
							
						 
						
							2005-01-12 18:51:15 +00:00  
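
Why the fold removed in llvm-svn 19519 was wrong can be seen with ordinary integer casts. The following is only a sketch: the function names and the chosen widths (8, 16, 32 bits) are assumptions made for illustration, not code from the commit.

```cpp
#include <cstdint>

// zero_ext (sign_ext V): sign-extend 8 -> 16 bits, then zero-extend 16 -> 32 bits.
constexpr std::uint32_t zextOfSext(std::int8_t v) {
    std::int16_t mid = v;                     // sign_ext: copies the sign bit upward
    return static_cast<std::uint16_t>(mid);   // zero_ext: upper 16 bits become 0
}

// sign_ext V alone: sign-extend 8 -> 32 bits.
constexpr std::uint32_t sextOnly(std::int8_t v) {
    return static_cast<std::uint32_t>(static_cast<std::int32_t>(v));
}

// For negative inputs the two results differ in the upper bits.
static_assert(zextOfSext(-1) == 0x0000FFFFu, "zero_ext clears the top half");
static_assert(sextOnly(-1) == 0xFFFFFFFFu, "sign_ext fills the top half");
// For non-negative inputs they agree, which is what makes the bad fold easy to miss.
static_assert(zextOfSext(5) == sextOnly(5), "same result for non-negative values");
```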
				
					
						
							
							
								 
						
							
								40e7982c2c 
								
							 
						 
						
							
							
								
								New method  
							
							
							
							
							llvm-svn: 19517 
							
						 
						
							2005-01-12 18:37:47 +00:00  
				
					
						
							
							
								 
						
							
								844277fb1e 
								
							 
						 
						
							
							
								
								Print new operations.  
							
							
							
							
							llvm-svn: 19464 
							
						 
						
							2005-01-11 05:57:01 +00:00  
				
					
						
							
							
								 
						
							
								a86fa4455b 
								
							 
						 
						
							
							
								
								shift X, 0 -> X  
							
							
							
							
							llvm-svn: 19453 
							
						 
						
							2005-01-11 04:25:13 +00:00  
				
					
						
							
							
								 
						
							
								9e4c76123c 
								
							 
						 
						
							
							
								
								Split out SDNode::getOperationName into its own method.  
							
							
							
							
							llvm-svn: 19443 
							
						 
						
							2005-01-10 23:25:25 +00:00  
				
					
						
							
							
								 
						
							
								41b764144d 
								
							 
						 
						
							
							
								
								Implement a couple of more simplifications.  This lets us codegen:  
							
							
							
							
							int test2(int * P, int* Q, int A, int B) {
        return P+A == P;
}
into:
test2:
        movl 4(%esp), %eax
        movl 12(%esp), %eax
        shll $2, %eax
        cmpl $0, %eax
        sete %al
        movzbl %al, %eax
        ret
instead of:
test2:
        movl 4(%esp), %eax
        movl 12(%esp), %ecx
        leal (%eax,%ecx,4), %ecx
        cmpl %eax, %ecx
        sete %al
        movzbl %al, %eax
        ret
ICC is producing worse code:
test2:
        movl      4(%esp), %eax                                 #8.5
        movl      12(%esp), %edx                                #8.5
        lea       (%edx,%edx), %ecx                             #9.9
        addl      %ecx, %ecx                                    #9.9
        addl      %eax, %ecx                                    #9.9
        cmpl      %eax, %ecx                                    #9.16
        movl      $0, %eax                                      #9.16
        sete      %al                                           #9.16
        ret                                                     #9.16
as is GCC (looks like our old code):
test2:
        movl    4(%esp), %edx
        movl    12(%esp), %eax
        leal    (%edx,%eax,4), %ecx
        cmpl    %edx, %ecx
        sete    %al
        movzbl  %al, %eax
        ret
llvm-svn: 19430 
							
						 
						
							2005-01-10 02:03:02 +00:00  
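
The improved `test2` output in llvm-svn 19430 compares the scaled index against zero instead of forming `P + A*4` and comparing it with `P`. A minimal sketch of why the two tests agree, assuming 32-bit wrapping arithmetic and using hypothetical function names:

```cpp
#include <cstdint>

// Before the simplification: compute P + (A << 2) and compare it with P.
constexpr bool cmpBeforeFold(std::uint32_t p, std::uint32_t a) {
    return static_cast<std::uint32_t>(p + (a << 2)) == p;
}

// After the simplification: P cancels on both sides, leaving (A << 2) == 0.
constexpr bool cmpAfterFold(std::uint32_t a) {
    return static_cast<std::uint32_t>(a << 2) == 0;
}

static_assert(cmpBeforeFold(0x1000u, 0u) == cmpAfterFold(0u), "A == 0");
static_assert(cmpBeforeFold(0x1000u, 7u) == cmpAfterFold(7u), "A != 0");
// The scaled index can wrap to zero; both forms still give the same answer.
static_assert(cmpBeforeFold(0x1000u, 0x40000000u) == cmpAfterFold(0x40000000u), "wrap");
```

The comparison is against the shifted value rather than `A` itself, which matches the `shll $2` followed by `cmpl $0` in the new output.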
				
					
						
							
							
								 
						
							
								00c231baa7 
								
							 
						 
						
							
							
								
								Fix incorrect constant folds, fixing Stepanov after the SHR patch.  
							
							
							
							
							llvm-svn: 19429 
							
						 
						
							2005-01-10 01:16:03 +00:00  
				
					
						
							
							
								 
						
							
								0966a75e76 
								
							 
						 
						
							
							
								
								Constant fold shifts, turning this loop:  
							
							
							
							
							.LBB_Z5test0PdS__3:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        movl $16000, %ecx
        sarl $3, %ecx
        cmpl %eax, %ecx
        fstpl 16(%esp)
        #FP_REG_KILL
        jg .LBB_Z5test0PdS__3   # no_exit.1
into:
.LBB_Z5test0PdS__3:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        cmpl $2000, %eax
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__3   # no_exit.1
llvm-svn: 19427 
							
						 
						
							2005-01-10 00:07:15 +00:00  
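
The arithmetic behind llvm-svn 19427 is small enough to check by hand: `movl $16000, %ecx` followed by `sarl $3, %ecx` always yields 2000, so the loop test `2000 > i` can be emitted as `i < 2000` against an immediate. A sketch with hypothetical helper names (this is not the SelectionDAG code itself):

```cpp
#include <cstdint>

// Stand-in for folding an arithmetic right shift whose operands are both constants.
// Only used with a non-negative value here, so the result is well defined.
constexpr std::int32_t foldSra(std::int32_t value, unsigned amount) {
    return value >> amount;
}

// Loop test as originally emitted: (16000 >> 3) > i.
constexpr bool loopTestBefore(std::int32_t i) { return foldSra(16000, 3) > i; }

// Loop test after folding, with the operands swapped: i < 2000.
constexpr bool loopTestAfter(std::int32_t i) { return i < 2000; }

static_assert(foldSra(16000, 3) == 2000, "16000 / 8");
static_assert(loopTestBefore(1999) == loopTestAfter(1999), "last iteration");
static_assert(loopTestBefore(2000) == loopTestAfter(2000), "loop exit");
```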
				
					
						
							
							
								 
						
							
								fde3a212e2 
								
							 
						 
						
							
							
								
								Add some folds for == and != comparisons.  This allows us to codegen this loop in stepanov:
no_exit.i:              ; preds = %entry, %no_exit.i, %then.i, %_Z5checkd.exit
        %i.0.0 = phi int [ 0, %entry ], [ %i.0.0, %no_exit.i ], [ %inc.0, %_Z5checkd.exit ], [ %inc.012, %then.i ]              ; <int> [#uses=3]
        %indvar = phi uint [ %indvar.next, %no_exit.i ], [ 0, %entry ], [ 0, %then.i ], [ 0, %_Z5checkd.exit ]          ; <uint> [#uses=3]
        %result_addr.i.0 = phi double [ %tmp.4.i.i, %no_exit.i ], [ 0.000000e+00, %entry ], [ 0.000000e+00, %then.i ], [ 0.000000e+00, %_Z5checkd.exit ]          ; <double> [#uses=1]
        %first_addr.0.i.2.rec = cast uint %indvar to int                ; <int> [#uses=1]
        %first_addr.0.i.2 = getelementptr [2000 x double]* %data, int 0, uint %indvar           ; <double*> [#uses=1]
        %inc.i.rec = add int %first_addr.0.i.2.rec, 1           ; <int> [#uses=1]
        %inc.i = getelementptr [2000 x double]* %data, int 0, int %inc.i.rec            ; <double*> [#uses=1]
        %tmp.3.i.i = load double* %first_addr.0.i.2             ; <double> [#uses=1]
        %tmp.4.i.i = add double %result_addr.i.0, %tmp.3.i.i            ; <double> [#uses=2]
        %tmp.2.i = seteq double* %inc.i, getelementptr ([2000 x double]* %data, int 0, int 2000)                ; <bool> [#uses=1]
        %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
        br bool %tmp.2.i, label %_Z10accumulateIPddET0_T_S2_S1_.exit, label %no_exit.i
To this:
.LBB_Z4testIPddEvT_S1_T0__1:    # no_exit.i
        fldl data(,%eax,8)
        fldl 16(%esp)
        faddp %st(1)
        fstpl 16(%esp)
        incl %eax
        movl %eax, %ecx
        shll $3, %ecx
        cmpl $16000, %ecx
        #FP_REG_KILL
        jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i
instead of this:
.LBB_Z4testIPddEvT_S1_T0__1:    # no_exit.i
        fldl data(,%eax,8)
        fldl 16(%esp)
        faddp %st(1)
        fstpl 16(%esp)
        incl %eax
        leal data(,%eax,8), %ecx
        leal data+16000, %edx
        cmpl %edx, %ecx
        #FP_REG_KILL
        jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i
llvm-svn: 19425 
							
						 
						
							2005-01-09 20:52:51 +00:00  
				
					
						
							
							
								 
						
							
								7d1670da3f 
								
							 
						 
						
							
							
								
								Fix VC++ compilation error  
							
							
							
							
							llvm-svn: 19423 
							
						 
						
							2005-01-09 20:41:56 +00:00  
				
					
						
							
							
								 
						
							
								e6f7882c27 
								
							 
						 
						
							
							
								
								Print the DAG out more like a DAG in nested format.  
							
							
							
							
							llvm-svn: 19422 
							
						 
						
							2005-01-09 20:38:33 +00:00  
				
					
						
							
							
								 
						
							
								1270acc1ce 
								
							 
						 
						
							
							
								
								Print out nodes sorted by their address to make it easier to find them in a list.  
							
							
							
							
							llvm-svn: 19421 
							
						 
						
							2005-01-09 20:26:36 +00:00  
				
					
						
							
							
								 
						
							
								3d5d5022d5 
								
							 
						 
						
							
							
								
								Add a simple transformation.  This allows us to compile one of the inner loops in stepanov to this:
.LBB_Z5test0PdS__2:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        cmpl $2000, %eax
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__2
instead of this:
.LBB_Z5test0PdS__2:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        movl $data, %ecx
        movl %ecx, %edx
        addl $16000, %edx
        subl %ecx, %edx
        movl %edx, %ecx
        sarl $2, %ecx
        shrl $29, %ecx
        addl %ecx, %edx
        sarl $3, %edx
        cmpl %edx, %eax
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__2
The old instruction selector produced:
.LBB_Z5test0PdS__2:     # no_exit.1
        fldl 24(%esp)
        faddl data(,%eax,8)
        fstl 24(%esp)
        movl %eax, %ecx
        incl %ecx
        incl %eax
        leal data+16000, %edx
        movl $data, %edi
        subl %edi, %edx
        movl %edx, %edi
        sarl $2, %edi
        shrl $29, %edi
        addl %edi, %edx
        sarl $3, %edx
        cmpl %edx, %ecx
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__2   # no_exit.1
Which is even worse!
llvm-svn: 19419 
							
						 
						
							2005-01-09 20:09:57 +00:00  
				
					
						
							
							
								 
						
							
								2a6db3c351 
								
							 
						 
						
							
							
								
								Add support for FP->INT conversions and back.  
							
							
							
							
							llvm-svn: 19369 
							
						 
						
							2005-01-08 08:08:56 +00:00  
				
					
						
							
							
								 
						
							
								9a97e4d5b6 
								
							 
						 
						
							
							
								
								1ULL << 64 is undefined, don't do it.  
							
							
							
							
							llvm-svn: 19365 
							
						 
						
							2005-01-08 06:24:30 +00:00  
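
The log entry for llvm-svn 19365 does not show how the shift was avoided. One common way to build a low-bits mask without ever evaluating `1ULL << 64` is sketched below; the helper name `lowBitsMask` and its 1..64 contract are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns a mask with the low `bits` bits set, 1 <= bits <= 64.
inline std::uint64_t lowBitsMask(unsigned bits) {
    assert(bits >= 1 && bits <= 64 && "bit width out of range");
    // (1ULL << bits) - 1 is undefined for bits == 64 because the shift count
    // equals the type width. Shifting an all-ones value right keeps the count
    // in the range 0..63 for every accepted input.
    return ~0ULL >> (64 - bits);
}
```

With this form, `lowBitsMask(64)` shifts by zero and returns all ones, which is the case the naive expression cannot handle.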
				
					
						
							
							
								 
						
							
								e0f1fe181a 
								
							 
						 
						
							
							
								
								Fix a pointer invalidation problem.  This fixes Generic/badarg6.ll  
							
							
							
							
							llvm-svn: 19361 
							
						 
						
							2005-01-07 23:32:00 +00:00  
				
					
						
							
							
								 
						
							
								5c66e45b92 
								
							 
						 
						
							
							
								
								Fold conditional branches on constants away.  
							
							
							
							
							llvm-svn: 19360 
							
						 
						
							2005-01-07 22:49:57 +00:00  
				
					
						
							
							
								 
						
							
								cda3efa6e5 
								
							 
						 
						
							
							
								
								Fix a thinko in the reassociation code, fixing Generic/badlive.ll  
							
							
							
							
							llvm-svn: 19359 
							
						 
						
							2005-01-07 22:44:09 +00:00  
				
					
						
							
							
								 
						
							
								4d5ba99283 
								
							 
						 
						
							
							
								
								Simplify: truncate ({zero|sign}_extend (X))  
							
							
							
							
							llvm-svn: 19353 
							
						 
						
							2005-01-07 21:56:24 +00:00  
				
					
						
							
							
								 
						
							
								9c667933c1 
								
							 
						 
						
							
							
								
								Implement RemoveDeadNodes  
							
							
							
							
							llvm-svn: 19345 
							
						 
						
							2005-01-07 21:09:16 +00:00  
				
					
						
							
							
								 
						
							
								061a1ea9e3 
								
							 
						 
						
							
							
								
								Complete rewrite of the SelectionDAG class.  
							
							
							
							
							llvm-svn: 19327 
							
						 
						
							2005-01-07 07:46:32 +00:00  
				
					
						
							
							
								 
						
							
								eb04d9bcb4 
								
							 
						 
						
							
							
								
								Add #include <iostream> since Value.h does not #include it any more.  
							
							
							
							
							llvm-svn: 14622 
							
						 
						
							2004-07-04 12:19:56 +00:00  
				
					
						
							
							
								 
						
							
								6b7275996c 
								
							 
						 
						
							
							
								
								Rename Type::PrimitiveID to TypeId and ::getPrimitiveID() to ::getTypeID()  
							
							
							
							
							llvm-svn: 14201 
							
						 
						
							2004-06-17 18:19:28 +00:00  
				
					
						
							
							
								 
						
							
								560b5e42ab 
								
							 
						 
						
							
							
								
								Finegrainify namespacification  
							
							
							
							
							llvm-svn: 13948 
							
						 
						
							2004-06-02 04:28:06 +00:00  
				
					
						
							
							
								 
						
							
								960707c335 
								
							 
						 
						
							
							
								
								Put all LLVM code into the llvm namespace, as per bug 109.  
							
							
							
							
							llvm-svn: 9903 
							
						 
						
							2003-11-11 22:41:34 +00:00  
				
					
						
							
							
								 
						
							
								482202a601 
								
							 
						 
						
							
							
								
								Added LLVM project notice to the top of every C++ source file.  
							
							
							
							
							Header files will be on the way.
llvm-svn: 9298 
							
						 
						
							2003-10-20 19:43:21 +00:00  
				
					
						
							
							
								 
						
							
								e81de41edf 
								
							 
						 
						
							
							
								
								Add a bunch of new node types, etc  
							
							
							
							
							llvm-svn: 7875 
							
						 
						
							2003-08-15 04:53:16 +00:00  
				
					
						
							
							
								 
						
							
								600d308853 
								
							 
						 
						
							
							
								
								Initial checkin of SelectionDAG implementation.  This is still rough and unfinished
llvm-svn: 7717 
							
						 
						
							2003-08-11 14:57:33 +00:00