Commit Graph

102 Commits

Author SHA1 Message Date
Nate Begeman 0822032c95 Put int the getReg cast optimization from x86 so that we generate fewer
move instructions for the register allocator to coalesce.

llvm-svn: 17608
2004-11-08 02:25:40 +00:00
Nate Begeman f5f0b6b6b0 Disable bogus cast elimination when the cast is used by a setcc instruction.
llvm-svn: 17583
2004-11-07 20:23:42 +00:00
Nate Begeman bff3d4abf0 Thanks to sabre for pointing out that we were incorrectly codegen'ing
int test(int x) { return 32768 - x; }

Fixed by teaching the function that checks a constant's validity to be used
as an immediate argument about subtract-from instructions.

llvm-svn: 17476
2004-11-04 19:43:18 +00:00
Nate Begeman 26feb4f6d8 Fix treecc. Also fix a latent bug in emitBinaryConstOperation that would
allow and const, 0 to be incorrectly codegen'd into a rlwinm instruction.

llvm-svn: 17234
2004-10-26 03:48:25 +00:00
Nate Begeman 74b7c1f3e0 Implement more complete and correct codegen for bitfield inserts, as tested
by the recently committed rlwimi.ll test file.  Also commit initial code
for bitfield extract, although it is turned off until fully debugged.

llvm-svn: 17207
2004-10-24 10:33:30 +00:00
Nate Begeman 6cadac8f43 Kill casts from integer types to unsigned byte, when the cast was only used
as the shift amount operand to a shift instruction.  This was causing us to
emit unnecessary clear operations for code such as:
int foo(int x) { return 1 << x; }

llvm-svn: 17175
2004-10-23 00:50:23 +00:00
Reid Spencer 30d8baea8d Adjust to changes in Makefile.rules
llvm-svn: 17167
2004-10-22 21:02:08 +00:00
Nate Begeman 86b5f8075c Don't clear or sign extend bool->int. This fires a few dozen times on the test suite
llvm-svn: 17147
2004-10-20 21:55:41 +00:00
Nate Begeman 2c873ca365 Implement bitfield insert by recognizing the following pattern:
1. optional shift left
2. and x, immX
3. and y, immY
4. or z, x, y
==> rlwimi z, x, y, shift, mask begin, mask end

where immX == ~immY and immX is a run of set bits. This transformation
fires 32 times on voronoi, once on espresso, and probably several
dozen times on external benchmarks such as gcc.

To put this in terms of actual code generated for
struct B { unsigned a : 3; unsigned b : 2; };
void storeA (struct B *b, int v) { b->a = v;}
void storeB (struct B *b, int v) { b->b = v;}

Old:
_storeA:
        rlwinm r2, r4, 0, 29, 31
        lwz r4, 0(r3)
        rlwinm r4, r4, 0, 0, 28
        or r2, r4, r2
        stw r2, 0(r3)
        blr

_storeB:
        rlwinm r2, r4, 3, 0, 28
        rlwinm r2, r2, 0, 27, 28
        lwz r4, 0(r3)
        rlwinm r4, r4, 0, 29, 26
        or r2, r2, r4
        stw r2, 0(r3)
        blr

New:
_storeA:
        lwz r2, 0(r3)
        rlwimi r2, r4, 0, 29, 31
        stw r2, 0(r3)
        blr

_storeB:
        lwz r2, 0(r3)
        rlwimi r2, r4, 3, 27, 28
        stw r2, 0(r3)
        blr

llvm-svn: 17078
2004-10-17 05:19:20 +00:00
Nate Begeman 29dc5f2a3e Finally fix one of the oldest FIXMEs in the PowerPC backend: correctly
flag rotate left word immediate then mask insert (rlwimi) as a two-address
instruction, and update the ISel usage of the instruction accordingly.

This will allow us to properly schedule rlwimi, and use it to efficiently
codegen bitfield operations.

llvm-svn: 17068
2004-10-16 20:43:38 +00:00
Chris Lattner a3f3c8a1ad ADd support for undef and unreachable
llvm-svn: 17050
2004-10-16 18:13:47 +00:00
Nate Begeman a15c246af9 Better codegen of binary integer ops with 32 bit immediate operands.
This transformation fires a few dozen times across the testsuite.

For example, int test2(int X) { return X ^ 0x0FF00FF0; }
Old:
_test2:
        lis r2, 4080
        ori r2, r2, 4080
        xor r3, r3, r2
        blr

New:
_test2:
        xoris r3, r3, 4080
        xori r3, r3, 4080
        blr

llvm-svn: 17004
2004-10-15 00:50:19 +00:00
Nate Begeman b58dd6799f Implement logical and with an immediate that consists of a contiguous block
of one or more 1 bits (may wrap from least significant bit to most
significant bit) as the rlwinm rather than andi., andis., or some longer
instructons sequence.

int andn4(int z) { return z & -4; }
int clearhi(int z) { return z & 0x0000FFFF; }
int clearlo(int z) { return z & 0xFFFF0000; }
int clearmid(int z) { return z & 0x00FFFF00; }
int clearwrap(int z) { return z & 0xFF0000FF; }

_andn4:
        rlwinm r3, r3, 0, 0, 29
        blr

_clearhi:
        rlwinm r3, r3, 0, 16, 31
        blr

_clearlo:
        rlwinm r3, r3, 0, 0, 15
        blr

_clearmid:
        rlwinm r3, r3, 0, 8, 23
        blr

_clearwrap:
        rlwinm r3, r3, 0, 24, 7
        blr

llvm-svn: 16832
2004-10-08 02:49:24 +00:00
Nate Begeman 6e6514c47e Several fixes and enhancements to the PPC32 backend.
1. Fix an illegal argument to getClassB when deciding whether or not to
   sign extend a byte load.

2. Initial addition of isLoad and isStore flags to the instruction .td file
   for eventual use in a scheduler.

3. Rewrite of how constants are handled in emitSimpleBinaryOperation so
   that we can emit the PowerPC shifted immediate instructions far more
   often.  This allows us to emit the following code:

int foo(int x) { return x | 0x00F0000; }

_foo:
.LBB_foo_0:     ; entry
        ; IMPLICIT_DEF
        oris r3, r3, 15
        blr

llvm-svn: 16826
2004-10-07 22:30:03 +00:00
Chris Lattner f94f985bbd Correct some typeos
llvm-svn: 16770
2004-10-06 16:28:24 +00:00
Nate Begeman 9a1fbaf1e9 Turning on fsel code gen now that we can do so would be good.
llvm-svn: 16765
2004-10-06 11:03:30 +00:00
Nate Begeman fac8529df8 Implement floating point select for lt, gt, le, ge using the powerpc fsel
instruction.

Now, rather than emitting the following loop out of bisect:
.LBB_main_19:	; no_exit.0.i
	rlwinm r3, r2, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f2, f2, f1
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fcmpu cr0, f1, f4
	bge .LBB_main_64	; no_exit.0.i
.LBB_main_63:	; no_exit.0.i
	b .LBB_main_65	; no_exit.0.i
.LBB_main_64:	; no_exit.0.i
	fmr f2, f1
.LBB_main_65:	; no_exit.0.i
	addi r3, r2, 1
	rlwinm r3, r3, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f4, f4, f1
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f5, lo16(.CPI_main_1-"L00000$pb")(r3)
	fcmpu cr0, f1, f5
	bge .LBB_main_67	; no_exit.0.i
.LBB_main_66:	; no_exit.0.i
	b .LBB_main_68	; no_exit.0.i
.LBB_main_67:	; no_exit.0.i
	fmr f4, f1
.LBB_main_68:	; no_exit.0.i
	fadd f1, f2, f4
	addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	fmul f1, f1, f2
	rlwinm r3, r2, 3, 0, 28
	lfdx f2, r3, r28
	fadd f4, f2, f1
	fcmpu cr0, f4, f0
	bgt .LBB_main_70	; no_exit.0.i
.LBB_main_69:	; no_exit.0.i
	b .LBB_main_71	; no_exit.0.i
.LBB_main_70:	; no_exit.0.i
	fmr f0, f4
.LBB_main_71:	; no_exit.0.i
	fsub f1, f2, f1
	addi r2, r2, -1
	fcmpu cr0, f1, f3
	blt .LBB_main_73	; no_exit.0.i
.LBB_main_72:	; no_exit.0.i
	b .LBB_main_74	; no_exit.0.i
.LBB_main_73:	; no_exit.0.i
	fmr f3, f1
.LBB_main_74:	; no_exit.0.i
	cmpwi cr0, r2, -1
	fmr f16, f0
	fmr f17, f3
	bgt .LBB_main_19	; no_exit.0.i

We emit this instead:
.LBB_main_19:	; no_exit.0.i
	rlwinm r3, r2, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f2, f2, f1
	fsel f1, f1, f1, f2
	addi r3, r2, 1
	rlwinm r3, r3, 3, 0, 28
	lfdx f2, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f4, f4, f2
	fsel f2, f2, f2, f4
	fadd f1, f1, f2
	addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	fmul f1, f1, f2
	rlwinm r3, r2, 3, 0, 28
	lfdx f2, r3, r28
	fadd f4, f2, f1
	fsub f5, f0, f4
	fsel f0, f5, f0, f4
	fsub f1, f2, f1
	addi r2, r2, -1
	fsub f2, f1, f3
	fsel f3, f2, f3, f1
	cmpwi cr0, r2, -1
	fmr f16, f0
	fmr f17, f3
	bgt .LBB_main_19	; no_exit.0.i

llvm-svn: 16764
2004-10-06 09:53:04 +00:00
Nate Begeman 2f1d0ae95e Generate better code by being far less clever when it comes to the select instruction. Don't create overlapping register lifetimes
llvm-svn: 16580
2004-09-29 05:00:31 +00:00
Nate Begeman 7b6df6def2 improve Type::BoolTy codegen by eliminating unnecessary clears and sign extends
llvm-svn: 16578
2004-09-29 03:45:33 +00:00
Nate Begeman 26566f0b68 To go along with sabre's improved InstCombining, improve recognition of
integers that we can use as immediate values in instructions.

Example from yacr2:
-       lis r10, -1
-       ori r10, r10, 65535
-       add r28, r28, r10
+       addi r28, r28, -1
        addi r7, r7, 1
        addi r9, r9, 1
        b .LBB_main_9   ; loopentry.1.i214

llvm-svn: 16566
2004-09-29 02:35:05 +00:00
Nate Begeman 8656a156cf Correct some BuildMI arguments for the upcoming simple scheduler
llvm-svn: 16519
2004-09-27 05:08:17 +00:00
Nate Begeman 49cf74b26c Fix the last of the major PPC GEP folding deficiencies. This will allow
the ISel to use indexed and non-zero immediate offsets for GEPs that have
more than one use.  This is common for instruction sequences such as a load
followed by a modify and store to the same address.

llvm-svn: 16493
2004-09-23 05:31:33 +00:00
Nate Begeman 033b816171 add optimized code sequences for setcc x, 0
llvm-svn: 16478
2004-09-22 04:40:25 +00:00
Misha Brukman 87201ce8f9 s/ISel/PPC32ISel/ to have unique class names for debugging via gdb because the
C++ front-end in gcc does not mangle classes in anonymous namespaces correctly.

llvm-svn: 16470
2004-09-21 18:22:19 +00:00
Nate Begeman 4bfceb1ed5 All PPC instructions are now auto-printed
32 and 64 bit AsmWriters unified
Darwin and AIX specific features of AsmWriter split out

llvm-svn: 16163
2004-09-04 05:00:00 +00:00
Nate Begeman 6173878304 Convert remaining X-Form and Pseudo instructions over to asm writer
llvm-svn: 16142
2004-09-02 08:13:00 +00:00
Reid Spencer 7c16caa336 Changes For Bug 352
Move include/Config and include/Support into include/llvm/Config,
include/llvm/ADT and include/llvm/Support. From here on out, all LLVM
public header files must be under include/llvm/.

llvm-svn: 16137
2004-09-01 22:55:40 +00:00
Nate Begeman 4483df8b63 Implement the following missing functionality in the PPC backend:
cast fp->bool
cast ulong->fp
algebraic right shift long by non-constant value
These changes tested across most of the test suite.  Fixes Regression/casts

llvm-svn: 16081
2004-08-29 08:19:32 +00:00
Nate Begeman 1c57b4fa32 Kill a majority of unnecessary sign extensions for byte loads
llvm-svn: 15991
2004-08-22 08:10:15 +00:00
Nate Begeman 45b0b7cd7c Back out branchless SetCC code. While it helped a lot in some cases, it
hurt a lot in others.  Instead, improve branching version of SetCC and
Select instructions.  The old code will be in CVS should we ever need to
dig it up again.

llvm-svn: 15979
2004-08-21 20:42:14 +00:00
Nate Begeman 1b1a784afa Implement code to convert SetCC into straight line code where appropriate. Add necessary instructions for this transformation to the .td file.
llvm-svn: 15952
2004-08-20 09:56:22 +00:00
Misha Brukman 75e987d0b8 This PHI has 4 additional operands, not 2.
llvm-svn: 15926
2004-08-19 21:00:12 +00:00
Nate Begeman d5c6380015 Convert casts that will have no effect into move instructions.
llvm-svn: 15914
2004-08-19 08:07:50 +00:00
Nate Begeman e4e6d92d1d Clean up floating point instruction selection.
Change int->float cast code to put conversion constants in constant pool.
Shorten code sequence for constant pool fp loads.
Remove LOADLoDirect/LOADLoIndirect psuedo instructions and tweak asmwriter

llvm-svn: 15913
2004-08-19 05:20:54 +00:00
Nate Begeman 0818541631 Re-fix hiding the Frame Pointer from the register allocator in functions
that have a frame pointer.  This change fixes Burg.  In addition, make
the necessary changes to floating point code gen and constant loading after
Chris Lattner's fixes to the asm writer.  These changes fix MallocBench/gs

llvm-svn: 15873
2004-08-17 07:17:44 +00:00
Misha Brukman 116f9277f6 PowerPC 32-/64-bit split: Part I, PPC32* bit files, adapted from former PowerPC*
llvm-svn: 15850
2004-08-17 04:55:41 +00:00
Nate Begeman 8b44a07246 Fix mismatched adjust down/up of SP in functions that contain variable
sized allocas.

llvm-svn: 15806
2004-08-16 01:50:22 +00:00
Nate Begeman 373744c6dc Fix float to int codepath by always allocating 8 bytes for the target of a double store; optimize cmplwi generation.
llvm-svn: 15759
2004-08-15 06:42:28 +00:00
Nate Begeman caeb78e720 Fix handling of FP constants with single precision, and loading of internal linkage function addresses
llvm-svn: 15742
2004-08-14 22:11:38 +00:00
Nate Begeman 5bf9bfe398 Fix siod by switching BoolTy to byte rather than int until CFE changes for
Darwin.  Also, change asm printer to output proper stubs for external
functions whose address is passed as an argument to aid in bugpointing.

llvm-svn: 15721
2004-08-13 09:32:01 +00:00
Nate Begeman 420213f3c5 Fix 177.mesa compilation, don't use floating point regs for base addresses!
llvm-svn: 15720
2004-08-13 04:45:14 +00:00
Nate Begeman 2f1d849271 Fix llc crasher compiling siod by giving BuildMI the correct number of arguments
llvm-svn: 15719
2004-08-13 03:56:49 +00:00
Nate Begeman f17ea0f7b7 Clean up 32/64bit and Darwin/AIX split. Next steps: 64 bit ISel, AIX asm printer.
llvm-svn: 15662
2004-08-11 07:40:04 +00:00
Chris Lattner 6f0291792e Fix a case where constantexprs could leak into the PPC isel.
llvm-svn: 15661
2004-08-11 07:34:50 +00:00
Nate Begeman 7526da6ead Fix 255.vortex by using getClassB instead of getClass
llvm-svn: 15648
2004-08-11 03:30:55 +00:00
Misha Brukman d022a5ac5a Breaking up the PowerPC target into 32- and 64-bit subparts, Part I: 32-bit.
llvm-svn: 15634
2004-08-11 00:09:42 +00:00
Misha Brukman dad438bfb9 Renamed PPC32 (namespace for regs, opcodes) to PPC to include 64-bit targets
llvm-svn: 15631
2004-08-10 22:47:03 +00:00
Nate Begeman 63be70d8f2 Fix casts of float to unsigned long
Replace STDX (store 64 bit int indexed) with STFDX (store double indexed)
Fix latent bug in indexed load generation
Generate indexed loads and stores in many more cases

llvm-svn: 15626
2004-08-10 20:42:36 +00:00
Chris Lattner a8dcf2423e Changes commited for Nate Begeman:
Use a PowerPC specific prolog epilog inserter to control where spilled
callee save regs are placed on the stack.
Get rid of implicit return address stack slot, save return address reg
(LR) in appropriate slot
Improve code generated for functions that don't have calls or access
globals


Note from Chris: PowerPCPEI will eventually be eliminated, once the
functionality is merged into CodeGen/PrologEpilogInserter.cpp

llvm-svn: 15536
2004-08-06 06:58:50 +00:00
Misha Brukman 862bb562cc Simplify loading (un)signed constants to registers, patch by Nate Begeman.
llvm-svn: 15306
2004-07-28 19:13:49 +00:00