are allowed to generate 64-bit-only PowerPC instructions for 32 bit hosts,
such as the PowerPC 970.
This speeds up 189.lucas from 81.99 to 32.64 seconds.
llvm-svn: 23250
giving it a non-instruction opcode. The dag->dag selector used to not
select the operands of the fsel, because it thought that whole tree was
already selected.
llvm-svn: 23091
putting it into the constant pool. This allows the isel machinery to
create constants that it will end up deciding are not needed, without them
ending up in the resultant function constant pool.
llvm-svn: 23081
instead of in the backend. This currently handles fsel cases with registers,
but doesn't have the 0.0 and -0.0 optimization enabled yet.
Once this is finished, special hack for fp immediates can go away.
llvm-svn: 23075
instructions take a consistent reg class. Implement ISD::UNDEF in the dag->dag
selector to generate this, fixing UnitTests/2003-07-06-IntOverflow.
llvm-svn: 23028
Emit the indcall sequence as:
mtctr inreg
mr R12, inreg
btctr
If inreg and R12 aren't coallesced, this reduces the odds of having the mtctr
and btctr in the same dispatch group. :)
llvm-svn: 23023
simplifies BRTWOWAY into BR if one of the results is a fall-through.
Unless I'm missing something, there is no reason to duplicate this
in the target-specific code.
llvm-svn: 22952
it in the block. This codegens:
int %test(bool %c) {
%retval = select bool %c, int 17, int 1
ret int %retval
}
as:
_test:
rlwinm r2, r3, 0, 31, 31
li r2, 17
cmpwi cr0, r3, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
li r2, 1
.LBB_test_2: ;
or r3, r2, r2
blr
instead of:
_test:
rlwinm r2, r3, 0, 31, 31
li r2, 17
li r4, 1
cmpwi cr0, r3, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
or r2, r4, r4
.LBB_test_2: ;
or r3, r2, r2
blr
... which is one fewer instruction. The savings are more significant for
global address and constantfp nodes.
llvm-svn: 22946
Remove dead code in ISD::Constant handling
Add support for add long, imm16
We now codegen 'long long foo(long long a) { return ++a; }'
as:
addic r4, r4, 1
addze r3, r3
blr
instead of:
li r2, 1
li r5, 0
addc r2, r4, r2
adde r3, r3, r5
blr
llvm-svn: 22811
1. Not selecting the false value of a select_cc in the false arm, which
isn't legal for nested selects.
2. Actually returning the node we created and Legalized in the FP_TO_UINT
Expander.
llvm-svn: 22789
codegen for FP_TO_UINT by using the legalizer's SELECT variant.
Implement a codegen improvement for SELECT_CC, selecting the false node in
the MBB that feeds the phi node. This allows us to codegen:
void foo(int *a, int b, int c) { int d = (a < b) ? 5 : 9; *a = d; }
as:
_foo:
li r2, 5
cmpw cr0, r4, r3
bgt .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
li r2, 9
.LBB_foo_2: ; entry
stw r2, 0(r3)
blr
insted of:
_foo:
li r2, 5
li r5, 9
cmpw cr0, r4, r3
bgt .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
or r2, r5, r5
.LBB_foo_2: ; entry
stw r2, 0(r3)
blr
llvm-svn: 22784
specified opcode and an integer constant right operand.
2. Modified ISD::SHL, ISD::SRL, ISD::SRA to use rlwinm when applied after a mask.
llvm-svn: 22761
(avoids an extra level of indirection in MakeReg).
defined MakeIntReg using RegMap->createVirtualRegister(PPC32::GPRCRegisterClass)
defined MakeFPReg using RegMap->createVirtualRegister(PPC32::FPRCRegisterClass)
s/MakeReg(MVT::i32)/MakeIntReg/
s/MakeReg(MVT::f64)/MakeFPReg/
Patch by Jim Laskey!
llvm-svn: 22759