Chris Lattner
66f1fbaaad
Fix miscompilation of float vector returns. Compile code to this:
_func:
vsldoi v2, v3, v2, 12
vsldoi v2, v2, v2, 4
blr
instead of:
_func:
vsldoi v2, v3, v2, 12
vsldoi v2, v2, v2, 4
*** vor f1, v2, v2
blr
llvm-svn: 29607
2006-08-11 16:47:32 +00:00
Chris Lattner
8298265042
Fix some ppc64 issues with vector code.
llvm-svn: 29384
2006-07-28 16:45:47 +00:00
Chris Lattner
9e56e5c003
Rename RelocModel::PIC to PIC_, to avoid conflicts with -DPIC.
llvm-svn: 29307
2006-07-26 21:12:04 +00:00
Chris Lattner
a7976d329e
Implement Regression/CodeGen/PowerPC/bswap-load-store.ll by folding bswaps
into i16/i32 load/stores.
llvm-svn: 29089
2006-07-10 20:56:58 +00:00
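A minimal sketch of the source-level pattern this fold targets, assuming the usual shift-and-mask byte swap. The helper names are hypothetical and not taken from the commit. When the swap is applied directly to a loaded value, or to a value that is about to be stored, the pair can be selected as a single byte-reversed load or store (lwbrx/stwbrx for i32 on PowerPC).

```c
#include <stdint.h>

/* Portable 32-bit byte swap (hypothetical helper). */
static uint32_t bswap32(uint32_t x) {
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Load followed by swap: a candidate for one byte-reversed load. */
uint32_t load_swapped(const uint32_t *p) { return bswap32(*p); }

/* Swap followed by store: a candidate for one byte-reversed store. */
void store_swapped(uint32_t *p, uint32_t v) { *p = bswap32(v); }
```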
Chris Lattner
8aed3cc46b
Implement 64-bit select, bswap, etc.
llvm-svn: 28935
2006-06-27 20:14:52 +00:00
Chris Lattner
a07410c95b
PPC doesn't have bit converts to/from i64
llvm-svn: 28932
2006-06-27 18:40:08 +00:00
Chris Lattner
d48ce27532
Implement 64-bit undef, sub, shl/shr, srem/urem
llvm-svn: 28929
2006-06-27 18:18:41 +00:00
Chris Lattner
cb5a84f446
Use i32 for shift amounts instead of i64. This gets bisort working.
llvm-svn: 28927
2006-06-27 17:34:57 +00:00
Chris Lattner
97b3da1519
Implement a bunch of 64-bit cleanliness work. With this, treeadd builds (but
doesn't work right).
llvm-svn: 28921
2006-06-27 00:04:13 +00:00
Chris Lattner
ec78cade34
Improve PPC64 calling convention support
llvm-svn: 28919
2006-06-26 22:48:35 +00:00
Chris Lattner
dc38e6f322
Correct returns of 64-bit values, though they seemed to work before...
llvm-svn: 28892
2006-06-21 00:34:03 +00:00
Chris Lattner
a5190ae7a9
fix some assumptions that pointers can only be 32 bits. With this, we can
now compile:
static unsigned long X;
void test1() {
X = 0;
}
into:
_test1:
lis r2, ha16(_X)
li r3, 0
stw r3, lo16(_X)(r2)
blr
Totally amazing :)
llvm-svn: 28839
2006-06-16 21:01:35 +00:00
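The lis/stw pair in the output above splits the address of _X into a high-adjusted half and a sign-extended low half. A rough sketch of that arithmetic for a 32-bit address follows; the function names mirror the assembler operators, but the C code is only an illustration and is not from the commit.

```c
#include <stdint.h>

/* Low 16 bits of the address, as the signed displacement a load/store sees. */
int32_t lo16(uint32_t addr) {
    int32_t v = (int32_t)(addr & 0xFFFFu);
    return v >= 0x8000 ? v - 0x10000 : v;
}

/* High-adjusted 16 bits: adds 0x8000 first to compensate for lo16 being
   sign-extended when it is used as a displacement. */
uint32_t ha16(uint32_t addr) { return ((addr + 0x8000u) >> 16) & 0xFFFFu; }

/* Address formed by "lis rN, ha16(X)" plus the lo16(X) displacement. */
uint32_t rebuild(uint32_t addr) {
    return (ha16(addr) << 16) + (uint32_t)lo16(addr);
}
```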
Chris Lattner
a35f306740
Rename some subtarget features. A CPU now can *have* 64-bit instructions,
and in 32-bit mode we can choose to optionally *use* 64-bit registers.
llvm-svn: 28824
2006-06-16 17:34:12 +00:00
Evan Cheng
94bb93f8f7
Type of extract_element index operand should be iPTR.
llvm-svn: 28797
2006-06-15 08:18:06 +00:00
Chris Lattner
006b2c6ab9
Fix a problem exposed by the local allocator. CALL instructions are not marked
as using incoming argument registers, so the local allocator would clobber them
between their set and use. To fix this, we give the call instructions a variable
number of uses in the CALL MachineInstr itself, so live variables understands
the live ranges of these register arguments.
llvm-svn: 28744
2006-06-10 01:14:28 +00:00
Chris Lattner
b9342afa56
Always reserve space for 8 spilled GPRs. GCC apparently assumes that this
space will be available, even if the callee isn't varargs.
llvm-svn: 28571
2006-05-30 21:21:04 +00:00
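A small sketch of the size computation this implies, assuming a 32-bit layout with a 24-byte linkage area in front of the parameter area. The 24 is an assumption made here; it is chosen because 24 + 8 * 4 gives the 56 bytes mentioned elsewhere in this log. The names are hypothetical.

```c
/* Assumed layout constants for a 32-bit target (not taken from the commit). */
enum { LINKAGE_AREA = 24, GPR_SIZE = 4, NUM_ARG_GPRS = 8 };

/* Bytes to reserve for an outgoing call whose arguments occupy arg_bytes.
   Room for all 8 argument GPRs is always reserved, varargs or not. */
unsigned outgoing_call_bytes(unsigned arg_bytes) {
    unsigned min_args = NUM_ARG_GPRS * GPR_SIZE;
    if (arg_bytes < min_args)
        arg_bytes = min_args;
    return LINKAGE_AREA + arg_bytes;
}
```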
Evan Cheng
a3add0fea8
Change RET node to include signedness information of the return values, i.e.
RET chain, value1, sign1, value2, sign2, ...
llvm-svn: 28510
2006-05-26 23:10:12 +00:00
Evan Cheng
c2cd473d9b
CALL node change (arg / sign pairs instead of just arguments).
llvm-svn: 28462
2006-05-25 00:57:32 +00:00
Chris Lattner
aa2372562e
Patches to make the LLVM sources more -pedantic clean. Patch provided
by Anton Korobeynikov! This is a step towards closing PR786.
llvm-svn: 28447
2006-05-24 17:04:05 +00:00
Chris Lattner
33165c246c
Fix CodeGen/Generic/vector.ll:test_div with altivec.
llvm-svn: 28445
2006-05-24 00:15:25 +00:00
Chris Lattner
b56d22c2f6
Handle SETO* like we handle SET*, restoring behavior after Evan's setcc
change. This fixes PowerPC/fnegsel.ll.
llvm-svn: 28443
2006-05-24 00:06:44 +00:00
Chris Lattner
eb755fc1b3
Make PPC call lowering more aggressive, making the isel matching code simple
enough to be autogenerated.
llvm-svn: 28354
2006-05-17 19:00:46 +00:00
Chris Lattner
b1e9e37c58
Switch PPC over to a call-selection model where the lowering code creates
the copyto/fromregs instead of making the PPCISD::CALL selection code create
them. This vastly simplifies the selection code, and moves the ABI handling
parts into one place.
llvm-svn: 28346
2006-05-17 06:01:33 +00:00
Chris Lattner
b7552a88d6
3 changes, 2 of which are cleanup and one of which changes codegen:
1. Rearrange code a bit so that the special case doesn't require indenting lots
of code.
2. Add comments describing PPC calling convention.
3. Only round up to 56-bytes of stack space for an outgoing call if the callee
is varargs. This saves a bit of stack space.
llvm-svn: 28342
2006-05-17 00:15:40 +00:00
Chris Lattner
f058f5aef1
implement passing/returning vector regs to calls, at least non-varargs calls.
llvm-svn: 28341
2006-05-16 23:54:25 +00:00
Chris Lattner
aa40ec1b32
Instead of implementing LowerCallTo directly, let the default impl produce an
ISD::CALL node, then custom lower that. This means that we only have to handle
LEGAL call operands/results, not every possible type. This allows us to
simplify the call code, shrinking it by about 1/3.
llvm-svn: 28339
2006-05-16 22:56:08 +00:00
Chris Lattner
26e2fcd8b1
Simplify the argument counting logic by only incrementing the index.
llvm-svn: 28335
2006-05-16 18:58:15 +00:00
Chris Lattner
76c47b50e7
Simplify the dead argument handling code.
llvm-svn: 28334
2006-05-16 18:54:32 +00:00
Chris Lattner
318f0d2122
Vector args passed in registers don't reserve stack space.
llvm-svn: 28333
2006-05-16 18:51:52 +00:00
Chris Lattner
4302e8fb67
Switch the PPC backend over to using FORMAL_ARGUMENTS for formal argument
handling. This makes the lower argument code significantly simpler (we
only need to handle legal argument types).
Incidentally, this also implements support for vector argument registers,
so long as they are not on the stack.
llvm-svn: 28331
2006-05-16 18:18:50 +00:00
Chris Lattner
d2ca9abf57
Fit in 80 cols
llvm-svn: 28311
2006-05-16 04:20:24 +00:00
Chris Lattner
ae48a894b1
Remove dead var, fix bad override.
llvm-svn: 28264
2006-05-12 21:09:57 +00:00
Chris Lattner
84b49d51be
Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll
llvm-svn: 28017
2006-04-28 21:56:10 +00:00
Nate Begeman
4ca2ea5b43
JumpTable support! What this represents is working asm and jit support for
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.
llvm-svn: 27947
2006-04-22 18:53:45 +00:00
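As a rough illustration of what "100% dense" means here: a switch qualifies when its case values cover a contiguous range with no holes, so every table slot maps to a real case. The helper below is a hypothetical sketch and not the code from this commit.

```c
#include <stddef.h>

/* cases must be sorted ascending with no duplicates.
   Returns 1 when the values form one contiguous range, 0 otherwise. */
int is_fully_dense(const int *cases, size_t n) {
    if (n == 0)
        return 0;
    long long span = (long long)cases[n - 1] - (long long)cases[0] + 1;
    return span == (long long)n;
}
```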
Chris Lattner
518834c67e
Fix a crash on:
void foo2(vector float *A, vector float *B) {
vector float C = (vector float)vec_cmpeq(*A, *B);
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
*A = C;
}
llvm-svn: 27808
2006-04-18 18:28:22 +00:00
Chris Lattner
1e174c87c3
pretty print node name
llvm-svn: 27806
2006-04-18 18:05:58 +00:00
Chris Lattner
9754d142a4
Implement an important entry from README_ALTIVEC:
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's. Instead, just branch on CR6 directly. :)
For example, for:
void foo2(vector float *A, vector float *B) {
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
}
We now generate:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
instead of:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
cmpwi cr0, r3, 0
beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
This implements CodeGen/PowerPC/vec_br_cmp.ll.
llvm-svn: 27804
2006-04-18 17:59:36 +00:00
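To see what the removed mfcr/rlwinm pair was computing, here is a small scalar model of rlwinm under IBM bit numbering (bit 0 is the most significant bit, and CR field n occupies bits 4n to 4n+3). The function names are hypothetical; the point is that "rlwinm r3, r3, 27, 31, 31" isolates a single CR6 bit, which the new code tests directly with a branch on cr6.

```c
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31u;
    return n ? (x << n) | (x >> (32u - n)) : x;
}

/* Mask with IBM bits mb..me set (bit 0 = MSB). Assumes mb <= me. */
static uint32_t ibm_mask(unsigned mb, unsigned me) {
    return (0xFFFFFFFFu >> mb) & (0xFFFFFFFFu << (31u - me));
}

/* Scalar model of "rlwinm rA, rS, sh, mb, me" for the non-wrapping case. */
uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me) {
    return rotl32(rs, sh) & ibm_mask(mb, me);
}

/* Bit `bit` (0=LT, 1=GT, 2=EQ, 3=SO) of CR field `field` in an mfcr image. */
uint32_t cr_bit(uint32_t cr, unsigned field, unsigned bit) {
    return (cr >> (31u - (4u * field + bit))) & 1u;
}
```

With these definitions, rlwinm(cr, 27, 31, 31) and cr_bit(cr, 6, 2) agree for any cr value, i.e. the old sequence read the EQ bit of CR6.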
Chris Lattner
96d50487c9
Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
even/odd halves. Thanks to Nate telling me what's what.
llvm-svn: 27793
2006-04-18 04:28:57 +00:00
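A scalar sketch of why one vmladduhm suffices: the instruction computes (a * b + c) modulo 2^16 in each halfword lane, so passing a zero vector as the addend leaves exactly the low 16 bits of each product. The C model below is illustrative only, and mul_v8i16 is a hypothetical name.

```c
#include <stdint.h>

/* Lane-wise model of vmladduhm: out = (a * b + c) mod 2^16 per halfword. */
void vmladduhm(const uint16_t a[8], const uint16_t b[8],
               const uint16_t c[8], uint16_t out[8]) {
    for (int i = 0; i < 8; i++)
        out[i] = (uint16_t)((uint32_t)a[i] * b[i] + c[i]);
}

/* v8i16 multiply: multiply-add with a zero addend. */
void mul_v8i16(const uint16_t a[8], const uint16_t b[8], uint16_t out[8]) {
    const uint16_t zero[8] = {0};
    vmladduhm(a, b, zero, out);
}
```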
Chris Lattner
d6d82aa889
Implement v16i8 multiply with this code:
vmuloub v5, v3, v2
vmuleub v2, v3, v2
vperm v2, v2, v5, v4
This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are
6.79x faster than before.
Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.
Remove the 'integer multiplies' todo from the README file.
llvm-svn: 27792
2006-04-18 03:57:35 +00:00
Chris Lattner
7e439874cb
Lower v8i16 multiply into this code:
li r5, lo16(LCPI1_0)
lis r6, ha16(LCPI1_0)
lvx v4, r6, r5
vmulouh v5, v3, v2
vmuleuh v2, v3, v2
vperm v2, v2, v5, v4
where v4 is:
LCPI1_0: ; <16 x ubyte>
.byte 2
.byte 3
.byte 18
.byte 19
.byte 6
.byte 7
.byte 22
.byte 23
.byte 10
.byte 11
.byte 26
.byte 27
.byte 14
.byte 15
.byte 30
.byte 31
This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.
llvm-svn: 27789
2006-04-18 03:43:48 +00:00
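The permute mask above can be checked with a byte-level model, assuming big-endian lane layout: vmuleuh and vmulouh produce four 32-bit products each (even and odd halfword lanes), and the mask picks bytes 2 and 3 of every product, i.e. its low 16 bits, interleaving even and odd results. This is a sketch with hypothetical helper names, not the backend code.

```c
#include <stdint.h>

/* The vperm control vector from LCPI1_0. Indexes 0-15 select from the first
   source (even products), 16-31 from the second (odd products). */
static const uint8_t MASK[16] = {2, 3, 18, 19, 6, 7, 22, 23,
                                 10, 11, 26, 27, 14, 15, 30, 31};

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

void mul_v8i16_even_odd(const uint16_t a[8], const uint16_t b[8],
                        uint16_t out[8]) {
    uint8_t cat[32]; /* bytes 0-15: vmuleuh result, bytes 16-31: vmulouh */
    for (int i = 0; i < 4; i++) {
        put_be32(cat + 4 * i, (uint32_t)a[2 * i] * b[2 * i]);
        put_be32(cat + 16 + 4 * i, (uint32_t)a[2 * i + 1] * b[2 * i + 1]);
    }
    /* vperm: each result halfword is two selected bytes, high byte first. */
    for (int i = 0; i < 8; i++)
        out[i] = (uint16_t)((cat[MASK[2 * i]] << 8) | cat[MASK[2 * i + 1]]);
}
```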
Chris Lattner
a2cae1bb10
Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.
This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements
PowerPC/vec_mul.ll
llvm-svn: 27788
2006-04-18 03:24:30 +00:00
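The commit message does not spell out the sequence. One decomposition that fits the description, shown here purely as an assumption, builds each 32-bit product from 16-bit multiplies: the low halves multiplied together, plus the two cross products shifted up by 16. The function name is hypothetical.

```c
#include <stdint.h>

/* Low 32 bits of a * b using only 16x16 -> 32 multiplies (assumed scheme). */
uint32_t mul32_from_16(uint32_t a, uint32_t b) {
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;
    uint32_t lo = al * bl;              /* product of the low halves       */
    uint32_t cross = ah * bl + al * bh; /* cross terms, wrapping mod 2^32  */
    return lo + (cross << 16);          /* ah * bh only affects bits >= 32 */
}
```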
Chris Lattner
e54133cfba
Make sure to check splats of every constant we can, handle splat(31) by
being a bit more clever, add support for odd splats from -31 to -17.
llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Chris Lattner
264c908e3a
Teach the ppc backend to use rol and vsldoi to generate splatted constants.
This implements vec_constants.ll:test_vsldoi and test_rol
llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner
1b3806ace5
Make some code more general, adding support for constant formation of several
new patterns.
llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner
f8dd76df5b
Learn how to make odd splatted constants in range [17,29]. This implements
PowerPC/vec_constants.ll:test_29.
llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner
2a099c04c1
Pull some code out into a helper function.
Efficiently codegen even splats in the range [-32,30].
This allows us to codegen <30,30,30,30> as:
vspltisw v0, 15
vadduwm v2, v0, v0
instead of as a cp load.
llvm-svn: 27750
2006-04-17 06:00:21 +00:00
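The even-splat trick rests on the vspltis* immediate being a 5-bit signed value (-16 to 15): any even constant in [-32,30] is x + x for some x in that range, as in the 15 + 15 example above. A minimal sketch of that check, with hypothetical names:

```c
#include <stdbool.h>

/* True when c fits the 5-bit signed immediate of vspltis{b,h,w}. */
bool fits_vsplti(int c) { return c >= -16 && c <= 15; }

/* For an even c in [-32,30], stores x with c == x + x in *half and returns
   true; returns false (leaving *half untouched) for anything else. */
bool even_splat_half(int c, int *half) {
    if (c % 2 != 0 || c < -32 || c > 30)
        return false;
    *half = c / 2;
    return true;
}
```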
Chris Lattner
071ad01ceb
Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll
llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner
06a21ba96b
Implement a TODO: have the legalizer canonicalize a bunch of operations to
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.
llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner
fa5aa396c2
Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
Remove some done items from the todo list.
llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner
24acbe46c0
Fix a crash when faced with a shuffle vector that has an undef in its mask.
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner
559c8ba466
Allow undef in a shuffle mask
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Chris Lattner
4211ca9108
Move the rest of the PPCTargetLowering::LowerOperation cases out into
separate functions, for simplicity and code clarity.
llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner
19e9055eb5
Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
functions, which makes the code much cleaner :)
llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Chris Lattner
883fb053bd
Force non-darwin targets to use a static relo model. This fixes PR734,
tested by CodeGen/Generic/vector.ll
llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner
147e50e1c5
Add a new way to match vector constants, which make it easier to bang bits of
different types.
Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1. This compiles:
typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
*P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
*P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
*P3 = vec_abs((vector float)*P3);
}
to:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
vspltisw v0, -1
vslw v0, v0, v0
lvx v1, 0, r3
vand v1, v1, v0
stvx v1, 0, r3
lvx v1, 0, r4
vandc v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vandc v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
instead of (with two constant pool entries):
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
li r6, lo16(LCPI1_0)
lis r7, ha16(LCPI1_0)
li r8, lo16(LCPI1_1)
lis r9, ha16(LCPI1_1)
lvx v0, r7, r6
lvx v1, 0, r3
vand v0, v1, v0
stvx v0, 0, r3
lvx v0, r9, r8
lvx v1, 0, r4
vand v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vand v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
GCC produces (with 2 cp entries):
_test:
mfspr r0,256
stw r0,-4(r1)
oris r0,r0,0xc00c
mtspr 256,r0
lis r2,ha16(LC0)
lis r9,ha16(LC1)
la r2,lo16(LC0)(r2)
lvx v0,0,r3
lvx v1,0,r5
la r9,lo16(LC1)(r9)
lwz r12,-4(r1)
lvx v12,0,r2
lvx v13,0,r9
vand v0,v0,v12
stvx v0,0,r3
vspltisw v0,-1
vslw v12,v0,v0
vandc v1,v1,v12
stvx v1,0,r5
lvx v0,0,r4
vand v0,v0,v13
stvx v0,0,r4
mtspr 256,r12
blr
llvm-svn: 27624
2006-04-12 19:07:14 +00:00
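The new LLVM output builds the sign-bit mask without a constant pool load: vspltisw -1 gives all ones, and vslw shifts each lane left by the low 5 bits of the shift lane, i.e. by 31. The 0x7FFFFFFF mask is then never materialized, because vandc applies the complement. A per-lane sketch of that arithmetic (function names are hypothetical):

```c
#include <stdint.h>

/* One lane of vslw: shift left by the low 5 bits of the shift operand. */
uint32_t vslw_lane(uint32_t value, uint32_t shift) {
    return value << (shift & 31u);
}

/* "vspltisw v0, -1" then "vslw v0, v0, v0": all ones shifted left by 31. */
uint32_t sign_mask(void) {
    uint32_t ones = 0xFFFFFFFFu;
    return vslw_lane(ones, ones);
}

/* One lane of vandc: a AND NOT b. */
uint32_t vandc_lane(uint32_t a, uint32_t b) { return a & ~b; }
```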
Chris Lattner
74cf9ff761
Rename get_VSPLI_elt -> get_VSPLTI_elt
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively. This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI
llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Chris Lattner
e318a7574e
Ensure that zero vectors are always v4i32, which forces them to CSE with
each other. This implements CodeGen/PowerPC/vxor-canonicalize.ll
llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Chris Lattner
e4db08a2f1
Vector function results go into V2 according to GCC. The darwin ABI doc
doesn't say where they go :-/
llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner
92533cfb4a
Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
No functionality change.
llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Chris Lattner
3a68f3c3ca
properly mark vector selects as expanded to select_cc
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner
0a3d1bbca4
Add VRRC select support
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Chris Lattner
d9e80f4516
Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vspltish instead of a
constant pool load.
llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner
d71a1f946d
Change the interface to the predicate that determines if vsplti* can be used.
No functionality changes.
llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Chris Lattner
466841ddc7
Make sure to return the result in the right type.
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner
a4bbfaed5c
Match vpku[hw]um(x,x).
Convert vsldoi(x,x) to work the same way other (x,x) cases work.
llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner
f38e033270
Add support for matching vmrg(x,x) patterns
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Chris Lattner
d1dcb52093
Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner
1d33819194
Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the front end to
lower it and LLVM to have one fewer intrinsic. This implements
CodeGen/PowerPC/vec_shuffle.ll
llvm-svn: 27450
2006-04-06 18:26:28 +00:00
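For reference, vsldoi concatenates its two inputs and keeps 16 consecutive bytes starting at the immediate, which is why it can be expressed as a shuffle with consecutive mask indexes. A byte-level sketch follows; the mask helper is hypothetical and only illustrates the shape a matcher would look for.

```c
#include <stdint.h>
#include <string.h>

/* Model of "vsldoi vD, vA, vB, sh": bytes sh..sh+15 of the 32-byte
   concatenation vA:vB. */
void vsldoi(const uint8_t a[16], const uint8_t b[16], unsigned sh,
            uint8_t out[16]) {
    uint8_t cat[32];
    memcpy(cat, a, 16);
    memcpy(cat + 16, b, 16);
    memcpy(out, cat + (sh & 15u), 16);
}

/* Equivalent shuffle mask: indexes 0-15 name bytes of the first input,
   16-31 bytes of the second. */
void vsldoi_mask(unsigned sh, uint8_t mask[16]) {
    for (unsigned i = 0; i < 16; i++)
        mask[i] = (uint8_t)((sh & 15u) + i);
}
```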
Chris Lattner
e8b83b4206
Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
vperm with a perm mask lvx'd from the constant pool.
llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Chris Lattner
39cc717c65
Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll
llvm-svn: 27439
2006-04-05 17:39:25 +00:00
Evan Cheng
2cf4232ced
Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered.
llvm-svn: 27433
2006-04-05 06:09:26 +00:00
Chris Lattner
4a744e5c9d
Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'.
llvm-svn: 27413
2006-04-04 22:28:35 +00:00
Chris Lattner
95c7adc7cb
Ask legalize to promote all vector shuffles to be v16i8 instead of having to
handle all 4 PPC vector types. This simplifies the matching code and allows
us to eliminate a bunch of patterns. This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.
llvm-svn: 27400
2006-04-04 17:25:31 +00:00
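Promoting a shuffle to v16i8 amounts to rewriting its mask in byte units. As an illustration for the v4i32 case (the helper name is hypothetical), each word index e expands to the four byte indexes 4e to 4e+3:

```c
#include <stdint.h>

/* Expand a 4-element word shuffle mask (indexes 0-7 across both inputs)
   into the equivalent 16-element byte shuffle mask (indexes 0-31). */
void widen_mask_v4_to_v16(const uint8_t mask4[4], uint8_t mask16[16]) {
    for (int i = 0; i < 4; i++)
        for (int b = 0; b < 4; b++)
            mask16[4 * i + b] = (uint8_t)(4 * mask4[i] + b);
}
```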
Chris Lattner
447a7968af
Revert accidentally committed hunks.
llvm-svn: 27386
2006-04-03 23:58:04 +00:00
Chris Lattner
533aed9a35
Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand.
llvm-svn: 27385
2006-04-03 23:55:43 +00:00
Chris Lattner
c5287c0ece
Inform the dag combiner that the predicate compares only return a low bit.
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner
9b2d6e7886
Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
"vspltisb v0, 8" instead of a constant pool load.
llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner
baa73e0d91
Rearrange code a bit
llvm-svn: 27306
2006-03-31 19:52:36 +00:00
Chris Lattner
754b41c84b
Add, sub and shuffle are legal for all vector types
llvm-svn: 27305
2006-03-31 19:48:58 +00:00
Chris Lattner
829a061abf
note to self: *save* file, then check it in
llvm-svn: 27291
2006-03-31 06:04:53 +00:00
Chris Lattner
d4058a59d4
Implement an item from the readme, folding vcmp/vcmp. instructions with
identical instructions into a single instruction. For example, for:
void test(vector float *x, vector float *y, int *P) {
int v = vec_any_out(*x, *y);
*x = (vector float)vec_cmpb(*x, *y);
*P = v;
}
we now generate:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v0, v1, v0
mfcr r4, 2
stvx v0, 0, r3
rlwinm r3, r4, 27, 31, 31
xori r3, r3, 1
stw r3, 0(r5)
mtspr 256, r2
blr
instead of:
_test:
mfspr r2, 256
oris r6, r2, 57344
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v2, v1, v0
mfcr r4, 2
*** vcmpbfp v0, v1, v0
rlwinm r4, r4, 27, 31, 31
stvx v0, 0, r3
xori r3, r4, 1
stw r3, 0(r5)
mtspr 256, r2
blr
Testcase here: CodeGen/PowerPC/vcmp-fold.ll
llvm-svn: 27290
2006-03-31 06:02:07 +00:00
Chris Lattner
d7495ae7e9
Lower vector compares to VCMP nodes, just like we lower vector comparison
predicates to VCMPo nodes.
llvm-svn: 27285
2006-03-31 05:13:27 +00:00
Chris Lattner
bca5fbe914
Mark INSERT_VECTOR_ELT as expand
llvm-svn: 27276
2006-03-31 01:48:55 +00:00
Nate Begeman
1b3928765d
Add a few more altivec intrinsics
llvm-svn: 27215
2006-03-28 04:15:58 +00:00
Chris Lattner
cb5ec07cc3
Use normal lvx for scalar_to_vector instead of lve*x. They do the exact
same thing and we have a dag node for the former.
llvm-svn: 27205
2006-03-28 01:43:22 +00:00
Chris Lattner
e55d171ccd
Tblgen doesn't like multiple SDNode<> definitions that map to the same enum value. Split them into separate enums.
llvm-svn: 27201
2006-03-28 00:40:33 +00:00
Nate Begeman
ed728c1291
SelectionDAGISel can now natively handle Switch instructions, in the same
manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary
search tree of basic blocks. The new approach has several advantages:
it is faster, it generates significantly smaller code in many cases, and
it paves the way for implementing dense switch tables as a jump table by
handling switches directly in the instruction selector.
This functionality is currently only enabled on x86, but should be safe for
every target. In anticipation of making it the default, the cfg is now
properly updated in the x86, ppc, and sparc select lowering code.
llvm-svn: 27156
2006-03-27 01:32:24 +00:00
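The binary-search lowering can be pictured at the source level as follows: with the case values sorted, each comparison either matches or halves the remaining range, so a switch with n cases needs about log2(n) compares. This is a simplified sketch with a hypothetical name; the real lowering emits a tree of basic blocks rather than a loop.

```c
#include <stddef.h>

/* cases must be sorted ascending. Returns the index of the matching case,
   or -1 to mean "take the default destination". */
int dispatch(const int *cases, size_t n, int key) {
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (key == cases[mid])
            return (int)mid;
        if (key < cases[mid])
            hi = mid;      /* continue in the left subtree  */
        else
            lo = mid + 1;  /* continue in the right subtree */
    }
    return -1;
}
```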
Chris Lattner
6961fc76bb
Codegen vector predicate compares.
llvm-svn: 27151
2006-03-26 10:06:40 +00:00
Evan Cheng
b1ddc988af
Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead
llvm-svn: 27149
2006-03-26 09:52:32 +00:00
Chris Lattner
1cb91b3cd9
Add some basic patterns for other datatypes
llvm-svn: 27116
2006-03-25 07:39:07 +00:00
Chris Lattner
2771e2c960
Codegen things like:
<int -1, int -1, int -1, int -1>
and
<int 65537, int 65537, int 65537, int 65537>
Using things like:
vspltisb v0, -1
and:
vspltish v0, 1
instead of using constant pool loads.
This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32|16}.
llvm-svn: 27106
2006-03-25 06:12:06 +00:00
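A sketch of the kind of check this requires, assuming the 5-bit signed immediate range of vspltis{b,h,w}: a replicated 32-bit word can be formed by a byte splat only if all four bytes are equal, by a halfword splat only if both halves are equal, and in each case the element must sign-extend to a value in [-16,15]. The function name and return convention are hypothetical.

```c
#include <stdint.h>

static int fits5(int32_t v) { return v >= -16 && v <= 15; }

/* For a vector whose four words all equal `word`, returns the element size
   in bytes (1, 2 or 4) of a vspltis{b,h,w} that can build it and stores the
   immediate in *imm. Returns 0 when no single splat-immediate works. */
int splat_imm(uint32_t word, int *imm) {
    uint32_t b = word & 0xFFu, h = word & 0xFFFFu;
    if (word == b * 0x01010101u) {       /* all four bytes identical */
        int32_t v = b >= 0x80u ? (int32_t)b - 0x100 : (int32_t)b;
        if (fits5(v)) { *imm = v; return 1; }
    }
    if (word == h * 0x00010001u) {       /* both halfwords identical */
        int32_t v = h >= 0x8000u ? (int32_t)h - 0x10000 : (int32_t)h;
        if (fits5(v)) { *imm = v; return 2; }
    }
    /* Interpret the whole word as a signed 32-bit value. */
    int32_t v = word >= 0x80000000u
                    ? (int32_t)(word - 0x80000000u) - 0x7FFFFFFF - 1
                    : (int32_t)word;
    if (fits5(v)) { *imm = v; return 4; }
    return 0;
}
```

Under this model -1 maps to a byte splat of -1 and 65537 to a halfword splat of 1, matching the examples above, while 0x7FFFFFFF is rejected because its bytes are not all equal.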
Chris Lattner
a90b7141ed
Disable the i32->float G5 optimization. It is unsafe, as documented in the
comment.
This fixes 177.mesa, and McCat/09-vor with the td scheduler.
llvm-svn: 27060
2006-03-24 07:53:47 +00:00
Chris Lattner
ab882abce8
add support for using vxor to build zero vectors. This implements
Regression/CodeGen/PowerPC/vec_zero.ll
llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Chris Lattner
4a66d69433
When possible, custom lower 32-bit SINT_TO_FP to this:
_foo2:
extsw r2, r3
std r2, -8(r1)
lfd f0, -8(r1)
fcfid f0, f0
frsp f1, f0
blr
instead of this:
_foo2:
lis r2, ha16(LCPI2_0)
lis r4, 17200
xoris r3, r3, 32768
stw r3, -4(r1)
stw r4, -8(r1)
lfs f0, lo16(LCPI2_0)(r2)
lfd f1, -8(r1)
fsub f0, f1, f0
frsp f1, f0
blr
This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
with llcbeta (16.7% and 38.1% respectively).
llvm-svn: 26943
2006-03-22 05:30:33 +00:00
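The older sequence above is the classic magic-number conversion: "lis r4, 17200" forms the high word 0x43300000 (the exponent of 2^52), "xoris r3, r3, 32768" flips the sign bit of the integer, and the subtraction removes the bias 2^52 + 2^31. A C sketch of that arithmetic, assuming IEEE-754 doubles and with a hypothetical function name:

```c
#include <stdint.h>
#include <string.h>

/* Convert a 32-bit signed integer to double without an int-to-fp
   instruction, by assembling a double's bit pattern directly. */
double i32_to_double_magic(int32_t x) {
    /* High word 0x43300000, low word x with its sign bit flipped:
       the double value is 2^52 + 2^31 + x. */
    uint64_t bits = ((uint64_t)0x43300000u << 32) |
                    ((uint32_t)x ^ 0x80000000u);
    /* The constant being subtracted: 2^52 + 2^31. */
    uint64_t bias_bits = ((uint64_t)0x43300000u << 32) | 0x80000000u;
    double biased, bias;
    memcpy(&biased, &bits, sizeof biased);
    memcpy(&bias, &bias_bits, sizeof bias);
    return biased - bias; /* exact: both operands are integers below 2^53 */
}
```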
Chris Lattner
00f4683bf6
These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will.
llvm-svn: 26930
2006-03-21 20:51:05 +00:00
Chris Lattner
6d74b09da7
remove dead variable
llvm-svn: 26907
2006-03-20 22:37:23 +00:00
Chris Lattner
a1bc294f0c
Fix a couple of bugs in permute/splat generation, thanks to Nate for actually
figuring these out! :)
llvm-svn: 26904
2006-03-20 18:26:51 +00:00
Chris Lattner
a9a1313386
Add support for generating vspltw, instead of a vperm instruction with a
constant pool load. This generates significantly nicer code for splats.
When tblgen gets bugfixed, we can remove the custom selection code.
llvm-svn: 26898
2006-03-20 06:51:10 +00:00
Chris Lattner
a8fbb6dd3d
Implement PPC::isSplatShuffleMask and PPC::getVSPLTImmediate.
llvm-svn: 26897
2006-03-20 06:37:44 +00:00
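The commit does not show the bodies of these predicates. As a rough sketch of the idea for a v4i32 shuffle, and with hypothetical C names standing in for the C++ functions, a mask is a splat when every element selects the same lane of the first input, and that lane number is what a vspltw immediate would encode:

```c
#include <stdbool.h>
#include <stdint.h>

/* mask holds four element indexes; 0-3 select lanes of the first input. */
bool is_splat_mask(const uint8_t mask[4]) {
    for (int i = 1; i < 4; i++)
        if (mask[i] != mask[0])
            return false;
    return mask[0] < 4;
}

/* Lane to replicate; only meaningful when is_splat_mask(mask) is true. */
unsigned splat_immediate(const uint8_t mask[4]) { return mask[0]; }
```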
Chris Lattner
ffc475689b
fix duplicate definition errors
llvm-svn: 26896
2006-03-20 06:33:01 +00:00