llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	66f1fbaaad	Fix miscompilation of float vector returns. Compile code to this: _func: vsldoi v2, v3, v2, 12 vsldoi v2, v2, v2, 4 blr instead of: _func: vsldoi v2, v3, v2, 12 vsldoi v2, v2, v2, 4 *** vor f1, v2, v2 blr llvm-svn: 29607	2006-08-11 16:47:32 +00:00
Chris Lattner	8298265042	Fix some ppc64 issues with vector code. llvm-svn: 29384	2006-07-28 16:45:47 +00:00
Chris Lattner	9e56e5c003	Rename RelocModel::PIC to PIC_, to avoid conflicts with -DPIC. llvm-svn: 29307	2006-07-26 21:12:04 +00:00
Chris Lattner	a7976d329e	Implement Regression/CodeGen/PowerPC/bswap-load-store.ll by folding bswaps into i16/i32 load/stores. llvm-svn: 29089	2006-07-10 20:56:58 +00:00
Chris Lattner	8aed3cc46b	Implement 64-bit select, bswap, etc. llvm-svn: 28935	2006-06-27 20:14:52 +00:00
Chris Lattner	a07410c95b	PPC doesn't have bit converts to/from i64 llvm-svn: 28932	2006-06-27 18:40:08 +00:00
Chris Lattner	d48ce27532	Implement 64-bit undef, sub, shl/shr, srem/urem llvm-svn: 28929	2006-06-27 18:18:41 +00:00
Chris Lattner	cb5a84f446	Use i32 for shift amounts instead of i64. This gets bisort working. llvm-svn: 28927	2006-06-27 17:34:57 +00:00
Chris Lattner	97b3da1519	Implement a bunch of 64-bit cleanliness work. With this, treeadd builds (but doesn't work right). llvm-svn: 28921	2006-06-27 00:04:13 +00:00
Chris Lattner	ec78cade34	Improve PPC64 calling convention support llvm-svn: 28919	2006-06-26 22:48:35 +00:00
Chris Lattner	dc38e6f322	Correct returns of 64-bit values, though they seemed to work before... llvm-svn: 28892	2006-06-21 00:34:03 +00:00
Chris Lattner	a5190ae7a9	fix some assumptions that pointers can only be 32-bits. With this, we can now compile: static unsigned long X; void test1() { X = 0; } into: _test1: lis r2, ha16(_X) li r3, 0 stw r3, lo16(_X)(r2) blr Totally amazing :) llvm-svn: 28839	2006-06-16 21:01:35 +00:00
Chris Lattner	a35f306740	Rename some subtarget features. A CPU now can have 64-bit instructions, and in 32-bit mode we can choose to optionally use 64-bit registers. llvm-svn: 28824	2006-06-16 17:34:12 +00:00
Evan Cheng	94bb93f8f7	Type of extract_element index operand should be iPTR. llvm-svn: 28797	2006-06-15 08:18:06 +00:00
Chris Lattner	006b2c6ab9	Fix a problem exposed by the local allocator. CALL instructions are not marked as using incoming argument registers, so the local allocator would clobber them between their set and use. To fix this, we give the call instructions a variable number of uses in the CALL MachineInstr itself, so live variables understands the live ranges of these register arguments. llvm-svn: 28744	2006-06-10 01:14:28 +00:00
Chris Lattner	b9342afa56	Always reserve space for 8 spilled GPRs. GCC apparently assumes that this space will be available, even if the callee isn't varargs. llvm-svn: 28571	2006-05-30 21:21:04 +00:00
Evan Cheng	a3add0fea8	Change RET node to include signedness information of the return values. i.e. RET chain, value1, sign1, value2, sign2, ... llvm-svn: 28510	2006-05-26 23:10:12 +00:00
Evan Cheng	c2cd473d9b	CALL node change (arg / sign pairs instead of just arguments). llvm-svn: 28462	2006-05-25 00:57:32 +00:00
Chris Lattner	aa2372562e	Patches to make the LLVM sources more -pedantic clean. Patch provided by Anton Korobeynikov! This is a step towards closing PR786. llvm-svn: 28447	2006-05-24 17:04:05 +00:00
Chris Lattner	33165c246c	Fix CodeGen/Generic/vector.ll:test_div with altivec. llvm-svn: 28445	2006-05-24 00:15:25 +00:00
Chris Lattner	b56d22c2f6	Handle SETO* like we handle SET*, restoring behavior after Evan's setcc change. This fixes PowerPC/fnegsel.ll. llvm-svn: 28443	2006-05-24 00:06:44 +00:00
Chris Lattner	eb755fc1b3	Make PPC call lowering more aggressive, making the isel matching code simple enough to be autogenerated. llvm-svn: 28354	2006-05-17 19:00:46 +00:00
Chris Lattner	b1e9e37c58	Switch PPC over to a call-selection model where the lowering code creates the copyto/fromregs instead of making the PPCISD::CALL selection code create them. This vastly simplifies the selection code, and moves the ABI handling parts into one place. llvm-svn: 28346	2006-05-17 06:01:33 +00:00
Chris Lattner	b7552a88d6	3 changes, 2 of which are cleanup one of which changes codegen: 1. Rearrange code a bit so that the special case doesn't require indenting lots of code. 2. Add comments describing PPC calling convention. 3. Only round up to 56-bytes of stack space for an outgoing call if the callee is varargs. This saves a bit of stack space. llvm-svn: 28342	2006-05-17 00:15:40 +00:00
Chris Lattner	f058f5aef1	implement passing/returning vector regs to calls, at least non-varargs calls. llvm-svn: 28341	2006-05-16 23:54:25 +00:00
Chris Lattner	aa40ec1b32	Instead of implementing LowerCallTo directly, let the default impl produce an ISD::CALL node, then custom lower that. This means that we only have to handle LEGAL call operands/results, not every possible type. This allows us to simplify the call code, shrinking it by about 1/3. llvm-svn: 28339	2006-05-16 22:56:08 +00:00
Chris Lattner	26e2fcd8b1	Simplify the argument counting logic by only incrementing the index. llvm-svn: 28335	2006-05-16 18:58:15 +00:00
Chris Lattner	76c47b50e7	Simplify the dead argument handling code. llvm-svn: 28334	2006-05-16 18:54:32 +00:00
Chris Lattner	318f0d2122	Vector args passed in registers don't reserve stack space. llvm-svn: 28333	2006-05-16 18:51:52 +00:00
Chris Lattner	4302e8fb67	Switch the PPC backend over to using FORMAL_ARGUMENTS for formal argument handling. This makes the lower argument code significantly simpler (we only need to handle legal argument types). Incidentally, this also implements support for vector argument registers, so long as they are not on the stack. llvm-svn: 28331	2006-05-16 18:18:50 +00:00
Chris Lattner	d2ca9abf57	Fit in 80 cols llvm-svn: 28311	2006-05-16 04:20:24 +00:00
Chris Lattner	ae48a894b1	Remove dead var, fix bad override. llvm-svn: 28264	2006-05-12 21:09:57 +00:00
Chris Lattner	84b49d51be	Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll llvm-svn: 28017	2006-04-28 21:56:10 +00:00
Nate Begeman	4ca2ea5b43	JumpTable support! What this represents is working asm and jit support for x86 and ppc for 100% dense switch statements when relocations are non-PIC. This support will be extended and enhanced in the coming days to support PIC, and less dense forms of jump tables. llvm-svn: 27947	2006-04-22 18:53:45 +00:00
Chris Lattner	518834c67e	Fix a crash on: void foo2(vector float A, vector float B) { vector float C = (vector float)vec_cmpeq(A, B); if (!vec_any_eq(A, B)) B = (vector float){0,0,0,0}; A = C; } llvm-svn: 27808	2006-04-18 18:28:22 +00:00
Chris Lattner	1e174c87c3	pretty print node name llvm-svn: 27806	2006-04-18 18:05:58 +00:00
Chris Lattner	9754d142a4	Implement an important entry from README_ALTIVEC: If an altivec predicate compare is used immediately by a branch, don't use a (serializing) MFCR instruction to read the CR6 register, which requires a compare to get it back to CR's. Instead, just branch on CR6 directly. :) For example, for: void foo2(vector float A, vector float B) { if (!vec_any_eq(A, B)) *B = (vector float){0,0,0,0}; } We now generate: _foo2: mfspr r2, 256 oris r5, r2, 12288 mtspr 256, r5 lvx v2, 0, r4 lvx v3, 0, r3 vcmpeqfp. v2, v3, v2 bne cr6, LBB1_2 ; UnifiedReturnBlock LBB1_1: ; cond_true vxor v2, v2, v2 stvx v2, 0, r4 mtspr 256, r2 blr LBB1_2: ; UnifiedReturnBlock mtspr 256, r2 blr instead of: _foo2: mfspr r2, 256 oris r5, r2, 12288 mtspr 256, r5 lvx v2, 0, r4 lvx v3, 0, r3 vcmpeqfp. v2, v3, v2 mfcr r3, 2 rlwinm r3, r3, 27, 31, 31 cmpwi cr0, r3, 0 beq cr0, LBB1_2 ; UnifiedReturnBlock LBB1_1: ; cond_true vxor v2, v2, v2 stvx v2, 0, r4 mtspr 256, r2 blr LBB1_2: ; UnifiedReturnBlock mtspr 256, r2 blr This implements CodeGen/PowerPC/vec_br_cmp.ll. llvm-svn: 27804	2006-04-18 17:59:36 +00:00
Chris Lattner	96d50487c9	Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing even/odd halves. Thanks to Nate telling me what's what. llvm-svn: 27793	2006-04-18 04:28:57 +00:00
Chris Lattner	d6d82aa889	Implement v16i8 multiply with this code: vmuloub v5, v3, v2 vmuleub v2, v3, v2 vperm v2, v2, v5, v4 This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are 6.79x faster than before. Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with GCC. Remove the 'integer multiplies' todo from the README file. llvm-svn: 27792	2006-04-18 03:57:35 +00:00
Chris Lattner	7e439874cb	Lower v8i16 multiply into this code: li r5, lo16(LCPI1_0) lis r6, ha16(LCPI1_0) lvx v4, r6, r5 vmulouh v5, v3, v2 vmuleuh v2, v3, v2 vperm v2, v2, v5, v4 where v4 is: LCPI1_0: ; <16 x ubyte> .byte 2 .byte 3 .byte 18 .byte 19 .byte 6 .byte 7 .byte 22 .byte 23 .byte 10 .byte 11 .byte 26 .byte 27 .byte 14 .byte 15 .byte 30 .byte 31 This is 5.07x faster on the G5 (measured) than lowering to scalar code + loads/stores. llvm-svn: 27789	2006-04-18 03:43:48 +00:00
Chris Lattner	a2cae1bb10	Custom lower v4i32 multiplies into a cute sequence, instead of having legalize scalarize the sequence into 4 mullw's and a bunch of load/store traffic. This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements PowerPC/vec_mul.ll llvm-svn: 27788	2006-04-18 03:24:30 +00:00
Chris Lattner	e54133cfba	Make sure to check splats of every constant we can, handle splat(31) by being a bit more clever, add support for odd splats from -31 to -17. llvm-svn: 27764	2006-04-17 18:09:22 +00:00
Chris Lattner	264c908e3a	Teach the ppc backend to use rol and vsldoi to generate splatted constants. This implements vec_constants.ll:test_vsldoi and test_rol llvm-svn: 27760	2006-04-17 17:55:10 +00:00
Chris Lattner	1b3806ace5	Make some code more general, adding support for constant formation of several new patterns. llvm-svn: 27754	2006-04-17 06:58:41 +00:00
Chris Lattner	f8dd76df5b	Learn how to make odd splatted constants in range [17,29]. This implements PowerPC/vec_constants.ll:test_29. llvm-svn: 27752	2006-04-17 06:07:44 +00:00
Chris Lattner	2a099c04c1	Pull some code out into a helper function. Efficiently codegen even splats in the range [-32,30]. This allows us to codegen <30,30,30,30> as: vspltisw v0, 15 vadduwm v2, v0, v0 instead of as a cp load. llvm-svn: 27750	2006-04-17 06:00:21 +00:00
Chris Lattner	071ad01ceb	Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle, if it can be implemented in 3 or fewer discrete altivec instructions, codegen it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll llvm-svn: 27748	2006-04-17 05:28:54 +00:00
Chris Lattner	06a21ba96b	Implement a TODO: have the legalizer canonicalize a bunch of operations to one type (v4i32) so that we don't have to write patterns for each type, and so that more CSE opportunities are exposed. llvm-svn: 27731	2006-04-16 01:37:57 +00:00
Chris Lattner	fa5aa396c2	Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors. Remove some done items from the todo list. llvm-svn: 27729	2006-04-16 01:01:29 +00:00
Chris Lattner	24acbe46c0	Fix a crash when faced with a shuffle vector that has an undef in its mask. llvm-svn: 27726	2006-04-15 23:48:05 +00:00
Chris Lattner	559c8ba466	Allow undef in a shuffle mask llvm-svn: 27714	2006-04-14 23:19:08 +00:00
Chris Lattner	4211ca9108	Move the rest of the PPCTargetLowering::LowerOperation cases out into separate functions, for simplicity and code clarity. llvm-svn: 27693	2006-04-14 06:01:58 +00:00
Chris Lattner	19e9055eb5	Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate functions, which makes the code much cleaner :) llvm-svn: 27692	2006-04-14 05:19:18 +00:00
Chris Lattner	883fb053bd	Force non-darwin targets to use a static relo model. This fixes PR734, tested by CodeGen/Generic/vector.ll llvm-svn: 27657	2006-04-13 17:10:48 +00:00
Chris Lattner	147e50e1c5	Add a new way to match vector constants, which make it easier to bang bits of different types. Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load, implementing PowerPC/vec_constants.ll:test1. This compiles: typedef float vf __attribute__ ((vector_size (16))); typedef int vi __attribute__ ((vector_size (16))); void test(vi P1, vi P2, vf P3) { P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000}; P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF}; P3 = vec_abs((vector float)*P3); } to: _test: mfspr r2, 256 oris r6, r2, 49152 mtspr 256, r6 vspltisw v0, -1 vslw v0, v0, v0 lvx v1, 0, r3 vand v1, v1, v0 stvx v1, 0, r3 lvx v1, 0, r4 vandc v1, v1, v0 stvx v1, 0, r4 lvx v1, 0, r5 vandc v0, v1, v0 stvx v0, 0, r5 mtspr 256, r2 blr instead of (with two constant pool entries): _test: mfspr r2, 256 oris r6, r2, 49152 mtspr 256, r6 li r6, lo16(LCPI1_0) lis r7, ha16(LCPI1_0) li r8, lo16(LCPI1_1) lis r9, ha16(LCPI1_1) lvx v0, r7, r6 lvx v1, 0, r3 vand v0, v1, v0 stvx v0, 0, r3 lvx v0, r9, r8 lvx v1, 0, r4 vand v1, v1, v0 stvx v1, 0, r4 lvx v1, 0, r5 vand v0, v1, v0 stvx v0, 0, r5 mtspr 256, r2 blr GCC produces (with 2 cp entries): _test: mfspr r0,256 stw r0,-4(r1) oris r0,r0,0xc00c mtspr 256,r0 lis r2,ha16(LC0) lis r9,ha16(LC1) la r2,lo16(LC0)(r2) lvx v0,0,r3 lvx v1,0,r5 la r9,lo16(LC1)(r9) lwz r12,-4(r1) lvx v12,0,r2 lvx v13,0,r9 vand v0,v0,v12 stvx v0,0,r3 vspltisw v0,-1 vslw v12,v0,v0 vandc v1,v1,v12 stvx v1,0,r5 lvx v0,0,r4 vand v0,v0,v13 stvx v0,0,r4 mtspr 256,r12 blr llvm-svn: 27624	2006-04-12 19:07:14 +00:00
Chris Lattner	74cf9ff761	Rename get_VSPLI_elt -> get_VSPLTI_elt Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each form, eliminating a bunch of Pat patterns in the .td file and allowing us to CSE stuff more aggressively. This implements PowerPC/buildvec_canonicalize.ll:VSPLTI llvm-svn: 27614	2006-04-12 17:37:20 +00:00
Chris Lattner	e318a7574e	Ensure that zero vectors are always v4i32, which forces them to CSE with each other. This implements CodeGen/PowerPC/vxor-canonicalize.ll llvm-svn: 27609	2006-04-12 16:53:28 +00:00
Chris Lattner	e4db08a2f1	Vector function results go into V2 according to GCC. The darwin ABI doc doesn't say where they go :-/ llvm-svn: 27579	2006-04-11 01:38:39 +00:00
Chris Lattner	92533cfb4a	Move some return-handling code from lowerarguments to the ISD::RET handling stuff. No functionality change. llvm-svn: 27577	2006-04-11 01:21:43 +00:00
Chris Lattner	3a68f3c3ca	properly mark vector selects as expanded to select_cc llvm-svn: 27544	2006-04-08 22:59:15 +00:00
Chris Lattner	0a3d1bbca4	Add VRRC select support llvm-svn: 27543	2006-04-08 22:45:08 +00:00
Chris Lattner	d9e80f4516	Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vspltish instead of a constant pool load. llvm-svn: 27538	2006-04-08 07:14:26 +00:00
Chris Lattner	d71a1f946d	Change the interface to the predicate that determines if vsplti* can be used. No functionality changes. llvm-svn: 27536	2006-04-08 06:46:53 +00:00
Chris Lattner	466841ddc7	Make sure to return the result in the right type. llvm-svn: 27469	2006-04-06 23:12:19 +00:00
Chris Lattner	a4bbfaed5c	Match vpku[hw]um(x,x). Convert vsldoi(x,x) to work the same way other (x,x) cases work. llvm-svn: 27467	2006-04-06 22:28:36 +00:00
Chris Lattner	f38e033270	Add support for matching vmrg(x,x) patterns llvm-svn: 27463	2006-04-06 22:02:42 +00:00
Chris Lattner	d1dcb52093	Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles. llvm-svn: 27457	2006-04-06 21:11:54 +00:00
Chris Lattner	1d33819194	Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to lower it and LLVM to have one fewer intrinsic. This implements CodeGen/PowerPC/vec_shuffle.ll llvm-svn: 27450	2006-04-06 18:26:28 +00:00
Chris Lattner	e8b83b4206	Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into vperm with a perm mask lvx'd from the constant pool. llvm-svn: 27448	2006-04-06 17:23:16 +00:00
Chris Lattner	39cc717c65	Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll llvm-svn: 27439	2006-04-05 17:39:25 +00:00
Evan Cheng	2cf4232ced	Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered. llvm-svn: 27433	2006-04-05 06:09:26 +00:00
Chris Lattner	4a744e5c9d	Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'. llvm-svn: 27413	2006-04-04 22:28:35 +00:00
Chris Lattner	95c7adc7cb	Ask legalize to promote all vector shuffles to be v16i8 instead of having to handle all 4 PPC vector types. This simplifies the matching code and allows us to eliminate a bunch of patterns. This also adds cases we were missing, such as CodeGen/PowerPC/vec_splat.ll:splat_h. llvm-svn: 27400	2006-04-04 17:25:31 +00:00
Chris Lattner	447a7968af	Revert accidentally committed hunks. llvm-svn: 27386	2006-04-03 23:58:04 +00:00
Chris Lattner	533aed9a35	Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand. llvm-svn: 27385	2006-04-03 23:55:43 +00:00
Chris Lattner	c5287c0ece	Inform the dag combiner that the predicate compares only return a low bit. llvm-svn: 27359	2006-04-02 06:26:07 +00:00
Chris Lattner	9b2d6e7886	Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into "vspltisb v0, 8" instead of a constant pool load. llvm-svn: 27335	2006-04-02 00:43:36 +00:00
Chris Lattner	baa73e0d91	Rearrange code a bit llvm-svn: 27306	2006-03-31 19:52:36 +00:00
Chris Lattner	754b41c84b	Add, sub and shuffle are legal for all vector types llvm-svn: 27305	2006-03-31 19:48:58 +00:00
Chris Lattner	829a061abf	note to self: save file, then check it in llvm-svn: 27291	2006-03-31 06:04:53 +00:00
Chris Lattner	d4058a59d4	Implement an item from the readme, folding vcmp/vcmp. instructions with identical instructions into a single instruction. For example, for: void test(vector float x, vector float y, int P) { int v = vec_any_out(x, y); x = (vector float)vec_cmpb(x, y); P = v; } we now generate: _test: mfspr r2, 256 oris r6, r2, 49152 mtspr 256, r6 lvx v0, 0, r4 lvx v1, 0, r3 vcmpbfp. v0, v1, v0 mfcr r4, 2 stvx v0, 0, r3 rlwinm r3, r4, 27, 31, 31 xori r3, r3, 1 stw r3, 0(r5) mtspr 256, r2 blr instead of: _test: mfspr r2, 256 oris r6, r2, 57344 mtspr 256, r6 lvx v0, 0, r4 lvx v1, 0, r3 vcmpbfp. v2, v1, v0 mfcr r4, 2 ** vcmpbfp v0, v1, v0 rlwinm r4, r4, 27, 31, 31 stvx v0, 0, r3 xori r3, r4, 1 stw r3, 0(r5) mtspr 256, r2 blr Testcase here: CodeGen/PowerPC/vcmp-fold.ll llvm-svn: 27290	2006-03-31 06:02:07 +00:00
Chris Lattner	d7495ae7e9	Lower vector compares to VCMP nodes, just like we lower vector comparison predicates to VCMPo nodes. llvm-svn: 27285	2006-03-31 05:13:27 +00:00
Chris Lattner	bca5fbe914	Mark INSERT_VECTOR_ELT as expand llvm-svn: 27276	2006-03-31 01:48:55 +00:00
Nate Begeman	1b3928765d	Add a few more altivec intrinsics llvm-svn: 27215	2006-03-28 04:15:58 +00:00
Chris Lattner	cb5ec07cc3	Use normal lvx for scalar_to_vector instead of lve*x. They do the exact same thing and we have a dag node for the former. llvm-svn: 27205	2006-03-28 01:43:22 +00:00
Chris Lattner	e55d171ccd	Tblgen doesn't like multiple SDNode<> definitions that map to the same enum value. Split them into separate enums. llvm-svn: 27201	2006-03-28 00:40:33 +00:00
Nate Begeman	ed728c1291	SelectionDAGISel can now natively handle Switch instructions, in the same manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary search tree of basic blocks. The new approach has several advantages: it is faster, it generates significantly smaller code in many cases, and it paves the way for implementing dense switch tables as a jump table by handling switches directly in the instruction selector. This functionality is currently only enabled on x86, but should be safe for every target. In anticipation of making it the default, the cfg is now properly updated in the x86, ppc, and sparc select lowering code. llvm-svn: 27156	2006-03-27 01:32:24 +00:00
Chris Lattner	6961fc76bb	Codegen vector predicate compares. llvm-svn: 27151	2006-03-26 10:06:40 +00:00
Evan Cheng	b1ddc988af	Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead llvm-svn: 27149	2006-03-26 09:52:32 +00:00
Chris Lattner	1cb91b3cd9	Add some basic patterns for other datatypes llvm-svn: 27116	2006-03-25 07:39:07 +00:00
Chris Lattner	2771e2c960	Codegen things like: <int -1, int -1, int -1, int -1> and <int 65537, int 65537, int 65537, int 65537> Using things like: vspltisb v0, -1 and: vspltish v0, 1 instead of using constant pool loads. This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32\|16}. llvm-svn: 27106	2006-03-25 06:12:06 +00:00
Chris Lattner	a90b7141ed	Disable the i32->float G5 optimization. It is unsafe, as documented in the comment. This fixes 177.mesa, and McCat/09-vor with the td scheduler. llvm-svn: 27060	2006-03-24 07:53:47 +00:00
Chris Lattner	ab882abce8	add support for using vxor to build zero vectors. This implements Regression/CodeGen/PowerPC/vec_zero.ll llvm-svn: 27059	2006-03-24 07:48:08 +00:00
Chris Lattner	4a66d69433	When possible, custom lower 32-bit SINT_TO_FP to this: _foo2: extsw r2, r3 std r2, -8(r1) lfd f0, -8(r1) fcfid f0, f0 frsp f1, f0 blr instead of this: _foo2: lis r2, ha16(LCPI2_0) lis r4, 17200 xoris r3, r3, 32768 stw r3, -4(r1) stw r4, -8(r1) lfs f0, lo16(LCPI2_0)(r2) lfd f1, -8(r1) fsub f0, f1, f0 frsp f1, f0 blr This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s with llcbeta (16.7% and 38.1% respectively). llvm-svn: 26943	2006-03-22 05:30:33 +00:00
Chris Lattner	00f4683bf6	These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will. llvm-svn: 26930	2006-03-21 20:51:05 +00:00
Chris Lattner	6d74b09da7	remove dead variable llvm-svn: 26907	2006-03-20 22:37:23 +00:00
Chris Lattner	a1bc294f0c	Fix a couple of bugs in permute/splat generate, thanks to Nate for actually figuring these out! :) llvm-svn: 26904	2006-03-20 18:26:51 +00:00
Chris Lattner	a9a1313386	Add support for generating vspltw, instead of a vperm instruction with a constant pool load. This generates significantly nicer code for splats. When tblgen gets bugfixed, we can remove the custom selection code. llvm-svn: 26898	2006-03-20 06:51:10 +00:00
Chris Lattner	a8fbb6dd3d	Implement PPC::isSplatShuffleMask and PPC::getVSPLTImmediate. llvm-svn: 26897	2006-03-20 06:37:44 +00:00
Chris Lattner	ffc475689b	fix duplicate definition errors llvm-svn: 26896	2006-03-20 06:33:01 +00:00


251 Commits