llvm-project

Commit Graph

Author	SHA1	Message	Date
Jim Laskey	d387cc5cde	Reactivate llvm.dbg.declare. llvm-svn: 27192	2006-03-27 23:31:10 +00:00
Chris Lattner	5bb1d90afd	Disable dbg_declare, it currently breaks the CFE build llvm-svn: 27182	2006-03-27 21:36:03 +00:00
Nate Begeman	ed728c1291	SelectionDAGISel can now natively handle Switch instructions, in the same manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary search tree of basic blocks. The new approach has several advantages: it is faster, it generates significantly smaller code in many cases, and it paves the way for implementing dense switch tables as a jump table by handling switches directly in the instruction selector. This functionality is currently only enabled on x86, but should be safe for every target. In anticipation of making it the default, the cfg is now properly updated in the x86, ppc, and sparc select lowering code. llvm-svn: 27156	2006-03-27 01:32:24 +00:00
Jim Laskey	7092888bcc	Bullet proof against undefined args produced by upgrading old-style debug info. llvm-svn: 27155	2006-03-26 22:46:27 +00:00
Chris Lattner	313229c74b	fix inverted conditional llvm-svn: 27089	2006-03-24 22:49:42 +00:00
Jim Laskey	53f1ecc560	Rename for truth in advertising. llvm-svn: 27063	2006-03-24 09:50:27 +00:00
Chris Lattner	d96b09a7b9	Lower target intrinsics into an INTRINSIC node llvm-svn: 27035	2006-03-24 02:22:33 +00:00
Jim Laskey	a8bdac875d	Handle new forms of llvm.dbg intrinsics. llvm-svn: 26988	2006-03-23 18:06:46 +00:00
Chris Lattner	b893d04a67	Fix a typo llvm-svn: 26965	2006-03-22 22:20:49 +00:00
Chris Lattner	2f4119a608	Implement simple support for vector casting. This can currently only handle casts between legal vector types. llvm-svn: 26961	2006-03-22 20:09:35 +00:00
Chris Lattner	7c0cd8cafc	add some trivial support for extractelement. llvm-svn: 26928	2006-03-21 20:44:12 +00:00
Chris Lattner	672a42d731	Add a hacky workaround for crashes due to vectors live across blocks. Note that this code won't work for vectors that aren't legal on the target. Improvements coming. llvm-svn: 26925	2006-03-21 19:20:37 +00:00
Chris Lattner	29b2301460	implement basic support for INSERT_VECTOR_ELT. llvm-svn: 26849	2006-03-19 01:17:20 +00:00
Chris Lattner	f4e1a53647	Rename ConstantVec -> BUILD_VECTOR and VConstant -> VBUILD_VECTOR. Allow *BUILD_VECTOR to take variable inputs. llvm-svn: 26847	2006-03-19 00:52:58 +00:00
Chris Lattner	c16b05e67d	implement vector.ll:test_undef llvm-svn: 26845	2006-03-19 00:20:20 +00:00
Chris Lattner	32206f54c6	Change the structure of lowering vector stuff. Note: This breaks some things. llvm-svn: 26840	2006-03-18 01:44:44 +00:00
Nate Begeman	bb01d4f272	Remove BRTWOWAY* Make the PPC backend not dependent on BRTWOWAY_CC and make the branch selector smarter about the code it generates, fixing a case in the readme. llvm-svn: 26814	2006-03-17 01:40:33 +00:00
Chris Lattner	7ececaad83	Fix a problem fully scalarizing values. llvm-svn: 26811	2006-03-16 23:05:19 +00:00
Chris Lattner	8471b15706	Add support for CopyFromReg from vector values. Note: this doesn't support illegal vector types yet! llvm-svn: 26799	2006-03-16 19:57:50 +00:00
Chris Lattner	49409cb925	Teach CreateRegForValue how to handle vector types. llvm-svn: 26798	2006-03-16 19:51:18 +00:00
Chris Lattner	4024c00ce7	add support for vector->vector casts llvm-svn: 26788	2006-03-15 22:19:46 +00:00
Jim Laskey	acb6e34277	Handle the removal of the debug chain. llvm-svn: 26729	2006-03-13 13:07:37 +00:00
Evan Cheng	38280c0020	Added a parameter to control whether Constant::getStringValue() would chop off the result string at the first null terminator. llvm-svn: 26704	2006-03-10 23:52:03 +00:00
Chris Lattner	d3ef6c290a	scrape out bits of llvm-db llvm-svn: 26701	2006-03-10 22:48:19 +00:00
Chris Lattner	5255d04357	Simplify the interface to the schedulers, to not pass the selected heuristic in. llvm-svn: 26692	2006-03-10 07:49:12 +00:00
Chris Lattner	213209a248	remove dbg_declare, it's not used yet. llvm-svn: 26659	2006-03-09 20:02:42 +00:00
Jim Laskey	2698f0de7a	Get rid of the multiple copies of getStringValue. Now a Constant:: method. llvm-svn: 26616	2006-03-08 18:11:07 +00:00
Chris Lattner	543832d39d	Change the interface for getting a target HazardRecognizer to be more clean. llvm-svn: 26608	2006-03-08 04:25:59 +00:00
Chris Lattner	47639dbb93	Hoist the HazardRecognizer out of the ScheduleDAGList.cpp file to where targets can implement them. Make the top-down scheduler non-g5-specific. Remove the old testing hazard recognizer. llvm-svn: 26569	2006-03-06 00:22:00 +00:00
Chris Lattner	98ecb8ec61	Split the list scheduler into top-down and bottom-up pieces. The priority function of the top-down scheduler is completely bogus currently, and having (future) PPC specific in this file is also wrong, but this is a small incremental step. llvm-svn: 26552	2006-03-05 21:10:33 +00:00
Chris Lattner	5c1ba2ac08	Codegen copysign[f] into a FCOPYSIGN node llvm-svn: 26542	2006-03-05 05:09:38 +00:00
Evan Cheng	3bf916ddd9	Add more vector NodeTypes: VSDIV, VUDIV, VAND, VOR, and VXOR. llvm-svn: 26504	2006-03-03 07:01:07 +00:00
Chris Lattner	ad3c974a77	remove the read/write port/io intrinsics. llvm-svn: 26479	2006-03-03 00:19:58 +00:00
Chris Lattner	093c159efb	Split memcpy/memset/memmove intrinsics into i32/i64 versions, resolving PR709, and paving the way for future progress. llvm-svn: 26476	2006-03-03 00:00:25 +00:00
Evan Cheng	b97aab4371	Vector ops lowering. llvm-svn: 26436	2006-03-01 01:09:54 +00:00
Chris Lattner	9fed5b6122	Add support for output memory constraints. llvm-svn: 26410	2006-02-27 23:45:39 +00:00
Jeff Cohen	83c22e0d75	Get VC++ building again. llvm-svn: 26351	2006-02-24 02:52:40 +00:00
Chris Lattner	dcf785bf46	Implement (most of) selection of inline asm memory operands. llvm-svn: 26350	2006-02-24 02:13:54 +00:00
Chris Lattner	7ef7a64ebb	Lower C_Memory operands. llvm-svn: 26346	2006-02-24 01:11:24 +00:00
Chris Lattner	e7c0ffb3a0	Fix an endianness problem on big-endian targets with expanded operands to inline asms. Mark some methods const. llvm-svn: 26334	2006-02-23 20:06:57 +00:00
Chris Lattner	571d9647c6	Record all of the expanded registers in the DAG and machine instr, fixing several bugs in inline asm expanded operands. llvm-svn: 26332	2006-02-23 19:21:04 +00:00
Chris Lattner	b1124f3c76	This fixes a couple of problems with expansion llvm-svn: 26318	2006-02-22 23:09:03 +00:00
Chris Lattner	6f87d18be9	Change a whole bunch of code to be built around RegsForValue instead of a single register number. This fully implements promotion for inline asms, expand is close but not quite right yet. llvm-svn: 26316	2006-02-22 22:37:12 +00:00
Chris Lattner	7ad77dfc2a	split register class handling from explicit physreg handling. llvm-svn: 26308	2006-02-22 00:56:39 +00:00
Chris Lattner	5c79f98f15	Adjust to changes in getRegForInlineAsmConstraint prototype llvm-svn: 26306	2006-02-21 23:12:12 +00:00
Evan Cheng	c3dcf5a4d7	Dumb bug. Code sees a memcpy from X+c so it increments src offset. But it turns out not to point to a constant string but it forgot to change the offset back. llvm-svn: 26242	2006-02-16 23:11:42 +00:00
Evan Cheng	42c01c8d39	If the false case is the current basic block, then this is a self loop. We do not want to emit "Loop: ... brcond Out; br Loop", as it adds an extra instruction in the loop. Instead, invert the condition and emit "Loop: ... br!cond Loop; br Out". Generalize the fix by moving it from PPCDAGToDAGISel to SelectionDAGLowering. llvm-svn: 26231	2006-02-16 08:27:56 +00:00
Evan Cheng	93e4865d4b	Remove an unused function parameter. llvm-svn: 26221	2006-02-15 22:12:35 +00:00
Evan Cheng	6781b6e62e	Turn a memcpy from string constant into a series of stores of constant values. llvm-svn: 26219	2006-02-15 21:59:04 +00:00
Evan Cheng	e2038bdeee	Lower memcpy with small constant size operand into a series of load / store ops. llvm-svn: 26195	2006-02-15 01:54:51 +00:00
Evan Cheng	0451499b3c	Doh again! llvm-svn: 26188	2006-02-14 23:05:54 +00:00
Evan Cheng	db2a7a736a	Keep to < 80 cols llvm-svn: 26177	2006-02-14 20:12:38 +00:00
Evan Cheng	038521ef76	Missed a break so memcpy cases fell through to memset. Doh. llvm-svn: 26176	2006-02-14 19:45:56 +00:00
Evan Cheng	d502610604	Fixed a build breakage. llvm-svn: 26175	2006-02-14 09:11:59 +00:00
Evan Cheng	4b40a42653	Rename maxStoresPerMemSet to maxStoresPerMemset, etc. llvm-svn: 26174	2006-02-14 08:38:30 +00:00
Evan Cheng	81fcea8aa2	Expand memset dst, c, size to a series of stores if size falls below the target specific threshold, e.g. 16 for x86. llvm-svn: 26171	2006-02-14 08:22:34 +00:00
Chris Lattner	1784a9d267	now that libcalls don't suck, we can remove this hack llvm-svn: 26164	2006-02-14 05:39:35 +00:00
Jim Laskey	390c63e9d9	Rename to better reflect usage (current and planned.) llvm-svn: 26145	2006-02-13 12:50:39 +00:00
Jim Laskey	5995d0160c	Reorg for integration with gcc4. Old style debug info will not be passed through to SelIDAG. llvm-svn: 26115	2006-02-11 01:01:30 +00:00
Evan Cheng	f9adce90bf	Get rid of some memory leaks identified by Valgrind llvm-svn: 25960	2006-02-04 06:49:00 +00:00
Chris Lattner	3b48431333	Add initial support for immediates. This allows us to compile this: int %rlwnm(int %A, int %B) { %C = call int asm "rlwnm $0, $1, $2, $3, $4", "=r,r,r,n,n"(int %A, int %B, int 4, int 17) ret int %C } into: _rlwnm: or r2, r3, r3 or r3, r4, r4 rlwnm r2, r2, r3, 4, 17 ;; note the immediates :) or r3, r2, r2 blr llvm-svn: 25955	2006-02-04 02:26:14 +00:00
Chris Lattner	65ad53feb3	Initial early support for non-register operands, like immediates llvm-svn: 25952	2006-02-04 02:16:44 +00:00
Chris Lattner	f68fd20286	remove some #ifdef'd out code, which should properly be in the dag combiner anyway. llvm-svn: 25941	2006-02-03 20:13:59 +00:00
Chris Lattner	7f5880b1c7	Implement matching constraints. We can now say things like this: %C = call int asm "xyz $0, $1, $2, $3", "=r,r,r,0"(int %A, int %B, int 4) and get: xyz r2, r3, r4, r2 note that the r2's are pinned together. Yaay for 2-address instructions. llvm-svn: 25893	2006-02-02 00:25:23 +00:00
Chris Lattner	1558fc64f9	Implement simple register assignment for inline asms. This allows us to compile: int %test(int %A, int %B) { %C = call int asm "xyz $0, $1, $2", "=r,r,r"(int %A, int %B) ret int %C } into: (0x8906130, LLVM BB @0x8902220): %r2 = OR4 %r3, %r3 %r3 = OR4 %r4, %r4 INLINEASM <es:xyz $0, $1, $2>, %r2<def>, %r2, %r3 %r3 = OR4 %r2, %r2 BLR which asmprints as: _test: or r2, r3, r3 or r3, r4, r4 xyz $0, $1, $2 ;; need to print the operands now :) or r3, r2, r2 blr llvm-svn: 25878	2006-02-01 18:59:47 +00:00
Chris Lattner	3a5ed55187	adjust to changes in InlineAsm interface. Fix a few minor bugs. llvm-svn: 25865	2006-02-01 01:28:23 +00:00
Chris Lattner	2e56e89452	Handle physreg input/outputs. We now compile this: int %test_cpuid(int %op) { %B = alloca int %C = alloca int %D = alloca int %A = call int asm "cpuid", "=eax,==ebx,==ecx,==edx,eax"(int* %B, int* %C, int* %D, int %op) %Bv = load int* %B %Cv = load int* %C %Dv = load int* %D %x = add int %A, %Bv %y = add int %x, %Cv %z = add int %y, %Dv ret int %z } to this: _test_cpuid: sub %ESP, 16 mov DWORD PTR [%ESP], %EBX mov %EAX, DWORD PTR [%ESP + 20] cpuid mov DWORD PTR [%ESP + 8], %ECX mov DWORD PTR [%ESP + 12], %EBX mov DWORD PTR [%ESP + 4], %EDX mov %ECX, DWORD PTR [%ESP + 12] add %EAX, %ECX mov %ECX, DWORD PTR [%ESP + 8] add %EAX, %ECX mov %ECX, DWORD PTR [%ESP + 4] add %EAX, %ECX mov %EBX, DWORD PTR [%ESP] add %ESP, 16 ret ... note the proper register allocation. :) it is unclear to me why the loads aren't folded into the adds. llvm-svn: 25827	2006-01-31 02:03:41 +00:00
Chris Lattner	98ed05c81d	remove method I just added llvm-svn: 25728	2006-01-28 03:43:09 +00:00
Chris Lattner	43b867dd3b	add a new callback llvm-svn: 25727	2006-01-28 03:37:03 +00:00
Nate Begeman	595ec734fc	Implement Promote for VAARG, and allow it to be custom promoted for people who don't want the default behavior (Alpha). llvm-svn: 25726	2006-01-28 03:14:31 +00:00
Nate Begeman	8c47c3a3b1	Remove TLI.LowerReturnTo, and just let targets custom lower ISD::RET for the same functionality. This addresses another piece of bug 680. Next, on to fixing Alpha VAARG, which I broke last time. llvm-svn: 25696	2006-01-27 21:09:22 +00:00
Chris Lattner	476e67be14	initial selectiondag support for new INLINEASM node. Note that inline asms with outputs or inputs are not supported yet. :) llvm-svn: 25664	2006-01-26 22:24:51 +00:00
Nate Begeman	e74795cd70	First part of bug 680: Remove TLI.LowerVA* and replace it with SDNodes that are lowered the same way as everything else. llvm-svn: 25606	2006-01-25 18:21:52 +00:00
Evan Cheng	a6eff8a432	If scheduler choice is the default (-sched=default), use target scheduling preference to determine which scheduler to use. SchedulingForLatency == Breadth first; SchedulingForRegPressure == bottom up register reduction list scheduler. llvm-svn: 25599	2006-01-25 09:12:57 +00:00
Jim Laskey	b8566fa10a	Typo. llvm-svn: 25545	2006-01-23 13:34:04 +00:00
Evan Cheng	31272347d4	Skeleton of the list schedule. llvm-svn: 25544	2006-01-23 08:26:10 +00:00
Evan Cheng	c1e1d9724d	Factor out more instruction scheduler code to the base class. llvm-svn: 25532	2006-01-23 07:01:07 +00:00
Chris Lattner	deda32a786	Fix bugs lowering stackrestore, fixing 2004-08-12-InlinerAndAllocas.c on PPC. llvm-svn: 25522	2006-01-23 05:22:07 +00:00
Chris Lattner	e23928c67f	Fix a bug in a recent refactor that caused a bunch of programs to miscompile or the compiler to crash. llvm-svn: 25503	2006-01-21 19:12:11 +00:00
Evan Cheng	739a6a456e	Do some code refactoring on Jim's scheduler in preparation of the new list scheduler. llvm-svn: 25493	2006-01-21 02:32:06 +00:00
Chris Lattner	222ceabbee	If the target doesn't support f32 natively, insert the FP_EXTEND in target-indep code, so that the LowerReturn code doesn't have to handle it. llvm-svn: 25482	2006-01-20 18:38:32 +00:00
Chris Lattner	e2ee190821	Temporary work around for a libcall insertion bug: If a target doesn't support FSIN/FCOS nodes, do not lower sin/cos to them. llvm-svn: 25425	2006-01-18 21:50:14 +00:00
Robert Bocchino	03e95af9f7	Support for the insertelement operation. llvm-svn: 25405	2006-01-17 20:06:42 +00:00
Reid Spencer	b4f9a6f110	For PR411: This patch is an incremental step towards supporting a flat symbol table. It de-overloads the intrinsic functions by providing type-specific intrinsics and arranging for automatically upgrading from the old overloaded name to the new non-overloaded name. Specifically: llvm.isunordered -> llvm.isunordered.f32, llvm.isunordered.f64 llvm.sqrt -> llvm.sqrt.f32, llvm.sqrt.f64 llvm.ctpop -> llvm.ctpop.i8, llvm.ctpop.i16, llvm.ctpop.i32, llvm.ctpop.i64 llvm.ctlz -> llvm.ctlz.i8, llvm.ctlz.i16, llvm.ctlz.i32, llvm.ctlz.i64 llvm.cttz -> llvm.cttz.i8, llvm.cttz.i16, llvm.cttz.i32, llvm.cttz.i64 New code should not use the overloaded intrinsic names. Warnings will be emitted if they are used. llvm-svn: 25366	2006-01-16 21:12:35 +00:00
Nate Begeman	542c3c17a9	Remove some duplicated code llvm-svn: 25313	2006-01-14 03:18:27 +00:00
Nate Begeman	2fba8a3aaa	bswap implementation llvm-svn: 25312	2006-01-14 03:14:10 +00:00
Chris Lattner	b32664583b	Compile llvm.stacksave/restore into STACKSAVE/STACKRESTORE nodes, and allow targets to custom expand them as they desire. llvm-svn: 25273	2006-01-13 02:50:02 +00:00
Chris Lattner	6c9c250dcd	Add "support" for stacksave/stackrestore to the dag isel llvm-svn: 25268	2006-01-13 02:24:42 +00:00
Robert Bocchino	2c966e7617	Added selection DAG support for the extractelement operation. llvm-svn: 25179	2006-01-10 19:04:57 +00:00
Jim Laskey	219d559824	Applied some recommended changes from sabre. The dominant one beginning "let the pass manager do its thing." Fixes crash when compiling -g files and suppresses dwarf statements if no debug info is present. llvm-svn: 25100	2006-01-04 22:28:25 +00:00
Chris Lattner	44c07ed61a	enable the gep isel opt llvm-svn: 24910	2005-12-21 19:36:36 +00:00
Chris Lattner	803a575616	Lower ConstantAggregateZero into zeros llvm-svn: 24890	2005-12-21 02:43:26 +00:00
Jim Laskey	7c462768ed	Added source file/line correspondence for dwarf (PowerPC only at this point.) llvm-svn: 24748	2005-12-16 22:45:29 +00:00
Chris Lattner	5d4e61dd87	Don't lump the filename and working dir together llvm-svn: 24697	2005-12-13 17:40:33 +00:00
Chris Lattner	9e8b633ec1	Accept and ignore prefetches for now llvm-svn: 24678	2005-12-12 22:51:16 +00:00
Chris Lattner	f1a54c0d14	Minor tweak to get isel opt llvm-svn: 24663	2005-12-11 09:05:13 +00:00
Chris Lattner	be73d6eece	improve code insertion in two ways: 1. Only forward subst offsets into loads and stores, not into arbitrary things, where it will likely become a load. 2. If the source is a cast from pointer, forward subst the cast as well, allowing us to fold the cast away (improving cases when the cast is from an alloca or global). This hasn't been fully tested, but does appear to further reduce register pressure and improve code. Let's let the testers grind on it a bit. :) llvm-svn: 24640	2005-12-08 08:00:12 +00:00
Nate Begeman	ae89d862f5	Fix a crash where ConstantVec nodes were being generated with the wrong type when the target did not support them. Also teach Legalize how to expand ConstantVecs. This allows us to generate _test: lwz r2, 12(r3) lwz r4, 8(r3) lwz r5, 4(r3) lwz r6, 0(r3) addi r2, r2, 4 addi r4, r4, 3 addi r5, r5, 2 addi r6, r6, 1 stw r2, 12(r3) stw r4, 8(r3) stw r5, 4(r3) stw r6, 0(r3) blr For: void %test(%v4i %P) { %T = load %v4i %P %S = add %v4i %T, <int 1, int 2, int 3, int 4> store %v4i %S, %v4i * %P ret void } On PowerPC. llvm-svn: 24633	2005-12-07 19:48:11 +00:00
Nate Begeman	41b1cdc771	Teach the SelectionDAG ISel how to turn ConstantPacked values into constant nodes with vector types. Also teach the asm printer how to print ConstantPacked constant pool entries. This allows us to generate altivec code such as the following, which adds a vector constant to a packed float. LCPI1_0: <4 x float> < float 0.0e+0, float 0.0e+0, float 0.0e+0, float 1.0e+0 > .space 4 .space 4 .space 4 .long 1065353216 ; float 1 .text .align 4 .globl _foo _foo: lis r2, ha16(LCPI1_0) la r2, lo16(LCPI1_0)(r2) li r4, 0 lvx v0, r4, r2 lvx v1, r4, r3 vaddfp v0, v1, v0 stvx v0, r4, r3 blr For the llvm code: void %foo(<4 x float> * %a) { entry: %tmp1 = load <4 x float> * %a; %tmp2 = add <4 x float> %tmp1, < float 0.0, float 0.0, float 0.0, float 1.0 > store <4 x float> %tmp2, <4 x float> *%a ret void } llvm-svn: 24616	2005-12-06 06:18:55 +00:00
Chris Lattner	3539778883	Fix the #1 code quality problem that I have seen on X86 (and it also affects PPC and other targets). In particular, consider code like this: struct Vector3 { double x, y, z; }; struct Matrix3 { Vector3 a, b, c; }; double dot(Vector3 &a, Vector3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; } Vector3 mul(Vector3 &a, Matrix3 &b) { Vector3 r; r.x = dot( a, b.a ); r.y = dot( a, b.b ); r.z = dot( a, b.c ); return r; } void transform(Matrix3 &m, Vector3 *x, int n) { for (int i = 0; i < n; i++) x[i] = mul( x[i], m ); } we compile transform to a loop with all of the GEP instructions for indexing into 'm' pulled out of the loop (9 of them). Because isel occurs a bb at a time we are unable to fold the constant index into the loads in the loop, leading to PPC code that looks like this: LBB3_1: ; no_exit.preheader li r2, 0 addi r6, r3, 64 ;; 9 values live across the loop body! addi r7, r3, 56 addi r8, r3, 48 addi r9, r3, 40 addi r10, r3, 32 addi r11, r3, 24 addi r12, r3, 16 addi r30, r3, 8 LBB3_2: ; no_exit lfd f0, 0(r30) lfd f1, 8(r4) fmul f0, f1, f0 lfd f2, 0(r3) ;; no constant indices folded into the loads! lfd f3, 0(r4) lfd f4, 0(r10) lfd f5, 0(r6) lfd f6, 0(r7) lfd f7, 0(r8) lfd f8, 0(r9) lfd f9, 0(r11) lfd f10, 0(r12) lfd f11, 16(r4) fmadd f0, f3, f2, f0 fmul f2, f1, f4 fmadd f0, f11, f10, f0 fmadd f2, f3, f9, f2 fmul f1, f1, f6 stfd f0, 0(r4) fmadd f0, f11, f8, f2 fmadd f1, f3, f7, f1 stfd f0, 8(r4) fmadd f0, f11, f5, f1 addi r29, r4, 24 stfd f0, 16(r4) addi r2, r2, 1 cmpw cr0, r2, r5 or r4, r29, r29 bne cr0, LBB3_2 ; no_exit uh, yuck. With this patch, we now sink the constant offsets into the loop, producing this code: LBB3_1: ; no_exit.preheader li r2, 0 LBB3_2: ; no_exit lfd f0, 8(r3) lfd f1, 8(r4) fmul f0, f1, f0 lfd f2, 0(r3) lfd f3, 0(r4) lfd f4, 32(r3) ;; much nicer. lfd f5, 64(r3) lfd f6, 56(r3) lfd f7, 48(r3) lfd f8, 40(r3) lfd f9, 24(r3) lfd f10, 16(r3) lfd f11, 16(r4) fmadd f0, f3, f2, f0 fmul f2, f1, f4 fmadd f0, f11, f10, f0 fmadd f2, f3, f9, f2 fmul f1, f1, f6 stfd f0, 0(r4) fmadd f0, f11, f8, f2 fmadd f1, f3, f7, f1 stfd f0, 8(r4) fmadd f0, f11, f5, f1 addi r6, r4, 24 stfd f0, 16(r4) addi r2, r2, 1 cmpw cr0, r2, r5 or r4, r6, r6 bne cr0, LBB3_2 ; no_exit This is much nicer as it reduces register pressure in the loop a lot. On X86, this takes the function from having 9 spilled registers to 2. This should help some spec programs on X86 (gzip?) This is currently only enabled with -enable-gep-isel-opt to allow perf testing tonight. llvm-svn: 24606	2005-12-05 07:10:48 +00:00

259 Commits