llvm-project

Author	SHA1	Message	Date
Gabor Greif	5d8f7e0cc7	eliminate warning llvm-svn: 42892	2007-10-12 07:44:54 +00:00
Chris Lattner	d8675e4915	Fix some 80 column violations. Fix DecomposeSimpleLinearExpr to handle simple constants better. Don't nuke gep(bitcast(allocation)) if the bitcast(allocation) will fold the allocation. This fixes PR1728 and Instcombine/malloc3.ll llvm-svn: 42891	2007-10-12 05:30:59 +00:00
Devang Patel	899cc56612	Lower memcpy if it makes sense. llvm-svn: 42864	2007-10-11 17:21:57 +00:00
Devang Patel	2af23f976b	Do not walk invalid iterator. llvm-svn: 42812	2007-10-09 21:31:36 +00:00
Devang Patel	a69f987b66	Fix bug in updating dominance frontier after loop unswitch when frontier includes basic blocks that are not inside loop. llvm-svn: 42654	2007-10-05 22:29:34 +00:00
Devang Patel	3574759d85	Fix 80 col violation. llvm-svn: 42591	2007-10-03 21:17:43 +00:00
Devang Patel	e192e32577	Refactor code in a separate method. llvm-svn: 42590	2007-10-03 21:16:08 +00:00
Dan Gohman	c731c97fac	Use empty() member functions when that's what's being tested for instead of comparing begin() and end(). llvm-svn: 42585	2007-10-03 19:26:29 +00:00
Dale Johannesen	9d559cfff5	Tone down an overzealous optimization. llvm-svn: 42582	2007-10-03 17:45:27 +00:00
Tanya Lattner	30f65fe4a7	Fix PR1719, by not marking llvm.global.annotations internal. llvm-svn: 42578	2007-10-03 17:05:40 +00:00
Chris Lattner	d66e0cd6c0	Fix PR1719, by not marking llvm.noinline internal. llvm-svn: 42565	2007-10-03 03:59:15 +00:00
Dale Johannesen	b6c05b1f90	Fix stride computations for long double arrays. llvm-svn: 42508	2007-10-01 23:08:35 +00:00
Devang Patel	2a60ff1aeb	Relax unsafe use check. If there is one unconditional use inside the loop then it is safe to promote value even if there is another conditional use inside the loop. llvm-svn: 42493	2007-10-01 18:12:58 +00:00
Dale Johannesen	6bf69ed3cc	minor long double related changes llvm-svn: 42439	2007-09-28 18:06:58 +00:00
Dale Johannesen	1d1d0e7735	Don't do SRA for unions with long double fields. Fixes a SWB crash. llvm-svn: 42422	2007-09-28 00:21:38 +00:00
Devang Patel	7bba386f72	Handle multiple induction variables. This fixes PR714. llvm-svn: 42309	2007-09-25 18:24:48 +00:00
Devang Patel	440d13b55b	Do not reserve DOM check for GetElementPtrInst. llvm-svn: 42306	2007-09-25 17:55:50 +00:00
Devang Patel	5e1651d270	doh.. llvm-svn: 42300	2007-09-25 17:43:08 +00:00
Devang Patel	87d7e8ebcb	Add transformation to update loop iteration space. Now, for (i=A; i<N; i++) { if (i < X && i > Y) do_something(); } is transformed into U=min(N,X); L=max(A,Y); for (i=L;i<U;i++) do_something(); llvm-svn: 42299	2007-09-25 17:31:19 +00:00
Devang Patel	9e30e1a3be	Do not promote null values because it may be unsafe to do so. llvm-svn: 42270	2007-09-24 20:02:42 +00:00
Dan Gohman	75470c3bf1	explicit keywords. llvm-svn: 42262	2007-09-24 15:48:49 +00:00
Devang Patel	361e52f39c	Fix PR1692 llvm-svn: 42209	2007-09-21 21:18:19 +00:00
Owen Anderson	46da2a6262	Add partial caching of non-local memory dependence queries. This provides a modest speedup for GVN. llvm-svn: 42185	2007-09-21 03:53:52 +00:00
Devang Patel	83cc3f8f51	Update aux. info associated with an instruction before erasing instruction. llvm-svn: 42180	2007-09-20 23:45:50 +00:00
Devang Patel	6117a3b696	Don't increment invalid iterator. llvm-svn: 42178	2007-09-20 23:01:50 +00:00
Nick Lewycky	eae7e7d00b	Fix optimization. %x = sub %x, %y does not imply that %y is zero. llvm-svn: 42157	2007-09-20 00:48:36 +00:00
Devang Patel	464276f831	Avoid unsafe promotion. llvm-svn: 42149	2007-09-19 20:18:51 +00:00
Duncan Sands	d31649bc59	Improve comment. llvm-svn: 42132	2007-09-19 10:25:38 +00:00
Duncan Sands	56df7dec2b	A global variable with external weak linkage can be null, while an alias could alias such a global variable. llvm-svn: 42130	2007-09-19 10:10:31 +00:00
Devang Patel	69a55a38ed	Relax loop ExitCondition predicate restriction. llvm-svn: 42122	2007-09-19 00:28:47 +00:00
Devang Patel	455a53b7db	Filter loops where split condition's false branch is not empty. For example for (int i = 0; i < N; ++i) { if (i == somevalue) dosomething(); else dosomethingelse(); } llvm-svn: 42121	2007-09-19 00:15:16 +00:00
Devang Patel	4c238c451f	Bail out early, before modifying anything. llvm-svn: 42120	2007-09-19 00:11:01 +00:00
Devang Patel	31f2c8592c	Work is incomplete. Loop is not modified at all right now. llvm-svn: 42119	2007-09-19 00:08:13 +00:00
Devang Patel	fcda998ab2	Fix PR1657 llvm-svn: 42075	2007-09-18 01:54:42 +00:00
Devang Patel	267c07b51f	Do not eliminate loop when it is invalid to do so. For example, for(int i = 0; i < N; i++) { if (i == XYZ) { A; } else { B; } C; D; } llvm-svn: 42058	2007-09-17 21:01:05 +00:00
Devang Patel	712dbe9d13	Skeleton for transformations to truncate loop's iteration space. llvm-svn: 42054	2007-09-17 20:39:48 +00:00
Devang Patel	9d1af9b63d	Fix comment. llvm-svn: 42048	2007-09-17 20:07:40 +00:00
Chris Lattner	0625bd6472	Merge DenseMapKeyInfo & DenseMapValueInfo into DenseMapInfo Add a new DenseMapInfo::isEqual method to allow clients to redefine the equality predicate used when probing the hash table. llvm-svn: 42042	2007-09-17 18:34:04 +00:00
Dan Gohman	2ac2652779	Instcombine x-((x/y)*y) into a remainder operator. llvm-svn: 42035	2007-09-17 17:31:57 +00:00
Duncan Sands	6d5da71288	Factor the trampoline transformation into a subroutine. llvm-svn: 42021	2007-09-17 10:26:40 +00:00
Owen Anderson	4cd516b50b	Be more careful when constant-folding PHI nodes. llvm-svn: 41998	2007-09-16 08:04:16 +00:00
Owen Anderson	8d0cb881e5	Remove RLE. It is subsumed by GVN. llvm-svn: 41968	2007-09-14 22:33:52 +00:00
Dale Johannesen	98d3a08d8f	Remove the assumption that FP's are either float or double from some of the many places in the optimizers it appears, and do something reasonable with x86 long double. Make APInt::dump() public, remove newline, use it to dump ConstantSDNode's. Allow APFloats in FoldingSet. Expand X86 backend handling of long doubles (conversions to/from int, mostly). llvm-svn: 41967	2007-09-14 22:26:36 +00:00
Chris Lattner	5d13fb538f	Fix a logic error in ValueIsOnlyUsedLocallyOrStoredToOneGlobal that caused miscompilation of 188.ammp. Reject select and bitcast in ValueIsOnlyUsedLocallyOrStoredToOneGlobal because RewriteHeapSROALoadUser can't handle it. llvm-svn: 41950	2007-09-14 03:41:21 +00:00
Chris Lattner	d9111b88d1	silence a bogus gcc warning. llvm-svn: 41949	2007-09-14 03:07:24 +00:00
Bill Wendling	264d4813c7	Temporarily reverting r41817 (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070910/053370.html). It's causing SPASS to fail. llvm-svn: 41938	2007-09-14 01:13:55 +00:00
Chris Lattner	011f91b5b2	Teach GlobalLoadUsesSimpleEnoughForHeapSRA and the SROA rewriter how to handle a limited form of PHI nodes. This finally fixes PR1639, speeding 179.art up from 7.84s to 3.13s on PPC. llvm-svn: 41933	2007-09-13 21:31:36 +00:00
Chris Lattner	ba98f89388	be tolerant of PHI nodes when rewriting heap SROA code. This is a step along the way of PR1639 llvm-svn: 41930	2007-09-13 18:00:31 +00:00
Chris Lattner	f315d4f1a7	refactor some code, no functionality change. On the path to PR1639 llvm-svn: 41929	2007-09-13 17:29:05 +00:00
Chris Lattner	6eed0e7366	Make ValueIsOnlyUsedLocallyOrStoredToOneGlobal smart enough to see through bitcasts and phis. This is a step to fixing PR1639. llvm-svn: 41928	2007-09-13 16:37:20 +00:00
Chris Lattner	2d2892ee6e	Make AllUsesOfLoadedValueWillTrapIfNull strong enough to see through PHI nodes. This is the first step of the fix for PR1639. llvm-svn: 41927	2007-09-13 16:30:19 +00:00
Chris Lattner	7b412cb823	Change llvm.gcroot to not init the root to null at runtime, this prevents using it for live-in values etc. llvm-svn: 41879	2007-09-12 17:53:10 +00:00
Duncan Sands	9204663bcb	Turn calls to trampolines into calls to the underlying nested function. llvm-svn: 41844	2007-09-11 14:35:41 +00:00
Devang Patel	7ed6eb8992	Avoid negative logic. llvm-svn: 41829	2007-09-11 01:10:45 +00:00
Devang Patel	8c95373ced	Refactor code into a separate method. llvm-svn: 41826	2007-09-11 00:42:56 +00:00
Devang Patel	d67479b6ee	Clear split info object. llvm-svn: 41823	2007-09-11 00:23:56 +00:00
Devang Patel	a28a7f1b2d	Split condition does not have to be ICmpInst in all cases. llvm-svn: 41822	2007-09-11 00:12:56 +00:00
Devang Patel	f4202e91f8	Check all terminators inside loop. llvm-svn: 41821	2007-09-10 23:57:58 +00:00
Chris Lattner	e804567cd8	remove some dead code, this is handled by constant folding. llvm-svn: 41819	2007-09-10 23:46:29 +00:00
Devang Patel	2181b8e86a	Swap exit condition operands if it works. llvm-svn: 41817	2007-09-10 23:34:06 +00:00
Chris Lattner	c75cbe6473	Prevent tailcallelim from breaking "recursive" calls to builtins. llvm-svn: 41804	2007-09-10 20:58:55 +00:00
Devang Patel	f8ab0a9acc	Filter exit conditions which are not yet handled. llvm-svn: 41800	2007-09-10 18:33:42 +00:00
Devang Patel	d7409fdce5	Require SCEV before LCSSA. llvm-svn: 41798	2007-09-10 18:08:23 +00:00
Chris Lattner	85a51e0060	Don't zap back to back volatile load/stores llvm-svn: 41759	2007-09-07 05:33:03 +00:00
Dale Johannesen	bed9dc423c	Next round of APFloat changes. Use APFloat in UpgradeParser and AsmParser. Change all references to ConstantFP to use the APFloat interface rather than double. Remove the ConstantFP double interfaces. Use APFloat functions for constant folding arithmetic and comparisons. (There are still way too many places APFloat is just a wrapper around host float/double, but we're getting there.) llvm-svn: 41747	2007-09-06 18:13:44 +00:00
Nick Lewycky	0c5c47944a	Use isTrueWhenEqual. Thanks Chris! llvm-svn: 41741	2007-09-06 02:40:25 +00:00
Nick Lewycky	b0b066eaaa	When the two operands of an icmp are equal, there are five possible predicates that would make the icmp true. Fixes PR1637. llvm-svn: 41740	2007-09-06 01:10:22 +00:00
Chuck Rose III	2320323647	Forgot to obey 80 column rule. Fixing that. llvm-svn: 41725	2007-09-05 20:36:41 +00:00
Chuck Rose III	e58572233d	Added default parameters to GetElementPtrInstr constructor call. Visual Studio 2k5 was getting confused and was unable to compile it. Suspected compiler error. llvm-svn: 41721	2007-09-05 16:54:38 +00:00
Devang Patel	f6ef552f3d	Insert cloned loop basic blocks before original loop header. llvm-svn: 41713	2007-09-04 20:46:35 +00:00
David Greene	c656cbb8c2	Update GEP constructors to use an iterator interface to fix GLIBCXX_DEBUG issues. llvm-svn: 41697	2007-09-04 15:46:09 +00:00
Anton Korobeynikov	35322d745c	Silence warning while compiling with gcc 4.2 llvm-svn: 41676	2007-09-02 22:11:14 +00:00
Evan Cheng	ffac17a223	Fix a gcroot lowering bug. llvm-svn: 41668	2007-09-01 02:00:51 +00:00
Chris Lattner	0e258b8518	Cut off crazy computation. This helps PR1622 slightly. llvm-svn: 41522	2007-08-28 04:23:55 +00:00
Devang Patel	d2456a171d	Use simpler test to filter loops. llvm-svn: 41516	2007-08-27 21:34:31 +00:00
David Greene	703623d571	Update InvokeInst to work like CallInst llvm-svn: 41506	2007-08-27 19:04:21 +00:00
Dan Gohman	71eaf62e5f	Change comments to refer to @malloc and @free instead of %malloc and %free. llvm-svn: 41488	2007-08-27 16:11:48 +00:00
Anton Korobeynikov	24fb6b2f8c	Don't promote volatile loads/stores. This is needed (for example) to handle setjmp/longjmp properly. This fixes PR1520. llvm-svn: 41461	2007-08-26 21:43:30 +00:00
Owen Anderson	2b9ec7ff33	Don't DSe volatile stores. llvm-svn: 41456	2007-08-26 21:14:47 +00:00
Devang Patel	6114751544	Move exit condition and exit branch from exiting block into loop header and dominator info. This avoids execution of dead iteration. Loop is already filtered in the beginning such that this change is safe. llvm-svn: 41394	2007-08-25 02:39:24 +00:00
Devang Patel	c1ef32ef3d	Constant split values need upper bound and lower bound check, just like any other split value. llvm-svn: 41389	2007-08-25 01:09:14 +00:00
Devang Patel	4e63e1f5b5	While calculating upper loop bound for first loop and lower loop bound for second loop, take care of edge cases. llvm-svn: 41387	2007-08-25 00:56:38 +00:00
Devang Patel	f5a01bf025	Fix regression that I caused yesterday night while adding logic to select appropriate split condition branch. llvm-svn: 41365	2007-08-24 19:32:26 +00:00
Devang Patel	4bc9298f2a	It is not safe to execute split condition's true branch first all the time. If split condition predicate is GT or GE then execute false branch first. llvm-svn: 41358	2007-08-24 06:17:19 +00:00
Devang Patel	4be56a5d12	Reject ICMP_NE as index split condition. llvm-svn: 41357	2007-08-24 06:02:25 +00:00
Devang Patel	5e46fac6de	Tighten up loop filter. llvm-svn: 41356	2007-08-24 05:36:56 +00:00
Devang Patel	504dc0aaed	Remove incomplete cost analysis. llvm-svn: 41354	2007-08-24 05:21:13 +00:00
Chris Lattner	b0f158cfdf	rename APInt::toString -> toStringUnsigned for symmetry with toStringSigned() Add an APSInt::toString() method. llvm-svn: 41309	2007-08-23 05:15:32 +00:00
Devang Patel	887db2d832	Remove dead code. llvm-svn: 41295	2007-08-22 21:07:41 +00:00
Devang Patel	6f4f23320d	Fix typo. llvm-svn: 41292	2007-08-22 20:55:18 +00:00
Devang Patel	31206b56d5	Cosmetic change. "True Loop" and "False Loop" naming terminology to refer to the two loops after loop cloning is confusing. Instead just use A_Loop and B_Loop. llvm-svn: 41287	2007-08-22 19:33:29 +00:00
Devang Patel	90da534987	Refactor loop condition check in a separate function. llvm-svn: 41282	2007-08-22 18:27:01 +00:00
Devang Patel	cd8beb7645	Fix thinko. Starting value of second loop's induction variable can not be lower than starting value of original loop's induction variable. llvm-svn: 41280	2007-08-22 18:07:47 +00:00
Devang Patel	a12000d572	Rename bunch of variables. llvm-svn: 41250	2007-08-21 21:12:02 +00:00
Devang Patel	f98db5e62a	Preserve LCSSA. llvm-svn: 41246	2007-08-21 19:47:46 +00:00
Devang Patel	b5933bbbd5	Use SmallVector instead of std::vector. llvm-svn: 41207	2007-08-21 00:31:24 +00:00
Devang Patel	8f4228d619	s/ExitBlock/ExitingBlock/g llvm-svn: 41204	2007-08-20 23:51:18 +00:00
Devang Patel	49c4f9a889	Replace induction variable with split value in loop body. This fixes art miscompile. llvm-svn: 41195	2007-08-20 20:49:01 +00:00
Devang Patel	c2e2d15f45	Do not split loops rejected by processOneIterationLoop(). llvm-svn: 41194	2007-08-20 20:24:15 +00:00
Nick Lewycky	bfa9499a88	Oops, remove assert that wasn't meant to be committed. llvm-svn: 41170	2007-08-18 23:21:28 +00:00
Nick Lewycky	5b5b1ab9e0	Never insert duplicate edges. llvm-svn: 41169	2007-08-18 23:18:03 +00:00
Nick Lewycky	a0d49dac26	Typo. llvm-svn: 41168	2007-08-18 15:08:56 +00:00
Devang Patel	1282b6e181	Avoid splitting loops where two split condition branches are not independent. llvm-svn: 41148	2007-08-18 00:00:32 +00:00
Devang Patel	d1fcfcc76c	When one branch of the condition is eliminated, the head of the other branch is not necessarily the immediate dominator of the merge block in all cases. llvm-svn: 41144	2007-08-17 21:59:16 +00:00
Owen Anderson	f5023a7a84	Factor out some code into a helper function. llvm-svn: 41131	2007-08-16 22:51:56 +00:00
Owen Anderson	221a43604e	Add some more comments to GVN. llvm-svn: 41129	2007-08-16 22:02:55 +00:00
Devang Patel	3640e78057	Dominance frontier is now required. llvm-svn: 41096	2007-08-15 03:34:53 +00:00
Devang Patel	b81bcbde09	Cleanup removeBlocks. Use dominance frontier to fixup incoming edges of successor blocks not dominated by DeadBB. Use df_iterator to walk and delete basic blocks dominated by DeadBB. llvm-svn: 41095	2007-08-15 03:31:47 +00:00
Reid Spencer	0db035567c	Remove unneeded header file. llvm-svn: 41094	2007-08-15 03:01:04 +00:00
Devang Patel	f55b79fa71	Avoid triangle loops. llvm-svn: 41093	2007-08-15 02:14:55 +00:00
Devang Patel	22c7993ecf	Break infinite loop. llvm-svn: 41091	2007-08-14 23:59:17 +00:00
Devang Patel	7cad917160	Avoid nested loops at the moment. llvm-svn: 41090	2007-08-14 23:53:57 +00:00
Devang Patel	33ba97d747	Fix dominance frontier update while removing blocks. llvm-svn: 41082	2007-08-14 18:35:57 +00:00
Owen Anderson	bc271a02fd	Eliminate PHI nodes with constant values during normal GVN processing, even when they're not related to eliminating a load. llvm-svn: 41081	2007-08-14 18:33:27 +00:00
Owen Anderson	398602a6eb	Be more aggressive in pruning unnecessary PHI nodes when doing PHI construction. llvm-svn: 41080	2007-08-14 18:16:29 +00:00
Owen Anderson	676070d503	Make GVN iterative. llvm-svn: 41078	2007-08-14 18:04:11 +00:00
Owen Anderson	a7b220f23a	Fix a case where GVN was failing to return true when it had, in fact, modified the function. llvm-svn: 41077	2007-08-14 17:59:48 +00:00
Devang Patel	dbe8497d45	Handle last value assignments. llvm-svn: 41063	2007-08-14 01:30:57 +00:00
Devang Patel	f74ccbb4e8	StartValue is already calculated. llvm-svn: 41062	2007-08-14 00:15:45 +00:00
Devang Patel	948653915f	Preserve simple analysis. llvm-svn: 41054	2007-08-13 22:22:13 +00:00
Devang Patel	b8a41bb4f1	Preserve dominator info. llvm-svn: 41053	2007-08-13 22:13:24 +00:00
Devang Patel	da48cf40db	If NewBB dominates DestBB then DestBB is not part of NewBB's dominance frontier. llvm-svn: 41051	2007-08-13 21:59:17 +00:00
Devang Patel	f258578206	Split loops and do CFG cleanup. llvm-svn: 41029	2007-08-12 07:02:51 +00:00
Reid Spencer	9f90f965de	Remove unused variables. llvm-svn: 41028	2007-08-12 04:45:36 +00:00
Chris Lattner	99c8ee2977	Transform a load from an undef/zero global into an undef/global even if we have complex pointer manipulation going on. This allows us to compile stuff like this: __m128i foo(__m128i x){ static const unsigned int c_0[4] = { 0, 0, 0, 0 }; __m128i v_Zero = _mm_loadu_si128((__m128i*)c_0); x = _mm_unpacklo_epi8(x, v_Zero); return x; } into: _foo: xorps %xmm1, %xmm1 punpcklbw %xmm1, %xmm0 ret llvm-svn: 41022	2007-08-11 18:48:48 +00:00
Devang Patel	f417c2cc34	Clone loop. llvm-svn: 40998	2007-08-10 18:07:13 +00:00
Devang Patel	aa36a43908	Add utility to clone loops. llvm-svn: 40997	2007-08-10 17:59:47 +00:00
Devang Patel	9a4761464f	Remove unnecessary duplication. llvm-svn: 40979	2007-08-10 00:59:03 +00:00
Devang Patel	7bdf4531bb	Calculate exit and start value of true loop and false loop respectively. llvm-svn: 40978	2007-08-10 00:53:35 +00:00
Devang Patel	67af6cd7ea	ExitCondition and Induction variable are loop constraints not split condition constraints. llvm-svn: 40977	2007-08-10 00:33:50 +00:00
Chris Lattner	a8e4b4bc7b	when we see an unaligned load from an insufficiently aligned global or alloca, increase the alignment of the load, turning it into an aligned load. This allows us to compile: #include <xmmintrin.h> __m128i foo(__m128i x){ static const unsigned int c_0[4] = { 0, 0, 0, 0 }; __m128i v_Zero = _mm_loadu_si128((__m128i*)c_0); x = _mm_unpacklo_epi8(x, v_Zero); return x; } into: _foo: punpcklbw _c_0.5944, %xmm0 ret .data .lcomm _c_0.5944,16,4 # c_0.5944 instead of: _foo: movdqu _c_0.5944, %xmm1 punpcklbw %xmm1, %xmm0 ret .data .lcomm _c_0.5944,16,2 # c_0.5944 llvm-svn: 40971	2007-08-09 19:05:49 +00:00
Owen Anderson	9b1cc8cac0	Make NonLocal and None const in the right way. :-) llvm-svn: 40961	2007-08-09 04:42:44 +00:00
Devang Patel	42e3e5bec1	Traverse loop blocks' terminators to find split candidates. llvm-svn: 40960	2007-08-09 01:39:01 +00:00
Devang Patel	0183c797c4	Add cost analysis. llvm-svn: 40952	2007-08-08 22:25:28 +00:00
Devang Patel	0e34ee25ab	Preserve dom info while processing one iteration loop. llvm-svn: 40947	2007-08-08 21:39:47 +00:00
Owen Anderson	b84d3b1c92	Change the None and NonLocal markers in memdep to be const. llvm-svn: 40946	2007-08-08 21:39:39 +00:00
Devang Patel	8abc5c82b7	Clear split info. llvm-svn: 40944	2007-08-08 21:18:27 +00:00
Devang Patel	593bf9ceb3	Handle multiple split conditions. llvm-svn: 40941	2007-08-08 21:02:17 +00:00
Owen Anderson	680862880d	Global values also don't undead-ify pointers in our dead alloca's set. llvm-svn: 40936	2007-08-08 19:12:31 +00:00
Owen Anderson	ddf4aee543	Make handleEndBlock significantly faster with one trivial improvement, and one hack to avoid hitting a bad case when the alias analysis is imprecise. llvm-svn: 40935	2007-08-08 18:38:28 +00:00
Owen Anderson	50df9685b0	Small improvement: if a function doesn't access memory, we don't need to scan it for potentially undeading pointers. llvm-svn: 40933	2007-08-08 17:58:56 +00:00
Owen Anderson	52aaabf74d	Add some comments, remove a dead argument, and simplify some control flow. No functionality change. llvm-svn: 40932	2007-08-08 17:50:09 +00:00
Owen Anderson	b17ab03081	A few more small cleanups. llvm-svn: 40922	2007-08-08 06:06:02 +00:00
Owen Anderson	0aecf0ebef	First round of cleanups from Chris' feedback. llvm-svn: 40919	2007-08-08 04:52:29 +00:00
Devang Patel	68de1ae816	Embrace patch review feedback. llvm-svn: 40915	2007-08-08 01:51:27 +00:00
Devang Patel	c7e53bdcfd	Fix new compare instruction's signedness. Caught by Chris during review. llvm-svn: 40912	2007-08-07 23:17:52 +00:00
Owen Anderson	0cc1a76283	Don't insert nearly as many redundant phi nodes. llvm-svn: 40909	2007-08-07 23:12:31 +00:00
Devang Patel	19211b6528	Use eraseFromParent(). llvm-svn: 40903	2007-08-07 17:45:35 +00:00
David Greene	bacdbaa0da	Fix comment typo llvm-svn: 40898	2007-08-07 16:52:03 +00:00
David Greene	816a190cdf	Fix GLIBCXX_DEBUG error triggered by incrementing erased iterator. llvm-svn: 40897	2007-08-07 16:44:38 +00:00
Devang Patel	c70106cb30	Begin loop index split pass. llvm-svn: 40883	2007-08-07 00:25:56 +00:00
Nick Lewycky	8052019a20	It's safe to fold not of fcmp. llvm-svn: 40870	2007-08-06 20:04:16 +00:00
David Greene	77b2accbca	Make this code more efficient. llvm-svn: 40861	2007-08-06 15:09:17 +00:00
Chris Lattner	c7ba225705	remove some dead lines llvm-svn: 40859	2007-08-06 06:21:06 +00:00
Reid Spencer	d959cfc882	Silence some warnings from doxygen about @param argument name not matching the actual argument name of the documented function. llvm-svn: 40851	2007-08-05 19:35:22 +00:00
Chris Lattner	f0da7975ea	at the end of instcombine, explicitly clear WorklistMap. This shrinks it down to something small. On the testcase from PR1432, this speeds up instcombine from 0.7959s to 0.5000s, (59%) llvm-svn: 40840	2007-08-05 08:47:58 +00:00
Chris Lattner	edce70d2fe	rewrite the code used to construct pruned SSA form with the IDF method. In the old way, we computed and inserted phi nodes for the whole IDF of the definitions of the alloca, then computed which ones were dead and removed them. In the new method, we first compute the region where the value is live, and use that information to only insert phi nodes that are live. This eliminates the need to compute liveness later, and stops the algorithm from inserting a bunch of phis which it then later removes. This speeds up the testcase in PR1432 from 2.00s to 0.15s (14x) in a release build and 6.84s->0.50s (14x) in a debug build. llvm-svn: 40825	2007-08-04 22:50:14 +00:00
Chris Lattner	d91576b01e	Factor out a whole bunch of code into its own method. llvm-svn: 40824	2007-08-04 21:14:29 +00:00
Chris Lattner	4e1b4140eb	Use getNumPreds(BB) instead of computing them manually. This is a very small but measurable speedup. llvm-svn: 40823	2007-08-04 21:06:15 +00:00
Chris Lattner	b6a4ba808b	Change the rename pass to be "tail recursive", only adding N-1 successors to the worklist, and handling the last one with a 'tail call'. This speeds up PR1432 from 2.0578s to 2.0012s (2.8%) llvm-svn: 40822	2007-08-04 20:40:27 +00:00
Chris Lattner	840259c8d3	cache computation of #preds for a BB. This speeds up mem2reg from 2.0742->2.0522s on PR1432. llvm-svn: 40821	2007-08-04 20:24:50 +00:00
Chris Lattner	050bac4bed	reserve operand space for phi nodes when we insert them. llvm-svn: 40820	2007-08-04 20:14:34 +00:00
Chris Lattner	9318785df5	use continue to avoid nesting, no functionality change. llvm-svn: 40819	2007-08-04 20:07:06 +00:00
Chris Lattner	6b04ecbaf9	Promoting allocas with the 'single store' fastpath is faster than with the 'local to a block' fastpath. This speeds up PR1432 from 2.1232 to 2.0686s (2.6%) llvm-svn: 40818	2007-08-04 20:03:23 +00:00
Chris Lattner	4a930f9444	When PromoteLocallyUsedAllocas promoted allocas, it didn't remember to increment NumLocalPromoted, and didn't actually delete the dead alloca, leading to an extra iteration of mem2reg. llvm-svn: 40817	2007-08-04 20:01:43 +00:00
Chris Lattner	63c039780c	std::map -> DenseMap llvm-svn: 40816	2007-08-04 19:52:20 +00:00
Nick Lewycky	20f0811fc0	Clean up comments, fix up some confusing code logic. Predsimplify fails llvm-gcc bootstrap. llvm-svn: 40815	2007-08-04 18:45:32 +00:00
Chris Lattner	7d382f7680	fix a logic bug where we wouldn't promote single store allocas if the stored value was a non-instruction value. Doh. This increases the # single store allocas from 8982 to 9026, and speeds up mem2reg on the testcase in PR1432 from 2.17 to 2.13s. llvm-svn: 40813	2007-08-04 02:45:02 +00:00
Chris Lattner	1b215f0661	When we do the single-store optimization, delete both the store and the alloca so they don't get reprocessed. This speeds up PR1432 from 2.20s to 2.17s. llvm-svn: 40812	2007-08-04 02:38:38 +00:00
Chris Lattner	862f125457	Three improvements: 1. Check for revisiting a block before checking domination, which is faster. 2. If the stored value isn't an instruction, we don't have to check for domination. 3. If we have a value used in the same block more than once, make sure to remove the block from the UsingBlocks vector. Not doing so forces us to go through the slow path for the alloca. The combination of these improvements increases the number of allocas on the fastpath from 8935 to 8982 on PR1432. This speeds it up from 2.90s to 2.20s (31%) llvm-svn: 40811	2007-08-04 02:32:22 +00:00
Chris Lattner	ae1e00eb36	switch from using a std::set to using a SmallPtrSet. This speeds up the testcase in PR1432 from 6.33s to 2.90s (2.22x) llvm-svn: 40810	2007-08-04 02:21:22 +00:00
Chris Lattner	9181801bb7	In mem2reg, when handling the single-store case, make sure to remove a using block from the list if we handle it. Not doing this caused us to not be able to promote (with the fast path) allocas which have uses (whoops). This increases the # allocas hitting this fastpath from 4042 to 8935 on the testcase in PR1432, speeding up mem2reg by 2.6x llvm-svn: 40809	2007-08-04 02:15:24 +00:00
Chandler Carruth	7132e00de7	This is the patch to provide clean intrinsic function overloading support in LLVM. It cleans up the intrinsic definitions and generally smooths the process for more complicated intrinsic writing. It will be used by the upcoming atomic intrinsics as well as vector and float intrinsics in the future. This also changes the syntax for llvm.bswap, llvm.part.set, llvm.part.select, and llvm.ct* intrinsics. They are automatically upgraded by both the LLVM ASM reader and the bitcode reader. The test cases have been updated, with special tests added to ensure the automatic upgrading is supported. llvm-svn: 40807	2007-08-04 01:51:18 +00:00
Chris Lattner	886a41a007	split rewriting of single-store allocas into its own method. llvm-svn: 40806	2007-08-04 01:47:41 +00:00
Chris Lattner	3cede09c67	refactor some code to shrink PromoteMem2Reg::run a bit llvm-svn: 40805	2007-08-04 01:41:18 +00:00
Chris Lattner	d524537fe9	add a typedef, no other change. llvm-svn: 40804	2007-08-04 01:19:38 +00:00
Chris Lattner	df138be527	avoid an unneeded vector copy. This speeds up mem2reg on the testcase in PR1432 by 6% llvm-svn: 40803	2007-08-04 01:07:49 +00:00
Chris Lattner	fd838f0770	make RenamePassWorkList a local var instead of an ivar. llvm-svn: 40802	2007-08-04 01:04:40 +00:00
Owen Anderson	2d19aae4ca	Fix a subtle miscompilation. This allows 197.parser to be compiled correctly. llvm-svn: 40791	2007-08-03 19:59:35 +00:00
Owen Anderson	774761c503	Fix a subtle iterator invalidation bug in a recursive algorithm. llvm-svn: 40776	2007-08-03 11:03:26 +00:00
Chris Lattner	1f70816c73	Fix an accidental commit. llvm-svn: 40758	2007-08-02 21:33:36 +00:00
Owen Anderson	a8ba659976	Fix 80 col. violations. llvm-svn: 40751	2007-08-02 18:20:52 +00:00
Owen Anderson	9699a6ea03	Fix 80 col. violations. llvm-svn: 40750	2007-08-02 18:16:06 +00:00
Owen Anderson	e3590584b9	Fix 80 col. violations. llvm-svn: 40749	2007-08-02 18:11:11 +00:00
Owen Anderson	0ac1fc8ac1	Fix a bug that was causing several miscompilations on SPEC. llvm-svn: 40746	2007-08-02 17:56:05 +00:00
Chris Lattner	dc2cf228ce	Replacing a cast with another one does not reduce the number of casts in the input. llvm-svn: 40741	2007-08-02 17:23:38 +00:00
Chris Lattner	222b214be7	Disable an xform that causes an infinite loop. This fixes PR1594 llvm-svn: 40739	2007-08-02 16:56:32 +00:00
Chris Lattner	2740694450	wrap some long lines. Major offenders that are left include gvn, gvnpre, dse, and predsimplify. To see these, use: make check-line-length llvm-svn: 40738	2007-08-02 16:53:43 +00:00
Devang Patel	a882328e61	Update dominator info for the middle blocks created while splitting exit edge to preserve LCSSA. Fix dominance frontier update during loop unswitch. This fixes PR 1589, again llvm-svn: 40737	2007-08-02 15:25:57 +00:00
Chris Lattner	b0418fc607	Enhance instcombine to be more aggressive about folding casts of operations of casts. This implements InstCombine/zext-fold.ll llvm-svn: 40726	2007-08-02 06:11:14 +00:00
Chris Lattner	d7cb625a9e	Fix PR1575 and test/Transforms/CondProp/2007-08-01-InvalidRead.ll llvm-svn: 40720	2007-08-02 04:47:05 +00:00
Devang Patel	34890b2f27	Undo previous check-in. llvm-svn: 40698	2007-08-01 23:24:50 +00:00
Devang Patel	561b0c29a3	Update dominator info for the middle blocks created while splitting exit edge to preserve LCSSA. Fix dominance frontier update during loop unswitch. This fixes PR 1589. llvm-svn: 40695	2007-08-01 22:23:50 +00:00
Owen Anderson	c321e5e272	Make non-local memdep not be recursive, and fix a bug on 403.gcc that this exposed. llvm-svn: 40692	2007-08-01 22:01:54 +00:00
Dan Gohman	34d442f274	More explicit keywords. llvm-svn: 40673	2007-08-01 15:32:29 +00:00
Owen Anderson	10e52eddb3	Rename FastDSE to just DSE. llvm-svn: 40668	2007-08-01 06:36:51 +00:00
Owen Anderson	e4a374812b	Move FastDSE in to DeadStoreElimination. llvm-svn: 40667	2007-08-01 06:30:51 +00:00
Owen Anderson	4894e6d8bc	Remove old DSE. llvm-svn: 40666	2007-08-01 06:30:10 +00:00
David Greene	17a5dfe6f7	New CallInst interface to address GLIBCXX_DEBUG errors caused by indexing an empty std::vector. Updates to all clients. llvm-svn: 40660	2007-08-01 03:43:44 +00:00
Owen Anderson	10ffa860d8	Don't let the memory allocator outsmart GVN. ;-) llvm-svn: 40655	2007-07-31 23:27:13 +00:00
Owen Anderson	2464f4f048	Fix a failure I accidentally caused in my last commit by mishandling the removal of redundant phis. llvm-svn: 40650	2007-07-31 20:18:28 +00:00
Lauro Ramos Venancio	549e775e67	Fix a bug in GetKnownAlignment of packed structs. llvm-svn: 40649	2007-07-31 20:13:21 +00:00
Owen Anderson	d58fa6b09f	Fix a misoptimization in aha. llvm-svn: 40642	2007-07-31 17:43:14 +00:00
Dan Gohman	8c4da37b1f	Use SCEVExpander::InsertCastOfTo instead of calling new IntToPtrInst directly, because the insert point used by the SCEVExpander may vary from what LSR originally computes. llvm-svn: 40641	2007-07-31 17:22:27 +00:00
Devang Patel	d8b1ceb5b4	Add note. llvm-svn: 40638	2007-07-31 16:52:25 +00:00
Devang Patel	d491198000	Loop unswitch preserves dom info. Use simple analysis interface to preserve analysis info maintained by other loop passes. llvm-svn: 40627	2007-07-31 08:03:26 +00:00
Devang Patel	b98a097ae9	Implement Simple Analysis interfaces - cloneBasicBlockAnalysis and deleteAnalysisValue. llvm-svn: 40626	2007-07-31 08:01:41 +00:00
Devang Patel	7d165e1d84	If loop can be unswitched again, then do it yourself. llvm-svn: 40609	2007-07-30 23:07:10 +00:00
Owen Anderson	850138157e	Avoid potential iterator invalidation problems. llvm-svn: 40607	2007-07-30 21:26:39 +00:00
Devang Patel	14fae50666	Remove dead code. llvm-svn: 40606	2007-07-30 21:10:44 +00:00
Devang Patel	c5e340eded	LCSSA preserves dom info. llvm-svn: 40604	2007-07-30 20:23:45 +00:00
Devang Patel	698852561c	Loop Rotation pass preserves dominator tree and frontier. llvm-svn: 40603	2007-07-30 20:22:53 +00:00
Devang Patel	bb97ac4dce	LICM preserves scalar evolution and dom frontier. llvm-svn: 40602	2007-07-30 20:19:59 +00:00
Reid Spencer	dff9d69cfb	Fix a typo/thinko. llvm-svn: 40599	2007-07-30 19:53:57 +00:00
Owen Anderson	212d5c27f6	Use more caching when computing non-local dependence. This makes bzip2 not use up the entire 32-bit address space. llvm-svn: 40596	2007-07-30 17:29:24 +00:00
Owen Anderson	d66e285b2e	Fix a bug caused by indiscriminately asking for the dominators of a predecessor. llvm-svn: 40595	2007-07-30 16:57:08 +00:00
Devang Patel	e3206cb425	Use SmallPtrSet. llvm-svn: 40560	2007-07-27 18:34:27 +00:00
Chuck Rose III	1a39a2d13d	VStudio compiler errors and placing Function->ExFunc map under ManagedStatic control. This commit fixes two things. One is a pair of VStudio compiler errors stemming from variables which were defined within the for loop statement and also within the body of the for loop. I fixed these by renaming one of the two variables. Additionally, I've made the Function->ExFunc map in ExternalFunctions.cpp a ManagedStatic object, so that cleanup will be done on llvm_shutdown. In repeated uses of the interpreter, where the same Function* address may get used for completely different functions, this was causing a crash. llvm-svn: 40558	2007-07-27 18:26:35 +00:00
Devang Patel	a51e0a3d8d	Fix thinko. Update return status appropriately. llvm-svn: 40546	2007-07-26 20:21:42 +00:00
Owen Anderson	dbf23ccaa0	Fix a couple more bugs in the phi construction by pulling in code that does almost the same things from LCSSA. llvm-svn: 40540	2007-07-26 18:26:51 +00:00
Dan Gohman	6e853bc73f	Move the GET_SIDE_EFFECT_INFO logic from isInstructionTriviallyDead to Instruction::mayWriteToMemory, fixing a FIXME, and helping various places that call mayWriteToMemory directly. llvm-svn: 40533	2007-07-26 16:06:08 +00:00
Dan Gohman	eb47d9213c	Remove a bogus return statement, what appears to have been a pasto from Relation::contradicts in Relation::incorporate. llvm-svn: 40531	2007-07-26 15:29:35 +00:00
Owen Anderson	3b8cc30a61	Fix what is _hopefully_ the last corner case for loops. llvm-svn: 40503	2007-07-25 23:54:42 +00:00
Owen Anderson	8707412593	My last commit was not correct for nested loops. Fix it, and add a testcase for it. llvm-svn: 40498	2007-07-25 22:19:40 +00:00
Owen Anderson	3c67004d47	Fix an infinite loop on 300.twolf. llvm-svn: 40497	2007-07-25 22:03:06 +00:00
Owen Anderson	7bf26ee444	Fix a bug that was causing GVN to crash on 252.eon. llvm-svn: 40494	2007-07-25 21:13:41 +00:00
Owen Anderson	5e5599b7ce	Add basic support for performing whole-function RLE. Note: This has not yet been thoroughly tested. Use at your own risk. llvm-svn: 40489	2007-07-25 19:57:03 +00:00
Devang Patel	33227115b9	Add BasicInliner interface. This interface allows clients to inline a bunch of functions with module level call graph information. llvm-svn: 40486	2007-07-25 18:00:25 +00:00
Owen Anderson	ab6ec2eac2	Add a GVN pass, using the value numbering code I developed for GVNPRE and the load elimination code from RedundantLoadElimination. llvm-svn: 40469	2007-07-24 17:55:58 +00:00
Owen Anderson	9baaaa52e6	Rename a lot of things to change FastDLE to RedundantLoadElimination. llvm-svn: 40457	2007-07-24 00:17:04 +00:00
Owen Anderson	7292a4a93f	Rename FastDLE as RedundantLoadElimination. llvm-svn: 40456	2007-07-24 00:08:38 +00:00
Owen Anderson	5e68f0c93d	Don't delete volatile loads. Doing so is not safe. llvm-svn: 40448	2007-07-23 22:05:54 +00:00
Owen Anderson	6aba721425	Add FastDLE, the load-elimination counterpart of FastDSE. llvm-svn: 40445	2007-07-23 21:48:08 +00:00
Owen Anderson	5a201baba9	Fix file header. llvm-svn: 40440	2007-07-23 18:30:37 +00:00
Chris Lattner	4512cd2cab	completely remove a transformation that is unsafe in the face of undefs. llvm-svn: 40439	2007-07-23 17:10:17 +00:00
Devang Patel	5e39293e62	Apply temporary work around to fix llvm mis-compilation reported in PR 1556. llvm-svn: 40133	2007-07-21 00:34:29 +00:00
Chris Lattner	d82e4a19cc	this xform is already done by the constant folder. llvm-svn: 40124	2007-07-20 22:06:41 +00:00
Dan Gohman	e31a61eeca	Optimize alignment of loads and stores. llvm-svn: 40102	2007-07-20 16:34:21 +00:00
Duncan Sands	2be91fcdd8	Place SCCPSolver also in the anonymous namespace. This pacifies g++-4.2. llvm-svn: 40089	2007-07-20 08:56:21 +00:00
Owen Anderson	5bd6c3f2c4	Fix a bug where we were marking GEP expressions with the wrong opcode. llvm-svn: 40085	2007-07-20 08:19:20 +00:00
Owen Anderson	f9e6542969	Make val_replace fail early, which reduces the time to optimize 403.gcc to 14.8s. llvm-svn: 40064	2007-07-19 19:57:13 +00:00
Devang Patel	a273d1cd3a	Verify loop info. llvm-svn: 40062	2007-07-19 18:02:32 +00:00
Owen Anderson	6aa17f1def	Use SmallVector and DenseMap in even more places. With this, the time to optimize 403.gcc is down to 15.1s. llvm-svn: 40042	2007-07-19 06:37:56 +00:00
Owen Anderson	75a244d6eb	Change ValueTable to use a DenseMap for mapping expressions to value numbers. This results in a slight speedup for 403.gcc. llvm-svn: 40040	2007-07-19 06:13:15 +00:00
Owen Anderson	6a4ff8549b	Move some sets and maps to SmallPtrSet and DenseMap respectively. This reduces the time to optimize 403.gcc from 17.6s to 16.4s. llvm-svn: 40036	2007-07-19 03:32:44 +00:00
Devang Patel	186e0d8b0a	After a basic block is split into two parts, second part dominates all the blocks dominated by original basic block. And first part dominates second part. llvm-svn: 40035	2007-07-19 02:29:24 +00:00
Devang Patel	de5901523c	Now this temp. fix is not required. llvm-svn: 40034	2007-07-19 02:22:21 +00:00
Devang Patel	8a1d1ac925	Fix typo. llvm-svn: 40025	2007-07-18 23:50:19 +00:00
Devang Patel	bb8ea8cefc	Fix dominator info update to accommodate CFG changes. This fixes PR1559. llvm-svn: 40024	2007-07-18 23:48:20 +00:00
Owen Anderson	09f86993bd	Take advantage of undefined behavior if the source program tries to GEP beyond the end of an alloca to make FastDSE faster and more aggressive. llvm-svn: 39945	2007-07-16 23:34:39 +00:00
Owen Anderson	7fcaaadf1c	Add support for walking up memory def chains, which enables finding many more dead stores on 400.perlbench. llvm-svn: 39929	2007-07-16 21:52:50 +00:00
Reid Spencer	3363f4ad96	Return Undef if the block has no dominator. This was required to allow llvm-gcc build to succeed. Without this change it fails in libstdc++ compilation. This causes no regressions in dejagnu tests. However, someone who knows this code better might want to review it. llvm-svn: 39924	2007-07-16 21:03:44 +00:00
Dan Gohman	06c60b6032	Fix comments about vectors to use the current wording. llvm-svn: 39921	2007-07-16 14:29:03 +00:00
Chris Lattner	640fd5124d	Repair a regression in Transforms/InstCombine/mul.ll that Reid noticed. llvm-svn: 39896	2007-07-16 04:15:34 +00:00
Nick Lewycky	b7c0c8a350	Start adding and cleaning up comments. llvm-svn: 39894	2007-07-16 02:58:37 +00:00
Chris Lattner	d4fef8dbca	Implement shift-simplify.ll:test[45]. First teach instcombine that sign bit checks only demand the sign bit, this allows simplify demanded bits to hack on expressions better. Second, teach instcombine that ashr is useless if only the sign bit is demanded. llvm-svn: 39880	2007-07-15 20:54:51 +00:00
Chris Lattner	06205d5567	Implement shift-simplify.ll:test3, turning: (X << 31) <s 0 --> (X&1) != 0 This happens dozens of times in the CFE. llvm-svn: 39879	2007-07-15 20:42:37 +00:00
Nick Lewycky	39519f5c41	Use maximal intersection algorithm exclusively. Fixes miscompile bug. llvm-svn: 39852	2007-07-14 04:28:04 +00:00
Devang Patel	4cd1413f15	Make LCSSA a loop pass. llvm-svn: 39844	2007-07-13 23:57:11 +00:00
Owen Anderson	d975efab16	Handle GEPs with all-zero indices in the same way we handle pointer-pointer bitcasts. Also, fix a potential infinite loop. This brings FastDSE to parity with old DSE on 175.vpr. llvm-svn: 39839	2007-07-13 22:50:48 +00:00
Devang Patel	29ccf8ba52	Disable claims to preserve analysis until open issues are resolved. llvm-svn: 39834	2007-07-13 21:53:42 +00:00
Owen Anderson	9c9ef21432	Be more aggressive in removing dead stores, and in removing instructions trivially dead after DSE. This drastically improves the effect of FastDSE on kimwitu++. llvm-svn: 39819	2007-07-13 18:26:26 +00:00
Owen Anderson	32c4a05dd4	Reimplement removing stores to allocas at the end of a function. This should be safe now. llvm-svn: 39790	2007-07-12 21:41:30 +00:00
Owen Anderson	d4451dee1e	Make the condition-checking for free with non-trivial dependencies more correct. llvm-svn: 39789	2007-07-12 18:08:51 +00:00
Owen Anderson	5e06995b3d	Remove the end-block handling code. It was unsafe, and making it safe would have resulted in falling back to the slow DSE case. I need to think some more about the right way to handle this. llvm-svn: 39788	2007-07-12 17:52:20 +00:00
Gabor Greif	b8bca52c7d	checked in as obvious, thanks Benoit Boissinot! llvm-svn: 39774	2007-07-12 13:31:38 +00:00
Owen Anderson	1e1bace52b	Let MemoryDependenceAnalysis take care of updating AliasAnalysis. llvm-svn: 39769	2007-07-12 00:06:21 +00:00
Devang Patel	fac4d1f014	Preserve analysis info. llvm-svn: 39767	2007-07-11 23:47:28 +00:00
Owen Anderson	aa07172340	Handle the case where an entire structure is freed, and its dependency is a store to a field within that structure. Also, refactor the runOnBasicBlock() function, splitting some of the special cases into separate functions. llvm-svn: 39762	2007-07-11 23:19:17 +00:00
Owen Anderson	1441470be8	Add support for eliminating stores to stack-allocated memory locations at the end of a function. llvm-svn: 39754	2007-07-11 21:06:56 +00:00
Owen Anderson	e720144837	Handle eliminating stores that occur right before a free. llvm-svn: 39753	2007-07-11 20:38:34 +00:00
Owen Anderson	bf971aafb6	Clean up a few things based on Chris' feedback. llvm-svn: 39747	2007-07-11 19:03:09 +00:00
Tanya Lattner	ccecbcd779	Adding ability to demote phi to stack. llvm-svn: 39744	2007-07-11 18:41:34 +00:00
Owen Anderson	5e72db3f7f	Add FastDSE, a new algorithm for doing dead store elimination. This algorithm is not as accurate as the current DSE, but it requires only a linear scan over each block, rather than a quadratic one. Eventually (once it has been improved somewhat), this will replace the current DSE. NOTE: This has not yet been extensively tested. llvm-svn: 38517	2007-07-11 00:46:18 +00:00
Owen Anderson	084d3c2e2f	Make the pass registration static. llvm-svn: 38508	2007-07-10 20:20:19 +00:00
Anton Korobeynikov	76547349c1	During module cloning copy aliases too. This fixes PR1544 llvm-svn: 38505	2007-07-10 19:07:35 +00:00
Nick Lewycky	e635cc43c6	Update the ValueRanges interface to use value numbers instead of Value*s. llvm-svn: 38483	2007-07-10 03:28:21 +00:00
Owen Anderson	4c4b238448	Move some key maps from std::map to DenseMap. This improves the time to optimize Anton's testcase from 17.5s to 15.7s. llvm-svn: 38480	2007-07-10 00:27:22 +00:00
Owen Anderson	41c2cab873	Use a cheaper test, delaying calling find_leader() until we know that it's necessary. This improves the time to optimize Anton's testcase from 21.1s to 17.6s. llvm-svn: 38479	2007-07-10 00:09:25 +00:00
Owen Anderson	7ee197ecf2	Add an assertion if find_leader fails. llvm-svn: 38477	2007-07-09 23:57:18 +00:00
Owen Anderson	effc7a7d16	Take advantage of the new fast SmallPtrSet assignment operator when propagating AVAIL_OUT sets. This reduces the time to optimize Anton's testcase from 31.2s to 21.s! llvm-svn: 38475	2007-07-09 22:29:50 +00:00
Devang Patel	e8ec7661ea	Expose struct size threshold to allow users to tweak their own setting. llvm-svn: 38472	2007-07-09 21:19:23 +00:00
Owen Anderson	56b01eb3d9	Fix a comment. llvm-svn: 38459	2007-07-09 16:43:55 +00:00
Owen Anderson	267ba45249	Improve a hotspot that was making build_sets() slower by calling lookup() too often. This improves Anton's testcase from 36s to 32s. llvm-svn: 38441	2007-07-09 07:56:55 +00:00
Owen Anderson	1c83b5d999	Start using a set representation that remembers the set of value numbers represented in the set. For the moment, this results in a slight performance decrease, but it lays the groundwork for future improvements. llvm-svn: 38439	2007-07-09 06:50:06 +00:00
Owen Anderson	8b99e0ab20	Fix an error where ANTIC_OUT was ending up with more than one expression of the same value number. This fixes an infinite loop on 444.namd. llvm-svn: 37967	2007-07-07 20:13:57 +00:00
Nick Lewycky	9b2252c6f0	Back out Devang's fix for PR1320 because it causes PR1542. llvm-svn: 37966	2007-07-07 16:23:34 +00:00
Devang Patel	12358b4827	These routines are now available as part of basic block utilities. llvm-svn: 37955	2007-07-06 22:03:47 +00:00
Devang Patel	86d0ea973d	Request DominanceFrontier in advance. llvm-svn: 37954	2007-07-06 21:43:22 +00:00
Devang Patel	3ee408264b	Preserve various analysis info. llvm-svn: 37953	2007-07-06 21:40:13 +00:00
Devang Patel	d7767cc2a7	Add SplitEdge and SplitBlock utility routines. llvm-svn: 37952	2007-07-06 21:39:20 +00:00
Owen Anderson	7d4bbc1c0c	Be more aggressive in the heuristic. This mostly exposes more opportunities for the GVN part of GVNPRE to apply. llvm-svn: 37951	2007-07-06 20:29:43 +00:00
Owen Anderson	3c3dd902ec	Achieve what the incorrect test was trying to do by simply requiring that all critical edges be split before we begin. llvm-svn: 37949	2007-07-06 18:12:36 +00:00
Owen Anderson	bcdd7ec4c9	Remove an incorrect check. llvm-svn: 37948	2007-07-06 16:52:47 +00:00
Zhou Sheng	1ee941dac4	Correct a typo. llvm-svn: 37936	2007-07-06 06:01:16 +00:00
Owen Anderson	02e9698293	Fix a bunch of issues found in a testcase from 400.perlbench. llvm-svn: 37929	2007-07-05 23:11:26 +00:00
Nick Lewycky	73dd692173	Break "variable canonicalization" out of InequalityGraph and into its own class "ValueNumbering". llvm-svn: 37881	2007-07-05 03:15:00 +00:00
Owen Anderson	ca1a184fd8	Fix another bug, this time in PREing select instructions. llvm-svn: 37878	2007-07-04 22:33:23 +00:00
Owen Anderson	cd94fc982a	Fix a typo that was killing GVNPRE of select instructions. llvm-svn: 37871	2007-07-04 18:26:18 +00:00
Owen Anderson	664e260a9c	Fix an error in phi translation of GEPs that was causing failures. llvm-svn: 37868	2007-07-04 04:51:16 +00:00
Owen Anderson	2e4b6feac2	Add support for performing GVNPRE on GEP instructions. llvm-svn: 37862	2007-07-03 23:51:19 +00:00
Owen Anderson	b9a494aea3	Add functionality to value number GEP instructions. This also provides the infrastructure that will be used for function calls. NOTE: This does not yet do any transformation of GEPs or function calls. llvm-svn: 37860	2007-07-03 22:50:56 +00:00
Owen Anderson	6b958c72bd	Make the unary operator case a bit faster, since casts are the only kind of unary operation. llvm-svn: 37857	2007-07-03 19:01:42 +00:00
Owen Anderson	59bd053fc5	Add support for performing GVNPRE on cast instructions, and add a testcase for this. llvm-svn: 37856	2007-07-03 18:37:08 +00:00
Devang Patel	0975c6d7f9	Preserve DominanceFrontier. llvm-svn: 37820	2007-06-29 23:11:49 +00:00
David Greene	1e2a12019f	Fix reference to iterator invalidated by an erase operation. Uncovered by _GLIBCXX_DEBUG. llvm-svn: 37796	2007-06-29 02:53:16 +00:00
Devang Patel	9feb7f5846	Do not filter loop if candidate branch is in loop header. llvm-svn: 37792	2007-06-29 01:39:53 +00:00
Owen Anderson	67799d4ffb	Add support for value numbering (but not actually optimizing) cast instructions. llvm-svn: 37789	2007-06-29 00:51:03 +00:00
Owen Anderson	c738f7ca42	Add a type field to expressions in preparation for performing GVNPRE on casts. llvm-svn: 37788	2007-06-29 00:40:05 +00:00
Owen Anderson	8a9fa5d081	Add support for performing GVNPRE on select instructions. This fixes test/Transforms/GVNPRE/select.ll. llvm-svn: 37783	2007-06-28 23:51:21 +00:00
Devang Patel	6ba5ad482f	- Undo previous check and allow loop unswitch for condition that is not inside loop. - Avoid loop unswitch for loop header branch. - While cloning dominators fix typo and handle self dominating blocks. llvm-svn: 37772	2007-06-28 02:05:46 +00:00
Devang Patel	3304e469f7	Update LoopUnswitch pass to preserve DominatorTree. llvm-svn: 37771	2007-06-28 00:49:00 +00:00
Devang Patel	3c723c8db7	If a condition is not inside a loop then the condition is suitable to loop unswitch candidate for the loop. llvm-svn: 37770	2007-06-28 00:44:10 +00:00
Owen Anderson	e02da55cc8	Make many sets a much more reasonable size. This decreases the time to optimize Anton's testcase from 35.5s to 34.7s. llvm-svn: 37769	2007-06-28 00:34:34 +00:00
Owen Anderson	7dae8efcf2	Use cached information that has already been computed to make clean() simpler and faster. This is a small speedup on most cases. llvm-svn: 37761	2007-06-27 17:38:29 +00:00
Owen Anderson	0eb265729a	Fold a lot of code into two cases: binary instructions and ternary instructions. This saves many lines of code duplication. No functionality change. llvm-svn: 37759	2007-06-27 17:03:03 +00:00
Zhou Sheng	8d438858c8	Fix a bug. llvm-svn: 37751	2007-06-27 09:50:26 +00:00
Owen Anderson	b6a39fcb21	Add support for performing GVNPRE on the three vector-specific operations. llvm-svn: 37745	2007-06-27 04:10:46 +00:00
Owen Anderson	5477c54aa0	1. Correct some comments and clean up some dead code. 2. When calculating ANTIC_IN, only iterate the changed blocks. For most average inputs this is a small speedup, but for cases with unusual CFGs, this can be a significant win. llvm-svn: 37742	2007-06-26 23:29:41 +00:00
Chris Lattner	ea5c4bd51c	fix Transforms/Inline/2007-06-25-WeakInline.ll by not inlining functions with weak linkage. llvm-svn: 37723	2007-06-25 21:50:09 +00:00
Owen Anderson	43ca4b48f1	Use the built-in postorder iterators rather than computing a postorder walk by hand. llvm-svn: 37721	2007-06-25 18:25:31 +00:00
Owen Anderson	191eb06352	1) Fix an issue with non-deterministic iteration order in phi_translate 2) Remove some maximal-set computing code that is no longer used. 3) Use a post-order CFG traversal to compute ANTIC_IN instead of a postdom traversal. This causes the ANTIC_IN calculation to converge much faster. Thanks to Daniel Berlin for suggesting this. With this patch, the time to optimize 403.gcc decreased from 17.5s to 7.5s, and Anton's huge testcase decreased from 62 minutes to 38 seconds. llvm-svn: 37714	2007-06-25 05:41:12 +00:00
Nick Lewycky	8735f44104	Fix value ranges. llvm-svn: 37713	2007-06-24 20:14:22 +00:00
Owen Anderson	7fb6da8e4d	Fix a silly mistake that was causing failures. llvm-svn: 37712	2007-06-24 08:42:24 +00:00
Nick Lewycky	0f986fdbfa	Remove tabs. llvm-svn: 37710	2007-06-24 04:40:16 +00:00
Nick Lewycky	26e25d340e	Remove use of ETForest. Also cleaned up issues around unreachable basic blocks, and optimizing within one basic block. llvm-svn: 37709	2007-06-24 04:36:20 +00:00
Owen Anderson	49409f6501	Rework topo_sort to eliminate some behavior that scaled terribly. This reduces the time to optimize 403.gcc from 18.2s to 17.5s, and has an even larger effect on larger testcases. llvm-svn: 37708	2007-06-22 21:31:16 +00:00
Owen Anderson	21a1131565	Perform fewer set insertions while calculating ANTIC_IN. This reduces the amount of time to optimize 403.gcc from 21.9s to 18.2s. llvm-svn: 37707	2007-06-22 18:27:04 +00:00
Owen Anderson	92c7b22e1a	Remove some code that I was using for collecting performance information that should not have been committed. llvm-svn: 37706	2007-06-22 17:04:40 +00:00
Owen Anderson	f6e21871ad	Avoid excessive calls to find_leader when calculating AVAIL_OUT. This reduces the time to optimize 403.gcc from 23.5s to 21.9s. llvm-svn: 37702	2007-06-22 03:14:03 +00:00
Owen Anderson	d50a29d613	Reserve space in vectors before topologically sorting into them. This improves the time to optimize 403.gcc from 28s to 23.5s. llvm-svn: 37699	2007-06-22 00:43:22 +00:00
Owen Anderson	28a2d449fa	Make a bunch of optimizations for compile time to GVNPRE, including smarter set unions, deferring blocks rather than computing maximal sets, and smarter use of sets. With these enhancements, the time to optimize 253.perlbmk goes from 5.3s to 2.7s. llvm-svn: 37698	2007-06-22 00:20:30 +00:00
Chris Lattner	fb032b176b	Significantly improve the documentation of the instcombine divide/compare transformation. Also, keep track of which end of the integer interval overflows occur on. This fixes Transforms/InstCombine/2007-06-21-DivCompareMiscomp.ll and rdar://5278853, a miscompilation of perl. llvm-svn: 37692	2007-06-21 18:11:19 +00:00
Owen Anderson	2ff912bf33	Change lots of sets from std::set to SmallPtrSet. This reduces the time required to optimize 253.perlbmk from 10.9s to 5.3s. llvm-svn: 37690	2007-06-21 17:57:53 +00:00
Devang Patel	d5258a23a5	Move code to update dominator information after basic block is split from LoopSimplify.cpp to Dominator.cpp llvm-svn: 37689	2007-06-21 17:23:45 +00:00
Owen Anderson	27876a3ff9	Eliminate a redundant check. This speeds up optimization of 253.perlbmk from 13.5 seconds to 10.9 seconds. llvm-svn: 37683	2007-06-21 01:59:05 +00:00
Owen Anderson	fd5683ad7a	Comment-ize the functions in GVNPRE. llvm-svn: 37681	2007-06-21 00:19:05 +00:00
Chris Lattner	3bbec59e8b	refactor a bunch of code out of visitICmpInstWithInstAndIntCst into its own routine. llvm-svn: 37679	2007-06-20 23:46:26 +00:00
Owen Anderson	06c1e585c9	Split runOnFunction into many smaller functions. This makes it easier to get accurate performance analysis of GVNPRE. llvm-svn: 37678	2007-06-20 22:10:02 +00:00
Owen Anderson	b0714bb7bb	Make GVNPRE accurately report whether it modified the function or not. llvm-svn: 37673	2007-06-20 18:30:20 +00:00
Owen Anderson	7b0fb44ca9	Get rid of an unneeded helper function. llvm-svn: 37670	2007-06-20 00:43:33 +00:00
Owen Anderson	1ad2c10215	Use a DenseMap instead of an std::map for the value numbering. This reduces the time to optimize lencod on a PPC Debug build from ~300s to ~140s. llvm-svn: 37668	2007-06-19 23:23:54 +00:00
Owen Anderson	2320d430bd	Make dependsOnInvoke much more specific in what it tests, which in turn makes it much faster to run. This reduces the time to optimize lencod with a debug build on PPC from ~450s to ~300s. llvm-svn: 37667	2007-06-19 23:07:16 +00:00
Tanya Lattner	c655839d71	Moved Inliner.h to include/llvm/Transforms/IPO/InlinerPass.h llvm-svn: 37666	2007-06-19 22:31:52 +00:00
Tanya Lattner	ab11b1c702	Inliner pass header file was moved. llvm-svn: 37665	2007-06-19 22:29:50 +00:00
Dan Gohman	32f53bbd85	Rename ScalarEvolution::deleteInstructionFromRecords to deleteValueFromRecords and loosen the types to allow it to accept Value* instead of just Instruction*, since this is what ScalarEvolution uses internally anyway. This allows more flexibility for future uses. llvm-svn: 37657	2007-06-19 14:28:31 +00:00
Owen Anderson	1370faf889	Handle constants in phi nodes properly. This fixes test/Transforms/GVNPRE/2007-06-18-ConstantInPhi.ll llvm-svn: 37655	2007-06-19 07:35:36 +00:00
Chris Lattner	09a33a4f64	silence a bogus warning Duraid ran into. llvm-svn: 37649	2007-06-19 05:43:49 +00:00
Owen Anderson	91c54950b3	Be careful to erase values from all of the appropriate sets when they're not needed anymore. This fixes a few more memory-related issues. llvm-svn: 37647	2007-06-19 05:37:32 +00:00
Owen Anderson	b9cbaed623	Remember to clear the maximal sets between functions. Thanks to Nicholas for valgrinding this. llvm-svn: 37646	2007-06-19 04:32:55 +00:00
Owen Anderson	b56fba0c5a	Refactor GVNPRE to use a much smarter method of uniquing value sets, and centralize a lot of the value numbering information. No functionality change. llvm-svn: 37645	2007-06-19 03:31:41 +00:00
Owen Anderson	dd998e1913	Cache the results of dependsOnInvoke() llvm-svn: 37622	2007-06-18 04:42:29 +00:00
Owen Anderson	f1c04e1ddb	Fix indentation. llvm-svn: 37621	2007-06-18 04:31:21 +00:00
Owen Anderson	b364b413af	Don't perform an expensive check if it's not necessary. llvm-svn: 37620	2007-06-18 04:30:44 +00:00
Owen Anderson	658f2c4881	Fix test/Transforms/GVNPRE/2007-06-15-InvokeInst.ll by ignoring all instructions that depend on invokes. llvm-svn: 37610	2007-06-16 00:26:54 +00:00
Dan Gohman	203a035251	Use SCEVConstant::get instead of SCEVUnknown::get to create an integer constant SCEV. llvm-svn: 37596	2007-06-15 18:00:55 +00:00
Owen Anderson	acaed06827	Fix test/Transforms/GVNPRE/2007-06-15-Looping.ll llvm-svn: 37595	2007-06-15 17:55:15 +00:00
Dan Gohman	cb9e09ad57	Add a SCEV class and supporting code for sign-extend expressions. This created an ambiguity for expandInTy to decide when to use sign-extension or zero-extension, but it turns out that most of its callers don't actually need a type conversion, now that LLVM types don't have explicit signedness. Drop expandInTy in favor of plain expand, and change the few places that actually need a type conversion to do it themselves. llvm-svn: 37591	2007-06-15 14:38:12 +00:00
Chris Lattner	373389260f	Generalize many transforms to work on ~ of vectors in addition to ~ of integer ops. This implements Transforms/InstCombine/and-or-not.ll test3/test4, and finishes off PR1510 llvm-svn: 37589	2007-06-15 06:23:19 +00:00
Chris Lattner	481e28b1f5	Implement two xforms: 1. ~(~X | Y) === (X & ~Y) 2. (A|B) & ~(A&B) -> A^B This allows us to transform ~(~(a|b) | (a&b)) -> a^b. This implements PR1510 for scalar values. llvm-svn: 37584	2007-06-15 05:58:24 +00:00
Chris Lattner	f14e5175ed	delete some obviously dead vector operations, which deletes a few thousand operations from Duraid's example. llvm-svn: 37582	2007-06-15 05:26:55 +00:00
Owen Anderson	4036ad485f	Fix test/Transforms/GVNPRE/2007-06-12-PhiTranslate.ll llvm-svn: 37564	2007-06-12 22:43:57 +00:00
Owen Anderson	4276984012	Refactor some code, and fix test/Transforms/GVNPRE/2007-06-12-NoExit.ll by being more careful when using post-dominator information. llvm-svn: 37556	2007-06-12 16:57:50 +00:00
Dale Johannesen	edfec0b515	Sink CmpInst's to their uses to reduce register pressure. llvm-svn: 37554	2007-06-12 16:50:17 +00:00
Owen Anderson	a75dd4dc56	Fix a few more bugs, including an instance of walking in reverse topological rather than topological order. This fixes a testcase extracted from llvm-test. llvm-svn: 37550	2007-06-12 00:50:47 +00:00
Devang Patel	78b9c68164	Add and use DominatorTreeBase::findNearestCommonDominator(). llvm-svn: 37545	2007-06-11 23:31:22 +00:00
Devang Patel	536ac4dca7	Simplify. llvm-svn: 37542	2007-06-11 21:45:31 +00:00
Devang Patel	d18054afcf	simplify llvm-svn: 37541	2007-06-11 21:25:31 +00:00
Devang Patel	ab2eee89a4	Simplify. Dominator Tree is required so always available. llvm-svn: 37540	2007-06-11 21:18:00 +00:00
Owen Anderson	d184c18074	Handle functions with multiple exit blocks properly. llvm-svn: 37539	2007-06-11 16:25:17 +00:00
Owen Anderson	223718c40e	Perform PRE of comparison operators. llvm-svn: 37536	2007-06-09 18:35:31 +00:00
Owen Anderson	7d76b2a774	Collect statistics from GVN-PRE. llvm-svn: 37530	2007-06-08 22:02:36 +00:00
Owen Anderson	b232efaf48	Fix typo in a comment. llvm-svn: 37526	2007-06-08 20:57:08 +00:00
Owen Anderson	55994f2453	Fix a bug that was causing the elimination phase not to replace values when it should be. With this patch, GVN-PRE now correctly optimizes the example from the thesis. Many thanks to Daniel Berlin for helping me find errors in this. llvm-svn: 37525	2007-06-08 20:44:02 +00:00
Owen Anderson	2e5efc30c2	Small bugfix, and const-ify some methods (Thanks, Bill). llvm-svn: 37513	2007-06-08 01:52:45 +00:00
Devang Patel	becc466451	Update LoopSimplify to require and preserve DominatorTree only. Now LoopSimplify does not require nor preserve ETForest. llvm-svn: 37512	2007-06-08 01:50:32 +00:00
Owen Anderson	be80240b29	Add partial redundancy elimination. llvm-svn: 37510	2007-06-08 01:03:01 +00:00
Devang Patel	8ecffa996a	Do not preserve ETForest. llvm-svn: 37506	2007-06-08 00:02:08 +00:00
Devang Patel	3f4c6fe7e8	Do not require ETForest. Now it is unused by LICM. llvm-svn: 37502	2007-06-07 22:21:15 +00:00
Devang Patel	cf470e5255	Do not use ETForest as well as DominatorTree. DominatorTree is sufficient. llvm-svn: 37501	2007-06-07 22:17:16 +00:00
Devang Patel	fc7fdef7d2	Use DominatorTree instead of ETForest. This allows faster immediate dominator walk. llvm-svn: 37500	2007-06-07 21:57:03 +00:00
Devang Patel	df6355ccf8	Use DominatorTree instead of ETForest. llvm-svn: 37499	2007-06-07 21:42:15 +00:00
Devang Patel	fb582f8dda	Use DominatorTree instead of ETForest. llvm-svn: 37498	2007-06-07 21:35:27 +00:00
Devang Patel	5b8a5516e4	Use DominatorTree instead of ETForest. llvm-svn: 37495	2007-06-07 18:45:06 +00:00
Devang Patel	593e766fb5	Use DominatorTree instead of ETForest. llvm-svn: 37494	2007-06-07 18:40:55 +00:00
Devang Patel	af41e4a192	Maintain ETNode as part of DomTreeNode. This adds redundancy for now. llvm-svn: 37492	2007-06-07 17:47:21 +00:00
Tanya Lattner	5801c23e05	Formatting fixes. llvm-svn: 37491	2007-06-07 17:12:16 +00:00
Tanya Lattner	cb90f1d881	Instruct the inliner to obey the noinline attribute. Add test case. llvm-svn: 37481	2007-06-06 21:59:26 +00:00
Chris Lattner	34404e3247	simplify this code and fix PR1493, now that llvm-gcc3 is dead. llvm-svn: 37478	2007-06-06 20:51:41 +00:00
Lauro Ramos Venancio	368e8872db	Fix PR1499. llvm-svn: 37472	2007-06-06 17:08:48 +00:00
Nick Lewycky	91ed6efc24	Inform ScalarEvolutions that we're deleting Values. This is the obviously correct part of the fix for PR1487. llvm-svn: 37457	2007-06-06 03:51:56 +00:00
Owen Anderson	634a063c1d	Add simple full redundancy elimination. llvm-svn: 37455	2007-06-06 01:27:49 +00:00
Chris Lattner	1b7b6e76ec	Fix PR1495 and CodeGen/X86/2007-06-05-LSR-Dominator.ll llvm-svn: 37454	2007-06-06 01:23:55 +00:00
Devang Patel	506310d3dd	Avoid non-trivial loop unswitching while optimizing for size. llvm-svn: 37446	2007-06-06 00:21:03 +00:00
Owen Anderson	ddbe430732	Fix a misunderstanding of the algorithm. Really, we should be tracking values and expressions separately. We can get around this, however, by only keeping opaque values in TMP_GEN. llvm-svn: 37443	2007-06-05 23:46:12 +00:00
Owen Anderson	c84720913a	Don't leak memory. llvm-svn: 37442	2007-06-05 22:11:49 +00:00
Owen Anderson	9b89e4b561	Fix a small bug, some 80 cols violations, and add some more debugging output. llvm-svn: 37436	2007-06-05 17:31:23 +00:00
Dan Gohman	151169df1e	Allow insertelement, extractelement, and shufflevector to be hoisted/sunk by LICM. llvm-svn: 37435	2007-06-05 16:05:55 +00:00
Bill Wendling	6357bf20fa	Patches by Chuck Rose to unbreak V Studio builds. Thanks Chuck! llvm-svn: 37428	2007-06-04 23:52:59 +00:00
Devang Patel	b3adb9876a	s/ETNode::getChildren/ETNode::getETNodeChildren/g llvm-svn: 37426	2007-06-04 23:45:02 +00:00
Owen Anderson	3c9d8eef21	Don't use std::set_difference when the two sets are sorted differently. Compute the difference manually instead. This allows GVNPRE to produce correct analysis for the example in the GVNPRE paper. llvm-svn: 37425	2007-06-04 23:34:56 +00:00
Owen Anderson	3df5299f94	Fix a bunch of small bugs, and improve the debugging output significantly. llvm-svn: 37424	2007-06-04 23:28:33 +00:00
Chris Lattner	d7897d40b6	When rebuilding constant structs, make sure to honor the isPacked bit. This fixes PR1491 and GlobalOpt/2007-06-04-PackedStruct.ll llvm-svn: 37423	2007-06-04 22:23:42 +00:00
Owen Anderson	38b6b22a41	Make phi_translate correct. llvm-svn: 37418	2007-06-04 18:05:26 +00:00
Devang Patel	ebc5b96735	s/DominatorTree::createNewNode/DominatorTree::addNewBlock/g llvm-svn: 37415	2007-06-04 16:43:25 +00:00
Devang Patel	a89566aefd	Add basic block level interface to change immediate dominator and create new node. llvm-svn: 37414	2007-06-04 16:22:33 +00:00
Devang Patel	bdd1aaef10	s/llvm::DominatorTreeBase::DomTreeNode/llvm::DomTreeNode/g llvm-svn: 37407	2007-06-04 00:32:22 +00:00
Owen Anderson	0eca9aad10	Don't use the custom comparator where it's not necessary. llvm-svn: 37406	2007-06-03 22:02:14 +00:00
Devang Patel	0e8aa7b69a	s/DominatorTreeBase::Node/DominatorTreeBase::DomTreeNode/g llvm-svn: 37403	2007-06-03 06:26:14 +00:00
Owen Anderson	46499645db	Remove an unused method. llvm-svn: 37402	2007-06-03 05:58:25 +00:00
Owen Anderson	0b68cda302	There's no need to have an Expression class... Value works just as well! This simplifies a lot of code. llvm-svn: 37401	2007-06-03 05:55:58 +00:00
Devang Patel	ac54a62fd2	Insert new instructions in AliasSet. llvm-svn: 37390	2007-06-01 22:15:31 +00:00
Owen Anderson	48e93f2ce9	clean() needs to process things in topological order. llvm-svn: 37389	2007-06-01 22:00:37 +00:00
Owen Anderson	4c89142466	Fix Expression comparison, which in turn fixes a value numbering error. llvm-svn: 37386	2007-06-01 17:34:47 +00:00
Owen Anderson	331bf6a959	Add a topological sort function. llvm-svn: 37376	2007-05-31 22:44:11 +00:00
Owen Anderson	81d156e16f	Attempt to fix up phi_translate. llvm-svn: 37366	2007-05-31 00:42:15 +00:00
Devang Patel	9b3b35d14f	Fix typo. llvm-svn: 37360	2007-05-30 15:29:37 +00:00
Chris Lattner	8767920f20	Fix Transforms/ScalarRepl/2007-05-29-MemcpyPreserve.ll and the second half of PR1421, by not decimating structs with holes that are the source and destination of a memcpy. llvm-svn: 37358	2007-05-30 06:11:23 +00:00
Owen Anderson	4b0c1859fd	Fix a typo llvm-svn: 37350	2007-05-29 23:34:14 +00:00
Owen Anderson	0c4230724c	Re-fix a bug, where I was now being too aggressive. llvm-svn: 37348	2007-05-29 23:26:30 +00:00
Owen Anderson	4a6ec8fb57	Use proper debugging facilities so other people don't have to look at my commented-out debugging lines. llvm-svn: 37347	2007-05-29 23:15:21 +00:00
Owen Anderson	f11bdc7637	Comment debug code out that I accidentally uncommented last time. llvm-svn: 37346	2007-05-29 22:43:03 +00:00
Owen Anderson	ac83a3e4ff	Add a place where I missed using the maximal set. Note that using the maximal set this way is _SLOW_. Somewhere down the line, I'll look at speeding it up. llvm-svn: 37345	2007-05-29 22:35:41 +00:00
Owen Anderson	5fba6c19b2	Very first part of a GVN-PRE implementation. It currently performs a bunch of analysis, and nothing more. It is also quite slow for the moment. However, it should give a sense of what's going on. llvm-svn: 37343	2007-05-29 21:53:49 +00:00
Chris Lattner	80c94a4a04	Fix PR1446 by not scalarrepl'ing giant structures. llvm-svn: 37326	2007-05-24 18:43:04 +00:00
Dan Gohman	30978078bf	Minor comment cleanups. llvm-svn: 37321	2007-05-24 14:36:04 +00:00
Chris Lattner	f79577d314	fix a miscompilation when passing a float through varargs llvm-svn: 37297	2007-05-23 01:17:04 +00:00
Chris Lattner	a655a157a0	Fix Transforms/InstCombine/2007-05-18-CastFoldBug.ll, a bug that devastates objc code due to the way the FE lowers objc message sends. llvm-svn: 37256	2007-05-19 06:51:32 +00:00
Chris Lattner	e8bd53c36a	Handle negative strides much more optimally. This compiles X86/lsr-negative-stride.ll into: _t: movl 8(%esp), %ecx movl 4(%esp), %eax cmpl %ecx, %eax je LBB1_3 #bb17 LBB1_1: #bb cmpl %ecx, %eax jg LBB1_4 #cond_true LBB1_2: #cond_false subl %eax, %ecx cmpl %ecx, %eax jne LBB1_1 #bb LBB1_3: #bb17 ret LBB1_4: #cond_true subl %ecx, %eax cmpl %ecx, %eax jne LBB1_1 #bb jmp LBB1_3 #bb17 instead of: _t: subl $4, %esp movl %esi, (%esp) movl 12(%esp), %ecx movl 8(%esp), %eax cmpl %ecx, %eax je LBB1_4 #bb17 LBB1_1: #bb.outer movl %ecx, %edx negl %edx LBB1_2: #bb cmpl %ecx, %eax jle LBB1_5 #cond_false LBB1_3: #cond_true addl %edx, %eax cmpl %ecx, %eax jne LBB1_2 #bb LBB1_4: #bb17 movl (%esp), %esi addl $4, %esp ret LBB1_5: #cond_false movl %ecx, %edx subl %eax, %edx movl %eax, %esi addl %esi, %esi cmpl %ecx, %esi je LBB1_4 #bb17 LBB1_6: #cond_false.bb.outer_crit_edge movl %edx, %ecx jmp LBB1_1 #bb.outer llvm-svn: 37252	2007-05-19 01:22:21 +00:00
Devang Patel	2c30a37a5c	Fix PR1431 Test case at Transformations/SCCP/2007-05-16-InvokeCrash.ll llvm-svn: 37185	2007-05-17 22:10:15 +00:00
Chris Lattner	66ad6fac2f	selects can also reach here llvm-svn: 37081	2007-05-15 06:42:04 +00:00
Chris Lattner	234f96daa8	Fix Transforms/InstCombine/2007-05-14-Crash.ll llvm-svn: 37057	2007-05-15 00:16:00 +00:00
Dan Gohman	8d40e4d965	Correct a few comments. llvm-svn: 37034	2007-05-14 14:31:17 +00:00
Chris Lattner	cea37beb52	Fix Transforms/GlobalOpt/2007-05-13-Crash.ll llvm-svn: 37020	2007-05-13 21:28:07 +00:00
Chris Lattner	1480e16596	significantly improve debug output of lsr llvm-svn: 36996	2007-05-11 22:40:34 +00:00
Dan Gohman	b5650ebd6a	Fix typos. llvm-svn: 36994	2007-05-11 21:10:54 +00:00
Dan Gohman	2980d9da45	This patch extends the LoopUnroll pass to be able to unroll loops with unknown trip counts. This is left off by default, and a command-line option enables it. It also begins to separate loop unrolling into a utility routine; eventually it might be made usable from other passes. It currently works by inserting conditional branches between each unrolled iteration, unless it proves that the trip count is a multiple of a constant integer > 1, which it currently only does in the rare case that the trip count expression is a Mul operator with a ConstantInt operand. Eventually this information might be provided by other sources, for example by a pass that peels/splits the loop for this purpose. llvm-svn: 36990	2007-05-11 20:53:41 +00:00
Chris Lattner	600db3eb96	fix regressions from my previous checking, including Transforms/InstCombine/2006-12-08-ICmp-Combining.ll llvm-svn: 36989	2007-05-11 16:58:45 +00:00
Chris Lattner	fe2b44de9f	fix Transforms/InstCombine/2007-05-10-icmp-or.ll llvm-svn: 36984	2007-05-11 05:55:56 +00:00
Devang Patel	9557247412	Fix PR1333 Testcases : http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070507/049451.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070507/049452.html llvm-svn: 36955	2007-05-09 08:24:12 +00:00
Dan Gohman	2e1f804764	Fix various whitespace inconsistencies. llvm-svn: 36936	2007-05-08 15:19:19 +00:00
Dan Gohman	49d08a57f5	Correct the comment for ApproximateLoopSize to reflect what it actually does. llvm-svn: 36935	2007-05-08 15:14:19 +00:00
Dale Johannesen	86e1dcf530	Don't generate branch to entry block. llvm-svn: 36917	2007-05-08 01:01:04 +00:00
Chris Lattner	3b6f75cb2f	Fix PR1395, by passing the ID correctly llvm-svn: 36894	2007-05-06 23:13:56 +00:00
Nick Lewycky	e7da2d6ac3	Fix typo in comment. llvm-svn: 36873	2007-05-06 13:37:16 +00:00
Chris Lattner	9b35b3e863	Fix a bug in my previous patch llvm-svn: 36857	2007-05-06 07:24:03 +00:00
Chris Lattner	5aa73fe34c	Implement Transforms/InstCombine/cast_ptr.ll llvm-svn: 36809	2007-05-05 22:41:33 +00:00
Chris Lattner	361e981415	wrap long lines llvm-svn: 36807	2007-05-05 22:32:24 +00:00
Chris Lattner	1077d2a30d	Fix Transforms/LoopUnroll/2007-05-05-UnrollMiscomp.ll and PR1385. If we have a LCSSA, only modify the input value if the inval was defined by an instruction in the loop. If defined by something before the loop, it is still valid. llvm-svn: 36784	2007-05-05 18:49:57 +00:00
Chris Lattner	57d89a5a89	make a temporary for *SI, no functionality change. llvm-svn: 36782	2007-05-05 18:36:36 +00:00
Chris Lattner	5c827bda0d	Fix InstCombine/2007-05-04-Crash.ll and PR1384 llvm-svn: 36775	2007-05-05 01:59:31 +00:00
Dan Gohman	2bcbd5b7ca	Use IntrinsicInst to test for prefetch instructions, which is ever so slightly nicer than using CallInst with an extra check; thanks Chris. llvm-svn: 36743	2007-05-04 14:59:09 +00:00
Dan Gohman	3fbb18d1b6	Allow strength reduction to make use of addressing modes for the address operand in a prefetch intrinsic. llvm-svn: 36713	2007-05-03 23:20:33 +00:00
Devang Patel	8c78a0bff0	Drop 'const' llvm-svn: 36662	2007-05-03 01:11:54 +00:00
Devang Patel	e95c6ad802	Use 'static const char' instead of 'static const int'. Due to darwin gcc bug, one version of darwin linker coalesces static const int, which defeats PassID based pass identification. llvm-svn: 36652	2007-05-02 21:39:20 +00:00
Lauro Ramos Venancio	41223586a2	Fix build error. llvm-svn: 36648	2007-05-02 20:37:47 +00:00
Devang Patel	09f162ca6a	Do not use typeinfo to identify pass in pass manager. llvm-svn: 36632	2007-05-01 21:15:47 +00:00
Anton Korobeynikov	546ea7ea88	Implement review feedback llvm-svn: 36564	2007-04-29 18:02:48 +00:00
Anton Korobeynikov	b18f8f85e9	Implement review feedback. Aliasees can be either GlobalValue's or bitcasts of them. llvm-svn: 36537	2007-04-28 13:45:00 +00:00
Chris Lattner	089e35cc57	fix a bug triggered by 403.gcc llvm-svn: 36527	2007-04-28 05:27:36 +00:00
Chris Lattner	6e880871e9	Fix several latent bugs in EmitGEPOffset that didn't manifest with its previous clients. This fixes MallocBench/gs llvm-svn: 36525	2007-04-28 04:52:43 +00:00
Chris Lattner	c753800800	uhn zap cvs llvm-svn: 36523	2007-04-28 03:50:56 +00:00
Chris Lattner	acbf6a401d	Implement PR1345 and Transforms/InstCombine/bitcast-gep.ll llvm-svn: 36521	2007-04-28 00:57:34 +00:00
Chris Lattner	1db224db92	refactor some code relating to pointer cast xforms, pulling it out of the codepath for unrelated casts. llvm-svn: 36511	2007-04-27 17:44:50 +00:00
Zhou Sheng	3178736d50	Using APInt more efficiently. llvm-svn: 36475	2007-04-26 16:42:07 +00:00
Devang Patel	d3ccc073a2	Mem2Reg does not need TargetData. llvm-svn: 36444	2007-04-25 18:32:35 +00:00
Devang Patel	073be55d8e	Remove unused function argument. llvm-svn: 36441	2007-04-25 17:15:20 +00:00
Anton Korobeynikov	a97b694c82	Implement aliases. This fixes PR1017 and its dependent bugs. CFE part will follow. llvm-svn: 36435	2007-04-25 14:27:10 +00:00
Chris Lattner	827cb98a0a	If an alloca only has two types of uses: 1) reads 2) a memcpy/memmove that copies from a constant global, then we can change the reads to read from the global instead of from the alloca. This eliminates the alloca and the memcpy, and promotes secondary optimizations (because the loads are now loads from a constant global). This is important for a common C idiom: void foo() { int A[] = {1,2,3,4,5,6,7,8,9...}; ... only reads of A ... } For some reason, people forget to mark the array static or const. This triggers on these multisource benchmarks: JM/ldecode: block_pos, [3 x [4 x [4 x i32]]] FreeBench/mason: m, [18 x i32], inlined 4 times MiBench/office-stringsearch: search_strings, [1332 x i8] MiBench/office-stringsearch: find_strings, [1333 x i8] Prolangs-C++/city: dirs, [9 x i8], inlined 4 places and these spec benchmarks: 177.mesa: message, [8 x [32 x i8]] 186.crafty: bias_rl45, [64 x i32] 186.crafty: diag_sq, [64 x i32] 186.crafty: empty, [9 x i8] 186.crafty: xlate, [15 x i8] 186.crafty: status, [13 x i8] 186.crafty: bdinfo, [25 x i8] 445.gobmk: routines, [16 x i8] 458.sjeng: piece_rep, [14 x i8*] 458.sjeng: t, [13 x i32], inlined 4 places. 464.h264ref: block8x8_idx, [3 x [4 x [4 x i32]]] 464.h264ref: block_pos, [3 x [4 x [4 x i32]]] 464.h264ref: j_off_tab, [12 x i32] This implements Transforms/ScalarRepl/memcpy-from-global.ll llvm-svn: 36429	2007-04-25 06:40:51 +00:00
Chris Lattner	31e5addb67	refactor the SROA code out into its own method, no functionality change. llvm-svn: 36426	2007-04-25 05:02:56 +00:00
Owen Anderson	510fefcd8a	Undo my previous changes. Since my approach to this problem is being revised, this approach is no longer appropriate. llvm-svn: 36421	2007-04-25 04:18:54 +00:00
Devang Patel	d3208523b2	Fix http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048376.html llvm-svn: 36417	2007-04-25 00:37:04 +00:00
Owen Anderson	c24701ed7f	Rollback some changes that adversely affected performance. I'm currently rethinking my approach to this, so hopefully I'll find a way to do this without making this slower. llvm-svn: 36392	2007-04-24 06:40:39 +00:00
Devang Patel	38bc86f057	Fix http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048333.html llvm-svn: 36380	2007-04-23 22:42:03 +00:00
Owen Anderson	64995e1b3f	Make PredicateSimplifier not use DominatorTree. llvm-svn: 36300	2007-04-21 07:38:12 +00:00
Owen Anderson	2965adb849	Fix a comment. llvm-svn: 36299	2007-04-21 07:12:44 +00:00
Jeff Cohen	5959f42498	Comment out usage of write() for now. llvm-svn: 36287	2007-04-20 22:40:10 +00:00
Devang Patel	83a3adcc3f	Avoid recursion. llvm-svn: 36272	2007-04-20 20:04:37 +00:00
Owen Anderson	2da606c757	Move more passes to using ETForest instead of DominatorTree. llvm-svn: 36271	2007-04-20 06:27:13 +00:00
Zhou Sheng	aafe4e216e	Make use of ConstantInt::isZero instead of ConstantInt::isNullValue. llvm-svn: 36261	2007-04-19 05:39:12 +00:00
Zhou Sheng	82fcf3cb5f	Make the operations of APInt variables more efficient. llvm-svn: 36260	2007-04-19 05:35:00 +00:00
Evan Cheng	db9b65d67a	Revert Owen's last check-in. This is breaking Mac OS X / PPC llvm-gcc bootstrap. llvm-svn: 36258	2007-04-18 22:39:00 +00:00
Owen Anderson	9421f03959	Revert changes that caused breakage. llvm-svn: 36255	2007-04-18 06:46:57 +00:00
Owen Anderson	9a6091dec1	Switch more uses of DominatorTree over to ETForest. llvm-svn: 36254	2007-04-18 05:43:13 +00:00
Owen Anderson	550e8db9c7	Use ETForest instead of DominatorTree. llvm-svn: 36252	2007-04-18 05:25:43 +00:00
Owen Anderson	fc40d446c9	Use ETForest instead of DominatorTree. llvm-svn: 36249	2007-04-18 04:55:33 +00:00
Owen Anderson	08293fd6d1	Use new ETForest accessor. llvm-svn: 36248	2007-04-18 04:46:35 +00:00
Owen Anderson	f38f2f2394	Use ETForest instead of DominatorTree. llvm-svn: 36247	2007-04-18 04:39:32 +00:00
Dan Gohman	2ce1116b33	Spell doFinalization right, so that it is a proper virtual override and gets called. llvm-svn: 36208	2007-04-17 18:21:36 +00:00
Chris Lattner	233f97ac6a	remove use of BasicBlock::getNext llvm-svn: 36205	2007-04-17 18:09:47 +00:00
Chris Lattner	24e2d9ca03	remove use of BasicBlock::getNext llvm-svn: 36202	2007-04-17 17:54:12 +00:00
Chris Lattner	cd9bda71a0	eliminate use of Instruction::getNext() llvm-svn: 36200	2007-04-17 17:51:03 +00:00
Chris Lattner	77a3edcb92	remove use of Instruction::getNext llvm-svn: 36199	2007-04-17 17:47:54 +00:00
Devang Patel	abdff3fecd	Fix http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070416/047888.html llvm-svn: 36182	2007-04-16 23:03:45 +00:00
Anton Korobeynikov	fb80151c42	Removed tabs everywhere except autogenerated & external files. Add make target for tabs checking. llvm-svn: 36146	2007-04-16 18:10:23 +00:00
Chris Lattner	343c88cdb9	Fix PR1335 and Transforms/Inline/2007-04-15-InlineEH.ll llvm-svn: 36090	2007-04-15 21:38:06 +00:00
Owen Anderson	f35a1dbc7a	Remove ImmediateDominator analysis. The same information can be obtained from DomTree. A lot of code for constructing ImmediateDominator is now folded into DomTree construction. This is part of the ongoing work for PR217. llvm-svn: 36063	2007-04-15 08:47:27 +00:00
Chris Lattner	f8a7bf317e	fix SimplifyLibCalls/IsDigit.ll llvm-svn: 36047	2007-04-15 05:38:40 +00:00
Chris Lattner	4a6e0cbd41	Extend store merging to support the 'if/then' version in addition to if/then/else. This sinks the two stores in this example into a single store in cond_next. In this case, it allows elimination of the load as well: store double 0.000000e+00, double* @s.3060 %tmp3 = fcmp ogt double %tmp1, 5.000000e-01 ; <i1> [#uses=1] br i1 %tmp3, label %cond_true, label %cond_next cond_true: ; preds = %entry store double 1.000000e+00, double* @s.3060 br label %cond_next cond_next: ; preds = %entry, %cond_true %tmp6 = load double* @s.3060 ; <double> [#uses=1] This implements Transforms/InstCombine/store-merge.ll:test2 llvm-svn: 36040	2007-04-15 01:02:18 +00:00
Chris Lattner	14a251b937	refactor some code, no functionality change. llvm-svn: 36037	2007-04-15 00:07:55 +00:00
Chris Lattner	28d921d04f	fix long lines llvm-svn: 36031	2007-04-14 23:32:02 +00:00
Chris Lattner	7bfdd0abe1	Implement Transforms/InstCombine/vec_extract_elt.ll, transforming: define i32 @test(float %f) { %tmp7 = insertelement <4 x float> undef, float %f, i32 0 %tmp17 = bitcast <4 x float> %tmp7 to <4 x i32> %tmp19 = extractelement <4 x i32> %tmp17, i32 0 ret i32 %tmp19 } into: define i32 @test(float %f) { %tmp19 = bitcast float %f to i32 ; <i32> [#uses=1] ret i32 %tmp19 } On PPC, this is the difference between: _test: mfspr r2, 256 oris r3, r2, 8192 mtspr 256, r3 stfs f1, -16(r1) addi r3, r1, -16 addi r4, r1, -32 lvx v2, 0, r3 stvx v2, 0, r4 lwz r3, -32(r1) mtspr 256, r2 blr and: _test: stfs f1, -4(r1) nop nop nop lwz r3, -4(r1) blr llvm-svn: 36025	2007-04-14 23:02:14 +00:00
Chris Lattner	b37fb6a0da	Implement InstCombine/vec_demanded_elts.ll:test2. This allows us to turn unsigned test(float f) { return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f*f )); } into: _test: movss 4(%esp), %xmm0 mulss %xmm0, %xmm0 movd %xmm0, %eax ret instead of: _test: movss 4(%esp), %xmm0 mulss %xmm0, %xmm0 xorps %xmm1, %xmm1 movss %xmm0, %xmm1 movd %xmm1, %eax ret GCC gets: _test: subl $28, %esp movss 32(%esp), %xmm0 mulss %xmm0, %xmm0 xorps %xmm1, %xmm1 movss %xmm0, %xmm1 movaps %xmm1, %xmm0 movd %xmm0, 12(%esp) movl 12(%esp), %eax addl $28, %esp ret llvm-svn: 36020	2007-04-14 22:29:23 +00:00
Chris Lattner	a6b5660209	avoid copying sets and vectors around. llvm-svn: 36017	2007-04-14 22:10:17 +00:00
Chris Lattner	6f58839b20	avoid iterator invalidation. llvm-svn: 36002	2007-04-14 18:06:52 +00:00
Jeff Cohen	4bd0fd367a	An even better fix. llvm-svn: 35998	2007-04-14 17:18:29 +00:00
Jeff Cohen	7233aa9369	Fix recent regression that broke several llvm-tests. llvm-svn: 35996	2007-04-14 16:55:19 +00:00
Chris Lattner	49fa8d2bff	Implement a few missing xforms: printf("foo\n") -> puts. printf("x") -> putchar printf("") -> noop. Still need to do the xforms for fprintf. This implements Transforms/SimplifyLibCalls/Printf.ll llvm-svn: 35984	2007-04-14 01:17:48 +00:00
Chris Lattner	02137eec8f	in addition to merging, constantmerge should also delete trivially dead globals, in order to clean up after simplifylibcalls. llvm-svn: 35982	2007-04-14 01:11:54 +00:00
Chris Lattner	efb33d28c6	Implement PR1201 and test/Transforms/InstCombine/malloc-free-delete.ll llvm-svn: 35981	2007-04-14 00:20:02 +00:00
Chris Lattner	164b76565b	use an accessor to simplify code. llvm-svn: 35979	2007-04-14 00:17:39 +00:00
Chris Lattner	efd3051d60	Now that codegen prepare isn't defeating me, I can finally fix what I set out to do! :) This fixes a problem where LSR would insert a bunch of code into each MBB that uses a particular subexpression (e.g. IV+base+C). The problem is that this code cannot be CSE'd back together if inserted into different blocks. This patch changes LSR to attempt to insert a single copy of this code and share it, allowing codegenprepare to duplicate the code if it can be sunk into various addressing modes. On CodeGen/ARM/lsr-code-insertion.ll, for example, this gives us code like: add r8, r0, r5 str r6, [r8, #+4] .. ble LBB1_4 @cond_next LBB1_3: @cond_true str r10, [r8, #+4] LBB1_4: @cond_next ... LBB1_5: @cond_true55 ldr r6, LCPI1_1 str r6, [r8, #+4] instead of: add r10, r0, r6 str r8, [r10, #+4] ... ble LBB1_4 @cond_next LBB1_3: @cond_true add r8, r0, r6 str r10, [r8, #+4] LBB1_4: @cond_next ... LBB1_5: @cond_true55 add r8, r0, r6 ldr r10, LCPI1_1 str r10, [r8, #+4] Besides being smaller and more efficient, this makes it immediately obvious that it is profitable to predicate LBB1_3 now :) llvm-svn: 35972	2007-04-13 20:42:26 +00:00
Chris Lattner	feee64e997	Completely rewrite addressing-mode related sinking of code. In particular, this fixes problems where codegenprepare would sink expressions into load/stores that are not valid, and fixes cases where it would miss important valid ones. This fixes several serious codesize and perf issues, particularly on targets with complex addressing modes like arm and x86. For example, now we compile CodeGen/X86/isel-sink.ll to: _test: movl 8(%esp), %eax movl 4(%esp), %ecx cmpl $1233, %eax ja LBB1_2 #F LBB1_1: #T movl $4, (%ecx,%eax,4) movl $141, %eax ret LBB1_2: #F movl (%ecx,%eax,4), %eax ret instead of: _test: movl 8(%esp), %eax leal (,%eax,4), %ecx addl 4(%esp), %ecx cmpl $1233, %eax ja LBB1_2 #F LBB1_1: #T movl $4, (%ecx) movl $141, %eax ret LBB1_2: #F movl (%ecx), %eax ret llvm-svn: 35970	2007-04-13 20:30:56 +00:00
Devang Patel	38705d5494	Remove use of SlowOperationInformer. llvm-svn: 35967	2007-04-13 18:58:18 +00:00
Devang Patel	b730fe57bf	Undo previous check-in. llvm-svn: 35966	2007-04-13 18:35:15 +00:00
Devang Patel	f929b86140	Hello uses LLVMSupport.a (SlowerOperationInformer) llvm-svn: 35965	2007-04-13 18:28:23 +00:00
Lauro Ramos Venancio	749e4668e7	Implement the "thread_local" keyword. llvm-svn: 35950	2007-04-12 18:32:50 +00:00
Reid Spencer	c78d122a6a	Build Hello by default so it can be used in test cases. llvm-svn: 35922	2007-04-11 21:03:37 +00:00
Chris Lattner	5ee4d0726a	Fix Transforms/ScalarRepl/union-pointer.ll llvm-svn: 35906	2007-04-11 15:45:25 +00:00
Chris Lattner	74ff60ff84	Turn stuff like: icmp slt i32 %X, 0 ; <i1>:0 [#uses=1] sext i1 %0 to i32 ; <i32>:1 [#uses=1] into: %X.lobit = ashr i32 %X, 31 ; <i32> [#uses=1] This implements InstCombine/icmp.ll:test[34] llvm-svn: 35891	2007-04-11 06:57:46 +00:00
Chris Lattner	d0f7942e23	Simplify some comparisons to arithmetic, this implements: Transforms/InstCombine/icmp.ll llvm-svn: 35890	2007-04-11 06:53:04 +00:00
Chris Lattner	20f2372a7c	canonicalize (x <u 2147483648) -> (x >s -1) and (x >u 2147483647) -> (x <s 0) llvm-svn: 35886	2007-04-11 06:12:58 +00:00
Chris Lattner	7ddbff090a	fix a miscompilation of: define i32 @test(i32 %X) { entry: %Y = and i32 %X, 4 ; <i32> [#uses=1] icmp eq i32 %Y, 0 ; <i1>:0 [#uses=1] sext i1 %0 to i32 ; <i32>:1 [#uses=1] ret i32 %1 } by moving code out of commonIntCastTransforms into visitZExt. Simplify the APInt gymnastics in it etc. llvm-svn: 35885	2007-04-11 05:45:39 +00:00
Chris Lattner	32104034f8	fix a regression introduced by my last patch. llvm-svn: 35879	2007-04-11 03:27:24 +00:00
Chris Lattner	daa012d1fb	Simplify SROA conversion to integer in some ways, make it more general in others. We now tolerate small amounts of undefined behavior, better emulating what would happen if the transaction actually occurred in memory. This fixes SingleSource/UnitTests/2007-04-10-BitfieldTest.c on PPC, at least until Devang gets a chance to fix the CFE from doing undefined things with bitfields :) llvm-svn: 35875	2007-04-11 00:57:54 +00:00
Chris Lattner	467b69cabb	Strengthen the boundary conditions of this fold, implementing InstCombine/set.ll:test25 llvm-svn: 35852	2007-04-09 23:52:13 +00:00
Owen Anderson	3c7867935e	Re-constify things that don't break the build. Last patch in this series, I promise. llvm-svn: 35848	2007-04-09 23:38:18 +00:00
Chris Lattner	3e9690f987	eliminate the last uses of some TLI methods. llvm-svn: 35844	2007-04-09 23:29:07 +00:00
Owen Anderson	f1ca1376d3	Unconst-ify stuff that broke the build. llvm-svn: 35843	2007-04-09 23:08:26 +00:00
Owen Anderson	5917716146	Const-ify some parameters, and some cosmetic cleanups. No functionality change. llvm-svn: 35842	2007-04-09 22:54:50 +00:00
Owen Anderson	e0ef5ac6bd	Tabs -> Spaces llvm-svn: 35841	2007-04-09 22:31:43 +00:00
Owen Anderson	83efbc84f7	Improve some _slow_ behavior introduced in my patches the last few days. llvm-svn: 35839	2007-04-09 22:25:09 +00:00
Chris Lattner	780c009756	switch LSR to use isLegalAddressingMode instead of other simpler hooks llvm-svn: 35837	2007-04-09 22:20:14 +00:00
Devang Patel	bca0d57179	Check _all_ PHINodes. llvm-svn: 35836	2007-04-09 22:20:10 +00:00
Devang Patel	8eb8eeada9	Insert new pre-header before new header. Original pre-header may happen to be an entry, in such case, it is not a good idea to insert new block before entry. Also fix typo in assertion check. llvm-svn: 35833	2007-04-09 21:40:43 +00:00
Devang Patel	854197884b	Preserve canonical loop form. llvm-svn: 35829	2007-04-09 20:19:46 +00:00
Reid Spencer	8436cdfda2	Don't link against System or Support library. These things will already be in the opt tool. llvm-svn: 35827	2007-04-09 19:17:47 +00:00
Devang Patel	b9af5747a5	Do not create new pre-header. Reuse original pre-header. llvm-svn: 35825	2007-04-09 19:04:21 +00:00
Devang Patel	03d7ae3a74	Simpler for() loops. llvm-svn: 35822	2007-04-09 17:09:13 +00:00
Devang Patel	d6ba41e02d	Fix future bug. Of course, Chris spotted this. Handle Argument or Undef as an incoming PHI value. llvm-svn: 35821	2007-04-09 16:41:46 +00:00
Devang Patel	b28a391a8d	More cosmetic changes. llvm-svn: 35820	2007-04-09 16:21:29 +00:00
Devang Patel	88bc2c6f82	Only cosmetic changes. Zero functionality Change. llvm-svn: 35819	2007-04-09 16:11:48 +00:00
Chris Lattner	a87c9f6114	Fix PR1304 and Transforms/InstCombine/2007-04-08-SingleEltVectorCrash.ll llvm-svn: 35792	2007-04-09 01:37:55 +00:00
Chris Lattner	4ca9cbb170	Eliminate useless insertelement instructions. This implements Transforms/InstCombine/vec_insertelt.ll and fixes PR1286. We now compile the code from that bug into: _foo: movl 4(%esp), %eax movdqa (%eax), %xmm0 movl 8(%esp), %ecx psllw (%ecx), %xmm0 movdqa %xmm0, (%eax) ret instead of: _foo: subl $4, %esp movl %ebp, (%esp) movl %esp, %ebp movl 12(%ebp), %eax movdqa (%eax), %xmm0 #IMPLICIT_DEF %eax pinsrw $2, %eax, %xmm0 xorl %ecx, %ecx pinsrw $3, %ecx, %xmm0 pinsrw $4, %eax, %xmm0 pinsrw $5, %ecx, %xmm0 pinsrw $6, %eax, %xmm0 pinsrw $7, %ecx, %xmm0 movl 8(%ebp), %eax movdqa (%eax), %xmm1 psllw %xmm0, %xmm1 movdqa %xmm1, (%eax) movl %ebp, %esp popl %ebp ret woo :) llvm-svn: 35788	2007-04-09 01:11:16 +00:00
Owen Anderson	ae39ca037a	Cleanup some from my DomSet-removal changes. Add a new isReachableFromEntry test to ETForest to factor a common test out of code. llvm-svn: 35786	2007-04-09 00:52:49 +00:00
Chris Lattner	aa8ad10c2f	Fix a typo that broke SimplifyLibCalls/SPrintF.ll (pr1315) llvm-svn: 35768	2007-04-08 18:11:26 +00:00
Chris Lattner	c8d3788f71	reenable this xform, whoops :) llvm-svn: 35765	2007-04-08 08:01:49 +00:00
Chris Lattner	7621a031d8	Fix regression on Instcombine/apint-or2.ll llvm-svn: 35763	2007-04-08 07:55:22 +00:00
Chris Lattner	1150df9cc4	Generalize the code that handles (A&B)|(A&C) to work where B/C are not constants. Add a new xform to simplify (A&B)|(~A&C). This implements InstCombine/or2.ll:test1 llvm-svn: 35760	2007-04-08 07:47:01 +00:00
Chris Lattner	5717981e5d	implement a fixme: move optimizations for fwrite out of fputs into a new fwrite optimizer. llvm-svn: 35758	2007-04-08 07:00:35 +00:00
Nick Lewycky	e6c64466c7	Remove DominatorSet usage from LoopSimplify. Patch from Owen Anderson. llvm-svn: 35757	2007-04-08 01:04:30 +00:00
Chris Lattner	182a945fb5	Significantly simplify the clients of GetConstantStringInfo, by having it just return the string itself. llvm-svn: 35755	2007-04-07 21:58:02 +00:00
Chris Lattner	08c0b8b3c8	Fix problems in the sprintf optimizer llvm-svn: 35754	2007-04-07 21:17:51 +00:00
Chris Lattner	bed184cbcf	Change CastToCStr to take a pointer instead of a reference. Fix some miscompilations in fprintf optimizer. llvm-svn: 35753	2007-04-07 21:04:50 +00:00
Chris Lattner	898d698d9f	Fix an off-by-one error that broke Prolangs/deriv2 with llc on x86 and Prolangs-C/cdecl llvm-svn: 35749	2007-04-07 20:19:08 +00:00
Owen Anderson	f7ebea1b9f	Add DomSet back, and revert the changes to LoopSimplify. Apparently the ETForest updating mechanisms don't work as I thought they did. These changes will be reapplied once the issue is worked out. llvm-svn: 35741	2007-04-07 18:23:27 +00:00
Nick Lewycky	d4f51a8ae3	Add support for cast instructions. llvm-svn: 35734	2007-04-07 15:48:32 +00:00
Owen Anderson	8763ba1b88	Completely purge DomSet. This is the (hopefully) final patch for PR1171. llvm-svn: 35731	2007-04-07 07:17:27 +00:00
Owen Anderson	706e97049d	Completely purge DomSet from LoopSimplify. This is part of the continuing work on PR1171. llvm-svn: 35730	2007-04-07 06:56:47 +00:00
Owen Anderson	d03a646f06	BreakCriticalEdges does still preserve DominatorTree. llvm-svn: 35729	2007-04-07 05:57:09 +00:00
Owen Anderson	b39d9ca902	Expunge DomSet from BreakCriticalEdges. This is part of the continuing work for PR 1171. llvm-svn: 35728	2007-04-07 05:49:29 +00:00
Owen Anderson	f095bf3ac4	Expunge DomSet from CodeExtractor. This is part of the continuing work on PR1171. llvm-svn: 35726	2007-04-07 05:31:27 +00:00
Nick Lewycky	93f541057b	Support NE inequality in ValueRanges. llvm-svn: 35724	2007-04-07 04:49:12 +00:00
Owen Anderson	910419596e	Expunge a bunch of uses of DomSet from LoopSimplify. Many more remain. This is the beginning of work for PR1171. llvm-svn: 35720	2007-04-07 04:37:14 +00:00
Nick Lewycky	3bb6de85d1	Cleanup. Refactor out the applying of value ranges to its own method. llvm-svn: 35719	2007-04-07 03:36:51 +00:00
Nick Lewycky	12d44abe0f	Use TargetData to find the size of a type. llvm-svn: 35718	2007-04-07 03:16:12 +00:00
Nick Lewycky	eeb01b41ef	Strengthen icmp snuggling by doing 'compare-or-equal-to' to 'compare' first and then range testing second. llvm-svn: 35715	2007-04-07 02:30:14 +00:00
Devang Patel	f42389ffe5	Add loop rotation pass. llvm-svn: 35714	2007-04-07 01:25:15 +00:00
Chris Lattner	0f1509511e	fix a miscompilation in printf optimizer. llvm-svn: 35713	2007-04-07 01:18:36 +00:00
Chris Lattner	6a36d636e9	trunc to bool no longer compares against zero llvm-svn: 35712	2007-04-07 01:03:46 +00:00
Chris Lattner	e8829aa9dd	cleanups for strlen optimizer llvm-svn: 35711	2007-04-07 01:02:00 +00:00
Chris Lattner	485b6415b1	Introduce a new ReplaceCallWith method, which simplifies a lot of code. llvm-svn: 35710	2007-04-07 00:42:32 +00:00
Chris Lattner	6a6c1f1c30	fixes for strcpy optimizer llvm-svn: 35709	2007-04-07 00:26:18 +00:00
Chris Lattner	f9ee647e86	Fix bugs in strncmp. llvm-svn: 35708	2007-04-07 00:06:57 +00:00
Chris Lattner	c9ccc30212	fix 3 miscompilations and several compiler crashes in strcmp optimizer. llvm-svn: 35707	2007-04-07 00:01:51 +00:00
Chris Lattner	39f0bb9670	Fix several nasty bugs in the strchr optimizer, this fixes SimplifyLibCalls/2007-04-06-strchr-miscompile.ll and PR1307 llvm-svn: 35706	2007-04-06 23:38:55 +00:00
Chris Lattner	56b7fc7768	clean up strcat optimizer, no functionality change. llvm-svn: 35704	2007-04-06 22:59:33 +00:00
Chris Lattner	9b2b8abd20	rename getConstantStringLength -> GetConstantStringInfo. Make it return the start index of the array as well as the length. No functionality change. llvm-svn: 35703	2007-04-06 22:54:17 +00:00
Chris Lattner	3dbe65f80a	implement Transforms/InstCombine/malloc2.ll and PR1313 llvm-svn: 35700	2007-04-06 18:57:34 +00:00
Chris Lattner	1a9a760318	Fix Transforms/GlobalOpt/2007-04-05-Crash.ll llvm-svn: 35689	2007-04-05 21:09:42 +00:00
Chris Lattner	108083edff	Use a worklist-driven algorithm instead of a recursive one. llvm-svn: 35680	2007-04-05 01:27:02 +00:00
Dale Johannesen	7c2001d014	Prevent transformConstExprCastCall from generating conversions that assert elsewhere. llvm-svn: 35668	2007-04-04 19:16:42 +00:00
Jeff Cohen	5a1c750f31	Fix 2007-04-04-BadFoldBitcastIntoMalloc.ll llvm-svn: 35665	2007-04-04 16:58:57 +00:00
Duncan Sands	f01a47c93c	Fix comment. llvm-svn: 35655	2007-04-04 06:42:45 +00:00
Chris Lattner	e5bbb3cb1a	Fix a bug I introduced with my patch yesterday which broke Qt (I converted some constant exprs to apints). Thanks to Anton for tracking down a small testcase that triggered this! llvm-svn: 35633	2007-04-03 23:29:39 +00:00
Chris Lattner	a74deafb13	reinstate the previous two patches, with a bugfix :) ldecod now passes. llvm-svn: 35626	2007-04-03 17:43:25 +00:00
Evan Cheng	7511fa280d	Reverting back to 1.723. The last two commits broke JM (and possibly others) on ARM. llvm-svn: 35620	2007-04-03 08:11:50 +00:00
Chris Lattner	81e0707552	split some code out into a helper function llvm-svn: 35615	2007-04-03 05:11:24 +00:00
Chris Lattner	64c764cebc	Split a whole ton of code out of visitICmpInst into visitICmpInstWithInstAndIntCst. llvm-svn: 35614	2007-04-03 04:46:52 +00:00
Chris Lattner	8b2ec5f506	Fix PR1253 and xor2.ll:test[01] llvm-svn: 35612	2007-04-03 01:47:41 +00:00
Chris Lattner	f3197a7d53	allow -1 strides to reuse "1" strides. llvm-svn: 35607	2007-04-02 22:51:58 +00:00
Zhou Sheng	9bc8ab100d	1. Make use of APInt operation instead of using ConstantExpr::getXXX. 2. Use cheaper APInt methods. llvm-svn: 35594	2007-04-02 13:45:30 +00:00
Zhou Sheng	56cda95658	Use uint32_t for bitwidth instead of unsigned. llvm-svn: 35593	2007-04-02 08:20:41 +00:00
Chris Lattner	28e0e4e11e	Pass the type of the store access, not the type of the store, into the target hook. This allows us to codegen a loop as: LBB1_1: @cond_next mov r2, #0 str r2, [r0, +r3, lsl #2] add r3, r3, #1 cmn r3, #1 bne LBB1_1 @cond_next instead of: LBB1_1: @cond_next mov r2, #0 str r2, [r0], #+4 add r3, r3, #1 cmn r3, #1 bne LBB1_1 @cond_next This looks the same, but has one fewer induction variable (and therefore, one fewer register) live in the loop. llvm-svn: 35592	2007-04-02 06:34:44 +00:00
Chris Lattner	9d5aacee92	Wrap long line llvm-svn: 35588	2007-04-02 05:48:58 +00:00
Chris Lattner	50490d54f2	use more obvious function name. llvm-svn: 35587	2007-04-02 05:42:22 +00:00
Chris Lattner	b24acc7bee	simplify (x+c)^signbit as (x+c+signbit), pointed out by PR1288. This implements test/Transforms/InstCombine/xor.ll:test28 llvm-svn: 35584	2007-04-02 05:36:22 +00:00
Chris Lattner	b7b75145f1	reduce use of std::set llvm-svn: 35576	2007-04-02 01:44:59 +00:00
Chris Lattner	c3748562bd	Various passes before isel split edges and do other CFG-restructuring changes. isel has its own particular features that it wants in the CFG, in order to reduce the number of times a constant is computed, etc. Make sure that we clean up the CFG before doing any other things for isel. Doing so can dramatically reduce the number of split edges and reduce the number of places that constants get computed. For example, this shrinks CodeGen/Generic/phi-immediate-factoring.ll from 44 to 37 instructions on X86, and from 21 to 17 MBB's in the output. This is primarily a code size win, not a performance win. This implements CodeGen/Generic/phi-immediate-factoring.ll and PR1296. llvm-svn: 35575	2007-04-02 01:35:34 +00:00
Chris Lattner	8fe3cbe6bd	print the type of an inserted IV in -debug mode. llvm-svn: 35563	2007-04-01 22:21:39 +00:00
Chris Lattner	c3eeb42809	simplify this code, make it work for ap ints llvm-svn: 35561	2007-04-01 20:57:36 +00:00
Zhou Sheng	150f3bbab2	Avoid unnecessary APInt construction. llvm-svn: 35555	2007-04-01 17:13:37 +00:00
Reid Spencer	6bba6c8143	For PR1297: Support overloaded intrinsics bswap, ctpop, cttz, ctlz. llvm-svn: 35547	2007-04-01 07:35:23 +00:00
Chris Lattner	0427799531	Fix InstCombine/2007-03-31-InfiniteLoop.ll llvm-svn: 35536	2007-04-01 05:36:37 +00:00
Chris Lattner	f2836d17b6	Split the sdisel code munging stuff out into its own opt-pass, CodeGenPrepare. llvm-svn: 35528	2007-03-31 04:06:36 +00:00
Zhou Sheng	82c42284f4	Delete dead code. llvm-svn: 35525	2007-03-31 02:50:26 +00:00
Zhou Sheng	4f16402e0d	Use APInt operators to calculate the carry bits, remove this loop. llvm-svn: 35524	2007-03-31 02:38:39 +00:00
Zhou Sheng	fd28a33031	Make sure the use of ConstantInt::getZExtValue() for shift amount safe. llvm-svn: 35510	2007-03-30 17:20:39 +00:00
Zhou Sheng	b25806fa5f	1. Make sure the use of ConstantInt::getZExtValue() for getting shift amount is safe. 2. Use new method on ConstantInt instead of (? :) operator. 3. Use new method uge() on ConstantInt to simplify codes. llvm-svn: 35505	2007-03-30 09:29:48 +00:00
Zhou Sheng	5e60a4a6b0	Use APInt operation instead of ConstantExpr::getXX. llvm-svn: 35503	2007-03-30 05:45:18 +00:00
Zhou Sheng	b3a80b1d70	1. Make more use of APInt::getHighBitsSet/getLowBitsSet. 2. Let APInt variable do the binary operation stuff instead of using ConstantExpr::getXXX. llvm-svn: 35450	2007-03-29 08:15:12 +00:00
Zhou Sheng	444af49cc0	Clean up some codes in InstCombiner::SimplifyDemandedBits(). llvm-svn: 35446	2007-03-29 04:45:55 +00:00
Zhou Sheng	a4475575c0	Clean up codes in InstCombiner::SimplifyDemandedBits(): 1. Line out nested call of APInt::zext/trunc. 2. Make more use of APInt::getHighBitsSet/getLowBitsSet. 3. Use APInt[] operator instead of expression like "APIntVal & SignBit". llvm-svn: 35444	2007-03-29 02:26:30 +00:00
Zhou Sheng	4961cf1c06	1. Make the APInt variable do the binary operation stuff if possible instead of using ConstantExpr::getXX. 2. Use constant reference to APInt if possible instead of expensive APInt copy. llvm-svn: 35443	2007-03-29 01:57:21 +00:00
Zhou Sheng	117477e28b	Avoid unnecessary APInt construction. llvm-svn: 35431	2007-03-28 17:38:21 +00:00
Zhou Sheng	23f7a1c947	1. Make more use of getLowBitsSet/getHighBitsSet. 2. Use APInt[] instead of "X & SignBit". 3. Clean up some codes. 4. Make the expression like "ShiftAmt = ShiftAmtC->getZExtValue()" safe. llvm-svn: 35424	2007-03-28 15:02:20 +00:00
Zhou Sheng	2777a31850	1. Make more use of getLowBitsSet/getHighBitsSet. 2. Make the APInt value do the zext/trunc stuff instead of using ConstantExpr::getZExt(). llvm-svn: 35422	2007-03-28 09:19:01 +00:00
Zhou Sheng	c2d3309b99	Use UnknownBIts[BitWidth-1] instead of UnknownBIts & SignBits. llvm-svn: 35418	2007-03-28 05:15:57 +00:00
Zhou Sheng	18570b1f14	Remove unused APInt variable. llvm-svn: 35414	2007-03-28 03:02:21 +00:00
Zhou Sheng	57e3f7324b	Clean up codes in ComputeMaskedBits(): 1. Line out nested use of zext/trunc. 2. Make more use of getHighBitsSet/getLowBitsSet. 3. Use APInt[] != 0 instead of "(APInt & SignBit) != 0". llvm-svn: 35408	2007-03-28 02:19:03 +00:00
Reid Spencer	a5c18bf798	For PR1280: When converting an add/xor/and triplet into a trunc/sext, only do so if the intermediate integer type is a bitwidth that the targets can handle. llvm-svn: 35400	2007-03-28 01:36:16 +00:00
Evan Cheng	a4ed8a512a	Unbreaks non-debug builds. llvm-svn: 35383	2007-03-27 16:44:48 +00:00
Reid Spencer	54d5b1b8f8	Implement some minor review feedback. llvm-svn: 35373	2007-03-26 23:58:26 +00:00
Reid Spencer	441486c172	For PR1271: Fix another incorrectly converted shift mask. llvm-svn: 35371	2007-03-26 23:45:51 +00:00
Devang Patel	4398e242dd	Reduce malloc/free traffic. llvm-svn: 35370	2007-03-26 23:19:29 +00:00
Chris Lattner	d2602d5054	eliminate use of std::set llvm-svn: 35361	2007-03-26 20:40:50 +00:00
Reid Spencer	755d0e7ffc	Get better debug output by having modified instructions print both the original and new instruction. A slight performance hit with ostringstream but it is only for debug. Also, clean up an uninitialized variable warning noticed in a release build. llvm-svn: 35358	2007-03-26 17:44:01 +00:00
Reid Spencer	769a5a8e0b	Get the number of bits to set in a mask correct for a shl/lshr transform. llvm-svn: 35357	2007-03-26 17:18:58 +00:00
Reid Spencer	50898607a9	For PR1271: Fix SingleSource/Regression/C/2003-05-21-UnionBitFields.c by changing a getHighBitsSet call to getLowBitsSet call that was incorrectly converted from the original lshr constant expression. llvm-svn: 35348	2007-03-26 05:25:00 +00:00
Dale Johannesen	e5866e7b89	Look through bitcast when finding IVs. (Chris' patch really.) llvm-svn: 35347	2007-03-26 03:01:27 +00:00
Reid Spencer	52830327e9	For PR1271: Remove a use of getLowBitsSet that caused the mask used for replacement of shl/lshr pairs with an AND instruction to be computed incorrectly. It's not clear exactly why this is the case. This solves the disappearing shifts problem, but it doesn't fix Regression/C/2003-05-21-UnionBitFields. It seems there is more going on. llvm-svn: 35342	2007-03-25 21:11:44 +00:00
Chris Lattner	9bf53ffaa2	implement Transforms/InstCombine/cast2.ll:test3 and PR1263 llvm-svn: 35341	2007-03-25 20:43:09 +00:00
Reid Spencer	624766f8a2	Some cleanup from review: * Don't assume shift amounts are <= 64 bits * Avoid creating an extra APInt in SubOne and AddOne by using -- and ++ * Add another use of getLowBitsSet * Convert a series of if statements to a switch llvm-svn: 35339	2007-03-25 19:55:33 +00:00
Reid Spencer	80263aadf3	Refactor several ConstantExpr::getXXX calls with ConstantInt arguments using the facilities of APInt. While this duplicates a tiny fraction of the constant folding code, it also makes the code easier to read and avoids large ConstantExpr overhead for simple, known computations. llvm-svn: 35335	2007-03-25 05:33:51 +00:00
Zhou Sheng	222d5ebfd2	1. Avoid unnecessary APInt construction if possible. 2. Use isStrictlyPositive() instead of isPositive() in two places where they need APInt value > 0 not only >=0. llvm-svn: 35333	2007-03-25 05:01:29 +00:00
Reid Spencer	cd99fbdf3b	Make more uses of getHighBitsSet and get rid of some pointless & of an APInt with its type mask. llvm-svn: 35325	2007-03-25 04:26:16 +00:00
Reid Spencer	d8aad61d4d	More APIntification: * Convert the last use of a uint64_t that should have been an APInt. * Change ComputeMaskedBits to have a const reference argument for the Mask so that recursions don't cause unneeded temporaries. This causes temps to be needed in other places (where the mask has to change) but this change optimizes for the recursion which is more frequent. * Remove two instances of &ing a Mask with getAllOnesValue. It's not needed any more because APInt is accurate in its bit computations. * Start using the getLowBitsSet and getHighBitsSet methods on APInt instead of shifting. This makes it more clear in the code what is going on. llvm-svn: 35321	2007-03-25 02:03:12 +00:00
Chris Lattner	3a8248f79d	fix a regression on vector or instructions. llvm-svn: 35314	2007-03-24 23:56:43 +00:00
Zhou Sheng	e9ebd3f6ba	Make some codes more efficient. llvm-svn: 35297	2007-03-24 15:34:37 +00:00
Reid Spencer	a962d18774	For PR1205: Convert some calls to ConstantInt::getZExtValue() into getValue() and use APInt facilities in the subsequent computations. llvm-svn: 35294	2007-03-24 00:42:08 +00:00
Reid Spencer	959a21d3dc	For PR1205: * APIntify visitAdd and visitSelectInst * Remove unused uint64_t versions of utility functions that have been replaced with APInt versions. This completes most of the changes for APIntification of InstCombine. This passes llvm-test and llvm/test/Transforms/InstCombine/APInt. Patch by Zhou Sheng. llvm-svn: 35287	2007-03-23 21:24:59 +00:00
Reid Spencer	6d39206bc2	For PR1205: APIntify visitDiv, visitMul and visitRem. Patch by Zhou Sheng. llvm-svn: 35283	2007-03-23 20:05:17 +00:00
Chris Lattner	12b89cc148	switch AddReachableCodeToWorklist from being recursive to being iterative. llvm-svn: 35282	2007-03-23 19:17:18 +00:00
Reid Spencer	6274c72ee1	For PR1205: APIntify several utility functions supporting logical operators and shift operators. Patch by Zhou Sheng. llvm-svn: 35281	2007-03-23 18:46:34 +00:00
Zhou Sheng	0900993ebc	Make the "KnownZero ^ TypeMask" computation just once. llvm-svn: 35276	2007-03-23 03:13:21 +00:00
Zhou Sheng	755f04b5d7	Simplify the code. llvm-svn: 35275	2007-03-23 02:39:25 +00:00
Reid Spencer	b722f2b110	For PR1205: APInt support for logical operators in visitAnd, visitOr, and visitXor. Patch by Zhou Sheng. llvm-svn: 35273	2007-03-22 22:19:58 +00:00
Reid Spencer	4154e732e6	For PR1205: * APIntify commonIntCastTransforms * APIntify visitTrunc * APIntify visitZExt Patch by Zhou Sheng. llvm-svn: 35271	2007-03-22 20:56:53 +00:00
Reid Spencer	c3e3b8a32f	For PR1205: * Re-enable the APInt version of MaskedValueIsZero. * APIntify the Comput{Un}SignedMinMaxValuesFromKnownBits functions * APIntify visitICmpInst. llvm-svn: 35270	2007-03-22 20:36:03 +00:00
Dan Gohman	dcb291faa4	Change uses of Function::front to Function::getEntryBlock for readability. llvm-svn: 35265	2007-03-22 16:38:57 +00:00
Nick Lewycky	b0da7ed9c8	Fix broken optimization disabled by a logic bug. Analyze GEPs. If the indices are all zero, transfer whether the pointer is known to be not null through the GEP. Add a few more cases for xor and shift instructions. llvm-svn: 35257	2007-03-22 02:02:51 +00:00
Reid Spencer	f40711637f	For PR1248: * Fix some indentation and comments in InsertRangeTest * Add an "IsSigned" parameter to AddWithOverflow and make it handle signed additions. Also, APIntify this function so it works with any bitwidth. * For the icmp pred ([us]div %X, C1), C2 transforms, exit early if the div instruction's RHS is zero. * Finally, for icmp pred (sdiv %X, C1), -C2, fix an off-by-one error. The HiBound needs to be incremented in order to get the range test correct. llvm-svn: 35247	2007-03-21 23:19:50 +00:00
Dale Johannesen	bacf4acf65	do not share old induction variables when this would result in invalid instructions (that would have to be split later) llvm-svn: 35227	2007-03-20 21:54:54 +00:00
Jeff Cohen	1baf5c84ab	Fix some VC++ warnings. llvm-svn: 35224	2007-03-20 20:43:18 +00:00
Devang Patel	1758cb50de	LoopSimplify::FindPHIToPartitionLoops() Use ETForest instead of DominatorSet. llvm-svn: 35221	2007-03-20 20:18:12 +00:00
Zhou Sheng	b3949340c8	Simplify isHighOnes(). llvm-svn: 35211	2007-03-20 12:49:06 +00:00
Dale Johannesen	e3a02be5f1	use types of loads and stores, not address, in CheckForIVReuse llvm-svn: 35197	2007-03-20 00:47:50 +00:00
Reid Spencer	6682721316	Make isOneBitSet faster by using APInt::isPowerOf2. Thanks Chris. llvm-svn: 35194	2007-03-20 00:16:52 +00:00
Reid Spencer	cc031a43aa	APIntify the isHighOnes utility function. llvm-svn: 35190	2007-03-19 21:29:50 +00:00
Reid Spencer	ef599b0786	Implement isMaxValueMinusOne in terms of APInt instead of uint64_t. Patch by Sheng Zhou. llvm-svn: 35188	2007-03-19 21:10:28 +00:00
Reid Spencer	3b93db72b4	Implement isMinValuePlusOne using facilities of APInt instead of uint64_t Patch by Zhou Sheng. llvm-svn: 35187	2007-03-19 21:08:07 +00:00
Reid Spencer	129a86792d	Implement isOneBitSet in terms of APInt::countPopulation. llvm-svn: 35186	2007-03-19 21:04:43 +00:00
Reid Spencer	450434ed65	1. Use APInt::getSignBit to reduce clutter (patch by Sheng Zhou) 2. Replace uses of the "isPositive" utility function with APInt::isPositive llvm-svn: 35185	2007-03-19 20:58:18 +00:00
Reid Spencer	03c31d5bb0	Remove a redundant clause in an if statement. Patch by Sheng Zhou. llvm-svn: 35184	2007-03-19 20:47:50 +00:00
Chris Lattner	9c62db7c8c	fix ScalarRepl/2007-03-19-CanonicalizeMemcpy.ll llvm-svn: 35169	2007-03-19 18:25:57 +00:00
Chris Lattner	877a3b424d	implement the next chunk of SROA with memset/memcpy's of aggregates. This implements Transforms/ScalarRepl/memset-aggregate-byte-leader.ll llvm-svn: 35150	2007-03-19 00:16:43 +00:00
Nick Lewycky	db204ecfbc	Clean up this code and fix subtract miscompile. llvm-svn: 35146	2007-03-18 22:58:46 +00:00
Chris Lattner	0741842b3b	Implement InstCombine/and-xor-merge.ll:test[12]. Rearrange some code to simplify it now that shifts are binops llvm-svn: 35145	2007-03-18 22:51:34 +00:00
Nick Lewycky	17d20fd41e	Propagate ValueRanges across equality. Add some more micro-optimizations: x * 0 = 0, a - x = a --> x = 0. llvm-svn: 35138	2007-03-18 01:09:32 +00:00
Anton Korobeynikov	22f436da42	Silence warning llvm-svn: 35137	2007-03-17 14:48:06 +00:00
Nick Lewycky	4f73de2b4e	Add more comments and update to new asm syntax. Add new micro-optimizations. Add icmp predicate snuggling. Given %x ULT 4, "icmp ugt %x, 2" becomes "icmp eq %x, 3". This doesn't apply in any non-trivial cases yet due to missing support for NE values in ValueRanges. llvm-svn: 35119	2007-03-16 02:37:39 +00:00
Zhou Sheng	d8c645b0ba	ShiftAmt might be equal to zero. Handle this situation. llvm-svn: 35094	2007-03-14 09:07:33 +00:00
Zhou Sheng	b912844554	Enable KnownZero/One.clear(). llvm-svn: 35093	2007-03-14 03:21:24 +00:00
Evan Cheng	b5eb932c93	Correct type info for isLegalAddressImmediate() check. llvm-svn: 35086	2007-03-13 20:34:37 +00:00
Chris Lattner	d1bce956b4	ifdef out some dead code. Fix PR1244 and Transforms/InstCombine/2007-03-13-CompareMerge.ll llvm-svn: 35082	2007-03-13 14:27:42 +00:00
Zhou Sheng	ebe634e662	For expression like "APInt::getAllOnesValue(ShiftAmt).zextOrCopy(BitWidth)", to handle ShiftAmt == BitWidth situation, use zextOrCopy() instead of zext(). llvm-svn: 35080	2007-03-13 06:40:59 +00:00
Zhou Sheng	af4341d441	In APInt version ComputeMaskedBits(): 1. Ensure VTy, KnownOne and KnownZero have same bitwidth. 2. Make code more efficient. llvm-svn: 35078	2007-03-13 02:23:10 +00:00
Evan Cheng	720acdfb31	Use new TargetLowering addressing modes hooks. llvm-svn: 35072	2007-03-12 23:27:37 +00:00
Jeff Cohen	00227417d2	Unbreak VC++ build. Do not use identifiers starting with _ as they are reserved and can collide with system defined names. Windows defines _BB, for example. llvm-svn: 35066	2007-03-12 17:56:27 +00:00
Reid Spencer	1791f23803	Add an APInt version of SimplifyDemandedBits. Patch by Zhou Sheng. llvm-svn: 35064	2007-03-12 17:25:59 +00:00
Reid Spencer	d9281784be	Add an APInt version of ShrinkDemandedConstant. Patch by Zhou Sheng. llvm-svn: 35063	2007-03-12 17:15:10 +00:00
Zhou Sheng	be171ee5cd	Avoid asserting on "(KnownZero & KnownOne) == 0". llvm-svn: 35062	2007-03-12 16:54:56 +00:00
Zhou Sheng	b3e00c4656	In function ComputeMaskedBits(): 1. Replace getSignedMinValue() with getSignBit() for better code readability. 2. Replace APIntOps::shl() with operator<<= for convenience. 3. Make APInt construction more effective. llvm-svn: 35060	2007-03-12 05:44:52 +00:00
Nick Lewycky	d9bd0bc3e2	Add value ranges. Currently inefficient in both execution time and optimization power. llvm-svn: 35058	2007-03-10 18:12:48 +00:00
Anton Korobeynikov	8a6dc102d3	Use range tests in LowerSwitch, where possible llvm-svn: 35057	2007-03-10 16:46:28 +00:00
Devang Patel	5f50e61d52	Remove dead comments. llvm-svn: 35053	2007-03-09 23:41:03 +00:00
Devang Patel	bda1250624	Avoid recursion. Use iterative algorithm for RenamePass(). llvm-svn: 35052	2007-03-09 23:39:14 +00:00
Devang Patel	58818c530f	Increment iterator now because IVUseShouldUsePostIncValue may remove User from the list of I users. llvm-svn: 35051	2007-03-09 21:19:53 +00:00
Zhou Sheng	d1eb3d593e	Fix a bug in function ComputeMaskedBits(). llvm-svn: 35027	2007-03-08 15:15:18 +00:00
Chris Lattner	abd3bff4f2	This appears correct, enable it so we can see perf changes on testers llvm-svn: 35024	2007-03-08 07:03:55 +00:00
Chris Lattner	9f022d550b	Second half of PR1226. This is currently still disabled, until I have a chance to do the correctness/performance analysis testing. llvm-svn: 35023	2007-03-08 06:36:54 +00:00
Zhou Sheng	387d7b1a35	Fix a bug in APIntified ComputeMaskedBits(). llvm-svn: 35022	2007-03-08 05:42:00 +00:00
Reid Spencer	bb5741fb02	For PR1205: Provide an APIntified version of MaskedValueIsZero. This will (temporarily) cause a "defined but not used" message from the compiler. It will be used in the next patch in this series. Patch by Sheng Zhou. llvm-svn: 35019	2007-03-08 01:52:58 +00:00
Reid Spencer	aa69640b10	For PR1205: Add a new ComputeMaskedBits function that is APIntified. We'll slowly convert things over to use this version. When it's all done, we'll remove the existing version. llvm-svn: 35018	2007-03-08 01:46:38 +00:00
Devang Patel	2ac57e1f02	Now IndVarSimplify is a LoopPass. llvm-svn: 35003	2007-03-07 06:39:01 +00:00
Devang Patel	69730c96db	Now LICM is a LoopPass. llvm-svn: 35001	2007-03-07 04:41:30 +00:00
Devang Patel	9779e56c04	Now LoopUnroll is a LoopPass. llvm-svn: 34996	2007-03-07 01:38:05 +00:00
Devang Patel	901a27d892	Now LoopUnswitch is a LoopPass. llvm-svn: 34992	2007-03-07 00:26:10 +00:00
Devang Patel	b0743b5d6a	Now LoopStrengthReduce is a LoopPass. llvm-svn: 34984	2007-03-06 21:14:09 +00:00
Reid Spencer	3939b1a274	Remove an unnecessary if statement and adjust indentation. llvm-svn: 34939	2007-03-05 23:36:13 +00:00
Chris Lattner	66e6a8229a	This is the first major step of implementing PR1226. We now successfully scalarrepl things down to elements, but mem2reg can't promote elements that are memset/memcpy'd. Until then, the code is disabled "0 &&". llvm-svn: 34924	2007-03-05 07:52:57 +00:00
Chris Lattner	fe53cf2459	fix a subtle bug that caused an MSVC warning. Thanks to Jeffc for pointing this out. llvm-svn: 34920	2007-03-05 00:11:19 +00:00
Chris Lattner	5fdded1d2f	Add some simplifications for demanded bits, this allows instcombine to turn: define i64 @test(i64 %A, i32 %B) { %tmp12 = zext i32 %B to i64 ; <i64> [#uses=1] %tmp3 = shl i64 %tmp12, 32 ; <i64> [#uses=1] %tmp5 = add i64 %tmp3, %A ; <i64> [#uses=1] %tmp6 = and i64 %tmp5, 123 ; <i64> [#uses=1] ret i64 %tmp6 } into: define i64 @test(i64 %A, i32 %B) { %tmp6 = and i64 %A, 123 ; <i64> [#uses=1] ret i64 %tmp6 } This implements Transforms/InstCombine/add2.ll:test1 llvm-svn: 34919	2007-03-05 00:02:29 +00:00
Jeff Cohen	b622c11f77	Unbreak VC++ build. llvm-svn: 34917	2007-03-05 00:00:42 +00:00
Chris Lattner	ab2f913b68	simplify some code llvm-svn: 34914	2007-03-04 23:16:36 +00:00
Chris Lattner	c33fd469ef	minor cleanups llvm-svn: 34904	2007-03-04 04:50:21 +00:00
Chris Lattner	8258b44b22	Speed up -instcombine by 20% by avoiding a particularly expensive passmgr call. llvm-svn: 34902	2007-03-04 04:27:24 +00:00
Chris Lattner	a5403a587c	switch MarkAliveBlocks over to using SmallPtrSet instead of std::set, speeding up simplifycfg by 20% llvm-svn: 34901	2007-03-04 04:20:48 +00:00
Chris Lattner	d7b4c92cd0	make better use of LCSSA information in RewriteLoopExitValues. Before, we would scan the entire loop body, then scan all users of instructions in the loop, looking for users outside the loop. Now, since we know that the loop is in LCSSA form, we know that any users outside the loop will be LCSSA phi nodes. Just scan them. This speeds up indvars significantly. llvm-svn: 34898	2007-03-04 03:43:23 +00:00
Chris Lattner	1f7648efba	Implement PR1179/PR1232 and test/Transforms/IndVarsSimplify/loop_evaluate_[234].ll This makes -indvars require and use LCSSA, updating it as appropriate. llvm-svn: 34896	2007-03-04 01:00:28 +00:00
Chris Lattner	ed30abf0cb	Make RewriteLoopExitValues far less nested by using continue in the loop llvm-svn: 34891	2007-03-03 22:48:48 +00:00
Chris Lattner	da1d04a057	my recent change caused a failure in a bswap testcase, because it changed the order that instcombine processed instructions in the testcase. The end result is that instcombine finished with: define i16 @test1(i16 %a) { %tmp = zext i16 %a to i32 ; <i32> [#uses=2] %tmp21 = lshr i32 %tmp, 8 ; <i32> [#uses=1] %tmp5 = shl i32 %tmp, 8 ; <i32> [#uses=1] %tmp.upgrd.32 = or i32 %tmp21, %tmp5 ; <i32> [#uses=1] %tmp.upgrd.3 = trunc i32 %tmp.upgrd.32 to i16 ; <i16> [#uses=1] ret i16 %tmp.upgrd.3 } which can't get matched as a bswap. This patch makes instcombine more sophisticated about removing truncating casts, allowing it to turn this into: define i16 @test2(i16 %a) { %tmp211 = lshr i16 %a, 8 %tmp52 = shl i16 %a, 8 %tmp.upgrd.323 = or i16 %tmp211, %tmp52 ret i16 %tmp.upgrd.323 } which then matches as bswap. This fixes bswap.ll and implements InstCombine/cast2.ll:test[12]. This also implements cast elimination of add/sub. llvm-svn: 34870	2007-03-03 05:27:34 +00:00
Nick Lewycky	db42295ff2	Translate bit operations to English. llvm-svn: 34868	2007-03-03 03:14:40 +00:00
Chris Lattner	960a543037	add a top-level iteration loop to instcombine. This means that it will never finish without combining something it is capable of. llvm-svn: 34865	2007-03-03 02:04:50 +00:00
Reid Spencer	c34dedf686	APIntify this pass. llvm-svn: 34863	2007-03-03 00:48:31 +00:00
Reid Spencer	53a3739c80	Finally get this patch right :) Replace expensive getZExtValue() == 0 calls with isZero() calls. llvm-svn: 34861	2007-03-02 23:51:25 +00:00
Reid Spencer	ba547cbb2a	Dang, I've done that twice now! Undo previous commit. llvm-svn: 34860	2007-03-02 23:37:53 +00:00
Reid Spencer	558990e189	Use more efficient test for one value in a ConstantInt. llvm-svn: 34859	2007-03-02 23:35:28 +00:00
Reid Spencer	29fe20a98b	Guard against huge loop trip counts in an APInt safe way. llvm-svn: 34858	2007-03-02 23:31:34 +00:00
Reid Spencer	dec03a08d6	Make sure debug code is not evaluated in non-debug case. llvm-svn: 34856	2007-03-02 23:15:21 +00:00
Reid Spencer	1e102971d2	1. Sort switch cases using APInt safe comparison. 2. Make sure debug output of APInt values is safe for all bit widths. llvm-svn: 34855	2007-03-02 23:05:28 +00:00
Reid Spencer	43376a74af	Use APInt safe isOne() method on ConstantInt instead of getZExtValue()==1 llvm-svn: 34854	2007-03-02 23:03:17 +00:00
Reid Spencer	bb38d79ad6	Make sorting of ConstantInt be APInt clean through use of ult function. llvm-svn: 34853	2007-03-02 23:01:14 +00:00
Chris Lattner	b15e2b182f	Fix a significant algorithm problem with the instcombine worklist. removing a value from the worklist required scanning the entire worklist to remove all entries. We now use a combination map+vector to prevent duplicates from happening and prevent the scan. This speeds up instcombine on a large file from the llvm-gcc bootstrap from 189.7s to 4.84s in a debug build and from 5.04s to 1.37s in a release build. llvm-svn: 34848	2007-03-02 21:28:56 +00:00
Chris Lattner	51f5457ad4	minor cleanup llvm-svn: 34846	2007-03-02 19:59:19 +00:00
Chris Lattner	4bd8cda3f0	switch the inliner from being recursive to being iterative. llvm-svn: 34832	2007-03-02 03:11:20 +00:00
Reid Spencer	197adfaa0a	Reverse a premature committal. llvm-svn: 34822	2007-03-02 00:31:39 +00:00
Reid Spencer	2e54a15943	Prefer non-virtual calls to ConstantInt::isZero over virtual calls to Constant::isNullValue() in situations where it is possible. llvm-svn: 34821	2007-03-02 00:28:52 +00:00
Reid Spencer	fa63226751	Although probably not necessary, guard against a potential assertion by using isNullValue() instead of getZExtValue() == 0. llvm-svn: 34815	2007-03-01 21:54:37 +00:00
Reid Spencer	17797076ef	Use isUnitValue() instead of getZExtValue() == 1 which will prevent an assert if the ConstantInt's value is large. llvm-svn: 34814	2007-03-01 21:51:23 +00:00
Reid Spencer	5b0548de77	Use APInt conversion to string so the result is correct regardless of the bit width of the ConstantInt being converted. llvm-svn: 34810	2007-03-01 21:00:32 +00:00
Reid Spencer	24f1a0e78f	The 64-bit constructor for ConstantInt changes from int64_t to uint64_t. This caused a warning for construction with -1. Avoid the warning by using -1ULL instead. llvm-svn: 34796	2007-03-01 19:33:52 +00:00
Reid Spencer	6a44033465	Remove the "isSigned" parameters from ConstantRange. It turns out they are not needed as the results are the same with or without it. Patch by Nicholas Lewycky. llvm-svn: 34782	2007-03-01 07:54:15 +00:00
Reid Spencer	d373b9dc59	For PR1205: Adjust to changes in ConstantRange interface. llvm-svn: 34762	2007-02-28 22:03:51 +00:00
Reid Spencer	3a7e9d8e75	For PR1205: Remove ConstantInt from ConstantRange interface and adjust its users to compensate. llvm-svn: 34758	2007-02-28 19:57:34 +00:00
Reid Spencer	56f784d12d	For PR1205: First round of ConstantRange changes. This makes all CR constructors use only APInt and not use ConstantInt. Clients are adjusted accordingly. llvm-svn: 34756	2007-02-28 18:57:32 +00:00
Devang Patel	97517ff930	Use efficient container SmallPtrSet llvm-svn: 34640	2007-02-26 20:22:50 +00:00
Devang Patel	967b84c681	Do not unswitch loop on same value again and again. llvm-svn: 34638	2007-02-26 19:31:58 +00:00
Chris Lattner	c4d8e7e614	Fix InstCombine/2007-02-23-PhiFoldInfLoop.ll and PR1217 llvm-svn: 34546	2007-02-24 01:03:45 +00:00
Chris Lattner	1e48acb858	fix an obscure and tricky bug the inliner can hit sometimes. llvm-svn: 34531	2007-02-23 19:54:30 +00:00
Jim Laskey	d879dfbf1c	Revert changes for a simpler solution. llvm-svn: 34495	2007-02-22 16:21:18 +00:00
Jim Laskey	e4ccf22c34	Itanium ABI exception handling support. llvm-svn: 34480	2007-02-21 22:49:50 +00:00
Dan Gohman	8c8597c4d9	Fix typos in comments. llvm-svn: 34456	2007-02-20 20:52:03 +00:00
Chris Lattner	c35fe713ff	remove reoptimizer-specific passes llvm-svn: 34439	2007-02-20 05:31:49 +00:00
Chris Lattner	b5f6d0c15a	eliminate use of deprecated apis llvm-svn: 34417	2007-02-19 07:34:47 +00:00
Chris Lattner	9f4707eb04	fix comment llvm-svn: 34395	2007-02-18 22:10:58 +00:00
Chris Lattner	a6f54c0e2c	simplify pass, delete dead gvar protos as well. llvm-svn: 34394	2007-02-18 22:10:34 +00:00
Chris Lattner	99c6cf60f1	convert more vectors to smallvectors, 2.8% speedup llvm-svn: 34333	2007-02-15 22:52:10 +00:00
Chris Lattner	af6094fe3f	change some vectors to smallvectors. This speeds up instcombine on 447.dealII by 5%. llvm-svn: 34332	2007-02-15 22:48:32 +00:00
Chris Lattner	7907e5fe07	switch an std::set to a SmallPtrSet, this speeds up instcombine by 9.5% on 447.dealII llvm-svn: 34323	2007-02-15 19:41:52 +00:00
Reid Spencer	09575bac2e	For PR1195: Change use of "packed" term to "vector" in comments, strings, variable names, etc. llvm-svn: 34300	2007-02-15 03:39:18 +00:00
Reid Spencer	537ee02f89	Change an assert that mentions Packed Type -> Vector Type. llvm-svn: 34298	2007-02-15 03:11:20 +00:00
Reid Spencer	d84d35ba70	For PR1195: Rename PackedType -> VectorType, ConstantPacked -> ConstantVector, and PackedTyID -> VectorTyID. No functional changes. llvm-svn: 34293	2007-02-15 02:26:10 +00:00
Chris Lattner	945e437c65	Generalize TargetData strings, to support more interesting forms of data. Patch by Scott Michel. llvm-svn: 34266	2007-02-14 05:52:17 +00:00
Chris Lattner	ade1c2bb51	eliminate a bunch of vector-related heap traffic llvm-svn: 34222	2007-02-13 05:58:53 +00:00
Chris Lattner	a06a8fd2d7	Eliminate use of ctors that take vectors. llvm-svn: 34219	2007-02-13 02:10:56 +00:00
Chris Lattner	a731513406	stop using methods that take vectors. llvm-svn: 34205	2007-02-12 22:56:41 +00:00
Chris Lattner	32ab643df7	Switch ValueSymbolTable to use StringMap<Value> instead of std::map<std::string, Value> as its main datastructure. There are many improvements yet to be made, but this speeds up opt --std-compile-opts on 447.dealII by 7.3%. llvm-svn: 34193	2007-02-12 05:18:08 +00:00
Chris Lattner	8dd4cae4f8	simplify code by using Value::takeName llvm-svn: 34177	2007-02-11 01:37:51 +00:00
Chris Lattner	6e0123b17f	Simplify code by using Value::takeName llvm-svn: 34176	2007-02-11 01:23:03 +00:00
Chris Lattner	8d4c36bb40	simplify name juggling through the use of Value::takeName. llvm-svn: 34175	2007-02-11 01:08:35 +00:00
Chris Lattner	c473d8e431	Privatize StructLayout::MemberOffsets, adding an accessor llvm-svn: 34156	2007-02-10 19:55:17 +00:00
Chris Lattner	bf6286ba04	Fix Transforms/DeadArgElim/2007-02-07-FuncRename.ll, fallout from PR411. This happened because deadargelim now causes VMCore to auto-rename every function that it hacks arguments out of. Because it hacks arguments out of functions in a non-deterministic order, this caused the resultant numbering to be nondet. The fix is to just be careful to not rename functions! llvm-svn: 34005	2007-02-07 19:31:33 +00:00
Chris Lattner	88051b0fad	shrink vmcore by moving symbol table stripping support out of VMCore into the one IPO pass that uses it. llvm-svn: 33990	2007-02-07 06:22:45 +00:00
Chris Lattner	430c9217f0	redesign the primary datastructure used by mem2reg to eliminate an std::map of std::vector's (ouch!). This speeds up mem2reg by 10% on 176.gcc. llvm-svn: 33974	2007-02-07 01:15:04 +00:00
Chris Lattner	c85e79f3e0	With the last change, we no longer need both directions of mapping from BBNumbers. Instead of using a bi-directional mapping, just use a single densemap. This speeds up mem2reg on 176.gcc by 8%, from 1.3489 to 1.2485s. llvm-svn: 33940	2007-02-05 23:37:20 +00:00
Reid Spencer	557ab15e71	Apply the VISIBILITY_HIDDEN field to the remaining anonymous classes in the Transforms library. This reduces debug library size by 132 KB, debug binary size by 376 KB, and reduces link time for llvm tools slightly. llvm-svn: 33939	2007-02-05 23:32:05 +00:00
Chris Lattner	52da61fb5c	Simplify use of DFBlocks, this makes no noticeable performance difference, but paves the way to eliminate BBNumbers. llvm-svn: 33938	2007-02-05 23:31:26 +00:00
Reid Spencer	193abd95c9	This file should have been removed when -raise was removed. It isn't used any more. llvm-svn: 33937	2007-02-05 23:27:02 +00:00
Chris Lattner	bf67b1229b	Switch InsertedPHINodes back to SmallPtrSet now that the SmallPtrSet::erase bug is fixed. llvm-svn: 33932	2007-02-05 23:11:37 +00:00
Chris Lattner	606dde0093	switch a SmallPtrSet back to an std::set for now, this caused problems. llvm-svn: 33930	2007-02-05 22:28:52 +00:00
Chris Lattner	1ed84bbd2d	switch an std::set over to a SmallPtrSet, speeding up mem2reg 6% on 176.gcc. llvm-svn: 33929	2007-02-05 22:15:21 +00:00
Chris Lattner	70fbb9de4c	switch an std::set over to SmallPtrSet, speeding up mem2reg 3.4% on 176.gcc. llvm-svn: 33928	2007-02-05 22:13:11 +00:00
Chris Lattner	8fbc888d91	eliminate some malloc traffic, this speeds up mem2reg by 3.4%. llvm-svn: 33927	2007-02-05 21:58:48 +00:00
Reid Spencer	ca3bf1ad85	Add missing and needed #include. llvm-svn: 33926	2007-02-05 21:47:39 +00:00
Reid Spencer	35a0718d82	Make the class VISIBILITY_HIDDEN. Reduce lexical size of the anonymous namespace. llvm-svn: 33925	2007-02-05 21:45:12 +00:00
Reid Spencer	1241d6d5ab	For PR411: Adjust to changes in Module interface: getMainFunction() -> getFunction("main") getNamedFunction(X) -> getFunction(X) llvm-svn: 33922	2007-02-05 21:19:13 +00:00
Reid Spencer	3aaaa0b2bd	For PR411: This patch replaces the SymbolTable class with ValueSymbolTable which does not support types planes. This means that all symbol names in LLVM must now be unique. The patch addresses the necessary changes to deal with this and removes code no longer needed as a result. This completes the bulk of the changes for this PR. Some cleanup patches will follow. llvm-svn: 33918	2007-02-05 20:47:22 +00:00
Reid Spencer	e84cf92141	For PR411: This pass is no longer needed. llvm-svn: 33917	2007-02-05 20:41:05 +00:00
Reid Spencer	ba09a3e5f0	Create a pass to strip dead function declarations (prototypes). This is for use by llvm-extract and bugpoint. llvm-svn: 33916	2007-02-05 20:24:25 +00:00
Chris Lattner	83ac5ae9f3	Fix miscompilations of consumer-typeset, telecomm-gsm, and 176.gcc. llvm-svn: 33902	2007-02-05 05:57:49 +00:00
Reid Spencer	a1d35926b7	For PR1177: Revert last patch which caused iteration invalidation. llvm-svn: 33901	2007-02-05 05:23:32 +00:00
Chris Lattner	0a28e90f2c	fix a miscompilation of 176.gcc llvm-svn: 33900	2007-02-05 04:09:35 +00:00
Owen Anderson	f6fa108993	Use DenseMap for pointer->pointer maps. llvm-svn: 33897	2007-02-05 02:39:47 +00:00
Chris Lattner	3e009e8b8f	rewrite shift/shift folding, now that types are not signed. llvm-svn: 33892	2007-02-05 00:57:54 +00:00
Nick Lewycky	15245953a5	Fix indenting, remove tabs. Learn from sext and zext. The destination value falls within the range of the source type. Generalize properties regarding constant ints. Get smarter about marking blocks as unreachable. If 1 >= 2 in order for this block to execute, then it isn't reachable. llvm-svn: 33889	2007-02-04 23:43:05 +00:00
Reid Spencer	3f4e6e84dc	For PR1163: Make the Module's dependent library use a std::vector instead of SetVector adjust #includes in .cpp files because SetVector.h is no longer included. llvm-svn: 33855	2007-02-04 00:40:42 +00:00
Chris Lattner	6c344e56b1	remove some dead code llvm-svn: 33845	2007-02-03 23:28:07 +00:00
Reid Spencer	8de97bba5a	For PR1072: Removing -raise has negligible positive or negative side effects so we are opting to remove it. See the PR for comparison details. llvm-svn: 33844	2007-02-03 23:15:56 +00:00
Chris Lattner	1bfc7ab6a7	Switch inliner over to use DenseMap instead of std::map for ValueMap. This speeds up the inliner 16%. llvm-svn: 33801	2007-02-03 00:08:31 +00:00
Chris Lattner	fc8190dbb7	Switch this back to using an std::map. DenseMap entries are getting invalidated llvm-svn: 33799	2007-02-02 22:36:16 +00:00
Chris Lattner	37d400a83d	Remove more malloc thrashing, this speeds up IPSCCP on kimwitu another 6.7%. llvm-svn: 33796	2007-02-02 21:15:06 +00:00
Chris Lattner	3e667f3e61	Convert an std::set to SmallSet, this speeds up IPSCCP 17% on kimwitu. llvm-svn: 33794	2007-02-02 20:57:39 +00:00
Chris Lattner	0e7ec675da	eliminate a malloc/free for (almost) every GEP processed. This speeds up IPSCCP 3.3% on kimwitu. llvm-svn: 33793	2007-02-02 20:51:48 +00:00
Chris Lattner	067d607e0e	switch hash_map's over to DenseMap in SCCP. This speeds up SCCP by 30% in a release-assert build on kimwitu++. llvm-svn: 33792	2007-02-02 20:38:30 +00:00
Reid Spencer	2f34b98cbf	Remove dead code and fix indentation per Chris' review comments. llvm-svn: 33785	2007-02-02 14:41:37 +00:00
Reid Spencer	0d5f9237b6	Use short form of binary operator create functions. llvm-svn: 33783	2007-02-02 14:08:20 +00:00
Chris Lattner	d5fea61d98	bugfix for reid's shift patch. llvm-svn: 33779	2007-02-02 05:29:55 +00:00
Reid Spencer	2341c22ec7	Changes to support making the shift instructions be true BinaryOperators. This feature is needed in order to support shifts of more than 255 bits on large integer types. This changes the syntax for llvm assembly to make shl, ashr and lshr instructions look like a binary operator: shl i32 %X, 1 instead of shl i32 %X, i8 1 Additionally, this should help a few passes perform additional optimizations. llvm-svn: 33776	2007-02-02 02:16:23 +00:00
Chris Lattner	c904205d28	Fix Transforms/InstCombine/2007-02-01-LoadSinkAlloca.ll, a serious code pessimization where instcombine can sink a load (good for code size) that prevents an alloca from being promoted by mem2reg (bad for everything). llvm-svn: 33771	2007-02-01 22:30:07 +00:00
Reid Spencer	26c642de74	Ensure that ConvertOperandToType generates a result conversion by initializing the Res variable to 0 and asserting it is not zero after the result should have been created. llvm-svn: 33761	2007-02-01 19:14:51 +00:00
Chris Lattner	ce494229a1	Fix bugs in the inliner having to do with single-entry phi nodes and valuemap updating. These were exposed by Devang's recent passmgr changes (with non-default passorderings) because now the inliner can be interleaved with the LCSSA pass. llvm-svn: 33760	2007-02-01 18:48:38 +00:00
Chris Lattner	416a8939c3	remove temporary vectors. llvm-svn: 33715	2007-01-31 20:08:52 +00:00
Chris Lattner	7a63e7a7ad	eliminate temporary vectors llvm-svn: 33713	2007-01-31 20:07:32 +00:00
Chris Lattner	927653f27f	eliminate temporary vectors llvm-svn: 33712	2007-01-31 19:59:55 +00:00
Chris Lattner	4fc18a4cb8	Revert another incorrectly applied chunk, which fixes InstCombine/vec_insert_to_shuffle.ll llvm-svn: 33705	2007-01-31 18:09:17 +00:00
Chris Lattner	f96f4a874c	eliminate temporary vectors llvm-svn: 33693	2007-01-31 04:40:53 +00:00
Chris Lattner	aa17576933	Move symbolic constant folding code to libanalysis. llvm-svn: 33688	2007-01-31 00:53:10 +00:00
Chris Lattner	024f4ab383	Adjust #includes to match movement of constant folding code from transformutils to libanalysis. llvm-svn: 33680	2007-01-30 23:46:24 +00:00
Chris Lattner	2ae054adb0	move a bunch of constant folding code from Transforms/Utils/Local.cpp into libanalysis/ConstantFolding.cpp. llvm-svn: 33679	2007-01-30 23:45:45 +00:00
Chris Lattner	14789a92e1	remove now-dead code. llvm-svn: 33678	2007-01-30 23:29:47 +00:00
Chris Lattner	f94bed3f13	the inliner pass now passes targetdata down through the inliner api's llvm-svn: 33677	2007-01-30 23:28:39 +00:00
Chris Lattner	ad84a730ba	The inliner/cloner can now optionally take TargetData info, which can be used by constant folding. llvm-svn: 33676	2007-01-30 23:22:39 +00:00
Chris Lattner	e3eda25641	pass TD to constant folding apis llvm-svn: 33674	2007-01-30 23:16:15 +00:00
Chris Lattner	0d74d3c09b	use smallvector instead of vector to make constant folding a bit more efficient llvm-svn: 33672	2007-01-30 23:15:19 +00:00
Chris Lattner	6fc4b46d43	adjust to api change llvm-svn: 33671	2007-01-30 23:14:52 +00:00
Chris Lattner	2c4610e4ca	Change constant folding APIs to take an optional TargetData, and change ConstantFoldInstOperands/ConstantFoldCall to take a pointer to an array of operands + size, instead of an std::vector. In some cases, switch to using a SmallVector instead of a vector. This allows us to get rid of some special case gross code that was there to avoid the cost of constructing a vector. llvm-svn: 33670	2007-01-30 23:13:49 +00:00
Chris Lattner	2b15f2ba9d	remove some bits that are not yet meant to land. llvm-svn: 33666	2007-01-30 22:50:32 +00:00
Chris Lattner	4284f6463a	Symbolically evaluate constant expressions like &A[123] - &A[4].f. This occurs in C++ code like: #include <iostream> #include <iterator> int a[] = { 1, 2, 3, 4, 5 }; int main() { using namespace std; copy(a, a + sizeof(a)/sizeof(a[0]), ostream_iterator<int>(cout, "\n")); return 0; } Before we would decide the loop trip count is: sdiv (i32 sub (i32 ptrtoint (i32* getelementptr ([5 x i32]* @a, i32 0, i32 5) to i32), i32 ptrtoint ([5 x i32]* @a to i32)), i32 4) Now we decide it is "5". Amazing. This code will need to be refactored, but I'm doing that as a separate commit. llvm-svn: 33665	2007-01-30 22:32:46 +00:00
Reid Spencer	5301e7c605	For PR1136: Rename GlobalVariable::isExternal as isDeclaration to avoid confusion with external linkage types. llvm-svn: 33663	2007-01-30 20:08:39 +00:00
Nick Lewycky	56639800c9	Simplify names of lattice values. SGTUNE becomes SGT, for example. Fix initializeConstant, now initializeInt. Fixes major performance bottleneck. X == Y || X->DominatedBy(Y) is redundant. Remove the X == Y part. Fix crasher in makeEqual where getOrInsertNode would add a new constant, producing an NE relationship between the two members we're trying to make equal. This now allows us to mark more BBs as unreachable. llvm-svn: 33612	2007-01-29 02:56:54 +00:00
Anton Korobeynikov	037c867b54	Propagate changes from my local tree. This patch includes: 1. New parameter attribute called 'inreg'. It has meaning "place this parameter in registers, if possible". This is some generalization of gcc's regparm(n) attribute. It's currently used only in X86-32 backend. 2. Completely rewritten CC handling/lowering code inside X86 backend. Merged stdcall + c CCs and fastcall + fast CC. 3. Dropped CSRET CC. We cannot add struct return variant for each target-specific CC (e.g. stdcall + csretcc and so on). 4. Instead of CSRET CC introduced 'sret' parameter attribute. Setting in on first attribute has meaning 'This is hidden pointer to structure return. Handle it gently'. 5. Fixed small bug in llvm-extract + add new feature to FunctionExtraction pass, which relinks all internal-linkaged callees from deleted function to external linkage. This will allow further linking everything together. NOTEs: 1. Documentation will be updated soon. 2. llvm-upgrade should be improved to translate csret => sret. Before this, there will be some unexpected test fails. llvm-svn: 33597	2007-01-28 13:31:35 +00:00
Chris Lattner	c8fb6de78c	Fix test/Transforms/InstCombine/2007-01-27-AndICmp.ll, a miscompilation of Mozilla that Anton tracked down. llvm-svn: 33591	2007-01-27 23:08:34 +00:00
Jim Laskey	c56315c2b5	Change the MachineDebugInfo to MachineModuleInfo to better reflect usage for debugging and exception handling. llvm-svn: 33550	2007-01-26 21:22:28 +00:00
Reid Spencer	3ac38e99b9	For PR761: The Module::setEndianness and Module::setPointerSize methods have been removed. Instead you can get/set the DataLayout. Adjust thise accordingly. llvm-svn: 33530	2007-01-26 08:11:39 +00:00
Devang Patel	13058a5ae9	Inherit CallGraphSCCPass directly from Pass. llvm-svn: 33514	2007-01-26 00:47:38 +00:00
Devang Patel	5292e65791	Inherit BasicBlockPass directly from Pass. llvm-svn: 33511	2007-01-25 23:23:25 +00:00
Chris Lattner	79f08506f1	Make llvm-extract preserve the callingconv of prototypes in the extracted code. llvm-svn: 33500	2007-01-25 17:38:26 +00:00
Reid Spencer	31a4ef4dc1	Cleanup checks in the load and store of casted pointer transforms. Two changes: (1) don't special case for i1 any more, (2) use the new TargetData::getTypeSizeInBits method to ensure source and dest are the same bit width. llvm-svn: 33427	2007-01-22 05:51:25 +00:00
Reid Spencer	2eadb5310d	For PR970: Clean up handling of isFloatingPoint() and dealing with PackedType. Patch by Gordon Henriksen! llvm-svn: 33415	2007-01-21 00:29:26 +00:00
Reid Spencer	9a4bed06dd	Revise the store V, (cast P) -> store (cast V) -> P transform. We only want to do this if the src and destination types have the same bit width. This patch uses TargetData::getTypeSizeInBits() instead of making a special case for integer types and avoiding the transform if they don't match. llvm-svn: 33414	2007-01-20 23:35:48 +00:00
Chris Lattner	50ee0e40e5	Teach TargetData to handle 'preferred' alignment for each target, and use these alignment amounts to align scalars when we can. Patch by Scott Michel! llvm-svn: 33409	2007-01-20 22:35:55 +00:00
Owen Anderson	dfd79ad319	Correct a comment. llvm-svn: 33397	2007-01-20 10:07:23 +00:00
Reid Spencer	e928a15c9e	For this transform: store V, (cast P) -> store (cast V), P don't allow the transform if V and the pointer's element type are different width integer types. llvm-svn: 33371	2007-01-19 21:20:31 +00:00
Reid Spencer	a94d394ad2	For PR1043: This is the final patch for this PR. It implements some minor cleanup in the use of IntegerType, to wit: 1. Type::getIntegerTypeMask -> IntegerType::getBitMask 2. Type::IntTy changed to IntegerType from Type* 3. ConstantInt::getType() returns IntegerType* now, not Type* This also fixes PR1120. Patch by Sheng Zhou. llvm-svn: 33370	2007-01-19 21:13:56 +00:00
Chris Lattner	120ab038eb	Fix InstCombine/2007-01-18-VectorInfLoop.ll, a case where instcombine infinitely loops. llvm-svn: 33343	2007-01-18 22:16:33 +00:00
Reid Spencer	c050af9126	Clean up some code around the store V, (cast P) -> store (cast V), P transform. Change some variable names so it is clear what is source and what is dest of the cast. Also, add an assert to ensure that the integer to integer case is asserting if the bitwidths are different. This prevents illegal casts from being formed and catches bitwidth bugs sooner. llvm-svn: 33337	2007-01-18 18:54:33 +00:00
Reid Spencer	a8a1547370	For PR1094: Adjust the use of SetVector for changes in SetVector's interface. Patch by Gordon Henriksen. llvm-svn: 33280	2007-01-17 02:23:37 +00:00
Chris Lattner	479a9fc492	Fix a regression in my isIntegral patch that broke 471.omnetpp. This is because TargetData::getTypeSize() returns the same for i1 and i8. This fix is not right for the full generality of bitwise types, but it fixes the regression. llvm-svn: 33237	2007-01-15 17:55:20 +00:00
Nick Lewycky	6ce36cff3a	Don't print address of ETNode. Print the DFSNumIn which uniquely identifies the basic block and is stable across runs in gdb or valgrind. Make Node::update handle edges which dominate and are tighter than existing edges. Replace makeEqual's "squeeze theorem" code. Fixes miscompilation. Gate the calls to defToOps and opsToDef. Before this, we were getting IG edges about values which weren't even defined in the dominated area. This reduces the size of the IG by about half. llvm-svn: 33236	2007-01-15 14:30:07 +00:00
Chris Lattner	c8dcede292	Implement InstCombine/phi.ll:test7, deletion of trivial value loops for induction variables. llvm-svn: 33234	2007-01-15 07:30:06 +00:00
Chris Lattner	27df1db485	simplify some code now that types are signless llvm-svn: 33232	2007-01-15 07:02:54 +00:00
Chris Lattner	a4beeef76c	delete stores to allocas with one use. This is a trivial form of DSE which often kicks in for ?: expressions. llvm-svn: 33231	2007-01-15 06:51:56 +00:00
Chris Lattner	03c4953cdd	rename Type::isIntegral to Type::isInteger, eliminating the old Type::isInteger. rename Type::getIntegralTypeMask to Type::getIntegerTypeMask. This makes naming much more consistent. For example, there are now no longer any instances of IntegerType that are not considered isInteger! :) llvm-svn: 33225	2007-01-15 02:27:26 +00:00
Chris Lattner	1942249c5b	Eliminate calls to isInteger, generalizing code and tightening checks as needed. llvm-svn: 33218	2007-01-15 01:55:30 +00:00
Chris Lattner	f739d01059	Fix Analysis/Dominators/2006-10-02-BreakCritEdges.ll llvm-svn: 33210	2007-01-15 00:15:09 +00:00
Chris Lattner	6ee923f3bb	instcombine has always been miscompiling fcmp x, x, disregarding possible NANs. This fixes PR1111 and Transforms/InstCombine/2007-01-14-FcmpSelf.ll llvm-svn: 33208	2007-01-14 19:42:17 +00:00
Chris Lattner	9818a6fd76	Fix PR1110 and Analysis/Dominators/2007-01-14-BreakCritEdges.ll by being more careful about unreachable code when updating dominator info. llvm-svn: 33204	2007-01-14 18:33:35 +00:00
Chris Lattner	387bf3f700	Fix Transforms/InstCombine/2007-01-13-ExtCompareMiscompile.ll, which is part of PR1107 llvm-svn: 33185	2007-01-13 23:11:38 +00:00
Reid Spencer	47bb5c996e	Fix indentation to prior level for easier diffs. llvm-svn: 33184	2007-01-13 05:10:53 +00:00
Nick Lewycky	4294446fcb	"Default context" blocks can occur after a non-default one. This meant that properties were being applied where they didn't belong. Fixes crash in new MiBench testcase. Also mark debugging code as such in #ifdef. llvm-svn: 33177	2007-01-13 02:05:28 +00:00
Chris Lattner	ff7434a526	Fix a minor bug handling constant exprs, introduced by a recent patch. llvm-svn: 33175	2007-01-13 00:42:58 +00:00
Chris Lattner	ca82a908e3	fix a bug in a recent patch llvm-svn: 33164	2007-01-13 00:02:49 +00:00
Chris Lattner	f5e5236b57	simplify some code llvm-svn: 33150	2007-01-12 22:51:20 +00:00
Chris Lattner	3b6058c278	Remove over-general comparisons llvm-svn: 33147	2007-01-12 22:49:11 +00:00
Chris Lattner	e3721e3002	eliminate redundant check llvm-svn: 33132	2007-01-12 18:35:11 +00:00
Chris Lattner	15649084e9	Branch conditions must be i1 llvm-svn: 33129	2007-01-12 18:30:11 +00:00
Reid Spencer	7a9c62baa6	For PR1064: Implement the arbitrary bit-width integer feature. The feature allows integers of any bitwidth (up to 64) to be defined instead of just 1, 8, 16, 32, and 64 bit integers. This change does several things: 1. Introduces a new Derived Type, IntegerType, to represent the number of bits in an integer. The Type classes SubclassData field is used to store the number of bits. This allows 2^23 bits in an integer type. 2. Removes the five integer Type::TypeID values for the 1, 8, 16, 32 and 64-bit integers. These are replaced with just IntegerType which is not a primitive any more. 3. Adjust the rest of LLVM to account for this change. Note that while this incremental change lays the foundation for arbitrary bit-width integers, LLVM has not yet been converted to actually deal with them in any significant way. Most optimization passes, for example, will still only deal with the byte-width integer types. Future increments will rectify this situation. llvm-svn: 33113	2007-01-12 07:05:14 +00:00
Reid Spencer	cddc9dfe97	Implement review feedback for the ConstantBool->ConstantInt merge. Chris recommended that getBoolValue be replaced with getZExtValue and that get(bool) be replaced by get(const Type*, uint64_t). This implements those changes. llvm-svn: 33110	2007-01-12 04:24:46 +00:00
Nick Lewycky	ee32ee0250	If we know that it's a constant being casted, propagate through the cast instruction. Doesn't work the other way though (can't recover bits that have been truncated). llvm-svn: 33104	2007-01-12 01:23:53 +00:00
Nick Lewycky	4a74a75bbb	Clean up logic after ConstantBool removal. llvm-svn: 33096	2007-01-12 00:02:12 +00:00
Reid Spencer	542964f55b	Rename BoolTy as Int1Ty. Patch by Sheng Zhou. llvm-svn: 33076	2007-01-11 18:21:29 +00:00
Zhou Sheng	bd23db9968	Remove unnecessary boolean type check. llvm-svn: 33075	2007-01-11 14:38:17 +00:00
Zhou Sheng	75b871fb1e	For PR1043: Merge ConstantIntegral and ConstantBool into ConstantInt. Remove ConstantIntegral and ConstantBool from LLVM. llvm-svn: 33073	2007-01-11 12:24:14 +00:00
Zhou Sheng	691b263e07	Fixed indentation. llvm-svn: 33072	2007-01-11 10:33:26 +00:00
Nick Lewycky	5d6ede524a	Quiet compiler warning. The only reason the function is marked virtual is so that it can be called from inside a debugger. llvm-svn: 33067	2007-01-11 02:38:21 +00:00
Nick Lewycky	2fc338f923	New predicate simplifier! Please do not enable, there is still some known miscompile problem. llvm-svn: 33066	2007-01-11 02:32:38 +00:00
Chris Lattner	8571caa99b	Fix a bug in heap-sra that caused compilation failure of office-ispell. llvm-svn: 33043	2007-01-09 23:29:37 +00:00
Jeff Cohen	223004cd12	Unbreak VC++ build. llvm-svn: 33021	2007-01-08 20:17:17 +00:00
Reid Spencer	8f166b0ef3	Comparison of primitive type sizes should now be done in bits, not bytes. This patch converts getPrimitiveSize to getPrimitiveSizeInBits where it is appropriate to do so (comparison of integer primitive types). llvm-svn: 33012	2007-01-08 16:32:00 +00:00
Reid Spencer	bf96e02a54	For PR1097: Enable complex addressing modes on 64-bit platforms involving two induction variables by keeping a size and scale in 64-bits not 32. Patch by Dan Gohman. llvm-svn: 33011	2007-01-08 16:17:51 +00:00
Reid Spencer	4f98e62831	Types should be const. llvm-svn: 33001	2007-01-07 21:45:41 +00:00
Chris Lattner	950d0e9926	this pass is unused llvm-svn: 32998	2007-01-07 18:12:43 +00:00
Chris Lattner	34acba48cc	Change the interface to Module::getOrInsertFunction to be easier to use,to resolve PR1088, and to help PR411. This simplifies many clients also llvm-svn: 32989	2007-01-07 08:12:01 +00:00
Chris Lattner	d97f1936bb	prepare for adjustment to getOrInsertFunction method llvm-svn: 32985	2007-01-07 07:54:34 +00:00
Chris Lattner	cc4715e06e	relax some types llvm-svn: 32982	2007-01-07 07:22:20 +00:00
Chris Lattner	9641ab26ec	relax types llvm-svn: 32981	2007-01-07 06:59:47 +00:00
Chris Lattner	fbc524fe87	relax some types llvm-svn: 32980	2007-01-07 06:58:05 +00:00
Chris Lattner	0816559b13	add -debug output for -indvars. llvm-svn: 32971	2007-01-07 01:14:12 +00:00
Chris Lattner	7051d758de	Fix regressions in InstCombine/call-cast-target.ll and InstCombine/2003-11-13-ConstExprCastCall.ll llvm-svn: 32959	2007-01-06 19:53:32 +00:00
Reid Spencer	32af9e8cc5	For PR411: Take an incremental step towards type plane elimination. This change separates types from values in the symbol tables by finally making use of the TypeSymbolTable class. This yields more natural interfaces for dealing with types and unclutters the SymbolTable class. llvm-svn: 32956	2007-01-06 07:24:44 +00:00
Chris Lattner	c343a99786	this final call to canLosslesslyBitCastTo is dead, because ValueRequiresCast is only called on integers. llvm-svn: 32949	2007-01-06 02:11:56 +00:00
Chris Lattner	400f959a0c	simplify some more code now that there are not multiple different integer types of the same size llvm-svn: 32948	2007-01-06 02:09:32 +00:00
Chris Lattner	64d87b0215	eliminate some uses of canLosslesslyBitCastTo, this actually makes the code stronger, by nuking relational pointer comparisons with casts. llvm-svn: 32947	2007-01-06 01:45:59 +00:00
Chris Lattner	3fe98ae10a	no need to worry about int vs uint any more. llvm-svn: 32946	2007-01-06 01:37:35 +00:00
Chris Lattner	d7b6ea166d	Implement InstCombine/vec_shuffle.ll:%test7, simplifying shuffles with undef operands. llvm-svn: 32899	2007-01-05 07:36:08 +00:00
Chris Lattner	17c7c030c2	fold things like a^b != c^a -> b != c. This implements InstCombine/xor.ll:test27 llvm-svn: 32893	2007-01-05 03:04:57 +00:00
Chris Lattner	23eb8ec78b	Compile X + ~X to -1. This implements Instcombine/add.ll:test34 llvm-svn: 32890	2007-01-05 02:17:46 +00:00
Reid Spencer	6ff3e73db6	Death to useless bitcast instructions! llvm-svn: 32866	2007-01-04 05:23:51 +00:00
Chris Lattner	806adafd95	Enable a couple xforms for packed vectors (undef | v) -> -1 for packed. llvm-svn: 32858	2007-01-04 02:12:40 +00:00
Jim Laskey	c4ba9c161b	Vectors are not supported by ConstantInt::getAllOnesValue. llvm-svn: 32827	2007-01-03 00:11:03 +00:00
Reid Spencer	e8a74ee5ea	Fix a typo. llvm-svn: 32803	2006-12-31 22:26:06 +00:00
Reid Spencer	c635f47d9a	For PR950: This patch replaces signed integer types with signless ones: 1. [US]Byte -> Int8 2. [U]Short -> Int16 3. [U]Int -> Int32 4. [U]Long -> Int64. 5. Removal of isSigned, isUnsigned, getSignedVersion, getUnsignedVersion and other methods related to signedness. In a few places this warranted identifying the signedness information from other sources. llvm-svn: 32785	2006-12-31 05:48:39 +00:00
Reid Spencer	193df25eb9	For PR1066: Fix this by ensuring that a bitcast is inserted to do sign switching. This is only temporarily needed as the merging of signed and unsigned is next on the SignlessTypes plate. llvm-svn: 32757	2006-12-24 00:40:59 +00:00
Reid Spencer	910f23f7d7	Shut up some compilers that can't accurately analyze variable usage correctly and emit "may be used uninitialized" warnings. llvm-svn: 32756	2006-12-23 19:17:57 +00:00
Reid Spencer	43c77d53ff	For PR1065: Don't allow CmpInst instances to be processed in FoldSelectOpOp because you can't easily swap their operands. llvm-svn: 32753	2006-12-23 18:58:04 +00:00
Reid Spencer	266e42b312	For PR950: This patch removes the SetCC instructions and replaces them with the ICmp and FCmp instructions. The SetCondInst instruction has been removed and been replaced with ICmpInst and FCmpInst. llvm-svn: 32751	2006-12-23 06:05:41 +00:00
Chris Lattner	f171af97d5	add a simple fast-path for dead allocas llvm-svn: 32750	2006-12-22 23:14:42 +00:00
Reid Spencer	a276d0972c	Remove isSigned calls via foreknowledge of main's argument types. llvm-svn: 32730	2006-12-21 07:49:49 +00:00
Reid Spencer	4720d4d9ef	Get rid of a useless if statement whose then and else blocks were identical. llvm-svn: 32729	2006-12-21 07:15:54 +00:00
Chris Lattner	1847f6ddbd	handle undef values much more carefully: generalize the resolveundefbranches code to handle instructions as well, so that we properly fold things like X & undef -> 0. This fixes Transforms/SCCP/2006-12-19-UndefBug.ll llvm-svn: 32715	2006-12-20 06:21:33 +00:00
Chris Lattner	575d3218ab	switch statistics over to not use static ctors. llvm-svn: 32709	2006-12-19 23:16:47 +00:00
Chris Lattner	1fa216f572	eliminate static ctor from example. llvm-svn: 32696	2006-12-19 22:24:09 +00:00
Chris Lattner	40b29cac01	remove dead statistic llvm-svn: 32695	2006-12-19 22:23:21 +00:00
Chris Lattner	45f966d80f	switch more statistics over to STATISTIC, eliminating static ctors. Also, delete some dead ones. llvm-svn: 32694	2006-12-19 22:17:40 +00:00
Chris Lattner	1631bcb1d4	Eliminate static ctors due to Statistic objects llvm-svn: 32693	2006-12-19 22:09:18 +00:00
Chris Lattner	0e5255bdc6	Convert more Statistic's over to STATISTIC llvm-svn: 32692	2006-12-19 21:49:03 +00:00
Chris Lattner	79a42ac941	Switch over Transforms/Scalar to use the STATISTIC macro. For each statistic converted, we lose a static initializer. This also allows GCC to emit warnings about unused statistics. llvm-svn: 32690	2006-12-19 21:40:18 +00:00
Reid Spencer	668d90f289	Convert the last uses of CastInst::createInferredCast to a normal cast creation. These changes are still temporary but at least this pushes knowledge of signedness out closer to where it can be determined properly and allows signedness to be removed from VMCore. llvm-svn: 32654	2006-12-18 08:47:13 +00:00
Reid Spencer	b83593e3ea	Convert the last use of two-argument ConstantExpr::getCast into another form so we can remove that method from ConstantExpr. llvm-svn: 32652	2006-12-18 08:16:27 +00:00
Bill Wendling	a77f14265b	Added an automatic cast to "std::ostream" etc. from OStream. We then can rework the hacks that had us passing OStream in. We pass in std::ostream instead, check for null, and then dispatch to the correct print() method. llvm-svn: 32636	2006-12-17 05:15:13 +00:00
Chris Lattner	fd5f03ec3f	when inserting a dummy argument to work-around the CBE not supporting zero arg vararg functions, pass undef instead of 'int 0', which is cheaper. llvm-svn: 32634	2006-12-16 21:21:53 +00:00
Chris Lattner	8f7b775bf4	re-enable a temporarily-reverted patch llvm-svn: 32595	2006-12-15 07:32:38 +00:00
Reid Spencer	74a528b427	Fix a bug in EvaluateInDifferentType. The type of operand should not be used to determine whether a ZExt or SExt cast is performed. Instead, pass an "isSigned" bool to the function and determine its value from the opcode of the cast involved. Also, clean up some cruft from previous patches. llvm-svn: 32548	2006-12-13 18:21:21 +00:00
Reid Spencer	2a499b0b6c	Implement review feedback. Most of this has to do with removing unnecessary cast instructions. A few are bug fixes. llvm-svn: 32544	2006-12-13 17:19:09 +00:00
Reid Spencer	612683b0d7	For mul transforms, when checking for a cast from bool as either operand, make sure to also check that it is a zext from bool, not any other cast operation type. llvm-svn: 32539	2006-12-13 08:33:33 +00:00
Reid Spencer	799b5bfc71	Fix and/or/xor (cast A), (cast B) --> cast (and/or/xor A, B) The cast patch introduced the possibility that the wrong cast opcode could be used and that this transform could trigger on different kinds of cast operations. This patch rectifies that. llvm-svn: 32538	2006-12-13 08:27:15 +00:00
Reid Spencer	df1f19a8ef	Change the interface to SCEVExpander::InsertCastOfTo to take a cast opcode so the decision of which opcode to use is pushed upward to the caller. Adjust the callers to pass the expected opcode. llvm-svn: 32535	2006-12-13 08:06:42 +00:00
Reid Spencer	a730cf80d7	Fix some casts. isdigit(c) returns 0 or 1, not 0 or -1 llvm-svn: 32534	2006-12-13 08:04:32 +00:00
Chris Lattner	7c1dff99dc	revert my recent int<->fp and vector union promotion changes, they expose obscure bugs affecting the X86 code generator. I will reenable this when fixed. llvm-svn: 32524	2006-12-13 02:26:45 +00:00
Reid Spencer	bfe26ffcfc	Replace CastInst::createInferredCast calls with more accurate cast creation calls. llvm-svn: 32521	2006-12-13 00:50:17 +00:00
Reid Spencer	bb65ebf9a1	Replace inferred getCast(V,Ty) calls with more strict variants. Rename getZeroExtend and getSignExtend to getZExt and getSExt to match the casting mnemonics in the rest of LLVM. llvm-svn: 32514	2006-12-12 23:36:14 +00:00
Chris Lattner	2dc148e89d	this can be trunc or bitcast, per line 3092. llvm-svn: 32487	2006-12-12 19:11:20 +00:00
Chris Lattner	ade1f6894d	Fix regression on 400.perlbench last night. llvm-svn: 32486	2006-12-12 18:41:03 +00:00
Reid Spencer	13bc5d7b57	Fix numerous inferred casts. llvm-svn: 32479	2006-12-12 09:18:51 +00:00
Reid Spencer	41cb269a2b	Fix the casting for the computation of the Malloc size. llvm-svn: 32477	2006-12-12 09:17:08 +00:00
Reid Spencer	b341b0861d	Change inferred getCast into specific getCast. Passes all tests. llvm-svn: 32469	2006-12-12 05:05:00 +00:00
Chris Lattner	6e5fe376ec	Patch for PR1045 and Transforms/ScalarRepl/2006-12-11-SROA-Crash.ll llvm-svn: 32468	2006-12-12 04:24:41 +00:00
Chris Lattner	e810140c4b	trunc to integer, not to FP. llvm-svn: 32426	2006-12-11 01:17:00 +00:00
Chris Lattner	23f4b68f7e	implement promotion of unions containing two packed types of the same width. This implements Transforms/ScalarRepl/union-packed.ll llvm-svn: 32422	2006-12-11 00:35:08 +00:00
Chris Lattner	216c3028e6	* Eliminate calls to CastInst::createInferredCast. * Add support for promoting unions with fp values in them. This produces our new int<->fp bitcast instructions, implementing Transforms/ScalarRepl/union-fp-int.ll As an example, this allows us to compile this: union intfloat { int i; float f; }; float invsqrt(const float arg_x) { union intfloat x = { .f = arg_x }; const float xhalf = arg_x * 0.5f; x.i = 0x5f3759df - (x.i >> 1); return x.f * (1.5f - xhalf * x.f * x.f); } into: _invsqrt: movss 4(%esp), %xmm0 movd %xmm0, %eax sarl %eax movl $1597463007, %ecx subl %eax, %ecx movd %ecx, %xmm1 mulss LCPI1_0, %xmm0 mulss %xmm1, %xmm0 movss LCPI1_1, %xmm2 mulss %xmm1, %xmm0 subss %xmm0, %xmm2 movl 8(%esp), %eax mulss %xmm2, %xmm1 movss %xmm1, (%eax) ret instead of: _invsqrt: subl $4, %esp movss 8(%esp), %xmm0 movss %xmm0, (%esp) movl (%esp), %eax movl $1597463007, %ecx sarl %eax subl %eax, %ecx movl %ecx, (%esp) mulss LCPI1_0, %xmm0 movss (%esp), %xmm1 mulss %xmm1, %xmm0 mulss %xmm1, %xmm0 movss LCPI1_1, %xmm2 subss %xmm0, %xmm2 mulss %xmm2, %xmm1 movl 12(%esp), %eax movss %xmm1, (%eax) addl $4, %esp ret llvm-svn: 32418	2006-12-10 23:56:50 +00:00
Reid Spencer	efe5c862f1	Incorporate any changes in the successor blocks into the result of MarkAliveBlocks. llvm-svn: 32375	2006-12-08 21:52:01 +00:00
Bill Wendling	9bfb1e1f29	What should be the last unnecessary <iostream>s in the library. llvm-svn: 32333	2006-12-07 22:21:48 +00:00
Bill Wendling	22e978a736	Removing even more <iostream> includes. llvm-svn: 32320	2006-12-07 20:04:42 +00:00
Bill Wendling	f3baad3ee1	Changed llvm_ostream et al. to OStream. llvm_cerr, llvm_cout, llvm_null, are now cerr, cout, and NullStream resp. llvm-svn: 32298	2006-12-07 01:30:32 +00:00
Reid Spencer	4ae56f3086	Update ConstantIntegral Max/Min tests for new interface. llvm-svn: 32288	2006-12-06 20:39:57 +00:00
Chris Lattner	f06bb658a8	add missing #include llvm-svn: 32280	2006-12-06 18:14:47 +00:00
Chris Lattner	700b873130	Detemplatize the Statistic class. The only type it is instantiated with is 'unsigned'. llvm-svn: 32279	2006-12-06 17:46:33 +00:00
Chris Lattner	edcc8c2f8b	Remove the 'printname' argument to WriteAsOperand. It is always true, and passing false would make the asmprinter fail anyway. llvm-svn: 32264	2006-12-06 06:16:21 +00:00
Chris Lattner	ec58903623	counter should be unsigned. llvm-svn: 32252	2006-12-06 01:50:04 +00:00
Chris Lattner	c209b584eb	add an instcombine xform. This speeds up 462.libquantum from 9.78s to 7.48s. This regression is due to unforeseen consequences of the cast patch. llvm-svn: 32209	2006-12-05 01:26:29 +00:00
Devang Patel	21efc73161	SCCP does not handle Packed Type properly. Disable Packed Type handling for now. llvm-svn: 32208	2006-12-04 23:54:59 +00:00
Reid Spencer	14fbdd5523	Update call to CastInst::getCastOpcode for its new signature. llvm-svn: 32166	2006-12-04 02:48:01 +00:00
Jeff Cohen	cc08c83186	Unbreak VC++ build. llvm-svn: 32113	2006-12-02 02:22:01 +00:00
Chris Lattner	7a002fec1f	disable transformations that are invalid for fp vectors. This fixes Transforms/InstCombine/2006-12-01-BadFPVectorXform.ll llvm-svn: 32112	2006-12-02 00:13:08 +00:00
Reid Spencer	ad05ee9f39	Remove 4 FIXMEs to hack around cast-to-bool problems which no longer exist. llvm-svn: 32051	2006-11-30 23:13:36 +00:00
Chris Lattner	c8978c5272	make it clear that this is always a zext llvm-svn: 32044	2006-11-30 17:35:08 +00:00
Chris Lattner	3ede00b376	One more bugfix, 3 cases of making casts explicit. llvm-svn: 32043	2006-11-30 17:32:29 +00:00
Chris Lattner	0390b9e6bb	Fix a bug in globalopt due to the recent cast patch. llvm-svn: 32042	2006-11-30 17:26:08 +00:00
Chris Lattner	960acb008b	implement cast.ll:test35. With this, we recognize: unsigned short swp(unsigned short a) { return ((a & 0xff00) >> 8 | (a & 0x00ff) << 8); } as an idiom for bswap. llvm-svn: 32011	2006-11-29 07:18:39 +00:00
Chris Lattner	d747f015ff	Teach instcombine to turn trunc(srl x, c) -> srl (trunc(x), c) when safe. This implements InstCombine/cast.ll:test34. It fires hundreds of times on 176.gcc. llvm-svn: 32009	2006-11-29 07:04:07 +00:00
Chris Lattner	a7942b7bbd	Implement Regression/Transforms/InstCombine/bswap-fold.ll, folding seteq (bswap(x)), c -> seteq(x,bswap(c)) llvm-svn: 32006	2006-11-29 05:02:16 +00:00
Reid Spencer	a736fdf216	Join a split line. llvm-svn: 31996	2006-11-29 01:11:01 +00:00
Reid Spencer	116ad83aa0	Undo the last patch until 253.perlbmk passes with these changes. llvm-svn: 31977	2006-11-28 20:23:51 +00:00
Reid Spencer	59fe2d89ae	Remove 4 FIXME's from the CAST patch now that the back end is correctly producing code for "trunc to bool". This passes all tests on Linux. llvm-svn: 31963	2006-11-28 07:23:01 +00:00
Chris Lattner	8e9a7b73d9	Fix PR1014 and InstCombine/2006-11-27-XorBug.ll. llvm-svn: 31941	2006-11-27 19:55:07 +00:00
Reid Spencer	6c38f0bb07	For PR950: The long awaited CAST patch. This introduces 12 new instructions into LLVM to replace the cast instruction. Corresponding changes throughout LLVM are provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the exception of 175.vpr which fails only on a slight floating point output difference. llvm-svn: 31931	2006-11-27 01:05:10 +00:00
Bill Wendling	4ae401074c	Remove #include <iostream> and use llvm_* streams instead. llvm-svn: 31925	2006-11-26 10:17:54 +00:00
Bill Wendling	8f13b5c43e	Replace #include <iostream> with llvm_* streams. llvm-svn: 31924	2006-11-26 10:02:32 +00:00
Bill Wendling	5dbf43c983	Removed #include <iostream> and replaced with llvm_* streams. llvm-svn: 31923	2006-11-26 09:46:52 +00:00
Bill Wendling	a7459ca813	Removed #include <iostream> and used the llvm_cerr/DOUT streams instead. llvm-svn: 31922	2006-11-26 09:17:06 +00:00
Nick Lewycky	09b7e4d3ab	Update to new predicate simplifier VRP design. Fixes PR966 and PR967. Remove predicate simplifier from default gcc3 pipeline. New design is too slow to enable by default. Add new testcases for problems encountered in development. llvm-svn: 31895	2006-11-22 23:49:16 +00:00
Chris Lattner	ec45a4c88c	This xform is handled by FoldOpIntoPhi in visitCastInst in a more elegant way. llvm-svn: 31889	2006-11-21 17:05:13 +00:00
Chris Lattner	95adf8f1da	Do not convert massive blocks on phi nodes into select statements. Instead only do these transformations if there are a small number of phi's. This speeds up Ptrdist/ks from 2.35s to 2.19s on my mac pro. llvm-svn: 31853	2006-11-18 19:19:36 +00:00
Chris Lattner	21eba2da26	If an indvar with a variable stride is used by the exit condition, go ahead and handle it like constant stride vars. This fixes some bad codegen in variable stride cases. For example, it compiles this: void foo(int k, int i) { for (k=i+i; k <= 8192; k+=i) flags2[k] = 0; } to: LBB1_1: #bb.preheader movl %eax, %ecx addl %ecx, %ecx movl L_flags2$non_lazy_ptr, %edx LBB1_2: #bb movb $0, (%edx,%ecx) addl %eax, %ecx cmpl $8192, %ecx jle LBB1_2 #bb LBB1_5: #return ret or (if the array is local and we are in dynamic-nonpic or static mode): LBB3_2: #bb movb $0, _flags2(%ecx) addl %eax, %ecx cmpl $8192, %ecx jle LBB3_2 #bb and: lis r2, ha16(L_flags2$non_lazy_ptr) lwz r2, lo16(L_flags2$non_lazy_ptr)(r2) slwi r3, r4, 1 LBB1_2: ;bb li r5, 0 add r6, r4, r3 stbx r5, r2, r3 cmpwi cr0, r6, 8192 bgt cr0, LBB1_5 ;return instead of: leal (%eax,%eax,2), %ecx movl %eax, %edx addl %edx, %edx addl L_flags2$non_lazy_ptr, %edx xorl %esi, %esi LBB1_2: #bb movb $0, (%edx,%esi) movl %eax, %edi addl %esi, %edi addl %ecx, %esi cmpl $8192, %esi jg LBB1_5 #return and: lis r2, ha16(L_flags2$non_lazy_ptr) lwz r2, lo16(L_flags2$non_lazy_ptr)(r2) mulli r3, r4, 3 slwi r5, r4, 1 li r6, 0 add r2, r2, r5 LBB1_2: ;bb li r5, 0 add r7, r3, r6 stbx r5, r2, r6 add r6, r4, r6 cmpwi cr0, r7, 8192 ble cr0, LBB1_2 ;bb This speeds up Benchmarks/Shootout/sieve from 8.533s to 6.464s and implements LoopStrengthReduce/var_stride_used_by_compare.ll llvm-svn: 31809	2006-11-17 06:17:33 +00:00
Chris Lattner	e3a63d136d	Fix a gcc 4.2 warning. llvm-svn: 31751	2006-11-15 04:53:24 +00:00
Chris Lattner	f05d69ae72	implement InstCombine/shift-simplify.ll by transforming: (X >> Z) op (Y >> Z) -> (X op Y) >> Z for all shifts and all ops={and/or/xor}. llvm-svn: 31729	2006-11-14 07:46:50 +00:00
Chris Lattner	d12a4bf799	implement InstCombine/and-compare.ll:test1. This compiles: typedef struct { unsigned prefix : 4; unsigned code : 4; unsigned unsigned_p : 4; } tree_common; int foo(tree_common *a, tree_common *b) { return a->code == b->code; } into: _foo: movl 4(%esp), %eax movl 8(%esp), %ecx movl (%eax), %eax xorl (%ecx), %eax # TRUNCATE movb %al, %al shrb $4, %al testb %al, %al sete %al movzbl %al, %eax ret instead of: _foo: movl 8(%esp), %eax movb (%eax), %al shrb $4, %al movl 4(%esp), %ecx movb (%ecx), %cl shrb $4, %cl cmpb %al, %cl sete %al movzbl %al, %eax ret saving one cycle by eliminating a shift. llvm-svn: 31727	2006-11-14 06:06:06 +00:00
Chris Lattner	d4dee405cb	Fix InstCombine/2006-11-10-ashr-miscompile.ll a miscompilation introduced by the shr -> [al]shr patch. This was reduced from 176.gcc. llvm-svn: 31653	2006-11-10 23:38:52 +00:00
Chris Lattner	82928ca290	second patch to fix PR992/993. llvm-svn: 31610	2006-11-09 23:36:08 +00:00
Chris Lattner	924f4fee8b	Minimal patch to fix PR992/PR993 llvm-svn: 31608	2006-11-09 23:17:45 +00:00
Chris Lattner	6e2c15c158	Teach ShrinkDemandedConstant how to handle X+C. This implements: add.ll:test33, add.ll:test34, shift-sra.ll:test2 llvm-svn: 31586	2006-11-09 05:12:27 +00:00
Chris Lattner	4f218d56f5	reenable factoring of GEP expressions, being more precise about the case where it is bad to do. llvm-svn: 31563	2006-11-08 19:42:28 +00:00
Chris Lattner	cd62f11227	make this code more efficient by not creating a phi node we are just going to delete in the first place. This also makes it simpler. llvm-svn: 31562	2006-11-08 19:29:23 +00:00
Jim Laskey	61feeb90f9	Remove redundant <cmath>. llvm-svn: 31561	2006-11-08 19:16:44 +00:00
Chris Lattner	a3acfca920	disable this factoring optzn for GEPs for now, this severely pessimizes some loops. llvm-svn: 31560	2006-11-08 18:49:31 +00:00
Reid Spencer	fdff938a7e	For PR950: This patch converts the old SHR instruction into two instructions, AShr (Arithmetic) and LShr (Logical). The Shr instructions now are not dependent on the sign of their operands. llvm-svn: 31542	2006-11-08 06:47:33 +00:00
Chris Lattner	4967f6ddea	scalarrepl should not split the two elements of the vsiidx array: int func(vFloat v0, vFloat v1) { int ii; vSInt32 vsiidx[2]; vsiidx[0] = _mm_cvttps_epi32(v0); vsiidx[1] = _mm_cvttps_epi32(v1); ii = ((int *) vsiidx)[4]; return ii; } This fixes Transforms/ScalarRepl/2006-11-07-InvalidArrayPromote.ll llvm-svn: 31524	2006-11-07 22:42:47 +00:00
Jeff Cohen	7d6f3db3e2	Unbreak VC++ build. llvm-svn: 31464	2006-11-05 19:31:28 +00:00
Nick Lewycky	67bad5adbc	Remove commented line from earlier debugging. llvm-svn: 31460	2006-11-05 14:19:40 +00:00
Andrew Lenharth	0ebb0b03e6	The wrong parameter was being tested to determine i32 vs i64 llvm-svn: 31431	2006-11-03 22:45:50 +00:00
Chris Lattner	62e2cad6b8	remove dead code llvm-svn: 31398	2006-11-03 01:34:58 +00:00
Reid Spencer	de46e48420	For PR786: Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting fall out by removing unused variables. Remaining warnings have to do with unused functions (I didn't want to delete code without review) and unused variables in generated code. Maintainers should clean up the remaining issues when they see them. All changes pass DejaGnu tests and Olden. llvm-svn: 31380	2006-11-02 20:25:50 +00:00
Reid Spencer	7eb55b395f	For PR950: Replace the REM instruction with UREM, SREM and FREM. llvm-svn: 31369	2006-11-02 01:53:59 +00:00
Devang Patel	2cb4f83b38	There can be more than one PHINode at the start of the block. llvm-svn: 31362	2006-11-01 23:04:45 +00:00
Devang Patel	44519a8feb	Handle PHINode with only one incoming value. This fixes http://llvm.org/bugs/show_bug.cgi?id=979 llvm-svn: 31358	2006-11-01 22:26:43 +00:00
Chris Lattner	5a0bd61c64	Fix GlobalOpt/2006-11-01-ShrinkGlobalPhiCrash.ll and McGill/chomp llvm-svn: 31352	2006-11-01 18:03:33 +00:00
Chris Lattner	eebea43b48	Factor gep instructions through phi nodes. llvm-svn: 31346	2006-11-01 07:43:41 +00:00
Chris Lattner	14f82c7dcd	Turn a phi of many loads into a phi of the address and a single load of the result. This can significantly shrink code and exposes identities more aggressively. llvm-svn: 31344	2006-11-01 07:13:54 +00:00
Chris Lattner	dc826fc068	Fix a bug in the previous patch llvm-svn: 31342	2006-11-01 04:55:47 +00:00
Chris Lattner	cadac0c5c3	Fold things like "phi [add (a,b), add(c,d)]" into two phi's and one add. This triggers thousands of times on multisource. llvm-svn: 31341	2006-11-01 04:51:18 +00:00
Chris Lattner	984d6e1669	generalize the fix for PR977 to also fix Transforms/LCSSA/2006-10-31-UnreachableBlock-2.ll llvm-svn: 31317	2006-10-31 18:56:48 +00:00
Chris Lattner	eb68f080ef	Fix PR977 and Transforms/LCSSA/2006-10-31-UnreachableBlock.ll llvm-svn: 31315	2006-10-31 17:52:18 +00:00
Chris Lattner	fc519cd2d1	Fix SimplifyCFG/2006-10-29-InvokeCrash.ll, a crash compiling QT. llvm-svn: 31284	2006-10-29 21:21:20 +00:00
Chris Lattner	3e763f5708	add option to isCriticalEdge llvm-svn: 31258	2006-10-28 06:58:17 +00:00
Chris Lattner	a6eb7e0803	break edges more intelligently llvm-svn: 31257	2006-10-28 06:45:33 +00:00
Chris Lattner	80ea207bfa	Expose a smarter way to break critical edges. llvm-svn: 31256	2006-10-28 06:44:56 +00:00
Chris Lattner	400ac04e64	SplitCriticalEdge checks to see if an edge is critical, don't check twice llvm-svn: 31255	2006-10-28 06:38:14 +00:00
Chris Lattner	5191c65485	prepare for a change I'm about to make llvm-svn: 31248	2006-10-28 00:59:20 +00:00
Reid Spencer	00c482b7a2	Simplify code a bit by changing instances of: InsertNewInstBefore(new CastInst(Val, ValTy, Val->GetName()), I) into: InsertCastBefore(Val, ValTy, I) llvm-svn: 31204	2006-10-26 19:19:06 +00:00
Reid Spencer	7e80b0b31e	For PR950: Make necessary changes to support DIV -> [SUF]Div. This changes llvm to have three division instructions: signed, unsigned, floating point. The bytecode and assembler are backwards compatible, however. llvm-svn: 31195	2006-10-26 06:15:43 +00:00
Nick Lewycky	5b979ae531	Fix 2006-10-25-AddSetCC. A relational operator (like setlt) can never produce an EQ property. llvm-svn: 31193	2006-10-26 02:35:18 +00:00
Nick Lewycky	9d17c82a26	Resurrect r1.25. Fix and comment the "or", "and" and "xor" transformations. llvm-svn: 31189	2006-10-25 23:48:24 +00:00
Chris Lattner	53f53db919	hide symbols properly llvm-svn: 31184	2006-10-25 21:14:31 +00:00
Chris Lattner	ebb1ad4382	Fix Transforms/ScalarRepl/2006-10-23-PointerUnionCrash.ll llvm-svn: 31151	2006-10-24 06:26:32 +00:00
Chris Lattner	dc7b9beb20	Revert back to r1.21, which was the last revision of predsimplify that passes llvm-gcc bootstrap. llvm-svn: 31146	2006-10-24 00:36:21 +00:00
Chris Lattner	fe7b6ef346	Handle fallout from the recent branch-on-undef changes. This fixes Prolangs-C/agrep and SCCP/2006-10-23-IPSCCP-Crash.ll llvm-svn: 31132	2006-10-23 18:57:02 +00:00
Nick Lewycky	53b4158448	Remove the Backwards operation. Resolving now works at the time when a property is added by running through the list of uses of the value and adding resolved properties to the property set. llvm-svn: 31126	2006-10-23 01:56:02 +00:00
Nick Lewycky	6f5c30fcec	Fix similar missing optimization opportunity in XOR. llvm-svn: 31123	2006-10-22 22:22:58 +00:00
Nick Lewycky	af2b0571d0	Whoops! Add missing NULL check. llvm-svn: 31121	2006-10-22 21:38:24 +00:00
Nick Lewycky	2c734f3fc1	Handle "if ((x|y) != 0)" for ints like we do for bools. Fixes missed optimization opportunity pointed out by Chris Lattner. llvm-svn: 31118	2006-10-22 21:36:41 +00:00
Nick Lewycky	f345008339	AllocaInst can't return a null pointer. Fixes missed optimization opportunity pointed out by Andrew Lewycky. llvm-svn: 31115	2006-10-22 19:53:27 +00:00
Chris Lattner	250eff20da	Add a workaround for PR962, disabling the more aggressive form of this transformation. This speeds up a C++ app 2.25x. llvm-svn: 31113	2006-10-22 18:42:26 +00:00
Chris Lattner	af17096dcf	3 Changes: 1. Better document what is going on here. 2. Only hack on one branch per iteration, making the results less conservative. 3. Handle the problematic case by marking edges executable instead of by playing with value lattice states. This is far less pessimistic, and fixes SCCP/ipsccp-gvar.ll. llvm-svn: 31106	2006-10-22 05:59:17 +00:00
Chris Lattner	af1222c1a7	llvm-extract should remove module-level asm llvm-svn: 31086	2006-10-20 21:35:41 +00:00
Chris Lattner	319c86fd38	Fix an ugly problem in SCCP. This fixes Benchmarks/Misc-C++/mandel-text.cpp llvm-svn: 31073	2006-10-20 20:19:08 +00:00
Chris Lattner	5dee3b2526	Fix miscompilation of MallocBench/espresso which code review pointed out but apparently didn't make it into the final patch. llvm-svn: 31070	2006-10-20 18:20:21 +00:00
Reid Spencer	e0fc4dfc22	For PR950: This patch implements the first increment for the Signless Types feature. All changes pertain to removing the ConstantSInt and ConstantUInt classes in favor of just using ConstantInt. llvm-svn: 31063	2006-10-20 07:07:24 +00:00
Devang Patel	5d417e35bc	While creating mask, use 1ULL instead of 1. llvm-svn: 31062	2006-10-20 01:16:56 +00:00
Chris Lattner	b8b11599dd	Fix SimplifyCFG/2006-10-19-UncondDiv.ll by disabling a bad xform. llvm-svn: 31061	2006-10-20 00:42:07 +00:00
Devang Patel	5d6df959e3	It is OK to remove extra cast if operation is EQ/NE even though source and destination sign may not match but other conditions are met. llvm-svn: 31056	2006-10-19 20:59:13 +00:00
Devang Patel	88afd00d1d	Typo Typo. llvm-svn: 31055	2006-10-19 19:21:36 +00:00
Devang Patel	472530d9fc	Typo. llvm-svn: 31054	2006-10-19 19:05:38 +00:00
Devang Patel	b42aef4925	Fix bug in PR454 resolution. Added new test case. This fixes llvmAsmParser.cpp miscompile by llvm on PowerPC Darwin. llvm-svn: 31053	2006-10-19 18:54:08 +00:00
Reid Spencer	3c514959dd	Undo Chris' last patch, it caused a regression. llvm-svn: 30991	2006-10-16 23:08:08 +00:00
Chris Lattner	9a1c7dd27a	fix a buggy check that accidentally disabled this xform llvm-svn: 30967	2006-10-15 22:42:15 +00:00
Nick Lewycky	77e030bca9	Replace custom dispatch code with two uses of InstVisitor. Improves compile-time performance. llvm-svn: 30896	2006-10-12 02:02:44 +00:00
Chris Lattner	41b442242d	Implement SROA of unions with mixed pointers/integers in them. This implements PR892 and Transforms/ScalarRepl/union-pointer.ll:test2 llvm-svn: 30825	2006-10-08 23:53:04 +00:00
Chris Lattner	05f8272afa	Implement Transforms/ScalarRepl/union-pointer.ll:test llvm-svn: 30823	2006-10-08 23:28:04 +00:00
Chris Lattner	2deeaeaca7	add a new SimplifyDemandedVectorElts method, which works similarly to SimplifyDemandedBits. The idea is that some operations can be simplified if not all of the computed elements are needed. Some targets (like x86) have a large number of intrinsics that operate on a single element, but pass other elts through unmodified. If those other elements are not needed, the intrinsics can be simplified to scalar operations, and insertelement ops can be removed. This turns (f.e.): ushort %Convert_sse(float %f) { %tmp = insertelement <4 x float> undef, float %f, uint 0 ; <<4 x float>> [#uses=1] %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1 ; <<4 x float>> [#uses=1] %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2 ; <<4 x float>> [#uses=1] %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3 ; <<4 x float>> [#uses=1] %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } into: ushort %Convert_sse(float %f) { entry: %tmp28 = sub float %f, 1.000000e+00 ; <float> [#uses=1] %tmp37 = mul float %tmp28, 5.000000e-01 ; <float> [#uses=1] %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0 ; <<4 x float>> [#uses=1] %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > ) ; <<4 x float>> [#uses=1] %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 ) ; <int> [#uses=1] %tmp69 = cast int %tmp to ushort ; <ushort> [#uses=1] ret ushort %tmp69 } which improves codegen from: _Convert_sse: movss LCPI1_0, %xmm0 movss 4(%esp), %xmm1 subss %xmm0, %xmm1 movss LCPI1_1, %xmm0 mulss %xmm0, %xmm1 movss LCPI1_2, %xmm0 minss %xmm0, %xmm1 xorps %xmm0, %xmm0 maxss %xmm0, %xmm1 cvttss2si %xmm1, %eax andl $65535, %eax ret to: _Convert_sse: movss 4(%esp), %xmm0 subss LCPI1_0, %xmm0 mulss LCPI1_1, %xmm0 movss LCPI1_2, %xmm1 minss %xmm1, %xmm0 xorps %xmm1, %xmm1 maxss %xmm1, %xmm0 cvttss2si %xmm0, %eax andl $65535, %eax ret This is just a first step, it can be extended in many ways. Testcase here: Transforms/InstCombine/vec_demanded_elts.ll llvm-svn: 30752	2006-10-05 06:55:50 +00:00
Chris Lattner	52886e72d7	This case isn't implemented yet. It seems unlikely to be needed, but if it ever is, we want to get an assert instead of silent bad codegen. llvm-svn: 30716	2006-10-04 04:58:58 +00:00
Nick Lewycky	58a910dff5	Simplify logic further. Ensure that we copy KnownProperties before calling visitBasicBlock, else we may leak properties into blocks where they don't belong. llvm-svn: 30705	2006-10-03 17:36:01 +00:00
Nick Lewycky	1d00f3e144	Simplify, now that predsimplify depends on break-crit-edges. Fix SwitchInst where dest-block is the same as one of the cases. llvm-svn: 30700	2006-10-03 15:19:11 +00:00
Nick Lewycky	755f801adc	Move break-crit-edges before the predicate simplifier. Allows us to optimize in more cases. llvm-svn: 30699	2006-10-03 14:52:23 +00:00
Evan Cheng	ff510a58c2	Revert previous patch. Still breaking things. llvm-svn: 30698	2006-10-03 07:26:07 +00:00
Chris Lattner	8aca0ee8c3	Fix PR932 and Analysis/Dominators/2006-10-02-BreakCritEdges.ll: The critical edge block dominates the dest block if the destblock dominates all edges other than the one incoming from the critical edge. llvm-svn: 30696	2006-10-03 07:02:02 +00:00
Chris Lattner	7d19067c42	Fix a bug from r1.391 of this file, where we checked the size instead of the alignment when promoting allocations. This implements InstCombine/cast.ll:test32 llvm-svn: 30682	2006-10-01 19:40:58 +00:00
Chris Lattner	4797c891c0	Fix debug output llvm-svn: 30680	2006-09-30 23:32:50 +00:00
Chris Lattner	24d3d4280a	Implement SRA of heap allocations. llvm-svn: 30679	2006-09-30 23:32:09 +00:00
Chris Lattner	80a01ef6f0	Add some ifdef'd out debug info llvm-svn: 30676	2006-09-30 19:40:30 +00:00
Chris Lattner	6ab03f6a08	Eliminate ConstantBool::True and ConstantBool::False. Instead, provide ConstantBool::getTrue() and ConstantBool::getFalse(). llvm-svn: 30665	2006-09-28 23:35:22 +00:00
Owen Anderson	7cb6809c25	Another attempt at making ArgPromotion smarter. This patch no longer breaks Burg. llvm-svn: 30657	2006-09-28 23:02:22 +00:00
Chris Lattner	525804f31e	simplify code llvm-svn: 30656	2006-09-28 22:58:25 +00:00
Chris Lattner	e03ca2ca4a	set DEBUG_TYPE right llvm-svn: 30623	2006-09-27 04:58:23 +00:00
Nick Lewycky	059c79264f	Style changes only. Remove dead code, fix a comment. llvm-svn: 30588	2006-09-23 15:13:08 +00:00
Chris Lattner	6bd6da4097	Be far more careful when splitting a loop header, either to form a preheader or when splitting loops with a common header into multiple loops. In particular the old code would always insert the preheader before the old loop header. This is disastrous in cases where the loop hasn't been rotated. For example, it can produce code like: .. outside the loop... jmp LBB1_2 #bb13.outer LBB1_1: #bb1 movsd 8(%esp,%esi,8), %xmm1 mulsd (%edi), %xmm1 addsd %xmm0, %xmm1 addl $24, %edi incl %esi jmp LBB1_3 #bb13 LBB1_2: #bb13.outer leal (%edx,%eax,8), %edi pxor %xmm1, %xmm1 xorl %esi, %esi LBB1_3: #bb13 movapd %xmm1, %xmm0 cmpl $4, %esi jl LBB1_1 #bb1 Note that the loop body is actually LBB1_1 + LBB1_3, which means that the loop now contains an uncond branch WITHIN it to jump around the inserted loop header (LBB1_2). Doh. This patch changes the preheader insertion code to insert it in the right spot, producing this code: ... outside the loop, fall into the header ... LBB1_1: #bb13.outer leal (%edx,%eax,8), %esi pxor %xmm0, %xmm0 xorl %edi, %edi jmp LBB1_3 #bb13 LBB1_2: #bb1 movsd 8(%esp,%edi,8), %xmm0 mulsd (%esi), %xmm0 addsd %xmm1, %xmm0 addl $24, %esi incl %edi LBB1_3: #bb13 movapd %xmm0, %xmm1 cmpl $4, %edi jl LBB1_2 #bb1 Totally crazy, no branch in the loop! :) llvm-svn: 30587	2006-09-23 08:19:21 +00:00
Chris Lattner	608cd05e3f	Teach UpdateDomInfoForRevectoredPreds to handle revectored preds that are not reachable, making it general purpose enough for use by InsertPreheaderForLoop. Eliminate custom dominfo updating code in InsertPreheaderForLoop, using UpdateDomInfoForRevectoredPreds instead. llvm-svn: 30586	2006-09-23 07:40:52 +00:00
Chris Lattner	51c95cdd82	Fix Transforms/IndVarsSimplify/2006-09-20-LFTR-Crash.ll llvm-svn: 30555	2006-09-21 05:12:20 +00:00
Nick Lewycky	fde9c308b2	Don't rewrite ConstantExpr::get. llvm-svn: 30552	2006-09-21 01:05:35 +00:00
Nick Lewycky	d74c55f483	Once we're down to "setcc type constant1, constant2", at least come up with the right answer. llvm-svn: 30550	2006-09-20 23:02:24 +00:00
Nick Lewycky	cfff1c3f86	Use a total ordering to compare instructions. Fixes infinite loop in resolve(). llvm-svn: 30540	2006-09-20 17:04:01 +00:00
Andrew Lenharth	44cb67af5c	simplify llvm-svn: 30535	2006-09-20 15:37:57 +00:00
Chris Lattner	380c7e9a59	We went through all that trouble to compute whether it was safe to transform this comparison, but never checked it. Whoops, no wonder we miscompiled 177.mesa! llvm-svn: 30511	2006-09-20 04:44:59 +00:00
Evan Cheng	cd3f6ff0e5	Back out Chris' last set of changes. This breaks 177.mesa and povray somehow. llvm-svn: 30505	2006-09-20 01:39:40 +00:00
Evan Cheng	453280b94d	80 col. llvm-svn: 30504	2006-09-20 01:10:02 +00:00
Andrew Lenharth	4f339bebb0	If we have an add, do it in the pointer realm, not the int realm. This is critical in the linux kernel for pointer analysis correctness llvm-svn: 30496	2006-09-19 18:24:51 +00:00
Chris Lattner	12f52faf93	implement select.ll:test19-22 llvm-svn: 30482	2006-09-19 06:18:21 +00:00
Nick Lewycky	b9c5483a93	Walk down the dominator tree instead of the control flow graph. That means that we can't modify the CFG any more, at least not until it's possible to update the dominator tree (PR217). llvm-svn: 30469	2006-09-18 21:09:35 +00:00
Chris Lattner	de07792595	Fix an infinite loop building the CFE llvm-svn: 30465	2006-09-18 18:27:05 +00:00
Chris Lattner	67a35bbce7	Implement a trivial optzn: if vastart is never called in a function that takes ... args, remove the '...'. This is Transforms/DeadArgElim/dead_vaargs.ll llvm-svn: 30459	2006-09-18 07:02:31 +00:00
Chris Lattner	4922a0e53f	Implement InstCombine/cast.ll:test31. This speeds up 462.libquantum by 26%. llvm-svn: 30456	2006-09-18 05:27:43 +00:00
Chris Lattner	420c4bcc8d	Implement Transforms/InstCombine/shift-sra.ll:test0 llvm-svn: 30450	2006-09-18 04:31:40 +00:00
Chris Lattner	b3f24c91b0	Rewrite shift/and/compare sequences to promote better licm of the RHS. Use isLogicalShift/isArithmeticShift to simplify code. llvm-svn: 30448	2006-09-18 04:22:48 +00:00
Chris Lattner	850465d53f	Fix Transforms/InstCombine/2006-09-15-CastToBool.ll and PR913 llvm-svn: 30405	2006-09-16 03:14:10 +00:00
Chris Lattner	9482cc5b16	revert previous two patches. They cause miscompilation of MultiSource/Applications/Burg llvm-svn: 30397	2006-09-15 17:24:45 +00:00
Owen Anderson	edadd3faee	Revert my previous work on ArgumentPromotion. Further investigation has revealed these changes to be incorrect. They just weren't showing up in any of our current testcases. llvm-svn: 30385	2006-09-15 05:22:51 +00:00
Anton Korobeynikov	d61d39ec53	Adding dllimport, dllexport and external weak linkage types. DLL* linkages got full (I hope) codegeneration support in C & both x86 assembler backends. External weak linkage added for future use, we don't provide any codegeneration, etc. support for it. llvm-svn: 30374	2006-09-14 18:23:27 +00:00
Chris Lattner	237ccf2a51	Second half of the fix for Transforms/Inline/inline_cleanup.ll This folds unconditional branches that are often produced by code specialization. llvm-svn: 30307	2006-09-13 21:27:00 +00:00
Nick Lewycky	12efffc96b	Add some more consistency checks. llvm-svn: 30305	2006-09-13 19:32:53 +00:00
Nick Lewycky	51ce8d6b46	Fix unionSets so that it can merge correctly. llvm-svn: 30304	2006-09-13 19:24:01 +00:00
Chris Lattner	6ef6d06d21	Implement the first half of Transforms/Inline/inline_cleanup.ll llvm-svn: 30303	2006-09-13 19:23:57 +00:00
Nick Lewycky	3a4dc7b489	Erase dead instructions. llvm-svn: 30298	2006-09-13 18:55:37 +00:00
Devang Patel	fab4972a6e	Initialize DontInternalize. llvm-svn: 30281	2006-09-13 01:02:26 +00:00
Chris Lattner	1d7ec20a4d	A sinkable instruction may exist with uses, if those uses are in dead blocks. Handle this. This fixes PR908 and Transforms/LICM/2006-09-12-DeadUserOfSunkInstr.ll llvm-svn: 30275	2006-09-12 19:17:09 +00:00
Chris Lattner	d28627009a	Fix PR905 and InstCombine/2006-09-11-EmptyStructCrash.ll llvm-svn: 30266	2006-09-11 21:43:16 +00:00
Nick Lewycky	e94f42a740	Skip the linear search if the answer is already known. llvm-svn: 30251	2006-09-11 17:23:34 +00:00
Chris Lattner	d1f8e07808	Allow tail duplication in more cases, relaxing the previous restriction a bit. This fixes Regression/Transforms/TailDup/MergeTest.ll llvm-svn: 30237	2006-09-10 18:17:58 +00:00
Nick Lewycky	9a22d7b60f	Replace EquivalenceClasses with a custom-built data structure. Many common operations (like findProperties) should be faster, at the expense of unionSets being slower in cases that are rare in practise. Don't erase a dead Instruction. This fixes a memory corruption issue. llvm-svn: 30235	2006-09-10 02:27:07 +00:00
Chris Lattner	0468987592	Implement Transforms/InstCombine/hoist_instr.ll llvm-svn: 30234	2006-09-09 22:02:56 +00:00
Chris Lattner	27ff96d87a	Make inlining costs more accurate. llvm-svn: 30231	2006-09-09 20:40:44 +00:00
Chris Lattner	d79dc79831	Turn div X, (Cond ? Y : 0) -> div X, Y This implements select.ll::test18. llvm-svn: 30230	2006-09-09 20:26:32 +00:00
Chris Lattner	c465046e65	Throttle back tail duplication to avoid creating really ugly sequences of code. For Transforms/TailDup/if-tail-dup.ll, f.e., it produces: _foo: movl 8(%esp), %eax movl 4(%esp), %ecx testl $1, %ecx je LBB1_2 #cond_next LBB1_1: #cond_true movl $1, (%eax) LBB1_2: #cond_next testl $2, %ecx je LBB1_4 #cond_next10 LBB1_3: #cond_true6 movl $1, 4(%eax) LBB1_4: #cond_next10 testl $4, %ecx je LBB1_6 #cond_next18 LBB1_5: #cond_true14 movl $1, 8(%eax) LBB1_6: #cond_next18 testl $8, %ecx je LBB1_8 #return LBB1_7: #cond_true22 movl $1, 12(%eax) ret LBB1_8: #return ret instead of: _foo: movl 4(%esp), %eax testl $2, %eax sete %cl movl 8(%esp), %edx testl $1, %eax je LBB1_2 #cond_next LBB1_1: #cond_true movl $1, (%edx) testb %cl, %cl jne LBB1_4 #cond_next10 jmp LBB1_3 #cond_true6 LBB1_2: #cond_next testb %cl, %cl jne LBB1_4 #cond_next10 LBB1_3: #cond_true6 movl $1, 4(%edx) testl $4, %eax je LBB1_6 #cond_next18 jmp LBB1_5 #cond_true14 LBB1_4: #cond_next10 testl $4, %eax je LBB1_6 #cond_next18 LBB1_5: #cond_true14 movl $1, 8(%edx) testl $8, %eax je LBB1_8 #return jmp LBB1_7 #cond_true22 LBB1_6: #cond_next18 testl $8, %eax je LBB1_8 #return LBB1_7: #cond_true22 movl $1, 12(%edx) ret LBB1_8: #return ret llvm-svn: 30158	2006-09-07 21:30:15 +00:00
Chris Lattner	845b223da4	Fix Duraid's changes to work when TLI is null. This fixes the failing lowerinvoke regtests. llvm-svn: 30115	2006-09-05 17:48:07 +00:00
Duraid Madina	cf6749e4c0	add setJumpBufSize() and setJumpBufAlignment() to target-lowering. Call these from your backend to enjoy setjmp/longjmp goodness, see lib/Target/IA64/IA64ISelLowering.cpp for an example llvm-svn: 30095	2006-09-04 06:21:35 +00:00
Owen Anderson	19b80e76df	Make ArgumentPromotion handle recursive functions that pass pointers in their recursive calls. llvm-svn: 30057	2006-09-02 21:19:44 +00:00
Nick Lewycky	8e5599354a	Improve handling of SelectInst. Reorder operations to remove duplicated work. Fix to leave floating-point types out of the optimization. Add tests to predsimplify.ll for SwitchInst and SelectInst handling. llvm-svn: 30055	2006-09-02 19:40:38 +00:00
Nick Lewycky	f6f529d008	Don't confuse canonicalize and lookup. Fixes predsimplify.reg4.ll. Also corrects missing optimization opportunity removing cases from a switch. llvm-svn: 30009	2006-09-01 03:26:35 +00:00
Nick Lewycky	08674ab707	Properties where both Values weren't in the union (as being equal to another Value) weren't being found by findProperties. This fixes predsimplify.ll test6, a missed optimization opportunity. llvm-svn: 29991	2006-08-31 00:39:16 +00:00
Nick Lewycky	5f8f9af65c	Move to using the EquivalenceClass ADT. Removes SynSets. If a branch's condition has become a ConstantBool, simplify it immediately. Removing the edge saves work and exposes more optimization opportunities in the pass. Add support for SelectInst. llvm-svn: 29970	2006-08-30 02:46:48 +00:00
Devang Patel	f489d0f85c	Do not rely on std::sort and std::erase to get list of unique exit blocks. The output is dependent on addresses of basic block. Add and use Loop::getUniqueExitBlocks. llvm-svn: 29966	2006-08-29 22:29:16 +00:00
Owen Anderson	a8a2e5c666	Clean up a bit. llvm-svn: 29950	2006-08-29 06:10:56 +00:00
Nick Lewycky	b2e8ae1700	Add PredicateSimplifier pass. Collapses equal variables into one form and simplifies expressions. This implements the optimization described in PR807. llvm-svn: 29947	2006-08-28 22:44:55 +00:00
Owen Anderson	62c84fe371	Make LoopUnroll fold excessive BasicBlocks. This results in a significant speedup of gccas on 252.eon llvm-svn: 29936	2006-08-28 02:09:46 +00:00
Chris Lattner	97c9f20c52	simplify AnalysisGroup registration, eliminating one typeid call. llvm-svn: 29932	2006-08-28 00:42:29 +00:00
Chris Lattner	c2d3d3112e	eliminate RegisterOpt. It does the same thing as RegisterPass. llvm-svn: 29925	2006-08-27 22:42:52 +00:00
Chris Lattner	3d27be1333	s|llvm/Support/Visibility.h|llvm/Support/Compiler.h| llvm-svn: 29911	2006-08-27 12:54:02 +00:00
Owen Anderson	403b95af47	Fix a crash related to updating Phi nodes in the original header block. This was causing a crash in 175.vpr llvm-svn: 29887	2006-08-25 22:13:55 +00:00
Owen Anderson	8e4b029573	Add an assertion to check that we're really preserving LCSSA. llvm-svn: 29886	2006-08-25 22:12:36 +00:00
Owen Anderson	8cca95cf5d	Reapply the indvars patch, since nothing blew up last night. llvm-svn: 29874	2006-08-25 17:41:25 +00:00
Owen Anderson	94446a4267	Revert my previous patch. Since there are some major changes that went in today, I'm going to wait to put this in HEAD until tomorrow, so as not to clutter the nightly tester. llvm-svn: 29868	2006-08-25 03:45:57 +00:00
Owen Anderson	15a6423431	Specify that indvars actually preserve LCSSA. This has been done for a while, but I forgot to put in the analysis usage. llvm-svn: 29867	2006-08-25 03:32:13 +00:00
Owen Anderson	e001d811ba	Implement unrolling of multiblock loops. This significantly improves the utility of the LoopUnroll pass. Also, add a testcase for multiblock-loop unrolling. llvm-svn: 29859	2006-08-24 21:28:19 +00:00
Reid Spencer	5495fe8dd6	Fix a grammaro in a comment. llvm-svn: 29765	2006-08-18 09:01:07 +00:00
Chris Lattner	6441cf93c9	Handle single-entry PHI nodes correctly. This fixes PR877 and Transforms/CondProp/2006-08-14-SingleEntryPhiCrash.ll llvm-svn: 29673	2006-08-14 21:38:05 +00:00
Chris Lattner	f18b396cc2	Don't attempt to split subloops out of a loop with a huge number of backedges. Not only will this take huge amounts of compile time, the resultant loop nests won't be useful for optimization. This reduces loopsimplify time on Transforms/LoopSimplify/2006-08-11-LoopSimplifyLongTime.ll from ~32s to ~0.4s with a debug build of llvm on a 2.7GHz G5. llvm-svn: 29647	2006-08-12 05:25:00 +00:00
Chris Lattner	85d9944f9a	Reimplement the loopsimplify code which deletes edges from unreachable blocks that target loop blocks. Before, the code was run once per loop, and depended on the number of predecessors each block in the loop had. Unfortunately, scanning preds can be really slow when huge numbers of phis exist or when phis with huge numbers of inputs exist. Now, the code is run once per function and scans successors instead of preds, which is far faster. In addition, the new code is simpler and is goto free, woo. This change speeds up a nasty testcase Duraid provided me from taking hours to taking ~72s with a debug build. The functionality this implements is already tested in the testsuite as Transforms/CodeExtractor/2004-03-13-LoopExtractorCrash.ll. llvm-svn: 29644	2006-08-12 04:51:20 +00:00
Reid Spencer	2b6d18a64f	Make this example pass use some things from lib/Support (EscapeString, SlowOperatingInfo, Statistics). Besides providing an example of how to use these facilities, it also serves to debug problems with runtime linking when dlopening a loadable module. These three support facilities exercise different combinations of Text/Weak Weak/Text and Text/Text linking between the executable and the module. llvm-svn: 29552	2006-08-07 23:17:24 +00:00
Reid Spencer	e6458c3fb2	For PR780: 1. Change the usage of LOADABLE_MODULE so that it implies all the things necessary to make a loadable module. This reduces the user's burden to get a loadable module correctly built. 2. Document the usage of LOADABLE_MODULE in the MakefileGuide. 3. Adjust the makefile for lib/Transforms/Hello to use the new specification for building loadable modules. 4. Adjust the sample project to not attempt to build a shared library for its little library. This was just wasteful and not instructive at all. llvm-svn: 29551	2006-08-07 23:12:15 +00:00
Chris Lattner	c9009d917d	Fix PR867 (and maybe 868) and testcase: Transforms/SimplifyCFG/2006-08-03-Crash.ll llvm-svn: 29515	2006-08-03 21:40:24 +00:00
Chris Lattner	3ff620178b	Changes: 1. Update an obsolete comment. 2. Make the sorting by base an explicit (though still N^2) step, so that the code is more clear on what it is doing. 3. Partition uses so that uses inside the loop are handled before uses outside the loop. Note that none of these changes currently changes the code inserted by LSR, but they are a stepping stone to getting there. This code is the result of some crazy pair programming with Nate. :) llvm-svn: 29493	2006-08-03 06:34:50 +00:00
Chris Lattner	38b6e8382a	Add special check to avoid isLoop call. Simple, but doesn't seem to speed up lcssa much in practice. llvm-svn: 29465	2006-08-02 00:16:47 +00:00
Chris Lattner	5a2bc786be	Replace the SSA update code in LCSSA with a bottom-up approach instead of a top down approach, inspired by discussions with Tanya. This approach is significantly faster, because it does not need dominator frontiers and it does not insert extraneous unused PHI nodes. For example, on 252.eon, in a release-asserts build, this speeds up LCSSA (which is the slowest pass in gccas) from 9.14s to 0.74s on my G5. This code is also slightly smaller and significantly simpler than the old code. Amusingly, in a normal Release build (which includes the "assert(L->isLCSSAForm());" assertion), asserting that the result of LCSSA is in LCSSA form is actually slower than the LCSSA transformation pass itself on 252.eon. I will see if Loop::isLCSSAForm can be sped up next. llvm-svn: 29463	2006-08-02 00:06:09 +00:00
Chris Lattner	85ea83e821	Add some advice llvm-svn: 29324	2006-07-27 04:24:14 +00:00
Chris Lattner	1b928478aa	Minor comment tweaks llvm-svn: 29226	2006-07-20 19:06:16 +00:00
Devang Patel	edd2f9952e	Make it fit into 80 cols. llvm-svn: 29223	2006-07-20 18:03:39 +00:00
Devang Patel	839d9260f0	Add new constructor to accept vector of exported names while creating InternalizePass. llvm-svn: 29222	2006-07-20 17:48:05 +00:00
Owen Anderson	8ef4c92ef8	Add an assertion. llvm-svn: 29199	2006-07-19 05:48:45 +00:00
Owen Anderson	aba8c199dd	Make LoopUnroll not die on LCSSA Phis. This makes lencod work again. llvm-svn: 29198	2006-07-19 05:45:14 +00:00
Owen Anderson	00b974cdbc	Fix an error that hadn't yet caused any problems, but I'm sure it would have somewhere down the road. llvm-svn: 29197	2006-07-19 03:51:48 +00:00
Chris Lattner	fea3974133	silence warnings in a release build llvm-svn: 29189	2006-07-18 21:48:57 +00:00
Evan Cheng	e9c68f52e1	Only reuse a previous IV if it would not require a type conversion. llvm-svn: 29186	2006-07-18 19:07:58 +00:00
Chris Lattner	19247f36ea	eliminate some ugly code, using ConstantExpr::getWithOperands instead. llvm-svn: 29149	2006-07-14 22:21:31 +00:00
Owen Anderson	bea70ee1de	Hopefully the final attempt at making IndVars preserve LCSSA. This should fix PR 831. llvm-svn: 29141	2006-07-14 18:49:15 +00:00
Chris Lattner	9b6c02ebe4	Revert this patch temporarily until PR831 is fixed. llvm-svn: 29134	2006-07-13 19:05:20 +00:00
Chris Lattner	b3c64f7ab3	Handle instructions in the map, but that map to a null pointer. This unbreaks smg2000. llvm-svn: 29127	2006-07-12 21:37:11 +00:00
Owen Anderson	dea9202e3b	IndVars now (correctly) preserves LCSSA form. llvm-svn: 29126	2006-07-12 21:29:14 +00:00
Chris Lattner	6148456ec2	In addition to deleting calls, the inliner can constant fold them as well. Handle this case, which doesn't require a new callgraph edge. This fixes a crash compiling MallocBench/gs. llvm-svn: 29121	2006-07-12 18:37:18 +00:00
Chris Lattner	5de3b8b262	Change the callgraph representation to store the callsite along with the target CG node. This allows the inliner to properly update the callgraph when using the pruning inliner. The pruning inliner may not copy over all call sites from a callee to a caller, so the edges corresponding to those call sites should not be copied over either. This fixes PR827 and Transforms/Inline/2006-07-12-InlinePruneCGUpdate.ll llvm-svn: 29120	2006-07-12 18:29:36 +00:00
Chris Lattner	091b6ea847	Silence a warning produced in assertions-disabled mode llvm-svn: 29108	2006-07-11 18:31:26 +00:00
Owen Anderson	15b1f7d2cd	Revert my indvars changes because they were breaking things. Unfortunately this didn't start showing up until after the recent instcombine fixes. llvm-svn: 29102	2006-07-11 07:25:33 +00:00
Owen Anderson	bbf8990ef7	Add a comment, and fix a typo that broke the build. llvm-svn: 29094	2006-07-10 22:15:25 +00:00
Owen Anderson	ae8aa646f1	Don't indent the entire function. llvm-svn: 29093	2006-07-10 22:03:18 +00:00
Chris Lattner	b7845d69db	Recognize 16-bit bswaps by relaxing overconstrained pattern. This implements Transforms/InstCombine/bswap.ll:test[34]. llvm-svn: 29087	2006-07-10 20:25:24 +00:00
Owen Anderson	a6968f83b2	Make instcombine not remove Phi nodes when LCSSA is live. llvm-svn: 29083	2006-07-10 19:03:49 +00:00
Owen Anderson	fe6e97d275	Fix typo in the comment. llvm-svn: 29078	2006-07-09 21:35:40 +00:00
Owen Anderson	aecaabb6e1	Add a fix for an issue where LCSSA would fail to insert undef's in some corner cases. Ideally, this issue will go away in the future as LCSSA gets smarter about which Phi nodes it inserts. llvm-svn: 29076	2006-07-09 08:14:06 +00:00
Chris Lattner	fd2e13b107	Fix PR820 and Transforms/GlobalOpt/2006-07-07-InlineAsmCrash.ll llvm-svn: 29071	2006-07-07 21:37:01 +00:00
Chris Lattner	996795b0dd	Use hidden visibility to make symbols in an anonymous namespace get dropped. This shrinks libllvmgcc.dylib another 67K llvm-svn: 28975	2006-06-28 23:17:24 +00:00
Chris Lattner	4a4c7fe7fa	Shrink libllvmgcc.dylib by another 23K llvm-svn: 28972	2006-06-28 22:08:15 +00:00
Owen Anderson	18e816f356	Switch to a very conservative heuristic for determining when loop-unswitching will be profitable. This is mainly to remove some cases where excessive unswitching would result in long compile times and/or huge generated code. Once someone comes up with a better heuristic that avoids these cases, this should be switched out. llvm-svn: 28962	2006-06-28 17:47:50 +00:00
Chris Lattner	3fda386965	Fix Transforms/InstCombine/2006-06-28-infloop.ll llvm-svn: 28961	2006-06-28 17:34:50 +00:00
Chris Lattner	0a2e11260e	Don't unswitch really large loops even if they are mostly filled with empty blocks. llvm-svn: 28959	2006-06-28 16:38:55 +00:00
Andrew Lenharth	ebfa24ee9a	Catch more function pointer casting problems Remove the Function pointer cast in these calls, converting it to a cast of argument. %tmp60 = tail call int cast (int (ulong)* %str to int (int))( int 10 ) %tmp60 = tail call int cast (int (ulong) %str to int (int)*)( uint %tmp51 ) llvm-svn: 28953	2006-06-28 01:01:52 +00:00
Owen Anderson	bb3ae5eb8f	Fix for 2006-06-27-DeadSwitchCase.ll. Be more careful when updating Phi nodes after eliminating dead switch cases. Fix proposed by Chris. llvm-svn: 28947	2006-06-27 22:26:09 +00:00
Chris Lattner	c4998a0138	Fix Transforms/DeadArgElim/2006-06-27-struct-ret.ll. -deadargelim should not remove the struct return argument of a csret function, even if it is obviously dead. llvm-svn: 28943	2006-06-27 21:05:04 +00:00
Owen Anderson	b659bb4196	De-pessimize the handling of LCSSA Phi nodes in IndVarSimplify. Hopefully this will make Shootout-C/nestedloop faster. llvm-svn: 28924	2006-06-27 02:17:08 +00:00
Chris Lattner	49771a0462	random code cleanups, no functionality change llvm-svn: 28914	2006-06-26 19:10:05 +00:00
Owen Anderson	f52351e50f	Make LoopUnswitch able to unswitch loops with live-out values by taking advantage of LCSSA. This results in several times the number of unswitchings occurring on tests such as timberwolfmc, unix-tbl, and ldecod. llvm-svn: 28912	2006-06-26 07:44:36 +00:00
Chris Lattner	053fb9319d	Fix IndVarsSimplify/2006-06-16-Indvar-LCSSA-Crash.ll, a case where a "LCSSA" phi node causes indvars to break dominance properties. This fix causes indvars to avoid inserting aggressive code in this case; instead, indvars should be fixed to be more aggressive in the face of lcssa phi's. llvm-svn: 28850	2006-06-17 01:02:31 +00:00
Evan Cheng	8a417a2fde	Add missing casts. This fixed some regressions. llvm-svn: 28834	2006-06-16 18:37:15 +00:00
Evan Cheng	1fc4025a9c	More libcall transformations: printf("%s\n", str) -> puts(str); printf("%c", c) -> putchar(c). Also fixed fprintf(file, "%c", c) -> fputc(c, file). llvm-svn: 28815	2006-06-16 08:36:35 +00:00
Evan Cheng	f2ea587aa2	Simplify fprintf(file, "%s", str) to fputs(str, file). llvm-svn: 28814	2006-06-16 04:52:30 +00:00
Chris Lattner	c482a9e31a	Implement Transforms/InstCombine/bswap.ll, turning common shift/and/or bswap idioms into bswap intrinsics. llvm-svn: 28803	2006-06-15 19:07:26 +00:00
Chris Lattner	0c4f5a655a	Fix Transforms/LoopUnswitch/2006-06-13-SingleEntryPHI.ll, a loop unswitch bug exposed by the recent lcssa work. llvm-svn: 28779	2006-06-14 04:46:17 +00:00
Chris Lattner	e3abb14503	Use the PotDoms map to memoize 'dominating value' lookup. With this patch, LCSSA is still the slowest pass when gccas'ing 252.eon, but now it only takes 39s instead of 289s. :) llvm-svn: 28776	2006-06-14 01:13:57 +00:00
Owen Anderson	e714a5c549	Fix another instance where PHI nodes need special treatment. llvm-svn: 28774	2006-06-13 20:50:09 +00:00
Owen Anderson	3f8ff0449a	Fix a bug that was causing major slowdowns in povray. This was due to LCSSA not handling PHI nodes correctly when determining if a value was live-out. This patch reduces the number of detected live-out variables in the testcase from 6565 to 485. llvm-svn: 28771	2006-06-13 19:37:18 +00:00
Owen Anderson	fd0a3d6e5c	Reapply my 6/9 changes. The bug Evan saw no longer occurs. llvm-svn: 28759	2006-06-12 21:49:21 +00:00
Chris Lattner	b5c9d7a0af	Fix an infinite loop on Transforms/SimplifyCFG/2006-06-12-InfLoop.ll llvm-svn: 28758	2006-06-12 20:18:01 +00:00
Owen Anderson	0ac336965e	Fix for 2006-06-26-MultipleExitsSingleBlock. If a single exit block has multiple predecessors within the loop, it will appear in the exit blocks list more than once. LCSSA needs to take that into account so that it doesn't double process that exit block. llvm-svn: 28750	2006-06-12 07:10:16 +00:00
Owen Anderson	b538f14d2a	Re-commit the safe parts of my 6/9 patch. Still working on fixing the unsafe parts. llvm-svn: 28748	2006-06-11 19:22:28 +00:00
Evan Cheng	1b6e310e6f	Back out Owen's 6/9 changes. They broke MultiSource/Benchmarks/Prolangs-C/bison (and perhaps others). llvm-svn: 28747	2006-06-11 09:32:57 +00:00
Owen Anderson	b1dc1d44f8	Add LCSSA as a requirement for LoopUnswitch, and assert that LoopUnswitch preserves LCSSA. llvm-svn: 28739	2006-06-09 18:40:32 +00:00
Owen Anderson	505adff3f0	Make Loop able to verify that it is in LCSSA-form, and have the LCSSA pass assert on this. llvm-svn: 28738	2006-06-09 18:33:30 +00:00
Evan Cheng	398f70292c	RewriteExpr, either the new PHI node of induction variable or the post-increment value, should be first cast to the appropriated type (to the type of the common expr). Otherwise, the rewrite of a use based on (common + iv) may end up with an incorrect type. llvm-svn: 28735	2006-06-09 00:12:42 +00:00
Owen Anderson	5d029264ec	Update some comments, and expose LCSSAID in preparation for having other passes require LCSSA. llvm-svn: 28734	2006-06-08 20:02:53 +00:00
Reid Spencer	d4b795902c	Fix a spello in a comment. llvm-svn: 28714	2006-06-07 21:24:10 +00:00
Chris Lattner	95cebb082f	Fix a bug in a recent patch. This fixes UnitTests/Vector/Altivec/casts.c on PPC/altivec llvm-svn: 28698	2006-06-06 22:26:02 +00:00
Owen Anderson	ac601b4c4b	Fix some formatting, and use inLoop() when appropriate. llvm-svn: 28694	2006-06-06 04:36:36 +00:00
Owen Anderson	9e81c1bb03	Stop a memory leak, and update some comments. llvm-svn: 28693	2006-06-06 04:28:30 +00:00
Owen Anderson	766f90b08e	Some more clean-up, and squash an IDF-Phi related bug. llvm-svn: 28680	2006-06-04 00:55:19 +00:00
Owen Anderson	eb33815f1b	Various clean-ups suggested by Chris. llvm-svn: 28678	2006-06-04 00:02:23 +00:00
Owen Anderson	d00eacc4f9	Fix a bug in Phi-node insertion. Also, update some comments to reflect what's actually going on. llvm-svn: 28677	2006-06-03 23:22:50 +00:00
Chris Lattner	540886f0ae	Remove unneeded hook. Patch by Anton K. Thanks! llvm-svn: 28664	2006-06-02 19:11:46 +00:00
Chris Lattner	02e0b4ddb7	Force anything that #includes llvm/Transforms/Utils/UnifyFunctionExitNodes.h to link in the implementation. Thanks to Anton Korobeynikov for figuring out what was going on here. llvm-svn: 28660	2006-06-02 18:40:06 +00:00
Chris Lattner	cdf2b1fc30	Remove dead #include llvm-svn: 28642	2006-06-01 20:02:28 +00:00
Chris Lattner	cc340c02a4	Make the "pruning cloner" smarter. As it propagates constants through the code (while cloning) it often gets the branch/switch instructions. Since it knows that edges of the CFG are dead, it need not clone (or even look) at the obviously dead blocks. This should speed up the inliner substantially on code where there are lots of inlinable calls to functions with constant arguments. On C++ code in particular, this kicks in. llvm-svn: 28641	2006-06-01 19:19:23 +00:00
Chris Lattner	f905a7b994	Silence a -pedantic warning. llvm-svn: 28632	2006-06-01 17:16:21 +00:00
Owen Anderson	619e4ba57f	Remove a FIXME that was fixed with my last patch. llvm-svn: 28619	2006-06-01 06:07:40 +00:00
Owen Anderson	cd76fa04a1	More cleanups. Also, add a special case for updating PHI nodes, and reimplement getValueDominatingFunction to walk the DominanceTree rather than just searching blindly. llvm-svn: 28618	2006-06-01 06:05:47 +00:00
Chris Lattner	1df0e98ac2	Swap the order of operands created here. For +&|^, the order doesn't matter, but for sub, it really does! This fixes a miscompilation of fibheap_cut in llvmgcc4. llvm-svn: 28600	2006-05-31 21:14:00 +00:00
Owen Anderson	dad8c57340	Extract a huge loop into a helper method. Fix a few iterator-invalidation bugs. llvm-svn: 28599	2006-05-31 20:55:06 +00:00
Owen Anderson	8a8f278f15	Add Use replacement. Assuming there is nothing horribly wrong with this, LCSSA is now theoretically feature-complete. It has not, however, been thoroughly tested, and is still considered experimental. llvm-svn: 28529	2006-05-29 01:00:00 +00:00
Owen Anderson	152d063ccb	Major think-o. Iterate over all live out-of-loop values, and perform the other calculations on each individually, rather than trying to delay it and do them all at the end. llvm-svn: 28527	2006-05-28 19:33:28 +00:00
Owen Anderson	1310e42803	Make LCSSA insert proper Phi nodes throughout the rest of the CFG by computing the iterated Dominance Frontier of the loop-closure Phi's. This is the second phase of the LCSSA pass. The third phase (coming soon) will be to update all uses of loop variables to use the loop-closure Phi's instead. llvm-svn: 28524	2006-05-27 18:47:11 +00:00
Chris Lattner	67c424e010	Fix some regression from the inliner patch I committed last night. This fixes ldecod, lencod, and SPASS. llvm-svn: 28523	2006-05-27 17:28:13 +00:00
Chris Lattner	be853d77e9	Switch the inliner over to using CloneAndPruneFunctionInto. This effectively makes it so that it constant folds instructions on the fly. This is good for several reasons: 0. Many instructions are constant foldable after inlining, particularly if inlining a call with constant arguments. 1. Without this, the inliner has to allocate memory for all of the instructions that can be constant folded, then a subsequent pass has to delete them. This gets the job done without this extra work. 2. This makes the inliner pass a bit more aggressive: in particular, it partially solves a phase order issue where the inliner would inline lots of code that folds away to nothing, but think that the resultant function is big because of this code that will be gone. Now the code never exists. This is the first part of a 2-step process. The second part will be smart enough to see when this implicit constant folding propagates a constant into a branch or switch instruction, making CFG edges dead. This implements Transforms/Inline/inline_constprop.ll llvm-svn: 28521	2006-05-27 01:28:04 +00:00
Chris Lattner	3df13f4f22	Implement a new method, CloneAndPruneFunctionInto, as documented. llvm-svn: 28519	2006-05-27 01:22:24 +00:00
Chris Lattner	bc3c879fcf	Refactor some code to expose an interface to constant fold an instruction given its opcode, type, and operands. llvm-svn: 28517	2006-05-27 01:18:04 +00:00
Owen Anderson	b4e16996f1	A few small clean-ups, and the addition of an LCSSA statistic. llvm-svn: 28512	2006-05-27 00:31:37 +00:00
Owen Anderson	6e047ab8fc	Fix a copy-and-paste-o that would break some compilers. llvm-svn: 28507	2006-05-26 21:19:17 +00:00
Owen Anderson	f3dd3e2bfd	Clean up and refactor LCSSA a bunch. It should also run faster now, though there's still a lot of work to be done on it. llvm-svn: 28506	2006-05-26 21:11:53 +00:00
Chris Lattner	dab43b2b0e	Implement Transforms/InstCombine/store.ll:test2. llvm-svn: 28503	2006-05-26 19:19:20 +00:00
Owen Anderson	8eca8910b6	Skeletal LCSSA pass. This is currently non-functional. Expect functionality and documentation updates soon. llvm-svn: 28495	2006-05-26 13:58:26 +00:00
Chris Lattner	0e47716e69	Transform things like (splat(splat)) -> splat llvm-svn: 28490	2006-05-26 00:29:06 +00:00
Chris Lattner	12249be286	Introduce a helper function that simplifies interpretation of shuffle masks. No functionality change. llvm-svn: 28489	2006-05-25 23:48:38 +00:00
Chris Lattner	99155be33f	Turn (cast (shuffle (cast)) -> shuffle (cast) if it reduces the # casts in the program. This exposes more opportunities for the instcombiner, and implements vec_shuffle.ll:test6 llvm-svn: 28487	2006-05-25 23:24:33 +00:00
Chris Lattner	83f6578b0c	extract element from a shuffle vector can be trivially turned into an extractelement from the SV's source. This implements vec_shuffle.ll:test[45] llvm-svn: 28485	2006-05-25 22:53:38 +00:00
Chris Lattner	0853700582	Revert a patch that is unsafe, due to out of range array accesses in inner array scopes possibly accessing valid memory in outer subscripts. llvm-svn: 28478	2006-05-25 21:25:12 +00:00
Chris Lattner	a643d528bd	Patch for a new instcombine xform, patch contributed by Nick Lewycky! This implements Transforms/InstCombine/2006-05-10-InvalidIndexUndef.ll llvm-svn: 28450	2006-05-24 17:34:30 +00:00
Chris Lattner	aa2372562e	Patches to make the LLVM sources more -pedantic clean. Patch provided by Anton Korobeynikov! This is a step towards closing PR786. llvm-svn: 28447	2006-05-24 17:04:05 +00:00
Chris Lattner	d0622b6894	Silence a bogus gcc warning llvm-svn: 28422	2006-05-20 23:14:03 +00:00
Reid Spencer	2452c94df4	Fix a doxygen problem and break lines at 80 columns llvm-svn: 28395	2006-05-19 19:09:46 +00:00
Chris Lattner	e4cb4768fa	Declare that lowerinvoke doesn't interact with other lowering passes. Patch written by Domagoj Babic! llvm-svn: 28367	2006-05-17 21:05:27 +00:00
Chris Lattner	2e266807c3	Add a CloneModule call that exposes the mapping of values from the old module to the new module. Patch provided by Nick Lewycky! llvm-svn: 28349	2006-05-17 18:05:35 +00:00
Chris Lattner	35515557c7	remove some dead code identified by coverity llvm-svn: 28289	2006-05-14 18:45:44 +00:00
Chris Lattner	3237da073e	remove dead variables llvm-svn: 28286	2006-05-14 18:33:57 +00:00
Evan Cheng	18d0438148	Backing out last check-in for now. It's causing an infinite loop gccas lencode. llvm-svn: 28284	2006-05-14 06:46:03 +00:00
Chris Lattner	3987a8532d	Add/Sub/Mul are safe to promote here as well. Incrementing a single-bit bitfield now gives this code: _plus: lwz r2, 0(r3) rlwimi r2, r2, 0, 1, 31 xoris r2, r2, 32768 stw r2, 0(r3) blr instead of this: _plus: lwz r2, 0(r3) srwi r4, r2, 31 slwi r4, r4, 31 addis r4, r4, -32768 rlwimi r2, r4, 0, 0, 0 stw r2, 0(r3) blr this can obviously still be improved. llvm-svn: 28275	2006-05-13 02:16:08 +00:00
Chris Lattner	1ebbe6a22e	Implement simple promotion for cast elimination in instcombine. This is currently very limited, but can be extended in the future. For example, we now compile: uint %test30(uint %c1) { %c2 = cast uint %c1 to ubyte %c3 = xor ubyte %c2, 1 %c4 = cast ubyte %c3 to uint ret uint %c4 } to: _xor: movzbl 4(%esp), %eax xorl $1, %eax ret instead of: _xor: movb $1, %al xorb 4(%esp), %al movzbl %al, %eax ret More impressively, we now compile: struct B { unsigned bit : 1; }; void xor(struct B *b) { b->bit = b->bit ^ 1; } To (X86/PPC): _xor: movl 4(%esp), %eax xorl $-2147483648, (%eax) ret _xor: lwz r2, 0(r3) xoris r2, r2, 32768 stw r2, 0(r3) blr instead of (X86/PPC): _xor: movl 4(%esp), %eax movl (%eax), %ecx movl %ecx, %edx shrl $31, %edx # TRUNCATE movb %dl, %dl xorb $1, %dl movzbl %dl, %edx andl $2147483647, %ecx shll $31, %edx orl %ecx, %edx movl %edx, (%eax) ret _xor: lwz r2, 0(r3) srwi r4, r2, 31 xori r4, r4, 1 rlwimi r2, r4, 31, 0, 0 stw r2, 0(r3) blr This implements InstCombine/cast.ll:test30. llvm-svn: 28273	2006-05-13 02:06:03 +00:00
Chris Lattner	cd60d38b30	Remove some dead variables. Fix a nasty bug in the memcmp optimizer where we used the wrong variable! llvm-svn: 28269	2006-05-12 23:35:26 +00:00
Chris Lattner	94acc47654	Remove dead stuff llvm-svn: 28268	2006-05-12 23:32:01 +00:00
Chris Lattner	1443bc52be	Refactor some code, making it simpler. When doing the initial pass of constant folding, if we get a constantexpr, simplify the constant expr like we would do if the constant is folded in the normal loop. This fixes the missed-optimization regression in Transforms/InstCombine/getelementptr.ll last night. llvm-svn: 28224	2006-05-11 17:11:52 +00:00
Chris Lattner	a36ee4ea34	Two changes: 1. Implement InstCombine/deadcode.ll by not adding instructions in unreachable blocks (due to constants in conditional branches/switches) to the worklist. This causes them to be deleted before instcombine starts up, leading to better optimization. 2. In the prepass over instructions, do trivial constprop/dce as we go. This has the effect of improving the effectiveness of #1. In addition, it significantly speeds up instcombine on test cases with large amounts of constant folding code (for example, that produced by code specialization or partial evaluation). In one example, it speeds up instcombine from 0.0589s to 0.0224s with a release build (a 2.6x speedup). llvm-svn: 28215	2006-05-10 19:00:36 +00:00
Chris Lattner	4fe87d67c4	Patch to make some xforms preserve each other. Patch contributed by Domagoj Babic! llvm-svn: 28181	2006-05-09 04:13:41 +00:00
Chris Lattner	1d441adfbf	Move some code around. Make the "fold (and (cast A), (cast B)) -> (cast (and A, B))" transformation only apply when both casts really will cause code to be generated. If one or both doesn't, then this xform doesn't remove a cast. This fixes Transforms/InstCombine/2006-05-06-Infloop.ll llvm-svn: 28141	2006-05-06 09:00:16 +00:00
Chris Lattner	e745c7de0e	Fix an infinite loop compiling oggenc last night. llvm-svn: 28128	2006-05-05 20:51:30 +00:00
Chris Lattner	3af1053488	Implement InstCombine/cast.ll:test29 llvm-svn: 28126	2006-05-05 06:39:07 +00:00
Chris Lattner	fb29692055	Fix Transforms/InstCombine/2006-05-04-DemandedBitCrash.ll llvm-svn: 28101	2006-05-04 17:33:35 +00:00
Chris Lattner	2d3a02725d	Add pass ID's for various passes, so they can be AddRequiredID. Patch by Domagoj Babic! llvm-svn: 28048	2006-05-02 04:24:36 +00:00
Chris Lattner	655d08fda8	Fix InstCombine/2006-04-28-ShiftShiftLongLong.ll llvm-svn: 28019	2006-04-28 22:21:41 +00:00
Chris Lattner	e63d808b6e	Fix Transforms/Reassociate/2006-04-27-ReassociateVector.ll llvm-svn: 28007	2006-04-28 04:14:49 +00:00
Chris Lattner	b6cb64b7e6	Add support for inserting undef into a vector. This implements Transforms/InstCombine/vec_insert_to_shuffle.ll llvm-svn: 27997	2006-04-27 21:14:21 +00:00
Chris Lattner	f98b4aa2e7	Fix some nondeterministic behavior in the mem2reg pass that (in addition to nondeterminism being bad) could cause some trivial missed optimizations (dead phi nodes being left around for later passes to clean up). With this, llvm-gcc4 now bootstraps and correctly compares. I don't know why I never tried to do it before... :) llvm-svn: 27984	2006-04-27 01:14:43 +00:00
Chris Lattner	dae49df407	Fix Transforms/ScalarRepl/2006-04-20-PromoteCrash.ll llvm-svn: 27912	2006-04-20 20:48:50 +00:00
Andrew Lenharth	f89e630b2f	Make code match cvs commit message :) llvm-svn: 27881	2006-04-20 15:41:37 +00:00
Andrew Lenharth	61eae29ad6	If we can convert the return pointer type into an integer that IntPtrType can be converted to losslessly, we can continue the conversion to a direct call. llvm-svn: 27880	2006-04-20 14:56:47 +00:00
Chris Lattner	36dd7c98d1	Turn x86 unaligned load/store intrinsics into aligned load/store instructions if the pointer is known aligned. llvm-svn: 27781	2006-04-17 22:26:56 +00:00
Chris Lattner	9095186deb	Fix a bug in the 'shuffle(undef,x,mask) -> shuffle(x, undef,mask')' xform. Make the insert/extract elt -> shuffle code more aggressive. This fixes CodeGen/PowerPC/vec_shuffle.ll llvm-svn: 27728	2006-04-16 00:51:47 +00:00
Chris Lattner	34cebe785d	Canonicalize shuffle(undef,x,mask) -> shuffle(x, undef,mask'). llvm-svn: 27727	2006-04-16 00:03:56 +00:00
Chris Lattner	39fac448d6	significant cleanups to code that uses insert/extractelt heavily. This builds maximal shuffles out of them where possible. llvm-svn: 27717	2006-04-15 01:39:45 +00:00
Chris Lattner	3323ce165d	Teach scalarrepl to promote unions of vectors and floats, producing insert/extractelement operations. This implements Transforms/ScalarRepl/vector_promote.ll llvm-svn: 27710	2006-04-14 21:42:41 +00:00
Andrew Lenharth	92cf71f6d7	linear -> constant time llvm-svn: 27652	2006-04-13 13:43:31 +00:00
Reid Spencer	13a1a7a4a6	Get rid of a signed/unsigned compare warning. llvm-svn: 27625	2006-04-12 19:28:15 +00:00
Chris Lattner	b19a5c661b	Turn casts into getelementptr's when possible. This enables SROA to be more aggressive in some cases where LLVMGCC 4 is inserting casts for no reason. This implements InstCombine/cast.ll:test27/28. llvm-svn: 27620	2006-04-12 18:09:35 +00:00
Chris Lattner	2d37f920ad	Implement vec_shuffle.ll:test3 llvm-svn: 27573	2006-04-10 23:06:36 +00:00
Chris Lattner	fbb77a408b	Implement InstCombine/vec_shuffle.ll:test[12] llvm-svn: 27571	2006-04-10 22:45:52 +00:00
Andrew Lenharth	a9cdcca3c3	Add a simple pass to make sure that all (non-library) calls to malloc and free are visible to analysis as intrinsics. That is, make sure someone doesn't pass free around by address in some struct (as happens in say 176.gcc). This doesn't get rid of any indirect calls, just ensure calls to free and malloc are always direct. llvm-svn: 27560	2006-04-10 19:26:09 +00:00
Chris Lattner	17bd60588c	Add support for shufflevector llvm-svn: 27513	2006-04-08 01:19:12 +00:00
Chris Lattner	8ec0205de4	Fix inlining of insert/extract element constantexprs llvm-svn: 27478	2006-04-07 04:41:03 +00:00
Chris Lattner	e79d249c29	Lower vperm(x,y, mask) -> shuffle(x,y,mask) if mask is constant. This allows us to compile oh-so-realistic stuff like this: vec_vperm(A, B, (vector unsigned char){14}); to: vspltb v0, v0, 14 instead of: vspltisb v0, 14 vperm v0, v2, v1, v0 llvm-svn: 27452	2006-04-06 19:19:17 +00:00
Chris Lattner	caba72b6ff	vector casts of casts are eliminable. Transform this: %tmp = cast <4 x uint> %tmp to <4 x int> ; <<4 x int>> [#uses=1] %tmp = cast <4 x int> %tmp to <4 x float> ; <<4 x float>> [#uses=1] into: %tmp = cast <4 x uint> %tmp to <4 x float> ; <<4 x float>> [#uses=1] llvm-svn: 27355	2006-04-02 05:43:13 +00:00
Chris Lattner	ebca476b27	Allow transforming this: %tmp = cast <4 x uint>* %testData to <4 x int>* ; <<4 x int>> [#uses=1] %tmp = load <4 x int> %tmp ; <<4 x int>> [#uses=1] to this: %tmp = load <4 x uint>* %testData ; <<4 x uint>> [#uses=1] %tmp = cast <4 x uint> %tmp to <4 x int> ; <<4 x int>> [#uses=1] llvm-svn: 27353	2006-04-02 05:37:12 +00:00
Chris Lattner	f42d0aeda1	Turn altivec lvx/stvx intrinsics into loads and stores. This allows the elimination of one load from this: int AreSecondAndThirdElementsBothNegative( vector float *in ) { #define QNaN 0x7FC00000 const vector unsigned int testData = (vector unsigned int)( QNaN, 0, 0, QNaN ); vector float test = vec_ld( 0, (float*) &testData ); return ! vec_any_ge( test, *in ); } Now generating: _AreSecondAndThirdElementsBothNegative: mfspr r2, 256 oris r4, r2, 49152 mtspr 256, r4 li r4, lo16(LCPI1_0) lis r5, ha16(LCPI1_0) addi r6, r1, -16 lvx v0, r5, r4 stvx v0, 0, r6 lvx v1, 0, r3 vcmpgefp. v0, v0, v1 mfcr r3, 2 rlwinm r3, r3, 27, 31, 31 xori r3, r3, 1 cntlzw r3, r3 srwi r3, r3, 5 mtspr 256, r2 blr llvm-svn: 27352	2006-04-02 05:30:25 +00:00
Chris Lattner	70ec96fa32	Adjust to change in Intrinsics.gen interface. llvm-svn: 27344	2006-04-02 03:35:01 +00:00
Chris Lattner	1b2436a624	add valuemapper support for inline asm llvm-svn: 27332	2006-04-01 23:17:11 +00:00
Chris Lattner	6cf4914fd4	Fix InstCombine/2006-04-01-InfLoop.ll llvm-svn: 27330	2006-04-01 22:05:01 +00:00
Chris Lattner	dcd0792622	Fold A^(B&A) -> (B&A)^A Fold (B&A)^A == ~B & A This implements InstCombine/xor.ll:test2[56] llvm-svn: 27328	2006-04-01 08:03:55 +00:00
Chris Lattner	8d1d8d364c	If we can look through vector operations to find the scalar version of an extract_element'd value, do so. llvm-svn: 27323	2006-03-31 23:01:56 +00:00
Chris Lattner	92346c315e	extractelement(undef,x) -> undef llvm-svn: 27300	2006-03-31 18:25:14 +00:00
Chris Lattner	612fa8e6f3	Fix Transforms/InstCombine/2006-03-30-ExtractElement.ll llvm-svn: 27261	2006-03-30 22:02:40 +00:00
Chris Lattner	42e0ba09aa	teach the inliner to work with packed constants llvm-svn: 27161	2006-03-27 05:50:18 +00:00
Chris Lattner	d70d9f5b24	Don't crash on packed logical ops llvm-svn: 27125	2006-03-25 21:58:26 +00:00
Chris Lattner	f365f5f0c1	Fix spello llvm-svn: 27052	2006-03-24 07:14:34 +00:00
Chris Lattner	5821a6a17a	add the actual cost to the debug info llvm-svn: 27051	2006-03-24 07:14:00 +00:00
Jim Laskey	8f64426f5c	Strip changes to llvm.dbg intrinsics. llvm-svn: 26993	2006-03-23 18:11:33 +00:00
Jim Laskey	83f99115db	Can't combine anymore - we don't have a chain through llvm.dbg intrinsics. llvm-svn: 26992	2006-03-23 18:10:42 +00:00
Chris Lattner	7d80b4f366	silence a bogus gcc warning llvm-svn: 26953	2006-03-22 17:27:24 +00:00
Chris Lattner	d783c76c18	Teach cee to propagate through switch statements. This implements Transforms/CorrelatedExprs/switch.ll Patch contributed by Eric Kidd! llvm-svn: 26872	2006-03-19 19:37:24 +00:00
Evan Cheng	c28282bd87	- Fixed a bogus if condition. - Added more debugging info. - Allow reuse of IV of negative stride. e.g. -4 stride == 2 * iv of -2 stride. llvm-svn: 26841	2006-03-18 08:03:12 +00:00
Evan Cheng	f09f0ebd48	Sort StrideOrder so we can process the smallest strides first. This allows for more IV reuses. llvm-svn: 26837	2006-03-18 00:44:49 +00:00
Evan Cheng	4520698820	Allow users of iv / stride to be rewritten with expression that is a multiply of a smaller stride even if they have a common loop invariant expression part. llvm-svn: 26828	2006-03-17 19:52:23 +00:00
Evan Cheng	3df447d354	For each loop, keep track of all the IV expressions inserted indexed by stride. For a set of uses of the IV of a stride which is a multiple of another stride, do not insert a new IV expression. Rather, reuse the previous IV and rewrite the uses as uses of IV expression multiplied by the factor. e.g. x = 0 ...; x ++ y = 0 ...; y += 4 then use of y can be rewritten as use of 4*x for x86. llvm-svn: 26803	2006-03-16 21:53:05 +00:00
Chris Lattner	6d6084fd04	Teach the strip pass to strip type names in addition to value names. This is fallout from the type/value split in the symtab long long ago :) llvm-svn: 26785	2006-03-15 19:22:41 +00:00
Chris Lattner	c5f866bb4a	Implement a FIXME, recursively reassociating A*A*B + A*A*C --> A*(A*B+A*C) --> A*(A*(B+C)) This implements Reassociate/mul-factor3.ll llvm-svn: 26757	2006-03-14 16:04:29 +00:00
Chris Lattner	2fc319d444	extract some code into a method, no functionality change llvm-svn: 26755	2006-03-14 07:11:11 +00:00
Chris Lattner	d6bde46d85	Promote shifts by a constant to multiplies so that we can reassociate (x<<1)+(y<<1) -> (X+Y)<<1. This implements Transforms/Reassociate/shift-factor.ll llvm-svn: 26753	2006-03-14 06:55:18 +00:00
Evan Cheng	c567c4efbb	Added target lowering hooks which LSR consults to make more intelligent transformation decisions. llvm-svn: 26738	2006-03-13 23:14:23 +00:00
Jim Laskey	acb6e34277	Handle the removal of the debug chain. llvm-svn: 26729	2006-03-13 13:07:37 +00:00
Chris Lattner	60f6833376	use autogenerated side-effect information llvm-svn: 26673	2006-03-09 22:38:10 +00:00
Chris Lattner	6b7847a5bc	fix a pasto llvm-svn: 26627	2006-03-09 06:09:41 +00:00
Chris Lattner	fc34f8bb48	Fix a miscompilation of 188.ammp with the new CFE. 188.ammp is accessing arrays out of range in a horrible way, but we shouldn't break it anyway. Details in the comments. llvm-svn: 26606	2006-03-08 01:05:29 +00:00
Jim Laskey	69effa2325	Switch to using a numeric id for anchors. llvm-svn: 26598	2006-03-07 20:53:47 +00:00
Chris Lattner	7b87fd53f9	Fix ConstantMerge/2006-03-07-DontMergeDiffSections.ll, a problem Jim hypotheticalized about, where we would incorrectly merge two globals in different sections. llvm-svn: 26597	2006-03-07 17:56:59 +00:00
Chris Lattner	53ef5a032c	Teach the alignment handling code to look through constant expr casts and GEPs llvm-svn: 26580	2006-03-07 01:28:57 +00:00
Chris Lattner	82f2ef20b6	Teach instcombine to increase the alignment of memset/memcpy/memmove when the pointer is known to come from either a global variable, alloca or malloc. This allows us to compile this: P = malloc(28); memset(P, 0, 28); into explicit stores on PPC instead of a memset call. llvm-svn: 26577	2006-03-06 20:18:44 +00:00
Chris Lattner	6bc98653c2	Make vector narrowing more effective, implementing Transforms/InstCombine/vec_narrow.ll. This adds support for narrowing extract_element(insertelement) also. llvm-svn: 26538	2006-03-05 00:22:33 +00:00
Chris Lattner	4c065091d8	Add factoring of multiplications, e.g. turning A*A+A*B into A*(A+B). Testcase here: Transforms/Reassociate/mulfactor.ll llvm-svn: 26524	2006-03-04 09:31:13 +00:00
Chris Lattner	32c01df299	Canonicalize (X+C1)*C2 -> X*C2+C1*C2 This implements Transforms/InstCombine/add.ll:test31 llvm-svn: 26519	2006-03-04 06:04:02 +00:00
Chris Lattner	681ef2f083	Change this to work with renamed intrinsics. llvm-svn: 26484	2006-03-03 01:34:17 +00:00
Chris Lattner	ea7986aeca	Make this work with renamed intrinsics. llvm-svn: 26482	2006-03-03 01:30:23 +00:00
Chris Lattner	85dda9a2bd	Generalize the REM folding code to handle another case Nick Lewycky pointed out: realize the AND can provide factors and look through Casts. llvm-svn: 26469	2006-03-02 06:50:58 +00:00
Chris Lattner	c5b6c9a12a	Fix a regression in a patch from a couple of days ago. This fixes Transforms/InstCombine/2006-02-28-Crash.ll llvm-svn: 26427	2006-02-28 19:47:20 +00:00
Chris Lattner	b70f141893	Implement rem.ll:test[7-9] and PR712 llvm-svn: 26415	2006-02-28 05:49:21 +00:00
Chris Lattner	2a7c7b8bab	Simplify some code now that the RHS of a rem can't be 0 llvm-svn: 26413	2006-02-28 05:40:55 +00:00
Chris Lattner	0de4a8d7b7	Rearrange some code, fold "rem X, 0", implementing rem.ll:test6 llvm-svn: 26411	2006-02-28 05:30:45 +00:00
Chris Lattner	c7bfed0f7b	Merge two almost-identical pieces of code. Make this code more powerful by using ComputeMaskedBits instead of looking for an AND operand. This lets us fold this: int %test23(int %a) { %tmp.1 = and int %a, 1 %tmp.2 = seteq int %tmp.1, 0 %tmp.3 = cast bool %tmp.2 to int ;; xor tmp1, 1 ret int %tmp.3 } into: xor (and a, 1), 1 llvm-svn: 26396	2006-02-27 02:38:23 +00:00
Chris Lattner	f5c8a0b83f	Fold (A^B) == A -> B == 0 and (A-B) == A -> B == 0 llvm-svn: 26394	2006-02-27 01:44:11 +00:00
Chris Lattner	f78df7c14d	Fold (X|C1)^C2 -> X^(C1|C2) when possible. This implements InstCombine/or.ll:test23. llvm-svn: 26385	2006-02-26 19:57:54 +00:00
Chris Lattner	b580d26e7d	Fix a problem that Nate noticed that boils down to an over conservative check in the code that does "select C, (X+Y), (X-Y) --> (X+(select C, Y, (-Y)))". We now compile this loop: LBB1_1: ; no_exit add r6, r2, r3 subf r3, r2, r3 cmpwi cr0, r2, 0 addi r7, r5, 4 lwz r2, 0(r5) addi r4, r4, 1 blt cr0, LBB1_4 ; no_exit LBB1_3: ; no_exit mr r3, r6 LBB1_4: ; no_exit cmpwi cr0, r4, 16 mr r5, r7 bne cr0, LBB1_1 ; no_exit into this instead: LBB1_1: ; no_exit srawi r6, r2, 31 add r2, r2, r6 xor r6, r2, r6 addi r7, r5, 4 lwz r2, 0(r5) addi r4, r4, 1 add r3, r3, r6 cmpwi cr0, r4, 16 mr r5, r7 bne cr0, LBB1_1 ; no_exit llvm-svn: 26356	2006-02-24 18:05:58 +00:00
Chris Lattner	e5521db5bc	Fix Regression/Transforms/LoopUnswitch/2006-02-22-UnswitchCrash.ll, which caused SPASS to fail building last night. We can't trivially unswitch a loop if the exit block has phi nodes in it, because we don't know which predecessor to use. llvm-svn: 26320	2006-02-22 23:55:00 +00:00
Chris Lattner	8a5a324dac	Add some comments, simplify some code, and fix a bug that caused rewriting to rewrite with the wrong value. llvm-svn: 26311	2006-02-22 06:37:14 +00:00
Chris Lattner	c2e3a7a4ce	improved support for branch folding, still not enabled. llvm-svn: 26289	2006-02-18 07:57:38 +00:00
Jeff Cohen	0add83e969	Fix bugs identified by VC++. llvm-svn: 26287	2006-02-18 03:20:33 +00:00
Chris Lattner	19fa8ac938	Implement deletion of dead blocks, currently disabled. llvm-svn: 26285	2006-02-18 02:42:34 +00:00
Chris Lattner	cb853de534	a previous patch completely disabled trivial unswitching, this fixes it. Thanks to nate for pointing this out :) llvm-svn: 26280	2006-02-18 01:32:04 +00:00
Chris Lattner	29f771ba21	initial trivial support for folding branches that have now-constant destinations. llvm-svn: 26279	2006-02-18 01:27:45 +00:00
Chris Lattner	8e44ff50b0	When unswitching a loop, make sure to update loop info with exit blocks in the right loop. llvm-svn: 26277	2006-02-18 00:55:32 +00:00
Chris Lattner	d95665188b	Fix Transforms/SimplifyCFG/2006-02-17-InfiniteUnroll.ll llvm-svn: 26275	2006-02-18 00:33:17 +00:00
Chris Lattner	baddba41c7	Fix loops where the header has an exit, fixing a loop-unswitch crash on crafty llvm-svn: 26258	2006-02-17 06:39:56 +00:00
Chris Lattner	6fd136239b	start of some new simplification code, not thoroughly tested, use at your own risk :) llvm-svn: 26248	2006-02-17 00:31:07 +00:00
Nate Begeman	8a77efe4f7	Rework the SelectionDAG-based implementations of SimplifyDemandedBits and ComputeMaskedBits to match the new improved versions in instcombine. Tested against all of multisource/benchmarks on ppc. llvm-svn: 26238	2006-02-16 21:11:51 +00:00
Chris Lattner	fa335f6083	Change SplitBlock to increment a BasicBlock::iterator, not an Instruction*. Apparently they do different things :) This fixes a testcase that nate reduced from spass. Also included are a couple minor code changes that don't affect the generated code at all. llvm-svn: 26235	2006-02-16 19:36:22 +00:00
Jeff Cohen	55f63f1b53	Fix VC++ warning. llvm-svn: 26228	2006-02-16 04:07:37 +00:00
Chris Lattner	ff42e81028	fix a bug where we unswitched the wrong way llvm-svn: 26225	2006-02-16 01:24:41 +00:00
Chris Lattner	fdff0bb43e	Implement trivial unswitching for switch stmts. This allows us to trivial unswitch this loop on 2 before sweating to unswitch on 1/3. void test4(int N, int i, int C, int *P, int *Q) { int j; for (j = 0; j < N; ++j) { switch (C) { // general unswitching. default: P[i+j] = 0; break; case 1: Q[i+j] = 0; break; case 3: P[i+j] = Q[i+j]; break; case 2: break; // TRIVIAL UNSWITCH on C==2 } } } llvm-svn: 26223	2006-02-15 22:52:05 +00:00
Chris Lattner	e5cb76d744	make "trivial" unswitching significantly more general. It can now handle this for example: for (j = 0; j < N; ++j) { // trivial unswitch if (C) P[i+j] = 0; } turning it into the obvious code without bothering to duplicate an empty loop. llvm-svn: 26220	2006-02-15 22:03:36 +00:00
Andrew Lenharth	47da60130a	fix a bunch of alpha regressions. see bug 709 llvm-svn: 26218	2006-02-15 21:13:37 +00:00
Chris Lattner	65152d80ec	Checking the wrong value. This caused us to emit silly code like Y = seteq bool X, true instead of just using X :) llvm-svn: 26215	2006-02-15 19:05:52 +00:00
Chris Lattner	01db04efb0	more refactoring, no functionality change. llvm-svn: 26194	2006-02-15 01:44:42 +00:00
Chris Lattner	b0cbe7106e	pull some code out into a function llvm-svn: 26191	2006-02-15 00:07:43 +00:00
Chris Lattner	9c5693fb2a	Canonicalize inner loops before outer loops. Inner loop canonicalization can provide work for the outer loop to canonicalize. This fixes a case that breaks unswitching. llvm-svn: 26189	2006-02-14 23:06:02 +00:00
Chris Lattner	cffbbee8d1	When splitting exit edges to canonicalize loops, make sure to put the new block in the appropriate loop nest. Third time is the charm, right? llvm-svn: 26187	2006-02-14 22:34:08 +00:00
Chris Lattner	0b8ec1a132	Use statistics to keep track of what flavors of loops we are unswitching llvm-svn: 26157	2006-02-14 01:01:41 +00:00
Chris Lattner	8b10ab3002	Implement Instcombine/and.ll:test34 llvm-svn: 26155	2006-02-13 23:07:23 +00:00
Chris Lattner	7d8522884b	If any of the sign extended bits are demanded, the input sign bit is demanded for a sign extension. This fixes InstCombine/2006-02-13-DemandedMiscompile.ll and Ptrdist/bc. llvm-svn: 26152	2006-02-13 22:41:07 +00:00
Chris Lattner	68e7475777	Be careful not to request or look at bits shifted in from outside the size of the input. This fixes the mediabench/gsm/toast failure last night. llvm-svn: 26138	2006-02-13 06:09:08 +00:00
Chris Lattner	f5b4ef7f58	remove some more dead special case code llvm-svn: 26135	2006-02-12 08:07:37 +00:00
Chris Lattner	5b2edb1fca	Eliminate special case hacks that are superseded by general purpose hacks llvm-svn: 26134	2006-02-12 08:02:11 +00:00
Chris Lattner	ee0f280743	Three changes: 1. Teach GetConstantInType to handle boolean constants. 2. Teach instcombine to fold (compare X, CST) when X has known 0/1 bits. Testcase here: set.ll:test22 3. Improve the "(X >> c1) & C2 == 0" folding code to allow a noop cast between the shift and and. More aggressive bitfolding for other reasons was turning signed shr's into unsigned shr's, leaving the noop cast in the way. llvm-svn: 26131	2006-02-12 02:07:56 +00:00
Chris Lattner	02f53ad3a2	Revert my last patch. It too breaks stuff llvm-svn: 26128	2006-02-12 01:59:10 +00:00
Chris Lattner	35248e06bc	Fix for my previously reverted patch llvm-svn: 26126	2006-02-11 21:24:54 +00:00
Chris Lattner	0157e7f55b	Port the recent innovations in ComputeMaskedBits to SimplifyDemandedBits. This allows us to simplify on conditions where bits are not known, but they are not demanded either! This also fixes a couple of bugs in ComputeMaskedBits that were exposed during this work. In the future, swaths of instcombine should be removed, as this code subsumes a bunch of ad-hockery. llvm-svn: 26122	2006-02-11 09:31:47 +00:00
Chris Lattner	b24ce3a2a8	revert my previous change, it exposed other problems. llvm-svn: 26121	2006-02-11 08:47:47 +00:00
Chris Lattner	05bf90dddf	Make this check stricter. Disallow loop exit blocks from being shared by loops and their subloops. llvm-svn: 26118	2006-02-11 02:13:17 +00:00
Chris Lattner	a6ae101afa	remove dead expr llvm-svn: 26116	2006-02-11 01:43:37 +00:00
Chris Lattner	fbadd7e1ee	implement unswitching of loops with switch stmts and selects in them llvm-svn: 26114	2006-02-11 00:43:37 +00:00
Chris Lattner	f1b151684d	Update PHI nodes in successors of exit blocks. llvm-svn: 26113	2006-02-10 23:26:14 +00:00
Chris Lattner	fe4151efe7	Reform the unswitching code in terms of edge splitting, not block splitting. llvm-svn: 26112	2006-02-10 23:16:39 +00:00
Chris Lattner	ec6b40a093	Fix a case where UnswitchTrivialCondition broke critical edges with phi's in the successors llvm-svn: 26108	2006-02-10 19:08:15 +00:00
Chris Lattner	6e263155a6	add some notes, move some code around. Implement unswitching of loops with branches on partially invariant computations. llvm-svn: 26104	2006-02-10 02:30:37 +00:00
Chris Lattner	4935417a84	Move code around to be more logical, no functionality change. llvm-svn: 26103	2006-02-10 02:01:22 +00:00
Chris Lattner	3fc3148b85	When unswitching a trivial loop, do admit we are doing it! :) llvm-svn: 26102	2006-02-10 01:36:35 +00:00
Chris Lattner	ed7a67b0de	Implement unconditional unswitching of 'trivial' loops, those loops that contain branches in their entry block that control whether or not the loop is a noop or not. llvm-svn: 26101	2006-02-10 01:24:09 +00:00
Chris Lattner	4f0e66df6a	Simplify control flow a bit, note that unswitch preserves canonical loop form llvm-svn: 26098	2006-02-09 22:15:42 +00:00
Chris Lattner	8976219850	Make the threshold a parameter llvm-svn: 26093	2006-02-09 20:15:48 +00:00
Chris Lattner	2826e0511b	Simplify the loop-unswitch pass, by not even trying to unswitch loops with uses of loop values outside the loop. We need loop-closed SSA form to do this right, or to use SSA rewriting if we really care. llvm-svn: 26089	2006-02-09 19:14:52 +00:00
Chris Lattner	24cd2fa269	Fix 80-column violations llvm-svn: 26088	2006-02-09 07:41:14 +00:00
Chris Lattner	4534dd59a3	Enhance MVIZ in three ways: 1. Teach it new tricks: in particular how to propagate through signed shr and sexts. 2. Teach it to return a bitset of known-1 and known-0 bits, instead of just zero. 3. Teach instcombine (AND X, C) to fold when we know all C bits of X. This implements Regression/Transforms/InstCombine/bittest.ll, and allows future things to be simplified. llvm-svn: 26087	2006-02-09 07:38:58 +00:00
Chris Lattner	ab2dc4d70d	Simplify some code, reducing calls to MaskedValueIsZero. Implement a minor optimization where we reduce the number of bits in AND masks when possible. llvm-svn: 26056	2006-02-08 07:34:50 +00:00
Chris Lattner	5997cf9381	Use EraseInstFromFunction in a few cases to put the uses of the removed instruction onto the worklist (in case they are now dead). Add a really trivial local DSE implementation to help out bitfield code. We now fold this: struct S { unsigned char a : 1, b : 1, c : 1, d : 2, e : 3; S(); }; S::S() : a(0), b(0), c(1), d(0), e(6) {} to this: void %_ZN1SC1Ev(%struct.S* %this) { entry: %tmp.1 = getelementptr %struct.S* %this, int 0, uint 0 store ubyte 38, ubyte* %tmp.1 ret void } much earlier (in gccas instead of only in gccld after DSE runs). llvm-svn: 26050	2006-02-08 03:25:32 +00:00
Chris Lattner	06a0ed1ee0	Implement some more interesting select sccp cases. This implements: test/Regression/Transforms/SCCP/select.ll llvm-svn: 26049	2006-02-08 02:38:11 +00:00
Chris Lattner	ddba3289b5	Fix a problem in my patch yesterday, causing a miscompilation of 176.gcc llvm-svn: 26045	2006-02-08 01:20:23 +00:00
Chris Lattner	44314827d6	Fix Transforms/InstCombine/2006-02-07-SextZextCrash.ll llvm-svn: 26040	2006-02-07 19:07:40 +00:00
Chris Lattner	92a6865321	Generalize MaskedValueIsZero into a ComputeMaskedNonZeroBits function, which is just as efficient as MVIZ and is also more general. Fix a few minor bugs introduced in recent patches llvm-svn: 26036	2006-02-07 08:05:22 +00:00
Chris Lattner	c3ebf40031	Make MaskedValueIsZero take a uint64_t instead of a ConstantIntegral as a mask. This allows the code to be simpler and more efficient. Also, generalize some of the cases in MVIZ a bit, making it slightly more aggressive. llvm-svn: 26035	2006-02-07 07:27:52 +00:00
Chris Lattner	77defbae0a	Use Type::getIntegralTypeMask() to simplify some code llvm-svn: 26034	2006-02-07 07:00:41 +00:00
Chris Lattner	2590e511d8	Implement the beginnings of a facility for simplifying expressions based on 'demanded bits', inspired by Nate's work in the dag combiner. This isn't complete, but needs two unrelated instcombiner changes to continue. llvm-svn: 26033	2006-02-07 06:56:34 +00:00
Chris Lattner	2e90b732fa	Turn A % (C << N), where C is 2^k, into A & ((C << N)-1) [urem only]. Turn A / (C1 << N), where C1 is "1<<C2" into A >> (N+C2) [udiv only]. Tested with: rem.ll:test5, div.ll:test10 llvm-svn: 26003	2006-02-05 07:54:04 +00:00
Chris Lattner	d30c4991a1	Use SCEVExpander::InsertCastOfTo instead of our own code. This reduces #LLVM LOC, and auto-cse's cast instructions. llvm-svn: 25974	2006-02-04 09:52:43 +00:00
Chris Lattner	2959f0003e	Fix two significant bugs in LSR: 1. When rewriting code in outer loops, sometimes we would insert code into inner loops that is invariant in that loop. 2. Notice that 4*(2+x) is 8+4*x and use that to simplify expressions. This is a performance neutral change. llvm-svn: 25964	2006-02-04 07:36:50 +00:00
Jeff Cohen	15a8c15a1f	Improve compatibility with VC2005, patch by Morten Ofstad! llvm-svn: 25661	2006-01-26 20:41:32 +00:00
Chris Lattner	120f31b1fd	teach the cloner to handle inline asms llvm-svn: 25633	2006-01-26 01:55:22 +00:00
Chris Lattner	c0f633a598	Fix Regression/Transforms/ScalarRepl/2006-01-24-IllegalUnionPromoteCrash.ll llvm-svn: 25587	2006-01-24 19:36:27 +00:00
Chris Lattner	00fcdfef0d	rename method llvm-svn: 25572	2006-01-24 04:16:34 +00:00
Chris Lattner	37992b34c2	When cloning a module, clone the inline asm. llvm-svn: 25559	2006-01-23 23:06:28 +00:00
Chris Lattner	5774040c09	add a bunch more optimizations for unary double math functions llvm-svn: 25530	2006-01-23 06:24:46 +00:00
Chris Lattner	57a2863cbb	Refactor/genericize this, no functionality change llvm-svn: 25525	2006-01-23 05:57:36 +00:00
Chris Lattner	c597b8a55e	Make iostream #inclusion explicit llvm-svn: 25514	2006-01-22 23:32:06 +00:00
Chris Lattner	33081b4648	Make this more efficient in the following ways: 1. Do not statically construct a map when the program starts up, this is expensive and cannot be optimized. Instead, create a list. 2. Do not insert entries for all function in the module into a hashmap that lives the full life of the compiler. llvm-svn: 25512	2006-01-22 23:10:26 +00:00
Chris Lattner	469640e506	Add explicit #includes of <iostream> llvm-svn: 25509	2006-01-22 22:53:01 +00:00
Chris Lattner	0d4ebfc15b	Several non-functionality changing changes: 1. Use the varargs version of getOrInsertFunction to simplify code. 2. remove #include 3. Reduce the number of #ifdef's. 4. remove extraneous vertical whitespace. llvm-svn: 25508	2006-01-22 22:35:08 +00:00
Robert Bocchino	027c18da98	ConstantFoldLoadThroughGEPConstantExpr wasn't handling pointers to packed types correctly. llvm-svn: 25470	2006-01-19 23:53:23 +00:00
Reid Spencer	ade182125f	For PR696: Don't do floor->floorf conversion if floorf is not available. This checks the compiler's host, not its target, which is incorrect for cross-compilers. Not sure that's important as we don't build many cross-compilers. llvm-svn: 25456	2006-01-19 08:36:56 +00:00
Chris Lattner	e154abf9b3	Implement casts.ll:test26: a cast from float -> double -> integer, doesn't need the float->double part. llvm-svn: 25452	2006-01-19 07:40:22 +00:00
Chris Lattner	7be2203c9f	If not internalizing, don't mark llvm.global[cd]tors const, as a fix for a hypothetical future boog. llvm-svn: 25430	2006-01-19 00:46:54 +00:00
Chris Lattner	d693b7943a	Don't internalize llvm.global[cd]tor unless there are uses of it. This unbreaks front-ends that don't use __main (like the new CFE). llvm-svn: 25429	2006-01-19 00:40:39 +00:00
Chris Lattner	b98282d2d6	Make sure that cloning a module clones its target triple and dependent library list as well. This should help bugpoint. llvm-svn: 25424	2006-01-18 21:32:45 +00:00
Robert Bocchino	e6336a9b69	Constant folding support for the insertelement operation. llvm-svn: 25407	2006-01-17 20:07:07 +00:00
Robert Bocchino	6dce25019d	Lowerpacked and SCCP support for the insertelement operation. llvm-svn: 25406	2006-01-17 20:06:55 +00:00
Chris Lattner	801f47512d	Clean up the FFS optimization code, and make it correctly create the appropriate unsigned llvm.cttz.* intrinsic, fixing the 2005-05-11-Popcount-ffs-fls regression last night. llvm-svn: 25398	2006-01-17 18:27:17 +00:00
Reid Spencer	b4f9a6f110	For PR411: This patch is an incremental step towards supporting a flat symbol table. It de-overloads the intrinsic functions by providing type-specific intrinsics and arranging for automatically upgrading from the old overloaded name to the new non-overloaded name. Specifically: llvm.isunordered -> llvm.isunordered.f32, llvm.isunordered.f64 llvm.sqrt -> llvm.sqrt.f32, llvm.sqrt.f64 llvm.ctpop -> llvm.ctpop.i8, llvm.ctpop.i16, llvm.ctpop.i32, llvm.ctpop.i64 llvm.ctlz -> llvm.ctlz.i8, llvm.ctlz.i16, llvm.ctlz.i32, llvm.ctlz.i64 llvm.cttz -> llvm.cttz.i8, llvm.cttz.i16, llvm.cttz.i32, llvm.cttz.i64 New code should not use the overloaded intrinsic names. Warnings will be emitted if they are used. llvm-svn: 25366	2006-01-16 21:12:35 +00:00
Chris Lattner	307b7ea15f	fix a crash due to missing parens llvm-svn: 25363	2006-01-16 19:47:21 +00:00
Chris Lattner	0de2c7d3d8	This pass has never worked correctly. Remove. llvm-svn: 25349	2006-01-16 01:06:00 +00:00
Chris Lattner	f6d6823f09	Let the inliner update the callgraph to reflect the changes it makes, instead of doing it ourselves. This fixes Transforms/Inline/2006-01-14-CallGraphUpdate.ll llvm-svn: 25321	2006-01-14 20:09:18 +00:00
Chris Lattner	0841fb1d4c	Teach the inliner to update the CallGraph itself, and have it add edges to llvm.stacksave/restore when it inserts calls to them. llvm-svn: 25320	2006-01-14 20:07:50 +00:00
Chris Lattner	ef530c24c1	FunctionPass's cannot do IPO things. llvm-svn: 25315	2006-01-14 19:30:35 +00:00
Nate Begeman	82049eba2c	Add bswap intrinsics as documented in the Language Reference llvm-svn: 25309	2006-01-14 01:25:24 +00:00
Robert Bocchino	a83529678e	Added instcombine support for extractelement. llvm-svn: 25299	2006-01-13 22:48:06 +00:00
Chris Lattner	5fba6e6696	it is ok to dce stacksave. llvm-svn: 25295	2006-01-13 21:31:54 +00:00
Chris Lattner	503221f5c5	Do a simple instcombine xforms to delete llvm.stackrestore cases. llvm-svn: 25294	2006-01-13 21:28:09 +00:00
Chris Lattner	c66b223b28	Simplify this a tiny bit by using the new IntrinsicInst functionality. llvm-svn: 25292	2006-01-13 20:11:04 +00:00
Chris Lattner	45406c0c53	Permit inlining functions that contain dynamic allocations now that InlineFunction handles this case safely. This implements Transforms/Inline/dynamic_alloca_test.ll. llvm-svn: 25288	2006-01-13 19:35:43 +00:00
Chris Lattner	2be0607a8d	If inlining a call to a function that contains dynamic allocas, wrap the resultant code with llvm.stacksave/llvm.stackrestore intrinsics. llvm-svn: 25286	2006-01-13 19:34:14 +00:00
Chris Lattner	e24f79a032	Use ClonedCodeInfo to avoid another walk over the inlined code, this time in common C cases. llvm-svn: 25285	2006-01-13 19:18:11 +00:00
Chris Lattner	19e6a08d78	Use the ClonedCodeInfo object to avoid scans of the inlined code when it doesn't contain any calls. This is a fairly common case for C++ code, so it will probably speed up the inliner marginally in these cases. llvm-svn: 25284	2006-01-13 19:15:15 +00:00
Chris Lattner	908d79556d	Refactor a bunch of invoke handling stuff out into a new function "HandleInlinedInvoke". No functionality change. llvm-svn: 25283	2006-01-13 19:05:59 +00:00
Chris Lattner	edad1288fd	Allow the code cloning interfaces to capture some important info about the code being cloned if the client wants. llvm-svn: 25281	2006-01-13 18:39:17 +00:00
Chris Lattner	257492c0ab	Fix a bug I noticed by inspection: if the first instruction in the inlined function was not an alloca, we wouldn't check the entry block for any allocas, leading to increased stack space in some cases. In practice, allocas are almost always at the top of the block, so this was never noticed. llvm-svn: 25280	2006-01-13 18:16:48 +00:00
Chris Lattner	49c4d536bd	Fix 80 column violations llvm-svn: 25279	2006-01-13 18:06:56 +00:00
Chris Lattner	0770d8e326	Preserve and update ETForest. Patch by Daniel Berlin llvm-svn: 25203	2006-01-11 05:11:13 +00:00
Chris Lattner	cb36710ff9	Switch these to using ETForest instead of DominatorSet to compute itself. Patch written by Daniel Berlin! llvm-svn: 25202	2006-01-11 05:10:20 +00:00
Chris Lattner	48e4a2ebd8	Switch this to using ETForest instead of DominatorSet to compute itself. Patch written by Daniel Berlin! llvm-svn: 25201	2006-01-11 05:09:40 +00:00
Robert Bocchino	230044839d	Added support for the extractelement operation. llvm-svn: 25181	2006-01-10 19:05:34 +00:00
Robert Bocchino	bd518d153b	Added lower packed support for the extractelement operation. llvm-svn: 25180	2006-01-10 19:05:05 +00:00
Chris Lattner	cda4aa6eb4	Teach loopsimplify to update et-forest. Patch contributed by Daniel Berlin! llvm-svn: 25153	2006-01-09 08:03:08 +00:00
Chris Lattner	9cbfbc21bb	fix some 176.gcc miscompilation from my previous patch. llvm-svn: 25137	2006-01-07 01:32:28 +00:00
Chris Lattner	330628a6d8	silence some bogus gcc warnings on fenris llvm-svn: 25130	2006-01-06 17:59:59 +00:00
Chris Lattner	eb372a0276	Enhance the shift-shift folding code to allow a no-op cast to occur in between the shifts. This allows us to fold this (which is the 'integer add a constant' sequence from cozmic's scheme compiler): int %x(uint %anf-temporary776) { %anf-temporary777 = shr uint %anf-temporary776, ubyte 1 %anf-temporary800 = cast uint %anf-temporary777 to int %anf-temporary804 = shl int %anf-temporary800, ubyte 1 %anf-temporary805 = add int %anf-temporary804, -2 %anf-temporary806 = or int %anf-temporary805, 1 ret int %anf-temporary806 } into this: int %x(uint %anf-temporary776) { %anf-temporary776 = cast uint %anf-temporary776 to int %anf-temporary776.mask1 = add int %anf-temporary776, -2 %anf-temporary805 = or int %anf-temporary776.mask1, 1 ret int %anf-temporary805 } note that instcombine already knew how to eliminate the AND that the two shifts fold into. This is tested by InstCombine/shift.ll:test26 -Chris llvm-svn: 25128	2006-01-06 07:52:12 +00:00
Chris Lattner	b330939d90	Simplify the code a bit more llvm-svn: 25126	2006-01-06 07:22:22 +00:00
Chris Lattner	145539343f	Extract a bunch of code out of visitShiftInst into FoldShiftByConstant. No functionality changes. llvm-svn: 25125	2006-01-06 07:12:35 +00:00
Chris Lattner	8cdc773748	Pull inline methods out of the pass class definition to make it easier to read the code. Do not internalize debugger anchors. llvm-svn: 25067	2006-01-03 19:13:17 +00:00
Duraid Madina	7a3ad6cae2	getting there... llvm-svn: 25021	2005-12-26 13:48:44 +00:00
Chris Lattner	8c9e14620f	Fix Transforms/ScalarRepl/2005-12-14-UnionPromoteCrash.ll, a crash on undefined behavior in 126.gcc on big-endian systems. llvm-svn: 24708	2005-12-14 17:23:59 +00:00
Reid Spencer	175613adf6	Improve ResolveFunctions to: a) use better local variable names (OldMT -> OldFT) where "M" is used to mean "Function" (perhaps it was previously "Method"?) b) print out the module identifier in a warning message so that it is possible to track down in which module the error occurred. llvm-svn: 24698	2005-12-13 19:56:51 +00:00
Chris Lattner	3b0a62d8a5	Implement a little hack for parity with GCC on crafty. This speeds up 186.crafty by about 16% (from 15.109s to 13.045s) on my system. This turns allocas with unions/casts into scalars. For example crafty has something like this: union doub { unsigned short i[4]; long long d; }; int f(long long a) { return ((union doub){.d=a}).i[1]; } Instead of generating loads and stores to an alloca, we now promote the whole thing to a scalar long value. This implements: Transforms/ScalarRepl/AggregatePromote.ll llvm-svn: 24667	2005-12-12 07:19:13 +00:00
Chris Lattner	077200737c	getRawValue zero extends for unsigned values, use getsextvalue so that we know that small negative values fit into the immediate field of addressing modes. llvm-svn: 24608	2005-12-05 18:23:57 +00:00
Chris Lattner	165998207e	Wrap a long line, never internalize llvm.used. llvm-svn: 24602	2005-12-05 05:07:38 +00:00
Chris Lattner	2820b8c855	Fix SimplifyCFG/2005-12-03-IncorrectPHIFold.ll llvm-svn: 24581	2005-12-03 18:25:58 +00:00
Chris Lattner	dc4ffef633	Fix a bug where we didn't realize that vaarg reads memory. This fixes Transforms/DeadStoreElimination/2005-11-30-vaarg.ll llvm-svn: 24545	2005-11-30 19:38:22 +00:00
Andrew Lenharth	d251192910	a few more comments on the interfaces and functions llvm-svn: 24500	2005-11-28 18:10:59 +00:00
Andrew Lenharth	517caef495	Added documented rsprofiler interface. Also remove new profiler passes, the old ones have been updated to implement the interface. llvm-svn: 24499	2005-11-28 18:00:38 +00:00
Jeff Cohen	7ff44ec372	Fix VC++ warning. llvm-svn: 24496	2005-11-28 06:45:57 +00:00
Andrew Lenharth	93e59f6032	Random sampling (aka Arnold and Ryder) profiling. This is still preliminary, but it works on spec on x86 and alpha. The idea is to allow profiling passes to remember what profiling they inserted, then a random sampling framework is inserted which consists of duplicated basic blocks (without profiling), such that at each backedge in the program and entry into every function, the framework chooses whether to use the instrumented code or the instrumentation free code. The goal of such a framework is to make it reasonably cheap to do random sampling of very expensive profiling products (such as load-value profiling). The code is organized into 3 parts (2 passes) 1) a linked set of profiling passes, which implement an analysis group (linked, like alias analysis are). These insert profiling into the program, and remember what they inserted, so that at a later time they can be queried about any instruction. 2) a pass that handles inserting the random sampling framework. This also has options to control how random samples are chosen. Currently implemented are Global counters, register allocated global counters, and read cycle counter (see? there was a reason for it). The profiling passes are almost identical to the existing ones (block, function, and null profiling is supported right now), and they are valid passes without the sampling framework (hence the existing passes can be unified with the new ones, not done yet). Some things are a bit ugly still, but that should be fixed up soon enough. Other todo? making the counter values not "magic 2^16 -1" values, but dynamically choosable. llvm-svn: 24493	2005-11-28 00:58:09 +00:00
Andrew Lenharth	5fc3794e71	since reg2mem requires it, might as well mention that it preserves it llvm-svn: 24491	2005-11-25 16:04:54 +00:00
Andrew Lenharth	061029dee2	Reg2Mem is something a pass may depend on, so allow that llvm-svn: 24488	2005-11-22 22:14:23 +00:00
Andrew Lenharth	71b09bbb07	turns out, demotion and invokes and critical edges don't mix llvm-svn: 24487	2005-11-22 21:45:19 +00:00
Chris Lattner	9c37f23645	Fix a crash building 176.gcc due to my recent patch, which only fixed half the problem. llvm-svn: 24414	2005-11-18 18:30:47 +00:00
Chris Lattner	3e9e8bd25c	Implement a refinement to the mem2reg algorithm for cases where an alloca has a single def. In this case, look for uses that are dominated by the def and attempt to rewrite them to directly use the stored value. This speeds up mem2reg on these values and reduces the number of phi nodes inserted. This should address PR665. llvm-svn: 24411	2005-11-18 07:31:42 +00:00
Chris Lattner	31dc3827d3	This needs proper dominance llvm-svn: 24410	2005-11-18 07:29:44 +00:00
Chris Lattner	bca0be812d	This was checking the wrong GEP expression. Fixing this fixes a gccas crash compiling mysql reported by Ted Kremenek. llvm-svn: 24402	2005-11-17 19:35:42 +00:00
Andrew Lenharth	d9c13b1336	the pain isn't gone unless the phinodes are spilled too llvm-svn: 24288	2005-11-10 19:39:09 +00:00
Andrew Lenharth	8e66c0c8a9	this works with backedges to the existing entry block a lot better llvm-svn: 24270	2005-11-10 17:35:34 +00:00
Andrew Lenharth	4130a4f061	The pass everyone has been waiting for! Reg2Mem for fun you can opt -reg2mem -mem2reg llvm-svn: 24267	2005-11-10 01:58:38 +00:00
Nate Begeman	848622f87f	Add support for alignment of allocation instructions. Add support for specifying alignment and size of setjmp jmpbufs. No targets currently do anything with this information, nor is it preserved in the bytecode representation. That's coming up next. llvm-svn: 24196	2005-11-05 09:21:28 +00:00
Chris Lattner	16b29e9562	Implement Transforms/TailCallElim/return-undef.ll, a trivial case that has been sitting in my inbox since May 18. :) llvm-svn: 24194	2005-11-05 08:21:11 +00:00
Chris Lattner	dd0c174082	Turn sdiv into udiv if both operands have a clear sign bit. This occurs a few times in crafty: OLD: %tmp.36 = div int %tmp.35, 8 ; <int> [#uses=1] NEW: %tmp.36 = div uint %tmp.35, 8 ; <uint> [#uses=0] OLD: %tmp.19 = div int %tmp.18, 8 ; <int> [#uses=1] NEW: %tmp.19 = div uint %tmp.18, 8 ; <uint> [#uses=0] OLD: %tmp.117 = div int %tmp.116, 8 ; <int> [#uses=1] NEW: %tmp.117 = div uint %tmp.116, 8 ; <uint> [#uses=0] OLD: %tmp.92 = div int %tmp.91, 8 ; <int> [#uses=1] NEW: %tmp.92 = div uint %tmp.91, 8 ; <uint> [#uses=0] Which all turn into shrs. llvm-svn: 24190	2005-11-05 07:40:31 +00:00
Chris Lattner	e9ff0eaf5b	Turn srem -> urem when neither input has their sign bit set. This triggers 8 times in vortex, allowing the srems to be turned into shrs: OLD: %tmp.104 = rem int %tmp.5.i37, 16 ; <int> [#uses=1] NEW: %tmp.104 = rem uint %tmp.5.i37, 16 ; <uint> [#uses=0] OLD: %tmp.98 = rem int %tmp.5.i24, 16 ; <int> [#uses=1] NEW: %tmp.98 = rem uint %tmp.5.i24, 16 ; <uint> [#uses=0] OLD: %tmp.91 = rem int %tmp.5.i19, 8 ; <int> [#uses=1] NEW: %tmp.91 = rem uint %tmp.5.i19, 8 ; <uint> [#uses=0] OLD: %tmp.88 = rem int %tmp.5.i14, 8 ; <int> [#uses=1] NEW: %tmp.88 = rem uint %tmp.5.i14, 8 ; <uint> [#uses=0] OLD: %tmp.85 = rem int %tmp.5.i9, 1024 ; <int> [#uses=2] NEW: %tmp.85 = rem uint %tmp.5.i9, 1024 ; <uint> [#uses=0] OLD: %tmp.82 = rem int %tmp.5.i, 512 ; <int> [#uses=2] NEW: %tmp.82 = rem uint %tmp.5.i1, 512 ; <uint> [#uses=0] OLD: %tmp.48.i = rem int %tmp.5.i.i161, 4 ; <int> [#uses=1] NEW: %tmp.48.i = rem uint %tmp.5.i.i161, 4 ; <uint> [#uses=0] OLD: %tmp.20.i2 = rem int %tmp.5.i.i, 4 ; <int> [#uses=1] NEW: %tmp.20.i2 = rem uint %tmp.5.i.i, 4 ; <uint> [#uses=0] it also occurs 9 times in gcc, but with odd constant divisors (1009 and 61) so the payoff isn't as great. llvm-svn: 24189	2005-11-05 07:28:37 +00:00
Andrew Lenharth	662295587d	make this 64 bit clean, fixed test30 of /Regression/Transforms/InstCombine/add.ll llvm-svn: 24158	2005-11-02 18:35:40 +00:00
Chris Lattner	09efd4e5b6	Limit the search depth of MaskedValueIsZero to 6 instructions, to avoid bad cases. This fixes Markus's second testcase in PR639, and should seal it for good. llvm-svn: 24123	2005-10-31 18:35:52 +00:00
Chris Lattner	27d351f159	This pass is now obsolete since all targets have moved to the SelectionDAG infrastructure and the simple isels have been removed. llvm-svn: 24090	2005-10-29 05:33:46 +00:00
Chris Lattner	752717d4ec	Remove dead #include llvm-svn: 24083	2005-10-29 04:41:30 +00:00
Chris Lattner	ceb9d5adaa	Now that instcombine does this xform, remove it from the -raise pass llvm-svn: 24082	2005-10-29 04:40:23 +00:00
Chris Lattner	8f663e8bbc	Pull some code out into a function, give it the ability to see through +. This allows us to turn code like malloc(4*x+4) -> malloc int, (x+1) llvm-svn: 24081	2005-10-29 04:36:15 +00:00
Chris Lattner	8270c33606	Remove a special case, allowing the general case to handle it. No functionality change. llvm-svn: 24076	2005-10-29 03:19:53 +00:00
Chris Lattner	b9d3ca5c3c	Fix a bit of backwards logic that broke exptree and smg2000 llvm-svn: 24056	2005-10-28 16:27:35 +00:00
Chris Lattner	c4f67e67d2	Do not sink any instruction with side effects, including vaarg. This fixes PR640 llvm-svn: 24046	2005-10-27 17:13:11 +00:00
Chris Lattner	479911f971	Fix #include order llvm-svn: 24044	2005-10-27 16:34:00 +00:00
John Criswell	fe5f33b120	Move some constant folding code shared by Analysis and Transform passes into the LLVMAnalysis library. This allows LLVMTranform and LLVMTransformUtils to be archives and linked with LLVMAnalysis.a, which provides any missing definitions. llvm-svn: 24036	2005-10-27 15:54:34 +00:00
Chris Lattner	c6372cca78	Fix typo llvm-svn: 24033	2005-10-27 06:26:26 +00:00
Chris Lattner	0fe7551bc0	Teach instcombine to promote stuff like (cast (malloc sbyte, 8X) to int) into: malloc int, (2*X) llvm-svn: 24032	2005-10-27 06:24:46 +00:00
Chris Lattner	b3ecf96900	Promote cases like cast (malloc sbyte, 100) to int* into (malloc [25 x int]) directly without having to convert to (malloc [100 x sbyte]) first. llvm-svn: 24031	2005-10-27 06:12:00 +00:00
Chris Lattner	bb17180a23	Minor change to this file to support obscure cases with constant array amounts llvm-svn: 24030	2005-10-27 05:53:56 +00:00
John Criswell	94b7bea733	1. Remove libraries no longer created from the list of libraries linked into the SparcV9 JIT. 2. Make LLVMTransformUtils a relinked object file and always link it before LLVMAnalysis.a. These two libraries have circular dependencies on each other, which creates problems when building the SparcV9 JIT. This change fixes the dependency problems on all platforms with a minimum of fuss. llvm-svn: 24023	2005-10-26 20:35:13 +00:00
Chris Lattner	38a1b00a0f	fold nested and's early to avoid inefficiencies in MaskedValueIsZero. This fixes a very slow compile in PR639. llvm-svn: 24011	2005-10-26 17:18:16 +00:00
Jeff Cohen	2b8cbf319c	Update Visual Studio projects to reflect moved file. llvm-svn: 23998	2005-10-26 05:36:51 +00:00
Alkis Evlogimenos	cb67b650b5	Stop using deprecated types llvm-svn: 23973	2005-10-25 11:18:06 +00:00
Chris Lattner	46705b2f2d	Handle allocations that, even after removing dead uses, still have more than one use (but one is a cast). This handles the very common case of: X = alloc [n x byte] Y = cast X to somethingbetter seteq X, null In order to avoid infinite looping when there are multiple casts, we only allow this if the xform is strictly increasing the alignment of the allocation. llvm-svn: 23961	2005-10-24 06:35:18 +00:00
Chris Lattner	355ecc09f8	Fix a bug where we would 'promote' an allocation from one type to another where the second has less alignment required. If we had explicit alignment support in the IR, we could handle this case, but we can't until we do. llvm-svn: 23960	2005-10-24 06:26:18 +00:00
Chris Lattner	ac87beb03a	Before promoting a malloc type, remove dead uses. This makes instcombine more effective at promoting these allocations, catching them earlier in the compile process. llvm-svn: 23959	2005-10-24 06:22:12 +00:00
Chris Lattner	216be91817	Pull some code out into a function, no functionality change llvm-svn: 23958	2005-10-24 06:03:58 +00:00
Chris Lattner	b37336978f	Remove some beta code that no longer has an owner. llvm-svn: 23944	2005-10-24 02:32:41 +00:00
Chris Lattner	f9998d9704	Do not build the ProfilePaths directory anymore llvm-svn: 23943	2005-10-24 02:31:49 +00:00
Chris Lattner	bde3845548	DONT_BUILD_RELINKED is gone and implied by BUILD_ARCHIVE now llvm-svn: 23940	2005-10-24 02:26:13 +00:00
Chris Lattner	8c087e962c	Only build .a file versions of these libraries, instead of .a and .o versions. This should speed up build times. llvm-svn: 23933	2005-10-24 01:59:48 +00:00
Chris Lattner	bd77fac034	Make sure that anything using the ADCE pass pulls in the UnifyFunctionExitNodes code llvm-svn: 23931	2005-10-24 01:40:23 +00:00
Jeff Cohen	11e26b52b2	When a function takes a variable number of pointer arguments, with a zero pointer marking the end of the list, the zero must be cast to the pointer type. An un-cast zero is a 32-bit int, and at least on x86_64, gcc will not extend the zero to 64 bits, thus allowing the upper 32 bits to be random junk. The new END_WITH_NULL macro may be used to annotate such a function so that GCC (version 4 or newer) will detect the use of an un-cast zero at compile time. llvm-svn: 23888	2005-10-23 04:37:20 +00:00
Chris Lattner	5df0e36e98	My previous patch was too conservative. Reject FP and void types, but do allow pointer types. llvm-svn: 23859	2005-10-21 05:45:41 +00:00
Chris Lattner	0c0b38bb4c	Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me from an inner loop like this: LBB_RateConvertMono8AltiVec_2: ; no_exit lis r2, ha16(.CPI_RateConvertMono8AltiVec_0) lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2) fmr f3, f3 fadd f0, f2, f0 fadd f3, f0, f3 fcmpu cr0, f3, f1 bge cr0, LBB_RateConvertMono8AltiVec_2 ; no_exit to an inner loop like this: LBB_RateConvertMono8AltiVec_1: ; no_exit fsub f2, f2, f1 fcmpu cr0, f2, f1 fmr f0, f2 bge cr0, LBB_RateConvertMono8AltiVec_1 ; no_exit Doh! good catch! llvm-svn: 23838	2005-10-20 04:47:10 +00:00
Chris Lattner	45517baf9f	Add an option to this pass. If it is set, we are allowed to internalize all but main. If it's not set, we can still internalize, but only if an explicit symbol list is provided. llvm-svn: 23783	2005-10-18 06:29:22 +00:00
Chris Lattner	da1b152c43	Make this work for FP constantexprs llvm-svn: 23773	2005-10-17 20:18:38 +00:00
Chris Lattner	7fde91e365	Oops, X+0.0 isn't foldable, but X+-0.0 is. llvm-svn: 23772	2005-10-17 17:56:38 +00:00
Chris Lattner	32979336a7	relax this a bit, as we only support the default rounding mode llvm-svn: 23771	2005-10-17 17:49:32 +00:00
Chris Lattner	192cd18f53	Fix (hopefully the last) issue where LSR is nondeterministic. When pulling out CSE's of base expressions it could build a result whose order was nondet. llvm-svn: 23698	2005-10-11 18:41:04 +00:00
Chris Lattner	5c9d63da31	Fix another problem where LSR was being nondeterministic. Also remove elements from the end of a vector instead of the beginning llvm-svn: 23697	2005-10-11 18:30:57 +00:00
Chris Lattner	b7a3894e7c	Fix another lsr-is-nondeterministic case llvm-svn: 23695	2005-10-11 18:17:57 +00:00
Chris Lattner	03b9eb506c	Make MaskedValueIsZero a bit more aggressive llvm-svn: 23677	2005-10-09 22:08:50 +00:00
Chris Lattner	62010c450f	Fix funky xcode indentation llvm-svn: 23674	2005-10-09 06:36:35 +00:00
Chris Lattner	eb4be8b942	Hrm, you didn't see this. llvm-svn: 23673	2005-10-09 06:24:02 +00:00
Chris Lattner	4ea0a3eaac	Fix a source of non-determinism in the backend: the order of processing IV strides depended on the pointer order of the strides in memory. Non-determinism is bad. llvm-svn: 23672	2005-10-09 06:20:55 +00:00
Jeff Cohen	572910c9a2	Remove useless variable. llvm-svn: 23656	2005-10-07 05:28:29 +00:00
Chris Lattner	20b0754c41	Fix DemoteRegToStack on an invoke. This fixes PR634. llvm-svn: 23618	2005-10-04 00:44:01 +00:00
Chris Lattner	4c3b2b536c	Clean up the code a bit. Use isInstructionTriviallyDead to be more aggressive and more correct than use_empty(). This fixes PR635 and SimplifyCFG/2005-10-02-InvokeSimplify.ll llvm-svn: 23616	2005-10-03 23:43:43 +00:00
Chris Lattner	f07a587c79	Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In particular, it should realize that phi's use their values in the pred block not the phi block itself. This change turns our em3d loop from this: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r2, 0 b LBB_test_6 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit or r2, r6, r6 lwz r6, 0(r3) cmpw cr0, r6, r5 beq cr0, LBB_test_6 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r2, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; endif.loopexit.loopexit_crit_edge addi r3, r2, 1 blr LBB_test_6: ; loopexit or r3, r2, r2 blr into: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r2, 0 b LBB_test_5 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 or r2, r6, r6 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 or r2, r6, r6 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; loopexit or r3, r2, r2 blr Unfortunately, this is actually worse code, because the register coalescer is getting confused somehow. If it were doing its job right, it could turn the code into this: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r6, 0 b LBB_test_5 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; loopexit or r3, r6, r6 blr ... which I'll work on next. :) llvm-svn: 23604	2005-10-03 02:50:05 +00:00
Chris Lattner	e4ed42a426	Refactor some code into a function llvm-svn: 23603	2005-10-03 01:04:44 +00:00
Chris Lattner	360928dbed	This break is bogus and I have no idea why it was there. Basically it prevents memoizing code when IV's are used by phinodes outside of loops. In a simple example, we were getting this code before (note that r6 and r7 are isomorphic IV's): li r6, 0 or r7, r6, r6 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 or r2, r7, r7 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r2, r7, 1 addi r7, r7, 1 addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit Now we get: li r6, 0 LBB_test_3: ; no_exit or r2, r6, r6 lwz r6, 0(r3) cmpw cr0, r6, r5 beq cr0, LBB_test_6 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r2, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit this was noticed in em3d. llvm-svn: 23602	2005-10-03 00:37:33 +00:00
Chris Lattner	8fcce170cf	when checking if we should move a split edge block outside of a loop, check the presplit pred, not the post-split pred. This was causing us to make the wrong decision in some cases, leaving the critical edge block in the loop. llvm-svn: 23601	2005-10-03 00:31:52 +00:00
Jeff Cohen	f8a5e5ae6e	Fix VC++ warnings. llvm-svn: 23579	2005-10-01 03:57:14 +00:00
Chris Lattner	a554c9470b	Insert stores after phi nodes in the normal dest. This fixes LowerInvoke/2005-08-03-InvokeWithPHI.ll llvm-svn: 23525	2005-09-29 17:44:20 +00:00
Chris Lattner	87ef943a4c	Fold isascii into a simple comparison. This speeds up 197.parser by 7.4%, bringing the LLC time down to the CBE time. llvm-svn: 23521	2005-09-29 06:17:27 +00:00
Chris Lattner	5f6035feb0	remove a bunch of unneeded stuff, or self evident comments llvm-svn: 23519	2005-09-29 06:16:11 +00:00
Chris Lattner	c244e7c178	Implement a couple of memcmp folds from the todo list llvm-svn: 23517	2005-09-29 04:54:20 +00:00
Chris Lattner	ea7214b23d	Constant fold llvm.sqrt llvm-svn: 23487	2005-09-28 01:34:32 +00:00
Chris Lattner	3b63bb375c	add a note about a way to improve this code further, that I won't be getting to right now. llvm-svn: 23485	2005-09-27 22:44:59 +00:00
Chris Lattner	eb953f0ef8	Fix a regression in my previous patch, fixing GlobalOpt/2005-09-27-Crash.ll and PR632. llvm-svn: 23484	2005-09-27 22:28:11 +00:00
Chris Lattner	e285f5ed8f	Avoid spilling stack slots... to stack slots. llvm-svn: 23478	2005-09-27 21:33:12 +00:00
Chris Lattner	87eb249300	Completely rewrite 'correct' eh support. This changes how setjmp insertion is performed so it is only at most once per function that contains an invoke instead of once per invoke in the function. This patch has the following perks: 1. It fixes PR631, which complains about slowness. 2. It fixes PR240, which complains about non-volatile vars being live across setjmp/longjmps. 3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not forcing the jmpbufs to always be 8-bytes off the alignment of the structure. 4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us now about 4% faster than GCC. Further improvements are also possible. llvm-svn: 23477	2005-09-27 21:18:17 +00:00
Chris Lattner	92233d2175	Make the pass name simpler llvm-svn: 23476	2005-09-27 21:10:32 +00:00
Chris Lattner	16cd356fb2	allow demotion to volatile values, add support for invoke llvm-svn: 23473	2005-09-27 19:39:00 +00:00
Chris Lattner	3d27e7f27f	Add support for external calls that we know how to constant fold. This implements ctor-list-opt.ll:CTOR8 llvm-svn: 23465	2005-09-27 05:02:43 +00:00
Chris Lattner	29b2780c8a	Fix a bug where we would evaluate stores into linkonce objects which could be potentially replaced at link-time. llvm-svn: 23463	2005-09-27 04:50:03 +00:00
Chris Lattner	65a3a0918f	Implement support for static constructors with calls in them. This is useful because gccas runs globalopt before inlining. This implements ctor-list-opt.ll:CTOR7 llvm-svn: 23462	2005-09-27 04:45:34 +00:00
Chris Lattner	da1889b778	Refactor this code a bit, no functionality changes. llvm-svn: 23460	2005-09-27 04:27:01 +00:00
Chris Lattner	f2f89af69a	Remove some dead code. ctor evaluation subsumes empty ctor elim llvm-svn: 23453	2005-09-26 20:38:20 +00:00
Chris Lattner	6bf2cd5735	Add support for alloca, implementing ctor-list-opt.ll:CTOR6 llvm-svn: 23452	2005-09-26 17:07:09 +00:00
Chris Lattner	46d9ff081d	Add a debug printout, fix a crash on kc++ llvm-svn: 23450	2005-09-26 07:34:35 +00:00
Chris Lattner	46af55e0e4	Implement loads/stores through GEP's of globals. This implements ctor-list-opt.ll:CTOR5. llvm-svn: 23449	2005-09-26 06:52:44 +00:00
Chris Lattner	61ff32cd70	Replace TraverseGEPInitializer with ConstantFoldLoadThroughGEPConstantExpr llvm-svn: 23447	2005-09-26 05:34:07 +00:00
Chris Lattner	02ae21e1e0	Eliminate GetGEPGlobalInitializer in favor of the more powerful ConstantFoldLoadThroughGEPConstantExpr function in the utils lib. llvm-svn: 23446	2005-09-26 05:28:52 +00:00
Chris Lattner	0b011ec8e2	Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils as ConstantFoldLoadThroughGEPConstantExpr. llvm-svn: 23445	2005-09-26 05:28:06 +00:00
Chris Lattner	c13c7b9376	Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine pass. llvm-svn: 23444	2005-09-26 05:27:10 +00:00
Chris Lattner	b009663e27	add a comment llvm-svn: 23442	2005-09-26 05:16:34 +00:00
Chris Lattner	4b05c322d5	Add support for getelementptr, load, and correctly reject volatile stores. llvm-svn: 23441	2005-09-26 05:15:37 +00:00
Chris Lattner	3e9ea5ffec	Add support for br/brcond/switch and phi llvm-svn: 23439	2005-09-26 04:57:38 +00:00
Chris Lattner	99e23fa74c	Add a simple interpreter to this code, allowing us to statically evaluate global ctors that are simple enough. This implements ctor-list-opt.ll:CTOR2. llvm-svn: 23437	2005-09-26 04:44:35 +00:00
Chris Lattner	696beefabb	factor some code into a InstallGlobalCtors method, add comments. No functionality change. llvm-svn: 23435	2005-09-26 02:31:18 +00:00
Chris Lattner	838bdc1836	Make the global opt optimizer work on modules with a null terminator, by accepting the null even with a non-65535 init prio llvm-svn: 23434	2005-09-26 02:19:27 +00:00
Chris Lattner	41b6a5a693	Factor this code out into a few methods. Implement the start of global ctor optimization. It is currently smart enough to remove the global ctor for cases like this: struct foo { foo() {} } x; ... saving a bit of startup time for the program. llvm-svn: 23433	2005-09-26 01:43:45 +00:00
Chris Lattner	f487768062	Fix some logic I broke that caused a regression on SimplifyLibCalls/2005-05-20-sprintf-crash.ll llvm-svn: 23430	2005-09-25 07:06:48 +00:00
Chris Lattner	0b3557f54a	Move MaskedValueIsZero up. Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll llvm-svn: 23428	2005-09-24 23:43:33 +00:00
Chris Lattner	175463a165	Simplify this code a bit by relying on recursive simplification. Support sprintf("%s", P)'s that have uses. s/hasNUses(0)/use_empty()/ llvm-svn: 23425	2005-09-24 22:17:06 +00:00
Chris Lattner	499e33646e	remove some debugging code llvm-svn: 23411	2005-09-23 18:49:09 +00:00
Chris Lattner	c59a371d45	Fold two consecutive branches that share a common destination between them. This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy code llvm-svn: 23410	2005-09-23 18:47:20 +00:00
Chris Lattner	3a978bf66d	simplify some logic further llvm-svn: 23408	2005-09-23 07:23:18 +00:00
Chris Lattner	cc14ebc17b	pull a bunch of logic out of SimplifyCFG into a helper fn llvm-svn: 23407	2005-09-23 06:39:30 +00:00
Chris Lattner	6c70106053	Start threading across blocks with code in them, so long as the code does not define a value that is used outside of its block. This catches many more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc. This implements branch-phi-thread.ll:test3.ll llvm-svn: 23397	2005-09-20 01:48:40 +00:00
Chris Lattner	f0bd8d0107	Implement merging of blocks with the same condition if the block has multiple predecessors. This implements branch-phi-thread.ll::test1 llvm-svn: 23395	2005-09-20 00:43:16 +00:00
Chris Lattner	049cb4482f	Reject a case we don't handle yet llvm-svn: 23393	2005-09-19 23:57:04 +00:00
Chris Lattner	a160924d57	remove debugging code :-/ llvm-svn: 23392	2005-09-19 23:50:15 +00:00
Chris Lattner	748f903046	Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading control across branches with determined outcomes. More generality to follow. This triggers a couple thousand times in specint. llvm-svn: 23391	2005-09-19 23:49:37 +00:00
Chris Lattner	b4b2530a1a	Refactor this code a bit and make it more general. This now compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } To: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) slwi r3, r3, 6 add r3, r4, r3 rlwimi r3, r4, 0, 26, 14 stw r3, 0(r2) blr instead of: _plus2: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 26, 21, 31 add r3, r5, r3 rlwimi r4, r3, 6, 15, 25 stw r4, 0(r2) blr by eliminating an 'and'. I'm pretty sure this is as small as we can go :) llvm-svn: 23386	2005-09-18 07:22:02 +00:00
Chris Lattner	797dee7705	Compile struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus2 (unsigned int x) { b.j += x; } to: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX and %ECX, 131008 mov %EDX, DWORD PTR [%ESP + 4] shl %EDX, 6 add %EDX, %ECX and %EDX, 131008 and %EAX, -131009 or %EDX, %EAX mov DWORD PTR [b], %EDX ret instead of: plus2: mov %EAX, DWORD PTR [b] mov %ECX, %EAX shr %ECX, 6 and %ECX, 2047 add %ECX, DWORD PTR [%ESP + 4] shl %ECX, 6 and %ECX, 131008 and %EAX, -131009 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23385	2005-09-18 06:30:59 +00:00
Chris Lattner	01f56c68e9	Generalize this transform, using MaskedValueIsZero, allowing us to compile: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } To: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 add DWORD PTR [b], %EAX ret instead of: plus3: mov %EAX, DWORD PTR [%ESP + 4] shl %EAX, 17 mov %ECX, DWORD PTR [b] add %EAX, %ECX and %EAX, -131072 and %ECX, 131071 or %ECX, %EAX mov DWORD PTR [b], %ECX ret llvm-svn: 23384	2005-09-18 06:02:59 +00:00
Chris Lattner	4ebc8ab4e0	fix typo llvm-svn: 23383	2005-09-18 05:25:20 +00:00
Chris Lattner	e5b23a6d67	Remove unintentionally committed code llvm-svn: 23382	2005-09-18 05:12:51 +00:00
Chris Lattner	27cb9dbd35	implement shift.ll:test25. This compiles: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus3 (unsigned int x) { b.k += x; } to: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r3, 0(r2) rlwinm r4, r3, 0, 0, 14 add r4, r4, r3 rlwimi r4, r3, 0, 15, 31 stw r4, 0(r2) blr instead of: _plus3: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) srwi r5, r4, 17 add r3, r5, r3 slwi r3, r3, 17 rlwimi r3, r4, 0, 15, 31 stw r3, 0(r2) blr llvm-svn: 23381	2005-09-18 05:12:10 +00:00
Chris Lattner	af517574ce	Implement add.ll:test29. Codegening: struct S { unsigned int i : 6, j : 11, k : 15; } b; void plus1 (unsigned int x) { b.i += x; } as: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) add r3, r4, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr instead of: _plus1: lis r2, ha16(L_b$non_lazy_ptr) lwz r2, lo16(L_b$non_lazy_ptr)(r2) lwz r4, 0(r2) rlwinm r5, r4, 0, 26, 31 add r3, r5, r3 rlwimi r3, r4, 0, 0, 25 stw r3, 0(r2) blr llvm-svn: 23379	2005-09-18 04:24:45 +00:00
Chris Lattner	027eaf01cf	remove debug output llvm-svn: 23377	2005-09-18 03:50:25 +00:00
Chris Lattner	1521298993	Implement or.ll:test21. This teaches instcombine to be able to turn this: struct { unsigned int bit0:1; unsigned int ubyte:31; } sdata; void foo() { sdata.ubyte++; } into this: foo: add DWORD PTR [sdata], 2 ret instead of this: foo: mov %EAX, DWORD PTR [sdata] mov %ECX, %EAX add %ECX, 2 and %ECX, -2 and %EAX, 1 or %EAX, %ECX mov DWORD PTR [sdata], %EAX ret llvm-svn: 23376	2005-09-18 03:42:07 +00:00
Chris Lattner	a393e4d4b3	Fix the regression last night compiling povray llvm-svn: 23348	2005-09-14 17:32:56 +00:00
Chris Lattner	2a8932960d	Add a simple xform to simplify array accesses with casts in the way. This is useful for 178.galgel where resolution of dope vectors (by the optimizer) causes the scales to become apparent. llvm-svn: 23328	2005-09-13 18:36:04 +00:00
Chris Lattner	fd018c8dfe	Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI. This fixes up a dot-product loop in galgel, speeding it up from 18.47s to 16.13s. llvm-svn: 23327	2005-09-13 02:09:55 +00:00
Chris Lattner	567b81f0d2	Add a helper function, allowing us to simplify some code a bit, changing indentation, no functionality change llvm-svn: 23325	2005-09-13 00:40:14 +00:00
Chris Lattner	219175c84d	Implement a simple xform to turn code like this: if () { store A -> P; } else { store B -> P; } into a PHI node with one store, in the most trivial case. This implements load.ll:test10. llvm-svn: 23324	2005-09-12 23:23:25 +00:00
Chris Lattner	e0bfdf1485	Another load-peephole optimization: do gcse when two loads are next to each other. This implements InstCombine/load.ll:test9 llvm-svn: 23322	2005-09-12 22:21:03 +00:00
Chris Lattner	b990f7d8ed	Implement a trivial form of store->load forwarding where the store and the load are exactly consecutive. This is picked up by other passes, but this triggers thousands of times in fortran programs that use static locals (and is thus a compile-time speedup). llvm-svn: 23320	2005-09-12 22:00:15 +00:00
Chris Lattner	8048b85e8f	Fix a regression from last night, which caused this pass to create invalid code for IV uses outside of loops that are not dominated by the latch block. We should only convert these uses to use the post-inc value if they ARE dominated by the latch block. Also use a new LoopInfo method to simplify some code. This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll llvm-svn: 23318	2005-09-12 17:11:27 +00:00
Chris Lattner	a67648396a	_test: li r2, 0 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r2, 1 stw r2, 0(r4) blr [zion ~/llvm]$ cat > ~/xx Uses of IV's outside of the loop should use the post-incremented version of the IV, not the preincremented version. This helps many loops (e.g. in sixtrack) which used to generate code like this (this is the code from the dont-hoist-simple-loop-constants.ll testcase): _test: li r2, 0 ** IV starts at 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 Copy for loop exit li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 IV+2 cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 IV+2 stw r2, 0(r4) blr And now generate code like this: _test: li r2, 1 * IV starts at 1 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 * IV.postinc + 0 blt cr0, LBB_test_1 LBB_test_2: ; loopexit.2.loopexit stw r2, 0(r4) * IV.postinc + 0 blr llvm-svn: 23313	2005-09-12 06:04:47 +00:00
Chris Lattner	530fe6ab30	implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll. We used to emit this code for it: _test: li r2, 1 ;; Value tying up a register for the whole loop li r5, 0 LBB_test_1: ; no_exit.2 or r6, r5, r5 li r5, 0 stw r5, 0(r3) addi r5, r6, 1 addi r3, r3, 4 add r7, r2, r5 ;; should be addi r7, r5, 1 cmpwi cr0, r7, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r6, 2 stw r2, 0(r4) blr now we emit this: _test: li r2, 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 ;; whoa, fold those adds! cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 stw r2, 0(r4) blr more improvement coming. llvm-svn: 23306	2005-09-10 01:18:45 +00:00
Chris Lattner	b5e381a8cf	Fix a problem that Dan Berlin noticed, where reassociation would not succeed in building maximal expressions before simplifying them. In particular, in cases like this: X-(A+B+X) the code would consider A+B+X to be a maximal expression (not understanding that the single use '-' would be turned into a + later), simplify it (a noop) then later get simplified again. Each of these simplify steps is where the cost of reassociation comes from, so this patch should speed up the already fast pass a bit. Thanks to Dan for noticing this! llvm-svn: 23214	2005-09-02 07:07:58 +00:00
Chris Lattner	9fe263aa75	Avoid creating garbage instructions, just move the old add instruction to where we need it when converting -(A+B+C) -> -A + -B + -C. llvm-svn: 23213	2005-09-02 06:38:04 +00:00
Chris Lattner	d1325da091	add some assertions and fix problems where reassociate could access the Ops vector out of range llvm-svn: 23211	2005-09-02 05:23:22 +00:00
Chris Lattner	8ca5b2a6d2	Fix Regression/Transforms/Reassociate/2005-08-24-Crash.ll llvm-svn: 23019	2005-08-24 17:55:32 +00:00
Chris Lattner	4201cd1bbc	Transform floor((double)FLT) -> (double)floorf(FLT), implementing Regression/Transforms/SimplifyLibCalls/floor.ll. This triggers 19 times in 177.mesa. llvm-svn: 23017	2005-08-24 17:22:17 +00:00
Chris Lattner	ea7dfd53d6	Fix Transforms/LoopStrengthReduce/2005-08-17-OutOfLoopVariant.ll, a crash on 177.mesa llvm-svn: 22843	2005-08-17 21:22:41 +00:00
Chris Lattner	2bf7cb5213	Use a new helper to split critical edges, making the code simpler. Do not claim to not change the CFG. We do change the cfg to split critical edges. This isn't causing us a problem now, but could likely do so in the future. llvm-svn: 22824	2005-08-17 06:35:16 +00:00
Chris Lattner	5cf983ee0f	Fix a bad case in gzip where we put lots of things in registers across the loop, because an IV-dependent value was used outside of the loop and didn't have immediate-folding capability llvm-svn: 22798	2005-08-16 00:38:11 +00:00
Chris Lattner	47d3ec3525	Ooops, don't forget to clear this. The real inner loop is now: .LBB_foo_3: ; no_exit.1 lfd f2, 0(r9) lfd f3, 8(r9) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r9) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfd f2, 0(r9) addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22782	2005-08-13 07:42:01 +00:00
Chris Lattner	5949d49032	Recursively scan scev expressions for common subexpressions. This allows us to handle nested loops much better, for example, by being able to tell that these two expressions: {( 8 + ( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp 12)}<loopentry.1> {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> Have the following common part that can be shared: {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> This allows us to codegen an important inner loop in 168.wupwise as: .LBB_foo_4: ; no_exit.1 lfd f2, 16(r9) fmul f3, f0, f2 fmul f2, f1, f2 fadd f4, f3, f2 stfd f4, 8(r9) fsub f2, f3, f2 stfd f2, 16(r9) addi r8, r8, 1 addi r9, r9, 16 cmpw cr0, r8, r4 ble .LBB_foo_4 ; no_exit.1 instead of: .LBB_foo_3: ; no_exit.1 lfdx f2, r6, r9 add r10, r6, r9 lfd f3, 8(r10) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r10) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfdx f2, r6, r9 addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22781	2005-08-13 07:27:18 +00:00
Chris Lattner	89c1dfc733	Teach SplitCriticalEdge to update LoopInfo if it is alive. This fixes a problem in LoopStrengthReduction, where it would split critical edges then confused itself with outdated loop information. llvm-svn: 22776	2005-08-13 01:38:43 +00:00
Chris Lattner	79396539d3	remove dead code. The exit block list is computed on demand, thus does not need to be updated. This code is a relic from when it did. llvm-svn: 22775	2005-08-13 01:30:36 +00:00
Chris Lattner	8447b49526	When splitting critical edges, make sure not to leave the new block in the middle of the loop. This turns a critical loop in gzip into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 bne .LBB_test_8 ; loopentry.loopexit_crit_edge .LBB_test_2: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 bne .LBB_test_7 ; shortcirc_next.0.loopexit_crit_edge .LBB_test_3: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 bne .LBB_test_6 ; shortcirc_next.1.loopexit_crit_edge .LBB_test_4: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry instead of this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_3: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 beq .LBB_test_5 ; shortcirc_next.1 .LBB_test_4: ; shortcirc_next.0.loopexit_crit_edge add r2, r11, r27 add r8, r12, r27 b .LBB_test_9 ; loopexit .LBB_test_5: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 beq .LBB_test_7 ; shortcirc_next.2 .LBB_test_6: ; shortcirc_next.1.loopexit_crit_edge add r2, r9, r27 add r8, r10, r27 b .LBB_test_9 ; loopexit .LBB_test_7: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry Next up, improve the code for the loop. llvm-svn: 22769	2005-08-12 22:22:17 +00:00
Chris Lattner	4fec86d348	Fix a FIXME: if we are inserting code for a PHI argument, split the critical edge so that the code is not always executed for both operands. This prevents LSR from inserting code into loops whose exit blocks contain PHI uses of IV expressions (which are outside of loops). On gzip, for example, we turn this ugly code: .LBB_test_1: ; loopentry add r27, r3, r28 lhz r27, 3(r27) add r26, r4, r28 lhz r26, 3(r26) add r25, r30, r28 ;; Only live if exiting the loop add r24, r29, r28 ;; Only live if exiting the loop cmpw cr0, r27, r26 bne .LBB_test_5 ; loopexit into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_2: ; shortcirc_next.0 ... blt .LBB_test_1 into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_t_3: ; shortcirc_next.0 .LBB_test_3: ; shortcirc_next.0 ... blt .LBB_test_1 Next step: get the block out of the loop so that the loop is all fall-throughs again. llvm-svn: 22766	2005-08-12 22:06:11 +00:00
Chris Lattner	b7ebe65c56	Change break critical edges to not remove, then insert, PHI node entries. Instead, just update the BB in-place. This is both faster, and it prevents split-critical-edges from shuffling the PHI argument list unnecessarily. llvm-svn: 22765	2005-08-12 21:58:07 +00:00
Chris Lattner	62df798919	remove some trickiness that broke yacr2 and some other programs last night llvm-svn: 22751	2005-08-10 17:15:20 +00:00
Chris Lattner	f83ce5faee	Make loop-simplify produce better loops by turning PHI nodes like X = phi [X, Y] into just Y. This often occurs when it separates loops that have collapsed loop headers. This implements LoopSimplify/phi-node-simplify.ll llvm-svn: 22746	2005-08-10 02:07:32 +00:00
Chris Lattner	677d85784a	Allow indvar simplify to canonicalize ANY affine IV, not just affine IVs with constant stride. This implements Transforms/IndVarsSimplify/variable-stride-ivs.ll llvm-svn: 22744	2005-08-10 01:12:06 +00:00
Chris Lattner	edff91a49a	Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride. For code like this: void foo(float *a, float *b, int n, int stride_a, int stride_b) { int i; for (i=0; i<n; i++) a[i*stride_a] = b[i*stride_b]; } we now emit: .LBB_foo2_2: ; no_exit lfs f0, 0(r4) stfs f0, 0(r3) addi r7, r7, 1 add r4, r2, r4 add r3, r6, r3 cmpw cr0, r7, r5 blt .LBB_foo2_2 ; no_exit instead of: .LBB_foo_2: ; no_exit mullw r8, r2, r7 ;; multiply! slwi r8, r8, 2 lfsx f0, r4, r8 mullw r8, r2, r6 ;; multiply! slwi r8, r8, 2 stfsx f0, r3, r8 addi r2, r2, 1 cmpw cr0, r2, r5 blt .LBB_foo_2 ; no_exit Loops with variable strides occur pretty often. For example, in SPECFP2K there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp, 56 in 168.wupwise, 36 in 172.mgrid. Now we can allow indvars to turn functions written like this: void foo2(float *a, float *b, int n, int stride_a, int stride_b) { int i, ai = 0, bi = 0; for (i=0; i<n; i++) { a[ai] = b[bi]; ai += stride_a; bi += stride_b; } } into code like the above for better analysis. With this patch, they generate identical code. llvm-svn: 22740	2005-08-10 00:45:21 +00:00
Chris Lattner	dde7dc525e	Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll by being more careful about updating PHI nodes llvm-svn: 22739	2005-08-10 00:35:32 +00:00
Chris Lattner	c6c4d99a21	Fix some 80 column violations. Once we compute the evolution for a GEP, tell SE about it. This allows users of the GEP to know it, if the users are not direct. This allows us to compile this testcase: void fbSolidFillmmx(int w, unsigned char *d) { while (w >= 64) { *(unsigned long long *) (d + 0) = 0; *(unsigned long long *) (d + 8) = 0; *(unsigned long long *) (d + 16) = 0; *(unsigned long long *) (d + 24) = 0; *(unsigned long long *) (d + 32) = 0; *(unsigned long long *) (d + 40) = 0; *(unsigned long long *) (d + 48) = 0; *(unsigned long long *) (d + 56) = 0; w -= 64; d += 64; } } into: .LBB_fbSolidFillmmx_2: ; no_exit li r2, 0 stw r2, 0(r4) stw r2, 4(r4) stw r2, 8(r4) stw r2, 12(r4) stw r2, 16(r4) stw r2, 20(r4) stw r2, 24(r4) stw r2, 28(r4) stw r2, 32(r4) stw r2, 36(r4) stw r2, 40(r4) stw r2, 44(r4) stw r2, 48(r4) stw r2, 52(r4) stw r2, 56(r4) stw r2, 60(r4) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit instead of: .LBB_fbSolidFillmmx_2: ; no_exit li r11, 0 stw r11, 0(r4) stw r11, 4(r4) stwx r11, r10, r4 add r12, r10, r4 stw r11, 4(r12) stwx r11, r9, r4 add r12, r9, r4 stw r11, 4(r12) stwx r11, r8, r4 add r12, r8, r4 stw r11, 4(r12) stwx r11, r7, r4 add r12, r7, r4 stw r11, 4(r12) stwx r11, r6, r4 add r12, r6, r4 stw r11, 4(r12) stwx r11, r5, r4 add r12, r5, r4 stw r11, 4(r12) stwx r11, r2, r4 add r12, r2, r4 stw r11, 4(r12) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit llvm-svn: 22737	2005-08-09 23:39:36 +00:00
Chris Lattner	02742710f3	SCEVAddExpr::get() of an empty list is invalid. llvm-svn: 22724	2005-08-09 01:13:47 +00:00
Chris Lattner	a091ff1764	Implement: LoopStrengthReduce/share_ivs.ll Two changes: * Only insert one PHI node for each stride. Other values are live in values. This cannot introduce higher register pressure than the previous approach, and can take advantage of reg+reg addressing modes. * Factor common base values out of uses before moving values from the base to the immediate fields. This improves codegen by starting the stride-specific PHI node out at a common place for each IV use. As an example, we used to generate this for a loop in swim: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfd f0, 0(r8) stfd f0, 0(r3) lfd f0, 0(r6) stfd f0, 0(r7) lfd f0, 0(r2) stfd f0, 0(r5) addi r9, r9, 1 addi r2, r2, 8 addi r5, r5, 8 addi r6, r6, 8 addi r7, r7, 8 addi r8, r8, 8 addi r3, r3, 8 cmpw cr0, r9, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 now we emit: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfdx f0, r8, r2 stfdx f0, r9, r2 lfdx f0, r5, r2 stfdx f0, r7, r2 lfdx f0, r3, r2 stfdx f0, r6, r2 addi r10, r10, 1 addi r2, r2, 8 cmpw cr0, r10, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 As another more dramatic example, we used to emit this: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfd f0, 8(r21) lfd f4, 8(r3) lfd f5, 8(r27) lfd f6, 8(r22) lfd f7, 8(r5) lfd f8, 8(r6) lfd f9, 8(r30) lfd f10, 8(r11) lfd f11, 8(r12) fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfd f0, 8(r4) lfd f0, 8(r25) lfd f5, 8(r26) lfd f6, 8(r23) lfd f9, 8(r28) lfd f10, 8(r10) lfd f12, 8(r9) lfd f13, 8(r29) fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfd f0, 8(r24) lfd f0, 8(r8) fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfd f0, 8(r2) addi r20, r20, 1 addi r2, r2, 8 addi r8, r8, 8 addi r10, r10, 8 addi r12, r12, 8 addi r6, r6, 8 addi r29, r29, 8 addi r28, r28, 8 addi r26, r26, 8 addi r25, r25, 8 addi r24, r24, 8 addi r5, r5, 8 addi r23, r23, 8 addi r22, r22, 8 addi r3, r3, 8 addi r9, r9, 8 addi r11, r11, 8 addi r30, r30, 8 addi r27, r27, 8 addi r21, r21, 8 addi r4, r4, 8 cmpw cr0, r20, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 we now emit: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfdx f0, r21, r20 lfdx f4, r3, r20 lfdx f5, r27, r20 lfdx f6, r22, r20 lfdx f7, r5, r20 lfdx f8, r6, r20 lfdx f9, r30, r20 lfdx f10, r11, r20 lfdx f11, r12, r20 fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfdx f0, r4, r20 lfdx f0, r25, r20 lfdx f5, r26, r20 lfdx f6, r23, r20 lfdx f9, r28, r20 lfdx f10, r10, r20 lfdx f12, r9, r20 lfdx f13, r29, r20 fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfdx f0, r24, r20 lfdx f0, r8, r20 fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfdx f0, r2, r20 addi r19, r19, 1 addi r20, r20, 8 cmpw cr0, r19, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 llvm-svn: 22722	2005-08-09 00:18:09 +00:00
Chris Lattner	37c24cc98c	Suck the base value out of the UsersToProcess vector into the BasedUser class to simplify the code. Fuse two loops. llvm-svn: 22721	2005-08-08 22:56:21 +00:00
Chris Lattner	37ed895bf1	Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The first is a correctness thing, and the latter is an optimization thing. This also is needed to support a future change. llvm-svn: 22720	2005-08-08 22:32:34 +00:00
Chris Lattner	9f269e40c9	Use the new 'moveBefore' method to simplify some code. Really, which is easier to understand? :) llvm-svn: 22706	2005-08-08 19:11:57 +00:00
Chris Lattner	14203e85b2	Not all constants are legal immediates in load/store instructions. llvm-svn: 22704	2005-08-08 06:25:50 +00:00
Chris Lattner	c70bbc0c41	Implement LoopStrengthReduce/share_code_in_preheader.ll by having one rewriter for all code inserted into the preheader, which is never flushed. llvm-svn: 22702	2005-08-08 05:47:49 +00:00
Chris Lattner	9bfa6f8784	Implement a simple optimization for the termination condition of the loop. The termination condition actually wants to use the post-incremented value of the loop, not a new indvar with an unusual base. On PPC, for example, this allows us to compile LoopStrengthReduce/exit_compare_live_range.ll to: _foo: li r2, 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r2, r2, 1 cmpw cr0, r2, r4 bne .LBB_foo_1 ; no_exit blr instead of: _foo: li r2, 1 ;; IV starts at 1, not 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r5, r2, 1 cmpw cr0, r2, r4 or r2, r5, r5 ;; Reg-reg copy, extra live range bne .LBB_foo_1 ; no_exit blr This implements LoopStrengthReduce/exit_compare_live_range.ll llvm-svn: 22699	2005-08-08 05:28:22 +00:00
Chris Lattner	579b20b747	All stats are "Number of ..." llvm-svn: 22694	2005-08-07 20:02:04 +00:00
Chris Lattner	2c14cf7b74	Add some simple folds that occur in bitfield cases. Fix a minor bug in isHighOnes, where it would consider 0 to have high ones. llvm-svn: 22693	2005-08-07 07:03:10 +00:00
Chris Lattner	134ebd0801	Fix typo llvm-svn: 22692	2005-08-07 07:00:52 +00:00
Chris Lattner	f4dd8c445c	* Use the new PHINode::hasConstantValue method to simplify some code * Teach this code to move allocas out of the loop when tail call eliminating a call marked 'tail'. This implements TailCallElim/move_alloca_for_tail_call.ll * Do not perform this transformation if a call is marked 'tail' and if there are allocas that we cannot move out of the loop in #2. Doing so would increase the stack usage of the function. This fixes PR615 and TailCallElim/dont-tce-tail-marked-call.ll. llvm-svn: 22690	2005-08-07 04:27:41 +00:00
Chris Lattner	11e7a5eda7	Make sure to clean CastedPointers after casts are potentially deleted. This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas llvm-svn: 22673	2005-08-05 01:30:11 +00:00
Chris Lattner	9f9c260b8c	now that hasConstantValue defaults to only returning values that dominate the PHI node, this ugly code can vanish. llvm-svn: 22672	2005-08-05 01:04:30 +00:00
Chris Lattner	257efb2ad3	This code can handle non-dominating instructions llvm-svn: 22667	2005-08-05 00:57:45 +00:00
Nate Begeman	b392321cae	Fix a fixme in CondPropagate.cpp by moving a PHINode optimization into BasicBlock's removePredecessor routine. This requires shuffling around the definition and implementation of hasConstantValue from Utils.h,cpp into Instructions.h,cpp llvm-svn: 22664	2005-08-04 23:24:19 +00:00
Chris Lattner	45f8b6e7aa	Modify how immediates are removed from base expressions to deal with the fact that the symbolic evaluator is not always able to use subtraction to remove expressions. This makes the code faster, and fixes the last crash on 178.galgel. Finally, add a statistic to see how many phi nodes are inserted. On 178.galgel, we get the following stats: 2562 loop-reduce - Number of PHIs inserted 3927 loop-reduce - Number of GEPs strength reduced llvm-svn: 22662	2005-08-04 22:34:05 +00:00
Chris Lattner	a6d7c355bc	* Refactor some code into a new BasedUser::RewriteInstructionToUseNewBase method. * Fix a crash on 178.galgel, where we would insert expressions before PHI nodes instead of into the PHI node predecessor blocks. llvm-svn: 22657	2005-08-04 20:03:32 +00:00
Chris Lattner	0f7c0fa2a7	Fix a case that caused this to crash on 178.galgel llvm-svn: 22653	2005-08-04 19:26:19 +00:00
Chris Lattner	acc42c4df1	Teach LSR about loop-variant expressions, such as loops like this: for (i = 0; i < N; ++i) A[i][foo()] = 0; here we still want to strength reduce the A[i] part, even though foo() is l-v. This also simplifies some of the 'CanReduce' logic. This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll llvm-svn: 22652	2005-08-04 19:08:16 +00:00
Nate Begeman	456044b724	Remove some more dead code. llvm-svn: 22650	2005-08-04 18:13:56 +00:00
Chris Lattner	eaf24725b2	Refactor this code substantially with the following improvements: 1. We only analyze instructions once, guaranteed 2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with something much simpler. The next step is to handle expressions that are not all indvar+loop-invariant values (e.g. handling indvar+loopvariant). llvm-svn: 22649	2005-08-04 17:40:30 +00:00
Chris Lattner	6f286b760f	refactor some code llvm-svn: 22643	2005-08-04 01:19:13 +00:00
Chris Lattner	6510749050	invert two if's to make the logic simpler llvm-svn: 22641	2005-08-04 00:40:47 +00:00
Chris Lattner	a0102fbc4f	When processing outer loops and we find uses of an IV in inner loops, make sure to handle the use, just don't recurse into it. This permits us to generate this code for a simple nested loop case: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r29, 44(r1) stw r30, 40(r1) mflr r11 stw r11, 56(r1) lis r2, ha16(L_A$non_lazy_ptr) lwz r30, lo16(L_A$non_lazy_ptr)(r2) li r29, 1 .LBB_foo_1: ; no_exit.0 bl L_bar$stub li r2, 1 or r3, r30, r30 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r3) stfd f0, 0(r3) addi r4, r2, 1 addi r3, r3, 8 cmpwi cr0, r2, 100 or r2, r4, r4 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r30, r30, 800 addi r2, r29, 1 cmpwi cr0, r29, 100 or r29, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 40(r1) lwz r29, 44(r1) lwz r1, 0(r1) blr instead of this: _foo: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r28, 44(r1) ;; uses an extra register. stw r29, 40(r1) stw r30, 36(r1) mflr r11 stw r11, 56(r1) li r30, 1 li r29, 0 or r28, r29, r29 .LBB_foo_1: ; no_exit.0 bl L_bar$stub mulli r2, r28, 800 ;; unstrength-reduced multiply lis r3, ha16(L_A$non_lazy_ptr) ;; loop invariant address computation lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 mulli r4, r29, 800 ;; unstrength-reduced multiply addi r3, r3, 8 add r3, r4, r3 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 ;; multiple stride 8 IV's addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r28, r28, 1 ;;; Many IV's with stride 1 addi r29, r29, 1 addi r2, r30, 1 cmpwi cr0, r30, 100 or r30, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 36(r1) lwz r29, 40(r1) lwz r28, 44(r1) lwz r1, 0(r1) blr llvm-svn: 22640	2005-08-04 00:14:11 +00:00
Chris Lattner	fc62470466	Teach loop-reduce to see into nested loops, to pull out immediate values pushed down by SCEV. In a nested loop case, this allows us to emit this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 li r3, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r2) ;; Uses offset of 8 instead of 0 stfd f0, 0(r2) addi r4, r3, 1 addi r2, r2, 8 cmpwi cr0, r3, 100 or r3, r4, r4 bne .LBB_foo_2 ; no_exit.1 instead of this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 addi r3, r3, 8 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 llvm-svn: 22639	2005-08-03 23:44:42 +00:00
Chris Lattner	bb78c97e24	improve debug output llvm-svn: 22638	2005-08-03 23:30:08 +00:00
Chris Lattner	db23c74e5e	Move from Stage 0 to Stage 1. Only emit one PHI node for IV uses with identical bases and strides (after moving foldable immediates to the load/store instruction). This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing us to generate this PPC code for test1: or r30, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r30) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop instead of this code: or r30, r3, r3 or r29, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r29) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 ;; Two iv's with step of 8 addi r29, r29, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop llvm-svn: 22635	2005-08-03 22:51:21 +00:00
Chris Lattner	430d0022df	Rename IVUse to IVUsersOfOneStride, use a struct instead of a pair to unify some parallel vectors and get field names more descriptive than "first" and "second". This isn't lisp after all :) llvm-svn: 22633	2005-08-03 22:21:05 +00:00
Chris Lattner	84e9baa925	Fix a nasty dangling pointer issue. The ScalarEvolution pass would keep a map from instruction* to SCEVHandles. When we delete instructions, we have to tell it about it. We would run into nasty cases where new instructions were reallocated at old instruction addresses and get the old map values. Bad bad bad :( llvm-svn: 22632	2005-08-03 21:36:09 +00:00
Chris Lattner	3de05cc930	The correct fix for PR612, which also fixes Transforms/LowerInvoke/2005-08-03-InvokeWithPHIUse.ll llvm-svn: 22628	2005-08-03 18:51:44 +00:00
Chris Lattner	f8a81a9886	When inserting code, make sure not to insert it before PHI nodes. This fixes PR612 and Transforms/LowerInvoke/2005-08-03-InvokeWithPHI.ll llvm-svn: 22626	2005-08-03 18:34:29 +00:00
Chris Lattner	d683bdd0f8	Fix Transforms/SimplifyCFG/2005-08-03-PHIFactorCrash.ll, a problem that occurred while bugpointing another testcase llvm-svn: 22621	2005-08-03 17:59:45 +00:00
Chris Lattner	2dbf1960ff	Finally, add the required constraint checks to fix Transforms/SimplifyCFG/2005-08-01-PHIUpdateFail.ll the right way llvm-svn: 22615	2005-08-03 00:59:12 +00:00
Chris Lattner	908036942c	Simplify some code, add the correct pred checks llvm-svn: 22613	2005-08-03 00:38:27 +00:00
Chris Lattner	982b75c061	Refactor code out of PropagatePredecessorsForPHIs, turning it into a pure function with no side-effects llvm-svn: 22612	2005-08-03 00:29:26 +00:00
Chris Lattner	1f047fd513	use splice instead of remove/insert to avoid some symtab operations llvm-svn: 22611	2005-08-03 00:23:42 +00:00
Chris Lattner	76dc204488	move two functions up in the file, use SafeToMergeTerminators to eliminate some duplicated code llvm-svn: 22610	2005-08-03 00:19:45 +00:00
Chris Lattner	733d6704ce	Rip some code out of the main SimplifyCFG function into a subfunction and call it from the only place it is live. No functionality changes. llvm-svn: 22609	2005-08-03 00:11:16 +00:00
Chris Lattner	ac594de8dc	Disable this patch: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20050801/027345.html This breaks real programs and only fixes an obscure regression testcase. A real fix is in development. llvm-svn: 22606	2005-08-02 23:31:38 +00:00

5144 Commits