llvm-project

Commit Graph

Author	SHA1	Message	Date
Dale Johannesen	1f0e0e7c9c	Fix the time regression I introduced in 464.h264ref with my earlier patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Also, the mechanism for keeping SCEV's corresponding to GEP's no longer works, as the GEP might change after its SCEV is remembered, invalidating the SCEV, and we might get a bad SCEV value when looking up the GEP again for a later loop. This also couldn't happen before, as we weren't recursing into GEP's outside the loop. Also, when we build an expression that involves a (possibly non-affine) IV from a different loop as well as an IV from the one we're interested in (containsAddRecFromDifferentLoop), don't recurse into that. We can't do much with it and will get in trouble if we try to create new non-affine IVs or something. More testcases are coming. llvm-svn: 62212	2009-01-14 02:35:31 +00:00
Chris Lattner	2538eb664c	rewrite OptimizeAwayTrappingUsesOfLoads to 1) avoid a temporary vector and extraneous loop over it, 2) not delete globals used by phis/selects etc which could actually be useful. This fixes PR3321. Many thanks to Duncan for narrowing this down. llvm-svn: 62201	2009-01-14 00:12:58 +00:00
Dale Johannesen	0aeabdff57	Fix testsuite regressions from recursive inlining. llvm-svn: 62189	2009-01-13 22:43:37 +00:00
Dan Gohman	59af77376c	Make instcombine ensure that all allocas are explicitly aligned at at least their preferred alignment. llvm-svn: 62176	2009-01-13 20:18:38 +00:00
Duncan Sands	944ccc5d6a	Correct a comment. llvm-svn: 62165	2009-01-13 13:48:44 +00:00
Dale Johannesen	433a9086c0	Enable recursive inlining. Reduce inlining threshold back to 200; 400 seems to be too high, loses more than it gains. llvm-svn: 62107	2009-01-12 22:11:50 +00:00
Duncan Sands	dc020f9c3c	Rename getABITypeSize to getTypePaddedSize, as suggested by Chris. llvm-svn: 62099	2009-01-12 20:38:59 +00:00
Dale Johannesen	f84685290a	Increase default inlining aggressiveness in partial compensation for turning off gcc's inliner. This gets us closer to the amount of inlining we were getting before. It is not a win on everything, of course, but seems to gain overall. llvm-svn: 62058	2009-01-11 23:11:00 +00:00
Chris Lattner	bd3c7c8b52	Duncan is nervous about undefinedness of % with negatives. I'm not thrilled about 64-bit % in general, so rewrite to use * instead. llvm-svn: 62047	2009-01-11 20:41:36 +00:00
Chris Lattner	b19151686f	do not generated GEPs into vectors where they don't already exist. We should treat vectors as atomic types, not like arrays. llvm-svn: 62046	2009-01-11 20:23:52 +00:00
Chris Lattner	171d2d474f	Make a couple of cleanups to the instcombine bitcast/gep canonicalization transform based on duncan's comments: 1) improve the comment about %. 2) within our index loop make sure the offset stays within the type size, instead of within the abi size. This allows us to reason explicitly about landing in tail padding and means that issues like non-zero offsets into [0 x foo] types don't occur anymore. llvm-svn: 62045	2009-01-11 20:15:20 +00:00
Chris Lattner	5f54d50917	fix typo Duncan noticed. llvm-svn: 61997	2009-01-09 18:31:39 +00:00
Chris Lattner	ae0e857b98	Fix PR3304 llvm-svn: 61995	2009-01-09 18:18:43 +00:00
Misha Brukman	5cbf223916	Removed trailing whitespace from Makefiles. llvm-svn: 61991	2009-01-09 16:44:42 +00:00
Chris Lattner	f50aa6ae5c	Implement rdar://6480391, extending of equality icmp's to avoid a truncation. I noticed this in the code compiled for a routine using std::map, which produced this code: %25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly %.lobit.i = lshr i32 %25, 31 ; <i32> [#uses=1] %tmp.i = trunc i32 %.lobit.i to i8 ; <i8> [#uses=1] %toBool = icmp eq i8 %tmp.i, 0 ; <i1> [#uses=1] br i1 %toBool, label %bb3, label %bb4 which compiled to: call L_memcmp$stub shrl $31, %eax testb %al, %al jne LBB1_11 ## with this change, we compile it to: call L_memcmp$stub testl %eax, %eax js LBB1_11 This triggers all the time in common code, with patters like this: %169 = and i32 %ply, 1 ; <i32> [#uses=1] %170 = trunc i32 %169 to i8 ; <i8> [#uses=1] %toBool = icmp ne i8 %170, 0 ; <i1> [#uses=1] %7 = lshr i32 %6, 24 ; <i32> [#uses=1] %9 = trunc i32 %7 to i8 ; <i8> [#uses=1] %10 = icmp ne i8 %9, 0 ; <i1> [#uses=1] etc llvm-svn: 61985	2009-01-09 07:47:06 +00:00
Chris Lattner	0f7cf1d7e1	Remove some old code that looks like a remanant from signed-types days. llvm-svn: 61984	2009-01-09 07:10:58 +00:00
Chris Lattner	482eb70a10	Fix PR3298, a crash in Jump Threading. Apparently even jump threading can have bugs, who knew? ;-) llvm-svn: 61983	2009-01-09 06:08:12 +00:00
Chris Lattner	fef138b140	Fix part 3/2 of PR3290, making instcombine zap (gep(bitcast)) when possible. llvm-svn: 61980	2009-01-09 05:44:56 +00:00
Chris Lattner	a784a2ce01	move some code, check to see if the input to the GEP is a bitcast (which is constant time and cheap) before checking hasAllZeroIndices. llvm-svn: 61976	2009-01-09 04:53:57 +00:00
Dale Johannesen	4755d9df78	Adjustments to last patch based on review. llvm-svn: 61969	2009-01-09 01:30:11 +00:00
Dale Johannesen	b48fc71fc6	Do not inline functions with (dynamic) alloca into functions that don't already have a (dynamic) alloca. Dynamic allocas cause inefficient codegen and we shouldn't propagate this (behavior follows gcc). Two existing tests assumed such inlining would be done; they are hacked by adding an alloca in the caller, preserving the point of the tests. llvm-svn: 61946	2009-01-08 21:45:23 +00:00
Chris Lattner	c518dfd11b	This implements the second half of the fix for PR3290, handling loads from allocas that cover the entire aggregate. This handles some memcpy/byval cases that are produced by llvm-gcc. This triggers a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator <kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon). llvm-svn: 61915	2009-01-08 05:42:05 +00:00
Duncan Sands	0bcf085845	Whitespace - correct formatting. llvm-svn: 61879	2009-01-07 20:01:06 +00:00
Duncan Sands	289f59f233	Remove alloca tracking from nocapture analysis. Not only was it not very helpful, it was also wrong! The problem is shown in the testcase: the alloca might be passed to a nocapture callee which dereferences it and returns the original pointer. But because it was a nocapture call we think we don't need to track its uses, but we do. llvm-svn: 61876	2009-01-07 19:39:06 +00:00
Duncan Sands	94bcbbab74	Reorder these. llvm-svn: 61873	2009-01-07 19:17:02 +00:00
Duncan Sands	02599850b4	Use a switch rather than a sequence of "isa" tests. llvm-svn: 61872	2009-01-07 19:10:21 +00:00
Duncan Sands	187c5716b6	The verifier checks that the aliasee is not null. llvm-svn: 61870	2009-01-07 18:45:53 +00:00
Chris Lattner	f2b8c82ad1	Implement the first half of PR3290: if there is a store of an integer to a (transitive) bitcast the alloca and if that integer has the full size of the alloca, then it clobbers the whole thing. Handle this by extracting pieces out of the stored integer and filing them away in the SROA'd elements. This triggers fairly frequently because the CFE uses integers to pass small structs by value and the inliner exposes these. For example, in kimwitu++, I see a bunch of these with i64 stores to "%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>" In 176.gcc I see a few i32 stores to "%struct..0anon". In the testcase, this is a difference between compiling test1 to: _test1: subl $12, %esp movl 20(%esp), %eax movl %eax, 4(%esp) movl 16(%esp), %eax movl %eax, (%esp) movl (%esp), %eax addl 4(%esp), %eax addl $12, %esp ret vs: _test1: movl 8(%esp), %eax addl 4(%esp), %eax ret The second half of this will be to handle loads of the same form. llvm-svn: 61853	2009-01-07 08:11:13 +00:00
Chris Lattner	9a2de65fd6	Factor a bunch of code out into a helper method. llvm-svn: 61852	2009-01-07 07:18:45 +00:00
Chris Lattner	db561146aa	use continue to simplify code and reduce nesting, no functionality change. llvm-svn: 61851	2009-01-07 06:39:58 +00:00
Chris Lattner	938b54f383	Get TargetData once up front and cache as an ivar instead of requerying it all over the place. llvm-svn: 61850	2009-01-07 06:34:28 +00:00
Chris Lattner	a63dba9e6c	Use the hasAllZeroIndices predicate to simplify some code, no functionality change. llvm-svn: 61849	2009-01-07 06:25:07 +00:00
Chris Lattner	2fdcc59bb6	Change m_ConstantInt and m_SelectCst to take their constant integers as template arguments instead of as instance variables, exposing more optimization opportunities to the compiler earlier. llvm-svn: 61776	2009-01-05 23:53:12 +00:00
Duncan Sands	582c53d147	Teach the internalize pass to also internalize global aliases. llvm-svn: 61754	2009-01-05 21:24:45 +00:00
Evan Cheng	8804293fe9	Find loop back edges only after empty blocks are eliminated. llvm-svn: 61752	2009-01-05 21:17:27 +00:00
Duncan Sands	52e5deece5	Not having an aliasee is a theoretical possibility. llvm-svn: 61745	2009-01-05 20:47:56 +00:00
Duncan Sands	821d13cf78	Format more neatly. llvm-svn: 61744	2009-01-05 20:39:50 +00:00
Duncan Sands	d24b93f339	Remove trailing spaces. llvm-svn: 61743	2009-01-05 20:38:27 +00:00
Duncan Sands	f5dbbae4f4	Delete unused global aliases with internal linkage. In fact this also deletes those with linkonce linkage, however this is currently dead because for the moment aliases aren't allowed to have this linkage type. llvm-svn: 61742	2009-01-05 20:37:33 +00:00
Dan Gohman	906152a20f	Tidy up #includes, deleting a bunch of unnecessary #includes. llvm-svn: 61715	2009-01-05 17:59:02 +00:00
Nick Lewycky	e4e5532e05	Move the libcall annotating part from doFinalization to doInitialization. Finalization occurs after all the FunctionPasses in the group have run, which is clearly not what we want. This also means that we have to make sure that we apply the right param attributes when creating a new function. Also, add a missed optimization: strdup and strndup. NoCapture and NoAlias return! llvm-svn: 61658	2009-01-05 00:07:50 +00:00
Nick Lewycky	959af7ba30	Run a post-pass that marks known function declarations by name. llvm-svn: 61632	2009-01-04 20:27:34 +00:00
Bill Wendling	0c04f9fdc3	Revert this transform. It was causing some dramatic slowdowns in a few tests. See PR3266. llvm-svn: 61623	2009-01-04 06:19:11 +00:00
Nick Lewycky	1d805c62c4	Any void readonly functions are provably dead, don't waste time adding nocapture attributes to them. llvm-svn: 61610	2009-01-03 17:05:32 +00:00
Duncan Sands	c7affb0a8f	Load tracking means that the value analyzed may not have pointer type. In particular, it may be the condition argument for a select or a GEP index. While I was unable to construct a testcase for which some bits of the original pointer are captured due to one of these, it's very very close to being possible - so play safe and exclude these possibilities. llvm-svn: 61580	2009-01-02 15:16:38 +00:00
Duncan Sands	b193a37cd3	When calculating 'nocapture' argument attributes, allow the argument to be stored to an alloca by tracking uses of the alloca. This occurs 4 times (out of 7121, 0.05%) in MultiSource/Applications, so may not be worth it. On the other hand, it is easy to do and fairly cheap. The functions it helps are: W_addcom and W_addlit in spiff; process_args (argv) in d (make_dparser); ercPixConcealIMB in JM/ldecod. llvm-svn: 61570	2009-01-02 11:54:37 +00:00
Duncan Sands	cefc8604aa	Improve comments and reorganize a bit - no functionality change. llvm-svn: 61569	2009-01-02 11:46:24 +00:00
Nick Lewycky	7e82055e88	Make adding nocapture a bit stronger. FreeInst is nocapture. Also, functions that don't write can't leak a pointer except through the return value, so a void readonly function is implicitly nocapture. Test these, and add a test that verifies that f1 calling f2 with an otherwise dead pointer gets both of them marked nocapture. llvm-svn: 61552	2009-01-02 03:46:56 +00:00
Duncan Sands	1f11d2bbc1	Mention that this pass does escape analysis in the leading comments. llvm-svn: 61548	2009-01-01 20:45:19 +00:00
Bill Wendling	0fcff2c203	Fix comment. llvm-svn: 61538	2009-01-01 01:19:59 +00:00
Bill Wendling	aedb54a947	Add transformation: xor (or (icmp, icmp), true) -> and(icmp, icmp) This is possible because of De Morgan's law. llvm-svn: 61537	2009-01-01 01:18:23 +00:00
Duncan Sands	163848021b	Look through phi nodes and select instructions when calculating nocapture attributes. llvm-svn: 61535	2008-12-31 20:21:34 +00:00
Duncan Sands	df128eb477	Don't analyze arguments already marked 'nocapture'. llvm-svn: 61532	2008-12-31 18:08:59 +00:00
Duncan Sands	44c8cd97a5	Rename AddReadAttrs to FunctionAttrs, and teach it how to work out (in a very simplistic way) which function arguments (pointer arguments only) are only dereferenced and so do not escape. Mark such arguments 'nocapture'. llvm-svn: 61525	2008-12-31 16:14:43 +00:00
Duncan Sands	f6069577fa	Experiments show that looking through phi nodes and select instructions doesn't buy anything here except extra complexity: the only difference in the entire testsuite was that a readonly function became readnone in MiBench/consumer-typeset. Add a comment about this. llvm-svn: 61478	2008-12-29 20:51:17 +00:00
Duncan Sands	c125d6a3d3	Allow readnone functions to read (and write!) global constants, since doing so is irrelevant for aliasing purposes. While this doesn't increase the total number of functions marked readonly or readnone in MultiSource/ Applications (3089), it does result in 12 functions being marked readnone rather than readonly. Before: readnone: 820 readonly: 2269 After: readnone: 832 readonly: 2257 llvm-svn: 61469	2008-12-29 11:34:09 +00:00
Dale Johannesen	656237beca	Revert 61362 and 61402 until SPEC breakage is fixed. llvm-svn: 61403	2008-12-23 23:21:35 +00:00
Dale Johannesen	f8b161bcd1	This fixes the bug in 175.vpr. It doesn't fix the other SPEC breakage. I'll be reverting all recent changes shortly, this checking is mostly so this change doesn't get lost. llvm-svn: 61402	2008-12-23 23:05:26 +00:00
Dale Johannesen	93b9aa8799	Fix the time regression I introduced in 464.h264ref with my last patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Also, the mechanism for keeping SCEV's corresponding to GEP's no longer works, as the GEP might change after its SCEV is remembered, invalidating the SCEV, and we might get a bad SCEV value when looking up the GEP again for a later loop. This also couldn't happen before, as we weren't recursing into GEP's outside the loop. I owe some testcases for this, want to get it in for nightly runs. llvm-svn: 61362	2008-12-23 02:12:52 +00:00
Owen Anderson	164274eeb1	Don't forget to remove phi nodes from the value numbering table after we collapse them. llvm-svn: 61358	2008-12-23 00:49:51 +00:00
Bill Wendling	456e885382	Comment clean-ups. No functionality change. llvm-svn: 61354	2008-12-22 22:32:22 +00:00
Bill Wendling	e7f08e7250	Check that the instruction isn't in the value numbering scope. llvm-svn: 61353	2008-12-22 22:28:56 +00:00
Bill Wendling	86f01cb9f6	Simplification: Negate the operator== method instead of implementing a full operator!= method. llvm-svn: 61352	2008-12-22 22:16:31 +00:00
Bill Wendling	3c793441cb	Add verification that deleted instruction isn't hiding in the PHI map. llvm-svn: 61350	2008-12-22 22:14:07 +00:00
Bill Wendling	ebb6a543fa	Verify removed in a few more places. llvm-svn: 61349	2008-12-22 21:57:30 +00:00
Bill Wendling	6b18a3994b	Add verification functions to GVN which check to see that an instruction was truely deleted. These will be expanded with further checks of all of the data structures. llvm-svn: 61347	2008-12-22 21:36:08 +00:00
Nick Lewycky	10eb8e533f	Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2). llvm-svn: 61297	2008-12-21 00:19:21 +00:00
Nick Lewycky	4bc10c9e77	Remove redundant test for vector-nature. Scan the vector first to see whether our optz'n will apply to it, then build the replacement vector only if needed. llvm-svn: 61279	2008-12-20 16:48:00 +00:00
Evan Cheng	3b3de7c228	- CodeGenPrepare does not split loop back edges but it only knows about back edges of single block loops. It now does a DFS walk to find loop back edges. - Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions. llvm-svn: 61248	2008-12-19 18:03:11 +00:00
Bill Wendling	070de29fcf	Didn't mean to commit this. llvm-svn: 61222	2008-12-18 22:19:50 +00:00
Bill Wendling	4c13e77d49	Re-XFAIL this test until debug stuff settles down. llvm-svn: 61219	2008-12-18 22:13:31 +00:00
Nick Lewycky	c3a70ade66	Oops! Left out a line. Simplifying the sdiv might allow further simplifications for our users. llvm-svn: 61196	2008-12-18 06:42:28 +00:00
Nick Lewycky	0f0e63fe73	Make all the vector elements positive in an srem of constant vector. llvm-svn: 61195	2008-12-18 06:31:11 +00:00
Chris Lattner	4caf5eb70c	Fix PR2929 by making bugpoint/code extract propagate the nothrow bit from the original function to the cloned one. llvm-svn: 61194	2008-12-18 05:52:56 +00:00
Dale Johannesen	3e5843b992	Revert previous patch, appears to break bootstrap. llvm-svn: 61181	2008-12-18 01:23:41 +00:00
Dale Johannesen	12d031b716	Fix the time regression I introduced in 464.h264ref with my last patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. (This patch does not handle all the cases where this can happen.) And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Everything above is exercised in CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is the same IR). llvm-svn: 61178	2008-12-18 00:57:22 +00:00
Chris Lattner	b6372933b5	reapply this hunk from Bill's reversion in r61169, it is conservative and safe and orthogonal from turning off load pre. llvm-svn: 61177	2008-12-18 00:51:32 +00:00
Chris Lattner	c1c6404bba	make instnamer name unnamed blocks as well as instructions and args. llvm-svn: 61175	2008-12-18 00:33:11 +00:00
Bill Wendling	be4fb8a25f	Temporarily revert r61027. It was causing a bootstrap failure in "release" mode with everyone's favorite error messages: Comparing stages 2 and 3 warning: ./cc1-checksum.o differs warning: ./cc1plus-checksum.o differs Bootstrap comparison failure! ./c-decl.o differs ./cp/decl.o differs ./df-core.o differs ./gcc.o differs ./i386.o differs ./stor-layout.o differs ./tree-pretty-print.o differs ./tree.o differs make[2]: * [compare] Error 1 make[1]: * [stage3-bubble] Error 2 See PR3227. llvm-svn: 61169	2008-12-17 23:31:20 +00:00
Chris Lattner	0cdf52310a	insert some sequence points and preincrement an iterator to avoid iterator invalidation problems. llvm-svn: 61124	2008-12-17 05:42:08 +00:00
Chris Lattner	222ef4c489	Enhance heap sra to be substantially more aggressive w.r.t PHI nodes. This allows it to do fairly general phi insertion if a load from a pointer global wants to be SRAd but the load is used by (recursive) phi nodes. This fixes a pessimization on ppc introduced by Load PRE. llvm-svn: 61123	2008-12-17 05:28:49 +00:00
Dale Johannesen	904ce8120d	Clarify that the scale factor from CheckForIVReuse can be negative. Keep track of whether all uses of an IV are outside the loop. Some cosmetics; no functional change. llvm-svn: 61109	2008-12-16 22:16:28 +00:00
Chris Lattner	56b55387fc	Fix another crash found by inspection. If we have a PHI node merging the load multiple times, make sure the check the uses of the PHI to ensure they are transformable. llvm-svn: 61102	2008-12-16 21:24:51 +00:00
Chris Lattner	06a456b3f4	fix a crash found by inspection. llvm-svn: 61101	2008-12-16 21:04:51 +00:00
Eli Friedman	cb61afb546	Add a helper to remove a branch and DCE the condition, and use it consistently for deleting branches. In addition to being slightly more readable, this makes SimplifyCFG a bit better about cleaning up after itself when it makes conditions unused. llvm-svn: 61100	2008-12-16 20:54:32 +00:00
Chris Lattner	6ddde53783	switch some std::set/std::map to SmallPtrSet/DenseMap. llvm-svn: 61081	2008-12-16 07:34:30 +00:00
Chris Lattner	49e3bdc165	enhance heap-sra to apply to fixed sized array allocations, not just variable sized array allocations. llvm-svn: 61051	2008-12-15 21:44:34 +00:00
Chris Lattner	1c731fa86f	Use stripPointerCasts. llvm-svn: 61047	2008-12-15 21:20:32 +00:00
Chris Lattner	f0eb568021	minor tweaks for formatting, allow bitcast in ValueIsOnlyUsedLocallyOrStoredToOneGlobal. llvm-svn: 61046	2008-12-15 21:08:54 +00:00
Chris Lattner	c4274a71d5	refactor some code into a new TryToOptimizeStoreOfMallocToGlobal function. Use GetElementPtrInst::hasAllZeroIndices where possible. llvm-svn: 61045	2008-12-15 21:02:25 +00:00
Chris Lattner	0c68ae0603	Enable Load PRE. This teaches GVN to push partially redundant loads up the CFG when there is exactly one predecessor where the load is not available. This is designed to not increase code size but still eliminate partially redundant loads. This fires 1765 times on 403.gcc even though it doesn't do critical edge splitting yet (the most common reason for it to fail). llvm-svn: 61027	2008-12-15 05:28:29 +00:00
Owen Anderson	03aacbae90	Ifdef out some code that I didn't mean to enable by default yet. llvm-svn: 61024	2008-12-15 03:52:17 +00:00
Chris Lattner	69131fd872	make GVN try to rename inputs to the resultant replaced values, which cleans up the generated code a bit. This should have the added benefit of not randomly renaming functions/globals like my previous patch did. :) llvm-svn: 61023	2008-12-15 03:46:38 +00:00
Owen Anderson	bfe133e4ac	Add support for slow-path GVN with full phi construction for scalars. This is disabled for now, as it actually pessimizes code in the abscence of phi translation for load elimination. This slow down GVN a bit, by about 2% on 403.gcc. llvm-svn: 61021	2008-12-15 02:03:00 +00:00
Chris Lattner	f5eef9f6db	eliminate warning when asserts disabled. llvm-svn: 61012	2008-12-14 21:36:23 +00:00
Owen Anderson	e34c2399de	Generalize GVN's phi construciton routine to work for things other than loads. llvm-svn: 61009	2008-12-14 19:10:35 +00:00
Bill Wendling	293b9181e5	Temporarily revert r60973. It's inexplicably causing a failure when self-hosting LLVM: llvm[2]: Linking Release executable opt (without symbols) ... Undefined symbols: "llvm::APFloat::IEEEsingle", referenced from: __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o) __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o) __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o) "llvm::APFloat::IEEEdouble", referenced from: __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o) __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o) __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o) ld: symbol(s) not found This is in release mode. To replicate, compile llvm and llvm-gcc in optimized mode. Then build llvm, in optimized mode, with the newly created compiler. llvm-svn: 60977	2008-12-13 09:28:44 +00:00
Chris Lattner	1e29f7c97d	make RLE preserve the name of the load that it replaces. This is just a pretification of the IR. llvm-svn: 60973	2008-12-13 07:22:47 +00:00
Misha Brukman	234b44add2	Fix spelling. llvm-svn: 60971	2008-12-13 05:21:37 +00:00
Chris Lattner	fa9f99aa12	Teach GVN to invalidate some memdep information when it does an RAUW of a pointer. This allows is to catch more equivalencies. For example, the type_lists_compatible_p function used to require two iterations of the gvn pass (!) to delete its 18 redundant loads because the first pass would CSE all the addressing computation cruft, which would unblock the second memdep/gvn passes from recognizing them. This change allows memdep/gvn to catch all 18 when run just once on the function (as is typical :) instead of just 3. On all of 403.gcc, this bumps up the # reundandancies found from: 63 gvn - Number of instructions PRE'd 153991 gvn - Number of instructions deleted 50069 gvn - Number of loads deleted to: 63 gvn - Number of instructions PRE'd 154137 gvn - Number of instructions deleted 50185 gvn - Number of loads deleted +120 loads deleted isn't bad. llvm-svn: 60799	2008-12-09 22:06:23 +00:00
Chris Lattner	254314e6bc	rename getNonLocalDependency -> getNonLocalCallDependency, and remove pointer stuff from it, simplifying the code a bit. llvm-svn: 60783	2008-12-09 19:38:05 +00:00
Chris Lattner	b6fc4b8d92	Switch GVN::processNonLocalLoad to using the new MemDep::getNonLocalPointerDependency method. There are some open issues with this (missed optimizations) and plenty of future work, but this does allow GVN to eliminate slightly more loads (49246 vs 49033). Switching over now allows simplification of the other code path in memdep. llvm-svn: 60780	2008-12-09 19:25:07 +00:00
Chris Lattner	0a5a8d54a9	random cleanups, no functionality change. llvm-svn: 60779	2008-12-09 19:21:47 +00:00
Chris Lattner	56b20ffc5f	Fix a really subtle off-by-one bug that Duncan noticed with valgrind on test/CodeGen/Generic/2007-06-06-CriticalEdgeLandingPad. llvm-svn: 60739	2008-12-09 04:47:21 +00:00
Chris Lattner	e598370ae9	remove DebugIterations option. Despite the accusations, jump threading has been shown to only expose problems not have bugs itself. I'm sure it's completely bug free! ;-) llvm-svn: 60725	2008-12-08 22:44:07 +00:00
Devang Patel	2bb8a2f80f	Fix spelling. Thanks Duncan! llvm-svn: 60702	2008-12-08 17:07:24 +00:00
Devang Patel	1c469d36b0	Undo previous patch. llvm-svn: 60701	2008-12-08 17:02:37 +00:00
Chris Lattner	f50d7f76c6	fix a bug I introduced in simplifycfg handling single entry phi nodes. FoldSingleEntryPHINodes deletes the PHI, so there is no need to delete it afterward. llvm-svn: 60653	2008-12-07 07:22:45 +00:00
Chris Lattner	5df5b4cc2e	don't bother touching volatile stores, they will just return clobber on everything interesting anyway. llvm-svn: 60640	2008-12-07 00:25:15 +00:00
Chris Lattner	57e91eaf61	Reimplement the inner loop of DSE. It now uniformly uses getDependence(), doesn't do its own local caching, and is slightly more aggressive about free/store dse (see testcase). This eliminates the last external client of MemDep::getDependenceFrom(). llvm-svn: 60619	2008-12-06 00:53:22 +00:00
Dale Johannesen	9efd2ce55b	Make LoopStrengthReduce smarter about hoisting things out of loops when they can be subsumed into addressing modes. Change X86 addressing mode check to realize that some PIC references need an extra register. (I believe this is correct for Linux, if not, I'm sure someone will tell me.) llvm-svn: 60608	2008-12-05 21:47:27 +00:00
Chris Lattner	0e3d6337c6	Make a few major changes to memdep and its clients: 1. Merge the 'None' result into 'Normal', making loads and stores return their dependencies on allocations as Normal. 2. Split the 'Normal' result into 'Clobber' and 'Def' to distinguish between the cases when memdep knows the value is produced from when we just know if may be changed. 3. Move some of the logic for determining whether readonly calls are CSEs into memdep instead of it being in GVN. This still leaves verification that the arguments are hte same to GVN to let it know about value equivalences in different contexts. 4. Change memdep's call/call dependency analysis to use getModRefInfo(CallSite,CallSite) instead of doing something very weak. This only really matters for things like DSA, but someday maybe we'll have some other decent context sensitive analyses :) 5. This reimplements the guts of memdep to handle the new results. 6. This simplifies GVN significantly: a) readonly call CSE is slightly simpler b) I eliminated the "getDependencyFrom" chaining for load elimination and load CSE doesn't have to worry about volatile (they are always clobbers) anymore. c) GVN no longer does any 'lastLoad' caching, leaving it to memdep. 7. The logic in DSE is simplified a bit and sped up. A potentially unsafe case was eliminated. llvm-svn: 60607	2008-12-05 21:04:20 +00:00
Anton Korobeynikov	24600bf05a	Revert invalid r60393. It causes llvm-gcc bootstrap fails in release builds. See PR3160 for details llvm-svn: 60604	2008-12-05 19:38:49 +00:00
Chris Lattner	c100828026	Fix test/Transforms/GVN/pre-load.ll llvm-svn: 60594	2008-12-05 17:04:12 +00:00
Chris Lattner	d2a653af0c	Make IsValueFullyAvailableInBlock safe. llvm-svn: 60588	2008-12-05 07:49:08 +00:00
Devang Patel	c56423b500	Rewrite code that 1) filters loops and 2) calculates new loop bounds. This fixes many bugs. I will add more test cases in a separate check-in. Some day, the code that manipulates CFG and updates dom. info could use refactoring help. llvm-svn: 60554	2008-12-04 21:38:42 +00:00
Chris Lattner	8f723670ce	Start simplifying a switch that has a successor that is a switch. llvm-svn: 60534	2008-12-04 06:31:07 +00:00
Chris Lattner	75c2661d24	add a debugging option to help track down j-t problems. llvm-svn: 60514	2008-12-04 00:07:59 +00:00
Dale Johannesen	4e9e6ea604	Remove an unused field. llvm-svn: 60508	2008-12-03 22:43:56 +00:00
Dale Johannesen	f7a588b909	Fix a misspelled function name. llvm-svn: 60506	2008-12-03 20:56:12 +00:00
Chris Lattner	dc3f6f2c12	Factor some code into a new FoldSingleEntryPHINodes method. llvm-svn: 60501	2008-12-03 19:44:02 +00:00
Dale Johannesen	d49ceff6ba	Fix a really wrong comment. llvm-svn: 60494	2008-12-03 19:25:46 +00:00
Chris Lattner	595c7279bd	Teach jump threading some more simple tricks: 1) have it fold "br undef", which does occur with surprising frequency as jump threading iterates. 2) teach j-t to delete dead blocks. This removes the successor edges, reducing the in-edges of other blocks, allowing recursive simplification. 3) Fold things like: br COND, BBX, BBY BBX: br COND, BBZ, BBW which also happens because jump threading iterates. llvm-svn: 60470	2008-12-03 07:48:08 +00:00
Chris Lattner	37e0136fef	third time is the charm. llvm-svn: 60469	2008-12-03 07:45:15 +00:00
Chris Lattner	c04a1ffa9a	fix assertion. llvm-svn: 60468	2008-12-03 07:43:05 +00:00
Chris Lattner	7eb270ed03	Rename DeleteBlockIfDead to DeleteDeadBlock and make it unconditionally delete the block. All likely clients will do the checking anyway. llvm-svn: 60464	2008-12-03 06:40:52 +00:00
Chris Lattner	bcc904a67c	Factor some code out of SimplifyCFG, forming a new DeleteBlockIfDead method. llvm-svn: 60463	2008-12-03 06:37:44 +00:00
Dale Johannesen	4d2ecb8f68	Minor rewrite per review feedback. llvm-svn: 60442	2008-12-02 21:17:11 +00:00
Dale Johannesen	70060013d2	Make the code do what the comment says it does. llvm-svn: 60431	2008-12-02 18:40:09 +00:00
Chris Lattner	1db9bbe802	Implement PRE of loads in the GVN pass with a pretty cheap and straight-forward implementation. This does not require any extra alias analysis queries beyond what we already do for non-local loads. Some programs really really like load PRE. For example, SPASS triggers this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc. The biggest limitation to the implementation is that it does not split critical edges. This is a huge killer on many programs and should be addressed after the initial patch is enabled by default. The implementation of this should incidentally speed up rejection of non-local loads because it avoids creating the repl densemap in cases when it won't be used for fully redundant loads. This is currently disabled by default. Before I turn this on, I need to fix a couple of miscompilations in the testsuite, look at compile time performance numbers, and look at perf impact. This is pretty close to ready though. llvm-svn: 60408	2008-12-02 08:16:11 +00:00
Bill Wendling	87beb9b909	Remove some errors that crept in. No functionality change. llvm-svn: 60403	2008-12-02 06:24:20 +00:00
Bill Wendling	790b4bf9a9	Merge two if-statements into one. llvm-svn: 60402	2008-12-02 06:22:04 +00:00
Bill Wendling	5635295266	More styalistic changes. No functionality change. llvm-svn: 60401	2008-12-02 06:18:11 +00:00
Bill Wendling	85de4b35ca	- Remove the buggy -X/C -> X/-C transform. This isn't valid when X isn't a constant. If X is a constant, then this is folded elsewhere. - Added a note to Target/README.txt to indicate that we'd like to implement this when we're able. llvm-svn: 60399	2008-12-02 05:12:47 +00:00
Bill Wendling	5369db5917	Improve comment. llvm-svn: 60398	2008-12-02 05:09:00 +00:00
Bill Wendling	21716dff5e	- Reduce nesting. - No need to do a swap on a canonicalized pattern. No functionality change. llvm-svn: 60397	2008-12-02 05:06:43 +00:00
Chris Lattner	ead1a61b47	some random comment improvements. llvm-svn: 60395	2008-12-02 04:52:26 +00:00
Owen Anderson	d930420ccf	Fix an issue that Chris noticed, where local PRE was not properly instantiating a new value numbering set after splitting a critical edge. This increases the number of instances of PRE on 403.gcc from ~60 to ~570. llvm-svn: 60393	2008-12-02 04:09:22 +00:00
Dale Johannesen	069a4eee55	Consider only references to an IV within the loop when figuring out the base of the IV. This produces better code in the example. (Addresses use (IV) instead of (BASE,IV) - a significant improvement on low-register machines like x86). llvm-svn: 60374	2008-12-01 22:00:01 +00:00
Bill Wendling	6f71bce4cf	Don't rebuild RHSNeg. Just use the one that's already there. llvm-svn: 60370	2008-12-01 21:06:30 +00:00
Bill Wendling	84f6f2539f	Document what this check is doing. Also, no need to cast to ConstantInt. llvm-svn: 60369	2008-12-01 21:03:43 +00:00
Bill Wendling	e6c87a4952	Use a simple comparison. Overflow on integer negation can only occur when the integer is "minint". llvm-svn: 60366	2008-12-01 19:46:27 +00:00
Bill Wendling	47f733e4ea	Generalize the FoldOrWithConstant method to fold for any two constants which don't have overlapping bits. llvm-svn: 60344	2008-12-01 08:32:40 +00:00
Bill Wendling	22e761b302	Reduce copy-and-paste code by splitting out the code into its own function. llvm-svn: 60343	2008-12-01 08:23:25 +00:00
Bill Wendling	582fe6b0ca	Use m_Specific() instead of double matching. llvm-svn: 60341	2008-12-01 08:09:47 +00:00
Bill Wendling	4eecfb655b	Move pattern check outside of the if-then statement. This prevents us from fiddling with constants unless we have to. llvm-svn: 60340	2008-12-01 07:47:02 +00:00
Chris Lattner	6f5bf6a718	Rename some variables, only increment BI once at the start of the loop instead of throughout it. llvm-svn: 60339	2008-12-01 07:35:54 +00:00
Chris Lattner	f00aae4968	pull the predMap densemap out of the inner loop of performPRE, so that it isn't reallocated all the time. This is a tiny speedup for GVN: 3.90->3.88s llvm-svn: 60338	2008-12-01 07:29:03 +00:00
Chris Lattner	2b07d3ccde	switch a couple more calls to use array_pod_sort. llvm-svn: 60337	2008-12-01 06:52:57 +00:00
Chris Lattner	2c2dd15a85	Introduce a new array_pod_sort function and switch LSR to use it instead of std::sort. This shrinks the release-asserts LSR.o file by 1100 bytes of code on my system. We should start using array_pod_sort where possible. llvm-svn: 60335	2008-12-01 06:49:59 +00:00
Chris Lattner	2aebea5735	Eliminate use of setvector for the DeadInsts set, just use a smallvector. This is a lot cheaper and conceptually simpler. llvm-svn: 60332	2008-12-01 06:27:41 +00:00
Chris Lattner	4da78e3774	DeleteTriviallyDeadInstructions is always passed the DeadInsts ivar, just use it directly. llvm-svn: 60330	2008-12-01 06:14:28 +00:00
Chris Lattner	a68a5a4784	simplify DeleteTriviallyDeadInstructions again, unlike my previous buggy rewrite, this notifies ScalarEvolution of a pending instruction about to be removed and then erases it, instead of erasing it then notifying. llvm-svn: 60329	2008-12-01 06:11:32 +00:00
Chris Lattner	9e6b243428	simplify these patterns using m_Specific. No need to grep for xor in testcase (or is a substring). llvm-svn: 60328	2008-12-01 05:16:26 +00:00
Chris Lattner	88a1f0213d	Teach jump threading to clean up after itself, DCE and constfolding the new instructions it simplifies. Because we're threading jumps on edges with constants coming in from PHI's, we inherently are exposing a lot more constants to the new block. Folding them and deleting dead conditions allows the cost model in jump threading to be more accurate as it iterates. llvm-svn: 60327	2008-12-01 04:48:07 +00:00
Chris Lattner	084b3a47d3	Change instcombine to use FoldPHIArgGEPIntoPHI to fold two operand PHIs instead of using FoldPHIArgBinOpIntoPHI. In addition to being more obvious, this also fixes a problem where instcombine wouldn't merge two phis that had different variable indices. This prevented instcombine from factoring big chunks of code in 403.gcc. For example: insn_cuid.exit: - %tmp336 = load i32** @uid_cuid, align 4 - %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3 - %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32* - %tmp339 = load i32* %tmp338, align 4 - %tmp340 = getelementptr i32* %tmp336, i32 %tmp339 br label %bb62 bb61: - %tmp341 = load i32** @uid_cuid, align 4 - %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3 - %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32* - %tmp344 = load i32* %tmp343, align 4 - %tmp345 = getelementptr i32* %tmp341, i32 %tmp344 br label %bb62 bb62: - %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ] + %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ] + %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3 + %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32* + %tmp341.pn = load i32** @uid_cuid + %tmp344.pn = load i32* %tmp344.pn.in + %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn %iftmp.62.0 = load i32* %iftmp.62.0.in llvm-svn: 60325	2008-12-01 03:42:51 +00:00
Chris Lattner	9d02a70a7d	Teach inst combine to merge GEPs through PHIs. This is really important because it is sinking the loads using the GEPs, but not the GEPs themselves. This triggers 647 times on 403.gcc and makes the .s file much much nicer. For example before: je LBB1_87 ## bb78 LBB1_62: ## bb77 leal 84(%esi), %eax LBB1_63: ## bb79 movl (%eax), %eax ... LBB1_87: ## bb78 movl $0, 4(%esp) movl %esi, (%esp) call L_make_decl_rtl$stub jmp LBB1_62 ## bb77 after: jne LBB1_63 ## bb79 LBB1_62: ## bb78 movl $0, 4(%esp) movl %esi, (%esp) call L_make_decl_rtl$stub LBB1_63: ## bb79 movl 84(%esi), %eax The input code was (and the GEPs are merged and the PHI is now eliminated by instcombine): br i1 %tmp233, label %bb78, label %bb77 bb77: %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22 br label %bb79 bb78: call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22 br label %bb79 bb79: %iftmp.12.0.in = phi %struct.rtx_def [ %tmp235, %bb78 ], [ %tmp234, %bb77 ] %iftmp.12.0 = load %struct.rtx_def %iftmp.12.0.in llvm-svn: 60322	2008-12-01 02:34:36 +00:00
Chris Lattner	9ce8995d24	Make GVN be more intelligent about redundant load elimination: when finding dependent load/stores, realize that they are the same if aliasing claims must alias instead of relying on the pointers to be exactly equal. This makes load elimination more aggressive. For example, on 403.gcc, we had: < 68 gvn - Number of instructions PRE'd < 152718 gvn - Number of instructions deleted < 49699 gvn - Number of loads deleted < 6153 memdep - Number of dirty cached non-local responses < 169336 memdep - Number of fully cached non-local responses < 162428 memdep - Number of uncached non-local responses now we have: > 64 gvn - Number of instructions PRE'd > 153623 gvn - Number of instructions deleted > 49856 gvn - Number of loads deleted > 5022 memdep - Number of dirty cached non-local responses > 159030 memdep - Number of fully cached non-local responses > 162443 memdep - Number of uncached non-local responses That's an extra 157 loads deleted and extra 905 other instructions nuked. This slows down GVN very slightly, from 3.91 to 3.96s. llvm-svn: 60314	2008-12-01 01:31:36 +00:00
Chris Lattner	7e61dafc95	Reimplement the non-local dependency data structure in terms of a sorted vector instead of a densemap. This shrinks the memory usage of this thing substantially (the high water mark) as well as making operations like scanning it faster. This speeds up memdep slightly, gvn goes from 3.9376 to 3.9118s on 403.gcc This also splits out the statistics for the cached non-local case to differentiate between the dirty and clean cached case. Here's the stats for 403.gcc: 6153 memdep - Number of dirty cached non-local responses 169336 memdep - Number of fully cached non-local responses 162428 memdep - Number of uncached non-local responses yay for caching :) llvm-svn: 60313	2008-12-01 01:15:42 +00:00
Bill Wendling	5b902c5b1e	Implement ((A\|B)&1)\|(B&-2) -> (A&1) \| B transformation. This also takes care of permutations of this pattern. llvm-svn: 60312	2008-12-01 01:07:11 +00:00
Chris Lattner	8541edec44	Cache analyses in ivars and add some useful DEBUG output. This speeds up GVN from 4.0386s to 3.9376s. llvm-svn: 60310	2008-12-01 00:40:32 +00:00
Chris Lattner	80c7d81e81	improve indentation, do cheap checks before expensive ones, remove some fixme's. This speeds up GVN very slightly on 403.gcc (4.06->4.03s) llvm-svn: 60309	2008-11-30 23:39:23 +00:00
Eli Friedman	11c15a5de7	Minor cleanup: use getTrue and getFalse where appropriate. No functional change. llvm-svn: 60307	2008-11-30 22:48:49 +00:00
Eli Friedman	55e4becba9	Some minor cleanups to instcombine; no functionality change. Note that the FoldOpIntoPhi call is dead because it's impossible for the first operand of a subtraction to be both a ConstantInt and a PHINode. llvm-svn: 60306	2008-11-30 21:09:11 +00:00
Bill Wendling	de89bc275c	Add instruction combining for ((A&~B)\|(~A&B)) -> A^B and all permutations. llvm-svn: 60291	2008-11-30 13:52:49 +00:00
Bill Wendling	9eef421e12	Implement (A&((~A)\|B)) -> A&B transformation in the instruction combiner. This takes care of all permutations of this pattern. llvm-svn: 60290	2008-11-30 13:08:13 +00:00
Bill Wendling	2fe3229824	Forgot one remaining call to getSExtValue(). llvm-svn: 60289	2008-11-30 12:41:09 +00:00
Bill Wendling	2d2e7861b5	getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all APInt calls instead. This fixes PR3144. llvm-svn: 60288	2008-11-30 12:38:24 +00:00
Eli Friedman	09bc610945	Optimize memmove and memset into the LLVM builtins. Note that these only show up in code from front-ends besides llvm-gcc, like clang. llvm-svn: 60287	2008-11-30 08:32:11 +00:00
Bill Wendling	7abf352f44	Don't make TwoToExp signed by default. llvm-svn: 60279	2008-11-30 05:29:33 +00:00
Bill Wendling	af200e9237	From Hacker's Delight: "For signed integers, the determination of overflow of xy is not so simple. If x and y have the same sign, then overflow occurs iff xy > 231 - 1. If they have opposite signs, then overflow occurs iff xy < -2*31." In this case, x == -1. llvm-svn: 60278	2008-11-30 05:01:05 +00:00
Bill Wendling	70635adea3	Instcombine was illegally transforming -X/C into X/-C when either X or C overflowed on negation. This commit checks to make sure that neithe C nor X overflows. This requires that the RHS of X (a subtract instruction) be a constant integer. llvm-svn: 60275	2008-11-30 03:42:12 +00:00
Chris Lattner	3ff6d01586	Fix a fixme by making memdep's handling of allocations more logical. If we see that a load depends on the allocation of its memory with no intervening stores, we now return a 'None' depedency instead of "Normal". This tweaks GVN to do its optimization with the new result. llvm-svn: 60267	2008-11-30 01:39:32 +00:00
Chris Lattner	63bd586d35	Eliminate the dropInstruction method, which is not needed any more. Fix a subtle iterator invalidation bug I introduced in the last commit. llvm-svn: 60258	2008-11-29 23:30:39 +00:00
Chris Lattner	1c6b62eb4d	Change MemDep::getNonLocalDependency to return its results as a smallvector instead of a DenseMap. This speeds up GVN by 5% on 403.gcc. llvm-svn: 60255	2008-11-29 21:33:22 +00:00
Chris Lattner	f280b0c729	reimplement getNonLocalDependency with a simpler worklist formulation that is faster and doesn't require nonLazyHelper. Much less code. llvm-svn: 60253	2008-11-29 21:22:42 +00:00
Chris Lattner	8c5ff516c6	Fix a thinko that manifested as a crash on clamav last night. llvm-svn: 60251	2008-11-29 20:29:04 +00:00
Chris Lattner	51ba8d0630	Split getDependency into getDependency and getDependencyFrom, the former does caching, the later doesn't. This dramatically simplifies the logic in getDependency and getDependencyFrom. llvm-svn: 60234	2008-11-29 03:47:00 +00:00
Bill Wendling	469e3aa696	Temporarily revert r60195. It's causing an optimized bootstrap of llvm-gcc to fail. llvm-svn: 60233	2008-11-29 03:43:04 +00:00
Chris Lattner	7f9c8a0f05	Introduce and use a new MemDepResult class to hold the results of a memdep query. This makes it crystal clear what cases can escape from MemDep that the clients have to handle. This also gives the clients a nice simplified interface to it that is easy to poke at. This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType private, yay. llvm-svn: 60231	2008-11-29 02:29:27 +00:00
Chris Lattner	de04e1173a	Reimplement the internal abstraction used by MemDep in terms of a pointer/int pair instead of a manually bitmangled pointer. This forces clients to think a little more about checking the appropriate pieces and will be useful for internal implementation improvements later. I'm not particularly happy with this. After going through this I don't think that the clients of memdep should be exposed to the internal type at all. I'll fix this in a subsequent commit. This has no functionality change. llvm-svn: 60230	2008-11-29 01:43:36 +00:00
Chris Lattner	f3f6a801cc	don't revisit instructions off the beginning of the block. llvm-svn: 60221	2008-11-28 22:50:08 +00:00
Chris Lattner	f2a8ba4cf0	simplify some code, remove escaped newline. llvm-svn: 60213	2008-11-28 21:29:52 +00:00
Chris Lattner	8a172daa55	don't call MergeBasicBlockIntoOnlyPred on a block whose only predecessor is itself. This doesn't make sense, and this is a dead infinite loop anyway. llvm-svn: 60210	2008-11-28 19:54:49 +00:00
Chris Lattner	e9f6c355bf	rewrite RecursivelyDeleteTriviallyDeadInstructions to use a more efficient formulation that doesn't require set lookups or scanning a set. llvm-svn: 60203	2008-11-28 01:20:46 +00:00
Chris Lattner	d4b5ba615e	remove some weirdness that came from the LSR code that has nothing to do with dead instruction elimination. No tests in dejagnu depend on this, so I don't know what it was needed for. llvm-svn: 60202	2008-11-28 00:58:15 +00:00
Chris Lattner	1adb6759ef	rewrite a big chunk of how DSE does recursive dead operand elimination to use more modern infrastructure. Also do a bunch of small cleanups. llvm-svn: 60201	2008-11-28 00:27:14 +00:00
Chris Lattner	8e84c129ce	delete ErasePossiblyDeadInstructionTree, replacing uses of it with RecursivelyDeleteTriviallyDeadInstructions. llvm-svn: 60196	2008-11-27 23:25:44 +00:00
Chris Lattner	c077a2a535	Simplify LoopStrengthReduce::DeleteTriviallyDeadInstructions by making it use RecursivelyDeleteTriviallyDeadInstructions to do the heavy lifting. llvm-svn: 60195	2008-11-27 23:23:35 +00:00
Chris Lattner	a1bbdff933	enhance RecursivelyDeleteTriviallyDeadInstructions to make PHIs dead if they are single-value. llvm-svn: 60194	2008-11-27 23:18:11 +00:00
Chris Lattner	1cb4f72706	Enhance RecursivelyDeleteTriviallyDeadInstructions to optionally return a list of deleted instructions. llvm-svn: 60193	2008-11-27 23:14:34 +00:00
Chris Lattner	96e2dbe008	use continue to reduce indentation llvm-svn: 60192	2008-11-27 23:00:20 +00:00
Chris Lattner	c6c481cdfc	remove doConstantPropagation and dceInstruction, they are just wrappers around the interesting code and use an obscure iterator abstraction that dates back many many years. Move EraseDeadInstructions to Transforms/Utils and name it RecursivelyDeleteTriviallyDeadInstructions. llvm-svn: 60191	2008-11-27 22:57:53 +00:00
Chris Lattner	5ef9ebf787	simplify code. llvm-svn: 60190	2008-11-27 22:56:14 +00:00
Chris Lattner	c92fa42ddd	simplify this logic. llvm-svn: 60189	2008-11-27 22:46:09 +00:00
Nick Lewycky	4ab50b93c8	Chris prefers icmp/select over udiv! llvm-svn: 60187	2008-11-27 22:41:10 +00:00
Nick Lewycky	69941fd0a0	Add a couple of missed optimizations on integer vectors. Multiply and divide by 1, as well as multiply by -1. llvm-svn: 60182	2008-11-27 20:21:08 +00:00
Chris Lattner	4059f43b74	defensive patch: if CGP is merging a block with the entry block, make sure it ends up being the entry block. llvm-svn: 60180	2008-11-27 19:29:14 +00:00
Chris Lattner	5dfbfcd80d	Fix PR3138: if we merge the entry block into another block, make sure to move the other block back up into the entry position! llvm-svn: 60179	2008-11-27 19:25:19 +00:00
Chris Lattner	e0d019def6	switch InstCombine::visitLoadInst to use FindAvailableLoadedValue llvm-svn: 60169	2008-11-27 08:56:30 +00:00
Chris Lattner	c6ae56d23f	enhance FindAvailableLoadedValue to make use of AliasAnalysis if it has it. llvm-svn: 60167	2008-11-27 08:18:12 +00:00
Chris Lattner	72f16e70f0	move FindAvailableLoadedValue from JumpThreading to Transforms/Utils. llvm-svn: 60166	2008-11-27 08:10:05 +00:00
Chris Lattner	d6204bed3d	simplify this code a bit. llvm-svn: 60164	2008-11-27 07:54:38 +00:00
Chris Lattner	206250284d	Use the new MergeBasicBlockIntoOnlyPred function. llvm-svn: 60163	2008-11-27 07:54:12 +00:00
Chris Lattner	99d6809ac1	move MergeBasicBlockIntoOnlyPred to Transforms/Utils. llvm-svn: 60162	2008-11-27 07:43:12 +00:00
Chris Lattner	240051aace	rename ThreadBlock to ProcessBlock, since it does other things than just simple threading. llvm-svn: 60157	2008-11-27 07:20:04 +00:00
Chris Lattner	98d89d1b1b	Make jump threading substantially more powerful, in the following ways: 1. Make it fold blocks separated by an unconditional branch. This enables jump threading to see a broader scope. 2. Make jump threading able to eliminate locally redundant loads when they feed the branch condition of a block. This frequently occurs due to reg2mem running. 3. Make jump threading able to eliminate partially redundant loads when they feed the branch condition of a block. This is common in code with lots of loads and stores like C++ code and 255.vortex. This implements thread-loads.ll and rdar://6402033. Per the fixme's, several pieces of this should be moved into Transforms/Utils. llvm-svn: 60148	2008-11-27 05:07:53 +00:00
Chris Lattner	397a11ccd8	Turn on my codegen prepare heuristic by default. It doesn't affect performance in most cases on the Grawp tester, but does speed some things up (like shootout/hash by 15%). This also doesn't impact compile time in a noticable way on the Grawp tester. It also, of course, gets the testcase it was designed for right :) llvm-svn: 60120	2008-11-26 22:16:44 +00:00
Chris Lattner	fef04acc50	teach the new heuristic how to handle inline asm. llvm-svn: 60088	2008-11-26 04:59:11 +00:00
Chris Lattner	6d71b7fb95	Improve ValueAlreadyLiveAtInst with a cheap and dirty, but effective heuristic: the value is already live at the new memory operation if it is used by some other instruction in the memop's block. This is cheap and simple to compute (moreso than full liveness). This improves the new heuristic even more. For example, it cuts two out of three new instructions out of 255.vortex:DbmFileInGrpHdr, which is one of the functions that the heuristic regressed. This overall eliminates another 40 instructions from 403.gcc and visibly reduces register pressure in 255.vortex (though this only actually ends up saving the 2 instructions from the whole program). llvm-svn: 60084	2008-11-26 03:20:37 +00:00
Chris Lattner	e34fe2c52d	Start rewroking a subpiece of the profitability heuristic to be phrased in terms of liveness instead of as a horrible hack. :) In pratice, this doesn't change the generated code for either 255.vortex or 403.gcc, but it could cause minor code changes in theory. This is framework for coming changes. llvm-svn: 60082	2008-11-26 03:02:41 +00:00
Chris Lattner	383a797f42	add a comment, make save/restore logic more obvious. llvm-svn: 60076	2008-11-26 02:11:11 +00:00
Chris Lattner	eb3e4fb6fb	This adds in some code (currently disabled unless you pass -enable-smarter-addr-folding to llc) that gives CGP a better cost model for when to sink computations into addressing modes. The basic observation is that sinking increases register pressure when part of the addr computation has to be available for other reasons, such as having a use that is a non-memory operation. In cases where it works, it can substantially reduce register pressure. This code is currently an overall win on 403.gcc and 255.vortex (the two things I've been looking at), but there are several things I want to do before enabling it by default: 1. This isn't doing any caching of results, so it is much slower than it could be. It currently slows down release-asserts llc by 1.7% on 176.gcc: 27.12s -> 27.60s. 2. This doesn't think about inline asm memory operands yet. 3. The cost model botches the case when the needed value is live across the computation for other reasons. I'll continue poking at this, and eventually turn it on as llcbeta. llvm-svn: 60074	2008-11-26 02:00:14 +00:00
Evan Cheng	496b042e20	Revert r60042. IndVarSimplify should check if APFloat is PPCDoubleDouble first before trying to convert it to an integer. llvm-svn: 60072	2008-11-26 01:11:57 +00:00
Chris Lattner	a9ab165b08	Teach CodeGenPrepare to look through Bitcast instructions when attempting to optimize addressing modes. This allows us to optimize things like isel-sink2.ll into: movl 4(%esp), %eax cmpb $0, 4(%eax) jne LBB1_2 ## F LBB1_1: ## TB movl $4, %eax ret LBB1_2: ## F movzbl 7(%eax), %eax ret instead of: _test: movl 4(%esp), %eax cmpb $0, 4(%eax) leal 4(%eax), %eax jne LBB1_2 ## F LBB1_1: ## TB movl $4, %eax ret LBB1_2: ## F movzbl 3(%eax), %eax ret This shrinks (e.g.) 403.gcc from 1133510 to 1128345 lines of .s. Note that the 2008-10-16-SpillerBug.ll testcase is dubious at best, I doubt it is really testing what it thinks it is. llvm-svn: 60068	2008-11-26 00:26:16 +00:00
Chris Lattner	f3e95505c5	Teach MatchScaledValue to handle Scales by 1 with MatchAddr (which can recursively match things) and scales by 0 by ignoring them. This triggers once in 403.gcc, saving 1 (!!!!) instruction in the whole huge app. llvm-svn: 60013	2008-11-25 07:25:26 +00:00
Chris Lattner	728f90220a	significantly refactor all the addressing mode matching logic into a new AddressingModeMatcher class. This makes it easier to reason about and reduces passing around of stuff, but has no functionality change. llvm-svn: 60012	2008-11-25 07:09:13 +00:00
Chris Lattner	58f49d2916	refactor all the constantexpr/instruction handling code out into a new FindMaximalLegalAddressingModeForOperation helper method. llvm-svn: 60011	2008-11-25 05:15:49 +00:00
Chris Lattner	a3fbff15b9	another minor tweak llvm-svn: 60010	2008-11-25 04:47:41 +00:00
Chris Lattner	d616ef5683	minor cleanups no functionality change. llvm-svn: 60009	2008-11-25 04:42:10 +00:00
Chris Lattner	6416a6b7a0	rearrange and tidy some code, no functionality change. llvm-svn: 59990	2008-11-24 22:44:16 +00:00
Chris Lattner	d917c8c8fe	minor cleanups to debug code, no functionality change. llvm-svn: 59989	2008-11-24 22:40:05 +00:00
Chris Lattner	d78894197a	reenable the right part of the code. llvm-svn: 59985	2008-11-24 21:26:21 +00:00
Chris Lattner	992a541002	revert an accidental commit, this fixes the regression on test/CodeGen/X86/isel-sink.ll llvm-svn: 59976	2008-11-24 19:40:34 +00:00
Chris Lattner	53d6a07869	Fix 3113: If we have a dead cyclic PHI, replace the whole thing with an undef. llvm-svn: 59972	2008-11-24 19:25:36 +00:00
Devang Patel	702f45df58	Fix build failure. llvm-svn: 59844	2008-11-21 21:00:20 +00:00
Devang Patel	cb181bb203	Silence unused variable warnings. llvm-svn: 59841	2008-11-21 20:00:59 +00:00
Chris Lattner	dd7083452f	reapply Sanjiv's patch to genericize memcpy/memset/memmove to take an arbitrary integer width for the count. llvm-svn: 59823	2008-11-21 16:42:48 +00:00
Bill Wendling	4bce2bff88	Revert r59802. It was breaking the build of llvm-gcc: g++ -m32 -c -g -DIN_GCC -W -Wall -Wwrite-strings -Wmissing-format-attribute -fno-common -mdynamic-no-pic -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"i386-apple-darwin9.5.0\" -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include ../../llvm-gcc.src/gcc/llvm-types.cpp -o llvm-types.o ../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemCpy(llvm::Value, llvm::Value, llvm::Value, unsigned int)': ../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i32' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i64' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemMove(llvm::Value, llvm::Value, llvm::Value, unsigned int)': ../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i32' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i64' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemSet(llvm::Value, llvm::Value, llvm::Value, unsigned int)': ../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i32' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i64' is not a member of 'llvm::Intrinsic' make[3]: [llvm-convert.o] Error 1 make[3]: * Waiting for unfinished jobs.... rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gpl.pod gcc.pod make[2]: * [all-stage1-gcc] Error 2 make[1]: * [stage1-bubble] Error 2 make: *** [all] Error 2 llvm-svn: 59809	2008-11-21 09:09:41 +00:00
Sanjiv Gupta	09a203765a	Make mem[cpy,move,set] intrinsics overloaded. llvm-svn: 59802	2008-11-21 07:49:09 +00:00
Nick Lewycky	07d726ec4d	Optimize (x/y)*y into x-(x%y) in general. Div and rem are about the same, and a subtract is cheaper than a multiply. This generalizes an existing transform. llvm-svn: 59800	2008-11-21 07:33:58 +00:00
Devang Patel	45f1ae028e	Fix unused variable warnings. llvm-svn: 59778	2008-11-21 01:52:59 +00:00
Bill Wendling	f5260d29c2	Fix error where it wasn't getting the correct caller function. llvm-svn: 59758	2008-11-21 00:09:21 +00:00
Bill Wendling	26c6a3e736	If the function being inlined has a higher stack protection level than the inlining function, then increase the stack protection level on the inlining function. llvm-svn: 59757	2008-11-21 00:06:32 +00:00
Devang Patel	38642e598e	Don't forget arguments! llvm-svn: 59745	2008-11-20 19:50:17 +00:00
Devang Patel	c8b2fe1eed	Do not forget llvm.dbg.declare's first argument while removing debugging information. llvm-svn: 59688	2008-11-20 01:20:42 +00:00
Oscar Fuentes	4fb443f81b	CMake: Removed source file. llvm-svn: 59662	2008-11-19 19:32:19 +00:00
Devang Patel	79303b2572	Do not use separate utility to walk all instructions and remove dead dbg intrinsics. Let instcombiner do this job. llvm-svn: 59659	2008-11-19 19:01:37 +00:00
Devang Patel	827bced2b1	Let instcombiner remove redundant dbg intrinsics. llvm-svn: 59658	2008-11-19 18:59:41 +00:00
Devang Patel	7ed6c5317c	If there are two consecutive llvm.dbg.stoppoint calls then it is likely that the optimizer deleted code in between these two intrinsics. Keep only the last llvm.dbg.stoppoint in this case. llvm-svn: 59657	2008-11-19 18:56:50 +00:00
Devang Patel	25662f3e4a	Remove unused variables. llvm-svn: 59570	2008-11-19 00:22:02 +00:00
Devang Patel	ebd2363339	Fix typo. llvm-svn: 59569	2008-11-19 00:19:18 +00:00
Devang Patel	b5e867acff	Add new helper pass that strips all symbol names except debugging information. This pass makes it easier to test wheter debugging info. influences optimization passes or not. llvm-svn: 59552	2008-11-18 21:34:39 +00:00
Devang Patel	3b7a2be88e	Remove even more llvm.dbg variables. Remove all dead globals from llvm.metadata. Ignore linkonce linkage for selected llvm.dbg values. llvm-svn: 59547	2008-11-18 21:13:41 +00:00
Devang Patel	a13f1f38fa	Initialize MallocFunc and FreeFunc properly. llvm-svn: 59538	2008-11-18 18:43:07 +00:00
Bill Wendling	cf194e9a27	Cast to remove warning about comparing signed and unsigned. llvm-svn: 59518	2008-11-18 10:57:27 +00:00
Devang Patel	f1e9329209	Give SIToFPInst preference over UIToFPInst because it is faster on platforms that are widely used. llvm-svn: 59476	2008-11-18 00:40:02 +00:00
Devang Patel	180afd2c55	While handling floating point IVs lift restrictions on initial value and increment value. llvm-svn: 59471	2008-11-17 23:27:13 +00:00
Devang Patel	aa3d68d301	Handle floating point ivs during doInitialization(). llvm-svn: 59466	2008-11-17 21:32:02 +00:00
Devang Patel	b63c74730c	Let AnalyzeAlloca() remove debug intrinsics. llvm-svn: 59454	2008-11-17 18:37:53 +00:00
Torok Edwin	026259faeb	If SI->size() is 0, we are not allowed to dereference ->begin(). This fixed PR3078. llvm-svn: 59416	2008-11-16 17:21:25 +00:00
Chris Lattner	7917b43a28	eliminate some std::set's. llvm-svn: 59409	2008-11-16 07:17:51 +00:00
Chris Lattner	f8f6270f14	simplify loop llvm-svn: 59406	2008-11-16 06:35:18 +00:00
Chris Lattner	44152742a0	simplify a bunch more instcombines to use m_Specific etc. llvm-svn: 59403	2008-11-16 05:38:51 +00:00
Chris Lattner	d397fef50d	factor the code for simplifying (icmp)\|(icmp) into its own function. llvm-svn: 59402	2008-11-16 05:20:07 +00:00
Chris Lattner	909b969b18	do some computation with apints instead of ConstantInts. llvm-svn: 59401	2008-11-16 05:14:43 +00:00
Chris Lattner	feaea9bdf7	merge a check into a place where it is simpler. llvm-svn: 59400	2008-11-16 05:10:52 +00:00
Chris Lattner	269cbd5770	factor a whole bunch of code out into a helper function. llvm-svn: 59398	2008-11-16 05:06:21 +00:00
Chris Lattner	b37b6e7e96	simplify the conditions on two gigantic if's, decreasing indentation a bit. Next step is to factor out into their own helper functions. llvm-svn: 59397	2008-11-16 04:55:20 +00:00
Chris Lattner	f1be285134	simplify some instcombine matches by using m_Specific llvm-svn: 59395	2008-11-16 04:46:19 +00:00
Chris Lattner	fae5e33111	Use new m_SelectCst template to eliminate macros. llvm-svn: 59392	2008-11-16 04:33:38 +00:00
Chris Lattner	569d78cbb5	simplify code. llvm-svn: 59390	2008-11-16 04:26:55 +00:00
Chris Lattner	c3f3b059d0	Handle the case where there is no "not". It is possible it got folded into the select. llvm-svn: 59389	2008-11-16 04:25:26 +00:00
Chris Lattner	5f6d9a313b	factor a bunch of copy/paste code out into a helper function. Eliminate the cases checking for cond?0:-1, since that is already handled by commutative checking. llvm-svn: 59388	2008-11-16 04:24:12 +00:00
Chris Lattner	68d2da2a19	rearrange some code, no functionality change. llvm-svn: 59381	2008-11-16 03:56:24 +00:00
Chris Lattner	e02c7c7ad2	if we're going to use a macro, use it maximally. no functionality change. llvm-svn: 59380	2008-11-16 03:54:57 +00:00
Devang Patel	8ada1d5de5	Refactor code. Strip debug information before stripping symbol names. llvm-svn: 59328	2008-11-14 22:49:37 +00:00
Devang Patel	3dd51c5c62	Really remove all debug information. llvm-svn: 59208	2008-11-13 01:28:40 +00:00
Oscar Fuentes	1b504d5372	CMake: Remove removed source file. llvm-svn: 59098	2008-11-12 00:14:12 +00:00
Devang Patel	4f02a0b740	Remove llvm-svn: 59093	2008-11-11 23:58:15 +00:00
Devang Patel	bf0835706c	Undo previous check-in. llvm-svn: 59092	2008-11-11 23:57:33 +00:00
Oscar Fuentes	2353ef3e91	CMake: Updated list of source files for lib/Transforms/Utils. llvm-svn: 59077	2008-11-11 19:51:36 +00:00
Devang Patel	6096f26bd4	Add utility pass to remove dbg info. llvm-svn: 59068	2008-11-11 19:33:39 +00:00
Devang Patel	95b18126ee	Use actual function name in comments. llvm-svn: 59063	2008-11-11 19:16:41 +00:00
Cedric Venet	8cb2e28e43	Update CMakeLists.txt llvm-svn: 59039	2008-11-11 09:55:48 +00:00
Devang Patel	53b39b5467	Cleanup debug info. assocated with deleted instructions. llvm-svn: 59012	2008-11-11 00:54:10 +00:00
Devang Patel	dc6699e82f	Add utility routines to remove dead debug info. llvm-svn: 59011	2008-11-11 00:53:02 +00:00
Devang Patel	d0ce981372	If the sign of exit condition and split condition does not match then do not split loop index. llvm-svn: 58995	2008-11-10 19:48:34 +00:00
Bill Wendling	7ef7314d1a	Third time's a charm. The previous patches didn't match correctly. Also, we need to make sure that the conditional is the same before doing the transformation. llvm-svn: 58978	2008-11-10 06:59:06 +00:00
Mon P Wang	25f0106fd9	Added support for the following definition of shufflevector <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> llvm-svn: 58964	2008-11-10 04:46:22 +00:00
Bill Wendling	4fb13c051d	Correction for the last patch. Should match the conditional in the first part of the select match, not the select instruction itself. llvm-svn: 58947	2008-11-09 23:37:53 +00:00
Bill Wendling	1579287550	The method of doing the matching with a 'select' instruction was wrong. The original code was matching like this: if (match(A, m_Not(m_Value(B)))) B was already matched as a 'select' instruction. However, this isn't matching what we think it's matching. It would match B as a 'Value', so basically anything would match to it. In this case, a Constant matched. B was replaced with a constant representation. And then the wrong value would be used in the SelectInst::Create statement, causing a crash. After thinking on this for a moment, and after Nick L. told me how the pattern matching stuff was supposed to work, the solution was to match NOT an m_Value, but an m_Select. llvm-svn: 58946	2008-11-09 23:17:42 +00:00
Nuno Lopes	2e42927e7c	fix leakage of ValueNumbering llvm-svn: 58933	2008-11-09 12:45:23 +00:00
Bill Wendling	3f547be28f	If the LHS of the FCMP is coming from a UIToFP instruction, then we don't want to generate signed ICMP instructions to replace the FCMP. This would violate the following: define i1 @test1(i32 %val) { %1 = uitofp i32 %val to double %2 = fcmp ole double %1, 0.000000e+00 ret i1 %2 } would be transformed into: define i1 @test1(i32 %val) { %1 = icmp slt i33 %val, 1 ret i1 %1 } which is obviously wrong. This patch modifes InstCombiner::FoldFCmp_IntToFP_Cst to handle when the LHS comes from UIToFP. llvm-svn: 58929	2008-11-09 04:26:50 +00:00
Daniel Dunbar	2b9dce2669	Rework r58829, allowing removal of dbg info intrinsics during alloca promotion. - Eliminate uses after free and simplify tests. Devang: Please check that this is still doing what you intended. llvm-svn: 58887	2008-11-08 04:12:17 +00:00
Bill Wendling	b9656df4ac	BCUI + 1 doesn't work. Use next instead. llvm-svn: 58830	2008-11-07 01:59:41 +00:00
Devang Patel	b8e0d59ceb	Handle (delete) dbg intrinsics while promoting alloca. llvm-svn: 58826	2008-11-07 01:30:07 +00:00
Mon P Wang	5ca2ec65bd	Fixed scalarizing an extract subvector and prevent an infinite loop when simplify a vector. llvm-svn: 58820	2008-11-06 22:52:21 +00:00
Devang Patel	5a5ab730e0	InstructionNamer preserves everything. llvm-svn: 58787	2008-11-06 01:00:16 +00:00
Devang Patel	f0ef35738c	Do now allow InlineAlways pass to remove dead functions. llvm-svn: 58744	2008-11-05 01:39:16 +00:00
Devang Patel	7a848b0ee3	Check Attribute::NoInline. llvm-svn: 58742	2008-11-05 01:37:05 +00:00
Oscar Fuentes	076e048cf7	CMake: updated list of source files. llvm-svn: 58736	2008-11-05 00:11:22 +00:00
Dan Gohman	8cdea717a3	Add a new pass to simplify specific half_powr function calls. This is a specialized pass that it not likely to be generally useful. llvm-svn: 58732	2008-11-04 23:41:45 +00:00
Dale Johannesen	0a7b4f5800	Allow SROA of vectors. Removing this caused a huge performance regression in something we care about. This may not be final fix. llvm-svn: 58718	2008-11-04 20:54:03 +00:00
Devang Patel	f33f8a8606	Fix unused variable warnings. llvm-svn: 58651	2008-11-03 23:14:09 +00:00
Devang Patel	fe57d109b6	Ignore conditions that are outside the loop. llvm-svn: 58631	2008-11-03 19:38:07 +00:00
Andrew Lenharth	348f3fa6a7	add a period at the end of the comment, ignoring the fact that the comment would be hard pressed to be considered a sentence, but if it makes Bill happy... llvm-svn: 58630	2008-11-03 19:29:29 +00:00
Devang Patel	c1631db93b	Turn floating point IVs into integer IVs where possible. This allows SCEV users to effectively calculate trip count. LSR later on transforms back integer IVs to floating point IVs later on to avoid int-to-float casts inside the loop. llvm-svn: 58625	2008-11-03 18:32:19 +00:00
Andrew Lenharth	45b86322f2	Ensure that we are checking only calls to the function we are interested in specializing llvm-svn: 58615	2008-11-03 16:05:35 +00:00
Nick Lewycky	d73806a9cc	Replace explicit loop with utility function. llvm-svn: 58593	2008-11-03 03:49:14 +00:00
Nick Lewycky	3c6d34a7f0	Changes from Duncan's review: * merge two weak functions by making them both alias a third non-weak fn * don't reimplement CallSite::hasArgument * whitelist the safe linkage types llvm-svn: 58568	2008-11-02 16:46:26 +00:00
Duncan Sands	cede1e035c	Get this building on 64 bit machines (error: cast from ‘const llvm::PointerType*’ to ‘unsigned int’ loses precision). llvm-svn: 58561	2008-11-02 09:00:33 +00:00
Oscar Fuentes	0433be6feb	CMake: added a source file. llvm-svn: 58559	2008-11-02 06:01:39 +00:00
Nick Lewycky	d01d42e76c	Add a new MergeFunctions pass. It finds identical functions and merges them. This triggers only 60 times in llvm-test (look at .llvm.bc, not .linked.rbc) and so it probably wont be turned on by default. Also, may of those are likely to go away when PR2973 is fixed. llvm-svn: 58557	2008-11-02 05:52:50 +00:00
Nick Lewycky	8d8acf327b	Fix demanded bits analysis with srem by negative number. Based on a patch by Richard Osborne. llvm-svn: 58555	2008-11-02 02:41:50 +00:00
Dan Gohman	83eea0b17f	Fix this recently moved code to use the correct type. CI is now a ConstantInt, and SI is the original cast instruction. This fixes PR2996. llvm-svn: 58549	2008-11-02 00:17:33 +00:00
Daniel Dunbar	a1c4fcfc29	Fix warning. llvm-svn: 58486	2008-10-31 01:50:01 +00:00
Dan Gohman	13cbcf1c18	Canonicalize sext(i1) to i1?-1:0, and update various instcombine optimizations accordingly. llvm-svn: 58457	2008-10-30 20:40:10 +00:00
Daniel Dunbar	3933e66a89	Add InlineCost class for represent the estimated cost of inlining a function. - This explicitly models the costs for functions which should "always" or "never" be inlined. This fixes bugs where such costs were not previously respected. llvm-svn: 58450	2008-10-30 19:26:59 +00:00
Chris Lattner	0934c0f35b	Fix PR2967 by not deleting volatile load/stores that occur before unreachable. I don't really see this as being needed, but there is little harm from doing it. llvm-svn: 58385	2008-10-29 17:46:26 +00:00
Daniel Dunbar	e7fbf9f425	Factor shouldInline method out of Inliner. - No functionality change. llvm-svn: 58355	2008-10-29 01:02:02 +00:00
Daniel Dunbar	cc20455346	Assorted comment/naming fixes, 80-col violations, and reindentation. - No functionality change. llvm-svn: 58352	2008-10-28 23:24:26 +00:00
Dan Gohman	2c34c130bf	(A & sext(C)) \| (B & ~sext(C) -> C ? A : B llvm-svn: 58351	2008-10-28 22:38:57 +00:00
Torok Edwin	ca97b42ef7	export an ID for the instructionNamer, allowing analysis/transformation passes that need it to require it by ID. llvm-svn: 58238	2008-10-27 10:16:27 +00:00
Chris Lattner	59b5691388	Rewrite all the 'PromoteLocallyUsedAlloca[s]' logic. With the power of LargeBlockInfo, we can now dramatically simplify their implementation and speed them up at the same time. Now the code has time proportional to the number of uses of the alloca, not the size of the block. This also eliminates code that tried to batch up different allocas which are used in the same blocks, and eliminates the 'retry list' logic which was baroque and no unneccesary. In addition to being a speedup for crazy cases, this is also a nice cleanup: PromoteMemoryToRegister.cpp \| 270 +++++++++++++++----------------------------- 1 file changed, 96 insertions(+), 174 deletions(-) llvm-svn: 58229	2008-10-27 07:05:53 +00:00
Chris Lattner	f594ecc453	Add a new LargeBlockInfo helper, which is just a wrapper around a trivial dense map. Use this in RewriteSingleStoreAlloca to avoid aggressively rescanning blocks over and over again. This fixes PR2925, speeding up mem2reg on the testcase in that bug from 4.56s to 0.02s in a debug build on my machine. llvm-svn: 58227	2008-10-27 06:05:26 +00:00
Nick Lewycky	f6e4dca67e	Add value range analyzing of Add and Sub. Understand that mul %x, 1 = %x. llvm-svn: 58069	2008-10-24 04:00:26 +00:00
Daniel Dunbar	7f39e2d85a	Change createPass factory functions to return Pass instead of LoopPass*. - Although less precise, this means they can be used in clients without RTTI (who would otherwise need to include LoopPass.h, which eventually includes things using dynamic_cast). This was the simplest solution that presented itself, but I am happy to use a better one if available. llvm-svn: 58010	2008-10-22 23:32:42 +00:00
Dan Gohman	72e66eedb8	Use Function::getEntryBlock() instead of Function::front(), for clarity. llvm-svn: 57870	2008-10-21 03:10:28 +00:00
Dan Gohman	fa29b67aee	Fix a bug that prevented llvm-extract -delete from working. llvm-svn: 57864	2008-10-21 01:08:07 +00:00
Dan Gohman	215742a966	Use 0 instead of false to return a null pointer. llvm-svn: 57660	2008-10-17 00:56:52 +00:00
Dan Gohman	bc0278400c	Teach instcombine's visitLoad to scan back several instructions to find opportunities for store-to-load forwarding or load CSE, in the same way that visitStore scans back to do DSE. Also, define a new helper function for testing whether the addresses of two memory accesses are known to have the same value, and use it in both visitStore and visitLoad. These two changes allow instcombine to eliminate loads in code produced by front-ends that frequently emit obviously redundant addressing for memory references. llvm-svn: 57608	2008-10-15 23:19:35 +00:00
Evan Cheng	d885f6e139	Combine (fcmp cc0 x, y) \| (fcmp cc1 x, y) into a single fcmp when possible. llvm-svn: 57515	2008-10-14 18:44:08 +00:00
Evan Cheng	ce70752b11	- Somehow I forgot about one / une. - Renumber fcmp predicates to match their icmp counterparts. - Try swapping operands to expose more optimization opportunities. llvm-svn: 57513	2008-10-14 18:13:38 +00:00
Evan Cheng	67786cce66	Optimize anding of two fcmp into a single fcmp if the operands are the same. e.g. uno && ueq -> ueq ord && olt -> olt ord && ueq -> oeq llvm-svn: 57507	2008-10-14 17:15:11 +00:00
Matthijs Kooijman	f7d3cb5435	Make InstructionCombining::getBitCastOperand() recognize GEP instructions and constant expression with all zero indices as being the same as a bitcast. llvm-svn: 57442	2008-10-13 15:17:01 +00:00
Chris Lattner	da435910e8	Fix PR2697 by rewriting the '(X / pos) op neg' logic. This also changes a couple other cases for clarity, but shouldn't affect correctness. Patch by Eli Friedman! llvm-svn: 57387	2008-10-11 22:55:00 +00:00
Devang Patel	647a1e532b	Check loop exit predicate properly while eliminating one iteration loop. This patch fixes PR 2869 llvm-svn: 57369	2008-10-10 22:02:57 +00:00
Nuno Lopes	e3127f3f80	fix memleak by cleaning the global sets on pass exit llvm-svn: 57353	2008-10-10 16:25:50 +00:00
Dale Johannesen	4f0bd68cfe	Add a "loses information" return value to APFloat::convert and APFloat::convertToInteger. Restore return value to IEEE754. Adjust all users accordingly. llvm-svn: 57329	2008-10-09 23:00:39 +00:00
Nick Lewycky	03c5fa18f1	Don't drop alignment on globals when cloning. llvm-svn: 57320	2008-10-09 06:27:14 +00:00
Nuno Lopes	06c67f88d7	dont specialize weak functions and the like llvm-svn: 57305	2008-10-08 18:45:59 +00:00
Duncan Sands	26ff6f9c54	Add <cstdio> include where needed by gcc-4.4. Patch by Samuel Tardieu. llvm-svn: 57291	2008-10-08 07:23:46 +00:00
Chris Lattner	42d5785dbd	Add parentheses to avoid warnings in GCC 4.4.0, patch by Samuel Tardieu! llvm-svn: 57288	2008-10-08 06:42:28 +00:00
Andrew Lenharth	5aa1cc4065	Correctly set attributes when removing args during cloning. Fixes PR2765 llvm-svn: 57254	2008-10-07 18:08:38 +00:00
Devang Patel	40aafce00d	Fix typo, fix PR 2865. llvm-svn: 57221	2008-10-06 23:22:54 +00:00
Matthijs Kooijman	cbe5e16eb5	Allow scalarrepl to treat an all-zero GEP just as bitcast. This includes not marking a GEP involving a vector as unsafe, but only when it has all zero indices. This allows scalarrepl to work in a few more cases. llvm-svn: 57177	2008-10-06 16:23:31 +00:00
Chris Lattner	917a6c1343	rewrite bswap matching to be more general, allowing arbitrary shifting and masking inside a bswap expr. This allows it to handle the cases from PR2842, which involve the intermediate 'or' expressions being shifted, not just the input value. llvm-svn: 57095	2008-10-05 02:13:19 +00:00
Chris Lattner	ca91f265c4	fix a bug where the bswap matcher could match a case involving ashr. It should only apply to lshr. llvm-svn: 57089	2008-10-05 00:50:57 +00:00
Duncan Sands	1d35e9aebe	Ignore loads from and stores to local memory (i.e. allocas) when deciding whether to mark a function readnone/readonly. Since the pass is currently run before SROA, this may be quite helpful. Requested by Chris on IRC. llvm-svn: 57050	2008-10-04 13:24:24 +00:00
Dan Gohman	e21903987f	Clean up some multiple-return-value code that is no longer applicable. llvm-svn: 57033	2008-10-03 22:21:24 +00:00
Devang Patel	f963403b58	Nick Lewycky's patch. While hosting instruction check PHI node. llvm-svn: 57025	2008-10-03 18:57:37 +00:00
Duncan Sands	3a813a5d3f	Teach internalize to preserve the callgraph. Why? Because it was there! llvm-svn: 56996	2008-10-03 07:36:09 +00:00
Owen Anderson	cb4f156b6b	SplitBlock should only attempt to update LoopInfo if it is actually being used. llvm-svn: 56994	2008-10-03 06:55:35 +00:00
Duncan Sands	d65a4daeea	Factorize code: remove variants of "strip off pointer bitcasts and GEP's", and centralize the logic in Value::getUnderlyingObject. The difference with stripPointerCasts is that stripPointerCasts only strips GEPs if all indices are zero, while getUnderlyingObject strips GEPs no matter what the indices are. llvm-svn: 56922	2008-10-01 15:25:41 +00:00
Nuno Lopes	96740aad86	revert the addition of Preverves(CallGraph), per Duncan's comments llvm-svn: 56917	2008-10-01 09:13:40 +00:00
Dan Gohman	67d90de2b0	Call ScalarEvolution's deleteValueFromRecords before deleting an instruction, not after. This fixes some uses of free'd memory. llvm-svn: 56908	2008-10-01 02:02:03 +00:00
Nuno Lopes	5093ab4c76	add preserversCFG() + preservers(CallGraph) llvm-svn: 56887	2008-09-30 22:04:30 +00:00
Nuno Lopes	2bd7b24f1a	add AU.setPreservesCFG() since this pass only adds and removes function attributes llvm-svn: 56868	2008-09-30 18:34:38 +00:00
Nick Lewycky	e8ced3ec19	Fix misoptimization of: xor i1 (icmp eq (X, C1), icmp s[lg]t (X, C2)) llvm-svn: 56834	2008-09-30 06:08:34 +00:00

... 5 6 7 8 9 ...

5144 Commits