llvm-project

Commit Graph

Author	SHA1	Message	Date
Nick Lewycky	40a34dd9a3	Enhance a couple places where we were doing constant folding of instructions, but not load instructions. Noticed by inspection. llvm-svn: 140966	2011-10-02 09:12:55 +00:00
Andrew Trick	f7656015fc	Inlining and unrolling heuristics should be aware of free truncs. We want heuristics to be based on accurate data, but more importantly we don't want llvm to behave randomly. A benign trunc inserted by an upstream pass should not cause a wild swings in optimization level. See PR11034. It's a general problem with threshold-based heuristics, but we can make it less bad. llvm-svn: 140919	2011-10-01 01:39:05 +00:00
Andrew Trick	caa500bf93	whitespace llvm-svn: 140916	2011-10-01 01:27:56 +00:00
Jim Grosbach	011dafba61	Don't modify constant in-place. llvm-svn: 140875	2011-09-30 19:58:46 +00:00
Jim Grosbach	24ff834671	float comparison to double 'zero' constant can just be a float 'zero.' InstCombine was incorrectly considering the conversion of the constant zero to be unsafe. We want to transform: define float @bar(float %x) nounwind readnone optsize ssp { %conv = fpext float %x to double %cmp = fcmp olt double %conv, 0.000000e+00 %conv1 = zext i1 %cmp to i32 %conv2 = sitofp i32 %conv1 to float ret float %conv2 } Into: define float @bar(float %x) nounwind readnone optsize ssp { %cmp = fcmp olt float %x, 0.000000e+00 ; <---- This %conv1 = zext i1 %cmp to i32 %conv2 = sitofp i32 %conv1 to float ret float %conv2 } rdar://10215914 llvm-svn: 140869	2011-09-30 18:45:50 +00:00
Jim Grosbach	129c52af18	Tidy up. Trailing whitespace. llvm-svn: 140865	2011-09-30 18:09:53 +00:00
Duncan Sands	5c05579f94	Inlining often produces landingpad instructions with repeated catch or repeated filter clauses. Teach instcombine a bunch of tricks for simplifying landingpad clauses. Currently the code only recognizes the GNU C++ and Ada personality functions, but that doesn't stop it doing a bunch of "generic" transforms which are hopefully fine for any real-world personality function. If these "generic" transforms turn out not to be generic, they can always be conditioned on the personality function. Probably someone should add the ObjC++ personality function. I didn't as I don't know anything about it. llvm-svn: 140852	2011-09-30 13:12:16 +00:00
Nick Lewycky	a3e7ffdae8	Fold two identical set lookups into one. No functionality change. llvm-svn: 140821	2011-09-29 23:40:12 +00:00
Dan Gohman	4ac148dcbc	When eliminating unnecessary retain+autorelease on return values, handle the case where the retain is in a different basic block. rdar://10210274. llvm-svn: 140815	2011-09-29 22:27:34 +00:00
Dan Gohman	2053a5dd64	Don't eliminate objc_retainBlock calls on stack objects if the objc_retainBlock call is potentially responsible for copying the block to the heap to extend its lifetime. rdar://10209613. llvm-svn: 140814	2011-09-29 22:25:23 +00:00
Eli Friedman	95031ed837	Clean up uses of switch instructions so they are not dependent on the operand ordering. Patch by Stepan Dyatkovskiy. llvm-svn: 140803	2011-09-29 20:21:17 +00:00
Andrew Trick	168dfffdb8	typo + pasto llvm-svn: 140769	2011-09-29 01:53:08 +00:00
Andrew Trick	bc6de90a5f	LSR: rewrite inner loops only. Rewriting the entire loop nest now requires -enable-lsr-nested. See PR11035 for some performance data. A few unit tests specifically test nested LSR, and are now under a flag. llvm-svn: 140762	2011-09-29 01:33:38 +00:00
Andrew Trick	e0e30532a5	indvars should hoist [sz]ext because licm is not rerun. llvm-svn: 140670	2011-09-28 01:35:36 +00:00
Benjamin Kramer	547b6c5ecd	Stop emitting instructions with the name "tmp" they eat up memory and have to be uniqued, without any benefit. If someone prefers %tmp42 to %42, run instnamer. llvm-svn: 140634	2011-09-27 20:39:19 +00:00
Bill Wendling	90f90da156	Split the landing pad basic block with the correct function. Also merge the split landingpad instructions into a PHI node. PR11016 llvm-svn: 140592	2011-09-27 00:59:31 +00:00
Andrew Trick	581243919d	Disable LSR retry by default. Disabling aggressive LSR saves compilation time, and with the new indvars behavior usually improves performance. llvm-svn: 140590	2011-09-27 00:44:14 +00:00
Andrew Trick	8868faec63	LSR, one of the new Cost::isLoser() checks did not get merged in the previous checkin. llvm-svn: 140583	2011-09-26 23:35:25 +00:00
Andrew Trick	784729d408	LSR cost metric minor fix and verification. The minor bug heuristic was noticed by inspection. I added the isLoser/isValid helpers because they will become more important with subsequent checkins. llvm-svn: 140580	2011-09-26 23:11:04 +00:00
Andrew Trick	8b2fe2f744	LSR minor bug fix in RateRegister. No test case. Noticed by inspection and I doubt it ever affects the outcome of the overall heuristic, let alone final codegen. llvm-svn: 140431	2011-09-23 23:05:19 +00:00
Eli Friedman	f9b785f185	PR10987: add a missed safety check to isSafePHIToSpeculate in scalarrepl. llvm-svn: 140327	2011-09-22 18:56:30 +00:00
Eli Friedman	1815b688cc	Make sure IPSCCP never marks a tracked call as overdefined in SCCPSolver::ResolvedUndefsIn. If we do, we can end up in a situation where a function is resolved to return a constant, but the caller is marked overdefined, which confuses the code later. <rdar://problem/9956541> (again). llvm-svn: 140210	2011-09-20 23:28:51 +00:00
Bill Wendling	a6e1c51ed7	Relax this condition. Some passes require breaking critical edges before they're called. Don't segfault because of that. llvm-svn: 140196	2011-09-20 22:28:17 +00:00
Bill Wendling	04289fcad8	Place the check for an exit landing pad where it will be run on both code paths through the if-then-else. llvm-svn: 140195	2011-09-20 22:27:16 +00:00
Bill Wendling	0058520770	Omit extracting a loop if one of the exits is a landing pad. The landing pad must accompany the invoke when it's extracted. However, if it does, then the loop isn't properly extracted. I.e., the resulting extraction has a loop in it. The extracted function is then extracted, etc. resulting in an infinite loop. llvm-svn: 140193	2011-09-20 22:23:09 +00:00
Bill Wendling	3d48f59231	Check the terminator, not the basic block. llvm-svn: 140176	2011-09-20 20:20:50 +00:00
Bill Wendling	c1da6ea344	When extracting a basic block that ends in an 'invoke' instruction, we need to extract its associated landing pad block as well. However, that landing pad block may have more than one predecessor. So split the landing pad block so that individual landing pads have only one predecessor. This type of transformation may produce a false positive with bugpoint. llvm-svn: 140173	2011-09-20 19:10:24 +00:00
Bill Wendling	fc1176e061	Use ArrayRef instead of an explicit 'const std::vector &'. llvm-svn: 140172	2011-09-20 19:05:04 +00:00
Devang Patel	7d06f5cdd4	If simple ownership works then friendship is not required. llvm-svn: 140169	2011-09-20 18:48:56 +00:00
Bill Wendling	1bfe55a378	Use ArrayRef instead of 'const std::vector' to pass around the list of basic blocks to extract. llvm-svn: 140168	2011-09-20 18:42:07 +00:00
Devang Patel	add1f17575	Update GCOVLines to provide interfaces to write line table and calculate complete length. llvm-svn: 140167	2011-09-20 18:35:00 +00:00
Bill Wendling	9a2ba72c49	Fix comments. llvm-svn: 140164	2011-09-20 18:24:46 +00:00
Devang Patel	1a155a8200	Update comment. llvm-svn: 140156	2011-09-20 18:05:45 +00:00
Devang Patel	9cb1fc034b	Use StringRef instead of std::string. llvm-svn: 140154	2011-09-20 17:55:19 +00:00
Devang Patel	972df96ab1	Eliminate unnecessary copy of FileName from GCOVLines. GCOVLines is always accessed through a StringMap where the key is FileName. llvm-svn: 140151	2011-09-20 17:43:14 +00:00
Devang Patel	b011105d6c	There is no need to write a local utility routine to find subprogram info if the utility routine is already available in DebugInfo. llvm-svn: 140145	2011-09-20 15:57:19 +00:00
Bill Wendling	7cdaa3a1a8	Revert r140083 and r140084 until buildbots can be fixed. llvm-svn: 140094	2011-09-19 23:30:41 +00:00
Bill Wendling	d3c9d971e6	If we are extracting a basic block that ends in an invoke call, we must also extract the landing pad block. Otherwise, there will be a situation where the invoke's unwind edge lands on a non-landing pad. We also forbid the user from extracting the landing pad block by itself. Again, this is not a valid transformation. llvm-svn: 140083	2011-09-19 23:00:52 +00:00
Eli Friedman	61d7c8a065	Fix an infinite loop where a transform in InstCombiner::visitAnd claims a construct is changed when it is not. (See included testcase.) Patch by Xiaoyi Guo. llvm-svn: 140072	2011-09-19 21:58:15 +00:00
Andrew Trick	7251e41b16	[indvars] Fix PR10946: SCEV cannot handle Vector IVs. llvm-svn: 140026	2011-09-19 17:54:39 +00:00
Andrew Trick	74111ee07f	Reapply r139759. Disable IV rewriting by default. See PR10916. llvm-svn: 139842	2011-09-15 20:58:37 +00:00
Eli Friedman	888bea0b95	Make demanded-elt simplification for shufflevector slightly stronger. Spotted by inspection. llvm-svn: 139768	2011-09-15 01:14:29 +00:00
Dan Gohman	fca43c21c3	Don't mark objc_retainBlock as nounwind. It calls user copy constructors which could theoretically throw. llvm-svn: 139710	2011-09-14 18:33:34 +00:00
Dan Gohman	d4b5e3a4d9	objc_retainBlock is not NoModRef because it can update forwarding pointers in memory relevant to the optimizer. rdar://10050579. llvm-svn: 139708	2011-09-14 18:13:00 +00:00
Andrew Trick	f9f68b816b	[indvars] Revert r139579 until 401.bzip -arch i386 miscompilation is fixed. PR10920. llvm-svn: 139583	2011-09-13 05:23:49 +00:00
Andrew Trick	061d811c51	Disable IV rewriting by default. See PR10916. llvm-svn: 139579	2011-09-13 03:23:21 +00:00
Andrew Trick	3de5b8e4c1	[indvars] Fix bugs in floating point IV range checks noticed by inspection. llvm-svn: 139574	2011-09-13 01:59:32 +00:00
Eli Friedman	72a93e5e9b	Add comment to clarify the behavior of a helper in DSE. llvm-svn: 139571	2011-09-13 01:28:59 +00:00
Eli Friedman	a93ab13e0b	Correct grammar. llvm-svn: 139565	2011-09-13 00:44:16 +00:00
Eli Friedman	7c5dc122a0	Change a bunch of isVolatile() checks to check for atomic load/store as well. No tests; these changes aren't really interesting in the sense that the logic is the same for volatile and atomic. I believe this completes all of the changes necessary for the optimizer to handle loads and stores correctly. I'm going to try and come up with some additional testing, though. llvm-svn: 139533	2011-09-12 20:23:13 +00:00
Andrew Trick	183013d8d4	Rename -disable-iv-rewrite to -enable-iv-rewrite=false in preparation for default change. llvm-svn: 139517	2011-09-12 18:28:44 +00:00
Andrew Trick	c7868bf064	[disable-iv-rewrite] Allow WidenIV to handle NSW/NUW operations better. Don't immediately give up when an add operation can't be trivially sign/zero-extended within a loop. If it has NSW/NUW flags, generate a new expression with sign extended (non-recurrent) operand. As before, if SCEV says that all sign extends are loop invariant, then we can widen the operation. llvm-svn: 139453	2011-09-10 01:24:17 +00:00
Andrew Trick	465f42ff67	Comment formatting. llvm-svn: 139375	2011-09-09 17:35:10 +00:00
Andrew Trick	1eee7f1242	Add -verify-indvars for imperfect SCEV trip count verification after indvars. llvm-svn: 139169	2011-09-06 20:20:38 +00:00
Devang Patel	c10e52a0c4	Use IRBuilder. llvm-svn: 139156	2011-09-06 18:49:53 +00:00
Owen Anderson	58704ee442	Try again at r138809 (make DSE more aggressive in removing dead stores at the end of a function), now with less deleting stores before memcpy's. llvm-svn: 139150	2011-09-06 18:14:09 +00:00
Duncan Sands	a098436b32	Split the init.trampoline intrinsic, which currently combines GCC's init.trampoline and adjust.trampoline intrinsics, into two intrinsics like in GCC. While having one combined intrinsic is tempting, it is not natural because typically the trampoline initialization needs to be done in one function, and the result of adjust trampoline is needed in a different (nested) function. To get around this llvm-gcc hacks the nested function lowering code to insert an additional parent variable holding the adjust.trampoline result that can be accessed from the child function. Dragonegg doesn't have the luxury of tweaking GCC code, so it stored the result of adjust.trampoline in the memory GCC set aside for the trampoline itself (this is always available in the child function), and set up some new memory (using an alloca) to hold the trampoline. Unfortunately this breaks Go which allocates trampoline memory on the heap and wants to use it even after the parent has exited (!). Rather than doing even more hacks to get Go working, it seemed best to just use two intrinsics like in GCC. Patch mostly by Sanjoy Das. llvm-svn: 139140	2011-09-06 13:37:06 +00:00
Duncan Sands	29192d042e	Delete trivial landing pads that just continue unwinding the caught exception. llvm-svn: 139117	2011-09-05 12:57:57 +00:00
Bill Wendling	321fb37773	Use Duncan's patch to delete the instructions in reverse order (minus the landingpad and terminator). llvm-svn: 139090	2011-09-04 09:43:36 +00:00
Bill Wendling	a336e70573	Update comments to reflect reality. llvm-svn: 139023	2011-09-02 18:43:33 +00:00
Andrew Trick	31b941a60d	Enable SCEV-based unrolling by default. This changes loop unrolling to use the same mechanism for trip count computation as indvars. This is a stronger check that tends to unroll more loops. A very common side-effect is that many single iteration loops will be removed sooner. The real goal was simply to remove dependence on canonical IVs. x86 is break even. ARM performance changes to expect (+ is good): External/SPEC/CFP2000/183.equake/183.equake +13% SingleSource/Benchmarks/Dhrystone/fldry +21% MultiSource/Applications/spiff/spiff +3% SingleSource/Benchmarks/Stanford/Puzzle -14% The Puzzle regression is actually an improvement in loop optimization that defeats GVN: rdar://problem/10065079. llvm-svn: 139009	2011-09-02 17:26:28 +00:00
Jakub Staszak	7470fb01d0	Compare type size instead of type _store_ size to make sure that BitCastInst will be valid. This fixes PR10820. llvm-svn: 139005	2011-09-02 14:57:37 +00:00
Bill Wendling	a3ba6d3b80	Reduce indentation. No functionality change. llvm-svn: 138968	2011-09-01 21:29:49 +00:00
Bill Wendling	bf8280ff27	Change worklist driven deletion to be an iterative process. Duncan noticed this! llvm-svn: 138967	2011-09-01 21:28:33 +00:00
Eli Friedman	71f5c2f158	Fix an issue with the IR sink pass found by inspection. (I'm not sure anyone is actually using this, but might as well fix it since I found the issue.) llvm-svn: 138965	2011-09-01 21:21:24 +00:00
Bill Wendling	a617c32745	Resubmit with fix. Properly remove the instructions except for landingpad, which should be removed only when its invokes are. llvm-svn: 138932	2011-09-01 01:28:11 +00:00
Bill Wendling	9f7cf20e60	Submitted this too early. llvm-svn: 138931	2011-09-01 01:18:33 +00:00
Bill Wendling	2d1f11f743	Don't DCE the landingpad instruction. The landingpad instruction can be removed only when its invokes are removed. llvm-svn: 138930	2011-09-01 01:16:58 +00:00
Bill Wendling	770d0f0700	Make sure we aren't deleting the landingpad instruction. The landingpad instruction is required in the landing pad block. Because we're not deleting terminating instructions, the invoke may still jump to here (see Transforms/SCCP/2004-11-16-DeadInvoke.ll). Remove all uses of the landingpad instruction, but keep it around until code-gen can remove the basic block. llvm-svn: 138890	2011-08-31 20:55:20 +00:00
Rafael Espindola	a45c20b049	Remove the old tail duplication pass. It is not used and is unable to update ssa, so it has to be run really early in the pipeline. Any replacement should probably use the SSAUpdater. llvm-svn: 138841	2011-08-30 23:03:45 +00:00
Owen Anderson	e316e5b2ad	Speculatively revert r138809 in an attempt to fix DragonEgg. llvm-svn: 138829	2011-08-30 21:11:06 +00:00
Owen Anderson	d708ec4c6a	When walking backwards to eliminate final stores to allocas at the end of a function, encountering an unrelated store should not cause us to give up like encountering a load does. llvm-svn: 138809	2011-08-30 18:51:55 +00:00
Nadav Rotem	5fc81ffbac	Fixes following the CR by Chris and Duncan: Optimize chained bitcasts of the form A->B->A. Undo r138722 and change isEliminableCastPair to allow this case. llvm-svn: 138756	2011-08-29 19:58:36 +00:00
Nadav Rotem	52600ee8c3	Bitcasts are transitive. Bitcast-Bitcast-X becomes Bitcast-X. llvm-svn: 138722	2011-08-28 11:51:08 +00:00
Bill Wendling	eed1e8905a	Don't sink landingpad instructions during ind-var simplification. llvm-svn: 138651	2011-08-26 20:40:15 +00:00
Benjamin Kramer	0655b78ccc	Address review comments. - Reword comments. - Allow undefined behavior interfering with undefined behavior. - Add address space checks. llvm-svn: 138619	2011-08-26 02:25:55 +00:00
Benjamin Kramer	fb212a6309	SimplifyCFG: If we have a PHI node that can evaluate to NULL and do a load or store to the address returned by the PHI node then we can consider this incoming value as dead and remove the edge pointing there, unless there are instructions that can affect control flow executed in between. In theory this could be extended to other instructions, eg. division by zero, but it's likely that it will "miscompile" some code because people depend on div by zero not trapping. NULL pointer dereference usually leads to a crash so we should be on the safe side. This shrinks the size of a Release clang by 16k on x86_64. llvm-svn: 138618	2011-08-26 01:22:29 +00:00
Bill Wendling	3fb137f7ef	LSR wants to split the landing pad's critical edge. Let it do it, but use the proper function to do it. llvm-svn: 138550	2011-08-25 05:55:40 +00:00
Bill Wendling	07efd6f1e0	When inserting new instructions, use getFirstInsertionPt instead of getFirstNonPHI so that it will skip over the landingpad instructions as well. llvm-svn: 138537	2011-08-25 01:08:34 +00:00
Bill Wendling	86c5cbe613	Skip the landingpad instruction when determining the insertion point. llvm-svn: 138481	2011-08-24 21:06:46 +00:00
Bill Wendling	0902a68f69	Use getFirstInsertionPt instead of getFirstNonPHI so that it skips to the proper insertion place. llvm-svn: 138473	2011-08-24 20:28:43 +00:00
Rafael Espindola	d3e65e702f	Fix a crashing bug in SplitBlock when it is called on a block with no dominator information even though dominators were previously computed. Patch by Nick Sumner. llvm-svn: 138449	2011-08-24 18:07:01 +00:00
Dan Gohman	4b8e8ce37f	Add a comment. llvm-svn: 138243	2011-08-22 17:29:37 +00:00
Dan Gohman	56e1cef705	Constant pointers to objects don't need reference counting. llvm-svn: 138242	2011-08-22 17:29:11 +00:00
Bill Wendling	38d813087e	If we're splitting the landing pad block and assigning it only one predecessor, then don't split it a second time, since that block will be dead. llvm-svn: 138153	2011-08-19 23:46:30 +00:00
Bill Wendling	26e19288be	The landingpad instruction isn't dead simply because it's value isn't used. llvm-svn: 138102	2011-08-19 21:52:06 +00:00
Benjamin Kramer	4938edb02c	Make a bunch of symbols private. llvm-svn: 138025	2011-08-19 01:42:18 +00:00
Benjamin Kramer	5a656883b1	C API functions must be able to see their extern "C" definitions, or it will be impossible to call them from C. llvm-svn: 138022	2011-08-19 01:36:54 +00:00
Dan Gohman	b38940135b	Track a retain+release nesting level independently of the known-incremented level, because the two concepts can be used to prove the saftey of a retain+release removal in different ways. llvm-svn: 138016	2011-08-19 00:26:36 +00:00
Bill Wendling	c61f7659ba	Intelligently split the landing pad block. We have to be careful when splitting the landing pad block, because the landingpad instruction is required to remain as the first non-PHI of an invoke's unwind edge. To retain this, we split the block into two blocks, moving the predecessors within the loop to one block and the remaining predecessors to the other. The landingpad instruction is cloned into the new blocks. llvm-svn: 138015	2011-08-19 00:09:22 +00:00
Bill Wendling	ca7d309623	Add SplitLandingPadPredecessors(). SplitLandingPadPredecessors is similar to SplitBlockPredecessors in that it splits the current block and attaches a set of predecessors to the new basic block. However, it differs from SplitBlockPredecessors in that it's specifically designed to handle landing pad blocks. Two new basic blocks are created: one that is has the vector of predecessors as its predecessors and one that has the remaining predecessors as its predecessors. Those two new blocks then receive a cloned copy of the landingpad instruction from the original block. The landingpad instructions are joined in a PHI, etc. Like SplitBlockPredecessors, it updates the LLVM IR, AliasAnalysis, DominatorTree, DominanceFrontier, LoopInfo, and LCCSA analyses. llvm-svn: 138014	2011-08-19 00:05:40 +00:00
Bill Wendling	2b31c45e8e	Use 'getFirstInsertionPt' when trying to insert new instructions during LICM. llvm-svn: 138008	2011-08-18 23:42:36 +00:00
Dan Gohman	c57b58cc40	Make it clear that this code is iterating in reverse order through the array. llvm-svn: 137985	2011-08-18 21:27:42 +00:00
Bill Wendling	b15d6eb93b	Revert r137871. The loop simplify pass should require all exits from a loop that aren't from an indirect branch need to be dominated by the loop header. llvm-svn: 137981	2011-08-18 21:10:01 +00:00
Bill Wendling	b267e2a7ec	Split out the updating of PHI nodes after splitting the BB into a separate function. llvm-svn: 137979	2011-08-18 20:51:04 +00:00
Bill Wendling	ec3823dcb7	Use this fantzy ArrayRef thing to pass in the list of predecessors. llvm-svn: 137978	2011-08-18 20:39:32 +00:00
Nick Lewycky	74acf9f501	The edge from DISubprogram to DICompileUnit has been removed in recent versions of debug info. llvm-svn: 137972	2011-08-18 19:07:42 +00:00
Bill Wendling	6029135af9	Use static instead of anonymous namespace. llvm-svn: 137959	2011-08-18 17:57:57 +00:00
Bill Wendling	0a693f47ee	Split out the analysis updating code into a helper function. No intended functionality change. llvm-svn: 137926	2011-08-18 05:25:23 +00:00
Devang Patel	53771ba07c	Dramatically speedup codegen prepare by a) avoiding use of dominator tree and b) doing a separate pass over dbg.value instructions. llvm-svn: 137908	2011-08-18 00:50:51 +00:00
Devang Patel	2b21d86cfe	Do not use DebugInfoFinder. Extract debug info directly from llvm.dbg.cu named mdnode. llvm-svn: 137890	2011-08-17 22:49:38 +00:00
Eli Friedman	9a468153e1	Atomic load/store handling for the passes using memdep (GVN, DSE, memcpyopt). llvm-svn: 137888	2011-08-17 22:22:24 +00:00
Bill Wendling	8bbcbedeaf	Disable PRE for landing pads. PRE needs the landing pads to have their critical edges split. Doing this for a landing pad is non-trivial. Abandon the attempt to perform PRE when we come across a landing pad. (Reviewed by Owen!) llvm-svn: 137876	2011-08-17 21:32:02 +00:00
Bill Wendling	79a6873d9c	Increment the insertion iterator to beyond the landingpad instruction. llvm-svn: 137872	2011-08-17 21:21:31 +00:00
Bill Wendling	39257d6b5c	Don't optimize the landing pad exit block. One way to exit the loop is through an unwind edge. However, that may involve splitting the critical edge of the landing pad, which is non-trivial. Prevent the transformation from rewriting the landing pad exit loop block. llvm-svn: 137871	2011-08-17 21:20:43 +00:00
Bill Wendling	2dfbcc4506	Assert that we aren't trying to split the critical edge of a landing pad. Doing so requires more care than this generic algorithm should handle. llvm-svn: 137866	2011-08-17 21:04:05 +00:00
Bill Wendling	a9ee09f4be	Revert r137655. There is some question about whether the 'landingpad' instruction should be marked as potentially reading and/or writing memory. llvm-svn: 137863	2011-08-17 20:36:44 +00:00
Eli Friedman	d7749be2d7	Silly mistake from r137777; restore significant isStructTy() checks. While here, be a bit more defensive with unknown instructions. Fixes PR10687. llvm-svn: 137836	2011-08-17 18:10:43 +00:00
Eli Friedman	0793eb4c46	A bunch of misc fixes to SCCPSolver::ResolvedUndefsIn, including a fix to stop making random bad assumptions about instructions which are not explicitly listed. Includes fix for rdar://9956541, a version of "undef ^ undef should return 0 because it's easier than arguing with users". llvm-svn: 137777	2011-08-16 22:06:31 +00:00
Eli Friedman	56f2f21254	Minor bug in SCCP found by inspection. (I don't think it's possible to hit this with a normal pass pipeline, but fixing for completeness.) llvm-svn: 137755	2011-08-16 21:12:35 +00:00
Bill Wendling	8ddfc09e7a	Use the getFirstInsertionPt() method instead of getFirstNonPHI + an 'isa<>' check for a LandingPadInst. llvm-svn: 137745	2011-08-16 20:45:24 +00:00
Bill Wendling	55d875fa1c	I think there was some confusion about what I meant. :-) Replacing the comment. llvm-svn: 137743	2011-08-16 20:41:17 +00:00
David Chisnall	719a72f34c	Add a mechanism for optimisation plugins to register passes that all front ends can use without needing to be aware of the plugin (or the plugin be aware of the front end). Before 3.0, I'd like to add a mechanism for automatically loading a set of plugins from a config file. API suggestions welcome... llvm-svn: 137717	2011-08-16 13:58:41 +00:00
Bill Wendling	be33e8d58d	A few places where we want to skip the landingpad instruction for insertion. llvm-svn: 137712	2011-08-16 04:52:55 +00:00
Eli Friedman	a917d4f9b4	Revert a bit of r137667; the logic in question can safely handle atomic load/store. llvm-svn: 137702	2011-08-16 01:28:22 +00:00
Eli Friedman	bd39703456	After talking with Bill, it seems like the LandingPad handling here is likely to be wrong (or at least somewhat suspect). Leave a FIXME for Bill. llvm-svn: 137694	2011-08-16 00:41:37 +00:00
Eli Friedman	b8f30de527	Minor comment fixes. llvm-svn: 137693	2011-08-16 00:20:11 +00:00
Eli Friedman	0ffdf2ea0b	Update SimplifyCFG for atomic operations. This commit includes a mention of the landingpad instruction, but it's not changing the behavior around it. I think the current behavior is correct, though. Bill, can you double-check that? llvm-svn: 137691	2011-08-15 23:59:28 +00:00
Eli Friedman	01a67111d1	Add comments and test for atomic load/store and mem2reg. llvm-svn: 137690	2011-08-15 23:55:52 +00:00
Bill Wendling	5a18b7c7c7	In places where it's using "getFirstNonPHI", skip the landingpad instruction if necessary. llvm-svn: 137679	2011-08-15 23:19:54 +00:00
Bill Wendling	91d4e9edec	Don't sink the instruction to before a landingpad instruction. llvm-svn: 137672	2011-08-15 22:53:05 +00:00
Eli Friedman	211e348eaa	Update inter-procedural optimizations for atomic load/store. llvm-svn: 137667	2011-08-15 22:16:46 +00:00
Eli Friedman	8bc586e770	Update instcombine for atomic load/store. llvm-svn: 137664	2011-08-15 22:09:40 +00:00
Bill Wendling	e86965ee19	Duncan pointed out that the LandingPadInst might read memory. (It might also write to memory.) Marking it as such makes some checks for immobility go away. llvm-svn: 137655	2011-08-15 21:14:31 +00:00
Eli Friedman	4d05198d1f	Fix llvm::CloneModule to correctly clone globals. Patch per bug report by Simon Moll on llvmdev. llvm-svn: 137654	2011-08-15 21:05:06 +00:00
Eli Friedman	91386c7be4	Atomic load/store support in LICM. llvm-svn: 137648	2011-08-15 20:52:09 +00:00
Bill Wendling	d9fb470758	The "landingpad" instruction will never be "trivially" dead. llvm-svn: 137642	2011-08-15 20:10:51 +00:00
Bill Wendling	dd94d3426b	Don't try to sink the landingpad instruction. It's immobile. llvm-svn: 137629	2011-08-15 18:23:40 +00:00
Bill Wendling	88294cdbe0	Mark the SCC as "might unwind" if we run into a 'resume' instruction. llvm-svn: 137627	2011-08-15 18:22:00 +00:00
Bill Wendling	b9c0e0db53	Skip the insertion iterator past the landingpad instruction if there. llvm-svn: 137626	2011-08-15 18:21:07 +00:00
Bill Wendling	55421f0c4d	Add inlining for the new EH scheme. This builds off of the current scheme, but instead of llvm.eh.exception and llvm.eh.selector, it uses the landingpad instruction. And instead of llvm.eh.resume, it uses the resume instruction. Because of the invariants in the landing pad instruction, a lot of code that's currently needed to find the appropriate intrinsic calls for an invoke instruction won't be needed once we go to the new EH scheme. The "FIXME"s tell us what to remove after we switch. llvm-svn: 137576	2011-08-14 08:01:36 +00:00
Nick Lewycky	746e317953	This transform is not safe. Thanks to Eli for pointing that out! llvm-svn: 137575	2011-08-14 04:51:49 +00:00
Nick Lewycky	ae13df60a6	Don't attempt to add 'nsw' when intermediate instructions had no such guarantee. llvm-svn: 137572	2011-08-14 03:41:33 +00:00
Nick Lewycky	de49278c26	Teach instcombine to preserve the nsw bit by doing an after-the-fact analysis when combining add and sub instructions. Patch by Pranav Bhandarkar! llvm-svn: 137570	2011-08-14 01:45:19 +00:00
Bill Wendling	fae1475823	Initial commit of the 'landingpad' instruction. This implements the 'landingpad' instruction. It's used to indicate that a basic block is a landing pad. There are several restrictions on its use (see LangRef.html for more detail). These restrictions allow the exception handling code to gather the information it needs in a much more sane way. This patch has the definition, implementation, C interface, parsing, and bitcode support in it. llvm-svn: 137501	2011-08-12 20:24:12 +00:00
Chris Lattner	335d399a0e	switch to use the new api for structtypes. llvm-svn: 137480	2011-08-12 18:06:37 +00:00
Duncan Sands	a41634e307	Silence a bunch (but not all) "variable written but not read" warnings when building with assertions disabled. llvm-svn: 137460	2011-08-12 14:54:45 +00:00
Dan Gohman	10a18d55ce	Don't convert objc_autoreleaseReturnValue to objc_autorelease if the result is returned through a bitcast. llvm-svn: 137402	2011-08-12 00:36:31 +00:00
Dan Gohman	121302772d	Don't let arbitrary calls disrupt nested retain+release pairs if the retains and releases all use the same SSA pointer value. Also, don't let CFG hazards disrupt nested retain+release pair optimizations. llvm-svn: 137399	2011-08-12 00:26:31 +00:00
Dan Gohman	4767a1a117	Use an actual reverse-CFG reverse-postorder for the bottom-up traversal, rather than plain postorder, so that CFG constructs like single-exit loops are reliably visited in a sensible order. llvm-svn: 137398	2011-08-12 00:24:29 +00:00
Andrew Trick	2b6860f0a1	Allow loop unrolling to get known trip counts from ScalarEvolution. SCEV unrolling can unroll loops with arbitrary induction variables. It is a prerequisite for -disable-iv-rewrite performance. It is also easily handles loops of arbitrary structure including multiple exits and is generally more robust. This is under a temporary option to avoid affecting default behavior for the next couple of weeks. It is needed so that I can checkin unit tests for updateUnloop. llvm-svn: 137384	2011-08-11 23:36:16 +00:00
Dan Gohman	7e315fc37d	Fix typos in comments, and delete an unused function. llvm-svn: 137352	2011-08-11 21:06:32 +00:00
Devang Patel	bb23a4a9a5	Distinguish between two copies of one inlined variable. Take 2. llvm-svn: 137253	2011-08-10 21:50:54 +00:00
Andrew Trick	6dbb060778	Comments. Thanks for the spell check Nick! Also, my apologies for spoiling the autocomplete on SimplifyInstructions.cpp. I couldn't think of a better filename. llvm-svn: 137229	2011-08-10 18:07:05 +00:00
Andrew Trick	4d0040baf8	Invoke SimplifyIndVar when we partially unroll a loop. Fixes PR10534. llvm-svn: 137203	2011-08-10 04:29:49 +00:00
Andrew Trick	e629d008fb	Cleanup. Make ScalarEvolution an explicit argument of the SimplifyIndVar utility since it is required. llvm-svn: 137202	2011-08-10 04:22:26 +00:00
Andrew Trick	74664d5ec6	SimplifyIndVar: make foldIVUser iterative to fold a chain of operands. llvm-svn: 137199	2011-08-10 04:01:31 +00:00
Benjamin Kramer	0b0e47d6ad	Update CMake build. llvm-svn: 137198	2011-08-10 03:51:58 +00:00
Andrew Trick	3ec331eaf4	Added a SimplifyIndVar utility to simplify induction variable users based on ScalarEvolution without changing the induction variable phis. This utility is the main tool of IndVarSimplifyPass, but the pass also restructures induction variables in strange ways that are sensitive to pass ordering. This provides a way for other loop passes to simplify new uses of induction variables created during transformation. The utility may be used by any pass that preserves ScalarEvolution. Soon LoopUnroll will use it. The net effect in this checkin is to cleanup the IndVarSimplify pass by factoring out the SimplifyIndVar algorithm into a standalone utility. llvm-svn: 137197	2011-08-10 03:46:27 +00:00
Andrew Trick	78b40c3f3a	Cleanup. Added LoopBlocksDFS::perform for simple clients. llvm-svn: 137195	2011-08-10 01:59:05 +00:00
Andrew Trick	b72bbe2a92	Fix the LoopUnroller to handle nontrivial loops and partial unrolling. These are not individual bug fixes. I had to rewrite a good chunk of the unroller to make it sane. I think it was getting lucky on trivial completely unrolled loops with no early exits. I included some fairly simple unit tests for partial unrolling. I didn't do much stress testing, so it may not be perfect, but should be usable now. llvm-svn: 137190	2011-08-10 00:28:10 +00:00
Eli Friedman	59b66883ea	Representation of 'atomic load' and 'atomic store' in IR. llvm-svn: 137170	2011-08-09 23:02:53 +00:00
Rafael Espindola	07f6091527	Add a C interface to PassManagerBuilder. It is missing the addExtension functionality since in the C api a pass is created and added to a pass manager in a single call. llvm-svn: 137159	2011-08-09 22:17:34 +00:00
Andrew Trick	5e0ee1c7f2	LoopUnroll looks like it has some stale code. Remove it to prove my sanity and avoid further confusion. llvm-svn: 137106	2011-08-09 03:11:29 +00:00
Bill Wendling	55a09346ac	There is only one instance of this placeholder being created. Just use that instead of a vector. llvm-svn: 137099	2011-08-09 01:17:10 +00:00
Bill Wendling	def94edf69	Remove an instance where the 'unwind' instruction was created. The 'unwind' instruction was acting essentially as a placeholder, because it would be replaced at the end of this function by a branch to the "unwind handler". The 'unwind' instruction is going away, so use 'unreachable' instead, which serves the same purpose as a placeholder. llvm-svn: 137098	2011-08-09 01:09:21 +00:00
Andrew Trick	6d45a01b67	Made SCEV's UDiv expressions more canonical. When dividing a recurrence, the initial values low bits can sometimes be ignored. To take advantage of this, added FoldIVUser to IndVarSimplify to fold an IV operand into a udiv/lshr if the operator doesn't affect the result. -indvars -disable-iv-rewrite now transforms i = phi i4 i1 = i0 + 1 idx = i1 >> (2 or more) i4 = i + 4 into i = phi i4 idx = i0 >> ... i4 = i + 4 llvm-svn: 137013	2011-08-06 07:00:37 +00:00
Chandler Carruth	81b7e11c89	Temporarily revert r135528 which distinguishes between two copies of one inlined variable, based on the discussion in PR10542. This explodes the runtime of several passes down the pipeline due to a large number of "copies" remaining live across a large function. This only shows up with both debug and opt, but when it does it creates a many-minute compile when self-hosting LLVM+Clang. There are several other cases that show these types of regressions. All of this is tracked in PR10542, and progress is being made on fixing the issue. Once its addressed, the re-instated, but until then this restores the performance for self-hosting and other opt+debug builds. Devang, let me know if this causes any trouble, or impedes fixing it in any way, and thanks for working on this! llvm-svn: 136953	2011-08-05 00:51:31 +00:00
Devang Patel	c0174048a4	We need to map DebugLoc. It leads to Fuction * (through subprogram entry node) which should be appropriately mapped. llvm-svn: 136910	2011-08-04 20:02:18 +00:00
Evan Cheng	e4df6a2add	Fix an obvious type. Patch by Ivan Krasin. llvm-svn: 136900	2011-08-04 18:40:26 +00:00
Bill Wendling	2d3138c112	Remove the LowerSetJmp pass. It wasn't used effectively by any of the targets. This is some of my original LLVM code. wipes tear llvm-svn: 136821	2011-08-03 22:18:20 +00:00
Andrew Trick	bf69d03382	SCEV: Use AssertingVH to catch dangling BasicBlock* when passes forget to notify SCEV of a change. Add forgetLoop in a couple of those places. llvm-svn: 136797	2011-08-03 18:32:11 +00:00
Andrew Trick	9d8c2af257	whitespace llvm-svn: 136795	2011-08-03 18:28:21 +00:00
Nick Lewycky	d405b7e2ae	Small cleanups: - use SmallVectorImpl& for the function argument. - ignore the operands on the GEP, even if they aren't constant! Much as we pretend the malloc succeeds, we pretend that malloc + whatever-you-GEP'd-by is not null. It's magic! llvm-svn: 136757	2011-08-03 01:11:40 +00:00
Nick Lewycky	50f4966ceb	Fix logical error when detecting lifetime intrinsics. Don't replace a gep/bitcast with 'undef' because that will form a "free(undef)" which in turn means "unreachable". What we wanted was a no-op. Instead, analyze the whole tree and look for all the instructions we need to delete first, then delete them second, not relying on the use_list to stay consistent. llvm-svn: 136752	2011-08-03 00:43:35 +00:00
Nick Lewycky	e8ae02dfb9	Teach InstCombine that lifetime intrincs aren't a real user on the result of a malloc call. llvm-svn: 136732	2011-08-02 22:08:01 +00:00
Rafael Espindola	3ea478b7ac	Move methods in PassManagerBuilder offline. llvm-svn: 136727	2011-08-02 21:50:27 +00:00
Eli Friedman	366bccefad	Add new atomic instructions to SCCP. No functional change, but stops debug spam. llvm-svn: 136723	2011-08-02 21:35:16 +00:00
Nick Lewycky	99890a225f	Lifetime intrinsics on undef are dead. llvm-svn: 136722	2011-08-02 21:19:27 +00:00
Owen Anderson	bddf40e082	Revert r136503 and r136480 in an effort to fix non-determinism in the llvm-gcc buildbots on i386. Devang is looking into the root cause. llvm-svn: 136674	2011-08-02 02:23:42 +00:00
Bill Wendling	f891bf8b30	Add the 'resume' instruction for the new EH rewrite. This adds the 'resume' instruction class, IR parsing, and bitcode reading and writing. The 'resume' instruction resumes propagation of an existing (in-flight) exception whose unwinding was interrupted with a 'landingpad' instruction (to be added later). llvm-svn: 136589	2011-07-31 06:30:59 +00:00
Rafael Espindola	a3a44f3fc3	Add a small gep optimization I noticed was missing while reading some IL. llvm-svn: 136585	2011-07-31 04:43:41 +00:00
Bill Wendling	ad088e6724	Revert r136253, r136263, r136269, r136313, r136325, r136326, r136329, r136338, r136339, r136341, r136369, r136387, r136392, r136396, r136429, r136430, r136444, r136445, r136446, r136253 pending review. llvm-svn: 136556	2011-07-30 05:42:50 +00:00
Devang Patel	ce0ceebb1c	Clear DbgValues in the end. llvm-svn: 136503	2011-07-29 19:49:58 +00:00
Devang Patel	3e02522fee	Clean up debug info after reassociation. llvm-svn: 136480	2011-07-29 19:00:35 +00:00
Eli Friedman	adec587d5c	Misc optimizer+codegen work for 'cmpxchg' and 'atomicrmw'. They appear to be working on x86 (at least for trivial testcases); other architectures will need more work so that they actually emit the appropriate instructions for orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC, Mips, and Alpha backends need such changes.) llvm-svn: 136457	2011-07-29 03:05:32 +00:00
Eli Friedman	530341d748	Make sure to correctly clear the exact/nuw/nsw flags off of shifts when they are combined together. <rdar://problem/9859829> llvm-svn: 136435	2011-07-29 00:18:19 +00:00
Chandler Carruth	9d7feab3e0	Rewrite the CMake build to use explicit dependencies between libraries, specified in the same file that the library itself is created. This is more idiomatic for CMake builds, and also allows us to correctly specify dependencies that are missed due to bugs in the GenLibDeps perl script, or change from compiler to compiler. On Linux, this returns CMake to a place where it can relably rebuild several targets of LLVM. I have tried not to change the dependencies from the ones in the current auto-generated file. The only places I've really diverged are in places where I was seeing link failures, and added a dependency. The goal of this patch is not to start changing the dependencies, merely to move them into the correct location, and an explicit form that we can control and change when necessary. This also removes a serialization point in the build because we don't have to scan all the libraries before we begin building various tools. We no longer have a step of the build that regenerates a file inside the source tree. A few other associated cleanups fall out of this. This isn't really finished yet though. After talking to dgregor he urged switching to a single CMake macro to construct libraries with both sources and dependencies in the arguments. Migrating from the two macros to that style will be a follow-up patch. Also, llvm-config is still generated with GenLibDeps.pl, which means it still has slightly buggy dependencies. The internal CMake 'llvm-config-like' macro uses the correct explicitly specified dependencies however. A future patch will switch llvm-config generation (when using CMake) to be based on these deps as well. This may well break Windows. I'm getting a machine set up now to dig into any failures there. If anyone can chime in with problems they see or ideas of how to solve them for Windows, much appreciated. llvm-svn: 136433	2011-07-29 00:14:25 +00:00
Bill Wendling	9e5f0f8fce	Some minor cleanups. No functionalitical change. llvm-svn: 136341	2011-07-28 07:44:07 +00:00
Bill Wendling	fa28440f15	Leverage some of the code that John wrote to manage the landing pads. The new EH is more simple in many respects. Mainly, we don't have to worry about the "llvm.eh.exception" and "llvm.eh.selector" calls being in weird places. llvm-svn: 136339	2011-07-28 07:31:46 +00:00
Bill Wendling	51affc8258	Automatically merge the landingpad clauses when we come across a callee's landingpad. llvm-svn: 136329	2011-07-28 02:40:13 +00:00
Benjamin Kramer	e71b9c446d	Fix a use after free. An instruction can't be both an intrinsic call and a fence. llvm-svn: 136319	2011-07-28 01:20:19 +00:00
Bill Wendling	246eb96c8a	Initial stab at getting inlining working with the EH rewrite. This takes the new 'resume' instruction and turns it into a direct jump to the caller's landing pad code. The caller's landingpad instruction is merged with the landingpad instructions of the callee. This is a bit rough and makes some assumptions in how the code works. But it passes a simple test. llvm-svn: 136313	2011-07-28 00:38:23 +00:00
Bill Wendling	9c5b7ff807	Refuse to inline two functions which use different personality functions. llvm-svn: 136269	2011-07-27 21:44:28 +00:00
Bill Wendling	6c923bb8d9	Merge the contents from exception-handling-rewrite to the mainline. This adds the new instructions 'landingpad' and 'resume'. llvm-svn: 136253	2011-07-27 20:18:04 +00:00
Nick Lewycky	8ac9ecedfd	Teach the ConstantMerge pass about alignment. Fixes PR10514! llvm-svn: 136250	2011-07-27 19:47:34 +00:00
Eli Friedman	89b694b096	Misc mid-level changes for new 'fence' instruction. llvm-svn: 136205	2011-07-27 01:08:30 +00:00
Bill Wendling	3fe5d68563	Use the correct for for the version. It's little endian and my brain is obviously big endian. :-) PR10502 llvm-svn: 136111	2011-07-26 18:31:41 +00:00
Rafael Espindola	b84dc6bca8	Add LLVMAddAlwaysInlinerPass to the C API. llvm-svn: 136083	2011-07-26 15:23:23 +00:00
Rafael Espindola	be2fe29f9c	LLVM 3.0 is here, remove old do nothing method. llvm-svn: 136082	2011-07-26 15:17:32 +00:00
Nick Lewycky	15e2d90746	Finish adding support for lifetime intrinsics to SROA. Fixes PR10121! llvm-svn: 136008	2011-07-25 23:14:22 +00:00
Andrew Trick	990f771a9a	Add clarifying comments for the new arguments to UnrollLoop. llvm-svn: 135988	2011-07-25 22:17:47 +00:00
Nick Lewycky	77cb8e681f	Add missing space (this line is no longer pushing the 80-column limit). llvm-svn: 135973	2011-07-25 21:16:04 +00:00
Rafael Espindola	7281395c8c	Add LLVMAddLowerExpectIntrinsicPass to the C API. llvm-svn: 135966	2011-07-25 20:57:59 +00:00
Frits van Bommel	ede0dc6dda	Shorten some expressions by using ArrayRef::slice(). llvm-svn: 135910	2011-07-25 15:13:01 +00:00
Jay Foad	d1b7849d49	Convert GetElementPtrInst to use ArrayRef. llvm-svn: 135904	2011-07-25 09:48:08 +00:00
Andrew Trick	1cabe54fab	Move trip count discovery outside of the generic LoopUnroll helper. This removes its dependence on canonical induction variables. llvm-svn: 135829	2011-07-23 00:33:05 +00:00
Andrew Trick	279e7a6c83	whitespace llvm-svn: 135828	2011-07-23 00:29:16 +00:00
Dan Gohman	6320f52ff4	Move the last uses of RetainFunc etc. over to using getRetainCallee() etc. so that a declaration for objc_retain is created when needed if it doesn't already exist. rdar://9825114. llvm-svn: 135821	2011-07-22 22:29:21 +00:00
Jay Foad	17bab44308	Fix more MSVC warnings caused by a cases I missed when converting ConstantExpr::getGetElementPtr to use ArrayRef. llvm-svn: 135762	2011-07-22 08:52:50 +00:00
Jay Foad	040dd82f44	Convert IRBuilder::CreateGEP and IRBuilder::CreateInBoundsGEP to use ArrayRef. llvm-svn: 135761	2011-07-22 08:16:57 +00:00
Jay Foad	71f19ac6af	Fix an MSVC warning, caused by a case I missed when converting ConstantExpr::getGetElementPtr to use ArrayRef. llvm-svn: 135758	2011-07-22 07:54:01 +00:00
Dan Gohman	e106aee6f5	Fix MergeInVectorType to check for vector types with the same alloc size but different element types, so that it filters out the cases that CreateShuffleVectorCast doesn't handle. This fixes rdar://9786827. llvm-svn: 135721	2011-07-21 23:30:09 +00:00
Andrew Trick	cd3e8cb882	Cleanup: make std::pair usage slightly less indecipherable without actually naming variables! llvm-svn: 135684	2011-07-21 17:37:39 +00:00
Jay Foad	2f5fc8c67d	Make better use of ConstantExpr::getGetElementPtr's InBounds parameter. llvm-svn: 135676	2011-07-21 15:15:37 +00:00
Jay Foad	ed8db7d9df	Convert ConstantExpr::getGetElementPtr and ConstantExpr::getInBoundsGetElementPtr to use ArrayRef. llvm-svn: 135673	2011-07-21 14:31:17 +00:00
Chris Lattner	5cf753c95e	move tier out of an anonymous namespace, it doesn't make sense to for it to be an an anon namespace and be in a header. Eliminate some extraenous uses of tie. llvm-svn: 135669	2011-07-21 06:21:31 +00:00
Andrew Trick	bd243d0dfe	LSR, correct fix for rdar://9786536. Silly casting bug. llvm-svn: 135654	2011-07-21 01:45:54 +00:00
Andrew Trick	858e9f083d	LSR must sometimes sign-extend before generating double constants. rdar://9786536 llvm-svn: 135650	2011-07-21 01:05:01 +00:00
Andrew Trick	8acb434402	LSR crashes on an empty IVUsers list. rdar://9786536 llvm-svn: 135644	2011-07-21 00:40:04 +00:00
Eli Friedman	911e12f505	Clean up includes of llvm/Analysis/ConstantFolding.h so it's included where it's used and not included where it isn't. llvm-svn: 135628	2011-07-20 21:57:23 +00:00
Eli Friedman	0cdc148ab8	Bring LICM into compliance with the new "Memory Model for Concurrent Operations" in LangRef. llvm-svn: 135625	2011-07-20 21:37:47 +00:00
Jay Foad	50bfbab033	Fix a GCC warning. llvm-svn: 135581	2011-07-20 08:15:21 +00:00
Andrew Trick	638b355a16	indvars: Added getInsertPointForUses to find a valid place to truncate the IV. llvm-svn: 135568	2011-07-20 05:32:06 +00:00
Andrew Trick	2210448520	indvars -disable-iv-rewrite: Add NarrowIVDefUse to cache def-use info. Holding Use* pointers is bad form even though it happened to work in this case. llvm-svn: 135566	2011-07-20 04:39:24 +00:00
Andrew Trick	c5dd3e976a	indvars -disable-iv-rewrite fix: derived GEP IVs llvm-svn: 135558	2011-07-20 02:08:58 +00:00
Eli Friedman	55d6ccbb79	PR10386: Don't try to split an edge from an indirectbr. llvm-svn: 135534	2011-07-19 22:59:41 +00:00
Devang Patel	a59b24b090	Distinguish between two copies of one inlined variable. llvm-svn: 135528	2011-07-19 22:31:15 +00:00
Jay Foad	b992a635fb	Convert SimplifyGEPInst to use ArrayRef. llvm-svn: 135482	2011-07-19 15:07:52 +00:00
Jay Foad	bf904773bb	Convert TargetData::getIndexedOffset to use ArrayRef. llvm-svn: 135478	2011-07-19 14:01:37 +00:00
Jay Foad	f4b14a2b0d	Use ArrayRef in ConstantFoldInstOperands and ConstantFoldCall. llvm-svn: 135477	2011-07-19 13:32:40 +00:00
Andrew Trick	c43b67644c	Compiler warning. llvm-svn: 135426	2011-07-18 21:15:03 +00:00
Andrew Trick	7da2417c8a	indvars: LinearFunctionTestReplace for non-canonical IVs. For -disable-iv-rewrite, perform LFTR without generating a new "canonical" induction variable. Instead find the "best" existing induction variable for use in the loop exit test and compute the final value of that IV for use in the new loop exit test. In short, convert to a simple eq/ne exit test as long as it's cheap to do so. llvm-svn: 135420	2011-07-18 20:32:31 +00:00
Andrew Trick	494c549ebd	indvars: Added verification that LFTR and other indvars goodness does not interfere with BackedgeTakenCount computation. llvm-svn: 135412	2011-07-18 18:44:20 +00:00
Andrew Trick	a27d8b183a	indvars: Added isHighCostExpansion. Avoid generating extra ops in the preheader for the sole purpose of LFTR, since LFTR itself is usually not a clear optimization. llvm-svn: 135409	2011-07-18 18:21:35 +00:00
Frits van Bommel	717d7edd3e	Migrate LLVM and Clang to use the new makeArrayRef(...) functions where previously explicit non-default constructors were used. Mostly mechanical with some manual reformatting. llvm-svn: 135390	2011-07-18 12:00:32 +00:00
Chris Lattner	229907cd11	land David Blaikie's patch to de-constify Type, with a few tweaks. llvm-svn: 135375	2011-07-18 04:54:35 +00:00
Chris Lattner	7b70bef7c8	fix a warning in TinyPtrVector, adopt it in SSAUpdater, saving some mallocs. llvm-svn: 135366	2011-07-18 01:43:58 +00:00
Andrew Trick	c591f3afc3	indvars: fix a pass-sensitivity issue that would hit the SCEVExpander assertion I added in r135333. Check for the existence of a preheader before expanding a recurrence. llvm-svn: 135335	2011-07-16 01:18:53 +00:00
Andrew Trick	9ea55dc2d6	indvars: remove ExprToIVMap because it won't be needed by LFTR. llvm-svn: 135334	2011-07-16 01:06:48 +00:00
Chris Lattner	8b4cf5e8a2	fix rdar://9776316 - type remapping needed for inline asm blobs, fixing some objc llvm-test crashes with LTO. llvm-svn: 135324	2011-07-15 23:18:40 +00:00
Chad Rosier	a7ff54351a	Disable loop idiom recognition of memset/memcpy if the function being compiled is named after a common idiom (i.e., memset/memcpy). Otherwise, we can run into infinite recursion. Ideally, the user should use the correct -fno-builtin flag, but in case they don't we should play nicely. rdar://9763412 llvm-svn: 135286	2011-07-15 18:25:04 +00:00
Frits van Bommel	bbe46f28b1	No need to explicitly invoke the ArrayRef constructor here. llvm-svn: 135281	2011-07-15 17:13:23 +00:00
Jay Foad	5bd375a6cc	Convert CallInst and InvokeInst APIs to use ArrayRef. llvm-svn: 135265	2011-07-15 08:37:34 +00:00
Chris Lattner	b1a1512119	start using the new helper methods a bit. llvm-svn: 135251	2011-07-15 06:08:15 +00:00
Devang Patel	cbd3bb27d7	Undo r135191 (i.e. reapply Chris's patch. Now linker maps NamedMDNodes first, so there is not any need to map DebugLoc). llvm-svn: 135205	2011-07-14 22:14:06 +00:00
Chris Lattner	fb9f4926d1	revert r135172 until Devang and I figure out the right answer. llvm-svn: 135191	2011-07-14 21:25:42 +00:00
Chris Lattner	69eea72779	Stop the ValueMapper from calling getAllMetadata, which unpacks DebugLoc into an MDNode. This saves a bunch of time and memory in the IR linker, e.g. when doing LTO of files with debug info. llvm-svn: 135172	2011-07-14 18:53:50 +00:00
Benjamin Kramer	e6e1933f31	Change Intrinsic::getDeclaration and friends to take an ArrayRef. llvm-svn: 135154	2011-07-14 17:45:39 +00:00
Evan Cheng	b94674b325	It's not safe to fold (fptrunc (sqrt (fpext x))) to (sqrtf x) if there is another use of sqrt. rdar://9763193 llvm-svn: 135058	2011-07-13 19:08:16 +00:00
Jay Foad	57aa636794	Convert InsertValueInst and ExtractValueInst APIs to use ArrayRef. llvm-svn: 135040	2011-07-13 10:26:04 +00:00
Jay Foad	b804a2b751	Second attempt at de-constifying LLVM Types in FunctionType::get(), StructType::get() and TargetData::getIntPtrType(). llvm-svn: 134982	2011-07-12 14:06:48 +00:00
Bill Wendling	a78cd228c2	Revert r134893 and r134888 (and related patches in other trees). It was causing an assert on Darwin llvm-gcc builds. Assertion failed: (castIsValid(op, S, Ty) && "Invalid cast!"), function Create, file /Users/buildslave/zorg/buildbot/smooshlab/slave-0.8/build.llvm-gcc-i386-darwin9-RA/llvm.src/lib/VMCore/Instructions.cpp, li\ ne 2067. etc. http://smooshlab.apple.com:8013/builders/llvm-gcc-i386-darwin9-RA/builds/2354 --- Reverse-merging r134893 into '.': U include/llvm/Target/TargetData.h U include/llvm/DerivedTypes.h U tools/bugpoint/ExtractFunction.cpp U unittests/Support/TypeBuilderTest.cpp U lib/Target/ARM/ARMGlobalMerge.cpp U lib/Target/TargetData.cpp U lib/VMCore/Constants.cpp U lib/VMCore/Type.cpp U lib/VMCore/Core.cpp U lib/Transforms/Utils/CodeExtractor.cpp U lib/Transforms/Instrumentation/ProfilingUtils.cpp U lib/Transforms/IPO/DeadArgumentElimination.cpp U lib/CodeGen/SjLjEHPrepare.cpp --- Reverse-merging r134888 into '.': G include/llvm/DerivedTypes.h U include/llvm/Support/TypeBuilder.h U include/llvm/Intrinsics.h U unittests/Analysis/ScalarEvolutionTest.cpp U unittests/ExecutionEngine/JIT/JITTest.cpp U unittests/ExecutionEngine/JIT/JITMemoryManagerTest.cpp U unittests/VMCore/PassManagerTest.cpp G unittests/Support/TypeBuilderTest.cpp U lib/Target/MBlaze/MBlazeIntrinsicInfo.cpp U lib/Target/Blackfin/BlackfinIntrinsicInfo.cpp U lib/VMCore/IRBuilder.cpp G lib/VMCore/Type.cpp U lib/VMCore/Function.cpp G lib/VMCore/Core.cpp U lib/VMCore/Module.cpp U lib/AsmParser/LLParser.cpp U lib/Transforms/Utils/CloneFunction.cpp G lib/Transforms/Utils/CodeExtractor.cpp U lib/Transforms/Utils/InlineFunction.cpp U lib/Transforms/Instrumentation/GCOVProfiling.cpp U lib/Transforms/Scalar/ObjCARC.cpp U lib/Transforms/Scalar/SimplifyLibCalls.cpp U lib/Transforms/Scalar/MemCpyOptimizer.cpp G lib/Transforms/IPO/DeadArgumentElimination.cpp U lib/Transforms/IPO/ArgumentPromotion.cpp U lib/Transforms/InstCombine/InstCombineCompares.cpp U lib/Transforms/InstCombine/InstCombineAndOrXor.cpp U lib/Transforms/InstCombine/InstCombineCalls.cpp U lib/CodeGen/DwarfEHPrepare.cpp U lib/CodeGen/IntrinsicLowering.cpp U lib/Bitcode/Reader/BitcodeReader.cpp llvm-svn: 134949	2011-07-12 01:15:52 +00:00
Andrew Trick	cdc2297ee1	indvars: Code reorganization in preparation for LinearFunctionTestReplace rewrite. No functionality. I've been wanting to group the indvar subphases into sections and order them by their logical sequence. My next checkin adds functions related to LFTR, and doing the reorg now should help reviewers. Since, most of the code in IndVarSimplify.cpp has recently been replaced or will be replaced soon, obscuring blame should not be an issue. This seems like an ideal time to shuffle the code around. I'm happy to take more suggestions for cleaning up the code. Or if you've been wanting to cleanup anything in this file yourself, now is a good time. llvm-svn: 134941	2011-07-12 00:08:50 +00:00
Jay Foad	7c57be3e2b	De-constify Types in StructType::get() and TargetData::getIntPtrType(). llvm-svn: 134893	2011-07-11 09:56:20 +00:00
Jay Foad	56cc1530ee	De-constify Types in FunctionType::get(). llvm-svn: 134888	2011-07-11 07:56:41 +00:00
Rafael Espindola	403256763f	Don't duplicate the work done by a gep into a "bitcast" if the gep has more than one use. Fixes PR10322. llvm-svn: 134883	2011-07-11 03:43:47 +00:00
Chris Lattner	6b96757745	remove the DerivedType which isn't adding value anymore. llvm-svn: 134832	2011-07-09 17:59:15 +00:00
Chris Lattner	b1ed91f397	Land the long talked about "type system rewrite" patch. This patch brings numerous advantages to LLVM. One way to look at it is through diffstat: 109 files changed, 3005 insertions(+), 5906 deletions(-) Removing almost 3K lines of code is a good thing. Other advantages include: 1. Value::getType() is a simple load that can be CSE'd, not a mutating union-find operation. 2. Types a uniqued and never move once created, defining away PATypeHolder. 3. Structs can be "named" now, and their name is part of the identity that uniques them. This means that the compiler doesn't merge them structurally which makes the IR much less confusing. 4. Now that there is no way to get a cycle in a type graph without a named struct type, "upreferences" go away. 5. Type refinement is completely gone, which should make LTO much MUCH faster in some common cases with C++ code. 6. Types are now generally immutable, so we can use "Type " instead "const Type " everywhere. Downsides of this patch are that it removes some functions from the C API, so people using those will have to upgrade to (not yet added) new API. "LLVM 3.0" is the right time to do this. There are still some cleanups pending after this, this patch is large enough as-is. llvm-svn: 134829	2011-07-09 17:41:24 +00:00
Lang Hames	266dab7bab	Added recognition for signed add/sub/mul with overflow intrinsics to GVN as per Chris and Frits suggestion. llvm-svn: 134777	2011-07-09 00:25:11 +00:00
Bob Wilson	3c68b626e7	Reapply a fixed version of r133285. This tightens up checking for overflow in alloca sizes, based on feedback from Duncan and John about the change in r132926. llvm-svn: 134749	2011-07-08 22:09:33 +00:00
Benjamin Kramer	6a24f9487a	Remove unused copy of UpdateInlinedAtInfo. llvm-svn: 134720	2011-07-08 19:32:06 +00:00
Devang Patel	35797406a5	Refactor. It is inliner's responsibility to update line number information. llvm-svn: 134708	2011-07-08 18:01:31 +00:00
Lang Hames	29cd98fd52	Make GVN look through extractvalues for recognised intrinsics. GVN can then CSE ops that match values produced by the intrinsics. llvm-svn: 134677	2011-07-08 01:50:54 +00:00
Devang Patel	41e97da74f	Use DBG_VALUE location while inserting DBG_VALUE during alloca promotion. llvm-svn: 134568	2011-07-07 00:05:58 +00:00
Jakub Staszak	a11f7ecbf8	Fix a bug in the "expect" intrinsic lowering. llvm-svn: 134566	2011-07-06 23:50:16 +00:00
Devang Patel	c6ee9181d0	Handle cases where multiple dbg.declare and dbg.value intrinsics are tied to one alloca. llvm-svn: 134549	2011-07-06 22:06:11 +00:00
Devang Patel	a3cbf52a57	Simplify. Consolidate dbg.declare handling in AllocaPromoter. llvm-svn: 134538	2011-07-06 21:09:55 +00:00
Andrew Trick	9f8c2853ca	indvars -disable-iv-rewrite: ExprToMap lives in Pass data, so be more careful about referencing values. llvm-svn: 134537	2011-07-06 21:07:10 +00:00
Andrew Trick	3239055dee	indvars -disable-iv-rewrite: Added SimplifyCongruentIVs. llvm-svn: 134530	2011-07-06 20:50:43 +00:00
Tobias Grosser	a3928f5084	LICM: Remove trailing white spaces llvm-svn: 134521	2011-07-06 19:20:02 +00:00
Tobias Grosser	4a5d9a9c20	LICM: Do not loose alignment on promotion The promotion code lost any alignment information, when hoisting loads and stores out of the loop. This lead to incorrect aligned memory accesses. We now use the largest alignment we can prove to be correct. llvm-svn: 134520	2011-07-06 19:19:55 +00:00
Jakub Staszak	3f158fdf6e	Introduce "expect" intrinsic instructions. llvm-svn: 134516	2011-07-06 18:22:43 +00:00
Devang Patel	c3239d3965	Preserve debug loc. llvm-svn: 134441	2011-07-05 21:48:22 +00:00
Andrew Trick	92905a1767	indvars -disable-iv-rewrite: avoid multiple IVs in weird cases. Putting back the helper that I removed on 7/1 to do this right. llvm-svn: 134423	2011-07-05 18:19:39 +00:00
Benjamin Kramer	9eca5feff1	PR10267: Don't combine an equality compare with an AND into an inequality compare when the AND has more than one use. This can pessimize code, inequalities are generally more expensive. llvm-svn: 134379	2011-07-04 20:16:36 +00:00
Andrew Trick	6d12309475	indvars -disable-iv-rewrite: bug fix involving weird geps and related cleanup. llvm-svn: 134306	2011-07-02 02:34:25 +00:00
Owen Anderson	2f37bdc392	Generalize @llvm.ctlz, @llvm.cttz, and @llvm.ctpop to work on vectors of integers, and fix the one optimization pass that I'm aware of that needs updating for this. At least one current target, ARM NEON, can implement these operations on vectors directly. llvm-svn: 134265	2011-07-01 21:52:38 +00:00
Nick Lewycky	f64a39768d	Fix likely typo, reduce number of instruction name collisions. llvm-svn: 134235	2011-07-01 06:27:03 +00:00
Rafael Espindola	b10a0f223a	Add r134057 back, but splice the predecessor after the successors phi nodes. Original message: Let simplify cfg simplify bb with only debug and lifetime intrinsics. llvm-svn: 134182	2011-06-30 20:14:24 +00:00
Andrew Trick	efe89ad414	indvars -disable-iv-rewrite: handle cloning binary operators that cannot overflow. llvm-svn: 134177	2011-06-30 19:02:17 +00:00
Andrew Trick	cc68605353	indvars -disable-iv-rewrite: handle an edge case involving identity phis. llvm-svn: 134124	2011-06-30 01:27:23 +00:00
Andrew Trick	ecdd6e4c67	indvars -disable-iv-rewrite: insert new trunc instructions carefully. llvm-svn: 134112	2011-06-29 23:03:57 +00:00
Chad Rosier	96ed721d9b	Temporarily revert r134057: "Let simplify cfg simplify bb with only debug and lifetime intrinsics" due to buildbot failures. llvm-svn: 134071	2011-06-29 16:22:11 +00:00
Rafael Espindola	4c0dfcec7e	Let simplify cfg simplify bb with only debug and lifetime intrinsics. llvm-svn: 134057	2011-06-29 05:25:47 +00:00
Andrew Trick	efe2b1963d	indvars -disable-iv-rewrite: just because SCEV ignores casts doesn't mean they can be removed. llvm-svn: 134054	2011-06-29 03:13:40 +00:00
Andrew Trick	4426f5b388	cleanup: misleading comment. llvm-svn: 134010	2011-06-28 16:45:04 +00:00
Andrew Trick	411daa5e81	SCEVExpander: give new insts a name that identifies the reponsible pass. llvm-svn: 133992	2011-06-28 05:07:32 +00:00
Andrew Trick	60ab3efb3e	whitespace llvm-svn: 133991	2011-06-28 05:04:16 +00:00
Nick Lewycky	fa44dc6509	Fix typo in comment. llvm-svn: 133990	2011-06-28 03:57:31 +00:00
Andrew Trick	56b315a9cf	indvars --disable-iv-rewrite: sever ties with IVUsers. llvm-svn: 133988	2011-06-28 03:01:46 +00:00
Andrew Trick	8a3c39c737	indvars --disable-iv-rewrite: Defer evaluating s/zext until SCEV evaluates all other IV exprs. llvm-svn: 133982	2011-06-28 02:49:20 +00:00
Andrew Trick	163b4a70fb	indvars -disable-iv-rewrite: run RLEV after SimplifyIVUsers for a bit more control over the order SCEVs are evaluated. llvm-svn: 133959	2011-06-27 23:17:44 +00:00
Jakub Staszak	423651e46a	Calculate GetBestDestForJumpOnUndef correctly. llvm-svn: 133946	2011-06-27 21:51:12 +00:00
Nick Lewycky	a61df3f843	Teach one piece of scalarrepl to handle lifetime markers. When transforming an alloca that only holds a copy of a global and we're going to replace the users of the alloca with that global, just nuke the lifetime intrinsics. Part of PR10121. llvm-svn: 133905	2011-06-27 05:40:02 +00:00
Nick Lewycky	3e334a42d7	Move onlyUsedByLifetimeMarkers to ValueTracking so that it can be used by other passes as well. llvm-svn: 133904	2011-06-27 04:20:45 +00:00
Eli Friedman	2c980fafff	PR10180: Fix a instcombine crash with FP vectors. llvm-svn: 133756	2011-06-23 20:40:23 +00:00
Jay Foad	61ea0e4692	Reinstate r133513 (reverted in r133700) with an additional fix for a -Wshorten-64-to-32 warning in Instructions.h. llvm-svn: 133708	2011-06-23 09:09:15 +00:00
Eric Christopher	96513120b7	Revert r133513: "Reinstate r133435 and r133449 (reverted in r133499) now that the clang self-hosted build failure has been fixed (r133512)." Due to some additional warnings. llvm-svn: 133700	2011-06-23 06:24:52 +00:00
Devang Patel	ea7751bc24	Set debug loc. llvm-svn: 133636	2011-06-22 19:52:36 +00:00
Jay Foad	83be361b8a	Replace the existing forms of ConstantArray::get() with a single form that takes an ArrayRef. llvm-svn: 133615	2011-06-22 09:24:39 +00:00
Andrew Trick	fc4ccb20c6	IVUsers no longer needs to record the phis. llvm-svn: 133518	2011-06-21 15:43:52 +00:00
Benjamin Kramer	ccbb77f239	Remove unused variables. llvm-svn: 133514	2011-06-21 14:58:30 +00:00
Jay Foad	a97a2c998e	Reinstate r133435 and r133449 (reverted in r133499) now that the clang self-hosted build failure has been fixed (r133512). llvm-svn: 133513	2011-06-21 10:33:19 +00:00
Jay Foad	25127ab1e4	Don't use PN->replaceUsesOfWith() to change a PHINode's incoming blocks, because it won't work after my phi operand changes, because the incoming blocks will no longer be Uses. llvm-svn: 133512	2011-06-21 10:02:43 +00:00
Andrew Trick	69d4452f2e	indvars -disable-iv-rewrite: Adds support for eliminating identity ops. This is a rewrite of the IV simplification algorithm used by -disable-iv-rewrite. To avoid perturbing the default mode, I temporarily split the driver and created SimplifyIVUsersNoRewrite. The idea is to avoid doing opcode/pattern matching inside IndVarSimplify. SCEV already does it. We want to optimize with the full generality of SCEV, but optimize def-use chains top down on-demand rather than rewriting the entire expression bottom-up. This was easy to do for operations that SCEV can prove are identity function. So we're now eliminating bitmasks and zero extends this way. A result of this rewrite is that indvars -disable-iv-rewrite no longer requires IVUsers. llvm-svn: 133502	2011-06-21 03:22:38 +00:00
Chad Rosier	184f3b37e2	Revert r133435 and r133449 to appease buildbots. llvm-svn: 133499	2011-06-21 02:09:03 +00:00
Dan Gohman	ceaac7cb4a	Completely short-circuit out ARC optimization if the ARC runtime functions do not appear in the module. llvm-svn: 133478	2011-06-20 23:20:43 +00:00
Jay Foad	e03c05c35a	Change how PHINodes store their operands. Change PHINodes to store simple pointers to their incoming basic blocks, instead of full-blown Uses. Note that this loses an optimization in SplitCriticalEdge(), because we can no longer walk the use list of a BasicBlock to find phi nodes. See the comment I removed starting "However, the foreach loop is slow for blocks with lots of predecessors". Extend replaceAllUsesWith() on a BasicBlock to also update any phi nodes in the block's successors. This mimics what would have happened when PHINodes were proper Users of their incoming blocks. (Note that this only works if OldBB->replaceAllUsesWith(NewBB) is called when OldBB still has a terminator instruction, so it still has some successors.) llvm-svn: 133435	2011-06-20 14:38:01 +00:00
Jay Foad	372ad64b4d	Make better use of the PHINode API. Change various bits of code to make better use of the existing PHINode API, to insulate them from forthcoming changes in how PHINodes store their operands. llvm-svn: 133434	2011-06-20 14:18:48 +00:00
Chris Lattner	cc19efaa97	Revamp the "ConstantStruct::get" methods. Previously, these were scattered all over the place in different styles and variants. Standardize on two preferred entrypoints: one that takes a StructType and ArrayRef, and one that takes StructType and varargs. In cases where there isn't a struct type convenient, we now add a ConstantStruct::getAnon method (whose name will make more sense after a few more patches land). It would be "really really nice" if the ConstantStruct::get and ConstantVector::get methods didn't make temporary std::vectors. llvm-svn: 133412	2011-06-20 04:01:31 +00:00
Chris Lattner	f3f545ea8a	fix the varargs version of StructType::get to not require an LLVMContext, making usage much cleaner. llvm-svn: 133364	2011-06-18 22:48:56 +00:00
Hans Wennborg	4ab4a8e63a	Fix PR10103: Less code for enum type translation. In cases such as the attached test, where the case value for a switch destination is used in a phi node that follows the destination, it might be better to replace that value with the condition value of the switch, so that more blocks can be folded away with TryToSimplifyUncondBranchFromEmptyBlock because there are less conflicts in the phi node. llvm-svn: 133344	2011-06-18 10:28:47 +00:00
Cameron Zwarich	9601ddb2f3	When scalar replacement returns a vector type, only accept it if the vector type's bitwidth matches the (allocated) size of the alloca. This severely pessimizes vector scalar replacement when the only vector type being used is something like <3 x float> on x86 or ARM whose allocated size matches a <4 x float>. I hope to fix some of the flawed assumptions about allocated size throughout scalar replacement and reenable this in most cases. llvm-svn: 133338	2011-06-18 06:17:51 +00:00
Cameron Zwarich	2a26100c87	Fix an invalid bitcast crash that occurs when doing a partial memset of a vector alloca. Fixes part of <rdar://problem/9580800>. llvm-svn: 133336	2011-06-18 05:47:49 +00:00
Cameron Zwarich	cd42038fdc	Remove a pointless assignment. Nothing checks the value of VectorTy anymore now unless ScalarKind is Vector. llvm-svn: 133335	2011-06-18 05:47:45 +00:00
Chad Rosier	c76b9d8c2f	Revert r133285. Causing odd failures on Dragonegg. llvm-svn: 133301	2011-06-17 22:08:25 +00:00
Devang Patel	6f7315b0ca	Set debug loc for new preheader's terminator. llvm-svn: 133298	2011-06-17 21:36:44 +00:00
Stuart Hastings	23be986a0c	Relocate NUW test to cover all binary ops in a dynamic alloca expr. Followup to 132926. rdar://problem/9265821 llvm-svn: 133285	2011-06-17 20:21:52 +00:00
Nick Lewycky	e11f467dda	When promoting an alloca to registers discard any lifetime intrinsics. llvm-svn: 133251	2011-06-17 10:09:00 +00:00
Dan Gohman	00fa9634d5	Fix ARCOpt to insert releases on both successors of an invoke rather than trying to insert them immediately after the invoke. llvm-svn: 133188	2011-06-16 20:57:14 +00:00
John McCall	d935e9c359	The ARC language-specific optimizer. Credit to Dan Gohman. llvm-svn: 133108	2011-06-15 23:37:01 +00:00
Eli Friedman	19ace4c31a	Simplify; no significant functionality change. llvm-svn: 133086	2011-06-15 21:08:25 +00:00
Rafael Espindola	ea7a02774d	Fix cmake build. llvm-svn: 133085	2011-06-15 21:03:04 +00:00
Eli Friedman	a472b7d900	Remove unused code. llvm-svn: 133078	2011-06-15 19:58:09 +00:00
Eli Friedman	e8bbc10880	Stop using memdep for a check that didn't really make sense with memdep. In terms of specific issues, using memdep here checks irrelevant instructions and won't work properly once we start returning "unknown" more aggressively from memdep. llvm-svn: 133035	2011-06-15 01:25:56 +00:00
Eli Friedman	7d58bc7bc0	Add "unknown" results for memdep, which mean "I don't know whether a dependence for the given instruction exists in the given block". This cleans up all the existing hacks in memdep which represent this concept by returning clobber with various unrelated instructions. llvm-svn: 133031	2011-06-15 00:47:34 +00:00
Cameron Zwarich	b5f19d9f6f	Be more obvious about what is being tested. llvm-svn: 132982	2011-06-14 06:33:51 +00:00
John McCall	5af845226c	Use IRBuilder to make our intrinsic calls in the inliner so that we pick up line info correctly. llvm-svn: 132961	2011-06-14 02:51:53 +00:00
Nick Lewycky	9711b5c70b	Use Value::stripPointerCasts instead of reinventing part of the wheel. llvm-svn: 132954	2011-06-14 00:59:24 +00:00
Cameron Zwarich	922e4940bd	Fix grammar. llvm-svn: 132952	2011-06-13 23:39:23 +00:00
Cameron Zwarich	3ecbd59c27	Rename MergeInType to MergeInTypeForLoadOrStore. llvm-svn: 132940	2011-06-13 21:44:43 +00:00
Cameron Zwarich	8cb90ac456	Remove the HadAVector instance variable and replace it with a use of ScalarKind. llvm-svn: 132939	2011-06-13 21:44:40 +00:00
Cameron Zwarich	1bfab48edb	Remove a vacuous check. llvm-svn: 132938	2011-06-13 21:44:38 +00:00
Cameron Zwarich	5e9a0be4b3	Have SRoA explicitly track the kind of scalar it is promoting. This is pretty spartan right now, but I plan to encode more information in this enum to improve the correctness and reliability of SRoA. At least this first pass makes it possible to make VectorTy an actual VectorType. llvm-svn: 132937	2011-06-13 21:44:35 +00:00
Cameron Zwarich	8deb615d64	Remove an argument that is always true. llvm-svn: 132936	2011-06-13 21:44:31 +00:00
Stuart Hastings	351a3f881f	Avoid fusing bitcasts with dynamic allocas if the amount-to-allocate might overflow. Re-typing the alloca to a larger type (e.g. double) hoists a shift into the alloca, potentially exposing overflow in the expression. rdar://problem/9265821 llvm-svn: 132926	2011-06-13 18:48:49 +00:00
Benjamin Kramer	c970849ea0	InstCombine: Fold A-b == C --> b == A-C if A and C are constants. The backend already knew this trick. llvm-svn: 132915	2011-06-13 15:24:24 +00:00
Nick Lewycky	f8e046b148	It's possible that an all-zero GEP may be used as the argument to lifetime intrinsics. In fact, we'll optimize a bitcast to that when possible. Detect it when looking for the lifetime intrinsics. No test case, noticed by inspection. llvm-svn: 132906	2011-06-13 07:52:46 +00:00
Benjamin Kramer	91f914ce21	InstCombine: Shrink ((zext X) & C1) == C2 to fold away the cast if the "zext" and the "and" have one use. llvm-svn: 132897	2011-06-12 22:48:00 +00:00
Benjamin Kramer	35159c114c	Simplify code. No functionality changes, name changes aside. llvm-svn: 132896	2011-06-12 22:47:53 +00:00
John McCall	58fb52c6c7	When deleting a basic block, remove call edges only for non-intrinsics. llvm-svn: 132803	2011-06-09 20:31:09 +00:00
John McCall	fc1ca36866	SplitCriticalEdge can sometimes split the edge from an invoke to a landing pad, separating the exception and selector calls from the new lpad. Teaching it not to do that, or to properly adjust the CFG afterwards, is out of scope because it would require the other edges to the landing pad to be split as well (effectively). Instead, just recover from the most likely cases during inlining. The best long-term solution is to change the exception representation and commit to either requiring or not requiring the more complex edge-splitting logic; this is just a shorter-term hack. llvm-svn: 132799	2011-06-09 20:06:24 +00:00
John McCall	729c35b680	Teach the CallGraph to ignore calls to intrinsics. llvm-svn: 132797	2011-06-09 19:46:27 +00:00
Rafael Espindola	b77c00fb60	Improve the handling of available_externally and llvm.global_ctors. llvm-svn: 132775	2011-06-09 14:38:09 +00:00
Cameron Zwarich	c62894d440	Remove a vacuous condition. llvm-svn: 132767	2011-06-09 01:52:44 +00:00
Cameron Zwarich	77a699a829	Fix PR10104 by adding a bounds check on a vector element access check. It was assuming that all offsets are legal vector accesses, and thus trying to access the float member of { <2 x float>, float } as the 3rd element of the first member. llvm-svn: 132766	2011-06-09 01:45:33 +00:00
Cameron Zwarich	c3b1cc9aca	Fix an assymmetry between ConvertScalar_ExtractValue and ConvertScalar_InsertValue. The former was using the size of the entire alloca, whereas the latter was correctly using the allocated size of the immediate type being converted (which may differ from the size of the alloca). This fixes PR10082. llvm-svn: 132759	2011-06-08 22:08:31 +00:00
Bill Wendling	4f163dfed1	If the block that we're threading through is jumped to by an indirect branch, then we don't want to set the destination in the indirect branch to the destination. This is because the indirect branch needs its destinations to have had their block addresses taken. This isn't so of the new critical edge that's split during this process. If it turns out that the destination block has only one predecessor, and that being a BB with an indirect branch, then it won't be marked as 'used' and may be removed. PR10072 llvm-svn: 132638	2011-06-04 09:42:04 +00:00
Devang Patel	84bb33add9	Use IRBuilder, preserve line numbers. llvm-svn: 132578	2011-06-03 19:46:19 +00:00
Nick Lewycky	611582401f	Bail on unswitching a switch statement for a case with a critical edge. We name which edge to split by pred/succ pair, which means that we can end up splitting the wrong edge (by case value) in the switch statement entirely. Fixes PR10031! llvm-svn: 132535	2011-06-03 06:27:15 +00:00
Devang Patel	5127c5d9b2	Preserve line number information while converting Invoke into a Call. llvm-svn: 132505	2011-06-02 22:46:58 +00:00
Eli Friedman	5da0ff41d7	PR10067: Add missing safety check to call return transformation in MemCpyOpt::processStore. If something accesses the dest of the "copy" between the call and the copy, the performCallSlotOptzn transformation is not valid. llvm-svn: 132485	2011-06-02 21:24:42 +00:00
Stuart Hastings	2380483355	Reapply 132348 with fixes. rdar://problem/6501862 llvm-svn: 132402	2011-06-01 16:42:47 +00:00
John McCall	fca7786267	First, do no harm -- even if we can't find a selector for an enclosing landing pad, forward llvm.eh.resume calls to it instead of turning them invalidly into invokes. llvm-svn: 132382	2011-06-01 02:17:11 +00:00
Stuart Hastings	9d6a06d536	Revert to pacify a buildbot. rdar://problem/6501862 llvm-svn: 132351	2011-05-31 19:56:35 +00:00
Stuart Hastings	780f723309	Followup to 132316; accept arbitrary constants, add with a constant, sub with a non-constant. Fix comments, enlarge test case. rdar://problem/6501862 llvm-svn: 132348	2011-05-31 19:29:55 +00:00
Stuart Hastings	8284374b07	(1 - X) * (-2) -> (x - 1) * 2, for all positive nonzero powers of 2 rdar://problem/6501862 llvm-svn: 132316	2011-05-30 20:00:33 +00:00
Nick Lewycky	c66d455e50	Don't crash owhen ComputeLoadResult can't compute the result of the load. llvm-svn: 132290	2011-05-29 19:33:36 +00:00
Nick Lewycky	a3bb03e400	Obey the isVolatile bit on memory intrinsics when analyzing uses of a global variable. Noticed by inspection. Simulate memset in EvaluateFunction where the target of the memset and the value we're setting are both the null value. Fixes PR10047! llvm-svn: 132288	2011-05-29 18:41:56 +00:00
Nadav Rotem	707f2d7787	Fix warnings due to 132263; Thanks rdivacky. llvm-svn: 132285	2011-05-29 08:10:47 +00:00
John McCall	2c6d23fba2	Fix this to work correctly with phis; test case to follow if this successfully fixes self-host. llvm-svn: 132275	2011-05-29 03:01:09 +00:00
Benjamin Kramer	fd53a27f99	ConstantFoldInstOperands doesn't like compares, hand it off to instsimplify instead. Fixes PR10040. llvm-svn: 132254	2011-05-28 10:16:58 +00:00
John McCall	046c47e970	Implement and document the llvm.eh.resume intrinsic, which is transformed by the inliner into a branch to the enclosing landing pad (when inlined through an invoke). If not so optimized, it is lowered DWARF EH preparation into a call to _Unwind_Resume (or _Unwind_SjLj_Resume as appropriate). Its chief advantage is that it takes both the exception value and the selector value as arguments, meaning that there is zero effort in recovering these; however, the frontend is required to pass these down, which is not actually particularly difficult. Also document the behavior of landing pads a bit better, and make it clearer that it's okay that personality functions don't always land at landing pads. This is just a fact of life. Don't write optimizations that rely on pushing things over an unwind edge. llvm-svn: 132253	2011-05-28 07:45:59 +00:00
Nadav Rotem	a9effb13dd	Refactor getActionType and getTypeToTransformTo ; place all of the 'decision' code in one place. Re-apply 131534 and fix the multi-step promotion of integers. llvm-svn: 132217	2011-05-27 21:03:13 +00:00
Eli Friedman	ddf7f55531	Attempt to preserve debug line info in LICM; as the comment in the code says, it's hard to pick good line numbers for this transformation, but something is better than nothing. rdar://9143729 llvm-svn: 132215	2011-05-27 20:31:51 +00:00
Eli Friedman	942e1c10f6	Don't sink or hoist debug info instrinsics; it isn't useful. This also prevents LICM sinking from erasing debug intrinsics which don't dominate any exit block of the loop. rdar://9143943 . llvm-svn: 132201	2011-05-27 18:37:52 +00:00
John McCall	bd04b74bb2	Fix the inliner to maintain the current de facto invoke semantics: - the selector for the landing pad must provide all available information about the handlers, filters, and cleanups within that landing pad - calls to _Unwind_Resume must be converted to branches to the enclosing lpad so as to avoid re-entering the unwinder when the lpad claimed it was going to handle the exception in some way This is quite specific to libUnwind-based unwinding. In an effort to not interfere too badly with other unwinders, and with existing hacks in frontends, this only triggers on _Unwind_Resume (not _Unwind_Resume_or_Rethrow) and does nothing with selectors if it cannot find a selector call for either lpad. llvm-svn: 132200	2011-05-27 18:34:38 +00:00
Eli Friedman	b868c83e67	Oops, wasn't intending to commit this. Partial revert of r132194. llvm-svn: 132195	2011-05-27 18:04:04 +00:00
Eli Friedman	fe84bd659c	Fix a silly mistake (which trips over an assertion) in r132099. rdar://9515076 llvm-svn: 132194	2011-05-27 18:02:04 +00:00
Benjamin Kramer	749ef5f420	InstCombine: Make switch folding with equality compares more aggressive by trying instsimplify on the arm where we know the compared value. Stuff like "x == y ? y : x&y" now folds into "x&y". llvm-svn: 132185	2011-05-27 13:00:16 +00:00
Eli Friedman	e217f89420	One more debug line number miss in instcombine (although the code in question isn't actually in instcombine). llvm-svn: 132170	2011-05-27 01:00:36 +00:00
Eli Friedman	35211c6091	Final step of instcombine debuginfo; switch a couple more places over to InsertNewInstWith, and use setDebugLoc for the cases which can't be easily handled by the automated mechanisms. llvm-svn: 132167	2011-05-27 00:19:40 +00:00
Chandler Carruth	07f5b65e63	Fix warning about \|\| and && without explicit grouping. This looks like it flagged an actual bug. Devang, please review. I added the parentheses that change behavior, but make the behavior more closely match commit log's intent. llvm-svn: 132165	2011-05-26 23:37:58 +00:00
Devang Patel	bf22998f21	Do not insert anything after terminator. llvm-svn: 132164	2011-05-26 23:16:48 +00:00
Chad Rosier	b362884ca9	Renamed llvm.x86.sse42.crc32 intrinsics; crc64 doesn't exist. crc32.[8\|16\|32] have been renamed to .crc32.32.[8\|16\|32] and crc64.[8\|16\|32] have been renamed to .crc32.64.[8\|64]. llvm-svn: 132163	2011-05-26 23:13:19 +00:00
Devang Patel	252f0079a9	Do not move DBG_VALUE in middle of PHI nodes. llvm-svn: 132161	2011-05-26 22:43:14 +00:00
Devang Patel	0da5250bcd	If llvm.dbg.value and the value instruction it refers to are far apart then iSel may not be able to find corresponding Node for llvm.dbg.value during DAG construction. Make iSel's life easier by removing this distance between llvm.dbg.value and its value instruction. llvm-svn: 132151	2011-05-26 21:51:06 +00:00
Andrew Trick	7fac79e255	indvars: incremental fixes for -disable-iv-rewrite and testcases. Use a proper worklist for use-def traversal without holding onto an iterator. Now that we process all IV uses, we need complete logic for resusing existing derived IV defs. See HoistStep. llvm-svn: 132103	2011-05-26 00:46:11 +00:00
Eli Friedman	865866e7fe	PR9998: ashr exact %x, 31 is not equivalent to sdiv exact %x, -2147483648. llvm-svn: 132097	2011-05-25 23:26:20 +00:00
Evan Cheng	9605a698b0	Simplify r132022 based on Cameron's feedback. llvm-svn: 132071	2011-05-25 18:17:13 +00:00
Andrew Trick	eb3c36e69c	indvars: fixed IV cloning in -disable-iv-rewrite mode with associated cleanup and overdue test cases. llvm-svn: 132038	2011-05-25 04:42:22 +00:00
Evan Cheng	73e6c09d5e	Forgot dyn_cast check. llvm-svn: 132025	2011-05-24 23:47:50 +00:00
Evan Cheng	1b55f56b01	Fix LoopUnswitch bug. RewriteLoopBodyWithConditionConstant can delete a dead case of a switch instruction. Back off this optimization when this would eliminate all of the predecessors to the latch. Sorry, I am unable to reduce a reasonably sized test case. rdar://9486843 llvm-svn: 132022	2011-05-24 23:12:57 +00:00
Eli Friedman	68aab459ae	Make instcombine O(N) instead of O(N^2) in code where the same simplifiable constant is used many times. Part of rdar://9471075. llvm-svn: 131979	2011-05-24 18:52:07 +00:00
Cameron Zwarich	46e1ebf367	Clean up the lazy initialization of DIBuilder a bit. llvm-svn: 131956	2011-05-24 06:00:08 +00:00
Cameron Zwarich	843bc7d673	Make LoadAndStorePromoter preserve debug info and create llvm.dbg.values when promoting allocas to SSA variables. Fixes <rdar://problem/9479036>. llvm-svn: 131953	2011-05-24 03:10:43 +00:00
Dan Gohman	6c4a319088	When checking for signed multiplication overflow, watch out for INT_MIN and -1. This fixes PR9845. llvm-svn: 131919	2011-05-23 21:07:39 +00:00
Chris Lattner	388cb8a57c	rearrange two transforms, since one subsumes the other. Make the shift-exactness xform recurse. llvm-svn: 131888	2011-05-23 00:32:19 +00:00
Chris Lattner	8aff4f8efc	Transform any logical shift of a power of two into an exact/NUW shift when in a known-non-zero context. llvm-svn: 131887	2011-05-23 00:21:50 +00:00
Chris Lattner	321c58fc41	use the valuetracking isPowerOfTwo function, which is more powerful than checking for a constant directly. Thanks to Duncan for pointing this out. llvm-svn: 131885	2011-05-23 00:09:55 +00:00
Chris Lattner	83791ced7b	Teach valuetracking that byval arguments with a specified alignment are aligned. Teach memcpyopt to not give up all hope when confonted with an underaligned memcpy feeding an overaligned byval. If the source of the memcpy can be determined to be adequeately aligned, or if it can be forced to be, we can eliminate the memcpy. This addresses PR9794. We now compile the example into: define i32 @f(%struct.p* nocapture byval align 8 %q) nounwind ssp { entry: %call = call i32 @g(%struct.p* byval align 8 %q) nounwind ret i32 %call } in both x86-64 and x86-32 mode. We still don't get a tailcall though, because tailcalls apparently can't handle byval. llvm-svn: 131884	2011-05-23 00:03:39 +00:00
Chris Lattner	162dfc3e6b	add some random notes. llvm-svn: 131862	2011-05-22 18:26:48 +00:00
Chris Lattner	7c99f19d9f	Carve out a place in instcombine to put transformations which work knowing that their result is non-zero. Implement an example optimization (PR9814), which allows us to transform: A / ((1 << B) >>u 2) into: A >>u (B-2) which we compile into: _divu3: ## @divu3 leal -2(%rsi), %ecx shrl %cl, %edi movl %edi, %eax ret instead of: _divu3: ## @divu3 movb %sil, %cl movl $1, %esi shll %cl, %esi shrl $2, %esi movl %edi, %eax xorl %edx, %edx divl %esi, %eax ret llvm-svn: 131860	2011-05-22 18:18:41 +00:00
Chris Lattner	c4ca7ab7e7	Fix PR9815: I was trying to get out of "generating code and then failing to form a memset, then having to delete it" but my approximation isn't safe for self recurrent loops. Instead of doign a hack, just do it the right way. llvm-svn: 131858	2011-05-22 17:39:56 +00:00
Frits van Bommel	ad964559ef	Add a parameter to ConstantFoldTerminator() that callers can use to ask it to also clean up the condition of any conditional terminator it folds to be unconditional, if that turns the condition into dead code. This just means it calls RecursivelyDeleteTriviallyDeadInstructions() in strategic spots. It defaults to the old behavior. I also changed -simplifycfg, -jump-threading and -codegenprepare to use this to produce slightly better code without any extra cleanup passes (AFAICT this was the only place in -simplifycfg where now-dead conditions of replaced terminators weren't being cleaned up). The only other user of this function is -sccp, but I didn't read that thoroughly enough to figure out whether it might be holding pointers to instructions that could be deleted by this. llvm-svn: 131855	2011-05-22 16:24:18 +00:00
Chris Lattner	1a1acc2191	fix PR9856, an incorrectly conservative assertion: a global can be "stored once" even if its address is compared. llvm-svn: 131849	2011-05-22 07:15:13 +00:00
Chris Lattner	f0d59072de	fix PR9841 by having GVN not process dead loads. This was causing it to get into infinite loops when it would widen a load (which can necessarily leave around dead loads). llvm-svn: 131847	2011-05-22 07:03:34 +00:00
Nick Lewycky	a68ec83b36	Teach the inliner to emit llvm.lifetime.start/end, to scope the local variables of the inlinee to the code representing the original function. llvm-svn: 131838	2011-05-22 05:22:10 +00:00
Eli Friedman	3de2ddc578	PR7952: Make isa<> use the same logic as cast<>, so that they both work consistently. llvm-svn: 131803	2011-05-21 19:13:10 +00:00
Benjamin Kramer	fda5dc4968	Revert "InstCombine: Turn mul.with.overflow(X, 2) into the cheaper add.with.overflow(X, X)" It's better to do this in codegen, mul.with.overflow(X, 2) is more canonical because it has only one use on "X". llvm-svn: 131798	2011-05-21 18:31:42 +00:00
Benjamin Kramer	691731eb9c	InstCombine: Turn mul.with.overflow(X, 2) into the cheaper add.with.overflow(X, X) llvm-svn: 131789	2011-05-21 09:22:06 +00:00
Andrew Trick	f44aadf0fd	indvars: Prototyping Sign/ZeroExtend elimination without canonical IVs. No functionality enabled by default. Use -disable-iv-rewrite. Extended IVUsers to keep track of the phi that represents the users' IV. Added the WidenIV transform to replace a narrow IV with a wide IV by doing a one-for-one replacement of IV users instead of expanding the SCEV expressions. [sz]exts are removed and truncs are inserted. llvm-svn: 131744	2011-05-20 18:25:42 +00:00
Andrew Trick	b75279cbbd	indvars: minor cleanup in preparation for sign/zero extend elimination. llvm-svn: 131716	2011-05-20 03:37:48 +00:00
Evan Cheng	e8d2e9eb35	Revert r131664 and fix it in instcombine instead. rdar://9467055 llvm-svn: 131708	2011-05-20 00:54:37 +00:00
Devang Patel	1407fb4bbe	Reapply r131605. This time with a fix, which is to use NoFolder. llvm-svn: 131673	2011-05-19 20:52:46 +00:00
Evan Cheng	dc867ae1fc	Add comment. llvm-svn: 131659	2011-05-19 18:18:39 +00:00
Rafael Espindola	964602d7ba	revert 131605 to fix PR9946. llvm-svn: 131620	2011-05-19 02:26:30 +00:00
Eli Friedman	6efb64ea8e	Make the demanded bits/elements optimizations preserve debug line information. I'm not sure this is quite ideal, but I can't really think of any better way to do it. llvm-svn: 131616	2011-05-19 01:20:42 +00:00
Devang Patel	3015a54813	Use IRBuilder. llvm-svn: 131609	2011-05-19 00:13:33 +00:00
Devang Patel	31458a0002	Use IRBuilder while simplifying unreachable. llvm-svn: 131607	2011-05-19 00:09:21 +00:00
Devang Patel	4b13f39b77	Use IRBuilder while simplifying conditional branch. llvm-svn: 131605	2011-05-18 23:59:51 +00:00
Eli Friedman	41e509a33d	More instcombine cleanup, towards improving debug line info. llvm-svn: 131604	2011-05-18 23:58:37 +00:00
Devang Patel	7de6c4bf75	Use IRBuilder while simplifying branch. llvm-svn: 131598	2011-05-18 23:18:47 +00:00
Eli Friedman	1754a25977	More instcombine simplifications towards better debug locations. llvm-svn: 131596	2011-05-18 23:11:30 +00:00
Devang Patel	dd14e0f7fa	Use IRBuilder while simplifying return instruction. llvm-svn: 131580	2011-05-18 21:33:11 +00:00
Dan Gohman	3268e4d692	When forming an ICmpZero LSRUse, normalize the non-IV operand of the comparison, so that the resulting expression is fully normalized. This fixes PR9939. llvm-svn: 131576	2011-05-18 21:02:18 +00:00
Devang Patel	583805530c	Spread use of IRBuilder even more. llvm-svn: 131571	2011-05-18 20:53:17 +00:00
Devang Patel	a7ec47d23c	Use IRBuilder while simplifying switch instruction. llvm-svn: 131566	2011-05-18 20:35:38 +00:00
Devang Patel	0b373dca1f	Use IRBuilder while simplifying unwind. llvm-svn: 131561	2011-05-18 20:01:18 +00:00
Eli Friedman	49346010f8	More instcombine cleanup aimed towards improving debug line info. llvm-svn: 131559	2011-05-18 19:57:14 +00:00
Devang Patel	2c2ea226b7	Use IRBuilder while simplifying terminator. llvm-svn: 131552	2011-05-18 18:43:31 +00:00
Devang Patel	767f6930bc	Use IRBuilder while simplifying unconditional branch. llvm-svn: 131551	2011-05-18 18:28:48 +00:00
Devang Patel	5c810ce4a3	Use IRBuilder while folding two entry PHINode. llvm-svn: 131548	2011-05-18 18:16:44 +00:00
Eli Friedman	2fd66441c6	Switch more inst insertion in instcombine to IRBuilder. llvm-svn: 131547	2011-05-18 18:10:28 +00:00
Devang Patel	15ad6761da	Set up IRBuilder for use during simplification. llvm-svn: 131545	2011-05-18 18:01:27 +00:00
Eli Friedman	0b43b9ee98	Switch more inst insertion in instcombine to IRBuilder. llvm-svn: 131544	2011-05-18 17:58:37 +00:00
Matt Beaumont-Gay	8fa6ebf975	fix typo llvm-svn: 131543	2011-05-18 17:37:10 +00:00
Eli Friedman	cde9c1628c	Switch inst insertion in instcombine transform to IRBuilder. llvm-svn: 131542	2011-05-18 17:31:55 +00:00
Devang Patel	1fabbe921b	Use IRBuiler while constant folding terminator. llvm-svn: 131541	2011-05-18 17:26:46 +00:00
Stuart Hastings	728f6260b9	Fix inelegant initialization. llvm-svn: 131538	2011-05-18 15:54:26 +00:00
Duncan Sands	3d9407f4eb	Revert commit 131534 since it seems to have broken several buildbots. Original log entry: Refactor getActionType and getTypeToTransformTo ; place all of the 'decision' code in one place. llvm-svn: 131536	2011-05-18 14:57:56 +00:00
Nadav Rotem	c5c27ede55	Refactor getActionType and getTypeToTransformTo ; place all of the 'decision' code in one place. llvm-svn: 131534	2011-05-18 12:26:38 +00:00
Eli Friedman	96254a0d53	Start trying to make InstCombine preserve more debug info. The idea here is to set the debug location on the IRBuilder, which will be then right location in most cases. This should magically give many transformations debug locations, and fixing places which are missing a debug location will usually just means changing the code creating it to use the IRBuilder. As an example, the change to InstCombineCalls catches a common case where a call to a bitcast of a function is rewritten. Chris, does this approach look reasonable? llvm-svn: 131516	2011-05-18 01:28:27 +00:00
Eli Friedman	b9ed18f2cb	Use ReplaceInstUsesWith instead of replaceAllUsesWith where appropriate in instcombine. llvm-svn: 131512	2011-05-18 00:32:01 +00:00
Devang Patel	b849cd511b	Preseve line numbers while simplifying CFG. llvm-svn: 131508	2011-05-17 23:29:05 +00:00
Bill Wendling	0671ba8448	Conditionalize the format of the GCOV files by target type. Darwin uses the 4.2 format. llvm-svn: 131503	2011-05-17 23:05:13 +00:00
Stuart Hastings	5bd18b6638	X86 pmovsx/pmovzx ignore the upper half of their inputs. rdar://problem/6945110 llvm-svn: 131493	2011-05-17 22:13:31 +00:00
Devang Patel	341b38c22a	Preserve line number information. llvm-svn: 131482	2011-05-17 20:00:02 +00:00
Devang Patel	c5933f2418	Set debug loc for new load instruction. llvm-svn: 131481	2011-05-17 19:43:38 +00:00
Devang Patel	c23bcbc498	Preserve line number information. llvm-svn: 131480	2011-05-17 19:43:06 +00:00
Devang Patel	a0b682db62	There is no need to force DebugLoc on a PHI at this point. llvm-svn: 131427	2011-05-16 22:05:03 +00:00
Devang Patel	8e60ff11db	Preserve debug info for unused zero extended boolean argument. Radar 9422775. llvm-svn: 131422	2011-05-16 21:24:05 +00:00
Rafael Espindola	2050af838d	Don't do tail calls in a function that call setjmp. The stack might be corrupted when setjmp returns again. llvm-svn: 131399	2011-05-16 03:05:33 +00:00
Benjamin Kramer	d96205c4e5	SimplifyCFG: Use ComputeMaskedBits to prune dead cases from switch instructions. llvm-svn: 131345	2011-05-14 15:57:25 +00:00
Stuart Hastings	66a82b966e	Avoid combining GEPs that might overflow at runtime. rdar://problem/9267970 Patch by Julien Lerouge! llvm-svn: 131339	2011-05-14 05:55:10 +00:00
Julien Lerouge	7e11f9e26d	Fix a source of non determinism in FindUsedTypes, use a SetVector instead of a set. rdar://9423996 llvm-svn: 131283	2011-05-13 05:20:42 +00:00
Andrew Trick	03957dfeb1	Convert SimplifyIVUsers into a worklist instead of a single pass over the users. llvm-svn: 131277	2011-05-13 01:12:21 +00:00
Andrew Trick	81683ed232	indvars: Added SimplifyIVUsers. Interleave IV simplifications. Currently involves EliminateComparison and EliminateRemainder. Next I'll add EliminateExtend. llvm-svn: 131210	2011-05-12 00:04:28 +00:00
Devang Patel	3fd06f760b	Preserve line number information. llvm-svn: 131112	2011-05-10 00:03:11 +00:00
Duncan Sands	a071c82900	Fix PR9820: a read-only call differs from a load in that a load doesn't return the pointer being dereferenced, it returns the pointee, but a call might return the pointer itself. llvm-svn: 130979	2011-05-06 10:30:37 +00:00
Nick Lewycky	a7028848a1	The computation of string length is not that complicated. Fix it, again. :) llvm-svn: 130967	2011-05-05 23:52:18 +00:00
Eli Friedman	8a20e66926	PR9838: Fix transform introduced in r127064 to not trigger when only one side of the icmp is an exact shift. llvm-svn: 130954	2011-05-05 21:59:18 +00:00
Nick Lewycky	4f9c367f0b	Update the gcov version used slightly, to make it stop causing modern gcov's to crash. llvm-svn: 130911	2011-05-05 02:46:38 +00:00
Nick Lewycky	baa878ce4a	Remove dead function. llvm-svn: 130903	2011-05-05 00:17:34 +00:00
Nick Lewycky	a3d5d167a8	When the path wasn't emitted by the frontend, discard any path on the source filename. llvm-svn: 130897	2011-05-05 00:03:30 +00:00
Devang Patel	ffb798c1c6	Set debug loc for new instructions. llvm-svn: 130895	2011-05-04 23:58:50 +00:00
Devang Patel	ac794d46bf	Set debug location for new PHI nodes created in exit block. llvm-svn: 130894	2011-05-04 23:58:22 +00:00
Devang Patel	306f8db721	Preserve line number information while threading jumps. llvm-svn: 130880	2011-05-04 22:48:19 +00:00
Devang Patel	c7e4fa7c19	Preserve line number info. llvm-svn: 130876	2011-05-04 21:58:58 +00:00
Devang Patel	0daa07eb90	preserve line number info. llvm-svn: 130869	2011-05-04 21:37:05 +00:00
Nick Lewycky	6d9f061a6b	Emit gcov data files to the directory specified in the metadata produced by the frontend, if applicable. llvm-svn: 130835	2011-05-04 04:03:04 +00:00
Andrew Trick	1abe296cfd	indvars: Added DisableIVRewrite and WidenIVs. This adds functionality to remove size/zero extension during indvars without generating a canonical IV and rewriting all IV users. It's disabled by default so should have no effect on codegen. Work in progress. llvm-svn: 130829	2011-05-04 02:10:13 +00:00
Andrew Trick	38c4e34abb	indvars: Added canExpandBackEdgeTakenCount. Only create a canonical IV for backedge taken count if it will actually be used by LinearFunctionTestReplace. And some related cleanup, preparing to reduce dependence on canonical IVs. No significant effect on x86 or arm in the test-suite. llvm-svn: 130799	2011-05-03 22:24:10 +00:00
Benjamin Kramer	9c373c1c7a	Remove unused variables caught by GCC's -Wunused-but-set-variable. llvm-svn: 130755	2011-05-03 16:00:27 +00:00
Dan Gohman	6136e94897	Add an unfolded offset field to LSR's Formula record. This is used to model constants which can be added to base registers via add-immediate instructions which don't require an additional register to materialize the immediate. llvm-svn: 130743	2011-05-03 00:46:49 +00:00
Devang Patel	bb35e8ba88	Scanning entire basic block may be too expensive in terms of compile time. Instead, just use whatever location info first non-phi instruction has. llvm-svn: 130729	2011-05-02 21:57:00 +00:00
Duncan Sands	6b699f863f	Remove unused variable. llvm-svn: 130705	2011-05-02 18:41:29 +00:00
Duncan Sands	a3e3699c88	Move some rem transforms out of instcombine and into instsimplify. This automagically provides a transform noticed by my super-optimizer as occurring quite often: "rem x, (select cond, x, 1)" -> 0. llvm-svn: 130694	2011-05-02 16:27:02 +00:00
Chris Lattner	23f61a09af	enhance memcpyopt to obey -fno-builtin and friends. This addresses a problem reported on cfe-dev. llvm-svn: 130661	2011-05-01 18:27:11 +00:00
Benjamin Kramer	9aa91b1f4e	InstCombine: Turn (zext A) udiv (zext B) into (zext (A udiv B)). Same for urem or constant B. This obviously helps a lot if the division would be turned into a libcall (think i64 udiv on i386), but div is also one of the few remaining instructions on modern CPUs that become more expensive when the bitwidth gets bigger. This also helps register pressure on i386 when dividing chars, divb needs two 8-bit parts of a 16 bit register as input where divl uses two registers. int foo(unsigned char a) { return a/10; } int bar(unsigned char a, unsigned char b) { return a/b; } compiles into (x86_64) _foo: imull $205, %edi, %eax shrl $11, %eax ret _bar: movzbl %dil, %eax divb %sil, %al movzbl %al, %eax ret llvm-svn: 130615	2011-04-30 18:16:07 +00:00
Benjamin Kramer	57b3df59b9	Use SimplifyDemandedBits on div instructions. This folds away silly stuff like (a&255)/1000 -> 0. llvm-svn: 130614	2011-04-30 18:16:00 +00:00
Devang Patel	a8e7411c74	Assing line number info to new PHIs created by SSA updater. llvm-svn: 130551	2011-04-29 22:28:59 +00:00
Devang Patel	c1f7c1d469	Preserve line number information. llvm-svn: 130536	2011-04-29 20:38:55 +00:00
Peter Collingbourne	616044acd5	SimplifyCFG: Expose phi node folding cost threshold as command line parameter llvm-svn: 130528	2011-04-29 18:47:38 +00:00
Peter Collingbourne	e3511e15e0	SimplifyCFG: Add CostRemaining parameter to DominatesMergePoint llvm-svn: 130527	2011-04-29 18:47:31 +00:00
Peter Collingbourne	61f6602acd	SimplifyCFG: Add Trunc, ZExt and SExt to the list of cheap instructions for phi node folding llvm-svn: 130526	2011-04-29 18:47:25 +00:00
Benjamin Kramer	f0e3f04470	Balance parentheses. llvm-svn: 130489	2011-04-29 08:41:23 +00:00
Benjamin Kramer	16f18ed7b5	InstCombine: turn (C1 << A) << C2) into (C1 << C2) << A) Fixes PR9809. llvm-svn: 130485	2011-04-29 08:15:41 +00:00
Devang Patel	80d1d3aaec	Preserve line number information. llvm-svn: 130450	2011-04-28 22:48:14 +00:00
Benjamin Kramer	cf9d1ad62e	We require threse bits to be zero, too. This shouldn't happen in practice because the icmp would be a constant. Add a check so we don't miscompile code if something goes wrong. llvm-svn: 130446	2011-04-28 21:38:51 +00:00
Nick Lewycky	6aa79492a5	Only read predecessor once so as to fix a theoretical issue where it changes between two reads (threading). Fix an off-by-one in the indirect counter table that I meant to revert after an earlier experiment. Whoops! Implement GCOV_PREFIX. Doesn't handle GCOV_PREFIX_STRIP yet. Fix an off-by-one in string emission. Extra whoops! Tolerate DISubprograms that have null Function's attached to them. I don't yet understand what this means, but it happens when you have a global static with a non-trivial constructor/destructor. Fix a crash on switch statements with a single successor (default-only). llvm-svn: 130443	2011-04-28 21:35:49 +00:00
Devang Patel	72aa1a8a68	Remove DbgDeclare only if all uses are converted. llvm-svn: 130431	2011-04-28 20:32:02 +00:00
Benjamin Kramer	101720fb58	Fix a comment. llvm-svn: 130428	2011-04-28 20:09:57 +00:00
Chris Lattner	a5452c0d67	improve comment. llvm-svn: 130426	2011-04-28 20:02:57 +00:00
Devang Patel	33d87d97f6	Do not lose line number info while eliminating tail call. llvm-svn: 130419	2011-04-28 18:43:39 +00:00
Chris Lattner	1777601a74	final step needed to resolve PR6627, which allows us to flatten the code down to a nice and tidy: %x1 = load i32* %0, align 4 %1 = icmp eq i32 %x1, 1179403647 br i1 %1, label %if.then, label %if.end instead of doing lots of loads and branches. May the FreeBSD bootloader long fit in its allocated space. llvm-svn: 130416	2011-04-28 18:15:47 +00:00
Chris Lattner	45e393fc9c	code cleanups only. llvm-svn: 130414	2011-04-28 18:08:21 +00:00
Andrew Trick	c4456ae6ec	Reapply r130340: Fix for PR9730. llvm-svn: 130408	2011-04-28 17:30:04 +00:00
Benjamin Kramer	4145c0d3b1	InstCombine: Merge "(trunc x) == C1 & (and x, CA) == C2" into a single and+icmp. This happens when GVN widens loads. Part of PR6627. llvm-svn: 130405	2011-04-28 16:58:40 +00:00
Chris Lattner	f81f789b6c	centralize "marking for deletion" into a helper function. Pass GVN around to static functions instead of passing around tons of random ivars. llvm-svn: 130403	2011-04-28 16:36:48 +00:00
Chris Lattner	6cec6ab275	Promote toErase to be an ivar of the GVN class. llvm-svn: 130401	2011-04-28 16:18:52 +00:00
Chris Lattner	827a270a2a	teach GVN to widen integer loads when they are overaligned, when doing an wider load would allow elimination of subsequent loads, and when the wider load is still a native integer type. This eliminates a ton of loads on various benchmarks involving struct fields, though it is somewhat hobbled by clang not being very aggressive about field alignment. This is yet another step along the way towards resolving PR6627. llvm-svn: 130390	2011-04-28 07:29:08 +00:00
Andrew Trick	1e34241abd	Reverting r130340 in the unlikely event that it's responsible for a llvm-gcc stage2 compiler error. llvm-svn: 130350	2011-04-28 00:13:59 +00:00
Andrew Trick	29ac7b8858	Fixes PR9730: indvars: An asserting value handle still pointed to this value Modified LinearFunctionTestReplace to push the condition on the dead list instead of eagerly deleting it. This can cause unnecessary IV rewrites, which should have no effect on codegen and will not be an issue once we stop generating canonical IVs. llvm-svn: 130340	2011-04-27 23:00:03 +00:00
Devang Patel	12bf0ab4b5	Simplify cfg inserts a call to trap when unreachable code is detected. Assign DebugLoc to this new trap instruction. llvm-svn: 130315	2011-04-27 17:59:27 +00:00
Duncan Sands	085ad3b81a	Stop trying to have instcombine preserve LCSSA form: this was not effective in avoiding recomputation of LCSSA form; the widespread use of instsimplify (which looks through phi nodes) means it was not preserving LCSSA form anyway; and instcombine is no longer scheduled in the middle of the loop passes so this doesn't matter anymore. llvm-svn: 130301	2011-04-27 10:55:12 +00:00
Chris Lattner	1b06c71668	Transform: "icmp eq (trunc (lshr(X, cst1)), cst" to "icmp (and X, mask), cst" when X has multiple uses. This is useful for exposing secondary optimizations, but the X86 backend isn't ready for this when X has a single use. For example, this can disable load folding. This is inching towards resolving PR6627. llvm-svn: 130238	2011-04-26 20:18:20 +00:00
Chris Lattner	31b106d7dd	some random cleanups, no functionality change. llvm-svn: 130237	2011-04-26 20:02:45 +00:00
Chris Lattner	eb045f9c02	Improve the bail-out predicate to really only kick in when phi translation fails. We were bailing out in some cases that would cause us to miss GVN'ing some non-local cases away. llvm-svn: 130206	2011-04-26 17:41:02 +00:00
Nick Lewycky	c58d293a6f	Rename everything to follow LLVM style ... I think. Add support for switch and indirectbr edges. This works by densely numbering all blocks which have such terminators, and then separately numbering the possible successors. The predecessors write down a number, the successor knows its own number (as a ConstantInt) and sends that and the pointer to the number the predecessor wrote down to the runtime, who looks up the counter in a per-function table. Coverage data should now be functional, but I haven't tested it on anything other than my 2-file synthetic test program for coverage. llvm-svn: 130186	2011-04-26 03:54:16 +00:00
Chris Lattner	6f83d06ffa	Enhance MemDep: When alias analysis returns a partial alias result, return it as a clobber. This allows GVN to do smart things. Enhance GVN to be smart about the case when a small load is clobbered by a larger overlapping load. In this case, forward the value. This allows us to compile stuff like this: int test(void P) { int tmp = (unsigned int)P; return tmp+((unsigned char*)P+1); } into: _test: ## @test movl (%rdi), %ecx movzbl %ch, %eax addl %ecx, %eax ret which has one load. We already handled the case where the smaller load was from a must-aliased base pointer. llvm-svn: 130180	2011-04-26 01:21:15 +00:00
Jay Foad	1a180156b6	Remove unused STL header includes. llvm-svn: 130068	2011-04-23 19:53:52 +00:00
Jay Foad	5514afe6b2	PR9214: Convert Metadata API to use ArrayRef. llvm-svn: 129932	2011-04-21 19:59:31 +00:00
Nick Lewycky	8411b5511e	In gcov profiling, give all functions an extra unified return block. This is necessary since gcov counts transitions between blocks. It can't see if you've run every line in a straight-line function, so we add an edge for it to notice. llvm-svn: 129905	2011-04-21 03:18:00 +00:00
Nick Lewycky	ed749d8c94	Fix think-o: emit all 8 bytes of the EOF marker. Also reflow a line in a comment for 80 columns. llvm-svn: 129904	2011-04-21 02:48:39 +00:00
Nick Lewycky	8e0a38f88a	Add independent controls for whether GCOV profiling should emit .gcno files or instrument the program to emit .gcda. TODO: we should emit slightly different .gcda files when .gcno emission is off. llvm-svn: 129903	2011-04-21 01:56:25 +00:00
Cameron Zwarich	ca4c633489	Fix another case of <rdar://problem/9184212> that only occurs with code generated by llvm-gcc, since llvm-gcc uses 2 i64s for passing a 4 x float vector on ARM rather than an i64 array like Clang. llvm-svn: 129878	2011-04-20 21:48:38 +00:00
Cameron Zwarich	76dfa226cf	The bitcast case here is actually handled uniformly earlier in the function, so delete it. llvm-svn: 129877	2011-04-20 21:48:34 +00:00
Cameron Zwarich	4cd9a4a975	Cleanup some code to better use an early return style in preparation for adding more cases. llvm-svn: 129876	2011-04-20 21:48:16 +00:00
Jay Foad	6a85be25a4	Trivial simplification. llvm-svn: 129759	2011-04-19 15:23:29 +00:00
Chandler Carruth	2b1ba48f8d	Mark some functions as used which are used within debug-only code. This silences Clang's -Wunused-function when building in release mode. llvm-svn: 129709	2011-04-18 18:49:44 +00:00
Frits van Bommel	d6d4f987b4	Rename a misleadingly-named variable. llvm-svn: 129644	2011-04-16 14:32:34 +00:00
Jay Foad	7d03e9be47	Fix bug when checking phi operands in InstCombiner::visitPHINode(), found by code inspection. llvm-svn: 129641	2011-04-16 14:17:37 +00:00
Rafael Espindola	c715e724de	Fix cmake build. llvm-svn: 129632	2011-04-16 02:06:46 +00:00
Nick Lewycky	c5ea8528cc	Move the re-stemming function up top and use it where it's currently inlined. Break the arc-profile code out to a function like the notes emission code is, and reorder the functions in the file. The only functionality change is that we no longer modify the Module when the Module has no debug info to use. llvm-svn: 129631	2011-04-16 02:05:18 +00:00
Nick Lewycky	966edd068f	Rename LineProfiling to GCOVProfiling to more accurately represent what it does. Also mostly implement it. Still a work-in-progress, but generates legal output on crafted test cases. llvm-svn: 129630	2011-04-16 01:20:23 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Eli Friedman	2395626605	Add an instcombine for constructs like a \| -(b != c); a select is more canonical, and generally leads to better code. Found while looking at an article about saturating arithmetic. llvm-svn: 129545	2011-04-14 22:41:27 +00:00
Owen Anderson	92651ec374	Fix an infinite alternation in JumpThreading where two transforms would repeatedly undo each other. The solution is to perform more aggressive constant folding to make one of the edges just folded away rather than trying to thread it. Fixes <rdar://problem/9284786>. Discovered with CSmith. llvm-svn: 129538	2011-04-14 21:35:50 +00:00
Mon P Wang	1cde91674a	Cleanup r129509 based on comments by Chris llvm-svn: 129532	2011-04-14 19:20:42 +00:00
Mon P Wang	0f6bad7b6e	Cleanup r129472 by using a utility routine as suggested by Eli. llvm-svn: 129509	2011-04-14 08:04:01 +00:00
Chris Lattner	fba5cdfce1	rework FoldBranchToCommonDest to exit earlier when there is a bonus instruction around, reducing work. Greatly simplify handling of debug instructions. There is no need to build up a vector of them and then move them into the one predecessor if we're processing a block. Instead just rescan the block and copy them into the pred. If a block gets merged into multiple preds, this will retain more debug info. llvm-svn: 129502	2011-04-14 02:44:53 +00:00
Chris Lattner	35a65b2aa6	fix a couple -Wsign-compare warnings. llvm-svn: 129501	2011-04-14 02:27:25 +00:00
Mon P Wang	2e5528f0b2	Vectors with different number of elements of the same element type can have the same allocation size but different primitive sizes(e.g., <3xi32> and <4xi32>). When ScalarRepl promotes them, it can't use a bit cast but should use a shuffle vector instead. llvm-svn: 129472	2011-04-13 21:40:02 +00:00
Junjie Gu	377cc31a74	Fixed the revision 129449. llvm-svn: 129450	2011-04-13 16:45:49 +00:00
Junjie Gu	7c3b4593b5	Passing unroll parameters (unroll-count, threshold, and partial unroll) via LoopUnroll class's ctor. Doing so will allow multiple context with different loop unroll parameters to run. This is a minor change and no effect on existing application. llvm-svn: 129449	2011-04-13 16:15:29 +00:00
Rafael Espindola	6aafb64daf	Add the alias analysis to the C api. llvm-svn: 129447	2011-04-13 15:44:58 +00:00
Bill Wendling	b902f1dd88	Reapply r129401 with patch for clang. llvm-svn: 129419	2011-04-13 00:36:11 +00:00
Bill Wendling	dbfde42468	Revert r129401 for now. Clang is using the old way of doing things. llvm-svn: 129403	2011-04-12 22:59:27 +00:00
Bill Wendling	47c24875a1	Remove the unaligned load intrinsics in favor of using native unaligned loads. Now that we have a first-class way to represent unaligned loads, the unaligned load intrinsics are superfluous. First part of <rdar://problem/8460511>. llvm-svn: 129401	2011-04-12 22:46:31 +00:00
NAKAMURA Takumi	3f28443a07	lib/Transforms/Instrumentation/CMakeLists.txt: Add LineProfiling.cpp to fix up r129340. llvm-svn: 129343	2011-04-12 01:54:40 +00:00
Nick Lewycky	9d60e373cf	Add support for line profiling. Very work-in-progress. Use debug info in the IR to find the directory/file:line:col. Each time that location changes, bump a counter. Unlike the existing profiling system, we don't try to look at argv[], and thusly don't require main() to be present in the IR. This matches GCC's technique where you specify the profiling flag when producing each .o file. The runtime library is minimal, currently just calling printf at program shutdown time. The API is designed to make it possible to emit GCOV data later on. llvm-svn: 129340	2011-04-12 01:06:09 +00:00
Nick Lewycky	fbc5a4004c	Consider ConstantAggregateZero as well as ConstantArray/Struct. llvm-svn: 129338	2011-04-12 01:02:45 +00:00
Dan Gohman	1c6c34834b	Fix reassociate to use a worklist instead of recursing when new reassociation opportunities are exposed. This fixes a bug where the nested reassociation expects to be the IR to be consistent, but it isn't, because the outer reassociation has disconnected some of the operands. rdar://9167457 llvm-svn: 129324	2011-04-12 00:11:56 +00:00
Chris Lattner	7d4cdae564	comment cleanup, use moveBefore instead of removeFromParent+insertBefore. llvm-svn: 129319	2011-04-11 23:24:57 +00:00
Chris Lattner	e81d045d94	remove the StructRetPromotion pass. It is unused, not maintained and has some bugs. If this is interesting functionality, it should be reimplemented in the argpromotion pass. llvm-svn: 129314	2011-04-11 23:09:44 +00:00
Nick Lewycky	0f85789800	Just because a GlobalVariable's initializer is [N x { i32, void ()* }] doesn't mean that it has to be ConstantArray of ConstantStruct. We might have ConstantAggregateZero, at either level, so don't crash on that. Also, semi-deprecate the sentinal value. The linker isn't aware of sentinals so we end up with the two lists appended, each with their "sentinals" on them. Different parts of LLVM treated sentinals differently, so make them all just ignore the single entry and continue on with the rest of the list. llvm-svn: 129307	2011-04-11 22:11:20 +00:00
Jay Foad	7c14a558fe	Don't include Operator.h from InstrTypes.h. llvm-svn: 129271	2011-04-11 09:35:34 +00:00
Eli Friedman	9cca0715aa	Add back a couple checks removed by r129128; the fact that an intitializer is an array of structures doesn't imply it's a ConstantArray of ConstantStruct. llvm-svn: 129207	2011-04-09 09:11:09 +00:00
Chris Lattner	88974f4625	fix PR9523, a crash in looprotate on a non-canonical loop made out of indirectbr. llvm-svn: 129203	2011-04-09 07:25:58 +00:00
Chris Lattner	af1bccec68	Fix a bug where RecursivelyDeleteTriviallyDeadInstructions could delete the instruction pointed to by CGP's current instruction iterator, leading to a crash on the testcase. This fixes PR9578. llvm-svn: 129200	2011-04-09 07:05:44 +00:00
Nick Lewycky	bd10af96bd	Add a function for profiling to run at shutdown. Unlike the existing API, this can be used even when main() isn't present in the Module, but it means that you don't get to read argv[]. llvm-svn: 129163	2011-04-08 22:19:52 +00:00
Nick Lewycky	466d0c1f93	llvm.global_[cd]tor is defined to be either external, or appending with an array of { i32, void ()* }. Teach the verifier to verify that, deleting copies of checks strewn about. llvm-svn: 129128	2011-04-08 07:30:21 +00:00
Devang Patel	bc3d8b212f	Do not let debug info interfer with branch folding. llvm-svn: 129114	2011-04-07 23:11:25 +00:00
Rafael Espindola	e4e4e37580	Expose more passes to the C API. llvm-svn: 129087	2011-04-07 18:20:46 +00:00
Devang Patel	197c35298a	While hoisting common code from if/else, hoist debug info intrinsics if they match. llvm-svn: 129078	2011-04-07 17:27:36 +00:00
Eli Friedman	c5f22a7815	PR9634: Don't unconditionally tell the AliasSetTracker that the PreheaderLoad is equivalent to any other relevant value; it isn't true in general. If it is equivalent, the LoopPromoter will tell the AST the equivalence. Also, delete the PreheaderLoad if it is unused. Chris, since you were the last one to make major changes here, can you check that this is sane? llvm-svn: 129049	2011-04-07 01:35:06 +00:00
Devang Patel	e48ddf863b	Simplify. isIdenticalToWhenDefined() checks opcode. llvm-svn: 129041	2011-04-07 00:30:15 +00:00
Devang Patel	d715ec82b4	While folding branch to a common destination into a predecessor, copy dbg values also. llvm-svn: 129035	2011-04-06 22:37:20 +00:00
Nick Lewycky	ee54fa29d5	Fix typos. Adjust some whitespace for style. No functionality change. llvm-svn: 128924	2011-04-05 20:39:27 +00:00
Nadav Rotem	a069c6ce05	InstCombine optimizes gep(bitcast(x)) even when the bitcasts casts away address space info. We crash with an assert in this case. This change checks that the address space of the bitcasted pointer is the same as the gep ptr. llvm-svn: 128884	2011-04-05 14:29:52 +00:00
Jay Foad	11522097be	Remove some support for ReturnInsts with multiple operands, and for returning a scalar value in a function whose return type is a single- element structure or array. llvm-svn: 128810	2011-04-04 07:44:02 +00:00
Eli Friedman	b85c0caf7d	Attempt to fix breakage from r128782 reported by Francois Pichet on llvm-commits. (Not sure why it only breaks on Windows; maybe it has something to do with the iterator representation...) llvm-svn: 128802	2011-04-04 00:37:38 +00:00
Eli Friedman	17bf4922c9	PR9446: RecursivelyDeleteTriviallyDeadInstructions can delete the instruction after the given instruction; make sure to handle that case correctly. (It's difficult to trigger; the included testcase involves a dead block, but I don't think that's a requirement.) While I'm here, get rid of the unnecessary warning about SimplifyInstructionsInBlock, since it should work correctly as far as I know. llvm-svn: 128782	2011-04-02 22:45:17 +00:00
Benjamin Kramer	50a281a871	While SimplifyDemandedBits constant folds this, we can't rely on it here. It's possible to craft an input that hits the recursion limits in a way that SimplifyDemandedBits doesn't simplify the icmp but ComputeMaskedBits can infer which bits are zero. No test case as it depends on too many other things. Fixes PR9609. llvm-svn: 128777	2011-04-02 18:50:58 +00:00
Benjamin Kramer	8b94c295c3	Fix comment. llvm-svn: 128745	2011-04-01 22:29:18 +00:00
Benjamin Kramer	5cad45307e	Tweaks to the icmp+sext-to-shifts optimization to address Frits' comments: - Localize the check if an icmp has one use to a place where we know we're introducing something that's likely more expensive than a sext from i1. - Add an assert to make sure a case that would lead to a miscompilation is folded away earlier. - Fix a typo. llvm-svn: 128744	2011-04-01 22:22:11 +00:00
Benjamin Kramer	ac2d5657a6	Fix build. llvm-svn: 128733	2011-04-01 20:15:16 +00:00
Benjamin Kramer	d121765e64	InstCombine: Turn icmp + sext into bitwise/integer ops when the input has only one unknown bit. int test1(unsigned x) { return (x&8) ? 0 : -1; } int test3(unsigned x) { return (x&8) ? -1 : 0; } before (x86_64): _test1: andl $8, %edi cmpl $1, %edi sbbl %eax, %eax ret _test3: andl $8, %edi cmpl $1, %edi sbbl %eax, %eax notl %eax ret after: _test1: shrl $3, %edi andl $1, %edi leal -1(%rdi), %eax ret _test3: shll $28, %edi movl %edi, %eax sarl $31, %eax ret llvm-svn: 128732	2011-04-01 20:09:10 +00:00
Benjamin Kramer	398b8c5faf	InstCombine: Move (sext icmp) transforms into their own method. No intended functionality change. llvm-svn: 128731	2011-04-01 20:09:03 +00:00
Nadav Rotem	d74b72b8a9	Instcombile optimization: extractelement(cast) -> cast(extractelement) llvm-svn: 128683	2011-03-31 22:57:29 +00:00
Benjamin Kramer	5291054ef1	InstCombine: APFloat can't perform arithmetic on PPC double doubles, don't even try. Thanks Eli! llvm-svn: 128676	2011-03-31 21:35:49 +00:00
Benjamin Kramer	be209ab8a2	InstCombine: Fix transform to use the swapped predicate. Thanks Frits! llvm-svn: 128628	2011-03-31 10:46:03 +00:00
Benjamin Kramer	d159d94644	InstCombine: fold fcmp (fneg x), (fneg y) -> fcmp x, y llvm-svn: 128627	2011-03-31 10:12:22 +00:00
Benjamin Kramer	a8c5d0872d	InstCombine: fold fcmp pred (fneg x), C -> fcmp swap(pred) x, -C llvm-svn: 128626	2011-03-31 10:12:15 +00:00
Benjamin Kramer	cbb18e91a8	InstCombine: Shrink "fcmp (fpext x), C" to "fcmp x, C" if C can be losslessly converted to the type of x. Fixes PR9592. llvm-svn: 128625	2011-03-31 10:12:07 +00:00
Benjamin Kramer	2ccfbc8b71	InstCombine: fold fcmp (fpext x), (fpext y) -> fcmp x, y. llvm-svn: 128624	2011-03-31 10:11:58 +00:00
Bill Wendling	5034159c5f	* The DSE code that tested for overlapping needed to take into account the fact that one of the numbers is signed while the other is unsigned. This could lead to a wrong result when the signed was promoted to an unsigned int. * Add the data layout line to the testcase so that it will test the appropriate thing. Patch by David Terei! llvm-svn: 128577	2011-03-30 21:37:19 +00:00
Benjamin Kramer	8564e0de96	InstCombine: If the divisor of an fdiv has an exact inverse, turn it into an fmul. Fixes PR9587. llvm-svn: 128546	2011-03-30 15:42:35 +00:00
Jay Foad	52131344a2	Remove PHINode::reserveOperandSpace(). Instead, add a parameter to PHINode::Create() giving the (known or expected) number of operands. llvm-svn: 128537	2011-03-30 11:28:46 +00:00
Jay Foad	e0938d8a87	(Almost) always call reserveOperandSpace() on newly created PHINodes. llvm-svn: 128535	2011-03-30 11:19:20 +00:00
Benjamin Kramer	272f2b0044	InstCombine: Add a few missing combines for ANDs and ORs of sign bit tests. On x86 we now compile "if (a < 0 && b < 0)" into testl %edi, %esi js IF.THEN llvm-svn: 128496	2011-03-29 22:06:41 +00:00
Benjamin Kramer	e41395ac24	DSE: Remove an early exit optimization that depended on the ordering of a SmallPtrSet. Fixes PR9569 and will hopefully make selfhost on ASLR-enabled systems more deterministic. llvm-svn: 128482	2011-03-29 20:28:57 +00:00
Cameron Zwarich	ff811cc475	Do some simple copy propagation through integer loads and stores when promoting vector types. This helps a lot with inlined functions when using the ARM soft float ABI. Fixes <rdar://problem/9184212>. llvm-svn: 128453	2011-03-29 05:19:52 +00:00
Nick Lewycky	ebc2f3a68c	Remove tabs I accidentally added. llvm-svn: 128413	2011-03-28 17:48:26 +00:00
Jay Foad	1c83965f5a	Make more use of PHINode::getNumIncomingValues(). llvm-svn: 128406	2011-03-28 13:03:10 +00:00
Frits van Bommel	d14d991bf7	Add some debug output when -instcombine uses RAUW. This can make debug output for those cases much clearer since without this it only showed that the original instruction was removed, not what it was replaced with. llvm-svn: 128399	2011-03-27 23:32:31 +00:00
Nick Lewycky	8544228d5a	Teach the transformation that moves binary operators around selects to preserve the subclass optional data. llvm-svn: 128388	2011-03-27 19:51:23 +00:00
Benjamin Kramer	1f90da127f	Use APInt's umul_ov instead of rolling our own overflow detection. llvm-svn: 128380	2011-03-27 15:04:38 +00:00
Nick Lewycky	83167df787	Add a small missed optimization: turn X == C ? X : Y into X == C ? C : Y. This removes one use of X which helps it pass the many hasOneUse() checks. In my analysis, this turns up very often where X = A >>exact B and that can't be simplified unless X has one use (except by increasing the lifetime of A which is generally a performance loss). llvm-svn: 128373	2011-03-27 07:30:57 +00:00
Bill Wendling	b5139920d6	Simplification noticed by Frits. llvm-svn: 128333	2011-03-26 09:32:07 +00:00
Bill Wendling	19f33b9393	Rework the logic that determines if a store completely overlaps an ealier store. There are two ways that a later store can comletely overlap a previous store: 1. They both start at the same offset, but the earlier store's size is <= the later's size, or 2. The earlier store's offset is > the later's offset, but it's offset + size doesn't extend past the later's offset + size. llvm-svn: 128332	2011-03-26 08:02:59 +00:00
Cameron Zwarich	d4174ee43e	Fix a typo and add a test. llvm-svn: 128331	2011-03-26 04:58:50 +00:00
Bill Wendling	db40b5c899	PR9561: A store with a negative offset (via GEP) could erroniously say that it completely overlaps a previous store, thus mistakenly deleting that store. Check for this condition. llvm-svn: 128319	2011-03-26 01:20:37 +00:00
Nick Lewycky	0e25c8b364	No functionality change, just adjust some whitespace for coding style compliance. llvm-svn: 128257	2011-03-25 06:05:50 +00:00
Cameron Zwarich	74157ab3e5	Debug intrinsics must be skipped at the beginning and ends of blocks, lest they affect the generated code. llvm-svn: 128217	2011-03-24 16:34:59 +00:00
Cameron Zwarich	2edfe778ec	It is enough for the CallInst to have no uses to be made a tail call with a ret void; it doesn't need to have a void type. llvm-svn: 128212	2011-03-24 15:54:11 +00:00
Devang Patel	8f606d7b9b	s/UpdateDT/ModifiedDT/g llvm-svn: 128211	2011-03-24 15:35:25 +00:00
Cameron Zwarich	4649f17db1	Do early taildup of ret in CodeGenPrepare for potential tail calls that have a void return type. This fixes PR9487. llvm-svn: 128197	2011-03-24 04:52:10 +00:00
Cameron Zwarich	0e331c05ae	Use an early return instead of a long if block. llvm-svn: 128196	2011-03-24 04:52:07 +00:00
Cameron Zwarich	dd84bcce8f	When UpdateDT is set, DT is invalid, which could cause problems when trying to use it later. I couldn't make a test that hits this with the current code. llvm-svn: 128195	2011-03-24 04:52:04 +00:00
Cameron Zwarich	47e7175fe9	Check for TLI so that -codegenprepare can be used from opt. llvm-svn: 128194	2011-03-24 04:51:51 +00:00
Cameron Zwarich	10ebc189ee	Fix PR9464 by correcting some math that just happened to be right in most cases that were hit in practice. llvm-svn: 128146	2011-03-23 05:25:55 +00:00
Anders Carlsson	1cc8073bb3	Handle another case that Frits suggested. llvm-svn: 128068	2011-03-22 03:21:01 +00:00
Devang Patel	17bbd7f495	Simplify. llvm-svn: 128030	2011-03-21 22:04:45 +00:00
Anders Carlsson	4dd420f193	More cleanups to the OptimizeEmptyGlobalCXXDtors GlobalOpt function. llvm-svn: 127997	2011-03-21 14:54:40 +00:00
Anders Carlsson	701822a48e	As suggested by Nick Lewycky, ignore debugging intrinsics when trying to decide whether a destructor is empty or not. llvm-svn: 127985	2011-03-21 02:42:27 +00:00
Nick Lewycky	d078183725	Fix comments llvm-svn: 127984	2011-03-21 02:26:01 +00:00
Evan Cheng	0663f23bd8	Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified. llvm-svn: 127981	2011-03-21 01:19:09 +00:00
Anders Carlsson	336fd90f4d	Don't try to eliminate invokes to __cxa_atexit. llvm-svn: 127976	2011-03-20 20:21:33 +00:00
Anders Carlsson	fcec2f519a	Don't segfault on mutual recursion, as pointed out by Frits. llvm-svn: 127975	2011-03-20 20:16:43 +00:00
Anders Carlsson	48a44911d3	Address comments from Frits van Bommel. llvm-svn: 127974	2011-03-20 19:51:13 +00:00
Anders Carlsson	ee6bc70d2f	Add an optimization to GlobalOpt that eliminates calls to __cxa_atexit, if the function passed is empty. llvm-svn: 127970	2011-03-20 17:59:11 +00:00
Daniel Dunbar	327cd36f74	Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR", it broke a lot of things. llvm-svn: 127954	2011-03-19 21:47:14 +00:00
Evan Cheng	824a711305	SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this: int f1(void); int f2(void); int f3(void); int f4(void); int f5(void); int f6(void); int foo(int x) { switch(x) { case 1: return f1(); case 2: return f2(); case 3: return f3(); case 4: return f4(); case 5: return f5(); case 6: return f6(); } } => LBB0_2: ## %sw.bb callq _f1 popq %rbp ret LBB0_3: ## %sw.bb1 callq _f2 popq %rbp ret LBB0_4: ## %sw.bb3 callq _f3 popq %rbp ret This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch: sw.bb7: ; preds = %entry %call8 = tail call i32 @f5() nounwind br label %return sw.bb9: ; preds = %entry %call10 = tail call i32 @f6() nounwind br label %return return: %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ] ret i32 %retval.0 This allows codegen to generate better code like this: LBB0_2: ## %sw.bb jmp _f1 ## TAILCALL LBB0_3: ## %sw.bb1 jmp _f2 ## TAILCALL LBB0_4: ## %sw.bb3 jmp _f3 ## TAILCALL rdar://9147433 llvm-svn: 127953	2011-03-19 17:17:39 +00:00
Devang Patel	2c7ee2700c	If an AllocaInst referred by DbgDeclareInst is used by a LoadInst then the LoadInst should also get a corresponding llvm.dbg.value intrinsic. llvm-svn: 127924	2011-03-18 23:45:43 +00:00
Devang Patel	3ac171d49a	Remove dead code. llvm-svn: 127923	2011-03-18 23:33:58 +00:00
Devang Patel	c1431e6e84	Consider debug info intrinsics pointing to null value as dead instructions. llvm-svn: 127922	2011-03-18 23:28:02 +00:00
Andrew Trick	f8f67f0188	Remove TargetData and ValueTracking includes. I didn't mean for them to sneak in my last checkin. llvm-svn: 127842	2011-03-18 00:36:39 +00:00
Andrew Trick	87716c93c2	Added isValidRewrite() to check the result of ScalarEvolutionExpander. SCEV may generate expressions composed of multiple pointers, which can lead to invalid GEP expansion. Until we can teach SCEV to follow strict pointer rules, make sure no bad GEPs creep into IR. Fixes rdar://problem/9038671. llvm-svn: 127839	2011-03-17 23:51:11 +00:00
Andrew Trick	e44f0d94f6	whitespace llvm-svn: 127837	2011-03-17 23:46:48 +00:00
Devang Patel	aad34d882d	Try to not lose variable's debug info during instcombine. This is done by lowering dbg.declare intrinsic into dbg.value intrinsic. Radar 9143931. llvm-svn: 127834	2011-03-17 22:18:16 +00:00
Devang Patel	8c0b16b0aa	Refactor into a separate utility function. llvm-svn: 127832	2011-03-17 21:58:19 +00:00
Cameron Zwarich	7599b106b7	Fix a comment. llvm-svn: 127728	2011-03-16 08:13:42 +00:00
Cameron Zwarich	0454253d7a	Only convert allocas to scalars if it is profitable. The profitability metric I chose is having a non-memcpy/memset use and being larger than any native integer type. Originally I chose having an access of a size smaller than the total size of the alloca, but this caused some minor issues on the spirit benchmark where SRoA runs again after some inlining. This fixes <rdar://problem/8613163>. llvm-svn: 127718	2011-03-16 00:13:44 +00:00
Cameron Zwarich	b51c830f7c	Better use initializer lists. llvm-svn: 127716	2011-03-16 00:13:37 +00:00
Cameron Zwarich	63062ccf85	Add a clarifying comment. llvm-svn: 127715	2011-03-16 00:13:35 +00:00
Cameron Zwarich	dbb27393cc	Clean up something noticed by Fritz. llvm-svn: 127684	2011-03-15 18:42:33 +00:00
Cameron Zwarich	0b8cdfb6ec	Do not add PHIs with no users when creating LCSSA form. Patch by Andrew Clinton. llvm-svn: 127674	2011-03-15 07:41:25 +00:00
Eli Friedman	c4414c6e92	PR9450: Make switch optimization in SimplifyCFG not dependent on the ordering of pointers in an std::map. llvm-svn: 127650	2011-03-15 02:23:35 +00:00
Eric Christopher	2139d3148f	If we don't know how long a string is we can't fold an _chk version to the normal version. Fixes rdar://9123638 llvm-svn: 127636	2011-03-15 00:25:41 +00:00
Andrew Trick	8b55b736b1	Added SCEV::NoWrapFlags to manage unsigned, signed, and self wrap properties. Added the self-wrap flag for SCEV::AddRecExpr. A slew of temporary FIXMEs indicate the intention of the no-self-wrap flag without changing behavior in this revision. llvm-svn: 127590	2011-03-14 16:50:06 +00:00
Andrew Trick	328b223bb1	whitespace llvm-svn: 127589	2011-03-14 16:48:10 +00:00
Jin-Gu Kang	b452db02f0	This case is solved by Scalar Replacement of Aggregates (DT) and Early CSE pass so this patch reverts it to original source code. llvm-svn: 127574	2011-03-14 01:21:00 +00:00
Jin-Gu Kang	b7538c71e1	Add comment as following: load and store reference same memory location, the memory location is represented by getelementptr with two uses (load and store) and the getelementptr's base is alloca with single use. At this point, instructions from alloca to store can be removed. (this pattern is generated when bitfield is accessed.) For example, %u = alloca %struct.test, align 4 ; [#uses=1] %0 = getelementptr inbounds %struct.test* %u, i32 0, i32 0;[#uses=2] %1 = load i8* %0, align 4 ; [#uses=1] %2 = and i8 %1, -16 ; [#uses=1] %3 = or i8 %2, 5 ; [#uses=1] store i8 %3, i8* %0, align 4 llvm-svn: 127565	2011-03-13 14:05:51 +00:00
Jin-Gu Kang	2e939f7c3c	This patch removes some of useless instructions generated by bitfield access. llvm-svn: 127539	2011-03-12 12:18:44 +00:00
Cameron Zwarich	338d362200	Roll r127459 back in: Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127498	2011-03-11 21:52:04 +00:00
Daniel Dunbar	94ccb27b43	Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often get created from the", it broke some GCC test suite tests. llvm-svn: 127477	2011-03-11 19:30:30 +00:00
Benjamin Kramer	51897bcd3e	InstCombine: Fix a thinko where transform an icmp under the assumption that it's a zero comparison when it's not. Fixes PR9454. llvm-svn: 127464	2011-03-11 11:37:40 +00:00
Cameron Zwarich	cc27b3acc4	Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127459	2011-03-11 04:54:27 +00:00
Dan Gohman	affbc66f60	RecursivelyDeleteTriviallyDeadInstructions only needs a Value, not an Instruction, so casting is not necessary. Also, it's theoretically possible that the Value is not an Instruction, since WeakVH follows RAUWs. llvm-svn: 127427	2011-03-10 20:57:44 +00:00
Dan Gohman	154ed49784	Fix reassociate to postpone certain instruction deletions until after it has finished all of its reassociations, because its habit of unlinking operands and holding them in a datastructure while working means that it's not easy to determine when an instruction is really dead until after all its regular work is done. rdar://9096268. llvm-svn: 127424	2011-03-10 19:51:54 +00:00
Benjamin Kramer	b49b964b98	InstCombine: Turn umul_with_overflow into mul nuw if we can prove that it cannot overflow. This happens a lot in clang-compiled C++ code because it adds overflow checks to operator new[]: unsigned foo(unsigned n) { return new unsigned[n]; } We can optimize away the overflow check on 64 bit targets because (uint64_t)n4 cannot overflow. llvm-svn: 127418	2011-03-10 18:40:14 +00:00
Devang Patel	13f8c7d48e	Preserve line number information while simplifying libcalls. llvm-svn: 127362	2011-03-09 21:27:52 +00:00
Devang Patel	a10794ab7b	These llvm.dbg.* constants are not used anymore. llvm-svn: 127352	2011-03-09 19:41:33 +00:00
Cameron Zwarich	19f2b3c652	Fix a crasher introduced by r127317 that is seen on the bots when using an alloca as both integer and floating-point vectors of the same size. Bugpoint is not cooperating with me, but I'll try to find a manual testcase tomorrow. llvm-svn: 127320	2011-03-09 07:34:11 +00:00
Cameron Zwarich	3b649f4d01	Add support to scalar replacement for partial vector accesses of an alloca, e.g. a union of a float, <2 x float>, and <4 x float>. This mostly comes up with the use of vector intrinsics, especially in NEON when programmers know the layout of the register file. This enables codegen to eliminate a lot of the subregister traffic it would otherwise generate. This commit only enables this for a small number of floating-point cases, but a lot more integer cases. I assume this is okay for all ports, but I did not do extensive testing of the quality of code involving i512 vectors and the like. If there is a use case where this generates worse code than before, let me know and we can scale it back. This fixes <rdar://problem/9036264>. llvm-svn: 127317	2011-03-09 05:43:05 +00:00
Cameron Zwarich	43a241fa06	Move vector type merging to a separate function in preparation for it getting more complicated. llvm-svn: 127316	2011-03-09 05:43:01 +00:00
Eli Friedman	a81a82dcaf	PR9346: Prevent SimplifyDemandedBits from incorrectly introducing INT_MIN % -1. llvm-svn: 127306	2011-03-09 01:28:35 +00:00
Eli Friedman	aac35b3fbb	PR9420; an instruction before an unreachable is guaranteed not to have any reachable uses, but there still might be uses in dead blocks. Use the standard solution of replacing all the uses with undef. This is a rare case because it's very sensitive to phase ordering in SimplifyCFG. llvm-svn: 127299	2011-03-09 00:48:33 +00:00
Devang Patel	fbb482b314	llvm.dbg.declare intrinsic does not use any llvm::Values. It's magic! llvm-svn: 127282	2011-03-08 22:12:11 +00:00
Nick Lewycky	afc8098c9e	Reorder comments to put them the right way around. llvm-svn: 127220	2011-03-08 06:29:47 +00:00
Devang Patel	97d0be8ee1	While sinking an instruction, do not lose llvm.dbg.value intrinsic. llvm-svn: 127214	2011-03-08 03:06:19 +00:00
Devang Patel	d00c628f8f	Preserve line no. info. Radar `9097659` llvm-svn: 127182	2011-03-07 22:43:45 +00:00
Nick Lewycky	e467979d0a	Add more analysis of the sign bit of an srem instruction. If the LHS is negative then the result could go either way. If it's provably positive then so is the srem. Fixes PR9343 #7! llvm-svn: 127146	2011-03-07 01:50:10 +00:00
Rafael Espindola	871cfde1c2	Don't internalize available_externally functions. We already did the right thing for variables. llvm-svn: 127138	2011-03-06 23:41:34 +00:00
Nick Lewycky	92db8e8e39	ConstantInt has some getters which return ConstantInt's or ConstantVector's of the value splatted into every element. Extend this to getTrue and getFalse which by providing new overloads that take Types that are either i1 or <N x i1>. Use it in InstCombine to add vector support to some code, fixing PR8469! llvm-svn: 127116	2011-03-06 03:36:19 +00:00
Benjamin Kramer	08c913b6e6	InstCombine: We know the number of items initially added to the worklist map, reserve space early to avoid rehashing. llvm-svn: 127089	2011-03-05 16:43:46 +00:00
Cameron Zwarich	13c885d193	Fix PR9398 - 10% of llc compile time is spent in Value::getNumUses. This reduces the percentage of time spent in CodeGenPrepare when llcing 403.gcc from 12.6% to 1.8% of total llc time. llvm-svn: 127069	2011-03-05 08:12:26 +00:00
Nick Lewycky	9719a719c7	Thread comparisons over udiv/sdiv/ashr/lshr exact and lshr nuw/nsw whenever possible. This goes into instcombine and instsimplify because instsimplify doesn't need to check hasOneUse since it returns (almost exclusively) constants. This fixes PR9343 #4 #5 and #8! llvm-svn: 127064	2011-03-05 05:19:11 +00:00
Nick Lewycky	25cc338d88	Try once again to optimize "icmp (srem X, Y), Y" by turning the comparison into true/false or "icmp slt/sge Y, 0". llvm-svn: 127063	2011-03-05 04:28:48 +00:00
Jakob Stoklund Olesen	e2017b6f2e	DenseMap<uintptr_t,...> doesn't allow all values as keys. Avoid colliding with the sentinels, hopefully unbreaking llvm-gcc-x86_64-linux-selfhost. llvm-svn: 126982	2011-03-04 02:48:56 +00:00
Richard Osborne	5003782293	Fix typo in comment. llvm-svn: 126941	2011-03-03 14:21:22 +00:00
Richard Osborne	af52c52569	Optimize fprintf -> iprintf if there are no floating point arguments and siprintf is available on the target. llvm-svn: 126940	2011-03-03 14:20:22 +00:00
Richard Osborne	2dfb888392	Optimize sprintf -> siprintf if there are no floating point arguments and siprintf is available on the target. llvm-svn: 126937	2011-03-03 14:09:28 +00:00
Richard Osborne	815de536e5	Optimize printf -> iprintf if there are no floating point arguments and iprintf is available on the target. Currently iprintf is only marked as being available on the XCore. llvm-svn: 126935	2011-03-03 13:17:51 +00:00
Cameron Zwarich	86ade9510f	Remove some more unused code that I missed. llvm-svn: 126826	2011-03-02 03:48:29 +00:00
Cameron Zwarich	5dd2aa2615	Eliminate the unused CodeGenPrepare option to split critical edges. llvm-svn: 126825	2011-03-02 03:31:46 +00:00
Cameron Zwarich	b7f8eaafa3	Stop computing the number of uses twice per value in CodeGenPrepare's sinking of addressing code. On 403.gcc this almost halves CodeGenPrepare time and reduces total llc time by 9.5%. Unfortunately, getNumUses() is still the hottest function in llc. llvm-svn: 126782	2011-03-01 21:13:53 +00:00
Anders Carlsson	da80afef99	Make InstCombiner::FoldAndOfICmps create a ConstantRange that's the intersection of the LHS and RHS ConstantRanges and return "false" when the range is empty. This simplifies some code and catches some extra cases. llvm-svn: 126744	2011-03-01 15:05:01 +00:00
Eli Friedman	683bbc16c4	Add an obvious missing safety check to DAE::RemoveDeadArgumentsFromCallers. llvm-svn: 126720	2011-03-01 00:33:47 +00:00
Ted Kremenek	20164dcc68	Unbreak CMake build. llvm-svn: 126715	2011-02-28 23:56:33 +00:00
Chris Lattner	1ac5e0c5c6	update cmake llvm-svn: 126694	2011-02-28 22:45:25 +00:00
Dan Gohman	06d70015ce	Delete the GEPSplitter experiment. llvm-svn: 126671	2011-02-28 19:47:47 +00:00
Dan Gohman	b8a25f49f3	Delete the SimplifyHalfPowrLibCalls pass, which was unused, and only existed as the result of a misunderstanding. llvm-svn: 126669	2011-02-28 19:41:14 +00:00
Frits van Bommel	8ae07996c9	Teach SimplifyCFG that (switch (select cond, X, Y)) is better expressed as a branch. Based on a patch by Alistair Lynn. llvm-svn: 126647	2011-02-28 09:44:07 +00:00
Nick Lewycky	66f4f22f7b	srem doesn't actually have the same resulting sign as its numerator, you could also have a zero when numerator = denominator. Reverts parts of r126635 and r126637. llvm-svn: 126644	2011-02-28 09:17:39 +00:00
Nick Lewycky	174a705497	Teach InstCombine to fold "(shr exact X, Y) == 0" --> X == 0, fixing #1 from PR9343. llvm-svn: 126643	2011-02-28 08:31:40 +00:00
Nick Lewycky	6b445419b0	The sign of an srem instruction is the sign of its dividend (the first argument), regardless of the divisor. Teach instcombine about this and fix test7 in PR9343! llvm-svn: 126635	2011-02-28 06:20:05 +00:00
Benjamin Kramer	ceb5daa567	Revert "SimplifyCFG: GEPs with just one non-constant index are also cheap." Yes, there are other types than i8* and GEPs on them can produce an add+multiply. We don't consider that cheap enough to be speculatively executed. llvm-svn: 126481	2011-02-25 10:33:33 +00:00
Benjamin Kramer	dfdca1a14d	SimplifyCFG: GEPs with just one non-constant index are also cheap. llvm-svn: 126452	2011-02-24 23:26:09 +00:00
Benjamin Kramer	27361a7124	SimplifyCFG: GEPs with constant indices are cheap enough to be executed unconditionally. llvm-svn: 126445	2011-02-24 22:46:11 +00:00
Devang Patel	cedf928743	Do not use DIFactory. Use DIBuilder. llvm-svn: 126398	2011-02-24 18:49:55 +00:00
Chris Lattner	eddb33ebd0	wire TargetLibraryInfo into simplify libcalls and use it in a couple of trivial places. This pass needs a lot of work. llvm-svn: 126367	2011-02-24 07:16:14 +00:00
Chris Lattner	2e56e20662	move a massive amount of code out into its own helper function to reduce nesting. This needs to be turned into a table. llvm-svn: 126366	2011-02-24 07:12:12 +00:00
Chris Lattner	adf38b3e09	change instcombine to not turn a call to non-varargs bitcast of function prototype into a call to a varargs prototype. We do allow the xform if we have a definition, but otherwise we don't want to risk that we're changing the abi in a subtle way. On X86-64, for example, varargs require passing stuff in %al. llvm-svn: 126363	2011-02-24 05:10:56 +00:00
Cameron Zwarich	826308586c	Make LoopDeletion work on loops with multiple edges, as long as the incoming values from all of the loop's exiting blocks are equal. Patch by Andrew Clinton. llvm-svn: 126253	2011-02-22 22:25:39 +00:00
Duncan Sands	ecbbf0825b	If the phi node was used by an unreachable instruction that ends up using itself without going via a phi node then we could return false here in spite of making a change. Also, tweak the comment because this method can (and always could) return true without deleting the original phi node. For example, if the phi node was used by a read-only invoke instruction which is used by another phi node phi2 which is only used by and only uses the invoke, then phi2 would be deleted but not the invoke instruction and not the original phi node. llvm-svn: 126129	2011-02-21 17:32:05 +00:00
Chris Lattner	2333ac279f	fix a crasher in disabled code (on variable stride loops) llvm-svn: 126125	2011-02-21 17:02:55 +00:00
Duncan Sands	6dcd49bc2b	Simplify RecursivelyDeleteDeadPHINode. The only functionality change should be that if the phi is used by a side-effect free instruction with no uses then the phi and the instruction now get zapped (checked by the unittest). llvm-svn: 126124	2011-02-21 16:27:36 +00:00
Chris Lattner	bc661d6686	Add some (disabled code) to print out negative strides. llvm-svn: 126102	2011-02-21 02:08:54 +00:00
Nick Lewycky	183c24c51b	Make RecursivelyDeleteDeadPHINode delete a phi node that has no users and add a test for that. With this change, test/CodeGen/X86/codegen-dce.ll no longer finds any instructions to DCE, so delete the test. Also renamed J and JP to I and IP in RecursivelyDeleteDeadPHINode. llvm-svn: 126088	2011-02-20 18:05:56 +00:00
Benjamin Kramer	5b7a4e0195	Move "A \| ~(A & ?) -> -1" from InstCombine to InstructionSimplify. llvm-svn: 126082	2011-02-20 15:20:01 +00:00
Benjamin Kramer	d5d7f37beb	InstCombine: Add a bunch of combines of the form x \| (y ^ z). We usually catch this kind of optimization through InstSimplify's distributive magic, but or doesn't distribute over xor in general. "A \| ~(A \| B) -> A \| ~B" hits 24 times on gcc.c. llvm-svn: 126081	2011-02-20 13:23:43 +00:00
Nick Lewycky	c8a1569950	Teach RecursivelyDeleteDeadPHINodes to handle multiple self-references. Patch by Andrew Clinton! llvm-svn: 126077	2011-02-20 08:38:20 +00:00
Nick Lewycky	080ea93779	Instead of keeping two Value*->id# mappings, keep one Value->Value mapping and one Value set. This is faster because we only need to use the set when there isn't already an entry in the map. No functionality change! llvm-svn: 126076	2011-02-20 08:11:03 +00:00
Eli Friedman	ef200db4fd	PR9218: SimplifyDemandedVectorElts can return a non-null value that is not the instruction passed in. Make sure to account for this correctly, instead of looping infinitely. llvm-svn: 126058	2011-02-19 22:42:40 +00:00
Chris Lattner	72a35fb974	rewrite the memset_pattern pattern generation stuff to accept any 2/4/8/16-byte constant, including globals. This makes us generate much more "pretty" pattern globals as well because it doesn't break it down to an array of bytes all the time. This enables us to handle stores of relocatable globals. This kicks in about 48 times in 254.gap, giving us stuff like this: @.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader, %struct.TypHeader)] [%struct.TypHeader (%struct.TypHeader, %struct .TypHeader)* @IsFalse, %struct.TypHeader* (%struct.TypHeader, %struct.TypHeader)* @IsFalse], align 16 ... call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader, %struct.TypHeader)] @.memset_pattern40 to i8* ), i64 %tmp75) nounwind llvm-svn: 126044	2011-02-19 19:56:44 +00:00
Chris Lattner	0f4a64011e	Implement rdar://9009151, transforming strided loop stores of unsplatable values into memset_pattern16 when it is available (recent darwins). This transforms lots of strided loop stores of ints for example, like 5 in vpr: Formed memset: call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25) from store to: {%3,+,4}<%11> at: store i32 3, i32* %scevgep, align 4, !tbaa !4 llvm-svn: 126040	2011-02-19 19:31:39 +00:00
Chris Lattner	e6b261fec5	Make loop-idiom use TargetLibraryInfo to determine whether it is allowed to hack on memset, memcpy etc. llvm-svn: 125974	2011-02-18 22:22:15 +00:00
Oscar Fuentes	5ed962656c	Move library stuff out of the toplevel CMakeLists.txt file. llvm-svn: 125968	2011-02-18 22:06:14 +00:00
Duncan Sands	84653b3674	Add some transforms of the kind X-Y>X -> 0>Y which are valid when there is no overflow. These subsume some existing equality transforms, so zap those. llvm-svn: 125843	2011-02-18 16:25:37 +00:00
Chris Lattner	1a924e770a	prevent jump threading from merging blocks when their address is taken (and used!). This prevents merging the blocks (invalidating the block addresses) in a case like this: #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; }) void foo() { printf("%p\n", _THIS_IP_); printf("%p\n", _THIS_IP_); printf("%p\n", _THIS_IP_); } which fixes PR4151. llvm-svn: 125829	2011-02-18 04:43:06 +00:00
Chris Lattner	4a14fbc50c	Don't unroll loops whose header block's address is taken. This is part of a futile attempt to not "break" bizzaro code like this: l1: printf("l1: %p\n", &&l1); ++x; if( x < 3 ) goto l1; Previously we'd fold &&l1 to 1, which is fine per our semantics but not helpful to the user. llvm-svn: 125827	2011-02-18 04:25:21 +00:00
Chris Lattner	a8fed47eed	have instcombine preserve nsw/nuw/exact when sinking common operations through a phi. llvm-svn: 125790	2011-02-17 23:01:49 +00:00
Chris Lattner	75ae5a45ff	fix typo llvm-svn: 125787	2011-02-17 22:32:54 +00:00
Chris Lattner	abb8eb2c63	fix instcombine merging GEPs through a PHI to only make the result inbounds if all of the inputs are inbounds. llvm-svn: 125785	2011-02-17 22:21:26 +00:00
Chris Lattner	d406764d52	add is always integer, thanks to Frits for noticing this. llvm-svn: 125774	2011-02-17 20:55:29 +00:00
Duncan Sands	e522001171	Transform "A + B >= A + C" into "B >= C" if the adds do not wrap. Likewise for some variations (some of these were already present so I unified the code). Spotted by my auto-simplifier as occurring a lot. llvm-svn: 125734	2011-02-17 07:46:37 +00:00
Chris Lattner	5592071768	preserve NUW/NSW when transforming add x,x llvm-svn: 125711	2011-02-17 02:23:02 +00:00
Chris Lattner	3eb0af94c4	fix PR9215, preventing -reassociate from clearing nsw/nuw when it swaps the LHS/RHS of a single binop. llvm-svn: 125700	2011-02-17 01:29:24 +00:00
Duncan Sands	75b5d27b84	Spelling fix: consequtive -> consecutive. llvm-svn: 125563	2011-02-15 09:23:02 +00:00
Nadav Rotem	67d67a0385	Fix 9216 - Endless loop in InstCombine pass. The pattern "A&(A^B) -> A & ~B" recreated itself because ~B is actually a xor -1. llvm-svn: 125557	2011-02-15 07:13:48 +00:00
Devang Patel	8d53ac81ec	Do not forget DebugLoc! llvm-svn: 125547	2011-02-15 02:02:30 +00:00
Chris Lattner	9f0ac0dd8b	tidy up a bit. llvm-svn: 125546	2011-02-15 01:56:08 +00:00
Chris Lattner	69229316aa	convert ConstantVector::get to use ArrayRef. llvm-svn: 125537	2011-02-15 00:14:00 +00:00
Devang Patel	3058398655	Do not hoist @llvm.dbg.value. Here, @llvm.dbg.value is "referring" a value that is modified inside loop. llvm-svn: 125529	2011-02-14 23:03:23 +00:00
Chris Lattner	34442e6ebf	revert my ConstantVector patch, it seems to have made the llvm-gcc builders unhappy. llvm-svn: 125504	2011-02-14 18:15:46 +00:00
Chris Lattner	d9f5b88548	Switch ConstantVector::get to use ArrayRef instead of a pointer+size idiom. Change various clients to simplify their code. llvm-svn: 125487	2011-02-14 07:55:32 +00:00
Chris Lattner	9bd7fdff58	remove a now-unneccesary cast. llvm-svn: 125464	2011-02-13 18:30:09 +00:00
Chris Lattner	43273affb9	implement instcombine folding for things like (x >> c) < 42. We were previously simplifying divisions, but not right shifts! llvm-svn: 125454	2011-02-13 08:07:21 +00:00
Chris Lattner	d369f575d7	refactor some code out into a helper method. llvm-svn: 125451	2011-02-13 07:43:07 +00:00
Daniel Dunbar	210ce0feb5	SimplifyLibCalls: Add missing legalize check on various printf to puts and putchar transforms, their return values are not compatible. llvm-svn: 125442	2011-02-12 18:19:57 +00:00
Benjamin Kramer	1800d823de	Also fold (A+B) == A -> B == 0 when the add is commuted. llvm-svn: 125411	2011-02-11 21:46:48 +00:00
Chris Lattner	d3c0e05f51	When lowering an inbounds gep, the intermediate adds can have unsigned overflow (e.g. due to a negative array index), but the scales on array size multiplications are known to not sign wrap. llvm-svn: 125409	2011-02-11 21:37:43 +00:00
Cameron Zwarich	99de19b3cb	Make LoopUnswitch preserve ScalarEvolution by just forgetting everything about a loop when unswitching it. It only does this in the complex case, because everything should be fine already in the simple case. llvm-svn: 125369	2011-02-11 06:08:28 +00:00
Cameron Zwarich	25cb63c791	LoopInstSimplify preserves ScalarEvolution. llvm-svn: 125368	2011-02-11 06:08:25 +00:00
Cameron Zwarich	97dae4d361	If we can't avoid running loop-simplify twice for now, at least avoid running iv-users twice. llvm-svn: 125318	2011-02-10 23:53:14 +00:00
Cameron Zwarich	d8e66038f4	Rename 'loopsimplify' to 'loop-simplify'. llvm-svn: 125317	2011-02-10 23:38:10 +00:00
Chris Lattner	d86ded17ad	implement the first part of PR8882: when lowering an inbounds gep to explicit addressing, we know that none of the intermediate computation overflows. This could use review: it seems that the shifts certainly wouldn't overflow, but could the intermediate adds overflow if there is a negative index? Previously the testcase would instcombine to: define i1 @test(i64 %i) { %p1.idx.mask = and i64 %i, 4611686018427387903 %cmp = icmp eq i64 %p1.idx.mask, 1000 ret i1 %cmp } now we get: define i1 @test(i64 %i) { %cmp = icmp eq i64 %i, 1000 ret i1 %cmp } llvm-svn: 125271	2011-02-10 07:11:16 +00:00
Chris Lattner	6b657aed33	Enhance a bunch of transformations in instcombine to start generating exact/nsw/nuw shifts and have instcombine infer them when it can prove that the relevant properties are true for a given shift without them. Also, a variety of refactoring to use the new patternmatch logic thrown in for good luck. I believe that this takes care of a bunch of related code quality issues attached to PR8862. llvm-svn: 125267	2011-02-10 05:36:31 +00:00
Chris Lattner	98457101fc	Enhance the "compare with shift" and "compare with div" optimizations to be much more aggressive in the face of exact/nsw/nuw div and shifts. For example, these (which are the same except the first is 'exact' sdiv: define i1 @sdiv_icmp4_exact(i64 %X) nounwind { %A = sdiv exact i64 %X, -5 ; X/-5 == 0 --> x == 0 %B = icmp eq i64 %A, 0 ret i1 %B } define i1 @sdiv_icmp4(i64 %X) nounwind { %A = sdiv i64 %X, -5 ; X/-5 == 0 --> x == 0 %B = icmp eq i64 %A, 0 ret i1 %B } compile down to: define i1 @sdiv_icmp4_exact(i64 %X) nounwind { %1 = icmp eq i64 %X, 0 ret i1 %1 } define i1 @sdiv_icmp4(i64 %X) nounwind { %X.off = add i64 %X, 4 %1 = icmp ult i64 %X.off, 9 ret i1 %1 } This happens when you do something like: (ptr1-ptr2) == 42 where the pointers are pointers to non-unit types. llvm-svn: 125266	2011-02-10 05:23:05 +00:00
Chris Lattner	dcef03fba2	more cleanups, notably bitcast isn't used for "signed to unsigned type conversions". :) llvm-svn: 125265	2011-02-10 05:17:27 +00:00
Chris Lattner	7d0e43ff8b	A bunch of cleanups and simplifications using the new PatternMatch predicates and generally tidying things up. Only very trivial functionality changes like now doing (-1 - A) -> (~A) for vectors too. InstCombineAddSub.cpp \| 296 +++++++++++++++++++++----------------------------- 1 file changed, 126 insertions(+), 170 deletions(-) llvm-svn: 125264	2011-02-10 05:14:58 +00:00
Chris Lattner	768003c59e	teach SimplifyDemandedBits that exact shifts demand the bits they are shifting out since they do require them to be zeros. Similarly for NUW/NSW bits of shl llvm-svn: 125263	2011-02-10 05:09:34 +00:00
Eric Christopher	da6bd45088	Revert this in an attempt to bring the builders back. llvm-svn: 125257	2011-02-10 01:48:24 +00:00
Cameron Zwarich	58c8670ab2	Turn this pass ordering: Natural Loop Information Loop Pass Manager Canonicalize natural loops Scalar Evolution Analysis Loop Pass Manager Induction Variable Users Canonicalize natural loops Induction Variable Users Loop Strength Reduction into this: Scalar Evolution Analysis Loop Pass Manager Canonicalize natural loops Induction Variable Users Loop Strength Reduction This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of thing automatically, but it seems easier to just change the ordering of the passes if this is the only case. llvm-svn: 125254	2011-02-10 01:07:54 +00:00
Chris Lattner	9e4aa0259f	Teach instsimplify some tricks about exact/nuw/nsw shifts. improve interfaces to instsimplify to take this info. llvm-svn: 125196	2011-02-09 17:15:04 +00:00
Chris Lattner	b940091388	Rework InstrTypes.h so to reduce the repetition around the NSW/NUW/Exact versions of creation functions. Eventually, the "insertion point" versions of these should just be removed, we do have IRBuilder afterall. Do a massive rewrite of much of pattern match. It is now shorter and less redundant and has several other widgets I will be using in other patches. Among other changes, m_Div is renamed to m_IDiv (since it only matches integer divides) and m_Shift is gone (it used to match all binops!!) and we now have m_LogicalShift for the one client to use. Enhance IRBuilder to have "isExact" arguments to things like CreateUDiv and reduce redundancy within IRbuilder by having these methods chain to each other more instead of duplicating code. llvm-svn: 125194	2011-02-09 17:00:45 +00:00
Nick Lewycky	292e78c3cd	When removing a function from the function set and adding it to deferred, we could end up removing a different function than we intended because it was functionally equivalent, then end up with a comparison of a function against itself in the next round of comparisons (the one in the function set and the one on the deferred list). To fix this, I introduce a choice in the form of comparison for ComparableFunctions, either normal or "pointer only" used to find exact Function*'s in lookups. Also add some debugging statements. llvm-svn: 125180	2011-02-09 06:32:02 +00:00
Dan Gohman	de7f699754	Don't split any loop backedges, including backedges of loops other than the active loop. This is generally desirable, and it avoids trouble in situations such as the testcase in PR9123, though the failure mode depends on use-list order, so it is infeasible to test. llvm-svn: 125065	2011-02-08 00:55:13 +00:00
Benjamin Kramer	8d6a8c130b	SimplifyCFG: Track the number of used icmps when turning a icmp chain into a switch. If we used only one icmp, don't turn it into a switch. Also prevent the switch-to-icmp transform from creating identity adds, noticed by Marius Wachtler. llvm-svn: 125056	2011-02-07 22:37:28 +00:00
Chris Lattner	35315d065b	enhance vmcore to know that udiv's can be exact, and add a trivial instcombine xform to exercise this. Nothing forms exact udivs yet though. This is progress on PR8862 llvm-svn: 124992	2011-02-06 21:44:57 +00:00
Nick Lewycky	cb1a4c26ee	Simplify away redundant test, and document what's going on. llvm-svn: 124977	2011-02-06 05:04:00 +00:00
Nick Lewycky	f8797fda44	Remove specialized comparison of InlineAsm objects. They're uniqued on creation now, and this wasn't comparing some of their relevant bits anyhow. llvm-svn: 124976	2011-02-06 04:33:50 +00:00
Benjamin Kramer	62aa46b852	SimplifyCFG: Also transform switches that represent a range comparison but are not sorted into sub+icmp. This transforms another 1000 switches in gcc.c. llvm-svn: 124826	2011-02-03 22:51:41 +00:00
Benjamin Kramer	f4ea1d5f79	SimplifyCFG: Turn switches into sub+icmp+branch if possible. This makes the job of the later optzn passes easier, allowing the vast amount of icmp transforms to chew on it. We transform 840 switches in gcc.c, leading to a 16k byte shrink of the resulting binary on i386-linux. The testcase from README.txt now compiles into decl %edi cmpl $3, %edi sbbl %eax, %eax andl $1, %eax ret llvm-svn: 124724	2011-02-02 15:56:22 +00:00
Nick Lewycky	a46c898314	Remove wasteful caching. This isn't needed for correctness because any function that might have changed been affected by a merge elsewhere will have been removed from the function set, and it isn't needed for performance because we call grow() ahead of time to prevent reallocations. llvm-svn: 124717	2011-02-02 05:31:01 +00:00
Dan Gohman	c6f0bda839	Conservatively, clear optional flags, such as nsw, when performing reassociation. No testcase, because I wasn't able to create a testcase which actually demonstrates a problem. llvm-svn: 124713	2011-02-02 02:05:46 +00:00
Dan Gohman	08d2c98c23	Fix reassociate to clear optional flags, such as nsw. llvm-svn: 124712	2011-02-02 02:02:34 +00:00
Anders Carlsson	f23a6da271	Recognize and simplify (A+B) == A -> B == 0 A == (A+B) -> B == 0 llvm-svn: 124567	2011-01-30 22:01:13 +00:00
Francois Pichet	326e4a2966	Unbreak the MSVC build. The DEBUG() call at line 606 demands to see raw_ostream's definition. I have no idea why this seems to only break MSVC. llvm-svn: 124545	2011-01-29 20:06:16 +00:00
Frits van Bommel	2a55951d08	Call SimplifyFDivInst() in InstCombiner::visitFDiv(). llvm-svn: 124535	2011-01-29 17:50:27 +00:00
Frits van Bommel	c2549661af	Move InstCombine's knowledge of fdiv to SimplifyInstruction(). llvm-svn: 124534	2011-01-29 15:26:31 +00:00
Evan Cheng	73c29178ac	Add a test for TCE return duplication. llvm-svn: 124527	2011-01-29 04:53:35 +00:00
Evan Cheng	d983eba7dc	Re-apply r124518 with fix. Watch out for invalidated iterator. llvm-svn: 124526	2011-01-29 04:46:23 +00:00
Evan Cheng	65b8ccf6ac	Revert r124518. It broke Linux self-host. llvm-svn: 124522	2011-01-29 02:43:04 +00:00
Evan Cheng	d4eff31476	Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand. llvm-svn: 124518	2011-01-29 01:29:26 +00:00
Andrew Trick	24f5ff0f23	Implementation of path profiling. Modified patch by Adam Preuss. This builds on the existing framework for block tracing, edge profiling and optimal edge profiling. See -help-hidden for new flags. For documentation, see the technical report "Implementation of Path Profiling..." in llvm.org/pubs. llvm-svn: 124515	2011-01-29 01:09:53 +00:00
Duncan Sands	771e82a863	My auto-simplifier noticed that ((X/Y)Y)/Y occurs several times in SPEC benchmarks, and that it can be simplified to X/Y. (In general you can only simplify (ZY)/Y to Z if the multiplication did not overflow; if Z has the form "X/Y" then this is the case). This patch implements that transform and moves some Div logic out of instcombine and into InstructionSimplify. Unfortunately instcombine gets in the way somewhat, since it likes to change (X/Y)Y into X-(X rem Y), so I had to teach instcombine about this too. Finally, thanks to the NSW/NUW flags, sometimes we know directly that "ZY" does not overflow, because the flag says so, so I added that logic too. This eliminates a bunch of divisions and subtractions in 447.dealII, and has good effects on some other benchmarks too. It seems to have quite an effect on tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions changed, resulting in massive changes all over. llvm-svn: 124487	2011-01-28 16:51:11 +00:00
Nick Lewycky	cfb284cf96	Rename functions to follow coding standard. Also rejiggers comments. No functionality change. llvm-svn: 124482	2011-01-28 08:43:14 +00:00
Nick Lewycky	aaf401241a	Add a doxygen comment for this class. llvm-svn: 124480	2011-01-28 08:19:00 +00:00
Nick Lewycky	564fcca856	Reorder for readability. (Chris, is this what you meant?) llvm-svn: 124479	2011-01-28 07:36:21 +00:00
Evan Cheng	aaa9606b2f	Revert r124462. There are a few big regressions that I need to fix first. llvm-svn: 124478	2011-01-28 07:12:38 +00:00
Nick Lewycky	c5eb3733f7	Reduce the number of functions we look at in the first pass, and preallocate the function equality set. llvm-svn: 124475	2011-01-28 05:48:15 +00:00
Nick Lewycky	b074e32641	Fold select + select where both selects are on the same condition. llvm-svn: 124469	2011-01-28 03:28:10 +00:00
Evan Cheng	417fca86c4	- Stop simplifycfg from duplicating "ret" instructions into unconditional branches. PR8575, rdar://5134905, rdar://8911460. - Allow codegen tail duplication to dup small return blocks after register allocation is done. llvm-svn: 124462	2011-01-28 02:19:21 +00:00
Benjamin Kramer	57e3d65884	Unbreak the build. llvm-svn: 124426	2011-01-27 20:30:54 +00:00
Nick Lewycky	e2d46d30ae	Expound upon this comparison! llvm-svn: 124406	2011-01-27 19:51:31 +00:00
Nick Lewycky	5a37e950e1	Use dyn_cast instead of isa+cast. llvm-svn: 124404	2011-01-27 19:42:43 +00:00
Nick Lewycky	13e04aef2a	Fix surprising missed optimization in mergefunc where we forgot to consider that relationships like "i8* null" is equivalent to "i32* null". llvm-svn: 124368	2011-01-27 08:38:19 +00:00
Duncan Sands	69bdb585b2	Fix PR9039, a use-after-free in reassociate. The issue was that the operand being factorized (and erased) could occur several times in Ops, resulting in freed memory being used when the next occurrence in Ops was analyzed. llvm-svn: 124287	2011-01-26 10:08:38 +00:00
Nick Lewycky	91543447a6	AttrListPtr has an overloaded operator== which does this for us, we should use it. No functionality change! llvm-svn: 124286	2011-01-26 09:23:19 +00:00
Nick Lewycky	82d4db8662	Teach mergefunc that intptr_t is the same width as a pointer. We still can't merge vector<intptr_t>::push_back() and vector<void>::push_back() because Enumerate() doesn't realize that "i64 null" and "i8** null" are equivalent. llvm-svn: 124285	2011-01-26 09:13:58 +00:00
Nick Lewycky	fb622f9920	There are no vectors of pointer or arrays, so we don't need to check vector elements for type equivalence. llvm-svn: 124284	2011-01-26 08:50:18 +00:00
Nick Lewycky	f1cec164ce	Teach mergefunc how to emit aliases safely again -- but keep it turned it off for now. It's controlled by the HasGlobalAliases variable which is not attached to any flag yet. llvm-svn: 124182	2011-01-25 08:56:50 +00:00
Dan Gohman	0f124e1987	Give GetUnderlyingObject a TargetData, to keep it in sync with BasicAA's DecomposeGEPExpression, which recently began using a TargetData. This fixes PR8968, though the testcase is awkward to reduce. Also, update several off GetUnderlyingObject's users which happen to have a TargetData handy to pass it in. llvm-svn: 124134	2011-01-24 18:53:32 +00:00
Chris Lattner	b4017769ae	fix PR9017, a bug where we'd assert when promoting in unreachable code. llvm-svn: 124100	2011-01-24 03:29:07 +00:00
Chris Lattner	23289c385a	fix PR9015, a crash linking recursive metadata. llvm-svn: 124099	2011-01-24 03:18:24 +00:00
Chris Lattner	d83e7b0ff6	enhance SRoA to promote allocas that are used by PHI nodes. This often occurs because instcombine sinks loads and inserts phis. This kicks in on such apps as 175.vpr, eon, 403.gcc, xalancbmk and a bunch of times in spec2006 in some app that uses std::deque. This resolves the last of rdar://7339113. llvm-svn: 124090	2011-01-24 01:07:11 +00:00
Chris Lattner	a960725d18	Enhance SRoA to promote allocas that are used by selects in some common cases. This triggers a surprising number of times in SPEC2K6 because min/max idioms end up doing this. For example, code from the STL ends up looking like this to SRoA: %202 = load i64* %__old_size, align 8, !tbaa !3 %203 = load i64* %__old_size, align 8, !tbaa !3 %204 = load i64* %__n, align 8, !tbaa !3 %205 = icmp ult i64 %203, %204 %storemerge.i = select i1 %205, i64* %__n, i64* %__old_size %206 = load i64* %storemerge.i, align 8, !tbaa !3 We can now promote both the __n and the __old_size allocas. This addresses another chunk of rdar://7339113, poor codegen on stringswitch. llvm-svn: 124088	2011-01-23 22:04:55 +00:00
Ted Kremenek	3c4408ceb6	Null initialize a few variables flagged by clang's -Wuninitialized-experimental warning. While these don't look like real bugs, clang's -Wuninitialized-experimental analysis is stricter than GCC's, and these fixes have the benefit of being general nice cleanups. llvm-svn: 124073	2011-01-23 17:05:06 +00:00
Chris Lattner	9491dee24e	Enhance SRoA to be more aggressive about scalarization of aggregate allocas that have PHI or select uses of their element pointers. This can often happen when instcombine sinks two loads into a successor, inserting a phi or select. With this patch, we can scalarize the alloca, but the pinned elements are not yet promoted. This is still a win for large aggregates where only one element is used. This fixes rdar://8904039 and part of rdar://7339113 (poor codegen on stringswitch). llvm-svn: 124070	2011-01-23 08:27:54 +00:00
Cameron Zwarich	07d6fe34b3	Convert two std::vectors to SmallVectors for a 3.4% speedup running -scalarrepl on test-suite + SPEC2000 & SPEC2006. llvm-svn: 124068	2011-01-23 08:03:04 +00:00
Chris Lattner	8acbb79506	have AllocaInfo store the alloca being inspected, simplifying callers. No functionality change. llvm-svn: 124067	2011-01-23 07:29:29 +00:00
Chris Lattner	3e56c29068	Rearrange some code a bit. Change MarkUnsafe to handle the "Transformation preventing inst" printing, so that -scalarrepl -debug will always print the rejected instruction. No functionality change. llvm-svn: 124066	2011-01-23 07:05:44 +00:00
Chris Lattner	a587ab7b94	remove an old hack that avoided creating MMX datatypes. The X86 backend has been fixed. llvm-svn: 124064	2011-01-23 06:40:33 +00:00
Dan Gohman	19e30d5a7d	Actually check memcpy lengths, instead of just commenting about how they should be checked. llvm-svn: 123999	2011-01-21 22:07:57 +00:00
Owen Anderson	a834200dbe	Just because we have determined that an (fcmp \| fcmp) is true for A < B, A == B, and A > B, does not mean we can fold it to true. We still need to check for A ? B (A unordered B). llvm-svn: 123993	2011-01-21 19:39:42 +00:00
Nick Lewycky	ae0275e018	SCCP doesn't actually preserve the CFG. It will delete and insert terminator instructions. llvm-svn: 123973	2011-01-21 08:38:09 +00:00
Chris Lattner	b5e15d1907	fix PR9013, an infinite loop in instcombine. llvm-svn: 123968	2011-01-21 05:29:50 +00:00
Chris Lattner	f4ca47bda8	update obsolete comment. llvm-svn: 123965	2011-01-21 05:08:26 +00:00
Nick Lewycky	6a083cf820	Don't try to pull vector bitcasts that change the number of elements through a select. A vector select is pairwise on each element so we'd need a new condition with the right number of elements to select on. Fixes PR8994. llvm-svn: 123963	2011-01-21 02:30:43 +00:00
Duncan Sands	8fb2c3827c	At -O123 the early-cse pass is run before instcombine has run. According to my auto-simplier the transform most missed by early-cse is (zext X) != 0 -> X != 0. This patch adds this transform and some related logic to InstructionSimplify and removes some of the logic from instcombine (unfortunately not all because there are several situations in which instcombine can improve things by making new instructions, whereas instsimplify is not allowed to do this). At -O2 this often results in more than 15% more simplifications by early-cse, and results in hundreds of lines of bitcode being eliminated from the testsuite. I did see some small negative effects in the testsuite, for example a few additional instructions in three programs. One program, 483.xalancbmk, got an additional 35 instructions, which seems to be due to a function getting an additional instruction and then being inlined all over the place. llvm-svn: 123911	2011-01-20 13:21:55 +00:00
Rafael Espindola	fc355bc070	Add unnamed_addr when we can show that address of a global is not used. llvm-svn: 123834	2011-01-19 16:32:21 +00:00
Chris Lattner	86d56c651d	fix rdar://8878965, a regression I introduced with the recent llvm.objectsize changes. llvm-svn: 123771	2011-01-18 20:53:04 +00:00
Cameron Zwarich	fc210c79b7	Convert a std::map to a DenseMap for another 1.7% speedup on -scalarrepl. llvm-svn: 123732	2011-01-18 04:50:38 +00:00
Cameron Zwarich	6968c41ac8	Make a std::vector a SmallVector<*, 32> like the other vectors in the same function. This seems to be about a 1.5% speedup of -scalarrepl on test-suite with SPEC2000 and SPEC2006. llvm-svn: 123731	2011-01-18 04:41:32 +00:00
Rafael Espindola	ecd5b9abe9	Reduce indentation and remove commented out code. llvm-svn: 123729	2011-01-18 04:36:06 +00:00
Cameron Zwarich	b703654edc	Remove code for updating dominance frontiers and some outdated references to dominance and post-dominance frontiers. llvm-svn: 123725	2011-01-18 04:11:31 +00:00
Cameron Zwarich	4694e69540	Remove outdated references to dominance frontiers. llvm-svn: 123724	2011-01-18 03:53:26 +00:00
Owen Anderson	459e079912	Remove dead code, that I apparently wrote a while back. We seem to be doing well enough without whatever this was trying to do. When/if someone has the time to do some empirical evaluations, it might be worth it to figure out what this code was trying to do and see if it's worth resurrecting/fixing. llvm-svn: 123684	2011-01-17 22:39:54 +00:00
Cameron Zwarich	b410858a5f	Roll r123609 back in with two changes that fix test failures with expensive checks enabled: 1) Use '<' to compare integers in a comparison function rather than '<='. 2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize the priority queue. The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at just under 16% rather than 17%. llvm-svn: 123662	2011-01-17 17:38:41 +00:00
Cameron Zwarich	67431d7943	Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot. llvm-svn: 123618	2011-01-17 07:26:51 +00:00
Cameron Zwarich	814cd9233e	Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to eliminating a potentially quadratic data structure, this also gives a 17% speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial experiment gave a greater speedup around 25%, but I moved the dominator tree level computation from dominator tree construction to PromoteMemToReg. Since this approach to computing IDFs has a much lower overhead than the old code using precomputed DFs, it is worth looking at using this new code for the second scalarrepl pass as well. llvm-svn: 123609	2011-01-17 01:08:59 +00:00
Anders Carlsson	d3db83349e	Teach DAE to look for functions whose arguments are unused, and change all callers to pass in an undefvalue instead. llvm-svn: 123596	2011-01-16 21:25:33 +00:00
Chris Lattner	7c9f4c9c2b	tidy up a comment, as suggested by duncan llvm-svn: 123590	2011-01-16 17:46:19 +00:00
Rafael Espindola	751677a040	Don't merge two constants if we care about the address of both. This fixes the original testcase in PR8927. It also causes a clang binary built with a patched clang to increase in size by 0.21%. We can probably get some of the size back by writing a pass that detects that a global never has its pointer compared and adds unnamed_addr to it (maybe extend global opt). It is also possible that there are some other cases clang could add unnamed_addr to. I will investigate extending globalopt next. llvm-svn: 123584	2011-01-16 17:05:09 +00:00
Chris Lattner	e5f8de8639	fix PR8932, a case where arg promotion could infinitely promote. llvm-svn: 123574	2011-01-16 08:09:24 +00:00
Chris Lattner	ed1fb92cfe	simplify a little llvm-svn: 123573	2011-01-16 07:11:21 +00:00
Chris Lattner	6fab2e9418	if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}. The testcase now is optimized to: define i64 @test2(i64 %X) { br label %L2 L2: ; preds = %0 ret i64 %X } before we generated: define i64 @test2(i64 %X) { %sroa.store.elt = lshr i64 %X, 56 %1 = trunc i64 %sroa.store.elt to i8 %sroa.store.elt8 = lshr i64 %X, 48 %2 = trunc i64 %sroa.store.elt8 to i8 %sroa.store.elt9 = lshr i64 %X, 40 %3 = trunc i64 %sroa.store.elt9 to i8 %sroa.store.elt10 = lshr i64 %X, 32 %4 = trunc i64 %sroa.store.elt10 to i8 %sroa.store.elt11 = lshr i64 %X, 24 %5 = trunc i64 %sroa.store.elt11 to i8 %sroa.store.elt12 = lshr i64 %X, 16 %6 = trunc i64 %sroa.store.elt12 to i8 %sroa.store.elt13 = lshr i64 %X, 8 %7 = trunc i64 %sroa.store.elt13 to i8 %8 = trunc i64 %X to i8 br label %L2 L2: ; preds = %0 %9 = zext i8 %1 to i64 %10 = shl i64 %9, 56 %11 = zext i8 %2 to i64 %12 = shl i64 %11, 48 %13 = or i64 %12, %10 %14 = zext i8 %3 to i64 %15 = shl i64 %14, 40 %16 = or i64 %15, %13 %17 = zext i8 %4 to i64 %18 = shl i64 %17, 32 %19 = or i64 %18, %16 %20 = zext i8 %5 to i64 %21 = shl i64 %20, 24 %22 = or i64 %21, %19 %23 = zext i8 %6 to i64 %24 = shl i64 %23, 16 %25 = or i64 %24, %22 %26 = zext i8 %7 to i64 %27 = shl i64 %26, 8 %28 = or i64 %27, %25 %29 = zext i8 %8 to i64 %30 = or i64 %29, %28 ret i64 %30 } In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place. llvm-svn: 123571	2011-01-16 06:18:28 +00:00
Chris Lattner	7cd8cf7d24	Use an irbuilder to get some trivial constant folding when doing a store of a constant. llvm-svn: 123570	2011-01-16 05:58:24 +00:00
Chris Lattner	adb1a233b1	remove a dead check, this was needed before we had an explicit veto on uses of phis. llvm-svn: 123569	2011-01-16 05:37:55 +00:00
Chris Lattner	d55581ded8	enhance FoldOpIntoPhi in instcombine to try harder when a phi has multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. llvm-svn: 123568	2011-01-16 05:28:59 +00:00
Chris Lattner	ea7131a062	remove the AllowAggressive argument to FoldOpIntoPhi. It is forced to false in the first line of the function because it isn't a good idea, even for compares. llvm-svn: 123566	2011-01-16 05:14:26 +00:00
Chris Lattner	ff2e737714	more cleanups: use the IR builder. llvm-svn: 123565	2011-01-16 05:08:00 +00:00
Chris Lattner	25ce280511	tidy up code. llvm-svn: 123564	2011-01-16 04:37:29 +00:00
Owen Anderson	4e54efd625	Improve the safety of my globalopt enhancement by ensuring that the bitcast of the stored value to the new store type is always. Also, add a testcase. llvm-svn: 123563	2011-01-16 04:33:33 +00:00
Chris Lattner	8b4952fcf7	simplify this code, it is still broken but will follow up on llvm-commits. llvm-svn: 123558	2011-01-16 02:05:10 +00:00
Chris Lattner	1e209b87ad	remove the partial specialization pass. It is unmaintained and has bugs. llvm-svn: 123554	2011-01-16 00:27:10 +00:00
Nick Lewycky	4a1ff16b29	Add missing whitespace. llvm-svn: 123543	2011-01-15 18:42:52 +00:00
Nick Lewycky	0296a481f9	Make constmerge a two-pass algorithm so that it won't miss merging opporuntities. Fixes PR8978. llvm-svn: 123541	2011-01-15 18:14:21 +00:00
Benjamin Kramer	ed5f2e504e	Try to unbreak selfhost. llvm-svn: 123537	2011-01-15 11:25:34 +00:00
Nick Lewycky	540f9536c8	Add a cache that protects mergefunc's internals from more surprises in DenseSet. Also, replace tabs with spaces. Yes, it's 2011. llvm-svn: 123535	2011-01-15 10:16:23 +00:00
Chris Lattner	af26390790	temporarily revert r123526. While working on a follow-on patch I realize that ConstantFoldTerminator doesn't preserve dominfo. llvm-svn: 123527	2011-01-15 07:51:19 +00:00
Chris Lattner	8df83c4a24	fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code The basic issue is that isel (very reasonably!) expects conditional branches to be folded, so CGP leaving around a bunch dead computation feeding conditional branches isn't such a good idea. Just fold branches on constants into unconditional branches. llvm-svn: 123526	2011-01-15 07:36:13 +00:00
Chris Lattner	ee588defc6	simplify code, no functionality change. llvm-svn: 123525	2011-01-15 07:29:01 +00:00
Chris Lattner	1b93be501d	Now that instruction optzns can update the iterator as they go, we can have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times. llvm-svn: 123524	2011-01-15 07:25:29 +00:00
Chris Lattner	7a2771440f	make the current instruction iterator an ivar, allowing xforms that potentially invalidate it (like inline asm lowering) to be sunk into their proper place, cleaning up a ton of code. llvm-svn: 123523	2011-01-15 07:14:54 +00:00
Chris Lattner	9c10d587f6	implement an instcombine xform that canonicalizes casts outside of and-with-constant operations. This fixes rdar://8808586 which observed that we used to compile: union xy { struct x { _Bool b[15]; } x; __attribute__((packed)) struct y { __attribute__((packed)) unsigned long b0to7; __attribute__((packed)) unsigned int b8to11; __attribute__((packed)) unsigned short b12to13; __attribute__((packed)) unsigned char b14; } y; }; struct x foo(union xy *xy) { return xy->x; } into: _foo: ## @foo movq (%rdi), %rax movabsq $1095216660480, %rcx ## imm = 0xFF00000000 andq %rax, %rcx movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000 andq %rax, %rdx movzbl %al, %esi orq %rdx, %rsi movq %rax, %rdx andq $65280, %rdx ## imm = 0xFF00 orq %rsi, %rdx movq %rax, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rdx, %rsi movl %eax, %edx andl $-16777216, %edx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rdx orq %rcx, %rdx movabsq $280375465082880, %rcx ## imm = 0xFF0000000000 movq %rax, %rsi andq %rcx, %rsi orq %rdx, %rsi movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000 andq %r8, %rax orq %rsi, %rax movzwl 12(%rdi), %edx movzbl 14(%rdi), %esi shlq $16, %rsi orl %edx, %esi movq %rsi, %r9 shlq $32, %r9 movl 8(%rdi), %edx orq %r9, %rdx andq %rdx, %rcx movzbl %sil, %esi shlq $32, %rsi orq %rcx, %rsi movl %edx, %ecx andl $-16777216, %ecx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rcx movq %rdx, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rcx, %rsi movq %rdx, %rcx andq $65280, %rcx ## imm = 0xFF00 orq %rsi, %rcx movzbl %dl, %esi orq %rcx, %rsi andq %r8, %rdx orq %rsi, %rdx ret We now compile this into: _foo: ## @foo ## BB#0: ## %entry movzwl 12(%rdi), %eax movzbl 14(%rdi), %ecx shlq $16, %rcx orl %eax, %ecx shlq $32, %rcx movl 8(%rdi), %edx orq %rcx, %rdx movq (%rdi), %rax ret A small improvement :-) llvm-svn: 123520	2011-01-15 06:32:33 +00:00
Chris Lattner	e20dd530d0	one more instcombine variant that is needed to work with future changes, no functionality change currently. llvm-svn: 123517	2011-01-15 05:50:18 +00:00
Chris Lattner	497459d5fd	fix typo llvm-svn: 123516	2011-01-15 05:42:47 +00:00
Chris Lattner	f3c4eefff8	Catch ~x < cst just like ~x < ~y, we currently handle this through means that are about to disappear. llvm-svn: 123515	2011-01-15 05:41:33 +00:00
Chris Lattner	311aa63c87	reduce indentation llvm-svn: 123514	2011-01-15 05:40:29 +00:00
Chris Lattner	b68ec5c339	Generalize LoadAndStorePromoter a bit and switch LICM to use it. llvm-svn: 123501	2011-01-15 00:12:35 +00:00
Owen Anderson	3e2f6cf7ae	Fix a false-positive warning. llvm-svn: 123480	2011-01-14 22:31:13 +00:00
Owen Anderson	9eb7cb48e4	Enhance GlobalOpt to be able evaluate initializers that involve stores through bitcasts, at least in simple cases. This fixes clang's CodeGenCXX/virtual-base-dtor.cpp llvm-svn: 123477	2011-01-14 22:19:20 +00:00
Chris Lattner	b498f9aff3	switch SRoA to use LoadAndStorePromoter instead of its own copy of the code. llvm-svn: 123457	2011-01-14 19:50:47 +00:00
Chris Lattner	95294b8796	Add a new LoadAndStorePromoter class, which implements the general "promote a bunch of load and stores" logic, allowing the code to be shared and reused. llvm-svn: 123456	2011-01-14 19:36:13 +00:00
Chris Lattner	9987a6f49b	split SROA into two passes: one that uses DomFrontiers (-scalarrepl) and one that uses SSAUpdater (-scalarrepl-ssa) llvm-svn: 123436	2011-01-14 08:13:00 +00:00
Chris Lattner	543384efb4	Implement full support for promoting allocas to registers using SSAUpdater instead of DomTree/DomFrontier. This may be interesting for reducing compile time. This is currently disabled, but seems to work just fine. When this is enabled, we eliminate two runs of dominator frontier, one in the "early per-function" optimizations and one in the "interlaced with inliner" function passes. llvm-svn: 123434	2011-01-14 07:50:47 +00:00
Chris Lattner	90f3a9a1c7	indentation llvm-svn: 123426	2011-01-14 04:23:53 +00:00
Duncan Sands	7f60dc1eb0	Move some shift transforms out of instcombine and into InstructionSimplify. While there, I noticed that the transform "undef >>a X -> undef" was wrong. For example if X is 2 then the top two bits must be equal, so the result can not be anything. I fixed this in the constant folder as well. Also, I made the transform for "X << undef" stronger: it now folds to undef always, even though X might be zero. This is in accordance with the LangRef, but I must admit that it is fairly aggressive. Also, I added "i32 X << 32 -> undef" following the LangRef and the constant folder, likewise fairly aggressive. llvm-svn: 123417	2011-01-14 00:37:45 +00:00
Bob Wilson	328e91bbe1	Fix whitespace. llvm-svn: 123396	2011-01-13 20:59:44 +00:00
Bob Wilson	c8056a952e	Check for empty structs, and for consistency, zero-element arrays. llvm-svn: 123383	2011-01-13 18:26:59 +00:00
Bob Wilson	08713d3c5f	Extend SROA to handle arrays accessed as homogeneous structs and vice versa. This is a minor extension of SROA to handle a special case that is important for some ARM NEON operations. Some of the NEON intrinsics return multiple values, which are handled as struct types containing multiple elements of the same vector type. The corresponding return types declared in the arm_neon.h header have equivalent arrays. We need SROA to recognize that it can split up those arrays and structs into separate vectors, even though they are not always accessed with the same type. SROA already handles loads and stores of an entire alloca by using insertvalue/extractvalue to access the individual pieces, and that code works the same regardless of whether the type is a struct or an array. So, all that needs to be done is to check for compatible arrays and homogeneous structs. llvm-svn: 123381	2011-01-13 17:45:11 +00:00
Bob Wilson	12eec40c83	Make SROA more aggressive with allocas containing padding. SROA only split up structs and arrays one level at a time, so padding can only cause trouble if it is located in between the struct or array elements. llvm-svn: 123380	2011-01-13 17:45:08 +00:00
Devang Patel	30f3ebbc1f	Use SmallVector instead of SmallPtrSet and avoid non-deterministic behavior. llvm-svn: 123318	2011-01-12 19:12:45 +00:00
Chris Lattner	dd5f60b7a7	revert 123144, reenabling the rest of memset formation. llvm-svn: 123302	2011-01-12 03:25:15 +00:00
Chris Lattner	654098f411	revert r123146 which disabled code that wasn't the root cause of the bootstrap miscompare issue. llvm-svn: 123299	2011-01-12 01:52:23 +00:00
Chris Lattner	fa7c29d255	revert r123149, reenabling an improvement to memcpyopt that wasn't the source of the bootstrap problem. llvm-svn: 123298	2011-01-12 01:43:46 +00:00
Jakob Stoklund Olesen	12cc296bd4	Remove the PR8954 workaround. llvm-svn: 123288	2011-01-11 22:56:41 +00:00
Jakob Stoklund Olesen	f2407aa98b	Fix a non-deterministic loop in llvm::MergeBlockIntoPredecessor. DT->changeImmediateDominator() trivially ignores identity updates, so there is really no need for the uniqueing provided by SmallPtrSet. I expect this to fix PR8954. llvm-svn: 123286	2011-01-11 22:54:38 +00:00
Cameron Zwarich	cb9c4f85ec	Dial back the speculative fix for PR8954 a bit, so that we only recompute dominators once at the beginning of GVN instead of once per iteration. llvm-svn: 123278	2011-01-11 22:14:42 +00:00
Cameron Zwarich	51eb403907	Attempt to fix the bootstrap buildbot. Rafael says this works for him on x86-64 Linux. llvm-svn: 123270	2011-01-11 20:23:34 +00:00
Owen Anderson	0022a4b417	Remove dead variable, const-ref-ize an APInt. llvm-svn: 123248	2011-01-11 18:26:37 +00:00
Chris Lattner	d41db8f9cb	this pass claims to preserve scev, make sure to tell it about deletions. llvm-svn: 123247	2011-01-11 18:14:50 +00:00
Frits van Bommel	8e158495f1	Factor the actual simplification out of SimplifyIndirectBrOnSelect and into a new helper function so it can be reused in e.g. an upcoming SimplifySwitchOnSelect. No functional change. llvm-svn: 123234	2011-01-11 12:52:11 +00:00
Chris Lattner	193ce7c4d1	update memdep when an instruction is deleted. This code isn't actually reached in the testcase in PR8954, but it's safe and good practice. llvm-svn: 123224	2011-01-11 08:19:16 +00:00
Chris Lattner	e2523b287c	when MergeBlockIntoPredecessor merges two blocks, update MemDep if it is floating around in the ether. llvm-svn: 123223	2011-01-11 08:16:49 +00:00
Chris Lattner	f6ae904e34	Fix FoldSingleEntryPHINodes to update memdep and AA when it deletes phi nodes. It is called from MergeBlockIntoPredecessor which is called from GVN, which claims to preserve these. I'm skeptical that this is the actual problem behind PR8954, but this is a stab in the right direction. llvm-svn: 123222	2011-01-11 08:13:40 +00:00
Chris Lattner	dfcfcb49fa	random cleanups llvm-svn: 123221	2011-01-11 08:00:40 +00:00
Chris Lattner	63fe78de68	remove a bogus assertion: the latch block of a loop is not neccesarily an uncond branch to the header. This fixes PR8955 (the assertion tripping). llvm-svn: 123219	2011-01-11 07:47:59 +00:00
Owen Anderson	d490c2d2ae	Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by a comparison against a constant. llvm-svn: 123203	2011-01-11 00:36:45 +00:00
Chandler Carruth	cf414cf0a6	Teach instcombine about the rest of the SSE and SSE2 conversion intrinsics element dependencies. Reviewed by Nick. llvm-svn: 123161	2011-01-10 07:19:37 +00:00
Chris Lattner	88bc848ab6	another random stab in the dark trying to fix llvm-gcc-i386-linux-selfhost llvm-svn: 123149	2011-01-10 02:34:11 +00:00
Chris Lattner	4662bd4b13	another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost back to life. llvm-svn: 123146	2011-01-10 00:47:34 +00:00
Chris Lattner	1017fa6746	temporarily disable memset formation from memsets in an effort to restore buildbot stability. llvm-svn: 123144	2011-01-09 23:52:48 +00:00
Chris Lattner	caf5c0d037	fix a few old bugs (found by inspection) where we would zap instructions without informing memdep. This could cause nondeterminstic weirdness based on where instructions happen to get allocated, and will hopefully breath some life into some broken testers. llvm-svn: 123124	2011-01-09 19:26:10 +00:00
Tobias Grosser	cc21c4aa98	Instcombine: Fix pattern where the sext did not dominate the icmp using it llvm-svn: 123121	2011-01-09 16:00:11 +00:00
Cameron Zwarich	a42e5915bf	LoopInstSimplify preserves LoopSimplify. llvm-svn: 123117	2011-01-09 12:35:16 +00:00
Chris Lattner	a337f5ec5c	reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec's that have the bit set. llvm-svn: 123104	2011-01-09 02:16:18 +00:00
Chris Lattner	7d6433ae76	fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't updating memdep when fusing stores together. This fixes the crash optimizing the bullet benchmark. llvm-svn: 123091	2011-01-08 22:19:21 +00:00
Chris Lattner	ff6ed2ac5f	tryMergingIntoMemset can only handle constant length memsets. llvm-svn: 123090	2011-01-08 22:11:56 +00:00
Chris Lattner	9a1d63ba9f	Merge memsets followed by neighboring memsets and other stores into larger memsets. Among other things, this fixes rdar://8760394 and allows us to handle "Example 2" from http://blog.regehr.org/archives/320, compiling it into a single 4096-byte memset: _mad_synth_mute: ## @mad_synth_mute ## BB#0: ## %entry pushq %rax movl $4096, %esi ## imm = 0x1000 callq ___bzero popq %rax ret llvm-svn: 123089	2011-01-08 21:19:19 +00:00
Chris Lattner	5120ebf184	fix an issue in IsPointerOffset that prevented us from recognizing that P and P+1 are relative to the same base pointer. llvm-svn: 123087	2011-01-08 21:07:56 +00:00
Chris Lattner	4dc1fd938f	enhance memcpyopt to merge a store and a subsequent memset into a single larger memset. llvm-svn: 123086	2011-01-08 20:54:51 +00:00
Chris Lattner	c638147e9f	constify TargetData references. Split memset formation logic out into its own "tryMergingIntoMemset" helper function. llvm-svn: 123081	2011-01-08 20:24:01 +00:00
Chris Lattner	59c82f850d	When loop rotation happens, it is very common for the duplicated condbr to be foldable into an uncond branch. When this happens, we can make a much simpler CFG for the loop, which is important for nested loop cases where we want the outer loop to be aggressively optimized. Handle this case more aggressively. For example, previously on phi-duplicate.ll we would get this: define void @test(i32 %N, double* %G) nounwind ssp { entry: %cmp1 = icmp slt i64 1, 1000 br i1 %cmp1, label %bb.nph, label %for.end bb.nph: ; preds = %entry br label %for.body for.body: ; preds = %bb.nph, %for.cond %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ] %arrayidx = getelementptr inbounds double* %G, i64 %j.02 %tmp3 = load double* %arrayidx %sub = sub i64 %j.02, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.02, 1 br label %for.cond for.cond: ; preds = %for.body %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge for.cond.for.end_crit_edge: ; preds = %for.cond br label %for.end for.end: ; preds = %for.cond.for.end_crit_edge, %entry ret void } Now we get the much nicer: define void @test(i32 %N, double* %G) nounwind ssp { entry: br label %for.body for.body: ; preds = %entry, %for.body %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds double* %G, i64 %j.01 %tmp3 = load double* %arrayidx %sub = sub i64 %j.01, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.01, 1 %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body ret void } With all of these recent changes, we are now able to compile: void foo(char X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i100] = 0; } into a single memset of 10000 bytes. This series of changes should also be helpful for other nested loop scenarios as well. llvm-svn: 123079	2011-01-08 19:59:06 +00:00
Chris Lattner	30f318e5d1	split ssa updating code out to its own helper function. Don't bother moving the OrigHeader block anymore: we just merge it away anyway so its code layout doesn't matter. llvm-svn: 123077	2011-01-08 19:26:33 +00:00
Chris Lattner	2615130e1d	Implement a TODO: Enhance loopinfo to merge away the unconditional branch that it was leaving in loops after rotation (between the original latch block and the original header. With this change, it is possible for rotated loops to have just a single basic block, which is useful. llvm-svn: 123075	2011-01-08 19:10:28 +00:00
Chris Lattner	930b716e1b	various code cleanups, enhance MergeBlockIntoPredecessor to preserve loop info. llvm-svn: 123074	2011-01-08 19:08:40 +00:00
Chris Lattner	fee37c5fa3	inline preserveCanonicalLoopForm now that it is simple. llvm-svn: 123073	2011-01-08 18:55:50 +00:00
Chris Lattner	063dca0f6a	Three major changes: 1. Rip out LoopRotate's domfrontier updating code. It isn't needed now that LICM doesn't use DF and it is super complex and gross. 2. Make DomTree updating code a lot simpler and faster. The old loop over all the blocks was just to find a block?? 3. Change the code that inserts the new preheader to just use SplitCriticalEdge instead of doing an overcomplex reimplementation of it. No behavior change, except for the name of the inserted preheader. llvm-svn: 123072	2011-01-08 18:52:51 +00:00
Chris Lattner	30d95f9f87	reduce nesting. llvm-svn: 123071	2011-01-08 18:47:43 +00:00
Chris Lattner	7fab23bc1d	LoopRotate requires canonical loop form, so it always has preheaders and latch blocks. Reorder entry conditions to make hte pass faster and more logical. llvm-svn: 123069	2011-01-08 18:06:22 +00:00
Chris Lattner	d62691f4e8	use the LI ivar. llvm-svn: 123068	2011-01-08 17:49:51 +00:00
Chris Lattner	385f2ec6d8	some cleanups: remove dead arguments and eliminate ivars that are just passed to one function. llvm-svn: 123067	2011-01-08 17:48:33 +00:00
Chris Lattner	25ba40a0cc	fix an issue duncan pointed out, which could cause loop rotate to violate LCSSA form llvm-svn: 123066	2011-01-08 17:38:45 +00:00
Cameron Zwarich	b4ab257bcc	Fix coding style issues. llvm-svn: 123065	2011-01-08 17:07:11 +00:00
Cameron Zwarich	84986b298a	Make more passes preserve dominators (or state that they preserve dominators if they all ready do). This removes two dominator recomputations prior to isel, which is a 1% improvement in total llc time for 403.gcc. The only potentially suspect thing is making GCStrategy recompute dominators if it used a custom lowering strategy. llvm-svn: 123064	2011-01-08 17:01:52 +00:00
Cameron Zwarich	80bd9af7c5	Contract subloop bodies. However, it is still important to visit the phis at the top of subloop headers, as the phi uses logically occur outside of the subloop. llvm-svn: 123062	2011-01-08 15:52:22 +00:00
Frits van Bommel	6a1fb8f235	Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little. llvm-svn: 123061	2011-01-08 10:51:36 +00:00
Chris Lattner	8c5defd0b0	Have loop-rotate simplify instructions (yay instsimplify!) as it clones them into the loop preheader, eliminating silly instructions like "icmp i32 0, 100" in fixed tripcount loops. This also better exposes the bigger problem with loop rotate that I'd like to fix: once this has been folded, the duplicated conditional branch often turns into an uncond branch. Not aggressively handling this is pessimizing later loop optimizations somethin' fierce by making "dominates all exit blocks" checks fail. llvm-svn: 123060	2011-01-08 08:24:46 +00:00
Chris Lattner	43f8d16482	Revamp the ValueMapper interfaces in a couple ways: 1. Take a flags argument instead of a bool. This makes it more clear to the reader what it is used for. 2. Add a flag that says that "remapping a value not in the map is ok". 3. Reimplement MapValue to share a bunch of code and be a lot more efficient. For lookup failures, don't drop null values into the map. 4. Using the new flag a bunch of code can vaporize in LinkModules and LoopUnswitch, kill it. No functionality change. llvm-svn: 123058	2011-01-08 08:15:20 +00:00
Chris Lattner	2b3f20e6ec	two minor changes: switch to the standard ValueToValueMapTy map from ValueMapper.h (giving us access to its utilities) and add a fastpath in the loop rotation code, avoiding expensive ssa updator manipulation for values with nothing to update. llvm-svn: 123057	2011-01-08 07:21:31 +00:00
Tobias Grosser	fc3d7f664b	InstCombine: Match min/max hidden by sext/zext X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X Instead of calculating this with mixed types promote all to the larger type. This enables scalar evolution to analyze this expression. PR8866 llvm-svn: 123034	2011-01-07 21:33:14 +00:00
Tobias Grosser	411e6eedff	Some whitespace fixes llvm-svn: 123033	2011-01-07 21:33:13 +00:00
Benjamin Kramer	134cde912a	Revert 122959, it needs more thought. Add it back to README.txt with additional notes. llvm-svn: 123030	2011-01-07 20:42:20 +00:00
Jay Foad	89afb43b1e	Remove all uses of the "ugly" method BranchInst::setUnconditionalDest(). llvm-svn: 123025	2011-01-07 20:25:56 +00:00
Benjamin Kramer	ae67cc13a9	InstCombine: Turn _chk functions into the "unsafe" variant if length and max langth are equal. This happens when we take the (non-constant) length from a malloc. llvm-svn: 122961	2011-01-06 14:22:52 +00:00
Benjamin Kramer	799b011276	InstCombine: If we call llvm.objectsize on a malloc call we can replace it with the size passed to malloc. llvm-svn: 122959	2011-01-06 13:11:05 +00:00
Benjamin Kramer	a76cc117e0	InstCombine: Teach llvm.objectsize folding to look through GEPs. llvm-svn: 122958	2011-01-06 13:07:49 +00:00
Cameron Zwarich	9ec19ea06a	Add the CallInst optimizations that don't involve expanding inline assembly to OptimizeInst() so that they can be used on a worklist instruction. llvm-svn: 122945	2011-01-06 02:56:42 +00:00
Cameron Zwarich	d28c78eb4f	Move the GEP handling in CodeGenPrepare to OptimizeInst(). llvm-svn: 122944	2011-01-06 02:44:52 +00:00
Cameron Zwarich	14ac865ca9	Split the optimizations in CodeGenPrepare that don't manipulate the iterators into a separate function, so that it can be called from a loop using a worklist rather than a loop traversing a whole basic block. llvm-svn: 122943	2011-01-06 02:37:26 +00:00
Jakob Stoklund Olesen	70be93a200	Zap the last two -Wself-assign warnings in llvm. Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there. llvm-svn: 122940	2011-01-06 01:33:22 +00:00
Cameron Zwarich	ce3b930a98	Stop reallocating SunkAddrs for each basic block. When we move to an instruction worklist, the key will need to become std::pair<BasicBlock, Value>. llvm-svn: 122932	2011-01-06 00:42:50 +00:00
Cameron Zwarich	b62ccb241b	Add some more statistics to CodeGenPrepare. llvm-svn: 122891	2011-01-05 17:47:38 +00:00
Cameron Zwarich	ced753fadf	Add some stats to CodeGenPrepare to make it easier to speed it up without regressing code quality. llvm-svn: 122887	2011-01-05 17:27:27 +00:00
Cameron Zwarich	6a78995369	Use pop_back_val instead of back followed by pop_back. llvm-svn: 122876	2011-01-05 16:08:47 +00:00
Cameron Zwarich	5a2bb998ac	Use a worklist for later iterations just like ordinary instsimplify. The next step is to only process instructions in subloops if they have been modified by an earlier simplification. llvm-svn: 122869	2011-01-05 05:47:47 +00:00
Cameron Zwarich	4c51d122d5	Change LoopInstSimplify back to a LoopPass. It revisits subloops rather than skipping them, but it should probably use a worklist and only revisit those instructions in subloops that have actually changed. It should probably also use a worklist after the first iteration like instsimplify now does. Regardless, it's only 0.3% of opt -O2 time on 403.gcc if it replaces the instcombine placed in the middle of the loop passes. llvm-svn: 122868	2011-01-05 05:15:53 +00:00
Owen Anderson	7b25ff04bd	Don't bother value numbering instructions with void types in GVN. In theory this should allow us to insert fewer things into the value numbering maps, but any speedup is beneath the noise threshold on my machine on 403.gcc. llvm-svn: 122844	2011-01-04 22:15:21 +00:00
Owen Anderson	e39cb57b09	Complete the NumberTable --> LeaderTable rename. llvm-svn: 122828	2011-01-04 19:29:46 +00:00
Owen Anderson	d7d06d3aaf	Fix typo in a comment. llvm-svn: 122827	2011-01-04 19:25:18 +00:00
Owen Anderson	51489b3b28	Prune #include's. llvm-svn: 122826	2011-01-04 19:24:57 +00:00
Owen Anderson	c7c3bc63f7	Clarify terminology, settling on referring to what was the "number table" as the "leader table", and rename methods to make it much more clear what they're doing. llvm-svn: 122823	2011-01-04 19:13:25 +00:00
Owen Anderson	83546f2fe0	When removing a value from GVN's leaders list, don't drop the Next pointer in a corner case. llvm-svn: 122822	2011-01-04 19:10:54 +00:00
Dale Johannesen	a71d2cc88d	Improve the accuracy of the inlining heuristic looking for the case where a static caller is itself inlined everywhere else, and thus may go away if it doesn't get too big due to inlining other things into it. If there are references to the caller other than calls, it will not be removed; account for this. This results in same-day completion of the case in PR8853. llvm-svn: 122821	2011-01-04 19:01:54 +00:00
Owen Anderson	41a1550ef5	Branch instructions don't produce values, so there's no need to generate a value number for them. This avoids adding them to the various value numbering tables, resulting in a minor (~3%) speedup for GVN on 40.gcc. llvm-svn: 122819	2011-01-04 18:54:18 +00:00
Owen Anderson	22c53e277a	Remove commented out code. llvm-svn: 122817	2011-01-04 18:22:08 +00:00
Cameron Zwarich	b2a41e9388	Switch to the new style of asterisk placement. llvm-svn: 122815	2011-01-04 18:19:19 +00:00
Chris Lattner	8643810ede	Teach loop-idiom to turn a loop containing a memset into a larger memset when safe. The testcase is basically this nested loop: void foo(char X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i100] = 0; } which gets turned into a single memset now. clang -O3 doesn't optimize this yet though due to a phase ordering issue I haven't analyzed yet. llvm-svn: 122806	2011-01-04 07:46:33 +00:00
Chris Lattner	a62b01dc37	restructure this a bit. Initialize the WeakVH with "I", the instruction after the store. The store will always be deleted if the transformation kicks in, so we'd do an N^2 scan of every loop block. Whoops. llvm-svn: 122805	2011-01-04 07:27:30 +00:00
Cameron Zwarich	f4e13699e7	Avoid finding loop back edges when we are not splitting critical edges in CodeGenPrepare (which is the default behavior). llvm-svn: 122801	2011-01-04 04:43:31 +00:00
Cameron Zwarich	e924969380	Address most of Duncan's review comments. Also, make LoopInstSimplify a simple FunctionPass. It probably doesn't have a reason to be a LoopPass, as it will probably drop the simple fixed point and either use RPO iteration or Duncan's approach in instsimplify of only revisiting instructions that have changed. The next step is to preserve LoopSimplify. This looks like it won't be too hard, although the pass manager doesn't actually seem to respect when non-loop passes claim to preserve LCSSA or LoopSimplify. This will have to be fixed. llvm-svn: 122791	2011-01-04 00:12:46 +00:00
Chris Lattner	0ba473c218	use the very-handy getTruncateOrZeroExtend helper function, and stop setting NSW: signed overflow is possible. Thanks to Dan for pointing these out. llvm-svn: 122790	2011-01-04 00:06:55 +00:00
Owen Anderson	0839d3930a	Fix comment. llvm-svn: 122788	2011-01-03 23:51:56 +00:00
Owen Anderson	d62d37225a	Use the new addEscapingValue callback to update GlobalsModRef when GVN adds PHIs of GEPs. For the moment, have GlobalsModRef handle this conservatively by simply removing the value from its maps. llvm-svn: 122787	2011-01-03 23:51:43 +00:00
Chris Lattner	bde6ec1db6	Duncan deftly points out that readnone functions aren't invalidated by stores, so they can be handled as 'simple' operations. llvm-svn: 122785	2011-01-03 23:38:13 +00:00
Owen Anderson	3a33d0cc4a	Simplify GVN's value expression structure, allowing the elimination of a lot of almost-but-not-quite-identical code. No intended functionality change. llvm-svn: 122760	2011-01-03 19:00:11 +00:00
Chris Lattner	16ca19ffc5	stength reduce my previous patch a bit. The only instructions that are allowed to have metadata operands are intrinsic calls, and the only ones that take metadata currently return void. Just reject all void instructions, which should not be value numbered anyway. To future proof things, add an assert to the getHashValue impl for calls to check that metadata operands aren't present. llvm-svn: 122759	2011-01-03 18:43:03 +00:00
Chris Lattner	142f1cd251	fix PR8895: metadata operands don't have a strong use of their nested values, so they can change and drop to null, which can change the hash and cause havok. It turns out that it isn't a good idea to value number stuff with metadata operands anyway, so... don't. llvm-svn: 122758	2011-01-03 18:28:15 +00:00
Duncan Sands	697de77339	Speed up instsimplify by about 10-15% by not bothering to retry InstructionSimplify on instructions that didn't change since the last time round the loop. llvm-svn: 122745	2011-01-03 10:50:04 +00:00
Cameron Zwarich	43cecb1200	Switch a worklist in CodeGenPrepare to SmallVector and increase the inline capacity on the Visited SmallPtrSet. On 403.gcc, this is about a 4.5% speedup of CodeGenPrepare time (which itself is 10% of time spent in the backend). This is progress towards PR8889. llvm-svn: 122741	2011-01-03 06:33:01 +00:00
Chris Lattner	9e5e9ed79a	earlycse can do trivial with-a-block dead store elimination as well. This deletes 60 stores in 176.gcc that largely come from bitfield code. llvm-svn: 122736	2011-01-03 04:17:24 +00:00
Chris Lattner	4b9a525742	switch the load table to use a recycling bump pointer allocator, speeding earlycse up by 6%. llvm-svn: 122733	2011-01-03 03:53:50 +00:00
Chris Lattner	e0e32a9ef0	now that loads are in their own table, we can implement store->load forwarding. This allows EarlyCSE to zap 600 more loads from 176.gcc. llvm-svn: 122732	2011-01-03 03:46:34 +00:00
Chris Lattner	92bb0f9f9d	split loads and calls into separate tables. Loads are now just indexed by their pointer instead of using MemoryValue to wrap it. llvm-svn: 122731	2011-01-03 03:41:27 +00:00
Chris Lattner	4cb365414f	various cleanups, no functionality change. llvm-svn: 122729	2011-01-03 03:28:23 +00:00
Chris Lattner	b9a8efc960	Teach EarlyCSE to do trivial CSE of loads and read-only calls. On 176.gcc, this catches 13090 loads and calls, and increases the number of simple instructions CSE'd from 29658 to 36208. llvm-svn: 122727	2011-01-03 03:18:43 +00:00
Chris Lattner	79d83067ee	rename InstValue to SimpleValue, add some comments. llvm-svn: 122725	2011-01-03 02:20:48 +00:00
Michael J. Spencer	edb5bcdde5	CMake: Add missing source file. llvm-svn: 122724	2011-01-03 02:13:05 +00:00
Chris Lattner	d815f69b30	Allocate nodes for the scoped hash table from a recyling bump pointer allocator. This speeds up early cse by about 20% llvm-svn: 122723	2011-01-03 01:42:46 +00:00
Chris Lattner	02a9776b64	reduce redundancy in the hashing code and other misc cleanups. llvm-svn: 122720	2011-01-03 01:10:08 +00:00
Cameron Zwarich	cab9a0abab	Add a new loop-instsimplify pass, with the intention of replacing the instance of instcombine that is currently in the middle of the loop pass pipeline. This commit only checks in the pass; it will hopefully be enabled by default later. llvm-svn: 122719	2011-01-03 00:25:16 +00:00
Chris Lattner	0844c76f9a	fix some pastos llvm-svn: 122718	2011-01-02 23:29:58 +00:00
Chris Lattner	8fac5db251	add DEBUG and -stats output to earlycse. Teach it to CSE the rest of the non-side-effecting instructions. llvm-svn: 122716	2011-01-02 23:19:45 +00:00
Chris Lattner	18ae5436b1	Enhance earlycse to do CSE of casts, instsimplify and die. Add a testcase. llvm-svn: 122715	2011-01-02 23:04:14 +00:00
Chris Lattner	bf0aa927cc	split dom frontier handling stuff out to its own DominanceFrontier header, so that Dominators.h is just domtree. Also prune #includes a bit. llvm-svn: 122714	2011-01-02 22:09:33 +00:00
Chris Lattner	704541bb23	sketch out a new early cse pass. No functionality yet. llvm-svn: 122713	2011-01-02 21:47:05 +00:00
Chris Lattner	9c69406f2b	fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to make sure that the loop we're promoting into a memcpy doesn't mutate the input of the memcpy. Before we were just checking that the dest of the memcpy wasn't mod/ref'd by the loop. llvm-svn: 122712	2011-01-02 21:14:18 +00:00
Chris Lattner	5702a43c09	If a loop iterates exactly once (has backedge count = 0) then don't mess with it. We'd rather peel/unroll it than convert all of its stores into memsets. llvm-svn: 122711	2011-01-02 20:24:21 +00:00
Nick Lewycky	5361b84184	Also remove functions that use complex constant expressions in terms of another function. llvm-svn: 122705	2011-01-02 19:16:44 +00:00
Chris Lattner	8455b6e45e	enhance loop idiom recognition to scan all unconditionally executed blocks in a loop, instead of just the header block. This makes it more aggressive, able to handle Duncan's Ada examples. llvm-svn: 122704	2011-01-02 19:01:03 +00:00
Chris Lattner	0cdc6f62a5	make inSubLoop much more efficient. llvm-svn: 122703	2011-01-02 18:53:08 +00:00
Chris Lattner	27497ece96	rip out isExitBlockDominatedByBlockInLoop, calling DomTree::dominates instead. isExitBlockDominatedByBlockInLoop is a relic of the days when domtree was just a tree and didn't have DFS numbers. Checking DFS numbers is faster and easier than "limiting the search of the tree". llvm-svn: 122702	2011-01-02 18:45:39 +00:00
Chris Lattner	0469e01c02	add a list of opportunities for future improvement. llvm-svn: 122701	2011-01-02 18:32:09 +00:00
Duncan Sands	64f1c0dcda	Fix PR8702 by not having LoopSimplify claim to preserve LCSSA form. As described in the PR, the pass could break LCSSA form when inserting preheaders. It probably would be easy enough to fix this, but since currently we always go into LCSSA form after running this pass, doing so is not urgent. llvm-svn: 122695	2011-01-02 13:38:21 +00:00
Chris Lattner	ddf58010bd	Allow loop-idiom to run on multiple BB loops, but still only scan the loop header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but DOESN'T MERGE THE BLOCKS, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many many more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this: for (j=0; j<MAX_history; ++j) { history_new[i][j+1] = history[2*i][j]; } Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. llvm-svn: 122685	2011-01-02 07:58:36 +00:00
Chris Lattner	5b5a043d82	remove debugging code. llvm-svn: 122683	2011-01-02 07:37:13 +00:00
Chris Lattner	12f91befce	add some -stats output. llvm-svn: 122682	2011-01-02 07:36:44 +00:00
Chris Lattner	679572e584	improve loop rotation to use CodeMetrics to analyze the size of a loop header instead of its own code size estimator. This allows it to handle bitcasts etc more precisely. llvm-svn: 122681	2011-01-02 07:35:53 +00:00
Chris Lattner	85b6d81d41	teach loop idiom recognition to form memcpy's from simple loops. llvm-svn: 122678	2011-01-02 03:37:56 +00:00
Nick Lewycky	4e250c8245	Remove functions from the FnSet when one of their callee's is being merged. This maintains the guarantee that the DenseSet expects two elements it contains to not go from inequal to equal under its nose. As a side-effect, this also lets us switch from iterating to a fixed-point to actually maintaining a work queue of functions to look at again, and we don't add thunks to our work queue so we don't need to detect and ignore them. llvm-svn: 122677	2011-01-02 02:46:33 +00:00
Chris Lattner	1903c42b97	fix a globalopt crash on two Adobe-C++ testcases that the recent loop idiom pass exposed. llvm-svn: 122674	2011-01-01 22:31:46 +00:00
Chris Lattner	a3514441e0	add a validity check that was missed, fixing a crash on the new testcase. llvm-svn: 122662	2011-01-01 20:12:04 +00:00
Chris Lattner	91a4435875	improve validity check to handle constant-trip-count loops more aggressively. In practice, this doesn't help anything though, see the todo. llvm-svn: 122660	2011-01-01 19:54:22 +00:00
Chris Lattner	8b3baf6d75	implement the "no aliasing accesses in loop" safety check. This pass should be correct now. llvm-svn: 122659	2011-01-01 19:39:01 +00:00
Duncan Sands	2c440fa403	Simplify this pass by using a depth-first iterator to ensure that all operands are visited before the instructions themselves. llvm-svn: 122647	2010-12-31 17:49:05 +00:00
Duncan Sands	6cc7126ed9	Zap dead instructions harder. llvm-svn: 122645	2010-12-31 16:17:54 +00:00
Benjamin Kramer	570dd787a6	Make a bunch of symbols internal. llvm-svn: 122642	2010-12-30 22:34:44 +00:00
Chris Lattner	65a699d4d0	simplify this, isBytewiseValue handles the extra check. We still check for "multiple of a byte" in size to make it clear that the >> 3 below is safe. llvm-svn: 122604	2010-12-28 18:53:48 +00:00
Duncan Sands	5cf10e691b	Silence gcc warning about an unused variable when doing a release build. llvm-svn: 122593	2010-12-28 09:41:15 +00:00
Chris Lattner	cb18bfa3d2	fix some issues Frits noticed, add AliasAnalysis as a dependency llvm-svn: 122585	2010-12-27 18:39:08 +00:00
Benjamin Kramer	84bd73c527	BuildLibCalls: Nuke EmitMemCpy, EmitMemMove and EmitMemSet. They are dead and superseded by IRBuilder. llvm-svn: 122576	2010-12-27 00:25:32 +00:00
Benjamin Kramer	7cba269dfb	SimplifyLibCalls: Use IRBuilder to simplify code. llvm-svn: 122575	2010-12-27 00:16:46 +00:00
Chris Lattner	b9fe685b9a	have loop-idiom nuke instructions that feed stores that get removed. llvm-svn: 122574	2010-12-27 00:03:23 +00:00
Chris Lattner	29e14edc8d	implement enough of the memset inference algorithm to recognize and insert memsets. This is still missing one important validity check, but this is enough to compile stuff like this: void test0(std::vector<char> &X) { for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I) *I = 0; } void test1(std::vector<int> &X) { for (long i = 0, e = X.size(); i != e; ++i) X[i] = 0x01010101; } With: $ clang t.cpp -S -o - -O2 -emit-llvm \| opt -loop-idiom \| opt -O3 \| llc to: __Z5test0RSt6vectorIcSaIcEE: ## @_Z5test0RSt6vectorIcSaIcEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rsi cmpq %rsi, %rax je LBB0_2 ## BB#1: ## %bb.nph subq %rax, %rsi movq %rax, %rdi callq ___bzero LBB0_2: ## %for.end addq $8, %rsp ret ... __Z5test1RSt6vectorIiSaIiEE: ## @_Z5test1RSt6vectorIiSaIiEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rdx subq %rax, %rdx cmpq $4, %rdx jb LBB1_2 ## BB#1: ## %for.body.preheader andq $-4, %rdx movl $1, %esi movq %rax, %rdi callq _memset LBB1_2: ## %for.end addq $8, %rsp ret llvm-svn: 122573	2010-12-26 23:42:51 +00:00
Chris Lattner	6cf8d6cc6e	start using irbuilder to make mem intrinsics in a few passes. llvm-svn: 122572	2010-12-26 22:57:41 +00:00
Chris Lattner	7c5f9c35d1	sketch more of this out. llvm-svn: 122567	2010-12-26 20:45:45 +00:00
Chris Lattner	9cb1035f94	move isBytewiseValue out to ValueTracking.h/cpp llvm-svn: 122565	2010-12-26 20:15:01 +00:00
Chris Lattner	81ae3f299a	actually add the file... llvm-svn: 122563	2010-12-26 19:39:38 +00:00
Chris Lattner	2ef535a4e4	Start of a pass for recognizing memset and memcpy idioms. No functionality yet. llvm-svn: 122562	2010-12-26 19:32:44 +00:00
Benjamin Kramer	30342fb1fd	Simplify code. llvm-svn: 122561	2010-12-26 15:23:45 +00:00
Chris Lattner	d729d0dcdb	don't lose TD info llvm-svn: 122556	2010-12-25 20:52:04 +00:00
Chris Lattner	20fca48341	switch the inliner alignment enforcement stuff to use the getOrEnforceKnownAlignment function, which simplifies the code and makes it stronger. llvm-svn: 122555	2010-12-25 20:42:38 +00:00
Chris Lattner	6fcd32e7d7	Move getOrEnforceKnownAlignment out of instcombine into Transforms/Utils. llvm-svn: 122554	2010-12-25 20:37:57 +00:00
Benjamin Kramer	b90b2f0635	Fix a thinko pointed out by Frits van Bommel: looking through global variables in isBytewiseValue is not safe. llvm-svn: 122550	2010-12-24 22:23:59 +00:00
Benjamin Kramer	ea9152e551	MemCpyOpt: Turn memcpys from a constant into a memset if possible. This allows us to compile "int cst[] = {-1, -1, -1};" into movl $-1, 16(%rsp) movq $-1, 8(%rsp) instead of movl _cst+8(%rip), %eax movl %eax, 16(%rsp) movq _cst(%rip), %rax movq %rax, 8(%rsp) llvm-svn: 122548	2010-12-24 21:17:12 +00:00
Owen Anderson	226ac14afb	When determining if we can fold (x >> C1) << C2, the bits that we need to verify are zero are not the low bits of x, but the bits that WILL be the low bits after the operation completes. llvm-svn: 122529	2010-12-23 23:56:24 +00:00
Owen Anderson	5d690d4168	It is possible for SimplifyCFG to cause PHI nodes to become redundant too late in the optimization pipeline to be caught by instcombine, and it's not feasible to catch them in SimplifyCFG because the use-lists are in an inconsistent state at the point where it could know that it need to simplify them. Instead, have CodeGenPrepare look for trivially redundant PHIs as part of its general cleanup effort. llvm-svn: 122516	2010-12-23 20:57:35 +00:00
Mon P Wang	18b762a946	Preserve the address space when generating bitcasts for MemTransferInst in ConvertToScalarInfo llvm-svn: 122462	2010-12-23 01:41:32 +00:00
Jeffrey Yasskin	9b43f33620	Change all self assignments X=X to (void)X, so that we can turn on a new gcc warning that complains on self-assignments and self-initializations. llvm-svn: 122458	2010-12-23 00:58:24 +00:00
Benjamin Kramer	8ef5001b27	InstCombine: creating selects from -1 and 0 is fine, they combine into a sext from i1. llvm-svn: 122453	2010-12-22 23:12:15 +00:00
Duncan Sands	fbb9ac3cca	Add a generic expansion transform: A op (B op' C) -> (A op B) op' (A op C) if both A op B and A op C simplify. This fires fairly often but doesn't make that much difference. On gcc-as-one-file it removes two "and"s and turns one branch into a select. llvm-svn: 122399	2010-12-22 13:36:08 +00:00
Duncan Sands	3547d2ebd8	Add some statistics, good for understanding how much more powerful instcombine is compared to instsimplify. llvm-svn: 122397	2010-12-22 09:40:51 +00:00
Owen Anderson	5ab8d4b5e5	Give GVN back the ability to perform simple conditional propagation on conditional branch values. I still think that LVI should be handling this, but that capability is some ways off in the future, and this matters for some significant benchmarks. llvm-svn: 122378	2010-12-21 23:54:34 +00:00
Owen Anderson	12470778d7	Remove dead code. llvm-svn: 122371	2010-12-21 22:31:24 +00:00
Benjamin Kramer	43493c089f	GVN's Expression is not POD-like (it contains a SmallVector). Simplify code while at it. llvm-svn: 122362	2010-12-21 21:30:19 +00:00
Duncan Sands	3b8af41a3e	Visit instructions deterministically. Use a FIFO so as to approximately visit instructions before their uses, since InstructionSimplify does a better job in that case. All this prompted by Frits van Bommel. llvm-svn: 122343	2010-12-21 17:08:55 +00:00
Duncan Sands	e7cbb64ec0	If an instruction simplifies, try again to simplify any uses of it. This is not very important since the pass is only used for testing, but it does make it more realistic. Suggested by Frits van Bommel. llvm-svn: 122336	2010-12-21 16:12:03 +00:00
Duncan Sands	d0eb6d39f8	Pull a few more simplifications out of instcombine (there are still plenty left though!), in particular for multiplication. llvm-svn: 122330	2010-12-21 14:00:22 +00:00
Duncan Sands	eaff500c7b	Oops, forgot to add the pass itself! llvm-svn: 122265	2010-12-20 21:07:42 +00:00
Duncan Sands	a436cbe4bf	Add a new convenience pass for testing InstructionSimplify. Previously it could only be tested indirectly, via instcombine, gvn or some other pass that makes use of InstructionSimplify, which means that testcases had to be carefully contrived to dance around any other transformations that that pass did. llvm-svn: 122264	2010-12-20 20:54:37 +00:00
Benjamin Kramer	f7957d0463	Add a check missing from my last commit and avoid a potential overflow situation. llvm-svn: 122258	2010-12-20 20:00:31 +00:00
Benjamin Kramer	2bca3a67b3	Reduce indentation. llvm-svn: 122249	2010-12-20 16:21:59 +00:00
Benjamin Kramer	68531baea9	Teach InstCombine to merge (icmp ult (X + CA), C1) \| (icmp eq X, C2) into (icmp ult (X + CA), C1 + 1) if C2 + CA == C1. InstCombine creates these so now we compile x == 23 \|\| x == 24 \|\| x == 25 to %x.off = add i32 %x, -23 %1 = icmp ult i32 %x.off, 3 instead of %x.off = add i32 %x, -23 %1 = icmp ult i32 %x.off, 2 %cmp3 = icmp eq i32 %x, 25 %ret2 = or i1 %1, %cmp3 llvm-svn: 122248	2010-12-20 16:18:51 +00:00
Chris Lattner	27ca8ebd4b	fix PR8807 by making transformConstExprCastCall aware of byval arguments. llvm-svn: 122238	2010-12-20 08:36:38 +00:00
Chris Lattner	7398965b67	various cleanups for transformConstExprCastCall llvm-svn: 122237	2010-12-20 08:25:06 +00:00
Chris Lattner	0f11495289	when eliding a byval copy due to inlining a readonly function, we have to make sure that the reused alloca has sufficient alignment. llvm-svn: 122236	2010-12-20 08:10:40 +00:00
Chris Lattner	0099744506	pull byval processing out to its own helper function. llvm-svn: 122235	2010-12-20 07:57:41 +00:00
Chris Lattner	7394680a00	fix PR8769, a miscompilation by inliner when inlining a function with a byval argument. The generated alloca has to have at least the alignment of the byval, if not, the client may be making assumptions that the new alloca won't satisfy. llvm-svn: 122234	2010-12-20 07:45:28 +00:00
Mon P Wang	1991c47ec1	Avoid dropping the address space when InstCombine optimizes memset llvm-svn: 122215	2010-12-20 01:05:30 +00:00
Chris Lattner	4fb9dd4c74	fix an oversight caught by Frits! llvm-svn: 122204	2010-12-19 23:24:04 +00:00
Chris Lattner	b6252a376a	tidy up llvm-svn: 122190	2010-12-19 20:24:28 +00:00
Chris Lattner	3e635d2e99	move a transformation to a more logical place, simplifying it. llvm-svn: 122183	2010-12-19 19:43:52 +00:00
Chris Lattner	5e0c0c72e9	recognize an unsigned add with overflow idiom into uadd. This resolves a README entry and technically resolves PR4916, but we still get poor code for the testcase in that PR because GVN isn't CSE'ing uadd with add, filed as PR8817. Previously we got: _test7: ## @test7 addq %rsi, %rdi cmpq %rdi, %rsi movl $42, %eax cmovaq %rsi, %rax ret Now we get: _test7: ## @test7 addq %rsi, %rdi movl $42, %eax cmovbq %rsi, %rax ret llvm-svn: 122182	2010-12-19 19:37:52 +00:00
Chris Lattner	33dc3f0cfa	optimize uadd(x, cst) into a comparison when the normal result is dead. This is required for my next patch to not regress the testsuite. llvm-svn: 122181	2010-12-19 19:35:32 +00:00
Chris Lattner	ce2995ae58	use IC.ReplaceInstUsesWith instead of a raw RAUW so that uses of the old thing end up on the instcombine worklist. Not doing this can cause an extra top-level iteration of instcombine, burning compile time. llvm-svn: 122179	2010-12-19 18:38:44 +00:00
Chris Lattner	79874566ce	generalize the sadd creation code to not require that the sadd formed is half the size of the original type. We can now compile this into a sadd.i8: unsigned char X(char a, char b) { int res = a+b; if ((unsigned )(res+128) > 255U) abort(); return res; } llvm-svn: 122178	2010-12-19 18:35:09 +00:00
Chris Lattner	c56c845377	fix another miscompile in the llvm.sadd formation logic: it wasn't checking to see if the high bits of the original add result were dead. Inserting a smaller add and zexting back to that size is not good enough. This is likely to be the fix for 8816. llvm-svn: 122177	2010-12-19 18:22:06 +00:00
Chris Lattner	f29562db25	fix a bug (possibly 8816) in the sadd forming xform: it isn't profitable (or safe) to promote code when the add-with-constant has other uses. llvm-svn: 122175	2010-12-19 17:59:02 +00:00
Chris Lattner	ee61c1d820	rework the code added in r122072 to pull it out to its own helper function, clean up comments, and reduce indentation. No functionality change. llvm-svn: 122174	2010-12-19 17:52:50 +00:00
Chris Lattner	408a684d29	Enhance LICM to promote alias sets whose pointers themselves are stored, which doesn't affect the memory address being promoted. llvm-svn: 122172	2010-12-19 05:57:25 +00:00
Chris Lattner	3337a81450	fix PR8602, a bug in an assertion: a volatile store of a pointer does not make the alias set for that pointer volatile, just stores to the pointer. llvm-svn: 122171	2010-12-19 05:51:54 +00:00
Chris Lattner	fb888622c3	revert r122164, I'm going to go with a different approach. llvm-svn: 122168	2010-12-19 04:23:03 +00:00
Chris Lattner	583ec6fa44	first step to fixing PR8642: don't fold away empty basic blocks which have trapping constant exprs in them due to PHI nodes. Eliminating them can cause the constant expr to be evalutated on new paths if the input edges are critical. llvm-svn: 122164	2010-12-19 03:02:34 +00:00
Chris Lattner	6b8b4855ff	simplify this a bit. llvm-svn: 122156	2010-12-18 20:22:49 +00:00
Bill Wendling	5e3605552e	Whitespace fixes. No functionality change. llvm-svn: 122110	2010-12-17 23:27:41 +00:00
Nate Begeman	7aa18bf46a	Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms. llvm-svn: 122105	2010-12-17 23:12:19 +00:00
Owen Anderson	1294ea7d53	Reapply r121905 (automatic synthesis of @llvm.sadd.with.overflow) with a fix for a bug that manifested itself on the DragonEgg self-host bot. Unfortunately, the testcase is pretty messy and doesn't reduce well due to interactions with other parts of InstCombine. llvm-svn: 122072	2010-12-17 18:08:00 +00:00
Benjamin Kramer	e5f49c4ff2	SimplifyCFG: Ranges can be larger than 64 bits. Fixes Release-selfhost build. llvm-svn: 122054	2010-12-17 10:48:14 +00:00
Chris Lattner	d14b0f1db7	improve switch formation to handle small range comparisons formed by comparisons. For example, this: void foo(unsigned x) { if (x == 0 \|\| x == 1 \|\| x == 3 \|\| x == 4 \|\| x == 6) bar(); } compiles into: _foo: ## @foo ## BB#0: ## %entry cmpl $6, %edi ja LBB0_2 ## BB#1: ## %entry movl %edi, %eax movl $91, %ecx btq %rax, %rcx jb LBB0_3 instead of: _foo: ## @foo ## BB#0: ## %entry cmpl $2, %edi jb LBB0_4 ## BB#1: ## %switch.early.test cmpl $6, %edi ja LBB0_3 ## BB#2: ## %switch.early.test movl %edi, %eax movl $88, %ecx btq %rax, %rcx jb LBB0_4 This catches a bunch of cases in GCC, which look like this: %804 = load i32* @which_alternative, align 4, !tbaa !0 %805 = icmp ult i32 %804, 2 %806 = icmp eq i32 %804, 3 %or.cond121 = or i1 %805, %806 %807 = icmp eq i32 %804, 4 %or.cond124 = or i1 %or.cond121, %807 br i1 %or.cond124, label %.thread, label %808 turning this into a range comparison. llvm-svn: 122045	2010-12-17 06:20:15 +00:00
Dan Gohman	93dc2b808f	Revert r64460. strtol and friends cannot be marked readonly, even with a null endptr argument, because they may write to errno. This fixes a seflhost miscompile observed on Linux targets when TBAA was enabled. llvm-svn: 122014	2010-12-17 01:09:43 +00:00
Frits van Bommel	9bbe849fc3	Fix a bug in the loop in JumpThreading::ProcessThreadableEdges() where it could falsely produce a MultipleDestSentinel value if the first predecessor ended with an 'indirectbr'. If that happened, it caused an unnecessary FindMostPopularDest() call. This wasn't a correctness problem, but it broke the fast path for single-predecessor blocks. llvm-svn: 121966	2010-12-16 12:16:00 +00:00
Duncan Sands	8d1ab6f6e1	Speculatively revert commit 121905 since it looks like it might have broken the dragonegg self-host buildbot. Original commit message: Add an InstCombine transform to recognize instances of manual overflow-safe addition (performing the addition in a wider type and explicitly checking for overflow), and fold them down to intrinsics. This currently only supports signed-addition, but could be generalized if someone works out the magic constant formulas for other operations. llvm-svn: 121965	2010-12-16 09:40:54 +00:00
Dan Gohman	e1a17a3473	Make memcpyopt TBAA-aware. llvm-svn: 121944	2010-12-16 02:51:19 +00:00
Dan Gohman	4467aa5294	Preserve TBAA tags when doing load PRE. llvm-svn: 121921	2010-12-15 23:53:55 +00:00
Owen Anderson	1cf8881299	Add an InstCombine transform to recognize instances of manual overflow-safe addition (performing the addition in a wider type and explicitly checking for overflow), and fold them down to intrinsics. This currently only supports signed-addition, but could be generalized if someone works out the magic constant formulas for other operations. Fixes <rdar://problem/8558713>. llvm-svn: 121905	2010-12-15 22:32:38 +00:00
Dan Gohman	a4fcd2418d	Move Value::getUnderlyingObject to be a standalone function so that it can live in Analysis instead of VMCore. llvm-svn: 121885	2010-12-15 20:02:24 +00:00
Duncan Sands	0a2c416894	Move Sub simplifications and additional Add simplifications out of instcombine and into InstructionSimplify. llvm-svn: 121861	2010-12-15 14:07:39 +00:00
Frits van Bommel	3d1803495e	Teach jump threading to "look through" a select when the branch direction of a terminator depends on it. When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately. llvm-svn: 121859	2010-12-15 09:51:20 +00:00
Chris Lattner	e893e2601e	make qsort predicate more conformant by returning 0 for equal values. llvm-svn: 121838	2010-12-15 04:52:41 +00:00
Owen Anderson	35609d97ae	Fix PR8790, another instance where unreachable code can cause instruction simplification to fail, this case involve a select that simplifies to itself. llvm-svn: 121817	2010-12-15 00:55:35 +00:00
Owen Anderson	15c85c916f	Cleanup trailing whitespace. llvm-svn: 121816	2010-12-15 00:52:44 +00:00
Chris Lattner	7499b452c1	- Insert new instructions before DomBlock's terminator, which is simpler than finding a place to insert in BB. - Don't perform the 'if condition hoisting' xform on certain i1 PHIs, as it interferes with switch formation. This re-fixes "example 7", without breaking the world hopefully. llvm-svn: 121764	2010-12-14 08:46:09 +00:00
Chris Lattner	335f0e4ad4	fix two significant issues with FoldTwoEntryPHINode: first, it can kick in on blocks whose conditions have been folded to a constant, even though one of the edges will be trivially folded. second, it doesn't clean up the "if diamond" that it just eliminated away. This is a problem because other simplifycfg xforms kick in depending on the order of block visitation, causing pointless work. llvm-svn: 121762	2010-12-14 08:01:53 +00:00
Chris Lattner	dc20a7d38c	remove the instsimplify logic I added in r121754. It is apparently breaking the selfhost builds, though I can't fathom how. llvm-svn: 121761	2010-12-14 07:53:03 +00:00
Chris Lattner	9ac168d0ab	clean up logic, convert std::set to SmallPtrSet, handle the case when all 2-entry phis are simplified away. llvm-svn: 121760	2010-12-14 07:41:39 +00:00
Chris Lattner	9fd838d31b	tidy up a bit, move DEBUG down to when we commit to doing the transform so we don't print it unless the xform happens. llvm-svn: 121758	2010-12-14 07:23:10 +00:00
Chris Lattner	b42d293faa	use SimplifyInstruction instead of reimplementing part of it. llvm-svn: 121757	2010-12-14 07:20:29 +00:00
Chris Lattner	fb73de482c	simplify GetIfCondition by using getSinglePredecessor. llvm-svn: 121756	2010-12-14 07:15:21 +00:00
Chris Lattner	0f4d67bd88	use AddPredecessorToBlock in 3 places instead of a manual loop. llvm-svn: 121755	2010-12-14 07:09:42 +00:00
Chris Lattner	a07cc6f4fd	make FoldTwoEntryPHINode use instsimplify a bit, make GetIfCondition faster by avoiding pred_iterator. No really interesting change. llvm-svn: 121754	2010-12-14 07:00:00 +00:00
Chris Lattner	afd2a8cfbb	remove the dead (and terrible) llvm::RemoveSuccessor function. llvm-svn: 121753	2010-12-14 06:51:55 +00:00
Chris Lattner	d7beca3782	improve DEBUG's a bit, switch to eraseFromParent() to simplify code a bit, switch from constant folding to instsimplify. llvm-svn: 121751	2010-12-14 06:17:25 +00:00
Chris Lattner	5a9d59d918	reapply my recent change that disables a piece of the switch formation work, but fixes 400.perlbmk. llvm-svn: 121749	2010-12-14 05:57:30 +00:00
Owen Anderson	3e5648896e	Fix recent buildbot breakage by pulling SimplifyCFG back to its state as of r121694, the most recent state where I'm confident there were no crashes or miscompilations. XFAIL the test added since then for now. llvm-svn: 121733	2010-12-13 23:49:28 +00:00
Chris Lattner	a6e5d5694a	temporarily disable part of my previous patch, which causes an iterator invalidation issue, causing a crash on some versions of perlbmk. llvm-svn: 121728	2010-12-13 23:02:19 +00:00
Chris Lattner	2d434e594e	add some DEBUG's. llvm-svn: 121711	2010-12-13 19:55:30 +00:00
Benjamin Kramer	1e155ab7e1	Fix sort predicate. qsort(3)'s predicate semantics differ from std::sort's. Fixes PR 8780. llvm-svn: 121705	2010-12-13 18:20:38 +00:00
Chris Lattner	fb836f8c1a	reinstate my patch: the miscompile was caused by an inverted branch in the 'and' case. llvm-svn: 121695	2010-12-13 08:12:19 +00:00
Chris Lattner	79db357d80	Completely disable the optimization I added in r121680 until I can track down a miscompile. This should bring the buildbots back to life llvm-svn: 121693	2010-12-13 07:41:29 +00:00
Chris Lattner	fbeb55844b	Make simplifycfg reprocess newly formed "br (cond1 \| cond2)" conditions when simplifying, allowing them to be eagerly turned into switches. This is the last step required to get "Example 7" from this blog post: http://blog.regehr.org/archives/320 On X86, we now generate this machine code, which (to my eye) seems better than the ICC generated code: _crud: ## @crud ## BB#0: ## %entry cmpb $33, %dil jb LBB0_4 ## BB#1: ## %switch.early.test addb $-34, %dil cmpb $58, %dil ja LBB0_3 ## BB#2: ## %switch.early.test movzbl %dil, %eax movabsq $288230376537592865, %rcx ## imm = 0x400000017001421 btq %rax, %rcx jb LBB0_4 LBB0_3: ## %lor.rhs xorl %eax, %eax ret LBB0_4: ## %lor.end movl $1, %eax ret llvm-svn: 121690	2010-12-13 07:00:06 +00:00
Chris Lattner	1d05761df4	make this logic a bit simpler. llvm-svn: 121689	2010-12-13 06:36:51 +00:00
Chris Lattner	25c3af35d8	split all the guts of SimplifyCFGOpt::run out into one function per terminator kind. llvm-svn: 121688	2010-12-13 06:25:44 +00:00
Chris Lattner	cb570f87e5	fix a bug in r121680 that upset the various buildbots. llvm-svn: 121687	2010-12-13 05:34:18 +00:00
Chris Lattner	a6db741f3d	refactor the speculative execution logic to be factored into the cond branch code instead of doing a cfg search for every block simplified. llvm-svn: 121686	2010-12-13 05:26:52 +00:00
Chris Lattner	466f54ffcf	simplify a bunch of code. llvm-svn: 121685	2010-12-13 05:20:28 +00:00
Chris Lattner	6df7bdd810	move HoistThenElseCodeToIf up to a more logical and efficient-to-handle place. llvm-svn: 121684	2010-12-13 05:15:29 +00:00
Chris Lattner	2e3832d9a0	move 'MergeBlocksIntoPredecessor' call earlier. Use getSinglePredecessor to simplify code. llvm-svn: 121683	2010-12-13 05:10:48 +00:00
Chris Lattner	a69c443459	factor new code out to a SimplifyBranchOnICmpChain helper function. llvm-svn: 121681	2010-12-13 05:03:41 +00:00
Chris Lattner	a442f24a36	enhance the "change or icmp's into switch" xform to handle one value in an 'or sequence' that it doesn't understand. This allows us to optimize something insane like this: int crud (unsigned char c, unsigned x) { if(((((((((( (int) c <= 32 \|\| (int) c == 46) \|\| (int) c == 44) \|\| (int) c == 58) \|\| (int) c == 59) \|\| (int) c == 60) \|\| (int) c == 62) \|\| (int) c == 34) \|\| (int) c == 92) \|\| (int) c == 39) != 0) foo(); } into: define i32 @crud(i8 zeroext %c, i32 %x) nounwind ssp noredzone { entry: %cmp = icmp ult i8 %c, 33 br i1 %cmp, label %if.then, label %switch.early.test switch.early.test: ; preds = %entry switch i8 %c, label %if.end [ i8 39, label %if.then i8 44, label %if.then i8 58, label %if.then i8 59, label %if.then i8 60, label %if.then i8 62, label %if.then i8 46, label %if.then i8 92, label %if.then i8 34, label %if.then ] by pulling the < comparison out ahead of the newly formed switch. llvm-svn: 121680	2010-12-13 04:50:38 +00:00
Chris Lattner	5a177e681e	merge two very similar functions into one that has a bool argument. llvm-svn: 121678	2010-12-13 04:26:26 +00:00
Chris Lattner	9b1af510cb	don't bother handling non-canonical icmp's llvm-svn: 121676	2010-12-13 04:18:32 +00:00
Chris Lattner	395252d93e	inline a function, making the result much simpler. llvm-svn: 121675	2010-12-13 04:15:19 +00:00
Chris Lattner	62cc76e9cc	Fix my previous patch to handle a degenerate case that the llvm-gcc bootstrap buildbot tripped over. llvm-svn: 121674	2010-12-13 03:43:57 +00:00
Chris Lattner	11dafaa3ec	convert some methods to be static functions llvm-svn: 121673	2010-12-13 03:30:12 +00:00
Chris Lattner	4642d79fb0	zap two more std::sorts. llvm-svn: 121672	2010-12-13 03:24:30 +00:00
Chris Lattner	d9bacc088a	fix a fairly serious oversight with switch formation from or'd conditions. Previously we'd compile something like this: int crud (unsigned char c) { return c == 62 \|\| c == 34 \|\| c == 92; } into: switch i8 %c, label %lor.rhs [ i8 62, label %lor.end i8 34, label %lor.end ] lor.rhs: ; preds = %entry %cmp8 = icmp eq i8 %c, 92 br label %lor.end lor.end: ; preds = %entry, %entry, %lor.rhs %0 = phi i1 [ true, %entry ], [ %cmp8, %lor.rhs ], [ true, %entry ] %lor.ext = zext i1 %0 to i32 ret i32 %lor.ext which failed to merge the compare-with-92 into the switch. With this patch we simplify this all the way to: switch i8 %c, label %lor.rhs [ i8 62, label %lor.end i8 34, label %lor.end i8 92, label %lor.end ] lor.rhs: ; preds = %entry br label %lor.end lor.end: ; preds = %entry, %entry, %entry, %lor.rhs %0 = phi i1 [ true, %entry ], [ false, %lor.rhs ], [ true, %entry ], [ true, %entry ] %lor.ext = zext i1 %0 to i32 ret i32 %lor.ext which is much better for codegen's switch lowering stuff. This kicks in 33 times on 176.gcc (for example) cutting 103 instructions off the generated code. llvm-svn: 121671	2010-12-13 03:18:54 +00:00
Chris Lattner	73a58627c3	simplify code and reduce indentation llvm-svn: 121670	2010-12-13 02:38:13 +00:00
Chris Lattner	7c8e6047d6	convert an std::sort to array_pod_sort. llvm-svn: 121669	2010-12-13 02:00:58 +00:00
Chris Lattner	1475987634	move the "br (X == 0 \| X == 1), T, F" -> switch optimization to a new location in simplifycfg. In the old days, SimplifyCFG was never run on the entry block, so we had to scan over all preds of the BB passed into simplifycfg to do this xform, now we can just check blocks ending with a condbranch. This avoids a scan over all preds of every simplified block, which should be a significant compile-time perf win on functions with lots of edges. No functionality change. llvm-svn: 121668	2010-12-13 01:57:34 +00:00
Chris Lattner	4088e2b8e4	reduce indentation and generally simplify code, no functionality change. llvm-svn: 121667	2010-12-13 01:47:07 +00:00
Chris Lattner	7cb7867d7a	use getFirstNonPHIOrDbg to simplify this code. llvm-svn: 121664	2010-12-13 01:28:06 +00:00
Benjamin Kramer	c4169cebe3	Generalize the and-icmp-select instcombine further by allowing selects of the form (x & 2^n) ? 2^m+C : C we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the select to a shift and apply the offset afterwards. llvm-svn: 121609	2010-12-11 10:49:22 +00:00
Benjamin Kramer	c8b035d006	Factor the (x & 2^n) ? 2^m : 0 instcombine into its own method and generalize it to catch cases where n != m with a shift. llvm-svn: 121608	2010-12-11 09:42:59 +00:00
Chris Lattner	bc4457e317	enhance memcpyopt to zap memcpy's that have the same src/dst. llvm-svn: 121362	2010-12-09 07:45:45 +00:00
Chris Lattner	fd51c52ef6	fix PR8753, eliminating a case where we'd infinitely make a substitution because it doesn't actually change the IR. Patch by Jakub Staszak! llvm-svn: 121361	2010-12-09 07:39:50 +00:00
Dan Gohman	a32986e899	Really check that the bits that will become zero are actually already zero before eliminating the operation that zeros them. This fixes rdar://8739316. llvm-svn: 121353	2010-12-09 02:52:17 +00:00
Frits van Bommel	d2f4b09e10	Remove some dead code from the jump threading pass. The last uses of these functions were removed in r113852 when LazyValueInfo was permanently enabled and removed the need for them. llvm-svn: 121133	2010-12-07 13:08:07 +00:00
Jay Foad	583abbc4df	PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method trunc(), to be const and to return a new value instead of modifying the object in place. llvm-svn: 121120	2010-12-07 08:25:19 +00:00
Chris Lattner	0d71c4f564	reapply r121100 with a tweak to constant fold ConstExprs with TargetData (if available) as we go so that we get simple constantexprs not insane ones. This fixes the failure of clang/test/CodeGenCXX/virtual-base-ctor.cpp that the previous iteration of this patch had. llvm-svn: 121111	2010-12-07 04:33:29 +00:00
Eric Christopher	f10dcfb9fb	Temporarily revert r121100 as it's causing clang to fail CodeGenCXX/virtual-base-ctor.cpp. llvm-svn: 121102	2010-12-07 02:41:11 +00:00
Chris Lattner	287f4366c1	fix PR8710 - teach global opt that some constantexprs are too complex to put in a global variable's initializer. llvm-svn: 121100	2010-12-07 01:59:32 +00:00
Frits van Bommel	d9df6eaa9c	Implement jump threading of 'indirectbr' by keeping track of whether we're looking for ConstantInts or BlockAddresss. llvm-svn: 121066	2010-12-06 23:36:56 +00:00
Chris Lattner	7ff0ba41bd	replace a linear scan with a symtab lookup, reduce indentation. No functionality change. llvm-svn: 121042	2010-12-06 21:53:07 +00:00
Chris Lattner	4dc53e37d9	Use a stronger predicate here, pointed out by Duncan llvm-svn: 121040	2010-12-06 21:48:10 +00:00
Chris Lattner	ca335e38cf	add some DEBUG statements. llvm-svn: 121038	2010-12-06 21:13:51 +00:00
Chris Lattner	fb212de06d	Fix PR8735, a really terrible problem in the inliner's "alloca merging" optimization. Consider: static void foo() { A = alloca ... } static void bar() { B = alloca ... call foo(); } void main() { bar() } The inliner proceeds bottom up, but lets pretend it decides not to inline foo into bar. When it gets to main, it inlines bar into main(), and says "hey, I just inlined an alloca "B" into main, lets remember that. Then it keeps going and finds that it now contains a call to foo. It decides to inline foo into main, and says "hey, foo has an alloca A, and I have an alloca B from another inlined call site, lets reuse it". The problem with this of course, is that the lifetime of A and B are nested, not disjoint. Unfortunately I can't create a reasonable testcase for this: the one in the PR is both huge and extremely sensitive, because you minor tweaks end up causing foo to get inlined into bar too early. We already have tests for the basic alloca merging optimization and this does not break them. llvm-svn: 120995	2010-12-06 07:52:42 +00:00
Chris Lattner	cd3af96a8f	improve comment llvm-svn: 120994	2010-12-06 07:43:04 +00:00
Chris Lattner	5b6a865f2e	improve -debug output and comments a little. llvm-svn: 120993	2010-12-06 07:38:40 +00:00
Chris Lattner	94fbdf3814	Fix PR8728, a miscompilation I recently introduced. When optimizing memcpy's like: memcpy(A, B) memcpy(A, C) we cannot delete the first memcpy as dead if A and C might be aliases. If so, we actually get: memcpy(A, B) memcpy(A, A) which is not correct to transform into: memcpy(A, A) This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks Jakub! llvm-svn: 120974	2010-12-06 01:48:06 +00:00
Frits van Bommel	76244867cf	Refactor jump threading. Should have no functional change other than the order of two transformations that are mutually-exclusive and the exact formatting of debug output. Internally, it now stores the ConstantInts as Constants, and actual undef values instead of nulls. llvm-svn: 120946	2010-12-05 19:06:41 +00:00
Frits van Bommel	5e75ef4a8e	Remove trailing whitespace. llvm-svn: 120945	2010-12-05 19:02:47 +00:00
Frits van Bommel	8fb69ee805	Teach SimplifyCFG to turn (indirectbr (select cond, blockaddress(@fn, BlockA), blockaddress(@fn, BlockB))) into (br cond, BlockA, BlockB). llvm-svn: 120943	2010-12-05 18:29:03 +00:00
Jay Foad	25a5e4ca1f	PR5207: Rename overloaded APInt methods set(), clear(), flip() to setAllBits(), setBit(unsigned), etc. llvm-svn: 120564	2010-12-01 08:53:58 +00:00
Chris Lattner	1c577b54b0	fix a bozo bug I introduced in r119930, causing a miscompile of 20040709-1.c from the gcc testsuite. I was using the size of a pointer instead of the pointee. This fixes rdar://8713376 llvm-svn: 120519	2010-12-01 01:24:55 +00:00
Chris Lattner	903add84d9	Enhance DSE to handle the variable index case in PR8657. llvm-svn: 120498	2010-11-30 23:43:23 +00:00
Chris Lattner	c0f3379ae0	teach DSE to use GetPointerBaseWithConstantOffset to analyze may-aliasing stores that partially overlap with different base pointers. This implements PR6043 and the non-variable part of PR8657 llvm-svn: 120485	2010-11-30 23:05:20 +00:00
Chris Lattner	e28618de59	move GetPointerBaseWithConstantOffset out of GVN into ValueTracking.h llvm-svn: 120476	2010-11-30 22:25:26 +00:00
Chris Lattner	50162e3c2a	remove a fixed fixme llvm-svn: 120474	2010-11-30 22:18:11 +00:00
Chris Lattner	6712251f41	Make DeleteDeadInstruction be a static function, move some code around. llvm-svn: 120471	2010-11-30 21:58:14 +00:00
Chris Lattner	51d67ce2ff	switch RemoveAccessedObjects to use AliasAnalysis::Location to simplify the code. We now get accurate sizes on Loads, though it surely doesn't matter in practice. llvm-svn: 120469	2010-11-30 21:47:58 +00:00
Chris Lattner	f80b39986f	two improvements to RemoveAccessedObjects: 1. if the underlying pointer passed in can be resolved to any argument or alloca, then we don't need to scan. Previously we would only avoid the scan if the alloca or byval was actually considered dead. 2. The dead store processing code is itself completely dead and didn't handle volatile stores right anyway, so delete it. This allows simplifying the interface to RemoveAccessedObjects. llvm-svn: 120467	2010-11-30 21:38:30 +00:00
Chris Lattner	7fe08b67fa	remove the "undead" terminology, which is nonstandard and never made sense to me. We now have a set of dead stack objects, and they become live when loaded. Fix a theoretical problem where we'd pass in the wrong pointer to the alias query. llvm-svn: 120465	2010-11-30 21:32:12 +00:00
Chris Lattner	127818d746	move call handling in handleEndBlock up a bit, and simplify it. If the call might read all the allocas, stop scanning early. Convert a vector to smallvector, shrink SmallPtrSet to 16 instead of 64 to avoid crazy linear scans. llvm-svn: 120463	2010-11-30 21:18:46 +00:00
Dale Johannesen	d3a58c8fa1	Avoid exponential growth of a table. It feels like there should be a better way to do this. PR 8679. llvm-svn: 120457	2010-11-30 20:23:21 +00:00
Chris Lattner	60a8b3dab8	various cleanups and code simplification llvm-svn: 120454	2010-11-30 19:48:15 +00:00
Chris Lattner	51c28a93cc	make getPointerSize a static function. Add ivars to DSE for AA and MD pass info instead of using getAnalysis<> all over. llvm-svn: 120453	2010-11-30 19:34:42 +00:00
Chris Lattner	77d79fa25f	reduce indentation, clean up TD use a bit. llvm-svn: 120452	2010-11-30 19:28:23 +00:00
Chris Lattner	b63ba73b1b	enhance isRemovable to refuse to delete volatile mem transfers now that DSE hacks on them. This fixes a regression I introduced, by generalizing DSE to hack on transfers. llvm-svn: 120445	2010-11-30 19:12:10 +00:00
Chris Lattner	58b779e9c2	Rewrite the main DSE loop to be written in terms of reasoning about pairs of AA::Location's instead of looking for MemDep's "Def" predicate. This is more powerful and general, handling memset/memcpy/store all uniformly, and implementing PR8701 and probably obsoleting parts of memcpyoptimizer. This also fixes an obscure bug with init.trampoline and i8 stores, but I'm not surprised it hasn't been hit yet. Enhancing init.trampoline to carry the size that it stores would allow DSE to be much more aggressive about optimizing them. llvm-svn: 120406	2010-11-30 07:23:21 +00:00
Anders Carlsson	e3ea1cba79	Add a puts optimization that converts puts() to putchar('\n'). llvm-svn: 120398	2010-11-30 06:19:18 +00:00
Chris Lattner	3590ef817c	rename a function and reduce some indentation, no functionality change. llvm-svn: 120391	2010-11-30 05:30:45 +00:00
Chris Lattner	b438ef236c	remove the pointless check of MemoryUseIntrinsic from is trivially dead, since these have side effects. This makes the (misnamed) MemoryUseIntrinsic class dead, so remove it. llvm-svn: 120382	2010-11-30 02:03:47 +00:00
Chris Lattner	2227a8a192	rename doesClobberMemory -> hasMemoryWrite to be more specific, and remove an actively-wrong comment. llvm-svn: 120378	2010-11-30 01:37:52 +00:00
Chris Lattner	9d179d911d	clean up handling of 'free', detangling it from everything else. It can be seriously improved, but at least now it isn't intertwined with the other logic. llvm-svn: 120377	2010-11-30 01:28:33 +00:00
Chris Lattner	9a146372b5	Teach basicaa that memset's modref set is at worst "mod" and never contains "ref". Enhance DSE to use a modref query instead of a store-specific hack to generalize the "ignore may-alias stores" optimization to handle memset and memcpy. llvm-svn: 120368	2010-11-30 00:28:45 +00:00
Chris Lattner	c3c754f750	my previous patch would cause us to start deleting some volatile stores, fix and add a testcase. llvm-svn: 120363	2010-11-30 00:12:39 +00:00
Chris Lattner	d4f1090948	two changes to DSE that shouldn't affect anything: 1. Don't bother trying to optimize: lifetime.end(ptr) store(ptr) as it is undefined, and therefore shouldn't exist. 2. Move the 'storing a loaded pointer' xform up, simplifying the may-aliased store code. llvm-svn: 120359	2010-11-30 00:01:19 +00:00
Chris Lattner	b4df1d5a3e	prune an llvmcontext include and simplify some code. llvm-svn: 120347	2010-11-29 23:35:33 +00:00
Chris Lattner	2e8793482c	fix PR8677, patch by Jakub Staszak! llvm-svn: 120325	2010-11-29 21:59:31 +00:00
Frits van Bommel	28218aa8f1	Transform (extractvalue (load P), ...) to (load (gep P, 0, ...)) if the load has no other uses, shrinking the load. llvm-svn: 120323	2010-11-29 21:56:20 +00:00
Owen Anderson	8ba5f39f70	Second attempt at fixing the performance regressions introduced by my recent GVN improvement. Looking through a single layer of PHI nodes when attempting to sink GEPs, we need to iteratively look through arbitrary PHI nests. llvm-svn: 120202	2010-11-27 08:15:55 +00:00
Nick Lewycky	b8de00ee07	Treat a call of function pointer like a load of the pointer when considering whether the pointer can be replaced with the global variable it is a copy of. Fixes PR8680. llvm-svn: 120126	2010-11-24 22:04:20 +00:00
Duncan Sands	0488d564e1	Rename SimplifyDistributed to the more meaningfull name SimplifyByFactorizing. llvm-svn: 120051	2010-11-23 20:42:39 +00:00
Benjamin Kramer	94a622af4c	The srem -> urem transform is not safe for any divisor that's not a power of two. E.g. -5 % 5 is 0 with srem and 1 with urem. Also addresses Frits van Bommel's comments. llvm-svn: 120049	2010-11-23 20:33:57 +00:00
Duncan Sands	433c1679cf	Replace calls to ConstantFoldInstruction with calls to SimplifyInstruction in two places that are really interested in simplified instructions, not constants. llvm-svn: 120044	2010-11-23 20:26:33 +00:00
Duncan Sands	bb2cd025a9	Constant folding here is pointless, because InstructionSimplify (which does constant folding and more) is called a few lines later. llvm-svn: 120042	2010-11-23 20:24:21 +00:00
Benjamin Kramer	b5afa65b0a	InstCombine: Reduce "X shift (A srem B)" to "X shift (A urem B)" iff B is positive. This allows to transform the rem in "1 << ((int)x % 8);" to an and. llvm-svn: 120028	2010-11-23 18:52:42 +00:00
Duncan Sands	60813f96e0	Propagate LeftDistributes and RightDistributes into their only uses. Stylistic improvement suggested by Frits van Bommel. llvm-svn: 120026	2010-11-23 15:28:14 +00:00
Duncan Sands	22df741687	Fix typo pointed out by Frits van Bommel and Marius Wachtler. llvm-svn: 120025	2010-11-23 15:25:34 +00:00
Duncan Sands	adc7771f18	Exploit distributive laws (eg: And distributes over Or, Mul over Add, etc) in a fairly systematic way in instcombine. Some of these cases were already dealt with, in which case I removed the existing code. The case of Add has a bunch of funky logic which covers some of this plus a few variants (considers shifts to be a form of multiplication), which I didn't touch. The simplification performed is: AB+AC -> A(B+C). The improvement is to do this in cases that were not already handled [such as AB-AC -> A(B-C), which was reported on the mailing list], and also to do it more often by not checking for "only one use" if "B+C" simplifies. llvm-svn: 120024	2010-11-23 14:23:47 +00:00
Chris Lattner	e5afa15b77	duncan's spider sense was right, I completely reversed the condition on this instcombine xform. This fixes a miscompilation of 403.gcc. llvm-svn: 119988	2010-11-23 02:42:04 +00:00
Benjamin Kramer	f1ebb63161	InstCombine: Implement X - A-B -> X + AB. llvm-svn: 119984	2010-11-22 20:31:27 +00:00
Duncan Sands	c133c54426	If a GEP index simply advances by multiples of a type of zero size, then replace the index with zero. llvm-svn: 119974	2010-11-22 16:32:50 +00:00
Duncan Sands	8a0f486e36	Move the "gep undef" -> "undef" transform from instcombine to InstructionSimplify. llvm-svn: 119970	2010-11-22 13:42:49 +00:00
Duncan Sands	c6648eb4c3	Don't keep track of inserted phis in PromoteMemoryToRegister: the information is never used. Patch by Cameron Zwarich. llvm-svn: 119963	2010-11-22 09:41:24 +00:00
Chris Lattner	fc9aead6fd	fix comment llvm-svn: 119948	2010-11-21 19:05:34 +00:00
Chris Lattner	5957229659	rework some DSE paths to use the newly-public "getPointerDependencyFrom" method in MemDep instead of inserting an instruction, doing a query, then removing it. Neither operation is effectively cached. llvm-svn: 119930	2010-11-21 08:06:10 +00:00
Chris Lattner	e48c31ce33	implement PR8576, deleting dead stores with intervening may-alias stores. llvm-svn: 119927	2010-11-21 07:34:32 +00:00
Chris Lattner	f7e896138e	optimize: void a(int x) { if (((1<<x)&8)==0) b(); } into "x != 3", which occurs over 100 times in 403.gcc but in no other program in llvm-test. llvm-svn: 119922	2010-11-21 06:44:42 +00:00
Chris Lattner	58f9f58716	Implement PR8644: forwarding a memcpy value to a byval, allowing the memcpy to be eliminated. Unfortunately, the requirements on byval's without explicit alignment are really weak and impossible to predict in the mid-level optimizer, so this doesn't kick in much with current frontends. The fix is to change clang to set alignment on all byval arguments. llvm-svn: 119916	2010-11-21 00:28:59 +00:00
Benjamin Kramer	ddd1b7b801	Simplify code. No change in functionality. llvm-svn: 119908	2010-11-20 18:43:35 +00:00
Owen Anderson	ea326db47b	Document the new GVN number table structure. llvm-svn: 119865	2010-11-19 22:48:40 +00:00
Owen Anderson	dfb8c3bbfc	When folding addressing modes in CodeGenPrepare, attempt to look through PHI nodes if all the operands of the PHI are equivalent. This allows CodeGenPrepare to undo unprofitable PRE transforms. llvm-svn: 119853	2010-11-19 22:15:03 +00:00
Duncan Sands	aef146b890	Factor code for testing whether replacing one value with another preserves LCSSA form out of ScalarEvolution and into the LoopInfo class. Use it to check that SimplifyInstruction simplifications are not breaking LCSSA form. Fixes PR8622. llvm-svn: 119727	2010-11-18 19:59:41 +00:00
Owen Anderson	c21c100f3d	Completely rework the datastructure GVN uses to represent the value number to leader mapping. Previously, this was a tree of hashtables, and a query recursed into the table for the immediate dominator ad infinitum if the initial lookup failed. This led to really bad performance on tall, narrow CFGs. We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually represented by a hashtable with a list of Value*'s as the value type), and then determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by DominatorTree. Because there are typically few duplicates of a given value, this scan tends to be quite fast. Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary allocation in representing the value-side of the multimap. This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I think is pretty good considering that includes all the "real work" being done by MemDep as well. The one downside to this approach is that we can no longer use GVN to perform simple conditional progation, but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up the slack. If you see conditional propagation that's not happening, please file bugs against LVI or CVP. llvm-svn: 119714	2010-11-18 18:32:40 +00:00
Chris Lattner	1385dff8c0	slightly simplify code and substantially improve comment. Instead of saying "it would be bad", give an example of what is going on. llvm-svn: 119695	2010-11-18 08:07:09 +00:00
Chris Lattner	731caac7c6	remove a pointless restriction from memcpyopt. It was refusing to optimize two memcpy's like this: copy A <- B copy C <- A if it couldn't prove that noalias(B,C). We can eliminate the copy by producing a memmove instead of memcpy. llvm-svn: 119694	2010-11-18 08:00:57 +00:00
Chris Lattner	c274a83442	remove another pointless noalias check: M is a memcpy, so the source and dest are known to not overlap. llvm-svn: 119692	2010-11-18 07:39:57 +00:00
Chris Lattner	75cfe98534	use AA::isNoAlias instead of open coding it. Remove an extraneous noalias check: there is no need to check to see if the source and dest of a memcpy are noalias, behavior is undefined if not. llvm-svn: 119691	2010-11-18 07:38:43 +00:00
Chris Lattner	1e37bbafbb	finish a thought. llvm-svn: 119690	2010-11-18 07:32:33 +00:00
Chris Lattner	7e9b2ea3bf	rearrange some code, splitting memcpy/memcpy optimization out of processMemCpy into its own function. llvm-svn: 119687	2010-11-18 07:02:37 +00:00
Chris Lattner	ac5701319b	allow eliminating an alloca that is just copied from an constant global if it is passed as a byval argument. The byval argument will just be a read, so it is safe to read from the original global instead. This allows us to promote away the %agg.tmp alloca in PR8582 llvm-svn: 119686	2010-11-18 06:41:51 +00:00
Chris Lattner	f183d5c4be	enhance the "alloca is just a memcpy from constant global" to ignore calls that obviously can't modify the alloca because they are readonly/readnone. llvm-svn: 119683	2010-11-18 06:26:49 +00:00
Chris Lattner	7aeae25c78	fix a small oversight in the "eliminate memcpy from constant global" optimization. If the alloca that is "memcpy'd from constant" also has a memcpy from it, ignore it: it is a load. We now optimize the testcase to: define void @test2() { %B = alloca %T %a = bitcast %T* @G to i8* %b = bitcast %T* %B to i8* call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false) call void @bar(i8* %b) ret void } previously we would generate: define void @test() { %B = alloca %T %b = bitcast %T* %B to i8* %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0 %tmp3 = load i8* %G.0, align 4 %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1 %G.15 = bitcast [123 x i8]* %G.1 to i8* %1 = bitcast [123 x i8]* %G.1 to i984* %srcval = load i984* %1, align 1 %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0 store i8 %tmp3, i8* %B.0, align 4 %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1 %B.12 = bitcast [123 x i8]* %B.1 to i8* %2 = bitcast [123 x i8]* %B.1 to i984* store i984 %srcval, i984* %2, align 1 call void @bar(i8* %b) ret void } llvm-svn: 119682	2010-11-18 06:20:47 +00:00
Dan Gohman	20d9ce21ef	Move SCEV::dominates and properlyDominates to ScalarEvolution. llvm-svn: 119570	2010-11-17 21:41:58 +00:00
Dan Gohman	afd6db9932	Move SCEV::isLoopInvariant and hasComputableLoopEvolution to be member functions of ScalarEvolution, in preparation for memoization and other optimizations. llvm-svn: 119562	2010-11-17 21:23:15 +00:00
Dan Gohman	1ee6d24072	Reference ScalarEvolution by name rather than directly in LICM, to avoid an unneeded dependence. llvm-svn: 119557	2010-11-17 20:50:07 +00:00
Benjamin Kramer	07726c7d52	InstCombine: Add a missing irem identity (X % X -> 0). llvm-svn: 119538	2010-11-17 19:11:46 +00:00
Duncan Sands	c89ac07e7a	Move some those Xor simplifications which don't require creating new instructions out of InstCombine and into InstructionSimplify. While there, introduce an m_AllOnes pattern to simplify matching with integers and vectors with all bits equal to one. llvm-svn: 119536	2010-11-17 18:52:15 +00:00
Duncan Sands	9d9a4e2ca2	Have InlineFunction use SimplifyInstruction rather than hasConstantValue. I was leery of using SimplifyInstruction while the IR was still in a half-baked state, which is the reason for delaying the simplification until the IR is fully cooked. llvm-svn: 119494	2010-11-17 11:16:23 +00:00
Duncan Sands	ba0b22c785	Have RemovePredecessorAndSimplify you SimplifyInstruction rather than hasConstantValue. llvm-svn: 119457	2010-11-17 04:12:05 +00:00
Duncan Sands	72313843d5	Remove dead code in GVN: now that SimplifyInstruction is called systematically, CollapsePhi will always return null here. Note that CollapsePhi did an extra check, isSafeReplacement, which the SimplifyInstruction logic does not do. I think that check was bogus - I guess we will soon find out! (It was originally added in commit 41998 without a testcase). llvm-svn: 119456	2010-11-17 04:05:21 +00:00
Duncan Sands	637049515f	Have a few places that want to simplify phi nodes use SimplifyInstruction rather than calling hasConstantValue. No intended functionality change. llvm-svn: 119352	2010-11-16 17:41:24 +00:00
Duncan Sands	b99f39b9f6	If dom tree information is available, make it possible to pass it to get better phi node simplification. llvm-svn: 119055	2010-11-14 18:36:10 +00:00
Duncan Sands	4581ddc123	Teach InstructionSimplify about phi nodes. I chose to have it simply offload the work to hasConstantValue rather than do something more complicated (such handling mutually recursive phis) because (1) it is not clear it is worth it; and (2) if it is worth it, maybe such logic would be better placed in hasConstantValue. Adjust some GVN tests which are now cleaned up much further (eg: all phi nodes are removed). llvm-svn: 119043	2010-11-14 13:30:18 +00:00
Duncan Sands	641baf1646	Generalize the reassociation transform in SimplifyCommutative (now renamed to SimplifyAssociativeOrCommutative) "(A op C1) op C2" -> "A op (C1 op C2)", which previously was only done if C1 and C2 were constants, to occur whenever "C1 op C2" simplifies (a la InstructionSimplify). Since the simplifying operand combination can no longer be assumed to be the right-hand terms, consider all of the possible permutations. When compiling "gcc as one big file", transform 2 (i.e. using right-hand operands) fires about 4000 times but it has to be said that most of the time the simplifying operands are both constants. Transforms 3, 4 and 5 each fired once. Transform 6, which is an existing transform that I didn't change, never fired. With this change, the testcase is now optimized perfectly with one run of instcombine (previously it required instcombine + reassociate + instcombine, and it may just have been luck that this worked). llvm-svn: 119002	2010-11-13 15:10:37 +00:00
Duncan Sands	246b71c596	Have GVN simplify instructions as it goes. For example, consider "%z = %x and %y". If GVN can prove that %y equals %x, then it turns this into "%z = %x and %x". With the new code, %z will be replaced with %x everywhere (and then deleted). Previously %z would be value numbered too, which is a waste of time. Also, while a clever value numbering algorithm would give %z the same value number as %x, our current one doesn't do so (at least I don't think it does). The new logic has an essentially equivalent effect to what you would get if %z was given the same value number as %x, i.e. it should make value numbering smarter. While there, get hold of target data once at the start rather than a gazillion times all over the place. llvm-svn: 118923	2010-11-12 21:10:24 +00:00
Dan Gohman	d4b7fff2e8	Enhance DSE to handle the case where a free call makes more than one store dead. This is especially noticeable in SingleSource/Benchmarks/Shootout/objinst. llvm-svn: 118875	2010-11-12 02:19:17 +00:00
Dan Gohman	65316d6749	Add helper functions for computing the Location of load, store, and vaarg instructions. llvm-svn: 118845	2010-11-11 21:50:19 +00:00
Dan Gohman	a826a88755	Factor out Instruction::isSafeToSpeculativelyExecute's code for testing for dereferenceable pointers into a helper function, isDereferenceablePointer. Teach it how to reason about GEPs with simple non-zero indices. Also eliminate ArgumentPromtion's IsAlwaysValidPointer, which didn't check for weak externals or out of range gep indices. llvm-svn: 118840	2010-11-11 21:23:25 +00:00
Dan Gohman	dcdfd8dd24	TBAA-enable ArgumentPromotion. llvm-svn: 118804	2010-11-11 18:09:32 +00:00
Dan Gohman	0cc4c7516e	Make Sink tbaa-aware. llvm-svn: 118788	2010-11-11 16:21:47 +00:00
Dan Gohman	c3b4ea7b7d	It's safe to sink some instructions which are not safe to speculatively execute. Make Sink's predicate more precise. llvm-svn: 118787	2010-11-11 16:20:28 +00:00
Dan Gohman	0a6021a54d	Enhance GVN to do more precise alias queries for non-local memory references. For example, this allows gvn to eliminate the load in this example: void foo(int n, int* p, int q) { p[0] = 0; p[1] = 1; if (n) { q = p[0]; } } llvm-svn: 118714	2010-11-10 20:37:15 +00:00
Dan Gohman	d209911642	Use getValueOperand() and getPointerOperand() on load and store instructions instead of hard-coding operand numbers. llvm-svn: 118698	2010-11-10 19:03:33 +00:00
Dan Gohman	066c1bb1e9	Add a doesAccessArgPointees helper function, and update code to use it, and to be consistent. llvm-svn: 118692	2010-11-10 18:17:28 +00:00
Dan Gohman	2577580967	Factor out the code for testing whether a function accesses arbitrary memory into a helper function, and adjust some comments. llvm-svn: 118687	2010-11-10 17:34:04 +00:00
Dale Johannesen	0171dc30ff	When checking that the necessary bits are zero in order to reduce ((x<<30)>>24) to x<<6, check the correct bits. PR 8547. llvm-svn: 118665	2010-11-10 01:30:56 +00:00
Dan Gohman	2694e14087	Make ModRefBehavior a lattice. Use this to clean up AliasAnalysis chaining and simplify FunctionAttrs' GetModRefBehavior logic. llvm-svn: 118660	2010-11-10 01:02:18 +00:00
Dan Gohman	e3467a7687	Teach FunctionAttrs about the VAArg instruction. llvm-svn: 118627	2010-11-09 20:17:38 +00:00
Dan Gohman	35814e6128	Use the AliasAnalysis interface to determine how a Function accesses memory. This isn't a real improvement with present day AliasAnalysis implementations; it's mainly for consistency. llvm-svn: 118624	2010-11-09 20:13:27 +00:00
Dan Gohman	0f17507478	Teach LICM and AliasSetTracker about AccessesArgumentsReadonly. llvm-svn: 118618	2010-11-09 19:58:21 +00:00
Dan Gohman	de52155685	Teach FunctionAttrs about AccessesArgumentsReadonly. llvm-svn: 118617	2010-11-09 19:56:27 +00:00
Dan Gohman	470ade12e0	Fix a thinko that Duncan spotted. llvm-svn: 118430	2010-11-08 19:24:47 +00:00
Dan Gohman	2cd1fd4a82	Make FunctionAttrs TBAA-aware. llvm-svn: 118417	2010-11-08 17:12:04 +00:00
Dan Gohman	9130bad71f	Extend the AliasAnalysis::pointsToConstantMemory interface to allow it to optionally look for constant or local (alloca) memory. Teach BasicAliasAnalysis::pointsToConstantMemory to look through Select and Phi nodes, and to support looking for local memory. Remove FunctionAttrs' PointsToLocalOrConstantMemory function, now that AliasAnalysis knows all the tricks that it knew. llvm-svn: 118412	2010-11-08 16:45:26 +00:00
Dan Gohman	86449d705a	Make FunctionAttrs use AliasAnalysis::getModRefBehavior, now that it knows about intrinsic functions. llvm-svn: 118410	2010-11-08 16:10:15 +00:00
Duncan Sands	9d1fe4c40d	Rename PointsToLocalMemory to PointsToLocalOrConstantMemory to make the code more self-documenting. llvm-svn: 118171	2010-11-03 14:45:05 +00:00
Jakob Stoklund Olesen	31a7eb40c1	Let the -inline-threshold command line argument take precedence over the threshold given to createFunctionInliningPass(). Both opt -O3 and clang would silently ignore the -inline-threshold option. llvm-svn: 118117	2010-11-02 23:40:26 +00:00
Owen Anderson	6186c96765	When folding away a (shl (shr)) pair, we need to check that the bits that will BECOME the low bits are zero, not that the current low bits are zero. Fixes <rdar://problem/8606771>. llvm-svn: 117953	2010-11-01 21:08:20 +00:00
Duncan Sands	e659aba516	Now that the MallocInst no longer exists, this workaround for it claiming not to have side-effects is no longer needed. llvm-svn: 117789	2010-10-30 16:12:16 +00:00
Duncan Sands	b8f3b14dfb	If a function does a volatile load from a global constant, do not consider it to be readonly. In fact, don't even consider it to be readonly if it does a volatile load from an AllocaInst either (it is debatable as to whether readonly would be correct or not in this case; play safe for the moment). This fixes PR8279. llvm-svn: 117783	2010-10-30 12:59:44 +00:00
Bob Wilson	67a6f32c59	Clean up indentation and other whitespace. llvm-svn: 117728	2010-10-29 22:20:45 +00:00
Bob Wilson	8ecf98b04f	Remove trailing whitespace. llvm-svn: 117727	2010-10-29 22:20:43 +00:00
Bob Wilson	9d07f39ace	Fix 80-column violation. llvm-svn: 117722	2010-10-29 22:03:07 +00:00
Bob Wilson	11ee456e23	Change instcombine's getShuffleMask to represent undef with negative values. This code had previously used 2*N, where N is the mask length, to represent undef. That is not safe because the shufflevector operands may have more than N elements -- they don't have to match the result type. llvm-svn: 117721	2010-10-29 22:03:05 +00:00
Bob Wilson	cb11b48e7a	Make instcombine a little more aggressive in combining vector shuffles. Allow splats even if they don't match either of the original shuffles, possibly due to undef entries in the shuffles masks. Radar 8597790. Also fix some 80-column violations. llvm-svn: 117719	2010-10-29 22:02:50 +00:00
Owen Anderson	374e1464ae	Give up on doing in-line instruction simplification during correlated value propagation. Instruction simplification needs to be guaranteed never to be run on an unreachable block. However, earlier block simplifications may have changed the CFG to make block that were reachable when we began our iteration unreachable by the time we try to simplify them. (Note that this also means that our depth-first iterators were potentially being invalidated). This should not have a large impact on code quality, since later runs of instcombine should pick up these simplifications. Fixes PR8506. llvm-svn: 117709	2010-10-29 21:05:17 +00:00
John Thompson	e8360b7182	Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support. llvm-svn: 117667	2010-10-29 17:29:13 +00:00
Dale Johannesen	16bb87a90e	Teach InstCombine not to use Add and Neg on FP. PR 8490. llvm-svn: 117510	2010-10-27 23:45:18 +00:00
Dan Gohman	2e20dfb0f2	Fix a case where instcombine was stripping metadata (and alignment) from stores when folding in bitcasts. llvm-svn: 117265	2010-10-25 16:16:27 +00:00
Duncan Sands	31c803b2ba	Fix PR8445: a block with no predecessors may be the entry block, in which case it isn't unreachable and should not be zapped. The check for the entry block was missing in one case: a block containing a unwind instruction. While there, do some small cleanups: "M" is not a great name for a Function* (it would be more appropriate for a Module*), change it to "Fn"; use Fn in more places. llvm-svn: 117224	2010-10-24 12:23:30 +00:00
Benjamin Kramer	76229bc128	SmallVectorize. llvm-svn: 117213	2010-10-23 17:10:24 +00:00
Chandler Carruth	88c54b82c1	Switch attribute macros to use 'LLVM_' as a prefix. We retain the old names until other LLVM projects using these are cleaned up. llvm-svn: 117200	2010-10-23 08:10:43 +00:00
Bob Wilson	a4e231c880	Teach instcombine to set the alignment arguments for NEON load/store intrinsics. llvm-svn: 117154	2010-10-22 21:41:48 +00:00
Duncan Sands	94da154558	RetOp is not actually used for anything useful (though it looks like maybe it was supposed to be used in the test...), so zap it (gcc-4.6 warning). llvm-svn: 117023	2010-10-21 16:05:44 +00:00
Dan Gohman	f372cf869b	Reapply r116831 and r116839, converting AliasAnalysis to use uint64_t, plus fixes for places I missed before. llvm-svn: 116875	2010-10-19 22:54:46 +00:00
Dan Gohman	b4aa503501	Revert r116831 and r116839, which are breaking selfhost builds. llvm-svn: 116858	2010-10-19 21:06:16 +00:00
Owen Anderson	a4fefc1949	Passes do not need to recursively initialize passes that they preserve, if they do not also require them. This allows us to reduce inter-pass linkage dependencies. llvm-svn: 116854	2010-10-19 20:08:44 +00:00
Dan Gohman	896ac62346	Oops, check in all the files for converting AliasAnalysis to use uint64_t. llvm-svn: 116839	2010-10-19 18:08:27 +00:00
Owen Anderson	6c18d1aac0	Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize the pass's dependencies. Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h before parsing commandline arguments. I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass registration/creation, please send the testcase to me directly. llvm-svn: 116820	2010-10-19 17:21:58 +00:00
Dan Gohman	14fe8cf238	Consistently use AliasAnalysis::UnknownSize instead of hardcoding ~0u. llvm-svn: 116815	2010-10-19 17:06:23 +00:00
Mikhail Glushenkov	2072db24ed	GlobalOpt: EvaluateFunction() must not evaluate stores to weak_odr globals. Fixes PR8389. llvm-svn: 116812	2010-10-19 16:47:23 +00:00
Mikhail Glushenkov	cf2afe008d	Trailing whitespace. llvm-svn: 116749	2010-10-18 21:16:00 +00:00
Dan Gohman	71af9db0e8	Make AliasSetTracker TBAA-aware, enabling TBAA-enabled LICM. llvm-svn: 116743	2010-10-18 20:44:50 +00:00
Devang Patel	218f3206fa	Transfer debug loc to lowered call. Patch by Alexander Herz! llvm-svn: 116733	2010-10-18 18:53:44 +00:00
Benjamin Kramer	1dc34b48dd	Eliminate some calls to Value::getNameStr. llvm-svn: 116670	2010-10-16 11:28:23 +00:00
Owen Anderson	18e4fed3fa	Generalize MemCpyOpt's handling of call slot forwarding to function properly when the call slot forwarding is implemented with a load/store pair rather than a memcpy. llvm-svn: 116637	2010-10-15 22:52:12 +00:00
Owen Anderson	071cee0c81	CallGraphSCC passes implicity require CallGraph analysis. llvm-svn: 116443	2010-10-13 22:00:45 +00:00
Rafael Espindola	c2240adcc7	Fix PR8313 by changing ValueToValueMap use a TrackingVH. llvm-svn: 116390	2010-10-13 02:08:17 +00:00
Rafael Espindola	229e38f0fe	Be more consistent in using ValueToValueMapTy. llvm-svn: 116387	2010-10-13 01:36:30 +00:00
Owen Anderson	8ac477ffb5	Begin adding static dependence information to passes, which will allow us to perform initialization without static constructors AND without explicit initialization by the client. For the moment, passes are required to initialize both their (potential) dependencies and any passes they preserve. I hope to be able to relax the latter requirement in the future. llvm-svn: 116334	2010-10-12 19:48:12 +00:00
Kenneth Uildriks	b8d7efe785	Now using a variant of the existing inlining heuristics to decide whether to create a given specialization of a function in PartialSpecialization. If the total performance bonus across all callsites passing the same constant exceeds the specialization cost, we create the specialization. llvm-svn: 116158	2010-10-09 22:06:36 +00:00
Dan Gohman	2fd85d7cd2	Filter out illegal formulae after updating offsets, not before, so that formulae which become illegal as a result of the offset updating don't escape. This is for rdar://8529692. No testcase yet, because the given cases hit use-list ordering differences. llvm-svn: 116093	2010-10-08 19:33:26 +00:00
Daniel Dunbar	d4e9c3b43a	Update CMake. llvm-svn: 116034	2010-10-08 02:30:03 +00:00
Dan Gohman	5947e1626a	Delete the FormulaSorter class and inline its one method into its one user. This code will be restructured soon and FormulaSorter is getting in the way. llvm-svn: 116012	2010-10-07 23:52:18 +00:00
Dan Gohman	1b61fd9bff	Fix a spello. llvm-svn: 116011	2010-10-07 23:43:09 +00:00
Dan Gohman	34f37e0d04	Charge a formula for explicit multiplies on scaled registers too, not just base registers. llvm-svn: 116010	2010-10-07 23:41:58 +00:00
Dan Gohman	49d638b45a	Use size_t for consistency. llvm-svn: 116009	2010-10-07 23:37:58 +00:00
Dan Gohman	8e72611058	When merging one use into another, transfer the offsets from the old use to the new one. llvm-svn: 116008	2010-10-07 23:36:45 +00:00
Dan Gohman	a7b68d6d95	Fix LSR to keep the RegUseTracker up to date when combining users. This doesn't usually matter, because the other heuristics usually succeed regardless, but it's good to keep the register use bookkeeping consistent. llvm-svn: 116005	2010-10-07 23:33:43 +00:00
Devang Patel	57da4caa85	Remove LoopIndexSplit pass. It is neither maintained nor used by anyone. llvm-svn: 116004	2010-10-07 23:29:37 +00:00
Owen Anderson	df7a4f2515	Now with fewer extraneous semicolons! llvm-svn: 115996	2010-10-07 22:25:06 +00:00
Owen Anderson	9786868939	Add initialization routines for Instrumentation. llvm-svn: 115971	2010-10-07 20:17:24 +00:00
Owen Anderson	f7ef5dfccc	Add initialization routines to InstCombine. llvm-svn: 115965	2010-10-07 20:04:55 +00:00
Owen Anderson	bf70a035f0	Add an initialization routine for libLLVMipo.a llvm-svn: 115933	2010-10-07 18:09:59 +00:00
Owen Anderson	4698c5d7f7	Next step on the getting-rid-of-static-ctors train: begin adding per-library initialization functions that initialize the set of passes implemented in that library. Add C bindings for these functions as well. llvm-svn: 115927	2010-10-07 17:55:47 +00:00
Owen Anderson	5e19bfcde3	Move the pass initialization helper functions into the llvm namespace, and add a header declaring them all. This is also where we will declare per-library pass-set initializer functions down the road. llvm-svn: 115900	2010-10-07 04:13:08 +00:00
Owen Anderson	6da4d820fa	Since the Hello pass is built as a loadable dynamic library, don't try to convert it to new-style registration yet. llvm-svn: 115881	2010-10-07 00:31:16 +00:00
Owen Anderson	13a642da0b	Now that the profitable bits of EnableFullLoadPRE have been enabled by default, rip out the remainder. Anyone interested in more general PRE would be better served by implementing it separately, to get real anticipation calculation, etc. llvm-svn: 115337	2010-10-01 20:02:55 +00:00
Eric Christopher	3ad2f3a2f2	Fix the other half of the alignment changing issue by making sure that the memcpy alignment is the minimum of the incoming alignments. Fixes PR 8266. llvm-svn: 115305	2010-10-01 09:02:05 +00:00
Chris Lattner	c663a67384	fix PR8267 - Instcombine shouldn't optimizer away volatile memcpy's. llvm-svn: 115296	2010-10-01 05:51:02 +00:00
Dale Johannesen	dd224d2333	Massive rewrite of MMX: The x86_mmx type is used for MMX intrinsics, parameters and return values where these use MMX registers, and is also supported in load, store, and bitcast. Only the above operations generate MMX instructions, and optimizations do not operate on or produce MMX intrinsics. MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into smaller pieces. Optimizations may occur on these forms and the result casted back to x86_mmx, provided the result feeds into a previous existing x86_mmx operation. The point of all this is prevent optimizations from introducing MMX operations, which is unsafe due to the EMMS problem. llvm-svn: 115243	2010-09-30 23:57:10 +00:00
Owen Anderson	3170a25a84	We do want to allow LoadPRE to perform LICM-like transformations: we already consider PHI nodes to be negligible for code size (making this transform code size neutral), and it allows us to hoist values out of loops, which is always a good thing. llvm-svn: 115205	2010-09-30 20:53:04 +00:00
Jakob Stoklund Olesen	eb12f49fb7	Try again to disable critical edge splitting in CodeGenPrepare. The bug that broke i386 linux has been fixed in r115191. llvm-svn: 115204	2010-09-30 20:51:52 +00:00
Benjamin Kramer	5d66e5feb8	Tighten up prototype verification of strchr and strrchr to avoid a crash in the very unlikely case that someone passes an integer > i64 to strchr. llvm-svn: 115144	2010-09-30 11:21:59 +00:00
Benjamin Kramer	2b76c66fd6	Add constant folding for strspn and strcspn to SimplifyLibCalls. llvm-svn: 115116	2010-09-30 00:58:35 +00:00
Benjamin Kramer	38d22f69fc	Add strpbrk folding to SimplifyLibCalls. llvm-svn: 115111	2010-09-29 23:52:12 +00:00
Benjamin Kramer	8e861d7eee	Simplify the loop in StrChrOptimizer. FileCheckize test. llvm-svn: 115095	2010-09-29 22:29:12 +00:00
Benjamin Kramer	824645abc9	Teach SimplifyLibCalls how to optimize strrchr. llvm-svn: 115091	2010-09-29 21:50:51 +00:00
Owen Anderson	99c985c37d	Fix PR8247: JumpThreading can cause a block to become unreachable while still having predecessor, if it is part of a self-loop. Because of this, we cannot use the Simplify* APIs, as they can assert-fail on unreachable code. Since it's not easy to determine if a given threading will cause a block to become unreachable, simply defer simplifying simplification to later InstCombine and/or DCE passes. llvm-svn: 115082	2010-09-29 20:34:41 +00:00
Owen Anderson	d67ca0ed4c	Revert r114919, which caused some serious regressions on ARM. llvm-svn: 115053	2010-09-29 18:05:19 +00:00
Oscar Fuentes	b4b12535e8	Removed a bunch of unnecessary target_link_libraries. llvm-svn: 114999	2010-09-28 22:39:14 +00:00
Owen Anderson	9c93fd5598	Weight loop unrolling counts by nesting depth. Unrolling deeply nested loops tends to cause high register pressure and thus excess spills, which we don't currently recover from well. This should be re-evaluated in the future if our ability to generate good spills/splits improves. Partial fix for <rdar://problem/7635585>. llvm-svn: 114919	2010-09-27 22:58:54 +00:00
Jakob Stoklund Olesen	415a7a6fec	Revert "Disable codegen prepare critical edge splitting. Machine instruction passes now" This reverts revision 114633. It was breaking llvm-gcc-i386-linux-selfhost. It seems there is a downstream bug that is exposed by -cgp-critical-edge-splitting=0. When that bug is fixed, this patch can go back in. Note that the changes to tailcallfp2.ll are not reverted. They were good are required. llvm-svn: 114859	2010-09-27 18:43:48 +00:00
Dan Gohman	16ef49686c	Delete an unused function. llvm-svn: 114841	2010-09-27 16:58:21 +00:00
Owen Anderson	b590a927cd	LoadPRE was not properly checking that the load it was PRE'ing post-dominated the block it was being hoisted to. Splitting critical edges at the merge point only addressed part of the issue; it is also possible for non-post-domination to occur when the path from the load to the merge has branches in it. Unfortunately, full anticipation analysis is time-consuming, so for now approximate it. This is strictly more conservative than real anticipation, so we will miss some cases that real PRE would allow, but we also no longer insert loads into paths where they didn't exist before. :-) This is a very slight net positive on SPEC for me (0.5% on average). Most of the benchmarks are largely unaffected, but when it pays off it pays off decently: 181.mcf improves by 4.5% on my machine. llvm-svn: 114785	2010-09-25 05:26:18 +00:00
Eric Christopher	ebacd2b023	If we're changing the source of a memcpy we need to use the alignment of the source, not the original alignment since it may no longer be valid. Fixes rdar://8400094 llvm-svn: 114781	2010-09-25 00:57:26 +00:00
Michael J. Spencer	ded5f66813	Get rid of pop_macro warnings on MSVC. llvm-svn: 114750	2010-09-24 19:48:47 +00:00
Bob Wilson	3aecb15f0a	Fix llvm-extract so that it changes the linkage of all GlobalValues to "external" even when doing lazy bitcode loading. This was broken because a function that is not materialized fails the !isDeclaration() test. llvm-svn: 114666	2010-09-23 17:25:06 +00:00
Evan Cheng	794aaa79e2	Disable codegen prepare critical edge splitting. Machine instruction passes now break critical edges on demand. llvm-svn: 114633	2010-09-23 06:55:34 +00:00
Bob Wilson	b6832a4372	When moving zext/sext to be folded with a load, ignore the issue of whether truncates are free only in the case where the extended type is legal but the load type is not. If both types are illegal, such as when they are too big, the load may not be legalized into an extended load. llvm-svn: 114568	2010-09-22 18:44:56 +00:00
Bob Wilson	4ddcb6a6b4	Move a sign-extend or a zero-extend of a load to the same basic block as the load when the type of the load is not legal, even if truncates are not free. The load is going to be legalized to an extending load anyway. llvm-svn: 114488	2010-09-21 21:54:27 +00:00
Bob Wilson	ff714f9992	Clarify a comment. llvm-svn: 114487	2010-09-21 21:44:14 +00:00
Gabor Greif	a06741b356	do not rely on the implicit-dereference semantics of dyn_cast_or_null llvm-svn: 114278	2010-09-18 11:55:34 +00:00
Gabor Greif	aaa22cf1b6	do not rely on the implicit-dereference semantics of dyn_cast_or_null llvm-svn: 114277	2010-09-18 11:53:39 +00:00
Owen Anderson	d104806575	Use a depth-first iteratation in CorrelatedValuePropagation to avoid wasting time trying to optimize unreachable blocks. llvm-svn: 114105	2010-09-16 18:35:07 +00:00
Dale Johannesen	f95f59a0c2	When substituting sunkaddrs into indirect arguments an asm, we were walking the asm arguments once and stashing their Values. This is wrong because the same memory location can be in the list twice, and if the first one has a sunkaddr substituted, the stashed value for the second one will be wrong (use-after-free). PR 8154. llvm-svn: 114104	2010-09-16 18:30:55 +00:00
Chris Lattner	67e534505d	fix PR8144, a bug where constant merge would merge globals marked attribute(used). llvm-svn: 113911	2010-09-15 00:30:11 +00:00
Owen Anderson	d361aac3d0	Remove the option to disable LazyValueInfo in JumpThreading, as it is now on by default and has received significant testing. llvm-svn: 113852	2010-09-14 20:57:41 +00:00
Chris Lattner	f1144f0929	fix PR8102, a case where we'd copyValue from a value that we already deleted. Fix this by doing the copyValue's before we delete stuff! The testcase only repros the problem on my system with valgrind. llvm-svn: 113820	2010-09-14 00:19:00 +00:00
Michael J. Spencer	93c9b2ea93	Revert "CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally." This reverts commit r113632 Conflicts: cmake/modules/AddLLVM.cmake llvm-svn: 113819	2010-09-13 23:59:48 +00:00
Eric Christopher	e3a89f9f9c	Remove unused variable. llvm-svn: 113769	2010-09-13 18:27:59 +00:00
John Thompson	1094c80281	Added skeleton for inline asm multiple alternative constraint support. llvm-svn: 113766	2010-09-13 18:15:37 +00:00
Owen Anderson	c237a849e3	Re-apply r113679, which was reverted in r113720, which added a paid of new instcombine transforms to expose greater opportunities for store narrowing in codegen. This patch fixes a potential infinite loop in instcombine caused by one of the introduced transforms being overly aggressive. llvm-svn: 113763	2010-09-13 17:59:27 +00:00
Eric Christopher	26abd3e0c2	Revert 113679, it was causing an infinite loop in a testcase that I've sent on to Owen. llvm-svn: 113720	2010-09-12 06:09:23 +00:00
Owen Anderson	70f4524427	Invert and-of-or into or-of-and when doing so would allow us to clear bits of the and's mask. This can result in increased opportunities for store narrowing in code generation. Update a number of tests for this change. This fixes <rdar://problem/8285027>. Additionally, because this inverts the order of ors and ands, some patterns for optimizing or-of-and-of-or no longer fire in instances where they did originally. Add a simple transform which recaptures most of these opportunities: if we have an or-of-constant-or and have failed to fold away the inner or, commute the order of the two ors, to give the non-constant or a chance for simplification instead. llvm-svn: 113679	2010-09-11 05:48:06 +00:00
Gabor Greif	2f5f696b66	typoes llvm-svn: 113647	2010-09-10 22:25:58 +00:00
Michael J. Spencer	dc38d36ccb	CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally. llvm-svn: 113632	2010-09-10 21:14:25 +00:00
Benjamin Kramer	77ab138f84	This transform is also performed by InstructionSimplify, remove the duplicate. llvm-svn: 113608	2010-09-10 19:52:35 +00:00
Owen Anderson	d85c9ccdba	Lower the unrolling theshold to 150. Empirical tests indicate that this is a sweet spot in the performance per code size increase curve. llvm-svn: 113595	2010-09-10 17:57:00 +00:00
Owen Anderson	04cf3fd761	What the loop unroller cares about, rather than just not unrolling loops with calls, is not unrolling loops that contain calls that would be better off getting inlined. This mostly comes up when an interleaved devirtualization pass has devirtualized a call which the inliner will inline on a future pass. Thus, rather than blocking all loops containing calls, add a metric for "inline candidate calls" and block loops containing those instead. llvm-svn: 113535	2010-09-09 20:32:23 +00:00
Owen Anderson	6270515918	Revert r113439, which relaxed the requirement that loops containing calls cannot be unrolled. After some discussion, there seems to be a better way to achieve the same effect. llvm-svn: 113528	2010-09-09 20:02:23 +00:00
Owen Anderson	11ab204fdc	r113526 introduced an unintended change to the loop unrolling threshold. Revert it. llvm-svn: 113527	2010-09-09 19:11:57 +00:00
Owen Anderson	b61b1647e2	Fix typo in code to cap the loop code size reduction calculation. llvm-svn: 113526	2010-09-09 19:08:59 +00:00
Owen Anderson	62ea1b718c	Use code-size reduction metrics to estimate the amount of savings we'll get when we unroll a loop. Next step is to recalculate the threshold values given this new heuristic. llvm-svn: 113525	2010-09-09 19:07:31 +00:00
Owen Anderson	8084dbaf8e	Relax the "don't unroll loops containing calls" rule. Instead, when a loop contains a call, lower the unrolling threshold to the optimize-for-size threshold. Basically, for loops containing calls, unrolling can still be profitable as long as the loop is REALLY small. llvm-svn: 113439	2010-09-08 23:10:07 +00:00
Owen Anderson	3fe002dfb5	Generalize instcombine's support for combining multiple bit checks into a single test. Patch by Dirk Steinke! llvm-svn: 113423	2010-09-08 22:16:17 +00:00
Owen Anderson	a4d9c78aa1	Add a separate unrolling threshold when the current function is being optimized for size. The threshold value of 50 is arbitrary, and I chose it simply by analogy to the inlining thresholds, where the baseline unrolling threshold is slightly smaller than the baseline inlining threshold. This could undoubtedly use some tuning. llvm-svn: 113306	2010-09-07 23:15:30 +00:00
Chris Lattner	6e27b3e004	Fix a serious performance regression introduced by r108687 on linux: turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have to delete the original sqrt as well. Not doing so causes us to do two sqrt's when building with -fmath-errno (the default on linux). llvm-svn: 113260	2010-09-07 20:01:38 +00:00
Nick Lewycky	71972d45dc	Fix major bug in thunk detection. Also verify the calling convention. Switch from isWeakForLinker to mayBeOverridden which is more accurate. Add more statistics and debugging info. Add comments. Move static function outside anonymous namespace. llvm-svn: 113190	2010-09-07 01:42:10 +00:00
Chris Lattner	be9019090e	fix PR8067, an over-aggressive assertion in LICM. llvm-svn: 113146	2010-09-06 05:11:24 +00:00
Chris Lattner	b01c24a945	Teach loop rotate to hoist trivially invariant instructions in the duplicated block instead of duplicating them. Duplicating them into the end of the loop and the preheader means that we got a phi node in the header of the loop, which prevented LICM from hoisting them. GVN would usually come around later and merge the duplicated instructions so we'd get reasonable output... except that anything dependent on the shoulda-been-hoisted value can't be hoisted. In PR5319 (which this fixes), a memory value didn't get promoted. llvm-svn: 113134	2010-09-06 01:10:22 +00:00
Chris Lattner	da24b9a49a	pull a simple method out of LICM into a new Loop::hasLoopInvariantOperands method. Remove a useless and confusing Loop::isLoopInvariant(Instruction) method, which didn't do what you thought it did. No functionality change. llvm-svn: 113133	2010-09-06 01:05:37 +00:00
Chris Lattner	1edf7434cf	more cleanups llvm-svn: 113115	2010-09-05 20:13:07 +00:00
Chris Lattner	e6214557e7	Change lower atomic pass to use IntrinsicInst to simplify it a bit. llvm-svn: 113114	2010-09-05 20:10:47 +00:00
Chris Lattner	05ef361b5e	eliminate some non-obvious casts. UndefValue isa Constant. llvm-svn: 113113	2010-09-05 20:03:09 +00:00
Nick Lewycky	e3ac69eca3	Fix warning reported by MSVC++ builder. llvm-svn: 113106	2010-09-05 09:11:38 +00:00
Nick Lewycky	f3a07ec394	Switch FnSet to containing the ComparableFunction instead of a pointer to one. This reduces malloc traffic (yay!) and removes MergeFunctionsEqualityInfo. llvm-svn: 113105	2010-09-05 09:00:32 +00:00
Nick Lewycky	0095937b13	Fix many bugs when merging weak-strong and weak-weak pairs. We now merge all strong functions first to make sure they're the canonical definitions and then do a second pass looking only for weak functions. llvm-svn: 113104	2010-09-05 08:22:49 +00:00
Chris Lattner	65b48b5dfc	zap dead code. llvm-svn: 113073	2010-09-04 18:12:00 +00:00
Dan Gohman	487e250109	Fix LoopSimplify to notify ScalarEvolution when splitting a loop backedge into an inner loop, as the new loop iteration may differ substantially. This fixes PR8078. llvm-svn: 113057	2010-09-04 02:42:48 +00:00
Chris Lattner	50506787d1	fix a bug in my licm rewrite when a load from the promoted memory location is being re-stored to the memory location. We would get a dangling pointer from the SSAUpdate data structure and miss a use. This fixes PR8068 llvm-svn: 113042	2010-09-04 00:12:30 +00:00
Owen Anderson	c91c1a205a	Propagate non-local comparisons. Fixes PR1757. llvm-svn: 113025	2010-09-03 22:47:08 +00:00
Owen Anderson	c725462245	Add support for simplifying a load from a computed value to a load from a global when it is provable that they're equivalent. This fixes PR4855. llvm-svn: 112994	2010-09-03 19:08:37 +00:00
Chris Lattner	affc0e42f0	fix more AST updating bugs, correcting miscompilation in PR8041 llvm-svn: 112878	2010-09-02 22:19:10 +00:00
Duncan Sands	6778149f7e	Reapply commit 112699, speculatively reverted by echristo, since I'm sure it is harmless. Original commit message: If PrototypeValue is erased in the middle of using the SSAUpdator then the SSAUpdator may access freed memory. Instead, simply pass in the type and name explicitly, which is all that was used anyway. llvm-svn: 112810	2010-09-02 08:14:03 +00:00
Chris Lattner	8af45a889d	deepen my MMX/SRoA hack to avoid hurting non-x86 codegen. llvm-svn: 112763	2010-09-01 23:09:27 +00:00
Dan Gohman	0ad7d9c24e	Fix loop unswitching's assumption that a code path which either infinite loops or exits will eventually exit. This fixes PR5373. llvm-svn: 112745	2010-09-01 21:46:45 +00:00
Owen Anderson	73f988cafa	JumpThreading keeps LazyValueInfo up to date, so we don't need to rerun it if we schedule another LVI-using pass afterwards. llvm-svn: 112722	2010-09-01 18:27:22 +00:00
Eric Christopher	a5d315c665	Speculatively revert 112699 and 112702, they seem to be causing self host errors on clang-x86-64. llvm-svn: 112719	2010-09-01 17:29:10 +00:00
Duncan Sands	f7b18437b5	If PrototypeValue is erased in the middle of using the SSAUpdator then the SSAUpdator may access freed memory. Instead, simply pass in the type and name explicitly, which is all that was used anyway. llvm-svn: 112699	2010-09-01 10:29:33 +00:00
Chris Lattner	34e5361eb5	add a gross hack to work around a problem that Argiris reported on llvmdev: SRoA is introducing MMX datatypes like <1 x i64>, which then cause random problems because the X86 backend is producing mmx stuff without inserting proper emms calls. In the short term, force off MMX datatypes. In the long term, the X86 backend should not select generic vector types to MMX registers. This is being worked on, but won't be done in time for 2.8. rdar://8380055 llvm-svn: 112696	2010-09-01 05:14:33 +00:00
Dan Gohman	110ed64fbb	Revert 112442 and 112440 until the compile time problems introduced by 112440 are resolved. llvm-svn: 112692	2010-09-01 01:45:53 +00:00
Chris Lattner	030f02021b	licm is wasting time hoisting constant foldable operations, instead of hoisting them, just fold them away. This occurs in the testcase for PR8041, for example. llvm-svn: 112669	2010-08-31 23:00:16 +00:00
Chris Lattner	daca6f3483	tidy up llvm-svn: 112643	2010-08-31 21:21:25 +00:00
Owen Anderson	3c84ecb067	More cleanups of my JumpThreading transforms, including extracting some duplicated code into a helper function. llvm-svn: 112634	2010-08-31 20:26:04 +00:00
Owen Anderson	6fdcb172a9	Add an RAII helper to make cleanup of the RecursionSet more fool-proof. llvm-svn: 112628	2010-08-31 19:24:27 +00:00
Owen Anderson	048efbe225	Only try to clean up the current block if we changed that block already. llvm-svn: 112625	2010-08-31 18:55:52 +00:00
Owen Anderson	cd4de7f399	Refactor my fix for PR5652 to terminate the predecessor lookups after the first failure. llvm-svn: 112620	2010-08-31 18:48:48 +00:00
Nick Lewycky	68984ede5c	Fix an infinite loop; merging two functions will create a new function (if the two are weak, we make them thunks to a new strong function) so don't iterate through the function list as we're modifying it. Also add back the outermost loop which got removed during the cleanups. llvm-svn: 112595	2010-08-31 08:29:37 +00:00
Owen Anderson	ce401be792	Don't perform an extra traversal of the function just to do cleanup. We can safely simplify instructions after each block has been processed without worrying about iterator invalidation. llvm-svn: 112594	2010-08-31 07:55:56 +00:00
Owen Anderson	48d58ad64c	Rename ValuePropagation to a more descriptive CorrelatedValuePropagation. llvm-svn: 112591	2010-08-31 07:48:34 +00:00
Owen Anderson	d2918a07bd	Rename file to something more descriptive. llvm-svn: 112590	2010-08-31 07:41:39 +00:00
Owen Anderson	3997a07fb9	More Chris-inspired JumpThreading fixes: use ConstantExpr to correctly constant-fold undef, and be more careful with its return value. This actually exposed an infinite recursion bug in ComputeValueKnownInPredecessors which theoretically already existed (in JumpThreading's handling of and/or of i1's), but never manifested before. This patch adds a tracking set to prevent this case. llvm-svn: 112589	2010-08-31 07:36:34 +00:00
Nick Lewycky	0464d1d7ec	Switch to DenseSet, simplifying much more code. We now have a single iteration where we hash, compare and fold, instead of one iteration where we build up the hash buckets and a second one to fold. llvm-svn: 112582	2010-08-31 05:53:05 +00:00
Owen Anderson	376597c13e	Remove r111665, which implemented store-narrowing in InstCombine. Chris discovered a miscompilation in it, and it's not easily fixable at the optimizer level. I'll investigate reimplementing it in DAGCombine. llvm-svn: 112575	2010-08-31 04:41:06 +00:00
Owen Anderson	b58b3c0dda	Fix a typo. llvm-svn: 112560	2010-08-30 23:59:30 +00:00
Owen Anderson	b974dbbdd7	Cleanups suggested by Chris. llvm-svn: 112553	2010-08-30 23:34:17 +00:00
Owen Anderson	c910acb54a	Re-apply r112539, being more careful to respect the return values of the constant folding methods. Additionally, use the ConstantExpr::get*() methods to simplify some constant folding. llvm-svn: 112550	2010-08-30 23:22:36 +00:00
Owen Anderson	30bacbdfdf	Add statistics to evaluate this pass. llvm-svn: 112545	2010-08-30 22:45:55 +00:00
Owen Anderson	1ddcbbe49c	Revert r112539. It accidentally introduced a miscompilation. llvm-svn: 112543	2010-08-30 22:33:41 +00:00
Owen Anderson	75f6037c7c	Fixes and cleanups pointed out by Chris. In general, be careful to handle 0 results from ComputeValueKnownInPredecessors (indicating undef), and re-use existing constant folding APIs. llvm-svn: 112539	2010-08-30 22:07:52 +00:00
Chris Lattner	c843fca2fd	rewrite DwarfEHPrepare to use SSAUpdater to promote its allocas instead of PromoteMemToReg. This allows it to stop using DF and DT, eliminating a computation of DT and DF from clang -O3. Clang is now down to 2 runs of DomFrontier. llvm-svn: 112457	2010-08-29 19:54:28 +00:00
Chris Lattner	f58382ed87	two changes: 1) make AliasSet hold the list of call sites with an assertingvh so we get a violent explosion if the pointer dangles. 2) Fix AliasSetTracker::deleteValue to remove call sites with by-pointer comparisons instead of by-alias queries. Using findAliasSetForCallSite can cause alias sets to get merged when they shouldn't, and can also miss alias sets when the call is readonly. #2 fixes PR6889, which only repros with a .c file :( llvm-svn: 112452	2010-08-29 18:42:23 +00:00
Chris Lattner	263f804699	LICM does get dead instructions input to it. Instead of sinking them out of loops, just delete them. llvm-svn: 112451	2010-08-29 18:22:25 +00:00
Chris Lattner	6ac0659a1c	use moveBefore instead of remove+insert, it avoids some symtab manipulation, so its faster (in addition to being more elegant) llvm-svn: 112450	2010-08-29 18:18:40 +00:00
Chris Lattner	f03b4eac48	revert 112448 for now. llvm-svn: 112449	2010-08-29 18:11:16 +00:00
Chris Lattner	11f8ad8211	optimize LICM::hoist to use moveBefore. Correct its updating of AST to remove the hoisted instruction from the AST, since it is no longer in the loop. llvm-svn: 112448	2010-08-29 18:03:33 +00:00
Chris Lattner	1a1ed69435	fix some bugs (found by inspection) where LICM would not update LICM correctly. When sinking an instruction, it should not add entries for the sunk instruction to the AST, it should remove the entry for the sunk instruction. The blocks being sunk to are not in the loop, so their instructions shouldn't be in the AST (yet)! llvm-svn: 112447	2010-08-29 18:00:00 +00:00
Chris Lattner	cc9cbc66a3	rework the ownership of subloop alias information: instead of keeping them around until the pass is destroyed, keep them around a) just when useful (not for outer loops) and b) destroy them right after we use them. This should reduce memory use and fixes potential bugs where a loop is deleted and another loop gets allocated to the same address. llvm-svn: 112446	2010-08-29 17:46:00 +00:00
Chris Lattner	bc1a65ac6c	apparently unswitch had the same "Feature". Stop its claims that it preserves domfrontier if it doesn't really. llvm-svn: 112445	2010-08-29 17:23:19 +00:00
Chris Lattner	d6f46b8af8	now that loop passes don't use DomFrontier, there is no reason for the unroller to pretend it supports updating it. It still has a horrible hack for DomTree. llvm-svn: 112444	2010-08-29 17:21:35 +00:00
Dan Gohman	002ff89cbd	Optionally rerun dedicated-register filtering after applying other filtering techniques, as those may allow it to filter out more obviously unprofitable candidates. llvm-svn: 112441	2010-08-29 16:39:22 +00:00
Dan Gohman	f031792cc6	Fix several areas in LSR to do a better job keeping the main LSRInstance data structures up to date. This fixes some pessimizations caused by stale data which will be exposed in an upcoming change. llvm-svn: 112440	2010-08-29 16:32:54 +00:00
Dan Gohman	e9e0873b08	Refactor the three main groups of code out of NarrowSearchSpaceUsingHeuristics into separate functions. llvm-svn: 112439	2010-08-29 16:09:42 +00:00
Dan Gohman	37a0f68036	Delete a bogus check. llvm-svn: 112438	2010-08-29 15:30:29 +00:00
Dan Gohman	b6a520d63c	Add some comments. llvm-svn: 112437	2010-08-29 15:27:08 +00:00
Dan Gohman	bf673e0652	Move this debug output into GenerateAllReuseFormula, to declutter the high-level logic. llvm-svn: 112436	2010-08-29 15:21:38 +00:00
Dan Gohman	d366b6d5c8	Delete an unused declaration. llvm-svn: 112435	2010-08-29 15:19:11 +00:00
Dan Gohman	4f13bbfefc	Do one lookup instead of two. llvm-svn: 112434	2010-08-29 15:18:49 +00:00
Chris Lattner	f94f6bb0ba	licm preserves the cfg, it doesn't have to explicitly say it preserves domfrontier. It does preserve AA though. llvm-svn: 112419	2010-08-29 07:02:56 +00:00
Chris Lattner	abe61ef3b4	now that it doesn't use the PromoteMemToReg function, LICM doesn't require DomFrontier. Dropping this doesn't actually save any runs of the pass though. llvm-svn: 112418	2010-08-29 06:49:44 +00:00
Chris Lattner	1dc98b47b5	completely rewrite the memory promotion algorithm in LICM. Among other things, this uses SSAUpdater instead of PromoteMemToReg. llvm-svn: 112417	2010-08-29 06:43:52 +00:00
Chris Lattner	9c3931a544	use getUniqueExitBlocks instead of a manual set. llvm-svn: 112412	2010-08-29 05:12:21 +00:00
Chris Lattner	85bf5421e1	reimplement LICM::sink to use SSAUpdater instead of PromoteMemToReg. This leads to much simpler code. llvm-svn: 112410	2010-08-29 04:55:06 +00:00
Chris Lattner	c3fb03e289	implement SSAUpdater::RewriteUseAfterInsertions, a helpful form of RewriteUse. llvm-svn: 112409	2010-08-29 04:54:06 +00:00
Chris Lattner	b50407f104	remove dead proto llvm-svn: 112408	2010-08-29 04:53:24 +00:00
Chris Lattner	cd96b4df56	reduce indentation in LICM::sink by using early exits, use getUniqueExitBlocks instead of getExitBlocks and a manual set to eliminate dupes. llvm-svn: 112405	2010-08-29 04:28:20 +00:00
Chris Lattner	188cc5a0fc	modernize this pass a bit: use efficient set/map and reduce indentation. llvm-svn: 112404	2010-08-29 04:23:04 +00:00
Chris Lattner	13ee795c42	remove unions from LLVM IR. They are severely buggy and not being actively maintained, improved, or extended. llvm-svn: 112356	2010-08-28 04:09:24 +00:00
Chris Lattner	504e5100d3	remove the ABCD and SSI passes. They don't have any clients that I'm aware of, aren't maintained, and LVI will be replacing their value. nlewycky approved this on irc. llvm-svn: 112355	2010-08-28 03:51:24 +00:00
Chris Lattner	50df36ac0a	for completeness, allow undef also. llvm-svn: 112351	2010-08-28 03:36:51 +00:00
Chris Lattner	95bb297c26	squish dead code. llvm-svn: 112350	2010-08-28 03:21:03 +00:00
Chris Lattner	d0214f3efe	handle the constant case of vector insertion. For something like this: struct S { float A, B, C, D; }; struct S g; struct S bar() { struct S A = g; ++A.B; A.A = 42; return A; } we now generate: _bar: ## @bar ## BB#0: ## %entry movq _g@GOTPCREL(%rip), %rax movss 12(%rax), %xmm0 pshufd $16, %xmm0, %xmm0 movss 4(%rax), %xmm2 movss 8(%rax), %xmm1 pshufd $16, %xmm1, %xmm1 unpcklps %xmm0, %xmm1 addss LCPI1_0(%rip), %xmm2 pshufd $16, %xmm2, %xmm2 movss LCPI1_1(%rip), %xmm0 pshufd $16, %xmm0, %xmm0 unpcklps %xmm2, %xmm0 ret instead of: _bar: ## @bar ## BB#0: ## %entry movq _g@GOTPCREL(%rip), %rax movss 12(%rax), %xmm0 pshufd $16, %xmm0, %xmm0 movss 4(%rax), %xmm2 movss 8(%rax), %xmm1 pshufd $16, %xmm1, %xmm1 unpcklps %xmm0, %xmm1 addss LCPI1_0(%rip), %xmm2 movd %xmm2, %eax shlq $32, %rax addq $1109917696, %rax ## imm = 0x42280000 movd %rax, %xmm0 ret llvm-svn: 112345	2010-08-28 01:50:57 +00:00
Chris Lattner	dd6601048e	optimize bitcasts from large integers to vector into vector element insertion from the pieces that feed into the vector. This handles a pattern that occurs frequently due to code generated for the x86-64 abi. We now compile something like this: struct S { float A, B, C, D; }; struct S g; struct S bar() { struct S A = g; ++A.A; ++A.C; return A; } into all nice vector operations: _bar: ## @bar ## BB#0: ## %entry movq _g@GOTPCREL(%rip), %rax movss LCPI1_0(%rip), %xmm1 movss (%rax), %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm0 movss 4(%rax), %xmm2 movss 12(%rax), %xmm3 pshufd $16, %xmm2, %xmm2 unpcklps %xmm2, %xmm0 addss 8(%rax), %xmm1 pshufd $16, %xmm1, %xmm1 pshufd $16, %xmm3, %xmm2 unpcklps %xmm2, %xmm1 ret instead of icky integer operations: _bar: ## @bar movq _g@GOTPCREL(%rip), %rax movss LCPI1_0(%rip), %xmm1 movss (%rax), %xmm0 addss %xmm1, %xmm0 movd %xmm0, %ecx movl 4(%rax), %edx movl 12(%rax), %esi shlq $32, %rdx addq %rcx, %rdx movd %rdx, %xmm0 addss 8(%rax), %xmm1 movd %xmm1, %eax shlq $32, %rsi addq %rax, %rsi movd %rsi, %xmm1 ret This resolves rdar://8360454 llvm-svn: 112343	2010-08-28 01:20:38 +00:00
Benjamin Kramer	83f9ff0452	Update CMake build. Add newline at end of file. llvm-svn: 112332	2010-08-28 00:11:12 +00:00
Owen Anderson	cf7f941121	Add a prototype of a new peephole optimizing pass that uses LazyValue info to simplify PHIs and select's. This pass addresses the missed optimizations from PR2581 and PR4420. llvm-svn: 112325	2010-08-27 23:31:36 +00:00
Chris Lattner	6c1395f62a	Enhance the shift propagator to handle the case when you have: A = shl x, 42 ... B = lshr ..., 38 which can be transformed into: A = shl x, 4 ... iff we can prove that the would-be-shifted-in bits are already zero. This eliminates two shifts in the testcase and allows eliminate of the whole i128 chain in the real example. llvm-svn: 112314	2010-08-27 22:53:44 +00:00
Chris Lattner	18d7fc8fc6	Implement a pretty general logical shift propagation framework, which is good at ripping through bitfield operations. This generalize a bunch of the existing xforms that instcombine does, such as (x << c) >> c -> and to handle intermediate logical nodes. This is useful for ripping up the "promote to large integer" code produced by SRoA. llvm-svn: 112304	2010-08-27 22:24:38 +00:00
Chris Lattner	25a198e72b	remove some special shift cases that have been subsumed into the more general simplify demanded bits logic. llvm-svn: 112291	2010-08-27 21:04:34 +00:00
Owen Anderson	99d4cb861b	Fix typos in comments. llvm-svn: 112286	2010-08-27 20:32:56 +00:00
Chris Lattner	7398434675	teach the truncation optimization that an entire chain of computation can be truncated if it is fed by a sext/zext that doesn't have to be exactly equal to the truncation result type. llvm-svn: 112285	2010-08-27 20:32:06 +00:00
Chris Lattner	90cd746e63	Add an instcombine to clean up a common pattern produced by the SRoA "promote to large integer" code, eliminating some type conversions like this: %94 = zext i16 %93 to i32 ; <i32> [#uses=2] %96 = lshr i32 %94, 8 ; <i32> [#uses=1] %101 = trunc i32 %96 to i8 ; <i8> [#uses=1] This also unblocks other xforms from happening, now clang is able to compile: struct S { float A, B, C, D; }; float foo(struct S A) { return A.A + A.B+A.C+A.D; } into: _foo: ## @foo ## BB#0: ## %entry pshufd $1, %xmm0, %xmm2 addss %xmm0, %xmm2 movdqa %xmm1, %xmm3 addss %xmm2, %xmm3 pshufd $1, %xmm1, %xmm0 addss %xmm3, %xmm0 ret on x86-64, instead of: _foo: ## @foo ## BB#0: ## %entry movd %xmm0, %rax shrq $32, %rax movd %eax, %xmm2 addss %xmm0, %xmm2 movapd %xmm1, %xmm3 addss %xmm2, %xmm3 movd %xmm1, %rax shrq $32, %rax movd %eax, %xmm0 addss %xmm3, %xmm0 ret This seems pretty close to optimal to me, at least without using horizontal adds. This also triggers in lots of other code, including SPEC. llvm-svn: 112278	2010-08-27 18:31:05 +00:00
Owen Anderson	6ebbd92380	Use LVI to eliminate conditional branches where we've tested a related condition previously. Update tests for this change. This fixes PR5652. llvm-svn: 112270	2010-08-27 17:12:29 +00:00
Chris Lattner	bfd2228182	optimize "integer extraction out of the middle of a vector" as produced by SRoA. This is part of rdar://7892780, but needs another xform to expose this. llvm-svn: 112232	2010-08-26 22:14:59 +00:00
Chris Lattner	d4ebd6df5a	optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x' is a vector to be a vector element extraction. This allows clang to compile: struct S { float A, B, C, D; }; float foo(struct S A) { return A.A + A.B+A.C+A.D; } into: _foo: ## @foo ## BB#0: ## %entry movd %xmm0, %rax shrq $32, %rax movd %eax, %xmm2 addss %xmm0, %xmm2 movapd %xmm1, %xmm3 addss %xmm2, %xmm3 movd %xmm1, %rax shrq $32, %rax movd %eax, %xmm0 addss %xmm3, %xmm0 ret instead of: _foo: ## @foo ## BB#0: ## %entry movd %xmm0, %rax movd %eax, %xmm0 shrq $32, %rax movd %eax, %xmm2 addss %xmm0, %xmm2 movd %xmm1, %rax movd %eax, %xmm1 addss %xmm2, %xmm1 shrq $32, %rax movd %eax, %xmm0 addss %xmm1, %xmm0 ret ... eliminating half of the horribleness. llvm-svn: 112227	2010-08-26 21:55:42 +00:00
Owen Anderson	bd2ecc7e68	Make JumpThreading smart enough to properly thread StrSwitch when it's compiled with clang++. llvm-svn: 112198	2010-08-26 17:40:24 +00:00
Dan Gohman	ca26f79051	Reapply r112091 and r111922, support for metadata linking, with a fix: add a flag to MapValue and friends which indicates whether any module-level mappings are being made. In the common case of inlining, no module-level mappings are needed, so MapValue doesn't need to examine non-function-local metadata, which can be very expensive in the case of a large module with really deep metadata (e.g. a large C++ program compiled with -g). This flag is a little awkward; perhaps eventually it can be moved into the ClonedCodeInfo class. llvm-svn: 112190	2010-08-26 15:41:53 +00:00
Daniel Dunbar	ce45863f0d	Revert r111922, "MapValue support for MDNodes. This is similar to r109117, except ...", it is causing massive performance regressions when building Clang with itself (-O3 -g). llvm-svn: 112158	2010-08-26 03:48:11 +00:00
Daniel Dunbar	95fe13c720	Revert r112091, "Remap metadata attached to instructions when remapping individual ...", which depends on r111922, which I am reverting. llvm-svn: 112157	2010-08-26 03:48:08 +00:00
Chris Lattner	07afbd5a08	zap dead code. llvm-svn: 112130	2010-08-26 01:13:54 +00:00
Dan Gohman	8f292e7a6d	Rewrite ExtractGV, removing a bunch of stuff that didn't fully work, and was over-complicated, and replacing it with a simple implementation. llvm-svn: 112120	2010-08-26 00:22:55 +00:00
Chris Lattner	8df99b523e	remove some llvmcontext arguments that are now dead post-refactoring. llvm-svn: 112104	2010-08-25 23:00:45 +00:00
Dan Gohman	fd824487a3	Remap metadata attached to instructions when remapping individual instructions, not when remapping modules. llvm-svn: 112091	2010-08-25 21:36:50 +00:00
Devang Patel	01262e129e	DIGlobalVariable can be used to encode debug info for globals that are directly folded into a constant by FE. llvm-svn: 112072	2010-08-25 18:52:02 +00:00
Dan Gohman	a209503467	Use MapValue in the Linker instead of having a private function which does the same thing. This eliminates redundant code and handles MDNodes better. MDNode linking still doesn't fully work yet though. llvm-svn: 111941	2010-08-24 18:50:07 +00:00
Owen Anderson	7c853e877e	Turn LVI on, previously detected failures should be fixed now. llvm-svn: 111923	2010-08-24 17:21:18 +00:00
Dan Gohman	6901283544	MapValue support for MDNodes. This is similar to r109117, except that it avoids a lot of unnecessary cloning by avoiding remapping MDNode cycles when none of the nodes in the cycle actually need to be remapped. Also it uses the new temporary MDNode mechanism. llvm-svn: 111922	2010-08-24 17:10:10 +00:00
Owen Anderson	6ffa3f2aea	Turn LVI back off, I have a testcase now. llvm-svn: 111834	2010-08-23 19:59:27 +00:00
Owen Anderson	630add39a6	Re-enable LazyValueInfo. Monitoring for failures. llvm-svn: 111816	2010-08-23 18:12:23 +00:00
Owen Anderson	d31d82d75c	Now that PassInfo and Pass::ID have been separated, move the rest of the passes over to the new registration API. llvm-svn: 111815	2010-08-23 17:52:01 +00:00
Owen Anderson	84c29a096b	Re-apply r111568 with a fix for the clang self-host. llvm-svn: 111665	2010-08-20 18:24:43 +00:00
Owen Anderson	43057cd56a	Revert r111568 to unbreak clang self-host. llvm-svn: 111571	2010-08-19 23:25:16 +00:00
Owen Anderson	bb723b228a	When a set of bitmask operations, typically from a bitfield initialization, only modifies the low bytes of a value, we can narrow the store to only over-write the affected bytes. llvm-svn: 111568	2010-08-19 22:15:40 +00:00
Owen Anderson	aac8cbb261	Disable LVI while I evaluate a failure. llvm-svn: 111551	2010-08-19 19:47:08 +00:00
Owen Anderson	5c87dd55d3	Tentatively enabled LVI by default. I'll be monitoring for any failures. llvm-svn: 111543	2010-08-19 19:04:40 +00:00
Dan Gohman	129a816ee6	Process the step before the start, because it's usually the simpler of the two. llvm-svn: 111495	2010-08-19 01:02:31 +00:00
Owen Anderson	208636fa33	Inform LazyValueInfo whenever a block is deleted, to avoid dangling pointer issues. llvm-svn: 111382	2010-08-18 18:39:01 +00:00
Chris Lattner	3c603024bb	Fix PR7755: knowing something about an inval for a pred from the LHS should disable reconsidering that pred on the RHS. However, knowing something about the pred on the RHS shouldn't disable subsequent additions on the RHS from happening. llvm-svn: 111349	2010-08-18 03:14:36 +00:00
Chris Lattner	f0b5b67ba5	fit in 80 cols llvm-svn: 111348	2010-08-18 03:13:35 +00:00
Chris Lattner	b45de95345	remove some dead code. llvm-svn: 111344	2010-08-18 02:41:56 +00:00
Chris Lattner	6aabb66139	remove dead prototype. llvm-svn: 111342	2010-08-18 02:37:06 +00:00
Eric Christopher	51edc7b7e1	Temporarily revert r110987 as it's causing some miscompares in vector heavy code. I'll re-enable when we've tracked down the problem. llvm-svn: 111318	2010-08-17 22:55:27 +00:00
Dan Gohman	5047ca0c02	When rotating loops, put the original header at the bottom of the loop, making the resulting loop significantly less ugly. Also, zap its trivial PHI nodes, since it's easy. llvm-svn: 111255	2010-08-17 17:39:21 +00:00
Dan Gohman	941020ed72	Use the getUniquePredecessor() utility function, instead of doing what it does manually. llvm-svn: 111248	2010-08-17 17:07:02 +00:00
Evan Cheng	8b637b177c	Add an option to disable codegen prepare critical edge splitting. In theory, PHI elimination is already doing all (most?) of the splitting needed. But machine-licm and machine-sink seem to miss some important optimizations when splitting is disabled. llvm-svn: 111224	2010-08-17 01:34:49 +00:00
Dan Gohman	89fdbaf99a	Instead of having CollectSubexpr's categorize operands as interesting or uninteresting, just put all the operands on one list and make GenerateReassociations make the decision about what's interesting. This is simpler, and it avoids an extra ScalarEvolution::getAddExpr call. llvm-svn: 111133	2010-08-16 15:50:00 +00:00
Dan Gohman	9b7632df26	Put add operands in ScalarEvolution-canonical order, when convenient. This isn't necessary, because ScalarEvolution sorts them anyway, but it's tidier this way. llvm-svn: 111132	2010-08-16 15:39:27 +00:00
Dan Gohman	6e964c7fb4	Avoid #include <ScalarEvolution.h> in LoopSimplify.cpp, which doesn't actually use ScalarEvolution. llvm-svn: 111124	2010-08-16 14:44:03 +00:00
Dan Gohman	250b754428	Instead, teach SimplifyCFG to trim non-address-taken blocks from indirectbr destination lists. llvm-svn: 111122	2010-08-16 14:41:14 +00:00
Dan Gohman	aa445c0751	LoopSimplify shouldn't split loop backedges that use indirectbr. PR7867. llvm-svn: 111061	2010-08-14 00:43:09 +00:00
Dan Gohman	4a63fad976	Teach SimplifyCFG how to simplify indirectbr instructions. - Eliminate redundant successors. - Convert an indirectbr with one successor into a direct branch. Also, generalize SimplifyCFG to be able to be run on a function entry block. It knows quite a few simplifications which are applicable to the entry block, and it only needs a few checks to avoid trouble with the entry block. llvm-svn: 111060	2010-08-14 00:29:42 +00:00
Dan Gohman	081ffcd00b	Fix LSR's ExtractImmediate and ExtractSymbol to avoid calling ScalarEvolution::getAddExpr, which can be pretty expensive, when nothing has changed, which is pretty common. llvm-svn: 111042	2010-08-13 21:17:19 +00:00
Nate Begeman	2a0ca3e937	Reapply this transformation now that it is passing the external test which it previously failed. llvm-svn: 110987	2010-08-13 00:17:53 +00:00
Chris Lattner	363226dfe8	fix PR7876: If ipsccp decides that a function's address is taken before it rewrites the code, we need to use that in the post-rewrite pass. llvm-svn: 110962	2010-08-12 22:25:23 +00:00
Eric Christopher	ac40d49c70	Temporarily revert 110737 and 110734, they were causing failures in an external testsuite. llvm-svn: 110905	2010-08-12 07:01:22 +00:00
Nate Begeman	265363061e	Add the minimal amount of smarts necessary to instcombine of shufflevectors to recognize patterns generated by clang for transpose of a matrix in generic vectors. This is made of two parts: 1) Propagating vector extracts of hi/lo half into their users 2) Recognizing an insertion of even elements followed by the odd elements as an unpack. Testcase to come, but this shrinks the # of shuffle instructions generated on x86 from ~40 to the minimal 8. llvm-svn: 110734	2010-08-10 21:38:12 +00:00
Nick Lewycky	f0067b668c	Fix a use after free error caught by the valgrind builders. llvm-svn: 110601	2010-08-09 21:03:28 +00:00
Eli Friedman	f99e7e6643	PR7853: fix a silly mistake introduced in r101899, and add a test to make sure it doesn't regress again. llvm-svn: 110597	2010-08-09 20:49:43 +00:00
Nick Lewycky	fbd2757cde	Do more to modernize MergeFunctions. Refactor in response to Chris' code review. llvm-svn: 110538	2010-08-08 05:04:23 +00:00
Owen Anderson	0398607714	Don't attempt the PRE inline asm calls, since we don't value number them yet. Fixes PR7835. llvm-svn: 110489	2010-08-07 00:20:35 +00:00
Dan Gohman	0f7892b8ae	Eliminate PromoteMemoryToRegisterID; just use addPreserved("mem2reg") instead, as an example of what this looks like. llvm-svn: 110478	2010-08-06 21:48:06 +00:00
Owen Anderson	a7aed18624	Reapply r110396, with fixes to appease the Linux buildbot gods. llvm-svn: 110460	2010-08-06 18:33:48 +00:00
Nick Lewycky	5a2849e166	Fix uninitialized variable warning. Also move 'default' case next to a real case to help compiler optimize in non-Debug builds. No functionality change. llvm-svn: 110435	2010-08-06 07:43:46 +00:00
Nick Lewycky	f216f69ad9	Work in progress, cleaning up MergeFuncs. Further clean up the comparison function by removing overly generalized "domains". Remove all understanding of ELF aliases and simplify folding code and comments. llvm-svn: 110434	2010-08-06 07:21:30 +00:00
Owen Anderson	bda59bd247	Revert r110396 to fix buildbots. llvm-svn: 110410	2010-08-06 00:23:35 +00:00
Owen Anderson	755aceb5d0	Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static ID member as the sole unique type identifier. Clean up APIs related to this change. llvm-svn: 110396	2010-08-05 23:42:04 +00:00
Owen Anderson	4674dd6cf5	Give JumpThreading+LVI a long-form cl::opt so that it's easier to toggle the default. llvm-svn: 110384	2010-08-05 22:11:31 +00:00
Owen Anderson	9f2bca02d7	Experiments show that we can safely increase our unrolling threshold without unduly impacting code size, particularly since unrolling is not enabled at -Os. llvm-svn: 110233	2010-08-04 18:32:46 +00:00
Dan Gohman	ba81fc16a5	Fix whitespace. llvm-svn: 110223	2010-08-04 17:43:57 +00:00
Dan Gohman	839c972102	Fix a comment. llvm-svn: 110181	2010-08-04 01:16:35 +00:00
Dan Gohman	5442c71f2e	Thread const correctness through a bunch of AliasAnalysis interfaces and eliminate several const_casts. Make CallSite implicitly convertible to ImmutableCallSite. Rename the getModRefBehavior for intrinsic IDs to getIntrinsicModRefBehavior to avoid overload ambiguity with CallSite, which happens to be implicitly convertible to bool. llvm-svn: 110155	2010-08-03 21:48:53 +00:00
Dan Gohman	3619660529	Make instcombine set explicit alignments on load or store instructions with alignment 0, so that subsequent passes don't need to bother checking the TargetData ABI size manually. llvm-svn: 110128	2010-08-03 18:20:32 +00:00
Peter Collingbourne	ddaaf40d24	Add an atomic lowering pass llvm-svn: 110113	2010-08-03 16:19:16 +00:00
Dan Gohman	35e8a6209d	Use unary + instead of a separate local variable for working around std::min vs static const friction. llvm-svn: 110112	2010-08-03 16:15:50 +00:00
Owen Anderson	8f306a779b	Re-apply the infamous r108614, with a fix pointed out by Dirk Steinke. llvm-svn: 110036	2010-08-02 09:32:13 +00:00
Oscar Fuentes	40b31ad3ee	Prefix `next' iterator operation with `llvm::'. Fixes potential ambiguity problems on VS 2010. Patch by nobled! llvm-svn: 110029	2010-08-02 06:00:15 +00:00
Daniel Dunbar	c1b09c8644	Fix a -Wreorder warning. llvm-svn: 110022	2010-08-02 05:43:46 +00:00
Nick Lewycky	f52bd9cc33	Work in progress. Start cleaning up MergeFunctions to look more like the rest of LLVM. The primary change here is to move the methods responsible for comparison into the new FunctionComparator object. Some comments added. There's more to do. llvm-svn: 110021	2010-08-02 05:23:03 +00:00
Daniel Dunbar	0b636a24c7	Speculatively revert r108614, "Another attempt at getting the clang self-host to like my instcombine patch.", in an attempt to fix Clang i386 bootstrap. - Also PR7719. llvm-svn: 109953	2010-07-31 19:51:11 +00:00
Rafael Espindola	40f18838b7	The BlockExtractorPass() constructor was not reading the BlockFile and that was exactly what bugpoint expected it to do. There was also only one user of BlockExtractorPass(const std::vector<BasicBlock*> &B), so just remove it and make BlockExtractorPass read BlockFile. This fixes bugpoint's block extraction. Nick, please review. llvm-svn: 109936	2010-07-31 00:32:17 +00:00
Dan Gohman	d566d2c7b5	Move MaximumAlignment to be a member of the Value class. llvm-svn: 109891	2010-07-30 21:07:05 +00:00
Nick Lewycky	299c6dfcbf	Add missing newline to debug statement. llvm-svn: 109886	2010-07-30 20:27:01 +00:00
Eli Friedman	0428a61e45	PR7750: !CExpr->isNullValue() only properly computes whether CExpr is nonnull if CExpr is a ConstantInt. llvm-svn: 109773	2010-07-29 18:03:33 +00:00
Gabor Greif	62f0aac99d	simplify by using CallSite constructors; virtually eliminates CallSite::get from the tree llvm-svn: 109687	2010-07-28 22:50:26 +00:00
Dan Gohman	a7e5a24093	Define a maximum supported alignment value for load, store, and alloca instructions (constrained by their internal encoding), and add error checking for it. Fix an instcombine bug which generated huge alignment values (null is infinitely aligned). This fixes undefined behavior noticed by John Regehr. llvm-svn: 109643	2010-07-28 20:12:04 +00:00
Dan Gohman	9cd20bf792	When user code intentionally dereferences null, the alignment of the dereference is theoretically infinite. Put a cap on the computed alignment to avoid overflow, noticed by John Regehr. llvm-svn: 109596	2010-07-28 17:14:23 +00:00
Gabor Greif	f0084e1333	simplify llvm-svn: 109589	2010-07-28 15:52:43 +00:00
Gabor Greif	0a970698da	use Value* constructor of CallSite to create potentially improper site, and test that llvm-svn: 109581	2010-07-28 14:28:18 +00:00
Gabor Greif	f159085414	recommit simplification (r109502, backed out r109509); seems to innocent llvm-svn: 109510	2010-07-27 16:44:23 +00:00
Gabor Greif	5f91b7cf3e	back out this too to restore the bots llvm-svn: 109509	2010-07-27 15:56:07 +00:00
Gabor Greif	7b0a5fd2a5	simplify: CallSite::get --> CallSite constructor llvm-svn: 109506	2010-07-27 15:02:37 +00:00
Gabor Greif	7527b2ed5c	simplify llvm-svn: 109502	2010-07-27 13:31:22 +00:00
Owen Anderson	aa7f66ba67	Add an initial implementation of LazyValueInfo updating for JumpThreading. Disabled for now. llvm-svn: 109424	2010-07-26 18:48:03 +00:00
Dan Gohman	0141c13b22	Remove LCSSA's bogus dependence on LoopSimplify and LoopSimplify's bogus dependence on DominanceFrontier. Instead, add an explicit DominanceFrontier pass in StandardPasses.h to ensure that it gets scheduled at the right time. Declare that loop unrolling preserves ScalarEvolution, and shuffle some getAnalysisUsages. This eliminates one LoopSimplify and one LCCSA run in the standard compile opts sequence. llvm-svn: 109413	2010-07-26 18:11:16 +00:00
Dan Gohman	a7908ae369	Preserve ScalarEvolution in the loop unroller. llvm-svn: 109412	2010-07-26 18:02:06 +00:00
Dan Gohman	65b257c9d2	Use DominatorTree::properlyDominates instead of dominates with an explicit inequality check. llvm-svn: 109401	2010-07-26 17:37:36 +00:00
Dan Gohman	31f73ef210	A block dominates itself, by definition. llvm-svn: 109400	2010-07-26 17:35:32 +00:00
Nick Lewycky	7bc0443f2b	Revert this because we can't clone cyclic MDNodes which are creating during a build of llvm-gcc. llvm-svn: 109355	2010-07-24 20:54:02 +00:00
Nick Lewycky	14b69d59dd	Whether function-local or not, a MDNode may reference a Function in which case it needs to be mapped to refer to the function in the new module, not the old one. Fixes PR7700. llvm-svn: 109353	2010-07-24 19:43:25 +00:00
Devang Patel	5fa3813329	Speculatively revert 109117 llvm-svn: 109132	2010-07-22 18:44:00 +00:00
Gabor Greif	59f9970ba5	keep in 80 cols llvm-svn: 109122	2010-07-22 17:18:03 +00:00
Devang Patel	fac440cfb6	Map MDNode correctly. A non function local MDNode can have an operand which is cloned by MapValue(). llvm-svn: 109117	2010-07-22 16:35:00 +00:00
Gabor Greif	dde79d8f1a	mass elimination of reliance on automatic iterator dereferencing llvm-svn: 109103	2010-07-22 13:36:47 +00:00
Gabor Greif	84012a93ef	simplify llvm-svn: 109101	2010-07-22 13:07:39 +00:00
Gabor Greif	b8686360a1	do not access arguments via low-level interface, do not multiply dereference use_iterators llvm-svn: 109100	2010-07-22 13:04:32 +00:00
Gabor Greif	10bb1f5462	pass dereferenced iterator to dyn_cast llvm-svn: 109099	2010-07-22 11:48:35 +00:00
Gabor Greif	36f25dfd33	pass dereferenced iterator to dyn_cast llvm-svn: 109098	2010-07-22 11:43:44 +00:00
Gabor Greif	3e44ea1917	undo 80 column trespassing I caused llvm-svn: 109092	2010-07-22 10:37:47 +00:00
Dan Gohman	2637cc1a38	Make NamedMDNode not be a subclass of Value, and simplify the interface for creating and populating NamedMDNodes. llvm-svn: 109061	2010-07-21 23:38:33 +00:00
Owen Anderson	a57b97e7e7	Fix batch of converting RegisterPass<> to INTIALIZE_PASS(). llvm-svn: 109045	2010-07-21 22:09:45 +00:00
Dan Gohman	afbe4a7a10	Make this code a little more readable. llvm-svn: 108968	2010-07-20 23:49:44 +00:00
Dan Gohman	7373bd9973	Use DebugLocs instead of MDNodes. llvm-svn: 108967	2010-07-20 23:49:05 +00:00
Dan Gohman	b22dd85bb3	Fix a typo. llvm-svn: 108962	2010-07-20 23:10:36 +00:00
Dan Gohman	5c2e65b7bf	Don't look up the "dbg" metadata kind by name. llvm-svn: 108961	2010-07-20 23:09:34 +00:00
Dan Gohman	d2c7e52d05	Use getDebugLoc and setDebugLoc instead of getDbgMetadata and setDbgMetadata, avoiding MDNode overhead. llvm-svn: 108909	2010-07-20 20:09:07 +00:00
Dan Gohman	12725c7d46	Remember that the induction variable is always a PHINode and use getIncomingValueForBlock instead of LoopInfo::getCanonicalInductionVariableIncrement. llvm-svn: 108865	2010-07-20 17:18:52 +00:00
Owen Anderson	84774eda4b	Tweak per Chris' comments. llvm-svn: 108736	2010-07-19 19:23:32 +00:00
Owen Anderson	32a58342ed	Reimplement r108639 in InstCombine rather than DAGCombine. llvm-svn: 108687	2010-07-19 08:09:34 +00:00
Owen Anderson	7d2818b073	Another attempt at getting the clang self-host to like my instcombine patch. llvm-svn: 108614	2010-07-17 06:56:35 +00:00
Chris Lattner	27e997a168	eliminate unlockedRefineAbstractTypeTo, types are all per-llvmcontext, so there is no locking involved in type refinement. llvm-svn: 108553	2010-07-16 20:50:13 +00:00
Dan Gohman	efd7f9c360	Reorder the contents of various getAnalysisUsage functions, eliminating a redundant loopsimplify run from the default -O2 sequence. llvm-svn: 108539	2010-07-16 17:58:45 +00:00
Owen Anderson	8a39c807e2	Remove the rest of my instcombine changes. Back to the drawing board on this one. llvm-svn: 108530	2010-07-16 16:39:00 +00:00
Gabor Greif	6d673953e3	eliminate CallInst::ArgOffset llvm-svn: 108522	2010-07-16 09:38:02 +00:00
Nick Lewycky	375efe3157	Arrays and vectors with different numbers of elements are not equivalent. llvm-svn: 108517	2010-07-16 06:31:12 +00:00
Eric Christopher	15a81cddb4	Also revert 108422, it's causing some test failures. Working on testcases for Owen. llvm-svn: 108494	2010-07-16 01:36:12 +00:00
Dan Gohman	1415208292	Don't merge uses when they are targetting fixup sites with different widths. In a use with a narrower fixup, formulae may be wider than the fixup, in which case the high bits aren't necessarily meaningful, so it isn't safe to reuse them for uses with wider fixups. This fixes PR7618, though the testcase is too large for a reasonable regression test, since it heavily dependes on hitting LSR's heuristics in a certain way. llvm-svn: 108455	2010-07-15 20:24:58 +00:00
Dan Gohman	a1501b9c50	Use dbgs() instead of errs() in a DEBUG. llvm-svn: 108453	2010-07-15 20:12:42 +00:00
Owen Anderson	eaf64d5c1e	Speculatively revert r108429 to fix the clang self-host. llvm-svn: 108436	2010-07-15 18:18:57 +00:00
Owen Anderson	eb08d01061	Per Chris' suggestion, get rid of the select canonicalization and just add the corresponding or-icmp-and pattern. This has the added benefit of doing the matching earlier, and thus being less susceptible to being confused by earlier transforms. llvm-svn: 108429	2010-07-15 17:24:23 +00:00
Owen Anderson	13700ebb02	Remove unneeded check, and correct style. llvm-svn: 108427	2010-07-15 16:38:22 +00:00
Dan Gohman	4afd412d6b	Watch out for a constant offset cancelling out a base register, forming a zero. This situation arrises in Fortran code with induction variables that start at 1 instead of 0. This fixes PR7651. llvm-svn: 108424	2010-07-15 15:14:45 +00:00
Owen Anderson	7151dfd48a	Reapply r108378, with bugfixes, testcase, and improved comment formatting. This now passes LIT, nighty test, and llvm-gcc bootstrap on my machine. llvm-svn: 108422	2010-07-15 15:00:23 +00:00
Nick Lewycky	485ce5a49c	This is a full sentence. llvm-svn: 108418	2010-07-15 06:51:22 +00:00
Nick Lewycky	e6f3287cbb	Disable aliases on all platforms. llvm-svn: 108417	2010-07-15 06:48:56 +00:00
Chris Lattner	e41ab07c61	make various clients of ReplaceAndSimplifyAllUses tolerate it changing the things it replaces, not just causing them to drop to null. There is no functionality change yet, but this is required for a subsequent patch. llvm-svn: 108414	2010-07-15 06:06:04 +00:00
Eli Friedman	a8b4e3732b	Speculatively revert r108378; may be causing bootstrap failures. llvm-svn: 108389	2010-07-15 00:33:00 +00:00
Owen Anderson	37d91d84af	Add instcombine transforms to optimize tests of multiple bits of the same value into a single larger comparison. llvm-svn: 108378	2010-07-14 23:33:51 +00:00
Owen Anderson	2cfe91379b	Extend SimplifyCFG's common-destination folding heuristic to allow a single "bonus" instruction to be speculatively executed. Add a heuristic to ensure we're not tripping up out-of-order execution by checking that this bonus instruction only uses values that were already guaranteed to be available. This allows us to eliminate the short circuit in (x&1)&&(x&2). llvm-svn: 108351	2010-07-14 19:52:16 +00:00
Chris Lattner	ec0e7b1643	revert r108320, I see the failures now... llvm-svn: 108322	2010-07-14 06:16:35 +00:00
Chris Lattner	658680b2f5	reapply benjamin's instcombine patch, I don't see anything wrong with it and can't repro any problems with a manual self-host. llvm-svn: 108320	2010-07-14 05:59:13 +00:00
Eric Christopher	ea282034b6	Grammar. llvm-svn: 108252	2010-07-13 18:27:13 +00:00
Duncan Sands	f88a284579	Handle the case of a tail recursion in which the tail call is followed by a return that returns a constant, while elsewhere in the function another return instruction returns a different constant. This is a special case of accumulator recursion, so just generalize the existing logic a bit. llvm-svn: 108241	2010-07-13 15:41:41 +00:00
Benjamin Kramer	8f36402ac2	Nope, still breaks the release selfhost bots :( llvm-svn: 108153	2010-07-12 16:38:48 +00:00
Benjamin Kramer	07b695e052	Reapply the "or" half of r108136, which seems to be less problematic. llvm-svn: 108152	2010-07-12 16:15:48 +00:00
Gabor Greif	1b787df129	cache result of operator* llvm-svn: 108150	2010-07-12 15:48:26 +00:00
Benjamin Kramer	c719e8ae9e	Revert r108141 again, sigh. llvm-svn: 108148	2010-07-12 14:42:04 +00:00
Gabor Greif	96fedcb136	cache result of operator* llvm-svn: 108147	2010-07-12 14:15:58 +00:00
Gabor Greif	f9c38b5a45	cache result of operator* llvm-svn: 108146	2010-07-12 14:15:10 +00:00
Gabor Greif	88dd73b75e	cache result of operator* llvm-svn: 108145	2010-07-12 14:14:03 +00:00
Gabor Greif	a75ed761a9	cache result of operator* llvm-svn: 108144	2010-07-12 14:13:15 +00:00
Gabor Greif	15445db11b	cache results of operator* llvm-svn: 108143	2010-07-12 14:12:11 +00:00
Gabor Greif	a5fa885d47	cache results of operator* llvm-svn: 108142	2010-07-12 14:10:24 +00:00
Benjamin Kramer	f578c36035	Reapply 108136 with an ugly pasto fixed. llvm-svn: 108141	2010-07-12 13:44:00 +00:00
Benjamin Kramer	11743249e6	Move optimization to avoid redundant matching. llvm-svn: 108140	2010-07-12 13:34:22 +00:00
Benjamin Kramer	9675e759cf	Revert r108136 until I figure out why it broke selfhost. llvm-svn: 108139	2010-07-12 12:35:49 +00:00
Gabor Greif	782f62412f	cache dereferenced iterators llvm-svn: 108138	2010-07-12 12:03:02 +00:00
Gabor Greif	433b975fe2	recommit r108131 (hich has been backed out in r108135) with a fix llvm-svn: 108137	2010-07-12 12:02:10 +00:00
Benjamin Kramer	35473faa50	instcombine: fold (x & y) \| (~x & z) and (x & y) ^ (~x & z) into ((y ^ z) & x) ^ z which is one instruction shorter. (PR6773) before: %and = and i32 %y, %x %neg = xor i32 %x, -1 %and4 = and i32 %z, %neg %xor = xor i32 %and4, %and after: %xor1 = xor i32 %z, %y %and2 = and i32 %xor1, %x %xor = xor i32 %and2, %z llvm-svn: 108136	2010-07-12 11:54:45 +00:00
Gabor Greif	f9610827ce	back out r108131 (of TailDuplication.cpp) for now, it causes a buildbot failure llvm-svn: 108135	2010-07-12 11:32:39 +00:00
Gabor Greif	6143704ac5	cache dereferenced iterators llvm-svn: 108134	2010-07-12 11:19:24 +00:00
Gabor Greif	8629f12bb8	cache dereferenced iterators llvm-svn: 108133	2010-07-12 10:59:23 +00:00
Gabor Greif	d993402df3	cache dereferenced iterators llvm-svn: 108132	2010-07-12 10:49:54 +00:00
Gabor Greif	2a464d7308	cache dereferenced iterators llvm-svn: 108131	2010-07-12 10:36:48 +00:00
Duncan Sands	41b4a6b36a	Convert some tab stops into spaces. llvm-svn: 108130	2010-07-12 08:16:59 +00:00
Chris Lattner	601e390a3b	make the prototypes for CreateMalloc and CreateFree more consistent. Patch by Hans Vandierendonck from PR7605 llvm-svn: 108116	2010-07-12 00:57:28 +00:00
Chris Lattner	bbc25ff5cc	if jump threading is able to infer interesting values on both the LHS and RHS of an and/or instruction, don't multiply add known predecessor values. This fixes the crash on testcase from PR7498 llvm-svn: 108114	2010-07-12 00:47:34 +00:00
Duncan Sands	82b21c086e	The accumulator tail recursion transform claims to work for any associative operation, but the way it's implemented requires the operation to also be commutative. So add a check for commutativity (and tweak the corresponding comments). This makes no difference in practice since every associative LLVM instruction is also commutative! Here's an example to show the need for commutativity: the accum_recursion.ll testcase calculates the factorial function. Before the transformation the result of a call is ((((11)2)3)...)x while afterwards it is (((1x)(x-1))...2)1 which clearly requires both associativity and commutativity of * to be equal to the original. llvm-svn: 108056	2010-07-10 20:31:42 +00:00
Gabor Greif	9d5ae03404	cache result of operator* llvm-svn: 107990	2010-07-09 16:51:20 +00:00
Gabor Greif	fd8e7d4a0f	cache result of operator* llvm-svn: 107984	2010-07-09 16:31:08 +00:00
Gabor Greif	e7650c7c29	cache result of operator* llvm-svn: 107983	2010-07-09 16:26:41 +00:00
Gabor Greif	04af1e4f65	cache result of operator* llvm-svn: 107981	2010-07-09 16:17:52 +00:00
Gabor Greif	e82532a1c5	cache result of operator* llvm-svn: 107976	2010-07-09 15:40:10 +00:00
Gabor Greif	6d8870fc35	cache result of operator* llvm-svn: 107975	2010-07-09 15:25:42 +00:00
Gabor Greif	329c4d8ed9	cache result of operator* llvm-svn: 107974	2010-07-09 15:25:09 +00:00
Gabor Greif	0028cc6730	cache result of operator* llvm-svn: 107972	2010-07-09 15:01:36 +00:00
Gabor Greif	d323f5e161	cache result of operator* (found by inspection) llvm-svn: 107971	2010-07-09 14:48:08 +00:00
Gabor Greif	b0d56ffc85	cache result of operator* llvm-svn: 107969	2010-07-09 14:36:49 +00:00
Gabor Greif	4247949ce9	cache result of operator* llvm-svn: 107968	2010-07-09 14:29:14 +00:00
Gabor Greif	a02f232c1b	cache result of operator* llvm-svn: 107966	2010-07-09 14:18:23 +00:00
Gabor Greif	f0821f39ee	cache operator*'s result (in multiple functions) llvm-svn: 107965	2010-07-09 14:02:13 +00:00
Gabor Greif	60a346d0f1	do not repeatedly dereference use_iterator llvm-svn: 107962	2010-07-09 12:23:50 +00:00
Benjamin Kramer	2321e6a4d4	Teach instcombine to transform (X >s -1) ? C1 : C2 and (X <s 0) ? C2 : C1 into ((X >>s 31) & (C2 - C1)) + C1, avoiding the conditional. This optimization could be extended to take non-const C1 and C2 but we better stay conservative to avoid code size bloat for now. for int sel(int n) { return n >= 0 ? 60 : 100; } we now generate sarl $31, %edi andl $40, %edi leal 60(%rdi), %eax instead of testl %edi, %edi movl $60, %ecx movl $100, %eax cmovnsl %ecx, %eax llvm-svn: 107866	2010-07-08 11:39:10 +00:00
Chris Lattner	efa3c824cc	Fix the second half of PR7437: scalarrepl wasn't preserving address spaces when SRoA'ing memcpy's. llvm-svn: 107846	2010-07-08 00:27:05 +00:00
Duncan Sands	408bb192de	Rename "Release" builds as "Release+Asserts"; rename "Release-Asserts" builds to "Release". The default build is unchanged (optimization on, assertions on), however it is now called Release+Asserts. The intent is that future LLVM releases released via llvm.org will be Release builds in the new sense, i.e. will have assertions disabled (currently they have assertions enabled, for a more than 20% slowdown). This will bring them in line with MacOS releases, which ship with assertions disabled. It also means that "Release" now means the same things in make and cmake builds: cmake already disables assertions for "Release" builds AFAICS. llvm-svn: 107758	2010-07-07 07:48:00 +00:00
Nick Lewycky	dace239949	Detabify this file. llvm-svn: 107637	2010-07-06 03:53:43 +00:00
Devang Patel	cefe3831b7	MDString is already checked earlier. llvm-svn: 107516	2010-07-02 21:13:23 +00:00
Dan Gohman	832282e061	Don't claim to preserve AliasAnalysis. First, this is doesn't actually have any effect, and second, deleting stores can potentially invalidate an AliasAnalysis, and there's currently no notification for this. llvm-svn: 107496	2010-07-02 18:43:05 +00:00
Bill Wendling	03bcd6ecc8	Implement the "linker_private_weak" linkage type. This will be used for Objective-C metadata types which should be marked as "weak", but which the linker will remove upon final linkage. However, this linkage isn't specific to Objective-C. For example, the "objc_msgSend_fixup_alloc" symbol is defined like this: .globl l_objc_msgSend_fixup_alloc .weak_definition l_objc_msgSend_fixup_alloc .section __DATA, __objc_msgrefs, coalesced .align 3 l_objc_msgSend_fixup_alloc: .quad _objc_msgSend_fixup .quad L_OBJC_METH_VAR_NAME_1 This is different from the "linker_private" linkage type, because it can't have the metadata defined with ".weak_definition". Currently only supported on Darwin platforms. llvm-svn: 107433	2010-07-01 21:55:59 +00:00
Devang Patel	2b434e12cd	Debugging infomration is encoded in llvm IR using metadata. This is designed such a way that debug info for symbols preserved even if symbols are optimized away by the optimizer. Add new special pass to remove debug info for such symbols. llvm-svn: 107416	2010-07-01 19:49:20 +00:00
Devang Patel	b9e2e4b762	If a named mdnode is removed then mark module as changed. llvm-svn: 107412	2010-07-01 18:27:46 +00:00
Jim Grosbach	e74c78d539	lowerinvoke needs to handle aggregate function args like sjlj eh does. llvm-svn: 107335	2010-06-30 22:22:59 +00:00
Devang Patel	db735cbbab	Remove all debug info related named mdnodes. llvm-svn: 107323	2010-06-30 21:29:00 +00:00
Gabor Greif	74470192d7	use ArgOperand API llvm-svn: 107278	2010-06-30 12:42:43 +00:00
Gabor Greif	d50572802e	use ArgOperand API llvm-svn: 107277	2010-06-30 12:40:35 +00:00
Gabor Greif	3abd881bea	use getArgOperand (corrected by CallInst::ArgOffset) instead of getOperand llvm-svn: 107275	2010-06-30 12:38:26 +00:00
Gabor Greif	743b3fd196	use getArgOperand (corrected by CallInst::ArgOffset) instead of getOperand llvm-svn: 107273	2010-06-30 09:19:23 +00:00
Gabor Greif	f628ecd15f	use getNumArgOperands instead of getNumOperands llvm-svn: 107272	2010-06-30 09:17:53 +00:00
Gabor Greif	fe252e6fa0	use getArgOperand instead of getOperand llvm-svn: 107271	2010-06-30 09:16:16 +00:00
Gabor Greif	8ae3095286	use getArgOperand instead of getOperand llvm-svn: 107270	2010-06-30 09:15:28 +00:00
Gabor Greif	e9acc46f65	use getArgOperand instead of getOperand llvm-svn: 107269	2010-06-30 09:14:26 +00:00
Bill Wendling	3632171750	Revert r107205 and r107207. llvm-svn: 107215	2010-06-29 22:34:52 +00:00
Bill Wendling	1767723dbe	Introducing the "linker_weak" linkage type. This will be used for Objective-C metadata types which should be marked as "weak", but which the linker will remove upon final linkage. For example, the "objc_msgSend_fixup_alloc" symbol is defined like this: .globl l_objc_msgSend_fixup_alloc .weak_definition l_objc_msgSend_fixup_alloc .section __DATA, __objc_msgrefs, coalesced .align 3 l_objc_msgSend_fixup_alloc: .quad _objc_msgSend_fixup .quad L_OBJC_METH_VAR_NAME_1 This is different from the "linker_private" linkage type, because it can't have the metadata defined with ".weak_definition". llvm-svn: 107205	2010-06-29 21:24:00 +00:00
Duncan Sands	17f1ca8793	Return Changed. This required setting Changed if dbg metadata is stripped off. Currently set unconditionally, since the API does not provide a way of working out if anything was actually stripped off. llvm-svn: 107142	2010-06-29 14:52:10 +00:00
Gabor Greif	5b1370ee80	use ArgOperand API llvm-svn: 107017	2010-06-28 16:50:57 +00:00
Gabor Greif	e23efeef10	use ArgOperand API llvm-svn: 107016	2010-06-28 16:45:00 +00:00
Gabor Greif	18c5bae727	employ CallInst::ArgOffset (for now) llvm-svn: 107015	2010-06-28 16:43:57 +00:00
Gabor Greif	2dd4307e45	use setArgOperand llvm-svn: 107004	2010-06-28 12:31:35 +00:00
Gabor Greif	ec60adf161	use CallInst::ArgOffset llvm-svn: 107003	2010-06-28 12:30:07 +00:00
Gabor Greif	2de43a7c5c	use ArgOperand API and CallInst::ArgOffset llvm-svn: 107002	2010-06-28 12:29:20 +00:00
Gabor Greif	4300fc77ae	use cached value llvm-svn: 107000	2010-06-28 11:20:42 +00:00
Chris Lattner	25a843fcd2	minor cleanup to SROA: when lowering type unsafe accesses to large integers, the first inserted value would always create an 'or X, 0'. Even though this is trivially zapped by instcombine, don't bother creating this pointless instruction. llvm-svn: 106979	2010-06-27 07:58:26 +00:00
Duncan Sands	3a5cb69cb8	Fix PR7328: when turning a tail recursion into a loop, need to preserve the returned value after the tail call if it differs from other return values. The optimal thing to do would be to introduce a phi node for the return value, but for the moment just fix the miscompile. llvm-svn: 106947	2010-06-26 12:53:31 +00:00
Dan Gohman	fb9712bdae	In GenerateReassociations, don't bother thinking about individual SCEVUnknown values which are loop-variant, as LSR can't do anything interesting with these values in any case. This fixes very slow compile times on loops which have large numbers of such values. llvm-svn: 106897	2010-06-25 22:32:18 +00:00
Dale Johannesen	ce97d55ad9	The hasMemory argument is irrelevant to how the argument for an "i" constraint should get lowered; PR 6309. While this argument was passed around a lot, this is the only place it was used, so it goes away from a lot of other places. llvm-svn: 106893	2010-06-25 21:55:36 +00:00
Gabor Greif	e3ba486c9f	use ArgOperand API (one more hunk I could split) llvm-svn: 106825	2010-06-25 07:58:41 +00:00
Gabor Greif	5f3e656a1b	use ArgOperand API (some hunks I could split) llvm-svn: 106824	2010-06-25 07:57:14 +00:00
Gabor Greif	07e9284c75	use ArgOperand API; tighten type of handleFreeWithNonTrivialDependency to be able to use isFreeCall whithout a cast or new overload llvm-svn: 106823	2010-06-25 07:40:32 +00:00
Dan Gohman	4143e9deeb	Add an exports file for the Hello example plugin. llvm-svn: 106768	2010-06-24 17:36:51 +00:00
Dan Gohman	963b1c142e	A few minor micro-optimizations. llvm-svn: 106764	2010-06-24 16:57:52 +00:00
Dan Gohman	47ddf76d89	Teach getExactSDiv to evaluate x/1 to x up front, as it's a common enough special case, and it theoretically allows more folding because it works even when x is unanalyzable. llvm-svn: 106763	2010-06-24 16:51:25 +00:00
Dan Gohman	ab5422200b	Fix copy+pasto issues in isMulSExtable. llvm-svn: 106759	2010-06-24 16:45:11 +00:00
Gabor Greif	7ccec09252	use ArgOperand API llvm-svn: 106752	2010-06-24 16:11:44 +00:00
Gabor Greif	a6d75e2cf7	use (even more, still) ArgOperand API llvm-svn: 106750	2010-06-24 15:51:11 +00:00
Gabor Greif	218f5541b2	use ArgOperand API and CallSite for arg range; add necessary casts and perform some cosmetics llvm-svn: 106747	2010-06-24 14:42:01 +00:00
Gabor Greif	5aafdf1e43	use ArgOperand API and CallSite for arg range llvm-svn: 106745	2010-06-24 14:13:36 +00:00
Gabor Greif	0a136c9b53	use (even more) ArgOperand API llvm-svn: 106744	2010-06-24 13:54:33 +00:00
Gabor Greif	590d95ed18	use ArgOperand API llvm-svn: 106743	2010-06-24 13:42:49 +00:00
Gabor Greif	589a0b950a	use ArgOperand API llvm-svn: 106740	2010-06-24 12:58:35 +00:00
Gabor Greif	7943017490	use ArgOperand API llvm-svn: 106737	2010-06-24 12:35:13 +00:00
Gabor Greif	75f6943c95	use ArgOperand API, also tighten the type of visitFree to make this work out smoothly llvm-svn: 106736	2010-06-24 12:21:15 +00:00
Gabor Greif	91f9589057	use ArgOperand API; introduce downcasted pointers into scope to facilitate this llvm-svn: 106734	2010-06-24 12:03:56 +00:00
Gabor Greif	e2f482ca0b	use ArgOperand API llvm-svn: 106731	2010-06-24 10:42:46 +00:00
Gabor Greif	2d958d4db5	use ArgOperand API llvm-svn: 106730	2010-06-24 10:17:17 +00:00
Gabor Greif	5bcaa55761	use callsite to obtain all arguments llvm-svn: 106729	2010-06-24 10:04:07 +00:00
Gabor Greif	42f620cc55	use callsite to obtain all arguments llvm-svn: 106728	2010-06-24 09:56:43 +00:00
Gabor Greif	0f60709f0e	use getNumArgOperands llvm-svn: 106709	2010-06-24 00:48:48 +00:00
Gabor Greif	4a39b84a9d	use ArgOperand API llvm-svn: 106707	2010-06-24 00:44:01 +00:00
Devang Patel	0dc3c2d37e	Use ValueMap instead of DenseMap. The ValueMapper used by various cloning utility maps MDNodes also. llvm-svn: 106706	2010-06-24 00:33:28 +00:00
Devang Patel	d8dedee96d	Use available typedef for " DenseMap<const Value, Value>". llvm-svn: 106699	2010-06-24 00:00:42 +00:00
Devang Patel	b8f11de105	Cosmetic change. Do not use "ValueMap" as a name for a local variable or an argument. llvm-svn: 106698	2010-06-23 23:55:51 +00:00
Devang Patel	9ad629367d	Revert 106592 for now. It causes clang-selfhost build failure. llvm-svn: 106598	2010-06-22 23:29:55 +00:00
Dan Gohman	1081f1a0f5	Fix OptimizeMax to handle an odd case where one of the max operands is another max which folds. This fixes PR7454. llvm-svn: 106594	2010-06-22 23:07:13 +00:00
Devang Patel	87f75f75be	If a metadata operand is seeded in value map and the metadata should also be seeded in value map. This is not limited to function local metadata. Failure to seed metdata in such cases causes troubles when in a cloned module, metadata from a new module refers to values in old module. Usually this results in mysterious bugpoint crashes. For example, Checking to see if we can delete global inits: Unknown constant! UNREACHABLE executed at /d/g/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:904! llvm-svn: 106592	2010-06-22 22:53:21 +00:00
Devang Patel	e43c6487da	While cloning a module, clone metadata attached with instructions. llvm-svn: 106591	2010-06-22 22:50:42 +00:00
Devang Patel	e3fbbd19ed	Clone named metadata while cloning a module. Reapply Bob's patch. llvm-svn: 106560	2010-06-22 18:52:38 +00:00
Dan Gohman	d2d1ae105d	Use pre-increment instead of post-increment when the result is not used. llvm-svn: 106542	2010-06-22 15:08:57 +00:00
Devang Patel	f040dec68a	Revert 106528. It is causing self host failures. llvm-svn: 106529	2010-06-22 06:14:09 +00:00
Devang Patel	b195eb4acf	Do not rely on DenseMap slot which can be easily invalidated when DenseMap grows. llvm-svn: 106528	2010-06-22 05:16:56 +00:00
Bob Wilson	6c1fc79cab	Revert my change to clone named metadata. Buildbots are complaining. --- Reverse-merging r106508 into '.': U lib/Transforms/Utils/CloneModule.cpp llvm-svn: 106521	2010-06-22 02:08:51 +00:00
Bob Wilson	5f9575c1cd	Include named metadata when cloning a module. llvm-svn: 106508	2010-06-22 00:11:03 +00:00
Dan Gohman	dd41bba517	Use A.append(...) instead of A.insert(A.end(), ...) when A is a SmallVector, and other SmallVector simplifications. llvm-svn: 106452	2010-06-21 19:47:52 +00:00
Dan Gohman	32655906e4	Add a TODO comment. llvm-svn: 106397	2010-06-19 21:30:18 +00:00
Dan Gohman	51d00092b6	Include the use kind along with the expression in the key of the use sharing map. The reconcileNewOffset logic already forces a separate use if the kinds differ, so incorporating the kind in the key means we can track more sharing opportunities. More sharing means fewer total uses to track, which means smaller problem sizes, which means the conservative throttles don't kick in as often. llvm-svn: 106396	2010-06-19 21:29:59 +00:00
Dan Gohman	297fb8b9fc	Don't include things in anonymous namespaces that don't need it. llvm-svn: 106395	2010-06-19 21:21:39 +00:00
Dan Gohman	f3aea7aecf	Disable indvars on loops when LoopSimplify form is not available. This fixes PR7333. llvm-svn: 106267	2010-06-18 01:35:11 +00:00
Jim Grosbach	e94f1ded24	remove trailing whitespace llvm-svn: 106164	2010-06-16 22:41:09 +00:00
Rafael Espindola	a20e2dfe86	Make sure that simplify libcalls does not replace a call with one calling convention with a new call with a different calling convention. llvm-svn: 106134	2010-06-16 19:34:01 +00:00
Benjamin Kramer	a13bd20396	simplify-libcalls: fold strncmp(x, y, 1) -> memcmp(x, y, 1) The memcmp will be optimized further and even the pathological case 'strstr(x, "x") == x' generates optimal code now. llvm-svn: 106097	2010-06-16 10:30:29 +00:00
Benjamin Kramer	1118860e3a	simplify-libcalls: fold strstr(a, b) == a -> strncmp(a, b, strlen(b)) == 0 llvm-svn: 106047	2010-06-15 21:34:25 +00:00
Chris Lattner	329ea064ed	jump threading can't split a critical edge from an indirectbr. This fixes PR7356. llvm-svn: 105950	2010-06-14 19:45:43 +00:00
Benjamin Kramer	b82de426de	SimplifyCFG: don't turn volatile stores to null/undef into unreachable. Fixes PR7369. llvm-svn: 105914	2010-06-13 14:35:54 +00:00
Kenneth Uildriks	9b21208bfb	Pulled CodeMetrics out of InlineCost.h and made it a bit more general, so it can be reused from PartialSpecializationCost llvm-svn: 105725	2010-06-09 15:11:37 +00:00
Dan Gohman	fb8ed43349	Make bugpoint dead-argument-hacking actually work, and actually test it. llvm-svn: 105551	2010-06-07 20:20:33 +00:00
Kenneth Uildriks	1850444000	Partial specialization was not checking the callsite to make sure it was using the same constants as the specialization, leading to calls to the wrong specialization. Patch by Takumi Nakamura\! llvm-svn: 105528	2010-06-05 14:50:21 +00:00
Dan Gohman	67b4403101	Don't track users of undef values; they aren't interesting for register pressure. llvm-svn: 105501	2010-06-04 23:16:05 +00:00
Devang Patel	36da24b546	Copy location info for current function argument from dbg.declare if respective store instruction does not have any location info. llvm-svn: 105490	2010-06-04 22:27:30 +00:00
Jim Grosbach	5ba76b94f8	Remove unused code llvm-svn: 105293	2010-06-01 21:56:30 +00:00
Jim Grosbach	0e20dc5cd6	fix think-o llvm-svn: 105291	2010-06-01 21:35:50 +00:00
Jim Grosbach	b69c68742a	Simplify things a bit more. Fix prototype to use SmallVectorImpl and change a few SmallVectors to vanilla C arrays. llvm-svn: 105289	2010-06-01 21:06:46 +00:00
Jim Grosbach	a37af16221	mirror of r105280 changes for LowerInvoke, which uses the same basic logic here llvm-svn: 105281	2010-06-01 18:04:56 +00:00
Jim Grosbach	7352167560	Use SmallVector instead of std::vector. llvm-svn: 105279	2010-06-01 17:56:41 +00:00
Duncan Sands	4c904fa797	Fix PR7272: when inlining through a callsite with byval arguments, the newly created allocas may be used by inlined calls, so these need to have their tail call flags cleared. Fixes PR7272. llvm-svn: 105255	2010-05-31 21:00:26 +00:00
Benjamin Kramer	5ac57e3440	Avoid swap when a copy suffices. llvm-svn: 105220	2010-05-31 12:50:41 +00:00
Nick Lewycky	aee2632be3	The memcpy intrinsic only takes i8* for %src and %dst, so cast them to that first. Fixes PR7265. llvm-svn: 105206	2010-05-31 06:16:35 +00:00
Dan Gohman	826bdf8c10	Move FindAvailableLoadedValue isSafeToLoadUnconditionally out of lib/Transforms/Utils and into lib/Analysis so that Analysis passes can use them. llvm-svn: 104949	2010-05-28 16:19:17 +00:00
Dan Gohman	df5d7dcef1	Teach instcombine to promote alloca array sizes. llvm-svn: 104945	2010-05-28 15:09:00 +00:00
Dan Gohman	05a6555acb	Fix instcombine's handling of alloca to accept non-i32 types. llvm-svn: 104935	2010-05-28 04:33:04 +00:00
Devang Patel	3e0fbafab2	Fix typo. llvm-svn: 104914	2010-05-28 01:29:50 +00:00
Devang Patel	e2099e8088	Fix typo. llvm-svn: 104913	2010-05-28 01:17:51 +00:00
Devang Patel	7a9dedf0ab	Do not drop location info for inlined function args. llvm-svn: 104884	2010-05-27 20:25:04 +00:00
Duncan Sands	f162eace49	Teach instCombine to remove malloc+free if malloc's only uses are comparisons to null. Patch by Matti Niemenmaa. llvm-svn: 104871	2010-05-27 19:09:06 +00:00
Benjamin Kramer	6877119ef3	Kill unneeded SExt. llvm-svn: 104692	2010-05-26 09:45:04 +00:00
Benjamin Kramer	9439084cea	Properly promote operands when optimizing a single-character memcmp. llvm-svn: 104648	2010-05-25 22:53:43 +00:00
Dan Gohman	a4abd035ea	Fix a missing newline in debug output. llvm-svn: 104644	2010-05-25 21:50:35 +00:00
Dan Gohman	9b48b856ea	DominatorTree.getNode can return null for unreachable blocks. llvm-svn: 104290	2010-05-20 22:46:54 +00:00
Dan Gohman	86110fa2bb	Minor code cleanups. llvm-svn: 104287	2010-05-20 22:25:20 +00:00
Dan Gohman	6295f2ebb8	Make Solve check its own post-condition, to reduce clutter in the top-level LSRInstance logic. llvm-svn: 104278	2010-05-20 20:59:23 +00:00
Dan Gohman	a4ca28a3ae	Add comments. llvm-svn: 104276	2010-05-20 20:52:00 +00:00
Dan Gohman	927bcaadda	More code cleanups. Use iterators instead of indices when indices aren't needed. llvm-svn: 104273	2010-05-20 20:33:18 +00:00
Dan Gohman	4c4043cf34	Fix OptimizeShadowIV to set Changed. Change OptimizeLoopTermCond to set Changed directly instead of using a return value. Rename FilterOutUndesirableDedicatedRegisters's Changed variable to distinguish it from LSRInstance's Changed member. llvm-svn: 104269	2010-05-20 20:05:31 +00:00
Dan Gohman	8ec018cedf	Add some comments. llvm-svn: 104268	2010-05-20 20:00:41 +00:00
Dan Gohman	8ce95cc3c5	Simplify this code. Don't do a DomTreeNode lookup for each visited block. llvm-svn: 104267	2010-05-20 20:00:25 +00:00
Dan Gohman	ab5fb7f559	Minor code cleanups. llvm-svn: 104263	2010-05-20 19:44:23 +00:00
Dan Gohman	ee2fea3cd7	When canonicalizing icmp operand order to put the loop invariant operand on the left, the interesting operand is on the right. This fixes a bug where LSR was failing to recognize ICmpZero uses, which led it to be unable to reverse the induction variable in the attached testcase. Delete test/CodeGen/X86/stack-color-with-reg-2.ll, because its test is extremely fragile and hard to meaningfully update. llvm-svn: 104262	2010-05-20 19:26:52 +00:00
Dan Gohman	fdf9874ba7	Set Changed to true when canonicalizing ICmp operand order; even though it isn't a very interesting change, it's a change nonetheless. llvm-svn: 104260	2010-05-20 19:16:03 +00:00
Devang Patel	e2ff7f3a7d	Strip llvm.dbg.lv also. llvm-svn: 104236	2010-05-20 16:49:22 +00:00
Dan Gohman	981563d0ba	Rename a variable to avoid shadowing. llvm-svn: 104234	2010-05-20 16:41:11 +00:00
Dan Gohman	6b733fc189	Minor code simplification. llvm-svn: 104232	2010-05-20 16:23:28 +00:00
Dan Gohman	80a9608442	Move the code for deleting BaseRegs and LSRUses into helper functions, and fix a bug that valgrind noticed where the code would std::swap an element with itself. llvm-svn: 104225	2010-05-20 15:17:54 +00:00
Dan Gohman	20fab456da	Teach LSR how to cope better with unrolled loops on targets where the addressing modes don't make this trivially easy. This allows it to avoid falling into the less precise heuristics in more cases. llvm-svn: 104186	2010-05-19 23:43:12 +00:00
Dan Gohman	beebef4137	Add a comment. llvm-svn: 104089	2010-05-18 23:55:57 +00:00
Dan Gohman	50f8f2c23d	Fix the predicate which checks for non-sensical formulae which have constants in registers which partially cancel out their immediate fields. llvm-svn: 104088	2010-05-18 23:48:08 +00:00
Dan Gohman	4cf99b5303	Factor out the code for recomputing an LSRUse's Regs set after some of its formulae have been removed into a helper function, and also teach it how to update the RegUseTracker. llvm-svn: 104087	2010-05-18 23:42:37 +00:00
Dan Gohman	a4eca05174	Factor out code for estimating search space complexity into a helper function. llvm-svn: 104082	2010-05-18 22:51:59 +00:00
Dan Gohman	63e9015248	Add some more debug output. llvm-svn: 104080	2010-05-18 22:41:32 +00:00
Dan Gohman	f1c7b1b42f	Factor out the code for deleting a formula from an LSRUse into a helper function. llvm-svn: 104079	2010-05-18 22:39:15 +00:00
Dan Gohman	8aca7ef903	Make some debug output more informative. llvm-svn: 104078	2010-05-18 22:37:37 +00:00
Dan Gohman	06ab08f795	Print an error message in Formula::print if the HasBaseReg flag is inconsistent with the BaseRegs field. It's not print's job to assert on an invalid condition, but it can make one more obvious. llvm-svn: 104077	2010-05-18 22:35:55 +00:00
Dan Gohman	248c41d108	Rename RegUseTracker's RegUses member to RegUsesMap to avoid confusion with LSRInstance's RegUses member. llvm-svn: 104076	2010-05-18 22:33:00 +00:00
Nick Lewycky	b35818eb25	Teach the always inliner to release its inline cost estimates, like the basic inliner did in r103653. Why does the always inliner even bother with cost estimates anyways? llvm-svn: 103858	2010-05-15 04:26:25 +00:00
Nick Lewycky	002a45eb64	Clean up, no functional change. llvm-svn: 103857	2010-05-15 03:41:58 +00:00
Nick Lewycky	2b3cbac0ee	Remove heinous tabs. llvm-svn: 103700	2010-05-13 06:45:13 +00:00
Nick Lewycky	d3c6dfe853	Replace the core comparison login in merge functions. We can now merge vector<>::push_back() in: int foo(vector<int> &a, vector<unsigned> &b) { a.push_back(10); b.push_back(11); } to two calls to the same push_back function, or fold away the two copies of push_back() in: struct T { int; }; struct S { char; }; vector<T> t; vector<S> s; void f(T x) { t.push_back(x); } void g(S x) { s.push_back(x); } but leave f() and g() separate, since they refer to two different global variables. llvm-svn: 103698	2010-05-13 05:48:45 +00:00
Nick Lewycky	c63aa1e8ab	Clear CachedFunctionInfo upon Pass::releaseMemory. Because ValueMap will abort on RAUW of functions, this is a correctness issue instead of a mere memory usage problem. No testcase until the new MergeFunctions can land. llvm-svn: 103653	2010-05-12 21:48:15 +00:00
Duncan Sands	6c5e4355bb	I got tired of VISIBILITY_HIDDEN colliding with the gcc enum. Rename it to LLVM_LIBRARY_VISIBILITY and introduce LLVM_GLOBAL_VISIBILITY, which is the opposite, for future use by dragonegg. llvm-svn: 103495	2010-05-11 20:16:09 +00:00
Douglas Gregor	6739a89117	Fixes for Microsoft Visual Studio 2010, from Steven Watanabe! llvm-svn: 103457	2010-05-11 06:17:44 +00:00
Chris Lattner	84d4618659	make simplifycfg insert an llvm.trap before the 'unreachable' it introduces when it detects undefined behavior. llvm.trap generally codegens into some thing really small (e.g. a 2 byte ud2 instruction on x86) and debugging this sort of thing is "nontrivial". For example, we now compile: void foo() { (int)0 = 42; } into: _foo: pushl %ebp movl %esp, %ebp ud2 Some may even claim that this is a security hole, though that seems dubious to me. This addresses rdar://7958343 - Optimizing away null dereference potentially allows arbitrary code execution llvm-svn: 103356	2010-05-08 22:15:59 +00:00
Chris Lattner	02b0df5338	Teach instcombine to transform a bitcast/(zext\|trunc)/bitcast sequence with a vector input and output into a shuffle vector. This sort of sequence happens when the input code stores with one type and reloads with another type and then SROA promotes to i96 integers, which make everyone sad. This fixes rdar://7896024 llvm-svn: 103354	2010-05-08 21:50:26 +00:00
Chris Lattner	5a62d6e578	Fix PR7052, patch by Jakub Staszak! llvm-svn: 103347	2010-05-08 20:01:44 +00:00
Dan Gohman	d0800241d2	When pruning candidate formulae out of an LSRUse, update the LSRUse's Regs set after all pruning is done, rather than trying to do it on the fly, which can produce an incomplete result. This fixes a case where heuristic pruning was stripping all formulae from a use, which led the solver to enter an infinite loop. Also, add a few asserts to diagnose this kind of situation. llvm-svn: 103328	2010-05-07 23:36:59 +00:00
Devang Patel	32cc43c242	Wrap const MDNode * inside DIDescriptor. llvm-svn: 103295	2010-05-07 20:54:48 +00:00
Devang Patel	4423abd734	Use overloaded operators instead of DIDescriptor::getNode() llvm-svn: 103276	2010-05-07 18:19:32 +00:00
Ted Kremenek	d90773ebe0	Update CMake build. llvm-svn: 103266	2010-05-07 17:13:20 +00:00
Dan Gohman	5d5b8b1b8c	Add an LLVM IR version of code sinking. This uses the same simple algorithm as MachineSink, but it isn't constrained by MachineInstr-level details. llvm-svn: 103257	2010-05-07 15:40:13 +00:00
Bob Wilson	0c8b29bcdb	Use the right version of "append" to combine two SmallVectors. This fixes the compile-time regressions seen in last night's tests. llvm-svn: 103118	2010-05-05 20:44:15 +00:00
Bob Wilson	d1b38e317d	Combine the implementations of the core part of the SSAUpdater and MachineSSAUpdater to avoid duplicating all the code. llvm-svn: 103060	2010-05-04 23:18:19 +00:00
Bob Wilson	a2fda8b648	Defer adding critical edges to the "toSplit" list until after checking for indirect branches in all the predecessors. This avoids unnecessarily splitting edges in cases where load PRE is not possible anyway. Thanks to Jakub Staszak for pointing this out. llvm-svn: 103034	2010-05-04 20:03:21 +00:00
Dan Gohman	1d2ded75e2	Use getConstant instead of getIntegerSCEV. The two are basically the same, now that getConstant has overloads consistent with ConstantInt::get. llvm-svn: 102965	2010-05-03 22:09:21 +00:00
Devang Patel	9f5200a122	Check for side effects before splitting loop. Patch by Jakub Staszak! llvm-svn: 102928	2010-05-03 18:06:58 +00:00
Chris Lattner	b49a622fe9	revert r102831. We already delete dead readonly calls in other places, killing a valid transformation is not the right answer. llvm-svn: 102850	2010-05-01 17:19:38 +00:00
Owen Anderson	550986ea90	Disable the call-deletion transformation introduced in r86975. Without halting analysis, it is illegal to delete a call to a read-only function. The correct solution is almost certainly to add a "must halt" attribute and only allow deletions in its presence. XFAIL the relevant testcase for now. llvm-svn: 102831	2010-05-01 08:34:28 +00:00
Chris Lattner	c2432b9d44	rename InlineInfo.DevirtualizedCalls -> InlinedCalls to reflect that it includes all inlined calls now, not just devirtualized ones. llvm-svn: 102824	2010-05-01 01:26:13 +00:00
Chris Lattner	fc8d9ee6c3	Implement rdar://6295824 and PR6724 with two tiny changes that can have a big effect :). The first is to enable the iterative SCC passmanager juice that kicks in when the scc passmgr detects that a function pass has devirtualized a call. In this case, it will rerun all the passes it manages on the SCC, up to the iteration count limit (4). This is useful because a function pass may devirualize a call, and we want the inliner to inline it, or pruneeh to infer stuff about it, etc. The second patch is to add all call sites to the DevirtualizedCalls list the inliner uses. This list is about to get renamed, but the jist of this is that the inliner now reconsiders all inlined call sites as candidates for further inlining. The intuition is this that in cases like this: f() { g(1); } g(int x) { h(x); } We analyze this bottom up, and may decide that it isn't profitable to inline H into G. Next step, we decide that it is profitable to inline G into F, and do so, which means that F now calls H. Even though the call from G -> H may not have been profitable to inline, the call from F -> H may be (in this case because a constant allows folding etc). In my spot checks, this doesn't have a big impact on code. For example, the LLC output for 252.eon grew from 0.02% (from 317252 to 317308) and 176.gcc actually shrunk by .3% (from 1525612 to 1520964 bytes). 252.eon never iterated in the SCC Passmgr, 176.gcc iterated at most 1 time. llvm-svn: 102823	2010-05-01 01:15:56 +00:00
Chris Lattner	e8262675a3	The inliner has traditionally not considered call sites that appear due to inlining a callee as candidates for futher inlining, but a recent patch made it do this if those call sites were indirect and became direct. Unfortunately, in bizarre cases (see testcase) doing this can cause us to infinitely inline mutually recursive functions into callers not in the cycle. Fix this by keeping track of the inline history from which callsite inline candidates got inlined from. This shouldn't affect any "real world" code, but is required for a follow on patch that is coming up next. llvm-svn: 102822	2010-05-01 01:05:10 +00:00
Devang Patel	3ca9a9b59c	Preserve debug info attached with call instruction while eliminating dead argument. Radar 7927803 llvm-svn: 102760	2010-04-30 20:23:54 +00:00
Chris Lattner	4bd85e47bf	further clarify alignment of globals, fix instcombine to not increase the alignment of globals with an assigned alignment and section. llvm-svn: 102476	2010-04-28 00:31:12 +00:00
Chris Lattner	44a27efdf9	Fix a problem that lower invoke has with allocas (PR6694), and add a version of createLowerInvokePass that allows the client to specify whether it wants "expensive" or "cheap" lowering. Patch by Alex Mac! llvm-svn: 102402	2010-04-26 23:49:32 +00:00
Chris Lattner	87aa2243e2	fix PR6940: sitofp(undef) folds to 0.0, not undef. llvm-svn: 102358	2010-04-26 18:21:23 +00:00
Chris Lattner	b34ffe36ae	remove #if 1's. llvm-svn: 102296	2010-04-25 04:43:02 +00:00
Dan Gohman	534ba376f6	Generalize LSR's OptimizeMax to handle the new kinds of max expressions that indvars may use, now that indvars is recognizing le and ge loops. llvm-svn: 102235	2010-04-24 03:13:44 +00:00
Chris Lattner	d3b361d1b6	enable my inliner change: add newly devirtualized call sites to the worklist, making them inline candidates. llvm-svn: 102213	2010-04-23 21:16:07 +00:00
Chris Lattner	c691de3b4e	switch InlineInfo.DevirtualizedCalls's list to be of WeakVH. This fixes a bug where calls inlined into an invoke would get changed into an invoke but the array would keep pointing to the (now dead) call. The improved inliner behavior is still disabled for now. llvm-svn: 102196	2010-04-23 18:37:01 +00:00
Dan Gohman	997bbc54d6	Fix LSR to tolerate cases where ScalarEvolution initially misses an opportunity to fold add operands, but folds them after LSR has separated them out. This fixes rdar://7886751. llvm-svn: 102157	2010-04-23 01:55:05 +00:00
Chris Lattner	d8d898dbd3	disable my previous inliner patch, it appears to be busting self-host. llvm-svn: 102153	2010-04-23 00:41:03 +00:00
Chris Lattner	2eee5d3467	The inliner was choosing to not consider call sites that appear in the SCC as a result of inlining as candidates for inlining. Change this so that it does consider call sites that change from being indirect to being direct as a result of inlining. This allows it to completely "devirtualize" the testcase. llvm-svn: 102146	2010-04-22 23:37:35 +00:00
Chris Lattner	4ba01ec869	refactor the interface to InlineFunction so that most of the in/out arguments are handled with a new InlineFunctionInfo class. This makes it easier to extend InlineFunction to return more info in the future. llvm-svn: 102137	2010-04-22 23:07:58 +00:00
Chris Lattner	016c00a311	when inlining something like this: define void @f3(void (i8) %__f) ssp { entry: call void %__f(i8* undef) unreachable } define void @f4(i8* %this) ssp align 2 { entry: call void @f3(void (i8) @f2) ssp ret void } The inliner is turning the indirect call to %__f into a direct call to F2. Make the call graph more precise when this happens. The inliner doesn't revisit call sites introduced by inlining, so there isn't an easy way to test for this, but a more precise callgraph is a good thing. llvm-svn: 102131	2010-04-22 21:31:00 +00:00
Chris Lattner	0a3b5b4e39	eliminate dead #include. llvm-svn: 102119	2010-04-22 20:41:10 +00:00
Bob Wilson	4c7f50afb8	Fix a performance problem with the new SSAUpdater. This showed up in the GCCAS time for MultiSource/Benchmarks/ASCI_Purple/SMG2000. llvm-svn: 102009	2010-04-21 18:39:03 +00:00
Devang Patel	2176643241	Rename ValueMapTy as ValueToValueMapTy to clearly indicate that this has no replationship with ADT/ValueMap. llvm-svn: 101950	2010-04-20 22:24:18 +00:00
Devang Patel	382b969647	There is no need to install ValueMapper.h header. llvm-svn: 101949	2010-04-20 22:18:31 +00:00
Gabor Greif	27b3d55194	use abstract accessors to CallInst llvm-svn: 101899	2010-04-20 13:13:04 +00:00
Chris Lattner	66e809acc0	remove a bunch of ad-hoc code to simplify instructions from loop unswitch, and use inst simplify instead. It is more powerful and less duplication. llvm-svn: 101874	2010-04-20 05:33:18 +00:00
Chris Lattner	c707fa9651	move some select simplifications out out instcombine into inst simplify. No functionality change. llvm-svn: 101873	2010-04-20 05:32:14 +00:00
Chris Lattner	5814d9d9da	RewriteLoopBodyWithConditionConstant can end up rewriting the condition we're unswitching on. In this case, don't try to simplify the second copy of the loop which may be dead or not, but is probably a constant now. This fixes PR6879 llvm-svn: 101870	2010-04-20 05:09:16 +00:00
Chris Lattner	a5cdd5e6a2	make the inliner do less work for leaf functions. llvm-svn: 101846	2010-04-20 00:47:08 +00:00
Chris Lattner	e93846762a	Fix rdar://7879828 - crash in CallGraph, a self host issue. Arg promotion was deleting call graph nodes that still had references from the 'indirect' CGN. Like the inliner, it should only delete the function if all references are gone. llvm-svn: 101845	2010-04-20 00:46:50 +00:00
Dan Gohman	e637ff5e9a	Remove the Expr member from IVUsers. Instead of remembering the expression, just ask ScalarEvolution for it on demand. This helps IVUsers be more robust in the case of expressions changing underneath it. This fixes PR6862. llvm-svn: 101819	2010-04-19 21:48:58 +00:00
Bob Wilson	ca51425d94	Re-commit my previous SSAUpdater changes. The previous version naively tried to determine where to place PHIs by iteratively comparing reaching definitions at each block. That was just plain wrong. This version now computes the dominator tree within the subset of the CFG where PHIs may need to be placed, and then places the PHIs in the iterated dominance frontier of each definition. The rest of the patch is mostly the same, with a few more performance improvements added in. llvm-svn: 101612	2010-04-17 03:08:24 +00:00
Eric Christopher	7258dcd77f	Revert 101465, it broke internal OpenGL testing. Probably the best way to know that all getOperand() calls have been handled is to replace that API instead of updating. llvm-svn: 101579	2010-04-16 23:37:20 +00:00
Chris Lattner	4422d31b84	introduce a new CallGraphSCC class, and pass it around to CallGraphSCCPass's instead of passing around a std::vector<CallGraphNode*>. No functionality change, but now we have a much tidier interface. llvm-svn: 101558	2010-04-16 22:42:17 +00:00
Dan Gohman	99e5327bfd	Refine the detection of seemingly infinitely recursive calls where the callee is expected to be expanded to something else by codegen, so that normal infinitely recursive calls are still transformed. llvm-svn: 101468	2010-04-16 15:57:50 +00:00
Gabor Greif	f375520f7b	reapply r101434 with a fix for self-hosting rotate CallInst operands, i.e. move callee to the back of the operand array the motivation for this patch are laid out in my mail to llvm-commits: more efficient access to operands and callee, faster callgraph-construction, smaller compiler binary llvm-svn: 101465	2010-04-16 15:33:14 +00:00
Chris Lattner	bd2d9430d6	fix comment noticed by Bob llvm-svn: 101437	2010-04-16 02:32:17 +00:00
Gabor Greif	403e9694f9	back out r101423 and r101397, they break llvm-gcc self-host on darwin10 llvm-svn: 101434	2010-04-16 01:16:20 +00:00
Chris Lattner	1146d326a7	fix PR6832: we were using the alignment of a pointer when we wanted the alignment of the pointee. llvm-svn: 101432	2010-04-16 01:05:38 +00:00
Chris Lattner	b73552908e	improve comments. llvm-svn: 101429	2010-04-16 00:38:19 +00:00
Chris Lattner	78d7dbbc30	pull all the ConvertToScalarInfo code together into one place. llvm-svn: 101427	2010-04-16 00:24:57 +00:00
Chris Lattner	d69c3ee958	more refactoring: suck some stuff out of SRoA into ConvertToScalarInfo. llvm-svn: 101425	2010-04-16 00:20:00 +00:00
Gabor Greif	6af0ad846e	shift intrinsic operand llvm-svn: 101423	2010-04-16 00:06:45 +00:00
Chris Lattner	9ef4eae6e6	introduce a new ConvertToScalarInfo struct to simplify CanConvertToScalar/MergeInType. Eliminate a pointless LLVMContext argument to MergeInType. llvm-svn: 101422	2010-04-15 23:50:26 +00:00
Chris Lattner	9c1172d848	tidy interface to isOnlyCopiedFromConstantGlobal llvm-svn: 101405	2010-04-15 21:59:20 +00:00
Gabor Greif	33ae80bff7	reapply r101364, which has been backed out in r101368 with a fix rotate CallInst operands, i.e. move callee to the back of the operand array the motivation for this patch are laid out in my mail to llvm-commits: more efficient access to operands and callee, faster callgraph-construction, smaller compiler binary llvm-svn: 101397	2010-04-15 20:51:13 +00:00
Anton Korobeynikov	839cdaa70a	Revert r100896 and around - this breaks the only mingw32 buildbot we have. llvm-svn: 101387	2010-04-15 19:51:42 +00:00
Dan Gohman	b29cda9b3c	Fix a bunch of namespace polution. llvm-svn: 101376	2010-04-15 17:08:50 +00:00
Gabor Greif	9fd00c7d25	back out r101364, as it trips the linux nightlybot on some clang C++ tests llvm-svn: 101368	2010-04-15 12:46:56 +00:00
Gabor Greif	aafd209632	rotate CallInst operands, i.e. move callee to the back of the operand array the motivation for this patch are laid out in my mail to llvm-commits: more efficient access to operands and callee, faster callgraph-construction, smaller compiler binary llvm-svn: 101364	2010-04-15 10:49:53 +00:00
Tobias Grosser	de1a37b872	IPO needs ScalarOpts and InstCombine in its libs The commit "Adding IPSCCP and Internalize passes to the C-bindings" introduced new dependencies for IPO. Add these to the CMAKE build as otherwise the BUILD_SHARED_LIBS=1 build fails. llvm-svn: 101313	2010-04-14 23:42:23 +00:00
Evan Cheng	21b588b678	- Code clean up to reduce indentation. - TryToOptimizeStoreOfMallocToGlobal should check if TargetData is available and bail out if it is not. The transformations being done requires TD. llvm-svn: 101285	2010-04-14 20:52:55 +00:00
Gabor Greif	c08e5df836	performance: cache the dereferenced use_iterator llvm-svn: 101253	2010-04-14 16:48:56 +00:00
Gabor Greif	a49686fa3e	performance: cache the dereferenced use_iterator llvm-svn: 101250	2010-04-14 16:13:56 +00:00
Nick Lewycky	163a743b51	I don't know how, but I managed to goof the revert. Remove function that should have been removed in r101231. llvm-svn: 101232	2010-04-14 05:03:50 +00:00
Nick Lewycky	ca615eb0d6	Revert r101213. llvm-svn: 101231	2010-04-14 04:51:58 +00:00
Nick Lewycky	087d59cf25	Remove tab. llvm-svn: 101223	2010-04-14 04:19:05 +00:00
Nick Lewycky	3cdae269f0	While DAE can't modify the function signature of an externally visible function, it can check whether the visible direct callers are passing in parameters to dead arguments and replace those with undef. This reinstates r94322 with bugs fixed. llvm-svn: 101213	2010-04-14 03:38:11 +00:00
Eric Christopher	4016dcd625	Actually... return after the check for invalid input. llvm-svn: 101139	2010-04-13 16:41:29 +00:00
Owen Anderson	b516f1c6cc	Remove SCCVN from the CMake build system. llvm-svn: 101125	2010-04-13 08:33:09 +00:00
Owen Anderson	9ed6abfe0b	SCCVN, we hardly knew ye! llvm-svn: 101117	2010-04-13 05:24:08 +00:00
Dan Gohman	5867a56db8	Teach IndVarSimplify how to eliminate remainder operators where the numerator is an induction variable. For example, with code like this: for (i=0;i<n;++i) x[i%n] = 0; IndVarSimplify will now recognize that i is always less than n inside the loop, and eliminate the remainder. llvm-svn: 101113	2010-04-13 01:46:36 +00:00
Dan Gohman	4a645b88ef	Suppress LinearFunctionTestReplace when the computed backedge-taken expression is a UDiv and it doesn't appear that the UDiv came from the user's source. ScalarEvolution has recently figured out how to compute a tripcount expression for the inner loop in SingleSource/Benchmarks/Shootout/sieve.c, using a udiv. Emitting a udiv instruction dramatically slows down the enclosing loop. llvm-svn: 101068	2010-04-12 21:13:43 +00:00
Dan Gohman	27c8e79839	Delete this code, which is no longer needed. llvm-svn: 101033	2010-04-12 08:00:22 +00:00
Dan Gohman	07f6563e81	Move the EliminateIVUsers call back out to its original location. Now that a ScalarEvolution bug with overflow handling is fixed, the normal analysis code will automatically decline to operate on the icmp instructions which are responsible for the loop exit. llvm-svn: 101032	2010-04-12 07:56:56 +00:00
Dan Gohman	15f90c294c	Use RecursivelyDeleteTriviallyDeadInstructions in EliminateIVComparisons, instead of deleting just the user. This makes it more consistent with other code in IndVarSimplify, and theoretically can eliminate more users earlier. llvm-svn: 101027	2010-04-12 07:29:15 +00:00
Eric Christopher	1f272f7fd8	Verify function prototypes before trying to optimize functions. We also need TargetData, just return false if we don't have it. Update testcases accordingly. Fixes PR6807. llvm-svn: 101011	2010-04-12 04:48:00 +00:00
Dan Gohman	fa5ad797e3	Re-apply r101000, with a fix: Don't eliminate an icmp which is part of the loop exit test. This usually doesn't come up for a variety of reasons, but it isn't impossible, so make IndVarSimplify handle it conservatively. llvm-svn: 101008	2010-04-12 02:21:50 +00:00
Dan Gohman	c0f1efaf8d	Revert 101000, which is breaking self-host builds. llvm-svn: 101002	2010-04-12 00:17:10 +00:00
Dan Gohman	af4ab1b681	Teach IndVarSimplify how to eliminate comparisons involving induction variables. For example, with code like this: for (i=0;i<n;++i) if (i<n) x[i] = 0; IndVarSimplify will now recognize that i is always less than n inside the loop, and eliminate the if. llvm-svn: 101000	2010-04-11 23:10:12 +00:00
Dan Gohman	b50349a979	Rename isLoopGuardedByCond to isLoopEntryGuardedByCond, to emphasise that it's only testing for the entry condition, not full loop-invariant conditions. llvm-svn: 100979	2010-04-11 19:27:13 +00:00
Chris Lattner	4568ed7893	Implement support for varargs functions without any fixed parameters in the CBE by implicitly adding a fixed argument. This allows eliminating a work-around from DAE. Patch by Sylvere Teissier! llvm-svn: 100944	2010-04-10 19:12:44 +00:00
Chris Lattner	9ae28b141f	fix PR6743, a case where we'd delete an instruction before using it in some cases. llvm-svn: 100937	2010-04-10 18:26:57 +00:00
Chris Lattner	b9801ffcb5	fix PR6760, a missing check in heap SRoA. llvm-svn: 100936	2010-04-10 18:19:22 +00:00
Dan Gohman	607e02b33a	When determining a canonical insert position, don't climb deeper into adjacent loops. Also, ensure that the insert position is dominated by the loop latch of any loop in the post-inc set which has a latch. llvm-svn: 100906	2010-04-09 22:07:05 +00:00
Chris Lattner	74e2ef68b9	suck the propagating "has dynamic libs" check into a single makefile variable TARGET_HAS_DYNAMIC_LIBS llvm-svn: 100896	2010-04-09 20:51:47 +00:00
Chris Lattner	c86cdc7d47	add minix support, patch by Kees van Reeuwijk! PR6797 llvm-svn: 100895	2010-04-09 20:45:04 +00:00
Wesley Peck	a2ca3fa781	Adding IPSCCP and Internalize passes to the C-bindings llvm-svn: 100893	2010-04-09 20:43:20 +00:00
Dan Gohman	42ec4eb351	When looking for loop-invariant users, look through no-op instructions, so that an unfortunately placed bitcast doesn't pin a value in a register. llvm-svn: 100883	2010-04-09 19:12:34 +00:00
Gabor Greif	ef60190a00	performance: cache result of looking up user llvm-svn: 100862	2010-04-09 15:18:34 +00:00
Dan Gohman	0a8175d1db	Minor code simplification. llvm-svn: 100859	2010-04-09 14:53:59 +00:00
Gabor Greif	ce6dd889ec	const-ize a predicate llvm-svn: 100856	2010-04-09 10:57:00 +00:00
Dan Gohman	d2df643ddb	Refactor the code for computing the insertion point for an expression into a separate function. llvm-svn: 100845	2010-04-09 02:00:38 +00:00
Chris Lattner	c6c153be45	fix a SCCP miscompilation that could happen when a forced constant is changed to a constant, we would end up adding the instruction to the wrong worklist, preventing it from being properly revisited. This fixes rdar://7832370 llvm-svn: 100837	2010-04-09 01:14:31 +00:00
Dan Gohman	9b5d0bb774	Avoid allocating a value of zero in a register if the initial formula inputs happen to negate each other. llvm-svn: 100828	2010-04-08 23:36:27 +00:00
Dan Gohman	4ce1fb1448	Add variants of ult, ule, etc. which take a uint64_t RHS, for convenience. llvm-svn: 100824	2010-04-08 23:03:40 +00:00
Dan Gohman	4506539d84	When expanding expressions which are using post-inc mode for multiple loops, ensure that the expansion is dominated by the increments of those loops. llvm-svn: 100748	2010-04-08 05:57:57 +00:00
Dan Gohman	eb7111b98f	Say bitcast instead of bitconvert. llvm-svn: 100720	2010-04-07 23:22:42 +00:00
Eric Christopher	e8b281c3c3	Add support for stpncpy_chk. llvm-svn: 100710	2010-04-07 23:00:07 +00:00
Chris Lattner	2104b8d36e	rename llvm::llvm_report_error -> llvm::report_fatal_error llvm-svn: 100709	2010-04-07 22:58:41 +00:00
Dan Gohman	d006ab90dd	Generalize IVUsers to track arbitrary expressions rather than expressions explicitly split into stride-and-offset pairs. Also, add the ability to track multiple post-increment loops on the same expression. This refines the concept of "normalizing" SCEV expressions used for to post-increment uses, and introduces a dedicated utility routine for normalizing and denormalizing expressions. This fixes the expansion of expressions which are post-increment users of more than one loop at a time. More broadly, this takes LSR another step closer to being able to reason about more than one loop at a time. llvm-svn: 100699	2010-04-07 22:27:08 +00:00
Gabor Greif	08d85da6cc	fix 80-col violations llvm-svn: 100677	2010-04-07 18:59:26 +00:00
Gabor Greif	df323a51f5	performance: get rid of repeated dereferencing of use_iterator by caching its result llvm-svn: 100550	2010-04-06 19:32:30 +00:00
Gabor Greif	679728790b	make more two predicates constant llvm-svn: 100549	2010-04-06 19:24:18 +00:00
Gabor Greif	08355d6cda	performance: get rid of repeated dereferencing of use_iterator by caching its result llvm-svn: 100547	2010-04-06 19:14:05 +00:00
Gabor Greif	a21bc0fbd5	const-ize predicate ValueIsOnlyUsedLocallyOrStoredToOneGlobal llvm-svn: 100546	2010-04-06 18:58:22 +00:00
Gabor Greif	0439789023	use CallSite to access calls vs. invokes uniformly and remove assumptions about operand order llvm-svn: 100544	2010-04-06 18:45:08 +00:00
Chris Lattner	adca608281	fix a really nasty bug that Evan was tracking in SCCP. When resolving undefs in branches/switches, we have two cases: a branch on a literal undef or a branch on a symbolic value which is undef. If we have a literal undef, the code was correct: forcing it to a constant is the right thing to do. If we have a branch on a symbolic value that is undef, we should force the symbolic value to a constant, which then makes the successor block live. Forcing the condition of the branch to being a constant isn't safe if later paths become live and the value becomes overdefined. This is the case that 'forcedconstant' is designed to handle, so just use it. This fixes rdar://7765019 but there is no good testcase for this, the one I have is too insane to be useful in the future. llvm-svn: 100478	2010-04-05 22:14:48 +00:00
Chris Lattner	c832c1bf69	some code cleanups, use SwitchInst::findCaseValue, reduce indentation llvm-svn: 100468	2010-04-05 21:18:32 +00:00
Evan Cheng	ba930449a9	Code clean up. llvm-svn: 100467	2010-04-05 21:16:25 +00:00
Mon P Wang	c576ee9040	Reapply address space patch after fixing an issue in MemCopyOptimizer. Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset, e.g., llvm.memcpy.i32(i8, i8, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8, i8, i32, i32, i1) llvm-svn: 100304	2010-04-04 03:10:48 +00:00
Chris Lattner	ecb536313f	require that the branch being controlled by the IV exits the loop. With this information we can guarantee the iteration count of the loop is bounded by the compare. I think this xforms is finally safe now. llvm-svn: 100285	2010-04-03 07:21:39 +00:00
Chris Lattner	40060d33f6	add integer overflow check for the fp induction variable checker. Amusingly, we already had tests that we should have rejects because they would be miscompiled in the testsuite. The remaining issue with this is that we don't check that the branch causes us to exit the loop if it fails, so we don't actually know if we remain in bounds. llvm-svn: 100284	2010-04-03 07:18:48 +00:00
Chris Lattner	69913466cb	add a comment and fix some consistency issues, converting to a signed vs unsigned value depending on the sign of the constant fp means that we can't distinguish between a truly negative number and a positive number so large the 32nd bit is set. So, do don't this! llvm-svn: 100283	2010-04-03 06:41:49 +00:00
Chris Lattner	40ea690f39	fix PR6761, a miscompilation due to the fp->int IV conversion stuff. More bugs remain though. llvm-svn: 100282	2010-04-03 06:30:03 +00:00
Chris Lattner	42202868c3	just eliminate the uitofp checks. This code isn't doing the required validity checks in the first place, and supporting a condition large enough to require the 32'nd bit isn't worth it. llvm-svn: 100280	2010-04-03 06:25:21 +00:00
Chris Lattner	ca25b60f4e	rename PH -> PN to be consistent with WeakPN and the rest of llvm. llvm-svn: 100276	2010-04-03 06:17:08 +00:00
Chris Lattner	774858fc38	improve comment and drop a dead check. If PH had no uses, it would have been deleted by RecursivelyDeleteTriviallyDeadInstructions llvm-svn: 100275	2010-04-03 06:16:22 +00:00
Chris Lattner	915322bc4a	strength reduce a ridiculous use of APInt. llvm-svn: 100274	2010-04-03 06:13:12 +00:00
Chris Lattner	0b941347f9	rename stuff improve comment grammar. llvm-svn: 100273	2010-04-03 06:11:07 +00:00
Chris Lattner	d77bde5f94	simplify some code and resolve a fixme. llvm-svn: 100272	2010-04-03 06:06:59 +00:00
Chris Lattner	2ff33f91d5	There is no guarantee that the increment and the branch are in the same block. Insert the new increment in the correct location. Also, more cleanups. llvm-svn: 100271	2010-04-03 06:05:10 +00:00
Chris Lattner	c558b49f14	first half of a pass through IndVarSimplify::HandleFloatingPointIV, this cleans up a bunch of code and also fixes several crashes and miscompiles. More to come unfortunately, this optimization is quite broken. llvm-svn: 100270	2010-04-03 05:54:59 +00:00
Chris Lattner	2e23e5284c	don't internalize available_externally functions, they are really just declarations. This is related to PR6524 llvm-svn: 100269	2010-04-03 05:24:50 +00:00
Bob Wilson	f1aa4743d9	Revert all my SSAUpdater patches. The PHI placement algorithm is not correct (what was I thinking?) and there's also a problem with LCSSA. I'll try again later with fixes. --- Reverse-merging r100263 into '.': U lib/Transforms/Utils/SSAUpdater.cpp --- Reverse-merging r100177 into '.': G lib/Transforms/Utils/SSAUpdater.cpp --- Reverse-merging r100148 into '.': G lib/Transforms/Utils/SSAUpdater.cpp --- Reverse-merging r100147 into '.': U include/llvm/Transforms/Utils/SSAUpdater.h G lib/Transforms/Utils/SSAUpdater.cpp --- Reverse-merging r100131 into '.': G include/llvm/Transforms/Utils/SSAUpdater.h G lib/Transforms/Utils/SSAUpdater.cpp --- Reverse-merging r100130 into '.': G lib/Transforms/Utils/SSAUpdater.cpp --- Reverse-merging r100126 into '.': G include/llvm/Transforms/Utils/SSAUpdater.h G lib/Transforms/Utils/SSAUpdater.cpp --- Reverse-merging r100050 into '.': D test/Transforms/GVN/2010-03-31-RedundantPHIs.ll --- Reverse-merging r100047 into '.': G include/llvm/Transforms/Utils/SSAUpdater.h G lib/Transforms/Utils/SSAUpdater.cpp llvm-svn: 100264	2010-04-03 03:50:38 +00:00
Bob Wilson	25f1aefd5b	Add a DEBUG_TYPE for the SSAUpdater. llvm-svn: 100263	2010-04-03 03:28:44 +00:00
Evan Cheng	ed66db3f9b	Code refactoring. llvm-svn: 100262	2010-04-03 02:23:43 +00:00
Mon P Wang	999c1b927b	Revert r100191 since it breaks objc in clang llvm-svn: 100199	2010-04-02 18:43:02 +00:00
Mon P Wang	a972ab8564	Reapply address space patch after fixing an issue in MemCopyOptimizer. Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset, e.g., llvm.memcpy.i32(i8, i8, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8, i8, i32, i32, i1) llvm-svn: 100191	2010-04-02 18:04:15 +00:00
Dan Gohman	f7239102fe	Manually notify ScalarEvolution before making an operand replacement, since it can't currently observe such changes automatically. llvm-svn: 100186	2010-04-02 14:48:31 +00:00
Bob Wilson	3c54edf9b3	Recommit 100158 now that the buildbots are happy again. llvm-svn: 100177	2010-04-02 05:09:46 +00:00
Dan Gohman	4bd755419f	Revert the recent alignment changes. They're broken for -Os because, in particular, they end up aligning strings at 16-byte boundaries, and there's no way for GlobalOpt to check OptForSize. llvm-svn: 100172	2010-04-02 03:04:37 +00:00
Bob Wilson	0389adcd73	Revert 100158 in case it is causing some of the buildbot problems. llvm-svn: 100164	2010-04-02 01:22:49 +00:00
Dan Gohman	c671347fcb	Make globalopt refine global variable alignment. llvm-svn: 100160	2010-04-02 00:14:16 +00:00
Bob Wilson	9af4e118c6	Check for terminating conditions before adding PHIs to the worklists. This is more efficient than adding them to the worklist and then ignoring them. llvm-svn: 100158	2010-04-02 00:10:41 +00:00
Bob Wilson	737195069a	Remove trailing whitespace. llvm-svn: 100148	2010-04-01 23:06:38 +00:00
Bob Wilson	37b73d9d3e	Rewrite another SSAUpdater function to avoid recursion. llvm-svn: 100147	2010-04-01 23:05:58 +00:00
Bob Wilson	8409feadf0	Change another SSAUpdater function to avoid recursion. llvm-svn: 100131	2010-04-01 20:04:30 +00:00
Bob Wilson	043c0406f7	Simplify the code to check for existing PHIs, now that it is only used in one place. This removes the template function added in svn 94690. llvm-svn: 100130	2010-04-01 19:53:48 +00:00
Bob Wilson	38fc88ee5d	The SSAUpdater should avoid recursive traversals of the CFG, since that may blow out the stack for really big functions. Start by fixing an easy case. llvm-svn: 100126	2010-04-01 18:46:59 +00:00
Gabor Greif	5d5db5342b	Introduce ImmutableCallSite, useful for contexts where no mutation is necessary. Inherits from new templated baseclass CallSiteBase<> which is highly customizable. Base CallSite on it too, in a configuration that allows full mutation. Adapt some call sites in analyses to employ ImmutableCallSite. llvm-svn: 100100	2010-04-01 08:21:08 +00:00
Nick Lewycky	bfb50a0d43	Clean up this file a little, no functionality change. This is a subset of my patch back in r94322. llvm-svn: 100097	2010-04-01 07:34:00 +00:00
Bob Wilson	ac229124f4	Rewrite part of the SSAUpdater to be more careful about inserting redundant PHIs. The previous algorithm was unable to reliably detect when existing PHIs in a cycle can be reused. I'm still working on reducing a testcase. Radar 7711900. llvm-svn: 100047	2010-03-31 20:51:00 +00:00
Dale Johannesen	b67a6e6620	Fix a nasty dangling-pointer heisenbug that could generate wrong code pretty much anywhere AFAICT. A case that hits the bug reproducibly is impossible, but the situation was like this: Addr = ... Store -> Addr Addr2 = GEP , 0, 0 Store -> Addr2 Handling the first store, the code changed replaced Addr with a sunkaddr and deleted Addr, but not its table entry. Code in OptimizedBlock replaced Addr2 with a bitcast; if that happened to reuse the memory of Addr, the old table entry was erroneously found when handling the second store. llvm-svn: 100044	2010-03-31 20:37:15 +00:00
Bob Wilson	6f7fd28824	Revert Mon Ping's change 99928, since it broke all the llvm-gcc buildbots. llvm-svn: 99948	2010-03-30 22:27:04 +00:00
Mon P Wang	7460571381	Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset, e.g., llvm.memcpy.i32(i8, i8, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8, i8, i32, i32, i1) A update of langref will occur in a subsequent checkin. llvm-svn: 99928	2010-03-30 20:55:56 +00:00
Dan Gohman	39027c403c	Fix a grammaro. llvm-svn: 99917	2010-03-30 20:04:57 +00:00
Gabor Greif	b469818279	fix two cases where the arguments were extracted from the wrong range out of the InvokeInst spotted by baldrick -- thanks\! llvm-svn: 99914	2010-03-30 19:20:53 +00:00
Jeffrey Yasskin	12fd516e51	Remove another memory leak from ABCD by using Edges by value instead of pointer. There was also a SmallPtrSet whose settiness wasn't being used, so I changed it to a SmallVector. llvm-svn: 99713	2010-03-27 09:09:17 +00:00
Jeffrey Yasskin	97e613b6da	In ABCD, change the non-null Bound*s to Bound&s. llvm-svn: 99711	2010-03-27 08:15:46 +00:00
Jeffrey Yasskin	33bc7e4cb5	Fix a memory leak in ABCD by giving ownership of Bound objects to the MemoizedResultChart. llvm-svn: 99710	2010-03-27 08:09:24 +00:00
Eric Christopher	81c03447fc	When we promote a load of an argument make sure to take the alignment of the previous load - it's usually important. For example, we don't want to blindly turn an unaligned load into an aligned one. llvm-svn: 99699	2010-03-27 01:54:00 +00:00
Dan Gohman	d42e09d91e	Ignore debug intrinsics in yet more places. llvm-svn: 99580	2010-03-26 00:33:27 +00:00
Gabor Greif	6c6b2fd2b2	rename pred_const_iterator to const_pred_iterator for consistency's sake llvm-svn: 99567	2010-03-25 23:25:28 +00:00
Gabor Greif	c78d720f02	rename use_const_iterator to const_use_iterator for consistency's sake llvm-svn: 99564	2010-03-25 23:06:16 +00:00
Chris Lattner	0563804982	fix PR6642, GVN forwarding from memset to load of the base of the memset. llvm-svn: 99488	2010-03-25 05:58:19 +00:00
Eric Christopher	1d38538fb6	Temporarily revert this, it's causing an issue with an internal project. llvm-svn: 99451	2010-03-24 23:35:21 +00:00
Evan Cheng	c12c2d9bb4	Move OptChkCall off LibCallOptimization into StrCpyOpt. llvm-svn: 99418	2010-03-24 20:19:04 +00:00
Gabor Greif	a2fbc0ae1b	Finally land the InvokeInst operand reordering. I have audited all getOperandNo calls now, fixing hidden assumptions. CallSite related uglyness will be eliminated successively. Note this patch has a long and griveous history, for all the back-and-forths have a look at CallSite.h's log. llvm-svn: 99399	2010-03-24 13:21:49 +00:00
Gabor Greif	be18ae6781	tighten a type and remove trailing whitespace, no functional changes llvm-svn: 99398	2010-03-24 11:58:07 +00:00
Gabor Greif	9027ffb918	increase const goodness and remove pointless getUser() calls llvm-svn: 99395	2010-03-24 10:29:52 +00:00
Gabor Greif	11ff53146f	cache result of UI.getOperandNo() instead of calling it twice, it is cheaper this way llvm-svn: 99394	2010-03-24 10:12:54 +00:00
Chris Lattner	00eeac4179	add some accessors to callsite/callinst/invokeinst to check for the noinline attribute, and make the inliner refuse to inline a call site when the call site is marked noinline even if the callee isn't. This fixes PR6682. llvm-svn: 99341	2010-03-23 22:59:07 +00:00
Bill Wendling	04803e8ef6	Skip debugging intrinsics when sinking unused invariants. llvm-svn: 99324	2010-03-23 21:15:59 +00:00
Evan Cheng	d9e822345c	Teach simplify libcall to transform __strcpy_chk to __memcpy_chk to enable optimizations down stream. llvm-svn: 99282	2010-03-23 15:48:04 +00:00
Gabor Greif	161cb044f3	add assert in argpromotion, which cannot trigger if Function::hasAddressTaken works as advertised also included some cosmetic cleanups llvm-svn: 99276	2010-03-23 14:40:20 +00:00
Evan Cheng	3f7842232e	Fix an incorrect logic causing instcombine to miss some _chk -> non-chk transformations. llvm-svn: 99263	2010-03-23 06:06:09 +00:00
Evan Cheng	9a7b270825	Fix 80 col violation. llvm-svn: 99224	2010-03-22 22:44:31 +00:00
Gabor Greif	e1517a084f	backing out r99170 because it still fails on clang-x86_64-darwin10-fnt llvm-svn: 99171	2010-03-22 09:11:00 +00:00
Gabor Greif	7a743e15e3	Now that hopefully all direct accesses to InvokeInst operands are fixed we can reapply the InvokeInst operand reordering patch. (see r98957). llvm-svn: 99170	2010-03-22 08:28:00 +00:00
Gabor Greif	febf6ab718	Add a setCalledFunction member to InvokeInst (like in CallInst) and use this (as well as getCalledValue) to access the callee, instead of {g\|s}etOperand(0). llvm-svn: 99084	2010-03-20 21:00:25 +00:00
Dan Gohman	1a2abe5580	Clear the SCEVExpander's insertion point after making deletions, so that the SCEVExpander doesn't retain a dangling pointer as its insert position. The dangling pointer in this case wasn't ever used to insert new instructions, but it was causing trouble with SCEVExpander's code for automatically advancing its insert position past debug intrinsics. This fixes use-after-free errors that valgrind noticed in test/Transforms/IndVarSimplify/2007-06-06-DeleteDanglesPtr.ll and test/Transforms/IndVarSimplify/exit_value_tests.ll. llvm-svn: 99036	2010-03-20 03:53:53 +00:00
Gabor Greif	6c56ed847e	back out r98957, it broke http://smooshlab.apple.com:8010/builders/clang-x86_64-darwin10-fnt/builds/703 in the nightly test suite llvm-svn: 98958	2010-03-19 13:50:02 +00:00
Gabor Greif	8335f9c0bf	Recommit r80858 again (which has been backed out in r80871). This time I did a self-hosted bootstrap on Linux x86-64, with no problems. Let's see how darwin 64-bit self-hosting goes. At the first sign of failure I'll back this out. Maybe the valgrind bots give me a hint of what may be wrong (it at all). llvm-svn: 98957	2010-03-19 11:55:53 +00:00
Benjamin Kramer	f2e4b5dd7f	str[r]chr returns its pointer argument so we cannot mark it as nocapture. Thanks to Duncan for spotting my mistake. llvm-svn: 98671	2010-03-16 20:33:15 +00:00
Benjamin Kramer	5cf5fd2ffa	Mark str[r]chr readonly. llvm-svn: 98663	2010-03-16 19:36:43 +00:00
Devang Patel	45c1505bf6	Skip debug info intrinsics. llvm-svn: 98584	2010-03-15 22:23:03 +00:00
Devang Patel	b21991c4f5	Skip debug info intrinsics. llvm-svn: 98581	2010-03-15 21:25:29 +00:00
Devang Patel	d3f41e8939	In "empty" bb, the return instruction may not be first instruction, if dbg value intrinsics are present in this bb. Use terminator to find return instructions. llvm-svn: 98565	2010-03-15 19:05:46 +00:00
Bill Wendling	55e69d179b	Skip over debug info when trying to merge two return BBs. llvm-svn: 98491	2010-03-14 10:40:55 +00:00
Bill Wendling	ee84f27536	Make returns more consistent with others. llvm-svn: 98490	2010-03-14 10:40:28 +00:00
Benjamin Kramer	a956527c92	Add a virtual destructor and give vtable a home. llvm-svn: 98376	2010-03-12 20:41:29 +00:00
Benjamin Kramer	7b88a49f3e	Factor checked library call optimization into a common helper class and use it to unify the almost identical code in CodeGenPrepare and InstCombineCalls. llvm-svn: 98338	2010-03-12 09:27:41 +00:00
Nate Begeman	2e41605d4f	Whoops this already existed. llvm-svn: 98297	2010-03-11 23:21:19 +00:00
Nate Begeman	5daa235c91	Add a handful of additional useful pass manager things to the C API llvm-svn: 98296	2010-03-11 23:06:07 +00:00
Benjamin Kramer	2fc395659c	stpcpy is so similar to strcpy, it doesn't deserve a complete copy of the __strcpy_chk -> strcpy code. llvm-svn: 98284	2010-03-11 20:45:13 +00:00
Eric Christopher	607de1de53	Lower stpcpy_chk when possible. llvm-svn: 98274	2010-03-11 19:24:34 +00:00
Eric Christopher	103e3ef893	Fix typo. llvm-svn: 98260	2010-03-11 17:45:38 +00:00
Eric Christopher	4b7948e09e	Do some final lowering in CodeGenPrepare of _chk calls similar to that in InstCombineCalls. More call lowering needed. llvm-svn: 98228	2010-03-11 02:41:03 +00:00
Eric Christopher	43dc11c525	Add strncpy libcall creator. Use it when it should be used. llvm-svn: 98219	2010-03-11 01:25:07 +00:00
Dan Gohman	2734ebd37f	Add a DominatorTree argument to isLCSSA so that it doesn't have to compute a set of reachable blocks for itself each time it is called, which is fairly frequently. llvm-svn: 98179	2010-03-10 19:38:49 +00:00
Dan Gohman	b7e0b87441	Fix a comment. llvm-svn: 98122	2010-03-10 02:18:48 +00:00
Jakob Stoklund Olesen	b495cad7ca	Try to keep the cached inliner costs around for a bit longer for big functions. The Caller cost info would be reset everytime a callee was inlined. If the caller has lots of calls and there is some mutual recursion going on, the caller cost info could be calculated many times. This patch reduces inliner runtime from 240s to 0.5s for a function with 20000 small function calls. This is a more conservative version of r98089 that doesn't break the clang test CodeGenCXX/temp-order.cpp. That test relies on rather extreme inlining for constant folding. llvm-svn: 98099	2010-03-09 23:02:17 +00:00
Jakob Stoklund Olesen	4497475905	Revert r98089, it was breaking a clang test. llvm-svn: 98094	2010-03-09 22:43:37 +00:00
Jakob Stoklund Olesen	741dec43e4	Try to keep the cached inliner costs around for a bit longer for big functions. The Caller cost info would be reset everytime a callee was inlined. If the caller has lots of calls and there is some mutual recursion going on, the caller cost info could be calculated many times. This patch reduces inliner runtime from 240s to 0.5s for a function with 20000 small function calls. llvm-svn: 98089	2010-03-09 22:17:11 +00:00
Jakob Stoklund Olesen	d62c2f554c	Add inlining threshold to log output. llvm-svn: 98024	2010-03-09 00:59:53 +00:00
Evan Cheng	4f2fd2d2be	Re-commit 97860 with fix. getMallocAllocatedType may return null. llvm-svn: 98000	2010-03-08 22:54:36 +00:00
Devang Patel	3b548aa8e2	Avoid using DIDescriptor.isNull(). This is a first step towards eliminating checks in Descriptor constructors. llvm-svn: 97975	2010-03-08 20:52:55 +00:00
Devang Patel	bc97f6b757	Revert r97947. llvm-svn: 97963	2010-03-08 19:20:38 +00:00
Devang Patel	fe28599f6f	Avoid using DIDescriptor.isNull(). This is a first step towards eliminating unncessary constructor checks in light weight DIDescriptor wrappers. llvm-svn: 97947	2010-03-08 18:25:48 +00:00
Eric Christopher	1810d77cb4	Let the fallthrough handle whether or not we've changed anything before we try to optimize. llvm-svn: 97876	2010-03-06 10:59:25 +00:00
Eric Christopher	a7fb58f5f5	Migrate _chk call lowering from SimplifyLibCalls to InstCombine. Stub out the remainder of the calls that we should lower in some way and move the tests to the new correct directory. Fix up tests that are now optimized more than they were before by -instcombine. llvm-svn: 97875	2010-03-06 10:50:38 +00:00
Eric Christopher	d8b43d0e59	Temporarily revert: Log: Transform @llvm.objectsize to integer if the argument is a result of malloc of known size. Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp llvm/trunk/test/Transforms/InstCombine/objsize.ll It appears to be causing swb and nightly test failures. llvm-svn: 97866	2010-03-06 03:11:35 +00:00
Evan Cheng	afdc7d3aab	Transform @llvm.objectsize to integer if the argument is a result of malloc of known size. llvm-svn: 97860	2010-03-06 01:01:42 +00:00
Ted Kremenek	65bb311629	Update CMake build. llvm-svn: 97846	2010-03-05 22:34:16 +00:00
Eric Christopher	87abfc506f	Move SimplifyLibCalls's LibCall builders to a separate file so they can be used in more places. Add an argument for the TargetData that most of them need. Update for the getInt8PtrTy() change. Should be no functionality change. llvm-svn: 97844	2010-03-05 22:25:30 +00:00
Evan Cheng	d214ed0e75	Safely turn memset_chk etc. to non-chk variant if the known object size is >= memset / memcpy / memmove size. llvm-svn: 97828	2010-03-05 20:59:47 +00:00
Evan Cheng	fffdad58ac	Instcombine should turn llvm.objectsize of a alloca with static size to an integer. llvm-svn: 97827	2010-03-05 20:47:23 +00:00
Chris Lattner	f6befffbb2	fix PR6512, a case where instcombine would incorrectly merge loads from different addr spaces. llvm-svn: 97813	2010-03-05 18:53:28 +00:00
Chris Lattner	067459c62b	Fix PR6503. This turned into a much more interesting and nasty bug. Various parts of the cmp\|cmp and cmp&cmp folding logic wasn't prepared for vectors (unrelated to the bug but noticed while in the code) and the code was definitely not safe to use by the (cast icmp)\|(cast icmp) handling logic that I added in r95855. Fix all this up by changing the various routines to more consistently use IRBuilder and not pass in the I which had the wrong type. llvm-svn: 97801	2010-03-05 08:46:26 +00:00
Chris Lattner	343d2e48b2	simplify some functions and make them work with vector compares, noticed by inspection. llvm-svn: 97795	2010-03-05 07:47:57 +00:00
Chris Lattner	c6c1523f59	fix a nice subtle reassociate bug which would only occur in a very specific use pattern embodied in the carefully reduced testcase. llvm-svn: 97794	2010-03-05 07:18:54 +00:00
Eric Christopher	4899cbc77d	Move GetStringLength and helper from SimplifyLibCalls to ValueTracking. No functionality change. llvm-svn: 97793	2010-03-05 06:58:57 +00:00
Evan Cheng	43d6ff7701	Add missing break for Intrinsic::objectsize case. It was falling through to the following Intrinsic::bswap code. I have no idea why it wasn't breaking stuff. llvm-svn: 97774	2010-03-05 01:22:47 +00:00
Dan Gohman	29707de4fe	Make SCEVExpander and LSR more aggressive about hoisting expressions out of loops. llvm-svn: 97642	2010-03-03 05:29:13 +00:00
Bill Wendling	af13d82945	This test case: long test(long x) { return (x & 123124) \| 3; } Currently compiles to: _test: orl $3, %edi movq %rdi, %rax andq $123127, %rax ret This is because instruction and DAG combiners canonicalize (or (and x, C), D) -> (and (or, D), (C \| D)) However, this is only profitable if (C & D) != 0. It gets in the way of the 3-addressification because the input bits are known to be zero. llvm-svn: 97616	2010-03-03 00:35:56 +00:00
Dan Gohman	52f5563973	Non-affine post-inc SCEV expansions have more code which must be emitted after the increment. Make sure the insert position reflects this. This fixes PR6453. llvm-svn: 97537	2010-03-02 01:59:21 +00:00
Dan Gohman	6f34abd092	Floating-point add, sub, and mul are now spelled fadd, fsub, and fmul, respectively. llvm-svn: 97531	2010-03-02 01:11:08 +00:00
Bob Wilson	0fd415820b	Don't attempt load PRE when there is no real redundancy (i.e., the load is in a loop and is itself the only dependency). llvm-svn: 97526	2010-03-02 00:09:29 +00:00
Bob Wilson	892432b7ef	When GVN needs to split critical edges for load PRE, check all of the predecessors before returning. Otherwise, if multiple predecessor edges need splitting, we only get one of them per iteration. This makes a small but measurable compile time improvement with -enable-full-load-pre. llvm-svn: 97521	2010-03-01 23:37:32 +00:00
Evan Cheng	7263cf8431	MemoryDepAnalysis is not used if redundant load processing is disabled. llvm-svn: 97512	2010-03-01 22:23:12 +00:00
Dan Gohman	39917c7c81	Add some debug output to LoopSimplify. llvm-svn: 97458	2010-03-01 17:55:27 +00:00
Dan Gohman	8b0a419eb1	Spelling fixes. llvm-svn: 97453	2010-03-01 17:49:51 +00:00
Dan Gohman	0c39a35457	Prune #includes. llvm-svn: 97448	2010-03-01 17:42:17 +00:00
Bob Wilson	1136166ee9	Revert r97245 which seems to be causing performance problems. llvm-svn: 97366	2010-02-28 05:34:05 +00:00
Chris Lattner	2af7e3dceb	fix grammaro's pointed out by daniel llvm-svn: 97313	2010-02-27 07:50:40 +00:00
Chris Lattner	d887f1da73	fix PR6414, a nondeterminism issue in IPSCCP which was because of a subtle interation in a loop operating in densemap order. llvm-svn: 97288	2010-02-27 00:07:42 +00:00
Chris Lattner	65d3a0a5f8	Fix rdar://7694996 a miscompile of 183.equake from my patch yesterday, confusing the old MAT variable with the new GlobalType one. This caused us to promote the @disp global pointer into: @disp.body = internal global double* undef instead of: @disp.body = internal global [3 x double] undef llvm-svn: 97285	2010-02-26 23:42:13 +00:00
Chris Lattner	da5fcdace0	remove dead code, by this point all uses of CI are gone. llvm-svn: 97283	2010-02-26 23:35:25 +00:00
Bob Wilson	ed1b0c31a7	Move the EnableFullLoadPRE flag from a separate command-line option to an argument of createGVNPass and set it automatically for -O3. llvm-svn: 97245	2010-02-26 19:09:47 +00:00
Bob Wilson	d4655991c3	Remove unused "NoPRE" parameter in GVN and createGVNPass(). llvm-svn: 97235	2010-02-26 18:35:19 +00:00
Chris Lattner	0521c09d97	fix PR6435 another bug from the MallocInst elimination work. llvm-svn: 97231	2010-02-26 18:23:13 +00:00
Chris Lattner	7939f795f5	rewrite OptimizeGlobalAddressOfMalloc to fix PR6422, some bugs introduced when mallocinst was eliminated. llvm-svn: 97178	2010-02-25 22:33:52 +00:00
Dan Gohman	a9c205cc88	Make LoopSimplify change conditional branches in loop exiting blocks which branch on undef to branch on a boolean constant for the edge exiting the loop. This helps ScalarEvolution compute trip counts for loops. Teach ScalarEvolution to recognize single-value PHIs, when safe, and ForgetSymbolicName to forget such single-value PHI nodes as apprpriate in ForgetSymbolicName. llvm-svn: 97126	2010-02-25 06:57:05 +00:00
Nick Lewycky	614fb949b9	Modernize comment. llvm-svn: 97121	2010-02-25 06:39:10 +00:00
Nick Lewycky	dc835c4361	Correct whitespace. llvm-svn: 97120	2010-02-25 06:38:51 +00:00
Daniel Dunbar	693ea89214	Reapply r97010, the speculative revert failed. llvm-svn: 97036	2010-02-24 08:48:04 +00:00
Daniel Dunbar	0a2031e5b6	Speculatively revert r97010, "Add an argument to PHITranslateValue to specify the DominatorTree. ...", in hopes of restoring poor old PPC bootstrap. llvm-svn: 97027	2010-02-24 06:55:22 +00:00
Dan Gohman	94732024eb	Fix indentation. llvm-svn: 97024	2010-02-24 06:46:09 +00:00
Bob Wilson	66e58ac742	Add an argument to PHITranslateValue to specify the DominatorTree. If this argument is non-null, pass it along to PHITranslateSubExpr so that it can prefer using existing values that dominate the PredBB, instead of just blindly picking the first equivalent value that it finds on a uselist. Also when the DominatorTree is specified, have PHITranslateValue filter out any result that does not dominate the PredBB. This is basically just refactoring the check that used to be in GetAvailablePHITranslatedSubExpr and also in GVN. Despite my initial expectations, this change does not affect the results of GVN for any testcases that I could find, but it should help compile time. Before this change, if PHITranslateSubExpr picked a value that does not dominate, PHITranslateWithInsertion would then insert a new value, which GVN would later determine to be redundant and would replace. By picking a good value to begin with, we save GVN the extra work of inserting and then replacing a new value. llvm-svn: 97010	2010-02-24 01:39:00 +00:00
Dan Gohman	cd4c03e886	Don't do (X != Y) ? X : Y -> X for floating-point values; it doesn't handle NaN properly. Do (X une Y) ? X : Y -> X if one of X and Y is not zero. llvm-svn: 96955	2010-02-23 17:17:57 +00:00
Bob Wilson	923261bbe9	Update memdep when load PRE inserts a new load, and add some debug output. I don't have a small testcase for this. llvm-svn: 96890	2010-02-23 05:55:00 +00:00
Evan Cheng	3688b8fa68	Instcombine constant folding can normalize gep with negative index to index with large offset. When instcombine objsize checking transformation sees these geps where the offset seemingly point out of bound, it should just return "i don't know" rather than asserting. llvm-svn: 96825	2010-02-22 23:34:00 +00:00
Bob Wilson	1da9041913	Erase deleted instructions from GVN's ValueTable. This fixes assertion failures from ValueTable::verifyRemoved() when using -debug. llvm-svn: 96805	2010-02-22 21:39:41 +00:00
Dan Gohman	8c16b38262	Remove unused variables and parameters. llvm-svn: 96780	2010-02-22 04:11:59 +00:00
Dan Gohman	4506fcb3c2	When emitting an instruction which depends on both a post-incremented induction variable value and a loop-variant value, don't force the insert position to be at the post-increment position, because it may not be dominated by the loop-variant value. This fixes a use-before-def problem noticed on PPC. llvm-svn: 96774	2010-02-22 03:59:54 +00:00
Dan Gohman	740909be2d	This cast<Instruction> is unnecessary. llvm-svn: 96771	2010-02-22 02:07:36 +00:00
Dan Gohman	4eebb94094	Rename getSDiv to getExactSDiv to reflect its behavior in cases where the division would have a remainder. llvm-svn: 96693	2010-02-19 19:35:48 +00:00
Dan Gohman	85af256779	Check for overflow when scaling up an add or an addrec for scaled reuse. llvm-svn: 96692	2010-02-19 19:32:49 +00:00
Dale Johannesen	1d6827adef	recommit 96626, evidence that it broke things appears to be spurious llvm-svn: 96662	2010-02-19 07:14:22 +00:00
Dale Johannesen	1f790c28d0	Revert 96626, which causes build failure on ppc Darwin. llvm-svn: 96653	2010-02-19 01:54:37 +00:00
Dan Gohman	2446f57503	When determining the set of interesting reuse factors, consider strides in foreign loops. This helps locate reuse opportunities with existing induction variables in foreign loops and reduces the need for inserting new ones. This fixes rdar://7657764. llvm-svn: 96629	2010-02-19 00:05:23 +00:00
Dan Gohman	60b3326435	Indvars needs to explicitly notify ScalarEvolution when it is replacing a loop exit value, so that if a loop gets deleted, ScalarEvolution isn't stick holding on to dangling SCEVAddRecExprs for that loop. This fixes PR6339. llvm-svn: 96626	2010-02-18 23:26:33 +00:00
Dan Gohman	c43d264cc0	Hoist this loop-invariant logic out of the loop. llvm-svn: 96614	2010-02-18 21:34:02 +00:00
Dan Gohman	13ac3b2139	Delete some unneeded casts. llvm-svn: 96429	2010-02-17 00:42:19 +00:00
Dan Gohman	5f10d6c52c	Don't attempt to divide INT_MIN by -1; consider such cases to have overflowed. llvm-svn: 96428	2010-02-17 00:41:53 +00:00
Bob Wilson	aff96b2132	Rename SuccessorNumber to GetSuccessorNumber. llvm-svn: 96387	2010-02-16 21:06:42 +00:00
Dan Gohman	6deab96c81	Refactor rewriting for PHI nodes into a separate function. llvm-svn: 96382	2010-02-16 20:25:07 +00:00
Bob Wilson	92cdb6eec5	Split critical edges as needed for load PRE. llvm-svn: 96378	2010-02-16 19:51:59 +00:00
Bob Wilson	3de492ec35	Refactor to share code to find the position of a basic block successor in the terminator's list of successors. llvm-svn: 96377	2010-02-16 19:49:17 +00:00
Dan Gohman	0849ed5e26	Fix whitespace. llvm-svn: 96372	2010-02-16 19:42:34 +00:00
Duncan Sands	19d0b47b1f	There are two ways of checking for a given type, for example isa<PointerType>(T) and T->isPointerTy(). Convert most instances of the first form to the second form. Requested by Chris. llvm-svn: 96344	2010-02-16 11:11:14 +00:00
Dan Gohman	521efe68ab	Split the main for-each-use loop again, this time for GenerateTruncates, as it also peeks at which registers are being used by other uses. This makes LSR less sensitive to use-list order. llvm-svn: 96308	2010-02-16 01:42:53 +00:00
Chris Lattner	6fbfe5897c	fix PR6305 by handling BlockAddress in a helper function called by jump threading. llvm-svn: 96263	2010-02-15 20:47:49 +00:00
Duncan Sands	9dff9bec31	Uniformize the names of type predicates: rather than having isFloatTy and isInteger, we now have isFloatTy and isIntegerTy. Requested by Chris! llvm-svn: 96223	2010-02-15 16:12:20 +00:00
Dan Gohman	e4e51a63da	Fix whitespace. llvm-svn: 96179	2010-02-14 18:51:39 +00:00
Dan Gohman	e7f74bb16c	Fix a comment. llvm-svn: 96178	2010-02-14 18:51:20 +00:00
Dan Gohman	bb7d52213c	When complicated expressions are broken down into subexpressions with multiplication by constants distributed through, occasionally those subexpressions can include both x and -x. For now, if this condition is discovered within LSR, just prune such cases away, as they won't be profitable. This fixes a "zero allocated in a base register" assertion failure. llvm-svn: 96177	2010-02-14 18:50:49 +00:00
Dan Gohman	2d0f96d49a	Actually, this code doesn't have to be quite so conservative in the no-TLI case. But it should still default to declining the transformation. llvm-svn: 96152	2010-02-14 03:21:49 +00:00
Dan Gohman	cb76a806f0	Don't attempt aggressive post-inc uses if TargetLowering is not available, because profitability can't be sufficiently approximated. llvm-svn: 96148	2010-02-14 02:45:21 +00:00
John McCall	0daaf13b97	Make LSR not crash if invoked without target lowering info, e.g. if invoked from opt. llvm-svn: 96135	2010-02-13 23:40:16 +00:00
Eric Christopher	843a4cc43c	Fix a problem where we had bitcasted operands that gave us odd offsets since the bitcasted pointer size and the offset pointer size are going to be different types for the GEP vs base object. llvm-svn: 96134	2010-02-13 23:38:01 +00:00
Chris Lattner	b8639bc2d1	remove dead code. llvm-svn: 96109	2010-02-13 19:07:06 +00:00
Chris Lattner	42c66b7270	Split some code out to a helper function (FindReusablePredBB) and add a doxygen comment. Cache the phi entry to avoid doing tons of PHINode::getBasicBlockIndex calls in the common case. On my insane testcase from re2c, this speeds up CGP from 617.4s to 7.9s (78x). llvm-svn: 96083	2010-02-13 05:35:08 +00:00
Chris Lattner	5e7f705934	Speed up codegen prepare from 3.58s to 0.488s. llvm-svn: 96081	2010-02-13 05:01:14 +00:00
Chris Lattner	72c4dce884	PHINode::getBasicBlockIndex is O(n) in the number of inputs to a PHI, avoid it in the common case where the BB occurs in the same index for multiple phis. This speeds up CGP on an insane testcase from 8.35 to 3.58s. llvm-svn: 96080	2010-02-13 04:24:19 +00:00
Chris Lattner	b0ebb65ab0	iterate over preds using PHI information when available instead of using pred_begin/end. It is much faster. llvm-svn: 96079	2010-02-13 04:15:26 +00:00
Chris Lattner	96b8826542	speed up CGP a bit by scanning predecessors through phi operands instead of with pred_begin/end. llvm-svn: 96078	2010-02-13 04:04:42 +00:00
Dan Gohman	5b18f039eb	Fix a pruning heuristic which implicitly assumed that SmallPtrSet is deterministically sorted. llvm-svn: 96071	2010-02-13 02:06:02 +00:00
Jakob Stoklund Olesen	492b8b42cd	Enable the inlinehint attribute in the Inliner. Functions explicitly marked inline will get an inlining threshold slightly more aggressive than the default for -O3. This means than -O3 builds are mostly unaffected while -Os builds will be a bit bigger and faster. The difference depends entirely on how many 'inline's are sprinkled on the source. In the CINT2006 suite, only these tests are significantly affected under -Os: Size Time 471.omnetpp +1.63% -1.85% 473.astar +4.01% -6.02% 483.xalancbmk +4.60% 0.00% Note that 483.xalancbmk runs too quickly to give useful timing results. llvm-svn: 96066	2010-02-13 01:51:53 +00:00
Dan Gohman	2b75de97c0	Reapply 95979, a compile-time speedup, now that the bug it exposed is fixed. llvm-svn: 96005	2010-02-12 19:35:25 +00:00
Dan Gohman	363f847ec6	Fix this code to avoid dereferencing an end() iterator in offset distributions it doesn't expect. llvm-svn: 96002	2010-02-12 19:20:37 +00:00
Chris Lattner	75879be9d8	1. modernize the constantmerge pass, using densemap/smallvector. 2. don't bother trying to merge globals in non-default sections, doing so is quite dubious at best anyway. 3. fix a bug reported by Arnaud de Grandmaison where we'd try to merge two globals in different address spaces. llvm-svn: 95995	2010-02-12 18:17:23 +00:00
Daniel Dunbar	e0b2c69d3c	Revert "Reverse the order for collecting the parts of an addrec. The order", it is breaking llvm-gcc bootstrap. llvm-svn: 95988	2010-02-12 17:27:08 +00:00
Dan Gohman	0194f58047	Reverse the order for collecting the parts of an addrec. The order doesn't matter, except that ScalarEvolution tends to need less time to fold the results this way. llvm-svn: 95979	2010-02-12 11:08:26 +00:00
Dan Gohman	45774ce0ad	Reapply the new LoopStrengthReduction code, with compile time and bug fixes, and with improved heuristics for analyzing foreign-loop addrecs. This change also flattens IVUsers, eliminating the stride-oriented groupings, which makes it easier to work with. llvm-svn: 95975	2010-02-12 10:34:29 +00:00
Eric Christopher	cccdc13662	Make sure that ConstantExpr offsets also aren't off of extern symbols. Thanks to Duncan Sands for the testcase! llvm-svn: 95877	2010-02-11 17:44:04 +00:00
Chris Lattner	4e8137d678	Rename ValueRequiresCast to ShouldOptimizeCast, to better reflect what it does. Enhance it to return false to optimizing vector sign extensions from vector comparisions, which is the idiom used to get a splatted vector for a vector comparison. Doing this breaks vector-casts.ll, add some compensating transformations to handle the important case they cover without depending on this canonicalization. This fixes rdar://7434900 a serious pessimization of vector compares. llvm-svn: 95855	2010-02-11 06:26:33 +00:00
Chris Lattner	c053cbbc4d	Make DSE only scan blocks that are reachable from the entry block. Other blocks may have pointer cycles that will crash basicaa and other alias analyses. In any case, there is no point wasting cycles optimizing dead blocks. This fixes rdar://7635088 llvm-svn: 95852	2010-02-11 05:11:54 +00:00
Chris Lattner	d924f63692	Make jump threading honor x\|undef -> true and x&undef -> false, instead of considering x\|undef -> x, which may not be true. llvm-svn: 95850	2010-02-11 04:40:44 +00:00
Eric Christopher	531ea566a6	Add ConstantExpr handling to Intrinsic::objectsize lowering. Update testcase accordingly now that we can optimize another section. llvm-svn: 95846	2010-02-11 01:48:54 +00:00
Devang Patel	03936a1880	Ignore dbg info intrinsics. llvm-svn: 95828	2010-02-11 00:20:49 +00:00
Devang Patel	211746a69a	Strip new llvm.dbg.value intrinsic. llvm-svn: 95807	2010-02-10 21:19:56 +00:00
Dan Gohman	4a618827de	Fix "the the" and similar typos. llvm-svn: 95781	2010-02-10 16:03:48 +00:00
Eric Christopher	7b7028fd24	Move Intrinsic::objectsize lowering back to InstCombineCalls and enable constant 0 offset lowering. llvm-svn: 95691	2010-02-09 21:24:27 +00:00
Eric Christopher	ad1aa86276	Pull these back out, they're a little too aggressive and time consuming for a simple optimization. llvm-svn: 95671	2010-02-09 17:29:18 +00:00
Chris Lattner	f4c8d3cea9	simplify this code, duh. llvm-svn: 95643	2010-02-09 01:14:06 +00:00
Chris Lattner	9b6a1789e5	fix PR6193, only considering sign extensions from i1 for this xform. llvm-svn: 95642	2010-02-09 01:12:41 +00:00
Eric Christopher	be2f0b2b7b	Add file in here too. llvm-svn: 95641	2010-02-09 01:11:03 +00:00
Eric Christopher	9f85e7eb16	Add a new pass to do llvm.objsize lowering using SCEV. Initial skeleton and SCEVUnknown lowering implemented, the rest should come relatively quickly. Move testcase to new directory. Move pass to right before SimplifyLibCalls - which is moved down a bit so we can take advantage of a few opts. llvm-svn: 95628	2010-02-09 00:35:38 +00:00
Chris Lattner	b22423c89a	fix some problems handling large vectors reported in PR6230 llvm-svn: 95616	2010-02-08 23:56:03 +00:00
Jakob Stoklund Olesen	74bb06c0f0	Reintroduce the InlineHint function attribute. This time it's for real! I am going to hook this up in the frontends as well. The inliner has some experimental heuristics for dealing with the inline hint. When given a -respect-inlinehint option, functions marked with the inline keyword are given a threshold just above the default for -O3. We need some experiments to determine if that is the right thing to do. llvm-svn: 95466	2010-02-06 01:16:28 +00:00
Jakob Stoklund Olesen	5f9ead2714	Don't unroll loops containing function calls. llvm-svn: 95454	2010-02-05 23:21:31 +00:00
Jakob Stoklund Olesen	916f48a054	Teach SimplifyCFG about magic pointer constants. Weird code sometimes uses pointer constants other than null. This patch teaches SimplifyCFG to build switch instructions in those cases. Code like this: void f(const char x) { if (!x) puts("null"); else if ((uintptr_t)x == 1) puts("one"); else if (x == (char)2 \|\| x == (char)3) puts("two"); else if ((intptr_t)x == 4) puts("four"); else puts(x); } Now becomes a switch: define void @f(i8 %x) nounwind ssp { entry: %magicptr23 = ptrtoint i8* %x to i64 ; <i64> [#uses=1] switch i64 %magicptr23, label %if.else16 [ i64 0, label %if.then i64 1, label %if.then2 i64 2, label %if.then9 i64 3, label %if.then9 i64 4, label %if.then14 ] Note that LLVM's own DenseMap uses magic pointers. llvm-svn: 95439	2010-02-05 22:03:18 +00:00
Chris Lattner	64ffd11d49	fix logical-select to invoke filecheck right, and fix hte instcombine xform it is checking to actually pass. There is no need to match m_SelectCst<0, -1> since instcombine canonicalizes that into not(sext). Add matches for sext(not(x)) in addition to not(sext(x)). llvm-svn: 95420	2010-02-05 19:53:02 +00:00
Dan Gohman	4739e41ce9	Implement releaseMemory in CodeGenPrepare and free the BackEdges container data. This prevents it from holding onto dangling pointers and potentially behaving unpredictably. llvm-svn: 95409	2010-02-05 19:24:11 +00:00
Dan Gohman	8abb67df63	Use a SmallSetVector instead of a SetVector; this code showed up as a malloc caller in a profile. llvm-svn: 95407	2010-02-05 19:20:15 +00:00
Eric Christopher	04371b4f12	Remove this code for now. I have a better idea and will rewrite with that in mind. llvm-svn: 95402	2010-02-05 19:04:06 +00:00
Bob Wilson	27dfb1e1a4	Do not reassociate expressions with i1 type. SimplifyCFG converts some short-circuited conditions to AND/OR expressions, and those expressions are often converted back to a short-circuited form in code gen. The original source order may have been optimized to take advantage of the expected values, and if we reassociate them, we change the order and subvert that optimization. Radar 7497329. llvm-svn: 95333	2010-02-04 23:32:37 +00:00
Jakob Stoklund Olesen	113fb54bcb	Increase inliner thresholds by 25. This makes the inliner about as agressive as it was before my changes to the inliner cost calculations. These levels give the same performance and slightly smaller code than before. llvm-svn: 95320	2010-02-04 18:48:20 +00:00
Eric Christopher	107a1fbf61	Temporarily revert this since it appears to have caused a build failure. llvm-svn: 95294	2010-02-04 06:41:27 +00:00
Eric Christopher	42fa84a880	Rework constant expr and array handling for objectsize instcombining. Fix bugs where we would compute out of bounds as in bounds, and where we couldn't know that the linker could override the size of an array. Add a few new testcases, change existing testcase to use a private global array instead of extern. llvm-svn: 95283	2010-02-04 02:55:34 +00:00
Eric Christopher	f12e18db21	If we're dealing with a zero-length array, don't lower to any particular size, we just don't know what the length is yet. llvm-svn: 95266	2010-02-03 23:56:07 +00:00
Bob Wilson	04365c5f72	Adjust the heuristics used to decide when SROA is likely to be profitable. The SRThreshold value makes perfect sense for checking if an entire aggregate should be promoted to a scalar integer, but it is not so good for splitting an aggregate into its separate elements. A struct may contain a large embedded array along with some scalar fields that would benefit from being split apart by SROA. Even if the total aggregate size is large, it may still be good to perform SROA. Thus, the most important piece of this patch is simply moving the aggregate size comparison vs. SRThreshold so that it guards only the aggregate promotion. We have also been checking the number of elements to decide if an aggregate should be split up. The limit of "SRThreshold/4" seemed rather arbitrary, and I don't think it's very useful to derive this limit from SRThreshold anyway. I've collected some data showing that the current default limit of 32 (since SRThreshold defaults to 128) is a reasonable cutoff for struct types. One thing suggested by the data is that distinguishing between structs and arrays might be useful. There are (obviously) a lot more large arrays than large structs (as measured by the number of elements and not the total size -- a large array inside a struct still counts as a single element given the way we do SROA right now). Out of 8377 arrays where we successfully performed SROA while compiling a large set of benchmarks, only 16 of them had more than 8 elements. And, for those 16 arrays, it's not at all clear that SROA was actually beneficial. So, to offset the compile time cost of investigating more large structs for SROA, the patch lowers the limit on array elements to 8. This fixes Apple Radar 7563690. llvm-svn: 95224	2010-02-03 17:23:56 +00:00
Evan Cheng	27a41d5473	Revert 94937 and move the noreturn check to codegen. llvm-svn: 95198	2010-02-03 03:55:59 +00:00
Bob Wilson	76e8c59509	Fix some comment typos. llvm-svn: 95170	2010-02-03 00:33:21 +00:00
Eric Christopher	d86233c118	Recommit this, looks like it wasn't the cause. llvm-svn: 95165	2010-02-03 00:21:58 +00:00
Eric Christopher	e67d01a9a8	Hopefully temporarily revert this. llvm-svn: 95154	2010-02-02 23:01:31 +00:00
Eric Christopher	f9553572b7	Reformat my last patch slightly. llvm-svn: 95147	2010-02-02 22:29:26 +00:00
Eric Christopher	4264e7e46f	Re-add strcmp and known size object size checking optimization. Passed bootstrap and nightly test run here. llvm-svn: 95145	2010-02-02 22:10:43 +00:00
Chris Lattner	8e2c471614	don't turn (A & (C0?-1:0)) \| (B & ~(C0?-1:0)) -> C0 ? A : B for vectors. Codegen is generating awful code or segfaulting in various cases (e.g. PR6204). llvm-svn: 95058	2010-02-02 02:43:51 +00:00
Chris Lattner	302240d73e	fix a crash in loop unswitch on a loop invariant vector condition. llvm-svn: 95055	2010-02-02 02:26:54 +00:00
Dan Gohman	949458d014	LangRef.html says that inttoptr and ptrtoint always use zero-extension when the cast is extending. llvm-svn: 95046	2010-02-02 01:44:02 +00:00
Eric Christopher	14dfc3f6df	Don't need to check the last argument since it'll always be bool. We also don't use TargetData here. llvm-svn: 95040	2010-02-02 00:51:45 +00:00
Eric Christopher	9afa973203	More indentation/tabification fixes. llvm-svn: 95036	2010-02-02 00:13:06 +00:00
Eric Christopher	1408234753	Untabify previous commit. llvm-svn: 95035	2010-02-02 00:06:55 +00:00
Eric Christopher	56e4182c49	Formatting. llvm-svn: 95027	2010-02-01 23:25:03 +00:00
Bob Wilson	d517b52012	Add an option to GVN to remove all partially redundant loads. This is currently disabled by default. This divides the existing load PRE code into 2 phases: first it checks that it is safe to move the load to each of the predecessors where it is unavailable, and then if it is safe, the code is changed to move the load. Radar 7571861. llvm-svn: 95007	2010-02-01 21:17:14 +00:00
Chris Lattner	9306ffa05a	cleanups. llvm-svn: 94995	2010-02-01 19:54:45 +00:00
Chris Lattner	846a52e228	fix rdar://7590304, a miscompilation of objc apps on arm. The caller of objc message send was getting marked arm_apcscc, but the prototype isn't. This is fine at runtime because objcmsgsend is implemented in assembly. Only turn a mismatched caller and callee into 'unreachable' if the callee is a definition. llvm-svn: 94986	2010-02-01 18:11:34 +00:00
Chris Lattner	2cecedf081	fix rdar://7590304, an infinite loop in instcombine. In the invoke case, instcombine can't zap the invoke for fear of changing the CFG. However, we have to do something to prevent the next iteration of instcombine from inserting another store -> undef before the invoke thereby getting into infinite iteration between dead store elim and store insertion. Just zap the callee to null, which will prevent the next iteration from doing anything. llvm-svn: 94985	2010-02-01 18:04:58 +00:00
Bob Wilson	f65ba356e1	Fix pr6198 by moving the isSized() check to an outer conditional. The testcase from pr6198 does not crash for me -- I don't know what's up with that -- so I'm not adding it to the tests. llvm-svn: 94984	2010-02-01 17:41:44 +00:00
Eli Friedman	a2cc2875fc	Simplify/generalize the xor+add->sign-extend instcombine. llvm-svn: 94943	2010-01-31 04:29:12 +00:00
Eli Friedman	37a8197b61	Add a small transform: transform -(X<<Y) to (-X<<Y) when the shift has a single use and X is free to negate. llvm-svn: 94941	2010-01-31 02:30:23 +00:00
Evan Cheng	d86d3fe0c3	Do not mark no-return calls tail calls. It'll screw up special calls like longjmp and it doesn't make much sense for performance reason. If my logic is faulty, please let me know. llvm-svn: 94937	2010-01-31 00:59:31 +00:00
Bob Wilson	56600a15ad	Check alignment of loads when deciding whether it is safe to execute them unconditionally. Besides checking the offset, also check that the underlying object is aligned as much as the load itself. llvm-svn: 94875	2010-01-30 04:42:39 +00:00
Bob Wilson	4b71b6c179	Use more specific types to avoid casts. No functionality change. llvm-svn: 94863	2010-01-30 00:41:10 +00:00
Jakob Stoklund Olesen	e27dc727e2	Keep iterating over all uses when meeting a phi node in AllUsesOfValueWillTrapIfNull(). This bug was exposed by my inliner cost changes in r94615, and caused failures of lencod on most architectures when building with LTO. This patch fixes lencod and 464.h264ref on x86-64 (and likely others). llvm-svn: 94858	2010-01-29 23:54:14 +00:00
Bob Wilson	1b8453067b	Preserve load alignment in instcombine transformations. I've been unable to create a testcase where this matters. The select+load transformation only occurs when isSafeToLoadUnconditionally is true, and in those situations, instcombine also changes the underlying objects to be aligned. This seems like a good idea regardless, and I've verified that it doesn't pessimize the subsequent realignment. llvm-svn: 94850	2010-01-29 22:39:21 +00:00
Eric Christopher	5a0e174863	Revert my last couple of patches. They appear to have broken bison. llvm-svn: 94841	2010-01-29 21:16:24 +00:00
Bob Wilson	34e10c2218	Use uint64_t instead of unsigned for offsets and sizes. llvm-svn: 94835	2010-01-29 20:34:28 +00:00
Bob Wilson	7c42b9d51e	Improve isSafeToLoadUnconditionally to recognize that GEPs with constant indices are safe if the result is known to be within the bounds of the underlying object. llvm-svn: 94829	2010-01-29 19:19:08 +00:00
Duncan Sands	c8a3e56870	Having RHSKnownZero and RHSKnownOne be alternative names for KnownZero and KnownOne (via APInt &RHSKnownZero = KnownZero, etc) seems dangerous and confusing to me: it is easy not to notice this, and then wonder why KnownZero/RHSKnownZero changed underneath you when you modified RHSKnownZero/KnownZero etc. So get rid of this. No intended functionality change (tested with "make check" + llvm-gcc bootstrap). llvm-svn: 94802	2010-01-29 06:18:46 +00:00
Eric Christopher	9b3c02b7da	Make strcpy_chk lower to strcpy if we have a safe size. llvm-svn: 94783	2010-01-29 01:37:11 +00:00
Eric Christopher	997f7ca8c5	Add constant support to object size handling and remove default lowering. We'll either figure it out, or not and be lowered by SelectionDAGBuild. Add test. llvm-svn: 94775	2010-01-29 01:09:57 +00:00
Bill Wendling	48816a0b3f	Generic reformatting and comment fixing. No functionality change. llvm-svn: 94771	2010-01-29 00:52:43 +00:00
Bill Wendling	8277838cf8	Add newline to debugging output, and fix some grammar-os in comment. llvm-svn: 94765	2010-01-29 00:27:39 +00:00
Victor Hernandez	006b53f199	mem2reg erases the dbg.declare intrinsics that it converts to dbg.val intrinsics llvm-svn: 94763	2010-01-29 00:01:35 +00:00
Duncan Sands	3a48b87c54	Fix PR6165. The bug was that LHSKnownZero was being and'd with DemandedMask when it should have been and'd with LowBits. Fix that and while there beef up the logic in the case of a negative LHS. llvm-svn: 94745	2010-01-28 17:22:42 +00:00
Bob Wilson	7577e948e4	Avoid creating redundant PHIs in SSAUpdater::GetValueInMiddleOfBlock. This was already being done in SSAUpdater::GetValueAtEndOfBlock so I've just changed SSAUpdater to check for existing PHIs in both places. llvm-svn: 94690	2010-01-27 22:01:02 +00:00
Jeffrey Yasskin	091217be6f	Kill ModuleProvider and ghost linkage by inverting the relationship between Modules and ModuleProviders. Because the "ModuleProvider" simply materializes GlobalValues now, and doesn't provide modules, it's renamed to "GVMaterializer". Code that used to need a ModuleProvider to materialize Functions can now materialize the Functions directly. Functions no longer use a magic linkage to record that they're materializable; they simply ask the GVMaterializer. Because the C ABI must never change, we can't remove LLVMModuleProviderRef or the functions that refer to it. Instead, because Module now exposes the same functionality ModuleProvider used to, we store a Module* in any LLVMModuleProviderRef and translate in the wrapper methods. The bindings to other languages still use the ModuleProvider concept. It would probably be worth some time to update them to follow the C++ more closely, but I don't intend to do it. Fixes http://llvm.org/PR5737 and http://llvm.org/PR5735. llvm-svn: 94686	2010-01-27 20:34:15 +00:00
Benjamin Kramer	1266d46d32	Don't bother with sprintf, just pass the Twine through. llvm-svn: 94684	2010-01-27 19:58:47 +00:00
Benjamin Kramer	40582a891c	Use the less expensive getName function instead of getNameStr. llvm-svn: 94683	2010-01-27 19:46:52 +00:00
Chris Lattner	65f4733b77	some cleanups. llvm-svn: 94649	2010-01-27 02:12:20 +00:00
Chris Lattner	711e701f1c	no need to check for null llvm-svn: 94648	2010-01-27 02:04:20 +00:00
Victor Hernandez	477d9274bb	When converting dbg.declare to dbg.value, attach promoted store's debug metadata to dbg.value llvm-svn: 94634	2010-01-27 00:44:36 +00:00
Victor Hernandez	2b17e2a452	Avoid extra calls to MD->getNumOperands() llvm-svn: 94618	2010-01-26 23:29:09 +00:00
Victor Hernandez	9ecd2f039f	Switch AllocaDbgDeclares to SmallVector and don't leak DIFactory llvm-svn: 94567	2010-01-26 18:57:53 +00:00
Victor Hernandez	cd94410152	In mem2reg, for all alloca/stores that get promoted where the alloca has an associated llvm.dbg.declare instrinsic, insert an llvm.dbg.var intrinsic before each store. llvm-svn: 94493	2010-01-26 02:42:15 +00:00
Bob Wilson	70c8fe5e4e	Remove check for an impossible condition: the condition of the while loop has already checked that TmpBB->getSinglePredecessor() is non-null. llvm-svn: 94451	2010-01-25 21:28:05 +00:00
Bob Wilson	fc060e4337	Change Value::getUnderlyingObject to have the MaxLookup value specified as a parameter with a default value, instead of just hardcoding it in the implementation. The limit of MaxLookup = 6 was introduced in r69151 to fix a performance problem with O(n^2) behavior in instcombine, but the scalarrepl pass is relying on getUnderlyingObject to go all the way back to an AllocaInst. Making the limit part of the method signature makes it clear that by default the result is limited and should help avoid similar problems in the future. This fixes pr6126. llvm-svn: 94433	2010-01-25 18:26:54 +00:00
Victor Hernandez	8a588e1444	Revert r94260 until findDbgDeclare() is made more efficient llvm-svn: 94432	2010-01-25 17:52:13 +00:00
Chris Lattner	823aed16f9	make -fno-rtti the default unless a directory builds with REQUIRES_RTTI. llvm-svn: 94378	2010-01-24 20:43:08 +00:00
Chris Lattner	1b35bbe813	change the canonical form of "cond ? -1 : 0" to be "sext cond" instead of a select. This simplifies some instcombine code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows us to generate better code for a testcase on ppc. llvm-svn: 94339	2010-01-24 00:09:49 +00:00
Chris Lattner	e112ff64c5	fix a potential overflow issue Eli pointed out. llvm-svn: 94336	2010-01-23 23:31:46 +00:00
Nick Lewycky	7e7ed8b9e5	Speculatively revert r94322 to see if it fixes darwin selfhost buildbot. llvm-svn: 94331	2010-01-23 20:32:12 +00:00
Chris Lattner	29b15c5cfd	third bug from PR6119: the xor dupe extension allows for arbitrary terminators in predecessors, don't assume it is a conditional or uncond branch. The testcase shows an example where they can happen with switches. llvm-svn: 94323	2010-01-23 19:21:31 +00:00
Nick Lewycky	32966aed9d	Teach DAE that even though it can't modify the function signature of an externally visible function, it can still find all callers of it and replace the parameters to a dead argument with undef. llvm-svn: 94322	2010-01-23 19:19:34 +00:00
Chris Lattner	ba2d0b89ff	add an early out to ProcessBranchOnXOR to speed it up, handle the case when we can infer an input to the xor from all inputs that agree, instead of going into an infinite loop. Another part of PR6199 llvm-svn: 94321	2010-01-23 19:16:25 +00:00
Chris Lattner	de5ab4860f	fix a crash in jump threading, PR6119 llvm-svn: 94319	2010-01-23 18:56:07 +00:00
Chris Lattner	249da5cb73	implement a simple instcombine xform that has been in the readme forever. llvm-svn: 94318	2010-01-23 18:49:30 +00:00
Eric Christopher	ba7cd4c393	Reapply 94059 while fixing the calling convention setup for strcpy. llvm-svn: 94287	2010-01-23 05:29:06 +00:00
Victor Hernandez	5006e43faf	In mem2reg, for all alloca/stores that get promoted where the alloca has an associated llvm.dbg.declare instrinsic, insert an llvm.dbg.var intrinsic before each store llvm-svn: 94260	2010-01-23 00:17:34 +00:00
Benjamin Kramer	3838dfbaea	Another strncmp -> StringRef.startswith simplification. llvm-svn: 94203	2010-01-22 20:00:21 +00:00
Bob Wilson	6c0c8d41b4	Revert 94059. It is breaking the MultiSource/Benchmarks/Prolangs-C/bison test on ARM. llvm-svn: 94198	2010-01-22 19:16:40 +00:00
Victor Hernandez	5f8c8c034a	Keep ignoring pointer-to-pointer bitcasts llvm-svn: 94194	2010-01-22 19:05:05 +00:00
Chris Lattner	7ba0661f27	Stop building RTTI information for most llvm libraries. Notable missing ones are libsupport, libsystem and libvmcore. libvmcore is currently blocked on bugpoint, which uses EH. Once it stops using EH, we can switch it off. This #if 0's out 3 unit tests, because gtest requires RTTI information. Suggestions welcome on how to fix this. llvm-svn: 94164	2010-01-22 06:49:46 +00:00
Dan Gohman	045f81981a	Revert LoopStrengthReduce.cpp to pre-r94061 for now. llvm-svn: 94123	2010-01-22 00:46:49 +00:00
Victor Hernandez	7b151e9f06	No need to look through bitcasts for DbgInfoIntrinsic llvm-svn: 94114	2010-01-21 23:09:12 +00:00
Victor Hernandez	ae4d949721	DbgInfoIntrinsic no longer appear in an instruction's use list llvm-svn: 94113	2010-01-21 23:08:36 +00:00
Victor Hernandez	5f5abd598c	No need to look through bitcasts for DbgInfoIntrinsic llvm-svn: 94112	2010-01-21 23:07:15 +00:00
Victor Hernandez	1df65186d1	DbgInfoIntrinsics no longer appear in an instruction's use list; so clean up looking for them in use iterations and remove OnlyUsedByDbgInfoIntrinsics() llvm-svn: 94111	2010-01-21 23:05:53 +00:00
Dan Gohman	b1ee154b6b	When inserting expressions for post-increment users which contain loop-variant components, adds must be inserted after the increment. Keep track of the increment position for this case, and insert these adds in the correct location. llvm-svn: 94110	2010-01-21 23:01:22 +00:00
Dan Gohman	cb8d577eb2	Include IVUsers information in LSR's debug output. llvm-svn: 94108	2010-01-21 22:46:32 +00:00
Dan Gohman	29916e023d	Prune the search for candidate formulae if the number of register operands exceeds the number of registers used in the initial solution, as that wouldn't lead to a profitable solution anyway. llvm-svn: 94107	2010-01-21 22:42:49 +00:00
Dan Gohman	c903499ff8	Add a comment. llvm-svn: 94104	2010-01-21 21:31:09 +00:00
Chris Lattner	24716b6c63	It turns out that this #include is needed because otherwise ValueMapper.cpp ends up calling an out of line __ZNK4llvm12PATypeHolder3getEv, which is a template and llvm-config determines arbitrarily to use the one in libipo. This sucks, but keeping the #include is a reasonable workaround. llvm-svn: 94103	2010-01-21 21:29:25 +00:00
Chris Lattner	9889b4be04	unbreak the build, apparently without this transformutils starts depending on libipa? llvm-svn: 94102	2010-01-21 21:20:51 +00:00
Chris Lattner	e39837d5ee	tidy up llvm-svn: 94101	2010-01-21 21:05:54 +00:00
Victor Hernandez	a9ad174b49	Don't need to include IntrinsicInst.h any more llvm-svn: 94092	2010-01-21 19:33:59 +00:00
Victor Hernandez	d089f4e10b	No need to map NULL operands of metadata llvm-svn: 94091	2010-01-21 19:26:20 +00:00
Dan Gohman	51ad99d2c5	Re-implement the main strength-reduction portion of LoopStrengthReduction. This new version is much more aggressive about doing "full" reduction in cases where it reduces register pressure, and also more aggressive about rewriting induction variables to count down (or up) to zero when doing so reduces register pressure. It currently uses fairly simplistic algorithms for finding reuse opportunities, but it introduces a new framework allows it to combine multiple strategies at once to form hybrid solutions, instead of doing all full-reduction or all base+index. llvm-svn: 94061	2010-01-21 02:09:26 +00:00
Eric Christopher	fa863258d0	Add strcpy_chk -> strcpy support for "don't know" object size answers. This will update as object size checking gets better information. llvm-svn: 94059	2010-01-21 01:04:38 +00:00
Chris Lattner	3c5bf71353	simplify this code. llvm-svn: 94048	2010-01-20 23:30:28 +00:00
Jakob Stoklund Olesen	8a19d3c96c	Move per-function inline threshold calculation to a method. No functional change except the forgotten test for InlineLimit.getNumOccurrences() == 0 in the CurrentThreshold2 calculation. llvm-svn: 94007	2010-01-20 17:51:28 +00:00
Victor Hernandez	f2462407ee	Switch Elts from vector to SmallVector llvm-svn: 93989	2010-01-20 06:56:16 +00:00
Victor Hernandez	5fa88d4e30	Map operands of all function-local metadata, not just metadata passed to llvm.dbg.declare intrinsics llvm-svn: 93979	2010-01-20 05:49:59 +00:00
Dan Gohman	ca19445d08	When doing address-mode sinking, expand the base register first, rather than the scaled register. This makes it more likely that subsequent AddrModeMatcher queries will match the new address the same way as the old, instead of accidentally matching what had been the base register as the new scaled register, and then failing to match the scaled register. This fixes some problems with address-mode sinking multiple muls into a block, which will be a lot more common with some upcoming LoopStrengthReduction changes. llvm-svn: 93935	2010-01-19 22:45:06 +00:00
Chris Lattner	18f49ce2d3	optimize ~(~X >>s Y) --> (X >>s Y), patch by Edmund Grimley Evans! llvm-svn: 93884	2010-01-19 18:16:19 +00:00
Bob Wilson	58d59fe394	Fix a crash in scalarrepl for memcpy/memmove where the source and destination are the same. I had already fixed a similar problem where the source and destination were different bitcasts derived from the same alloca, but the previous fix still did not handle the case where both operands are exactly the same value. Radar 7552893. llvm-svn: 93848	2010-01-19 04:32:48 +00:00
Eric Christopher	84bd316bd6	Fix comment. llvm-svn: 93831	2010-01-19 01:20:15 +00:00
Chris Lattner	43f2fa6201	my instcombine transformations to make extension elimination more aggressive changed the canonical form from sext(trunc(x)) to ashr(lshr(x)), make sure to transform a couple more things into that canonical form, and catch a case where we missed turning zext/shl/ashr into a single sext. llvm-svn: 93787	2010-01-18 22:19:16 +00:00
Devang Patel	696cb8d410	While mapping llvm.dbg.declare intrinsic manually map its operand, if possible, because it points to an alloca instruction through metadata. llvm-svn: 93757	2010-01-18 19:52:14 +00:00
Owen Anderson	cdea3572fa	Convert some of the dynamic opcode lookups into static ones. llvm-svn: 93693	2010-01-17 19:33:27 +00:00
Owen Anderson	fa1edea9ce	Fix comment. llvm-svn: 93679	2010-01-17 06:49:03 +00:00
Bob Wilson	e0da4b6cff	Fix a comment typo. llvm-svn: 93560	2010-01-15 21:55:02 +00:00
Bill Wendling	ad7a5b07a7	When the visitSub method was split into visitSub and visitFSub, this xform was added to the FSub version. However, the original version of this xform guarded against doing this for floating point (!Op0->getType()->isFPOrFPVector()). This is causing LLVM to perform incorrect xforms for code like: void func(double rhi, double rlo, double xh, double xl, double yh, double yl){ double mh, ml; double c = 134217729.0; double up, u1, u2, vp, v1, v2; up = xhc; u1 = (xh - up) + up; u2 = xh - u1; vp = yhc; v1 = (yh - vp) + vp; v2 = yh - v1; mh = xhyh; ml = (((u1v1 - mh) + (u1v2)) + (u2v1)) + (u2v2); ml += xhyl + xlyh; rhi = mh + ml; rlo = (mh - (rhi)) + ml; } The last line was optimized away, but rl is intended to be the difference between the infinitely precise result of mh + ml and after it has been rounded to double precision. llvm-svn: 93369	2010-01-13 23:23:17 +00:00
Chris Lattner	573da8ac90	1) Use the new SimplifyInstructionsInBlock routine instead of the copy in JT. 2) When cloning blocks for PHI or xor conditions, use instsimplify to simplify the code as we go. This allows us to squish common cases early in JT which opens up opportunities for subsequent iterations, and allows it to completely simplify the testcase. llvm-svn: 93253	2010-01-12 20:41:47 +00:00
Chris Lattner	7c743f2c74	add a helper function. llvm-svn: 93251	2010-01-12 19:40:54 +00:00
Chris Lattner	af7855d571	tidy up llvm-svn: 93222	2010-01-12 02:07:50 +00:00
Chris Lattner	eb73bdb2e1	Teach jump threading to duplicate small blocks when the branch condition is a xor with a phi node. This eliminates nonsense like this from 176.gcc in several places: LBB166_84: testl %eax, %eax - setne %al - xorb %cl, %al - notb %al - testb $1, %al - je LBB166_85 + je LBB166_69 + jmp LBB166_85 This is rdar://7391699 llvm-svn: 93221	2010-01-12 02:07:17 +00:00
Chris Lattner	6a19ed0b86	some cleanup, and make it obvious that ProcessJumpOnPHI only works on branches by renaming it and checking for a branch at the call site. llvm-svn: 93208	2010-01-11 23:41:09 +00:00
Chris Lattner	d1a3efedd8	reenable the piece that turns trunc(zext(x)) -> x even if zext has multiple uses, codegen has no apparent problem with the trunc version of this, because it turns into a simple subreg idiom llvm-svn: 93202	2010-01-11 22:49:40 +00:00
Chris Lattner	a6b1356cf9	Disable folding sext(trunc(x)) -> x (and other similar cast/cast cases) when the trunc has multiple uses. Codegen is not able to coalesce the subreg case correctly and so this leads to higher register pressure and spilling (see PR5997). This speeds up 256.bzip2 from 8.60 -> 8.04s on my machine, ~7%. llvm-svn: 93200	2010-01-11 22:45:25 +00:00
Chris Lattner	9518869423	add one more bitfield optimization, allowing clang to generate good code on PR4216: _test_bitfield: ## @test_bitfield orl $32962, %edi movl $4294941946, %eax andq %rdi, %rax ret instead of: _test_bitfield: movl $4294941696, %ecx movl %edi, %eax orl $194, %edi orl $32768, %eax andq $250, %rdi andq %rax, %rcx movq %rdi, %rax orq %rcx, %rax ret Evan is looking into the remaining andq+imm -> andl optimization. llvm-svn: 93147	2010-01-11 06:55:24 +00:00
Chris Lattner	0a85420409	Extend CanEvaluateZExtd to handle and/or/xor more aggressively in the BitsToClear case. This allows it to promote expressions which have an and/or/xor after the lshr, promoting cases like test2 (from PR4216) and test3 (random extample extracted from a spec benchmark). clang now compiles the code in PR4216 into: _test_bitfield: ## @test_bitfield movl %edi, %eax orl $194, %eax movl $4294902010, %ecx andq %rax, %rcx orl $32768, %edi andq $39936, %rdi movq %rdi, %rax orq %rcx, %rax ret instead of: _test_bitfield: ## @test_bitfield movl %edi, %eax orl $194, %eax movl $4294902010, %ecx andq %rax, %rcx shrl $8, %edi orl $128, %edi shlq $8, %rdi andq $39936, %rdi movq %rdi, %rax orq %rcx, %rax ret which is still not great, but is progress. llvm-svn: 93145	2010-01-11 04:05:13 +00:00
Chris Lattner	12bd8992b3	Remove the dead TD argument to CanEvaluateZExtd, and add a new BitsToClear result which allows us to start promoting expressions that end with a lshr-by-constant. This is conservatively correct and better than what we had before (see testcases) but still needs to be extended further. llvm-svn: 93144	2010-01-11 03:32:00 +00:00
Chris Lattner	172630abd2	improve comments, remove dead TD argument to CanEvaluateSExtd. llvm-svn: 93143	2010-01-11 02:43:35 +00:00
Chris Lattner	7dd540ee24	teach sext optimization to handle truncs from types that are not the dest of the sext. llvm-svn: 93128	2010-01-10 20:30:41 +00:00
Chris Lattner	39d2daa94c	teach zext optimization how to deal with truncs that don't come from the zext dest type. This allows us to handle test52/53 in cast.ll, and allows llvm-gcc to generate much better code for PR4216 in -m64 mode: _test_bitfield: ## @test_bitfield orl $32962, %edi movl %edi, %eax andl $-25350, %eax ret This also fixes a bug handling vector extends, ensuring that the mask produced is a vector constant, not an integer constant. llvm-svn: 93127	2010-01-10 20:25:54 +00:00
Chris Lattner	1a05fddcdc	simplify CanEvaluateSExtd to return a bool now that we have a simpler profitability predicate. llvm-svn: 93111	2010-01-10 07:57:20 +00:00
Chris Lattner	d7816780e2	the NumCastsRemoved argument to CanEvaluateSExtd is dead, remove it. llvm-svn: 93110	2010-01-10 07:42:21 +00:00
Chris Lattner	2fff10c424	now that the cost model has changed, we can always consider elimination of a sign extend to be a win, which simplifies the client of CanEvaluateSExtd, and allows us to eliminate more casts (examples taken from real code). llvm-svn: 93109	2010-01-10 07:40:50 +00:00
Chris Lattner	d8509424a4	change the preferred canonical form for a sign extension to be lshr+ashr instead of trunc+sext. We want to avoid type conversions whenever possible, it is easier to codegen expressions without truncates and extensions. llvm-svn: 93107	2010-01-10 07:08:30 +00:00
Chris Lattner	2b459fe7e1	fix indentation of switch statements, no functionality change. llvm-svn: 93106	2010-01-10 06:59:55 +00:00
Chris Lattner	127bbc715e	fix pasto that broke bootstrap. llvm-svn: 93105	2010-01-10 06:50:04 +00:00
Chris Lattner	b7be7cc486	simplify CanEvaluateZExtd now that we don't care about the number of bits known clear in the result and don't care about the # casts eliminated. TD is also dead but keeping it for now. llvm-svn: 93098	2010-01-10 02:50:04 +00:00
Chris Lattner	49d2c9764d	two changes: 1) don't try to optimize a sext or zext that is only used by a trunc, let the trunc get optimized first. This avoids some pointless effort in some common cases since instcombine scans down a block in the first pass. 2) Change the cost model for zext elimination to consider an 'and' cheaper than a zext. This allows us to do it more aggressively, and for the next patch to simplify the code quite a bit. llvm-svn: 93097	2010-01-10 02:39:31 +00:00
Chris Lattner	f0af17dab3	enhance CanEvaluateZExtd to handle shift left and sext, allowing more expressions to be promoted and casts eliminated. llvm-svn: 93096	2010-01-10 02:22:12 +00:00
Chris Lattner	7723e2b10f	remove an xform subsumed by EvaluateInDifferentType. llvm-svn: 93095	2010-01-10 01:35:55 +00:00
Julien Lerouge	321098ebec	Fix nondeterministic behavior. llvm-svn: 93093	2010-01-10 01:07:22 +00:00
Chris Lattner	c95a7a21b7	clean up this xform by using m_Trunc. llvm-svn: 93092	2010-01-10 01:04:31 +00:00
Chris Lattner	883550afe8	inline and remove the rest of commonIntCastTransforms. llvm-svn: 93091	2010-01-10 01:00:46 +00:00
Chris Lattner	c3aca38468	Inline the expression type promotion/demotion stuff out of commonIntCastTransforms into the callers, eliminating a switch, and allowing the static predicate methods to be moved down to live next to the corresponding function. No functionality change. llvm-svn: 93089	2010-01-10 00:58:42 +00:00
Chris Lattner	ab7087ad66	only factor from expressions whose uses are empty and whose base is the right expression type. This fixes PR5981. llvm-svn: 93045	2010-01-09 06:01:36 +00:00
Julien Lerouge	f50a3f19da	Fix nondeterministic behavior. llvm-svn: 93038	2010-01-09 01:06:49 +00:00
Eric Christopher	4a1d7e1506	Remove unnecessary dyn_cast and add a comment. Part of a WIP. llvm-svn: 93026	2010-01-08 21:37:11 +00:00
Chris Lattner	9242ae047c	mplement a theoretical fixme. llvm-svn: 93024	2010-01-08 19:28:47 +00:00
Chris Lattner	10840e9e13	rename CanEvaluateInDifferentType -> CanEvaluateTruncated and simplify it now that it is only used for truncates. llvm-svn: 93021	2010-01-08 19:19:23 +00:00
Chris Lattner	a1e223ea10	teach instcombine to delete sign extending shift pairs (sra(shl X, C), C) when the input is already sign extended. llvm-svn: 93019	2010-01-08 19:04:21 +00:00
Duncan Sands	4a8b15dc74	Suppress an unused variable warning when assertions are off; remove some trailing whitespace while there. llvm-svn: 93008	2010-01-08 17:51:48 +00:00
Chris Lattner	8c92b57df9	tidy up some stuff duncan pointed out. llvm-svn: 93007	2010-01-08 17:48:19 +00:00
Chris Lattner	35d3b9dcd0	teach ComputeNumSignBits to look through PHI nodes. llvm-svn: 92964	2010-01-07 23:44:37 +00:00
Chris Lattner	3057c37959	Enhance instcombine to reason more strongly about promoting computation that feeds into a zext, similar to the patch I did yesterday for sext. There is a lot of room for extension beyond this patch. llvm-svn: 92962	2010-01-07 23:41:00 +00:00
Benjamin Kramer	76e2766442	Use a do-while loop instead of while + boolean. llvm-svn: 92912	2010-01-07 13:50:07 +00:00
Duncan Sands	f117880ab0	Be less stingy as to how many selects and phi nodes we are prepared to look through. llvm-svn: 92898	2010-01-07 05:48:42 +00:00
Chris Lattner	9855a6bb7c	handle ConstantVector while I'm in here. llvm-svn: 92892	2010-01-07 01:20:20 +00:00
Chris Lattner	64ecc468bd	fix a globalopt crash on 'bullet' (handling evaluation of a store to an element of a vector in a static ctor) which occurs with an unrelated patch I'm testing. Annoyingly, EvaluateStoreInto basically does exactly the same stuff as InsertElement constant folding, but it now handles vectors, and you can't insertelement into a vector. It would be 'really nice' if GEP into a vector were not legal. llvm-svn: 92889	2010-01-07 01:16:21 +00:00
Eric Christopher	2cdb806fd8	Move the object size intrinsic optimization to inst-combine and make it work for any integer size return type. llvm-svn: 92853	2010-01-06 20:04:44 +00:00
Duncan Sands	c8493da5b1	Fix a README item: have functionattrs look through selects and phi nodes when deciding which pointers point to local memory. I actually checked long ago how useful this is, and it isn't very: it hardly ever fires in the testsuite, but since Chris wants it here it is! llvm-svn: 92836	2010-01-06 15:37:47 +00:00
Mikhail Glushenkov	40d2429b28	Formatting. llvm-svn: 92831	2010-01-06 09:20:39 +00:00
Duncan Sands	78376ad7e1	Partially address a README by having functionattrs consider calls to memcpy, memset and other intrinsics that only access their arguments to be readnone if the intrinsic's arguments all point to local memory. This improves the testcase in the README to readonly, but it could in theory be made readnone, however this would involve more sophisticated analysis that looks through the memcpy. llvm-svn: 92829	2010-01-06 08:45:52 +00:00
Chris Lattner	4339f2abdb	tweaks suggested by Duncan llvm-svn: 92824	2010-01-06 05:32:15 +00:00
Chris Lattner	98748c0964	Teach instcombine's sext elimination logic to be more aggressive. Previously, instcombine would only promote an expression tree to the larger type if doing so eliminated two casts. This is because a need to manually do the sign extend after the promoted expression tree with two shifts. Now, we keep track of whether the result of the computation is going to be properly sign extended already. If so, we can unconditionally promote the expression, which allows us to zap more sext's. This implements rdar://6598839 (aka gcc pr38751) llvm-svn: 92815	2010-01-06 01:56:21 +00:00
Chris Lattner	8600dd3d7c	simplify this code. llvm-svn: 92800	2010-01-05 23:00:30 +00:00
Chris Lattner	554d0564ff	make this a static function instead of a method. llvm-svn: 92795	2010-01-05 22:30:42 +00:00
Chris Lattner	a93c63c22d	more rearrangement and cleanup, fix my test failure. llvm-svn: 92792	2010-01-05 22:21:18 +00:00
Chris Lattner	f476ef502c	cleanup llvm-svn: 92790	2010-01-05 22:07:33 +00:00
Chris Lattner	f88dd5ed64	remove two trunc xforms that are subsumed by EvaluateInDifferentType. The only difference is that EvaluateInDifferentType checks to ensure they are profitable before doing them :) llvm-svn: 92788	2010-01-05 22:01:41 +00:00
Chris Lattner	44a63815b9	just remove this xform which is subsumed by others. llvm-svn: 92775	2010-01-05 21:16:30 +00:00
Chris Lattner	b82a840eb2	move a trunc-specific transform out of commonIntCastTransforms into visitTrunc. llvm-svn: 92773	2010-01-05 21:11:17 +00:00
Benjamin Kramer	d2564e3afb	Move remaining stuff to the isInteger predicate. llvm-svn: 92771	2010-01-05 21:05:54 +00:00
Chris Lattner	fd7e42b65d	move a zext specific xform out of commonIntCastTransforms into visitZExt and modernize it. llvm-svn: 92770	2010-01-05 21:04:47 +00:00
Chris Lattner	aaccc8de62	move a trunc-specific xform out of commonIntCastTransforms into visitTrunc llvm-svn: 92768	2010-01-05 20:57:30 +00:00
Chris Lattner	dec6847bf6	reduce indentation llvm-svn: 92766	2010-01-05 20:56:24 +00:00
Benjamin Kramer	a81a6dff0d	Convert a ton of simple integer type equality tests to the new predicate. llvm-svn: 92760	2010-01-05 20:07:06 +00:00
Chris Lattner	54f4e39956	optimize comparisons against cttz/ctlz/ctpop, patch by Alastair Lynn! llvm-svn: 92745	2010-01-05 18:09:56 +00:00
Dan Gohman	c3c031bb37	Nick Lewycky pointed out that this code makes changes unconditionally. llvm-svn: 92739	2010-01-05 17:50:58 +00:00
Dan Gohman	b5358003fb	Set Changed properly after calling DeleteDeadPHIs. llvm-svn: 92735	2010-01-05 16:31:45 +00:00
Dan Gohman	28943873e6	Use do+while instead of while for loops which obviously have a non-zero trip count. Use SmallVector's pop_back_val(). llvm-svn: 92734	2010-01-05 16:27:25 +00:00
Dan Gohman	92fdb96474	Fix indentation. llvm-svn: 92733	2010-01-05 16:20:55 +00:00
Dan Gohman	cb99fe9839	Make RecursivelyDeleteTriviallyDeadInstructions, RecursivelyDeleteDeadPHINode, and DeleteDeadPHIs return a flag indicating whether they made any changes. llvm-svn: 92732	2010-01-05 15:45:31 +00:00
Benjamin Kramer	f7cc698b69	Add newline at EOF. llvm-svn: 92727	2010-01-05 13:32:48 +00:00
Benjamin Kramer	ccce8bae14	Avoid going through the LLVMContext for type equality where it's safe to dereference the type pointer. llvm-svn: 92726	2010-01-05 13:12:22 +00:00
Chris Lattner	223812d547	prune some #includes. llvm-svn: 92712	2010-01-05 07:54:43 +00:00
Chris Lattner	0a8191ee88	split and/or/xor out into one overly-large (2000LOC) file. However, I think it does make sense to keep them together, at least for now. llvm-svn: 92711	2010-01-05 07:50:36 +00:00
Chris Lattner	ed41b14f54	missed file with previous commit. llvm-svn: 92710	2010-01-05 07:45:02 +00:00
Chris Lattner	dc67e13442	split instcombine of shifts out to its own file. llvm-svn: 92709	2010-01-05 07:44:46 +00:00
Chris Lattner	e903f38b4d	eliminate getBitCastOperand and simplify some over-complex inbounds stuff. llvm-svn: 92708	2010-01-05 07:42:10 +00:00
Chris Lattner	7a9e47ac4b	split call handling out to InstCombineCalls.cpp llvm-svn: 92707	2010-01-05 07:32:13 +00:00
Chris Lattner	9da1cb243b	optimize cttz and ctlz when we can prove something about the leading/trailing bits. Patch by Alastair Lynn! llvm-svn: 92706	2010-01-05 07:23:56 +00:00
Chris Lattner	85e65e58ac	this inline function moved to addsub llvm-svn: 92705	2010-01-05 07:20:54 +00:00
Chris Lattner	82aa888e8c	split add/sub out to its own file. Eliminate use of dyn_castNotVal in the X+~X transform. dyn_castNotVal is dramatic overkill for what the xform needed. llvm-svn: 92704	2010-01-05 07:18:46 +00:00
Chris Lattner	c7de92ae15	all the places we use hasOneUse() we know are instructions, so inline and simplify. llvm-svn: 92700	2010-01-05 07:04:23 +00:00
Chris Lattner	c6493f070e	eliminate AssociativeOpt and its last uses. llvm-svn: 92697	2010-01-05 07:01:16 +00:00
Chris Lattner	94694c7f0b	inline the FoldICmpLogical functor. llvm-svn: 92695	2010-01-05 06:59:49 +00:00
Chris Lattner	98d48a0b76	inline the 'AddRHS' transformation, simplifying things significantly. Eliminate the 'AddMaskingAnd' transformation, it is redundant with this more general code right below it: // A+B --> A\|B iff A and B have no bits set in common. llvm-svn: 92693	2010-01-05 06:29:13 +00:00
Chris Lattner	39b063bf37	remove massive over-genality manifested as a big template that got instantiated. There is no reason for instcombine to try this hard for simple associative optimizations. Next up, eliminate the template completely. llvm-svn: 92692	2010-01-05 06:24:06 +00:00
Chris Lattner	dc054bf39a	split mul/div/rem instructions out to their own file. llvm-svn: 92689	2010-01-05 06:09:35 +00:00
Chris Lattner	1e7b7b50b1	clean up header. llvm-svn: 92688	2010-01-05 06:05:07 +00:00
Chris Lattner	8f771cb78f	split select out to its own file. llvm-svn: 92687	2010-01-05 06:03:12 +00:00
Chris Lattner	a65e2f7304	split out load/store/alloca. llvm-svn: 92685	2010-01-05 05:57:49 +00:00
Chris Lattner	841af4f03d	reduce indentation llvm-svn: 92684	2010-01-05 05:42:08 +00:00
Chris Lattner	ec97a90221	split vector stuff out to InstCombineVectorOps.cpp llvm-svn: 92683	2010-01-05 05:36:20 +00:00
Chris Lattner	de1feded32	split PHI node stuff out to InstCombinePHI.cpp llvm-svn: 92682	2010-01-05 05:31:55 +00:00
Chris Lattner	27acfcd1c4	convert various IntrinsicInst's to use class instead of struct. llvm-svn: 92681	2010-01-05 05:21:26 +00:00
Chris Lattner	f741d72b84	fix an infinite loop in reassociate building emacs. llvm-svn: 92679	2010-01-05 04:55:35 +00:00
David Greene	cf0addf927	Change errs() to dbgs(). llvm-svn: 92639	2010-01-05 01:28:37 +00:00
David Greene	6ef94ad615	Change errs() to dbgs(). llvm-svn: 92636	2010-01-05 01:28:29 +00:00
David Greene	74e8bd05cc	Change errs() to dbgs(). llvm-svn: 92633	2010-01-05 01:28:12 +00:00
David Greene	9fcfd96da9	Change errs() to dbgs(). llvm-svn: 92631	2010-01-05 01:28:07 +00:00
David Greene	44cb8ade45	Change errs() to dbgs(). llvm-svn: 92629	2010-01-05 01:28:05 +00:00
David Greene	8306b60d56	Change errs() to dbgs(). llvm-svn: 92627	2010-01-05 01:27:54 +00:00
David Greene	0122fc495d	Change errs() to dbgs(). llvm-svn: 92625	2010-01-05 01:27:51 +00:00
David Greene	241992382e	Change errs() to dbgs(). llvm-svn: 92624	2010-01-05 01:27:47 +00:00
David Greene	e0b9789593	Change errs() to dbgs(). llvm-svn: 92623	2010-01-05 01:27:44 +00:00
David Greene	6bc0776343	Change errs() to dbgs(). llvm-svn: 92622	2010-01-05 01:27:39 +00:00
David Greene	3a79df0993	Change errs() to dbgs(). llvm-svn: 92620	2010-01-05 01:27:33 +00:00
David Greene	0fd862254e	Change errs() to dbgs(). llvm-svn: 92619	2010-01-05 01:27:30 +00:00
David Greene	d17c3916d0	Change errs() to dbgs(). llvm-svn: 92617	2010-01-05 01:27:24 +00:00
David Greene	9ddc6e2e12	Change errs() to dbgs(). llvm-svn: 92615	2010-01-05 01:27:21 +00:00
David Greene	1efdb45562	Change errs() to dbgs(). llvm-svn: 92614	2010-01-05 01:27:19 +00:00
David Greene	2e6efc441f	Change errs() to dbgs(). llvm-svn: 92613	2010-01-05 01:27:17 +00:00
David Greene	389fc3b9f6	Change errs() to dbgs(). llvm-svn: 92612	2010-01-05 01:27:15 +00:00
David Greene	74e2d4917d	Change errs() to dbgs(). llvm-svn: 92611	2010-01-05 01:27:11 +00:00
David Greene	48c86bedbd	Change errs() to dbgs(). llvm-svn: 92610	2010-01-05 01:27:09 +00:00
David Greene	0dd384cfd0	Change errs() to dbgs(). llvm-svn: 92609	2010-01-05 01:27:06 +00:00
David Greene	d9c355d590	Change errs() to dbgs(). llvm-svn: 92608	2010-01-05 01:27:04 +00:00
David Greene	b72ad95ecf	Change errs() to dbgs(). llvm-svn: 92607	2010-01-05 01:27:01 +00:00
David Greene	084b0dde9d	Change errs() to dbgs(). llvm-svn: 92606	2010-01-05 01:26:57 +00:00
David Greene	76a4e852f8	Change errs() to dbgs(). llvm-svn: 92605	2010-01-05 01:26:54 +00:00
David Greene	725c7c3f2e	Change errs() to dbgs(). llvm-svn: 92604	2010-01-05 01:26:52 +00:00
David Greene	3774a38fdf	Change errs() to dbgs(). llvm-svn: 92603	2010-01-05 01:26:49 +00:00
David Greene	50c54238e4	Change errs() to dbgs(). llvm-svn: 92602	2010-01-05 01:26:45 +00:00
David Greene	0ad6dce031	Change errs() to dbgs(). llvm-svn: 92601	2010-01-05 01:26:44 +00:00
David Greene	627f40a9f2	Change errs() to dbgs(). llvm-svn: 92600	2010-01-05 01:26:41 +00:00
David Greene	a8a32dd987	Change errs() to dbgs(). llvm-svn: 92599	2010-01-05 01:26:39 +00:00
Devang Patel	be94f23992	Remove dead debug info intrinsics. Intrinsic::dbg_stoppoint Intrinsic::dbg_region_start Intrinsic::dbg_region_end Intrinsic::dbg_func_start AutoUpgrade simply ignores these intrinsics now. llvm-svn: 92557	2010-01-05 01:10:40 +00:00
Daniel Dunbar	72a87448c1	Fix some struct/class specifier mismatches. llvm-svn: 92550	2010-01-05 00:15:58 +00:00
Chris Lattner	a751d09c08	Truncate GEP indexes larger than the pointer size down to pointer size when doing this transform if the GEP is not inbounds. No testcase because it is very difficult to trigger this: instcombine already canonicalizes GEP indices to pointer size, so it relies specific permutations of the instcombine worklist. Thanks to Duncan for pointing this possible problem out. llvm-svn: 92495	2010-01-04 18:57:15 +00:00
Chris Lattner	2cb08e69b1	silence a bogus 'might be used uninit' warning from GCC. llvm-svn: 92494	2010-01-04 18:48:26 +00:00
Chris Lattner	59d95743c8	move some more cast-related stuff llvm-svn: 92471	2010-01-04 07:59:07 +00:00
Mikhail Glushenkov	6a8ac8ce8f	80-col violations, trailing whitespace. llvm-svn: 92470	2010-01-04 07:55:25 +00:00
Chris Lattner	92be2adba6	move the [Can]EvaluateInDifferentType functions out to InstCombineCasts.cpp llvm-svn: 92469	2010-01-04 07:54:59 +00:00
Chris Lattner	2b295a0eba	split 943 lines of instcombine out to a new InstCombineCasts.cpp file. InstructionCombining.cpp is now down to a svelte 9300 lines :) llvm-svn: 92468	2010-01-04 07:53:58 +00:00
Chris Lattner	2188e40e4c	split instcombine of compares (visit[FI]Cmp) out to a new InstCombineCompares.cpp file. llvm-svn: 92467	2010-01-04 07:37:31 +00:00
Chris Lattner	6ea40f1542	update cmakefile llvm-svn: 92466	2010-01-04 07:19:55 +00:00
Chris Lattner	7e0449172c	move the 'SimplifyDemandedFoo' methods out to their own file, cutting 1K lines out of instcombine.cpp llvm-svn: 92465	2010-01-04 07:17:19 +00:00
Chris Lattner	35522b7465	split the instcombine class definition out to a header shared among the instcombine library. llvm-svn: 92463	2010-01-04 07:12:23 +00:00
Chris Lattner	b8906bda13	remove a ton of unneeded LLVMContext stuff. llvm-svn: 92462	2010-01-04 07:02:48 +00:00
Chris Lattner	66c2e54bcd	move InstCombineWorklist out to its own header. llvm-svn: 92461	2010-01-04 06:30:00 +00:00
Chris Lattner	e2b9da98b0	forgot to svn add these. llvm-svn: 92460	2010-01-04 06:28:20 +00:00
Chris Lattner	c0e6640d3a	move instcombine to its own library, it's past time. llvm-svn: 92459	2010-01-04 06:23:24 +00:00
Chris Lattner	2d91231d82	implement an instcombine xform needed by clang's codegen on the example in PR4216. This doesn't trigger in the testsuite, so I'd really appreciate someone scrutinizing the logic for correctness. llvm-svn: 92458	2010-01-04 06:03:59 +00:00
Chris Lattner	48218e42cd	pull my debug hooks out, I'm done with this xform for now. llvm-svn: 92446	2010-01-03 06:58:48 +00:00
Nick Lewycky	475d3d1215	Small cleanups, refactor some duplicated code into a single method. No functionality change. llvm-svn: 92445	2010-01-03 04:39:07 +00:00
Chris Lattner	fca0c8f93a	generalize the previous transformation to handle indexing into arrays of structs and other arrays, so long as all the subsequent indexes are constants. This triggers frequently for stuff like: @divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]> [#uses=50] %623 = getelementptr inbounds [29 x [2 x i32]] @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1] %684 = icmp eq i32 %683, 999 also for the "my_defs" table in 'gs', etc. llvm-svn: 92444	2010-01-03 03:03:27 +00:00
Nick Lewycky	ff9cd7ace7	Cleanup. llvm-svn: 92436	2010-01-03 00:55:31 +00:00
Chris Lattner	98ad2b56cc	teach instcombine to optimize idioms like A[i]&42 == 0. This occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which is copied in multiple apps) in _sch_istable, etc. llvm-svn: 92427	2010-01-02 22:08:28 +00:00
Chris Lattner	b56bef45f8	Teach the table lookup optimization to generate range compares when a consequtive sequence of elements all satisfies the predicate. Like the double compare case, this generates better code than the magic constant case and generalizes to more than 32/64 element array lookups. Here are some examples where it triggers. From 403.gcc, most accesses to the rtx_class array are handled, e.g.: @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]> [#uses=547] %142 = icmp eq i8 %141, 105 @rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]> [#uses=543] %165 = icmp eq i8 %164, 60 Also, most of the 59-element arrays (mode_class/rid_to_yy, etc) optimized before are actually range compares. This lets 32-bit machines optimize them. 400.perlbmk has stuff like this: 400.perlbmk: PL_regkind, even for 32-bit: @PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]> [#uses=4] %811 = icmp ne i8 %810, 33 @PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]> [#uses=94] %12 = icmp ult i8 %10, 2 etc. llvm-svn: 92426	2010-01-02 21:50:18 +00:00
Chris Lattner	e199d2df80	theoretically the negate we find could be in a different function, check for this case. llvm-svn: 92425	2010-01-02 21:46:33 +00:00
Chris Lattner	2fa4ec70fc	use enums for the over/underdefined markers for clarity. Switch to using -2/-3 instead of -1/-2 for a future xform. llvm-svn: 92423	2010-01-02 20:20:33 +00:00
Chris Lattner	351e22aa36	remove the random sampling framework, which is not maintained anymore. If there is interest, it can be resurrected from SVN. PR4912. llvm-svn: 92422	2010-01-02 20:07:03 +00:00
Nick Lewycky	a67519be12	Fix logic error in previous commit. The != case needs to become an or, not an and. llvm-svn: 92419	2010-01-02 16:14:56 +00:00
Nick Lewycky	357d41b3c1	Optimize pointer comparison into the typesafe form, now that the backends will handle them efficiently. This is the opposite direction of the transformation we used to have here. llvm-svn: 92418	2010-01-02 15:25:44 +00:00
Chris Lattner	cfda435c73	Generalize the previous xform to handle cases where exactly two elements match or don't match with two comparisons. For example, the testcase compiles into: define i1 @test5(i32 %X) { %1 = icmp eq i32 %X, 2 ; <i1> [#uses=1] %2 = icmp eq i32 %X, 7 ; <i1> [#uses=1] %R = or i1 %1, %2 ; <i1> [#uses=1] ret i1 %R } This generalizes the previous xforms when the array is larger than 64 elements (and this case matches) and generates better code for cases where it overlaps with the magic bitshift case. This generalizes more cases than you might expect. For example, 400.perlbmk has: @PL_utf8skip = constant [256 x i8] c"\01\01\01\... %15 = icmp ult i8 %7, 7 403.gcc has: @rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ... %18 = icmp eq i16 %16, 295 and xalancbmk has a bunch of examples, such as _ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE. llvm-svn: 92417	2010-01-02 09:35:17 +00:00
Chris Lattner	c6ac078423	fix a miscompilation I introduced of cdecl with a late change. llvm-svn: 92416	2010-01-02 09:22:13 +00:00
Chris Lattner	935a4a606a	enhance the compare/load/index optimization to work on any load from a global with 32/64 elements or less (depending on whether i64 is native on the target), generating a bitshift idiom to determine the result. For example, on test4 we produce: define i1 @test4(i32 %X) { %1 = lshr i32 933, %X ; <i32> [#uses=1] %2 = and i32 %1, 1 ; <i32> [#uses=1] %R = icmp ne i32 %2, 0 ; <i1> [#uses=1] ret i1 %R } This triggers in a number of interesting cases, for example, here's an fp case: @A.3255 = internal constant [4 x double] [double 4.100000e+00, double -3.900000e+00, double -1.000000e+00, double 1.000000e+00], align 32 ; <[4 x double]> [#uses=7] ... %7 = fcmp olt double %3, 0.000000e+00 In this case we make the slen2_tab global dead, which is nice: @slen2_tab = internal constant [16 x i32] [i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 2, i32 3], align 32 ; <[16 x i32]> [#uses=1] ... %204 = icmp eq i32 %46, 0 Perl has a bunch of these, also on the 'Perl_regkind' array: @Perl_yygindex = internal constant [51 x i16] [i16 0, i16 0, i16 0, i16 0, i16 374, i16 351, i16 0, i16 -12, i16 0, i16 946, i16 413, i16 -83, i16 0, i16 0, i16 0, i16 -311, i16 -13, i16 4007, i16 2893, i16 0, i16 0, i16 0, i16 0, i16 0, i16 372, i16 -8, i16 0, i16 0, i16 246, i16 -131, i16 43, i16 86, i16 208, i16 -45, i16 -169, i16 987, i16 0, i16 0, i16 0, i16 0, i16 308, i16 0, i16 -271, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0], align 32 ; <[51 x i16]> [#uses=1] ... %1364 = icmp eq i16 %1361, 0 186.crafty really likes this on 64-bit machines, because it triggers on a bunch of globals like this: @white_outpost = internal constant [64 x i8] c"\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\02\00\00\00\00\00\04\05\05\04\00\00\00\00\03\06\06\03\00\00\00\00\00\01\01\00\00\00\00\00\00\00\00\00\00\00", align 32 ; <[64 x i8]> [#uses=2] However the big winner is 403.gcc, which triggers hundreds of times, eliminating all the accesses to the 57-element arrays 'mode_class', mode_unit_size, mode_bitsize, regclass_map, etc. go 64-bit machines :) llvm-svn: 92415	2010-01-02 08:56:52 +00:00
Chris Lattner	b1567bd584	enhance the previous optimization to work with fcmp in addition to icmp. llvm-svn: 92412	2010-01-02 08:20:51 +00:00
Chris Lattner	a061859ccc	Teach instcombine to fold compares of loads from constant arrays with variable indices into a comparison of the index with a constant. The most common occurrence of this that I see by far is stuff like: if ("foobar"[i] == '\0') ... which we compile into: if (i == 6), saving a load and materialization of the global address. This also exposes loop trip count information to later passes in many cases. This triggers hundreds of times in xalancbmk, which is where I first noticed it, but it also triggers in many other apps. Here are a few interesting ones from various apps: @must_be_connected_without = internal constant [8 x i8] [i8 getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8]> [#uses=2] %scevgep.i = getelementptr [8 x i8] @must_be_connected_without, i64 0, i64 %indvar.i ; <i8*> [#uses=1] %17 = load ... %18 = icmp eq i8 %17, null ; <i1> [#uses=1] -> icmp eq i64 %indvar.i, 7 @yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]> [#uses=2] %57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8> [#uses=1] %mode.0.in = getelementptr inbounds [9 x i32] @mb_mode_table, i64 0, i64 %.pn ; <i32> [#uses=1] load ... %64 = icmp eq i8 %58, 4 ; <i1> [#uses=1] -> icmp eq i64 %.pn, 35 ; <i1> [#uses=0] @gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767] %scevgep.i = getelementptr [4 x i16] @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1] %425 = load %scevgep.i %426 = icmp eq i16 %425, -32768 ; <i1> [#uses=0] -> false llvm-svn: 92411	2010-01-02 08:12:04 +00:00
Chris Lattner	2e4be2c340	remove the instcombine transformations that are inserting nasty pointer to int casts that confuse later optimizations. See PR3351 for details. This improves but doesn't complete fix 483.xalancbmk because llvm-gcc does this xform in GCC's "fold" routine as well. Clang++ will do better I guess. llvm-svn: 92408	2010-01-02 00:31:05 +00:00
Chris Lattner	faf1337acb	add a simple instcombine xform, simplify another one to use hasAllZeroIndices() instead of hand rolling a loop. llvm-svn: 92403	2010-01-01 23:09:08 +00:00
Chris Lattner	30c0a2833d	generalize the pointer difference optimization to handle a constantexpr gep on the 'base' side of the expression. This completes comment #4 in PR3351, which comes from 483.xalancbmk. llvm-svn: 92402	2010-01-01 22:42:29 +00:00
Chris Lattner	4394f71752	teach instcombine to optimize pointer difference idioms involving constant expressions. This is a step towards comment #4 in PR3351. llvm-svn: 92401	2010-01-01 22:29:12 +00:00
Chris Lattner	9d4c5414bb	use 'match' to simplify some code. llvm-svn: 92400	2010-01-01 22:12:03 +00:00
Chris Lattner	25c87e9cf9	implement the transform requested in PR5284 llvm-svn: 92398	2010-01-01 18:34:40 +00:00
Chris Lattner	ee1f861d81	add missing line. llvm-svn: 92384	2010-01-01 01:54:08 +00:00
Chris Lattner	8330daf733	add a few trivial instcombines for llvm.powi. llvm-svn: 92383	2010-01-01 01:52:15 +00:00
Chris Lattner	0c59ac3f41	When factoring multiply expressions across adds, factor both positive and negative forms of constants together. This allows us to compile: int foo(int x, int y) { return (x-y) + (x-y) + (x-y); } into: _foo: ## @foo subl %esi, %edi leal (%rdi,%rdi,2), %eax ret instead of (where the 3 and -3 were not factored): _foo: imull $-3, 8(%esp), %ecx imull $3, 4(%esp), %eax addl %ecx, %eax ret this started out as: movl 12(%ebp), %ecx imull $3, 8(%ebp), %eax subl %ecx, %eax subl %ecx, %eax subl %ecx, %eax ret This comes from PR5359. llvm-svn: 92381	2010-01-01 01:13:15 +00:00
Chris Lattner	a552683fd4	clean up some comments. llvm-svn: 92377	2010-01-01 00:04:26 +00:00
Chris Lattner	17229a7cb8	switch from std::map to DenseMap for rank data structures. llvm-svn: 92375	2010-01-01 00:01:34 +00:00
Chris Lattner	fed3397654	reuse negates where possible instead of always creating them from scratch. This allows us to optimize test12 into: define i32 @test12(i32 %X) { %factor = mul i32 %X, -3 ; <i32> [#uses=1] %Z = add i32 %factor, 6 ; <i32> [#uses=1] ret i32 %Z } instead of: define i32 @test12(i32 %X) { %Y = sub i32 6, %X ; <i32> [#uses=1] %C = sub i32 %Y, %X ; <i32> [#uses=1] %Z = sub i32 %C, %X ; <i32> [#uses=1] ret i32 %Z } llvm-svn: 92373	2009-12-31 20:34:32 +00:00
Chris Lattner	60c2ca743d	we don't need a smallptrset to detect duplicates, the values are sorted, so we can just do a linear scan. llvm-svn: 92372	2009-12-31 19:49:01 +00:00
Chris Lattner	1d8979422a	make reassociate more careful about not leaving around dead mul's llvm-svn: 92370	2009-12-31 19:34:45 +00:00
Chris Lattner	ed18917665	remove debug llvm-svn: 92369	2009-12-31 19:25:19 +00:00
Chris Lattner	60b71b5c4d	teach reassociate to factor x+x+x -> x*3. While I'm at it, fix RemoveDeadBinaryOp to actually do something. llvm-svn: 92368	2009-12-31 19:24:52 +00:00
Chris Lattner	38abecbad0	change reassociate to use SmallVector for its key datastructures instead of std::vector. llvm-svn: 92366	2009-12-31 18:40:32 +00:00
Chris Lattner	ac61550504	change an if to an assert, fix comment. llvm-svn: 92364	2009-12-31 18:18:46 +00:00
Chris Lattner	177140ad12	move the rest of the add optimization code out to OptimizeAdd, improve some comments, simplify a bit of code. llvm-svn: 92363	2009-12-31 18:17:13 +00:00
Chris Lattner	ba1f36aa99	factor statistic updating better. llvm-svn: 92362	2009-12-31 17:51:05 +00:00
Chris Lattner	4e3a5678af	simple fix for an incorrect factoring which causes a miscompilation, PR5458. llvm-svn: 92354	2009-12-31 08:33:49 +00:00
Chris Lattner	5f8a005d38	factor code out into helper functions. llvm-svn: 92347	2009-12-31 07:59:34 +00:00
Chris Lattner	f5c2b8b8d7	switch some std::vector's to smallvector. Reduce nesting. llvm-svn: 92346	2009-12-31 07:48:51 +00:00
Chris Lattner	9039ff8912	use more modern datastructures. llvm-svn: 92344	2009-12-31 07:33:14 +00:00
Chris Lattner	bc1512c8d1	clean up -debug output. llvm-svn: 92343	2009-12-31 07:17:37 +00:00
Chris Lattner	6a0ca6aa90	fix Analysis/DebugInfo.h to not include Metadata.h. Do this by moving one method out of line and eliminating redundant checks from other methods. llvm-svn: 92337	2009-12-31 03:02:08 +00:00
Chris Lattner	9b493028df	rename "elements" of metadata to "operands". "Elements" are things that occur in types. "operands" are things that occur in values. llvm-svn: 92322	2009-12-31 01:22:29 +00:00
Benjamin Kramer	756d7086c1	Use an array instead of a SmallVector. llvm-svn: 92264	2009-12-29 11:04:52 +00:00
Chris Lattner	22e13ba4e5	prune #includes. llvm-svn: 92260	2009-12-29 09:12:29 +00:00
Chris Lattner	a0566979b7	Final step in the metadata API restructuring: move the getMDKindID/getMDKindNames methods to LLVMContext (and add convenience methods to Module), eliminating MetadataContext. Move the state that it maintains out to LLVMContext. llvm-svn: 92259	2009-12-29 09:01:33 +00:00
Chris Lattner	6311212bf9	remove useless argument. llvm-svn: 92256	2009-12-29 08:03:58 +00:00
Chris Lattner	2f2aa2b067	This is a major cleanup of the instruction metadata interfaces that I asked Devang to do back on Sep 27. Instead of going through the MetadataContext class with methods like getMD() and getMDs(), just ask the instruction directly for its metadata with getMetadata() and getAllMetadata(). This includes a variety of other fixes and improvements: previously all Value*'s were bloated because the HasMetadata bit was thrown into value, adding a 9th bit to a byte. Now this is properly sunk down to the Instruction class (the only place where it makes sense) and it will be folded away somewhere soon. This also fixes some confusion in getMDs and its clients about whether the returned list is indexed by the MDID or densely packed. This is now returned sorted and densely packed and the comments make this clear. This introduces a number of fixme's which I'll follow up on. llvm-svn: 92235	2009-12-28 23:41:32 +00:00
Chris Lattner	17079fc0fa	split code that doesn't need to be templated out of IRBuilder into a new non-templated IRBuilderBase class. Move that large CreateGlobalString out of line, eliminating the need to #include GlobalVariable.h in IRBuilder.h llvm-svn: 92227	2009-12-28 21:28:46 +00:00
Chris Lattner	7093946ab1	rename getMDKind -> getMDKindID, make it autoinsert if an MD Kind doesn't exist already, eliminate registerMDKind. Tidy up a bunch of random stuff. llvm-svn: 92225	2009-12-28 20:45:51 +00:00
Chris Lattner	f8d22fc77d	Metadata.h doesn't need to include ValueHandle.h anymore. llvm-svn: 92211	2009-12-28 08:20:46 +00:00
Chris Lattner	1a32ede6fd	move an optimization for memcmp out of simplifylibcalls and into SDISel. This optimization was causing simplifylibcalls to introduce type-unsafe nastiness. This is the first step, I'll be expanding the memcmp optimizations shortly, covering things that we really really wouldn't want simplifylibcalls to do. llvm-svn: 92098	2009-12-24 00:37:38 +00:00
Chris Lattner	efebb234b7	reorder to follow a normal fall-through style, no functionality change. llvm-svn: 92084	2009-12-23 23:24:51 +00:00
David Greene	2330f78075	Remove dump routine and the associated Debug.h from a header. Patch up other files to compensate. llvm-svn: 92075	2009-12-23 22:58:38 +00:00
Eric Christopher	fdb33458fc	Update objectsize intrinsic and associated dependencies. Fix lowering code and update testcases. llvm-svn: 91979	2009-12-23 02:51:48 +00:00
Chris Lattner	c0f6402a94	Fix the Convert to scalar to not insert dead loads in the store case. The load is needed when we have a small store into a large alloca (at which point we get a load/insert/store sequence), but when you do a full-sized store, this load ends up being dead. This dead load is bad in really large nasty testcases where the load ends up causing mem2reg to insert large chains of dependent phi nodes which only ADCE can delete. Instead of doing this, just don't insert the dead load. This fixes rdar://6864035 llvm-svn: 91917	2009-12-22 19:33:28 +00:00
Chris Lattner	fda3b559e6	fix some fixme's by using twines llvm-svn: 91916	2009-12-22 19:23:33 +00:00
Bob Wilson	62a84ea8e3	Generalize SROA to allow the first index of a GEP to be non-zero. Add a missing check that an array reference doesn't go past the end of the array, and remove some redundant checks for in-bound array and vector references that are no longer needed. llvm-svn: 91897	2009-12-22 06:57:14 +00:00
Chris Lattner	f21a220bcd	Implement PR5795 by merging duplicated return blocks. This could go further by merging all returns in a function into a single one, but simplifycfg currently likes to duplicate the return (an unfortunate choice!) llvm-svn: 91890	2009-12-22 06:07:30 +00:00
Chris Lattner	9b7d99eb76	The phi translated pointer can be computed when returning a partially cached result instead of stored. This reduces memdep memory usage, and also eliminates a bunch of weakvh's. This speeds up gvn on gcc.c-torture/20001226-1.c from 23.9s to 8.45s (2.8x) on a different machine than earlier. llvm-svn: 91885	2009-12-22 04:25:02 +00:00
Eric Christopher	ab6a0d60d5	Whitespace fixes. llvm-svn: 91875	2009-12-22 01:23:51 +00:00
Daniel Dunbar	c661a2d4d8	Add suggested parentheses. llvm-svn: 91853	2009-12-21 23:27:57 +00:00
Chris Lattner	bf20018423	Add a fastpath to Load GVN to special case when we have exactly one dominating load to avoid even messing around with SSAUpdate at all. In this case (which is very common, we can just use the input value directly). This speeds up GVN time on gcc.c-torture/20001226-1.c from 36.4s to 16.3s, which still isn't great, but substantially better and this is a simple speedup that applies to lots of different cases. llvm-svn: 91851	2009-12-21 23:15:48 +00:00
Chris Lattner	927b0ac4b2	refactor some code out to a new helper method. llvm-svn: 91849	2009-12-21 23:04:33 +00:00
Chris Lattner	eaa25da8bb	improve indentation avoid a pointless conversion from weakvh to trackingvh, no functionality change. llvm-svn: 91848	2009-12-21 22:43:03 +00:00
Bob Wilson	88a0598fe8	Remove special-case SROA optimization of variable indexes to one-element and two-element arrays. After restructuring the SROA code, it was not safe to do this without adding more checking. It is not clear that this special-case has really been useful, and removing this simplifies the code quite a bit. llvm-svn: 91828	2009-12-21 18:39:47 +00:00
Chris Lattner	4edfcb88e1	revert r89298, which was committed without a testcase. I think the underlying PHI node insertion issue in SSAUpdate is fixed. llvm-svn: 91821	2009-12-21 07:45:57 +00:00
Chris Lattner	8fb07c5a21	fix PR5837 by having SSAUpdate reuse phi nodes for the 'GetValueInMiddleOfBlock' case, instead of inserting duplicates. A similar fix is almost certainly needed by the machine-level SSAUpdate implementation. llvm-svn: 91820	2009-12-21 07:16:11 +00:00
Chris Lattner	d4fb4296df	give instcombine some helper functions for matching MIN and MAX, and implement some optimizations for MIN(MIN()) and MAX(MAX()) and MIN(MAX()) etc. This substantially improves the code in PR5822 but doesn't kick in much elsewhere. 2 max's were optimized in pairlocalalign and one in smg2000. llvm-svn: 91814	2009-12-21 06:03:05 +00:00
Chris Lattner	ffbd02829c	enhance x-(-A) -> x+A to preserve NUW/NSW. Use the presence of NSW/NUW to fold "icmp (x+cst), x" to a constant in cases where it would otherwise be undefined behavior. Surprisingly (to me at least), this triggers hundreds of the times in a few benchmarks: lencode, ldecode, and 466.h264ref seem to really like this. llvm-svn: 91812	2009-12-21 04:04:05 +00:00
Chris Lattner	900ce231f9	Optimize all cases of "icmp (X+Cst), X" to something simpler. This triggers a bunch in lencode, ldecod, spass, 176.gcc, 252.eon, among others. It is also the first part of PR5822 llvm-svn: 91811	2009-12-21 03:19:28 +00:00
Douglas Gregor	740ab38bb7	Fix a bunch of little errors that Clang complains about when its being pedantic llvm-svn: 91764	2009-12-19 07:05:23 +00:00
Chris Lattner	4ad5eba568	fix PR5827 by disabling the phi slicing transformation in a case where instcombine would have to split a critical edge due to a phi node of an invoke. Since instcombine can't change the CFG, it has to bail out from doing the transformation. llvm-svn: 91763	2009-12-19 07:01:15 +00:00
Bob Wilson	c16811b575	Update my SROA changes in response to review. * change FindElementAndOffset to return a uint64_t instead of unsigned, and to identify the type to be used for that result in a GEP instruction. * move "isa<ConstantInt>" to be first in conditional. * replace some dyn_casts with casts. * add a comment about handling mem intrinsics. llvm-svn: 91762	2009-12-19 06:53:17 +00:00
Bob Wilson	532cd232fb	Reapply 91459 with a simple fix for the problem that broke the x86_64-darwin bootstrap. This also replaces the WeakVH references that Chris objected to with normal Value references. llvm-svn: 91711	2009-12-18 20:14:40 +00:00
Eli Friedman	86b9d75dc8	Optimize icmp of null and select of two constants even if the select has multiple uses. (The construct in question was found in gcc.) llvm-svn: 91675	2009-12-18 08:22:35 +00:00
Dan Gohman	57e808628c	Eliminte unnecessary uses of <cstdio>. llvm-svn: 91666	2009-12-18 03:25:51 +00:00
Dan Gohman	18fa5686f6	Add Loop contains utility methods for testing whether a loop contains another loop, or an instruction. The loop form is substantially more efficient on large loops than the typical code it replaces. llvm-svn: 91654	2009-12-18 01:24:09 +00:00
Dan Gohman	fd7231f1fe	Minor code simplification. llvm-svn: 91653	2009-12-18 01:20:44 +00:00
Dan Gohman	b1924e8a0f	Don't pass const pointers by reference. llvm-svn: 91647	2009-12-18 00:38:08 +00:00
Dan Gohman	1af1954852	Update a comment. llvm-svn: 91645	2009-12-18 00:28:43 +00:00
Dan Gohman	92c3696524	Reapply LoopStrengthReduce and IVUsers cleanups, excluding the part of 91296 that caused trouble -- the Processed list needs to be preserved for the livetime of the pass, as AddUsersIfInteresting is called from other passes. llvm-svn: 91641	2009-12-18 00:06:20 +00:00
Eli Friedman	250b119d98	Allow instcombine to combine "sext(a) >u const" to "a >u trunc(const)". llvm-svn: 91631	2009-12-17 22:42:29 +00:00
Eli Friedman	7cc86b4cc6	Make the ptrtoint comparison simplification work if one side is a global. llvm-svn: 91624	2009-12-17 21:27:47 +00:00
Eli Friedman	5842c9968a	Slightly generalize transformation of memmove(a,a,n) so that it also applies to memcpy. (Such a memcpy is technically illegal, but in practice is safe and is generated by struct self-assignment in C code.) llvm-svn: 91621	2009-12-17 21:07:31 +00:00
Bob Wilson	f3927b7994	Re-revert 91459. It's breaking the x86_64 darwin bootstrap. llvm-svn: 91607	2009-12-17 18:34:24 +00:00
Evan Cheng	090ac0865a	Revert 91280-91283, 91286-91289, 91291, 91293, 91295-91296. It apparently introduced a non-deterministic behavior in the optimizer somewhere. llvm-svn: 91598	2009-12-17 09:39:49 +00:00
Daniel Dunbar	ab42d42390	Reapply r91459, it was only unmasking the bug, and since TOT is still broken having it reverted does no good. llvm-svn: 91559	2009-12-16 20:09:53 +00:00
Daniel Dunbar	133efc317e	Revert "Reapply 91184 with fixes and an addition to the testcase to cover the problem", this broke llvm-gcc bootstrap for release builds on x86_64-apple-darwin10. This reverts commit db22309800b224a9f5f51baf76071d7a93ce59c9. llvm-svn: 91534	2009-12-16 10:56:17 +00:00
Chris Lattner	f278addbdc	reapply my strstr optimization. I have reproduced the x86-64 bootstrap miscompile (i386.o miscompares) but it happens both with and without this patch. llvm-svn: 91532	2009-12-16 09:32:05 +00:00
Chris Lattner	177be32334	revert my strstr optimization, I'm told it breaks x86-64 bootstrap. Will reapply with a fix when I get a chance. llvm-svn: 91486	2009-12-16 00:46:02 +00:00
Bob Wilson	e44756d7c2	Reapply 91184 with fixes and an addition to the testcase to cover the problem found last time. Instead of trying to modify the IR while iterating over it, I've change it to keep a list of WeakVH references to dead instructions, and then delete those instructions later. I also added some special case code to detect and handle the situation when both operands of a memcpy intrinsic are referencing the same alloca. llvm-svn: 91459	2009-12-15 22:00:51 +00:00
Chris Lattner	26ab363361	optimize strstr, PR5783 llvm-svn: 91438	2009-12-15 19:14:40 +00:00
Dan Gohman	265ce318b8	Delete an unused function. llvm-svn: 91432	2009-12-15 16:30:09 +00:00
Chris Lattner	24aba42d04	add some other xforms that should be done as part of PR5783 llvm-svn: 91428	2009-12-15 09:05:13 +00:00
Chris Lattner	45d040bd85	Remove isPod() from DenseMapInfo, splitting it out to its own isPodLike type trait. This is a generally useful type trait for more than just DenseMap, and we really care about whether something acts like a pod, not whether it really is a pod. llvm-svn: 91421	2009-12-15 07:26:43 +00:00
Dan Gohman	fbeec7270c	Fix a thinko; isNotAlreadyContainedIn had a built-in negative, so the condition was inverted when the code was converted to contains(). llvm-svn: 91295	2009-12-14 17:31:01 +00:00
Dan Gohman	416d5b7361	Remove unnecessary #includes. llvm-svn: 91293	2009-12-14 17:19:06 +00:00
Dan Gohman	163fb26927	Instead of having a ScalarEvolution pointer member in BasedUser, just pass the ScalarEvolution pointer into the functions which need it. llvm-svn: 91289	2009-12-14 17:12:51 +00:00
Dan Gohman	8dbd4e3d16	Don't bother cleaning up if there's nothing to clean up. llvm-svn: 91288	2009-12-14 17:10:44 +00:00
Dan Gohman	88c7e61c5b	Delete an unused variable. llvm-svn: 91287	2009-12-14 17:08:09 +00:00
Dan Gohman	838f604543	LSR itself doesn't need LoopInfo. llvm-svn: 91283	2009-12-14 17:02:34 +00:00
Dan Gohman	273e692952	LSR itself doesn't need DominatorTree. llvm-svn: 91282	2009-12-14 16:57:08 +00:00
Dan Gohman	c3513095cf	Remove the code in LSR that manually hoists expansions out of loops; SCEVExpander does this automatically. llvm-svn: 91281	2009-12-14 16:52:55 +00:00
Dan Gohman	ec2a7c58e8	Minor code cleanups. llvm-svn: 91280	2009-12-14 16:37:29 +00:00
Chris Lattner	aaa6ac10a6	revert r91184, because it causes a crash on a .bc file I just sent to Bob. llvm-svn: 91268	2009-12-14 05:11:02 +00:00
Chandler Carruth	dcf5dacb2c	Don't leave pointers uninitialized in the default constructor. GCC complains about the potential use of these uninitialized members under certain conditions. llvm-svn: 91239	2009-12-13 07:04:45 +00:00
Bob Wilson	895f364ae6	Revise scalar replacement to be more flexible about handle bitcasts and GEPs. While scanning through the uses of an alloca, keep track of the current offset relative to the start of the alloca, and check memory references to see if the offset & size correspond to a component within the alloca. This has the nice benefit of unifying much of the code from isSafeUseOfAllocation, isSafeElementUse, and isSafeUseOfBitCastedAllocation. The code to rewrite the uses of a promoted alloca, after it is determined to be safe, is reorganized in the same way. Also, when rewriting GEP instructions, mark them as "in-bounds" since all the indices are known to be safe. llvm-svn: 91184	2009-12-11 23:47:40 +00:00
Eric Christopher	22889c049d	Make sure the immediate dominator isn't NULL through iterations of the loop. We could get to this condition via indirect branches. llvm-svn: 91009	2009-12-10 00:25:41 +00:00
Chris Lattner	9ccc879006	Fix PR5744, a case where we were getting the pointer size instead of the value size. This only manifested when memdep inprecisely returns clobber, which is do to a caching issue in the PR5744 testcase. We can 'efficiently emulate' this by using '-no-aa' llvm-svn: 91004	2009-12-10 00:11:45 +00:00
Chris Lattner	3ddf804f78	allow this to build when the #if 0's are enabled. No functionality change. llvm-svn: 90999	2009-12-10 00:04:46 +00:00
Dan Gohman	72c367fb52	Dereference loopHeader after checking for null rather than before. llvm-svn: 90990	2009-12-09 22:55:01 +00:00
Chris Lattner	ca5f9cb18b	fix hte last remaining known (by me) phi translation bug. When we reanalyze clobbers to forward pieces of large stores to small loads, we need to consider the properly phi translated pointer in the store block. llvm-svn: 90978	2009-12-09 18:21:46 +00:00
Chris Lattner	f8ba1253f1	change GetStoreValueForLoad to use IRBuilder, which is cleaner and implicitly constant folds. llvm-svn: 90977	2009-12-09 18:13:28 +00:00
Bob Wilson	1c5a6fb299	Fix a comment. llvm-svn: 90975	2009-12-09 18:05:27 +00:00
Chris Lattner	07df9efb35	change AnalyzeLoadFromClobberingMemInst/AnalyzeLoadFromClobberingStore to require the load ty/ptr to be passed in, no functionality change. llvm-svn: 90960	2009-12-09 07:37:07 +00:00
Chris Lattner	0def861ee9	change AnalyzeLoadFromClobberingWrite and clients to pass in type and pointer instead of the load. No functionality change. llvm-svn: 90959	2009-12-09 07:34:10 +00:00
Chris Lattner	0c31547168	change NonLocalDepEntry from being a typedef for an std::pair to be its own small class. No functionality change. llvm-svn: 90956	2009-12-09 07:08:01 +00:00
Chris Lattner	946b58dd90	add some aborts to #if 0's. llvm-svn: 90929	2009-12-09 02:41:54 +00:00
Chris Lattner	972e6d8d00	Switch GVN and memdep to use PHITransAddr, which correctly handles phi translation of complex expressions like &A[i+1]. This has the following benefits: 1. The phi translation logic is all contained in its own class with a strong interface and verification that it is self consistent. 2. The logic is more correct than before. Previously, if intermediate expressions got PHI translated, we'd miss the update and scan for the wrong pointers in predecessor blocks. @phi_trans2 is a testcase for this. 3. We have a lot less code in memdep. We can handle phi translation across blocks of things like @phi_trans3, which is pretty insane :). This patch should fix the miscompiles of 255.vortex, and I tested it with a bootstrap of llvm-gcc, llvm-test and dejagnu of course. llvm-svn: 90926	2009-12-09 01:59:31 +00:00
Bob Wilson	c5d082fd5d	Some superficial cleanups. llvm-svn: 90866	2009-12-08 18:27:03 +00:00
Bob Wilson	2029ea04f9	Clean up dead operands left around after SROA replaces a mem intrinsic. I'm not aware that this does anything significant on its own, but it's needed for another patch that I'm working on. llvm-svn: 90864	2009-12-08 18:22:03 +00:00
Duncan Sands	6a3df7b0c7	Teach GlobalOpt to delete aliases with internal linkage (after forwarding any uses). GlobalDCE can also do this, but is only run at -O3. llvm-svn: 90850	2009-12-08 10:10:20 +00:00
Nick Lewycky	8bca014d7f	Remove unnecessary #include "llvm/LLVMContext.h". llvm-svn: 90836	2009-12-08 05:45:41 +00:00
Chris Lattner	6d6f10fe91	fix PR5698 llvm-svn: 90708	2009-12-06 17:17:23 +00:00
Chris Lattner	778cb92235	constant fold loads from memcpy's from global constants. This is important because clang lowers nontrivial automatic struct/array inits to memcpy from a global array. llvm-svn: 90698	2009-12-06 05:29:56 +00:00
Chris Lattner	93236ba327	add support for forwarding mem intrinsic values to non-local loads. llvm-svn: 90697	2009-12-06 04:54:31 +00:00
Chris Lattner	42376066eb	Handle forwarding local memsets to loads. For example, we optimize this: short x(short A) { memset(A, 1, sizeof(A)*100); return A[42]; } to 'return 257' instead of doing the load. llvm-svn: 90695	2009-12-06 01:57:02 +00:00
Nick Lewycky	a0e9d700dc	Generalize this optimization to work on equality comparisons between any two integers that are constant except for a single bit (the same n-th bit in each). llvm-svn: 90646	2009-12-05 05:00:00 +00:00
Bob Wilson	050b812fe7	Fix up some comments. llvm-svn: 90603	2009-12-04 21:57:37 +00:00
Bob Wilson	5ca37b274c	Fix 80-column violations. llvm-svn: 90601	2009-12-04 21:51:35 +00:00
Chris Lattner	2bd9609992	add an assert to make it really clear what this is doing. Return singularval as a compile time perf optimization to avoid a load. llvm-svn: 90507	2009-12-04 01:03:32 +00:00
Bob Wilson	53bdae3802	Fix a comment typo. llvm-svn: 90487	2009-12-03 21:47:07 +00:00
Owen Anderson	0b6e260066	Fix this crasher, and add a FIXME for a missed optimization. llvm-svn: 90408	2009-12-03 03:43:29 +00:00
Chris Lattner	a48f44d9ee	improve portability to avoid conflicting with std::next in c++'0x. Patch by Howard Hinnant! llvm-svn: 90365	2009-12-03 00:50:42 +00:00
Jim Grosbach	d831ef4945	Move EliminateDuplicatePHINodes() from SimplifyCFG.cpp to Local.cpp llvm-svn: 90324	2009-12-02 17:06:45 +00:00
Andreas Neustifter	3d207290fe	Cheap, mostly strict, stable sorting. This is necessary for tests so the results are comparable. llvm-svn: 90320	2009-12-02 15:57:15 +00:00
Owen Anderson	b9878ee6b6	Cleanup/remove some parts of the lifetime region handling code in memdep and GVN, per Chris' comments. Adjust testcases to match. llvm-svn: 90304	2009-12-02 07:35:19 +00:00
Chris Lattner	c468025ac9	factor some code better. llvm-svn: 90299	2009-12-02 06:44:58 +00:00
Chris Lattner	2764b4dc55	formatting cleanups. llvm-svn: 90298	2009-12-02 06:35:55 +00:00
Chris Lattner	eea42c7b51	tidy up, remove dependence on order of evaluation of function args from EmitMemCpy. llvm-svn: 90297	2009-12-02 06:05:42 +00:00
Chris Lattner	3c9aca9079	fix PR5640 by tracking whether a block is the header of a loop more precisely, which prevents us from infinitely peeling the loop. llvm-svn: 90211	2009-12-01 06:04:43 +00:00
Benjamin Kramer	3efc050ac4	Revert r90089 for now, it's breaking selfhost. llvm-svn: 90097	2009-11-29 21:17:48 +00:00
Benjamin Kramer	bfa993ab20	Fix two FIXMEs. llvm-svn: 90089	2009-11-29 20:29:30 +00:00
Chris Lattner	1cc4cca193	add testcases for the foo_with_overflow op xforms added recently and fix bugs exposed by the tests. Testcases from Alastair Lynn! llvm-svn: 90056	2009-11-29 02:57:29 +00:00
Chris Lattner	cd261c9c26	Implement PR5634. llvm-svn: 90046	2009-11-29 00:51:17 +00:00
Chris Lattner	32140312ca	reenable load address insertion in load pre. This allows us to handle cases like this: void test(int N, double* G) { long j; for (j = 1; j < N - 1; j++) G[j+1] = G[j] + G[j+1]; } where G[1] isn't live into the loop. llvm-svn: 90041	2009-11-28 16:08:18 +00:00
Chris Lattner	44da5bd837	Enhance InsertPHITranslatedPointer to be able to return a list of newly inserted instructions. No functionality change until someone starts using it. llvm-svn: 90039	2009-11-28 15:39:14 +00:00
Chris Lattner	cf0b198827	disable value insertion for now, I need to figure out how to inform GVN about the newly inserted values. This fixes PR5631. llvm-svn: 90022	2009-11-27 22:50:07 +00:00
Chris Lattner	2be52e72ae	Rework InsertPHITranslatedPointer to handle the recursive case, this fixes PR5630 and sets the stage for the next phase of goodness (testcase pending). llvm-svn: 90019	2009-11-27 22:05:15 +00:00
Chris Lattner	3d9823b9cf	factor some logic out of instcombine into a new SimplifyAddInst method. llvm-svn: 90011	2009-11-27 17:42:22 +00:00
Chris Lattner	2226db66ab	fix PR5436 by making the 'simple' case of SRoA not promote out of range array indexes. The "complex" case of SRoA still handles them, and correctly. This fixes a weirdness where we'd correctly avoid transforming A[0][42] if the 42 was too large, but we'd only do it if it was one gep, not two separate ones. llvm-svn: 90007	2009-11-27 16:37:41 +00:00
Chris Lattner	25be93dfed	teach GVN's load PRE to insert computations of the address in predecessors where it is not available. It's unclear how to get this inserted computation into GVN's scalar availability sets, Owen, help? :) llvm-svn: 89997	2009-11-27 08:25:10 +00:00
Chris Lattner	a9a76ccf56	Fix phi translation in load PRE to agree with the phi translation done by memdep, and reenable gep translation again. llvm-svn: 89992	2009-11-27 06:31:14 +00:00
Chris Lattner	8574aba4ea	factor some instcombine simplifications for getelementptr out to a new SimplifyGEPInst method in InstructionSimplify.h. No functionality change. llvm-svn: 89980	2009-11-27 00:29:05 +00:00
Chris Lattner	a5bc618a91	fix crash on Transforms/InstCombine/intrinsics.ll introduced by r89970 llvm-svn: 89972	2009-11-26 22:08:06 +00:00
Chris Lattner	a73ecf0b00	Fix PR5471 by removing an instcombine xform. Some pieces of the code generates store to undef and some generates store to null as the idiom for undefined behavior. Since simplifycfg zaps both, don't remove the undefined behavior in instcombine. llvm-svn: 89971	2009-11-26 22:04:42 +00:00
Chris Lattner	5b83ba215d	implement a bunch of xforms for overflow intrinsics, based on a patch by Alastair Lynn. llvm-svn: 89970	2009-11-26 21:42:47 +00:00
Edward O'Callaghan	2b8fed15e0	Reverting patch in revision 89758, initial attempt at fixing PR5373 has proven to be bogus. llvm-svn: 89844	2009-11-25 05:38:41 +00:00
Edward O'Callaghan	5fd452d596	Fix for PR5373, Credit to Jakub Staszak. llvm-svn: 89758	2009-11-24 11:51:52 +00:00
Dan Gohman	580b80d6d9	Make ConstantFoldConstantExpression recursively visit the entire ConstantExpr, not just the top-level operator. This allows it to fold many more constants. Also, make GlobalOpt call ConstantFoldConstantExpression on GlobalVariable initializers. llvm-svn: 89659	2009-11-23 16:22:21 +00:00
Dan Gohman	1f522d98f8	Fix a use of an invalidated iterator in the case where there are multiple adjacent uses of a dead basic block from the same user. This fixes PR5596. llvm-svn: 89658	2009-11-23 16:13:39 +00:00
Nick Lewycky	15a1287c1f	Pull LLVMContext out of PromoteMemToReg. llvm-svn: 89645	2009-11-23 03:50:44 +00:00
Nick Lewycky	621fe5614e	Remove LLVMContext and its include. llvm-svn: 89644	2009-11-23 03:34:29 +00:00
Nick Lewycky	39dbfd3c58	Remove unused LLVMContext. llvm-svn: 89642	2009-11-23 03:29:18 +00:00
Nick Lewycky	922d4ab574	Reapply r88830 with a bugfix: this transform only applies to icmp eq/ne. This fixes part of PR5438. llvm-svn: 89639	2009-11-23 03:17:33 +00:00
Eric Christopher	0c7bd96de2	Add more optimizations for object size checking, enable handling of object size intrinsic and verify return type is correct. Collect various code in one place. llvm-svn: 89523	2009-11-21 01:01:30 +00:00
Dan Gohman	fbffe63528	Make Loop::getLoopLatch() work on loops which don't have preheaders, as it may be used in contexts where preheader insertion may have failed due to an indirectbr. Make LoopSimplify's LoopSimplify::SeparateNestedLoop properly fail in the case that it would require splitting an indirectbr edge. These fix PR5502. llvm-svn: 89484	2009-11-20 20:51:18 +00:00
Dan Gohman	d15302afa0	Fix IPSCCP's code for deleting dead blocks to tolerate outstanding blockaddress users. This fixes PR5569. llvm-svn: 89483	2009-11-20 20:19:14 +00:00
Daniel Dunbar	f87c75706f	Revert "Add some rough optimizations for checking routines.", it buildeth not. llvm-svn: 89482	2009-11-20 20:17:30 +00:00
Eric Christopher	cf97d01dff	Add some rough optimizations for checking routines. llvm-svn: 89479	2009-11-20 19:57:37 +00:00
Duncan Sands	9e26aac773	Fix PR5563, an expensive checks failure when running on tests/Transforms/InstCombine/shufflemask-undef.ll. If anyone cares, the use of 2*e here (and the equivalent all over the place in instcombine) seems wrong, though harmless: it should really be twice the length of the input vector. I think shufflevector used to require that the mask have the same length as the input, but I don't think that's true any more. I don't care enough about vectors to do anything about this... llvm-svn: 89456	2009-11-20 13:19:51 +00:00
Dan Gohman	94e617627d	Extend CaptureTracking to indicate when a value is never stored, even if it is not ultimately captured. Teach BasicAliasAnalysis that a local object address which does not escape and is never stored does not alias with a value resulting from a load. llvm-svn: 89398	2009-11-19 21:57:48 +00:00
Dan Gohman	cbc6ebb6fd	Enable hoisting of loads from constant memory by default. In cases where they are lowered to instruction sequences more complex than a simple load, such that CodeGen cannot rematerialize them, a reload from a spill slot is likely to be cheaper than the complex sequence. llvm-svn: 89374	2009-11-19 19:00:10 +00:00
Jim Grosbach	dcef55b2ef	Eliminate duplicate phi nodes in loops. Loop rotation, for example, can introduce these, and it's beneficial to later passes to clean them up. llvm-svn: 89298	2009-11-19 02:03:18 +00:00
Jim Grosbach	cc69a1ba9a	Make EliminateDuplicatePHINodes() available as a utility function llvm-svn: 89297	2009-11-19 02:02:10 +00:00
Jim Grosbach	6bf5305f5d	grammar llvm-svn: 89145	2009-11-17 21:37:04 +00:00
Jim Grosbach	e4e018ae67	80-column violations llvm-svn: 89123	2009-11-17 19:05:35 +00:00
Evan Cheng	ba4e5da727	Generalize OptimizeLoopTermCond to optimize more loop terminating icmp to use postinc iv. llvm-svn: 89116	2009-11-17 18:10:11 +00:00
Jim Grosbach	60f4854c76	Remove trailing whitespace llvm-svn: 89110	2009-11-17 17:53:56 +00:00
Devang Patel	12144a2348	Remove debug info attached with an instruction. llvm-svn: 89016	2009-11-17 00:47:06 +00:00
David Greene	a3ce7828b2	Fix an expensive-checks error. The Mask and LHSMask may not be of the same size, so don't do the transformation if they're different. llvm-svn: 88972	2009-11-16 21:52:23 +00:00
Duncan Sands	e5de4a9ad6	CreateIntCast takes an "isSigned" parameter. Pass "true" for it, rather than a name. llvm-svn: 88908	2009-11-16 12:32:28 +00:00
Chris Lattner	9d9812a636	make PRE of loads preserve the alignment of the moved load instruction. llvm-svn: 88865	2009-11-15 19:58:31 +00:00
Chris Lattner	5f037b6439	fix a bug handling 'not x' when x is undef. llvm-svn: 88864	2009-11-15 19:57:43 +00:00
Nick Lewycky	95148689c9	Revert r88830 and r88831 which appear to have caused a selfhost buildbot some grief. I suspect this patch merely exposed a bug else. llvm-svn: 88841	2009-11-15 07:47:32 +00:00
Nick Lewycky	e29fa4c7a1	Teach instcombine to look for booleans in wider integers when it encounters a zext(icmp). It may be able to optimize that away. This fixes one of the cases in PR5438. llvm-svn: 88830	2009-11-15 05:55:17 +00:00
Nick Lewycky	7935bcb0fe	Remove LLVMContext from reassociate. It was threaded through every function but ultimately never used. llvm-svn: 88763	2009-11-14 07:25:54 +00:00
Dan Gohman	81132465d3	Add an option for running GVN with redundant load processing disabled. llvm-svn: 88742	2009-11-14 02:27:51 +00:00
Owen Anderson	e96b2111b1	Re-enable this code, since redundant PHIs are now being better nuked. llvm-svn: 87042	2009-11-12 23:22:41 +00:00
Chris Lattner	5c89f4b4ef	use isInstructionTriviallyDead, as pointed out by Duncan llvm-svn: 87035	2009-11-12 21:58:18 +00:00
Chris Lattner	eb9acbfb05	implement a nice little efficiency hack in the inliner. Since we're now running IPSCCP early, and we run functionattrs interlaced with the inliner, we often (particularly for small or noop functions) completely propagate all of the information about a call to its call site in IPSSCP (making a call dead) and functionattrs is smart enough to realize that the function is readonly (because it is interlaced with inliner). To improve compile time and make the inliner threshold more accurate, realize that we don't have to inline dead readonly function calls. Instead, just delete the call. This happens all the time for C++ codes, here are some counters from opt/llvm-ld counting the number of times calls were deleted vs inlined on various apps: Tramp3d opt: 5033 inline - Number of call sites deleted, not inlined 24596 inline - Number of functions inlined llvm-ld: 667 inline - Number of functions deleted because all callers found 699 inline - Number of functions inlined 483.xalancbmk opt: 8096 inline - Number of call sites deleted, not inlined 62528 inline - Number of functions inlined llvm-ld: 217 inline - Number of allocas merged together 2158 inline - Number of functions inlined 471.omnetpp: 331 inline - Number of call sites deleted, not inlined 8981 inline - Number of functions inlined llvm-ld: 171 inline - Number of functions deleted because all callers found 629 inline - Number of functions inlined Deleting a call is much faster than inlining it, and is insensitive to the size of the callee. :) llvm-svn: 86975	2009-11-12 07:56:08 +00:00
Evan Cheng	85a9f430e9	- Teach LSR to avoid changing cmp iv stride if it will create an immediate that cannot be folded into target cmp instruction. - Avoid a phase ordering issue where early cmp optimization would prevent the later count-to-zero optimization. - Add missing checks which could cause LSR to reuse stride that does not have users. - Fix a bug in count-to-zero optimization code which failed to find the pre-inc iv's phi node. - Remove, tighten, loosen some incorrect checks disable valid transformations. - Quite a bit of code clean up. llvm-svn: 86969	2009-11-12 07:35:05 +00:00
Chris Lattner	5f6b8b2bcb	use getPredicateOnEdge to fold comparisons through PHI nodes, which implements GCC PR18046. This also gets us 360 more jump threads on 176.gcc. llvm-svn: 86953	2009-11-12 05:24:05 +00:00
Chris Lattner	22db4b5e0c	various fixes to the lattice transfer functions. llvm-svn: 86952	2009-11-12 04:57:13 +00:00
Chris Lattner	c893c4ed10	switch jump threading to use getPredicateOnEdge in one place making the new LVI stuff smart enough to subsume some special cases in the old code. Disable them when LVI is around, the testcase still passes. llvm-svn: 86951	2009-11-12 04:37:50 +00:00
Daniel Dunbar	11881e2283	Add the braces gcc suggested. llvm-svn: 86933	2009-11-12 02:52:56 +00:00
Chris Lattner	ba45616958	with the new code we can thread non-instruction values. This allows us to handle the test10 testcase. llvm-svn: 86924	2009-11-12 01:41:34 +00:00
Chris Lattner	3f80d85191	this argument can be an arbitrary value, it doesn't need to be an instruction. llvm-svn: 86923	2009-11-12 01:37:43 +00:00
Chris Lattner	d5e25436a1	expose edge information and switch j-t to use it. llvm-svn: 86920	2009-11-12 01:29:10 +00:00
Chris Lattner	67146695b6	pass TD into a SimplifyCmpInst call. Add another case that uses LVI info when -enable-jump-threading-lvi is passed. llvm-svn: 86886	2009-11-11 22:31:38 +00:00
Duncan Sands	ba61fed5d3	Don't trivially delete unused calls to llvm.invariant.start. This allows llvm.invariant.start to be used without necessarily being paired with a call to llvm.invariant.end. If you run the entire optimization pipeline then such calls are in fact deleted (adce does it), but that's actually a good thing since we probably do want them to be zapped late in the game. There should really be an integration test that checks that the llvm.invariant.start call lasts long enough that all passes that do interesting things with it get to do their stuff before it is deleted. But since no passes do anything interesting with it yet this will have to wait for later. llvm-svn: 86840	2009-11-11 15:34:13 +00:00
Chris Lattner	852f2653c4	remove the now dead condprop pass, PR3906. llvm-svn: 86810	2009-11-11 05:56:35 +00:00
Chris Lattner	fde1f8d0d8	stub out some LazyValueInfo interfaces, and have JumpThreading start using them in a trivial way when -enable-jump-threading-lvi is passed. enable-jump-threading-lvi will be my playground for awhile. llvm-svn: 86789	2009-11-11 02:08:33 +00:00
Chris Lattner	3a2ae908fe	add a fixme llvm-svn: 86766	2009-11-11 00:21:58 +00:00
Evan Cheng	12f146d8f7	Block terminator may be a switch. llvm-svn: 86761	2009-11-11 00:00:21 +00:00
Devang Patel	f6eeaebd76	Implement support to debug inlined functions. llvm-svn: 86748	2009-11-10 23:06:00 +00:00
Chris Lattner	9518fbb54e	implement a TODO by teaching jump threading about "xor x, 1". llvm-svn: 86739	2009-11-10 22:39:16 +00:00
Chris Lattner	852d6d64ff	move some generally useful functions out of jump threading into libanalysis and transformutils. llvm-svn: 86735	2009-11-10 22:26:15 +00:00
Chris Lattner	02e2cee7dc	fix a crash in SCCP handling extractvalue of an array, pointed out and tracked down by Stephan Reiter! llvm-svn: 86726	2009-11-10 22:02:09 +00:00
Chris Lattner	40b15f220d	improve comment. llvm-svn: 86723	2009-11-10 21:45:09 +00:00
Chris Lattner	80e7e5a429	Make jump threading eliminate blocks that just contain phi nodes, debug intrinsics, and an unconditional branch when possible. This reuses the TryToSimplifyUncondBranchFromEmptyBlock function split out of simplifycfg. llvm-svn: 86722	2009-11-10 21:40:01 +00:00
Evan Cheng	87fe40b32d	Generalize lsr code that optimize loop to count down towards zero. llvm-svn: 86715	2009-11-10 21:14:05 +00:00
Duncan Sands	23344095de	Add defensive break. llvm-svn: 86705	2009-11-10 19:36:40 +00:00
Duncan Sands	8d4cde2b55	Fix obvious typo. llvm-svn: 86694	2009-11-10 18:21:37 +00:00
Chris Lattner	b8f79ba10e	clarify logic. llvm-svn: 86689	2009-11-10 17:00:47 +00:00
Duncan Sands	1925d3a1d1	Teach DSE to eliminate useless trampolines. llvm-svn: 86683	2009-11-10 13:49:50 +00:00
Duncan Sands	04e0c95248	Add brackets to make gcc-4.4 happy. llvm-svn: 86681	2009-11-10 09:32:10 +00:00
Victor Hernandez	fcc77b1c02	Update computeArraySize() to use ComputeMultiple() to determine the array size associated with a malloc; also extend PerformHeapAllocSRoA() to check if the optimized malloc's arg had its highest bit set, so that it is safe for ComputeMultiple() to look through sext instructions while determining the optimized malloc's array size llvm-svn: 86676	2009-11-10 08:32:25 +00:00
Chris Lattner	1559bedcc7	unify the code that determines whether it is a good idea to change the type of a computation. This fixes some infinite loops when dealing with TD that has no native types. llvm-svn: 86670	2009-11-10 07:23:37 +00:00
Nick Lewycky	5b3def9b86	Simplify. llvm-svn: 86668	2009-11-10 07:00:43 +00:00
Nick Lewycky	9027147fb1	Reapply r86359, "Teach dead store elimination that certain intrinsics write to memory just like a store" with bug fixed (partial-overwrite.ll is the regression test). llvm-svn: 86667	2009-11-10 06:46:40 +00:00
Chris Lattner	cbd18fc93d	refactor TryToSimplifyUncondBranchFromEmptyBlock out of SimplifyCFG. llvm-svn: 86666	2009-11-10 05:59:26 +00:00
Oscar Fuentes	bbc1067001	CMake: Support for building llvm loadable modules. llvm-svn: 86656	2009-11-10 02:45:37 +00:00
Chris Lattner	38c44ea6b0	make jump threading recursively simplify expressions instead of doing it just one level deep. On the testcase we go from getting this: F1: ; preds = %T2 %F = and i1 true, %cond ; <i1> [#uses=1] br i1 %F, label %X, label %Y to a fully threaded: F1: ; preds = %T2 br label %Y This changes gets us to the point where we're forming (too many) switch instructions on doug's strswitch testcase. llvm-svn: 86646	2009-11-10 01:57:31 +00:00
Chris Lattner	be11db6894	don't invalidate PN, rewrite of this code is in progress anyway. llvm-svn: 86639	2009-11-10 01:19:06 +00:00
Chris Lattner	fb7f87d5a3	add a new SimplifyInstruction API, which is like ConstantFoldInstruction, except that the result may not be a constant. Switch jump threading to use it so that it gets things like (X & 0) -> 0, which occur when phi preds are deleted and the remaining phi pred was a zero. llvm-svn: 86637	2009-11-10 01:08:51 +00:00
Jeffrey Yasskin	b40d3f76a0	Fix DenseMap iterator constness. This patch forbids implicit conversion of DenseMap::const_iterator to DenseMap::iterator which was possible because DenseMapIterator inherited (publicly) from DenseMapConstIterator. Conversion the other way around is now allowed as one may expect. The template DenseMapConstIterator is removed and the template parameter IsConst which specifies whether the iterator is constant is added to DenseMapIterator. Actually IsConst parameter is not necessary since the constness can be determined from KeyT but this is not relevant to the fix and can be addressed later. Patch by Victor Zverovich! llvm-svn: 86636	2009-11-10 01:02:17 +00:00
Chris Lattner	a71e9d61be	factor simplification logic for AND and OR out to InstSimplify from instcombine. llvm-svn: 86635	2009-11-10 00:55:12 +00:00
Chris Lattner	ccfdceb22c	pull a bunch of logic out of instcombine into instsimplify for compare simplification, this handles the foldable fcmp x,x cases among many others. llvm-svn: 86627	2009-11-09 23:55:12 +00:00
Chris Lattner	beadc6e8c7	inline a simple function. llvm-svn: 86625	2009-11-09 23:31:49 +00:00
Chris Lattner	c1f19071f8	rename SimplifyCompare -> SimplifyCmpInst and split it into Simplify[IF]Cmp pieces. Add some predicates to CmpInst to determine whether a predicate is fp or int. llvm-svn: 86624	2009-11-09 23:28:39 +00:00
Chris Lattner	cdfb80de16	fix ConstantFoldCompareInstOperands to take the LHS/RHS as individual operands instead of taking a temporary array llvm-svn: 86619	2009-11-09 23:06:58 +00:00
Chris Lattner	800aad3dda	use instructionsimplify instead of a weak clone of ad-hoc folding stuff. llvm-svn: 86616	2009-11-09 23:00:14 +00:00
Chris Lattner	2978ca7b79	stub out a new form of BasicBlock::RemovePredecessorAndSimplify which simplifies instruction users of PHIs when the phi is eliminated. This will be moved to transforms/utils after some other refactoring. llvm-svn: 86603	2009-11-09 22:32:36 +00:00
Dan Gohman	f324dd65f8	Fix a comment in a typo that Duncan noticed. llvm-svn: 86575	2009-11-09 18:59:22 +00:00
Dan Gohman	c146c78060	Generalize LCSSA to handle loops with exits with predecessors outside the loop. This is needed because with indirectbr it may not be possible for LoopSimplify to guarantee that all loop exit predecessors are inside the loop. This fixes PR5437. LCCSA no longer actually requires LoopSimplify form, but for now it must still have the dependency because the PassManager doesn't know how to schedule LoopSimplify otherwise. llvm-svn: 86569	2009-11-09 18:28:24 +00:00
Chris Lattner	39c07b2eef	if a 'with overflow' intrinsic just has the normal result used, simplify it to a normal binop. Patch by Alastair Lynn, testcase by me. llvm-svn: 86524	2009-11-09 07:07:56 +00:00
Chris Lattner	feeabde753	fix PR5104: when printing a single character, return the result of putchar in case there is an error. llvm-svn: 86515	2009-11-09 04:57:04 +00:00
Chris Lattner	0685be3441	enhance PHI slicing to handle the case when a slicable PHI is begin used by a chain of other PHIs. llvm-svn: 86503	2009-11-09 01:38:00 +00:00
Owen Anderson	939ea35244	Small cleanups. llvm-svn: 86499	2009-11-09 00:48:15 +00:00
Owen Anderson	73fc616838	Revert my previous patch to ABCD and fix things the right way. There are two problems addressed here: 1) We need to avoid processing sigma nodes as phi nodes for constraint generation. 2) We need to generate constraints for comparisons against constants properly. This includes our first working ABCD test! llvm-svn: 86498	2009-11-09 00:44:44 +00:00
Chris Lattner	ea465e221e	comment typos pointed out by Duncan llvm-svn: 86497	2009-11-09 00:41:49 +00:00
Owen Anderson	058088f219	Fix an issue where the ordering of blocks within a function could lead to different constraint graphs being produced. The cause was that we were incorrectly marking sigma instructions as processed after handling the sigma-specific constraints for them, potentially neglecting to process them as normal instructions as well. Unfortunately, the testcase that inspired this still doesn't work because of a bug in the solver, which is next on the list to debug. llvm-svn: 86486	2009-11-08 22:36:55 +00:00
Chris Lattner	2299d4b6d8	Teach an instcombine to not pull trunc instructions through PHI nodes when both the source and dest are illegal types, since it would cause the phi to grow (for example, we shouldn't transform test14b's phi to a phi on i320). This fixes an infinite loop on i686 bootstrap with phi slicing turned on, so turn it back on. llvm-svn: 86483	2009-11-08 21:20:06 +00:00
Chris Lattner	a837e4db6b	reapply r8644[3-5] with only the scary part (SliceUpIllegalIntegerPHI) disabled. llvm-svn: 86480	2009-11-08 19:23:30 +00:00
Daniel Dunbar	4c41373c56	Speculatively revert r8644[3-5], they seem to be leading to infinite loops in llvm-gcc bootstrap. llvm-svn: 86478	2009-11-08 17:52:47 +00:00
Chris Lattner	c7a450b5b2	teach a couple of instcombine transformations involving PHIs to not turn a PHI in a legal type into a PHI of an illegal type, and add a new optimization that breaks up insane integer PHI nodes into small pieces (PR3451). llvm-svn: 86443	2009-11-08 08:21:13 +00:00
Nick Lewycky	b9397262b7	Improve tail call elimination to handle the switch statement. llvm-svn: 86403	2009-11-07 21:10:15 +00:00
Chris Lattner	c77d24b792	make instcombine only rewrite a chain of computation (eliminating some extends) if the new type of the computation is legal or if both the source and dest are illegal. This prevents instcombine from changing big chains of computation into i64 on 32-bit targets for example. llvm-svn: 86398	2009-11-07 19:11:46 +00:00
Chris Lattner	431000da21	Revert r86359, it is breaking the self host on the llvm-gcc-i386-darwin9 build bot. llvm-svn: 86391	2009-11-07 17:59:32 +00:00
Nick Lewycky	b6a3dd48f4	Teach dead store elimination that certain intrinsics write to memory just like a store. llvm-svn: 86359	2009-11-07 08:34:40 +00:00
Chris Lattner	5ff7f5672e	reapply 86289, 86278, 86270, 86267, 86266 & 86264 plus a fix (making pred factoring only happen if threading is guaranteed to be successful). This now survives an X86-64 bootstrap of llvm-gcc. llvm-svn: 86355	2009-11-07 08:05:03 +00:00
Nick Lewycky	9b669b3c4f	Oops, FunctionContainsEscapingAllocas is really used to mean two different things. Back out part of r86349 for a moment. llvm-svn: 86353	2009-11-07 07:42:38 +00:00
Nick Lewycky	5091272fdf	Dust off tail recursion elimination. Fix a fixme by applying CaptureTracking and add a .ll to demo the new capability. llvm-svn: 86349	2009-11-07 07:10:01 +00:00
Devang Patel	3a42e7ac65	Revert following patches to fix llvmgcc bootstrap. 86289, 86278, 86270, 86267, 86266 & 86264 Chris, please take a look. llvm-svn: 86321	2009-11-07 01:32:59 +00:00
Victor Hernandez	bde558c536	- new SROA mallocs should have the mallocs running-or'ed, not the malloc's bitcast - fix ProcessInternalGlobal() debug output llvm-svn: 86317	2009-11-07 00:41:19 +00:00
Jeffrey Yasskin	8f77e948e5	Avoid "ambiguous 'else'" warning from gcc. llvm-svn: 86314	2009-11-07 00:26:47 +00:00
Victor Hernandez	f3db915294	Re-commit r86077 now that r86290 fixes the 179.art and 175.vpr ARM regressions. Here is the original commit message: This commit updates malloc optimizations to operate on malloc calls that have constant int size arguments. Update CreateMalloc so that its callers specify the size to allocate: MallocInst-autoupgrade users use non-TargetData-computed allocation sizes. Optimization uses use TargetData to compute the allocation size. Now that malloc calls can have constant sizes, update isArrayMallocHelper() to use TargetData to determine the size of the malloced type and the size of malloced arrays. Extend getMallocType() to support malloc calls that have non-bitcast uses. Update OptimizeGlobalAddressOfMalloc() to optimize malloc calls that have non-bitcast uses. The bitcast use of a malloc call has to be treated specially here because the uses of the bitcast need to be replaced and the bitcast needs to be erased (just like the malloc call) for OptimizeGlobalAddressOfMalloc() to work correctly. Update PerformHeapAllocSRoA() to optimize malloc calls that have non-bitcast uses. The bitcast use of the malloc is not handled specially here because ReplaceUsesOfMallocWithGlobal replaces through the bitcast use. Update OptimizeOnceStoredGlobal() to not care about the malloc calls' bitcast use. Update all globalopt malloc tests to not rely on autoupgraded-MallocInsts, but instead use explicit malloc calls with correct allocation sizes. llvm-svn: 86311	2009-11-07 00:16:28 +00:00
Chris Lattner	eb690feaef	Fix a bug where we'd call SplitBlockPredecessors with a pred in the set only once even if it has multiple edges to BB. llvm-svn: 86299	2009-11-06 23:19:58 +00:00
Eli Friedman	a70917b2f4	Remove function left over from other jump threading cleanup. llvm-svn: 86289	2009-11-06 21:24:57 +00:00
Chris Lattner	a8b9ce3f07	Fix a problem discovered on self host. llvm-svn: 86278	2009-11-06 19:21:48 +00:00
Chris Lattner	d91a7960bf	remove more code subsumed by r86264 llvm-svn: 86270	2009-11-06 18:24:32 +00:00
Chris Lattner	899ef22acb	eliminate some more code subsumed by r86264 llvm-svn: 86267	2009-11-06 18:22:54 +00:00
Chris Lattner	2f6184f6aa	remove now redundant code, r86264 handles this case. llvm-svn: 86266	2009-11-06 18:20:58 +00:00
Chris Lattner	68d2417e05	Extend jump threading to support much more general threading predicates. This allows us to jump thread things like: _ZN12StringSwitchI5ColorE4CaseILj7EEERS1_RAT__KcRKS0_.exit119: %tmp1.i24166 = phi i8 [ 1, %bb5.i117 ], [ %tmp1.i24165, %_Z....exit ], [ %tmp1.i24165, %bb4.i114 ] %toBoolnot.i87 = icmp eq i8 %tmp1.i24166, 0 ; <i1> [#uses=1] %tmp4.i90 = icmp eq i32 %tmp2.i, 6 ; <i1> [#uses=1] %or.cond173 = and i1 %toBoolnot.i87, %tmp4.i90 ; <i1> [#uses=1] br i1 %or.cond173, label %bb4.i96, label %_ZN12... Where it is "obvious" that when coming from %bb5.i117 that the 'and' is always false. This triggers a surprisingly high number of times in the testsuite, and gets us closer to generating good code for doug's strswitch testcase. This also make a bunch of other code in jump threading redundant, I'll rip out in the next patch. This survived an enable-checking llvm-gcc bootstrap. llvm-svn: 86264	2009-11-06 18:15:14 +00:00
Chris Lattner	8c12bb8cd7	remove some more Context arguments. llvm-svn: 86235	2009-11-06 05:59:53 +00:00
Chris Lattner	46b5c642b9	remove a bunch of extraneous LLVMContext arguments from various APIs, addressing PR5325. llvm-svn: 86231	2009-11-06 04:27:31 +00:00
Victor Hernandez	b9f5899779	Revert r86077 because it caused crashes in 179.art and 175.vpr on ARM llvm-svn: 86213	2009-11-06 01:33:24 +00:00
Dan Gohman	a1bf0c0acc	Teach LSR to avoid calling SplitCriticalEdge on edges with indirectbr. llvm-svn: 86193	2009-11-05 23:34:59 +00:00
Dan Gohman	928068a886	Avoid calling getUniqueExitBlocks from within LoopSimplify, as it depends on loops having dedicated exits, which LoopSimplify can no longer always guarantee. llvm-svn: 86181	2009-11-05 21:48:32 +00:00
Dan Gohman	dca7ac335b	LoopDeletion depends on loops having dedicated exits. llvm-svn: 86180	2009-11-05 21:47:04 +00:00
Dan Gohman	1ef784db67	The introduction of indirectbr meant the introduction of unsplittable critical edges, which means the introduction of loops which cannot be transformed to LoopSimplify form. Fix LoopSimplify to avoid transforming such loops into invalid code. llvm-svn: 86176	2009-11-05 21:14:46 +00:00
Dan Gohman	a83ac2d9e7	Update various Loop optimization passes to cope with the possibility that LoopSimplify form may not be available. llvm-svn: 86175	2009-11-05 21:11:53 +00:00
Dan Gohman	415c64ea3f	Teach LoopUnroll how to bail if LoopSimplify can't give it what it needs. llvm-svn: 86164	2009-11-05 19:44:06 +00:00
Dan Gohman	d9fa1c9c1e	Call getAnalysis<LoopInfo> the normal way, instead of asking passed-in LoopPassManager for it. llvm-svn: 86163	2009-11-05 19:43:25 +00:00
Dan Gohman	885c46e387	Delete an unused member variable. llvm-svn: 86160	2009-11-05 19:33:15 +00:00
Dan Gohman	00c793822e	Add an assertion to catch indirectbr in SplitBlockPredecessors. This makes several optimization passes abort in cases where they're currently silently miscompiling code. Remove the indirectbr assertion from SplitEdge. Indirectbr is only a problem for critical edges, and SplitEdge defers to SplitCriticalEdge to handle those, and SplitCriticalEdge has its own assertion for indirectbr. llvm-svn: 86147	2009-11-05 18:25:44 +00:00
Benjamin Kramer	b971445ab7	Teach SimplifyLibCalls to fold memcmp calls with constant arguments. llvm-svn: 86141	2009-11-05 17:44:22 +00:00
Benjamin Kramer	3fcbb82151	Do map insert+find in one step. TODO -= 2. llvm-svn: 86133	2009-11-05 14:33:27 +00:00
Victor Hernandez	492ed30a32	Update CreateMalloc so that its callers specify the size to allocate: MallocInst-autoupgrade users use non-TargetData-computed allocation sizes. Optimization uses use TargetData to compute the allocation size. Now that malloc calls can have constant sizes, update isArrayMallocHelper() to use TargetData to determine the size of the malloced type and the size of malloced arrays. Extend getMallocType() to support malloc calls that have non-bitcast uses. Update OptimizeGlobalAddressOfMalloc() to optimize malloc calls that have non-bitcast uses. The bitcast use of a malloc call has to be treated specially here because the uses of the bitcast need to be replaced and the bitcast needs to be erased (just like the malloc call) for OptimizeGlobalAddressOfMalloc() to work correctly. Update PerformHeapAllocSRoA() to optimize malloc calls that have non-bitcast uses. The bitcast use of the malloc is not handled specially here because ReplaceUsesOfMallocWithGlobal replaces through the bitcast use. Update OptimizeOnceStoredGlobal() to not care about the malloc calls' bitcast use. Update all globalopt malloc tests to not rely on autoupgraded-MallocInsts, but instead use explicit malloc calls with correct allocation sizes. llvm-svn: 86077	2009-11-05 00:03:03 +00:00
Chris Lattner	a09062758b	improve DSE when TargetData is not around, based on work by Hans Wennborg! llvm-svn: 86067	2009-11-04 23:20:12 +00:00
Chris Lattner	762b56fa8c	Fix an iterator invalidation bug that happens when a hashtable resizes in IPSCCP. This fixes PR5394. llvm-svn: 86036	2009-11-04 18:57:42 +00:00
Chris Lattner	cb3c64ee3c	move two functions up higher in the file. Delete a useless argument to EmitGEPOffset. Implement some new transforms for optimizing subtracts of two pointer to ints into the same vector. This happens for C++ iterator idioms for example, stringmap takes a const char* that points to the start and end of a string. Once inlined, we want the pointer difference to turn back into a length. This is rdar://7362831. llvm-svn: 86021	2009-11-04 08:05:20 +00:00
Chris Lattner	156b8c7109	reimplement multiple return value handling in IPSCCP, making it more aggressive an correct. This survives building llvm in 64-bit mode with optimizations and the built llvm passes make check. llvm-svn: 85973	2009-11-03 23:40:48 +00:00
Chris Lattner	2c427233d4	finish half thunk thought llvm-svn: 85937	2009-11-03 20:52:57 +00:00
Chris Lattner	cde8de519d	fix an IPSCCP bug I introduced when I changed IPSCCP to start working on functions that don't have local linkage. Basically, we need to be more careful about propagating argument information to functions whose results we aren't tracking. This fixes a miscompilation of LLVMCConfigurationEmitter.cpp when built with an llvm-gcc that has ipsccp enabled. llvm-svn: 85923	2009-11-03 19:24:51 +00:00
Chris Lattner	e1d5cd9f48	fix a subtle bug I introduced when refactoring SCCP. Testcase to follow. llvm-svn: 85903	2009-11-03 16:50:11 +00:00
Benjamin Kramer	5573971453	Eliminate some temporaries. llvm-svn: 85896	2009-11-03 12:52:50 +00:00
Chris Lattner	5a3832496a	remove a isFreeCall check: it is a callinst that can write to memory already. llvm-svn: 85863	2009-11-03 05:33:46 +00:00
Ted Kremenek	2124f0d43f	Alphabetize. llvm-svn: 85859	2009-11-03 04:01:53 +00:00
Chris Lattner	fb14181b18	turn IPSCCP back on now that the iterator invalidation bug is fixed. llvm-svn: 85858	2009-11-03 03:42:51 +00:00
Chris Lattner	b70ef3c8c7	fix a nasty iterator invalidation bug from my conversion from std::map to DenseMap, exposed on release llvm-gcc bootstrap. llvm-svn: 85840	2009-11-02 23:25:39 +00:00
Chris Lattner	a15cc59dcb	revert r8579[56], which are causing unhappiness in buildbot land. llvm-svn: 85818	2009-11-02 19:31:10 +00:00
Chris Lattner	a3d794ebbb	disable IPSCCP support for multiple return values, it is buggy, so just disable it until I can fix it. llvm-svn: 85810	2009-11-02 18:22:51 +00:00
Chris Lattner	9d49f0c858	improve IPSCCP to be able to propagate the result of "!mayBeOverridden" function to calls of that function, regardless of whether it has local linkage or has its address taken. Not escaping should only affect whether we make an aggressive assumption about the arguments to a function, not whether we can track the result of it. llvm-svn: 85795	2009-11-02 07:33:59 +00:00
Chris Lattner	47837c5182	don't mark the arguments of prototype overdefined, they will never be queried. llvm-svn: 85793	2009-11-02 06:34:04 +00:00
Chris Lattner	5503328332	restore some code I removed in r85788, refactor it into a shared place instead of duplicating it 4 times. llvm-svn: 85792	2009-11-02 06:28:16 +00:00
Chris Lattner	4910b656b2	remove some confused code that dates from when we had "multiple return values" but not "first class aggregates" llvm-svn: 85791	2009-11-02 06:17:06 +00:00
Chris Lattner	809aee2f40	avoid redundant lookups in BBExecutable, and make it a SmallPtrSet. llvm-svn: 85790	2009-11-02 06:11:23 +00:00
Chris Lattner	e77c9aa04a	Use the libanalysis 'ConstantFoldLoadFromConstPtr' function instead of reinventing SCCP-specific logic. This gives us new powers. llvm-svn: 85789	2009-11-02 06:06:14 +00:00
Chris Lattner	f548403989	switch the main 'ValueState' map from being an std::map to being a DenseMap. Doing this required being aware of subtle iterator invalidation issues, but it provides a big speedup. In a release-asserts build, this sped up optimizing 403.gcc from 1.34s -> 0.79s (IPSCCP) and 1.11s -> 0.44s (SCCP). This commit also conflates in a bunch of general cleanups, sorry. llvm-svn: 85788	2009-11-02 05:55:40 +00:00
Chris Lattner	4e849162ef	fix a bug exposed by moving SRoA earlier which caused a crash building kc++ llvm-svn: 85786	2009-11-02 04:37:17 +00:00
Chris Lattner	e82b087ae6	only IPSCCP incoming arguments if the function is executable, this fixes an assertion on the buildbot. llvm-svn: 85784	2009-11-02 03:25:55 +00:00
Chris Lattner	9e97fbe114	add a new ValueState::getConstantInt() helper, use it to simplify some code. llvm-svn: 85783	2009-11-02 03:21:36 +00:00
Chris Lattner	7ccf1a6df6	tidy up some more: remove some extraneous inline specifiers, return harder. llvm-svn: 85780	2009-11-02 03:03:42 +00:00
Chris Lattner	b5a13d4c90	eliminate the SCCPSolver::getValueMapping method. llvm-svn: 85778	2009-11-02 02:54:24 +00:00
Chris Lattner	c49ae9912a	fix failures introduced in r85774 llvm-svn: 85777	2009-11-02 02:48:17 +00:00
Chris Lattner	e405ed9651	factor duplicated code into a new DeleteInstructionInBlock function, eliminate temporary (and pointless) smallvector. llvm-svn: 85776	2009-11-02 02:47:51 +00:00
Chris Lattner	a3c39d394d	Chris used to use '...' instead of proper grammar. llvm-svn: 85775	2009-11-02 02:33:50 +00:00
Chris Lattner	6df5cec72f	remove some extraneous llvmcontext stuff. llvm-svn: 85774	2009-11-02 02:30:06 +00:00
Chris Lattner	efdd2bbce6	change LatticeVal to use PointerIntPair to save some space. llvm-svn: 85773	2009-11-02 02:20:32 +00:00
Chris Lattner	3cd6a61b27	fix instcombine to only do store sinking when the alignments of the two loads agree. Propagate that onto the new store. llvm-svn: 85772	2009-11-02 02:06:37 +00:00
Chris Lattner	328ef89bd1	when merging two loads, make sure to take the min of their alignment, not the max. This didn't matter until the previous patch because instcombine would refuse to sink loads with differenting alignments. llvm-svn: 85738	2009-11-01 20:07:07 +00:00
Chris Lattner	2a249e267a	split load sinking out to its own function, like gep sinking. llvm-svn: 85737	2009-11-01 20:04:24 +00:00
Chris Lattner	0b40a8bc0e	fix a bug noticed by inspection: when instcombine sinks loads through phis, it didn't preserve the alignment of the load. This is a missed optimization of the alignment is high and a miscompilation when the alignment is low. llvm-svn: 85736	2009-11-01 19:50:13 +00:00
Chris Lattner	b5d9c8c708	cleanups, switch GlobalDCE to SmallPtrSet instead of std::set llvm-svn: 85730	2009-11-01 19:03:42 +00:00
Chris Lattner	37536b90e1	remove a bunch of locking from LLVMContextImpl. Since only one thread can be banging on a context at a time, this isn't needed. Owen, please review. llvm-svn: 85728	2009-11-01 18:42:03 +00:00
Chris Lattner	249f96e339	improve comment. llvm-svn: 85725	2009-11-01 18:17:37 +00:00
Douglas Gregor	291f6145b8	Reverting 85714, 85715, 85716, which are breaking the build llvm-svn: 85717	2009-11-01 16:42:53 +00:00
Dan Gohman	576ac96367	Remove the #include of Pass.h from PassManager.h. This breaks a significant #include dependency, as frontends commonly pull in PassManager.h. llvm-svn: 85714	2009-11-01 15:20:19 +00:00
Chris Lattner	1a8b80ed5a	teach ipsccp and ipconstprop that a blockaddress doesn't 'take the address' of a function in a way that should prevent ip constprop. This allows clang/test/CodeGen/indirect-goto.c to pass with the new indirect goto lowering. llvm-svn: 85709	2009-11-01 06:11:53 +00:00
Chris Lattner	a1dc101f66	change llvm::MergeBlockIntoPredecessor to not merge two blocks BB1->BB2 when BB2 has its address taken. Since it ends up doing BB2->rauw(BB1), this can cause the address of the entry block to be taken. Since it is generally undesirable to nuke blocks whose address is taken, even when we can, just unconditionally stop this xform. llvm-svn: 85708	2009-11-01 04:57:33 +00:00
Chris Lattner	746139b736	strengthen an assumption: RevectorBlockTo knows that PredBB ended in an uncond branch because the pass requires BreakCriticalEdges. However, BCE doesn't eliminate critical adges from indbrs. llvm-svn: 85707	2009-11-01 04:23:20 +00:00
Chris Lattner	7a8db3a41a	if CostMetrics says to never duplicate some code, don't unswitch a loop. This prevents unswitching from duplicating indbr's. llvm-svn: 85705	2009-11-01 03:42:55 +00:00
Chris Lattner	54a4b84012	constant fold indirectbr(blockaddress(%bb)) -> br label %bb. llvm-svn: 85704	2009-11-01 03:40:38 +00:00
Chris Lattner	aa99c94e2a	Revert 85678/85680. The decision is to stay with the current form of indirectbr, thus we don't need "blockaddr(@func, null)". Eliminate it for simplicity. llvm-svn: 85699	2009-11-01 01:27:45 +00:00
Chris Lattner	a546dcf418	Make sure PRE doesn't split crit edges from indirectbr. llvm-svn: 85692	2009-10-31 22:11:15 +00:00
Chris Lattner	c872b09676	llvm::SplitEdge should refuse to split an edge from an indirectbr. Fix CodeGenPrepare to not try to split edges from indirectbr. llvm-svn: 85690	2009-10-31 22:04:43 +00:00
Chris Lattner	ba364b0a9a	update the comment above llvm::SplitCriticalEdge, and make it abort on IndirectBrInst as describe in the comment. llvm-svn: 85688	2009-10-31 21:51:10 +00:00
Chris Lattner	3c89c53f35	adjust a couple xforms to work with null bb's in BlockAddress. llvm-svn: 85680	2009-10-31 20:13:24 +00:00
Chris Lattner	a742b8f94f	add a comment. llvm-svn: 85671	2009-10-31 17:48:31 +00:00
Dan Gohman	2d02ff8cbb	Revert r85667. LoopUnroll currently can't call utility functions which auto-update the DominatorTree because it doesn't keep the DominatorTree current while it works. llvm-svn: 85670	2009-10-31 17:33:01 +00:00
Dan Gohman	144694bcb7	Remove redundant code. llvm-svn: 85668	2009-10-31 16:16:41 +00:00
Dan Gohman	041e2dbad1	Merge the enhancements from LoopUnroll's FoldBlockIntoPredecessor into MergeBlockIntoPredecessor. This makes SimplifyCFG slightly more aggressive, and makes it unnecessary for LoopUnroll to have its own copy of this code. llvm-svn: 85667	2009-10-31 16:08:00 +00:00
Dan Gohman	880c92ac1c	Rename forgetLoopBackedgeTakenCount to forgetLoop, because it clears out more information than just the stored backedge taken count. llvm-svn: 85664	2009-10-31 15:04:55 +00:00
Dan Gohman	969e83a4ff	Replace LoopUnrollPass.cpp's custom code-size estimation code using the new common CodeMetrics code. llvm-svn: 85663	2009-10-31 14:54:17 +00:00
Dan Gohman	fa8969f70e	Simplify this code. llvm-svn: 85662	2009-10-31 14:46:50 +00:00
Dan Gohman	af94015c18	Remove an unnecessary #include. llvm-svn: 85661	2009-10-31 14:39:43 +00:00
Dan Gohman	f35b6640f6	Update CMakeLists for recent renames. llvm-svn: 85660	2009-10-31 14:38:25 +00:00
Dan Gohman	f70e76c435	Rename UnrollLoop.cpp to LoopUnroll.cpp, and LoopUnroll.cpp to LoopUnrollPass.cpp, for consistency with other passes which are similarly split. llvm-svn: 85659	2009-10-31 14:37:31 +00:00
Dan Gohman	fb7f0e57b6	Remove CodeGenLICM. It's largely obsoleted by MachineLICM's new ability to unfold loop-invariant loads. llvm-svn: 85657	2009-10-31 14:35:41 +00:00
Dan Gohman	930aa9d3d2	Reapply r85634, with the bug fixed. llvm-svn: 85655	2009-10-31 14:22:52 +00:00
Evan Cheng	c16d8f2054	Revert 85634. It's breaking consumer-typeset (and others). llvm-svn: 85641	2009-10-31 01:28:06 +00:00
Dan Gohman	7f7d97eb73	Add a comment about a missed opportunity. llvm-svn: 85635	2009-10-30 23:15:43 +00:00
Dan Gohman	5bec30ca5d	Optimize around the fact that pred_iterator is slow: instead of sorting PHI operands by the predecessor order, sort them by the order used by the first PHI in the block. This is still suffucient to expose duplicates. llvm-svn: 85634	2009-10-30 23:15:21 +00:00
Dan Gohman	1a95106602	Teach SimplifyCFG how to eliminate duplicate PHI nodes within a block. This reduces codesize on a variety of codes by 1-2% on x86-64. It also helps clean up after SSAUpdater. llvm-svn: 85626	2009-10-30 22:39:04 +00:00
Dan Gohman	13e41edc71	Sort the incoming values in PHI nodes to match the predecessor order. This helps expose duplicate PHIs, which will make it easier for them to be eliminated. llvm-svn: 85623	2009-10-30 22:22:22 +00:00
Evan Cheng	5a6b9c40d6	Add option to createGVNPass to disable PRE. llvm-svn: 85609	2009-10-30 20:12:24 +00:00
Nick Lewycky	b43a43a8fd	Apply some cleanups. No functionality changes. llvm-svn: 85498	2009-10-29 07:35:15 +00:00
Chris Lattner	312748848f	just for the hell of it, allow globalopt to statically evaluate static constructors with indirect gotos :) llvm-svn: 85495	2009-10-29 05:51:50 +00:00
Chris Lattner	ee8b951e73	teach various passes about blockaddress. We no longer crash on any clang tests. llvm-svn: 85465	2009-10-29 01:21:20 +00:00
Chris Lattner	be060382e9	teach ValueMapper about BlockAddress', making bugpoint a lot more useful. llvm-svn: 85458	2009-10-29 00:31:02 +00:00
Chris Lattner	cf5a47d63d	unindent massive blocks, no functionality change. llvm-svn: 85457	2009-10-29 00:28:30 +00:00
Victor Hernandez	0d025421cd	Extend getMallocArraySize() to determine the array size if the malloc argument is: ArraySize * ElementSize ElementSize * ArraySize ArraySize << log2(ElementSize) ElementSize << log2(ArraySize) Refactor isArrayMallocHelper and delete isSafeToGetMallocArraySize, so that there is only 1 copy of the malloc array determining logic. Update users of getMallocArraySize() to not bother calling isArrayMalloc() as well. llvm-svn: 85421	2009-10-28 20:18:55 +00:00
Devang Patel	ffd561bc2d	llvm.dbg.global_variables do not exist anymore. llvm-svn: 85402	2009-10-28 16:51:52 +00:00
Edward O'Callaghan	1042ca112f	No newline at end of file. llvm-svn: 85390	2009-10-28 15:04:53 +00:00
Benjamin Kramer	ecc60b80b0	Update CMake file. llvm-svn: 85389	2009-10-28 13:29:18 +00:00
Owen Anderson	2b2bd28973	Treat lifetime begin/end markers as allocations/frees respectively for the purposes for GVN/DSE. llvm-svn: 85383	2009-10-28 07:05:35 +00:00
Nick Lewycky	175308c43e	Add ABCD, a generalized implementation of the Elimination of Array Bounds Checks on Demand algorithm which looks at arbitrary branches instead of loop iterations. This is GSoC work by Andre Tavares with only editorial changes applied! llvm-svn: 85382	2009-10-28 07:03:15 +00:00
Chris Lattner	a91a563530	Previously, all operands to Constant were themselves constant. In the new world order, BlockAddress can have a BasicBlock operand. This doesn't permute much, because if you have a ConstantExpr (or anything more specific than Constant) we still know the operand has to be a Constant. llvm-svn: 85375	2009-10-28 05:14:34 +00:00
Devang Patel	11cf3f4a27	Factor out redundancy from clone() implementations. llvm-svn: 85327	2009-10-27 22:16:29 +00:00
Victor Hernandez	f390e04a47	Rename MallocFreeHelper as MemoryBuiltins llvm-svn: 85286	2009-10-27 20:05:49 +00:00
Chris Lattner	c6b3b25f94	Fix a pretty serious misfeature of the inliner: if it inlines a function with multiple return values it inserts a PHI to merge them all together. However, if the return values are all the same, it ends up with a pointless PHI and this pointless PHI happens to really block SRoA from happening in at least a silly C++ example written by Doug, but probably others. This fixes rdar://7339069. llvm-svn: 85206	2009-10-27 05:39:41 +00:00
Mike Stump	2b0a49a682	VS build fix, patch by Marius Wachtler. llvm-svn: 85197	2009-10-27 02:14:13 +00:00
Eric Christopher	7a50b280c1	Add objectsize intrinsic and hook it up through codegen. Doesn't do anything than return "I don't know" at the moment. llvm-svn: 85189	2009-10-27 00:52:25 +00:00
Dan Gohman	f808106bbe	Add braces to avoid ambiguous else. llvm-svn: 85185	2009-10-27 00:11:02 +00:00
Victor Hernandez	762195bd01	Rename MallocHelper as MallocFreeHelper, since it now also identifies calls to free() llvm-svn: 85181	2009-10-26 23:58:56 +00:00
Owen Anderson	03b5de67b0	Add a straight-forward implementation of SCCVN for aggressively eliminating scalar redundancies. llvm-svn: 85179	2009-10-26 23:55:47 +00:00
Victor Hernandez	de5ad42aa1	Remove FreeInst. Remove LowerAllocations pass. Update some more passes to treate free calls just like they were treating FreeInst. llvm-svn: 85176	2009-10-26 23:43:48 +00:00
Dan Gohman	34e38afa96	Simplify this code. LoopDeletion doesn't need to explicit check that the loop exiting block dominates the latch block; if ScalarEvolution can prove that the trip-count is finite, that's sufficient. llvm-svn: 85165	2009-10-26 22:18:58 +00:00
Dan Gohman	672927f393	Code that checks WillNotOverflowSignedAdd before creating an Add can safely use the NSW bit on the Add. llvm-svn: 85164	2009-10-26 22:14:22 +00:00
Ted Kremenek	ce8f626f82	Update CMake files. llvm-svn: 85161	2009-10-26 22:06:01 +00:00
Dan Gohman	6a1d9eace9	Check in the experimental GEP splitter pass. This pass splits complex GEPs (more than one non-zero index) into simple GEPs (at most one non-zero index). In some simple experiments using this it's not uncommon to see 3% overall code size wins, because it exposes redundancies that can be eliminated, however it's tricky to use because instcombine aggressively undoes the work that this pass does. llvm-svn: 85144	2009-10-26 19:12:14 +00:00
Dan Gohman	6a10d5ebd3	Fix a typo in a comment. llvm-svn: 85120	2009-10-26 15:55:24 +00:00
Chris Lattner	683eed3286	reapply r85085 with a bugfix to avoid infinite looping. All of the 'demorgan' related xforms need to use dyn_castNotVal, not m_Not. llvm-svn: 85119	2009-10-26 15:40:07 +00:00
Dan Gohman	d632f89596	Make LSR's OptimizeShadowIV ignore induction variables with negative strides for now, because it doesn't handle them correctly. This fixes a miscompile of SingleSource/Benchmarks/Misc-C++/ray. This problem was usually hidden because indvars transforms such induction variables into negations of canonical induction variables. llvm-svn: 85118	2009-10-26 15:32:57 +00:00
Evan Cheng	8014a728b9	Revert 85085. It causes infinite looping during llvm-gcc build. llvm-svn: 85090	2009-10-26 03:51:32 +00:00
Chris Lattner	2e6564d6ff	Implement PR3266 & PR5276, folding: not (or (icmp, icmp)) -> and(icmp, icmp) llvm-svn: 85085	2009-10-26 01:06:31 +00:00
Nick Lewycky	974e12b2d3	Remove includes of Support/Compiler.h that are no longer needed after the VISIBILITY_HIDDEN removal. llvm-svn: 85043	2009-10-25 06:57:41 +00:00
Nick Lewycky	02d5f77d26	Remove VISIBILITY_HIDDEN from class/struct found inside anonymous namespaces. Chris claims we should never have visibility_hidden inside any .cpp file but that's still not true even after this commit. llvm-svn: 85042	2009-10-25 06:33:48 +00:00
Nick Lewycky	54d7179a25	Remove ICmpInst::isSignedPredicate which was a reimplementation CmpInst::isSigned. llvm-svn: 85037	2009-10-25 05:20:17 +00:00
Dan Gohman	ef41a1ce3c	MapValue doesn't needs its LLVMContext argument. llvm-svn: 85020	2009-10-24 23:37:16 +00:00
Dan Gohman	8f4078ba39	Rename isLoopExit to isLoopExiting, for consistency with the wording used elsewhere - an exit block is a block outside the loop branched to from within the loop. An exiting block is a block inside the loop that branches out. llvm-svn: 85019	2009-10-24 23:34:26 +00:00
Dan Gohman	b979794e4b	Rewrite LoopRotation's SSA updating code using SSAUpdater. llvm-svn: 85016	2009-10-24 23:19:52 +00:00
Victor Hernandez	e297149e26	Auto-upgrade free instructions to calls to the builtin free function. Update all analysis passes and transforms to treat free calls just like FreeInst. Remove RaiseAllocations and all its tests since FreeInst no longer needs to be raised. llvm-svn: 84987	2009-10-24 04:23:03 +00:00
Victor Hernandez	8acf2956b8	Remove AllocationInst. Since MallocInst went away, AllocaInst is the only subclass of AllocationInst, so it no longer is necessary. llvm-svn: 84969	2009-10-23 21:09:37 +00:00
Dan Gohman	41d00ac45b	Make LoopDeletion check the maximum backedge taken count, rather than the exact backedge taken count, when checking for infinite loops. This allows it to delete loops with multiple exit conditions. llvm-svn: 84952	2009-10-23 17:10:01 +00:00
Chris Lattner	cf7e8947e9	move another load optimization from instcombine -> libanalysis. llvm-svn: 84841	2009-10-22 06:44:07 +00:00
Chris Lattner	51d2f70e32	move 'loading i32 from string' optimization from instcombine to libanalysis. Instcombine shrinking... does this even make sense??? llvm-svn: 84840	2009-10-22 06:38:35 +00:00
Chris Lattner	1664a4fd86	Move some constant folding logic for loads out of instcombine into Analysis/ConstantFolding.cpp. This doesn't change the behavior of instcombine but makes other clients of ConstantFoldInstruction able to handle loads. This was partially extracted from Eli's patch in PR3152. llvm-svn: 84836	2009-10-22 06:25:11 +00:00
Chris Lattner	c7a962d3b3	fix PR5262. llvm-svn: 84810	2009-10-22 00:17:26 +00:00
Devang Patel	27e0be274e	Derive metadata hierarchy from Value instead of User. llvm-svn: 84801	2009-10-21 23:57:35 +00:00
Chris Lattner	966526cbfb	revert r84754, it isn't the right approach. Edwin, please propose patches for fixes like this instead of committing them directly. llvm-svn: 84799	2009-10-21 23:41:58 +00:00
Victor Hernandez	be9e179104	Make changes to rev 84292 as requested by Chris Lattner. Most changes are cleanup, but there is 1 correctness fix: I fixed InstCombine so that the icmp is removed only if the malloc call is removed (which requires explicit removal because the Worklist won't DCE any calls since they can have side-effects). llvm-svn: 84772	2009-10-21 19:11:40 +00:00
Torok Edwin	1539a352a6	Fix PR5262: when folding select into PHI, make sure all operands are available in the PHI's Basic Block. This uses a conservative approach, because we don't have dominator info in instcombine. llvm-svn: 84754	2009-10-21 10:49:00 +00:00
Chris Lattner	8ed7bef409	make GVN work better when TD is not around: "In the existing code, if the load and the value to replace it with are of different types and target data is available, it tries to use the target data to coerce the replacement value to the type of the load. Otherwise, it skips all effort to handle the type mismatch and just feeds the wrongly-typed replacement value to replaceAllUsesWith, which triggers an assertion. The patch replaces it with an outer if checking for type mismatch, and an inner if-else that checks whether target data is available and, if not, returns false rather than trying to replace the load." Patch by Kenneth Uildriks! llvm-svn: 84739	2009-10-21 04:11:19 +00:00
Devang Patel	1d7f7d21dc	Do not remove dead metadata for now. llvm-svn: 84731	2009-10-21 02:21:34 +00:00
Chris Lattner	7f903681ac	alternate fix for PR5258 which avoids worklist problems, with reduced testcase. llvm-svn: 84667	2009-10-20 20:27:49 +00:00
Dan Gohman	b6b8ec769c	Restore LoopUnswitch's block-oriented threshold. LoopUnswitch now checks both the estimated code size and the number of blocks when deciding whether to do a non-trivial unswitch. This protects it from some very undesirable worst-case behavior on large numbers of loop-unswitchable conditions, such as in the testcase in PR5259. llvm-svn: 84661	2009-10-20 20:06:09 +00:00
Torok Edwin	cf10ec951d	Fix PR5258, jump-threading creating invalid PHIs. When an incoming value for a PHI is updated, we must also updated all other incoming values for the same BB to match, otherwise we create invalid PHIs. llvm-svn: 84638	2009-10-20 15:42:00 +00:00
Torok Edwin	729d92bd74	Fix PR4313: IPSCCP was not setting the lattice value for the invoke instruction when the invoke had multiple return values: it set the lattice value only on the extractvalue. This caused the invoke's lattice value to remain the default (undefined), and later propagated to extractvalue's operand, which incorrectly introduces undefined behavior. llvm-svn: 84637	2009-10-20 15:15:09 +00:00
Owen Anderson	168ad6985e	Refactor lookup_or_add to contain _MUCH_ less duplicated code. Add support for numbering first class aggregate instructions while we're at it. llvm-svn: 84547	2009-10-19 22:14:22 +00:00
Victor Hernandez	5c704d505c	Malloc calls are marked NoAlias, so the code below the isMalloc() check makes it redundant. Removing the isMalloc() check. llvm-svn: 84541	2009-10-19 21:47:22 +00:00
Owen Anderson	1059b5b32d	Simplify some code. llvm-svn: 84533	2009-10-19 21:14:57 +00:00
Dan Gohman	8f986672a1	Fix SplitBlockPredecessors' LoopInfo updating code to handle the case where a loop's header is being split and it has predecessors which are not contained by the most-nested loop which contains the loop. This fixes PR5235. llvm-svn: 84505	2009-10-19 16:04:50 +00:00
Dan Gohman	511d2e26dd	Change instnamer to name arguments "arg" instead of "tmp" for clarity, and to name basic blocks "bb" instead of "BB", for consistency. llvm-svn: 84502	2009-10-19 14:47:32 +00:00
Chris Lattner	1fa98f0d74	remove the IndMemRemPass, which only made sense for when malloc/free were intrinsic instructions. llvm-svn: 84404	2009-10-18 05:02:09 +00:00
Daniel Dunbar	8eff29d805	Use raw_ostream::write_escaped instead of EscapeString. llvm-svn: 84356	2009-10-17 20:43:19 +00:00
Chris Lattner	88b36f1140	Simplify some code (first hunk) and fix PR5208 (second hunk) by updating the callgraph when introducing a call. llvm-svn: 84310	2009-10-17 05:39:39 +00:00
Victor Hernandez	a3aaf85e23	Remove MallocInst from LLVM Instructions. llvm-svn: 84299	2009-10-17 01:18:07 +00:00
Victor Hernandez	c7d6a8327c	Autoupgrade malloc insts to malloc calls. Update testcases that rely on malloc insts being present. Also prematurely remove MallocInst handling from IndMemRemoval and RaiseAllocations to help pass tests in this incremental step. llvm-svn: 84292	2009-10-17 00:00:19 +00:00
Victor Hernandez	264da3274e	HeapAllocSRoA also needs to check if malloc array size can be computed. llvm-svn: 84288	2009-10-16 23:12:25 +00:00
Dan Gohman	99429a00ff	Move zext and sext casts fed by loads into the same block as the load, to help SelectionDAG fold them into the loads, unless conditions are unfavorable. llvm-svn: 84271	2009-10-16 20:59:35 +00:00
Duncan Sands	0058c7bcb0	Strip trailing white space. llvm-svn: 84256	2009-10-16 15:20:13 +00:00
Victor Hernandez	13020b1faf	Fix bug where array malloc with unexpected computation of the size argument resulted in MallocHelper identifying the malloc as a non-array malloc. This broke GlobalOpt's optimization of stores of mallocs to global variables. The fix is to classify malloc's into 3 categories: 1. non-array mallocs 2. array mallocs whose array size can be determined 3. mallocs that cannot be determined to be of type 1 or 2 and cannot be optimized getMallocArraySize() returns NULL for category 3, and all users of this function must avoid their malloc optimization if this function returns NULL. Eventually, currently unexpected codegen for computing the malloc's size argument will be supported in isArrayMalloc() and getMallocArraySize(), extending malloc optimizations to those examples. llvm-svn: 84199	2009-10-15 20:14:52 +00:00
Chris Lattner	c855b45b78	only try to fold constantexpr operands when the worklist is first populated, don't bother every time going around the main worklist. This speeds up a release-asserts opt -std-compile-opts on 403.gcc by about 4% (1.5s). It seems to speed up the most expensive instances of instcombine by ~10%. llvm-svn: 84171	2009-10-15 04:59:28 +00:00
Chris Lattner	dd1f68a10c	don't bother calling ConstantFoldInstruction unless there is a use of the instruction (which disqualifies stores, unreachable, etc) and at least the first operand is a constant. This filters out a lot of obvious cases that can't be folded. Also, switch the IRBuilder to a TargetFolder, which tries harder. llvm-svn: 84170	2009-10-15 04:13:44 +00:00
Devang Patel	92f8619923	Use isVoidTy() llvm-svn: 84118	2009-10-14 17:29:00 +00:00
Chris Lattner	6b9044db01	make instcombine's instruction sinking more aggressive in the presence of PHI nodes. llvm-svn: 84103	2009-10-14 15:21:58 +00:00
Devang Patel	a677136900	Check void type before using RAUWd. llvm-svn: 84049	2009-10-13 22:56:32 +00:00
Devang Patel	115741ba79	Do not check use_empty() before replaceAllUsesWith(). This gives ValueHandles a chance to get properly updated. llvm-svn: 84033	2009-10-13 21:41:20 +00:00
Dan Gohman	2dc6f8de03	Use the new CodeMetrics class to compute code size instead of manually counting instructions. llvm-svn: 84016	2009-10-13 20:12:23 +00:00
Ted Kremenek	113d959f1b	Update CMake file. llvm-svn: 84001	2009-10-13 18:48:07 +00:00
Dan Gohman	54463e837a	Commit the removal of this file, which is now moved to lib/Analysis. llvm-svn: 83999	2009-10-13 18:37:20 +00:00
Dan Gohman	4552e3cd73	Move the InlineCost code from Transforms/Utils to Analysis. llvm-svn: 83998	2009-10-13 18:30:07 +00:00
Dan Gohman	5b3e05bcaa	Start refactoring the inline cost estimation code so that it can be used for purposes other than inlining. llvm-svn: 83997	2009-10-13 18:24:11 +00:00
Chris Lattner	19788ca686	change simplifycfg to not duplicate 'unwind' instructions. Hopefully this will increase the likelihood of common code getting sunk towards the unwind. llvm-svn: 83996	2009-10-13 18:13:05 +00:00
Dan Gohman	71ca652475	Make LoopUnswitch's cost estimation count Instructions, rather than BasicBlocks, so that it doesn't blindly procede in the presence of large individual BasicBlocks. This addresses a class of code-size expansion problems. llvm-svn: 83992	2009-10-13 17:50:43 +00:00
Evan Cheng	f815861591	Make licm debug message readable. llvm-svn: 83908	2009-10-12 22:25:23 +00:00
Dale Johannesen	4c9f0e8f53	Fix warning. llvm-svn: 83870	2009-10-12 18:45:32 +00:00
Chris Lattner	8abd572dae	populate instcombine's initial worklist more carefully, causing it to visit instructions from the start of the function to the end of the function in the first path. This greatly speeds up some pathological cases (e.g. PR5150). Try #3, this time with some unneeded debug info stuff removed which was causing dead pointers to be added to the worklist. llvm-svn: 83818	2009-10-12 03:58:40 +00:00
Chris Lattner	8ce6b36c86	revert r83814 for now, it is making the llvm-gcc bootstrap unhappy. llvm-svn: 83817	2009-10-11 23:56:08 +00:00
Chris Lattner	78d6310429	populate instcombine's initial worklist more carefully, causing it to visit instructions from the start of the function to the end of the function in the first path. This greatly speeds up some pathological cases (e.g. PR5150). llvm-svn: 83814	2009-10-11 23:17:43 +00:00
Chris Lattner	2c2deae5ac	remove some harmful code that would turn an insertelement on an undef into a shuffle even if it was used by another insertelement. If the visitation order of instcombine was wrong, this would turn a chain of insertelements into a chain of shufflevectors, which was quite painful. Since CollectShuffleElements handles these cases, the code can just be nuked. llvm-svn: 83810	2009-10-11 23:02:46 +00:00
Chris Lattner	c6cdbfbfdd	teach instcombine to simplify xor's harder, catching the new testcase. llvm-svn: 83799	2009-10-11 22:22:13 +00:00
Chris Lattner	6e6ac47125	cleanups llvm-svn: 83797	2009-10-11 22:00:32 +00:00
Chris Lattner	1639234775	cleanup, no functionality change. llvm-svn: 83795	2009-10-11 21:36:10 +00:00
Chris Lattner	fd27f8a5b3	generalize a transformation even more: we don't care whether the input the the mul is a zext from bool, just that it is all zeros other than the low bit. This fixes some phase ordering issues that would cause us to miss some xforms in mul.ll when the worklist is visited differently. llvm-svn: 83794	2009-10-11 21:29:45 +00:00
Chris Lattner	406cb75c6b	simplify a transformation by making it more general. llvm-svn: 83792	2009-10-11 21:22:21 +00:00
Chris Lattner	f39f4f928a	temporarily revert previous patch llvm-svn: 83791	2009-10-11 21:05:34 +00:00
Chris Lattner	bb058d3a23	populate instcombine's initial worklist more carefully, causing it to visit instructions from the start of the function to the end of the function in the first path. This greatly speeds up some pathological cases (e.g. PR5150). llvm-svn: 83790	2009-10-11 21:04:37 +00:00
Torok Edwin	8b3081350e	Remove CleanupDbgInfo, instcombine does this and its not worth duplicating it here. llvm-svn: 83789	2009-10-11 19:58:35 +00:00
Torok Edwin	907ec36943	LICM shouldn't sink/delete debug information. Fix this and add a testcase. For now the metadata of sinked/hoisted instructions is still wrong, but that'll be fixed when instructions will have debug metadata directly attached. llvm-svn: 83786	2009-10-11 19:15:54 +00:00
Chris Lattner	85c85c5e04	when folding duplicate conditions, delete the now-probably-dead instruction tree feeding it. llvm-svn: 83778	2009-10-11 18:39:58 +00:00
Chris Lattner	e374382b8f	implement rdar://7293527, a trivial instcombine that llvm-gcc gets but clang doesn't, because it is implemented in GCC's fold routine. llvm-svn: 83761	2009-10-11 07:53:15 +00:00
Chris Lattner	97b1405207	implement a transformation in jump threading that is currently done by condprop, but do it in a much more general form. The basic idea is that we can do a limited form of tail duplication in the case when we have a branch on a phi. Moving the branch up in to the predecessor block makes instruction selection much easier and encourages chained jump threadings. llvm-svn: 83759	2009-10-11 07:24:57 +00:00
Chris Lattner	6ce85e85f5	restructure some code, no functionality change. llvm-svn: 83756	2009-10-11 04:40:21 +00:00
Chris Lattner	f466bc84c9	factor some code better and move a function, no functionality change. llvm-svn: 83755	2009-10-11 04:33:43 +00:00
Chris Lattner	f99a74e24b	make jump threading on a phi with undef inputs happen. llvm-svn: 83754	2009-10-11 04:18:15 +00:00
Chris Lattner	71d353dd48	rewrite LCSSA to use SSAUpdate, to only return true if it modifies the IR, and to implement the FIXME'd optimization. llvm-svn: 83748	2009-10-11 02:53:37 +00:00
Chris Lattner	101dde30ed	clean up and simplify some code. Don't use setvector when things will be inserted only once, just use vector. Don't compute ExitBlocks unless we need it, change std::sort to array_pod_sort. llvm-svn: 83747	2009-10-11 01:07:15 +00:00
Chris Lattner	b6c65faa64	switch GVN to use SSAUpdater. Besides removing a lot of complexity from GVN, this also speeds it up, inserts fewer PHI nodes (see the testcase) and allows it to remove more loads (due to fewer PHI nodes standing in the way). llvm-svn: 83746	2009-10-10 23:50:30 +00:00
Chris Lattner	9c382cebc5	add a simple helper method. llvm-svn: 83745	2009-10-10 23:41:48 +00:00
Chris Lattner	249265de06	add ability for clients of SSAUpdater to find out about the PHI nodes inserted. llvm-svn: 83744	2009-10-10 23:15:24 +00:00
Chris Lattner	89d2a5c4f3	remove dead code llvm-svn: 83742	2009-10-10 23:04:12 +00:00
Chris Lattner	67cdd8b567	add the ability to get a rewritten value from the middle of a block, not just at the end. Add a big comment explaining when this could be useful (which never happens for jump threading). llvm-svn: 83741	2009-10-10 23:00:11 +00:00
Chris Lattner	e474a8d3a7	rename GetValueInBlock -> GetValueAtEndOfBlock to better reflect what it does. llvm-svn: 83740	2009-10-10 22:41:58 +00:00
Chris Lattner	65e69a77e1	use a typedef instead of spelling out an insane type. Yay for auto someday. llvm-svn: 83707	2009-10-10 09:09:20 +00:00
Chris Lattner	84095071ea	Change jump threading to use the new SSAUpdater class instead of DemoteRegToStack. This makes it more efficient (because it isn't creating a ton of load/stores that are eventually removed by a later mem2reg), and more slightly more effective (because those load/stores don't get in the way of threading). llvm-svn: 83706	2009-10-10 09:05:58 +00:00
Chris Lattner	60d4e69c81	Implement an efficient and fully general SSA update mechanism that works on unstructured CFGs. This implements PR217, our oldest open PR. llvm-svn: 83705	2009-10-10 09:04:27 +00:00
Chris Lattner	f30a2b0c86	random tidying llvm-svn: 83701	2009-10-10 06:22:45 +00:00
Dale Johannesen	96a5b87ae2	Use names instead of numbers for some of the magic constants used in inlining heuristics (especially those used in more than one file). No functional change. llvm-svn: 83675	2009-10-09 21:42:02 +00:00
Dale Johannesen	3059924bdd	When considering whether to inline Callee into Caller, and that will make Caller too big to inline, see if it might be better to inline Caller into its callers instead. This situation is described in PR 2973, although I haven't tried the specific case in SPASS. llvm-svn: 83602	2009-10-09 00:11:32 +00:00
Dan Gohman	09984279fd	Add a form of addPreserved which takes a string argument, to allow passes to declare that they preserve other passes without needing to pull in additional header file or library dependencies. Convert MachineFunctionPass and CodeGenLICM to make use of this. llvm-svn: 83555	2009-10-08 17:00:02 +00:00
Jeffrey Yasskin	dafd08ea7e	In instcombine's debug output, avoid printing ADD for instructions that are already on the worklist, and print Visited when an instruction is about to be visited. Net, on one input, this reduced the output size by at least 9x. llvm-svn: 83510	2009-10-08 00:12:24 +00:00
Eric Christopher	5b741f3d14	80-column and whitespace fixes. llvm-svn: 83489	2009-10-07 21:14:25 +00:00
Eric Christopher	e666bc9f64	Add FreeInst to the "is a call" check for Insts that are calls, but not intrinsics. llvm-svn: 83441	2009-10-07 00:54:08 +00:00
Eric Christopher	6ba26317ce	While we still have a MallocInst treat it as a call like any other for inlining. When MallocInst goes away this code will be subsumed as part of calls and work just fine... llvm-svn: 83434	2009-10-07 00:02:18 +00:00
Ted Kremenek	2275a7dfef	Update CMake file. llvm-svn: 83404	2009-10-06 19:45:38 +00:00
Chris Lattner	a893f5bdf5	remove predicate simplifier, it never got the last bugs beaten out of it, and jump threading, condprop and gvn are now getting most of the benefit. This was approved by Nicholas and Nicolas. llvm-svn: 83390	2009-10-06 16:59:46 +00:00
Duncan Sands	9ed7b16bf3	Introduce and use convenience methods for getting pointer types where the element is of a basic builtin type. For example, to get an i8* use getInt8PtrTy. llvm-svn: 83379	2009-10-06 15:40:36 +00:00
Dan Gohman	e525d9ddc0	Remove an unnnecessary LLVMContext argument in ConstantFoldLoadThroughGEPConstantExpr. llvm-svn: 83311	2009-10-05 16:36:26 +00:00
Dan Gohman	238cf49812	Use Use::operator= instead of Use::set, for consistency. llvm-svn: 83310	2009-10-05 16:31:55 +00:00
Chris Lattner	fdd8790718	strength reduce a ton of type equality tests to check the typeid (Through the new predicates I added) instead of going through a context and doing a pointer comparison. Besides being cheaper, this allows a smart compiler to turn the if sequence into a switch. llvm-svn: 83297	2009-10-05 05:54:46 +00:00
Chris Lattner	463716d559	instcombine shouldn't delete all null checks for mallocs. This fixes PR5130. llvm-svn: 83290	2009-10-05 02:47:47 +00:00
Owen Anderson	b5049bebb3	Do away with the strange use of BitVectors in SSI, and just use normal sets. This makes the code much more C++/LLVM-ish. llvm-svn: 83286	2009-10-04 18:49:55 +00:00
Owen Anderson	286feb16a9	Fix a typo in the comment. llvm-svn: 83283	2009-10-04 17:52:13 +00:00
Owen Anderson	a62bf10651	SSI needs to require DT and DF transitively, since it uses them outside of its runOnFunction. Similarly, it can be marked setPreservesAll, since it does no work in its runOnFunction. llvm-svn: 83282	2009-10-04 17:47:39 +00:00
Evan Cheng	bb4ed2394b	Allow -inline-threshold override default threshold even if compiling to optimize for size. llvm-svn: 83274	2009-10-04 06:13:54 +00:00
Douglas Gregor	d846fbf20d	Remove GVNPRE.cpp from the CMake makefile llvm-svn: 83194	2009-10-01 05:30:05 +00:00
Chris Lattner	5f3cc06cd2	remove the GVNPRE pass. It has been subsumed by the GVN pass. Ok'd by Owen. llvm-svn: 83193	2009-10-01 02:18:36 +00:00
Dan Gohman	ea0bb8f555	Fix this code so that it doesn't try to iterate through a std::vector while calling changeImmediateDominator, which removes elements from the vector. This fixes PR5097. llvm-svn: 83166	2009-09-30 20:54:16 +00:00
Dan Gohman	7d3b0be05b	Remove a redundant #ifndef and add an assertion string. llvm-svn: 82991	2009-09-28 14:38:19 +00:00
Dan Gohman	9a7320c711	Convert LoopSimplify and LoopExtractor from FunctionPass to LoopPass. llvm-svn: 82990	2009-09-28 14:37:51 +00:00
Chris Lattner	0261b5d2d2	The select instruction is not neccesarily in the same block as the phi nodes. Make sure to phi translate from the right block. This fixes a llvm-building-llvm failure on GVN-PRE.cpp llvm-svn: 82970	2009-09-28 06:49:44 +00:00
Chris Lattner	4425660b1f	simplify some code. llvm-svn: 82936	2009-09-27 21:46:50 +00:00
Chris Lattner	b2e88cd01c	The bitcast case is not needed here: instcombine turns icmp(bitcast(x), null) -> icmp(x, null) already. llvm-svn: 82935	2009-09-27 21:42:46 +00:00
Chris Lattner	8b4d3dfbbf	calls are already unmovable, malloc doesn't need a special case. llvm-svn: 82933	2009-09-27 21:36:19 +00:00
Chris Lattner	f9e0c7f84b	calls to external functions are already marked overdefined, special casing malloc isn't needed. llvm-svn: 82932	2009-09-27 21:35:11 +00:00
Chris Lattner	5abb1e4cd2	calls are already handled, malloc doesn't need a special case. llvm-svn: 82931	2009-09-27 21:33:46 +00:00
Chris Lattner	466d57f6c1	calls are rejected above, no need to special case malloc here. llvm-svn: 82929	2009-09-27 21:31:39 +00:00
Chris Lattner	43d0db70ac	remove special handling of bitcast(malloc), it will be handled when the loop inspects the bitcast operand. llvm-svn: 82928	2009-09-27 21:29:28 +00:00
Chris Lattner	a8627272c1	unlike the malloc instruction, "malloc" calls do not claim to be readonly, just nounwind. llvm-svn: 82927	2009-09-27 21:23:38 +00:00
Chris Lattner	b391e87263	allow pushing icmps through phis with multiple uses and across critical edges. These are important to push up to encourage jump threading. This shrinks 176.gcc a bit. llvm-svn: 82923	2009-09-27 20:46:36 +00:00
Chris Lattner	ae289632ef	Enhance the previous fix for PR4895 to allow more values than just simple constants for the true/false value of the select. We now do phi translation etc. This really fixes PR4895 :) llvm-svn: 82917	2009-09-27 20:18:49 +00:00
Chris Lattner	facb867af3	implement PR4895, by making FoldOpIntoPhi handle select conditions that are phi nodes. Also tighten up FoldOpIntoPhi to treat constantexpr operands to phis just like other variables, avoiding moving constantexpr computations around. Patch by Daniel Dunbar. llvm-svn: 82913	2009-09-27 19:57:57 +00:00
Dan Gohman	0e70af36c0	Grab an LLVM Context from an instruction that exists rather than one that is deleted in some situations. This fixes a use-after-free. llvm-svn: 82903	2009-09-27 16:10:30 +00:00
Dan Gohman	fc20b67e80	Tell ScalarEvolution to forget everything it knows about a loop before rotating the loop, since loop rotation is a very significant change. llvm-svn: 82901	2009-09-27 15:37:03 +00:00
Nick Lewycky	42fb7452df	Instruction::clone does not need to take an LLVMContext&. Remove that and update all the callers. llvm-svn: 82889	2009-09-27 07:38:41 +00:00
Dan Gohman	62995c71a2	Fix SimplifyLibCalls to transfer attributes from callees rather than calls, since direct calls don't always reflect the attributes of their callees. llvm-svn: 82867	2009-09-26 18:10:13 +00:00
Dan Gohman	394468dc8e	Rename ConstantFP's getInf to getInfinity. llvm-svn: 82823	2009-09-25 23:40:21 +00:00
Dan Gohman	5ffd53892d	Transform pow(x, 0.5) to (x == -inf ? inf : fabs(sqrt(x))), which is typically faster then doing a general pow. llvm-svn: 82819	2009-09-25 23:10:17 +00:00
Torok Edwin	21bd8c9fc5	Constant propagating byval pointer is safe if function is readonly. llvm-svn: 82700	2009-09-24 18:33:42 +00:00
Torok Edwin	f95a450ef9	Don't constant propagate byval pointers, since they are not really pointers, but rather structs passed by value. This fixes PR5038. llvm-svn: 82689	2009-09-24 09:47:18 +00:00
Dale Johannesen	fb1b55bc9c	A minor improvment in accuracy to inline cost computation, and some cosmetics. llvm-svn: 82660	2009-09-23 22:05:24 +00:00
Chris Lattner	e3ce1e2a37	tidy up llvm-svn: 82488	2009-09-21 22:26:02 +00:00
Chris Lattner	247053867e	big endian systems shift by bits too, hopefully this will fix the ppc bootstrap problems. llvm-svn: 82464	2009-09-21 17:55:47 +00:00
Dan Gohman	43d6830ea0	Nick pointed out that DominanceFrontier and DominanceTree are preserved by setPreservesCFG(). llvm-svn: 82463	2009-09-21 17:54:42 +00:00
Dan Gohman	af57ae3da4	Remove the special-case for constants in PHI nodes; it's not really helpful, and it didn't correctly handle the case of constants input to PHIs for backedges. llvm-svn: 82462	2009-09-21 17:53:35 +00:00
Chris Lattner	9045f235d2	fix PR5016, a crash I introduced in GVN handing first class arrays and structs, which cannot be bitcast to integers. llvm-svn: 82460	2009-09-21 17:24:04 +00:00
Chris Lattner	4d8af2f1ae	enable non-local analysis and PRE of large store -> little load. This doesn't kick in too much because of phi translation issues, but this can be resolved in the future. llvm-svn: 82447	2009-09-21 06:48:08 +00:00
Chris Lattner	0cdc17eb50	convert an std::pair to an explicit struct. llvm-svn: 82446	2009-09-21 06:30:24 +00:00
Chris Lattner	d28f90897a	move some functions, add a comment. llvm-svn: 82444	2009-09-21 06:24:16 +00:00
Chris Lattner	9d7fb29522	split HandleLoadFromClobberingStore in two pieces: one that does the analysis, one that does the xform. llvm-svn: 82443	2009-09-21 06:22:46 +00:00
Chris Lattner	0a9616d906	Improve GVN to be able to forward substitute a small load from a piece of a large store when both are in the same block. This allows clang to compile the testcase in PR4216 to this code: _test_bitfield: movl 4(%esp), %eax movl %eax, %ecx andl $-65536, %ecx orl $32962, %eax andl $40186, %eax orl %ecx, %eax ret This is not ideal, but is a whole lot better than the code produced by llvm-gcc: _test_bitfield: movw $-32574, %ax orw 4(%esp), %ax andw $-25350, %ax movw %ax, 4(%esp) movw 7(%esp), %cx shlw $8, %cx movzbl 6(%esp), %edx orw %cx, %dx movzwl %dx, %ecx shll $16, %ecx movzwl %ax, %eax orl %ecx, %eax ret and dramatically better than that produced by gcc 4.2: _test_bitfield: pushl %ebx call L3 "L00000000001$pb": L3: popl %ebx movl 8(%esp), %eax leal 0(,%eax,4), %edx sarb $7, %dl movl %eax, %ecx andl $7168, %ecx andl $-7201, %ebx movzbl %dl, %edx andl $1, %edx sall $5, %edx orl %ecx, %ebx orl %edx, %ebx andl $24, %eax andl $-58336, %ebx orl %eax, %ebx orl $32962, %ebx movl %ebx, %eax popl %ebx ret llvm-svn: 82439	2009-09-21 05:57:11 +00:00
Chris Lattner	1eefa9c427	formatting cleanups, no functionality change. llvm-svn: 82426	2009-09-21 02:42:51 +00:00
Chris Lattner	a0aa8fb6a6	Move CoerceAvailableValueToLoadType earlier in GVN.cpp. Hook it up so that nonlocal and partially redundant loads can use it as well. The testcase shows examples of craziness this can handle. This triggers many times in 176.gcc. llvm-svn: 82403	2009-09-20 20:09:34 +00:00
Chris Lattner	7c62d8a1a8	change the interface to CoerceAvailableValueToLoadType to be more generic. llvm-svn: 82402	2009-09-20 19:31:14 +00:00
Chris Lattner	1dd48c34e5	enhance GVN to forward substitute a stored value to a load (and load -> load) when the base pointers must alias but when they are different types. This occurs very very frequently in 176.gcc and other code that uses bitfields a lot. llvm-svn: 82399	2009-09-20 19:03:47 +00:00
Daniel Dunbar	7d6781b0fe	Tabs -> spaces, and remove trailing whitespace. llvm-svn: 82355	2009-09-20 02:20:51 +00:00
Nick Lewycky	1303c0ab86	Remove the default value for ConstantStruct::get's isPacked parameter and update the code which was broken by this. llvm-svn: 82327	2009-09-19 20:30:26 +00:00
Victor Hernandez	5d034499ad	Enhance transform passes so that they apply the same tranforms to malloc calls as to MallocInst. Reviewed by Dan Gohman. llvm-svn: 82300	2009-09-18 22:35:49 +00:00
Victor Hernandez	788eaabd18	Update malloc call creation code (AllocType is now the element type of the malloc, not the resulting type). In getMallocArraySize(), fix bug in the case that array size is the product of 2 constants. Extend isArrayMalloc() and getMallocArraySize() to handle case where malloc is used as char array. Ensure that ArraySize in LowerAllocations::runOnBasicBlock() is correct type. Extend Instruction::isSafeToSpeculativelyExecute() to handle malloc calls. Add verification for malloc calls. Reviewed by Dan Gohman. llvm-svn: 82257	2009-09-18 19:20:02 +00:00
Daniel Dunbar	487d1c8138	Update CMake. llvm-svn: 82097	2009-09-17 00:06:48 +00:00
Dan Gohman	0f64d71d99	Add a new pass for doing late hoisting of floating-point and vector constants out of loops. These aren't covered by the regular LICM pass, because in LLVM IR constants don't require separate instructions. They're not always covered by the MachineLICM pass either, because it doesn't know how to unfold folded constant-pool loads. This is somewhat experimental at this point, and off by default. llvm-svn: 82076	2009-09-16 20:25:11 +00:00
Dan Gohman	bd0050810c	Change FoldPHIArgBinOpIntoPHI to decline folding if it would introduce two phis, similar to the FoldPHIArgGEPIntoPHI change. Also, delete some comments that don't reflect the code. llvm-svn: 82053	2009-09-16 16:50:24 +00:00
Andreas Neustifter	41c1103273	Reapplied r81355 with the problems fixed. (See http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090907/086737.html and http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090907/086746.html) llvm-svn: 82039	2009-09-16 11:35:50 +00:00
Andreas Neustifter	f8cb758ba8	Preserve ProfileInfo during CodeGenPrepare. llvm-svn: 82034	2009-09-16 09:26:52 +00:00
Dan Gohman	3b7ce109ec	Don't sink gep operators through phi nodes if the result would require more than one phi, since that leads to higher register pressure on entry to the phi. This is especially problematic when the phi is in a loop header, as it increases register pressure throughout the loop. llvm-svn: 81993	2009-09-16 02:01:52 +00:00

... 53 54 55 56 57 ...

11073 Commits