llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrew Trick	3ec331eaf4	Added a SimplifyIndVar utility to simplify induction variable users based on ScalarEvolution without changing the induction variable phis. This utility is the main tool of IndVarSimplifyPass, but the pass also restructures induction variables in strange ways that are sensitive to pass ordering. This provides a way for other loop passes to simplify new uses of induction variables created during transformation. The utility may be used by any pass that preserves ScalarEvolution. Soon LoopUnroll will use it. The net effect in this checkin is to cleanup the IndVarSimplify pass by factoring out the SimplifyIndVar algorithm into a standalone utility. llvm-svn: 137197	2011-08-10 03:46:27 +00:00
Eli Friedman	59b66883ea	Representation of 'atomic load' and 'atomic store' in IR. llvm-svn: 137170	2011-08-09 23:02:53 +00:00
Andrew Trick	6d45a01b67	Made SCEV's UDiv expressions more canonical. When dividing a recurrence, the initial value's low bits can sometimes be ignored. To take advantage of this, added FoldIVUser to IndVarSimplify to fold an IV operand into a udiv/lshr if the operator doesn't affect the result. -indvars -disable-iv-rewrite now transforms i = phi i4 i1 = i0 + 1 idx = i1 >> (2 or more) i4 = i + 4 into i = phi i4 idx = i0 >> ... i4 = i + 4 llvm-svn: 137013	2011-08-06 07:00:37 +00:00
Evan Cheng	e4df6a2add	Fix an obvious type. Patch by Ivan Krasin. llvm-svn: 136900	2011-08-04 18:40:26 +00:00
Eli Friedman	366bccefad	Add new atomic instructions to SCCP. No functional change, but stops debug spam. llvm-svn: 136723	2011-08-02 21:35:16 +00:00
Owen Anderson	bddf40e082	Revert r136503 and r136480 in an effort to fix non-determinism in the llvm-gcc buildbots on i386. Devang is looking into the root cause. llvm-svn: 136674	2011-08-02 02:23:42 +00:00
Bill Wendling	f891bf8b30	Add the 'resume' instruction for the new EH rewrite. This adds the 'resume' instruction class, IR parsing, and bitcode reading and writing. The 'resume' instruction resumes propagation of an existing (in-flight) exception whose unwinding was interrupted with a 'landingpad' instruction (to be added later). llvm-svn: 136589	2011-07-31 06:30:59 +00:00
Bill Wendling	ad088e6724	Revert r136253, r136263, r136269, r136313, r136325, r136326, r136329, r136338, r136339, r136341, r136369, r136387, r136392, r136396, r136429, r136430, r136444, r136445, r136446, r136253 pending review. llvm-svn: 136556	2011-07-30 05:42:50 +00:00
Devang Patel	ce0ceebb1c	Clear DbgValues in the end. llvm-svn: 136503	2011-07-29 19:49:58 +00:00
Devang Patel	3e02522fee	Clean up debug info after reassociation. llvm-svn: 136480	2011-07-29 19:00:35 +00:00
Eli Friedman	adec587d5c	Misc optimizer+codegen work for 'cmpxchg' and 'atomicrmw'. They appear to be working on x86 (at least for trivial testcases); other architectures will need more work so that they actually emit the appropriate instructions for orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC, Mips, and Alpha backends need such changes.) llvm-svn: 136457	2011-07-29 03:05:32 +00:00
Chandler Carruth	9d7feab3e0	Rewrite the CMake build to use explicit dependencies between libraries, specified in the same file that the library itself is created. This is more idiomatic for CMake builds, and also allows us to correctly specify dependencies that are missed due to bugs in the GenLibDeps perl script, or change from compiler to compiler. On Linux, this returns CMake to a place where it can reliably rebuild several targets of LLVM. I have tried not to change the dependencies from the ones in the current auto-generated file. The only places I've really diverged are in places where I was seeing link failures, and added a dependency. The goal of this patch is not to start changing the dependencies, merely to move them into the correct location, and an explicit form that we can control and change when necessary. This also removes a serialization point in the build because we don't have to scan all the libraries before we begin building various tools. We no longer have a step of the build that regenerates a file inside the source tree. A few other associated cleanups fall out of this. This isn't really finished yet though. After talking to dgregor he urged switching to a single CMake macro to construct libraries with both sources and dependencies in the arguments. Migrating from the two macros to that style will be a follow-up patch. Also, llvm-config is still generated with GenLibDeps.pl, which means it still has slightly buggy dependencies. The internal CMake 'llvm-config-like' macro uses the correct explicitly specified dependencies however. A future patch will switch llvm-config generation (when using CMake) to be based on these deps as well. This may well break Windows. I'm getting a machine set up now to dig into any failures there. If anyone can chime in with problems they see or ideas of how to solve them for Windows, much appreciated. llvm-svn: 136433	2011-07-29 00:14:25 +00:00
Benjamin Kramer	e71b9c446d	Fix a use after free. An instruction can't be both an intrinsic call and a fence. llvm-svn: 136319	2011-07-28 01:20:19 +00:00
Bill Wendling	6c923bb8d9	Merge the contents from exception-handling-rewrite to the mainline. This adds the new instructions 'landingpad' and 'resume'. llvm-svn: 136253	2011-07-27 20:18:04 +00:00
Eli Friedman	89b694b096	Misc mid-level changes for new 'fence' instruction. llvm-svn: 136205	2011-07-27 01:08:30 +00:00
Nick Lewycky	15e2d90746	Finish adding support for lifetime intrinsics to SROA. Fixes PR10121! llvm-svn: 136008	2011-07-25 23:14:22 +00:00
Nick Lewycky	77cb8e681f	Add missing space (this line is no longer pushing the 80-column limit). llvm-svn: 135973	2011-07-25 21:16:04 +00:00
Rafael Espindola	7281395c8c	Add LLVMAddLowerExpectIntrinsicPass to the C API. llvm-svn: 135966	2011-07-25 20:57:59 +00:00
Jay Foad	d1b7849d49	Convert GetElementPtrInst to use ArrayRef. llvm-svn: 135904	2011-07-25 09:48:08 +00:00
Andrew Trick	1cabe54fab	Move trip count discovery outside of the generic LoopUnroll helper. This removes its dependence on canonical induction variables. llvm-svn: 135829	2011-07-23 00:33:05 +00:00
Andrew Trick	279e7a6c83	whitespace llvm-svn: 135828	2011-07-23 00:29:16 +00:00
Dan Gohman	6320f52ff4	Move the last uses of RetainFunc etc. over to using getRetainCallee() etc. so that a declaration for objc_retain is created when needed if it doesn't already exist. rdar://9825114. llvm-svn: 135821	2011-07-22 22:29:21 +00:00
Dan Gohman	e106aee6f5	Fix MergeInVectorType to check for vector types with the same alloc size but different element types, so that it filters out the cases that CreateShuffleVectorCast doesn't handle. This fixes rdar://9786827. llvm-svn: 135721	2011-07-21 23:30:09 +00:00
Andrew Trick	cd3e8cb882	Cleanup: make std::pair usage slightly less indecipherable without actually naming variables! llvm-svn: 135684	2011-07-21 17:37:39 +00:00
Jay Foad	ed8db7d9df	Convert ConstantExpr::getGetElementPtr and ConstantExpr::getInBoundsGetElementPtr to use ArrayRef. llvm-svn: 135673	2011-07-21 14:31:17 +00:00
Chris Lattner	5cf753c95e	move tier out of an anonymous namespace, it doesn't make sense for it to be in an anon namespace and be in a header. Eliminate some extraneous uses of tie. llvm-svn: 135669	2011-07-21 06:21:31 +00:00
Andrew Trick	bd243d0dfe	LSR, correct fix for rdar://9786536. Silly casting bug. llvm-svn: 135654	2011-07-21 01:45:54 +00:00
Andrew Trick	858e9f083d	LSR must sometimes sign-extend before generating double constants. rdar://9786536 llvm-svn: 135650	2011-07-21 01:05:01 +00:00
Andrew Trick	8acb434402	LSR crashes on an empty IVUsers list. rdar://9786536 llvm-svn: 135644	2011-07-21 00:40:04 +00:00
Eli Friedman	0cdc148ab8	Bring LICM into compliance with the new "Memory Model for Concurrent Operations" in LangRef. llvm-svn: 135625	2011-07-20 21:37:47 +00:00
Jay Foad	50bfbab033	Fix a GCC warning. llvm-svn: 135581	2011-07-20 08:15:21 +00:00
Andrew Trick	638b355a16	indvars: Added getInsertPointForUses to find a valid place to truncate the IV. llvm-svn: 135568	2011-07-20 05:32:06 +00:00
Andrew Trick	2210448520	indvars -disable-iv-rewrite: Add NarrowIVDefUse to cache def-use info. Holding Use* pointers is bad form even though it happened to work in this case. llvm-svn: 135566	2011-07-20 04:39:24 +00:00
Andrew Trick	c5dd3e976a	indvars -disable-iv-rewrite fix: derived GEP IVs llvm-svn: 135558	2011-07-20 02:08:58 +00:00
Jay Foad	bf904773bb	Convert TargetData::getIndexedOffset to use ArrayRef. llvm-svn: 135478	2011-07-19 14:01:37 +00:00
Jay Foad	f4b14a2b0d	Use ArrayRef in ConstantFoldInstOperands and ConstantFoldCall. llvm-svn: 135477	2011-07-19 13:32:40 +00:00
Andrew Trick	c43b67644c	Compiler warning. llvm-svn: 135426	2011-07-18 21:15:03 +00:00
Andrew Trick	7da2417c8a	indvars: LinearFunctionTestReplace for non-canonical IVs. For -disable-iv-rewrite, perform LFTR without generating a new "canonical" induction variable. Instead find the "best" existing induction variable for use in the loop exit test and compute the final value of that IV for use in the new loop exit test. In short, convert to a simple eq/ne exit test as long as it's cheap to do so. llvm-svn: 135420	2011-07-18 20:32:31 +00:00
Andrew Trick	494c549ebd	indvars: Added verification that LFTR and other indvars goodness does not interfere with BackedgeTakenCount computation. llvm-svn: 135412	2011-07-18 18:44:20 +00:00
Andrew Trick	a27d8b183a	indvars: Added isHighCostExpansion. Avoid generating extra ops in the preheader for the sole purpose of LFTR, since LFTR itself is usually not a clear optimization. llvm-svn: 135409	2011-07-18 18:21:35 +00:00
Chris Lattner	229907cd11	land David Blaikie's patch to de-constify Type, with a few tweaks. llvm-svn: 135375	2011-07-18 04:54:35 +00:00
Andrew Trick	c591f3afc3	indvars: fix a pass-sensitivity issue that would hit the SCEVExpander assertion I added in r135333. Check for the existence of a preheader before expanding a recurrence. llvm-svn: 135335	2011-07-16 01:18:53 +00:00
Andrew Trick	9ea55dc2d6	indvars: remove ExprToIVMap because it won't be needed by LFTR. llvm-svn: 135334	2011-07-16 01:06:48 +00:00
Chad Rosier	a7ff54351a	Disable loop idiom recognition of memset/memcpy if the function being compiled is named after a common idiom (i.e., memset/memcpy). Otherwise, we can run into infinite recursion. Ideally, the user should use the correct -fno-builtin flag, but in case they don't we should play nicely. rdar://9763412 llvm-svn: 135286	2011-07-15 18:25:04 +00:00
Jay Foad	5bd375a6cc	Convert CallInst and InvokeInst APIs to use ArrayRef. llvm-svn: 135265	2011-07-15 08:37:34 +00:00
Chris Lattner	b1a1512119	start using the new helper methods a bit. llvm-svn: 135251	2011-07-15 06:08:15 +00:00
Benjamin Kramer	e6e1933f31	Change Intrinsic::getDeclaration and friends to take an ArrayRef. llvm-svn: 135154	2011-07-14 17:45:39 +00:00
Jay Foad	b804a2b751	Second attempt at de-constifying LLVM Types in FunctionType::get(), StructType::get() and TargetData::getIntPtrType(). llvm-svn: 134982	2011-07-12 14:06:48 +00:00
Bill Wendling	a78cd228c2	Revert r134893 and r134888 (and related patches in other trees). It was causing an assert on Darwin llvm-gcc builds. Assertion failed: (castIsValid(op, S, Ty) && "Invalid cast!"), function Create, file /Users/buildslave/zorg/buildbot/smooshlab/slave-0.8/build.llvm-gcc-i386-darwin9-RA/llvm.src/lib/VMCore/Instructions.cpp, line 2067. etc. http://smooshlab.apple.com:8013/builders/llvm-gcc-i386-darwin9-RA/builds/2354 --- Reverse-merging r134893 into '.': U include/llvm/Target/TargetData.h U include/llvm/DerivedTypes.h U tools/bugpoint/ExtractFunction.cpp U unittests/Support/TypeBuilderTest.cpp U lib/Target/ARM/ARMGlobalMerge.cpp U lib/Target/TargetData.cpp U lib/VMCore/Constants.cpp U lib/VMCore/Type.cpp U lib/VMCore/Core.cpp U lib/Transforms/Utils/CodeExtractor.cpp U lib/Transforms/Instrumentation/ProfilingUtils.cpp U lib/Transforms/IPO/DeadArgumentElimination.cpp U lib/CodeGen/SjLjEHPrepare.cpp --- Reverse-merging r134888 into '.': G include/llvm/DerivedTypes.h U include/llvm/Support/TypeBuilder.h U include/llvm/Intrinsics.h U unittests/Analysis/ScalarEvolutionTest.cpp U unittests/ExecutionEngine/JIT/JITTest.cpp U unittests/ExecutionEngine/JIT/JITMemoryManagerTest.cpp U unittests/VMCore/PassManagerTest.cpp G unittests/Support/TypeBuilderTest.cpp U lib/Target/MBlaze/MBlazeIntrinsicInfo.cpp U lib/Target/Blackfin/BlackfinIntrinsicInfo.cpp U lib/VMCore/IRBuilder.cpp G lib/VMCore/Type.cpp U lib/VMCore/Function.cpp G lib/VMCore/Core.cpp U lib/VMCore/Module.cpp U lib/AsmParser/LLParser.cpp U lib/Transforms/Utils/CloneFunction.cpp G lib/Transforms/Utils/CodeExtractor.cpp U lib/Transforms/Utils/InlineFunction.cpp U lib/Transforms/Instrumentation/GCOVProfiling.cpp U lib/Transforms/Scalar/ObjCARC.cpp U lib/Transforms/Scalar/SimplifyLibCalls.cpp U lib/Transforms/Scalar/MemCpyOptimizer.cpp G lib/Transforms/IPO/DeadArgumentElimination.cpp U lib/Transforms/IPO/ArgumentPromotion.cpp U lib/Transforms/InstCombine/InstCombineCompares.cpp U lib/Transforms/InstCombine/InstCombineAndOrXor.cpp U lib/Transforms/InstCombine/InstCombineCalls.cpp U lib/CodeGen/DwarfEHPrepare.cpp U lib/CodeGen/IntrinsicLowering.cpp U lib/Bitcode/Reader/BitcodeReader.cpp llvm-svn: 134949	2011-07-12 01:15:52 +00:00
Andrew Trick	cdc2297ee1	indvars: Code reorganization in preparation for LinearFunctionTestReplace rewrite. No functionality. I've been wanting to group the indvar subphases into sections and order them by their logical sequence. My next checkin adds functions related to LFTR, and doing the reorg now should help reviewers. Since, most of the code in IndVarSimplify.cpp has recently been replaced or will be replaced soon, obscuring blame should not be an issue. This seems like an ideal time to shuffle the code around. I'm happy to take more suggestions for cleaning up the code. Or if you've been wanting to cleanup anything in this file yourself, now is a good time. llvm-svn: 134941	2011-07-12 00:08:50 +00:00
Jay Foad	56cc1530ee	De-constify Types in FunctionType::get(). llvm-svn: 134888	2011-07-11 07:56:41 +00:00
Lang Hames	266dab7bab	Added recognition for signed add/sub/mul with overflow intrinsics to GVN as per Chris and Frits suggestion. llvm-svn: 134777	2011-07-09 00:25:11 +00:00
Lang Hames	29cd98fd52	Make GVN look through extractvalues for recognised intrinsics. GVN can then CSE ops that match values produced by the intrinsics. llvm-svn: 134677	2011-07-08 01:50:54 +00:00
Devang Patel	41e97da74f	Use DBG_VALUE location while inserting DBG_VALUE during alloca promotion. llvm-svn: 134568	2011-07-07 00:05:58 +00:00
Devang Patel	c6ee9181d0	Handle cases where multiple dbg.declare and dbg.value intrinsics are tied to one alloca. llvm-svn: 134549	2011-07-06 22:06:11 +00:00
Devang Patel	a3cbf52a57	Simplify. Consolidate dbg.declare handling in AllocaPromoter. llvm-svn: 134538	2011-07-06 21:09:55 +00:00
Andrew Trick	9f8c2853ca	indvars -disable-iv-rewrite: ExprToMap lives in Pass data, so be more careful about referencing values. llvm-svn: 134537	2011-07-06 21:07:10 +00:00
Andrew Trick	3239055dee	indvars -disable-iv-rewrite: Added SimplifyCongruentIVs. llvm-svn: 134530	2011-07-06 20:50:43 +00:00
Tobias Grosser	a3928f5084	LICM: Remove trailing white spaces llvm-svn: 134521	2011-07-06 19:20:02 +00:00
Tobias Grosser	4a5d9a9c20	LICM: Do not lose alignment on promotion. The promotion code lost any alignment information when hoisting loads and stores out of the loop. This led to incorrectly aligned memory accesses. We now use the largest alignment we can prove to be correct. llvm-svn: 134520	2011-07-06 19:19:55 +00:00
Jakub Staszak	3f158fdf6e	Introduce "expect" intrinsic instructions. llvm-svn: 134516	2011-07-06 18:22:43 +00:00
Devang Patel	c3239d3965	Preserve debug loc. llvm-svn: 134441	2011-07-05 21:48:22 +00:00
Andrew Trick	92905a1767	indvars -disable-iv-rewrite: avoid multiple IVs in weird cases. Putting back the helper that I removed on 7/1 to do this right. llvm-svn: 134423	2011-07-05 18:19:39 +00:00
Andrew Trick	6d12309475	indvars -disable-iv-rewrite: bug fix involving weird geps and related cleanup. llvm-svn: 134306	2011-07-02 02:34:25 +00:00
Nick Lewycky	f64a39768d	Fix likely typo, reduce number of instruction name collisions. llvm-svn: 134235	2011-07-01 06:27:03 +00:00
Andrew Trick	efe89ad414	indvars -disable-iv-rewrite: handle cloning binary operators that cannot overflow. llvm-svn: 134177	2011-06-30 19:02:17 +00:00
Andrew Trick	cc68605353	indvars -disable-iv-rewrite: handle an edge case involving identity phis. llvm-svn: 134124	2011-06-30 01:27:23 +00:00
Andrew Trick	ecdd6e4c67	indvars -disable-iv-rewrite: insert new trunc instructions carefully. llvm-svn: 134112	2011-06-29 23:03:57 +00:00
Andrew Trick	efe2b1963d	indvars -disable-iv-rewrite: just because SCEV ignores casts doesn't mean they can be removed. llvm-svn: 134054	2011-06-29 03:13:40 +00:00
Andrew Trick	4426f5b388	cleanup: misleading comment. llvm-svn: 134010	2011-06-28 16:45:04 +00:00
Andrew Trick	411daa5e81	SCEVExpander: give new insts a name that identifies the responsible pass. llvm-svn: 133992	2011-06-28 05:07:32 +00:00
Andrew Trick	60ab3efb3e	whitespace llvm-svn: 133991	2011-06-28 05:04:16 +00:00
Andrew Trick	56b315a9cf	indvars --disable-iv-rewrite: sever ties with IVUsers. llvm-svn: 133988	2011-06-28 03:01:46 +00:00
Andrew Trick	8a3c39c737	indvars --disable-iv-rewrite: Defer evaluating s/zext until SCEV evaluates all other IV exprs. llvm-svn: 133982	2011-06-28 02:49:20 +00:00
Andrew Trick	163b4a70fb	indvars -disable-iv-rewrite: run RLEV after SimplifyIVUsers for a bit more control over the order SCEVs are evaluated. llvm-svn: 133959	2011-06-27 23:17:44 +00:00
Jakub Staszak	423651e46a	Calculate GetBestDestForJumpOnUndef correctly. llvm-svn: 133946	2011-06-27 21:51:12 +00:00
Nick Lewycky	a61df3f843	Teach one piece of scalarrepl to handle lifetime markers. When transforming an alloca that only holds a copy of a global and we're going to replace the users of the alloca with that global, just nuke the lifetime intrinsics. Part of PR10121. llvm-svn: 133905	2011-06-27 05:40:02 +00:00
Jay Foad	61ea0e4692	Reinstate r133513 (reverted in r133700) with an additional fix for a -Wshorten-64-to-32 warning in Instructions.h. llvm-svn: 133708	2011-06-23 09:09:15 +00:00
Eric Christopher	96513120b7	Revert r133513: "Reinstate r133435 and r133449 (reverted in r133499) now that the clang self-hosted build failure has been fixed (r133512)." Due to some additional warnings. llvm-svn: 133700	2011-06-23 06:24:52 +00:00
Devang Patel	ea7751bc24	Set debug loc. llvm-svn: 133636	2011-06-22 19:52:36 +00:00
Andrew Trick	fc4ccb20c6	IVUsers no longer needs to record the phis. llvm-svn: 133518	2011-06-21 15:43:52 +00:00
Jay Foad	a97a2c998e	Reinstate r133435 and r133449 (reverted in r133499) now that the clang self-hosted build failure has been fixed (r133512). llvm-svn: 133513	2011-06-21 10:33:19 +00:00
Jay Foad	25127ab1e4	Don't use PN->replaceUsesOfWith() to change a PHINode's incoming blocks, because it won't work after my phi operand changes, because the incoming blocks will no longer be Uses. llvm-svn: 133512	2011-06-21 10:02:43 +00:00
Andrew Trick	69d4452f2e	indvars -disable-iv-rewrite: Adds support for eliminating identity ops. This is a rewrite of the IV simplification algorithm used by -disable-iv-rewrite. To avoid perturbing the default mode, I temporarily split the driver and created SimplifyIVUsersNoRewrite. The idea is to avoid doing opcode/pattern matching inside IndVarSimplify. SCEV already does it. We want to optimize with the full generality of SCEV, but optimize def-use chains top down on-demand rather than rewriting the entire expression bottom-up. This was easy to do for operations that SCEV can prove are identity function. So we're now eliminating bitmasks and zero extends this way. A result of this rewrite is that indvars -disable-iv-rewrite no longer requires IVUsers. llvm-svn: 133502	2011-06-21 03:22:38 +00:00
Chad Rosier	184f3b37e2	Revert r133435 and r133449 to appease buildbots. llvm-svn: 133499	2011-06-21 02:09:03 +00:00
Dan Gohman	ceaac7cb4a	Completely short-circuit out ARC optimization if the ARC runtime functions do not appear in the module. llvm-svn: 133478	2011-06-20 23:20:43 +00:00
Jay Foad	e03c05c35a	Change how PHINodes store their operands. Change PHINodes to store simple pointers to their incoming basic blocks, instead of full-blown Uses. Note that this loses an optimization in SplitCriticalEdge(), because we can no longer walk the use list of a BasicBlock to find phi nodes. See the comment I removed starting "However, the foreach loop is slow for blocks with lots of predecessors". Extend replaceAllUsesWith() on a BasicBlock to also update any phi nodes in the block's successors. This mimics what would have happened when PHINodes were proper Users of their incoming blocks. (Note that this only works if OldBB->replaceAllUsesWith(NewBB) is called when OldBB still has a terminator instruction, so it still has some successors.) llvm-svn: 133435	2011-06-20 14:38:01 +00:00
Jay Foad	372ad64b4d	Make better use of the PHINode API. Change various bits of code to make better use of the existing PHINode API, to insulate them from forthcoming changes in how PHINodes store their operands. llvm-svn: 133434	2011-06-20 14:18:48 +00:00
Cameron Zwarich	9601ddb2f3	When scalar replacement returns a vector type, only accept it if the vector type's bitwidth matches the (allocated) size of the alloca. This severely pessimizes vector scalar replacement when the only vector type being used is something like <3 x float> on x86 or ARM whose allocated size matches a <4 x float>. I hope to fix some of the flawed assumptions about allocated size throughout scalar replacement and reenable this in most cases. llvm-svn: 133338	2011-06-18 06:17:51 +00:00
Cameron Zwarich	2a26100c87	Fix an invalid bitcast crash that occurs when doing a partial memset of a vector alloca. Fixes part of <rdar://problem/9580800>. llvm-svn: 133336	2011-06-18 05:47:49 +00:00
Cameron Zwarich	cd42038fdc	Remove a pointless assignment. Nothing checks the value of VectorTy anymore now unless ScalarKind is Vector. llvm-svn: 133335	2011-06-18 05:47:45 +00:00
Dan Gohman	00fa9634d5	Fix ARCOpt to insert releases on both successors of an invoke rather than trying to insert them immediately after the invoke. llvm-svn: 133188	2011-06-16 20:57:14 +00:00
John McCall	d935e9c359	The ARC language-specific optimizer. Credit to Dan Gohman. llvm-svn: 133108	2011-06-15 23:37:01 +00:00
Eli Friedman	e8bbc10880	Stop using memdep for a check that didn't really make sense with memdep. In terms of specific issues, using memdep here checks irrelevant instructions and won't work properly once we start returning "unknown" more aggressively from memdep. llvm-svn: 133035	2011-06-15 01:25:56 +00:00
Eli Friedman	7d58bc7bc0	Add "unknown" results for memdep, which mean "I don't know whether a dependence for the given instruction exists in the given block". This cleans up all the existing hacks in memdep which represent this concept by returning clobber with various unrelated instructions. llvm-svn: 133031	2011-06-15 00:47:34 +00:00
Cameron Zwarich	b5f19d9f6f	Be more obvious about what is being tested. llvm-svn: 132982	2011-06-14 06:33:51 +00:00
Cameron Zwarich	922e4940bd	Fix grammar. llvm-svn: 132952	2011-06-13 23:39:23 +00:00
Cameron Zwarich	3ecbd59c27	Rename MergeInType to MergeInTypeForLoadOrStore. llvm-svn: 132940	2011-06-13 21:44:43 +00:00
Cameron Zwarich	8cb90ac456	Remove the HadAVector instance variable and replace it with a use of ScalarKind. llvm-svn: 132939	2011-06-13 21:44:40 +00:00
Cameron Zwarich	1bfab48edb	Remove a vacuous check. llvm-svn: 132938	2011-06-13 21:44:38 +00:00
Cameron Zwarich	5e9a0be4b3	Have SRoA explicitly track the kind of scalar it is promoting. This is pretty spartan right now, but I plan to encode more information in this enum to improve the correctness and reliability of SRoA. At least this first pass makes it possible to make VectorTy an actual VectorType. llvm-svn: 132937	2011-06-13 21:44:35 +00:00
Cameron Zwarich	8deb615d64	Remove an argument that is always true. llvm-svn: 132936	2011-06-13 21:44:31 +00:00
Cameron Zwarich	c62894d440	Remove a vacuous condition. llvm-svn: 132767	2011-06-09 01:52:44 +00:00
Cameron Zwarich	77a699a829	Fix PR10104 by adding a bounds check on a vector element access check. It was assuming that all offsets are legal vector accesses, and thus trying to access the float member of { <2 x float>, float } as the 3rd element of the first member. llvm-svn: 132766	2011-06-09 01:45:33 +00:00
Cameron Zwarich	c3b1cc9aca	Fix an asymmetry between ConvertScalar_ExtractValue and ConvertScalar_InsertValue. The former was using the size of the entire alloca, whereas the latter was correctly using the allocated size of the immediate type being converted (which may differ from the size of the alloca). This fixes PR10082. llvm-svn: 132759	2011-06-08 22:08:31 +00:00
Devang Patel	84bb33add9	Use IRBuilder, preserve line numbers. llvm-svn: 132578	2011-06-03 19:46:19 +00:00
Nick Lewycky	611582401f	Bail on unswitching a switch statement for a case with a critical edge. We name which edge to split by pred/succ pair, which means that we can end up splitting the wrong edge (by case value) in the switch statement entirely. Fixes PR10031! llvm-svn: 132535	2011-06-03 06:27:15 +00:00
Devang Patel	5127c5d9b2	Preserve line number information while converting Invoke into a Call. llvm-svn: 132505	2011-06-02 22:46:58 +00:00
Eli Friedman	5da0ff41d7	PR10067: Add missing safety check to call return transformation in MemCpyOpt::processStore. If something accesses the dest of the "copy" between the call and the copy, the performCallSlotOptzn transformation is not valid. llvm-svn: 132485	2011-06-02 21:24:42 +00:00
Nadav Rotem	707f2d7787	Fix warnings due to 132263; Thanks rdivacky. llvm-svn: 132285	2011-05-29 08:10:47 +00:00
Nadav Rotem	a9effb13dd	Refactor getActionType and getTypeToTransformTo ; place all of the 'decision' code in one place. Re-apply 131534 and fix the multi-step promotion of integers. llvm-svn: 132217	2011-05-27 21:03:13 +00:00
Eli Friedman	ddf7f55531	Attempt to preserve debug line info in LICM; as the comment in the code says, it's hard to pick good line numbers for this transformation, but something is better than nothing. rdar://9143729 llvm-svn: 132215	2011-05-27 20:31:51 +00:00
Eli Friedman	942e1c10f6	Don't sink or hoist debug info intrinsics; it isn't useful. This also prevents LICM sinking from erasing debug intrinsics which don't dominate any exit block of the loop. rdar://9143943 . llvm-svn: 132201	2011-05-27 18:37:52 +00:00
Eli Friedman	b868c83e67	Oops, wasn't intending to commit this. Partial revert of r132194. llvm-svn: 132195	2011-05-27 18:04:04 +00:00
Eli Friedman	fe84bd659c	Fix a silly mistake (which trips over an assertion) in r132099. rdar://9515076 llvm-svn: 132194	2011-05-27 18:02:04 +00:00
Chandler Carruth	07f5b65e63	Fix warning about || and && without explicit grouping. This looks like it flagged an actual bug. Devang, please review. I added the parentheses that change behavior, but make the behavior more closely match commit log's intent. llvm-svn: 132165	2011-05-26 23:37:58 +00:00
Devang Patel	bf22998f21	Do not insert anything after terminator. llvm-svn: 132164	2011-05-26 23:16:48 +00:00
Devang Patel	252f0079a9	Do not move DBG_VALUE in middle of PHI nodes. llvm-svn: 132161	2011-05-26 22:43:14 +00:00
Devang Patel	0da5250bcd	If llvm.dbg.value and the value instruction it refers to are far apart then iSel may not be able to find corresponding Node for llvm.dbg.value during DAG construction. Make iSel's life easier by removing this distance between llvm.dbg.value and its value instruction. llvm-svn: 132151	2011-05-26 21:51:06 +00:00
Andrew Trick	7fac79e255	indvars: incremental fixes for -disable-iv-rewrite and testcases. Use a proper worklist for use-def traversal without holding onto an iterator. Now that we process all IV uses, we need complete logic for reusing existing derived IV defs. See HoistStep. llvm-svn: 132103	2011-05-26 00:46:11 +00:00
Evan Cheng	9605a698b0	Simplify r132022 based on Cameron's feedback. llvm-svn: 132071	2011-05-25 18:17:13 +00:00
Andrew Trick	eb3c36e69c	indvars: fixed IV cloning in -disable-iv-rewrite mode with associated cleanup and overdue test cases. llvm-svn: 132038	2011-05-25 04:42:22 +00:00
Evan Cheng	73e6c09d5e	Forgot dyn_cast check. llvm-svn: 132025	2011-05-24 23:47:50 +00:00
Evan Cheng	1b55f56b01	Fix LoopUnswitch bug. RewriteLoopBodyWithConditionConstant can delete a dead case of a switch instruction. Back off this optimization when this would eliminate all of the predecessors to the latch. Sorry, I am unable to reduce a reasonably sized test case. rdar://9486843 llvm-svn: 132022	2011-05-24 23:12:57 +00:00
Cameron Zwarich	46e1ebf367	Clean up the lazy initialization of DIBuilder a bit. llvm-svn: 131956	2011-05-24 06:00:08 +00:00
Cameron Zwarich	843bc7d673	Make LoadAndStorePromoter preserve debug info and create llvm.dbg.values when promoting allocas to SSA variables. Fixes <rdar://problem/9479036>. llvm-svn: 131953	2011-05-24 03:10:43 +00:00
Dan Gohman	6c4a319088	When checking for signed multiplication overflow, watch out for INT_MIN and -1. This fixes PR9845. llvm-svn: 131919	2011-05-23 21:07:39 +00:00
Chris Lattner	83791ced7b	Teach valuetracking that byval arguments with a specified alignment are aligned. Teach memcpyopt to not give up all hope when confronted with an underaligned memcpy feeding an overaligned byval. If the source of the memcpy can be determined to be adequately aligned, or if it can be forced to be, we can eliminate the memcpy. This addresses PR9794. We now compile the example into: define i32 @f(%struct.p* nocapture byval align 8 %q) nounwind ssp { entry: %call = call i32 @g(%struct.p* byval align 8 %q) nounwind ret i32 %call } in both x86-64 and x86-32 mode. We still don't get a tailcall though, because tailcalls apparently can't handle byval. llvm-svn: 131884	2011-05-23 00:03:39 +00:00
Chris Lattner	c4ca7ab7e7	Fix PR9815: I was trying to get out of "generating code and then failing to form a memset, then having to delete it" but my approximation isn't safe for self recurrent loops. Instead of doing a hack, just do it the right way. llvm-svn: 131858	2011-05-22 17:39:56 +00:00
Frits van Bommel	ad964559ef	Add a parameter to ConstantFoldTerminator() that callers can use to ask it to also clean up the condition of any conditional terminator it folds to be unconditional, if that turns the condition into dead code. This just means it calls RecursivelyDeleteTriviallyDeadInstructions() in strategic spots. It defaults to the old behavior. I also changed -simplifycfg, -jump-threading and -codegenprepare to use this to produce slightly better code without any extra cleanup passes (AFAICT this was the only place in -simplifycfg where now-dead conditions of replaced terminators weren't being cleaned up). The only other user of this function is -sccp, but I didn't read that thoroughly enough to figure out whether it might be holding pointers to instructions that could be deleted by this. llvm-svn: 131855	2011-05-22 16:24:18 +00:00
Chris Lattner	f0d59072de	fix PR9841 by having GVN not process dead loads. This was causing it to get into infinite loops when it would widen a load (which can necessarily leave around dead loads). llvm-svn: 131847	2011-05-22 07:03:34 +00:00
Eli Friedman	3de2ddc578	PR7952: Make isa<> use the same logic as cast<>, so that they both work consistently. llvm-svn: 131803	2011-05-21 19:13:10 +00:00
Andrew Trick	f44aadf0fd	indvars: Prototyping Sign/ZeroExtend elimination without canonical IVs. No functionality enabled by default. Use -disable-iv-rewrite. Extended IVUsers to keep track of the phi that represents the users' IV. Added the WidenIV transform to replace a narrow IV with a wide IV by doing a one-for-one replacement of IV users instead of expanding the SCEV expressions. [sz]exts are removed and truncs are inserted. llvm-svn: 131744	2011-05-20 18:25:42 +00:00
Andrew Trick	b75279cbbd	indvars: minor cleanup in preparation for sign/zero extend elimination. llvm-svn: 131716	2011-05-20 03:37:48 +00:00
Dan Gohman	3268e4d692	When forming an ICmpZero LSRUse, normalize the non-IV operand of the comparison, so that the resulting expression is fully normalized. This fixes PR9939. llvm-svn: 131576	2011-05-18 21:02:18 +00:00
Duncan Sands	3d9407f4eb	Revert commit 131534 since it seems to have broken several buildbots. Original log entry: Refactor getActionType and getTypeToTransformTo ; place all of the 'decision' code in one place. llvm-svn: 131536	2011-05-18 14:57:56 +00:00
Nadav Rotem	c5c27ede55	Refactor getActionType and getTypeToTransformTo ; place all of the 'decision' code in one place. llvm-svn: 131534	2011-05-18 12:26:38 +00:00
Devang Patel	341b38c22a	Preserve line number information. llvm-svn: 131482	2011-05-17 20:00:02 +00:00
Devang Patel	c5933f2418	Set debug loc for new load instruction. llvm-svn: 131481	2011-05-17 19:43:38 +00:00
Rafael Espindola	2050af838d	Don't do tail calls in a function that calls setjmp. The stack might be corrupted when setjmp returns again. llvm-svn: 131399	2011-05-16 03:05:33 +00:00
Andrew Trick	03957dfeb1	Convert SimplifyIVUsers into a worklist instead of a single pass over the users. llvm-svn: 131277	2011-05-13 01:12:21 +00:00
Andrew Trick	81683ed232	indvars: Added SimplifyIVUsers. Interleave IV simplifications. Currently involves EliminateComparison and EliminateRemainder. Next I'll add EliminateExtend. llvm-svn: 131210	2011-05-12 00:04:28 +00:00
Duncan Sands	a071c82900	Fix PR9820: a read-only call differs from a load in that a load doesn't return the pointer being dereferenced, it returns the pointee, but a call might return the pointer itself. llvm-svn: 130979	2011-05-06 10:30:37 +00:00
Devang Patel	ffb798c1c6	Set debug loc for new instructions. llvm-svn: 130895	2011-05-04 23:58:50 +00:00
Devang Patel	306f8db721	Preserve line number information while threading jumps. llvm-svn: 130880	2011-05-04 22:48:19 +00:00
Devang Patel	c7e4fa7c19	Preserve line number info. llvm-svn: 130876	2011-05-04 21:58:58 +00:00
Devang Patel	0daa07eb90	preserve line number info. llvm-svn: 130869	2011-05-04 21:37:05 +00:00
Andrew Trick	1abe296cfd	indvars: Added DisableIVRewrite and WidenIVs. This adds functionality to remove sign/zero extension during indvars without generating a canonical IV and rewriting all IV users. It's disabled by default so should have no effect on codegen. Work in progress. llvm-svn: 130829	2011-05-04 02:10:13 +00:00
Andrew Trick	38c4e34abb	indvars: Added canExpandBackEdgeTakenCount. Only create a canonical IV for backedge taken count if it will actually be used by LinearFunctionTestReplace. And some related cleanup, preparing to reduce dependence on canonical IVs. No significant effect on x86 or arm in the test-suite. llvm-svn: 130799	2011-05-03 22:24:10 +00:00
Dan Gohman	6136e94897	Add an unfolded offset field to LSR's Formula record. This is used to model constants which can be added to base registers via add-immediate instructions which don't require an additional register to materialize the immediate. llvm-svn: 130743	2011-05-03 00:46:49 +00:00
Chris Lattner	23f61a09af	enhance memcpyopt to obey -fno-builtin and friends. This addresses a problem reported on cfe-dev. llvm-svn: 130661	2011-05-01 18:27:11 +00:00
Devang Patel	c1f7c1d469	Preserve line number information. llvm-svn: 130536	2011-04-29 20:38:55 +00:00
Devang Patel	80d1d3aaec	Preserve line number information. llvm-svn: 130450	2011-04-28 22:48:14 +00:00
Chris Lattner	a5452c0d67	improve comment. llvm-svn: 130426	2011-04-28 20:02:57 +00:00
Devang Patel	33d87d97f6	Do not lose line number info while eliminating tail call. llvm-svn: 130419	2011-04-28 18:43:39 +00:00
Chris Lattner	1777601a74	final step needed to resolve PR6627, which allows us to flatten the code down to a nice and tidy: %x1 = load i32* %0, align 4 %1 = icmp eq i32 %x1, 1179403647 br i1 %1, label %if.then, label %if.end instead of doing lots of loads and branches. May the FreeBSD bootloader long fit in its allocated space. llvm-svn: 130416	2011-04-28 18:15:47 +00:00
Chris Lattner	45e393fc9c	code cleanups only. llvm-svn: 130414	2011-04-28 18:08:21 +00:00
Andrew Trick	c4456ae6ec	Reapply r130340: Fix for PR9730. llvm-svn: 130408	2011-04-28 17:30:04 +00:00
Chris Lattner	f81f789b6c	centralize "marking for deletion" into a helper function. Pass GVN around to static functions instead of passing around tons of random ivars. llvm-svn: 130403	2011-04-28 16:36:48 +00:00
Chris Lattner	6cec6ab275	Promote toErase to be an ivar of the GVN class. llvm-svn: 130401	2011-04-28 16:18:52 +00:00
Chris Lattner	827a270a2a	teach GVN to widen integer loads when they are overaligned, when doing a wider load would allow elimination of subsequent loads, and when the wider load is still a native integer type. This eliminates a ton of loads on various benchmarks involving struct fields, though it is somewhat hobbled by clang not being very aggressive about field alignment. This is yet another step along the way towards resolving PR6627. llvm-svn: 130390	2011-04-28 07:29:08 +00:00
Andrew Trick	1e34241abd	Reverting r130340 in the unlikely event that it's responsible for a llvm-gcc stage2 compiler error. llvm-svn: 130350	2011-04-28 00:13:59 +00:00
Andrew Trick	29ac7b8858	Fixes PR9730: indvars: An asserting value handle still pointed to this value Modified LinearFunctionTestReplace to push the condition on the dead list instead of eagerly deleting it. This can cause unnecessary IV rewrites, which should have no effect on codegen and will not be an issue once we stop generating canonical IVs. llvm-svn: 130340	2011-04-27 23:00:03 +00:00
Devang Patel	12bf0ab4b5	Simplify cfg inserts a call to trap when unreachable code is detected. Assign DebugLoc to this new trap instruction. llvm-svn: 130315	2011-04-27 17:59:27 +00:00
Chris Lattner	eb045f9c02	Improve the bail-out predicate to really only kick in when phi translation fails. We were bailing out in some cases that would cause us to miss GVN'ing some non-local cases away. llvm-svn: 130206	2011-04-26 17:41:02 +00:00
Chris Lattner	6f83d06ffa	Enhance MemDep: When alias analysis returns a partial alias result, return it as a clobber. This allows GVN to do smart things. Enhance GVN to be smart about the case when a small load is clobbered by a larger overlapping load. In this case, forward the value. This allows us to compile stuff like this: int test(void *P) { int tmp = *(unsigned int*)P; return tmp+*((unsigned char*)P+1); } into: _test: ## @test movl (%rdi), %ecx movzbl %ch, %eax addl %ecx, %eax ret which has one load. We already handled the case where the smaller load was from a must-aliased base pointer. llvm-svn: 130180	2011-04-26 01:21:15 +00:00
Jay Foad	1a180156b6	Remove unused STL header includes. llvm-svn: 130068	2011-04-23 19:53:52 +00:00
Cameron Zwarich	ca4c633489	Fix another case of <rdar://problem/9184212> that only occurs with code generated by llvm-gcc, since llvm-gcc uses 2 i64s for passing a 4 x float vector on ARM rather than an i64 array like Clang. llvm-svn: 129878	2011-04-20 21:48:38 +00:00
Cameron Zwarich	76dfa226cf	The bitcast case here is actually handled uniformly earlier in the function, so delete it. llvm-svn: 129877	2011-04-20 21:48:34 +00:00
Cameron Zwarich	4cd9a4a975	Cleanup some code to better use an early return style in preparation for adding more cases. llvm-svn: 129876	2011-04-20 21:48:16 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Owen Anderson	92651ec374	Fix an infinite alternation in JumpThreading where two transforms would repeatedly undo each other. The solution is to perform more aggressive constant folding to make one of the edges just folded away rather than trying to thread it. Fixes <rdar://problem/9284786>. Discovered with CSmith. llvm-svn: 129538	2011-04-14 21:35:50 +00:00
Mon P Wang	1cde91674a	Cleanup r129509 based on comments by Chris llvm-svn: 129532	2011-04-14 19:20:42 +00:00
Mon P Wang	0f6bad7b6e	Cleanup r129472 by using a utility routine as suggested by Eli. llvm-svn: 129509	2011-04-14 08:04:01 +00:00
Chris Lattner	35a65b2aa6	fix a couple -Wsign-compare warnings. llvm-svn: 129501	2011-04-14 02:27:25 +00:00
Mon P Wang	2e5528f0b2	Vectors with different number of elements of the same element type can have the same allocation size but different primitive sizes (e.g., <3xi32> and <4xi32>). When ScalarRepl promotes them, it can't use a bit cast but should use a shuffle vector instead. llvm-svn: 129472	2011-04-13 21:40:02 +00:00
Junjie Gu	377cc31a74	Fixed the revision 129449. llvm-svn: 129450	2011-04-13 16:45:49 +00:00
Junjie Gu	7c3b4593b5	Passing unroll parameters (unroll-count, threshold, and partial unroll) via LoopUnroll class's ctor. Doing so will allow multiple contexts with different loop unroll parameters to run. This is a minor change and has no effect on existing applications. llvm-svn: 129449	2011-04-13 16:15:29 +00:00
Rafael Espindola	6aafb64daf	Add the alias analysis to the C api. llvm-svn: 129447	2011-04-13 15:44:58 +00:00
Bill Wendling	b902f1dd88	Reapply r129401 with patch for clang. llvm-svn: 129419	2011-04-13 00:36:11 +00:00
Bill Wendling	dbfde42468	Revert r129401 for now. Clang is using the old way of doing things. llvm-svn: 129403	2011-04-12 22:59:27 +00:00
Bill Wendling	47c24875a1	Remove the unaligned load intrinsics in favor of using native unaligned loads. Now that we have a first-class way to represent unaligned loads, the unaligned load intrinsics are superfluous. First part of <rdar://problem/8460511>. llvm-svn: 129401	2011-04-12 22:46:31 +00:00
Dan Gohman	1c6c34834b	Fix reassociate to use a worklist instead of recursing when new reassociation opportunities are exposed. This fixes a bug where the nested reassociation expects the IR to be consistent, but it isn't, because the outer reassociation has disconnected some of the operands. rdar://9167457 llvm-svn: 129324	2011-04-12 00:11:56 +00:00
Jay Foad	7c14a558fe	Don't include Operator.h from InstrTypes.h. llvm-svn: 129271	2011-04-11 09:35:34 +00:00
Chris Lattner	88974f4625	fix PR9523, a crash in looprotate on a non-canonical loop made out of indirectbr. llvm-svn: 129203	2011-04-09 07:25:58 +00:00
Chris Lattner	af1bccec68	Fix a bug where RecursivelyDeleteTriviallyDeadInstructions could delete the instruction pointed to by CGP's current instruction iterator, leading to a crash on the testcase. This fixes PR9578. llvm-svn: 129200	2011-04-09 07:05:44 +00:00
Rafael Espindola	e4e4e37580	Expose more passes to the C API. llvm-svn: 129087	2011-04-07 18:20:46 +00:00
Eli Friedman	c5f22a7815	PR9634: Don't unconditionally tell the AliasSetTracker that the PreheaderLoad is equivalent to any other relevant value; it isn't true in general. If it is equivalent, the LoopPromoter will tell the AST the equivalence. Also, delete the PreheaderLoad if it is unused. Chris, since you were the last one to make major changes here, can you check that this is sane? llvm-svn: 129049	2011-04-07 01:35:06 +00:00
Bill Wendling	5034159c5f	* The DSE code that tested for overlapping needed to take into account the fact that one of the numbers is signed while the other is unsigned. This could lead to a wrong result when the signed was promoted to an unsigned int. * Add the data layout line to the testcase so that it will test the appropriate thing. Patch by David Terei! llvm-svn: 128577	2011-03-30 21:37:19 +00:00
Jay Foad	52131344a2	Remove PHINode::reserveOperandSpace(). Instead, add a parameter to PHINode::Create() giving the (known or expected) number of operands. llvm-svn: 128537	2011-03-30 11:28:46 +00:00
Jay Foad	e0938d8a87	(Almost) always call reserveOperandSpace() on newly created PHINodes. llvm-svn: 128535	2011-03-30 11:19:20 +00:00
Benjamin Kramer	e41395ac24	DSE: Remove an early exit optimization that depended on the ordering of a SmallPtrSet. Fixes PR9569 and will hopefully make selfhost on ASLR-enabled systems more deterministic. llvm-svn: 128482	2011-03-29 20:28:57 +00:00
Cameron Zwarich	ff811cc475	Do some simple copy propagation through integer loads and stores when promoting vector types. This helps a lot with inlined functions when using the ARM soft float ABI. Fixes <rdar://problem/9184212>. llvm-svn: 128453	2011-03-29 05:19:52 +00:00
Bill Wendling	b5139920d6	Simplification noticed by Frits. llvm-svn: 128333	2011-03-26 09:32:07 +00:00
Bill Wendling	19f33b9393	Rework the logic that determines if a store completely overlaps an earlier store. There are two ways that a later store can completely overlap a previous store: 1. They both start at the same offset, but the earlier store's size is <= the later's size, or 2. The earlier store's offset is > the later's offset, but its offset + size doesn't extend past the later's offset + size. llvm-svn: 128332	2011-03-26 08:02:59 +00:00
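The two cases listed in r128332 can be sketched as a small predicate. This is an illustrative sketch only, not the DSE code itself: the function name, the parameter names, and the choice of signed 64-bit values for both offsets and sizes are assumptions. Signed arithmetic is used so that a negative GEP offset (as in PR9561) is never silently promoted to a large unsigned number.

```cpp
#include <cstdint>

// Hypothetical helper, not an LLVM API: returns true when the later store
// covers every byte written by the earlier store. Offsets are byte offsets
// from a common base pointer and may be negative.
bool isCompletelyOverwritten(int64_t EarlierOff, int64_t EarlierSize,
                             int64_t LaterOff, int64_t LaterSize) {
  // Case 1: both stores start at the same offset and the earlier store is
  // no larger than the later one.
  if (EarlierOff == LaterOff)
    return EarlierSize <= LaterSize;

  // Case 2: the earlier store starts after the later one and does not
  // extend past the end of the later store.
  if (EarlierOff > LaterOff)
    return EarlierOff + EarlierSize <= LaterOff + LaterSize;

  // The earlier store starts before the later one, so at least its first
  // byte survives.
  return false;
}
```

For instance, an 8-byte store at offset 0 completely overwrites a 4-byte store at offset 4, but not a 4-byte store at offset -4.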
Cameron Zwarich	d4174ee43e	Fix a typo and add a test. llvm-svn: 128331	2011-03-26 04:58:50 +00:00
Bill Wendling	db40b5c899	PR9561: A store with a negative offset (via GEP) could erroneously say that it completely overlaps a previous store, thus mistakenly deleting that store. Check for this condition. llvm-svn: 128319	2011-03-26 01:20:37 +00:00
Cameron Zwarich	74157ab3e5	Debug intrinsics must be skipped at the beginning and ends of blocks, lest they affect the generated code. llvm-svn: 128217	2011-03-24 16:34:59 +00:00
Cameron Zwarich	2edfe778ec	It is enough for the CallInst to have no uses to be made a tail call with a ret void; it doesn't need to have a void type. llvm-svn: 128212	2011-03-24 15:54:11 +00:00
Devang Patel	8f606d7b9b	s/UpdateDT/ModifiedDT/g llvm-svn: 128211	2011-03-24 15:35:25 +00:00
Cameron Zwarich	4649f17db1	Do early taildup of ret in CodeGenPrepare for potential tail calls that have a void return type. This fixes PR9487. llvm-svn: 128197	2011-03-24 04:52:10 +00:00
Cameron Zwarich	0e331c05ae	Use an early return instead of a long if block. llvm-svn: 128196	2011-03-24 04:52:07 +00:00
Cameron Zwarich	dd84bcce8f	When UpdateDT is set, DT is invalid, which could cause problems when trying to use it later. I couldn't make a test that hits this with the current code. llvm-svn: 128195	2011-03-24 04:52:04 +00:00
Cameron Zwarich	47e7175fe9	Check for TLI so that -codegenprepare can be used from opt. llvm-svn: 128194	2011-03-24 04:51:51 +00:00
Cameron Zwarich	10ebc189ee	Fix PR9464 by correcting some math that just happened to be right in most cases that were hit in practice. llvm-svn: 128146	2011-03-23 05:25:55 +00:00
Evan Cheng	0663f23bd8	Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified. llvm-svn: 127981	2011-03-21 01:19:09 +00:00
Daniel Dunbar	327cd36f74	Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR", it broke a lot of things. llvm-svn: 127954	2011-03-19 21:47:14 +00:00
Evan Cheng	824a711305	SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this: int f1(void); int f2(void); int f3(void); int f4(void); int f5(void); int f6(void); int foo(int x) { switch(x) { case 1: return f1(); case 2: return f2(); case 3: return f3(); case 4: return f4(); case 5: return f5(); case 6: return f6(); } } => LBB0_2: ## %sw.bb callq _f1 popq %rbp ret LBB0_3: ## %sw.bb1 callq _f2 popq %rbp ret LBB0_4: ## %sw.bb3 callq _f3 popq %rbp ret This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch: sw.bb7: ; preds = %entry %call8 = tail call i32 @f5() nounwind br label %return sw.bb9: ; preds = %entry %call10 = tail call i32 @f6() nounwind br label %return return: %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ] ret i32 %retval.0 This allows codegen to generate better code like this: LBB0_2: ## %sw.bb jmp _f1 ## TAILCALL LBB0_3: ## %sw.bb1 jmp _f2 ## TAILCALL LBB0_4: ## %sw.bb3 jmp _f3 ## TAILCALL rdar://9147433 llvm-svn: 127953	2011-03-19 17:17:39 +00:00
Andrew Trick	f8f67f0188	Remove TargetData and ValueTracking includes. I didn't mean for them to sneak in my last checkin. llvm-svn: 127842	2011-03-18 00:36:39 +00:00
Andrew Trick	87716c93c2	Added isValidRewrite() to check the result of ScalarEvolutionExpander. SCEV may generate expressions composed of multiple pointers, which can lead to invalid GEP expansion. Until we can teach SCEV to follow strict pointer rules, make sure no bad GEPs creep into IR. Fixes rdar://problem/9038671. llvm-svn: 127839	2011-03-17 23:51:11 +00:00
Andrew Trick	e44f0d94f6	whitespace llvm-svn: 127837	2011-03-17 23:46:48 +00:00
Cameron Zwarich	7599b106b7	Fix a comment. llvm-svn: 127728	2011-03-16 08:13:42 +00:00
Cameron Zwarich	0454253d7a	Only convert allocas to scalars if it is profitable. The profitability metric I chose is having a non-memcpy/memset use and being larger than any native integer type. Originally I chose having an access of a size smaller than the total size of the alloca, but this caused some minor issues on the spirit benchmark where SRoA runs again after some inlining. This fixes <rdar://problem/8613163>. llvm-svn: 127718	2011-03-16 00:13:44 +00:00
Cameron Zwarich	b51c830f7c	Better use initializer lists. llvm-svn: 127716	2011-03-16 00:13:37 +00:00
Cameron Zwarich	63062ccf85	Add a clarifying comment. llvm-svn: 127715	2011-03-16 00:13:35 +00:00
Andrew Trick	8b55b736b1	Added SCEV::NoWrapFlags to manage unsigned, signed, and self wrap properties. Added the self-wrap flag for SCEV::AddRecExpr. A slew of temporary FIXMEs indicate the intention of the no-self-wrap flag without changing behavior in this revision. llvm-svn: 127590	2011-03-14 16:50:06 +00:00
Andrew Trick	328b223bb1	whitespace llvm-svn: 127589	2011-03-14 16:48:10 +00:00
Cameron Zwarich	338d362200	Roll r127459 back in: Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127498	2011-03-11 21:52:04 +00:00
Daniel Dunbar	94ccb27b43	Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often get created from the", it broke some GCC test suite tests. llvm-svn: 127477	2011-03-11 19:30:30 +00:00
Cameron Zwarich	cc27b3acc4	Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127459	2011-03-11 04:54:27 +00:00
Dan Gohman	affbc66f60	RecursivelyDeleteTriviallyDeadInstructions only needs a Value, not an Instruction, so casting is not necessary. Also, it's theoretically possible that the Value is not an Instruction, since WeakVH follows RAUWs. llvm-svn: 127427	2011-03-10 20:57:44 +00:00
Dan Gohman	154ed49784	Fix reassociate to postpone certain instruction deletions until after it has finished all of its reassociations, because its habit of unlinking operands and holding them in a datastructure while working means that it's not easy to determine when an instruction is really dead until after all its regular work is done. rdar://9096268. llvm-svn: 127424	2011-03-10 19:51:54 +00:00
Devang Patel	13f8c7d48e	Preserve line number information while simplifying libcalls. llvm-svn: 127362	2011-03-09 21:27:52 +00:00
Cameron Zwarich	19f2b3c652	Fix a crasher introduced by r127317 that is seen on the bots when using an alloca as both integer and floating-point vectors of the same size. Bugpoint is not cooperating with me, but I'll try to find a manual testcase tomorrow. llvm-svn: 127320	2011-03-09 07:34:11 +00:00
Cameron Zwarich	3b649f4d01	Add support to scalar replacement for partial vector accesses of an alloca, e.g. a union of a float, <2 x float>, and <4 x float>. This mostly comes up with the use of vector intrinsics, especially in NEON when programmers know the layout of the register file. This enables codegen to eliminate a lot of the subregister traffic it would otherwise generate. This commit only enables this for a small number of floating-point cases, but a lot more integer cases. I assume this is okay for all ports, but I did not do extensive testing of the quality of code involving i512 vectors and the like. If there is a use case where this generates worse code than before, let me know and we can scale it back. This fixes <rdar://problem/9036264>. llvm-svn: 127317	2011-03-09 05:43:05 +00:00
Cameron Zwarich	43a241fa06	Move vector type merging to a separate function in preparation for it getting more complicated. llvm-svn: 127316	2011-03-09 05:43:01 +00:00
Devang Patel	97d0be8ee1	While sinking an instruction, do not lose llvm.dbg.value intrinsic. llvm-svn: 127214	2011-03-08 03:06:19 +00:00
Devang Patel	d00c628f8f	Preserve line no. info. Radar 9097659 llvm-svn: 127182	2011-03-07 22:43:45 +00:00
Cameron Zwarich	13c885d193	Fix PR9398 - 10% of llc compile time is spent in Value::getNumUses. This reduces the percentage of time spent in CodeGenPrepare when llcing 403.gcc from 12.6% to 1.8% of total llc time. llvm-svn: 127069	2011-03-05 08:12:26 +00:00
Richard Osborne	5003782293	Fix typo in comment. llvm-svn: 126941	2011-03-03 14:21:22 +00:00
Richard Osborne	af52c52569	Optimize fprintf -> fiprintf if there are no floating point arguments and fiprintf is available on the target. llvm-svn: 126940	2011-03-03 14:20:22 +00:00
Richard Osborne	2dfb888392	Optimize sprintf -> siprintf if there are no floating point arguments and siprintf is available on the target. llvm-svn: 126937	2011-03-03 14:09:28 +00:00
Richard Osborne	815de536e5	Optimize printf -> iprintf if there are no floating point arguments and iprintf is available on the target. Currently iprintf is only marked as being available on the XCore. llvm-svn: 126935	2011-03-03 13:17:51 +00:00
Cameron Zwarich	86ade9510f	Remove some more unused code that I missed. llvm-svn: 126826	2011-03-02 03:48:29 +00:00
Cameron Zwarich	5dd2aa2615	Eliminate the unused CodeGenPrepare option to split critical edges. llvm-svn: 126825	2011-03-02 03:31:46 +00:00
Cameron Zwarich	b7f8eaafa3	Stop computing the number of uses twice per value in CodeGenPrepare's sinking of addressing code. On 403.gcc this almost halves CodeGenPrepare time and reduces total llc time by 9.5%. Unfortunately, getNumUses() is still the hottest function in llc. llvm-svn: 126782	2011-03-01 21:13:53 +00:00
Ted Kremenek	20164dcc68	Unbreak CMake build. llvm-svn: 126715	2011-02-28 23:56:33 +00:00
Chris Lattner	1ac5e0c5c6	update cmake llvm-svn: 126694	2011-02-28 22:45:25 +00:00
Dan Gohman	06d70015ce	Delete the GEPSplitter experiment. llvm-svn: 126671	2011-02-28 19:47:47 +00:00
Dan Gohman	b8a25f49f3	Delete the SimplifyHalfPowrLibCalls pass, which was unused, and only existed as the result of a misunderstanding. llvm-svn: 126669	2011-02-28 19:41:14 +00:00
Chris Lattner	eddb33ebd0	wire TargetLibraryInfo into simplify libcalls and use it in a couple of trivial places. This pass needs a lot of work. llvm-svn: 126367	2011-02-24 07:16:14 +00:00
Chris Lattner	2e56e20662	move a massive amount of code out into its own helper function to reduce nesting. This needs to be turned into a table. llvm-svn: 126366	2011-02-24 07:12:12 +00:00
Cameron Zwarich	826308586c	Make LoopDeletion work on loops with multiple edges, as long as the incoming values from all of the loop's exiting blocks are equal. Patch by Andrew Clinton. llvm-svn: 126253	2011-02-22 22:25:39 +00:00
Chris Lattner	2333ac279f	fix a crasher in disabled code (on variable stride loops) llvm-svn: 126125	2011-02-21 17:02:55 +00:00
Chris Lattner	bc661d6686	Add some (disabled code) to print out negative strides. llvm-svn: 126102	2011-02-21 02:08:54 +00:00
Chris Lattner	72a35fb974	rewrite the memset_pattern pattern generation stuff to accept any 2/4/8/16-byte constant, including globals. This makes us generate much more "pretty" pattern globals as well because it doesn't break it down to an array of bytes all the time. This enables us to handle stores of relocatable globals. This kicks in about 48 times in 254.gap, giving us stuff like this: @.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader, %struct.TypHeader)] [%struct.TypHeader (%struct.TypHeader, %struct .TypHeader)* @IsFalse, %struct.TypHeader* (%struct.TypHeader, %struct.TypHeader)* @IsFalse], align 16 ... call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader, %struct.TypHeader)] @.memset_pattern40 to i8* ), i64 %tmp75) nounwind llvm-svn: 126044	2011-02-19 19:56:44 +00:00
Chris Lattner	0f4a64011e	Implement rdar://9009151, transforming strided loop stores of unsplatable values into memset_pattern16 when it is available (recent darwins). This transforms lots of strided loop stores of ints for example, like 5 in vpr: Formed memset: call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25) from store to: {%3,+,4}<%11> at: store i32 3, i32* %scevgep, align 4, !tbaa !4 llvm-svn: 126040	2011-02-19 19:31:39 +00:00
Chris Lattner	e6b261fec5	Make loop-idiom use TargetLibraryInfo to determine whether it is allowed to hack on memset, memcpy etc. llvm-svn: 125974	2011-02-18 22:22:15 +00:00
Chris Lattner	1a924e770a	prevent jump threading from merging blocks when their address is taken (and used!). This prevents merging the blocks (invalidating the block addresses) in a case like this: #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; }) void foo() { printf("%p\n", _THIS_IP_); printf("%p\n", _THIS_IP_); printf("%p\n", _THIS_IP_); } which fixes PR4151. llvm-svn: 125829	2011-02-18 04:43:06 +00:00
Chris Lattner	3eb0af94c4	fix PR9215, preventing -reassociate from clearing nsw/nuw when it swaps the LHS/RHS of a single binop. llvm-svn: 125700	2011-02-17 01:29:24 +00:00
Duncan Sands	75b5d27b84	Spelling fix: consequtive -> consecutive. llvm-svn: 125563	2011-02-15 09:23:02 +00:00
Chris Lattner	69229316aa	convert ConstantVector::get to use ArrayRef. llvm-svn: 125537	2011-02-15 00:14:00 +00:00
Devang Patel	3058398655	Do not hoist @llvm.dbg.value. Here, @llvm.dbg.value is "referring" to a value that is modified inside the loop. llvm-svn: 125529	2011-02-14 23:03:23 +00:00
Chris Lattner	34442e6ebf	revert my ConstantVector patch, it seems to have made the llvm-gcc builders unhappy. llvm-svn: 125504	2011-02-14 18:15:46 +00:00
Chris Lattner	d9f5b88548	Switch ConstantVector::get to use ArrayRef instead of a pointer+size idiom. Change various clients to simplify their code. llvm-svn: 125487	2011-02-14 07:55:32 +00:00
Daniel Dunbar	210ce0feb5	SimplifyLibCalls: Add missing legalize check on various printf to puts and putchar transforms, their return values are not compatible. llvm-svn: 125442	2011-02-12 18:19:57 +00:00
Cameron Zwarich	99de19b3cb	Make LoopUnswitch preserve ScalarEvolution by just forgetting everything about a loop when unswitching it. It only does this in the complex case, because everything should be fine already in the simple case. llvm-svn: 125369	2011-02-11 06:08:28 +00:00
Cameron Zwarich	25cb63c791	LoopInstSimplify preserves ScalarEvolution. llvm-svn: 125368	2011-02-11 06:08:25 +00:00
Cameron Zwarich	97dae4d361	If we can't avoid running loop-simplify twice for now, at least avoid running iv-users twice. llvm-svn: 125318	2011-02-10 23:53:14 +00:00
Eric Christopher	da6bd45088	Revert this in an attempt to bring the builders back. llvm-svn: 125257	2011-02-10 01:48:24 +00:00
Cameron Zwarich	58c8670ab2	Turn this pass ordering: Natural Loop Information Loop Pass Manager Canonicalize natural loops Scalar Evolution Analysis Loop Pass Manager Induction Variable Users Canonicalize natural loops Induction Variable Users Loop Strength Reduction into this: Scalar Evolution Analysis Loop Pass Manager Canonicalize natural loops Induction Variable Users Loop Strength Reduction This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of thing automatically, but it seems easier to just change the ordering of the passes if this is the only case. llvm-svn: 125254	2011-02-10 01:07:54 +00:00
Dan Gohman	de7f699754	Don't split any loop backedges, including backedges of loops other than the active loop. This is generally desirable, and it avoids trouble in situations such as the testcase in PR9123, though the failure mode depends on use-list order, so it is infeasible to test. llvm-svn: 125065	2011-02-08 00:55:13 +00:00
Dan Gohman	08d2c98c23	Fix reassociate to clear optional flags, such as nsw. llvm-svn: 124712	2011-02-02 02:02:34 +00:00
Francois Pichet	326e4a2966	Unbreak the MSVC build. The DEBUG() call at line 606 demands to see raw_ostream's definition. I have no idea why this seems to only break MSVC. llvm-svn: 124545	2011-01-29 20:06:16 +00:00
Evan Cheng	73c29178ac	Add a test for TCE return duplication. llvm-svn: 124527	2011-01-29 04:53:35 +00:00
Evan Cheng	d983eba7dc	Re-apply r124518 with fix. Watch out for invalidated iterator. llvm-svn: 124526	2011-01-29 04:46:23 +00:00
Evan Cheng	65b8ccf6ac	Revert r124518. It broke Linux self-host. llvm-svn: 124522	2011-01-29 02:43:04 +00:00
Evan Cheng	d4eff31476	Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand. llvm-svn: 124518	2011-01-29 01:29:26 +00:00
Duncan Sands	69bdb585b2	Fix PR9039, a use-after-free in reassociate. The issue was that the operand being factorized (and erased) could occur several times in Ops, resulting in freed memory being used when the next occurrence in Ops was analyzed. llvm-svn: 124287	2011-01-26 10:08:38 +00:00
Dan Gohman	0f124e1987	Give GetUnderlyingObject a TargetData, to keep it in sync with BasicAA's DecomposeGEPExpression, which recently began using a TargetData. This fixes PR8968, though the testcase is awkward to reduce. Also, update several of GetUnderlyingObject's users which happen to have a TargetData handy to pass it in. llvm-svn: 124134	2011-01-24 18:53:32 +00:00
Chris Lattner	d83e7b0ff6	enhance SRoA to promote allocas that are used by PHI nodes. This often occurs because instcombine sinks loads and inserts phis. This kicks in on such apps as 175.vpr, eon, 403.gcc, xalancbmk and a bunch of times in spec2006 in some app that uses std::deque. This resolves the last of rdar://7339113. llvm-svn: 124090	2011-01-24 01:07:11 +00:00
Chris Lattner	a960725d18	Enhance SRoA to promote allocas that are used by selects in some common cases. This triggers a surprising number of times in SPEC2K6 because min/max idioms end up doing this. For example, code from the STL ends up looking like this to SRoA: %202 = load i64* %__old_size, align 8, !tbaa !3 %203 = load i64* %__old_size, align 8, !tbaa !3 %204 = load i64* %__n, align 8, !tbaa !3 %205 = icmp ult i64 %203, %204 %storemerge.i = select i1 %205, i64* %__n, i64* %__old_size %206 = load i64* %storemerge.i, align 8, !tbaa !3 We can now promote both the __n and the __old_size allocas. This addresses another chunk of rdar://7339113, poor codegen on stringswitch. llvm-svn: 124088	2011-01-23 22:04:55 +00:00
Chris Lattner	9491dee24e	Enhance SRoA to be more aggressive about scalarization of aggregate allocas that have PHI or select uses of their element pointers. This can often happen when instcombine sinks two loads into a successor, inserting a phi or select. With this patch, we can scalarize the alloca, but the pinned elements are not yet promoted. This is still a win for large aggregates where only one element is used. This fixes rdar://8904039 and part of rdar://7339113 (poor codegen on stringswitch). llvm-svn: 124070	2011-01-23 08:27:54 +00:00
Chris Lattner	8acbb79506	have AllocaInfo store the alloca being inspected, simplifying callers. No functionality change. llvm-svn: 124067	2011-01-23 07:29:29 +00:00
Chris Lattner	3e56c29068	Rearrange some code a bit. Change MarkUnsafe to handle the "Transformation preventing inst" printing, so that -scalarrepl -debug will always print the rejected instruction. No functionality change. llvm-svn: 124066	2011-01-23 07:05:44 +00:00
Chris Lattner	a587ab7b94	remove an old hack that avoided creating MMX datatypes. The X86 backend has been fixed. llvm-svn: 124064	2011-01-23 06:40:33 +00:00
Dan Gohman	19e30d5a7d	Actually check memcpy lengths, instead of just commenting about how they should be checked. llvm-svn: 123999	2011-01-21 22:07:57 +00:00
Nick Lewycky	ae0275e018	SCCP doesn't actually preserve the CFG. It will delete and insert terminator instructions. llvm-svn: 123973	2011-01-21 08:38:09 +00:00
Chris Lattner	86d56c651d	fix rdar://8878965, a regression I introduced with the recent llvm.objectsize changes. llvm-svn: 123771	2011-01-18 20:53:04 +00:00
Cameron Zwarich	b703654edc	Remove code for updating dominance frontiers and some outdated references to dominance and post-dominance frontiers. llvm-svn: 123725	2011-01-18 04:11:31 +00:00
Cameron Zwarich	4694e69540	Remove outdated references to dominance frontiers. llvm-svn: 123724	2011-01-18 03:53:26 +00:00
Owen Anderson	459e079912	Remove dead code, that I apparently wrote a while back. We seem to be doing well enough without whatever this was trying to do. When/if someone has the time to do some empirical evaluations, it might be worth it to figure out what this code was trying to do and see if it's worth resurrecting/fixing. llvm-svn: 123684	2011-01-17 22:39:54 +00:00
Cameron Zwarich	b410858a5f	Roll r123609 back in with two changes that fix test failures with expensive checks enabled: 1) Use '<' to compare integers in a comparison function rather than '<='. 2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize the priority queue. The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at just under 16% rather than 17%. llvm-svn: 123662	2011-01-17 17:38:41 +00:00
Cameron Zwarich	67431d7943	Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot. llvm-svn: 123618	2011-01-17 07:26:51 +00:00
Cameron Zwarich	814cd9233e	Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to eliminating a potentially quadratic data structure, this also gives a 17% speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial experiment gave a greater speedup around 25%, but I moved the dominator tree level computation from dominator tree construction to PromoteMemToReg. Since this approach to computing IDFs has a much lower overhead than the old code using precomputed DFs, it is worth looking at using this new code for the second scalarrepl pass as well. llvm-svn: 123609	2011-01-17 01:08:59 +00:00
Chris Lattner	7c9f4c9c2b	tidy up a comment, as suggested by duncan llvm-svn: 123590	2011-01-16 17:46:19 +00:00
Chris Lattner	ed1fb92cfe	simplify a little llvm-svn: 123573	2011-01-16 07:11:21 +00:00
Chris Lattner	6fab2e9418	if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}. The testcase now is optimized to: define i64 @test2(i64 %X) { br label %L2 L2: ; preds = %0 ret i64 %X } before we generated: define i64 @test2(i64 %X) { %sroa.store.elt = lshr i64 %X, 56 %1 = trunc i64 %sroa.store.elt to i8 %sroa.store.elt8 = lshr i64 %X, 48 %2 = trunc i64 %sroa.store.elt8 to i8 %sroa.store.elt9 = lshr i64 %X, 40 %3 = trunc i64 %sroa.store.elt9 to i8 %sroa.store.elt10 = lshr i64 %X, 32 %4 = trunc i64 %sroa.store.elt10 to i8 %sroa.store.elt11 = lshr i64 %X, 24 %5 = trunc i64 %sroa.store.elt11 to i8 %sroa.store.elt12 = lshr i64 %X, 16 %6 = trunc i64 %sroa.store.elt12 to i8 %sroa.store.elt13 = lshr i64 %X, 8 %7 = trunc i64 %sroa.store.elt13 to i8 %8 = trunc i64 %X to i8 br label %L2 L2: ; preds = %0 %9 = zext i8 %1 to i64 %10 = shl i64 %9, 56 %11 = zext i8 %2 to i64 %12 = shl i64 %11, 48 %13 = or i64 %12, %10 %14 = zext i8 %3 to i64 %15 = shl i64 %14, 40 %16 = or i64 %15, %13 %17 = zext i8 %4 to i64 %18 = shl i64 %17, 32 %19 = or i64 %18, %16 %20 = zext i8 %5 to i64 %21 = shl i64 %20, 24 %22 = or i64 %21, %19 %23 = zext i8 %6 to i64 %24 = shl i64 %23, 16 %25 = or i64 %24, %22 %26 = zext i8 %7 to i64 %27 = shl i64 %26, 8 %28 = or i64 %27, %25 %29 = zext i8 %8 to i64 %30 = or i64 %29, %28 ret i64 %30 } In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place. llvm-svn: 123571	2011-01-16 06:18:28 +00:00
Chris Lattner	7cd8cf7d24	Use an irbuilder to get some trivial constant folding when doing a store of a constant. llvm-svn: 123570	2011-01-16 05:58:24 +00:00
Chris Lattner	d55581ded8	enhance FoldOpIntoPhi in instcombine to try harder when a phi has multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. llvm-svn: 123568	2011-01-16 05:28:59 +00:00
Chris Lattner	af26390790	temporarily revert r123526. While working on a follow-on patch I realize that ConstantFoldTerminator doesn't preserve dominfo. llvm-svn: 123527	2011-01-15 07:51:19 +00:00
Chris Lattner	8df83c4a24	fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code. The basic issue is that isel (very reasonably!) expects conditional branches to be folded, so CGP leaving around a bunch of dead computation feeding conditional branches isn't such a good idea. Just fold branches on constants into unconditional branches. llvm-svn: 123526	2011-01-15 07:36:13 +00:00
Chris Lattner	ee588defc6	simplify code, no functionality change. llvm-svn: 123525	2011-01-15 07:29:01 +00:00
Chris Lattner	1b93be501d	Now that instruction optzns can update the iterator as they go, we can have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times. llvm-svn: 123524	2011-01-15 07:25:29 +00:00
Chris Lattner	7a2771440f	make the current instruction iterator an ivar, allowing xforms that potentially invalidate it (like inline asm lowering) to be sunk into their proper place, cleaning up a ton of code. llvm-svn: 123523	2011-01-15 07:14:54 +00:00
Chris Lattner	b68ec5c339	Generalize LoadAndStorePromoter a bit and switch LICM to use it. llvm-svn: 123501	2011-01-15 00:12:35 +00:00
Chris Lattner	b498f9aff3	switch SRoA to use LoadAndStorePromoter instead of its own copy of the code. llvm-svn: 123457	2011-01-14 19:50:47 +00:00
Chris Lattner	9987a6f49b	split SROA into two passes: one that uses DomFrontiers (-scalarrepl) and one that uses SSAUpdater (-scalarrepl-ssa) llvm-svn: 123436	2011-01-14 08:13:00 +00:00
Chris Lattner	543384efb4	Implement full support for promoting allocas to registers using SSAUpdater instead of DomTree/DomFrontier. This may be interesting for reducing compile time. This is currently disabled, but seems to work just fine. When this is enabled, we eliminate two runs of dominator frontier, one in the "early per-function" optimizations and one in the "interlaced with inliner" function passes. llvm-svn: 123434	2011-01-14 07:50:47 +00:00
Bob Wilson	328e91bbe1	Fix whitespace. llvm-svn: 123396	2011-01-13 20:59:44 +00:00
Bob Wilson	c8056a952e	Check for empty structs, and for consistency, zero-element arrays. llvm-svn: 123383	2011-01-13 18:26:59 +00:00
Bob Wilson	08713d3c5f	Extend SROA to handle arrays accessed as homogeneous structs and vice versa. This is a minor extension of SROA to handle a special case that is important for some ARM NEON operations. Some of the NEON intrinsics return multiple values, which are handled as struct types containing multiple elements of the same vector type. The corresponding return types declared in the arm_neon.h header have equivalent arrays. We need SROA to recognize that it can split up those arrays and structs into separate vectors, even though they are not always accessed with the same type. SROA already handles loads and stores of an entire alloca by using insertvalue/extractvalue to access the individual pieces, and that code works the same regardless of whether the type is a struct or an array. So, all that needs to be done is to check for compatible arrays and homogeneous structs. llvm-svn: 123381	2011-01-13 17:45:11 +00:00
Bob Wilson	12eec40c83	Make SROA more aggressive with allocas containing padding. SROA only splits up structs and arrays one level at a time, so padding can only cause trouble if it is located in between the struct or array elements. llvm-svn: 123380	2011-01-13 17:45:08 +00:00
Devang Patel	30f3ebbc1f	Use SmallVector instead of SmallPtrSet and avoid non-deterministic behavior. llvm-svn: 123318	2011-01-12 19:12:45 +00:00
Chris Lattner	dd5f60b7a7	revert 123144, reenabling the rest of memset formation. llvm-svn: 123302	2011-01-12 03:25:15 +00:00
Chris Lattner	654098f411	revert r123146 which disabled code that wasn't the root cause of the bootstrap miscompare issue. llvm-svn: 123299	2011-01-12 01:52:23 +00:00
Chris Lattner	fa7c29d255	revert r123149, reenabling an improvement to memcpyopt that wasn't the source of the bootstrap problem. llvm-svn: 123298	2011-01-12 01:43:46 +00:00
Jakob Stoklund Olesen	12cc296bd4	Remove the PR8954 workaround. llvm-svn: 123288	2011-01-11 22:56:41 +00:00
Cameron Zwarich	cb9c4f85ec	Dial back the speculative fix for PR8954 a bit, so that we only recompute dominators once at the beginning of GVN instead of once per iteration. llvm-svn: 123278	2011-01-11 22:14:42 +00:00
Cameron Zwarich	51eb403907	Attempt to fix the bootstrap buildbot. Rafael says this works for him on x86-64 Linux. llvm-svn: 123270	2011-01-11 20:23:34 +00:00
Chris Lattner	193ce7c4d1	update memdep when an instruction is deleted. This code isn't actually reached in the testcase in PR8954, but it's safe and good practice. llvm-svn: 123224	2011-01-11 08:19:16 +00:00
Chris Lattner	f6ae904e34	Fix FoldSingleEntryPHINodes to update memdep and AA when it deletes phi nodes. It is called from MergeBlockIntoPredecessor which is called from GVN, which claims to preserve these. I'm skeptical that this is the actual problem behind PR8954, but this is a stab in the right direction. llvm-svn: 123222	2011-01-11 08:13:40 +00:00
Chris Lattner	dfcfcb49fa	random cleanups llvm-svn: 123221	2011-01-11 08:00:40 +00:00
Chris Lattner	63fe78de68	remove a bogus assertion: the latch block of a loop is not necessarily an uncond branch to the header. This fixes PR8955 (the assertion tripping). llvm-svn: 123219	2011-01-11 07:47:59 +00:00
Chris Lattner	88bc848ab6	another random stab in the dark trying to fix llvm-gcc-i386-linux-selfhost llvm-svn: 123149	2011-01-10 02:34:11 +00:00
Chris Lattner	4662bd4b13	another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost back to life. llvm-svn: 123146	2011-01-10 00:47:34 +00:00
Chris Lattner	1017fa6746	temporarily disable memset formation from memsets in an effort to restore buildbot stability. llvm-svn: 123144	2011-01-09 23:52:48 +00:00
Chris Lattner	caf5c0d037	fix a few old bugs (found by inspection) where we would zap instructions without informing memdep. This could cause nondeterministic weirdness based on where instructions happen to get allocated, and will hopefully breathe some life into some broken testers. llvm-svn: 123124	2011-01-09 19:26:10 +00:00
Cameron Zwarich	a42e5915bf	LoopInstSimplify preserves LoopSimplify. llvm-svn: 123117	2011-01-09 12:35:16 +00:00
Chris Lattner	a337f5ec5c	reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec's that have the bit set. llvm-svn: 123104	2011-01-09 02:16:18 +00:00
Chris Lattner	7d6433ae76	fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't updating memdep when fusing stores together. This fixes the crash optimizing the bullet benchmark. llvm-svn: 123091	2011-01-08 22:19:21 +00:00
Chris Lattner	ff6ed2ac5f	tryMergingIntoMemset can only handle constant length memsets. llvm-svn: 123090	2011-01-08 22:11:56 +00:00
Chris Lattner	9a1d63ba9f	Merge memsets followed by neighboring memsets and other stores into larger memsets. Among other things, this fixes rdar://8760394 and allows us to handle "Example 2" from http://blog.regehr.org/archives/320, compiling it into a single 4096-byte memset: _mad_synth_mute: ## @mad_synth_mute ## BB#0: ## %entry pushq %rax movl $4096, %esi ## imm = 0x1000 callq ___bzero popq %rax ret llvm-svn: 123089	2011-01-08 21:19:19 +00:00
Chris Lattner	5120ebf184	fix an issue in IsPointerOffset that prevented us from recognizing that P and P+1 are relative to the same base pointer. llvm-svn: 123087	2011-01-08 21:07:56 +00:00
Chris Lattner	4dc1fd938f	enhance memcpyopt to merge a store and a subsequent memset into a single larger memset. llvm-svn: 123086	2011-01-08 20:54:51 +00:00
Chris Lattner	c638147e9f	constify TargetData references. Split memset formation logic out into its own "tryMergingIntoMemset" helper function. llvm-svn: 123081	2011-01-08 20:24:01 +00:00
Chris Lattner	59c82f850d	When loop rotation happens, it is very common for the duplicated condbr to be foldable into an uncond branch. When this happens, we can make a much simpler CFG for the loop, which is important for nested loop cases where we want the outer loop to be aggressively optimized. Handle this case more aggressively. For example, previously on phi-duplicate.ll we would get this: define void @test(i32 %N, double* %G) nounwind ssp { entry: %cmp1 = icmp slt i64 1, 1000 br i1 %cmp1, label %bb.nph, label %for.end bb.nph: ; preds = %entry br label %for.body for.body: ; preds = %bb.nph, %for.cond %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ] %arrayidx = getelementptr inbounds double* %G, i64 %j.02 %tmp3 = load double* %arrayidx %sub = sub i64 %j.02, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.02, 1 br label %for.cond for.cond: ; preds = %for.body %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge for.cond.for.end_crit_edge: ; preds = %for.cond br label %for.end for.end: ; preds = %for.cond.for.end_crit_edge, %entry ret void } Now we get the much nicer: define void @test(i32 %N, double* %G) nounwind ssp { entry: br label %for.body for.body: ; preds = %entry, %for.body %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds double* %G, i64 %j.01 %tmp3 = load double* %arrayidx %sub = sub i64 %j.01, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.01, 1 %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body ret void } With all of these recent changes, we are now able to compile: void foo(char *X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i*100] = 0; } into a single memset of 10000 bytes. This series of changes should also be helpful for other nested loop scenarios as well. llvm-svn: 123079	2011-01-08 19:59:06 +00:00
Chris Lattner	30f318e5d1	split ssa updating code out to its own helper function. Don't bother moving the OrigHeader block anymore: we just merge it away anyway so its code layout doesn't matter. llvm-svn: 123077	2011-01-08 19:26:33 +00:00
Chris Lattner	2615130e1d	Implement a TODO: Enhance loopinfo to merge away the unconditional branch that it was leaving in loops after rotation (between the original latch block and the original header). With this change, it is possible for rotated loops to have just a single basic block, which is useful. llvm-svn: 123075	2011-01-08 19:10:28 +00:00
Chris Lattner	fee37c5fa3	inline preserveCanonicalLoopForm now that it is simple. llvm-svn: 123073	2011-01-08 18:55:50 +00:00
Chris Lattner	063dca0f6a	Three major changes: 1. Rip out LoopRotate's domfrontier updating code. It isn't needed now that LICM doesn't use DF and it is super complex and gross. 2. Make DomTree updating code a lot simpler and faster. The old loop over all the blocks was just to find a block?? 3. Change the code that inserts the new preheader to just use SplitCriticalEdge instead of doing an overcomplex reimplementation of it. No behavior change, except for the name of the inserted preheader. llvm-svn: 123072	2011-01-08 18:52:51 +00:00
Chris Lattner	7fab23bc1d	LoopRotate requires canonical loop form, so it always has preheaders and latch blocks. Reorder entry conditions to make the pass faster and more logical. llvm-svn: 123069	2011-01-08 18:06:22 +00:00
Chris Lattner	d62691f4e8	use the LI ivar. llvm-svn: 123068	2011-01-08 17:49:51 +00:00
Chris Lattner	385f2ec6d8	some cleanups: remove dead arguments and eliminate ivars that are just passed to one function. llvm-svn: 123067	2011-01-08 17:48:33 +00:00
Chris Lattner	25ba40a0cc	fix an issue duncan pointed out, which could cause loop rotate to violate LCSSA form llvm-svn: 123066	2011-01-08 17:38:45 +00:00
Cameron Zwarich	b4ab257bcc	Fix coding style issues. llvm-svn: 123065	2011-01-08 17:07:11 +00:00
Cameron Zwarich	84986b298a	Make more passes preserve dominators (or state that they preserve dominators if they already do). This removes two dominator recomputations prior to isel, which is a 1% improvement in total llc time for 403.gcc. The only potentially suspect thing is making GCStrategy recompute dominators if it used a custom lowering strategy. llvm-svn: 123064	2011-01-08 17:01:52 +00:00
Cameron Zwarich	80bd9af7c5	Contract subloop bodies. However, it is still important to visit the phis at the top of subloop headers, as the phi uses logically occur outside of the subloop. llvm-svn: 123062	2011-01-08 15:52:22 +00:00
Chris Lattner	8c5defd0b0	Have loop-rotate simplify instructions (yay instsimplify!) as it clones them into the loop preheader, eliminating silly instructions like "icmp i32 0, 100" in fixed tripcount loops. This also better exposes the bigger problem with loop rotate that I'd like to fix: once this has been folded, the duplicated conditional branch often turns into an uncond branch. Not aggressively handling this is pessimizing later loop optimizations somethin' fierce by making "dominates all exit blocks" checks fail. llvm-svn: 123060	2011-01-08 08:24:46 +00:00
Chris Lattner	43f8d16482	Revamp the ValueMapper interfaces in a couple ways: 1. Take a flags argument instead of a bool. This makes it more clear to the reader what it is used for. 2. Add a flag that says that "remapping a value not in the map is ok". 3. Reimplement MapValue to share a bunch of code and be a lot more efficient. For lookup failures, don't drop null values into the map. 4. Using the new flag a bunch of code can vaporize in LinkModules and LoopUnswitch, kill it. No functionality change. llvm-svn: 123058	2011-01-08 08:15:20 +00:00
Chris Lattner	2b3f20e6ec	two minor changes: switch to the standard ValueToValueMapTy map from ValueMapper.h (giving us access to its utilities) and add a fastpath in the loop rotation code, avoiding expensive ssa updator manipulation for values with nothing to update. llvm-svn: 123057	2011-01-08 07:21:31 +00:00
Cameron Zwarich	9ec19ea06a	Add the CallInst optimizations that don't involve expanding inline assembly to OptimizeInst() so that they can be used on a worklist instruction. llvm-svn: 122945	2011-01-06 02:56:42 +00:00
Cameron Zwarich	d28c78eb4f	Move the GEP handling in CodeGenPrepare to OptimizeInst(). llvm-svn: 122944	2011-01-06 02:44:52 +00:00
Cameron Zwarich	14ac865ca9	Split the optimizations in CodeGenPrepare that don't manipulate the iterators into a separate function, so that it can be called from a loop using a worklist rather than a loop traversing a whole basic block. llvm-svn: 122943	2011-01-06 02:37:26 +00:00
Jakob Stoklund Olesen	70be93a200	Zap the last two -Wself-assign warnings in llvm. Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there. llvm-svn: 122940	2011-01-06 01:33:22 +00:00
Cameron Zwarich	ce3b930a98	Stop reallocating SunkAddrs for each basic block. When we move to an instruction worklist, the key will need to become std::pair<BasicBlock, Value>. llvm-svn: 122932	2011-01-06 00:42:50 +00:00
Cameron Zwarich	b62ccb241b	Add some more statistics to CodeGenPrepare. llvm-svn: 122891	2011-01-05 17:47:38 +00:00
Cameron Zwarich	ced753fadf	Add some stats to CodeGenPrepare to make it easier to speed it up without regressing code quality. llvm-svn: 122887	2011-01-05 17:27:27 +00:00
Cameron Zwarich	6a78995369	Use pop_back_val instead of back followed by pop_back. llvm-svn: 122876	2011-01-05 16:08:47 +00:00
Cameron Zwarich	5a2bb998ac	Use a worklist for later iterations just like ordinary instsimplify. The next step is to only process instructions in subloops if they have been modified by an earlier simplification. llvm-svn: 122869	2011-01-05 05:47:47 +00:00
Cameron Zwarich	4c51d122d5	Change LoopInstSimplify back to a LoopPass. It revisits subloops rather than skipping them, but it should probably use a worklist and only revisit those instructions in subloops that have actually changed. It should probably also use a worklist after the first iteration like instsimplify now does. Regardless, it's only 0.3% of opt -O2 time on 403.gcc if it replaces the instcombine placed in the middle of the loop passes. llvm-svn: 122868	2011-01-05 05:15:53 +00:00
Owen Anderson	7b25ff04bd	Don't bother value numbering instructions with void types in GVN. In theory this should allow us to insert fewer things into the value numbering maps, but any speedup is beneath the noise threshold on my machine on 403.gcc. llvm-svn: 122844	2011-01-04 22:15:21 +00:00
Owen Anderson	e39cb57b09	Complete the NumberTable --> LeaderTable rename. llvm-svn: 122828	2011-01-04 19:29:46 +00:00
Owen Anderson	d7d06d3aaf	Fix typo in a comment. llvm-svn: 122827	2011-01-04 19:25:18 +00:00
Owen Anderson	51489b3b28	Prune #include's. llvm-svn: 122826	2011-01-04 19:24:57 +00:00
Owen Anderson	c7c3bc63f7	Clarify terminology, settling on referring to what was the "number table" as the "leader table", and rename methods to make it much more clear what they're doing. llvm-svn: 122823	2011-01-04 19:13:25 +00:00
Owen Anderson	83546f2fe0	When removing a value from GVN's leaders list, don't drop the Next pointer in a corner case. llvm-svn: 122822	2011-01-04 19:10:54 +00:00
Owen Anderson	41a1550ef5	Branch instructions don't produce values, so there's no need to generate a value number for them. This avoids adding them to the various value numbering tables, resulting in a minor (~3%) speedup for GVN on 403.gcc. llvm-svn: 122819	2011-01-04 18:54:18 +00:00
Owen Anderson	22c53e277a	Remove commented out code. llvm-svn: 122817	2011-01-04 18:22:08 +00:00
Cameron Zwarich	b2a41e9388	Switch to the new style of asterisk placement. llvm-svn: 122815	2011-01-04 18:19:19 +00:00
Chris Lattner	8643810ede	Teach loop-idiom to turn a loop containing a memset into a larger memset when safe. The testcase is basically this nested loop: void foo(char *X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i*100] = 0; } which gets turned into a single memset now. clang -O3 doesn't optimize this yet though due to a phase ordering issue I haven't analyzed yet. llvm-svn: 122806	2011-01-04 07:46:33 +00:00
Chris Lattner	a62b01dc37	restructure this a bit. Initialize the WeakVH with "I", the instruction after the store. The store will always be deleted if the transformation kicks in, so we'd do an N^2 scan of every loop block. Whoops. llvm-svn: 122805	2011-01-04 07:27:30 +00:00
Cameron Zwarich	f4e13699e7	Avoid finding loop back edges when we are not splitting critical edges in CodeGenPrepare (which is the default behavior). llvm-svn: 122801	2011-01-04 04:43:31 +00:00
Cameron Zwarich	e924969380	Address most of Duncan's review comments. Also, make LoopInstSimplify a simple FunctionPass. It probably doesn't have a reason to be a LoopPass, as it will probably drop the simple fixed point and either use RPO iteration or Duncan's approach in instsimplify of only revisiting instructions that have changed. The next step is to preserve LoopSimplify. This looks like it won't be too hard, although the pass manager doesn't actually seem to respect when non-loop passes claim to preserve LCSSA or LoopSimplify. This will have to be fixed. llvm-svn: 122791	2011-01-04 00:12:46 +00:00
Chris Lattner	0ba473c218	use the very-handy getTruncateOrZeroExtend helper function, and stop setting NSW: signed overflow is possible. Thanks to Dan for pointing these out. llvm-svn: 122790	2011-01-04 00:06:55 +00:00
Owen Anderson	0839d3930a	Fix comment. llvm-svn: 122788	2011-01-03 23:51:56 +00:00
Owen Anderson	d62d37225a	Use the new addEscapingValue callback to update GlobalsModRef when GVN adds PHIs of GEPs. For the moment, have GlobalsModRef handle this conservatively by simply removing the value from its maps. llvm-svn: 122787	2011-01-03 23:51:43 +00:00
Chris Lattner	bde6ec1db6	Duncan deftly points out that readnone functions aren't invalidated by stores, so they can be handled as 'simple' operations. llvm-svn: 122785	2011-01-03 23:38:13 +00:00
Owen Anderson	3a33d0cc4a	Simplify GVN's value expression structure, allowing the elimination of a lot of almost-but-not-quite-identical code. No intended functionality change. llvm-svn: 122760	2011-01-03 19:00:11 +00:00
Chris Lattner	16ca19ffc5	strength reduce my previous patch a bit. The only instructions that are allowed to have metadata operands are intrinsic calls, and the only ones that take metadata currently return void. Just reject all void instructions, which should not be value numbered anyway. To future proof things, add an assert to the getHashValue impl for calls to check that metadata operands aren't present. llvm-svn: 122759	2011-01-03 18:43:03 +00:00
Chris Lattner	142f1cd251	fix PR8895: metadata operands don't have a strong use of their nested values, so they can change and drop to null, which can change the hash and cause havoc. It turns out that it isn't a good idea to value number stuff with metadata operands anyway, so... don't. llvm-svn: 122758	2011-01-03 18:28:15 +00:00
Cameron Zwarich	43cecb1200	Switch a worklist in CodeGenPrepare to SmallVector and increase the inline capacity on the Visited SmallPtrSet. On 403.gcc, this is about a 4.5% speedup of CodeGenPrepare time (which itself is 10% of time spent in the backend). This is progress towards PR8889. llvm-svn: 122741	2011-01-03 06:33:01 +00:00
Chris Lattner	9e5e9ed79a	earlycse can do trivial within-a-block dead store elimination as well. This deletes 60 stores in 176.gcc that largely come from bitfield code. llvm-svn: 122736	2011-01-03 04:17:24 +00:00
Chris Lattner	4b9a525742	switch the load table to use a recycling bump pointer allocator, speeding earlycse up by 6%. llvm-svn: 122733	2011-01-03 03:53:50 +00:00
Chris Lattner	e0e32a9ef0	now that loads are in their own table, we can implement store->load forwarding. This allows EarlyCSE to zap 600 more loads from 176.gcc. llvm-svn: 122732	2011-01-03 03:46:34 +00:00
Chris Lattner	92bb0f9f9d	split loads and calls into separate tables. Loads are now just indexed by their pointer instead of using MemoryValue to wrap it. llvm-svn: 122731	2011-01-03 03:41:27 +00:00
Chris Lattner	4cb365414f	various cleanups, no functionality change. llvm-svn: 122729	2011-01-03 03:28:23 +00:00
Chris Lattner	b9a8efc960	Teach EarlyCSE to do trivial CSE of loads and read-only calls. On 176.gcc, this catches 13090 loads and calls, and increases the number of simple instructions CSE'd from 29658 to 36208. llvm-svn: 122727	2011-01-03 03:18:43 +00:00
Chris Lattner	79d83067ee	rename InstValue to SimpleValue, add some comments. llvm-svn: 122725	2011-01-03 02:20:48 +00:00
Michael J. Spencer	edb5bcdde5	CMake: Add missing source file. llvm-svn: 122724	2011-01-03 02:13:05 +00:00
Chris Lattner	d815f69b30	Allocate nodes for the scoped hash table from a recycling bump pointer allocator. This speeds up early cse by about 20%. llvm-svn: 122723	2011-01-03 01:42:46 +00:00
Chris Lattner	02a9776b64	reduce redundancy in the hashing code and other misc cleanups. llvm-svn: 122720	2011-01-03 01:10:08 +00:00
Cameron Zwarich	cab9a0abab	Add a new loop-instsimplify pass, with the intention of replacing the instance of instcombine that is currently in the middle of the loop pass pipeline. This commit only checks in the pass; it will hopefully be enabled by default later. llvm-svn: 122719	2011-01-03 00:25:16 +00:00
Chris Lattner	0844c76f9a	fix some pastos llvm-svn: 122718	2011-01-02 23:29:58 +00:00
Chris Lattner	8fac5db251	add DEBUG and -stats output to earlycse. Teach it to CSE the rest of the non-side-effecting instructions. llvm-svn: 122716	2011-01-02 23:19:45 +00:00
Chris Lattner	18ae5436b1	Enhance earlycse to do CSE of casts, instsimplify and die. Add a testcase. llvm-svn: 122715	2011-01-02 23:04:14 +00:00
Chris Lattner	bf0aa927cc	split dom frontier handling stuff out to its own DominanceFrontier header, so that Dominators.h is just domtree. Also prune #includes a bit. llvm-svn: 122714	2011-01-02 22:09:33 +00:00
Chris Lattner	704541bb23	sketch out a new early cse pass. No functionality yet. llvm-svn: 122713	2011-01-02 21:47:05 +00:00
Chris Lattner	9c69406f2b	fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to make sure that the loop we're promoting into a memcpy doesn't mutate the input of the memcpy. Before we were just checking that the dest of the memcpy wasn't mod/ref'd by the loop. llvm-svn: 122712	2011-01-02 21:14:18 +00:00
Chris Lattner	5702a43c09	If a loop iterates exactly once (has backedge count = 0) then don't mess with it. We'd rather peel/unroll it than convert all of its stores into memsets. llvm-svn: 122711	2011-01-02 20:24:21 +00:00
Chris Lattner	8455b6e45e	enhance loop idiom recognition to scan all unconditionally executed blocks in a loop, instead of just the header block. This makes it more aggressive, able to handle Duncan's Ada examples. llvm-svn: 122704	2011-01-02 19:01:03 +00:00
Chris Lattner	0cdc6f62a5	make inSubLoop much more efficient. llvm-svn: 122703	2011-01-02 18:53:08 +00:00
Chris Lattner	27497ece96	rip out isExitBlockDominatedByBlockInLoop, calling DomTree::dominates instead. isExitBlockDominatedByBlockInLoop is a relic of the days when domtree was just a tree and didn't have DFS numbers. Checking DFS numbers is faster and easier than "limiting the search of the tree". llvm-svn: 122702	2011-01-02 18:45:39 +00:00
Chris Lattner	0469e01c02	add a list of opportunities for future improvement. llvm-svn: 122701	2011-01-02 18:32:09 +00:00
Chris Lattner	ddf58010bd	Allow loop-idiom to run on multiple BB loops, but still only scan the loop header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but DOESN'T MERGE THE BLOCKS, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many many more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this: for (j=0; j<MAX_history; ++j) { history_new[i][j+1] = history[2*i][j]; } Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. llvm-svn: 122685	2011-01-02 07:58:36 +00:00
Chris Lattner	5b5a043d82	remove debugging code. llvm-svn: 122683	2011-01-02 07:37:13 +00:00
Chris Lattner	12f91befce	add some -stats output. llvm-svn: 122682	2011-01-02 07:36:44 +00:00
Chris Lattner	679572e584	improve loop rotation to use CodeMetrics to analyze the size of a loop header instead of its own code size estimator. This allows it to handle bitcasts etc more precisely. llvm-svn: 122681	2011-01-02 07:35:53 +00:00
Chris Lattner	85b6d81d41	teach loop idiom recognition to form memcpy's from simple loops. llvm-svn: 122678	2011-01-02 03:37:56 +00:00
Chris Lattner	a3514441e0	add a validity check that was missed, fixing a crash on the new testcase. llvm-svn: 122662	2011-01-01 20:12:04 +00:00
Chris Lattner	91a4435875	improve validity check to handle constant-trip-count loops more aggressively. In practice, this doesn't help anything though, see the todo. llvm-svn: 122660	2011-01-01 19:54:22 +00:00
Chris Lattner	8b3baf6d75	implement the "no aliasing accesses in loop" safety check. This pass should be correct now. llvm-svn: 122659	2011-01-01 19:39:01 +00:00
Chris Lattner	65a699d4d0	simplify this, isBytewiseValue handles the extra check. We still check for "multiple of a byte" in size to make it clear that the >> 3 below is safe. llvm-svn: 122604	2010-12-28 18:53:48 +00:00
Duncan Sands	5cf10e691b	Silence gcc warning about an unused variable when doing a release build. llvm-svn: 122593	2010-12-28 09:41:15 +00:00
Chris Lattner	cb18bfa3d2	fix some issues Frits noticed, add AliasAnalysis as a dependency llvm-svn: 122585	2010-12-27 18:39:08 +00:00
Benjamin Kramer	7cba269dfb	SimplifyLibCalls: Use IRBuilder to simplify code. llvm-svn: 122575	2010-12-27 00:16:46 +00:00
Chris Lattner	b9fe685b9a	have loop-idiom nuke instructions that feed stores that get removed. llvm-svn: 122574	2010-12-27 00:03:23 +00:00
Chris Lattner	29e14edc8d	implement enough of the memset inference algorithm to recognize and insert memsets. This is still missing one important validity check, but this is enough to compile stuff like this: void test0(std::vector<char> &X) { for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I) *I = 0; } void test1(std::vector<int> &X) { for (long i = 0, e = X.size(); i != e; ++i) X[i] = 0x01010101; } With: $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc to: __Z5test0RSt6vectorIcSaIcEE: ## @_Z5test0RSt6vectorIcSaIcEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rsi cmpq %rsi, %rax je LBB0_2 ## BB#1: ## %bb.nph subq %rax, %rsi movq %rax, %rdi callq ___bzero LBB0_2: ## %for.end addq $8, %rsp ret ... __Z5test1RSt6vectorIiSaIiEE: ## @_Z5test1RSt6vectorIiSaIiEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rdx subq %rax, %rdx cmpq $4, %rdx jb LBB1_2 ## BB#1: ## %for.body.preheader andq $-4, %rdx movl $1, %esi movq %rax, %rdi callq _memset LBB1_2: ## %for.end addq $8, %rsp ret llvm-svn: 122573	2010-12-26 23:42:51 +00:00
Chris Lattner	6cf8d6cc6e	start using irbuilder to make mem intrinsics in a few passes. llvm-svn: 122572	2010-12-26 22:57:41 +00:00
Chris Lattner	7c5f9c35d1	sketch more of this out. llvm-svn: 122567	2010-12-26 20:45:45 +00:00
Chris Lattner	9cb1035f94	move isBytewiseValue out to ValueTracking.h/cpp llvm-svn: 122565	2010-12-26 20:15:01 +00:00
Chris Lattner	81ae3f299a	actually add the file... llvm-svn: 122563	2010-12-26 19:39:38 +00:00
Chris Lattner	2ef535a4e4	Start of a pass for recognizing memset and memcpy idioms. No functionality yet. llvm-svn: 122562	2010-12-26 19:32:44 +00:00
Benjamin Kramer	30342fb1fd	Simplify code. llvm-svn: 122561	2010-12-26 15:23:45 +00:00
Benjamin Kramer	b90b2f0635	Fix a thinko pointed out by Frits van Bommel: looking through global variables in isBytewiseValue is not safe. llvm-svn: 122550	2010-12-24 22:23:59 +00:00
Benjamin Kramer	ea9152e551	MemCpyOpt: Turn memcpys from a constant into a memset if possible. This allows us to compile "int cst[] = {-1, -1, -1};" into movl $-1, 16(%rsp) movq $-1, 8(%rsp) instead of movl _cst+8(%rip), %eax movl %eax, 16(%rsp) movq _cst(%rip), %rax movq %rax, 8(%rsp) llvm-svn: 122548	2010-12-24 21:17:12 +00:00
Owen Anderson	5d690d4168	It is possible for SimplifyCFG to cause PHI nodes to become redundant too late in the optimization pipeline to be caught by instcombine, and it's not feasible to catch them in SimplifyCFG because the use-lists are in an inconsistent state at the point where it could know that it needs to simplify them. Instead, have CodeGenPrepare look for trivially redundant PHIs as part of its general cleanup effort. llvm-svn: 122516	2010-12-23 20:57:35 +00:00
Mon P Wang	18b762a946	Preserve the address space when generating bitcasts for MemTransferInst in ConvertToScalarInfo llvm-svn: 122462	2010-12-23 01:41:32 +00:00
Jeffrey Yasskin	9b43f33620	Change all self assignments X=X to (void)X, so that we can turn on a new gcc warning that complains on self-assignments and self-initializations. llvm-svn: 122458	2010-12-23 00:58:24 +00:00
Owen Anderson	5ab8d4b5e5	Give GVN back the ability to perform simple conditional propagation on conditional branch values. I still think that LVI should be handling this, but that capability is some ways off in the future, and this matters for some significant benchmarks. llvm-svn: 122378	2010-12-21 23:54:34 +00:00
Owen Anderson	12470778d7	Remove dead code. llvm-svn: 122371	2010-12-21 22:31:24 +00:00
Benjamin Kramer	43493c089f	GVN's Expression is not POD-like (it contains a SmallVector). Simplify code while at it. llvm-svn: 122362	2010-12-21 21:30:19 +00:00
Chris Lattner	b6252a376a	tidy up llvm-svn: 122190	2010-12-19 20:24:28 +00:00
Chris Lattner	408a684d29	Enhance LICM to promote alias sets whose pointers themselves are stored, which doesn't affect the memory address being promoted. llvm-svn: 122172	2010-12-19 05:57:25 +00:00
Chris Lattner	3337a81450	fix PR8602, a bug in an assertion: a volatile store of a pointer does not make the alias set for that pointer volatile, just stores to the pointer. llvm-svn: 122171	2010-12-19 05:51:54 +00:00
Chris Lattner	fb888622c3	revert r122164, I'm going to go with a different approach. llvm-svn: 122168	2010-12-19 04:23:03 +00:00
Chris Lattner	583ec6fa44	first step to fixing PR8642: don't fold away empty basic blocks which have trapping constant exprs in them due to PHI nodes. Eliminating them can cause the constant expr to be evaluated on new paths if the input edges are critical. llvm-svn: 122164	2010-12-19 03:02:34 +00:00
Dan Gohman	93dc2b808f	Revert r64460. strtol and friends cannot be marked readonly, even with a null endptr argument, because they may write to errno. This fixes a selfhost miscompile observed on Linux targets when TBAA was enabled. llvm-svn: 122014	2010-12-17 01:09:43 +00:00
Frits van Bommel	9bbe849fc3	Fix a bug in the loop in JumpThreading::ProcessThreadableEdges() where it could falsely produce a MultipleDestSentinel value if the first predecessor ended with an 'indirectbr'. If that happened, it caused an unnecessary FindMostPopularDest() call. This wasn't a correctness problem, but it broke the fast path for single-predecessor blocks. llvm-svn: 121966	2010-12-16 12:16:00 +00:00
Dan Gohman	e1a17a3473	Make memcpyopt TBAA-aware. llvm-svn: 121944	2010-12-16 02:51:19 +00:00
Dan Gohman	4467aa5294	Preserve TBAA tags when doing load PRE. llvm-svn: 121921	2010-12-15 23:53:55 +00:00
Dan Gohman	a4fcd2418d	Move Value::getUnderlyingObject to be a standalone function so that it can live in Analysis instead of VMCore. llvm-svn: 121885	2010-12-15 20:02:24 +00:00
Frits van Bommel	3d1803495e	Teach jump threading to "look through" a select when the branch direction of a terminator depends on it. When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately. llvm-svn: 121859	2010-12-15 09:51:20 +00:00
Owen Anderson	35609d97ae	Fix PR8790, another instance where unreachable code can cause instruction simplification to fail; this case involves a select that simplifies to itself. llvm-svn: 121817	2010-12-15 00:55:35 +00:00
Owen Anderson	15c85c916f	Cleanup trailing whitespace. llvm-svn: 121816	2010-12-15 00:52:44 +00:00
Chris Lattner	73a58627c3	simplify code and reduce indentation llvm-svn: 121670	2010-12-13 02:38:13 +00:00
Chris Lattner	bc4457e317	enhance memcpyopt to zap memcpy's that have the same src/dst. llvm-svn: 121362	2010-12-09 07:45:45 +00:00
Chris Lattner	fd51c52ef6	fix PR8753, eliminating a case where we'd infinitely make a substitution because it doesn't actually change the IR. Patch by Jakub Staszak! llvm-svn: 121361	2010-12-09 07:39:50 +00:00
Frits van Bommel	d2f4b09e10	Remove some dead code from the jump threading pass. The last uses of these functions were removed in r113852 when LazyValueInfo was permanently enabled and removed the need for them. llvm-svn: 121133	2010-12-07 13:08:07 +00:00
Jay Foad	583abbc4df	PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method trunc(), to be const and to return a new value instead of modifying the object in place. llvm-svn: 121120	2010-12-07 08:25:19 +00:00
Frits van Bommel	d9df6eaa9c	Implement jump threading of 'indirectbr' by keeping track of whether we're looking for ConstantInts or BlockAddresss. llvm-svn: 121066	2010-12-06 23:36:56 +00:00
Chris Lattner	4dc53e37d9	Use a stronger predicate here, pointed out by Duncan llvm-svn: 121040	2010-12-06 21:48:10 +00:00
Chris Lattner	ca335e38cf	add some DEBUG statements. llvm-svn: 121038	2010-12-06 21:13:51 +00:00
Chris Lattner	94fbdf3814	Fix PR8728, a miscompilation I recently introduced. When optimizing memcpy's like: memcpy(A, B) memcpy(A, C) we cannot delete the first memcpy as dead if A and C might be aliases. If so, we actually get: memcpy(A, B) memcpy(A, A) which is not correct to transform into: memcpy(A, A) This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks Jakub! llvm-svn: 120974	2010-12-06 01:48:06 +00:00
Frits van Bommel	76244867cf	Refactor jump threading. Should have no functional change other than the order of two transformations that are mutually-exclusive and the exact formatting of debug output. Internally, it now stores the ConstantInts as Constants, and actual undef values instead of nulls. llvm-svn: 120946	2010-12-05 19:06:41 +00:00
Frits van Bommel	5e75ef4a8e	Remove trailing whitespace. llvm-svn: 120945	2010-12-05 19:02:47 +00:00
Chris Lattner	1c577b54b0	fix a bozo bug I introduced in r119930, causing a miscompile of 20040709-1.c from the gcc testsuite. I was using the size of a pointer instead of the pointee. This fixes rdar://8713376 llvm-svn: 120519	2010-12-01 01:24:55 +00:00
Chris Lattner	903add84d9	Enhance DSE to handle the variable index case in PR8657. llvm-svn: 120498	2010-11-30 23:43:23 +00:00
Chris Lattner	c0f3379ae0	teach DSE to use GetPointerBaseWithConstantOffset to analyze may-aliasing stores that partially overlap with different base pointers. This implements PR6043 and the non-variable part of PR8657 llvm-svn: 120485	2010-11-30 23:05:20 +00:00
Chris Lattner	e28618de59	move GetPointerBaseWithConstantOffset out of GVN into ValueTracking.h llvm-svn: 120476	2010-11-30 22:25:26 +00:00
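
Several of the loop-idiom entries above (r122565, r122573, r122604) depend on knowing whether a stored value is the same byte repeated, since only then can a store loop become a memset. In the r122573 example, the loop storing 0x01010101 ends up as a memset call with the byte value 1. The sketch below illustrates that idea for 32-bit integers only. `splatByte` is a hypothetical helper written for this note; it is not LLVM's `isBytewiseValue`, whose actual signature and coverage are not shown in this log.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not LLVM's isBytewiseValue): returns the repeated byte
// if all four bytes of `value` are identical, otherwise std::nullopt.
constexpr std::optional<std::uint8_t> splatByte(std::uint32_t value) {
  const std::uint8_t low = static_cast<std::uint8_t>(value & 0xFFu);
  // Replicate the low byte into every byte position and compare.
  const std::uint32_t pattern = 0x01010101u * low;
  if (pattern == value)
    return low;
  return std::nullopt;
}
```

Under this sketch, 0x01010101 maps to byte 1 and 0 maps to byte 0, so both stores in the r122573 example qualify, while a value such as 0x01020304 does not.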


5368 Commits