llvm-project

Commit Graph

Author	SHA1	Message	Date
Micah Villmow	f07b962801	Fix a compiler warning with an unused variable. llvm-svn: 166634	2012-10-24 22:32:26 +00:00
Hal Finkel	69b07a2c3a	Update GVN to support vectors of pointers. GVN will now generate ptrtoint instructions for vectors of pointers. Fixes PR14166. llvm-svn: 166624	2012-10-24 21:22:30 +00:00
Nadav Rotem	e4f491e7ee	whitespace llvm-svn: 166622	2012-10-24 20:58:40 +00:00
Nadav Rotem	a721b21c64	LoopVectorizer: Add a basic cost model which uses the VTTI interface. llvm-svn: 166620	2012-10-24 20:36:32 +00:00
Micah Villmow	bf3eeb2dfc	Add some cleanup to the DataLayout changes requested by Chandler. llvm-svn: 166607	2012-10-24 18:36:13 +00:00
Micah Villmow	51e7246cb4	Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this! llvm-svn: 166596	2012-10-24 17:25:11 +00:00
Micah Villmow	6a8f3f9e20	Delete a directory that wasn't supposed to be checked in yet. llvm-svn: 166591	2012-10-24 17:20:04 +00:00
Micah Villmow	12d9127833	Add in support for getIntPtrType to get the pointer type based on the address space. This checkin also adds in some tests that utilize these paths and updates some of the clients. llvm-svn: 166578	2012-10-24 15:52:52 +00:00
Nadav Rotem	5bed7b4fad	Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news. PR14158. llvm-svn: 166491	2012-10-23 18:44:18 +00:00
Duncan Sands	5ed3900d77	Fix typo that somehow escaped both testing and code inspection. llvm-svn: 166475	2012-10-23 09:07:02 +00:00
Duncan Sands	533c8ae79f	Transform code like this %V = mul i64 %N, 4 %t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V into %t1 = getelementptr i32* %arr, i32 %N %t = bitcast i32* %t1 to i8* incorporating the multiplication into the getelementptr. This happens all the time in dragonegg, for example for int foo(int *A, int N) { return A[N]; } because gcc turns this into byte pointer arithmetic before it hits the plugin: D.1590_2 = (long unsigned int) N_1(D); D.1591_3 = D.1590_2 * 4; D.1592_5 = A_4(D) + D.1591_3; D.1589_6 = *D.1592_5; return D.1589_6; The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the transform fires on. An analogous transform (with no testcases!) already existed for bitcasts of arrays, so I rewrote it to share code with this one. llvm-svn: 166474	2012-10-23 08:28:26 +00:00
Richard Smith	6289a4e85e	Per the C++ standard, we need to include the definition of llvm::Calculate in every TU where it's implicitly instantiated, even if there's an implicit instantiation for the same types available in another TU. llvm-svn: 166470	2012-10-23 06:19:46 +00:00
Julien Lerouge	a302b6d95e	Fix typo. llvm-svn: 166456	2012-10-23 00:38:15 +00:00
Julien Lerouge	d7fa5e420d	Explain why DenseMap is still used here instead of MapVector. llvm-svn: 166454	2012-10-23 00:23:46 +00:00
Julien Lerouge	8cf84fa4e2	Iterating over a DenseMap<std::pair<BasicBlock*, unsigned>, PHINode*> is not deterministic, replace it with a DenseMap<std::pair<unsigned, unsigned>, PHINode*> (we already have a map from BasicBlock to unsigned). <rdar://problem/12541389> llvm-svn: 166435	2012-10-22 19:43:56 +00:00
Nadav Rotem	1c7fc71e69	Don't crash if the load/store pointer is not a GEP. Fix by Shivarama Rao <Shivarama.Rao@amd.com> llvm-svn: 166427	2012-10-22 18:27:56 +00:00
Argyrios Kyrtzidis	54ff5e81a1	Revert r166407 because it caused analyzer tests to crash and broke self-host bots. llvm-svn: 166424	2012-10-22 18:16:14 +00:00
Hal Finkel	931c52b84c	BBVectorize should ignore unreachable blocks. Unreachable blocks can have invalid instructions. For example, jump threading can produce self-referential instructions in unreachable blocks. Also, we should not be spending time optimizing unreachable code. Fixes PR14133. llvm-svn: 166423	2012-10-22 18:00:55 +00:00
Nadav Rotem	f17cd27362	Rename a variable. llvm-svn: 166410	2012-10-22 04:53:05 +00:00
Nadav Rotem	03011f1393	Vectorizer: optimize the generation of selects. If the condition is uniform, generate a scalar-cond select (i1 as selector). llvm-svn: 166409	2012-10-22 04:38:00 +00:00
Nadav Rotem	c9741887c3	Update the loop vectorizer docs. llvm-svn: 166408	2012-10-22 03:52:53 +00:00
Nick Lewycky	8b67e1e0b9	Reapply r166405, teaching tailcallelim to be smarter about nocapture, with a very small but very important bugfix: bool shouldExplore(Use *U) { Value *V = U->get(); if (isa<CallInst>(V) || isa<InvokeInst>(V)) [...] should have read: bool shouldExplore(Use *U) { Value *V = U->getUser(); if (isa<CallInst>(V) || isa<InvokeInst>(V)) Fixes PR14143! llvm-svn: 166407	2012-10-22 03:03:52 +00:00
NAKAMURA Takumi	60d56d2eea	Revert r166405, "Teach TailRecursionElimination to consider 'nocapture' when deciding whether" It broke selfhosting stage2 in several builders. llvm-svn: 166406	2012-10-22 00:48:51 +00:00
Nick Lewycky	2d28f2bf83	Teach TailRecursionElimination to consider 'nocapture' when deciding whether calls can be marked tail. llvm-svn: 166405	2012-10-21 23:51:22 +00:00
Benjamin Kramer	f77f224df9	Revert r166390 "LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis." It passes all tests, produces better results than the old code but uses the wrong pass, LoopDependenceAnalysis, which is old and unmaintained. "Why is it still in tree?", you might ask. The answer is obviously: "To confuse developers." Just swapping in the new dependency pass sends the pass manager into an infinite loop, I'll try to figure out why tomorrow. llvm-svn: 166399	2012-10-21 19:31:16 +00:00
Anders Carlsson	7d8991c778	Avoid an extra hash lookup when inserting a value into the widen map. llvm-svn: 166395	2012-10-21 16:26:35 +00:00
Jakub Staszak	baa063bd03	Simplify code. No functionality change. llvm-svn: 166393	2012-10-21 15:36:03 +00:00
Jakub Staszak	9694ab8ffa	Simplify code. No functionality change. llvm-svn: 166392	2012-10-21 15:29:19 +00:00
Benjamin Kramer	3ae8bc68af	LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis. Requires a lot less code and complexity on loop-idiom's side and the more precise analysis can catch more cases, like the one I included as a test case. This also fixes the edge-case miscompilation from PR9481. I'm not entirely sure that all cases are handled that the old checks handled but LDA will certainly become smarter in the future. llvm-svn: 166390	2012-10-21 15:03:07 +00:00
Nadav Rotem	fe88c67161	Fix a bug in the vectorization of wide load/store operations. We used a SCEV to detect that A[X] is consecutive. We assumed that X was the induction variable. But X can be any expression that uses the induction for example: X = i + 2; llvm-svn: 166388	2012-10-21 06:49:10 +00:00
Nadav Rotem	c1679a95b6	Add support for reduction variables that do not start at zero. This is important for nested-loop reductions such as : In the innermost loop, the induction variable does not start with zero: for (i = 0 .. n) for (j = 0 .. m) sum += ... llvm-svn: 166387	2012-10-21 05:52:51 +00:00
Nadav Rotem	364bd30641	Document change. Describe the pass and some papers that inspired the design of the pass. llvm-svn: 166386	2012-10-21 04:04:25 +00:00
Nadav Rotem	7e1084d36c	Vectorizer: fix a bug in the classification of induction/reduction phis. llvm-svn: 166384	2012-10-21 02:38:01 +00:00
Nadav Rotem	e5dc57d4fb	Fix an infinite loop in the loop-vectorizer. PR14134. llvm-svn: 166379	2012-10-20 20:45:01 +00:00
Benjamin Kramer	7ddd70527c	SROA: Simplify code. No functionality change. llvm-svn: 166375	2012-10-20 12:04:57 +00:00
Benjamin Kramer	f55b592cc8	InstCombine: Fix an edge case where constant icmps could sneak into ConstantFoldInstOperands and crash. Have to refactor the ConstantFolder interface one day to define bugs like this away. Fixes PR14131. llvm-svn: 166374	2012-10-20 08:43:52 +00:00
Nadav Rotem	d189b82a9b	Vectorize: teach canVectorizeMemory to distinguish between A[i]+=x and A[B[i]]+=x. If the pointer is consecutive then it is safe to read and write. If the pointer is non-loop-consecutive then it is unsafe to vectorize it because we may hit an ordering issue. llvm-svn: 166371	2012-10-20 08:26:33 +00:00
Nadav Rotem	3940bafb54	Fix a typo llvm-svn: 166367	2012-10-20 05:03:27 +00:00
Nadav Rotem	f70ca3ceed	Vectorizer: refactor the memory checks to a new function. No functionality change. llvm-svn: 166366	2012-10-20 04:59:06 +00:00
Nadav Rotem	550f7f7e19	LoopVectorize: Keep the IRBuilder on the stack. llvm-svn: 166354	2012-10-19 23:27:19 +00:00
Nadav Rotem	4f7f72702b	Vectorizer: Add support for loop reductions. For example: for (i=0; i<n; i++) sum += A[i] + B[i] + i; llvm-svn: 166351	2012-10-19 23:05:40 +00:00
Nadav Rotem	4dc976fbcb	revert r166264 because the LTO build is still failing llvm-svn: 166340	2012-10-19 21:28:43 +00:00
Benjamin Kramer	317d6c621d	SimplifyLibcalls: The return value of ffsll is always i32, even when the input is zero. Fixes PR13028. llvm-svn: 166313	2012-10-19 20:43:44 +00:00
Benjamin Kramer	f1088a37cb	Indvars: Don't recursively delete instruction during BB iteration. This can invalidate the iterators leading to use after frees and crashes. Fixes PR12536. llvm-svn: 166291	2012-10-19 17:53:54 +00:00
Alexey Samsonov	8418442ff1	[ASan] Support comments in ASan/TSan blacklist file as lines starting with # llvm-svn: 166283	2012-10-19 15:24:46 +00:00
Evgeniy Stepanov	8eb77d847e	Move SplitBlockAndInsertIfThen to BasicBlockUtils. llvm-svn: 166278	2012-10-19 10:48:31 +00:00
Benjamin Kramer	319cb771b2	LoopVectorize: Keep the IRBuilder on the stack. No functionality change. llvm-svn: 166274	2012-10-19 08:42:02 +00:00
Kostya Serebryany	0995994989	[asan] make sure asan erases old unused allocas after it created a new one. This became important after the recent move from ModulePass to FunctionPass because no cleanup is happening after asan pass any more. llvm-svn: 166267	2012-10-19 06:20:53 +00:00
Nadav Rotem	4985ddc5e0	recommit the patch that makes LSR and LowerInvoke use the TargetTransform interface. llvm-svn: 166264	2012-10-19 04:27:49 +00:00
Nadav Rotem	ced93f3a05	vectorizer: Add support for reading and writing from the same memory location. llvm-svn: 166255	2012-10-19 01:24:18 +00:00
Nadav Rotem	1667324f22	cleanup the comment. llvm-svn: 166247	2012-10-18 23:21:01 +00:00
Nadav Rotem	d45a6b93df	fix a naming typo llvm-svn: 166232	2012-10-18 21:45:31 +00:00
Nadav Rotem	f8a1396882	Avoid reconstructing the pointer set when searching for duplicated read/write pointers. llvm-svn: 166205	2012-10-18 18:34:50 +00:00
Meador Inge	2332615f53	Cosmetic change -- move two simplifiers to the right commented statement group. llvm-svn: 166199	2012-10-18 18:12:43 +00:00
Meador Inge	000dbccfc6	instcombine: Migrate strcpy optimizations This patch migrates the strcpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. Note also that StrCpyChkOpt has been updated with a few simplifications that were being done in the simplify-libcalls version of StrCpyOpt, but not in the migrated implementation of StrCpyOpt. There is no reason to overload StrCpyOpt with fortified and regular simplifications in the new model since there is already a dedicated simplifier for __strcpy_chk. llvm-svn: 166198	2012-10-18 18:12:40 +00:00
Nadav Rotem	a031c57417	When looking for a vector representation of a scalar, do a single lookup. Also, cache the result of the broadcast instruction. No functionality change. llvm-svn: 166191	2012-10-18 17:31:49 +00:00
Chandler Carruth	59ff93afe6	Refactor insert and extract of sub-integers into static helpers that operate purely on values. Sink the alloca loading and storing logic into the rewrite routines that are specific to alloca-integer-rewrite driving. This is just a refactoring here, but the subsequent step will be to reuse the insertion and extraction logic when rewriting integer loads and stores that have been split and decomposed into narrower loads and stores. No functionality changed other than different names for instructions. llvm-svn: 166176	2012-10-18 09:56:08 +00:00
Chandler Carruth	e793a50f45	This FIXME was fixed some time ago. =] llvm-svn: 166175	2012-10-18 09:56:06 +00:00
Chandler Carruth	e8479e15f5	Introduce a BarrierNoop pass, a hack designed to allow some control over the implicitly-formed-and-nesting CGSCC pass manager and function pass managers, especially when using them on the opt commandline or using extension points in the module builder. The '-barrier' opt flag (or the pass itself) will create a no-op module pass in the pipeline, resetting the pass manager stack, and allowing the creation of a new pipeline of function passes or CGSCC passes to be created that is independent from any previous pipelines. For example, this can be used to test running two CGSCC passes in independent CGSCC pass managers as opposed to in the same CGSCC pass manager. It also allows us to introduce a further hack into the PassManagerBuilder to separate the O0 pipeline extension passes from the always-inliner's CGSCC pass manager, which they likely do not want to participate in... At the very least none of the Sanitizer passes want this behavior. This fixes a bug with ASan at O0 currently, and I'll commit the ASan test which covers this pass. I'm happy to add a test case that this pass exists and works, but not sure how much time folks would like me to spend adding test cases for the details of its behavior of partition pass managers.... The whole thing is just vile, and mostly intended to unblock ASan, so I'm hoping to rip this all out in a brave new pass manager world. llvm-svn: 166172	2012-10-18 08:05:46 +00:00
Nadav Rotem	7a1728094c	remove unused variable to fix a warning. llvm-svn: 166170	2012-10-18 06:09:21 +00:00
Bob Wilson	d6d9ccca38	Temporarily revert the TargetTransform changes. The TargetTransform changes are breaking LTO bootstraps of clang. I am working with Nadav to figure out the problem, but I am reverting it for now to get our buildbots working. This reverts svn commits: 165665 165669 165670 165786 165787 165997 and I have also reverted clang svn 165741 llvm-svn: 166168	2012-10-18 05:43:52 +00:00
Nadav Rotem	642efbcdd8	Remove the use of dominators and AA. llvm-svn: 166167	2012-10-18 05:33:02 +00:00
Nadav Rotem	b52f717411	Vectorizer: Add support for loops with an unknown count. For example: for (i=0; i<n; i++){ a[i] = b[i+1] + c[i+3]; } llvm-svn: 166165	2012-10-18 05:29:12 +00:00
NAKAMURA Takumi	7857415785	LoopVectorize.cpp: Fix a warning. [-Wunused-variable] llvm-svn: 166153	2012-10-17 23:40:15 +00:00
Jakub Staszak	68e5dfddcb	Remove redundant SetInsertPoint call. llvm-svn: 166138	2012-10-17 23:06:37 +00:00
Roman Divacky	4955ec317c	Fix some typos and wrong indenting. llvm-svn: 166128	2012-10-17 21:07:35 +00:00
Nadav Rotem	6b94c2a09b	Add a loop vectorizer. llvm-svn: 166112	2012-10-17 18:25:06 +00:00
Kostya Serebryany	20343351be	[asan] better debug diagnostics in asan compiler module llvm-svn: 166102	2012-10-17 13:40:06 +00:00
Chandler Carruth	6fab42aa39	This just in, it is a bad idea to use 'udiv' on an offset of a pointer. A very bad idea. Let's not do that. Fixes PR14105. Note that this wasn't that glaring of an oversight. Originally, these routines were only called on offsets within an alloca, which are intrinsically positive. But over the evolution of the pass, they ended up being called for arbitrary offsets, and things went downhill... llvm-svn: 166095	2012-10-17 09:23:48 +00:00
Chandler Carruth	40617f593e	Fix a really annoying "bug" introduced in r165941. The change from that revision makes no sense. We cannot use the address space of the post indexed type to conclude anything about a pre indexed pointer type's size. More importantly, this index can never be over a pointer. We are indexing over arrays and vectors here. Of course, I have no test case here. Neither did the original patch. =/ llvm-svn: 166091	2012-10-17 07:22:16 +00:00
Michael Gottesman	02a1141e5a	[InstCombine] Teach InstCombine how to handle an obfuscated splat. An obfuscated splat is where the frontend poorly generates code for a splat using several different shuffles to create the splat, i.e., %A = load <4 x float>* %in_ptr, align 16 %B = shufflevector <4 x float> %A, <4 x float> undef, <4 x i32> <i32 0, i32 0, i32 undef, i32 undef> %C = shufflevector <4 x float> %B, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 4, i32 undef> %D = shufflevector <4 x float> %C, <4 x float> %A, <4 x i32> <i32 0, i32 1, i32 2, i32 4> llvm-svn: 166061	2012-10-16 21:29:38 +00:00
Jakub Staszak	8f46e914fb	Simplify code. No functionality change. llvm-svn: 166053	2012-10-16 19:52:32 +00:00
Jakub Staszak	25dcab1eaa	80-col fixup. llvm-svn: 166050	2012-10-16 19:39:40 +00:00
Jakub Staszak	ba34fdb0e4	Simplify potentially quadratic behavior while erasing elements from std::vector. llvm-svn: 166045	2012-10-16 19:32:31 +00:00
Bill Wendling	c6a15cf519	Use the Attributes::get method which takes an AttrVal value directly to simplify the code a bit. No functionality change. llvm-svn: 166009	2012-10-16 05:23:31 +00:00
Craig Topper	c74b600afb	Fix filename in file header. llvm-svn: 166004	2012-10-16 02:21:30 +00:00
Bill Wendling	50d27849f6	Move the Attributes::Builder outside of the Attributes class and into its own class named AttrBuilder. No functionality change. llvm-svn: 165960	2012-10-15 20:35:56 +00:00
Micah Villmow	4bb926d91d	Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis. llvm-svn: 165941	2012-10-15 16:24:29 +00:00
Kostya Serebryany	b0e2506d97	[asan] make AddressSanitizer to be a FunctionPass instead of ModulePass. This will simplify chaining other FunctionPasses with asan. Also some minor cleanup llvm-svn: 165936	2012-10-15 14:20:06 +00:00
Chandler Carruth	49c8eea3c0	Update the memcpy rewriting to fully support widened int rewriting. This includes extracting ints for copying elsewhere and inserting ints when copying into the alloca. This should fix the CanSROA assertion coming out of Clang's regression test suite. llvm-svn: 165931	2012-10-15 10:24:43 +00:00
Chandler Carruth	9d966a2002	Follow-up fix to r165928: handle memset rewriting for widened integers, and generally clean up the memset handling. It had rotted a bit as the other rewriting logic got polished more. llvm-svn: 165930	2012-10-15 10:24:40 +00:00
Chandler Carruth	435c4e0792	First major step toward addressing PR14059. This teaches SROA to handle cases where we have partial integer loads and stores to an otherwise promotable alloca to widen[1] those loads and stores to cover the entire alloca and bitcast them into the appropriate type such that promotion can proceed. These partial loads and stores stem from an annoying confluence of ARM's calling convention and ABI lowering and the FCA pre-splitting which takes place in SROA. Clang lowers a { double, double } in-register function argument as a [4 x i32] function argument to ensure it is placed into integer 32-bit registers (a really unnerving implicit contract between Clang and the ARM backend I would add). This results in a FCA load of [4 x i32]* from the { double, double } alloca, and SROA decomposes this into a sequence of i32 loads and stores. Inlining proceeds, code gets folded, but at the end of the day, we still have i32 stores to the low and high halves of a double alloca. Widening these to be i64 operations, and bitcasting them to double prior to loading or storing allows promotion to proceed for these allocas. I looked quite a bit changing the IR which Clang produces for this case to be more friendly, but small changes seem unlikely to help. I think the best representation we could use currently would be to pass 4 i32 arguments thereby avoiding any FCAs, but that would still require this fix. It seems like it might eventually be nice to somehow encode the ABI register selection choices outside of the parameter type system so that the parameter can be a { double, double }, but the CC register annotations indicate that this should be passed via 4 integer registers. This patch does not address the second problem in PR14059, which is the reverse: when a struct alloca is loaded as a larger single integer. This patch also does not address some of the code quality issues with the FCA-splitting. Those don't actually impede any optimizations really, but they're on my list to clean up. [1]: Pedantic footnote: for those concerned about memory model issues here, this is safe. For the alloca to be promotable, it cannot escape or have any use of its address that could allow these loads or stores to be racing. Thus, widening is always safe. llvm-svn: 165928	2012-10-15 08:40:30 +00:00
Chandler Carruth	aa6afbb831	Hoist the canConvertValue predicate and the convertValue transform out into static helper functions. They're really quite generic and are going to be needed elsewhere shortly. llvm-svn: 165927	2012-10-15 08:40:22 +00:00
Bill Wendling	fbd38fe2e3	Add an enum for the return and function indexes into the AttrListPtr object. This gets rid of some magic numbers. llvm-svn: 165924	2012-10-15 07:29:08 +00:00
Bill Wendling	d079a446d7	Attributes Rewrite Convert the internal representation of the Attributes class into a pointer to an opaque object that's uniqued by and stored in the LLVMContext object. The Attributes class then becomes a thin wrapper around this opaque object. Eventually, the internal representation will be expanded to include attributes that represent code generation options, etc. llvm-svn: 165917	2012-10-15 04:46:55 +00:00
Meador Inge	40b6fac36c	instcombine: Migrate strcmp and strncmp optimizations This patch migrates the strcmp and strncmp optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165915	2012-10-15 03:47:37 +00:00
Benjamin Kramer	c5b0678cf8	Simplify code. No functionality change. llvm-svn: 165904	2012-10-14 11:15:42 +00:00
Benjamin Kramer	650b1dbd56	Unquadratize SetVector removal loops in DSE. Erasing from the beginning or middle of the vector is expensive, remove_if can do it in linear time even though it's a bit ugly without lambdas. No functionality change. llvm-svn: 165903	2012-10-14 10:21:31 +00:00
Bill Wendling	76d2cd2f60	Remove operator cast method in favor of querying with the correct method. llvm-svn: 165899	2012-10-14 08:54:26 +00:00
Bill Wendling	2a3c1cca7d	Remove the bitwise AND operators from the Attributes class. Replace it with the equivalent from the builder class. llvm-svn: 165896	2012-10-14 07:52:48 +00:00
Bill Wendling	722b26c0f2	Remove the bitwise assignment OR operator from the Attributes class. Replace it with the equivalent from the builder class. llvm-svn: 165895	2012-10-14 07:35:59 +00:00
Bill Wendling	a05b043c4a	Remove the bitwise XOR operator from the Attributes class. Replace it with the equivalent from the builder class. llvm-svn: 165893	2012-10-14 06:56:13 +00:00
Bill Wendling	85a64c217f	Remove the bitwise NOT operator from the Attributes class. Replace it with the equivalent from the builder class. llvm-svn: 165892	2012-10-14 06:39:53 +00:00
Benjamin Kramer	44e58f9eb1	Remove unused private field. llvm-svn: 165881	2012-10-13 18:03:34 +00:00
Meador Inge	174185084c	instcombine: Migrate strchr and strrchr optimizations This patch migrates the strchr and strrchr optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165875	2012-10-13 16:45:37 +00:00
Meador Inge	7fb2f7378b	instcombine: Migrate strcat and strncat optimizations This patch migrates the strcat and strncat optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 165874	2012-10-13 16:45:32 +00:00
Meador Inge	df796f893f	Implement new LibCallSimplifier class This patch implements the new LibCallSimplifier class as outlined in [1]. In addition to providing the new base library simplification infrastructure, all the fortified library call simplifications were moved over to the new infrastructure. The rest of the library simplification optimizations will be moved over with follow up patches. NOTE: The original fortified library call simplifier located in the SimplifyFortifiedLibCalls class was not removed because it is still used by CodeGenPrepare. This class will eventually go away too. [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-August/052283.html llvm-svn: 165873	2012-10-13 16:45:24 +00:00
Chandler Carruth	ba9319925e	Teach SROA to cope with wrapper aggregates. These show up a lot in ABI type coercion code, especially when targeting ARM. Things like [1 x i32] instead of i32 are very common there. The goal of this logic is to ensure that when we are picking an alloca type, we look through such wrapper aggregates and across any zero-length aggregate elements to find the simplest type possible to form a type partition. This logic should (generally speaking) rarely fire. It only ends up kicking in when an alloca is accessed using two different types (for instance, i32 and float), and the underlying alloca type has wrapper aggregates around it. I noticed a significant amount of this occurring looking at stepanov_abstraction generated code for arm, and suspect it happens elsewhere as well. Note that this doesn't yet address truly heinous IR productions such as PR14059 is concerning. Those result in mismatched sizes of types in addition to mismatched access and alloca types. llvm-svn: 165870	2012-10-13 10:49:33 +00:00
Chandler Carruth	482c61787c	Speculatively harden the conversion logic. I have no idea if this will help the dragonegg builders, and no test case at this point, but this was one dimly plausible case I spotted by inspection. Hopefully will get a testcase from those bots soon-ish, and will tidy this up with proper testing. llvm-svn: 165869	2012-10-13 10:49:30 +00:00
Chandler Carruth	0fb8a7787e	Silence a warning in -assert builds. llvm-svn: 165867	2012-10-13 05:09:27 +00:00
Chandler Carruth	891fec0b56	Clean up how we rewrite loads and stores to the whole alloca. When these are single value types, the load and store should be directly based upon the alloca and then bitcasting can fix the type as needed afterward. This might in theory improve some of the IR coming out of SROA, but I don't expect big changes yet and don't have any test cases on hand. This is really just a cleanup/refactoring patch. The next patch will cause this code path to be hit a lot more, actually get SROA to promote more allocas and include several more test cases. llvm-svn: 165864	2012-10-13 02:41:05 +00:00
Manman Ren	97c1876256	PGO: create metadata for switch only if it has more than one target. When all cases of a switch statement are dead, the weights vector only has one element, and we will get an assertion failure when calling createBranchWeights. llvm-svn: 165759	2012-10-11 22:28:34 +00:00
Micah Villmow	0c61134d8d	Revert 165732 for further review. llvm-svn: 165747	2012-10-11 21:27:41 +00:00
Micah Villmow	083189730e	Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly. llvm-svn: 165726	2012-10-11 17:21:41 +00:00
Nick Lewycky	49ac81ac68	Don't crash when !tbaa.struct contents is invalid. llvm-svn: 165693	2012-10-11 02:05:23 +00:00
Nadav Rotem	e10328737d	Add a new interface to allow IR-level passes to access codegen-specific information. llvm-svn: 165665	2012-10-10 22:04:55 +00:00
Bill Wendling	bbcdf4e2a5	Remove the final bits of Attributes being declared in the Attribute namespace. Use the attribute's enum value instead. No functionality change intended. llvm-svn: 165610	2012-10-10 07:36:45 +00:00
Bill Wendling	ed42e799dc	Pass into the AttributeWithIndex::get method an ArrayRef of attribute enums. These are then created via the correct Attributes creation method. llvm-svn: 165607	2012-10-10 06:13:42 +00:00
Bill Wendling	f319e9905f	Have 'addFnAttr' take the attribute enum value. Then have it build the attribute object and add it appropriately. No functionality change. llvm-svn: 165595	2012-10-10 03:12:49 +00:00
Bill Wendling	8ccd6ca199	Use the attribute enums to query if a parameter has an attribute. llvm-svn: 165550	2012-10-09 21:38:14 +00:00
Michael Ilseman	336cb79fdf	Update EarlyCSE's SimpleValues to use Hashing.h for their hashes. Expanded the hashing and equality to allow for equality modulo commutativity for binary ops, and comparisons with swapping of predicates. llvm-svn: 165509	2012-10-09 16:57:38 +00:00
Alexey Samsonov	2747e22051	Fixup for r165490: Use DenseMap instead of std::map. Simplify the loop in CollectFunctionDIs. llvm-svn: 165498	2012-10-09 10:34:52 +00:00
Bill Wendling	93f70b78fd	Use the enum value of the attributes when adding them to the attributes builder. llvm-svn: 165494	2012-10-09 09:11:20 +00:00
Alexey Samsonov	3b861ec989	Fix PR14016. DeadArgumentElimination pass can replace one LLVM function with another, invalidating a pointer stored in debug info metadata entry for this function. To fix this, we collect debug info descriptors for functions before running a DeadArgumentElimination pass and "patch" pointers in metadata nodes if we replace a function. llvm-svn: 165490	2012-10-09 08:13:15 +00:00
Bill Wendling	c9b22d735a	Create enums for the different attributes. We use the enums to query whether an Attributes object has that attribute. The opaque layer is responsible for knowing where that specific attribute is stored. llvm-svn: 165488	2012-10-09 07:45:08 +00:00
Chandler Carruth	503eb2bb49	Fix PR14034, an infloop / heap corruption / crash bug in the new SROA. Thanks to Benjamin for the raw test case. This one took about 50 times longer to reduce than to fix. =/ llvm-svn: 165476	2012-10-09 01:58:35 +00:00
Bill Wendling	f1c60d6d04	Fix. Apply the no capture attribute to the correct parameter. llvm-svn: 165469	2012-10-09 00:51:40 +00:00
Bill Wendling	c1e8e74cbd	Convert to using the Attributes::Builder class to create attributes. llvm-svn: 165468	2012-10-09 00:47:36 +00:00
Bill Wendling	70f3917b0e	Convert to using the Attributes::Builder interface. llvm-svn: 165465	2012-10-09 00:01:21 +00:00
Nadav Rotem	35315fea70	Refactor the AddrMode class out of TLI to its own header file. This class is used by LSR and a number of places in the codegen. This is the first step in de-coupling LSR from TLI, and creating a new interface in between them. llvm-svn: 165455	2012-10-08 23:06:34 +00:00
Nick Lewycky	7c3b5d9444	Give CaptureTracker::shouldExplore a base implementation. Most users want to do the same thing. No functionality change. llvm-svn: 165435	2012-10-08 22:12:48 +00:00
Micah Villmow	cdfe20b97f	Move TargetData to DataLayout. llvm-svn: 165402	2012-10-08 16:38:25 +00:00
NAKAMURA Takumi	605fe78aca	SROA.cpp: Fix a warning, [-Wunused-variable] llvm-svn: 165309	2012-10-05 13:56:23 +00:00
Duncan Sands	933db779a2	Move this test a bit later, after the point at which we know that we either have an alloca or a parameter, since then the alloca test should make sense to readers, while before it probably appears too specific. No functionality change. llvm-svn: 165306	2012-10-05 07:29:46 +00:00
Chandler Carruth	e5b7a2ccd2	Teach the new SROA a new trick. Now we zap any memcpy or memmoves which are in fact identity operations. We detect these and kill their partitions so that even splitting is unaffected by them. This is particularly important because Clang relies on emitting identity memcpy operations for struct copies, and these fold away to constants very often after inlining. Fixes the last big performance FIXME I have on my plate. llvm-svn: 165285	2012-10-05 01:29:09 +00:00
Chandler Carruth	90c4a3ae20	Lift the speculation visitor above all the helpers that are targeted at the rewrite visitor to make the fact that the speculation is completely independent a bit more clear. I promise that this is just a cut/paste of the one visitor and adding the anonymous namespace wrappings. The diff may look completely preposterous, it does in git for some reason. llvm-svn: 165284	2012-10-05 01:29:06 +00:00
Preston Gurd	0d67f5106c	This patch corrects commit 165126 by using an integer bit width instead of a pointer to a type, in order to remove the uses of getGlobalContext(). Patch by Tyler Nowicki. llvm-svn: 165255	2012-10-04 21:33:40 +00:00
Jakub Staszak	e076cac097	Add a comment to the commit r165187. llvm-svn: 165238	2012-10-04 19:08:30 +00:00
Benjamin Kramer	d12e82e523	SimplifyCFG: Enhance the "remove CFG edge that leads to null pointer dereference" optimization to also handle instructions with multiple uses. We conservatively only check the first use to avoid walking long use chains. This catches the common case of having both a load and a store to a pointer supplied by a PHI node. llvm-svn: 165232	2012-10-04 16:11:49 +00:00
Duncan Sands	a6d20010fe	In my recent change to avoid use of underaligned memory I didn't notice that cpyDest can be mutated in some cases, which would then cause a crash later if indeed the memory was underaligned. This brought down several buildbots, so I guess the underaligned case is much more common than I thought! llvm-svn: 165228	2012-10-04 13:53:21 +00:00
Chandler Carruth	ac8317fd36	Fix PR13969, a mini-phase-ordering issue with the new SROA pass. Currently, we re-visit allocas when something changes about the way they might be split to allow better scalarization to take place. However, we weren't handling the case when the promotion is what would change the behavior of SROA. When an address derived from an alloca is stored into another alloca, we consider the first to have escaped. If the second is ever promoted to an SSA value, we will suddenly be able to run the SROA pass on the first alloca. This patch adds explicit support for this form of iteration. When we detect a store of a pointer derived from an alloca, we flag the underlying alloca for reprocessing after promotion. The logic works hard to only do this when there is definitely going to be promotion and it might remove impediments to the analysis of the alloca. Thanks to Nick for the great test case and Benjamin for some sanity check review. llvm-svn: 165223	2012-10-04 12:33:50 +00:00
Duncan Sands	c6ada69a14	The memcpy optimizer was happily doing call slot forwarding when the new memory was less aligned than the old. In the testcase this results in an overaligned memset: the memset alignment was correct for the original memory but is too much for the new memory. Fix this by either increasing the alignment of the new memory or bailing out if that isn't possible. Should fix the gcc-4.7 self-host buildbot failure. llvm-svn: 165220	2012-10-04 10:54:40 +00:00
Chandler Carruth	43c8b46deb	Teach the integer-promotion rewrite strategy to be endianness aware. Sorry for this being broken so long. =/ As part of this, switch all of the existing tests to be Little Endian, which is the behavior I was asserting in them anyways! Add in a new big-endian test that checks the interesting behavior there. Another part of this is to tighten the rules about when we perform the full-integer promotion. This logic now rejects cases where the fully promoted integer is a non-multiple-of-8 bitwidth or cases where the loads or stores touch bits which are in the allocated space of the alloca but are not loaded or stored when accessing the integer. Sadly, these aren't really observable today as the rest of the pass will already ensure the invariants hold. However, the latter situation is likely to become a potential concern in the future. Thanks to Benjamin and Duncan for early review of this patch. I'm still looking into whether there are further endianness issues, please let me know if anyone sees BE failures persisting past this. llvm-svn: 165219	2012-10-04 10:39:28 +00:00
Bill Wendling	e8619aa1c1	Use method to query for attributes. llvm-svn: 165209	2012-10-04 06:58:52 +00:00
Bill Wendling	d777398ee4	Add method to query for 'NoAlias' attribute on call/invoke instructions. llvm-svn: 165208	2012-10-04 06:52:09 +00:00
Bill Wendling	d0935f7069	Use method to query for attributes. llvm-svn: 165207	2012-10-04 06:49:41 +00:00
Bill Wendling	9cae65918c	Query for attributes via the correct method call. llvm-svn: 165206	2012-10-04 06:48:57 +00:00
Kostya Serebryany	d23b18fe7f	[tsan] add 3 internal flags for fine-grain control of what is instrumented and what is not. llvm-svn: 165204	2012-10-04 05:28:50 +00:00
Jakub Staszak	f8a8129513	Fix PR13967. llvm-svn: 165187	2012-10-03 23:59:47 +00:00
Preston Gurd	5509e3d727	This Patch corrects a problem whereby the optimization to use a faster divide instruction (for Intel Atom) was not being done by Clang, because the type context used by Clang is not the default context. It fixes the problem by getting the global context types for each div/rem instruction in order to compare them against the types in the BypassTypeMap. Tests for this will be done as a separate patch to Clang. Patch by Tyler Nowicki. llvm-svn: 165126	2012-10-03 16:11:44 +00:00
Dmitry Vyukov	f4cb22121a	tsan: prepare for migration to new memory_order enum values (ABI compatible) llvm-svn: 165107	2012-10-03 13:00:57 +00:00
Chandler Carruth	08e5f49f90	Fix an issue where we failed to adjust the alignment constraint on a memcpy to reflect that '0' has a different meaning when applied to a load or store. Now we correctly use underaligned loads and stores for the test case added. llvm-svn: 165101	2012-10-03 08:26:28 +00:00
Chandler Carruth	4b2b38d398	Try to use a better set of abstractions for computing the alignment necessary during rewriting. As part of this, fix a real think-o here where we might have left off an alignment specification when the address is in fact underaligned. I haven't come up with any way to trigger this, as there is always some other factor that reduces the alignment, but it certainly might have been an observable bug in some way I can't think of. This also slightly changes the strategy for placing explicit alignments on loads and stores to only do so when the alignment does not match that required by the ABI. This causes a few redundant alignments to go away from test cases. I've also added a couple of tests that really push on the alignment that we end up with on loads and stores. More to come here as I try to fix an underlying bug I have conjectured and produced test cases for, although it's not clear if this bug is the one currently hitting dragonegg's gcc47 bootstrap. llvm-svn: 165100	2012-10-03 08:14:02 +00:00
Chandler Carruth	3f57b82979	Switch the SetVector::remove_if implementation to use partition which preserves the values of the relocated entries, unlike remove_if. This allows walking them and erasing them. Also flesh out the predicate we are using for this to support the various constraints actually imposed on a UnaryPredicate -- without this we can't compose it with std::not1. Thanks to Sean Silva for the review here and noticing the issue with std::remove_if. llvm-svn: 165073	2012-10-03 00:03:00 +00:00
Chandler Carruth	b09f0a3c75	Teach the new SROA to handle cases where an alloca that has already been scheduled for processing on the worklist eventually gets deleted while we are processing another alloca, fixing the original test case in PR13990. To facilitate this, add a remove_if helper to the SetVector abstraction. It's not easy to use the standard abstractions for this because of the specifics of SetVectors types and implementation. Finally, a nice small test case is included. Thanks to Benjamin for the fantastic reduced test case here! All I had to do was delete some empty basic blocks! llvm-svn: 165065	2012-10-02 22:46:45 +00:00
Chandler Carruth	6c3890b680	Fix another crasher in SROA, reported by Joel. We require that the indices into the use lists are stable in order to build fast lookup tables to locate a particular partition use from an operand of a PHI or select. This is (obviously in hindsight) incompatible with erasing elements from the array. Really, we don't want to erase anyways. It is expensive, and a rare operation. Instead, simply weaken the contract of the PartitionUse structure to allow null Use pointers to represent dead uses. Now we can clear out the pointer to mark things as dead, and all it requires is adding some 'continue' checks to the various loops. I'm still reducing a test case for this, as the test case I have is huge. I think this one I can get a nice test case for though, as it was much more deterministic. llvm-svn: 165032	2012-10-02 18:57:13 +00:00
Chandler Carruth	3903e05244	Fix a silly coding error on my part. The whole point of the speculator being separate was that it can grow the use list. As a consequence, we can't use the iterator-pair interface, we need an index based interface. Expose such an interface from the AllocaPartitioning, and use it in the speculator. This should at least fix a use-after-free bug found by Duncan, and may fix some of the other crashers. I don't have a nice deterministic test case yet, but if I get a good one, I'll add it. llvm-svn: 165027	2012-10-02 17:49:47 +00:00
Chandler Carruth	4e4359935b	Turn the new SROA pass back on. Let's see if it sticks this time. =] Again, let me know if anything breaks due to this! llvm-svn: 164986	2012-10-02 04:24:01 +00:00
Chandler Carruth	d71ef3a02a	Make this plural. Spotted by Duncan in review (and a very old typo, this is the second time I've moved this comment around...) llvm-svn: 164939	2012-10-01 12:24:42 +00:00
Chandler Carruth	d325f8021b	Prune some unnecessary includes. llvm-svn: 164938	2012-10-01 12:21:54 +00:00
Chandler Carruth	176ca71a82	Fix several issues with alignment. We weren't always accounting for type alignment requirements of the new alloca. As one consequence which was reported as a bug by Duncan, we overaligned memcpy calls to ranges of allocas after they were rewritten to types with lower alignment requirements. Other consequences are possible, but I don't have any test cases for them. llvm-svn: 164937	2012-10-01 12:16:54 +00:00
Benjamin Kramer	9fc3dc7781	SimplifyCFG: Don't crash when forming a switch bitmap with an undef default value. Fixes PR13985. llvm-svn: 164934	2012-10-01 11:31:48 +00:00
Chandler Carruth	82a57543d6	Factor the PHI and select speculation into a separate rewriter. This could probably be factored still further to hoist this logic into a generic helper, but currently I don't have particularly clean ideas about how to handle that. This at least allows us to drop custom load rewriting from the speculation logic, which in turn allows the existing load rewriting logic to fire. In theory, this could enable vector promotion or other tricks after speculation occurs, but I've not dug into such issues. This is primarily just cleaning up the factoring of the code and the resulting logic. llvm-svn: 164933	2012-10-01 10:54:05 +00:00
Chandler Carruth	54e8f0b4cf	Refactor the PartitionUse structure to actually use the Use* instead of a pair of instructions, one for the used pointer and the second for the user. This simplifies the representation and also makes it more dense. This was noticed because of the miscompile in PR13926. In that case, we were running up against a fundamental "bad idea" in the speculation of PHI and select instructions: the speculation and rewriting are interleaved, which requires phi speculation to also perform load rewriting! This is bad, and causes us to miss opportunities to do (for example) vector rewriting only exposed after PHI speculation, etc etc. It also, in the old system, required us to insert new load uses into the current partition's use list, which would then be ignored during rewriting because we had already extracted an end iterator for the use list. The appending behavior (and much of the other oddities) stem from the strange de-duplication strategy in the PartitionUse builder. Amusingly, all this went without notice for so long because it could only be triggered by having different GEPs into the same partition of the same alloca, where both different GEPs were operands of a single PHI, and where the GEP which was not encountered first also had multiple uses within that same PHI node... Hence the insane steps required to reproduce. So, step one in fixing this fundamental bad idea is to make the PartitionUse actually contain a Use*, and to make the builder do proper deduplication instead of funky de-duplication. This is enough to remove the appending behavior, and fix the miscompile in PR13926, but there is more work to be done here. Subsequent commits will lift the speculation into its own visitor. It'll be a useful step toward potentially extracting all of the speculation logic into a generic utility transform. The existing PHI test case for repeated operands has been made more extreme to catch even these issues. This test case, run through the old pass, will exactly reproduce the miscompile from PR13926. ;] We were so close here! llvm-svn: 164925	2012-10-01 01:49:22 +00:00
Benjamin Kramer	f064b65a94	SimplifyCFG: Enumerating all predecessors of a BB can be expensive (switches), avoid it if possible. No functionality change. llvm-svn: 164923	2012-09-30 21:03:56 +00:00
Benjamin Kramer	8403625123	ArgumentPromotion: Remove ancient workaround for a bug in the C backend. Fun fact: The CBE learned how to deal with this situation before it was removed. llvm-svn: 164918	2012-09-30 17:31:56 +00:00
Chandler Carruth	903790eff5	Fix a somewhat surprising miscompile where code relying on an ABI alignment could lose it due to the alloca type moving down to a much smaller alignment guarantee. Now SROA will actively compute a proper alignment, factoring the target data, any explicit alignment, and the offset within the struct. This will in some cases lower the alignment requirements, but when we lower them below those of the type, we drop the alignment entirely to give freedom to the code generator to align it however is convenient. Thanks to Duncan for the lovely test case that pinned this down. =] llvm-svn: 164891	2012-09-29 10:41:21 +00:00
Evan Cheng	64a223aed8	Do not delete BBs if their addresses are taken. rdar://12396696 llvm-svn: 164866	2012-09-28 23:58:57 +00:00
Evan Cheng	8c6b06d4a0	GlobalDCE should be run at -O2 / -Os to eliminate unused dtor, etc. rdar://9142819 llvm-svn: 164850	2012-09-28 21:23:26 +00:00
Benjamin Kramer	255dea4b90	CorrelatedPropagation: BasicBlock::removePredecessor can simplify PHI nodes. If it's the condition of a SwitchInst, reload it. Fixes PR13972. llvm-svn: 164818	2012-09-28 10:42:50 +00:00
Benjamin Kramer	ed84360a45	GlobalOpt: non-constexpr bitcasts or GEPs can occur even if the global value is only stored once. Fixes PR13968. llvm-svn: 164815	2012-09-28 10:01:27 +00:00
Nick Lewycky	156999f8b9	Surprisingly, we missed a trivial case here. Fix that! llvm-svn: 164814	2012-09-28 09:33:53 +00:00
Benjamin Kramer	c2081d1c19	Fix an integer overflow in SimplifyCFG's look up table formation logic. If the width is very large it gets truncated from uint64_t to uint32_t when passed to TD->fitsInLegalInteger. The truncated value can fit in a register. This manifested in massive memory usage or crashes (PR13946). llvm-svn: 164784	2012-09-27 18:29:58 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abbreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Nick Lewycky	7b4cd228aa	Prefer shuffles to selects. Backends love shuffles! llvm-svn: 164763	2012-09-27 08:33:56 +00:00
Nick Lewycky	2e646236fb	Disable the new SROA pass to get the tree back in working order. We don't yet have testcases for the current problems. llvm-svn: 164731	2012-09-26 22:43:04 +00:00
Bill Wendling	863bab689a	Remove the `hasFnAttr' method from Function. The hasFnAttr method has been replaced by querying the Attributes explicitly. No intended functionality change. llvm-svn: 164725	2012-09-26 21:48:26 +00:00
Hans Wennborg	cd3a11f725	Address Duncan's comments on r164684: - Put statistics in alphabetical order - Don't use getZextValue when building TableInt, just use APInts - Introduce Create{Z,S}ExtOrTrunc in IRBuilder. llvm-svn: 164696	2012-09-26 14:01:53 +00:00
Hans Wennborg	f2e2c108dd	Address Duncan's comments on r164682: - Finish assert messages with exclamation mark - Move overflow checking into ShouldBuildLookupTable. llvm-svn: 164692	2012-09-26 11:07:37 +00:00
Chandler Carruth	208124f5a2	Analogous fix to memset and memcpy rewriting. Don't have a test case contrived for these yet, as I spotted them by inspection and the test cases are a bit more tricky to phrase. llvm-svn: 164691	2012-09-26 10:59:22 +00:00
Chandler Carruth	3e4273dd0c	When rewriting the pointer operand to a load or store which has alignment guarantees attached, re-compute the alignment so that we consider offsets which impact alignment. llvm-svn: 164690	2012-09-26 10:45:28 +00:00
Chandler Carruth	871ba7249c	Teach all of the loads, stores, memsets and memcpys created by the rewriter in SROA to carry a proper alignment. This involves interrogating various sources of alignment, etc. This is a more complete and principled fix to PR13920 as well as related bugs pointed out by Eli in review and by inspection in the area. Also by inspection fix the integer and vector promotion paths to create aligned loads and stores. I still need to work up test cases for these... Sorry for the delay, they were found purely by inspection. llvm-svn: 164689	2012-09-26 10:27:46 +00:00
Hans Wennborg	39583b88a0	SimplifyCFG: Make the switch-to-lookup table transformation store the tables in bitmaps when they fit in a target-legal register. This saves some space, and it also allows for building tables that would otherwise be deemed too sparse. One interesting case that this hits is example 7 from http://blog.regehr.org/archives/320. We currently generate good code for this when lowering the switch to the selection DAG: we build a bitmask to decide whether to jump to one block or the other. My patch will result in the same bitmask, but it removes the need for the jump, as the return value can just be retrieved from the mask. llvm-svn: 164684	2012-09-26 09:44:49 +00:00
Hans Wennborg	776d7126b7	SimplifyCFG: Refactor the switch-to-lookup table transformation by breaking out the building of lookup tables into a separate class. llvm-svn: 164682	2012-09-26 09:34:53 +00:00
Chandler Carruth	4bd8f66ed9	Revert the business end of r164636 and try again. I'll come in again. ;] This should really, really fix PR13916. For real this time. The underlying bug is... a bit more subtle than I had imagined. The setup is a code pattern that leads to an @llvm.memcpy call with two equal pointers to an alloca in the source and dest. Now, not any pattern will do. The alloca needs to be formed just so, and both pointers should be wrapped in different bitcasts etc. When this precise pattern hits, a funny sequence of events transpires. First, we correctly detect the potential for overlap, and correctly optimize the memcpy. The first time. However, we do simplify the set of users of the alloca, and that causes us to run the alloca back through the SROA pass in case there are knock-on simplifications. At this point, a curious thing has happened. If we happen to have an i8 alloca, we have direct i8 pointer values. So we don't bother creating a cast, we rewrite the arguments to the memcpy to directly refer to the alloca. Now, in an unrelated area of the pass, we have clever logic which ensures that when visiting each User of a particular pointer derived from an alloca, we only visit that User once, and directly inspect all of its operands which refer to that particular pointer value. However, the mechanism used to detect memcpy's with the potential to overlap relied upon getting visited once per Use, not once per User. This is always true unless the same exact value is both source and dest. It turns out that almost nothing actually produces that pattern though. We can hand craft test cases that more directly test this behavior of course, and those are included. Also, note that there is a significant missed optimization here -- we prove in many cases that there is a non-volatile memcpy call with identical source and dest addresses. We shouldn't prevent splitting the alloca in that case, and in fact we should just remove such memcpy calls eagerly. I'll address that in a subsequent commit. llvm-svn: 164669	2012-09-26 07:41:40 +00:00
Craig Topper	2a6a08b1cd	Rename virtual table anchors from Anchor() to anchor() for consistency with the rest of the tree. llvm-svn: 164666	2012-09-26 06:36:36 +00:00
Michael Ilseman	a398d4cfaf	Expansions for u/srem, using the udiv expansion. More unit tests for udiv and u/srem. Fixed issue with Release build. llvm-svn: 164654	2012-09-26 01:55:01 +00:00
Nick Lewycky	d9f7910671	Don't drop the alignment on a memcpy intrinsic when producing a store. This is only a missed optimization opportunity if the store is over-aligned, but a miscompile if the store's new type has a higher natural alignment than the memcpy did. Fixes PR13920! llvm-svn: 164641	2012-09-25 22:46:21 +00:00
Nick Lewycky	a0c16aee0a	Revert the business end of r164634, and replace it with a different fix. The reason we were getting two of the same alloca is because of a memmove/memcpy which had the same alloca in both the src and dest. Now we detect that case directly. This has the same testcase as before, but fixes a clang test CodeGenObjC/exceptions.m which runs clang -O2. llvm-svn: 164636	2012-09-25 21:50:37 +00:00
Nick Lewycky	9f19349846	Don't try to promote the same alloca twice. Fixes PR13916! Chandler, it's not obvious that it's okay that this alloca gets into the list twice to begin with. Please review and see whether this is the fix you really want, but I wanted to get a fix checked in quickly. llvm-svn: 164634	2012-09-25 21:15:50 +00:00
Bill Wendling	eb33723ace	Move Attribute::typeIncompatible inside of the Attributes class. llvm-svn: 164629	2012-09-25 20:38:59 +00:00
Chad Rosier	88387f5332	Revert r164614 to appease the buildbots. llvm-svn: 164627	2012-09-25 19:57:20 +00:00
Michael Ilseman	506150a071	Expansions for u/srem, using the udiv expansion. More unit tests for udiv and u/srem. llvm-svn: 164614	2012-09-25 17:56:47 +00:00
Chandler Carruth	8b907e8acb	Fix a case where SROA did not correctly detect dead PHI or selects due to chains or cycles between PHIs and/or selects. Also add a couple of really nice test cases reduced from Kostya's reports in PR13905 and PR13906. Both are fixed by this patch. llvm-svn: 164596	2012-09-25 10:03:40 +00:00
Chandler Carruth	2603a18769	Fix a crash in SROA. This was reported independently by Takumi and David (I think), but I would appreciate folks verifying that this fixes the big crasher. I'm still working on a reduced test case, but because this was causing problems I wanted to get the fix checked in quickly. llvm-svn: 164585	2012-09-25 02:42:03 +00:00
Nick Lewycky	42bca056e0	Don't forget that strcpy and friends return a pointer to the destination, so it's not a dead store if that pointer is used. Whoops! llvm-svn: 164583	2012-09-25 01:55:59 +00:00
Nick Lewycky	627d217727	Remove unused name of variable to quiet a warning. Also canonicalize a declaration to use the same form as in the rest of the file. No functionality change. llvm-svn: 164576	2012-09-24 23:47:23 +00:00
Nick Lewycky	9f4729d331	Teach DSE that strcpy, strncpy, strcat and strncat are all stores which may be dead. llvm-svn: 164561	2012-09-24 22:09:10 +00:00
Nick Lewycky	135ac9ac89	Move all the calls to AA.getTargetLibraryInfo() to using a TLI member variable. No functionality change. llvm-svn: 164560	2012-09-24 22:07:09 +00:00
Richard Osborne	2fd29bfb90	Add missing check for presence of target data. This avoids a crash in visitAllocaInst when target data isn't available. llvm-svn: 164539	2012-09-24 17:10:03 +00:00
Chandler Carruth	8232bf53c6	Enable the new SROA pass by default. Queue the fallout. ;] llvm-svn: 164480	2012-09-24 01:10:25 +00:00
Chandler Carruth	92924fd28f	Address one of the original FIXMEs for the new SROA pass by implementing integer promotion analogous to vector promotion. When there is an integer alloca being accessed both as its integer type and as a narrower integer type, promote the narrower access to "insert" and "extract" the smaller integer from the larger one, and make the integer alloca a candidate for promotion. In the new formulation, we don't care about target legal integer or use thresholds to control things. Instead, we only perform this promotion to an integer type which the frontend has already emitted a load or store for. This bounds the scope and prevents optimization passes from coalescing larger and larger entities into a single integer. llvm-svn: 164479	2012-09-24 00:34:20 +00:00
Chandler Carruth	e7a1ba5e8b	Switch to a signed representation for the dynamic offsets while walking across the uses of the alloca. It's entirely possible for negative numbers to come up here, and in some rare cases simply doing the 2's complement arithmetic isn't the correct decision. Notably, we can't zext the index of the GEP. The definition of GEP is that these offsets are sign extended or truncated to the size of the pointer, and then wrapping 2's complement arithmetic used. This patch fixes an issue that comes up with no input from the buildbots or bootstrap afaict. The only place where it manifested, disturbingly, is Clang's own regression test suite. A reduced and targeted collection of tests are added to cope with this. Note that I've tried to pin down the potential cases of overflow, but may have missed some cases. I've tried to add a few cases to test this, but it's hard because LLVM has quite limited support for >64bit constructs. llvm-svn: 164475	2012-09-23 11:43:14 +00:00
Chandler Carruth	225d4bdb07	Fix a case where the new SROA pass failed to zap dead operands to selects with a constant condition. This resulted in the operands remaining live through the SROA rewriter. Most of the time, this just caused some dead allocas to persist and get zapped by later passes, but in one case found by Joerg, it caused a crash when we tried to promote the alloca despite it having this dead use. We already have the mechanisms in place to handle this, just wire select up to them. llvm-svn: 164427	2012-09-21 23:36:40 +00:00
Benjamin Kramer	eba9aca5cd	LoopIdiom: Give up when the loop is not in canonical form. We rely on it when doing the transforms. This can happen when there is an indirectbr in the loop. Fixes PR13892. llvm-svn: 164383	2012-09-21 17:27:23 +00:00
Benjamin Kramer	efb4d34bcf	InstCombine: Make sure we use the pre-zext type when creating a constant of a value that is zext'd. Fixes PR13250. llvm-svn: 164377	2012-09-21 16:26:41 +00:00
Manman Ren	93ab64916f	SimplifyCFG: sink common codes from IF, ELSE blocks down to END block. We already have HoistThenElseCodeToIf, this patch implements SinkThenElseCodeToEnd. When END block has only two predecessors and each predecessor terminates with unconditional branches, we compare instructions in IF and ELSE blocks backwards and check whether we can sink the common instructions down. rdar://12191395 llvm-svn: 164325	2012-09-20 22:37:36 +00:00
Michael Ilseman	5117db54ff	Renaming functions to match coding style guidelines llvm-svn: 164238	2012-09-19 18:14:45 +00:00
Michael Ilseman	370a1a1c94	Doxygen-ify comments llvm-svn: 164235	2012-09-19 16:25:57 +00:00
Michael Ilseman	1db690d15e	Put the * and & next to the variable, rather than the type. llvm-svn: 164232	2012-09-19 16:17:20 +00:00
Hans Wennborg	f744fa917d	SimplifyCFG: Don't generate invalid code for switch used to initialize two variables where the first variable is returned and the second ignored. I don't think this occurs in practice (other passes should have cleaned up the unused phi node), but it should still be handled correctly. Also make the logic for determining if we should return early less sketchy. llvm-svn: 164225	2012-09-19 14:24:21 +00:00
Benjamin Kramer	47196e6cd5	IntegerDivision: Style cleanups, avoid warning about mixing \|\| and && without parens. llvm-svn: 164216	2012-09-19 13:03:07 +00:00
Hans Wennborg	02fbc71647	CodeGenPrep: turn lookup tables into switches for some targets. This is a follow-up from r163302, which added a transformation to SimplifyCFG that turns some switches into loads from lookup tables. It was pointed out that some targets, such as GPUs and deeply embedded targets, might not find this appropriate, but SimplifyCFG doesn't have enough information about the target to decide this. This patch adds the reverse transformation to CodeGenPrep: it turns loads from lookup tables back into switches for targets where we do not build jump tables (assuming these are also the targets where lookup tables are inappropriate). Hopefully we will eventually get to have target information in SimplifyCFG, and then this CodeGenPrep transformation can be removed. llvm-svn: 164206	2012-09-19 07:48:16 +00:00
Chandler Carruth	3f882d4cf5	Fix the last crasher I've gotten a reproduction for in SROA. This one from the dragonegg build bots when we turned on the full version of the pass. Included a much reduced test case for this pesky bug, despite bugpoint's uncooperative behavior. Also, I audited all the similar code I could find and didn't spot any other cases where this mistake cropped up. llvm-svn: 164178	2012-09-18 22:37:19 +00:00
Michael Ilseman	52059da858	New utility for expanding integer division for targets that don't support it. Implementation derived from compiler-rt's implementation of signed and unsigned integer division. llvm-svn: 164173	2012-09-18 22:02:40 +00:00
Andrew Trick	402edbbe39	LSR critical edge splitting fix for PR13756. llvm-svn: 164147	2012-09-18 17:51:33 +00:00
Chandler Carruth	d356fd02a9	Fix getCommonType in a different way from the way I fixed it when working on FCA splitting. Instead of refusing to form a common type when there are uses of a subsection of the alloca as well as a use of the entire alloca, just skip the subsection uses and continue looking for a whole-alloca use with a type that we can use. This produces slightly prettier IR I think, and also fixes the other failure in the test. llvm-svn: 164146	2012-09-18 17:49:37 +00:00
Benjamin Kramer	a59ef5795d	Fix build for compilers that don't understand injected class names properly. llvm-svn: 164142	2012-09-18 17:11:47 +00:00
Benjamin Kramer	73a9e4a1f9	SROA: Use CRTP for OpSplitter to get rid of virtual dispatch and the virtual-dtor warnings that come with it. llvm-svn: 164140	2012-09-18 17:06:32 +00:00
Benjamin Kramer	65f8c88242	SROA: Replace the member function template contraption for recursively splitting aggregates into a real class. No intended functionality change. llvm-svn: 164135	2012-09-18 16:20:46 +00:00
NAKAMURA Takumi	eb2c8f0fc6	SROA.cpp: Appease msvc. ...I don't know why this could appease msvc...baad. llvm-svn: 164130	2012-09-18 15:29:02 +00:00
Benjamin Kramer	9bc3efc81c	LNT builders have picked up new SROA, disable it to get the remaining builders green again. llvm-svn: 164124	2012-09-18 13:43:00 +00:00
Chandler Carruth	a34f3567e0	Fix a warning in release builds and a test case I forgot to update with a fix to getCommonType in the previous patch. llvm-svn: 164120	2012-09-18 13:02:06 +00:00
Chandler Carruth	42cb9cb14f	Add a major missing piece to the new SROA pass: aggressive splitting of FCAs. This is essential in order to promote allocas that are used in struct returns by frontends like Clang. The FCA load would block the rest of the pass from firing, resulting in significant regressions with the bullet benchmark in the nightly test suite. Thanks to Duncan for repeated discussions about how best to do this, and to both him and Benjamin for review. This appears to have blocked many places where the pass tries to fire, and so I expect somewhat different results with this fix added. As with the last big patch, I'm including a change to enable the SROA by default temporarily. Ben is going to remove this as soon as the LNT bots pick up the patch. I'm just trying to get a round of LNT numbers from the stable machines in the lab. NOTE: Four clang tests are expected to fail in the brief window where this is enabled. Sorry for the noise! llvm-svn: 164119	2012-09-18 12:57:43 +00:00
Richard Osborne	b68053e266	Fix instcombine to obey requested alignment when merging allocas. llvm-svn: 164117	2012-09-18 09:31:44 +00:00
Craig Topper	b1d83e8c72	Mark unimplemented copy constructors and copy assignment operators as LLVM_DELETED_FUNCTION. llvm-svn: 164090	2012-09-18 02:01:41 +00:00
Manman Ren	5657555357	PGO: preserve branch-weight metadata when simplifying Switch to a sub, an icmp and a conditional branch; also when removing dead cases from a switch. llvm-svn: 164084	2012-09-18 00:47:33 +00:00
Manman Ren	ce48ea7e25	PGO: preserve branch-weight metadata when simplifying Switch. Handle the case when we split the default edge if the default target has "icmp" and unconditional branch. llvm-svn: 164076	2012-09-17 23:07:43 +00:00
Manman Ren	774246a3a9	PGO: preserve branch-weight metadata when simplifying SwitchOnSelect. llvm-svn: 164068	2012-09-17 22:28:55 +00:00
Manman Ren	2d4c10fc49	PGO: preserve branch-weight metadata when simplifying two branches with a common destination in SimplifyCondBranchToCondBranch. llvm-svn: 164054	2012-09-17 21:30:40 +00:00
Bill Wendling	636f1a1d99	s/__llvm_gcov_flush/__gcov_flush/g llvm-svn: 164040	2012-09-17 17:57:05 +00:00
Benjamin Kramer	02a4dff492	NewSROA: Provide a full set of operator< for ByteRanges. MSVC8 won't compile lower_bound if one is missing. llvm-svn: 164035	2012-09-17 16:42:36 +00:00
Axel Naumann	4a1270691e	Fix a few vars that can end up being used without initialization. The cases where no initialization happens should still be checked for logic flaws. llvm-svn: 164032	2012-09-17 14:20:57 +00:00
Chandler Carruth	9712117a07	Refactor the SROA visitors for partitioning an alloca and building partition use lists a bit. No functionality changed. These visitors are actually visiting a tuple of a Use and an offset into the alloca. However, we use the InstVisitor to handle the dispatch over the users, and so the Use and Offset are stored in class member variables and set just before each call to visit(). This is fairly awkward and makes the functions a bit harder to read, but it's the only real option we have until InstVisitor can be rewritten to use variadic templates. However, this pattern shouldn't be followed on the helper member functions where there is no interface constraint from the visitor. We already were passing the instruction as a normal parameter rather than use the Use to get at it, start passing the offset as well. This will become more important in subsequent patches as the offset will in some cases change while visiting a single instruction. llvm-svn: 164003	2012-09-16 19:39:50 +00:00
Craig Topper	a60c0f1163	Use LLVM_DELETED_FUNCTION in place of 'DO NOT IMPLEMENT' comments. llvm-svn: 163974	2012-09-15 17:09:36 +00:00
Benjamin Kramer	ed11e35e57	Disable new sroa now that all buildbots have tested it. What we have so far: - Some clang test failures (these were known already) - Perf results are mixed, some big regressions http://llvm.org/perf/db_default/v4/nts/3844 http://llvm.org/perf/db_default/v4/nts/3845 bullet suffers a lot. matmul is interesting: slower scalar code, faster with -vectorize. - Some dragonegg selfhost bots crash in SROA during selfhost now http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.6-self-host-checks/builds/1632 http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.5-self-host/builds/1891 llvm-svn: 163968	2012-09-15 15:11:10 +00:00
Chandler Carruth	70b44c5ccf	Port the SSAUpdater-based promotion logic from the old SROA pass to the new one, and add support for running the new pass in that mode and in that slot of the pass manager. With this the new pass can completely replace the old one within the pipeline. The strategy for enabling or disabling the SSAUpdater logic is to do it by making the requirement of the domtree analysis optional. By default, it is required and we get the standard mem2reg approach. This is usually the desired strategy when run in stand-alone situations. Within the CGSCC pass manager, we disable requiring of the domtree analysis and consequentially trigger fallback to the SSAUpdater promotion. In theory this would allow the pass to re-use a domtree if one happened to be available even when run in a mode that doesn't require it. In practice, it lets us have a single pass rather than two which was simpler for me to wrap my head around. There is a hidden flag to force the use of the SSAUpdater code path for the purpose of testing. The primary testing strategy is just to run the existing tests through that path. One notable difference is that it has custom code to handle lifetime markers, and one of the tests has been enhanced to exercise that code. This has survived a bootstrap and the test suite without serious correctness issues, however my run of the test suite produced very alarming performance numbers. I don't entirely understand or trust them though, so more investigation is on-going. To aid my understanding of the performance impact of the new SROA now that it runs throughout the optimization pipeline, I'm enabling it by default in this commit, and will disable it again once the LNT bots have picked up one iteration with it. I want to get those bots (which are much more stable) to evaluate the impact of the change before I jump to any conclusions. NOTE: Several Clang tests will fail because they run -O3 and check the result's order of output. They'll go back to passing once I disable it again. llvm-svn: 163965	2012-09-15 11:43:14 +00:00
Manman Ren	bfb9d435e4	PGO: preserve branch-weight metadata when simplifying two branches with a common destination. Updated previous implementation to fix a case not covered: // PBI: br i1 %x, TrueDest, BB // BI: br i1 %y, TrueDest, FalseDest The other case was handled correctly. // PBI: br i1 %x, BB, FalseDest // BI: br i1 %y, TrueDest, FalseDest Also tried to use 64-bit arithmetic instead of APInt with scale to simplify the computation. Let me know if you have other opinions about this. llvm-svn: 163954	2012-09-15 00:39:57 +00:00
Bill Wendling	8d26bc38f5	Remove comment. llvm-svn: 163945	2012-09-14 22:35:49 +00:00
Manman Ren	8691e5220b	PGO: preserve branch-weight metadata when simplifying a switch with a single case to a conditional branch and when removing dead cases. llvm-svn: 163942	2012-09-14 21:53:06 +00:00
Evan Cheng	71be12b35b	Stylistic and 80-col fixes llvm-svn: 163940	2012-09-14 21:25:34 +00:00
Alex Rosenberg	af2808cb72	Review feedback from Duncan Sands. Alphabetize includes and simplify lit config. llvm-svn: 163928	2012-09-14 19:19:57 +00:00
Manman Ren	5e5049d9a6	Try to fix the bots by detecting inconsistent branch-weight metadata. llvm-svn: 163926	2012-09-14 19:05:19 +00:00
Manman Ren	d81b8e88e3	PGO: preserve branch-weight metadata when merging two switches where the default target of the first switch is not the basic block the second switch is in (PredDefault != BB). llvm-svn: 163916	2012-09-14 17:29:56 +00:00
Dmitri Gribenko	5485acd440	Fix Doxygen issues: * wrap code blocks in \code ... \endcode; * refer to parameter names in paragraphs correctly (\arg is not what most people want -- it starts a new paragraph); * use \param instead of \arg to document parameters in order to be consistent with the rest of the codebase. llvm-svn: 163902	2012-09-14 14:57:36 +00:00
Benjamin Kramer	4622cd7edd	SROA: Silence unused variable warnings in Release builds. The NDEBUG hack is ugly, but I see no better solution. llvm-svn: 163900	2012-09-14 13:08:09 +00:00
Chandler Carruth	054a40a4ff	Rework the computation of a sub-structure natural type. There were pointless checks in here, bad asserts, and just confusing code. I've also added a bit more to the comment to clarify what this function is really trying to do as it was not obvious to Duncan when studying it. Thanks to Duncan for helping me dig through the issue. No real functionality changed here in practical cases, and certainly no test case. This is just cleanup spotted by inspection. llvm-svn: 163897	2012-09-14 11:08:31 +00:00
Chandler Carruth	0cc59250d5	Rely on the recursive check for pointer types rather than adding an explicit check before recursing. A simplification requested by Duncan during review. llvm-svn: 163896	2012-09-14 10:30:44 +00:00
Chandler Carruth	cabd96cbaa	Be a bit more aggressive in bailing out of this routine. Spotted by inspection by Duncan during review. My suspicion is that we would still have returned 0 anyways in this case, but doing it sooner is better. llvm-svn: 163895	2012-09-14 10:30:42 +00:00
Chandler Carruth	dd3cea898f	Add some comments clarifying that the GEP analysis for vector GEPs is deeply suspicious and likely to go away eventually. Also fix a bogus comment about one of the checks in the vector GEP analysis. Based on review from Duncan. llvm-svn: 163894	2012-09-14 10:30:40 +00:00
Chandler Carruth	19450da9e6	Move an instance variable to a local variable based on review by Duncan. Originally I had anticipated needing to thread this through more bits of the SROA pass itself, but that ended up not happening. In the end, this is a much simpler way to manage the variable. llvm-svn: 163893	2012-09-14 10:26:38 +00:00
Chandler Carruth	4b40e008bd	Add a comment about debug intrinsics that I really don't want to forget from Duncan's review as a FIXME. llvm-svn: 163892	2012-09-14 10:26:36 +00:00
Chandler Carruth	b0de6ddbe0	Add two asserts that Duncan thought would help ensure things don't rot unexpectedly in the future. More fixes from his code review. llvm-svn: 163891	2012-09-14 10:26:34 +00:00
Chandler Carruth	6ba9824c2b	Actually keep the flag default-off for now. =/ That's what I get for being busy testing this... llvm-svn: 163890	2012-09-14 10:18:54 +00:00
Chandler Carruth	796de48459	Remove some dead, commented out code Duncan spotted in review. llvm-svn: 163889	2012-09-14 10:18:53 +00:00
Chandler Carruth	25fb23d687	Wrap the dumping and printing routines in NDEBUG and LLVM_ENABLE_DUMP macros. llvm-svn: 163888	2012-09-14 10:18:51 +00:00
Chandler Carruth	93a21e7aaf	Lots of comment fixes and cleanups from Duncan's review. llvm-svn: 163887	2012-09-14 10:18:49 +00:00
NAKAMURA Takumi	4bbca0bb6c	SROA.cpp: Unbreak gcc, sorry! llvm-svn: 163886	2012-09-14 10:06:10 +00:00
NAKAMURA Takumi	f4619d169d	SROA.cpp: Appease msvc. LLVM_ATTRIBUTE(s) should come front of "const". llvm-svn: 163885	2012-09-14 09:55:22 +00:00
Chandler Carruth	9a447db9fc	Speculative change to try to fix older GCC versions that can't handle the injected class name of a dependent base class here. llvm-svn: 163884	2012-09-14 09:30:33 +00:00
Chandler Carruth	1b398ae0ae	Introduce a new SROA implementation. This is essentially a ground up re-think of the SROA pass in LLVM. It was initially inspired by a few problems with the existing pass: - It is subject to the bane of my existence in optimizations: arbitrary thresholds. - It is overly conservative about which constructs can be split and promoted. - The vector value replacement aspect is separated from the splitting logic, missing many opportunities where splitting and vector value formation can work together. - The splitting is entirely based around the underlying type of the alloca, despite this type often having little to do with the reality of how that memory is used. This is especially prevalent with unions and base classes where we tail-pack derived members. - When splitting fails (often due to the thresholds), the vector value replacement (again because it is separate) can kick in for preposterous cases where we simply should have split the value. This results in forming i1024 and i2048 integer "bit vectors" that tremendously slow down subsequent IR optimizations (due to large APInts) and impede the backend's lowering. The new design takes an approach that fundamentally is not susceptible to many of these problems. It is the result of a discussion between myself and Duncan Sands over IRC about how to preemptively avoid these types of problems and how to do SROA in a more principled way. Since then, it has evolved and grown, but this remains an important aspect: it fixes real world problems with the SROA process today. First, the transform of SROA actually has little to do with replacement. It has more to do with splitting. The goal is to take an aggregate alloca and form a composition of scalar allocas which can replace it and will be most suitable to the eventual replacement by scalar SSA values. The actual replacement is performed by mem2reg (and in the future SSAUpdater). The splitting is divided into four phases. The first phase is an analysis of the uses of the alloca. This phase recursively walks uses, building up a dense datastructure representing the ranges of the alloca's memory actually used and checking for uses which inhibit any aspects of the transform such as the escape of a pointer. Once we have a mapping of the ranges of the alloca used by individual operations, we compute a partitioning of the used ranges. Some uses are inherently splittable (such as memcpy and memset), while scalar uses are not splittable. The goal is to build a partitioning that has the minimum number of splits while placing each unsplittable use in its own partition. Overlapping unsplittable uses belong to the same partition. This is the target split of the aggregate alloca, and it maximizes the number of scalar accesses which become accesses to their own alloca and candidates for promotion. Third, we re-walk the uses of the alloca and assign each specific memory access to all the partitions touched so that we have dense use-lists for each partition. Finally, we build a new, smaller alloca for each partition and rewrite each use of that partition to use the new alloca. During this phase the pass will also work very hard to transform uses of an alloca into a form suitable for promotion, including forming vector operations, speculating loads through PHI nodes and selects, etc. After splitting is complete, each newly refined alloca that is a candidate for promotion to a scalar SSA value is run through mem2reg. There are lots of reasonably detailed comments in the source code about the design and algorithms, and I'm going to be trying to improve them in subsequent commits to ensure this is well documented, as the new pass is in many ways more complex than the old one. Some of this is still a WIP, but the current state is reasonably stable. It has passed bootstrap, the nightly test suite, and Duncan has run it successfully through the ACATS and DragonEgg test suites. That said, it remains behind a default-off flag until the last few pieces are in place, and full testing can be done. Specific areas I'm looking at next: - Improved comments and some code cleanup from reviews. - SSAUpdater and enabling this pass inside the CGSCC pass manager. - Some datastructure tuning and compile-time measurements. - More aggressive FCA splitting and vector formation. Many thanks to Duncan Sands for the thorough final review, as well as Benjamin Kramer for lots of review during the process of writing this pass, and Daniel Berlin for reviewing the data structures and algorithms and general theory of the pass. Also, several other people on IRC, over lunch tables, etc for lots of feedback and advice. llvm-svn: 163883	2012-09-14 09:22:59 +00:00
Dan Gohman	3f553c21eb	Handle the new !tbaa.struct metadata tags when converting a memcpy into scalar loads and stores. llvm-svn: 163844	2012-09-13 21:51:01 +00:00
Dan Gohman	d0080c45f9	Extract code for reducing a type to a single value type into a helper function. llvm-svn: 163817	2012-09-13 18:19:06 +00:00
Benjamin Kramer	15a257dadd	MemCpyOpt: When forming a memset from stores also take GEP constexprs into account. This is common when storing to global variables. llvm-svn: 163809	2012-09-13 16:29:49 +00:00
Nadav Rotem	97d44349c9	Fix an 80 char line limit. llvm-svn: 163808	2012-09-13 16:27:32 +00:00
Bill Wendling	fb1f6681a3	Use Nick's suggestion of storing a large NULL into the GV instead of memset, which requires TargetData. llvm-svn: 163799	2012-09-13 14:32:30 +00:00
Dmitri Gribenko	2bc1d483fe	Fix Doxygen issues: * wrap code blocks in \code ... \endcode; * refer to parameter names in paragraphs correctly (\arg is not what most people want -- it starts a new paragraph). llvm-svn: 163790	2012-09-13 12:34:29 +00:00
Bill Wendling	2e6e86606a	Introduce the __llvm_gcov_flush function. This function writes out the current values of the counters and then resets them. This can be used similarly to the __gcov_flush function to sync the counters when need be. For instance, in a situation where the application doesn't exit. <rdar://problem/12185886> llvm-svn: 163757	2012-09-13 00:09:55 +00:00
Dan Gohman	7c84dad80a	Detect overflow in the path count computation. rdar://12277446. llvm-svn: 163739	2012-09-12 20:45:17 +00:00
Manman Ren	49dbe255e6	PGO: preserve branch-weight metadata when removing a case which jumps to the default target. llvm-svn: 163724	2012-09-12 17:04:11 +00:00
Manman Ren	49d684e1e2	Release build: guard dump functions with "#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)" No functional change. Update r163344. llvm-svn: 163679	2012-09-12 05:06:18 +00:00
Manman Ren	571d9e4b80	SimplifyCFG: preserve branch-weight metadata when creating a new switch from a pair of switch/branch where both depend on the value of the same variable and the default case of the first switch/branch goes to the second switch/branch. Code clean up and fixed a few issues: 1> handling the case where some cases of the 2nd switch are invalidated 2> correctly calculate the weight for the 2nd switch when it is a conditional eq. Testing case is modified from Alastair's original patch. llvm-svn: 163635	2012-09-11 17:43:35 +00:00
NAKAMURA Takumi	7419c5fb45	llvm/lib/Transforms/Utils/CMakeLists.txt: Update. llvm-svn: 163593	2012-09-11 02:55:37 +00:00
Alex Rosenberg	04b43aab43	Add a pass that renames everything with metasyntactic names. This works well after using bugpoint to reduce the confusion presented by the original names, which no longer mean what they used to. llvm-svn: 163592	2012-09-11 02:46:18 +00:00
Benjamin Kramer	1f66f885e8	Move bypassSlowDivision into the llvm namespace. llvm-svn: 163503	2012-09-10 11:52:08 +00:00
Hans Wennborg	7fd5c844af	Fix style issues from r163302 pointed out by Evan. llvm-svn: 163491	2012-09-10 07:44:22 +00:00
Nick Lewycky	12d825d9ca	Move spaces to the right places. No functionality change. llvm-svn: 163485	2012-09-09 23:41:11 +00:00
Benjamin Kramer	2b11eb0729	DSE: Poking holes into a SetVector is expensive, avoid it if possible. llvm-svn: 163480	2012-09-09 16:44:05 +00:00
Andrew Trick	d3b4d2cb76	Remove an incorrect assert during branch weight propagation. Patch and test case by Alastair Murray! llvm-svn: 163437	2012-09-08 00:07:26 +00:00
Hans Wennborg	08238adbbb	SimplifyCFG: ValidLookupTableConstant should be static llvm-svn: 163378	2012-09-07 08:22:57 +00:00
Manman Ren	c3366ccecb	Release build: guard dump functions with "ifndef NDEBUG" No functional change. llvm-svn: 163344	2012-09-06 19:55:56 +00:00
Hans Wennborg	feb4d07d88	Fix switch_to_lookup_table.ll test from r163302. The lookup tables did not get built in a deterministic order. This makes them get built in the order that the corresponding phi nodes were found. llvm-svn: 163305	2012-09-06 10:10:35 +00:00
Hans Wennborg	8a62fc5294	Build lookup tables for switches (PR884) This adds a transformation to SimplifyCFG that attempts to turn switch instructions into loads from lookup tables. It works on switches that are only used to initialize one or more phi nodes in a common successor basic block, for example: int f(int x) { switch (x) { case 0: return 5; case 1: return 4; case 2: return -2; case 5: return 7; case 6: return 9; default: return 42; } } This speeds up the code by removing the hard-to-predict jump, and reduces code size by removing the code for the jump targets. llvm-svn: 163302	2012-09-06 09:43:28 +00:00
Jim Grosbach	30c4282f88	Update function names to conform to guidelines. No functional change. llvm-svn: 163279	2012-09-06 00:59:08 +00:00
Roman Divacky	ad06cee239	Stop casting away const qualifier needlessly. llvm-svn: 163258	2012-09-05 22:26:57 +00:00
Kostya Serebryany	5f5973df08	[asan] fix lint llvm-svn: 163205	2012-09-05 09:00:18 +00:00
Kostya Serebryany	2fa38f8ce0	[asan] extend the blacklist functionality to handle global-init. Patch by Reid Watson llvm-svn: 163199	2012-09-05 07:29:56 +00:00
Dan Gohman	df476e5e93	Make provenance checking conservative in cases when pointers-to-strong-pointers may be in play. These can lead to retains and releases happening in unstructured ways, foiling the optimizer. This fixes rdar://12150909. llvm-svn: 163180	2012-09-04 23:16:20 +00:00
Jakub Staszak	e535c1a12e	BypassSlowDivision: Assign to reference, don't copy the object. llvm-svn: 163179	2012-09-04 23:11:11 +00:00
Jakub Staszak	85a7787588	Fix my previous patch (r163164). It does now what it is supposed to do: Doesn't set MadeChange to TRUE if BypassSlowDivision doesn't change anything. llvm-svn: 163165	2012-09-04 21:16:59 +00:00
Jakub Staszak	46beca6364	Return false if BypassSlowDivision doesn't change anything. Also a few minor changes: - use pre-inc instead of post-inc - use isa instead of dyn_cast - 80 col - trailing spaces llvm-svn: 163164	2012-09-04 20:48:24 +00:00
Preston Gurd	cdf540d5d6	Generic Bypass Slow Div - CodeGenPrepare pass for identifying div/rem ops - Backend specifies the type mapping using addBypassSlowDivType - Enabled only for Intel Atom with O2 32-bit -> 8-bit - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256. - In the case when the quotient and remainder of a divide are used a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands. - Test cases check for the presence of the optimization when calculating either the quotient, remainder, or both. Patch by Tyler Nowicki! llvm-svn: 163150	2012-09-04 18:22:17 +00:00
Nadav Rotem	03dcd85b56	LICM may hoist an instruction with undefined behavior above a trap. Scan the body of the loop and find instructions that may trap. Use this information when deciding if it is safe to hoist or sink instructions. Notice that we can optimize the search of instructions that may throw in the case of nested loops. rdar://11518836 llvm-svn: 163132	2012-09-04 10:25:04 +00:00
Nadav Rotem	9d83202620	Not all targets have efficient ISel code generation for select instructions. For example, the ARM target does not have efficient ISel handling for vector selects with scalar conditions. This patch adds a TLI hook which allows the different targets to report which selects are supported well and which selects should be converted to CF during codegen prepare. llvm-svn: 163093	2012-09-02 12:10:19 +00:00
Benjamin Kramer	599a4bb6ea	LoopRotation: Make the brute force DomTree update more brute force. We update until we hit a fixpoint. This is probably slow but also slightly simplifies the code. It should also fix the occasional invalid domtrees observed when building with expensive checking. I couldn't find a case where this had a measurable slowdown, but if someone finds a pathological case where it does we may have to find a cleverer way of updating dominators here. Thanks to Duncan for the test case. llvm-svn: 163091	2012-09-02 11:57:22 +00:00
Logan Chien	9ab55b8d59	Rename ANDROIDEABI to Android. Most of the code guarded with ANDROIDEABI are not ARM-specific, and having no relation with arm-eabi. Thus, it will be more natural to call this environment "Android" instead of "ANDROIDEABI". Note: We are not using ANDROID because several projects are using "-DANDROID" as the conditional compilation flag. llvm-svn: 163087	2012-09-02 09:29:46 +00:00
Benjamin Kramer	3be6a480a4	LoopRotation: Check some invariants of the dominator updating code. llvm-svn: 163058	2012-09-01 12:04:51 +00:00
Michael Ilseman	30c3e14e8e	test llvm-svn: 162914	2012-08-30 15:45:16 +00:00
Benjamin Kramer	afdfdb5cff	LoopRotate: Also rotate loops with multiple exits. The old PHI updating code in loop-rotate was replaced with SSAUpdater a while ago, it has no problems with complex PHIs. What had to be fixed is detecting whether a loop was already rotated and updating dominators when multiple exits were present. This change increases overall code size a bit, mostly due to additional loop unrolling opportunities. Passes test-suite and selfhost with -verify-dom-info. Fixes PR7447. Thanks to Andy for the input on the domtree updating code. llvm-svn: 162912	2012-08-30 15:39:42 +00:00
Benjamin Kramer	d4a64716ab	InstCombine: Fix comment to reflect the code. llvm-svn: 162911	2012-08-30 15:07:40 +00:00
Alexey Samsonov	f54e3aaeaa	Whitespace llvm-svn: 162907	2012-08-30 13:47:13 +00:00
Nadav Rotem	d5f5777b77	It is illegal to transform (sdiv (ashr X c1) c2) -> (sdiv x (2^c1 * c2)), because C always rounds towards zero. Thanks Dirk and Ben. llvm-svn: 162899	2012-08-30 11:23:20 +00:00
Bill Wendling	14c8a051ca	Pass by pointer and not std::string. llvm-svn: 162888	2012-08-30 01:32:31 +00:00
Bill Wendling	1f6f8c2cb7	Revert r162855 in favor of changing clang to emit the absolute coverage file path. llvm-svn: 162883	2012-08-30 00:34:21 +00:00
Andrew Trick	3051aa1cb8	Preserve branch profile metadata during switch formation. Patch by Michael Ilseman! This fixes SimplifyCFGOpt::FoldValueComparisonIntoPredecessors to preserve metadata when folding conditional branches into switches. void foo(int x) { if (x == 0) bar(1); else if (__builtin_expect(x == 10, 1)) bar(2); else if (x == 20) bar(3); } CFG: B0 | \ | X0 B10 | \ | X10 B20 | \ E X20 Merge B0-B10: w(B0-X0) = w(B0-X0) * sum-weights(B10) = w(B0-X0) * (w(B10-X10) + w(B10-B20)) w(B0-X10) = w(B0-B10) * w(B10-X10) w(B0-B20) = w(B0-B10) * w(B10-B20) B0 __ | \ \ | X10 X0 B20 | \ E X20 Merge B0-B20: w(B0-X0) = w(B0-X0) * sum-weights(B20) = w(B0-X0) * (w(B20-E) + w(B20-X20)) w(B0-X10) = w(B0-X10) * sum-weights(B20) = ... w(B0-X20) = w(B0-B20) * w(B20-X20) w(B0-E) = w(B0-B20) * w(B20-E) llvm-svn: 162868	2012-08-29 21:46:38 +00:00
Andrew Trick	f3cf1932b3	whitespace llvm-svn: 162867	2012-08-29 21:46:36 +00:00
Bill Wendling	11e61b9557	Use the full path to output the .gcda file. This lets the user run the program from a different directory and still have the .gcda files show up in the correct place. <rdar://problem/12179524> llvm-svn: 162855	2012-08-29 20:30:44 +00:00
Bill Wendling	e8aee6b8a5	Use ArrayRef instead of SmallVector when passing vector into function. llvm-svn: 162851	2012-08-29 18:45:41 +00:00
Benjamin Kramer	8bcc971174	Make MemoryBuiltins aware of TargetLibraryInfo. This disables malloc-specific optimization when -fno-builtin (or -ffreestanding) is specified. This has been a problem for a long time but became more severe with the recent memory builtin improvements. Since the memory builtin functions are used everywhere, this required passing TLI in many places. This means that functions that now have an optional TLI argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead mallocs anymore if the TLI argument is missing. I've updated most passes to do the right thing. Fixes PR13694 and probably others. llvm-svn: 162841	2012-08-29 15:32:21 +00:00
Benjamin Kramer	1e1a1dedc6	InstCombine: Defensively avoid undefined shifts by limiting the amount to the bit width. No test case, undefined shifts get folded early, but can occur when other transforms generate a constant. Thanks to Duncan for bringing this up. llvm-svn: 162755	2012-08-28 13:59:23 +00:00
Benjamin Kramer	9c0a807c27	InstCombine: Guard the transform introduced in r162743 against large ints and non-const shifts. llvm-svn: 162751	2012-08-28 13:08:13 +00:00
Nadav Rotem	d457787fed	Make sure that we don't call getZExtValue on values > 64 bits. Thanks Benjamin for noticing this. llvm-svn: 162749	2012-08-28 12:23:22 +00:00
Nadav Rotem	11935b29f3	Teach InstCombine to canonicalize [SU]div+[AL]shl patterns. For example: %1 = lshr i32 %x, 2 %2 = udiv i32 %1, 100 rdar://12182093 llvm-svn: 162743	2012-08-28 10:01:43 +00:00
Dan Gohman	10c82cee04	Don't use for loops for code that is only intended to execute once. No intended functionality change. Thanks to Ahmed Charles for spotting it. llvm-svn: 162686	2012-08-27 18:31:36 +00:00
Kostya Serebryany	4cc511daf0	[asan/tsan] rename FunctionBlackList* to BlackList* as this class is not limited to functions any more llvm-svn: 162566	2012-08-24 16:44:47 +00:00
Kostya Serebryany	36dfc5ceab	[asan/tsan] extend the functionality of FunctionBlackList to globals and modules. Patch by Reid Watson. llvm-svn: 162565	2012-08-24 16:40:11 +00:00
Benjamin Kramer	dd62d6b6c8	GVN: Fix quadratic runtime on the number of switch cases. No intended behavior change. This was introduced in r162023. With the fixed algorithm a Release build of ARMInstPrinter.cpp goes from 16s to 10s on a 2011 MBP. llvm-svn: 162559	2012-08-24 15:06:28 +00:00
Benjamin Kramer	e07728b936	SimplifyLibCalls: Give all safely-shrinkable libcalls the same treatment. llvm-svn: 162383	2012-08-22 19:39:15 +00:00
Chad Rosier	0122909d95	Add a few float shrinking optimizations to SimplifyLibCalls. Unsafe optimizations are guarded by the -enable-double-float-shrink LLVM option. Last bit of PR13574. Patch by Weiming Zhao <weimingz@codeaurora.org>. llvm-svn: 162368	2012-08-22 17:22:33 +00:00
Chad Rosier	b2f5c1cdbb	Add a new helper function, AddOpt(F1, F1, Opt), as part of PR13574. No functional change intended. Patch by Weiming Zhao <weimingz@codeaurora.org>. llvm-svn: 162363	2012-08-22 16:52:57 +00:00
Richard Smith	976f8605e2	MaximumSpanningTree::EdgeWeightCompare: Make this comparator actually be a strict weak ordering, and don't pass possibly-null pointers to dyn_cast. llvm-svn: 162314	2012-08-21 21:03:40 +00:00
Richard Smith	ad9c8e839e	Don't bind a reference to a dereferenced null pointer (for return value of WeakVH::operator*). llvm-svn: 162309	2012-08-21 20:35:14 +00:00
Chandler Carruth	c908ca1766	Port the global copy optimization from the SROA pass to InstCombine. This optimization is really just replacing allocas wholesale with globals, there is no scalarization. The underlying motivation for this patch is to simplify the SROA pass and focus it on splitting and promoting allocas. llvm-svn: 162271	2012-08-21 08:39:44 +00:00
Kostya Serebryany	f4be019fba	[asan] add code to detect global initialization fiasco in C/C++. The sub-pass is off by default for now. Patch by Reid Watson. Note: this patch changes the interface between LLVM and compiler-rt parts of asan. The corresponding patch to compiler-rt will follow. llvm-svn: 162268	2012-08-21 08:24:25 +00:00
Michael Liao	6e12d12830	revise debug output to avoid dangling pointer llvm-svn: 162256	2012-08-21 05:55:22 +00:00
Benjamin Kramer	9d03242fcf	InstCombine: Fix a crasher when encountering a function pointer. llvm-svn: 162180	2012-08-18 22:04:34 +00:00
Benjamin Kramer	9282aef86d	Remove overly conservative hasOneUse check, this always expands into a single IR instruction. llvm-svn: 162175	2012-08-18 20:24:19 +00:00
Benjamin Kramer	8c2a733c55	InstCombine: Add a couple of fabs identities for comparing with 0.0. llvm-svn: 162174	2012-08-18 20:06:47 +00:00
Benjamin Kramer	000132454c	SimplifyLibcalls: Add fabs and trunc to the list of libcalls that are safe to shrink from double to float. llvm-svn: 162173	2012-08-18 19:27:32 +00:00
Richard Smith	257c5f2088	Fix undefined behavior (binding a reference to a dereferenced null pointer) if SSAUpdater was created and destroyed without being initialized. llvm-svn: 162137	2012-08-17 21:42:44 +00:00
Rafael Espindola	cc80cdebb9	Teach GVN to reason about edges dominating uses. This allows it to handle cases where some fact like a=b dominates a use in a phi, but doesn't dominate the basic block itself. This feature could also be implemented by splitting critical edges, but at least with the current algorithm reasoning about the dominance directly is faster. The time for running "opt -O2" in the testcase in pr10584 is 1.003 times slower and on gcc as a single file it is 1.0007 times faster. llvm-svn: 162023	2012-08-16 15:09:43 +00:00
Bill Wendling	4d5150d978	Remove dead flag. llvm-svn: 161990	2012-08-15 21:18:10 +00:00
Kostya Serebryany	1e575ab8b2	[asan] implement --asan-always-slow-path, which is a part of the improvement to handle unaligned partially OOB accesses. See http://code.google.com/p/address-sanitizer/issues/detail?id=100 llvm-svn: 161937	2012-08-15 08:58:58 +00:00
Michael Liao	69e172a6f0	fix infinite loop in instcombine with more than 4GB memcpy - memcpy size is wrongly truncated to 32 bits, so an 8GB memcpy is treated as a 0-sized memcpy - as 0-sized memcpy/memset is already removed before SimplifyMemTransfer and SimplifyMemSet in visitCallInst, replace 0 checking with assertions. - replace getZExtValue() with getLimitedValue() according to Eli Friedman llvm-svn: 161923	2012-08-15 03:49:59 +00:00
Kostya Serebryany	fda7a138f7	[asan] insert crash basic blocks inline as opposed to inserting them at the end of the function. This doesn't seem to fix or break anything, but is considered to be more friendly to downstream passes llvm-svn: 161870	2012-08-14 14:04:51 +00:00
Craig Topper	2a40418a99	Change greater than to greater than or equal so that an identical sized store to the same offset is treated as completely overwriting. llvm-svn: 161857	2012-08-14 07:32:05 +00:00
Nadav Rotem	70409991bc	During the CodeGenPrepare we often lower intrinsics (such as objsize) and allow some optimizations to turn conditional branches into unconditional. This commit adds a simple control-flow optimization which merges two consecutive basic blocks which are connected by a single edge. This allows the codegen to operate on larger basic blocks. rdar://11973998 llvm-svn: 161852	2012-08-14 05:19:07 +00:00
Nadav Rotem	8d80452076	LICM uses AliasSet information to hoist and sink instructions. However, other passes, such as LoopRotate may invalidate its AliasSet because SSAUpdater does not update the AliasSet properly. This patch teaches SSAUpdater to notify AliasSet that it made changes. The testcase in PR12901 is too big to be useful and I could not reduce it to a normal size. rdar://11872059 PR12901 llvm-svn: 161803	2012-08-13 23:06:54 +00:00
Kostya Serebryany	0f7a80d0c3	[asan] remove the code for --asan-merge-callbacks as it appears to be a bad idea. (partly related to Bug 13225) llvm-svn: 161757	2012-08-13 14:08:46 +00:00
Rafael Espindola	64e7b5703e	Constify some basic blocks, no functionality change. llvm-svn: 161668	2012-08-10 15:55:25 +00:00
Pete Cooper	0deca6be79	Fix crash when doing LTO on Bullet. Dynamic GEPs in SROA were incorrectly being applied to all accesses to an alloca, not just the ones which read from the GEP. Thanks to Evan for reducing the test. rdar://11861001 llvm-svn: 161654	2012-08-10 03:26:36 +00:00
Eli Friedman	08ec0a8122	isAllocLikeFn is allowed to return true for functions which read memory; make sure we account for that correctly in DeadStoreElimination. Fixes a regression from r158919. PR13547. llvm-svn: 161468	2012-08-08 02:17:32 +00:00
Dan Gohman	b948736002	Avoid recomputing the unique exit blocks and their insert points when doing multiple scalar promotions on a single loop. This also has the effect of preserving the order of stores sunk out of loops, which is aesthetically pleasing, and it happens to fix the testcase in PR13542, though it doesn't fix the underlying problem. llvm-svn: 161459	2012-08-08 00:00:26 +00:00
Bob Wilson	61f3ad5759	Fix a serious typo in InstCombine's optimization of comparisons. An unsigned value converted to floating-point will always be greater than a negative constant. Unfortunately InstCombine reversed the check so that unsigned values were being optimized to always be greater than all positive floating-point constants. <rdar://problem/12029145> llvm-svn: 161452	2012-08-07 22:35:16 +00:00
Bill Wendling	8555a37c04	Move the "findUsedStructTypes" functionality outside of the Module class. The "findUsedStructTypes" method is very expensive to run. It needs to be optimized so that LTO can run faster. Splitting this method out of the Module class will help this occur. For instance, it can keep a list of seen objects so that it doesn't process them over and over again. llvm-svn: 161228	2012-08-03 00:30:35 +00:00
Nuno Lopes	a9a8c62714	remove tabs from my previous commit. Sorry, not used to this editor anymore.. XCode please come back; you're forgiven :) llvm-svn: 161120	2012-08-01 17:13:28 +00:00
Nuno Lopes	e7220312c2	(hopefully) fix the remaining cases where null wasn't expected (PR13497). I'll commit a test to the clang tree. llvm-svn: 161118	2012-08-01 16:58:51 +00:00
Evan Cheng	249716e8ae	Teach CodeGenPrep to look past bitcast when it's duplicating return instruction into predecessor blocks to enable tail call optimization. rdar://11958338 llvm-svn: 160894	2012-07-27 21:21:26 +00:00
Nuno Lopes	20c7eb3549	fix infinite loop in instcombine in the presence of a (malformed) self-referencing select inst. This can happen as long as the instruction is not reachable. Instcombine does generate these unreachable malformed selects when doing RAUW llvm-svn: 160874	2012-07-27 18:03:57 +00:00
Pete Cooper	abc13af9c6	Simplify demanded bits of select sources where the condition is a constant vector llvm-svn: 160835	2012-07-26 23:10:24 +00:00
Pete Cooper	e807e45bff	Teach SimplifyDemandedBits how to look through fpext and fptrunc to simplify their operand llvm-svn: 160823	2012-07-26 22:37:04 +00:00
Nuno Lopes	5940c4a15f	do null checks for a few more Emit*() functions. Thanks Eli for noticing. llvm-svn: 160787	2012-07-26 17:10:46 +00:00
Duncan Sands	5651452076	Stop reassociate from looking through expressions of arbitrary complexity. This is a temporary measure until my fix for PR13021 is ready. llvm-svn: 160778	2012-07-26 09:26:40 +00:00
Nick Lewycky	7d0f110cb3	It's not safe to blindly remove invoke instructions. This happens when we encounter an invoke of an allocation function. This should fix the dragonegg bootstrap. Testcase to follow, later. llvm-svn: 160757	2012-07-25 21:19:40 +00:00
Nuno Lopes	f0626f2205	revert r160742: it's breaking CMake build original commit msg: MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings llvm-svn: 160751	2012-07-25 18:49:28 +00:00
Nuno Lopes	f0441e04bd	MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings llvm-svn: 160742	2012-07-25 17:29:22 +00:00
Nuno Lopes	7ba5b98720	add EmitStrNLen() llvm-svn: 160741	2012-07-25 17:18:59 +00:00
Nuno Lopes	89702e94b5	make all Emit*() functions consult the TargetLibraryInfo information before creating a call to a library function. Update all clients to pass the TLI information around. Previous draft reviewed by Eli. llvm-svn: 160733	2012-07-25 16:46:31 +00:00
Nick Lewycky	38be931223	Don't delete one more instruction than we're allowed to. This should fix the Darwin bootstrap. Testcase exists but isn't fully reduced, I expect to commit the testcase this evening. llvm-svn: 160693	2012-07-24 21:33:00 +00:00
Nadav Rotem	465834c85f	Clean whitespaces. llvm-svn: 160668	2012-07-24 10:51:42 +00:00
Nick Lewycky	faa9c3b035	Teach globalopt to not nuke all stores to globals. Keep them around if they might be deliberate "one time" leaks, so that leak checkers can find them. This is a reapply of r160602 with the fix that this time I'm committing the code I thought I was committing last time; the I->eraseFromParent() goes after the break out of the loop. llvm-svn: 160664	2012-07-24 07:21:08 +00:00
Dan Gohman	f64ff8ed3a	An objc_retain can serve as a may-use for a different pointer. rdar://11931823. llvm-svn: 160637	2012-07-23 19:27:31 +00:00
Nadav Rotem	1088811c33	Suppress a warning. llvm-svn: 160629	2012-07-23 13:44:15 +00:00
Sylvestre Ledru	35521e2310	Fix a typo (the the => the) llvm-svn: 160621	2012-07-23 08:51:15 +00:00
Chandler Carruth	c8acd7c96b	Move the initialization of the bounds checking pass. The pass itself moved earlier. This fixes some layering issues. llvm-svn: 160611	2012-07-22 05:19:32 +00:00
Nick Lewycky	9669c198ba	Revert r160602. llvm-svn: 160603	2012-07-21 09:03:15 +00:00
Nick Lewycky	72b83e5eaa	Teach globalopt to play nice with leak checkers. This is a reapplication of r160529 that was subsequently reverted. The fix was to not call GV->eraseFromParent() right before the caller does the same. The existing testcases already caught this bug if run under valgrind. llvm-svn: 160602	2012-07-21 08:29:45 +00:00
Nuno Lopes	20ea62527a	move the bounds checking pass to the instrumentation folder, where it belongs. I dunno why in the world I dropped it in the Scalar folder in the first place. No functionality change. llvm-svn: 160587	2012-07-20 22:39:33 +00:00
Richard Osborne	0ab2b0df82	Fix assertion in jump threading (PR13405). GetBestDestForJumpOnUndef() assumes there is at least 1 successor, which isn't true if the block ends in an indirect branch with no successors. Fix this by bailing out earlier in this case. llvm-svn: 160546	2012-07-20 10:36:17 +00:00
Kostya Serebryany	f02c6069ac	[asan] make sure that the crash callbacks do not get merged (Chandler's idea: insert an empty InlineAsm). Change the order in which the new BBs are inserted: the slow path BB is inserted between old BBs, the crash BB is inserted at the end. Don't create an empty BB (introduced by recent commits). Update the test. The experimental code that does manual crash callback merge will most likely be deleted later. llvm-svn: 160544	2012-07-20 09:54:50 +00:00
Nick Lewycky	7707e23429	Revert r160529 due to crashes. llvm-svn: 160532	2012-07-19 23:59:21 +00:00
Nick Lewycky	0fa6a28141	Don't wipe out global variables that are probably storing pointers to heap memory. This makes clang play nice with leak checkers. llvm-svn: 160529	2012-07-19 22:35:28 +00:00
Benjamin Kramer	f364a63c3e	Replace some explicit compare loops with std::equal. No functionality change. llvm-svn: 160501	2012-07-19 10:46:05 +00:00
Bill Wendling	ea6397f67b	Remove tabs. llvm-svn: 160477	2012-07-19 00:11:40 +00:00
Andrew Trick	0d07dfcd6f	indvars: drive by heuristics fix. Minor oversight noticed by inspection. Sorry no unit test. llvm-svn: 160422	2012-07-18 04:35:13 +00:00
Andrew Trick	c08726627c	indvars: Linear function test replace should avoid reusing undef. Fixes PR13371: indvars pass incorrectly substitutes 'undef' values. I do not like this fix. It's needed until/unless the meaning of undef changes. It attempts to be complete according to the IR spec, but I don't have much confidence in the implementation given the difficulty testing undefined behavior. Worse, this invalidates some of my hard-fought work on indvars and LSR to optimize pointer induction variables. It results in benchmark regressions, which I'll track internally. On x86_64 no LTO I see: -3% huffbench -3% 400.perlbench -8% fhourstones My only suggestion for recovering is to change the meaning of undef. If we could trust an arbitrary instruction to produce some real value that can be manipulated (e.g. incremented) according to non-undef rules, then this case could be easily handled with SCEV. llvm-svn: 160421	2012-07-18 04:35:10 +00:00
Evan Cheng	e6a3b03ee0	Back out r160101 and instead implement a dag combine to recover from instcombine transformation. llvm-svn: 160387	2012-07-17 18:54:11 +00:00
Kostya Serebryany	986b8da500	[asan] more code to merge crash callbacks. Doesn't fully work yet, but allows running performance experiments llvm-svn: 160361	2012-07-17 11:04:12 +00:00
Andrew Trick	c803706c18	Reapply r160340. LSR: Limit CollectSubexprs. Speculatively fix crashes by code inspection. Can't reproduce them yet. llvm-svn: 160344	2012-07-17 05:30:37 +00:00
Andrew Trick	e834cb465a	Revert "LSR: try not to blow up solving combinatorial problems brute force." Some unit tests crashed on a different platform. llvm-svn: 160341	2012-07-17 05:05:21 +00:00
Andrew Trick	7cd6d426b3	LSR: try not to blow up solving combinatorial problems brute force. This places limits on CollectSubexprs to constrain the number of reassociation possibilities. It limits the recursion depth and skips over chains of nested recurrences outside the current loop. Fixes PR13361. Although underlying SCEV behavior is still potentially bad. llvm-svn: 160340	2012-07-17 05:00:56 +00:00
Nuno Lopes	482fb19fd5	fix PR13339 (remove the predecessor from the unwind BB when removing an invoke) llvm-svn: 160325	2012-07-16 22:49:40 +00:00
Kostya Serebryany	c4ce5dfe2d	[asan] a bit more refactoring, addressed some of the style comments from chandlerc, partially implemented crash callback merging (under flag) llvm-svn: 160290	2012-07-16 17:12:07 +00:00
Kostya Serebryany	874dae6119	[asan] refactor instrumentation to allow merging the crash callbacks (not fully implemented yet, no functionality change except the BB order) llvm-svn: 160284	2012-07-16 16:15:40 +00:00
Kostya Serebryany	4273bb05d1	[asan] initialize asan error callbacks in runOnModule instead of doing that on-demand llvm-svn: 160269	2012-07-16 14:09:42 +00:00
Chandler Carruth	8b540ab337	Revert r160254 temporarily. It turns out that ASan relied on the at-the-end block insertion order to (purely by happenstance) disable some LLVM optimizations, which in turn start firing when the ordering is made more "normal". These optimizations in turn merge many of the instrumentation reporting calls which breaks the return address based error reporting in ASan. We're looking at several different options for fixing this. llvm-svn: 160256	2012-07-16 10:01:02 +00:00
Chandler Carruth	3dd6c81492	Teach AddressSanitizer to create basic blocks in a more natural order. This is particularly useful to the backend code generators which try to process things in the incoming function order. Also, cleanup some uses of IRBuilder to be a bit simpler and more clear. llvm-svn: 160254	2012-07-16 08:58:53 +00:00
Chandler Carruth	36e2ecf528	Move llvm/Support/TypeBuilder.h -> llvm/TypeBuilder.h. This completes the move of *Builder classes into the Core library. No uses of this builder in Clang or DragonEgg I could find. If there is a desire to have an IR-building-support library that contains all of these builders, that can be easily added, but currently it seems likely that these add no real overhead to VMCore. llvm-svn: 160243	2012-07-15 23:45:24 +00:00
Chandler Carruth	ec7ad6561f	Move llvm/Support/MDBuilder.h to llvm/MDBuilder.h, to live with IRBuilder, DIBuilder, etc. This is the proper layering as MDBuilder can't be used (or implemented) without the Core Metadata representation. Patches to Clang and Dragonegg coming up. llvm-svn: 160237	2012-07-15 23:26:50 +00:00
Andrew Trick	653513b8dd	LSR Fix: check SCEV expression safety before expansion. All SCEV expressions used by LSR formulae must be safe to expand. i.e. they may not contain UDiv unless we can prove nonzero denominator. Fixes PR11356: LSR hoists UDiv. llvm-svn: 160205	2012-07-13 23:33:10 +00:00
Benjamin Kramer	abbfe69356	Make helper functions static. llvm-svn: 160173	2012-07-13 13:25:15 +00:00
Evan Cheng	493eb32ff4	Instcombine was transforming: %shr = lshr i64 %key, 3 %0 = load i64* %val, align 8 %sub = add i64 %0, -1 %and = and i64 %sub, %shr ret i64 %and to: %shr = lshr i64 %key, 3 %0 = load i64* %val, align 8 %sub = add i64 %0, 2305843009213693951 %and = and i64 %sub, %shr ret i64 %and The demanded bit optimization is actually a pessimization because add -1 would be codegen'ed as a sub 1. Teach the demanded constant shrinking optimization to check for negated constant to make sure it is actually reducing the width of the constant. rdar://11793464 llvm-svn: 160101	2012-07-12 01:45:35 +00:00
Nuno Lopes	95cc4f3cb5	instcombine: merge the functions that remove dead allocas and dead mallocs/callocs/... This patch removes ~70 lines in InstCombineLoadStoreAlloca.cpp and makes both functions a bit more aggressive than before :) In theory, we can be more aggressive when removing an alloca than a malloc, because an alloca pointer should never escape, but we are not taking advantage of this anyway llvm-svn: 159952	2012-07-09 18:38:20 +00:00
Nuno Lopes	fa0dffccee	teach instcombine to remove allocated buffers even if there are stores, memcpy/memmove/memset, and objectsize users. This means we can do cheap DSE for heap memory. Nothing is done if the pointer escapes or has a load. The churn in the tests is mostly due to objectsize, since we want to make sure we don't delete the malloc call before evaluating the objectsize (otherwise it becomes -1/0) llvm-svn: 159876	2012-07-06 23:09:25 +00:00
Kostya Serebryany	e36ae68803	[tsan] fix compile-time failure found while building Chromium with tsan (tsan issue #3 ). A unit test will follow separately. llvm-svn: 159736	2012-07-05 09:07:31 +00:00
Stepan Dyatkovskiy	7ff588f986	Reverted r156659, due to probable performance regressions, DenseMap should be used here: IntegersSubsetMapping - Replaced type of Items field from std::list with std::map. In the near future I'll test it with DenseMap and do the corresponding replacement if possible. llvm-svn: 159703	2012-07-04 05:53:05 +00:00
Nuno Lopes	1e8dffdf27	BoundsChecking: optimize out the check for offset < 0 if size is known to be >= 0 (signed). (LLVM optimizers cannot do this optimization by themselves) llvm-svn: 159668	2012-07-03 17:30:18 +00:00
Stepan Dyatkovskiy	8b0c97e0dd	Part of r159527. Split into a series of patches, with PR13256 fixed: IntegersSubsetMapping - Replaced type of Items field from std::list with std::map. In the near future I'll test it with DenseMap and do the corresponding replacement if possible. llvm-svn: 159659	2012-07-03 13:46:45 +00:00
Eric Christopher	b65acc61a5	Revert "IntRange:" as it appears to be breaking self hosting. This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c. llvm-svn: 159618	2012-07-02 23:22:21 +00:00
Duncan Sands	e8ce94fcd7	GlobalOpt forgot to handle bitcast when analyzing globals. Found by inspection. llvm-svn: 159546	2012-07-02 18:55:39 +00:00
Nuno Lopes	d0bcfe4d9d	fix the regression I introduced in r159385 (it's necessary to update PHI nodes in unwind BB) llvm-svn: 159534	2012-07-02 16:14:47 +00:00
Stepan Dyatkovskiy	8b9ecca42d	IntRange: - Changed isSingleNumber method behaviour. Now this flag is calculated on demand. IntegersSubsetMapping - Optimized diff operation. - Replaced type of Items field from std::list with std::map. - Added new methods: bool isOverlapped(self &RHS) void add(self& RHS, SuccessorClass S) void detachCase(self& NewMapping, SuccessorClass Succ) void removeCase(SuccessorClass Succ) SuccessorClass findSuccessor(const IntTy& Val) const IntTy* getCaseSingleNumber(SuccessorClass *Succ) IntegersSubsetTest - DiffTest: Added checks for successors. SimplifyCFG Updated SwitchInst usage (now it is case-ranges compatible) for - SimplifyEqualityComparisonWithOnlyPredecessor - FoldValueComparisonIntoPredecessors llvm-svn: 159527	2012-07-02 13:02:18 +00:00
Kostya Serebryany	eeaf688c0f	[asan] small code simplification llvm-svn: 159522	2012-07-02 11:42:29 +00:00
Bill Wendling	a164735baa	Don't reinsert the 'atexit' function if it already exists. llvm-svn: 159491	2012-06-30 20:21:19 +00:00
Nuno Lopes	7b12b87096	revert r159440. As Duncan pointed out, the test for invoke is not needed at this point llvm-svn: 159471	2012-06-29 22:10:10 +00:00
Benjamin Kramer	396b3adc10	CodeGenPrepare: Don't crash when TLI is not available. This happens when codegenprepare is invoked via opt. llvm-svn: 159457	2012-06-29 19:58:21 +00:00
Duncan Sands	9838286d9e	Rework this to clarify where the removal of nodes from the queue is really happening. No intended functionality change. llvm-svn: 159451	2012-06-29 19:03:05 +00:00
Nuno Lopes	b37ef71ce1	ignore 'invoke new' in isInstructionTriviallyDead, since most callers are not ready to handle invokes. instcombine will take care of this. llvm-svn: 159440	2012-06-29 17:37:07 +00:00
Duncan Sands	369c6d270b	Fix a reassociate crash on sozefx when compiling with dragonegg+gcc-4.7 due to the optimizers producing a multiply expression with more multiplications than the original (!). llvm-svn: 159426	2012-06-29 13:25:06 +00:00
Chandler Carruth	aafe0918bc	Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h This was always part of the VMCore library out of necessity -- it deals entirely in the IR. The .cpp file in fact was already part of the VMCore library. This is just a mechanical move. I've tried to go through and re-apply the coding standard's preferred header sort, but at 40-ish files, I may have gotten some wrong. Please let me know if so. I'll be committing the corresponding updates to Clang and Polly, and Duncan has DragonEgg. Thanks to Bill and Eric for giving the green light for this bit of cleanup. llvm-svn: 159421	2012-06-29 12:38:19 +00:00
Bill Wendling	f799efdedc	The DIBuilder class is just a wrapper around debug info creation (a.k.a. MDNodes). The module doesn't belong in Analysis. Move it to the VMCore instead. llvm-svn: 159414	2012-06-29 08:32:07 +00:00
Nuno Lopes	b97a4e8bc2	make simplifyCFG erase invokes to readonly/readnone functions llvm-svn: 159385	2012-06-28 22:32:27 +00:00
Nuno Lopes	9ac4661afa	make instcombine produce calls to llvm.donothing instead of a random intrinsic llvm-svn: 159384	2012-06-28 22:31:24 +00:00
Kostya Serebryany	c387ca7bab	[asan] set a hard limit on the number of instructions instrumented per BB. This is a (hopefully temporary) workaround for PR13225 llvm-svn: 159344	2012-06-28 09:34:41 +00:00
Hal Finkel	918ca2b8b7	Precompute SCEV pointer analysis prior to instruction fusion in BBVectorize. When both a load/store and its address computation are being vectorized, it can happen that the address-computation vectorization destroys SCEV's ability to analyze the relative pointer offsets. As a result (like with the aliasing analysis info), we need to precompute the necessary information prior to instruction fusing. This was found during stress testing (running through the test suite with a very low required chain length); unfortunately, I don't have a small test case. llvm-svn: 159332	2012-06-28 05:42:45 +00:00
Hal Finkel	0873d73cbf	Remove a useless check in BBVectorize. A shuffle mask will always be a constant, but I did not realize that when I originally wrote the code. llvm-svn: 159331	2012-06-28 05:42:43 +00:00
Hal Finkel	f2dcb9a9c4	Allow BBVectorize to form non-2^n-length vectors. The original algorithm only used recursive pair fusion of equal-length types. This is now extended to allow pairing of any types that share the same underlying scalar type. Because we would still generally prefer the 2^n-length types, those are formed first. Then a second set of iterations form the non-2^n-length types. Also, a call to SimplifyInstructionsInBlock has been added after each pairing iteration. This takes care of DCE (and a few other things) that make the following iterations execute somewhat faster. For the same reason, some of the simple shuffle-combination cases are now handled internally. There is some additional refactoring work to be done, but I've had many requests for this feature, so additional refactoring will come soon in future commits (as will additional test cases). llvm-svn: 159330	2012-06-28 05:42:42 +00:00
Hal Finkel	74e5225c92	Refactor operation equivalence checking in BBVectorize by extending Instruction::isSameOperationAs. Maintaining this kind of checking in different places is dangerous, extending Instruction::isSameOperationAs consolidates this logic into one place. Here I've added an optional flags parameter and two flags that are important for vectorization: CompareIgnoringAlignment and CompareUsingScalarTypes. llvm-svn: 159329	2012-06-28 05:42:26 +00:00
Bill Wendling	e38859dc8e	Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h. The reasoning is because the DebugInfo module is simply an interface to the debug info MDNodes and has nothing to do with analysis. llvm-svn: 159312	2012-06-28 00:05:13 +00:00
Matt Beaumont-Gay	a58862310c	Revert r159136 due to PR13124. Original commit message: If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it hidden. Being linkonce_odr guarantees that it is available in every dso that needs it. Being a constant/function with unnamed_addr guarantees that the copies don't have to be merged. llvm-svn: 159272	2012-06-27 17:10:33 +00:00
Duncan Sands	514db117bd	Some reassociate optimizations create new instructions, which they insert just before the expression root. Any existing operators that are changed to use one of them need to be moved between it and the expression root, and recursively for the operators using that one. When I rewrote RewriteExprTree I accidentally inverted the logic, resulting in the compacting going down from operators to operands rather than up from operands to the operators using them, oops. Fix this, resolving PR12963. llvm-svn: 159265	2012-06-27 14:19:00 +00:00
Evan Cheng	319be53a1f	Remove a instcombine transform that (no longer?) makes sense: // C - zext(bool) -> bool ? C - 1 : C if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1)) if (ZI->getSrcTy()->isIntegerTy(1)) return SelectInst::Create(ZI->getOperand(0), SubOne(C), C); This ends up forming sext i1 instructions that codegen to terrible code. e.g. int blah(_Bool x, _Bool y) { return (x - y) + 1; } => movzbl %dil, %eax movzbl %sil, %ecx shll $31, %ecx sarl $31, %ecx leal 1(%rax,%rcx), %eax ret Without the rule, llvm now generates: movzbl %sil, %ecx movzbl %dil, %eax incl %eax subl %ecx, %eax ret It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-). The transformation was done as part of Eli's r75531. He has given the ok to remove it. rdar://11748024 llvm-svn: 159230	2012-06-26 22:03:13 +00:00
Duncan Sands	8bc764aeca	Replacing zero-sized alloca's with a null pointer is too aggressive, instead merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS conformance testsuite. What happened there was that a variable sized object was being allocated on the stack, "alloca i8, i32 %size". It was then being passed to another function, which tested that the address was not null (raising an exception if it was) then manipulated %size bytes in it (load and/or store). The optimizers cleverly managed to deduce that %size was zero (congratulations to them, as it isn't at all obvious), which made the alloca zero size, causing the optimizers to replace it with null, which then caused the check mentioned above to fail, and the exception to be raised, wrongly. Note that no loads and stores were actually being done to the alloca (the loop that does them is executed %size times, i.e. is not executed), only the not-null address check. llvm-svn: 159202	2012-06-26 13:39:21 +00:00
Nuno Lopes	31b54a5379	revert my previous commit (r159173), since as Eli pointed out, it's perfectly ok to mark realloc as noalias llvm-svn: 159175	2012-06-25 23:26:10 +00:00
Nuno Lopes	75eaa72de9	do not set realloc() as NotAlias, since it can return the same pointer. This whole thing should be upgraded to use the MemoryBuiltin interface anyway.. llvm-svn: 159173	2012-06-25 22:55:50 +00:00
Dan Gohman	5f725cd196	Fix the objc_autoreleasedReturnValue optimization code to locate the call correctly even in the case where it is an invoke. This fixes rdar://11714057. llvm-svn: 159157	2012-06-25 19:47:37 +00:00
Nuno Lopes	07594cba7c	improve optimization of invoke instructions: - simplifycfg: invoke undef/null -> unreachable - instcombine: invoke new -> invoke expect(0, 0) (an arbitrary NOOP intrinsic; only done if the allocated memory is unused, of course) - verifier: allow invoke of intrinsics (to make the previous step work) llvm-svn: 159146	2012-06-25 17:11:47 +00:00
Rafael Espindola	540c3d23df	If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it hidden. Being linkonce_odr guarantees that it is available in every dso that needs it. Being a constant/function with unnamed_addr guarantees that the copies don't have to be merged. llvm-svn: 159136	2012-06-25 14:30:31 +00:00
Eli Bendersky	f0ad3606c7	The name (and comment describing) of llvm::GetFirstDebuigLocInBasicBlock no longer represents what the function does. Therefore, the function is removed and its functionality is folded into the only place in the code-base where it was being used. llvm-svn: 159133	2012-06-25 10:13:14 +00:00
NAKAMURA Takumi	704de074b8	llvm/lib: [CMake] Add explicit dependency to intrinsics_gen. llvm-svn: 159112	2012-06-24 13:32:01 +00:00
Hal Finkel	3099ce9489	Allow controlling vectorization of boolean values separately from other integer types. These are used as the result of comparisons, and often handled differently from larger integer types. llvm-svn: 159111	2012-06-24 13:28:01 +00:00
Nick Lewycky	0a045bbe4e	Remove dyn_cast + dereference pattern by replacing it with a cast and changing the safety check to look for the same type we're going to actually cast to. Fixes PR13180! llvm-svn: 159110	2012-06-24 10:15:42 +00:00
Nick Lewycky	b74ae9c5b2	Tab to spaces. No functionality change. llvm-svn: 159104	2012-06-24 04:07:14 +00:00
Nick Lewycky	bfb07fb562	Remove a dangling reference to a deleted instruction. Fixes PR13185! llvm-svn: 159096	2012-06-24 01:44:08 +00:00
Hal Finkel	4b06b1a0ee	Allow BBVectorize to fuse compare instructions. llvm-svn: 159088	2012-06-23 21:52:50 +00:00
Hans Wennborg	cbe34b4cc9	Extend the IL for selecting TLS models (PR9788) This allows the user/front-end to specify a model that is better than what LLVM would choose by default. For example, a variable might be declared as @x = thread_local(initialexec) global i32 42 if it will not be used in a shared library that is dlopen'ed. If the specified model isn't supported by the target, or if LLVM can make a better choice, a different model may be used. llvm-svn: 159077	2012-06-23 11:37:03 +00:00
Stepan Dyatkovskiy	8e00efeace	Optimized usage of new SwitchInst case values (IntegersSubset type) in Local.cpp, Execution.cpp and BitcodeWriter.cpp. I got about 1% of compile-time improvement on my machines (Ubuntu 11.10 i386 and Ubuntu 12.04 x64). llvm-svn: 159076	2012-06-23 10:58:58 +00:00
Nuno Lopes	de8c6fb24f	BoundsChecking: attach debug info to traps to make my life a bit more sane llvm-svn: 159055	2012-06-23 00:12:34 +00:00
Jakob Stoklund Olesen	c5c4e96f3e	Revert remaining part of r93200: "Disable folding sext(trunc(x)) -> x" This fixes PR5997. These transforms were disabled because codegen couldn't deal with other uses of trunc(x). This is now handled by the peephole pass. This causes no regressions on x86-64. llvm-svn: 159003	2012-06-22 16:36:43 +00:00
Stepan Dyatkovskiy	a6c8cc307b	Fixed r158979. Original message: Performance optimizations: - SwitchInst: case values stored separately from Operands List. This allows faster access to individual case value numbers or ranges. - Optimized IntItem, added APInt value caching. - Optimized IntegersSubsetGeneric: added optimizations for cases when subset is single number or when subset consists of single numbers only. llvm-svn: 158997	2012-06-22 14:53:30 +00:00
Nuno Lopes	0b60ebbf79	fix whitespace in my last commit. sorry for the churn :S enough for today; going to sleep. llvm-svn: 158953	2012-06-22 00:29:58 +00:00
Nuno Lopes	9792d68381	remove extractMallocCallFromBitCast, since it was tailor-made for its sole user. Update GlobalOpt accordingly. llvm-svn: 158952	2012-06-22 00:25:01 +00:00
Nuno Lopes	771e7bd4ba	instcombine: disable optimization of 'invoke null/undef'. I'll move this functionality to SimplifyCFG (since we cannot make changes to the CFG here). Fixes the crashes with the attached test case llvm-svn: 158951	2012-06-21 23:52:14 +00:00
Evan Cheng	32c7cc8ec9	Look past zext to strength reduce a udiv. Patch by David Majnemer. rdar://11721329 llvm-svn: 158946	2012-06-21 22:52:49 +00:00
Nuno Lopes	dc6085e52d	Add support for invoke to the MemoryBuiltin analysis. Update comments accordingly. Make instcombine remove useless invokes to C++'s 'new' allocation function (test attached). llvm-svn: 158937	2012-06-21 21:25:05 +00:00
Nuno Lopes	0e967e0186	port the BoundsChecking patch to the new MemoryBuiltin API (i.e., remove most of the code from here). Remove the alloc_size.ll test until we settle on a metadata format that makes everyone happy.. llvm-svn: 158920	2012-06-21 15:59:53 +00:00
Nuno Lopes	55fff83422	refactor the MemoryBuiltin analysis: - provide a more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc) - provide an API to compute the size and offset of the pointed-to object. Move a few clients (GVN, AA, instcombine, ...) to the new API. This implementation is a lot more aggressive than each of the custom implementations being replaced. Patch reviewed by Nick Lewycky and Chandler Carruth, thanks. llvm-svn: 158919	2012-06-21 15:45:28 +00:00
Nadav Rotem	4e9012c2b1	Add a number of threshold arguments to the SRA pass. A patch by Tom Stellard with minor changes. llvm-svn: 158918	2012-06-21 13:44:31 +00:00
Nuno Lopes	3fa32f2452	replace usage of EmitGEPOffset() with TargetData::getIndexedOffset() when the GEP offset is known to be constant. With this change, we avoid relying on the IR Builder to constant fold the operations. No functionality change intended. llvm-svn: 158829	2012-06-20 17:30:51 +00:00
Chandler Carruth	c60fbe6b58	Fix two rather subtle internal vs. external linker issues. I'll admit I'm not entirely satisfied with this change, but it seemed the cleanest option. Other suggestions quite welcome. The issue is that the traits specializations have static methods which return the typedef'ed PHI_iterator type. In both the IR and MI layers this is typedef'ed to a custom iterator class defined in an anonymous namespace, giving the types and the functions returning them internal linkage. However, the traits specialization is defined in the 'llvm' namespace (where it has to be, since the specialized template lives there), and is in turn used in the templated implementation of the SSAUpdater. This led to the linkage conflict that Clang now warns about. The simplest solution to me was just to define the PHI_iterator as a nested class inside the trait specialization. That way it still doesn't get scoped widely, it can't be accidentally reused somewhere, etc. This is a little gross just because nested class definitions are a little gross, but the alternatives seem more ad-hoc. llvm-svn: 158799	2012-06-20 08:39:30 +00:00
Pete Cooper	33ee6c9bf1	Now that SROA can form alloca's for dynamic vector accesses, further improve it to be able to replace operations on these vector alloca's with insert/extract element insts llvm-svn: 158623	2012-06-17 03:58:26 +00:00
Hal Finkel	fa103d3fc7	Teach BBVectorize to combine, when possible, or discard metadata when fusing instructions. The present implementation handles only TBAA and FP metadata, discarding everything else. For debug metadata, the current behavior is maintained (the debug metadata associated with one of the instructions will be kept, discarding that attached to the other). This should address PR 13040. llvm-svn: 158606	2012-06-16 20:34:06 +00:00
Hal Finkel	16ddd4b66b	Move the Metadata merging methods from GVN and make them public in MDNode. There are other passes, BBVectorize specifically, that also need some of this functionality. llvm-svn: 158605	2012-06-16 20:33:37 +00:00
Evan Cheng	773b2cd63c	It's not deterministic to iterate over SmallPtrSet. Replace it with SmallSetVector. Patch by Daniel Reynaud. rdar://11671029 llvm-svn: 158594	2012-06-16 04:28:11 +00:00
Pete Cooper	818e9f4a26	Fix crash from r158529 on Bullet. Dynamic GEPs created by SROA needed to insert extra "i32 0" operands to index through structs and arrays to get to the vector being indexed. llvm-svn: 158590	2012-06-16 01:43:26 +00:00
Andrew Trick	8370c7c38f	LSR: fix expansion of scaled reg in non-address type formulae. For non-address users, Base and Scaled registers are not specially associated to fit an address mode, so SCEVExpander should apply normal expansion rules. Otherwise we may sink computation into inner loops that have already been optimized. llvm-svn: 158537	2012-06-15 20:07:29 +00:00
Andrew Trick	aca8fb3c45	LSR fix: "Special" users are just like "Basic" users but allow -1 scale. llvm-svn: 158536	2012-06-15 20:07:26 +00:00
Pete Cooper	e24d6a19e3	Allow SROA to split up an array of vectors into multiple vectors, even when the vectors are dynamically indexed llvm-svn: 158529	2012-06-15 18:07:29 +00:00
Rafael Espindola	1821c6c3b0	Some optimizations done by globalopt are safe only for internal linkage, not linkonce linkage. For example, it is not valid to add unnamed_addr. This also fixes a crash in g++.dg/opt/static5.C. llvm-svn: 158528	2012-06-15 18:00:24 +00:00
Duncan Sands	7838603ffc	Fix issues (infinite loop and/or crash) with self-referential instructions, for example degenerate phi nodes and binops that use themselves in unreachable code. Thanks to Charles Davis for the testcase that uncovered this can of worms. llvm-svn: 158508	2012-06-15 08:37:50 +00:00
Pete Cooper	1d1fa72837	Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct llvm-svn: 158479	2012-06-14 23:53:53 +00:00
Rafael Espindola	def1b09be2	Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and globaldce. Globaldce was already removing linkonce globals, but globalopt was not. llvm-svn: 158476	2012-06-14 22:48:13 +00:00
Pete Cooper	5d19452f3f	Revert r158454: Allow SROA to look at a vector type... It's breaking the vectorise buildbot. This reverts commit 12c1f86ffa731e2952c80d2cc577000c96b8962c. llvm-svn: 158462	2012-06-14 18:32:52 +00:00
Pete Cooper	a7e6d58a87	Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct llvm-svn: 158454	2012-06-14 16:38:13 +00:00
Manman Ren	c2bc2d106b	InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y). uno && ueq was converted to ueq, it should be converted to uno. llvm-svn: 158441	2012-06-14 05:57:42 +00:00
Pete Cooper	e2fe809772	Revert "Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access" This reverts commit 51786e0aaec76b973205066bd44f7f427b21969f. llvm-svn: 158408	2012-06-13 17:55:22 +00:00
Pete Cooper	e1d4e8b563	Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access llvm-svn: 158407	2012-06-13 17:30:34 +00:00
Duncan Sands	409d8ae165	It is possible for several constants which aren't individually absorbing to combine to the absorbing element. Thanks to nbjoerg on IRC for pointing this out. llvm-svn: 158399	2012-06-13 12:15:56 +00:00
Duncan Sands	318a89ddac	When linearizing a multiplication, return at once if we see a factor of zero, since then the entire expression must equal zero (similarly for other operations with an absorbing element). With this in place a bunch of reassociate code for handling constants is dead since it is all taken care of when linearizing. No intended functionality change. llvm-svn: 158398	2012-06-13 09:42:13 +00:00
Manman Ren	d33f4efbfd	SimplifyCFG: fold unconditional branch to its predecessor if profitable. This patch extends FoldBranchToCommonDest to fold unconditional branches. For unconditional branches, we fold them if it is easy to update the phi nodes in the common successors. rdar://10554090 llvm-svn: 158392	2012-06-13 05:43:29 +00:00
Duncan Sands	72aea01b6e	Use DenseMap as SmallMap workaround rather than std::map, at Chandler's request. llvm-svn: 158371	2012-06-12 20:26:43 +00:00
Duncan Sands	67cd591989	Use std::map rather than SmallMap because SmallMap assumes that the value has POD type, causing memory corruption when mapping to APInts with bitwidth > 64. Merge another crash testcase into crash.ll while there. llvm-svn: 158369	2012-06-12 20:16:51 +00:00
Duncan Sands	d7aeefebd6	Now that Reassociate's LinearizeExprTree can look through arbitrary expression topologies, it is quite possible for a leaf node to have huge multiplicity, for example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x raised to a vast power (the multiplicity, or weight, of x). This patch fixes the computation of weights by correctly computing them no matter how big they are, rather than just overflowing and getting a wrong value. It turns out that the weight for a value never needs more bits to represent than the value itself, so it is enough to represent weights as APInts of the same bitwidth and do the right overflow-avoiding dance steps when computing weights. As a side-effect it reduces the number of multiplies needed in some cases of large powers. While there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree static, pushing the rank computation out into users. This is progress towards fixing PR13021. llvm-svn: 158358	2012-06-12 14:33:56 +00:00
Benjamin Kramer	2150145ae4	InstCombine: factor code better. No functionality change. llvm-svn: 158301	2012-06-11 08:01:25 +00:00
Benjamin Kramer	8b8a76974f	InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare. This saves a cast, and zext is more expensive on platforms with subreg support than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750. On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the same performance now when not inlining either function. stupid_memchr: 323.0us bsd_memchr: 321.0us memchr: 479.0us where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time, I haven't fully understood the issue yet, something is grossly mangling the loop after inlining. llvm-svn: 158297	2012-06-10 20:35:00 +00:00
Dmitri Gribenko	dbeafa773a	Convert comments to proper Doxygen comments. llvm-svn: 158248	2012-06-09 00:01:45 +00:00
Nuno Lopes	2710f1b049	canonicalize: -%a + 42 into 42 - %a previously we were emitting: -(%a + 42) This fixes the infinite loop in PR12338. The generated code is still not perfect, though. Will work on that next llvm-svn: 158237	2012-06-08 22:30:05 +00:00
Duncan Sands	3293f460e7	Reapply commit 158073 with a fix (the testcase was already committed). The problem was that by moving instructions around inside the function, the pass could accidentally move the iterator being used to advance over the function too. Fix this by only processing the instruction equal to the iterator, and leaving processing of instructions that might not be equal to the iterator to later (later = after traversing the basic block; it could also wait until after traversing the entire function, but this might make the sets quite big). Original commit message: Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. llvm-svn: 158226	2012-06-08 20:15:33 +00:00
Nuno Lopes	4b68c1da54	BoundsChecking: add support for ConstantPointerNull. fixes a bunch of instrumentation failures in loops with reallocs llvm-svn: 158210	2012-06-08 16:31:42 +00:00
Duncan Sands	9a5cf92250	Revert commit 158073 while waiting for a fix. The issue is that reassociate can move instructions within the instruction list. If the instruction just happens to be the one the basic block iterator is pointing to, and it is moved to a different basic block, then we get into an infinite loop due to the iterator running off the end of the basic block (for some reason this doesn't fire any assertions). Original commit message: Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. llvm-svn: 158199	2012-06-08 13:37:30 +00:00
Nadav Rotem	4e50efead6	Fix a bug in FoldSelectOpOp. Bitcast ops may change the number of vector elements, which may disagree with the select condition type. llvm-svn: 158166	2012-06-07 20:28:57 +00:00
Benjamin Kramer	628a39faa3	Remove unused private fields found by clang's new -Wunused-private-field. There are some that I didn't remove this round because they looked like obvious stubs. There are dead variables in gtest too, they should be fixed upstream. llvm-svn: 158090	2012-06-06 18:25:08 +00:00
Chad Rosier	faa3894628	Fix combine of uno && ord -> false so that the ordering of the fcmps doesn't matter. rdar://11579835 llvm-svn: 158084	2012-06-06 17:22:40 +00:00
Duncan Sands	763da45e9e	Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. llvm-svn: 158073	2012-06-06 14:53:10 +00:00
Andrew Trick	a6fb910fad	LoopUnroll: always check for NULL LoopPassManager llvm-svn: 158007	2012-06-05 17:51:05 +00:00
Rafael Espindola	47d988c54c	When gvn decides to replace an instruction with another, we have to patch the replacement to make it at least as generic as the instruction being replaced. This includes: * dropping nsw/nuw flags * getting the least restrictive tbaa and fpmath metadata * merging ranges Fixes PR12979. llvm-svn: 157958	2012-06-04 22:44:21 +00:00
Benjamin Kramer	bde9176663	Fix typos found by http://github.com/lyda/misspell-check llvm-svn: 157885	2012-06-02 10:20:22 +00:00
Stepan Dyatkovskiy	0e46d8a08c	PR1255: case ranges. IntRange converted from struct to class. So main change everywhere is replacement of ".Low/High" with ".getLow/getHigh()" llvm-svn: 157884	2012-06-02 09:42:43 +00:00
Bill Wendling	e85f34969e	Register the gcov "writeout" at init time. Don't list this as a d'tor. Instead, inject some code in that will run via the "__mod_init_func" method that registers the gcov "writeout" function to execute at exit time. The problem is that the "__mod_term_func" method of specifying d'tors is deprecated on Darwin. And it can lead to some ambiguities when dealing with multiple libraries. <rdar://problem/11110106> llvm-svn: 157852	2012-06-01 23:14:32 +00:00
Nuno Lopes	adf1c859dd	BoundsChecking: fix a bug when the handling of recursive PHIs failed and could leave dangling references in the cache add regression tests for this problem. Can already compile & run: PHP, PCRE, and ICU (i.e., all the software I tried) llvm-svn: 157822	2012-06-01 17:43:31 +00:00
Nuno Lopes	288e86ff6b	add -bounds-checking-multiple-traps option to make one trap BB per check. disabled by default for now; we can discuss the default value (& name) later llvm-svn: 157777	2012-05-31 22:58:48 +00:00
Nuno Lopes	7d00061d87	revamp BoundsChecking considerably: - compute size & offset at the same time. The side-effects of this are that we now support negative GEPs. It's now approaching a phase that it can be reused by other passes (e.g., lowering of the objectsize intrinsic) - use APInt throughout to handle wrap-arounds - add support for PHI instrumentation - add a cache (required for recursive PHIs anyway) - remove hoisting support for now, since it was wrong in a few cases sorry for the churn here.. tests will follow soon. llvm-svn: 157775	2012-05-31 22:45:40 +00:00
Duncan Sands	339bb61e32	Enhance the sinking code to handle diamond patterns. Patch by Carlo Alberto Ferraris. llvm-svn: 157736	2012-05-31 08:09:49 +00:00
Kostya Serebryany	9024160439	[asan] instrument cmpxchg and atomicrmw llvm-svn: 157683	2012-05-30 09:04:06 +00:00
Nuno Lopes	8bd45f8ecd	bounds checking: - hoist checks out of loops where SCEV is smart enough - add additional statistics to measure how much we lose by not supporting interprocedural analysis and pointers loaded from memory llvm-svn: 157649	2012-05-29 22:32:51 +00:00
Stepan Dyatkovskiy	58107dd547	ConstantRangesSet renamed to IntegersSubset. CRSBuilder renamed to IntegersSubsetMapping. llvm-svn: 157612	2012-05-29 12:26:47 +00:00
Benjamin Kramer	9d5849f51d	Fix suspicious hasOneUse() check, found by PVS Studio (PR12357). llvm-svn: 157592	2012-05-28 20:52:48 +00:00
Benjamin Kramer	b8743a9150	InstCombine: Fix infinite loop when encountering switch on trivial icmp. The test case feeds the following into InstCombine's visitSelect: %tobool8 = icmp ne i32 0, 0 %phitmp = select i1 %tobool8, i32 3, i32 0 Then instcombine replaces the right side of the switch with 0, doesn't notice that nothing changes and tries again indefinitely. This fixes PR12897. llvm-svn: 157587	2012-05-28 19:18:16 +00:00
Stepan Dyatkovskiy	e3e19cbb13	PR1255: Case Ranges. Implemented IntItem, a wrapper around APInt. Why not use APInt directly right now? 1. It would be very difficult to implement case ranges as a series of small patches. We would get several large and heavy patches, each about 90-120 kb. If you replace ConstantInt with APInt in SwitchInst, you need to change all Readers, Writers and absolutely all passes that use SwitchInst at the same time. 2. We can implement an APInt pool inside and save memory space. E.g. we use several switches that work with 256-bit items (switch on signatures, or strings). We can avoid value duplicates in this case. 3. IntItem can easily be replaced with APInt. 4. Currently we can interpret IntItem both as ConstantInt and as APInt. This allows us to provide SwitchInst methods that work with ConstantInt for non-updated passes. Why do I need it right now? Currently I need to update the SimplifyCFG pass (EqualityComparisons). I need to work with APInts directly a lot, so pieces of code like ConstantInt *V = ...; if (V->getValue().ugt(AnotherV->getValue())) { ... } will look awful. It is much better this way: IntItem V = ConstantIntVal->getValue(); if (AnotherV < V) { } Of course any reviews are welcome. P.S.: I'm also going to rename ConstantRangesSet to IntegersSubset, and CRSBuilder to IntegersSubsetMapping (which allows mapping individual subsets of integers to BasicBlocks). Since in future these classes will be founded on APInt, it will be possible to use them in more generic ways. llvm-svn: 157576	2012-05-28 12:39:09 +00:00
Bill Wendling	1560517ec3	Implement the indirect counter increment code in a better way. Instead of replicating the code for every place it's needed, we instead generate a function that does that for us. This function is local to the executable, so there shouldn't be any writing violations. llvm-svn: 157564	2012-05-28 06:10:56 +00:00
Chris Lattner	3cb6f83ebb	switch AttrListPtr::get to take an ArrayRef, simplifying a lot of clients. llvm-svn: 157556	2012-05-28 01:47:44 +00:00
Benjamin Kramer	152f106e5f	PR12967: Don't crash when trying to fold a shift that's larger than the type's size. llvm-svn: 157548	2012-05-27 22:03:32 +00:00
Chris Lattner	144b619684	Reimplement the intrinsic verifier to use the same table as Intrinsic::getDefinition, making it stronger and more sane. Delete the code from tblgen that produced the old code. Besides being a path forward in intrinsic sanity, this also eliminates a bunch of machine generated code that was compiled into Function.o llvm-svn: 157545	2012-05-27 19:37:05 +00:00
Duncan Sands	3c05cd3ea8	Since commit 157467, if reassociate isn't actually going to change an expression then it doesn't alter the instructions composing it, however it would continue to move the instructions to just before the expression root. Ensure it doesn't move them either, so now it really does nothing if there is nothing to do. That commit also ensured that nsw etc flags weren't cleared if the expression was not being changed. Tweak this a bit so that it doesn't clear flags on the initial part of a computation either if that part didn't change but later bits did. llvm-svn: 157518	2012-05-26 16:42:52 +00:00
Benjamin Kramer	58abf4f193	SimplifyCFG: Turn the ad-hoc std::pair that represents switch cases into an explicit struct. llvm-svn: 157516	2012-05-26 14:29:37 +00:00
Benjamin Kramer	65e75666ff	Add support for branch weight metadata to MDBuilder and use it in various places. llvm-svn: 157515	2012-05-26 13:59:43 +00:00
Duncan Sands	c94ac6fdf6	Move this debug statement earlier so it is easy to see the order in which operands come flying out of the linearization stage. llvm-svn: 157512	2012-05-26 07:47:48 +00:00
Bill Wendling	8ed0749a34	The llvm_gcda_increment_indirect_counter function writes to the arguments that are passed in. However, those arguments may be in a write-protected area, as far as the runtime library is concerned. For instance, the data could be placed into a 'linkedit' section, which isn't writable. Emit the code from llvm_gcda_increment_indirect_counter directly into the function instead. Note: The code for this is ugly, and can lead to bloat. We should look into simplifying this code instead of having all of these branches. <rdar://problem/11181370> llvm-svn: 157505	2012-05-25 23:55:00 +00:00
Nuno Lopes	e9b0bdf804	bounds checking: add support for byval arguments llvm-svn: 157498	2012-05-25 21:15:17 +00:00
Nuno Lopes	a6da3ff896	boundschecking: add support for select; add experimental support for alloc_size metadata llvm-svn: 157481	2012-05-25 16:54:04 +00:00
Duncan Sands	bddfb2f96b	Make the reassociation pass more powerful so that it can handle expressions with arbitrary topologies (previously it would give up when hitting a diamond in the use graph for example). The testcase from PR12764 is now reduced from a pile of additions to the optimal 1617*%x0+208. In doing this I changed the previous strategy of dropping all uses for expression leaves to one of dropping all but one use. This works out more neatly (but required a bunch of tweaks) and is also safer: some recently fixed bugs during recursive linearization were because the linearization code thinks it completely owns a node if it has no uses outside the expression it is linearizing. But if the node was also in another expression that had been linearized (and thus all uses of the node from that expression dropped) then the conclusion that it is completely owned by the expression currently being linearized is wrong. Keeping one use from within each linearized expression avoids this kind of mistake. llvm-svn: 157467	2012-05-25 12:03:02 +00:00
Stepan Dyatkovskiy	183d18aa5a	PR1255 related changes (case ranges): LowerSwitch::Clusterify : main functionality was replaced with CRSBuilder::optimize, so a big part of Clusterify's code was removed. test/Transform/LowerSwitch/feature.ll - this test was refactored: grep + count was replaced with FileCheck usage. llvm-svn: 157384	2012-05-24 09:33:20 +00:00
Nuno Lopes	10287d839f	BoundsChecking: add a couple of simple tests and fix a bug in branch emission llvm-svn: 157329	2012-05-23 16:24:52 +00:00
Patrik Hägglund	8a1e316c15	Fix the inliner so that the optsize function attribute doesn't alter the inline threshold if the global inline threshold is lower (as for -Oz). Reviewed by Chandler Carruth and Bill Wendling. llvm-svn: 157323	2012-05-23 13:42:57 +00:00
Evgeniy Stepanov	617232f32b	Use zero-based shadow by default on Android. llvm-svn: 157317	2012-05-23 11:52:12 +00:00
Stepan Dyatkovskiy	7a50155227	PR1255(case ranges) related changes in Local Transformations. llvm-svn: 157315	2012-05-23 08:18:26 +00:00
Nuno Lopes	59e9df773a	address some of John Criswell's comments teach computeAllocSize about realloc, reallocf, and valloc llvm-svn: 157298	2012-05-22 22:02:19 +00:00
Nuno Lopes	eee43e1bc7	hopefully fix the CMake build. sorry for breakage llvm-svn: 157264	2012-05-22 17:40:46 +00:00
Nuno Lopes	a2f6cecb6d	add a new pass to instrument loads and stores for run-time bounds checking move EmitGEPOffset from InstCombine to Transforms/Utils/Local.h (a draft of this) patch reviewed by Andrew, thanks. llvm-svn: 157261	2012-05-22 17:19:09 +00:00
Nuno Lopes	ad40c0a425	revert my previous patches that introduced an additional parameter to the objectsize intrinsic. After a lot of discussion, we realized it's not the best option for run-time bounds checking llvm-svn: 157255	2012-05-22 15:25:31 +00:00
Duncan Sands	4df5e96d3a	Fix PR12858, a crash due to GVN's PRE not fully removing an instruction from the leader table. That's because it wasn't expecting instructions to turn up as leader for a value number that is not its own, but equality propagation could create this situation. One solution is to have the leader table use a WeakVH but this slows down GVN by about 5%. Instead just have equality propagation not add instructions to the leader table, only constants and arguments. In theory this might cause GVN to run more (each time it changes something it runs again) but it doesn't seem to occur enough to cause a slow down. llvm-svn: 157251	2012-05-22 14:17:53 +00:00
Dan Gohman	9c97eea0fd	Mark an unreachable region of code with llvm_unreachable. llvm-svn: 157197	2012-05-21 17:41:28 +00:00
Peter Collingbourne	9a03c73297	Do not pass an invalid domtree to SimplifyInstruction from LoopUnswitch. Fixes PR12887. llvm-svn: 157140	2012-05-20 01:32:09 +00:00
Peter Collingbourne	97b1076435	Do not eliminate allocas whose alignment exceeds that of the copied-in constant, as a subsequent user may rely on over alignment. Fixes PR12885. llvm-svn: 157134	2012-05-19 22:52:10 +00:00
Dan Gohman	14862c3141	Fix replacing all the users of objc weak runtime routines when deleting them. rdar://11434915. llvm-svn: 157080	2012-05-18 22:17:29 +00:00
David Majnemer	a9330fe553	Teach SimplifyLibCalls about stpcpy. llvm-svn: 156815	2012-05-15 11:46:21 +00:00
Chad Rosier	a968caf8e0	Move the capture analysis from MemoryDependencyAnalysis to a more general place so that it can be reused in MemCpyOptimizer. This analysis is needed to remove an unnecessary memcpy when returning a struct into a local variable. rdar://11341081 PR12686 llvm-svn: 156776	2012-05-14 20:35:04 +00:00
Jay Foad	ca0c499609	Teach Function::hasAddressTaken that BlockAddress doesn't really take the address of a function. llvm-svn: 156703	2012-05-12 08:30:16 +00:00
Nuno Lopes	e2cfd3ce95	objectsize: add a few more tests and fix a bug llvm-svn: 156625	2012-05-11 18:25:29 +00:00
Eli Friedman	e0a64d83fc	Fix a minor logic mistake transforming compares in instcombine. PR12514. llvm-svn: 156600	2012-05-11 01:32:59 +00:00
Nuno Lopes	f573030391	objectsize: add support for GEPs with non-constant indexes add an additional parameter to InstCombiner::EmitGEPOffset() to force it to not emit operations with NUW flag llvm-svn: 156585	2012-05-10 23:17:35 +00:00
Dan Gohman	ed7c24e2d9	Teach DeadStoreElimination to eliminate exit-block stores with phi addresses. llvm-svn: 156558	2012-05-10 18:57:38 +00:00
Nuno Lopes	300d629924	teach DSE and isInstructionTriviallyDead() about calloc llvm-svn: 156553	2012-05-10 17:14:00 +00:00
Dan Gohman	f8b19d09ba	Fix the objc_storeStrong recognizer to stop before walking off the end of a basic block if there's no store. llvm-svn: 156520	2012-05-09 23:08:33 +00:00
Nuno Lopes	7100f463b0	objectsize: refactor code a bit to enable future changes to support run-time information add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs) llvm-svn: 156515	2012-05-09 21:30:57 +00:00
Craig Topper	28540adfcf	Remove unused variable to get rid of warning. llvm-svn: 156466	2012-05-09 07:08:58 +00:00
Dan Gohman	41375a3545	Miscellaneous accumulated cleanups. llvm-svn: 156445	2012-05-08 23:39:44 +00:00
Dan Gohman	61708d37d6	Fix objc_storeStrong pattern matching to catch a potential use of the old value after the store but before it is released. This fixes rdar:/11116986. llvm-svn: 156442	2012-05-08 23:34:08 +00:00
Duncan Sands	3bbb1d50df	Calling ReassociateExpression recursively is extremely dangerous since it will replace the operands of expressions with only one use with undef and generate a new expression for the original without using RAUW to update the original. Thus any copies of the original expression held in a vector may end up referring to some bogus value - and using a ValueHandle won't help since there is no RAUW. There is already a mechanism for getting the effect of recursion non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't being used systematically. Have various places where recursion had snuck in at some point use the RedoInsts mechanism instead. Fixes PR12169. llvm-svn: 156379	2012-05-08 12:16:05 +00:00
Andrew Trick	d29cd732d4	Allow NULL LoopPassManager argument in UnrollLoop. PR12734. llvm-svn: 156358	2012-05-08 02:52:09 +00:00
Owen Anderson	f4f80e1f39	Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. This allows for greater CSE opportunities. llvm-svn: 156323	2012-05-07 20:47:23 +00:00
Benjamin Kramer	3d38c17b59	Switch the select to branch transformation on by default. The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! llvm-svn: 156258	2012-05-06 14:25:16 +00:00
Jakub Staszak	cfc46f82ff	Remove trailing spaces. llvm-svn: 156257	2012-05-06 13:52:31 +00:00
Benjamin Kramer	047d7ca0b1	CodeGenPrepare: Add a transform to turn selects into branches in some cases. This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0; cmovbel %edx, %esi. cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic: it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-order architectures are unlikely to benefit either; those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the microarchitecture used. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) llvm-svn: 156234	2012-05-05 12:49:22 +00:00
Stepan Dyatkovskiy	cb2a1a34e2	Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw". Also added fix to 2011-06-13-nsw-alloca.ll test. llvm-svn: 156231	2012-05-05 07:09:40 +00:00
Chandler Carruth	6781821c01	Teach the code extractor how to extract a sequence of blocks from RegionInfo's RegionNode. This mirrors the logic for automating the extraction from a Loop. llvm-svn: 156208	2012-05-04 21:33:30 +00:00
Chandler Carruth	14316fcf7d	Factor the computation of input and output sets into a public interface of the CodeExtractor utility. This allows speculatively computing input and output sets to measure the likely size impact of the code extraction. These sets cannot be reused sadly -- we mutate the function prior to forming the final sets used by the actual extraction. The interface has been revamped slightly to make it easier to use correctly by making the interface const and sinking the computation of the number of exit blocks into the full extraction function and away from the rest of this logic which just computed two output parameters. llvm-svn: 156168	2012-05-04 11:20:27 +00:00
Chandler Carruth	44e13911bc	Rather than trying to gracefully handle input sequences with repeated blocks, assert that this doesn't happen. We don't want to bother trying to support this call pattern as it isn't necessary. llvm-svn: 156167	2012-05-04 11:17:06 +00:00
Chandler Carruth	0a570552d1	Fix a goof with my previous commit by completely returning when we detect an in-eligible block rather than just breaking out of the loop. llvm-svn: 156166	2012-05-04 11:14:19 +00:00
Chandler Carruth	2f5d0191f7	Hoist a safety assert from the extraction method into the construction of the extractor itself. llvm-svn: 156164	2012-05-04 10:26:45 +00:00
Chandler Carruth	0fde00150d	Move the CodeExtractor utility to a dedicated header file / source file, and expose it as a utility class rather than as free function wrappers. The simple free-function interface works well for the bugpoint-specific pass's uses of code extraction, but in an upcoming patch for more advanced code extraction, they simply don't expose a rich enough interface. I need to expose various stages of the process of doing the code extraction and query information to decide whether or not to actually complete the extraction or give up. Rather than build up a new predicate model and pass that into these functions, just take the class that was actually implementing the functions and lift it up into a proper interface that can be used to perform code extraction. The interface is cleaned up and re-documented to work better in a header. It also is now set up to accept the blocks to be extracted in the constructor rather than in a method. In passing this essentially reverts my previous commit here exposing a block-level query for eligibility of extraction. That is no longer necessary with the richer interface as clients can query the extraction object for eligibility directly. This will reduce the number of walks of the input basic block sequence by quite a bit which is useful if this enters the normal optimization pipeline. llvm-svn: 156163	2012-05-04 10:18:49 +00:00
Bill Wendling	fa0ebcd1b0	Add 'landingpad' instructions to the list of instructions to ignore. Also combine the code in the 'assert' statement. llvm-svn: 156155	2012-05-04 04:22:32 +00:00
Chandler Carruth	da7513a834	A pile of long over-due refactorings here. There are some very, very minor behavior changes with this, but nothing I have seen evidence of in the wild or expect to be meaningful. The real goal is unifying our logic and simplifying the interfaces. A summary of the changes follows: - Make 'callIsSmall' actually accept a callsite so it can handle intrinsics, and simplify callers appropriately. - Nuke a completely bogus declaration of 'callIsSmall' that was still lurking in InlineCost.h... No idea how this got missed. - Teach the 'isInstructionFree' about the various more intelligent 'free' heuristics that got added to the inline cost analysis during review and testing. This mostly surrounds int->ptr and ptr->int casts. - Switch most of the interesting parts of the inline cost analysis that were essentially computing 'is this instruction free?' to use the code metrics routine instead. This way we won't keep duplicating logic. All of this is motivated by the desire to allow other passes to compute a roughly equivalent 'cost' metric for a particular basic block as the inline cost analysis. Sadly, re-using the same analysis for both is really messy because only the actual inline cost analysis is ever going to go to the contortions required for simplification, SROA analysis, etc. llvm-svn: 156140	2012-05-04 00:58:03 +00:00
Chandler Carruth	a46e62424b	Factor the logic for testing whether a basic block is viable for code extraction into a public interface. Also clean it up and apply it more consistently such that we check for landing pads anywhere in the extracted code, not just in single-block extraction. This will be used to guide decisions in passes that are planning to eventually perform a round of code extraction. llvm-svn: 156114	2012-05-03 22:26:53 +00:00
Nuno Lopes	d4cf35d775	remove calls to calloc if the allocated memory is not used (it was already being done for malloc) fix a few typos found by Chad in my previous commit llvm-svn: 156110	2012-05-03 22:08:19 +00:00
Nuno Lopes	d2b71e7fa9	add support for calloc to objectsize lowering llvm-svn: 156102	2012-05-03 21:19:58 +00:00
Nuno Lopes	22f6f3b055	replace 'break's with 'return 0' in visitCallInst code for objectsize, since there is no need to fallback to visitCallSite. This gives a 0.9% in a test case llvm-svn: 156069	2012-05-03 16:06:07 +00:00
Bill Wendling	c94d86c4ad	Whitespace cleanup. llvm-svn: 156034	2012-05-02 23:43:23 +00:00
Kostya Serebryany	ae7188d9b9	[tsan] typo and style (thanks to Nick Lewycky) llvm-svn: 155986	2012-05-02 13:12:19 +00:00
Bill Wendling	274ba89d77	The value held in the vector may be RAUW'ed by some of the canonicalization methods. Use a weak value handle to keep up with this. PR12245 llvm-svn: 155984	2012-05-02 09:59:45 +00:00
Nick Lewycky	78ee67e814	An instruction in a loop is not guaranteed to be executed just because the loop has no exit blocks. Fixes PR12706! llvm-svn: 155884	2012-05-01 04:03:01 +00:00
Lang Hames	3a90fabd85	Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. Fixes <rdar://problem/11291436>. This is a second attempt at a fix for this, the first was r155468. Thanks to Chandler, Bob and others for the feedback that helped me improve this. llvm-svn: 155866	2012-05-01 00:20:38 +00:00
Bill Wendling	bf4b9afbeb	Second attempt at PR12573: Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If the pass is sure that it thinks it knows what it's doing, then it may go ahead and specify that the landing pad can have its critical edge split. The loop unswitch pass is one of these passes. It will split the critical edges of all edges coming from a loop to a landing pad not within the loop. Doing so will retain important loop analysis information, such as loop simplify. llvm-svn: 155817	2012-04-30 10:44:54 +00:00
Bill Wendling	325e6cd9cb	Use an ArrayRef instead of explicit vector type. llvm-svn: 155816	2012-04-30 10:25:51 +00:00
Bill Wendling	712d85a8c0	Remove hack from r154987. The problem persists even with it, so it's not even a good hack. llvm-svn: 155813	2012-04-30 09:23:48 +00:00
Rafael Espindola	dd48931461	Make sure HoistInsertPosition finds a position that is dominated by all inputs. llvm-svn: 155809	2012-04-30 03:53:06 +00:00
Hal Finkel	27c3246169	Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.). Target specific types should not be vectorized. As a practical matter, these types are already register matched (at least in the x86 case), and codegen does not always work correctly (at least in the ppc case, and this is not worth fixing because ppc_fp128 is currently broken and will probably go away soon). llvm-svn: 155729	2012-04-27 19:34:00 +00:00
David Blaikie	84e4b39995	Change recurse depth limit to uint32 to fix warning. llvm-svn: 155727	2012-04-27 19:30:32 +00:00
Dan Gohman	dae3349ac2	Miscellaneous accumulated cleanups. llvm-svn: 155725	2012-04-27 18:56:31 +00:00
Mon P Wang	6120cfb8cd	Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks. The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow issues. <rdar://problem/11286839>. llvm-svn: 155722	2012-04-27 18:09:28 +00:00
Kostya Serebryany	5a464f03d3	[asan] small optimization: do not emit "x+0" instructions llvm-svn: 155701	2012-04-27 10:04:53 +00:00
Kostya Serebryany	a1259778b4	[tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov llvm-svn: 155698	2012-04-27 07:31:53 +00:00
Jakob Stoklund Olesen	c90abc8956	Break up getProfitableChainIncrement(). The required checks are moved to ChainInstruction() itself and the policy decisions are moved to IVChain::isProfitableInc(). Also cache the ExprBase in IVChain to avoid frequent recomputations. No functional change intended. llvm-svn: 155676	2012-04-26 23:33:11 +00:00
Jakob Stoklund Olesen	a0337d7bd9	Turn IVChain into a struct. No functional change intended. llvm-svn: 155675	2012-04-26 23:33:09 +00:00
Chad Rosier	7813dcee30	Add instcombine patterns for the following transformations: (x & y) | (x ^ y) -> x | y; (x & y) + (x ^ y) -> x | y. Patch by Manman Ren. rdar://10770603 llvm-svn: 155674	2012-04-26 23:29:14 +00:00
Chandler Carruth	739ef80fd7	Teach the reassociate pass to fold chains of multiplies with repeated elements to minimize the number of multiplies required to compute the final result. This uses a heuristic to attempt to form near-optimal binary exponentiation-style multiply chains. While there are some cases it misses, it seems to do at least a decent job on a very diverse range of inputs. Initial benchmarks show no interesting regressions, and an 8% improvement on SPASS. Let me know if any other interesting results (in either direction) crop up! Credit to Richard Smith for the core algorithm, and helping code the patch itself. llvm-svn: 155616	2012-04-26 05:30:30 +00:00
Jakob Stoklund Olesen	293673d788	Print IV chain numbers while collecting them. llvm-svn: 155567	2012-04-25 18:01:32 +00:00
Lang Hames	2fd0c69125	Reverting r155468. Chris and Chandler have convinced me that it's dangerous and in poor taste. Talking through some alternate solutions with Chandler. llvm-svn: 155530	2012-04-25 02:16:54 +00:00
Dan Gohman	62079b43cc	Simplify the known retain count tracking; use a boolean state instead of a precise count. Also, move RRInfo's Partial field into PtrState, now that it won't increase the size. llvm-svn: 155513	2012-04-25 00:50:46 +00:00
Dan Gohman	c24c66f21c	Build custom predecessor and successor lists for each basic block. These lists exclude invoke unwind edges and loop backedges which are being ignored. This makes it easier to ignore them consistently. llvm-svn: 155500	2012-04-24 22:53:18 +00:00
Lang Hames	84531c2b5f	Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. This fixes <rdar://problem/11291436>. llvm-svn: 155468	2012-04-24 18:58:36 +00:00
Jakob Stoklund Olesen	43bcb970e5	Reapply r155136 after fixing PR12599. Original commit message: Defer some shl transforms to DAGCombine. The shl instruction is used to represent multiplication by a constant power of two as well as bitwise left shifts. Some InstCombine transformations would turn an shl instruction into a bit mask operation, making it difficult for later analysis passes to recognize the constant multiplication. Disable those shl transformations, deferring them to DAGCombine time. An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'. These transformations are deferred: (X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses) (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1) (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2) The corresponding exact transformations are preserved, just like div-exact + mul: (X >>?,exact C) << C --> X (X >>?,exact C1) << C2 --> X << (C2-C1) (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2) The disabled transformations could also prevent the instruction selector from recognizing rotate patterns in hash functions and cryptographic primitives. I have a test case for that, but it is too fragile. llvm-svn: 155362	2012-04-23 17:39:52 +00:00
Alexander Potapenko	056e27ea49	Fix issue 67 by checking that the interface functions weren't redefined in the compiled source file. llvm-svn: 155346	2012-04-23 10:47:31 +00:00
Kostya Serebryany	5a4b7a232c	[tsan] use llvm/ADT/Statistic.h for tsan stats llvm-svn: 155341	2012-04-23 08:44:59 +00:00
Jakob Stoklund Olesen	205ee3b389	Revert r155136 "Defer some shl transforms to DAGCombine." While the patch was perfect and defect free, it exposed a really nasty bug in X86 SelectionDAG that caused an llc crash when compiling lencod. I'll put the patch back in after fixing the SelectionDAG problem. llvm-svn: 155181	2012-04-20 00:38:45 +00:00
Bill Wendling	9f97595201	Put this expensive check below the less expensive ones. llvm-svn: 155166	2012-04-19 23:31:07 +00:00
Dan Gohman	26aa827461	Avoid a bug in the path count computation, preventing an infinite loop repeatedly making the same change. This is for rdar://11256239. llvm-svn: 155160	2012-04-19 21:50:46 +00:00
Jakob Stoklund Olesen	6b6c81e6b2	Defer some shl transforms to DAGCombine. The shl instruction is used to represent multiplication by a constant power of two as well as bitwise left shifts. Some InstCombine transformations would turn an shl instruction into a bit mask operation, making it difficult for later analysis passes to recognize the constant multiplication. Disable those shl transformations, deferring them to DAGCombine time. An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'. These transformations are deferred: (X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses) (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1) (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2) The corresponding exact transformations are preserved, just like div-exact + mul: (X >>?,exact C) << C --> X (X >>?,exact C1) << C2 --> X << (C2-C1) (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2) The disabled transformations could also prevent the instruction selector from recognizing rotate patterns in hash functions and cryptographic primitives. I have a test case for that, but it is too fragile. llvm-svn: 155136	2012-04-19 16:46:26 +00:00
Dan Gohman	22fbe8d709	Don't crash on code where the user put __attribute__((constructor)) on a function with arguments. This fixes rdar://11265785. llvm-svn: 155073	2012-04-18 22:24:33 +00:00
Bill Wendling	4d4d025751	Use a heavy hammer to fix PR12573. If the loop contains invoke instructions, whose unwind edge escapes the loop, then don't try to unswitch the loop. Doing so may cause the unwind edge to be split, which not only is non-trivial but doesn't preserve loop simplify information. Fixes PR12573 llvm-svn: 154987	2012-04-18 06:00:09 +00:00
Andrew Trick	19f80c1e7e	loop-reduce: Add an early bailout to catch extremely large loops. This introduces a threshold of 200 IV Users, which is very conservative but should be sufficient to avoid serious compile time sink or stack overflow. The llvm test-suite with LTO never exceeds 190 users per loop. The bug doesn't relate to a specific type of loop. Checking in an arbitrary giant loop as a unit test would be silly. Fixes rdar://11262507. llvm-svn: 154983	2012-04-18 04:00:10 +00:00
Joe Groff	a81bcbb9bb	fix pr12559: mark unavailable win32 math libcalls also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint llvm-svn: 154960	2012-04-17 23:05:54 +00:00
Hal Finkel	52ba49f399	Fix style violation in BBVectorize (pointed out by Bill Wendling) llvm-svn: 154810	2012-04-16 12:39:17 +00:00
Bill Wendling	82b90a3804	Add a Fixme. llvm-svn: 154793	2012-04-16 04:23:52 +00:00
Hal Finkel	8ee309d9b7	Simplify checking for pointer types in BBVectorize (this change was suggested by Duncan). llvm-svn: 154787	2012-04-16 03:49:42 +00:00
Hal Finkel	83c9796033	Fix an error in BBVectorize important for vectorizing pointer types. When vectorizing pointer types it is important to realize that potential pairs cannot be connected via the address pointer argument of a load or store. This is because even after vectorization, the address is still a scalar because the address of the higher half of the pair is implicit from the address of the lower half (it need not be, and should not be, explicitly computed). llvm-svn: 154735	2012-04-14 07:32:50 +00:00
Hal Finkel	f589519a67	Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs. llvm-svn: 154734	2012-04-14 07:32:43 +00:00
Hal Finkel	b2336a79f9	Add support to BBVectorize for vectorizing selects. llvm-svn: 154700	2012-04-13 20:45:45 +00:00
Dan Gohman	670f93744b	Add some comments, and fix a few places that missed setting Changed. llvm-svn: 154687	2012-04-13 18:57:48 +00:00
Dan Gohman	e1e352af2b	Consider ObjC runtime calls objc_storeWeak and others which make a copy of their argument as "escape" points for objc_retainBlock optimization. This fixes rdar://11229925. llvm-svn: 154682	2012-04-13 18:28:58 +00:00
Hal Finkel	204bf5352a	By default, use Early-CSE instead of GVN for vectorization cleanup. As has been suggested by Duncan and others, Early-CSE and GVN should do similar redundancy elimination, but Early-CSE is much less expensive. Most of my autovectorization benchmarks show a performance regression, but all of these are < 0.1%, and so I think that it is still worth using the less expensive pass. llvm-svn: 154673	2012-04-13 17:15:33 +00:00
Dan Gohman	de8d2c446b	Use the new Use-aware dominates method to apply the objc runtime library return value optimization for phi uses. Even when the phi itself is not dominated, the specific use may be dominated. llvm-svn: 154647	2012-04-13 01:08:28 +00:00
Bill Wendling	585583c8dd	Code-gen may inject code into the IR before it emits the ASM. The linker obviously cannot know that this code is present, let alone used. So prevent the internalize pass from internalizing those global values which code-gen may insert. llvm-svn: 154645	2012-04-13 01:06:27 +00:00
Dan Gohman	8478d76d64	Don't move objc_autorelease calls past autorelease pool boundaries when optimizing autorelease calls on phi nodes with null operands. This fixes rdar://11207070. llvm-svn: 154642	2012-04-13 00:59:57 +00:00
Chad Rosier	cc899f3b6d	Typo. llvm-svn: 154522	2012-04-11 19:21:58 +00:00
Chandler Carruth	7ae90d4d2d	Add two statistics to help track how we are computing the inline cost. Yea, 'NumCallerCallersAnalyzed' isn't a great name, suggestions welcome. llvm-svn: 154492	2012-04-11 10:15:10 +00:00
Kostya Serebryany	5ba61ac651	[tsan] two more compile-time optimizations: - don't instrument reads from constant globals. Saves ~1.5% of instrumented instructions on CPU2006 (counting static instructions, not their execution). - don't instrument reads from vtable (which is a global constant too). Saves ~5%. I did not measure the run-time impact of this, but it is certainly non-negative. llvm-svn: 154444	2012-04-10 22:29:17 +00:00
Kostya Serebryany	bf2de80be6	[tsan] compile-time instrumentation: do not instrument a read if a write to the same temp follows in the same BB. Also add stats printing. On Spec CPU2006 this optimization saves roughly 4% of instrumented reads (which is 3% of all instrumented accesses): Writes : 161216 Reads : 446458 Reads-before-write: 18295 llvm-svn: 154418	2012-04-10 18:18:56 +00:00
Andrew Trick	4442bfe559	Fix 12513: Loop unrolling breaks with indirect branches. Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying. llvm-svn: 154386	2012-04-10 05:14:42 +00:00
Andrew Trick	4104ed9c76	whitespace llvm-svn: 154385	2012-04-10 05:14:37 +00:00
Chandler Carruth	f82b0e2d29	Teach InstCombine to nuke a common alloca pattern -- an alloca which has GEPs, bit casts, and stores reaching it but no other instructions. These often show up during the iterative processing of the inliner, SROA, and DCE. Once we hit this point, we can completely remove the alloca. These were actually showing up in the final, fully optimized code in a bunch of inliner tests I've been working on, and notably they show up after LLVM finishes optimizing away all function calls involved in hash_combine(a, b). llvm-svn: 154285	2012-04-08 14:36:56 +00:00
Hongbin Zheng	5758f495da	Refactor: Use positive field names in VectorizeConfig. llvm-svn: 154249	2012-04-07 03:56:23 +00:00
Chandler Carruth	49da93396e	Sink the collection of return instructions until after all simplification has been performed. This is a bit less efficient (requires another ilist walk of the basic blocks) but shouldn't matter in practice. More importantly, it's just too much work to keep track of all the various ways the return instructions can be mutated while simplifying them. This fixes yet another crasher, reported by Daniel Dunbar. llvm-svn: 154179	2012-04-06 17:21:31 +00:00
Duncan Sands	d12b18f820	Make GVN's propagateEquality non-recursive. No intended functionality change. The modifications are a lot more trivial than they appear to be in the diff! llvm-svn: 154174	2012-04-06 15:31:09 +00:00
Chandler Carruth	e41f6f4189	Sink the return instruction collection until after we're done deleting dead code, including dead return instructions in some cases. Otherwise, we end up having a bogus pointer to a return instruction that blows up much further down the road. It turns out that this pattern is both simpler to code, easier to update in the face of enhancements to the inliner cleanup, and likely cheaper given that it won't add dead instructions to the list. Thanks to John Regehr's numerous test cases for teasing this out. llvm-svn: 154157	2012-04-06 01:11:52 +00:00
Dan Gohman	cc64bbca81	Fix accidentally inverted logic from r152803, and make the testcase slightly less trivial. This fixes rdar://11171718. llvm-svn: 154118	2012-04-05 20:27:21 +00:00
Hongbin Zheng	31d33b8318	BBVectorize: Add the const modifier to the VectorizeConfig because we won't modify it. llvm-svn: 154098	2012-04-05 16:07:49 +00:00
Hongbin Zheng	d6825173d3	Introduce the VectorizeConfig class, with which we can control the behavior of the BBVectorizePass without using command-line options. As pointed out by Hal, we can ask the TargetLoweringInfo for the architecture-specific VectorizeConfig to perform vectorizing with architecture-specific information. llvm-svn: 154096	2012-04-05 15:46:55 +00:00
Hongbin Zheng	6edbc39bd7	Add the function "vectorizeBasicBlock", which allows users to vectorize a BasicBlock in other passes, e.g. we can call vectorizeBasicBlock in the loop unroll pass right after the loop is unrolled. llvm-svn: 154089	2012-04-05 08:05:16 +00:00
Jakob Stoklund Olesen	f2390e8303	Pass the right sign to TLI->isLegalICmpImmediate. LSR can fold three addressing modes into its ICmpZero node: ICmpZero BaseReg + Offset => ICmp BaseReg, -Offset; ICmpZero -1*ScaleReg + Offset => ICmp ScaleReg, Offset; ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg. The first two cases are only used if TLI->isLegalICmpImmediate() likes the offset. Make sure the right Offset sign is passed to this method in the second case. The ARM version is not symmetric. <rdar://problem/11184260> llvm-svn: 154079	2012-04-05 03:10:56 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Hongbin Zheng	b21b865fe8	LoopUnrollPass: Use variable "Threshold" instead of "CurrentThreshold" when reducing unroll count, otherwise the reduced unroll count is not taking the "OptimizeForSize" attribute into account. llvm-svn: 154007	2012-04-04 11:44:08 +00:00
Bill Wendling	932b992888	Add an option to turn off the expensive GVN load PRE part of GVN. llvm-svn: 153902	2012-04-02 22:16:50 +00:00
Stepan Dyatkovskiy	f62ffeca88	Fast fix for PR12343: http://llvm.org/bugs/show_bug.cgi?id=12343 We have no trivial way of splitting edges that go from an indirect branch. We can do it with some tricks, but that should be discussed further. It is also still dangerous because indirect branches are difficult to control. The fix forbids this case for unswitching. llvm-svn: 153879	2012-04-02 17:16:45 +00:00
Chandler Carruth	45ae88f5fc	Belatedly address some code review from Chris. As a side note, I really dislike array_pod_sort... Do we really still care about any STL implementations that get this so wrong? Does libc++? llvm-svn: 153834	2012-04-01 10:41:24 +00:00
Chandler Carruth	c5bfb3c0f5	Fix a pretty scary bug I introduced into the always inliner with a single missing character. Somehow, this had gone untested. I've added tests for returns-twice logic specifically with the always-inliner that would have caught this, and fixed the bug. Thanks to Matt for the careful review and spotting this!!! =D llvm-svn: 153832	2012-04-01 10:21:05 +00:00
Chandler Carruth	a88a0faaa3	Give the always-inliner its own custom filter. It shouldn't have to pay the very high overhead of the complex inline cost analysis when all it wants to do is detect three patterns which must not be inlined. Comment the code, clean it up, and leave some hints about possible performance improvements if this ever shows up on a profile. Moving this off of the (now more expensive) inline cost analysis is particularly important because we have to run this inliner even at -O0. llvm-svn: 153814	2012-03-31 13:17:18 +00:00
Chandler Carruth	edd2826f3e	Remove a bunch of empty, dead, and no-op methods from all of these interfaces. These methods were used in the old inline cost system where there was a persistent cache that had to be updated, invalidated, and cleared. We're now doing more direct computations that don't require this intricate dance. Even if we resume some level of caching, it would almost certainly have a simpler and more narrow interface than this. llvm-svn: 153813	2012-03-31 12:48:08 +00:00
Chandler Carruth	0539c071ea	Initial commit for the rewrite of the inline cost analysis to operate on a per-callsite walk of the called function's instructions, in breadth-first order over the potentially reachable set of basic blocks. This is a major shift in how inline cost analysis works to improve the accuracy and rationality of inlining decisions. A brief outline of the algorithm this moves to: - Build a simplification mapping based on the callsite arguments to the function arguments. - Push the entry block onto a worklist of potentially-live basic blocks. - Pop the first block off of the front of the worklist (for breadth-first ordering) and walk its instructions using a custom InstVisitor. - For each instruction's operands, re-map them based on the simplification mappings available for the given callsite. - Compute any simplification possible of the instruction after re-mapping, and store that back into the simplification mapping. - Compute any bonuses, costs, or other impacts of the instruction on the cost metric. - When the terminator is reached, replace any conditional value in the terminator with any simplifications from the mapping we have, and add any successors which are not proven to be dead from these simplifications to the worklist. - Pop the next block off of the front of the worklist, and repeat. - As soon as the cost of inlining exceeds the threshold for the callsite, stop analyzing the function in order to bound cost. The primary goal of this algorithm is to perfectly handle dead code paths. We do not want any code in trivially dead code paths to impact inlining decisions. The previous metric was extremely flawed here, and would always subtract the average cost of two successors of a conditional branch when it was proven to become an unconditional branch at the callsite. 
There was no handling of wildly different costs between the two successors, which would cause inlining when the path actually taken was too large, and no inlining when the path actually taken was trivially simple. There was also no handling of the code path, only the immediate successors. These problems vanish completely now. See the added regression tests for the shiny new features -- we skip recursive function calls, SROA-killing instructions, and high cost complex CFG structures when dead at the callsite being analyzed. Switching to this algorithm required refactoring the inline cost interface to accept the actual threshold rather than simply returning a single cost. The resulting interface is pretty bad, and I'm planning to do lots of interface cleanup after this patch. Several other refactorings fell out of this, but I've tried to minimize them for this patch. =/ There is still more cleanup that can be done here. Please point out anything that you see in review. I've worked really hard to try to mirror at least the spirit of all of the previous heuristics in the new model. It's not clear that they are all correct any more, but I wanted to minimize the change in this single patch, it's already a bit ridiculous. One heuristic that is not yet mirrored is to allow inlining of functions with a dynamic alloca if the caller has a dynamic alloca. I will add this back, but I think the most reasonable way requires changes to the inliner itself rather than just the cost metric, and so I've deferred this for a subsequent patch. The test case is XFAIL-ed until then. As mentioned in the review mail, this seems to make Clang run about 1% to 2% faster in -O0, but makes its binary size grow by just under 4%. I've looked into the 4% growth, and it can be fixed, but requires changes to other parts of the inliner. llvm-svn: 153812	2012-03-31 12:42:41 +00:00
Benjamin Kramer	53dc873342	Internalize: Remove reference of @llvm.noinline, it was replaced with the noinline attribute a long time ago. llvm-svn: 153806	2012-03-31 11:03:47 +00:00
Hal Finkel	5cad8742cc	Correctly vectorize powi. The powi intrinsic requires special handling because it always takes a single integer power regardless of the result type. As a result, we can vectorize only if the powers are equal. Fixes PR12364. llvm-svn: 153797	2012-03-31 03:38:40 +00:00
Jakob Stoklund Olesen	4e55044ff5	Don't PRE compares. CodeGenPrepare sinks compare instructions down to their uses to prevent live flags and predicate registers across basic blocks. PRE of a compare instruction prevents that, forcing the i1 compare result into a general purpose register. That is usually more expensive than the redundant compare PRE was trying to eliminate in the first place. llvm-svn: 153657	2012-03-29 17:22:39 +00:00
Benjamin Kramer	aa9e4a5e59	GlobalOpt: If we have an inbounds GEP from a ConstantAggregateZero global that we just determined to be constant, replace all loads from it with a zero value. llvm-svn: 153576	2012-03-28 14:50:09 +00:00
Chandler Carruth	772c88b887	Switch to WeakVHs in the value mapper, and aggressively prune dead basic blocks in the function cloner. This removes the last case of trivially dead code that I've been seeing in the wild getting inlined, analyzed, re-inlined, optimized, only to be deleted. Nukes a FIXME from the cleanup tests. llvm-svn: 153572	2012-03-28 08:38:27 +00:00
Chad Rosier	bb2a6da440	Fix 80-column violation. llvm-svn: 153556	2012-03-28 00:35:33 +00:00
Chandler Carruth	b9e35fbc1e	Make a seemingly tiny change to the inliner and fix the generated code size bloat. Unfortunately, I expect this to disable the majority of the benefit from r152737. I'm hopeful at least that it will fix PR12345. To explain this requires... quite a bit of backstory I'm afraid. TL;DR: The change in r152737 actually did The Wrong Thing for linkonce-odr functions. This change makes it do the right thing. The benefits we saw were simple luck, not any actual strategy. Benchmark numbers after a mini-blog-post so that I've written down my thoughts on why all of this works and doesn't work... To understand what's going on here, you have to understand how the "bottom-up" inliner actually works. There are two fundamental modes to the inliner: 1) Standard fixed-cost bottom-up inlining. This is the mode we usually think about. It walks from the bottom of the CFG up to the top, looking at callsites, taking information about the callsite and the called function and computing the expected cost of inlining into that callsite. If the cost is under a fixed threshold, it inlines. It's a touch more complicated than that due to all the bonuses, weights, etc. Inlining the last callsite to an internal function gets higher weight, etc. But essentially, this is the mode of operation. 2) Deferred bottom-up inlining (a term I just made up). This is the interesting mode for this patch and r152737. Initially, this works just like mode #1, but once we have the cost of inlining into the callsite, we don't just compare it with a fixed threshold. First, we check something else. Let's give some names to the entities at this point, or we'll end up hopelessly confused. We're considering inlining a function 'A' into its callsite within a function 'B'. We want to check whether 'B' has any callers, and whether it might be inlined into those callers. If so, we also check whether inlining 'A' into 'B' would block any of the opportunities for inlining 'B' into its callers. 
We take the sum of the costs of inlining 'B' into its callers where that inlining would be blocked by inlining 'A' into 'B', and if that cost is less than the cost of inlining 'A' into 'B', then we skip inlining 'A' into 'B'. Now, in order for #2 to make sense, we have to have some confidence that we will actually have the opportunity to inline 'B' into its callers when cheaper, and that we'll be able to revisit the decision and inline 'A' into 'B' if that ever becomes the correct tradeoff. This often isn't true for external functions -- we can see very few of their callers, and we won't be able to re-consider inlining 'A' into 'B' if 'B' is external when we finally see more callers of 'B'. There are two cases where we believe this to be true for C/C++ code: functions local to a translation unit, and functions with an inline definition in every translation unit which uses them. These are represented as internal linkage and linkonce-odr (resp.) in LLVM. I enabled this logic for linkonce-odr in r152737. Unfortunately, when I did that, I also introduced a subtle bug. There was an implicit assumption that the last caller of the function within the TU was the last caller of the function in the program. We want to bonus the last caller of the function in the program by a huge amount for inlining because inlining that callsite has very little cost. Unfortunately, the last caller in the TU of a linkonce-odr function is not the last caller in the program, and so we don't want to apply this bonus. If we do, we can apply it to one callsite per-TU. Because of the way deferred inlining works, when it sees this bonus applied to one callsite in the TU for 'B', it decides that inlining 'B' is of the utmost importance just so we can get that final bonus. It then proceeds to essentially force deferred inlining regardless of the actual cost tradeoff. The result? PR12345: code bloat, code bloat, code bloat. 
Another result is getting damn lucky on a few benchmarks, and the over-inlining exposing critically important optimizations. I would very much like a list of benchmarks that regress after this change goes in, with bitcode before and after. This will help me greatly understand what opportunities the current cost analysis is missing. Initial benchmark numbers look very good. WebKit files that exhibited the worst of PR12345 went from growing to shrinking compared to Clang with r152737 reverted. - Bootstrapped Clang is 3% smaller with this change. - Bootstrapped Clang -O0 over a single-source-file of lib/Lex is 4% faster with this change. Please let me know about any other performance impact you see. Thanks to Nico for reporting and urging me to actually fix, Richard Smith, Duncan Sands, Manuel Klimek, and Benjamin Kramer for talking through the issues today. llvm-svn: 153506	2012-03-27 10:48:28 +00:00
Nadav Rotem	a8f3562e8f	153465 was incorrect. In this code we wanted to check that the pointer operand is of pointer type (and not vector type). llvm-svn: 153468	2012-03-26 21:00:53 +00:00
Nadav Rotem	e63e59cc44	PR12357: The pointer was used before it was checked. llvm-svn: 153465	2012-03-26 20:39:18 +00:00
Andrew Trick	14779cc49e	LSR ivchain bug fix: corner case with ConstantExpr. Fixes PR11950. llvm-svn: 153463	2012-03-26 20:28:37 +00:00
Andrew Trick	356a896394	comment typo llvm-svn: 153462	2012-03-26 20:28:35 +00:00
Chris Lattner	b1e2e1e091	eliminate an unneeded branch, part of PR12357 llvm-svn: 153458	2012-03-26 19:13:57 +00:00
Eric Christopher	2b40fdf3ae	Tidy. llvm-svn: 153456	2012-03-26 19:09:40 +00:00
Eric Christopher	f16bee8682	Tidy. llvm-svn: 153455	2012-03-26 19:09:38 +00:00
Andrew Trick	e51feea79c	LSR cleanup: potential bug caught by PVS-Studio. Thanks Andrey. llvm-svn: 153451	2012-03-26 18:03:16 +00:00
Kostya Serebryany	6f8a776041	[tsan] treat vtable pointer updates in a special way (requires tbaa); fix a bug (forgot to return true after instrumenting); make sure the tsan tests are run llvm-svn: 153448	2012-03-26 17:35:03 +00:00
Craig Topper	6e80c28017	Prune some includes and forward declarations. llvm-svn: 153429	2012-03-26 06:58:25 +00:00
Chandler Carruth	ef82cf5b1e	Teach the function cloner (and thus the inliner) to simplify PHINodes aggressively. There are lots of dire warnings about this being expensive that seem to predate switching to the TrackingVH-based value remapper that is automatically updated on RAUW. This makes it easy to not just prune single-entry PHIs, but to fully simplify PHIs, and to recursively simplify the newly inlined code to propagate PHINode simplifications. This introduces a bit of a thorny problem though. We may end up simplifying a branch condition to a constant when we fold PHINodes, and we would like to nuke any dead blocks resulting from this so that time isn't wasted continually analyzing them, but this isn't easy. Deleting basic blocks after they are fully cloned and mapped into the new function currently requires manually updating the value map. The last piece of the simplification-during-inlining puzzle will require either switching to WeakVH mappings or some other piece of refactoring. I've left a FIXME in the testcase about this. llvm-svn: 153410	2012-03-25 10:34:54 +00:00
Chandler Carruth	2121199241	Move the instruction simplification of callsite arguments in the inliner to instead rely on much more generic and powerful instruction simplification in the function cloner (and thus inliner). This teaches the pruning function cloner to use instsimplify rather than just the constant folder to fold values during cloning. This can simplify a large number of things that constant folding alone cannot begin to touch. For example, it will realize that 'or' and 'and' instructions with certain constant operands actually become constants regardless of what their other operand is. It also can thread back through the caller to perform simplifications that are only possible by looking up a few levels. In particular, GEPs and pointer testing tend to fold much more heavily with this change. This should (in some cases) have a positive impact on compile times with optimizations on because the inliner itself will simply avoid cloning a great deal of code. It already attempted to prune proven-dead code, but now it will use the stronger simplifications to prove more code dead. llvm-svn: 153403	2012-03-25 04:03:40 +00:00
Chandler Carruth	0c72e3f469	Add an asserting ValueHandle to the block simplification code which will fire if anything ever invalidates the assumption of a terminator instruction being unchanged throughout the routine. I've convinced myself that the current definition of simplification precludes such a transformation, so I think getting some asserts coverage that we don't violate this agreement is sufficient to make this code safe for the foreseeable future. Comments to the contrary or other suggestions are of course welcome. =] The bots are now happy with this code though, so it appears the bug here has indeed been fixed. llvm-svn: 153401	2012-03-25 03:29:25 +00:00
Chandler Carruth	17fc6ef234	Don't form a WeakVH around the sentinel node in the instructions BB list. This is a bad idea. ;] I'm hopeful this is the bug that's showing up with the MSVC bots, but we'll see. It is definitely unnecessary. InstSimplify won't do anything to a terminator instruction, we don't need to even include it in the iteration range. We can also skip the now dead terminator check, although I've made it an assert to help document that this is an important invariant. I'm still a bit queasy about this because there is an implicit assumption that the terminator instruction cannot be RAUW'ed by the simplification code. While that appears to be true at the moment, I see no guarantee that would ensure it remains true in the future. I'm looking at the cleanest way to solve that... llvm-svn: 153399	2012-03-24 23:03:27 +00:00
Chandler Carruth	cf1b585f60	Refactor the interface to recursively simplifying instructions to be a tad bit simpler by handling a common case explicitly. Also, refactor the implementation to use a worklist based walk of the recursive users, rather than trying to use value handles to detect and recover from RAUWs during the recursive descent. This fixes a very subtle bug in the previous implementation where degenerate control flow structures could cause mutually recursive instructions (PHI nodes) to collapse in just such a way that From became equal to To after some amount of recursion. At that point, we hit the inf-loop that the assert at the top attempted to guard against. This problem is defined away when not using value handles in this manner. There are lots of comments claiming that the WeakVH will protect against just this sort of error, but they're not accurate about the actual implementation of WeakVHs, which do still track RAUWs. I don't have any test case for the bug this fixes because it requires running the recursive simplification on unreachable phi nodes. I've no way to either run this or easily write an input that triggers it. It was found when using instruction simplification inside the inliner when running over the nightly test-suite. llvm-svn: 153393	2012-03-24 21:11:24 +00:00
Francois Pichet	4b9ab74690	Fix the MSVC build. llvm-svn: 153366	2012-03-24 01:36:37 +00:00
Andrew Trick	25553ab5fe	More IndVarSimplify cleanup. llvm-svn: 153362	2012-03-24 00:51:17 +00:00
Kostya Serebryany	e505a5abe9	add EP_OptimizerLast extension point llvm-svn: 153353	2012-03-23 23:22:59 +00:00
Dan Gohman	e3ed2b0699	Don't convert objc_retainAutoreleasedReturnValue to objc_retain if it is retaining the return value of an invoke that it immediately follows. llvm-svn: 153344	2012-03-23 18:09:00 +00:00
Dan Gohman	5c70fadc17	It's not possible to insert code immediately after an invoke in the same basic block, and it's not safe to insert code in the successor blocks if the edges are critical edges. Splitting those edges is possible, but undesirable, especially on the unwind side. Instead, make the bottom-up code motion consider invokes to be part of their successor blocks, rather than part of their parent blocks, so that it doesn't push code past them and onto the edges. This fixes PR12307. llvm-svn: 153343	2012-03-23 17:47:54 +00:00
Duncan Sands	a11ef6e4ea	When propagating equalities, eg replacing A with B in every basic block dominated by Root, check that B is available throughout the scope. This is obviously true (famous last words?) given the current logic, but the check may be helpful if more complicated reasoning is added one day. llvm-svn: 153323	2012-03-23 08:45:52 +00:00
Duncan Sands	8f897dc88b	Indentation. llvm-svn: 153322	2012-03-23 08:29:04 +00:00
Andrew Trick	e3502cb204	Remove -enable-lsr-retry in time for 3.1. llvm-svn: 153287	2012-03-22 22:42:51 +00:00
Andrew Trick	d97b83e320	Remove -enable-lsr-nested in time for 3.1. Tests cases have been removed but attached to open PR12330. llvm-svn: 153286	2012-03-22 22:42:45 +00:00
Dan Gohman	817a7c6fdf	Refactor the code for visiting instructions out into helper functions. llvm-svn: 153267	2012-03-22 18:24:56 +00:00
Andrew Trick	0654989062	Remove unused simplifyIVUsers llvm-svn: 153262	2012-03-22 17:47:30 +00:00
Andrew Trick	f47d0af551	Remove -enable-iv-rewrite, which has been unsupported since 3.0. llvm-svn: 153260	2012-03-22 17:10:11 +00:00
Chris Lattner	7d7dba3c92	don't use "signed", just something I noticed in patches flying by. llvm-svn: 153237	2012-03-22 03:46:58 +00:00
Kostya Serebryany	84a7f2e8e9	[asan] fix one more bug related to long double llvm-svn: 153189	2012-03-21 15:28:50 +00:00
Eric Christopher	7d522f161d	Zap some dead code pointed out by Chandler. llvm-svn: 153150	2012-03-20 23:28:58 +00:00
Andrew Trick	f7711010e1	LoopSimplify bug fix. Handle indirect loop back edges. Do not call SplitBlockPredecessors on a loop preheader when one of the predecessors is an indirectbr. Otherwise, you will hit this assert: !isa<IndirectBrInst>(Preds[i]->getTerminator()) && "Cannot split an edge from an IndirectBrInst" llvm-svn: 153134	2012-03-20 21:24:52 +00:00
Andrew Trick	bb01cbb312	whitespace llvm-svn: 153133	2012-03-20 21:24:47 +00:00
Kostya Serebryany	c58dc9fcd2	[asan] don't emit __asan_mapping_offset/__asan_mapping_scale by default -- they are currently used only for experiments llvm-svn: 153040	2012-03-19 16:40:35 +00:00
Bill Wendling	55b6b2b6a9	Revert r152907. llvm-svn: 152935	2012-03-16 18:20:54 +00:00
Bill Wendling	a2a26b546c	The pointer operand of the store instruction may have an alignment. If that's the case, then we want to make sure that we don't increase the alignment of the store instruction. Because if we increase it to be "more aligned" than the pointer, code-gen may use instructions which require a greater alignment than the pointer guarantees. <rdar://problem/11043589> llvm-svn: 152907	2012-03-16 07:40:08 +00:00
Chandler Carruth	b37fc13a36	Rip out support for 'llvm.noinline'. This thing has a strange history... It was added in 2007 as the first cut at supporting no-inline attributes, but we didn't have function attributes of any form at the time. However, it was added without any mention in the LangRef or other documentation. Later on, in 2008, Devang added function notes for 'inline=never' and then turned them into proper function attributes. From that point onward, as far as I can tell, the world moved on, and no one has touched 'llvm.noinline' in any meaningful way since. Its time has now come. We have had better mechanisms for doing this for a long time, all the frontends I'm aware of use them, and this is just holding back progress. Given that it was never a documented feature of the IR, I've provided no auto-upgrade support. If people know of real, in-the-wild bitcode that relies on this, yell at me and I'll add it, but I seriously doubt anyone cares. llvm-svn: 152904	2012-03-16 06:10:15 +00:00
Chandler Carruth	d7a5f2adb0	Start removing the use of an ad-hoc 'never inline' set and instead directly query the function information which this set was representing. This simplifies the interface of the inline cost analysis, and makes the always-inline pass significantly more efficient. Previously, always-inline would first make a single set of every function in the module except those marked with the always-inline attribute. It would then query this set at every call site to see if the function was a member of the set, and if so, refuse to inline it. This is quite wasteful. Instead, simply check the function attribute directly when looking at the callsite. The normal inliner also had similar redundancy. It added every function in the module with the noinline attribute to its set to ignore, even though inside the cost analysis function we already tested the noinline attribute and produced the same result. The only tricky part of removing this is that we have to be able to correctly remove only the functions inlined by the always-inline pass when finalizing, which requires a bit of a hack. Still, much less of a hack than the set of all non-always-inline functions was. While I was touching this function, I switched a heavy-weight set to a vector with sort+unique. The algorithm already had a two-phase insert and removal pattern, we were just needlessly paying the uniquing cost on every insert. This probably speeds up some compiles by a small amount (-O0 compiles with lots of always-inline, so potentially heavy libc++ users), but I've not tried to measure it. I believe there is no functional change here, but yell if you spot one. None are intended. Finally, the direction this is going in is to greatly simplify the inline cost query interface so that we can replace its implementation with a much more clever one. Along the way, all the APIs get simplified, so it seems incrementally good. llvm-svn: 152903	2012-03-16 06:10:13 +00:00
Andrew Trick	070e540a3e	LSR fix: Add isSimplifiedLoopNest to IVUsers analysis. Only record IVUsers that are dominated by simplified loop headers. Otherwise SCEVExpander will crash while looking for a preheader. I previously tried to work around this in LSR itself, but that was insufficient. This way, LSR can continue to run if some uses are not in simple loops, as long as we don't attempt to analyze those users. Fixes <rdar://problem/11049788> Segmentation fault: 11 in LoopStrengthReduce llvm-svn: 152892	2012-03-16 03:16:56 +00:00
Eli Friedman	e06535b2f6	In InstCombiner::visitOr, make sure we reverse the operand swap used for checking for or-of-xor operations after those checks; a later check expects that any constant will be in Op1. PR12234. llvm-svn: 152884	2012-03-16 00:52:42 +00:00
Rafael Espindola	f58927855b	Short term fix for pr12270 before we change dominates to handle unreachable code. While here, reduce indentation. llvm-svn: 152803	2012-03-15 15:52:59 +00:00
Bill Wendling	7fa1be77cc	Use an iterator instead of calling .size() on the worklist every time, which is wasteful. llvm-svn: 152794	2012-03-15 11:19:41 +00:00
Chandler Carruth	be2ccf01b7	Remove the basic inliner. This was added in 2007, and hasn't really changed since. No one was using it. It is yet another consumer of the InlineCost interface that I'd like to change. llvm-svn: 152769	2012-03-15 01:37:56 +00:00
Chandler Carruth	3904590ba8	This pass didn't want the inline cost per-se, it just wants generic code metrics. llvm-svn: 152760	2012-03-15 00:29:10 +00:00
Aaron Ballman	a733297fa6	Fixed a transform crash when setting a negative size value for memset. Fixes PR12202. llvm-svn: 152756	2012-03-15 00:05:31 +00:00
Kostya Serebryany	abad002d55	[tsan] use FunctionBlackList llvm-svn: 152755	2012-03-14 23:33:24 +00:00
Kostya Serebryany	01401cec00	[asan] rename class BlackList to FunctionBlackList and move it into a separate file -- we will need the same functionality in ThreadSanitizer llvm-svn: 152753	2012-03-14 23:22:10 +00:00
Dan Gohman	532fb8131b	When an invoke is marked with metadata indicating its unwind edge should be ignored by ARC optimization, don't insert new ARC runtime calls in the unwind destination. llvm-svn: 152748	2012-03-14 23:05:06 +00:00
Chandler Carruth	30b8416d2c	Change where we enable the heuristic that delays inlining into functions which are small enough to themselves be inlined. Delaying in this manner can be harmful if the function is ineligible for inlining in some (or many) contexts as it pessimizes the code of the function itself in the event that inlining does not eventually happen. Previously the check was written to only do this delaying of inlining for static functions in the hope that they could be entirely deleted and in the knowledge that all callers of static functions will have the opportunity to inline if it is in fact profitable. However, with C++ we get two other important sources of functions where the definition is always available for inlining: inline functions and templated functions. This patch generalizes the inliner to allow linkonce-ODR (the linkage such C++ routines receive) to also qualify for this delay-based inlining. Benchmarking across a range of large real-world applications shows roughly 2% size increase across the board, but an average speedup of about 0.5%. Some benchmarks improved over 2%, and the 'clang' binary itself (when bootstrapped with this feature) shows a 1% -O0 performance improvement when run over all Sema, Lex, and Parse source code smashed into a single file. A clean re-build of Clang+LLVM with a bootstrapped Clang shows approximately 2% improvement, but that measurement is often noisy. llvm-svn: 152737	2012-03-14 20:16:41 +00:00
Pete Cooper	615fd897e0	Target override to allow CodeGenPrepare to sink address operands to intrinsics in the same way it currently does for loads and stores llvm-svn: 152666	2012-03-13 20:59:56 +00:00
Chris Lattner	87fa77bd8a	enhance jump threading to preserve TBAA information when PRE'ing loads, fixing rdar://11039258, an issue that came up when inspecting clang's bootstrapped codegen. llvm-svn: 152635	2012-03-13 18:07:41 +00:00
Dan Gohman	eab06fa3c9	Teach globalopt how to evaluate an invoke with a non-void return type. llvm-svn: 152634	2012-03-13 18:01:37 +00:00
Chandler Carruth	595fda8466	When inlining a function and adding its inner call sites to the candidate set for subsequent inlining, try to simplify the arguments to the inner call site now that inlining has been performed. The goal here is to propagate and fold constants through deeply nested call chains. Without doing this, we lose the inliner bonus that should be applied because the arguments don't match the exact pattern the cost estimator uses. Reviewed on IRC by Benjamin Kramer. llvm-svn: 152556	2012-03-12 11:19:33 +00:00
Stepan Dyatkovskiy	97b02fc1b3	llvm::SwitchInst Renamed methods caseBegin, caseEnd and caseDefault with case_begin, case_end, and case_default. Added some notes relative to case iterators. llvm-svn: 152532	2012-03-11 06:09:17 +00:00
Duncan Sands	14eb175836	Add statistics on removed switch cases, and fix the phi statistic to count the number of phis changed, not the number visited. llvm-svn: 152425	2012-03-09 19:21:15 +00:00
Dan Gohman	500b598c5c	When identifying exit nodes for the reverse-CFG reverse-post-order traversal, consider nodes for which the only successors are backedges which the traversal is ignoring to be exit nodes. This fixes a problem where the bottom-up traversal was failing to visit split blocks along split loop backedges. This fixes rdar://10989035. llvm-svn: 152421	2012-03-09 18:50:52 +00:00
Duncan Sands	cca89124a2	Eliminate switch cases that can never match, for example removes all negative switch cases if the branch condition is known to be positive. Inspired by a recent improvement to GCC's VRP. llvm-svn: 152405	2012-03-09 13:45:18 +00:00
Stepan Dyatkovskiy	5b648afb4d	Took into account Duncan's comments for r149481 dated 2nd Feb 2012: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html Implemented CaseIterator, which solves almost all of the described issues: we don't need to mix operand/case/successor indexing anymore. The base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*". ConstCaseIt is just a read-only iterator. CaseIt is a read-write iterator; it allows changing the case successor and case value. Using the iterator allows us to totally remove the resolveXXXX methods. All indexing conversions are done automatically inside the iterator's getters. The main way to use the iterator looks like this: SwitchInst *SI = ... // initialize it somehow for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) { BasicBlock *BB = i.getCaseSuccessor(); ConstantInt *V = i.getCaseValue(); // Do something. } If you want to convert a case number to a TerminatorInst successor index, just use the iterator's getSuccessorIndex method. If you want to initialize an iterator from a TerminatorInst successor index, use the CaseIt::fromSuccessorIndex(...) method. There are also related changes in llvm-clients: klee and clang. llvm-svn: 152297	2012-03-08 07:06:20 +00:00
Sebastian Pop	5ce71b18cb	fix typos llvm-svn: 152035	2012-03-05 17:39:47 +00:00
Sebastian Pop	8844e224b8	remove spaces on empty lines llvm-svn: 152034	2012-03-05 17:39:45 +00:00
Duncan Sands	3eb328574e	This is not a common case, in fact it never happens! llvm-svn: 152027	2012-03-05 12:23:00 +00:00
Chandler Carruth	d95357a18e	Switch mem2reg to use the new hashing infrastructure. llvm-svn: 152026	2012-03-05 11:29:56 +00:00
Chandler Carruth	e134d1a336	Replace the ad-hoc hashing in GVN with the new hashing infrastructure. This implicitly fixes a nasty bug in the GVN hashing (that thankfully could only manifest as a performance bug): actually include the opcode in the hash. The old code started the hash off with the opcode, but then overwrote it with the type pointer. Since this is likely to be pretty hot (GVN being already pretty expensive) I've included a micro-optimization to just not bother with the varargs hashing if they aren't present. I can't measure any change in GVN performance due to this, even with a big test case like Duncan's sqlite one. Everything I see is in the noise floor. That said, this closes a loophole for a potential scaling problem due to collisions if the opcode were the differentiating aspect of the expression. llvm-svn: 152025	2012-03-05 11:29:54 +00:00
Duncan Sands	4d928e7dff	Nick pointed out on IRC that GVN's propagateEquality wasn't propagating equalities into phi node operands for which the equality is known to hold in the incoming basic block. That's because replaceAllDominatedUsesWith wasn't handling phi nodes correctly in general (that this didn't give wrong results was just luck: the specific way GVN uses replaceAllDominatedUsesWith precluded wrong changes to phi nodes). llvm-svn: 152006	2012-03-04 13:25:19 +00:00
Bill Wendling	97b9359623	Do trivial CSE of dead BBs during codegen preparation. Some BBs can become dead after codegen preparation. If we delete them here, it could help enable tail-call optimizations later on. <rdar://problem/10256573> llvm-svn: 152002	2012-03-04 10:46:01 +00:00
Evgeniy Stepanov	d33e3d8c6e	ASan: use getTypeAllocSize instead of getTypeStoreSize. This change replaces getTypeStoreSize with getTypeAllocSize in AddressSanitizer instrumentation for stack allocations. One case where old behaviour produced undesired results is an optimization in InstCombine pass (PromoteCastOfAllocation), which can replace alloca(T) with alloca(S), where S has the same AllocSize, but a smaller StoreSize. Another case is memcpy(long double => long double), where ASan will poison bytes 10-15 of a stack-allocated long double (StoreSize 10, AllocSize 16, sizeof(long double) = 16). See http://llvm.org/bugs/show_bug.cgi?id=12047 for more context. llvm-svn: 151887	2012-03-02 10:41:08 +00:00
Dan Gohman	362eb69f24	Fix an iterator invalidation problem. operator[] on a DenseMap can insert a new element, invalidating iterators. Use find instead, and handle the case where the key is not found explicitly. llvm-svn: 151871	2012-03-02 01:26:46 +00:00
Dan Gohman	55b067427b	Misc micro-optimizations. llvm-svn: 151869	2012-03-02 01:13:53 +00:00
Duncan Sands	bb2fe65542	Have GVN also do condition propagation when the right-hand side is not a constant. This fixes PR1768. llvm-svn: 151713	2012-02-29 11:12:03 +00:00
Bill Wendling	f2c78f344e	Restrict this transformation to equality conditions. This transformation is not correct for not-equal conditions: (trunc x) != C1 & (and x, CA) != C2 -> (and x, CA\|CMAX) != C1\|C2 Let C1 == 0 C2 == 0 CA == 0xFF0000 CMAX == 0xFF and truncating to i8. The original truth table: x \| A: trunc x != 0 \| B: x & 0xFF0000 != 0 \| A & B != 0 -------------------------------------------------------------- 0x00000 \| 0 \| 0 \| 0 0x00001 \| 1 \| 0 \| 0 0x10000 \| 0 \| 1 \| 0 0x10001 \| 1 \| 1 \| 1 The truth table of the replacement: x \| x & 0xFF00FF != 0 ---------------------------- 0x00000 \| 0 0x00001 \| 1 0x10000 \| 1 0x10001 \| 1 So they are different. llvm-svn: 151691	2012-02-29 01:46:50 +00:00
Pete Cooper	39b5255df4	Reverted r151620 - DSE: Shorten memset when a later store overwrites the start of it. There were all sorts of buildbot issues llvm-svn: 151621	2012-02-28 05:06:24 +00:00
Pete Cooper	f3862f91de	DSE: Shorten memset when a later store overwrites the start of it llvm-svn: 151620	2012-02-28 04:27:10 +00:00
Benjamin Kramer	93887631d9	Plug a memleak in GlobalOpt. Found by valgrind. llvm-svn: 151525	2012-02-27 12:48:24 +00:00
Duncan Sands	9edea84420	Micro-optimization, no functionality change. llvm-svn: 151524	2012-02-27 12:11:41 +00:00
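
The counterexample in r151691 (Bill Wendling, f2c78f344e) can be checked mechanically. The following is a minimal sketch, not code from LLVM: it takes the constants from the commit message (C1 == 0, C2 == 0, CA == 0xFF0000, CMAX == 0xFF, truncating to i8) and evaluates both sides of the fold on the four sample values listed there. The helper names (trunc_i8, original_ne, and so on) are hypothetical and exist only for illustration.

```python
# Constants as given in the r151691 commit message.
CA = 0xFF0000
CMAX = 0xFF
C1 = 0
C2 = 0

# The four sample inputs used in the commit message's truth tables.
SAMPLES = [0x00000, 0x00001, 0x10000, 0x10001]


def trunc_i8(x):
    """Model 'trunc x to i8' by keeping the low 8 bits."""
    return x & 0xFF


def original_ne(x):
    """(trunc x) != C1 & (and x, CA) != C2"""
    return (trunc_i8(x) != C1) and ((x & CA) != C2)


def replacement_ne(x):
    """(and x, CA|CMAX) != C1|C2"""
    return (x & (CA | CMAX)) != (C1 | C2)


def original_eq(x):
    """(trunc x) == C1 & (and x, CA) == C2"""
    return (trunc_i8(x) == C1) and ((x & CA) == C2)


def replacement_eq(x):
    """(and x, CA|CMAX) == C1|C2"""
    return (x & (CA | CMAX)) == (C1 | C2)


# Inputs on which the folded form disagrees with the original form.
mismatches_ne = [x for x in SAMPLES if original_ne(x) != replacement_ne(x)]
mismatches_eq = [x for x in SAMPLES if original_eq(x) != replacement_eq(x)]

for x in SAMPLES:
    print(f"{x:#07x}  ne: {original_ne(x)!s:5} -> {replacement_ne(x)!s:5}  "
          f"eq: {original_eq(x)!s:5} -> {replacement_eq(x)!s:5}")
```

On these samples the not-equal form disagrees for 0x00001 and 0x10000, matching the two truth tables in the message, while the equality form agrees on all four, which is consistent with restricting the transformation to equality conditions. Four samples do not prove the equality case in general; they only reproduce the message's tables.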


10027 Commits