llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	9433a28845	Preserve IR flags (nsw, nuw, exact, fast-math) in SLP vectorizer (PR20802). The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math) when converting scalar instructions into vectors. But this isn't a simple copy - we need to take the intersection (the logical 'and') of the sets of flags on the scalars. The solution is further complicated because we can have non-uniform (non-SIMD) vector ops after: http://reviews.llvm.org/D4015 http://llvm.org/viewvc/llvm-project?view=revision&revision=211339 The vast majority of changed files are existing tests that were not propagating IR flags, but I've also added a new test file for focused testing of IR flag possibilities. Differential Revision: http://reviews.llvm.org/D5172 llvm-svn: 217051	2014-09-03 17:40:30 +00:00
Sanjay Patel	a982d992f0	Change name of copyFlags() to copyIRFlags(). Add convenience method for logical 'and' of all flags. NFC. Adding 'IR' to the names in an attempt to be less ambiguous about the flags we're dealing with here. The 'and' method is needed by the SLPVectorizer (PR20802) and possibly other passes. llvm-svn: 217004	2014-09-03 01:06:50 +00:00
Hal Finkel	445dda5c4a	Add pass-manager flags to use CFL AA Add -use-cfl-aa (and -use-cfl-aa-in-codegen) to add CFL AA in the default pass managers (for easy testing). llvm-svn: 216978	2014-09-02 22:12:54 +00:00
Kostya Serebryany	ad23852ac3	[asan] Assign a low branch weight to ASan's slow path, patch by Jonas Wagner. This speeds up asan (at least on SPEC) by 1%-5% or more. Also fix lint in dfsan. llvm-svn: 216972	2014-09-02 21:46:51 +00:00
Yi Jiang	77a609b556	Generate extract for in-tree uses if the use is scalar operand in vectorized instruction. radar://18144665 llvm-svn: 216946	2014-09-02 21:00:39 +00:00
David Blaikie	15913f46b2	unique_ptrify the result of SpecialCaseList::create llvm-svn: 216925	2014-09-02 18:13:54 +00:00
David Majnemer	49428105aa	LICM: Don't crash when an instruction is used by an unreachable BB Summary: BBs might contain non-LCSSA'd values after the LCSSA pass is run if they are unreachable from the entry block. Normally, the users of the instruction would be PHIs but the unreachable BBs have normal users; rewrite their uses to be undef values. An alternative fix could involve fixing this at LCSSA but that would require this invariant to hold after subsequent transforms. If a BB created an unreachable block, they would be in violation of this. This fixes PR19798. Differential Revision: http://reviews.llvm.org/D5146 llvm-svn: 216911	2014-09-02 16:22:00 +00:00
David Majnemer	d4cffcf073	SROA: Don't insert instructions before a PHI SROA may decide that it needs to insert a bitcast and would set its insertion point before a PHI. This will create an invalid module right quick. Instead, choose the first insertion point in the basic block that holds our PHI. This fixes PR20822. Differential Revision: http://reviews.llvm.org/D5141 llvm-svn: 216891	2014-09-01 21:20:14 +00:00
David Majnemer	d2df50196f	Revert "Revert two GEP-related InstCombine commits" This reverts commit r216698 which reverted r216523 and r216598. We would attempt to perform the transformation even if the match() failed because, as a side effect, it would set V. This would trick us into believing that we correctly found a place to correctly apply the transform. An additional test case was added to getelementptr.ll so that we might not regress in the future. llvm-svn: 216890	2014-09-01 21:10:02 +00:00
Sanjay Patel	5ad239e15a	Add a convenience method to copy wrapping, exact, and fast-math flags (NFC). The loop vectorizer preserves wrapping, exact, and fast-math properties of scalar instructions. This patch adds a convenience method to make that operation easier because we need to do this in the loop vectorizer, SLP vectorizer, and possibly other places. Although this is a 'no functional change' patch, I've added a testcase to verify that the exact flag is preserved by the loop vectorizer. The wrapping and fast-math flags are already checked in existing testcases. Differential Revision: http://reviews.llvm.org/D5138 llvm-svn: 216886	2014-09-01 18:44:57 +00:00
Chandler Carruth	18cee1defc	Fix a really bad miscompile introduced in r216865 - the else-if logic chain became completely broken here as all intrinsic users ended up being skipped, and the ones that seemed to be singled out were actually the exact wrong set. This is a great example of why long else-if chains can be easily confusing. Switch the entire code to use early exits and early continues to have simpler (and more importantly, correct) logic here, as well as fixing the reversed logic for detecting and continuing on lifetime intrinsics. I've also significantly cleaned up the test case and added another test case demonstrating an example where the optimization is not (trivially) safe to perform. llvm-svn: 216871	2014-09-01 10:09:18 +00:00
Renato Golin	86a6c3f269	Small refactor on VectorizerHint for deduplication Previously, the hint mechanism relied on clean up passes to remove redundant metadata, which still showed up if running opt at low levels of optimization. That also has shown that multiple nodes of the same type, but with different values could still coexist, even if temporary, and cause confusion if the next pass got the wrong value. This patch makes sure that, if metadata already exists in a loop, the hint mechanism will never append a new node, but always replace the existing one. It also enhances the algorithm to cope with more metadata types in the future by just adding a new type, not a lot of code. Re-applying again due to MSVC 2013 being minimum requirement, and this patch having C++11 that MSVC 2012 didn't support. Fixes PR20655. llvm-svn: 216870	2014-09-01 10:00:17 +00:00
Hal Finkel	0c083024f0	Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadata This feeds AA through the IFI structure into the inliner so that AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which functions only access their arguments (instead of just hard-coding some knowledge of memory intrinsics). Most of the information is only available from BasicAA; this is important for preserving alias scoping information for target-specific intrinsics when doing the noalias parameter attribute to metadata conversion. llvm-svn: 216866	2014-09-01 09:01:39 +00:00
Nick Lewycky	fc243d54d2	Ignore lifetime intrinsics in use list for MemCpyOptimizer. Patch by Luqman Aden, review by Hal Finkel. llvm-svn: 216865	2014-09-01 06:03:11 +00:00
Hal Finkel	cbb85f249e	Fix AddAliasScopeMetadata again - alias.scope must be a complete description I thought that I had fixed this problem in r216818, but I did not do a very good job. The underlying issue is that when we add alias.scope metadata we are asserting that this metadata completely describes the aliasing relationships within the current aliasing scope domain, and so in the context of translating noalias argument attributes, the pointers must all be based on noalias arguments (as underlying objects) and have no other kind of underlying object. In r216818 excluding appropriate accesses from getting alias.scope metadata is done by looking for underlying objects that are not identified function-local objects -- but that's wrong because allocas, etc. are also function-local objects and we need to explicitly check that all underlying objects are the noalias arguments for which we're adding metadata aliasing scopes. This fixes the underlying-object check for adding alias.scope metadata, and does some refactoring of the related capture-checking eligibility logic (and adds more comments; hopefully making everything a bit clearer). Fixes self-hosting on x86_64 with -mllvm -enable-noalias-to-md-conversion (the feature is still disabled by default). llvm-svn: 216863	2014-09-01 04:26:40 +00:00
Craig Topper	6dc4a8bc2c	Fix some cases where StringRef was being passed by const reference. Remove const from some other StringRefs since it's implicitly const already. llvm-svn: 216820	2014-08-30 16:48:02 +00:00
Hal Finkel	a3708df41a	Fix AddAliasScopeMetadata to not add scopes when deriving from unknown pointers The previous implementation of AddAliasScopeMetadata, which adds noalias metadata to preserve noalias parameter attribute information when inlining had a flaw: it would add alias.scope metadata to accesses which might have been derived from pointers other than noalias function parameters. This was incorrect because even some access known not to alias with all noalias function parameters could easily alias with an access derived from some other pointer. Instead, when deriving from some unknown pointer, we cannot add alias.scope metadata at all. This fixes a miscompile of the test-suite's tramp3d-v4. Furthermore, we cannot add alias.scope to functions unless we know they access only argument-derived pointers (currently, we know this only for memory intrinsics). Also, we fix a theoretical problem with using the NoCapture attribute to skip the capture check. This is incorrect (as explained in the comment added), but would not matter in any code generated by Clang because we get only inferred nocapture attributes in Clang-generated IR. This functionality is not yet enabled by default. llvm-svn: 216818	2014-08-30 12:48:33 +00:00
David Majnemer	492e612e01	InstCombine: Respect recursion depth in visitUDivOperand llvm-svn: 216817	2014-08-30 09:19:05 +00:00
David Majnemer	5e96f1b4c8	InstCombine: Try harder to combine icmp instructions consider: (and (icmp X, Y), (and Z, (icmp A, B))) It may be possible to combine (icmp X, Y) with (icmp A, B). If we successfully combine, create an 'and' instruction with Z. This fixes PR20814. N.B. There is room for improvement after this change but I'm not convinced it's worth chasing yet. llvm-svn: 216814	2014-08-30 06:18:20 +00:00
Hal Finkel	2d3d6da44b	Fix a typo in AddAliasScopeMetadata llvm-svn: 216741	2014-08-29 16:33:41 +00:00
David Majnemer	400e725bde	Revert two GEP-related InstCombine commits This reverts commit r216523 and r216598; people have reported regressions. llvm-svn: 216698	2014-08-29 00:06:43 +00:00
Reid Kleckner	febb279c9c	Don't promote byval pointer arguments when padding matters Don't promote byval pointer arguments when their size in bits is not equal to their alloc size in bits. This can happen for x86_fp80, where the size in bits is 80 but the alloca size in bits is 128. Promoting these types can break passing unions of x86_fp80s and other types. Patch by Thomas Jablin! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D5057 llvm-svn: 216693	2014-08-28 22:42:00 +00:00
David Majnemer	074052b623	InstCombine: Remove redundant combines InstSimplify already handles icmp (X+Y), X (and things like it) appropriately. The first thing that InstCombine does is run InstSimplify on the instruction. llvm-svn: 216659	2014-08-28 10:08:37 +00:00
Erik Eckstein	8354cfaf95	Fix: SLPVectorizer tried to move an instruction which was replaced by a vector instruction. For a detailed description of the problem see the comment in the test file. The problematic moveBefore() calls are not required anymore because the new scheduling algorithm ensures a correct ordering anyway. llvm-svn: 216656	2014-08-28 07:04:02 +00:00
David Majnemer	76d06bc613	InstSimplify: Move a transform from InstCombine to InstSimplify Several combines involving icmp (shl C2, %X) C1 can be simplified without introducing any new instructions. Move them to InstSimplify; while we are at it, make them more powerful. llvm-svn: 216642	2014-08-28 03:34:28 +00:00
David Majnemer	22ccfc4484	InstCombine: Combine gep X, (Y-X) to Y We try to perform this transform in InstSimplify but we aren't always able to. Sometimes, we need to insert a bitcast if X and Y don't have the same type. llvm-svn: 216598	2014-08-27 20:08:37 +00:00
Michael Zolotukhin	5dc466b863	[SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix). llvm-svn: 216549	2014-08-27 15:01:18 +00:00
Craig Topper	e1d1294853	Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created. llvm-svn: 216525	2014-08-27 05:25:25 +00:00
Craig Topper	3af9722529	Fix some cases where ArrayRefs were being passed by reference. Also remove 'const' from some other ArrayRef uses since it's implicitly const already. llvm-svn: 216524	2014-08-27 05:25:00 +00:00
David Majnemer	54e97d5dc0	InstCombine: Optimize GEP's involving ptrtoint better We supported transforming: (gep i8* X, -(ptrtoint Y)) to: (inttoptr (sub (ptrtoint X), (ptrtoint Y))) However, this only fired if 'X' had type i8*. Generalize this to support various types of different sizes. This results in much better CodeGen, especially for pointers to packed structs. llvm-svn: 216523	2014-08-27 05:16:04 +00:00
Joerg Sonnenberger	cb5674b9c2	Revert r210342 and r210343, add test case for the crasher. PR 20642. llvm-svn: 216475	2014-08-26 19:06:41 +00:00
Dinesh Dwivedi	4919bbe29d	This patch enables SimplifyUsingDistributiveLaws() to handle the following patterns. (X >> Z) & (Y >> Z) -> (X&Y) >> Z for all shifts. (X >> Z) | (Y >> Z) -> (X|Y) >> Z for all shifts. (X >> Z) ^ (Y >> Z) -> (X^Y) >> Z for all shifts. These patterns were previously handled separately in visitAnd()/visitOr()/visitXor(). Differential Revision: http://reviews.llvm.org/D4951 llvm-svn: 216443	2014-08-26 08:53:32 +00:00
Reid Kleckner	3715461b48	musttail: Don't eliminate varargs packs if there is a forwarding call Also clean up and beef up this grep test for the feature. llvm-svn: 216425	2014-08-26 00:59:51 +00:00
Sanjay Patel	4e31cdabd1	fix typos in comments llvm-svn: 216424	2014-08-26 00:59:15 +00:00
Reid Kleckner	e6e88f99b3	ArgPromotion: Don't touch variadic functions Adding, removing, or changing non-pack parameters can change the ABI classification of pack parameters. Clang and other frontends encode the classification in the IR of the call site, but the callee side determines it dynamically based on the number of registers consumed so far. Changing the prototype in a way that affects the number of registers consumed would break such code. Dead argument elimination performs a similar task and already has a similar check to avoid this problem. Patch by Thomas Jablin! llvm-svn: 216421	2014-08-25 23:58:48 +00:00
Rafael Espindola	3fd1e9933f	Modernize raw_fd_ostream's constructor a bit. Take a StringRef instead of a "const char *". Take a "std::error_code &" instead of a "std::string &" for error. A create static method would be even better, but this patch is already a bit too big. llvm-svn: 216393	2014-08-25 18:16:47 +00:00
Bruno Cardoso Lopes	e2a1fa35df	Remove dangling initializers in GlobalDCE GlobalDCE deletes global vars and updates their initializers to nullptr while leaving underlying constants to be cleaned up later by its uses. The clean up may never happen, fix this by forcing it every time it's safe to destroy constants. Final patch by Rafael Espindola http://reviews.llvm.org/D4931 <rdar://problem/17523868> llvm-svn: 216390	2014-08-25 17:51:14 +00:00
Stepan Dyatkovskiy	c90308bf83	MergeFunctions, tiny refactoring: cmpAPFloat has been renamed to cmpAPFloats (multiple form). llvm-svn: 216376	2014-08-25 08:22:46 +00:00
Stepan Dyatkovskiy	7f895c1184	MergeFunctions, tiny refactoring: cmpAPInt has been renamed to cmpAPInts (multiple form). llvm-svn: 216375	2014-08-25 08:19:50 +00:00
Stepan Dyatkovskiy	0b765dee6e	MergeFunctions, tiny refactoring: cmpType has been renamed to cmpTypes (multiple form). llvm-svn: 216374	2014-08-25 08:16:39 +00:00
Stepan Dyatkovskiy	016daddc52	MergeFunctions, tiny refactoring: cmpGEP has been renamed to cmpGEPs (multiple form). llvm-svn: 216373	2014-08-25 08:12:45 +00:00
Karthik Bhat	7f33ff7dea	Allow vectorization of division by uniform power of 2. This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible. Updates the cost model for the Loop and SLP Vectorizers. The cost table is currently only updated for the X86 backend. Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971) llvm-svn: 216371	2014-08-25 04:56:54 +00:00
Craig Topper	4627679cec	Use range based for loops to avoid needing to re-mention SmallPtrSet size. llvm-svn: 216351	2014-08-24 23:23:06 +00:00
David Majnemer	0ffccf7fb5	InstCombine: Properly optimize or'ing bittests together CFE, with -O3, would turn: bool f(unsigned x) { bool a = x & 1; bool b = x & 2; return a | b; } into: %1 = lshr i32 %x, 1 %2 = or i32 %1, %x %3 = and i32 %2, 1 %4 = icmp ne i32 %3, 0 This sort of thing exposes a nasty pathology in GCC, ICC and LLVM. Instead, we would rather want: %1 = and i32 %x, 3 %2 = icmp ne i32 %1, 0 Things get a bit more interesting in the following case: %1 = lshr i32 %x, %y %2 = or i32 %1, %x %3 = and i32 %2, 1 %4 = icmp ne i32 %3, 0 Replacing it with the following sequence is better: %1 = shl nuw i32 1, %y %2 = or i32 %1, 1 %3 = and i32 %2, %x %4 = icmp ne i32 %3, 0 This sequence is preferable because %1 doesn't involve %x and could potentially be hoisted out of loops if it is invariant; only perform this transform in the non-constant case if we know we won't increase register pressure. llvm-svn: 216343	2014-08-24 09:10:57 +00:00
Jingyue Wu	ec33fa9aca	[SROA] Fold a PHI node if all its incoming values are the same Summary: Fixes PR20425. During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote. Test Plan: Added three more tests in phi-and-select.ll Reviewers: nlewycky, eliben, meheff, chandlerc Reviewed By: chandlerc Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits Differential Revision: http://reviews.llvm.org/D4659 llvm-svn: 216299	2014-08-22 22:45:57 +00:00
David Majnemer	49775e0173	InstCombine: Don't unconditionally preserve 'nuw' when shrinking constants Consider: %add = add nuw i32 %a, -16777216 %and = and i32 %add, 255 Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nuw' from the instruction. llvm-svn: 216273	2014-08-22 17:11:04 +00:00
David Majnemer	0e6c986696	InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MIN We can preserve nsw during this transform if -C won't overflow. llvm-svn: 216269	2014-08-22 16:41:23 +00:00
David Majnemer	42b83a5e36	InstCombine: Don't unconditionally preserve 'nsw' when shrinking constants Consider: %add = add nsw i32 %a, -16777216 %and = and i32 %add, 255 Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nsw' from the instruction. This fixes PR20377. llvm-svn: 216261	2014-08-22 07:56:32 +00:00
Erik Eckstein	b49d7abb7b	fix: SLPVectorizer crashes for unreachable blocks containing not schedulable instructions. In unreachable blocks it's legal to have instructions like "%x = op %x". Such instructions are not schedulable. Therefore the SLPVectorizer has to check for unreachable blocks and ignore them. Fixes bug 20646. llvm-svn: 216256	2014-08-22 01:18:39 +00:00
Peter Collingbourne	fab565a56b	[dfsan] Fix non-determinism bug in non-zero label check annotator. We now use a std::vector instead of a DenseSet to store the list of label checks so that we can iterate over it deterministically. llvm-svn: 216255	2014-08-22 01:18:18 +00:00
Reid Kleckner	c36f48f08a	SROA: Handle a case of store size being smaller than allocation size In this case, we are creating an x86_fp80 slice for a union from C where the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes, and that's just fine. We can't, however, use regular loads and stores to access the slice, because the store size is only 10 bytes / 80 bits. Instead, use memcpy and memset. Fixes PR18726. Reviewed By: chandlerc Differential Revision: http://reviews.llvm.org/D5012 llvm-svn: 216248	2014-08-22 00:09:56 +00:00
David Blaikie	2f3f76fdb1	Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks. Somewhat unnoticed in the original implementation of discriminators, but it could cause instructions to end up in new, small, DW_TAG_lexical_blocks due to the use of DILexicalBlock to track discriminator changes. Instead, use DILexicalBlockFile which we already use to track file changes without introducing new scopes, so it works well to track discriminator changes in the same way. llvm-svn: 216239	2014-08-21 22:45:21 +00:00
Rafael Espindola	7cebf36a95	Move some logic to populateLTOPassManager. This will avoid code duplication in the next commit which calls it directly from the gold plugin. llvm-svn: 216211	2014-08-21 20:03:44 +00:00
Rafael Espindola	216e0c0617	Respect LibraryInfo in populateLTOPassManager and use it. NFC. llvm-svn: 216203	2014-08-21 18:49:52 +00:00
Rafael Espindola	e07caad9e7	Handle inlining in populateLTOPassManager like in populateModulePassManager. No functionality change. llvm-svn: 216178	2014-08-21 13:35:30 +00:00
Zinovy Nis	33406da5f4	[CLNUP] Remove return after llvm_unreachable. Thanks to Hal Finkel for pointing this out. llvm-svn: 216176	2014-08-21 13:30:05 +00:00
Rafael Espindola	208bc533cd	Move DisableGVNLoadPRE from populateLTOPassManager to PassManagerBuilder. llvm-svn: 216174	2014-08-21 13:13:17 +00:00
Erik Verbruggen	2b98bd2a80	Reassociate x + -0.1234 * y into x - 0.1234 * y This does not require -ffast-math, and it gives CSE/GVN more options to eliminate duplicate expressions in, e.g.: return ((x + 0.1234 * y) * (x - 0.1234 * y)); Differential Revision: http://reviews.llvm.org/D4904 llvm-svn: 216169	2014-08-21 10:45:30 +00:00
Zinovy Nis	0a36cba29d	[INDVARS] Extend widening of induction variables to the cases of "sub nsw" and "mul nsw" instructions. Currently only "add nsw" instructions are widened. This patch eliminates tons of "sext" instructions for 64 bit code (and the corresponding target code) in cases like: int N = 100; float *A; void foo(int x0, int x1) { float * A_cur = &A[0][0]; float * A_next = &A[1][0]; for(int x = x0; x < x1; ++x) { // Currently only [x+N] case is widened. The other 2 cases lead to sext. // This patch fixes it, so all 3 cases do not need sext. const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N]; A_next[x] = div; } } ... > clang++ test.cpp -march=core-avx2 -Ofast -fno-unroll-loops -fno-tree-vectorize -S -o - Differential Revision: http://reviews.llvm.org/D4695 llvm-svn: 216160	2014-08-21 08:25:45 +00:00
Craig Topper	71b7b68b74	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 216158	2014-08-21 05:55:13 +00:00
David Majnemer	5d1aeba2ea	InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1 Adapted from a patch by Richard Smith, test-case written by me. llvm-svn: 216157	2014-08-21 05:14:48 +00:00
James Molloy	82c995d450	[LoopVectorizer] Limit unroll factor in the presence of nested reductions. If we have a scalar reduction, we can increase the critical path length if the loop we're unrolling is inside another loop. Limit, by default to 2, so the critical path only gets increased by one reduction operation. llvm-svn: 216140	2014-08-20 23:53:52 +00:00
Yi Jiang	1a4e73d7bf	New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain condition llvm-svn: 216135	2014-08-20 22:55:40 +00:00
David Majnemer	42158f3eea	InstCombine: Annotate sub with nuw when we prove it's safe We can prove that a 'sub' can be a 'sub nuw' if the left-hand side is negative and the right-hand side is non-negative. llvm-svn: 216045	2014-08-20 07:17:31 +00:00
Peter Collingbourne	f39430bd4a	[dfsan] Treat vararg custom functions like unimplemented functions. Because declarations of these functions can appear in places like autoconf checks, they have to be handled somehow, even though we do not support vararg custom functions. We do so by printing a warning and calling the uninstrumented function, as we do for unimplemented functions. llvm-svn: 216042	2014-08-20 01:40:23 +00:00
David Majnemer	57d5bc8849	InstCombine: Annotate sub with nsw when we prove it's safe We can prove that a 'sub' can be a 'sub nsw' under certain conditions: - The sign bits of the operands are the same. - Both operands have more than 1 sign bit. The subtraction cannot be a signed overflow in either case. llvm-svn: 216037	2014-08-19 23:36:30 +00:00
Renato Golin	06d601fb3e	Revert "Small refactor on VectorizerHint for deduplication" This reverts commit r215994 because MSVC 2012 can't cope with its C++11 goodness. llvm-svn: 215999	2014-08-19 18:08:50 +00:00
Renato Golin	dd6394d833	Small refactor on VectorizerHint for deduplication Previously, the hint mechanism relied on clean up passes to remove redundant metadata, which still showed up if running opt at low levels of optimization. That also has shown that multiple nodes of the same type, but with different values could still coexist, even if temporary, and cause confusion if the next pass got the wrong value. This patch makes sure that, if metadata already exists in a loop, the hint mechanism will never append a new node, but always replace the existing one. It also enhances the algorithm to cope with more metadata types in the future by just adding a new type, not a lot of code. llvm-svn: 215994	2014-08-19 17:30:43 +00:00
Mayur Pandey	960507beb4	InstCombine: ((A & ~B) ^ (~A & B)) to A ^ B Proof using CVC3 follows: $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR((A & ~B),(~A & B)) = BVXOR(A,B); $ cvc3 t.cvc Valid. Differential Revision: http://reviews.llvm.org/D4898 llvm-svn: 215974	2014-08-19 08:19:19 +00:00
Craig Topper	97ebe53032	Const-correct and prevent a copy of a SmallPtrSet. llvm-svn: 215973	2014-08-19 07:44:27 +00:00
Mayur Pandey	75b76c6a92	test commit (spelling correction) llvm-svn: 215970	2014-08-19 06:41:55 +00:00
Craig Topper	6230691c91	Revert "Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." Getting a weird buildbot failure that I need to investigate. llvm-svn: 215870	2014-08-18 00:24:38 +00:00
Craig Topper	5229cfd163	Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 215868	2014-08-17 23:47:00 +00:00
Owen Anderson	a4428aa484	Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to (select y, x, 0.0) when the multiply has fast math flags set. While this might seem like an obvious canonicalization, there is one subtle problem with it. The result of the original expression is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN. This means that the new expression is strictly more defined than the original one. One unfortunate consequence of this is that the transform is not reversible! It's always legal to increase the defined-ness of an expression, but it's not legal to reduce it. Thus, targets that prefer the original form of the expression cannot reverse the transform to recover it. Another way to think of it is that the transform has lost source-level information (the fast math flags), which is undesirable. llvm-svn: 215825	2014-08-17 03:51:29 +00:00
David Majnemer	1a0bbc8a5c	InstCombine: Fix a potential bug in 0 - (X sdiv C) -> (X sdiv -C) While most (X sdiv 1) operations will get caught by InstSimplify, it is still possible for a sdiv to appear in the worklist which hasn't been simplified yet. This means that it is possible for 0 - (X sdiv 1) to get transformed into (X sdiv -1); dividing by -1 can make the transform produce undef values instead of the proper result. Sorry for the lack of testcase, it's a bit problematic because it relies on the exact order of operations in the worklist. llvm-svn: 215818	2014-08-16 09:23:42 +00:00
David Majnemer	f9a095d606	InstCombine: Combine mul with div. We can combine a mul with a div if one of the operands is a multiple of the other: %mul = mul nsw nuw %a, C1 %ret = udiv %mul, C2 => %ret = mul nsw %a, (C1 / C2) This can expose further optimization opportunities if we end up multiplying or dividing by a power of 2. Consider this small example: define i32 @f(i32 %a) { %mul = mul nuw i32 %a, 14 %div = udiv exact i32 %mul, 7 ret i32 %div } which gets CodeGen'd to: imull $14, %edi, %eax imulq $613566757, %rax, %rcx shrq $32, %rcx subl %ecx, %eax shrl %eax addl %ecx, %eax shrl $2, %eax retq We can now transform this into: define i32 @f(i32 %a) { %shl = shl nuw i32 %a, 1 ret i32 %shl } which gets CodeGen'd to: leal (%rdi,%rdi), %eax retq This fixes PR20681. llvm-svn: 215815	2014-08-16 08:55:06 +00:00
Rafael Espindola	ea46c32f81	Introduce a helper to combine instruction metadata. Replace the old code in GVN and BBVectorize with it. Update SimplifyCFG to use it. Patch by Björn Steinbrink! llvm-svn: 215723	2014-08-15 15:46:38 +00:00
Hal Finkel	61c386126b	Copy noalias metadata from call sites to inlined instructions When a call site with noalias metadata is inlined, that metadata can be propagated directly to the inlined instructions (only those that might access memory because it is not useful on the others). Prior to inlining, the noalias metadata could express that a call would not alias with some other memory access, which implies that no instruction within that called function would alias. By propagating the metadata to the inlined instructions, we preserve that knowledge. This should complete the enhancements requested in PR20500. llvm-svn: 215676	2014-08-14 21:09:37 +00:00
Hal Finkel	d2dee16c27	Add noalias metadata for general calls (not just memory intrinsics) during inlining When preserving noalias function parameter attributes by adding noalias metadata in the inliner, we should do this for general function calls (not just memory intrinsics). The logic is very similar to what already existed (except that we want to add this metadata even for functions taking no relevant parameters). This metadata can be used by ModRef queries in the caller after inlining. This addresses the first part of PR20500. Adding noalias metadata during inlining is still turned off by default. llvm-svn: 215657	2014-08-14 16:44:03 +00:00
Chad Rosier	11ab941644	[Reassociation] Add support for reassociation with unsafe algebra. Vector instructions are (still) not supported for either integer or floating point. Hopefully, that work will be landed shortly. llvm-svn: 215647	2014-08-14 15:23:01 +00:00
David Majnemer	698dca0b95	InstCombine: ((A | ~B) ^ (~A | B)) to A ^ B Proof using CVC3 follows: $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR((A | ~B),(~A |B)) = BVXOR(A,B); $ cvc3 t.cvc Valid. Patch by Mayur Pandey! Differential Revision: http://reviews.llvm.org/D4883 llvm-svn: 215621	2014-08-14 06:46:25 +00:00
David Majnemer	f1eda23514	Added InstCombine Transform for ((B | C) & A) | B -> B | (A & C) Transform ((B | C) & A) | B --> B | (A & C) Z3 Link: http://rise4fun.com/Z3/hP6p Patch by Sonam Kumari! Differential Revision: http://reviews.llvm.org/D4865 llvm-svn: 215619	2014-08-14 06:41:38 +00:00
Jan Vesely	0cd3ec6cfa	utils: Fix segfault in flattencfg v2: continue iterating through the rest of the bb use for loop v3: initialize FlattenCFG pass in ScalarOps add test v4: split off initializing flattencfg to a separate patch add comment Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215574	2014-08-13 20:31:53 +00:00
Jan Vesely	5a956d49f7	Initialize FlattenCFG pass Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215573	2014-08-13 20:31:52 +00:00
Benjamin Kramer	a7c40ef022	Canonicalize header guards into a common format. Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558	2014-08-13 16:26:38 +00:00
Chandler Carruth	0fb998110a	[optnone] Make the optnone attribute effective at suppressing function attribute and function argument attribute synthesizing and propagating. As with the other uses of this attribute, the goal remains a best-effort (no guarantees) attempt to not optimize the function or assume things about the function when optimizing. This is particularly useful for compiler testing, bisecting miscompiles, triaging things, etc. I was hitting specific issues using optnone to isolate test code from a test driver for my fuzz testing, and this is one step of fixing that. llvm-svn: 215538	2014-08-13 10:49:33 +00:00
Chandler Carruth	3f92ecc2a0	Revert r215415 which causes MSan to crash on a great deal of C++ code. I've followed up on the original commit as well. llvm-svn: 215532	2014-08-13 09:19:39 +00:00
Karthik Bhat	a4a4db91be	InstCombine: Combine (xor (or %a, %b) (xor %a, %b)) to (and %a, %b) Correctness proof of the transform using CVC3- $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR(A | B, BVXOR(A,B) ) = A & B; $ cvc3 t.cvc Valid. llvm-svn: 215524	2014-08-13 05:13:14 +00:00
Matt Arsenault	4815f09bbe	Allow bitcast + struct GEP transform to work with addrspacecast llvm-svn: 215467	2014-08-12 19:46:13 +00:00
Reid Kleckner	3ae6e1528a	msan: Handle musttail calls First, avoid calling setTailCall(false) on musttail calls. The function prototypes should be "congruent", so the shadow layout should be exactly the same. Second, avoid inserting instrumentation after a musttail call to propagate the return value shadow. We don't need to propagate the result of a tail call, it should already be in the right place. Reviewed By: eugenis Differential Revision: http://reviews.llvm.org/D4331 llvm-svn: 215415	2014-08-12 00:12:43 +00:00
Reid Kleckner	e31acf239a	Move helper for getting a terminating musttail call to BasicBlock No functional change. To be used in future commits that need to look for such instructions. Reviewed By: rafael Differential Revision: http://reviews.llvm.org/D4504 llvm-svn: 215413	2014-08-12 00:05:15 +00:00
David Majnemer	ab07f00c64	InstCombine: Combine (add (and %a, %b) (or %a, %b)) to (add %a, %b) What follows below is a correctness proof of the transform using CVC3. $ < t.cvc A, B : BITVECTOR(32); QUERY BVPLUS(32, A & B, A | B) = BVPLUS(32, A, B); $ cvc3 < t.cvc Valid. llvm-svn: 215400	2014-08-11 22:32:02 +00:00
James Molloy	65b08f5e46	[LoopVectorizer] Enable support for floating-point subtraction reductions llvm-svn: 215200	2014-08-08 12:41:08 +00:00
David Majnemer	fe8c7540b0	GlobalOpt: Optimize in the face of insertvalue/extractvalue GlobalOpt didn't know how to simulate InsertValueInst or ExtractValueInst. Optimizing these is pretty straightforward. N.B. This came up when looking at clang's IRGen for MS ABI member pointers; they are represented as aggregates. llvm-svn: 215184	2014-08-08 05:50:43 +00:00
Gerolf Hoflehner	ea96a3d336	Fix for multi-line comment warning llvm-svn: 215169	2014-08-07 23:19:55 +00:00
Arnold Schwaighofer	4fb3c47456	SLPVectorizer: Use the type of the value loaded/stored to get the ABI alignment We were using the pointer type which is incorrect. llvm-svn: 215162	2014-08-07 22:47:27 +00:00
Owen Anderson	6c19ab1b5d	Fix a case in SROA where lifetime intrinsics could inhibit alloca promotion. In this case, the code path dealing with vector promotion was missing the explicit checks for lifetime intrinsics that were present on the corresponding integer promotion path. llvm-svn: 215148	2014-08-07 21:07:35 +00:00
Rui Ueyama	c487f7728e	Revert "r214897 - Remove dead zero store to calloc initialized memory" It broke msan. llvm-svn: 214989	2014-08-06 19:30:38 +00:00
James Molloy	568da0990e	Add a new option -run-slp-after-loop-vectorization. This swaps the order of the loop vectorizer and the SLP/BB vectorizers. It is disabled by default so we can do performance testing - ideally we want to change to having the loop vectorizer running first, and the SLP vectorizer using its leftovers instead of the other way around. llvm-svn: 214963	2014-08-06 12:56:19 +00:00
Peter Collingbourne	df240b252a	[dfsan] Try not to create too many additional basic blocks in functions which already have a large number of blocks. Works around a performance issue with the greedy register allocator. llvm-svn: 214944	2014-08-06 00:33:40 +00:00

11889 Commits
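
Several InstCombine entries above (r215621, r215619, r215524, r215400) justify a rewrite by quoting a CVC3 or Z3 query over 32-bit bitvectors. As a rough sanity check only, and not a substitute for those solver proofs, the same identities can be brute-forced at a small bit width. The sketch below is not LLVM code; the names BITS, MASK, inv, IDENTITIES and holds are hypothetical, and the 4-bit width is an assumption chosen to keep the search small while still exercising carries in the addition case.

```python
from itertools import product

BITS = 4                 # assumed small width; the commits prove the 32-bit case with a solver
MASK = (1 << BITS) - 1   # keeps results inside BITS-wide unsigned values


def inv(x):
    """Bitwise NOT restricted to BITS bits."""
    return ~x & MASK


# Each entry maps an svn revision from the log to (left-hand side, right-hand side)
# exactly as written in that commit's solver query.
IDENTITIES = {
    "r215621": (lambda a, b, c: (a | inv(b)) ^ (inv(a) | b),
                lambda a, b, c: a ^ b),
    "r215619": (lambda a, b, c: ((b | c) & a) | b,
                lambda a, b, c: b | (a & c)),
    "r215524": (lambda a, b, c: (a | b) ^ (a ^ b),
                lambda a, b, c: a & b),
    "r215400": (lambda a, b, c: ((a & b) + (a | b)) & MASK,
                lambda a, b, c: (a + b) & MASK),
}


def holds(lhs, rhs):
    """Return True if lhs and rhs agree for every BITS-wide triple (a, b, c)."""
    values = range(1 << BITS)
    return all(lhs(a, b, c) == rhs(a, b, c)
               for a, b, c in product(values, repeat=3))


if __name__ == "__main__":
    for rev, (lhs, rhs) in IDENTITIES.items():
        print(rev, "holds" if holds(lhs, rhs) else "FAILS")
```

The purely bitwise identities act on each bit independently, so any width gives the same verdict for them; only the r215400 case depends on carries, which is why a width above one bit is used.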