llvm-project

Commit Graph

Author	SHA1	Message	Date
Hal Finkel	081eaef6fa	Add a runtime unrolling parameter to the LoopUnroll pass constructor As with the other loop unrolling parameters (the unrolling threshold, partial unrolling, etc.) runtime unrolling can now also be controlled via the constructor. This will be necessary for moving non-trivial unrolling late in the pass manager (after loop vectorization). No functionality change intended. llvm-svn: 194027	2013-11-05 00:08:03 +00:00
Shuxin Yang	d1382b6c31	Remove dead code llvm-svn: 194017	2013-11-04 21:44:01 +00:00
Benjamin Kramer	9e7f7c7fdb	SLPVectorizer: Use properlyDominates to satisfy the irreflexivity of a strict weak ordering. STL debug mode checks this. llvm-svn: 194015	2013-11-04 21:34:55 +00:00
Matt Arsenault	243140f2fd	Scalarize select vector arguments when extracted. When the elements are extracted from a select on vectors or a vector select, do the select on the extracted scalars from the input if there is only one use. llvm-svn: 194013	2013-11-04 20:36:06 +00:00
Benjamin Kramer	191ba00b83	SLPVectorizer: Add a missing pair of parens. No functionality change. llvm-svn: 193958	2013-11-03 12:54:32 +00:00
Benjamin Kramer	91e8f3c348	SLPVectorizer: When CSEing generated gathers only scan blocks containing them. Instead of doing a RPO traversal of the whole function remember the blocks containing gathers (typically <= 2) and scan them in dominator-first order. The actual CSE is still quadratic, but I'm not confident that adding a scoped hash table here is worth it as we're only looking at the generated instructions and not arbitrary code. llvm-svn: 193956	2013-11-03 12:27:52 +00:00
David Majnemer	120f4a06fd	Revert "Inliner: Handle readonly attribute per argument when adding memcpy" This reverts commit r193356, it caused PR17781. A reduced test case covering this regression has been added to the test suite. llvm-svn: 193955	2013-11-03 12:22:13 +00:00
David Majnemer	927df85de0	Spell "Actual" correctly llvm-svn: 193954	2013-11-03 11:09:39 +00:00
Bob Wilson	d8d92d90fa	Convert calls to __sinpi and __cospi into __sincospi_stret This adds a SimplifyLibCalls case which converts the special __sinpi and __cospi (float & double variants) into a __sincospi_stret where appropriate to remove duplicated work. Patch by Tim Northover llvm-svn: 193943	2013-11-03 06:48:38 +00:00
Benjamin Kramer	089c1e4f6d	SLPVectorizer: Remove duplicated function. llvm-svn: 193927	2013-11-02 14:46:27 +00:00
Benjamin Kramer	568a1cd9df	LoopVectorize: Remove quadratic behavior of the local CSE. Doing this with a hash map doesn't change behavior and avoids calling isIdenticalTo O(n^2) times. This should probably eventually move into a utility class shared with EarlyCSE and the limited CSE in the SLPVectorizer. llvm-svn: 193926	2013-11-02 13:39:00 +00:00
Arnold Schwaighofer	d0789cdffe	LoopVectorizer: Move cse code into its own function llvm-svn: 193895	2013-11-01 23:28:54 +00:00
Arnold Schwaighofer	a846a7f8f0	LoopVectorizer: Perform redundancy elimination on induction variables When the loop vectorizer was part of the SCC inliner pass manager, gvn would run after the loop vectorizer, followed by instcombine. This way redundancy (multiple uses) was removed and instcombine could perform scalarization on the induction variables. Having moved the loop vectorizer to later, we no longer run any form of redundancy elimination before we perform instcombine. This caused vectorized induction variables to survive that did not before. On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops. This should also help lpbench and paq8p. I ran a Release (without Asserts) build over the test-suite and did not see any negative impact on compile time. radar://15339680 llvm-svn: 193891	2013-11-01 22:18:19 +00:00
Benjamin Kramer	1fbcdca9e3	LoopVectorize: Look for consecutive access in GEPs with trailing zero indices If we have a pointer to a single-element struct we can still build wide loads and stores to it (if there is no padding). llvm-svn: 193860	2013-11-01 14:09:50 +00:00
Arnold Schwaighofer	70a4665f55	LoopVectorizer: If dependency checks fail try runtime checks When a dependence check fails we can still try to vectorize loops with runtime array bounds checks. This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the scalar performance on a corei7-avx. radar://15339680 llvm-svn: 193853	2013-11-01 03:05:07 +00:00
Arnold Schwaighofer	1ca922e296	LoopVectorizer: Clear all member data structures in RuntimeCheck.reset() Clear all data structures when resetting the RuntimeCheck data structure. No test case. This was exposed by an upcoming change. llvm-svn: 193852	2013-11-01 03:05:04 +00:00
Manman Ren	87a2adc7fe	Do not convert "call asm" to "invoke asm" in Inliner. Given that backend does not handle "invoke asm" correctly ("invoke asm" will be handled by SelectionDAGBuilder::visitInlineAsm, which does not have the right setup for LPadToCallSiteMap) and we already made the assumption that inline asm does not throw in InstCombiner::visitCallSite, we are going to make the same assumption in Inliner to make sure we don't convert "call asm" to "invoke asm". If it becomes necessary to add support for "invoke asm" later on, we will need to modify the backend as well as remove the assumptions that inline asm does not throw. Fix rdar://15317907 llvm-svn: 193808	2013-10-31 21:56:03 +00:00
Rafael Espindola	282a47037b	Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list". There are two ways one could implement hiding of linkonce_odr symbols in LTO: * LLVM tells the linker which symbols can be hidden if not used from native files. * The linker tells LLVM which symbols are not used from other object files, but will be put in the dso symbol table if present. GOLD's API is the second option. It was implemented almost 1:1 in llvm by passing the list down to internalize. LLVM already had partial support for the first option. It is also very similar to how ld64 handles hiding these symbols when not doing LTO. This patch then * removes the APIs for the DSO list. * marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr global values and other linkonce_odr whose address is not used. * makes the gold plugin responsible for handling the API mismatch. llvm-svn: 193800	2013-10-31 20:51:58 +00:00
Rafael Espindola	6554e5a94d	Merge CallGraph and BasicCallGraph. llvm-svn: 193734	2013-10-31 03:03:55 +00:00
Matt Arsenault	38b8ecf378	Teach scalarrepl about address spaces llvm-svn: 193720	2013-10-30 22:54:58 +00:00
Matt Arsenault	614ea99da7	Fix GVN creating bitcast between address spaces llvm-svn: 193710	2013-10-30 19:05:41 +00:00
Arnold Schwaighofer	77af0f6e82	ARM cost model: Account for zero cost scalar SROA instructions By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 llvm-svn: 193573	2013-10-29 01:33:53 +00:00
Arnold Schwaighofer	86252451c4	SLPVectorizer: Use vector type for vectorized memory operations No test case, because with the current cost model we don't see a difference. An upcoming ARM memory cost model change will expose and test this bug. radar://15332579 llvm-svn: 193572	2013-10-29 01:33:50 +00:00
Shuxin Yang	2e1890e18b	Revert r193251 : Use address-taken to disambiguate global variable and indirect memops. llvm-svn: 193489	2013-10-27 03:08:44 +00:00
Wan Xiaofei	be640b28c0	Quick look-up for block in loop. This patch implements quick look-up for block in loop by maintaining a hash set for blocks. It improves the efficiency of loop analysis a lot, the biggest improvement could be 5-6% (458.sjeng). Below are the compilation times for our benchmark in llc before & after the patch. Benchmark: llc - trunk | llc - patched; 401.bzip2: 0.339081 100.00% | 0.329657 102.86%; 403.gcc: 19.853966 100.00% | 19.605466 101.27%; 429.mcf: 0.049823 100.00% | 0.048451 102.83%; 433.milc: 0.514898 100.00% | 0.510217 100.92%; 444.namd: 1.109328 100.00% | 1.103481 100.53%; 445.gobmk: 4.988028 100.00% | 4.929114 101.20%; 456.hmmer: 0.843871 100.00% | 0.825865 102.18%; 458.sjeng: 0.754238 100.00% | 0.714095 105.62%; 464.h264ref: 2.9668 100.00% | 2.90612 102.09%; 471.omnetpp: 4.556533 100.00% | 4.511886 100.99%; bitmnp01: 0.038168 100.00% | 0.0357 106.91%; idctrn01: 0.037745 100.00% | 0.037332 101.11%; libquake2: 3.78689 100.00% | 3.76209 100.66%; libquake_: 2.251525 100.00% | 2.234104 100.78%; linpack: 0.033159 100.00% | 0.032788 101.13%; matrix01: 0.045319 100.00% | 0.043497 104.19%; nbench: 0.333161 100.00% | 0.329799 101.02%; tblook01: 0.017863 100.00% | 0.017666 101.12%; ttsprk01: 0.054337 100.00% | 0.053057 102.41%. Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov> Approver : Andrew Trick <atrick@apple.com> Test : Pass make check-all & llvm test-suite llvm-svn: 193460	2013-10-26 03:08:02 +00:00
Andrew Trick	57243da70f	Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop. Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) When SCEV expands a recurrence outside of a loop it attempts to scale by the stride of the recurrence. Chained recurrences don't work that way. We could compute binomial coefficients, but would have to guarantee that the chained AddRec's are in a perfectly reduced form. llvm-svn: 193438	2013-10-25 21:35:56 +00:00
Rafael Espindola	7749d7ccc7	Handle calls and invokes in GlobalStatus. This patch teaches GlobalStatus to analyze a call that uses the global value as a callee, not as an argument. With this change internalize can handle the common use of linkonce_odr functions. This reduces the number of linkonce_odr functions in a LTO build of clang (checked with the emit-llvm gold plugin option) from 1730 to 60. llvm-svn: 193436	2013-10-25 21:29:52 +00:00
Hal Finkel	02f562df43	LoopVectorizer: Don't attempt to vectorize extractelement instructions The loop vectorizer does not currently understand how to vectorize extractelement instructions. The existing check, which excluded all vector-valued instructions, did not catch extractelement instructions because it checked only the return value. As a result, vectorization would proceed, producing illegal instructions like this: %58 = extractelement <2 x i32> %15, i32 0; %59 = extractelement i32 %58, i32 0; where the second extractelement is illegal because its first operand is not a vector. llvm-svn: 193434	2013-10-25 20:40:15 +00:00
Tom Stellard	bc7d87f07c	Inliner: Handle readonly attribute per argument when adding memcpy Patch by: Vincent Lejeune llvm-svn: 193356	2013-10-24 16:38:33 +00:00
Renato Golin	1ba143e140	Mark vector loops as already vectorized Make sure we mark all loops (scalar and vector) when vectorizing, so that we don't try to vectorize them anymore. Also, set unroll to 1, since this is what we check for on early exit. llvm-svn: 193349	2013-10-24 14:50:51 +00:00
Nuno Lopes	340b0463e6	fix PR17635: false positive with packed structures LLVM optimizers may widen accesses to packed structures that overflow the structure itself, but should be in bounds up to the alignment of the object llvm-svn: 193317	2013-10-24 09:17:24 +00:00
Juergen Ributzka	d04d096ecf	Fix a bug in LinearFunctionTestReplace that created invalid loop exit checks. Reviewed by Andy llvm-svn: 193303	2013-10-24 05:29:56 +00:00
Andrew Trick	ada2356ac9	Clarify comments in genLoopLimit. llvm-svn: 193292	2013-10-24 00:43:38 +00:00
Yuchen Wu	3197b25b27	Fixed comment typo in GCOVProfiling.cpp llvm-svn: 193268	2013-10-23 20:35:00 +00:00
Shuxin Yang	e4fb375995	Use address-taken to disambiguate global variable and indirect memops. Major steps include: 1). introduces a not-addr-taken bit-field in GlobalVariable 2). GlobalOpt pass sets "not-address-taken" if it proves a global variable doesn't have its address taken. 3). AA uses this info for disambiguation. llvm-svn: 193251	2013-10-23 17:28:19 +00:00
Eric Christopher	874fa0f6c7	Fix spelling, grammar, and match naming convention for test files. llvm-svn: 193130	2013-10-21 23:14:06 +00:00
Tom Stellard	e1631ddf93	SimplifyCFG: Don't duplicate calls to functions marked noduplicate v2 v2: - Use CI->cannotDuplicate() llvm-svn: 193115	2013-10-21 20:07:30 +00:00
Matt Arsenault	404c60a7c3	Use more type helper functions llvm-svn: 193109	2013-10-21 19:43:56 +00:00
Matt Arsenault	fa64659bd8	Teach SimplifyCFG about address spaces llvm-svn: 193104	2013-10-21 18:55:08 +00:00
Rafael Espindola	3d7fc25c7c	Optimize more linkonce_odr values during LTO. When a linkonce_odr value that is on the dso list is not unnamed_addr we can still look to see if anything is actually using its address. If not, it is safe to hide it. This patch implements that by moving GlobalStatus to Transforms/Utils and using it in Internalize. llvm-svn: 193090	2013-10-21 17:14:55 +00:00
Michael Gottesman	63c63ac21e	Fix the predecessor removal logic in r193045. Additionally some small comment/stylistic fixes are included as well. llvm-svn: 193068	2013-10-21 05:20:11 +00:00
Bill Wendling	90dd90afcb	Don't eliminate a partially redundant load if it's in a landing pad. A landing pad can be jumped to only by the unwind edge of an invoke instruction. If we eliminate a partially redundant load in a landing pad, it will create a basic block that violates this constraint. It then leads to other problems down the line if it tries to merge that basic block with the landing pad. Avoid this by not eliminating the load in a landing pad. PR17621 llvm-svn: 193064	2013-10-21 04:09:17 +00:00
Michael Gottesman	c024f3258a	Teach simplify-cfg how to correctly create covered lookup tables for switches on iN with N >= 3. One optimization simplify-cfg performs is the conversion of switches to lookup tables if the switch has > 4 cases. This is done by: 1. Finding the max/min case value and calculating the switch case range. 2. Creating a lookup table basic block. 3. Performing a check in the switch's BB to see if the input value is in the switch's case range. If the input value satisfies said predicate, branch to the lookup table BB; otherwise branch to the switch's default destination BB using the default value as the result. The conditional check consists of subtracting the min case value of the table from any input iN value and then ensuring that said value is unsigned less than the size of the lookup table represented as an iN value. If the lookup table is a covered lookup table, the size of the table will be 2^N, which is 0 as an iN value. Thus the comparison will be an `icmp ult` of an iN value against 0, which is always false, yielding the incorrect result. This patch fixes this problem by recognizing if we have a covered lookup table and, if we do, unconditionally jumping to the lookup table BB, since the covering property of the lookup table implies that every input value can be handled by said BB. rdar://15268442 llvm-svn: 193045	2013-10-20 07:04:37 +00:00
Bill Wendling	4fea22c63b	Perform an intelligent splice of the predecessor with the single successor. If the predecessor's being spliced into a landing pad, then we need the PHIs to come first and the rest of the predecessor's code to come after the landing pad instruction. llvm-svn: 193035	2013-10-19 11:27:12 +00:00
Nadav Rotem	7f27e0b0ce	Mark some command line flags as hidden llvm-svn: 193013	2013-10-18 23:38:13 +00:00
Rafael Espindola	045a78fa7e	Rename fields of GlobalStatus to match the coding style. llvm-svn: 192910	2013-10-17 18:18:52 +00:00
Rafael Espindola	27797baee7	rename SafeToDestroyConstant to isSafeToDestroyConstant and clang-format. llvm-svn: 192907	2013-10-17 18:06:32 +00:00
Rafael Espindola	026c9cbefe	Simplify the interface of AnalyzeGlobal a bit and rename to analyzeGlobal. No functionality change. llvm-svn: 192906	2013-10-17 18:00:25 +00:00
Evgeniy Stepanov	21a9c93a4d	[msan] Use zero-extension in shadow cast by default. Switch to sign-extension in r192575 caused 7% perf loss on 482.sphinx3. llvm-svn: 192882	2013-10-17 10:53:50 +00:00
Dmitry Vyukov	b1ad5720a2	tsan: implement no_sanitize_thread attribute If a function has no_sanitize_thread attribute, do not instrument memory accesses in it. llvm-svn: 192871	2013-10-17 07:20:06 +00:00
Arnold Schwaighofer	a66582470b	SLPVectorizer: Don't vectorize volatile memory operations radar://15231682 Reapply r192799, http://lab.llvm.org:8011/builders/lldb-x86_64-debian-clang/builds/8226 showed that the bot is still broken even with this out. llvm-svn: 192820	2013-10-16 17:52:40 +00:00
Arnold Schwaighofer	06a0324f6a	Revert "SLPVectorizer: Don't vectorize volatile memory operations" This speculatively reverts commit 192799. It might have broken a linux buildbot. llvm-svn: 192816	2013-10-16 17:19:40 +00:00
Arnold Schwaighofer	5078ea2bd9	SLPVectorizer: Don't vectorize volatile memory operations radar://15231682 llvm-svn: 192799	2013-10-16 16:09:00 +00:00
Kostya Serebryany	d3d23bec66	[asan] Optimize accesses to global arrays with constant index Summary: Given a global array G[N], which is declared in this CU and has a static initializer, avoid instrumenting accesses like G[i], where 'i' is a constant and 0<=i<N. Also add a bit of stats. This eliminates ~1% of instrumentations on SPEC2006 and also partially helps when asan is being run together with coverage. Reviewers: samsonov Reviewed By: samsonov CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1947 llvm-svn: 192794	2013-10-16 14:06:14 +00:00
Benjamin Kramer	c97850be76	LoopVectorize: Properly reflect PODness in comments. llvm-svn: 192717	2013-10-15 16:19:54 +00:00
Craig Topper	ef9e993eaa	Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext. llvm-svn: 192672	2013-10-15 05:20:47 +00:00
Rafael Espindola	8c1d78ad51	Remove lib/Transforms/Instrumentation/ProfilingUtils.* They were leftover from the old profiling support. Patch by Alastair Murray. llvm-svn: 192605	2013-10-14 16:46:46 +00:00
Chris Lattner	94fc4bed1f	Basic blocks typically have few predecessors. Use a SmallDenseMap to avoid a heap allocation when this is the case. llvm-svn: 192602	2013-10-14 16:05:55 +00:00
Evgeniy Stepanov	be83d8f693	[msan] Instrument x86._cvt intrinsics. Currently MSan checks that arguments of cvt intrinsics are fully initialized. That's too much to ask: some of them only operate on lower half, or even quarter, of the input register. llvm-svn: 192599	2013-10-14 15:16:25 +00:00
Evgeniy Stepanov	9b5517b127	[msan] Fix handling of scalar select of vectors. llvm-svn: 192575	2013-10-14 09:52:09 +00:00
Arnold Schwaighofer	58864d2d5f	SLPVectorizer: Sort PHINodes based on their opcode Before this patch we relied on the order of phi nodes when we looked for phi nodes of the same type. This could prevent vectorization of cases where there was a phi node of a second type in between phi nodes of some type. This is important for vectorization of an internal graphics kernel. On the test suite + external on x86_64 (and on a run on armv7s) it showed no impact on either performance or compile time. radar://15024459 llvm-svn: 192537	2013-10-12 18:56:27 +00:00
Tobias Grosser	5cff1e2d78	LoopVectorize: Add missing INITIALIZE_PASS_DEPENDENCY macros Contributed-by: Peter Zotov <whitequark@whitequark.org> llvm-svn: 192536	2013-10-12 18:29:15 +00:00
Renato Golin	dd943a8919	Better info when debugging vectorizer llvm-svn: 192460	2013-10-11 16:14:39 +00:00
Shuxin Yang	1cab418ce2	Fix a bug in Dead Argument Elimination. If a function seen at compile time is not necessarily the one linked to the binary being built, it is illegal to change the actual arguments passed to it. e.g. void foo(int lol) { /* foo() has linkage satisfying isWeakForLinker(); "lol" is not used at all. */ } void bar(int lol2) { /* xform to foo(undef) is illegal, as the compiler does not know which instance of foo() will be linked to the binary being built. */ foo(lol2); } Such functions can be captured by isWeakForLinker(). NOTE that mayBeOverridden() is insufficient for this purpose as it doesn't include linkage types like AvailableExternallyLinkage and LinkOnceODRLinkage. Take link_odr* as an example, it indicates a set of EQUIVALENT globals that can be merged at link-time. However, the semantics of EQUIVALENT-functions include parameters. Changing parameters breaks the assumption. Thanks to John McCall for the help, especially for the explanation of the subtle difference between linkage types. rdar://11546243 llvm-svn: 192302	2013-10-09 17:21:44 +00:00
Arnold Schwaighofer	0caddfc731	LoopVectorize: External uses must use the last value in a reduction cycle Otherwise, we don't perform operations that would have been performed on the scalar version. Fixes PR17498. llvm-svn: 192133	2013-10-07 21:05:43 +00:00
Alexey Samsonov	a1944e6d26	Revert r191834 until we measure the effect of this on benchmarks and maybe find a better way to fix it llvm-svn: 192121	2013-10-07 19:03:24 +00:00
Hal Finkel	f5a3eaea55	UpdatePHINodes in BasicBlockUtils should not crash on duplicate predecessors UpdatePHINodes has an optimization to reuse an existing PHI node, where it first deletes all of its entries and then replaces them. Unfortunately, in the case where we had duplicate predecessors (which are allowed so long as the associated PHI entries have the same value), the loop removing the existing PHI entries from the to-be-reused PHI would assert (if that PHI was not the one which had the duplicates). llvm-svn: 192001	2013-10-04 23:41:05 +00:00
Arnold Schwaighofer	698d4ac8a8	SLPVectorizer: Sort inputs to commutative binary operations Sort the operands of the other entries in the current vectorization root according to the first entry's operands' opcodes. %conv0 = uitofp ...; %load0 = load float ...; = fmul %conv0, %load0; = fmul %load0, %conv1; = fmul %load0, %conv2. Make sure that we recursively vectorize <%conv0, %conv1, %conv2> and <%load0, %load0, %load0>. This makes it more likely to obtain vectorizable trees. We have to be careful when we sort that we don't destroy 'good' existing ordering implied by source order. radar://15080067 llvm-svn: 191977	2013-10-04 20:39:16 +00:00
Owen Anderson	5797bfd4a3	Pull fptrunc's upwards through selects when one of the select's selectands was a constant. This has a number of benefits, including producing small immediates (easier to materialize, smaller constant pools) as well as being more likely to allow the fptrunc to fuse with a preceding instruction (truncating selects are unusual). llvm-svn: 191929	2013-10-03 21:08:05 +00:00
Rafael Espindola	cda2911caa	Optimize linkonce_odr unnamed_addr functions during LTO. Generalize the API so we can distinguish symbols that are needed just for a DSO symbol table from those that are used from some native .o. The symbols that are only wanted for the dso symbol table can be dropped if llvm can prove every other dso has a copy (linkonce_odr) and the address is not important (unnamed_addr). llvm-svn: 191922	2013-10-03 18:29:09 +00:00
Matt Arsenault	bfa37e546d	Make gep i8* X, -(ptrtoint Y) transform work with address spaces llvm-svn: 191920	2013-10-03 18:15:57 +00:00
Matt Arsenault	0be1cb1c7b	Don't use runtime bounds check between address spaces. Don't vectorize with a runtime check if it requires a comparison between pointers with different address spaces. The values can't be assumed to be directly comparable. Previously it would create an illegal bitcast. llvm-svn: 191862	2013-10-02 22:38:17 +00:00
Yi Jiang	8fd1a806d5	Apply slp vectorization on fully-vectorizable tree of height 2 llvm-svn: 191852	2013-10-02 20:20:39 +00:00
Matt Arsenault	39d592fe48	Fix debug printing spacing. Fix missing newlines, missing and extra spaces in printed messages. llvm-svn: 191851	2013-10-02 20:04:29 +00:00
Matt Arsenault	cccbe16785	Fix comment grammar and capitalization. llvm-svn: 191850	2013-10-02 20:04:26 +00:00
Benjamin Kramer	b9add84ef6	SLPVectorizer: Make store chain finding more aggressive with GetUnderlyingObject. This recursively strips all GEPs like the existing code. It also handles bitcasts and other operations that do not change the pointer value. llvm-svn: 191847	2013-10-02 19:06:06 +00:00
Tom Stellard	d3e916eb6a	StructurizeCFG: Add dependency on LowerSwitch pass Switch instructions were crashing the StructurizeCFG pass, and it's probably easier anyway if we don't need to handle them in this pass. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 191841	2013-10-02 17:04:59 +00:00
Chandler Carruth	ea56494625	Remove the very substantial, largely unmaintained legacy PGO infrastructure. This was essentially work toward PGO based on a design that had several flaws, partially dating from a time when LLVM had a different architecture, and with an effort to modernize it abandoned without being completed. Since then, it has bitrotted for several years further. The result is nearly unusable, and isn't helping any of the modern PGO efforts. Instead, it is getting in the way, adding confusion about PGO in LLVM and distracting everyone with maintenance on essentially dead code. Removing it paves the way for modern efforts around PGO. Among other effects, this removes the last of the runtime libraries from LLVM. Those are being developed in the separate 'compiler-rt' project now, with somewhat different licensing specifically more appropriate for runtimes. llvm-svn: 191835	2013-10-02 15:42:23 +00:00
Alexey Samsonov	31540172d0	Remove "localize global" optimization Summary: As discussed in http://llvm-reviews.chandlerc.com/D1754, this optimization isn't really valid for C, and fires too rarely anyway. Reviewers: rafael, nicholas Reviewed By: nicholas CC: rnk, llvm-commits, nicholas Differential Revision: http://llvm-reviews.chandlerc.com/D1769 llvm-svn: 191834	2013-10-02 15:31:34 +00:00
Matt Arsenault	517d84e268	Don't merge tiny functions. It's silly to merge functions like these: define void @foo(i32 %x) { ret void } define void @bar(i32 %x) { ret void } to get define void @bar(i32) { tail call void @foo(i32 %0) ret void } llvm-svn: 191786	2013-10-01 18:05:30 +00:00
Rafael Espindola	44fee4e0eb	Remove several unused variables. Patch by Alp Toker. llvm-svn: 191757	2013-10-01 13:32:03 +00:00
Matt Arsenault	5ea37f8d89	Fix code duplication llvm-svn: 191716	2013-10-01 00:01:14 +00:00
Matt Arsenault	8468062c6e	Use right address space size in InstCombineCompares The test's output doesn't change, but this ensures this is actually hit with a different address space. llvm-svn: 191701	2013-09-30 21:11:01 +00:00
Matt Arsenault	06adecabe7	Constant fold ptrtoint + compare with address spaces llvm-svn: 191699	2013-09-30 21:06:18 +00:00
Benjamin Kramer	f00472908a	BoundsChecking: Fix refacto. llvm-svn: 191676	2013-09-30 15:52:50 +00:00
Benjamin Kramer	6e931528fe	Convert manual insert point restores to the new RAII object. llvm-svn: 191675	2013-09-30 15:40:17 +00:00
Benjamin Kramer	6748576a0d	InstCombine: Replace manual fast math flag copying with the new IRBuilder RAII helper. Defines away the issue where cast<Instruction> would fail because constant folding happened. Also slightly cleaner. llvm-svn: 191674	2013-09-30 15:39:59 +00:00
Benjamin Kramer	d36f1abefd	IRBuilder: Add RAII objects to reset insertion points or fast math flags. Inspired by the object from the SLPVectorizer. This found a minor bug in the debug loc restoration in the vectorizer where the location of a following instruction was attached instead of the location from the original instruction. llvm-svn: 191673	2013-09-30 15:39:48 +00:00
Joey Gouly	d51a35c6a0	Fix a bug in InstCombine where it attempted to cast a Value* to an Instruction* when it was actually a Constant*. There are quite a few other casts to Instruction that might have the same problem, but this is the only one I have a test case for. llvm-svn: 191668	2013-09-30 14:18:35 +00:00
Robert Wilhelm	2788d3ec99	Even more spelling fixes for "instruction". llvm-svn: 191611	2013-09-28 13:42:22 +00:00
Robert Wilhelm	f0cfb83bb4	Fix spelling intruction -> instruction. llvm-svn: 191610	2013-09-28 11:46:15 +00:00
Matt Arsenault	31cfc78f81	Use right pointer type in DebugIR llvm-svn: 191576	2013-09-27 22:26:25 +00:00
Matt Arsenault	fa25272db9	Use type helper functions llvm-svn: 191574	2013-09-27 22:18:51 +00:00
Matt Arsenault	29f31735a2	Fix SLPVectorizer using wrong address space for load/store llvm-svn: 191564	2013-09-27 21:24:57 +00:00
Justin Bogner	4a9ac8cd75	InstCombine: Only foldSelectICmpAndOr for integer types Currently foldSelectICmpAndOr asserts if the "or" involves a vector containing several of the same power of two. We can easily avoid this by only performing the fold on integer types, like foldSelectICmpAnd does. Fixes <rdar://problem/15012516> llvm-svn: 191552	2013-09-27 20:35:39 +00:00
Justin Bogner	ca9bd8fac1	Transforms: Use getFirstNonPHI to set the insertion point for PHIs We were previously using getFirstInsertionPt to insert PHI instructions when vectorizing, but getFirstInsertionPt also skips past landingpads, causing this to generate invalid IR. We can avoid this issue by using getFirstNonPHI instead. llvm-svn: 191526	2013-09-27 15:30:25 +00:00
Puyan Lotfi	74e38de492	First check in. Modified a comment. llvm-svn: 191491	2013-09-27 07:36:10 +00:00
Arnold Schwaighofer	07520324f5	SLPVectorize: Put horizontal reductions feeding a store under separate flag Put them under a separate flag for experimentation. They are more likely to interfere with loop vectorization which happens later in the pass pipeline. llvm-svn: 191371	2013-09-25 14:02:32 +00:00
Evgeniy Stepanov	32be0340f5	[msan] Fix -Wreturn-type warnings in non-self-hosted build. llvm-svn: 191361	2013-09-25 08:56:00 +00:00
Yi Jiang	edf2d9179e	set the cost of tiny trees to INT_MAX in SLP vectorizer to disable vectorization on them llvm-svn: 191314	2013-09-24 17:26:43 +00:00
Benjamin Kramer	30d249a1b3	Push analysis passes to InstSimplify when they're around anyways. llvm-svn: 191309	2013-09-24 16:37:40 +00:00
Evgeniy Stepanov	5522a70674	[msan] Handling of atomic load/store, atomic rmw, cmpxchg. llvm-svn: 191287	2013-09-24 11:20:27 +00:00
Arnold Schwaighofer	22639407d7	Revert "LoopVectorizer: Only allow vectorization of intrinsics." Revert 191122 - with extra checks we are allowed to vectorize math library function calls. Standard library identifiers are reserved names so functions with external linkage must not override them. However, functions with internal linkage can. Therefore, we can vectorize calls to math library functions with a check for external linkage and matching signature. This matches what we do during SelectionDAG building. llvm-svn: 191206	2013-09-23 14:54:39 +00:00
Benjamin Kramer	8817cca5ce	Provide basic type safety for array_pod_sort comparators. This makes using array_pod_sort significantly safer. The implementation relies on function pointer casting but that should be safe as we're dealing with void* here. llvm-svn: 191175	2013-09-22 14:09:50 +00:00
Benjamin Kramer	5626259506	Drop spurious handle in comment. llvm-svn: 191172	2013-09-22 11:24:58 +00:00
Benjamin Kramer	90901a35ce	SROA: Handle casts involving vectors of pointers and integer scalars. SROA wants to convert any types of equivalent widths but it's not possible to convert vectors of pointers to an integer scalar with a single cast. As a workaround we add a bitcast to the corresponding int ptr type first. This type of cast used to be an edge case but has become common with SLP vectorization. Fixes PR17271. llvm-svn: 191143	2013-09-21 20:36:04 +00:00
Arnold Schwaighofer	d743feef81	SLPVectorizer: Fix multiline comment warning llvm-svn: 191135	2013-09-21 05:37:30 +00:00
Arnold Schwaighofer	500242d4fe	Reapply "SLPVectorizer: Handle more horizontal reductions (disabled)" Reapply r191108 with a fix for a memory corruption error I introduced. Of course, we can't reference the scalars that we replace by vectorizing and then call their eraseFromParent method. I only 'needed' the scalars to get the DebugLoc. Just store the DebugLoc before actually vectorizing instead. As a nice side effect, this also simplifies the interface between BoUpSLP and the HorizontalReduction class to returning a value pointer (the vectorized tree root). radar://14607682 llvm-svn: 191123	2013-09-21 01:06:00 +00:00
Nadav Rotem	3371172a67	LoopVectorizer: Only allow vectorization of intrinsics. We can't know for sure that the functions 'abs' or 'round' are the functions from libm. rdar://15012650 llvm-svn: 191122	2013-09-21 00:27:05 +00:00
Arnold Schwaighofer	f1dfbfdde1	Revert "SLPVectorizer: Handle more horizontal reductions (disabled)" This reverts commit r191108. The horizontal.ll test case fails under libgmalloc. Thanks Shuxin for pointing this out to me. llvm-svn: 191121	2013-09-21 00:06:20 +00:00
Shuxin Yang	6e35094bbf	Resurrect r191017 " GVN proceeds in the presence of dead code" plus a fix to PR17307 & 17308. The problem of r191017 is that when GVN fabricates a val-number for a dead instruction (in order to make following expr-PRE happy), it forgets to fabricate a leader-table entry for it as well. llvm-svn: 191118	2013-09-20 23:12:57 +00:00
Benjamin Kramer	0e2d162d1e	InstCombine: Remove unused argument. No functionality change. llvm-svn: 191112	2013-09-20 22:12:42 +00:00
Arnold Schwaighofer	4724963112	SLPVectorizer: Handle more horizontal reductions (disabled) Match reductions starting at binary operation feeding into a phi. The code handles trees like r += v1 + v2 + v3 ... and r += v1 r += v2 ... and r *= v1 + v2 + ... We currently only handle associative operations (add, fadd fast). The code can now also handle reductions feeding into stores. a[i] = v1 + v2 + v3 + ... The code is currently disabled behind the flag "-slp-vectorize-hor". The cost model for most architectures is not there yet. I found one opportunity of a horizontal reduction feeding a phi in TSVC (LoopRerolling-flt) and there are several opportunities where reductions feed into stores. radar://14607682 llvm-svn: 191108	2013-09-20 21:18:20 +00:00
Joerg Sonnenberger	1fbe323649	Revert r191017, it results in segmentation faults in Qt. llvm-svn: 191104	2013-09-20 20:33:57 +00:00
Benjamin Kramer	e6461e3053	InstCombine: Canonicalize (gep i8* X, -(ptrtoint Y)) to (sub (ptrtoint X), (ptrtoint Y)) The GEP pattern is what SCEV expander emits for "ugly geps". The latter is what you get for pointer subtraction in C code. The rest of instcombine already knows how to deal with that so just canonicalize on that. llvm-svn: 191090	2013-09-20 14:38:44 +00:00
Shuxin Yang	3a7ca6ec87	[Fast-math] Disable "(C1/X)*C2 => (C1*C2)/X" if C1/X has multiple uses. If "C1/X" has multiple uses, the only benefit of this transformation is to potentially shorten the critical path. But it comes at the cost of introducing an additional div. The additional div may or may not incur cost depending on how div is implemented. If it is implemented using Newton–Raphson iteration, it doesn't seem to incur any cost (FIXME). However, if the div blocks the entire pipeline, that sounds pretty expensive. Let CodeGen take care of this transformation. This patch sees 6% on a benchmark. rdar://15032743 llvm-svn: 191037	2013-09-19 21:13:46 +00:00
Benjamin Kramer	0b37cdf9af	InstCombine: Don't allow turning vector-of-pointer loads into vector-of-integer. The code below can't handle any pointers. PR17293. llvm-svn: 191036	2013-09-19 20:59:04 +00:00
Shuxin Yang	74c9a170b8	GVN proceeds in the presence of dead code. This is how it ignores the dead code: 1) When a dead branch target, say block B, is identified, all the blocks dominated by B are dead as well. 2) The PHIs of those blocks in dominance-frontier(B) are updated such that the operands corresponding to dead predecessors are replaced by "UndefVal". Using lattice's jargon, the "UndefVal" is the "Top" in essence. A phi node like "phi(v1 bb1, undef xx)" will be optimized into "v1" if v1 is constant, or v1 is an instruction which dominates this PHI node. 3) When analyzing the availability of a load L, all dead mem-ops which L depends on are disguised as a load which evaluates to exactly the same value as L. 4) The dead mem-ops will be materialized as "UndefVal" during code motion. llvm-svn: 191017	2013-09-19 17:22:51 +00:00
Evgeniy Stepanov	37b8645480	[msan] Wrap indirect functions. Adds a flag to the MemorySanitizer pass that enables runtime rewriting of indirect calls. This is part of MSanDR implementation and is needed to return control to the DynamoRIO-based helper tool on transition between instrumented and non-instrumented modules. Disabled by default. llvm-svn: 191006	2013-09-19 15:22:35 +00:00
Kostya Serebryany	f322382e22	[asan] call __asan_stack_malloc_N only if use-after-return detection is enabled with the run-time option llvm-svn: 190939	2013-09-18 14:07:14 +00:00
Robert Lytton	f637e2cb23	Prevent LoopVectorizer and SLPVectorizer running if the target has no vector registers. XCore target: Add XCoreTargetTransformInfo This is where getNumberOfRegisters() resides, which in turn returns the number of vector registers (=0). llvm-svn: 190936	2013-09-18 12:43:35 +00:00
Craig Topper	be3e01e61f	Revert accidental commit I had to make to get the test case in PR17268 to still work correctly. llvm-svn: 190917	2013-09-18 04:10:17 +00:00
Craig Topper	98064b9f4d	Lift alignment restrictions for load/store folding on VINSERTF128/VEXTRACTF128. Fixes PR17268. llvm-svn: 190916	2013-09-18 03:55:53 +00:00
David Blaikie	eacc287b49	ifndef NDEBUG-out an asserts-only constant committed in r190863 llvm-svn: 190905	2013-09-18 00:11:27 +00:00
Quentin Colombet	870b662779	Revert the load slicing done in r190870. To avoid regressions with bitfield optimizations, this slicing should take place later, like ISel time. llvm-svn: 190891	2013-09-17 22:01:26 +00:00
Matt Arsenault	e6952f28ca	Cleanup handling of constant function casts. Some of this code is no longer necessary since int<->ptr casts no longer occur as of r187444. This also fixes handling vectors of pointers, and adds a bunch of new testcases for vectors and address spaces. llvm-svn: 190885	2013-09-17 21:10:14 +00:00
Arnold Schwaighofer	4a3dcaa193	SLPVectorizer: Don't vectorize phi nodes that use invoke values We can't insert an insertelement after an invoke. We would have to split a critical edge. So when we see a phi node that uses an invoke we just give up. radar://14990770 llvm-svn: 190871	2013-09-17 17:03:29 +00:00
Quentin Colombet	b8d672ef5b	[InstCombiner] Slice a big load in two loads when the elements are next to each other in memory. The motivation was to get rid of truncate and shift right instructions that get in the way of paired load or floating point load. E.g., consider the following example: struct Complex { float real; float imm; }; When accessing a complex, llvm was generating a 64-bit load and the imm field was obtained by a trunc(lshr) sequence, resulting in poor code generation, at least for x86. The idea is to declare that two load instructions are the canonical form for loading two arithmetic types that are next to each other in memory. Two scalar loads at a constant offset from each other are pretty easy to detect for the sorts of passes that like to mess with loads. <rdar://problem/14477220> llvm-svn: 190870	2013-09-17 16:57:34 +00:00
Kostya Serebryany	bc86efb89d	[asan] inline the calls to __asan_stack_free_* with small sizes. Yet another 10%-20% speedup for use-after-return llvm-svn: 190863	2013-09-17 12:14:50 +00:00
Stepan Dyatkovskiy	dc2c4b4462	Bugfix for PR17099: Wrong cast operation. MergeFunctions emits Bitcast instead of pointer-to-integer operation. Patch fixes MergeFunctions::writeThunk function. It replaces unconditional Bitcast creation with "Value* createCast(...)" method, that checks operand types and selects proper instruction. See unit-test as example. llvm-svn: 190859	2013-09-17 09:36:11 +00:00
Matt Arsenault	899f7d2b00	MemCpyOptimizer: Use max legal int size instead of pointer size If there are no legal integers, assume 1 byte. This makes more sense than using the pointer size as a guess for the maximum GPR width. It is conceivable to want to use some 64-bit pointers on a target where 64-bit integers aren't legal. llvm-svn: 190817	2013-09-16 22:43:16 +00:00
Arnold Schwaighofer	53e622cef4	Don't vectorize if there are outside loop users of the induction variable. We would have to compute the pre increment value, either by computing it on every loop iteration or by splitting the edge out of the loop and inserting a computation for it there. For now, just give up vectorizing such loops. Fixes PR17179. llvm-svn: 190790	2013-09-16 16:17:24 +00:00
Evgeniy Stepanov	604293fbb4	[msan] Check return value of main(). llvm-svn: 190782	2013-09-16 13:24:32 +00:00
Peter Collingbourne	3fa50f9b05	Implement function prefix data as an IR feature. Previous discussion: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html Differential Revision: http://llvm-reviews.chandlerc.com/D1191 llvm-svn: 190773	2013-09-16 01:08:15 +00:00
Benjamin Kramer	7d6052687e	Replace some unnecessary vector copies with references. llvm-svn: 190770	2013-09-15 22:04:42 +00:00
Robert Wilhelm	042f10ce41	Fix spelling. llvm-svn: 190750	2013-09-14 09:34:59 +00:00
Chandler Carruth	ebeac5cb89	Remove the long, long defunct IR block placement pass. This pass was based on the previous (essentially unused) profiling infrastructure and the assumption that by ordering the basic blocks at the IR level in a particular way, the correct layout would happen in the end. This sometimes worked, and mostly didn't. It also was a really naive implementation of the classical paper that dates from when branch predictors were primarily directional and when loop structure wasn't commonly available. It also didn't factor into the equation non-fallthrough branches and other machine level details. Anyways, for all of these reasons and more, I wrote MachineBlockPlacement, which completely supersedes this pass. It both uses modern profile information infrastructure, and actually works. =] llvm-svn: 190748	2013-09-14 09:28:14 +00:00
Evgeniy Stepanov	0435ecd18f	[msan] Add source file:line to stack origin reports. Compiler part. llvm-svn: 190689	2013-09-13 12:54:49 +00:00
Duncan Sands	c9e95ad0db	Avoid a compiler warning about Found not being used when assertions are disabled. llvm-svn: 190668	2013-09-13 08:16:06 +00:00
Hal Finkel	8f2e700522	Add getUnrollingPreferences to TTI Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 190542	2013-09-11 19:25:43 +00:00
Benjamin Kramer	079b96e6f7	Revert "Give internal classes hidden visibility." It works with clang, but GCC has different rules so we can't make all of those hidden. This reverts commit r190534. llvm-svn: 190536	2013-09-11 18:05:11 +00:00
Benjamin Kramer	6a44af3629	Give internal classes hidden visibility. Worth 100k on a linux/x86_64 Release+Asserts clang. llvm-svn: 190534	2013-09-11 17:42:27 +00:00
Matt Arsenault	d3471e9ea8	Use type form of getIntPtrType This doesn't change anything since malloc always returns address space 0. llvm-svn: 190498	2013-09-11 07:29:40 +00:00
Matt Arsenault	009faed1be	Teach loop-idiom about address space pointer sizes llvm-svn: 190491	2013-09-11 05:09:42 +00:00
Matt Arsenault	5df49bd703	Add braces llvm-svn: 190490	2013-09-11 05:09:35 +00:00
Eli Friedman	77d7fbb924	Get rid of unused isPodLike definitions. llvm-svn: 190461	2013-09-11 00:36:54 +00:00
Eli Friedman	05906faa4d	Don't assert on invalid loop vectorization hint. llvm-svn: 190450	2013-09-10 23:45:25 +00:00
Eli Friedman	c1f1f852d7	Fix mistake in r190442. llvm-svn: 190446	2013-09-10 23:09:24 +00:00
Eli Friedman	1891f69323	Remove unused functions. llvm-svn: 190442	2013-09-10 22:42:31 +00:00
Matt Arsenault	a90a18e0ea	Teach ScalarEvolution about pointer address spaces llvm-svn: 190425	2013-09-10 19:55:24 +00:00
Benjamin Kramer	934f6f39f4	LoopVectorize: PHI nodes are always at the beginning of a block, no need to scan the whole block. llvm-svn: 190422	2013-09-10 18:46:15 +00:00
Kostya Serebryany	6805de5467	[asan] refactor the use-after-return API so that the size class is computed at compile time instead of at run-time. llvm part llvm-svn: 190407	2013-09-10 13:16:56 +00:00
Matt Arsenault	f631f8c640	Use StringRef::npos for StringRef instead of std::string one llvm-svn: 190375	2013-09-10 00:41:53 +00:00
Eli Friedman	33d3700716	Don't shrink atomic ops to bool in GlobalOpt. LLVM IR doesn't currently allow atomic bool load/store operations, and the transformation is dubious anyway because it isn't profitable on all platforms. PR17163. llvm-svn: 190357	2013-09-09 22:00:13 +00:00
Quentin Colombet	5ab555532b	[InstCombiner] Expose opportunities to merge subtract and comparison. Several architectures use the same instruction to perform both a comparison and a subtract. The instruction selection framework does not allow considering different basic blocks to expose such fusion opportunities. Therefore, these instructions are “merged” by CSE at MI IR level. To increase the likelihood of CSE applying in such situations, we reorder the operands of the comparison, when they have the same complexity, so that they match the order of the most frequent subtract. E.g., icmp A, B ... sub B, A <rdar://problem/14514580> llvm-svn: 190352	2013-09-09 20:56:48 +00:00
Bob Wilson	e407736a06	Revert patches to add case-range support for PR1255. The work on this project was left in an unfinished and inconsistent state. Hopefully someone will eventually get a chance to implement this feature, but in the meantime, it is better to put things back the way they were. I have left support in the bitcode reader to handle the case-range bitcode format, so that we do not lose bitcode compatibility with the llvm 3.3 release. This reverts the following commits: 155464, 156374, 156377, 156613, 156704, 156757, 156804 156808, 156985, 157046, 157112, 157183, 157315, 157384, 157575, 157576, 157586, 157612, 157810, 157814, 157815, 157880, 157881, 157882, 157884, 157887, 157901, 158979, 157987, 157989, 158986, 158997, 159076, 159101, 159100, 159200, 159201, 159207, 159527, 159532, 159540, 159583, 159618, 159658, 159659, 159660, 159661, 159703, 159704, 160076, 167356, 172025, 186736 llvm-svn: 190328	2013-09-09 19:14:35 +00:00
Manman Ren	d8c68b1852	TBAA: add isTBAAVtableAccess to MDNode so clients can call the function instead of having its own implementation. The implementation of isTBAAVtableAccess is in TypeBasedAliasAnalysis.cpp since it is related to the format of TBAA metadata. The path for struct-path tbaa will be exercised by test/Instrumentation/ThreadSanitizer/read_from_global.ll, vptr_read.ll, and vptr_update.ll when struct-path tbaa is on by default. llvm-svn: 190216	2013-09-06 22:47:05 +00:00
Matt Arsenault	8227b9f69c	Use type helper functions. llvm-svn: 190113	2013-09-06 00:37:24 +00:00
Matt Arsenault	37d42ecaff	Teach CodeGenPrepare about address spaces llvm-svn: 190112	2013-09-06 00:18:43 +00:00
Matt Arsenault	e6db76071c	Consistently use dbgs() in debug printing llvm-svn: 190093	2013-09-05 19:48:28 +00:00
Rafael Espindola	d21ac19bda	Remove unused argument. llvm-svn: 190090	2013-09-05 19:15:21 +00:00
Nick Lewycky	2c88067a46	Declare missing dependency on AliasAnalysis. Patch by Liu Xin! llvm-svn: 190035	2013-09-05 08:19:58 +00:00
Rafael Espindola	b7c0b4a327	Rename some variables to match the style guide. I am about to patch this code, and this makes the diff far more readable. llvm-svn: 189982	2013-09-04 20:08:46 +00:00
Rafael Espindola	b832d49822	Small simplification given that insert of an empty range is a nop. llvm-svn: 189971	2013-09-04 18:53:21 +00:00
Rafael Espindola	49a6c153c9	Refactor duplicated logic to a helper function. No functionality change. llvm-svn: 189969	2013-09-04 18:37:36 +00:00
Rafael Espindola	9406516af1	Remove dead code. llvm-svn: 189967	2013-09-04 18:16:02 +00:00
Rafael Espindola	128c5ea902	Revert "Add r159136 back now that pr13124 has been fixed." This reverts commit r189886. I found a corner case where this optimization is not valid: Say we have a "linkonce_odr unnamed_addr" in two translation units: * In TU 1 this optimization kicks in and makes it hidden. * In TU 2 it gets const merged with a constant that is not unnamed_addr, resulting in a non unnamed_addr constant with default visibility. * The static linker rules for combining visibility then produce a hidden symbol, which is incorrect from the point of view of the non unnamed_addr constant. The one place we can do this is when we know that the symbol is not used from another TU in the same shared object, i.e., during LTO. I will move it there. llvm-svn: 189954	2013-09-04 16:09:01 +00:00
Tim Northover	dc647a2603	InstCombine: allow unmasked icmps to be combined with logical ops "(icmp op i8 A, B)" is equivalent to "(icmp op i8 (A & 0xff), B)" as a degenerate case. Allowing this as a "masked" comparison when analysing "(icmp) &/| (icmp)" allows us to combine them in more cases. rdar://problem/7625728 llvm-svn: 189931	2013-09-04 11:57:17 +00:00
Tim Northover	c0756c454c	InstCombine: look for masked compares with subset relation Even in cases which aren't universally optimisable like "(A & B) != 0 && (A & C) != 0", the masks can make one of the comparisons completely redundant. In this case, since we've gone to the effort of spotting masked comparisons we should combine them. rdar://problem/7625728 llvm-svn: 189930	2013-09-04 11:57:13 +00:00
Rafael Espindola	5eb7df68bf	Add r159136 back now that pr13124 has been fixed. Original message: If a constant or a function has linkonce_odr linkage and unnamed_addr, mark hidden. Being linkonce_odr guarantees that it is available in every dso that needs it. Being a constant/function with unnamed_addr guarantees that the copies don't have to be merged. llvm-svn: 189886	2013-09-03 23:34:36 +00:00
Michael Gottesman	469a80cb30	[objc-arc] Remove dead code from previous commit. llvm-svn: 189870	2013-09-03 22:40:56 +00:00
Michael Gottesman	e29b1c1825	[objc-arc] Turn off the objc_retainBlock -> objc_retain optimization. The reason that I am turning off this optimization is that there is an additional case where a block can escape that has come up. Specifically, this occurs when a block is used in a scope outside of its current scope. This can cause a captured retainable object pointer whose life is preserved by the objc_retainBlock to be deallocated before the block is invoked. An example of the code needed to trigger the bug is: ---- #import <Foundation/Foundation.h> int main(int argc, const char * argv[]) { void (^somethingToDoLater)(); { NSObject *obj = [NSObject new]; somethingToDoLater = ^{ [obj self]; // Crashes here }; } NSLog(@"test."); somethingToDoLater(); return 0; } ---- In the next commit, I remove all the dead code that results from this. Once I put in the fixing commit I will bring back the tests that I deleted in this commit. rdar://14802782. rdar://14868830. llvm-svn: 189869	2013-09-03 22:40:54 +00:00
Nadav Rotem	5d78dba6d9	Enable late-vectorization by default. This patch changes the default setting for the LateVectorization flag that controls where the loop-vectorizer is run. Perf gains: SingleSource/Benchmarks/Shootout/matrix -37.33% MultiSource/Benchmarks/PAQ8p/paq8p -22.83% SingleSource/Benchmarks/Linpack/linpack-pc -16.22% SingleSource/Benchmarks/Shootout-C++/ary3 -15.16% MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt -10.34% MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl -7.12% Regressions: SingleSource/Benchmarks/Misc/lowercase 15.10% MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt 13.18% SingleSource/Benchmarks/Shootout-C++/matrix 8.27% SingleSource/Benchmarks/CoyoteBench/lpbench 7.30% llvm-svn: 189858	2013-09-03 21:33:17 +00:00
Matt Arsenault	3dfe54e954	Teach InstCombineLoadCast about address spaces. This is another one that doesn't matter much, but uses the right GEP index types in the first place. llvm-svn: 189854	2013-09-03 21:05:48 +00:00
Matt Arsenault	e38e4cdc46	Use type form of getIntPtrType in alloca visitor. This doesn't actually matter, since alloca is always 0 address space, but this is more consistent. llvm-svn: 189853	2013-09-03 21:05:15 +00:00
Yi Jiang	aeb5b46a85	In this patch we are trying to do two things: 1) If the width of vectorization list candidate is bigger than vector reg width, we will break it down to fit the vector reg. 2) We do not vectorize the width which is not power of two. The performance result shows it will help some spec benchmarks. mesa improved 6.97% and ammp improved 1.54%. llvm-svn: 189830	2013-09-03 17:26:04 +00:00
Evgeniy Stepanov	e95d37c81d	[msan] Fix handling of select with struct arguments. llvm-svn: 189796	2013-09-03 13:05:29 +00:00
Evgeniy Stepanov	566f591404	[msan] Fix select instrumentation. Select condition shadow was being ignored resulting in false negatives. This change OR-s sign-extended condition shadow into the result shadow. llvm-svn: 189785	2013-09-03 10:04:11 +00:00
Benjamin Kramer	2702caad08	SimplifyLibCalls: When emitting an overloaded fp function check that it's available. The existing code missed some edge cases when e.g. we're going to emit sqrtf but only the availability of sqrt was checked. This happens on odd platforms like windows. llvm-svn: 189724	2013-08-31 18:19:35 +00:00
Bill Wendling	2865be79f8	Compulsive reformatting. llvm-svn: 189697	2013-08-30 21:07:33 +00:00
Benjamin Kramer	010f108382	InstCombine: Check for zero shift amounts before subtracting one causing integer overflow. PR17026. Also avoid undefined shifts and shift amounts larger than 64 bits (those are always undef because we can't represent integer types that large). llvm-svn: 189672	2013-08-30 14:35:35 +00:00
Bill Wendling	4c0d9adecb	Random cleanup: No need to use a std::vector here, since createInternalizePass uses an ArrayRef. llvm-svn: 189632	2013-08-30 00:48:37 +00:00
Hal Finkel	8e83820a04	Revert: r189565 - Add getUnrollingPreferences to TTI Revert unintentional commit (of an unreviewed change). Original commit message: Add getUnrollingPreferences to TTI Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189566	2013-08-29 03:33:15 +00:00
Hal Finkel	63e6c0e9fb	Add getUnrollingPreferences to TTI Allow targets to customize the default behavior of the generic loop unrolling transformation. This will be used by the PowerPC backend when targeting the A2 core (which is in-order with a deep pipeline), and using more aggressive defaults is important. llvm-svn: 189565	2013-08-29 03:29:57 +00:00
Nadav Rotem	4c459bcd47	Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons: 1. They are a kind of canonicalization. 2. The performance measurements show that it is better to keep them in. There should be no functional change if you are not enabling the LateVectorization mode. llvm-svn: 189539	2013-08-28 23:40:29 +00:00
Matt Arsenault	38874731f6	Fix typo. llvm-svn: 189524	2013-08-28 22:17:26 +00:00
Hal Finkel	6d09904cc9	Disable unrolling in the loop vectorizer when disabled in the pass manager When unrolling is disabled in the pass manager, the loop vectorizer should also not unroll loops. This will allow the -fno-unroll-loops option in Clang to behave as expected (even for vectorizable loops). The loop vectorizer's -force-vector-unroll option will (continue to) override the pass-manager setting (including -force-vector-unroll=0 to force use of the internal auto-selection logic). In order to test this, I added a flag to opt (-disable-loop-unrolling) to force disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also, this fixes a small bug in opt where the loop vectorizer was enabled only after the pass manager populated the queue of passes (the global_alias.ll test needed a slight update to the RUN line as a result of this fix). llvm-svn: 189499	2013-08-28 18:33:10 +00:00
Alexey Samsonov	9b7e2b555c	80 cols llvm-svn: 189473	2013-08-28 11:25:12 +00:00
Peter Collingbourne	28a10aff48	DataFlowSanitizer: Implement trampolines for function pointers passed to custom functions. Differential Revision: http://llvm-reviews.chandlerc.com/D1503 llvm-svn: 189408	2013-08-27 22:09:06 +00:00
Nadav Rotem	6b41f7cc4c	Refactor 'vectorizeLoop' no functionality change. This patch merges LoopVectorize of InnerLoopVectorizer and InnerLoopUnroller by adding checks for VF=1. This helps in erasing the Unroller code that is almost identical to the InnerLoopVectorizer code. llvm-svn: 189391	2013-08-27 18:52:47 +00:00
Michael Gottesman	eab9a7fa7c	Fixed typo. Noticed by Stephen Checkoway <s@pahtak.org>. llvm-svn: 189312	2013-08-27 04:43:03 +00:00
Matt Arsenault	ed9f76d37b	Fix inserting instructions before last in bundle. The builder inserts from before the insert point, not after, so this would insert before the last instruction in the bundle instead of after it. I'm not sure if this can actually be a problem with any of the current insertions. llvm-svn: 189285	2013-08-26 23:08:37 +00:00
Nadav Rotem	bdc9ff4498	LoopVectorize: Implement partial loop unrolling when vectorization is not profitable. This patch enables unrolling of loops when vectorization is legal but not profitable. We add a new class InnerLoopUnroller, that extends InnerLoopVectorizer and replaces some of the vector-specific logic with scalars. This patch does not introduce any runtime regressions and improves the following workloads: SingleSource/Benchmarks/Shootout/matrix -22.64% SingleSource/Benchmarks/Shootout-C++/matrix -13.06% External/SPEC/CINT2006/464_h264ref/464_h264ref -3.99% SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -1.95% llvm-svn: 189281	2013-08-26 22:33:26 +00:00
Yi Jiang	7107d41574	test commit. Remove blank line llvm-svn: 189265	2013-08-26 18:57:55 +00:00
Matt Arsenault	bcd8c577d7	Fix unused variable in release build llvm-svn: 189264	2013-08-26 18:38:29 +00:00
Matt Arsenault	8f21c838c0	Constify functions llvm-svn: 189234	2013-08-26 17:56:38 +00:00
Matt Arsenault	39274be65f	Vectorize starting from insertelements building a vector llvm-svn: 189233	2013-08-26 17:56:35 +00:00
Matt Arsenault	8405888af1	Check if in set on insertion instead of separately llvm-svn: 189179	2013-08-24 19:55:38 +00:00
Benjamin Kramer	b12cf01908	Add a function object to compare the first or second component of a std::pair. Replace instances of this scattered around the code base. llvm-svn: 189169	2013-08-24 12:54:27 +00:00
Peter Collingbourne	a96296f3ab	DataFlowSanitizer: correctly combine labels in the case where they are equal. llvm-svn: 189133	2013-08-23 18:45:06 +00:00
Evgeniy Stepanov	d42863cc1f	[msan] Fix handling of va_arg overflow area on x86_64. The code was erroneously reading overflow area shadow from the TLS slot, bypassing the local copy. Reading shadow directly from TLS is wrong, because it can be overwritten by a nested vararg call, if that happens before va_start. llvm-svn: 189104	2013-08-23 12:11:00 +00:00
Richard Sandiford	37cd6cfba2	Turn MipsOptimizeMathLibCalls into a target-independent scalar transform ...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available. The pass is opt-in because at the moment it only handles sqrt. llvm-svn: 189097	2013-08-23 10:27:02 +00:00
Alexey Samsonov	6dae24df16	80 cols llvm-svn: 189091	2013-08-23 07:42:51 +00:00
Michael Gottesman	823aaffd37	Update StripDeadDebugInfo to use DebugInfoFinder so that it is no longer stale to the point of not working and more resilient to debug info changes. The current version of StripDeadDebugInfo became stale and no longer actually worked since it was expecting an older version of debug info. This patch updates it to use DebugInfoFinder and the modern DebugInfo classes as much as possible to make it more resilient to such changes. Additionally, the only place where that was avoided (the code where we replace the old sets with the new), I call verify on the DIContextUnit implying that if the format changes and my live set changes no longer make sense an assert will be hit. In order to ensure that that occurs I have included a test case. The actual stripping of the dead debug info follows the same strategy as was used before in this class: find the live set and replace the old set in the given compile unit (which may contain dead global variables/functions) with the new live one. llvm-svn: 189078	2013-08-23 00:23:24 +00:00
Peter Collingbourne	34f0c313e2	DataFlowSanitizer: Replace non-instrumented aliases of instrumented functions, and vice versa, with wrappers. Differential Revision: http://llvm-reviews.chandlerc.com/D1442 llvm-svn: 189054	2013-08-22 20:08:15 +00:00
Peter Collingbourne	761a4fc475	DataFlowSanitizer: Factor the wrapper builder out to buildWrapperFunction. Differential Revision: http://llvm-reviews.chandlerc.com/D1441 llvm-svn: 189053	2013-08-22 20:08:11 +00:00
Peter Collingbourne	59b1262d01	DataFlowSanitizer: Prefix the name of each instrumented function with "dfs$". DFSan changes the ABI of each function in the module. This makes it possible for a function with the native ABI to be called with the instrumented ABI, or vice versa, thus possibly invoking undefined behavior. A simple way of statically detecting instances of this problem is to prepend the prefix "dfs$" to the name of each instrumented-ABI function. This will not catch every such problem; in particular function pointers passed across the instrumented-native barrier cannot be used on the other side. These problems could potentially be caught dynamically. Differential Revision: http://llvm-reviews.chandlerc.com/D1373 llvm-svn: 189052	2013-08-22 20:08:08 +00:00
Chandler Carruth	1c34afcb61	Teach the SLP vectorizer the correct way to check for consecutive access using GEPs. Previously, it used a number of different heuristics for analyzing the GEPs. Several of these were conservatively correct, but failed to fall back to SCEV even when SCEV might have given a reasonable answer. One was simply incorrect in how it was formulated. There was good code already to recursively evaluate the constant offsets in GEPs, look through pointer casts, etc. I gathered this into a form code like the SLP code can use in a previous commit, which allows all of this code to become quite simple. There is some performance (compile time) concern here at first glance as we're directly attempting to walk both pointers' constant GEP chains. However, a couple of thoughts: 1) In the very common cases where there is a dynamic pointer, and a second pointer at a constant offset (usually a stride) from it, this code will actually not do any unnecessary work. 2) InstCombine and other passes work very hard to collapse constant GEPs, so it will be rare that we iterate here for a long time. That said, if there remain performance problems here, there are some obvious things that can improve the situation immensely. Doing a vectorizer-pass-wide memoizer for each individual layer of pointer values, their base values, and the constant offset is likely to be able to completely remove redundant work and strictly limit the scaling of the work to scrape these GEPs. Since this optimization was not done on the prior version (which would still benefit from it), I've not done it here. But if folks have benchmarks that slow down it should be straightforward for them to add. I've added a test case, but I'm not really confident of the amount of testing done for different access patterns, strides, and pointer manipulation. llvm-svn: 189007	2013-08-22 12:45:17 +00:00
Matt Arsenault	f599d97449	Teach LoopVectorize about address space sizes llvm-svn: 188980	2013-08-22 02:42:55 +00:00
Michael Gottesman	0dc00645a2	Fixed typo. llvm-svn: 188957	2013-08-21 22:53:54 +00:00
Michael Gottesman	0900993c3c	Removed trailing whitespace. llvm-svn: 188956	2013-08-21 22:53:29 +00:00
Yunzhong Gao	05efa23294	No functionality change. Replace "(255 & value)" with "(0xFF & value)" to improve clarity. llvm-svn: 188941	2013-08-21 22:11:15 +00:00
Matt Arsenault	745101d666	Teach InstCombine about address spaces llvm-svn: 188926	2013-08-21 19:53:10 +00:00
Matt Arsenault	745832dcc9	Use attribute helper function llvm-svn: 188916	2013-08-21 18:54:50 +00:00
Matt Arsenault	3c71dabd88	Fix typo llvm-svn: 188915	2013-08-21 18:54:47 +00:00
Bill Wendling	707f601fa5	Move registering the execution of a basic block to the beginning rather than the end. There are situations which can affect the correctness (or at least expectation) of the gcov output. For instance, if a call to __gcov_flush() occurs within a block before the execution count is registered and then the program aborts in some way, then that block will not be marked as executed. This is not normally what the user expects. If we move the code that's registering when a block is executed to the beginning, we can catch these types of situations. PR16893 llvm-svn: 188849	2013-08-20 23:52:00 +00:00
Arnold Schwaighofer	e1f3ab69d1	SLPVectorizer: Fix invalid iterator errors Update iterator when the SLP vectorizer changes the instructions in the basic block by restarting the traversal of the basic block. Patch by Yi Jiang! Fixes PR 16899. llvm-svn: 188832	2013-08-20 21:21:45 +00:00
Hal Finkel	0c5c01aa4a	Add a llvm.copysign intrinsic This adds a llvm.copysign intrinsic; we already have Libfunc recognition for copysign (which is turned into the FCOPYSIGN SDAG node). In order to autovectorize calls to copysign in the loop vectorizer, we need a corresponding intrinsic as well. In addition to the expected changes to the language reference, the loop vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a few lists in LegalizeVector{Ops,Types} so that vector copysigns can be expanded. In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN be Expand for vector types. This seems correct for all in-tree targets, and I think is the right thing to do because, previously, there was no way to generate vector-valued FCOPYSIGN nodes (and most targets don't specify an action for vector-typed FCOPYSIGN). llvm-svn: 188728	2013-08-19 23:35:46 +00:00
Jakub Staszak	b4eb6adebb	Use pop_back_val() instead of both back() and pop_back(). llvm-svn: 188723	2013-08-19 22:47:55 +00:00
Matt Arsenault	d79f7d9ea1	Teach InstCombine visitGetElementPtr about address spaces llvm-svn: 188721	2013-08-19 22:17:40 +00:00
Matt Arsenault	98f34e3abe	Cleanup visitGetElementPtr to make address space change easier llvm-svn: 188720	2013-08-19 22:17:34 +00:00
Matt Arsenault	94a028aa43	commonPointerCast cleanups to make address space change easier llvm-svn: 188719	2013-08-19 22:17:18 +00:00
Matt Arsenault	5aeae18e9d	Revert non-test parts of r188507 Re-add the inboundsless tests I didn't add originally llvm-svn: 188710	2013-08-19 21:40:31 +00:00
Peter Collingbourne	aac65a313d	Introduce SpecialCaseList::isIn overload for GlobalAliases. Differential Revision: http://llvm-reviews.chandlerc.com/D1437 llvm-svn: 188688	2013-08-19 19:00:35 +00:00
Michael Kuperstein	4bb3f8f2e4	Adds missing TLI check for library simplification of pow(x, 0.5) -> fabs(sqrt(x)) and pow(2.0, x) -> exp2(x) llvm-svn: 188656	2013-08-19 06:55:47 +00:00
Peter Collingbourne	03c3324ccd	Remove SpecialCaseList::findCategory. It turned out that I didn't need this for DFSan. llvm-svn: 188646	2013-08-19 00:24:20 +00:00
Joerg Sonnenberger	8e3050db51	PR 16899: Do not modify the basic block using the iterator, but keep the next value. This avoids crashes due to invalidation. Patch by Joey Gouly. llvm-svn: 188605	2013-08-17 11:04:47 +00:00
Jim Grosbach	d0de8ace8a	InstCombine: Use isAllOnesValue() instead of explicit -1. llvm-svn: 188563	2013-08-16 17:03:36 +00:00
Jim Grosbach	20e3b9ac30	InstCombine: Simplify if(x!=0 && x!=-1). When both constants are positive or both constants are negative, InstCombine already simplifies comparisons like this, but when it's exactly zero and -1, the operand sorting ends up reversed and the pattern fails to match. Handle that special case. Follow up for rdar://14689217 llvm-svn: 188512	2013-08-16 00:15:20 +00:00
Matt Arsenault	1de76773bc	Don't do FoldCmpLoadFromIndexedGlobal for non inbounds GEPs This path wasn't tested before without a datalayout, so add some more tests and re-run with and without one. llvm-svn: 188507	2013-08-15 23:11:07 +00:00
Matt Arsenault	5cae894a13	Fix spelling llvm-svn: 188506	2013-08-15 23:11:03 +00:00
Yunzhong Gao	c0c2b16932	Fixing a corner-case bug in strchr and strrchr lib call optimizations where the input character is not converted to char before comparing with zero. The patch was discussed in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130812/184069.html llvm-svn: 188489	2013-08-15 20:58:59 +00:00
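The strchr/strrchr corner case above can be illustrated with a minimal sketch. The helper names below are hypothetical and are not LLVM's code; the sketch only assumes what the commit message states, namely that the character argument must be narrowed to a character type before it is compared with zero. `unsigned char` is used here so the narrowing is well defined.

```cpp
// Hypothetical helpers for illustration only; not the LLVM implementation.

// Decide whether a strchr-style search for `c` is really a search for the
// terminating NUL. The int argument is narrowed first, as the library
// function does, and only then compared with zero.
bool searchesForNul(int c) {
  return static_cast<unsigned char>(c) == 0;
}

// The flawed form: compares the int before narrowing, so a value such as
// 256 (whose low byte is zero) is not recognised as a NUL search.
bool searchesForNulUnconverted(int c) {
  return c == 0;
}
```

With these definitions, `searchesForNul(256)` is true while `searchesForNulUnconverted(256)` is false, which is the mismatch the commit describes.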
Peter Collingbourne	444c59e270	DataFlowSanitizer: Add a debugging feature to help us track nonzero labels. Summary: When the -dfsan-debug-nonzero-labels parameter is supplied, the code is instrumented such that when a call parameter, return value or load produces a nonzero label, the function __dfsan_nonzero_label is called. The idea is that a debugger breakpoint can be set on this function in a nominally label-free program to help identify any bugs in the instrumentation pass causing labels to be introduced. Reviewers: eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1405 llvm-svn: 188472	2013-08-15 18:51:12 +00:00
Mark Lacey	a2626555f1	Fix small typo: s/succ/Succ/ llvm-svn: 188415	2013-08-14 22:11:42 +00:00
Peter Collingbourne	9d31d6f329	DataFlowSanitizer: Instrumentation for memset. Differential Revision: http://llvm-reviews.chandlerc.com/D1395 llvm-svn: 188412	2013-08-14 20:51:38 +00:00
Peter Collingbourne	68162e7512	DataFlowSanitizer: greylist is now ABI list. This replaces the old incomplete greylist functionality with an ABI list, which can provide more detailed information about the ABI and semantics of specific functions. The pass treats every function in the "uninstrumented" category in the ABI list file as conforming to the "native" (i.e. unsanitized) ABI. Unless the ABI list contains additional categories for those functions, a call to one of those functions will produce a warning message, as the labelling behaviour of the function is unknown. The other supported categories are "functional", "discard" and "custom". - "discard" -- This function does not write to (user-accessible) memory, and its return value is unlabelled. - "functional" -- This function does not write to (user-accessible) memory, and the label of its return value is the union of the label of its arguments. - "custom" -- Instead of calling the function, a custom wrapper __dfsw_F is called, where F is the name of the function. This function may wrap the original function or provide its own implementation. Differential Revision: http://llvm-reviews.chandlerc.com/D1345 llvm-svn: 188402	2013-08-14 18:54:12 +00:00
Chandler Carruth	2de93afee3	Fix a really terrifying but improbable bug in mem2reg. If you have seen extremely subtle miscompilations (such as a load getting replaced with the value stored below the load within a basic block) related to promoting an alloca to an SSA value, there is the dim possibility that you hit this. Please let me know if you won this unfortunate lottery. The first half of mem2reg's core logic (as it is used both in the standalone mem2reg pass and in SROA) builds up a mapping from 'Instruction *' to the index of that instruction within its basic block. This allows quickly establishing which stores dominate a particular load even for large basic blocks. We cache this information throughout the run of mem2reg over a function in order to amortize the cost of computing it. This is not in and of itself a strange pattern in LLVM. However, it introduces a very important constraint: absolutely no instruction can be deleted from the program without updating the mapping. Otherwise a newly allocated instruction might get the same pointer address, and then end up with a wrong index. Yes, LLVM routinely suffers from a single-threaded variant of the ABA problem. Most places in LLVM don't find avoiding this an imposition because they don't both delete and create new instructions iteratively, but mem2reg loves to do this... All the time. Fortunately, the mem2reg code was really careful about updating this cache to handle this eventuality... except when it comes to the debug declare intrinsic. Oops. The fix is to invalidate that pointer in the cache when we delete it, the same as we do when deleting alloca instructions and other instructions. I've also caused the same bug in new code while working on a fix to PR16867, so this seems to be a really unfortunate pattern. Hopefully in subsequent patches the deletion of dead instructions can be consolidated sufficiently to make it less likely that we'll see future occurrences of this bug.
Sorry for not having a test case, but I have literally no idea how to reliably trigger this kind of thing. It may be single-threaded, but it remains an ABA problem. It would require a really amazing number of stars to align. llvm-svn: 188367	2013-08-14 08:56:41 +00:00
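The stale-cache hazard described in this commit message can be sketched in a few lines. Everything below is a hypothetical stand-in (`Inst`, `Pool`, `IndexCache` are not LLVM types); the one-slot pool exists only to make address reuse deterministic, whereas in a real allocator reuse merely becomes possible after a delete.

```cpp
#include <unordered_map>

// Hypothetical stand-in for an instruction; not LLVM's Instruction class.
struct Inst {
  int id;
};

// One-slot pool: every "new" instruction lands at the same address, which
// models a freed pointer being handed out again.
struct Pool {
  Inst slot{0};
  Inst *create(int id) {
    slot.id = id;
    return &slot;
  }
};

// Cache from instruction address to its index within a basic block.
struct IndexCache {
  std::unordered_map<const Inst *, unsigned> index;

  void record(const Inst *I, unsigned Idx) { index[I] = Idx; }
  void erase(const Inst *I) { index.erase(I); }

  // Returns the cached index, or -1 when the pointer is unknown.
  int lookup(const Inst *I) const {
    auto It = index.find(I);
    return It == index.end() ? -1 : static_cast<int>(It->second);
  }
};

// Returns what the cache reports for a fresh instruction that reuses the
// address of a deleted one.
int lookupAfterReuse(bool invalidateOnDelete) {
  Pool pool;
  IndexCache cache;

  Inst *oldInst = pool.create(1);
  cache.record(oldInst, 7);

  // "Delete" the old instruction; the fix is to drop its cache entry here.
  if (invalidateOnDelete)
    cache.erase(oldInst);

  Inst *newInst = pool.create(2); // same address as oldInst
  return cache.lookup(newInst);
}
```

Without invalidation the new instruction inherits the old index (7 in this sketch); with invalidation the lookup misses and the index has to be recomputed.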
Matt Arsenault	9e3a6ca698	Fix always creating GEP with i32 indices Use the pointer size if datalayout is available. Use i64 if it's not, which is consistent with what other places do when the pointer size is unknown. The test doesn't really test this in a useful way since it will be transformed to that later anyway, but this now tests it for non-zero arrays and when datalayout isn't available. The cases in visitGetElementPtrInst should save an extra re-visit to the newly created GEP since it won't need to cleanup after itself. llvm-svn: 188339	2013-08-14 00:24:38 +00:00
Matt Arsenault	fc00f7eabd	Use type helper functions instead of cast llvm-svn: 188338	2013-08-14 00:24:34 +00:00
Matt Arsenault	640ff9dbcf	Use array initializer, space around operator llvm-svn: 188337	2013-08-14 00:24:05 +00:00
Hal Finkel	1a61f621da	BBVectorize: Add initial stores to the write set when tracking uses When computing the use set of a store, we need to add the store to the write set prior to iterating over later instructions. Otherwise, if there is a later aliasing load of that store, that load will not be tagged as a use, and bad things will happen. trackUsesOfI still adds later dependent stores of an instruction to that instruction's write set, but it never sees the original instruction, and so when tracking uses of a store, the store must be added to the write set by the caller. Fixes PR16834. llvm-svn: 188329	2013-08-13 23:34:32 +00:00
Nick Lewycky	c7776f737f	Revert r187191, which broke opt -mem2reg on the testcases included in PR16867. However, opt -O2 doesn't run mem2reg directly so nobody noticed until r188146 when SROA started sending more things directly down the PromoteMemToReg path. In order to revert r187191, I also revert dependent revisions r187296, r187322 and r188146. Fixes PR16867. Does not add the testcases from that PR, but both of them should get added for both mem2reg and sroa when this revert gets unreverted. llvm-svn: 188327	2013-08-13 22:51:58 +00:00
Dmitry Vyukov	96a7084620	dfsan: fix lint warnings llvm-svn: 188293	2013-08-13 16:52:41 +00:00
Arnold Schwaighofer	124ccf3ad1	Also remove logic in LateVectorize llvm-svn: 188285	2013-08-13 16:12:04 +00:00
Arnold Schwaighofer	c14b59d1a1	Remove logic that decides whether to vectorize or not depending on O-levels I have moved this logic into clang and opt. llvm-svn: 188281	2013-08-13 15:51:25 +00:00
Peter Collingbourne	8d642de169	Reapply r188119 now that the bug it exposed is fixed. llvm-svn: 188217	2013-08-12 22:38:43 +00:00
Peter Collingbourne	fb3a2b4f97	DataFlowSanitizer: fix a use-after-free. Spotted by libgmalloc. llvm-svn: 188216	2013-08-12 22:38:39 +00:00
Bill Wendling	e1eaecd528	Move stack protector names to the same place. llvm-svn: 188198	2013-08-12 20:09:37 +00:00
Nadav Rotem	e23147bbd4	Fix PR16797 - Support PHINodes with multiple inputs from the same basic block. Do not generate new vector values for the same entries because we know that the incoming values from the same block must be identical. llvm-svn: 188185	2013-08-12 17:46:44 +00:00
Alexey Samsonov	15dc0af78b	Remove unused SpecialCaseList constructors llvm-svn: 188171	2013-08-12 11:50:44 +00:00
Alexey Samsonov	e4b5fb8851	Add SpecialCaseList::createOrDie() factory and use it in sanitizer passes llvm-svn: 188169	2013-08-12 11:46:09 +00:00
Alexey Samsonov	9e4fdd2656	Introduce factory methods for SpecialCaseList Summary: Doing work in constructors is bad: this change suggests to call SpecialCaseList::create(Path, Error) instead of "new SpecialCaseList(Path)". Currently the latter may crash with report_fatal_error, which is undesirable - sometimes we want to report the error to user gracefully - for example, if he provides an incorrect file as an argument of Clang's -fsanitize-blacklist flag. Reviewers: pcc Reviewed By: pcc CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1327 llvm-svn: 188156	2013-08-12 07:49:36 +00:00
Richard Sandiford	feb34713d5	Fix big-endian handling of integer-to-vector bitcasts in InstCombine These functions used to assume that the lsb of an integer corresponds to vector element 0, whereas for big-endian it's the other way around: the msb is in the first element and the lsb is in the last element. Fixes MultiSource/Benchmarks/mediabench/gsm/toast for z. llvm-svn: 188155	2013-08-12 07:26:09 +00:00
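The element ordering this fix relies on can be sketched as follows, assuming a 32-bit integer viewed as four 8-bit vector elements. The function name is hypothetical and the code is an illustration of the rule stated in the commit message, not the InstCombine change itself.

```cpp
#include <cstdint>

// Value of vector element `idx` (0..3) when a 32-bit integer is bitcast to
// four 8-bit elements. Little-endian: element 0 holds the least significant
// byte. Big-endian: element 0 holds the most significant byte.
std::uint8_t elementOfBitcast(std::uint32_t value, unsigned idx,
                              bool bigEndian) {
  unsigned shift = bigEndian ? (3 - idx) * 8 : idx * 8;
  return static_cast<std::uint8_t>(value >> shift);
}
```

For `0x11223344`, element 0 is `0x44` on a little-endian target and `0x11` on a big-endian one.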
Chandler Carruth	d7cd7e367e	Re-instate r187323 which fast-tracks promotable allocas as soon as the SROA-based analysis has enough information. This should work now that both mem2reg and the SSAUpdater-based AllocaPromoter have been updated to be able to promote the types of allocas that the SROA analysis detects. I've included tests for the AllocaPromoter that were only possible to write once we fast-tracked promotable allocas without rewriting them. This includes a test both for r187347 and r188145. Original commit log for r187323: """ Now that mem2reg understands how to cope with a slightly wider set of uses of an alloca, we can pre-compute promotability while analyzing an alloca for splitting in SROA. That lets us short-circuit the common case of a bunch of trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for typical frontend-generated IR sequences I'm seeing. It gets the new SROA to within 20% of ScalarRepl for such code. My current benchmark for these numbers is PR15412, but it fits the general pattern of IR emitted by Clang so it should be widely applicable. """ llvm-svn: 188146	2013-08-11 02:17:11 +00:00
Chandler Carruth	c17283b407	Finish fixing the SSAUpdater-based AllocaPromoter strategy in SROA to cope with the more general set of patterns that are now handled by mem2reg and that we can detect quickly while doing SROA's initial analysis. Notably, this allows it to promote through no-op bitcast and GEP sequences. A core part of the SSAUpdater approach is the ability to test whether a particular instruction is part of the set being promoted. Testing this becomes significantly more complex in the world where the operand to every load and store isn't the alloca itself. I ended up using the approach of walking up the def-chain until we find the alloca. I benchmarked this against keeping a set of pointer operands and keeping a set of the loads and stores we care about, and this one seemed faster although the difference was very small. No test case yet because currently the rewriting always "fixes" the inputs to not require this. The next patch which re-enables early promotion of easy cases in SROA will include a test case that specifically exercises this aspect of the alloca promoter. llvm-svn: 188145	2013-08-11 01:56:15 +00:00
Chandler Carruth	45b136f4cf	Reformat some bits of AllocaPromoter and simplify the name and type of our visiting datastructures in the AllocaPromoter/SSAUpdater path of SROA. Also shift the order of clears around to be more consistent. No functionality changed here, this is just a cleanup. llvm-svn: 188144	2013-08-11 01:03:18 +00:00
Arnold Schwaighofer	3dcdb89d69	Revert r188119 "Kill some duplicated code for removing unreachable BBs." It is breaking buildbots with libgmalloc enabled on Mac OS X. $ cd llvm ; mkdir release ; cd release $ ../configure --enable-optimized --prefix=$PWD/install $ make $ make check $ Release+Asserts/bin/llvm-lit -v --param use_gmalloc=1 --param \ gmalloc_path=/usr/lib/libgmalloc.dylib \ ../test/Instrumentation/DataFlowSanitizer/args-unreachable-bb.ll llvm-svn: 188142	2013-08-10 20:16:06 +00:00
Michael Gottesman	d6ce6cbdac	[objc-arc] Track if we encountered an additive overflow while computing {TopDown,BottomUp}PathCounts and do nothing if it occurred. I fixed the aforementioned problems that came up on some of the linux boxes. Major thanks to Nick Lewycky for his help debugging! rdar://14590914 llvm-svn: 188122	2013-08-09 23:22:27 +00:00
Peter Collingbourne	32090aba06	Kill some duplicated code for removing unreachable BBs. This moves removeUnreachableBlocksFromFn from SimplifyCFGPass.cpp to Utils/Local.cpp and uses it to replace the implementation of llvm::removeUnreachableBlocks, which appears to do a strict subset of what removeUnreachableBlocksFromFn does. Differential Revision: http://llvm-reviews.chandlerc.com/D1334 llvm-svn: 188119	2013-08-09 22:47:24 +00:00
Peter Collingbourne	ae66d57bcf	DataFlowSanitizer: Remove unreachable BBs so IR continues to verify under the args ABI. Differential Revision: http://llvm-reviews.chandlerc.com/D1316 llvm-svn: 188113	2013-08-09 21:42:53 +00:00
Jakub Staszak	23ec6a97d1	Mark obviously const methods. Also use reference for parameters when possible. llvm-svn: 188103	2013-08-09 20:53:48 +00:00
Michael Gottesman	6663c7d5fc	Revert "[objc-arc] Track if we encountered an additive overflow while computing {TopDown,BottomUp}PathCounts and do nothing if it occurred." This reverts commit r187941. The commit was passing on my os x box, but it is failing on some non-osx platforms. I do not have time to look into it now, so I am reverting and will recommit after I figure this out. llvm-svn: 187946	2013-08-08 00:41:18 +00:00
Peter Collingbourne	a5689e69af	Fix ARM build. llvm-svn: 187944	2013-08-08 00:15:27 +00:00
Michael Gottesman	ddc89fcccd	[objc-arc] Track if we encountered an additive overflow while computing {TopDown,BottomUp}PathCounts and do nothing if it occurred. rdar://14590914 llvm-svn: 187941	2013-08-07 23:56:41 +00:00
Michael Gottesman	0fecf98955	[objc-arc] Change 4 iterator methods which return const_iterators to be const methods. llvm-svn: 187940	2013-08-07 23:56:34 +00:00
Hal Finkel	171817ee8a	Add ISD::FROUND for libm round() All libm floating-point rounding functions, except for round(), had their own ISD nodes. Recent PowerPC cores have an instruction for round(), and so here I'm adding ISD::FROUND so that round() can be custom lowered as well. For the most part, this is straightforward. I've added an intrinsic and a matching ISD node just like those for nearbyint() and friends. The SelectionDAG pattern I've named frnd (because ISD::FP_ROUND has already claimed fround). This will be used by the PowerPC backend in a follow-up commit. llvm-svn: 187926	2013-08-07 22:49:12 +00:00
Peter Collingbourne	e5d5b0c71e	DataFlowSanitizer; LLVM changes. DataFlowSanitizer is a generalised dynamic data flow analysis. Unlike other Sanitizer tools, this tool is not designed to detect a specific class of bugs on its own. Instead, it provides a generic dynamic data flow analysis framework to be used by clients to help detect application-specific issues within their own code. Differential Revision: http://llvm-reviews.chandlerc.com/D965 llvm-svn: 187923	2013-08-07 22:47:18 +00:00
Benjamin Kramer	6a4976d3e0	JumpThreading: Turn a select instruction into branching if it allows to thread one half of the select. This is a common pattern coming out of simplifycfg generating gross code. a: ; preds = %entry %sel = select i1 %cmp1, double %add, double 0.000000e+00 br label %b b: %cond5 = phi double [ %sel, %a ], [ %sub, %entry ] %cmp6 = fcmp oeq double %cond5, 0.000000e+00 br i1 %cmp6, label %if.then, label %if.end becomes a: br i1 %cmp1, label %b, label %if.then b: %cond5 = phi double [ %sub, %entry ], [ %add, %a ] %cmp6 = fcmp oeq double %cond5, 0.000000e+00 br i1 %cmp6, label %if.then, label %if.end Skipping block b completely if possible. llvm-svn: 187880	2013-08-07 10:29:38 +00:00
Bill Wendling	58f8cef83b	Change the linkage of these global values to 'internal'. The globals being generated here were given the 'private' linkage type. However, this caused them to end up in different sections with the wrong prefix. E.g., they would be in the __TEXT,__const section with an 'L' prefix instead of an 'l' (lowercase ell) prefix. The problem is that the linker will eat a literal label with 'L'. If a weak symbol is then placed into the __TEXT,__const section near that literal, then it cannot distinguish between the literal and the weak symbol. Part of the problem here was introduced because the address sanitizer converted some C strings into constant initializers with trailing nuls. (Thus putting them in the __const section with the wrong prefix.) The others were variables that the address sanitizer created but simply had the wrong linkage type. llvm-svn: 187827	2013-08-06 22:52:42 +00:00
Arnold Schwaighofer	a7cd6bf3bb	LoopVectorize: Allow vectorization of loops with lifetime markers Patch by Marc Jessome! llvm-svn: 187825	2013-08-06 22:37:52 +00:00
Jakub Staszak	27da123d66	Adjust file to the coding standard. llvm-svn: 187808	2013-08-06 17:03:42 +00:00
Serge Pavlov	71044cbe16	Unbreak Debug build on Windows llvm-svn: 187786	2013-08-06 08:44:18 +00:00
Tom Stellard	aa664d9b92	Factor FlattenCFG out from SimplifyCFG Patch by: Mei Ye llvm-svn: 187764	2013-08-06 02:43:45 +00:00
Matt Arsenault	ff7dc7248e	Fix missing -*- C++ -*-s llvm-svn: 187758	2013-08-06 00:16:21 +00:00
Peter Collingbourne	bace606657	Introduce an optimisation for special case lists with large numbers of literal entries. Our internal regex implementation does not cope with large numbers of anchors very efficiently. Given a ~3600-entry special case list, regex compilation can take on the order of seconds. This patch solves the problem for the special case of patterns matching literal global names (i.e. patterns with no regex metacharacters). Rather than forming regexes from literal global name patterns, add them to a StringSet which is checked before matching against the regex. This reduces regex compilation time by an order of roughly thousands when reading the aforementioned special case list, according to a completely unscientific study. No test cases. I figure that any new tests for this code should check that regex metacharacters are properly recognised. However, I could not find any documentation which documents the fact that the syntax of global names in special case lists is based on regexes. The extent to which regex syntax is supported in special case lists should probably be decided on/documented before writing tests. Differential Revision: http://llvm-reviews.chandlerc.com/D1150 llvm-svn: 187732	2013-08-05 17:48:04 +00:00
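The literal fast path described in this commit message can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the SpecialCaseList code: `NameMatcher` is a hypothetical class, `std::unordered_set` stands in for LLVM's StringSet, `std::regex` stands in for LLVM's internal regex implementation, and the metacharacter list is a guess at what counts as "no regex metacharacters".

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical matcher: literal patterns go into a hash set that is checked
// first; only patterns containing regex metacharacters are compiled.
class NameMatcher {
  std::unordered_set<std::string> literals;
  std::vector<std::regex> patterns;

  // Assumed metacharacter set for this sketch.
  static bool hasMetacharacters(const std::string &pattern) {
    return pattern.find_first_of("()^$|*+?.[]\\{}") != std::string::npos;
  }

public:
  void add(const std::string &pattern) {
    if (hasMetacharacters(pattern))
      patterns.emplace_back(pattern); // compiled as an ECMAScript regex
    else
      literals.insert(pattern);       // no regex compilation needed
  }

  bool matches(const std::string &name) const {
    if (literals.count(name))
      return true; // fast path: exact literal hit
    for (const std::regex &re : patterns)
      if (std::regex_match(name, re))
        return true;
    return false;
  }

  std::size_t literalCount() const { return literals.size(); }
};
```

With a list dominated by literal names, almost every entry lands in the set, so the cost of regex compilation is paid only for the few entries that actually need it.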
Alexey Samsonov	f52b717db3	80-cols llvm-svn: 187725	2013-08-05 13:19:49 +00:00
Nadav Rotem	5defea90e6	SLPVectorizer: Fix PR16777. PHInodes may use multiple extracted values that come from different blocks. Thanks Alexey Samsonov. llvm-svn: 187663	2013-08-02 18:40:24 +00:00
Alexey Samsonov	9096968de5	Fix dereferencing end iterator in SimplifyCFG. Patch by Ye Mei. llvm-svn: 187646	2013-08-02 08:06:43 +00:00
Matt Arsenault	87dc60761f	Teach getOrEnforceKnownAlignment about address spaces llvm-svn: 187629	2013-08-01 22:42:18 +00:00
Nadav Rotem	e4e6e9ed47	Move the optlevel check to the frontend. llvm-svn: 187628	2013-08-01 22:41:58 +00:00
Nadav Rotem	9153b3871d	Only enable SLP-vectorization on O3 builds. llvm-svn: 187595	2013-08-01 18:28:15 +00:00
Nadav Rotem	25f15358d2	80-col llvm-svn: 187535	2013-07-31 22:17:45 +00:00
Owen Anderson	c7be519dc0	Preserve fast-math flags when folding (fsub x, (fneg y)) to (fadd x, y). llvm-svn: 187462	2013-07-30 23:53:17 +00:00
Matt Arsenault	cacbb2377a	Change behavior of calling bitcasted alias functions. It will now only convert the arguments / return value and call the underlying function if the types are able to be bitcasted. This avoids using fp<->int conversions that would occur before. llvm-svn: 187444	2013-07-30 20:45:05 +00:00
Nadav Rotem	d9c74cc6d3	SLPVectorizer: update the debug location for the new instructions. llvm-svn: 187363	2013-07-29 18:18:46 +00:00
Chandler Carruth	cd7c8cdfa1	Teach the AllocaPromoter which is wrapped around the SSAUpdater infrastructure to do promotion without a domtree the same smarts about looking through GEPs, bitcasts, etc., that I just taught mem2reg about. This way, if SROA chooses to promote an alloca which still has some noisy instructions this code can cope with them. I've not used as principled of an approach here for two reasons: 1) This code doesn't really need it as we were already set up to zip through the instructions used by the alloca. 2) I view the code here as more of a hack, and hopefully a temporary one. The SSAUpdater path in SROA is a real sore point for me. It doesn't make a lot of architectural sense for many reasons: - We're likely to end up needing the domtree anyways in a subsequent pass, so why not compute it earlier and use it. - In the future we'll likely end up needing the domtree for parts of the inliner itself. - If we need to we could teach the inliner to preserve the domtree. Part of the re-work of the pass manager will allow this to be very powerful even in large SCCs with many functions. - Ultimately, computing a domtree has gotten significantly faster since the original SSAUpdater-using code went into ScalarRepl. We no longer use domfrontiers, and much of domtree is lazily done based on queries rather than eagerly. - At this point keeping the SSAUpdater-based promotion saves a total of 0.7% on a build of the 'opt' tool for me. That's not a lot of performance given the complexity! So I'm leaving this a bit ugly in the hope that eventually we just remove all of this nonsense. I can't even readily test this because this code isn't reachable except through SROA. When I re-instate the patch that fast-tracks allocas already suitable for promotion, I'll add a testcase there that failed before this change. Before that, SROA will fix any test case I give it. llvm-svn: 187347	2013-07-29 09:06:53 +00:00
Nadav Rotem	750e42cba3	Don't vectorize when the attribute NoImplicitFloat is used. llvm-svn: 187340	2013-07-29 05:13:00 +00:00
Rafael Espindola	caa776be91	Fix -Wdocumentation warnings. llvm-svn: 187336	2013-07-28 23:43:28 +00:00
Chandler Carruth	6b55dbea86	Update comments for SSAUpdater to use the modern doxygen comment standards for LLVM. Remove duplicated comments on the interface from the implementation file (implementation comments are left there of course). Also clean up, re-word, and fix a few typos and errors in the comments spotted along the way. This is in preparation for changes to these files and to keep the uninteresting tidying in a separate commit. llvm-svn: 187335	2013-07-28 22:00:33 +00:00
Chandler Carruth	d31370e060	Temporarily revert r187323 until I update SSAUpdater to match mem2reg. I forgot that we had two totally independent things here. :: sigh :: llvm-svn: 187327	2013-07-28 09:05:49 +00:00
Chandler Carruth	9d96100ff0	Now that mem2reg understands how to cope with a slightly wider set of uses of an alloca, we can pre-compute promotability while analyzing an alloca for splitting in SROA. That lets us short-circuit the common case of a bunch of trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for typical frontend-generated IR sequences I'm seeing. It gets the new SROA to within 20% of ScalarRepl for such code. My current benchmark for these numbers is PR15412, but it fits the general pattern of IR emitted by Clang so it should be widely applicable. llvm-svn: 187323	2013-07-28 08:27:12 +00:00
Chandler Carruth	d5b806a27f	Thread DataLayout through the callers and into mem2reg. This will be useful in a subsequent patch, but causes an unfortunate amount of noise, so I pulled it out into a separate patch. llvm-svn: 187322	2013-07-28 06:43:11 +00:00
Nadav Rotem	3e50c68956	Update the comment llvm-svn: 187316	2013-07-27 23:28:47 +00:00
Chandler Carruth	8e3c4dc50e	Don't use all the #ifdefs to hide the stats counters and instead rely on their being optimized out in debug mode. Realistically, this just isn't going to be the slow part anyways. This also fixes unused variable warnings that are breaking LLD build bots. =/ I didn't see these at first, and kept losing track of the fact that they were broken. llvm-svn: 187297	2013-07-27 10:17:49 +00:00
Chandler Carruth	e8f5812a30	Merge the removal of dead instructions and lifetime markers with the analysis of the alloca. We don't need to visit all the users twice for this. We build up a kill list during the analysis and then just process it afterward. This recovers the tiny bit of performance lost by moving to the visitor based analysis system as it removes one entire use-list walk from mem2reg. In some cases, this is now faster than mem2reg was previously. llvm-svn: 187296	2013-07-27 09:43:30 +00:00
Nick Lewycky	0b68245ec8	Reimplement isPotentiallyReachable to make nocapture deduction much stronger. Adds unit tests for it too. Split BasicBlockUtils into an analysis-half and a transforms-half, and put the analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable into llvm::isPotentiallyReachable and move it into Analysis/CFG. llvm-svn: 187283	2013-07-27 01:24:00 +00:00
Tom Stellard	8b1e021e85	SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278	2013-07-27 00:01:07 +00:00
Nadav Rotem	cfd40da9b1	SLP Vectorizer: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job in deciding when to vectorize. llvm-svn: 187267	2013-07-26 23:07:55 +00:00
Nadav Rotem	9ce0f779bc	SLP Vectorizer: Disable the vectorization of non power of two chains, such as <3 x float>, because we don't have a good cost model for these types. llvm-svn: 187265	2013-07-26 22:53:11 +00:00
Owen Anderson	d6d4da09f7	Fix variable name. llvm-svn: 187253	2013-07-26 22:06:21 +00:00
Owen Anderson	e37c2e4d11	When InstCombine tries to fold away (fsub x, (fneg y)) into (fadd x, y), it is also worthwhile for it to look through FP extensions and truncations, whose application commutes with fneg. llvm-svn: 187249	2013-07-26 21:40:29 +00:00
Stephen Lin	4ef1387221	Correct case of m_UIToFp to m_UIToFP to match instruction name, add m_SIToFP for consistency. llvm-svn: 187225	2013-07-26 17:55:00 +00:00
Chandler Carruth	9af38fc247	Re-implement the analysis of uses in mem2reg to be significantly more robust. It now uses an InstVisitor and worklist to actually walk the uses of the Alloca transitively and detect the pattern which we can directly promote: loads & stores of the whole alloca and instructions we can completely ignore. Also, with this new implementation teach both the predicate for testing whether we can promote and the promotion engine itself to use the same code so we no longer have strange divergence between the two code paths. I've added some silly test cases to demonstrate that we can handle slightly more degenerate code patterns now. See the below for why this is even interesting. Performance impact: roughly 1% regression in the performance of SROA or ScalarRepl on a large C++-ish test case where most of the allocas are basically ready for promotion. The reason is because of silly redundant work that I've left FIXMEs for and which I'll address in the next commit. I wanted to separate this commit as it changes the behavior. Once the redundant work in removing the dead uses of the alloca is fixed, this code appears to be faster than the old version. =] So why is this useful? Because the previous requirement for promotion required a specific visit pattern of the uses of the alloca to verify: we had to look for no more than 1 intervening use. The end goal is to have SROA automatically detect when an alloca is already promotable and directly hand it to the mem2reg machinery rather than trying to partition and rewrite it. This is a 25% or more performance improvement for SROA, and a significant chunk of the delta between it and ScalarRepl. To get there, we need to make mem2reg actually capable of promoting allocas which look promotable to SROA without having SROA do tons of work to massage the code into just the right form. This is actually the tip of the iceberg.
There are tremendous potential savings we can realize here by de-duplicating work between mem2reg and SROA. llvm-svn: 187191	2013-07-26 08:20:39 +00:00
Bill Schmidt	0a9170d931	[PowerPC] Support powerpc64le as a syntax-checking target. This patch provides basic support for powerpc64le as an LLVM target. However, use of this target will not actually generate little-endian code. Instead, use of the target will cause the correct little-endian built-in defines to be generated, so that code that tests for __LITTLE_ENDIAN__, for example, will be correctly parsed for syntax-only testing. Code generation will otherwise be the same as powerpc64 (big-endian), for now. The patch leaves open the possibility of creating a little-endian PowerPC64 back end, but there is no immediate intent to create such a thing. The LLVM portions of this patch simply add ppc64le coverage everywhere that ppc64 coverage currently exists. There is nothing of any import worth testing until such time as little-endian code generation is implemented. In the corresponding Clang patch, there is a new test case variant to ensure that correct built-in defines for little-endian code are generated. llvm-svn: 187179	2013-07-26 01:35:43 +00:00
Rafael Espindola	17600e29fa	Respect llvm.used in Internalize. The language reference says that: "If a symbol appears in the @llvm.used list, then the compiler, assembler, and linker are required to treat the symbol as if there is a reference to the symbol that it cannot see" Since even the linker cannot see the reference, we must assume that the reference can be using the symbol table. For example, a user can add __attribute__((used)) to a debug helper function like dump and use it from a debugger. llvm-svn: 187103	2013-07-25 03:23:25 +00:00
Nick Lewycky	5b15037fc9	Check that TD isn't NULL before dereferencing it down this path. llvm-svn: 187099	2013-07-25 02:55:14 +00:00
Rafael Espindola	ec2375fb51	Make these methods const correct. Thanks to Nick Lewycky for noticing it. llvm-svn: 187098	2013-07-25 02:50:08 +00:00
Benjamin Kramer	328da33d19	TRE: Move class into anonymous namespace. While there shrink a dangerously large SmallPtrSet. llvm-svn: 187050	2013-07-24 16:12:08 +00:00
Chandler Carruth	58e25d3905	Fix a problem I introduced in r187029 where we would over-eagerly schedule an alloca for another iteration in SROA. This only showed up with a mixture of promotable and unpromotable selects and phis. Added a test case for this. llvm-svn: 187031	2013-07-24 12:12:17 +00:00
Chandler Carruth	83ea195d40	Fix PR16687 where we were incorrectly promoting an alloca that had pending speculation for a phi node. The problem here is that we were using growth of the speculation set as an indicator of whether speculation would occur, and if the phi node is already in the set we don't see it grow. This is a symptom of the fact that this signal is a total hack. Unfortunately, I couldn't really come up with a non-hacky way of signaling that promotion remains valid after speculation occurs, such that we only speculate when all else looks good for promotion. In the end, I went with at least a much more explicit approach of doing the work of queuing inside the phi and select processing and setting a preposterously named flag to convey that we're in the special state of requiring speculating before promotion. Thanks to Richard Trieu and Nick Lewycky for the excellent work reducing a testcase for this from a pretty giant, nasty assert in a big application. =] The testcase was excellent. llvm-svn: 187029	2013-07-24 09:47:28 +00:00
Matt Arsenault	f64212b281	Fix spelling llvm-svn: 186997	2013-07-23 22:20:57 +00:00
Nick Lewycky	6ab9d936d5	Remove extraneous null statement. No functionality change! llvm-svn: 186893	2013-07-22 23:38:27 +00:00
Jakub Staszak	d4d94065e3	Use switch instead of if. No functionality change. llvm-svn: 186892	2013-07-22 23:38:16 +00:00
Jakub Staszak	8e1a6e7d53	Remove trailing spaces. llvm-svn: 186890	2013-07-22 23:16:36 +00:00
Nadav Rotem	cf0dcdc71c	When we vectorize across multiple basic blocks we may vectorize PHINodes that create a cycle. We already break the cycle on phi-nodes, but arithmetic operations are still duplicated. This patch adds code that checks if the operation that we are vectorizing was vectorized during the visit of the operands and uses this value if it can. llvm-svn: 186883	2013-07-22 22:18:07 +00:00
Jakub Staszak	cb132face0	OldPtr is llvm::Instruction. Remove unneeded cast<>. llvm-svn: 186880	2013-07-22 22:10:43 +00:00
Jakub Staszak	6b36db08f3	Change tabs to spaces. llvm-svn: 186877	2013-07-22 21:11:30 +00:00
Matt Arsenault	fb18323885	Fix spelling and grammar llvm-svn: 186858	2013-07-22 18:59:58 +00:00
Nadav Rotem	8c45d4b27f	Fix an obvious typo in the loop vectorizer where the cost model uses the wrong variable. The variable BlockCost is ignored. We don't have tests for the effect of if-conversion loops because it requires a big test (that includes if-converted loops) and it is difficult to find and balance a loop to do the right thing. llvm-svn: 186845	2013-07-22 17:10:48 +00:00
Nadav Rotem	d7ff88a8d9	Delete unused helper functions. llvm-svn: 186808	2013-07-22 05:19:22 +00:00
Benjamin Kramer	2fdb758ca8	mem2reg: Minor STL usage cleanup. No functionality change. llvm-svn: 186790	2013-07-21 11:03:40 +00:00
Chandler Carruth	7aa9ebb546	Make the mem2reg interface use an ArrayRef as it keeps a copy of these to iterate over. llvm-svn: 186788	2013-07-21 08:37:58 +00:00
Nadav Rotem	f6bb6a464c	Revert a part of r186420. Don't forbid multiple store chains that merge. llvm-svn: 186786	2013-07-21 06:12:57 +00:00
Chandler Carruth	b1ca98c4d0	Hoist the rest of the logic for promoting single-store allocas into the helper function. This leaves both trivial cases handled entirely in helper functions and merely manages the list of allocas to process in the run method. The next step will be to handle all of the trivial promotion work prior to even creating the core class and the subsequent simplifications that enables. llvm-svn: 186784	2013-07-21 01:52:33 +00:00
Chandler Carruth	f9e7e1dd87	Hoist the rest of the logic for fully promoting allocas with all uses in a single block into the helper routine. This takes advantage of the fact that we can directly replace uses prior to any store with undef to simplify matters and unconditionally promote allocas only used within one block. I've removed the special handling for the case of no stores existing. This has no semantic effect but might slow things down. I'll fix that in a later patch when I refactor this entire thing to be easier to manage the different cases. llvm-svn: 186783	2013-07-21 01:44:07 +00:00
Chandler Carruth	e99f931516	Remove a method made dead by the prior refactoring. llvm-svn: 186782	2013-07-21 00:01:34 +00:00
Chandler Carruth	420fafef93	Hoist the two trivial promotion routines out of the big class that handles the general cases. The hope is to refactor this so that we don't end up building the entire class for the trivial cases. I also want to lift a lot of the early pre-processing in the initial segment of run() into a separate routine, and really none of it needs to happen inside the primary promotion class. These routines in particular used none of the actual state in the promotion class, so they don't really make sense as members. llvm-svn: 186781	2013-07-20 23:59:51 +00:00
Chandler Carruth	48e11fd76d	Hoist the AllocaInfo struct to the top of the file. This struct is nicely independent of everything else, and we already needed a forward declaration here. It's simpler to just define it immediately. llvm-svn: 186780	2013-07-20 23:39:26 +00:00
Chandler Carruth	4711793e8a	Sink a typedef and comparator down to the function that actually uses them. llvm-svn: 186779	2013-07-20 23:36:19 +00:00
Rafael Espindola	c2bb73fc8d	Don't crash when llvm.compiler.used becomes empty. GlobalOpt simplifies llvm.compiler.used by removing any members that are also in the more strict llvm.used. Handle the special case where llvm.compiler.used becomes empty. llvm-svn: 186778	2013-07-20 23:33:15 +00:00
Chandler Carruth	f3878f46ce	Don't allocate the DIBuilder on the heap and remove all the complexity that ensued from that. llvm-svn: 186777	2013-07-20 23:33:06 +00:00
Chandler Carruth	e62f211b77	Rename constructor parameters to follow the common member-shadowing pattern and conform to the naming conventions. llvm-svn: 186776	2013-07-20 23:23:47 +00:00
Chandler Carruth	b3e8e6f10b	Reformat the implementation of mem2reg with clang-format so that my subsequent changes don't introduce inconsistencies. llvm-svn: 186775	2013-07-20 23:20:08 +00:00
Chandler Carruth	985eb0b550	Remove a DenseMapInfo specialization for std::pair -- we have one of those baked into DenseMap now. llvm-svn: 186773	2013-07-20 23:09:05 +00:00
Chandler Carruth	019516109d	Update mem2reg's comments to conform to the new doxygen standards. No functionality changed. llvm-svn: 186772	2013-07-20 22:20:05 +00:00
Benjamin Kramer	08e5070bf5	SROA: Microoptimization: Remove dead entries first, then sort. While there replace an explicit struct with std::mem_fun. llvm-svn: 186761	2013-07-20 08:38:34 +00:00
Stephen Lin	a9b57f6bea	InstCombine: call FoldOpIntoSelect for all floating binops, not just fmul llvm-svn: 186759	2013-07-20 07:13:13 +00:00
Nadav Rotem	e210839f5b	fix an 80-col line. llvm-svn: 186733	2013-07-19 23:14:01 +00:00
Nadav Rotem	c069c25518	Use LLVM's ADTs that improve the compile time of this pass. llvm-svn: 186732	2013-07-19 23:12:19 +00:00
Nadav Rotem	5c9a193a65	SLPVectorizer: Improve the compile time of isConsecutive by reordering the conditions that check GEPs and eliminate two of the calls to accumulateConstantOffset. llvm-svn: 186731	2013-07-19 23:11:15 +00:00
Rafael Espindola	9aadcc4c0e	s/compiler_used/compiler.used/. We were incorrectly using compiler_used instead of compiler.used. Unfortunately the passes using the broken name had tests also using the broken name. llvm-svn: 186705	2013-07-19 18:44:51 +00:00
Chandler Carruth	6c321c131b	Cleanup the stats counters for the new implementation. These actually count the right things and have the right names. llvm-svn: 186667	2013-07-19 10:57:36 +00:00
Chandler Carruth	1ed848d55c	Fix another assert failure very similar to PR16651's test case. This test case came from Benjamin and found the parallel bug in the vector promotion code. llvm-svn: 186666	2013-07-19 10:57:32 +00:00
Chandler Carruth	9f21fe1d65	Try to move to a more reasonable set of naming conventions given the new implementation of the SROA algorithm. We were using the term 'partition' in many places that no longer ever represented an actual partition, but rather just an arbitrary slice of an alloca. No functionality change intended here. Mostly just renaming of types, functions, variables, and rewording of comments. Several comments were rewritten to make a lot more sense in the new structure of things. The stats are still weird and not reflective of how this really works. I'll fix those up in a separate patch as it is a touch more semantic of a change... llvm-svn: 186659	2013-07-19 09:13:58 +00:00
Chandler Carruth	90a735d606	A long overdue cleanup in SROA to use 'DL' instead of 'TD' for the DataLayout variables. llvm-svn: 186656	2013-07-19 07:21:28 +00:00
Chandler Carruth	5955c9e4da	Fix PR16651, an assert introduced in my recent re-work of the innards of SROA. The crux of the issue is that now we track uses of a partition of the alloca in two places: the iterators over the partitioning uses and the previously collected split uses vector. We weren't accounting for the fact that the split uses might invalidate integer widening in ways other than due to their width (in this case due to being volatile). Further reduced testcase added to the tests. llvm-svn: 186655	2013-07-19 07:12:23 +00:00
Eric Christopher	03b3e1118f	Remove DIBuilder cache of variable TheCU and change the few uses that wanted it. Also change the interface for createCompileUnit to compensate. Fix comments that refer to TheCU as well. llvm-svn: 186637	2013-07-19 00:51:47 +00:00
Nick Lewycky	03f3d34ffb	Clean up some of this code a tiny bit, no functionality change. llvm-svn: 186622	2013-07-18 22:32:32 +00:00
Eric Christopher	a4b6cf14f6	Revert "Remove DIBuilder cache of variable TheCU and change the few" This reverts commit r186599 as I didn't want to commit this yet. llvm-svn: 186601	2013-07-18 19:13:06 +00:00
Eric Christopher	d0b2150f01	Remove DIBuilder cache of variable TheCU and change the few uses that wanted it. Also change the interface for createCompileUnit to compensate. Fix comments that refer to TheCU as well. llvm-svn: 186599	2013-07-18 19:11:29 +00:00
Nadav Rotem	bb3398f000	Handle constants without going through SCEV. llvm-svn: 186593	2013-07-18 18:34:21 +00:00
Nadav Rotem	de2815a5f7	SLPVectorizer: Speedup isConsecutive by manually checking GEPs with multiple indices. This brings the compile time of the SLP-Vectorizer to about 2.5% of OPT for my testcase. llvm-svn: 186592	2013-07-18 18:20:45 +00:00
Chandler Carruth	f0546402af	Reapply r186316 with a fix for one bug where the code could walk off the end of a vector. This was found with ASan. I've had one other report of a crasher, but thus far been unable to reproduce the crash. It may well be fixed with this version, and if not I'd like to get more information from the build bots about what is happening. See r186316 for the full commit log for the new implementation of the SROA algorithm. llvm-svn: 186565	2013-07-18 07:15:00 +00:00
Nadav Rotem	7d7036b8c6	SLPVectorizer: Speedup isConsecutive (that checks if two addresses are consecutive in memory) by checking for additional patterns that don't need to go through SCEV. llvm-svn: 186563	2013-07-18 04:33:20 +00:00
Eric Christopher	7ab2c3ecb2	Add comparison operators for DIDescriptors to fix c++98 fallout of operator bool change. Also convert a variable in DebugIR. llvm-svn: 186544	2013-07-17 23:25:22 +00:00
Nadav Rotem	43639e8492	Fix a comment. llvm-svn: 186541	2013-07-17 22:41:16 +00:00
Stephen Lin	03f9fbbcd7	Restore r181216, which was partially reverted in r182499. llvm-svn: 186533	2013-07-17 20:06:03 +00:00
Nadav Rotem	3072baeb9c	Add a micro optimization to catch cases where the PtrA equals PtrB. llvm-svn: 186531	2013-07-17 19:52:25 +00:00
Hal Finkel	ec7cd26968	Fix comparisons of alloca alignment in inliner merging Duncan pointed out a mistake in my fix in r186425 when only one of the allocas being compared had the target-default alignment. This is essentially his suggested solution. Thanks! llvm-svn: 186510	2013-07-17 14:32:41 +00:00
Craig Topper	24048c9440	Mark a method 'const' and another 'static'. llvm-svn: 186485	2013-07-17 03:54:53 +00:00
Craig Topper	1c4d667ca5	Make a few more static string pointers constant. llvm-svn: 186484	2013-07-17 03:43:10 +00:00
Nadav Rotem	2202317fce	SLPVectorizer: Accelerate the isConsecutive check by replacing the subtraction of the two values with a simple SCEV expression that adds the offset to one of the pointers that we compare. llvm-svn: 186479	2013-07-17 00:48:31 +00:00
Nadav Rotem	d2e8c4cdea	flip the scev minus direction to simplify the code. llvm-svn: 186466	2013-07-16 22:57:06 +00:00
Nadav Rotem	8f924f3891	SLPVectorizer: Improve the compile time of isConsecutive by adding a simple constant-gep check before using SCEV. This check does not always work because not all of the GEPs use a constant offset, but it happens often enough to reduce the number of times we use SCEV. llvm-svn: 186465	2013-07-16 22:51:07 +00:00
Rafael Espindola	6d35481c94	Add a wrapper for open. This centralizes the handling of O_BINARY and opens the way for hiding more differences (like how open behaves with directories). llvm-svn: 186447	2013-07-16 19:44:17 +00:00
Peter Collingbourne	8b77f18da0	Make SpecialCaseList match full strings, as documented, using anchors. Differential Revision: http://llvm-reviews.chandlerc.com/D1149 llvm-svn: 186431	2013-07-16 17:56:07 +00:00
Hal Finkel	9caa8f7ba7	When the inliner merges allocas, it must keep the larger alignment For safety, the inliner cannot decrease the alignment on an alloca when merging it with another. I've included two variants of the test case for this: one with DataLayout available, and one without. When DataLayout is not available, if only one of the allocas uses the default alignment (getAlignment() == 0), then they cannot be safely merged. llvm-svn: 186425	2013-07-16 17:10:55 +00:00
Nadav Rotem	26bf9a0c75	SLPVectorizer: Reduce the compile time of the consecutive store lookup. Process groups of stores in chunks of 16. llvm-svn: 186420	2013-07-16 15:25:17 +00:00
Craig Topper	d3a34f81f8	Add 'const' qualifiers to static const char* variables. llvm-svn: 186371	2013-07-16 01:17:10 +00:00
Nadav Rotem	1c1d6c1666	PR16628: Fix a bug in the code that merges compares. Compares return i1 but they compare different types. llvm-svn: 186359	2013-07-15 22:52:48 +00:00
Stephen Lin	837bba1c51	Remove trailing whitespace llvm-svn: 186333	2013-07-15 17:55:02 +00:00
Chandler Carruth	e3899f2c2c	Revert r186316 while I track down an ASan failure and an assert from a bot. This reverts the commit which introduced a new implementation of the fancy SROA pass designed to reduce its overhead. I'll skip the huge commit log here, refer to r186316 if you're looking for how this all works and why it works that way. llvm-svn: 186332	2013-07-15 17:36:21 +00:00
Chandler Carruth	e74ff4c643	Reimplement SROA yet again. Same fundamental principle, but a totally different core implementation strategy. Previously, SROA would build a relatively elaborate partitioning of an alloca, associate uses with each partition, and then rewrite the uses of each partition in an attempt to break apart the alloca into chunks that could be promoted. This was very wasteful in terms of memory and compile time because regardless of how complex the alloca or how much we're able to do in breaking it up, all of the datastructure work to analyze the partitioning was done up front. The new implementation attempts to form partitions of the alloca lazily and on the fly, rewriting the uses that make up that partition as it goes. This has a few significant effects: 1) Much simpler data structures are used throughout. 2) No more double walk of the recursive use graph of the alloca, only walk it once. 3) No more complex algorithms for associating a particular use with a particular partition. 4) PHI and Select speculation is simplified and happens lazily. 5) More precise information is available about a specific use of the alloca, removing the need for some side datastructures. Ultimately, I think this is a much better implementation. It removes about 300 lines of code, but arguably removes more like 500 considering that some code grew in the process of being factored apart and cleaned up for this all to work. I've re-used as much of the old implementation as possible, which includes the lion's share of code in the form of the rewriting logic. The interesting new logic centers around how the uses of a partition are sorted, and split into actual partitions. Each instruction using a pointer derived from the alloca gets a 'Partition' entry. This name is totally wrong, but I'll do a rename in a follow-up commit as there is already enough churn here. The entry describes the offset range accessed and the nature of the access. Once we have all of these entries we sort them in a very specific way: increasing order of begin offset, followed by whether they are splittable uses (memcpy, etc), followed by the end offset or whatever. Sorting by splittability is important as it simplifies the collection of uses into a partition. Once we have these uses sorted, we walk from the beginning to the end building up a range of uses that form a partition of the alloca. Overlapping unsplittable uses are merged into a single partition while splittable uses are broken apart and carried from one partition to the next. A partition is also introduced to bridge splittable uses between the unsplittable regions when necessary. I've looked at the performance PRs fairly closely. PR15471 no longer will even load (the module is invalid). Not sure what is up there. PR15412 improves by between 5% and 10%, however it is nearly impossible to know what is holding it up as SROA (the entire pass) takes less time than reading the IR for that test case. The analysis takes the same time as running mem2reg on the final allocas. I suspect (without much evidence) that the new implementation will scale much better however, and it is just the small nature of the test cases that makes the changes small and noisy. Either way, it is still simpler and cleaner I think. llvm-svn: 186316	2013-07-15 10:30:19 +00:00
Craig Topper	06b3b6651e	Add 'const' qualifier to some arrays. llvm-svn: 186312	2013-07-15 08:02:13 +00:00
Craig Topper	5871321e49	Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]). llvm-svn: 186301	2013-07-15 04:27:47 +00:00
Nadav Rotem	d9f3f4548e	SLPVectorizer: change the order in which we search for vectorization candidates. Do stores first and PHIs second. llvm-svn: 186277	2013-07-14 06:15:46 +00:00
Craig Topper	b94011fd28	Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size. llvm-svn: 186274	2013-07-14 04:42:23 +00:00
Arnold Schwaighofer	a92eeebde8	LoopVectorizer: Disallow reductions whose header phi is used outside the loop If an outside loop user of the reduction value uses the header phi node we cannot just reduce the vectorized phi value in the vector code epilog because we would lose VF-1 reductions. lp: p = phi (0, lv) lv = lv + 1 ... brcond , lp, outside outside: usr = add 0, p (Say the loop iterates two times, the value of p coming out of the loop is one). We cannot just transform this to: vlp: p = phi (<0,0>, lv) lv = lv + <1,1> .. brcond , lp, outside outside: p_reduced = p[0] + [1]; usr = add 0, p_reduced (Because the original loop iterated two times the vectorized loop would iterate one time, but p_reduced ends up being zero instead of one). We would have to execute VF-1 iterations in the scalar remainder loop in such cases. For now, just disable vectorization. PR16522 llvm-svn: 186256	2013-07-13 19:09:29 +00:00
Andrew Trick	0ae8c94f8f	LoopVectorize fix: LoopInfo must be valid when invoking utils like SCEVExpander. In general, one should always complete CFG modifications first, update CFG-based analyses, like Dominators and LoopInfo, then generate instruction sequences. LoopVectorizer was creating a new loop, calling SCEVExpander to generate checks, then updating LoopInfo. I just changed the order. llvm-svn: 186241	2013-07-13 06:20:06 +00:00
Nick Lewycky	7459be6dc7	Add a microoptimization for urem. llvm-svn: 186235	2013-07-13 01:16:47 +00:00
Joey Gouly	a3250f22c2	Fix a crash in EvaluateInDifferentElementOrder where it would generate an undef vector of the wrong type. LGTM'd by Nick Lewycky on IRC. llvm-svn: 186224	2013-07-12 23:08:06 +00:00
Andrew Trick	a1e4118a46	LFTR improvement to avoid truncation. This is a reimplementation of the patch originally in r186107. llvm-svn: 186215	2013-07-12 22:08:48 +00:00
Andrew Trick	2b71848ffe	Cleanup LFTR logic. llvm-svn: 186214	2013-07-12 22:08:44 +00:00
Andrew Trick	466555e50d	Cleanup: rename a variable to make the logic easier to follow. llvm-svn: 186213	2013-07-12 22:08:41 +00:00
Arnold Schwaighofer	9da9a43af8	TargetTransformInfo: address calculation parameter for gather/scatter Address calculation for gather/scatter in vectorized code can incur a significant cost making vectorization unbeneficial. Add infrastructure to add cost. Tests and cost model for targets will be in follow-up commits. radar://14351991 llvm-svn: 186187	2013-07-12 19:16:02 +00:00
Chandler Carruth	cf3715cadd	Revert "indvars: Improve LFTR by eliminating truncation when comparing against a constant." This reverts commit r186107. It didn't handle wrapping arithmetic in the loop correctly and thus caused the following C program to count from 0 to UINT64_MAX instead of from 0 to 255 as intended: #include <stdio.h> int main() { unsigned char first = 0, last = 255; do { printf("%d\n", first); } while (first++ != last); } Full test case and instructions to reproduce with just the -indvars pass sent to the original review thread rather than to r186107's commit. llvm-svn: 186152	2013-07-12 11:18:55 +00:00
Nadav Rotem	89c41bf06a	SLPVectorizer: Sink and enable CSE for ExtractElements. llvm-svn: 186145	2013-07-12 06:09:24 +00:00
Nadav Rotem	fa3c2db211	SLPVectorize: Replace the code that checks for vectorization candidates in successor blocks with code that scans PHINodes. Before we could vectorize PHINodes scanning successors was a good way of finding candidates. Now we can vectorize the phinodes which is simpler. llvm-svn: 186139	2013-07-12 00:04:18 +00:00
Nadav Rotem	db06b139fd	Remove an argument that we don't use anymore. llvm-svn: 186116	2013-07-11 20:56:13 +00:00
Andrew Trick	3095993d6f	indvars: Improve LFTR by eliminating truncation when comparing against a constant. Patch by Michele Scandale! Adds a special handling of the case where, during the loop exit condition rewriting, the exit value is a constant of bitwidth lower than the type of the induction variable: instead of introducing a trunc operation in order to match correctly the operand types, it allows to convert the constant value to an equivalent constant, depending on the initial value of the induction variable and the trip count, in order to have an equivalent comparison between the induction variable and the new constant. llvm-svn: 186107	2013-07-11 17:08:59 +00:00
Benjamin Kramer	fc3ea6f4bc	Don't use a potentially expensive shift if all we want is one set bit. No functionality change. llvm-svn: 186095	2013-07-11 16:05:50 +00:00
Arnold Schwaighofer	e97c71b8fd	LoopVectorize: Vectorize all accesses in address space zero with unit stride We can vectorize them because in the case where we wrap in the address space the unvectorized code would have had to access a pointer value of zero which is undefined behavior in address space zero according to the LLVM IR semantics. (Thank you Duncan, for pointing this out to me). Fixes PR16592. llvm-svn: 186088	2013-07-11 15:21:55 +00:00
Duncan Sands	e773c08021	TryToSimplifyUncondBranchFromEmptyBlock was checking that any common predecessors of the two blocks it is attempting to merge supply the same incoming values to any phi in the successor block. This change allows merging in the case where there is one or more incoming values that are undef. The undef values are rewritten to match the non-undef value that flows from the other edge. Patch by Mark Lacey. llvm-svn: 186069	2013-07-11 08:28:20 +00:00
Nadav Rotem	08efb262a9	Fix a warning. llvm-svn: 186064	2013-07-11 05:39:02 +00:00
Nadav Rotem	b8dd66f655	SLPVectorizer: refactor the code that places extracts. Place the code that decides where to put extracts in the build-tree phase. This allows us to take the cost of the extracts into account. llvm-svn: 186058	2013-07-11 04:54:05 +00:00
Michael Gottesman	b40db26eae	Teach TailRecursionElimination to handle certain cases of nocapture escaping allocas. Without the changes introduced into this patch, if TRE saw any allocas at all, TRE would not perform TRE or mark callsites with the tail marker. Because TRE runs after mem2reg, this inadequacy is not a death sentence. But given a callsite A without escaping alloca argument, A may not be able to have the tail marker placed on it due to a separate callsite B having a write-back parameter passed in via an argument with the nocapture attribute. Assume that B is the only other callsite besides A and B only has nocapture escaping alloca arguments (NOTE B may have other arguments that are not passed allocas). In this case not marking A with the tail marker is unnecessarily conservative since: 1. By assumption A has no escaping alloca arguments itself so it can not access the caller's stack via its arguments. 2. Since all of B's escaping alloca arguments are passed as parameters with the nocapture attribute, we know that B does not stash said escaping allocas in a manner that outlives B itself and thus could be accessed indirectly by A. With the changes introduced by this patch: 1. If we see any escaping allocas passed as a capturing argument, we do nothing and bail early. 2. If we do not see any escaping allocas passed as captured arguments but we do see escaping allocas passed as nocapture arguments: i. We do not perform TRE to avoid PR962 since the code generator produces significantly worse code for the dynamic allocas that would be created by the TRE algorithm. ii. If we do not return twice, mark call sites without escaping allocas with the tail marker. NOTE This excludes functions with escaping nocapture allocas. 3. If we do not see any escaping allocas at all (whether captured or not): i. If we do not have usage of setjmp, mark all callsites with the tail marker. ii. If there are no dynamic/variable sized allocas in the function, attempt to perform TRE on all callsites in the function. Based off of a patch by Nick Lewycky. rdar://14324281. llvm-svn: 186057	2013-07-11 04:40:01 +00:00
Michael Gottesman	6eb95dc2f7	[objc-arc] Changed 'mode: c++' => 'C++' at Nick Lewycky's suggestion. Also removed unnecessary mode: c++ lines from .cpp files. llvm-svn: 186026	2013-07-10 18:49:00 +00:00
Peter Collingbourne	49062a97cf	Implement categories for special case lists. A special case list can now specify categories for specific globals, which can be used to instruct an instrumentation pass to treat certain functions or global variables in a specific way, such as by omitting certain aspects of instrumentation while keeping others, or informing the instrumentation pass that a specific uninstrumentable function has certain semantics, thus allowing the pass to instrument callers according to those semantics. For example, AddressSanitizer now uses the "init" category instead of global-init prefixes for globals whose initializers should not be instrumented, but which in all other respects should be instrumented. The motivating use case is DataFlowSanitizer, which will have a number of different categories for uninstrumentable functions, such as "functional" which specifies that a function has pure functional semantics, or "discard" which indicates that a function's return value should not be labelled. Differential Revision: http://llvm-reviews.chandlerc.com/D1092 llvm-svn: 185978	2013-07-09 22:03:17 +00:00
Peter Collingbourne	2eb048d230	Introduce a SpecialCaseList ctor which takes a MemoryBuffer to make it more unit testable, and fix memory leak in the other ctor. Differential Revision: http://llvm-reviews.chandlerc.com/D1090 llvm-svn: 185976	2013-07-09 22:03:09 +00:00
Peter Collingbourne	015370e23a	Rename BlackList class to SpecialCaseList and move it to Transforms/Utils. Differential Revision: http://llvm-reviews.chandlerc.com/D1089 llvm-svn: 185975	2013-07-09 22:02:49 +00:00
Nadav Rotem	d7b574e5b3	Fix PR16571, which is a bug in the code that checks that all of the types in the bundle are uniform. llvm-svn: 185970	2013-07-09 21:38:08 +00:00
Nadav Rotem	861bef7dd0	Set the default insert point to the first instruction, and not to end() llvm-svn: 185953	2013-07-09 17:55:36 +00:00
David Majnemer	eeed73b981	InstCombine: Fix typo in comment for visitICmpInstWithInstAndIntCst llvm-svn: 185916	2013-07-09 09:24:35 +00:00
David Majnemer	72d76275ac	InstCombine: variations on 0xffffffff - x >= 4 The following transforms are valid if -C is a power of 2: (icmp ugt (xor X, C), ~C) -> (icmp ult X, C) (icmp ult (xor X, C), -C) -> (icmp uge X, C) These are nice, they get rid of the xor. llvm-svn: 185915	2013-07-09 09:20:58 +00:00
David Majnemer	414d4e58aa	InstCombine: X & -C != -C -> X <= u ~C Tests were added in r185910 somehow. llvm-svn: 185912	2013-07-09 08:09:32 +00:00
David Majnemer	bafa537eb7	Commit r185909 was a misapplied patch, fix it llvm-svn: 185910	2013-07-09 07:58:32 +00:00
David Majnemer	f2a9a513c7	InstCombine: add more transforms C1-X <u C2 -> (X|(C2-1)) == C1 C1-X >u C2 -> (X|C2) == C1 X-C1 <u C2 -> (X & -C2) == C1 X-C1 >u C2 -> (X & ~C2) == C1 llvm-svn: 185909	2013-07-09 07:50:59 +00:00
Eli Bendersky	07b0e451ca	Fix comment llvm-svn: 185888	2013-07-08 23:57:07 +00:00
Nadav Rotem	c9c57518ab	This patch changes the saved IRBuilder insert point from BasicBlock::iterator to AssertingVH. Commit 185883 fixes a bug in the IRBuilder that should fix the ASan bot. AssertingVH can help in exposing some RAUW problems. Thanks Ben and Alexey! llvm-svn: 185886	2013-07-08 23:31:13 +00:00
Michael Gottesman	c1b648f6c0	[objc-arc] Fix assertion in EraseInstruction so that noop on null calls when passed null do not trigger the assert. The specific case of interest is when objc_retainBlock is passed null. llvm-svn: 185885	2013-07-08 23:30:23 +00:00
David Majnemer	fa90a0b325	InstCombine: Fold X-C1 <u 2 -> (X & -2) == C1 Back in r179493 we determined that two transforms collided with each other. The fix back then was to reorder the transforms so that the preferred transform would give it a try and then we would try the secondary transform. However, it was noted that the best approach would canonicalize one transform into the other, removing the collision and allowing us to optimize IR given to us in that form. llvm-svn: 185808	2013-07-08 11:53:08 +00:00
Nadav Rotem	2ee35771a8	Clear the builder insert point between tree-vectorization phases. llvm-svn: 185777	2013-07-07 14:57:18 +00:00
Nadav Rotem	2041b742d4	SLPVectorizer: Implement DCE as part of vectorization. This is a complete re-write of the bottom-up vectorization class. Before this commit we scanned the instruction tree 3 times. First in search of merge points for the trees. Second, for estimating the cost. And finally for vectorization. There was a lot of code duplication and adding the DCE exposed bugs. The new design is simpler and DCE was a part of the design. In this implementation we build the tree once. After that we estimate the cost by scanning the different entries in the constructed tree (in any order). The vectorization phase also works on the built tree. llvm-svn: 185774	2013-07-07 06:57:07 +00:00
Michael Gottesman	618df456e2	[objc-arc] Remove the alias analysis part of r185764. Upon further reflection, the alias analysis part of r185764 is not a safe change. llvm-svn: 185770	2013-07-07 04:18:03 +00:00
Michael Gottesman	a72630d453	[objc-arc] Teach the ARC optimizer that objc_sync_enter/objc_sync_exit do not modify the ref count of an objc object and additionally are inert for modref purposes. llvm-svn: 185769	2013-07-07 01:52:55 +00:00
Michael Gottesman	e557da26db	[objc-arc] When we initialize ARCRuntimeEntryPoints, make sure we reset all references to entrypoint declarations as well. llvm-svn: 185764	2013-07-06 18:43:05 +00:00
Benjamin Kramer	3d90a8f4f9	Reassociate: Remove unnecessary default operator=. llvm-svn: 185757	2013-07-06 15:10:13 +00:00
Michael Gottesman	4d9439c73f	[objc-arc] Performed some small cleanups in ARCRuntimeEntryPoints and added an llvm_unreachable after the switch to quiet -Wreturn-type errors. llvm-svn: 185746	2013-07-06 02:18:56 +00:00
Michael Gottesman	574d521c85	[objc-arc] Renamed Module => TheModule in ARCRuntimeEntryPoints. Also did some small cleanups. This fixes an issue that came up due to -fpermissive on the bots. llvm-svn: 185744	2013-07-06 01:57:32 +00:00
Michael Gottesman	01df45056e	Removed trailing whitespace. llvm-svn: 185743	2013-07-06 01:41:35 +00:00
Michael Gottesman	0b912b2673	[objc-arc] Updated ObjCARCContract to use ARCRuntimeEntryPoints. llvm-svn: 185742	2013-07-06 01:39:26 +00:00
Michael Gottesman	14acfacc48	[objc-arc] Updated ObjCARCOpts to use ARCRuntimeEntryPoints. llvm-svn: 185741	2013-07-06 01:39:23 +00:00
Michael Gottesman	a94186a4c0	[objc-arc] Refactor runtime entrypoint declaration creation. This is the first patch in a series of 3 patches which clean up how we create runtime function declarations in the ARC optimizer when they do not exist already in the IR. Currently we have a bunch of duplicated code in ObjCARCOpts, ObjCARCContract that does this. This patch refactors that code into a separate class called ARCRuntimeEntryPoints which lazily creates the declarations for said entrypoints. The next two patches will consist of the work of refactoring ObjCARCContract/ObjCARCOpts to use this new code. llvm-svn: 185740	2013-07-06 01:39:18 +00:00
Nick Lewycky	cff2cf8e3a	Fix annotation of unlink. Should fix builder. llvm-svn: 185738	2013-07-06 00:59:28 +00:00
Nick Lewycky	c2ec0725ce	Extend 'readonly' and 'readnone' to work on function arguments as well as functions. Make the function attributes pass add it to known library functions and when it can deduce it. llvm-svn: 185735	2013-07-06 00:29:58 +00:00
Rafael Espindola	155cf0f3a6	Use sys::fs::createTemporaryFile. llvm-svn: 185719	2013-07-05 20:14:52 +00:00
Sylvestre Ledru	751447a3ac	Remove a useless declaration (found by scan-build) llvm-svn: 185709	2013-07-05 15:58:12 +00:00
David Majnemer	c2a990bc00	InstCombine: (icmp eq B, 0) | (icmp ult A, B) -> (icmp ule A, B-1) This transform allows us to turn IR that looks like: %1 = icmp eq i64 %b, 0 %2 = icmp ult i64 %a, %b %3 = or i1 %1, %2 ret i1 %3 into: %0 = add i64 %b, -1 %1 = icmp uge i64 %0, %a ret i1 %1 which means we go from lowering: cmpq %rsi, %rdi setb %cl testq %rsi, %rsi sete %al orb %cl, %al ret to lowering: decq %rsi cmpq %rdi, %rsi setae %al ret llvm-svn: 185677	2013-07-05 00:31:17 +00:00
David Majnemer	37f8f445de	InstCombine: Reimplementation of visitUDivOperand This transform was originally added in r185257 but later removed in r185415. The original transform would create instructions speculatively and then discard them if the speculation was proved incorrect. This has been replaced with a scheme that splits the transform into two parts: preflight and fold. While we preflight, we build up fold actions that inform the folding stage on how to act. llvm-svn: 185667	2013-07-04 21:17:49 +00:00
Benjamin Kramer	371722288c	SimplifyCFG: Teach switch generation some patterns that instcombine forms. This allows us to create switches even if instcombine has munged two of the incoming compares into one and some bit twiddling. This was motivated by enum compares that are common in clang. llvm-svn: 185632	2013-07-04 14:22:02 +00:00
Nick Lewycky	a833c667e2	Tabs to spaces. No functionality change. llvm-svn: 185612	2013-07-04 03:51:53 +00:00
Craig Topper	af0dea1347	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185606	2013-07-04 01:31:24 +00:00
Craig Topper	31ee5866de	Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size. llvm-svn: 185540	2013-07-03 15:07:05 +00:00
Evgeniy Stepanov	dc6d7eb860	[msan] Unpoison stack allocations and undef values in blacklisted functions. This changes behavior of -msan-poison-stack=0 flag from not poisoning stack allocations to actively unpoisoning them. llvm-svn: 185538	2013-07-03 14:39:14 +00:00
Michael Gottesman	2db11161a8	Added support in FunctionAttrs for adding relevant function/argument attributes for the posix call gettimeofday. This implies annotating it as nounwind and its arguments as nocapture. To be conservative, we do not annotate the arguments with noalias since some platforms do not have restrict on the declaration for gettimeofday. llvm-svn: 185502	2013-07-03 04:00:54 +00:00
Manman Ren	d0e67aa1ce	Debug Info: cleanup llvm-svn: 185456	2013-07-02 18:37:35 +00:00
Hal Finkel	fdbe161b1a	Revert r185257 (InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms) I'm reverting this commit because: 1. As discussed during review, it needs to be rewritten (to avoid creating and then deleting instructions). 2. This is causing optimizer crashes. Specifically, I'm seeing things like this: While deleting: i1 % Use still stuck around after Def is destroyed: <badref> = select i1 <badref>, i32 0, i32 1 opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed. I'd guess that these will go away once we're no longer creating/deleting instructions here, but just in case, I'm adding a regression test. Because the code is being rewritten, I've just XFAIL'd the original regression test. Original commit message: InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symmetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185415	2013-07-02 05:21:11 +00:00
Nick Lewycky	26fcc51f15	Add missing break statements. Noticed by inspection. llvm-svn: 185414	2013-07-02 05:02:56 +00:00
Manman Ren	74c188f026	Debug Info: clean up usage of Verify. No functionality change. It should suffice to check the type of a debug info metadata, instead of calling Verify. llvm-svn: 185383	2013-07-01 21:02:01 +00:00
Arnold Schwaighofer	ef51cf202b	LoopVectorize: Math functions only read rounding mode Math functions are marked as readonly because they read the floating point rounding mode. Because we don't vectorize loops that would contain function calls that set the rounding mode it is safe to ignore this memory read. llvm-svn: 185299	2013-07-01 00:54:44 +00:00
Stephen Lin	2e551adcd9	DeadArgumentElimination: keep return value on functions that have a live argument with the 'returned' attribute (rather than generate invalid IR); however, if both can be eliminated, both will be llvm-svn: 185290	2013-06-30 20:26:21 +00:00
Benjamin Kramer	4093f29366	InstCombine: Also turn selects fed by an and into arithmetic when the types don't match. Inserting a zext or trunc is sufficient. This pattern is somewhat common in LLVM's pointer mangling code. llvm-svn: 185270	2013-06-29 21:17:04 +00:00
Benjamin Kramer	4ab72f9b9a	LoopVectorizer: Pack MemAccessInfo pairs. llvm-svn: 185263	2013-06-29 17:52:08 +00:00
Benjamin Kramer	53545693d7	Move helper classes into anonymous namespaces. llvm-svn: 185262	2013-06-29 17:02:06 +00:00
David Majnemer	5953d3712a	InstCombine: FoldGEPICmp shouldn't change sign of base pointer comparison Changing the sign when comparing the base pointer would introduce all sorts of unexpected things like: %gep.i = getelementptr inbounds [1 x i8]* %a, i32 0, i32 0 %gep2.i = getelementptr inbounds [1 x i8]* %b, i32 0, i32 0 %cmp.i = icmp ult i8* %gep.i, %gep2.i %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = icmp ne i1 %cmp.i, %cmp.i1 ret i1 %cmp into: %cmp.i = icmp slt [1 x i8]* %a, %b %cmp.i1 = icmp ult [1 x i8]* %a, %b %cmp = xor i1 %cmp.i, %cmp.i1 ret i1 %cmp By preserving the original sign, we now get: ret i1 false This fixes PR16483. llvm-svn: 185259	2013-06-29 10:28:04 +00:00
David Majnemer	92a8a7d45a	InstCombine: Small whitespace cleanup in FoldGEPICmp llvm-svn: 185258	2013-06-29 09:45:35 +00:00
David Majnemer	797227eea6	InstCombine: Be more aggressive optimizing 'udiv' instrs with 'select' denoms Real world code sometimes has the denominator of a 'udiv' be a 'select'. LLVM can handle such cases but only when the 'select' operands are symmetric in structure (both select operands are a constant power of two or a left shift, etc.). This falls apart if we are dealt a 'udiv' where the code is not symmetric or if the select operands lead us to more select instructions. Instead, we should treat the LHS and each select operand as a distinct divide operation and try to optimize them independently. If we can simplify each operation, then we can replace the 'udiv' with, say, a 'lshr' that has a new select with a bunch of new operands for the select. llvm-svn: 185257	2013-06-29 08:40:07 +00:00
Nadav Rotem	0a25727f31	We preserve the CFG and some of the analysis passes. llvm-svn: 185251	2013-06-29 05:38:15 +00:00
Nadav Rotem	e00343446c	Update docs. llvm-svn: 185250	2013-06-29 05:37:19 +00:00
David Majnemer	b889e405eb	InstCombine: Optimize (1 << X) Pred CstP2 to X Pred Log2(CstP2) We may, after other optimizations, find ourselves with IR that looks like: %shl = shl i32 1, %y %cmp = icmp ult i32 %shl, 32 Instead, we should just compare the shift count: %cmp = icmp ult i32 %y, 5 llvm-svn: 185242	2013-06-28 23:42:03 +00:00
Nadav Rotem	060be733a5	SLP Vectorizer: Add support for trees with external users. To support this we have to insert 'extractelement' instructions to pick the right lane. We had this functionality before but I removed it when we moved to the multi-block design because it was too complicated. llvm-svn: 185230	2013-06-28 22:07:09 +00:00
Nadav Rotem	9ce3fedcdd	LoopVectorizer: Refactor the code that checks if it is safe to predicate blocks. In this code we keep track of pointers that we are allowed to read from, if they are accessed by non-predicated blocks. We use this list to allow vectorization of conditional loads in predicated blocks because we know that these addresses don't segfault. llvm-svn: 185214	2013-06-28 20:46:27 +00:00
Daniel Malea	b17b1cd6f5	Remove needless include (unistd.h) in DebugIR pass - should unbreak Windows builds llvm-svn: 185198	2013-06-28 19:19:44 +00:00
Daniel Malea	0673464a92	Add missing header for DebugIR - missed svn add... llvm-svn: 185194	2013-06-28 19:07:59 +00:00
Daniel Malea	31321fa53d	Remove limitation on DebugIR that made it require existing debug metadata. - Build debug metadata for 'bare' Modules using DIBuilder - DebugIR can be constructed to generate an IR file (to be seen by a debugger) or not in cases where the user already has an IR file on disk. llvm-svn: 185193	2013-06-28 19:05:23 +00:00
Arnold Schwaighofer	ce2c766f61	LoopVectorize: Pull dyn_cast into setDebugLocFromInst llvm-svn: 185168	2013-06-28 17:14:48 +00:00
Arnold Schwaighofer	3b27b992ca	LoopVectorize: Use static function instead of DebugLocSetter class I used the class to safely reset the state of the builder's debug location. I think I have caught all places where we need to set the debug location to a new one. Therefore, we can replace the class by a function that just sets the debug location. llvm-svn: 185165	2013-06-28 16:26:54 +00:00
Manman Ren	983a16c08a	Debug Info: clean up usage of Verify. No functionality change. It should suffice to check the type of a debug info metadata, instead of calling Verify. For cases where we know the type of a DI metadata, use assert. Also update testing cases to make them conform to the format of DI classes. llvm-svn: 185135	2013-06-28 05:43:10 +00:00
Arnold Schwaighofer	12ecb331af	LoopVectorize: Preserve debug location info radar://14169017 llvm-svn: 185122	2013-06-28 00:38:54 +00:00
Matt Arsenault	5d2e85f6d7	Fix using arg_end() - arg_begin() instead of arg_size() llvm-svn: 185121	2013-06-28 00:25:40 +00:00
Michael Gottesman	79b0967548	Revert "Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float."" This reverts commit r185099. Looks like both the ppc-64 and mips bots are still failing after I reverted this change. Since: 1. The mips bot always performs a clean build, 2. The ppc64-bot failed again after a clean build (I asked the ppc-64 maintainers to clean the bot which they did... Thanks Will!), I think it is safe to assume that this change was not the cause of the failures that said builders were seeing. Thus I am recommitting. llvm-svn: 185111	2013-06-27 21:58:19 +00:00
Michael Gottesman	ccaf3321f1	Revert "[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float." This reverts commit r185095. This is causing a FileCheck failure on the 3dnow intrinsics on at least the mips/ppc bots but not on the x86 bots. Reverting while I figure out what is going on. llvm-svn: 185099	2013-06-27 20:40:11 +00:00
Arnold Schwaighofer	38de7cd464	LoopVectorize: Cache edge masks created during if-conversion Otherwise, we end up with an exponential IR blowup. Fixes PR16472. llvm-svn: 185097	2013-06-27 20:31:06 +00:00
Michael Gottesman	03255a1675	[APFloat] Removed APFloat constructor which initialized to either zero/NaN but allowed you to arbitrarily set the category of the float. The category which an APFloat belongs to should be dependent on the actual value that the APFloat has, not be arbitrarily passed in by the user. This will prevent inconsistency bugs where the category and the actual value in APFloat differ. I also fixed up all of the references to this constructor (which were only in LLVM). llvm-svn: 185095	2013-06-27 19:50:52 +00:00
Arnold Schwaighofer	a2dd195fb3	LoopVectorize: Use vectorized loop invariant gep index anchored in loop Use vectorized instruction instead of original instruction anchored in the original loop. Fixes PR16452 and t2075.c of PR16455. llvm-svn: 185081	2013-06-27 15:11:55 +00:00
Arnold Schwaighofer	ccd6c9929b	LoopVectorize: Don't store a reversed value in the vectorized value map When we store values for reversed induction stores we must not store the reversed value in the vectorized value map. Another instruction might use this value. This fixes 3 test cases of PR16455. llvm-svn: 185051	2013-06-27 00:45:41 +00:00
Michael Gottesman	41748d7c86	Added support for the Builtin attribute. The Builtin attribute is an attribute that can be placed on function call site that signal that even though a function is declared as being a builtin, rdar://problem/13727199 llvm-svn: 185049	2013-06-27 00:25:01 +00:00
Nadav Rotem	8edefb3665	No need to use a Set when a vector would do. llvm-svn: 185047	2013-06-27 00:14:13 +00:00
Nadav Rotem	93f880fb77	SLP: When searching for vectorization opportunities scan the blocks in post-order because we grow chains upwards. llvm-svn: 185041	2013-06-26 23:44:45 +00:00
Nadav Rotem	7f0d6d7975	SLP: Don't erase instructions during vectorization because it prevents the outer loops from iterating over the instructions. llvm-svn: 185040	2013-06-26 23:43:23 +00:00
Michael Gottesman	c2af8d6273	In InstCombine{AddSub,MulDivRem} convert APFloat.isFiniteNonZero() && !APFloat.isDenormal => APFloat.isNormal. llvm-svn: 185037	2013-06-26 23:17:31 +00:00
Eric Christopher	b8c608ea39	Revert "Debug Info: clean up usage of Verify." as it's breaking bots. This reverts commit r185020 llvm-svn: 185032	2013-06-26 22:44:57 +00:00
Manman Ren	aa00ce0e8f	Debug Info: clean up usage of Verify. No functionality change. It should suffice to check the type of a debug info metadata, instead of calling Verify. llvm-svn: 185020	2013-06-26 21:26:10 +00:00
Nadav Rotem	4c5b2d1de6	Erase all of the instructions that we RAUWed llvm-svn: 184969	2013-06-26 17:16:09 +00:00
Nadav Rotem	f4ca3994b8	Do not add cse-ed instructions into the visited map because we don't want to consider them as candidates for replacement of instructions to be visited. llvm-svn: 184966	2013-06-26 16:54:53 +00:00
Kostya Serebryany	5e276f9dbc	[asan] workaround for PR16277: don't instrument AllocaInstr with alignment more than the redzone size llvm-svn: 184928	2013-06-26 09:49:52 +00:00
Kostya Serebryany	9f5213f20f	[asan] add option -asan-keep-uninstrumented-functions llvm-svn: 184927	2013-06-26 09:18:17 +00:00
Nick Lewycky	5cd9538b90	dbgs() << Instruction doesn't print a newline on the end any more. Update these debug statements to add a missing newline. Also canonicalize to '\n' instead of "\n"; the latter calls a function with a loop the former does not. llvm-svn: 184897	2013-06-26 00:30:18 +00:00
Nadav Rotem	0794acc1da	SLPVectorizer: support slp-vectorization of PHINodes between basic blocks llvm-svn: 184888	2013-06-25 23:04:09 +00:00
Bob Wilson	acfc01dedf	Fix SROA to avoid unnecessary scalar conversions for 1-element vectors. When a 1-element vector alloca is promoted, a store instruction can often be rewritten without converting the value to a scalar and using an insertelement instruction to stuff it into the new alloca. This patch just adds a check to skip that conversion when it is unnecessary. This turns out to be really important for some ARM Neon operations where <1 x i64> is used to get around the fact that i64 is not a legal type. llvm-svn: 184870	2013-06-25 19:09:50 +00:00
Nadav Rotem	3de032a3b6	Fix a typo in the code that collected the costs recursively. llvm-svn: 184827	2013-06-25 05:30:56 +00:00
Nadav Rotem	9c7c997a7e	Rename the variable to fix a warning. Thanks Andy Gibbs. llvm-svn: 184749	2013-06-24 15:59:47 +00:00
Arnold Schwaighofer	b252c11ccc	Reapply 184685 after the SetVector iteration order fix. This should hopefully have fixed the stage2/stage3 miscompare on the dragonegg testers. "LoopVectorize: Use the dependence test utility class We now no longer need alias analysis - the cases that alias analysis would handle are now handled as accesses with a large dependence distance. We can now vectorize loops with simple constant dependence distances. for (i = 8; i < 256; ++i) { a[i] = a[i+4] * a[i+8]; } for (i = 8; i < 256; ++i) { a[i] = a[i-4] * a[i-8]; } We would be able to vectorize about 200 more loops (in many cases the cost model instructs us not to) in the test suite now. Results on x86-64 are a wash. I have seen one degradation in ammp. Interestingly, the function in which we now vectorize a loop is never executed so we probably see some instruction cache effects. There is a 2% improvement in h264ref. There is one or the other TSVC loop kernel that speeds up. radar://13681598" llvm-svn: 184724	2013-06-24 12:09:15 +00:00
Arnold Schwaighofer	91472fa4fc	LoopVectorize: Use SetVector for the access set We are creating the runtime checks using this set so we need a deterministic iteration order. llvm-svn: 184723	2013-06-24 12:09:12 +00:00
Chandler Carruth	08e1b8742b	Add a flag to defer vectorization into a phase after the inliner and its CGSCC pass manager. This should insulate the inlining decisions from the vectorization decisions, however it may have both compile time and code size problems so it is just an experimental option right now. Adding this based on a discussion with Arnold and it seems at least worth having this flag for us to both run some experiments to see if this strategy is workable. It may solve some of the regressions seen with the loop vectorizer. llvm-svn: 184698	2013-06-24 07:21:47 +00:00
Arnold Schwaighofer	58ca945f38	Revert "LoopVectorize: Use the dependence test utility class" This reverts commit cbfa1ca993363ca5c4dbf6c913abc957c584cbac. We are seeing a stage2 and stage3 miscompare on some dragonegg bots. llvm-svn: 184690	2013-06-24 06:10:41 +00:00
Arnold Schwaighofer	b914a7e2ef	LoopVectorize: Use the dependence test utility class We now no longer need alias analysis - the cases that alias analysis would handle are now handled as accesses with a large dependence distance. We can now vectorize loops with simple constant dependence distances. for (i = 8; i < 256; ++i) { a[i] = a[i+4] * a[i+8]; } for (i = 8; i < 256; ++i) { a[i] = a[i-4] * a[i-8]; } We would be able to vectorize about 200 more loops (in many cases the cost model instructs us not to) in the test suite now. Results on x86-64 are a wash. I have seen one degradation in ammp. Interestingly, the function in which we now vectorize a loop is never executed so we probably see some instruction cache effects. There is a 2% improvement in h264ref. There is one or the other TSVC loop kernel that speeds up. radar://13681598 llvm-svn: 184685	2013-06-24 03:55:48 +00:00
Arnold Schwaighofer	d517976758	LoopVectorize: Add utility class for checking dependency among accesses This class checks dependences by subtracting two Scalar Evolution access functions allowing us to catch very simple linear dependences. The checker assumes source order in determining whether vectorization is safe. We currently don't reorder accesses. Positive true dependencies need to be a multiple of VF otherwise we impede store-load forwarding. llvm-svn: 184684	2013-06-24 03:55:45 +00:00
Arnold Schwaighofer	d57419696d	LoopVectorize: Add utility class for building sets of dependent accesses Sets of dependent accesses are built by unioning sets based on underlying objects. This class will be used by the upcoming dependence checker. llvm-svn: 184683	2013-06-24 03:55:44 +00:00
Nadav Rotem	210e86d7c4	SLP Vectorizer: Add support for vectorizing parts of the tree. Until now we detected the vectorizable tree and evaluated the cost of the entire tree. With this patch we can decide to trim-out branches of the tree that are not profitable to vectorize. Also, increase the max depth from 6 to 12. In the worst possible case where all of the code is made of diamond-shaped graphs this can bring the cost to 2**10, but diamonds are not very common. llvm-svn: 184681	2013-06-24 02:52:43 +00:00
Nadav Rotem	0323925d51	SLP Vectorizer: Fix a bug in the code that does CSE on the generated gather sequences. Make sure that we don't replace and RAUW two sequences if one does not dominate the other. llvm-svn: 184674	2013-06-23 21:57:27 +00:00
Nadav Rotem	78428401e9	SLP Vectorizer: Erase instructions outside the vectorizeTree method. The RAII builder location guard is saving a reference to instructions, so we can't erase instructions during vectorization. llvm-svn: 184671	2013-06-23 19:38:56 +00:00
Nadav Rotem	eb65e67eea	SLP Vectorizer: Implement a simple CSE optimization for the gather sequences. llvm-svn: 184660	2013-06-23 06:15:46 +00:00
Nadav Rotem	80de0a28f1	SLP Vectorizer: Implement multi-block slp-vectorization. Rewrote the SLP-vectorization as a whole-function vectorization pass. It is now able to vectorize chains across multiple basic blocks. It still does not vectorize PHIs, but this should be easy to do now that we scan the entire function. I removed the support for extracting values from trees. We are now able to vectorize more programs, but there are some serious regressions in many workloads (such as flops-6 and mandel-2). llvm-svn: 184647	2013-06-22 21:34:10 +00:00
Benjamin Kramer	40d7f354b5	Revert "FunctionAttrs: Merge attributes once instead of doing it for every argument." It doesn't work as I intended it to. This reverts commit r184638. llvm-svn: 184641	2013-06-22 16:56:32 +00:00
Benjamin Kramer	76b7bd0e75	FunctionAttrs: Merge attributes once instead of doing it for every argument. It has become an expensive operation. No functionality change. llvm-svn: 184638	2013-06-22 15:51:19 +00:00
Michael Gottesman	9799cf7fb3	[objc-arc-opts] Make IsTrackingImpreciseReleases a const method. Thanks to Bill Wendling for pointing this out! llvm-svn: 184593	2013-06-21 20:52:49 +00:00
Michael Gottesman	e3943d0554	[objc-arc-opts] Now that PtrState.RRI is encapsulated in PtrState, make PtrState.RRI private and delete the TODO. llvm-svn: 184587	2013-06-21 19:44:30 +00:00
Michael Gottesman	4f6ef11763	[objc-arc-opts] Encapsulated PtrState.RRI.{Calls,ReverseInsertPts} into several methods on PtrState. llvm-svn: 184586	2013-06-21 19:44:27 +00:00
Michael Gottesman	f040118167	[objcarcopts] Encapsulated PtrState.RRI.IsTrackingImpreciseRelease() => PtrState.IsTrackingImpreciseRelease(). llvm-svn: 184583	2013-06-21 19:12:38 +00:00
Michael Gottesman	2f2945973a	[objcarcopts] Encapsulate PtrState.RRI.CFGHazardAfflicted via methods PtrState.{IsCFGHazardAfflicted,SetCFGHazardAfflicted}. llvm-svn: 184582	2013-06-21 19:12:36 +00:00
Michael Gottesman	f701d3f864	[objcarcopts] Encapsulate PtrState.RRI.ReleaseMetadata into the methods PtrState.GetReleaseMetadata() and PtrState.SetReleaseMetadata(). llvm-svn: 184534	2013-06-21 07:03:07 +00:00
Michael Gottesman	b82a179606	[objcarcopts] Encapsulate PtrState.RRI.IsTailCallRelease into the method PtrState.IsTailCallRelease() and PtrState.SetTailCallRelease(). llvm-svn: 184533	2013-06-21 07:00:44 +00:00
Michael Gottesman	9313225e72	[objcarcopts] Encapsulate PtrState.RRI.KnownSafe in the methods PtrState.IsKnownSafe and PtrState.SetKnownSafe. This is a part of a series of patches to encapsulate PtrState.RRI and make PtrState.RRI a private field of PtrState. NOTE This is actually the second commit in the patch stream. I should have put this note on the first such commit r184528. llvm-svn: 184532	2013-06-21 06:59:02 +00:00
Michael Gottesman	b7deb4cd79	[objcarcopts] Some more minor code cleanups/comment additions. llvm-svn: 184531	2013-06-21 06:54:31 +00:00
Michael Gottesman	4773a10cfb	[objcarcopts] Refactor out the RRInfo merging code from PtrState into RRInfo::Merge. I also added some comments and performed minor code cleanups. llvm-svn: 184528	2013-06-21 05:42:08 +00:00
Nadav Rotem	e1713e5fcf	SLP Vectorizer: do not search for store-chains that are wider than the vector-register size. llvm-svn: 184527	2013-06-21 04:18:13 +00:00
Meador Inge	dfb08a2cb8	Remove the simplify-libcalls pass (finally) This commit completely removes what is left of the simplify-libcalls pass. All of the functionality has now been migrated to the instcombine and functionattrs passes. The following C API functions are now NOPs: 1. LLVMAddSimplifyLibCallsPass 2. LLVMPassManagerBuilderSetDisableSimplifyLibCalls llvm-svn: 184459	2013-06-20 19:48:07 +00:00
Nadav Rotem	b488beefeb	Clang-format the SLP vectorizer. No functionality change. llvm-svn: 184446	2013-06-20 17:54:36 +00:00
Nadav Rotem	14a89c5428	SLPVectorization: Add basic support for cross-basic block slp vectorization. We collect gather sequences when we vectorize basic blocks. Gather sequences are excellent hints for vectorization of other basic blocks. llvm-svn: 184444	2013-06-20 17:41:45 +00:00
Nadav Rotem	c41028a013	Change the debug type to match the debug type that is used by vecutils.cpp. This change makes it easier to filter debug messages. llvm-svn: 184440	2013-06-20 16:38:05 +00:00
Michael Gottesman	3cb77ab98a	[APFloat] Converted all references to APFloat::isNormal => APFloat::isFiniteNonZero. Turns out all the references were in llvm and not in clang. llvm-svn: 184356	2013-06-19 21:23:18 +00:00
Bill Wendling	7a639ea2a4	Access the TargetLoweringInfo from the TargetMachine object instead of caching it. The TLI may change between functions. No functionality change. llvm-svn: 184352	2013-06-19 21:07:11 +00:00
Matt Arsenault	d46fce1141	Move StructurizeCFG out of R600 to generic Transforms. Register it with PassManager llvm-svn: 184343	2013-06-19 20:18:24 +00:00
Quentin Colombet	145eb97d3a	LSR: Fix the parameters used to compute the scaling factor cost. Prior to this change, the considered addressing modes may be invalid since the maximum and minimum offsets were not taken into account. This was causing an assertion failure. The added test case exercises that behavior. <rdar://problem/14199725> Assertion failed: (CurScaleCost >= 0 && "Legal addressing mode has an illegal cost!") llvm-svn: 184341	2013-06-19 19:59:41 +00:00
Nadav Rotem	1e9668ea81	SLPVectorizer: handle scalars that are extracted from vectors (using ExtractElementInst). llvm-svn: 184325	2013-06-19 17:33:16 +00:00
Nadav Rotem	86e848c849	SLPVectorizer: start constructing chains at stores that are not power of two. The type <3 x i8> is common in graphics and we want to be able to vectorize it. This change accelerates bullet by 12% and 471_omnetpp by 5%. llvm-svn: 184317	2013-06-19 15:57:29 +00:00
Nadav Rotem	e98da7f548	SLPVectorizer: vectorize compares and selects. llvm-svn: 184282	2013-06-19 05:49:52 +00:00
Nadav Rotem	4f3224f3ed	Document the return value and fix a typo. llvm-svn: 184281	2013-06-19 05:47:33 +00:00
Nadav Rotem	1f96427da0	Scan the successor blocks and use the PHI nodes as a hint for possible chain roots. llvm-svn: 184201	2013-06-18 15:58:05 +00:00
Nadav Rotem	3349feac4e	Add a return value to make this function more useful. llvm-svn: 184200	2013-06-18 15:57:12 +00:00
Nick Lewycky	0fdd01965e	Fix nondeterminism in .gcno file generation. llvm-svn: 184174	2013-06-18 06:38:21 +00:00
Pekka Jaaskelainen	eb90fd1c3b	Fix for a regression caused by the LoopVectorizer when vectorizing loops with memory accesses to non-zero address spaces. It simply dropped the AS info. Fixes PR16306. llvm-svn: 184103	2013-06-17 18:49:06 +00:00
Nadav Rotem	cde24ef389	Disable vectorization for -Oz. llvm-svn: 184089	2013-06-17 17:22:40 +00:00
Nadav Rotem	7dd8210b71	Enable the loop vectorizer by default for -Os and -O2. llvm-svn: 184084	2013-06-17 16:23:34 +00:00
Jakub Staszak	4898e62ac0	Use 0 instead of NULL. llvm-svn: 184044	2013-06-15 12:20:44 +00:00
Benjamin Kramer	9ddfaf2be6	PruneEH: Only merge attribute sets when used. No functionality change. llvm-svn: 184041	2013-06-15 10:55:39 +00:00
Derek Schuff	ec9dc01b33	Fix DeleteDeadVarargs not to crash on functions referenced by BlockAddresses This pass was assuming that if hasAddressTaken() returns false for a function, the function's only uses are call sites. That's not true because there can be references by BlockAddresses too. Fix the pass to handle this case. Fix BlockAddress::replaceUsesOfWithOnConstant() to allow a function's type to be changed by RAUW'ing the function with a bitcast of the recreated function. Patch by Mark Seaborn. llvm-svn: 183933	2013-06-13 19:51:17 +00:00
Rafael Espindola	8d30480344	Always remove an alias when we rename the target. Should fix the dragonegg build bots. llvm-svn: 183845	2013-06-12 16:45:47 +00:00
Rafael Espindola	3bc8e71909	Move PathV2.h to Path.h Most clients have already been moved from Path V1 to V2. The ones using V1 now include PathV1.h explicitly. llvm-svn: 183801	2013-06-11 22:21:28 +00:00
Rafael Espindola	a82555c0f8	Change how globalopt handles aliases in llvm.used. Instead of a custom implementation of replaceAllUsesWith, we just call replaceAllUsesWith and recreate llvm.used and llvm.compiler-used. This change is particularly interesting because it makes llvm see through what clang is doing with static used functions in extern "C" contexts. With this change, running clang -O2 in extern "C" { __attribute__((used)) static void foo() {} } produces @llvm.used = appending global [1 x i8] [i8 bitcast (void ()* @foo to i8*)], section "llvm.metadata" define internal void @foo() #0 { entry: ret void } llvm-svn: 183756	2013-06-11 17:48:06 +00:00
Tim Northover	64280fbba1	Make DeadArgumentElimination more conservative on variadic functions Variadic functions are particularly fragile in the face of ABI changes, so this limits how much the pass changes them llvm-svn: 183625	2013-06-09 02:17:27 +00:00
Shuxin Yang	140d592d84	Fix a potential bug in r183584. r183584 tries to derive some info from the code AFTER a call and apply these derived info to the code BEFORE the call, which is not always safe as the call in question may never return, and in this case, the derived info is invalid. Thank Duncan for pointing out this potential bug. rdar://14073661 llvm-svn: 183606	2013-06-08 04:56:05 +00:00
Shuxin Yang	bd254f2601	Fix an assertion in MemCpyOpt pass. The MemCpyOpt pass is capable of optimizing: callee(&S); copy N bytes from S to D. into: callee(&D); subject to some legality constraints. Assertion is triggered when the compiler tries to evaluate "sizeof(typeof(D))", while D is an opaque-typed, 'sret' formal argument of function being compiled. i.e. the signature of the func being compiled is something like this: T caller(...,%opaque* noalias nocapture sret %D, ...) The fix is that when we come across such a situation, instead of calling some utility functions to get the size of D's type (which will crash), we simply assume D has at least N bytes as implied by the copy-instruction. rdar://14073661 llvm-svn: 183584	2013-06-07 22:45:21 +00:00
Michael Gottesman	9e7261c874	[objc-arc] Ensure that the cfg path count does not overflow when we multiply TopDownPathCount/BottomUpPathCount. rdar://12480535 llvm-svn: 183489	2013-06-07 06:16:49 +00:00
Jakub Staszak	96ff4d6d3b	Simplify code. No functionality change. llvm-svn: 183461	2013-06-06 23:34:59 +00:00
Nadav Rotem	99e529ea3c	Jeffrey Yasskin volunteered to benchmark the vectorizer on -O2 or -Os when compiling chrome. This patch adds a new flag to enable vectorization on all levels and not only on -O3. It should go away once we make a decision. llvm-svn: 183456	2013-06-06 22:35:47 +00:00
Jakub Staszak	bddea11bc5	Re-apply "Use IRBuilder instead of ConstantInt methods." with the fixed issues. llvm-svn: 183439	2013-06-06 20:18:46 +00:00
Rafael Espindola	a7bbc0b740	Revert "Use IRBuilder instead of ConstantInt methods. It simplifies code a little bit." This reverts commit 183328. It caused pr16244 and broke the bots. llvm-svn: 183422	2013-06-06 17:03:05 +00:00
Jakub Staszak	9de494e0ee	Remove unneeded cast<>. llvm-svn: 183363	2013-06-06 00:49:57 +00:00
Jakub Staszak	461d1fe6fc	Use IRBuilder instead of ConstantInt methods. llvm-svn: 183360	2013-06-06 00:37:23 +00:00
Jakub Staszak	2f390b755a	Use IRBuilder instead of ConstantInt methods. It simplifies code a little bit. llvm-svn: 183328	2013-06-05 18:27:02 +00:00
David Majnemer	29130c5e8d	IndVarSimplify: check if loop invariant expansion can trap IndVarSimplify is willing to move divide instructions outside of their loop bodies if they are invariant of the loop. However, it may not be safe to expand them if we do not know if they can trap. Instead, check to see if it is not safe to expand the instruction and skip the expansion. This fixes PR16041. Testcase by Rafael Ávila de Espíndola. llvm-svn: 183239	2013-06-04 17:51:58 +00:00
Rafael Espindola	a5e536ab0e	Second part of pr16069 The problem this time seems to be a thinko. We were assuming that in the CFG A \| \ \| B \| / C speculating the basic block B would cause only the phi value for the B->C edge to be speculated. That is not true, the phi's are semantically in the edges, so if the A->B->C path is taken, any code needed for A->C is not executed and we have to consider it too when deciding to speculate B. llvm-svn: 183226	2013-06-04 14:11:59 +00:00
Hans Wennborg	5cf30be6e4	Typo: s/caes/cases/ in SimplifyCFG llvm-svn: 183219	2013-06-04 11:22:30 +00:00
Nick Lewycky	688d668e5c	Delete dead safety check. llvm-svn: 183167	2013-06-03 23:15:20 +00:00
David Majnemer	c82f27af2a	SimplifyCFG: Do not transform PHI to select if doing so would be unsafe PR16069 is an interesting case where an incoming value to a PHI is a trap value while also being a 'ConstantExpr'. We do not consider this case when performing the 'HoistThenElseCodeToIf' optimization. Instead, make our modifications more conservative if we detect that we cannot transform the PHI to a select. llvm-svn: 183152	2013-06-03 20:43:12 +00:00
David Majnemer	8e7dd2f628	SimplifyCFG: Small cleanup, use ICmpInst::isEquality() llvm-svn: 183151	2013-06-03 20:39:50 +00:00
Kostya Serebryany	9e62b301e6	[asan] ASan Linux MIPS32 support (llvm part), patch by Jyun-Yan Y llvm-svn: 183104	2013-06-03 14:46:56 +00:00
Nick Lewycky	3f715e260a	When determining the new index for an insertelement, we may not assume that an index greater than the size of the vector is invalid. The shuffle may be shrinking the size of the vector. Fixes a crash! Also drop the maximum recursion depth of the safety check for this optimization to five. llvm-svn: 183080	2013-06-01 20:51:31 +00:00
David Majnemer	91142c485e	SimplifyCFG: Fix typo in comment for ComputeSpeculationCost llvm-svn: 183078	2013-06-01 19:43:23 +00:00
Benjamin Kramer	7c275640e7	Move getRealLinkageName to a common place and remove all the duplicates of it. Also simplify code a bit while there. No functionality change. llvm-svn: 183076	2013-06-01 17:51:14 +00:00
Arnold Schwaighofer	7b1b4db35e	LoopVectorize: Change API call to get the backedge taken count Use ScalarEvolution's getBackedgeTakenCount API instead of getExitCount since that is really what we want to know. Using the more specific getExitCount was safe because we made sure that there is only one exiting block. No functionality change. llvm-svn: 183047	2013-05-31 21:48:56 +00:00
Quentin Colombet	bf490d4a32	Loop Strength Reduce: Scaling factor cost. Account for the cost of scaling factor in Loop Strength Reduce when rating the formulae. This uses a target hook. The default implementation of the hook is: if the addressing mode is legal, the scaling factor is free. <rdar://problem/13806271> llvm-svn: 183045	2013-05-31 21:29:03 +00:00
Arnold Schwaighofer	70a9be5297	LoopVectorize: PHIs with only outside users should prevent vectorization We check that instructions in the loop don't have outside users (except if they are reduction values). Unfortunately, we skipped this check for if-convertable PHIs. Fixes PR16184. llvm-svn: 183035	2013-05-31 19:53:50 +00:00
Quentin Colombet	8aa7abe2ae	Modify how the formulae are rated in Loop Strength Reduce. Namely, check if the target allows folding more than one register in the addressing mode and if yes, adjust the cost accordingly. Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2 needs a temporary register for the computation, whereas it was correctly estimated for reg1 + scale * reg2. <rdar://problem/13973908> llvm-svn: 183021	2013-05-31 17:20:29 +00:00
Rafael Espindola	65281bf36e	Simplify multiplications by vectors whose elements are powers of 2. Patch by Andrea Di Biagio. llvm-svn: 183005	2013-05-31 14:27:15 +00:00
Evgeniy Stepanov	888385e40f	[msan] Handle mixed track-origins and keep-going settings (llvm part). Before this change, each module defined a weak_odr global __msan_track_origins with a value of 1 if origin tracking is enabled, 0 if disabled. If there are modules with different values, any of them may win. If 0 wins, and there is at least one module with 1, the program will most likely crash. With this change, __msan_track_origins is only emitted if origin tracking is on. Then runtime library detects if there is at least one module with origin tracking, and enables runtime support for it. llvm-svn: 182997	2013-05-31 12:04:29 +00:00
Nick Lewycky	a2b7720618	Reapply r182909 with a fix to the calculation of the new indices for insertelement instructions. llvm-svn: 182976	2013-05-31 00:59:42 +00:00
Evgeniy Stepanov	2c14269883	Revert r182909. PR/16177 llvm-svn: 182919	2013-05-30 09:40:17 +00:00
Nick Lewycky	d7f27094c0	Swizzle vector inputs if it helps us eliminate shuffles. llvm-svn: 182909	2013-05-30 04:33:38 +00:00
NAKAMURA Takumi	d11b42aaad	LoopVectorize.cpp: Fix abuse of StringRef on Twine. Twine captures the pointer of StringRef. llvm-svn: 182820	2013-05-29 03:13:47 +00:00
NAKAMURA Takumi	d57ea87080	Whitespace. llvm-svn: 182819	2013-05-29 03:13:41 +00:00
Paul Redmond	5fdf836ba4	Add support for llvm.vectorizer metadata - llvm.loop.parallel metadata has been renamed to llvm.loop to be more generic by making the root of additional loop metadata. - Loop::isAnnotatedParallel now looks for llvm.loop and associated llvm.mem.parallel_loop_access - document llvm.loop and update llvm.mem.parallel_loop_access - add support for llvm.vectorizer.width and llvm.vectorizer.unroll - document llvm.vectorizer.* metadata - add utility class LoopVectorizerHints for getting/setting loop metadata - use llvm.vectorizer.width=1 to indicate already vectorized instead of already_vectorized - update existing tests that used llvm.loop.parallel and llvm.vectorizer.already_vectorized Reviewed by: Nadav Rotem llvm-svn: 182802	2013-05-28 20:00:34 +00:00
James Molloy	f6f121e277	Extend RemapInstruction and friends to take an optional new parameter, a ValueMaterializer. Extend LinkModules to pass a ValueMaterializer to RemapInstruction and friends to lazily create Functions for lazily linked globals. This is a big win when linking small modules with large (mostly unused) library modules. llvm-svn: 182776	2013-05-28 15:17:05 +00:00
Evgeniy Stepanov	fca012334b	[msan] Fix argument shadow alignment. llvm-svn: 182771	2013-05-28 13:07:43 +00:00
Michael J. Spencer	df1ecbd734	Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. llvm-svn: 182680	2013-05-24 22:23:49 +00:00
Michael Gottesman	e67f40c514	[objc-arc] KnownSafe does not imply that it is safe to perform code motion across CFG edges since even if it is safe to remove RR pairs, we may still be able to move a retain/release into a loop. rdar://13949644 llvm-svn: 182670	2013-05-24 20:44:05 +00:00
Michael Gottesman	5a91bbf33a	[objc-arc] Make sure that multiple owners is propagated correctly through the pass via the usage of a global data structure. rdar://13750319 llvm-svn: 182669	2013-05-24 20:44:02 +00:00
Benjamin Kramer	6ac1e62377	LoopVectorize: LoopSimplify can't canonicalize loops with an indirectbr in it, don't assert on those cases. Fixes PR16139. llvm-svn: 182656	2013-05-24 18:05:35 +00:00
Joey Gouly	b34294d0e4	Run clang-format over the scalarizePHI function. llvm-svn: 182640	2013-05-24 12:33:28 +00:00
Joey Gouly	83699284be	scalarizePHI needs to insert the next ExtractElement in the same block as the BinaryOperator, not in the block where the IRBuilder is currently inserting into. Fixes a bug where scalarizePHI would create instructions that would not dominate all uses. llvm-svn: 182639	2013-05-24 12:29:54 +00:00
Daniel Malea	fddddbeab0	Re-implement DebugIR in a way that does not subclass AssemblyWriter: - move AsmWriter.h from public headers into lib - marked all AssemblyWriter functions as non-virtual; no need to override them - DebugIR now "plugs into" AssemblyWriter with an AssemblyAnnotationWriter helper - exposed flags to control hiding of a) debug metadata b) debug intrinsic calls C/R: Paul Redmond llvm-svn: 182617	2013-05-23 22:34:33 +00:00
Benjamin Kramer	ad5c24f161	More symbols that should be static. llvm-svn: 182590	2013-05-23 16:09:15 +00:00
Michael Gottesman	740db977f6	[objc-arc] Fixed number of prefixing slashes in some comments in a function from 3 to 2 to match the rest of ObjCARCOpts. llvm-svn: 182557	2013-05-23 02:35:21 +00:00
Nadav Rotem	9e00eb38a2	SLPVectorizer: Change the order in which new instructions are added to the function. We are not working on a DAG and I ran into a number of problems when I enabled the vectorizations of 'diamond-trees' (trees that share leaves). * Improved the numbering API. * Changed the placement of new instructions to the last root. * Fixed a bug with external tree users with non-zero lane. * Fixed a bug in the placement of in-tree users. llvm-svn: 182508	2013-05-22 19:47:32 +00:00
Jean-Luc Duprat	0dda6f168c	This is an update to a previous commit (r181216). The earlier change list introduced the following inst combines: B * (uitofp i1 C) —> select C, B, 0 A * (1 - uitofp i1 C) —> select C, 0, A select C, 0, B + select C, A, 0 —> select C, A, B Together these 3 changes would simplify : A * (1 - uitofp i1 C) + B * uitofp i1 C down to : select C, B, A In practice we found that the first two substitutions can have a negative effect on performance, because they reduce opportunities to use FMA contractions; between the two options FMAs are often the better choice. This change list amends the previous one to enable just these inst combines: select C, B, 0 + select C, 0, A —> select C, B, A A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A llvm-svn: 182499	2013-05-22 18:29:31 +00:00
Arnold Schwaighofer	12b0d1cda0	LoopVectorize: Make Value pointers that could be RAUW'ed a VH The Value pointers we store in the induction variable list can be RAUW'ed by a call to SCEVExpander::expandCodeFor, use a TrackingVH instead. Do the same thing in some other places where we store pointers that could potentially be RAUW'ed. Fixes PR16073. llvm-svn: 182485	2013-05-22 16:54:56 +00:00
Evgeniy Stepanov	ebd7f8e7ef	[msan] A no-op implementation of VarArg handling. This stuff is used on platforms where MSan does not have a proper VarArg implementation (anything other than x86_64 at the moment). llvm-svn: 182375	2013-05-21 12:27:47 +00:00
Bill Wendling	5f4740390e	Remove unused #include. llvm-svn: 182315	2013-05-20 20:59:12 +00:00
Hal Finkel	a969df84ab	Rename LoopSimplify.h to LoopUtils.h As discussed, LoopUtils.h is a better name. llvm-svn: 182314	2013-05-20 20:46:30 +00:00
Hal Finkel	a12d82b421	Expose InsertPreheaderForLoop from LoopSimplify to other passes Other passes, PPC counter-loop formation for example, also need to add loop preheaders outside of the regular loop simplification pass. This makes InsertPreheaderForLoop a global function so that it can be used by other passes. No functionality change intended. llvm-svn: 182299	2013-05-20 16:47:07 +00:00
Arnold Schwaighofer	693a1ca628	LoopVectorize: Handle single edge PHIs We might encounter single edge PHIs - handle them with an identity select. Fixes PR15990. llvm-svn: 182199	2013-05-18 18:38:34 +00:00
Matt Arsenault	52ddb7bcdd	Add missing -- C++ -- to headers llvm-svn: 182164	2013-05-17 21:43:39 +00:00
Benjamin Kramer	d84a63398e	LoopVectorize: Simplify code. No functionality change. llvm-svn: 182100	2013-05-17 14:48:17 +00:00
Evgeniy Stepanov	1e7643243d	[msan] Switch TLS globals to initial-exec model. They are always defined in the main executable. llvm-svn: 181994	2013-05-16 09:14:05 +00:00
Arnold Schwaighofer	88e7fddc8c	LoopVectorize: Move call of canHoistAllLoads to canVectorizeWithIfConvert We only want to check this once, not for every conditional block in the loop. No functionality change (except that we don't perform a check redundantly anymore). llvm-svn: 181942	2013-05-15 22:38:14 +00:00
Michael Gottesman	b4e7f4d841	[objc-arc] Fixed a spelling error and made the statistic descriptions be consistent about their usage of periods. llvm-svn: 181901	2013-05-15 17:43:03 +00:00
Arnold Schwaighofer	09cee97270	LoopVectorize: Fix comments No functionality change. llvm-svn: 181862	2013-05-15 02:02:45 +00:00
Arnold Schwaighofer	2d920477a4	LoopVectorize: Hoist conditional loads if possible InstCombine can be uncooperative to vectorization and sink loads into conditional blocks. This prevents vectorization. Undo this optimization if there are unconditional memory accesses to the same addresses in the loop. radar://13815763 llvm-svn: 181860	2013-05-15 01:44:30 +00:00
Sylvestre Ledru	149e281aa8	Fix two typos llvm-svn: 181848	2013-05-14 23:36:24 +00:00
Manman Ren	b3c52fb45b	GlobalOpt: fix an issue where CXAAtExitFn points to a deleted function. CXAAtExitFn was set outside a loop and before optimizations where functions can be deleted. This patch will set CXAAtExitFn inside the loop and after optimizations. Seg fault when running LTO because of accesses to a deleted function. rdar://problem/13838828 llvm-svn: 181838	2013-05-14 21:52:44 +00:00
Michael Gottesman	0c8b562851	Removed trailing whitespace. llvm-svn: 181760	2013-05-14 06:40:10 +00:00
Arnold Schwaighofer	2e7a922a15	LoopVectorize: Handle loops with multiple forward inductions We used to give up if we saw two integer inductions. After this patch, we base further induction variables on the chosen one like we do in the reverse induction and pointer induction case. Fixes PR15720. radar://13851975 llvm-svn: 181746	2013-05-14 00:21:18 +00:00
Michael Gottesman	f3f9e3b10a	[objc-arc-opts] Added debug statements when we set and unset whether a pointer is known positive. llvm-svn: 181745	2013-05-14 00:08:09 +00:00
Michael Gottesman	a76143eeee	[objc-arc-opts] In the presence of an alloca unconditionally remove RR pairs if and only if we are both KnownSafeBU/KnownSafeTD rather than just either or. In the presence of a block being initialized, the frontend will emit the objc_retain on the original pointer and the release on the pointer loaded from the alloca. The optimizer will through the provenance analysis realize that the two are related (albeit different), but since we only require KnownSafe in one direction, will match the inner retain on the original pointer with the guard release on the original pointer. This is fixed by ensuring that in the presence of allocas we only unconditionally remove pointers if both our retain and our release are KnownSafe (i.e. we are KnownSafe in both directions) since we must deal with the possibility that the frontend will emit what (to the optimizer) appears to be unbalanced retain/releases. An example of the miscompile is: %A = alloca retain(%x) retain(%x) <--- Inner Retain store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) release(%x) <--- Guarding Release getting optimized to: %A = alloca retain(%x) store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) rdar://13750319 llvm-svn: 181743	2013-05-13 23:49:42 +00:00
Matt Beaumont-Gay	e55d9492e3	Move a couple more statistics inside '#ifndef NDEBUG'. Suppresses an unused-variable warning in -Asserts builds. llvm-svn: 181733	2013-05-13 21:10:49 +00:00
Michael Gottesman	993fbf704a	[objc-arc-opts] Add comment to BBState making it clear that get{TopDown,BottomUp}PtrState will create a new PtrState object if it does not find a PtrState for Arg. llvm-svn: 181726	2013-05-13 19:40:39 +00:00
Michael Gottesman	9fc50b82a4	[objc-arc] Move the before optimization statistics gathering phase out of OptimizeIndividualCalls. This makes the statistics gathering completely independent of the actual optimization occurring, preventing any sort of bleeding over from occurring. Additionally, it simplifies a switch statement in the non-statistic gathering case. llvm-svn: 181719	2013-05-13 18:29:07 +00:00
Duncan Sands	0480b9b54e	Suppress GCC compiler warnings in release builds about variables that are only read in asserts. llvm-svn: 181689	2013-05-13 07:50:47 +00:00
Nadav Rotem	33dcf0a70f	SLPVectorizer: Swap LHS and RHS. No functionality change. llvm-svn: 181684	2013-05-13 05:13:13 +00:00
Nadav Rotem	ce42cc6d4d	SLPVectorizer: Fix a bug in the code that generates extracts for values with multiple users. The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract. llvm-svn: 181674	2013-05-12 22:58:45 +00:00
Nadav Rotem	cbf6d24d50	SLPVectorizer: Clear the map that maps between scalars to vectors after each round of vectorization. Testcase in the next commit. llvm-svn: 181673	2013-05-12 22:55:57 +00:00
David Majnemer	6c30f49af3	InstCombine: Flip the order of two urem transforms There are two transforms in visitUrem that conflict with each other. ) One, if a divisor is a power of two, subtracts one from the divisor and turns it into a bitwise-and. ) The other unwraps both operands if they are surrounded by zext instructions. Flipping the order allows the subtraction to go beneath the sign extension. llvm-svn: 181668	2013-05-12 00:07:05 +00:00
Arnold Schwaighofer	f2305e4467	LoopVectorize: Use the widest induction variable type Use the widest induction type encountered for the canonical induction variable. We used to turn the following loop into an empty loop because we used i8 as induction variable type and truncated 1024 to 0 as trip count. int a[1024]; void fail() { int reverse_induction = 1023; unsigned char forward_induction = 0; while ((reverse_induction) >= 0) { forward_induction++; a[reverse_induction] = forward_induction; --reverse_induction; } } radar://13862901 llvm-svn: 181667	2013-05-11 23:04:28 +00:00
Arnold Schwaighofer	a544fefa32	LoopVectorize: Use variable instead of repeated function call No functionality change intended. llvm-svn: 181666	2013-05-11 23:04:26 +00:00
Arnold Schwaighofer	1ba84df437	LoopVectorize: Use IRBuilder interface in more places No functionality change intended. llvm-svn: 181665	2013-05-11 23:04:24 +00:00
David Majnemer	470b077bca	InstCombine: Turn urem to bitwise-and more often Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively fold away urem instructions. llvm-svn: 181661	2013-05-11 09:01:28 +00:00
Nadav Rotem	cdfb48d2fe	SLPVectorizer: Add support for trees with external users. For example: bar() { int a = A[i]; int b = A[i+1]; B[i] = a; B[i+1] = b; foo(a); <--- a is used outside the vectorized expression. } llvm-svn: 181648	2013-05-10 22:59:33 +00:00
Nadav Rotem	0686e5cb05	Add a debug print llvm-svn: 181647	2013-05-10 22:56:18 +00:00
Benjamin Kramer	14e915f7b4	InstCombine: Don't claim to be able to evaluate any shl in a zexted type. The shift amount may be larger than the type leading to undefined behavior. Limit the transform to constant shift amounts. While there update the bits to clear in the result which may enable additional optimizations. PR15959. llvm-svn: 181604	2013-05-10 16:26:37 +00:00
Benjamin Kramer	a6645e8b8f	InstCombine: Verify the type before transforming uitofp into select. PR15952. llvm-svn: 181586	2013-05-10 09:16:52 +00:00
Dmitri Gribenko	9bf66a5fd0	Fix a documentation warning: \bried -> \brief llvm-svn: 181551	2013-05-09 21:16:18 +00:00
Shuxin Yang	1d8d7e4d38	[GVN] Split critical-edge on the fly, instead of postponing edge-splitting to the next iteration. This is one step toward non-iterative GVN. My local hack suggests that getting rid of iteration will speed up GVN by 30%+ on a medium sized input (2k LOC, C++). I cannot explain why not 2x or more at this moment. llvm-svn: 181532	2013-05-09 18:34:27 +00:00
Rafael Espindola	007521673b	Don't replace an alias in llvm.used with its target. When we replace an internal alias with its target, be careful not to replace the entry in llvm.used (and llvm.compiler_used). llvm-svn: 181524	2013-05-09 17:22:59 +00:00
Benjamin Kramer	21b972ae94	InstCombine: Don't just copy known bits from the first operand of an srem. That's obviously wrong. Conservatively restrict it to the sign bit, which matches the original intention of this analysis. Fixes PR15940. llvm-svn: 181518	2013-05-09 16:32:32 +00:00
Arnold Schwaighofer	2e8c69cf97	LoopVectorizer: Don't assert on the absence of induction variables A computable loop exit count does not imply the presence of an induction variable. Scalar evolution can return a value for an infinite loop. Fixes PR15926. llvm-svn: 181495	2013-05-09 00:32:18 +00:00
Daniel Malea	3c5bed1670	Add DebugIR pass -- emits IR file and replace source lines with IR lines in MD - requires existing debug information to be present - fixes up file name and line number information in metadata - emits a "<orig_filename>-debug.ll" succinct IR file (without !dbg metadata or debug intrinsics) that can be read by a debugger - initialize pass in opt tool to enable the "-debug-ir" flag - lit tests to follow llvm-svn: 181467	2013-05-08 20:44:14 +00:00
Nick Lewycky	5fb1963f2a	Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInst by switching to a ValueMap. Patch by Andrea DiBiagio! llvm-svn: 181397	2013-05-08 09:00:10 +00:00
Arnold Schwaighofer	3610139ac5	LoopVectorizer: Improve reduction variable identification The two nested loops were confusing and also conservative in identifying reduction variables. This patch replaces them by a worklist based approach. llvm-svn: 181369	2013-05-07 21:55:37 +00:00
Arnold Schwaighofer	e78b76fbed	LoopVectorize: getConsecutiveVector must respect signed arithmetic We were passing an i32 to ConstantInt::get where an i64 was needed and we must also pass the sign if we pass negative numbers. The start index passed to getConsecutiveVector must also be signed. Should fix PR15882. llvm-svn: 181286	2013-05-07 04:37:05 +00:00
David Majnemer	70f286d95f	InstCombine: (X ^ signbit) + C -> X + (signbit ^ C) llvm-svn: 181249	2013-05-06 21:21:31 +00:00
Andrew Trick	9c72b071fe	Rotate multi-exit loops even if the latch was simplified. Test case by Michele Scandale! Fixes PR10293: Load not hoisted out of loop with multiple exits. There are few regressions with this patch, now tracked by rdar:13817079, and a roughly equal number of improvements. The regressions are almost certainly bad luck because LoopRotate has very little idea of whether rotation is profitable. Doing better requires a more comprehensive solution. This checkin is a quick fix that lacks generality (PR10293 has a counter-example). But it trivially fixes the case in PR10293 without interfering with other cases, and it does satisfy the criteria that LoopRotate is a loop canonicalization pass that should avoid heuristics and special cases. I can think of two approaches that would probably be better in the long run. Ultimately they may both make sense. (1) LoopRotate should check that the current header would make a good loop guard, and that the loop does not already have a sufficient guard. The artificial SimplifiedLoopLatch check would be unnecessary, and the design would be more general and canonical. Two difficulties: - We need a strong guarantee that we won't endlessly rotate, so the analysis would need to be precise in order to avoid the SimplifiedLoopLatch precondition. - Analyses like this are usually based on SCEV, which we don't want to rely on. (2) Rotate on-demand in late loop passes. This could even be done by shoving the loop back on the queue after the optimization that needs it. This could work well when we find LICM opportunities in multi-branch loops. This requires some work, and it doesn't really solve the problem of SCEV wanting a loop guard before the analysis. llvm-svn: 181230	2013-05-06 17:58:18 +00:00
Jean-Luc Duprat	3e4fc3ef24	Provide InstCombines for the following 3 cases: A * (1 - (uitofp i1 C)) -> select C, 0, A B * (uitofp i1 C) -> select C, B, 0 select C, 0, A + select C, B, 0 -> select C, B, A These come up in code that has been hand-optimized from a select to a linear blend, on platforms where that may have mattered. We want to undo such changes with the following transform: A(1 - uitofp i1 C) + B(uitofp i1 C) -> select C, A, B llvm-svn: 181216	2013-05-06 16:55:50 +00:00
Nadav Rotem	632b25b743	Update the comment to mention that we use TTI. llvm-svn: 181178	2013-05-06 03:06:36 +00:00
Nadav Rotem	c70ef4e93c	Revert r164763 because it introduces new shuffles. Thanks Nick Lewycky for pointing this out. llvm-svn: 181177	2013-05-06 02:39:09 +00:00
Rafael Espindola	c229a4fff4	Fix const merging when an alias of a const is llvm.used. We used to disable constant merging not only if a constant is llvm.used, but also if an alias of a constant is llvm.used. This change fixes that. llvm-svn: 181175	2013-05-06 01:48:55 +00:00
Benjamin Kramer	3e3f2a4b8d	LoopVectorize: Print values instead of pointers in debug output. llvm-svn: 181157	2013-05-05 14:54:52 +00:00
Arnold Schwaighofer	d96e427eac	LoopVectorize: Add support for floating point min/max reductions Add support for min/max reductions when "no-nans-float-math" is enabled. This allows us to assume we have ordered floating point math and treat ordered and unordered predicates equally. radar://13723044 llvm-svn: 181144	2013-05-05 01:54:48 +00:00
Arnold Schwaighofer	f5183729db	LoopVectorizer: Cleanup of minimum/maximum pattern match code No need for setting the operands. The pointers are going to be bound by the matcher. radar://13723044 llvm-svn: 181142	2013-05-05 01:54:44 +00:00
Arnold Schwaighofer	a670a0a3aa	LoopVectorize: We don't need an identity element for min/max reductions We can just use the initial element that feeds the reduction. max(max(x, y), z) == max(max(x,y), max(x,z)) radar://13723044 llvm-svn: 181141	2013-05-05 01:54:42 +00:00
Dmitri Gribenko	3238fb7595	Add ArrayRef constructor from None, and do the cleanups that this constructor enables Patch by Robert Wilhelm. llvm-svn: 181138	2013-05-05 00:40:33 +00:00
Nick Lewycky	881e9d62e2	Tabs to spaces. No functionality change. llvm-svn: 181082	2013-05-04 01:08:15 +00:00
Shuxin Yang	637b9bebd4	Decompose GVN::processNonLocalLoad() (about 400 LOC) into smaller helper functions. No functionality change. This function consists of the following steps: 1. Collect dependent memory accesses. 2. Analyze availability. 3. Perform full redundancy elimination, or 4. Perform PRE, depending on the availability. Steps 2, 3 and 4 are now moved to three helper routines. llvm-svn: 181047	2013-05-03 19:17:26 +00:00
Nadav Rotem	4ce060b3da	LoopVectorizer: Add support for if-conversion of PHINodes with 3+ incoming values. By supporting the vectorization of PHINodes with more than two incoming values we can increase the complexity of nested if statements. We can now vectorize this loop: int foo(int A, int B, int n) { for (int i=0; i < n; i++) { int x = 9; if (A[i] > B[i]) { if (A[i] > 19) { x = 3; } else if (B[i] < 4 ) { x = 4; } else { x = 5; } } A[i] = x; } } llvm-svn: 181037	2013-05-03 17:42:55 +00:00
Shuxin Yang	af2c3ddf0d	[GV] Remove dead code which is really difficult to decipher. Actually it took me a couple of hours trying to make sense of it, only to find it is dead code. I guess the original author used "allSingleSucc" to indicate if there are any critical edges emanating from some blocks, and tried to perform code motion (actually speculation) in the presence of these critical edges; but later on he/she changed mind and decided to perform edge-splitting first. llvm-svn: 180951	2013-05-02 21:14:31 +00:00
Filip Pizlo	dec20e43c0	This patch breaks up Wrap.h so that it does not have to include all of the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881	2013-05-01 20:59:00 +00:00
Nadav Rotem	1e211913b5	SROA: Generate selects instead of shuffles when blending values because this is the canonical form. Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often. llvm-svn: 180875	2013-05-01 19:53:30 +00:00
Jim Grosbach	d11584a7f7	Revert "InstCombine: Fold more shuffles of shuffles." This reverts commit r180802 There's ongoing discussion about whether this is the right place to make this transformation. Reverting for now while we figure it out. llvm-svn: 180834	2013-05-01 00:25:27 +00:00
Richard Trieu	624c2ebcbb	Fix a use after free. RI is freed before the call to getDebugLoc(). To prevent this, capture the location before RI is freed. llvm-svn: 180824	2013-04-30 22:45:10 +00:00
Nadav Rotem	9feda6071a	Fix a typo llvm-svn: 180806	2013-04-30 21:04:51 +00:00
Jim Grosbach	0b914fe839	InstCombine: Fold more shuffles of shuffles. Always fold a shuffle-of-shuffle into a single shuffle when there's only one input vector in the first place. Continue to be more conservative when there's multiple inputs. rdar://13402653 PR15866 llvm-svn: 180802	2013-04-30 20:43:52 +00:00
Adrian Prantl	8beccf9e6d	Spelling. Thanks, Eric. llvm-svn: 180794	2013-04-30 17:33:32 +00:00
Adrian Prantl	0941638a1b	Set debug locations for branch instructions created during inlining, even if the inlined function has multiple returns. rdar://problem/12415623 llvm-svn: 180793	2013-04-30 17:08:16 +00:00
David Majnemer	d73f37bb83	Fix a bug in foldSelectICmpAndOr. Differences in bitwidth between X and Y could exist even if C1 and C2 have the same Log2 representation. llvm-svn: 180779	2013-04-30 10:36:33 +00:00
David Majnemer	8d048d0482	Fix "Combine bit test + conditional or into simple math" This fixes the optimization introduced in r179748 and reverted in r179750. While the optimization was sound, it did not properly respect differences in bit-width. llvm-svn: 180777	2013-04-30 08:57:58 +00:00
Arnold Schwaighofer	474df6d3ed	SimplifyCFG: If convert single conditional stores This resurrects r179957, but adds code that makes sure we don't touch atomic/volatile stores: This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on an ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. llvm-svn: 180731	2013-04-29 21:28:24 +00:00
Michael Gottesman	03cf3c8966	Add in some conditional compilation in order to silence an unused variable warning. llvm-svn: 180700	2013-04-29 07:29:08 +00:00
Michael Gottesman	214ca90f8e	[objc-arc] Apply the RV optimization to retains next to calls in ObjCARCContract instead of ObjCARCOpts. Turning retains into retainRV calls disrupts the data flow analysis in ObjCARCOpts. Thus we move it as late as we can by moving it into ObjCARCContract. We leave in the conversion from retainRV -> retain in ObjCARCOpt since it enables the dataflow analysis. rdar://10813093 llvm-svn: 180698	2013-04-29 06:53:53 +00:00
Michael Gottesman	9c11815978	Added statistics to count the number of retains/releases before/after optimization. llvm-svn: 180697	2013-04-29 06:16:57 +00:00
Michael Gottesman	8005ad3f3e	Removed trailing whitespace. llvm-svn: 180696	2013-04-29 06:16:55 +00:00
Michael Gottesman	3e3977c49f	Fix for r180693. = /. llvm-svn: 180694	2013-04-29 05:25:39 +00:00
Michael Gottesman	a87bb8f50b	[objc-arc-annotations] Moved the disabling of call movement to ConnectTDBUTraversals so that I can prevent Changed = true from being set. This prevents an infinite loop. llvm-svn: 180693	2013-04-29 05:13:13 +00:00
Shuxin Yang	04a4fd43aa	Fix a XOR reassociation bug. When Reassociator optimizes "(x | C1)" ^ "(X & C2)", it may swap the two subexpressions, however, it forgot to swap cached constants (of C1 and C2) accordingly. rdar://13739160 llvm-svn: 180676	2013-04-27 18:02:12 +00:00
Adrian Prantl	d00333a4b2	Fix a typo that, due to cut&paste, quadrupled itself rdar://problem/13056109 llvm-svn: 180618	2013-04-26 18:10:50 +00:00
Adrian Prantl	29b9de7bf1	Bugfix for the debug intrinsic handling in InstCombiner: Since we can't guarantee that the original dbg.declare intrinsic is removed by LowerDbgDeclare(), we need to make sure that we are not inserting the same dbg.value intrinsic over and over. This removes tons of redundant DIEs when compiling optimized code. rdar://problem/13056109 llvm-svn: 180615	2013-04-26 17:48:33 +00:00
Nadav Rotem	13306816fc	LoopVectorizer: Calculate the number of pointers to disambiguate at runtime based on the numbers of reads and writes. llvm-svn: 180593	2013-04-26 05:08:59 +00:00
Michael Gottesman	47cf8a4c12	Revert "[objc-arc] Added ImpreciseAutoreleaseSet to track autorelease calls that were once autoreleaseRV instructions." This reverts commit r180222. I think this might tie in with a different problem which will require a different approach potentially. I am reverting this in the case I need to go down that second path. My apologies for the noise. = /. llvm-svn: 180590	2013-04-26 01:12:18 +00:00
Nadav Rotem	f43cbeee15	LoopVectorizer: No need to generate pointer disambiguation checks between readonly pointers. llvm-svn: 180570	2013-04-25 19:55:03 +00:00
Michael Gottesman	fdb497a9b2	[objc-arc] Added ImpreciseAutoreleaseSet to track autorelease calls that were once autoreleaseRV instructions. Due to the semantics of ARC, we must be extremely conservative with autorelease calls inserted by the frontend since ARC guarantees that said object will be in the autorelease pool after that point, an optimization invariant that the optimizer must respect. On the other hand, we are allowed significantly more flexibility with autoreleaseRV instructions. Often times though this flexibility is disrupted by early transformations which transform objc_autoreleaseRV => objc_autorelease if said instruction is no longer being used as part of an RV pair (generally due to inlining). Since we cannot tell the difference between an autorelease put into place by the frontend and one created through said ``strength reduction'' we cannot perform these optimizations. The addition of this set gets around said issues by allowing us to differentiate between said two cases. rdar://problem/13697741. llvm-svn: 180222	2013-04-24 22:18:18 +00:00
Michael Gottesman	cd5b02701c	Fixed comment typo. llvm-svn: 180221	2013-04-24 22:18:15 +00:00
Arnold Schwaighofer	3fa801fbc2	LoopVectorizer: Change variable name Stride to ConsecutiveStride This makes it easier to read the code. No functionality change. llvm-svn: 180197	2013-04-24 16:16:03 +00:00
Arnold Schwaighofer	a6578f7056	LoopVectorize: Scalarize padded types This patch disables memory-instruction vectorization for types that need padding bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on x86_64. Because the load/store vectorization is performed by the bit casting to a packed vector, which has incompatible memory layout due to the lack of padding bytes, the present vectorizer produces inconsistent result for memory instructions of those types. This patch checks an equality of the AllocSize of a scalar type and allocated size for each vector element, to ensure that there is no padding bytes and the array can be read/written using vector operations. Patch by Daisuke Takahashi! Fixes PR15758. llvm-svn: 180196	2013-04-24 16:16:01 +00:00
Arnold Schwaighofer	23a0589bce	LoopVectorizer: Bail out if we don't have DataLayout; we need it llvm-svn: 180195	2013-04-24 16:15:58 +00:00
Adrian Prantl	15db52bf6d	Make sure the instruction right after an inlined function has a debug location. This solves a problem where range of an inlined subroutine is emitted wrongly. Patch by Manman Ren. Fixes rdar://problem/12415623 llvm-svn: 180140	2013-04-23 19:56:03 +00:00
Nadav Rotem	71c9d6d333	LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make sure that the order in which the elements are scalarized is the same as the original order. This fixes a miscompilation in FreeBSD's regex library. llvm-svn: 180121	2013-04-23 17:12:42 +00:00
Pekka Jaaskelainen	d3c90e132a	Call the potentially costly isAnnotatedParallel() only once. Made the uniform write test's checks a bit stricter. llvm-svn: 180119	2013-04-23 16:44:43 +00:00
Pekka Jaaskelainen	6f2f66b63f	Refuse to (even try to) vectorize loops which have uniform writes, even if erroneously annotated with the parallel loop metadata. Fixes Bug 15794: "Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata" llvm-svn: 180081	2013-04-23 08:08:51 +00:00
Eric Christopher	04d4e9312c	Move C++ code out of the C headers and into either C++ headers or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063	2013-04-22 22:47:22 +00:00
Anat Shemer	10260a75e3	Changed back (relative to commit 179786) the operations executed when extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users. llvm-svn: 180045	2013-04-22 20:51:10 +00:00
Rafael Espindola	74f2e46eef	Clarify that llvm.used can contain aliases. Also add a check for llvm.used in the verifier and simplify clients now that they can assume they have a ConstantArray. llvm-svn: 180019	2013-04-22 14:58:02 +00:00
Benjamin Kramer	0212dc27ed	SROA: Don't crash on a select with two identical operands. This is an edge case that can happen if we modify a chain of multiple selects. Update all operands in that case and remove the assert. PR15805. llvm-svn: 179982	2013-04-21 17:48:39 +00:00
Arnold Schwaighofer	6eb32b31bd	Revert "SimplifyCFG: If convert single conditional stores" There is the temptation to make this transform dependent on target information as it is not going to be beneficial on all (sub)targets. Therefore, we should probably do this in MI Early-Ifconversion. This reverts commit r179957. Original commit message: "SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on an ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the buildbots and will revert this if anything unexpected comes up." llvm-svn: 179980	2013-04-21 13:09:04 +00:00
Nadav Rotem	c57af326a4	SLPVectorize: Add support for vectorization of casts. llvm-svn: 179975	2013-04-21 08:05:59 +00:00
Nadav Rotem	98ad5f0f4c	SLPVectorizer: Fix a bug in the code that scans the tree in search of nodes with multiple users. We did not terminate the switch case and we executed the search routine twice. llvm-svn: 179974	2013-04-21 07:37:56 +00:00
Michael Gottesman	3eab2e43d2	When we strength reduce an objc_retainBlock call to objc_retain, increment NumPeeps and make sure that Changed is set to true. llvm-svn: 179968	2013-04-21 00:50:27 +00:00
Michael Gottesman	1e43004295	Fixed comment typo. llvm-svn: 179967	2013-04-21 00:44:46 +00:00
Michael Gottesman	df110ac9ec	[objc-arc] Fixed typo in debug message. llvm-svn: 179966	2013-04-21 00:30:50 +00:00
Michael Gottesman	cdb7c15ce8	[objc-arc] Fixed comment typo. llvm-svn: 179965	2013-04-21 00:25:04 +00:00
Michael Gottesman	fb9ece9a7c	[objc-arc] Refactored OptimizeReturns so that it uses continue instead of a large multi-level nested if statement. llvm-svn: 179964	2013-04-21 00:25:01 +00:00
Michael Gottesman	01338a442a	[objc-arc] Added debug statement saying when we are resetting a sequence's progress. This will make it clearer when we are actually resetting a sequence's progress vs just changing state. This is an important distinction because the former case clears any pointers that we are tracking while the latter does not. llvm-svn: 179963	2013-04-20 23:36:57 +00:00
Nadav Rotem	8aca44a623	Fix PR15800. Do not try to vectorize vectors and structs. llvm-svn: 179960	2013-04-20 22:29:43 +00:00
Arnold Schwaighofer	3546ccf465	SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceding unconditional store to the same location: a[i] = X may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outweigh the potential case where the branch would be correctly predicted and the cost of executing the second store would be noticeably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on an ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducible below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the buildbots and will revert this if anything unexpected comes up. llvm-svn: 179957	2013-04-20 21:42:09 +00:00
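The if-conversion described in r179957 can be modeled at source level. The Python sketch below only illustrates the shape of the rewrite on a list standing in for memory; the function names are hypothetical, and the real transformation operates on LLVM IR basic blocks, not on Python code.

```python
def store_with_branch(a, i, cond, x, y):
    """Before: an unconditional store followed by a conditional store."""
    a[i] = x
    if cond:
        a[i] = y


def store_if_converted(a, i, cond, x, y):
    """After: the branch is replaced by a select feeding a second store."""
    a[i] = x                 # first store is kept, as in the commit's example
    tmp = y if cond else x   # models 'tmp = cond ? Y : X'
    a[i] = tmp               # now executed on every path
```

Both versions leave the same value in `a[i]`. The trade-off the commit message describes is that the second store now runs even when `cond` is false, in exchange for removing a branch that may be mispredicted.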
Benjamin Kramer	519b2e3087	VecUtils: Clean up uses of dyn_cast. llvm-svn: 179936	2013-04-20 10:36:17 +00:00
Benjamin Kramer	4600bcc337	SLPVectorizer: Strength reduce SmallVectors to ArrayRefs. Avoids a couple of copies and allows more flexibility in the clients. llvm-svn: 179935	2013-04-20 09:49:10 +00:00
Nadav Rotem	ce2660d639	SLPVectorizer: Reduce the compile time by eliminating the search for some of the more expensive patterns. After this change we will only check basic arithmetic trees that start at cmpinstr. llvm-svn: 179933	2013-04-20 07:29:34 +00:00
Nadav Rotem	998e035cae	refactor tryToVectorizePair to a new method that supports vectorization of lists. llvm-svn: 179932	2013-04-20 07:22:58 +00:00
Nadav Rotem	890387289e	Fix an unused variable warning. llvm-svn: 179931	2013-04-20 06:40:28 +00:00
Nadav Rotem	83c7c41bc2	SLPVectorizer: Improve the cost model for loop invariant broadcast values. llvm-svn: 179930	2013-04-20 06:13:47 +00:00
Nadav Rotem	dfe1c93ca4	Report the number of stores that were found in the debug message. llvm-svn: 179929	2013-04-20 05:23:11 +00:00
Nadav Rotem	dfd8fcbb00	Fix the header comment. llvm-svn: 179928	2013-04-20 05:18:51 +00:00
Nadav Rotem	5ed99674e9	Use 64bit arithmetic for calculating distance between pointers. llvm-svn: 179927	2013-04-20 05:17:47 +00:00
Benjamin Kramer	630e6e1422	MergeFunc: Make pointer and integer types generate the same hash. The logic that actually compares the types considers pointers and integers the same if they are of the same size. This created a strange mismatch between hash and reality and made the test case for this fail on some platforms (yay, test cases). llvm-svn: 179905	2013-04-19 23:06:44 +00:00
Arnold Schwaighofer	5146940316	LoopVectorizer: Use matcher from PatternMatch.h for the min/max patterns Also make some static function class functions to avoid having to mention the class namespace for enums all the time. No functionality change intended. llvm-svn: 179886	2013-04-19 21:03:36 +00:00
Jakub Staszak	99317268e2	Keep coding standard. Don't use "else if" after "return". llvm-svn: 179826	2013-04-19 01:18:04 +00:00
Bill Wendling	3b21eb69fb	Implement a better fix for PR15185. If the return type is a pointer and the call returns an integer, then do the inttoptr conversions. And vice versa. llvm-svn: 179817	2013-04-18 23:34:17 +00:00
Dmitri Gribenko	d29ea04446	Fix a -Wdocumentation warning llvm-svn: 179789	2013-04-18 20:13:04 +00:00
Anat Shemer	5570318f43	In the function InstCombiner::visitExtractElementInst() removed the limitation that extract is promoted over a cast only if the cast has only one use. llvm-svn: 179786	2013-04-18 19:56:44 +00:00
Anat Shemer	0c95efad7e	Added a function scalarizePHI() that scalarizes a vector phi instruction if it has only 2 uses: one to promote the vector phi in a loop and the other use is an extract operation of one element at a constant location. llvm-svn: 179783	2013-04-18 19:35:39 +00:00
Chris Lattner	8cf09416ea	Fix a comment, PR15777. llvm-svn: 179775	2013-04-18 17:42:14 +00:00
Arnold Schwaighofer	4cd6aa110c	LoopVectorizer: Recognize min/max reductions A min/max operation is represented by a select(cmp(lt/le/gt/ge, X, Y), X, Y) sequence in LLVM. If we see such a sequence we can treat it just as any other commutative binary instruction and reduce it. This appears to help bzip2 by about 1.5% on an imac12,2. radar://12960601 llvm-svn: 179773	2013-04-18 17:22:34 +00:00
Benjamin Kramer	8df2cfb858	LoopVectorize: Use a set to avoid longer cycles in the reduction chain too. Fixes PR15748. llvm-svn: 179757	2013-04-18 14:29:13 +00:00
David Majnemer	81af06e003	Revert "Combine bit test + conditional or into simple math" It is causing stage2 builds to fail, let's get them running again. llvm-svn: 179750	2013-04-18 08:42:33 +00:00
David Majnemer	bdf0caf6b1	Combine bit test + conditional or into simple math Simplify: (select (icmp eq (and X, C1), 0), Y, (or Y, C2)) Into: (or (shl (and X, C1), C3), y) Where: C3 = Log(C2) - Log(C1) If: C1 and C2 are both powers of two llvm-svn: 179748	2013-04-18 07:30:07 +00:00
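The rewrite in r179748 can be checked exhaustively on small values. The following Python sketch is an illustration only: the function names are made up, and shifting right when C3 is negative is an assumption added here so that the check covers every pair of powers of two. The later fixes listed above (r180777, r180779) concern bit-width differences between X and Y, which plain Python integers do not model.

```python
def select_form(x, y, c1, c2):
    """(select (icmp eq (and X, C1), 0), Y, (or Y, C2))"""
    return y if (x & c1) == 0 else (y | c2)


def math_form(x, y, c1, c2):
    """(or (shl (and X, C1), C3), Y) with C3 = log2(C2) - log2(C1).

    Assumes c1 and c2 are powers of two. For a power of two,
    bit_length() is log2 + 1, so the difference equals C3.
    """
    c3 = c2.bit_length() - c1.bit_length()
    bit = x & c1
    # Assumption: a negative C3 is handled by shifting right instead.
    moved = bit << c3 if c3 >= 0 else bit >> -c3
    return moved | y
```

The two forms agree because `x & c1` is either 0 or `c1`, and shifting `c1` by C3 yields exactly `c2`.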
Michael Gottesman	323964ca9e	[objc-arc] Do not mismatch up retains inside a for loop with releases outside said for loop in the presence of differing provenance caused by escaping blocks. This occurs due to an alloca representing a separate ownership from the original pointer. Thus consider the following pseudo-IR: objc_retain(%a) for (...) { objc_retain(%a) %block <- %a F(%block) objc_release(%block) } objc_release(%a) From the perspective of the optimizer, the %block is a separate provenance from the original %a. Thus the optimizer pairs up the inner retain for %a and the outer release from %a, resulting in segfaults. This is fixed by noting that the signature of a mismatch of retain/releases inside the for loop is a Use/CanRelease top down with a None bottom up (since bottom up the Retain-CanRelease-Use-Release sequence is completed by the inner objc_retain, but top down due to the differing provenance from the objc_release said sequence is not completed). In said case in CheckForCFGHazards, we now clear the state of %a implying that no pairing will occur. Additionally a test case is included. rdar://12969722 llvm-svn: 179747	2013-04-18 05:39:45 +00:00
Michael Gottesman	9e5181393a	Removed trailing whitespace. llvm-svn: 179746	2013-04-18 04:34:11 +00:00
Michael Gottesman	4e88ce68ae	[objc-arc] Added annotation option to only emit annotations for a specific ssa identifier. llvm-svn: 179729	2013-04-17 21:59:41 +00:00
Michael Gottesman	adb921affa	Fixed typo. llvm-svn: 179721	2013-04-17 21:03:53 +00:00
Michael Gottesman	6806b51ad2	[objc-arc] Added descriptions for EnableARCAnnotations, EnableCheckForCFGHazards, EnableARCOptimizations. llvm-svn: 179718	2013-04-17 20:48:03 +00:00
Michael Gottesman	ffef24f964	[objc-arc] Added an option to arc-annotations for turning off CheckForCFGHazard. llvm-svn: 179717	2013-04-17 20:48:01 +00:00
Peter Collingbourne	37ae72b508	Do not optimise fprintf() calls if its return value is used. Differential Revision: http://llvm-reviews.chandlerc.com/D620 llvm-svn: 179661	2013-04-17 02:01:10 +00:00
Hans Wennborg	c9e1d99279	simplifycfg: Fix integer overflow converting switch into icmp. If a switch instruction has a case for every possible value of its type, with the same successor, SimplifyCFG would replace it with an icmp ult, but the computation of the bound overflows in that case, which inverts the test. Patch by Jed Davis! llvm-svn: 179587	2013-04-16 08:35:36 +00:00
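The overflow fixed in r179587 is easy to reproduce with modular arithmetic. The sketch below models an 8-bit switch condition in Python; `BITS`, `MASK` and `range_check` are illustrative names chosen here, not SimplifyCFG code.

```python
BITS = 8
MASK = (1 << BITS) - 1


def range_check(x, lo, num_cases):
    """Model 'icmp ult (x - lo), num_cases' on BITS-wide integers.

    The bound is truncated to BITS bits, as it would be when stored
    in a constant of the switch condition's type.
    """
    bound = num_cases & MASK
    return ((x - lo) & MASK) < bound
```

For a small contiguous range the check behaves as intended: `range_check(14, 13, 2)` is true and `range_check(15, 13, 2)` is false. When the switch has a case for all 256 values, the bound wraps to 0 and the comparison is false for every input, even though every input should take the common successor.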
Bill Wendling	3789171972	We are not able to bitcast a pointer to an integral value. Two return types are not equivalent if one is a pointer and the other is an integral. This is because we cannot bitcast a pointer to an integral value. PR15185 llvm-svn: 179569	2013-04-15 22:33:50 +00:00
Nadav Rotem	b9116e6966	SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops. llvm-svn: 179562	2013-04-15 22:00:26 +00:00
Jim Grosbach	0f38c1e3a7	Fix a typo in comment. llvm-svn: 179542	2013-04-15 17:40:48 +00:00
Nadav Rotem	d4dcc003df	Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make -fslp-vectorize run the slp-vectorizer. llvm-svn: 179508	2013-04-15 05:39:58 +00:00
Nadav Rotem	a1e5e44eb3	Rename the slp-vectorizer clang/llvm flags. No functionality change. llvm-svn: 179505	2013-04-15 04:54:42 +00:00
Nadav Rotem	5d393c416f	SLPVectorizer: Add support for vectorizing trees that start at compare instructions. llvm-svn: 179504	2013-04-15 04:25:27 +00:00
David Majnemer	1fae195557	Reorders two transforms that collide with each other One performs: (X == 13 | X == 14) -> X-13 <u 2 The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1 The problem is that there are certain values of C1 and C2 that trigger both transforms but the first one blocks out the second, this generates suboptimal code. Reordering the transforms should be better in every case and allows us to do interesting stuff like turn: %shr = lshr i32 %X, 4 %and = and i32 %shr, 15 %add = add i32 %and, -14 %tobool = icmp ne i32 %add, 0 into: %and = and i32 %X, 240 %tobool = icmp ne i32 %and, 224 llvm-svn: 179493	2013-04-14 21:15:43 +00:00
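The equivalences quoted in r179493 can be sanity-checked with 32-bit modular arithmetic. This Python sketch is illustrative only and all function names are made up. It also assumes, for the mask form, that C1 and C2 differ in exactly one bit and that C1 is the constant with that bit clear; the commit message does not spell out those preconditions.

```python
MASK32 = 0xFFFFFFFF


def ult(a, b):
    """Unsigned 32-bit less-than, like 'icmp ult' on i32."""
    return (a & MASK32) < (b & MASK32)


def range_form(x):
    """(X == 13 | X == 14) rewritten as X-13 <u 2."""
    return ult(x - 13, 2)


def mask_form(a, c1, c2):
    """(A == C1 || A == C2) rewritten as (A & ~(C1 ^ C2)) == C1."""
    return (a & ~(c1 ^ c2) & MASK32) == c1


def original_ir(x):
    """lshr 4; and 15; add -14; icmp ne 0."""
    return ((((x >> 4) & 15) - 14) & MASK32) != 0


def simplified_ir(x):
    """and 240; icmp ne 224."""
    return (x & 240) != 224
```

Checking the IR pair over a range of inputs shows that both versions test whether bits 4 to 7 of X differ from 14.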
Benjamin Kramer	7d62ea86e5	Miscellaneous cleanups for VecUtils.h llvm-svn: 179483	2013-04-14 09:33:08 +00:00
Nadav Rotem	3403c11529	SLP: Document the scalarization cost method. llvm-svn: 179479	2013-04-14 07:22:22 +00:00
Nadav Rotem	54b413d157	SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree. llvm-svn: 179475	2013-04-14 05:15:53 +00:00
Nadav Rotem	0b9cf8567b	SLPVectorizer: add initial support for reduction variable vectorization. llvm-svn: 179470	2013-04-14 03:22:20 +00:00
Benjamin Kramer	adc1727c39	GlobalDCE: Fix an oversight in my last commit that could lead to crashes. There is a Constant with non-constant operands: blockaddress. llvm-svn: 179460	2013-04-13 16:11:14 +00:00
Benjamin Kramer	89ca4bc6d4	Fix a scalability issue with complex ConstantExprs. This is basically the same fix in three different places. We use a set to avoid walking the whole tree of a big ConstantExpr multiple times. For example: (select cmp, (add big_expr 1), (add big_expr 2)) We don't want to visit big_expr twice here, it may consist of thousands of nodes. The testcase exercises this by creating an insanely large ConstantExpr out of a loop. It's questionable if the optimizer should ever create those, but this can be triggered with real C code. Fixes PR15714. llvm-svn: 179458	2013-04-13 12:53:18 +00:00
Benjamin Kramer	e89c705030	InstCombine: Check the operand types before merging fcmp ord & fcmp ord. Fixes PR15737. llvm-svn: 179417	2013-04-12 21:56:23 +00:00
Nadav Rotem	8543ba3e52	SLPVectorizer: add support for vectorization of diamond shaped trees. We now perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from. llvm-svn: 179414	2013-04-12 21:16:54 +00:00
Nadav Rotem	4da0ab1d68	Add debug prints. llvm-svn: 179412	2013-04-12 21:11:14 +00:00
David Majnemer	1a08accbb7	Simplify (A & ~B) in icmp if A is a power of 2 The transform will execute like so: (A & ~B) == 0 --> (A & B) != 0 (A & ~B) != 0 --> (A & B) == 0 llvm-svn: 179386	2013-04-12 17:25:07 +00:00
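The power-of-two precondition in r179386 matters because a single-bit A makes `A & ~B` and `A & B` complementary: exactly one of them is zero. A short Python check of that claim follows; the helper names are illustrative assumptions.

```python
def is_power_of_two(a):
    """True when exactly one bit of a is set."""
    return a > 0 and (a & (a - 1)) == 0


def before(a, b):
    """(A & ~B) == 0"""
    return (a & ~b) == 0


def after(a, b):
    """(A & B) != 0"""
    return (a & b) != 0
```

With A = 3, which is not a power of two, and B = 1, the two forms disagree: `before(3, 1)` is false while `after(3, 1)` is true.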
Arnold Schwaighofer	f9cea17f75	LoopVectorizer: integer division is not a reduction operation Don't classify idiv/udiv as a reduction operation. Integer division is lossy. For example : (1 / 2) * 4 != 4/2. Example: int a[] = { 2, 5, 2, 2} int x = 80; for() x /= a[i]; Scalar: x /= 2 // = 40 x /= 5 // = 8 x /= 2 // = 4 x /= 2 // = 2 Vectorized: <80, 1> / <2,5> //= <40,0> <40, 0> / <2,2> //= <20,0> 20*0 = 0 radar://13640654 llvm-svn: 179381	2013-04-12 15:15:19 +00:00
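The worked example in r179381 can be replayed directly. The Python sketch below models a vectorization factor of 2 with lists; the function names are illustrative, the product used as the final horizontal step follows the commit's "20*0 = 0" line, and the input length is assumed to be a multiple of `vf`.

```python
def scalar_div_loop(x, a):
    """for (...) x /= a[i];  with integer division."""
    for d in a:
        x //= d
    return x


def vectorized_div_loop(x, a, vf=2):
    """Treat division as if it were a reduction (which it is not).

    Lane 0 starts at x, the other lanes start at 1, and the lanes are
    multiplied together at the end. Assumes len(a) % vf == 0.
    """
    lanes = [x] + [1] * (vf - 1)
    for i in range(0, len(a), vf):
        lanes = [lane // d for lane, d in zip(lanes, a[i:i + vf])]
    result = 1
    for lane in lanes:
        result *= lane
    return result
```

With `x = 80` and `a = [2, 5, 2, 2]` the scalar loop produces 2, while the vectorized model goes through `[40, 0]` and `[20, 0]` and ends at 0, because `1 // 5` truncates to 0 in the second lane.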
David Majnemer	b81cd63c4b	Optimize icmp involving addition better Allows LLVM to optimize sequences like the following: %add = add nsw i32 %x, 1 %cmp = icmp sgt i32 %add, %y into: %cmp = icmp sge i32 %x, %y as well as: %add1 = add nsw i32 %x, 20 %add2 = add nsw i32 %y, 57 %cmp = icmp sge i32 %add1, %add2 into: %add = add nsw i32 %y, 37 %cmp = icmp sle i32 %add, %x llvm-svn: 179316	2013-04-11 20:05:46 +00:00
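Both rewrites in r179316 depend on the `nsw` flag, which promises that the signed additions do not overflow. The Python sketch below checks the equivalences on values where no overflow occurs, then uses an 8-bit wrap-around model to show why the promise is needed. `wrap8` is a hypothetical helper for illustration.

```python
def wrap8(v):
    """Interpret v modulo 2**8 as a signed 8-bit integer."""
    v &= 0xFF
    return v - 256 if v >= 128 else v


def first_before(x, y):
    return x + 1 > y          # icmp sgt (add nsw x, 1), y


def first_after(x, y):
    return x >= y             # icmp sge x, y


def second_before(x, y):
    return x + 20 >= y + 57   # icmp sge (add nsw x, 20), (add nsw y, 57)


def second_after(x, y):
    return y + 37 <= x        # icmp sle (add nsw y, 37), x
```

Without the no-overflow guarantee the first rewrite is unsound: for an 8-bit `x = 127` and `y = 0`, `wrap8(x + 1)` is -128, so the original comparison is false while `x >= y` is true.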
Benjamin Kramer	a95f87494a	Fix for wrong instcombine on vector insert/extract When trying to collapse sequences of insertelement/extractelement instructions into single shuffle instructions, there is one specific case where the Instruction Combiner wrongly updates the resulting Mask of shuffle indexes. The problem is in function CollectShuffleElements. If we have a sequence of insert/extract element instructions like the one below: %tmp1 = extractelement <4 x float> %LHS, i32 0 %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1 %tmp3 = extractelement <4 x float> %RHS, i32 2 %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3 Where: . %RHS will have a mask of [4,5,6,7] . %LHS will have a mask of [0,1,2,3] The Mask of shuffle indexes is wrongly computed to [4,1,6,7] instead of [4,0,6,7]. When analyzing %tmp2 in order to compute the Mask for the resulting shuffle instruction, the algorithm forgets to update the mask index at position 1 with the index associated to the element extracted from %LHS by instruction %tmp1. Patch by Andrea DiBiagio! llvm-svn: 179291	2013-04-11 15:10:09 +00:00
Alexey Samsonov	a28f36c2e2	[ASan] Allow disabling init-order checks for globals by source file name. llvm-svn: 179280	2013-04-11 13:20:00 +00:00
Benjamin Kramer	c86fdf12e8	Rename the C function to create a SLPVectorizerPass to something sane and expose it in the header file. llvm-svn: 179272	2013-04-11 11:36:36 +00:00
Nadav Rotem	73dffa4184	Make the SLP store-merger less paranoid about function calls. We check for function calls when we check if it is safe to sink instructions. llvm-svn: 179207	2013-04-10 19:41:36 +00:00
Nadav Rotem	88dd5f7a38	We require DataLayout for analyzing the size of stores. llvm-svn: 179206	2013-04-10 18:57:27 +00:00
Joey Gouly	81259294be	Change CloneFunctionInto to always clone Argument attributes individually, rather than checking if the source and destination have the same number of arguments and copying the attributes over directly. llvm-svn: 179169	2013-04-10 10:37:38 +00:00
Bob Wilson	798a7709fc	Fix some comment typos. llvm-svn: 179132	2013-04-09 22:15:51 +00:00
Nadav Rotem	2d9dec322e	Add support for bottom-up SLP vectorization infrastructure. This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations. The infrastructure has three potential users: 1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]). 2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute. 3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization. This patch also includes a simple (100 LOC) bottom up SLP vectorizer that uses the infrastructure, and can vectorize this code: void SAXPY(int *x, int *y, int a, int i) { x[i] = a * x[i] + y[i]; x[i+1] = a * x[i+1] + y[i+1]; x[i+2] = a * x[i+2] + y[i+2]; x[i+3] = a * x[i+3] + y[i+3]; } llvm-svn: 179117	2013-04-09 19:44:35 +00:00
Shuxin Yang	331f01dcb4	Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate. I brazenly think this change is slightly simpler than r178793 because: - no "state" in functor - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" While I can reproduce the problem in Valgrind, it is rather difficult to come up with a standalone test case. The reason is that when an iterator is invalidated, the stale invalidated elements are not yet clobbered by nonsense data, so the optimizer can still proceed successfully. Thank Benjamin for fixing this bug and generously providing the test case. llvm-svn: 179062	2013-04-08 22:00:43 +00:00
Chandler Carruth	0e8a52d18f	Fix PR15674 (and PR15603): a SROA think-o. The fix for PR14972 in r177055 introduced a real think-o in the store side, likely because I was much more focused on the load side. While we can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily widen a value to be stored, as that changes the width of memory access! Lock down the code path in the store rewriting which would do this to only handle the intended circumstance. All of the existing tests continue to pass, and I've added a test from the PR. llvm-svn: 178974	2013-04-07 11:47:54 +00:00
Michael Gottesman	7924997c36	Removed trailing whitespace. llvm-svn: 178932	2013-04-05 23:46:45 +00:00
Michael Gottesman	31ba23aa56	An objc_retain can serve as a use for a different pointer. This is the counterpart to commit r160637, except it performs the action in the bottomup portion of the data flow analysis. llvm-svn: 178922	2013-04-05 22:54:32 +00:00
Michael Gottesman	1d8d25777d	Properly model precise lifetime when given an incomplete dataflow sequence. The normal dataflow sequence in the ARC optimizer consists of the following states: Retain -> CanRelease -> Use -> Release The optimizer before this patch stored the uses that determine the lifetime of the retainable object pointer when it bottom up hits a retain or when top down it hits a release. This is correct for an imprecise lifetime scenario since what we are trying to do is remove retains/releases while making sure that no ``CanRelease'' (which is usually a call) deallocates the given pointer before we get to the ``Use'' (since that would cause a segfault). If we are considering the precise lifetime scenario though, this is not correct. In such a situation, we DO care about the previous sequence, but additionally, we wish to track the uses resulting from the following incomplete sequences: Retain -> CanRelease -> Release (TopDown) Retain <- Use <- Release (BottomUp) NOTE This patch looks large but the most of it consists of updating test cases. Additionally this fix exposed an additional bug. I removed the test case that expressed said bug and will recommit it with the fix in a little bit. llvm-svn: 178921	2013-04-05 22:54:28 +00:00
Jim Grosbach	bdbd73460c	Tidy up a bit. No functional change. llvm-svn: 178915	2013-04-05 21:20:12 +00:00
Shuxin Yang	95adf5258f	Disable the optimization about promoting vector-element-access with symbolic index. This optimization is unstable at this moment; it 1) blocks us on a very important application 2) PR15200 3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll (the CHECK commands compare the output against the wrong result) I personally believe this optimization should not have any impact on the autovectorized code, as the auto-vectorizer is supposed to put gather/scatter in a "right" way. Although in theory downstream optimizers might reveal some gather/scatter optimization opportunities, the chance is quite slim. For the hand-crafted vectorizing code, in terms of redundancy elimination, load-CSE, copy-propagation and DSE can collectively achieve the same result, but in a much simpler way. On the other hand, these optimizers are able to improve the code in an incremental way; in contrast, SROA is sort of an all-or-none approach. However, SROA might slightly win in stack size, as it tries to figure out a stretch of memory that tightly covers the area accessed by the dynamic index. rdar://13174884 PR15200 llvm-svn: 178912	2013-04-05 21:07:08 +00:00
Michael Gottesman	bab49e976b	Added two debug logging messages to VisitInstructionsTopDown to match VisitInstructionsBottomUp. llvm-svn: 178895	2013-04-05 18:26:08 +00:00
Michael Gottesman	89279f8383	Cleaned up whitespace and made debug logging less verbose. llvm-svn: 178893	2013-04-05 18:10:41 +00:00
Arnold Schwaighofer	df6f67ed87	LoopVectorizer: Pass OperandValueKind information to the cost model Pass down the fact that an operand is going to be a vector of constants. This should bring the performance of MultiSource/Benchmarks/PAQ8p/paq8p on x86 back. It had degraded to scalar performance due to my previous shift cost change that made all shifts expensive on x86. radar://13576547 llvm-svn: 178809	2013-04-04 23:26:27 +00:00
Benjamin Kramer	dd67654af6	Reassociate: Avoid iterator invalidation. OpndPtrs stored pointers into the Opnd vector that became invalid when the vector grows. Store indices instead. Sadly I only have a large testcase that only triggers under valgrind, so I didn't include it. llvm-svn: 178793	2013-04-04 21:15:42 +00:00
Michael Gottesman	21a4ed3227	Refactored out the helper method FindPredecessorAutoreleaseWithSafePath from ObjCARCOpt::OptimizeReturns. Now ObjCARCOpt::OptimizeReturns is easy to read and reason about. llvm-svn: 178715	2013-04-03 23:39:14 +00:00
Michael Gottesman	6908db148b	Refactored out the helper function FindPredecessorRetainWithSafePath from ObjCARCOpt::OptimizeReturns. llvm-svn: 178714	2013-04-03 23:16:05 +00:00
Michael Gottesman	c2d5bf5c53	Small cleanups. Cleaned up trailing whitespace and added extra slashes in front of a function level comment so that it follows the convention of having 3 slashes. llvm-svn: 178712	2013-04-03 23:07:45 +00:00
Michael Gottesman	54dc7fdefb	Refactored out a part of ObjCARCOpt::OptimizeReturns into its own method HasSafePathToPredecessorCall. llvm-svn: 178710	2013-04-03 23:04:28 +00:00
Michael Gottesman	0a1748bb8c	Removed an old comment. llvm-svn: 178709	2013-04-03 23:04:24 +00:00
Michael Gottesman	43e7e00a68	Clean up arc annotations by moving the top/bottom BB annotations into conditional macros that no-op in Release mode instead of #ifdef sections of the code. This is to follow the example of the DEBUG macro. llvm-svn: 178705	2013-04-03 22:41:59 +00:00
Michael Gottesman	b8c8836594	Remove an optimization where we were changing an objc_autorelease into an objc_autoreleaseReturnValue. The semantics of ARC implies that a pointer passed into an objc_autorelease must live until some point (potentially down the stack) where an autorelease pool is popped. On the other hand, an objc_autoreleaseReturnValue just signifies that the object must live until the end of the given function at least. Thus objc_autorelease is stronger than objc_autoreleaseReturnValue in terms of the semantics of ARC* implying that performing the given strength reduction without any knowledge of how this relates to the autorelease pool pop that is further up the stack violates the semantics of ARC. *Even though objc_autoreleaseReturnValue if you know that no RV optimization will occur is more computationally expensive. llvm-svn: 178612	2013-04-03 02:57:24 +00:00
Michael Gottesman	624243914f	Improved comment. No functionality change. llvm-svn: 178605	2013-04-03 01:57:16 +00:00
Bill Wendling	88d06c3b2d	Use a worklist to avoid a sneaky iterator invalidation. The iterator could be invalidated when it's recursively deleting a whole bunch of constant expressions in a constant initializer. Note: This was only reproducible if `opt' was run on a `.bc' file. If `opt' was run on a `.ll' file, it wouldn't crash. This is why the test first pushes the `.ll' file through `llvm-as' before feeding it to `opt'. PR15440 llvm-svn: 178531	2013-04-02 08:16:45 +00:00
Shuxin Yang	6662fd0f15	Correct assertion condition llvm-svn: 178484	2013-04-01 18:13:05 +00:00
Shuxin Yang	7b0c94e207	Implement XOR reassociation. It is based on the following rules: rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2), only useful when c1=c2 rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2)) rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2 rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2 It reduces an application's size (in terms of # of instructions) by 8.9%. Reviewed by Pete Cooper. Thanks a lot! rdar://13212115 llvm-svn: 178409	2013-03-30 02:15:01 +00:00
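The four rewrite rules listed in the commit above are plain bitwise identities, so they can be checked exhaustively on a small bit width. The sketch below does that for 4-bit values; `BITS`, `inv`, and `check_xor_rules` are hypothetical names chosen for this example, and the check says nothing about how the Reassociate pass applies the rules.

```python
from itertools import product

BITS = 4                      # small width so the exhaustive check stays cheap
MASK = (1 << BITS) - 1


def inv(v):
    """Bitwise NOT restricted to BITS bits."""
    return ~v & MASK


def check_xor_rules():
    """Assert all four rules for every (x, c1, c2) of BITS-bit values."""
    for x, c1, c2 in product(range(1 << BITS), repeat=3):
        # rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1 ^ c2)
        assert (x | c1) ^ c2 == (x & inv(c1)) ^ (c1 ^ c2)
        # rule 2: (x & c1) ^ (x & c2) = x & (c1 ^ c2)
        assert (x & c1) ^ (x & c2) == x & (c1 ^ c2)
        # rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3, where c3 = c1 ^ c2
        c3 = c1 ^ c2
        assert (x | c1) ^ (x | c2) == (x & c3) ^ c3
        # rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2
        c3 = inv(c1) ^ c2
        assert (x | c1) ^ (x & c2) == (x & c3) ^ c1
    return True
```

All four rules follow from one observation: `x | c` equals `(x & ~c) ^ c`, because the two operands of the XOR share no set bits.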
Michael Gottesman	3b8f877860	Add clang.arc.used to ModuleHasARC so ARC always runs if said call is present in a module. clang.arc.used is an interesting call for ARC since ObjCARCContract needs to run to remove said intrinsic to avoid a linker error (since the call does not exist). llvm-svn: 178369	2013-03-29 21:15:23 +00:00
Michael Gottesman	60f6b28c58	Removed trailing whitespace. llvm-svn: 178329	2013-03-29 05:13:07 +00:00
Michael Gottesman	ba64859e6e	Removed dead code from ObjCARCOpts relating to tracking objc_retainBlocks through the ARC Dataflow analysis. By the time we get to the ARC dataflow analysis, any objc_retainBlock calls are not optimizable. llvm-svn: 178306	2013-03-28 23:08:44 +00:00
Bill Wendling	85722f48da	Minor simplification. Go ahead and use the full path for both the .gcno and .gcda files. llvm-svn: 178302	2013-03-28 22:40:08 +00:00
Michael Gottesman	49f9885a2a	Non optimizable objc_retainBlock calls are not forwarding. Since we handle optimizable objc_retainBlocks through strength reduction in OptimizableIndividualCalls, we know that all code after that point will only see non-optimizable objc_retainBlock calls. IsForwarding is only called by functions after that point, so it is ok to just classify objc_retainBlock as non-forwarding. <rdar://problem/13249661>. llvm-svn: 178285	2013-03-28 20:11:30 +00:00
Michael Gottesman	158fdf699e	[ObjCARC] Strength reduce objc_retainBlock -> objc_retain if the objc_retainBlock is optimizable. If an objc_retainBlock has the copy_on_escape metadata attached to it AND if the block pointer argument only escapes down the stack, we are allowed to strength reduce the objc_retainBlock to an objc_retain and thus optimize it. Currently there is logic in the ARC data flow analysis to handle this case which is complicated and involved making distinctions in between objc_retainBlock and objc_retain in certain places and considering them the same in others. This patch simplifies said code by: 1. Performing the strength reduction in the initial ARC peephole analysis (ObjCARCOpts::OptimizeIndividualCalls). 2. Changing the ARC dataflow analysis (which runs after the peephole analysis) to consider all objc_retainBlock calls to not be optimizable (since if the call was optimizable, we would have strength reduced it already). This patch leaves in the infrastructure in the ARC dataflow analysis to handle this case, which due to 2 will just be dead code. I am doing this on purpose to separate the removal of the old code from the testing of the new code. <rdar://problem/13249661>. llvm-svn: 178284	2013-03-28 20:11:19 +00:00
Kostya Serebryany	463aa81418	[tsan] make sure memset/memcpy/memmove are not inlined in tsan mode llvm-svn: 178230	2013-03-28 11:21:13 +00:00
Akira Hatanaka	99866dd535	Check if Type is a vector before calling function Type::getVectorNumElements. llvm-svn: 178208	2013-03-28 01:28:02 +00:00
Bill Wendling	5aa82397e8	Use the full path when outputting the `.gcda' file. If we compile a single source program, the `.gcda' file will be generated where the program was executed. This isn't desirable, because that place may be at an unpredictable place (the program could call `chdir' for instance). Instead, we will output the `.gcda' file in the same place we output the `.gcno' file. I.e., the directory where the executable was generated. This matches GCC's behavior. <rdar://problem/13061072> & PR11809 llvm-svn: 178084	2013-03-26 22:47:50 +00:00
Ulrich Weigand	8a51d8ea95	Make InstCombineCasts.cpp:OptimizeIntToFloatBitCast endian safe. The OptimizeIntToFloatBitCast converts shift-truncate sequences into extractelement operations. The computation of the element index to be used in the resulting operation is currently only correct for little-endian targets. This commit fixes the element index computation to be correct for big-endian targets as well. If the target byte order is unknown, the optimization cannot be performed at all. llvm-svn: 178031	2013-03-26 15:36:14 +00:00
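To make the endianness issue in the commit above concrete, here is a small Python model of a `<4 x i16>` vector bitcast to a 64-bit integer, followed by a shift and truncate. It is a sketch of the index arithmetic only, not the InstCombine code; `element_index` and `shift_trunc` are hypothetical helpers, and the fixed 4 x 16-bit shape is an assumption made for the example.

```python
import struct


def element_index(shift_bits, elem_bits, num_elems, big_endian):
    """Index of the element selected by (bits >> shift_bits) truncated to elem_bits."""
    k = shift_bits // elem_bits
    # On big-endian targets element 0 holds the most significant bits,
    # so the index counts from the other end of the vector.
    return num_elems - 1 - k if big_endian else k


def shift_trunc(elems, shift_bits, big_endian):
    """Model bitcasting a <4 x i16> vector to i64, then shifting and truncating to i16."""
    fmt, order = (">4H", "big") if big_endian else ("<4H", "little")
    bits = int.from_bytes(struct.pack(fmt, *elems), order)
    return (bits >> shift_bits) & 0xFFFF
```

Using the little-endian formula on a big-endian layout would pick the mirror-image element, which is the kind of error the commit describes.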
Alexey Samsonov	e1e26bf158	[ASan] Change the ABI of __asan_before_dynamic_init function: now it takes pointer to private string with module name. This string serves as a unique module ID in ASan runtime. LLVM part llvm-svn: 178013	2013-03-26 13:05:41 +00:00
Michael Gottesman	cd4de0f9bb	[ObjCARC Annotations] Added support for displaying the state of pointers at the bottom/top of BBs of the ARC dataflow analysis for both bottomup and topdown analyses. This will allow for verification and analysis of the merge function of the data flow analyses in the ARC optimizer. The actual implementation of this feature is by introducing calls to the functions llvm.arc.annotation.{bottomup,topdown}.{bbstart,bbend} which are only declared. Each such call takes in a pointer to a global with the same name as the pointer whose provenance is being tracked and a pointer whose name is one of our Sequence states and points to a string that contains the same name. To ensure that the optimizer does not consider these annotations in any way, I made it so that the annotations are considered to be of IC_None type. A test case is included for this commit and the previous ObjCARCAnnotation commit. llvm-svn: 177952	2013-03-26 00:42:09 +00:00
Michael Gottesman	81b1d43783	[ObjCARC Annotations] Implemented ARC annotation metadata to expose the ARC data flow analysis state in the IR via metadata. Previously the inner workings of the data flow analysis in ObjCARCOpts were hard to get out of the optimizer for analysis of bugs or testing. All of the current ARC unit tests are based off of testing the effect of the data flow analysis (i.e. what statements are removed or moved, etc.). This creates weakness in the current unit testing regimen since we are not actually testing what effects various instructions have on the modeled pointer state. Additionally in order to analyze a bug in the optimizer, one would need to track by hand what the optimizer was actually doing either through use of DEBUG statements or through the usage of a debugger, both yielding large losses in developer productivity. This patch deals with these two issues by providing ARC annotation metadata that annotates instructions with the state changes that they cause in various pointers as well as provides metadata to annotate provenance sources. Specifically, we introduce the following metadata types: 1. llvm.arc.annotation.bottomup. 2. llvm.arc.annotation.topdown. 3. llvm.arc.annotation.provenancesource. llvm.arc.annotation.{bottomup,topdown}: These annotations describe a state change in a pointer when we are visiting instructions bottomup/topdown respectively. The output format for both is the same: !1 = metadata !{metadata !"(test,%x)", metadata !"S_Release", metadata !"S_Use"} The first element is a string tuple with the following format: (function,variable name) The second two elements of the metadata show the previous state of the pointer (in this case S_Release) and the new state of the pointer (S_Use). We write the metadata in such a manner to ensure that it is easy for outside tools to parse. 
This is important since I am currently working on a tool for taking this information and pretty printing it besides the IR and that can be used for LIT style testing via the generation of an index. llvm.arc.annotation.provenancesource: This metadata is used to annotate instructions which act as provenance sources, i.e. ones that introduce a new (from the optimizer's perspective) non-argument pointer to track. This enables cross-referencing in between provenance sources and the state changes that occur to them. This is still a work in progress. Additionally I plan on committing later today additions to the annotations that annotate at the top/bottom of basic blocks the state of the various pointers being tracked. NOTE The metadata support is conditionally compiled into libObjCARCOpts only when we are producing a debug build of llvm/clang and even so are disabled by default. To enable the annotation metadata, pass in -enable-objc-arc-annotations to opt. llvm-svn: 177951	2013-03-26 00:42:04 +00:00
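The commit above says the annotation format is meant to be easy for outside tools to parse. A minimal sketch of such a parser, assuming only the single example line quoted in the message, could look like the following. `parse_annotation` and `EXAMPLE` are hypothetical names, and this is not the pretty-printing tool the author mentions.

```python
import re

# The example node quoted in the commit message.
EXAMPLE = ('!1 = metadata !{metadata !"(test,%x)", '
           'metadata !"S_Release", metadata !"S_Use"}')


def parse_annotation(line):
    """Split one annotation node into (function, variable, old_state, new_state)."""
    # Collect every metadata string of the form !"...".
    strings = re.findall(r'!"([^"]*)"', line)
    if len(strings) != 3:
        raise ValueError("expected exactly three metadata strings")
    # The first string is the tuple "(function,variable name)".
    match = re.fullmatch(r"\(([^,]*),(.*)\)", strings[0])
    if match is None:
        raise ValueError("first element is not a (function,variable) tuple")
    return match.group(1), match.group(2), strings[1], strings[2]
```

For the example line, the function is `test`, the variable is `%x`, and the state moves from `S_Release` to `S_Use`.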
Shuxin Yang	389ed4b8f7	Fix a bug in fast-math fadd/fsub simplification. The problem is that the code mistakenly took for granted that following constructor is able to create an APFloat from a SIGNED integer: APFloat::APFloat(const fltSemantics &ourSemantics, integerPart value) rdar://13486998 llvm-svn: 177906	2013-03-25 20:43:41 +00:00
Arnaud A. de Grandmaison	3ee88e8a77	Address issues found by Duncan during post-commit review of r177856. llvm-svn: 177863	2013-03-25 11:47:38 +00:00
Arnaud A. de Grandmaison	9c383d68cf	InstCombine: simplify comparisons to zero of (shl %x, Cst) or (mul %x, Cst) This simplification happens at 2 places : - using the nsw attribute when the shl / mul is used by a sign test - when the shl / mul is compared for (in)equality to zero llvm-svn: 177856	2013-03-25 09:48:49 +00:00
Michael Gottesman	65c2481d09	Changed isNullOrUndef => IsNullOrUndef and isNoopInstruction => IsNoopInstruction so that all helper functions are named similarly in ObjCARC.h. llvm-svn: 177855	2013-03-25 09:27:43 +00:00
Jakub Staszak	4f9d1e85d0	Minor cleanups. No functionality change. llvm-svn: 177837	2013-03-24 09:56:28 +00:00
Jakub Staszak	f6df1e3def	Use dyn_cast instead of isa && cast. No functionality change. llvm-svn: 177836	2013-03-24 09:25:47 +00:00
Michael Gottesman	764b1cfced	Change method name ClearRefCount => ClearKnownPositiveRefCount to match the name of the member that it is modifying. llvm-svn: 177818	2013-03-23 05:46:19 +00:00
Michael Gottesman	07beea47b8	Changed the method name PtrState.IsKnownIncremented() to PtrState.HasKnownPositiveRefCount(). Now said method matches namewise every other method which refers to the member KnownPositiveRefCount of the class PtrState. llvm-svn: 177816	2013-03-23 05:31:01 +00:00
John McCall	20182ac0c7	Kill every call to @clang.arc.use in the ARC contract phase. llvm-svn: 177769	2013-03-22 21:38:36 +00:00
Bill Wendling	56f15bf490	Add all clauses when merging the landing pads. Duplicates will be handled later on. llvm-svn: 177757	2013-03-22 20:31:05 +00:00
Bill Wendling	a397c017bb	Don't use the removed API. llvm-svn: 177749	2013-03-22 18:49:53 +00:00
Kostya Serebryany	cdd35a9050	[asan] Change the way we report the alloca frame on stack-buff-overflow. Before: the function name was stored by the compiler as a constant string and the run-time was printing it. Now: the PC is stored instead and the run-time prints the full symbolized frame. This adds a couple of instructions into every function with non-empty stack frame, but also reduces the binary size because we store less strings (I saw 2% size reduction). This change bumps the asan ABI version to v3. llvm part. Example of report (now): ==31711==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffa77cf1c5 at pc 0x41feb0 bp 0x7fffa77cefb0 sp 0x7fffa77cefa8 READ of size 1 at 0x7fffa77cf1c5 thread T0 #0 0x41feaf in Frame0(int, char, char, char) stack-oob-frames.cc:20 #1 0x41f7ff in Frame1(int, char, char) stack-oob-frames.cc:24 #2 0x41f477 in Frame2(int, char) stack-oob-frames.cc:28 #3 0x41f194 in Frame3(int) stack-oob-frames.cc:32 #4 0x41eee0 in main stack-oob-frames.cc:38 #5 0x7f0c5566f76c (/lib/x86_64-linux-gnu/libc.so.6+0x2176c) #6 0x41eb1c (/usr/local/google/kcc/llvm_cmake/a.out+0x41eb1c) Address 0x7fffa77cf1c5 is located in stack of thread T0 at offset 293 in frame #0 0x41f87f in Frame0(int, char, char, char*) stack-oob-frames.cc:12 <<<<<<<<<<<<<< this is new This frame has 6 object(s): [32, 36) 'frame.addr' [96, 104) 'a.addr' [160, 168) 'b.addr' [224, 232) 'c.addr' [288, 292) 's' [352, 360) 'd' llvm-svn: 177724	2013-03-22 10:37:20 +00:00
Dmitry Vyukov	55e63ef454	tsan: handle vptr loads specially This is required to determine ctor/dtor vs virtual call races. http://llvm-reviews.chandlerc.com/D566 llvm-svn: 177717	2013-03-22 08:51:22 +00:00
Evgeniy Stepanov	2a066afce5	Fix llvm::removeUnreachableBlocks to handle unreachable loops. llvm-svn: 177713	2013-03-22 08:43:04 +00:00
Arnaud A. de Grandmaison	f364bc63e7	InstCombine: Improve the result bitvect type when folding (cmp pred (load (gep GV, i)) C) to a bit test. The original code used i32, and i64 if legal. This introduced unneeded casts when they aren't legal, or when the index variable i has another type. In order of preference: try to use i's type; use the smallest fitting legal type (using an added DataLayout method); default to i32. A testcase checks that this works when the index gep operand is i16. Patch by : Ahmed Bougacha <ahmed.bougacha@gmail.com> Reviewed by : Duncan llvm-svn: 177712	2013-03-22 08:25:01 +00:00
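The fold named in the commit above replaces a comparison of a loaded constant-table entry against a constant with a test of one bit in a precomputed mask. The Python sketch below illustrates that equivalence for the equality predicate only; it does not model the choice of integer type that the commit is about, and `build_mask`, `lookup_eq`, and `bit_test` are hypothetical names.

```python
def build_mask(table, c):
    """Bit i of the mask is set exactly when table[i] == c."""
    mask = 0
    for i, value in enumerate(table):
        if value == c:
            mask |= 1 << i
    return mask


def lookup_eq(table, i, c):
    """Original form: load table[i] and compare it against the constant c."""
    return table[i] == c


def bit_test(mask, i):
    """Folded form: test bit i of the precomputed mask."""
    return (mask >> i) & 1 == 1
```

The mask needs one bit per table entry, which is why the width of the integer type used for it matters: a type that is too wide or not legal on the target costs extra casts.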
Bill Wendling	173c71ff3d	Always forward 'resume' instructions to the outer landing pad. How did this ever work? Basically, if you have a function that's inlined into the caller, it may not have any 'call' instructions, but any 'resume' instructions it may have should still be forwarded to the outer (caller's) landing pad. This requires that all of the 'landingpad' instructions in the callee have their clauses merged with the caller's outer 'landingpad' instruction (hence the bit of ugly code in the `forwardResume' method). Testcase in a follow-up commit to the test-suite repository. <rdar://problem/13360379> & PR15555 llvm-svn: 177680	2013-03-21 23:30:12 +00:00
Chandler Carruth	34f0c7fcaf	[SROA] Prefix names using a custom IRBuilder inserter. The key part of this is ensuring that name prefixes remain in a Twine form until we get to a point where we can nuke them under NDEBUG. This is tricky using the old APIs as they played fast and loose with Twine, which is prone to serious error. The inserter is much cleaner as it is actually in the call stack leading to the setName call, and so has a good opportunity to prepend the prefix. This matters more than you might imagine because most runs over an alloca find a single partition, and rewrite 3 or 4 instructions referring to it. As a consequence doing this lazily and exclusively with Twine allows the optimizer to delete more of it and shaves another 2% to 3% off of the release build's SROA run time for PR15412. I also think the APIs are cleaner, and the use of Twine is more reliable, so I consider it a win-win despite the churn required to reach this state. llvm-svn: 177631	2013-03-21 09:52:18 +00:00
Evgeniy Stepanov	a9a962cad8	[msan] Add an option to disable poisoning of shadow for undef values. llvm-svn: 177630	2013-03-21 09:38:26 +00:00
Meador Inge	cf691565ed	simplify-libcalls: Removed unused variable The 'Modified' variable should have been removed from SimplifyLibCalls in r177619, but was missed. This commit removes it. llvm-svn: 177622	2013-03-21 02:44:07 +00:00
Meador Inge	6b6a161ccf	Move library call prototype attribute inference to functionattrs The simplify-libcalls pass implemented a doInitialization hook to infer function prototype attributes for well-known functions. Given that the simplify-libcalls pass is going away and that the functionattrs pass is already in place to deduce function attributes, I am moving this logic to the functionattrs pass. This approach was discussed during patch review: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157465.html. llvm-svn: 177619	2013-03-21 00:55:59 +00:00
Bill Wendling	c77e9440cf	Call the new llvm_gcov_init function to register the environment. Use the new `llvm_gcov_init' function to register the writeout and flush functions. The initialization function will also call `atexit' for some cleanups and final writeout calls. But it does this only once. This is better than checking for the `main' function, because in a library that function may not exist. <rdar://problem/12439551> llvm-svn: 177579	2013-03-20 21:13:59 +00:00
Chandler Carruth	0fad17527b	Fix a silly search-and-replace goof with r177495 that only broke non-release builds. llvm-svn: 177498	2013-03-20 07:40:56 +00:00
Chandler Carruth	d177f86124	[SROA] Don't preserve the IR names in release builds. This is especially important because the new SROA pass goes to great lengths to provide helpful names for debugging, and as a consequence they can become very slow to render. Good for between 5% and 15% of the SROA runtime on some slow test cases such as the one in PR15412. llvm-svn: 177495	2013-03-20 07:30:36 +00:00
Chandler Carruth	0941b66283	Move the endif to the correct line so we don't have warnings about unused statistics variables. llvm-svn: 177494	2013-03-20 06:47:00 +00:00
Chandler Carruth	5f5b616344	Introduce some new statistics to help track the exact behavior of the new SROA pass. llvm-svn: 177493	2013-03-20 06:30:46 +00:00
Quentin Colombet	2393cb92b8	Update global merge pass according to Duncan's advice: - Remove useless includes - Change misleading comments - Move code into doFinalization llvm-svn: 177445	2013-03-19 21:46:49 +00:00
Bill Wendling	04d57c7b2c	Register the GCOV writeout functions so that they're emitted serially. We don't want to write out >1000 files at the same time. That could make things prohibitively expensive. Instead, register the "writeout" function so that it's emitted serially. <rdar://problem/12439551> llvm-svn: 177437	2013-03-19 21:03:22 +00:00
Arnaud A. de Grandmaison	87c473f0d1	IndVarSimplify: do not recompute an IV value outside of the loop if : - it is trivially known to be used inside the loop in a way that can not be optimized away - there is no use outside of the loop which can take advantage of the computation hoisting llvm-svn: 177432	2013-03-19 20:00:22 +00:00
Andrew Trick	f3a2544dba	Revert "Cleanup some SCEV logic a bit." This reverts commit 82cd8f7382322bee7a71cdc31f7a923c44d37d32. Just add a comment instead! llvm-svn: 177377	2013-03-19 05:10:27 +00:00
Andrew Trick	de78866594	Cleanup some SCEV logic a bit. Make the code more obvious to scan-build and humans. llvm-svn: 177375	2013-03-19 04:14:59 +00:00
Andrew Trick	a1c01ba8c7	Tighten up an internal LSR API that should check for NULL. No test case, but should fix a scan_build warning. llvm-svn: 177374	2013-03-19 04:14:57 +00:00
Nick Lewycky	d67186337a	Emit the linkage name instead of the function name, when available. This means that we'll prefer to emit the mangled C++ name (pending a clang change). llvm-svn: 177371	2013-03-19 01:37:55 +00:00
Jakub Staszak	bc421efddf	Make method private. Keep coding standard. llvm-svn: 177348	2013-03-18 23:31:30 +00:00
Bill Wendling	c3cab816bb	Register the flush function for each compile unit. For each compile unit, we want to register a function that will flush that compile unit. Otherwise, __gcov_flush() would only flush the counters within the current compile unit, and not any outside of it. PR15191 & <rdar://problem/13167507> llvm-svn: 177340	2013-03-18 23:04:39 +00:00
Quentin Colombet	8fc340976d	Extend global merge pass to optionally consider global constant variables. Also add some checks to not merge globals used within landing pad instructions or marked as "used". llvm-svn: 177331	2013-03-18 22:30:07 +00:00
Kostya Serebryany	10cc12f2b7	[asan] when creating string constants, set unnamed_addr and align 1 so that equal strings are merged by the linker. Observed up to 1% binary size reduction. Thanks to Anton Korobeynikov for the suggestion llvm-svn: 177264	2013-03-18 09:38:39 +00:00
Chandler Carruth	f74654d274	Mark internal classes as POD-like to get better behavior out of SmallVector and DenseMap. This speeds up SROA by 25% on PR15412. llvm-svn: 177259	2013-03-18 08:36:46 +00:00
Kostya Serebryany	bd016bb614	[asan] while generating the description of a global variable, emit the module name in a separate field, thus not duplicating this information in every description. This decreases the binary size (observed up to 3%). https://code.google.com/p/address-sanitizer/issues/detail?id=168 . This changes the asan API version. llvm-part llvm-svn: 177254	2013-03-18 08:05:29 +00:00
Kostya Serebryany	6b5b58deeb	[asan] don't instrument functions with available_externally linkage. This saves a bit of compile time and reduces the number of redundant global strings generated by asan (https://code.google.com/p/address-sanitizer/issues/detail?id=167 ) llvm-svn: 177250	2013-03-18 07:33:49 +00:00
Arnold Schwaighofer	c63cf3a0ae	LoopVectorize: Invert case when we use a vector cmp value to query select cost We generate a select with a vectorized condition argument when the condition is NOT loop invariant. Not the other way around. llvm-svn: 177098	2013-03-14 18:54:36 +00:00
Shuxin Yang	2eca602f8b	Perform factorization as a last resort of unsafe fadd/fsub simplification. Rules include: 1) xy +/- xz => x*(y +/- z) (the order of operands doesn't matter) 2) y/x +/- z/x => (y +/- z)/x The transformation is disabled if the new add/sub expr "y +/- z" is a denormal/NaN/infinity. rdar://12911472 llvm-svn: 177088	2013-03-14 18:08:26 +00:00
Alexey Samsonov	819eddc3ce	[ASan] emit instrumentation for initialization order checking by default llvm-svn: 177063	2013-03-14 12:38:58 +00:00
Chandler Carruth	a1c54bbe34	PR14972: SROA vs. GVN exposed a really bad bug in SROA. The fundamental problem is that SROA didn't allow for overly wide loads where the bits past the end of the alloca were masked away and the load was sufficiently aligned to ensure there is no risk of page fault, or other trapping behavior. With such widened loads, SROA would delete the load entirely rather than clamping it to the size of the alloca in order to allow mem2reg to fire. This was exposed by a test case that neatly arranged for GVN to run first, widening certain loads, followed by an inline step, and then SROA which miscompiles the code. However, I see no reason why this hasn't been plaguing us in other contexts. It seems deeply broken. Diagnosing all of the above took all of 10 minutes of debugging. The really annoying aspect is that fixing this completely breaks the pass. ;] There was an implicit reliance on the fact that no loads or stores extended past the alloca once we decided to rewrite them in the final stage of SROA. This was used to encode information about whether the loads and stores had been split across multiple partitions of the original alloca. That required threading explicit tracking of whether a use of a partition is split across multiple partitions. Once that was done, another problem arose: we allowed splitting of integer loads and stores iff they were loads and stores to the entire alloca. This is a really arbitrary limitation, and splitting at least some integer loads and stores is crucial to maximize promotion opportunities. My first attempt was to start removing the restriction entirely, but currently that does Very Bad Things by causing many common alloca patterns to be fully decomposed into i8 operations and lots of or-ing together to produce larger integers on demand. The code bloat is terrifying. 
That is still the right end-goal, but substantial work must be done to either merge partitions or ensure that small i8 values are eagerly merged in some other pass. Sadly, figuring all this out took essentially all the time and effort here. So the end result is that we allow splitting only when the load or store at least covers the alloca. That ensures widened loads and stores don't hurt SROA, and that we don't rampantly decompose operations more than we have previously. All of this was already fairly well tested, and so I've just updated the tests to cover the wide load behavior. I can add a test that crafts the pass ordering magic which caused the original PR, but that seems really brittle and to provide little benefit. The fundamental problem is that widened loads should Just Work. llvm-svn: 177055	2013-03-14 11:32:24 +00:00
Nick Lewycky	307a1d03b5	Remove accidentally committed debug line. llvm-svn: 177005	2013-03-14 05:19:12 +00:00
Nick Lewycky	fdfed3e9c9	Refactor GCOV's six constructor arguments into a struct with a getter that constructs default arguments. It can now take default arguments from cl::opt'ions. Add a new -default-gcov-version=... option, and actually test it! Sink the reverse-order of the version into GCOVProfiling, hiding it from our users. llvm-svn: 177002	2013-03-14 05:13:26 +00:00
Nick Lewycky	ad145509eb	No functionality change. Rename emitGCNO() to the more sensible emitProfileNotes(), similar to emitProfileArcs(). Also update its comment. Also add a comment on Version[4] (there will be another comment in clang later), and compress lines that exceeded 80 columns. llvm-svn: 176994	2013-03-13 22:55:42 +00:00
Arnaud A. de Grandmaison	7153305b92	Fix a performance regression when combining to smaller types in icmp (shl %v, C1), C2 : Only combine when the shl is only used by the icmp llvm-svn: 176950	2013-03-13 14:40:37 +00:00
Dan Gohman	00253592c7	Change the order of the operands in patchAndReplaceAllUsesWith so that they're more consistent with Value::replaceAllUsesWith. llvm-svn: 176872	2013-03-12 16:22:56 +00:00
Meador Inge	20255ef24d	LibCallSimplifier: optimize speed for short-lived instances Nadav reported a performance regression due to the work I did to merge the library call simplifier into instcombine [1]. The issue is that a new LibCallSimplifier object is being created whenever InstCombiner::runOnFunction is called. Every time a LibCallSimplifier object is used to optimize a call it creates a hash table to map from a function name to an object that optimizes functions of that name. For short-lived LibCallSimplifier instances this is quite inefficient. Especially for cases where no calls are actually simplified. This patch fixes the issue by dropping the hash table and implementing an explicit lookup function to correlate the function name to the object that optimizes functions of that name. This avoids the cost of always building and destroying the hash table in cases where the LibCallSimplifier object is short-lived and avoids the cost of building the table when no simplifications are actually performed. On a benchmark containing 100,000 calls where none of them are simplified I noticed a 30% speedup. On a benchmark containing 100,000 calls where all of them are simplified I noticed an 8% speedup. [1] http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130304/167639.html llvm-svn: 176840	2013-03-12 00:08:29 +00:00
Bill Wendling	9534d8885f	Don't remove a landing pad if the invoke requires a table entry. An invoke may require a table entry. For instance, when the function it calls is expected to throw. <rdar://problem/13360379> llvm-svn: 176827	2013-03-11 20:53:00 +00:00
Nick Lewycky	5f50854186	Use LLVMBool instead of 'bool' in the C API. Based on a patch by Peter Zotov! llvm-svn: 176793	2013-03-10 21:58:22 +00:00
Hal Finkel	f610be9f36	BBVectorize: Fixup debugging statements After the recent data-structure improvements, a couple of debugging statements were broken (printing pointer values). llvm-svn: 176791	2013-03-10 20:57:42 +00:00
Benjamin Kramer	6eda79f69a	Remove a source of nondeterminism from the LoopVectorizer. This made us emit runtime checks in a random order. Hopefully bootstrap miscompares will go away now. llvm-svn: 176775	2013-03-09 19:22:40 +00:00
Arnold Schwaighofer	8b3dc09400	LoopVectorizer: Ignore all dbg intrinsics Ignore all DbgIntriniscInfo instructions instead of just DbgValueInst. llvm-svn: 176769	2013-03-09 16:27:27 +00:00
Arnold Schwaighofer	4090b61ac3	LoopVectorizer: Ignore dbg.value instructions We want vectorization to happen at -g. Ignore calls to the dbg.value intrinsic and don't transfer them to the vectorized code. radar://13378964 llvm-svn: 176768	2013-03-09 15:56:34 +00:00
Jakub Staszak	2ef36b633b	Simplify code. No functionality change. llvm-svn: 176765	2013-03-09 11:18:59 +00:00
Nick Lewycky	291df6ec42	Use the correct index variable. This is the meat of what was supposed to be in r176751. Also, learn a lesson about applying patches by hand/eyeball. llvm-svn: 176764	2013-03-09 10:13:26 +00:00
Nick Lewycky	03aed11cdb	Fix bug introduced in r176616 when making function identifier numbers stable. Count the subprograms, not the compile units. llvm-svn: 176751	2013-03-09 02:06:37 +00:00
Nick Lewycky	88f1d0d64e	Don't emit the extra checksum into the .gcda file if the user hasn't asked for it. Fortunately, versions of gcov that predate the extra checksum also ignore any extra data, so this isn't a problem. There will be a matching commit in compiler-rt. llvm-svn: 176745	2013-03-09 01:33:06 +00:00
Benjamin Kramer	37c2d65c5a	Insert the reduction start value into the first bypass block to preserve domination. Fixes PR15344. llvm-svn: 176701	2013-03-08 16:58:37 +00:00
Jakub Staszak	fd56611b49	Keep coding standard. llvm-svn: 176661	2013-03-07 22:20:06 +00:00
Jakub Staszak	db4579d796	Don't create IRBuilder if we can return from the method earlier. llvm-svn: 176660	2013-03-07 22:10:33 +00:00
Pekka Jaaskelainen	093cf41e86	Fixed a crash when cloning a function into a function with different size argument list and without attributes in the arguments. llvm-svn: 176632	2013-03-07 16:46:43 +00:00
Nick Lewycky	492afe8127	Switch from a version 4.2/4.4 switch to a four-byte version string to be put into the actual gcov file. Instead of using the bottom 4 bytes as the function identifier, use a counter. This makes the identifier numbers stable across multiple runs. llvm-svn: 176616	2013-03-07 08:28:49 +00:00
Andrew Trick	a0a5ca06b9	SimplifyCFG fix for volatile load/store. Fixes rdar:13349374. Volatile loads and stores need to be preserved even if the language standard says they are undefined. "volatile" in this context means "get out of the way compiler, let my platform handle it". Additionally, this is the only way I know of with llvm to write to the first page (when hardware allows) without dropping to assembly. llvm-svn: 176599	2013-03-07 01:03:35 +00:00
Andrew Trick	fcb37243f9	Generalize my previous fix for -print-options. Always print options that differ from their implicit default. At least for simple option types. llvm-svn: 176572	2013-03-06 19:04:56 +00:00
Andrew Trick	946c2b32e6	Give -loop-vectorize an explicit default. This way, clang -mllvm -print-options shows that the driver is overriding it. llvm-svn: 176569	2013-03-06 18:22:22 +00:00
Jim Grosbach	95d2eb95c3	InstCombine: Don't shrink allocas when combining with a bitcast. When considering folding a bitcast of an alloca into the alloca itself, make sure we don't shrink the amount of memory being allocated, or things rapidly go sideways. rdar://13324424 llvm-svn: 176547	2013-03-06 05:44:53 +00:00
Lang Hames	30be8a30cc	Check isDiscardableIfUnused, rather than hasLocalLinkage, when bumping GlobalValue linkage up to ExternalLinkage in the ExtractGV pass. This prevents linkonce and linkonce_odr symbols from being DCE'd. llvm-svn: 176459	2013-03-04 22:40:44 +00:00
Preston Gurd	485296d1e8	Bypass Slow Divides * Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442	2013-03-04 18:13:57 +00:00
Nadav Rotem	739e37a0d2	PR14448 - prevent the loop vectorizer from vectorizing the same loop twice. The LoopVectorizer often runs multiple times on the same function due to inlining. When this happens the loop vectorizer often vectorizes the same loops multiple times, increasing code size and adding unneeded branches. With this patch, the vectorizer during vectorization puts metadata on scalar loops and marks them as 'already vectorized' so that it knows to ignore them when it sees them a second time. PR14448. llvm-svn: 176399	2013-03-02 01:33:49 +00:00
Peter Collingbourne	1b97a9c82a	Modify {Call,Invoke}Inst::addAttribute to take an AttrKind. llvm-svn: 176397	2013-03-02 01:20:18 +00:00
Benjamin Kramer	12f98fae98	LoopVectorize: Don't hang forever if a PHI only has skipped PHI uses. Fixes PR15384. llvm-svn: 176366	2013-03-01 19:07:31 +00:00
Quentin Colombet	e684a6d4aa	Fix a bug in instcombine for fmul in fast math mode. The instcombine recognized pattern looks like: a = b * c d = a +/- Cst or a = b * c d = Cst +/- a When creating the new operands for fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0). The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1. llvm-svn: 176300	2013-02-28 21:12:40 +00:00
Evgeniy Stepanov	00062b4498	[msan] Implement sanitize_memory attribute. Shadow checks are disabled and memory loads always produce fully initialized values in functions that don't have a sanitize_memory attribute. Value and argument shadow is propagated as usual. This change also updates blacklist behaviour to match the above. llvm-svn: 176247	2013-02-28 11:25:14 +00:00
Evgeniy Stepanov	4c9300e630	Remove unused leftover declarations. llvm-svn: 176240	2013-02-28 08:42:11 +00:00
Benjamin Kramer	dc145816fd	LoopVectorize: Vectorize math builtin calls. This properly asks TargetLibraryInfo if a call is available and if it is, it can be translated into the corresponding LLVM builtin. We don't vectorize sqrt() yet because I'm not sure about the semantics for negative numbers. The other intrinsic should be exact equivalents to the libm functions. Differential Revision: http://llvm-reviews.chandlerc.com/D465 llvm-svn: 176188	2013-02-27 15:24:19 +00:00
Nick Lewycky	6fd43e4071	In GCC 4.7, function names are now forbidden from .gcda files. Support this by passing a null pointer to the function name in to GCDAProfiling, and add another switch onto GCOVProfiling. llvm-svn: 176173	2013-02-27 06:22:56 +00:00
Nick Lewycky	625f395663	Doh, fix behaviour change introduced in r176168 which is tested in clang, not llvm. llvm-svn: 176172	2013-02-27 06:21:30 +00:00
Nadav Rotem	464e807d41	For each function that we optimize we initialize a new list of lib functions. For each function name we malloc memory. This patch changes the Libcall map to use BumpPtrAllocator. Now we malloc only once. This speeds up instcombine by a few % on a large c++ program. llvm-svn: 176170	2013-02-27 05:53:43 +00:00
Nick Lewycky	8e94d80aab	IRBuilder has grown all sorts of useful utility functions. Make use of them to clean up this code a tiny bit. No functionality change. llvm-svn: 176168	2013-02-27 05:46:30 +00:00
Pedro Artigas	e40467b589	Enhance integer division emulation support to handle types smaller than 32 bits. The enhancement is done the trivial way, by extending inputs and truncating outputs, which is adequate for targets with little or no support for integer arithmetic on integer types less than 32 bits. llvm-svn: 176139	2013-02-26 23:33:20 +00:00
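The widen-then-truncate approach described in r176139 can be sketched as follows. This is only an illustration in Python, not the LLVM implementation: `to_signed`, `sdiv_wide`, `sdiv_narrow` and `udiv_narrow` are hypothetical names, and `sdiv_wide` stands in for whatever 32-bit division routine already exists on the target.

```python
def to_signed(value, bits):
    """Read the low `bits` bits of `value` as two's complement (sign extension)."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sdiv_wide(lhs, rhs):
    """Hypothetical stand-in for the existing wide signed division.

    Truncates toward zero, as signed division does in LLVM IR.
    """
    quotient = abs(lhs) // abs(rhs)
    return -quotient if (lhs < 0) != (rhs < 0) else quotient


def sdiv_narrow(lhs, rhs, bits):
    """Divide two `bits`-wide signed values: sign-extend, divide, truncate."""
    quotient = sdiv_wide(to_signed(lhs, bits), to_signed(rhs, bits))
    return quotient & ((1 << bits) - 1)  # truncate back to the narrow width


def udiv_narrow(lhs, rhs, bits):
    """Unsigned variant: zero-extend, divide, truncate."""
    mask = (1 << bits) - 1
    return ((lhs & mask) // (rhs & mask)) & mask
```

For example, the 8-bit pattern `0xF9` is -7 when read as signed, so `sdiv_narrow(0xF9, 2, 8)` yields `0xFD` (-3), while `udiv_narrow(0xF9, 2, 8)` treats the same bits as 249 and yields 124.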
Kostya Serebryany	cf880b9443	Unify clang/llvm attributes for asan/tsan/msan (LLVM part) These are two related changes (one in llvm, one in clang). LLVM: - rename address_safety => sanitize_address (the enum value is the same, so we preserve binary compatibility with old bitcode) - rename thread_safety => sanitize_thread - rename no_uninitialized_checks -> sanitize_memory CLANG: - add __attribute__((no_sanitize_address)) as a synonym for __attribute__((no_address_safety_analysis)) - add __attribute__((no_sanitize_thread)) - add __attribute__((no_sanitize_memory)) for S in address thread memory If -fsanitize=S is present and __attribute__((no_sanitize_S)) is not set llvm attribute sanitize_S llvm-svn: 176075	2013-02-26 06:58:09 +00:00
Benjamin Kramer	ee40b9a2d4	CVP: If we have a PHI with an incoming select, try to skip the select. This is a common pattern with dyn_cast and similar constructs, when the PHI no longer depends on the select it can often be turned into a simpler construct or even get hoisted out of the loop. PR15340. llvm-svn: 175995	2013-02-24 15:34:43 +00:00
Michael Gottesman	f4b7761ed7	Fixed a careless mistake. rdar://13273675. llvm-svn: 175939	2013-02-23 00:31:32 +00:00
Bill Wendling	09bd1f71ee	Implement the NoBuiltin attribute. The 'nobuiltin' attribute is applied to call sites to indicate that LLVM should not treat the callee function as a built-in function. I.e., it shouldn't try to replace that function with different code. llvm-svn: 175835	2013-02-22 00:12:35 +00:00
Renato Golin	cf928cb53f	Allow GlobalValues to vectorize with AliasAnalysis Storing the load/store instructions with the values and inspect them using Alias Analysis to make sure they don't alias, since the GEP pointer operand doesn't take the offset into account. Trying hard to not add any extra cost to loads and stores that don't overlap on global values, AA is only calculated if all of the previous attempts failed. Using biggest vector register size as the stride for the vectorization access, as we're being conservative and the cost model (which calculates the real vectorization factor) is only run after the legalization phase. We might re-think this relationship in the future, but for now, I'd rather be safe than sorry. llvm-svn: 175818	2013-02-21 22:39:03 +00:00
Chad Rosier	9b7f9c3e9e	Remove dead code and whitespace. llvm-svn: 175804	2013-02-21 21:40:51 +00:00
Chad Rosier	4d87d45a05	Update a comment that looks to have been accidentally deleted many moons ago. llvm-svn: 175658	2013-02-20 20:15:55 +00:00
Kostya Serebryany	699ac28aa5	[asan] instrument invoke insns with noreturn attribute (as well as call insns) llvm-svn: 175617	2013-02-20 12:35:15 +00:00
Jakub Staszak	ae2fd9c97d	Remove unused variable. llvm-svn: 175568	2013-02-19 22:17:58 +00:00
Jakub Staszak	3c6583a1b1	Minor cleanups. No functionality change. llvm-svn: 175567	2013-02-19 22:14:45 +00:00
Jakub Staszak	90fbe91c58	Remove unneeded #includes. llvm-svn: 175565	2013-02-19 22:06:38 +00:00
Jakub Staszak	086f6cde5d	Fix typos. llvm-svn: 175562	2013-02-19 22:02:21 +00:00
Kostya Serebryany	3ece9beaf1	[asan] instrument memory accesses with unusual sizes This patch makes asan instrument memory accesses with unusual sizes (e.g. 5 bytes or 10 bytes), e.g. long double or packed structures. Instrumentation is done with two 1-byte checks (first and last bytes) and if the error is found __asan_report_load_n(addr, real_size) or __asan_report_store_n(addr, real_size) is called. Also, call these two new functions in memset/memcpy instrumentation. asan-rt part will follow. llvm-svn: 175507	2013-02-19 11:29:21 +00:00
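The first-byte/last-byte check described in r175507 can be pictured with a toy model. The sketch below is an assumption-laden simplification: it represents poisoned memory as a plain Python set of addresses rather than real shadow memory, and `check_unusual_access` is a hypothetical helper, not an asan function.

```python
def check_unusual_access(poisoned, addr, size):
    """Check an access of an unusual size with two 1-byte probes.

    `poisoned` is a set of byte addresses that must not be touched
    (a stand-in for shadow memory). Returns the (addr, size) pair that
    would be reported, or None if both probes pass.
    """
    first_byte = addr
    last_byte = addr + size - 1
    if first_byte in poisoned or last_byte in poisoned:
        return (addr, size)  # the real size is passed along to the report
    return None
```

Because only the two end bytes are probed, a poisoned byte strictly inside the accessed range is not caught in this model: `check_unusual_access({8}, 6, 5)` returns `None`, whereas `check_unusual_access({10}, 6, 5)` returns `(6, 5)`.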
Bill Wendling	c98e4fef1a	Temporarily revert r175470 for more review. llvm-svn: 175476	2013-02-19 00:52:45 +00:00
Bill Wendling	66651e4c2f	Check to see if the 'no-builtin' attribute is set before simplifying a library call. llvm-svn: 175470	2013-02-18 23:17:16 +00:00
Kostya Serebryany	7ca384bc1a	[asan] revert r175266 as it breaks code with packed structures. supporting long double will require a more general solution llvm-svn: 175442	2013-02-18 13:47:02 +00:00
Hal Finkel	76e65e4542	BBVectorize: Fix an invalid reference bug This fixes PR15289. This bug was introduced (recently) in r175215; collecting all std::vector references for candidate pairs to delete at once is invalid because subsequent lookups in the owning DenseMap could invalidate the references. bugpoint was able to reduce a useful test case. Unfortunately, because whether or not this asserts depends on memory layout, this test case will sometimes appear to produce valid output. Nevertheless, running under valgrind will reveal the error. llvm-svn: 175397	2013-02-17 15:59:26 +00:00
Bill Wendling	23242098e7	The transform is: (or (bool?A:B),(bool?C:D)) --> (bool?(or A,C):(or B,D)) By the time the OR is visited, both the SELECTs have been visited and not optimized and the OR itself hasn't been transformed so we do this transform in the hopes that the new ORs will be optimized. The transform is explicitly disabled for vector-selects until "codegen matures to handle them better". Patch by Muhammad Tauqir! llvm-svn: 175380	2013-02-16 23:41:36 +00:00
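The identity behind the r175380 transform can be checked exhaustively on small values. The Python sketch below is illustrative only and models the two selects as sharing one condition, as the pattern in the message does; the function names are made up for this example.

```python
from itertools import product


def or_of_selects(cond, a, b, c, d):
    """Original form: (or (cond ? a : b), (cond ? c : d))."""
    return (a if cond else b) | (c if cond else d)


def select_of_ors(cond, a, b, c, d):
    """Transformed form: (cond ? (or a, c) : (or b, d))."""
    return (a | c) if cond else (b | d)


def forms_agree(bits=2):
    """Compare both forms for every combination of `bits`-wide operands."""
    values = range(1 << bits)
    return all(
        or_of_selects(cond, a, b, c, d) == select_of_ors(cond, a, b, c, d)
        for cond in (False, True)
        for a, b, c, d in product(values, repeat=4)
    )
```

`forms_agree()` returns `True`, which matches the intuition that whichever arm the condition picks, both forms OR together the same pair of operands.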
Jakub Staszak	11bd83551c	Reduce indents in LSRInstance::NarrowSearchSpaceByCollapsingUnrolledCode method. No functionality change. llvm-svn: 175364	2013-02-16 16:08:15 +00:00
Hal Finkel	89909397a1	BBVectorize: Call a DAG a DAG instead of a tree Several functions and variable names used the term 'tree' to refer to what is actually a DAG. Correcting this mistake will, hopefully, prevent confusion in the future. No functionality change intended. llvm-svn: 175278	2013-02-15 17:20:54 +00:00
Arnaud A. de Grandmaison	1fd843eee7	Fix refactoring mistake in "Teach InstCombine to work with smaller legal types..." llvm-svn: 175273	2013-02-15 15:18:17 +00:00
Arnaud A. de Grandmaison	61c167c62b	Teach InstCombine to work with smaller legal types in icmp (shl %v, C1), C2 It enables to work with a smaller constant, which is target friendly for those which can compare to immediates. It also avoids inserting a shift in favor of a trunc, which can be free on some targets. This used to work until LLVM-3.1, but regressed with the 3.2 release. llvm-svn: 175270	2013-02-15 14:35:47 +00:00
Kostya Serebryany	a968568165	[asan] support long double on 64-bit. See https://code.google.com/p/address-sanitizer/issues/detail?id=151 llvm-svn: 175266	2013-02-15 12:46:06 +00:00
Benjamin Kramer	6ecb1e78a9	Make helpers static. Add missing include so LLVMInitializeObjCARCOpts gets C linkage. llvm-svn: 175264	2013-02-15 12:30:38 +00:00
Hal Finkel	283f4f0e66	BBVectorize: Cap the number of candidate pairs in each instruction group For some basic blocks, it is possible to generate many candidate pairs for relatively few pairable instructions. When many (tens of thousands) of these pairs are generated for a single instruction group, the time taken to generate and rank the different vectorization plans can become quite large. As a result, we now cap the number of candidate pairs within each instruction group. This is done by closing out the group once the threshold is reached (set now at 3000 pairs). Although this will limit the overall compile-time impact, this may not be the best way to achieve this result. It might be better, for example, to prune excessive candidate pairs after the fact to prevent the generation of short, but highly-connected groups. We can experiment with this in the future. This change reduces the overall compile-time slowdown of the csa.ll test case in PR15222 to ~5x. If 5x is still considered too large, a lower limit can be used as the default. This represents a functionality change, but only for very large inputs (thus, there is no regression test). llvm-svn: 175251	2013-02-15 04:28:42 +00:00
Hal Finkel	e7a1ef422b	BBVectorize: Remove the remaining instances of std::multimap All instances of std::multimap have now been replaced by DenseMap<K, std::vector<V> >, and this yields a speedup of 5% on the csa.ll test case from PR15222. No functionality change intended. llvm-svn: 175216	2013-02-14 22:38:04 +00:00
Hal Finkel	c3a4425c34	BBVectorize: Don't store candidate pairs in a std::multimap This is another commit on the road to removing std::multimap from BBVectorize. This gives an ~1% speedup on the csa.ll test case in PR15222. No functionality change intended. llvm-svn: 175215	2013-02-14 22:37:09 +00:00
Bill Wendling	7297b864a4	Retain the name of the new internal global that's been shrunk. It's possible (e.g. after an LTO build) that an internal global may be used for debugging purposes. If that's the case appending a '.b' to it makes it hard to find that variable. Steal the name from the old GV before deleting it so that they can find that variable again. llvm-svn: 175104	2013-02-13 23:00:51 +00:00
Benjamin Kramer	0aa2ad6104	LoopVectorize: Simplify code for clarity. No functionality change. llvm-svn: 175076	2013-02-13 21:12:29 +00:00
Pekka Jaaskelainen	0d23725a8d	Metadata for annotating loops as parallel. The first consumer for this metadata is the loop vectorizer. See the documentation update for more info. llvm-svn: 175060	2013-02-13 18:08:57 +00:00
Kostya Serebryany	caf11af9d3	[asan] fix confusing indentation llvm-svn: 175033	2013-02-13 05:14:12 +00:00
Arnaud A. de Grandmaison	2e4df4f7c2	Fix comment visitSExt is an adapted copy of the related visitZExt method, so adapt the comment accordingly. llvm-svn: 175019	2013-02-13 00:19:19 +00:00
Michael Gottesman	27029f4642	Changed isStoredObjCPointer => IsStoredObjCPointer. No functionality change. llvm-svn: 175017	2013-02-12 23:35:08 +00:00
Dan Gohman	a6307574d6	Actually delete this code, since it's really not clear what it's trying to do. llvm-svn: 175014	2013-02-12 22:26:41 +00:00
Dan Gohman	f377160d2f	Record PRE predecessors with a SmallVector instead of a DenseMap, and avoid a second pred_iterator traversal. llvm-svn: 175001	2013-02-12 19:49:10 +00:00
Dan Gohman	2001cd8f9e	When disabling PRE for a value is directly redundant with itself (through a loop), don't continue to iterate through the remaining predecessors. llvm-svn: 174994	2013-02-12 19:05:10 +00:00
Dan Gohman	fd41de0b10	Check that pointers are removed from maps before calling delete on the pointers, for tidiness' sake. llvm-svn: 174988	2013-02-12 18:44:43 +00:00
Dan Gohman	f60667020a	Minor code simplification. llvm-svn: 174985	2013-02-12 18:38:36 +00:00
Alexander Potapenko	259e8127ad	[ASan] Do not use kDefaultShort64bitShadowOffset on Mac, where the binaries may get mapped at 0x100000000+ and thus may interleave with the shadow. llvm-svn: 174964	2013-02-12 12:41:12 +00:00
Kostya Serebryany	be73337ad2	[asan] change the default mapping offset on x86_64 to 0x7fff8000. This gives roughly 5% speedup. Since this is an ABI change, bump the asan ABI version by renaming __asan_init to __asan_init_v1. llvm part, compiler-rt part will follow llvm-svn: 174957	2013-02-12 11:11:02 +00:00
Hal Finkel	6ae564b4a0	BBVectorize: Don't over-search when building the dependency map When building the pairable-instruction dependency map, don't search past the last pairable instruction. For large blocks that have been divided into multiple instruction groups, searching past the last instruction in each group is very wasteful. This gives a 32% speedup on the csa.ll test case from PR15222 (when using 50 instructions in each group). No functionality change intended. llvm-svn: 174915	2013-02-11 23:02:17 +00:00
Hal Finkel	39a95032d2	BBVectorize: Omit unnecessary entries in PairableInstUsers This map is queried only for instructions in pairs of pairable instructions; so make sure that only pairs of pairable instructions are added to the map. This gives a 3.5% speedup on the csa.ll test case from PR15222. No functionality change intended. llvm-svn: 174914	2013-02-11 23:02:09 +00:00
Michael Ilseman	74a6da963b	Optimization: bitcast (<1 x ...> insertelement ..., X, ...) to ... ==> bitcast X to ... llvm-svn: 174905	2013-02-11 21:41:44 +00:00
Hal Finkel	0b8ae895b4	BBVectorize: Eliminate one more restricted linear search This eliminates one more linear search over a range of std::multimap entries. This gives a 22% speedup on the csa.ll test case from PR15222. No functionality change intended. llvm-svn: 174893	2013-02-11 17:19:34 +00:00
Kostya Serebryany	c5f44bc62d	[asan] added a flag -mllvm asan-short-64bit-mapping-offset=1 (0 by default) This flag makes asan use a small (<2G) offset for 64-bit asan shadow mapping. On x86_64 this saves us a register, thus achieving ~2/3 of the zero-base-offset's benefits in both performance and code size. Thanks Jakub Jelinek for the idea. llvm-svn: 174886	2013-02-11 14:36:01 +00:00
Hal Finkel	cb268f7995	BBVectorize: Remove the linear searches from pair connection searching This removes the last of the linear searches over ranges of std::multimap iterators, giving a 7% speedup on the doduc.bc input from PR15222. No functionality change intended. llvm-svn: 174859	2013-02-11 05:29:51 +00:00
Hal Finkel	fee38f9754	BBVectorize: Avoid linear searches within the load-move set This is another cleanup aimed at eliminating linear searches in ranges of std::multimap. No functionality change intended. llvm-svn: 174858	2013-02-11 05:29:49 +00:00
Hal Finkel	dd4bc66593	BBVectorize: isa/cast cleanup in getInstructionTypes Profiling suggests that getInstructionTypes is performance-sensitive, this cleans up some double-casting in that function in favor of using dyn_cast. No functionality change intended. llvm-svn: 174857	2013-02-11 05:29:48 +00:00
Hal Finkel	c1cc166948	BBVectorize: Make the bookkeeping to support full cycle checking less expensive By itself, this does not have much of an effect, but only because in the default configuration the full cycle checks are used only for small problem sizes. This is part of a general cleanup of uses of iteration over std::multimap ranges only for the purpose of checking membership. No functionality change intended. llvm-svn: 174856	2013-02-11 05:29:41 +00:00
Andrew Trick	bc7059032b	LSR IVChain improvement. Handle chains in which the same offset is used for both loads and stores to the same array. Fixes rdar://11410078. llvm-svn: 174789	2013-02-09 01:11:01 +00:00
Jakub Staszak	f23980aba5	Remove #includes from the commonly used LoopInfo.h. llvm-svn: 174786	2013-02-09 01:04:28 +00:00
Bob Wilson	bfb44ef9cb	Revert "Add LLVMContext::emitWarning methods and use them. <rdar://problem/12867368>" This reverts r171041. This was a nice idea that didn't work out well. Clang warnings need to be associated with warning groups so that they can be selectively disabled, promoted to errors, etc. This simplistic patch didn't allow for that. Enhancing it to provide some way for the backend to specify a front-end warning type seems like overkill for the few uses of this, at least for now. llvm-svn: 174748	2013-02-08 21:48:29 +00:00
Hal Finkel	dd2721842d	BBVectorize: Use TTI->getAddressComputationCost This is a follow-up to the cost-model change in r174713 which splits the cost of a memory operation between the address computation and the actual memory access. In r174713, this cost is always added to the memory operation cost, and so BBVectorize will do the same. Currently, this new cost function is used only by ARM, and I don't have any ARM test cases for BBVectorize. Assistance in generating some good ARM test cases for BBVectorize would be greatly appreciated! llvm-svn: 174743	2013-02-08 21:13:39 +00:00
Chad Rosier	22d275f7b8	[SimplifyLibCalls] Library call simplification doesn't work if the call site isn't using the default calling convention. However, if the transformation is from a call to inline IR, then the calling convention doesn't matter. rdar://13157990 llvm-svn: 174724	2013-02-08 18:00:14 +00:00
Jakob Stoklund Olesen	479e5a9313	Typos. llvm-svn: 174723	2013-02-08 17:43:32 +00:00
Arnold Schwaighofer	594fa2dc2b	ARM cost model: Address computation in vector mem ops not free Adds a function to target transform info to query for the cost of address computation. The cost model analysis pass now also queries this interface. The code in LoopVectorize adds the cost of address computation as part of the memory instruction cost calculation. Only there, we know whether the instruction will be scalarized or not. Increase the penalty for inserting into D registers on swift. This becomes necessary because we now always assume that address computation has a cost and three is a closer value to the architecture. radar://13097204 llvm-svn: 174713	2013-02-08 14:50:48 +00:00
Michael Kuperstein	f63b77be7f	Test Commit llvm-svn: 174709	2013-02-08 12:58:29 +00:00
Andrew Trick	1bd53c3675	Revert "Have InstCombine call SipmlifyCall when handling calls. Test case included." This reverts commit 3854a5d90fee52af1065edbed34521fff6cdc18d. This causes a clang unit test to hang: vtable-available-externally.cpp. llvm-svn: 174692	2013-02-08 01:55:39 +00:00
Michael Ilseman	6092dc5455	Have InstCombine call SipmlifyCall when handling calls. Test case included. llvm-svn: 174675	2013-02-07 23:01:35 +00:00
Nadav Rotem	a9100f3609	fix 80-col violation and fix the docs. llvm-svn: 174671	2013-02-07 22:34:07 +00:00
Arnold Schwaighofer	3476fc8c82	Loop Vectorizer: Refactor Memory Cost Computation We don't want too many classes in a pass and the classes obscure the details. I was going a little overboard with object modeling here. Replace classes by generic code that handles both loads and stores. No functionality change intended. llvm-svn: 174646	2013-02-07 19:05:21 +00:00
Michael Gottesman	697d8b9a26	Moved some comments due to the recent refactoring of ObjCARC. 1. Moved a comment from ObjCARCOpts.cpp -> ObjCARCContract.cpp. 2. Removed a comment from ObjCARCOpts.cpp that was already moved to ObjCARCAliasAnalysis.h/.cpp. llvm-svn: 174581	2013-02-07 04:12:57 +00:00
Michael Ilseman	1dd6f2a5ba	Preserve fast-math flags after reassociation and commutation. Update test cases llvm-svn: 174571	2013-02-07 01:40:15 +00:00
Benjamin Kramer	944e0abf04	InstCombine: Fix and simplify the inttoptr side too. llvm-svn: 174438	2013-02-05 20:22:40 +00:00
Michael Gottesman	415ddd7e13	Removed explicit inline as per the LLVM style guide. llvm-svn: 174432	2013-02-05 19:32:18 +00:00
Benjamin Kramer	e477875873	InstCombine: Harden code to work with vectors of pointers and simplify it a bit. Found by running instcombine on a fabricated test case for the constant folder. llvm-svn: 174430	2013-02-05 19:21:56 +00:00
Arnold Schwaighofer	3be40b56c5	Loop Vectorizer: Refactor code to compute vectorized memory instruction cost Introduce a helper class that computes the cost of memory access instructions. No functionality change intended. llvm-svn: 174422	2013-02-05 18:46:41 +00:00
Chad Rosier	92a54f6d4c	[SjLj Prepare] When demoting an invoke instructions to the stack, if the normal edge is critical, then split it so we can insert the store. rdar://13126179 llvm-svn: 174418	2013-02-05 18:23:10 +00:00
Arnold Schwaighofer	22174f5d5a	Loop Vectorizer: Handle pointer stores/loads in getWidestType() In the loop vectorizer cost model, we used to ignore stores/loads of a pointer type when computing the widest type within a loop. This meant that if we had only stores/loads of pointers in a loop we would return a widest type of 8bits (instead of 32 or 64 bit) and therefore a vector factor that was too big. Now, if we see a consecutive store/load of pointers we use the size of a pointer (from data layout). This problem occurred in SingleSource/Benchmarks/Shootout-C++/hash.cpp (reduced test case is the first test in vector_ptr_load_store.ll). radar://13139343 llvm-svn: 174377	2013-02-05 15:08:02 +00:00
Nick Lewycky	535d97cc86	Revert accidental commit (ran svn commit from wrong directory). llvm-svn: 174241	2013-02-02 00:25:26 +00:00
Nick Lewycky	a8c77e4266	This patch makes "&Cls::purevfn" not an odr use. This isn't what the standard says, but that's a defect (to be filed). "Cls::purevfn()" is still an odr use. Also fixes a bug in the previous patch that caused us to not mark the function referenced just because we didn't want to mark it odr used. llvm-svn: 174240	2013-02-02 00:22:37 +00:00
Preston Gurd	25c3b6acc0	This patch aims to improve compile time performance by increasing the SCEV vector size in LoopStrengthReduce. It is observed that the BaseRegs vector size is 4 in most cases, and elements are frequently copied when it is initialized as SmallVector<const SCEV *, 2> BaseRegs. Our benchmark results show that the compilation time performance improved by ~0.5%. Patch by Wan Xiaofei. llvm-svn: 174219	2013-02-01 20:41:27 +00:00
Nadav Rotem	4349f6963e	Revert r174152. The shift amount may overflow and in that case this transformation is illegal. llvm-svn: 174156	2013-02-01 07:59:33 +00:00
Nadav Rotem	1d584029ae	Optimize shift lefts of a constant by a value plus constant into a single shift. llvm-svn: 174152	2013-02-01 06:45:40 +00:00
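One way to see why r174156 reverted r174152 is to model the shift at a small bit width. The sketch below assumes the combine rewrote `C1 << (x + C2)` as `(C1 << C2) << x`; that exact shape is a guess from the two messages, not taken from the patch. It uses 8-bit values and returns `None` where a shift amount is out of range and the result is therefore not defined.

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl(value, amount):
    """Model an 8-bit shift left; None when the amount is out of range."""
    if amount >= BITS:
        return None
    return (value << amount) & MASK


def shift_of_sum(c1, x, c2):
    """Assumed original form: c1 << (x + c2), where the add wraps at 8 bits."""
    return shl(c1, (x + c2) & MASK)


def combined_shift(c1, x, c2):
    """Assumed rewritten form: (c1 << c2) << x, with c1 << c2 folded up front."""
    folded = shl(c1, c2)
    return None if folded is None else shl(folded, x)
```

For small `x` the two agree, e.g. both give 24 for `c1=3, x=2, c2=1`. When the add wraps, they do not: with `x=255` and `c2=1` the sum wraps to 0, so `shift_of_sum(3, 255, 1)` is 3, while `combined_shift(3, 255, 1)` shifts by 255 and has no defined result.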
Manman Ren	aec2ce7db4	Linker: correctly link in dbg.declare This is a re-worked version of r174048. Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. rdar://problem/13089880 llvm-svn: 174093	2013-01-31 21:19:18 +00:00
Alexey Samsonov	5234a8ed9f	Revert r173946. This breaks compilation of googletest with Clang llvm-svn: 174048	2013-01-31 08:02:11 +00:00
Dan Gohman	20a2ae9df5	Change GetPointerBaseWithConstantOffset's DataLayout argument from a reference to a pointer, so that it can handle the case where DataLayout is not available and behave conservatively. llvm-svn: 174024	2013-01-31 02:00:45 +00:00
Bill Wendling	785afdf3a4	Remove addRetAttributes and addFnAttributes, which aren't useful abstractions. llvm-svn: 173992	2013-01-30 23:40:31 +00:00
Bill Wendling	d219675c2a	Convert typeIncompatible to return an AttributeSet. There are still places which treat the Attribute object as a collection of attributes. I'm systematically removing them. llvm-svn: 173990	2013-01-30 23:07:40 +00:00
Manman Ren	81dcc62805	Linker: correctly link in dbg.declare Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. llvm-svn: 173946	2013-01-30 17:42:15 +00:00
Nadav Rotem	513bd8a73c	InstCombine: canonicalize sext-and --> select sext-not-and --> select. Patch by Muhammad Tauqir Ahmad. llvm-svn: 173901	2013-01-30 06:35:22 +00:00
Michael Gottesman	e52dec1695	Made certain small functions in PtrState inlined. llvm-svn: 173842	2013-01-29 22:29:59 +00:00
Pekka Jaaskelainen	f50ab84bb1	LoopVectorize: convert TinyTripCountVectorThreshold constant to a command line switch. llvm-svn: 173837	2013-01-29 21:42:08 +00:00
Michael Gottesman	9bdab2bf6b	Removed trailing comma in last element of enum declaration. llvm-svn: 173836	2013-01-29 21:41:44 +00:00
Michael Gottesman	386241ce5b	Moved S_Stop back to its previous position in the sequence order. llvm-svn: 173834	2013-01-29 21:39:02 +00:00
Michael Gottesman	23cda0cd39	Fixed a few debug messages and some 80+ violations. llvm-svn: 173832	2013-01-29 21:07:53 +00:00
Michael Gottesman	53fd20bdbd	Added some periods to some comments and added an overload for operator<< for type Sequence so I can print out Sequences in debug statements. llvm-svn: 173831	2013-01-29 21:07:51 +00:00
Michael Gottesman	774d2c014e	Changed DoesObjCBlockEscape => DoesRetainableObjPtrEscape so I can use it to perform escape analysis of other retainable object pointers in other locations. llvm-svn: 173829	2013-01-29 21:00:52 +00:00
Edwin Vane	82f80d4967	Fixing warnings revealed by gcc release build Fixed set-but-not-used warnings. Reviewer: gribozavr llvm-svn: 173810	2013-01-29 17:42:24 +00:00
Benjamin Kramer	cf406756ce	LoopVectorize: Clean up ValueMap a bit and avoid double lookups. No intended functionality change. llvm-svn: 173809	2013-01-29 17:31:33 +00:00
Timur Iskhodzhanov	5d7ff00456	Hopefully fix the Windows build failure introduced in r173769 llvm-svn: 173781	2013-01-29 09:09:27 +00:00
Michael Gottesman	1e29ca1501	Fixed 2 more header comments... llvm-svn: 173774	2013-01-29 05:07:18 +00:00
Michael Gottesman	5a8f9e7c54	Fixed header comment. llvm-svn: 173773	2013-01-29 05:05:17 +00:00
Michael Gottesman	23a1ee5f5b	Fixed some whitespace/80+ violations. Also added a space after a namespace declaration. llvm-svn: 173772	2013-01-29 04:58:30 +00:00
Michael Gottesman	7bf48af498	Added missing dashes from header declaration comment. llvm-svn: 173770	2013-01-29 04:53:55 +00:00
Michael Gottesman	13a5f1a8b7	Juggled Debug.h from ObjCARC.h to only the including cpp files that actually have DEBUG statements. Also changed raw_ostream in said header to be a forward declaration (removing an include). llvm-svn: 173769	2013-01-29 04:51:59 +00:00
Michael Gottesman	278266faa8	Sorted includes using utils/sort_includes. llvm-svn: 173767	2013-01-29 04:20:52 +00:00
Michael Gottesman	f823dd2ef7	Added two missing headers from ObjCARCAliasAnalysis.h. This was missed since whenever I was including ObjCARCAliasAnalysis.h, I was including ObjCARC.h before it which included these includes (resulting in no compilation breakage). llvm-svn: 173764	2013-01-29 04:09:24 +00:00
Michael Gottesman	7f387ae6e3	Removed InstCombine/Targets as library dependencies for libObjCARCOpts since they are unnecessary. llvm-svn: 173763	2013-01-29 04:05:17 +00:00
Michael Gottesman	778138e960	Extracted ObjCARCContract from ObjCARCOpts into its own file. This also required adding two headers, DependencyAnalysis.h and ProvenanceAnalysis.h, and a .cpp file, DependencyAnalysis.cpp, to untangle the dependencies between ObjCARCContract and ObjCARCOpts. llvm-svn: 173760	2013-01-29 03:03:03 +00:00
Michael Gottesman	50a622f120	Removed some cruft from ObjCARCAliasAnalysis.cpp. llvm-svn: 173759	2013-01-29 03:02:59 +00:00
Hal Finkel	bf4db4fe11	Unroll again after running BBVectorize Because BBVectorize may significantly shorten a loop body, unroll again after vectorization. This is especially important when using runtime or partial unrolling. llvm-svn: 173730	2013-01-29 00:22:49 +00:00
Renato Golin	1258519674	Vectorization Factor clarification llvm-svn: 173691	2013-01-28 16:02:45 +00:00
Evgeniy Stepanov	6f85ef300d	[msan] Mostly disable msan-handle-icmp-exact. It is way too slow. Change the default option value to 0. Always do exact shadow propagation for unsigned ICmp with constants, it is cheap (under 1% cpu time) and required for correctness. llvm-svn: 173682	2013-01-28 11:42:28 +00:00
Evgeniy Stepanov	52c7b1b98f	Revert r173678. Broken tests. llvm-svn: 173679	2013-01-28 09:18:40 +00:00
Evgeniy Stepanov	5ec2ff57e9	[msan] Make msan-handle-icmp-exact=0 by default. 50% slowdown on one of the specs. llvm-svn: 173678	2013-01-28 09:15:15 +00:00
Michael Gottesman	5ed40afe17	Created ObjCARCUtil.cpp for functions which in my humble opinion are too large to static inline and place in a header file such as ObjCARC.h. llvm-svn: 173666	2013-01-28 06:39:31 +00:00
Michael Gottesman	9bfcf28d88	Cleaned up includes in various ObjCARC files and removed some whitespace violations. llvm-svn: 173663	2013-01-28 05:51:58 +00:00
Michael Gottesman	294e7daaac	Refactor ObjCARCAliasAnalysis into its own file. llvm-svn: 173662	2013-01-28 05:51:54 +00:00
Michael Gottesman	fa0939f790	Refactored out pass ObjCARCAPElim from ObjCARCOpts.cpp => ObjCARCAPElim.cpp. llvm-svn: 173654	2013-01-28 04:12:07 +00:00
Michael Gottesman	283e079fa6	Fixed case insensitive issue. llvm-svn: 173653	2013-01-28 03:35:20 +00:00
Michael Gottesman	0d90b12acc	Removed extraneous doxygen end module statement. llvm-svn: 173652	2013-01-28 03:30:34 +00:00
Michael Gottesman	08904e3ba4	Extracted pass ObjCARCExpand from ObjCARC.cpp => ObjCARCExpand.cpp. I also added the local header ObjCARC.h for common functions used by the various passes. llvm-svn: 173651	2013-01-28 03:28:38 +00:00
Michael Gottesman	79d8d81226	Extracted ObjCARC.cpp into its own library libLLVMObjCARCOpts in preparation for refactoring the ARC Optimizer. llvm-svn: 173647	2013-01-28 01:35:51 +00:00
Hal Finkel	293a41d14f	BBVectorize: Better use of TTI->getShuffleCost When flipping the pair of subvectors that form a vector, if the vector length is 2, we can use the SK_Reverse shuffle kind to get more-accurate cost information. Also we can use the SK_ExtractSubvector shuffle kind to get accurate subvector extraction costs. The current cost model implementations don't yet seem complex enough for this to make a difference (thus, there are no test cases with this commit), but it should help in future. Depending on how the various targets optimize and combine shuffles in practice, we might be able to get more-accurate costs by combining the costs of multiple shuffle kinds. For example, the cost of flipping the subvector pairs could be modeled as two extractions and two subvector insertions. These changes, however, should probably be motivated by specific test cases. llvm-svn: 173621	2013-01-27 20:07:01 +00:00
Chandler Carruth	329b590e6e	Re-revert r173342, without losing the compile time improvements, flat out bug fixes, or functionality preserving refactorings. llvm-svn: 173610	2013-01-27 06:42:03 +00:00
Michael Gottesman	5300cdd8f2	Renamed function IsPotentialUse to IsPotentialRetainableObjPtr. This name change does the following: 1. Causes the function name to use proper ARC terminology. 2. Makes it clear what the function truly does. llvm-svn: 173609	2013-01-27 06:19:48 +00:00
Bill Wendling	3575c8c6d6	Use the AttributeSet instead of AttributeWithIndex. In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the internals of the AttributeSet to outside users, which isn't goodness. llvm-svn: 173602	2013-01-27 02:08:22 +00:00
Bill Wendling	37a52df920	Use the AttributeSet instead of AttributeWithIndex. In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the internals of the AttributeSet to outside users, which isn't goodness. llvm-svn: 173601	2013-01-27 01:57:28 +00:00
Bill Wendling	6eaab61bb5	Use the AttributeSet instead of AttributeWithIndex. In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the internals of the AttributeSet to outside users, which isn't goodness. llvm-svn: 173600	2013-01-27 01:44:34 +00:00
Hal Finkel	2d443e94b4	BBVectorize: Add an additional comment about the cost computation llvm-svn: 173580	2013-01-26 16:49:04 +00:00
Hal Finkel	351a75b6d7	BBVectorize: Fix anomalous capital letter in comment llvm-svn: 173579	2013-01-26 16:49:03 +00:00
Bill Wendling	201d7b2545	Convert BuildLibCalls.cpp to using the AttributeSet methods instead of AttributeWithIndex. llvm-svn: 173536	2013-01-26 00:03:11 +00:00
Bill Wendling	57625a4966	Remove some introspection functions. The 'getSlot' function and its ilk allow introspection into the AttributeSet class. However, that class should be opaque. Allow access through accessor methods instead. llvm-svn: 173522	2013-01-25 23:09:36 +00:00
Nadav Rotem	69a040d3eb	LoopVectorize: Refactor the code that vectorizes loads/stores to remove duplication. llvm-svn: 173500	2013-01-25 21:47:42 +00:00
Bill Wendling	8649283e75	Use the new 'getSlotIndex' method to retrieve the attribute's slot index. llvm-svn: 173499	2013-01-25 21:46:52 +00:00
Benjamin Kramer	21e8da5990	LoopVectorize: Simplify code. No functionality change. llvm-svn: 173475	2013-01-25 19:43:15 +00:00
Pedro Artigas	b95c98faa2	added ability to dynamically change the ExportList of an already created InternalizePass (useful for pass reuse) llvm-svn: 173474	2013-01-25 19:41:03 +00:00
Nadav Rotem	8e9ca2f8cb	LoopVectorizer: Refactor more code to use the IRBuilder. llvm-svn: 173471	2013-01-25 19:26:23 +00:00
Nadav Rotem	c8adf3ff6e	Refactor some code to use the IRBuilder. llvm-svn: 173467	2013-01-25 18:34:09 +00:00
Evgeniy Stepanov	2cb0fa10c2	[msan] A comment on ICmp handling logic. llvm-svn: 173453	2013-01-25 15:35:29 +00:00
Evgeniy Stepanov	fac8403249	[msan] Implement exact shadow propagation for relational ICmp. Only for integers, pointers, and vectors of those. No floats. Instrumentation seems very heavy, and may need to be replaced with some approximation in the future. llvm-svn: 173452	2013-01-25 15:31:10 +00:00
Chandler Carruth	ceff222dea	Switch this code away from Value::isUsedInBasicBlock. That code either loops over instructions in the basic block or the use-def list of the value, neither of which are really efficient when repeatedly querying about values in the same basic block. What's more, we already know that the CondBB is small, and so we can do a much more efficient test by counting the uses in CondBB, and seeing if those account for all of the uses. Finally, we shouldn't blanket fail on any such instruction, instead we should conservatively assume that those instructions are part of the cost. Note that this actually fixes a bug in the pass because isUsedInBasicBlock has a really terrible bug in it. I'll fix that in my next commit, but the fix for it would make this code suddenly take the compile time hit I thought it already was taking, so I wanted to go ahead and migrate this code to a faster & better pattern. The bug in isUsedInBasicBlock was also causing other tests to test the wrong thing entirely: for example we weren't actually disabling speculation for floating point operations as intended (and tested), but the test passed because we failed to speculate them due to the isUsedInBasicBlock failure. llvm-svn: 173417	2013-01-25 05:40:09 +00:00
Michael Gottesman	12780c2d97	Added comment to ObjCARC elaborating what is meant by the term 'Provenance' in 'Provenance Analysis'. llvm-svn: 173374	2013-01-24 21:35:00 +00:00
Benjamin Kramer	1c4e323fdd	Reapply chandlerc's r173342 now that the miscompile it was triggering is fixed. Original commit message: Plug TTI into the speculation logic, giving it a real cost interface that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173357	2013-01-24 16:44:25 +00:00
Chandler Carruth	321c6a7c50	Revert r173342 temporarily. It appears to cause a very late miscompile of stage2 in a bootstrap. Still investigating.... llvm-svn: 173343	2013-01-24 13:24:24 +00:00
Chandler Carruth	5f4519309f	Plug TTI into the speculation logic, giving it a real cost interface that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173342	2013-01-24 12:39:29 +00:00
Chandler Carruth	01bffaad03	Address a large chunk of this FIXME by accumulating the cost for unfolded constant expressions rather than checking each one independently. llvm-svn: 173341	2013-01-24 12:05:17 +00:00
Chandler Carruth	8a21005cca	Switch the constant expression speculation cost evaluation away from a cost function that seems both a bit ad-hoc and also poorly suited to evaluating constant expressions. Notably, it is missing any support for trivial expressions such as 'inttoptr'. I could fix this routine, but it isn't clear to me all of the constraints its other users are operating under. The core protection that seems relevant here is avoiding the formation of a select instruction with a further chain of select operations in a constant expression operand. Just explicitly encode that constraint. Also, update the comments and organization here to make it clear where this needs to go -- this should be driven off of real cost measurements which take into account the number of constant expressions and the depth of the constant expression tree. llvm-svn: 173340	2013-01-24 11:53:01 +00:00
Chandler Carruth	7481ca8ff5	Rephrase the speculating scan of the conditional BB to be phrased in terms of cost rather than hoisting a single instruction. This does not change the cost model! We still set the cost threshold at 1 here, it's just that we track it by accumulating cost rather than by storing an instruction. The primary advantage is that we no longer leave no-op intrinsics in the basic block. For example, this will now move both debug info intrinsics and a single instruction, instead of only moving the instruction and leaving a basic block with nothing but debug info intrinsics in it, and those intrinsics now no longer ordered correctly with the hoisted value. Instead, we now splice the entire conditional basic block's instruction sequence. This also places the code for checking the safety of hoisting next to the code computing the cost. Currently, the only observable side-effect of this change is that debug info intrinsics are no longer abandoned. I'm not sure how to craft a test case for this, and my real goal was the refactoring, but I'll talk to Dave or Eric about how to add a test case for this. llvm-svn: 173339	2013-01-24 11:52:58 +00:00
Kostya Serebryany	e35d59a8d0	[asan] fix 32-bit builds llvm-svn: 173338	2013-01-24 10:43:50 +00:00
Chandler Carruth	76aacbd874	Simplify the PHI node operand rewriting. Previously, the code would scan the PHI nodes and build up a small setvector of candidate value pairs in phi nodes to go and rewrite. Once certain the rewrite could be performed, the code walks the set, and for each one re-scans the entire PHI node list looking for nodes to rewrite operands. Instead, scan the PHI nodes once to check for hazards, and then scan it a second time to rewrite the operands to selects. No set vector, and a max of two scans. The only downside is that we might form identical selects, but instcombine or anything else should fold those easily, and it seems unlikely to happen often. llvm-svn: 173337	2013-01-24 10:40:51 +00:00
Kostya Serebryany	87191f6221	[asan] adaptive redzones for globals (the larger the global the larger is the redzone) llvm-svn: 173335	2013-01-24 10:35:40 +00:00
Chandler Carruth	e2a779f3a7	Give the basic block variables here names based on the if-then-end structure being analyzed. No functionality changed. llvm-svn: 173334	2013-01-24 09:59:39 +00:00
Chandler Carruth	1d20c02f55	Lift a cheap early exit test above loops and other complex early exit tests. No need to pay the high cost when we're never going to do anything. No functionality changed. llvm-svn: 173331	2013-01-24 08:22:40 +00:00
Chandler Carruth	8a4a16618f	Spiff up the comment on this method, making the example a bit more pretty in doxygen, adding some of the details actually present in a classic example where this matters (a loop from gzip and many other compression algorithms), and a cautionary note about the risks inherent in the transform. This has come up on the mailing lists recently, and I suspect folks reading this code could benefit from going and looking at the MI pass that can really deal with these issues. llvm-svn: 173329	2013-01-24 08:05:06 +00:00
Craig Topper	3529aa5fc2	Remove trailing whitespace. llvm-svn: 173322	2013-01-24 05:22:40 +00:00
Benjamin Kramer	e4c46fec73	Revert "InstCombine: Clean up weird code that talks about a modulus that's long gone." This causes crashes during the build of compiler-rt during selfhost. Add a testcase for coverage. llvm-svn: 173279	2013-01-23 17:52:29 +00:00
Benjamin Kramer	cd86115d8a	InstCombine: Clean up weird code that talks about a modulus that's long gone. This does the right thing unless the multiplication overflows, but the old code didn't handle that case either. llvm-svn: 173276	2013-01-23 17:16:22 +00:00
Anton Korobeynikov	4ec3ae78b3	Make sure metarenamer won't rename special stuff (intrinsics and explicitly renamed stuff). Otherwise this might hide the problems. llvm-svn: 173265	2013-01-23 15:03:08 +00:00
Kostya Serebryany	4766fe6f10	[asan] use ADD instead of OR when applying shadow offset of PowerPC. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55975 for details llvm-svn: 173258	2013-01-23 12:54:55 +00:00
Duncan Sands	5924545c0c	Initialize the components of this class. Otherwise GCC thinks that Array may be used uninitialized, since it fails to understand that Array is only used when SingleValue is not, and outputs a warning. It also seems generally safer given that the constructor is non-trivial and has plenty of early exits. llvm-svn: 173242	2013-01-23 09:09:50 +00:00
Bill Wendling	d154e283f2	Add the IR attribute 'sspstrong'. SSPStrong applies a heuristic to insert stack protectors in these situations: * A Protector is required for functions which contain an array, regardless of type or length. * A Protector is required for functions which contain a structure/union which contains an array, regardless of type or length. Note, there is no limit to the depth of nesting. * A protector is required when the address of a local variable (i.e., stack based variable) is exposed. (E.g., such as through a local whose address is taken as part of the RHS of an assignment or a local whose address is taken as part of a function argument.) This patch implements the SSPStrong attribute to be equivalent to SSPRequired. This will change in a subsequent patch. llvm-svn: 173230	2013-01-23 06:41:41 +00:00
Bill Wendling	49bc76cbb3	Remove the last of uses that use the Attribute object as a collection of attributes. Collections of attributes are handled via the AttributeSet class now. This finally frees us up to make significant changes to how attributes are structured. llvm-svn: 173228	2013-01-23 06:14:59 +00:00
Nadav Rotem	ab3e698ee9	Add support for reverse pointer induction variables. These are loops that contain pointers that count backwards. For example, this is the hot loop in BZIP: do { m = --p; p = ( ... ); } while (--n); llvm-svn: 173219	2013-01-23 01:35:00 +00:00
Bill Wendling	430fa9bfb3	Use the AttributeSet when removing multiple attributes. Use Attribute::AttrKind when removing one attribute. This further encapsulates the use of the attributes. llvm-svn: 173214	2013-01-23 00:45:55 +00:00
Bill Wendling	c0e2a1f457	Use the AttributeSet when adding multiple attributes and an Attribute::AttrKind when adding a single attribute to the function. llvm-svn: 173210	2013-01-23 00:20:53 +00:00
Michael Gottesman	8b5515fa1b	Fixed typo. llvm-svn: 173202	2013-01-22 21:53:43 +00:00
Michael Gottesman	9de6f96ad5	[ObjCARC] Refactored out the innermost two loops from PerformCodePlacement into the method ConnectTDBUTraversals. The method PerformCodePlacement was doing too much (i.e. 3x loops, lots of different checking). This refactoring separates the analysis section of the method into a separate function while leaving the actual code placement and analysis preparation in PerformCodePlacement. NOTE Really this part of ObjCARC should be refactored out of the main pass class into its own separate class/struct. But it is not time to make that change yet (don't want to make such an invasive change without fixing all of the bugs first). llvm-svn: 173201	2013-01-22 21:49:00 +00:00
Bill Wendling	09175b39f2	More encapsulation work. Use the AttributeSet when we're talking about more than one attribute. Add a function that adds a single attribute. No functionality change intended. llvm-svn: 173196	2013-01-22 21:15:51 +00:00
Evgeniy Stepanov	dcf6bcb904	[msan] Export the value of msan-keep-going flag for the runtime. llvm-svn: 173156	2013-01-22 13:26:53 +00:00
Evgeniy Stepanov	c4415591ed	[msan] Do not insert check on volatile store. Volatile bitfields can cause valid stores of uninitialized bits. llvm-svn: 173153	2013-01-22 12:30:52 +00:00
Chandler Carruth	0ba8db45c6	Begin fleshing out an interface in TTI for modelling the costs of generic function calls and intrinsics. This is somewhat overlapping with an existing intrinsic cost method, but that one seems targetted at vector intrinsics. I'll merge them or separate their names and use cases in a separate commit. This sinks the test of 'callIsSmall' down into TTI where targets can control it. The whole thing feels very hack-ish to me though. I've left a FIXME comment about the fundamental design problem this presents. It isn't yet clear to me what the users of this function really care about. I'll have to do more analysis to figure that out. Putting this here at least provides it access to proper analysis pass tools and other such. It also allows us to more cleanly implement the baseline cost interfaces in TTI. With this commit, it is now theoretically possible to simplify much of the inline cost analysis's handling of calls by calling through to this interface. That conversion will have to happen in subsequent commits as it requires more extensive restructuring of the inline cost analysis. The CodeMetrics class is now really only in the business of running over a block of code and aggregating the metrics on that block of code, with the actual cost evaluation done entirely in terms of TTI. llvm-svn: 173148	2013-01-22 11:26:02 +00:00
Bill Wendling	c90d9f89b7	Have AttributeSet::getRetAttributes() return an AttributeSet instead of Attribute. This further restricts the use of the Attribute class to the Attribute family of classes. llvm-svn: 173098	2013-01-21 22:44:49 +00:00
Bill Wendling	bd4ea16bf3	Make AttributeSet::getFnAttributes() return an AttributeSet instead of an Attribute. This is more code to isolate the use of the Attribute class to that of just holding one attribute instead of a collection of attributes. llvm-svn: 173094	2013-01-21 21:57:28 +00:00
Paul Redmond	9d86a4a3b6	Transform (sub 0, (zext bool to A)) to (sext bool to A) and (sub 0, (sext bool to A)) to (zext bool to A). Patch by Muhammad Ahmad Reviewed by Duncan Sands llvm-svn: 173093	2013-01-21 21:57:20 +00:00
Nadav Rotem	b2e7e7a0b6	Fix a comment. Induction vars don't need to start at zero. llvm-svn: 173061	2013-01-21 17:59:18 +00:00
Chandler Carruth	bb9caa9241	Switch CodeMetrics itself over to use TTI to determine if an instruction is free. The whole CodeMetrics API should probably be reworked more, but this is enough to allow deleting the duplicate code there for computing whether an instruction is free. All of the passes using this have been updated to pull in TTI and hand it to the CodeMetrics stuff. Further, a dead CodeMetrics API (analyzeFunction) is nuked for lack of users. llvm-svn: 173036	2013-01-21 13:04:33 +00:00
Chandler Carruth	4319e2948d	Make the inline cost a proper analysis pass. This remains essentially a dynamic analysis done on each call to the routine. However, now it can use the standard pass infrastructure to reference other analyses, instead of a silly setter method. This will become more interesting as I teach it about more analysis passes. This updates the two inliner passes to use the inline cost analysis. Doing so highlights how utterly redundant these two passes are. Either we should find a cheaper way to do always inlining, or we should merge the two and just fiddle with the thresholds to get the desired behavior. I'm leaning increasingly toward the latter as it would also remove the Inliner sub-class split. llvm-svn: 173030	2013-01-21 11:39:18 +00:00
Chandler Carruth	7e88e74447	Formatting and comment fixes to the always inliner. Formatting fixes brought to you by clang-format. llvm-svn: 173029	2013-01-21 11:39:16 +00:00
Chandler Carruth	0df3e5310c	Clean up the formatting and doxygen for the simple inliner a bit. No functionality changed. llvm-svn: 173028	2013-01-21 11:39:14 +00:00
Benjamin Kramer	a6e2e2a0a7	LoopVectorize: Fix a C++11 incompatibility. llvm-svn: 172990	2013-01-20 20:29:52 +00:00
Nadav Rotem	da9f2adffd	Fix a build error. llvm-svn: 172971	2013-01-20 09:39:17 +00:00
Nadav Rotem	c42f90b1f4	LoopVectorizer: Implement a new heuristics for selecting the unroll factor. We ignore the cpu frontend and focus on pipeline utilization. We do this because we don't have a good way to estimate the loop body size at the IR level. llvm-svn: 172964	2013-01-20 05:24:29 +00:00
Benjamin Kramer	d455ed85d1	LoopVectorizer: Emit memory checks into their own basic block. This separates the check for "too few elements to run the vector loop" from the "memory overlap" check, giving a lot nicer code and allowing to skip the memory checks when we're not going to execute the vector code anyways. We still leave the decision of whether to emit the memory checks as branches or setccs, but it seems to be doing a good job. If ugly code pops up we may want to emit them as separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso. Most of this is legwork to allow multiple bypass blocks while updating PHIs, dominators and loop info. llvm-svn: 172902	2013-01-19 13:57:58 +00:00
Chandler Carruth	1fe21fc0b5	Sort all of the includes. Several files got checked in with mis-sorted includes. llvm-svn: 172891	2013-01-19 08:03:47 +00:00
Michael Gottesman	87db357547	Improved comment. llvm-svn: 172864	2013-01-18 23:02:45 +00:00
Michael Gottesman	9854e0c6a2	Fixed typo in comment. llvm-svn: 172863	2013-01-18 23:00:33 +00:00
Bill Wendling	658d24d211	Use AttributeSet accessor methods instead of Attribute accessor methods. Further encapsulation of the Attribute object. Don't allow direct access to the Attribute object as an aggregate. llvm-svn: 172853	2013-01-18 21:53:16 +00:00
Bill Wendling	7754389526	Push some more methods down to hide the use of the Attribute class. Because the Attribute class is going to stop representing a collection of attributes, limit the use of it as an aggregate in favor of using AttributeSet. This replaces some of the uses for querying the function attributes. llvm-svn: 172844	2013-01-18 21:11:39 +00:00
Benjamin Kramer	0eba5775f3	Silence GCC warning about dropping off a non-void function. llvm-svn: 172839	2013-01-18 19:45:22 +00:00
Alexey Samsonov	46c5a5549e	80 columns llvm-svn: 172813	2013-01-18 12:49:06 +00:00
Will Dietz	b9eb34e100	Move Blacklist.h to include/ to enable use from clang. llvm-svn: 172806	2013-01-18 11:29:21 +00:00
Craig Topper	45d9f4b569	Check for less than 0 in shuffle mask instead of -1. It's more consistent with other code related to shuffles and easier to implement in compiled code. llvm-svn: 172788	2013-01-18 05:30:07 +00:00
Craig Topper	2ea22b0b84	Remove trailing whitespace. Remove new lines between closing brace and 'else' llvm-svn: 172784	2013-01-18 05:09:16 +00:00
Michael Gottesman	d359e06245	Fixed 80+ violation. llvm-svn: 172782	2013-01-18 03:08:39 +00:00
Michael Gottesman	1d777513e5	Added missing const from my last commit. llvm-svn: 172736	2013-01-17 18:36:17 +00:00
Michael Gottesman	782e34474a	[ObjCARC] Implemented operator<< for InstructionClass and changed a ``Visited'' Debug message to use it. llvm-svn: 172735	2013-01-17 18:32:34 +00:00
Alexey Samsonov	347bcd3c5c	ASan: add optional 'zero-based shadow' option to ASan passes. Always tell the values of shadow scale and offset to the runtime llvm-svn: 172709	2013-01-17 11:12:32 +00:00
Alexey Samsonov	1345d35e40	ASan: wrap mapping scale and offset in a struct and make it a member of ASan passes. Add test for non-default mapping scale and offset. No functionality change llvm-svn: 172610	2013-01-16 13:23:28 +00:00
Michael Gottesman	6a9355f8d7	[ObjCARC] Turn off ignoring unwind edges in ObjCARC when -fno-objc-arc-exception is enabled due to its effect on correctness. Specifically according to the semantics of ARC -fno-objc-arc-exception simply states that it is expected that the unwind path out of a call MAY not release objects. Thus we can have the situation where a release gets moved into a catch block which we ignore when we remove a retain/release pair resulting in (even though we assume the program is exiting anyways) the cleanup code path potentially blowing up before program exit. llvm-svn: 172599	2013-01-16 06:32:39 +00:00
Nadav Rotem	7df850924d	Teach InstCombine to optimize extract of a value from a vector add operation with a constant zero. llvm-svn: 172576	2013-01-15 23:43:14 +00:00
Shuxin Yang	e822745202	1. Hoist minus sign as high as possible in an attempt to reveal some optimization opportunities (in the enclosing super-expressions). rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y) if expression "-0.0 - X" has only one reference. rule 2. (0.0 - X ) * Y => -0.0 - (X * Y) if expression "0.0 - X" has only one reference, and the instruction is marked "noSignedZero". 2. Eliminate negation (The compiler was already able to handle these opt if the 0.0s are replaced with -0.0.) rule 3: (0.0 - X) * (0.0 - Y) => X * Y rule 4: (0.0 - X) * C => X * -C if the expr is flagged "noSignedZero". 3. Rule 5: (X*Y) * X => (X*X) * Y if X!=Y and the expression is flagged with "UnsafeAlgebra". The purpose of this transformation is two-fold: a) to form a power expression (of X). b) potentially shorten the critical path: After transformation, the latency of the instruction Y is amortized by the expression of X*X, and therefore Y is in a "less critical" position compared to what it was before the transformation. 4. Remove the InstCombine code about simplifying "X * select". The reasons are following: a) The "select" is somewhat architecture-dependent, therefore the higher level optimizers are not able to precisely predict if the simplification really yields any performance improvement or not. b) The "select" operator is a bit complicated, and tends to obscure optimization opportunities. It is better to keep it as low as possible in expr tree, and let CodeGen tackle the optimization. llvm-svn: 172551	2013-01-15 21:09:32 +00:00
Nadav Rotem	d33ce6f100	LoopVectorizer cost model. Honor the user command line flag that selects the vectorization factor even if the target machine does not have any vector registers. llvm-svn: 172544	2013-01-15 18:25:16 +00:00
Evgeniy Stepanov	d14e47b146	[msan] Fix handling of equality comparison of pointer vectors. Also improve test coverage of the handling of relational comparisons. llvm-svn: 172539	2013-01-15 16:44:52 +00:00
Jakub Staszak	190db2f25c	Remove trailing spaces. llvm-svn: 172489	2013-01-14 23:16:36 +00:00
Shuxin Yang	320f52a4b0	This change is to implement the following rules under the condition C_A and/or C_R. C_A: reassociation is allowed. C_R: reciprocal of a constant C is appropriate, which means - 1/C is exact, or - reciprocal is allowed and 1/C is neither a special value nor a denormal. rule 1: (X/C1) / C2 => X / (C2*C1) (if C_A) => X * (1/(C2*C1)) (if C_A && C_R) rule 2: X*C1 / C2 => X * (C1/C2) if C_A rule 3: (X/Y)/Z => X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value) rule 4: Z/(X/Y) => (Z*Y)/X (similar to rule 3) rule 5: C1/(X*C2) => (C1/C2) / X (if C_A) rule 6: C1/(X/C2) => (C1*C2) / X (if C_A) rule 7: C1/(C2/X) => (C1/C2) * X (if C_A) llvm-svn: 172488	2013-01-14 22:48:41 +00:00
David Greene	530430be98	Fix Casting Bug Add a const version of getFpValPtr to avoid a cast-away-const warning. llvm-svn: 172467	2013-01-14 21:04:40 +00:00
Nick Lewycky	80ea003c6c	Fix typo in comment. llvm-svn: 172460	2013-01-14 20:56:10 +00:00
Michael Gottesman	e9145d3846	Changed SmallPtrSet.count guard + SmallPtrSet.insert to just SmallPtrSet.insert. llvm-svn: 172452	2013-01-14 19:18:39 +00:00
Michael Gottesman	4385edf5cb	Fixed some 80+ violations. llvm-svn: 172374	2013-01-14 01:47:53 +00:00
Michael Gottesman	97e3df087d	Updated the documentation in ObjCARC.cpp to fit the style guide better (i.e. use doxygen). Still some work to do though. llvm-svn: 172371	2013-01-14 00:35:14 +00:00
Michael Gottesman	f15c0bb495	Fixed an infinite loop in the block escape analysis in ObjCARC caused by 2x blocks each assigned a value via a phi-node causing each to depend on the other. A test case is provided as well. llvm-svn: 172368	2013-01-13 22:12:06 +00:00
Dmitri Gribenko	226fea5bd6	Remove redundant 'llvm::' qualifications llvm-svn: 172358	2013-01-13 16:01:15 +00:00
Nadav Rotem	40e45eeae2	Fix PR14547. Handle induction variables of sizes smaller than i32 (i8 and i16). llvm-svn: 172348	2013-01-13 07:56:29 +00:00
Michael Gottesman	1a89fe554b	[ObjCARC] Even more debug messages! llvm-svn: 172347	2013-01-13 07:47:32 +00:00
Michael Gottesman	af2113ffb5	[ObjCARC] More debug messages. llvm-svn: 172346	2013-01-13 07:00:51 +00:00
Chandler Carruth	7e31c8f0ae	Fix an editor goof in r171738 that Bill spotted. He may even have a test case, but looking at the diff this was an obviously unintended change. Thanks for the careful review Bill! =] llvm-svn: 172336	2013-01-12 23:46:04 +00:00
Benjamin Kramer	64a857ac69	GlobalOpt: Avoid jump on uninitialized value. Found by valgrind. llvm-svn: 172318	2013-01-12 15:34:31 +00:00
Michael Gottesman	9f1be68703	Fixed debug message in ObjCARC. llvm-svn: 172299	2013-01-12 03:45:49 +00:00
Michael Gottesman	b24bdef7a4	Fixed a few debug messages in ObjCARC and added one. llvm-svn: 172298	2013-01-12 02:57:16 +00:00
Michael Gottesman	556ff61122	Fixed bug in ObjCARC where we were changing a call from objc_autoreleaseRV => objc_autorelease but were not updating the InstructionClass to IC_Autorelease. llvm-svn: 172288	2013-01-12 01:25:19 +00:00
Michael Gottesman	c9656faf1e	Fixed a bug where we were tail calling objc_autorelease causing an object to not be placed into an autorelease pool. The reason that this occurs is that tail calling objc_autorelease eventually tail calls -[NSObject autorelease] which supports fast autorelease. This can cause us to violate the semantic guarantees of __autoreleasing variables that assignment to an __autoreleasing variable always yields an object that is placed into the innermost autorelease pool. The fix included in this patch works by: 1. In the peephole optimization function OptimizeIndividualFunctions, always remove tail call from objc_autorelease. 2. Whenever we convert to/from an objc_autorelease, set/unset the tail call keyword as appropriate. NOTE I also handled the case where objc_autorelease is converted in OptimizeReturns to an autoreleaseRV which still violates the ARC semantics. I will be removing that in a later patch and I wanted to make sure that the tree is in a consistent state vis-a-vis ARC always. Additionally some test cases are provided and all tests that have tail call marked objc_autorelease keywords have been modified so that tail call has been removed. NOTE One test fails due to a separate bug that I am going to commit soon. Thus I marked the check line TMP: instead of CHECK: so make check does not fail. llvm-svn: 172287	2013-01-12 01:25:15 +00:00
Michael Gottesman	2a6542727d	Fixed whitespace. llvm-svn: 172271	2013-01-11 23:08:52 +00:00
Michael Gottesman	d1a46f23b4	Added debug messages to GlobalOpt. Specifically: 1. Added a missing new line when we emit a debug message saying that we are marking a global variable as constant. 2. Added debug messages that describe what is occurring when GlobalOpt is evaluating a block/function. 3. Added a debug message that says what specific constructor is being evaluated. llvm-svn: 172247	2013-01-11 20:07:53 +00:00
Nadav Rotem	853fe0acb9	ARM Cost Model: We need to detect the max bitwidth of types in the loop in order to select the max vectorization factor. We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector zext/sext/trunc operations. llvm-svn: 172178	2013-01-11 07:11:59 +00:00
Shuxin Yang	c5c730b0e0	PR14904: Segmentation fault running pass 'Recognize loop idioms' The root cause is mistakenly taking for granted that "dyn_cast<Instruction>(a-Value)" returns a non-NULL instruction. llvm-svn: 172145	2013-01-10 23:32:01 +00:00
Peter Collingbourne	f7d65c43d0	[msan] Change va_start/va_copy shadow memset alignment to 8. This fixes va_start/va_copy of a va_list field which happens to not be laid out at a 16-byte boundary. Differential Revision: http://llvm-reviews.chandlerc.com/D276 llvm-svn: 172128	2013-01-10 22:36:33 +00:00
Owen Anderson	dbf0ca523d	Teach InstCombine to hoist FABS and FNEG through FPTRUNC instructions. The application of these operations commutes with the truncation, so we should prefer to do them in the smallest size we can, to save register space, use smaller constant pool entries, etc. llvm-svn: 172117	2013-01-10 22:06:52 +00:00
Nadav Rotem	6eae65cfac	LoopVectorizer: Fix a bug in the vectorization of BinaryOperators. The BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals. PR14878 llvm-svn: 172079	2013-01-10 17:34:39 +00:00
Joey Gouly	58bf951dec	Fix TryToShrinkGlobalToBoolean in GlobalOpt, so that it does not discard address spaces. llvm-svn: 172051	2013-01-10 10:31:11 +00:00
Michael Gottesman	a6cb018bb5	[ObjCARC Debug Message] Added debug message when we convert an autorelease into an autoreleaseRV. llvm-svn: 172034	2013-01-10 02:03:50 +00:00
Nadav Rotem	b1791a75cd	ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor. llvm-svn: 172010	2013-01-09 22:29:00 +00:00
Michael Gottesman	c189a392ce	[ObjCARC Debug Messages] This is a squashed commit of 3x debug message commits ala echristo's suggestion. 1. Added debug messages when in OptimizeIndividualCalls we move calls into predecessors and then erase the original call. 2. Added debug messages when in the process of moving calls in ObjCARCOpt::MoveCalls we create new RR and delete old RR. 3. Added a debug message when we visit a specific retain instruction in ObjCARCOpt::PerformCodePlacement. llvm-svn: 171988	2013-01-09 19:23:24 +00:00
Benjamin Kramer	130fcde3e5	LICM: Hoist insertvalue/extractvalue out of loops. Fixes PR14854. llvm-svn: 171984	2013-01-09 18:12:03 +00:00
Nadav Rotem	b696c36fcd	Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM. llvm-svn: 171928	2013-01-09 01:15:42 +00:00
Shuxin Yang	f0537ab681	Consider expression "0.0 - X" as the negation of X if - this expression is explicitly marked no-signed-zero, or - no-signed-zero of this expression can be derived from some context. llvm-svn: 171922	2013-01-09 00:13:41 +00:00
Nadav Rotem	3c352c0f4a	Code cleanup: refactor the switch statements in the generation of reduction variables into an IR builder call. llvm-svn: 171871	2013-01-08 17:37:45 +00:00
Nadav Rotem	6f6d21a17b	Rename the enum members to match the LLVM coding style. llvm-svn: 171868	2013-01-08 17:23:17 +00:00
Bill Wendling	76c6521ba1	Make sure we don't emit instructions before a landingpad instruction. PR14782 llvm-svn: 171846	2013-01-08 10:51:32 +00:00
Nadav Rotem	5a197c06f3	LoopVectorizer: Add support for floating point reductions llvm-svn: 171812	2013-01-07 23:13:00 +00:00
Shuxin Yang	8013866519	Cosmetic change in order to conform to coding std. Thanks to Eric Christopher for figuring out these problems! llvm-svn: 171805	2013-01-07 22:41:28 +00:00
Nadav Rotem	c60d7d96f5	LoopVectorizer: When we vectorize and widen loops we process many elements at once. This is a good thing, except for small loops. On small loops the post-loop that handles scalars (and runs slower) can take more time to execute than the rest of the loop. This patch disables widening of loops with a small static trip count. llvm-svn: 171798	2013-01-07 21:54:51 +00:00
Shuxin Yang	df0e61e793	This change is to implement the following rules: o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal) o. X/C1 * C2 -> X/(C1/C2) (if C2/C1 is either special FP or denormal, but C1/C2 is a normal FP) Let MDC denote multiplication or division with one & only one operand being a constant o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2) (so long as the constant-folding doesn't yield any denormal or special value) llvm-svn: 171793	2013-01-07 21:39:23 +00:00
Michael Gottesman	10426b571e	Fixed EOL whitespace. llvm-svn: 171791	2013-01-07 21:26:07 +00:00
Quentin Colombet	3b2db0bcd3	When code size is the priority (Oz, MinSize attribute), help LLVM turn code like this: if (foo) free(foo) into this: free(foo) Move a call to free from basic block FB into FB's predecessor, P, when the path from P to FB is taken only if the argument of free is not equal to NULL. Some restrictions apply on P and FB to be sure that this code motion is profitable. Namely: 1. FB must have only one predecessor P. 2. FB must contain only the call to free plus an unconditional branch to S. 3. P's successors are FB and S. Because of 1., we will not increase the code size when moving the call to free from FB to P. Because of 2., FB will be empty after the move. Because of 2. and 3., P's branch instruction becomes useless, as does FB (simplifycfg will do the job). llvm-svn: 171762	2013-01-07 18:37:41 +00:00
Chandler Carruth	dcb603feef	Move TypeFinder.h into the IR tree, it clearly belongs with the IR library. llvm-svn: 171749	2013-01-07 15:43:51 +00:00
Chandler Carruth	839a98e687	Move CallGraphSCCPass.h into the Analysis tree; that's where the implementation lives already. llvm-svn: 171746	2013-01-07 15:26:48 +00:00
Chandler Carruth	683ff2d7f9	Remove the long defunct 'DefaultPasses' header. We have a pass manager builder these days, and this thing hasn't seen updates for a very long time. llvm-svn: 171741	2013-01-07 15:16:50 +00:00
Chandler Carruth	95f83e0155	Sink AddrMode back into TargetLowering, removing one of the most peculiar headers under include/llvm. This struct still doesn't make a lot of sense, but it makes more sense down in TargetLowering than it did before. llvm-svn: 171739	2013-01-07 15:14:13 +00:00
Chandler Carruth	6e479322aa	Remove LSR's use of the random AddrMode struct. These variables were already in a class, just inline the four of them. I suspect that this class could be simplified some to not always keep distinct variables for these things, but it wasn't clear to me how given the usage so I opted for a trivial and mechanical translation. This removes one of the two remaining users of a header in include/llvm which does nothing more than define a 4 member struct. llvm-svn: 171738	2013-01-07 15:04:40 +00:00
Chandler Carruth	26c59fa870	Switch the SCEV expander and LoopStrengthReduce to use TargetTransformInfo rather than TargetLowering, removing one of the primary instances of the layering violation of Transforms depending directly on Target. This is a really big deal because LSR used to be a "special" pass that could only be tested fully using llc and by looking at the full output of it. It also couldn't run with any other loop passes because it had to be created by the backend. No longer is this true. LSR is now just a normal pass and we should probably lift the creation of LSR out of lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done this, or updated all of the tests to use opt and a triple, because I suspect someone more familiar with LSR would do a better job. This change should be essentially without functional impact for normal compilations, and only change behavior of targetless compilations. The conversion required changing all of the LSR code to refer to the TTI interfaces, which fortunately are very similar to TargetLowering's interfaces. However, it also allowed us to always expect to have some implementation around. I've pushed that simplification through the pass, and leveraged it to simplify code somewhat. It required some test updates for one of two things: either we used to skip some checks altogether but now we get the default "no" answer for them, or we used to have no information about the target and now we do have some. I've also started the process of removing AddrMode, as the TTI interface doesn't use it any longer. In some cases this simplifies code, and in others it adds some complexity, but I think it's not a bad tradeoff even there. Subsequent patches will try to clean this up even further and use other (more appropriate) abstractions. Yet again, almost all of the formatting changes brought to you by clang-format. =] llvm-svn: 171735	2013-01-07 14:41:08 +00:00
Silviu Baranga	a055aab506	Make the MergeGlobals pass correctly handle the address space qualifiers of the global variables. We partition the set of globals by their address space, and apply the same transformation as before to merge them. llvm-svn: 171730	2013-01-07 12:31:25 +00:00
Chandler Carruth	b348328b5d	Simplify LoopVectorize to require target transform info and rely on it being present. Make a member of one of the helper classes a reference as part of this. Reformatting goodness brought to you by clang-format. llvm-svn: 171726	2013-01-07 11:12:29 +00:00
Chandler Carruth	b7e60f6844	Merge the unused header file for LoopVectorizer into the source file. This makes the loop vectorizer match the pattern followed by roughly all other passes. =] Notably, this header file was broken in several regards: it contained a using namespace directive, global #define's that aren't globally appropriate, and global constants defined directly in the header file. As a side benefit, lots of the types in this file become internal, which will cause the optimizer to chew on this pass more effectively. llvm-svn: 171723	2013-01-07 10:44:06 +00:00
Chandler Carruth	7383bfd67e	Switch BBVectorize to directly depend on having a TTI analysis. This could be simplified further, but Hal has a specific feature for ignoring TTI, and so I preserved that. Also, I needed to use it because a number of tests fail when switching from a null TTI to the NoTTI nonce implementation. That seems suspicious to me and so may be something that you need to look into Hal. I worked it by preserving the old behavior for these tests with the flag that ignores all target info. llvm-svn: 171722	2013-01-07 10:22:36 +00:00
Chandler Carruth	04ece8623e	Fix a slew of indentation and parameter naming style issues. 80% of this patch was brought to you by the tool clang-format. I wanted to fix up the names of constructor parameters because they followed a bit of an anti-pattern by naming initialisms with CamelCase: 'Tti', 'Se', etc. This appears to have been in an attempt to not overlap with the names of member variables 'TTI', 'SE', etc. However, constructor arguments can very safely alias members, and in fact that's the conventional way to pass in members. I've fixed all of these I saw, along with making some strange abbreviations such as 'Lp' be simpler 'L', or 'Lgl' be the word 'Legal'. However, the code I was touching had indentation and formatting somewhat all over the map. So I ran clang-format and fixed them. I also fixed a few other formatting or doxygen formatting issues such as using ///< on trailing comments so they are associated with the correct entry. There is still a lot of room for improvement of the formatting and cleanliness of this code. ;] At least a few parts of the coding standards or common practices in LLVM's code aren't followed, the enum naming rules jumped out at me. I may fix some of these while I'm here, but not all of them. llvm-svn: 171719	2013-01-07 09:57:00 +00:00
Chandler Carruth	342cc255d0	Switch LoopIdiom pass to directly require target transform information. I'm sorry for duplicating bad style here, but I wanted to keep consistency. I've pinged the code review thread where this style was reviewed and changes were requested. llvm-svn: 171714	2013-01-07 09:17:41 +00:00
Chandler Carruth	0b4ef9cedc	Make SimplifyCFG simply depend upon TargetTransformInfo and pass it through as a reference rather than a pointer. There is always some implementation of this available, so this simplifies code by not having to test for whether it is available or not. Further, it turns out there were piles of places where SimplifyCFG was recursing and not passing down either TD or TTI. These are fixed to be more pedantically consistent even though I don't have any particular cases where it would matter. llvm-svn: 171691	2013-01-07 03:53:25 +00:00
Chandler Carruth	2109f47d97	Fix the enumerator names for ShuffleKind to match the coding standards, and make its comments doxygen comments. llvm-svn: 171688	2013-01-07 03:20:02 +00:00
Chandler Carruth	50a36cd148	Make the popcnt support enums and methods have more clear names and follow the coding conventions regarding enumerating a set of "kinds" of things. llvm-svn: 171687	2013-01-07 03:16:03 +00:00
Chandler Carruth	d3e73556d6	Move TargetTransformInfo to live under the Analysis library. This no longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686	2013-01-07 03:08:10 +00:00
Michael Gottesman	add0847459	[ObjCARC Debug Message] - Added debug message when we fuse a retain/autorelease pair in ObjCARCContract::ContractAutorelease. llvm-svn: 171679	2013-01-07 00:31:26 +00:00
Michael Gottesman	d61a3b2707	[ObjCARC Debug Message] - Added debug message when we zap a matching retain/autorelease pair in ObjCARCOpt::OptimizeReturns. llvm-svn: 171678	2013-01-07 00:04:56 +00:00
Michael Gottesman	5b970e14e6	[ObjCARC Debug Message] - Added debug message when we erase ARC calls with null since they are no-ops. llvm-svn: 171677	2013-01-07 00:04:52 +00:00
Michael Gottesman	8800a51ac1	[ObjCARC Debug Message] - Added debug message when we add a nounwind keyword to a function which can not throw. llvm-svn: 171676	2013-01-06 23:39:13 +00:00
Michael Gottesman	2d76331f86	[ObjCARC Debug Message] - Added debug message when we add a tail keyword to a function which can never be passed stack args. llvm-svn: 171675	2013-01-06 23:39:09 +00:00
Michael Gottesman	4bf6e7516e	[ObjCARC Debug Messages] - Added missing newline. llvm-svn: 171674	2013-01-06 22:56:54 +00:00
Michael Gottesman	a6a1dadeab	Added debug statement to ObjCARC when we replace objc_autorelease(x) with objc_release(x) when x is otherwise unused. llvm-svn: 171673	2013-01-06 22:56:50 +00:00
Michael Gottesman	fec61c018d	Added 2x Debug statements to ObjCARC that log when we handle the two undefined pointer-to-weak-pointer is NULL cases by replacing the given call inst with an undefined value. The reason that there are two cases is that the first case handles the unary cases and the second the binary cases. llvm-svn: 171672	2013-01-06 21:54:30 +00:00
Michael Gottesman	dc042f0089	Added debug message in ObjCARC when we remove a no-op cast which has only special semantic meaning in the frontend and thus in the optimizer can be deleted. llvm-svn: 171670	2013-01-06 21:07:15 +00:00
Michael Gottesman	1bf6908867	Added debug message to ObjCARC when we transform an objc_autoreleaseReturnValue => objc_autorelease due to its operand not being used as a return value. llvm-svn: 171669	2013-01-06 21:07:11 +00:00
Andrew Trick	f950ce8e38	Fix a crash in LSR replaceCongruentIVs. Indirect branch in the preheader crashes replaceCongruentIVs. Fixes rdar://12910141. llvm-svn: 171653	2013-01-06 05:59:39 +00:00
Michael Gottesman	def07bba3e	Added debug message to ObjCARC when we transform objc_retainAutoreleasedReturnValue => objc_retain since the operand to said function is not a return value. llvm-svn: 171629	2013-01-05 17:55:42 +00:00
Michael Gottesman	5c32ce9d3e	Added debug message for ObjCARC when we zap an objc_autoreleaseReturnValue/objc_retainAutoreleasedValue pair. llvm-svn: 171628	2013-01-05 17:55:35 +00:00
Chris Lattner	473988cf54	switch from pointer equality comparison to MDNode::getMostGenericTBAA when merging two TBAA tags, pointed out by Nuno. llvm-svn: 171627	2013-01-05 16:44:07 +00:00
Chandler Carruth	21b3c586ab	Switch the loop vectorizer from VTTI to just use TTI directly. llvm-svn: 171620	2013-01-05 10:16:02 +00:00
Chandler Carruth	7c4f91dea5	Switch the BB vectorizer from the VTTI interface to the simple TTI interface. llvm-svn: 171618	2013-01-05 10:05:28 +00:00
Chandler Carruth	6db43e6ca3	Switch SimplifyCFG over to the TargetTransformInfo interface rather than the ScalarTargetTransformInfo interface. llvm-svn: 171617	2013-01-05 10:05:26 +00:00
Chandler Carruth	6fe147fb3a	Switch LoopIdiomRecognize to directly use the TargetTransformInfo interface rather than the ScalarTargetTransformInterface. llvm-svn: 171616	2013-01-05 10:00:09 +00:00
Chandler Carruth	c892591596	Sink the AddressingModeMatcher helper class into an anonymous namespace next to its only user. This helper relies on TargetLowering information that shouldn't be generally used throughout the Transforms library, and so it made little sense as a generic utility. This also consolidates the file where we need to remove the remaining uses of TargetLowering in favor of the IR-layer abstract interface in TargetTransformInfo. llvm-svn: 171590	2013-01-05 02:09:22 +00:00
Nadav Rotem	e9f5bfd5e9	LoopVectorize: Non commutative operators can be used as reduction variables as long as the reduction chain is used in the LHS. PR14803. llvm-svn: 171583	2013-01-05 01:15:47 +00:00
Paul Redmond	874f01e956	Do not vectorize loops with subtraction reductions Since subtraction does not commute the loop vectorizer incorrectly vectorizes reductions such as x = A[i] - x. Disabling for now. llvm-svn: 171537	2013-01-04 22:10:16 +00:00
Michael Gottesman	1e00ac6256	Added DEBUG message to ObjCARC when we optimize objc_retain => objc_retainAutoreleasedReturnValue. llvm-svn: 171535	2013-01-04 21:30:38 +00:00
Michael Gottesman	9f848aeddd	Fixed up some DEBUG messages where I was putting in the text of a message the method where it was being called when I should have just prefixed the actual message with Pass::Method. Additionally I fixed some whitespace issues. llvm-svn: 171534	2013-01-04 21:29:57 +00:00
Nadav Rotem	93bd30be9b	Fix a warning llvm-svn: 171525	2013-01-04 21:08:44 +00:00
Nadav Rotem	be6570d429	Move the loop vectorizer from O2 to O3. It looks like the increase in code size actually hurts the performance on many programs. llvm-svn: 171471	2013-01-04 17:57:44 +00:00
Nadav Rotem	e1d5c4b8b9	LoopVectorizer: 1. Add code to estimate register pressure. 2. Add code to select the unroll factor based on register pressure. 3. Add bits to TargetTransformInfo to provide the number of registers. llvm-svn: 171469	2013-01-04 17:48:25 +00:00
Michael Gottesman	50ae5b28e9	Changed two debug statements that state that a queue had finished being processed when said queue was really a list to state a list had finished being processed. llvm-svn: 171465	2013-01-03 08:09:27 +00:00
Michael Gottesman	ef682c5430	Added DEBUG message for ObjCARC when we zap a push/pop pair in ObjCARCAPElim::OptimizeBB. llvm-svn: 171464	2013-01-03 08:09:17 +00:00
Michael Gottesman	416dc00cad	Added DEBUG message to ObjCARC when we transform objc_initWeak(p, null) => *p = null. llvm-svn: 171463	2013-01-03 07:32:53 +00:00
Michael Gottesman	00d1f966b4	Added DEBUG message for ObjCARC when an inline asm marker is inserted for architectures where this is required to perform a retainAutoreleasedReturnValue optimization. llvm-svn: 171462	2013-01-03 07:32:41 +00:00
Nadav Rotem	72f984b596	LoopVectorizer: Add support for loop-unrolling during vectorization for increasing the ILP. At the moment this feature is disabled by default and this commit should not cause any functional changes. llvm-svn: 171436	2013-01-03 00:52:27 +00:00
Nadav Rotem	4897392360	Avoid vectorization when the function has the "noimplicitfloat" attribute. llvm-svn: 171429	2013-01-02 23:54:43 +00:00
Shuxin Yang	98c844fd89	- Add comment to two functions which might be considered as dead code. - Fix a typo llvm-svn: 171399	2013-01-02 18:26:31 +00:00
Chandler Carruth	db25c6cf8e	Actually update the CMake and Makefile builds correctly, and update the code that includes Intrinsics.gen directly. This never showed up in my testing because the old Intrinsics.gen was still kicking around in the make build system and was correct there. =[ Thankfully, some of the bots do clean rebuilds and that caught this. llvm-svn: 171373	2013-01-02 12:09:16 +00:00
Chandler Carruth	9fb823bbd4	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366	2013-01-02 11:36:10 +00:00
Chandler Carruth	be81023d74	Re-sort the #include lines in include/... and lib/... with the utils/sort_includes.py script. Most of these are updating the new R600 target and fixing up a few regressions that have crept in since the last time I sorted the includes. llvm-svn: 171362	2013-01-02 10:22:59 +00:00
Benjamin Kramer	614b5e85b9	Add IRBuilder::CreateVectorSplat and use it to simplify code. llvm-svn: 171349	2013-01-01 19:55:16 +00:00
Benjamin Kramer	c003a4521b	SROA: Clean up unused assignment warnings from clang's analyzer. No functionality change. llvm-svn: 171348	2013-01-01 16:13:35 +00:00
Michael Gottesman	c8a11df33b	Added DEBUG message when ObjCARC replaces a call which returns its argument verbatim with its argument to temporarily undo an optimization. Specifically these calls return their argument verbatim, as a low-level optimization. However, this makes high-level optimizations harder. We undo any uses of this optimization that the front-end emitted. We redo them later in the contract pass. llvm-svn: 171346	2013-01-01 16:05:54 +00:00
Michael Gottesman	3f146e204e	Added DEBUG messages to the top of several processing loops in ObjCARC.cpp that emit what instructions are being visited. This is a part of a larger effort of adding DEBUG messages to the ARC Optimizer Backend. llvm-svn: 171345	2013-01-01 16:05:48 +00:00
Jakub Staszak	c48bbe7170	Add extra CHECK to make sure that 'or' instruction was replaced. Also add an assert to avoid confusion in the code where it is known that C1 <= C2. llvm-svn: 171310	2012-12-31 18:26:42 +00:00
Chris Lattner	f5cca68c2c	Fix LICM's memory promotion optimization to preserve TBAA tags when promoting a store in a loop. This was noticed when working on PR14753, but isn't directly related. llvm-svn: 171281	2012-12-31 08:37:17 +00:00
Chris Lattner	eeefe1bc07	teach instcombine to preserve TBAA tag when merging two stores, part of PR14753 llvm-svn: 171279	2012-12-31 08:10:58 +00:00
Jakub Staszak	f584977df2	Grammo. llvm-svn: 171272	2012-12-31 01:40:44 +00:00
Bill Wendling	6e95ae803a	Remove the getAttributesAtIndex and getNumAttrs methods in favor of using the getAttrSomewhere predicate. This prevents the uses of 'Attribute' as a collection of attributes. llvm-svn: 171271	2012-12-31 00:49:59 +00:00
Jakub Staszak	ea2b9b9d67	Transform (A == C1 || A == C2) into (A & ~(C1 ^ C2)) == C1 if C1 and C2 differ in only one bit. Fixes PR14708. llvm-svn: 171270	2012-12-31 00:34:55 +00:00
Nuno Lopes	b6ad98224a	convert a bunch of callers from DataLayout::getIndexedOffset() to GEP::accumulateConstantOffset(). The latter API is nicer than the former, and is correct regarding wrap-around offsets (if anyone cares). There are a few more places left with duplicated code, which I'll remove soon. llvm-svn: 171259	2012-12-30 16:25:48 +00:00
Bill Wendling	94dcaf8e2b	Remove Function::getParamAttributes and use the AttributeSet accessor methods instead. llvm-svn: 171255	2012-12-30 12:45:13 +00:00
Bill Wendling	698e84fc4f	Remove the Function::getFnAttributes method in favor of using the AttributeSet directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. llvm-svn: 171253	2012-12-30 10:32:01 +00:00
Nadav Rotem	0b37f14371	LoopVectorizer: Fix a bug in the code that updates the loop exiting block. LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs. The bug happened because undefs are not loop values. This patch handles these PHIs. PR14725 llvm-svn: 171251	2012-12-30 07:47:00 +00:00
Alexey Samsonov	3efc87e92d	Add proper support for -fsanitize-blacklist= flag for TSan and MSan. LLVM part. llvm-svn: 171183	2012-12-28 09:30:44 +00:00
Chandler Carruth	e40e60eed5	Make this parameter be named consistently with most other getAnalysisUsage implementations. llvm-svn: 171157	2012-12-27 11:17:15 +00:00
Alexey Samsonov	29dd7f2090	[ASan] Fix lifetime intrinsics handling. Now for each intrinsic we check if it describes one of 'interesting' allocas. Assume that allocas can go through casts and phi-nodes before appearing as llvm.lifetime arguments llvm-svn: 171153	2012-12-27 08:50:58 +00:00
Nadav Rotem	5350cd314b	If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified. PR14719. llvm-svn: 171124	2012-12-26 23:30:53 +00:00
Nick Lewycky	90053a1214	Remove mid-optimizer warning. This situation should be handled differently, such as by a compiler warning, a check in clang -fsanitize=undefined, being optimized to unreachable, or a combination of the above. PR14722. llvm-svn: 171119	2012-12-26 22:00:35 +00:00
Nadav Rotem	3f7c4f36ba	LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1 llvm-svn: 171114	2012-12-26 19:08:17 +00:00
Evgeniy Stepanov	5eb5bf8b46	[msan] Raise alignment of origin stores/loads when possible. Origin alignment is as high as the alignment of the corresponding application location, but never less than 4. llvm-svn: 171110	2012-12-26 11:55:09 +00:00
Evgeniy Stepanov	d8be0c510c	[msan] Expand the file comment with track-origins info. llvm-svn: 171109	2012-12-26 10:59:00 +00:00
Hal Finkel	30e95a8ebb	BBVectorize: Use VTTI to compute costs for intrinsics vectorization For the time being this includes only some dummy test cases. Once the generic implementation of the intrinsics cost function does something other than assuming scalarization in all cases, or some target specializes the interface, some real test cases can be added. Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID in a few other places. llvm-svn: 171079	2012-12-26 01:36:57 +00:00
Hal Finkel	b44f890133	LoopVectorize: Enable vectorization of the fmuladd intrinsic llvm-svn: 171076	2012-12-25 23:21:29 +00:00
Hal Finkel	2a456112ec	BBVectorize: Enable vectorization of the fmuladd intrinsic llvm-svn: 171075	2012-12-25 22:36:08 +00:00
Evgeniy Stepanov	f19c086d1e	[msan] Fix handling of vectors of pointers. VectorType::getInteger() can not be used with them, because pointer size depends on the target. llvm-svn: 171070	2012-12-25 16:04:38 +00:00
Evgeniy Stepanov	ec8371283b	[msan] Fix handling of select with vector condition. llvm-svn: 171069	2012-12-25 14:56:21 +00:00
Alexey Samsonov	788381b8ac	ASan: initialize callbacks from ASan module pass in a separate function for consistency llvm-svn: 171061	2012-12-25 12:28:20 +00:00
Alexey Samsonov	1e3f7ba8f7	ASan: move stack poisoning logic into FunctionStackPoisoner struct llvm-svn: 171060	2012-12-25 12:04:36 +00:00
Bob Wilson	4ed23578da	Add LLVMContext::emitWarning methods and use them. <rdar://problem/12867368> When the backend is used from clang, it should produce proper diagnostics instead of just printing messages to errs(). Other clients may also want to register their own error handlers with the LLVMContext, and the same handler should work for warnings in the same way as the existing emitError methods. llvm-svn: 171041	2012-12-24 18:15:21 +00:00
Nadav Rotem	5f7c12cfbd	LoopVectorizer: When checking for vectorizable types, also check the StoreInst operands. PR14705. llvm-svn: 171023	2012-12-24 09:14:18 +00:00
Alexey Samsonov	098842b401	Fix typo in comments llvm-svn: 171021	2012-12-24 08:52:53 +00:00
Nadav Rotem	bd5d1d832a	LoopVectorizer: Fix an endless loop in the code that looks for reductions. The bug was in the code that detects PHIs in if-then-else block sequence. PR14701. llvm-svn: 171008	2012-12-24 01:22:06 +00:00
Benjamin Kramer	28691400dd	LoopVectorize: Fix accidentally inverted condition. llvm-svn: 171001	2012-12-23 13:21:41 +00:00
Benjamin Kramer	855ba03408	LoopVectorize: For scalars and void types there is no need to compute vector insert/extract costs. Fixes an assert during the build of oggenc in the test suite. llvm-svn: 171000	2012-12-23 13:19:18 +00:00
Nadav Rotem	2cade68025	Loop Vectorizer: Update the cost model of scatter/gather operations and make them more expensive. llvm-svn: 170995	2012-12-23 07:23:55 +00:00
Craig Topper	4c94775198	Remove trailing whitespace llvm-svn: 170990	2012-12-22 18:09:02 +00:00
Bill Wendling	c79e42c5ce	Change 'AttrVal' to 'AttrKind' to better reflect that it's a kind of attribute instead of the value of the attribute. llvm-svn: 170972	2012-12-22 00:37:52 +00:00
Roman Divacky	a229186a82	Remove duplicate includes. llvm-svn: 170902	2012-12-21 17:06:44 +00:00
Evgeniy Stepanov	4fbc0d08bf	[msan] Remove unreachable blocks before instrumenting a function. llvm-svn: 170883	2012-12-21 11:18:49 +00:00
Nadav Rotem	3b850b70b3	Enable if-conversion. llvm-svn: 170841	2012-12-21 04:47:54 +00:00
Evan Cheng	99cafb1db2	Every pass deserves a name, even codegenprep. llvm-svn: 170831	2012-12-21 01:48:14 +00:00
Nadav Rotem	a4b53f20a3	BB-Vectorizer: Check the cost of the store pointer type and not the return type, which is void. A number of test cases fail after adding the assertion in TTImpl. llvm-svn: 170828	2012-12-21 01:24:36 +00:00
Nadav Rotem	e7785686a5	Fix a bug in the code that checks if we can vectorize loops while using dynamic memory bound checks. Before the fix we were able to vectorize this loop from the Livermore Loops benchmark: for ( k=1 ; k<n ; k++ ) x[k] = x[k-1] + y[k]; llvm-svn: 170811	2012-12-21 00:07:35 +00:00
Nadav Rotem	2ababf68d7	LoopVectorize: Fix a bug in the scalarization of instructions. Before if-conversion we could check if a value is loop invariant if it was declared inside the basic block. Now that loops have multiple blocks this check is incorrect. This fixes External/SPEC/CINT95/099_go/099_go llvm-svn: 170756	2012-12-20 20:24:40 +00:00
Nadav Rotem	8b20c0a814	Loop Vectorizer: turn-off if-conversion. llvm-svn: 170708	2012-12-20 17:42:53 +00:00
James Molloy	4f6fb953a7	Add a new attribute, 'noduplicate'. If a function contains a noduplicate call, the call cannot be duplicated - Jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call. Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage). llvm-svn: 170704	2012-12-20 16:04:27 +00:00
Craig Topper	ae48cb2e5a	Formatting fixes. Remove some unnecessary 'else' after 'return'. No functional change. llvm-svn: 170676	2012-12-20 07:15:54 +00:00
Craig Topper	9d4171afed	Removing trailing whitespace llvm-svn: 170675	2012-12-20 07:09:41 +00:00
Nadav Rotem	7bdc45b570	Loop Vectorizer: Enable if-conversion. llvm-svn: 170632	2012-12-20 02:00:02 +00:00
Nadav Rotem	28408a20c9	whitespace llvm-svn: 170626	2012-12-20 00:49:56 +00:00
Paul Redmond	5917f4c715	Transform (x&C)>V into (x&C)!=0 where possible When the least bit of C is greater than V, (x&C) must be greater than V if it is not zero, so the comparison can be simplified. Although this was suggested in Target/X86/README.txt, it benefits any architecture with a directly testable form of AND. Patch by Kevin Schoedel llvm-svn: 170576	2012-12-19 19:47:13 +00:00
Evgeniy Stepanov	abeae5c7d5	[msan] Add track-origins argument to the pass constructor. llvm-svn: 170544	2012-12-19 13:55:51 +00:00
Evgeniy Stepanov	d7571cd4bc	[msan] Heuristically instrument unknown intrinsics. This change adds shadow and origin propagation for unknown intrinsics by examining the arguments and ModRef behaviour. For now, only 3 classes of intrinsics are handled: - those that look like simple SIMD store - those that look like simple SIMD load - those that don't have memory effects and look like arithmetic/logic/whatever operation on simple types. llvm-svn: 170530	2012-12-19 11:22:04 +00:00
Benjamin Kramer	e300004bd5	LoopVectorize: Make iteration over induction variables not depend on pointer values. MapVector is a bit heavyweight, but I don't see a simpler way. Also the InductionList is unlikely to be large. This should help 3-stage selfhost compares (PR14647). llvm-svn: 170528	2012-12-19 11:09:15 +00:00
Bill Wendling	d97b75d816	Inline the 'hasIncompatibleWithVarArgsAttrs' method into its only uses. And some minor comment reformatting. llvm-svn: 170516	2012-12-19 08:57:40 +00:00
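
The reasoning in commit 5917f4c715 above (transforming (x&C)>V into (x&C)!=0) can be checked by brute force. The sketch below is not the LLVM implementation; it assumes an unsigned comparison and small non-negative integers, and the helper names (lowest_set_bit, can_simplify, check_equivalence) are hypothetical, introduced only for illustration.

```python
def lowest_set_bit(c: int) -> int:
    """Return the value of the least significant set bit of c (0 if c == 0)."""
    return c & -c


def can_simplify(c: int, v: int) -> bool:
    """Precondition from the commit message: the least bit of C is greater than V.

    Assumes c and v are non-negative (unsigned) values.
    """
    return lowest_set_bit(c) > v


def check_equivalence(bits: int = 6) -> bool:
    """Exhaustively compare (x & c) > v with (x & c) != 0 for all small values.

    Only (c, v) pairs that satisfy the precondition are tested.
    """
    limit = 1 << bits
    for c in range(limit):
        for v in range(limit):
            if not can_simplify(c, v):
                continue
            for x in range(limit):
                masked = x & c
                if (masked > v) != (masked != 0):
                    return False
    return True


if __name__ == "__main__":
    print(check_equivalence())
```

The argument behind the check: if x&C is non-zero, it is at least the lowest set bit of C, which exceeds V; if x&C is zero, it cannot exceed a non-negative V. Both comparisons therefore agree whenever the precondition holds.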

12034 Commits