llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	d2d2aa52cd	[LowerExpectIntrinsic] make default likely/unlikely ratio bigger We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP. Differential Revision: http://reviews.llvm.org/D19435 llvm-svn: 267615	2016-04-26 22:23:38 +00:00
Justin Bogner	90744d215b	Reassociate: Simplify using lambdas. NFC llvm-svn: 267614	2016-04-26 22:22:18 +00:00
David Majnemer	30ffc4ce45	[SROA] Don't falsely report that changes have occured We would report that the function changed despite creating no new allocas or performing any promotion. This fixes PR27316. llvm-svn: 267507	2016-04-26 01:05:00 +00:00
Chad Rosier	e2cbd13e56	[ValueTracking] Improve isImpliedCondition when the dominating cond is false. llvm-svn: 267430	2016-04-25 17:23:36 +00:00
Andrew Kaylor	aa641a5171	Re-commit optimization bisect support (r267022) without new pass manager support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231	2016-04-22 22:06:11 +00:00
Justin Bogner	b93949089e	PM: Port SinkingPass to the new pass manager llvm-svn: 267199	2016-04-22 19:54:10 +00:00
Justin Bogner	82077c4ab0	PM: Reorder the functions used for SinkingPass. NFC This will make the port to the new PM easier to follow. llvm-svn: 267198	2016-04-22 19:54:04 +00:00
Jun Bum Lim	d29a24e4fd	[DeadStoreElimination] Shorten beginning of memset overwritten by later stores Summary: This change will shorten memset if the beginning of memset is overwritten by later stores. Reviewers: hfinkel, eeckstein, dberlin, mcrosier Subscribers: mgrang, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18906 llvm-svn: 267197	2016-04-22 19:51:29 +00:00
Justin Bogner	395c2127ed	PM: Port DCE to the new pass manager Also add a very basic test, since apparently there aren't any tests for DCE whatsoever to add the new pass version to. llvm-svn: 267196	2016-04-22 19:40:41 +00:00
Chad Rosier	1a4bc110f5	[EarlyCSE/CVP] Add stats for CVPs and make sure to account for any Changes. llvm-svn: 267187	2016-04-22 18:47:21 +00:00
David Majnemer	bfd695d591	[EarlyCSE] Don't add the overflow flags to the hash We take the intersection of overflow flags while CSE'ing. This permits us to consider two instructions with different overflow behavior to be replaceable. llvm-svn: 267153	2016-04-22 14:12:50 +00:00
Vedant Kumar	6013f45f92	Revert "Initial implementation of optimization bisect support." This reverts commit r267022, due to an ASan failure: http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549 llvm-svn: 267115	2016-04-22 06:51:37 +00:00
David Majnemer	d0ce8f1485	[GVN] Respect fast-math-flags on fcmps We assumed that flags were only present on binary operators. This is not true, they may also be present on calls and fcmps. llvm-svn: 267113	2016-04-22 06:37:51 +00:00
David Majnemer	9554c1339c	[EarlyCSE] Take the intersection of flags on instructions EarlyCSE had inconsistent behavior with regards to flag'd instructions: - In some cases, it would pessimize if the available instruction had different flags by not performing CSE. - In other cases, it would miscompile if it replaced an instruction which had no flags with an instruction which has flags. Fix this by being more consistent with our flag handling by utilizing andIRFlags. llvm-svn: 267111	2016-04-22 06:37:45 +00:00
Andrew Kaylor	f0f279291c	Initial implementation of optimization bisect support. This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations. The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used. The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way. Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267022	2016-04-21 17:58:54 +00:00
Adam Nemet	963341c872	[LoopUtils] Move def of findStringMetadataForLoop to LoopUtils.cpp. NFC The decl is in LoopUtils.h. I think that this was added to LoopVersioningLICM.cpp by mistake. llvm-svn: 267014	2016-04-21 17:33:17 +00:00
Adam Nemet	f787826b46	[LoopUtils] Rename {check->find}StringMetadata{Into->For}Loop. NFC "Into" was misleading. I am also planning to use this helper to look for loop metadata and return the argument, so find seems like a better name. llvm-svn: 267013	2016-04-21 17:33:12 +00:00
Chad Rosier	b346dcbc25	Typo. llvm-svn: 266905	2016-04-20 19:16:23 +00:00
Chad Rosier	41dd31f0b0	[ValueTracking] Make isImpliedCondition return an Optional<bool>. NFC. Phabricator Revision: http://reviews.llvm.org/D19277 llvm-svn: 266904	2016-04-20 19:15:26 +00:00
Chad Rosier	b7dfbb40a3	[ValueTracking] Improve isImpliedCondition for conditions with matching operands. This patch improves SimplifyCFG to catch cases like: if (a < b) { if (a > b) <- known to be false unreachable; } Phabricator Revision: http://reviews.llvm.org/D18905 llvm-svn: 266767	2016-04-19 17:19:14 +00:00
Michael Kuperstein	de16b44f74	Port DemandedBits to the new pass manager. Differential Revision: http://reviews.llvm.org/D18679 llvm-svn: 266699	2016-04-18 23:55:01 +00:00
Mehdi Amini	b550cb1750	[NFC] Header cleanup Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' \| xargs grep -L 'IndexedMap[<]' \| xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595	2016-04-18 09:17:29 +00:00
Duncan P. N. Exon Smith	a71301befa	Transforms: Fix bootstrap after r266565 Apparently there isn't test coverage for all of these. I'd appreciate if someone with could reproduce and send me something to reduce, but for now I've just looked for users of RemapInstruction and MapValue and ensured they don't accidentally insert nullptr. Here is one of the bootstraps that caught: http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11494 llvm-svn: 266567	2016-04-17 19:26:49 +00:00
Justin Lebar	cad81cf6b3	[Speculation] Add a SpeculativeExecution mode where the pass does nothing unless TTI::hasBranchDivergence() is true. Summary: This lets us add this pass to the IR pass manager unconditionally; it will simply not do anything on targets without branch divergence. Reviewers: tra Subscribers: llvm-commits, jingyue, rnk, chandlerc Differential Revision: http://reviews.llvm.org/D18625 llvm-svn: 266398	2016-04-15 00:32:09 +00:00
Nicolai Haehnle	05b127da06	[StructurizeCFG] Annotate branches that were treated as uniform Summary: This fully solves the problem where the StructurizeCFG pass does not consider the same branches as uniform as the SIAnnotateControlFlow pass. The patch in D19013 helps with this problem, but is not sufficient (and, interestingly, causes a "regression" with one of the existing test cases). No tests included here, because tests in D19013 already cover this. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19018 llvm-svn: 266346	2016-04-14 17:42:35 +00:00
Tim Northover	5c02f9ad28	ARM: override cost function to re-enable ConstantHoisting (& fix it). At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however: + ConstantHoisting was modifying switch statements. This is simply invalid, the cases must remain integer constants no matter the notional cost. + ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas. rdar://25707382 llvm-svn: 266260	2016-04-13 23:08:27 +00:00
Betul Buyukkurt	bf8554c279	[PGO] Remove redundant VP instrumentation LLVM optimization passes may reduce a profiled target expression to a constant. Removing runtime calls at such instrumentation points would help speedup the runtime of the instrumented program. llvm-svn: 266229	2016-04-13 18:52:19 +00:00
Sanjoy Das	5ce3272833	Don't IPO over functions that can be de-refined Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762	2016-04-08 00:48:30 +00:00
Ulrich Weigand	fc23907673	[GVN] Address review comments for D18662 As suggested by Chandler in his review comments for D18662, this follow-on patch renames some variables in GetLoadValueForLoad and CoerceAvailableValueToLoadType to hopefully make it more obvious which variables hold value sizes and which hold load/store sizes. No functional change intended. llvm-svn: 265687	2016-04-07 15:55:11 +00:00
Ulrich Weigand	6e6966460a	[GVN] Fix handling of sub-byte types in big-endian mode When GVN wants to re-interpret an already available value in a smaller type, it needs to right-shift the value on big-endian systems to ensure the correct bytes are accessed. The shift value is the difference of the sizes of the two types. This is correct as long as both types occupy multiples of full bytes. However, when one of them is a sub-byte type like i1, this no longer holds true: we still need to shift, but only to access the correct byte. Accessing bits within the byte requires no shift in either endianness; e.g. an i1 resides in the least-significant bit of its containing byte on both big- and little-endian systems. Therefore, the appropriate shift value to be used is the difference of the storage sizes of the two types. This is already handled correctly in one place where such a shift takes place (GetStoreValueForLoad), but is incorrect in two other places: GetLoadValueForLoad and CoerceAvailableValueToLoadType. This patch changes both places to use the storage size as well. Differential Revision: http://reviews.llvm.org/D18662 llvm-svn: 265684	2016-04-07 15:45:02 +00:00
Duncan P. N. Exon Smith	da68cbc4ad	IR: RF_IgnoreMissingValues => RF_IgnoreMissingLocals, NFC Clarify what this RemapFlag actually means. - Change the flag name to match its intended behaviour. - Clearly document that it's not supposed to affect globals. - Add a host of FIXMEs to indicate how to fix the behaviour to match the intent of the flag. RF_IgnoreMissingLocals should only affect the behaviour of RemapInstruction for function-local operands; namely, for operands of type Argument, Instruction, and BasicBlock. Currently, it is only passed into RemapInstruction calls (and the transitive MapValue calls that it makes). When I split Metadata from Value I didn't understand the flag, and I used it in a bunch of places for "global" metadata. This commit doesn't have any functionality change, but prepares to cleanup MapMetadata and MapValue. llvm-svn: 265628	2016-04-07 00:26:43 +00:00
JF Bastien	800f87a871	NFC: make AtomicOrdering an enum class Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602	2016-04-06 21:19:33 +00:00
Fiona Glaser	045afc4f66	Loop Unroll: add options and tweak to make Partial unrolling more useful 1. Add FullUnrollMaxCount option that works like MaxCount, but also limits the unroll count for fully unrolled loops. So if a loop has an iteration count over this, it won't fully unroll. 2. Add CLI options for MaxCount and the new option, so they can be tested (plus a test). 3. Make partial unrolling obey MaxCount. An example use-case (the out of tree one this is originally designed for) is a target’s TTI can analyze a loop and decide on a max unroll count separate from the size threshold, e.g. based on register pressure, then constrain LoopUnroll to not exceed that, regardless of the size of the unrolled loop. llvm-svn: 265562	2016-04-06 16:57:25 +00:00
Fiona Glaser	16332ba861	LoopUnroll: only allow non-modulo Partial unrolling when Runtime=true Patch by Evgeny Stupachenko <evstupac@gmail.com>. llvm-svn: 265558	2016-04-06 16:43:45 +00:00
Chad Rosier	074ce836f0	Simplify logic. NFC. llvm-svn: 265537	2016-04-06 13:27:13 +00:00
Richard Trieu	f35d4b0928	Add parentheses to silence warning. llvm-svn: 265516	2016-04-06 04:22:00 +00:00
Sanjoy Das	99abb2728b	[RS4GC] Add a comment llvm-svn: 265503	2016-04-06 01:33:54 +00:00
Sanjoy Das	8d89a2b296	[RS4GC] NFC cleanup of the DeferredReplacement class Instead of constructors use clearly named factory methods. llvm-svn: 265486	2016-04-05 23:18:53 +00:00
Sanjoy Das	49e974b33b	[RS4GC] Better codegen for deoptimize calls Don't emit a gc.result for a statepoint lowered from @llvm.experimental.deoptimize since the call into __llvm_deoptimize is effectively noreturn. Instead follow the corresponding gc.statepoint with an "unreachable". llvm-svn: 265485	2016-04-05 23:18:35 +00:00
Sanjay Patel	e77c7de459	use range loop; NFCI llvm-svn: 265360	2016-04-04 23:05:06 +00:00
Zia Ansari	a82a58a4e5	Enable unroll for constant bound loops when TripCount is not modulo of unroll factor, reducing it to maximum power-of-2 that satisfies threshold limit. Commit for Evgeny Stupachenko (evstupac@gmail.com) Differential Revision: http://reviews.llvm.org/D18290 llvm-svn: 265337	2016-04-04 19:24:46 +00:00
Sanjoy Das	021de058df	Introduce a @llvm.experimental.guard intrinsic Summary: As discussed on llvm-dev[1]. This change adds the basic boilerplate code around having this intrinsic in LLVM: - Changes in Intrinsics.td, and the IR Verifier - A lowering pass to lower @llvm.experimental.guard to normal control flow - Inliner support [1]: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095523.html Reviewers: reames, atrick, chandlerc, rnk, JosephTremoulet, echristo Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18527 llvm-svn: 264976	2016-03-31 00:18:46 +00:00
David Majnemer	5d518386b6	[IndVarSimplify] Don't insert after a catchswitch Widening a PHI requires us to insert a trunc. The logical place for this trunc is in the same BB as the PHI. This is not possible if the BB is terminated by a catchswitch. This fixes PR27133. llvm-svn: 264926	2016-03-30 21:12:06 +00:00
Adam Nemet	1428d41f9a	[LoopDataPrefetch] Centralize the tuning cl::opts under the pass This is effectively NFC, minus the renaming of the options (-cyclone-prefetch-distance -> -prefetch-distance). The change was requested by Tim in D17943. llvm-svn: 264806	2016-03-29 23:45:52 +00:00
Duncan P. N. Exon Smith	e8eb94a9a5	ADCE: Remove debug info intrinsics in dead scopes During ADCE, track which debug info scopes still have live references from the code, and delete debug info intrinsics for the dead ones. These intrinsics describe the locations of variables (in registers or stack slots). If there's no code left corresponding to a variable's scope, then there's no way to reference the variable in the debugger and it doesn't matter what its value is. I add a DEBUG printout when the described location in an SSA register, in case it helps some trying to track down why locations get lost. However, we still delete these; the scope itself isn't attached to any real code, so the ship has already sailed. llvm-svn: 264800	2016-03-29 22:57:12 +00:00
Adam Nemet	85fba39390	[LoopDataPrefetch] Make more member functions private, NFC. llvm-svn: 264798	2016-03-29 22:40:02 +00:00
Hyojin Sung	4673f10568	[SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loops' exit condition is evaluated only with volatile values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop which prevent later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test if the BB actually belongs to a set of loop headers and not eliminating it if yes. llvm-svn: 264697	2016-03-29 04:08:57 +00:00
Reid Kleckner	ba85781f58	Revert "[SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops" This reverts commit r264596. It does not compile. llvm-svn: 264604	2016-03-28 18:07:40 +00:00
Hyojin Sung	0ada5b0d14	[SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loops' exit condition is evaluated only with volatile values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop which prevent later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test if the BB actually belongs to a set of loop headers and not eliminating it if yes. llvm-svn: 264596	2016-03-28 17:22:25 +00:00
Hal Finkel	5c83a090bc	[SROA] Fix typo in comment llvm-svn: 264573	2016-03-28 11:23:21 +00:00
Hal Finkel	29f5131daf	C++11 is required, remove some preprocessor checks for it We require C++11 to build, so remove a few remaining preprocessor checks for '__cplusplus >= 201103L'. This should always be true. llvm-svn: 264572	2016-03-28 11:13:03 +00:00
Sanjoy Das	d4c783335b	[RS4GC] Lower calls to @llvm.experimental.deoptimize This changes RS4GC to lower calls to ``@llvm.experimental.deoptimize`` to gc.statepoints wrapping ``__llvm_deoptimize``, and changes ``callsGCLeafFunction`` to recognize ``@llvm.experimental.deoptimize`` as a non GC leaf function. I've had to hard code the ``"__llvm_deoptimize"`` name in RewriteStatepointsForGC; since ``TargetLibraryInfo`` is available only during codegen. This isn't without precedent in the codebase, so I'm not overtly concerned. llvm-svn: 264456	2016-03-25 20:12:13 +00:00
David L Kreitzer	8d441eb936	Enable non-power-of-2 #pragma unroll counts. Patch by Evgeny Stupachenko. Differential Revision: http://reviews.llvm.org/D18202 llvm-svn: 264407	2016-03-25 14:24:52 +00:00
David Majnemer	e09d035dad	[LoopStrengthReduce] Don't hoist into a catchswitch We try to hoist the insertion point as high as possible to encourage sharing. However, we must be careful not to hoist into a catchswitch as it is both an EHPad and a terminator. llvm-svn: 264344	2016-03-24 21:40:22 +00:00
Adam Nemet	7aba60c853	[LLE] Check for mismatching types between the store and the load earlier isDependenceDistanceOfOne asserts that the store and the load access through the same type. This function is also used by removeDependencesFromMultipleStores so we need to make sure we filter out mismatching types before reaching this point. Now we do this when the initial candidates are gathered. This is a refinement of the fix made in r262267. Fixes PR27048. llvm-svn: 264313	2016-03-24 17:59:26 +00:00
Zinovy Nis	07ac2bd4d0	[PATCH] Force LoopReroll to reset the loop trip count value after reroll. It's a bug fix. For rerolled loops SE trip count remains unchanged. It leads to incorrect work of the next passes. My patch just resets SE info for rerolled loop forcing SE to re-evaluate it next time it requested. I also added a verifier call in the exisitng test to be sure no invalid SE data remain. Without my fix this test would fail with -verify-scev. Differential Revision: http://reviews.llvm.org/D18316 llvm-svn: 264051	2016-03-22 13:50:57 +00:00
Adam Nemet	709e3046ee	[LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead Summary: It can hurt performance to prefetch ahead too much. Be conservative for now and don't prefetch ahead more than 3 iterations on Cyclone. Reviewers: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17949 llvm-svn: 263772	2016-03-18 00:27:43 +00:00
Adam Nemet	6d8beeca53	[LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher work up to 2KB strides. I am also adding tests for this and the previous change (D17943): * Cyclone prefetching accesses with a large stride * Cyclone not prefetching accesses with a small stride * Generic Aarch64 subtarget not prefetching either Reviewers: hfinkel Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17945 llvm-svn: 263771	2016-03-18 00:27:38 +00:00
Adam Nemet	5eccf07df3	[LoopVersioning] Annotate versioned loop with noalias metadata Summary: If we decide to version a loop to benefit a transformation, it makes sense to record the now non-aliasing accesses in the newly versioned loop. This allows non-aliasing information to be used by subsequent passes. One example is 456.hmmer in SPECint2006 where after loop distribution, we vectorize one of the newly distributed loops. To vectorize we version this loop to fully disambiguate may-aliasing accesses. If we add the noalias markers, we can use the same information in a later DSE pass to eliminate some dead stores which amounts to ~25% of the instructions of this hot memory-pipeline-bound loop. The overall performance improves by 18% on our ARM64. The scoped noalias annotation is added in LoopVersioning. The patch then enables this for loop distribution. A follow-on patch will enable it for the vectorizer. Eventually this should be run by default when versioning the loop but first I'd like to get some feedback whether my understanding and application of scoped noalias metadata is correct. Essentially my approach was to have a separate alias domain for each versioning of the loop. For example, if we first version in loop distribution and then in vectorization of the distributed loops, we have a different set of memchecks for each versioning. By keeping the scopes in different domains they can conveniently be defined independently since different alias domains don't affect each other. As written, I also have a separate domain for each loop. This is not necessary and we could save some metadata here by using the same domain across the different loops. I don't think it's a big deal either way. Probably the best is to review the tests first to see if I mapped this problem correctly to scoped noalias markers. I have plenty of comments in the tests. Note that the interface is prepared for the vectorizer which needs the annotateInstWithNoAlias API. The vectorizer does not use LoopVersioning so we need a way to pass in the versioned instructions. This is also why the maps have to become part of the object state. Also currently, we only have an AA-aware DSE after the vectorizer if we also run the LTO pipeline. Depending how widely this triggers we may want to schedule a DSE toward the end of the regular pass pipeline. Reviewers: hfinkel, nadav, ashutosh.nema Subscribers: mssimpso, aemerson, llvm-commits, mcrosier Differential Revision: http://reviews.llvm.org/D16712 llvm-svn: 263743	2016-03-17 20:32:32 +00:00
Sanjoy Das	c9058ca9e0	[Statepoints] Export a magic constant into a header; NFC llvm-svn: 263733	2016-03-17 18:42:17 +00:00
Sanjoy Das	312038872d	[Statepoints] Separate out logic for statepoint directives; NFC This splits out the logic that maps the `"statepoint-id"` attribute into the actual statepoint ID, and the `"statepoint-num-patch-bytes"` attribute into the number of patchable bytes the statpeoint is lowered into. The new home of this logic is in IR/Statepoint.cpp, and this refactoring will support similar functionality when lowering calls with deopt operand bundles in the future. llvm-svn: 263685	2016-03-17 01:56:10 +00:00
Geoff Berry	56fabf9b55	Revert "[LSR] Create fewer redundant instructions." This reverts commit r263644. Investigating bootstrap failures. llvm-svn: 263655	2016-03-16 19:21:47 +00:00
Geoff Berry	459b750871	[LSR] Create fewer redundant instructions. Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Reviewers: atrick Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18001 llvm-svn: 263644	2016-03-16 17:29:49 +00:00
Haicheng Wu	7873857a88	[JumpThreading] See through Cast Instructions To capture more jump-thread opportunity. llvm-svn: 263618	2016-03-16 04:52:52 +00:00
Haicheng Wu	64d9d7c3f7	Revert "[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()" Not sure it handles undef properly. llvm-svn: 263605	2016-03-15 23:38:47 +00:00
Justin Lebar	6827de19b2	[LoopUnroll] Respect the convergent attribute. Summary: Specifically, when we perform runtime loop unrolling of a loop that contains a convergent op, we can only unroll k times, where k divides the loop trip multiple. Without this change, we'll happily unroll e.g. the following loop for (int i = 0; i < N; ++i) { if (i == 0) convergent_op(); foo(); } into int i = 0; if (N % 2 == 1) { convergent_op(); foo(); ++i; } for (; i < N - 1; i += 2) { if (i == 0) convergent_op(); foo(); foo(); }. This is unsafe, because we've just added a control-flow dependency to the convergent op in the prelude. In general, runtime unrolling loops that contain convergent ops is safe only if we don't have emit a prelude, which occurs when the unroll count divides the trip multiple. Reviewers: resistor Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17526 llvm-svn: 263509	2016-03-14 23:15:34 +00:00
Amaury Sechet	bdb261b4c0	Imporove load to store => memcpy Summary: This now try to reorder instructions in order to help create the optimizable pattern. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer Differential Revision: http://reviews.llvm.org/D16523 llvm-svn: 263503	2016-03-14 22:52:27 +00:00
Chad Rosier	00bd82cade	[CVP] Replace nonnegative with positive, per Philip's request. NFC. llvm-svn: 263430	2016-03-14 13:48:00 +00:00
Haicheng Wu	d60ae33d29	[CVP] Convert an SDiv to a UDiv if both operands are known to be nonnegative The motivating example is this for (j = n; j > 1; j = i) { i = j / 2; } The signed division is safely to be changed to an unsigned division (j is known to be larger than 1 from the loop guard) and later turned into a single shift without considering the sign bit. llvm-svn: 263406	2016-03-14 03:24:28 +00:00
Mehdi Amini	ba9fba81d6	Remove PreserveNames template parameter from IRBuilder This reapplies r263258, which was reverted in r263321 because of issues on Clang side. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263393	2016-03-13 21:05:13 +00:00
Eric Christopher	35abd051c0	Temporarily revert: commit ae14bf6488e8441f0f6d74f00455555f6f3943ac Author: Mehdi Amini <mehdi.amini@apple.com> Date: Fri Mar 11 17:15:50 2016 +0000 Remove PreserveNames template parameter from IRBuilder Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258 91177308-0d34-0410-b5e6-96231b3b80d8 until we can figure out what to do about clang and Release build testing. This reverts commit 263258. llvm-svn: 263321	2016-03-12 01:47:22 +00:00
Mehdi Amini	99eab3dd06	Remove PreserveNames template parameter from IRBuilder Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263258	2016-03-11 17:15:50 +00:00
Mehdi Amini	1e9c925182	Do not specialize IRBuilder to strip names in SROA Summary: Following r263086, we are replacing this by a runtime check. More cleanup will follow on the IRBuilder itself, but I submitted this patch separately as SROA has a fancy "prefixInserter" class that needs extra-love. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18022 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263256	2016-03-11 17:15:34 +00:00
Chandler Carruth	ace8c8f765	[PM] Sink the "Expression" type for GVN into the class as a private member type. Because of how this type is used by the ValueTable, it cannot actually have hidden visibility. GCC actually nicely warns about this but Clang just silently ... I don't even know. =/ We should do a better job either way though. This should resolve a bunch of the GCC warnings about visibility that the port of GVN triggered and make the visibility story a bit more correct. llvm-svn: 263250	2016-03-11 16:25:19 +00:00
Chandler Carruth	3bc9c7fb45	[PM] The order of evaluation of these analyses is actually significant, much to my horror, so use variables to fix it in place. This terrifies me. Both basic-aa and memdep will provide more precise information when the domtree and/or the loop info is available. Because of this, if your pass (like GVN) requires domtree, and then queries memdep or basic-aa, it will get more precise results. If it does this in the other order, it gets less precise results. All of the ideas I have for fixing this are, essentially, terrible. Here I've just caused us to stop having unspecified behavior as different implementations evaluate the order of these arguments differently. I'm actually rather glad that they do, or the fragility of memdep and basic-aa would have gone on unnoticed. I've left comments so we don't immediately break this again. This should fix bots whose host compilers evaluate the order of arguments differently from Clang. llvm-svn: 263231	2016-03-11 13:26:47 +00:00
Chandler Carruth	b47f8010a9	[PM] Make the AnalysisManager parameter to run methods a reference. This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, many parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219	2016-03-11 11:05:24 +00:00
Chandler Carruth	89c45a162f	[PM] Port GVN to the new pass manager, wire it up, and teach a couple of tests to run GVN in both modes. This is mostly the boring refactoring just like SROA and other complex transformation passes. There is some trickiness in that GVN's ValueNumber class requires hand holding to get to compile cleanly. I'm open to suggestions about a better pattern there, but I tried several before settling on this. I was trying to balance my desire to sink as much implementation detail into the source file as possible without introducing overly many layers of abstraction. Much like with SROA, the design of this system is made somewhat more cumbersome by the need to support both pass managers without duplicating the significant state and logic of the pass. The same compromise is struck here. I've also left a FIXME in a doxygen comment as the GVN pass seems to have pretty woeful documentation within it. I'd like to submit this with the FIXME and let those more deeply familiar backfill the information here now that we have a nice place in an interface to put that kind of documentaiton. Differential Revision: http://reviews.llvm.org/D18019 llvm-svn: 263208	2016-03-11 08:50:55 +00:00
Adam Nemet	efb234135c	[LLE] Add missed LoopSimplify dependence The code assumed that we always had a preheader without making the pass dependent on LoopSimplify. Thanks to Mattias Eriksson V for reporting this. llvm-svn: 263173	2016-03-10 23:54:39 +00:00
Chandler Carruth	37f1f12226	[SROA] Fix PR25873, which Andrea Di Biagio analyzed the daylights out of, and I misdiagnosed for months and months. Andrea has had a patch for this forever, but I just couldn't see how it was fixing the root cause of the problem. It didn't make sense to me, even though the patch was perfectly good and the analysis of the actual failure event was fantastic. Well, I came back to it today because the patch has sat for far too long and needs attention and decided I wouldn't let it go until I really understood what was going on. After quite some time in the debugger, I finally realized that in fact I had just missed an important case with my previous attempt to fix PR22093 in r225149. Not only do we need to handle loads that won't be split, but stores-of-loads that we won't split. We do actually have enough logic in the presplitting to form new slices for split stores.... unless we decided not to split them! I'm so sorry that it took me this long to come to the realization that this is the issue. It seems so obvious in hind sight (of course). Anyways, the fix becomes much smaller and more focused. The fact that we're left doing integer smashing is related to the FIXME in my original commit: fundamentally, we're not aggressive about pre-splitting for loads and stores to the same alloca. If we want to get aggressive about this, it'll need both what Andrea had put into the proposed fix, but also a lot more logic to essentially iteratively pre-split the alloca until we can't do any more. As I said in that commit log, its really unclear that this is the right call. Instead, the integer blending and letting targets lower this to narrower stores seems slightly better. But we definitely shouldn't really go down that path just to fix this bug. Again, tons of thanks are owed to Andrea and others at Sony for working on this bug. I really should have seen what was going on here and re-directed them sooner. =//// llvm-svn: 263121	2016-03-10 15:31:17 +00:00
Chandler Carruth	d94a5962cc	[SROA] Clean up some really weird code, no functionality changed. We already have the instruction extracted into 'I', just cast that to a store the way we do for loads. Also, we don't enter the if unless SI is non-null, so don't test it again for null. I'm pretty sure the entire test there can be nuked, but this is just the trivial cleanup. llvm-svn: 263112	2016-03-10 14:16:18 +00:00
Chandler Carruth	7776377e62	[gvn] Fix more indenting and formatting in regions of code that will need to be changed for porting to the new pass manager. Also sink the comment on the ValueTable class back to that class instead of it dangling on an anonymous namespace. No functionality changed. llvm-svn: 263084	2016-03-10 00:58:20 +00:00
Chandler Carruth	169c84f1cc	[gvn] Reformat a chunk of the GVN code that is strangely indented prior to restructuring it for porting to the new pass manager. No functionality changed. llvm-svn: 263083	2016-03-10 00:58:18 +00:00
Chandler Carruth	61440d225b	[PM] Port memdep to the new pass manager. This is a fairly straightforward port to the new pass manager with one exception. It removes a very questionable use of releaseMemory() in the old pass to invalidate its caches between runs on a function. I don't think this is really guaranteed to be safe. I've just used the more direct port to the new PM to address this by nuking the results object each time the pass runs. While this could cause some minor malloc traffic increase, I don't expect the compile time performance hit to be noticable, and it makes the correctness and other aspects of the pass much easier to reason about. In some cases, it may make things faster by making the sets and maps smaller with better locality. Indeed, the measurements collected by Bruno (thanks!!!) show mostly compile time improvements. There is sadly very limited testing at this point as there are only two tests of memdep, and both rely on GVN. I'll be porting GVN next and that will exercise this heavily though. Differential Revision: http://reviews.llvm.org/D17962 llvm-svn: 263082	2016-03-10 00:55:30 +00:00
Philip Reames	e0a5454df4	Fix the build I screwed up rebasing 263072. This change fixes the build and passes all make check. llvm-svn: 263073	2016-03-09 23:07:53 +00:00
Philip Reames	b54c8e6eea	[LICM] Store promotion when memory is thread local This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provable thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us perform store promotion even in cases where the store is not guaranteed to execute in the loop. Two key assumption worth drawing out is that this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped. In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem. Differential Revision: http://reviews.llvm.org/D16783 llvm-svn: 263072	2016-03-09 22:59:30 +00:00
Adam Nemet	660748ca8c	[LLE] Add missing check for unit stride I somehow missed this. The case in GCC (global_alloc) was similar to the new testcase except it had an array of structs rather than a two dimensional array. Fixes RP26885. llvm-svn: 263058	2016-03-09 20:47:55 +00:00
Adam Nemet	34785ecff1	[LoopDataPrefetch] Add stats and debug output llvm-svn: 262998	2016-03-09 05:33:21 +00:00
Sanjoy Das	2eac48de9e	Return StringRef instead of a naked char*; NFC llvm-svn: 262989	2016-03-09 02:34:19 +00:00
Sanjoy Das	f13900f8ac	[IRCE] Reflow comments; NFC llvm-svn: 262988	2016-03-09 02:34:15 +00:00
Sanjay Patel	f831fdb56a	fix variable name; NFC llvm-svn: 262953	2016-03-08 19:07:42 +00:00
Sanjay Patel	5c96723622	use range-based loop; NFCI llvm-svn: 262952	2016-03-08 19:06:12 +00:00
Adam Nemet	bb3680bd85	[LoopDataPrefetch] If prefetch distance is not set, skip pass This lets select sub-targets enable this pass. The patch implements the idea from the recent llvm-dev thread: http://thread.gmane.org/gmane.comp.compilers.llvm.devel/94925 The goal is to enable the LoopDataPrefetch pass for the Cyclone sub-target only within Aarch64. Positive and negative tests will be included in an upcoming patch that enables selective prefetching of large-strided accesses on Cyclone. llvm-svn: 262844	2016-03-07 18:35:42 +00:00
Adam Nemet	efc091f457	[LLE] Fix a comment llvm-svn: 262270	2016-02-29 23:21:12 +00:00
Adam Nemet	83be06e529	[LLE] Fix SingleSource/Benchmarks/Polybench/stencils/jacobi-2d-imper with Polly We can actually have dependences between accesses with different underlying types. Bail in this case. A test will follow shortly. llvm-svn: 262267	2016-02-29 22:53:59 +00:00
Chandler Carruth	ad8cb382fa	[LICM] Teach LICM how to handle cases where the alias set tracker was merged into a loop that was subsequently unrolled (or otherwise nuked). In this case it can't merge in the ASTs for any remaining nested loops, it needs to re-add their instructions dircetly. The fix is very isolated, but I've pulled the code for merging blocks into the AST into a single place in the process. The only behavior change is in the case which would have crashed before. This fixes a crash reported by Mikael Holmen on the list after r261316 restored much of the loop pass pipelining and allowed us to actually do this kind of nested transformation sequenc. I've taken that test case and further reduced it into the somewhat twisty maze of loops in the included test case. This does in fact trigger the bug even in this reduced form. llvm-svn: 262108	2016-02-27 04:34:07 +00:00
Haicheng Wu	5539f852ae	[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors() This change tries to find more opportunities to thread over basic blocks. llvm-svn: 261981	2016-02-26 06:06:04 +00:00
Michael Zolotukhin	9f520ebc54	[LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating. Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect. Reviewers: chandlerc, hfinkel Subscribers: sanjoy, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17632 llvm-svn: 261958	2016-02-26 02:57:05 +00:00
Adam Nemet	fb31d580ea	[LoopDataPrefetch] Make it testable with opt Summary: Since this is an IR pass it's nice to be able to write tests without llc. This is the counterpart of the llc test under CodeGen/PowerPC/loop-data-prefetch.ll. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17464 llvm-svn: 261578	2016-02-22 21:41:22 +00:00
Philip Reames	ce38c2ddf6	[RS4GC] "Constant fold" the rs4gc-split-vector-values flag This flag was part of a migration to a new means of handling vectors-of-points which was described in the llvm-dev thread "FYI: Relocating vector of pointers". The old code path has been off by default for a while without complaints, so time to cleanup. llvm-svn: 261569	2016-02-22 21:01:28 +00:00
Philip Reames	79fa9b75c0	[RS4GC] Revert optimization attempt due to memory corruption This change reverts "246133 [RewriteStatepointsForGC] Reduce the number of new instructions for base pointers" and a follow on bugfix 12575. As pointed out in pr25846, this code suffers from a memory corruption bug. Since I'm (empirically) not going to get back to this any time soon, simply reverting the problematic change is the right answer. llvm-svn: 261565	2016-02-22 20:45:56 +00:00
Elena Demikhovsky	9914dbd11b	Allow setting MaxRerollIterations above 16 By Ayal Zaks. Differential Revision http://reviews.llvm.org/D17258 llvm-svn: 261517	2016-02-22 09:38:28 +00:00
Duncan P. N. Exon Smith	e9bc579c37	ADT: Remove == and != comparisons between ilist iterators and pointers I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498	2016-02-21 20:39:50 +00:00
Sanjoy Das	979a11d5b2	[LoopDeletion] Add an assert that verifies LCSSA This is inspired by PR24804 -- had this assert been there before, isolating the root cause for PR24804 would have been far easier. llvm-svn: 261481	2016-02-21 17:11:59 +00:00
Chandler Carruth	31088a9d58	[LPM] Factor all of the loop analysis usage updates into a common helper routine. We were getting this wrong in small ways and generally being very inconsistent about it across loop passes. Instead, let's have a common place where we do this. One minor downside is that this will require some analyses like SCEV in more places than they are strictly needed. However, this seems benign as these analyses are complete no-ops, and without this consistency we can in many cases end up with the legacy pass manager scheduling deciding to split up a loop pass pipeline in order to run the function analysis half-way through. It is very, very annoying to fix these without just being very pedantic across the board. The only loop passes I've not updated here are ones that use AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer. They seemed less relevant. With this patch, almost all of the problems in PR24804 around loop pass pipelines are fixed. The one remaining issue is that we run simplify-cfg and instcombine in the middle of the loop pass pipeline. We've recently added some loop variants of these passes that would seem substantially cleaner to use, but this at least gets us much closer to the previous state. Notably, the seven loop pass managers is down to three. I've not updated the loop passes using LoopAccessAnalysis because that analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't clear that those transforms want to support those forms anyways. They all run late anyways, so this is harmless. Similarly, LSR is left alone because it already carefully manages its forms and doesn't need to get fused into a single loop pass manager with a bunch of other loop passes. LoopReroll didn't use loop simplified form previously, and I've updated the test case to match the trivially different output. Finally, I've also factored all the pass initialization for the passes that use this technique as well, so that should be done regularly and reliably. Thanks to James for the help reviewing and thinking about this stuff, and Ben for help thinking about it as well! Differential Revision: http://reviews.llvm.org/D17435 llvm-svn: 261316	2016-02-19 10:45:18 +00:00
Lawrence Hu	84e6f1dd70	Bug fix: use dyn_cast_or_null instead of dyn_cast Differential Revision: http://reviews.llvm.org/D17154 llvm-svn: 261299	2016-02-19 02:17:07 +00:00
Richard Trieu	7a08381403	Remove uses of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270	2016-02-18 22:09:30 +00:00
Adam Nemet	9d9cb274ea	[PPCLoopDataPrefetch] Move pass to Transforms/Scalar/LoopDataPrefetch. NFC This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). Obviously the pass still only used from PPC at this point. Subsequent patches will start driving this from ARM64 as well. Due to the previous patch most lines should show up as moved lines. llvm-svn: 261265	2016-02-18 21:38:19 +00:00
Junmo Park	80440eb804	Minor code cleanup. NFC. llvm-svn: 261200	2016-02-18 10:09:20 +00:00
Haicheng Wu	57e1a3e6ee	[LIR] Avoid turning non-temporal stores into memset This is to fix PR26645. llvm-svn: 261149	2016-02-17 21:00:06 +00:00
Roman Gareev	036c08874a	Tweak the LICM code to reuse the first sub-loop instead of creating a new one LICM starts with an empty AST, and then merges in each sub-loop. While the add code is appropriate for sub-loop 2 and up, it's utterly unnecessary for sub-loop 1. If the AST starts off empty, we can just clone/move the contents of the subloop into the containing AST. Reviewed-by: Philip Reames <listmail@philipreames.com> Differential Revision: http://reviews.llvm.org/D16753 llvm-svn: 260892	2016-02-15 14:48:50 +00:00
Chad Rosier	81362a8599	[LIR] Allow merging of memsets in negatively strided loops. Last part of PR25166. llvm-svn: 260732	2016-02-12 21:03:23 +00:00
Justin Lebar	df04d2a1f1	[LoopRotate] Don't perform loop rotation if the loop header calls a convergent function. Summary: Calls to convergent functions can be duplicated, but only if the duplicates are not control-flow dependent on any additional values. Loop rotation doesn't meet the bar. Reviewers: jingyue Subscribers: mzolotukhin, llvm-commits, arsenm, joker.eph, resistor, tra, hfinkel, broune Differential Revision: http://reviews.llvm.org/D17127 llvm-svn: 260729	2016-02-12 21:01:33 +00:00
David Majnemer	01674939b2	Remove unused variable llvm-svn: 260722	2016-02-12 20:33:51 +00:00
Philip Reames	96fccc2d09	[GVN] Common code for local and non-local load availability [NFCI] The attached patch removes all of the block local code for performing X-load forwarding by reusing the code used in the non-local case. The motivation here is to remove duplication and in the process increase our test coverage of some fairly tricky code. I have some upcoming changes I'll be proposing in this area and wanted to have the code cleaned up a bit first. Note: The review for this mostly happened in email which didn't make it to phabricator on the 258882 commit thread. Differential Revision: http://reviews.llvm.org/D16608 llvm-svn: 260711	2016-02-12 19:24:57 +00:00
Chad Rosier	4acff96646	[LIR] Partially revert r252926(NFC), which introduced a very subtle change. In short, before r252926 we were comparing an unsigned (StoreSize) against an a APInt (Stride), which is fine and well. After we were zero extending the Stride and then converting to an unsigned, which is not the same thing. Obviously, Stides can also be negative. This commit just restores the original behavior. AFAICT, it's not possible to write a test case to expose the issue because the code already has checks to make sure the StoreSize can't overflow an unsigned (which prevents the Stride from overflowing an unsigned as well). llvm-svn: 260706	2016-02-12 19:05:27 +00:00
Tamas Berghammer	d12f31535a	Fix MSVC 2013 build after rL260504 llvm-svn: 260511	2016-02-11 11:27:51 +00:00
Ashutosh Nema	2260a3a046	Fixed typo in comment & coding style for LoopVersioningLICM. llvm-svn: 260504	2016-02-11 09:23:53 +00:00
Philip Reames	d59638e14f	Follow up to 260439, Speculative fix to clang builders It looks like clang has a couple of test cases which caught the fact LVI was not slightly more precise after 260439. When looking at the failures, it struck me as wasteful to be querying nullness of a constant via LVI, so instead of tweaking the clang tests, let's just stop querying constants from this source. llvm-svn: 260451	2016-02-10 22:22:41 +00:00
Tom Stellard	6fa4e290d7	StructurizeCFG: Initialize SkipUniformRegions in the default constructor This should fix some random bot failures caused by r260336. llvm-svn: 260342	2016-02-10 01:10:09 +00:00
Tom Stellard	755a4e6b57	StructurizeCFG: Add an option for skipping regions with only uniform branches Summary: Tests for this will be added once the AMDGPU backend enables this option. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16602 llvm-svn: 260336	2016-02-10 00:39:37 +00:00
Michael Zolotukhin	1da4afdfc9	Factor out UnrollAnalyzer to Analysis, and add unit tests for it. Summary: Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows to use unit tests. The change itself is supposed to be NFC, except adding a couple of tests. I plan to add more tests as I add new functionality and find/fix bugs. Reviewers: chandlerc, hfinkel, sanjoy Subscribers: zzheng, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16623 llvm-svn: 260169	2016-02-08 23:03:59 +00:00
Haicheng Wu	b35f772b90	[JumpThreading] Change a return of ComputeValueKnownInPredecessors() Change a return statement of ComputeValueKnownInPredecessors() to be the same as the rest return statements of the function. Otherwise, it might return true with an empty Result when the current basic block has no predecessors and trigger the first assert of JumpThreading::ProcessThreadableEdges(). llvm-svn: 260110	2016-02-08 17:00:39 +00:00
Ashutosh Nema	df6763abe8	New Loop Versioning LICM Pass Summary: When alias analysis is uncertain about the aliasing between any two accesses, it will return MayAlias. This uncertainty from alias analysis restricts LICM from proceeding further. In cases where alias analysis is uncertain we might use loop versioning as an alternative. Loop Versioning will create a version of the loop with aggressive aliasing assumptions in addition to the original with conservative (default) aliasing assumptions. The version of the loop making aggressive aliasing assumptions will have all the memory accesses marked as no-alias. These two versions of loop will be preceded by a memory runtime check. This runtime check consists of bound checks for all unique memory accessed in loop, and it ensures the lack of memory aliasing. The result of the runtime check determines which of the loop versions is executed: If the runtime check detects any memory aliasing, then the original loop is executed. Otherwise, the version with aggressive aliasing assumptions is used. The pass is off by default and can be enabled with command line option -enable-loop-versioning-licm. Reviewers: hfinkel, anemet, chatur01, reames Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga, llvm-commits Differential Revision: http://reviews.llvm.org/D9151 llvm-svn: 259986	2016-02-06 07:47:48 +00:00
Joseph Tremoulet	adc2376375	[RS4GC] Pass DenseMap by reference, NFC Summary: Passing the rematerialized values map to insertRematerializationStores by value looks to be a simple oversight; update it to pass by reference. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16911 llvm-svn: 259867	2016-02-05 01:42:52 +00:00
Adam Nemet	9455c1d2b1	[LoopLoadElim] Don't allow versioning when optForSize This was requested in the review of D16300. llvm-svn: 259861	2016-02-05 01:14:05 +00:00
David Majnemer	a53b5bbb18	[LoopStrengthReduce] Don't rewrite PHIs with incoming values from CatchSwitches Bail out if we have a PHI on an EHPad that gets a value from a CatchSwitchInst. Because the CatchSwitchInst cannot be split, there is no good place to stick any instructions. This fixes PR26373. llvm-svn: 259702	2016-02-03 21:30:34 +00:00
Adam Nemet	d52ed84160	[LoopVersioning] Expose loop versioning as a pass too Summary: LoopVersioning is a transform utility that transform passes can use to run-time disambiguate may-aliasing accesses. I'd like to also expose as pass to allow it to be unit-tested. I am planning to add support for non-aliasing annotation in LoopVersioning and I'd like to be able to write tests directly using this pass. (After that feature is done, the pass could also be used to look for optimization opportunities that are hidden behind incomplete alias information at compile time.) The pass drives LoopVersioning in its default way which is to fully disambiguate may-aliasing accesses no matter how many checks are required. Reviewers: hfinkel, ashutosh.nema, sbaranga Subscribers: zzheng, mssimpso, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D16612 llvm-svn: 259610	2016-02-03 00:06:10 +00:00
Matthias Braun	b30f2f5141	Avoid overly large SmallPtrSet/SmallSet These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32. llvm-svn: 259283	2016-01-30 01:24:31 +00:00
Fiona Glaser	36e8230db0	Fix typo in LoopSimplifyCFG llvm-svn: 259261	2016-01-29 23:12:52 +00:00
Fiona Glaser	b417d464e6	Add LoopSimplifyCFG pass Loop transformations can sometimes fail because the loop, while in valid rotated LCSSA form, is not in a canonical CFG form. This is an extremely simple pass that just merges obviously redundant blocks, which can be used to fix some known failure cases. In the future, it may be enhanced with more cases (and have code shared with SimplifyCFG). This allows us to run LoopSimplifyCFG -> LoopRotate -> LoopUnroll, so that SimplifyCFG cleans up the loop before Rotate tries to run. Not currently used in the pass manager, since this pass doesn't do anything unless you can hook it up in an LPM with other loop passes. It'll be added once Chandler cleans up things to allow this. Tested in a custom pipeline out of tree to confirm it works in practice (in addition to the included trivial test). llvm-svn: 259256	2016-01-29 22:35:36 +00:00
David Majnemer	75f492e7f1	Fix the build llvm-svn: 259215	2016-01-29 17:46:57 +00:00
Sanjoy Das	c816f03b70	[RS4GC] Address post-commit review on r259208 from David NFC llvm-svn: 259211	2016-01-29 17:20:49 +00:00
Sanjoy Das	565f7866ac	[RS4GC] Remove unnecessary const_cast; NFC GCRelocateInst::getDerivedPtr already returns a non-const llvm::Value pointer. llvm-svn: 259209	2016-01-29 16:54:49 +00:00
Sanjoy Das	3794eeb8bb	[RS4GC] Minor local cleanup to StabilizeOrder; NFC - Locally declare struct, and call it BaseDerivedPair - Use a lambda to compare, instead of a singleton with uninitialized fields - Add a constructor to BaseDerivedPair and use SmallVector::emplace_back llvm-svn: 259208	2016-01-29 16:50:34 +00:00
Philip Reames	10e678d25a	[GVN] Add clarifying assert [NFCI] Just adding an assert which makes invariants between AnalyzeLoadsFromClobberingLoads and GetLoadValueForLoad slightly more clear. llvm-svn: 259145	2016-01-29 02:23:10 +00:00
Sanjoy Das	bcf27523f5	[RS4GC] Minor cleanups enabled by the previous change; NFC llvm-svn: 259133	2016-01-29 01:03:20 +00:00
Sanjoy Das	4099297856	[RS4GC] Delete code that is dead due to r259129; NFC llvm-svn: 259132	2016-01-29 01:03:17 +00:00
Sanjoy Das	0407108020	[RS4GC] Clamp UseDeoptBundles to true and update tests The full diff for the test directory may be hard to read because of the filename clash; so here's all that happened as far as the tests are concerned: ``` cd test/Transforms/RewriteStatepointsForGC git rm ll git mv deopt-bundles/ ./ rmdir deopt-bundles find . -name '*.ll' \| xargs gsed -i 's/-rs4gc-use-deopt-bundles //g' ``` llvm-svn: 259129	2016-01-29 00:28:57 +00:00
Sanjoy Das	bb04f6e28f	[PlaceSafepoints] Use DEBUG() instead of TraceLSP DEBUG() is the more idiomatic LLVM style. llvm-svn: 259121	2016-01-28 23:49:27 +00:00
Sanjoy Das	cd23fec756	[PlaceSafepoints] Misc. minor cleanups; NFC These changes are aimed at bringing PlaceSafepoints up to code with the LLVM coding guidelines: - Fix variable naming - Use DenseSet instead of std::set - Remove dead code - Minor local code simplifications llvm-svn: 259112	2016-01-28 23:03:19 +00:00
Sanjoy Das	360a4e4ee2	[PlaceSafepoints] Remvoe unused headers, and sort #includes; NFC llvm-svn: 259111	2016-01-28 23:03:17 +00:00
Sanjoy Das	12673765cf	[PlaceSafepoints] Eliminate dead code; NFC Now that NoStatepoints is a constant `true`, we can get rid of a bunch of dead code. llvm-svn: 259110	2016-01-28 23:03:14 +00:00
Sanjoy Das	f7302c8baf	[PlaceSafepoints] Clamp NoStatepoints to true This change permanently clamps -spp-no-statepoints to true (the code deletion will come later). Tests that specifically tested PlaceSafepoint's ability to wrap calls in gc.statepoint have been moved to RS4GC's test suite. llvm-svn: 259096	2016-01-28 21:51:14 +00:00
Sanjoy Das	7a2e2bed67	[LICM] Keep metadata on control equivalent hoists Summary: If the instruction we're hoisting out of a loop into its preheader is guaranteed to have executed in the loop, then the metadata associated with the instruction (e.g. !range or !dereferenceable) is valid in the preheader. This is because once we're in the preheader, we know we're eventually going to reach the location the metadata was valid at. This change makes LICM smarter around this, and helps it recognize cases like these: ``` do { int a = ptr; !range !0 ... } while (i++ < N); ``` to ``` int a = ptr; !range !0 do { ... } while (i++ < N); ``` Earlier we'd drop the `!range` metadata after hoisting the load from `ptr`. Reviewers: igor-laevsky Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16669 llvm-svn: 259053	2016-01-28 15:51:58 +00:00
Sanjoy Das	cddde58f1c	[IndVars] Hoist DataLayout load out of loop; NFC llvm-svn: 258946	2016-01-27 17:05:09 +00:00
Sanjoy Das	2f7a7447c2	[IndVars] Use isSCEVable; NFC llvm-svn: 258945	2016-01-27 17:05:06 +00:00
Sanjoy Das	8fdf87c338	[IndVars] Use range-for; NFC llvm-svn: 258944	2016-01-27 17:05:03 +00:00
Chen Li	5cde8389cf	[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader Summary: This is a revised version of D13974, and the following quoted summary are from D13974 "This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch." D13974 was committed but failed one lnt test. The bug was that we only checked the condition from loop exit's incoming block was a loop invariant. But there could be another condition from loop header to that incoming block not being a loop invariant. This would produce miscompiled code. This patch fixes the issue by checking if the incoming block is loop header, and if not, don't perform the rewrite. The could be further improved by recursively checking all conditions leading to loop exit block, but I'd like to check in this simple version first and improve it with future patches. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16570 llvm-svn: 258912	2016-01-27 07:40:41 +00:00
Philip Reames	8e785a4ec0	[GVN] Split AvailableValueInBlock into two parts [NFC] AvailableValue is the part that represents the potential rematerialization. AvailableValueInBlock is simply a pair of an AvailableValue and a BB which we might materialize it in. This is motivated by http://reviews.llvm.org/D16608. The intent is that we'll have a single function which handles the local case which both local and non-local will use to identify available values. Once that's done, the local case can rematerialize at the use site and the non-local case can do the SSA construction as it does currently. llvm-svn: 258882	2016-01-26 23:43:16 +00:00
Chris Bieneman	e49730d4ba	Remove autoconf support Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16471 llvm-svn: 258861	2016-01-26 21:29:08 +00:00
Aditya Nandakumar	3d0c46d489	Reassociate: Reprocess RedoInsts after each inst Previously the RedoInsts was processed at the end of the block. However it was possible that it left behind some instructions that were not canonicalized. This should guarantee that any previous instruction in the basic block is canonicalized before we process a new instruction. llvm-svn: 258830	2016-01-26 18:42:36 +00:00
Haicheng Wu	f1c00a22be	[LIR] Add support for structs and hand unrolled loops This is a recommit of r258620 which causes PR26293. The original message: Now LIR can turn following codes into memset: typedef struct foo { int a; int b; } foo_t; void bar(foo_t f, unsigned n) { for (unsigned i = 0; i < n; ++i) { f[i].a = 0; f[i].b = 0; } } void test(foo_t f, unsigned n) { for (unsigned i = 0; i < n; i += 2) { f[i] = 0; f[i+1] = 0; } } llvm-svn: 258777	2016-01-26 02:27:47 +00:00
Philip Reames	273dcb0d82	[GVN] Rearrange code to make local vs non-local cases more obvious [NFCI] llvm-svn: 258747	2016-01-25 23:37:53 +00:00
Philip Reames	10a50b188e	[GVN] Factor out common code [NFCI] We had the same code duplicated for each type of Def. We also have the entire block duplicated between the local and non-local case, but let's start with local cleanup. llvm-svn: 258740	2016-01-25 23:19:12 +00:00
Lawrence Hu	d3d51061fb	Enable loopreroll to rerool loop with pointer induction variable. Example: while (buf !=end ) { S += buf[0]; S += buf[1]; buf +=2; }; Differential Revision: http://reviews.llvm.org/D13151 llvm-svn: 258709	2016-01-25 19:43:45 +00:00
Lawrence Hu	b917cd9fa6	Undo commit 258700 due to missing commit message llvm-svn: 258708	2016-01-25 19:36:30 +00:00
Quentin Colombet	a392810bea	Speculatively revert r258620 as it is the likely culprid of PR26293. llvm-svn: 258703	2016-01-25 19:12:49 +00:00
Lawrence Hu	84b6195e41	Differential Revision: http://reviews.llvm.org/D13151 llvm-svn: 258700	2016-01-25 18:53:39 +00:00
David Majnemer	eec878574e	Fix build bot breakage llvm-svn: 258661	2016-01-24 16:46:53 +00:00
David Majnemer	dcd6c79d55	Fix buildbot failures llvm-svn: 258655	2016-01-24 06:40:37 +00:00
David Majnemer	88542a0a69	[SCCP] Remove duplicate code SCCP has code identical to changeToUnreachable's behavior, switch it over to just call changeToUnreachable. No functionality change intended. llvm-svn: 258654	2016-01-24 06:26:47 +00:00
David Majnemer	35c46d3e0b	[InstCombine, SCCP] Consolidate code used to remove instructions InstCombine and SCCP both want to remove dead code in a very particular way but using identical means to do so. Share the code between the two. No functionality change is intended. llvm-svn: 258653	2016-01-24 05:26:18 +00:00
Haicheng Wu	dd5e9d2159	[LIR] Add support for structs and hand unrolled loops Now LIR can turn following codes into memset: typedef struct foo { int a; int b; } foo_t; void bar(foo_t f, unsigned n) { for (unsigned i = 0; i < n; ++i) { f[i].a = 0; f[i].b = 0; } } void test(foo_t f, unsigned n) { for (unsigned i = 0; i < n; i += 2) { f[i] = 0; f[i+1] = 0; } } llvm-svn: 258620	2016-01-23 06:52:41 +00:00
Sanjoy Das	95639746e5	[PlaceSafepoints] Introduce a -spp-no-statepoints flag Summary: This change adds a `-spp-no-statepoints` flag to PlaceSafepoints that bypasses the code that wraps newly introduced polls and existing calls in gc.statepoint. With `-spp-no-statepoints` enabled, PlaceSafepoints effectively becomes a safpeoint poll insertion pass. The eventual goal is to "constant fold" this option, along with `-rs4gc-use-deopt-bundles` to `true`, once clients using gc.statepoint are okay doing so. Reviewers: pgavlin, reames, JosephTremoulet Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16439 llvm-svn: 258551	2016-01-22 21:02:55 +00:00
Sanjoy Das	acc43d197d	[RS4GC] Use OB_deopt instead of "deopt" llvm-svn: 258529	2016-01-22 19:20:40 +00:00
Eduard Burtescu	68e7f49f8e	[opaque pointer types] [NFC] DataLayout::getIndexedOffset: take source element type instead of pointer type and rename to getIndexedOffsetInType. Summary: Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16282 llvm-svn: 258478	2016-01-22 03:08:27 +00:00
Eduard Burtescu	e2a6917849	[opaque pointer types] [NFC] FindAvailableLoadedValue: take LoadInst instead of just the pointer. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16422 llvm-svn: 258477	2016-01-22 01:51:51 +00:00
Eduard Burtescu	1423921a24	[opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16418 llvm-svn: 258472	2016-01-22 01:17:26 +00:00
David L Kreitzer	4d7257dfa1	Fix for two constant propagation problems in GVN with the assume intrinsic instruction. Patch by Yuanrui Zhang. Differential Revision: http://reviews.llvm.org/D16100 llvm-svn: 258435	2016-01-21 21:32:35 +00:00
Sanjoy Das	a34ce95b60	Add a "gc-transition" operand bundle Summary: This adds a new kind of operand bundle to LLVM denoted by the `"gc-transition"` tag. Inputs to `"gc-transition"` operand bundle are lowered into the "transition args" section of `gc.statepoint` by `RewriteStatepointsForGC`. This removes the last bit of functionality that was unsupported in the deopt bundle based code path in `RewriteStatepointsForGC`. Reviewers: pgavlin, JosephTremoulet, reames Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16342 llvm-svn: 258338	2016-01-20 19:50:25 +00:00
Eduard Burtescu	19eb03106d	[opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. Summary: GEPOperator: provide getResultElementType alongside getSourceElementType. This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has. GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16275 llvm-svn: 258145	2016-01-19 17:28:00 +00:00
Philip Reames	b336bca07e	[GC] Lower vectors-of-pointers directly by default This commit changes the default on our lowering of vectors-of-pointers from splitting in RS4GC to reporting them in the final stack map. All of the changes to do so are already in place and tested. Assuming no problems are unearthed in the next week, we will be deleting the old code entirely next Monday. llvm-svn: 258111	2016-01-19 04:18:24 +00:00
Eduard Burtescu	6007e0dd02	Revert assert added in rL258028 as the alloca and OtherPtr types may differ in address space. llvm-svn: 258029	2016-01-18 00:20:34 +00:00
Eduard Burtescu	90c4449128	[opaque pointer types] Alloca: use getAllocatedType() instead of getType()->getPointerElementType(). Reviewers: mjacob Subscribers: llvm-commits, dblaikie Differential Revision: http://reviews.llvm.org/D16272 llvm-svn: 258028	2016-01-18 00:10:01 +00:00
Sanjoy Das	de47590589	[IndVars] Fix PR25576 `LCSSASafePhiForRAUW` as computed was incorrect -- in cases like these (this exact example does not actually trigger the bug): define i32 @f(i32 %n, i1* %c) { entry: br label %outer.loop outer.loop: br label %inner.loop inner.loop: %iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ] %iv.inc = add nuw nsw i32 %iv, 1 %tc = udiv i32 %n, 13 %be.cond = icmp ult i32 %iv, %tc br i1 %be.cond, label %inner.loop, label %inner.exit inner.exit: %iv.lcssa = phi i32 [ %iv, %inner.loop ] %outer.be.cond = load volatile i1, i1* %c br i1 %outer.be.cond, label %outer.loop, label %leave leave: %iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ] ret i32 %iv.lcssa.lcssa } `LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit value of `%iv` for `%inner.loop` to `%tc` (this can happen due to `SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA. To fix this, instead of computing `SafePhi` with special logic, decide the safety of RAUW directly via `replacementPreservesLCSSAForm`. llvm-svn: 258016	2016-01-17 18:12:52 +00:00
Sanjoy Das	7a8a705c9d	[IndVars] Use emplace_back; NFC llvm-svn: 258015	2016-01-17 18:12:48 +00:00
Artur Pilipenko	aba8fdc480	Fix buildbot failure introduced by 258010. Remove local variables became unused. llvm-svn: 258011	2016-01-17 12:59:40 +00:00
Artur Pilipenko	f84dc06e5b	Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionally Reviewed By: reames Differential Revision: http://reviews.llvm.org/D16226 llvm-svn: 258010	2016-01-17 12:35:29 +00:00
Manuel Jacob	5f6eaac611	GlobalValue: use getValueType() instead of getType()->getPointerElementType(). Reviewers: mjacob Subscribers: jholewinski, arsenm, dsanders, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16260 llvm-svn: 257999	2016-01-16 20:30:46 +00:00
Justin Bogner	fd757648a4	PM: Fix an inverted condition in simplifyFunctionCFG I mentioned the issue here in code review way back in September and was sure we'd fixed it, but apparently we forgot: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150921/301850.html In any case, as soon as you try to use this pass in anything but the most basic pipeline everything falls apart. Fix the condition. llvm-svn: 257935	2016-01-15 21:21:39 +00:00
Artur Pilipenko	6dd6969cee	Change isSafeToLoadUnconditionally arguments order. Separated from http://reviews.llvm.org/D10920 . llvm-svn: 257894	2016-01-15 15:27:46 +00:00
Keno Fischer	d5354fdddb	[SROA] Also insert a bit piece expression if only one piece is needed Summary: If SROA creates only one piece (e.g. because the other is not needed), it still needs to create a bit_piece expression if that bit piece is smaller than the original size of the alloca. Reviewers: aprantl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16187 llvm-svn: 257795	2016-01-14 20:06:34 +00:00
Sanjay Patel	9913322327	move return variable declarations down to where they are actually used; NFCI llvm-svn: 257700	2016-01-13 23:01:57 +00:00
Justin Bogner	b8d82abb78	LoopUnroll: Move the actual unrolling logic to a standalone function. NFC This is pure code motion - break the actual work out of runOnLoop into a reusable standalone function. llvm-svn: 257445	2016-01-12 05:21:37 +00:00
Justin Bogner	921b04e9a4	LoopUnroll: Make canUnrollCompletely static - it doesn't use any state. NFC llvm-svn: 257427	2016-01-12 01:06:32 +00:00
Justin Bogner	a1dd493159	LoopUnroll: Clean up the maze of initialization for unroll parameters. NFC The layering of where the various loop unroll parameters are initialized and overridden here was very confusing, making it pretty difficult to tell just how the various sources interacted. Instead, we put all of the initialization logic together in a single function so that it's obvious what overrides what. llvm-svn: 257426	2016-01-12 00:55:26 +00:00
Justin Bogner	0fb7ed5726	LoopUnroll: Use the optsize threshold for minsize as well Currently we're unrolling loops more in minsize than in optsize, which means -Oz will have a larger code size than -Os. That doesn't make any sense. This resolves the FIXME about this in LoopUnrollPass and extends the optsize test to make sure we use the smaller threshold for minsize as well. llvm-svn: 257402	2016-01-11 22:39:43 +00:00
David Majnemer	d9833ea579	[JumpThreading] Don't forget to report that the IR changed JumpThreading's runOnFunction is supposed to return true if it made any changes. JumpThreading has a call to removeUnreachableBlocks which may result in changes to the IR but runOnFunction didn't appropriate account for this possibility, leading to badness. While we are here, make sure to call LazyValueInfo::eraseBlock in removeUnreachableBlocks; JumpThreading preserves LVI. This fixes PR26096. llvm-svn: 257279	2016-01-10 07:13:04 +00:00
Benjamin Kramer	543762da3e	[JumpThreading] Use range-based for loops. No functionality change intended. llvm-svn: 257262	2016-01-09 18:43:01 +00:00
Benjamin Kramer	530e0db333	[TRE] Simplify code with range-based loops and std::find. No functional change intended. llvm-svn: 257261	2016-01-09 17:35:29 +00:00
Manuel Jacob	734e73342d	[RS4GC] Update and simplify handling of Constants in findBaseDefiningValueOfVector(). Summary: This is analogous to r256079, which removed an overly strong assertion, and r256812, which simplified the code by replacing three conditionals by one. Reviewers: reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16019 llvm-svn: 257250	2016-01-09 04:02:16 +00:00
Manuel Jacob	0593cfd336	[RS4GC] Unify two asserts. NFC. llvm-svn: 257247	2016-01-09 03:08:49 +00:00
Philip Reames	5715f576ea	[rs4gc] Optionally directly relocated vector of pointers This patch teaches rewrite-statepoints-for-gc to relocate vector-of-pointers directly rather than trying to split them. This builds on the recent lowering/IR changes to allow vector typed gc.relocates. The motivation for this is that we recently found a bug in the vector splitting code where depending on visit order, a vector might not be relocated at some safepoint. Specifically, the bug is that the splitting code wasn't updating the side tables (live vector) of other safepoints. As a result, a vector which was live at two safepoints might not be updated at one of them. However, if you happened to visit safepoints in post order over the dominator tree, everything worked correctly. Weirdly, it turns out that post order is actually an incredibly common order to visit instructions in in practice. Frustratingly, I have not managed to write a test case which actually hits this. I can only reproduce it in large IR files produced by actual applications. Rather than continue to make this code more complicated, we can remove all of the complexity by just representing the relocation of the entire vector natively in the IR. At the moment, the new functionality is hidden behind a flag. To use this code, you need to pass "-rs4gc-split-vector-values=0". Once I have a chance to stress test with this option and get feedback from other users, my plan is to flip the default and remove the original splitting code. I would just remove it now, but given the rareness of the bug, I figured it was better to leave it in place until the new approach has been stress tested. Differential Revision: http://reviews.llvm.org/D15982 llvm-svn: 257244	2016-01-09 01:31:13 +00:00
Sanjay Patel	9f088ab5e2	rangify; NFCI llvm-svn: 257226	2016-01-08 22:59:42 +00:00
Sanjay Patel	9f49b683e0	variable names start with an upper case letter; NFC llvm-svn: 257213	2016-01-08 22:05:03 +00:00
Haicheng Wu	a6a3279bd3	[JumpThreading] Split select that has constant conditions coming from the PHI node Look for PHI/Select in the same BB of the form bb: %p = phi [false, %bb1], [true, %bb2], [false, %bb3], [true, %bb4], ... %s = select p, trueval, falseval And expand the select into a branch structure. This later enables jump-threading over bb in this pass. Using the similar approach of SimplifyCFG::FoldCondBranchOnPHI(), unfold select if the associated PHI has at least one constant. If the unfolded select is not jump-threaded, it will be folded again in the later optimizations. llvm-svn: 257198	2016-01-08 19:39:39 +00:00
Justin Bogner	e9fb228d59	LoopInfo: Simplify ownership of Loop objects It's strange that LoopInfo mostly owns the Loop objects, but that it defers deleting them to the loop pass manager. Instead, change the oddly named "updateUnloop" to "markAsRemoved" and have it queue the Loop object for deletion. We can't delete the Loop immediately when we remove it, since we need its pointer identity still, so we'll mark the object as "invalid" so that clients can see what's going on. llvm-svn: 257191	2016-01-08 19:08:53 +00:00
Mehdi Amini	599ebf2767	Remove static global GCNames from Function.cpp and move it to the Context This remove the need for locking when deleting a function. Differential Revision: http://reviews.llvm.org/D15988 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 257139	2016-01-08 02:28:20 +00:00
Aditya Nandakumar	f94c149f7f	Instructions to be redone only if from the same BB While adding instructions(possible roots) to be redone, make sure they are from the same basic block. llvm-svn: 257112	2016-01-07 23:22:55 +00:00
David Majnemer	f1a9c9e148	[SCCP] Don't violate the lattice invariants We marked values which are 'undef' as constant instead of undefined which violates SCCP's invariants. If we can figure out that a computation results in 'undef', leave it in the undefined state. This fixes PR16052. llvm-svn: 257102	2016-01-07 21:36:16 +00:00
David Majnemer	f3b99dd22e	Remove junk accidentally commited with r257087 llvm-svn: 257089	2016-01-07 19:30:13 +00:00
David Majnemer	bae945735a	[SCCP] Can't go from overdefined to constant The fix for PR23999 made us mark loads of null as producing the constant undef which upsets the lattice. Instead, keep the load as "undefined". This fixes PR26044. llvm-svn: 257087	2016-01-07 19:25:39 +00:00
Philip Reames	103d2381d6	[RS4GC] Add an option to suppress vector splitting At the moment, this is essentially a diangostic option so that I can start collecting failing test cases, but we will eventually migrate to removing the vector splitting code entirely. llvm-svn: 257015	2016-01-07 02:20:11 +00:00
Mehdi Amini	0535003bef	Fix PR26051: Memcpy optimization should introduce a call to memcpy before the store destination position This is a conservative fix, I expect Amaury to relax this. Follow-up for r256923 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 256999	2016-01-06 23:50:22 +00:00
Amaury Sechet	3235c08253	Promote aggregate store to memset when possible Summary: As per title. This will allow the optimizer to pick up on it. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15923 llvm-svn: 256969	2016-01-06 19:47:24 +00:00
Amaury Sechet	5fc9f6999d	Remove useless DEBUG llvm-svn: 256968	2016-01-06 19:45:09 +00:00
Amaury Sechet	d3b2c0fd94	Improve load/store to memcpy for aggregate Summary: It turns out that if we don't try to do it at the store location, we can do it before any operation that alias the load, as long as no operation alias the store. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15903 llvm-svn: 256923	2016-01-06 09:30:39 +00:00
Amaury Sechet	a0c242cdfd	Implement load to store => memcpy in MemCpyOpt for aggregates Summary: Most of the tool chain is able to optimize scalar and memcpy like operation effisciently while it isn't that good with aggregates. In order to improve the support of aggregate, we try to change aggregate manipulation into either scalar or memcpy like ones whenever possible without loosing informations. This is one such opportunity. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15894 llvm-svn: 256868	2016-01-05 20:17:48 +00:00
Manuel Jacob	75cbfdcf03	[RS4GC] Simplify handling of Constants in findBaseDefiningValue(). NFC. Summary: Previously there were three conditionals, checking for global variables, undef values and everything constant except these two, all three returning the same value. This commit replaces them by one conditional. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15818 llvm-svn: 256812	2016-01-05 04:06:21 +00:00
Manuel Jacob	83eefa6d20	[Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC. Summary: This commit renames GCRelocateOperands to GCRelocateInst and makes it an intrinsic wrapper, similar to e.g. MemCpyInst. Also, all users of GCRelocateOperands were changed to use the new intrinsic wrapper instead. Reviewers: sanjoy, reames Subscribers: reames, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15762 llvm-svn: 256811	2016-01-05 04:03:00 +00:00
David Majnemer	b33f3a239a	[LICM] Fix a small oversight introduced in r256763 r256763 had promoteLoopAccessesToScalars check for the existence of a catchswitch when the exit blocks were populated but promoteLoopAccessesToScalars may be called with a prepopulated set of exit blocks which would also need to be checked. This fixes PR26019. llvm-svn: 256788	2016-01-04 23:16:22 +00:00
Haicheng Wu	9d6c94006e	[LIR] General refactoring to simplify code and the ease future code review This is a resubmission of r256336 which was reverted in r256361. The issue was the lack of the invariant check of the memset value in processLooMemSet(). The original message: Move several checks into isLegalStores. Also, delineate between those stores that are memset-able and those that are memcpy-able. llvm-svn: 256783	2016-01-04 21:43:14 +00:00
Aditya Nandakumar	12d060481a	Remove dead instructions before Redoing Before reevaluating instructions, iterate over all instructions to be reevaluated and remove trivially dead instructions and if any of it's operands become trivially dead, mark it for deletion until all trivially dead instructions have been removed llvm-svn: 256773	2016-01-04 19:48:14 +00:00
David Majnemer	219055f9df	[LICM] Don't insert instructions after a catchswitch when performing loop promotion Inserting after a catchswitch results in verifier errors, bail out on promotion if a catchswitch is a loop exit. llvm-svn: 256763	2016-01-04 17:42:19 +00:00
David Majnemer	42a0730c42	[LICM] Make instruction sinking funclet-aware We had two bugs here: - We might try to sink into a catchswitch, causing verifier failures. - We will succeed in sinking into a cleanuppad but we didn't update the funclet operand bundle. This fixes PR26000. llvm-svn: 256728	2016-01-04 03:37:39 +00:00
Manuel Jacob	67f1d3ac63	[RS4GC] Use DenseMap::count() instead of DenseMap::find()/DenseMap::end(). NFC. llvm-svn: 256586	2015-12-29 22:16:41 +00:00
Manuel Jacob	e3773d632e	[PlaceSafepoints] Assert that the gc.safepoint_poll function is present in the module. If running the PlaceSafepoints pass on a module which doesn't have the gc.safepoint_poll function without disabling entry and backedge safepoints, previously the pass crashed with an obscure error because of a null pointer. Now it fails the assert instead. llvm-svn: 256580	2015-12-29 21:57:55 +00:00
Geoff Berry	43dc285915	[JumpThreading] Fix opcode bonus in getJumpThreadDuplicationCost() The code that was meant to adjust the duplication cost based on the terminator opcode was not being executed in cases where the initial threshold was hit inside the loop. Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D15536 llvm-svn: 256568	2015-12-29 18:10:16 +00:00
Manuel Jacob	9db5b93ffc	[RS4GC] Fix rematerialization of bitcast of bitcast. Summary: Previously, only the outer (last) bitcast was rematerialized, resulting in a use of the unrelocated inner (first) bitcast after the statepoint. See the test case for an example. Reviewers: igor-laevsky, reames Subscribers: reames, alex, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D15789 llvm-svn: 256520	2015-12-28 20:14:05 +00:00
Chen Li	d71999ef1b	[gc.statepoint] Change gc.statepoint intrinsic's return type to token type instead of i32 type Summary: This patch changes gc.statepoint intrinsic's return type to token type instead of i32 type. Using token types could prevent LLVM to merge different gc.statepoint nodes into PHI nodes and cause further problems with gc relocations. The patch also changes the way on how gc.relocate and gc.result look for their corresponding gc.statepoint on unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint. Reviewers: sanjoy, JosephTremoulet, pgavlin, igor-laevsky, mjacob Subscribers: reames, mjacob, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15662 llvm-svn: 256443	2015-12-26 07:54:32 +00:00
Nico Weber	95cc9d5f14	Revert r256336, it caused PR25939 llvm-svn: 256361	2015-12-24 04:01:06 +00:00
Chad Rosier	fba65d2fd3	[LIR] General refactoring to simplify code and the ease future code review. Move several checks into isLegalStores. Also, delineate between those stores that are memset-able and those that are memcpy-able. http://reviews.llvm.org/D15683 Patch by Haicheng Wu <haicheng@codeaurora.org>! llvm-svn: 256336	2015-12-23 17:29:33 +00:00
David Majnemer	63ad9e0543	[OperandBundles] Have TailCallElim play nice with operand bundles A call site's use of a Value might not correspond to an argument operand but to a bundle operand. This fixes PR25928. llvm-svn: 256328	2015-12-23 09:58:43 +00:00
Philip Reames	ee8f055327	[GC] Make GCStrategy::isGCManagedPointer a type predicate not a value predicate [NFC] Reasons: 1) The existing form was a form of false generality. None of the implemented GCStrategies use anything other than a type. Its becoming more and more clear we're going to need some type of strong GC pointer in the type system and we shouldn't pretend otherwise at this point. 2) The API was awkward when applied to vectors-of-pointers. The old one could have been made to work, but calling isGCManagedPointer(Ty->getScalarType()) is much cleaner than the Value alternatives. 3) The rewriting implementation effectively assumes the type based predicate as well. We should be consistent. llvm-svn: 256312	2015-12-23 01:42:15 +00:00
Manuel Jacob	a4efd8ac2e	[RS4GC] Fix base pair printing for constants. Previously, "%" + name of the value was printed for each derived and base pointer. This is correct for instructions, but wrong for e.g. globals. llvm-svn: 256305	2015-12-23 00:19:45 +00:00
Cong Hou	6a2c71af0b	[BPI] Fix two potential divide-by-zero operations that are introduced in r256263. llvm-svn: 256303	2015-12-22 23:45:55 +00:00
Cong Hou	e93b8e1539	[BPI] Replace weights by probabilities in BPI. This patch removes all weight-related interfaces from BPI and replace them by probability versions. With this patch, we won't use edge weight anymore in either IR or MC passes. Edge probabilitiy is a better representation in terms of CFG update and validation. Differential revision: http://reviews.llvm.org/D15519 llvm-svn: 256263	2015-12-22 18:56:14 +00:00
Manuel Jacob	4e4f60ded0	Remove deprecated llvm.experimental.gc.result.{int,float,ptr} intrinsics. Summary: These were deprecated 11 months ago when a generic llvm.experimental.gc.result intrinsic, which works for all types, was added. Reviewers: sanjoy, reames Subscribers: sanjoy, chenli, llvm-commits Differential Revision: http://reviews.llvm.org/D15719 llvm-svn: 256262	2015-12-22 18:44:45 +00:00
Manuel Jacob	990dfa6fe5	[RS4GC] Fix crash in the case that a live variable has a constant base. Summary: Previously, RS4GC crashed in CreateGCRelocates() because it assumed that every base is also in the array of live variables, which isn't true if a live variable has a constant base. This change fixes the crash by making sure CreateGCRelocates() won't try to relocate a live variable with a constant base. This would be unnecessary anyway because anything with a constant base won't move. Reviewers: reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D15556 llvm-svn: 256252	2015-12-22 16:50:44 +00:00
Chad Rosier	94274fb1ad	[LIR] Refactor code to enable future patch. NFC. llvm-svn: 256159	2015-12-21 14:49:32 +00:00
Manuel Jacob	8050a49737	[RS4GC] Add an assert which fails if there is a (yet unsupported) addrspacecast. The slightly strange indentation comes from clang-format. llvm-svn: 256132	2015-12-21 01:26:46 +00:00
Philip Reames	5d54689bca	[RS4GC] Remove an overly strong assertion As shown by the included test case, it's reasonable to end up with constant references during base pointer calculation. The code actually handled this case just fine, we only had the assert to help isolate problems under the belief that constant references shouldn't be present in IR generated by managed frontends. This turned out to be wrong on two fronts: 1) Manual Jacobs is working on a language with constant references, and b) we found a case where the optimizer does create them in practice. llvm-svn: 256079	2015-12-19 02:38:22 +00:00
Jingyue Wu	ba3ca76ed2	[NaryReassociate] allow candidate to have a different type Summary: If Candiadte may have a different type from GEP, we should bitcast or pointer cast it to GEP's type so that the later RAUW doesn't complain. Added a test in nary-gep.ll Reviewers: tra, meheff Subscribers: mcrosier, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D15618 llvm-svn: 256035	2015-12-18 21:36:30 +00:00
Philip Reames	dd0948a1b6	[RS4GC] Use an value handle to help isolate errors quickly Inspired by the bug reported in 25846. Whatever we end up doing about that one, the value handle change is a generally good one since it will help catch this type of mistake more quickly. Patch by: Manuel Jacob llvm-svn: 255984	2015-12-18 03:53:28 +00:00
Sanjoy Das	0de2feceb1	[SCEV] Add and use SCEVConstant::getAPInt; NFCI llvm-svn: 255921	2015-12-17 20:28:46 +00:00
Philip Reames	15145fb7b1	[EarlyCSE] DSE of atomic unordered stores The rules for removing trivially dead stores are a lot less complicated than loads. Since we know the later store post dominates the former and the former dominates the later, unless the former has side effects other than the actual store, we can remove it. One slightly surprising thing is that we can freely remove atomic stores, even if the later one isn't atomic. There's no guarantee the atomic one was every visible. For the moment, we don't handle DSE of ordered atomic stores. We could extend the same chain of reasoning to them, but the catch is we'd then have to model the ordering effect without a store instruction. Since our fences are a stronger than our operation orderings, simple using a fence isn't an obvious win. This arguable calls for a refinement in our fence specification, but that's (much) later work. Differential Revision: http://reviews.llvm.org/D15352 llvm-svn: 255914	2015-12-17 18:50:50 +00:00
Eric Christopher	bfba572425	Fix funciton->function typo. llvm-svn: 255841	2015-12-16 23:10:53 +00:00
Justin Bogner	883a3ea67f	LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFC As of r255720, the loop pass manager will DTRT when passes update the loop info for removed loops, so they no longer need to reach into LPPassManager APIs to do this kind of transformation. This change very nearly removes the need for the LPPassManager to even be passed into loop passes - the only remaining pass that uses the LPM argument is LoopUnswitch. llvm-svn: 255797	2015-12-16 18:40:20 +00:00
Philip Reames	ae1f265bf1	[EarlyCSE] DSE of stores which write back loaded values Extend EarlyCSE with an additional style of dead store elimination. If we write back a value just read from that memory location, we can eliminate the store under the assumption that the value hasn't changed. I'm implementing this mostly because I noticed the omission when looking at the code. It seemed strange to have InstCombine have a peephole which was more powerful than EarlyCSE. :) Differential Revision: http://reviews.llvm.org/D15397 llvm-svn: 255739	2015-12-16 01:01:30 +00:00
Justin Bogner	843fb204b7	LPM: Stop threading `Pass ` through all of the loop utility APIs. NFC A large number of loop utility functions take a `Pass ` and reach into it to find out which analyses to preserve. There are a number of problems with this: - The APIs have access to pretty well any Pass state they want, so it's hard to tell what they may or may not do. - Other APIs have copied these and pass around a `Pass *` even though they don't even use it. Some of these just hand a nullptr to the API since the callers don't even have a pass available. - Passes in the new pass manager don't work like the current ones, so the APIs can't be used as is there. Instead, we should explicitly thread the analysis results that we actually care about through these APIs. This is both simpler and more reusable. llvm-svn: 255669	2015-12-15 19:40:57 +00:00
Justin Bogner	6291b587b6	LoopRotate: Convert the methods of LoopRotate to utility functions. NFC This moves the actual work to do loop rotation into standalone functions with the analysis results they need passed in as arguments, leaving the class itself as a relatively simple shim. This will make the functions easy to reuse when we're ready to port this transformation to the new pass manager. llvm-svn: 255574	2015-12-14 23:22:48 +00:00
Justin Bogner	a730045156	LoopRotate: Reorder some method implementations. NFC This just moves some callers after their callees. My next patch will convert some of these methods to stand alone functions, and that diff is more obviously NFC if I move these first. That change, in turn, will make it much easier to port this pass to the new pass manager once the loop pass manager is in place. llvm-svn: 255573	2015-12-14 23:22:44 +00:00
Sanjay Patel	af674fbfd9	getParent() ^ 3 == getModule() ; NFCI llvm-svn: 255511	2015-12-14 17:24:23 +00:00
David Majnemer	8a1c45d6e8	[IR] Reformulate LLVM's EH funclet IR While we have successfully implemented a funclet-oriented EH scheme on top of LLVM IR, our scheme has some notable deficiencies: - catchendpad and cleanupendpad are necessary in the current design but they are difficult to explain to others, even to seasoned LLVM experts. - catchendpad and cleanupendpad are optimization barriers. They cannot be split and force all potentially throwing call-sites to be invokes. This has a noticable effect on the quality of our code generation. - catchpad, while similar in some aspects to invoke, is fairly awkward. It is unsplittable, starts a funclet, and has control flow to other funclets. - The nesting relationship between funclets is currently a property of control flow edges. Because of this, we are forced to carefully analyze the flow graph to see if there might potentially exist illegal nesting among funclets. While we have logic to clone funclets when they are illegally nested, it would be nicer if we had a representation which forbade them upfront. Let's clean this up a bit by doing the following: - Instead, make catchpad more like cleanuppad and landingpad: no control flow, just a bunch of simple operands; catchpad would be splittable. - Introduce catchswitch, a control flow instruction designed to model the constraints of funclet oriented EH. - Make funclet scoping explicit by having funclet instructions consume the token produced by the funclet which contains them. - Remove catchendpad and cleanupendpad. Their presence can be inferred implicitly using coloring information. N.B. The state numbering code for the CLR has been updated but the veracity of it's output cannot be spoken for. An expert should take a look to make sure the results are reasonable. Reviewers: rnk, JosephTremoulet, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D15139 llvm-svn: 255422	2015-12-12 05:38:55 +00:00
Chad Rosier	d7634fc91d	Revert r255247, r255265, and r255286 due to serious compile-time regressions. Revert "[DSE] Disable non-local DSE to see if the bots go green." Revert "[DeadStoreElimination] Use range-based loops. NFC." Revert "[DeadStoreElimination] Add support for non-local DSE." llvm-svn: 255354	2015-12-11 18:39:41 +00:00
Hal Finkel	494393b740	AlignmentFromAssumptions and SLPVectorizer preserves AA and GlobalsAA GlobalsAA's assumptions that passes do not escape globals not previously escaped is not violated by AlignmentFromAssumptions and SLPVectorizer. Marking them as such allows GlobalsAA to be preserved until GVN in the LTO pipeline. http://lists.llvm.org/pipermail/llvm-dev/2015-December/092972.html Patch by Vaivaswatha Nagaraj! llvm-svn: 255348	2015-12-11 17:46:01 +00:00
Chad Rosier	843c7b4309	[DSE] Disable non-local DSE to see if the bots go green. I see a few bots timing out, so I'm speculatively disabling r255247. llvm-svn: 255286	2015-12-10 19:23:02 +00:00
Chad Rosier	02fe4248a2	[DeadStoreElimination] Use range-based loops. NFC. llvm-svn: 255265	2015-12-10 17:27:18 +00:00
Chad Rosier	533bc3fcac	[DeadStoreElimination] Add support for non-local DSE. We extend the search for redundant stores to predecessor blocks that unconditionally lead to the block BB with the current store instruction. That also includes single-block loops that unconditionally lead to BB, and if-then-else blocks where then- and else-blocks unconditionally lead to BB. http://reviews.llvm.org/D13363 Patch by Ivan Baev <ibaev@codeaurora.org>! llvm-svn: 255247	2015-12-10 13:51:43 +00:00
Silviu Baranga	86de80db37	[LLE] Use the PredicatedScalarEvolution interface to query SCEVs for dependences Summary: LAA uses the PredicatedScalarEvolution interface, so it can produce forward/backward dependences having SCEVs that are AddRecExprs only after being transformed by PredicatedScalarEvolution. Use PredicatedScalarEvolution to get the expected expressions. Reviewers: anemet Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D15382 llvm-svn: 255241	2015-12-10 11:07:18 +00:00
Reid Kleckner	54ade23504	[Float2Int] Don't operate on vector instructions This fixes a crash bug. It's also not clear if we'd want to do this transform for vectors. llvm-svn: 255155	2015-12-09 21:08:18 +00:00
Silviu Baranga	9cd9a7e310	Re-commit r255115, with the PredicatedScalarEvolution class moved to ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform and Analysis modules: [LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewritting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, and therefore avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewritting results. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255122	2015-12-09 16:06:28 +00:00
Silviu Baranga	ad1ccb357b	Revert r255115 until we figure out how to fix the bot failures. llvm-svn: 255117	2015-12-09 15:25:28 +00:00
Silviu Baranga	41eb682501	[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewritting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, and therefore avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewritting results. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255115	2015-12-09 15:03:52 +00:00
JF Bastien	9938425b31	EarlyCSE: fix typo from rL255054. llvm-svn: 255102	2015-12-09 09:05:42 +00:00
Vikram TV	74b4111483	Test commit access - Fix few missing '.' in comments of LoopInterchange code. llvm-svn: 255095	2015-12-09 05:16:24 +00:00
Sanjoy Das	42e551b92d	[IndVars] Use any_of and foreach instead of explicit for loops; NFC llvm-svn: 255077	2015-12-08 23:52:58 +00:00
Philip Reames	8fc2cbf933	[EarlyCSE] Value forwarding for unordered atomics This patch teaches the fully redundant load part of EarlyCSE how to forward from atomic and volatile loads and stores, and how to eliminate unordered atomics (only). This patch does not include dead store elimination support for unordered atomics, that will follow in the near future. The basic idea is that we allow all loads and stores to be tracked by the AvailableLoad table. We store a bit in the table which tracks whether load/store was atomic, and then only replace atomic loads with ones which were also atomic. No attempt is made to refine our handling of ordered loads or stores. Those are still treated as full fences. We could pretty easily extend the release fence handling to release stores, but that should be a separate patch. Differential Revision: http://reviews.llvm.org/D15337 llvm-svn: 255054	2015-12-08 21:45:41 +00:00
Sanjoy Das	683bf070ef	[IndVars] Have getInsertPointForUses preserve LCSSA Summary: Also add a stricter post-condition for IndVarSimplify. Fixes PR25578. Test case by Michael Zolotukhin. Reviewers: hfinkel, atrick, mzolotukhin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15059 llvm-svn: 254977	2015-12-08 00:13:21 +00:00
Philip Reames	9e5e2d61bf	Reapply 254950 w/fix 254950 ended up being not NFC. The previous code was overriding the flags for whether an instruction read or wrote memory using the target specific flags returned via TTI. I'd missed this in my refactoring. Since I mistakenly built only x86 and didn't notice the number of unsupported tests, I didn't catch that before the original checkin. This raises an interesting issue though. Given we have function attributes (i.e. readonly, readnone, argmemonly) which describe the aliasing of intrinsics, why does TTI have this information overriding the instruction definition at all? I see no reason for this, but decided to preserve existing behavior for the moment. The root issue might be that we don't have a "writeonly" attribute. Original commit message: [EarlyCSE] Simplify and invert ParseMemoryInst [NFCI] Restructure ParseMemoryInst - which was introduced to abstract over target specific load and stores instructions - to just query the underlying instructions. In theory, this could be slightly slower than caching the results, but in practice, it's very unlikely to be measurable. The simple query scheme makes it far easier to understand, and much easier to extend with new queries. Given I'm about to need to add new query types, doing the cleanup first seemed worthwhile. Do we still believe the target specific intrinsic handling is worthwhile in EarlyCSE? It adds quite a bit of complexity and makes the code harder to read. Being able to delete the abstraction entirely would be wonderful. llvm-svn: 254957	2015-12-07 22:41:23 +00:00
Philip Reames	4b5634af44	Revert 254950 It's causing test failures on AArch64. Due to a bad build config on my part, I apparently wasn't running the tests I thought I was. llvm-svn: 254954	2015-12-07 21:41:29 +00:00
Philip Reames	998cae653b	[EarlyCSE] Simplify and invert ParseMemoryInst [NFCI] Restructure ParseMemoryInst - which was introduced to abstract over target specific load and stores instructions - to just query the underlying instructions. In theory, this could be slightly slower than caching the results, but in practice, it's very unlikely to be measurable. The simple query scheme makes it far easier to understand, and much easier to extend with new queries. Given I'm about to need to add new query types, doing the cleanup first seemed worthwhile. Do we still believe the target specific intrinsic handling is worthwhile in EarlyCSE? It adds quite a bit of complexity and makes the code harder to read. Being able to delete the abstraction entirely would be wonderful. llvm-svn: 254950	2015-12-07 21:27:15 +00:00
Philip Reames	7c6692de16	[EarlyCSE] IsSimple vs IsVolatile naming clarification (NFC) When the notion of target specific memory intrinsics was introduced to EarlyCSE, the commit confused the notions of volatile and simple memory access. Since I'm about to start working on this area, cleanup the naming so that patches aren't horribly confusing. Note that the actual implementation was always bailing if the load or store wasn't simple. Reminder: - "volatile" - C++ volatile, can't remove any memory operations, but in principal unordered - "ordered" - imposes ordering constraints on other nearby memory operations - "atomic" - can't be split or sheared. In LLVM terms, all "ordered" operations are also atomic so the predicate "isAtomic" is often used. - "simple" - a load which is none of the above. These are normal loads and what most of the optimizer works with. llvm-svn: 254805	2015-12-05 00:18:33 +00:00
Akira Hatanaka	237916b537	[AttributeSet] Overload AttributeSet::addAttribute to reduce compile time. The new overloaded function is used when an attribute is added to a large number of slots of an AttributeSet (for example, to function parameters). This is much faster than calling AttributeSet::addAttribute once per slot, because AttributeSet::getImpl (which calls FoldingSet::FIndNodeOrInsertPos) is called only once per function instead of once per slot. With this commit, clang compiles a file which used to take over 22 minutes in just 13 seconds. rdar://problem/23581000 Differential Revision: http://reviews.llvm.org/D15085 llvm-svn: 254491	2015-12-02 06:58:49 +00:00
Chad Rosier	869962f962	[LIR] Push check into helper function. NFC. llvm-svn: 254416	2015-12-01 14:26:35 +00:00
Craig Topper	d896b03e4c	Remove an intermediate lambda. NFC llvm-svn: 254246	2015-11-29 05:38:08 +00:00
Craig Topper	e471cf32a0	Use range-based for loops. NFC llvm-svn: 254222	2015-11-28 08:23:04 +00:00
Davide Italiano	dd04fee8a6	[SCCP] More informative message if we don't know how to handle a terminator. llvm-svn: 254093	2015-11-25 21:03:36 +00:00
Sanjay Patel	739f2ce93a	use convenience function for copying IR flags; NFCI llvm-svn: 253996	2015-11-24 17:16:33 +00:00
Chad Rosier	a15b4b6af2	[LIR] Put includes in correct order. NFC. llvm-svn: 253915	2015-11-23 21:09:13 +00:00
Andrew Kaylor	0615a0e65d	[WinEH] Fix a case where GVN could incorrectly PRE a load into an EH pad. Differential Revision: http://reviews.llvm.org/D14842 llvm-svn: 253908	2015-11-23 19:51:41 +00:00
Davide Italiano	945d05f6a0	[LoopStrengthReduce] Mark dump() definitions as LLVM_DUMP_METHOD. llvm-svn: 253841	2015-11-23 02:47:30 +00:00
Craig Topper	a5ea5289ff	Use modulo operator instead of multiplying result of a divide and subtracting from the original dividend. NFC. llvm-svn: 253792	2015-11-21 17:44:42 +00:00
Owen Anderson	630077ef55	Fix a pair of issues that caused an infinite loop in reassociate. Terrifyingly, one of them is a mishandling of floating point vectors in Constant::isZero(). How exactly this issue survived this long is beyond me. llvm-svn: 253655	2015-11-20 08:16:13 +00:00
Craig Topper	e325e3806f	Use range-based for loops. NFC llvm-svn: 253652	2015-11-20 07:18:48 +00:00
Chad Rosier	1cd3da15e8	[LIR] Update some comments. NFC. llvm-svn: 253603	2015-11-19 21:33:07 +00:00
Chad Rosier	3ecc8d8d83	[LIR] Fix 80-column from previous commit. llvm-svn: 253586	2015-11-19 18:25:11 +00:00
Chad Rosier	fddc01f393	[LIR] Sink checks into function to enable future refactoring. NFC. The purpose of this change is help delineate the memset and memcpy optimizations with the overall goal of resolving PR25520. llvm-svn: 253585	2015-11-19 18:22:21 +00:00
Chad Rosier	85c21f0a6e	[LIR] Use the more appropriate method. NFC. llvm-svn: 253578	2015-11-19 17:27:28 +00:00
Pete Cooper	67cf9a723b	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Weiming Zhao	b69babd01e	Fix bug 25440: GVN assertion after coercing loads Optimizations like LoadPRE in GVN will insert new instructions. If the insertion point is in a already processed BB, they should get a value number explicitly. If the insertion point is after current instruction, then just leave it. However, current GVN framework has no support for it. In this patch, we just bail out if a VN can't be found. Dfferential Revision: http://reviews.llvm.org/D14670 A test/Transforms/GVN/pr25440.ll M lib/Transforms/Scalar/GVN.cpp llvm-svn: 253536	2015-11-19 02:45:18 +00:00
Mehdi Amini	adb4057a15	Fix returned value for GVN: could return "false" even after modifying the IR This bug would manifest in some very specific cases where all the following conditions are fullfilled: - GVN didn't remove block - The regular GVN iteration didn't change the IR - PRE is enabled - PRE will not split critical edge - The last instruction processed by PRE didn't change the IR Because the CallGraph PassManager relies on this returned value to decide if it needs to recompute a node after the execution of Function passes, not returning the right value can lead to unexpected results. Fix for: https://llvm.org/bugs/show_bug.cgi?id=24715 Patch by Wenxiang Qiu <vincentqiuuu@gmail.com> From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 253518	2015-11-18 22:49:49 +00:00
Pete Cooper	72bc23ef02	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.llvm\.memset.)i32\ [0-9]\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, / isVolatile / false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, / isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
Mike Aizatsky	c7810baaa6	Disable gvn non-local speculative loads under asan. Summary: Fix for https://llvm.org/bugs/show_bug.cgi?id=25550 Differential Revision: http://reviews.llvm.org/D14763 llvm-svn: 253498	2015-11-18 20:43:00 +00:00
Igor Laevsky	7310c68e85	Revert "Revert "Strip metadata when speculatively hoisting instructions (r252604)" Failing clang test is now fixed by the r253458. llvm-svn: 253459	2015-11-18 14:50:18 +00:00
Craig Topper	66059c9f4d	Replace dyn_cast with isa in places that weren't using the returned value for more than a boolean check. NFC. llvm-svn: 253441	2015-11-18 07:07:59 +00:00
Philip Reames	b6e8fe3dac	[PRE] Preserve !invariant.load metadata Spoted via inspection. Test case included. llvm-svn: 253275	2015-11-17 00:15:09 +00:00
Owen Anderson	2de9f545aa	Add intermediate subtract instructions to reassociation worklist. We sometimes create intermediate subtract instructions during reassociation. Adding these to the worklist to revisit exposes many additional reassociation opportunities. Patch by Aditya Nandakumar. llvm-svn: 253240	2015-11-16 18:07:30 +00:00
David Majnemer	7378e7a333	[LoopStrengthReduce] Don't increment iterator past the end of the BB We tried to move the insertion point beyond instructions like landingpad and cleanuppad. However, we also tried to move past catchpad. This is problematic because catchpad is also a terminator. This fixes PR25541. llvm-svn: 253238	2015-11-16 17:37:58 +00:00
Keno Fischer	86c95b5642	[Sink] Don't move landingpads Summary: Moving landingpads into successor basic blocks makes the verifier sad. Teach Sink that much like PHI nodes and terminator instructions, landingpads (and cleanuppads, etc.) may not be moved between basic blocks. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14475 llvm-svn: 253182	2015-11-16 04:47:58 +00:00
Chad Rosier	cc299b627d	[LIR] Add support for creating memcpys from loops with a negative stride. This allows us to transform the below loop into a memcpy. void test(unsigned __restrict__ a, unsigned __restrict__ b) { for (int i = 2047; i >= 0; --i) { a[i] = b[i]; } } This is the memcpy version of r251518, which added support for memset with negative strided loops. llvm-svn: 253091	2015-11-13 21:51:02 +00:00
Chad Rosier	2fa50a7a05	Add a comment that should have made my last commit. llvm-svn: 253063	2015-11-13 19:13:40 +00:00
Chad Rosier	ed0c7d1316	[LIR] Factor out the code to compute base ptr for negative strided loops. This will allow for the code to be reused in the memcpy optimization. llvm-svn: 253061	2015-11-13 19:11:07 +00:00
Tobias Grosser	8241795d20	Revert "Fix bug 25440: GVN assertion after coercing loads" This reverts 252919 which broke LNT: MultiSource/Applications/SPASS llvm-svn: 252936	2015-11-12 20:04:21 +00:00
Chad Rosier	a548fe569b	[LIR] Minor refactoring. NFCI. This change prevents uninteresting stores from being inserted into the list of candidate stores for memset/memcpy conversion. llvm-svn: 252926	2015-11-12 19:09:16 +00:00
Weiming Zhao	eed0145dd2	Fix bug 25440: GVN assertion after coercing loads Summary: when coercing loads, it inserts some instructions, which have no GV assigned. https://llvm.org/bugs/show_bug.cgi?id=25440 Reviewers: hfinkel, dberlin Subscribers: dberlin, llvm-commits Differential Revision: http://reviews.llvm.org/D14479 llvm-svn: 252919	2015-11-12 18:19:59 +00:00
Chad Rosier	cc9030b60a	[LIR] General refactor to improve compile-time and simplify code. First create a list of candidates, then transform. This simplifies the code in that you have don't have to worry that you may be using an invalidated iterator. Previously, each time we created a memset/memcpy we would reevaluate the entire loop potentially resulting in lots of redundant work for large basic blocks. llvm-svn: 252817	2015-11-11 23:00:59 +00:00
Renato Golin	0e77d72b0a	Revert "Strip metadata when speculatively hoisting instructions" This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as well as some x86, et al. llvm-svn: 252623	2015-11-10 18:01:16 +00:00
Igor Laevsky	01c3692a10	Strip metadata when speculatively hoisting instructions This is fix for PR24059. When we are hoisting instruction above some condition it may turn out that metadata on this instruction was control dependant on the condition. This metadata becomes invalid and we need to drop it. This patch should cover most obvious places of speculative execution (which I have found by greping isSafeToSpeculativelyExecute). I think there are more cases but at least this change covers the severe ones. Differential Revision: http://reviews.llvm.org/D14398 llvm-svn: 252604	2015-11-10 14:10:31 +00:00
Chad Rosier	19dc92dc8d	Simplify. NFC. llvm-svn: 252491	2015-11-09 16:56:06 +00:00
Silviu Baranga	2910a4f6b1	Allow LLE/LD and the loop versioning infrastructure to use SCEV predicates Summary: LAA currently generates a set of SCEV predicates that must be checked by users. In the case of Loop Distribute/Loop Load Elimination, no such predicates could have been emitted, since we don't allow stride versioning. However, in the future there could be SCEV predicates that will need to be checked. This change adds support for SCEV predicate versioning in the Loop Distribute, Loop Load Eliminate and the loop versioning infrastructure. Reviewers: anemet Subscribers: mssimpso, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14240 llvm-svn: 252467	2015-11-09 13:26:09 +00:00
David Majnemer	b222184223	[LoopStrengthReduce] Don't bother fixing up PHIs from EH Pad preds We cannot really insert fixup code into a PHI's predecessor. This fixes PR25445. llvm-svn: 252416	2015-11-08 05:04:07 +00:00
Duncan P. N. Exon Smith	83c4b68720	ADT: Remove last implicit ilist iterator conversions, NFC Some implicit ilist iterator conversions have crept back into Analysis, Transforms, Hexagon, and llvm-stress. This removes them. I'll commit a patch immediately after this to disallow them (in a separate patch so that it's easy to revert if necessary). llvm-svn: 252371	2015-11-07 00:01:16 +00:00
Akira Hatanaka	5cfcce12eb	Add 'notail' marker for call instructions. This marker prevents optimization passes from adding 'tail' or 'musttail' markers to a call. Is is used to prevent tail call optimization from being performed on the call. rdar://problem/22667622 Differential Revision: http://reviews.llvm.org/D12923 llvm-svn: 252368	2015-11-06 23:55:38 +00:00
Sanjoy Das	55ea67cea7	[ValueTracking] Add parameters to isImpliedCondition; NFC Summary: This change makes the `isImpliedCondition` interface similar to the rest of the functions in ValueTracking (in that it takes a DataLayout, AssumptionCache etc.). This is an NFC, intended to make a later diff less noisy. Depends on D14369 Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14391 llvm-svn: 252333	2015-11-06 19:01:08 +00:00
Chad Rosier	43f9b48975	[LIR] Simplify code by making DataLayout globally accessible. NFC. llvm-svn: 252317	2015-11-06 16:33:57 +00:00
Eugene Zelenko	ffec81ca00	Fix some Clang-tidy modernize warnings, other minor fixes. Fixed warnings are: modernize-use-override, modernize-use-nullptr and modernize-redundant-void-arg. Differential revision: http://reviews.llvm.org/D14312 llvm-svn: 252087	2015-11-04 22:32:32 +00:00
Philip Reames	814fb60130	[CVP] Fold return values if possible In my previous change to CVP (251606), I made CVP much more aggressive about trying to constant fold comparisons. This patch is a reversal in direction. Rather than being agressive about every compare, we restore the non-block local restriction for most, and then try hard for compares feeding returns. The motivation for this is two fold: * The more I thought about it, the less comfortable I got with the possible compile time impact of the other approach. There have been no reported issues, but after talking to a couple of folks, I've come to the conclusion the time probably isn't justified. * It turns out we need to know the context to leverage the full power of LVI. In particular, asking about something at the end of it's block (the use of a compare in a return) will frequently get more precise results than something in the middle of a block. This is an implementation detail, but it's also hard to get around since mid-block queries have to reason about possible throwing instructions and don't get to use most of LVI's block focused infrastructure. This will become particular important when combined with http://reviews.llvm.org/D14263. Differential Revision: http://reviews.llvm.org/D14271 llvm-svn: 252032	2015-11-04 01:43:54 +00:00
Adam Nemet	7c94c9bf07	Fix unused variable warning from r252017 llvm-svn: 252019	2015-11-04 00:10:33 +00:00
Adam Nemet	e54a4fa95d	LLE 6/6: Add LoopLoadElimination pass Summary: The goal of this pass is to perform store-to-load forwarding across the backedge of a loop. E.g.: for (i) A[i + 1] = A[i] + B[i] => T = A[0] for (i) T = T + B[i] A[i + 1] = T The pass relies on loop dependence analysis via LoopAccessAnalisys to find opportunities of loop-carried dependences with a distance of one between a store and a load. Since it's using LoopAccessAnalysis, it was easy to also add support for versioning away may-aliasing intervening stores that would otherwise prevent this transformation. This optimization is also performed by Load-PRE in GVN without the option of multi-versioning. As was discussed with Daniel Berlin in http://reviews.llvm.org/D9548, this is inferior to a more loop-aware solution applied here. Hopefully, we will be able to remove some complexity from GVN/MemorySSA as a consequence. In the long run, we may want to extend this pass (or create a new one if there is little overlap) to also eliminate loop-indepedent redundant loads and store that require versioning due to may-aliasing intervening stores/loads. I have some motivating cases for store elimination. My plan right now is to wait for MemorySSA to come online first rather than using memdep for this. The main motiviation for this pass is the 456.hmmer loop in SPECint2006 where after distributing the original loop and vectorizing the top part, we are left with the critical path exposed in the bottom loop. Being able to promote the memory dependence into a register depedence (even though the HW does perform store-to-load fowarding as well) results in a major gain (~20%). This gain also transfers over to x86: it's around 8-10%. Right now the pass is off by default and can be enabled with -enable-loop-load-elim. On the LNT testsuite, there are two performance changes (negative number -> improvement): 1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the critical paths is reduced 2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this outside of LNT The pass is scheduled after the loop vectorizer (which is after loop distribution). The rational is to try to reuse LAA state, rather than recomputing it. The order between LV and LLE is not critical because normally LV does not touch scalar st->ld forwarding cases where vectorizing would inhibit the CPU's st->ld forwarding to kick in. LoopLoadElimination requires LAA to provide the full set of dependences (including forward dependences). LAA is known to omit loop-independent dependences in certain situations. The big comment before removeDependencesFromMultipleStores explains why this should not occur for the cases that we're interested in. Reviewers: dberlin, hfinkel Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13259 llvm-svn: 252017	2015-11-03 23:50:08 +00:00
Adam Nemet	a2df750fb3	[LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC Summary: We now collect all types of dependences including lexically forward deps not just "interesting" ones. Reviewers: hfinkel Subscribers: rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13256 llvm-svn: 251985	2015-11-03 21:39:52 +00:00
Tobias Grosser	526d52691a	Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader" Commit 251839 triggers miscompiles on some bots: http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723 (The commit is listed in 13722, but due to an existing failure introduced in 13721 and reverted in 13723 the failure is only visible in 13723) To verify r251839 is indeed the only change that triggered the buildbot failures and to ensure the buildbots remain green while investigating I temporarily revert this commit. At the current state it is unclear if this commit introduced some miscompile or if it only exposed code to Polly that is subsequently miscompiled by Polly. llvm-svn: 251901	2015-11-03 07:14:39 +00:00
Chen Li	d715310162	[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader Summary: This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13974 llvm-svn: 251839	2015-11-02 22:00:15 +00:00
Justin Bogner	19b679963f	[PM] Port ADCE to the new pass manager llvm-svn: 251725	2015-10-30 23:13:18 +00:00
Philip Reames	eb3e9dad7f	[LVI/CVP] Teach LVI about range metadata Somewhat shockingly for an analysis pass which is computing constant ranges, LVI did not understand the ranges provided by range metadata. As part of this change, I included a change to CVP primarily because doing so made it much easier to write small self contained test cases. CVP was previously only handling the non-local operand case, but given that LVI can sometimes figure out information about instructions standalone, I don't see any reason to restrict this. There could possibly be a compile time impact from this, but I suspect it should be minimal. If anyone has an example which substaintially regresses, please let me know. I could restrict the block local handling to ICmps feeding Terminator instructions if needed. Note that this patch continues a somewhat bad practice in LVI. In many cases, we know facts about values, and separate context sensitive facts about values. LVI makes no effort to distinguish and will frequently cache the same value fact repeatedly for different contexts. I would like to change this, but that's a large enough change that I want it to go in separately with clear documentation of what's changing. Other examples of this include the non-null handling, and arguments. As a meta comment: the entire motivation of this change was being able to write smaller (aka reasonable sized) test cases for a future patch teaching LVI about select instructions. Differential Revision: http://reviews.llvm.org/D13543 llvm-svn: 251606	2015-10-29 03:57:17 +00:00
Sanjoy Das	13e63a2f21	[JumpThreading] Use dominating conditions to prove implications Summary: If P branches to Q conditional on C and Q branches to R conditional on C' and C => C' then the branch conditional on C' can be folded to an unconditional branch. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13972 llvm-svn: 251557	2015-10-28 21:27:08 +00:00
Chad Rosier	7142da0ed4	Typo. llvm-svn: 251521	2015-10-28 15:08:33 +00:00
Chad Rosier	7967614b2b	Reapply: [LIR] Add support for creating memsets from loops with a negative stride. The simple fix is to prevent forming memcpy from loops with a negative stride. llvm-svn: 251518	2015-10-28 14:38:49 +00:00
Chad Rosier	8eb2a18a9f	Revert "[LIR] Add support for creating memsets from loops with a negative stride." This reverts commit r251512. This is causing LNT/chomp to fail. llvm-svn: 251513	2015-10-28 13:54:09 +00:00
Chad Rosier	d6a6bd5501	[LIR] Add support for creating memsets from loops with a negative stride. http://reviews.llvm.org/D14125 llvm-svn: 251512	2015-10-28 12:55:34 +00:00
Chen Li	8d23a9bbef	Revert r251492 "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader", because it broke some bots. llvm-svn: 251498	2015-10-28 05:15:51 +00:00
Chen Li	032a5d0cea	[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader Summary: This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13974 llvm-svn: 251492	2015-10-28 04:45:47 +00:00
Igor Laevsky	1ef06559f4	[RS4GC] Strip noalias attribute after statepoint rewrite We should remove noalias along with dereference and dereference_or_null attributes because statepoint could potentially touch the entire heap including noalias objects. Differential Revision: http://reviews.llvm.org/D14032 llvm-svn: 251333	2015-10-26 19:06:01 +00:00
Benjamin Kramer	8ceb323bb4	Convert assert(false) into llvm_unreachable where it makes sense. llvm-svn: 251266	2015-10-25 22:28:27 +00:00
NAKAMURA Takumi	26c3872666	ScalarReplAggregates.cpp: Try to appease clash of anonymous::SROA in modules build. llvm-svn: 251181	2015-10-24 06:42:42 +00:00
Igor Laevsky	dde0029a25	[RS4GC] Rename stripDereferenceabilityInfo into stripNonValidAttributes. llvm-svn: 251157	2015-10-23 22:42:44 +00:00
Tim Northover	d4f55c0b1b	GVN: don't try to replace instruction with itself. After some look-ahead PRE was added for GEPs, an instruction could end up in the table of candidates before it was actually inspected. When this happened the pass might decide it was the best candidate to replace itself. This didn't go well. Should fix PR25291 llvm-svn: 251145	2015-10-23 20:30:02 +00:00
Justin Bogner	35e46cdd04	LoopPass: Simplify the API for adding a new loop. NFC The insertLoop() API is only used to add new loops, and has confusing ownership semantics. Simplify it by replacing it with addLoop(). llvm-svn: 251064	2015-10-22 21:21:32 +00:00
David Majnemer	e0675fb8fb	[Sink] Don't check BB.empty() As an invariant, BasicBlocks cannot be empty when passed to a transform. This is not the case for MachineBasicBlocks and the Sink pass was ported from the MachineSink pass which would explain the check's existence. llvm-svn: 251057	2015-10-22 20:29:08 +00:00
Sanjoy Das	3020b1bc8c	[RS4GC] Remove a redundant linear search, NFCI Since LiveVariables is uniqued (we just created it from a `DenseSet`), `FindIndex(LiveVariables, LiveVariables[i])` is always `i`. llvm-svn: 250786	2015-10-20 01:06:31 +00:00
Sanjoy Das	b1942f14cd	[RS4GC] Clean up `find_index`; NFC - Bring it up to the LLVM Coding Style - Sink it inside `CreateGCRelocates`, which is its only user llvm-svn: 250785	2015-10-20 01:06:28 +00:00
Sanjoy Das	7ad67640e9	[RS4GC] Re-purpose `normalizeForInvokeSafepoint`; NFC. `normalizeForInvokeSafepoint` in RewriteStatepointsForGC.cpp, as it is written today, deals with `gc.relocate` and `gc.result` uses of a statepoint equally well. This change documents this fact and adds a test case. There is no functional change here -- only documentation of existing functionality. llvm-svn: 250784	2015-10-20 01:06:24 +00:00
Sanjoy Das	ff3dba736a	[RS4GC] Minor cleanup to `normalizeForInvokeSafepoint`; NFC llvm-svn: 250783	2015-10-20 01:06:17 +00:00
Jakub Staszak	f12821a43c	Preserve CFG in MergedLoadStoreMotion. This fixes PR24426. llvm-svn: 250660	2015-10-18 19:34:10 +00:00
Sanjoy Das	58fae7cf6b	[RS4GC] Dont' propagate call attrs related to patchable statepoints The `"statepoint-id"` and `"statepoint-num-patch-bytes"` attributes are used solely to determine properties of the `gc.statepoint` being created. Once the `gc.statepoint` is in place, these should be removed. llvm-svn: 250491	2015-10-16 02:41:23 +00:00
Sanjoy Das	810a59d037	[RS4GC] Bring legalizeCallAttributes up to LLVM coding style; NFC llvm-svn: 250490	2015-10-16 02:41:11 +00:00
Sanjoy Das	25ec1a3e60	[RS4GC] Use "deopt" operand bundles Summary: This is a step towards using operand bundles to carry deopt state till RewriteStatepointsForGC. The change adds a flag to RewriteStatepointsForGC that teaches it to pick up deopt state from a `"deopt"` operand bundle attached to the `call` or `invoke` it is wrapping. The command line flag added, `-rs4gc-use-deopt-bundles`, will only exist for a short while. Once we are able to pipe deopt bundle state through the full optimization pipeline without problems, we will "constant fold" `-rs4gc-use-deopt-bundles` to `true`. Reviewers: swaroop.sridhar, reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13372 llvm-svn: 250489	2015-10-16 02:41:00 +00:00
Sanjoy Das	7360f30852	[IndVars] Rename getExtend; NFC Rename `IndVarSimplify::getExtend` to `IndVarSimplify::createExtendInst` to make it obvious that it creates `llvm::Instruction` s. llvm-svn: 250484	2015-10-16 01:00:50 +00:00
Sanjoy Das	37e87c2023	[IndVars] Have `cloneArithmeticIVUser` guess better Summary: `cloneArithmeticIVUser` currently trips over expression like `add %iv, -1` when `%iv` is being zero extended -- it tries to construct the widened use as `add %iv.zext, zext(-1)` and (correctly) fails to prove equivalence to `zext(add %iv, -1)` (here the SCEV for `%iv` is `{1,+,1}`). This change teaches `IndVars` to try sign extending the non-IV operand if that makes the newly constructed IV use equivalent to the widened narrow IV use. Reviewers: atrick, hfinkel, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13717 llvm-svn: 250483	2015-10-16 01:00:47 +00:00
Sanjoy Das	472840a3d3	[IndVars] Extract out a few local variables; NFC llvm-svn: 250482	2015-10-16 01:00:44 +00:00
Sanjoy Das	1fd184e5a2	[IndVars] Split `WidenIV::cloneIVUser`; NFC Summary: This NFC splitting is intended to make a later diff easier to follow. It just tail duplicates `cloneIVUser` into `cloneArithmeticIVUser` and `cloneBitwiseIVUser`. Reviewers: atrick, hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13716 llvm-svn: 250481	2015-10-16 01:00:39 +00:00
Benjamin Kramer	6db3338cb1	[ScalarOpts] Remove dead code. Does not touch debug dumpers. NFC. llvm-svn: 250417	2015-10-15 15:08:58 +00:00
Manman Ren	72d44b1b09	Recommit r250345, it was reverted in r250366 to investigate a bot failure. Our internal bot is still red after r250366. llvm-svn: 250415	2015-10-15 14:59:40 +00:00
Manman Ren	f5499fd9d5	Temporarily revert r250345 to sort out bot failure. With r250345 and r250343, we start to observe the following failure when bootstrap clang with lto and pgo: PHI node entries do not match predecessors! %.sroa.029.3.i = phi %"class.llvm::SDNode.13298"* [ null, %30953 ], [ null, %31017 ], [ null, %30998 ], [ null, %_ZN4llvm8dyn_castINS_14ConstantSDNodeENS_7SDValueEEENS_10cast_rettyIT_T0_E8ret_typeERS5_.exit.i.1804 ], [ null, %30975 ], [ null, %30991 ], [ null, %_ZNK4llvm3EVT13getScalarTypeEv.exit.i.1812 ], [ %..sroa.029.0.i, %_ZN4llvm11SmallVectorIiLj8EED1Ev.exit.i.1826 ], !dbg !451895 label %30998 label %_ZNK4llvm3EVTeqES0_.exit19.thread.i LLVM ERROR: Broken function found, compilation aborted! I will re-commit this if the bot does not recover. llvm-svn: 250366	2015-10-15 04:58:24 +00:00
Cong Hou	b74d3b3b86	Update the branch weight metadata in JumpThreading pass. Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). This is the third attempt to submit this patch, while the first two led to failures in some FDO tests. After investigation, it is the edge weight normalization that caused those failures. In this patch the edge weight normalization is fixed so that there is no zero weight in the output and the sum of all weights can fit in 32-bit integer. Several unit tests are added. Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250345	2015-10-14 23:14:17 +00:00
Chen Li	567aa7ab30	[LoopUnswitch] Correct misleading comments. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13738 llvm-svn: 250317	2015-10-14 19:47:43 +00:00
Manman Ren	2c8e16d507	Revert r250204 and r250240 due to bot failure. We failed to build PGO-ed clang. llvm-svn: 250264	2015-10-14 03:04:03 +00:00
Chad Rosier	7f08d80595	Typo. llvm-svn: 250224	2015-10-13 20:59:16 +00:00
Duncan P. N. Exon Smith	be4d8cba1c	Scalar: Remove remaining ilist iterator implicit conversions Remove remaining `ilist_iterator` implicit conversions from LLVMScalarOpts. This change exposed some scary behaviour in lib/Transforms/Scalar/SCCP.cpp around line 1770. This patch changes a call from `Function::begin()` to `&Function::front()`, since the return was immediately being passed into another function that takes a `Function`. `Function::front()` started to assert, since the function was empty. Note that `Function::end()` does not point at a legal `Function` -- it points at an `ilist_half_node` -- so the other function was getting garbage before. (I added the missing check for `Function::isDeclaration()`.) Otherwise, no functionality change intended. llvm-svn: 250211	2015-10-13 19:26:58 +00:00
Cong Hou	7ab123a5cf	Update the branch weight metadata in JumpThreading pass. Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250204	2015-10-13 18:43:10 +00:00
Duncan P. N. Exon Smith	3a9c9e3dcd	Scalar: Remove some implicit ilist iterator conversions, NFC Remove some of the implicit ilist iterator conversions in LLVMScalarOpts. More to go. llvm-svn: 250197	2015-10-13 18:26:00 +00:00
Sanjoy Das	b873cbe5c9	[IndVars] NFC Cleanup. - Rename methods according to the LLVM Coding Style - Merge adjacent anonymous namespace block - Use `auto` in two places llvm-svn: 250152	2015-10-13 07:17:38 +00:00
Manman Ren	9f824dab1d	Revert 250089 due to bot failure. It failed when building clang itself with PGO. llvm-svn: 250145	2015-10-13 03:38:02 +00:00
Cong Hou	3320bcd815	Update the branch weight metadata in JumpThreading pass. In JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250089	2015-10-12 19:44:08 +00:00
Sanjoy Das	cc16ccc1ab	[IndVars] Use `auto`; NFC llvm-svn: 249944	2015-10-10 06:33:33 +00:00
Owen Anderson	97ca0f3f2c	Generalize convergent check to handle invokes as well as calls. llvm-svn: 249892	2015-10-09 20:17:46 +00:00
Owen Anderson	2c9978b12b	Teach LoopUnswitch not to perform non-trivial unswitching on loops containing convergent operations. Doing so could cause the post-unswitching convergent ops to be control-dependent on the unswitch condition where they were not before. This check could be refined to allow unswitching where the convergent operation was already control-dependent on the unswitch condition. llvm-svn: 249874	2015-10-09 18:40:20 +00:00
Owen Anderson	d95b08a0a7	Refine the definition of convergent to only disallow the addition of new control dependencies. This covers the common case of operations that cannot be sunk. Operations that cannot be hoisted should already be handled properly via the safe-to-speculate rules and mechanisms. llvm-svn: 249865	2015-10-09 18:06:13 +00:00
Andrea Di Biagio	99493df257	[MemCpyOpt] Fix wrong merging adjacent nontemporal stores into memset calls. Pass MemCpyOpt doesn't check if a store instruction is nontemporal. As a consequence, adjacent nontemporal stores are always merged into a memset call. Example: ;;; define void @foo(<4 x float>* nocapture %p) { entry: store <4 x float> zeroinitializer, <4 x float>* %p, align 16, !nontemporal !0 %p1 = getelementptr inbounds <4 x float>, <4 x float>* %dst, i64 1 store <4 x float> zeroinitializer, <4 x float>* %p1, align 16, !nontemporal !0 ret void } !0 = !{i32 1} ;;; In this example, the two nontemporal stores are combined to a memset of zero which does not preserve the nontemporal hint. Later on the backend (tested on a x86-64 corei7) expands that memset call into a sequence of two normal 16-byte aligned vector stores. opt -memcpyopt example.ll -S -o - \| llc -mcpu=corei7 -o - Before: xorps %xmm0, %xmm0 movaps %xmm0, 16(%rdi) movaps %xmm0, (%rdi) With this patch, we no longer merge nontemporal stores into calls to memset. In this example, llc correctly expands the two stores into two movntps: xorps %xmm0, %xmm0 movntps %xmm0, 16(%rdi) movntps %xmm0, (%rdi) In theory, we could extend the usage of !nontemporal metadata to memcpy/memset calls. However a change like that would only have the effect of forcing the backend to expand !nontemporal memsets back to sequences of store instructions. A memset library call would not have exactly the same semantic of a builtin !nontemporal memset call. So, SelectionDAG will have to conservatively expand it back to a sequence of !nontemporal stores (effectively undoing the merging). Differential Revision: http://reviews.llvm.org/D13519 llvm-svn: 249820	2015-10-09 10:53:41 +00:00
Arnaud A. de Grandmaison	859b2ac07d	[EarlyCSE] Address post commit review for r249523. llvm-svn: 249814	2015-10-09 09:23:01 +00:00
Sanjoy Das	3c520a1272	[RS4GC] Refactoring to make a later change easier, NFCI Summary: These non-semantic changes will help make a later change adding support for deopt operand bundles more streamlined. Reviewers: reames, swaroop.sridhar Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13491 llvm-svn: 249779	2015-10-08 23:18:38 +00:00
Sanjoy Das	c21a05a3a4	[PlaceSafeopints] Extract out `callsGCLeafFunction`, NFC Summary: This will be used in a later change to RewriteStatepointsForGC. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13490 llvm-svn: 249777	2015-10-08 23:18:30 +00:00
Sanjoy Das	1ede5367ba	[RS4GC] Don't copy ADT's unneccessarily, NFCI Summary: Use `const auto &` instead of `auto` in `makeStatepointExplicit`. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13454 llvm-svn: 249776	2015-10-08 23:18:22 +00:00
Sanjoy Das	40bdd041db	[RS4GC] Use AssertingVH for RematerializedValueMapTy, NFCI Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13489 llvm-svn: 249620	2015-10-07 21:32:35 +00:00
Arnaud A. de Grandmaison	a6178a179d	[EarlyCSE] Fix handling of target memory intrinsics for CSE'ing loads. Summary: Some target intrinsics can access multiple elements, using the pointer as a base address (e.g. AArch64 ld4). When trying to CSE such instructions, it must be checked the available value comes from a compatible instruction because the pointer is not enough to discriminate whether the value is correct. Reviewers: ssijaric Subscribers: mcrosier, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13475 llvm-svn: 249523	2015-10-07 07:41:29 +00:00
Sanjoy Das	60bf3db17f	[RS4GC] Remove an unnecessary assert & related variables I don't think this assert adds much value, and removing it and related variables avoids an "unused variable" warning in release builds. llvm-svn: 249511	2015-10-07 02:39:27 +00:00
Sanjoy Das	b40bd1a93f	[RS4GC] Cosmetic cleanup, NFC Summary: A series of cosmetic cleanup changes to RewriteStatepointsForGC: - Rename variables to LLVM style - Remove some redundant asserts - Remove an unsued `Pass *` parameter - Remove unnecessary variables - Use C++11 idioms where applicable - Pass CallSite by value, not reference Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13370 llvm-svn: 249508	2015-10-07 02:39:18 +00:00
Hans Wennborg	083ca9bb32	Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482	2015-10-06 23:24:35 +00:00
Sanjoy Das	5c8bead46d	[IndVars] Don't break dominance in `eliminateIdentitySCEV` Summary: After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and Y are related in the dominator tree, even if X is an operand to Y (I've included a toy example in comments, and a real example as a test case). This commit changes `SimplifyIndVar` to require a `DominatorTree`. I don't think this is a problem because `ScalarEvolution` requires it anyway. Fixes PR25051. Depends on D13459. Reviewers: atrick, hfinkel Subscribers: joker.eph, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13460 llvm-svn: 249471	2015-10-06 21:44:49 +00:00
Arnaud A. de Grandmaison	6fd488b156	[EarlyCSE] Constify ParseMemoryInst methods (NFC). llvm-svn: 249400	2015-10-06 13:35:30 +00:00
Piotr Padlewski	dc9b2cfc50	inariant.group handling in GVN The most important part required to make clang devirtualization works ( ͡°͜ʖ ͡°). The code is able to find non local dependencies, but unfortunatelly because the caller can only handle local dependencies, I had to add some restrictions to look for dependencies only in the same BB. http://reviews.llvm.org/D12992 llvm-svn: 249196	2015-10-02 22:12:22 +00:00
Jingyue Wu	df1a1b113b	[NaryReassociate] SeenExprs records WeakVH Summary: The instructions SeenExprs records may be deleted during rewriting. FindClosestMatchingDominator should ignore these deleted instructions. Fixes PR24301. Reviewers: grosser Subscribers: grosser, llvm-commits Differential Revision: http://reviews.llvm.org/D13315 llvm-svn: 248983	2015-10-01 03:51:44 +00:00
Fiona Glaser	b0c6d9174e	DeadCodeElimination: rewrite to be faster Same strategy as simplifyInstructionsInBlock. ~1/3 less time on my test suite. This pass doesn't have many in-tree users, but getting rid of an O(N^2) worst case and making it cleaner should at least make it a viable alternative to ADCE, since it's now consistently somewhat faster. llvm-svn: 248927	2015-09-30 17:49:49 +00:00
Chen Li	9f27fc0599	[LoopUnswitch] Add block frequency analysis to recognize hot/cold regions Summary: This patch adds block frequency analysis to LoopUnswitch pass to recognize hot/cold regions. For cold regions the pass only performs trivial unswitches since they do not increase code size, and for hot regions everything works as before. This helps to minimize code growth in cold regions and be more aggressive in hot regions. Currently the default cold regions are blocks with frequencies below 20% of function entry frequency, and it can be adjusted via -loop-unswitch-cold-block-frequency flag. The entire feature is controlled via -loop-unswitch-with-block-frequency flag and it is off by default. Reviewers: broune, silvas, dnovillo, reames Subscribers: davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D11605 llvm-svn: 248777	2015-09-29 05:03:32 +00:00
Weiming Zhao	310770a90f	[LoopReroll] Ignore debug intrinsics Originally, debug intrinsics and annotation intrinsics may prevent the loop to be rerolled, now they are ignored. Differential Revision: http://reviews.llvm.org/D13150 llvm-svn: 248718	2015-09-28 17:03:23 +00:00
Justin Bogner	0638b7ba99	ADCE: Fix typo in file comment. NFC llvm-svn: 248613	2015-09-25 21:03:46 +00:00
Lawrence Hu	cac0b89289	Swap loop invariant GEP with loop variant GEP to allow more LICM. This patch changes the order of GEPs generated by Splitting GEPs pass, specially when one of the GEPs has constant and the base is loop invariant, then we will generate the GEP with constant first when beneficial, to expose more cases for LICM. If originally Splitting GEP generate the following: do.body.i: %idxprom.i = sext i32 %shr.i to i64 %2 = bitcast %typeD* %s to i8* %3 = shl i64 %idxprom.i, 2 %uglygep = getelementptr i8, i8* %2, i64 %3 %uglygep7 = getelementptr i8, i8* %uglygep, i64 1032 ... Now it genereates: do.body.i: %idxprom.i = sext i32 %shr.i to i64 %2 = bitcast %typeD* %s to i8* %3 = shl i64 %idxprom.i, 2 %uglygep = getelementptr i8, i8* %2, i64 1032 %uglygep7 = getelementptr i8, i8* %uglygep, i64 %3 ... For no-loop cases, the original way of generating GEPs seems to expose more CSE cases, so we don't change the logic for no-loop cases, and only limit our change to the specific case we are interested in. llvm-svn: 248420	2015-09-23 19:25:30 +00:00
Igor Laevsky	029bd93c5d	[DeadStoreElimination] Remove dead zero store to calloc initialized memory This change allows dead store elimination to remove zero and null stores into memory freshly allocated with calloc-like function. Differential Revision: http://reviews.llvm.org/D13021 llvm-svn: 248374	2015-09-23 11:38:44 +00:00
Sanjoy Das	2aacc0ecca	[SCEV] Introduce ScalarEvolution::getOne and getZero. Summary: It is fairly common to call SE->getConstant(Ty, 0) or SE->getConstant(Ty, 1); this change makes such uses a little bit briefer. I've refactored the call sites I could find easily to use getZero / getOne. Reviewers: hfinkel, majnemer, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12947 llvm-svn: 248362	2015-09-23 01:59:04 +00:00
Michael Zolotukhin	deade19630	[Unroll] Do not crash trying to propagate a value to vector load. llvm-svn: 248333	2015-09-22 22:27:12 +00:00
Michael Zolotukhin	8bb31dd08a	[Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad. Apart from checking that GlobalVariable is a constant, we should check that it's not a weak constant, in which case we can't propagate its value. llvm-svn: 248327	2015-09-22 21:41:29 +00:00
NAKAMURA Takumi	10c80e7996	Prune trailing whitespaces. llvm-svn: 248265	2015-09-22 11:19:03 +00:00
NAKAMURA Takumi	0a7d0ad95f	Untabify. llvm-svn: 248264	2015-09-22 11:15:07 +00:00
NAKAMURA Takumi	a9cb538a74	Reformat blank lines. llvm-svn: 248263	2015-09-22 11:14:39 +00:00
NAKAMURA Takumi	84965031a7	Reformat comment lines. llvm-svn: 248262	2015-09-22 11:14:12 +00:00
NAKAMURA Takumi	70ad98aca4	Reformat. llvm-svn: 248261	2015-09-22 11:13:55 +00:00
Michael Zolotukhin	9f3aea6e1f	[LoopUnswitch] Require DominatorTree info. Summary: We should either require the DT info to be available, or check if it's available in every place we use DT (and we already miss such check in one place, which causes failures in some cases). As other loop passes preserve DT and it's usually available, it makes sense to just require it here. There is no regression test, because the bug only shows up if pass manager decides to clean DT info right before LoopUnswitch. If loop-unswitch is run separately, DT is available, so bug isn't exposed. Reviewers: chandlerc, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13036 llvm-svn: 248230	2015-09-22 00:22:47 +00:00
Philip Reames	5f99423de9	[LICM] Hoist calls to readonly argmemonly functions even with stores in the loop We know that an argmemonly function can only access memory pointed to by it's pointer arguments. Rather than needing to consider all possible stores as aliasing (as we do for a readonly function), we can only consider the aliasing of the pointer arguments. Note that this change only addresses hoisting. I'm thinking about how to address speculation safety as well, but that will be a different change. FYI, argmemonly disallows accessing memory through non-pointer typed arguments. Differential Revision: http://reviews.llvm.org/D12771 llvm-svn: 248220	2015-09-21 22:27:59 +00:00
Mehdi Amini	24e20583d1	Fix UB: can't bind a reference to nullptr (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 248213	2015-09-21 21:29:43 +00:00
Sanjoy Das	7cc2cfecd9	[IndVars] Use C++11 style field initialization; NFCI. llvm-svn: 248131	2015-09-20 18:42:53 +00:00
Sanjoy Das	e1e352d5c5	[IndVars] Don't add a level of indentation for namespace {. NFC. Whitespace-only change. llvm-svn: 248130	2015-09-20 18:42:50 +00:00
Sanjoy Das	9119bf4c0b	[IndVars] Don't repeat function names in comment; NFC. Only changes comments. llvm-svn: 248112	2015-09-20 06:58:03 +00:00
Sanjoy Das	428db150d1	[IndVars] Fix a bug in r248045. Because -indvars widens induction variables through arithmetic, `NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV` manages information for all transitive uses of an IV being widened, including uses of `-1 * IV`). Instead it must live on `NarrowIVDefUse` which manages information for a specific def-use edge in the transitive use list of an induction variable. This change also adds a test case that demonstrates the problem with r248045. llvm-svn: 248107	2015-09-20 01:52:18 +00:00
Sanjoy Das	f69d0e3384	[IndVars] Widen more comparisons for non-negative induction vars Summary: If an induction variable is provably non-negative, its sign extension is equal to its zero extension. This means narrow uses like icmp slt iNarrow %indvar, %rhs can be widened into icmp slt iWide zext(%indvar), sext(%rhs) Reviewers: atrick, mcrosier, hfinkel Subscribers: hfinkel, reames, llvm-commits Differential Revision: http://reviews.llvm.org/D12745 llvm-svn: 248045	2015-09-18 21:21:02 +00:00
Larisse Voufo	532bf7153c	Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan. (Complete version of r247497. See D12886) llvm-svn: 248022	2015-09-18 19:14:35 +00:00
Piotr Padlewski	a4d43337d4	gvn small fix http://reviews.llvm.org/D12928 llvm-svn: 247935	2015-09-17 20:34:22 +00:00
David L Kreitzer	da700ce581	Test commit: Fixed a few typos in the comments. llvm-svn: 247793	2015-09-16 13:27:30 +00:00
Michael Zolotukhin	fc314be0ec	[Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad. We only checked that a global is initialized with constants, which is incorrect. We should be checking that GlobalVariable is a constant, not just initialized with it. llvm-svn: 247769	2015-09-16 03:25:09 +00:00
Sanjoy Das	8a5526e8be	[IndVars] Fix PR24783. In `IndVarSimplify::ExpandSCEVIfNeeded`, `SCEVExpander::findExistingExpansion` may return an `llvm::Value` that differs in type from the SCEV it was asked to find an expansion for (but computes the same value). In such cases, we fall back on `expandCodeFor`; and rely on LLVM to CSE the two equivalent expressions (different only by a no-op cast) into a single computation. I tried a few other approaches to fixing PR24783, all of which turned out to be more complex than this current version: 1. Move the `ExpandSCEVIfNeeded` logic into `expandCodeFor`. This got problematic because currently we do not pass in the `Loop *` into `expandCodeFor`. Changing the interface to do this is a more invasive change, and really does not make much semantic sense unless the SCEV being passed in is an add recurrence. There is also the problem of `expandCodeFor` being used in places other than `indvars` -- there may be performance / correctness issues elsewhere if `expandCodeFor` is moved from always generating IR from scratch to cache-like model. 2. Have `findExistingExpansion` only return expression with the correct type. This would make `isHighCostExpansionHelper` and thus `isHighCostExpansion` more conservative than necessary. 3. Insert casts on the value returned by `findExistingExpansion` if needed using `InsertNoopCastOfTo`. This is complicated because `InsertNoopCastOfTo` depends on internal state of its `SCEVExpander` (specifically `Builder.GetInserPoint()`), and this may not be set up when `ExpandSCEVIfNeeded` is called. 4. Manually insert casts on the value returned by `findExistingExpansion` if needed using `InsertNoopCastOfTo` via `CastInst::Create`. This is probably workable, but figuring out the location where the cast instruction needs to be inserted has enough edge cases (arguments, constants, invokes, LCSSA must be preserved) makes me feel what I have right now is simplest solution. llvm-svn: 247749	2015-09-15 23:45:39 +00:00
Sanjoy Das	0ce51a92a8	[IndVars] Rename variable; NFC. llvm-svn: 247748	2015-09-15 23:45:35 +00:00
Larisse Voufo	6b867c7254	Revert "Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan." for preliminary community discussion (See. D12886) llvm-svn: 247716	2015-09-15 19:14:05 +00:00
Igor Laevsky	bdc1eafe20	[CorrelatedValuePropagation] Infer nonnull attributes LazuValueInfo can prove that value is nonnull based on the context information. Make use of this ability to infer nonnull attributes for the call arguments. Differential Revision: http://reviews.llvm.org/D12836 llvm-svn: 247707	2015-09-15 17:51:50 +00:00
Marcello Maggioni	454faa84e2	[NaryReassociate] Add support for Mul instructions This patch extends the current pass by handling Mul instructions as well. Patch by: Volkan Keles (vkeles@apple.com) llvm-svn: 247705	2015-09-15 17:22:52 +00:00
Sanjoy Das	f75e15e5ac	[PlaceSafepoints] Make the width of a counted loop settable. Summary: This change lets a `PlaceSafepoints` client change how wide the trip count of a loop has to be for the loop to be considerd "counted", via `CountedLoopTripWidth`. It also removes the boolean `SkipCounted` flag and the `upperTripBound` constant -- we can get the old behavior of `SkipCounted` == `false` by setting `CountedLoopTripWidth` to `13` (2 ^ 13 == 8192). Reviewers: reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12789 llvm-svn: 247656	2015-09-15 01:42:48 +00:00
Chandler Carruth	29a18a4663	[PM] Port SROA to the new pass manager. In some ways this is a very boring port to the new pass manager as there are no interesting analyses or dependencies or other oddities. However, this does introduce the first good example of a transformation pass with non-trivial state porting to the new pass manager. I've tried to carve out patterns here to replicate elsewhere, and would appreciate comments on whether folks like these patterns: - A common need in the new pass manager is to effectively lift the pass class and some of its state into a public header file. Prior to this, LLVM used anonymous namespaces to provide "module private" types and utilities, but that doesn't scale to cases where a public header file is needed and the new pass manager will exacerbate that. The pattern I've adopted here is to use the namespace-cased-name of the core pass (what would be a module if we had them) as a module-private namespace. Then utility and other code can be declared and defined in this namespace. At some point in the future, we could even have (conditionally compiled) code that used modules features when available to do the same basic thing. - I've split the actual pass run method in two in order to expose a private method usable by the old pass manager to wrap the new class with a minimum of duplicated code. I actually looked at a bunch of ways to automate or generate these, but they are all quite terrible IMO. The fundamental need is to extract the set of analyses which need to cross this interface boundary, and that will end up being too unpredictable to effectively encapsulate IMO. This is also a relatively small amount of boiler plate that will live a relatively short time, so I'm not too worried about the fact that it is boiler plate. The rest of the patch is totally boring but results in a massive diff (sorry). It just moves code around and removes or adds qualifiers to reflect the new name and nesting structure. Differential Revision: http://reviews.llvm.org/D12773 llvm-svn: 247501	2015-09-12 09:09:14 +00:00
Larisse Voufo	f57162b6e7	Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan. llvm-svn: 247497	2015-09-12 01:41:55 +00:00
James Molloy	efbba72cb2	Add GlobalsAA as preserved to a bunch of transforms GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA. llvm-svn: 247263	2015-09-10 10:22:12 +00:00
Philip Reames	953817b65d	[RewriteStatepointsForGC] Minor refactor to use shared implementation [NFC] llvm-svn: 247223	2015-09-10 00:44:10 +00:00
Philip Reames	b4e55f3923	[RewriteStatepointsForGC] Strengthen a confusingly weak assertion [NFC] The assertion was weaker than it should be and gave the impression we're growing the number of base defining values being considered during the fixed point interation. That's not true. The tighter form of the assert is useful documentation. llvm-svn: 247221	2015-09-10 00:32:56 +00:00
Philip Reames	c8ded462c4	[RewriteStatepointsForGC] One last bit of naming [NFCI] llvm-svn: 247220	2015-09-10 00:27:50 +00:00
Philip Reames	34d7a7493d	[RewriteStatepointsForGC] Further style/naming fixup [NFCI] llvm-svn: 247217	2015-09-10 00:22:49 +00:00
Philip Reames	7540e3a45d	[RewriteStatepointsForGC] More naming cleanup [NFCI] llvm-svn: 247213	2015-09-10 00:01:53 +00:00
Philip Reames	ece70b8042	[RewriteStatepointsForGC] Code cleanup [NFC] Factor out common code related to naming values, fix a small style issue. More to follow in separate changes. llvm-svn: 247211	2015-09-09 23:57:18 +00:00
Philip Reames	6628713f4f	[RewriteStatepointsForGC] Extend base pointer inference to handle insertelement This change is simply enhancing the existing inference algorithm to handle insertelement instructions by conservatively inserting a new instruction to propagate the vector of associated base pointers. In the process, I'm ripping out the peephole optimizations which mostly helped cover the fact this hadn't been done. Note that most of the newly inserted nodes will be nearly immediately removed by the post insertion optimization pass introduced in 246718. Arguably, we should be trying harder to avoid the malloc traffic here, but I'd rather get the code correct, then worry about compile time. Unlike previous extensions of the algorithm to handle more case, I discovered the existing code was causing miscompiles in some cases. In particular, we had an implicit assumption that the peephole covered all insert element instructions, so if we had a value directly based on a insert element the peephole didn't cover, we proceeded as if it were a base anyways. Not good. I believe we had the same issue with shufflevector which is why I adjusted the predicate for them as well. Differential Revision: http://reviews.llvm.org/D12583 llvm-svn: 247210	2015-09-09 23:40:12 +00:00
Philip Reames	15d5563cea	[RewriteStatepointsForGC] Make base pointer inference deterministic Previously, the base pointer algorithm wasn't deterministic. The core fixed point was (of course), but we were inserting new nodes and optimizing them in an order which was unspecified and variable. We'd somewhat hacked around this for testing by sorting by value name, but that doesn't solve the general determinism problem. Instead, we can use the order of traversal over the def/use graph to give us a single consistent ordering. Today, this is a DFS order, but the exact order doesn't mater provided it's deterministic for a given input. (Q: It is safe to rely on a deterministic order of operands right?) Note that this only fixes the determinism within a single inference step. The inference step is currently invoked many times in a non-deterministic order. That's a future change in the sequence. :) Differential Revision: http://reviews.llvm.org/D12640 llvm-svn: 247208	2015-09-09 23:26:08 +00:00
Chandler Carruth	7b560d40bd	[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are within a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167	2015-09-09 17:55:00 +00:00
Chandler Carruth	1688a772fc	Fix a typo I spotted when hacking on SROA. Somewhat alarming that nothing broke. llvm-svn: 247127	2015-09-09 09:46:16 +00:00
Sanjoy Das	da0d79e0a0	[IRCE] Add INITIALIZE_PASS_DEPENDENCY invocations. IRCE was just using INITIALIZE_PASS(), which is incorrect. llvm-svn: 247122	2015-09-09 03:47:18 +00:00
Philip Reames	3ea158950e	[RewriteStatepointsForGC] Extract common code, comment, and fix a build warning [NFC] llvm-svn: 246810	2015-09-03 21:57:40 +00:00
Philip Reames	f5b8e47651	[RewriteStatepointsForGC] Strengthen invariants around BDVs As a first step towards a new implementation of the base pointer inference algorithm, introduce an abstraction for BDVs, strengthen the assertions around them, and rewrite the BDV relation code in terms of the abstraction which includes an explicit notion of whether the BDV is also a base. The later is motivated by the fact we had a bug where insertelement was always assumed to be a base pointer even though the BDV code knew it wasn't. The strengthened assertions in this patch would have caught that bug. The next step will be to separate the DefiningValueMap into a BDV use list cache (entirely within findBasePointers) and a base pointer cache. Having the former will allow me to use a deterministic visit order when visiting BDVs in the inference algorithm and remove a bunch of ordering related hacks. Before actually doing the last step, I'm likely going to extend the lattice with a 'BaseN' (seen only base inputs) state so that I can kill the post process optimization step. Phabricator Revision: http://reviews.llvm.org/D12608 llvm-svn: 246809	2015-09-03 21:34:30 +00:00
Philip Reames	246e618e77	[RewriteStatepointsForGC] Workaround a lack of determinism in visit order The visit order being used in the base pointer inference algorithm is currently non-deterministic. When working on http://reviews.llvm.org/D12583, I discovered that we were relying on a peephole optimization to get deterministic ordering in one of the test cases. This change is intented to let me test and land http://reviews.llvm.org/D12583. The current code will not be long lived. I'm starting to investigate a rewrite of the algorithm which will combine the post-process step into the initial algorithm and make the visit order determistic. Before doing that, I wanted to make sure the existing code was complete and the test were stable. Hopefully, patches should be up for review for the new algorithm this week or early next. llvm-svn: 246801	2015-09-03 20:24:29 +00:00
Philip Reames	07a2ee1aff	[RewriteStatepointsForGC] Delete stale comment [NFC] llvm-svn: 246722	2015-09-02 22:35:42 +00:00
Philip Reames	b3967cd08e	[RewriteStatepointsForGC] Pull a function out of anon namespace [NFC] Thanks to David Blaikie for noticing in previous commit. llvm-svn: 246721	2015-09-02 22:30:53 +00:00
Philip Reames	9546f367f7	[RewriteStatepointsForGC] Bugfix for change 246133 Fix a bug in change 246133. I didn't handle the case where we had a cycle in the use graph and could add an instruction we were about to erase back on to the worklist. Oddly, I have not been able to write a small test case for this, even with the AssertingVH added. I have confirmed the basic theory for the fix on a large failing example, but all attempts to reduce that to something appropriate for a test case have failed. Differential Revision: http://reviews.llvm.org/D12575 llvm-svn: 246718	2015-09-02 22:25:07 +00:00
Philip Reames	6906e92812	Fix release build warning for unused function llvm-svn: 246717	2015-09-02 21:57:17 +00:00
Philip Reames	dab35f317d	[RewriteStatepointsForGC] Improve debug output [NFC] llvm-svn: 246713	2015-09-02 21:11:44 +00:00
Piotr Padlewski	0c7d8fc1f6	assuem(X) handling in GVN bugfix There was infinite loop because it was trying to change assume(true) into assume(true) Also added handling when assume(false) appear http://reviews.llvm.org/D12516 llvm-svn: 246697	2015-09-02 20:00:03 +00:00
Piotr Padlewski	28ffcbe1cc	Constant propagation after hitting assume(cmp) bugfix Last time code run into assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246696	2015-09-02 19:59:59 +00:00
Piotr Padlewski	14e815c22b	Constant propagation after hiting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246695	2015-09-02 19:59:53 +00:00
Jingyue Wu	e84f671830	[JumpThreading] make jump threading respect convergent annotation. Summary: JumpThreading shouldn't duplicate a convergent call, because that would move a convergent call into a control-inequivalent location. For example, if (cond) { ... } else { ... } convergent_call(); if (cond) { ... } else { ... } should not be optimized to if (cond) { ... convergent_call(); ... } else { ... convergent_call(); ... } Test Plan: test/Transforms/JumpThreading/basic.ll Patch by Xuetian Weng. Reviewers: resistor, arsenm, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12484 llvm-svn: 246415	2015-08-31 06:10:27 +00:00
Chandler Carruth	4b682f6f24	[SROA] Fix PR24463, a crash I introduced in SROA by allowing it to handle more allocas with loads past the end of the alloca. I suspect there are some related crashers with slightly different patterns, but I'll fix those and add test cases as I find them. Thanks to David Majnemer for the excellent test case reduction here. Made this super simple to debug and fix. llvm-svn: 246289	2015-08-28 09:03:52 +00:00
Steven Wu	61db34d12e	Revert r246244 and r246243 These two commits cause clang/llvm bootstrap to hang. llvm-svn: 246279	2015-08-28 06:52:00 +00:00
Piotr Padlewski	3f81ec1e38	Constant propagation after hitting assume(cmp) bugfix Last time code run into assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246244	2015-08-28 01:02:00 +00:00
Piotr Padlewski	63cc5d4627	Constant propagation after hiting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246243	2015-08-28 01:01:57 +00:00
James Molloy	1bbf15c57c	[LoopVectorize] Extract InductionInfo into a helper class... ... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup. NFC llvm-svn: 246145	2015-08-27 09:53:00 +00:00
Philip Reames	dfd890dd3a	Allow value forwarding past release fences in EarlyCSE A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence. We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and then eliminate the former, but we can't do this if the stores are on opposite sides of the fence. Note: While more aggressive then what's there, this patch is still implementing a really conservative ordering. In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences. Differential Revision: http://reviews.llvm.org/D11434 llvm-svn: 246134	2015-08-27 01:32:33 +00:00
Philip Reames	abcdc5e3a8	[RewriteStatepointsForGC] Reduce the number of new instructions for base pointers When computing base pointers, we introduce new instructions to propagate the base of existing instructions which might not be bases. However, the algorithm doesn't make any effort to recognize when the new instruction to be inserted is the same as an existing one already in the IR. Since this is happening immediately before rewriting, we don't really have a chance to fix it after the pass runs without teaching loop passes about statepoints. I'm really not thrilled with this patch. I've rewritten it 4 different ways now, but this is the best I've come up with. The case where the new instruction is just the original base defining value could be merged into the existing algorithm with some complexity. The problem is that we might have something like an extractelement from a phi of two vectors. It may be trivially obvious that the base of the 0th element is an existing instruction, but I can't see how to make the algorithm itself figure that out. Thus, I resort to the call to SimplifyInstruction instead. Note that we can only adjust the instructions we've inserted ourselves. The live sets are still being tracked in side structures at this point in the code. We can't easily muck with instructions which might be in them. Long term, I'm really thinking we need to materialize the live pointer sets explicitly in the IR somehow rather than using side structures to track them. Differential Revision: http://reviews.llvm.org/D12004 llvm-svn: 246133	2015-08-27 01:02:28 +00:00
Chandler Carruth	748d095ff0	[SROA] Rip out all support for SSAUpdater in SROA. This was only added to preserve the old ScalarRepl's use of SSAUpdater which was originally to avoid use of dominance frontiers. Now, we only need a domtree, and we'll need a domtree right after this pass as well and so it makes perfect sense to always and only use the dom-tree powered mem2reg. This was flag-flipper earlier and has stuck reasonably so I wanted to gut the now-dead code out of SROA before we waste more time with it. Among other things, this will make passmanager porting easier. llvm-svn: 246028	2015-08-26 09:09:29 +00:00
NAKAMURA Takumi	c57a09821f	Update libdeps in LLVMipo and LLVMScalarOpts, corresponding to r245940. llvm-svn: 245957	2015-08-25 17:11:17 +00:00
Diego Novillo	4d71113cdb	Convert SampleProfile pass into a Module pass. Eventually, we will need sample profiles to be incorporated into the inliner's cost models. To do this, we need the sample profile pass to be a module pass. This patch makes no functional changes beyond the mechanical adjustments needed to run SampleProfile as a module pass. llvm-svn: 245940	2015-08-25 15:25:11 +00:00
Sanjay Patel	6b2765fe49	fix typo; NFC llvm-svn: 245869	2015-08-24 20:11:14 +00:00
Adrian Prantl	cbdfdb74d3	Rename Instruction::dropUnknownMetadata() to dropUnknownNonDebugMetadata() and make it always preserve debug locations, since all callers wanted this behavior anyway. This is addressing a post-commit review feedback for r245589. NFC (inside the LLVM tree). llvm-svn: 245622	2015-08-20 22:00:30 +00:00
Jingyue Wu	10fcea5d4b	[ValueTracking] computeOverflowForSignedAdd and isKnownNonNegative Summary: Refactor, NFC Extracts computeOverflowForSignedAdd and isKnownNonNegative from NaryReassociate to ValueTracking in case others need it. Reviewers: reames Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D11313 llvm-svn: 245591	2015-08-20 18:27:04 +00:00
Adrian Prantl	baf90fc265	Fix a bug that caused SimplifyCFG to drop DebugLocs. Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all metadata in KnownSet, but the condition for DebugLocs was inverted. Most users of dropUnknownMetadata() actually worked around this by not adding LLVMContext::MD_dbg to their list of KnowIDs. This is now made explicit. llvm-svn: 245589	2015-08-20 18:24:02 +00:00
Adrian Prantl	a317cd2583	Fix a debug location handling bug in GVN. Caught by the famous "DebugLoc describes the currect SubProgram" assertion. When GVN is removing a nonlocal load it updates the debug location of the SSA value it replaced the load with with the one of the load. In the testcase this actually overwrites a valid debug location with an empty one. In reality GVN has to make an arbitrary choice between two equally valid debug locations. This patch changes to behavior to only update the location if the value doesn't already have a debug location. llvm-svn: 245588	2015-08-20 18:23:56 +00:00
Adam Nemet	e48134093d	[LVer] Fix FIXME: hide addPHINodes, NFC Since Ashutosh made findDefsUsedOutsideOfLoop public, we can clean this up. Now clients that don't compute DefsUsedOutsideOfLoop can just call versionLoop() and computing DefsUsedOutsideOfLoop will happen implicitly. With that there is no reason to expose addPHINodes anymore. Ashutosh, you can now drop the calls to findDefsUsedOutsideOfLoop and addPHINodes in LVerLICM and things should just work. llvm-svn: 245579	2015-08-20 17:22:29 +00:00
Benjamin Kramer	fcdb1c14ac	Make helper functions static. NFC. llvm-svn: 245549	2015-08-20 09:57:22 +00:00
Bjorn Steinbrink	2e2f66557e	Revert "[DSE] Enable removal of lifetime intrinsics in terminating blocks" llvm-svn: 245543	2015-08-20 08:58:47 +00:00
Bjorn Steinbrink	cc7e8a9705	[DSE] Enable removal of lifetime intrinsics in terminating blocks Usually DSE is not supposed to remove lifetime intrinsics, but it's actually ok to remove them for dead objects in terminating blocks, because they convey no extra information there. Until we hit a lifetime start that cannot be removed, that is. Because from that point on the lifetime intrinsics become interesting again, e.g. for stack coloring. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11710 llvm-svn: 245542	2015-08-20 08:25:28 +00:00
David Majnemer	ba275f9947	Replace some calls to isa<LandingPadInst> with isEHPad() No functionality change is intended. llvm-svn: 245487	2015-08-19 19:54:02 +00:00
Nick Lewycky	1098e496e1	More clean up, still NFC. Remove dead variables now that the casts are gone. llvm-svn: 245420	2015-08-19 06:25:30 +00:00
Nick Lewycky	2c852543a3	Clean up this file a little. Remove dead casts, casting Values to Values. Adjust some comments for typos and whitespace. NFC. llvm-svn: 245419	2015-08-19 06:22:33 +00:00
Ashutosh Nema	c5b7b55589	Exposed findDefsUsedOutsideOfLoop as a loop utility function Exposed findDefsUsedOutsideOfLoop as a loop utility function by moving it from LoopDistribute to LoopUtils. Reviewed By: anemet llvm-svn: 245416	2015-08-19 05:40:42 +00:00
Eric Christopher	0efe9f60bb	Revert "Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks." This is causing bootstrap problems, e.g.: http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/2960 This reverts r245195. llvm-svn: 245402	2015-08-19 02:15:13 +00:00
Nick Lewycky	06b0ea2e8f	Fix three typos in comments; "easilly" -> "easily". llvm-svn: 245379	2015-08-18 22:41:58 +00:00
Justin Bogner	9f00ebaeda	Revert "Constant propagation after hiting llvm.assume" This was also failing bootstrap: http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build This reverts r245265. llvm-svn: 245269	2015-08-18 07:00:34 +00:00
Piotr Padlewski	94ca3783b8	Constant propagation after hiting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 245265	2015-08-18 03:55:30 +00:00
Karthik Bhat	3af28945b9	Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks. PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was deleting the next basic block iterator. Fixed the same by resetting the basic block iterator post call to DeleteDeadInstruction. llvm-svn: 245195	2015-08-17 05:51:39 +00:00
Chandler Carruth	2f1fd1658f	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't realy called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Sanjoy Das	94c4aecf83	[LSR][NFC] Don’t duplicate entity name at the beginning of the comment. llvm-svn: 245183	2015-08-16 18:22:46 +00:00
Sanjoy Das	302bfd04b5	[LSR][NFC] Use camelCase for method names in Formula and RegUseTracker. llvm-svn: 245182	2015-08-16 18:22:43 +00:00
David Majnemer	e04443baff	Revert "Add support for cross block dse. This patch enables dead stroe elimination across basicblocks." This reverts commit r245025, it caused PR24469. llvm-svn: 245172	2015-08-16 07:11:59 +00:00
David Majnemer	0bc0eef71c	[IR] Give catchret an optional 'return value' operand Some personality routines require funclet exit points to be clearly marked, this is done by producing a token at the funclet pad and consuming it at the corresponding ret instruction. CleanupReturnInst already had a spot for this operand but CatchReturnInst did not. Other personality routines don't need to use this which is why it has been made optional. llvm-svn: 245149	2015-08-15 02:46:08 +00:00
Matt Arsenault	427a0fd22e	LoopStrengthReduce: Try to pass address space to isLegalAddressingMode This seems to only work some of the time. In some situations, this seems to use a nonsensical type and isn't actually aware of the memory being accessed. e.g. if branch condition is an icmp of a pointer, it checks the addressing mode of i1. llvm-svn: 245137	2015-08-15 00:53:06 +00:00
James Molloy	87405c7f66	Separate out BDCE's analysis into a separate DemandedBits analysis. This allows other areas of the compiler to use BDCE's bit-tracking. NFCI. llvm-svn: 245039	2015-08-14 11:09:09 +00:00
Adam Nemet	06ccf0145f	[LVer] Remove unused Pass parameter from versionLoop, NFC llvm-svn: 245032	2015-08-14 06:30:26 +00:00
David Majnemer	b611e3f50e	[IR] Add token types This introduces the basic functionality to support "token types". The motivation stems from the need to perform operations on a Value whose provenance cannot be obscured. There are several applications for such a type but my immediate motivation stems from WinEH. Our personality routine enforces a single-entry - single-exit regime for cleanups. After several rounds of optimizations, we may be left with a terminator whose "cleanup-entry block" is not entirely clear because control flow has merged two cleanups together. We have experimented with using labels as operands inside of instructions which are not terminators to indicate where we came from but found that LLVM does not expect such exotic uses of BasicBlocks. Instead, we can use this new type to clearly associate the "entry point" and "exit point" of our cleanup. This is done by having the cleanuppad yield a Token and consuming it at the cleanupret. The token type makes it impossible to obscure or otherwise hide the Value, making it trivial to track the relationship between the two points. What is the burden to the optimizer? Well, it turns out we have already paid down this cost by accepting that there are certain calls that we are not permitted to duplicate, optimizations have to watch out for such instructions anyway. There are additional places in the optimizer that we will probably have to update but early examination has given me the impression that this will not be heroic. Differential Revision: http://reviews.llvm.org/D11861 llvm-svn: 245029	2015-08-14 05:09:07 +00:00
Karthik Bhat	ddc2a86a00	Add support for cross block dse. This patch enables dead stroe elimination across basicblocks. Example: define void @test_02(i32 %N) { %1 = alloca i32 store i32 %N, i32* %1 store i32 10, i32* @x %2 = load i32, i32* %1 %3 = icmp ne i32 %2, 0 br i1 %3, label %4, label %5 ; <label>:4 store i32 5, i32* @x br label %7 ; <label>:5 %6 = load i32, i32* @x store i32 %6, i32* @y br label %7 ; <label>:7 store i32 15, i32* @x ret void } In the above example dead store "store i32 5, i32* @x" is now eliminated. Differential Revision: http://reviews.llvm.org/D11143 llvm-svn: 245025	2015-08-14 04:17:23 +00:00
Chandler Carruth	1db22822b4	[PM/AA] Hoist the interface to TBAA into a dedicated header along with its creation function. Update the relevant includes accordingly. llvm-svn: 245019	2015-08-14 03:33:48 +00:00
Chandler Carruth	42ff448fe4	[PM/AA] Hoist ScopedNoAliasAA's interface into a header and move the creation function there. Same basic refactoring as the other alias analyses. Nothing special required this time around. llvm-svn: 245012	2015-08-14 02:55:50 +00:00
Jingyue Wu	1238f341ba	[SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow. Summary: This patch implements my promised optimization to reunites certain sexts from operands after we extract the constant offset. See the header comment of reuniteExts for its motivation. One key building block that enables this optimization is Bjarke's poison value analysis (D11212). That helps to prove "a +nsw b" can't overflow. Reviewers: broune Subscribers: jholewinski, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12016 llvm-svn: 245003	2015-08-14 02:02:05 +00:00
Chandler Carruth	bf143e2a20	[LIR] Re-instate r244880, reverted in r244884, factoring the handling of AliasAnalysis in LoopIdiomRecognize. The previous commit to LIR, r244879, exposed some scary bug in the loop pass pipeline with an assert failure that showed up on several bots. This patch got reverted as part of getting that revision reverted, but they're actually independent and unrelated. This patch has no functional change and should be completely safe. It is also useful for my current work on the AA infrastructure. llvm-svn: 244993	2015-08-14 00:21:10 +00:00
Sanjay Patel	a75c41e5f3	don't repeat function names in comments; NFC llvm-svn: 244977	2015-08-13 22:53:20 +00:00
Jingyue Wu	13a80eaceb	[SeparateConstOffsetFromGEP] strengthen the inbounds attribute We used to be over-conservative about preserving inbounds. Actually, the second GEP (which applies the constant offset) can inherit the inbounds attribute of the original GEP, because the resultant pointer is equivalent to that of the original GEP. For example, x = GEP inbounds a, i+5 => y = GEP a, i // inbounds removed x = GEP inbounds y, 5 // inbounds preserved llvm-svn: 244937	2015-08-13 18:48:49 +00:00
Erik Eckstein	11fc8175d9	[DeadStoreElimination] remove a redundant store even if the load is in a different block. DeadStoreElimination does eliminate a store if it stores a value which was loaded from the same memory location. So far this worked only if the store is in the same block as the load. Now we can also handle stores which are in a different block than the load. Example: define i32 @test(i1, i32) { entry: %l2 = load i32, i32 %1, align 4 br i1 %0, label %bb1, label %bb2 bb1: br label %bb3 bb2: ; This store is redundant store i32 %l2, i32* %1, align 4 br label %bb3 bb3: ret i32 0 } Differential Revision: http://reviews.llvm.org/D11854 llvm-svn: 244901	2015-08-13 15:36:11 +00:00
Renato Golin	655348f0b2	Revert "[LIR] Start leveraging the fundamental guarantees of a loop..." This reverts commit r244879, as it broke the test-suite on SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64. llvm-svn: 244885	2015-08-13 11:25:38 +00:00
Renato Golin	4d57906b0e	Revert "[LIR] Handle access to AliasAnalysis the same way as the other analysis in LoopIdiomRecognize." This reverts commit r244880, as it broke the test-suite on SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64. llvm-svn: 244884	2015-08-13 11:25:35 +00:00
Ashutosh Nema	47802628f7	Test Commit. llvm-svn: 244883	2015-08-13 11:18:35 +00:00
Chandler Carruth	c2af09823f	[LIR] Handle access to AliasAnalysis the same way as the other analysis in LoopIdiomRecognize. This is what started me staring at this code. Now migrating it with the new AA stuff will be trivial. llvm-svn: 244880	2015-08-13 10:00:53 +00:00
Chandler Carruth	8ae7b81559	[LIR] Start leveraging the fundamental guarantees of a loop in simplified form to remove redundant checks and simplify the code for popcount recognition. We don't actually need to handle all of these cases. I've left a FIXME for one in particular until I finish inspecting to make sure we don't actually rely on the predicate in any way. llvm-svn: 244879	2015-08-13 09:56:20 +00:00
Chandler Carruth	18c2669aca	[LIR] Handle the LoopInfo the same as all the other analyses. No utility really in breaking pattern just for this analysis. llvm-svn: 244878	2015-08-13 09:27:01 +00:00
Chen Li	f458c6f313	[LoopUnswitch] Check OptimizeForSize before traversing over all basic blocks in current loop Summary: This patch moves the check of OptimizeForSize before traversing over all basic blocks in current loop. If OptimizeForSize is set to true, no non-trivial unswitch is ever allowed. Therefore, the early exit will help reduce compilation time. This patch should be NFC. Reviewers: reames, weimingz, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11997 llvm-svn: 244868	2015-08-13 05:24:29 +00:00
Chandler Carruth	dc298329cc	[LIR] Make the LoopIdiomRecognize pass get analyses essentially the same way as every other pass. This simplifies the code quite a bit and is also more idiomatic! <ba-dum!> llvm-svn: 244853	2015-08-13 01:03:26 +00:00
Chandler Carruth	8219a501da	[LIR] Remove the dedicated class for popcount recognition and sink the code into methods on LoopIdiomRecognize. This simplifies the code somewhat and also makes it much easier to move the analyses around. Ultimately, the separate class wasn't providing significant value over methods -- it contained the precondition basic block and the current loop. The current loop is already available and the precondition block wasn't needed everywhere and is easy to pass around. In several cases I just moved things to be static functions because they already accepted most of their inputs as arguments. This doesn't fix the way we manage analyses yet, that will be the next patch, but it already makes the code over 50 lines shorter. No functionality changed. llvm-svn: 244851	2015-08-13 00:44:29 +00:00
Chandler Carruth	d9c6070c98	[LIR] Move all the helpers to be private and re-order the methods in a way that groups things logically. No functionality changed. llvm-svn: 244845	2015-08-13 00:10:03 +00:00
Chandler Carruth	be158b17db	[LIR] Remove the 'LIRUtils' abstraction which was unnecessary and adding complexity. There is only one function that was called from multiple locations, and that was 'getBranch' which has a reasonable one-line spelling already: dyn_cast<BranchInst>(BB->getTerminator). We could make this shorter, but it doesn't seem to add much value. Instead, we should avoid calling it so many times on the same basic blocks, but that will be in a subsequent patch. The other functions are only called in one location, so inline them there, and take advantage of this to use direct early exit and reduce indentation. This makes it much more clear what is being tested for, and in fact makes it clear now to me that there are simpler ways to do this work. However, this patch just does the mechanical inlining. I'll clean up the functionality of the code to leverage loop simplified form more effectively in a follow-up. Despite lots of early line breaks due to early-exit, this is still shorter than it was before. llvm-svn: 244841	2015-08-12 23:55:56 +00:00
Chandler Carruth	bad690e8f7	[LIR] Run clang-format over LoopIdiomRecognize in preparation for a significant code cleanup here. The handling of analyses in this pass is overly complex and can be simplified significantly, but the right way to do that is to simplify all of the code not just the analyses, and that'll require pretty extensive edits that would be noisy with formatting changes mixed into them. llvm-svn: 244828	2015-08-12 23:06:37 +00:00
Philip Reames	971dc3a82a	[RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints To be clear: this is an optimization not a correctness change. CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuze many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll. One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust. I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're extending the live range of it's inputs potentially. Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly. Differential Revision: http://reviews.llvm.org/D11819 llvm-svn: 244821	2015-08-12 22:11:45 +00:00
Philip Reames	9ac4e38a16	[RewriteStatepointsForGC] Handle extractelement fully in the base pointer algorithm When rewriting the IR such that base pointers are available for every live pointer, we potentially need to duplicate instructions to propagate the base. The original code had only handled PHI and Select under the belief those were the only instructions which would need duplicated. When I added support for vector instructions, I'd added a collection of hacks for ExtractElement which caught most of the common cases. Of course, I then found the one test case my hacks couldn't cover. :) This change removes all of the early hacks for extract element. By defining extractelement as a BDV (rather than trying to look through it), we can extend the rewriting algorithm to duplicate the extract as needed. Note that a couple of peephole optimizations were left in for the moment, because while we now handle extractelement as a first class citizen, we're not yet handling insertelement. That change will follow in the near future. llvm-svn: 244808	2015-08-12 21:00:20 +00:00
Chandler Carruth	19ac7d5b29	[PM/AA] Add missing static dependency edges from DSE and memdep to TLI. I forgot to add these in r244780 and r244778. Sorry about that. Also order the static dependencies in a lexicographical order. llvm-svn: 244787	2015-08-12 18:10:45 +00:00
Chandler Carruth	dbe40fb45e	[PM/AA] Stop getting the TargetLibraryInfo out of the AliasAnalysis and just depend on it directly. This was particularly frustrating because there was a really wide mixture of using a member variable and re-extracting it from the AA that happened to be around. I think the result is much more clear. I've also deleted all of the pointless null checks and used references across the APIs where I could to make it explicit that this cannot be null in a useful fashion. llvm-svn: 244780	2015-08-12 18:01:44 +00:00
Sanjay Patel	956e29cc8e	don't repeat function names in comments; NFC llvm-svn: 244672	2015-08-11 21:24:04 +00:00
Sanjay Patel	41f3d95f76	fix 80-cols; NFC llvm-svn: 244668	2015-08-11 21:11:56 +00:00
Igor Laevsky	4709c03715	[IndVarSimplify] Make cost estimation in RewriteLoopExitValues smarter Differential Revision: http://reviews.llvm.org/D11687 llvm-svn: 244474	2015-08-10 18:23:58 +00:00
Mark Heffernan	8939154a22	Add new llvm.loop.unroll.enable metadata. This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time. The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll". llvm-svn: 244466	2015-08-10 17:28:08 +00:00
Fraser Cormack	e29ab2bfab	Prevent the scalarizer from caching incorrect entries The scalarizer can cache incorrect entries when walking up a chain of insertelement instructions. This occurs when it encounters more than one instruction that it is not actively searching for, as it unconditionally caches every element it finds. The fix is to only cache the first element that it isn't searching for so we don't overwrite correct entries. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D11559 llvm-svn: 244448	2015-08-10 14:48:47 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
Adam Nemet	15840393f3	[LAA] Make the set of runtime checks part of the state of LAA, NFC This is the full set of checks that clients can further filter. IOW, it's client-agnostic. This makes LAA complete in the sense that it now provides the two main results of its analysis precomputed: 1. memory dependences via getDepChecker().getInsterestingDependences() 2. run-time checks via getRuntimePointerCheck().getChecks() However, as a consequence we now compute this information pro-actively. Thus if the client decides to skip the loop based on the dependences we've computed the checks unnecessarily. In order to see whether this was a significant overhead I checked compile time on SPEC2k6 LTO bitcode files. The change was in the noise. The checks are generated in canCheckPtrAtRT, at the same place where we used to call groupChecks to merge checks. llvm-svn: 244368	2015-08-07 22:44:15 +00:00
Pete Cooper	ebcd748927	Convert a bunch of loops to foreach. NFC. After r244074, we now have a successors() method to iterate over all the successors of a TerminatorInst. This commit changes a bunch of eligible loops to use it. llvm-svn: 244260	2015-08-06 20:22:46 +00:00
Nico Rieck	78199518c4	Rename inst_range() to instructions() for consistency. NFC llvm-svn: 244248	2015-08-06 19:10:45 +00:00

... 8 9 10 11 12 ...

7658 Commits