llvm-project

Commit Graph

Author	SHA1	Message	Date
Piotr Padlewski	28ffcbe1cc	Constant propagation after hitting assume(cmp) bugfix Last time code run into assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246696	2015-09-02 19:59:59 +00:00
Piotr Padlewski	14e815c22b	Constant propagation after hiting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246695	2015-09-02 19:59:53 +00:00
Benjamin Kramer	f175e04435	[RemoveDuplicatePHINodes] Start over after removing a PHI. This makes RemoveDuplicatePHINodes more effective and fixes an assertion failure. Triggering the assertions requires a DenseSet reallocation so this change only contains a constructive test. I'll explain the issue with a small example. In the following function there's a duplicate PHI, %4 and %5 are identical. When this is found the DenseSet in RemoveDuplicatePHINodes contains %2, %3 and %4. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %5, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] %5 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } after RemoveDuplicatePHINodes runs the function looks like this. %3 has changed and is now identical to %2, but RemoveDuplicatePHINodes never saw this. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %4, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } If the DenseSet does a reallocation now it will reinsert all keys and stumble over %3 now having a different hash value than it had when inserted into the map for the first time. This change clears the set whenever a PHI is deleted and starts the progress from the beginning, allowing %3 to be deleted and avoiding inconsistent DenseSet state. This potentially has a negative performance impact because it rescans all PHIs, but I don't think that this ever makes a difference in practice. llvm-svn: 246694	2015-09-02 19:52:23 +00:00
James Molloy	1e583704f5	[LV] Don't bail to MiddleBlock if a runtime check fails, bail to ScalarPH instead We were bailing to two places if our runtime checks failed. If the initial overflow check failed, we'd go to ScalarPH. If any other check failed, we'd go to MiddleBlock. This caused us to have to have an extra PHI per induction and reduction as the vector loop's exit block was not dominated by its latch. There's no need to have this behavior - if we just always go to ScalarPH we can get rid of a bunch of complexity. llvm-svn: 246637	2015-09-02 10:15:39 +00:00
James Molloy	f2523e38d8	[LV] Move some code around slightly to make the intent of the function more clear. NFC. llvm-svn: 246636	2015-09-02 10:15:32 +00:00
James Molloy	aca2f400ba	[LV] Cleanup: Sink an IRBuilder closer to its uses. NFC. llvm-svn: 246635	2015-09-02 10:15:27 +00:00
James Molloy	cba9230507	[LV] Refactor all runtime check emissions into helper functions. This reduces the complexity of createEmptyBlock() and will open the door to further refactoring. The test change is simply because we're now constant folding a trivial test. llvm-svn: 246634	2015-09-02 10:15:22 +00:00
James Molloy	ff623dce39	[LV] Pull creation of trip counts into a helper function. ... and do a tad of tidyup while we're at it. Because StartIdx must now be zero, there's no difference between Count and EndIdx. llvm-svn: 246633	2015-09-02 10:15:16 +00:00
James Molloy	239ff5d193	[LV] Factor the creation of the loop induction variable out of createEmptyLoop() It makes things easier to understand if this is in a helper method. This is part of my ongoing spaghetti-removal operation on createEmptyLoop. llvm-svn: 246632	2015-09-02 10:15:09 +00:00
James Molloy	a860a2216a	[LV] Never widen an induction variable. There's no need to widen canonical induction variables. It's just as efficient to create a new, wide, induction variable. Consider, if we widen an indvar, then we'll have to truncate it before its uses anyway (1 trunc). If we create a new indvar instead, we'll have to truncate that instead (1 trunc) [besides which IndVars should go and clean up our mess after us anyway on principle]. This lets us remove a ton of special-casing code. llvm-svn: 246631	2015-09-02 10:15:05 +00:00
James Molloy	c07701b017	[LV] Switch to using canonical induction variables. Vectorized loops only ever have one induction variable. All induction PHIs from the scalar loop are rewritten to be in terms of this single indvar. We were trying very hard to pick an indvar that already existed, even if that indvar wasn't canonical (didn't start at zero). But trying so hard is really fruitless - creating a new, canonical, indvar only results in one extra add in the worst case and that add is trivially easy to push through the PHI out of the loop by instcombine. If we try and be less clever here and instead let instcombine clean up our mess (as we do in many other places in LV), we can remove unneeded complexity. llvm-svn: 246630	2015-09-02 10:14:54 +00:00
Yaron Keren	611c7cff53	Move createEliminateAvailableExternallyPass earlier in the pass pipeline to save running many ModulePasses on available external functions that are thrown away anyhow. llvm-svn: 246619	2015-09-02 06:34:11 +00:00
Hans Wennborg	dada1d20ba	DeadArgElim: don't eliminate arguments from naked functions Differential Revision: http://reviews.llvm.org/D12534 llvm-svn: 246564	2015-09-01 18:06:46 +00:00
Hans Wennborg	043bf5b296	Fix Windows build by including raw_ostream.h llvm-svn: 246486	2015-08-31 21:19:18 +00:00
Philip Reames	a88caeab6c	[FunctionAttr] Infer nonnull attributes on returns Teach FunctionAttr to infer the nonnull attribute on return values of functions which never return a potentially null value. This is done both via a conservative local analysis for the function itself and a optimistic per-SCC analysis. If no function in the SCC returns anything which could be null (other than values from other functions in the SCC), we can conclude no function returned a null pointer. Even if some function within the SCC returns a null pointer, we may be able to locally conclude that some don't. Differential Revision: http://reviews.llvm.org/D9688 llvm-svn: 246476	2015-08-31 19:44:38 +00:00
Jingyue Wu	e84f671830	[JumpThreading] make jump threading respect convergent annotation. Summary: JumpThreading shouldn't duplicate a convergent call, because that would move a convergent call into a control-inequivalent location. For example, if (cond) { ... } else { ... } convergent_call(); if (cond) { ... } else { ... } should not be optimized to if (cond) { ... convergent_call(); ... } else { ... convergent_call(); ... } Test Plan: test/Transforms/JumpThreading/basic.ll Patch by Xuetian Weng. Reviewers: resistor, arsenm, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12484 llvm-svn: 246415	2015-08-31 06:10:27 +00:00
Sanjoy Das	6f5dca70ed	[InstCombine] Fix PR24605. PR24605 is caused due to an incorrect insert point in instcombine's IR builder. When simplifying %t = add X Y ... %m = icmp ... %t the replacement for %t should be placed before %t, not before %m, as there could be a use of %t between %t and %m. llvm-svn: 246315	2015-08-28 19:09:31 +00:00
Chad Rosier	dc65532fd9	Optimize memcmp(x,y,n)==0 for small n and suitably aligned x/y. http://reviews.llvm.org/D6952 PR20673 llvm-svn: 246313	2015-08-28 18:30:18 +00:00
JF Bastien	f5aa1ca655	Remove Merge Functions pointer comparisons Summary: This patch removes two remaining places where pointer value comparisons are used to order functions: comparing range annotation metadata, and comparing block address constants. (These are both rare cases, and so no actual non-determinism was observed from either case). The fix for range metadata is simple: the annotation always consists of a pair of integers, so we just order by those integers. The fix for block addresses is more subtle. Two constants are the same if they are the same basic block in the same function, or if they refer to corresponding basic blocks in each respective function. Note that in the first case, merging is trivially correct. In the second, the correctness of merging relies on the fact that the the values of block addresses cannot be compared. This change is actually an enhancement, as these functions could not previously be merged (see merge-block-address.ll). There is still a problem with cross function block addresses, in that constants pointing to a basic block in a merged function is not updated. This also more robustly compares floating point constants by all fields of their semantics, and fixes a dyn_cast/cast mixup. Author: jrkoenig Reviewers: dschuff, nlewycky, jfb Subscribers llvm-commits Differential revision: http://reviews.llvm.org/D12376 llvm-svn: 246305	2015-08-28 16:49:09 +00:00
Chandler Carruth	4b682f6f24	[SROA] Fix PR24463, a crash I introduced in SROA by allowing it to handle more allocas with loads past the end of the alloca. I suspect there are some related crashers with slightly different patterns, but I'll fix those and add test cases as I find them. Thanks to David Majnemer for the excellent test case reduction here. Made this super simple to debug and fix. llvm-svn: 246289	2015-08-28 09:03:52 +00:00
Steven Wu	61db34d12e	Revert r246244 and r246243 These two commits cause clang/llvm bootstrap to hang. llvm-svn: 246279	2015-08-28 06:52:00 +00:00
Piotr Padlewski	3f81ec1e38	Constant propagation after hitting assume(cmp) bugfix Last time code run into assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246244	2015-08-28 01:02:00 +00:00
Piotr Padlewski	63cc5d4627	Constant propagation after hiting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246243	2015-08-28 01:01:57 +00:00
Tyler Nowicki	5eaa5a9d26	Improve vectorization diagnostic messages and extend vectorize(enable) pragma. This patch changes the analysis diagnostics produced when loops with floating-point recurrences or memory operations are identified. The new messages say "cannot prove it is safe to reorder * operations; allow reordering by specifying #pragma clang loop vectorize(enable)". Depending on the type of diagnostic the message will include additional options such as ffast-math or __restrict__. This patch also allows the vectorize(enable) pragma to override the low pointer memory check threshold. When the hint is given a higher threshold is used. See the clang patch for the options produced for each diagnostic. llvm-svn: 246187	2015-08-27 18:56:49 +00:00
Chad Rosier	c94f8e2906	[LoopVectorize] Add Support for Small Size Reductions. Unlike scalar operations, we can perform vector operations on element types that are smaller than the native integer types. We type-promote scalar operations if they are smaller than a native type (e.g., i8 arithmetic is promoted to i32 arithmetic on Arm targets). This patch detects and removes type-promotions within the reduction detection framework, enabling the vectorization of small size reductions. In the legality phase, we look through the ANDs and extensions that InstCombine creates during promotion, keeping track of the smaller type. In the profitability phase, we use the smaller type and ignore the ANDs and extensions in the cost model. Finally, in the code generation phase, we truncate the result of the reduction to allow InstCombine to rewrite the entire expression in the smaller type. This fixes PR21369. http://reviews.llvm.org/D12202 Patch by Matt Simpson <mssimpso@codeaurora.org>! llvm-svn: 246149	2015-08-27 14:12:17 +00:00
James Molloy	1bbf15c57c	[LoopVectorize] Extract InductionInfo into a helper class... ... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup. NFC llvm-svn: 246145	2015-08-27 09:53:00 +00:00
Alex Rosenberg	a0a19c1c91	Whoops, remove trailing whitespace. llvm-svn: 246141	2015-08-27 05:37:12 +00:00
Philip Reames	dfd890dd3a	Allow value forwarding past release fences in EarlyCSE A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence. We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and then eliminate the former, but we can't do this if the stores are on opposite sides of the fence. Note: While more aggressive then what's there, this patch is still implementing a really conservative ordering. In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences. Differential Revision: http://reviews.llvm.org/D11434 llvm-svn: 246134	2015-08-27 01:32:33 +00:00
Philip Reames	abcdc5e3a8	[RewriteStatepointsForGC] Reduce the number of new instructions for base pointers When computing base pointers, we introduce new instructions to propagate the base of existing instructions which might not be bases. However, the algorithm doesn't make any effort to recognize when the new instruction to be inserted is the same as an existing one already in the IR. Since this is happening immediately before rewriting, we don't really have a chance to fix it after the pass runs without teaching loop passes about statepoints. I'm really not thrilled with this patch. I've rewritten it 4 different ways now, but this is the best I've come up with. The case where the new instruction is just the original base defining value could be merged into the existing algorithm with some complexity. The problem is that we might have something like an extractelement from a phi of two vectors. It may be trivially obvious that the base of the 0th element is an existing instruction, but I can't see how to make the algorithm itself figure that out. Thus, I resort to the call to SimplifyInstruction instead. Note that we can only adjust the instructions we've inserted ourselves. The live sets are still being tracked in side structures at this point in the code. We can't easily muck with instructions which might be in them. Long term, I'm really thinking we need to materialize the live pointer sets explicitly in the IR somehow rather than using side structures to track them. Differential Revision: http://reviews.llvm.org/D12004 llvm-svn: 246133	2015-08-27 01:02:28 +00:00
Tyler Nowicki	e0f400feaa	Improved printing of analysis diagnostics in the loop vectorizer. This patch ensures that every analysis diagnostic produced by the vectorizer will be printed if the loop has a vectorization hint on it. The condition has also been improved to prevent printing when a disabling hint is specified. llvm-svn: 246132	2015-08-27 01:02:04 +00:00
Philip Reames	98a2dabc08	[SimplifyCFG] Prune code from a provably unreachable switch default As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine. Note: Duplicate case values are disallowed by the LangRef and verifier. Differential Revision: http://reviews.llvm.org/D11995 llvm-svn: 246125	2015-08-26 23:56:46 +00:00
Diego Novillo	7732ae4a4f	Fix memory leak in sample profile pass. The problem here were the function analyses invoked by the function pass manager from the new IPO pass. I looked at other IPO passes needing dominance information and the only one that requires it (partial inliner) does not use the standard dependency mechanism. This patch mimics what the partial inliner does to compute dominance, post-dominance and loop info. One thing I like about this approach is that I can delay the computation of all this until I actually need it. This should bring the ASAN buildbot back to green. If there's a better way to fix this, I'll do it in a follow-up patch. llvm-svn: 246066	2015-08-26 20:00:27 +00:00
David Majnemer	3354fe473f	[SimplifyLibCalls] Fix a typo cbrt(sqrt(x)) calculates the sixth root, not the ninth root. cbrt(cbrt(x)) calculates the ninth root. llvm-svn: 246046	2015-08-26 18:30:16 +00:00
Chandler Carruth	748d095ff0	[SROA] Rip out all support for SSAUpdater in SROA. This was only added to preserve the old ScalarRepl's use of SSAUpdater which was originally to avoid use of dominance frontiers. Now, we only need a domtree, and we'll need a domtree right after this pass as well and so it makes perfect sense to always and only use the dom-tree powered mem2reg. This was flag-flipper earlier and has stuck reasonably so I wanted to gut the now-dead code out of SROA before we waste more time with it. Among other things, this will make passmanager porting easier. llvm-svn: 246028	2015-08-26 09:09:29 +00:00
Alex Rosenberg	81cfed21ca	Modernize with range-based for loops. llvm-svn: 246018	2015-08-26 06:11:41 +00:00
Alex Rosenberg	99805ed45a	Reduce code duplication. llvm-svn: 246017	2015-08-26 06:11:38 +00:00
Alex Rosenberg	5b3404a03e	Trailing whitespace llvm-svn: 246016	2015-08-26 06:11:36 +00:00
JF Bastien	9dc042a0b6	Comparing operands should not require the same ValueID Summary: When comparing basic blocks, there is an additional check that two Value*'s should have the same ID, which interferes with merging equivalent constants of different kinds (such as a ConstantInt and a ConstantPointerNull in the included testcase). The cmpValues function already ensures that the two values in each function are the same, so removing this check should not cause incorrect merging. Also, the type comparison is redundant, based on reviewing the code and testing on the test suite and several large LTO bitcodes. Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12302 llvm-svn: 246001	2015-08-26 03:02:58 +00:00
Charles Davis	119525914c	Make variable argument intrinsics behave correctly in a Win64 CC function. Summary: This change makes the variable argument intrinsics, `llvm.va_start` and `llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch I have to add support for GCC's `__builtin_ms_va_list` constructs. Reviewers: nadav, asl, eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1622 llvm-svn: 245990	2015-08-25 23:27:41 +00:00
Evgeniy Stepanov	d04d07e65e	[msan] Precise instrumentation for icmp sgt %x, -1. Extend signed relational comparison instrumentation with a special case for comparisons with -1. This fixes an MSan false positive when such comparison is used as a sign bit test. https://llvm.org/bugs/show_bug.cgi?id=24561 llvm-svn: 245980	2015-08-25 22:19:11 +00:00
NAKAMURA Takumi	c57a09821f	Update libdeps in LLVMipo and LLVMScalarOpts, corresponding to r245940. llvm-svn: 245957	2015-08-25 17:11:17 +00:00
Matthias Braun	a7fc3856f1	Fix dependencies/shared library build llvm-svn: 245955	2015-08-25 17:07:40 +00:00
Wei Mi	edae87d819	The patch replace the overflow check in loop vectorization with the minimum loop iterations check. The loop minimum iterations check below ensures the loop has enough trip count so the generated vector loop will likely be executed, and it covers the overflow check. Differential Revision: http://reviews.llvm.org/D12107. llvm-svn: 245952	2015-08-25 16:43:47 +00:00
Diego Novillo	4d71113cdb	Convert SampleProfile pass into a Module pass. Eventually, we will need sample profiles to be incorporated into the inliner's cost models. To do this, we need the sample profile pass to be a module pass. This patch makes no functional changes beyond the mechanical adjustments needed to run SampleProfile as a module pass. llvm-svn: 245940	2015-08-25 15:25:11 +00:00
Piotr Padlewski	4e7f752bb8	Assume intrinsic handling in global opt It doesn't solve the problem, when for example we load something, and then assume that it is the same as some constant value, because globalopt will fail on unknown load instruction. The proposed solution would be to skip some instructions that we can't evaluate and they are safe to skip (f.e. load, assume and many others) and see if they are required to perform optimization (f.e. we don't care about ephemeral instructions that may appear using @llvm.assume()) http://reviews.llvm.org/D12266 llvm-svn: 245919	2015-08-25 01:34:15 +00:00
Sanjay Patel	6b2765fe49	fix typo; NFC llvm-svn: 245869	2015-08-24 20:11:14 +00:00
Adhemerval Zanella	4754e2d59c	[sanitizers] Add DFSan support for AArch64 42-bit VMA This patch adds support for dfsan on aarch64-linux with 42-bit VMA (current default config for 64K pagesize kernels). The support is enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time for both clang/llvm and compiler-rt. The default VMA is 39 bits. llvm-svn: 245840	2015-08-24 13:48:10 +00:00
Mehdi Amini	d134a67ce9	Require Dominator Tree For SROA, improve compile-time TL-DR: SROA is followed by EarlyCSE which requires the DominatorTree. There is no reason not to require it up-front for SROA. Some history is necessary to understand why we ended-up here. r123437 switched the second (Legacy)SROA in the optimizer pipeline to use SSAUpdater in order to avoid recomputing the costly DominanceFrontier. The purpose was to speed-up the compile-time. Later r123609 removed the need for the DominanceFrontier in (Legacy)SROA. Right after, some cleanup was made in r123724 to remove any reference to the DominanceFrontier. SROA existed in two flavors: SROA_SSAUp and SROA_DT (the latter replacing SROA_DF). The second argument of `createScalarReplAggregatesPass` was renamed from `UseDomFrontier` to `UseDomTree`. I believe this is were a mistake was made. The pipeline was not updated and the call site was still: PM->add(createScalarReplAggregatesPass(-1, false)); At that time, SROA was immediately followed in the pipeline by EarlyCSE which required alread the DominatorTree. Not requiring the DominatorTree in SROA didn't save anything, but unfortunately it was lost at this point. When the new SROA Pass was introduced in r163965, I believe the goal was to have an exact replacement of the existing SROA, this bug slipped through. You can see currently: $ echo "" \| clang -x c++ -O3 -c - -mllvm -debug-pass=Structure ... ... FunctionPass Manager SROA Dominator Tree Construction Early CSE After this patch: $ echo "" \| clang -x c++ -O3 -c - -mllvm -debug-pass=Structure ... ... FunctionPass Manager Dominator Tree Construction SROA Early CSE This improves the compile time from 88s to 23s for PR17855. https://llvm.org/bugs/show_bug.cgi?id=17855 And from 113s to 12s for PR16756 https://llvm.org/bugs/show_bug.cgi?id=16756 Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D12267 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245820	2015-08-23 22:15:49 +00:00
Joseph Tremoulet	8220bcc570	[WinEH] Require token linkage in EH pad/ret signatures Summary: WinEHPrepare is going to require that cleanuppad and catchpad produce values of token type which are consumed by any cleanupret or catchret exiting the pad. This change updates the signatures of those operators to require/enforce that the type produced by the pads is token type and that the rets have an appropriate argument. The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that restriction, this change adds a notion of an operator constraint to both LLParser and BitcodeReader, allowing appropriate sentinels to be constructed for forward references and appropriate error messages to be emitted for illegal inputs. Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad predecessor must have no other predecessors; this ensures that WinEHPrepare will see the expected linear relationship between sibling catches on the same try. Lastly, remove some superfluous/vestigial casts from instruction operand setters operating on BasicBlocks. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12108 llvm-svn: 245797	2015-08-23 00:26:33 +00:00
JF Bastien	057292a76c	Improve the determinism of MergeFunctions Summary: Merge functions previously relied on unsigned comparisons of pointer values to order functions. This caused observable non-determinism in the compiler for large bitcode programs. Basically, opt -mergefuncs program.bc \| md5sum produces different hashes when run repeatedly on the same machine. Differing output was observed on three large bitcodes, but it was less frequent on the smallest file. It is possible that this only manifests on the large inputs, hence remaining undetected until now. This patch fixes this by removing (almost, see below) all places where comparisons between pointers are used to order functions. Most of these changes are local, but the comparison of global values requires assigning an identifier to each local in the order it is visited. This is very similar to the way the comparison function identifies Value's defined within a function. Because the order of visiting the functions and their subparts is deterministic, the identifiers assigned to the globals will be as well, and the order of functions will be deterministic. With these changes, there is no more observed non-determinism. There is also only minor slowdowns (negligible to 4%) compared to the baseline, which is likely a result of the fact that global comparisons involve hash lookups and not just pointer comparisons. The one caveat so far is that programs containing BlockAddress constants can still be non-deterministic. It is not clear what the right solution is here. In particular, even if the global numbers are used to order by function, we still need a way to order the BasicBlock's. Unfortunately, we cannot just bail out and fail to order the functions or consider them equal, because we require a total order over functions. Note that programs with BlockAddress constants are relatively rare, so the impact of leaving this in is minor as long as this pass is opt-in. Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: jevinskie, llvm-commits, chapuni Differential revision: http://reviews.llvm.org/D12168 llvm-svn: 245762	2015-08-21 23:27:24 +00:00
Tyler Nowicki	552a62fabc	Standardized 'failed' to 'Failed' in LoopVectorizationRequirements. llvm-svn: 245759	2015-08-21 23:03:24 +00:00
Sanjoy Das	c86c162a58	Re-apply r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" The original checkin was buggy, this change has a fix. Original commit message: [InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potentiall payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245753	2015-08-21 22:22:37 +00:00
David Blaikie	88208840b5	[opaque pointer type]: Pass explicit pointee type when building a constant GEP. Gets a bit tricky in the ValueMapper, of course - not sure if we should just expose a list of explicit types for each Value so that the ValueMapper can be neutral to these special cases (it's OK for things like load, where the explicit type is the result type - but when that's not the case, it means plumbing through another "special" type... ) llvm-svn: 245728	2015-08-21 20:16:51 +00:00
NAKAMURA Takumi	6a6232818d	Revert r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" It caused miscompilation in clang. llvm-svn: 245678	2015-08-21 07:46:07 +00:00
Peter Collingbourne	1dc6a8d179	TransformUtils: Introduce module splitter. The module splitter splits a module into linkable partitions. It will be used to implement parallel LTO code generation. This initial version of the splitter does not attempt to deal with the somewhat subtle symbol visibility issues around module splitting. These will be dealt with in a future change. Differential Revision: http://reviews.llvm.org/D12132 llvm-svn: 245662	2015-08-21 02:48:20 +00:00
Sanjoy Das	e472d8a57a	[InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potentiall payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245635	2015-08-20 22:31:55 +00:00
Michael Zolotukhin	51b00e6d82	[SLP] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245633	2015-08-20 22:28:15 +00:00
Michael Zolotukhin	2a3d99fedf	[LoopVectorize] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245632	2015-08-20 22:27:38 +00:00
Adrian Prantl	cbdfdb74d3	Rename Instruction::dropUnknownMetadata() to dropUnknownNonDebugMetadata() and make it always preserve debug locations, since all callers wanted this behavior anyway. This is addressing a post-commit review feedback for r245589. NFC (inside the LLVM tree). llvm-svn: 245622	2015-08-20 22:00:30 +00:00
Adhemerval Zanella	e00b497242	[asan] Add ASAN support for AArch64 42-bit VMA This patch adds support for asan on aarch64-linux with 42-bit VMA (current default config for 64K pagesize kernels). The support is enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time for both clang/llvm and compiler-rt. The default VMA is 39 bits. llvm-svn: 245594	2015-08-20 18:30:40 +00:00
Jingyue Wu	10fcea5d4b	[ValueTracking] computeOverflowForSignedAdd and isKnownNonNegative Summary: Refactor, NFC Extracts computeOverflowForSignedAdd and isKnownNonNegative from NaryReassociate to ValueTracking in case others need it. Reviewers: reames Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D11313 llvm-svn: 245591	2015-08-20 18:27:04 +00:00
Adrian Prantl	baf90fc265	Fix a bug that caused SimplifyCFG to drop DebugLocs. Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all metadata in KnownSet, but the condition for DebugLocs was inverted. Most users of dropUnknownMetadata() actually worked around this by not adding LLVMContext::MD_dbg to their list of KnowIDs. This is now made explicit. llvm-svn: 245589	2015-08-20 18:24:02 +00:00
Adrian Prantl	a317cd2583	Fix a debug location handling bug in GVN. Caught by the famous "DebugLoc describes the currect SubProgram" assertion. When GVN is removing a nonlocal load it updates the debug location of the SSA value it replaced the load with with the one of the load. In the testcase this actually overwrites a valid debug location with an empty one. In reality GVN has to make an arbitrary choice between two equally valid debug locations. This patch changes to behavior to only update the location if the value doesn't already have a debug location. llvm-svn: 245588	2015-08-20 18:23:56 +00:00
Adam Nemet	e48134093d	[LVer] Fix FIXME: hide addPHINodes, NFC Since Ashutosh made findDefsUsedOutsideOfLoop public, we can clean this up. Now clients that don't compute DefsUsedOutsideOfLoop can just call versionLoop() and computing DefsUsedOutsideOfLoop will happen implicitly. With that there is no reason to expose addPHINodes anymore. Ashutosh, you can now drop the calls to findDefsUsedOutsideOfLoop and addPHINodes in LVerLICM and things should just work. llvm-svn: 245579	2015-08-20 17:22:29 +00:00
Balaram Makam	ccf59731e3	Optimize bitwise even/odd test (-x&1 -> x&1) to not use negation. Summary: We know that -x & 1 is equivalent to x & 1, avoid using negation for testing if a negative integer is even or odd. Reviewers: majnemer Subscribers: junbuml, mssimpso, gberry, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D12156 llvm-svn: 245569	2015-08-20 15:35:00 +00:00
Benjamin Kramer	fcdb1c14ac	Make helper functions static. NFC. llvm-svn: 245549	2015-08-20 09:57:22 +00:00
Bjorn Steinbrink	2e2f66557e	Revert "[DSE] Enable removal of lifetime intrinsics in terminating blocks" llvm-svn: 245543	2015-08-20 08:58:47 +00:00
Bjorn Steinbrink	cc7e8a9705	[DSE] Enable removal of lifetime intrinsics in terminating blocks Usually DSE is not supposed to remove lifetime intrinsics, but it's actually ok to remove them for dead objects in terminating blocks, because they convey no extra information there. Until we hit a lifetime start that cannot be removed, that is. Because from that point on the lifetime intrinsics become interesting again, e.g. for stack coloring. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11710 llvm-svn: 245542	2015-08-20 08:25:28 +00:00
Chandler Carruth	0f792189a4	[ARC] Pull the ObjC ARC components that really serve the role of analyses into LLVM's Analysis library rather than having them in a Transforms library. This is motivated by the need to have the core AliasAnalysis infrastructure be aware of the ObjCARCAliasAnalysis. However, it also seems like a nice and clean separation. Everything was very easy to move and this doesn't create much clutter in the analysis library IMO. Differential Revision: http://reviews.llvm.org/D12133 llvm-svn: 245541	2015-08-20 08:06:03 +00:00
David Majnemer	ba275f9947	Replace some calls to isa<LandingPadInst> with isEHPad() No functionality change is intended. llvm-svn: 245487	2015-08-19 19:54:02 +00:00
Nick Lewycky	1098e496e1	More clean up, still NFC. Remove dead variables now that the casts are gone. llvm-svn: 245420	2015-08-19 06:25:30 +00:00
Nick Lewycky	2c852543a3	Clean up this file a little. Remove dead casts, casting Values to Values. Adjust some comments for typos and whitespace. NFC. llvm-svn: 245419	2015-08-19 06:22:33 +00:00
Ashutosh Nema	c5b7b55589	Exposed findDefsUsedOutsideOfLoop as a loop utility function Exposed findDefsUsedOutsideOfLoop as a loop utility function by moving it from LoopDistribute to LoopUtils. Reviewed By: anemet llvm-svn: 245416	2015-08-19 05:40:42 +00:00
Eric Christopher	0efe9f60bb	Revert "Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks." This is causing bootstrap problems, e.g.: http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/2960 This reverts r245195. llvm-svn: 245402	2015-08-19 02:15:13 +00:00
Nick Lewycky	06b0ea2e8f	Fix three typos in comments; "easilly" -> "easily". llvm-svn: 245379	2015-08-18 22:41:58 +00:00
Chandler Carruth	7adc3a2b0e	[PM/AA] Remove the last relics of the separate IPA library from LLVM, folding the code into the main Analysis library. There already wasn't much of a distinction between Analysis and IPA. A number of the passes in Analysis are actually IPA passes, and there doesn't seem to be any advantage to separating them. Moreover, it makes it hard to have interactions between analyses that are both local and interprocedural. In trying to make the Alias Analysis infrastructure work with the new pass manager, it becomes particularly awkward to navigate this split. I've tried to find all the places where we referenced this, but I may have missed some. I have also adjusted the C API to continue to be equivalently functional after this change. Differential Revision: http://reviews.llvm.org/D12075 llvm-svn: 245318	2015-08-18 17:51:53 +00:00
Sanjay Patel	1cd6d88e4d	use minSize wrapper; NFCI These were missed when other uses were switched over: http://llvm.org/viewvc/llvm-project?view=revision&revision=243994 llvm-svn: 245311	2015-08-18 16:44:23 +00:00
Justin Bogner	9f00ebaeda	Revert "Constant propagation after hiting llvm.assume" This was also failing bootstrap: http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build This reverts r245265. llvm-svn: 245269	2015-08-18 07:00:34 +00:00
Piotr Padlewski	94ca3783b8	Constant propagation after hiting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 245265	2015-08-18 03:55:30 +00:00
Karthik Bhat	3af28945b9	Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks. PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was deleting the next basic block iterator. Fixed the same by resetting the basic block iterator post call to DeleteDeadInstruction. llvm-svn: 245195	2015-08-17 05:51:39 +00:00
David Majnemer	8ed559ad22	Revert "[InstCombinePHI] Partial simplification of identity operations." This reverts commit r244887, it caused PR24470. llvm-svn: 245194	2015-08-17 03:11:26 +00:00
Chandler Carruth	2f1fd1658f	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't realy called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Benjamin Kramer	bb70d751de	[SimplifyLibCalls] Drop default template args. No functional change. llvm-svn: 245189	2015-08-16 21:16:37 +00:00
Sanjay Patel	57fd1dc5db	transform fmin/fmax calls when possible (PR24314) If we can ignore NaNs, fmin/fmax libcalls can become compare and select (this is what we turn std::min / std::max into). This IR should then be optimized in the backend to whatever is best for any given target. Eg, x86 can use minss/maxss instructions. This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314 Differential Revision: http://reviews.llvm.org/D11866 llvm-svn: 245187	2015-08-16 20:18:19 +00:00
Sanjoy Das	94c4aecf83	[LSR][NFC] Don’t duplicate entity name at the beginning of the comment. llvm-svn: 245183	2015-08-16 18:22:46 +00:00
Sanjoy Das	302bfd04b5	[LSR][NFC] Use camelCase for method names in Formula and RegUseTracker. llvm-svn: 245182	2015-08-16 18:22:43 +00:00
David Majnemer	e04443baff	Revert "Add support for cross block dse. This patch enables dead stroe elimination across basicblocks." This reverts commit r245025, it caused PR24469. llvm-svn: 245172	2015-08-16 07:11:59 +00:00
David Majnemer	dfa3b09541	[InstCombine] Replace an and+icmp with a trunc+icmp Bitwise arithmetic can obscure a simple sign-test. If replacing the mask with a truncate is preferable if the type is legal because it permits us to rephrase the comparison more explicitly. llvm-svn: 245171	2015-08-16 07:09:17 +00:00
NAKAMURA Takumi	5196275eea	MergeFunc: Quick fix for r245140, Ignore second, aka Function*, in sorting. Don't assume second would be ordered in the module. llvm-svn: 245168	2015-08-16 02:41:23 +00:00
Yaron Keren	dfb655fe17	Try to appease VS 2015 warnings from http://reviews.llvm.org/D11890 ByteSize and BitSize should not be size_t but unsigned, considering 1) They are at most 2^16 and 2^19, respectively. 2) BitSize is an argument to Type::getIntNTy which takes unsigned. Also, use the correct utostr instead itostr and cache the string result. Thanks to James Touton for reporting this! llvm-svn: 245167	2015-08-15 19:06:14 +00:00
David Majnemer	0bc0eef71c	[IR] Give catchret an optional 'return value' operand Some personality routines require funclet exit points to be clearly marked, this is done by producing a token at the funclet pad and consuming it at the corresponding ret instruction. CleanupReturnInst already had a spot for this operand but CatchReturnInst did not. Other personality routines don't need to use this which is why it has been made optional. llvm-svn: 245149	2015-08-15 02:46:08 +00:00
JF Bastien	5e4303dc14	Accelerate MergeFunctions with hashing This patch makes the Merge Functions pass faster by calculating and comparing a hash value which captures the essential structure of a function before performing a full function comparison. The hash is calculated by hashing the function signature, then walking the basic blocks of the function in the same order as the main comparison function. The opcode of each instruction is hashed in sequence, which means that different functions according to the existing total order cannot have the same hash, as the comparison requires the opcodes of the two functions to be the same order. The hash function is a static member of the FunctionComparator class because it is tightly coupled to the exact comparison function used. For example, functions which are equivalent modulo a single variant callsite might be merged by a more aggressive MergeFunctions, and the hash function would need to be insensitive to these differences in order to exploit this. The hashing function uses a utility class which accumulates the values into an internal state using a standard bit-mixing function. Note that this is a different interface than a regular hashing routine, because the values to be hashed are scattered amongst the properties of a llvm::Function, not linear in memory. This scheme is fast because only one word of state needs to be kept, and the mixing function is a few instructions. The main runOnModule function first computes the hash of each function, and only further processes functions which do not have a unique function hash. The hash is also used to order the sorted function set. If the hashes differ, their values are used to order the functions, otherwise the full comparison is done. Both of these are helpful in speeding up MergeFunctions. Together they result in speedups of 9% for mysqld (a mostly C application with little redundancy), 46% for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all three cases, the new speed of MergeFunctions is about half that of the module verifier, making it relatively inexpensive even for large LTO builds with hundreds of thousands of functions. The same functions are merged, so this change is free performance. Author: jrkoenig Reviewers: nlewycky, dschuff, jfb Subscribers: llvm-commits, aemerson Differential revision: http://reviews.llvm.org/D11923 llvm-svn: 245140	2015-08-15 01:18:18 +00:00
Matt Arsenault	427a0fd22e	LoopStrengthReduce: Try to pass address space to isLegalAddressingMode This seems to only work some of the time. In some situations, this seems to use a nonsensical type and isn't actually aware of the memory being accessed. e.g. if branch condition is an icmp of a pointer, it checks the addressing mode of i1. llvm-svn: 245137	2015-08-15 00:53:06 +00:00
Nick Lewycky	8075fd22b9	Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458! llvm-svn: 245119	2015-08-14 22:46:49 +00:00
Evgeniy Stepanov	24ac55d884	[msan] Fix handling of musttail calls. MSan instrumentation for return values of musttail calls is not allowed by the IR constraints, and not needed at the same time. llvm-svn: 245106	2015-08-14 22:03:50 +00:00
Justin Bogner	7ae63aa85d	[sancov] Fix an unused variable warning introduced in r245067 llvm-svn: 245072	2015-08-14 17:03:45 +00:00
Reid Kleckner	a57d015154	[sancov] Leave llvm.localescape in the entry block Summary: Similar to the change we applied to ASan. The same test case works. Reviewers: samsonov Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11961 llvm-svn: 245067	2015-08-14 16:45:42 +00:00
James Molloy	87405c7f66	Separate out BDCE's analysis into a separate DemandedBits analysis. This allows other areas of the compiler to use BDCE's bit-tracking. NFCI. llvm-svn: 245039	2015-08-14 11:09:09 +00:00
Adam Nemet	06ccf0145f	[LVer] Remove unused Pass parameter from versionLoop, NFC llvm-svn: 245032	2015-08-14 06:30:26 +00:00
David Majnemer	b611e3f50e	[IR] Add token types This introduces the basic functionality to support "token types". The motivation stems from the need to perform operations on a Value whose provenance cannot be obscured. There are several applications for such a type but my immediate motivation stems from WinEH. Our personality routine enforces a single-entry - single-exit regime for cleanups. After several rounds of optimizations, we may be left with a terminator whose "cleanup-entry block" is not entirely clear because control flow has merged two cleanups together. We have experimented with using labels as operands inside of instructions which are not terminators to indicate where we came from but found that LLVM does not expect such exotic uses of BasicBlocks. Instead, we can use this new type to clearly associate the "entry point" and "exit point" of our cleanup. This is done by having the cleanuppad yield a Token and consuming it at the cleanupret. The token type makes it impossible to obscure or otherwise hide the Value, making it trivial to track the relationship between the two points. What is the burden to the optimizer? Well, it turns out we have already paid down this cost by accepting that there are certain calls that we are not permitted to duplicate, optimizations have to watch out for such instructions anyway. There are additional places in the optimizer that we will probably have to update but early examination has given me the impression that this will not be heroic. Differential Revision: http://reviews.llvm.org/D11861 llvm-svn: 245029	2015-08-14 05:09:07 +00:00
Karthik Bhat	ddc2a86a00	Add support for cross block dse. This patch enables dead stroe elimination across basicblocks. Example: define void @test_02(i32 %N) { %1 = alloca i32 store i32 %N, i32* %1 store i32 10, i32* @x %2 = load i32, i32* %1 %3 = icmp ne i32 %2, 0 br i1 %3, label %4, label %5 ; <label>:4 store i32 5, i32* @x br label %7 ; <label>:5 %6 = load i32, i32* @x store i32 %6, i32* @y br label %7 ; <label>:7 store i32 15, i32* @x ret void } In the above example dead store "store i32 5, i32* @x" is now eliminated. Differential Revision: http://reviews.llvm.org/D11143 llvm-svn: 245025	2015-08-14 04:17:23 +00:00
Chandler Carruth	d541e7304f	[PM/AA] Run clang-format over the ObjCARC Alias Analysis code to normalize its formatting before I make more substantial changes. llvm-svn: 245024	2015-08-14 03:57:00 +00:00
Chandler Carruth	b4ebdf3d72	[PM/AA] Don't bother forward declaring Function and Value, just include their headers. llvm-svn: 245023	2015-08-14 03:55:36 +00:00
Chandler Carruth	21dcff799a	[PM/AA] Extract the interface for GlobalsModRef into a header along with its creation function. This required shifting a bunch of method definitions to be out-of-line so that we could leave most of the implementation guts in the .cpp file. llvm-svn: 245021	2015-08-14 03:48:20 +00:00
Chandler Carruth	1db22822b4	[PM/AA] Hoist the interface to TBAA into a dedicated header along with its creation function. Update the relevant includes accordingly. llvm-svn: 245019	2015-08-14 03:33:48 +00:00
Chandler Carruth	42ff448fe4	[PM/AA] Hoist ScopedNoAliasAA's interface into a header and move the creation function there. Same basic refactoring as the other alias analyses. Nothing special required this time around. llvm-svn: 245012	2015-08-14 02:55:50 +00:00
Chandler Carruth	8b046a42f4	[PM/AA] Extract a minimal interface for CFLAA to its own header file. I've used forward declarations and reorderd the source code some to make this reasonably clean and keep as much of the code as possible in the source file, including all the stratified set details. Just the basic AA interface and the create function are in the header file, and the header file is now included into the relevant locations. llvm-svn: 245009	2015-08-14 02:42:20 +00:00
Jingyue Wu	1238f341ba	[SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow. Summary: This patch implements my promised optimization to reunites certain sexts from operands after we extract the constant offset. See the header comment of reuniteExts for its motivation. One key building block that enables this optimization is Bjarke's poison value analysis (D11212). That helps to prove "a +nsw b" can't overflow. Reviewers: broune Subscribers: jholewinski, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12016 llvm-svn: 245003	2015-08-14 02:02:05 +00:00
Chandler Carruth	bf143e2a20	[LIR] Re-instate r244880, reverted in r244884, factoring the handling of AliasAnalysis in LoopIdiomRecognize. The previous commit to LIR, r244879, exposed some scary bug in the loop pass pipeline with an assert failure that showed up on several bots. This patch got reverted as part of getting that revision reverted, but they're actually independent and unrelated. This patch has no functional change and should be completely safe. It is also useful for my current work on the AA infrastructure. llvm-svn: 244993	2015-08-14 00:21:10 +00:00
Sanjay Patel	a75c41e5f3	don't repeat function names in comments; NFC llvm-svn: 244977	2015-08-13 22:53:20 +00:00
Davide Italiano	a195386ca1	[SimplifyLibCalls] Correctly set the is_zero_undef flag for llvm.cttz If <src> is non-zero we can safely set the flag to true, and this results in less code generated for, e.g. ffs(x) + 1 on FreeBSD. Thanks to majnemer for suggesting the fix and reviewing. Code generated before the patch was applied: 0: 0f bc c7 bsf %edi,%eax 3: b9 20 00 00 00 mov $0x20,%ecx 8: 0f 45 c8 cmovne %eax,%ecx b: 83 c1 02 add $0x2,%ecx e: b8 01 00 00 00 mov $0x1,%eax 13: 85 ff test %edi,%edi 15: 0f 45 c1 cmovne %ecx,%eax 18: c3 retq Code generated after the patch was applied: 0: 0f bc cf bsf %edi,%ecx 3: 83 c1 02 add $0x2,%ecx 6: 85 ff test %edi,%edi 8: b8 01 00 00 00 mov $0x1,%eax d: 0f 45 c1 cmovne %ecx,%eax 10: c3 retq It seems we can still use cmove and save another 'test' instruction, but that can be tackled separately. Differential Revision: http://reviews.llvm.org/D11989 llvm-svn: 244947	2015-08-13 20:34:26 +00:00
Jingyue Wu	13a80eaceb	[SeparateConstOffsetFromGEP] strengthen the inbounds attribute We used to be over-conservative about preserving inbounds. Actually, the second GEP (which applies the constant offset) can inherit the inbounds attribute of the original GEP, because the resultant pointer is equivalent to that of the original GEP. For example, x = GEP inbounds a, i+5 => y = GEP a, i // inbounds removed x = GEP inbounds y, 5 // inbounds preserved llvm-svn: 244937	2015-08-13 18:48:49 +00:00
Erik Eckstein	11fc8175d9	[DeadStoreElimination] remove a redundant store even if the load is in a different block. DeadStoreElimination does eliminate a store if it stores a value which was loaded from the same memory location. So far this worked only if the store is in the same block as the load. Now we can also handle stores which are in a different block than the load. Example: define i32 @test(i1, i32) { entry: %l2 = load i32, i32 %1, align 4 br i1 %0, label %bb1, label %bb2 bb1: br label %bb3 bb2: ; This store is redundant store i32 %l2, i32* %1, align 4 br label %bb3 bb3: ret i32 0 } Differential Revision: http://reviews.llvm.org/D11854 llvm-svn: 244901	2015-08-13 15:36:11 +00:00
Charlie Turner	6153698f26	[InstCombinePHI] Partial simplification of identity operations. Consider this code: BB: %i = phi i32 [ 0, %if.then ], [ %c, %if.else ] %add = add nsw i32 %i, %b ... In this common case the add can be moved to the %if.else basic block, because adding zero is an identity operation. If we go though %if.then branch it's always a win, because add is not executed; if not, the number of instructions stays the same. This pattern applies also to other instructions like sub, shl, shr, ashr \| 0, mul, sdiv, div \| 1. Patch by Jakub Kuderski! llvm-svn: 244887	2015-08-13 12:38:58 +00:00
Renato Golin	655348f0b2	Revert "[LIR] Start leveraging the fundamental guarantees of a loop..." This reverts commit r244879, as it broke the test-suite on SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64. llvm-svn: 244885	2015-08-13 11:25:38 +00:00
Renato Golin	4d57906b0e	Revert "[LIR] Handle access to AliasAnalysis the same way as the other analysis in LoopIdiomRecognize." This reverts commit r244880, as it broke the test-suite on SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64. llvm-svn: 244884	2015-08-13 11:25:35 +00:00
Ashutosh Nema	47802628f7	Test Commit. llvm-svn: 244883	2015-08-13 11:18:35 +00:00
Chandler Carruth	c2af09823f	[LIR] Handle access to AliasAnalysis the same way as the other analysis in LoopIdiomRecognize. This is what started me staring at this code. Now migrating it with the new AA stuff will be trivial. llvm-svn: 244880	2015-08-13 10:00:53 +00:00
Chandler Carruth	8ae7b81559	[LIR] Start leveraging the fundamental guarantees of a loop in simplified form to remove redundant checks and simplify the code for popcount recognition. We don't actually need to handle all of these cases. I've left a FIXME for one in particular until I finish inspecting to make sure we don't actually rely on the predicate in any way. llvm-svn: 244879	2015-08-13 09:56:20 +00:00
Chandler Carruth	18c2669aca	[LIR] Handle the LoopInfo the same as all the other analyses. No utility really in breaking pattern just for this analysis. llvm-svn: 244878	2015-08-13 09:27:01 +00:00
Simon Pilgrim	becd5e8abd	[InstCombine] SSE/AVX vector shifts demanded shift amount bits Most SSE/AVX (non-constant) vector shift instructions only use the lower 64-bits of the 128-bit shift amount vector operand, this patch calls SimplifyDemandedVectorElts to optimize for this. I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code, it means that SimplifyX86immshift now (re)decodes the type of shift. Differential Revision: http://reviews.llvm.org/D11938 llvm-svn: 244872	2015-08-13 07:39:03 +00:00
Chen Li	f458c6f313	[LoopUnswitch] Check OptimizeForSize before traversing over all basic blocks in current loop Summary: This patch moves the check of OptimizeForSize before traversing over all basic blocks in current loop. If OptimizeForSize is set to true, no non-trivial unswitch is ever allowed. Therefore, the early exit will help reduce compilation time. This patch should be NFC. Reviewers: reames, weimingz, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11997 llvm-svn: 244868	2015-08-13 05:24:29 +00:00
Chandler Carruth	dc298329cc	[LIR] Make the LoopIdiomRecognize pass get analyses essentially the same way as every other pass. This simplifies the code quite a bit and is also more idiomatic! <ba-dum!> llvm-svn: 244853	2015-08-13 01:03:26 +00:00
Chandler Carruth	8219a501da	[LIR] Remove the dedicated class for popcount recognition and sink the code into methods on LoopIdiomRecognize. This simplifies the code somewhat and also makes it much easier to move the analyses around. Ultimately, the separate class wasn't providing significant value over methods -- it contained the precondition basic block and the current loop. The current loop is already available and the precondition block wasn't needed everywhere and is easy to pass around. In several cases I just moved things to be static functions because they already accepted most of their inputs as arguments. This doesn't fix the way we manage analyses yet, that will be the next patch, but it already makes the code over 50 lines shorter. No functionality changed. llvm-svn: 244851	2015-08-13 00:44:29 +00:00
Chandler Carruth	d9c6070c98	[LIR] Move all the helpers to be private and re-order the methods in a way that groups things logically. No functionality changed. llvm-svn: 244845	2015-08-13 00:10:03 +00:00
Chandler Carruth	be158b17db	[LIR] Remove the 'LIRUtils' abstraction which was unnecessary and adding complexity. There is only one function that was called from multiple locations, and that was 'getBranch' which has a reasonable one-line spelling already: dyn_cast<BranchInst>(BB->getTerminator). We could make this shorter, but it doesn't seem to add much value. Instead, we should avoid calling it so many times on the same basic blocks, but that will be in a subsequent patch. The other functions are only called in one location, so inline them there, and take advantage of this to use direct early exit and reduce indentation. This makes it much more clear what is being tested for, and in fact makes it clear now to me that there are simpler ways to do this work. However, this patch just does the mechanical inlining. I'll clean up the functionality of the code to leverage loop simplified form more effectively in a follow-up. Despite lots of early line breaks due to early-exit, this is still shorter than it was before. llvm-svn: 244841	2015-08-12 23:55:56 +00:00
Chandler Carruth	bad690e8f7	[LIR] Run clang-format over LoopIdiomRecognize in preparation for a significant code cleanup here. The handling of analyses in this pass is overly complex and can be simplified significantly, but the right way to do that is to simplify all of the code not just the analyses, and that'll require pretty extensive edits that would be noisy with formatting changes mixed into them. llvm-svn: 244828	2015-08-12 23:06:37 +00:00
Philip Reames	971dc3a82a	[RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints To be clear: this is an optimization not a correctness change. CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuze many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll. One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust. I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're extending the live range of it's inputs potentially. Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly. Differential Revision: http://reviews.llvm.org/D11819 llvm-svn: 244821	2015-08-12 22:11:45 +00:00
Philip Reames	9ac4e38a16	[RewriteStatepointsForGC] Handle extractelement fully in the base pointer algorithm When rewriting the IR such that base pointers are available for every live pointer, we potentially need to duplicate instructions to propagate the base. The original code had only handled PHI and Select under the belief those were the only instructions which would need duplicated. When I added support for vector instructions, I'd added a collection of hacks for ExtractElement which caught most of the common cases. Of course, I then found the one test case my hacks couldn't cover. :) This change removes all of the early hacks for extract element. By defining extractelement as a BDV (rather than trying to look through it), we can extend the rewriting algorithm to duplicate the extract as needed. Note that a couple of peephole optimizations were left in for the moment, because while we now handle extractelement as a first class citizen, we're not yet handling insertelement. That change will follow in the near future. llvm-svn: 244808	2015-08-12 21:00:20 +00:00
Sanjay Patel	e24c60eb54	fix typo; NFC llvm-svn: 244805	2015-08-12 20:36:18 +00:00
Chandler Carruth	19ac7d5b29	[PM/AA] Add missing static dependency edges from DSE and memdep to TLI. I forgot to add these in r244780 and r244778. Sorry about that. Also order the static dependencies in a lexicographical order. llvm-svn: 244787	2015-08-12 18:10:45 +00:00
Chandler Carruth	d1a2e05991	[PM/AA] Explicitly depend on TLI rather than getting it out of the AliasAnalysis. Same as the other commits, the TLI access from an alias analysis is going away and isn't very clean -- it is better to explicitly mark the dependencies. llvm-svn: 244785	2015-08-12 18:06:08 +00:00
Chandler Carruth	dbe40fb45e	[PM/AA] Stop getting the TargetLibraryInfo out of the AliasAnalysis and just depend on it directly. This was particularly frustrating because there was a really wide mixture of using a member variable and re-extracting it from the AA that happened to be around. I think the result is much more clear. I've also deleted all of the pointless null checks and used references across the APIs where I could to make it explicit that this cannot be null in a useful fashion. llvm-svn: 244780	2015-08-12 18:01:44 +00:00
Adam Nemet	dfaeb33ec7	[LoopVer] Optionally allow using memchecks from LAA r243382 changed the behavior to always require a set of memchecks to be passed to LoopVer. This change restores the prior behavior as an alternative to the new behavior. This allows the checks to be implicitly taken from the LAA object. Patch by Ashutosh Nema! llvm-svn: 244763	2015-08-12 16:51:19 +00:00
Simon Pilgrim	93f59f53ca	unused variable warning fix. llvm-svn: 244725	2015-08-12 08:23:36 +00:00
Simon Pilgrim	8c049d5c03	[InstCombine] Move SSE/AVX vector blend folding to instcombiner As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely). InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask). I also moved all the relevant combine tests into InstCombine/blend_x86.ll Differential Revision: http://reviews.llvm.org/D11934 llvm-svn: 244723	2015-08-12 08:08:56 +00:00
Sanjoy Das	827529e7a0	Fix PR24354. `InstCombiner::OptimizeOverflowCheck` was asserting an invariant (operands to binary operations are ordered by decreasing complexity) that wasn't really an invariant. Fix this by instead having `InstCombiner::OptimizeOverflowCheck` establish the invariant if it does not hold. llvm-svn: 244676	2015-08-11 21:33:55 +00:00
Sanjay Patel	956e29cc8e	don't repeat function names in comments; NFC llvm-svn: 244672	2015-08-11 21:24:04 +00:00
Sanjay Patel	41f3d95f76	fix 80-cols; NFC llvm-svn: 244668	2015-08-11 21:11:56 +00:00
Chen Li	0786bc9fe8	[LowerSwitch] Skip dead blocks for processSwitchInst() Summary: This patch adds check for dead blocks and skip them for processSwitchInst(). This will help reduce compilation time. Reviewers: reames, hans Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11953 llvm-svn: 244656	2015-08-11 20:16:17 +00:00
Chen Li	10f01bd4d3	[LowerSwitch] Fix a bug when LowerSwitch deletes the default block Summary: LowerSwitch crashed with the attached test case after deleting the default block. This happened because the current implementation of deleting dead blocks is wrong. After the default block being deleted, it contains no instruction or terminator, and it should no be traversed anymore. However, since the iterator is advanced before processSwitchInst() function is executed, the block advanced to could be deleted inside processSwitchInst(). The deleted block would then be visited next and crash dyn_cast<SwitchInst>(Cur->getTerminator()) because Cur->getTerminator() returns a nullptr. This patch fixes this problem by recording dead default blocks into a list, and delete them after all processSwitchInst() has been done. It still possible to visit dead default blocks and waste time process them. But it is a compile time issue, and I plan to have another patch to add support to skip dead blocks. Reviewers: kariddi, resistor, hans, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11852 llvm-svn: 244642	2015-08-11 18:12:26 +00:00
Teresa Johnson	c4279a7fb2	Enable EliminateAvailableExternally pass in the LTO pipeline. Summary: For LTO we need to enable this pass in the LTO pipeline, as it is skipped during the "-flto -c" compile step (when PrepareForLTO is set). Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11919 llvm-svn: 244622	2015-08-11 16:26:41 +00:00
Sanjay Patel	278004be39	Variable names should start with an upper case letter; NFC llvm-svn: 244618	2015-08-11 16:05:43 +00:00
Sanjay Patel	fec7965b36	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244617	2015-08-11 15:56:31 +00:00
Sanjay Patel	2a3eb41deb	fix code that was accidentally commented out in previous commit llvm-svn: 244610	2015-08-11 15:08:29 +00:00
Sanjay Patel	320217668e	fix typos in comments; NFC llvm-svn: 244609	2015-08-11 15:04:51 +00:00
Sanjay Patel	25b2601bca	fix typo in comment; NFC llvm-svn: 244607	2015-08-11 14:45:08 +00:00
James Molloy	134bec2722	Add support for floating-point minnum and maxnum The select pattern recognition in ValueTracking (as used by InstCombine and SelectionDAGBuilder) only knew about integer patterns. This teaches it about minimum and maximum operations. matchSelectPattern() has been extended to return a struct containing the existing Flavor and a new enum defining the pattern's behavior when given one NaN operand. C minnum() is defined to return the non-NaN operand in this case, but the idiomatic C "a < b ? a : b" would return the NaN operand. ARM and AArch64 at least have different instructions for these different cases. llvm-svn: 244580	2015-08-11 09:12:57 +00:00
David Majnemer	fd9f47756a	[WinEHPrepare] Add rudimentary support for the new EH instructions This adds somewhat basic preparation functionality including: - Formation of funclets via coloring basic blocks. - Cloning of polychromatic blocks to ensure that funclets have unique program counters. - Demotion of values used between different funclets. - Some amount of cleanup once we have removed predecessors from basic blocks. - Verification that we are left with a CFG that makes some amount of sense. N.B. Arguments and numbering still need to be done. Differential Revision: http://reviews.llvm.org/D11750 llvm-svn: 244558	2015-08-11 01:15:26 +00:00
Tyler Nowicki	c94d6ad241	Print vectorization analysis when loop hint is specified. This patch and a relatec clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwasyPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>’. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints. llvm-svn: 244555	2015-08-11 01:09:15 +00:00
Tyler Nowicki	233773837e	Moved LoopVectorizeHints and related functions before LoopVectorizationLegality and LoopVectorizationCostModel. llvm-svn: 244552	2015-08-11 00:52:54 +00:00
Tyler Nowicki	2d5802f38d	Simplify processLoop() by moving loop hint verification into Hints::allowVectorization(). llvm-svn: 244550	2015-08-11 00:35:44 +00:00
Kostya Serebryany	2569118621	[libFuzzer] don't crash if the condition in a switch has unusual type (e.g. i72) llvm-svn: 244544	2015-08-11 00:24:39 +00:00
Adam Nemet	5b0a479541	[LAA] Change name from addRuntimeCheck to addRuntimeChecks, NFC This was requested by Hal in D11205. llvm-svn: 244540	2015-08-11 00:09:37 +00:00
Adam Nemet	0bc068728e	[LoopVer] Remove unused pointer partition argument, NFC. llvm-svn: 244527	2015-08-10 23:05:31 +00:00
Tyler Nowicki	652b0dabe6	Extend late diagnostics to include late test for runtime pointer checks. This patch moves checking the threshold of runtime pointer checks to the vectorization requirements (late diagnostics) and emits a diagnostic that infroms the user the loop would be vectorized if not for exceeding the pointer-check threshold. Clang will also append the options that can be used to allow vectorization. llvm-svn: 244523	2015-08-10 23:01:55 +00:00
Simon Pilgrim	a3a72b41de	[InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations. Differential Revision: http://reviews.llvm.org/D11886 llvm-svn: 244495	2015-08-10 20:21:15 +00:00
Tyler Nowicki	c1a86f5866	Late evaluation of the fast-math vectorization requirement. This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options would that allow floating-point commutativity. Specifically those are enableing fast-math or specifying a loop hint. llvm-svn: 244489	2015-08-10 19:51:46 +00:00
Tyler Nowicki	4d62f2e039	Modify diagnostic messages to clearly indicate the why interleaving wasn't done. Sometimes interleaving is not beneficial, as determined by the cost-model and sometimes it is disabled by a loop hint (by the user). This patch modifies the diagnostic messages to make it clear why interleaving wasn't done. llvm-svn: 244485	2015-08-10 19:14:16 +00:00
Igor Laevsky	4709c03715	[IndVarSimplify] Make cost estimation in RewriteLoopExitValues smarter Differential Revision: http://reviews.llvm.org/D11687 llvm-svn: 244474	2015-08-10 18:23:58 +00:00
Mark Heffernan	8939154a22	Add new llvm.loop.unroll.enable metadata. This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time. The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll". llvm-svn: 244466	2015-08-10 17:28:08 +00:00
Silviu Baranga	61bdc51339	[TTI] Add a hook for specifying per-target defaults for Interleaved Accesses Summary: This adds a hook to TTI which enables us to selectively turn on by default interleaved access vectorization for targets on which we have have performed the required benchmarking. Reviewers: rengolin Subscribers: rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D11901 llvm-svn: 244449	2015-08-10 14:50:54 +00:00
Fraser Cormack	e29ab2bfab	Prevent the scalarizer from caching incorrect entries The scalarizer can cache incorrect entries when walking up a chain of insertelement instructions. This occurs when it encounters more than one instruction that it is not actively searching for, as it unconditionally caches every element it finds. The fix is to only cache the first element that it isn't searching for so we don't overwrite correct entries. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D11559 llvm-svn: 244448	2015-08-10 14:48:47 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
David Majnemer	60c994b985	[InstCombine] Don't try to sink EH pad instructions Found by inspection, this change should not effect the existing landingpad behavior. llvm-svn: 244391	2015-08-08 03:51:49 +00:00
Matt Arsenault	b130076469	Remove unnecessary includes llvm-svn: 244382	2015-08-08 00:41:53 +00:00
Adam Nemet	15840393f3	[LAA] Make the set of runtime checks part of the state of LAA, NFC This is the full set of checks that clients can further filter. IOW, it's client-agnostic. This makes LAA complete in the sense that it now provides the two main results of its analysis precomputed: 1. memory dependences via getDepChecker().getInsterestingDependences() 2. run-time checks via getRuntimePointerCheck().getChecks() However, as a consequence we now compute this information pro-actively. Thus if the client decides to skip the loop based on the dependences we've computed the checks unnecessarily. In order to see whether this was a significant overhead I checked compile time on SPEC2k6 LTO bitcode files. The change was in the noise. The checks are generated in canCheckPtrAtRT, at the same place where we used to call groupChecks to merge checks. llvm-svn: 244368	2015-08-07 22:44:15 +00:00
Chen Li	eafbc9dc47	[ConstantFoldTerminator] Preserve make.implicit metadata when converting SwitchInst to BranchInst Summary: llvm::ConstantFoldTerminator function can convert SwitchInst with single case (and default) to a conditional BranchInst. This patch adds support to preserve make.implicit metadata on this conversion. Reviewers: sanjoy, weimingz, chenli Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11841 llvm-svn: 244348	2015-08-07 19:30:12 +00:00
Simon Pilgrim	3815c16bf8	[InstCombine] Fix SSE2/AVX2 vector logical shift by constant This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes. e.g. %1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>) In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) \| 15) - giving a zero result. In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821). Differential Revision: http://reviews.llvm.org/D11760 llvm-svn: 244341	2015-08-07 18:22:50 +00:00
Duncan P. N. Exon Smith	8c9dcace0d	ValueMapper: Resolve uniquing cycles more aggressively As a follow-up to r244181, resolve uniquing cycles underneath distinct nodes on the fly. This prevents uniquing cycles in early operands from affecting later operands. It also removes an iteration through distinct nodes' operands. No real functional change here, just more prompt resolution of temporary nodes. llvm-svn: 244302	2015-08-07 00:44:55 +00:00
Duncan P. N. Exon Smith	c9fdbdb78d	ValueMapper: Pull out helper to resolve cycles, NFC Pull out a helper for resolving uniquing cycles of `Metadata` to remove the boiler-plate of downcasting to `MDNode`. llvm-svn: 244301	2015-08-07 00:39:26 +00:00
David Majnemer	09e1fdb3f4	Revert accidentally committed WinEHPrepare changes This reverts commit r244272, r244273, r244274, and r244275. llvm-svn: 244278	2015-08-06 21:13:51 +00:00
David Majnemer	ac6b298850	Handle PHI nodes prefacing EH pads too llvm-svn: 244274	2015-08-06 21:08:32 +00:00
Sanjoy Das	c18115db9c	[IndVars] Improved logging under DEBUG(); NFC. Before this, we'd print the modified comparision in the "Simplified comparison" case. That looked misleading. llvm-svn: 244264	2015-08-06 20:43:28 +00:00
Pete Cooper	ebcd748927	Convert a bunch of loops to foreach. NFC. After r244074, we now have a successors() method to iterate over all the successors of a TerminatorInst. This commit changes a bunch of eligible loops to use it. llvm-svn: 244260	2015-08-06 20:22:46 +00:00
Nico Rieck	78199518c4	Rename inst_range() to instructions() for consistency. NFC llvm-svn: 244248	2015-08-06 19:10:45 +00:00
Quentin Colombet	6443cce233	[Reassociation] Fix miscompile for va_arg arguments. iisUnmovableInstruction() had a list of instructions hardcoded which are considered unmovable. The list lacked (at least) an entry for the va_arg and cmpxchg instructions. Fix this by introducing a new Instruction::mayBeMemoryDependent() instead of maintaining another instruction list. Patch by Matthias Braun <matze@braunis.de>. Differential Revision: http://reviews.llvm.org/D11577 rdar://problem/22118647 llvm-svn: 244244	2015-08-06 18:44:34 +00:00
Chandler Carruth	17e0bc37fd	[PM/AA] Hoist the interface for BasicAA into a header file. This is the first mechanical step in preparation for making this and all the other alias analysis passes available to the new pass manager. I'm factoring out all the totally boring changes I can so I'm moving code around here with no other changes. I've even minimized the formatting churn. I'll reformat and freshen comments on the interface now that its located in the right place so that the substantive changes don't triger this. llvm-svn: 244197	2015-08-06 07:33:15 +00:00
Chandler Carruth	50fee93926	[PM/AA] Simplify the AliasAnalysis interface by removing a wrapper around a DataLayout interface in favor of directly querying DataLayout. This wrapper specifically helped handle the case where this no DataLayout, but LLVM now requires it simplifynig all of this. I've updated callers to directly query DataLayout. This in turn exposed a bunch of places where we should have DataLayout readily available but don't which I've fixed. This then in turn exposed that we were passing DataLayout around in a bunch of arguments rather than making it readily available so I've also fixed that. No functionality changed. llvm-svn: 244189	2015-08-06 02:05:46 +00:00
Duncan P. N. Exon Smith	3115f75bf8	ValueMapper: Rotate distinct node remapping algorithm Rotate the algorithm for remapping distinct nodes in order to simplify how uniquing cycles get resolved. This removes some of the recursion, and, most importantly, exposes all uniquing cycles at the top-level. Besides being a little more efficient -- temporary MDNodes won't live as long -- the clearer logic should help protect against bugs like those fixed in r243961 and r243976. What are uniquing cycles? Why do they present challenges when remapping metadata? !0 = !{!1} !1 = !{!0} !0 and !1 form a simple uniquing cycle. When remapping from one metadata graph to another, every uniquing cycle gets "duplicated" through a dance: !0-temp = !{!1?} ; map(!0): clone !0, VM[!0] = !0-temp !1-temp = !{!0?} ; ..map(!1): clone !1, VM[!1] = !1-temp !1-temp = !{!0-temp} ; ..map(!1): remap !1's operands !2 = !{!0-temp} ; ..map(!1): uniquify: !1-temp => !2 !0-temp = !{!2} ; map(!0): remap !0's operands !3 = !{!2} ; map(!0): uniquify: !0-temp => !3 ; Result !2 = !{!3} !3 = !{!2} (In the two "uniquify" steps above, the operands of !X-temp are compared to the operands of !X. If they're the same, then !X-temp gets RAUW'ed to !X; if they're different, then !X-temp is promoted to a new unique node. The latter case always hits in for uniquing cycles, so we duplicate all the nodes involved.) Why is this a problem? Uniquable Metadata nodes that have temporary node as transitive operands keep RAUW support until the temporary nodes get finalized. With non-cycles, this happens automatically: when a uniquable node's count of unresolved operands drops to zero, it immediately sheds its own RAUW support (possibly triggering the same in any node that references it). However, uniquing cycles create a reference cycle, and uniqued nodes that transitively reference a uniquing cycle are "stuck" in an unresolved state until someone calls `MDNode::resolveCycles()` on a node in the unresolved subgraph. Distinct nodes should help here (and mostly do): since they aren't uniqued anywhere, they are guaranteed not to be RAUW'ed. They effectively form a barrier between uniqued nodes, breaking some uniquing cycles, and shielding uniqued nodes from uniquing cycles. Unfortunately, with this barrier in place, the unresolved subgraph(s) can be disjoint from the top-level node. The mapping algorithm needs to find at least one representative from each disjoint subgraph. But which nodes are stuck, and which will get resolved automatically? And which nodes are in the unresolved subgraph? The old logic was conservative. This commit rotates the logic for distinct nodes, so that we have access to unresolved nodes at the top-level call to `llvm::MapMetadata()`. Each time we return to the top-level, we know that all temporaries have been RAUW'ed away. Here, it's safe (and necessary) to call `resolveCycles()` immediately on unresolved operands. This should also perform better than the old algorithm. The recursion stack is shorter, temporary nodes don't live as long, and there are fewer tracking references to unresolved nodes. As the debug info graph introduces more 'distinct' nodes, remapping should incrementally get cheaper and cheaper. Aside from possible performance improvements (and reduced cruft in the `LLVMContext`), there should be no functionality change here. llvm-svn: 244181	2015-08-05 23:52:42 +00:00
Duncan P. N. Exon Smith	2705097e47	ValueMapper: Simplify remap() helper function, NFC Rename `remap()` to `remapOperands()`, and restrict its contract to remapping operands. Previously, it also called `mapToMetadata()`, but this logic is hard to reason about externally. In particular, this refactors `mapUniquedNode()` to avoid redundant mapping calls, taking advantage of the RAUWs that are already in place. llvm-svn: 244168	2015-08-05 23:22:34 +00:00
Chen Li	50efd9220a	[LoopUnswitch] Preserve make.implicit metadata for unswitched conditions Summary: This patch adds support to preserve make.implicit metadata for unswitched conditions in loop pre-header. Reviewers: sanjoy, weimingz Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11769 llvm-svn: 244132	2015-08-05 21:13:26 +00:00
David Blaikie	a5d7de9f08	-Wdeprecated cleanup: Make CallGraph movable by default by using unique_ptr members rather than raw pointers. The only place that tries to return a CallGraph by value (CallGraphAnalysis::run) doesn't seem to be used right now, but it's a reasonable bit of cleanup anyway. llvm-svn: 244122	2015-08-05 20:55:50 +00:00
Chandler Carruth	b2fda0d95c	[Unroll] Switch to using 'int' cost types in preparation for a somewhat more involved change to the cost computation pattern. llvm-svn: 244095	2015-08-05 18:46:21 +00:00
Simon Pilgrim	18617d193f	Fixed line endings. llvm-svn: 244021	2015-08-05 08:18:00 +00:00
Tanya Lattner	0d28f80bd1	Rename all references to old mailing lists to new lists.llvm.org address. llvm-svn: 243999	2015-08-05 03:51:17 +00:00
Sanjay Patel	924879ad2c	wrap OptSize and MinSize attributes for easier and consistent access (NFCI) Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994	2015-08-04 15:49:57 +00:00
Duncan P. N. Exon Smith	1de9ccb472	Fix 80-column llvm-svn: 243977	2015-08-04 13:24:26 +00:00
Duncan P. N. Exon Smith	5ed90c0278	Linker: Fix ASan failure from r243961 r243883 and r243961 made a use-after-free far more likely: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/6041/steps/check-llvm%20asan/logs/stdio Unresolved nodes get inserted into the `Cycles` array. If they later get resolved through RAUW, we need to update the reference. It's interesting that this never hit before (maybe an asan-ified clang bootstrap with `-flto -g` would have hit it, but I admit I haven't tried anything quite that crazy). llvm-svn: 243976	2015-08-04 13:23:30 +00:00
David Majnemer	eb518bd5d8	Drive-by fixes for LandingPad -> EHPad This change was done as an audit and is by inspection. The new EH system is still very much a work in progress. NFC for the landingpad case. llvm-svn: 243965	2015-08-04 08:21:40 +00:00
Simon Pilgrim	dcfd7a3fba	[InstCombine] Moved SSE vector shift constant folding into its own helper function. NFCI. This will make some upcoming bugfixes + improvements easier to manage. llvm-svn: 243962	2015-08-04 07:49:58 +00:00
Duncan P. N. Exon Smith	706f37e8df	Linker: Fix references to uniqued nodes after r243883 r243883 started moving 'distinct' nodes instead of duplicated them in lib/Linker. This had the side-effect of sometimes not cloning uniqued nodes that reference them. I missed a corner case: !named = !{!0} !0 = !{!1} !1 = distinct !{!0} !0 is the entry point for "remapping", and a temporary clone (say, !0-temp) is created and mapped in case we need to model a uniquing cycle. Recursive descent into !1. !1 is distinct, so we leave it alone, but update its operand to !0-temp. Pop back out to !0. Its only operand, !1, hasn't changed, so we don't need to use !0-temp. !0-temp goes out of scope, and we're finished remapping, but we're left with: !named = !{!0} !0 = !{!1} !1 = distinct !{null} ; uh oh... Previously, if !0 and !0-temp ended up with identical operands, then !0-temp couldn't have been referenced at all. Now that distinct nodes don't get duplicated, that assumption is invalid. We need to !0-temp->replaceAllUsesWith(!0) before freeing !0-temp. I found this while running an internal `-flto -g` bootstrap. Strangely, there was no case of this in the open source bootstrap I'd done before commit... llvm-svn: 243961	2015-08-04 06:42:31 +00:00
Sanjoy Das	215df9ed98	Revert "[LSR] Generate and use zero extends" This reverts commit r243348 and r243357. They caused PR24347. llvm-svn: 243939	2015-08-04 01:52:05 +00:00
Adam Nemet	6b6082dc42	[LoopVer] Remove unused needsRuntimeChecks(), NFC The previous commits moved this functionality into the client. Also remove the now unused member variable. llvm-svn: 243920	2015-08-03 23:32:57 +00:00
Chandler Carruth	87adb7a2e2	[Unroll] Improve the brute force loop unroll estimate by propagating through PHI nodes across iterations. This patch teaches the new advanced loop unrolling heuristics to propagate constants into the loop from the preheader and around the backedge after simulating each iteration. This lets us brute force solve simple recurrances that aren't modeled effectively by SCEV. It also makes it more clear why we need to process the loop in-order rather than bottom-up which might otherwise make much more sense (for example, for DCE). This came out of an attempt I'm making to develop a principled way to account for dead code in the unroll estimation. When I implemented a forward-propagating version of that it produced incorrect results due to failing to propagate cost between loop iterations through the PHI nodes, and it occured to me we really should at least propagate simplifications across those edges, and it is quite easy thanks to the loop being in canonical and LCSSA form. Differential Revision: http://reviews.llvm.org/D11706 llvm-svn: 243900	2015-08-03 20:32:27 +00:00
Duncan P. N. Exon Smith	4fb46cb818	Linker: Move distinct MDNodes instead of cloning Instead of cloning distinct `MDNode`s when linking in a module, just move them over. The module linker destroys the source module, so the old node would otherwise just be leaked on the context. Create the new node in place. This also reduces the number of cloned uniqued nodes (since it's less likely their operands have changed). This mapping strategy is only correct when we're discarding the source, so the linker turns it on via a ValueMapper flag, `RF_MoveDistinctMDs`. There's nothing observable in terms of `llvm-link` output here: the linked module should be semantically identical. I'll be adding more 'distinct' nodes to the debug info metadata graph in order to break uniquing cycles, so the benefits of this will partly come in future commits. However, we should get some gains immediately, since we have a fair number of 'distinct' `DILocation`s being linked in. llvm-svn: 243883	2015-08-03 17:09:38 +00:00
Duncan P. N. Exon Smith	50f8969e52	ValueMapper: Only check for cycles if operands change This is a minor optimization to only check for unresolved operands inside `mapDistinctNode()` if the operands have actually changed. This shouldn't really cause any change in behaviour. I didn't actually see a slowdown in a profile, I was just poking around nearby and saw the opportunity. llvm-svn: 243866	2015-08-03 03:45:32 +00:00
Duncan P. N. Exon Smith	e08bcbff8f	ValueMapper: Use a range-based for, NFC llvm-svn: 243865	2015-08-03 03:27:12 +00:00
Duncan P. N. Exon Smith	0880014d48	ValueMapper: Reuse local variable, NFC llvm-svn: 243864	2015-08-03 03:24:28 +00:00
Craig Topper	e3dcce9700	De-constify pointers to Type since they can't be modified. NFC This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842	2015-08-01 22:20:21 +00:00
David Majnemer	654e130b6e	New EH representation for MSVC compatibility This introduces new instructions neccessary to implement MSVC-compatible exception handling support. Most of the middle-end and none of the back-end haven't been audited or updated to take them into account. Differential Revision: http://reviews.llvm.org/D11097 llvm-svn: 243766	2015-07-31 17:58:14 +00:00
Kostya Serebryany	fb7d8d9d06	[libFuzzer] trace switch statements and apply mutations based on the expected case values llvm-svn: 243726	2015-07-31 01:33:06 +00:00
Adhemerval Zanella	bfe1eaf0fe	Enable dfsan for aarch64 This patch enable DFSan memory transformation for aarch64 (39-bit VMA). llvm-svn: 243684	2015-07-30 20:49:35 +00:00
Wei Mi	d6f7252e2e	[SLP vectorizer]: Choose the best consecutive candidate to pair with a store instruction. The patch changes the SLPVectorizer::vectorizeStores to choose the immediate succeeding or preceding candidate for a store instruction when it has multiple consecutive candidates. In this way it has better chance to find more slp vectorization opportunities. Differential Revision: http://reviews.llvm.org/D10445 llvm-svn: 243666	2015-07-30 17:40:39 +00:00
Adam Nemet	252d529b6c	[LoopVer] Add missing std::move The reason I was passing this vector by value in the constructor so that I wouldn't have to copy when initializing the corresponding member but then I forgot the std::move. The use-case is LoopDistribution which filters the checks then std::moves it to LoopVersioning's constructor. With this interface we can avoid any copies. llvm-svn: 243616	2015-07-30 04:21:13 +00:00
Adam Nemet	c75ad69ca5	[LDist] Filter the checks locally rather than in LAA, NFC Before, we were passing the pointer partitions to LAA. Now, we get all the checks from LAA and filter out the checks within partitions in LoopDistribution. This effectively concludes the steps to move filtering memchecks from LAA into its clients. There is still some cleanup left to remove the unused interfaces in LAA that still take PtrPartition. (Moving this functionality to LoopDistribution also requires needsChecking on pointers to be made public.) llvm-svn: 243613	2015-07-30 03:29:16 +00:00
Nick Lewycky	c3890d2969	Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.) Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h. llvm-svn: 243585	2015-07-29 22:32:47 +00:00
Alexey Samsonov	869a5ff37f	[ASan] Disable dynamic alloca and UAR detection in presence of returns_twice calls. Summary: returns_twice (most importantly, setjmp) functions are optimization-hostile: if local variable is promoted to register, and is changed between setjmp() and longjmp() calls, this update will be undone. This is the reason why "man setjmp" advises to mark all these locals as "volatile". This can not be enough for ASan, though: when it replaces static alloca with dynamic one, optionally called if UAR mode is enabled, it adds a whole lot of SSA values, and computations of local variable addresses, that can involve virtual registers, and cause unexpected behavior, when these registers are restored from buffer saved in setjmp. To fix this, just disable dynamic alloca and UAR tricks whenever we see a returns_twice call in the function. Reviewers: rnk Subscribers: llvm-commits, kcc Differential Revision: http://reviews.llvm.org/D11495 llvm-svn: 243561	2015-07-29 19:36:08 +00:00
Evgeniy Stepanov	4d81f86d97	[asan] Remove special case mapping on Android/AArch64. ASan shadow on Android starts at address 0 for both historic and performance reasons. This is possible because the platform mandates -pie, which makes lower memory region always available. This is not such a good idea on 64-bit platforms because of MAP_32BIT incompatibility. This patch changes Android/AArch64 mapping to be the same as that of Linux/AAarch64. llvm-svn: 243548	2015-07-29 18:22:25 +00:00
Peter Collingbourne	3eddf499b7	LowerBitSets: Add debugging output. Differential Revision: http://reviews.llvm.org/D11583 llvm-svn: 243546	2015-07-29 18:12:36 +00:00
Michael Zolotukhin	9f06ef76d3	[Unroll] Handle SwitchInst properly. Previously successor selection was simply wrong. llvm-svn: 243545	2015-07-29 18:10:33 +00:00
Michael Zolotukhin	3a7d55b623	[Unroll] Don't crash when simplified branch condition is undef. llvm-svn: 243544	2015-07-29 18:10:29 +00:00
Sanjoy Das	cfe41f050c	[Statepoints] Let patchable statepoints have a symbolic call target. Summary: As added initially, statepoints required their call targets to be a constant pointer null if ``numPatchBytes`` was non-zero. This turns out to be a problem ergonomically, since there is no way to mark patchable statepoints as calling a (readable) symbolic value. This change remove the restriction of requiring ``null`` call targets for patchable statepoints, and changes PlaceSafepoints to maintain the symbolic call target through its transformation. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11550 llvm-svn: 243502	2015-07-28 23:50:30 +00:00
Michael Zolotukhin	80d13bac02	[Unroll] Add debug dumps to loop-unroll analyzer. llvm-svn: 243471	2015-07-28 20:07:29 +00:00
Michael Zolotukhin	a425c9d0e3	[Unroll] Don't analyze blocks outside the loop. llvm-svn: 243466	2015-07-28 19:21:21 +00:00
Sanjay Patel	d411114e77	fix formatting; NFC llvm-svn: 243424	2015-07-28 15:38:43 +00:00
Adam Nemet	0a674401bf	[LDist][LVer] Explicitly pass the set of memchecks to LoopVersioning, NFC Before the patch, the checks were generated internally in addRuntimeCheck. Now, we use the new overloaded version of addRuntimeCheck that takes the ready-made set of checks as a parameter. The checks are now generated by the client (LoopDistribution) with the new RuntimePointerChecking::generateChecks API. Also the new printChecks API is used to print out the checks for debugging. This is to continue the transition over to the new model whereby clients will get the full set of checks from LAA, filter it and then pass it to LoopVersioning and in turn to addRuntimeCheck. llvm-svn: 243382	2015-07-28 05:01:53 +00:00
Sanjoy Das	93b3504aa8	[LSR] Generate and use zero extends Summary: If a scale or a base register can be rewritten as "Zext({A,+,1})" then LSR will now consider a formula of that form in its normal cost computation. Depends on D9180 Reviewers: qcolombet, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9181 llvm-svn: 243348	2015-07-27 23:27:51 +00:00
Sanjoy Das	5dab205ced	[IndVars] Make loop varying predicates loop invariant. Summary: Was D9784: "Remove loop variant range check when induction variable is strictly increasing" This change re-implements D9784 with the two differences: 1. It does not use SCEVExpander and does not generate new instructions. Instead, it does a quick local search for existing `llvm::Value`s that it needs when modifying the `icmp` instruction. 2. It is more general -- it deals with both increasing and decreasing induction variables. I've added all of the tests included with D9784, and two more. As an example on what this change does (copied from D9784): Given C code: ``` for (int i = M; i < N; i++) // i is known not to overflow if (i < 0) break; a[i] = 0; } ``` This transformation produces: ``` for (int i = M; i < N; i++) if (M < 0) break; a[i] = 0; } ``` Which can be unswitched into: ``` if (!(M < 0)) for (int i = M; i < N; i++) a[i] = 0; } ``` I went back and forth on whether the top level logic should live in `SimplifyIndvar::eliminateIVComparison` or be put into its own routine. Right now I've put it under `eliminateIVComparison` because even though the `icmp` is not eliminated, it no longer is an IV comparison. I'm open to putting it in its own helper routine if you think that is better. Reviewers: reames, nicholas, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11278 llvm-svn: 243331	2015-07-27 21:42:49 +00:00
Simon Pilgrim	074c0d97dc	Fixed signed/unsigned comparison warning. llvm-svn: 243306	2015-07-27 19:07:15 +00:00
Simon Pilgrim	15c0a59463	[InstCombine][X86][SSE] Replace sign/zero extension intrinsics with native IR Now that we are generating sane codegen for vector sext/zext nodes on SSE targets, this patch uses instcombine to replace the SSE41/AVX2 pmovsx and pmovzx intrinsics with the equivalent native IR code. Differential Revision: http://reviews.llvm.org/D11503 llvm-svn: 243303	2015-07-27 18:52:15 +00:00
Pete Cooper	11bd958cb6	Revert "Remove unnecessary null check. NFC." This reverts commit r243167. Duncan pointed out that dyn_cast can return null in these cases, so this was an unsafe commit to make. Sorry for the noise. Worryingly there were no tests which fail... llvm-svn: 243302	2015-07-27 18:37:58 +00:00
Jingyue Wu	bfefff555e	Roll forward r243250 r243250 appeared to break clang/test/Analysis/dead-store.c on one of the build slaves, but I couldn't reproduce this failure locally. Probably a false positive as I saw this test was broken by r243246 or r243247 too but passed later without people fixing anything. llvm-svn: 243253	2015-07-26 19:10:03 +00:00
Jingyue Wu	84879b71a9	Revert r243250 breaks tests llvm-svn: 243251	2015-07-26 18:30:13 +00:00
Jingyue Wu	bf485f059c	[TTI/CostModel] improve TTI::getGEPCost and use it in CostModel::getInstructionCost Summary: This patch updates TargetTransformInfoImplCRTPBase::getGEPCost to consider addressing modes. It now returns TCC_Free when the GEP can be completely folded to an addresing mode. I started this patch as I refactored SLSR. Function isGEPFoldable looks common and is indeed used by some WIP of mine. So I extracted that logic to getGEPCost. Furthermore, I noticed getGEPCost wasn't directly tested anywhere. The best testing bed seems CostModel, but its getInstructionCost method invokes getAddressComputationCost for GEPs which provides very coarse estimation. So this patch also makes getInstructionCost call the updated getGEPCost for GEPs. This change inevitably breaks some tests because the cost model changes, but nothing looks seriously wrong -- if we believe the new cost model is the right way to go, these tests should be updated. This patch is not perfect yet -- the comments in some tests need to be updated. I want to know whether this is a right approach before fixing those details. Reviewers: chandlerc, hfinkel Subscribers: aschwaighofer, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D9819 llvm-svn: 243250	2015-07-26 17:28:13 +00:00
Simon Pilgrim	54fcd62c6f	[InstCombine][SSE4A] Standardized references to Length/Width and Index/Start to match AMD docs. NFCI. llvm-svn: 243226	2015-07-25 20:41:00 +00:00
Chen Li	145c2f57ae	[LoopUnswitch] Improve loop unswitch pass to find trivial unswitch conditions more effectively Summary: This patch improves trivial loop unswitch. The current trivial loop unswitch only checks if loop header's terminator contains a trivial unswitch condition. But if the loop header only has one reachable successor (due to intentionally or unintentionally missed code simplification), we should consider the successor as part of the loop header. Therefore, instead of stopping at loop header's terminator, we should keep traversing its successors within loop until reach a real conditional branch or switch (whose condition can not be constant folded). This change will enable a single -loop-unswitch pass to unswitch multiple trivial conditions (unswitch one trivial condition could open opportunity to unswitch another one in the same loop), while the old implementation can unswitch only one per pass. Reviewers: reames, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11481 llvm-svn: 243203	2015-07-25 03:21:06 +00:00
Lawrence Hu	dc8a83b53b	Handle loop with negtive induction variable increment This patch extend LoopReroll pass to hand the loops which is similar to the following: while (len > 1) { sum4 += buf[len]; sum4 += buf[len-1]; len -= 2; } llvm-svn: 243171	2015-07-24 22:01:49 +00:00
Pete Cooper	3191697138	Remove unnecessary null check. NFC. Since both places which set this variable do so with dyn_cast, and not dyn_cast_or_null, its impossible to get a nullptr here, so we can remove the check. llvm-svn: 243167	2015-07-24 21:38:01 +00:00
Pete Cooper	7679afda82	Use make_range(rbegin(), rend()) to allow foreach loops. NFC. Instead of the pattern for (auto I = x.rbegin(), E = x.end(); I != E; ++I) we can use make_range to construct the reverse range and iterate using that instead. llvm-svn: 243163	2015-07-24 21:13:43 +00:00
Diego Novillo	b9bf447d90	Remove unused variable. NFC. llvm-svn: 243145	2015-07-24 19:18:32 +00:00
Jingyue Wu	abb05aa3c6	Remove the user-count threshold when analyzing read attributes Summary: This threshold limited FunctionAttrs ability to prove arguments to be read-only. In NVPTX, a specialized instruction ld.global.nc can be used to load memory with non-coherent texture cache. We notice that in SHOC [1] benchmark, some function arguments are not marked with readonly because FunctionAttrs reaches a hardcoded threshold when analysis uses. Removing this threshold won't cause significant regression in compilation time, because the worst-case time complexity of the algorithm is still O(# of instructions) for each parameter. Patched by Xuetian Weng. [1] https://github.com/vetter/shoc Reviewers: nlewycky, jingyue, nicholas Subscribers: nicholas, test, llvm-commits Differential Revision: http://reviews.llvm.org/D11311 llvm-svn: 243141	2015-07-24 19:05:53 +00:00
Philip Reames	fa2c630f79	[RewriteStatepointsForGC] Adjust naming scheme to be more stable The names for instructions inserted were previous dependent on iteration order. By deriving the names from the original instructions, we can avoid instability in tests without resorting to ordered traversals. It also makes the IR mildly easier to read at large scale. llvm-svn: 243140	2015-07-24 19:01:39 +00:00
Pete Cooper	0debbdc872	Use foreach loops for StructType::elements(). NFC. We had a few places where we did for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) { but those could instead do for (auto *EltTy : STy->elements()) { llvm-svn: 243136	2015-07-24 18:55:49 +00:00
Michael Zolotukhin	57776b8159	Handle resolvable branches in complete loop unroll heuristic. Summary: Resolving a branch allows us to ignore blocks that won't be executed, and thus make our estimate more accurate. This patch is intended to be applied after D10205 (though it could be applied independently). Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10206 llvm-svn: 243084	2015-07-24 01:53:04 +00:00
Philip Reames	29e9ae7891	[RewriteStatepointsForGC] Fix release build warning llvm-svn: 243076	2015-07-24 00:42:55 +00:00
Philip Reames	88958b2df3	[RewriteStatepointsForGC] Use a worklist algorithm for first part of base pointer algorithm [NFC] The new code should hopefully be equivalent to the old code; it just uses a worklist to track instructions which need to visited rather than iterating over all instructions visited each time. This should be faster, but the primary benefit is that the purpose should be more clear and the diff of adding another instruction type (forthcoming) much more obvious. Differential Revision: http://reviews.llvm.org/D11480 llvm-svn: 243071	2015-07-24 00:02:11 +00:00
Jingyue Wu	2e424da39b	[NaryReassociate] remove redundant code This check is already done by findClosestMatchingDominator. llvm-svn: 243065	2015-07-23 23:13:37 +00:00
Philip Reames	9b141ed48e	[RewriteStatepointsForGC] Rename PhiState to reflect that it's associated w/more than just PHIs Today, Select instructions also have associated PhiStates. In the near future, so will ExtractElement and SuffleVector. llvm-svn: 243056	2015-07-23 22:49:14 +00:00
Philip Reames	2a892a630b	[RewriteStatepointsForGC] Use idomatic mechanisms for debug tracing [NFC] Deleting much of the code using trace-rewrite-statepoints and use idiomatic DEBUG statements instead. This includes adding operator<< to a helper class. llvm-svn: 243054	2015-07-23 22:25:26 +00:00
Philip Reames	273e6bbd11	[RewriteStatepointsForGC] Simplify code around meet of PhiStates [NFC] We don't need to pass in the map from BDV to PhiStates; we can instead handle that externally and let the MeetPhiStates helper class just meet PhiStates. llvm-svn: 243045	2015-07-23 21:41:27 +00:00
Matt Wala	878c144f8a	[Scalarizer] Fix potential for stale data in Scattered across invocations Summary: Scalarizer has two data structures that hold information about changes to the function, Gathered and Scattered. These are cleared in finish() at the end of runOnFunction() if finish() detects any changes to the function. However, finish() was checking for changes by only checking if Gathered was non-empty. The function visitStore() only modifies Scattered without touching Gathered. As a result, Scattered could have ended up having stale data if Scalarizer only scalarized store instructions. Since the data in Scattered is used during the execution of the pass, this introduced dangling pointer errors. The fix is to check whether both Scattered and Gathered are empty before deciding what to do in finish(). This also fixes a problem where the Function can be modified although the pass returns false. Reviewers: rnk Subscribers: rnk, srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D10459 llvm-svn: 243040	2015-07-23 20:53:46 +00:00
Kuba Brecka	45dbffdc3d	[asan] Rename the ABI versioning symbol to '__asan_version_mismatch_check' instead of abusing '__asan_init' We currently version `__asan_init` and when the ABI version doesn't match, the linker gives a `undefined reference to '__asan_init_v5'` message. From this, it might not be obvious that it's actually a version mismatch error. This patch makes the error message much clearer by changing the name of the undefined symbol to be `__asan_version_mismatch_check_xxx` (followed by the version string). We obviously don't want the initializer to be named like that, so it's a separate symbol that is used only for the purpose of version checking. Reviewed at http://reviews.llvm.org/D11004 llvm-svn: 243003	2015-07-23 10:54:06 +00:00
Chandler Carruth	08eebe2074	[GMR] Add a late run of GlobalsModRef to the main pass pipeline behind the general GMR-in-non-LTO flag. Without this, we have the global information during the CGSCC pipeline for GVN and such, but don't have it available during the late loop optimizations such as the vectorizer. Moreover, after the CGSCC pipeline has finished we have substantially more accurate and refined call graph information, function annotations, etc, which will make GMR even more powerful than it is early in the pipelien. Note that we have to play silly games with preserving AliasAnalysis (which is now trivially preserved) in order to let a module analysis magically be preserved into the entire function pass pipeline. Simultaneously we have to not make GMR an immutable pass in order to be able to re-run it and collect fresh data on the final call graph. llvm-svn: 242999	2015-07-23 09:34:01 +00:00
Chandler Carruth	194f59ca5d	[PM/AA] Extract the ModRef enums from the AliasAnalysis class in preparation for de-coupling the AA implementations. In order to do this, they had to become fake-scoped using the traditional LLVM pattern of a leading initialism. These can't be actual scoped enumerations because they're bitfields and thus inherently we use them as integers. I've also renamed the behavior enums that are specific to reasoning about the mod/ref behavior of functions when called. This makes it more clear that they have a very narrow domain of applicability. I think there is a significantly cleaner API for all of this, but I don't want to try to do really substantive changes for now, I just want to refactor the things away from analysis groups so I'm preserving the exact original design and just cleaning up the names, style, and lifting out of the class. Differential Revision: http://reviews.llvm.org/D10564 llvm-svn: 242963	2015-07-22 23:15:57 +00:00
Anthony Pesch	e92ae2dcd1	Revert "Improve merging of stores from static constructors in GlobalOpt" This reverts commit 0a9dee959a30b81b9e7df64c9a58ff9898c24024. llvm-svn: 242954	2015-07-22 22:26:54 +00:00
Anthony Pesch	b8531f4f65	Revert "IPO: Avoid brace initialization of a map, some versions of libc++ don't like it" This reverts commit fc2dad0c68f8d32273d3c2d790ed496961f829af. llvm-svn: 242953	2015-07-22 22:26:52 +00:00
Justin Bogner	49c5ce67eb	IPO: Avoid brace initialization of a map, some versions of libc++ don't like it Should fix the build failure on these darwin bots: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_build/12427/ http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/10389/ llvm-svn: 242945	2015-07-22 21:41:12 +00:00
Anthony Pesch	3da0acdcbc	Improve merging of stores from static constructors in GlobalOpt Summary: While working on a project I wound up generating a fairly large lookup table (10k entries) of callbacks inside of a static constructor. Clang was taking upwards of ~10 minutes to compile the lookup table. I generated a smaller test case (http://www.inolen.com/static_initializer_test.ll) that, after running with -ftime-report, pointed fingers at GlobalOpt and MemCpyOptimizer. Running globalopt took around ~9 minutes. The slowdown came from how GlobalOpt merged stores from static constructors individually into the global initializer in EvaluateStaticConstructor. For each store it discovered and wanted to commit, it would copy the existing global initializer and then merge in the individual store. I changed this so that stores are now grouped by global, and sorted from most significant to least significant by their GEP indexes (e.g. a store to GEP 0, 0 comes before GEP 0, 0, 1). With this representation, the existing initializer can be copied and all new stores merged into it in a single pass. With this patch and http://reviews.llvm.org/D11198, the lookup table that was taking ~10 minutes to compile now compiles in around 5 seconds. I've ran 'make check' and the test-suite, which all passed. I'm not really sure who to tag as a reviewer, Lang mentioned that Chandler may be appropriate. Reviewers: chandlerc, nlewycky Subscribers: nlewycky, llvm-commits Differential Revision: http://reviews.llvm.org/D11200 llvm-svn: 242935	2015-07-22 21:10:45 +00:00
Hans Wennborg	13958b739e	Fix -Wextra-semi warnings. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D11400 llvm-svn: 242930	2015-07-22 20:46:11 +00:00
Anthony Pesch	a2d9369ef3	Test commit, added blank line llvm-svn: 242923	2015-07-22 18:50:10 +00:00
Chandler Carruth	e9ea5a66f2	[GMR] Add a flag to enable GlobalsModRef in the normal compilation pipeline. Even before I started improving its runtime, it was already crazy fast once the call graph exists, and if we can get it to be conservatively correct, will still likely catch a lot of interesting and useful cases. So it may well be useful to enable by default. But more importantly for me, this should make it easier for me to test that changes aren't breaking it in fundamental ways by enabling it for normal builds. llvm-svn: 242895	2015-07-22 11:57:28 +00:00
Michael Kuperstein	d72403636c	Fix mem2reg to correctly handle allocas only used in a single block Currently, a load from an alloca that is used in as single block and is not preceded by a store is replaced by undef. This is not always correct if the single block is inside a loop. Fix the logic so that: 1) If there are no stores in the block, replace the load with an undef, as before. 2) If there is a store (regardless of where it is in the block w.r.t the load), bail out, and let the rest of mem2reg handle this alloca. Patch by: gil.rapaport@intel.com Differential Revision: http://reviews.llvm.org/D11355 llvm-svn: 242884	2015-07-22 10:29:29 +00:00
Kuba Brecka	8ec94ead7d	[asan] Improve moving of non-instrumented allocas In r242510, non-instrumented allocas are now moved into the first basic block. This patch limits that to only move allocas that are present after the first instrumented one (i.e. only move allocas up). A testcase was updated to show behavior in these two cases. Without the patch, an alloca could be moved down, and could cause an invalid IR. Differential Revision: http://reviews.llvm.org/D11339 llvm-svn: 242883	2015-07-22 10:25:38 +00:00
Chandler Carruth	96ada25bf3	[PM/AA] Remove all of the dead AliasAnalysis pointers being threaded through APIs that are no longer necessary now that the update API has been removed. This will make changes to the AA interfaces significantly less disruptive (I hope). Either way, it seems like a really nice cleanup. llvm-svn: 242882	2015-07-22 09:52:54 +00:00
Chandler Carruth	a1032a0f7c	[PM/AA] Remove the last of the legacy update API from AliasAnalysis as part of simplifying its interface and usage in preparation for porting to work with the new pass manager. Note that this will likely expose that we have dead arguments, members, and maybe even pass requirements for AA. I'll be cleaning those up in seperate patches. This just zaps the actual update API. Differential Revision: http://reviews.llvm.org/D11325 llvm-svn: 242881	2015-07-22 09:49:59 +00:00
Chandler Carruth	d86a4f5ec8	[PM/AA] Switch to an early-exit. NFC. This was split out of another change because the diff is useless. I assure you, I just switched to early-return in this function. Cleanup in preparation for my next commit, as requested in code review! llvm-svn: 242880	2015-07-22 09:44:54 +00:00
Chen Li	c0f3a158f0	[LoopUnswitch] Code refactoring to separate trivial loop unswitch and non-trivial loop unswitch in processCurrentLoop() Summary: The current code in LoopUnswtich::processCurrentLoop() mixes trivial loop unswitch and non-trivial loop unswitch together. It goes over all basic blocks in the loop and checks if a condition is trivial or non-trivial unswitch condition. However, trivial unswitch condition can only occur in the loop header basic block (where it controls whether or not the loop does something at all). This refactoring separate trivial loop unswitch and non-trivial loop unswitch. Before going over all basic blocks in the loop, it checks if the loop header contains a trivial unswitch condition. If so, unswitch it. Otherwise, go over all blocks like before but don't check trivial condition any more since they are not possible to be in the other blocks. This code has no functionality change. Reviewers: meheff, reames, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11276 llvm-svn: 242873	2015-07-22 05:26:29 +00:00
Chandler Carruth	ccffdaf7ed	[SROA] Fix a nasty pile of bugs to do with big-endian, different alloca types and loads, loads or stores widened past the size of an alloca, etc. This started off with a bug report about big-endian behavior with bitfields and loads and stores to a { i32, i24 } struct. An initial attempt to fix this was sent for review in D10357, but that didn't really get to the root of the problem. The core issue was that canConvertValue and convertValue in SROA were handling different bitwidth integers by doing a zext of the integer. It wouldn't do a trunc though, only a zext! This would in turn lead SROA to form an i24 load from an i24 alloca, zext it to i32, and then use it. This would at least produce the wrong value for big-endian systems. One of my many false starts here was to correct the computation for big-endian systems by shifting. But this doesn't actually work because the original code has a 64-bit store to the entire 8 bytes, and a 32-bit load of the last 4 bytes, and because the alloc size is 8 bytes, we can't lose that last (least significant if bigendian) byte! The real problem here is that we're forming an i24 load in SROA which is actually not sufficiently wide to load all of the necessary bits here. The source has an i32 load, and SROA needs to form that as well. The straightforward way to do this is to disable the zext logic in canConvertValue and convertValue, forcing us to actually load all 32-bits. This seems like a really good change, but it in turn breaks several other parts of SROA. First in the chain of knock-on failures, we had places where we were doing integer-widening promotion even though some of the integer loads or stores extended past the end of the alloca's memory! There was even a comment about preventing this, but it only prevented the case where the type had a different bit size from its store size. So I added checks to handle the cases where we actually have a widened load or store and to avoid trying to special integer widening promotion in those cases. Second, we actually rely on the ability to promote in the face of loads past the end of an alloca! This is important so that we can (for example) speculate loads around PHI nodes to do more promotion. The bits loaded are garbage, but as long as they aren't used and the alignment is suitable high (which it wasn't in the test case!) this is "fine". And we can't stop promoting here, lots of things stop working well if we do. So we need to add specific logic to handle the extension (and truncation) case, but only where that extension or truncation are over bytes that are outside the alloca's allocated storage and thus totally bogus to load or store. And of course, once we add back this correct handling of extension or truncation, we need to correctly handle bigendian systems to avoid re-introducing the exact bug that started us off on this chain of misery in the first place, but this time even more subtle as it only happens along speculated loads atop a PHI node. I've ported an existing test for PHI speculation to the big-endian test file and checked that we get that part correct, and I've added several more interesting big-endian test cases that should help check that we're getting this correct. Fun times. llvm-svn: 242869	2015-07-22 03:32:42 +00:00
Nick Lewycky	f836c89c49	Fix a performance problem in memcpyopt by removing a linear scan over ranges when inserting a new range. No functionality change intended. Patch by Anthony Pesch! llvm-svn: 242843	2015-07-21 21:56:26 +00:00
Philip Reames	6ff1a1e3d6	[RewriteStatepointsForGC] minor style cleanup Use a named lambda for readability, common some code, remove a stale comments, and use llvm style variable names. llvm-svn: 242827	2015-07-21 19:04:38 +00:00
Reid Kleckner	2f907557c3	Re-land 242726 to use RAII to do cleanup The LooksLikeCodeInBug11395() codepath was returning without clearing the ProcessedAllocas cache. llvm-svn: 242809	2015-07-21 17:40:14 +00:00
Philip Reames	94babb7030	[RewriteStatepointsForGC] Hoist some code out of a loop llvm-svn: 242808	2015-07-21 17:18:03 +00:00
Arnold Schwaighofer	3651233004	MergeFunc: Transfer the callee's attributes when replacing a direct caller We insert a bitcast which obfuscates the getCalledFunction for the utility function which looks up attributes from the called function. Loosing ABI changing parameter attributes is a bad thing. rdar://21516488 llvm-svn: 242807	2015-07-21 17:07:07 +00:00
Philip Reames	74ce2e7691	[RewriteStatepointsForGC] Delete trivial code A bit more code cleanup: delete some a trivial true assertion and supporting code, remove a redundant cast, and use count in assertions where feasible. llvm-svn: 242805	2015-07-21 16:51:17 +00:00
Nico Weber	f00afcc79b	Revert 242726, it broke ASan on OS X. llvm-svn: 242792	2015-07-21 15:48:53 +00:00
Philip Reames	f388050105	[RewriteStatepointsForGC] Minor code cleanup [NFC] We can use builders to simplify part of the code and we only check for the existance of the metadata value; this enables us to delete some redundant code. llvm-svn: 242751	2015-07-21 00:49:55 +00:00
Reid Kleckner	87d03450a5	Don't try to instrument allocas used by outlined SEH funclets Summary: Arguments to llvm.localescape must be static allocas. They must be at some statically known offset from the frame or stack pointer so that other functions can access them with localrecover. If we ever want to instrument these, we can use more indirection to recover the addresses of these local variables. We can do it during clang irgen or with the asan module pass. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11307 llvm-svn: 242726	2015-07-20 22:49:44 +00:00
Arnold Schwaighofer	764d6de823	Revert "MergeFuncs: Transfer the function parameter attributes to the call site" It is okay to not transfer parameter attributes. This reverts commit r242558. llvm-svn: 242646	2015-07-19 19:30:43 +00:00
Yaron Keren	c66c06b899	Narrow Callee scope, suggestion from David Blaikie. llvm-svn: 242644	2015-07-19 15:48:07 +00:00
Yaron Keren	611f614ee1	De-duplicate CS.getCalledFunction() expression. Not sure if the optimizer will save the call as getCalledFunction() is not a trivial access function but the code is clearer this way. llvm-svn: 242641	2015-07-19 11:52:02 +00:00
Yaron Keren	3d49f6df94	Rangify for loops in GlobalDCE, NFC. llvm-svn: 242619	2015-07-18 19:57:34 +00:00
Chandler Carruth	9f2bf1aff5	[PM/AA] Remove the addEscapingUse update API that won't be easy to directly model in the new PM. This also was an incredibly brittle and expensive update API that was never fully utilized by all the passes that claimed to preserve AA, nor could it reasonably have been extended to all of them. Any number of places add uses of values. If we ever wanted to reliably instrument this, we would want a callback hook much like we have with ValueHandles, but doing this for every use addition seems extremely expensive in terms of compile time. The only user of this update mechanism is GlobalsModRef. The idea of using this to keep it up to date doesn't really work anyways as its analysis requires a symmetric analysis of two different memory locations. It would be very hard to make updates be sufficiently rigorous to guarantee symmetric analysis in this way, and it pretty certainly isn't true today. However, folks have been using GMR with this update for a long time and seem to not be hitting the issues. The reported issue that the update hook fixes isn't even a problem any more as other changes to GetUnderlyingObject worked around it, and that issue stemmed from many years ago. As a consequence, a prior patch provided a flag to control the unsafe behavior of GMR, and this patch removes the update mechanism that has questionable compile-time tradeoffs and is causing problems with moving to the new pass manager. Note the lack of test updates -- not one test in tree actually requires this update, even for a contrived case. All of this was extensively discussed on the dev list, this patch will just enact what that discussion decides on. I'm sending it for review in part to show what I'm planning, and in part to show the amazing amount of work this avoids. Every call to the AA here is something like three to six indirect function calls, which in the non-LTO pipeline never do any work! =[ Differential Revision: http://reviews.llvm.org/D11214 llvm-svn: 242605	2015-07-18 03:26:46 +00:00
Evgeniy Stepanov	9cb08f823f	[asan] Fix shadow mapping on Android/AArch64. Instrumentation and the runtime library were in disagreement about ASan shadow offset on Android/AArch64. This fixes a large number of existing tests on Android/AArch64. llvm-svn: 242595	2015-07-17 23:51:18 +00:00
Kuba Brecka	7f54753180	[asan] Add a comment explaining why non-instrumented allocas are moved. Addition to r242510. llvm-svn: 242561	2015-07-17 19:20:21 +00:00
Arnold Schwaighofer	690cd87dcd	MergeFuncs: Transfer the function parameter attributes to the call site rdar://21516488 llvm-svn: 242558	2015-07-17 18:59:08 +00:00
Kuba Brecka	37a5ffaca0	[asan] Fix invalid debug info for promotable allocas Since r230724 ("Skip promotable allocas to improve performance at -O0"), there is a regression in the generated debug info for those non-instrumented variables. When inspecting such a variable's value in LLDB, you often get garbage instead of the actual value. ASan instrumentation is inserted before the creation of the non-instrumented alloca. The only allocas that are considered standard stack variables are the ones declared in the first basic-block, but the initial instrumentation setup in the function breaks that invariant. This patch makes sure uninstrumented allocas stay in the first BB. Differential Revision: http://reviews.llvm.org/D11179 llvm-svn: 242510	2015-07-17 06:29:57 +00:00
Peter Collingbourne	9b0fe61047	Internalize: internalize comdat members as a group, and drop comdat on such members. Internalizing an individual comdat group member without also internalizing the other members of the comdat can break comdat semantics. For example, if a module contains a reference to an internalized comdat member, and the linker chooses a comdat group from a different object file, this will break the reference to the internalized member. This change causes the internalizer to only internalize comdat members if all other members of the comdat are not externally visible. Once a comdat group has been fully internalized, there is no need to apply comdat rules to its members; later optimization passes (e.g. globaldce) can legally drop individual members of the comdat. So we drop the comdat attribute from all comdat members. Differential Revision: http://reviews.llvm.org/D10679 llvm-svn: 242423	2015-07-16 17:42:21 +00:00
Tobias Grosser	39a7bd182e	Add PM extension point EP_VectorizerStart This extension point allows passes to be executed right before the vectorizer and other highly target specific optimizations are run. llvm-svn: 242389	2015-07-16 08:20:37 +00:00
Cong Hou	ab23bfbc0e	Create a wrapper pass for BranchProbabilityInfo. This new wrapper pass is useful when we want to do branch probability analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence. http://reviews.llvm.org/D11241 llvm-svn: 242349	2015-07-15 22:48:29 +00:00
Chen Li	3f5ed1566e	[LoopUnswitch] Add an else clause to IsTrivialUnswitchCondition() when checking HeaderTerm instruction type Summary: This is a trivial code change with no functionality effect. When LoopUnswitch determines trivial unswitch condition, it checks whether the loop header's terminator instruction is a branch instruction or switch instruction since trivial unswitch condition can only apply to these two instruction types. The current code does not fail the check directly on other instruction types, but check the nullness of LoopExitBB variable instead. The added else clause makes the check fail immediately on other instruction types and makes the code more obvious. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11239 llvm-svn: 242345	2015-07-15 22:41:13 +00:00
JF Bastien	7289f73b8d	Fix mergefunc infinite loop Self-referential constants containing references to a merged function no longer cause the MergeFunctions pass to infinite loop. Also adds a reproduction IR which would otherwise fail, which was isolated from a similar issue in Chromium. Author: jrkoenig Reviewers: nlewycky, jfb Subscribers: llvm-commits, nlewycky, jfb Differential Revision: http://reviews.llvm.org/D11208 llvm-svn: 242337	2015-07-15 21:51:33 +00:00
Michael Zolotukhin	31b3eaaf28	[LoopUnrolling] Handle cast instructions. During estimation of unrolling effect we should be able to propagate constants through casts. Differential Revision: http://reviews.llvm.org/D10207 llvm-svn: 242257	2015-07-15 00:19:51 +00:00
Wei Mi	deee61e434	Create a wrapper pass for BlockFrequencyInfo. This is useful when we want to do block frequency analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence. Patch by congh. Approved by dexonsmith. Differential Revision: http://reviews.llvm.org/D11196 llvm-svn: 242248	2015-07-14 23:40:50 +00:00
David Majnemer	33b6f82e72	[InstCombine] Generalize sub of selects optimization to all BinaryOperators This exposes further optimization opportunities if the selects are correlated. llvm-svn: 242235	2015-07-14 22:39:23 +00:00
Adam Nemet	9f7dedc376	[LAA] Introduce RuntimePointerChecking::PointerInfo, NFC Turn this structure-of-arrays (i.e. the various pointer attributes) into array-of-structures. llvm-svn: 242219	2015-07-14 22:32:50 +00:00
Adam Nemet	7cdebac0c8	[LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC I am planning to add more nested classes inside RuntimePointerCheck so all these triple-nesting would be hard to follow. Also rename it to RuntimePointerChecking (i.e. append 'ing'). llvm-svn: 242218	2015-07-14 22:32:44 +00:00
Tim Northover	586b741959	GVN: use a static array instead of regenerating it each time. NFC. llvm-svn: 242202	2015-07-14 21:14:58 +00:00
Tim Northover	d5fdef016d	GVN: tolerate an instruction being replaced without existing in the leaderboard Sometimes an incidentally created instruction can duplicate a Value used elsewhere. It then often doesn't end up in the leader table. If it's later removed, we attempt to remove it from the leader table and segfault. Instead we should just ignore the removal request, which won't cause any problems. The reverse situation, where the original instruction is replaced by the new one (which you might think could leave the leader table empty) cannot occur, because the incidental instruction will never be found in the first place. llvm-svn: 242199	2015-07-14 21:03:18 +00:00
David Majnemer	62690b1952	[SROA] Don't de-atomic volatile loads and stores Volatile loads and stores are made visible in global state regardless of what memory is involved. It is not correct to disregard the ordering and synchronization scope because it is possible to synchronize with memory operations performed by hardware. This partially addresses PR23737. llvm-svn: 242126	2015-07-14 06:19:58 +00:00
Reid Kleckner	486fa3977a	Update enforceKnownAlignment after the isWeakForLinker semantic change Previously we would refrain from attempting to increase the linkage of available_externally globals because they were considered weak for the linker. Now they are treated more like a declaration instead of a weak definition. This was causing SSE alignment faults in Chromuim, when some code assumed it could increase the alignment of a dllimported global that it didn't control. http://crbug.com/509256 llvm-svn: 242091	2015-07-14 00:11:08 +00:00
Pete Cooper	90d95edbb4	Loop idiom recognizer was replacing too many uses of popcount. When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop. The bug here was exposed because we were replacing a use prior to the ctpop with the ctpop value and so we have a use before def, i.e., we changed %tobool.5 = icmp ne i32 %num, 0 store i1 %tobool.5, i1* %ptr br i1 %tobool.5, label %for.body.lr.ph, label %for.end to store i1 %1, i1* %ptr %0 = call i32 @llvm.ctpop.i32(i32 %num) %1 = icmp ne i32 %0, 0 br i1 %1, label %for.body.lr.ph, label %for.end Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect. The store doesn’t want the result of ctpop. The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses. Reviewed by Hal Finkel. llvm-svn: 242068	2015-07-13 21:25:33 +00:00
Mark Heffernan	d7ebc24112	Enable runtime unrolling with unroll pragma metadata Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N") and a runtime trip count. Also, do not unroll loops with unroll full metadata if the loop has a runtime loop count. Previously, such loops would be unrolled with a very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be enabled resulting in a very large (and likely unwise) unroll factor. llvm-svn: 242047	2015-07-13 18:26:27 +00:00
Benjamin Kramer	e448b5be05	Avoid using Loop::getSubLoopsVector. Passes should never modify it, just use the const version. While there reduce copying in LoopInterchange. No functional change intended. llvm-svn: 242041	2015-07-13 17:21:14 +00:00
Rafael Espindola	c1d63f7499	Remove unused variable. Sorry I missed it in the previous commit. llvm-svn: 242032	2015-07-13 14:43:33 +00:00
Rafael Espindola	5895d6845e	Aliases don't have available_externally linkage. Allowing that is probably a good idea, but currently we don't, so this is dead code. llvm-svn: 242031	2015-07-13 14:39:02 +00:00
Rafael Espindola	237c3a6def	Don't change the visibility when converting a definition to a declaration. llvm-svn: 242030	2015-07-13 14:18:22 +00:00
David Majnemer	599ca4426c	[InstSimplify] Teach InstSimplify how to simplify extractelement llvm-svn: 242008	2015-07-13 01:15:53 +00:00
David Majnemer	25a796e148	[InstSimplify] Teach InstSimplify how to simplify extractvalue llvm-svn: 242007	2015-07-13 01:15:46 +00:00
David Majnemer	6bc83e0f43	[LICM] Don't try to sink values out of loops without any exits There is no suitable basic block to sink instructions in loops without exits. The only way an instruction in a loop without exits can be used is as an incoming value to a PHI. In such cases, the incoming block for the corresponding value is unreachable. This fixes PR24013. Differential Revision: http://reviews.llvm.org/D10903 llvm-svn: 241987	2015-07-12 03:53:05 +00:00
Hal Finkel	9cf58c4095	Move getStrideFromPointer and friends from LoopVectorize to VectorUtils The following functions are moved from the LoopVectorizer to VectorUtils: - getGEPInductionOperand - stripGetElementPtr - getUniqueCastUse - getStrideFromPointer These used to be static functions in LoopVectorize, but will also be used by the upcoming loop versioning LICM transformation. Patch by Ashutosh Nema! llvm-svn: 241980	2015-07-11 10:52:42 +00:00
Chandler Carruth	00ebdbcc47	[PM/AA] Completely remove the AliasAnalysis::copyValue interface. No in-tree alias analysis used this facility, and it was not called in any particularly rigorous way, so it seems unlikely to be correct. Note that one of the only stateful AA implementations in-tree, GlobalsModRef is completely broken currently (and any AA passes like it are equally broken) because Module AA passes are not effectively invalidated when a function pass that fails to update the AA stack runs. Ultimately, it doesn't seem like we know how we want to build stateful AA, and until then trying to support and maintain correctness for an untested API is essentially impossible. To that end, I'm planning to rip out all of the update API. It can return if and when we need it and know how to build it on top of the new pass manager and as part of tested stateful AA implementations in the tree. Differential Revision: http://reviews.llvm.org/D10889 llvm-svn: 241975	2015-07-11 04:39:00 +00:00
Tyler Nowicki	3960d85262	Renamed some uses of unroll to interleave in the vectorizer. llvm-svn: 241971	2015-07-11 00:31:11 +00:00
Bjorn Steinbrink	a6b929dfe2	[InstCombine] Actually combine AA metadata when replacing one load with another Fixes PR24083 llvm-svn: 241955	2015-07-10 22:30:17 +00:00
Jingyue Wu	a277561922	[TTI] BasicTTIImpl assumes no vector registers Summary: Following the discussion on r241884, it's more reasonable to assume that a target has no vector registers by default instead of letting every such target overrides getNumberOfRegisters. Therefore, this patch modifies BasicTTIImpl::getNumberOfRegisters to return 0 when Vector is true, and partially reverts r241884 which modifies NVPTXTTIImpl::getNumberOfRegisters. It also fixes a performance bug in LoopVectorizer. Even if a target has no vector registers, vectorization may still help ILP. So, we need both checks to be false before disabling loop vectorization all together. Reviewers: hfinkel Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11108 llvm-svn: 241942	2015-07-10 21:14:54 +00:00
Adam Nemet	215746b45a	[LoopDist/LoopVer] Move LoopVersioning to a new module, NFC Summary: The class will obviously need improvement down the road. For one, there is no reason that addPHINodes would have to be exposed like that. I will make this and other improvements in follow-up patches. The main goal is to be able to share this functionality. The LoopLoadElimination pass I am working on needs it too. Later we can move other clients as well (LV and Ashutosh's LICMVer). Reviewers: hfinkel, ashutosh.nema Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10577 llvm-svn: 241932	2015-07-10 18:55:13 +00:00
Adam Nemet	1a689188c4	[LoopDist] Move loop-versioning helper functions to Cloning, NFC Summary: This makes them available to the LoopVersioning class as that is moved to its own module in the next patch. Reviewers: ashutosh.nema, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10576 llvm-svn: 241931	2015-07-10 18:55:09 +00:00
Benjamin Kramer	f4ebfa3ae1	[InstSimplify] Fold away ord/uno fcmps when nnan is present. This is important to fold away the slow case of complex multiplies emitted by clang. llvm-svn: 241911	2015-07-10 14:02:02 +00:00
Alexey Bataev	da33d80e9a	Disable loop re-rotation for -Oz (patch by Andrey Turetsky) After changes in rL231820 loop re-rotation is performed even in -Oz mode. Since loop rotation is disabled for -Oz, it seems loop re-rotation should be disabled too. Differential Revision: http://reviews.llvm.org/D10961 llvm-svn: 241897	2015-07-10 10:37:09 +00:00
David Majnemer	db82d2f338	Revert the new EH instructions This reverts commits r241888-r241891, I didn't mean to commit them. llvm-svn: 241893	2015-07-10 07:15:17 +00:00
David Majnemer	1d3fe98d57	Address Reid's review feedback. llvm-svn: 241889	2015-07-10 07:00:58 +00:00
David Majnemer	ae2ffc8a8c	New EH representation for MSVC compatibility Summary: This introduces new instructions neccessary to implement MSVC-compatible exception handling support. Most of the middle-end and none of the back-end haven't been audited or updated to take them into account. Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11041 llvm-svn: 241888	2015-07-10 07:00:44 +00:00
Bjorn Steinbrink	8350534772	[InstCombine] Employ AliasAnalysis in FindAvailableLoadedValue llvm-svn: 241887	2015-07-10 06:55:49 +00:00
Bjorn Steinbrink	a91fd0998f	[InstCombine] Properly combine metadata when replacing a load with another Not doing this can lead to misoptimizations down the line, e.g. because of range metadata on the replacing load excluding values that are valid for the load that is being replaced. llvm-svn: 241886	2015-07-10 06:55:44 +00:00
Sanjoy Das	6f062c8c2a	[IndVars] Try to use existing values in RewriteLoopExitValues. Summary: In RewriteLoopExitValues, before expanding out an SCEV expression using SCEVExpander, try to see if an existing LLVM IR expression already computes the value we're interested in. If so use that existing expression. Apart from reducing IndVars' reliance on the rest of the compilation pipeline, this also prevents IndVars from concluding some expressions as "high cost" when they're not. For instance, `InductiveRangeCheckElimination` often emits code of the following form: ``` len = umin(len_A, len_B) loop: ... if (i++ < len) goto loop outside_loop: use(i) ``` `SCEVExpander` refuses to rewrite the use of `i` in `outside_loop`, since it thinks the value of `i` on loop exit, `len`, is a high cost expansion since it contains an `umax` in it. With this change, `IndVars` can see that it can re-use `len` instead of creating a new expression to compute `umin(len_A, len_B)`. I considered putting this cleverness in `SCEVExpander`, but I was worried that it may then have a deterimental effect on other passes that use it. So I decided it was better to just do this in the one place where it seems like an obviously good idea, with the intent of generalizing later if needed. Reviewers: atrick, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10782 llvm-svn: 241838	2015-07-09 18:46:12 +00:00
Sanjay Patel	1319446195	[SLPVectorizer] Try different vectorization factors for store chains ...and set max vector register size based on target This patch is based on discussion on the llvmdev mailing list: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/087405.html and also solves: https://llvm.org/bugs/show_bug.cgi?id=17170 Several FIXME/TODO items are noted in comments as potential improvements. Differential Revision: http://reviews.llvm.org/D10950 llvm-svn: 241760	2015-07-08 23:40:55 +00:00
Michael Zolotukhin	97295ea7dd	[LoopVectorizer] Rename BypassBlock to VectorPH, and CheckBlock to NewVectorPH. NFCI. llvm-svn: 241742	2015-07-08 21:48:03 +00:00
Michael Zolotukhin	8c874bb2f1	[LoopVectorizer] Restructurize code for emitting RT checks. NFCI. Place all code corresponding to a run-time check in one place. Previously we generated some code, then proceeded to a next check, then finished the code for the first check (like splitting blocks and generating branches). Now the code for generating a check is self-contained. llvm-svn: 241741	2015-07-08 21:47:59 +00:00
Michael Zolotukhin	66f5591f9b	[LoopVectorizer] Remove redundant variables PastOverflowCheck and OverflowCheckAnchor. NFCI. llvm-svn: 241740	2015-07-08 21:47:56 +00:00
Michael Zolotukhin	00345cadd5	[LoopVectorizer] Move some code around to ease further refactoring. NFCI. llvm-svn: 241739	2015-07-08 21:47:53 +00:00
Michael Zolotukhin	7db3063f87	[LoopVectorizer] Remove redundant variable LastBypassBlock. NFC. llvm-svn: 241738	2015-07-08 21:47:47 +00:00
Reid Kleckner	60381791b5	Rename llvm.frameescape and llvm.framerecover to localescape and localrecover Summary: Initially, these intrinsics seemed like part of a family of "frame" related intrinsics, but now I think that's more confusing than helpful. Initially, the LangRef specified that this would create a new kind of allocation that would be allocated at a fixed offset from the frame pointer (EBP/RBP). We ended up dropping that design, and leaving the stack frame layout alone. These intrinsics are really about sharing local stack allocations, not frame pointers. I intend to go further and add an `llvm.localaddress()` intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being used to address locals, which should not be confused with the frame pointer. Naming suggestions at this point are welcome, I'm happy to re-run sed. Reviewers: majnemer, nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11011 llvm-svn: 241633	2015-07-07 22:25:32 +00:00
David Majnemer	6cc21f909c	Revert "Revert r241570, it caused PR24053" This reverts commit r241602. We had a latent bug in SCCP where we would make a basic block empty and then proceed to ask questions about it's terminator. llvm-svn: 241616	2015-07-07 18:49:41 +00:00
Reid Kleckner	fc0f93832b	[llvm-extract] Drop comdats from declarations The verifier rejects comdats on declarations. llvm-svn: 241483	2015-07-06 18:48:02 +00:00
Teresa Johnson	d3a33a16bb	Resubmit "Add new EliminateAvailableExternally module pass" (r239480) This change includes a fix for https://code.google.com/p/chromium/issues/detail?id=499508#c3, which required updating the visibility for symbols with eliminated definitions. --Original Commit Message-- Add new EliminateAvailableExternally module pass, which is performed in O2 compiles just before GlobalDCE, unless we are preparing for LTO. This pass eliminates available externally globals (turning them into declarations), regardless of whether they are dead/unreferenced, since we are guaranteed to have a copy available elsewhere at link time. This enables additional opportunities for GlobalDCE. If we are preparing for LTO (e.g. a -flto -c compile), the pass is not included as we want to preserve available externally functions for possible link time inlining. The FE indicates whether we are doing an -flto compile via the new PrepareForLTO flag on the PassManagerBuilder. llvm-svn: 241466	2015-07-06 16:22:42 +00:00
Sanjay Patel	cc9fad0bf7	remove unnecessary temp variable; NFCI llvm-svn: 241415	2015-07-05 21:21:47 +00:00
Peter Collingbourne	6a9d1774d0	IR: Do not consider available_externally linkage to be linker-weak. From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413	2015-07-05 20:52:35 +00:00
Sanjay Patel	a4860f3af2	use range-based for loops; NFCI llvm-svn: 241412	2015-07-05 20:15:21 +00:00
Sanjay Patel	f73f8919ed	use range-based for loops; NFCI llvm-svn: 241395	2015-07-04 19:38:52 +00:00
Yaron Keren	6967cbb319	Remove whitespace from start of line, NFC. llvm-svn: 241268	2015-07-02 14:25:09 +00:00
Alexey Samsonov	958dab71b3	[LoopVectorize] Use ReplaceInstWithInst() helper where appropriate. This is mostly an NFC, which increases code readability (instead of saving old terminator, generating new one in front of old, and deleting old, we just call a function). However, it would additionaly copy the debug location from old instruction to replacement, which would help PR23837. llvm-svn: 241197	2015-07-01 22:18:30 +00:00
David Majnemer	453f7a1480	[LoopUnroll] Use undef for phis with no value live We would create a phi node with a zero initialized operand instead of undef in the case where no value was originally available. This was problematic for x86_mmx which has no null value. llvm-svn: 241143	2015-07-01 05:38:07 +00:00
David Majnemer	9402e27ae0	[SCCP] Turn loads of null into undef instead of zero initialized values Surprisingly, this is a correctness issue: the mmx type exists for calling convention purposes, LLVM doesn't have a zero representation for them. This partially fixes PR23999. llvm-svn: 241142	2015-07-01 05:37:57 +00:00
Jingyue Wu	cf02ef315f	[NaryReassociate] enhances nsw by leveraging @llvm.assume Summary: nsw are flaky and can often be removed by optimizations. This patch enhances nsw by leveraging @llvm.assume in the IR. Specifically, NaryReassociate now understands that assume(a + b >= 0) && assume(a >= 0) ==> a +nsw b As a result, it can split more sext(a + b) into sext(a) + sext(b) for CSE. Test Plan: nary-gep.ll Reviewers: broune, meheff Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10822 llvm-svn: 241139	2015-07-01 03:38:49 +00:00
Alexey Samsonov	342b1e8053	[SanitizerCoverage] Don't add instrumentation to unreachable blocks. llvm-svn: 241127	2015-06-30 23:11:45 +00:00
David Majnemer	cda8688f61	[Cloning] Teach CloneModule about personality functions CloneModule didn't take into account that it needed to remap the value using values in the module. This fixes PR23992. llvm-svn: 241122	2015-06-30 22:14:01 +00:00
Alexey Samsonov	b7724b95d8	[LoopSimplify] Set proper debug location in loop backedge blocks. Set debug location for terminator instruction in loop backedge block (which is an unconditional jump to loop header). We can't copy debug location from original backedges, as there can be several of them, with different debug info locations. So, we follow the approach of SplitBlockPredecessors, and copy the debug info from first non-PHI instruction in the header (i.e. destination block). This is yet another change for PR23837. llvm-svn: 240999	2015-06-29 21:30:14 +00:00
Diego Novillo	b0257c8419	Tidy comment. llvm-svn: 240987	2015-06-29 20:03:46 +00:00
Jingyue Wu	3abde7bea5	[SLSR] S's basis must have the same type as S llvm-svn: 240910	2015-06-28 17:45:05 +00:00
David Majnemer	9f3979fd78	[LoopVectorize] Pointer indicies may be wider than the pointer If we are dealing with a pointer induction variable, isInductionPHI gives back a step value of Stride / size of pointer. However, we might be indexing with a legal type wider than the pointer width. Handle this by inserting casts where appropriate instead of crashing. This fixes PR23954. llvm-svn: 240877	2015-06-27 08:38:17 +00:00
David Majnemer	5185c3c271	[PruneEH] A naked, noinline function can return via InlineAsm The PruneEH pass tries to annotate functions as 'noreturn' if it doesn't see a ReturnInst. However, a naked function containing inline assembly can contain control flow leaving the function. This fixes PR23971. llvm-svn: 240876	2015-06-27 07:52:53 +00:00
Peter Collingbourne	ba4c8b5004	LowerBitSets: Ignore bitset entries that do not directly refer to a global. It is possible for a global to be substituted with another global of a different type or a different kind (i.e. an alias) at IR link time. One example of this scenario is when a Microsoft ABI vtable is substituted with an alias referring to a larger vtable containing an RTTI reference. This will cause the global to be RAUW'd with a possibly bitcasted reference to the other global. This will of course also affect any references to the global in bitset metadata. The right way to handle such metadata is simply to ignore it. This is sound because the linked module should contain another copy of the bitset entries as applied to the new global. llvm-svn: 240866	2015-06-27 00:17:51 +00:00
Philip Reames	8fe7f13af8	[RewriteStatepointsForGC] Generalized vector phi/select handling for base pointers This change extends the detection of base pointers for vector constructs to handle arbitrary phi and select nodes. The existing non-vector code already handles those, so this is basically just extending the vector special case to be less special cased. It still isn't generalized vector handling since we can't handle arbitrary vector instructions (e.g. shufflevectors), but it's a lot closer. The general structure of the change is as follows: * Extend the base defining value relation over a subset of vector instructions and vector typed phi & select instructions. * Move scalarization from before base pointer rewriting to after base pointer rewriting. The extension of the BDV relation is sufficient to find vector base phis for vector inputs. * Preserve the existing special case logic for when the base of a vector element is locally obvious. This general idea could be extended to the scalar case as well. Differential Revision: http://reviews.llvm.org/D10461#inline-84275 llvm-svn: 240850	2015-06-26 22:47:37 +00:00
David Blaikie	b447ac6435	Move VectorUtils from Transforms to Analysis to correct layering violation llvm-svn: 240804	2015-06-26 18:02:52 +00:00
David Blaikie	1213dbf1fd	Fix ODR violation waiting to happen by making static function definitions in VectorUtils.h non-static and defined out of line Patch by Ashutosh Nema Differential Revision: http://reviews.llvm.org/D10682 llvm-svn: 240794	2015-06-26 16:57:30 +00:00
Alexey Samsonov	773e8c3966	[ASan] Use llvm::getDISubprogram() to get function entry debug location. It can be more robust than copying debug info from first non-alloca instruction in the entry basic block. We use the same strategy in coverage instrumentation. llvm-svn: 240738	2015-06-26 00:00:47 +00:00
Anna Zaks	785c075786	[asan] Do not instrument special purpose LLVM sections. Do not instrument globals that are placed in sections containing "__llvm" in their name. This fixes a bug in ASan / PGO interoperability. ASan interferes with LLVM's PGO, which places its globals into a special section, which is memcpy-ed by the linker as a whole. When those goals are instrumented, ASan's memcpy wrapper reports an issue. http://reviews.llvm.org/D10541 llvm-svn: 240723	2015-06-25 23:35:48 +00:00
Anna Zaks	4f652b69b1	[asan] Don't run stack malloc on functions containing inline assembly. It makes LLVM run out of registers even on 64-bit platforms. For example, the following test case fails on darwin. clang -cc1 -O0 -triple x86_64-apple-macosx10.10.0 -emit-obj -fsanitize=address -mstackrealign -o ~/tmp/ex.o -x c ex.c error: inline assembly requires more registers than available void TestInlineAssembly(const unsigned char S, unsigned int pS, unsigned char D, unsigned int pD, unsigned int h) { unsigned int sr = 4, pDiffD = pD - 5; unsigned int pDiffS = (pS << 1) - 5; char flagSA = ((pS & 15) == 0), flagDA = ((pD & 15) == 0); asm volatile ( "mov %0, %%"PTR_REG("si")"\n" "mov %2, %%"PTR_REG("cx")"\n" "mov %1, %%"PTR_REG("di")"\n" "mov %8, %%"PTR_REG("ax")"\n" : : "m" (S), "m" (D), "m" (pS), "m" (pDiffS), "m" (pDiffD), "m" (sr), "m" (flagSA), "m" (flagDA), "m" (h) : "%"PTR_REG("si"), "%"PTR_REG("di"), "%"PTR_REG("ax"), "%"PTR_REG("cx"), "%"PTR_REG("dx"), "memory" ); } http://reviews.llvm.org/D10719 llvm-svn: 240722	2015-06-25 23:35:45 +00:00
Pete Cooper	125ad17fed	Use foreach loop over constant operands. NFC. A number of places had explicit loops over Constant::operands(). Just use foreach loops where possible. llvm-svn: 240694	2015-06-25 20:51:38 +00:00
Jingyue Wu	5e34ce33f5	[InstCombine] call SimplifyICmpInst with correct context Summary: Fixes PR23809. Without passing the context to SimplifyICmpInst, we would use the assume to prove that the condition feeding the assume is trivially true (see isValidAssumeForContext in ValueTracking.cpp), causing the removal of the assume which may be useful for later optimizations. Test Plan: pr23800.ll Reviewers: hfinkel, majnemer Reviewed By: hfinkel Subscribers: henryhu, llvm-commits, wengxt, broune, meheff, eliben Differential Revision: http://reviews.llvm.org/D10695 llvm-svn: 240683	2015-06-25 20:14:47 +00:00
Yaron Keren	62064d6d38	Rangify for loop in Inliner.cpp. NFC. llvm-svn: 240678	2015-06-25 19:28:24 +00:00
Peter Collingbourne	2a3443c7c5	GVN: If a branch has two identical successors, we cannot declare either dead. This previously caused miscompilations as a result of phi nodes receiving undef incoming values from blocks dominated by such successors. Differential Revision: http://reviews.llvm.org/D10726 llvm-svn: 240670	2015-06-25 18:32:02 +00:00
Jay Foad	7a28cdc9dd	Teach LLVM about the PPC64 memory sanitizer implementation. Summary: This is the LLVM part of the PPC memory sanitizer implementation in D10648. Reviewers: kcc, samsonov, willschm, wschmidt, eugenis Reviewed By: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10649 llvm-svn: 240627	2015-06-25 10:34:29 +00:00
Duncan P. N. Exon Smith	817ac8f40a	Add simplify_type<const WeakVH>; simplify IndVarSimplify r240214 fixed some UB in IndVarSimplify, and it needed a temporary `WeakVH` to do it. Add `simplify_type<const WeakVH>` so that this temporary isn't necessary. llvm-svn: 240599	2015-06-24 22:23:21 +00:00
David Majnemer	63d606bdcb	[GVN] Intersect the IR flags when CSE'ing two instructions We performed a simple, but incomplete, intersection when it came time to CSE instructions. It didn't handle, for example, the 'exact' flag. This fixes PR23922. llvm-svn: 240595	2015-06-24 21:52:25 +00:00
David Majnemer	f6e500a0dc	[Reassociate] Don't propogate flags when creating negations Reassociate mutated existing instructions in order to form negations which would create additional reassociate opportunities. This fixes PR23926. llvm-svn: 240593	2015-06-24 21:27:36 +00:00
Sanjay Patel	64ea207027	fix typos; NFC llvm-svn: 240592	2015-06-24 20:42:33 +00:00
Sanjay Patel	09159b8f47	don't repeat function names in comments; NFC llvm-svn: 240591	2015-06-24 20:40:57 +00:00
Sanjay Patel	adb110c372	fix typos; NFC llvm-svn: 240585	2015-06-24 20:07:50 +00:00
Michael Zolotukhin	79ff564ef3	[LoopVectorizer] Fix bailing-out condition for OptForSize case. With option OptForSize enabled, the Loop Vectorizer is not supposed to create tail loop. The condition checking that was invalid and was not matching to the comment above. Patch by Marianne Mailhot-Sarrasin. llvm-svn: 240556	2015-06-24 17:26:24 +00:00
Sanjay Patel	6a24811d87	fix typo; NFC llvm-svn: 240480	2015-06-23 23:26:22 +00:00
Sanjay Patel	9b7e6776a1	don't repeat function names in comments; NFC llvm-svn: 240478	2015-06-23 23:05:08 +00:00
Alexey Samsonov	19ffcb900f	Let llvm::ReplaceInstWithInst copy debug location from old to new instruction. Currently some users of this function do this explicitly, and all the rest forget to do this. ThreadSanitizer was one of such users, and had missing debug locations for calls into TSan runtime handling atomic operations, eventually leading to poorly symbolized stack traces and malfunctioning suppressions. This is another change relevant to PR23837. llvm-svn: 240460	2015-06-23 21:00:08 +00:00
Mark Heffernan	9b536a640b	This change fixes three bugs in loop unswitching. This change causes an 81% speed-up on a benchmark that is based on EigenConvolutionKernel2D from Eigen3, where the lack of loop unswitching blocks hoisting of loads out of a nested loop (see bug 23816 for how loop unswitching and load hoisting are related). Change 1: Unswitching on trivial conditions should always happen regardless of the computed unswitching cost, as really the cost is zero. While there is code to make that happen, the logic that checks the unswitching cost against a threshold was moved to an earlier point (revision 147935) than the point where trivial unswitching is detected, so trivial unswitching is currently blocked by the cost threshold. This change fixes that. Change 2: Before revision 147935 (from 2012-01-11), the threshold parameter was a per-loop threshold. So an unswitching happened only if the cost of the unswitching was less than the threshold. In an indirect way (and I believe unintentionally), the logic for this since then has been that the threshold is an over-all budget across all loops for all loop unswitching done by a given LoopUnswitch loop pass object. So if an unswitching with cost 100 happens in one function, that in effect reduces the threshold from 100 to 0 for the loops even in another function. This persists for the lifetime of that loop pass object. This makes no difference for most small examples but it is important for large examples. This revision fixes that. Change 3: The cost is currently calculated as std::min(NumInstructions, 5 * NumBlocks). So a loop with 2 blocks and a million instructions will have an unswitching cost of 10. I changed this to just NumInstructions, as it were before revision 147935, though I'm open to e.g. instead replacing std::min with std::max. I've tried to make the change minimally invasive while staying with what I think was the original intent of the code. Submitted on behalf of broune@. llvm-svn: 240438	2015-06-23 18:26:50 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Weiming Zhao	f1abad57da	Fix PR13851: Preserve metadata for the unswitched branch This patch copies the metadata of the unswitched branch to the newly crreated branch in loop unswitch pass. llvm-svn: 240378	2015-06-23 05:31:09 +00:00
David Majnemer	726901b638	[InstCombine] Optimize subtract of selects into a select of a sub This came up when examining some code generated by clang's IRGen for certain member pointers. llvm-svn: 240369	2015-06-23 02:49:24 +00:00
Adam Nemet	f530b329c7	[LoopDist] Improve variable names and comments in LoopVersioning class, NFC As with the previous patch, the goal is to turn the class into a general loop-versioning class. This patch removes any references to loop distribution. llvm-svn: 240352	2015-06-22 22:59:40 +00:00
Peter Collingbourne	de26a918c1	SafeStack: Create the unsafe stack pointer on demand. This avoids creating an unnecessary undefined reference on targets such as NVPTX that require such references to be declared in asm output. llvm-svn: 240321	2015-06-22 20:26:54 +00:00
Chandler Carruth	c3f49eb451	[PM/AA] Hoist the AliasResult enum out of the AliasAnalysis class. This will allow classes to implement the AA interface without deriving from the class or referencing an internal enum of some other class as their return types. Also, to a pretty fundamental extent, concepts such as 'NoAlias', 'MayAlias', and 'MustAlias' are first class concepts in LLVM and we aren't saving anything by scoping them heavily. My mild preference would have been to use a scoped enum, but that feature is essentially completely broken AFAICT. I'm extremely disappointed. For example, we cannot through any reasonable[1] means construct an enum class (or analog) which has scoped names but converts to a boolean in order to test for the possibility of aliasing. [1]: Richard Smith came up with a "solution", but it requires class templates, and lots of boilerplate setting up the enumeration multiple times. Something like Boost.PP could potentially bundle this up, but even that would be quite painful and it doesn't seem realistically worth it. The enum class solution would probably work without the need for a bool conversion. Differential Revision: http://reviews.llvm.org/D10495 llvm-svn: 240255	2015-06-22 02:16:51 +00:00
Benjamin Kramer	00a477f279	[SwitchLowering] Remove quadratic vector removal. This can be triggered with giant switches. No functionality change intended. llvm-svn: 240221	2015-06-20 15:59:34 +00:00
Yaron Keren	4c548f2dd9	Rangify for loops in Inliner::runOnSCC(), NFC. llvm-svn: 240215	2015-06-20 07:12:33 +00:00
Justin Bogner	485212f67c	IndVarSimplify: Avoid UB from binding a reference to a null pointer Calling operator* on a WeakVH whose Value is null hits undefined behaviour, since we bind the value to a reference. Instead, go through `operator Value*` so that we work with the pointer itself. Found by ubsan. llvm-svn: 240214	2015-06-20 06:24:05 +00:00
Justin Bogner	e46d3796fc	LowerSwitch: Avoid some undefined behaviour When a case of INT64_MIN was followed by a case that was greater than zero, we were overflowing a signed integer here. Since we've sorted the cases here anyway (and thus currentValue must be greater than nextValue) it's simple enough to avoid this by using addition rather than subtraction. Found by UBSAN on existing tests. llvm-svn: 240201	2015-06-20 00:28:25 +00:00
Adam Nemet	7632500d7a	[LoopDist] Rename RuntimeCheckEmitter to LoopVersioning, NFC llvm-svn: 240165	2015-06-19 19:32:48 +00:00
Adam Nemet	772a150614	[LoopDist] Move pointer-to-partition computation out of RuntimeCheckEmitter, NFC This starts preparing the class to become a (more) general LoopVersioning utility class. llvm-svn: 240164	2015-06-19 19:32:41 +00:00
Michael Zolotukhin	4d8ffa082c	[SLP] Vectorize for all-constant entries. Differential Revision: http://reviews.llvm.org/D10531 llvm-svn: 240144	2015-06-19 17:40:15 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-,llvm-namespace-comment -header-filter='llvm/.\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Alexander Potapenko	b9b73ef906	[ASan] Initial support for Kernel AddressSanitizer This patch adds initial support for the -fsanitize=kernel-address flag to Clang. Right now it's quite restricted: only out-of-line instrumentation is supported, globals are not instrumented, some GCC kasan flags are not supported. Using this patch I am able to build and boot the KASan tree with LLVMLinux patches from github.com/ramosian-glider/kasan/tree/kasan_llvmlinux. To disable KASan instrumentation for a certain function attribute((no_sanitize("kernel-address"))) can be used. llvm-svn: 240131	2015-06-19 12:19:07 +00:00
Eric Christopher	572e03a396	Fix "the the" in comments. llvm-svn: 240112	2015-06-19 01:53:21 +00:00
Benjamin Kramer	2b2cdd7799	[EliminateDuplicatePHINodes] Replace custom hash map with DenseSet. While there use hash_combine instead of hand-rolled hashing. No functionality change intended. llvm-svn: 240023	2015-06-18 16:01:00 +00:00
Jingyue Wu	a941129d00	[NFC] more comments in SLSR llvm-svn: 239984	2015-06-18 03:35:57 +00:00
David Majnemer	7fddeccb8b	Move the personality function from LandingPadInst to Function The personality routine currently lives in the LandingPadInst. This isn't desirable because: - All LandingPadInsts in the same function must have the same personality routine. This means that each LandingPadInst beyond the first has an operand which produces no additional information. - There is ongoing work to introduce EH IR constructs other than LandingPadInst. Moving the personality routine off of any one particular Instruction and onto the parent function seems a lot better than have N different places a personality function can sneak onto an exceptional function. Differential Revision: http://reviews.llvm.org/D10429 llvm-svn: 239940	2015-06-17 20:52:32 +00:00
Peter Collingbourne	4fc603ded3	LowerBitSets: Do not assign names to aliases of unnamed bitset element objects. The restriction on unnamed aliases was removed in r239921. Mostly reverts r239590, but we keep the test. llvm-svn: 239923	2015-06-17 18:31:02 +00:00
Igor Breger	dfcc3d31a7	AVX-512: cvtusi2ss/d intrinsics. Change builtin function name and signature ( add third parameter - rounding mode ). Added tests for intrinsics. Differential Revision: http://reviews.llvm.org/D10473 llvm-svn: 239888	2015-06-17 07:23:57 +00:00
Chandler Carruth	ecbd16829a	[PM/AA] Remove the UnknownSize static member from AliasAnalysis. This is now living in MemoryLocation, which is what it pertains to. It is also an enum there rather than a static data member which is left never defined. llvm-svn: 239886	2015-06-17 07:21:38 +00:00
Chandler Carruth	ac80dc7532	[PM/AA] Remove the Location typedef from the AliasAnalysis class now that it is its own entity in the form of MemoryLocation, and update all the callers. This is an entirely mechanical change. References to "Location" within AA subclases become "MemoryLocation", and elsewhere "AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps out-of-tree folks update. llvm-svn: 239885	2015-06-17 07:18:54 +00:00
Tyler Nowicki	27b2c39eb3	Refactor RecurrenceInstDesc Moved RecurrenceInstDesc into RecurrenceDescriptor to simplify the namespaces. llvm-svn: 239862	2015-06-16 22:59:45 +00:00
Philip Reames	c25df11614	Reapply 239795 - [InstCombine] Propagate non-null facts to call parameters The original change broke clang side tests. I will be submitting those momentarily. This change includes post commit feedback on the original change from from Pete Cooper. Original Submission comments: If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129. Differential Revision: http://reviews.llvm.org/D9132 llvm-svn: 239849	2015-06-16 20:24:25 +00:00
Tyler Nowicki	0a91310c7f	Rename Reduction variables/structures to Recurrence. A reduction is a special kind of recurrence. In the loop vectorizer we currently identify basic reductions. Future patches will extend this to identifying basic recurrences. llvm-svn: 239835	2015-06-16 18:07:34 +00:00
Philip Reames	1a6305f313	Revert 239795 I forgot to update some clang test cases. I'll fix and resubmit tomorrow. llvm-svn: 239800	2015-06-16 01:20:53 +00:00
Philip Reames	66ab0f045a	Move logic from JumpThreading into LazyValue info to simplify caller. This change is hopefully NFC. The only tricky part is that I changed the context instruction being used to the branch rather than the comparison. I believe both to be correct, but the branch is strictly more powerful. With the moved code, using the branch instruction is required for the basic block comparison test to return the same result. The previous code was able to directly access both the branch and the comparison where the revised code is not. Differential Revision: http://reviews.llvm.org/D9652 llvm-svn: 239797	2015-06-16 00:49:59 +00:00
Duncan P. N. Exon Smith	51149d5589	modules: Add explicit dependency on intrinsics_gen `LLVM_ENABLE_MODULES` builds sometimes fail because `Intrinsics.td` needs to regenerate `Instrinsics.h` before anyone can include anything from the LLVM_IR module. Represent the dependency explicitly to prevent that. llvm-svn: 239796	2015-06-16 00:44:12 +00:00
Philip Reames	dfc29fba60	[InstCombine] Propagate non-null facts to call parameters If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129. Differential Revision: http://reviews.llvm.org/D9132 llvm-svn: 239795	2015-06-16 00:43:54 +00:00
Peter Collingbourne	82437bf7a5	Protection against stack-based memory corruption errors using SafeStack This patch adds the safe stack instrumentation pass to LLVM, which separates the program stack into a safe stack, which stores return addresses, register spills, and local variables that are statically verified to be accessed in a safe way, and the unsafe stack, which stores everything else. Such separation makes it much harder for an attacker to corrupt objects on the safe stack, including function pointers stored in spilled registers and return addresses. You can find more information about the safe stack, as well as other parts of or control-flow hijack protection technique in our OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf) and our project website (http://levee.epfl.ch). The overhead of our implementation of the safe stack is very close to zero (0.01% on the Phoronix benchmarks). This is lower than the overhead of stack cookies, which are supported by LLVM and are commonly used today, yet the security guarantees of the safe stack are strictly stronger than stack cookies. In some cases, the safe stack improves performance due to better cache locality. Our current implementation of the safe stack is stable and robust, we used it to recompile multiple projects on Linux including Chromium, and we also recompiled the entire FreeBSD user-space system and more than 100 packages. We ran unit tests on the FreeBSD system and many of the packages and observed no errors caused by the safe stack. The safe stack is also fully binary compatible with non-instrumented code and can be applied to parts of a program selectively. This patch is our implementation of the safe stack on top of LLVM. The patches make the following changes: - Add the safestack function attribute, similar to the ssp, sspstrong and sspreq attributes. - Add the SafeStack instrumentation pass that applies the safe stack to all functions that have the safestack attribute. This pass moves all unsafe local variables to the unsafe stack with a separate stack pointer, whereas all safe variables remain on the regular stack that is managed by LLVM as usual. - Invoke the pass as the last stage before code generation (at the same time the existing cookie-based stack protector pass is invoked). - Add unit tests for the safe stack. Original patch by Volodymyr Kuznetsov and others at the Dependable Systems Lab at EPFL; updates and upstreaming by myself. Differential Revision: http://reviews.llvm.org/D6094 llvm-svn: 239761	2015-06-15 21:07:11 +00:00
Benjamin Kramer	258ea0dbdf	[Statepoints] Skip a vector copy when uniquing values. No functionality change intended. llvm-svn: 239688	2015-06-13 19:50:38 +00:00
Matt Wala	bfb5368cc7	Revert 239644. llvm-svn: 239650	2015-06-13 01:08:00 +00:00
Matt Wala	1f48192d7c	[Scalarizer] Fix potential for stale data in Scattered across invocations Summary: Scalarizer has two data structures that hold information about changes to the function, Gathered and Scattered. These are cleared in finish() at the end of runOnFunction() if finish() detects any changes to the function. However, finish() was checking for changes by only checking if Gathered was non-empty. The function visitStore() only modifies Scattered without touching Gathered. As a result, Scattered could have ended up having stale data if Scalarizer only scalarized store instructions. Since the data in Scattered is used during the execution of the pass, this introduced dangling pointer errors. The fix is to check whether both Scattered and Gathered are empty before deciding what to do in finish(). Reviewers: srhines Reviewed By: srhines Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10422 llvm-svn: 239644	2015-06-12 22:49:11 +00:00
Matt Wala	a4afccd8a8	Fix a typo in a comment in MemCpyOpt (test commit) llvm-svn: 239628	2015-06-12 18:16:51 +00:00
Alexander Potapenko	f90556efb8	[ASan] format AddressSanitizer.cpp with `clang-format -style=Google`, NFC llvm-svn: 239601	2015-06-12 11:27:06 +00:00

... 6 7 8 9 10 ...

13822 Commits