llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrew Kaylor	a237866faf	Fix a block copying problem in LICM Differential Revision: https://reviews.llvm.org/D44817 llvm-svn: 328336	2018-03-23 17:36:18 +00:00
Florian Hahn	f73c3ece7f	Revert r328307: [IPSCCP] Use constant range information for comparisons of parameters. Reverted for now, due to it causing verifier failures. llvm-svn: 328312	2018-03-23 12:49:39 +00:00
Florian Hahn	b1feec087e	[IPSCCP] Use constant range information for comparisons of parameters. For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 328307	2018-03-23 11:56:00 +00:00
David Blaikie	376294c23a	Finish moving the IPSCCP pass from Scalar to IPO - moving the registration llvm-svn: 328259	2018-03-22 22:07:53 +00:00
David Blaikie	3bbf5af0ac	Fix layering between SCCP and IPO SCCP Transforms/Scalar/SCCP.cpp implemented both the Scalar and IPO SCCP, but this meant Transforms/Scalar including Transfroms/IPO headers, creating a circular dependency. (IPO depends on Scalar already) - so move the IPO SCCP shims out into IPO and the basic library implementation accessible from Scalar/SCCP.h to be used from the IPO/SCCP.cpp implementation. llvm-svn: 328250	2018-03-22 21:41:29 +00:00
Anna Thomas	9b1176b0ef	[LoopPredication] Add profitability check based on BPI Summary: LoopPredication is not profitable when the loop is known to always exit through some block other than the latch block. A coarse grained latch check can cause loop predication to predicate the loop, and unconditionally deoptimize. However, without predicating the loop, the guard may never fail within the loop during the dynamic execution because the non-latch loop termination condition exits the loop before the latch condition causes the loop to exit. We teach LP about this using BranchProfileInfo pass. Reviewers: apilipenko, skatkov, mkazantsev, reames Reviewed by: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44667 llvm-svn: 328210	2018-03-22 16:03:59 +00:00
Florian Hahn	9bc0bc4b9b	[CallSiteSplitting] Preserve DominatorTreeAnalysis. The dominator tree analysis can be preserved easily. Some other kinds of analysis can probably be preserved too. Reviewers: junbuml, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43173 llvm-svn: 328206	2018-03-22 15:23:33 +00:00
David Blaikie	2be3922807	Fix a couple of layering violations in Transforms Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering. Transforms depends on Transforms/Utils, not the other way around. So remove the header and the "createStripGCRelocatesPass" function declaration (& definition) that is unused and motivated this dependency. Move Transforms/Utils/Local.h into Analysis because it's used by Analysis/MemoryBuiltins.cpp. llvm-svn: 328165	2018-03-21 22:34:23 +00:00
Daniel Neilson	6f1eb58e92	[MemCpyOpt] Update to new API for memory intrinsic alignment Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the MemCpyOpt pass to cease using: 1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. 2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. We also add a few tests to fill gaps in the testing of this pass. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 328097	2018-03-21 14:14:55 +00:00
Justin Lebar	038cbc5c13	Re-re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Backed out for causing performance regressions. Re-landing because we've determined that these regressions were noise. Original Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 328096	2018-03-21 14:08:21 +00:00
Xin Tong	a713ebea24	[MergeICmps] Break eargerly out of loop llvm-svn: 327972	2018-03-20 12:03:25 +00:00
Xin Tong	bdbd97ed9a	[MergeICmp] Fix a bug in entry block shuffled to middle of the chain Summary: Fix a bug in entry block shuffled to middle of the chain. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44642 llvm-svn: 327971	2018-03-20 11:57:54 +00:00
Anastasis Grammenos	3a589103a4	[LICM] Salvage DI from dying Instructions LICM deletes trivially dead instructions which it won't attempt to sink. Attempt to salvage debug values which reference these instructions. llvm-svn: 327800	2018-03-18 15:59:19 +00:00
Craig Topper	71d69b2ea5	[CorrelatedValuePropagation] Use SelectInst::getCondition/getTrueValue/getFalseValue instead of getOperand for readability. NFC llvm-svn: 327728	2018-03-16 18:18:47 +00:00
Brian M. Rzycki	f65ddc5fa2	[JumpThreading] Track unreachable BBs to avoid processing JumpThreading iterates over F until the IR quiesces. Transforming unreachable BBs increases compile time and it is also possible to never stabilize causing JumpThreading to hang. An older attempt at fixing this problem was D3991 where removeUnreachableBlocks(F) was called before JumpThreading began. This has a few drawbacks: * expensive - the routine attempts to fix up the IR to identify additional BBs that can be removed along with unreachable BBs. * aggressive - does not identify and preserve the shape of the IR. At a minimum it does not preserve loop hierarchies. * invasive - altering reachable blocks it may disrupt IR shapes that could have otherwise been JumpThreaded. This patch avoids removeUnreachableBlocks(F) and instead tracks unreachable BBs in a SmallPtrSet using DominatorTree to validate the initial state of all BBs. We then rely on subsequent passes to identify and remove these unreachable blocks from F. Reviewers: dberlin, sebpop, kuhar, dinesh.d Reviewed by: sebpop, kuhar Subscribers: hiraditya, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D44177 llvm-svn: 327713	2018-03-16 15:13:47 +00:00
Florian Hahn	fc97b6173f	[LoopUnroll] Peel off iterations if it makes conditions true/false. If the loop body contains conditions of the form IndVar < #constant, we can remove the checks by peeling off #constant iterations. This improves codegen for PR34364. Reviewers: mkuper, mkazantsev, efriedma Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43876 llvm-svn: 327671	2018-03-15 21:34:43 +00:00
Philip Reames	422024a1b7	[EarlyCSE] Don't hide earler invariant.scopes If we've already established an invariant scope with an earlier generation, we don't want to hide it in the scoped hash table with one with a later generation. I noticed this when working on the invariant-load handling, but it also applies to the invariant.start case as well. Without this change, my previous patch for invariant-load regresses some cases, so I'm pushing this without waiting for review. This is why you don't make last minute tweaks to patches to catch "obvious cases" after it's already been reviewed. Bad Philip! llvm-svn: 327655	2018-03-15 18:12:27 +00:00
Philip Reames	ca587fe0b4	[EarlyCSE] Reuse invariant scopes for invariant load This is a follow up to https://reviews.llvm.org/D43716 which rewrites the invariant load handling using the new infrastructure. It's slightly more powerful, but only in somewhat minor ways for the moment. It's not clear that DSE of stores to invariant locations is actually interesting since why would your IR have such a construct to start with? Note: The submitted version is slightly different than the reviewed one. I realized the scope could start for an invariant load which was proven redundant and removed. Added a test case to illustrate that as well. Differential Revision: https://reviews.llvm.org/D44497 llvm-svn: 327646	2018-03-15 17:29:32 +00:00
Fedor Sergeev	194a407bda	[New PM][IRCE] port of Inductive Range Check Elimination pass to the new pass manager There are two nontrivial details here: * Loop structure update interface is quite different with new pass manager, so the code to add new loops was factored out * BranchProbabilityInfo is not a loop analysis, so it can not be just getResult'ed from within the loop pass. It cant even be queried through getCachedResult as LoopCanonicalization sequence (e.g. LoopSimplify) might invalidate BPI results. Complete solution for BPI will likely take some time to discuss and figure out, so for now this was partially solved by making BPI optional in IRCE (skipping a couple of profitability checks if it is absent). Most of the IRCE tests got their corresponding new-pass-manager variant enabled. Only two of them depend on BPI, both marked with TODO, to be turned on when BPI starts being available for loop passes. Reviewers: chandlerc, mkazantsev, sanjoy, asbirlea Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43795 llvm-svn: 327619	2018-03-15 11:01:19 +00:00
Andrei Elovikov	f9b8035f3c	[LoopUnroll] Ignore ephemeral values when checking full unroll profitability. Summary: Before this patch call graph is like this in the LoopUnrollPass: tryToUnrollLoop ApproximateLoopSize collectEphemeralValues /* Use collected ephemeral values / computeUnrollCount analyzeLoopUnrollCost / Bail out from the analysis if loop contains CallInst / This patch moves collection of the ephemeral values to the tryToUnrollLoop function and passes the collected values into both ApproximateLoopsize (as before) and additionally starts using them in analyzeLoopUnrollCost: tryToUnrollLoop collectEphemeralValues ApproximateLoopSize(EphValues) / Use EphValues / computeUnrollCount(EphValues) analyzeLoopUnrollCost(EphValues) / Ignore ephemeral values - they don't contribute to the final cost / / Bail out from the analysis if loop contains CallInst */ Reviewers: mzolotukhin, evstupac, sanjoy Reviewed By: evstupac Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43931 llvm-svn: 327617	2018-03-15 09:59:15 +00:00
Philip Reames	0adbb19409	[EarlyCSE] Exploit open ended invariant.start scopes If we have an invariant.start with no corresponding invariant.end, then the memory location becomes invariant indefinitely after the invariant.start. As a result, anything dominated by the start is guaranteed to see the value the memory location had when the invariant.start executed. This patch adds an AvailableInvariants table which tracks the generation a particular memory location became invariant and then uses that information to allow value forwarding that would otherwise be disallowed by potentially aliasing stores. (Reminder: In EarlyCSE everything clobbers everything by default.) This should be compatible with the MemorySSA variant, but design is generational. We can and should add first class support for invariant.start within MemorySSA at a later time. I took a quick look at doing so, but probably need some input from a MemorySSA expert. Differential Revision: https://reviews.llvm.org/D43716 llvm-svn: 327577	2018-03-14 21:35:06 +00:00
Anna Thomas	5ac72f94f3	Test Commit NFC. Updated comment llvm-svn: 327436	2018-03-13 19:38:45 +00:00
Daniel Neilson	41e781d5f1	[SROA] Take advantage of separate alignments for memcpy source and destination Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the SROA pass to cease using the old getAlignment() & setAlignment() APIs of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. This allows us to enhance visitMemTransferInst to be more aggressive setting the alignment in memcpy calls that it creates, as well as to only change the alignment of a memcpy/memmove argument that it replaces. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: chandlerc, bollu, efriedma Reviewed By: efriedma Subscribers: efriedma, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D42974 llvm-svn: 327398	2018-03-13 14:25:33 +00:00
Clement Courbet	9f0b3170bc	[MergeICmps] Make sure that the comparison only has one use. Summary: Fixes PR36557. Reviewers: trentxintong, spatel Subscribers: mstorsjo, llvm-commits Differential Revision: https://reviews.llvm.org/D44083 llvm-svn: 327372	2018-03-13 07:05:55 +00:00
Vedant Kumar	3a408538f0	Remove the LoopInstSimplify pass (-loop-instsimplify) LoopInstSimplify is unused and untested. Reading through the commit history the pass also seems to have a high maintenance burden. It would be best to retire the pass for now. It should be easy to recover if we need something similar in the future. Differential Revision: https://reviews.llvm.org/D44053 llvm-svn: 327329	2018-03-12 20:49:42 +00:00
Craig Topper	3b4ad9c12d	[CallSiteSplitting] Use !Instruction::use_empty instead of checking for a non-zero return from getNumUses getNumUses is a linear operation. It walks a linked list to get a count. So in this case its better to just ask if there are any users rather than how many. llvm-svn: 327314	2018-03-12 18:40:59 +00:00
Justin Lebar	24b6640b1b	Back out "Re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions." This reverts r326908, originally landed as D44102. Reverted for causing performance regressions on x86. (These regressions are not yet understood.) llvm-svn: 327252	2018-03-12 09:26:09 +00:00
Renato Golin	038ede2a16	[NFC] Consolidate six getPointerOperand() utility functions into one place There are six separate instances of getPointerOperand() utility. LoopVectorize.cpp has one of them, and I don't want to create a 7th one while I'm trying to move LoopVectorizationLegality into a separate file (eventual objective is to move it to Analysis tree). See http://lists.llvm.org/pipermail/llvm-dev/2018-February/120999.html for llvm-dev discussions Closes D43323. Patch by Hideki Saito <hideki.saito@intel.com>. llvm-svn: 327173	2018-03-09 21:05:58 +00:00
Chad Rosier	95d9ccb2a0	[JumpThreading] Don't restrict cast-traversal to i1 In r263618, JumpThreading learned to look trough simple cast instructions, but only if the source of those cast instructions was a phi/cmp i1 (in an effort to limit compile time effects). I think this condition is too restrictive. For switches with limited value range, InstCombine will readily introduce an extra trunc instruction to a smaller integer type (e.g. from i8 to i2), leaving us in the somewhat perverse situation that jump-threading would work before running instcombine, but not after. Since instcombine produces this pattern, I think we need to consider it canonical and support it in JumpThreading. In general, for limiting recursion, I think the existing restriction to phi and cmp nodes should be sufficient to avoid looking through unprofitable chains of instructions. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42262 llvm-svn: 327150	2018-03-09 16:43:46 +00:00
Philip Reames	fbffd126b8	[NFC] Factor out a helper function for checking if a block has a potential early implicit exit. llvm-svn: 327065	2018-03-08 21:25:30 +00:00
Justin Lebar	eccfbf1bcd	Re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Backed out for failing an assert in clang bootstrap builds. Re-landing with a fix for handling non-power-of-two inputs (e.g. udiv i24). Original Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 326908	2018-03-07 16:56:49 +00:00
Justin Lebar	eeeb0eb049	Revert rL326898: "Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions." Breaks bootstrap builds: clang built with this patch asserts while building MCDwarf.cpp: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. llvm-svn: 326900	2018-03-07 16:05:43 +00:00
Justin Lebar	cb9e89c39b	Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Reviewers: spatel, sanjoy Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 326898	2018-03-07 15:11:13 +00:00
Evgeny Stupachenko	204ade4102	Add early exit on reassociation of 0 expression. Summary: Before the patch a try to reassociate ((v * 16) * 0) * 1 fall into infinite loop Reviewers: pankajchawla Differential Revision: http://reviews.llvm.org/D41467 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 326861	2018-03-07 02:17:08 +00:00
Sebastian Pop	bf6e1c26cf	DA: remove uses of GEP, only ask SCEV It's been quite some time the Dependence Analysis (DA) is broken, as it uses the GEP representation to "identify" multi-dimensional arrays. It even wrongly detects multi-dimensional arrays in single nested loops: from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6 ;; for (long int i = 0; i < 50; i++) { ;; A[i][3i - 6] = i; ;; B++ = A[i][i]; DA used to detect two subscripts, which makes no sense in the LLVM IR or in C/C++ semantics, as there are no guarantees as in Fortran of subscripts not overlapping into a next array dimension: maximum nesting levels = 1 SrcPtrSCEV = %A DstPtrSCEV = %A using GEPs subscript 0 src = {0,+,1}<nuw><nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} subscript 1 src = {-6,+,3}<nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} Separable = {} Coupled = {1} With the current patch, DA will correctly work on only one dimension: maximum nesting levels = 1 SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body> DstSCEV = {%A,+,404}<%for.body> subscript 0 src = {(-2424 + %A)<nsw>,+,1212}<%for.body> dst = {%A,+,404}<%for.body> class = 1 loops = {1} Separable = {0} Coupled = {} This change removes all uses of GEP from DA, and we now only rely on the SCEV representation. The patch does not turn on -da-delinearize by default, and so the DA analysis will be more conservative in the case of multi-dimensional memory accesses in nested loops. I disabled some interchange tests, as the DA is not able to disambiguate the dependence anymore. To make DA stronger, we may need to compute a bound on the number of iterations based on the access functions and array dimensions. The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to avoid checking for snippets of LLVM IR: this form of checking is very hard to maintain. Instead, we now check for output of the pass that are more meaningful than dozens of lines of LLVM IR. Some tests now require -debug messages and thus only enabled with asserts. Patch written by Sebastian Pop and Aditya Kumar. Differential Revision: https://reviews.llvm.org/D35430 llvm-svn: 326837	2018-03-06 21:55:59 +00:00
Florian Hahn	517dc51c48	[CallSiteSplitting] Do not crash when BB's terminator changes. Change doCallSiteSplitting to iterate until we reach the terminator instruction. tryToSplitCallSite can replace BB's terminator in case BB is a successor of itself. Then IE will be invalidated and we also have to check the current terminator. Reviewers: junbuml, davidxl, davide, fhahn Reviewed By: fhahn, junbuml Differential Revision: https://reviews.llvm.org/D43824 llvm-svn: 326793	2018-03-06 14:00:58 +00:00
Xin Tong	8fd561f572	[MergeICmp] Simplify how BCECmpBlock instructions are blacklisted llvm-svn: 326761	2018-03-06 02:24:02 +00:00
Xin Tong	98af9efca5	[MergeICmp] Fix printing. NFC llvm-svn: 326760	2018-03-06 02:04:57 +00:00
Daniel Neilson	82daad31fe	[RewriteStatepoints] Fix stale parse points Summary: RewriteStatepointsForGC collects parse points for further processing. During the collection if a callsite is found in an unreachable block (DominatorTree::isReachableFromEntry()) then all unreachable blocks are removed by removeUnreachableBlocks(). Some of the removed blocks could have been reachable according to DominatorTree::isReachableFromEntry(). In this case the collected parse points became stale and resulted in a crash when accessed. The fix is to unconditionally canonicalize the IR to removeUnreachableBlocks and then collect the parse points. The added test crashes with the old version and passes with this patch. Patch by Yevgeny Rouban! Reviewed by: Anna Differential Revision: https://reviews.llvm.org/D43929 llvm-svn: 326748	2018-03-05 22:27:30 +00:00
Florian Hahn	0b7c6422fb	[IPSCCP] Add getCompare which returns either true, false, undef or null. getCompare returns true, false or undef constants if the comparison can be evaluated, or nullptr if it cannot. This is in line with what ConstantExpr::getCompare returns. It also allows us to use ConstantExpr::getCompare for comparing constants. Reviewers: davide, mssimpso, dberlin, anna Reviewed By: davide Differential Revision: https://reviews.llvm.org/D43761 llvm-svn: 326720	2018-03-05 17:33:50 +00:00
Sanjay Patel	53ffabdfcb	[CVP] fix formatting; NFC llvm-svn: 326711	2018-03-05 16:08:34 +00:00
Xin Tong	8345c0e3a5	[MergeICmp] We can discard initial blocks that do other work Summary: We can discard initial blocks that do other work We do not need to limit ourselves to just the first block in the chain. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44029 llvm-svn: 326698	2018-03-05 13:54:47 +00:00
Clement Courbet	34be1b0288	[MergeICmps][NFC] Improve logging. llvm-svn: 326683	2018-03-05 08:21:47 +00:00
Fedor Indutny	364b9c2adb	[CallSiteSplitting] fix use after-free Iterating through predecessors of `TailBB` while removing their terminators leads to use after-free, because the predecessor list is changing on each removal. llvm-svn: 326668	2018-03-03 22:34:38 +00:00
Fedor Indutny	f9e09c1dd0	[CallSiteSplitting] properly split musttail calls Summary: `musttail` calls can't be naively splitted. The split blocks must include not only the call instruction itself, but also (optional) `bitcast` and `return` instructions that follow it. Clone `bitcast` and `ret`, place them into the split blocks, and remove the tail block when done. Reviewers: junbuml, mcrosier, davidxl, davide, fhahn Reviewed By: fhahn Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D43729 llvm-svn: 326666	2018-03-03 21:40:14 +00:00
Yaxun Liu	3c42f1c3c9	LoopUnroll: respect pragma unroll when AllowRemainder is disabled Currently when AllowRemainder is disabled, pragma unroll count is not respected even though there is no remainder. This bug causes a loop fully unrolled in many cases even though the user specifies a unroll count. Especially it affects OpenCL/CUDA since in many cases a loop contains convergent instructions and currently AllowRemainder is disabled for such loops. Differential Revision: https://reviews.llvm.org/D43826 llvm-svn: 326585	2018-03-02 16:22:32 +00:00
Benjamin Kramer	d1cf7ff5ab	[SCCP] Fix unused variable warning in release builds. llvm-svn: 326429	2018-03-01 11:31:44 +00:00
Reid Kleckner	3762a089d7	[IPSCCP] do not break musttail invariant (PR36485) Do not replace results of `musttail` calls with a constant if the call itself can't be removed. Do not zap returns of `musttail` callees, if the call site can't be removed and replaced with a constant. Do not zap returns of `musttail`-calling blocks, this breaks invariant too. Patch by Fedor Indutny Differential Revision: https://reviews.llvm.org/D43695 llvm-svn: 326404	2018-03-01 01:19:18 +00:00
Xin Tong	256869d8bc	Fix typo. NFC llvm-svn: 326319	2018-02-28 12:09:53 +00:00
Xin Tong	8ba674e43b	[MergeICmp] Fix a bug in MergeICmp that can lead to a block being processed more than once. Summary: Fix a bug in MergeICmp that can lead to a BCECmp block being processed more than once and eventually lead to a broken LLVM module. The problem is that if the non-constant value is not produced by the last block, the producer will be processed once when the its parent block is processed and second time when the last block is processed. We end up having 2 same BCECmpBlock in the merge queue. And eventually lead to a broken LLVM module. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43825 llvm-svn: 326318	2018-02-28 12:08:00 +00:00
David Green	7c35de124a	[Dominators] Remove verifyDomTree and add some verifying for Post Dom Trees Removes verifyDomTree, using assert(verify()) everywhere instead, and changes verify a little to always run IsSameAsFreshTree first in order to print good output when we find errors. Also adds verifyAnalysis for PostDomTrees, which will allow checking of PostDomTrees it the same way we check DomTrees and MachineDomTrees. Differential Revision: https://reviews.llvm.org/D41298 llvm-svn: 326315	2018-02-28 11:00:08 +00:00
Florian Hahn	1807c516c7	[NewGVN] Update phi-of-ops def block when updating existing ValuePHI. In case we update a ValuePHI node created earlier, we could update it based on a different OpPHI which could be in a different block. We need to update the TempToBlock mapping reflecting the new block, otherwise we would end up placing the new phi node in a wrong block. This problem is exposed by the test case in https://bugs.llvm.org/show_bug.cgi?id=36504. This patch fixes a slightly simpler problem than in the bug report. In the bug's re-producer, the additional problem is that we are re-using a ValuePHI node with to few incoming values for the new OpPHI. If this patch makes sense, I will follow it up with a patch that creates a new PHI node if the existing PHI node has a different number of incoming values. Reviewers: davide, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43770 llvm-svn: 326181	2018-02-27 09:34:51 +00:00
Florian Hahn	a1822cbabc	[LoopInterchange] Loops with empty dependency matrix are safe. The dependency matrix is only empty if no conflicting load/store instructions have been found. In that case, it is safe to interchange. For the LLVM test-suite, after this change around 1900 loops are interchanged, whereas it is 15 before this change. On cortex-a57, this gives an improvement of -0.57% on the geomean execution time of SPEC2006, SPEC2000 and the test-suite. There are a few small perf regressions, but I think we can improve on those by making the cost model better. Reviewers: karthikthecool, mcrosier Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D43236 llvm-svn: 326077	2018-02-26 10:45:25 +00:00
Adam Nemet	e4e1de60aa	Revert "StructurizeCFG: Test for branch divergence correctly" This reverts commit r325881. Breaks many bots llvm-svn: 326037	2018-02-24 17:29:09 +00:00
Nicolai Haehnle	43c1115cd4	StructurizeCFG: Test for branch divergence correctly Summary: This fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Reviewers: arsenm, rampitec, jlebar Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D40546 llvm-svn: 325881	2018-02-23 10:45:46 +00:00
Bjorn Steinbrink	983d6c3f18	Mark MergedLoadStoreMotion as not preserving MemDep results Summary: MemDep caches results that signify that a dependence is non-local, and there is currently no way to invalidate such cache entries. Unfortunately, when MLSM sinks a store that can result in a non-local dependence becoming a local one, and then MemDep gives wrong answers. The easiest way out here is to just say that MLSM does indeed not preserve MemDep results. Reviewers: davide, Gerolf Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43177 llvm-svn: 325880	2018-02-23 10:41:57 +00:00
Daniel Neilson	20c9207be3	[AlignmentFromAssumptions] Set source and dest alignments of memory intrinsiscs separately Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment API of MemoryIntrinsic in favour of getting/setting source & dest specific alignments through the new API. This allows us to simplify some of the code in this pass and also be more aggressive about setting the source and destination alignments separately. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: hfinkel, bollu, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D43081 llvm-svn: 325816	2018-02-22 18:55:59 +00:00
Vedant Kumar	56492f9177	[BDCE] Salvage debug info from dying insts This results in 15 additional unique source variables in a stage2 build of FileCheck (at '-Os -g'), with a negligible increase in the size of the .debug_loc section. llvm-svn: 325660	2018-02-21 01:55:33 +00:00
Sanjoy Das	737fa40ffa	[DSE] Don't DSE stores that subsequent memmove calls read from Summary: We used to remove the first memmove in cases like this: memmove(p, p+2, 8); memmove(p, p+2, 8); which is incorrect. Fix this by changing isPossibleSelfRead to what was most likely the intended behavior. Historical note: the buggy code was added in https://reviews.llvm.org/rL120974 to address PR8728. Reviewers: rsmith Subscribers: mcrosier, llvm-commits, jlebar Differential Revision: https://reviews.llvm.org/D43425 llvm-svn: 325641	2018-02-20 23:19:34 +00:00
Brian M. Rzycki	f1a7df5ef2	[JumpThreading] PR36133 enable/disable DominatorTree for LVI analysis Summary: The LazyValueInfo pass caches a copy of the DominatorTree when available. Whenever there are pending DominatorTree updates within JumpThreading's DeferredDominance object we cannot use the cached DT for LVI analysis. This commit adds the new methods enableDT() and disableDT() to LVI. JumpThreading also sets the appropriate usage model before calling LVI analysis methods. Fixes https://bugs.llvm.org/show_bug.cgi?id=36133 Reviewers: sebpop, dberlin, kuhar Reviewed by: sebpop, kuhar Subscribers: uabelho, llvm-commits, aprantl, hiraditya, a.elovikov Differential Revision: https://reviews.llvm.org/D42717 llvm-svn: 325356	2018-02-16 16:35:17 +00:00
Ivan A. Kosarev	53270d0fa6	[Transforms] Propagate TBAA info in SROA Now that we have the new TBAA metadata format that is capable of representing accesses to aggregates, we can propagate TBAA access tags from memory setting and transferring intrinsics to load and store instructions and vice versa. Since SROA produces lots of new loads and stores on optimized builds, this change significantly decreases the share of undecorated memory accesses on such builds. Differential Revision: https://reviews.llvm.org/D41563 llvm-svn: 325329	2018-02-16 10:10:29 +00:00
Vedant Kumar	616fdb00df	[GVN] Partially revert debug info salvage change (r325063) In r325063, we salvaged debug values from dying instructions in GVN::processBlock() and GVN::performScalarPRE(). The change in performScalarPRE(), while correct, is unhelpful. It introduced a call to salvageDebugInfo() which was immediately followed by a RAUW, meaning it prevented the RAUW from efficiently updating dbg.value intrinsics. This commit reverts the mistake and tightens up the affected test case. llvm-svn: 325308	2018-02-16 01:15:20 +00:00
Vedant Kumar	1df820ecd7	[DCE] Salvage debug info from dead insts This results in small increases in the size of the .debug_loc section and the number of unique source variables in a stage2 build of opt. llvm-svn: 325301	2018-02-15 22:26:18 +00:00
David Green	0d5f9651f2	Move llvm::computeLoopSafetyInfo from LICM.cpp to LoopUtils.cpp. NFC Move computeLoopSafetyInfo, defined in Transforms/Utils/LoopUtils.h, into the corresponding LoopUtils.cpp, as opposed to LICM where it resides at the moment. This will allow other functions from Transforms/Utils to reference it. llvm-svn: 325151	2018-02-14 18:34:53 +00:00
Florian Hahn	b4e3bad89b	Recommit r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call. For basic blocks with instructions between the beginning of the block and a call we have to duplicate the instructions before the call in all split blocks and add PHI nodes for uses of the duplicated instructions after the call. Currently, the threshold for the number of instructions before a call is quite low, to keep the impact on binary size low. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41860 llvm-svn: 325126	2018-02-14 13:59:12 +00:00
Florian Hahn	c6296fea3f	[LoopInterchange] Incrementally update the dominator tree. We can use incremental dominator tree updates to avoid re-calculating the dominator tree after interchanging 2 loops. Reviewers: dmgreen, kuhar Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D43176 llvm-svn: 325122	2018-02-14 13:13:15 +00:00
Elena Demikhovsky	945b7e5aa6	Adding a width of the GEP index to the Data Layout. Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout. p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional, if not specified, it is equal to the pointer size. Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width. It works fine if you can convert pointer to integer for address calculation and all registered targets do this. But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size. Differential Revision: https://reviews.llvm.org/D42123 llvm-svn: 325102	2018-02-14 06:58:08 +00:00
Vedant Kumar	1d5d31b706	[GVN] Salvage debug info from dead insts This preserves an additional 581 unique source variables in a stage2 build of clang (according to `llvm-dwarfdump --statistics`). It increases the size of the .debug_loc section by 0.1% (or 87139 bytes). Differential Revision: https://reviews.llvm.org/D43255 llvm-svn: 325063	2018-02-13 22:27:17 +00:00
Vedant Kumar	35fc103e1e	[DeadStoreElimination] Salvage debug info from dead insts According to `llvm-dwarfdump --statistics` this salvages 43 additional unique source variables in a stage2 build of clang. It increases the size of the .debug_loc section by 0.002% (or 2864 bytes). Differential Revision: https://reviews.llvm.org/D43220 llvm-svn: 325035	2018-02-13 18:15:26 +00:00
Florian Hahn	35d744d388	Revert r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call. Due to memsan not being happy with the array of ValueToValue maps. llvm-svn: 325009	2018-02-13 14:48:39 +00:00
Florian Hahn	348b48ac6b	[CallSiteSplitting] Clear ValueToValue maps. llvm-svn: 325006	2018-02-13 14:17:00 +00:00
Florian Hahn	78bddd4cca	[CallSiteSplitting] Dereference pointer earlier. This should make the sanitizers happy. llvm-svn: 325004	2018-02-13 13:51:51 +00:00
Florian Hahn	b0884b6443	[CallSiteSplitting] Support splitting of blocks with instrs before call. For basic blocks with instructions between the beginning of the block and a call we have to duplicate the instructions before the call in all split blocks and add PHI nodes for uses of the duplicated instructions after the call. Currently, the threshold for the number of instructions before a call is quite low, to keep the impact on binary size low. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41860 llvm-svn: 325001	2018-02-13 12:00:48 +00:00
Florian Hahn	1f95ef1815	[LoopInterchange] Check number of latch successors before accessing them. In cases where the OuterMostLoopLatchBI only has a single successor, accessing the second successor will fail. This fixes a failure when building the test-suite with loop-interchange enabled. Reviewers: mcrosier, karthikthecool, davide Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D42906 llvm-svn: 324994	2018-02-13 10:02:52 +00:00
Adam Nemet	031a00c660	Revert "[LSR] Avoid UB overflow when examining reuse opportunities" This reverts commit r324943. Breaking bots, reverting for Gerolf. llvm-svn: 324958	2018-02-12 22:42:13 +00:00
Gerolf Hoflehner	edcd564820	[LSR] Avoid UB overflow when examining reuse opportunities llvm-svn: 324943	2018-02-12 21:49:32 +00:00
Jun Bum Lim	144eb593dd	[LICM] update BlockColors after splitting predecessors Update BlockColors after splitting predecessors. Do not allow splitting EHPad for sinking when the BlockColors is not empty, so we can simply assign predecessor's color to the new block. Fixes PR36184 llvm-svn: 324916	2018-02-12 17:56:55 +00:00
Florian Hahn	e54a20e094	[LoopInterchange] Simplify splitInnerLoopHeader logic (NFC). We can use SplitBlock for both cases, which makes the code slightly simpler and updates both LoopInfo and the dominator tree. llvm-svn: 324881	2018-02-12 11:10:58 +00:00
Max Kazantsev	b57ca09e43	[NFC] Fix typos llvm-svn: 324867	2018-02-12 05:16:28 +00:00
Serguei Katkov	3cb4c34a4e	Rename and move utility function getLatchPredicateForGuard. NFC. Rename getLatchPredicateForGuard to more common name getFlippedStrictnessPredicate and move it to ICmpInst class. llvm-svn: 324717	2018-02-09 07:59:07 +00:00
Daniel Neilson	fb99a493be	[LoopIdiom] Be more aggressive when setting alignment in memcpy Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the LoopIdiom pass to cease using the old IRBuilder CreateMemCpy single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. This allows us to be slightly more aggressive in setting the alignment of memcpy calls that loop idiom creates. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324626	2018-02-08 17:33:08 +00:00
Alexander Ivchenko	836eac3e8d	Add missed PostDominatorTree analysis dependency to GVN hoist pass. Summary: GVN hoist pass is using PostDominatorTree analysis, therefore the analysis should be listed in the pass initialization as a dependency. Reviewed By: sebpop Differential Revision: https://reviews.llvm.org/D43007 Author: ashlykov <arkady.shlykov@intel.com> llvm-svn: 324597	2018-02-08 11:45:36 +00:00
Serguei Katkov	c8016e7a65	[Loop Predication] Teach LP about reverse loops with uge and sge latch conditions Add support of uge and sge latch condition to Loop Prediction for reverse loops. Reviewers: apilipenko, mkazantsev, sanjoy, anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42837 llvm-svn: 324589	2018-02-08 10:34:08 +00:00
Serguei Katkov	ebc9031b88	[LoopPrediction] Introduce utility function getLatchPredicateForGuard. NFC. Factor out getting the predicate for latch condition in a guard to utility function getLatchPredicateForGuard. llvm-svn: 324450	2018-02-07 06:53:37 +00:00
Daniel Neilson	83cdf6827e	[DSE] Upgrade uses of MemoryIntrinic::getAlignment() to new API. (NFC) Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the DeadStoreElimination pass to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting dest specific alignments through the new API. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324402	2018-02-06 21:18:33 +00:00
Daniel Neilson	5fdf08f7c9	[InferAddressSpaces] Update uses of IRBuilder memory intrinsic creation to new API Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the InferAddressSpaces pass to cease using: 1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. 2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324395	2018-02-06 20:33:36 +00:00
Clement Courbet	a7a1746865	[MergeICmps] Handle chains with several complex BCE basic blocks. - Fix condition for detecting that a complex basic block was the first in the chain. - Add tests. This was caught by buildbots when submitting rL324319. llvm-svn: 324341	2018-02-06 12:25:33 +00:00
Clement Courbet	c2109c8af6	[MergeICmps][NFC] Add more assertions. llvm-svn: 324323	2018-02-06 09:14:00 +00:00
Sanjay Patel	d7c702b451	[LoopStrengthReduce, x86] don't add cost for a cmp that will be macro-fused (PR35681) In the motivating case from PR35681 and represented by the macro-fuse-cmp test: https://bugs.llvm.org/show_bug.cgi?id=35681 ...there's a 37 -> 31 byte size win for the loop because we eliminate the big base address offsets. SPEC2017 on Ryzen shows no significant perf difference. Differential Revision: https://reviews.llvm.org/D42607 llvm-svn: 324289	2018-02-05 23:43:05 +00:00
Serguei Katkov	ec7029c286	Re-apply [SCEV] Fix isLoopEntryGuardedByCond usage ScalarEvolution::isKnownPredicate invokes isLoopEntryGuardedByCond without check that SCEV is available at entry point of the loop. It is incorrect and fixed by patch. To bugs additionally fixed: assert is moved after the check whether loop is not a nullptr. Usage of isLoopEntryGuardedByCond in ScalarEvolution::isImpliedCondOperandsViaNoOverflow is guarded by isAvailableAtLoopEntry. Reviewers: sanjoy, mkazantsev, anna, dorit, reames Reviewed By: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42417 llvm-svn: 324204	2018-02-05 05:49:47 +00:00
Mikael Holmen	6d06976e74	[LSR] Don't force bases of foldable formulae to the final type. Summary: Before emitting code for scaled registers, we prevent SCEVExpander from hoisting any scaled addressing mode by emitting all the bases first. However, these bases are being forced to the final type, resulting in some odd code. For example, if the type of the base is an integer and the final type is a pointer, we will emit an inttoptr for the base, a ptrtoint for the scale, and then a 'reverse' GEP where the GEP pointer is actually the base integer and the index is the pointer. It's more intuitive to use the pointer as a pointer and the integer as index. Patch by: Bevin Hansson Reviewers: atrick, qcolombet, sanjoy Reviewed By: qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42103 llvm-svn: 323946	2018-02-01 06:38:34 +00:00
Marek Olsak	8e7d149a31	[SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPs Summary: !amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things happen. Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D42744 llvm-svn: 323907	2018-01-31 20:17:52 +00:00
Daniel Neilson	594f443b06	[RS4GC] Handle call/invoke instructions as base defining values of vectors Summary: There's an asymmetry in the definitions of findBaseDefiningValueOfVector() and findBaseDefiningValue() of RS4GC. The later handles call and invoke instructions, and the former does not. This appears to be simple oversight. This patch remedies the oversight by adding the call and invoke cases to findBaseDefiningValueOfVector(). Reviewers: DaniilSuchkov, anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42653 llvm-svn: 323764	2018-01-30 14:43:41 +00:00
Sanjay Patel	1aef27f5cd	[DSE] make sure memory is not modified before partial store merging (PR36129) We missed a critical check in D30703. We must make sure that no intermediate store is sitting between the stores that we want to merge. This should fix: https://bugs.llvm.org/show_bug.cgi?id=36129 Differential Revision: https://reviews.llvm.org/D42663 llvm-svn: 323759	2018-01-30 13:53:59 +00:00
Brian M. Rzycki	994e889022	[JumpThreading][NFC] Rename LoadInst variables Summary: The JumpThreading pass has several locations where to the variable name LI refers to a LoadInst type. This is confusing and inhibits the ability to use LI for LoopInfo as a member of the JumpThreading class. Minor formatting and comments were also altered to reflect this change. Reviewers: dberlin, kuba, spop, sebpop Reviewed by: sebpop Subscribers: sebpop, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42601 llvm-svn: 323695	2018-01-29 21:29:44 +00:00
Davide Italiano	8b797a0fd2	[CVP] Don't Replace incoming values from unreachable blocks with undef. This pretty much reverts r322006, except that we keep the test, because we work around the issue exposed in a different way (a recursion limit in value tracking). There's still probably some sequence that exposes this problem, and the proper way to fix that for somebody who has time is outlined in the code review. llvm-svn: 323630	2018-01-29 05:59:55 +00:00
Hiroshi Inoue	c8e9245816	[NFC] fix trivial typos in comments and documents "to to" -> "to" llvm-svn: 323628	2018-01-29 05:17:03 +00:00
Daniil Fukalov	6e1dc68117	[AMDGPU] fix LDS f32 intrinsics - using qualified pointer addrspace in intrinsics class to avoid .f32 mangling - changed too common atomic mangling to ds - added missing intrinsics to AMDGPUTTIImpl::getTgtMemIntrinsic Reviewed by: b-sumner Differential Revision: https://reviews.llvm.org/D42383 llvm-svn: 323516	2018-01-26 11:09:38 +00:00
Florian Hahn	212afb9fd9	[CallSiteSplitting] Fix infinite loop when recording conditions. Fix infinite loop when recording conditions by correctly marking basic blocks as visited. Fixes https://bugs.llvm.org/show_bug.cgi?id=36105 llvm-svn: 323515	2018-01-26 10:36:50 +00:00
Hiroshi Inoue	0909ca132f	[NFC] fix trivial typos in comments and documents "in in" -> "in", "on on" -> "on" etc. llvm-svn: 323508	2018-01-26 08:15:29 +00:00
Vedant Kumar	6bfc869cf7	[Debug] Add a utility to propagate dbg.value to new PHIs, NFC This simply moves an existing utility to Utils for reuse. Split out of: https://reviews.llvm.org/D42551 Patch by Matt Davis! llvm-svn: 323471	2018-01-25 21:37:05 +00:00
Amjad Aboud	f1f57a3137	Another try to commit 323321 (aggressive instruction combine). llvm-svn: 323416	2018-01-25 12:06:32 +00:00
Nicolai Haehnle	4afb64e4c6	Revert r321751, "StructurizeCFG: Fix broken backedge detection" It causes regressions in various OpenGL test suites. Keep the test cases introduced by r321751 as XFAIL, and add a test case for the regression. Change-Id: I90b4cc354f68cebe5fcef1f2422dc8fe1c6d3514 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36015 llvm-svn: 323355	2018-01-24 18:02:05 +00:00
Amjad Aboud	d53504e379	Reverted 323321. llvm-svn: 323326	2018-01-24 14:48:49 +00:00
Amjad Aboud	e4453233d7	[InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine). Combine expression patterns to form expressions with fewer, simple instructions. This pass does not modify the CFG. For example, this pass reduce width of expressions post-dominated by TruncInst into smaller width when applicable. It differs from instcombine pass in that it contains pattern optimization that requires higher complexity than the O(1), thus, it should run fewer times than instcombine pass. Differential Revision: https://reviews.llvm.org/D38313 llvm-svn: 323321	2018-01-24 12:42:42 +00:00
Max Kazantsev	0f720e1296	[NFC] Remove overconfident assert from IRCE This patch removes assert that SCEV is able to prove that a value is non-negative. In fact, SCEV can sometimes be unable to do this because its cache does not update properly. This assert will be returned once this problem is resolved. llvm-svn: 323309	2018-01-24 07:51:41 +00:00
Ashutosh Nema	007b425b77	This change add's optimization remark in LoopVersioning LICM pass. Summary: This patch is adding remark messages to the LoopVersioning LICM pass, which will be useful for optimization remark emitter (ORE) infrastructure. Patch by: Deepak Porwal Reviewers: anemet, ashutosh.nema, eastig Subscribers: eastig, vivekvpandya, fhahn, llvm-commits llvm-svn: 323183	2018-01-23 09:47:28 +00:00
Serguei Katkov	f38041dc3e	Revert [SCEV] Fix isLoopEntryGuardedByCond usage It causes buildbot failures. New added assert is fired. It seems not all usages of isLoopEntryGuardedByCond are fixed. llvm-svn: 323079	2018-01-22 07:47:02 +00:00
Serguei Katkov	50714a1cbc	[SCEV] Fix isLoopEntryGuardedByCond usage ScalarEvolution::isKnownPredicate invokes isLoopEntryGuardedByCond without check that SCEV is available at entry point of the loop. It is incorrect and fixed by patch. Reviewers: sanjoy, mkazantsev, anna, dorit Reviewed By: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42165 llvm-svn: 323077	2018-01-22 07:31:41 +00:00
Philip Reames	f57714c3c7	[DSE] Factor out common code [NFC] We already had the pointer being stored to in the MemLoc, reuse that code. In merging cases, it turned out the interface of the getLocForWrite had become inconsitent with other related utilities. Fix that by making sure the input passes hasAnalyzableWrite as well. llvm-svn: 323056	2018-01-21 02:10:54 +00:00
Philip Reames	424e7a1174	[DSE] Minor rename for clarity sake [NFC] llvm-svn: 323055	2018-01-21 01:44:33 +00:00
Hiroshi Inoue	d24ddcd6c4	[NFC] fix trivial typos in comments "the the" -> "the" llvm-svn: 322934	2018-01-19 10:55:29 +00:00
Javed Absar	1e28194a40	[SCEV] Fix typo. NFC. Fix confusing typo in comment. llvm-svn: 322765	2018-01-17 21:58:35 +00:00
Daniil Fukalov	d5fca554e2	[AMDGPU] add LDS f32 intrinsics added llvm.amdgcn.atomic.{add\|min\|max}.f32 intrinsics to allow generate ds_{add\|min\|max}[_rtn]_f32 instructions needed for OpenCL float atomics in LDS Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D37985 llvm-svn: 322656	2018-01-17 14:05:05 +00:00
Ivan A. Kosarev	4d0ff0c74d	[Transforms] Support making mutable versions of new-format TBAA access tags Differential Revision: https://reviews.llvm.org/D41565 llvm-svn: 322650	2018-01-17 13:29:54 +00:00
Javed Absar	0b05f327d6	[SCEV] fix typo llvm-svn: 322629	2018-01-17 11:03:06 +00:00
Florian Hahn	c6c89bffdc	[CallSiteSplitting] Pass list of (BB, Conditions) pairs to splitCallSite. This removes some duplication from splitCallSite and makes it easier to add additional code dealing with each predecessor. It also allows us to split for more than 2 predecessors, although that is not enabled for now. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41858 llvm-svn: 322599	2018-01-16 22:13:15 +00:00
Hiroshi Inoue	99a8faa615	[SROA] fix assetion failure This patch fixes the assertion failure in SROA reported in PR35657. PR35657 reports the assertion failure due to r319522 (splitting for non-whole-alloca slices), but this problem can happen even without r319522. The problem exists in a check for reusing an existing alloca when rewriting partitions. As the original comment said, we can reuse the existing alloca if the new alloca has the same type and offset with the existing one. But the code checks only type of the alloca and then check the offset using an assert. In a corner case with out-of-bounds access (e.g. @PR35657 function added in unit test), it is possible that the two allocas have the same type but different offsets. This patch makes the check of the offset in the if condition, and re-enables the splitting for non-whole-alloca slices. Differential Revision: https://reviews.llvm.org/D41981 llvm-svn: 322533	2018-01-16 06:23:05 +00:00
Max Kazantsev	d0fe502385	[NFC] Fix comment to adjust to reality llvm-svn: 322468	2018-01-15 05:44:43 +00:00
Daniel Neilson	2409d24201	[NFC] Change MemIntrinsicInst::setAlignment() to take an unsigned instead of a Constant Summary: In preparation for https://reviews.llvm.org/D41675 this NFC changes this prototype of MemIntrinsicInst::setAlignment() to accept an unsigned instead of a Constant. llvm-svn: 322403	2018-01-12 21:33:37 +00:00
Brian M. Rzycki	9b7ae23256	[JumpThreading] Preservation of DT and LVI across the pass Summary: See D37528 for a previous (non-deferred) version of this patch and its description. Preserves dominance in a deferred manner using a new class DeferredDominance. This reduces the performance impact of updating the DominatorTree at every edge insertion and deletion. A user may call DDT->flush() within JumpThreading for an up-to-date DT. This patch currently has one flush() at the end of runImpl() to ensure DT is preserved across the pass. LVI is also preserved to help subsequent passes such as CorrelatedValuePropagation. LVI is simpler to maintain and is done immediately (not deferred). The code to perform the preversation was minimally altered and simply marked as preserved for the PassManager to be informed. This extends the analysis available to JumpThreading for future enhancements such as threading across loop headers. Reviewers: dberlin, kuhar, sebpop Reviewed By: kuhar, sebpop Subscribers: mgorny, dmgreen, kuba, rnk, rsmith, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40146 llvm-svn: 322401	2018-01-12 21:06:48 +00:00
Max Kazantsev	ef0576000c	[IRCE][NFC] Make range check's End a non-null SCEV Currently, IRC contains `Begin` and `Step` as SCEVs and `End` as value. Aside from that, `End` can also be `nullptr` which can be later conditionally converted into a non-null SCEV. To make this logic more transparent, this patch makes `End` a SCEV and calculates it early, so that it is never a null. Differential Revision: https://reviews.llvm.org/D39590 llvm-svn: 322364	2018-01-12 10:00:26 +00:00
Fiona Glaser	efe6a84e5b	[Sink] Really really fix predicate in legality check LoadInst isn't enough; we need to include intrinsics that perform loads too. All side-effecting intrinsics and such are already covered by the isSafe check, so we just need to care about things that read from memory. D41960, originally from D33179. llvm-svn: 322311	2018-01-11 21:28:57 +00:00
Michael Zolotukhin	1f562176e9	[LoopRotate] Detect loops with indirect branches better (we're giving up on them). llvm-svn: 322137	2018-01-09 23:54:35 +00:00
Chris Bieneman	abdea268c1	[IPSCCP] Remove calls without side effects Summary: When performing constant propagation for call instructions we have historically replaced all uses of the return from a call, but not removed the call itself. This is required for correctness if the calls have side effects, however the compiler should be able to safely remove calls that don't have side effects. This allows the compiler to completely fold away calls to functions that have no side effects if the inputs are constant and the output can be determined at compile time. Reviewers: davide, sanjoy, bruno, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38856 llvm-svn: 322125	2018-01-09 21:58:46 +00:00
Daniel Berlin	56cca7437c	NewGVN: Fix PR/33367, which was causing us to delete non-copy intrinsics accidentally in some rare cases llvm-svn: 322115	2018-01-09 20:12:42 +00:00
Petar Jovanovic	1d26c7e4ff	[EarlyCSE] Salvage debug info during DCE EarlyCSE did not try to salvage debug info during erasing of instructions. This change fixes it. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D41496 llvm-svn: 322083	2018-01-09 15:08:37 +00:00
Davide Italiano	9a60d2c157	[CVP] Replace incoming values from unreachable blocks with undef. This is an attempt of fixing PR35807. Due to the non-standard definition of dominance in LLVM, where uses in unreachable blocks are dominated by anything, you can have, in an unreachable block: %patatino = OP1 %patatino, CONSTANT When `SimplifyInstruction` receives a PHI where an incoming value is of the aforementioned form, in some cases, loops indefinitely. What I propose here instead is keeping track of the incoming values from unreachable blocks, and replacing them with undef. It fixes this case, and it seems to be good regardless (even if we can't prove that the value is constant, as it's coming from an unreachable block, we can ignore it). Differential Revision: https://reviews.llvm.org/D41812 llvm-svn: 322006	2018-01-08 16:34:06 +00:00
Davide Italiano	e15bffe9ea	Revert "[SCCP] Manually fold branches on undef." I thought this was responsible for PR35723, but I was wrong, the issue lies elsewhere. Revert while I debug. llvm-svn: 321975	2018-01-07 22:09:44 +00:00
Reid Kleckner	cd78ddc119	Revert "[JumpThreading] Preservation of DT and LVI across the pass" This reverts r321825, it causes crashes in Chromium. Reproducer forthcoming. llvm-svn: 321832	2018-01-04 23:23:46 +00:00
Brian M. Rzycki	cdad6c0b60	[JumpThreading] Preservation of DT and LVI across the pass Summary: See D37528 for a previous (non-deferred) version of this patch and its description. Preserves dominance in a deferred manner using a new class DeferredDominance. This reduces the performance impact of updating the DominatorTree at every edge insertion and deletion. A user may call DDT->flush() within JumpThreading for an up-to-date DT. This patch currently has one flush() at the end of runImpl() to ensure DT is preserved across the pass. LVI is also preserved to help subsequent passes such as CorrelatedValuePropagation. LVI is simpler to maintain and is done immediately (not deferred). The code to perfom the preversation was minimally altered and was simply marked as preserved for the PassManager to be informed. This extends the analysis available to JumpThreading for future enhancements. One example is loop boundary threading. Reviewers: dberlin, kuhar, sebpop Reviewed By: kuhar, sebpop Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40146 llvm-svn: 321825	2018-01-04 21:57:32 +00:00
Aditya Kumar	1f90cae80f	[GVNHoist] Fix: PR35222 gvn-hoist incorrectly erases load in case of a loop Reviewers: dberlin sebpop eli.friedman Differential Revision: https://reviews.llvm.org/D41453 llvm-svn: 321789	2018-01-04 07:47:24 +00:00
Matt Arsenault	8070882b4e	StructurizeCFG: Fix broken backedge detection The work order was changed in r228186 from SCC order to RPO with an arbitrary sorting function. The sorting function attempted to move inner loop nodes earlier. This was was apparently relying on an assumption that every block in a given loop / the same loop depth would be seen before visiting another loop. In the broken testcase, a block outside of the loop was encountered before moving onto another block in the same loop. The testcase would then structurize such that one blocks unconditional successor could never be reached. Revert to plain RPO for the analysis phase. This fixes detecting edges as backedges that aren't really. The processing phase does use another visited set, and I'm unclear on whether the order there is as important. An arbitrary order doesn't work, and triggers some infinite loops. The reversed RPO list seems to work and is closer to the order that was used before, minus the arbitary custom sorting. A few of the changed tests now produce smaller code, and a few are slightly worse looking. llvm-svn: 321751	2018-01-03 18:45:37 +00:00
Benjamin Kramer	c7fc81e659	Use phi ranges to simplify code. No functionality change intended. llvm-svn: 321585	2017-12-30 15:27:33 +00:00
Matt Arsenault	8dcfa137f3	StructurizeCFG: Use phi iterator range llvm-svn: 321568	2017-12-29 19:25:57 +00:00
Benjamin Kramer	3a13ed60ba	Avoid int to string conversion in Twine or raw_ostream contexts. Some output changes from uppercase hex to lowercase hex, no other functionality change intended. llvm-svn: 321526	2017-12-28 16:58:54 +00:00
Max Kazantsev	a13e163a27	[RewriteStatepoints] Fix incorrect assertion `RewriteStatepointsForGC` iterates over function blocks and their predecessors in order of declaration. One of outcomes of this is that callsites are placed in arbitrary order which has nothing to do with travelsar order. On the other hand, function `recomputeLiveInValues` asserts that bases are added to `Info.PointerToBase` before their deried pointers are updated. But if call sites are processed in order different from RPOT, this is not necessarily true. We cannot guarantee that the base was placed there before every pointer derived from it. All we can guarantee is that this base was marked as known base by this point. This patch replaces the fact that we assert from checking that the base was added to the map with assert that the base was marked as known base. Differential Revision: https://reviews.llvm.org/D41593 llvm-svn: 321517	2017-12-28 12:03:12 +00:00
Reid Kleckner	6d31001cd6	Revert "[memcpyopt] Teach memcpyopt to optimize across basic blocks" This reverts r321138. It seems there are still underlying issues with memdep. PR35519 seems to still be present if debug info is enabled. We end up losing a memcpy. Somehow during store to memset merging, we insert the memset after the memcpy or fail to update the memdep analysis to account for the newly inserted memset of a pair. Reduced test case: #include <assert.h> #include <stdio.h> #include <string> #include <utility> #include <vector> void do_push_back( std::vector<std::pair<std::string, std::vector<std::string>>>* crls) { crls->push_back(std::make_pair(std::string(), std::vector<std::string>())); } int __attribute__((optnone)) main() { // Put some data in the vector and then remove it so we take the push_back // fast path. std::vector<std::pair<std::string, std::vector<std::string>>> crl_set; crl_set.push_back({"asdf", {}}); crl_set.pop_back(); printf("first word in vector storage: %p\n", (void)crl_set.data()); // Do the push_back which may fail to initialize the data. do_push_back(&crl_set); auto first = &crl_set.back().first; printf("first word in vector storage (should be zero): %p\n", (void*)crl_set.data()); assert(first->empty()); puts("ok"); } Compile with libc++, enable optimizations, and enable debug info: $ clang++ -stdlib=libc++ -g -O2 t.cpp -o t.exe -Wl,-rpath=llvm/build/lib This program will assert with this change. llvm-svn: 321510	2017-12-28 05:10:33 +00:00
Florian Hahn	7e9328906b	[CallSiteSplitting] Remove isOrHeader restriction. By following the single predecessors of the predecessors of the call site, we do not need to restrict the control flow. Reviewed By: junbuml, davide Differential Revision: https://reviews.llvm.org/D40729 llvm-svn: 321413	2017-12-23 20:02:26 +00:00
Davide Italiano	55b663431e	[SCCP] Manually fold branches on undef. This code was originally removed and replace with an assertion because believed unnecessary. It turns out there was simply no test coverage for this case, and the constant folder doesn't yet know about patterns like `br undef %label1, %label2`. Presumably at some point the constant folder might learn about these patterns, but it's a broader change. A testcase will be added to make sure this doesn't regress again in the future. Fixes PR35723. llvm-svn: 321402	2017-12-23 15:06:30 +00:00
Easwaran Raman	a17f220590	Add hasProfileData() to check if a function has profile data. NFC. Summary: This replaces calls to getEntryCount().hasValue() with hasProfileData that does the same thing. This refactoring is useful to do before adding synthetic function entry counts but also a useful cleanup IMO even otherwise. I have used hasProfileData instead of hasRealProfileData as David had earlier suggested since I think profile implies "real" and I use the phrase "synthetic entry count" and not "synthetic profile count" but I am fine calling it hasRealProfileData if you prefer. Reviewers: davidxl, silvas Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41461 llvm-svn: 321331	2017-12-22 01:33:52 +00:00
Dan Gohman	aa3922819e	[memcpyopt] Teach memcpyopt to optimize across basic blocks This teaches memcpyopt to make a non-local memdep query when a local query indicates that the dependency is non-local. This notably allows it to eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. This is r319482 and r319483, along with fixes for PR35519: fix the optimization that merges stores into memsets to preserve cached memdep info, and fix memdep's non-local caching strategy to not assume that larger queries are always more conservative than smaller ones. Fixes PR28958 and PR35519. Differential Revision: https://reviews.llvm.org/D40802 llvm-svn: 321138	2017-12-20 01:36:25 +00:00
Haicheng Wu	5b106ef92e	[SeparateConstOffsetFromGEP] Fix a typo. NFC. do CSE for to => do CSE to llvm-svn: 321098	2017-12-19 18:49:21 +00:00
Max Kazantsev	fd95ee0c9a	[JumpThreading] Restrict PRE across instructions that don't pass control to successors PRE in JumpThreading should not be able to hoist copy of non-speculable loads across instructions that don't always transfer execution to their successors, otherwise they may introduce an unsafe load which otherwise would not be executed. The same problem for GVN was fixed as rL316975. Differential Revision: https://reviews.llvm.org/D40347 llvm-svn: 321063	2017-12-19 09:10:21 +00:00
Hiroshi Inoue	c6faf15459	[SROA] Disable non-whole-alloca splits by default This patch introduce a switch to control splitting of non-whole-alloca slices with default off. The switch will be default on again after fixing an issue reported in PR35657. llvm-svn: 320958	2017-12-18 06:47:37 +00:00
Jun Bum Lim	44c58d35c1	Re-commit : [LICM] Allow sinking when foldable in loop This recommits r320823 reverted due to the test failure in sink-foldable.ll and an unused variable. Added "REQUIRES: aarch64-registered-target" in the test and removed unused variable. Original commit message: Continue trying to sink an instruction if its users in the loop is foldable. This will allow the instruction to be folded in the loop by decoupling it from the user outside of the loop. Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier Reviewed By: hfinkel Subscribers: javed.absar, bmakam, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37076 llvm-svn: 320858	2017-12-15 20:33:24 +00:00
Jun Bum Lim	5efd4d8b5e	Revert "Re-commit : [LICM] Allow sinking when foldable in loop" This reverts commit r320833. llvm-svn: 320836	2017-12-15 18:12:49 +00:00
Jun Bum Lim	83ccad6684	Re-commit : [LICM] Allow sinking when foldable in loop This recommit r320823 after fixing a test failure. Original commit message: Continue trying to sink an instruction if its users in the loop is foldable. This will allow the instruction to be folded in the loop by decoupling it from the user outside of the loop. Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier Reviewed By: hfinkel Subscribers: javed.absar, bmakam, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37076 llvm-svn: 320833	2017-12-15 17:58:59 +00:00
Jun Bum Lim	6136d87f5d	Revert "[LICM] Allow sinking when foldable in loop" This reverts commit r320823. llvm-svn: 320828	2017-12-15 16:35:09 +00:00
Jun Bum Lim	22855c26a5	[LICM] Allow sinking when foldable in loop Summary: Continue trying to sink an instruction if its users in the loop is foldable. This will allow the instruction to be folded in the loop by decoupling it from the user outside of the loop. Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier Reviewed By: hfinkel Subscribers: javed.absar, bmakam, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37076 llvm-svn: 320823	2017-12-15 16:09:54 +00:00
Fedor Sergeev	4b86d79048	[PM] port Rewrite Statepoints For GC to the new pass manager. Summary: The port is nearly straightforward. The only complication is related to the analyses handling, since one of the analyses used in this module pass is domtree, which is a function analysis. That requires asking for the results of each function and disallows a single interface for run-on-module pass action. Decided to copy-paste the main body of this pass. Most of its code is requesting analyses anyway, so not that much of a copy-paste. The rest of the code movement is to transform all the implementation helper functions like stripNonValidData into non-member statics. Extended all the related LLVM tests with new-pass-manager use. No failures. Reviewers: sanjoy, anna, reames Reviewed By: anna Subscribers: skatkov, llvm-commits Differential Revision: https://reviews.llvm.org/D41162 llvm-svn: 320796	2017-12-15 09:32:11 +00:00
Sanjay Patel	0ab0c1a201	[SimplifyCFG] don't sink common insts too soon (PR34603) This should solve: https://bugs.llvm.org/show_bug.cgi?id=34603 ...by preventing SimplifyCFG from altering redundant instructions before early-cse has a chance to run. It changes the default (canonical-forming) behavior of SimplifyCFG, so we're only doing the sinking transform later in the optimization pipeline. Differential Revision: https://reviews.llvm.org/D38566 llvm-svn: 320749	2017-12-14 22:05:20 +00:00
Sanjay Patel	558a465473	[EarlyCSE] recognize swapped variants of abs/nabs as equivalent Extends https://reviews.llvm.org/rL320640 Differential Revision: https://reviews.llvm.org/D41136 llvm-svn: 320653	2017-12-13 22:57:35 +00:00
Brian M. Rzycki	580bc3c8fa	Reverting [JumpThreading] Preservation of DT and LVI across the pass Stage 2 bootstrap failed: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/14434 llvm-svn: 320641	2017-12-13 22:01:17 +00:00
Sanjay Patel	3c7a35de7f	[EarlyCSE] recognize commuted and swapped variants of min/max as equivalent (PR35642) As shown in: https://bugs.llvm.org/show_bug.cgi?id=35642 ...we can have different forms of min/max, so we should recognize those here in EarlyCSE similar to how we already handle binops and compares that can commute. Differential Revision: https://reviews.llvm.org/D41136 llvm-svn: 320640	2017-12-13 21:58:15 +00:00
Michael Zolotukhin	6af4f232b5	Remove redundant includes from lib/Transforms. llvm-svn: 320628	2017-12-13 21:31:01 +00:00
Brian M. Rzycki	d989af98b3	[JumpThreading] Preservation of DT and LVI across the pass Summary: See D37528 for a previous (non-deferred) version of this patch and its description. Preserves dominance in a deferred manner using a new class DeferredDominance. This reduces the performance impact of updating the DominatorTree at every edge insertion and deletion. A user may call DDT->flush() within JumpThreading for an up-to-date DT. This patch currently has one flush() at the end of runImpl() to ensure DT is preserved across the pass. LVI is also preserved to help subsequent passes such as CorrelatedValuePropagation. LVI is simpler to maintain and is done immediately (not deferred). The code to perfom the preversation was minimally altered and was simply marked as preserved for the PassManager to be informed. This extends the analysis available to JumpThreading for future enhancements. One example is loop boundary threading. Reviewers: dberlin, kuhar, sebpop Reviewed By: kuhar, sebpop Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40146 llvm-svn: 320612	2017-12-13 20:52:26 +00:00
Aditya Kumar	49c03b11df	[GVNHoist] Fix: PR35222 gvn-hoist incorrectly erases load w.r.t. the paper "A Practical Improvement to the Partial Redundancy Elimination in SSA Form" (https://sites.google.com/site/jongsoopark/home/ssapre.pdf) Proper dominance check was missing here, so having a loopinfo should not be required. Committing this diff as this fixes the bug, if there are further concerns, I'll be happy to work on them. Differential Revision: https://reviews.llvm.org/D39781 llvm-svn: 320607	2017-12-13 19:40:07 +00:00
Florian Hahn	beda7d517d	[CallSiteSplitting] Refactor creating callsites. Summary: This change makes the call site creation more general if any of the arguments is predicated on a condition in the call site's predecessors. If we find a callsite, that potentially can be split, we collect the set of conditions for the call site's predecessors (currently only 2 predecessors are allowed). To do that, we traverse each predecessor's predecessors as long as it only has single predecessors and record the condition, if it is relevant to the call site. For each condition, we also check if the condition is taken or not. In case it is not taken, we record the inverse predicate. We use the recorded conditions to create the new call sites and split the basic block. This has 2 benefits: (1) it is slightly easier to see what is going on (IMO) and (2) we can easily extend it to handle more complex control flow. Reviewers: davidxl, junbuml Reviewed By: junbuml Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40728 llvm-svn: 320547	2017-12-13 03:05:20 +00:00
Fiona Glaser	b8a330c42a	Reassociate: add global reassociation algorithm This algorithm (explained more in the source code) takes into account global redundancies by building a "pair map" to find common subexprs. The primary motivation of this is to handle situations like foo = (a * b) * c bar = (a * d) * c where we currently don't identify that "a * c" is redundant. Accordingly, it prioritizes the emission of a * c so that CSE can remove the redundant calculation later. Does not change the actual reassociation algorithm -- only the order in which the reassociated operand chain is reconstructed. Gives ~1.5% floating point math instruction count reduction on a large offline suite of graphics shaders. llvm-svn: 320515	2017-12-12 19:18:02 +00:00
Mikael Holmen	66cf383761	[CallSiteSplitting] Don't let debug intrinsics affect optimizations Summary: This solves PR35616. We don't want the compiler to generate different code when we compile with/without -g, so we now ignore debug intrinsics when determining if the optimization can trigger or not. Reviewers: junbuml Subscribers: davide, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41068 llvm-svn: 320460	2017-12-12 07:29:57 +00:00
Matt Arsenault	3e268cc0dd	LSR: Check more intrinsic pointer operands llvm-svn: 320424	2017-12-11 21:38:43 +00:00
Evgeniy Stepanov	c667c1f47a	Hardware-assisted AddressSanitizer (llvm part). Summary: This is LLVM instrumentation for the new HWASan tool. It is basically a stripped down copy of ASan at this point, w/o stack or global support. Instrumenation adds a global constructor + runtime callbacks for every load and store. HWASan comes with its own IR attribute. A brief design document can be found in clang/docs/HardwareAssistedAddressSanitizerDesign.rst (submitted earlier). Reviewers: kcc, pcc, alekseyshl Subscribers: srhines, mehdi_amini, mgorny, javed.absar, eraman, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D40932 llvm-svn: 320217	2017-12-09 00:21:41 +00:00
Brian M. Rzycki	0eae123d9e	[JumpThreading] Minor comment cleanup. NFC. (test commit) llvm-svn: 320179	2017-12-08 19:36:32 +00:00
Alina Sbirlea	193429f0c8	[ModRefInfo] Make enum ModRefInfo an enum class [NFC]. Summary: Make enum ModRefInfo an enum class. Changes to ModRefInfo values should be done using inline wrappers. This should prevent future bit-wise opearations from being added, which can be more error-prone. Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40933 llvm-svn: 320107	2017-12-07 22:41:34 +00:00
Alina Sbirlea	18fea013de	[ModRefInfo] Do not use ModRefInfo result in if conditions as this makes assumptions about the values in the enum. Replace with wrapper returning bool [NFC]. llvm-svn: 319949	2017-12-06 19:56:37 +00:00
Hans Wennborg	146a9c3e51	Revert r319482 and r319483 "[memcpyopt] Teach memcpyopt to optimize across basic blocks" This caused PR35519. > [memcpyopt] Teach memcpyopt to optimize across basic blocks > > This teaches memcpyopt to make a non-local memdep query when a local query > indicates that the dependency is non-local. This notably allows it to > eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. > > Fixes PR28958. > > Differential Revision: https://reviews.llvm.org/D38374 > > [memcpyopt] Commit file missed in r319482. > > This change was meant to be included with r319482 but was accidentally > omitted. llvm-svn: 319873	2017-12-06 01:47:55 +00:00
Alina Sbirlea	63d2250a42	Modify ModRefInfo values using static inline method abstractions [NFC]. Summary: The aim is to make ModRefInfo checks and changes more intuitive and less error prone using inline methods that abstract the bit operations. Ideally ModRefInfo would become an enum class, but that change will require a wider set of changes into FunctionModRefBehavior. Reviewers: sanjoy, george.burgess.iv, dberlin, hfinkel Subscribers: nlopes, llvm-commits Differential Revision: https://reviews.llvm.org/D40749 llvm-svn: 319821	2017-12-05 20:12:23 +00:00
Joel Galenson	ea0bafda8a	[CVP] Remove some {s\|u}sub.with.overflow checks. This uses ConstantRange::makeGuaranteedNoWrapRegion's newly-added handling for subtraction to allow CVP to remove some subtraction overflow checks. Differential Revision: https://reviews.llvm.org/D40039 llvm-svn: 319807	2017-12-05 18:14:24 +00:00
Joel Galenson	d9500bc533	Test commit. I removed a space at the end of a comment. NFC. llvm-svn: 319803	2017-12-05 17:59:07 +00:00
Anna Thomas	7b360434ff	[Loop Predication] Teach LP about reverse loops Summary: Currently, we only support predication for forward loops with step of 1. This patch enables loop predication for reverse or countdownLoops, which satisfy the following conditions: 1. The step of the IV is -1. 2. The loop has a singe latch as B(X) = X <pred> latchLimit with pred as s> or u> 3. The IV of the guard is the decrement IV of the latch condition (Guard is: G(X) = X-1 u< guardLimit). This patch was downstream for a while and is the last series of patches that's from our LP implementation downstream. Reviewers: apilipenko, mkazantsev, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40353 llvm-svn: 319659	2017-12-04 15:11:48 +00:00
Hiroshi Inoue	48e4c7aae6	Recommit rL319407: [SROA] enable splitting for non-whole-alloca loads and stores Recommiting once reverted patch rL319407 after adding a check for bit vector size to avoid failures in some build bots. llvm-svn: 319522	2017-12-01 06:05:05 +00:00
Dan Gohman	59e4c0b938	[memcpyopt] Teach memcpyopt to optimize across basic blocks This teaches memcpyopt to make a non-local memdep query when a local query indicates that the dependency is non-local. This notably allows it to eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. Fixes PR28958. Differential Revision: https://reviews.llvm.org/D38374 llvm-svn: 319482	2017-11-30 22:10:53 +00:00
Hiroshi Inoue	21e8ded4d2	Revert rL319407: [SROA] enable splitting for non-whole-alloca loads and stores This reverts commit rL319407 due to failures in some buildbot. llvm-svn: 319410	2017-11-30 08:29:51 +00:00
Hiroshi Inoue	422e80aee2	[SROA] enable splitting for non-whole-alloca loads and stores Currently, SROA splits loads and stores only when they are accessing the whole alloca. This patch relaxes this limitation to allow splitting a load/store if all other loads and stores to the alloca are disjoint to or fully included in the current load/store. If there is no other load or store that crosses the boundary of the current load/store, the current splitting implementation works as is. The whole-alloca loads and stores meet this new condition and so they are still splittable. Here is a simplified motivating example. struct record { long long a; int b; int c; }; int func(struct record r) { for (int i = 0; i < r.c; i++) r.b++; return r.b; } When updating r.b (or r.c as well), LLVM generates redundant instructions on some platforms (such as x86_64, ppc64); here, r.b and r.c are packed into one 64-bit GPR when the struct is passed as a method argument. With this patch, the above example is compiled into only few instructions without loop. Without the patch, unnecessary loop-carried dependency is introduced by SROA and the loop cannot be eliminated by the later optimizers. Differential Revision: https://reviews.llvm.org/D32998 llvm-svn: 319407	2017-11-30 07:44:46 +00:00
Adam Nemet	2e92289014	Demote this opt remark to DEBUG. From a random opt-stat output: Top 10 remarks: tailcallelim/tailcall 53% inline/AlwaysInline 13% gvn/LoadClobbered 13% inline/Inlined 8% inline/TooCostly 2% inline/NoDefinition 2% licm/LoadWithLoopInvariantAddressInvalidated 2% licm/Hoisted 1% asm-printer/InstructionCount 1% prologepilog/StackSize 1% llvm-svn: 319235	2017-11-28 22:11:00 +00:00
Adrian Prantl	77d90b0c39	SROA: Don't create variable fragments that are outside of the variable. An alloca may be larger than a variable that is described to be stored there. Don't create a dbg.value for fragments that are outside of the variable. This fixes PR35447. https://bugs.llvm.org/show_bug.cgi?id=35447 llvm-svn: 319230	2017-11-28 21:30:38 +00:00
Jonas Paulsson	f0ff20f1f0	Use getStoreSize() in various places instead of 'BitSize >> 3'. This is needed for cases when the memory access is not as big as the width of the data type. For instance, storing i1 (1 bit) would be done in a byte (8 bits). Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a size of 0, which for instance makes alias analysis return NoAlias even when it shouldn't. There are no tests as this was done as a follow-up to the bugfix for the case where this was discovered (r318824). This handles more similar cases. Review: Björn Petterson https://reviews.llvm.org/D40339 llvm-svn: 319173	2017-11-28 14:44:32 +00:00
Chandler Carruth	c34f789e38	Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable. The core idea is to (re-)introduce some redundancies where their cost is hidden by the cost of materializing immediates for constant operands of PHI nodes. When the cost of the redundancies is covered by this, avoiding materializing the immediate has numerous benefits: 1) Less register pressure 2) Potential for further folding / combining 3) Potential for more efficient instructions due to immediate operand As a motivating example, consider the remarkably different cost on x86 of a SHL instruction with an immediate operand versus a register operand. This pattern turns up surprisingly frequently, but is somewhat rarely obvious as a significant performance problem. The pass is entirely target independent, but it does rely on the target cost model in TTI to decide when to speculate things around the PHI node. I've included x86-focused tests, but any target that sets up its immediate cost model should benefit from this pass. There is probably more that can be done in this space, but the pass as-is is enough to get some important performance on our internal benchmarks, and should be generally performance neutral, but help with more extensive benchmarking is always welcome. One awkward part is that this pass has to be scheduled after everything that can eliminate these kinds of redundancies. This includes SimplifyCFG, GVN, etc. I'm open to suggestions about better places to put this. We could in theory make it part of the codegen pass pipeline, but there doesn't really seem to be a good reason for that -- it isn't "lowering" in any sense and only relies on pretty standard cost model based TTI queries, so it seems to fit well with the "optimization" pipeline model. Still, further thoughts on the pipeline position are welcome. I've also only implemented this in the new pass manager. If folks are very interested, I can try to add it to the old PM as well, but I didn't really see much point (my use case is already switched over to the new PM). I've tested this pretty heavily without issue. A wide range of benchmarks internally show no change outside the noise, and I don't see any significant changes in SPEC either. However, the size class computation in tcmalloc is substantially improved by this, which turns into a 2% to 4% win on the hottest path through tcmalloc for us, so there are definitely important cases where this is going to make a substantial difference. Differential revision: https://reviews.llvm.org/D37467 llvm-svn: 319164	2017-11-28 11:32:31 +00:00
Florian Hahn	25ea91a838	[TailRecursionElimination] Skip debug intrinsics. Summary: I think we do not need to analyze debug intrinsics here, as they should not impact codegen. This has 2 benefits: 1) slightly less work to do and 2) avoiding generating optimization remarks for converting calls to debug intrinsics to tail calls, which are not really helpful for users. Based on work by Sander de Smalen. Reviewers: davide, trentxintong, aprantl Reviewed By: aprantl Subscribers: llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D40440 llvm-svn: 319158	2017-11-28 09:32:25 +00:00
Max Kazantsev	115607226a	[GVN] Prevent ScalarPRE from hoisting across instructions that don't pass control flow to successors This is to address a problem similar to those in D37460 for Scalar PRE. We should not PRE across an instruction that may not pass execution to its successor unless it is safe to speculatively execute it. Differential Revision: https://reviews.llvm.org/D38619 llvm-svn: 319147	2017-11-28 07:07:55 +00:00
Rafael Espindola	c06f55e1e8	This reverts commit r319096 and r319097. Revert "[SROA] Propagate !range metadata when moving loads." Revert "[Mem2Reg] Clang-format unformatted parts of this file. NFCI." Davide says they broke a bot. llvm-svn: 319131	2017-11-28 01:25:38 +00:00
Adrian Prantl	d7f6f1636d	SROA: Avoid creating a fragment expression that covers the entire variable. Fixes PR35416. https://bugs.llvm.org/show_bug.cgi?id=35416 llvm-svn: 319126	2017-11-28 00:57:53 +00:00
Davide Italiano	b5d59e73ee	[SROA] Propagate !range metadata when moving loads. This tries to propagate !range metadata to a pre-existing load when a load is optimized out. This is done instead of adding an assume because converting loads to and from assumes creates a lot of IR. Patch by Ariel Ben-Yehuda. Differential Revision: https://reviews.llvm.org/D37216 llvm-svn: 319096	2017-11-27 21:25:13 +00:00
Sanjay Patel	0de1a4bc2d	[PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend on arg rather than result This should fix PR31455: https://bugs.llvm.org/show_bug.cgi?id=31455 Differential Revision: https://reviews.llvm.org/D28314 llvm-svn: 319094	2017-11-27 21:15:43 +00:00
Benjamin Kramer	51ebcaaf25	Make helpers static. NFC. llvm-svn: 318953	2017-11-24 14:55:41 +00:00
Max Kazantsev	716e647d74	[IRCE][NFC] Add no wrap flags to no-wrapping SCEV calculation In a lambda where we expect to have result within bounds, add respective `nsw/nuw` flags to help SCEV just in case if it fails to figure them out on its own. Differential Revision: https://reviews.llvm.org/D40168 llvm-svn: 318898	2017-11-23 06:14:39 +00:00
Davide Italiano	b480b5c2ee	[SCCP] Pick the right lattice value for constants. After the dataflow algorithm proves that an argument is constant, it replaces it value with the integer constant and drops the lattice value associated to the DEF. e.g. in the example we have @f() that's called twice: call @f(undef, ...) call @f(2, ...) `undef` MEET 2 = 2 so we replace the argument and all its uses with the constant 2. Shortly after, tryToReplaceWithConstantRange() tries to get the lattice value for the argument we just replaced, causing an assertion. This function is a little peculiar as it runs when we're doing replacement and not as part of the solver but still queries the solver. The fix is that of checking whether we replaced the value already and get a temporary lattice value for the constant. Thanks to Zhendong Su for the report! Fixes PR35357. llvm-svn: 318817	2017-11-22 03:04:55 +00:00
Alina Sbirlea	ff8b8aea2e	Add MemorySSA as loop dependency, disabled by default [NFC]. Summary: First step in adding MemorySSA as dependency for loop pass manager. Adding the dependency under a flag. New pass manager: MSSA pointer in LoopStandardAnalysisResults can be null. Legacy and new pass manager: Use cl::opt EnableMSSALoopDependency. Disabled by default. Reviewers: sanjoy, davide, gberry Subscribers: mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D40274 llvm-svn: 318772	2017-11-21 15:45:46 +00:00
Davide Italiano	5df8080011	[SCCP] If we replace with a constant, we can't replace with a range. This microoptimization is NFC. llvm-svn: 318711	2017-11-21 00:21:52 +00:00
Teresa Johnson	3309002a86	[SROA] Correctly invalidate analyses when dead instructions deleted Summary: SROA can fail in rewriting alloca but still rewrite a phi resulting in dead instruction elimination. The Changed flag was not being set correctly, resulting in downstream passes using stale analyses. The included test case will assert during the second BDCE pass as a result. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39921 llvm-svn: 318677	2017-11-20 18:33:38 +00:00
Max Kazantsev	268467869b	[IRCE] Smart range intersection In rL316552, we ban intersection of unsigned latch range with signed range check and vice versa, unless the entire range check iteration space is known positive. It was a correct functional fix that saved us from dealing with ambiguous values, but it also appeared to be a very restrictive limitation. In particular, in the following case: loop: %iv = phi i32 [ 0, %preheader ], [ %iv.next, %latch] %iv.offset = add i32 %iv, 10 %rc = icmp slt i32 %iv.offset, %len br i1 %rc, label %latch, label %deopt latch: %iv.next = add i32 %iv, 11 %cond = icmp i32 ult %iv.next, 100 br it %cond, label %loop, label %exit Here, the unsigned iteration range is `[0, 100)`, and the safe range for range check is `[-10, %len - 10)`. For unsigned iteration spaces, we use unsigned min/max functions for range intersection. Given this, we wanted to avoid dealing with `-10` because it is interpreted as a very big unsigned value. Semantically, range check's safe range goes through unsigned border, so in fact it is two disjoint ranges in IV's iteration space. Intersection of such ranges is not trivial, so we prohibited this case saying that we are not allowed to intersect such ranges. What semantics of this safe range actually means is that we can start from `-10` and go up increasing the `%iv` by one until we reach `%len - 10` (for simplicity let's assume that `%len - 10` is a reasonably big positive value). In particular, this safe iteration space includes `0, 1, 2, ..., %len - 11`. So if we were able to return safe iteration space `[0, %len - 10)`, we could safely intersect it with IV's iteration space. All values in this range are non-negative, so using signed/unsigned min/max for them is unambiguous. In this patch, we alter the algorithm of safe range calculation so that it returnes a subset of the original safe space which is represented by one continuous range that does not go through wrap. In order to reach this, we use modified SCEV substraction function. It can be imagined as a function that substracts by `1` (or `-1`) as long as the further substraction does not cause a wrap in IV iteration space. This allows us to perform IRCE in many situations when we deal with IV space and range check of different types (in terms of signed/unsigned). We apply this approach for both matching and not matching types of IV iteration space and the range check. One implication of this is that now IRCE became smarter in detection of empty safe ranges. For example, in this case: loop: %iv = phi i32 [ %begin, %preheader ], [ %iv.next, %latch] %iv.offset = sub i32 %iv, 10 %rc = icmp ult i32 %iv.offset, %len br i1 %rc, label %latch, label %deopt latch: %iv.next = add i32 %iv, 11 %cond = icmp i32 ult %iv.next, 100 br it %cond, label %loop, label %exit If `%len` was less than 10 but SCEV failed to trivially prove that `%begin - 10 >u %len- 10`, we could end up executing entire loop in safe preloop while the main loop was still generated, but never executed. Now, cutting the ranges so that if both `begin - 10` and `%len - 10` overflow, we have a trivially empty range of `[0, 0)`. This in some cases prevents us from meaningless optimization. Differential Revision: https://reviews.llvm.org/D39954 llvm-svn: 318639	2017-11-20 06:07:57 +00:00
Florian Hahn	2a266a343f	[CallSiteSplitting] Remove some indirection (NFC). Summary: With this patch I tried to reduce the complexity of the code sightly, by removing some indirection. Please let me know what you think. Reviewers: junbuml, mcrosier, davidxl Reviewed By: junbuml Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40037 llvm-svn: 318593	2017-11-18 18:14:13 +00:00
Jun Bum Lim	0f90672ae9	[LICM] Fix PR35342 Summary: This change fix PR35342 by replacing only the current use with undef in unreachable blocks. Reviewers: efriedma, mcrosier, igor-laevsky Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40184 llvm-svn: 318551	2017-11-17 20:38:25 +00:00
Chandler Carruth	693eedb138	[PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching, making it no longer even remotely simple. The pass will now be more of a "full loop unswitching" pass rather than anything substantively simpler than any other approach. I plan to rename it accordingly once the dust settles. The key ideas of the new loop unswitcher are carried over for non-trivial unswitching: 1) Fully unswitch a branch or switch instruction from inside of a loop to outside of it. 2) Update the CFG and IR. This avoids needing to "remember" the unswitched branches as well as avoiding excessively cloning and reliance on complex parts of simplify-cfg to cleanup the cfg. 3) Update the analyses (where we can) rather than just blowing them away or relying on something else updating them. Sadly, #3 is somewhat compromised here as the dominator tree updates were too complex for me to want to reason about. I will need to make another attempt to do this now that we have a nice dynamic update API for dominators. However, we do adhere to #3 w.r.t. LoopInfo. This approach also adds an important principls specific to non-trivial unswitching: not all of the loop will be duplicated when unswitching. This fact allows us to compute the cost in terms of how much duplicate code is inserted rather than just on raw size. Unswitching conditions which essentialy partition loops will work regardless of the total loop size. Some remaining issues that I will be addressing in subsequent commits: - Handling unstructured control flow. - Unswitching 'switch' cases instead of just branches. - Moving to the dynamic update API for dominators. Some high-level, interesting limitationsV that folks might want to push on as follow-ups but that I don't have any immediate plans around: - We could be much more clever about not cloning things that will be deleted. In fact, we should be able to delete nothing and do a minimal number of clones. - There are many more interesting selection criteria for which branch to unswitch that we might want to look at. One that I'm interested in particularly are a set of conditions which all exit the loop and which can be merged into a single unswitched test of them. Differential revision: https://reviews.llvm.org/D34200 llvm-svn: 318549	2017-11-17 19:58:36 +00:00
Max Kazantsev	1ac6e8ae61	[IRCE] Remove folding of two range checks into RANGE_CHECK_BOTH The logic of replacing of a couple `RANGE_CHECK_LOWER + RANGE_CHECK_UPPER` into `RANGE_CHECK_BOTH` in fact duplicates the logic of range intersection which happens when we calculate safe iteration space. Effectively, the result of intersection of these ranges doesn't differ from the range of merged range check. We chose to remove duplicating logic in favor of code simplicity. Differential Revision: https://reviews.llvm.org/D39589 llvm-svn: 318508	2017-11-17 06:49:26 +00:00
David Blaikie	b3bde2ea50	Fix a bunch more layering of CodeGen headers that are in Target All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490	2017-11-17 01:07:10 +00:00
Max Kazantsev	b1b8aff2e7	[IRCE] Fix SCEVExpander's usage in IRCE When expanding exit conditions for pre- and postloops, we may end up expanding a recurrency from the loop to in its loop's preheader. This produces incorrect IR. This patch ensures that IRCE uses SCEVExpander correctly and only expands code which is safe to expand in this particular location. Differentian Revision: https://reviews.llvm.org/D39234 llvm-svn: 318381	2017-11-16 06:06:27 +00:00
Craig Topper	062bcf30b1	[GVNHoist] Fix a signed/unsigned comparison warning that occurs in 32-bit builds with gcc. std::distance returns ptrdiff_t which is signed. 64-bit builds don't notice because type promotion widens the unsigned first. llvm-svn: 318354	2017-11-16 00:19:59 +00:00
Sanjay Patel	d1becd082a	[Reassociate] simplify code; NFCI llvm-svn: 318298	2017-11-15 16:19:17 +00:00
Craig Topper	bf6495fbcb	[LoopRotate] processLoop should return true even if it just simplified the loop latch without making any other changes Simplifying a loop latch changes the IR and we need to make sure the pass manager knows to invalidate analysis passes if that happened. PR35210 discovered a case where we failed to invalidate the post dominator tree after this simplification because we no changes other than simplifying the loop latch. Fixes PR35210. Differential Revision: https://reviews.llvm.org/D40035 llvm-svn: 318237	2017-11-15 00:22:42 +00:00
Sanjay Patel	64fd333304	[Reassociate] use dyn_cast instead of isa+cast; NFCI llvm-svn: 318212	2017-11-14 23:03:56 +00:00
Hans Wennborg	e1ecd61b98	Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit. This is useful for getting a trace of how the functions in a program are executed. Normally, the calls remain even if a function is inlined into another function, but it is useful to be able to turn this off for users who are interested in a lower-level trace, i.e. one that reflects what functions are called post-inlining. (We use this to generate link order files for Chromium.) LLVM already has a pass for inserting similar instrumentation calls to mcount(), which it does after inlining. This patch renames and extends that pass to handle calls both to mcount and the cygprofile functions, before and/or after inlining as controlled by function attributes. Differential Revision: https://reviews.llvm.org/D39287 llvm-svn: 318195	2017-11-14 21:09:45 +00:00
Jatin Bhateja	c61ade1ca0	[SCEV] Handling for ICmp occuring in the evolution chain. Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: sanjoy, junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 318050	2017-11-13 16:43:24 +00:00
Daniel Neilson	6e4aa1e481	Expand IRBuilder interface for atomic memcpy to require pointer alignments. (NFC) Summary: The specification of the @llvm.memcpy.element.unordered.atomic intrinsic requires that the pointer arguments have alignments of at least the element size. The existing IRBuilder interface to create a call to this intrinsic does not allow for providing the alignment of these pointer args. Having an interface that makes it easy to construct invalid intrinsic calls doesn't seem sensible, so this patch simply adds the requirement that one provide the argument alignments when using IRBuilder to create atomic memcpy calls. llvm-svn: 317918	2017-11-10 19:38:12 +00:00
Sanjoy Das	6fabb90765	[CVP] Remove some {s\|u}add.with.overflow checks. Summary: This adds logic to CVP to remove some overflow checks. It uses LVI to remove operations with at least one constant. Specifically, this can remove many overflow intrinsics immediately following an overflow check in the source code, such as: if (x < INT_MAX) ... x + 1 ... Patch by Joel Galenson! Reviewers: sanjoy, regehr Reviewed By: sanjoy Subscribers: fhahn, pirama, srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D39483 llvm-svn: 317911	2017-11-10 19:13:35 +00:00
Paul Robinson	b46256b0b4	Fix out-of-order stepping behavior in programs with hoisted constants. When the Constant Hoisting pass moves expensive constants into a common block, it would assign a debug location equal to the last use of that constant. While this is certainly intuitive, it places the constant in an out-of-order location, according to the debug location information. This produces out-of-order stepping when debugging programs affected by this pass. This patch creates in-order stepping behavior by merging the debug locations for hoisted constants, and the new insertion point. Patch by Matthew Voss! Differential Revision: https://reviews.llvm.org/D38088 llvm-svn: 317827	2017-11-09 20:01:31 +00:00
Sanjay Patel	0d66010454	[Reassociate] don't name values "tmp"; NFCI The toxic stew of created values named 'tmp' and tests that already have values named 'tmp' and CHECK lines looking for values named 'tmp' causes bad things to happen in our test line auto-generation scripts because it wants to use 'TMP' as a prefix for unnamed values. Use less 'tmp' to avoid that. llvm-svn: 317818	2017-11-09 18:14:24 +00:00
Serguei Katkov	722339e405	[GVN PRE] Patch the source for Phi node in PRE We must patch all existing incoming values of Phi node, otherwise it is possible that we can see poison where program does not expect to see it. This is the similar what GVN does. The added test test/Transforms/GVN/PRE/pre-jt-add.ll shows an example of wrong optimization done by jump threading due to GVN PRE did not patch existing incoming value. Reviewers: mkazantsev, wmi, dberlin, davide Reviewed By: dberlin Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D39637 llvm-svn: 317768	2017-11-09 06:02:18 +00:00
Dan Gohman	2c74fe977d	Add an @llvm.sideeffect intrinsic This patch implements Chandler's idea [0] for supporting languages that require support for infinite loops with side effects, such as Rust, providing part of a solution to bug 965 [1]. Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual effect, but which appears to optimization passes to have obscure side effects, such that they don't optimize away loops containing it. It also teaches several optimization passes to ignore this intrinsic, so that it doesn't significantly impact optimization in most cases. As discussed on llvm-dev [2], this patch is the first of two major parts. The second part, to change LLVM's semantics to have defined behavior on infinite loops by default, with a function attribute for opting into potential-undefined-behavior, will be implemented and posted for review in a separate patch. [0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html [1] https://bugs.llvm.org/show_bug.cgi?id=965 [2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html Differential Revision: https://reviews.llvm.org/D38336 llvm-svn: 317729	2017-11-08 21:59:51 +00:00
Adrian Prantl	25a09dd408	Make DIExpression::createFragmentExpression() return an Optional. We can't safely split arithmetic into multiple fragments because we can't express carry-over between fragments. llvm-svn: 317534	2017-11-07 00:45:34 +00:00
Xinliang David Li	a531f189fc	Fix comment /NFC llvm-svn: 317514	2017-11-06 21:57:51 +00:00
Sanjay Patel	629c411538	[IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html and again more recently: http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html ...this is a step in cleaning up our fast-math-flags implementation in IR to better match the capabilities of both clang's user-visible flags and the backend's flags for SDNode. As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic reassociation - 'AllowReassoc'. We're also adding a bit to allow approximations for library functions called 'ApproxFunc' (this was initially proposed as 'libm' or similar). ...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), but that's apparently already used for other purposes. Also, I don't think we can just add a field to FPMathOperator because Operator is not intended to be instantiated. We'll defer movement of FMF to another day. We keep the 'fast' keyword. I thought about removing that, but seeing IR like this: %f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2 ...made me think we want to keep the shortcut synonym. Finally, this change is binary incompatible with existing IR as seen in the compatibility tests. This statement: "Newer releases can ignore features from older releases, but they cannot miscompile them. For example, if nsw is ever replaced with something else, dropping it would be a valid way to upgrade the IR." ( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility ) ...provides the flexibility we want to make this change without requiring a new IR version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will fail to optimize some previously 'fast' code because it's no longer recognized as 'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'. Note: an inter-dependent clang commit to use the new API name should closely follow commit. Differential Revision: https://reviews.llvm.org/D39304 llvm-svn: 317488	2017-11-06 16:27:15 +00:00
Davide Italiano	c7c05ae4be	[CallSiteSplitting] clang-format my last commit. NFCI. Thanks to Rui for pointing out. llvm-svn: 317393	2017-11-04 00:44:01 +00:00
Davide Italiano	91b4790b33	[CallSiteSplitting] Silence GCC's -Wparentheses. NFCI. llvm-svn: 317385	2017-11-03 23:03:38 +00:00
Jun Bum Lim	0c99007db1	Recommit r317351 : Add CallSiteSplitting pass This recommit r317351 after fixing a buildbot failure. Original commit message: Summary: This change add a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g, inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr \|\| c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] llvm-svn: 317362	2017-11-03 20:41:16 +00:00
Jun Bum Lim	0eb1c2d63a	Revert "Add CallSiteSplitting pass" Revert due to Buildbot failure. This reverts commit r317351. llvm-svn: 317353	2017-11-03 19:17:11 +00:00
Jun Bum Lim	2a58933519	Add CallSiteSplitting pass Summary: This change add a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g, inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr \|\| c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide Reviewed By: davidxl Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D39137 llvm-svn: 317351	2017-11-03 19:01:57 +00:00
Evgeny Stupachenko	d699de2b50	The patch fixes PR35131 Summary: Fix a misprint which led to false CTLZ recognition. Reviewers: craig.topper Differential Revision: https://reviews.llvm.org/D39585 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 317348	2017-11-03 18:50:03 +00:00
Jun Bum Lim	f5fb3d745d	[LICM] sink through non-trivially replicable PHI Summary: The current LICM allows sinking an instruction only when it is exposed to exit blocks through a trivially replacable PHI of which all incoming values are the same instruction. This change enhance LICM to sink a sinkable instruction through non-trivially replacable PHIs by spliting predecessors of loop exits. Reviewers: hfinkel, majnemer, davidxl, bmakam, mcrosier, danielcdh, efriedma, jtony Reviewed By: efriedma Subscribers: nemanjai, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D37163 llvm-svn: 317335	2017-11-03 16:24:53 +00:00
Anna Thomas	6879721453	[LoopPredication] NFC: Refactored code to separate out functions being reused Summary: Refactored the code to separate out common functions that are being reused. This is to reduce the changes for changes coming up wrt loop predication with reverse loops. This refactoring is what we have in our downstream code. llvm-svn: 317324	2017-11-03 14:25:39 +00:00
Mikael Holmen	6018104d5e	[ADCE] Use MapVector for BlockInfo to make iteration order deterministic Summary: Also added a reserve() method to MapVector since we want to use that from ADCE. DenseMap does not provide deterministic iteration order so with that we will handle the members of BlockInfo in random order, eventually leading to random order of the blocks in the predecessor lists. Without this change, I get the same predecessor order in about 90% of the time when I compile a certain reproducer and in 10% I get a different one. No idea how to make a proper test case for this. Reviewers: kuhar, david2050 Reviewed By: kuhar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39593 llvm-svn: 317323	2017-11-03 14:15:08 +00:00
Vedant Kumar	9196ed1be1	[LSR] Clarify a comment. NFC. llvm-svn: 317295	2017-11-03 01:01:28 +00:00
Adrian Prantl	fbb6fbf709	IndVarSimplify: preserve debug information attached to widened PHI nodes. This fixes PR35015. https://bugs.llvm.org/show_bug.cgi?id=35015 Differential Revision: https://reviews.llvm.org/D39345 llvm-svn: 317282	2017-11-02 23:17:06 +00:00
Anna Thomas	1d02b13eb7	[LoopPredication] Enable predication when latchCheckIV is wider than rangeCheck Summary: This patch allows us to predicate range checks that have a type narrower than the latch check type. We leverage SCEV analysis to identify a truncate for the latchLimit and latchStart. There is also safety checks in place which requires the start and limit to be known at compile time. We require this to make sure that the SCEV truncate expr for the IV corresponding to the latch does not cause us to lose information about the IV range. Added tests show the loop predication over range checks that are of various types and are narrower than the latch type. This enhancement has been in our downstream tree for a while. Reviewers: apilipenko, sanjoy, mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39500 llvm-svn: 317269	2017-11-02 21:21:02 +00:00
Anna Thomas	729dafc16b	Strip off invariant.start because memory locations arent invariant The original change was reverted in rL317217 because of the failure in the RS4GC testcase. I couldn't reproduce the failure on my local machine (macbook) but could reproduce it on a linux box. The failure was around removing the uses of invariant.start. The fix here is to just RAUW undef (which was the first implementation in D39388). This is perfectly valid IR as discussed in the review. llvm-svn: 317225	2017-11-02 18:24:04 +00:00
Anna Thomas	ebe429d99f	Revert "[RS4GC] Strip off invariant.start because memory locations arent invariant" This reverts commit r317215, investigating the test failure. llvm-svn: 317217	2017-11-02 16:45:51 +00:00
Anna Thomas	486a7aaa31	[RS4GC] Strip off invariant.start because memory locations arent invariant Summary: Invariant.start on memory locations has the property that the memory location is unchanging. However, this is not true in the face of rewriting statepoints for GC. Teach RS4GC about removing invariant.start so that optimizations after RS4GC does not incorrect sink a load from the memory location past a statepoint. Added test showcasing the issue. Reviewers: reames, apilipenko, dneilson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39388 llvm-svn: 317215	2017-11-02 16:23:31 +00:00
Clement Courbet	82bade615b	Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass." undefined reference to `llvm::TargetPassConfig::ID' on clang-ppc64le-linux-multistage This reverts commit eea333c33fa73ad225ef28607795984829f65688. llvm-svn: 317213	2017-11-02 15:53:10 +00:00
Clement Courbet	1dc37b9c3b	[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass. Summary: This is mostly a noop (most of the test diffs are renamed blocks). There are a few temporary register renames (eax<->ecx) and a few blocks are shuffled around. See the discussion in PR33325 for more details. Reviewers: spatel Subscribers: mgorny Differential Revision: https://reviews.llvm.org/D39456 llvm-svn: 317211	2017-11-02 15:02:51 +00:00
Adrian Prantl	98c6549e4a	loop-rotate: avoid duplicating dbg.value intrinsics in the entry block. This fixes the second half of PR35113. This reapplies r317106 without modifications. llvm-svn: 317121	2017-11-01 20:53:22 +00:00
Adrian Prantl	40a0ea5f29	Revert r317106 to facilitate reverting r317105. llvm-svn: 317109	2017-11-01 18:06:35 +00:00
Adrian Prantl	9259f21604	loop-rotate: avoid duplicating dbg.value intrinsics in the entry block. This fixes the second half of PR35113. llvm-svn: 317106	2017-11-01 17:28:50 +00:00
Max Kazantsev	6f5229d7da	Revert rL311205 "[IRCE] Fix buggy behavior in Clamp" This patch reverts rL311205 that was initially a wrong fix. The real problem was in intersection of signed and unsigned ranges (see rL316552), and the patch being reverted masked the problem instead of fixing it. By now, the test against which rL311205 was made works OK even without this code. This revert patch also contains a test case that demonstrates incorrect behavior caused by rL311205: it is caused by incorrect choise of signed max instead of unsigned. llvm-svn: 317088	2017-11-01 13:21:56 +00:00
Adrian Prantl	deb437b038	loop-rotate: simplify code by using llvm::findDbgValues(). (NFC) llvm-svn: 317037	2017-10-31 21:03:22 +00:00
Max Kazantsev	84286ce5dd	[IRCE][NFC] Rename fields of InductiveRangeCheck Rename `Offset`, `Scale`, `Length` into `Begin`, `Step`, `End` respectively to make naming of similar entities for Ranges and Range Checks more consistent. Differential Revision: https://reviews.llvm.org/D39414 llvm-svn: 316979	2017-10-31 06:19:05 +00:00
Max Kazantsev	21e7b53490	[NFC] Get rid of variables used in assert only llvm-svn: 316977	2017-10-31 05:33:58 +00:00
Max Kazantsev	488ec975bb	Reapply "[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors" This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited to hoist across such instructions because there is no guarantee that the load standing after such instruction is still valid before such instruction. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 316975	2017-10-31 05:07:56 +00:00
Yaxun Liu	d23f23d81c	InferAddressSpaces: Fix bug about replacing addrspacecast InferAddressSpaces assumes the pointee type of addrspacecast is the same as the operand, which is not always true and causes invalid IR. This bug cause build failure in HCC. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39432 llvm-svn: 316957	2017-10-30 21:19:41 +00:00
Davide Italiano	834b45129b	[NewGVN] Stop assuming PHI args ordering when looking at phi-of-ops. It's not guaranteed. There's a bug open to sort them in predecessor order, but it won't happen anytime soon. In the meanwhile, passes will have to do an O(#preds) scan. Such is life. llvm-svn: 316953	2017-10-30 20:20:16 +00:00
Mandeep Singh Grang	f83268bd9e	[GVNHoist] Fix non-deterministic sort order of PHIs for identical instructions Summary: This fixes failure in Transforms/GVNHoist/hoist.ll uncovered by D39245. Reviewers: hiraditya, spop, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39410 llvm-svn: 316949	2017-10-30 19:42:41 +00:00
Clement Courbet	b2c3eb8cf1	[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2). - Targets that want to support memcmp expansions now return the list of supported load sizes. - Expansion codegen does not assume that all power-of-two load sizes smaller than the max load size are valid. For examples, this is not the case for x86(32bit)+sse2. Fixes PR34887. llvm-svn: 316905	2017-10-30 14:19:33 +00:00
Florian Hahn	d0208b4b1c	Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP. This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addressed some additional nits. Original commit message: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 316891	2017-10-30 10:07:42 +00:00
Max Kazantsev	390fc57771	[IRCE][NFC] Store Length as SCEV in RangeCheck instead of Value llvm-svn: 316889	2017-10-30 09:35:16 +00:00
Florian Hahn	d18443edad	Revert r316887 to fix buildbot failures. llvm-svn: 316888	2017-10-30 09:21:50 +00:00
Florian Hahn	925d3e4a98	Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP. This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addressed some additional nits. Original commit message: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 316887	2017-10-30 09:04:18 +00:00
Max Kazantsev	1d7c0439b9	[GVN][NFC] Mark instruction for deletion instead of immediate erasing in LoadPRE It is done to uniformly handle instructions removal. Differential Revision: https://reviews.llvm.org/D39369 llvm-svn: 316884	2017-10-30 04:48:34 +00:00
Sanjay Patel	b049173157	[SimplifyCFG] use pass options and remove the latesimplifycfg pass This is no-functional-change-intended. This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411 (defer folding unconditional branches) with pass parameters rather than a named "latesimplifycfg" pass. Now that we have individual options to control the functionality, we could decouple when these fire (but that's an independent patch if desired). The next planned step would be to add another option bit to disable the sinking transform mentioned in D38566. This should also make it clear that the new pass manager needs to be updated to limit simplifycfg in the same way as the old pass manager. Differential Revision: https://reviews.llvm.org/D38631 llvm-svn: 316835	2017-10-28 18:43:07 +00:00
Craig Topper	49687104d6	[PartialInlineLibCalls] Teach PartialInlineLibCalls to honor nobuiltin, properly check the function signature, and check TLI::has Summary: We shouldn't do this transformation if the function is marked nobuitlin. We were only checking that the return type is floating point, we really should be checking the argument types and argument count as well. This can be accomplished by using the other version of getLibFunc that takes the Function and not just the name. We should also be checking TLI::has since sqrtf is a macro on Windows. Fixes PR32559. Reviewers: hfinkel, spatel, davide, efriedma Reviewed By: davide, efriedma Subscribers: efriedma, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D39381 llvm-svn: 316819	2017-10-28 00:36:58 +00:00
Artur Pilipenko	8aadc643cf	[LoopPredication] Handle the case when the guard and the latch IV have different offsets This is a follow up change for D37569. Currently the transformation is limited to the case when: * The loop has a single latch with the condition of the form: ++i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=. * The step of the IV used in the latch condition is 1. * The IV of the latch condition is the same as the post increment IV of the guard condition. * The guard condition is of the form i u< guardLimit. This patch enables the transform in the case when the latch is latchStart + i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=. And the guard is guardStart + i u< guardLimit Reviewed By: anna Differential Revision: https://reviews.llvm.org/D39097 llvm-svn: 316768	2017-10-27 14:46:17 +00:00
Max Kazantsev	665907c3c2	[GVN][NFC] Refactor loop iteration with foreach llvm-svn: 316748	2017-10-27 08:19:35 +00:00
Eugene Zelenko	57bd5a0274	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316724	2017-10-27 01:09:08 +00:00
Philip Reames	21cc2fa3f6	[LICM] Restructure implicit exit handling to be more clear [NFCI] When going to explain this to someone else, I got tripped up by the complicated meaning of IsKnownNonEscapingObject in load-store promotion. Extract a helper routine and clarify naming/scopes to make this a bit more obvious. llvm-svn: 316699	2017-10-26 21:00:15 +00:00
Eugene Zelenko	5c2aecef78	[Transforms] Revert r316630 changes in Scalar/MergeICmps.cpp to fix broken build bots (NFC). llvm-svn: 316634	2017-10-26 01:25:14 +00:00
Eugene Zelenko	5adb96cc92	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316630	2017-10-26 00:55:39 +00:00
Max Kazantsev	9ac7021a25	[IRCE] Fix intersection between signed and unsigned ranges IRCE for unsigned latch conditions was temporarily disabled by rL314881. The motivating example contained an unsigned latch condition and a signed range check. One of the safe iteration ranges was `[1, SINT_MAX + 1]`. Its right border was incorrectly interpreted as a negative value in `IntersectRange` function, this lead to a miscompile under which we deleted a range check without inserting a postloop where it was needed. This patch brings back IRCE for unsigned latch conditions. Now we treat range intersection more carefully. If the latch condition was unsigned, we only try to consider a range check for deletion if: 1. The range check is also unsigned, or 2. Safe iteration range of the range check lies within `[0, SINT_MAX]`. The same is done for signed latch. Values from `[0, SINT_MAX]` are unambiguous, these values are non-negative under any interpretation, and all values of a range intersected with such range are also non-negative. We also use signed/unsigned min/max functions for range intersection depending on type of the latch condition. Differential Revision: https://reviews.llvm.org/D38581 llvm-svn: 316552	2017-10-25 06:47:39 +00:00
Max Kazantsev	4332a943bc	[IRCE] Smarter detection of empty ranges using SCEV For a SCEV range, this patch replaces the naive emptiness check for SCEV ranges which looks like `Begin == End` with a SCEV check. The range is guaranteed to be empty of `Begin >= End`. We should filter such ranges out and do not try to perform IRCE for them. For example, we can get such range when intersecting range `[A, B)` and `[C, D)` where `A < B < C < D`. The resulting range is `[max(A, C), min(B, D)) = [C, B)`. This range is empty, but its `Begin` does not match with `End`. Making IRCE for an empty range is basically safe but unprofitable because we never actually get into the main loop where the range checks are supposed to be eliminated. This patch uses SCEV mechanisms to treat loops with proved `Begin >= End` as empty. Differential Revision: https://reviews.llvm.org/D39082 llvm-svn: 316550	2017-10-25 06:10:02 +00:00
Eugene Zelenko	7f0f9bc5ab	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316503	2017-10-24 21:24:53 +00:00
Artem Belevich	cb8f6328dc	[NVPTX] allow address space inference for volatile loads/stores. If particular target supports volatile memory access operations, we can avoid AS casting to generic AS. Currently it's only enabled in NVPTX for loads and stores that access global & shared AS. Differential Revision: https://reviews.llvm.org/D39026 llvm-svn: 316495	2017-10-24 20:31:44 +00:00
Mandeep Singh Grang	9ed81c66ce	[GVNSink] Fix failing GVNSink tests in the reverse iteration bot Summary: The elts of ActivePreds which is defined as a SmallPtrSet are copied into Blocks using std::copy. This makes the resultant order of Blocks non-deterministic. We cannot simply sort Blocks as they need to match the corresponding Values. So a better approach is to define ActivePreds as SmallSetVector. This fixes the following failures in http://lab.llvm.org:8011/builders/reverse-iteration: LLVM :: Transforms/GVNSink/indirect-call.ll LLVM :: Transforms/GVNSink/sink-common-code.ll LLVM :: Transforms/GVNSink/struct.ll Reviewers: dberlin, jmolloy, bkramer, efriedma Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39025 llvm-svn: 316369	2017-10-23 19:56:52 +00:00
Sanjay Patel	b80daf0b48	[SimplifyCFG] delay switch condition forwarding to -latesimplifycfg As discussed in D39011: https://reviews.llvm.org/D39011 ...replacing constants with a variable is inverting the transform done by other IR passes, so we definitely don't want to do this early. In fact, it's questionable whether this transform belongs in SimplifyCFG at all. I'll look at moving this to codegen as a follow-up step. llvm-svn: 316298	2017-10-22 19:10:07 +00:00
David Green	907b60fbba	[LoopInterchange] Fix phi node ordering miscompile. The way that splitInnerLoopHeader splits blocks requires that the induction PHI will be the first PHI in the inner loop header. This makes sure that is actually the case when there are both IV and reduction phis. Differential Revision: https://reviews.llvm.org/D38682 llvm-svn: 316261	2017-10-21 13:58:37 +00:00
Eugene Zelenko	99241d75c1	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316241	2017-10-20 21:47:29 +00:00
Simon Pilgrim	0444e4fcd4	Fix MSVC signed/unsigned comparison warning llvm-svn: 316161	2017-10-19 15:00:31 +00:00
Max Kazantsev	3612d4b4f9	[NFC][IRCE] Filter out empty ranges early llvm-svn: 316146	2017-10-19 05:33:28 +00:00
Sanjoy Das	2f27456c82	Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain." This reverts commit r316054. There was some confusion over the review process: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html llvm-svn: 316129	2017-10-18 22:00:57 +00:00
Eugene Zelenko	306d29977d	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316128	2017-10-18 21:46:47 +00:00
Jatin Bhateja	1fc49627e4	[ScalarEvolution] Handling for ICmp occuring in the evolution chain. Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. Currently scope of evaluation is limited to SCEV computation for PHI nodes. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 316054	2017-10-18 01:36:16 +00:00
Eugene Zelenko	6cadde7f40	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 316034	2017-10-17 21:27:42 +00:00
Philip Reames	6a7bbfb2e2	Revert 315440 on behalf of mkazantsev This patch reverts rL315440 because of the bug described at https://bugs.llvm.org/show_bug.cgi?id=34937 The fix for the bug is on review as D38944, but not yet ready. Given this is a regression reverting until a fix is ready is called for. Max would have done the revert himself, but is having trouble doing a build of fresh LLVM for some reason. I did the build and test to ensure the revert worked as expected on his behalf. llvm-svn: 315974	2017-10-17 06:21:07 +00:00
Craig Topper	91259e2681	[JumpThreading] Move two PredValueInfoTy vectors to a scope closer to their usage. NFCI llvm-svn: 315941	2017-10-16 21:54:13 +00:00
Eugene Zelenko	dd40f5e7c1	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315940	2017-10-16 21:34:24 +00:00
Aaron Ballman	615eb47035	Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people. Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1 llvm-svn: 315854	2017-10-15 14:32:27 +00:00
Hongbin Zheng	73f650435b	[LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC. This avoid code duplication and allow us to add the disable unroll metadata elsewhere. Differential Revision: https://reviews.llvm.org/D38928 llvm-svn: 315850	2017-10-15 07:31:02 +00:00
Eugene Zelenko	3b87939604	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315760	2017-10-13 21:17:07 +00:00
Matthew Simpson	2284937bbc	[IPSCCP] Move common functions to ValueLatticeUtils (NFC) This patch moves some common utility functions out of IPSCCP and makes them available globally. The functions determine if interprocedural data-flow analyses can propagate information through function returns, arguments, and global variables. Differential Revision: https://reviews.llvm.org/D37638 llvm-svn: 315719	2017-10-13 17:53:44 +00:00
Daniel Neilson	fa14ebd138	[RS4GC] Look through vector bitcasts when looking for base pointer Summary: In RS4GC it is possible that a base pointer is contained in a vector that has undergone a bitcast from one element-pointertype to another. We teach RS4GC how to look through bitcasts of vector types when looking for a base pointer. Reviewers: anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38849 llvm-svn: 315694	2017-10-13 15:59:13 +00:00
Daniel Jasper	3344a21236	Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode" Significantly reduces performancei (~30%) of gipfeli (https://github.com/google/gipfeli) I have not yet managed to reproduce this regression with the open-source version of the benchmark on github, but will work with others to get a reproducer to you later today. llvm-svn: 315680	2017-10-13 14:04:21 +00:00
Anna Thomas	61aec18d46	[CVP] Process binary operations even when def is local Summary: This patch adds processing of binary operations when the def of operands are in the same block (i.e. local processing). Earlier we bailed out in such cases (the bail out was introduced in rL252032) because LVI at that time was more precise about context at the end of basic blocks, which implied local def and use analysis didn't benefit CVP. Since then we've added support for LVI in presence of assumes and guards. The test cases added show how local def processing in CVP helps adding more information to the ashr, sdiv, srem and add operators. Note: processCmp which suffers from the same problem will be handled in a later patch. Reviewers: philip, apilipenko, SjoerdMeijer, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38766 llvm-svn: 315634	2017-10-12 22:39:52 +00:00
Artur Pilipenko	ead69ee4bd	[LoopPredication] Check whether the loop is already guarded by the first iteration check condition llvm-svn: 315623	2017-10-12 21:21:17 +00:00
Bruno Cardoso Lopes	993d2e67d8	Revert "Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP."" This reverts commit r315593: still affect two bots: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/5308 http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21751/ llvm-svn: 315618	2017-10-12 20:52:34 +00:00
Artur Pilipenko	b4527e1ce2	[LoopPredication] Support ule, sle latch predicates This is a follow up for the loop predication change 313981 to support ule, sle latch predicates. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D38177 llvm-svn: 315616	2017-10-12 20:40:27 +00:00
Bruno Cardoso Lopes	326fdcbff8	Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP." This is r315288 & r315294, which were reverted due to stage2 bot failures. Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315593	2017-10-12 16:54:11 +00:00
Don Hinton	3e0199f7eb	[dump] Remove NDEBUG from test to enable dump methods [NFC] Summary: Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP. Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods. Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so it'll be picked up by public headers. Differential Revision: https://reviews.llvm.org/D38406 llvm-svn: 315590	2017-10-12 16:16:06 +00:00
Hongbin Zheng	d36f2030e2	[SimplifyIndVar] Replace IVUsers with loop invariant whenever possible Differential Revision: https://reviews.llvm.org/D38415 llvm-svn: 315551	2017-10-12 02:54:11 +00:00
Zachary Turner	41a9ee98f9	Revert "[ADT] Make Twine's copy constructor private." This reverts commit 4e4ee1c507e2707bb3c208e1e1b6551c3015cbf5. This is failing due to some code that isn't built on MSVC so I didn't catch. Not immediately obvious how to fix this at first glance, so I'm reverting for now. llvm-svn: 315536	2017-10-11 23:54:34 +00:00
Zachary Turner	337462b365	[ADT] Make Twine's copy constructor private. There's a lot of misuse of Twine scattered around LLVM. This ranges in severity from benign (returning a Twine from a function by value that is just a string literal) to pretty sketchy (storing a Twine by value in a class). While there are some uses for copying Twines, most of the very compelling ones are confined to the Twine class implementation itself, and other uses are either dubious or easily worked around. This patch makes Twine's copy constructor private, and fixes up all callsites. Differential Revision: https://reviews.llvm.org/D38767 llvm-svn: 315530	2017-10-11 23:33:06 +00:00
Vivek Pandya	9590658fb8	[NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure parameterized emit() calls Summary: This is not functional change to adopt new emit() API added in r313691. Reviewed By: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38285 llvm-svn: 315476	2017-10-11 17:12:59 +00:00
Max Kazantsev	fecaff1bd9	[NFC] Fix variables used only for assert in GVN llvm-svn: 315448	2017-10-11 10:31:49 +00:00
Max Kazantsev	3b81809e06	[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited to hoist across such instructions because there is no guarantee that the load standing after such instruction is still valid before such instruction. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 315440	2017-10-11 08:10:43 +00:00
Max Kazantsev	0c8dd052b8	[LICM] Disallow sinking of unordered atomic loads into loops Sinking of unordered atomic load into loop must be disallowed because it turns a single load into multiple loads. The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section. Here is the full text of this section: > Notes for optimizers > In terms of the optimizer, this prohibits any transformation that > transforms a single load into multiple loads, transforms a store into > multiple stores, narrows a store, or stores a value which would not be > stored otherwise. Some examples of unsafe optimizations are narrowing > an assignment into a bitfield, rematerializing a load, and turning loads > and stores into a memcpy call. Reordering unordered operations is safe, > though, and optimizers should take advantage of that because unordered > operations are common in languages that need them. Patch by Daniil Suchkov! Reviewed By: reames Differential Revision: https://reviews.llvm.org/D38392 llvm-svn: 315438	2017-10-11 07:26:45 +00:00
Max Kazantsev	25d8655dc2	[IRCE] Do not process empty safe ranges IRCE should not apply when the safe iteration range is proved to be empty. In this case we do unneeded job creating pre/post loops and then never go to the main loop. This patch makes IRCE not apply to empty safe ranges, adds test for this situation and also modifies one of existing tests where it used to happen slightly. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D38577 llvm-svn: 315437	2017-10-11 06:53:07 +00:00
Davide Italiano	e2138fe41b	[GVN] Don't replace constants with constants. This fixes PR34908. Patch by Alex Crichton! Differential Revision: https://reviews.llvm.org/D38765 llvm-svn: 315429	2017-10-11 04:21:51 +00:00
Bruno Cardoso Lopes	57304923ca	Revert "[SCCP] Propagate integer range info for parameters in IPSCCP." This reverts commit r315288. This is part of fixing segfault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315329	2017-10-10 16:37:57 +00:00
Bruno Cardoso Lopes	122c4b3c8c	Revert "[SCCP] Fix mem-sanitizer failure introduced by r315288." This reverts commit r315294. Part of fixing seg fault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315328	2017-10-10 16:37:51 +00:00
Florian Hahn	7d2375df30	[SCCP] Fix mem-sanitizer failure introduced by r315288. llvm-svn: 315294	2017-10-10 10:33:45 +00:00
Florian Hahn	22a44bca40	[SCCP] Propagate integer range info for parameters in IPSCCP. Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315288	2017-10-10 09:32:38 +00:00
Clement Courbet	e2e8a5c496	Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed stability issues) This reverts commit d6492333d3b478a1d88163315002022f8d5e58dc. llvm-svn: 315281	2017-10-10 08:00:45 +00:00
Adam Nemet	0965da2055	Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.* Sync it up with the name of the class actually defined here. This has been bothering me for a while... llvm-svn: 315249	2017-10-09 23:19:02 +00:00
Clement Courbet	d12c189e2e	Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." Still a few stability issues on windows. This reverts commit 67e3db9bc121ba244e20337aabc7cf341a62b545. llvm-svn: 315058	2017-10-06 13:02:24 +00:00

... 4 5 6 7 8 ...

9045 Commits