llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	3052290dc0	Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. This version of the patch fixes cleaning up ssa_copy intrinsics, so it does not crash for instructions in blocks that have been marked unreachable. This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin Differential Revision: https://reviews.llvm.org/D45330 llvm-svn: 340525	2018-08-23 11:04:00 +00:00
Alina Sbirlea	8b83d68544	Update MemorySSA in LoopSimplifyCFG. Summary: Add MemorySSA as a dependency to LoopSimplifyCFG and preserve it. Disabled by default until all passes preserve MemorySSA. Reviewers: bogner, chandlerc Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D50911 llvm-svn: 340445	2018-08-22 20:10:21 +00:00
Alina Sbirlea	c1a216b251	Update MemorySSA in LoopInstSimplify. Summary: Add MemorySSA as a depency to LoopInstInstSimplify and preserve it. Disabled by default until all passes preserve MemorySSA. Reviewers: chandlerc Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D50906 llvm-svn: 340444	2018-08-22 20:05:21 +00:00
Max Kazantsev	611d645a08	[GuardWidening] Ignore guards with trivial conditions Guard widening should not spend efforts on dealing with guards with trivial true/false conditions. Such guards can easily be eliminated by any further cleanup pass like instcombine. However we should not unconditionally delete them because it may be profitable to widen other conditions into such guards. Differential Revision: https://reviews.llvm.org/D50247 Reviewed By: fedor.sergeev llvm-svn: 340381	2018-08-22 02:40:49 +00:00
Alina Sbirlea	ab6f84f763	Update MemorySSA in BasicBlockUtils. Summary: Extend BasicBlocksUtils to update MemorySSA. Subscribers: sanjoy, arsenm, nhaehnle, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D45300 llvm-svn: 340365	2018-08-21 23:32:03 +00:00
Marcello Maggioni	883fe455f1	[LICM] Refactor some AliasSetTracker code to get rid of new/deletes. NFC Differential Revision: https://reviews.llvm.org/D51024 llvm-svn: 340333	2018-08-21 20:30:14 +00:00
Florian Hahn	9583d4fa03	[GVN] Assign new value number to calls reading memory, if there is no MemDep info. Currently we assign the same value number to two calls reading the same memory location if we do not have MemoryDependence info. Without MemDep Info we cannot guarantee that there is no store between the two calls, so we have to assign a new number to the second call. It also adds a new option EnableMemDep to enable/disable running MemoryDependenceAnalysis and also renamed NoLoads to NoMemDepAnalysis to be more explicit what it does. As it also impacts calls that read memory, NoLoads is a bit confusing. Reviewers: efriedma, sebpop, john.brawn, wmi Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D50893 llvm-svn: 340319	2018-08-21 19:11:27 +00:00
Philip Reames	c3c23e8cf2	[AST] Remove notion of volatile from alias sets [NFCI] Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet. It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two remove asserts.) There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for.. llvm-svn: 340312	2018-08-21 17:59:11 +00:00
Max Kazantsev	097ef69182	[LICM] Hoist guards with invariant conditions This patch teaches LICM to hoist guards from the loop if they are guaranteed to execute and if there are no side effects that could prevent that. Differential Revision: https://reviews.llvm.org/D50501 Reviewed By: reames llvm-svn: 340256	2018-08-21 08:11:31 +00:00
Jun Lim	da5864c73c	Test commit I just removed a blank space. llvm-svn: 340069	2018-08-17 18:40:41 +00:00
Florian Hahn	19f9e32f07	[InstrSimplify,NewGVN] Add option to ignore additional instr info when simplifying. NewGVN uses InstructionSimplify for simplifications of leaders of congruence classes. It is not guaranteed that the metadata or other flags/keywords (like nsw or exact) of the leader is available for all members in a congruence class, so we cannot use it for simplification. This patch adds a InstrInfoQuery struct with a boolean field UseInstrInfo (which defaults to true to keep the current behavior as default) and a set of helper methods to get metadata/keywords for a given instruction, if UseInstrInfo is true. The whole thing might need a better name, to avoid confusion with TargetInstrInfo but I am not sure what a better name would be. The current patch threads through InstrInfoQuery to the required places, which is messier then it would need to be, if InstructionSimplify and ValueTracking would share the same Query struct. The reason I added it as a separate struct is that it can be shared between InstructionSimplify and ValueTracking's query objects. Also, some places do not need a full query object, just the InstrInfoQuery. It also updates some interfaces that do not take a Query object, but a set of optional parameters to take an additional boolean UseInstrInfo. See https://bugs.llvm.org/show_bug.cgi?id=37540. Reviewers: dberlin, davide, efriedma, sebpop, hiraditya Reviewed By: hiraditya Differential Revision: https://reviews.llvm.org/D47143 llvm-svn: 340031	2018-08-17 14:39:04 +00:00
Anna Thomas	1962621a7e	[LICM] Add a diagnostic analysis for identifying alias information Summary: Currently, in LICM, we use the alias set tracker to identify if the instruction (we're interested in hoisting) aliases with instruction that modifies that memory location. This patch adds an LICM alias analysis diagnostic tool that checks the mod ref info of the instruction we are interested in hoisting/sinking, with every instruction in the loop. Because of O(N^2) complexity this is now only a diagnostic tool to show the limitation we have with the alias set tracker and is OFF by default. Test cases show the difference with the diagnostic analysis tool, where we're able to hoist out loads and readonly + argmemonly calls from the loop, where the alias set tracker analysis is not able to hoist these instructions out. Reviewers: reames, mkazantsev, fedor.sergeev, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50854 llvm-svn: 340026	2018-08-17 13:44:00 +00:00
Chen Zheng	e2d47dd1bb	[MISC]Fix wrong usage of std::equal() Differential Revision: https://reviews.llvm.org/D49958 llvm-svn: 340000	2018-08-17 07:51:01 +00:00
Philip Reames	0e2f9b9e30	[LICM][NFC] Restructure pointer invalidation API in terms of MemoryLocation Main value is just simplifying code. I'll further simply the argument handling case in a bit, but that involved a slightly orthogonal change so I went with the mildy ugly intermediate for this patch. Note that the isSized check in the old LICM code was not carried across. It turns out that check was dead. a) no test exercised it, and b) langref and verifier had been updated to disallow unsized types used in loads. llvm-svn: 339930	2018-08-16 20:11:15 +00:00
Max Kazantsev	72d7d649e3	[NFC] Remove const modifier to allow further development in LICM llvm-svn: 339846	2018-08-16 08:30:15 +00:00
Marcello Maggioni	e98aaf1d91	[GVN] Fix typo in IsValueFullyAvailableInBlock. NFC. DenseMap insert() method return a pair<iterator, bool> not pair<iterator, char> Noticed it and thought I might just fix it ... llvm-svn: 339777	2018-08-15 15:06:53 +00:00
Max Kazantsev	530b8d1c3d	[NFC] Refactoring of LoopSafetyInfo, step 1 Turn structure into class, encapsulate methods, add clarifying comments. Differential Revision: https://reviews.llvm.org/D50693 Reviewed By: reames llvm-svn: 339752	2018-08-15 05:55:43 +00:00
Max Kazantsev	68290f838a	[NFC][LICM] Make hoist method void Method hoist always returns true. This patch makes it void. Differential Revision: https://reviews.llvm.org/D50696 Reviewed By: hiraditya llvm-svn: 339750	2018-08-15 02:49:12 +00:00
Max Kazantsev	5c490b49c3	[GuardWidening] Widen very likely non-taken br instructions This is a second part of D49974 that handles widening of conditional branches that have very likely `false` branch. Differential Revision: https://reviews.llvm.org/D50040 Reviewed By: reames llvm-svn: 339537	2018-08-13 07:58:19 +00:00
David Green	f7111d1ece	[UnJ] Improve explicit loop count checks Try to improve the computed counts when it has been explicitly set by a pragma or command line option. This moves the code around, so that first call to computeUnrollCount to get a sensible count and override that if explicit unroll and jam counts are specified. Also added some extra debug messages for when unroll and jamming is disabled. Differential Revision: https://reviews.llvm.org/D50075 llvm-svn: 339501	2018-08-11 07:37:31 +00:00
Philip Reames	85afd1a9a0	[LICM] Hoist assumes out of loops If we have an assume which is known to execute and whose operand is invariant, we can lift that into the pre-header. So long as we don't change which paths the assume executes on, this is a legal transformation. It's likely to be a useful canonicalization as other transforms only look for dominating assumes. Differential Revision: https://reviews.llvm.org/D50364 llvm-svn: 339481	2018-08-10 22:21:56 +00:00
Philip Reames	7d79433136	[LICM] Suppress a compiler warning noticed by one of the bots llvm-svn: 339388	2018-08-09 21:15:33 +00:00
Philip Reames	ca256d93fb	[LICM] hoist fences out of loops w/o memory operations The motivating case is an otherwise dead loop with a fence in it. At the moment, this goes all the way through the optimizer and we end up emitting an entirely pointless loop on x86. This case may seem a bit contrived, but we've seen it in real code as the result of otherwise reasonable lowering strategies combined w/thread local memory optimizations (such as escape analysis). To handle this simple case, we can teach LICM to hoist must execute fences when there is no other memory operation within the loop. Differential Revision: https://reviews.llvm.org/D50489 llvm-svn: 339378	2018-08-09 20:18:42 +00:00
Alina Sbirlea	bf9fe79397	SCEV should forget all loops containing a deleted block. Summary: LoopSimplifyCFG should update ScEv for all loops after a block is deleted. If the deleted block "Succ" is part of L, then it is part of all parent loops, so forget topmost loop. Reviewers: greened, mkazantsev, sanjoy Subscribers: jlebar, javed.absar, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D50422 llvm-svn: 339363	2018-08-09 17:53:26 +00:00
Philip Reames	22b20a09a0	[LICM] Add an assert to ensure all instruction types needing aliasing are handled [NFC] llvm-svn: 339308	2018-08-09 03:44:28 +00:00
Florian Hahn	39bbe179aa	[GVN,NewGVN] Move patchReplacementInstruction to Utils/Local.h This function is shared between both implementations. I am not sure if Utils/Local.h is the best place though. Reviewers: davide, dberlin, efriedma, xbolva00 Reviewed By: efriedma, xbolva00 Differential Revision: https://reviews.llvm.org/D47337 llvm-svn: 339138	2018-08-07 13:27:33 +00:00
Max Kazantsev	640cb00365	[NFC] Factor out implicit control flow logic from GVN Logic for tracking implicit control flow instructions was added to GVN to perform PRE optimizations correctly. It appears that GVN is not the only optimization that sometimes does PRE, so this logic is required in other places (such as Jump Threading). This is an NFC patch that encapsulates all ICF-related logic in a dedicated utility class separated from GVN. Differential Revision: https://reviews.llvm.org/D40293 llvm-svn: 339086	2018-08-07 01:47:20 +00:00
Philip Reames	3b35aaacb6	[LICM] Extract a helper function for readability [NFC] llvm-svn: 339069	2018-08-06 22:07:37 +00:00
Max Kazantsev	778f62bb46	Try to fix buildbot llvm-svn: 338991	2018-08-06 06:35:21 +00:00
Max Kazantsev	eded4abef8	[GuardWidening] Widen guards with conditions of frequently taken dominated branches If there is a frequently taken branch dominated by a guard, and its condition is available at the point of the guard, we can widen guard with condition of this branch and convert the branch into unconditional: guard(cond1) if (cond2) { // taken in 99.9% cases // do something } else { // do something else } Converts to guard(cond1 && cond2) // do something Differential Revision: https://reviews.llvm.org/D49974 Reviewed By: reames llvm-svn: 338988	2018-08-06 05:49:19 +00:00
Hsiangkai Wang	ef72e481ea	[DebugInfo] Refactor DbgInfoIntrinsic class hierarchy. In the past, DbgInfoIntrinsic has a strong assumption that these intrinsics all have variables and expressions attached to them. However, it is too strong to derive the class for other debug entities. Now, it has problems for debug labels. In order to make DbgInfoIntrinsic as a base class for 'debug info', I create a class for 'variable debug info', DbgVariableIntrinsic. DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst will be derived from it. Differential Revision: https://reviews.llvm.org/D50220 llvm-svn: 338984	2018-08-06 03:59:47 +00:00
Chijun Sima	8b5de48d62	[TailCallElim] Preserve DT and PDT Summary: Previously, in the NewPM pipeline, TailCallElim recalculates the DomTree when it modifies any instruction in the Function. For example, ``` CallInst *CI = dyn_cast<CallInst>(&I); ... CI->setTailCall(); Modified = true; ... if (!Modified \|\| ...) return PreservedAnalyses::all(); ``` After applying this patch, the DomTree only recalculates if needed (plus an extra insertEdge() + an extra deleteEdge() call). When optimizing SQLite with `-passes="default<O3>"` pipeline of the newPM, the number of DomTree recalculation decreases by 6.2%, the number of nodes visited by DFS decreases by 2.9%. The time used by DomTree will decrease approximately 1%~2.5% after applying the patch. Statistics: ``` Before the patch: 23010 dom-tree-stats - Number of DomTree recalculations 489264 dom-tree-stats - Number of nodes visited by DFS -- DomTree After the patch: 21581 dom-tree-stats - Number of DomTree recalculations 475088 dom-tree-stats - Number of nodes visited by DFS -- DomTree ``` Reviewers: kuhar, dmgreen, brzycki, grosser, davide Reviewed By: kuhar, brzycki Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49982 llvm-svn: 338954	2018-08-04 08:13:47 +00:00
Chijun Sima	eacad79777	[ADCE] Remove the need of DomTree Summary: ADCE doesn't need to query domtree. Reviewers: kuhar, brzycki, dmgreen, davide, grosser Reviewed By: kuhar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49988 llvm-svn: 338950	2018-08-04 02:50:12 +00:00
Anastasis Grammenos	4dfe279e00	[TRE][DebugInfo] Preserve Debug Location in new branch instruction There are two branch instructions created so the new test covers them both. Differential Revision: https://reviews.llvm.org/D50263 llvm-svn: 338917	2018-08-03 20:27:13 +00:00
Max Kazantsev	dcf6706e52	[NFC] Add missing comment llvm-svn: 338848	2018-08-03 10:41:51 +00:00
Max Kazantsev	65cd4836d2	[NFC] Move some methods into static functions llvm-svn: 338843	2018-08-03 10:16:40 +00:00
Chijun Sima	21a8b605a1	[Dominators] Convert existing passes and utils to use the DomTreeUpdater class Summary: This patch is the second in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html \| RFC - A new dominator tree updater for LLVM ]]. It converts passes (e.g. adce/jump-threading) and various functions which currently accept DDT in local.cpp and BasicBlockUtils.cpp to use the new DomTreeUpdater class. These converted functions in utils can accept DomTreeUpdater with either UpdateStrategy and can deal with both DT and PDT held by the DomTreeUpdater. Reviewers: brzycki, kuhar, dmgreen, grosser, davide Reviewed By: brzycki Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48967 llvm-svn: 338814	2018-08-03 05:08:17 +00:00
Philip Reames	5937368d4f	[LICM] Remove unneccessary safety check to increase sinking effectiveness This one requires a bit of explaination. It's not every day you simply delete code to implement an optimization. :) The transform in question is sinking an instruction from a loop to the uses in loop exiting blocks. We know (from LCSSA) that all of the uses outside the loop must be phi nodes, and after predecessor splitting, we know all phi users must have a single operand. Since the use must be strictly dominated by the def, we know from the definition of dominance/ssa that the exit block must execute along a (non-strict) subset of paths which reach the def. As a result, duplicating a potentially faulting instruction can not introduce a fault that didn't previously exist in the program. The full story is that this patch builds on "rL338671: [LICM] Factor out fault legality from canHoistOrSinkInst [NFC]" which pulled this logic out of a common helper routine. As best I can tell, this check was originally added to the helper function for hoisting legality, later an incorrect fastpath for loads/calls was added, and then the bug was fixed by duplicating the fault safety check in the hoist path. This left the redundant check in the common code to pessimize sinking for no reason. I split it out in an NFC, and am not removing the unneccessary check. I wanted there to be something easy to revert in case I missed something. Reviewed by: Anna Thomas (in person) llvm-svn: 338794	2018-08-03 00:21:56 +00:00
Philip Reames	32cb80b9d3	[LICM] Factor out fault legality from canHoistOrSinkInst [NFC] This method has three callers, each of which wanted distinct handling: 1) Sinking into a loop is moving an instruction known to execute before a loop into the loop. We don't need to worry about introducing a fault at all in this case. 2) Hoisting from a loop into a preheader already duplicated the check in the caller. 3) Sinking from the loop into an exit block was the only true user of the code within the routine. For the moment, this has just been lifted into the caller, but up next is examining the logic more carefully. Whitelisting of loads and calls - while consistent with the previous code - is rather suspicious. Either way, a behavior change is worthy of it's own patch. llvm-svn: 338671	2018-08-02 04:08:04 +00:00
Philip Reames	09de470e9e	[LICM] hoisting/sinking legality - bail early for unsupported instructions Originally, this was part of a larger refactoring I'd planned, but had to abandoned. I figured the minor improvement in readability was worthwhile. llvm-svn: 338663	2018-08-02 00:54:14 +00:00
George Burgess IV	213d1d23ef	Reland r338431: "Add DebugCounters to DivRemPairs" (Previously reverted in r338442) I'm told that the breakage came from us using an x86 triple on configs that didn't have x86 enabled. This is remedied by moving the debugcounter test to an x86 directory (where there's also a opt-bisect-isel.ll test for similar reasons). I can't repro the reverse-iteration failure mentioned in the revert with this patch, so I assume that a misconfiguration on my end is what caused that. Original commit message: Add DebugCounters to DivRemPairs For people who don't use DebugCounters, NFCI. Patch by Zhizhou Yang! Differential Revision: https://reviews.llvm.org/D50033 llvm-svn: 338653	2018-08-01 23:14:14 +00:00
George Burgess IV	497e8fad51	Revert r338431: "Add DebugCounters to DivRemPairs" This reverts r338431; the test it added is making buildbots unhappy. Locally, I can repro the failure on reverse-iteration builds. llvm-svn: 338442	2018-07-31 21:18:44 +00:00
George Burgess IV	907f4f6a74	Add DebugCounters to DivRemPairs For people who don't use DebugCounters, NFCI. Patch by Zhizhou Yang! Differential Revision: https://reviews.llvm.org/D50033 llvm-svn: 338431	2018-07-31 20:07:46 +00:00
Max Kazantsev	eb8e9c0940	[NFC] Collect statistics in GuardWidening llvm-svn: 338348	2018-07-31 04:37:11 +00:00
Fangrui Song	f78650a8de	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293	2018-07-30 19:41:25 +00:00
Max Kazantsev	3327bcaeb1	[NFC] Prepare GuardWidening for widening of cond branches llvm-svn: 338229	2018-07-30 07:07:32 +00:00
Alina Sbirlea	5666c7e4bd	[SimpleLoopUnswitch] Fix DT updates for trivial branch unswitching. Summary: Fixing 2 issues with the DT update in trivial branch switching, though I don't have a case where DT update fails. 1. After splitting ParentBB->UnswitchedBB edge, new edges become: ParentBB->LoopExitBB->UnswitchedBB, so remove ParentBB->LoopExitBB edge. 2. AFAIU, for multiple CFG changes, DT should be updated using batch updates, vs consecutive addEdge and removeEdge calls. Reviewers: chandlerc, kuhar Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D49925 llvm-svn: 338180	2018-07-28 00:01:05 +00:00
Florian Hahn	b6613ac665	Revert r337904: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. I suspect it is causing the clang-stage2-Rthinlto failures. llvm-svn: 337956	2018-07-25 19:44:19 +00:00
Florian Hahn	6f5c6adbcd	Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. r337828 resolves a PredicateInfo issue with unnamed types. Original message: This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin llvm-svn: 337904	2018-07-25 11:13:40 +00:00
George Burgess IV	b00fb46479	[DebugCounters] Keep track of total counts This patch makes debug counters keep track of the total number of times we've called `shouldExecute` for each counter, so it's easier to build automated tooling on top of these. A patch to print these counts is coming soon. Patch by Zhizhou Yang! Differential Revision: https://reviews.llvm.org/D49560 llvm-svn: 337748	2018-07-23 21:49:36 +00:00
John Brawn	fc18a6ad7d	[GVN] Don't use the eliminated load as an available value in phi construction In ConstructSSAForLoadSet if an available value is actually the load that we're doing SSA construction to eliminate, then we can omit it as SSAUpdate will add in the value for the phi that will be replacing it anyway. This can result in simpler IR which can allow further optimisation. Differential Revision: https://reviews.llvm.org/D44160 llvm-svn: 337686	2018-07-23 12:14:45 +00:00
Alexandros Lamprineas	592cc78dd8	[GVNHoist] safeToHoistLdSt allows illegal hoisting Bug fix for PR36787. When reasoning if it's safe to hoist a load we want to make sure that the defining memory access dominates the new insertion point of the hoisted instruction. safeToHoistLdSt calls firstInBB(InsertionPoint,DefiningAccess) which returns false if InsertionPoint == DefiningAccess, and therefore it falsely thinks it's safe to hoist. Differential Revision: https://reviews.llvm.org/D49555 llvm-svn: 337674	2018-07-23 09:42:35 +00:00
Aditya Kumar	373ce7eca5	Early exit with cheaper checks Reviewers: sebpop,davide,fhahn,trentxintong Differential Revision: https://reviews.llvm.org/D49617 llvm-svn: 337643	2018-07-21 14:13:44 +00:00
Florian Hahn	ec3ca89a17	[IPSCCP] Fix for bot failure caused by r337548 llvm-svn: 337554	2018-07-20 14:37:10 +00:00
Florian Hahn	0a560d5d9c	Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters. This version contains a fix to add values for which the state in ParamState change to the worklist if the state in ValueState did not change. To avoid adding the same value multiple times, mergeInValue returns true, if it added the value to the worklist. The value is added to the worklist depending on its state in ValueState. Original message: For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 337548	2018-07-20 13:29:12 +00:00
Eli Friedman	a3c78f5981	[SCCP] Don't use markForcedConstant on branch conditions. It's more aggressive than we need to be, and leads to strange workarounds in other places like call return value inference. Instead, just directly mark an edge viable. Tests by Florian Hahn. Differential Revision: https://reviews.llvm.org/D49408 llvm-svn: 337507	2018-07-19 23:02:07 +00:00
Florian Hahn	d95761d9d0	[IPSCCP] Run Solve each time we resolved an undef in a function. Once we resolved an undef in a function we can run Solve, which could lead to finding a constant return value for the function, which in turn could turn undefs into constants in other functions that call it, before resolving undefs there. Computationally the amount of work we are doing stays the same, just the order we process things is slightly different and potentially there are a few less undefs to resolve. We are still relying on the order of functions in the IR, which means depending on the order, we are able to resolve the optimal undef first or not. For example, if @test1 comes before @testf, we find the constant return value of @testf too late and we cannot use it while solving @test1. This on its own does not lead to more constants removed in the test-suite, probably because currently we have to be very lucky to visit applicable functions in the right order. Maybe we manage to come up with a better way of resolving undefs in more 'profitable' functions first. Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma, davide Differential Revision: https://reviews.llvm.org/D49385 llvm-svn: 337283	2018-07-17 14:04:59 +00:00
Tim Shen	9e25d5d2ce	[LSR] If no Use is interesting, early return. Summary: By looking at the callers of getUse(), we can see that even though IVUsers may offer uses, but they may not be interesting to LSR. It's possible that none of them is interesting. Reviewers: sanjoy Subscribers: jlebar, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D49049 llvm-svn: 337072	2018-07-13 23:40:00 +00:00
Eric Christopher	7289ac8a19	Temporarily revert "Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters." as it's causing miscompiles. A testcase was provided in the original review thread. This reverts commit r336098. llvm-svn: 336877	2018-07-12 01:53:21 +00:00
Craig Topper	ed6acde8cf	[LoopIdiomRecognize] Don't convert a do while loop to ctlz. This commit suppresses turning loops like this into "(bitwidth - ctlz(input))". unsigned foo(unsigned input) { unsigned num = 0; do { ++num; input >>= 1; } while (input != 0); return num; } The loop version returns a value of 1 for both an input of 0 and an input of 1. Converting to a naive ctlz does not preserve that. Theoretically we could do better if we checked isKnownNonZero or we could insert a select to handle the divergence. But until we have motivating cases for that, this is the easiest solution. llvm-svn: 336864	2018-07-11 22:35:28 +00:00
Chandler Carruth	148861f579	[PM/Unswitch] Fix unused variable in r336646. llvm-svn: 336647	2018-07-10 08:57:04 +00:00
Chandler Carruth	47dc3a346e	[PM/Unswitch] Fix a collection of closely related issues with trivial switch unswitching. The core problem was that the way we handled unswitching trivial exit edges through the default successor of a switch. For some reason I thought the right way to do this was to add a block containing unreachable and point the default successor at this block. In retrospect, this has an amazing number of problems. The first issue is the one that this pass has always worked around -- we have to detect such edges and avoid unswitching them again. This seemed pretty easy really. You juts look for an edge to a block containing unreachable. However, this pattern is woefully unsound. So many things can break it. The amazing thing is that I found a test case where simple-loop-unswitch itself breaks this! When we do a non-trivial unswitch of a switch we will end up splitting this exit edge. The result will be a default successor that is an exit and terminates in ... a perfectly normal branch. So the first test case that I started trying to fix is added to the nontrivial test cases. This is a ridiculous example that did just amazing things previously. With just unswitch, it would create 10+ copies of this stuff stamped out. But if you combine it just right with a bunch of other passes (like simplify-cfg, loop rotate, and some LICM) you can get it to do this infinitely. Or at least, I never got it to finish. =[ This, in turn, uncovered another related issue. When we are manipulating these switches after doing a trivial unswitch we never correctly updated PHI nodes to reflect our edits. As soon as I started changing how these edges were managed, it became obvious there were more issues that I couldn't realistically leave unaddressed, so I wrote more test cases around PHI updates here and ensured all of that works now. And this, in turn, required some adjustment to how we collect and manage the exit successor when it is the default successor. That showed a clear bug where we failed to include it in our search for the outer-most loop reached by an unswitched exit edge. This was actually already tested and the test case didn't work. I (wrongly) thought that was due to SCEV failing to analyze the switch. In fact, it was just a simple bug in the code that skipped the default successor. While changing this, I handled it correctly and have updated the test to reflect that we now get precise SCEV analysis of trip counts for the outer loop in one of these cases. llvm-svn: 336646	2018-07-10 08:36:05 +00:00
Manoj Gupta	77eeac3d9e	llvm: Add support for "-fno-delete-null-pointer-checks" Summary: Support for this option is needed for building Linux kernel. This is a very frequently requested feature by kernel developers. More details : https://lkml.org/lkml/2018/4/4/601 GCC option description for -fdelete-null-pointer-checks: This Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. -fno-delete-null-pointer-checks is the inverse of this implying that null pointer dereferencing is not undefined. This feature is implemented in LLVM IR in this CL as the function attribute "null-pointer-is-valid"="true" in IR (Under review at D47894). The CL updates several passes that assumed null pointer dereferencing is undefined to not optimize when the "null-pointer-is-valid"="true" attribute is present. Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv Reviewed By: efriedma, george.burgess.iv Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47895 llvm-svn: 336613	2018-07-09 22:27:23 +00:00
Chandler Carruth	ed2965438e	[PM/Unswitch] Fix a nasty bug in the new PM's unswitch introduced in r335553 with the non-trivial unswitching of switches. The code correctly updated most aspects of the CFG and analyses, but missed some crucial aspects: 1) When multiple cases have the same successor, we unswitch that a single time and replace the switch with a direct branch. The CFG here is correct, but the target of this direct branch may have had a PHI node with multiple entries in it. 2) When we still have to clone a successor of the switch into an unswitched copy of the loop, we'll delete potentially multiple edges entering this successor, not just one. 3) We also have to delete multiple edges entering the successors in the original loop when they have to be retained. 4) When the "retained successor" also occurs as a case successor, we just assert failed everywhere. This doesn't happen very easily because its always valid to simply drop the case -- the retained successor for switches is always the default successor. However, it is likely possible through some contrivance of different loop passes, unrolling, and simplifying for this to occur in practice and certainly there is nothing "invalid" about the IR so this pass needs to handle it. 5) In the case of #4, we also will replace these multiple edges with a direct branch much like in #1 and need to collapse the entries in any PHI nodes to a single enrty. All of this stems from the delightful fact that the same successor can show up in multiple parts of the switch terminator, and each of these are considered a distinct edge for the purpose of PHI nodes (and iterating the successors and predecessors) but not for unswitching itself, the dominator tree, or many other things. For the record, I intensely dislike this "feature" of the IR in large part because of the complexity it causes in passes like this. We already have a ton of logic building sets and handling duplicates, and we just had to add a bunch more. I've added a complex test case that covers all five of the above failure modes. I've also added a variation on it where #4 and #5 occur in loop exit, adding fun where we have an LCSSA PHI node with "multiple entries" despite have dedicated exits. There were no additional issues found by this, but it seems a useful corner case to cover with testing. One thing that working on all of this code has made painfully clear for me as well is how amazingly inefficient our PHI node representation is (in terms of the in-memory data structures and the APIs used to update them). This code has truly marvelous complexity bounds because every time we remove an entry from a PHI node we do a linear scan to find it and then a linear update to the data structure to remove it. We could in theory batch all of the PHI node updates into a single linear walk of the operands making this much more efficient, but the APIs fight hard against this and the fact that we have to handle duplicates in the peculiar manner we do (removing all but one in some cases) makes even implementing that very tedious and annoying. Anyways, none of this is new here or specific to loop unswitching. All code in LLVM that updates PHI node operands suffers from these problems. llvm-svn: 336536	2018-07-09 10:30:48 +00:00
Craig Topper	2835278ee0	[LoopIdiomRecognize] Support for converting loops that use LSHR to CTLZ. In the 'detectCTLZIdiom' function support for loops that use LSHR instruction instead of ASHR has been added. This supports creating ctlz from the following code. int lzcnt(int x) { int count = 0; while (x > 0) { count++; x = x >> 1; } return count; } Patch by Olga Moldovanova Differential Revision: https://reviews.llvm.org/D48354 llvm-svn: 336509	2018-07-08 01:45:47 +00:00
Chandler Carruth	d8b0c8ce1b	[PM/LoopUnswitch] Fix PR37889, producing the correct loop nest structure after trivial unswitching. This PR illustrates that a fundamental analysis update was not performed with the new loop unswitch. This update is also somewhat fundamental to the core idea of the new loop unswitch -- we actually update the CFG based on the unswitching. In order to do that, we need to update the loop nest in addition to the domtree. For some reason, when writing trivial unswitching, I thought that the loop nest structure cannot be changed by the transformation. But the PR helps illustrate that it clearly can. I've expanded this to a number of different test cases that try to cover the different cases of this. When we unswitch, we move an exit edge of a loop out of the loop. If this exit edge changes which loop reached by an exit is the innermost loop, it changes the parent of the loop. Essentially, this transformation may hoist the inner loop up the nest. I've added the simple logic to handle this reliably in the trivial unswitching case. This just requires updating LoopInfo and rebuilding LCSSA on the impacted loops. In the trivial case, we don't even need to handle dedicated exits because we're only hoisting the one loop and we just split its preheader. I've also ported all of these tests to non-trivial unswitching and verified that the logic already there correctly handles the loop nest updates necessary. Differential Revision: https://reviews.llvm.org/D48851 llvm-svn: 336477	2018-07-07 01:12:56 +00:00
Benjamin Kramer	3687ac52a9	[LoopSink] Make the enforcement of determinism deterministic. LoopBlockNumber is a DenseMap<BasicBlock, int>, comparing the result of find() will compare a pair<BasicBlock, int>. That's of course depending on pointer ordering which varies from run to run. Reverse iteration doesn't find this because we're copying to a vector first. This bug has been there since 2016 but only recently showed up on clang selfhost with FDO and ThinLTO, which is also why I didn't manage to get a reasonable test case for this. Add an assert that would've caught this. llvm-svn: 336439	2018-07-06 14:20:58 +00:00
Michael Zolotukhin	a5f2c52a1e	Revert r332168: "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading."" There were a couple of issues reported (PR38047, PR37929) - I'll reland the patch when I figure out and fix the rootcause. llvm-svn: 336393	2018-07-05 22:10:31 +00:00
Chandler Carruth	3897ded691	[PM/LoopUnswitch] Fix PR37651 by correctly invalidating SCEV when unswitching loops. Original patch trying to address this was sent in D47624, but that didn't quite handle things correctly. There are two key principles used to select whether and how to invalidate SCEV-cached information about loops: 1) We must invalidate any info SCEV has cached before unswitching as we may change (or destroy) the loop structure by the act of unswitching, and make it hard to recover everything we want to invalidate within SCEV. 2) We need to invalidate all of the loops whose CFGs are mutated by the unswitching. Notably, this isn't the entire loop nest, this is every loop contained by the outermost loop reached by an exit block relevant to the unswitch. And we need to do this even when doing trivial unswitching. I've added more focused tests that directly check that SCEV starts off with imprecise information and after unswitching (and simplifying instructions) re-querying SCEV will produce precise information. These tests also specifically work to check that an outer loop's information becomes precise. However, the testing here is still a bit imperfect. Crafting test cases that reliably fail to be analyzed by SCEV before unswitching and succeed afterward proved ... very, very hard. It took me several hours and careful work to build these, and I'm not optimistic about necessarily coming up with more to cover more elaborate possibilities. Fortunately, the code pattern we are testing here in the pass is really straightforward and reliable. Thanks to Max Kazantsev for the initial work on this as well as the review, and to Hal Finkel for helping me talk through approaches to test this stuff even if it didn't come to much. Differential Revision: https://reviews.llvm.org/D47624 llvm-svn: 336183	2018-07-03 09:13:27 +00:00
Alina Sbirlea	0e15501fa7	Replace "Replacable" with "Replaceable". [NFC] llvm-svn: 336133	2018-07-02 18:53:40 +00:00
Florian Hahn	4ebba909a2	Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters. This version contains a fix to add values for which the state in ParamState change to the worklist if the state in ValueState did not change. To avoid adding the same value multiple times, mergeInValue returns true, if it added the value to the worklist. The value is added to the worklist depending on its state in ValueState. Original message: For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 336098	2018-07-02 12:44:04 +00:00
David Green	963401d2be	[UnrollAndJam] New Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder Loop So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 336062	2018-07-01 12:47:30 +00:00
Chandler Carruth	7c557f804d	[instsimplify] Move the instsimplify pass to use more obvious file names and diretory. Also cleans up all the associated naming to be consistent and removes the public access to the pass ID which was unused in LLVM. Also runs clang-format over parts that changed, which generally cleans up a bunch of formatting. This is in preparation for doing some internal cleanups to the pass. Differential Revision: https://reviews.llvm.org/D47352 llvm-svn: 336028	2018-06-29 23:36:03 +00:00
Sean Fertile	cd0d7634f6	Revert "Extend CFGPrinter and CallPrinter with Heat Colors" This reverts r335996 which broke graph printing in Polly. llvm-svn: 336000	2018-06-29 17:48:58 +00:00
Sean Fertile	3b0535b424	Extend CFGPrinter and CallPrinter with Heat Colors Extends the CFGPrinter and CallPrinter with heat colors based on heuristics or profiling information. The colors are enabled by default and can be toggled on/off for CFGPrinter by using the option -cfg-heat-colors for both -dot-cfg[-only] and -view-cfg[-only]. Similarly, the colors can be toggled on/off for CallPrinter by using the option -callgraph-heat-colors for both -dot-callgraph and -view-callgraph. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D40425 llvm-svn: 335996	2018-06-29 17:13:58 +00:00
Anastasis Grammenos	425df22ee3	[SROA] Preserve DebugLoc when rewriting alloca partitions When rewriting an alloca partition copy the DL from the old alloca over the the new one. Differential Revision: https://reviews.llvm.org/D48640 llvm-svn: 335904	2018-06-28 18:58:30 +00:00
Florian Hahn	388af14f85	[SCCP] Mark CFG as preserved. SCCP does not change the CFG, so we can mark it as preserved. Reviewers: dberlin, efriedma, davide Reviewed By: davide Differential Revision: https://reviews.llvm.org/D47149 llvm-svn: 335820	2018-06-28 09:53:38 +00:00
Michael Zolotukhin	d3b8bdef01	[JumpThreading] Don't try to rewrite a use if it's already valid. Summary: When recording uses we need to rewrite after cloning a loop we need to check if the use is not dominated by the original def. The initial assumption was that the cloned basic block will introduce a new path and thus the original def will only dominate the use if they are in the same BB, but as the reproducer from PR37745 shows it's not always the case. This fixes PR37745. Reviewers: haicheng, Ka-Ka Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48111 llvm-svn: 335675	2018-06-26 22:19:48 +00:00
Matt Arsenault	2c1a570aab	LoopUnroll: Allow analyzing intrinsic call costs I'm not sure why the code here is skipping calls since TTI does try to do something for general calls, but it at least should allow intrinsics. Skip intrinsics that should not be omitted as calls, which is by far the most common case on AMDGPU. llvm-svn: 335645	2018-06-26 18:51:17 +00:00
Florian Hahn	4a69b0bb36	[IPSCCP] Change dead blocks to unreachable after visiting all executable blocks. changeToUnreachable may remove PHI nodes from executable blocks we found values for and we would fail to replace them. By changing dead blocks to unreachable after we replaced constants in all executable blocks, we ensure such PHI nodes are replaced by their known value before. Fixes PR37780. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D48421 llvm-svn: 335588	2018-06-26 10:15:02 +00:00
Chandler Carruth	1652996fd6	[PM/LoopUnswitch] Teach the new unswitch to handle nontrivial unswitching of switches. This works much like trivial unswitching of switches in that it reliably moves the switch out of the loop. Here we potentially clone the entire loop into each successor of the switch and re-point the cases at these clones. Due to the complexity of actually doing nontrivial unswitching, this patch doesn't create a dedicated routine for handling switches -- it would duplicate far too much code. Instead, it generalizes the existing routine to handle both branches and switches as it largely reduces to looping in a few places instead of doing something once. This actually improves the results in some cases with branches due to being much more careful about how dead regions of code are managed. With branches, because exactly one clone is created and there are exactly two edges considered, somewhat sloppy handling of the dead regions of code was sufficient in most cases. But with switches, there are much more complicated patterns of dead code and so I've had to move to a more robust model generally. We still do as much pruning of the dead code early as possible because that allows us to avoid even cloning the code. This also surfaced another problem with nontrivial unswitching before which is that we weren't as precise in reconstructing loops as we could have been. This seems to have been mostly harmless, but resulted in pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we have to get this right, and everything benefits from it. While the testing may seem a bit light here because we only have two real cases with actual switches, they do a surprisingly good job of exercising numerous edge cases. Also, because we share the logic with branches, most of the changes in this patch are reasonably well covered by existing tests. The new unswitch now has all of the same fundamental power as the old one with the exception of the single unsound case of partial switch unswitching -- that really is just loop specialization and not unswitching at all. It doesn't fit into the canonicalization model in any way. We can add a loop specialization pass that runs late based on profile data if important test cases ever come up here. Differential Revision: https://reviews.llvm.org/D47683 llvm-svn: 335553	2018-06-25 23:32:54 +00:00
Craig Topper	27847868b7	[LoopIdiomRecognize] Fix a couple places where it appears we were unintenionally making copies of DebugLoc. llvm-svn: 335521	2018-06-25 20:45:45 +00:00
Stanislav Mekhanoshin	d8c9374797	Fix invariant fdiv hoisting in LICM FDiv is replaced with multiplication by reciprocal and invariant reciprocal is hoisted out of the loop, while multiplication remains even if invariant. Switch checks for all invariant operands and only invariant denominator to fix the issue. Differential Revision: https://reviews.llvm.org/D48447 llvm-svn: 335411	2018-06-23 04:01:28 +00:00
Eli Friedman	203eaaf5ba	[LoopReroll] Rewrite induction variable rewriting. This gets rid of a bunch of weird special cases; instead, just use SCEV rewriting for everything. In addition to being simpler, this fixes a bug where we would use the wrong stride in certain edge cases. The one bit I'm not quite sure about is the trip count handling, specifically the FIXME about overflow. In general, I think we need to widen the exit condition, but that's probably not profitable if the new type isn't legal, so we probably need a check somewhere. That said, I don't think I'm making the existing problem any worse. As a followup to this, a bunch of IV-related code in root-finding could be cleaned up; with SCEV-based rewriting, there isn't any reason to assume a loop will have exactly one or two PHI nodes. Differential Revision: https://reviews.llvm.org/D45191 llvm-svn: 335400	2018-06-22 22:58:55 +00:00
Alina Sbirlea	bee50036d3	[LoopUnswitch]Fix comparison for DomTree updates. Summary: In LoopUnswitch when replacing a branch Parent -> Succ with a conditional branch Parent -> True & Parent->False, the DomTree updates should insert an edge for each of True/False if True/False are different than Succ, and delete Parent->Succ edge if both are different. The comparison with Succ appears to be incorect, it's comparing with Parent instead. There is no test failing either before or after this change, but it seems to me this is the right way to do the update. Reviewers: chandlerc, kuhar Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D48457 llvm-svn: 335369	2018-06-22 17:14:35 +00:00
Francis Visoiu Mistrih	ac599b6951	Revert r335206 "Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions." This reverts commit r335206. As discussed here: https://reviews.llvm.org/rL333740, a fix will come tomorrow. In the meanwhile, revert this to fix some bots. llvm-svn: 335272	2018-06-21 19:18:36 +00:00
Florian Hahn	d36aa1f763	Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. r335150 should resolve the issues with the clang-with-thin-lto-ubuntu and clang-with-lto-ubuntu builders. Original message: This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin llvm-svn: 335206	2018-06-21 07:15:08 +00:00
Chandler Carruth	d1dab0c3c0	[PM/LoopUnswitch] Add partial non-trivial unswitching for invariant conditions feeding a chain of `and`s or `or`s for a branch. Much like with full non-trivial unswitching, we rely on the pass manager to handle iterating until all of the profitable unswitches have been done. This is to allow other more profitable unswitches to fire on any of the cloned, simpler versions of the loop if viable. Threading the partial unswiching through the non-trivial unswitching logic motivated some minor refactorings. If those are too disruptive to make it reasonable to review this patch, I can separate them out, but it'll be somewhat timeconsuming so I wanted to send it for initial review as-is. Feel free to tell me whether it warrants pulling apart. I've tried to re-use (and factor out) logic form the partial trivial unswitching, but not as much could be shared as I had haped. Still, this wasn't as bad as I naively expected. Some basic testing is added, but I probably need more. Suggestions for things you'd like to see tested more than welcome. One thing I'd like to do is add some testing that when we schedule this with loop-instsimplify it effectively cleans up the cruft created. Last but not least, this uncovered a bug that has been in loop cloning the entire time for non-trivial unswitching. Specifically, we didn't correctly add the outer-most cloned loop to the list of cloned loops. This meant that LCSSA wouldn't be updated for it hypothetically, and more significantly that we would never visit it in the loop pass manager. I noticed this while checking loop-instsimplify by hand. I'll try to separate this bugfix out into its own patch with a more focused test. But it is just one line, so shouldn't significantly confuse the review here. After this patch, the only missing "feature" in this unswitch I'm aware of us non-trivial unswitching of switches. I'll try implementing full non-trivial unswitching of switches (which is at least a sound thing to implement), but partial non-trivial unswitching of switches is something I don't see any sound and principled way to implement. I also have no interesting test cases for the latter, so I'm not really worried. The rest of the things that need to be ported are bug-fixes and more narrow / targeted support for specific issues. Differential Revision: https://reviews.llvm.org/D47522 llvm-svn: 335203	2018-06-21 06:14:03 +00:00
Alina Sbirlea	dfd14adeb0	Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred. Summary: Two utils methods have essentially the same functionality. This is an attempt to merge them into one. 1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred 2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor Prior to the patch: 1. MergeBasicBlockIntoOnlyPred Updates either DomTree or DeferredDominance Moves all instructions from Pred to BB, deletes Pred Asserts BB has single predecessor If address was taken, replace the block address with constant 1 (?) 2. MergeBlockIntoPredecessor Updates DomTree, LoopInfo and MemoryDependenceResults Moves all instruction from BB to Pred, deletes BB Returns if doesn't have a single predecessor Returns if BB's address was taken After the patch: Method 2. MergeBlockIntoPredecessor is attempting to become the new default: Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults Moves all instruction from BB to Pred, deletes BB Returns if doesn't have a single predecessor Returns if BB's address was taken Uses of MergeBasicBlockIntoOnlyPred that need to be replaced: 1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp Updated in this patch. No challenges. 2. lib/CodeGen/CodeGenPrepare.cpp Updated in this patch. i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation. ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks Some interesting aspects: - Since Pred is not deleted (BB is), the entry block does not need updating. - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred. - isMergingEmptyBlockProfitable assumes BB is the one to be deleted. - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead. - adding some test owner as subscribers for the interesting tests modified: test/CodeGen/X86/avx-cmp.ll test/CodeGen/AMDGPU/nested-loop-conditions.ll test/CodeGen/AMDGPU/si-annotate-cf.ll test/CodeGen/X86/hoist-spill.ll test/CodeGen/X86/2006-11-17-IllegalMove.ll 3. lib/Transforms/Scalar/JumpThreading.cpp Not covered in this patch. It is the only use case using the DeferredDominance. I would defer to Brian Rzycki to make this replacement. Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits Differential Revision: https://reviews.llvm.org/D48202 llvm-svn: 335183	2018-06-20 22:01:04 +00:00
Chandler Carruth	4da3331d3d	[PM/LoopUnswitch] Support partial trivial unswitching. The idea of partial unswitching is to take a part of a branch's condition that is loop invariant and just unswitching that part. This primarily makes sense with i1 conditions of branches as opposed to switches. When dealing with i1 conditions, we can easily extract loop invariant inputs to a a branch and unswitch them to test them entirely outside the loop. As part of this, we now create much more significant cruft in the loop body, so this relies on adding cleanup passes to the loop pipeline and revisiting unswitched loops to do that cleanup before continuing to process them. This already appears to be more powerful at unswitching than the old loop unswitch pass, and so I'd appreciate pretty careful review in case I'm just missing some correctness checks. The `LIV-loop-condition` test case is not unswitched by the old unswitch pass, but is with this pass. Thanks to Sanjoy and Fedor for the review! Differential Revision: https://reviews.llvm.org/D46706 llvm-svn: 335156	2018-06-20 18:57:07 +00:00
David Green	e6a9c24878	[LoopSimplifyCFG] Invalidate SCEV in LoopSimplifyCFG LoopSimplifyCFG, being a loop pass, needs to preserve scalar evolution. This invalidates SE for the loops altered during block merging. Differential Revision: https://reviews.llvm.org/D48258 llvm-svn: 335036	2018-06-19 09:43:36 +00:00
Florian Hahn	d8fcf0de31	[LoopInterchange] Move PHI handling to adjustLoopBranches. This patch moves the logic to handle reduction PHI nodes to the end of adjustLoopBranches. Reduction PHI nodes in the outer loop header can be moved to the inner loop header and reduction PHI nodes from the inner loop header can be moved to the outer loop header. In the latter situation, we have to deal with 1 kind of PHI nodes: PHI nodes that are part of inner loop-only reductions. We can replace the PHI node with the value coming from outside the inner loop. Reviewers: mcrosier, efriedma, karthikthecool Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D46198 llvm-svn: 335027	2018-06-19 08:03:24 +00:00
Michael Zolotukhin	158a7c3323	CorrelatedValuePropagation: Preserve DT. Summary: We only modify CFG in a couple of places, and we can preserve DT there with a little effort. Reviewers: davide, vsk Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48059 llvm-svn: 334895	2018-06-16 18:57:31 +00:00
Simon Pilgrim	dee9c67f24	[EarlyCSE] Fix MSVC build. NFCI. MSVC doesn't let you assign different lambdas through a ternary operator. llvm-svn: 334715	2018-06-14 14:22:03 +00:00
Max Kazantsev	ff6d1c9188	[EarlyCSE] Propagate conditions of AND and OR instructions This patches teaches EarlyCSE to figure out that if `and i1 %x, %y` is true then both `%x` and `%y` are true in the taken branch, and if `or i1 %x, %y` is false then both `%x` and `%y` are false in non-taken branch. Fix for PR37635. Differential Revision: https://reviews.llvm.org/D47574 Reviewed By: reames llvm-svn: 334707	2018-06-14 13:02:13 +00:00
Hiroshi Inoue	f209649dfc	[NFC] fix trivial typos in comments llvm-svn: 334687	2018-06-14 05:41:49 +00:00
Florian Hahn	a1cc848399	Use SmallPtrSet explicitly for SmallSets with pointer types (NFC). Currently SmallSet<PointerTy> inherits from SmallPtrSet<PointerTy>. This patch replaces such types with SmallPtrSet, because IMO it is slightly clearer and allows us to get rid of unnecessarily including SmallSet.h Reviewers: dblaikie, craig.topper Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D47836 llvm-svn: 334492	2018-06-12 11:16:56 +00:00
Craig Topper	61998289f9	Use SmallPtrSet instead of SmallSet in places where we iterate over the set. SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work. These places were found by hiding the begin/end methods in the SmallSet forwarding llvm-svn: 334343	2018-06-09 05:04:20 +00:00
Daniil Fukalov	37433dc2e1	reapply r334209 with fixes for harfbuzz in Chromium r334209 description: [LSR] Check yet more intrinsic pointer operands the patch fixes another assertion in isLegalUse() Differential Revision: https://reviews.llvm.org/D47794 llvm-svn: 334300	2018-06-08 16:22:52 +00:00
Reid Kleckner	a3609f75b2	Revert r334209 "[LSR] Check yet more intrinsic pointer operands" This causes cast failures when compiling harfbuzz in Chromium. Reproducer on the way. llvm-svn: 334254	2018-06-08 00:43:27 +00:00
Daniil Fukalov	12c0663a25	[LSR] Check yet more intrinsic pointer operands the patch fixes another assertion in isLegalUse() Differential Revision: https://reviews.llvm.org/D47794 llvm-svn: 334209	2018-06-07 17:30:58 +00:00
Michael Zolotukhin	31800864dc	SpeculativeExecution Pass: Set PreserveCFG to avoid unnecessary analyses invalidation. The pass doesn't touch CFG in any way, only moves instructions between blocks. llvm-svn: 334150	2018-06-07 00:19:29 +00:00
David Blaikie	31b98d2e99	Move Analysis/Utils/Local.h back to Transforms Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954	2018-06-04 21:23:21 +00:00
Chandler Carruth	9281503e8f	[PM/LoopUnswitch] Fix how the cloned loops are handled when updating analyses. Summary: I noticed this issue because we didn't put the primary cloned loop into the `NonChildClonedLoops` vector and so never iterated on it. Once I fixed that, it made it clear why I had to do a really complicated and unnecesasry dance when updating the loops to remain in canonical form -- I was unwittingly working around the fact that the primary cloned loop wasn't in the expected list of cloned loops. Doh! Now that we include it in this vector, we don't need to return it and we can consolidate the update logic as we correctly have a single place where it can be handled. I've just added a test for the iteration order aspect as every time I changed the update logic partially or incorrectly here, an existing test failed and caught it so that seems well covered (which is also evidenced by the extensive working around of this missing update). Reviewers: asbirlea, sanjoy Subscribers: mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47647 llvm-svn: 333811	2018-06-02 01:29:01 +00:00
Florian Hahn	8a17f1f43e	Revert r333740: IPSCCP] Use PredicateInfo to propagate facts from cmp. This is breaking the clang-with-thin-lto-ubuntu bot. llvm-svn: 333745	2018-06-01 12:58:43 +00:00
Florian Hahn	f4df554f32	Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin Differential Revision: https://reviews.llvm.org/D45330 llvm-svn: 333740	2018-06-01 10:48:54 +00:00
Craig Topper	9a6c0bdcbd	[LoopIdiomRecognize] Only convert loops to ctlz if we can prove that the input is non-negative. Summary: Loop idiom recognize tries to convert loops like ``` int foo(int x) { int cnt = 0; while (x) { x >>= 1; ++cnt; } return cnt; } ``` into calls to ctlz, but if x is initially negative this loop should be infinite. It happens that the cases that motivated this change have an absolute value of x before the loop. So this patch restricts the transform to cases where we know x is positive. Note: We are relying on the absolute value of INT_MIN to be undefined so we can assume that the result is always positive. Fixes PR37479 Reviewers: spatel, hfinkel, efriedma, javed.absar Reviewed By: efriedma Subscribers: dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D47348 llvm-svn: 333702	2018-05-31 22:16:55 +00:00
Craig Topper	c9a4c6208b	[JumpThreading] Fix some strange formatting of code inside LLVM_DEBUG. NFC I don't know if clang-format got confused here or what. llvm-svn: 333675	2018-05-31 18:08:11 +00:00
Max Kazantsev	0bad5be430	[NFC] Factor out a method for further extension llvm-svn: 333633	2018-05-31 08:08:34 +00:00
George Burgess IV	485762ccba	[NewGVN] Fix set comparison; reflow comment Looks like we intended to compare this->Members with Other->Members here, but ended up comparing this->Members with this->Members. Oops. :) Since CongruenceClass::Members is a SmallPtrSet anyway, we can probably skip building std::sets if we're willing to write a bit more code. This appears to be no functional change (for sufficiently lax values of "no"): this equality check was only being called inside of an assert. So, worst case, we'll catch more bugs in the form of assertion failures. Thanks to d0k for noting this! llvm-svn: 333601	2018-05-30 22:24:08 +00:00
Chandler Carruth	71fd27043e	[PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, schedule loop-cleanup passes at the beginning of the loop pass pipeline, and re-enqueue loops after even trivial unswitching. This will allow us to much more consistently avoid simplifying code while doing trivial unswitching. I've also added a test case that specifically shows effective iteration using this technique. I've unconditionally updated the new PM as that is always using the SimpleLoopUnswitch pass, and I've made the pipeline changes for the old PM conditional on using this new unswitch pass. I added a bunch of comments to the loop pass pipeline in the old PM to make it more clear what is going on when reviewing. Hopefully this will unblock doing partial unswitching instead of just full unswitching. Differential Revision: https://reviews.llvm.org/D47408 llvm-svn: 333493	2018-05-30 02:46:45 +00:00
Chandler Carruth	4cbcbb0761	[LoopInstSimplify] Re-implement the core logic of loop-instsimplify to be both simpler and substantially more efficient. Rather than use a hand-rolled iteration technique that isn't quite the same as RPO, use the pre-built RPO loop body traversal utility. Once visiting the loop body in RPO, we can assert that we visit defs before uses reliably. When this is the case, the only need to iterate is when simplifying a def that is used by a PHI node along a back-edge. With this patch, the first pass over the loop body is just a complete simplification of every instruction across the loop body. When we encounter a use of a simplified instruction that stems from a PHI node in the loop body that has already been visited (due to some cyclic CFG, potentially the loop itself, or a nested loop, or unstructured control flow), we recall that specific PHI node for the second iteration. Nothing else needs to be preserved from iteration to iteration. On the second and later iterations, only instructions known to have simplified inputs are considered, each time starting from a set of PHIs that had simplified inputs along the backedges. Dead instructions are collected along the way, but deleted in a batch at the end of each iteration making the iterations themselves substantially simpler. This uses a new batch API for recursively deleting dead instructions. This alsa changes the routine to visit subloops. Because simplification is fundamentally transitive, we may need to visit the entire loop body, including subloops, to handle knock-on simplification. I've added a basic test file that helps demonstrate that all of these changes work. It includes both straight-forward loops with simplifications as well as interesting PHI-structures, CFG-structures, and a nested loop case. Differential Revision: https://reviews.llvm.org/D47407 llvm-svn: 333461	2018-05-29 20:15:38 +00:00
David Green	aee7ad0cde	Revert 333358 as it's failing on some builders. I'm guessing the tests reply on the ARM backend being built. llvm-svn: 333359	2018-05-27 12:54:33 +00:00
David Green	3034281b43	[UnrollAndJam] Add a new Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now-jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 333358	2018-05-27 12:11:21 +00:00
Florian Hahn	718af2f817	Revert r333268: [IPSCCP] Use PredicateInfo to propagate facts from... Reverting this to see if this is causing the failures of the clang-with-thin-lto-ubuntu bot. [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin Differential Revision: https://reviews.llvm.org/D45330 llvm-svn: 333323	2018-05-25 23:32:02 +00:00
Florian Hahn	b4a70b9f47	[IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin Differential Revision: https://reviews.llvm.org/D45330 llvm-svn: 333268	2018-05-25 11:12:33 +00:00
Chandler Carruth	e6c30fdda7	Restore the LoopInstSimplify pass, reverting r327329 that removed it. The plan had always been to move towards using this rather than so much in-pass simplification within the loop pipeline, but we never got around to it.... until only a couple months after it was removed due to disuse. =/ This commit is just a pure revert of the removal. I will add tests and do some basic cleanup in follow-up commits. Then I'll wire it into the loop pass pipeline. Differential Revision: https://reviews.llvm.org/D47353 llvm-svn: 333250	2018-05-25 01:32:36 +00:00
Jun Bum Lim	dfbe6fa832	[LICM] Preserve DT and LoopInfo specifically Summary: In LICM, CFG could be changed in splitPredecessorsOfLoopExit(), which update only DT and LoopInfo. Therefore, we should preserve only DT and LoopInfo specifically, instead of all analyses that depend on the CFG (setPreservesCFG()). This change should fix PR37323. Reviewers: uabelho, davide, dberlin, Ka-Ka Reviewed By: dberlin Subscribers: mzolotukhin, bjope, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D46775 llvm-svn: 333198	2018-05-24 15:58:34 +00:00
Karl-Johan Karlsson	478232d52f	[NaryReassociate] Detect deleted instr with WeakVH Summary: If NaryReassociate succeed it will, when replacing the old instruction with the new instruction, also recursively delete trivially dead instructions from the old instruction. However, if the input to the NaryReassociate pass contain dead code it is not save to recursively delete trivially deadinstructions as it might lead to deleting the newly created instruction. This patch will fix the problem by using WeakVH to detect this rare case, when the newly created instruction is dead, and it will then restart the basic block iteration from the beginning. This fixes pr37539 Reviewers: tra, meheff, grosser, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47139 llvm-svn: 333155	2018-05-24 06:09:02 +00:00
Changpeng Fang	5f9154618e	StructurizeCFG: Adjust the loop depth for a subregion to order the nodes correctly Summary: StructurizeCFG::orderNodes basically uses a reverse post-order (RPO) traversal of the region list to get the order. The only problem with it is that sometimes backedges for outer loops will be visited before backedges for inner loops. To solve this problem, a loop depth based approach has been used to make sure all blocks in this loop has been visited before moving on to outer loop. However, we found a problem for a SubRegion which is a loop itself: --> BB1 --> BB2 --> BB3 --> In this case, BB2 is a SubRegion (loop), and thus its loopdepth is different than that of BB1 and BB3. This fact will lead BB2 to be placed in the wrong order. In this work, we treat the SubRegion as a special case and use its exit block to determine the loop and its depth to guard the sorting. Reviewers: arsenm, jlebar Differential Revision: https://reviews.llvm.org/D46912 llvm-svn: 333111	2018-05-23 18:34:48 +00:00
Max Kazantsev	d99f3bacb4	[LoopUnswitch] Fix SCEV invalidation in unswitching Loop unswitching makes substantial changes to a loop that can also affect cached SCEV info in its outer loops as well, but it only cares to invalidate SCEV cache for the innermost loop in case of full unswitching and does not invalidate anything at all in case of trivial unswitching. As result, we may end up with incorrect data in cache. Differential Revision: https://reviews.llvm.org/D46045 Reviewed By: mzolotukhin llvm-svn: 333072	2018-05-23 10:09:53 +00:00
Florian Hahn	a6e63f176c	[NewGVN] Fix handling of assumes This patch fixes two bugs: * test1: Previously assume(a >= 5) concluded that a == 5. That's only valid for assume(a == 5)... * test2: If operands were swapped, additional users were added to the wrong cmp operand. This resulted in an "unsettled iteration" assertion failure. Patch by Nikita Popov Differential Revision: https://reviews.llvm.org/D46974 llvm-svn: 333007	2018-05-22 17:38:22 +00:00
Craig Topper	f14e62c9a5	[EarlyCSE] Improve EarlyCSE of some absolute value cases. Change matchSelectPattern to return X and -X for ABS/NABS in a well defined order. Adjust EarlyCSE to account for this. Ensure the SPF result is some kind of min/max and not abs/nabs in one place in InstCombine that made me nervous. Prevously we returned the two operands of the compare part of the abs pattern. The RHS is always going to be a 0i, 1 or -1 constant. This isn't a very meaningful thing to return for any one. There's also some freedom in the abs pattern as to what happens when the value is equal to 0. This freedom led to early cse failing to match when different constants were used in otherwise equivalent operations. By returning the input and its negation in a defined order we can ensure an exact match. This also makes sure both patterns use the exact same subtract instruction for the negation. I believe CSE should evebntually make this happen and properly merge the nsw/nuw flags. But I'm not familiar with CSE and what order it does things in so it seemed like it might be good to really enforce that they were the same. Differential Revision: https://reviews.llvm.org/D47037 llvm-svn: 332865	2018-05-21 18:42:42 +00:00
David Green	8ceab61c75	[CVP] Require DomTree for new Pass Manager We were previously using a DT in CVP through SimplifyQuery, but not requiring it in the new pass manager. Hence it would crash if DT was not already available. This now gets DT directly and plumbs it through to where it is used (instead of using it through SQ). llvm-svn: 332836	2018-05-21 11:06:28 +00:00
Eric Christopher	563d0b9cb9	Fix up a few grammar issues. llvm-svn: 332835	2018-05-21 10:27:36 +00:00
Max Kazantsev	c0b268f90c	[IRCE] Fix miscompile with range checks against negative values In the patch rL329547, we have lifted the over-restrictive limitation on collected range checks, allowing to work with range checks with the end of their range not being provably non-negative. However it appeared that the non-negativity of this value was assumed in the utility function `ClampedSubtract`. In particular, its reasoning is based on the fact that `0 <= SINT_MAX - X`, which is not true if `X` is negative. The function `ClampedSubtract` is only called twice, once with `X = 0` (which is OK) and the second time with `X = IRC.getEnd()`, where we may now see the problem if the end is actually a negative value. In this case, we may sometimes miscompile. This patch is the conservative fix of the miscompile problem. Rather than rejecting non-provably non-negative `getEnd()` values, we will check it for non-negativity in runtime. For this, we use function `smax(smin(X, 0), -1) + 1` that is equal to `1` if `X` is non-negative and is equal to 0 if `X` is negative. If we multiply `Begin, End` of safe iteration space by this function calculated for `X = IRC.getEnd()`, we will get the original `[Begin, End)` if `IRC.getEnd()` was non-negative (and, thus, `ClampedSubtract` worked correctly) and the empty range `[0, 0)` in case if ` IRC.getEnd()` was negative. So we in fact prohibit execution of the main loop if at least one of range checks was made against a negative value (and we figured it out in runtime). It is still better than what we have before (non-negativity had to be proved in compile time) and prevents us from miscompile, however it is sometiles too restrictive for unsigned range checks against a negative value (which in fact can be eliminated). Once we re-implement `ClampedSubtract` in a way that it handles negative `X` correctly, this limitation can be lifted, too. Differential Revision: https://reviews.llvm.org/D46860 Reviewed By: samparker llvm-svn: 332809	2018-05-19 13:06:37 +00:00
Benjamin Kramer	a76b64ff80	[MergeICmps] Don't crash when memcmp is not available Fixes clang crashing with -fno-builtin, PR37527. llvm-svn: 332808	2018-05-19 12:51:59 +00:00
Bjorn Pettersson	81a76a388a	[SROA] Handle PHI with multiple duplicate predecessors Summary: The verifier accepts PHI nodes with multiple entries for the same basic block, as long as the value is the same. As seen in PR37203, SROA did not handle such PHI nodes properly when speculating loads over the PHI, since it inserted multiple loads in the predecessor block and changed the PHI into having multiple entries for the same basic block, but with different values. This patch teaches SROA to reuse the same speculated load for each PHI duplicate entry in such situations. Resolves: https://bugs.llvm.org/show_bug.cgi?id=37203 Reviewers: uabelho, chandlerc, hfinkel, bkramer, efriedma Reviewed By: efriedma Subscribers: dberlin, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D46426 llvm-svn: 332577	2018-05-17 07:21:41 +00:00
Hiroshi Inoue	f5c0e6c285	[SROA] pr37267: fix assertion failure in integer widening The current integer widening does not support rewriting partial split slices in rewriteIntegerStore (and rewriteIntegerLoad). This patch adds explicit checks for this case in isIntegerWideningViableForSlice. Before r322533, splitting is allowed only for the whole-alloca slice and hence the above case is implicitly rejected by another check `if (DL.getTypeStoreSize(ValueTy) > Size)` because whole-alloca slice is larger than the partition. Differential Revision: https://reviews.llvm.org/D46750 llvm-svn: 332575	2018-05-17 06:32:17 +00:00
Vedant Kumar	5a0872c2b7	[STLExtras] Add size() for ranges, and remove distance() r332057 introduced distance() for ranges. Based on post-commit feedback, this renames distance() to size(). The new size() is also only enabled when the operation is O(1). Differential Revision: https://reviews.llvm.org/D46976 llvm-svn: 332551	2018-05-16 23:20:42 +00:00
Evgeny Stupachenko	bff9302c3d	Fix LSR compile time hang. Summary: Limit number of reassociations in GenerateReassociationsImpl. Reviewers: qcolombet, mkazantsev Differential Revision: https://reviews.llvm.org/D46039 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 332426	2018-05-16 02:48:50 +00:00
Marek Olsak	3c5fd145c5	StructurizeCFG: fix inverting conditions Author: Samuel Pitoiset Without this patch, it appears to me that we are selecting the wrong operand when inverting conditions. In the attached test, it will select %tmp3 instead of %tmp4. To fix it, just use 'A' as everywhere. This fixes a regression introduced by "[PatternMatch] define m_Not using m_Xor and cst_pred_ty" https://reviews.llvm.org/D46351 llvm-svn: 332403	2018-05-15 21:41:55 +00:00
Max Kazantsev	9b90373c8b	[NFC] Add const to method signature llvm-svn: 332317	2018-05-15 01:21:56 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Michael Zolotukhin	a41660df7e	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." Stage3/stage4 bootstrap miscompares should be fixed by a non-determinism fix in IDF (r332167). This reverts commit r330446. llvm-svn: 332168	2018-05-12 01:52:36 +00:00
Artem Belevich	c2cd5d5ce0	[Split GEP] handle trunc() in separate-const-offset-from-gep pass. Let separate-const-offset-from-gep pass handle trunc() when it calculates constant offset relative to base. The pass itself may insert trunc() instructions when it canonicalises array indices to pointer-size integers and needs to handle trunc() in order to evaluate the offset. Differential Revision: https://reviews.llvm.org/D46732 llvm-svn: 332142	2018-05-11 21:13:19 +00:00
Davide Italiano	6e1f7bf316	[Reassociate] Prevent infinite loops when processing PHIs. Phi nodes can reside in live blocks but one of their incoming arguments can come from a dead block. Dead blocks and reassociate don't play nice together. In fact, reassociate performs an RPO as a first step to avoid processing dead blocks. The reason why Reassociate might not fixpoint when examining dead blocks is that the following: %xor0 = xor i16 %xor1, undef %xor1 = xor i16 %xor0, undef is perfectly valid LLVM IR (if it appears in a dead block), so the worklist algorithm keeps pushing the two instructions for reexamination. Note that this is not Reassociate fault, at least not entirely. It's llvm that has a weird definition of dominance. Fixes PR37390. llvm-svn: 332100	2018-05-11 15:45:36 +00:00
Vedant Kumar	e0b5f86b30	[STLExtras] Add distance() for ranges, pred_size(), and succ_size() This commit adds a wrapper for std::distance() which works with ranges. As it would be a common case to write `distance(predecessors(BB))`, this also introduces `pred_size()` and `succ_size()` helpers to make that easier to write. Differential Revision: https://reviews.llvm.org/D46668 llvm-svn: 332057	2018-05-10 23:01:54 +00:00
Chandler Carruth	baf045fb28	[PM/LoopUnswitch] Avoid pointlessly creating an exit block set. This code can just test whether blocks are in the loop, which we already have a dedicated set tracking in the loop itself. llvm-svn: 332004	2018-05-10 17:33:20 +00:00
Daniel Neilson	71fa1b904a	[DSE] Teach the pass about partial overwrite of atomic memory intrinsics Summary: This change teaches DSE that the atomic memory intrinsics can be overwriten partially in the same way as the non-atomic forms. Specifically, that the atomic memcpy & memset can be shortened at the end and that the atomic memset can be shortened at the beginning, if they partially overwritten by later stores. Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith, spatel, filcab, sanjoy Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45584 llvm-svn: 331991	2018-05-10 15:12:49 +00:00
Bjorn Pettersson	9f953cdd7c	[MergedLoadStoreMotion] Fix a debug invariant bug in mergeStores Summary: MergedLoadStoreMotion::mergeStores is using some heuristics to limit the amount of stores that it tries to sink (see MagicCompileTimeControl in MergedLoadStoreMotion.cpp). The heuristic involves counting the number of instructions in one of the basic blocks that is part of the transformation. We now ignore dbg intrinsics when counting instruction for the MagicCompileTimeControl heuristic. This to make sure that the amount of stores that are sunk doesn't depend on the amount of debug information (if -g is used or not). Reviewers: Gerolf, davide, majnemer Reviewed By: davide Subscribers: dberlin, bjope, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D46600 llvm-svn: 331852	2018-05-09 06:52:12 +00:00
Philip Reames	5b39acd111	[LICM] Compute a must execute property for the prefix of the header as we go Computing this property within the existing walk ensures that the cost is linear with the size of the block. If we did this from within isGuaranteedToExecute, it would be quadratic without some very fancy caching. This allows us to reliably catch a hoistable instruction within a header which may throw at some point after our hoistable instruction. It doesn't do anything for non-header cases, but given how common single block loops are, this seems very worthwhile. llvm-svn: 331557	2018-05-04 21:35:00 +00:00
Craig Topper	ded8ee07e9	[LoopIdiomRecognize] Don't create an IRBuilder just to call getTrue/getFalse. We can call the methods in ConstantInt directly. We just need a context. llvm-svn: 331542	2018-05-04 17:39:08 +00:00
Max Kazantsev	786032c1b7	[IRCE] Fix misuse of dyn_cast which leads to UB llvm-svn: 331508	2018-05-04 07:34:35 +00:00
Craig Topper	9510f70636	[LoopIdiomRecognize] Replace more unchecked dyn_casts with cast. Two of these are immediately dereferenced on the next line. The other two are passed immediately to the IRBuilder constructor which can't handle a nullptr. llvm-svn: 331500	2018-05-04 01:04:28 +00:00
Craig Topper	cafae62ec9	[LoopIdiomRecognize] Use a regular array instead of a SmallVector and explicit ArrayRef. llvm-svn: 331499	2018-05-04 01:04:26 +00:00
Craig Topper	8304231508	[LoopIdiomRecognize] Turn two uncheck dyn_casts into regular casts. These are casts on users of a PHINode to Instruction. I think since PHINode is an Instruction any users would also be Instructions. At least a cast will give us an assertion if its wrong. llvm-svn: 331498	2018-05-04 01:04:24 +00:00
Piotr Padlewski	c77ab8ef2f	perform DSE through launder.invariant.group Summary: Alias Analysis knows that llvm.launder.invariant.group returns pointer that mustalias argument, but this information wasn't used, therefor we didn't DSE through launder.invariant.group Reviewers: chandlerc, dberlin, bogner, hfinkel, efriedma Reviewed By: dberlin Subscribers: amharc, llvm-commits, nlewycky, rsmith Differential Revision: https://reviews.llvm.org/D31581 llvm-svn: 331449	2018-05-03 11:03:53 +00:00
Craig Topper	856fd68690	[LoopIdiomRecognize] When looking for 'x & (x -1)' for popcnt, make sure the left hand side of the 'and' matches the left hand side of the 'subtract' llvm-svn: 331437	2018-05-03 05:48:49 +00:00
Craig Topper	8ef2abdbc4	[LoopIdiomRecognize] Remove unnecessary cast from BinaryOperator to Instruction. NFC BinaryOperator is a sub class of Instruction. We don't need an explicit cast back to Instruction. llvm-svn: 331432	2018-05-03 05:00:18 +00:00
Daniel Sanders	8d0d1aa229	[reassociate] Fix excessive revisits when processing long chains of reassociatable instructions. Summary: Some of our internal testing detected a major compile time regression which I've tracked down to: r278938 - Revert "Reassociate: Reprocess RedoInsts after each inst". It appears that processing long chains of reassociatable instructions causes non-linear (potentially exponential) growth in the number of times an instruction is revisited. For example, the included test revisits instructions 220 times in a 20-instruction test. It appears that r278938 reversed the order instructions were visited and that this is preventing scheduled revisits from being cancelled as a result of visiting the instructions naturally during normal processing. However, simply reversing the order also harmed the generated code. Upon closer inspection, it was discovered that revisits occurred in the opposite order to the first pass (Thanks to escha for spotting that). This patch makes the revisit order consistent with the first pass which allows more revisits to be cancelled. This does appear to have a small impact on the generated code in few cases but it significantly reduces compile-time. After this patch, our internal test that was most affected by the regression dropped from ~2 million revisits to ~4k resulting in Reassociate having 0.46% of the runtime it had before (99.54% improvement). Here's the summaries reported by lnt for the LLVM test-suite with --benchmarking-only: \| metric \| geomean before patch \| geomean after patch \| delta \| \| ----- \| ----- \| ----- \| ----- \| \| compile time \| 0.1956 \| 0.1261 \| -35.54% \| \| execution time \| 0.3240 \| 0.3237 \| - \| \| code size \| 7365.4459 \| 7365.6079 \| - \| The results have a few wins and losses on compile-time, mostly in the +/- 2.5% range. There was one outlier though: \| Performance Regressions - compile_time \| Δ \| Previous \| Current \| \| MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk \| 9.82% \| 2.0473 \| 2.2483 \| Reviewers: javed.absar, dberlin Reviewed By: dberlin Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45734 llvm-svn: 331381	2018-05-02 17:59:16 +00:00
Florian Hahn	5912c667b0	[LoopInterchange] Update some loops to use range base for loops (NFC). llvm-svn: 331342	2018-05-02 10:53:04 +00:00
Adrian Prantl	4dfcc4a788	Remove @brief commands from doxygen comments, too. This is a follow-up to r331272. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done https://reviews.llvm.org/D46290 llvm-svn: 331275	2018-05-01 16:10:38 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Wei Mi	eec5ba9fae	Fix the issue that ComputeValueKnownInPredecessors only handles the case when phi is on lhs of a comparison op. For the following testcase, L1: %t0 = add i32 %m, 7 %t3 = icmp eq i32* %t2, null br i1 %t3, label %L3, label %L2 L2: %t4 = load i32, i32* %t2, align 4 br label %L3 L3: %t5 = phi i32 [ %t0, %L1 ], [ %t4, %L2 ] %t6 = icmp eq i32 %t0, %t5 br i1 %t6, label %L4, label %L5 We know if we go through the path L1 --> L3, %t6 should always be true. However currently, if the rhs of the eq comparison is phi, JumpThreading fails to evaluate %t6 to true. And we know that Instcombine cannot guarantee always canonicalizing phi to the left hand side of the comparison operation according to the operand priority comparison mechanism in instcombine. The patch handles the case when rhs of the comparison op is a phi. Differential Revision: https://reviews.llvm.org/D46275 llvm-svn: 331266	2018-05-01 14:47:24 +00:00
Chandler Carruth	2c85a23123	[PM/LoopUnswitch] Remove the last manual domtree update code from loop unswitch and replace it with the amazingly simple update API code. This addresses piles of FIXMEs around the update logic here and makes everything substantially simpler. llvm-svn: 331247	2018-05-01 09:54:39 +00:00
Chandler Carruth	44aab925fd	[PM/LoopUnswitch] Add back a successor set that was removed based on code review. It turns out this is necessary, and I read the comment on the API correctly the first time. ;] The `applyUpdates` routine requires that updates are "balanced". This is in order to cleanly handle cycles like inserting, removing, nad then re-inserting the same edge. This precludes inserting the same edge multiple times in a row as handling that would cause the insertion logic to become ordered instead of unordered (which is what the API provides). It happens that in this specific case nothing (other than an assert and contract violation) goes wrong because we're never inserting and removing the same edge. The implementation happens to do the right thing to eliminate redundant insertions in that case. But the requirement is there and there is an assert to catch it. Somehow, after the code review I never did another asserts-clang build testing loop-unswich for a long time. As a consequence, I didn't notice this despite a bunch of testing going on, but it shows up immediately with an asserts build of clang itself. llvm-svn: 331246	2018-05-01 09:42:09 +00:00
Nico Weber	432a38838d	IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff \| diffstat -l \| xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184	2018-04-30 14:59:11 +00:00
Philip Reames	502d4481d4	[LoopGuardWidening] Make PostDomTree optional The effect of doing so is not disrupting the LoopPassManager when mixing this pass with other loop passes. This should help locality of access substaintially and avoids the cost of computing PostDom. The assumption here is that the full GuardWidening (which does use PostDom) is run as a canonicalization before loop opts and that this version is just catching cases exposed by other loop passes. (i.e. LoopPredication, IndVarSimplify, LoopUnswitch, etc..) llvm-svn: 331094	2018-04-27 23:15:56 +00:00
Philip Reames	5a6482450a	[LICM] Reduce nesting with an early return [NFC] llvm-svn: 331080	2018-04-27 20:58:30 +00:00
Philip Reames	de5a1da2d2	[GuardWidening] Add some clarifying comments about heuristics [NFC] llvm-svn: 331061	2018-04-27 17:41:37 +00:00
Philip Reames	9258e9d190	[LoopGuardWidening] Split out a loop pass version of GuardWidening The idea is to have a pass which performs the same transformation as GuardWidening, but can be run within a loop pass manager without disrupting the pass manager structure. As demonstrated by the test case, this doesn't quite get there because of issues with post dom, but it gives a good step in the right direction. the motivation is purely to reduce compile time since we can now preserve locality during the loop walk. This patch only includes a legacy pass. A follow up will add a new style pass as well. llvm-svn: 331060	2018-04-27 17:29:10 +00:00
Florian Hahn	f3fea0f11f	[LoopInterchange] Allow some loops with PHI nodes in the exit block. We currently support LCSSA PHI nodes in the outer loop exit, if their incoming values do not come from the outer loop latch or if the outer loop latch has a single predecessor. In that case, the outer loop latch will be executed only if the inner loop gets executed. If we have multiple predecessors for the outer loop latch, it may be executed even if the inner loop does not get executed. This is a first step to support the case described in https://bugs.llvm.org/show_bug.cgi?id=30472 Reviewers: efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D43237 llvm-svn: 331037	2018-04-27 13:52:51 +00:00
Florian Hahn	fd2bc11248	[LoopInterchange] Ignore debug intrinsics during legality checks. Reviewers: aprantl, mcrosier, karthikthecool Reviewed By: aprantl Subscribers: mattd, vsk, #debug-info, llvm-commits Differential Revision: https://reviews.llvm.org/D45379 llvm-svn: 330931	2018-04-26 10:26:17 +00:00
Florian Hahn	1da30c659d	[LoopInterchange] Use getExitBlock()/getExitingBlock instead of manual impl. This also means we have to check if the latch is the exiting block now, as `transform` expects the latches to be the exiting blocks too. https://bugs.llvm.org/show_bug.cgi?id=36586 Reviewers: efriedma, davide, karthikthecool Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45279 llvm-svn: 330806	2018-04-25 09:35:54 +00:00
Bjorn Pettersson	bec2a7c4eb	[DebugInfo] Invalidate debug info in ReassociatePass::RewriteExprTree Summary: When Reassociate is rewriting an expression tree it may reuse old binary expression nodes, for new expressions. Whenever an expression node is reused, but with a non-trivial change in the result, we need to invalidate any debug info that is associated with the node. If for example rewriting x = mul a, b y = mul c, x into x = mul c, b y = mul a, x we still get the same result for 'y', but 'x' is a new expression. All debug info referring to 'x' must be invalidated (marked as optimized out) since we no longer calculate the expected value. As a side-effect this patch avoid (at least some) problems where reassociate could end up creating IR with debug-use before def. Earlier the dbg.value nodes where left untouched in the IR, while the reused binary nodes where sinked to just before the root node of the rewritten expression tree. See PR27273 for more info about such problems. Reviewers: dblaikie, aprantl, dexonsmith Reviewed By: aprantl Subscribers: JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D45975 llvm-svn: 330804	2018-04-25 09:23:56 +00:00
Geoff Berry	2af5f3c1e5	[DivRemPairs] Fix non-determinism in use list order. Summary: Use a MapVector instead of a DenseMap for RemMap since it is iteratated over and the order of iteration can effect the order that new instructions are created. This can in turn effect the use list order of div/rem input values if multiple new instructions are created that share any input values. Reviewers: spatel Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D45858 llvm-svn: 330792	2018-04-25 02:17:56 +00:00
Chandler Carruth	69e68f8468	[PM/LoopUnswitch] Begin teaching SimpleLoopUnswitch to use the new update API for dominators rather than doing manual, hacky updates. This is just the first step, but in some ways the most important as it moves the non-trivial unswitching to update the domtree rather than fully recalculating it each time. Subsequent patches should remove the custom update logic used by the trivial unswitch and replace it with uses of the update API. This also fixes a number of bugs I was seeing when testing non-trivial unswitch due to it querying the quasi-correct dominator tree. Now the tree is 100% correct and safe to query. That said, there are still more bugs I can see with non-trivial unswitch just running over the test suite, so more bugfix patches are needed as well. Thanks to both Sanjoy and Fedor for reviews and testing! Differential Revision: https://reviews.llvm.org/D45943 llvm-svn: 330787	2018-04-25 00:18:07 +00:00
Florian Hahn	ceee788947	[LoopInterchange] Make isProfitableForVectorization slightly more conservative. After D43236, we started interchanging loops with empty dependence matrices. In isProfitableForVectorization, we try to determine if interchanging makes the loop dependences more friendly to the vectorizer. If there are no dependences, we should not interchange, based on that heuristic. Reviewers: efriedma, mcrosier, karthikthecool, blitz.opensource Reviewed By: mcrosier Differential Revision: https://reviews.llvm.org/D45208 llvm-svn: 330738	2018-04-24 16:55:32 +00:00
David Blaikie	ba47dd16c5	Fix some layering in AggressiveInstCombine (avoiding inclusion of Scalar.h) llvm-svn: 330726	2018-04-24 15:40:07 +00:00
Chandler Carruth	43acdb35bc	[PM/LoopUnswitch] Fix a bug in the loop block set formation of the new loop unswitch. This code incorrectly added the header to the loop block set early. As a consequence we would incorrectly conclude that a nested loop body had already been visited when the header of the outer loop was the preheader of the nested loop. In retrospect, adding the header eagerly doesn't really make sense. It seems nicer to let the cycle be formed naturally. This will catch crazy bugs in the CFG reconstruction where we can't correctly form the cycle earlier rather than later, and makes the rest of the logic just fall out. I've also added various asserts that make these issues much easier to debug. llvm-svn: 330707	2018-04-24 10:33:08 +00:00
Chandler Carruth	0ace148ca6	[PM/LoopUnswitch] Remove another over-aggressive assert. This code path can very clearly be called in a context where we have baselined all the cloned blocks to a particular loop and are trying to handle nested subloops. There is no harm in this, so just relax the assert. I've added a test case that will make sure we actually exercise this code path. llvm-svn: 330680	2018-04-24 03:27:00 +00:00
David Blaikie	a27771b62f	InstCombine: Fix layering by not including Scalar.h in InstCombine (notionally Scalar.h is part of libLLVMScalarOpts, so it shouldn't be included by InstCombine which doesn't/shouldn't need to depend on ScalarOpts) llvm-svn: 330669	2018-04-24 00:48:59 +00:00
Craig Topper	1bcb258ba3	[AggressiveInstCombine] Add aggressive inst combiner to the LLVM C API. I just tried to copy what was done for regular InstCombine. Hopefully I didn't miss anything. llvm-svn: 330668	2018-04-24 00:39:29 +00:00
Florian Hahn	7441818560	[LoopInterchange] Do not change LI for BBs in child loops. If a loop with child loops becomes our new inner loop after interchanging, we only need to update LoopInfo for the blocks defined in the old outer loop. BBs in child loops will stay there. Reviewers: efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45970 llvm-svn: 330653	2018-04-23 21:38:19 +00:00
Xin Tong	8edff27923	[CallSiteSplit] Make sure we remove nonnull if the parameter turns out to be a constant. Summary: We do not need nonull attribute if we know an argument is going to be constant. Reviewers: junbuml, davide, fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45608 llvm-svn: 330641	2018-04-23 20:09:08 +00:00
Bjorn Pettersson	8e484dc531	[MemCpyOpt] Skip optimizing basic blocks not reachable from entry Summary: Skip basic blocks not reachable from the entry node in MemCpyOptPass::iterateOnFunction. Code that is unreachable may have properties that do not exist for reachable code (an instruction in a basic block can for example be dominated by a later instruction in the same basic block, for example if there is a single block loop). MemCpyOptPass::processStore is only safe to use for reachable basic blocks, since it may iterate past the basic block beginning when used for unreachable blocks. By simply skipping to optimize unreachable basic blocks we can avoid asserts such as "Assertion `!NodePtr->isKnownSentinel()' failed." in MemCpyOptPass::processStore. The problem was detected by fuzz tests. Reviewers: eli.friedman, dneilson, efriedma Reviewed By: efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D45889 llvm-svn: 330635	2018-04-23 19:55:04 +00:00
Daniel Neilson	cc45e923c5	[DSE] Teach the pass that atomic memory intrinsics are stores. Summary: This change teaches DSE that the atomic memory intrinsics are stores that can be eliminated, and can allow other stores to be eliminated. This change specifically does not teach DSE that these intrinsics can be partially eliminated (i.e. length reduced, and dest/src changed); that will be handled in another change. Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith Reviewed By: efriedma Subscribers: dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D45535 llvm-svn: 330629	2018-04-23 19:06:49 +00:00
Chandler Carruth	bf7190a154	[PM/LoopUnswitch] Remove a buggy assert in the new loop unswitch. The condition this was asserting doesn't actually hold. I've added comments to explain why, removed the assert, and added a fun test case reduced out of 403.gcc. llvm-svn: 330564	2018-04-23 06:58:36 +00:00
Chandler Carruth	b525424118	[PM/LoopUnswitch] Fix comment typo. NFC. llvm-svn: 330560	2018-04-23 00:48:42 +00:00
Michael Zolotukhin	e268304122	Revert r330431. There are still stage3/stage4 miscompares :( llvm-svn: 330446	2018-04-20 16:57:10 +00:00
Florian Hahn	773872fd67	[NewGVN] Split OpPHI detection and creation. It also adds a check making sure PHIs for operands are all in the same block. Patch by Daniel Berlin <dberlin@dberlin.org> Reviewers: dberlin, davide Differential Revision: https://reviews.llvm.org/D43865 llvm-svn: 330444	2018-04-20 16:37:13 +00:00
Michael Zolotukhin	a2c9af0209	Revert "Revert r330403 and r330413." Reapply the patches with a fix. Thanks Ilya and Hans for the reproducer! This reverts commit r330416. The issue was that removing predecessors invalidated uses that we stored for rewrite. The fix is to finish manipulating with CFG before we select uses for rewrite. llvm-svn: 330431	2018-04-20 13:34:32 +00:00
Ilya Biryukov	afe822bd6d	Revert r330403 and r330413. Revert r330413: "[SSAUpdaterBulk] Use SmallVector instead of DenseMap for storing rewrites." Revert r330403 "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time." r330403 commit seems to crash clang during our integrate while doing PGO build with the following stacktrace: #2 llvm::SSAUpdaterBulk::RewriteAllUses(llvm::DominatorTree, llvm::SmallVectorImpl<llvm::PHINode>) #3 llvm::JumpThreadingPass::ThreadEdge(llvm::BasicBlock, llvm::SmallVectorImpl<llvm::BasicBlock> const&, llvm::BasicBlock) #4 llvm::JumpThreadingPass::ProcessThreadableEdges(llvm::Value, llvm::BasicBlock, llvm::jumpthreading::ConstantPreference, llvm::Instruction) #5 llvm::JumpThreadingPass::ProcessBlock(llvm::BasicBlock) The crash happens while compiling 'lib/Analysis/CallGraph.cpp'. r3340413 is reverted due to conflicting changes. llvm-svn: 330416	2018-04-20 10:52:54 +00:00
Michael Zolotukhin	9dea079315	[SSAUpdaterBulk] Use SmallVector instead of DenseMap for storing rewrites. llvm-svn: 330413	2018-04-20 10:31:06 +00:00
Michael Zolotukhin	79e4f7fadb	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. Hopefully, changing set to vector removes nondeterminism detected by some bots, or the new assert will catch something. This reverts commit r330180. llvm-svn: 330403	2018-04-20 08:01:08 +00:00
Jin Lin	585f2699cf	Refine the loop rotation's API Summary: The following changes addresses the following two issues. 1) The existing loop rotation pass contains both loop latch simplification and loop rotation. So one flag RotationOnly is added to be passed to the loop rotation pass. 2) The threshold value is initialized with MAX_UINT since the loop rotation utility should not have threshold limit. Reviewers: dmgreen, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45582 llvm-svn: 330362	2018-04-19 20:29:43 +00:00
Chandler Carruth	32e62f9c5b	[PM/LoopUnswitch] Detect irreducible control flow within loops and skip unswitching non-trivial edges. Summary: This fixes the bug pointed out in review with non-trivial unswitching. This also provides a basis that should make it pretty easy to finish fleshing out a routine to scan an entire function body for irreducible control flow, but this patch remains minimal for disabling loop unswitch. Reviewers: sanjoy, fedor.sergeev Subscribers: mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45754 llvm-svn: 330357	2018-04-19 18:44:25 +00:00
Sanjay Patel	a201787fd7	[Reassociate] fix formatting; NFC llvm-svn: 330348	2018-04-19 17:56:36 +00:00
Florian Hahn	b789165e6b	[NewGVN] Add ops as dependency if we cannot find a leader for ValueOp. If those operands change, we might find a leader for ValueOp, which could enable new phi-of-op creation. This fixes a case where we missed creating a phi-of-ops node. With D43865 and this patch, bootstrapping clang/llvm works with -enable-newgvn, whereas without it, the "value changed after iteration" assertion is triggered. Reviewers: dberlin, davide Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D42180 llvm-svn: 330334	2018-04-19 15:05:47 +00:00
Sam Parker	3c19051bf0	[IRCE] Only check for NSW on equality predicates After investigation discussed in D45439, it would seem that the nsw flag restriction is unnecessary in most cases. So the IsInductionVar lambda has been removed, the functionality extracted, and now only require nsw when using eq/ne predicates. Differential Revision: https://reviews.llvm.org/D45617 llvm-svn: 330256	2018-04-18 13:50:28 +00:00
Michael Zolotukhin	21458fdc55	Revert "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." again." This reverts r330175. There are still stage3/stage4 miscompares. llvm-svn: 330180	2018-04-17 07:31:27 +00:00
Michael Zolotukhin	3f5fd1b129	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." again. One more, hopefully the last, bug is fixed: when forming UsesToRewrite we should ignore phi operands coming from edges that we want to delete. This reverts r329910. llvm-svn: 330175	2018-04-17 04:45:22 +00:00
Roman Lebedev	25cbb62d18	[NFC] ConstantOffsetExtractor::CanTraceInto(): add FIXME: no tests As suggested in https://reviews.llvm.org/D45631#1068338, looking at haveNoCommonBitsSet() users, and trying to show the change effect elsewhere. llvm-svn: 330100	2018-04-15 18:59:27 +00:00
Hiroshi Inoue	ae17900997	[NFC] fix trivial typos in document and comments "not not" -> "not" etc llvm-svn: 330083	2018-04-14 08:59:00 +00:00
Mandeep Singh Grang	636d94db3b	[Transforms] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: kcc, pcc, danielcdh, jmolloy, sanjoy, dberlin, ruiu Reviewed By: ruiu Subscribers: ruiu, llvm-commits Differential Revision: https://reviews.llvm.org/D45142 llvm-svn: 330059	2018-04-13 19:47:57 +00:00
Xin Tong	d83c883d29	[CallSiteSplit] Fix comment. NFC llvm-svn: 329987	2018-04-13 04:35:38 +00:00
Benjamin Kramer	b4ba3988bb	Revert "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time." This reverts commit r329865. Causes stage2/stage3 miscompare. llvm-svn: 329910	2018-04-12 13:52:02 +00:00
Sam Parker	9737535943	[IRCE] isKnownNonNegative helper function Created a helper function to query for non negative SCEVs. Uses the SGE predicate to catch constants that could be interpreted as negative. Differential Revision: https://reviews.llvm.org/D45481 llvm-svn: 329907	2018-04-12 12:49:40 +00:00
Hiroshi Inoue	bcadfee2ad	[NFC] fix trivial typos in documents and comments "is is" -> "is", "if if" -> "if", "or or" -> "or" llvm-svn: 329878	2018-04-12 05:53:20 +00:00
Michael Zolotukhin	815f453f76	Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. This reapplies commit r329644. llvm-svn: 329865	2018-04-11 23:37:53 +00:00
Sanjay Patel	3b6d46761f	[CVP] simplify phi with constant incoming values that match common variable edge values This is based on an example that was recently posted on llvm-dev: void propagate_null(void b, int* g) { if (!b) { return 0; } (*g)++; return b; } https://godbolt.org/g/xYk3qG The original code or constant propagation in other passes has obscured the fact that the phi can be removed completely. Differential Revision: https://reviews.llvm.org/D45448 llvm-svn: 329755	2018-04-10 20:42:39 +00:00
Michael Zolotukhin	d6beefd5d3	Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time. This reverts r329661. Bots are still unhappy. llvm-svn: 329666	2018-04-10 03:40:29 +00:00
Michael Zolotukhin	8a13f6d4a7	Revert "Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading."" This reapplies commit r329644. llvm-svn: 329661	2018-04-10 02:16:45 +00:00
Michael Zolotukhin	0274632ee6	Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." This reverts commit r329644. llvm-svn: 329650	2018-04-10 00:42:43 +00:00
Michael Zolotukhin	c6d2d65f37	[PR16756] Use SSAUpdaterBulk in JumpThreading. Summary: SSAUpdater is a bottleneck in JumpThreading, and this patch improves the situation by using SSAUpdaterBulk instead. Compile time impact: no noticable changes on CTMark, a big improvement on the test from PR16756. Reviewers: dberlin, davide, MatzeB Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44282 llvm-svn: 329644	2018-04-09 23:37:37 +00:00
Xin Tong	fdad23bc36	[MergeICmp] Update debug msg.NFC llvm-svn: 329572	2018-04-09 14:29:13 +00:00
Xin Tong	0efadbbcde	[MergeICmp] Split blocks that do other work. Summary: We do not try to move the instructions and split the block till we know the blocks can be split, i.e. BCE-cmp-insts can be separated from non-BCE-cmp-insts. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44443 llvm-svn: 329564	2018-04-09 13:14:06 +00:00
Max Kazantsev	8624a4786a	[IRCE] Relax restriction on collected range checks In IRCE, we have a very old legacy check that works when we collect comparisons that we treat as range checks. It ensures that the value against which the indvar is compared is loop invariant and is also positive. This latter condition remained there since the times when IRCE was only able to handle signed latch comparison. As the optimization evolved, it now learned how to intersect signed or unsigned ranges, and this logic has no reliance on the fact that the right border of each range should be positive. The old implementation of this non-negativity check was also naive enough and just looked into ranges (while most of other IRCE logic tries to use power of SCEV implications), so this check did not allow to deal with the most simple case that looks like follows: int size; // not known non-negative int length; //known non-negative; i = 0; if (size != 0) { do { range_check(i < size); range_check(i < length); ++i; } while (i < size) } In this case, even if from some dominating conditions IRCE could parse loop structure, it could only remove the range check against `length` and simply ignored the check against `size`. In this patch we remove this obsolete check. It will allow IRCE to pick comparison against `size` as a potential range check and then let Range Intersection logic decide whether it is OK to eliminate it or not. Differential Revision: https://reviews.llvm.org/D45362 Reviewed By: samparker llvm-svn: 329547	2018-04-09 06:01:22 +00:00
Hiroshi Inoue	9ff2380ea6	[NFC] fix trivial typos in comments and error message "is is" -> "is", "are are" -> "are" llvm-svn: 329546	2018-04-09 04:37:53 +00:00
Xin Tong	99c4e2f364	[LIR] Reorder header. NFC llvm-svn: 329530	2018-04-08 13:19:53 +00:00
Geoff Berry	5bf4a5eafa	[EarlyCSE] Add debug counter for debugging mis-optimizations. NFC. Reviewers: reames, spatel, davide, dberlin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D45162 llvm-svn: 329443	2018-04-06 18:47:33 +00:00
Max Kazantsev	832563a782	[NFC] Add missing end of line symbols llvm-svn: 329383	2018-04-06 09:47:06 +00:00
Florian Hahn	6e0043365b	[LoopInterchange] Add stats counter for number of interchanged loops. Reviewers: samparker, karthikthecool, blitz.opensource Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D45209 llvm-svn: 329269	2018-04-05 10:39:23 +00:00
Florian Hahn	831a757728	[LoopInterchange] Preserve LoopInfo after interchanging. LoopInterchange relies on LoopInfo being up-to-date, so we should preserve it after interchanging. This patch updates restructureLoops to move the BBs of the interchanged loops to the right place. Reviewers: davide, efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45278 llvm-svn: 329264	2018-04-05 09:48:45 +00:00
Taewook Oh	e0db533feb	[CallSiteSplitting] Do not perform callsite splitting inside landing pad Summary: If the callsite is inside landing pad, do not perform callsite splitting. Callsite splitting uses utility function llvm::DuplicateInstructionsInSplitBetween, which eventually calls llvm::SplitEdge. llvm::SplitEdge calls llvm::SplitCriticalEdge with an assumption that the function returns nullptr only when the target edge is not a critical edge (and further assumes that if the return value was not nullptr, the predecessor of the original target edge always has a single successor because critical edge splitting was successful). However, this assumtion is not true because SplitCriticalEdge returns nullptr if the destination block is a landing pad. This invalid assumption results assertion failure. Fundamental solution might be fixing llvm::SplitEdge to not to rely on the invalid assumption. However, it'll involve a lot of work because current API assumes that llvm::SplitEdge never fails. Instead, this patch makes callsite splitting to not to attempt splitting if the callsite is in a landing pad. Attached test case will crash with assertion failure without the fix. Reviewers: fhahn, junbuml, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45130 llvm-svn: 329250	2018-04-05 04:16:23 +00:00
Nicolai Haehnle	eb7311ffb1	StructurizeCFG: Test for branch divergence correctly Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform, so the branch is non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. As discovered after committing an earlier version of this change, this exposes a subtle interaction between this pass and DivergenceAnalysis: since we remove and re-create branch instructions, we can no longer rely on DivergenceAnalysis for branches in subregions that were already processed by the pass. Explicitly remove branch instructions from DivergenceAnalysis to avoid dangling pointers as a matter of defensive programming, and change how we detect non-uniform subregions. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Differential Revision: https://reviews.llvm.org/D43743 llvm-svn: 329165	2018-04-04 10:58:15 +00:00
Ikhlas Ajbar	1376d934ed	[Hexagon] peel loops with runtime small trip counts Move the check canPeel() to Hexagon Target before setting PeelCount. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329129	2018-04-03 22:55:09 +00:00
Florian Hahn	9467ccf447	[LoopInterchange] Add remark for calls preventing interchanging. It also updates test/Transforms/LoopInterchange/call-instructions.ll to use accesses where we can prove dependence after D35430. Reviewers: sebpop, karthikthecool, blitz.opensource Reviewed By: sebpop Differential Revision: https://reviews.llvm.org/D45206 llvm-svn: 329111	2018-04-03 20:54:04 +00:00
Ikhlas Ajbar	b7322e8ac7	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329042	2018-04-03 03:39:43 +00:00
Fangrui Song	956ee79795	Fix a bunch of typoes. NFC llvm-svn: 328907	2018-03-30 22:22:31 +00:00
Philip Reames	5c14ed89f6	[NFC][LICM] Rearrange checks to have the cheap bail out first llvm-svn: 328822	2018-03-29 20:32:15 +00:00
Haicheng Wu	c7cc87922e	[JumpThreading] Don't select an edge that we know we can't thread In r312664 (D36404), JumpThreading stopped threading edges into loop headers. Unfortunately, I observed a significant performance regression as a result of this change. Upon further investigation, the problematic pattern looked something like this (after many high level optimizations): while (true) { bool cond = ...; if (!cond) { <body> } if (cond) break; } Now, naturally we want jump threading to essentially eliminate the second if check and hook up the edges appropriately. However, the above mentioned change, prevented it from doing this because it would have to thread an edge into the loop header. Upon further investigation, what is happening is that since both branches are threadable, JumpThreading picks one of them at arbitrarily. In my case, because of the way that the IR ended up, it tended to pick the one to the loop header, bailing out immediately after. However, if it had picked the one to the exit block, everything would have worked out fine (because the only remaining branch would then be folded, not thraded which is acceptable). Thus, to fix this problem, we can simply eliminate loop headers from consideration as possible threading targets earlier, to make sure that if there are multiple eligible branches, we can still thread one of the ones that don't target a loop header. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42260 llvm-svn: 328798	2018-03-29 16:01:26 +00:00
David Green	b0aa36f9c2	[LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface The existing LoopRotation.cpp is implemented as one of loop passes instead of being a utility. The user cannot easily perform the loop rotation selectively (or on demand) under different optimization level. For example, the loop rotation is needed as part of the logic to convert a loop into a loop with bottom test for a transformation. If the loop rotation is simply added as a loop pass before the transformation, the pass is skipped if it is compiled at –O0 or if it is explicitly disabled by the user, causing the compiler to generate incorrect code. Furthermore, as a loop pass it will rotate all loops instead of just the relevant loops. We provide a utility interface for the loop rotation so that the loop rotation can be called on demand. The changeset is as follows: - Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main implementation of class LoopRotate into this file. - Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the interface LoopRotation(...). - Original LoopRotation.cpp is changed to use the utility function LoopRotation in LoopRotationUtils.cpp. This is done in the same way community did for mem-to-reg implementation. Patch by Jin Lin! Differential Revision: https://reviews.llvm.org/D44595 llvm-svn: 328766	2018-03-29 08:48:15 +00:00
David Blaikie	8ad9a97310	Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737	2018-03-28 22:28:50 +00:00
David Blaikie	eb8cc04ea2	Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back llvm-svn: 328720	2018-03-28 18:03:25 +00:00
David Blaikie	a373d18eb7	Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717	2018-03-28 17:44:36 +00:00
Xin Tong	0272cb077f	80-line wrap. NFC llvm-svn: 328660	2018-03-27 19:43:02 +00:00
Sam Parker	90b7f4f72c	[IRCE] Enable decreasing loops of non-const bound As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner: - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value. - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds. Differential Revision: https://reviews.llvm.org/D44776 llvm-svn: 328613	2018-03-27 08:24:53 +00:00
Krzysztof Parzyszek	0b377e0ae9	[LSR] Allow giving priority to post-incrementing addressing modes Implement TTI interface for targets to indicate that the LSR should give priority to post-incrementing addressing modes. Combination of patches by Sebastian Pop and Brendon Cahoon. Differential Revision: https://reviews.llvm.org/D44758 llvm-svn: 328490	2018-03-26 13:10:09 +00:00
Sam Parker	53a423a417	[IRCE] Enable increasing loops of variable bounds CanBeMin is currently used which will report true for any unknown values, but often a check is performed outside the loop which covers this situation: for (int i = 0; i < N; ++i) ... if (N > 0) for (int i = 0; i < N; ++i) ... So I've add 'LoopGuardedAgainstMin' which reports whether N is greater than the minimum value which then allows loop with a variable loop count to be optimised. I've also moved the increasing bound checking into its own function and replaced SumCanReachMax is another isLoopEntryGuardedByCond function. llvm-svn: 328480	2018-03-26 09:29:42 +00:00
Philip Reames	6a1f3446b5	[GuardWidening] Group code by class [NFC] llvm-svn: 328387	2018-03-23 23:41:47 +00:00
Andrew Kaylor	a237866faf	Fix a block copying problem in LICM Differential Revision: https://reviews.llvm.org/D44817 llvm-svn: 328336	2018-03-23 17:36:18 +00:00
Florian Hahn	f73c3ece7f	Revert r328307: [IPSCCP] Use constant range information for comparisons of parameters. Reverted for now, due to it causing verifier failures. llvm-svn: 328312	2018-03-23 12:49:39 +00:00
Florian Hahn	b1feec087e	[IPSCCP] Use constant range information for comparisons of parameters. For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 328307	2018-03-23 11:56:00 +00:00
David Blaikie	376294c23a	Finish moving the IPSCCP pass from Scalar to IPO - moving the registration llvm-svn: 328259	2018-03-22 22:07:53 +00:00
David Blaikie	3bbf5af0ac	Fix layering between SCCP and IPO SCCP Transforms/Scalar/SCCP.cpp implemented both the Scalar and IPO SCCP, but this meant Transforms/Scalar including Transfroms/IPO headers, creating a circular dependency. (IPO depends on Scalar already) - so move the IPO SCCP shims out into IPO and the basic library implementation accessible from Scalar/SCCP.h to be used from the IPO/SCCP.cpp implementation. llvm-svn: 328250	2018-03-22 21:41:29 +00:00
Anna Thomas	9b1176b0ef	[LoopPredication] Add profitability check based on BPI Summary: LoopPredication is not profitable when the loop is known to always exit through some block other than the latch block. A coarse grained latch check can cause loop predication to predicate the loop, and unconditionally deoptimize. However, without predicating the loop, the guard may never fail within the loop during the dynamic execution because the non-latch loop termination condition exits the loop before the latch condition causes the loop to exit. We teach LP about this using BranchProfileInfo pass. Reviewers: apilipenko, skatkov, mkazantsev, reames Reviewed by: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44667 llvm-svn: 328210	2018-03-22 16:03:59 +00:00
Florian Hahn	9bc0bc4b9b	[CallSiteSplitting] Preserve DominatorTreeAnalysis. The dominator tree analysis can be preserved easily. Some other kinds of analysis can probably be preserved too. Reviewers: junbuml, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43173 llvm-svn: 328206	2018-03-22 15:23:33 +00:00
David Blaikie	2be3922807	Fix a couple of layering violations in Transforms Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering. Transforms depends on Transforms/Utils, not the other way around. So remove the header and the "createStripGCRelocatesPass" function declaration (& definition) that is unused and motivated this dependency. Move Transforms/Utils/Local.h into Analysis because it's used by Analysis/MemoryBuiltins.cpp. llvm-svn: 328165	2018-03-21 22:34:23 +00:00
Daniel Neilson	6f1eb58e92	[MemCpyOpt] Update to new API for memory intrinsic alignment Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the MemCpyOpt pass to cease using: 1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. 2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. We also add a few tests to fill gaps in the testing of this pass. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 328097	2018-03-21 14:14:55 +00:00
Justin Lebar	038cbc5c13	Re-re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Backed out for causing performance regressions. Re-landing because we've determined that these regressions were noise. Original Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 328096	2018-03-21 14:08:21 +00:00
Xin Tong	a713ebea24	[MergeICmps] Break eargerly out of loop llvm-svn: 327972	2018-03-20 12:03:25 +00:00
Xin Tong	bdbd97ed9a	[MergeICmp] Fix a bug in entry block shuffled to middle of the chain Summary: Fix a bug in entry block shuffled to middle of the chain. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44642 llvm-svn: 327971	2018-03-20 11:57:54 +00:00
Anastasis Grammenos	3a589103a4	[LICM] Salvage DI from dying Instructions LICM deletes trivially dead instructions which it won't attempt to sink. Attempt to salvage debug values which reference these instructions. llvm-svn: 327800	2018-03-18 15:59:19 +00:00
Craig Topper	71d69b2ea5	[CorrelatedValuePropagation] Use SelectInst::getCondition/getTrueValue/getFalseValue instead of getOperand for readability. NFC llvm-svn: 327728	2018-03-16 18:18:47 +00:00
Brian M. Rzycki	f65ddc5fa2	[JumpThreading] Track unreachable BBs to avoid processing JumpThreading iterates over F until the IR quiesces. Transforming unreachable BBs increases compile time and it is also possible to never stabilize causing JumpThreading to hang. An older attempt at fixing this problem was D3991 where removeUnreachableBlocks(F) was called before JumpThreading began. This has a few drawbacks: * expensive - the routine attempts to fix up the IR to identify additional BBs that can be removed along with unreachable BBs. * aggressive - does not identify and preserve the shape of the IR. At a minimum it does not preserve loop hierarchies. * invasive - altering reachable blocks it may disrupt IR shapes that could have otherwise been JumpThreaded. This patch avoids removeUnreachableBlocks(F) and instead tracks unreachable BBs in a SmallPtrSet using DominatorTree to validate the initial state of all BBs. We then rely on subsequent passes to identify and remove these unreachable blocks from F. Reviewers: dberlin, sebpop, kuhar, dinesh.d Reviewed by: sebpop, kuhar Subscribers: hiraditya, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D44177 llvm-svn: 327713	2018-03-16 15:13:47 +00:00
Florian Hahn	fc97b6173f	[LoopUnroll] Peel off iterations if it makes conditions true/false. If the loop body contains conditions of the form IndVar < #constant, we can remove the checks by peeling off #constant iterations. This improves codegen for PR34364. Reviewers: mkuper, mkazantsev, efriedma Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43876 llvm-svn: 327671	2018-03-15 21:34:43 +00:00
Philip Reames	422024a1b7	[EarlyCSE] Don't hide earler invariant.scopes If we've already established an invariant scope with an earlier generation, we don't want to hide it in the scoped hash table with one with a later generation. I noticed this when working on the invariant-load handling, but it also applies to the invariant.start case as well. Without this change, my previous patch for invariant-load regresses some cases, so I'm pushing this without waiting for review. This is why you don't make last minute tweaks to patches to catch "obvious cases" after it's already been reviewed. Bad Philip! llvm-svn: 327655	2018-03-15 18:12:27 +00:00
Philip Reames	ca587fe0b4	[EarlyCSE] Reuse invariant scopes for invariant load This is a follow up to https://reviews.llvm.org/D43716 which rewrites the invariant load handling using the new infrastructure. It's slightly more powerful, but only in somewhat minor ways for the moment. It's not clear that DSE of stores to invariant locations is actually interesting since why would your IR have such a construct to start with? Note: The submitted version is slightly different than the reviewed one. I realized the scope could start for an invariant load which was proven redundant and removed. Added a test case to illustrate that as well. Differential Revision: https://reviews.llvm.org/D44497 llvm-svn: 327646	2018-03-15 17:29:32 +00:00
Fedor Sergeev	194a407bda	[New PM][IRCE] port of Inductive Range Check Elimination pass to the new pass manager There are two nontrivial details here: * Loop structure update interface is quite different with new pass manager, so the code to add new loops was factored out * BranchProbabilityInfo is not a loop analysis, so it can not be just getResult'ed from within the loop pass. It cant even be queried through getCachedResult as LoopCanonicalization sequence (e.g. LoopSimplify) might invalidate BPI results. Complete solution for BPI will likely take some time to discuss and figure out, so for now this was partially solved by making BPI optional in IRCE (skipping a couple of profitability checks if it is absent). Most of the IRCE tests got their corresponding new-pass-manager variant enabled. Only two of them depend on BPI, both marked with TODO, to be turned on when BPI starts being available for loop passes. Reviewers: chandlerc, mkazantsev, sanjoy, asbirlea Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43795 llvm-svn: 327619	2018-03-15 11:01:19 +00:00
Andrei Elovikov	f9b8035f3c	[LoopUnroll] Ignore ephemeral values when checking full unroll profitability. Summary: Before this patch call graph is like this in the LoopUnrollPass: tryToUnrollLoop ApproximateLoopSize collectEphemeralValues /* Use collected ephemeral values / computeUnrollCount analyzeLoopUnrollCost / Bail out from the analysis if loop contains CallInst / This patch moves collection of the ephemeral values to the tryToUnrollLoop function and passes the collected values into both ApproximateLoopsize (as before) and additionally starts using them in analyzeLoopUnrollCost: tryToUnrollLoop collectEphemeralValues ApproximateLoopSize(EphValues) / Use EphValues / computeUnrollCount(EphValues) analyzeLoopUnrollCost(EphValues) / Ignore ephemeral values - they don't contribute to the final cost / / Bail out from the analysis if loop contains CallInst */ Reviewers: mzolotukhin, evstupac, sanjoy Reviewed By: evstupac Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43931 llvm-svn: 327617	2018-03-15 09:59:15 +00:00
Philip Reames	0adbb19409	[EarlyCSE] Exploit open ended invariant.start scopes If we have an invariant.start with no corresponding invariant.end, then the memory location becomes invariant indefinitely after the invariant.start. As a result, anything dominated by the start is guaranteed to see the value the memory location had when the invariant.start executed. This patch adds an AvailableInvariants table which tracks the generation a particular memory location became invariant and then uses that information to allow value forwarding that would otherwise be disallowed by potentially aliasing stores. (Reminder: In EarlyCSE everything clobbers everything by default.) This should be compatible with the MemorySSA variant, but design is generational. We can and should add first class support for invariant.start within MemorySSA at a later time. I took a quick look at doing so, but probably need some input from a MemorySSA expert. Differential Revision: https://reviews.llvm.org/D43716 llvm-svn: 327577	2018-03-14 21:35:06 +00:00
Anna Thomas	5ac72f94f3	Test Commit NFC. Updated comment llvm-svn: 327436	2018-03-13 19:38:45 +00:00
Daniel Neilson	41e781d5f1	[SROA] Take advantage of separate alignments for memcpy source and destination Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the SROA pass to cease using the old getAlignment() & setAlignment() APIs of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. This allows us to enhance visitMemTransferInst to be more aggressive setting the alignment in memcpy calls that it creates, as well as to only change the alignment of a memcpy/memmove argument that it replaces. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: chandlerc, bollu, efriedma Reviewed By: efriedma Subscribers: efriedma, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D42974 llvm-svn: 327398	2018-03-13 14:25:33 +00:00
Clement Courbet	9f0b3170bc	[MergeICmps] Make sure that the comparison only has one use. Summary: Fixes PR36557. Reviewers: trentxintong, spatel Subscribers: mstorsjo, llvm-commits Differential Revision: https://reviews.llvm.org/D44083 llvm-svn: 327372	2018-03-13 07:05:55 +00:00
Vedant Kumar	3a408538f0	Remove the LoopInstSimplify pass (-loop-instsimplify) LoopInstSimplify is unused and untested. Reading through the commit history the pass also seems to have a high maintenance burden. It would be best to retire the pass for now. It should be easy to recover if we need something similar in the future. Differential Revision: https://reviews.llvm.org/D44053 llvm-svn: 327329	2018-03-12 20:49:42 +00:00
Craig Topper	3b4ad9c12d	[CallSiteSplitting] Use !Instruction::use_empty instead of checking for a non-zero return from getNumUses getNumUses is a linear operation. It walks a linked list to get a count. So in this case its better to just ask if there are any users rather than how many. llvm-svn: 327314	2018-03-12 18:40:59 +00:00
Justin Lebar	24b6640b1b	Back out "Re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions." This reverts r326908, originally landed as D44102. Reverted for causing performance regressions on x86. (These regressions are not yet understood.) llvm-svn: 327252	2018-03-12 09:26:09 +00:00
Renato Golin	038ede2a16	[NFC] Consolidate six getPointerOperand() utility functions into one place There are six separate instances of getPointerOperand() utility. LoopVectorize.cpp has one of them, and I don't want to create a 7th one while I'm trying to move LoopVectorizationLegality into a separate file (eventual objective is to move it to Analysis tree). See http://lists.llvm.org/pipermail/llvm-dev/2018-February/120999.html for llvm-dev discussions Closes D43323. Patch by Hideki Saito <hideki.saito@intel.com>. llvm-svn: 327173	2018-03-09 21:05:58 +00:00
Chad Rosier	95d9ccb2a0	[JumpThreading] Don't restrict cast-traversal to i1 In r263618, JumpThreading learned to look trough simple cast instructions, but only if the source of those cast instructions was a phi/cmp i1 (in an effort to limit compile time effects). I think this condition is too restrictive. For switches with limited value range, InstCombine will readily introduce an extra trunc instruction to a smaller integer type (e.g. from i8 to i2), leaving us in the somewhat perverse situation that jump-threading would work before running instcombine, but not after. Since instcombine produces this pattern, I think we need to consider it canonical and support it in JumpThreading. In general, for limiting recursion, I think the existing restriction to phi and cmp nodes should be sufficient to avoid looking through unprofitable chains of instructions. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42262 llvm-svn: 327150	2018-03-09 16:43:46 +00:00
Philip Reames	fbffd126b8	[NFC] Factor out a helper function for checking if a block has a potential early implicit exit. llvm-svn: 327065	2018-03-08 21:25:30 +00:00
Justin Lebar	eccfbf1bcd	Re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Backed out for failing an assert in clang bootstrap builds. Re-landing with a fix for handling non-power-of-two inputs (e.g. udiv i24). Original Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 326908	2018-03-07 16:56:49 +00:00
Justin Lebar	eeeb0eb049	Revert rL326898: "Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions." Breaks bootstrap builds: clang built with this patch asserts while building MCDwarf.cpp: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. llvm-svn: 326900	2018-03-07 16:05:43 +00:00
Justin Lebar	cb9e89c39b	Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Reviewers: spatel, sanjoy Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 326898	2018-03-07 15:11:13 +00:00
Evgeny Stupachenko	204ade4102	Add early exit on reassociation of 0 expression. Summary: Before the patch a try to reassociate ((v * 16) * 0) * 1 fall into infinite loop Reviewers: pankajchawla Differential Revision: http://reviews.llvm.org/D41467 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 326861	2018-03-07 02:17:08 +00:00
Sebastian Pop	bf6e1c26cf	DA: remove uses of GEP, only ask SCEV It's been quite some time the Dependence Analysis (DA) is broken, as it uses the GEP representation to "identify" multi-dimensional arrays. It even wrongly detects multi-dimensional arrays in single nested loops: from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6 ;; for (long int i = 0; i < 50; i++) { ;; A[i][3i - 6] = i; ;; B++ = A[i][i]; DA used to detect two subscripts, which makes no sense in the LLVM IR or in C/C++ semantics, as there are no guarantees as in Fortran of subscripts not overlapping into a next array dimension: maximum nesting levels = 1 SrcPtrSCEV = %A DstPtrSCEV = %A using GEPs subscript 0 src = {0,+,1}<nuw><nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} subscript 1 src = {-6,+,3}<nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} Separable = {} Coupled = {1} With the current patch, DA will correctly work on only one dimension: maximum nesting levels = 1 SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body> DstSCEV = {%A,+,404}<%for.body> subscript 0 src = {(-2424 + %A)<nsw>,+,1212}<%for.body> dst = {%A,+,404}<%for.body> class = 1 loops = {1} Separable = {0} Coupled = {} This change removes all uses of GEP from DA, and we now only rely on the SCEV representation. The patch does not turn on -da-delinearize by default, and so the DA analysis will be more conservative in the case of multi-dimensional memory accesses in nested loops. I disabled some interchange tests, as the DA is not able to disambiguate the dependence anymore. To make DA stronger, we may need to compute a bound on the number of iterations based on the access functions and array dimensions. The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to avoid checking for snippets of LLVM IR: this form of checking is very hard to maintain. Instead, we now check for output of the pass that are more meaningful than dozens of lines of LLVM IR. Some tests now require -debug messages and thus only enabled with asserts. Patch written by Sebastian Pop and Aditya Kumar. Differential Revision: https://reviews.llvm.org/D35430 llvm-svn: 326837	2018-03-06 21:55:59 +00:00
Florian Hahn	517dc51c48	[CallSiteSplitting] Do not crash when BB's terminator changes. Change doCallSiteSplitting to iterate until we reach the terminator instruction. tryToSplitCallSite can replace BB's terminator in case BB is a successor of itself. Then IE will be invalidated and we also have to check the current terminator. Reviewers: junbuml, davidxl, davide, fhahn Reviewed By: fhahn, junbuml Differential Revision: https://reviews.llvm.org/D43824 llvm-svn: 326793	2018-03-06 14:00:58 +00:00
Xin Tong	8fd561f572	[MergeICmp] Simplify how BCECmpBlock instructions are blacklisted llvm-svn: 326761	2018-03-06 02:24:02 +00:00
Xin Tong	98af9efca5	[MergeICmp] Fix printing. NFC llvm-svn: 326760	2018-03-06 02:04:57 +00:00
Daniel Neilson	82daad31fe	[RewriteStatepoints] Fix stale parse points Summary: RewriteStatepointsForGC collects parse points for further processing. During the collection if a callsite is found in an unreachable block (DominatorTree::isReachableFromEntry()) then all unreachable blocks are removed by removeUnreachableBlocks(). Some of the removed blocks could have been reachable according to DominatorTree::isReachableFromEntry(). In this case the collected parse points became stale and resulted in a crash when accessed. The fix is to unconditionally canonicalize the IR to removeUnreachableBlocks and then collect the parse points. The added test crashes with the old version and passes with this patch. Patch by Yevgeny Rouban! Reviewed by: Anna Differential Revision: https://reviews.llvm.org/D43929 llvm-svn: 326748	2018-03-05 22:27:30 +00:00
Florian Hahn	0b7c6422fb	[IPSCCP] Add getCompare which returns either true, false, undef or null. getCompare returns true, false or undef constants if the comparison can be evaluated, or nullptr if it cannot. This is in line with what ConstantExpr::getCompare returns. It also allows us to use ConstantExpr::getCompare for comparing constants. Reviewers: davide, mssimpso, dberlin, anna Reviewed By: davide Differential Revision: https://reviews.llvm.org/D43761 llvm-svn: 326720	2018-03-05 17:33:50 +00:00
Sanjay Patel	53ffabdfcb	[CVP] fix formatting; NFC llvm-svn: 326711	2018-03-05 16:08:34 +00:00
Xin Tong	8345c0e3a5	[MergeICmp] We can discard initial blocks that do other work Summary: We can discard initial blocks that do other work We do not need to limit ourselves to just the first block in the chain. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44029 llvm-svn: 326698	2018-03-05 13:54:47 +00:00
Clement Courbet	34be1b0288	[MergeICmps][NFC] Improve logging. llvm-svn: 326683	2018-03-05 08:21:47 +00:00
Fedor Indutny	364b9c2adb	[CallSiteSplitting] fix use after-free Iterating through predecessors of `TailBB` while removing their terminators leads to use after-free, because the predecessor list is changing on each removal. llvm-svn: 326668	2018-03-03 22:34:38 +00:00
Fedor Indutny	f9e09c1dd0	[CallSiteSplitting] properly split musttail calls Summary: `musttail` calls can't be naively splitted. The split blocks must include not only the call instruction itself, but also (optional) `bitcast` and `return` instructions that follow it. Clone `bitcast` and `ret`, place them into the split blocks, and remove the tail block when done. Reviewers: junbuml, mcrosier, davidxl, davide, fhahn Reviewed By: fhahn Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D43729 llvm-svn: 326666	2018-03-03 21:40:14 +00:00
Yaxun Liu	3c42f1c3c9	LoopUnroll: respect pragma unroll when AllowRemainder is disabled Currently when AllowRemainder is disabled, pragma unroll count is not respected even though there is no remainder. This bug causes a loop fully unrolled in many cases even though the user specifies a unroll count. Especially it affects OpenCL/CUDA since in many cases a loop contains convergent instructions and currently AllowRemainder is disabled for such loops. Differential Revision: https://reviews.llvm.org/D43826 llvm-svn: 326585	2018-03-02 16:22:32 +00:00
Benjamin Kramer	d1cf7ff5ab	[SCCP] Fix unused variable warning in release builds. llvm-svn: 326429	2018-03-01 11:31:44 +00:00
Reid Kleckner	3762a089d7	[IPSCCP] do not break musttail invariant (PR36485) Do not replace results of `musttail` calls with a constant if the call itself can't be removed. Do not zap returns of `musttail` callees, if the call site can't be removed and replaced with a constant. Do not zap returns of `musttail`-calling blocks, this breaks invariant too. Patch by Fedor Indutny Differential Revision: https://reviews.llvm.org/D43695 llvm-svn: 326404	2018-03-01 01:19:18 +00:00
Xin Tong	256869d8bc	Fix typo. NFC llvm-svn: 326319	2018-02-28 12:09:53 +00:00
Xin Tong	8ba674e43b	[MergeICmp] Fix a bug in MergeICmp that can lead to a block being processed more than once. Summary: Fix a bug in MergeICmp that can lead to a BCECmp block being processed more than once and eventually lead to a broken LLVM module. The problem is that if the non-constant value is not produced by the last block, the producer will be processed once when the its parent block is processed and second time when the last block is processed. We end up having 2 same BCECmpBlock in the merge queue. And eventually lead to a broken LLVM module. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43825 llvm-svn: 326318	2018-02-28 12:08:00 +00:00
David Green	7c35de124a	[Dominators] Remove verifyDomTree and add some verifying for Post Dom Trees Removes verifyDomTree, using assert(verify()) everywhere instead, and changes verify a little to always run IsSameAsFreshTree first in order to print good output when we find errors. Also adds verifyAnalysis for PostDomTrees, which will allow checking of PostDomTrees it the same way we check DomTrees and MachineDomTrees. Differential Revision: https://reviews.llvm.org/D41298 llvm-svn: 326315	2018-02-28 11:00:08 +00:00
Florian Hahn	1807c516c7	[NewGVN] Update phi-of-ops def block when updating existing ValuePHI. In case we update a ValuePHI node created earlier, we could update it based on a different OpPHI which could be in a different block. We need to update the TempToBlock mapping reflecting the new block, otherwise we would end up placing the new phi node in a wrong block. This problem is exposed by the test case in https://bugs.llvm.org/show_bug.cgi?id=36504. This patch fixes a slightly simpler problem than in the bug report. In the bug's re-producer, the additional problem is that we are re-using a ValuePHI node with to few incoming values for the new OpPHI. If this patch makes sense, I will follow it up with a patch that creates a new PHI node if the existing PHI node has a different number of incoming values. Reviewers: davide, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43770 llvm-svn: 326181	2018-02-27 09:34:51 +00:00
Florian Hahn	a1822cbabc	[LoopInterchange] Loops with empty dependency matrix are safe. The dependency matrix is only empty if no conflicting load/store instructions have been found. In that case, it is safe to interchange. For the LLVM test-suite, after this change around 1900 loops are interchanged, whereas it is 15 before this change. On cortex-a57, this gives an improvement of -0.57% on the geomean execution time of SPEC2006, SPEC2000 and the test-suite. There are a few small perf regressions, but I think we can improve on those by making the cost model better. Reviewers: karthikthecool, mcrosier Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D43236 llvm-svn: 326077	2018-02-26 10:45:25 +00:00
Adam Nemet	e4e1de60aa	Revert "StructurizeCFG: Test for branch divergence correctly" This reverts commit r325881. Breaks many bots llvm-svn: 326037	2018-02-24 17:29:09 +00:00
Nicolai Haehnle	43c1115cd4	StructurizeCFG: Test for branch divergence correctly Summary: This fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Reviewers: arsenm, rampitec, jlebar Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D40546 llvm-svn: 325881	2018-02-23 10:45:46 +00:00
Bjorn Steinbrink	983d6c3f18	Mark MergedLoadStoreMotion as not preserving MemDep results Summary: MemDep caches results that signify that a dependence is non-local, and there is currently no way to invalidate such cache entries. Unfortunately, when MLSM sinks a store that can result in a non-local dependence becoming a local one, and then MemDep gives wrong answers. The easiest way out here is to just say that MLSM does indeed not preserve MemDep results. Reviewers: davide, Gerolf Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43177 llvm-svn: 325880	2018-02-23 10:41:57 +00:00
Daniel Neilson	20c9207be3	[AlignmentFromAssumptions] Set source and dest alignments of memory intrinsiscs separately Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment API of MemoryIntrinsic in favour of getting/setting source & dest specific alignments through the new API. This allows us to simplify some of the code in this pass and also be more aggressive about setting the source and destination alignments separately. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: hfinkel, bollu, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D43081 llvm-svn: 325816	2018-02-22 18:55:59 +00:00
Vedant Kumar	56492f9177	[BDCE] Salvage debug info from dying insts This results in 15 additional unique source variables in a stage2 build of FileCheck (at '-Os -g'), with a negligible increase in the size of the .debug_loc section. llvm-svn: 325660	2018-02-21 01:55:33 +00:00
Sanjoy Das	737fa40ffa	[DSE] Don't DSE stores that subsequent memmove calls read from Summary: We used to remove the first memmove in cases like this: memmove(p, p+2, 8); memmove(p, p+2, 8); which is incorrect. Fix this by changing isPossibleSelfRead to what was most likely the intended behavior. Historical note: the buggy code was added in https://reviews.llvm.org/rL120974 to address PR8728. Reviewers: rsmith Subscribers: mcrosier, llvm-commits, jlebar Differential Revision: https://reviews.llvm.org/D43425 llvm-svn: 325641	2018-02-20 23:19:34 +00:00
Brian M. Rzycki	f1a7df5ef2	[JumpThreading] PR36133 enable/disable DominatorTree for LVI analysis Summary: The LazyValueInfo pass caches a copy of the DominatorTree when available. Whenever there are pending DominatorTree updates within JumpThreading's DeferredDominance object we cannot use the cached DT for LVI analysis. This commit adds the new methods enableDT() and disableDT() to LVI. JumpThreading also sets the appropriate usage model before calling LVI analysis methods. Fixes https://bugs.llvm.org/show_bug.cgi?id=36133 Reviewers: sebpop, dberlin, kuhar Reviewed by: sebpop, kuhar Subscribers: uabelho, llvm-commits, aprantl, hiraditya, a.elovikov Differential Revision: https://reviews.llvm.org/D42717 llvm-svn: 325356	2018-02-16 16:35:17 +00:00
Ivan A. Kosarev	53270d0fa6	[Transforms] Propagate TBAA info in SROA Now that we have the new TBAA metadata format that is capable of representing accesses to aggregates, we can propagate TBAA access tags from memory setting and transferring intrinsics to load and store instructions and vice versa. Since SROA produces lots of new loads and stores on optimized builds, this change significantly decreases the share of undecorated memory accesses on such builds. Differential Revision: https://reviews.llvm.org/D41563 llvm-svn: 325329	2018-02-16 10:10:29 +00:00
Vedant Kumar	616fdb00df	[GVN] Partially revert debug info salvage change (r325063) In r325063, we salvaged debug values from dying instructions in GVN::processBlock() and GVN::performScalarPRE(). The change in performScalarPRE(), while correct, is unhelpful. It introduced a call to salvageDebugInfo() which was immediately followed by a RAUW, meaning it prevented the RAUW from efficiently updating dbg.value intrinsics. This commit reverts the mistake and tightens up the affected test case. llvm-svn: 325308	2018-02-16 01:15:20 +00:00
Vedant Kumar	1df820ecd7	[DCE] Salvage debug info from dead insts This results in small increases in the size of the .debug_loc section and the number of unique source variables in a stage2 build of opt. llvm-svn: 325301	2018-02-15 22:26:18 +00:00
David Green	0d5f9651f2	Move llvm::computeLoopSafetyInfo from LICM.cpp to LoopUtils.cpp. NFC Move computeLoopSafetyInfo, defined in Transforms/Utils/LoopUtils.h, into the corresponding LoopUtils.cpp, as opposed to LICM where it resides at the moment. This will allow other functions from Transforms/Utils to reference it. llvm-svn: 325151	2018-02-14 18:34:53 +00:00
Florian Hahn	b4e3bad89b	Recommit r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call. For basic blocks with instructions between the beginning of the block and a call we have to duplicate the instructions before the call in all split blocks and add PHI nodes for uses of the duplicated instructions after the call. Currently, the threshold for the number of instructions before a call is quite low, to keep the impact on binary size low. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41860 llvm-svn: 325126	2018-02-14 13:59:12 +00:00
Florian Hahn	c6296fea3f	[LoopInterchange] Incrementally update the dominator tree. We can use incremental dominator tree updates to avoid re-calculating the dominator tree after interchanging 2 loops. Reviewers: dmgreen, kuhar Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D43176 llvm-svn: 325122	2018-02-14 13:13:15 +00:00
Elena Demikhovsky	945b7e5aa6	Adding a width of the GEP index to the Data Layout. Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout. p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional, if not specified, it is equal to the pointer size. Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width. It works fine if you can convert pointer to integer for address calculation and all registered targets do this. But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size. Differential Revision: https://reviews.llvm.org/D42123 llvm-svn: 325102	2018-02-14 06:58:08 +00:00
Vedant Kumar	1d5d31b706	[GVN] Salvage debug info from dead insts This preserves an additional 581 unique source variables in a stage2 build of clang (according to `llvm-dwarfdump --statistics`). It increases the size of the .debug_loc section by 0.1% (or 87139 bytes). Differential Revision: https://reviews.llvm.org/D43255 llvm-svn: 325063	2018-02-13 22:27:17 +00:00

... 4 5 6 7 8 ...

9277 Commits