llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	c9f9be0381	[SCEV] Verify integrity of ValuesAtScopes and users (NFC) Make sure that ValuesAtScopes and ValuesAtScopesUsers are consistent during SCEV verification.	2021-11-30 21:08:40 +01:00
Nikita Popov	37d72991c1	[SCEV] Track and invalidate ValuesAtScopes users ValuesAtScopes maps a SCEV and a Loop to another SCEV. While we invalidate entries if the left-hand SCEV is invalidated, we currently don't do this for the right-hand SCEV. Fix this by tracking users in a reverse map and using it for invalidation. This is conceptually the same change as D114738, but using the reverse map to avoid performance issues. Differential Revision: https://reviews.llvm.org/D114788	2021-11-30 18:21:14 +01:00
Nikita Popov	77dd579827	[SCEV] Remove incorrect assert Fix assertion failure reported on D113349 by removing the assert. While the produced expression should be equivalent, it may not be strictly the same, e.g. due to lazy nowrap flag updates. Similar to what the main createSCEV() code does, simply retain the old value map entry if one already exists.	2021-11-29 17:09:12 +01:00
Florian Hahn	7b75110fac	[SCEV] Turn validity check in getExistingSCEV into assert (NFC). Now that we track users of SCEV expressions, we should be able to always invalidate containing expressions. With that, I think the case where a value gets removed but SCEVs containing references to it should not be possible any longer. Turn check into an assert. This slightly reduces compile-time: NewPM-O3: -0.27% NewPM-ReleaseThinLTO: -0.21% NewPM-ReleaseLTO-g: -0.26% http://llvm-compile-time-tracker.com/compare.php?from=c3dc6b081da6ba503e67d260033f81f61eb38ea3&to=95a4a028b1f1dd0bc3d221435953b7d2c031b3d5&stat=instructions Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D114633	2021-11-28 12:16:55 +00:00
Nikita Popov	f492a414ba	[SCEV] Simplify forgetSymbolicName() (NFCI) With the recently introduced tracking as well as D113349, we can greatly simplify forgetSymbolicName(). In fact, we can simply replace it with forgetMemoizedResults(). What forgetSymbolicName() used to do is to walk the IR use-def chain to find all SCEVs that mention the SymbolicName. However, thanks to use tracking, we can now determine the relevant SCEVs in a more direct way. D113349 is needed to also clear out the actual IR to SCEV mapping in ValueExprMap. Differential Revision: https://reviews.llvm.org/D114263	2021-11-27 16:42:38 +01:00
Nikita Popov	c2550e3427	[SCEV] Simplify invalidation after BE count calculation (NFCI) After backedge taken counts have been calculated, we want to invalidate all addrecs and dependent expressions in the loop, because we might compute better results with the newly available backedge taken counts. Previously this was done with a forgetLoop() style use-def walk. With recent improvements to SCEV invalidation, we can instead directly invalidate any SCEVs using addrecs in this loop. This requires a great deal less subtlety to avoid invalidating more than necessary, and in particular gets rid of the hack from D113349. The change is similar to D114263 in spirit.	2021-11-27 16:35:06 +01:00
Nikita Popov	2b160e95c8	Reland [SCEV] Fix and validate ValueExprMap/ExprValueMap consistency Relative to the previous landing attempt, this introduces an additional flag on forgetMemoizedResults() to not remove SCEVUnknown phis from the value map. The invalidation after BECount calculation wants to leave these alone and skips them in its own use-def walk, but we can still end up invalidating them via forgetMemoizedResults() if there is another IR value with the same SCEV. This is intended as a temporary workaround only, and the need for this should go away once the getBackedgeTakenInfo() invalidation is refactored in the spirit of D114263. ----- This adds validation for consistency of ValueExprMap and ExprValueMap, and fixes identified issues: * Addrec construction directly wrote to ValueExprMap in a few places, without updating ExprValueMap. Add a helper to ensures they stay consistent. The adjustment in forgetSymbolicName() explicitly drops the old value from the map, so that we don't rely on it being overwritten. * forgetMemoizedResultsImpl() was dropping the SCEV from ExprValueMap, but not dropping the corresponding entries from ValueExprMap. Differential Revision: https://reviews.llvm.org/D113349	2021-11-27 12:37:15 +01:00
Nikita Popov	719354a571	Revert "[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency" This reverts commit `bee8dcda1f`. Some sanitizer buildbots fail with: > Attempt to use a SCEVCouldNotCompute object! For example: https://lab.llvm.org/buildbot/#/builders/85/builds/7020/steps/9/logs/stdio	2021-11-26 22:18:23 +01:00
Nikita Popov	bee8dcda1f	[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency Relative to the previous landing attempt, this makes insertValueToMap() resilient against the value already being present in the map -- previously I only checked this for the createSimpleAffineAddRec() case, but the same issue can also occur for the general createNodeForPHI(). In both cases, the addrec may be constructed and added to the map in a recursive query trying to create said addrec. In this case, this happens due to the invalidation when the BE count is computed, which ends up clearing out the symbolic name as well. ----- This adds validation for consistency of ValueExprMap and ExprValueMap, and fixes identified issues: * Addrec construction directly wrote to ValueExprMap in a few places, without updating ExprValueMap. Add a helper to ensures they stay consistent. The adjustment in forgetSymbolicName() explicitly drops the old value from the map, so that we don't rely on it being overwritten. * forgetMemoizedResultsImpl() was dropping the SCEV from ExprValueMap, but not dropping the corresponding entries from ValueExprMap. Differential Revision: https://reviews.llvm.org/D113349	2021-11-26 20:57:47 +01:00
Florian Hahn	b927aa69bf	[SCEV] Turn check in createSimpleAffineAddRec to assertion. (NFC) Accum is guaranteed to be defined outside L (via Loop::isLoopInvariant checks above). I think that should guarantee that the more powerful ScalarEvolution::isLoopInvariant also determines that the value is loop invariant. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D114634	2021-11-26 13:23:48 +00:00
Zarko Todorovski	95875d246a	[LLVM][NFC]Inclusive language: remove occurances of sanity check/test from llvm Part of work to use more inclusive language in clang/llvm. Rewording some comments and change function and variable names.	2021-11-24 17:29:55 -05:00
Nikita Popov	62e9acad0a	Revert "[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency" This reverts commit `d633db8f9d`. Causes bootstrap assertion failures: https://lab.llvm.org/buildbot/#/builders/168/builds/3459/steps/9/logs/stdio	2021-11-22 15:47:33 +01:00
Nikita Popov	d633db8f9d	[SCEV] Fix and validate ValueExprMap/ExprValueMap consistency This adds validation for consistency of ValueExprMap and ExprValueMap, and fixes identified issues: * Addrec construction directly wrote to ValueExprMap in a few places, without updating ExprValueMap. Add a helper to ensures they stay consistent. The adjustment in forgetSymbolicName() explicitly drops the old value from the map, so that we don't rely on it being overwritten. * forgetMemoizedResultsImpl() was dropping the SCEV from ExprValueMap, but not dropping the corresponding entries from ValueExprMap. Differential Revision: https://reviews.llvm.org/D113349	2021-11-22 15:27:25 +01:00
Kazu Hirata	f6bce30cf9	[llvm] Use range-based for loops (NFC)	2021-11-20 18:42:10 -08:00
Philip Reames	28000587e1	[SCEV] Revert two speculative compile time optimizations which made no difference Revert "[SCEV] Defer all work from `ea12c2cb` as late as possible" Revert "[SCEV] Defer loop property checks from `ea12c2cb` as late as possible" This reverts commit `734abbad79` and `1a5666acb2`. Both of these changes were speculative attempts to address a compile time regression. Neither worked, and both complicated the code in undesirable ways.	2021-11-19 08:45:56 -08:00
Philip Reames	734abbad79	[SCEV] Defer all work from `ea12c2cb` as late as possible This is a second speculative compile time optimization to address a reported regression. My actual suspicion is that availability of no-self-wrap is making some other bit of code trigger, but let's rule this out.	2021-11-18 17:19:52 -08:00
Philip Reames	1a5666acb2	[SCEV] Defer loop property checks from `ea12c2cb` as late as possible This is a speculative compile time optimization to address a reported regression. It's the only thing which vaguely makes sense.	2021-11-18 13:47:45 -08:00
Philip Reames	ea12c2cb9c	[SCEV] Move mustprogress based no-self-wrap logic so it applies to all exit conditions This change moves logic which we'd added specifically for less than tests so that it applies to equalities and greater than tests as well. The basic idea is that if we can show an IV cycles infinitely through the same series on self-wrap, and that the exit condition must be taken to prevent UB, we can conclude that it must be taken before self-wrap and thus infer said flag. The motivation here is simple loops with unsigned induction variables w/non-one steps and inequality tests. A toy example would be: for (unsigned i = 0; i != N; i += 2) { body; } If body contains no side effects, and this is a mustprogress function, we can assume that this must be a finite loop and thus that the exit count is N/2. Differential Revision: https://reviews.llvm.org/D103991	2021-11-18 10:07:44 -08:00
Florian Hahn	da9f2ba3b1	[SCEV] Reorder operands checks in collectConditions. The initial two cases require a SCEVConstant as RHS. Pull up the condition to check and swap SCEVConstants from below. Also remove a redundant check & swap if RHS is SCEVUnknown.	2021-11-18 09:36:16 +00:00
Philip Reames	ad69402f3e	[SCEVAA] Avoid forming malformed pointer diff expressions This solves the same crash as in D104503, but with a different approach. The test case test_non_dom demonstrates a case where scev-aa crashes today. (If exercised either by -eval-aa or -licm.) The basic problem is that SCEV-AA expects to be able to compute a pointer difference between two SCEVs for any two pair of pointers we do an alias query on. For (valid, but out of scope) reasons, we can end up asking whether expressions in different sub-loops can alias each other. This results in a subtraction expression being formed where neither operand dominates the other. The approach this patch takes is to leverage the "defining scope" notion we introduced for flag semantics to detect and disallow the formation of the problematic SCEV. This ends up being relatively straight forward on that new infrastructure. This change does hint that we should probably be verifying a similar property for all SCEVs somewhere, but I'll leave that to a follow on change. Differential Revision: D114112	2021-11-17 12:38:04 -08:00
Florian Hahn	e8b55cf7b7	[SCEV] Apply loop guards when computing max BTC for arbitrary steps. Similar other cases in the current function (e.g. when the step is 1 or -1), applying loop guards can lead to tighter upper bounds for the backedge-taken counts. Fixes PR52464. Reviewed By: reames, nikic Differential Revision: https://reviews.llvm.org/D113578	2021-11-17 11:00:49 +00:00
Philip Reames	8d85e945b2	[SCEV] Canonicalize X - urem X, Y patterns There are multiple possible ways to represent the X - urem X, Y pattern. SCEV was not canonicalizing, and thus, depending on which you analyzed, you could get different results. The sub representation appears to produce strictly inferior results in practice, so I decided to canonicalize to the Y * X/Y version. The motivation here is that runtime unroll produces the sub X - (and X, Y-1) pattern when Y is a power of two. SCEV is thus unable to recognize that an unrolled loop exits because we don't figure out that the new unrolled step evenly divides the trip count of the unrolled loop. After instcombine runs, we convert the the andn form which SCEV recognizes, so essentially, this is just fixing a nasty pass ordering dependency. The ARM loop hardware interaction in the test diff is opague to me, but the comments in the review from others knowledge of the infrastructure appear to indicate these are improvements in loop recognition, not regressions. Differential Revision: https://reviews.llvm.org/D114018	2021-11-16 11:59:21 -08:00
Florian Hahn	b7aec4f08e	[SCEV] Support rewriting ZExt expressions with loop guard info. So far, applying loop guard information has been restricted to SCEVUnknown. In a few cases, like PR40961 and PR52464, this leads to SCEV failing to determine tight upper bounds for the backedge taken count. This patch adjusts SCEVLoopGuardRewriter and applyLoopGuards to support re-writing ZExt expressions. This is a first step towards fixing PR40961 and PR52464. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D113577	2021-11-16 11:16:07 +00:00
Florian Hahn	8ed8d37088	[SCEV] Update SCEVLoopGuardRewriter to hold reference to map. (NFC) SCEVLoopGuardRewriter doesn't need to copy the rewrite map. It can just hold a const reference instead, to avoid an unnecessary copy.	2021-11-13 09:39:14 +00:00
Florian Hahn	03cfea68c6	[SCEV] Update SCEVLoopGuardRewriter to take SCEV -> SCEV map (NFC). Split off refactoring from D113577 to reduce the diff. NFC as the new interface will only be used in D113577.	2021-11-12 18:16:03 +00:00
Florian Hahn	819bca9b90	[SCEV] Use APIntOps::umin to select best max BC count (NFC). Suggested in D102267, but I missed this in the committed version.	2021-11-12 12:20:01 +00:00
Chris Jackson	116dc70cf3	[DebugInfo][LSR] Add more stringent checks on IV selection and salvage attempts Prevent the selection of IVs that have a SCEV containing an undef. Also prevent salvaging attempts for values for which a SCEV could not be created by ScalarEvolution and have only SCEVUknown. Reviewed by: Orlando Differential Revision: https://reviews.llvm.org/D111810	2021-11-09 13:09:37 +00:00
Kazu Hirata	843d1eda18	[llvm] Use llvm::reverse (NFC)	2021-11-06 19:31:18 -07:00
Philip Reames	d24a0e8857	[SCEV] Use constant range of RHS to prove NUW on narrow IV in trip count logic The basic idea here is that given a zero extended narrow IV, we can prove the inner IV to be NUW if we can prove there's a value the inner IV must take before overflow which must exit the loop. Differential Revision: https://reviews.llvm.org/D109457	2021-11-05 15:36:47 -07:00
Liren Peng	57e093162e	[ScalarEvolution] Infer loop max trip count from array accesses Data references in a loop should not access elements over the statically allocated size. So we can infer a loop max trip count from this undefined behavior. Reviewed By: reames, mkazantsev, nikic Differential Revision: https://reviews.llvm.org/D109821	2021-11-03 10:40:18 +08:00
Nikita Popov	4972d12185	[SCEV] Only add direct loop users (NFC) It it now sufficient to track only direct addrec users of a loop, and let the SCEVUsers mechanism track and invalidate transitive users. Differential Revision: https://reviews.llvm.org/D112875	2021-11-01 18:49:43 +01:00
Max Kazantsev	e512c5b166	[SCEV][NFC] Factor out common API for getting unique operands of a SCEV This function is used at least in 2 places, to it makes sense to make it separate. Differential Revision: https://reviews.llvm.org/D112516 Reviewed By: reames	2021-11-01 11:36:47 +07:00
Max Kazantsev	513914e1f3	[SCEV] Invalidate user SCEVs along with operand SCEVs to avoid cache corruption Following discussion in D110390, it seems that we are suffering from unability to traverse users of a SCEV being invalidated. The result of that is that ScalarEvolution's inner caches may store obsolete data about SCEVs even if their operands are forgotten. It creates problems when we try to verify the contents of those caches. It's also a frequent situation when messing with cache causes very sneaky and hard-to-analyze bugs related to corruption of memory when dealing with cached data. They are lurking there because ScalarEvolution's veirfication is not powerful enough and misses many problematic cases. I plan to make SCEV's verification much stricter in follow-ups, and this requires dangling-pointers-free caches. This patch makes sure that, whenever we forget cached information for a SCEV, we also forget it for all SCEVs that (transitively) use it. This may have negative compile time impact. It's a sacrifice we are more than willing to make to enforce correctness. We can also save some time by reworking invokers of forgetMemoizedResults (maybe we can forget multiple SCEVs with single query). Differential Revision: https://reviews.llvm.org/D111533 Reviewed By: reames	2021-10-28 09:39:24 +07:00
Max Kazantsev	5961f0308f	[SCEV][NFC] Verify intergity of SCEVUsers Make sure that, for every living SCEV, we have all its direct operand tracking it as their user. Differential Revision: https://reviews.llvm.org/D112402 Reviewed By: reames	2021-10-27 09:54:49 +07:00
Nikita Popov	3a995c918e	[SCEV] Move SCEVLostPoisonFlags() check into SCEVExpander Always insert values into ExprValueMap, and instead skip using them in SCEVExpander if poison-generating flags have been lost. This ensures that all values that are in ValueExprMap are also in ExprValueMap, so we can use the latter to invalidate the former. This change is probably not entirely NFC for the case where originally the SCEV had no nowrap flags but they were inferred later, in which case that would now allow reusing the existing value for expansion. Differential Revision: https://reviews.llvm.org/D112389	2021-10-25 22:37:20 +02:00
Kazu Hirata	3729a5abf4	[SCEV] Fix a warning on an unused lambda capture This patch fixes: llvm/lib/Analysis/ScalarEvolution.cpp:12770:37: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]	2021-10-25 00:45:18 -07:00
Max Kazantsev	f8623b0783	[SCEV][NFC] Win some compile time from mass forgetMemoizedResults Mass forgetMemoizedResults can be done more efficiently than bunch of individual invocations of helper because we can traverse maps being updated just once, rather than doing this for each invidivual SCEV. Should be NFC and supposedly improves compile time. Differential Revision: https://reviews.llvm.org/D112294 Reviewed By: reames	2021-10-25 14:09:41 +07:00
Max Kazantsev	dbab339ea4	[SCEV][NFC] Apply mass forgetMemoizedResults queries where possible When forgetting multiple SCEVs, rather than doing this one by one, we can instead use mass updates. We plan to make them more efficient than they are now, potentially improving compile time. Differential Revision: https://reviews.llvm.org/D111602 Reviewed By: reames	2021-10-25 13:50:49 +07:00
Max Kazantsev	a6096b7f9e	[SCEV][NFC] Introduce API for mass forgetMemoizedResults query This patch changes signature of forgetMemoizedResults to be able to work with multiple SCEVs. Usage will come in follow-ups. We also plan to optimize it in the future to work faster than individual invalidation updates. Should not change behavior in any sense. Split-off from D111602. Differential Revision: https://reviews.llvm.org/D112293 Reviewed By: reames	2021-10-25 13:49:31 +07:00
Max Kazantsev	1c18ebb2cc	[NFC][SCEV] Do not track users of SCEVConstants Follow-up from D112295, suggested by Nikita: we can avoid tracking users of SCEVConstants because dropping their cached info is unlikely to give any new prospects for fact inference, and it should not introduce any correctness problems.	2021-10-25 12:30:46 +07:00
Max Kazantsev	fea4a48c0b	[SCEV][NFC] API for tracking of SCEV users This patch introduces API that keeps track of SCEVs users of another SCEVs, required to handle invalidations of users along with operands that comes in follow-up patches. Differential Revision: https://reviews.llvm.org/D112295 Reviewed By: reames	2021-10-25 12:14:18 +07:00
Nikita Popov	4f5e9a2bb2	[SCEV] Remove computeLoadConstantCompareExitLimit() (NFCI) The functionality of this method is already covered by computeExitCountExhaustively() in a more general fashion. It was added at a time when exhaustive exit count calculation did not support constant folding loads yet. I double checked that dropping this code causes no binary changes in test-suite. Differential Revision: https://reviews.llvm.org/D112343	2021-10-23 15:34:25 +02:00
Bjorn Pettersson	9c44a0996c	[SCEV] Fix formatting error introduced by D112080 Accidentally pushed D112080 without this clang-format cleanup.	2021-10-19 21:44:07 +02:00
Bjorn Pettersson	08619006a0	[SCEV] Avoid compile time explosion in ScalarEvolution::isImpliedCond As seen in PR51869 the ScalarEvolution::isImpliedCond function might end up spending lots of time when doing the isKnownPredicate checks. Calling isKnownPredicate for example result in isKnownViaInduction being called, which might result in isLoopBackedgeGuardedByCond being called, and then we might get one or more new calls to isImpliedCond. Even if the scenario described here isn't an infinite loop, using some random generated C programs as input indicates that those isKnownPredicate checks quite often returns true. On the other hand, the third condition that needs to be fulfilled in order to "prove implications via truncation", i.e. the isImpliedCondBalancedTypes check, is rarely fulfilled. I also made some similar experiments to look at how often we would get the same result when using isKnownViaNonRecursiveReasoning instead of isKnownPredicate. So far I haven't seen a single case when codegen is negatively impacted by using isKnownViaNonRecursiveReasoning. On the other hand, it seems like we get rid of the compile time explosion seen in PR51869 that way. Hence this patch. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D112080	2021-10-19 21:37:57 +02:00
Max Kazantsev	90ae538cab	[SCEV] Prove implication of predicates to their sign-flipped counterparts This patch teaches SCEV two implication rules: x <u y && y >=s 0 --> x <s y, x <s y && y <s 0 --> x <u y. And all equivalents with signs/parts swapped. Differential Revision: https://reviews.llvm.org/D110517 Reviewed By: nikic	2021-10-15 11:49:18 +07:00
Max Kazantsev	1202d280c6	[SCEV][NFC] Reduce memory footprint & compile time via DFS refactoring Current implementations of DFS in SCEV check unique-visited of traversed values on pop, and not on push. As result, the same value may be pushed multiple times just to be thrown away when popped. These operations are meaningless and only waste time and increase memory footprint of the worklist. This patch reworks the DFS strategy to check uniqueness before push. Should be NFC. Differential Revision: https://reviews.llvm.org/D111774 Reviewed By: nikic, reames	2021-10-15 10:19:15 +07:00
Max Kazantsev	6e1308bc10	[SCEV][NFC] Simplify check with CI->isZero() exit condition Replace check with if ((ExitIfTrue && CI->isZero()) \|\| (!ExitIfTrue && CI->isOne())) with equivalent and simpler version if (ExitIfTrue == CI->isZero())	2021-10-14 14:06:52 +07:00
Max Kazantsev	46a1dd47e6	[SCEV][NFC] Reorder checks to delay call of all_of Check lightweight getter condition before calling all_of.	2021-10-14 13:30:51 +07:00
Philip Reames	7f55209cee	[SCEV] Extend trip count to avoid overflow by default As a brief reminder, an "exit count" is the number of times the backedge executes before some event. It can be zero if we exit before the backedge is reached. A "trip count" is the number of times the loop header is entered if we branch into the loop. In general, TC = BTC + 1 and thus a zero trip count is ill defined There is a cornercases which we don't handle well. Let's assume i8 for our examples to keep things simple. If BTC = 255, then the correct trip count is 256. However, 256 is not representable in i8. In theory, code which needs to reason about trip counts is responsible for checking for this cornercase, and either bailing out, or handling it correctly. Historically, we don't have a great track record about actually doing so. When reviewing D109676, I found myself asking a basic question. Was there any good reason to preserve the current wrap-to-zero behavior when converting from backedge taken counts to trip counts? After reviewing existing code, I could not find a single case which appears to correctly and precisely handle the overflow case. This patch changes the default behavior to extend instead of wrap. That is, if the result might be 256, we return a value of i9 type to ensure we interpret the count correctly. I did leave the legacy behavior as an option since a) loop-flatten stops triggering if I extend due to weirdly specific pattern matching I didn't understand and b) we could reasonably use the mode if we'd externally established a lack of overflow. I want to emphasize that this change is not NFC. There are two call sites (one in ScalarEvolution.cpp, one in LoopCacheAnalysis.cpp) which are switched to the extend semantics. The former appears imprecise (but correct) for a constant 255 BTC. The later appears incorrect, though I don't have a test case. Differential Revision: https://reviews.llvm.org/D110587	2021-10-11 09:55:55 -07:00
Philip Reames	d694dd0f0d	Add iterator range variants of isGuaranteedToTransferExecutionToSuccessor [mostly-nfc] This factors out utilities for scanning a bounded block of instructions since we have this code repeated in a bunch of places. The change to InlineFunction isn't strictly NFC as the limit mechanism there didn't handle debug instructions correctly.	2021-10-08 09:50:10 -07:00
Philip Reames	1183d65b4d	[SCEV] Search operand tree for scope bound when inferring flags from IR When checking to see if we can apply IR flags to a SCEV, we need to identify a bound on the defining scope of the SCEV to be produced. We'd previously added support for a couple SCEVExpr types which trivially imply bounds, but hadn't handled types such as umax where the bounds come from the bounds of the operands. This does the obvious thing, and recurses through operands searching for a tighter bound on the defining scope. I'm honestly surprised by how little this seems to mater on existing tests, but it's worth doing for completeness sake alone. Differential Revision: https://reviews.llvm.org/D111191	2021-10-06 15:10:02 -07:00
Nikita Popov	17c20a6dfb	[SCEV] Avoid unnecessary domination checks (NFC) When determining the defining scope, avoid repeatedly querying dominationg against the function entry instruction. This ends up begin a very common case that we can handle more efficiently.	2021-10-06 22:14:04 +02:00
Philip Reames	a7ae227baf	[scev] minor style improvement [nfc]	2021-10-06 12:15:16 -07:00
Philip Reames	0658bab870	[SCEV] Infer flags from add/gep in any block This patch removes a compile time restriction from isSCEVExprNeverPoison. We've strengthened our ability to reason about flags on scopes other than addrecs, and this bailout prevents us from using it. The comment is also suspect as well in that we're in the middle of constructing a SCEV for I. As such, we're going to visit all operands anyways. Differential Revision: https://reviews.llvm.org/D111186	2021-10-06 11:11:54 -07:00
Nikita Popov	0be9940ef2	[SCEV] Don't check if propagation safe if there are no flags (NFC) If there are no nowrap flags, then we don't need to determine whether propagating flags is safe -- it will make no difference.	2021-10-05 22:25:41 +02:00
Philip Reames	c608b49d67	[SCEV] Tweak the algorithm for figuring out if flags must apply to a SCEV [mostly-NFC] Behavior wise, this patch should be mostly NFC. The only behavior difference known is that on the isSCEVExprNeverPoison path we'll consider a bound imposed by the SCEVable operands (if any). Algorithmically, it's an invert of the existing code. Previously, we checked for each operand if we could find a bound, then checked for must-execute given that bound. With the patch, we use dominance to refine the innermost bound, then check must execute once. The interesting case is when we have multiple unknowns within a single basic block. While both dominance and must-execute are worst-case linear walks within the block, only dominance is cached. As such, refining based on dominance should be more efficient.	2021-10-05 11:20:48 -07:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Philip Reames	5f7a535330	[SCEV] Cap the number of instructions scanned when infering flags This addresses a comment from review on D109845. The concern was raised that an unbounded scan would be expensive. Long term plan is to cache this search - likely reusing the existing mechanism for loop side effects - but let's be simple and conservative for now.	2021-10-03 16:14:06 -07:00
Philip Reames	35ab211c37	[SCEV] Use trivial bound on defining scope of all SCEVs when computing flags This addresses a comment from review on D109845. Even for SCEVs which we can't find true bounds without recursing through operands, entry to the function forms a trivial upper bound. In some cases, this trivial bound is enough to prove safety of flag inference.	2021-10-03 16:01:30 -07:00
Philip Reames	d02db32644	[SCEV] Use full logic when infering flags on add and gep This is a followon to D109845. With that landed, we will have fixed all known instances of pr51817, and can thus start inferring flags more aggressively with greatly reduced risk of miscompiles. This patch simply applies the same inference logic used in that patch to our other major flag inference path. We can still do much better here (on both paths), but this is our first step. Differential Revision: https://reviews.llvm.org/D111003	2021-10-03 15:32:15 -07:00
Philip Reames	f39978b84f	[SCEV] Correctly propagate nowrap flags across scopes when folding invariant add through addrec This fixes a violation of the wrap flag rules introduced in `c4048d8f`. This is an alternate fix to D106852. The basic problem being fixed is that we infer a set of flags which is valid at some inner scope S1 (usually by correctly propagating them from IR), and then (incorrectly) extend them to a SCEV in scope S2 where S1 != S2. This is not in general safe per the wrap flags semantics recently defined. In this patch, I include a simple inference step to handle the case where we can prove that S2 is the preheader of the loop S1, and that entry into S2 implies execution of S1. See the code for a more detailed explanation. One worry I have with this patch is that I might be over-fitting what shows up in tests - and thus hiding negative impact we'd see in the real world. My best defense is that the rule used here very closely follows the one used to propagate the flags from IR to the inner add to start with, and thus if one is reasonable, so probably is the other. Curious what others think about that piece. The test diffs are roughly as expected. Mostly analysis only, with two transform changes. Oddly, the result looks better in the loop-idiom test, and I don't understand the PPC output enough to have tell. Nothing terrible looking though. (For context, without the scope inference peephole, the test delta includes a couple of vectorization tests. Again, not super concerning, but slightly more so.) Differential Revision: https://reviews.llvm.org/D109845	2021-10-03 15:19:33 -07:00
Philip Reames	26223af256	[SCEV] Split isSCEVExprNeverPoison reasoning explicitly into scope and mustexecute parts [NFC] Inspired by the needs to D111001 and D109845. The seperation of concerns also amakes it easier to reason about correctness and completeness.	2021-10-02 13:10:38 -07:00
Philip Reames	2ca8a3f213	[SCEV] Stop blindly propagating flags from inbound geps to SCEV nodes This fixes a violation of the wrap flag rules introduced in `c4048d8f`. This was also noted in the (very old) PR23527. The issue being fixed is that we assume the inbound flag on any GEP assumes that all users of any gep (or add) which happens to map to that SCEV would also be UB if the (other) gep overflowed. That's simply not true. In terms of the test diffs, I don't see anything seriously problematic. The lost flags are expected (given the semantic restriction on when its legal to tag the SCEV), and there are several cases where the previously inferred flags are unsound per the new semantics. The only common trend I noticed when looking at the deltas is that by not considering branch on poison as immediate UB in ValueTracking, we do miss a few cases we could reclaim. We may be able to claw some of these back with the follow ideas mentioned in PR51817. It's worth noting that most of the changes are analysis result only changes. The two transform changes are pretty minimal. In one case, we miss the opportunity to infer a nuw (correctly). In the other, we fail to fold an exit and produce a loop invariant form instead. This one is probably over-reduced as the program appears to be undefined in practice, and neither before or after exploits that. Differential Revision: https://reviews.llvm.org/D109789	2021-10-01 16:30:44 -07:00
Philip Reames	24cde2f602	[SCEV] Remove invariant requirement from isSCEVExprNeverPoison This code is attempting to prove that I must execute if we enter the defining scope of the SCEV which will be created from I. In the case where it found a defining addrec scope, it had a rather odd restriction that all of the other operands must be loop invariant in that addrec's loop. As near as I can tell here, we really only need a upper bound on the defining scope. If we can prove the stronger property, then we must also have proven the property on the exact defining scope as well. In practice, the actual effect of this change is narrow. The compile time restriction at the top of the routine basically limits us to I being an arithmetic in some loop L with both an addrec operand in L, and a unknown operands in L. Possible to demonstrate, but the main value of the change is removing unneeded code. Differential Revision: https://reviews.llvm.org/D110892	2021-10-01 15:57:37 -07:00
Philip Reames	c5e491e6ee	[SCEV] Modernize code style of isSCEVExprNeverPoison [NFC] Use for-range and all_of to make code easier to read in advance of other changes.	2021-09-30 15:13:43 -07:00
Florian Hahn	1fbdbb5595	Revert "Recommit "[SCEV] Look through single value PHIs." (take 2)" This reverts commit `764d9aa979`. This patch exposed a few additional cases where SCEV expressions are not properly invalidated. See PR52024, PR52023.	2021-09-30 20:53:51 +01:00
Florian Hahn	764d9aa979	Recommit "[SCEV] Look through single value PHIs." (take 2) This reverts commit `8fdac7cb7a`. The issue causing the revert has been fixed a while ago in `60b852092c`. Original message: Now that SCEVExpander can preserve LCSSA form, we do not have to worry about LCSSA form when trying to look through PHIs. SCEVExpander will take care of inserting LCSSA PHI nodes as required. This increases precision of the analysis in some cases. Reviewed By: mkazantsev, bmahjour Differential Revision: https://reviews.llvm.org/D71539	2021-09-28 10:32:17 +01:00
Max Kazantsev	cd166fb2ef	[SCEV] Use isAvailableAtLoopEntry in the asserts This is what is supposed to be there.	2021-09-21 17:11:15 +07:00
Max Kazantsev	4d5d725428	[SCEV] Add some asserts on availability of arguments of isLoopEntryGuardedByCond The logic in howManyLessThans is fishy. It first checks invariance of RHS, and then uses OrigRHS as argument for isLoopEntryGuardedByCond, which is, strictly saying, a different thing. We are seeing a very rare intermittent failure of availability checks, and it looks like this precondition is sometimes broken. Before we can figure out what's going on, adding asserts that all involved values that may possibly to to isLoopEntryGuardedByCond are available at loop entry. If either of these asserts fails (OrigRHS is the most likely suspect), it means that the logic here is flawed.	2021-09-21 17:08:52 +07:00
Max Kazantsev	2c7d5fbc9e	[SCEV] Generalize implication when signedness of FoundPred doesn't matter The implication logic for two values that are both negative or non-negative says that it doesn't matter whether their predicate is signed and unsigned, but only flips unsigned into signed for further inference. This patch adds support for flipping a signed predicate into unsigned as well. Differential Revision: https://reviews.llvm.org/D109959 Reviewed By: nikic	2021-09-21 11:17:56 +07:00
Max Kazantsev	a06db78fd9	[NFC] Rename Context->CtxI in SCEV for uniformity reasons	2021-09-21 10:12:20 +07:00
Max Kazantsev	def15c5fb6	[SCEV] Support negative values in signed/unsigned predicate reasoning There is a piece of logic that uses the fact that signed and unsigned versions of the same predicate are equivalent when both values are non-negative. It's also true when both of them are negative. Differential Revision: https://reviews.llvm.org/D109957 Reviewed By: nikic	2021-09-20 11:26:33 +07:00
Philip Reames	9bdb19cca2	[SCEV] (udiv X, Y) * Y is always NUW Motivated by the removal done in D109782. This implements the correct flag part generically. Differential Revision: https://reviews.llvm.org/D109786	2021-09-15 11:34:50 -07:00
Philip Reames	0dd755f027	[SCEV] Stop applying contextual flags in applyLoopGuards This fixes a violation of the wrap flag rules introduced in `c4048d8f`. As noted in the original review, the NUW is legal to infer from the structure of the replacee, but a) there's no test coverage, and b) this should be done generically for all multiplies. Differential Revision: https://reviews.llvm.org/D109782	2021-09-14 14:14:52 -07:00
Philip Reames	bfa2a81e92	[ScalarEvolution] Add an additional bailout to avoid NOT of pointer. It's possible in some cases for the LHS to be a pointer where the RHS is not. This isn't directly possible for an icmp, but the analysis mixes up operands of different icmp expressions in some cases. This does not include a test case as the smallest reduced case we've managed is extremely fragile and unlikely to test anything meaningful in the long term. Also add an assertion to getNotSCEV() to make tracking down this sort of issue a bit easier in the future. Fixes https://bugs.llvm.org/show_bug.cgi?id=51787 . Differential Revision: https://reviews.llvm.org/D109546	2021-09-09 15:19:36 -07:00
Philip Reames	eede4846a9	[SCEV] Allow negative steps for LT exit count computation for unsigned comparisons This bit of code is incredibly suspicious. It allows fully unknown (but potentially negative) steps, but not steps known to be negative. The comment about scev flag inference is worrying, but also not correct to my knowledge. At best, this might be covering up some related miscompile. However, there's no test in tree for it, the review history doesn't include obvious motivation, and the C++ example doesn't appear to give wrong results when hand translated to IR. I think it's time to remove this and see what falls out. During review, there were concerns raised about the correctness of the corresponding signed case. This change was deliberately narrowed to the unsigned case which has been auditted and appears correct for negative values. We need to get back to the known-negative signed case, but that'll be a future patch if nothing falls out from this one. Differential Revision: https://reviews.llvm.org/D104140	2021-09-09 14:09:29 -07:00
Eli Friedman	8f792707c4	[ScalarEvolution] Fix pointer/int confusion in howManyLessThans. In general, howManyLessThans doesn't really want to work with pointers at all; the result is an integer, and the operands of the icmp are effectively integers. However, isLoopEntryGuardedByCond doesn't like extra ptrtoint casts, so the arguments to isLoopEntryGuardedByCond need to be computed without those casts. Somehow, the values got mixed up with the recent howManyLessThans improvements; fix the confused values, and add a better comment to explain what's happening. Differential Revision: https://reviews.llvm.org/D109465	2021-09-09 12:38:33 -07:00
Philip Reames	e741fabc22	[SCEV] Move getIndexExpressionsFromGEP to delinearize [NFC]	2021-09-08 16:56:49 -07:00
Philip Reames	4b5e260b1d	[SCEV] Simplify findExistingSCEVInCache interface [NFC] We were returning a tuple when all but one caller only cared about one piece of the return value. That one caller can inline the complexity, and we can simplify all other uses.	2021-09-08 15:26:07 -07:00
Philip Reames	585c594d74	Move delinearization logic out of SCEV [NFC] None of this logic has anything to do with SCEV's internals, it just uses the existing public APIs. As a result, we can move the code from ScalarEvolution.cpp/hpp to Delinearization.cpp/hpp with only minor changes. This was discussed in advance on today's loop opt call. It turned out to be easy as hoped.	2021-09-08 12:28:35 -07:00
Philip Reames	6cdca906c7	[SCEV] Use no-self-wrap flags infered from exit structure to compute trip count The basic problem being solved is that we largely give up when encountering a trip count involving an IV which is not an addrec. We will fall back to the brute force constant eval, but that doesn't have the information about the fact that we can't cycle back through the same set of values. There's a high level design question of whether this is the right place to handle this, and if not, where that place is. The major alternative here would be to return a conservative upper bound, and then rely on two invocations of indvars to add the facts to the narrow IV, and then reconstruct SCEV. (I have not implemented the alternative and am not 100% sure this would work out.) That's arguably more in line with existing code, but I find this substantially easier to reason about. During review, no one expressed a strong opinion, so we went with this one. Differential Revision: D108651	2021-09-07 17:00:02 -07:00
Philip Reames	9659069978	[SCEV] Further clarify comments regarding UB and zero stride Follow on to D109029. I realized we had no mention of mustprogrress in the comment (as it prexisted mustprogress in the codebase). In the process of adding it, I tweaked the preconditions into something I think is more clear. Note that mustprogress is checked in the code. Differential Revision: https://reviews.llvm.org/D109091	2021-09-07 13:53:56 -07:00
Kazu Hirata	5648f7170e	[Analysis, Target, Transforms] Construct SmallVector with iterator ranges (NFC)	2021-09-07 09:19:33 -07:00
Nikita Popov	8d54c8a0c3	[SCEV] Fix applyLoopGuards() with range check idiom (PR51760) Due to a typo, this replaced %x with umax(C1, umin(C2, %x + C3)) rather than umax(C1, umin(C2, %x)). This didn't make a difference for the existing tests, because the result is only used for range calculation, and %x will usually have an unknown starting range, and the additional offset keeps it unknown. However, if %x already has a known range, we may compute a result range that is too small.	2021-09-06 22:22:41 +02:00
Philip Reames	bb0fa3ea02	Revert "snapshot - do not push" This reverts commit `91f4655d92`. This wasn't intented to be pushed, sorry.	2021-09-01 16:59:23 -07:00
Philip Reames	91f4655d92	snapshot - do not push	2021-09-01 16:59:01 -07:00
Philip Reames	73b951a7f7	[SCEV] Clarify requirements for zero-stride to be UB There's a silent bug in our reasoning about zero strides. We assume that having a single static exit implies that if that exit is not taken, then the loop must be infinite. This ignores the potential for abnormal exits via exceptions. Consider the following example: for (uint_8 i = 0; i < 1; i += 0) { throw_on_thousandth_call(); } Our reasoning is such that we'd conclude this loop can't take the backedge as that would lead to a (presumed) infinite loop. In practice, this is a silent bug because the loopIsFiniteByAssumption returns false strictly more often than the loopHaNoAbnormalExits property. We could reasonable want to change that in the future, so fixing the codeflow now is worthwhile. Differential Revision: https://reviews.llvm.org/D109029	2021-09-01 14:01:13 -07:00
Philip Reames	29fa37ec9f	[SCEV] If max BTC is zero, then so is the exact BTC [2 of 2] This extends D108921 into a generic rule applied to constructing ExitLimits along all paths. The remaining paths (primarily howFarToZero) don't have the same reasoning about UB sensitivity as the howManyLessThan ones did. Instead, the remain cause for max counts being more precise than exact counts is that we apply context sensitive loop guards on the max path, and not on the exact path. That choice is mildly suspect, but out of scope of this patch. The MVETailPredication.cpp change deserves a bit of explanation. We were previously figuring out that two SCEVs happened to be equal because the happened to be identical. When we optimized one with context sensitive information, but not the other, we lost the ability to prove them equal. So, cover this case by subtracting and then applying loop guards again. Without this, we see changes in test/CodeGen/Thumb2/mve-blockplacement.ll Differential Revision: https://reviews.llvm.org/D109015	2021-09-01 11:51:48 -07:00
Philip Reames	6600e1759b	[SCEV] If max BTC is zero, then so is the exact BTC [1 of N] This patch is specifically the howManyLessThan case. There will be a couple of followon patches for other codepaths. The subtle bit is explaining why the two codepaths have a difference while both are correct. The test case with modifications is a good example, so let's discuss in terms of it. * The previous exact bounds for this example of (-126 + (126 smax %n))<nsw> can evaluate to either 0 or 1. Both are "correct" results, but only one of them results in a well defined loop. If %n were 127 (the only possible value producing a trip count of 1), then the loop must execute undefined behavior. As a result, we can ignore the TC computed when %n is 127. All other values produce 0. * The max taken count computation uses the limit (i.e. the maximum value END can be without resulting in UB) to restrict the bound computation. As a result, it returns 0 which is also correct. WARNING: The logic above only holds for a single exit loop. The current logic for max trip count would be incorrect for multiple exit loops, except that we never call computeMaxBECountForLT except when we can prove either a) no overflow occurs in this IV before exit, or b) this is the sole exit. An alternate approach here would be to add the limit logic to the symbolic path. I haven't played with this extensively, but I'm hesitant because a) the term is optional and b) I'm not sure it'll reliably simplify away. As such, the resulting code quality from expansion might actually get worse. This was noticed while trying to figure out why D108848 wasn't NFC, but is otherwise standalone. Differential Revision: https://reviews.llvm.org/D108921	2021-08-31 08:50:11 -07:00
Nikita Popov	9f7873784d	[SCEVExpander] Reuse removePointerBase() for canonical addrecs ExposePointerBase() in SCEVExpander implements basically the same functionality as removePointerBase() in SCEV, so reuse it. The SCEVExpander code assumes that the pointer operand on adds is the last one -- I'm not sure that always holds. As such this might not be strictly NFC.	2021-08-29 21:12:35 +02:00
Nikita Popov	e6a5dd60ff	[SCEV] Assert unique pointer base (NFC) Add expressions can contain at most one pointer operand nowadays, assert that in getPointerBase() and removePointerBase().	2021-08-29 20:06:24 +02:00
Philip Reames	ec8d87e9f5	[SCEV] Infer nuw from nw for addrecs This was previously committed in `914836b`, and reverted due to confusion on the status of the review. Differential Revision: https://reviews.llvm.org/D108601	2021-08-24 14:24:05 -07:00
Philip Reames	58582bae63	Revert "[SCEV] Infer nsw/nuw from nw for addrecs" This reverts commit `914836b1c8`. Further comments on review came up after initial approval. Reverting while addressing.	2021-08-24 09:28:37 -07:00
Philip Reames	914836b1c8	[SCEV] Infer nsw/nuw from nw for addrecs If we no an addrec doesn't self-wrap, the increment is strictly positive, and the start value is the smallest representable value, then we know that the corresponding wrap type can not occur. Differential Revision: https://reviews.llvm.org/D108601	2021-08-24 08:53:21 -07:00
Philip Reames	96ef794fd0	[SCEV] Add a hasFlags utility to improve readability [NFC]	2021-08-23 17:36:52 -07:00
Roman Lebedev	0dc6b597db	Revert "[SCEV] Remove premature assert. PR46786" Since then, the SCEV pointer handling as been improved, so the assertion should now hold. This reverts commit `b96114c1e1`, relanding the assertion from commit `141e845da5`.	2021-08-13 17:50:22 +03:00
Philip Reames	f82f39b9cf	[SCEV] Add a comment about invariant in howManyLessThans	2021-07-26 16:39:26 -07:00
Nikita Popov	33146857e9	[IR] Consider non-willreturn as side effect (PR50511) This adjusts mayHaveSideEffect() to return true for !willReturn() instructions. Just like other side-effects, non-willreturn calls (aka "divergence") cannot be removed and cannot be reordered relative to other side effects. This fixes a number of bugs where non-willreturn calls are either incorrectly dropped or moved. In particular, it also fixes the last open problem in https://bugs.llvm.org/show_bug.cgi?id=50511. I performed a cursory review of all current mayHaveSideEffect() uses, which convinced me that these are indeed the desired default semantics. Places that do not want to consider non-willreturn as a sideeffect generally do not want mayHaveSideEffect() semantics at all. I identified two such cases, which are addressed by D106591 and D106742. Finally, there is a use in SCEV for which we don't really have an appropriate API right now -- what it wants is basically "would this be considered forward progress". I've just spelled out the previous semantics there. Differential Revision: https://reviews.llvm.org/D106749	2021-07-26 16:35:14 +02:00
Philip Reames	ec43def700	Style tweaks for SCEV's computeMaxBECountForLT [NFC]	2021-07-23 17:19:45 -07:00
Philip Reames	4a3dc7dc9a	[SCEV] Fix bug involving zero step and non-invariant RHS in trip count logic Eli pointed out the issue when reviewing D104140. The max trip count logic makes an assumption that the value of IV changes. When the step is zero, the nowrap fact becomes trivial, and thus there's nothing preventing the loop from being nearly infinite. (The "nearly" part is because mustprogress may disallow an infinite loop while still allowing 999999999 iterations before RHS happens to allow an exit.) This is very difficult to see in practice. You need a means to produce a loop varying RHS in a mustprogress loop which doesn't allow the loop to be infinite. In most cases, LICM or SCEV are smart enough to remove the loop varying expressions. Differential Revision: https://reviews.llvm.org/D106327	2021-07-23 15:19:23 -07:00
Eli Friedman	de3ea51be4	[ScalarEvolution] Refine computeMaxBECountForLT to be accurate in more cases. Allow arbitrary strides, and make sure we return the correct result when the backedge-taken count is zero. Differential Revision: https://reviews.llvm.org/D106197	2021-07-19 15:43:30 -07:00
Philip Reames	4402d0d4fb	[SCEV] Add a clarifying comment in howManyLessThans Wrap semantics are subtle when combined with multiple exits. This has caused several rounds of confusion during recent reviews, so try to document the subtly distinction between when wrap flags provide <u and <=u facts.	2021-07-19 15:13:48 -07:00
Nikita Popov	2b17c24a03	[SCEV] Fix unused variable warning (NFC)	2021-07-18 23:12:22 +02:00
Eli Friedman	28a3ad3f86	[ScalarEvolution] Remove uses of PointerType::getElementType.	2021-07-18 13:14:33 -07:00
Eli Friedman	cbba71bfb5	[ScalarEvolution] Fix overflow in computeBECount. The current implementation of computeBECount doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere. To deal with this, use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows. Differential Revision: https://reviews.llvm.org/D105216	2021-07-16 16:15:18 -07:00
Philip Reames	a99d420a93	[SCEV] Fix unsound reasoning in howManyLessThans This is split from D105216, it handles only a subset of the cases in that patch. Specifically, the issue being fixed is that the code incorrectly assumed that (Start-Stide) < End implied that the backedge was taken at least once. This is not true when e.g. Start = 4, Stride = 2, and End = 3. Note that we often do produce the right backedge taken count despite the flawed reasoning. The fix chosen here is to use an alternate form of uceil (ceiling of unsigned divide) lowering which is safe when max(RHS,Start) > Start - Stride. (Note that signedness of both max expression and comparison depend on the signedness of the comparison being analyzed, and that overflow in the Start - Stride expression is allowed.) Note that this is weaker than proving the backedge is taken because it allows start - stride < end < start. Some cases which can't be proven safe are sent down the generic path, and we do end up generating less optimal expressions in a few cases. Credit for coming up with the approach goes entirely to Eli. I just split it off, tweaked the comments a bit, and did some additional testing. Differential Revision: https://reviews.llvm.org/D105942	2021-07-15 10:32:47 -07:00
Philip Reames	205ed009a4	[SCEV] Handle zero stride correctly in howManyLessThans This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result. Recommitting with isLoopInvariant() check. Differential Revision: https://reviews.llvm.org/D105921	2021-07-13 19:14:01 -07:00
Arthur Eubanks	5738819679	Revert "[SCEV] Handle zero stride correctly in howManyLessThans" This reverts commit `4df591b5c9`. Causes crashes, see comments on D105921.	2021-07-13 17:53:48 -07:00
Eli Friedman	bb8c7a980f	[ScalarEvolution] Make isKnownNonZero handle more cases. Using an unsigned range instead of signed ranges is a bit more precise. Differential Revision: https://reviews.llvm.org/D105941	2021-07-13 15:36:45 -07:00
Philip Reames	4df591b5c9	[SCEV] Handle zero stride correctly in howManyLessThans This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result. Differential Revision: https://reviews.llvm.org/D105921	2021-07-13 13:31:40 -07:00
Philip Reames	087310c71e	[SCEV] Strengthen inference of RHS > Start in howManyLessThans Split off from D105216 to simplify review. Rewritten with a lambda to be easier to follow. Comments clarified. Sorry for no test case, this is tricky to exercise with the current structure of the code. It's about to be hit more frequently in a follow up patch, and the change itself is simple.	2021-07-13 11:54:07 -07:00
Philip Reames	e4b43973fb	[ScalarEvolution] Fix overflow when computing max trip counts This is split from D105216 to reduce patch complexity. Original code by Eli with very minor modification by me. The primary point of this patch is to add the getUDivCeilSCEV routine. I included the two callers with constant arguments as we know those must constant fold even without any of the fancy inference logic.	2021-07-13 10:01:10 -07:00
Eli Friedman	882ee7fbd6	Fix buildbot regression from `9c4baf5`. Apparently ScalarEvolution::isImpliedCond tries to truncate a pointer in some obscure cases. Guard the code with a check for pointers.	2021-07-09 17:54:09 -07:00
Eli Friedman	9c4baf5101	[ScalarEvolution] Strictly enforce pointer/int type rules. Rules: 1. SCEVUnknown is a pointer if and only if the LLVM IR value is a pointer. 2. SCEVPtrToInt is never a pointer. 3. If any other SCEV expression has no pointer operands, the result is an integer. 4. If a SCEVAddExpr has exactly one pointer operand, the result is a pointer. 5. If a SCEVAddRecExpr's first operand is a pointer, and it has no other pointer operands, the result is a pointer. 6. If every operand of a SCEVMinMaxExpr is a pointer, the result is a pointer. 7. Otherwise, the SCEV expression is invalid. I'm not sure how useful rule 6 is in practice. If we exclude it, we can guarantee that ScalarEvolution::getPointerBase always returns a SCEVUnknown, which might be a helpful property. Anyway, I'll leave that for a followup. This is basically mop-up at this point; all the changes with significant functional effects have landed. Some of the remaining changes could be split off, but I don't see much point. Differential Revision: https://reviews.llvm.org/D105510	2021-07-09 17:29:26 -07:00
Nikita Popov	2e3f4694d6	[IR] Add GEPOperator::indices() (NFC) In order to mirror the GetElementPtrInst::indices() API. Wanted to use this in the IRForTarget code, and was surprised to find that it didn't exist yet.	2021-07-09 21:41:20 +02:00
Martin Storsjö	e479777d3c	Revert "[ScalarEvolution] Fix overflow in computeBECount." This reverts commit `5b350183cd` (and also "[NFC][ScalarEvolution] Cleanup howManyLessThans.", `009436e9c1`, to make it apply). See https://reviews.llvm.org/D105216 for discussion on various miscompilations caused by that commit.	2021-07-09 14:26:48 +03:00
Eli Friedman	009436e9c1	[NFC][ScalarEvolution] Cleanup howManyLessThans. In preparation for D104075. Some NFC cleanup, and some test coverage for planned changes.	2021-07-08 17:56:26 -07:00
Eli Friedman	5b350183cd	[ScalarEvolution] Fix overflow in computeBECount. There are two issues with the current implementation of computeBECount: 1. It doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere. 2. It doesn't account for the possibility that Stride is zero. If Delta is zero, the backedge is never taken; the value of Stride isn't relevant. To handle this, we have to make sure that the expression returned by computeBECount evaluates to zero. To deal with this, add two new checks: 1. Use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows. 2. Use umax(Stride, 1) to handle the possibility that Stride is zero. Differential Revision: https://reviews.llvm.org/D105216	2021-07-08 10:09:55 -07:00
Eli Friedman	f5603aa050	[ScalarEvolution] Make sure getMinusSCEV doesn't negate pointers. Add a function removePointerBase that returns, essentially, S - getPointerBase(S). Use it in getMinusSCEV instead of actually subtracting pointers. Differential Revision: https://reviews.llvm.org/D105503	2021-07-07 10:27:10 -07:00
Eli Friedman	7ac1c7bead	Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out how to repair it to test what it was actually trying to test. Recommitting with fix to MemoryDepChecker::isDependent. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 12:16:05 -07:00
Eli Friedman	a6d081b2cb	Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers." This reverts commit `74d6ce5d5f`. Seeing crashes on buildbots in MemoryDepChecker::isDependent.	2021-07-06 11:17:13 -07:00
Eli Friedman	74d6ce5d5f	[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out how to repair it to test what it was actually trying to test. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 10:54:41 -07:00
Philip Reames	14d8f1546a	[SCEV] Fold (0 udiv %x) to 0 We have analogous rules in instsimplify, etc.., but were missing the same in SCEV. The fold is near trivial, but came up in the context of a larger change.	2021-06-30 08:31:13 -07:00
Eli Friedman	8d5bf0709d	[NFC] Prefer ConstantRange::makeExactICmpRegion over makeAllowedICmpRegion The implementation is identical, but it makes the semantics a bit more obvious.	2021-06-25 14:43:13 -07:00
Florian Hahn	6478f3fb78	[SCEV] Support single-cond range check idiom in applyLoopGuards. This patch extends applyLoopGuards to detect a single-cond range check idiom that InstCombine generates. It extends applyLoopGuards to detect conditions of the form (-C1 + X < C2). InstCombine will create this form when combining two checks of the form (X u< C2 + C1) and (X >=u C1). In practice, this enables us to correctly compute a tight trip count bounds for code as in the function below. InstCombine will fold the minimum iteration check created by LoopRotate with the user check (< 8). void unsigned_check(short pred, unsigned width) { if (width < 8) { for (int x = 0; x < width; x++) pred[x] = pred[x] pred[x]; } } As a consequence, LLVM creates dead vector loops for the code above, e.g. see https://godbolt.org/z/cb8eTcqET https://alive2.llvm.org/ce/z/SHHW4d Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104741	2021-06-25 10:24:40 +01:00
Florian Hahn	121ecb05e7	[SCEV] Generalize MatchBinaryAddToConst to support non-add expressions. This patch generalizes MatchBinaryAddToConst to support matching (A + C1), (A + C2), instead of just matching (A + C1), A. The existing cases can be handled by treating non-add expressions A as A + 0. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104634	2021-06-24 12:16:15 +01:00
Eli Friedman	b12192f7cd	[ScalarEvolution] Clarify implementation of getPointerBase(). getPointerBase should only be looking through Add and AddRec expressions; other expressions either aren't pointers, or can't be looked through. Technically, this is a functional change. For a multiply or min/max expression, if they have exactly one pointer operand, and that operand is the first operand, the behavior here changes. Similarly, if an AddRec has a pointer-type step, the behavior changes. But that shouldn't be happening in practice, and we plan to make such expressions illegal.	2021-06-23 12:55:59 -07:00
Eli Friedman	fdaf304e0d	[NFC][ScalarEvolution] Fix SCEVNAryExpr::getType(). SCEVNAryExpr::getType() could return the wrong type for a SCEVAddExpr. Remove it, and add getType() methods to the relevant subclasses. NFC because nothing uses it directly, as far as I know; this is just future-proofing.	2021-06-23 12:55:59 -07:00
Florian Hahn	adee485adf	[SCEV] Support signed predicates in applyLoopGuards. This adds handling for signed predicates, similar to how unsigned predicates are already handled. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104732	2021-06-23 10:21:05 +01:00
Florian Hahn	6c782e6eb0	[SCEV] Reduce code to handle predicates in applyLoopGuards (NFC). Hoist out common recurrence check and sink updating the map, to reduce the code required to support additional predicates.	2021-06-22 15:56:45 +01:00
Florian Hahn	d17798823c	[SCEV] Retain AddExpr flags when subtracting a foldable constant. Currently we drop wrapping flags for expressions like (A + C1)<flags> - C2. But we can retain flags under certain conditions: * Adding a smaller constant is NUW if the original AddExpr was NUW. * Adding a constant with the same sign and small magnitude is NSW, if the original AddExpr was NSW. This can improve results after using `SimplifyICmpOperands`, which may subtract one in order to use stricter predicates, as is the case for `isKnownPredicate`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D104319	2021-06-22 11:27:51 +01:00
Eli Friedman	8f3d16905d	[ScalarEvolution] Ensure backedge-taken counts are not pointers. A backedge-taken count doesn't refer to memory; returning a pointer type is nonsense. So make sure we always return an integer. The obvious way to do this would be to just convert the operands of the icmp to integers, but that doesn't quite work out at the moment: isLoopEntryGuardedByCond currently gets confused by ptrtoint operations. So we perform the ptrtoint conversion late for lt/gt operations. The test changes are mostly innocuous. The most interesting changes are more complex SCEV expressions of the form "(-1 * (ptrtoint i8* %ptr to i64)) + %ptr)". This is expected: we can't fold this to zero because we need to preserve the pointer base. The call to isLoopEntryGuardedByCond in howFarToZero is less precise because of ptrtoint operations; this shows up in the function pr46786_c26_char in ptrtoint.ll. Fixing it here would require more complex refactoring. It should eventually be fixed by future improvements to isImpliedCond. See https://bugs.llvm.org/show_bug.cgi?id=46786 for context. Differential Revision: https://reviews.llvm.org/D103656	2021-06-21 16:24:16 -07:00
Eli Friedman	62ed024c74	[NFC][ScalarEvolution] Clean up ExitLimit constructors. Make all the constructors forward to one constructor. Remove redundant assertions.	2021-06-20 17:40:30 -07:00
Eli Friedman	8a567e5f22	[ScalarEvolution] Fix pointer/int type handling converting select/phi to min/max. The old version of this code would blindly perform arithmetic without paying attention to whether the types involved were pointers or integers. This could lead to weird expressions like negating a pointer. Explicitly handle simple cases involving pointers, like "x < y ? x : y". In all other cases, coerce the operands of the comparison to integer types. This avoids the weird cases, while handling most of the interesting cases. Differential Revision: https://reviews.llvm.org/D103660	2021-06-17 14:05:12 -07:00
Eli Friedman	27963ccf07	[NFC][ScalarEvolution] Refactor createNodeForSelectOrPHI In preparation for D103660.	2021-06-16 12:32:32 -07:00
Roman Lebedev	a3113df219	[SCEV] PtrToInt on non-integral pointers is allowed As per (committed without review) @reames's rGac81cb7e6dde9b0890ee1780eae94ab96743569b change, we are now allowed to produce `ptrtoint` for non-integral pointers. This will unblock further unbreaking of SCEV regarding int-vs-pointer type confusion. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104322	2021-06-16 10:24:25 +03:00
Philip Reames	7629b2a09c	[LI] Add a cover function for checking if a loop is mustprogress [nfc] Essentially, the cover function simply combines the loop level check and the function level scope into one call. This simplifies several callers and is (subjectively) less error prone.	2021-06-10 13:37:32 -07:00
Philip Reames	aaaeb4b160	[SCEV] Use mustprogress flag on loops (in addition to function attribute) This addresses a performance regression reported against `3c6e4191`. That change (correctly) limited a transform based on assumed finiteness to mustprogress loops, but the previous change (`38540d7`) which introduced the mustprogress check utility only handled function attributes, not the loop metadata form. It turns out that clang uses the function attribute form for C++, and the loop metadata form for C. As a result, `3c6e4191` ended up being a large regression in practice for C code as loops weren't being considered mustprogress despite the language semantics.	2021-06-10 13:20:28 -07:00
Philip Reames	b65f30d6fb	[SCEV] Minor code motion to simplify a later patch [nfc]	2021-06-09 14:17:06 -07:00
Florian Hahn	b76f1f1202	[SCEV] Keep common NUW flags when inlining Add operands. Currently, NoWrapFlags are dropped if we inline operands of SCEVAddExpr operands. As a consequence, we always drop flags when building expressions like `getAddExpr(A, getAddExpr(B, C, NUW), NUW)`. We should be able to retain NUW flags common among all inlined SCEVAddExpr and the original flags. Reviewed By: nikic, mkazantsev Differential Revision: https://reviews.llvm.org/D103877	2021-06-09 17:13:21 +01:00
Philip Reames	3c6e419198	[SCEV] Properly guard reasoning about infinite loops being UB on mustprogress Noticed via code inspection. We changed the semantics of the IR when we added mustprogress, and we appear to have not updated this location. Differential Revision: https://reviews.llvm.org/D103834	2021-06-07 14:47:36 -07:00
Philip Reames	38540d71c7	[SCEV] Compute exit counts for unsigned IVs using mustprogress semantics The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be: for (unsigned i = 0; i < N; i += 2) { body; } Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body". The basic intuition behind this patch is as follows: * A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential cornercases where we exit after the n-th wrap through uint32_max. * Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this. In LLVM, this is tied to the mustprogress attribute. Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop. A couple notes for those reading along: * I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules. * We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path. Differential Revision: https://reviews.llvm.org/D103118	2021-06-07 11:24:00 -07:00
Roman Lebedev	e350494fb0	[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper We might want to use it when creating SCEV proper in createSCEV(), now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`, which might have caused us to loose some optimization potential.	2021-06-05 12:17:51 +03:00
Eli Friedman	fd229caa01	[polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs. When we're remapping an AddRec, the AddRec constructed by a partial rewrite might not make sense. This triggers an assertion complaining it's not loop-invariant. Instead of constructing the partially rewritten AddRec, just skip straight to calling evaluateAtIteration. Testcase was automatically reduced using llvm-reduce, so it's a little messy, but hopefully makes sense. Differential Revision: https://reviews.llvm.org/D102959	2021-06-01 09:51:05 -07:00
Roman Lebedev	f7c95c3322	[NFC] ScalarEvolution: apply SSO to the ExprValueMap value ExprValueMap is a map from SCEV * to a set-vector of (Value , ConstantInt ) pair, and while the map itself will likely be big-ish (have many keys), it is a reasonable assumption that each key will refer to a small-ish number of pairs. In particular looking at n=512 case from https://bugs.llvm.org/show_bug.cgi?id=50384, the small-size of 4 appears to be the sweet spot, it results in the least allocations while minimizing memory footprint. ``` $ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i \| tail -n 6; echo ""; done heaptrack.opt.0-orig.gz total runtime: 14.32s. calls to allocation functions: 8222442 (574192/s) temporary memory allocations: `2419000` (168924/s) peak heap memory consumption: 190.98MB peak RSS (including heaptrack overhead): 239.65MB total memory leaked: 67.58KB heaptrack.opt.1-n1.gz total runtime: 13.72s. calls to allocation functions: 7184188 (523705/s) temporary memory allocations: 2419017 (176338/s) peak heap memory consumption: 191.38MB peak RSS (including heaptrack overhead): 239.64MB total memory leaked: 67.58KB heaptrack.opt.2-n2.gz total runtime: 12.24s. calls to allocation functions: 6146827 (502355/s) temporary memory allocations: 2418997 (197695/s) peak heap memory consumption: 163.31MB peak RSS (including heaptrack overhead): 211.01MB total memory leaked: 67.58KB heaptrack.opt.3-n4.gz total runtime: 12.28s. calls to allocation functions: 6068532 (494260/s) temporary memory allocations: 2418985 (197017/s) peak heap memory consumption: 155.43MB peak RSS (including heaptrack overhead): 201.77MB total memory leaked: 67.58KB heaptrack.opt.4-n8.gz total runtime: 12.06s. calls to allocation functions: 6068042 (503321/s) temporary memory allocations: 2418992 (200646/s) peak heap memory consumption: 166.03MB peak RSS (including heaptrack overhead): 213.55MB total memory leaked: 67.58KB heaptrack.opt.5-n16.gz total runtime: 12.14s. calls to allocation functions: 6067993 (499958/s) temporary memory allocations: 2418999 (199307/s) peak heap memory consumption: 187.24MB peak RSS (including heaptrack overhead): 233.69MB total memory leaked: 67.58KB ``` While that test may be an edge worst-case scenario, https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions agrees that this also results in improvements in the usual situations.	2021-05-31 15:34:03 +03:00
Philip Reames	ff08c3468f	[SCEV] Compute trip multiple for multiple exit loops This patch implements getSmallConstantTripMultiple(L) correctly for multiple exit loops. The previous implementation was both imprecise, and violated the specified behavior of the method. This was fine in practice, because it turns out the function was both dead in real code, and not tested for the multiple exit case. Differential Revision: https://reviews.llvm.org/D103189	2021-05-26 11:52:25 -07:00
Philip Reames	9306bb638f	[SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context. I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns. There's two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable. If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired. Differential Revision: https://reviews.llvm.org/D103182	2021-05-26 11:18:25 -07:00
Philip Reames	921d3f7af0	[SCEV] Add a utility for converting from "exit count" to "trip count" (Mostly as a logical place to put a comment since this is a reoccuring confusion.)	2021-05-26 10:41:49 -07:00
Philip Reames	fb14577d0c	[SCEV] Extract out a helper for computing trip multiples	2021-05-26 10:15:03 -07:00
Vitaly Buka	f44f2e0afc	[NFC] Fix 'unused' warning	2021-05-25 12:23:57 -07:00

1 2 3 4 5 ...

1789 Commits