llvm-project

Commit Graph

Author	SHA1	Message	Date
Philip Reames	29fa37ec9f	[SCEV] If max BTC is zero, then so is the exact BTC [2 of 2] This extends D108921 into a generic rule applied to constructing ExitLimits along all paths. The remaining paths (primarily howFarToZero) don't have the same reasoning about UB sensitivity as the howManyLessThan ones did. Instead, the remaining cause for max counts being more precise than exact counts is that we apply context sensitive loop guards on the max path, and not on the exact path. That choice is mildly suspect, but out of scope of this patch. The MVETailPredication.cpp change deserves a bit of explanation. We were previously figuring out that two SCEVs happened to be equal because they happened to be identical. When we optimized one with context sensitive information, but not the other, we lost the ability to prove them equal. So, cover this case by subtracting and then applying loop guards again. Without this, we see changes in test/CodeGen/Thumb2/mve-blockplacement.ll Differential Revision: https://reviews.llvm.org/D109015	2021-09-01 11:51:48 -07:00
Philip Reames	6600e1759b	[SCEV] If max BTC is zero, then so is the exact BTC [1 of N] This patch is specifically the howManyLessThan case. There will be a couple of follow-on patches for other codepaths. The subtle bit is explaining why the two codepaths have a difference while both are correct. The test case with modifications is a good example, so let's discuss in terms of it. * The previous exact bounds for this example of (-126 + (126 smax %n))<nsw> can evaluate to either 0 or 1. Both are "correct" results, but only one of them results in a well defined loop. If %n were 127 (the only possible value producing a trip count of 1), then the loop must execute undefined behavior. As a result, we can ignore the TC computed when %n is 127. All other values produce 0. * The max taken count computation uses the limit (i.e. the maximum value END can be without resulting in UB) to restrict the bound computation. As a result, it returns 0 which is also correct. WARNING: The logic above only holds for a single exit loop. The current logic for max trip count would be incorrect for multiple exit loops, except that we never call computeMaxBECountForLT except when we can prove either a) no overflow occurs in this IV before exit, or b) this is the sole exit. An alternate approach here would be to add the limit logic to the symbolic path. I haven't played with this extensively, but I'm hesitant because a) the term is optional and b) I'm not sure it'll reliably simplify away. As such, the resulting code quality from expansion might actually get worse. This was noticed while trying to figure out why D108848 wasn't NFC, but is otherwise standalone. Differential Revision: https://reviews.llvm.org/D108921	2021-08-31 08:50:11 -07:00
Nikita Popov	9f7873784d	[SCEVExpander] Reuse removePointerBase() for canonical addrecs ExposePointerBase() in SCEVExpander implements basically the same functionality as removePointerBase() in SCEV, so reuse it. The SCEVExpander code assumes that the pointer operand on adds is the last one -- I'm not sure that always holds. As such this might not be strictly NFC.	2021-08-29 21:12:35 +02:00
Nikita Popov	e6a5dd60ff	[SCEV] Assert unique pointer base (NFC) Add expressions can contain at most one pointer operand nowadays, assert that in getPointerBase() and removePointerBase().	2021-08-29 20:06:24 +02:00
Philip Reames	ec8d87e9f5	[SCEV] Infer nuw from nw for addrecs This was previously committed in `914836b`, and reverted due to confusion on the status of the review. Differential Revision: https://reviews.llvm.org/D108601	2021-08-24 14:24:05 -07:00
Philip Reames	58582bae63	Revert "[SCEV] Infer nsw/nuw from nw for addrecs" This reverts commit `914836b1c8`. Further comments on review came up after initial approval. Reverting while addressing.	2021-08-24 09:28:37 -07:00
Philip Reames	914836b1c8	[SCEV] Infer nsw/nuw from nw for addrecs If we know an addrec doesn't self-wrap, the increment is strictly positive, and the start value is the smallest representable value, then we know that the corresponding wrap type cannot occur. Differential Revision: https://reviews.llvm.org/D108601	2021-08-24 08:53:21 -07:00
Philip Reames	96ef794fd0	[SCEV] Add a hasFlags utility to improve readability [NFC]	2021-08-23 17:36:52 -07:00
Roman Lebedev	0dc6b597db	Revert "[SCEV] Remove premature assert. PR46786" Since then, the SCEV pointer handling has been improved, so the assertion should now hold. This reverts commit `b96114c1e1`, relanding the assertion from commit `141e845da5`.	2021-08-13 17:50:22 +03:00
Philip Reames	f82f39b9cf	[SCEV] Add a comment about invariant in howManyLessThans	2021-07-26 16:39:26 -07:00
Nikita Popov	33146857e9	[IR] Consider non-willreturn as side effect (PR50511) This adjusts mayHaveSideEffect() to return true for !willReturn() instructions. Just like other side-effects, non-willreturn calls (aka "divergence") cannot be removed and cannot be reordered relative to other side effects. This fixes a number of bugs where non-willreturn calls are either incorrectly dropped or moved. In particular, it also fixes the last open problem in https://bugs.llvm.org/show_bug.cgi?id=50511. I performed a cursory review of all current mayHaveSideEffect() uses, which convinced me that these are indeed the desired default semantics. Places that do not want to consider non-willreturn as a sideeffect generally do not want mayHaveSideEffect() semantics at all. I identified two such cases, which are addressed by D106591 and D106742. Finally, there is a use in SCEV for which we don't really have an appropriate API right now -- what it wants is basically "would this be considered forward progress". I've just spelled out the previous semantics there. Differential Revision: https://reviews.llvm.org/D106749	2021-07-26 16:35:14 +02:00
Philip Reames	ec43def700	Style tweaks for SCEV's computeMaxBECountForLT [NFC]	2021-07-23 17:19:45 -07:00
Philip Reames	4a3dc7dc9a	[SCEV] Fix bug involving zero step and non-invariant RHS in trip count logic Eli pointed out the issue when reviewing D104140. The max trip count logic makes an assumption that the value of IV changes. When the step is zero, the nowrap fact becomes trivial, and thus there's nothing preventing the loop from being nearly infinite. (The "nearly" part is because mustprogress may disallow an infinite loop while still allowing 999999999 iterations before RHS happens to allow an exit.) This is very difficult to see in practice. You need a means to produce a loop varying RHS in a mustprogress loop which doesn't allow the loop to be infinite. In most cases, LICM or SCEV are smart enough to remove the loop varying expressions. Differential Revision: https://reviews.llvm.org/D106327	2021-07-23 15:19:23 -07:00
Eli Friedman	de3ea51be4	[ScalarEvolution] Refine computeMaxBECountForLT to be accurate in more cases. Allow arbitrary strides, and make sure we return the correct result when the backedge-taken count is zero. Differential Revision: https://reviews.llvm.org/D106197	2021-07-19 15:43:30 -07:00
Philip Reames	4402d0d4fb	[SCEV] Add a clarifying comment in howManyLessThans Wrap semantics are subtle when combined with multiple exits. This has caused several rounds of confusion during recent reviews, so try to document the subtle distinction between when wrap flags provide <u and <=u facts.	2021-07-19 15:13:48 -07:00
Nikita Popov	2b17c24a03	[SCEV] Fix unused variable warning (NFC)	2021-07-18 23:12:22 +02:00
Eli Friedman	28a3ad3f86	[ScalarEvolution] Remove uses of PointerType::getElementType.	2021-07-18 13:14:33 -07:00
Eli Friedman	cbba71bfb5	[ScalarEvolution] Fix overflow in computeBECount. The current implementation of computeBECount doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere. To deal with this, use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows. Differential Revision: https://reviews.llvm.org/D105216	2021-07-16 16:15:18 -07:00
Philip Reames	a99d420a93	[SCEV] Fix unsound reasoning in howManyLessThans This is split from D105216, it handles only a subset of the cases in that patch. Specifically, the issue being fixed is that the code incorrectly assumed that (Start-Stride) < End implied that the backedge was taken at least once. This is not true when e.g. Start = 4, Stride = 2, and End = 3. Note that we often do produce the right backedge taken count despite the flawed reasoning. The fix chosen here is to use an alternate form of uceil (ceiling of unsigned divide) lowering which is safe when max(RHS,Start) > Start - Stride. (Note that signedness of both max expression and comparison depend on the signedness of the comparison being analyzed, and that overflow in the Start - Stride expression is allowed.) Note that this is weaker than proving the backedge is taken because it allows start - stride < end < start. Some cases which can't be proven safe are sent down the generic path, and we do end up generating less optimal expressions in a few cases. Credit for coming up with the approach goes entirely to Eli. I just split it off, tweaked the comments a bit, and did some additional testing. Differential Revision: https://reviews.llvm.org/D105942	2021-07-15 10:32:47 -07:00
Philip Reames	205ed009a4	[SCEV] Handle zero stride correctly in howManyLessThans This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result. Recommitting with isLoopInvariant() check. Differential Revision: https://reviews.llvm.org/D105921	2021-07-13 19:14:01 -07:00
Arthur Eubanks	5738819679	Revert "[SCEV] Handle zero stride correctly in howManyLessThans" This reverts commit `4df591b5c9`. Causes crashes, see comments on D105921.	2021-07-13 17:53:48 -07:00
Eli Friedman	bb8c7a980f	[ScalarEvolution] Make isKnownNonZero handle more cases. Using an unsigned range instead of signed ranges is a bit more precise. Differential Revision: https://reviews.llvm.org/D105941	2021-07-13 15:36:45 -07:00
Philip Reames	4df591b5c9	[SCEV] Handle zero stride correctly in howManyLessThans This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result. Differential Revision: https://reviews.llvm.org/D105921	2021-07-13 13:31:40 -07:00
Philip Reames	087310c71e	[SCEV] Strengthen inference of RHS > Start in howManyLessThans Split off from D105216 to simplify review. Rewritten with a lambda to be easier to follow. Comments clarified. Sorry for no test case, this is tricky to exercise with the current structure of the code. It's about to be hit more frequently in a follow up patch, and the change itself is simple.	2021-07-13 11:54:07 -07:00
Philip Reames	e4b43973fb	[ScalarEvolution] Fix overflow when computing max trip counts This is split from D105216 to reduce patch complexity. Original code by Eli with very minor modification by me. The primary point of this patch is to add the getUDivCeilSCEV routine. I included the two callers with constant arguments as we know those must constant fold even without any of the fancy inference logic.	2021-07-13 10:01:10 -07:00
Eli Friedman	882ee7fbd6	Fix buildbot regression from `9c4baf5`. Apparently ScalarEvolution::isImpliedCond tries to truncate a pointer in some obscure cases. Guard the code with a check for pointers.	2021-07-09 17:54:09 -07:00
Eli Friedman	9c4baf5101	[ScalarEvolution] Strictly enforce pointer/int type rules. Rules: 1. SCEVUnknown is a pointer if and only if the LLVM IR value is a pointer. 2. SCEVPtrToInt is never a pointer. 3. If any other SCEV expression has no pointer operands, the result is an integer. 4. If a SCEVAddExpr has exactly one pointer operand, the result is a pointer. 5. If a SCEVAddRecExpr's first operand is a pointer, and it has no other pointer operands, the result is a pointer. 6. If every operand of a SCEVMinMaxExpr is a pointer, the result is a pointer. 7. Otherwise, the SCEV expression is invalid. I'm not sure how useful rule 6 is in practice. If we exclude it, we can guarantee that ScalarEvolution::getPointerBase always returns a SCEVUnknown, which might be a helpful property. Anyway, I'll leave that for a followup. This is basically mop-up at this point; all the changes with significant functional effects have landed. Some of the remaining changes could be split off, but I don't see much point. Differential Revision: https://reviews.llvm.org/D105510	2021-07-09 17:29:26 -07:00
Nikita Popov	2e3f4694d6	[IR] Add GEPOperator::indices() (NFC) In order to mirror the GetElementPtrInst::indices() API. Wanted to use this in the IRForTarget code, and was surprised to find that it didn't exist yet.	2021-07-09 21:41:20 +02:00
Martin Storsjö	e479777d3c	Revert "[ScalarEvolution] Fix overflow in computeBECount." This reverts commit `5b350183cd` (and also "[NFC][ScalarEvolution] Cleanup howManyLessThans.", `009436e9c1`, to make it apply). See https://reviews.llvm.org/D105216 for discussion on various miscompilations caused by that commit.	2021-07-09 14:26:48 +03:00
Eli Friedman	009436e9c1	[NFC][ScalarEvolution] Cleanup howManyLessThans. In preparation for D104075. Some NFC cleanup, and some test coverage for planned changes.	2021-07-08 17:56:26 -07:00
Eli Friedman	5b350183cd	[ScalarEvolution] Fix overflow in computeBECount. There are two issues with the current implementation of computeBECount: 1. It doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere. 2. It doesn't account for the possibility that Stride is zero. If Delta is zero, the backedge is never taken; the value of Stride isn't relevant. To handle this, we have to make sure that the expression returned by computeBECount evaluates to zero. To deal with this, add two new checks: 1. Use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows. 2. Use umax(Stride, 1) to handle the possibility that Stride is zero. Differential Revision: https://reviews.llvm.org/D105216	2021-07-08 10:09:55 -07:00
Eli Friedman	f5603aa050	[ScalarEvolution] Make sure getMinusSCEV doesn't negate pointers. Add a function removePointerBase that returns, essentially, S - getPointerBase(S). Use it in getMinusSCEV instead of actually subtracting pointers. Differential Revision: https://reviews.llvm.org/D105503	2021-07-07 10:27:10 -07:00
Eli Friedman	7ac1c7bead	Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subtracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll because I couldn't figure out how to repair it to test what it was actually trying to test. Recommitting with fix to MemoryDepChecker::isDependent. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 12:16:05 -07:00
Eli Friedman	a6d081b2cb	Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers." This reverts commit `74d6ce5d5f`. Seeing crashes on buildbots in MemoryDepChecker::isDependent.	2021-07-06 11:17:13 -07:00
Eli Friedman	74d6ce5d5f	[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subtracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll because I couldn't figure out how to repair it to test what it was actually trying to test. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 10:54:41 -07:00
Philip Reames	14d8f1546a	[SCEV] Fold (0 udiv %x) to 0 We have analogous rules in instsimplify, etc.., but were missing the same in SCEV. The fold is near trivial, but came up in the context of a larger change.	2021-06-30 08:31:13 -07:00
Eli Friedman	8d5bf0709d	[NFC] Prefer ConstantRange::makeExactICmpRegion over makeAllowedICmpRegion The implementation is identical, but it makes the semantics a bit more obvious.	2021-06-25 14:43:13 -07:00
Florian Hahn	6478f3fb78	[SCEV] Support single-cond range check idiom in applyLoopGuards. This patch extends applyLoopGuards to detect a single-cond range check idiom that InstCombine generates. It extends applyLoopGuards to detect conditions of the form (-C1 + X < C2). InstCombine will create this form when combining two checks of the form (X u< C2 + C1) and (X >=u C1). In practice, this enables us to correctly compute tight trip count bounds for code as in the function below. InstCombine will fold the minimum iteration check created by LoopRotate with the user check (< 8). `void unsigned_check(short *pred, unsigned width) { if (width < 8) { for (int x = 0; x < width; x++) pred[x] = pred[x] * pred[x]; } }` As a consequence, LLVM creates dead vector loops for the code above, e.g. see https://godbolt.org/z/cb8eTcqET https://alive2.llvm.org/ce/z/SHHW4d Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104741	2021-06-25 10:24:40 +01:00
Florian Hahn	121ecb05e7	[SCEV] Generalize MatchBinaryAddToConst to support non-add expressions. This patch generalizes MatchBinaryAddToConst to support matching (A + C1), (A + C2), instead of just matching (A + C1), A. The existing cases can be handled by treating non-add expressions A as A + 0. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104634	2021-06-24 12:16:15 +01:00
Eli Friedman	b12192f7cd	[ScalarEvolution] Clarify implementation of getPointerBase(). getPointerBase should only be looking through Add and AddRec expressions; other expressions either aren't pointers, or can't be looked through. Technically, this is a functional change. For a multiply or min/max expression, if they have exactly one pointer operand, and that operand is the first operand, the behavior here changes. Similarly, if an AddRec has a pointer-type step, the behavior changes. But that shouldn't be happening in practice, and we plan to make such expressions illegal.	2021-06-23 12:55:59 -07:00
Eli Friedman	fdaf304e0d	[NFC][ScalarEvolution] Fix SCEVNAryExpr::getType(). SCEVNAryExpr::getType() could return the wrong type for a SCEVAddExpr. Remove it, and add getType() methods to the relevant subclasses. NFC because nothing uses it directly, as far as I know; this is just future-proofing.	2021-06-23 12:55:59 -07:00
Florian Hahn	adee485adf	[SCEV] Support signed predicates in applyLoopGuards. This adds handling for signed predicates, similar to how unsigned predicates are already handled. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104732	2021-06-23 10:21:05 +01:00
Florian Hahn	6c782e6eb0	[SCEV] Reduce code to handle predicates in applyLoopGuards (NFC). Hoist out common recurrence check and sink updating the map, to reduce the code required to support additional predicates.	2021-06-22 15:56:45 +01:00
Florian Hahn	d17798823c	[SCEV] Retain AddExpr flags when subtracting a foldable constant. Currently we drop wrapping flags for expressions like (A + C1)<flags> - C2. But we can retain flags under certain conditions: * Adding a smaller constant is NUW if the original AddExpr was NUW. * Adding a constant with the same sign and small magnitude is NSW, if the original AddExpr was NSW. This can improve results after using `SimplifyICmpOperands`, which may subtract one in order to use stricter predicates, as is the case for `isKnownPredicate`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D104319	2021-06-22 11:27:51 +01:00
Eli Friedman	8f3d16905d	[ScalarEvolution] Ensure backedge-taken counts are not pointers. A backedge-taken count doesn't refer to memory; returning a pointer type is nonsense. So make sure we always return an integer. The obvious way to do this would be to just convert the operands of the icmp to integers, but that doesn't quite work out at the moment: isLoopEntryGuardedByCond currently gets confused by ptrtoint operations. So we perform the ptrtoint conversion late for lt/gt operations. The test changes are mostly innocuous. The most interesting changes are more complex SCEV expressions of the form "(-1 * (ptrtoint i8* %ptr to i64)) + %ptr)". This is expected: we can't fold this to zero because we need to preserve the pointer base. The call to isLoopEntryGuardedByCond in howFarToZero is less precise because of ptrtoint operations; this shows up in the function pr46786_c26_char in ptrtoint.ll. Fixing it here would require more complex refactoring. It should eventually be fixed by future improvements to isImpliedCond. See https://bugs.llvm.org/show_bug.cgi?id=46786 for context. Differential Revision: https://reviews.llvm.org/D103656	2021-06-21 16:24:16 -07:00
Eli Friedman	62ed024c74	[NFC][ScalarEvolution] Clean up ExitLimit constructors. Make all the constructors forward to one constructor. Remove redundant assertions.	2021-06-20 17:40:30 -07:00
Eli Friedman	8a567e5f22	[ScalarEvolution] Fix pointer/int type handling converting select/phi to min/max. The old version of this code would blindly perform arithmetic without paying attention to whether the types involved were pointers or integers. This could lead to weird expressions like negating a pointer. Explicitly handle simple cases involving pointers, like "x < y ? x : y". In all other cases, coerce the operands of the comparison to integer types. This avoids the weird cases, while handling most of the interesting cases. Differential Revision: https://reviews.llvm.org/D103660	2021-06-17 14:05:12 -07:00
Eli Friedman	27963ccf07	[NFC][ScalarEvolution] Refactor createNodeForSelectOrPHI In preparation for D103660.	2021-06-16 12:32:32 -07:00
Roman Lebedev	a3113df219	[SCEV] PtrToInt on non-integral pointers is allowed As per (committed without review) @reames's rGac81cb7e6dde9b0890ee1780eae94ab96743569b change, we are now allowed to produce `ptrtoint` for non-integral pointers. This will unblock further unbreaking of SCEV regarding int-vs-pointer type confusion. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D104322	2021-06-16 10:24:25 +03:00
Philip Reames	7629b2a09c	[LI] Add a cover function for checking if a loop is mustprogress [nfc] Essentially, the cover function simply combines the loop level check and the function level scope into one call. This simplifies several callers and is (subjectively) less error prone.	2021-06-10 13:37:32 -07:00
Philip Reames	aaaeb4b160	[SCEV] Use mustprogress flag on loops (in addition to function attribute) This addresses a performance regression reported against `3c6e4191`. That change (correctly) limited a transform based on assumed finiteness to mustprogress loops, but the previous change (`38540d7`) which introduced the mustprogress check utility only handled function attributes, not the loop metadata form. It turns out that clang uses the function attribute form for C++, and the loop metadata form for C. As a result, `3c6e4191` ended up being a large regression in practice for C code as loops weren't being considered mustprogress despite the language semantics.	2021-06-10 13:20:28 -07:00
Philip Reames	b65f30d6fb	[SCEV] Minor code motion to simplify a later patch [nfc]	2021-06-09 14:17:06 -07:00
Florian Hahn	b76f1f1202	[SCEV] Keep common NUW flags when inlining Add operands. Currently, NoWrapFlags are dropped if we inline operands of SCEVAddExpr operands. As a consequence, we always drop flags when building expressions like `getAddExpr(A, getAddExpr(B, C, NUW), NUW)`. We should be able to retain NUW flags common among all inlined SCEVAddExpr and the original flags. Reviewed By: nikic, mkazantsev Differential Revision: https://reviews.llvm.org/D103877	2021-06-09 17:13:21 +01:00
Philip Reames	3c6e419198	[SCEV] Properly guard reasoning about infinite loops being UB on mustprogress Noticed via code inspection. We changed the semantics of the IR when we added mustprogress, and we appear to have not updated this location. Differential Revision: https://reviews.llvm.org/D103834	2021-06-07 14:47:36 -07:00
Philip Reames	38540d71c7	[SCEV] Compute exit counts for unsigned IVs using mustprogress semantics The motivation here is simple loops with unsigned induction variables w/non-one steps. A toy example would be: for (unsigned i = 0; i < N; i += 2) { body; } Given C/C++ semantics, we do not get the nuw flag on the induction variable. Given that lack, we currently can't compute a bound for this loop. We can do better for many cases, depending on the contents of "body". The basic intuition behind this patch is as follows: * A step which evenly divides the iteration space must wrap through the same numbers repeatedly. And thus, we can ignore potential corner cases where we exit after the n-th wrap through uint32_max. * Per C++ rules, infinite loops without side effects are UB. We already have code in SCEV which relies on this. In LLVM, this is tied to the mustprogress attribute. Together, these let us conclude that the trip count of this loop must come before unsigned overflow unless the body would form a well defined infinite loop. A couple notes for those reading along: * I reused the loop properties code which is overly conservative for this case. I may follow up in another patch to generalize it for the actual UB rules. * We could cache the n(s/u)w facts. I left that out because doing a pre-patch which cached existing inference showed a lot of diffs I had trouble fully explaining. I plan to get back to this, but I don't want it on the critical path. Differential Revision: https://reviews.llvm.org/D103118	2021-06-07 11:24:00 -07:00
Roman Lebedev	e350494fb0	[NFC] Promote willNotOverflow() / getStrengthenedNoWrapFlagsFromBinOp() from IndVars into SCEV proper We might want to use it when creating SCEV proper in createSCEV(), now that we don't `forgetValue()` in `SimplifyIndvar::strengthenOverflowingOperation()`, which might have caused us to lose some optimization potential.	2021-06-05 12:17:51 +03:00
Eli Friedman	fd229caa01	[polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs. When we're remapping an AddRec, the AddRec constructed by a partial rewrite might not make sense. This triggers an assertion complaining it's not loop-invariant. Instead of constructing the partially rewritten AddRec, just skip straight to calling evaluateAtIteration. Testcase was automatically reduced using llvm-reduce, so it's a little messy, but hopefully makes sense. Differential Revision: https://reviews.llvm.org/D102959	2021-06-01 09:51:05 -07:00
Roman Lebedev	f7c95c3322	[NFC] ScalarEvolution: apply SSO to the ExprValueMap value ExprValueMap is a map from SCEV * to a set-vector of (Value *, ConstantInt *) pair, and while the map itself will likely be big-ish (have many keys), it is a reasonable assumption that each key will refer to a small-ish number of pairs. In particular looking at n=512 case from https://bugs.llvm.org/show_bug.cgi?id=50384, the small-size of 4 appears to be the sweet spot, it results in the least allocations while minimizing memory footprint. ``` $ for i in $(ls heaptrack.opt.*.gz); do echo $i; heaptrack_print $i | tail -n 6; echo ""; done heaptrack.opt.0-orig.gz total runtime: 14.32s. calls to allocation functions: 8222442 (574192/s) temporary memory allocations: 2419000 (168924/s) peak heap memory consumption: 190.98MB peak RSS (including heaptrack overhead): 239.65MB total memory leaked: 67.58KB heaptrack.opt.1-n1.gz total runtime: 13.72s. calls to allocation functions: 7184188 (523705/s) temporary memory allocations: 2419017 (176338/s) peak heap memory consumption: 191.38MB peak RSS (including heaptrack overhead): 239.64MB total memory leaked: 67.58KB heaptrack.opt.2-n2.gz total runtime: 12.24s. calls to allocation functions: 6146827 (502355/s) temporary memory allocations: 2418997 (197695/s) peak heap memory consumption: 163.31MB peak RSS (including heaptrack overhead): 211.01MB total memory leaked: 67.58KB heaptrack.opt.3-n4.gz total runtime: 12.28s. calls to allocation functions: 6068532 (494260/s) temporary memory allocations: 2418985 (197017/s) peak heap memory consumption: 155.43MB peak RSS (including heaptrack overhead): 201.77MB total memory leaked: 67.58KB heaptrack.opt.4-n8.gz total runtime: 12.06s. calls to allocation functions: 6068042 (503321/s) temporary memory allocations: 2418992 (200646/s) peak heap memory consumption: 166.03MB peak RSS (including heaptrack overhead): 213.55MB total memory leaked: 67.58KB heaptrack.opt.5-n16.gz total runtime: 12.14s. 
calls to allocation functions: 6067993 (499958/s) temporary memory allocations: 2418999 (199307/s) peak heap memory consumption: 187.24MB peak RSS (including heaptrack overhead): 233.69MB total memory leaked: 67.58KB ``` While that test may be an edge worst-case scenario, https://llvm-compile-time-tracker.com/compare.php?from=dee85d47d9f15fc268f7b18f279dac2774836615&to=98a57e31b1947d5bcdf4a5605ac2ab32b4bd5f63&stat=instructions agrees that this also results in improvements in the usual situations.	2021-05-31 15:34:03 +03:00
Philip Reames	ff08c3468f	[SCEV] Compute trip multiple for multiple exit loops This patch implements getSmallConstantTripMultiple(L) correctly for multiple exit loops. The previous implementation was both imprecise, and violated the specified behavior of the method. This was fine in practice, because it turns out the function was both dead in real code, and not tested for the multiple exit case. Differential Revision: https://reviews.llvm.org/D103189	2021-05-26 11:52:25 -07:00
Philip Reames	9306bb638f	[SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops This came up in review for another patch, see https://reviews.llvm.org/D102982#2782407 for full context. I've reviewed the callers to make sure they can handle multiple exit loops w/non-zero returns. There's two cases in target cost models where results might change (Hexagon and PowerPC), but the results looked legal and reasonable. If a target maintainer wishes to back out the effect of the costing change, they should explicitly check for multiple exit loops and handle them as desired. Differential Revision: https://reviews.llvm.org/D103182	2021-05-26 11:18:25 -07:00
Philip Reames	921d3f7af0	[SCEV] Add a utility for converting from "exit count" to "trip count" (Mostly as a logical place to put a comment since this is a recurring confusion.)	2021-05-26 10:41:49 -07:00
Philip Reames	fb14577d0c	[SCEV] Extract out a helper for computing trip multiples	2021-05-26 10:15:03 -07:00
Vitaly Buka	f44f2e0afc	[NFC] Fix 'unused' warning	2021-05-25 12:23:57 -07:00
Nikita Popov	6300c37a46	[SCEV] Cache operands used in BEInfo (NFC) When memoized values for SCEV expressions are dropped, we also drop all BECounts that make use of the SCEV expression. This is done by iterating over all the ExitNotTaken counts and (recursively) checking whether they use the SCEV expression. If there are many exits, this will take a lot of time. This patch improves the situation by pre-computing a set of all used operands, so that we can determine whether a certain BEInfo needs to be invalidated using a simple set lookup. Will still need to loop over all BEInfos though. This makes for a mild improvement on non-degenerate cases: https://llvm-compile-time-tracker.com/compare.php?from=b661a55a253f4a1cf5a0fbcb86e5ba7b9fb1387b&to=be1393f450e594c53f0ad7e62339a6bc831b16f6&stat=instructions For the degenerate case from https://bugs.llvm.org/show_bug.cgi?id=50384, for n=128 I'm seeing run time drop from 1.6s to 1.1s. Differential Revision: https://reviews.llvm.org/D102796	2021-05-25 21:03:33 +02:00
Philip Reames	aabca2d1da	[SCEV] Cleanup doesIVOverflowOnX checks [NFC] Stylistic changes only. 1) Don't pass a parameter just to do an early exit. 2) Use a name which matches actual behavior.	2021-05-25 10:12:24 -07:00
Philip Reames	a47b2d4567	[SCEV] Remove unused parameter from computeBECount [NFC] All callers pass "false" for the Equality parameter. Kill the dead code, and update the function block comment.	2021-05-25 09:58:56 -07:00
Nikita Popov	b661a55a25	[ScalarEvolution] Remove unused ExitLimit::hasOperand() method (NFC) We only use BackedgeTakenInfo::hasOperand().	2021-05-19 18:42:14 +02:00
Florian Hahn	e2759f110b	[SCEV] Apply guards to max with non-unitary steps. We already apply loop-guards when computing the maximum with unitary steps. This extends the code to also do so when dealing with non-unitary steps. This allows us to infer a tighter maximum in some cases. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D102267	2021-05-13 09:47:29 +01:00
Nikita Popov	d26ca78c18	[SCEV] Handle and/or in applyLoopGuards() applyLoopGuards() already combines conditions from multiple nested guards. However, it cannot use multiple conditions on the same guard, combined using and/or. Add support for this by recursing into either `and` or `or`, depending on the direction of the branch. Differential Revision: https://reviews.llvm.org/D101692	2021-05-09 21:34:28 +02:00
Florian Hahn	6c99e63120	[SCEV] Be more careful when traversing phis in isImpliedViaMerge. I think currently isImpliedViaMerge can incorrectly return true for phis in a loop/cycle, if the found condition involves the previous value of Consider the case in exit_cond_depends_on_inner_loop. At some point, we call (modulo simplifications) isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1). The existing code tries to prove IncV <= -1 for all incoming values IncV using the found condition (%call <= -1). At the moment this succeeds, but only because it does not compare the same runtime value. The found condition checks the value of the last iteration, but the incoming value is from the previous iteration. Hence we incorrectly determine that the previous value was <= -1, which may not be true. I think we need to be more careful when looking at the incoming values here. In particular, we need to rule out that a found condition refers to any value that may refer to one of the previous iterations. I'm not sure there's a reliable way to do so (that also works for irreducible control flow). So for now this patch adds an additional requirement that the incoming value must properly dominate the phi block. This should ensure the values do not change in a cycle. I am not entirely sure if this will catch all cases and I appreciate a thorough second look in that regard. Alternatively we could also unconditionally bail out in this case, instead of checking the incoming values. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101829	2021-05-07 19:52:29 +01:00
Nikita Popov	cc58e8918b	[SCEV] Simplify backedge count clearing (NFC) This seems to be a leftover from when the BackedgeTakenInfo stored multiple exit counts with manual memory management. At some point this was switched to a simple vector, and there should be no need to micro-manage the clearing anymore. We can simply drop the loop from the map and let the destructor do its job.	2021-05-01 17:50:01 +02:00
Philip Reames	0cc3e10f5e	[SCEV] Avoid range intersection idiom in getRangeForUnkownRecurrence [NFC] Addresses a review comment from D101181	2021-04-28 12:48:17 -07:00
Philip Reames	a836de0bde	[SCEV] Compute ranges for ashr recurrences Straight forward extension to the recently added infrastructure which was pioneered with shl. This was originally posted as part of D99687, but split off for ease of review. (I also decided to exclude the unknown start sign case explicitly for simplicity of understanding.) Differential Revision: https://reviews.llvm.org/D101181	2021-04-28 12:36:20 -07:00
Nikita Popov	e45168c4fa	[SCEV] Handle uge/ugt predicates in applyLoopGuards() These can be handled the same way as ule/ult, just using umax instead of umin. This is useful in cases where the umax prevents the upper bound from overflowing. Differential Revision: https://reviews.llvm.org/D101196	2021-04-27 22:41:05 +02:00
Nikita Popov	a5051f2fa2	[SCEV] Fix applyLoopGuards() chaining for ne predicates ICMP_NE predicates directly overwrote the rewritten result, instead of chaining it with previous rewrites, as was done for ICMP_ULT and ICMP_ULE. This means that some guards were effectively discarded, depending on their order.	2021-04-24 21:43:46 +02:00
Philip Reames	424d6cb902	[SCEV] Compute ranges for lshr recurrences Straight forward extension to the recently added infrastructure which was pioneered with shl. Differential Revision: https://reviews.llvm.org/D99687	2021-04-22 11:06:31 -07:00
Yang Fan	4307446e9f	[SCEV] Fix -Wunused-variable warning (NFC) GCC warning: ``` /llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp: In member function ‘const llvm::SCEV* llvm::ScalarEvolution::getLosslessPtrToIntExpr(const llvm::SCEV*, unsigned int)::SCEVPtrToIntSinkingRewriter::visitUnknown(const llvm::SCEVUnknown*)’: /llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:1152:13: warning: unused variable ‘ExprPtrTy’ [-Wunused-variable] 1152 | Type *ExprPtrTy = Expr->getType(); | ^~~~~~~~~ ```	2021-04-21 16:01:46 +08:00
Philip Reames	9c1a145aeb	Rearrange code to reduce diff for D99687 [nfc] Adding the switches to reduce diffs. I'm about to split that into an lshr part and an ashr part, doing the NFC part first makes it easier to maintain both diffs.	2021-04-20 11:40:15 -07:00
Roman Lebedev	7186764884	[NFC][SCEV] Split getLosslessPtrToIntExpr out of getPtrToIntExpr()	2021-04-20 21:29:21 +03:00
Roman Lebedev	41c22acc22	[NFC][SCEV] Assert that we don't try to create SCEVPtrToIntExpr of a non-integral pointer ptr<->int casts are only valid for integral pointers, defensively assert that we don't try to break that here.	2021-04-19 18:38:38 +03:00
Roman Lebedev	d480f968ad	Revert "[SCEV] Model `ashr exact x, C` as `(abs(x) EXACT/u (1<<C)) * signum(x)`" As being discussed in https://reviews.llvm.org/D100721, this modelling is lossy, we can't reconstruct `ashr`/`ashr exact` from it, which means that whenever we actually expand the IR, we've just pessimized the code. It would be good to model this pattern, after all it comes up every time you want to compute a distance between two pointers, but not at this cost. This reverts commit `ec54867df5`.	2021-04-18 16:26:45 +03:00
Nikita Popov	a1ed025d0e	Revert "[SCEV] Don't walk uses of phis without SCEV expression when forgetting" This reverts commit `faf9f11589`. Issues with this patch have been reported in https://reviews.llvm.org/D100264#2689917 and https://bugs.llvm.org/show_bug.cgi?id=49967.	2021-04-15 09:43:52 +02:00
Nikita Popov	faf9f11589	[SCEV] Don't walk uses of phis without SCEV expression when forgetting I've run into some cases where a large fraction of compile-time is spent invalidating SCEV. One of the causes is forgetLoop(), which walks all values that are def-use reachable from the loop header phis. When invalidating a topmost loop, that might be close to all values in a function. Additionally, it's fairly common for there to not actually be anything to invalidate, but we'll still be performing this walk again and again. My first thought was that we don't need to continue walking the uses if the current value doesn't have a SCEV expression. However, this isn't quite right, because SCEV construction can skip over values (e.g. for a chain of adds, we might only create a SCEV expression for the final value). What this patch does instead is to only walk the (full) def-use chain of loop phis that have a SCEV expression. If there's no expression for a phi, then we also don't have any dependent expressions to invalidate. Differential Revision: https://reviews.llvm.org/D100264	2021-04-13 20:28:17 +02:00
Roman Lebedev	e8c7f43e2c	[NFC][ConstantRange] Add 'icmp' helper method "Does the predicate hold between two ranges?" Not very surprisingly, some places were already doing this check, without explicitly naming the algorithm, cleanup them all.	2021-04-10 19:38:55 +03:00
Roman Lebedev	7b12c8c59d	Revert "[NFC][ConstantRange] Add 'icmp' helper method" This reverts commit `17cf2c9423`.	2021-04-10 19:37:53 +03:00
Roman Lebedev	17cf2c9423	[NFC][ConstantRange] Add 'icmp' helper method "Does the predicate hold between two ranges?" Not very surprisingly, some places were already doing this check, without explicitly naming the algorithm, cleanup them all.	2021-04-10 19:09:52 +03:00
Max Kazantsev	fee330824a	[SCEV] Fix false-positive recognition of simple recurrences. PR49856 A value from reachable block may come to a Phi node as its input from unreachable block. This may confuse matchSimpleRecurrence which has no access to DomTree and can falsely recognize something as a recurrence because of this effect, as the attached test shows. Patch `ae7b1e` deals with half of this problem, but it only accounts for the case when an unreachable instruction comes to Phi as an input. This patch provides a generalization by checking that no Phi block's predecessor is unreachable (no matter what the input is). Differential Revision: https://reviews.llvm.org/D99929 Reviewed By: reames	2021-04-07 13:55:17 +07:00
Philip Reames	ae7b1e8823	[SCEV] Handle unreachable binop when matching shift recurrence This fixes an issue introduced with my change d4648e, and reported in pr49768. The root problem is that dominance collapses in unreachable code, and that LoopInfo explicitly only models reachable code. Since the recurrence matcher doesn't filter by reachability (and can't easily because not all consumers have domtree), we need to bailout before assuming that finding a recurrence implies we found a loop.	2021-03-31 10:33:34 -07:00
Nikita Popov	a7efed5a20	[SCEV] Improve handling of not expressions in isImpliedCond() SCEV currently tries to prove implications of x pred y by also trying to imply ~y pred ~x. This is expensive in terms of compile-time (in fact, the majority of isImpliedCond compile-time is spent here) and generally not fruitful. The issue is that this also swaps the operands and thus breaks canonical ordering. If originally we were trying to prove an implication like X > C1 -> Y > C2, then we'll now try to prove X > C1 -> C3 > ~Y, which will not work. The only real case where we can get some use out of this transform is if the original conditions were in the form X > C1 -> Y < C2, were then swapped to X > C1 -> C2 > Y and are then swapped again here to X > C1 -> ~Y > C3. As such, handle this at a higher level, where we are doing the swapping in the first place. There's four different ways that we can line up a predicate and a swapped predicate, so we use some heuristics to pick some profitable way. Because we now try this transform at a higher level (isImpliedCondOperands rather than isImpliedCondOperandsHelper), we can also prove additional facts. Of the added tests, one was proven previously while the other wasn't. Differential Revision: https://reviews.llvm.org/D90926	2021-03-24 21:53:02 +01:00
Juneyoung Lee	b00209ed10	[SCEV] Use logical and/or matcher This is a minor patch that updates ScalarEvolution::isImpliedCond to use logical and/or matcher.	2021-03-23 06:00:54 +09:00
Philip Reames	93ce855d4b	2nd attempt at a speculative fix for windows builders after `d4648eea`	2021-03-22 10:32:57 -07:00
Philip Reames	6ba73c4743	Speculative fix for windows builders after `d4648eea`	2021-03-22 10:22:01 -07:00
Philip Reames	d4648eeaa2	[SCEV] Use trip count information to improve shift recurrence ranges This patch exploits the knowledge that we may be running many fewer than bitwidth iterations of the loop, and may be able to disallow the overflow case. This patch specifically implements only the shl case, but this can be generalized to ashr and lshr without difficulty. Differential Revision: https://reviews.llvm.org/D98222	2021-03-22 09:38:43 -07:00
Philip Reames	00d0315a7c	[SCEV] Factor out a lambda for strict condition splitting [NFC]	2021-03-19 10:07:12 -07:00
Max Kazantsev	fff1363ba0	[SCEV] Add false->any implication By definition of Implication operator, `false -> true` and `false -> false`. It means that `false` implies any predicate, no matter true or false. We don't need to go any further trying to prove the statement we need and just always say that `false` implies it in this case. In practice it means that we are trying to prove something guarded by `false` condition, which means that this code is unreachable, and we can safely prove any fact or perform any transform in this code. Differential Revision: https://reviews.llvm.org/D98706 Reviewed By: lebedev.ri	2021-03-19 11:29:48 +07:00
Max Kazantsev	b3a1500ea8	[SCEV][NFC] API for predicate evaluation Provides API that allows to check predicate for being true or false with one call. Current implementation is naive and just calls isKnownPredicate twice, but further we can rework this logic trying to use one check to prove both facts.	2021-03-18 19:21:29 +07:00
Max Kazantsev	5097143f0e	[SCEV][NFC] Move check up the stack One of (and primary) callers of isBasicBlockEntryGuardedByCond is isKnownPredicateAt, which makes isKnownPredicate check before it. It already makes non-recursive check inside. So, on this execution path this check is made twice. The only other caller is isLoopEntryGuardedByCond. Moving the check there should save some compile time.	2021-03-16 22:09:17 +07:00
Roman Lebedev	78b8ce40ef	Reland [SCEV] Improve modelling for (null) pointer constants This reverts commit `329aeb5db4`, and relands commit `61f006ac65`. This is a continuation of D89456. As it was suggested there, now that SCEV models `PtrToInt`, we can try to improve SCEV's pointer handling. In particular, i believe, i will need this in the future to further fix `SCEVAddExpr` operation type handling. This removes special handling of `ConstantPointerNull` from `ScalarEvolution::createSCEV()`, and adds constant folding into `ScalarEvolution::getPtrToIntExpr()`. This way, `null` constants stay as such in SCEV's, but gracefully become zero integers when asked. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D98147	2021-03-13 16:05:34 +03:00
Roman Lebedev	329aeb5db4	Temporarily revert "[SCEV] Improve modelling for (null) pointer constants" This appears to have broken ubsan bot: https://lab.llvm.org/buildbot/#/builders/85/builds/3062 https://reviews.llvm.org/D98147#2623549 It looks like LSR needs some kind of a change around insertion point handling. Reverting until i have a fix. This reverts commit `61f006ac65`.	2021-03-13 09:10:28 +03:00
Roman Lebedev	61f006ac65	[SCEV] Improve modelling for (null) pointer constants This is a continuation of D89456. As it was suggested there, now that SCEV models `PtrToInt`, we can try to improve SCEV's pointer handling. In particular, i believe, i will need this in the future to further fix `SCEVAddExpr` operation type handling. This removes special handling of `ConstantPointerNull` from `ScalarEvolution::createSCEV()`, and adds constant folding into `ScalarEvolution::getPtrToIntExpr()`. This way, `null` constants stay as such in SCEV's, but gracefully become zero integers when asked. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D98147	2021-03-12 22:11:58 +03:00
Philip Reames	a25b537bf4	[SCEV] Infer known bits from known sign bits This was suggested by lebedev.ri over on D96534. You'll note lack of tests. During review, we weren't actually able to find a case which exercises it, but both I and lebedev.ri feel it's a reasonable change, straight forward, and near free. Differential Revision: https://reviews.llvm.org/D97064	2021-03-09 12:37:17 -08:00
Philip Reames	4a5edea193	[SCEV] Use both known bits and sign bits when computing range of SCEV unknowns When computing a range for a SCEVUnknown, today we use computeKnownBits for unsigned ranges, and computeNumSignBits for signed ranges. This means we miss opportunities to improve range results. One common missed pattern is that we have a signed range of a value which CKB can determine is positive, but CNSB doesn't convey that information. The current range includes the negative part, and is thus double the size. Per the removed comment, the original concern which delayed using both (after some code merging years back) was a compile time concern. CTMark results (provided by Nikita, thanks!) showed a geomean impact of about 0.1%. This doesn't seem large enough to avoid higher quality results. Differential Revision: https://reviews.llvm.org/D96534	2021-02-19 08:29:12 -08:00
Kazu Hirata	df35a183d7	[SCEV] Use ListSeparator (NFC)	2021-02-16 23:23:05 -08:00
Michael Kruse	606aa622b2	Revert "[AssumptionCache] Avoid dangling llvm.assume calls in the cache" This reverts commit `b7d870eae7` and the subsequent fix "[Polly] Fix build after AssumptionCache change (D96168)" (commit `e6810cab09`). It caused indeterminism in the output, such that e.g. the polly-x86_64-linux buildbot failed occasionally.	2021-02-11 12:17:38 -06:00
Philip Reames	9bf3cfa77b	[SCEV] Add a missing AssumptionCache parameter The AssumptionCache mechanism is used to feed assumes into known bits computations. Most places in SCEV passed it in, but one place appears to have been missed. Spotted via inspection, don't have a test case which actually exercises this, but it seemed like an obvious fixit.	2021-02-10 12:08:55 -08:00
Johannes Doerfert	b7d870eae7	[AssumptionCache] Avoid dangling llvm.assume calls in the cache PR49043 exposed a problem when it comes to RAUW llvm.assumes. While D96106 would fix it for GVNSink, it seems a more general concern. To avoid future problems this patch moves away from the vector of weak reference model used in the assumption cache. Instead, we track the llvm.assume calls with a callback handle which will remove itself from the cache if the call is deleted. Fixes PR49043. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D96168	2021-02-06 12:18:39 -06:00
Gil Rapaport	d475030dc2	[SCEV] Apply loop guards to divisibility tests Extend applyLoopGuards() to take into account conditions/assumes proving some value %v to be divisible by D by rewriting %v to (%v / D) * D. This lets the loop unroller and the loop vectorizer identify more loops as not requiring remainder loops. Differential Revision: https://reviews.llvm.org/D95521	2021-02-02 08:09:39 +02:00
Florian Hahn	f1e8136115	[SCEV] Bail out if URem operand cannot be zero-extended. In some cases, LHS is larger than the target expression type. Bail out in that case for now, to avoid crashing	2021-02-01 13:50:54 +00:00
Max Kazantsev	8a4ad8849f	[SCEV] Do not cache comparison result upon reached max depth as "equivalence". PR48725 We use `EquivalenceClasses` to cache the notion that two SCEVs are equivalent, to save time in situation when `A` is equivalent to `B` and `B` is equivalent to `C`, making check "if `A` is equivalent to `C`?" cheaper. We also return `0` in the comparator when we reach max analysis depth to save compile time. After doing this, we also cache them as being equivalent. Now, imagine the following situation: - `A` is proved equivalent to `B`; - `C` is proved equivalent to `D`; - Comparison of `A` against `D` is proved non-zero; - Comparison of `B` against `C` reaches max depth (and gets cached as equivalence). Now, before the invocation of compare(`B`, `C`), `A` and `D` belonged to different equivalence classes, and their comparison returned non-zero. After the invocation of compare(`B`, `C`), equivalence classes get merged and `A`, `B`, `C` and `D` all fall into the same equivalence class. So the comparator will change its behavior for couple `A` and `D`, with weird consequences following it. This comparator is finally used in `std::stable_sort`, and this behavior change makes it crash (looks like it's causing a memory corruption). Solution: this patch changes `CompareSCEVComplexity` to return `None` when the max depth is reached. So in this case, we do not cache these SCEVs (and their parents in the tree) as being equivalent. Differential Revision: https://reviews.llvm.org/D94654 Reviewed By: lebedev.ri	2021-01-29 12:08:34 +07:00
Mindong Chen	00fcc03687	[SCEV] Fix incorrect loop exit count analysis. In computeLoadConstantCompareExitLimit, the addrec used to compute the exit count should be from the loop which the exiting block belongs to. Reviewed by: mkazantsev Differential Revision: https://reviews.llvm.org/D92367	2021-01-27 19:36:05 +08:00
Kazu Hirata	8f5da41c4d	[llvm] Construct SmallVector with iterator ranges (NFC)	2021-01-20 21:35:52 -08:00
Kazu Hirata	23b0ab2acb	[llvm] Use the default value of drop_begin (NFC)	2021-01-18 10:16:36 -08:00
Kazu Hirata	19aacdb715	[llvm] Construct SmallVector with iterator ranges (NFC)	2021-01-16 09:40:53 -08:00
Kazu Hirata	848e8f938f	[llvm] Construct SmallVector with iterator ranges (NFC)	2021-01-04 11:42:44 -08:00
Gil Rapaport	d9c0b128e3	[SCEV] Simplify trunc to zero based on known bits Let getTruncateExpr() short-circuit to zero when the value being truncated is known to have at least as many trailing zeros as the target type. Differential Revision: https://reviews.llvm.org/D93973	2021-01-03 13:57:12 +02:00
Juneyoung Lee	509fa8e02e	[SCEV] recognize logical and/or pattern This patch makes SCEV recognize 'select A, B, false' and 'select A, true, B'. This is a performance improvement that will be helpful after unsound select -> and/or transformation is removed, as discussed in D93065. SCEV's answers for the select form should be a bit more conservative than the equivalent `and A, B` / `or A, B`. Take this example: https://alive2.llvm.org/ce/z/NsP9ue . To check whether it is valid for SCEV's computeExitLimit to return min(n, m) as ExactNotTaken value, I put llvm.assume at tgt. It fails because the exit limit becomes poison if n is zero and m is poison. This is problematic if e.g. the exit value of i is replaced with min(n, m). If either n or m is constant, we can revive the analysis again. I added relevant tests and put alive2 links there. If and is used instead, this is okay: https://alive2.llvm.org/ce/z/K9rbJk . Hence the existing analysis is sound. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93882	2021-01-01 04:37:57 +09:00
Kazu Hirata	f76e83bfbb	[Analysis] Use llvm::append_range (NFC)	2020-12-29 19:23:21 -08:00
Kazu Hirata	3285ee143b	[Analysis, IR, CodeGen] Use llvm::erase_if (NFC)	2020-12-20 09:19:35 -08:00
Max Kazantsev	8b330f1f69	[SCEV] Add missing type check into getRangeForAffineNoSelfWrappingAR We make type widening without checking if it's needed. Bail if the max iteration count is wider than AR's type.	2020-12-15 14:50:32 +07:00
Kazu Hirata	eb44682d67	[Analysis] Use is_contained (NFC)	2020-12-11 21:19:31 -08:00
Max Kazantsev	035955f925	Revert "Return "[SCEV] Use isBasicBlockEntryGuardedByCond in isLoopBackedgeGuardedByCond", 2nd try" This reverts commit `f690986f31`. Compile time then and again...	2020-11-26 18:12:51 +07:00
Max Kazantsev	f690986f31	Return "[SCEV] Use isBasicBlockEntryGuardedByCond in isLoopBackedgeGuardedByCond", 2nd try Reverted because the compile time impact is still too high. isKnownViaNonRecursiveReasoning is used twice, we can do it just once. Differential Revision: https://reviews.llvm.org/D92152	2020-11-26 17:45:13 +07:00
Max Kazantsev	91d6b6b5fb	Revert "[SCEV] Use isBasicBlockEntryGuardedByCond in isLoopBackedgeGuardedByCond" This reverts commit `3d4c0460ec`. Compile time impact is still high. Need to understand why. Differential Revision: https://reviews.llvm.org/D92153	2020-11-26 17:28:30 +07:00
Max Kazantsev	3d4c0460ec	[SCEV] Use isBasicBlockEntryGuardedByCond in isLoopBackedgeGuardedByCond Previously we tried using isKnownPredicateAt, but it makes an extra query to isKnownPredicate, which has negative impact on compile time. Let's try to use more lightweight isBasicBlockEntryGuardedByCond. Differential Revision: https://reviews.llvm.org/D92152	2020-11-26 17:08:38 +07:00
Max Kazantsev	3b6481eae2	Revert "[SCEV] Use isKnownPredicateAt in isLoopBackedgeGuardedByCond" This reverts commit `14f2ad0e3c`. Reverting to investigate compile time drop. Differential Revision: https://reviews.llvm.org/D92152	2020-11-26 16:42:43 +07:00
Max Kazantsev	14f2ad0e3c	[SCEV] Use isKnownPredicateAt in isLoopBackedgeGuardedByCond A piece of code in `isLoopBackedgeGuardedByCond` basically duplicates the dominators traversal from `isBlockEntryGuardedByCond` called from `isKnownPredicateAt`, but it's less powerful because it does not give context to `isImpliedCond`. This patch reuses the `isKnownPredicateAt` function there, reducing the amount of code duplication and making it more powerful. Differential Revision: https://reviews.llvm.org/D92152 Reviewed By: skatkov	2020-11-26 13:20:02 +07:00
Max Kazantsev	f10500e220	[IndVars] Use isLoopBackedgeGuardedByCond for last iteration check Use more context to prove contextual facts about the last iteration. It is only executed when the backedge is taken, so we can use `isLoopBackedgeGuardedByCond` to make this check. Differential Revision: https://reviews.llvm.org/D91535 Reviewed By: skatkov	2020-11-26 12:37:21 +07:00
Joe Ellis	06654a5348	[SVE] Fix TypeSize warning in RuntimePointerChecking::insert The TypeSize warning would occur because RuntimePointerChecking::insert was not scalable vector aware. The fix is to use ScalarEvolution::getSizeOfExpr to grab the size of types. Differential Revision: https://reviews.llvm.org/D90171	2020-11-25 16:59:03 +00:00
Max Kazantsev	9130651126	Revert "[SCEV] Generalize no-self-wrap check in isLoopInvariantExitCondDuringFirstIterations" This reverts commit `7dcc889917`. This patch introduced a logical error that breaks whole logic of this analysis. All checks we are making are supposed to be loop-independent, so that we could safely remove the range check. The 'nw' fact is loop-dependent, so we can remove the check basing on facts from this very check. Motivating examples will follow-up.	2020-11-25 13:26:17 +07:00
Max Kazantsev	02fdbc3567	Revert "[NFC][SCEV] Generalize monotonicity check for full and limited iteration space" This reverts commit `2734a9ebf4`. This patch appeared to not be a NFC. It introduced an execution path where monotonicity check on limited space started relying on existing nsw/nuw flags, which is illegal. The motivating test will follow-up.	2020-11-24 17:56:59 +07:00
Max Kazantsev	48d7cc6ae2	[SCEV] Fix incorrect treatment of max taken count. PR48225 SCEV makes a logical mistake when handling EitherMayExit in case when both conditions must be met to exit the loop. The mistake is as follows: "if condition `A` fails within at most `X` first iterations, and `B` fails within at most `Y` first iterations, then `A & B` fails at most within `min (X, Y)` first iterations". This is wrong, because both of them must fail at the same time. A simple example illustrating this is the following: we have an IV with step 1, condition `A` = "IV is even", condition `B` = "IV is odd". Both `A` and `B` will fail within the first two iterations. But it doesn't mean that both of them will fail within the first two iterations at the same time, which would mean that IV is neither even nor odd at the same time within the first 2 iterations. We can only do so for known exact BE counts, but not for max. Differential Revision: https://reviews.llvm.org/D91942 Reviewed By: nikic	2020-11-23 16:52:39 +07:00
Max Kazantsev	47e31d1b5e	[NFC] Reduce code duplication in binop processing in computeExitLimitFromCondCached Handling of `and` and `or` vastly uses copy-paste. Factored out into a helper function as preparation step for further fix (see PR48225). Differential Revision: https://reviews.llvm.org/D91864 Reviewed By: nikic	2020-11-23 13:18:12 +07:00
Philip Reames	0f41a2fe83	test commit for new client	2020-11-16 17:26:52 -08:00
Philip Reames	257d33c815	[SCEV] Factor out part of wrap flag detection logic [NFC](try 2) This is a cut down version of 1ec6e1 which was reverted due to a compile time issue. The key changes made from that patch: 1) only infer the flags needed along each path, 2) be careful to preserve order of checks, and 3) avoid computing NW flags at all since we need to prove the stronger property (does not cross 0) in the caller anyways. Assuming this doesn't trip regressions, I'm going to try weakening (1). My end objective is to move flag inference into addrec construction. If I can't weaken (1) without compile time impact, I'll have a problem.	2020-11-16 12:07:21 -08:00
Nikita Popov	9ace4b337f	Revert "[SCEV] Factor out part of wrap flag detection logic [NFC-ish]" This reverts commit `1ec6e1eb8a`. This change causes a significant compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=dd0b8b94d0796bd895cc998dd163b4fbebceb0b8&to=1ec6e1eb8a084bffae8a40236eb9925d8026dd07&stat=instructions I assume that this is due to the non-NFC part of the change, which now performs expensive nowrap inference even for nowrap flags that are not used by the particular code.	2020-11-15 10:19:44 +01:00
Philip Reames	1ec6e1eb8a	[SCEV] Factor out part of wrap flag detection logic [NFC-ish] In an effort to make code around flag determination more readable, and (possibly) prepare for a follow up change, factor out some of the flag detection logic. In the process, reduce the number of locations we mutate wrap flags by a couple. Note that this isn't NFC. The old code tried for NSW xor (NUW || NW). That is, two different paths computed different sets of wrap flags. The new code will try for all three. The result is that some expressions end up with a few extra flags set.	2020-11-14 19:21:05 -08:00
Nikita Popov	f3124a46c1	[SCEV] Fix nsw flags for GEP expressions The SCEV code for constructing GEP expressions currently assumes that the addition of the base and all the offsets is nsw if the GEP is inbounds. While the addition of the offsets is indeed nsw, the addition to the base address is not, as the base address is interpreted as an unsigned value. Fix the GEP expression code to not assume nsw for the base+offset calculation. However, do assume nuw if we know that the offset is non-negative. With this, we use the same behavior as the construction of GEP addrecs does. (Modulo the fact that we disregard SCEV unification, as the pre-existing FIXME points out). Differential Revision: https://reviews.llvm.org/D90648	2020-11-13 18:19:32 +01:00
Max Kazantsev	0a1d394bf3	[NFC] Refactor loop-invariant getters to return Optional	2020-11-13 15:03:10 +07:00
Max Kazantsev	2734a9ebf4	[NFC][SCEV] Generalize monotonicity check for full and limited iteration space A piece of logic of `isLoopInvariantExitCondDuringFirstIterations` is actually a generalized predicate monotonicity check. This patch moves it into the corresponding method and generalizes it a bit. Differential Revision: https://reviews.llvm.org/D90395 Reviewed By: apilipenko	2020-11-12 12:37:07 +07:00
Max Kazantsev	7dcc889917	[SCEV] Generalize no-self-wrap check in isLoopInvariantExitCondDuringFirstIterations Lift limitation on step being `+/- 1`. In fact, the only thing it is needed for is proving no-self-wrap. We can instead check this flag directly. Theoretically it can increase the scope of the transform, but I could not easily construct such a test. Differential Revision: https://reviews.llvm.org/D91126 Reviewed By: apilipenko	2020-11-11 11:17:13 +07:00
David Green	b2ac9681a7	[ARM] Alter t2DoLoopStart to define lr This changes the definition of t2DoLoopStart from t2DoLoopStart rGPR to GPRlr = t2DoLoopStart rGPR This will hopefully mean that low overhead loops are more tied together, and we can more reliably generate loops without reverting or being at the whims of the register allocator. This is a fairly simple change in itself, but leads to a number of other required alterations. - The hardware loop pass, if UsePhi is set, now generates loops of the form: %start = llvm.start.loop.iterations(%N) loop: %p = phi [%start], [%dec] %dec = llvm.loop.decrement.reg(%p, 1) %c = icmp ne %dec, 0 br %c, loop, exit - For this a new llvm.start.loop.iterations intrinsic was added, identical to llvm.set.loop.iterations but produces a value as seen above, gluing the loop together more through def-use chains. - This new intrinsic conceptually produces the same output as input, which is taught to SCEV so that the checks in MVETailPredication are not affected. - Some minor changes are needed to the ARMLowOverheadLoop pass, but it has been left mostly as before. We should now more reliably be able to tell that the t2DoLoopStart is correct without having to prove it, but t2WhileLoopStart and tail-predicated loops will remain the same. - And all the tests have been updated. There are a lot of them! This patch on its own might cause more trouble than it helps, with more tail-predicated loops being reverted, but some additional patches can hopefully improve upon that to get to something that is better overall. Differential Revision: https://reviews.llvm.org/D89881	2020-11-10 15:57:58 +00:00
Max Kazantsev	3ec69c16c3	[NFC] Different way of getting step	2020-11-10 13:48:02 +07:00
Max Kazantsev	6022a8b7e8	[SCEV] Drop cached ranges of AddRecs after flag update Our range computation methods benefit from no-wrap flags. But if the ranges were first computed before the flags were set, the cached range will be too pessimistic. We need to drop cached ranges whenever we sharpen AddRec's no wrap flags. Differential Revision: https://reviews.llvm.org/D89847 Reviewed By: fhahn	2020-11-10 12:37:12 +07:00
Max Kazantsev	ab7ef35d34	Revert "[SCEV] Handle non-positive case in isImpliedViaOperations" This reverts commit `8dc98897c4`. Committed by mistake.	2020-11-05 11:27:55 +07:00
Max Kazantsev	8dc98897c4	[SCEV] Handle non-positive case in isImpliedViaOperations We already handle non-negative case there. Add support for non-positive.	2020-11-05 11:07:37 +07:00
Nikita Popov	cc91554ebb	[SCEV] Delay strengthening of nowrap flags Strengthening nowrap flags is relatively expensive. Make sure we only do it if we're actually going to use the flags -- we don't use them for many recursive invocations. Additionally, if we're reusing an existing SCEV node, there's no point in trying to strengthen the flags if we don't have any new baseline facts. This change falls slightly short of being NFC, because the way flags are handled during add+addrec / mul+addrec folding may be more precise (as fewer operands are included in the calculation).	2020-11-01 22:18:07 +01:00
Nikita Popov	6ec56467cb	[SCEV] Construct GEP expression more efficiently (NFCI) Instead of performing a sequence of pairwise additions, directly construct a multi-operand add expression. This should be NFC modulo any SCEV canonicalization deficiencies.	2020-11-01 19:00:57 +01:00
Roman Lebedev	ef22d500f7	[NFCI][SCEV] getPtrToIntExpr(): use SCEVRewriteVisitor<> for ptrtoint cast sinking This is functionally identical to the previous implementation, just using a generic interface to do that instead of a hand-rolled one, with caching as a bonus. Though the sinking is still recursive. Note that SCEVRewriteVisitor<>'s default implementations don't preserve NoWrap flags on Add/Mul (but do on AddRec!), but here we know we can preserve them, so `visitAddExpr()`/`visitMulExpr()` are specialized.	2020-10-30 17:05:14 +03:00
Roman Lebedev	b4916918e5	[SCEV] SCEVPtrToIntExpr simplifications If we've got an SCEVPtrToIntExpr(op), where op is not an SCEVUnknown, we want to sink the SCEVPtrToIntExpr into an operand, so that the operation is performed on integers, and eventually we end up with just an `SCEVPtrToIntExpr(SCEVUnknown)`. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89692	2020-10-30 11:13:35 +03:00
Roman Lebedev	81fc53a36a	[SCEV] Introduce SCEVPtrToIntExpr (PR46786) And use it to model LLVM IR's `ptrtoint` cast. This is essentially an alternative to D88806, but with no chance for all the problems it caused due to having the cast as implicit there. (see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3) As we've established by now, there are at least two reasons why we want this: * It will allow SCEV to actually model the `ptrtoint` casts and their operands, instead of treating them as `SCEVUnknown` * It should help with initial problem of PR46786 - this should eventually allow us to not lose pointer-ness of an expression in more cases As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle, we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()` should sink the cast as far down into the expression as possible, so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`. But I think that it isn't the best solution, because it doesn't really matter from memory consumption side - there probably won't be that many `SCEVPtrToIntExpr`s for it to matter, and it allows for much better discoverability. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89456	2020-10-30 11:13:35 +03:00

1702 Commits