llvm-project

Commit Graph

Author	SHA1	Message	Date
Max Kazantsev	84fe777a63	[Test] One more test on IndVars with negative step	2020-11-06 14:55:50 +07:00
Max Kazantsev	1776581be4	[Test] Run test with expensive SE inference. NFC The planned changes require expensive inference to kick in	2020-11-06 14:23:44 +07:00
Max Kazantsev	f847094c24	[IndVars] Use knowledge about execution on last iteration when removing checks If we know that some check will not be executed on the last iteration, we can use this fact to eliminate its check. Differential Revision: https://reviews.llvm.org/D88210 Reviewed By: ebrevnov	2020-11-03 13:38:58 +07:00
Nikita Popov	a8ef00af43	[IndVars] Regenerate test checks (NFC)	2020-11-02 22:31:11 +01:00
Max Kazantsev	160a453138	Return "[IndVars] Remove monotonic checks with unknown exit count" This reverts commit `e038b60d91`. This reverts commit `a0d84d8031`. This revert was a mistake. The reason of the failures was "Use uint64_t for branch weights instead of uint32_t" Differential Revision: https://reviews.llvm.org/D87832	2020-10-28 18:51:40 +07:00
Raphael Isemann	e038b60d91	Revert "[IndVars] Remove monotonic checks with unknown exit count" This reverts commit `c6ca26c0bf`. This breaks stage2 builds due to hitting this assert: ``` Assertion failed: (WeightSum <= UINT32_MAX && "Expected weights to scale down to 32 bits"), function calcMetadataWeights ``` when compiling AArch64RegisterBankInfo.cpp in LLVM.	2020-10-27 15:31:37 +01:00
Max Kazantsev	6335446c99	[Test] One more range check test	2020-10-27 14:51:36 +07:00
Max Kazantsev	c6ca26c0bf	[IndVars] Remove monotonic checks with unknown exit count Even if the exact exit count is unknown, we can still prove that this exit will not be taken. If we can prove that the predicate is monotonic, fulfilled on first & last iteration, and no overflow happened in between, then the check can be removed. Differential Revision: https://reviews.llvm.org/D87832 Reviewed By: apilipenko	2020-10-27 11:35:16 +07:00
Nikita Popov	ebeef022aa	[SCEV] Strenthen nowrap flags after constant folding for mul exprs Same change as `0dda633317`, but for mul expressions. We want to first fold any constant operands and then strengthen the nowrap flags, as we can compute more precise flags at that point.	2020-10-25 19:43:58 +01:00
Nikita Popov	0dda633317	[SCEV] Strength nowrap flags after constant folding We should first try to constant fold the add expression and only strengthen nowrap flags afterwards. This allows us to determine stronger flags if e.g. only two operands are left after constant folding (and thus "guaranteed no wrap region" code applies) or the resulting operands are non-negative and thus nsw->nuw strengthening applies.	2020-10-25 18:00:22 +01:00
Nikita Popov	c5718253c9	[IndVars] Regenerate test checks (NFC) Also run the test case through -instnamer.	2020-10-25 17:45:12 +01:00
Arthur Eubanks	55c4ff9860	[test] Fix tests using -analyze that fail under NPM Many of these tests don't use the output of -analyze.	2020-10-21 21:54:30 -07:00
Arthur Eubanks	5b68772ca9	[test] Fix shrunk-constant.ll under NPM	2020-10-21 21:21:24 -07:00
Fangrui Song	d9f91a3d14	Revert D89381 "[SCEV] Recommit "Use nw flag and symbolic iteration count to sharpen ranges of AddRecs", attempt 2" This reverts commit `a10a64e7e3`. It broke polly/test/ScopInfo/NonAffine/non-affine-loop-condition-dependent-access_3.ll The difference suggests that this may be a serious issue.	2020-10-20 21:03:58 -07:00
Max Kazantsev	a10a64e7e3	[SCEV] Recommit "Use nw flag and symbolic iteration count to sharpen ranges of AddRecs", attempt 2 Fixed wrapping range case & proof methods reduced to constant range checks to save compile time. Differential Revision: https://reviews.llvm.org/D89381	2020-10-20 11:32:36 +07:00
Nikita Popov	74c8c2d903	Revert "Recommit "[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs"" This reverts commit `32b72c3165`. While better than before, this change still introduces a large compile-time regression (>3% on mafft): https://llvm-compile-time-tracker.com/compare.php?from=fbd62fe60fb2281ca33da35dc25ca3c87ec0bb51&to=32b72c3165bf65cca2e8e6197b59eb4c4b60392a&stat=instructions Additionally, the logic here doesn't look quite right to me, I will comment in more detail on the differential revision.	2020-10-16 21:36:33 +02:00
Max Kazantsev	32b72c3165	Recommit "[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs" It was reverted because of negative compile time impact. In this version, less powerful proof methods are used (non-recursive reasoning only), and scope limited to constant End values to avoid explosion of complex proofs. Differential Revision: https://reviews.llvm.org/D89381	2020-10-16 17:35:13 +07:00
Nikita Popov	7d3b475810	Revert "[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs" This reverts commit `905101c360`. This causes a large compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=cc175c2cc8e638462bab74e0781e06f9b6eb5017&to=905101c36025fe1c8ecdf9a20cd59db036676073&stat=instructions	2020-10-16 09:47:38 +02:00
Max Kazantsev	905101c360	[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs We can sharpen the range of a AddRec if we know that it does not self-wrap and know the symbolic iteration count in the loop. If we can evaluate the value of AddRec on the last iteration and prove that at least one its intermediate value lies between start and end, then no-wrap flag allows us to conclude that all of them also lie between start and end. So the estimate of range can be improved to union of ranges of start and end. Differential Revision: https://reviews.llvm.org/D89381 Reviewed By: efriedma	2020-10-16 12:00:39 +07:00
Roman Lebedev	2008dacf6e	[NFC][IndVars] Autogenerate check lines in tests being affected by upcoming patch	2020-10-15 23:15:04 +03:00
Roman Lebedev	7ee6c40247	Revert "Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"" and it's follow-ups While we haven't encountered an earth-shattering problem with this yet, by now it is pretty evident that trying to model the ptr->int cast implicitly leads to having to update every single place that assumed no such cast could be needed. That is of course the wrong approach. Let's back this out, and re-attempt with another approach, possibly one originally suggested by Eli Friedman in https://bugs.llvm.org/show_bug.cgi?id=46786#c20 which should hopefully spare us this pain and more. This reverts commits `1fb6104293`, `7324616660`, `aaafe350bb`, `e92a8e0c74`. I've kept and improved the tests though.	2020-10-14 16:09:18 +03:00
Max Kazantsev	be8344f2a5	[Test] Auto-update for some tests	2020-10-14 17:03:33 +07:00
Max Kazantsev	06a5e2f307	[Test] Use generated auto-checks to make further changes more visible	2020-10-13 15:16:32 +07:00
Roman Lebedev	1fb6104293	Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown" This relands commit `1c021c64ca` which was reverted in commit `17cec6a11a` because an assertion was being triggered, since `BuildConstantFromSCEV()` wasn't updated to handle the case where the constant we want to truncate is actually a pointer. I was unsuccessful in coming up with a test case where we'd end there with constant zext/sext of a pointer, so i didn't handle those cases there until there is a test case. Original commit message: While we indeed can't treat them as no-ops, i believe we can/should do better than just modelling them as `unknown`. `inttoptr` story is complicated, but for `ptrtoint`, it seems straight-forward to model it just as a zext-or-trunc of unknown. This may be important now that we track towards making inttoptr/ptrtoint casts not no-op, and towards preventing folding them into loads/etc (see D88979/D88789/D88788) Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88806	2020-10-12 23:02:55 +03:00
Hans Wennborg	17cec6a11a	Revert `1c021c64c` "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown" > While we indeed can't treat them as no-ops, i believe we can/should > do better than just modelling them as `unknown`. `inttoptr` story > is complicated, but for `ptrtoint`, it seems straight-forward > to model it just as a zext-or-trunc of unknown. > > This may be important now that we track towards > making inttoptr/ptrtoint casts not no-op, > and towards preventing folding them into loads/etc > (see D88979/D88789/D88788) > > Reviewed By: mkazantsev > > Differential Revision: https://reviews.llvm.org/D88806 It caused the following assert during Chromium builds: llvm/lib/IR/Constants.cpp:1868: static llvm::Constant llvm::ConstantExpr::getTrunc(llvm::Constant , llvm::Type *, bool): Assertion `C->getType()->isIntOrIntVectorTy() && "Trunc operand must be integer"' failed. See code review for a link to a reproducer. This reverts commit `1c021c64ca`.	2020-10-12 18:39:35 +02:00
Roman Lebedev	1c021c64ca	[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown While we indeed can't treat them as no-ops, i believe we can/should do better than just modelling them as `unknown`. `inttoptr` story is complicated, but for `ptrtoint`, it seems straight-forward to model it just as a zext-or-trunc of unknown. This may be important now that we track towards making inttoptr/ptrtoint casts not no-op, and towards preventing folding them into loads/etc (see D88979/D88789/D88788) Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88806	2020-10-12 11:04:03 +03:00
Max Kazantsev	9824d5c838	[Test] Add test showing that we fail to eliminate implied exit conditions	2020-10-08 16:36:28 +07:00
Max Kazantsev	a5ef2e0a1e	Return "[SCEV] Prove implicaitons via AddRec start" The initial version of the patch was reverted because it missed the check that the predicate being proved is actually guarded by this check on 1st iteration. If it was not executed on 1st iteration (but possibly executes after that), then it is incorrect to use reasoning about IV start to prove it. Added the test where the miscompile was seen. Unfortunately, my attempts to reduce it with bugpoint did not succeed; it can further be reduced when we understand how to do it without losing the initial bug's notion. Returning assuming the miscompiles are now gone. Differential Revision: https://reviews.llvm.org/D88208	2020-10-08 11:15:35 +07:00
Max Kazantsev	85a6f8fc96	[Test] Add one more test where we can avoid creating trunc	2020-10-07 15:06:38 +07:00
Max Kazantsev	0c009e092e	[Test] Add test showing that we can avoid inserting trunc/zext	2020-10-07 12:19:01 +07:00
David Stenberg	e6f332ef1e	[IndVarSimplify] Fix Modified status for removal of overflow intrinsics When removing an overflow intrinsic the Changed status in SimplifyIndvar was not set, leading to the IndVarSimplify pass returning an incorrect status. This was caught using the check introduced by D80916. As pointed out in the code review, a similar bug may exist for eliminateTrunc(). Reviewed By: reames Differential Revision: https://reviews.llvm.org/D85971	2020-09-29 13:20:59 +02:00
Max Kazantsev	d266fd960e	[IndVars] Remove exiting conditions that are trivially true/false When removing exiting loop conditions, we only consider checks for which we know the exact exit count. We could also eliminate checks for which the condition is always true/false. Differential Revision: https://reviews.llvm.org/D87344 Reviewed By: lebedev.ri, reames	2020-09-29 11:35:32 +07:00
Max Kazantsev	15985952ac	[Test] Add tests where we can replace condition with invariants	2020-09-28 12:04:20 +07:00
Max Kazantsev	98aed8aa00	[Test] Test auto-update	2020-09-21 16:06:18 +07:00
Max Kazantsev	09a3737384	[Test] Missing range check removal opportunity	2020-09-18 17:55:23 +07:00
Max Kazantsev	7688027f16	[Test] Add tests showing that IndVars cannot prove (X + 1 > X)	2020-09-17 22:37:43 +07:00
Max Kazantsev	6985135a43	[Test] Add positive range checks tests in addition to negative	2020-09-16 14:24:42 +07:00
Max Kazantsev	94f7d3dba3	[Test] Some more potential range check elimination opportunities	2020-09-16 14:00:19 +07:00
Max Kazantsev	8a04cdb510	[Test] Add signed version of a test	2020-09-16 11:30:21 +07:00
Sam Parker	0bdf8c9127	[SCEV] Constant expansion cost at minsize As code size is the only thing we care about at minsize, query the cost of materialising immediates when calculating the cost of a SCEV expansion. We also modify the CostKind to TCK_CodeSize for minsize, instead of RecipThroughput. Differential Revision: https://reviews.llvm.org/D76434	2020-09-10 08:21:11 +01:00
Max Kazantsev	046f240202	[Test] More tests where IndVars fails to eliminate a range check	2020-09-08 14:43:29 +07:00
Max Kazantsev	247d023965	[Test] Auto-generated checks for some IndVarSimplify tests	2020-09-08 11:15:40 +07:00
Max Kazantsev	8784e9016d	[Test] Range fix in test test02_neg is not testing what it claims to test because its starting value -1 lies outside of specified range.	2020-09-04 19:28:58 +07:00
Max Kazantsev	159f9a69b4	[Test] Add test showing some simple cases that IndVarSimplify does not cover	2020-09-03 18:35:26 +07:00
Max Kazantsev	8a3907cd49	[Test] Simplify test by removing unneeded variable	2020-09-02 18:39:43 +07:00
Max Kazantsev	e7f53044e7	[Test] Move IndVars test to a proper place	2020-09-01 12:17:31 +07:00
Sam Parker	5eb705d5dc	[NFC] Add some more Arm tests for IndVarSimplify Copy some generic functions and apply minsize for arm.	2020-08-18 11:24:35 +01:00
Sam Parker	613d8f2953	[NFC] Run update script on test Update IndVarSimplify/no-iv-rewrite.ll	2020-08-17 12:53:14 +01:00
Max Kazantsev	9b49a4d301	[Test] Add one more test on IndVars that was failing on one of older builds	2020-08-07 14:23:55 +07:00
Arthur Eubanks	d0acd97c68	[NewPM][LoopUnswitch] Pin loop-unswitch to legacy PM or use simple-loop-unswitch As mentioned in http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html, loop-unswitch has not been ported to the NPM. Instead people are using simple-loop-unswitch. Pin all tests in Transforms/LoopUnswitch to legacy PM and replace all other uses of loop-unswitch with simple-loop-unswitch. One test that didn't fit into the above was 2014-06-21-congruent-constant.ll which seems to only pass with loop-unswitch. That is also pinned to legacy PM. Now all tests containing "-loop-unswitch" anywhere in the test succeed with NPM turned on by default. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D85360	2020-08-06 10:56:00 -07:00
Arthur Eubanks	2ca6c422d2	[FunctionAttrs] Rename functionattrs -> function-attrs To match NewPM pass name, and also for readability. Also rename rpo-functionattrs -> rpo-function-attrs while we're here. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D84694	2020-07-28 09:09:13 -07:00
Florian Hahn	be2ea29ee1	[SCEV] Add additional tests. Increase test coverage for upcoming changes to how SCEV deals with LCSSA phis.	2020-07-28 16:15:57 +01:00
Chen Zheng	6d247f980d	[SCEV][IndVarSimplify] insert point should not be block front. Recommit after removing the unused cast instructions. Differential Revision: https://reviews.llvm.org/D80975	2020-07-17 22:25:10 -04:00
serge-sans-paille	1cd1c1d62e	Revert "[SCEV][IndVarSimplify] insert point should not be block front." This reverts commit `f1efb8bb4b`. Reverted because it doesn't correctly update the pass return status, see http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/9441/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Awiden-i32-i8ptr.ll	2020-07-14 14:24:26 +02:00
Chen Zheng	f1efb8bb4b	[SCEV][IndVarSimplify] insert point should not be block front. The block front may be a PHI node, inserting a cast instructions like BitCast, PtrToInt, IntToPtr among PHIs is not right. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D80975	2020-07-09 21:56:57 -04:00
Nikita Popov	c84a952dc7	[IndVars] Regenerate test checks (NFC)	2020-06-29 20:33:50 +02:00
Roman Lebedev	d57e9aca01	[IndVarSimplify] Don't replace IV user with unsafe loop-invariant (PR45360) Summary: As [[ https://bugs.llvm.org/show_bug.cgi?id=45360 \| PR45360 ]] reports, with new cost-model we can sometimes end up being able to expand `udiv`/`urem` instructions. And that exposes at least one instance of when we do that regardless of whether or not it is safe to do. In this particular case, it's `SimplifyIndvar::replaceIVUserWithLoopInvariant()`. It seems to me, we simply need to check with `isSafeToExpandAt()` first. The test isn't great. I'm not sure how to make it only run `-indvars`. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=45360 \| PR45360 ]]. Reviewers: mkazantsev, reames, helloqirun Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82108	2020-06-23 13:53:15 +03:00
Roman Lebedev	da419320ef	[NFC][IndVarSimplify] Test: replacing IV user with unsafe loop-invariant (PR45360) https://bugs.llvm.org/show_bug.cgi?id=45360 This is reduced from the (runnable) test provided in the bug report. The remainder operation is originally guarded, it never divides by zero. Indvars should not make it execute unconditionally. This is not a great test, running whole -O2 is fragile, but i really don't understand why running -indvars on the IR before that transform happens doesn't work.	2020-06-18 19:35:35 +03:00
Roman Lebedev	b2df961231	[IndVarSimplify][LoopUtils] Avoid TOCTOU/ordering issues (PR45835) Summary: Currently, `rewriteLoopExitValues()`'s logic is roughly as following: > Loop over each incoming value in each PHI node. > Query whether the SCEV for that incoming value is high-cost. > Expand the SCEV. > Perform sanity check (`isValidRewrite()`, D51582) > Record the info > Afterwards, see if we can drop the loop given replacements. > Maybe perform replacements. The problem is that we interleave SCEV cost checking and expansion. This is A Problem, because `isHighCostExpansion()` takes special care to not bill for the expansions that were already expanded, and we can reuse. While it makes sense in general - if we know that we will expand some SCEV, all the other SCEV's costs should account for that, which might cause some of them to become non-high-cost too, and cause chain reaction. But that isn't what we are doing here. We expand all SCEV's, unconditionally. So every next SCEV's cost will be affected by the already-performed expansions for previous SCEV's. Even if we are not planning on keeping some of the expansions we performed. Worse yet, this current "bonus" depends on the exact PHI node incoming value processing order. This is completely wrong. As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have a PHI node with two(!) identical high-cost incoming values for the same basic blocks, we would decide first time around that it is high-cost, expand it, and immediately decide that it is not high-cost because we have an expansion that we could reuse (because we expanded it right before, temporarily), and replace the second incoming value but not the first one; thus resulting in a broken PHI. What we instead should do for now, is not perform any expansions until after we've queried all the costs. Later, in particular after `isValidRewrite()` is an assertion (D51582) we could improve upon that, but in a more coherent fashion. 
See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 \| PR45835 ]] Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma Reviewed By: dmajor, mkazantsev Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor Tags: #llvm Differential Revision: https://reviews.llvm.org/D79787	2020-05-21 13:05:55 +03:00
Sjoerd Meijer	b0614509a0	[HardwareLoops] llvm.loop.decrement.reg definition This is split off from D80316, slightly tightening the definition of overloaded hardwareloop intrinsic llvm.loop.decrement.reg specifying that both operands and its result have the same type.	2020-05-21 10:48:16 +01:00
Roman Lebedev	7d572ef2dd	Revert "[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)" As discussed in post-commit review in https://reviews.llvm.org/D73501 if the goal of this is to help vectorizer, then we should actually be teaching vectorizer to do this, because right now this rewrite is still budget-limited, which isn't what we'd want. Additionally, while the rest of the patch series was universally profitable, this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171) exposing cost-modeling issues on ARM. So let's just back this particular patch out. Once there's an undo transform, this could be considered for reintegration. This reverts commit `44edc6fd2c`.	2020-04-03 20:15:04 +03:00
Roman Lebedev	8e7b25bb40	[NFC] Move ARM `opt -indvars` test from Codegen into Transforms They are really not codegen tests.	2020-04-03 20:15:03 +03:00
Sam Parker	db8a3c4206	[NFC] Create X86 subdirectory for indvar tests Many IndVarSimplify tests target an x86 triple, so move them into a target specific folder.	2020-03-26 12:24:45 +00:00
Zhongduo Lin	eae228a292	[IndVarSimplify] Extend previous special case for load use instruction to any narrow type loop variant to avoid extra trunc instruction Summary: The widenIVUse avoids generating trunc by evaluating the use as AddRec, this will not work when: 1) SCEV traces back to an instruction inside the loop that SCEV can not expand, eg. add %indvar, (load %addr) 2) SCEV finds a loop variant, eg. add %indvar, %loopvariant While SCEV fails to avoid trunc, we can still try to use instruction combining approach to prove trunc is not required. This can be further extended with other instruction combining checks, but for now we handle the following case (sub can be "add" and "mul", "nsw + sext" can be "nuw + zext") ``` Src: %c = sub nsw %b, %indvar %d = sext %c to i64 Dst: %indvar.ext1 = sext %indvar to i64 %m = sext %b to i64 %d = sub nsw i64 %m, %indvar.ext1 ``` Therefore, as long as the result of add/sub/mul is extended to wide type with right extension and overflow wrap combination, no trunc is required regardless of how %b is generated. This pattern is common when calculating address in 64 bit architecture. Note that this patch reuses almost all the code from D49151 by @az: https://reviews.llvm.org/D49151 It extends it by providing proof of why trunc is unnecessary in more general case, it should also resolve some of the concerns from the following discussion with @reames. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180910/585945.html Reviewers: sanjoy, efriedma, sebpop, reames, az, javed.absar, amehsan Reviewed By: az, amehsan Subscribers: hiraditya, llvm-commits, amehsan, reames, az Tags: #llvm Differential Revision: https://reviews.llvm.org/D73059	2020-03-05 16:27:59 -05:00
Eli Friedman	b299926453	[IndVars] Fix sort comparator. std::sort will compare an element to itself in some cases. We should not crash if this happens. Differential Revision: https://reviews.llvm.org/D75000	2020-02-27 17:25:18 -08:00
Roman Lebedev	400ceda425	[SCEV][IndVars] Always provide insertion point to the SCEVExpander::isHighCostExpansion() Summary: This addresses the `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` regression from D73728 Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73777	2020-02-25 23:05:59 +03:00
Roman Lebedev	44edc6fd2c	[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668) Summary: Replacing uses of IV outside of the loop is likely generally useful, but `rewriteLoopExitValues()` is cautious, and if it isn't told to always perform the replacement, and there are hard uses of IV in loop, it doesn't replace. In [[ https://bugs.llvm.org/show_bug.cgi?id=44668 \| PR44668 ]], that prevents `-indvars` from replacing uses of induction variable after the loop, which might be one of the optimization failures preventing that code from being vectorized. Instead, now that the cost model is fixed, i believe we should be a little bit more optimistic, and also perform replacement if we believe it is within our budget. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44668 \| PR44668 ]]. Reviewers: reames, mkazantsev, asbirlea, fhahn, skatkov Reviewed By: mkazantsev Subscribers: nikic, hiraditya, zzheng, javed.absar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73501	2020-02-25 23:05:59 +03:00
Roman Lebedev	d6f47aeb51	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model min/max (PR44668) Summary: Previously we simply always said that `SCEVMinMaxExpr` is too costly to expand. But this isn't really true, it expands into just a comparison+swap pair. And again much like with add/mul, there will be one less such pair than the number of operands. And we need to count the cost of operands themselves. This does change a number of testcases, and as far as i can tell, all of these changes are improvements, in the sense that we fixed up more latches to do the [in]equality comparison. This concludes cost-modelling changes, no other SCEV expressions exist as of now. This is a part of addressing [[ https://bugs.llvm.org/show_bug.cgi?id=44668 \| PR44668 ]]. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73744	2020-02-25 23:05:59 +03:00
Roman Lebedev	756af2f88b	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model add/mul Summary: While this resolves the regression from D73722 in `llvm/test/Transforms/IndVarSimplify/exit_value_test2.ll`, this now regresses `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` test, we no longer can perform that expansion within default budget of `4`, but require budget of `6`. That regression is being addressed by D73777. The basic idea here is simple. ``` Op0, Op1, Op2 ... \| \| \| \--+--/ \| \| \| \---+---/ ``` I.e. given N operands, we will have N-1 operations, so we have to add cost of an add (mul) for every Op processed, except the first one, plus we need to recurse into every Op. I'm guessing there's already canonicalization that ensures we won't have `1` operand in `scMulExpr`, and no `0` in `scAddExpr`/`scMulExpr`. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73728	2020-02-25 23:05:58 +03:00
Roman Lebedev	cc29600b90	[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model plain UDiv Summary: If we don't believe this UDiv is actually a LShr in disguise, things are much worse. First, we try to see if this UDiv actually originates from user code, by looking for `S + 1`, and if found considering this UDiv to be free. But otherwise, we always considered this UDiv to be high-cost. However that is no longer the case with TTI-driven cost model: our default budget is 4, which matches the default cost of UDiv, so now we allow a single UDiv to not be counted as high-cost. While that is the case, it is evident this is actually a regression due to the fact that cost-modelling is incomplete - we did not account for the `add`, `mul` costs yet. That is being addressed in D73728. Cost-modelling for UDiv also seems pretty straight-forward: subtract cost of the UDiv itself, and recurse into both the LHS and RHS. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73722	2020-02-25 23:05:58 +03:00
Roman Lebedev	b8abdf9a17	[NFC][IndVarSimplify] Adjust value names in IndVarSimplify/exit_value_test2.ll %tmp prefix confuses auto-update scripts	2020-02-25 23:05:58 +03:00
Michael Kruse	e4d20ec8ad	[IndVarSimply] Fix assert/release build difference. In builds with assertions enabled (!NDEBUG), IndVarSimplify does an additional query to ScalarEvolution which may change future SCEV queries since it fills the internal cache differently. The result is actually only used with the -verify-indvars command line option. We fix the issue by only calling SE->getBackedgeTakenCount(L) if -verify-indvars is enabled such that only -verify-indvars shows the behavior, but not debug builds themselves. Also add a remark to the description of -verify-indvars about this behavior. Fixes llvm.org/PR44815 Differential Revision: https://reviews.llvm.org/D74810	2020-02-19 14:36:22 -06:00
Roman Lebedev	8d2e9bca7e	[NFC][IndVarSimplify] Autogenerate exit_value_test2.ll check lines	2020-01-30 20:11:02 +03:00
Roman Lebedev	9c801c48ee	[NFC][IndVarSimplify] Autogenerate tests affected by isHighCostExpansionHelper() cost modelling (PR44668)	2020-01-27 23:34:29 +03:00
Alina Sbirlea	a0f627d584	[IndVarSimplify] Fix for MemorySSA preserve.	2020-01-23 11:06:16 -08:00
Sjoerd Meijer	67bf9a6154	[SVEV] Recognise hardware-loop intrinsic loop.decrement.reg Teach SCEV about the @loop.decrement.reg intrinsic, which has exactly the same semantics as a sub expression. This allows us to query hardware-loops, which contain this @loop.decrement.reg intrinsic, so that we can calculate iteration counts, exit values, etc. of hardwareloops. This "int_loop_decrement_reg" intrinsic is defined as "IntrNoDuplicate". Thus, while hardware-loops and tripcounts now become analysable by SCEV, this prevents the usual loop transformations from applying transformations on hardware-loops, which is what we want at this point, for which I have added test cases for loopunrolling and IndVarSimplify and LFTR. Differential Revision: https://reviews.llvm.org/D71563	2020-01-10 09:35:00 +00:00
Fangrui Song	502a77f125	Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351	2019-12-24 15:57:33 -08:00
Philip Reames	8748be7750	[LoopPred] Enable new transformation by default The basic idea of the transform is to convert variant loop exit conditions into invariant exit conditions by changing the iteration on which the exit is taken when we know that the trip count is unobservable. See the original patch which introduced the code for a more complete explanation. The individual parts of this have been reviewed, the result has been fuzzed, and then further analyzed by hand, but despite all of that, I will not be surprised to see breakage here. If you see problems, please don't hesitate to revert - though please do provide a test case. The most likely class of issues are latent SCEV bugs and without a reduced test case, I'll be essentially stuck on reducing them. (Note: A bunch of tests were opted out of the new transform to preserve coverage. That landed in a previous commit to simplify revert cycles if they turn out to be needed.)	2019-11-06 15:41:57 -08:00
Philip Reames	20cbb6cdf8	[LoopPred] Selectively disable to preserve test cases I'm about to enable the new loop predication transform by default. It has the effect of completely destroying many read only loops - which happen to be a super common idiom in our test cases. So as to preserve test coverage of other transforms, disable the new transform where it would cause sharp test coverage regressions. (This is semantically part of the enabling commit. It's committed separate to ease revert if the actual flag flip gets reverted.)	2019-11-06 15:41:57 -08:00
Philip Reames	8cbcd2f484	[IndVars] Eliminate loop exits with equivalent exit counts We can end up with two loop exits whose exit counts are equivalent, but whose textual representation is different and non-obvious. For the sub-case where we have a series of exits which dominate one another (common), eliminate any exits which would iterate after a previous exit on the exiting iteration. As noted in the TODO being removed, I'd always thought this was a good idea, but I've now seen this in a real workload as well. Interestingly, in review, Nikita pointed out there's yet another opportunity to leverage SCEV's reasoning. If we kept track of the min of dominating exits so far, we could discharge exits with EC >= MDE. This is less powerful than the existing transform (since later exits aren't considered), but potentially more powerful for any case where SCEV can prove a >= b, but neither a == b or a > b. I don't have an example to illustrate that opportunity, but won't be surprised if we find one and return to handle that case as well. Differential Revision: https://reviews.llvm.org/D69009 llvm-svn: 375379	2019-10-20 23:38:02 +00:00
Philip Reames	ac77947315	Remove a stale comment, noted in post commit review for rL375038 llvm-svn: 375040	2019-10-16 20:27:10 +00:00
Philip Reames	d4346584fa	[IndVars] Fix a miscompile in off-by-default loop predication implementation The problem is that we can have two loop exits, 'a' and 'b', where 'a' and 'b' would exit at the same iteration, 'a' precedes 'b' along some path, and 'b' is predicated while 'a' is not. In this case (see the previously submitted test case), we were causing the loop to exit through 'b' whereas it should have exited through 'a'. This only applies to loop exits where the exit counts are not provably inequal, but that isn't as much of a restriction as it appears. If we could order the exit counts, we'd have already removed one of the two exits. In theory, we might be able to prove inequality w/o ordering, but I didn't really explore that piece. Instead, I went for the obvious restriction and ensured we didn't predicate exits following non-predicateable exits. Credit goes to Evgeny Brevnov for figuring out the problematic case. Fuzzing probably also found it (failures seen), but due to some silly infrastructure problems I hadn't gotten to the results before Evgeny hand reduced it from a benchmark (he manually enabled the transform). Once this is fixed, I'll try to filter through the fuzzer failures to see if there's anything additional lurking. Differential Revision: https://reviews.llvm.org/D68956 llvm-svn: 375038	2019-10-16 19:58:26 +00:00
Philip Reames	2b161cd0a4	[Tests] Add a test demonstrating a miscompile in the off-by-default loop-pred transform Credit goes to Evgeny Brevnov for figuring out the problematic case. Fuzzing probably also found it (lots of failures), but due to some silly infrastructure problems I hadn't gotten to the results before Evgeny hand reduced it from a benchmark. llvm-svn: 374812	2019-10-14 19:49:40 +00:00
Philip Reames	02945107f8	[Tests] Add a few more tests for idioms with FP induction variables llvm-svn: 374807	2019-10-14 19:10:39 +00:00
Philip Reames	0200626f0b	[IndVars] An implementation of loop predication without a need for speculation This patch implements a variation of a well known technique for JIT compilers - we have an implementation in tree as LoopPredication - but with an interesting twist. This version does not assume the ability to execute a path which wasn't taken in the original program (such as a guard or widenable.condition intrinsic). The benefit is that this works for arbitrary IR from any frontend (including C/C++/Fortran). The tradeoff is that it's restricted to read only loops without implicit exits. This builds on SCEV, and can thus eliminate the loop varying portion of any early exit where all exits are understandable by SCEV. A key advantage is that fixing deficiencies exposed in SCEV - already found one while writing test cases - will also benefit all of full redundancy elimination (and most other loop transforms). I haven't seen anything in the literature which quite matches this. Given that, I'm not entirely sure that keeping the name "loop predication" is helpful. Anyone have suggestions for a better name? This is analogous to partial redundancy elimination - since we remove the condition flowing around the backedge - and has some parallels to our existing transforms which try to make conditions invariant in loops. Factoring wise, I chose to put this in IndVarSimplify since it's generally applicable to all workloads. I could split this off into its own pass, but we'd then probably want to add that new pass every place we use IndVars. One solid argument for splitting it off into its own pass is that this transform is "too good". It breaks a huge number of existing IndVars test cases as they tend to be simple read only loops. At the moment, I've opted it off by default, but if we add this to IndVars and enable, we'll have to update around 20 test files to add side effects or disable this transform. 
Near term plan is to fuzz this extensively while off by default, reflect and discuss on the factoring issue mentioned just above, and then enable by default. I also need to give some thought to supporting widenable conditions in this framing. Differential Revision: https://reviews.llvm.org/D67408 llvm-svn: 373351	2019-10-01 17:03:44 +00:00
Alexey Lapshin	49f3c2b604	[Debuginfo] dbg.value points to undef value after Induction Variable Simplification. Induction Variable Simplification pass does not update dbg.value intrinsic. Before: %add = add nuw nsw i32 %ArgIndex.06, 1 call void @llvm.dbg.value(metadata i32 %add, metadata !17, metadata !DIExpression()) After: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 call void @llvm.dbg.value(metadata i64 undef, metadata !17, metadata !DIExpression()) There should be: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 call void @llvm.dbg.value(metadata i64 %indvars.iv.next, metadata !17, metadata !DIExpression()) Differential Revision: https://reviews.llvm.org/D67770 llvm-svn: 372703	2019-09-24 08:47:03 +00:00
Roman Lebedev	10151f6618	[SimplifyCFG] FoldTwoEntryPHINode(): consider total speculation cost, not per-BB cost Summary: Previously, if the threshold was 2, we were willing to speculatively execute 2 cheap instructions in both basic blocks (thus we were willing to speculatively execute cost = 4), but weren't willing to speculate when one BB had 3 instructions and other one had no instructions, even though that would have total cost of 3. This looks inconsistent to me. I don't think `cmov`-like instructions will start executing until both of its inputs are available: https://godbolt.org/z/zgHePf So I don't see why the existing behavior is the correct one. Also, let's add its own `cl::opt` for this threshold, with default=4, so it is not stricter than the previous threshold: will allow to fold when there are 2 BB's each with cost=2. And since the logic has changed, it will also allow to fold when one BB has cost=3 and other cost=1, or there is only one BB with cost=4. This is an alternative solution to D65148: This fix is mainly motivated by `signbit-like-value-extension.ll` test. That pattern comes up in JPEG decoding, see e.g. `Figure F.12 – Extending the sign bit of a decoded value in V` of `ITU T.81` (JPEG specification). That branch is not predictable, and it is within the innermost loop, so the fact that that pattern ends up being stuck with a branch instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial. 
This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240) \| metric \| old \| new \| delta \| % \| \| x86-mi-counting.NumMachineFunctions \| 37720 \| 37721 \| 1 \| 0.00% \| \| x86-mi-counting.NumMachineBasicBlocks \| 773545 \| 771181 \| -2364 \| -0.31% \| \| x86-mi-counting.NumMachineInstructions \| 7488843 \| 7486442 \| -2401 \| -0.03% \| \| x86-mi-counting.NumUncondBR \| 135770 \| 135543 \| -227 \| -0.17% \| \| x86-mi-counting.NumCondBR \| 423753 \| 422187 \| -1566 \| -0.37% \| \| x86-mi-counting.NumCMOV \| 24815 \| 25731 \| 916 \| 3.69% \| \| x86-mi-counting.NumVecBlend \| 17 \| 17 \| 0 \| 0.00% \| We significantly decrease basic block count, notably decrease instruction count, significantly decrease branch count and very significantly increase `cmov` count. Performance-wise, unsurprisingly, this has great effect on target RawSpeed benchmark. I'm seeing 5 major improvements: ``` Benchmark Time CPU Time Old Time New CPU Old CPU New ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean -0.3064 -0.3064 226.9913 157.4452 226.9800 157.4384 Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median -0.3057 -0.3057 226.8407 157.4926 226.8282 157.4828 Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev -0.4985 -0.4954 0.3051 0.1530 0.3040 0.1534 Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean -0.1747 -0.1747 80.4787 66.4227 80.4771 66.4146 Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median -0.1742 -0.1743 80.4686 66.4542 80.4690 66.4436 
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev +0.6089 +0.5797 0.0670 0.1078 0.0673 0.1062 Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean -0.1598 -0.1598 171.6996 144.2575 171.6915 144.2538 Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median -0.1598 -0.1597 171.7109 144.2755 171.7018 144.2766 Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev +0.4024 +0.3850 0.0847 0.1187 0.0848 0.1175 Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean -0.0550 -0.0551 280.3046 264.8800 280.3017 264.8559 Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median -0.0554 -0.0554 280.2628 264.7360 280.2574 264.7297 Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev +0.7005 +0.7041 0.2779 0.4725 0.2775 0.4729 Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49 Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean -0.0354 -0.0355 316.7396 305.5208 316.7342 305.4890 Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median -0.0354 -0.0356 316.6969 305.4798 316.6917 305.4324 Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev +0.0493 +0.0330 0.3562 0.3737 0.3563 0.3681 ``` That being said, it's always best-effort, so there will likely be cases where this worsens things. Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc Reviewed By: jmolloy Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67318 llvm-svn: 372009	2019-09-16 16:18:24 +00:00
Philip Reames	2a52583d67	[IndVars] Fix a bug noticed by inspection We were computing the loop exit value, but not ensuring the addrec belonged to the loop whose exit value we were computing. I couldn't actually trip this; the test case shows the basic setup which might trip this, but none of the variations I've tried actually do. llvm-svn: 369730	2019-08-23 04:03:23 +00:00
Philip Reames	6cca3ad43e	[RLEV] Rewrite loop exit values for multiple exit loops w/o overall loop exit count We already supported rewriting loop exit values for multiple exit loops, but if any of the loop exits were not computable, we gave up on all loop exit values. This patch generalizes the existing code to handle individual computable loop exits where possible. As discussed in the review, this is a starting point for figuring out a better API. The code is a bit ugly, but getting it in lets us test as we go. Differential Revision: https://reviews.llvm.org/D65544 llvm-svn: 368898	2019-08-14 18:27:57 +00:00
Philip Reames	f8e7b53657	[IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work) This is a preparatory patch for future work on supporting exit value rewriting in loops with a mixture of computable and non-computable exit counts. The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes. The test differences are caused by cases where getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop. llvm-svn: 367485	2019-07-31 21:15:21 +00:00
Philip Reames	ea5c94b497	[IndVars] Fix a subtle bug in optimizeLoopExits The original code failed to account for the fact that one exit can have a pointer exit count without all of them having pointer exit counts. This could cause two separate bugs: 1) We might exit the loop early, and leave optimizations undone. This is what triggered the assertion failure in the reported test case. 2) We might optimize one exit, then exit without indicating a change. This could result in an analysis invalidation bug if no other transform is done by the rest of indvars. Note that the pointer exit counts are a really fragile concept. They show up only when we have a pointer IV w/o a datalayout to provide their size. It's really questionable to me whether the complexity implied is worth it. llvm-svn: 366829	2019-07-23 17:45:11 +00:00
Roman Lebedev	d5a52aeab6	[IndVarSimplify][NFC] Autogenerate check lines in loop_evaluate_1.ll Being affected by upcoming patch. llvm-svn: 366746	2019-07-22 22:08:27 +00:00
Philip Reames	34495b5533	[IndVars] Use exit count reasoning to discharge obviously untaken exits Continue in the spirit of D63618, and use exit count reasoning to prove away loop exits which can not be taken since the backedge taken count of the loop as a whole is provably less than the minimal BE count required to take this particular loop exit. As demonstrated in the newly added tests, this triggers in a number of cases where IndVars was previously unable to discharge obviously redundant exit tests. And some not so obvious ones. Differential Revision: https://reviews.llvm.org/D63733 llvm-svn: 365920	2019-07-12 17:05:35 +00:00
Nikita Popov	a01502f1ba	[LFTR] Regenerate test checks; NFC llvm-svn: 365262	2019-07-06 08:54:15 +00:00
Philip Reames	ea06d63c35	[LFTR] Use SCEVExpander for the pointer limit case instead of manual IR gen As noted in the test change, this is not trivially NFC, but all of the changes in output are cases where the SCEVExpander form is more canonical/optimal than the hand generation. llvm-svn: 365075	2019-07-03 20:03:46 +00:00
Philip Reames	83cca94194	[LFTR] Hoist extend expressions outside of loops w/o waiting for LICM The motivation for this is two fold: 1) Make the output (and thus tests) a bit more readable to a human trying to understand the result of the transform 2) Reduce spurious diffs in a potential future change to restructure all of this logic to use SCEVExpander (which hoists by default) llvm-svn: 365066	2019-07-03 18:18:36 +00:00
Nikita Popov	2d756c4feb	[LFTR] Fix post-inc pointer IV with truncated exit count (PR41998) Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we have a truncated exit count we'll truncate the IV when comparing against the limit, in which case exit count overflow in post-inc form doesn't matter. However, for pointer IVs we don't do that, so we have to be careful about incrementing the IV in the wide type. I'm fixing this by removing the IVCount variable (which was ExitCount or ExitCount+1) and replacing it with a UsePostInc flag, and then moving the actual limit adjustment to the individual cases (which are: pointer IV where we add to the wide type, integer IV where we add to the narrow type, and constant integer IV where we add to the wide type). Differential Revision: https://reviews.llvm.org/D63686 llvm-svn: 364709	2019-06-29 09:24:12 +00:00
Philip Reames	b2f09391cf	[Tests] Add cases where we're failing to discharge provably loop exits (tests for D63733) llvm-svn: 364220	2019-06-24 19:26:17 +00:00
Philip Reames	3f8264b062	[Tests] Autogen and improve test readability llvm-svn: 364156	2019-06-23 17:13:53 +00:00
Philip Reames	d22a2a9a72	[IndVars] Remove dead instructions after folding trivial loop exit In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge taken count (i.e. loops that only run one iteration). This extends that to eliminate the dead comparison left around. llvm-svn: 364155	2019-06-23 17:06:57 +00:00

558 Commits