llvm-project

Commit Graph

Author	SHA1	Message	Date
David Green	b55d4c2ad8	Revert "[LV] Remove `LoopVectorizationCostModel::useEmulatedMaskMemRefHack()`" This reverts commit `77a0da926c` as we've received multiple reports of this significantly impacting performance, in ways that don't seem to just be target specific cost models going wrong. I would offer some reproducers, but the test changes here seem to be full of them! Reverting for now and hopefully we can remove the "hack" more carefully as we go.	2022-02-09 20:02:54 +00:00
Roman Lebedev	77a0da926c	[LV] Remove `LoopVectorizationCostModel::useEmulatedMaskMemRefHack()` D43208 extracted `useEmulatedMaskMemRefHack()` from legality into cost model. What it essentially does is prevents scalarized vectorization of masked memory operations: ``` // TODO: Cost model for emulated masked load/store is completely // broken. This hack guides the cost model to use an artificially // high enough value to practically disable vectorization with such // operations, except where previously deployed legality hack allowed // using very low cost values. This is to avoid regressions coming simply // from moving "masked load/store" check from legality to cost model. // Masked Load/Gather emulation was previously never allowed. // Limited number of Masked Store/Scatter emulation was allowed. ``` While i don't really understand about what specifically `is completely broken` was talking about, i believe that at least on X86 with AVX2-or-later, this is no longer true. (or at least, i would like to know what is still broken). So i would like to follow suit after D111460, and like wise disable that hack for AVX2+. But since this was added for X86 specifically, let's just instead completely remove this hack. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D114779	2022-02-07 16:08:31 +03:00
Arthur Eubanks	5c31b8b94f	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `10f2a0d662`. More uint64_t overflows.	2020-10-31 00:25:32 -07:00
Arthur Eubanks	10f2a0d662	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-30 10:03:46 -07:00
Nico Weber	2a4e704c92	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `e5766f25c6`. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.	2020-10-27 09:26:21 -04:00
Arthur Eubanks	e5766f25c6	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-26 20:24:04 -07:00
Evgeniy Brevnov	af7e158872	[LV] Vectorizer should adjust trip count in profile information Summary: Vectorized loop processes VFxUF number of elements in one iteration thus total number of iterations decreases proportionally. In addition epilog loop may not have more than VFxUF - 1 iterations. This patch updates profile information accordingly. Reviewers: hsaito, Ayal, fhahn, reames, silvas, dcaballe, SjoerdMeijer, mkuper, DaniilSuchkov Reviewed By: Ayal, DaniilSuchkov Subscribers: fedor.sergeev, hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67905	2020-01-20 18:36:28 +07:00
Eric Christopher	cee313d288	Revert "Temporarily Revert "Add basic loop fusion pass."" The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552	2019-04-17 04:52:47 +00:00
Eric Christopher	a863435128	Temporarily Revert "Add basic loop fusion pass." As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546	2019-04-17 02:12:23 +00:00
Mircea Trofin	06ac8cfbd1	Verify profile data confirms large loop trip counts. Summary: Loops with inequality comparers, such as: // unsigned bound for (unsigned i = 1; i < bound; ++i) {...} have getSmallConstantMaxTripCount report a large maximum static trip count - in this case, 0xffff fffe. However, profiling info may show that the trip count is much smaller, and thus counter-recommend vectorization. This change: - flips loop-vectorize-with-block-frequency on by default. - validates profiled loop frequency data supports vectorization, when static info appears to not counter-recommend it. Absence of profile data means we rely on static data, just as we've done so far. Reviewers: twoh, mkuper, davidxl, tejohnson, Ayal Reviewed By: davidxl Subscribers: bkramer, llvm-commits Differential Revision: https://reviews.llvm.org/D42946 llvm-svn: 324543	2018-02-07 23:29:52 +00:00
Taewook Oh	9083547ae3	Improve profile-guided heuristics to use estimated trip count. Summary: Existing heuristic uses the ratio between the function entry frequency and the loop invocation frequency to find cold loops. However, even if the loop executes frequently, if it has a small trip count per each invocation, vectorization is not beneficial. On the other hand, even if the loop invocation frequency is much smaller than the function invocation frequency, if the trip count is high it is still beneficial to vectorize the loop. This patch uses estimated trip count computed from the profile metadata as a primary metric to determine coldness of the loop. If the estimated trip count cannot be computed, it falls back to the original heuristics. Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32451 llvm-svn: 305729	2017-06-19 18:48:58 +00:00

11 Commits