Commit Graph

15 Commits

Author SHA1 Message Date
Arthur Eubanks 5c31b8b94f Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit 10f2a0d662.

More uint64_t overflows.
2020-10-31 00:25:32 -07:00
Arthur Eubanks 10f2a0d662 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-30 10:03:46 -07:00
Nico Weber 2a4e704c92 Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c6.
Makes clang assert when building Chromium, see https://crbug.com/1142813
for a repro.
2020-10-27 09:26:21 -04:00
Arthur Eubanks e5766f25c6 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-26 20:24:04 -07:00
Arthur Eubanks a95796a380 [NewPM][LoopUnroll] Rename unroll* to loop-unroll*
The legacy pass is called "loop-unroll", but in the new PM it's called "unroll".
Also applied to unroll-and-jam and unroll-full.

Fixes various check-llvm tests when NPM is turned on.

Reviewed By: Whitney, dmgreen

Differential Revision: https://reviews.llvm.org/D82590
2020-06-26 09:28:32 -07:00
Evgeniy Brevnov 10357e1c89 [LoopUtils] Better accuracy for getLoopEstimatedTripCount.
Summary: Current implementation of getLoopEstimatedTripCount returns 1 iteration less than it should. The reason is that in bottom tested loop first iteration is executed before first back branch is taken. For example for loop with !{!"branch_weights", i32 1 // taken, i32 1 // exit} metadata getLoopEstimatedTripCount gives 1 while actual number of iterations is 2.

Reviewers: Ayal, fhahn

Reviewed By: Ayal

Subscribers: mgorny, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71990
2020-01-20 16:58:07 +07:00
Serguei Katkov c6c31da867 [Loop Peeling] Fix the handling of branch weights of peeled off branches.
Current algorithm to update branch weights of latch block and its copies is
based on the assumption that number of peeling iterations is approximately equal
to trip count.

However it is not correct. According to profitability check in one case we can decide to peel
in case it helps to reduce the number of phi nodes. In this case the number of peeled iteration
can be less then estimated trip count.

This patch introduces another way to set the branch weights to peeled of branches.
Let F is a weight of the edge from latch to header.
Let E is a weight of the edge from latch to exit.
F/(F+E) is a probability to go to loop and E/(F+E) is a probability to go to exit.
Then, Estimated TripCount = F / E.
For I-th (counting from 0) peeled off iteration we set the the weights for
the peeled latch as (TC - I, 1). It gives us reasonable distribution,
The probability to go to exit 1/(TC-I) increases. At the same time
the estimated trip count of remaining loop reduces by I.

As a result after peeling off N iteration the weights will be
(F - N * E, E) and trip count of loop becomes
F / E - N or TC - N.

The idea is taken from the review of the patch D63918 proposed by Philip.

Reviewers: reames, mkuper, iajbar, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64235

llvm-svn: 366665
2019-07-22 05:15:34 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Teresa Johnson 8482e56920 Use profile summary to disable peeling for huge working sets
Summary:
Detect when the working set size of a profiled application is huge,
by comparing the number of counts required to reach the hot percentile
in the profile summary to a large threshold*.

When the working set size is determined to be huge, disable peeling
to avoid bloating the working set further.

*Note that the selected threshold (15K) is significantly larger than the
largest working set value in SPEC cpu2006 (which is gcc at around 11K).

Reviewers: davidxl

Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36288

llvm-svn: 310005
2017-08-03 23:42:58 +00:00
Teresa Johnson 9a18a6f08b Disable loop peeling during full unrolling pass.
Summary:
Peeling should not occur during the full unrolling invocation early
in the pipeline, but rather later with partial and runtime loop
unrolling. The later loop unrolling invocation will also eventually
utilize profile summary and branch frequency information, which
we would like to use to control peeling. And for ThinLTO we want
to delay peeling until the backend (post thin link) phase, just as
we do for most types of unrolling.

Ensure peeling doesn't occur during the full unrolling invocation
by adding a parameter to the shared implementation function, similar
to the way partial and runtime loop unrolling are disabled.

Performance results for ThinLTO suggest this has a neutral to positive
effect on some internal benchmarks.

Reviewers: chandlerc, davidxl

Subscribers: mzolotukhin, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36258

llvm-svn: 309966
2017-08-03 17:52:38 +00:00
Michael Kuperstein c2af82b4b7 [LoopUnroll] Enable PGO-based loop peeling by default.
This enables peeling of loops with low dynamic iteration count by default,
when profile information is available.

Differential Revision: https://reviews.llvm.org/D27734

llvm-svn: 295796
2017-02-22 00:27:34 +00:00
Michael Kuperstein 991c2e0e57 Add test that verifies we don't peel loops in optsize functions. NFC.
llvm-svn: 291708
2017-01-11 21:42:51 +00:00
Xin Tong 2940231ff0 Make sure total loop body weight is preserved in loop peeling
Summary:
Regardless how the loop body weight is distributed, we should preserve
total loop body weight. i.e. we should have same weight reaching the body of the loop
or its duplicates in peeled and unpeeled case.

Reviewers: mkuper, davidxl, anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28179

llvm-svn: 290833
2017-01-02 20:27:23 +00:00
Michael Kuperstein b151a641aa [LoopUnroll] Implement profile-based loop peeling
This implements PGO-driven loop peeling.

The basic idea is that when the average dynamic trip-count of a loop is known,
based on PGO, to be low, we can expect a performance win by peeling off the
first several iterations of that loop.
Unlike unrolling based on a known trip count, or a trip count multiple, this
doesn't save us the conditional check and branch on each iteration. However,
it does allow us to simplify the straight-line code we get (constant-folding,
etc.). This is important given that we know that we will usually only hit this
code, and not the actual loop.

This is currently disabled by default.

Differential Revision: https://reviews.llvm.org/D25963

llvm-svn: 288274
2016-11-30 21:13:57 +00:00