Implemented the `llvm-profdata overlap` feature for sample profiles. It reports weighted //similarity// and unweighted //overlap// metrics at program and function level for two input profiles. Similarity metrics are symmetric with regards to the order of two input profiles. By default, the tool only reports program-level summary. Users can look into function-level details via additional options `--function`, `--similarity-cutoff`, and `--value-cutoff`.
The similarity metrics are designed as follows:
* Program-level summary
* Whole program profile similarity is an aggregate over function-level similarity `FS`: `PS = sum(FS(A) * avg_weight(A))` for all function `A`.
* Whole program sample overlap: `PSO = common_samples / total_samples`.
* Function overlap: `FO = #common_function / #total_function`.
* Hot-function overlap: `HFO = #common_hot_function / #total_hot_function`.
* Hot-block overlap: `HBO = #common_hot_block / #total_hot_block`.
* Function-level details
* Function-level similarity is an aggregate over line/block-level similarities `BS` of all sample lines/blocks in the function, weighted by the closeness of the function's weights in two profiles: `FS = sum(BS(i)) * (1 - weight_distance(A))`.
* Function-level sample overlap: `FSO = common_samples / total_samples` for samples in the function.
Reviewed By: wenlei, hoyFB, wmi
Differential Revision: https://reviews.llvm.org/D83852
PGO profile is usually more precise than sample profile. However, PGO profile
needs to be collected from loadtest and loadtest may not be representative
enough to the production workload. Sample profile collected from production
can be used as a supplement -- for functions cold in loadtest but warm/hot
in production, we can scale up the related function in PGO profile if the
function is warm or hot in sample profile.
The implementation contains changes in compiler side and llvm-profdata side.
Given an instr profile and a sample profile, for a function cold in PGO
profile but warm/hot in sample profile, llvm-profdata will either mark
all the counters in the profile to be -1 or scale up the max count in the
function to be above hot threshold, depending on the zero counter ratio in
the profile. The assumption is if there are too many counters being zero
in the function profile, the profile is more likely to cause harm than good,
then llvm-profdata will mark all the counters to be -1 indicating the
function is hot but the profile is unaccountable. In compiler side, if a
function profile with all -1 counters is seen, the function entry count will
be set to be above hot threshold but its internal profile will be dropped.
In the long run, it may be useful to let compiler support using PGO profile
and sample profile at the same time, but that requires more careful design
and more substantial changes to make two profiles work seamlessly. The patch
here serves as a simple intermediate solution.
Differential Revision: https://reviews.llvm.org/D81981
This patch includes the supporting code that enables always
instrumenting the function entry block by default.
This patch will NOT the default behavior.
It adds a variant bit in the profile version, adds new directives in
text profile format, and changes llvm-profdata tool accordingly.
This patch is a split of D83024 (https://reviews.llvm.org/D83024)
Many test changes from D83024 are also included.
Differential Revision: https://reviews.llvm.org/D84261
Summary:
Add the --hot-func-list feature to llvm-profdata show for sample profiles. This feature prints a list of hot functions whose max sample count are above the 99% threshold, with their numbers of total samples, total samples percentage, max samples, entry samples, and their function names.
Test Plan:
Reviewers: wenlei, hoyFB
Reviewed By: wenlei, hoyFB
Subscribers: hoyFB, wenlei, weihe, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82355
Summary: Add the --hot-func-list feature to llvm-profdata show for sample profiles. This feature prints a list of hot functions whose max sample count are above the 99% threshold, with their numbers of total samples, total samples percentage, max samples, entry samples, and their function names.
Reviewers: wmi, hoyFB, wenlei
Reviewed By: wmi
Subscribers: hoyFB, wenlei, llvm-commits, weihe
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81800
Summary:
According to the comments, we want to convert the profile into two binary formats, and then into the md5text format.
We seems to have ignored the intermediate files.
This patch uses them to complete the full roundtrips.
Reviewers: wmi, wenlei
Reviewed By: wmi
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81202
The internal flag -partial-profile in llvm conflicts with the flag with
the same name in llvm-profdata. The conflict happens in builds with
LLVM_LINK_LLVM_DYLIB enabled. In this case the tools are linked with libLLVM
and we end up with two definitions for the same cl::opt.
The patch renames llvm-profdata flag -partial-profile to -gen-partial-profile.
Summary: Add -detailed-summary support for sample profile dump to match that of instrumentation profile.
Reviewers: wmi, davidxl, hoyFB
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79291
There is no need to use `--check-prefix` multiple times.
It helps to improve readability/test maintainability.
This patch does it for all tools at once.
Differential revision: https://reviews.llvm.org/D78217
Fix the error of show-prof-info.test on some platforms without zlib.
The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases that no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile from the full profile apart, so compiler can use different strategy for them.
Differential Revision: https://reviews.llvm.org/D77426
The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases that no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile from the full profile apart, so compiler can use different strategy for them.
Differential Revision: https://reviews.llvm.org/D77426
Compbinary format uses MD5 to represent strings in name table. That gives smaller profile without the need of compression/decompression when writing/reading the profile. The patch adds the support in extbinary format. It is off by default but user can choose to enable it.
Note the feature of using MD5 in name table can bring very small chance of name conflict leading to profile mismatch. Besides, profile using the feature won't have the profile remapping support.
Differential Revision: https://reviews.llvm.org/D76255
Parsing `ls -l` output to obtain the size of a file is unreliable; the
exact output format is not specified, and some user or group names may
contain multiple words, causing `cut -f5 -d' '` to extract an incorrect
value. `wc -c`, on the other hand, is portable, and there are precendents
of its use in test cases.
This reverts commit bcbb121ff6.
Using 'ls -o' is not compatible way to fix the problem. FreeBSD and OSX
version of 'ls' do not support -o flag and test gets failed on these
platforms.
Differential Revision: https://reviews.llvm.org/D69317
The space symbols are allowed in the group names on Windows system (as
example: Domain Users). In that case the test extracts a wrong field
from the output to get a size of the profdata file.
This patch avoids a printing of the group names in the test output and
extracts a proper field as a file size.
Differential Revision: https://reviews.llvm.org/D69317
Add support for continuously syncing profile counter updates to a file.
The motivation for this is that programs do not always exit cleanly. On
iOS, for example, programs are usually killed via a signal from the OS.
Running atexit() handlers after catching a signal is unreliable, so some
method for progressively writing out profile data is necessary.
The approach taken here is to mmap() the `__llvm_prf_cnts` section onto
a raw profile. To do this, the linker must page-align the counter and
data sections, and the runtime must ensure that counters are mapped to a
page-aligned offset within a raw profile.
Continuous mode is (for the moment) incompatible with the online merging
mode. This limitation is lifted in https://reviews.llvm.org/D69586.
Continuous mode is also (for the moment) incompatible with value
profiling, as I'm not sure whether there is interest in this and the
implementation may be tricky.
As I have not been able to test extensively on non-Darwin platforms,
only Darwin support is included for the moment. However, continuous mode
may "just work" without modification on Linux and some UNIX-likes. AIUI
the default value for the GNU linker's `--section-alignment` flag is set
to the page size on many systems. This appears to be true for LLD as
well, as its `no_nmagic` option is on by default. Continuous mode will
not "just work" on Fuchsia or Windows, as it's not possible to mmap() a
section on these platforms. There is a proposal to add a layer of
indirection to the profile instrumentation to support these platforms.
rdar://54210980
Differential Revision: https://reviews.llvm.org/D68351
Summary: This is a fix to revision D68839 and rL375023. This patch substitutes POSIX option "-b" for the non-portable GNU option "--strip-trailing-cr".
Reviewers: daltenty, hubert.reinterpretcast
Reviewed By: daltenty
Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69342
Using GNU diff, `--strip-trailing-cr` removes a `\r` appearing before
a `\n` at the end of a line. Without this patch, lit's internal diff
only removes `\r` if it appears as the last character. That seems
useless. This patch fixes that.
This patch also adds `--strip-trailing-cr` to some tests that fail on
Windows bots when D68664 is applied. Based on what I see in the bot
logs, I think the following is happening. In each test there, lit
diff is comparing a file with `\r\n` line endings to a file with `\n`
line endings. Without D68664, lit diff reads those files in text
mode, which in Windows causes `\r\n` to be replaced with `\n`.
However, with D68664, lit diff reads the files in binary mode instead
and thus reports that every line is different, just as GNU diff does
(at least under Ubuntu). Adding `--strip-trailing-cr` to those tests
restores the previous behavior while permitting the behavior of lit
diff to be more like GNU diff.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D68839
llvm-svn: 375020
I removed this test to unblock the ARM bots while looking into failures
(r374915), and am reinstating it now with a fix.
I believe the problem was that counter ptr address I used,
'\0\0\6\0\1\0\0\1', set the high bits of the pointer, not the low bits
like I wanted. On x86_64 this superficially looks like it tests r370826,
but it doesn't, as it would have been caught before r370826. However, on
ARM (or, 32-bit hosts more generally), I suspect the high bits were
cleared, and you get a 'valid' profile.
I verified that setting the *low* bits of the pointer does trigger the
new condition:
-// Note: The CounterPtr here is off-by-one. This should trigger a malformed profile error.
-RUN: printf '\0\0\6\0\1\0\0\1' >> %t.profraw
+// Note: The CounterPtr here is off-by-one.
+//
+// Octal '\11' is 9 in decimal: this should push CounterOffset to 1. As there are two counters,
+// the profile reader should error out.
+RUN: printf '\11\0\6\0\1\0\0\0' >> %t.profraw
This reverts commit c7cf5b3e4b918c9769fd760f28485b8d943ed968.
llvm-svn: 374927
There are a number arm bots failing after r374617 landed, and I'm not
sure why. It looks a bit like the error message llvm-profdata is
expected to print to stderr isn't flushed.
Weaken the test in an attempt to appease the arm bots: if this doesn't
work, that means that llvm-profdata is actually *not failing*, and that
will be a clear indication that some logic error is actually happening.
http://lab.llvm.org:8011/builders/clang-cmake-armv7-global-isel/builds/5604/
llvm-svn: 374792
Using GNU diff, `--strip-trailing-cr` removes a `\r` appearing before
a `\n` at the end of a line. Without this patch, lit's internal diff
only removes `\r` if it appears as the last character. That seems
useless. This patch fixes that.
This patch also adds `--strip-trailing-cr` to some tests that fail on
Windows bots when D68664 is applied. Based on what I see in the bot
logs, I think the following is happening. In each test there, lit
diff is comparing a file with `\r\n` line endings to a file with `\n`
line endings. Without D68664, lit diff reads those files with
Python's universal newlines support activated, causing `\r` to be
dropped. However, with D68664, lit diff reads the files in binary
mode instead and thus reports that every line is different, just as
GNU diff does (at least under Ubuntu). Adding `--strip-trailing-cr`
to those tests restores the previous behavior while permitting the
behavior of lit diff to be more like GNU diff.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D68839
llvm-svn: 374652
Previously ExtBinary profile format only supports compression using zlib for
profile symbol list. In this patch, we extend the compression support to any
section. User can select some or all of the sections to compress. In an
experiment, for a 45M profile in ExtBinary format, compressing name table
reduced its size to 24M, and compressing all the sections reduced its size
to 11M.
Differential Revision: https://reviews.llvm.org/D68253
llvm-svn: 373914
or the size of the profile for profile in ExtBinary format.
Fix a test failure on Mac.
[SampleFDO] Expose an interface to return the size of a section or the
size of the profile for profile in ExtBinary format.
Sometimes we want to limit the size of the profile by stripping some functions
with low sample count or by stripping some function names with small text size
from profile symbol list. That requires the profile reader to have the
interfaces returning the size of a section or the size of total profile. The
patch add those interfaces.
At the same time, add some dump facility to show the size of each section.
Differential revision: https://reviews.llvm.org/D67726
llvm-svn: 372478
of the profile for profile in ExtBinary format.
Sometimes we want to limit the size of the profile by stripping some functions
with low sample count or by stripping some function names with small text size
from profile symbol list. That requires the profile reader to have the
interfaces returning the size of a section or the size of total profile. The
patch add those interfaces.
At the same time, add some dump facility to show the size of each section.
llvm-svn: 372439
Add a mode in which profile read errors are not immediately treated as
fatal. In this mode, merging makes forward progress and reports failure
only if no inputs can be read.
Differential Revision: https://reviews.llvm.org/D66985
llvm-svn: 370827
The check needs to validate a counter offset before performing pointer
arithmetic with the (potentially corrupt) offset.
Found by UBSan's pointer overflow check.
rdar://54843625
Differential Revision: https://reviews.llvm.org/D66979
llvm-svn: 370826
cold versus function being newly added.
This is the second half of https://reviews.llvm.org/D66374.
Profile symbol list is the collection of function symbols showing up in
the binary which generates the current profile. It is used to discriminate
function being cold versus function being newly added. Profile symbol list
is only added for profile with ExtBinary format.
During profile use compilation, when profile-sample-accurate is enabled,
a function without profile will be regarded as cold only when it is
contained in that list.
Differential Revision: https://reviews.llvm.org/D66766
llvm-svn: 370563
This is a patch split from https://reviews.llvm.org/D66374. It tries to add
a new format of profile called ExtBinary. The format adds a section header
table to the profile and organize the profile in sections, so the future
extension like adding a new section or extending an existing section will be
easier while keeping backward compatiblity feasible.
Differential Revision: https://reviews.llvm.org/D66513
llvm-svn: 369798
Summary:
StringMap is used for storing call target to frequency map for AutoFDO. However the iterating order of StringMap is non-deterministic, which leads to non-determinism in AutoFDO profile output. Now new API getSortedCallTargets and SortCallTargets are added for deterministic ordering and output.
Roundtrip test for text profile and binary profile is added.
Reviewers: wmi, davidxl, danielcdh
Subscribers: hiraditya, mgrang, llvm-commits, twoh
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66191
llvm-svn: 369440
Summary: Fix "llvm-profdata show" so it can work with compact binary format profile. The change is to mark all functions "used" so SampleProfileReaderCompactBinary::read will read in all profiles available for dumping. The function names will be MD5 hash for compact binary format.
Reviewers: wmi, davidxl, danielcdh
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65162
llvm-svn: 368731
Currently llvm-profdata does not expect the same file name for the input profile
and the output profile.
>llvm-profdata merge A.profraw B.profraw -o B.profraw
The above command runs successfully but the resulted B.profraw is not correct.
This patch fixes the issue by moving the initialization of writer after loading
the profile.
For the show command, the following will report a confusing error of
"Empty raw profile file":
>llvm-profdata show B.profraw -o B.profraw
It's harder to fix as we need to output something before loading the input profile.
I don't think that a fix for this is worth the effort. I just make the error explicit for
the show command.
Differential Revision: https://reviews.llvm.org/D64360
llvm-svn: 365386
Summary:
Triple components in `XFAIL` lines are tested against the target triple.
Various tests that are expected to fail on big-endian hosts are marked
as being `XFAIL` for big-endian targets. This patch corrects these tests
by having them test against a new `host-byteorder-big-endian` feature.
Reviewers: xingxue, sfertile, jasonliu
Reviewed By: xingxue
Subscribers: jvesely, nhaehnle, fedor.sergeev, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60551
llvm-svn: 359689