Commit Graph

207 Commits

Author SHA1 Message Date
Snehasish Kumar 216575e581 Revert "Revert "[ProfileData] Read and symbolize raw memprof profiles.""
This reverts commit dbf47d227d.

Reapply https://reviews.llvm.org/D116784 now that
https://reviews.llvm.org/D118413 has landed with a couple of fixes:
* fix raw profile reader unaligned access identified by ubsan
* fix windows build by using MOCK_CONST_METHOD3 instead of MOCK_METHOD.
2022-02-08 13:37:27 -08:00
Snehasish Kumar dbf47d227d Revert "[ProfileData] Read and symbolize raw memprof profiles."
This reverts commit 26f978d4c5.

This patch added a transitive dependency on libcurl via symbolize.
See discussion
https://reviews.llvm.org/D116784#inline-1137928
https://reviews.llvm.org/D113717#3295350
2022-02-03 16:14:05 -08:00
Snehasish Kumar 26f978d4c5 [ProfileData] Read and symbolize raw memprof profiles.
This change extends the RawMemProfReader to read all the sections of the
raw profile and symbolize the virtual addresses recorded as part of the
callstack for each allocation. For now the symbolization is used to
display the contents of the profile with llvm-profdata.

Differential Revision: https://reviews.llvm.org/D116784
2022-02-03 14:33:50 -08:00
Snehasish Kumar 14f4f63af5 [memprof] Print out the summary in YAML format.
Print out the profile summary in YAML format to make it easier to for
tools and tests to read in the contents of the raw profile.

Differential Revision: https://reviews.llvm.org/D116783
2022-02-03 14:33:50 -08:00
Ellis Hoag 11d3074267 [InstrProf] Add single byte coverage mode
Use the llvm flag `-pgo-function-entry-coverage` to create single byte "counters" to track functions coverage. This mode has significantly less size overhead in both code and data because
  * We mark a function as "covered" with a store instead of an increment which generally requires fewer assembly instructions
  * We use a single byte per function rather than 8 bytes per block

The trade off of course is that this mode only tells you if a function has been covered. This is useful, for example, to detect dead code.

When combined with debug info correlation [0] we are able to create an instrumented Clang binary that is only 150M (the vanilla Clang binary is 143M). That is an overhead of 7M (4.9%) compared to the default instrumentation (without value profiling) which has an overhead of 31M (21.7%).

[0] https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D116180
2022-01-27 17:38:55 -08:00
Snehasish Kumar 13d89477be [InstrProf][NFC] Refactor Profile kind into a bitset enum.
This change refactors the ProfileKind enum into a bitset enum to
represent the different attributes a profile can have. This change
simplifies the logic in the instrprof writer when multiple profiles are
merged together. In the future we plan on introducing a new memory
profile section which will extend the enum by one additional entry.
Without this change when accounting for memory profiles will have to be
maintained separately and will make the logic more complex.

Differential Revision: https://reviews.llvm.org/D115393
2022-01-27 12:58:11 -08:00
Ellis Hoag c9baa5608b [InstrProf][Correlate] Verify debug info with llvm-profdata show
Use the `llvm-profdata show` command to verify debug info for profile correlation using the `--debug-info` option.

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D118181
2022-01-27 10:11:04 -08:00
Chris Bieneman 91337e9091 Handle whitespace in symbol list
Trimming whitespace or carriage returns from symbols allows this code
to work on Windows and makes it match other places symbol lists are
handled.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D117570
2022-01-18 14:34:40 -06:00
Kazu Hirata f44473ec4e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-08 11:56:44 -08:00
Kazu Hirata e5947760c2 Revert "[llvm] Remove redundant member initialization (NFC)"
This reverts commit fd4808887e.

This patch causes gcc to issue a lot of warnings like:

  warning: base class ‘class llvm::MCParsedAsmOperand’ should be
  explicitly initialized in the copy constructor [-Wextra]
2022-01-03 11:28:47 -08:00
Kazu Hirata fd4808887e [llvm] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-01 16:18:18 -08:00
Kyungwoo Lee 4ecf15b789 [llvm-profdata] Make -debug-info visible
Add the option comment in .rst.

Reviewed By: ellis

Differential Revision: https://reviews.llvm.org/D116348
2021-12-28 17:35:08 -08:00
Ellis Hoag 65d7fd0239 [Try2][InstrProf] Add Correlator class to read debug info
Extend `llvm-profdata` to read in a `.proflite` file and also a debug info file to generate a normal `.profdata` profile. This reduces the binary size by 8.4% when building an instrumented Clang binary without value profiling (164 MB vs 179 MB).

This work is part of the "lightweight instrumentation" RFC: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

This was first landed in https://reviews.llvm.org/D114566 but had to be reverted due to build errors.

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D115915
2021-12-17 10:45:59 -08:00
Ellis Hoag bdc68ee70f Revert "[InstrProf] Add Correlator class to read debug info"
Also reverts an attempt to fix the build errors https://reviews.llvm.org/D115911

The original diff https://reviews.llvm.org/D114566 causes some build
errors that I need to investigate.

https://lab.llvm.org/buildbot/#/builders/118/builds/7037

This reverts commit 95946d2f85.

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D115913
2021-12-16 16:28:19 -08:00
Ellis Hoag 95946d2f85 [InstrProf] Add Correlator class to read debug info
Extend `llvm-profdata` to read in a `.proflite` file and also a debug info file to generate a normal `.profdata` profile. This reduces the binary size by 8.4% when building an instrumented Clang binary without value profiling (164 MB vs 179 MB).

This work is part of the "lightweight instrumentation" RFC: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D114566
2021-12-16 15:18:12 -08:00
Hongtao Yu 5740bb801a [CSSPGO] Use nested context-sensitive profile.
CSSPGO currently employs a flat profile format for context-sensitive profiles. Such a flat profile allows for precisely manipulating contexts that is either inlined or not inlined. This is a benefit over the nested profile format used by non-CS AutoFDO. A downside of this is the longer build time due to parsing the indexing the full CS contexts.

For a CS flat profile, though only the context profiles relevant to a module are loaded when that module is compiled, the cost to figure out what profiles are relevant is noticeably high when there're many contexts,  since the sample reader will need to scan all context strings anyway. On the contrary, a nested function profile has its related inline subcontexts isolated from other unrelated contexts. Therefore when compiling a set of functions, unrelated contexts will never need to be scanned.

In this change we are exploring using nested profile format for CSSPGO. This is expected to work based on an assumption that with a preinliner-computed profile all contexts are precomputed and expected to be inlined by the compiler. Contexts not expected to be inlined will be cut off and returned to corresponding base profiles (for top-level outlined functions). This naturally forms a nested profile where all nested contexts are expected to be inlined. The compiler will less likely optimize on derived contexts that are not precomputed.

A CS-nested profile will look exactly the same with regular nested profile except that each nested profile can come with an attributes. With pseudo probes,  a nested profile shown as below can also have a CFG checksum.

```

main:1968679:12
 2: 24
 3: 28 _Z5funcAi:18
 3.1: 28 _Z5funcBi:30
 3: _Z5funcAi:1467398
  0: 10
  1: 10 _Z8funcLeafi:11
  3: 24
  1: _Z8funcLeafi:1467299
   0: 6
   1: 6
   3: 287884
   4: 287864 _Z3fibi:315608
   15: 23
   !CFGChecksum: 138828622701
   !Attributes: 2
  !CFGChecksum: 281479271677951
  !Attributes: 2
```

Specific work included in this change:
- A recursive profile converter to convert CS flat profile to nested profile.
- Extend function checksum and attribute metadata to be stored in nested way for text profile and extbinary profile.
- Unifiy sample loader inliner path for CS and preinlined nested profile.
 - Changes in the sample loader to support probe-based nested profile.

I've seen promising results regarding build time. A nested profile can result in a 20% shorter build time than a CS flat profile while keep an on-par performance. This is with -duplicate-contexts-into-base=1.

Test Plan:

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D115205
2021-12-14 14:40:25 -08:00
Snehasish Kumar 7cca33b40f [memprof] Extend llvm-profdata to display MemProf profile summaries.
This commit adds initial support to llvm-profdata to read and print
summaries of raw memprof profiles.
Summary of changes:
* Refactor shared defs to MemProfData.inc
* Extend show_main to display memprof profile summaries.
* Add a simple raw memprof profile reader.
* Add a couple of tests to tools/llvm-profdata.

Differential Revision: https://reviews.llvm.org/D114286
2021-11-30 10:45:26 -08:00
Hongtao Yu 259e4c5658 [CSSPGO] Trim cold base profiles for the CS preinliner.
Adding support to the CS preinliner to trim cold base profiles. This makes trimming consistent with the inline decision made by the preinliner. Also disable the existing profile merger when preinliner is on unless explicitly specified.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D112489
2021-10-27 22:50:27 -07:00
Kazu Hirata 4e3eebc6bd [tools, utils] Use StringRef::contains (NFC) 2021-10-22 17:22:13 -07:00
Wenlei He 9978e0e475 [llvm-profdata] Allow overlap/similarity comparison to use custom hot threshold cutoff
Allow overlap/similarity comparison to use custom hot threshold cutoff, instead of using hard coded 990000 as hot cutoff.

Differential Revision: https://reviews.llvm.org/D111385
2021-10-10 13:30:18 -07:00
modimo ce6ed64a69 [llvm-profdata] Extend support of --topn to sample profiles
Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110449
2021-09-24 16:42:46 -07:00
Wenlei He 4ef88031f5 [llvm-profdata] Fix assertion from invalid iterator
Differential Revision: https://reviews.llvm.org/D109096
2021-09-01 14:42:00 -07:00
Hongtao Yu b9db70369b [CSSPGO] Split context string to deduplicate function name used in the context.
Currently context strings contain a lot of duplicated function names and that significantly increase the profile size. This change split the context into a series of {name, offset, discriminator} tuples so function names used in the context can be replaced by the index into the name table and that significantly reduce the size consumed by context.

A follow-up improvement made in the compiler and profiling tools is to avoid reconstructing full context strings which is  time- and memory- consuming. Instead a context vector of `StringRef` is adopted to represent the full context in all scenarios. As a result, the previous prevalent profile map which was implemented as a `StringRef` is now engineered as an unordered map keyed by `SampleContext`. `SampleContext` is reshaped to using an `ArrayRef` to represent a full context for CS profile. For non-CS profile, it falls back to use `StringRef` to represent a contextless function name. Both the `ArrayRef` and `StringRef` objects are underpinned by real array and string objects that are stored in producer buffers. For compiler, they are maintained by the sample reader. For llvm-profgen, they are maintained in `ProfiledBinary` and `ProfileGenerator`. Full context strings can be generated only in those cases of debugging and printing.

When it comes to profile format, nothing has changed to the text format, though internally CS context is implemented as a vector. Extbinary format is only changed for CS profile, with an additional `SecCSNameTable` section which stores all full contexts logically in the form of `vector<int>`, which each element as an offset points to `SecNameTable`. All occurrences of contexts elsewhere are redirected to using the offset of `SecCSNameTable`.

Testing
This is no-diff change in terms of code quality and profile content (for text profile).

For our internal large service (aka ads), the profile generation is cut to half, with a 20x smaller string-based extbinary format generated.

The compile time of ads is dropped by 25%.

Differential Revision: https://reviews.llvm.org/D107299
2021-08-30 20:09:29 -07:00
Gulfem Savrun Yeniceri e50a38840d [profile] Add binary id into profiles
This patch adds binary id into profiles to easily associate binaries
with the corresponding profiles. There is an RFC that discusses
the motivation, design and implementation in more detail:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html

Differential Revision: https://reviews.llvm.org/D102039
2021-07-23 00:19:12 +00:00
Gulfem Savrun Yeniceri fd895bc81b Revert "[profile] Add binary id into profiles"
Revert "[profile] Change linkage type of a compiler-rt func"
This reverts commits f984ac2715 and
467c719124 because it broke some builds.
2021-07-21 19:15:18 +00:00
Gulfem Savrun Yeniceri f984ac2715 [profile] Add binary id into profiles
This patch adds binary id into profiles to easily associate binaries
with the corresponding profiles. There is an RFC that discusses
the motivation, design and implementation in more detail:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html

Differential Revision: https://reviews.llvm.org/D102039
2021-07-21 17:55:43 +00:00
Fangrui Song ea23c38d06 [llvm-profdata] Allow omission of -o for --text output
This makes it more convenient to get a text format profile.

Add an error for printing non-text format output to a terminal for instrumentation profile.
(It cannot be portably tested. For sample profile, raw_fd_ostream is hidden deeply so it's inconvenient to add a diagnostic.)

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D104600
2021-06-21 12:01:57 -07:00
Fangrui Song 8ea2a58a2e [llvm-profdata] Make diagnostics consistent with the (no capitalization, no period) style
The format is currently inconsistent. Use the https://llvm.org/docs/CodingStandards.html#error-and-warning-messages style.

And add `error:` or `warning:` to CHECK lines wherever appropriate.
2021-06-19 14:54:25 -07:00
Fangrui Song 0f558db742 [llvm-profdata] Delete unneeded empty output filename check 2021-06-19 12:20:45 -07:00
Fangrui Song 59d90fe817 Simplify some typedef struct 2021-06-19 11:36:44 -07:00
wlei 863184dd69 [CSSPGO] Aggregation by the last K context frames for cold profiles
This change provides the option to merge and aggregate cold context by the last k frames instead of context-less name. By default K = 1 means the context-less one.

This is for better perf tuning. The more selective merging and trimming will rely on llvm-profgen's preinliner.

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D104131
2021-06-14 10:33:43 -07:00
Rong Xu 8d581857d7 [SampleFDO] New hierarchical discriminator for FS SampleFDO (llvm-profdata part)
This patch was split from https://reviews.llvm.org/D102246
[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO
This is for llvm-profdata part of change. It sets the bit masks for the
profile reader in llvm-profdata. Also add an internal option
"-fs-discriminator-pass" for show and merge command to process the profile
offline.

This patch also moved setDiscriminatorMaskedBitFrom() to
SampleProfileReader::create() to simplify the interface.

Differential Revision: https://reviews.llvm.org/D103550
2021-06-04 11:22:06 -07:00
Wenlei He dff8315892 [CSSPGO][llvm-profdata] Support trimming cold context when merging profiles
The change adds support for triming and merging cold context when mergine CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen, however the flexibility to trim cold context after profile is generated can be useful.

Differential Revision: https://reviews.llvm.org/D100528
2021-04-22 00:42:37 -07:00
Wenlei He 00ef28ef21 [CSSPGO] Fix dangling context strings and improve profile order consistency and error handling
This patch fixed the following issues along side with some refactoring:

1. Fix bugs where StringRef for context string out live the underlying std::string. We now keep string table in profile generator to hold std::strings. We also do the same for bracketed context strings in profile writer.
2. Make sure profile output strictly follow (total sample, name) order. Previously, there's inconsistency between ProfileMap's key and FunctionSamples's name, leading to inconsistent ordering. This is now fixed by introducing context profile canonicalization. Assertions are also added to make sure ProfileMap's key and FunctionSamples's name are always consistent.
3. Enhanced error handling for profile writing to make sure we bubble up errors properly for both llvm-profgen and llvm-profdata when string table is not populated correctly for extended binary profile.
4. Keep all internal context representation bracket free. This avoids creating new strings for context trimming, merging and preinline. getNameWithContext API is now simplied accordingly.
5. Factor out the code for context trimming and merging into SampleContextTrimmer in SampleProf.cpp. This enables llvm-profdata to use the trimmer when merging profiles. Changes in llvm-profgen will be in separate patch.

Differential Revision: https://reviews.llvm.org/D100090
2021-04-10 12:39:10 -07:00
Abhina Sreeskantharajan 82b3e28e83 [SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text
Problem:
On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable.

Solution:
This patch adds two new flags

  - OF_CRLF which indicates that CRLF translation is used.
  - OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation.

Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF.

So this is the behaviour per platform with my patch:

z/OS:
OF_None: open in binary mode
OF_Text : open in text mode
OF_TextWithCRLF: open in text mode

Windows:
OF_None: open file with no carriage return
OF_Text: open file with no carriage return
OF_TextWithCRLF: open file with carriage return

The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set.
```
  if (Flags & OF_CRLF)
    CrtOpenFlags |= _O_TEXT;
```

These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows.
./llvm/lib/Support/raw_ostream.cpp
./llvm/lib/TableGen/Main.cpp
./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
./llvm/unittests/Support/Path.cpp
./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
./clang/lib/Frontend/CompilerInstance.cpp
./clang/lib/Driver/Driver.cpp
./clang/lib/Driver/ToolChains/Clang.cpp

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D99426
2021-04-06 07:23:31 -04:00
Markus Böck 142d522ded [llvm-profdata] Make sure to consume Error on the error path of setIsIRLevelProfile
Encountered a crash while running a debug build, where this code path would be taken due to a mismatch in profile coverage data versions. Without consuming the error, an assert would be triggered inside the destructor of Error.

Differential Revision: https://reviews.llvm.org/D99457
2021-03-30 08:52:58 +02:00
Hongtao Yu e68fafa49f [CSSPGO] llvm-profdata support for CS profile.
Context-sensitive AutoFDO profile has a different name scheme where full calling contexts are encoded as function names. When processing CS proifle, llvm-profdata should use full contexts instead of leaf function names.

Reviewed By: wmi, wenlei, wlei

Differential Revision: https://reviews.llvm.org/D97998
2021-03-08 09:04:40 -08:00
Matthew Voss 6da7d31416 [llvm-profdata] Emit Error when Invalid MemOpSize Section is Created by llvm-profdata
Under certain (currently unknown) conditions, llvm-profdata is outputting
profiles that have two consecutive entries in the MemOPSize section for the
value 0. This causes the PGOMemOPSizeOpt pass to output an invalid switch
instruction with two cases for 0. As mentioned, we’re not quite sure what’s
causing this to happen, but this patch prevents llvm-profdata from outputting a
profile that has this problem and gives an error with a request for a
reproducible.

Differential Revision: https://reviews.llvm.org/D92074
2021-02-23 12:51:54 -08:00
Hongtao Yu 7e99bddfea [CSSPGO] Support of CS profiles in extended binary format.
This change brings up support of context-sensitive profiles in the format of extended binary. Existing sample profile reader/writer/merger code is being tweaked to reflect the fact of bracketed input contexts, like (`[...]`). The paired brackets are also needed in extbinary profiles because we don't yet have an otherwise good way to tell calling contexts apart from regular function names since the context delimiter `@` can somehow serve as a part of the C++ mangled names.

Reviewed By: wmi, wenlei

Differential Revision: https://reviews.llvm.org/D95547
2021-01-27 21:29:46 -08:00
Kazu Hirata 7dc3575ef2 [llvm] Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-01-14 20:30:34 -08:00
Kazu Hirata e5b4dbab04 [llvm] Simplify string comparisons (NFC)
Identified with readability-string-compare.
2021-01-11 18:48:09 -08:00
Abhina Sreeskantharajan 8ad998a611 [tools] Mark output of tools as text if it is really text
This is a continuation of https://reviews.llvm.org/D67696. The following tools also need to set the OF_Text flag correctly.

  -   llvm-profdata
  -   llvm-link

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D94313
2021-01-11 15:14:03 -05:00
Hongtao Yu ac068e014b [CSSPGO] Consume pseudo-probe-based AutoFDO profile
This change enables pseudo-probe-based sample counts to be consumed by the sample profile loader under the regular `-fprofile-sample-use` switch with minimal adjustments to the existing sample file formats. After the counts are imported, a probe helper, aka, a `PseudoProbeManager` object, is automatically launched to verify the CFG checksum of every function in the current compilation against the corresponding checksum from the profile. Mismatched checksums will cause a function profile to be slipped. A `SampleProfileProber` pass is scheduled before any of the `SampleProfileLoader` instances so that the CFG checksums as well as probe mappings are available during the profile loading time. The `PseudoProbeManager` object is set up right after the profile reading is done. In the future a CFG-based fuzzy matching could be done in `PseudoProbeManager`.

Samples will be applied only to pseudo probe instructions as well as probed callsites once the checksum verification goes through. Those instructions are processed in the same way that regular instructions would be processed in the line-number-based scenario. In other words, a function is processed in a regular way as if it was reduced to just containing pseudo probes (block probes and callsites).

**Adjustment to profile format **

A CFG checksum field is being added to the existing AutoFDO profile formats. So far only the text format and the extended binary format are supported. For the text format, a new line like
```
!CFGChecksum: 12345
```
is added to the end of the body sample lines. For the extended binary profile format, we introduce a metadata section to store the checksum map from function names to their CFG checksums.

Differential Revision: https://reviews.llvm.org/D92347
2020-12-16 15:57:18 -08:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
wlei a8b8a9374a [llvm-profdata]Fix llvm-profdata crash on compact binary profile
llvm-profdata `show` and `overlap` will crash in `getFuncName` on compact binary profile. This change fixed this by switching to use `getName`.

 `getFuncName` is misused in llvm-profdata. As showed below, `GUIDToFuncNameMap` is only supported in compilation mode, there is no initialization in llvm-profdata. Compact profile whose MD5 is true would try to query `GUIDToFuncNameMap` then caused the crash. So fix this by switching to `getName`

Reviewed By: MaskRay, wmi, wenlei, weihe, hoy

Differential Revision: https://reviews.llvm.org/D87740
2020-09-20 16:58:34 -07:00
weihe 540489de68 [llvm-profdata] Implement llvm-profdata overlap for sample profiles
Implemented the `llvm-profdata overlap` feature for sample profiles. It reports weighted //similarity// and unweighted //overlap// metrics at program and function level for two input profiles. Similarity metrics are symmetric with regards to the order of two input profiles. By default, the tool only reports program-level summary. Users can look into function-level details via additional options `--function`, `--similarity-cutoff`, and `--value-cutoff`.

The similarity metrics are designed as follows:
* Program-level summary
    * Whole program profile similarity is an aggregate over function-level similarity `FS`: `PS = sum(FS(A) * avg_weight(A))` for all function `A`.
    * Whole program sample overlap: `PSO = common_samples / total_samples`.
    * Function overlap: `FO = #common_function / #total_function`.
    * Hot-function overlap: `HFO = #common_hot_function / #total_hot_function`.
    * Hot-block overlap: `HBO = #common_hot_block / #total_hot_block`.
* Function-level details
    * Function-level similarity is an aggregate over line/block-level similarities `BS` of all sample lines/blocks in the function, weighted by the closeness of the function's weights in two profiles: `FS = sum(BS(i)) * (1 - weight_distance(A))`.
    * Function-level sample overlap: `FSO = common_samples / total_samples` for samples in the function.

Reviewed By: wenlei, hoyFB, wmi

Differential Revision: https://reviews.llvm.org/D83852
2020-08-08 17:49:48 -07:00
Wei Mi a23f62343c Supplement instr profile with sample profile.
PGO profile is usually more precise than sample profile. However, PGO profile
needs to be collected from loadtest and loadtest may not be representative
enough to the production workload. Sample profile collected from production
can be used as a supplement -- for functions cold in loadtest but warm/hot
in production, we can scale up the related function in PGO profile if the
function is warm or hot in sample profile.

The implementation contains changes in compiler side and llvm-profdata side.
Given an instr profile and a sample profile, for a function cold in PGO
profile but warm/hot in sample profile, llvm-profdata will either mark
all the counters in the profile to be -1 or scale up the max count in the
function to be above hot threshold, depending on the zero counter ratio in
the profile. The assumption is if there are too many counters being zero
in the function profile, the profile is more likely to cause harm than good,
then llvm-profdata will mark all the counters to be -1 indicating the
function is hot but the profile is unaccountable. In compiler side, if a
function profile with all -1 counters is seen, the function entry count will
be set to be above hot threshold but its internal profile will be dropped.

In the long run, it may be useful to let compiler support using PGO profile
and sample profile at the same time, but that requires more careful design
and more substantial changes to make two profiles work seamlessly. The patch
here serves as a simple intermediate solution.

Differential Revision: https://reviews.llvm.org/D81981
2020-07-27 20:17:40 -07:00
Rong Xu 50da55a585 [PGO] Supporting code for always instrumenting entry block
This patch includes the supporting code that enables always
instrumenting the function entry block by default.

This patch will NOT the default behavior.

It adds a variant bit in the profile version, adds new directives in
text profile format, and changes llvm-profdata tool accordingly.

This patch is a split of D83024 (https://reviews.llvm.org/D83024)
Many test changes from D83024 are also included.

Differential Revision: https://reviews.llvm.org/D84261
2020-07-22 15:01:53 -07:00
Wei Mi 78fe6a3ee2 [NFC] Extract the code to write instr profile into function writeInstrProfile
So that the function writeInstrProfile can be used in other places.

Differential Revision: https://reviews.llvm.org/D83521
2020-07-09 16:30:28 -07:00
Fangrui Song 546be08837 [llvm-profdata] --hot-func-list: fix some style issues in D81800
Reviewed By: wenlei, hoyFB

Differential Revision: https://reviews.llvm.org/D82500
2020-06-24 15:17:03 -07:00