Commit Graph

746 Commits

Author SHA1 Message Date
Kazu Hirata fedc59734a [llvm] Use range-based for loops (NFC) 2022-09-03 11:17:40 -07:00
Gulfem Savrun Yeniceri 999886325e [profile] Create only prof header when no counters
When we use selective instrumentation and instrument a file
that is not in the selected files list provided via -fprofile-list,
we generate an empty raw profile. This leads to empty_raw_profile
error when we try to read that profile. This patch fixes the issue by
generating a raw profile that contains only a profile header when
there are no counters and profile data.

A small reproducer for the above issue:
echo "src:other.cc" > code.list
clang++ -O2 -fprofile-instr-generate -fcoverage-mapping
-fprofile-list=code.list code.cc -o code
./code
llvm-profdata show default.profraw

Differential Revision: https://reviews.llvm.org/D132094
2022-08-30 19:50:41 +00:00
Rong Xu d7ef0c3970 [llvm-profdata] Improve profile supplementation
Current implementation promotes a non-cold function in the SampleFDO profile
into a hot function in the FDO profile. This is too aggressive. This patch
promotes a hot functions in the SampleFDO profile into a hot function, and a
warm function in SampleFDO into a warm function in FDO.

Differential Revision: https://reviews.llvm.org/D132601
2022-08-29 16:50:42 -07:00
Kazu Hirata 0e9d37ff95 [llvm] Qualify auto in range-based for loops (NFC) 2022-08-28 23:29:00 -07:00
William Huang b105656207 [Sample Profile Reader] Fix potential integer overflow/infinite loop bug in sample profile reader
Change loop induction variable type to match the type of "SIZE" where it's compared against, to prevent infinite loop caused by overflow wraparound if there are more than 2^32 samples

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D132493
2022-08-23 20:35:23 +00:00
liaochunyu e36a71d87b [llvm-profdata][NFC] fix warning
warning: unused variable ‘FC’

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D132429
2022-08-23 14:09:19 +08:00
Eli Friedman 8f826fe723 Fix reverse-iteration buildbot.
A couple of instances of iterating over maps snuck in while the bot was
down; fix them to use maps with deterministic iteration.
2022-08-19 14:21:05 -07:00
Kazu Hirata 109df7f9a4 [llvm] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-08-13 12:55:42 -07:00
Kazu Hirata a044d0491e [llvm-profdata] Support JSON as as an output-only format
This patch teaches llvm-profdata to output the sample profile in the
JSON format.  The new option is intended to be used for research and
development purposes.  For example, one can write a Python script to
take a JSON file and analyze how similar different inline instances of
a given function are to each other.

I've chosen JSON because Python can parse it reasonably fast, and it
just takes a couple of lines to read the whole data:

  import json
  with open ('profile.json') as f:
    profile = json.load(f)

Differential Revision: https://reviews.llvm.org/D130944
2022-08-09 16:24:53 -07:00
Fangrui Song de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Zequan Wu 4979b16db1 [llvm-cov] Improve error message by printing the object file name that produces error
If error occurs on constructing coverage info for one of the object files, it prints the name of the object file, so that users know which one is the cause of error.

Differential Revision: https://reviews.llvm.org/D130196
2022-07-21 11:26:51 -07:00
Rong Xu 5e0443292b [PGO] Report number of counts being dropped when a hash-mismatch happens
This patch reports number of counts being dropped when a hash-mismatch
happens. This information will be helpful to the users -- if the dropped
counts are large, the user should redo the instrumentation build and
recollect the profile.

Differential Revision: https://reviews.llvm.org/D129001
2022-07-15 14:53:59 -07:00
Rong Xu 4a40fa82c0 [PGO] Don't cross reference CSFDO profile and non-CSFDO profile
Don't cross reference CSFDO profile and non-CSFDO profile when
checking the function hash. Only return hash_mismatch when
CS bits match, and return unknown_function otherwise.

Differential Revision: https://reviews.llvm.org/D129000
2022-07-15 13:57:23 -07:00
Fangrui Song e690137dde [Support] Change compression::zlib::{compress,uncompress} to use uint8_t *
It's more natural to use uint8_t * (std::byte needs C++17 and llvm has
too much uint8_t *) and most callers use uint8_t * instead of char *.
The functions are recently moved into `llvm::compression::zlib::`, so
downstream projects need to make adaption anyway.
2022-07-13 16:26:54 -07:00
Nicolai Hähnle ede600377c ManagedStatic: remove many straightforward uses in llvm
(Reapply after revert in e9ce1a5880 due to
Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other
than error categories, to be checked in more detail and reapplied
separately.)

Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.

Differential Revision: https://reviews.llvm.org/D129120
2022-07-10 10:29:15 +02:00
Nicolai Hähnle e9ce1a5880 Revert "ManagedStatic: remove many straightforward uses in llvm"
This reverts commit e6f1f06245.

Reverting due to a failure on the fuchsia-x86_64-linux buildbot.
2022-07-10 09:54:30 +02:00
Nicolai Hähnle e6f1f06245 ManagedStatic: remove many straightforward uses in llvm
Bulk remove many of the more trivial uses of ManagedStatic in the llvm
directory, either by defining a new getter function or, in many cases,
moving the static variable directly into the only function that uses it.

Differential Revision: https://reviews.llvm.org/D129120
2022-07-10 09:15:08 +02:00
Cole Kissane ea61750c35 [NFC] Refactor llvm::zlib namespace
* Refactor compression namespaces across the project, making way for a possible
  introduction of alternatives to zlib compression.
  Changes are as follows:
  * Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`.

Reviewed By: MaskRay, leonardchan, phosek

Differential Revision: https://reviews.llvm.org/D128953
2022-07-08 11:19:07 -07:00
Petr Hosek 0204fd25b0 [CoverageMapping] Remove dots from paths inside the profile
We already remove dots from collected paths and path mappings. This
makes it difficult to match paths inside the profile which contain
dots. For example, we would never match /path/to/../file.c because
the collected path is always be normalized to /path/file.c. This
change enables dot removal for paths inside the profile to address
the issue.

Differential Revision: https://reviews.llvm.org/D123164
2022-06-28 20:53:01 -07:00
Petr Hosek 834a38bbcb Revert "[CoverageMapping] Remove dots from paths inside the profile"
This reverts commit d1b098fc82 since
it is failing on Windows builders.
2022-06-27 23:20:54 -07:00
Petr Hosek d1b098fc82 [CoverageMapping] Remove dots from paths inside the profile
We already remove dots from collected paths and path mappings. This
makes it difficult to match paths inside the profile which contain
dots. For example, we would never match /path/to/../file.c because
the collected path is always be normalized to /path/file.c. This
change enables dot removal for paths inside the profile to address
the issue.

Differential Revision: https://reviews.llvm.org/D122750
2022-06-27 23:09:37 -07:00
Snehasish Kumar 3a1a404ae2 [memprof] Return an error for unsupported symbolization.
Add a check to detect that the profiled binary was build with position
independent code. Add a test with a pie binary to which can be reused
later when support is added. Also clean up the error messages with
trailing colons.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D128564
2022-06-27 09:43:26 -07:00
Kazu Hirata 129b531c9c [llvm] Use value_or instead of getValueOr (NFC) 2022-06-18 23:07:11 -07:00
Kazu Hirata c2713df30b [ProfileData] Use llvm::erase_if (NFC) 2022-06-10 22:51:30 -07:00
Fangrui Song 36c7d79dc4 Remove unneeded cl::ZeroOrMore for cl::opt options
Similar to 557efc9a8b.
This commit handles options where cl::ZeroOrMore is more than one line below
cl::opt.
2022-06-04 00:10:42 -07:00
Fangrui Song 557efc9a8b [llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.
2022-06-03 21:59:05 -07:00
Snehasish Kumar 8a87f42fc6 [memprof] Print out the segment information in YAML format.
This change prints out the segment information in the raw profile in
YAML format for testing. Since we don't capture build ids yet, we print
out <None> for now.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D126840
2022-06-02 02:26:39 +00:00
Snehasish Kumar 962db7de84 [memprof] Update summary output.
Update the YAML format print out of the profile to include a summary
instead of displaying the headers in the raw file buffer. This allows us
to release the raw buffer early saving memory.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D126834
2022-06-02 02:15:42 +00:00
Bruno Cardoso Lopes ce54b22657 [Clang][CoverageMapping] Fix switch counter codegen compile time explosion
C++ generated code with huge amount of switch cases chokes badly while emitting
coverage mapping, in our specific testcase (~72k cases), it won't stop after hours.
After this change, the frontend job now finishes in 4.5s and shrinks down `@__covrec_`
by 288k when compared to disabling simplification altogether.

There's probably no good way to create a testcase for this, but it's easy to
reproduce, just add thousands of cases in the below switch, and build with
`-fprofile-instr-generate -fcoverage-mapping`.

```
enum type : int {
 FEATURE_INVALID = 0,
 FEATURE_A = 1,
 ...
};

const char *to_string(type e) {
  switch (e) {
  case type::FEATURE_INVALID: return "FEATURE_INVALID";
  case type::FEATURE_A: return "FEATURE_A";}
  ...
  }

```

Differential Revision: https://reviews.llvm.org/D126345
2022-05-26 11:05:15 -07:00
Snehasish Kumar ec51971eae [memprof] Keep and display symbol names in the RawMemProfReader.
Extend the Frame struct to hold the symbol name if requested
when a RawMemProfReader object is constructed. This change updates the
tests and removes the need to pass --debug to obtain the mapping from
GUID to symbol names.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D126344
2022-05-25 21:17:44 +00:00
Hongtao Yu 1662cfa4be [CSSPGO][CSProfileConverter] Remove call target samples when including callee samples into caller.
When a flat CS profile is converted to a nested profile, the call target samples for inlined callee contexts are left over in the callsite target map. This could cause indirect call promotion to function improperly. One issue is that the inlined callsites are treated with double amount of samples. The other is the inlined callsites are reconsidered for subsequent PGO ICP.

I'm fixing this by excluding call targets from the callsite for inlined targets. While fixing this I found that callsite target sum and the number of body samples for that callsite could be mismatched. {D122609} has an explanation and a fix for that on llvm-profgen side. For now I'm tolerating it in this change.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D125266
2022-05-13 09:19:32 -07:00
Hongtao Yu 23191a4ffe [CSSPGO][llvm-profgen] Do not duplicate context profiles into base profile when converting CS flat profile to nested.
Recent experiments with our two large internal services showed that duplicating context profiles into base profile caused code size inflation and didn't deliver good performance compared to no such duplication. It was a trick we made to catch up with the CS flat profile and I'm now turning it off by default.

The code size inflation mainly comes from the enriched based profiles. A base profile for a function represents the uninlined (or outlined) portion of the whole function running time. Such portion could be very small if a function is inlined into most of its hot callsites. Duplicating context profiles of the function into its base profiles could cause the outlined body to be hot enough and in turn get many of its callees inlined, thus increases the code size. The size inflation could further cause perf regression.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D124796
2022-05-12 09:29:25 -07:00
Teresa Johnson 655294866c [memprof] Use unknown_function error type for missing functions
Switch the error type when a function is not found in the memprof
profile to unknown_function. This gives compatibility with normal PGO
function matching, and also prevents issuing large numbers of additional
matching errors since pgo-warn-missing-function is off by default.

Differential Revision: https://reviews.llvm.org/D124953
2022-05-04 13:02:30 -07:00
Hongtao Yu e36786d15f [CSSPGO] Rename ProfileIsCSNested and ProfileIsCSFlat
To be more clear and definitive, I'm renaming `ProfileIsCSFlat` back to `ProfileIsCS` which stands for full context-sensitive flat profiles.  `ProfileIsCSNested` is now renamed to `ProfileIsPreInlined` and is extended to be applicable for CS flat profiles too. More specifically, `ProfileIsPreInlined` is for any kind of profiles (flat or nested) that contain 'ShouldBeInlined' contexts. The flag is encoded in the profile summary section for extbinary profiles and is computed on-the-fly for text profiles.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D122602
2022-04-29 17:03:52 -07:00
Snehasish Kumar 6dd6a6161f [memprof] Deduplicate and outline frame storage in the memprof profile.
The current implementation of memprof information in the indexed profile
format stores the representation of each calling context fram inline.
This patch uses an interned representation where the frame contents are
stored in a separate on-disk hash table. The table is indexed via a hash
of the contents of the frame. With this patch, the compressed size of a
large memprof profile reduces by ~22%.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D123094
2022-04-08 09:15:20 -07:00
serge-sans-paille 01be9be2f2 Cleanup includes: final pass
Cleanup a few extra files, this closes the work on libLLVM dependencies on my
side.

Impact on libLLVM preprocessed output: -35876 lines

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122576
2022-03-29 09:00:21 +02:00
Kazu Hirata d3e5f0ab01 Apply clang-tidy fixes for readability-redundant-smartptr-get in SampleProfReader.cpp (NFC) 2022-03-28 09:18:39 -07:00
Kazu Hirata 14415a3a5a Apply clang-tidy fixes for readability-redundant-smartptr-get in InstrProfReader.cpp (NFC) 2022-03-28 09:18:38 -07:00
Snehasish Kumar 27a4f2545f Reland "[memprof] Store callsite metadata with memprof records."
This reverts commit f4b794427e.

Reland with underlying msan issue fixed in D122260.
2022-03-22 14:40:02 -07:00
Mitch Phillips f4b794427e Revert "[memprof] Store callsite metadata with memprof records."
This reverts commit 0d362c90d3.

Reason: Causes the MSan buildbot to fail (see comments on
https://reviews.llvm.org/D121179 for more information
2022-03-21 15:59:13 -07:00
Snehasish Kumar 0d362c90d3 [memprof] Store callsite metadata with memprof records.
To ease profile annotation, each of the callsites in a function can be
annotated with profile data - "IR metadata format for MemProf" [1]. This
patch extends the on-disk serialized record format to store the debug
information for allocation callsites incl inline frames. This change is
incompatible with the existing format i.e. indexed profiles must be
regenerated, raw profiles are unaffected.

[1] https://groups.google.com/g/llvm-dev/c/aWHsdMxKAfE/m/WtEmRqyhAgAJ

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D121179
2022-03-21 13:58:29 -07:00
Snehasish Kumar c9a3d29613 [memprof] Update the frame is inline logic and unittests.
Since DI frames are enumerated with the leaf function at index 0, this
patch fixes the logic when IsInlineFrame is set. Also update the
unittests to check that only the last frame is marked as non-inline from
a set of DI Frames for a PC address.

Differential Revision: https://reviews.llvm.org/D121830
2022-03-21 10:41:05 -07:00
Snehasish Kumar 49c048add4 [memprof] Add a test to verify callstack order.
We add to ensure that we are observing the correct callstack order in memprof
during symbolization. There was some confusion whether the order of
DIFrame objects were reversed but in reality the leaf function is at
index 0 so no code changes are required.

Differential Revision: https://reviews.llvm.org/D121759
2022-03-16 10:10:57 -07:00
Igor Kudrin 1c99f650a7 [llvm-cov gcov] Fix calculating coverage of template functions
Template functions share the same lines in source files, so the common
container of lines' properties cannot be used to calculate the coverage
statistics of individual functions.

> cat tmpl.cpp
template <int N> int test() { return N; }
int main() { return test<1>() + test<2>(); }
> clang++ --coverage tmpl.cpp -o tmpl
> ./tmpl
> llvm-cov gcov tmpl.cpp -f
...
Function '_Z4testILi1EEiv'
Lines executed:100.00% of 1

Function '_Z4testILi2EEiv'
Lines executed:-nan% of 0
...
> llvm-cov-patched gcov tmpl.cpp -f
...
Function '_Z4testILi1EEiv'
Lines executed:100.00% of 1

Function '_Z4testILi2EEiv'
Lines executed:100.00% of 1
...

Differential Revision: https://reviews.llvm.org/D121390
2022-03-15 20:46:22 +04:00
Fangrui Song 407c721ceb [Support] Change zlib::compress to return void
With a sufficiently large output buffer, the only failure is Z_MEM_ERROR.
Check it and call the noreturn report_bad_alloc_error if applicable.
resize_for_overwrite may call report_bad_alloc_error as well.

Now that there is no other error type, we can replace the return type with void
and simplify call sites.

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D121512
2022-03-14 11:38:04 -07:00
Snehasish Kumar 11314f4059 [memprof] Filter out callstack frames which cannot be symbolized.
This patch filters out callstack frames which can't be symbolized or if
the frames belong to the runtime. Symbolization may not be possible if
debug information is unavailable or if the addresses are from a shared
library. For now we only support optimization of the main binary which
is statically linked to the compiler runtime.

Differential Revision: https://reviews.llvm.org/D120860
2022-03-04 11:10:08 -08:00
Snehasish Kumar dda7b74967 [memprof] Symbolize and cache stack frames.
Currently, symbolization of stack frames occurs on demand when the instrprof writer
iterates over all the records in the raw memprof reader. With this
change we symbolize and cache the frames immediately after reading the
raw profiles. For a large internal binary this results in a runtime
reduction of ~50% (2m -> 48s) when merging a memprof raw profile with a
raw instr profile to generate an indexed profile. This change also makes
it simpler in the future to generate additional calling context
metadata to attach to each memprof record.

Differential Revision: https://reviews.llvm.org/D120430
2022-03-03 11:00:37 -08:00
serge-sans-paille fc97efa409 Cleanup includes: ProfileData
Estimation of the impact on preprocessor output:

before: 1067349756
after: 1065940348

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120434
2022-02-24 13:25:11 +01:00
Snehasish Kumar b681799938 [instrprof] Rename the profile kind types to be more descriptive.
Based on the discussion in D115393, I've updated the names to be more
descriptive.

Reviewed By: ellis, MaskRay

Differential Revision: https://reviews.llvm.org/D120092
2022-02-23 13:15:56 -08:00
Fangrui Song 045f07b7dc [ProfileData] Remove unused and racy FunctionSamples::Format after D51643
The write may be racy if ThinLTO creates multiple `InProcessThinBackend` instances.
2022-02-22 20:29:08 -08:00