Commit Graph

5 Commits

Author SHA1 Message Date
Walter Erquinigo c4fb631cee [NFC][lldb][trace] Fix formatting of tracing files
Pavel Labath taught me that clang-format sorts headers automatically
using llvm's rules, and it's better not to have spaces between

So in this diff I'm removing those spaces and formatting them as well.

I used `clang-format -i` to format these files.
2022-08-11 11:00:26 -07:00
Walter Erquinigo b532dd545f [trace] Add an option to save a compact trace bundle
A trace bundle contains many trace files, and, in the case of intel pt, the
largest files are often the context switch traces because they are not
compressed by default. As a way to improve this, I'm adding a --compact option
to the `trace save` command that filters out unwanted processes from the
context switch traces. Eventually we can do the same for intel pt traces as
well.

Differential Revision: https://reviews.llvm.org/D129239
2022-07-13 11:43:28 -07:00
Walter Erquinigo 6a5355e8a1 [trace][intelpt] Support system-wide tracing [20] - Rename some fields in the schema
As discusses offline with @jj10305, we are updating some naming used throughout the code, specially in the json schema

- traceBuffer -> iptTrace
- core -> cpu

Differential Revision: https://reviews.llvm.org/D127817
2022-06-16 11:42:22 -07:00
Walter Erquinigo 67c2405145 [trace][intelpt] Support system-wide tracing [19] - Some other minor improvements
This addresses the issues in diffs [13], [14] and [16]

- Add better documentation
- Fix some castings by making them safer
- Simplify CorrelateContextSwitchesAndIntelPtTraces
- Rename some functions

Differential Revision: https://reviews.llvm.org/D127804
2022-06-16 11:42:21 -07:00
Walter Erquinigo a19fcc2bec [trace][intelpt] Support system-wide tracing [14] - Decode per cpu
This is the final functional patch to support intel pt decoding per cpu.
It works by doing the following:

- First, all context switches are split by tid and sorted in order. This produces a list of continuous executes per thread per core.
- Then, all intel pt subtraces are split by PSB boundaries and assigned to individual thread continuous executions on the same core by doing simple TSC-based comparisons.
- With this, we have, per thread, a sorted list of continuous executions each one with a list of intel pt subtraces. Up to this point, this is really fast because no instructions were actually decoded.
- Then, each thread can be decoded by traversing their continuous executions and intel pt subtraces. An advantage of having these continuous executions is that we can identify if a continuous exexecution doesn't have intel pt data, and thus has a gap in it. We can later to more sofisticated comparisons to identify if within a continuous execution there are gaps.

I'm adding a test as well.

Differential Revision: https://reviews.llvm.org/D126394
2022-06-16 11:23:01 -07:00