This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Pavel Labath taught me that clang-format sorts headers automatically
using llvm's rules, and it's better not to have spaces between
So in this diff I'm removing those spaces and formatting them as well.
I used `clang-format -i` to format these files.
It turns out that cgroup filtering is relatively trivial and works
really nicely. Thid diffs adds automatic cgroup filtering when in
per-cpu mode, unless a new --disable-cgroup-filtering flag is passed in
the start command. At least on Meta machines, all processes are spawned
inside a cgroup by default, which comes super handy, because per cpu
tracing is now much more precise.
A manual test gave me this result
- Without filtering:
Total number of trace items: 36083
Total number of continuous executions found: 229
Number of continuous executions for this thread: 2
Total number of PSB blocks found: 98
Number of PSB blocks for this thread 2
Total number of unattributed PSB blocks found: 38
- With filtering:
Total number of trace items: 87756
Total number of continuous executions found: 123
Number of continuous executions for this thread: 2
Total number of PSB blocks found: 10
Number of PSB blocks for this thread 3
Total number of unattributed PSB blocks found: 2
Filtering gives us great results. The number of instructions collected
more than double (probalby because we have less noise in the trace), and
we have much less unattributed PSBs blocks and unrelated PSBs in
general. The ones that are unrelated probably belong to other processes
in the same cgroup.
Differential Revision: https://reviews.llvm.org/D129257
As discusses offline with @jj10305, we are updating some naming used throughout the code, specially in the json schema
- traceBuffer -> iptTrace
- core -> cpu
Differential Revision: https://reviews.llvm.org/D127817
This improves several things and addresses comments up to the diff [11] in this stack.
- Simplify many functions to receive less parameters that they can identify easily
- Create Storage classes for Trace and TraceIntelPT that can make it easier to reason about what can change with live process refreshes and what cannot.
- Don't cache the perf zero conversion numbers in lldb-server to make sure we get the most up-to-date numbers.
- Move the thread identifaction from context switches to the bundle parser, to leave TraceIntelPT simpler. This also makes sure that the constructor of TraceIntelPT is invoked when the entire data has been checked to be correct.
- Normalize all bundle paths before the Processes, Threads and Modules are created, so that they can assume that all paths are correct and absolute
- Fix some issues in the tests. Now they all pass.
- return the specific instance when constructing PerThread and MultiCore processor tracers.
- Properly implement IntelPTMultiCoreTrace::TraceStart.
- Improve some comments.
- Use the typedef ContextSwitchTrace more often for clarity.
- Move CreateContextSwitchTracePerfEvent to Perf.h as a utility function.
- Synchronize better the state of the context switch and the intel pt
perf events.
- Use a booblean instead of an enum for the PerfEvent state.
Differential Revision: https://reviews.llvm.org/D127456
:q!
This diff is massive, but it's because it connects the client with lldb-server
and also ensures that the postmortem case works.
- Flatten the postmortem trace schema. The reason is that the schema has become quite complex due to the new multicore case, which defeats the original purpose of having a schema that could work for every trace plug-in. At this point, it's better that each trace plug-in defines it's own full schema. This means that the only common field is "type".
-- Because of this new approach, I merged the "common" trace load and saving functionalities into the IntelPT one. This simplified the code quite a bit. If we eventually implement another trace plug-in, we can see then what we could reuse.
-- The new schema, which is flattened, has now better comments and is parsed better. A change I did was to disallow hex addresses, because they are a bit error prone. I'm asking now to print the address in decimal.
-- Renamed "intel" to "GenuineIntel" in the schema because that's what you see in /proc/cpuinfo.
- Implemented reading the context switch trace data buffer. I had to do
some refactors to do that cleanly.
-- A major change that I did here was to simplify the perf_event circular buffer reading logic. It was too complex. Maybe the original Intel author had something different in mind.
- Implemented all the necessary bits to read trace.json files with per-core data.
- Implemented all the necessary bits to save to disk per-core trace session.
- Added a test that ensures that parsing and saving to disk works.
Differential Revision: https://reviews.llvm.org/D126015
- Add a warnings field in the jLLDBGetState response, for warnings to be delivered to the client for troubleshooting. This removes the need to silently log lldb-server's llvm::Errors and not expose them easily to the user
- Simplify the tscPerfZeroConversion struct and schema. It used to extend a base abstract class, but I'm doubting that we'll ever add other conversion mechanisms because all modern kernels support perf zero. It is also the one who is supposed to work with the timestamps produced by the context switch trace, so expecting it is imperative.
- Force tsc collection for cpu tracing.
- Add a test checking that tscPerfZeroConversion is returned by the GetState request
- Add a pre-check for cpu tracing that makes sure that perf zero values are available.
Differential Revision: https://reviews.llvm.org/D125932
- Add collection of context switches per cpu grouped with the per-cpu intel pt traces.
- Move the state handling from the interl pt trace class to the PerfEvent one.
- Add support for stopping and enabling perf event groups.
- Return context switch entries as part of the jLLDBTraceGetState response.
- Move the triggers of whenever the process stopped or resumed. Now the will-resume notification is in a better location, which will ensure that we'll capture the instructions that will be executed.
- Remove IntelPTSingleBufferTraceUP. The unique pointer was useless.
- Add unit tests
Differential Revision: https://reviews.llvm.org/D125897
We have two different "process trace" implementations: per thread and per core. As a way to simplify the collector who uses both, I'm creating a base abstract class that is used by these implementations. This effectively simplify a good chunk of code.
Differential Revision: https://reviews.llvm.org/D125503
On Ubuntu 18.04 with GCC 7.5 Intel trace code fails to build due to
failure to convert from
lldb_private::process_linux::IntelPTPerThreadProcessTraceUP to
Expected<lldb_private::process_linux::IntelPTPerThreadProcessTraceUP>.
This commit explicitely marks those unique_ptr values as being moved
which fixes the conversion error.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D126402
IntelPTCollector is very big and has 3 classes in it. It's actually cleaner if each one has its own file. This also gives more visibility to the developer about the different kinds of "tracers" that we have.
Besides that, I'm now restricting the creation of the BinaryData chunks to GetState() instead of having it in different places, which is not very clean, because the gdb-remote protocol should be as restricted as possible.
Differential Revision: https://reviews.llvm.org/D125047
When tracing on per-core mode, we are tracing all processes, which means
that after hitting a breakpoint, our process will stop running (thus
producing no more tracing data) but other processes will continue
writing to our trace buffers. This causes a big data loss for our trace.
As a way to remediate this, I'm adding some logic to pause and unpause
tracing based on the target's state. The earlier we do it the better,
however, I'm not adding the trigger at the earliest possible point for
simplicity of this diff. Later we can improve that part.
Differential Revision: https://reviews.llvm.org/D124962
This diffs implements per-core tracing on lldb-server. It also includes tests that ensure that tracing can be initiated from the client and that the jLLDBGetState ppacket returns the list of trace buffers per core.
This doesn't include any decoder changes.
Finally, this makes some little changes here and there improving the existing code.
A specific piece of code that can't reliably be tested is when tracing
per core fails due to permissions. In this case we add a
troubleshooting message and this is the manual test:
```
/proc/sys/kernel/perf_event_paranoid set to 1
(lldb) process trace start --per-core-tracing error: perf event syscall failed: Permission denied
You might need that /proc/sys/kernel/perf_event_paranoid has a value of 0 or -1.
``
Differential Revision: https://reviews.llvm.org/D124858
llvm's json parser supports uint64_t, so let's better use it for the
packets being sent between lldb and lldb-server instead of using int64_t
as an intermediate type, which might be error-prone.
llvm's json parser supports uint64_t, so let's better use it for the
packets being sent between lldb and lldb-server instead of using int64_t
as an intermediate type, which might be error-prone.
I'm refactoring IntelPTThreadTrace into IntelPTSingleBufferTrace so that it can
both single threads or single cores. In this diff I'm basically renaming the
class, moving it to its own file, and removing all the pieces that are not used
along with some basic cleanup.
Differential Revision: https://reviews.llvm.org/D124648
This updates the documentation of the gdb-remote protocol, as well as the help messages, to include the new --per-core-tracing option.
Differential Revision: https://reviews.llvm.org/D124640
In order to open perf events per core, we need to first get the list of
core ids available in the system. So I'm adding a function that does
that by parsing /proc/cpuinfo. That seems to be the simplest and most
portable way to do that.
Besides that, I made a few refactors and renames to reflect better that
the cpu info that we use in lldb-server comes from procfs.
Differential Revision: https://reviews.llvm.org/D124573
This silences warnings like this:
lldb/source/Core/DebuggerEvents.cpp: In member function ‘llvm::StringRef lldb_private::DiagnosticEventData::GetPrefix() const’:
lldb/source/Core/DebuggerEvents.cpp:55:1: warning: control reaches end of non-void function [-Wreturn-type]
55 | }
Differential Revision: https://reviews.llvm.org/D123203
Update the response schema of the TraceGetState packet and add
Intel PT specific response structure that contains the TSC conversion,
if it exists. The IntelPTCollector loads the TSC conversion and caches
it to prevent unnecessary calls to perf_event_open. Move the TSC conversion
calculation from Perf.h to TraceIntelPTGDBRemotePackets.h to remove
dependency on Linux specific headers.
Differential Revision: https://reviews.llvm.org/D122246
- Add PerfEvent class to handle creating ring buffers and handle the resources associated with a perf_event
- Refactor IntelPT collection code to use this new API
- Add TSC to timestamp conversion logic with unittest
Differential Revision: https://reviews.llvm.org/D121734
- Rename IntelPTManager class and files to IntelPTCollector
- Change GetTimestampCounter API to general trace counter API,
GetCounter
Differential Revision: https://reviews.llvm.org/D121711