Remove m_inferior_prev_state that's not suitable for multiprocess
debugging and that does not seem to be really used at all.
The only use of the variable right now is to "prevent" sending the stop
reason after attach/launch. However, this code is never actually run
since none of the process plugins actually use eStateLaunching or
eStateAttaching. Through adding an assert, I've confirmed that it's
never hit in any of the LLDB tests or while attaching/launching debugged
process via lldb-server and via lldb CLI.
Differential Revision: https://reviews.llvm.org/D128878
Sponsored by: The FreeBSD Foundation
Convert the m_debugged_processes map from NativeProcessProtocol pointers
to structs, and combine the additional set(s) holding the additional
process properties into a flag field inside this struct. This is
desirable since there are more properties to come and having a single
structure with all information should be cleaner and more efficient than
using multiple sets for that.
Suggested by Pavel Labath in D128893.
Differential Revision: https://reviews.llvm.org/D129652
It turns out that cgroup filtering is relatively trivial and works
really nicely. Thid diffs adds automatic cgroup filtering when in
per-cpu mode, unless a new --disable-cgroup-filtering flag is passed in
the start command. At least on Meta machines, all processes are spawned
inside a cgroup by default, which comes super handy, because per cpu
tracing is now much more precise.
A manual test gave me this result
- Without filtering:
Total number of trace items: 36083
Total number of continuous executions found: 229
Number of continuous executions for this thread: 2
Total number of PSB blocks found: 98
Number of PSB blocks for this thread 2
Total number of unattributed PSB blocks found: 38
- With filtering:
Total number of trace items: 87756
Total number of continuous executions found: 123
Number of continuous executions for this thread: 2
Total number of PSB blocks found: 10
Number of PSB blocks for this thread 3
Total number of unattributed PSB blocks found: 2
Filtering gives us great results. The number of instructions collected
more than double (probalby because we have less noise in the trace), and
we have much less unattributed PSBs blocks and unrelated PSBs in
general. The ones that are unrelated probably belong to other processes
in the same cgroup.
Differential Revision: https://reviews.llvm.org/D129257
Previously we recorded AllocationBase as the base address of the region
we get from VirtualQueryEx. However, this is the base of the allocation,
which can later be split into more regions.
So you got stuff like:
[0x00007fff377c0000-0x00007fff377c1000) r-- PECOFF header
[0x00007fff377c0000-0x00007fff37840000) r-x .text
[0x00007fff377c0000-0x00007fff37870000) r-- .rdata
Where all the base addresses were the same.
Instead, use BaseAddress as the base of the region. So we get:
[0x00007fff377c0000-0x00007fff377c1000) r-- PECOFF header
[0x00007fff377c1000-0x00007fff37840000) r-x .text
[0x00007fff37840000-0x00007fff37870000) r-- .rdata
https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-memory_basic_information
The added test checks for any overlapping regions which means
if we get the base or size wrong it'll fail. This logic
applies to any OS so the test isn't restricted to Windows.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D129272
Fix lldb-server in the non-stop + multiprocess mode to exit on vStopped
only if all processes have exited, rather than when the first one exits.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128639
This is currently being done in an ad hoc way, and so for some
commands it isn't being checked. We have the info to make this check,
since commands are supposed to add their arguments to the m_arguments
field of the CommandObject. This change uses that info to check whether
the command received arguments in error.
A handful of commands weren't defining their argument types, I also had
to fix them. And a bunch of commands were checking for arguments by
hand, so I removed those checks in favor of the CommandObject one. That
also meant I had to change some tests that were checking for the ad hoc
error outputs.
Differential Revision: https://reviews.llvm.org/D128453
Implement support for the "t" action that is used to stop a thread.
Normally this action is used only in non-stop mode. However, there's
no technical reason why it couldn't be also used in all-stop mode,
e.g. to express "resume all threads except ..." (`t:...;c`).
While at it, add a more complete test for vCont correctly resuming
a subset of program's threads.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D126983
Now preserving the non-standard behavior of returning "OK" response
when there is no debugged process.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128152
This reverts part of commit 75757c86c6.
It broke the following test:
commands/target/auto-install-main-executable/TestAutoInstallMainExecutable.py
I need more time to figure it out, so I'm reverting the code changes
and marking the tests depending on them xfail.
Introduce a helper function to append GDB Remote Serial Protocol "thread
IDs", with optional PID in multiprocess mode.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128324
Implement the 'T' packet that is used to verify whether the specified
thread belongs to the debugged processes.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128170
Update the `qfThreadInfo` handler to report threads of all debugged
processes and include PIDs when in multiprocess mode.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128152
Extend vCont function to support resuming a process with an arbitrary
PID, that could be different than the one selected via Hc (or no process
at all may be selected). Resuming more than one process simultaneously
is not supported yet.
Remove the ReadTid() method that was only used by Handle_vCont(),
and furthermore it was wrongly using m_current_process rather than
m_continue_process.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127862
Implement the support for the vKill packet. This is the modern packet
used by the GDB Remote Serial Protocol to kill one of the debugged
processes. Unlike the `k` packet, it has well-defined semantics.
The `vKill` packet takes the PID of the process to kill, and always
replies with an `OK` reply (rather than the exit status, as LLGS does
for `k` packets at the moment). Additionally, unlike the `k` packet
it does not cause the connection to be terminated once the last process
is killed — the client needs to close it explicitly.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127667
Modify the behavior of the `k` packet to kill all inferiors rather than
just the current one. The specification leaves the exact behavior
of this packet up to the implementation but since vKill is specifically
meant to be used to kill a single process, it seems logical to use `k`
to provide the alternate function of killing all of them.
Move starting stdio forwarding from the "running" response
to the packet handlers that trigger the process to start. This avoids
attempting to start it multiple times when multiple processes are killed
on Linux which implicitly causes LLGS to receive "started" events
for all of them. This is probably also more correct as the ability
to send "O" packets is implied by the continue-like command being issued
(and therefore the client waiting for responses) rather than the start
notification.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127500
LLDB tries to follow `EXCEPTION_RECORD::ExceptionRecord` to follow the
nested exception chain. In practice this code just causes Access
Violation whenever there is a nested exception. Since there does not
appear to be any code in LLDB that is actually using the nested
exceptions, this change just removes the crashing code and adds a
comment for future reference.
Fixes https://github.com/mstorsjo/llvm-mingw/issues/292
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D128201
This comment became outdated in 053eb35651
(but was moved along); that commit moved the code and the comment
to a separate function, with a separate local variable
`num_of_bytes_read`. On error, the possibly garbage value is never
copied back to the caller's reference, thus the comment is no longer
relevant (and slightly confusing as is).
Differential Revision: https://reviews.llvm.org/D128226
Add a test verifying that plain 'D' packet correctly detaches all
processes.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127291
Fix ThreadStopInfo struct to include the signal number for all events.
Since signo was not included in the details for fork, vfork
and vforkdone stops, the code incidentally referenced the wrong union
member, resulting in wrong signo being sent.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127193
Implement the support for %Stop asynchronous notification packet format
in LLGS. This does not implement full support for non-stop mode for
threaded programs -- process plugins continue stopping all threads
on every event. However, it will be used to implement asynchronous
events in multiprocess debugging.
The non-stop protocol is enabled using QNonStop packet. When it is
enabled, the server uses notification protocol instead of regular stop
replies. Since all threads are always stopped, notifications are always
generated for all active threads and copied into stop notification
queue.
If the queue was empty, the initial asynchronous %Stop notification
is sent to the client immediately. The client needs to (eventually)
acknowledge the notification by sending the vStopped packet, in which
case it is popped from the queue and the stop reason for the next thread
is reported. This continues until notification queue is empty again,
in which case an OK reply is sent.
Asychronous notifications are also used for vAttach results and program
exits. The `?` packet uses a hybrid approach -- it returns the first
stop reason synchronously, and exposes the stop reasons for remaining
threads via vStopped queue.
The change includes a test case for a program generating a segfault
on 3 threads. The server is expected to generate a stop notification
for the segfaulting thread, along with the notifications for the other
running threads (with "no stop reason"). This verifies that the stop
reasons are correctly reported for all threads, and that notification
queue works.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D125575
Refactor GDBRemoteCommunicationServerLLGS::SendStopReasonForState()
to accept process as an argument rather than hardcoding
m_current_process, in order to make it work correctly for multiprocess
scenarios.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127497
Refactor SendStopReplyPacketForThread() to accept process instance
as a parameter rather than use m_current_process. This future-proofs
it for multiprocess support.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127289
Include the process identifier in the `T` stop responses when
multiprocess extension is enabled (i.e. prepend it to the thread
identifier). Use the exposed identifier to simplify the fork-and-follow
tests.
The LLDB client accounts for the possible PID since the multiprocess
extension support was added in b601c67192.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127192
Include the process identifier in W/X stop reasons when multiprocess
extensions are enabled.
The LLDB client does not support process identifiers there at the moment
but it parses packets in such a way that their presence does not cause
any problems.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127191
Fix modernize-use-override warnings. Because this check is listed in
LLDB's top level .clang-tidy configuration, the check is enabled by
default and the resulting warnings show up in my editor.
I've audited the modified lines. This is not a blind change.
Having a member variable TraceIntelPT * makes it look as if it was
optional. I'm using instead a weak_ptr to indicate that it's not
optional and the object is under the ownership of TraceIntelPT.
Besides that, I've simplified the Perf aux and data buffers copying by
using vector.insert.
I'm also renaming Lookup2 to Lookup. The 2 in the name is confusing.
Differential Revision: https://reviews.llvm.org/D127881
llvm's JSON parser supports 64 bit integers, but other tools like the
ones written in JS don't support numbers that big, so we need to
represent these possibly big numbers as a string. This diff uses that to
represent addresses and tsc zero. The former is printed in hex for and
the latter in decimal string form. The schema was updated mentioning
that.
Besides that, I fixed some remaining issues and now all test pass. Before I wasn't running all tests because for some reason my computer reverted perf_paranoid to 1.
Differential Revision: https://reviews.llvm.org/D127819
As discusses offline with @jj10305, we are updating some naming used throughout the code, specially in the json schema
- traceBuffer -> iptTrace
- core -> cpu
Differential Revision: https://reviews.llvm.org/D127817
This applies the changes requested for diff 12.
- use DenseMap<ConstString, _> instead of std::unordered_map<ConstString, _>, which is more idiomatic and possibly performant.
- deduplicate some code in Trace.cpp by using helper functions for fetching in maps
- stop using size and offset when fetching binary data, because we in fact read the entire buffers all the time. If we ever need streaming, we can implement it then. Now, the size is used only to check that we are getting the correct amount of data. This is useful because in some cases determining the size doesn't involve fetching the actual data.
- added back the x86_64 macro to the perf tests
- added more documentation
- simplified some file handling
- fixed some comments
Differential Revision: https://reviews.llvm.org/D127752
This improves several things and addresses comments up to the diff [11] in this stack.
- Simplify many functions to receive less parameters that they can identify easily
- Create Storage classes for Trace and TraceIntelPT that can make it easier to reason about what can change with live process refreshes and what cannot.
- Don't cache the perf zero conversion numbers in lldb-server to make sure we get the most up-to-date numbers.
- Move the thread identifaction from context switches to the bundle parser, to leave TraceIntelPT simpler. This also makes sure that the constructor of TraceIntelPT is invoked when the entire data has been checked to be correct.
- Normalize all bundle paths before the Processes, Threads and Modules are created, so that they can assume that all paths are correct and absolute
- Fix some issues in the tests. Now they all pass.
- return the specific instance when constructing PerThread and MultiCore processor tracers.
- Properly implement IntelPTMultiCoreTrace::TraceStart.
- Improve some comments.
- Use the typedef ContextSwitchTrace more often for clarity.
- Move CreateContextSwitchTracePerfEvent to Perf.h as a utility function.
- Synchronize better the state of the context switch and the intel pt
perf events.
- Use a booblean instead of an enum for the PerfEvent state.
Differential Revision: https://reviews.llvm.org/D127456
- Add the logic that parses all cpu context switch traces and produces blocks of continuous executions, which will be later used to assign intel pt subtraces to threads and to identify gaps. This logic can also identify if the context switch trace is malformed.
- The continuous executions blocks are able to indicate when there were some contention issues when producing the context switch trace. See the inline comments for more information.
- Update the 'dump info' command to show information and stats related to the multicore decoding flow, including timing about context switch decoding.
- Add the logic to conver nanoseconds to TSCs.
- Fix a bug when returning the context switches. Now they data returned makes sense and even empty traces can be returned from lldb-server.
- Finish the necessary bits for loading and saving a multi-core trace bundle from disk.
- Change some size_t to uint64_t for compatibility with 32 bit systems.
Tested by saving a trace session of a program that sleeps 100 times, it was able to produce the following 'dump info' text:
```
(lldb) trace load /tmp/trace3/trace.json (lldb) thread trace dump info Trace technology: intel-pt
thread #1: tid = 4192415
Total number of instructions: 1
Memory usage:
Total approximate memory usage (excluding raw trace): 2.51 KiB
Average memory usage per instruction (excluding raw trace): 2573.00 bytes
Timing for this thread:
Timing for global tasks:
Context switch trace decoding: 0.00s
Events:
Number of instructions with events: 0
Number of individual events: 0
Multi-core decoding:
Total number of continuous executions found: 2499
Number of continuous executions for this thread: 102
Errors:
Number of TSC decoding errors: 0
```
Differential Revision: https://reviews.llvm.org/D126267
:q!
This diff is massive, but it's because it connects the client with lldb-server
and also ensures that the postmortem case works.
- Flatten the postmortem trace schema. The reason is that the schema has become quite complex due to the new multicore case, which defeats the original purpose of having a schema that could work for every trace plug-in. At this point, it's better that each trace plug-in defines it's own full schema. This means that the only common field is "type".
-- Because of this new approach, I merged the "common" trace load and saving functionalities into the IntelPT one. This simplified the code quite a bit. If we eventually implement another trace plug-in, we can see then what we could reuse.
-- The new schema, which is flattened, has now better comments and is parsed better. A change I did was to disallow hex addresses, because they are a bit error prone. I'm asking now to print the address in decimal.
-- Renamed "intel" to "GenuineIntel" in the schema because that's what you see in /proc/cpuinfo.
- Implemented reading the context switch trace data buffer. I had to do
some refactors to do that cleanly.
-- A major change that I did here was to simplify the perf_event circular buffer reading logic. It was too complex. Maybe the original Intel author had something different in mind.
- Implemented all the necessary bits to read trace.json files with per-core data.
- Implemented all the necessary bits to save to disk per-core trace session.
- Added a test that ensures that parsing and saving to disk works.
Differential Revision: https://reviews.llvm.org/D126015
- Add a warnings field in the jLLDBGetState response, for warnings to be delivered to the client for troubleshooting. This removes the need to silently log lldb-server's llvm::Errors and not expose them easily to the user
- Simplify the tscPerfZeroConversion struct and schema. It used to extend a base abstract class, but I'm doubting that we'll ever add other conversion mechanisms because all modern kernels support perf zero. It is also the one who is supposed to work with the timestamps produced by the context switch trace, so expecting it is imperative.
- Force tsc collection for cpu tracing.
- Add a test checking that tscPerfZeroConversion is returned by the GetState request
- Add a pre-check for cpu tracing that makes sure that perf zero values are available.
Differential Revision: https://reviews.llvm.org/D125932
- Add collection of context switches per cpu grouped with the per-cpu intel pt traces.
- Move the state handling from the interl pt trace class to the PerfEvent one.
- Add support for stopping and enabling perf event groups.
- Return context switch entries as part of the jLLDBTraceGetState response.
- Move the triggers of whenever the process stopped or resumed. Now the will-resume notification is in a better location, which will ensure that we'll capture the instructions that will be executed.
- Remove IntelPTSingleBufferTraceUP. The unique pointer was useless.
- Add unit tests
Differential Revision: https://reviews.llvm.org/D125897