The per-PSB packet decoding logic was wrong because it was assuming that pt_insn_get_sync_offset was being udpated after every PSB. Silly me, that is not true. It returns the offset of the PSB packet after invoking pt_insn_sync_forward regardless of how many PSBs are visited later. Instead, I'm now following the approach described in https://github.com/intel/libipt/blob/master/doc/howto_libipt.md#parallel-decode for parallel decoding, which is basically what we need.
A nasty error that happened because of this is that when we had two PSBs (A and B), the following was happening
1. PSB A was processed all the way up to the end of the trace, which includes PSB B.
2. PSB B was then processed until the end of the trace.
The instructions emitted by step 2. were also emitted as part of step 1. so our trace had duplicated chunks. This problem becomes worse when you many PSBs.
As part of making sure this diff is correct, I added some other features that are very useful.
- Added a "synchronization point" event to the TraceCursor, so we can inspect when PSBs are emitted.
- Removed the single-thread decoder. Now the per-cpu decoder and single-thread decoder use the same code paths.
- Use the query decoder to fetch PSBs and timestamps. It turns out that the pt_insn_sync_forward of the instruction decoder can move past several PSBs (this means that we could skip some TSCs). On the other hand, the pt_query_sync_forward method doesn't skip PSBs, so we can get more accurate sync events and timing information.
- Turned LibiptDecoder into PSBBlockDecoder, which decodes single PSB blocks. It is the fundamental processing unit for decoding.
- Added many comments, asserts and improved error handling for clarity.
- Improved DecodeSystemWideTraceForThread so that a TSC is emitted always before a cpu change event. This was a bug that was annoying me before.
- SplitTraceInContinuousExecutions and FindLowestTSCInTrace are now using the query decoder, which can identify precisely each PSB along with their TSCs.
- Added an "only-events" option to the trace dumper to inspect only events.
I did extensive testing and I think we should have an in-house testing CI. The LLVM buildbots are not capable of supporting testing post-mortem traces of hundreds of megabytes. I'll leave that for later, but at least for now the current tests were able to catch most of the issues I encountered when doing this task.
A sample output of a program that I was single stepping is the following. You can see that only one PSB is emitted even though stepping happened!
```
thread #1: tid = 3578223
0: (event) trace synchronization point [offset = 0x0xef0]
a.out`main + 20 at main.cpp:29:20
1: 0x0000000000402479 leaq -0x1210(%rbp), %rax
2: (event) software disabled tracing
3: 0x0000000000402480 movq %rax, %rdi
4: (event) software disabled tracing
5: (event) software disabled tracing
6: 0x0000000000402483 callq 0x403bd4 ; std::vector<int, std::allocator<int>>::vector at stl_vector.h:391:7
7: (event) software disabled tracing
a.out`std::vector<int, std::allocator<int>>::vector() at stl_vector.h:391:7
8: 0x0000000000403bd4 pushq %rbp
9: (event) software disabled tracing
10: 0x0000000000403bd5 movq %rsp, %rbp
11: (event) software disabled tracing
```
This is another trace of a long program with a few PSBs.
```
(lldb) thread trace dump instructions -E -f thread #1: tid = 3603082
0: (event) trace synchronization point [offset = 0x0x80]
47417: (event) software disabled tracing
129231: (event) trace synchronization point [offset = 0x0x800]
146747: (event) software disabled tracing
246076: (event) software disabled tracing
259068: (event) trace synchronization point [offset = 0x0xf78]
259276: (event) software disabled tracing
259278: (event) software disabled tracing
no more data
```
Differential Revision: https://reviews.llvm.org/D131630
Added:
- Take RISC-V `ebreak` instruction as breakpoint trap code, so our breakpoint works as expected now.
Further work:
- RISC-V does not support hardware single stepping yet. A software implementation may come in future PR.
- Add support for RVC extension (the trap code, etc.).
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D131566
This commit combines the initial commit (7c240de609af), a fix for x86_64 Linux
(3a0581501e76) and a fix for thinko in a last minute rewrite that I really
should have run the testsuite on.
Also, make sure that all the "I need to step over watchpoint" plans execute
before we call a public stop. Otherwise, e.g. if you have N watchpoints and
a Signal, the signal stop info will get us to stop with the watchpoints in a
half-done state.
Differential Revision: https://reviews.llvm.org/D130674
Add bindings for the `TraceCursor` to allow for programatic traversal of
traces.
This diff adds bindings for all public `TraceCursor` methods except
`GetHwClock` and also adds `SBTrace::CreateNewCursor`. A new unittest
has been added to TestTraceLoad.py that uses the new `SBTraceCursor` API
to test that the sequential and random access APIs of the `TraceCursor`
are equivalent.
This diff depends on D130925.
Test Plan:
`ninja lldb-dotest && ./bin/lldb-dotest -p TestTraceLoad`
Differential Revision: https://reviews.llvm.org/D130930
The use of `std::unique_ptr` with `TraceCursor` adds unnecessary complexity to adding `SBTraceCursor` bindings
Specifically, since `TraceCursor` is an abstract class there's no clean
way to provide "deep clone" semantics for `TraceCursorUP` short of
creating a pure virtual `clone()` method (afaict).
After discussing with @wallace, we decided there is no strong reason to
favor wrapping `TraceCursor` with `std::unique_ptr` over `std::shared_ptr`, thus this diff
replaces all usages of `std::unique_ptr<TraceCursor>` with `std::shared_ptr<TraceCursor>`.
This sets the stage for future diffs to introduce `SBTraceCursor`
bindings in a more clean fashion.
Test Plan:
Differential Revision: https://reviews.llvm.org/D130925
Resubmission of https://reviews.llvm.org/D130309 with the 2 patches that fixed the linux buildbot, and new windows fixes.
The FileSpec APIs allow users to modify instance variables directly by getting a non const reference to the directory and filename instance variables. This makes it impossible to control all of the times the FileSpec object is modified so we can clear cached member variables like m_resolved and with an upcoming patch caching if the file is relative or absolute. This patch modifies the APIs of FileSpec so no one can modify the directory or filename instance variables directly by adding set accessors and by removing the get accessors that are non const.
Many clients were using FileSpec::GetCString(...) which returned a unique C string from a ConstString'ified version of the result of GetPath() which returned a std::string. This caused many locations to use this convenient function incorrectly and could cause many strings to be added to the constant string pool that didn't need to. Most clients were converted to using FileSpec::GetPath().c_str() when possible. Other clients were modified to use the newly renamed version of this function which returns an actualy ConstString:
ConstString FileSpec::GetPathAsConstString(bool denormalize = true) const;
This avoids the issue where people were getting an already uniqued "const char *" that came from a ConstString only to put the "const char *" back into a "ConstString" object. By returning the ConstString instead of a "const char *" clients can be more efficient with the result.
The patch:
- Removes the non const GetDirectory() and GetFilename() get accessors
- Adds set accessors to replace the above functions: SetDirectory() and SetFilename().
- Adds ClearDirectory() and ClearFilename() to replace usage of the FileSpec::GetDirectory().Clear()/FileSpec::GetFilename().Clear() call sites
- Fixed all incorrect usage of FileSpec::GetCString() to use FileSpec::GetPath().c_str() where appropriate, and updated other call sites that wanted a ConstString to use the newly returned ConstString appropriately and efficiently.
Differential Revision: https://reviews.llvm.org/D130549
I noticed that the test TestSetWatchpoint.py was failing every so often
on macOS. The failure was in the last assert, that after destroying the
SBTarget containing it, the SBWatchpoint was still saying it was valid.
IsValid in this case just meant the watchpoint weak pointer could be turned
into a shared pointer. The watchpoint shared pointers have two strong references
in general, one to the "Target::m_last_created_watchpoint", and one in the
Target::m_watchpoint_list. Target::Destroy reset the last created watchpoint
but neglected to call RemoveAll on the watchpoint list (it does the analogous
work for the internal & external breakpoint lists...) This patch does the
equivalent cleanup for the watchpoint list.
This can cause a deadlock if other threads use the common pattern of
"lock the StackFrameList, get a frame, lock the StackFrame."
Differential Revision: https://reviews.llvm.org/D130524
The FileSpect APIs allow users to modify instance variables directly by getting a non const reference to the directory and filename instance variables. This makes it impossibly to control all of the times the FileSpec object is modified so we can clear the cache. This patch modifies the APIs of FileSpec so no one can modify the directory or filename directly by adding set accessors and by removing the get accessors that are non const.
Many clients were using FileSpec::GetCString(...) which returned a unique C string from a ConstString'ified version of the result of GetPath() which returned a std::string. This caused many locations to use this convenient function incorrectly and could cause many strings to be added to the constant string pool that didn't need to. Most clients were converted to using FileSpec::GetPath().c_str() when possible. Other clients were modified to use the newly renamed version of this function which returns an actualy ConstString:
ConstString FileSpec::GetPathAsConstString(bool denormalize = true) const;
This avoids the issue where people were getting an already uniqued "const char *" that came from a ConstString only to put the "const char *" back into a "ConstString" object. By returning the ConstString instead of a "const char *" clients can be more efficient with the result.
The patch:
- Removes the non const GetDirectory() and GetFilename() get accessors
- Adds set accessors to replace the above functions: SetDirectory() and SetFilename().
- Adds ClearDirectory() and ClearFilename() to replace usage of the FileSpec::GetDirectory().Clear()/FileSpec::GetFilename().Clear() call sites
- Fixed all incorrect usage of FileSpec::GetCString() to use FileSpec::GetPath().c_str() where appropriate, and updated other call sites that wanted a ConstString to use the newly returned ConstString appropriately and efficiently.
Differential Revision: https://reviews.llvm.org/D130309
This reverts commit 5778ada8e5.
The watchpoint tests all stall on aarch64-ubuntu bots. Reverting till I can
get my hands on an system to test this out.
This reverts commit 555ae5b8f5.
Apparently, there's something different about how Linux ARM handles watchpoints,
as all the watchpoint tests seem to stall on the Ubuntu aarch64 bots.
Reverting till I can get my hands on a linux system and see what is
wrong.
That was causing hit counts to be double-counted on x86_64 Linux.
It looks like StopInfoWatchpoint::ShouldStopSynchronous gets called
twice for a give stop on Linux (not on Darwin). I had taken out the
"have I been called already" check when I reworked this part of the
code because it didn't seem necessary. Putting that back in because
it looks like it is on some systems.
Since we want to present the "new & old" values for watchpoint hits, on architectures,
including the ARM family, that stop before the triggering instruction is run, we need
to single step over the instruction before stopping for realz. This was incorrectly
done directly in the StopInfoWatchpoint::ShouldStop. That causes problems if more than
one thread stops "for a reason" at the same time as the watchpoint, since the other actions
didn't expect the process to make progress in this part of the execution control machinery.
The correct way to do this is to schedule the step over using ThreadPlans, and then to restore
the stop info after that plan stops, so that the rest of the stop info actions can happen when
all the other threads have handled their immediate actions as well.
Differential Revision: https://reviews.llvm.org/D129814
Thanks to ymeng@fb.com for coming up with this change.
`thread trace dump info` can dump some metrics that can be useful for
analyzing the performance and quality of a trace. This diff adds a --json
option for dumping this information in json format that can be easily
understood my machines.
Differential Revision: https://reviews.llvm.org/D129332
Thanks to fredzhou@fb.com for coming up with this feature.
When tracing in per-cpu mode, we have information of in which cpu we are execution each instruction, which comes from the context switch trace. This diff makes this information available as a `cpu changed event`, which an additional accessor in the cursor `GetCPU()`. As cpu changes are very infrequent, any consumer should listen to cpu change events instead of querying the actual cpu of a trace item. Once a cpu change event is seen, the consumer can invoke GetCPU() to get that information. Also, it's possible to invoke GetCPU() on an arbitrary instruction item, which will return the last cpu seen. However, this call is O(logn) and should be used sparingly.
Manually tested with a sample program that starts on cpu 52, then goes to 18, and then goes back to 52.
Differential Revision: https://reviews.llvm.org/D129340
In rare situations, disassemblying would fail that produce an invalid
InstructionSP object. We need to check that it's valid before using.
With this change, now the dumper doesn't crash with dumping instructions of
ioctl. In fact, it now dumps this output
{
"id": 6135,
"loadAddress": "0x7f4bfe5c7515",
"module": "libc.so.6",
"symbol": "ioctl",
"source": "glibc/2.34/src/glibc-2.34/sysdeps/unix/syscall-template.S",
"line": 120,
"column": 0
}
Anyway, we need to investigate why the diassembler failed disassembling that
instruction. From over 2B instructions I was disassembling today, just this
one failed, so this could be a bug in LLVM's core disassembler.
Differential Revision: https://reviews.llvm.org/D129588
not to be hit. But another thread might be hit at the same time and
actually stop. So we have to be sure to switch the first thread's
stop info to eStopReasonNone or we'll report a hit when the condition
failed, which is confusing.
Differential Revision: https://reviews.llvm.org/D128776
We want to include events with metadata, like context switches, and this
requires the API to handle events with payloads (e.g. information about
such context switches). Besides this, we want to support multiple
similar events between two consecutive instructions, like multiple
context switches. However, the current implementation is not good for this because
we are defining events as bitmask enums associated with specific
instructions. Thus, we need to decouple instructions from events and
make events actual items in the trace, just like instructions and
errors.
- Add accessors in the TraceCursor to know if an item is an event or not
- Modify from the TraceDumper all the way to DecodedThread to support
- Renamed the paused event to disabled.
- Improved the tsc handling logic. I was using an API for getting the tsc from libipt, but that was an overkill that should be used when not processing events manually, but as we are already processing events, we can more easily get the tscs.
event items. Fortunately this simplified many things
- As part of this refactor, I also fixed and long stating issue, which is that some non decoding errors were being inserted in the decoded thread. I changed this so that TraceIntelPT::Decode returns an error if the decoder couldn't be set up proplerly. Then, errors within a trace are actual anomalies found in between instrutions.
All test pass
Differential Revision: https://reviews.llvm.org/D128576
The current way ot traversing the cursor is a bit uncommon and it can't handle empty traces, in fact, its invariant is that it shold always point to a valid item. This diff simplifies the cursor API and allows it to point to invalid items, thus being able to handle empty traces or to know it ran out of data.
- Removed all the granularity functionalities, because we are not actually making use of that. We can bring them back when they are actually needed.
- change the looping logic to the following:
```
for (; cursor->HasValue(); cursor->Next()) {
if (cursor->IsError()) {
.. do something for error
continue;
}
.. do something for instruction
}
```
- added a HasValue method that can be used to identify if the cursor ran out of data, the trace is empty, or the user tried to move to an invalid position via SetId() or Seek()
- made several simplifications to severals parts of the code.
Differential Revision: https://reviews.llvm.org/D128543
As previously discussed with @jj10306, we didn't really have a name for
the post-mortem (or offline) trace session representation, which is in
fact a folder with a bunch of files. We decided to call this folder
"trace bundle", and the main JSON file in it "trace bundle description
file". This naming is pretty decent, so I'm refactoring all the existing
code to account for that.
Differential Revision: https://reviews.llvm.org/D128484
In order to provide simple scripting support on top of instruction traces, a simple solution is to enhance the `dump instructions` command and allow printing in json and directly to a file. The format is verbose and not space efficient, but it's not supposed to be used for really large traces, in which case the TraceCursor API is the way to go.
- add a -j option for printing the dump in json
- add a -J option for pretty printing the json output
- add a -F option for specifying an output file
- add a -a option for dumping all the instructions available starting at the initial point configured with the other flags
- add tests for all cases
- refactored the instruction dumper and abstracted the actual "printing" logic. There are two writer implementations: CLI and JSON. This made the dumper itself much more readable and maintanable
sample output:
```
(lldb) thread trace dump instructions -t -a --id 100 -J
[
{
"id": 100,
"tsc": "43591204528448966"
"loadAddress": "0x407a91",
"module": "a.out",
"symbol": "void std::deque<Foo, std::allocator<Foo>>::_M_push_back_aux<Foo>(Foo&&)",
"mnemonic": "movq",
"source": "/usr/include/c++/8/bits/deque.tcc",
"line": 492,
"column": 30
},
...
```
Differential Revision: https://reviews.llvm.org/D128316
This fixes an issue that, when you start lldb or use `target create`
with a program name which is on $PATH, or not specify the .exe suffix of
a program in the working directory on Windows, you get a confusing
error, for example:
(lldb) target create notepad
error: 'C:\WINDOWS\SYSTEM32\notepad.exe' doesn't contain any 'host'
platform architectures: i686, x86_64, i386, i386
Fixes https://github.com/mstorsjo/llvm-mingw/issues/265
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D127436
Add a setting to configure how LLDB parses dynamic Objective-C class
metadata. By default LLDB will choose the most appropriate method for
the target OS.
Differential revision: https://reviews.llvm.org/D128312
Add trace load functionality to SBDebugger via the `LoadTraceFromFile` method.
Update intelpt test case class to have `testTraceLoad` method so we can take advantage of
the testApiAndSB decorator to test both the CLI and SB without duplicating code.
Differential Revision: https://reviews.llvm.org/D128107
When we hit a breakpoint site all of whose owners are internal, we don't
broadcast that event to the public event queue. However, we were checking
whether that was true in the ShouldNotify method, which gets run after the
breakpoint callbacks get run. If the breakpoint callback deletes the site
we just hit, we no longer have the information to make that determination.
This patch just gathers the "was all internal" fact when the StopInfoBreakpoint
gets made, which happens before anyone has a chance to delete the site, and then
uses that cached value.
This bug was causing a couple of tests (including TestStopAtEntry.py) to fail
when using new the macOS Ventura dyld support.
Differential Revision: https://reviews.llvm.org/D127997