This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Use the same register layout as Linux kernel, implement the
related read and write operations.
Reviewed By: SixWeining, xen0n, DavidSpickett
Differential Revision: https://reviews.llvm.org/D138407
Add as little code as possible to allow compiling lldb on LoongArch.
Actual functionality will be implemented later.
Reviewed By: SixWeining, DavidSpickett
Differential Revision: https://reviews.llvm.org/D136578
The size of the m_hwp_regs array should match the default value of
m_max_hwp_supported. This ensures that no out-of-bounds accesses
occur, even if the array is accessed prior to a call to
ReadHardwareDebugInfo().
Fixes https://github.com/llvm/llvm-project/issues/54520, see also
there for additional background.
Differential Revision: https://reviews.llvm.org/D136144
All callers were either assuming their pointer was not null before calling
this, or checking beforehand.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D135668
Most of the paths to this never passed nullptr intentionally. Those
that possibly could have were assuming it was not null elsehwere,
so would have crashed.
I've added asserts in those cases.
At least one case was relying on GetAsMemoryData to return an error
when it was given nullptr. So I've hoisted that error setting code
out into the caller.
Depends on D134963
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D134965
Bionic's <sys/procfs.h> defines the necessary symbols. Remove the
specialization for Android and the now-unnecessary include of
<sys/ptrace.h>. This also helps resolve issues when building the
x86/x86_64 lldb-server for Android.
Curiously, the default branch to include <sys/procfs.h> doesn't seem
necessary on Linux. I'll remove it and add it back if it breaks other
builders.
Differential Revision: https://reviews.llvm.org/D132514
Add:
- `EmulateInstructionRISCV`, which can be used for riscv32 and riscv64.
- Add unittests for EmulateInstructionRISCV.
Note: Compressed instructions set (RVC) was still not supported in this patch.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D131759
In this switch case we didn't handle possible errors in `ResumeThread()`, it's hard to get helpful information when it goes wrong.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D131946
Pavel Labath taught me that clang-format sorts headers automatically
using llvm's rules, and it's better not to have spaces between
So in this diff I'm removing those spaces and formatting them as well.
I used `clang-format -i` to format these files.
Fixed an inconsistency between D130985 and D130342
This should be a follow-up of D130985
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D131667
This patch is based on the minimal extract of D128250.
What is implemented:
- Use the same register layout as Linux kernel and mock read/write for `x0` register (the always zero register).
- Refactor some duplicate code, and delete unused register definitions.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D130342
@clayborg found a potential race condition when setting a static
variable. The fix seems simply to use call_once.
All relevant tests pass.
Differential Revision: https://reviews.llvm.org/D131081
Currently, lldb-server was opening the executable file to determine the
process architecture (to differentiate between 32 and 64 bit
architecture flavours). This isn't a particularly trustworthy source of
information (the file could have been changed since the process was
started) and it is not always available (file could be deleted or
otherwise inaccessible).
Unfortunately, ptrace does not give us a direct API to access the
process architecture, but we can still infer it via some of its
responses -- given that the general purpose register set of 64-bit
applications is larger [citation needed] than the GPR set of 32-bit
ones, we can just ask for the application GPR set and check its size.
This is what this patch does.
Differential Revision: https://reviews.llvm.org/D130985
It turns out that cgroup filtering is relatively trivial and works
really nicely. Thid diffs adds automatic cgroup filtering when in
per-cpu mode, unless a new --disable-cgroup-filtering flag is passed in
the start command. At least on Meta machines, all processes are spawned
inside a cgroup by default, which comes super handy, because per cpu
tracing is now much more precise.
A manual test gave me this result
- Without filtering:
Total number of trace items: 36083
Total number of continuous executions found: 229
Number of continuous executions for this thread: 2
Total number of PSB blocks found: 98
Number of PSB blocks for this thread 2
Total number of unattributed PSB blocks found: 38
- With filtering:
Total number of trace items: 87756
Total number of continuous executions found: 123
Number of continuous executions for this thread: 2
Total number of PSB blocks found: 10
Number of PSB blocks for this thread 3
Total number of unattributed PSB blocks found: 2
Filtering gives us great results. The number of instructions collected
more than double (probalby because we have less noise in the trace), and
we have much less unattributed PSBs blocks and unrelated PSBs in
general. The ones that are unrelated probably belong to other processes
in the same cgroup.
Differential Revision: https://reviews.llvm.org/D129257
Implement support for the "t" action that is used to stop a thread.
Normally this action is used only in non-stop mode. However, there's
no technical reason why it couldn't be also used in all-stop mode,
e.g. to express "resume all threads except ..." (`t:...;c`).
While at it, add a more complete test for vCont correctly resuming
a subset of program's threads.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D126983
Fix ThreadStopInfo struct to include the signal number for all events.
Since signo was not included in the details for fork, vfork
and vforkdone stops, the code incidentally referenced the wrong union
member, resulting in wrong signo being sent.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D127193
Having a member variable TraceIntelPT * makes it look as if it was
optional. I'm using instead a weak_ptr to indicate that it's not
optional and the object is under the ownership of TraceIntelPT.
Besides that, I've simplified the Perf aux and data buffers copying by
using vector.insert.
I'm also renaming Lookup2 to Lookup. The 2 in the name is confusing.
Differential Revision: https://reviews.llvm.org/D127881
llvm's JSON parser supports 64 bit integers, but other tools like the
ones written in JS don't support numbers that big, so we need to
represent these possibly big numbers as a string. This diff uses that to
represent addresses and tsc zero. The former is printed in hex for and
the latter in decimal string form. The schema was updated mentioning
that.
Besides that, I fixed some remaining issues and now all test pass. Before I wasn't running all tests because for some reason my computer reverted perf_paranoid to 1.
Differential Revision: https://reviews.llvm.org/D127819
As discusses offline with @jj10305, we are updating some naming used throughout the code, specially in the json schema
- traceBuffer -> iptTrace
- core -> cpu
Differential Revision: https://reviews.llvm.org/D127817
This applies the changes requested for diff 12.
- use DenseMap<ConstString, _> instead of std::unordered_map<ConstString, _>, which is more idiomatic and possibly performant.
- deduplicate some code in Trace.cpp by using helper functions for fetching in maps
- stop using size and offset when fetching binary data, because we in fact read the entire buffers all the time. If we ever need streaming, we can implement it then. Now, the size is used only to check that we are getting the correct amount of data. This is useful because in some cases determining the size doesn't involve fetching the actual data.
- added back the x86_64 macro to the perf tests
- added more documentation
- simplified some file handling
- fixed some comments
Differential Revision: https://reviews.llvm.org/D127752
This improves several things and addresses comments up to the diff [11] in this stack.
- Simplify many functions to receive less parameters that they can identify easily
- Create Storage classes for Trace and TraceIntelPT that can make it easier to reason about what can change with live process refreshes and what cannot.
- Don't cache the perf zero conversion numbers in lldb-server to make sure we get the most up-to-date numbers.
- Move the thread identifaction from context switches to the bundle parser, to leave TraceIntelPT simpler. This also makes sure that the constructor of TraceIntelPT is invoked when the entire data has been checked to be correct.
- Normalize all bundle paths before the Processes, Threads and Modules are created, so that they can assume that all paths are correct and absolute
- Fix some issues in the tests. Now they all pass.
- return the specific instance when constructing PerThread and MultiCore processor tracers.
- Properly implement IntelPTMultiCoreTrace::TraceStart.
- Improve some comments.
- Use the typedef ContextSwitchTrace more often for clarity.
- Move CreateContextSwitchTracePerfEvent to Perf.h as a utility function.
- Synchronize better the state of the context switch and the intel pt
perf events.
- Use a booblean instead of an enum for the PerfEvent state.
Differential Revision: https://reviews.llvm.org/D127456
- Add the logic that parses all cpu context switch traces and produces blocks of continuous executions, which will be later used to assign intel pt subtraces to threads and to identify gaps. This logic can also identify if the context switch trace is malformed.
- The continuous executions blocks are able to indicate when there were some contention issues when producing the context switch trace. See the inline comments for more information.
- Update the 'dump info' command to show information and stats related to the multicore decoding flow, including timing about context switch decoding.
- Add the logic to conver nanoseconds to TSCs.
- Fix a bug when returning the context switches. Now they data returned makes sense and even empty traces can be returned from lldb-server.
- Finish the necessary bits for loading and saving a multi-core trace bundle from disk.
- Change some size_t to uint64_t for compatibility with 32 bit systems.
Tested by saving a trace session of a program that sleeps 100 times, it was able to produce the following 'dump info' text:
```
(lldb) trace load /tmp/trace3/trace.json (lldb) thread trace dump info Trace technology: intel-pt
thread #1: tid = 4192415
Total number of instructions: 1
Memory usage:
Total approximate memory usage (excluding raw trace): 2.51 KiB
Average memory usage per instruction (excluding raw trace): 2573.00 bytes
Timing for this thread:
Timing for global tasks:
Context switch trace decoding: 0.00s
Events:
Number of instructions with events: 0
Number of individual events: 0
Multi-core decoding:
Total number of continuous executions found: 2499
Number of continuous executions for this thread: 102
Errors:
Number of TSC decoding errors: 0
```
Differential Revision: https://reviews.llvm.org/D126267
:q!
This diff is massive, but it's because it connects the client with lldb-server
and also ensures that the postmortem case works.
- Flatten the postmortem trace schema. The reason is that the schema has become quite complex due to the new multicore case, which defeats the original purpose of having a schema that could work for every trace plug-in. At this point, it's better that each trace plug-in defines it's own full schema. This means that the only common field is "type".
-- Because of this new approach, I merged the "common" trace load and saving functionalities into the IntelPT one. This simplified the code quite a bit. If we eventually implement another trace plug-in, we can see then what we could reuse.
-- The new schema, which is flattened, has now better comments and is parsed better. A change I did was to disallow hex addresses, because they are a bit error prone. I'm asking now to print the address in decimal.
-- Renamed "intel" to "GenuineIntel" in the schema because that's what you see in /proc/cpuinfo.
- Implemented reading the context switch trace data buffer. I had to do
some refactors to do that cleanly.
-- A major change that I did here was to simplify the perf_event circular buffer reading logic. It was too complex. Maybe the original Intel author had something different in mind.
- Implemented all the necessary bits to read trace.json files with per-core data.
- Implemented all the necessary bits to save to disk per-core trace session.
- Added a test that ensures that parsing and saving to disk works.
Differential Revision: https://reviews.llvm.org/D126015
- Add a warnings field in the jLLDBGetState response, for warnings to be delivered to the client for troubleshooting. This removes the need to silently log lldb-server's llvm::Errors and not expose them easily to the user
- Simplify the tscPerfZeroConversion struct and schema. It used to extend a base abstract class, but I'm doubting that we'll ever add other conversion mechanisms because all modern kernels support perf zero. It is also the one who is supposed to work with the timestamps produced by the context switch trace, so expecting it is imperative.
- Force tsc collection for cpu tracing.
- Add a test checking that tscPerfZeroConversion is returned by the GetState request
- Add a pre-check for cpu tracing that makes sure that perf zero values are available.
Differential Revision: https://reviews.llvm.org/D125932
- Add collection of context switches per cpu grouped with the per-cpu intel pt traces.
- Move the state handling from the interl pt trace class to the PerfEvent one.
- Add support for stopping and enabling perf event groups.
- Return context switch entries as part of the jLLDBTraceGetState response.
- Move the triggers of whenever the process stopped or resumed. Now the will-resume notification is in a better location, which will ensure that we'll capture the instructions that will be executed.
- Remove IntelPTSingleBufferTraceUP. The unique pointer was useless.
- Add unit tests
Differential Revision: https://reviews.llvm.org/D125897
We were setting some events to be written in the data buffer of the
perf_event, but we don't need that.
Besides that, we don't need the data buffer to be larger than 1, so we
can reduce its size.
Differential Revision: https://reviews.llvm.org/D125850
We have two different "process trace" implementations: per thread and per core. As a way to simplify the collector who uses both, I'm creating a base abstract class that is used by these implementations. This effectively simplify a good chunk of code.
Differential Revision: https://reviews.llvm.org/D125503
On Ubuntu 18.04 with GCC 7.5 Intel trace code fails to build due to
failure to convert from
lldb_private::process_linux::IntelPTPerThreadProcessTraceUP to
Expected<lldb_private::process_linux::IntelPTPerThreadProcessTraceUP>.
This commit explicitely marks those unique_ptr values as being moved
which fixes the conversion error.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D126402