443 lines
		
	
	
		
			19 KiB
		
	
	
	
		
			Markdown
		
	
	
	
			
		
		
	
	
			443 lines
		
	
	
		
			19 KiB
		
	
	
	
		
			Markdown
		
	
	
	
| # Structured Data Plugins
 | |
| 
 | |
| This document describes an infrastructural feature called Structured Data
 | |
| plugins.  See the DarwinLog documentation for a description of one such plugin
 | |
| that makes use of this feature.
 | |
| 
 | |
| StructuredDataPlugin instances have the following characteristics:
 | |
| 
 | |
| * Each plugin instance is bound to a single Process instance.
 | |
| 
 | |
| * Each StructuredData feature has a type name that identifies the
 | |
|   feature. For instance, the type name for the DarwinLog feature is
 | |
|   "DarwinLog". This feature type name is used in various places.
 | |
| 
 | |
| * The process monitor reports the list of supported StructuredData
 | |
|   features advertised by the process monitor. Process goes through the
 | |
|   list of supported feature type names, and asks each known
 | |
|   StructuredDataPlugin if it can handle the feature. The first plugin
 | |
|   that supports the feature is mapped to that Process instance for
 | |
|   that feature.  Plugins are only mapped when the process monitor
 | |
|   advertises that a feature is supported.
 | |
| 
 | |
| * The feature may send asynchronous messages in StructuredData format
 | |
|   to the Process instance. Process instances route the asynchronous
 | |
|   structured data messages to the plugin mapped to that feature type,
 | |
|   if one exists.
 | |
| 
 | |
| * Plugins can request that the Process instance forward on
 | |
|   configuration data to the process monitor if the plugin needs/wants
 | |
|   to configure the feature. Plugins may call the new Process method
 | |
| 
 | |
|   ```C++
 | |
|   virtual Error
 | |
|   ConfigureStructuredData(ConstString type_name,
 | |
|                           const StructuredData::ObjectSP &config_sp)
 | |
|   ```
 | |
| 
 | |
|   where `type_name` is the feature name and `config_sp` points to the
 | |
|   configuration structured data, which may be nullptr.
 | |
| 
 | |
| * Plugins for features present in a process are notified when modules
 | |
|   are loaded into the Process instance via this StructuredDataPlugin
 | |
|   method:
 | |
| 
 | |
|   ```C++
 | |
|   virtual void
 | |
|   ModulesDidLoad(Process &process, ModuleList &module_list);
 | |
|   ```
 | |
| 
 | |
| * Plugins may optionally broadcast their received structured data as
 | |
|   an LLDB process-level event via the following new Process call:
 | |
| 
 | |
|   ```C++
 | |
|   void
 | |
|   BroadcastStructuredData(const StructuredData::ObjectSP &object_sp,
 | |
|                           const lldb::StructuredDataPluginSP &plugin_sp);
 | |
|   ```
 | |
| 
 | |
|   IDE clients might use this feature to receive information about the
 | |
|   process as it is running to monitor memory usage, CPU usage, and
 | |
|   logging.
 | |
| 
 | |
|   Internally, the event type created is an instance of
 | |
|   EventDataStructuredData.
 | |
| 
 | |
| * In the case where a plugin chooses to broadcast a received
 | |
|   StructuredData event, the command-line LLDB Debugger instance
 | |
|   listens for them. The Debugger instance then gives the plugin an
 | |
|   opportunity to display info to either the debugger output or error
 | |
|   stream at a time that is safe to write to them. The plugin can
 | |
|   choose to display something appropriate regarding the structured
 | |
|   data that time.
 | |
| 
 | |
| * Plugins can provide a ProcessLaunchInfo filter method when the
 | |
|   plugin is registered.  If such a filter method is provided, then
 | |
|   when a process is about to be launched for debugging, the filter
 | |
|   callback is invoked, given both the launch info and the target.  The
 | |
|   plugin may then alter the launch info if needed to better support
 | |
|   the feature of the plugin.
 | |
| 
 | |
| * The plugin is entirely independent of the type of Process-derived
 | |
|   class that it is working with. The only requirements from the
 | |
|   process monitor are the following feature-agnostic elements:
 | |
| 
 | |
|   * Provide a way to discover features supported by the process
 | |
|     monitor for the current process.
 | |
| 
 | |
|   * Specify the list of supported feature type names to Process.
 | |
|     The process monitor does this by calling the following new
 | |
|     method on Process:
 | |
| 
 | |
|     ```C++
 | |
|     void
 | |
|     MapSupportedStructuredDataPlugins(const StructuredData::Array
 | |
|                                       &supported_type_names)
 | |
|     ```
 | |
| 
 | |
|     The `supported_type_names` specifies an array of string entries,
 | |
|     where each entry specifies the name of a StructuredData feature.
 | |
| 
 | |
|   * Provide a way to forward on configuration data for a feature type
 | |
|     to the process monitor.  This is the manner by which LLDB can
 | |
|     configure a feature, perhaps based on settings or commands from
 | |
|     the user.  The following virtual method on Process (described
 | |
|     earlier) does the job:
 | |
| 
 | |
|     ```C++
 | |
|     virtual Error
 | |
|     ConfigureStructuredData(ConstString type_name,
 | |
|                             const StructuredData::ObjectSP &config_sp)
 | |
|     ```
 | |
| 
 | |
|   * Listen for asynchronous structured data packets from the process
 | |
|     monitor, and forward them on to Process via this new Process
 | |
|     member method:
 | |
| 
 | |
|     ```C++
 | |
|     bool
 | |
|     RouteAsyncStructuredData(const StructuredData::ObjectSP object_sp)
 | |
|     ```
 | |
| 
 | |
| * StructuredData producers must send their top-level data as a
 | |
|   Dictionary type, with a key called 'type' specifying a string value,
 | |
|   where the value is equal to the StructuredData feature/type name
 | |
|   previously advertised. Everything else about the content of the
 | |
|   dictionary is entirely up to the feature.
 | |
| 
 | |
| * StructuredDataPlugin commands show up under `plugin structured-data
 | |
|   plugin-name`.
 | |
| 
 | |
| * StructuredDataPlugin settings show up under
 | |
|   `plugin.structured-data.{plugin-name}`.
 | |
| 
 | |
| ## StructuredDataDarwinLog feature
 | |
| 
 | |
| The DarwinLog feature supports logging `os_log`*() and `NSLog`() messages
 | |
| to the command-line lldb console, as well as making those messages
 | |
| available to LLDB clients via the event system.  Starting with fall
 | |
| 2016 OSes, Apple platforms introduce a new fire-hose, stream-style
 | |
| logging system where the bulk of the log processing happens on the log
 | |
| consumer side.  This reduces logging impact on the system when there
 | |
| are no consumers, making it cheaper to include logging at all times.
 | |
| However, it also increases the work needed on the consumer end when
 | |
| log messages are desired.
 | |
| 
 | |
| The debugserver binary has been modified to support collection of
 | |
| `os_log`*()/`NSLog`() messages, selection of which messages appear in the
 | |
| stream, and fine-grained filtering of what gets passed on to the LLDB
 | |
| client.  DarwinLog also tracks the activity chain (i.e. `os_activity`()
 | |
| hierarchy) in effect at the time the log messages were issued.  The
 | |
| user is able to configure a number of aspects related to the
 | |
| formatting of the log message header fields.
 | |
| 
 | |
| The DarwinLog support is written in a way which should support the
 | |
| lldb client side on non-Apple clients talking to an Apple device or
 | |
| macOS system; hence, the plugin support is built into all LLDB
 | |
| clients, not just those built on an Apple platform.
 | |
| 
 | |
| StructuredDataDarwinLog implements the 'DarwinLog' feature type, and
 | |
| the plugin name for it shows up as `darwin-log`.
 | |
| 
 | |
| The user interface to the darwin-log support is via the following:
 | |
| 
 | |
| * `plugin structured-data darwin-log enable` command
 | |
| 
 | |
|   This is the main entry point for enabling the command.  It can be
 | |
|   set before launching a process or while the process is running.
 | |
|   If the user wants to squelch seeing info-level or debug-level
 | |
|   messages, which is the default behavior, then the enable command
 | |
|   must be made prior to launching the process; otherwise, the
 | |
|   info-level and debug-level messages will always show up.  Also,
 | |
|   there is a similar "echo os_log()/NSLog() messages to target
 | |
|   process stderr" mechanism which is properly disabled when enabling
 | |
|   the DarwinLog support prior to launch.  This cannot be squelched
 | |
|   if enabling DarwinLog after launch.
 | |
| 
 | |
|   See the help for this command.  There are a number of options
 | |
|   to shrink or expand the number of messages that are processed
 | |
|   on the remote side and sent over to the client, and other
 | |
|   options to control the formatting of messages displayed.
 | |
| 
 | |
|   This command is sticky.  Once enabled, it will stay enabled for
 | |
|   future process launches.
 | |
| 
 | |
| * `plugin structured-data darwin-log disable` command
 | |
| 
 | |
|   Executing this command disables os_log() capture in the currently
 | |
|   running process and signals LLDB to stop attempting to launch
 | |
|   new processes with DarwinLog support enabled.
 | |
| 
 | |
| * `settings set
 | |
|   plugin.structured-data.darwin-log.enable-on-startup true`
 | |
| 
 | |
|   and
 | |
| 
 | |
|   `settings set
 | |
|   plugin.structured-data.darwin-log.auto-enable-options -- `{options}
 | |
| 
 | |
|   When `enable-on-startup` is set to `true`, then LLDB will automatically
 | |
|   enable DarwinLog on startup of relevant processes.  It will use the
 | |
|   content provided in the auto-enable-options settings as the
 | |
|   options to pass to the enable command.
 | |
| 
 | |
|   Note the `--` required after auto-enable-command.  That is necessary
 | |
|   for raw commands like settings set.  The `--` will not become part
 | |
|   of the options for the enable command.
 | |
| 
 | |
| ### Message flow and related performance considerations
 | |
| 
 | |
| `os_log`()-style collection is not free.  The more data that must be
 | |
| processed, the slower it will be.  There are several knobs available
 | |
| to the developer to limit how much data goes through the pipe, and how
 | |
| much data ultimately goes over the wire to the LLDB client.  The
 | |
| user's goal should be to ensure he or she only collects as many log
 | |
| messages are needed, but no more.
 | |
| 
 | |
| The flow of data looks like the following:
 | |
| 
 | |
| 1. Data comes into debugserver from the low-level OS facility that
 | |
|    receives log messages.  The data that comes through this pipe can
 | |
|    be limited or expanded by the `--debug`, `--info` and
 | |
|    `--all-processes` options of the `plugin structured-data darwin-log
 | |
|    enable` command options.  Exclude as many categories as possible
 | |
|    here (also the default).  The knobs here are very coarse - for
 | |
|    example, whether to include `os_log_info()`-level or
 | |
|    `os_log_debug()`-level info, or to include callstacks in the log
 | |
|    message event data.
 | |
| 
 | |
| 2. The debugserver process filters the messages that arrive through a
 | |
|    message log filter that may be fully customized by the user.  It
 | |
|    works similar to a rules-based packet filter: a set of rules are
 | |
|    matched against the log message, each rule tried in sequential
 | |
|    order.  The first rule that matches then either accepts or rejects
 | |
|    the message.  If the log message does not match any rule, then the
 | |
|    message gets the no-match (i.e. fall-through) action.  The no-match
 | |
|    action defaults to accepting but may be set to reject.
 | |
| 
 | |
|    Filters can be added via the enable command's '`--filter`
 | |
|    {filter-spec}' option.  Filters are added in order, and multiple
 | |
|    `--filter` entries can be provided to the enable command.
 | |
| 
 | |
|    Filters take the following form:
 | |
| ```
 | |
|    {action} {attribute} {op}
 | |
| 
 | |
|    {action} :=
 | |
|        accept |
 | |
|        reject
 | |
| 
 | |
|    {attribute} :=
 | |
|        category       |   // The log message category
 | |
|        subsystem      |   // The log message subsystem
 | |
|        activity       |   // The child-most activity in force
 | |
|                           // at the time the message was logged.
 | |
|        activity-chain |   // The complete activity chain, specified
 | |
|                           // as {parent-activity}:{child-activity}:
 | |
|                           // {grandchild-activity}
 | |
|        message        |   // The fully expanded message contents.
 | |
|                           // Note this one is expensive because it
 | |
|                           // requires expanding the message.  Avoid
 | |
|                           // this if possible, or add it further
 | |
|                           // down the filter chain.
 | |
| 
 | |
|    {op} :=
 | |
|               match {exact-match-text} |
 | |
|               regex {search-regex}        // uses C++ std::regex
 | |
|                                           // ECMAScript variant.
 | |
| ```
 | |
|    e.g.
 | |
|    `--filter "accept subsystem match com.example.mycompany.myproduct"`
 | |
|    `--filter "accept subsystem regex com.example.+"`
 | |
|    `--filter "reject category regex spammy-system-[[:digit:]]+"`
 | |
| 
 | |
| 3. Messages that are accepted by the log message filter get sent to
 | |
|    the lldb client, where they are mapped to the
 | |
|    StructuredDataDarwinLog plugin.  By default, command-line lldb will
 | |
|    issue a Process-level event containing the log message content, and
 | |
|    will request the plugin to print the message if the plugin is
 | |
|    enabled to do so.
 | |
| 
 | |
| ### Log message display
 | |
| 
 | |
| Several settings control aspects of displaying log messages in
 | |
| command-line LLDB.  See the `enable` command's help for a description
 | |
| of these.
 | |
| 
 | |
| 
 | |
| ## StructuredDataDarwinLog feature
 | |
| 
 | |
| The DarwinLog feature supports logging `os_log`*() and `NSLog`() messages
 | |
| to the command-line lldb console, as well as making those messages
 | |
| available to LLDB clients via the event system.  Starting with fall
 | |
| 2016 OSes, Apple platforms introduce a new fire-hose, stream-style
 | |
| logging system where the bulk of the log processing happens on the log
 | |
| consumer side.  This reduces logging impact on the system when there
 | |
| are no consumers, making it cheaper to include logging at all times.
 | |
| However, it also increases the work needed on the consumer end when
 | |
| log messages are desired.
 | |
| 
 | |
| The debugserver binary has been modified to support collection of
 | |
| `os_log`*()/`NSLog`() messages, selection of which messages appear in the
 | |
| stream, and fine-grained filtering of what gets passed on to the LLDB
 | |
| client.  DarwinLog also tracks the activity chain (i.e. `os_activity`()
 | |
| hierarchy) in effect at the time the log messages were issued.  The
 | |
| user is able to configure a number of aspects related to the
 | |
| formatting of the log message header fields.
 | |
| 
 | |
| The DarwinLog support is written in a way which should support the
 | |
| lldb client side on non-Apple clients talking to an Apple device or
 | |
| macOS system; hence, the plugin support is built into all LLDB
 | |
| clients, not just those built on an Apple platform.
 | |
| 
 | |
| StructuredDataDarwinLog implements the 'DarwinLog' feature type, and
 | |
| the plugin name for it shows up as `darwin-log`.
 | |
| 
 | |
| The user interface to the darwin-log support is via the following:
 | |
| 
 | |
| * `plugin structured-data darwin-log enable` command
 | |
| 
 | |
|   This is the main entry point for enabling the command.  It can be
 | |
|   set before launching a process or while the process is running.
 | |
|   If the user wants to squelch seeing info-level or debug-level
 | |
|   messages, which is the default behavior, then the enable command
 | |
|   must be made prior to launching the process; otherwise, the
 | |
|   info-level and debug-level messages will always show up.  Also,
 | |
|   there is a similar "echo os_log()/NSLog() messages to target
 | |
|   process stderr" mechanism which is properly disabled when enabling
 | |
|   the DarwinLog support prior to launch.  This cannot be squelched
 | |
|   if enabling DarwinLog after launch.
 | |
| 
 | |
|   See the help for this command.  There are a number of options
 | |
|   to shrink or expand the number of messages that are processed
 | |
|   on the remote side and sent over to the client, and other
 | |
|   options to control the formatting of messages displayed.
 | |
| 
 | |
|   This command is sticky.  Once enabled, it will stay enabled for
 | |
|   future process launches.
 | |
| 
 | |
| * `plugin structured-data darwin-log disable` command
 | |
| 
 | |
|   Executing this command disables os_log() capture in the currently
 | |
|   running process and signals LLDB to stop attempting to launch
 | |
|   new processes with DarwinLog support enabled.
 | |
| 
 | |
| * `settings set
 | |
|   plugin.structured-data.darwin-log.enable-on-startup true`
 | |
| 
 | |
|   and
 | |
| 
 | |
|   `settings set
 | |
|   plugin.structured-data.darwin-log.auto-enable-options -- `{options}
 | |
| 
 | |
|   When `enable-on-startup` is set to `true`, then LLDB will automatically
 | |
|   enable DarwinLog on startup of relevant processes.  It will use the
 | |
|   content provided in the auto-enable-options settings as the
 | |
|   options to pass to the enable command.
 | |
| 
 | |
|   Note the `--` required after auto-enable-command.  That is necessary
 | |
|   for raw commands like settings set.  The `--` will not become part
 | |
|   of the options for the enable command.
 | |
| 
 | |
| ### Message flow and related performance considerations
 | |
| 
 | |
| `os_log`()-style collection is not free.  The more data that must be
 | |
| processed, the slower it will be.  There are several knobs available
 | |
| to the developer to limit how much data goes through the pipe, and how
 | |
| much data ultimately goes over the wire to the LLDB client.  The
 | |
| user's goal should be to ensure he or she only collects as many log
 | |
| messages are needed, but no more.
 | |
| 
 | |
| The flow of data looks like the following:
 | |
| 
 | |
| 1. Data comes into debugserver from the low-level OS facility that
 | |
|    receives log messages.  The data that comes through this pipe can
 | |
|    be limited or expanded by the `--debug`, `--info` and
 | |
|    `--all-processes` options of the `plugin structured-data darwin-log
 | |
|    enable` command options.  Exclude as many categories as possible
 | |
|    here (also the default).  The knobs here are very coarse - for
 | |
|    example, whether to include `os_log_info()`-level or
 | |
|    `os_log_debug()`-level info, or to include callstacks in the log
 | |
|    message event data.
 | |
| 
 | |
| 2. The debugserver process filters the messages that arrive through a
 | |
|    message log filter that may be fully customized by the user.  It
 | |
|    works similar to a rules-based packet filter: a set of rules are
 | |
|    matched against the log message, each rule tried in sequential
 | |
|    order.  The first rule that matches then either accepts or rejects
 | |
|    the message.  If the log message does not match any rule, then the
 | |
|    message gets the no-match (i.e. fall-through) action.  The no-match
 | |
|    action defaults to accepting but may be set to reject.
 | |
| 
 | |
|    Filters can be added via the enable command's '`--filter`
 | |
|    {filter-spec}' option.  Filters are added in order, and multiple
 | |
|    `--filter` entries can be provided to the enable command.
 | |
| 
 | |
|    Filters take the following form:
 | |
| ```
 | |
|    {action} {attribute} {op}
 | |
| 
 | |
|    {action} :=
 | |
|        accept |
 | |
|        reject
 | |
| 
 | |
|    {attribute} :=
 | |
|        category       |   // The log message category
 | |
|        subsystem      |   // The log message subsystem
 | |
|        activity       |   // The child-most activity in force
 | |
|                           // at the time the message was logged.
 | |
|        activity-chain |   // The complete activity chain, specified
 | |
|                           // as {parent-activity}:{child-activity}:
 | |
|                           // {grandchild-activity}
 | |
|        message        |   // The fully expanded message contents.
 | |
|                           // Note this one is expensive because it
 | |
|                           // requires expanding the message.  Avoid
 | |
|                           // this if possible, or add it further
 | |
|                           // down the filter chain.
 | |
| 
 | |
|    {op} :=
 | |
|               match {exact-match-text} |
 | |
|               regex {search-regex}        // uses C++ std::regex
 | |
|                                           // ECMAScript variant.
 | |
| ```
 | |
|    e.g.
 | |
|    `--filter "accept subsystem match com.example.mycompany.myproduct"`
 | |
|    `--filter "accept subsystem regex com.example.+"`
 | |
|    `--filter "reject category regex spammy-system-[[:digit:]]+"`
 | |
| 
 | |
| 3. Messages that are accepted by the log message filter get sent to
 | |
|    the lldb client, where they are mapped to the
 | |
|    StructuredDataDarwinLog plugin.  By default, command-line lldb will
 | |
|    issue a Process-level event containing the log message content, and
 | |
|    will request the plugin to print the message if the plugin is
 | |
|    enabled to do so.
 | |
| 
 | |
| ### Log message display
 | |
| 
 | |
| Several settings control aspects of displaying log messages in
 | |
| command-line LLDB.  See the `enable` command's help for a description
 | |
| of these.
 | |
| 
 | |
| 
 | |
| 
 |