Commit Graph

26 Commits

Author SHA1 Message Date
Mircea Trofin c35ad9ee4f [mlgo] Support exposing more features than those supported by models
This allows the compiler to support more features than those supported by a
model. The only requirement (development mode only) is that the new
features must be appended at the end of the list of features requested
from the model. The support is transparent to compiler code: for
unsupported features, we provide a valid buffer to copy their values;
it's just that this buffer is disconnected from the model, so insofar
as the model is concerned (AOT or development mode), these features don't
exist. The buffers are allocated at setup - meaning, at steady state,
there is no extra allocation (maintaining the current invariant). These
buffers has 2 roles: one, keep the compiler code simple. Second, allow
logging their values in development mode. The latter allows retraining
a model supporting the larger feature set starting from traces produced
with the old model.

For release mode (AOT-ed models), this decouples compiler evolution from
model evolution, which we want in scenarios where the toolchain is
frequently rebuilt and redeployed: we can first deploy the new features,
and continue working with the older model, until a new model is made
available, which can then be picked up the next time the compiler is built.

Differential Revision: https://reviews.llvm.org/D124565
2022-05-09 18:01:21 -07:00
Mircea Trofin b1fa5ac3ba [mlgo] Factor out TensorSpec
This is a simple datatype with a few JSON utilities, and is independent
of the underlying executor. The main motivation is to allow taking a
dependency on it on the AOT side, and allow us build a correctly-sized
buffer in the cases when the requested feature isn't supported by the
model. This, in turn, allows us to grow the feature set supported by the
compiler in a backward-compatible way; and also collect traces exposing
the new features, but starting off the older model, and continue
training from those new traces.

Differential Revision: https://reviews.llvm.org/D124417
2022-04-25 18:35:46 -07:00
Mircea Trofin e4794ff5c6 [mlgo][nfc] Decouple TensorSpec from tensorflow.
The motivation is twofold:

1) Allow plugging in a different training-time evaluator, e.g.
   TFLite-based, etc.

2) Allow using TensorSpec for AOT, too, to support evolution: we start
   by extracting a superset of the features currently supported by a
   model. For the tensors the model does not support, we just return a
   valid, but useless, buffer. This makes using a 'smaller' model (less
   supported tensors) transparent to the compiler. The key is to
   dimension the buffer appropriately, and we already have TensorSpec
   modeling that info.

The only coupling was due to the reliance of a TF internal API for
getting the element size, but for the types we are interested in,
`sizeof` is sufficient.

A subsequent change will yank out TensorSpec in its own module.

Differential Revision: https://reviews.llvm.org/D124045
2022-04-21 15:37:01 -07:00
Mircea Trofin 1f5dceb1d0 [MLGO] Add support for multiple training traces per module
This happens in e.g. regalloc, where we trace decisions per function,
but wouldn't want to spew N log files (i.e. one per function). So we
output a key-value association, where the key is an ID for the
sub-module object, and the value is the tensorflow::SequenceExample.

The current relation with protobuf is tenuous, so we're avoiding a
custom message type in favor of using the `Struct` message, but that
requires the values be wire-able strings, hence base64 encoding.

We plan on resolving the protobuf situation shortly, and improve the
encoding of such logs, but this is sufficient for now for setting up
regalloc training.

Differential Revision: https://reviews.llvm.org/D116985
2022-01-11 16:13:31 -08:00
Mircea Trofin 8dc3fe0cd1 [NFC][MLGO] Use std::move when moving protobufs
Because of an odd linking problem, we need to temporarily support
building with TF C API 1.15 + tensorflow 2.50 pip package in
'development' mode scenarios. Protobuf Message 'Swap' is partially
implemented in the header (2.50) and relies on a symbol not found in TF
C API 1.15. std::move avoids that, at no semantic cost.
2021-08-20 13:40:35 -07:00
Mircea Trofin 510402c2c8 [NFC][MLGO] 'Use' variable used for asserts 2021-08-10 19:55:17 -07:00
Christopher Di Bella c874dd5362 [llvm][clang][NFC] updates inline licence info
Some files still contained the old University of Illinois Open Source
Licence header. This patch replaces that with the Apache 2 with LLVM
Exception licence.

Differential Revision: https://reviews.llvm.org/D107528
2021-08-11 02:48:53 +00:00
Mircea Trofin ae1a2a09e4 [NFC][MLGO] Make logging more robust
1) add some self-diagnosis (when asserts are enabled) to check that all
features have the same nr of entries

2) avoid storing pointers to mutable fields because the proto API
contract doesn't actually guarantee those stay fixed even if no further
mutation of the object occurs.

Differential Revision: https://reviews.llvm.org/D107594
2021-08-06 04:44:52 -07:00
Mircea Trofin 55e12f7080 [NFC][MLGO] Just use the underlying protobuf object for logging
Avoid buffering just to copy the buffered data, in 'development
mode', when logging. Instead, just populate the underlying protobuf.

Differential Revision: https://reviews.llvm.org/D106592
2021-07-23 10:56:48 -07:00
Mircea Trofin 55e2d2060a [MLGO] Use binary protobufs for improved training performance.
It turns out that during training, the time required to parse the
textual protobuf of a training log is about the same as the time it
takes to compile the module generating that log. Using binary protobufs
instead elides that cost almost completely.

Differential Revision: https://reviews.llvm.org/D106157
2021-07-19 13:59:28 -07:00
Kazu Hirata 047fc3bf20 [Analysis] Use ListSeparator (NFC) 2021-02-21 19:58:04 -08:00
Mircea Trofin 8ab2353a4c [NFC][TFUtils] also include output specs lookup logic in loadOutputSpecs
The lookup logic is also reusable.

Also refactored the API to return the loaded vector - this makes it more
clear what state it is in in the case of error (as it won't be
returned).

Differential Revision: https://reviews.llvm.org/D91759
2020-11-18 21:20:21 -08:00
Mircea Trofin b51e844f7a [NFC][TFUtils] Extract out the output spec loader
It's generic for the 'development mode', not specific to the inliner
case.

Differential Revision: https://reviews.llvm.org/D91751
2020-11-18 20:03:20 -08:00
Mircea Trofin d454328ea8 [ML] Add final reward logging facility.
Allow logging final rewards. A final reward is logged only once, and is
serialized as all-zero values, except for the last one.

Differential Revision: https://reviews.llvm.org/D89626
2020-10-19 08:44:50 -07:00
Mircea Trofin 36bb1fb1fe [MLInliner] Factor out logging
Factored out the logging facility, to allow its reuse outside the
inliner.

Differential Revision: https://reviews.llvm.org/D88770
2020-10-05 18:09:17 -07:00
Sam McCall fa69b60806 [JSON] Add error reporting to fromJSON and ObjectMapper
Translating between JSON objects and C++ strutctures is common.
From experience in clangd, fromJSON/ObjectMapper work well and save a lot of
code, but aren't adopted elsewhere at least partly due to total lack of error
reporting beyond "ok"/"bad".

The recently-added error model should be rich enough for most applications.
It requires tracking the path within the root object and reporting local
errors at appropriate places.
To do this, we exploit the fact that the call graph of recursive
parse functions mirror the structure of the JSON itself.
The current path is represented as a linked list of segments, each of which is
on the stack as a parameter. Concretely, fromJSON now looks like:
  bool fromJSON(const Value&, T&, Path);

Beyond the signature change, this is reasonably unobtrusive: building
the path segments is mostly handled by ObjectMapper and the vector<T> fromJSON.
However the root caller of fromJSON must now create a Root object to
store the errors, which is a little clunky.

I've added high-level parse<T>(StringRef) -> Expected<T>, but it's not
general enough to be the primary interface I think (at least, not usable in
clangd).

All existing users (mostly just clangd) are updated in this patch,
making this change backwards-compatible is a bit hairy.

Differential Revision: https://reviews.llvm.org/D88103
2020-09-24 01:20:09 +02:00
Mircea Trofin 7cfcecece0 [MLInliner] Simplify TFUTILS_SUPPORTED_TYPES
We only need the C++ type and the corresponding TF Enum. The other
parameter was used for the output spec json file, but we can just
standardize on the C++ type name there.

Differential Revision: https://reviews.llvm.org/D86549
2020-08-25 14:19:39 -07:00
Mircea Trofin b18c41c66f [TFUtils] Expose untyped accessor to evaluation result tensors
These were implementation detail, but become necessary for generic data
copying.

Also added const variations to them, and move assignment, since we had a
move ctor (and the move assignment helps in a subsequent patch).

Differential Revision: https://reviews.llvm.org/D85262
2020-08-05 10:22:45 -07:00
Mircea Trofin 90b9c49ca6 [llvm] Expose type and element count-related APIs on TensorSpec
Added a mechanism to check the element type, get the total element
count, and the size of an element.

Differential Revision: https://reviews.llvm.org/D85250
2020-08-04 17:32:16 -07:00
Mircea Trofin 4b1b109c51 [llvm] Add a parser from JSON to TensorSpec
A JSON->TensorSpec utility we will use subsequently to specify
additional outputs needed for certain training scenarios.

Differential Revision: https://reviews.llvm.org/D84976
2020-08-03 09:49:31 -07:00
Mircea Trofin 71059257bd [llvm][NFC] TensorSpec abstraction for ML evaluator
Further abstracting the specification of a tensor, to more easily
support different types and shapes of tensor, and also to perform
initialization up-front, at TFModelEvaluator construction time.

Differential Revision: https://reviews.llvm.org/D84685
2020-07-29 16:29:21 -07:00
Nico Weber 4fe912f186 Build: Move TF source file inclusion from build system to source files
Outside of compiler-rt (where it's arguably an anti-pattern too),
LLVM tries to keep its build files as simple as possible. See e.g.
llvm/docs/SupportLibrary.rst, "Code Organization".

Differential Revision: https://reviews.llvm.org/D84243
2020-07-21 13:02:34 -04:00
Mircea Trofin 4f763b2172 [llvm][NFC] Hide the tensorflow dependency from headers.
Summary:
This change avoids exposing tensorflow types when including TFUtils.h.
They are just an implementation detail, and don't need to be used
directly when implementing an analysis requiring ML model evaluation.

The TFUtils APIs, while generically typed, are still not exposed unless
the tensorflow C library is present, as they currently have no use
otherwise.

Reviewers: mehdi_amini, davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83843
2020-07-14 21:14:11 -07:00
Mircea Trofin caf395ee8c Reapply "[llvm] Native size estimator for training -Oz inliner"
This reverts commit 9908a3b9f5.

The fix was to exclude the content of TFUtils.h (automatically
included in the LLVM_Analysis module, when LLVM_ENABLE_MODULES is enabled).

Differential Revision: https://reviews.llvm.org/D82817
2020-07-13 16:26:26 -07:00
Davide Italiano 9908a3b9f5 Revert "[llvm] Native size estimator for training -Oz inliner"
This reverts commit 83080a294a as
it breaks the macOS modules build.
2020-07-13 13:13:36 -07:00
Mircea Trofin 83080a294a [llvm] Native size estimator for training -Oz inliner
Summary:
This is an experimental ML-based native size estimator, necessary for
computing partial rewards during -Oz inliner policy training. Data
extraction for model training will be provided in a separate patch.

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html

Reviewers: davidxl, jdoerfert

Subscribers: mgorny, hiraditya, mgrang, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82817
2020-07-13 10:13:56 -07:00