The if-statement should check whehter TFLITE is on or not rather than if the variable is specified.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D132902
TLite is a lightweight, statically linkable[1], model evaluator, supporting a
subset of what the full tensorflow library does, sufficient for the
types of scenarios we envision having. It is also faster.
We still use saved models as "source of truth" - 'release' mode's AOT
starts from a saved model; and the ML training side operates in terms of
saved models.
Using TFLite solves the following problems compared to using the full TF
C API:
- a compiler-friendly implementation for runtime-loadable (as opposed
to AOT-embedded) models: it's statically linked; it can be built via
cmake;
- solves an issue we had when building the compiler with both AOT and
full TF C API support, whereby, due to a packaging issue on the TF
side, we needed to have the pip package and the TF C API library at
the same version. We have no such constraints now.
The main liability is it supporting a subset of what the full TF
framework does. We do not expect that to cause an issue, but should that
be the case, we can always revert back to using the full framework
(after also figuring out a way to address the problems that motivated
the move to TFLite).
Details:
This change switches the development mode to TFLite. Models are still
expected to be placed in a directory - i.e. the parameters to clang
don't change; what changes is the directory content: we still need
an `output_spec.json` file; but instead of the saved_model protobuf and
the `variables` directory, we now just have one file, `model.tflite`.
The change includes a utility showing how to take a saved model and
convert it to TFLite, which it uses for testing.
The full TF implementation can still be built (not side-by-side). We
intend to remove it shortly, after patching downstream dependencies. The
build behavior, however, prioritizes TFLite - i.e. trying to enable both
full TF C API and TFLite will just pick TFLite.
[1] thanks to @petrhosek's changes to TFLite's cmake support and its deps!
This just shuffles implementations and declarations around. Now the
logger and the TF C API-based model evaluator are separate.
Differential Revision: https://reviews.llvm.org/D131116
This patch introduces the inline cost priority into the
module inliner, which uses the same computation as
InlineCost.
Reviewed By: kazu
Differential Revision: https://reviews.llvm.org/D130012
This patch introduces the inline cost priority into the
module inliner, which uses the same computation as
InlineCost.
Reviewed By: kazu
Differential Revision: https://reviews.llvm.org/D130012
Adds a number of utilities that are used to help create and update
memprof related metadata. These will be used during profile matching
and annotation, as well as by the inliner when updating the metadata.
Also adds unit tests for the utilities.
See also related RFCs:
RFC: Sanitizer-based Heap Profiler [1]
RFC: A binary serialization format for MemProf [2]
RFC: IR metadata format for MemProf [3]
(Note that the IR metadata format has changed from the RFC during
implementation, as described in the preceeding patch adding the basic
metadata and verification support.)
Depends on D128141.
Differential Revision: https://reviews.llvm.org/D128854
Pointed out in Issue #56432: the current reference models may not be
quite friendly to open source projects. Their purpose is only
illustrative - the expectation is that projects would train their own.
To avoid unintentionally pulling such a model, made the URL cmake
setting require explicit user setting.
Differential Revision: https://reviews.llvm.org/D129342
This is a simple datatype with a few JSON utilities, and is independent
of the underlying executor. The main motivation is to allow taking a
dependency on it on the AOT side, and allow us build a correctly-sized
buffer in the cases when the requested feature isn't supported by the
model. This, in turn, allows us to grow the feature set supported by the
compiler in a backward-compatible way; and also collect traces exposing
the new features, but starting off the older model, and continue
training from those new traces.
Differential Revision: https://reviews.llvm.org/D124417
When looking at building the generator for regalloc, we realized we'd
need quite a bit of custom logic, and that perhaps it'd be easier to
just have each usecase (each kind of mlgo policy) have it's own
stand-alone test generator.
This patch just consolidates the old `config.py` and
`generate_mock_model.py` into one file, and does away with
subdirectories under Analysis/models.
Reverts 02940d6d22. Fixes breakage in the modules build.
LLVM loops cannot represent irreducible structures in the CFG. This
change introduce the concept of cycles as a generalization of loops,
along with a CycleInfo analysis that discovers a nested
hierarchy of such cycles. This is based on Havlak (1997), Nesting of
Reducible and Irreducible Loops.
The cycle analysis is implemented as a generic template and then
instatiated for LLVM IR and Machine IR. The template relies on a new
GenericSSAContext template which must be specialized when used for
each IR.
This review is a restart of an older review request:
https://reviews.llvm.org/D83094
Original implementation by Nicolai Hähnle <nicolai.haehnle@amd.com>,
with recent refactoring by Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>
Differential Revision: https://reviews.llvm.org/D112696
This prepares it for the regalloc work. Part of it is making model
evaluation accross 'development' and 'release' scenarios more reusable.
This patch:
- extends support to tensors of any shape (not just scalars, like we had
in the inliner -Oz case). While the tensor shape can be anything, we
assume row-major layout and expose the tensor as a buffer.
- exposes the NoInferenceModelRunner, which we use in the 'development'
mode to keep the evaluation code path consistent and simplify logging,
as we'll want to reuse it in the regalloc case.
Differential Revision: https://reviews.llvm.org/D115306
LLVM loops cannot represent irreducible structures in the CFG. This
change introduce the concept of cycles as a generalization of loops,
along with a CycleInfo analysis that discovers a nested
hierarchy of such cycles. This is based on Havlak (1997), Nesting of
Reducible and Irreducible Loops.
The cycle analysis is implemented as a generic template and then
instatiated for LLVM IR and Machine IR. The template relies on a new
GenericSSAContext template which must be specialized when used for
each IR.
This review is a restart of an older review request:
https://reviews.llvm.org/D83094
Original implementation by Nicolai Hähnle <nicolai.haehnle@amd.com>,
with recent refactoring by Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>
Differential Revision: https://reviews.llvm.org/D112696
The tests that exercise the 'release' mode, where the model is AOT-ed,
check the output has certain properties, to validate that, indeed, a
different policy from the default one was exercised. For determinism, we
can't reliably check that output for an arbitrary learned policy, since
it could be that policy happens to mimic the default one in that
particular case.
This patch adds a requirement that those tests run only when the model
is autogenerated (e.g. on build bots).
Differential Revision: https://reviews.llvm.org/D111747
It turns out that during training, the time required to parse the
textual protobuf of a training log is about the same as the time it
takes to compile the module generating that log. Using binary protobufs
instead elides that cost almost completely.
Differential Revision: https://reviews.llvm.org/D106157
This change yields an additional 2% size reduction on an internal search
binary, and an additional 0.5% size reduction on fuchsia.
Differential Revision: https://reviews.llvm.org/D104751
They are not conducive to being stored in git. Instead, we autogenerate
mock model artifacts for use in tests. Production models can be
specified with the cmake flag LLVM_INLINER_MODEL_PATH.
LLVM_INLINER_MODEL_PATH has two sentinel values:
- download, which will download the most recent compatible model.
- autogenerate, which will autogenerate a "fake" model for testing the
model uptake infrastructure.
Differential Revision: https://reviews.llvm.org/D104251
This was prompted by D95727, which had the side-effect to break the
'release' mode build bot for ML-driven policies. The problem is that now
the pre-compiled object files don't get transitively carried through as
'source' anymore; that being said, the previous way of consuming them
was problematic, because it was only working for static builds; in
dynamic builds, the whole tf_xla_runtime was linked, which is
undesirable.
The alternative is to treat tf_xla_runtime as an archive, which then
leads to the desired effect.
Differential Revision: https://reviews.llvm.org/D99829
This is related to D94982. We want to call these APIs from the Analysis
component, so we can't leave them under Transforms.
Differential Revision: https://reviews.llvm.org/D95079
This is being recommitted to try and address the MSVC complaint.
This patch implements a DDG printer pass that generates a graph in
the DOT description language, providing a more visually appealing
representation of the DDG. Similar to the CFG DOT printer, this
functionality is provided under an option called -dot-ddg and can
be generated in a less verbose mode under -dot-ddg-only option.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D90159
This patch implements a DDG printer pass that generates a graph in
the DOT description language, providing a more visually appealing
representation of the DDG. Similar to the CFG DOT printer, this
functionality is provided under an option called -dot-ddg and can
be generated in a less verbose mode under -dot-ddg-only option.
Differential Revision: https://reviews.llvm.org/D90159
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
This introduces the IRInstructionMapper, and the associated wrapper for
instructions, IRInstructionData, that maps IR level Instructions to
unsigned integers.
Mapping is done mainly by using the "isSameOperationAs" comparison
between two instructions. If they return true, the opcode, result type,
and operand types of the instruction are used to hash the instruction
with an unsigned integer. The mapper accepts instruction ranges, and
adds each resulting integer to a list, and each wrapped instruction to
a separate list.
At present, branches, phi nodes are not mapping and exception handling
is illegal. Debug instructions are not considered.
The different mapping schemes are tested in
unittests/Analysis/IRSimilarityIdentifierTest.cpp
Recommit of: b04c1a9d31
Differential Revision: https://reviews.llvm.org/D86968
This introduces the IRInstructionMapper, and the associated wrapper for
instructions, IRInstructionData, that maps IR level Instructions to
unsigned integers.
Mapping is done mainly by using the "isSameOperationAs" comparison
between two instructions. If they return true, the opcode, result type,
and operand types of the instruction are used to hash the instruction
with an unsigned integer. The mapper accepts instruction ranges, and
adds each resulting integer to a list, and each wrapped instruction to
a separate list.
At present, branches, phi nodes are not mapping and exception handling
is illegal. Debug instructions are not considered.
The different mapping schemes are tested in
unittests/Analysis/IRSimilarityIdentifierTest.cpp
Differential Revision: https://reviews.llvm.org/D86968
This patch recommits "[ConstraintSystem] Add helpers to deal with linear constraints."
(it reverts the revert commit 8da6ae4ce1).
The reason for the revert was using __builtin_multiply_overflow, which
is not available for all compilers. The patch has been updated to use
MulOverflow from MathExtras.h
This patch introduces a new ConstraintSystem class, that maintains a set
of linear constraints and uses Fourier–Motzkin elimination to eliminate
constraints to check if there are solutions for the system.
It also adds a convert-constraint-log-to-z3.py script, which can parse
the debug output of the constraint system and convert it to a python
script that feeds the constraints into Z3 and checks if it produces the
same result as the LLVM implementation. This is for verification
purposes.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D84544
This also changes -lint from an analysis to a pass. It's similar to
-verify, and that is a normal pass, and lives in llvm/IR.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D87057
This change added a new inline advisor that takes optimization remarks from previous inlining as input, and provides the decision as advice so current inlining can replay inline decisions of a different compilation. Dwarf inline stack with line and discriminator is used as anchor for call sites including call context. The change can be useful for Inliner tuning as it provides a channel to allow external input for tweaking inline decisions. Existing alternatives like alwaysinline attribute is per-function, not per-callsite. Per-callsite inline intrinsic can be another solution (not yet existing), but it's intrusive to implement and also does not differentiate call context.
A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inline advisor with SampleProfileLoader's inline decision for replay. Since SampleProfileLoader does top-down inlining, inline decision can be specialized for each call context, hence we should be able to replay inlining accurately. However with a bottom-up inliner like CGSCC inlining, the replay can be limited due to lack of specialization for different call context. Apart from that limitation, the new inline advisor can still be used by regular CGSCC inliner later if needed for tuning purpose.
This is a resubmit of https://reviews.llvm.org/D83743
(This reverts commit a5e0194709, and
corrects author).
Rename the pass to be able to extend it to function properties other than inliner features.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D82044
Rename the pass to be able to extend it to function properties other than inliner features.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D82044
Outside of compiler-rt (where it's arguably an anti-pattern too),
LLVM tries to keep its build files as simple as possible. See e.g.
llvm/docs/SupportLibrary.rst, "Code Organization".
Differential Revision: https://reviews.llvm.org/D84243
Summary:
This is the InlineAdvisor used in 'development' mode. It enables two
scenarios:
- loading models via a command-line parameter, thus allowing for rapid
training iteration, where models can be used for the next exploration
phase without requiring recompiling the compiler. This trades off some
compilation speed for the added flexibility.
- collecting training logs, in the form of tensorflow.SequenceExample
protobufs. We generate these as textual protobufs, which simplifies
generation and testing. The protobufs may then be readily consumed by a
tensorflow-based training algorithm.
To speed up training, training logs may also be collected from the
'default' training policy. In that case, this InlineAdvisor does not
use a model.
RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html
Reviewers: jdoerfert, davidxl
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83733
Summary:
This change added a new inline advisor that takes optimization remarks from previous inlining as input, and provides the decision as advice so current inlining can replay inline decisions of a different compilation. Dwarf inline stack with line and discriminator is used as anchor for call sites including call context. The change can be useful for Inliner tuning as it provides a channel to allow external input for tweaking inline decisions. Existing alternatives like alwaysinline attribute is per-function, not per-callsite. Per-callsite inline intrinsic can be another solution (not yet existing), but it's intrusive to implement and also does not differentiate call context.
A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inline advisor with SampleProfileLoader's inline decision for replay. Since SampleProfileLoader does top-down inlining, inline decision can be specialized for each call context, hence we should be able to replay inlining accurately. However with a bottom-up inliner like CGSCC inlining, the replay can be limited due to lack of specialization for different call context. Apart from that limitation, the new inline advisor can still be used by regular CGSCC inliner later if needed for tuning purpose.
Subscribers: mgorny, aprantl, hiraditya, llvm-commits
Tags: #llvm
Resubmit for https://reviews.llvm.org/D84086