# Quickstart tutorial to adding MLIR graph rewrite

This document will present a quickstart to adding graph rewrites. We shall start
by defining an operation, showing multiple ways to define the rewrite using
patterns, as well as defining the rewrite using a graph walker (note: using
patterns and the rewrite engine is preferred; showing the walker is for
demonstration purposes).

See [MLIR specification](LangRef.md) for more information about MLIR, the
structure of the IR, operations, etc. See
[Table-driven Operation Definition](OpDefinitions.md) and
[Declarative Rewrite Rule](DeclarativeRewrites.md) for the detailed explanation
of all available mechanisms for defining operations and rewrites in a
table-driven manner.

## Adding operation

An operation in MLIR is specified using a definition in a
[TableGen](https://llvm.org/docs/TableGen/index.html) file. TableGen is a
modeling tool used to specify the ops, from which the C++ code to interact with
these operations is generated. To define an operation one needs to specify:

*   The operation name. This name is a unique identifier of the operation within
    MLIR. Most operations are within a dialect, so for example one could have
    `tfl.add` to represent the add operation in the TensorFlow Lite dialect.
    Instead of repeating the dialect in the op definition, a base class for the
    op dialect is commonly created that prepends the dialect namespace given an
    op name.
*   The traits of the operation. These allow you to specify traits of the
    operation, such as whether it has side effects or whether it should be
    verified that the operands and result types are the same. These are backed
    by C++ traits that perform the verification.
*   The arguments of the operation. These are the input operands (values at
    runtime produced by other ops) and attributes (compile time known constant
    values that affect the behavior of the op) that are the inputs of/define the
    behavior of the operation. The input operands may be named; the attributes
    must be named.
*   The result(s) of the operation. These may again be named or not.
*   Documentation of the operation. This includes a one-line summary as well as
    a longer human-readable description of the operation.
*   Dialect specific information. Additional information could be added to the
    operation definition that is only used by dialect specific drivers. This
    information is ignored by the main op and doc generators, but could be used
    in, say, the translation from a dialect to another representation.

```tablegen
def TFL_LeakyReluOp: TFL_Op<TFL_Dialect, "leaky_relu",
                            [NoSideEffect, SameValueType]>,
                     Results<(outs Tensor)> {
  let arguments = (ins
    F32Tensor:$x,
    // Slope of the activation function at x < 0.
    F32Attr:$alpha
  );

  let summary = "Leaky ReLU operator";
  let description = [{
    Element-wise Leaky ReLU operator
      x -> x >= 0 ? x : (alpha * x)
  }];

  // TFLite specific attribute that is used when generating the output
  // flatbuffer.
  let hasOptions = 1;
}
```

Note in the above the result types and inputs are specified in different ways,
one by way of trait and the other by way of let. It is possible to specify both
in either way.

<!-- TODO: Define a style convention. -->

Operations can also have a custom parser, printer, builder, verifier, constant
folder, or canonicalizer. These require specifying additional C++ methods to
invoke for additional functionality. For example, if an operation is marked to
have a folder, the constant folder also needs to be added, e.g.:

```c++
OpFoldResult SpecificOp::fold(ArrayRef<Attribute> constOperands) {
  if (unable_to_fold)
    return {};
  ....
  return val;
}
```

## Adding patterns

There are multiple forms of graph rewrite that can be performed in MLIR. One of
the most common is DAG tile to DAG tile rewrite. Patterns provide a concise way
to express this transformation as a pair of source pattern to match and
resultant pattern. There are both the C++ classes to represent this
transformation, as well as the patterns in TableGen from which these can be
generated.

### TableGen patterns

Let us continue with LeakyRelu. To map from TensorFlow's `LeakyRelu` to
TensorFlow Lite's `LeakyRelu`:

```tablegen
def : Pat<(TF_LeakyReluOp $arg, F32Attr:$a), (TFL_LeakyReluOp $arg, $a)>;
```

The pattern is specified by instantiating a `Pat` with a source and result DAG.
The arguments in the source pattern are captured and can be used in the result
pattern. This is a simple pattern as we have a 1:1 mapping and the attribute
does not need to be transformed (e.g., both have a floating point attribute for
alpha). The names of the attributes specified in the pattern are for
matching/referencing and need not match the original attribute name in the op
definition, but the order of arguments of the dags does need to match.

To specify a pattern, both the source and resultant ops need to be defined using
TableGen.

If this were a more advanced pattern that the current framework could not
express as a destination, then one could use a general native code fallback
method. This consists of defining a pattern as well as adding a C++ function to
perform the replacement:

```tablegen
def createTFLLeakyRelu : NativeCodeCall<
    "createTFLLeakyRelu($_builder, $0.getDefiningOp(), $1, $2)">;

def : Pat<(TF_LeakyReluOp:$old_value, $arg, F32Attr:$a),
          (createTFLLeakyRelu $old_value, $arg, $a)>;
```

```c++
static Value createTFLLeakyRelu(PatternRewriter &rewriter, Operation *op,
                                Value operand, Attribute attr) {
  return rewriter.create<mlir::TFL::LeakyReluOp>(
      op->getLoc(), operand.getType(), /*arg=*/operand,
      /*alpha=*/attr.cast<FloatAttr>());
}
```

This allows for arbitrarily complex builders. On the input pattern side one can
express multi-op patterns with constraints on input operands and attributes. But
input patterns cannot yet express constraints across multiple
operands/attributes.

### Register the pattern

The file containing the patterns needs to be processed using `mlir-tblgen`
`-gen-rewriters` at compile time. It can be invoked with the following
configuration in CMake:

```cmake
set(LLVM_TARGET_DEFINITIONS <name-of-the-td-file>)
mlir_tablegen(<name-of-the-generated-inc-file> -gen-rewriters)
add_public_tablegen_target(<name-of-the-cmake-target>)
```

Then you can `#include` the generated file in any C++ implementation file you
like. (You will also need to make sure the library depends on the CMake target
defined in the above.) The generated file will have a `populateWithGenerated(
RewritePatternSet &patterns)` function that you can
use to collect all the generated patterns inside `patterns` and then use
`patterns` in any pass you would like.

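A minimal sketch of such a collection function, assuming the generated file is
named `TFLLegalizeTFPatterns.inc` (the file name here is hypothetical and
depends on the CMake configuration above):

```c++
#include "mlir/IR/PatternMatch.h"

namespace {
// Pulls in the generated pattern definitions and the populateWithGenerated()
// helper produced by mlir-tblgen -gen-rewriters.
#include "TFLLegalizeTFPatterns.inc"
} // namespace

static void collectLegalizationPatterns(mlir::RewritePatternSet &patterns) {
  // Adds every pattern generated from the .td file into `patterns`.
  populateWithGenerated(patterns);
  // Hand-written C++ patterns (see below) can be added to the same set.
}
```
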
### C++ rewrite specification

In case patterns are not sufficient there is also the fully C++ way of
expressing a rewrite:

```c++
/// Multi-step rewrite using "match" and "rewrite". This allows for separating
/// the concerns of matching and rewriting.
struct ConvertTFLeakyRelu : public RewritePattern {
  ConvertTFLeakyRelu(MLIRContext *context)
      : RewritePattern("tf.LeakyRelu", 1, context) {}

  LogicalResult match(Operation *op) const override {
    return success();
  }

  void rewrite(Operation *op, PatternRewriter &rewriter) const override {
    rewriter.replaceOpWithNewOp<TFL::LeakyReluOp>(
        op, op->getResult(0).getType(), op->getOperand(0),
        /*alpha=*/op->getAttrOfType<FloatAttr>("alpha"));
  }
};

/// Single-step rewrite with "matchAndRewrite". This allows for performing the
/// rewrite immediately upon a successful match.
struct ConvertTFLeakyRelu : public RewritePattern {
  ConvertTFLeakyRelu(MLIRContext *context)
      : RewritePattern("tf.LeakyRelu", 1, context) {}

  LogicalResult matchAndRewrite(Operation *op,
                                PatternRewriter &rewriter) const override {
    rewriter.replaceOpWithNewOp<TFL::LeakyReluOp>(
        op, op->getResult(0).getType(), op->getOperand(0),
        /*alpha=*/op->getAttrOfType<FloatAttr>("alpha"));
    return success();
  }
};
```

In the C++ rewrite the static benefit of the rewrite pattern is specified at
construction, while in the pattern generator a simple heuristic is currently
employed based on the number of ops matched and replaced.

In the case where you have a registered op and want to use a benefit of 1, you
can even define the pattern as a C function:

```c++
static LogicalResult
convertTFLeakyRelu(TFLeakyReluOp op, PatternRewriter &rewriter) {
  rewriter.replaceOpWithNewOp<TFL::LeakyReluOp>(
      op, op->getResult(0).getType(), op->getOperand(0),
      /*alpha=*/op->getAttrOfType<FloatAttr>("alpha"));
  return success();
}
```

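Such a function can then be registered alongside class-based patterns; a
minimal sketch, assuming the `RewritePatternSet::add` overload that accepts
function-style patterns is available in the MLIR version in use:

```c++
void populateLeakyReluPattern(mlir::RewritePatternSet &patterns) {
  // Registers convertTFLeakyRelu as a pattern rooted at TFLeakyReluOp,
  // using the default benefit of 1.
  patterns.add(convertTFLeakyRelu);
}
```
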
The above rule did not capture the matching operands/attributes, but in general
the `match` function in a multi-step rewrite may populate and return a
`PatternState` (or a class derived from one) to pass information extracted
during matching to the rewrite. A single-step rewrite with the `matchAndRewrite`
function has the benefit of being able to directly use any values created while
matching, removing the need for `PatternState`.

## Testing

MLIR uses the [lit](https://llvm.org/docs/CommandGuide/lit.html) (LLVM
Integrated Testing) tool for performing testing. Testing is performed by way of
creating the input IR file, running a transformation and then verifying the
output IR. C++ unit tests are the exception, with the IR transformation serving
as the core testing mechanism. This results in fewer binaries that need to be
built (and linked) and forces the focus to be on the representation as an
important piece.

For the legalization transform above we would have a test (probably as part of
the legalization pass test in TensorFlow Lite) such as:

```mlir
// RUN: mlir-opt -tfl-legalize-tf %s | FileCheck %s

func @LeakyRelu(%arg0: tensor<1xf32>) -> tensor<1xf32> {
  %2 = "tf.LeakyRelu"(%arg0) {alpha: 0.1} : (tensor<1xf32>) -> tensor<1xf32>
  return %2: tensor<1xf32>

// CHECK-LABEL: LeakyRelu
// CHECK:  %0 = "tfl.leaky_relu"(%arg0) {alpha: 1.000000e-01} : (tensor<1xf32>) -> tensor<1xf32>
}
```

The RUN command at the top results in running the `mlir-opt` binary (which is a
compiler writer tool to exercise different registered passes) on the current
file to invoke the optimization pass this transform was added as part of, and to
verify its output using `FileCheck`. `FileCheck` is a textual output verifier.
In particular it uses the CHECK expressions to verify the given output is
produced.

There can be multiple RUN commands with different corresponding CHECK prefixes.
In addition, multiple independent tests can be placed in one file, separated by
`// -----`, with `mlir-opt` invoked with the `-split-input-file` flag. This is
especially useful for error testing.

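A hypothetical sketch of this layout, reusing the LeakyRelu test above (the
second test case is purely illustrative):

```mlir
// RUN: mlir-opt -tfl-legalize-tf -split-input-file %s | FileCheck %s

// CHECK-LABEL: LeakyRelu
func @LeakyRelu(%arg0: tensor<1xf32>) -> tensor<1xf32> {
  // CHECK: "tfl.leaky_relu"
  %0 = "tf.LeakyRelu"(%arg0) {alpha: 0.1} : (tensor<1xf32>) -> tensor<1xf32>
  return %0 : tensor<1xf32>
}

// -----

// Each chunk delimited by `// -----` is parsed and run through the pass
// pipeline independently.
// CHECK-LABEL: LeakyReluLargeAlpha
func @LeakyReluLargeAlpha(%arg0: tensor<1xf32>) -> tensor<1xf32> {
  // CHECK: "tfl.leaky_relu"
  %0 = "tf.LeakyRelu"(%arg0) {alpha: 2.0} : (tensor<1xf32>) -> tensor<1xf32>
  return %0 : tensor<1xf32>
}
```
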
This results in very simple, directed testing without the need to work around
constant propagation or other, unrelated, optimization passes.

## Adding optimization pass

Optimization passes that do not fit/are difficult to specify in the above
structure can be specified as general iterations across modules/functions. See
[Writing a Pass](../PassManagement.md) for a general overview and introduction
to optimization passes in MLIR.

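For reference, here is a minimal sketch of a pass that ties the pieces above
together by applying the registered patterns greedily. The pass and function
names are illustrative, and the exact base classes and driver entry points
depend on the MLIR version in use:

```c++
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/IR/PatternMatch.h"
#include "mlir/Pass/Pass.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"

namespace {
// Illustrative pass that runs the LeakyRelu legalization pattern on each
// function in the module.
struct LegalizeTFLeakyReluPass
    : public mlir::PassWrapper<LegalizeTFLeakyReluPass,
                               mlir::OperationPass<mlir::func::FuncOp>> {
  void runOnOperation() override {
    mlir::RewritePatternSet patterns(&getContext());
    // Hand-written pattern defined earlier in this document.
    patterns.add<ConvertTFLeakyRelu>(&getContext());
    // Repeatedly applies the patterns until no more rewrites trigger.
    if (mlir::failed(mlir::applyPatternsAndFoldGreedily(getOperation(),
                                                        std::move(patterns))))
      signalPassFailure();
  }
};
} // namespace
```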