Add a new OperationType handle type to the Transform dialect. This
transform type is parameterized by the name of the payload operation it
can point to. It is intended as a constraint on transformations that are
only applicable to a specific kind of payload operations. If a
transformation is applicable to a small set of operation classes, it can
be wrapped into a transform op by using a disjunctive constraint, such
as `Type<Or<[Transform_ConcreteOperation<"foo">.predicate,
Transform_ConcreteOperation<"bar">.predicate]>>` for its operand without
modifying this type. Broader sets of accepted operations should be
modeled as specific types.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D135586
Use the recently introduced TransformTypeInterface instead of hardcoding
the PDLOperationType. This will allow the operations to use more
specific transform types to express pre/post-conditions in the future.
It requires the syntax and Python op construction API to be updated.
Dialect extensions will be switched separately.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D135584
tensor.empty/linalg.init_tensor produces an uninititalized tensor that can be used as a destination operand for destination-style ops (ops that implement `DestinationStyleOpInterface`).
This change makes it possible to implement `TilingInterface` for non-destination-style ops without depending on the Linalg dialect.
RFC: https://discourse.llvm.org/t/rfc-add-tensor-from-shape-operation/65101
Differential Revision: https://reviews.llvm.org/D135129
An `_mlirRegisterEverything.*.so` file from an old build that referenced
`MLIRPythonExtension.RegisterEverything`, but which no longer references
that extension in a new build, causes runtime errors in the new build
like:
ImportError: _mlirRegisterEverything.cpython-38-x86_64-linux-gnu.so: undefined symbol: mlirRegisterAllPasses
The error occurs because the MLIR Python binding tries to dynamically
import the `_mlirRegisterEverything` module but the dynamic importer
fails since the new build no longer references
`MLIRPythonExtension.RegisterEverything`.
One possible solution is for the user to manually remove the
`_mlirRegisterEverything.*.so` file. This patch instead resolves the
problem in code by printing a waning if the module cannot be
imported.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D133450
This allows for incrementally updating the old API usages without
needing to update everything at once. PDL will be left on Both
for a little bit and then flipped to prefixed when all APIs have been
updated.
Differential Revision: https://reviews.llvm.org/D134387
The batch-reduce GEMM kernel essentially multiplies a sequence of input tensor
blocks (which form a batch) and the partial multiplication results are reduced
into a single output tensor block.
See: https://ieeexplore.ieee.org/document/9139809 for more details.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D134163
The batch-reduce GEMM kernel essentially multiplies a sequence of input tensor
blocks (which form a batch) and the partial multiplication results are reduced
into a single output tensor block.
See: https://ieeexplore.ieee.org/document/9139809 for more details.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D134163
This reland includes changes to the Python bindings.
Switch variadic operand and result segment size attributes to use the
dense i32 array. Dense integer arrays were introduced primarily to
represent index lists. They are a better fit for segment sizes than
dense elements attrs.
Depends on D131801
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D131803
Introduce two different failure propagation mode in the Transform
dialect's Sequence operation. These modes specify whether silenceable
errors produced by nested ops are immediately propagated, thus stopping
the sequence, or suppressed. The latter is useful in end-to-end
transform application scenarios where the user cannot correct the
transformation, but it is robust enough to silenceable failures. It
can be combined with the "alternatives" operation. There is
intentionally no default value to avoid favoring one mode over the
other.
Downstreams can update their tests using:
S='s/sequence \(%.*\) {/sequence \1 failures(propagate) {/'
T='s/sequence {/sequence failures(propagate) {/'
git grep -l transform.sequence | xargs sed -i -e "$S"
git grep -l transform.sequence | xargs sed -i -e "$T"
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D131774
Using if (TARGET ${LLVM_NATIVE_ARCH}) only works if MLIR is built
together with LLVM, but not for standalone builds of MLIR. The
correct way to check this is
if (${LLVM_NATIVE_ARCH} IN_LIST LLVM_TARGETS_TO_BUILD), as the
LLVM build system exports LLVM_TARGETS_TO_BUILD.
To avoid repeating the same check many times, add a
MLIR_ENABLE_EXECUTION_ENGINE variable.
Differential Revision: https://reviews.llvm.org/D131071
e179532284 removed the Type field from attributes and
arith::ConstantOp argument is now a TypedAttrInterface which isn't
supported by the python generator.
This patch temporarily restore the functionality for arith.constant but
won't generalize: we need to work on the generator instead.
Differential Revision: https://reviews.llvm.org/D130878
This is the same as the existing multiplier-1 variant of DepthwiseConv2D, but in PyTorch dim order.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D128575
Since the very first commits, the Python and C MLIR APIs have had mis-placed registration/load functionality for dialects, extensions, etc. This was done pragmatically in order to get bootstrapped and then just grew in. Downstreams largely bypass and do their own thing by providing various APIs to register things they need. Meanwhile, the C++ APIs have stabilized around this and it would make sense to follow suit.
The thing we have observed in canonical usage by downstreams is that each downstream tends to have native entry points that configure its installation to its preferences with one-stop APIs. This patch leans in to this approach with `RegisterEverything.h` and `mlir._mlir_libs._mlirRegisterEverything` being the one-stop entry points for the "upstream packages". The `_mlir_libs.__init__.py` now allows customization of the environment and Context by adding "initialization modules" to the `_mlir_libs` package. If present, `_mlirRegisterEverything` is treated as such a module. Others can be added by downstreams by adding a `_site_initialize_{i}.py` module, where '{i}' is a number starting with zero. The number will be incremented and corresponding module loaded until one is not found. Initialization modules can:
* Perform load time customization to the global environment (i.e. registering passes, hooks, etc).
* Define a `register_dialects(registry: DialectRegistry)` function that can extend the `DialectRegistry` that will be used to bootstrap the `Context`.
* Define a `context_init_hook(context: Context)` function that will be added to a list of callbacks which will be invoked after dialect registration during `Context` initialization.
Note that the `MLIRPythonExtension.RegisterEverything` is not included by default when building a downstream (its corresponding behavior was prior). For downstreams which need the default MLIR initialization to take place, they must add this back in to their Python CMake build just like they add their own components (i.e. to `add_mlir_python_common_capi_library` and `add_mlir_python_modules`). It is perfectly valid to not do this, in which case, only the things explicitly depended on and initialized by downstreams will be built/packaged. If the downstream has not been set up for this, it is recommended to simply add this back for the time being and pay the build time/package size cost.
CMake changes:
* `MLIRCAPIRegistration` -> `MLIRCAPIRegisterEverything` (renamed to signify what it does and force an evaluation: a number of places were incidentally linking this very expensive target)
* `MLIRPythonSoure.Passes` removed (without replacement: just drop)
* `MLIRPythonExtension.AllPassesRegistration` removed (without replacement: just drop)
* `MLIRPythonExtension.Conversions` removed (without replacement: just drop)
* `MLIRPythonExtension.Transforms` removed (without replacement: just drop)
Header changes:
* `mlir-c/Registration.h` is deleted. Dialect registration functionality is now in `IR.h`. Registration of upstream features are in `mlir-c/RegisterEverything.h`. When updating MLIR and a couple of downstreams, I found that proper usage was commingled so required making a choice vs just blind S&R.
Python APIs removed:
* mlir.transforms and mlir.conversions (previously only had an __init__.py which indirectly triggered `mlirRegisterTransformsPasses()` and `mlirRegisterConversionPasses()` respectively). Downstream impact: Remove these imports if present (they now happen as part of default initialization).
* mlir._mlir_libs._all_passes_registration, mlir._mlir_libs._mlirTransforms, mlir._mlir_libs._mlirConversions. Downstream impact: None expected (these were internally used).
C-APIs changed:
* mlirRegisterAllDialects(MlirContext) now takes an MlirDialectRegistry instead. It also used to trigger loading of all dialects, which was already marked with a TODO to remove -- it no longer does, and for direct use, dialects must be explicitly loaded. Downstream impact: Direct C-API users must ensure that needed dialects are loaded or call `mlirContextLoadAllAvailableDialects(MlirContext)` to emulate the prior behavior. Also see the `ir.c` test case (e.g. ` mlirContextGetOrLoadDialect(ctx, mlirStringRefCreateFromCString("func"));`).
* mlirDialectHandle* APIs were moved from Registration.h (which now is restricted to just global/upstream registration) to IR.h, arguably where it should have been. Downstream impact: include correct header (likely already doing so).
C-APIs added:
* mlirContextLoadAllAvailableDialects(MlirContext): Corresponds to C++ API with the same purpose.
Python APIs added:
* mlir.ir.DialectRegistry: Mapping for an MlirDialectRegistry.
* mlir.ir.Context.append_dialect_registry(MlirDialectRegistry)
* mlir.ir.Context.load_all_available_dialects()
* mlir._mlir_libs._mlirAllRegistration: New native extension that exposes a `register_dialects(MlirDialectRegistry)` entry point and performs all upstream pass/conversion/transforms registration on init. In this first step, we eagerly load this as part of the __init__.py and use it to monkey patch the Context to emulate prior behavior.
* Type caster and capsule support for MlirDialectRegistry
This should make it possible to build downstream Python dialects that only depend on a subset of MLIR. See: https://github.com/llvm/llvm-project/issues/56037
Here is an example PR, minimally adapting IREE to these changes: https://github.com/iree-org/iree/pull/9638/files In this situation, IREE is opting to not link everything, since it is already configuring the Context to its liking. For projects that would just like to not think about it and pull in everything, add `MLIRPythonExtension.RegisterEverything` to the list of Python sources getting built, and the old behavior will continue.
Reviewed By: mehdi_amini, ftynse
Differential Revision: https://reviews.llvm.org/D128593
Introduce a structured transform op that emits IR computing the multi-tile
sizes with requested parameters (target size and divisor) for the given
structured op. The sizes may fold to arithmetic constant operations when the
shape is constant. These operations may then be used to call the existing
tiling transformation with a single non-zero dynamic size (i.e. perform
strip-mining) for each of the dimensions separately, thus achieving multi-size
tiling with optional loop interchange. A separate test exercises the entire
script.
Depends On D129217
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D129287
Extend the definition of the Tile structured transform op to enable it
accepting handles to operations that produce tile sizes at runtime. This is
useful by itself and prepares for more advanced tiling strategies. Note that
the changes are relevant only to the transform dialect, the tiling
transformation itself already supports dynamic sizes.
Depends On D129216
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D129217
This handle manipulation operation allows one to define a new handle that is
associated with a the same payload IR operations N times, where N can be driven
by the size of payload IR operation list associated with another handle. This
can be seen as a sort of broadcast that can be used to ensure the lists
associated with two handles have equal numbers of payload IR ops as expected by
many pairwise transform operations.
Introduce an additional "expensive" check that guards against consuming a
handle that is assocaited with the same payload IR operation more than once as
this is likely to lead to double-free or other undesired effects.
Depends On D129110
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D129216
This Transform dialect op allows one to merge the lists of Payload IR
operations pointed to by several handles into a single list associated with one
handle. This is an important Transform dialect usability improvement for cases
where transformations may temporarily diverge for different groups of Payload
IR ops before converging back to the same script. Without this op, several
copies of the trailing transformations would have to be present in the
transformation script.
Depends On D129090
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D129110
Introduce a new transformation on structured ops that splits the iteration
space into two parts along the specified dimension. The index at which the
splitting happens may be static or dynamic. This transformation can be seen as
a rudimentary form of index-set splitting that only supports the splitting
along hyperplanes parallel to the iteration space hyperplanes, and is therefore
decomposable into per-dimension application.
It is a key low-level transformation that enables independent scheduling for
different parts of the iteration space of the same op, which hasn't been
possible previously. It may be used to implement, e.g., multi-sized tiling. In
future, peeling can be implemented as a combination of split-off amount
computation and splitting.
The transformation is conceptually close to tiling in its separation of the
iteration and data spaces, but cannot be currently implemented on top of
TilingInterface as the latter does not properly support `linalg.index`
offsetting.
Note that the transformation intentionally bypasses folding of
`tensor.extract_slice` operations when creating them as this folding was found
to prevent repeated splitting of the same operation because due to internal
assumptions about extract/insert_slice combination in dialect utilities.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D129090
This is already partially the case, but we can rely more heavily on interface libraries and how they are imported/exported in other to simplify the implementation of the mlir python functions in Cmake.
This change also makes a couple of other changes:
1) Add a new CMake function which handles "pure" sources. This was done inline previously
2) Moves the headers associated with CAPI libraries to the libraries themselves. These were previously managed in a separate source target. They can now be added directly to the CAPI libraries using DECLARED_HEADERS.
3) Cleanup some dependencies that showed up as an issue during the refactor
This is a big CMake change that should produce no impact on the build of mlir and on the produced *build tree*. However, this change fixes an issue with the *install tree* of mlir which was previously unusable for projects like torch-mlir because both the "pure" and "extension" targets were pointing to either the build or source trees.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D128230
This aligns the SCF dialect file layout with the majority of the dialects.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D128049
Also complete the set by adding a variant of depthwise 1d convolution
with the multiplier != 1.
Differential Revision: https://reviews.llvm.org/D127687
If `copy` is specified, the newly allocated buffer is initialized with the given contents. Also add an optional `escape` attribute to indicate whether the buffer of the tensor may be returned from the parent block (aka. "escape") after bufferization.
This change is in preparation of connecting One-Shot Bufferize to the sparse compiler.
Differential Revision: https://reviews.llvm.org/D126570
Introduce transform ops for "for" loops, in particular for peeling, software
pipelining and unrolling, along with a couple of "IR navigation" ops. These ops
are intended to be generalized to different kinds of loops when possible and
therefore use the "loop" prefix. They currently live in the SCF dialect as
there is no clear place to put transform ops that may span across several
dialects, this decision is postponed until the ops actually need to handle
non-SCF loops.
Additionally refactor some common utilities for transform ops into trait or
interface methods, and change the loop pipelining to be a returning pattern.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D127300
Implement the C-API and Python bindings for the builtin opaque type, which was previously missing.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D127303
Similar to complex128/complex64, float16 has no direct support
in the ctypes implementation. This fixes the issue by using a
custom F16 type to change the view in and out of MLIR code
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D126928
These ops complement the tiling/padding transformations by transforming
higher-level named structured operations such as depthwise convolutions into
lower-level and/or generic equivalents that are better handled by some
downstream transformations.
Differential Revision: https://reviews.llvm.org/D126698
There is no direct ctypes for MLIR's complex (and thus np.complex128
and np.complex64) yet, causing the mlir python binding methods for
memrefs to crash. This revision fixes this by passing complex arrays
as tuples of floats, correcting at the boundaries for the proper view.
NOTE: some of these changes (4 -> 2) were forced by the new "linting"
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D126422
Python bindings for extensions of the Transform dialect are defined in separate
Python source files that can be imported on-demand, i.e., that are not imported
with the "main" transform dialect. This requires a minor addition to the
ODS-based bindings generator. This approach is consistent with the current
model for downstream projects that are expected to bundle MLIR Python bindings:
such projects can include their custom extensions into the bundle similarly to
how they include their dialects.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D126208
This diff modifies `mlir-tblgen` to generate Python Operation class `__init__()`
functions that use Python keyword-only arguments.
Previously, all `__init__()` function arguments were positional. Python code to
create MLIR Operations was required to provide values for ALL builder arguments,
including optional arguments (attributes and operands). Callers that did not
provide, for example, an optional attribute would be forced to provide `None`
as an argument for EACH optional attribute. Proposed changes in this diff use
`tblgen` record information (as provided by ODS) to generate keyword arguments
for:
- optional operands
- optional attributes (which includes unit attributes)
- default-valued attributes
These `__init__()` function keyword arguments have default `None` values (i.e.
the argument form is `optionalAttr=None`), allowing callers to create Operations
more easily.
Note that since optional arguments become keyword-only arguments (since they are
placed after the bare `*` argument), this diff will require ALL optional
operands and attributes to be provided using explicit keyword syntax. This may,
in the short term, break any out-of-tree Python code that provided values via
positional arguments. However, in the long term, it seems that requiring
keywords for optional arguments will be more robust to operation changes that
add arguments.
Tests were modified to reflect the updated Operation builder calling convention.
This diff partially addresses the requests made in the github issue below.
https://github.com/llvm/llvm-project/issues/54932
Reviewed By: stellaraccident, mikeurbach
Differential Revision: https://reviews.llvm.org/D124717
This change adds a new op `alloc_tensor` to the bufferization dialect. During bufferization, this op is always lowered to a buffer allocation (unless it is "eliminated" by a pre-processing pass). It is useful to have such an op in tensor land, because it allows users to model tensor SSA use-def chains (which drive bufferization decisions) and because tensor SSA use-def chains can be analyzed by One-Shot Bufferize, while memref values cannot.
This change also replaces all uses of linalg.init_tensor in bufferization-related code with bufferization.alloc_tensor.
linalg.init_tensor and bufferization.alloc_tensor are similar, but the purpose of the former one is just to carry a shape. It does not indicate a memory allocation.
linalg.init_tensor is not suitable for modelling SSA use-def chains for bufferization purposes, because linalg.init_tensor is marked as not having side effects (in contrast to alloc_tensor). As such, it is legal to move linalg.init_tensor ops around/CSE them/etc. This is not desirable for alloc_tensor; it represents an explicit buffer allocation while still in tensor land and such allocations should not suddenly disappear or get moved around when running the canonicalizer/CSE/etc.
BEGIN_PUBLIC
No public commit message needed for presubmit.
END_PUBLIC
Differential Revision: https://reviews.llvm.org/D126003
This allows for using attribute types in result type inference for use with
InferTypeOpInterface. This was a TODO before, but it isn't much
additional work to properly support this. After this commit,
arith::ConstantOp can now have its InferTypeOpInterface implementation automatically
generated.
Differential Revision: https://reviews.llvm.org/D124580