Commit Graph

1063 Commits

Author SHA1 Message Date
Tim Shen d00f5632f3 [mlir] Add a simplifying wrapper for generateCopy and expose it.
Summary:
affineDataCopyGenerate is a monolithinc function that
combines several steps for good reasons, but it makes customizing
the behaivor even harder. The major two steps by affineDataCopyGenerate are:
a) Identify interesting memrefs and collect their uses.
b) Create new buffers to forward these uses.

Step (a) actually has requires tremendous customization options. One could see
that from the recently added filterMemRef parameter.

This patch adds a function that only does (b), in the hope that (a)
can be directly implemented by the callers. In fact, (a) is quite
simple if the caller has only one buffer to consider, or even one use.

Differential Revision: https://reviews.llvm.org/D75965
2020-03-11 16:22:31 -07:00
Tim Shen ced0dd8e51 [MLIR] Guard DMA-specific logic with DMA option
Differential Revision: https://reviews.llvm.org/D75963
2020-03-11 11:23:13 -07:00
River Riddle 153720a0a5 [mlir][NFC] Move the interfaces and traits for side effects out of IR/ to Interfaces/
Summary:
Interfaces/ is the designated directory for these types of interfaces, and also removes the need for including them directly in IR/.

Differential Revision: https://reviews.llvm.org/D75886
2020-03-10 12:45:45 -07:00
River Riddle 7ce1e7ab07 [mlir][NFC] Move the operation interfaces out of Analysis/ and into a new Interfaces/ directory.
The interfaces themselves aren't really analyses, they may be used by analyses though. Having them in Analysis can also create cyclic dependencies if an analysis depends on a specific dialect, that also provides one of the interfaces.

Differential Revision: https://reviews.llvm.org/D75867
2020-03-10 12:45:45 -07:00
River Riddle b10c662514 [mlir][SideEffects] Replace the old SideEffects dialect interface with the newly added op interfaces/traits.
Summary:
The old interface was a temporary stopgap to allow for implementing simple LICM that took side effects of region operations into account. Now that MLIR has proper support for specifying memory effects, this interface can be deleted.

Differential Revision: https://reviews.llvm.org/D74441
2020-03-09 16:02:21 -07:00
Uday Bondhugula 82e9160aab [MLIR][Affine] NFC: add convenience method for affine data copy for a loop body
add convenience method for affine data copy generation for a loop body

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D75822
2020-03-09 04:23:54 +00:00
Valentin Churavy 7c64f6bf52 [MLIR] Add support for libMLIR.so
Putting this up mainly for discussion on
how this should be done. I am interested in MLIR from
the Julia side and we currently have a strong preference
to dynamically linking against the LLVM shared library,
and would like to have a MLIR shared library.

This patch adds a new cmake function add_mlir_library()
which accumulates a list of targets to be compiled into
libMLIR.so.  Note that not all libraries make sense to
be compiled into libMLIR.so.  In particular, we want
to avoid libraries which primarily exist to support
certain tools (such as mlir-opt and mlir-cpu-runner).

Note that the resulting libMLIR.so depends on LLVM, but
does not contain any LLVM components.  As a result, it
is necessary to link with libLLVM.so to avoid linkage
errors. So, libMLIR.so requires LLVM_BUILD_LLVM_DYLIB=on

FYI, Currently it appears that LLVM_LINK_LLVM_DYLIB is broken
because mlir-tblgen is linked against libLLVM.so and
and independent LLVM components.

Previous version of this patch broke depencies on TableGen
targets.  This appears to be because it compiled all
libraries to OBJECT libraries (probably because cmake
is generating different target names).  Avoiding object
libraries results in correct dependencies.

(updated by Stephen Neuendorffer)

Differential Revision: https://reviews.llvm.org/D73130
2020-03-06 13:25:18 -08:00
Stephen Neuendorffer 4594d0e943 [MLIR] Move from add_dependencies() to DEPENDS
add_llvm_library and add_llvm_executable may need to create new targets with
appropriate dependencies.  As a result, it is not sufficient in some
configurations (namely LLVM_BUILD_LLVM_DYLIB=on) to only call
add_dependencies().  Instead, the explicit TableGen dependencies must
be passed to add_llvm_library() or add_llvm_executable() using the DEPENDS
keyword.

Differential Revision: https://reviews.llvm.org/D74930
2020-03-06 13:25:17 -08:00
Stephen Neuendorffer 1c82dd39f9 [MLIR] Ensure that target_link_libraries() always has a keyword.
CMake allows calling target_link_libraries() without a keyword,
but this usage is not preferred when also called with a keyword,
and has surprising behavior.  This patch explicitly specifies a
keyword when using target_link_libraries().

Differential Revision: https://reviews.llvm.org/D75725
2020-03-06 09:14:01 -08:00
River Riddle cb1777127c [mlir] Remove successor operands from the Operation class
Summary:
This revision removes all of the functionality related to successor operands on the core Operation class. This greatly simplifies a lot of handling of operands, as well as successors. For example, DialectConversion no longer needs a special "matchAndRewrite" for branching terminator operations.(Note, the existing method was also broken for operations with variadic successors!!)

This also enables terminator operations to define their own relationships with successor arguments, instead of the hardcoded "pass-through" behavior that exists today.

Differential Revision: https://reviews.llvm.org/D75318
2020-03-05 12:53:02 -08:00
River Riddle 988249a506 [mlir] Refactor a few users to no longer rely on the successor operand API of Operation.
The existing API for successor operands on operations is in the process of being removed. This revision simplifies a later one that completely removes the existing API.

Differential Revision: https://reviews.llvm.org/D75316
2020-03-05 12:51:59 -08:00
Matthias Kramm 7a25bd1d19 [mlir][DialectConversion] Abort early if a subregion has a disconnected CFG.
Summary:
Make computeConversionSet bubble up errors from nested regions. Note
that this doesn't change top-level behavior - since the nested region
calls emitError, the error was visible before, just not surfaced as
quickly.

Differential Revision: https://reviews.llvm.org/D75369
2020-03-02 09:28:21 -08:00
River Riddle de5a81b102 [mlir] Update several usages of IntegerType to properly handled unsignedness.
Summary: For example, DenseElementsAttr currently does not properly round-trip unsigned integer values.

Differential Revision: https://reviews.llvm.org/D75374
2020-03-02 09:19:26 -08:00
Stephen Neuendorffer 798e661567 Revert "[MLIR] Move from using target_link_libraries to LINK_LIBS for llvm libraries."
This reverts commit 7a6c689771.
This breaks the build with cmake 3.13.4, but succeeds with cmake 3.15.3
2020-02-29 11:52:08 -08:00
Stephen Neuendorffer d675df0379 Revert "[MLIR] Move from add_dependencies() to DEPENDS"
This reverts commit 31e07d716a.
2020-02-29 11:52:08 -08:00
Stephen Neuendorffer dd046c9612 Revert "[MLIR] Add support for libMLIR.so"
This reverts commit e17d9c11d4.
It breaks the build.
2020-02-29 11:09:21 -08:00
Valentin Churavy e17d9c11d4 [MLIR] Add support for libMLIR.so
Putting this up mainly for discussion on
how this should be done. I am interested in MLIR from
the Julia side and we currently have a strong preference
to dynamically linking against the LLVM shared library,
and would like to have a MLIR shared library.

This patch adds a new cmake function add_mlir_library()
which accumulates a list of targets to be compiled into
libMLIR.so.  Note that not all libraries make sense to
be compiled into libMLIR.so.  In particular, we want
to avoid libraries which primarily exist to support
certain tools (such as mlir-opt and mlir-cpu-runner).

Note that the resulting libMLIR.so depends on LLVM, but
does not contain any LLVM components.  As a result, it
is necessary to link with libLLVM.so to avoid linkage
errors. So, libMLIR.so requires LLVM_BUILD_LLVM_DYLIB=on

FYI, Currently it appears that LLVM_LINK_LLVM_DYLIB is broken
because mlir-tblgen is linked against libLLVM.so and
and independent LLVM components.

Previous version of this patch broke depencies on TableGen
targets.  This appears to be because it compiled all
libraries to OBJECT libraries (probably because cmake
is generating different target names).  Avoiding object
libraries results in correct dependencies.

(updated by Stephen Neuendorffer)

Differential Revision: https://reviews.llvm.org/D73130
2020-02-29 10:47:27 -08:00
Stephen Neuendorffer 31e07d716a [MLIR] Move from add_dependencies() to DEPENDS
add_llvm_library and add_llvm_executable may need to create new targets with
appropriate dependencies.  As a result, it is not sufficient in some
configurations (namely LLVM_BUILD_LLVM_DYLIB=on) to only call
add_dependencies().  Instead, the explicit TableGen dependencies must
be passed to add_llvm_library() or add_llvm_executable() using the DEPENDS
keyword.

Differential Revision: https://reviews.llvm.org/D74930
2020-02-29 10:47:27 -08:00
Stephen Neuendorffer 7a6c689771 [MLIR] Move from using target_link_libraries to LINK_LIBS for llvm libraries.
When compiling libLLVM.so, add_llvm_library() manipulates the link libraries
being used.  This means that when using add_llvm_library(), we need to pass
the list of libraries to be linked (using the LINK_LIBS keyword) instead of
using the standard target_link_libraries call.  This is preparation for
properly dealing with creating libMLIR.so as well.

Differential Revision: https://reviews.llvm.org/D74864
2020-02-29 10:47:26 -08:00
Stephen Neuendorffer dc1056a3f1 Revert "[MLIR] Move from using target_link_libraries to LINK_LIBS for llvm libraries."
This reverts commit 2f265e3528.
2020-02-28 14:13:30 -08:00
Stephen Neuendorffer 67f2a43cf8 Revert "[MLIR] Move from add_dependencies() to DEPENDS"
This reverts commit 8a2b86b2c2.
2020-02-28 12:17:40 -08:00
Stephen Neuendorffer c6f3fc4999 Revert "[MLIR] Add support for libMLIR.so"
This reverts commit 1246e86716.
2020-02-28 12:17:39 -08:00
Valentin Churavy 1246e86716 [MLIR] Add support for libMLIR.so
Putting this up mainly for discussion on
how this should be done. I am interested in MLIR from
the Julia side and we currently have a strong preference
to dynamically linking against the LLVM shared library,
and would like to have a MLIR shared library.

This patch adds a new cmake function add_mlir_library()
which accumulates a list of targets to be compiled into
libMLIR.so.  Note that not all libraries make sense to
be compiled into libMLIR.so.  In particular, we want
to avoid libraries which primarily exist to support
certain tools (such as mlir-opt and mlir-cpu-runner).

Note that the resulting libMLIR.so depends on LLVM, but
does not contain any LLVM components.  As a result, it
is necessary to link with libLLVM.so to avoid linkage
errors. So, libMLIR.so requires LLVM_BUILD_LLVM_DYLIB=on

FYI, Currently it appears that LLVM_LINK_LLVM_DYLIB is broken
because mlir-tblgen is linked against libLLVM.so and
and independent LLVM components

(updated by Stephen Neuendorffer)

Differential Revision: https://reviews.llvm.org/D73130
2020-02-28 11:35:19 -08:00
Stephen Neuendorffer 8a2b86b2c2 [MLIR] Move from add_dependencies() to DEPENDS
add_llvm_library and add_llvm_executable may need to create new targets with
appropriate dependencies.  As a result, it is not sufficient in some
configurations (namely LLVM_BUILD_LLVM_DYLIB=on) to only call
add_dependencies().  Instead, the explicit TableGen dependencies must
be passed to add_llvm_library() or add_llvm_executable() using the DEPENDS
keyword.

Differential Revision: https://reviews.llvm.org/D74930
2020-02-28 11:35:18 -08:00
Stephen Neuendorffer 2f265e3528 [MLIR] Move from using target_link_libraries to LINK_LIBS for llvm libraries.
When compiling libLLVM.so, add_llvm_library() manipulates the link libraries
being used.  This means that when using add_llvm_library(), we need to pass
the list of libraries to be linked (using the LINK_LIBS keyword) instead of
using the standard target_link_libraries call.  This is preparation for
properly dealing with creating libMLIR.so as well.

Differential Revision: https://reviews.llvm.org/D74864
2020-02-28 11:35:17 -08:00
Rob Suderman 69d757c0e8 Move StandardOps/Ops.h to StandardOps/IR/Ops.h
Summary:
NFC - Moved StandardOps/Ops.h to a StandardOps/IR dir to better match surrounding
directories. This is to match other dialects, and prepare for moving StandardOps
related transforms in out for Transforms and into StandardOps/Transforms.

Differential Revision: https://reviews.llvm.org/D74940
2020-02-21 11:58:47 -08:00
Lei Zhang 35b685270b [mlir] Add a signedness semantics bit to IntegerType
Thus far IntegerType has been signless: a value of IntegerType does
not have a sign intrinsically and it's up to the specific operation
to decide how to interpret those bits. For example, std.addi does
two's complement arithmetic, and std.divis/std.diviu treats the first
bit as a sign.

This design choice was made some time ago when we did't have lots
of dialects and dialects were more rigid. Today we have much more
extensible infrastructure and different dialect may want different
modelling over integer signedness. So while we can say we want
signless integers in the standard dialect, we cannot dictate for
others. Requiring each dialect to model the signedness semantics
with another set of custom types is duplicating the functionality
everywhere, considering the fundamental role integer types play.

This CL extends the IntegerType with a signedness semantics bit.
This gives each dialect an option to opt in signedness semantics
if that's what they want and helps code sharing. The parser is
modified to recognize `si[1-9][0-9]*` and `ui[1-9][0-9]*` as
signed and unsigned integer types, respectively, leaving the
original `i[1-9][0-9]*` to continue to mean no indication over
signedness semantics. All existing dialects are not affected (yet)
as this is a feature to opt in.

More discussions can be found at:

https://groups.google.com/a/tensorflow.org/d/msg/mlir/XmkV8HOPWpo/7O4X0Nb_AQAJ

Differential Revision: https://reviews.llvm.org/D72533
2020-02-21 09:16:54 -05:00
Diego Caballero 376c68539c [mlir][NFC] Fix 'gatherLoops' utility
It replaces DenseMap output with a SmallVector and it
removes empty loop levels from the output.

Reviewed By: andydavis1, mehdi_amini

Differential Revision: https://reviews.llvm.org/D74658
2020-02-19 10:48:14 -08:00
River Riddle 0d7ff220ed [mlir] Refactor TypeConverter to add conversions without inheritance
Summary:
This revision refactors the TypeConverter class to not use inheritance to add type conversions. It instead moves to a registration based system, where conversion callbacks are added to the converter with `addConversion`. This method takes a conversion callback, which must be convertible to any of the following forms(where `T` is a class derived from `Type`:
* Optional<Type> (T type)
   - This form represents a 1-1 type conversion. It should return nullptr
     or `llvm::None` to signify failure. If `llvm::None` is returned, the
     converter is allowed to try another conversion function to perform
     the conversion.
* Optional<LogicalResult>(T type, SmallVectorImpl<Type> &results)
   - This form represents a 1-N type conversion. It should return
     `failure` or `llvm::None` to signify a failed conversion. If the new
     set of types is empty, the type is removed and any usages of the
     existing value are expected to be removed during conversion. If
     `llvm::None` is returned, the converter is allowed to try another
     conversion function to perform the conversion.

When attempting to convert a type, the TypeConverter walks each of the registered converters starting with the one registered most recently.

Differential Revision: https://reviews.llvm.org/D74584
2020-02-18 16:17:48 -08:00
Diego Caballero d7058acc14 [mlir] Add MemRef filter to affine data copy optimization
This patch extends affine data copy optimization utility with an
optional memref filter argument. When the memref filter is used, data
copy optimization will only generate copies for such a memref.

Note: this patch is just porting the memref filter feature from Uday's
'hop' branch: https://github.com/bondhugula/llvm-project/tree/hop.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D74342
2020-02-14 13:41:45 -08:00
Mehdi Amini c64770506b Remove static registration for dialects, and the "alwayslink" hack for passes
In the previous state, we were relying on forcing the linker to include
all libraries in the final binary and the global initializer to self-register
every piece of the system. This change help moving away from this model, and
allow users to compose pieces more freely. The current change is only "fixing"
the dialect registration and avoiding relying on "whole link" for the passes.
The translation is still relying on the global registry, and some refactoring
is needed to make this all more convenient.

Differential Revision: https://reviews.llvm.org/D74461
2020-02-12 09:13:02 +00:00
Andy Davis 40b2eb3530 [mlir][AffineOps] Adds affine loop fusion transformation function to LoopFusionUtils.
Summary:
Adds affine loop fusion transformation function to LoopFusionUtils.
Updates TestLoopFusion utility to run loop fusion transformation until a fixed point is reached.
Adds unit tests to test the transformation.
Includes ASAN bug fix for D73190.

Reviewers: bondhugula, dcaballe

Reviewed By: bondhugula, dcaballe

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74330
2020-02-11 13:56:26 -08:00
Stephen Neuendorffer b80a9ca8cb [MLIR] Allow non-binary operations to be commutative
NFC for binary operations.

Differential Revision: https://reviews.llvm.org/D73670
2020-02-10 10:23:55 -08:00
Alex Zinenko 5a1778057f [mlir] use unpacked memref descriptors at function boundaries
The existing (default) calling convention for memrefs in standard-to-LLVM
conversion was motivated by interfacing with LLVM IR produced from C sources.
In particular, it passes a pointer to the memref descriptor structure when
calling the function. Therefore, the descriptor is allocated on stack before
the call. This convention leads to several problems. PR44644 indicates a
problem with stack exhaustion when calling functions with memref-typed
arguments in a loop. Allocating outside of the loop may lead to concurrent
access problems in case the loop is parallel. When targeting GPUs, the contents
of the stack-allocated memory for the descriptor (passed by pointer) needs to
be explicitly copied to the device. Using an aggregate type makes it impossible
to attach pointer-specific argument attributes pertaining to alignment and
aliasing in the LLVM dialect.

Change the default calling convention for memrefs in standard-to-LLVM
conversion to transform a memref into a list of arguments, each of primitive
type, that are comprised in the memref descriptor. This avoids stack allocation
for ranked memrefs (and thus stack exhaustion and potential concurrent access
problems) and simplifies the device function invocation on GPUs.

Provide an option in the standard-to-LLVM conversion to generate auxiliary
wrapper function with the same interface as the previous calling convention,
compatible with LLVM IR porduced from C sources. These auxiliary functions
pack the individual values into a descriptor structure or unpack it. They also
handle descriptor stack allocation if necessary, serving as an allocation
scope: the memory reserved by `alloca` will be freed on exiting the auxiliary
function.

The effect of this change on MLIR-generated only LLVM IR is minimal. When
interfacing MLIR-generated LLVM IR with C-generated LLVM IR, the integration
only needs to require auxiliary functions and change the function name to call
the wrapper function instead of the original function.

This also opens the door to forwarding aliasing and alignment information from
memrefs to LLVM IR pointers in the standrd-to-LLVM conversion.
2020-02-10 15:03:43 +01:00
River Riddle abe3e5babd [mlir] Add support for generating debug locations from intermediate levels of the IR.
Summary:
This revision adds a utility to generate debug locations from the IR during compilation, by snapshotting to a output stream and using the locations that operations were dumped in that stream. The new locations may either;
* Replace the original location of the operation.

old:
   loc("original_source.cpp":1:1)
new:
   loc("snapshot_source.mlir":10:10)

* Fuse with the original locations as NamedLocs with a specific tag.

old:
    loc("original_source.cpp":1:1)
new:
    loc(fused["original_source.cpp":1:1, "snapshot"("snapshot_source.mlir":10:10)])

This feature may be used by a debugger to display the code at various different levels of the IR. It would also be able to show the different levels of IR attached to a specific source line in the original source file.

This feature may also be used to generate locations for operations generated during compilation, that don't necessarily have a user source location to attach to.

This requires changes in the printer to track the locations of operations emitted in the stream. Moving forward we need to properly(and efficiently) track the number of newlines emitted to the stream during printing.

Differential Revision: https://reviews.llvm.org/D74019
2020-02-08 15:11:29 -08:00
River Riddle 5c159b91a2 [mlir] Add a utility method on CallOpInterface for resolving the callable.
Summary: This is the most common operation performed on a CallOpInterface. This just moves the existing functionality from the CallGraph so that other users can access it.

Differential Revision: https://reviews.llvm.org/D74250
2020-02-08 10:44:29 -08:00
River Riddle 1eaa31ce0e [mlir][DialectConversion] Change erroneous return to a continue
This fixes a nasty bug where the loop would return prematurely when
notifying the argument converter that an operation was removed.
2020-02-06 17:55:14 -08:00
Mehdi Amini 2724ada8d2 Revert "[mlir] Adds affine loop fusion transformation function to LoopFusionUtils."
This reverts commit 64871f778d.

ASAN indicates a use-after-free in in mlir::canFuseLoops(mlir::AffineForOp, mlir::AffineForOp, unsigned int, mlir::ComputationSliceState*) lib/Transforms/Utils/LoopFusionUtils.cpp:202:41
2020-02-06 16:46:28 +00:00
River Riddle c33d6970e0 [mlir] Add support for basic location translation to LLVM.
Summary:
This revision adds basic support for emitting line table information when exporting to LLVMIR. We don't yet have a story for supporting all of the LLVM debug metadata, so this revision stubs some features(like subprograms) to enable emitting line tables.

Differential Revision: https://reviews.llvm.org/D73934
2020-02-05 17:41:51 -08:00
Andy Davis 64871f778d [mlir] Adds affine loop fusion transformation function to LoopFusionUtils.
Summary:
Adds affine loop fusion transformation function to LoopFusionUtils.
Updates TestLoopFusion utility to run loop fusion transformation until a fixed point is reached.
Adds unit tests to test the transformation.

Reviewers: bondhugula, dcaballe, nicolasvasilache

Reviewed By: bondhugula, dcaballe

Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73190
2020-02-05 16:01:06 -08:00
Stephen Neuendorffer 7b7e505813 [MLIR] Break cyclic dependencies with MLIRAnalysis
Summary:

MLIRAnalysis depended on MLIRVectorOps
MLIRVectorOps depended on MLIRAnalysis for Loop information.

Both of these can be solved by factoring out libraries related to loop
analysis into their own library. The new MLIRLoopAnalysis might be
better off with the Loop Dialect in the future.

Reviewers: nicolasvasilache, rriddle!, mehdi_amini

Reviewed By: mehdi_amini

Subscribers: Joonsoo, vchuravy, merge_guards_bot, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73655
2020-02-05 11:27:28 -08:00
Stephen Neuendorffer b3dd31711a [MLIR] Move test passes out of lib/Analysis
Summary:

This breaks a cyclic library dependency where MLIRPass used the verifier
in MLIRAnalysis, but MLIRAnalysis also contained passes used for testing.
The presence of the test passes here is archaeology, predating
test/lib/Transform.

Reviewers: rriddle

Reviewed By: rriddle

Subscribers: merge_guards_bot, mgorny, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74067
2020-02-05 11:26:49 -08:00
Dimitry Andric 31fd112eb4 Fix x86 32bits MLIR build (NFC)
This is fixing a build error:

error: non-constant-expression cannot be narrowed from type 'unsigned int' to 'Region::iterator::difference_type' (aka 'int') in initializer list

Fix pr44767
2020-02-04 23:58:58 +00:00
Jacques Pienaar 1544cf2d7c [mlir] Fix errors in release & no-assert
Seen on gcc 8, in release mode & assertions off warnings about logger,
made all statements referencing logger inside LLVM_DEBUG blocks and
ifdef a few variables only used in debug.

This is mechanical fix to get CI green.
2020-02-01 08:57:01 -08:00
River Riddle 75c328179e [mlir][DialectConversion] Remove invalid NDEBUG wrapper.
The functions are used, but empty when NDEBUG is set.
2020-01-31 13:26:49 -08:00
River Riddle 4948b8b3cf [mlir][NFC] Refactor DialectConversion debug logging
Summary:
This revision beefs up the debug logging within dialect conversion. Given the nature of multi-level legalization, and legalization in general, it is one of the harder pieces of infrastructure to debug. This revision adds nice formatting to make the output log easier to parse:

```
Legalizing operation : 'std.constant'(0x608000002420) {
  * Fold {
  } -> FAILURE : unable to fold

  * Pattern : 'std.constant -> ()' {
  } -> FAILURE : pattern failed to match

  * Pattern : 'std.constant -> ()' {
  } -> FAILURE : pattern failed to match

  * Pattern : 'std.constant -> (spv.constant)' {
    ** Insert  : 'spv.constant'(0x608000002c20)
    ** Replace : 'std.constant'(0x608000002420)

    //===-------------------------------------------===//
    Legalizing operation : 'spv.constant'(0x608000002c20) {
    } -> SUCCESS : operation marked legal by the target
    //===-------------------------------------------===//
  } -> SUCCESS : pattern applied successfully
} -> SUCCESS
```

Differential Revision: https://reviews.llvm.org/D73747
2020-01-31 12:07:17 -08:00
Tim Shen 3ccaac3cdd [mlir] Add MemRefTypeBuilder and refactor some MemRefType::get().
The refactored MemRefType::get() calls all intend to clone from another
memref type, with some modifications. In fact, some calls dropped memory space
during the cloning. Migrate them to the cloning API so that nothing gets
dropped if they are not explicitly listed.

It's close to NFC but not quite, as it helps with propagating memory spaces in
some places.

Differential Revision: https://reviews.llvm.org/D73296
2020-01-30 23:30:46 -08:00
River Riddle 6b9e2be8ec [mlir][NFC] Explicitly initialize dynamic legality when setting op
action.
2020-01-30 00:21:32 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
River Riddle b276dec5b6 [mlir] Add a DCE pass for dead symbols.
Summary: This pass deletes all symbols that are found to be unreachable. This is done by computing the set of operations that are known to be live, propagating that liveness to other symbols, and then deleting all symbols that are not within this live set.

Differential Revision: https://reviews.llvm.org/D72482
2020-01-27 23:29:30 -08:00
River Riddle aff4ed7326 [mlir][NFC] Update Operation::getResultTypes to use ArrayRef<Type> instead of iterator_range.
Summary: The new internal representation of operation results now allows for accessing the result types to be more efficient. Changing the API to ArrayRef is more efficient and removes the need to explicitly materialize vectors in several places.

Differential Revision: https://reviews.llvm.org/D73429
2020-01-27 19:57:48 -08:00
River Riddle ce674b131b [mlir] Add support for marking 'unknown' operations as dynamically legal.
Summary: This allows for providing a default "catchall" legality check that is not dependent on specific operations or dialects. For example, this can be useful to check legality based on the specific types of operation operands or results.

Differential Revision: https://reviews.llvm.org/D73379
2020-01-27 19:50:52 -08:00
Diego Caballero 6fb3d59746 [mlir] Remove 'valuesToRemoveIfDead' from PatternRewriter API
Summary:
Remove 'valuesToRemoveIfDead' from PatternRewriter API. The removal
functionality wasn't implemented and we decided [1] not to implement it in
favor of having more powerful DCE approaches.

[1] https://github.com/tensorflow/mlir/pull/212

Reviewers: rriddle, bondhugula

Reviewed By: rriddle

Subscribers: liufengdb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72545
2020-01-27 14:00:34 -08:00
Mehdi Amini 308571074c Mass update the MLIR license header to mention "Part of the LLVM project"
This is an artifact from merging MLIR into LLVM, the file headers are
now aligned with the rest of the project.
2020-01-26 03:58:30 +00:00
Ahmed Taei 8d1ed2940d [mlir] Fix vectorize transform crashing on none-op operand 2020-01-23 09:57:16 -08:00
Kazuaki Ishizaki fc817b09e2 [mlir] NFC: Fix trivial typos in comments
Differential Revision: https://reviews.llvm.org/D73012
2020-01-20 03:17:03 +00:00
aartbik 0361a961c2 [mlir] [VectorOps] Rename Utils.h into VectorUtils.h
Summary:
First step towards the consolidation
of a lot of vector related utilities
that are now all over the place
(or even duplicated).

Reviewers: nicolasvasilache, andydavis1

Reviewed By: nicolasvasilache, andydavis1

Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72955
2020-01-17 13:39:34 -08:00
Benjamin Kramer df186507e1 Make helper functions static or move them into anonymous namespaces. NFC. 2020-01-14 14:06:37 +01:00
River Riddle c774840492 [mlir] Update the CallGraph for nested symbol references, and simplify CallableOpInterface
Summary:
This enables tracking calls that cross symbol table boundaries. It also simplifies some of the implementation details of CallableOpInterface, i.e. there can only be one region within the callable operation.

Depends On D72042

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D72043
2020-01-13 15:51:28 -08:00
River Riddle cb89c7e3f7 [mlir] Remove unnecessary assert for single region.
This was left over debugging.
2020-01-13 13:55:50 -08:00
River Riddle 2bdf33cc4c [mlir] NFC: Remove Value::operator* and Value::operator-> now that Value is properly value-typed.
Summary: These were temporary methods used to simplify the transition.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D72548
2020-01-11 08:54:39 -08:00
River Riddle 0d6ebb4f0d [mlir] Refactor operation results to use a single use list for all results of the operation.
Summary: A new class is added, IRMultiObjectWithUseList, that allows for representing an IR use list that holds multiple sub values(used in this case for OpResults). This class provides all of the same functionality as the base IRObjectWithUseList, but for specific sub-values. This saves a word per operation result and is a necessary step in optimizing the layout of operation results. For now the use list is placed on the operation itself, so zero-result operations grow by a word. When the work for optimizing layout is finished, this can be moved back to being a trailing object based on memory/runtime benchmarking.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D71955
2019-12-30 20:50:07 -08:00
River Riddle e62a69561f NFC: Replace ValuePtr with Value and remove it now that Value is value-typed.
ValuePtr was a temporary typedef during the transition to a value-typed Value.

PiperOrigin-RevId: 286945714
2019-12-23 16:36:53 -08:00
River Riddle 5d5bd2e1da Change the `notifyRootUpdated` API to be transaction based.
This means that in-place, or root, updates need to use explicit calls to `startRootUpdate`, `finalizeRootUpdate`, and `cancelRootUpdate`. The major benefit of this change is that it enables in-place updates in DialectConversion, which simplifies the FuncOp pattern for example. The major downside to this is that the cases that *may* modify an operation in-place will need an explicit cancel on the failure branches(assuming that they started an update before attempting the transformation).

PiperOrigin-RevId: 286933674
2019-12-23 16:26:15 -08:00
Mehdi Amini 56222a0694 Adjust License.txt file to use the LLVM license
PiperOrigin-RevId: 286906740
2019-12-23 15:33:37 -08:00
River Riddle 35807bc4c5 NFC: Introduce new ValuePtr/ValueRef typedefs to simplify the transition to Value being value-typed.
This is an initial step to refactoring the representation of OpResult as proposed in: https://groups.google.com/a/tensorflow.org/g/mlir/c/XXzzKhqqF_0/m/v6bKb08WCgAJ

This change will make it much simpler to incrementally transition all of the existing code to use value-typed semantics.

PiperOrigin-RevId: 286844725
2019-12-22 22:00:23 -08:00
Manuel Freiberger 22954a0e40 Add integer bit-shift operations to the standard dialect.
Rename the 'shlis' operation in the standard dialect to 'shift_left'. Add tests
for this operation (these have been missing so far) and add a lowering to the
'shl' operation in the LLVM dialect.

Add also 'shift_right_signed' (lowered to LLVM's 'ashr') and 'shift_right_unsigned'
(lowered to 'lshr').

The original plan was to name these operations 'shift.left', 'shift.right.signed'
and 'shift.right.unsigned'. This works if the operations are prefixed with 'std.'
in MLIR assembly. Unfortunately during import the short form is ambigous with
operations from a hypothetical 'shift' dialect. The best solution seems to omit
dots in standard operations for now.

Closes tensorflow/mlir#226

PiperOrigin-RevId: 286803388
2019-12-22 10:02:13 -08:00
Sean Silva 553f794b6f Add a couple useful LLVM_DEBUG's to the inliner.
This makes it easier to narrow down on ops that are preventing inlining.

PiperOrigin-RevId: 286243868
2019-12-18 12:33:30 -08:00
River Riddle 2666b97314 NFC: Cleanup non-conforming usages of namespaces.
* Fixes use of anonymous namespace for static methods.
* Uses explicit qualifiers(mlir::) instead of wrapping the definition with the namespace.

PiperOrigin-RevId: 286222654
2019-12-18 10:46:48 -08:00
Uday Bondhugula 47034c4bc5 Introduce prefetch op: affine -> std -> llvm intrinsic
Introduce affine.prefetch: op to prefetch using a multi-dimensional
subscript on a memref; similar to affine.load but has no effect on
semantics, but only on performance.

Provide lowering through std.prefetch, llvm.prefetch and map to llvm's
prefetch instrinsic. All attributes reflected through the lowering -
locality hint, rw, and instr/data cache.

  affine.prefetch %0[%i, %j + 5], false, 3, true : memref<400x400xi32>

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#225

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/225 from bondhugula:prefetch 4c3b4e93bc64d9a5719504e6d6e1657818a2ead0
PiperOrigin-RevId: 286212997
2019-12-18 10:00:04 -08:00
River Riddle 4562e389a4 NFC: Remove unnecessary 'llvm::' prefix from uses of llvm symbols declared in `mlir` namespace.
Aside from being cleaner, this also makes the codebase more consistent.

PiperOrigin-RevId: 286206974
2019-12-18 09:29:20 -08:00
River Riddle 74278dd01e NFC: Use TypeSwitch to simplify existing code.
PiperOrigin-RevId: 286066371
2019-12-17 14:57:41 -08:00
River Riddle ab610e8a99 Insert signature-converted blocks into a region with a parent operation.
This keeps the IR valid and consistent as it is expected that each block should have a valid parent region/operation. Previously, converted blocks were kept floating without a valid parent region.

PiperOrigin-RevId: 285821687
2019-12-16 12:09:45 -08:00
River Riddle b030e4a4ec Try to fold operations in DialectConversion when trying to legalize.
This change allows for DialectConversion to attempt folding as a mechanism to legalize illegal operations. This also expands folding support in OpBuilder::createOrFold to generate new constants when folding, and also enables it to work in the context of a PatternRewriter.

PiperOrigin-RevId: 285448440
2019-12-13 16:47:26 -08:00
River Riddle 851a8516d3 Make OpBuilder::insert virtual instead of OpBuilder::createOperation.
It is sometimes useful to create operations separately from the builder before insertion as it may be easier to erase them in isolation if necessary. One example use case for this is folding, as we will only want to insert newly generated constant operations on success. This has the added benefit of fixing some silent PatternRewriter failures related to cloning, as the OpBuilder 'clone' methods don't call createOperation.

PiperOrigin-RevId: 285086242
2019-12-11 16:26:45 -08:00
Kazuaki Ishizaki ae05cf27c6 Minor spelling tweaks
Closes tensorflow/mlir#304

PiperOrigin-RevId: 284568358
2019-12-09 09:23:48 -08:00
Uday Bondhugula a63f6e0bf9 Replace spurious SmallVector constructions with ValueRange
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#305

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/305 from bondhugula:value_range 21d1fae73f549e3c8e72b60876eff1b864cea39c
PiperOrigin-RevId: 284541027
2019-12-09 06:26:33 -08:00
River Riddle d6ee6a0310 Update the builder API to take ValueRange instead of ArrayRef<Value *>
This allows for users to provide operand_range and result_range in builder.create<> calls, instead of requiring an explicit copy into a separate data structure like SmallVector/std::vector.

PiperOrigin-RevId: 284360710
2019-12-07 10:35:41 -08:00
River Riddle 9d1a0c72b4 Add a new ValueRange class.
This class represents a generic abstraction over the different ways to represent a range of Values: ArrayRef<Value *>, operand_range, result_range. This class will allow for removing the many instances of explicit SmallVector<Value *, N> construction. It has the same memory cost as ArrayRef, and only suffers cost from indexing(if+elsing the different underlying representations).

This change only updates a few of the existing usages, with more to be changed in followups; e.g. 'build' API.

PiperOrigin-RevId: 284307996
2019-12-06 20:07:23 -08:00
Alexandre E. Eichenberger 3c69ca1e69 fix examples in comments
Closes tensorflow/mlir#301

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/301 from AlexandreEichenberger:vect-doc-update 7e5418a9101a4bdad2357882fe660b02bba8bd01
PiperOrigin-RevId: 284202462
2019-12-06 09:40:50 -08:00
Kazuaki Ishizaki 84a6182ddd minor spelling tweaks
Closes tensorflow/mlir#290

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/290 from kiszk:spelling_tweaks_201912 9d9afd16a723dd65754a04698b3976f150a6054a
PiperOrigin-RevId: 284169681
2019-12-06 05:59:30 -08:00
River Riddle 33a64540ad Add support for instance specific pass statistics.
Statistics are a way to keep track of what the compiler is doing and how effective various optimizations are. It is useful to see what optimizations are contributing to making a particular program run faster. Pass-instance specific statistics take this even further as you can see the effect of placing a particular pass at specific places within the pass pipeline, e.g. they could help answer questions like "what happens if I run CSE again here".

Statistics can be added to a pass by simply adding members of type 'Pass::Statistics'. This class takes as a constructor arguments: the parent pass pointer, a name, and a description. Statistics can be dumped by the pass manager in a similar manner to how pass timing information is dumped, i.e. via PassManager::enableStatistics programmatically; or -pass-statistics and -pass-statistics-display via the command line pass manager options.

Below is an example:

struct MyPass : public OperationPass<MyPass> {
  Statistic testStat{this, "testStat", "A test statistic"};

  void runOnOperation() {
    ...
    ++testStat;
    ...
  }
};

$ mlir-opt -pass-pipeline='func(my-pass,my-pass)' foo.mlir -pass-statistics

Pipeline Display:
===-------------------------------------------------------------------------===
                         ... Pass statistics report ...
===-------------------------------------------------------------------------===
'func' Pipeline
  MyPass
    (S) 15 testStat - A test statistic
  MyPass
    (S)  6 testStat - A test statistic

List Display:
===-------------------------------------------------------------------------===
                         ... Pass statistics report ...
===-------------------------------------------------------------------------===
MyPass
  (S) 21 testStat - A test statistic

PiperOrigin-RevId: 284022014
2019-12-05 11:53:28 -08:00
River Riddle 6f895bec7d [CSE] NFC: Hash the attribute dictionary pointer instead of the list of attributes.
PiperOrigin-RevId: 283810829
2019-12-04 12:32:08 -08:00
Nicolas Vasilache edfaf925cf Drop MaterializeVectorTransfers in favor of simpler declarative unrolling
Now that we have unrolling as a declarative pattern, we can drop a full pass that has gone stale. In the future we may want to add specific unrolling patterns for VectorTransferReadOp.

PiperOrigin-RevId: 283806880
2019-12-04 12:11:42 -08:00
Alex Zinenko 75175134d4 Loop coalescing: fix pointer chainsing in use-chain traversal
In the replaceAllUsesExcept utility function called from loop coalescing the
iteration over the use-chain is incorrect. The use list nodes (IROperands) have
next/prev links, and bluntly resetting the use would make the loop to continue
on uses of the value that was replaced instead of the original one. As a
result, it could miss the existing uses and update the wrong ones. Make sure we
increment the iterator before updating the use in the loop body.

Reported-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#291.

PiperOrigin-RevId: 283754195
2019-12-04 07:42:29 -08:00
Nicolas Vasilache 5c0c51a997 Refactor dependencies to expose Vector transformations as patterns - NFC
This CL refactors some of the MLIR vector dependencies to allow decoupling VectorOps, vector analysis, vector transformations and vector conversions from each other.
This makes the system more modular and allows extracting VectorToVector into VectorTransforms that do not depend on vector conversions.

This refactoring exhibited a bunch of cyclic library dependencies that have been cleaned up.

PiperOrigin-RevId: 283660308
2019-12-03 17:52:10 -08:00
Diego Caballero 330d1ff00e AffineLoopFusion: Prevent fusion of multi-out-edge producer loops
tensorflow/mlir#162 introduced a bug that
incorrectly allowed fusion of producer loops with multiple outgoing
edges. This commit fixes that problem. It also introduces a new flag to
disable sibling loop fusion so that we can test producer-consumer fusion
in isolation.

Closes tensorflow/mlir#259

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/259 from dcaballe:dcaballe/fix_multi_out_edge_producer_fusion 578d5661705fd5c56c555832d5e0528df88c5282
PiperOrigin-RevId: 283531105
2019-12-03 06:09:50 -08:00
Mahesh Ravishankar bd485afda0 Introduce attributes that specify the final ABI for a spirv::ModuleOp.
To simplify the lowering into SPIR-V, while still respecting the ABI
requirements of SPIR-V/Vulkan, split the process into two
1) While lowering a function to SPIR-V (when the function is an entry
   point function), allow specifying attributes on arguments and
   function itself that describe the ABI of the function.
2) Add a pass that materializes the ABI described in the function.

Two attributes are needed.
1) Attribute on arguments of the entry point function that describe
   the descriptor_set, binding, storage class, etc, of the
   spv.globalVariable this argument will be replaced by
2) Attribute on function that specifies workgroup size, etc. (for now
   only workgroup size).

Add the pass -spirv-lower-abi-attrs to materialize the ABI described
by the attributes.

This change makes the SPIRVBasicTypeConverter class unnecessary and is
removed, further simplifying the SPIR-V lowering path.

PiperOrigin-RevId: 282387587
2019-11-25 11:19:56 -08:00
Jean-Michel Gorius 104777d8e6 Unify vector op names with other dialects.
Change vector op names from VectorFooOp to Vector_FooOp and from
vector::VectorFooOp to vector::FooOp.

Closes tensorflow/mlir#257

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/257 from Kayjukh:master dfc3a0e04114885aaec8740d5951d6984d6e1577
PiperOrigin-RevId: 281967461
2019-11-22 08:24:49 -08:00
River Riddle 4ea92a0586 NFC: Use Region::getBlocks to fix build failure with drop_begin.
PiperOrigin-RevId: 281656603
2019-11-20 19:30:46 -08:00
River Riddle fafb708b9a Merge DCE and unreachable block elimination into a new utility 'simplifyRegions'.
This moves the different canonicalizations of regions into one place and invokes them in the fixed-point iteration of the canonicalizer.

PiperOrigin-RevId: 281617072
2019-11-20 15:53:19 -08:00
Sean Silva e4f83c6c26 Add multi-level DCE pass.
This is a simple multi-level DCE pass that operates pretty generically on
the IR. Its key feature compared to the existing peephole dead op folding
that happens during canonicalization is being able to delete recursively
dead cycles of the use-def graph, including block arguments.

PiperOrigin-RevId: 281568202
2019-11-20 12:55:10 -08:00
Nicolas Vasilache fa14d4f6ab Implement unrolling of vector ops to finer-grained vector ops as a pattern.
This CL uses the pattern rewrite infrastructure to implement a simple VectorOps -> VectorOps legalization strategy to unroll coarse-grained vector operations into finer grained ones.
The transformation is written using local pattern rewrites to allow composition with other rewrites. It proceeds by iteratively introducing fake cast ops and cleaning canonicalizing or lowering them away where appropriate.

This is an example of writing transformations as compositions of local pattern rewrites that should enable us to make them significantly more declarative.

PiperOrigin-RevId: 281555100
2019-11-20 11:49:36 -08:00
Diego Caballero dd5a7cb488 Add getRemappedValue to ConversionPatternRewriter
This method is needed for N->1 conversion patterns to retrieve remapped
Values used in the original N operations.

Closes tensorflow/mlir#237

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/237 from dcaballe:dcaballe/getRemappedValue 1f64fadcf2b203f7b336ff0c5838b116ae3625db
PiperOrigin-RevId: 281321881
2019-11-19 11:09:39 -08:00
Jing Pu 563b5910a8 Also elide large array attribute in OpGraph Dump
PiperOrigin-RevId: 281114034
2019-11-18 11:27:43 -08:00
Andy Davis 68a8da4a93 Fix Affine Loop Fusion test case reported on github.
This CL utilizies the more robust fusion feasibility analysis being built out in LoopFusionUtils, which will eventually be used to replace the current affine loop fusion pass.

PiperOrigin-RevId: 281112340
2019-11-18 11:20:37 -08:00
Lei Zhang a0986bf43d NFC: Convert CmpIPredicate in StandardOps to use EnumAttr
This turns several hand-written functions to auto-generated ones.

PiperOrigin-RevId: 280684326
2019-11-15 10:17:31 -08:00
Nicolas Vasilache 0b271b7dfe Refactor the LowerVectorTransfers pass to use the RewritePattern infra - NFC
This is step 1/n in refactoring infrastructure along the Vector dialect to make it ready for retargetability and composable progressive lowering.

PiperOrigin-RevId: 280529784
2019-11-14 15:40:07 -08:00
Alex Zinenko 971b8dd4d8 Move Affine to Standard conversion to lib/Conversion
This is essentially a dialect conversion and conceptually belongs to
conversions.

PiperOrigin-RevId: 280460034
2019-11-14 10:35:21 -08:00
Nicolas Vasilache f2b6ae9991 Move VectorOps to Tablegen - (almost) NFC
This CL moves VectorOps to Tablegen and cleans up the implementation.

This is almost NFC but 2 changes occur:
  1. an interface change occurs in the padding value specification in vector_transfer_read:
     the value becomes non-optional. As a shortcut we currently use %f0 for all paddings.
     This should become an OpInterface for vectorization in the future.
  2. the return type of vector.type_cast is trivial and simplified to `memref<vector<...>>`

Relevant roundtrip and invalid tests that used to sit in core are moved to the vector dialect.

The op documentation is moved to the .td file.

PiperOrigin-RevId: 280430869
2019-11-14 08:15:23 -08:00
River Riddle d985c74883 NFC: Refactor block signature conversion to not erase the original arguments.
This refactors the implementation of block signature(type) conversion to not insert fake cast operations to perform the type conversion, but to instead create a new block containing the proper signature. This has the benefit of enabling the use of pre-computed analyses that rely on mapping values. It also leads to a much cleaner implementation overall. The major user facing change is that applySignatureConversion will now replace the entry block of the region, meaning that blocks generally shouldn't be cached over calls to applySignatureConversion.

PiperOrigin-RevId: 280226936
2019-11-13 10:27:53 -08:00
Jacques Pienaar bcfb3d4cd6 Explicitly initialize isRecursivelyLegal
This also previously triggered the warning:

warning: missing field 'isRecursivelyLegal' initializer [-Wmissing-field-initializers]
  legalOperations[op] = {action};
                               ^
PiperOrigin-RevId: 279399175
2019-11-08 15:06:34 -08:00
Sean Silva f6188b5b07 Replace some remnant uses of "inst" with "op".
PiperOrigin-RevId: 278961676
2019-11-06 16:09:23 -08:00
River Riddle 2366561a39 Add a PatternRewriter hook to merge blocks, and use it to support for folding branches.
A pattern rewriter hook, mergeBlock, is added that allows for merging the operations of one block into the end of another. This is used to support a canonicalization pattern for branch operations that folds the branch when the successor has a single predecessor(the branch block).

Example:
  ^bb0:
    %c0_i32 = constant 0 : i32
    br ^bb1(%c0_i32 : i32)
  ^bb1(%x : i32):
    return %x : i32

becomes:
  ^bb0:
    %c0_i32 = constant 0 : i32
    return %c0_i32 : i32
PiperOrigin-RevId: 278677825
2019-11-05 11:57:38 -08:00
Mahesh Ravishankar 9cbbd8f4df Support lowering of imperfectly nested loops into GPU dialect.
The current lowering of loops to GPU only supports lowering of loop
nests where the loops mapped to workgroups and workitems are perfectly
nested. Here a new lowering is added to handle lowering of imperfectly
nested loop body with the following properties
1) The loops partitioned to workgroups are perfectly nested.
2) The loop body of the inner most loop partitioned to workgroups can
contain one or more loop nests that are to be partitioned across
workitems. Each individual loops nests partitioned to workitems should
also be perfectly nested.
3) The number of workgroups and workitems are not deduced from the
loop bounds but are passed in by the caller of the lowering as values.
4) For statements within the perfectly nested loop nest partitioned
across workgroups that are not loops, it is valid to have all threads
execute that statement. This is NOT verified.

PiperOrigin-RevId: 277958868
2019-11-01 10:52:06 -07:00
Jing Pu 736ad2061c Dump op location in createPrintOpGraphPass for easier debugging.
PiperOrigin-RevId: 277546527
2019-10-30 11:22:22 -07:00
River Riddle a32f0dcb5d Add support to GreedyPatternRewriter for erasing unreachable blocks.
Rewrite patterns may make modifications to the CFG, including dropping edges between blocks. This change adds a simple unreachable block elimination run at the end of each iteration to ensure that the CFG remains valid.

PiperOrigin-RevId: 277545805
2019-10-30 11:19:24 -07:00
Diego Caballero c87c7f5732 Bugfix: Keep worklistMap in sync with worklist in GreedyPatternRewriter
When we removed a pattern, we removed it from worklist but not from
worklistMap. Then, when we tried to add a new pattern on the same Operation
again, the pattern wasn't added since it already existed in the
worklistMap (but not in the worklist).

Closes tensorflow/mlir#211

PiperOrigin-RevId: 277319669
2019-10-29 10:58:31 -07:00
River Riddle 2f4d0c085a Add support for marking an operation as recursively legal.
In some cases, it may be desirable to mark entire regions of operations as legal. This provides an additional granularity of context to the concept of "legal". The `ConversionTarget` supports marking operations, that were previously added as `Legal` or `Dynamic`, as `recursively` legal. Recursive legality means that if an operation instance is legal, either statically or dynamically, all of the operations nested within are also considered legal. An operation can be marked via `markOpRecursivelyLegal<>`:

```c++
ConversionTarget &target = ...;

/// The operation must first be marked as `Legal` or `Dynamic`.
target.addLegalOp<MyOp>(...);
target.addDynamicallyLegalOp<MySecondOp>(...);

/// Mark the operation as always recursively legal.
target.markOpRecursivelyLegal<MyOp>();
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp, MySecondOp>([](Operation *op) { ... });
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp>([](MyOp op) { ... });
```

PiperOrigin-RevId: 277086382
2019-10-28 10:04:34 -07:00
River Riddle 2b61b7979e Convert the Canonicalize and CSE passes to generic Operation Passes.
This allows for them to be used on other non-function, or even other function-like, operations. The algorithms are already generic, so this is simply changing the derived pass type. The majority of this change is just ensuring that the nesting of these passes remains the same, as the pass manager won't auto-nest them anymore.

PiperOrigin-RevId: 276573038
2019-10-24 15:01:09 -07:00
Alex Zinenko edffbbcdae Fix "set-but-unused" warning in DialectConversion
The variable in question is only used in an assertion,
leading to a warning in opt builds.

PiperOrigin-RevId: 276352259
2019-10-23 14:32:13 -07:00
Kazuaki Ishizaki 8bfedb3ca5 Fix minor spelling tweaks (NFC)
Closes tensorflow/mlir#177

PiperOrigin-RevId: 275692653
2019-10-20 00:11:34 -07:00
Nicolas Vasilache 9e7e297da3 Lower vector transfer ops to loop.for operations.
This allows mixing linalg operations with vector transfer operations (with additional modifications to affine ops) and is a step towards solving tensorflow/mlir#189.

PiperOrigin-RevId: 275543361
2019-10-18 14:10:10 -07:00
River Riddle 2acc220f17 NFC: Remove trivial builder get methods.
These don't add any value, and some are even more restrictive than the respective static 'get' method.

PiperOrigin-RevId: 275391240
2019-10-17 20:08:34 -07:00
Geoffrey Martin-Noble 6090643877 Introduce a wrapper around ConversionPattern that operates on the derived class
Analogous to OpRewritePattern, this makes writing conversion patterns more convenient.

PiperOrigin-RevId: 275349854
2019-10-17 15:30:38 -07:00
Nicolas Vasilache 10039d04e2 Rename LoopNestBuilder to AffineLoopNestBuilder - NFC
PiperOrigin-RevId: 275310747
2019-10-17 12:13:59 -07:00
Sana Damani 3940b90d84 Update Chapter 4 of the Toy tutorial
This Chapter now introduces and makes use of the Interface concept
in MLIR to demonstrate ShapeInference.
END_PUBLIC

Closes tensorflow/mlir#191

PiperOrigin-RevId: 275085151
2019-10-16 12:19:39 -07:00
Mahesh Ravishankar e7b49eef1d Allow for remapping argument to a Value in SignatureConversion.
The current SignatureConversion framework (part of DialectConversion)
allows remapping input arguments to a function from 1->0, 1->1 or
1->many arguments during conversion. Another case is where the
argument itself is dropped, but it's use are remapped to another
Value*.

An example of this is: The Vulkan/SPIR-V spec requires entry functions
to be of type void(void). The GPU -> SPIR-V conversion implemented
this without having the DialectConversion framework track the
remapping that lead to some undefined behavior. The changes here
addresses that.

PiperOrigin-RevId: 275059656
2019-10-16 10:21:03 -07:00
River Riddle dfe09cc621 Add support for PatternRewriter::eraseOp.
This hook is useful when an operation is known to be dead, and no replacement values make sense.

PiperOrigin-RevId: 275052756
2019-10-16 09:50:57 -07:00
Mehdi Amini f1f9e3b8d1 Fix CMake configuration after introduction of LICM and LoopLikeInterface
b843cc5d5a introduced a new op LICM transformation and a LoopLike interface,
but missed the CMake aspects of it. This should fix the build.

PiperOrigin-RevId: 275038533
2019-10-16 08:37:39 -07:00
Stephan Herhut b843cc5d5a Implement simple loop-invariant-code-motion based on dialect interfaces.
PiperOrigin-RevId: 275004258
2019-10-16 04:28:38 -07:00
River Riddle 96de7091bc Allowing replacing non-root operations in DialectConversion.
When dealing with regions, or other patterns that need to generate temporary operations, it is useful to be able to replace other operations than the root op being matched. Before this PR, these operations would still be considered for legalization meaning that the conversion would either fail, erroneously need to mark these ops as legal, or add unnecessary patterns.

PiperOrigin-RevId: 274598513
2019-10-14 10:01:59 -07:00
River Riddle 6b1cc3c6ea Add support for canonicalizing callable regions during inlining.
This will allow for inlining newly devirtualized calls, as well as give a more accurate cost model(when we have one). Currently canonicalization will only run for nodes that have no child edges, as the child nodes may be erased during canonicalization. We can support this in the future, but it requires more intricate deletion tracking.

PiperOrigin-RevId: 274011386
2019-10-10 17:06:33 -07:00
River Riddle 438dc176b1 Remove the need to convert operations in regions of operations that have been replaced.
When an operation with regions gets replaced, we currently require that all of the remaining nested operations are still converted even though they are going to be replaced when the rewrite is finished. This cl adds a tracking for a minimal set of operations that are known to be "dead". This allows for ignoring the legalization of operations that are won't survive after conversion.

PiperOrigin-RevId: 274009003
2019-10-10 17:06:25 -07:00
Christian Sigg 35bb732032 Guard rewriter insertion point during signature conversion.
Avoid unexpected side effect in rewriter insertion point.

PiperOrigin-RevId: 273785794
2019-10-09 11:33:28 -07:00
Diego Caballero 3451055614 Add support for some multi-store cases in affine fusion
This PR is a stepping stone towards supporting generic multi-store
source loop nests in affine loop fusion. It extends the algorithm to
support fusion of multi-store loop nests that:
 1. have only one store that writes to a function-local live out, and
 2. the remaining stores are involved in loop nest self dependences
    or no dependences within the function.

Closes tensorflow/mlir#162

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/162 from dcaballe:dcaballe/multi-output-fusion 7fb7dec6fe8b45f5ce176f018bfe37b256420c45
PiperOrigin-RevId: 273773907
2019-10-09 10:37:30 -07:00
River Riddle 49b29dd186 Add a PatternRewriter hook for cloning a region into another.
This is similar to the `inlineRegionBefore` hook, except the original blocks are unchanged. The region to be cloned *must* not have been modified during the conversion process at the point of cloning, i.e. it must belong an operation that has yet to be converted, or the operation that is currently being converted.

PiperOrigin-RevId: 273622533
2019-10-08 15:45:08 -07:00
Uday Bondhugula 6136f33d59 unroll and jam: fix order of jammed bodies
- bodies would earlier appear in the order (i, i+3, i+2, i+1) instead of
  (i, i+1, i+2, i+3) for example for factor 4.

- clean up hardcoded test cases

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#170

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/170 from bondhugula:ujam b66b405b2b1894a03b376952e32a9d0292042665
PiperOrigin-RevId: 273613131
2019-10-08 15:13:11 -07:00
Jing Pu 17606a108b Print result types when dumping graphviz.
PiperOrigin-RevId: 273406833
2019-10-07 16:45:53 -07:00
Uday Bondhugula 89e7a76a1c fix simplify-affine-structures bug
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#157

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/157 from bondhugula:quickfix bd1fcd79825fc0bd5b4a3e688153fa0993ab703d
PiperOrigin-RevId: 273316498
2019-10-07 10:04:50 -07:00
Christian Sigg 85dcaf19c7 Fix typos, NFC.
PiperOrigin-RevId: 272851237
2019-10-04 04:37:53 -07:00
River Riddle 5830f71a45 Add support for inlining calls with different arg/result types from the callable.
Some dialects have implicit conversions inherent in their modeling, meaning that a call may have a different type that the type that the callable expects. To support this, a hook is added to the dialect interface that allows for materializing conversion operations during inlining when there is a mismatch. A hook is also added to the callable interface to allow for introspecting the expected result types.

PiperOrigin-RevId: 272814379
2019-10-03 23:10:51 -07:00
River Riddle a20d96e436 Update the Inliner pass to work on SCCs of the CallGraph.
This allows for the inliner to work on arbitrary call operations. The updated inliner will also work bottom-up through the callgraph enabling support for multiple levels of inlining.

PiperOrigin-RevId: 272813876
2019-10-03 23:05:21 -07:00
Jacques Pienaar 2b86e27dbd Show type even if elementsattr is elided in graph
The type is quite useful for debugging and shouldn't be too large.

PiperOrigin-RevId: 272390311
2019-10-02 01:46:12 -07:00
Jacques Pienaar c57f202c8c Switch explicit create methods to match generated build's order
The generated build methods have result type before the arguments (operands and attributes, which are also now adjacent in the explicit create method). This also results in changing the create method's ordering to match most build method's ordering.

PiperOrigin-RevId: 271755054
2019-09-28 09:35:58 -07:00
Uday Bondhugula 74eabdd14e NFC - clean up op accessor usage, std.load/store op verify, other stale info
- also remove stale terminology/references in docs

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#148

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/148 from bondhugula:cleanup e846b641a3c2936e874138aff480a23cdbf66591
PiperOrigin-RevId: 271618279
2019-09-27 11:58:24 -07:00
Nicolas Vasilache ddf737c5da Promote MemRefDescriptor to a pointer to struct when passing function boundaries in LLVMLowering.
The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues.
Solving the ABI problem generally is a very complex problem and most likely involves taking
a dependence on clang that we do not want atm.

A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implement.

PiperOrigin-RevId: 271591708
2019-09-27 09:57:36 -07:00
Jing Pu 47a7021cc3 Change the return type of createPrintCFGGraphPass to match other passes.
PiperOrigin-RevId: 271252404
2019-09-25 18:33:47 -07:00
Mehdi Amini 5583252173 Add convenience methods to set an OpBuilder insertion point after an Operation (NFC)
PiperOrigin-RevId: 270727180
2019-09-23 11:54:55 -07:00
Christian Sigg c900d4994e Fix a number of Clang-Tidy warnings.
PiperOrigin-RevId: 270632324
2019-09-23 02:34:27 -07:00
Uday Bondhugula f559c38c28 Upgrade/fix/simplify store to load forwarding
- fix store to load forwarding for a certain set of cases (where
  forwarding shouldn't have happened); use AffineValueMap difference
  based MemRefAccess equality checking; utility logic is also greatly
  simplified

- add missing equality/inequality operators for AffineExpr ==/!= ints

- add == != operators on MemRefAccess

Closes tensorflow/mlir#136

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/136 from bondhugula:store-load-forwarding d79fd1add8bcfbd9fa71d841a6a9905340dcd792
PiperOrigin-RevId: 270457011
2019-09-21 10:08:56 -07:00
River Riddle 91125d33ed Avoid iterator invalidation when recursively computing pattern depth.
computeDepth calls itself recursively, which may insert into minPatternDepth. minPatternDepth is a DenseMap, which invalidates iterators on insertion, so this may lead to asan failures.

PiperOrigin-RevId: 270374203
2019-09-20 16:30:29 -07:00
Uday Bondhugula 727a50ae2d Support symbolic operands for memref replacement; fix memrefNormalize
- allow symbols in index remapping provided for memref replacement
- fix memref normalize crash on cases with layout maps with symbols

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Reported by: Alex Zinenko

Closes tensorflow/mlir#139

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/139 from bondhugula:memref-rep-symbols 2f48c1fdb5d4c58915bbddbd9f07b18541819233
PiperOrigin-RevId: 269851182
2019-09-18 11:26:11 -07:00
MLIR Team 1c73be76d8 Unify error messages to start with lower-case.
PiperOrigin-RevId: 269803466
2019-09-18 07:45:17 -07:00
Uday Bondhugula bd7de6d4df Add rewrite pattern to compose maps into affine load/stores
- add canonicalization pattern to compose maps into affine loads/stores;
  templatize the pattern and reuse it for affine.apply as well

- rename getIndices -> getMapOperands() (getIndices is confusing since
  these are no longer the indices themselves but operands to the map
  whose results are the indices). This also makes the accessor uniform
  across affine.apply/load/store. Change arg names on the affine
  load/store builder to avoid confusion. Drop an unused confusing build
  method on AffineStoreOp.

- update incomplete doc comment for canonicalizeMapAndOperands (this was
  missed from a previous update).

Addresses issue tensorflow/mlir#121

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#122

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/122 from bondhugula:compose-load-store e71de1771e56a85c4282c10cb43f30cef0701c4f
PiperOrigin-RevId: 269619540
2019-09-17 11:49:45 -07:00
River Riddle 9619ba10d4 Add support for multi-level value mapping to DialectConversion.
When performing A->B->C conversion, an operation may still refer to an operand of A. This makes it necessary to unmap through multiple levels of replacement for a specific value.

PiperOrigin-RevId: 269367859
2019-09-16 10:38:19 -07:00
Uday Bondhugula 4f32ae61b4 NFC - Move explicit copy/dma generation utility out of pass and into LoopUtils
- turn copy/dma generation method into a utility in LoopUtils, allowing
  it to be reused elsewhere.

- no functional/logic change to the pass/utility

- trim down header includes in files affected

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#124

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/124 from bondhugula:datacopy 9f346e62e5bd9dd1986720a30a35f302eb4d3252
PiperOrigin-RevId: 269106088
2019-09-14 13:23:48 -07:00
Uday Bondhugula 1366467a3b update normalizeMemRef utility; handle missing failure check + add more tests
- take care of symbolic operands with alloc
- add missing check for compose map failure and a test case
- add test cases on strides
- drop incorrect check for one-to-one'ness

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#132

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/132 from bondhugula:normalize-memrefs 8aebf285fb0d7c19269d85255aed644657e327b7
PiperOrigin-RevId: 269105947
2019-09-14 13:21:35 -07:00
River Riddle f1b100c77b NFC: Finish replacing FunctionPassBase/ModulePassBase with OpPassBase.
These directives were temporary during the generalization of FunctionPass/ModulePass to OpPass.

PiperOrigin-RevId: 268970259
2019-09-13 13:34:27 -07:00
Smit Hinsu 1854c64c7c Log name of the generated illegal operation name in DialectConversion debug mode
PiperOrigin-RevId: 268859399
2019-09-13 01:37:38 -07:00
Jacques Pienaar a23f69a37b Remove redundant qualification
Address GCC error: extra qualification not allowed [-fpermissive]

PiperOrigin-RevId: 268133737
2019-09-09 19:50:53 -07:00
Jacques Pienaar 2660623a88 Add pass generate per block in a function a GraphViz Dot graph with ops as nodes
* Add GraphTraits that treat a block as a graph, Operation* as node and use-relationship for edges;
  - Just basic graph output;
* Add use iterator to iterate over all uses of an Operation;
* Add testing pass to generate op graph;

This does not support arbitrary operations other than function nor nested regions yet.

PiperOrigin-RevId: 268121782
2019-09-09 18:12:41 -07:00
Mehdi Amini 6443583bfd Refactor getUsedValuesDefinedAbove to expose a variant taking a callback (NFC)
This will allow clients to implement a different collection strategy on these
values, including collecting each uses within the region for example.

PiperOrigin-RevId: 267803978
2019-09-07 17:03:01 -07:00
River Riddle 0ba0087887 Add the initial inlining infrastructure.
This defines a set of initial utilities for inlining a region(or a FuncOp), and defines a simple inliner pass for testing purposes.
A new dialect interface is defined, DialectInlinerInterface, that allows for dialects to override hooks controlling inlining legality. The interface currently provides the following hooks, but these are just premilinary and should be changed/added to/modified as necessary:

* isLegalToInline
  - Determine if a region can be inlined into one of this dialect, *or* if an operation of this dialect can be inlined into a given region.

* shouldAnalyzeRecursively
  - Determine if an operation with regions should be analyzed recursively for legality. This allows for child operations to be closed off from the legality checks for operations like lambdas.

* handleTerminator
  - Process a terminator that has been inlined.

This cl adds support for inlining StandardOps, but other dialects will be added in followups as necessary.

PiperOrigin-RevId: 267426759
2019-09-05 12:24:13 -07:00
Uday Bondhugula 8c9dc690eb pipeline-data-transfer: remove dead tag alloc's and improve test coverage for replaceMemRefUsesWith / pipeline-data-transfer
- address remaining comments from PR tensorflow/mlir#87 for better test coverage for
  pipeline-data-transfer/replaceAllMemRefUsesWith
- remove dead tag allocs the same way they are removed for the replaced buffers

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#106

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/106 from bondhugula:followup 9e868666d047e8d43e5f82f43e4093b838c710fa
PiperOrigin-RevId: 267144774
2019-09-04 06:59:09 -07:00
Uday Bondhugula 54d674f51e Utility to normalize memrefs with non-identity layout maps
- introduce utility to convert memrefs with non-identity layout maps to
  ones with identity layout maps: convert the type and rewrite/remap all
  its uses

- add this utility to -simplify-affine-structures pass for testing
  purposes

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#104

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/104 from bondhugula:memref-normalize f2c914aa1890e8860326c9e33f9aa160b3d65e6d
PiperOrigin-RevId: 266985317
2019-09-03 12:14:28 -07:00
Uday Bondhugula b1ef9dc22c Fix affine data copy generation corner cases/bugs
- the [begin, end) range identified for copying could end in between the
  block, which makes hoisting invalid in some cases. Change the range
  identification to always end with end of block.

- add test case to exercise these (with fast mem capacity set to minimal so
  that single element memref buffers are generated at the innermost loop)

- the location of begin/end of the block range for data copying was
  being confused with the insert points for copy in and copy out code.
  In cases, where we choose to hoist transfers, these are separate.

- when copy loops are single iteration ones, promote their bodies at
  the end of the pass.

- change default fast mem space to 1 (setting it to zero made it
  generate DMA op's that won't verify in the default case - since the
  DMA ops have a check for src/dest memref spaces being different).

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Co-Authored-By: Mehdi Amini <joker.eph@gmail.com>

Closes tensorflow/mlir#88

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/88 from bondhugula:datacopy 88697267c45e850c3ced87671e16e4a930c02a42
PiperOrigin-RevId: 266980911
2019-09-03 11:53:16 -07:00
River Riddle 6563b1c446 Add a new dialect interface for the OperationFolder `OpFolderDialectInterface`.
This interface will allow for providing hooks to interrop with operation folding. The first hook, 'shouldMaterializeInto', will allow for controlling which region to insert materialized constants into. The folder will generally materialize constants into the top-level isolated region, this allows for materializing into a lower level ancestor region if it is more profitable/correct.

PiperOrigin-RevId: 266702972
2019-09-01 20:07:08 -07:00
Mehdi Amini ce702fc8da Add a `getUsedValuesDefinedAbove()` overload that takes an `Operation` pointer (NFC)
This is a convenient utility around the existing `getUsedValuesDefinedAbove()`
that take two regions.

PiperOrigin-RevId: 266686854
2019-09-01 16:32:10 -07:00
River Riddle 9c8a8a7d0d Add a canonicalization to erase empty AffineForOps.
AffineForOp themselves are pure and can be removed if there are no internal operations.

PiperOrigin-RevId: 266481293
2019-08-30 16:49:32 -07:00
River Riddle 037742cdf2 Add support for early exit walk methods.
This is done by providing a walk callback that returns a WalkResult. This result is either `advance` or `interrupt`. `advance` means that the walk should continue, whereas `interrupt` signals that the walk should stop immediately. An example is shown below:

auto result = op->walk([](Operation *op) {
  if (some_invariant)
    return WalkResult::interrupt();
  return WalkResult::advance();
});

if (result.wasInterrupted())
  ...;

PiperOrigin-RevId: 266436700
2019-08-30 12:47:53 -07:00
River Riddle 4bfae66d70 Refactor the 'walk' methods for operations.
This change refactors and cleans up the implementation of the operation walk methods. After this refactoring is that the explicit template parameter for the operation type is no longer needed for the explicit op walks. For example:

    op->walk<AffineForOp>([](AffineForOp op) { ... });

is now accomplished via:

    op->walk([](AffineForOp op) { ... });

PiperOrigin-RevId: 266209552
2019-08-29 13:04:50 -07:00
Uday Bondhugula bc2a543225 fix loop unroll and jam - operand mapping - imperfect nest case
- fix operand mapping while cloning sub-blocks to jam - was incorrect
  for imperfect nests where def/use was across sub-blocks
- strengthen/generalize the first test case to cover the previously
  missed scenario
- clean up the other cases while on this.

Previously, unroll-jamming the following nest
```
    affine.for %arg0 = 0 to 2048 {
      %0 = alloc() : memref<512x10xf32>
      affine.for %arg1 = 0 to 10 {
        %1 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
      }
      dealloc %0 : memref<512x10xf32>
    }
```

would yield

```
      %0 = alloc() : memref<512x10xf32>
      %1 = affine.apply #map0(%arg0)
      %2 = alloc() : memref<512x10xf32>
      affine.for %arg1 = 0 to 10 {
        %4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
        %5 = affine.apply #map0(%arg0)
        %6 = affine.load %0[%5, %arg1] : memref<512x10xf32>
      }
      dealloc %0 : memref<512x10xf32>
      %3 = affine.apply #map0(%arg0)
      dealloc %0 : memref<512x10xf32>

```

instead of

```

module {
    affine.for %arg0 = 0 to 2048 step 2 {
      %0 = alloc() : memref<512x10xf32>
      %1 = affine.apply #map0(%arg0)
      %2 = alloc() : memref<512x10xf32>
      affine.for %arg1 = 0 to 10 {
        %4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
        %5 = affine.apply #map0(%arg0)
        %6 = affine.load %2[%5, %arg1] : memref<512x10xf32>
      }
      dealloc %0 : memref<512x10xf32>
      %3 = affine.apply #map0(%arg0)
      dealloc %2 : memref<512x10xf32>
    }
```

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#98

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/98 from bondhugula:ujam ddbc853f69b5608b3e8ff9b5ac1f6a5a0bb315a4
PiperOrigin-RevId: 266073460
2019-08-28 23:42:50 -07:00
Uday Bondhugula aa2cee9cf5 Refactor / improve replaceAllMemRefUsesWith
Refactor replaceAllMemRefUsesWith to split it into two methods: the new
method does the replacement on a single op, and is used by the existing
one.

- make the methods return LogicalResult instead of bool

- Earlier, when replacement failed (due to non-deferencing uses of the
  memref), the set of ops that had already been processed would have
  been replaced leaving the IR in an inconsistent state. Now, a
  pass is made over all ops to first check for non-deferencing
  uses, and then replacement is performed. No test cases were affected
  because all clients of this method were first checking for
  non-deferencing uses before calling this method (for other reasons).
  This isn't true for a use case in another upcoming PR (scalar
  replacement); clients can now bail out with consistent IR on failure
  of replaceAllMemRefUsesWith. Add test case.

- multiple deferencing uses of the same memref in a single op is
  possible (we have no such use cases/scenarios), and this has always
  remained unsupported. Add an assertion for this.

- minor fix to another test pipeline-data-transfer case.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#87

PiperOrigin-RevId: 265808183
2019-08-27 17:56:56 -07:00
River Riddle 2f59f76876 NFC: Remove the explicit context from Operation::create and OperationState.
The context can easily be recovered from the Location in these situations.

PiperOrigin-RevId: 265578574
2019-08-26 17:34:48 -07:00
Andy Ly 6a501e3d1b Support folding of ops with inner ops in GreedyPatternRewriteDriver.
This fixes a bug when folding ops with inner ops and inner ops are still being visited.

PiperOrigin-RevId: 265475780
2019-08-26 09:44:39 -07:00
River Riddle 32052c8417 NFC: Add a note to 'applyPatternsGreedily' that it also performs folding/dce.
Fixes tensorflow/mlir#72

PiperOrigin-RevId: 265097597
2019-08-23 11:28:45 -07:00
River Riddle ffde975e21 NFC: Move AffineOps dialect to the Dialect sub-directory.
PiperOrigin-RevId: 264482571
2019-08-20 15:36:39 -07:00
Nicolas Vasilache b628194013 Move Linalg and VectorOps dialects to the Dialect subdir - NFC
PiperOrigin-RevId: 264277760
2019-08-19 17:11:38 -07:00
River Riddle ba0fa92524 NFC: Move LLVMIR, SDBM, and StandardOps to the Dialect/ directory.
PiperOrigin-RevId: 264193915
2019-08-19 11:01:25 -07:00
Jacques Pienaar 79f53b0cf1 Change from llvm::make_unique to std::make_unique
Switch to C++14 standard method as llvm::make_unique has been removed (
https://reviews.llvm.org/D66259). Also mark some targets as c++14 to ease next
integrates.

PiperOrigin-RevId: 263953918
2019-08-17 11:06:03 -07:00
River Riddle 9c29273ddc Refactor DialectConversion to convert the signatures of blocks when they are moved.
Often we want to ensure that block arguments are converted before operations that use them. This refactors the current implementation to be cleaner/less frequent by triggering conversion when a set of blocks are moved/inlined; or when legalization is successful.

PiperOrigin-RevId: 263795005
2019-08-16 10:16:38 -07:00
Mehdi Amini 926fb685de Express ownership transfer in PassManager API through std::unique_ptr (NFC)
Since raw pointers are always passed around for IR construct without
implying any ownership transfer, it can be error prone to have implicit
ownership transferred the same way.
For example this code can seem harmless:

  Pass *pass = ....
  pm.addPass(pass);
  pm.addPass(pass);
  pm.run(module);

PiperOrigin-RevId: 263053082
2019-08-12 19:13:12 -07:00
River Riddle 5290e8c36d NFC: Update pattern rewrite API to pass OwningRewritePatternList by const reference.
The pattern list is not modified by any of these APIs and should thus be passed with const.

PiperOrigin-RevId: 262844002
2019-08-11 18:34:14 -07:00
River Riddle 1e42954032 NFC: Standardize the terminology used for parent ops/regions/etc.
There are currently several different terms used to refer to a parent IR unit in 'get' methods: getParent/getEnclosing/getContaining. This cl standardizes all of these methods to use 'getParent*'.

PiperOrigin-RevId: 262680287
2019-08-09 20:07:52 -07:00
River Riddle 41968fb475 NFC: Update usages of OwningRewritePatternList to pass by & instead of &&.
This will allow for reusing the same pattern list, which may be costly to continually reconstruct, on multiple invocations.

PiperOrigin-RevId: 262664599
2019-08-09 17:20:29 -07:00
Nicolas Vasilache 39f1b9a053 Add a higher-order vector.extractelement operation in MLIR
This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.

This CL adds the vector.extractelement operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL.

PiperOrigin-RevId: 262545089
2019-08-09 05:58:47 -07:00
River Riddle 8089f93746 Add utility 'replaceAllUsesWith' methods to Operation.
These methods will allow replacing the uses of results with an existing operation, with the same number of results, or a range of values. This removes a number of hand-rolled result replacement loops and simplifies replacement for operations with multiple results.

PiperOrigin-RevId: 262206600
2019-08-07 13:48:52 -07:00
Andy Ly 55f2e24ab3 Remove ops in regions/blocks from worklist when parent op is being removed via GreedyPatternRewriteDriver::replaceOp.
This fixes a bug where ops inside the parent op are visited even though the parent op has been removed.

PiperOrigin-RevId: 261953580
2019-08-06 11:08:54 -07:00
Nicolas Vasilache 24647750d4 Refactor Linalg ops to loop lowering (NFC)
This CL modifies the LowerLinalgToLoopsPass to use RewritePattern.
This will make it easier to inline Linalg generic functions and regions when emitting to loops in a subsequent CL.

PiperOrigin-RevId: 261894120
2019-08-06 05:38:16 -07:00
River Riddle a0df3ebd15 NFC: Implement OwningRewritePatternList as a class instead of a using directive.
This allows for proper forward declaration, as opposed to leaking the internal implementation via a using directive. This also allows for all pattern building to go through 'insert' methods on the OwningRewritePatternList, replacing uses of 'push_back' and 'RewriteListBuilder'.

PiperOrigin-RevId: 261816316
2019-08-05 18:38:22 -07:00
Mehdi Amini 0c3923e1dc Fix clang 5.0 by using type aliases for LLVM DenseSet/Map
When inlining the declaration for llvm::DenseSet/DenseMap in the mlir
namespace from a forward declaration, clang does not take the default
for the template parameters if their are declared later.

namespace llvm {
  template<typename Foo>
  class DenseMap;
}
namespace mlir {
  using llvm::DenseMap;
}
namespace llvm {
  template<typename Foo = int>
  class DenseMap {};
}

namespace mlir {
  DenseMap<> map;
}

PiperOrigin-RevId: 261495612
2019-08-03 11:35:50 -07:00
Alex Zinenko 58e66d71e7 AffineDataCopyGeneration: don't use CL flag values inside the pass
AffineDataCopyGeneration pass relied on command line flags for internal logic
in several places, which makes it unusable in a library context (i.e. outside a
standalone mlir-opt binary that does the command line parsing).  Define
configuration flags in the constructor instead, and set them up to command
line-based defaults to maintain the original behavior.

PiperOrigin-RevId: 261322364
2019-08-02 08:04:30 -07:00
Uday Bondhugula 18b8d4352b Introduce explicit copying optimization by generalizing the DMA generation pass
Explicit copying to contiguous buffers is a standard technique to avoid
conflict misses and TLB misses, and improve hardware prefetching
performance. When done in conjunction with cache tiling, it nearly
eliminates all cache conflict and TLB misses, and a single hardware
prefetch stream is needed per data tile.

- generalize/extend DMA generation pass (renamed data copying pass) to
  perform either point-wise explicit copies to fast memory buffers or
  DMAs (depending on a cmd line option). All logic is the same as
  erstwhile -dma-generate.

- -affine-dma-generate is now renamed -affine-data-copy; when -dma flag is
  provided, DMAs are generated, or else explicit copy loops are generated
  (point-wise) by default.

- point-wise copying could be used for CPUs (or GPUs); some indicative
  performance numbers with a "C" version of the MLIR when compiled with
  and without this optimization (about 2x improvement here).

  With a matmul on 4096^2 matrices on a single core of an Intel Core i7
  Skylake i7-8700K with clang 8.0.0:

  clang -O3:                       518s
  clang -O3 with MLIR tiling (128x128):      24.5s
  clang -O3 with MLIR tiling + data copying  12.4s
  (code equivalent to test/Transforms/data-copy.mlir func @matmul)

- fix some misleading comments.

- change default fast-mem space to 0 (more intuitive now with the
  default copy generation using point-wise copies instead of DMAs)

On a simple 3-d matmul loop nest, code generated with -affine-data-copy:

```
  affine.for %arg3 = 0 to 4096 step 128 {
    affine.for %arg4 = 0 to 4096 step 128 {
      %0 = affine.apply #map0(%arg3, %arg4)
      %1 = affine.apply #map1(%arg3, %arg4)
      %2 = alloc() : memref<128x128xf32, 2>
      // Copy-in Out matrix.
      affine.for %arg5 = 0 to 128 {
        %5 = affine.apply #map2(%arg3, %arg5)
        affine.for %arg6 = 0 to 128 {
          %6 = affine.apply #map2(%arg4, %arg6)
          %7 = load %arg2[%5, %6] : memref<4096x4096xf32>
          affine.store %7, %2[%arg5, %arg6] : memref<128x128xf32, 2>
        }
      }
      affine.for %arg5 = 0 to 4096 step 128 {
        %5 = affine.apply #map0(%arg3, %arg5)
        %6 = affine.apply #map1(%arg3, %arg5)
        %7 = alloc() : memref<128x128xf32, 2>
        // Copy-in LHS.
        affine.for %arg6 = 0 to 128 {
          %11 = affine.apply #map2(%arg3, %arg6)
          affine.for %arg7 = 0 to 128 {
            %12 = affine.apply #map2(%arg5, %arg7)
            %13 = load %arg0[%11, %12] : memref<4096x4096xf32>
            affine.store %13, %7[%arg6, %arg7] : memref<128x128xf32, 2>
          }
        }
        %8 = affine.apply #map0(%arg5, %arg4)
        %9 = affine.apply #map1(%arg5, %arg4)
        %10 = alloc() : memref<128x128xf32, 2>
        // Copy-in RHS.
        affine.for %arg6 = 0 to 128 {
          %11 = affine.apply #map2(%arg5, %arg6)
          affine.for %arg7 = 0 to 128 {
            %12 = affine.apply #map2(%arg4, %arg7)
            %13 = load %arg1[%11, %12] : memref<4096x4096xf32>
            affine.store %13, %10[%arg6, %arg7] : memref<128x128xf32, 2>
          }
        }
        // Compute.
        affine.for %arg6 = #map7(%arg3) to #map8(%arg3) {
          affine.for %arg7 = #map7(%arg4) to #map8(%arg4) {
            affine.for %arg8 = #map7(%arg5) to #map8(%arg5) {
              %11 = affine.load %7[-%arg3 + %arg6, -%arg5 + %arg8] : memref<128x128xf32, 2>
              %12 = affine.load %10[-%arg5 + %arg8, -%arg4 + %arg7] : memref<128x128xf32, 2>
              %13 = affine.load %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2>
              %14 = mulf %11, %12 : f32
              %15 = addf %13, %14 : f32
              affine.store %15, %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2>
            }
          }
        }
        dealloc %10 : memref<128x128xf32, 2>
        dealloc %7 : memref<128x128xf32, 2>
      }
      %3 = affine.apply #map0(%arg3, %arg4)
      %4 = affine.apply #map1(%arg3, %arg4)
      // Copy out result matrix.
      affine.for %arg5 = 0 to 128 {
        %5 = affine.apply #map2(%arg3, %arg5)
        affine.for %arg6 = 0 to 128 {
          %6 = affine.apply #map2(%arg4, %arg6)
          %7 = affine.load %2[%arg5, %arg6] : memref<128x128xf32, 2>
          store %7, %arg2[%5, %6] : memref<4096x4096xf32>
        }
      }
      dealloc %2 : memref<128x128xf32, 2>
    }
  }
```

With -affine-data-copy -dma:

```
  affine.for %arg3 = 0 to 4096 step 128 {
    %0 = affine.apply #map3(%arg3)
    %1 = alloc() : memref<128xf32, 2>
    %2 = alloc() : memref<1xi32>
    affine.dma_start %arg2[%arg3], %1[%c0], %2[%c0], %c128_0 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32>
    affine.dma_wait %2[%c0], %c128_0 : memref<1xi32>
    %3 = alloc() : memref<1xi32>
    affine.for %arg4 = 0 to 4096 step 128 {
      %5 = affine.apply #map0(%arg3, %arg4)
      %6 = affine.apply #map1(%arg3, %arg4)
      %7 = alloc() : memref<128x128xf32, 2>
      %8 = alloc() : memref<1xi32>
      affine.dma_start %arg0[%arg3, %arg4], %7[%c0, %c0], %8[%c0], %c16384, %c4096, %c128_2 : memref<4096x4096xf32>, memref<128x128xf32, 2>, memref<1xi32>
      affine.dma_wait %8[%c0], %c16384 : memref<1xi32>
      %9 = affine.apply #map3(%arg4)
      %10 = alloc() : memref<128xf32, 2>
      %11 = alloc() : memref<1xi32>
      affine.dma_start %arg1[%arg4], %10[%c0], %11[%c0], %c128_1 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32>
      affine.dma_wait %11[%c0], %c128_1 : memref<1xi32>
      affine.for %arg5 = #map3(%arg3) to #map5(%arg3) {
        affine.for %arg6 = #map3(%arg4) to #map5(%arg4) {
          %12 = affine.load %7[-%arg3 + %arg5, -%arg4 + %arg6] : memref<128x128xf32, 2>
          %13 = affine.load %10[-%arg4 + %arg6] : memref<128xf32, 2>
          %14 = affine.load %1[-%arg3 + %arg5] : memref<128xf32, 2>
          %15 = mulf %12, %13 : f32
          %16 = addf %14, %15 : f32
          affine.store %16, %1[-%arg3 + %arg5] : memref<128xf32, 2>
        }
      }
      dealloc %11 : memref<1xi32>
      dealloc %10 : memref<128xf32, 2>
      dealloc %8 : memref<1xi32>
      dealloc %7 : memref<128x128xf32, 2>
    }
    %4 = affine.apply #map3(%arg3)
    affine.dma_start %1[%c0], %arg2[%arg3], %3[%c0], %c128 : memref<128xf32, 2>, memref<4096xf32>, memref<1xi32>
    affine.dma_wait %3[%c0], %c128 : memref<1xi32>
    dealloc %3 : memref<1xi32>
    dealloc %2 : memref<1xi32>
    dealloc %1 : memref<128xf32, 2>
  }
```

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#50

PiperOrigin-RevId: 261221903
2019-08-01 16:31:58 -07:00
Jacques Pienaar 0fa1ea704c Initialize union to avoid -Wmissing-field-initializers warning.
Reported by clang-6.

PiperOrigin-RevId: 260311814
2019-07-27 11:47:26 -07:00
Nicolas Vasilache fae4d94990 Use "standard" load and stores in LowerVectorTransfers
Clipping creates non-affine memory accesses, use std_load and std_store instead of affine_load and affine_store.
In the future we may also want a fill with the neutral element rather than clip, this would make the accesses affine if we wanted more analyses and transformations to happen post lowering to pointwise copies.

PiperOrigin-RevId: 260110503
2019-07-26 02:34:24 -07:00
River Riddle 1293708473 Add support for an analysis mode to DialectConversion.
This mode analyzes which operations are legalizable to the given target if a conversion were to be applied, i.e. no rewrites are ever performed even on success. This mode is useful for device partitioning or other utilities that may want to analyze the effect of conversion to different targets before performing it.

The analysis method currently just fills a provided set with the operations that were found to be legalizable. This can be extended in the future to capture more information as necessary.

PiperOrigin-RevId: 259987105
2019-07-25 11:31:07 -07:00
Nicolas Vasilache 48a1baeb8a Refactor LoopParametricTiling as a test pass - NFC
This CL moves LoopParametricTiling into test/lib as a pass for purely testing purposes.

PiperOrigin-RevId: 259300264
2019-07-22 04:31:17 -07:00
River Riddle 00bdc8e070 Refactor region type signature conversion to be explicit via patterns.
This cl enforces that the conversion of the type signatures for regions, and thus their entry blocks, is handled via ConversionPatterns. A new hook 'applySignatureConversion' is added to the ConversionPatternRewriter to perform the desired conversion on a region. This also means that the handling of rewriting the signature of a FuncOp is moved to a pattern. A default implementation is provided via 'mlir::populateFuncOpTypeConversionPattern'. This removes the hacky implicit 'dynamically legal' status of FuncOp that was present previously, and leaves it up to the user to decide when/how to convert the signature of a function.

PiperOrigin-RevId: 259161999
2019-07-20 19:06:07 -07:00
Nicolas Vasilache d2a872922f Refactor stripmineSink for AffineForOp - NFC
More moving less cloning.

PiperOrigin-RevId: 258947575
2019-07-19 11:40:37 -07:00
Nicolas Vasilache db4cd1c8dc Utility function to map a loop on a parametric grid of virtual processors
This CL introduces a simple loop utility function which rewrites the bounds and step of a loop so as to become mappable on a regular grid of processors whose identifiers are given by SSA values.

A corresponding unit test is added.

For example, using CUDA terminology, and assuming a 2-d grid with processorIds = [blockIdx.x, threadIdx.x] and numProcessors = [gridDim.x, blockDim.x], the loop:
```
   loop.for %i = %lb to %ub step %step {
     ...
   }
```
is rewritten into a version resembling the following pseudo-IR:
```
   loop.for %i = %lb + threadIdx.x + blockIdx.x * blockDim.x to %ub
      step %gridDim.x * blockDim.x {
     ...
   }
```

PiperOrigin-RevId: 258945942
2019-07-19 11:40:31 -07:00
Nicolas Vasilache 5bc344743c Uniformize the API for the mlir::tile functions on AffineForOp and loop::ForOp
This CL adapts the recently introduced parametric tiling to have an API matching the tiling
of AffineForOp. The transformation using stripmineSink is more general and produces  imperfectly nested loops.

Perfect nesting invariants of the tiled version are obtained by selectively applying hoisting of ops to isolate perfectly nested bands. Such hoisting may fail to produce a perfect loop nest in cases where ForOp transitively depend on enclosing induction variables. In such cases, the API provides a LogicalResult return but the SimpleParametricLoopTilingPass does not currently use this result.

A new unit test is added with a triangular loop for which the perfect nesting property does not hold. For this example, the old behavior was to produce IR that did not verify (some use was not dominated by its def).

PiperOrigin-RevId: 258928309
2019-07-19 11:40:25 -07:00
River Riddle 28057ff3da Add support for providing a legality callback for dynamic legality in DialectConversion.
This allows for providing specific handling for dynamically legal operations/dialects without overriding the general 'isDynamicallyLegal' hook. This also means that a derived ConversionTarget class need not always be defined when some operations are dynamically legal.

Example usage:

ConversionTarget target(...);
target.addDynamicallyLegalOp<ReturnOp>([](ReturnOp op) {
  return ...
};
target.addDynamicallyLegalDialect<StandardOpsDialect>([](Operation *op) {
  return ...
};

PiperOrigin-RevId: 258884753
2019-07-19 11:40:19 -07:00
River Riddle 8b447b6cad NFC: Expose a ConversionPatternRewriter for use with ConversionPatterns.
This specific PatternRewriter will allow for exposing hooks in the future that are only useful for the conversion framework, e.g. type conversions.

PiperOrigin-RevId: 258818122
2019-07-19 11:40:00 -07:00
River Riddle 9e3c2650d2 Refactor the conversion of block argument types in DialectConversion.
This cl begins a large refactoring over how signature types are converted in the DialectConversion infrastructure. The signatures of blocks are now converted on-demand when an operation held by that block is being converted. This allows for handling the case where a region is created as part of a pattern, something that wasn't possible previously.

This cl also generalizes the region signature conversion used by FuncOp to work on any region of any operation. This generalization allows for removing the 'apply*Conversion' functions that were specific to FuncOp/ModuleOp. The implementation currently uses a new hook on TypeConverter, 'convertRegionSignature', but this should ideally be removed in favor of using Patterns. That depends on adding support to the PatternRewriter used by ConversionPattern to allow applying signature conversions to regions, which should be coming in a followup.

PiperOrigin-RevId: 258645733
2019-07-19 11:38:45 -07:00
River Riddle 491ef84dc4 Add support for explicitly marking dialects and operations as illegal.
This explicit tag is useful is several ways:
*) This simplifies how to mark sub sections of a dialect as explicitly unsupported, e.g. my target supports all operations in the foo dialect except for these select few. This is useful for partial lowerings between dialects.
*) Partial conversions will now verify that operations that were explicitly marked as illegal must be converted. This provides some guarantee that the operations that need to be lowered by a specific pass will be.

PiperOrigin-RevId: 258582879
2019-07-19 11:38:25 -07:00
Nicolas Vasilache 0002e2964d Move affine.for and affine.if to ODS
As the move to ODS is made, body and region names across affine and loop dialects are uniformized.

PiperOrigin-RevId: 258416590
2019-07-16 13:45:47 -07:00
River Riddle 2b9855b5b4 Refactor DialectConversion to support different conversion modes.
Users generally want several different modes of conversion. This cl refactors DialectConversion to provide two:
* Partial (applyPartialConversion)
  - This mode allows for illegal operations to exist in the IR, and does not fail if an operation fails to be legalized.

* Full (applyFullConversion)
  - This mode fails if any operation is not properly legalized to the conversion target. This allows for ensuring that the IR after a conversion only contains operations legal for the target.

PiperOrigin-RevId: 258412243
2019-07-16 13:45:41 -07:00
River Riddle 2087bf6386 Remove lowerAffineConstructs and lowerControlFlow in favor of providing patterns.
These methods don't compose well with the rest of conversion framework, and create artificial breaks in conversion. Replace these methods with two(populateAffineToStdConversionPatterns and populateLoopToStdConversionPatterns respectively) that populate a list of patterns to perform the same behavior.

PiperOrigin-RevId: 258219277
2019-07-16 13:44:45 -07:00
River Riddle e7a2ef21f9 Update 'applyPatternsGreedily' to work on the regions of any operations.
'applyPatternsGreedily' is a useful utility outside of just function regions.

PiperOrigin-RevId: 258182937
2019-07-16 13:44:39 -07:00
River Riddle 7d1e1e6721 Refactor the traversal of operations to Convert in DialectConversion.
This cl changes the way that operations/blocks to convert are collected/traversed so that parent region operations can be legalized before their bodies. Most RewritePatterns for region operations assume that the entry arguments to each region are yet to be converted. Given that the bodies are currently converted first, this makes it difficult to fit these patterns into the same run as one converting types.

The operations/blocks to convert are now collected before any legalization has run, which simplifies the conversion logic itself, as legalization may insert new operations, move blocks, etc.

PiperOrigin-RevId: 258170158
2019-07-16 13:44:33 -07:00
River Riddle 40715789f8 Refactor LowerAffine to use OpRewritePattern instead of ConversionPattern.
ConversionPattern should ideally only be used when the types of the operands are changing, which in this case they aren't. Using OpRewritePattern also lends to much simpler code.

PiperOrigin-RevId: 258158474
2019-07-16 13:44:09 -07:00
Alex Zinenko fc044e8929 Introduce loop coalescing utility and a simple pass
Multiple (perfectly) nested loops with independent bounds can be combined into
a single loop and than subdivided into blocks of arbitrary size for load
balancing or more efficient parallelism exploitation.  However, MLIR wants to
preserve the multi-dimensional multi-loop structure at higher levels of
abstraction. Introduce a transformation that coalesces nested loops with
independent bounds so that they can be further subdivided by tiling.

PiperOrigin-RevId: 258151016
2019-07-16 13:43:44 -07:00
Nicolas Vasilache cca53e8527 Extract std.for std.if and std.terminator in their own dialect
These ops should not belong to the std dialect.
This CL extracts them in their own dialect and updates the corresponding conversions and tests.

PiperOrigin-RevId: 258123853
2019-07-16 13:43:18 -07:00
River Riddle a764c19d17 Fix a bug in DialectConversion when using RewritePattern.
When using a RewritePattern and replacing an operation with an existing value, that value may have already been replaced by something else. This cl ensures that only the final value is used when applying rewrites.

PiperOrigin-RevId: 258058488
2019-07-16 13:43:12 -07:00
River Riddle e50a8bd19c NFC: Add header blocks to DialectConversion.h to improve readability.
PiperOrigin-RevId: 257903383
2019-07-13 05:55:50 -07:00
River Riddle 2566a72a21 Update the PatternRewriter constructor to take a context instead of a region.
This will allow for cleanly using a rewriter for multiple different regions.

PiperOrigin-RevId: 257845371
2019-07-12 17:42:52 -07:00
River Riddle 8e349a48b6 Remove the 'region' field from OpBuilder.
This field wasn't updated as the insertion point changed, making it potentially dangerous given the multi-level of MLIR(e.g. 'createBlock' would always insert the new block in 'region'). This also allows for building an OpBuilder with just a context.

PiperOrigin-RevId: 257829135
2019-07-12 17:42:41 -07:00
River Riddle 60a2983779 Fix a bug in the canonicalizer when replacing constants via patterns.
The GreedyPatternRewriteDriver currently does not notify the OperationFolder when constants are removed as part of a pattern match. This materializes in a nasty bug where a different operation may be allocated to the same address. This causes an assertion in the OperationFolder when it gets notified of the new operations removal.

PiperOrigin-RevId: 257817627
2019-07-12 17:42:24 -07:00
Nicolas Vasilache cab671d166 Lower affine control flow to std control flow to LLVM dialect
This CL splits the lowering of affine to LLVM into 2 parts:
1. affine -> std
2. std -> LLVM

The conversions mostly consists of splitting concerns between the affine and non-affine worlds from existing conversions.
Short-circuiting of affine `if` conditions was never tested or exercised and is removed in the process, it can be reintroduced later if needed.

LoopParametricTiling.cpp is updated to reflect the newly added ForOp::build.

PiperOrigin-RevId: 257794436
2019-07-12 08:44:28 -07:00
River Riddle 9dbef0bf96 Rename FunctionAttr to SymbolRefAttr.
This allows for the attribute to hold symbolic references to other operations than FuncOp. This also allows for removing the dependence on FuncOp from the base Builder.

PiperOrigin-RevId: 257650017
2019-07-12 08:43:42 -07:00
Alex Zinenko 054e25c079 EDSC: use affine.load/store instead of std.load/store
Standard load and store operations are evolving to be separated from the Affine
constructs.  Special affine.load/store have been introduced to uphold the
restrictions of the Affine control flow constructs on their operands.
EDSC-produced loads and stores were originally intended to uphold those
restrictions as well so they should use affine.load/store instead of
std.load/store.

PiperOrigin-RevId: 257443307
2019-07-12 08:42:28 -07:00
River Riddle fec20e590f NFC: Rename Module to ModuleOp.
Module is a legacy name that only exists as a typedef of ModuleOp.

PiperOrigin-RevId: 257427248
2019-07-10 10:11:21 -07:00
River Riddle 8c44367891 NFC: Rename Function to FuncOp.
PiperOrigin-RevId: 257293379
2019-07-10 10:10:53 -07:00
Alex Zinenko 7a2e8726e8 Fix a test broken on some systems due to a mis-rebase.
PiperOrigin-RevId: 257190161
2019-07-09 07:43:42 -07:00
Alex Zinenko 9d03f5674f Implement parametric tiling on standard for loops
Parametric tiling can be used to extract outer loops with fixed number of
iterations.  This in turn enables mapping to GPU kernels on a fixed grid
independently of the range of the original loops, which may be unknown
statically, making the kernel adaptable to different sizes.  Provide a utility
function that also computes the parametric tile size given the range of the
loop.  Exercise the utility function through a simple pass that applies it to
all top-level loop nests.  Permutability or parallelism checks must be
performed before calling this utility function in actual passes.

Note that parametric tiling cannot be implemented in a purely affine way,
although it can be encoded using semi-affine maps.  The choice to implement it
on standard loops is guided by them being the common representation between
Affine loops, Linalg and GPU kernels.

PiperOrigin-RevId: 257180251
2019-07-09 06:37:41 -07:00
River Riddle 626b8b6a5d NFC: Remove `Module::getFunctions` in favor of a general `getOps<T>`.
Modules can now contain more than just Functions, this just updates the iteration API to reflect that. The 'begin'/'end' methods have also been updated to iterate over opaque Operations.

PiperOrigin-RevId: 257099084
2019-07-08 18:28:17 -07:00
River Riddle ce502af9cd NFC: Remove the various "::getFunction" methods.
These methods assume that a function is a valid builtin top-level operation, and removing these methods allows for decoupling FuncOp and IR/. Utility "getParentOfType" methods have been added to Operation/OpState to allow for querying the first parent operation of a given type.

PiperOrigin-RevId: 257018913
2019-07-08 12:40:08 -07:00
River Riddle 474e354179 NFC: Remove Region::getContainingFunction as Functions are now Operations.
PiperOrigin-RevId: 256579717
2019-07-04 13:23:10 -07:00
Andy Davis 2e1187dd25 Globally change load/store/dma_start/dma_wait operations over to affine.load/store/dma_start/dma_wait.
In most places, this is just a name change (with the exception of affine.dma_start swapping the operand positions of its tag memref and num_elements operands).
Significant code changes occur here:
*) Vectorization: LoopAnalysis.cpp, Vectorize.cpp
*) Affine Transforms: Transforms/Utils/Utils.cpp

PiperOrigin-RevId: 256395088
2019-07-03 14:37:06 -07:00
River Riddle 206e55cc16 NFC: Refactor Module to be value typed.
As with Functions, Module will soon become an operation, which are value-typed. This eases the transition from Module to ModuleOp. A new class, OwningModuleRef is provided to allow for owning a reference to a Module, and will auto-delete the held module on destruction.

PiperOrigin-RevId: 256196193
2019-07-02 16:43:36 -07:00
River Riddle 705b80918d Generalize the CFG graph printing for Functions to work on Regions instead.
PiperOrigin-RevId: 256029944
2019-07-01 17:02:51 -07:00
River Riddle 694975ddbc Standardize the definition and usage of getAllArgAttrs between FuncOp and Function.
PiperOrigin-RevId: 255988352
2019-07-01 11:39:12 -07:00
River Riddle 54cd6a7e97 NFC: Refactor Function to be value typed.
Move the data members out of Function and into a new impl storage class 'FunctionStorage'. This allows for Function to become value typed, which will greatly simplify the transition of Function to FuncOp(given that FuncOp is also value typed).

PiperOrigin-RevId: 255983022
2019-07-01 11:39:00 -07:00
Alex Zinenko 5eef726bc8 TypeConversion: do not materialize conversion of the type to itself
Type conversion does not necessarily affect all types, some of them may remain
untouched.  The type conversion tool from the dialect conversion framework will
unconditionally insert a temporary cast operation from the type to itself
anyway, and will try to materialize it to a real conversion operation if there
are remaining uses.  Simply use the original value instead.

PiperOrigin-RevId: 255975450
2019-07-01 09:56:56 -07:00
Andy Davis f487d20bf0 Add affine-to-standard lowerings for affine.load/store/dma_start/dma_wait.
PiperOrigin-RevId: 255960171
2019-07-01 09:56:22 -07:00
Nicolas Vasilache e7f51ad08a Add a folder-based EDSC intrinsics constructor (NFC)
PiperOrigin-RevId: 255908660
2019-07-01 09:55:35 -07:00
River Riddle 7c755d06aa Refactor DialectConversion to use 'materializeConversion' when a type conversion must persist after the conversion has finished.
During conversion, if a type conversion has dangling uses a type conversion must persist after conversion has finished to maintain valid IR. In these cases, we now query the TypeConverter to materialize a conversion for us. This allows for the default case of a full conversion to continue working as expected, but also handle the degenerate cases more robustly.

PiperOrigin-RevId: 255637171
2019-06-28 11:29:04 -07:00
River Riddle a4c3a6455c Move the emitError/Warning/Remark utility methods out of MLIRContext and into the mlir namespace.
Now that Locations are attributes, they have direct access to the MLIR context. This allows for simplifying error emission by removing unnecessary context lookups.

PiperOrigin-RevId: 255112791
2019-06-25 21:32:23 -07:00
River Riddle 49162524d8 NFC: Uniformize the return of the LocationAttr 'get' methods to 'Location'.
PiperOrigin-RevId: 255078768
2019-06-25 16:57:56 -07:00
River Riddle 66ed7d6d83 Update the OperationFolder to find a valid insertion point when materializing constants.
The OperationFolder currently just inserts into the entry block of a Function, but regions may be isolated above, i.e. explicit capture only, and blindly inserting constants may break the invariants of these regions.

PiperOrigin-RevId: 254987796
2019-06-25 09:43:21 -07:00
River Riddle c32080a1b0 NFC: Move the ArgConverter methods out-of-line to improve readability.
PiperOrigin-RevId: 254872695
2019-06-24 17:47:51 -07:00
Nicolas Vasilache 95cfd99616 Fix OSS build
PiperOrigin-RevId: 254847773
2019-06-24 17:47:27 -07:00
Nicolas Vasilache dac75ae5ff Split test-specific passes out of mlir-opt
Instead put their impl in test/lib and link them into mlir-test-opt

PiperOrigin-RevId: 254837439
2019-06-24 17:47:12 -07:00
River Riddle b67cab4c44 Update CSE to respect nested regions that are isolated from above. This cl also removes the unused 'NthRegionIsIsolatedFromAbove' trait as it was replaced with a more general 'IsIsolatedFromAbove'.
PiperOrigin-RevId: 254709704
2019-06-24 13:44:53 -07:00
River Riddle bcacef1a70 Add a new dialect hook 'materializeConstant' to create a constant operation that materializes an attribute value with the given type. This effectively adds support for dialect specific constant values that have different invariants than std.constant. 'OperationFolder' is updated to use this new hook, or attempt to default to std.constant when legal.
PiperOrigin-RevId: 254570153
2019-06-22 13:05:27 -07:00
River Riddle 48d6cf1ced NFC: Remove the 'context' parameter from OperationState.
Now that Locations are Attributes they contain a direct reference to the MLIRContext, i.e. the context can be directly accessed from the given location instead of being explicitly passed in.

PiperOrigin-RevId: 254568329
2019-06-22 13:05:10 -07:00
River Riddle 704a7fb13e Add support for 1->N type mappings in the dialect conversion infrastructure. To support these mappings a hook must be overridden on the type converter: 'materializeConversion' :to generate a cast operation from the new types to the old type. This operation is automatically erased if all uses are removed, otherwise it remains in the IR for the user to handle.
PiperOrigin-RevId: 254411383
2019-06-22 09:16:06 -07:00
River Riddle 3e99d99553 Add an overload to 'PatternRewriter::inlineRegionBefore' that accepts a parent region for the insertion position. This allows for inlining the given region into the end of another region.
PiperOrigin-RevId: 254367375
2019-06-22 09:15:21 -07:00
Nicolas Vasilache 0804750c9b Uniformize usage of OpBuilder& (NFC)
Historically the pointer-based version of builders was used.
This CL uniformizes to OpBuilder &

PiperOrigin-RevId: 254280885
2019-06-22 09:14:49 -07:00
River Riddle 7202c4e69d Rename ConversionTarget::isLegal to isDynamicallyLegal to better represent what the function is actually checking.
PiperOrigin-RevId: 254141073
2019-06-22 09:13:45 -07:00
River Riddle 9764ae3f24 Refactor the TypeConverter to support more robust type conversions:
* Support for 1->0 type mappings, i.e. when the argument is being removed.
* Reordering types when converting a type signature.
* Adding new inputs when converting a type signature.

This cl also lays down the initial foundation for supporting 1->N type mappings, but full support will come in a followup.

Moving forward, function signature changes will be driven by populating a SignatureConversion instance. This class contains all of the necessary information for adding/removing/remapping function signatures; e.g. addInputs, addResults, remapInputs, etc.

PiperOrigin-RevId: 254064665
2019-06-19 23:08:33 -07:00
River Riddle 30bbd91056 Simplify usages of SplatElementsAttr now that it inherits from DenseElementsAttr.
PiperOrigin-RevId: 253910543
2019-06-19 23:07:34 -07:00
Andy Davis 59b68146ff Factor fusion compute cost calculation out of LoopFusion and into LoopFusionUtils (NFC).
PiperOrigin-RevId: 253797886
2019-06-19 23:06:26 -07:00
Alex Zinenko 4291ae7431 Factor Region::getUsedValuesDefinedAbove into Transforms/RegionUtils
Arguably, this function is only useful for transformations and should not
pollute the main IR.  Also make sure it accepts a the resulting container
by-reference instead of returning it.

PiperOrigin-RevId: 253622981
2019-06-19 23:03:51 -07:00
Andy Davis 898cf0e968 LoopFusion: adds support for computing forward computation slices, which will enable fusion of consumer loop nests into their producers in subsequent CLs.
PiperOrigin-RevId: 253601994
2019-06-19 23:03:42 -07:00
Alex Zinenko ee6f84aebd Convert a nest affine loops to a GPU kernel
This converts entire loops into threads/blocks.  No check on the size of the
block or grid, or on the validity of parallelization is performed, it is under
the responsibility of the caller to strip-mine the loops and to perform the
dependence analysis before calling the conversion.

PiperOrigin-RevId: 253189268
2019-06-19 23:02:02 -07:00
River Riddle 6a0555a875 Refactor SplatElementsAttr to inherit from DenseElementsAttr as opposed to being a separate Attribute type. DenseElementsAttr provides a better internal representation for splat values as well as better API for accessing elements.
PiperOrigin-RevId: 253138287
2019-06-19 23:01:52 -07:00
River Riddle 5da741f671 Add basic cost modeling to the dialect conversion infrastructure. This initial cost model favors specific patterns based upon two criteria:
1) Lowest minimum pattern stack depth when legalizing.
  - This leads the system to favor patterns that have lower legalization stacks, i.e. represent a more direct mapping to the target.

2)  Pattern benefit.
  - When considering multiple patterns with the same legalization depth, this favors patterns with a larger specified benefit.

PiperOrigin-RevId: 252713470
2019-06-19 22:59:06 -07:00
River Riddle eb28b30940 NFC: Cleanup the naming scheme for registering legalization actions to be consistent, and move a file functions to the source file.
PiperOrigin-RevId: 252639629
2019-06-11 10:14:35 -07:00
Alex Zinenko 8ad35b90ec Use DialectConversion to lower the Affine dialect to the Standard dialect
This introduces the support for region-containing operations to the dialect
conversion framework in order to support the conversion of affine control-flow
operations into the standard control flow with branches.  Regions that belong
to an operation are converted before the operation itself.  The
DialectConversionPattern can therefore access the converted regions of the
original operation and process them further if necessary.  In particular, the
conversion is allowed to move the blocks from the original region to other
regions and to split blocks into multiple blocks.  All block manipulations must
be performed through the PatternRewriter to ensure they will be undone if the
conversion fails.

Port the pass converting from the affine dialect (loops and ifs with bodies as
regions) to the standard dialect (branch-based cfg) to use DialectConversion in
order to exercise this new functionality.  The modification to the lowering
functions are minor and are focused on using the PatterRewriter instead of
directly modifying the IR.

PiperOrigin-RevId: 252625169
2019-06-11 10:14:27 -07:00
Andy Davis e33e36f178 Return dependence result enum to distiguish between dependence result and error cases (NFC).
PiperOrigin-RevId: 252437616
2019-06-11 10:12:36 -07:00
River Riddle e7ccfb2ae8 Add support to ConversionTarget for storing legalization actions for entire dialects as opposed to individual operations. This allows for better support of unregistered operations, as well as removing the need to collect all of the operations for a given dialect(which may be very expensive).
PiperOrigin-RevId: 251943590
2019-06-09 16:21:32 -07:00
River Riddle e25796ef6e Add support for matchAndRewrite to the DialectConversion patterns. This also drops the default "always succeed" match override to better align with RewritePattern.
PiperOrigin-RevId: 251941625
2019-06-09 16:21:20 -07:00
River Riddle 0560f153b8 Add utility 'create' methods to OperationFolder that will create an operation with a given OpBuilder and automatically try to fold it, similarly to OpBuilder::createOrFold. The difference here is that these methods enable folding to constants in addition to existing values. This functionality is then used to replace linalg::FunctionConstants.
PiperOrigin-RevId: 251716247
2019-06-09 16:19:51 -07:00
River Riddle 9fc00cf840 Always remap results when replacing an operation. This prevents a crash when lowering identity(passthrough) operations to the same resultant type as the original operation.
PiperOrigin-RevId: 251665492
2019-06-09 16:18:44 -07:00
River Riddle 0d2492eb2e When cleaning up after a failed legalization pattern, make sure to remove any newly created value mappings.
PiperOrigin-RevId: 251658984
2019-06-09 16:18:32 -07:00
River Riddle f1b848e470 NFC: Rename FuncBuilder to OpBuilder and refactor to take a top level region instead of a function.
PiperOrigin-RevId: 251563898
2019-06-09 16:17:59 -07:00
River Riddle 9b4a02c1e9 NFC: Rename FoldHelper to OperationFolder and split a large function in two.
PiperOrigin-RevId: 251485843
2019-06-09 16:17:11 -07:00
Ben Vanik 9fc4193eea Adding additional dialect parsing utilities, conversion wrappers, and traversal helpers.
- added a typed walk to Block (matching the equivalent on Function)
- added token parsers (incl optional variants) for : and (
- added applyConversionPatterns that takes a list of functions to apply patterns to

PiperOrigin-RevId: 251481608
2019-06-09 16:16:59 -07:00
River Riddle 95eaca3e0f Refactor the dialect conversion framework to support multi-level conversions. Multi-level conversions are those that require multiple patterns to be applied before an operation is completely legalized. This essentially means that conversion patterns do not have to directly generate legal operations, and may be chained together to produce legal code.
To accomplish this, moving forward users will need to provide a legalization target that defines what operations are legal for the conversion. A target can mark an operation as legal by providing a specific legalization action. The initial actions are:
* Legal
  - This action signals that every instance of the given operation is legal,
    i.e. any combination of attributes, operands, types, etc. is valid.
* Dynamic
  - This action signals that only some instances of a given operation are legal. This
    allows for defining fine-tune constraints, like say std.add is only legal when
    operating on 32-bit integers.

An example target is shown below:
struct MyTarget : public ConversionTarget {
  MyTarget(MLIRContext &ctx) : ConversionTarget(ctx) {
    // All operations in the LLVM dialect are legal.
    addLegalDialect<LLVMDialect>();

    // std.constant op is always legal on this target.
    addLegalOp<ConstantOp>();

    // std.return op has dynamic legality constraints.
    addDynamicallyLegalOp<ReturnOp>();
  }

  /// Implement the custom legalization handler to handle
  /// std.return.
  bool isLegal(Operation *op) override {
    // Process the dynamic handling for a std.return op (and any others that were
    // marked "dynamic").
    ...
  }
};

PiperOrigin-RevId: 251289374
2019-06-03 19:27:02 -07:00
Amit Sabne 7a43da6060 Loop invariant code motion - remove reliance on getForwardSlice. Add more tests.
--

PiperOrigin-RevId: 250950703
2019-06-01 20:13:30 -07:00
Geoffrey Martin-Noble 60d6249fbd Replace checks against numDynamicDims with hasStaticShape
--

PiperOrigin-RevId: 250782165
2019-06-01 20:11:31 -07:00
Jacques Pienaar 4a697a91de Fix 5 ClangTidy - Readability findings.
* the 'empty' method should be used to check for emptiness instead of 'size'
    * using decl 'CapturableHandle' is unused
    * redundant get() call on smart pointer
    * using decl 'apply' is unused
    * using decl 'ScopeGuard' is unused

--

PiperOrigin-RevId: 250623863
2019-06-01 20:10:22 -07:00
River Riddle 9abdbb3189 NFC: Inline toString as operations can be streamed directly into raw_ostream.
--

PiperOrigin-RevId: 250619765
2019-06-01 20:10:12 -07:00
MLIR Team 5a91b9896c Remove "size" property of affine maps.
--

PiperOrigin-RevId: 250572818
2019-06-01 20:09:02 -07:00
Andy Davis 1de0f97fff LoopFusionUtils CL 2/n: Factor out and generalize slice union computation.
*) Factors slice union computation out of LoopFusion into Analysis/Utils (where other iteration slice utilities exist).
    *) Generalizes slice union computation to take the union of slices computed on all loads/stores pairs between source and destination loop nests.
    *) Fixes a bug in FlatAffineConstraints::addSliceBounds where redundant constraints were added.
    *) Takes care of a TODO to expose FlatAffineConstraints::mergeAndAlignIds as a public method.

--

PiperOrigin-RevId: 250561529
2019-06-01 20:08:52 -07:00
Alex Zinenko c2d105811a Do not assume Blocks belong to Functions
Fix Block::splitBlock and Block::eraseFromFunction that erronously assume
    blocks belong to functions.  They now belong to regions.  When splitting, new
    blocks should be created in the same region as the existing block.  When
    erasing a block, it should be removed from the region rather than from the
    function body that transitively contains the region.

    Also rename Block::eraseFromFunction to Block::erase for consistency with other
    IR containers.

--

PiperOrigin-RevId: 250278272
2019-06-01 20:05:21 -07:00
Alex Zinenko d4c071cc69 Decouple affine->standard lowering from the pass
The lowering from the Affine dialect to the Standard dialect was originally
    implemented as a standalone pass.  However, it may be used by other passes
    willing to lower away some of the affine constructs as a part of their
    operation.  Decouple the transformation functions from the pass infrastructure
    and expose the entry point for the lowering.

    Also update the lowering functions to use `LogicalResult` instead of bool for
    return values.

--

PiperOrigin-RevId: 250229198
2019-06-01 20:05:01 -07:00
River Riddle c2d069323b Rename DialectConversion to TypeConverter and split out pattern construction. This simplifies building the conversion pattern list from multiple sources.
--

PiperOrigin-RevId: 249930583
2019-06-01 20:02:03 -07:00
Lei Zhang ba104f871c Add TestLoopFusion.cpp to CMakeLists.txt
--

PiperOrigin-RevId: 249901490
2019-06-01 20:00:52 -07:00
Andy Davis e53b7d2c02 Add LoopFusionUtils.cpp to CMakeLists.
--

PiperOrigin-RevId: 249887371
2019-06-01 20:00:33 -07:00
Andy Davis a560f2c646 Affine Loop Fusion Utility Module (1/n).
*) Adds LoopFusionUtils which will expose a set of loop fusion utilities (e.g. dependence checks, fusion cost/storage reduction, loop fusion transformation) for use by loop fusion algorithms. Support for checking block-level fusion-preventing dependences is added in this CL (additional loop fusion utilities will be added in subsequent CLs).
    *) Adds TestLoopFusion test pass for testing LoopFusionUtils at a fine granularity.
    *) Adds unit test for testing dependence check for block-level fusion-preventing dependences.

--

PiperOrigin-RevId: 249861071
2019-06-01 20:00:23 -07:00
River Riddle ae1651368f NFC: Rename DialectConversionPattern to ConversionPattern.
--

PiperOrigin-RevId: 249857277
2019-06-01 20:00:13 -07:00
Alex Zinenko fe2716aee3 Detemplatize convertRegion in DialectConversion
Originally, FunctionConverter::convertRegion in the DialectConversion framework
    was implemented as a function template because it was creating a new region in
    the parent object, which could have been an op or a function.  Since
    DialectConversion now operates in place, new region is no longer created so
    there is no need for convertRegion to be aware of the parent, only of the error
    reporting location.

--

PiperOrigin-RevId: 249826392
2019-06-01 20:00:04 -07:00
River Riddle 4958ec2414 Apply operation rewrites before updating arguments.
--

PiperOrigin-RevId: 249678839
2019-06-01 19:58:14 -07:00
River Riddle 14d1cfbccb Decouple running a conversion from the DialectConversion class. The DialectConversion class is only necessary for type signature changes(block arguments or function arguments). This isn't always desired when performing a dialect conversion. This allows for those conversions without this need to run per function instead of per module.
--

PiperOrigin-RevId: 249657549
2019-06-01 19:58:04 -07:00
River Riddle c33862b0ed Refactor FunctionAttr to hold the internal function reference by name instead of pointer. The one downside to this is that the function reference held by a FunctionAttr needs to be explicitly looked up from the parent module. This provides several benefits though:
* There is no longer a need to explicitly remap function attrs.
      - This removes a potentially expensive call from the destructor of Function.
      - This will enable some interprocedural transformations to now run intraprocedurally.
      - This wasn't scalable and forces dialect defined attributes to override
        a virtual function.
    * Replacing a function is now a trivial operation.
    * This is a necessary first step to representing functions as operations.

--

PiperOrigin-RevId: 249510802
2019-06-01 19:56:54 -07:00
River Riddle d15d107da1 Refactor DialectConversion to operate on functions in-place *without* any cloning. This works by caching all of the requested pattern rewrite operations, e.g. replace operation, and only applying them on a completely successful conversion.
--

PiperOrigin-RevId: 249490306
2019-06-01 19:56:24 -07:00
MLIR Team 80884d28ac [LoopFusion] Don't count terminator op in compute cost.
--

PiperOrigin-RevId: 249124895
2019-06-01 19:52:52 -07:00
Nicolas Vasilache fdbbb3c274 Use lambdas for nesting edsc constructs.
Using ArrayRef introduces issues with the order of evaluation between a constructor and
    the arguments of the subsequent calls to the `operator()`.
    As a consequence the order of captures is not well-defined can go wrong with certain compilers (e.g. gcc-6.4).
    This CL fixes the issue by using lambdas in lieu of ArrayRef.

--

PiperOrigin-RevId: 249114775
2019-05-20 13:50:28 -07:00
Mehdi Amini 164c3c7ac5 Fix debug build: static constexpr data member must have a definition (until C++17)
--

PiperOrigin-RevId: 248990338
2019-05-20 13:48:36 -07:00
River Riddle 68250edbfa NFC: Tidy up DialectConversion.cpp and rename DialectOpConversion to DialectConversionPattern.
--

PiperOrigin-RevId: 248980810
2019-05-20 13:48:19 -07:00
River Riddle 6241cf132e Refactor the DialectConversion process to clone each function and then operate in-place, as opposed to incrementally constructing a new function. This is crucial to allowing the use of non type-conversion patterns(normal RewritePatterns) as part of the conversion process.
The converter now works by inserting fake producer operations when replacing the results of an existing operation with values of a different, now legal, type. These fake operations are guaranteed to never escape the converter.

--

PiperOrigin-RevId: 248969130
2019-05-20 13:48:10 -07:00
River Riddle 8780d8d8eb Add user iterators to IRObjects, i.e. Values.
--

PiperOrigin-RevId: 248877752
2019-05-20 13:47:19 -07:00
River Riddle 3de0c7696b Rewrite the DialectOpConversion patterns to inherit from RewritePattern instead of Pattern. This simplifies the infrastructure a bit by being able to reuse PatternRewriter and the RewritePatternMatcher, but also starts to lay the groundwork for a more generalized legalization framework that can operate on DialectOpConversions as well as normal RewritePatterns.
--

PiperOrigin-RevId: 248836492
2019-05-20 13:47:01 -07:00
River Riddle 1a100849c4 Add support for saving and restoring the insertion point of a FuncBuilder. This also updates the edsc::ScopedContext to use a single builder that saves/restores insertion points. This is necessary for using edscs within RewritePatterns.
--

PiperOrigin-RevId: 248812645
2019-05-20 13:46:35 -07:00
River Riddle eb5ec03960 Refactor PatternRewriter to inherit from FuncBuilder instead of Builder. This is necessary for allowing more complicated rewrites in the future that may do things like update the insertion point (e.g. for rewrites involving regions).
--

PiperOrigin-RevId: 248803153
2019-05-20 13:46:26 -07:00
River Riddle 1982afb145 Unify the 'constantFold' and 'fold' hooks on an operation into just 'fold'. This new unified fold hook will take constant attributes as operands, and may return an existing 'Value *' or a constant 'Attribute' when folding. This removes the awkward situation where a simple canonicalization like "sub(x,x)->0" had to be written as a canonicalization pattern as opposed to a fold.
--

PiperOrigin-RevId: 248582024
2019-05-20 13:44:24 -07:00
Chris Lattner 9ec6b5b749 Remove some extraneous const qualifiers on Type, and 0b1 -> 1 in tblgen files. (NFC)
--

PiperOrigin-RevId: 248332674
2019-05-20 13:42:56 -07:00
Jacques Pienaar cde4d5a6d9 Remove unnecessary C++ specifier in CPP files. NFC.
These are only required in .h files to disambiguate between C and C++ header files.

--

PiperOrigin-RevId: 248219135
2019-05-20 13:42:13 -07:00
Stella Laurenzo 1a2ad06bae Fix lingering sign compare warnings in exposed by "ninja check-mlir".
--

PiperOrigin-RevId: 248050178
2019-05-20 13:41:11 -07:00
Jacques Pienaar c82e1da268 Remove unused function and avoid unused variable warning. NFC.
--

PiperOrigin-RevId: 247991231
2019-05-20 13:40:23 -07:00
Andy Davis 90d4023c9b Factor out loop interchange code from LoopFusion into LoopUtils (NFC).
--

PiperOrigin-RevId: 247926512
2019-05-20 13:38:12 -07:00
River Riddle d5b60ee840 Replace Operation::isa with llvm::isa.
--

PiperOrigin-RevId: 247789235
2019-05-20 13:37:52 -07:00
River Riddle adca3c2edc Replace Operation::cast with llvm::cast.
--

PiperOrigin-RevId: 247785983
2019-05-20 13:37:42 -07:00
River Riddle c5ecf9910a Add support for using llvm::dyn_cast/cast/isa for operation casts and replace usages of Operation::dyn_cast with llvm::dyn_cast.
--

PiperOrigin-RevId: 247780086
2019-05-20 13:37:31 -07:00
MLIR Team 41d90a85bd Automated rollback of changelist 247778391.
PiperOrigin-RevId: 247778691
2019-05-20 13:37:20 -07:00
River Riddle 02e03b9bf4 Add support for using llvm::dyn_cast/cast/isa for operation casts and replace usages of Operation::dyn_cast with llvm::dyn_cast.
--

PiperOrigin-RevId: 247778391
2019-05-20 13:37:10 -07:00
Chris Lattner 81e478adca rename -memref-dependence-check to -test-memref-dependence-check since it
generates remarks for testing, it isn't itself a transformation.

    While there, upgrade its diagnostic emission to use the streaming interface.

    Prune some unnecessary #includes.

--

PiperOrigin-RevId: 247768062
2019-05-20 13:36:38 -07:00
River Riddle fe7b23792d Remove some unnecessary or duplicated header includes from IR/.
--

PiperOrigin-RevId: 247762545
2019-05-20 13:36:28 -07:00
Chris Lattner 0134b5df3a Cleanups and simplifications to code, noticed by inspection. NFC.
--

PiperOrigin-RevId: 247758075
2019-05-20 13:36:17 -07:00
Mehdi Amini 91f0781000 Remove extra `;` after function definition (NFC)
Fix a GCC warning

--

PiperOrigin-RevId: 247670176
2019-05-10 19:29:26 -07:00
Mehdi Amini 83cce46b96 Remove unused Vectorize constructor (NFC)
Fix gcc warning.

--

PiperOrigin-RevId: 247647114
2019-05-10 19:29:01 -07:00
Andy Davis 6254a42d58 Fix bug in DmaGenerate pass where MemRefRegion union was not propagated to read region.
Also cleaned up dma-generate.mlir a bit.

--

PiperOrigin-RevId: 247417358
2019-05-10 19:25:44 -07:00
Lei Zhang 323e1bf7f8 Inline a string used in lambda function to fix capture error
The string was referenced but not captured in the lambda, which causes
    a failure when compiling with MSVC.

    This issue was discovered by @loic-joly-sonarsource with a proposed fix
    in https://github.com/tensorflow/mlir/pull/22.

--

PiperOrigin-RevId: 247085897
2019-05-10 19:23:49 -07:00
River Riddle ae9f4f2157 Simplify the emission of various diagnostics created in Analysis/ and Transforms/ by using the new diagnostic infrastructure.
--

PiperOrigin-RevId: 246955332
2019-05-10 19:23:07 -07:00
River Riddle 983e0eea95 Simplify several usages of attributes now that they always have a type and, transitively, access to the context.
This also fixes a bug where FunctionAttrs were not being remapped for function and function argument attributes.

--

PiperOrigin-RevId: 246876924
2019-05-10 19:22:41 -07:00
River Riddle 4ea887be41 Namespaceify a few explicit template specializations to appease errors caused by a bug in gcc versions < 7.0.
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56480)

--

PiperOrigin-RevId: 246664463
2019-05-06 08:29:09 -07:00
Jacques Pienaar 2fe8ae4f6c Fix up some mixed sign warnings.
--

PiperOrigin-RevId: 246614498
2019-05-06 08:28:20 -07:00
Nicolas Vasilache 258e8d9ce2 Prepend an "affine-" prefix to Affine pass option names - NFC
Trying to activate both LLVM and MLIR passes in mlir-cpu-runner showed name collisions when registering pass names.
    One possible way of disambiguating that should also work across dialects is to prepend the dialect name to the passes that specifically operate on that dialect.

    With this CL, mlir-cpu-runner tests still run when both LLVM and MLIR passes are registered

--

PiperOrigin-RevId: 246539917
2019-05-06 08:26:44 -07:00
MLIR Team 9c66417569 Fix bug in LoopTiling where a loop with trip count of 1 caused a division by zero
--

PiperOrigin-RevId: 246480710
2019-05-06 08:26:15 -07:00
River Riddle b14c4b4ca8 Add support for basic remark diagnostics. This is the minimal functionality needed to separate notes from remarks. It also provides a starting point to start building out better remark infrastructure.
--

PiperOrigin-RevId: 246175216
2019-05-06 08:24:02 -07:00
River Riddle eaf7f6b671 Start sketching out a new diagnostics infrastructure. Create a new class 'DiagnosticEngine' and move the diagnostic handler support and final diagnostic emission from the MLIRContext to it.
--

PiperOrigin-RevId: 246163897
2019-05-06 08:23:53 -07:00
Nicolas Vasilache 56c7a957bf Parsing support for Range, View and Slice operations
This CL implements the previously unsupported parsing for Range, View and Slice operations.
    A pass is introduced to lower to the LLVM.
    Tests are moved out of C++ land and into mlir/test/Examples.
    This allows better fitting within standard developer workflows.

--

PiperOrigin-RevId: 245796600
2019-05-06 08:20:55 -07:00
River Riddle 1423acc03c Rename isa_nonnull to isa_and_nonnull to match the upstream llvm name.
--

PiperOrigin-RevId: 244928036
2019-04-23 22:03:14 -07:00
Feng Liu 5c757087c7 Apply patterns repeatly if the function is modified
During the pattern rewrite, if the function is changed, i.e. ops created,
    deleted or swapped, the pattern rewriter needs to re-scan the function entirely
    and apply the patterns again, so the patterns whose root ops have been popped
    out from the working list nor an immediate users of the changed ops can be
    reconsidered.

    A command line flag is added to set the max number of iterations rescanning the
    function for pattern match. If the rewrite doesn' converge after this number,
    this compiling will continue and the result can be sub-optimal.

    One unit test is updated because this change fixed the missing optimization opportunities.

--

PiperOrigin-RevId: 244754190
2019-04-23 22:02:16 -07:00
Amit Sabne 4aa9235ae0 Fix LLVM_DEBUG instances
--

PiperOrigin-RevId: 244058051
2019-04-18 11:49:39 -07:00
Amit Sabne 7905da656e Loop invariant code motion.
--

PiperOrigin-RevId: 244043679
2019-04-18 11:49:31 -07:00
Lei Zhang bdd56eca49 Remove checks guaranteed to be true by the type
This addresses the compiler wraning of "comparison of unsigned expression
    >= 0 is always true [-Wtype-limits]".

--

PiperOrigin-RevId: 242868703
2019-04-11 10:52:33 -07:00
Lei Zhang 2e7895d5f1 Add parentheses in various asserts to group predicates
This addresses the "suggest parentheses around ‘&&’ within ‘||’
    [-Wparentheses]" compiler warnings.

--

PiperOrigin-RevId: 242868670
2019-04-11 10:52:21 -07:00
Andy Davis 44f6dffbf8 Factor code to compute dependence components out of loop fusion pass, and into a reusable utility function (NFC).
--

PiperOrigin-RevId: 242716259
2019-04-11 10:51:53 -07:00
Amit Sabne 70a416de14 Fix typos in LoopFusion
--

PiperOrigin-RevId: 242679298
2019-04-11 10:51:43 -07:00
Mehdi Amini f40634ef3a Filter DialectConversion pattern to be considered only if the root kind matches the operation.
This is the same logic as the PatterRewriter.

--

PiperOrigin-RevId: 242287241
2019-04-07 18:21:34 -07:00
River Riddle e4628b79fb Add new utilities for RTTI Operation casting: dyn_cast_or_null and isa_nonnull
* dyn_cast_or_null
      - This will first check if the operation is null before trying to 'dyn_cast':

        Value *v = ...;
        if (auto forOp = dyn_cast_or_null<AffineForOp>(v->getDefiningOp()))
          ...
    * isa_nonnull
      - This will first check if the pointer is null before trying to 'isa':

        Value *v = ...;
        if (isa_nonnull<AffineForOp>(v->getDefiningOp());
          ...

--

PiperOrigin-RevId: 242171343
2019-04-07 18:20:07 -07:00
Mehdi Amini 7a640e65e9 Fix CMake build: reflect that a new file Utils/ConstantFoldUtils.cpp was added
--

PiperOrigin-RevId: 242123122
2019-04-05 07:43:50 -07:00
Mehdi Amini 01e8ec94c3 Fix CMake build: account for renamed files and add missing include on MacOS
--

PiperOrigin-RevId: 242101364
2019-04-05 07:43:23 -07:00
River Riddle c4a5386e48 NFC: Replace usages of iterator_range<operand_iterator> with operand_range.
--

PiperOrigin-RevId: 242031201
2019-04-05 07:42:29 -07:00
MLIR Team 0cd589c337 Create a LoopUtil function to return perfectly nested loop set
--

PiperOrigin-RevId: 242019230
2019-04-05 07:42:01 -07:00
River Riddle a8f4b9eeeb Iterate on the operations to fold in TestConstantFold in reverse to remove the need for ConstantFoldHelper to have a flag for insertion at the head of the entry block. This also fixes an asan bug in TestConstantFold due to the iteration order of operations and ConstantFoldHelper's constant insertion placement.
Note: This now means that we cannot fold chains of operations, i.e. where constant foldable operations feed into each other. Given that this is a testing pass solely for constant folding, this isn't really something that we want anyways. Constant fold tests should be simple and direct, with more advanced folding/feeding being tested with the canonicalizer.

--

PiperOrigin-RevId: 242011744
2019-04-05 07:41:52 -07:00
River Riddle dca21299cb Fix a few warnings for missing parentheses around '||' and extra semicolons.
--

PiperOrigin-RevId: 241994767
2019-04-05 07:41:43 -07:00
Lei Zhang 4e40c83291 Deduplicate constant folding logic in ConstantFold and GreedyPatternRewriteDriver
There are two places containing constant folding logic right now: the ConstantFold
    pass and the GreedyPatternRewriteDriver. The logic was not shared and started to
    drift apart. We were testing constant folding logic using the ConstantFold pass,
    but lagged behind the GreedyPatternRewriteDriver, where we really want the constant
    folding to happen.

    This CL pulled the logic into utility functions and classes for sharing between
    these two places. A new ConstantFoldHelper class is created to help constant fold
    and de-duplication.

    Also, renamed the ConstantFold pass to TestConstantFold to make it clear that it is
    intended for testing purpose.

--

PiperOrigin-RevId: 241971681
2019-04-05 07:41:32 -07:00
River Riddle 6fa3181329 Remove the non-postorder walk functions from Function/Block/Instruction and rename walkPostOrder to walk.
--

PiperOrigin-RevId: 241965239
2019-04-05 07:41:23 -07:00
Andy Davis d0d1b2a30d Fix bug in LoopTiling where creation of tile-space loop upper bound did not handle symbol operands correctly.
--

PiperOrigin-RevId: 241958502
2019-04-05 07:41:12 -07:00
Nicolas Vasilache f1b12f5a64 Fix test that fails on non-determinism in LowerVectorTransfers
This CL fixes the non-determinism across compilers in an edsc::select expression used in LowerVectorTransfers. This is achieved by factoring the expression out of the function call to ensure a deterministic order of evaluation.
    Since the expression is now factored out, fewer IR is generated and the test is updated accordingly.

--

PiperOrigin-RevId: 241679962
2019-04-03 01:09:13 -07:00
River Riddle 67a52c44b1 Rewrite the verify hooks on operations to use LogicalResult instead of bool. This also changes the return of Operation::emitError/emitOpError to LogicalResult as well.
--

PiperOrigin-RevId: 241588075
2019-04-02 13:40:47 -07:00
Andy Davis 7c1fc9e795 Enable producer-consumer fusion for liveout memrefs if consumer read region matches producer write region.
--

PiperOrigin-RevId: 241517207
2019-04-02 13:39:50 -07:00
River Riddle 084669e005 Remove MLPatternLoweringPass and rewrite LowerVectorTransfers to use RewritePattern instead.
--

PiperOrigin-RevId: 241455472
2019-04-02 13:39:17 -07:00
Mehdi Amini b3a407fa68 Fix MacOS build
This is making up for some differences in standard library and linker flags.
    It also get rid of the requirement to build with RTTI.

--

PiperOrigin-RevId: 241348845
2019-04-01 11:00:30 -07:00
Jacques Pienaar 1273af232c Add build files and update README.
* Add initial version of build files;
    * Update README with instructions to download and build MLIR from github;

--

PiperOrigin-RevId: 241102092
2019-03-30 11:23:22 -07:00
Nicolas Vasilache c9d5f3418a Cleanup SuperVectorization dialect printing and parsing.
On the read side,
```
%3 = vector_transfer_read %arg0, %i2, %i1, %i0 {permutation_map: (d0, d1, d2)->(d2, d0)} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32>
```

becomes:

```
%3 = vector_transfer_read %arg0[%i2, %i1, %i0] {permutation_map: (d0, d1, d2)->(d2, d0)} : memref<?x?x?xf32>, vector<32x256xf32>
```

On the write side,

```
vector_transfer_write %0, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>, index, index
```

becomes

```
vector_transfer_write %0, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>
```

Documentation will be cleaned up in a followup commit that also extracts a proper .md from the top of the file comments.

PiperOrigin-RevId: 241021879
2019-03-29 17:56:42 -07:00
Nicolas Vasilache f93a5be65f Make createMaterializeVectorsPass take a vectorSize parameter - NFC
This CL allows the programmatic control of the target hardware vector size when creating a MaterializeVectorsPass.
This is useful for registering passes for the tutorial.

PiperOrigin-RevId: 240996136
2019-03-29 17:56:12 -07:00
Nicolas Vasilache 094ca64ab0 Refactor vectorization patterns
This CL removes the reliance of the vectorize pass on the specification of a `fastestVaryingDim` parameter. This parameter is a restriction meant to more easily target a particular loop/memref combination for vectorization and is mainly used for testing.

This also had the side-effect of restricting vectorization patterns to only the ones in which all memrefs were contiguous along the same loop dimension. This simple restriction prevented matmul to vectorize in 2-D.

this CL removes the restriction and adds the matmul test which vectorizes in 2-D along the parallel loops. Support for reduction loops is left for future work.

PiperOrigin-RevId: 240993827
2019-03-29 17:55:36 -07:00
MLIR Team 9d30b36aaf Enable input-reuse fusion to search function arguments for fusion candidates (takes care of a TODO, enables another tutorial test case).
PiperOrigin-RevId: 240979894
2019-03-29 17:54:36 -07:00
River Riddle 106dd08e99 Change the vectorizer test pass to output via diagnostics instead of llvm::outs. This allows for the output to be deterministic when multi-threading is enabled.
PiperOrigin-RevId: 240905858
2019-03-29 17:54:21 -07:00
Jacques Pienaar cd0b925dc2 Remove extra qualification
PiperOrigin-RevId: 240875432
2019-03-29 17:52:36 -07:00
Alex Zinenko 3173a63f3f Dialect Conversion: convert regions of operations when cloning them
Dialect conversion currently clones the operations that did not match any
pattern.  This includes cloning any regions that belong to these operations.
Instead, apply conversion recursively to the nested regions.

Note that if an operation matched one of the conversion patterns, it is up to
the pattern rewriter to fill in the regions of the converted operation.  This
may require calling back to the converter and is left for future work.

PiperOrigin-RevId: 240872410
2019-03-29 17:52:04 -07:00
MLIR Team 9d9675fc8f Remove overly conservative check in LoopFusion pass (enables fusion in tutorial example).
PiperOrigin-RevId: 240859227
2019-03-29 17:51:16 -07:00
River Riddle 213b8d4d3b Rename InstOperand to OpOperand.
PiperOrigin-RevId: 240814651
2019-03-29 17:50:41 -07:00
Nicolas Vasilache 4dc7af9da8 Make vectorization aware of loop semantics
Now that we have a dependence analysis, we can check that loops are indeed parallel and make vectorization correct.

PiperOrigin-RevId: 240682727
2019-03-29 17:49:30 -07:00
Nicolas Vasilache c3742d20b5 Give the Vectorize pass a virtualVectorSize argument.
This CL allows vectorization to be called and configured in other ways than just via command line arguments.
This allows triggering vectorization programmatically.

PiperOrigin-RevId: 240638208
2019-03-29 17:48:12 -07:00
River Riddle 99b87c9707 Replace usages of Instruction with Operation in the Transforms/ directory.
PiperOrigin-RevId: 240636130
2019-03-29 17:47:26 -07:00
Mehdi Amini 3518122e86 Simplify API uses of `getContext()` (NFC)
The Pass base class is providing a convenience getContext() accessor.

PiperOrigin-RevId: 240634961
2019-03-29 17:47:11 -07:00
Jacques Pienaar b0244b66a5 Fix include path in test pass.
PiperOrigin-RevId: 240628260
2019-03-29 17:46:41 -07:00
River Riddle 9c08540690 Replace usages of Instruction with Operation in the /Analysis directory.
PiperOrigin-RevId: 240569775
2019-03-29 17:44:56 -07:00
Alex Zinenko 5a5bba0279 Introduce affine terminator
Due to legacy reasons (ML/CFG function separation), regions in affine control
flow operations require contained blocks not to have terminators.  This is
inconsistent with the notion of the block and may complicate code motion
between regions of affine control operations and other regions.

Introduce `affine.terminator`, a special terminator operation that must be used
to terminate blocks inside affine operations and transfers the control back to
he region enclosing the affine operation.  For brevity and readability reasons,
allow `affine.for` and `affine.if` to omit the `affine.terminator` in their
regions when using custom printing and parsing format.  The custom parser
injects the `affine.terminator` if it is missing so as to always have it
present in constructed operations.

Update transformations to account for the presence of terminator.  In
particular, most code motion transformation between loops should leave the
terminator in place, and code motion between loops and non-affine blocks should
drop the terminator.

PiperOrigin-RevId: 240536998
2019-03-29 17:44:24 -07:00
River Riddle af45236c70 Add experimental support for multi-threading the pass manager. This adds support for running function pipelines on functions across multiple threads, and is guarded by an off-by-default flag 'experimental-mt-pm'. There are still quite a few things that need to be done before multi-threading is ready for general use(e.g. pass-timing), but this allows for those things to be tested in a multi-threaded environment.
PiperOrigin-RevId: 240489002
2019-03-29 17:44:08 -07:00
River Riddle f9d91531df Replace usages of Instruction with Operation in the /IR directory.
This is step 2/N to renaming Instruction to Operation.

PiperOrigin-RevId: 240459216
2019-03-29 17:43:37 -07:00
River Riddle 9ffdc930c0 Rename the Instruction class to Operation. This just renames the class, usages of Instruction will still refer to a typedef in the interim.
This is step 1/N to renaming Instruction to Operation.

PiperOrigin-RevId: 240431520
2019-03-29 17:42:50 -07:00
Alex Zinenko a7215a9032 Allow creating standalone Regions
Currently, regions can only be constructed by passing in a `Function` or an
`Instruction` pointer referencing the parent object, unlike `Function`s or
`Instruction`s themselves that can be created without a parent.  It leads to a
rather complex flow in operation construction where one has to create the
operation first before being able to work with its regions.  It may be
necessary to work with the regions before the operation is created.  In
particular, in `build` and `parse` functions that are executed _before_ the
operation is created in cases where boilerplate region manipulation is required
(for example, inserting the hypothetical default terminator in affine regions).
Allow creating standalone regions.  Such regions are meant to own a list of
blocks and transfer them to other regions on demand.

Each instruction stores a fixed number of regions as trailing objects and has
ownership of them.  This decreases the size of the Instruction object for the
common case of instructions without regions.  Keep this behavior intact.  To
allow some flexibility in construction, make OperationState store an owning
vector of regions.  When the Builder creates an Instruction from
OperationState, the bodies of the regions are transferred into the
instruction-owned regions to minimize copying.  Thus, it becomes possible to
fill standalone regions with blocks and move them to an operation when it is
constructed, or move blocks from a region to an operation region, e.g., for
inlining.

PiperOrigin-RevId: 240368183
2019-03-29 17:40:59 -07:00
Chris Lattner 46ade282c8 Make FunctionPass::getFunction() return a reference to the function, instead of
a pointer.  This makes it consistent with all the other methods in
FunctionPass, as well as with ModulePass::getModule().  NFC.

PiperOrigin-RevId: 240257910
2019-03-29 17:40:44 -07:00
River Riddle 96ebde9cfd Replace usages of "Op::operator->" with ".".
This is step 2/N of removing the temporary operator-> method as part of the de-const transition.

PiperOrigin-RevId: 240200792
2019-03-29 17:40:09 -07:00
River Riddle 5de726f493 Refactor the Pattern framework to allow for combined match/rewrite patterns. This is done by adding a new 'matchAndRewrite' function to RewritePattern that performs the match and rewrite in one step. The default behavior simply calls into the existing 'match' and 'rewrite' functions. The 'PatternMatcher' class has now been specialized for RewritePatterns and has been rewritten to make use of the new matchAndRewrite functionality.
This combined match/rewrite functionality allows simplifying the majority of existing RewritePatterns, as they do not benefit from separate match and rewrite functions.

Some of the existing canonicalization patterns in StandardOps have been modified to take advantage of this functionality.

PiperOrigin-RevId: 240187856
2019-03-29 17:39:35 -07:00
River Riddle af1abcc80b Replace usages of "operator->" with "." for the AffineOps.
Note: The "operator->" method is a temporary helper for the de-const transition and is gradually being phased out.
PiperOrigin-RevId: 240179439
2019-03-29 17:39:19 -07:00
River Riddle 832567b379 NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for' and set the namespace of the AffineOps dialect to 'affine'.
PiperOrigin-RevId: 240165792
2019-03-29 17:39:03 -07:00
Mehdi Amini bb621a5596 Using getContext() instead of getInstruction()->getContext() on Operation (NFC)
PiperOrigin-RevId: 240088209
2019-03-29 17:38:29 -07:00
Chris Lattner e510de0305 Various small cleanups to the code, mostly removing const_cast's.
PiperOrigin-RevId: 240083489
2019-03-29 17:37:58 -07:00
River Riddle 9c6e92360c NFC: Rename the 'if' operation in the AffineOps dialect to 'affine.if'.
PiperOrigin-RevId: 240071154
2019-03-29 17:36:53 -07:00
Chris Lattner d9b5bc8f55 Remove OpPointer, cleaning up a ton of code. This also moves Ops to using
inherited constructors, which is cleaner and means you can now use DimOp()
to get a null op, instead of having to use Instruction::getNull<DimOp>().

This removes another 200 lines of code.

PiperOrigin-RevId: 240068113
2019-03-29 17:36:21 -07:00
Chris Lattner dd2b2ec542 Push a bunch of 'consts' out of the *Op structure, in prep for removing
OpPointer.

PiperOrigin-RevId: 240044712
2019-03-29 17:35:35 -07:00
Nicolas Vasilache f26c7cd792 Cleanup ValueHandleArray
We just need a way to unpack ArrayRef<ValueHandle> to ArrayRef<Value*>.
No need to expose this to the user.

This reduces the cognitive overhead for the tutorial.

PiperOrigin-RevId: 240037425
2019-03-29 17:35:20 -07:00
Chris Lattner 986310a68f Remove const from Value, Instruction, Argument, and the various methods on the
*Op classes.  This is a net reduction by almost 400LOC.

PiperOrigin-RevId: 239972443
2019-03-29 17:34:33 -07:00
Chris Lattner 3d6c74fff5 Remove const from mlir::Block.
This also eliminates some incorrect reinterpret_cast logic working around it, and numerous const-incorrect issues (like block argument iteration).

PiperOrigin-RevId: 239712029
2019-03-29 17:30:30 -07:00
Chris Lattner 88e9f418f5 Continue pushing const out of the core IR types - in this case, remove const
from Function.

PiperOrigin-RevId: 239638635
2019-03-29 17:29:58 -07:00
Nicolas Vasilache fc5bbdd6c8 Improve comment for `augmentMapAndBounds`
Followup from a previous CL.

PiperOrigin-RevId: 239591775
2019-03-29 17:27:57 -07:00
Chris Lattner 589df37142 Move to new `const` model, part 1: remove ConstOpPointer.
This eliminate ConstOpPointer (but keeps OpPointer for now) by making OpPointer
implicitly launder const in a const incorrect way.  It will eventually go away
entirely, this is a progressive step towards the new const model.

PiperOrigin-RevId: 239512640
2019-03-29 17:26:56 -07:00
Nicolas Vasilache d6c650cfb5 Properly propagate induction variable in tiling
This CL fixes an issue where cloned loop induction variables were not properly
propagated and beefs up the corresponding test.

PiperOrigin-RevId: 239422961
2019-03-29 17:25:53 -07:00
Jacques Pienaar a8ed2ca8fd Cleanup for changes failing with std=c++11
The static constexpr were failing with undefined reference due to lacking definition at namespace scope.

PiperOrigin-RevId: 239241157
2019-03-29 17:25:24 -07:00
Jacques Pienaar 57270a9a99 Remove some statements that required >C++11, add includes and qualify names. NFC.
PiperOrigin-RevId: 239197784
2019-03-29 17:24:53 -07:00
Dimitrios Vytiniotis ee4cfefca8 Avoiding allocations during argument attribute conversion.
PiperOrigin-RevId: 239144675
2019-03-29 17:24:38 -07:00
Nicolas Vasilache c3b0c6a0dc Cleanups Vectorize and SliceAnalysis - NFC
This CL cleans up and refactors super-vectorization and slice analysis.

PiperOrigin-RevId: 238986866
2019-03-29 17:23:07 -07:00
Nicolas Vasilache a89d8c0a1a Port Tablegen'd reference implementation of Add to declarative builders.
PiperOrigin-RevId: 238977252
2019-03-29 17:22:36 -07:00
Nicolas Vasilache 3a12bc5041 Remove LOAD/STORE/RETURN boilerplate in declarative builders.
This CL introduces a ValueArrayHandle helper to manage the implicit conversion
of ArrayRef<ValueHandle> -> ArrayRef<Value*> by converting first to ValueArrayHandle.
Without this, boilerplate operations that take ArrayRef<Value*> cannot be removed easily.

This all seems to boil down to decoupling Value from Type.
Alternative solutions exist (e.g. MLIR using Value by value everywhere) but they would be very intrusive. This seems to be the lowest impedance change.

Intrinsics are also lowercased by popular demand.

PiperOrigin-RevId: 238974125
2019-03-29 17:22:20 -07:00
Nicolas Vasilache f43388e4ce Port LowerVectorTransfers from EDSC + AST to declarative builders
This CL removes the dependency of LowerVectorTransfers on the AST version of EDSCs which will be retired.

This exhibited a pretty fundamental staging difference in AST-based vs declarative based emission.

Since the delayed creation with an AST was staged, the loop order came into existence after the clipping expressions were computed.
This now changes as the loops first need to be created declaratively in fixed order and then the clipping expressions are created.
Also, due to lack of staging, coalescing cannot be done on the fly anymore and
needs to be done either as a pre-pass (current implementation) or as a local transformation on the generated IR (future work).

Tests are updated accordingly.

PiperOrigin-RevId: 238971631
2019-03-29 17:22:06 -07:00
River Riddle 27d1bb920e Cache the simplified attributes in SimplifyAffineStructures to avoid redundant simplifications, as well as unnecessary accesses to the MLIRContext.
PiperOrigin-RevId: 238654325
2019-03-29 17:20:46 -07:00
Alex Zinenko 276fae1b0d Rename BlockList into Region
NFC.  This is step 1/n to specifying regions as parts of any operation.

PiperOrigin-RevId: 238472370
2019-03-29 17:18:04 -07:00
Uday Bondhugula a228b7d477 Change getMemoryFootprintBytes emitError to a warning
- this is really not a hard error; emit a warning instead (for inability to compute
  footprint due to the union failing due to unimplemented cases)
- remove a misleading warning from LoopFusion.cpp

PiperOrigin-RevId: 238118711
2019-03-29 17:16:12 -07:00
Uday Bondhugula 9f2781e8dd Fix misc bugs / TODOs / other improvements to analysis utils
- fix for getConstantBoundOnDimSize: floordiv -> ceildiv for extent
- make getConstantBoundOnDimSize also return the identifier upper bound
- fix unionBoundingBox to correctly use the divisor and upper bound identified by
  getConstantBoundOnDimSize
- deal with loop step correctly in addAffineForOpDomain (covers most cases now)
- fully compose bound map / operands and simplify/canonicalize before adding
  dim/symbol to FlatAffineConstraints; fixes false positives in -memref-bound-check; add
  test case there
- expose mlir::isTopLevelSymbol from AffineOps

PiperOrigin-RevId: 238050395
2019-03-29 17:15:27 -07:00
Uday Bondhugula 075090f891 Extend loop unrolling and unroll-jamming to non-matching bound operands and
multi-result upper bounds, complete TODOs, fix/improve test cases.

- complete TODOs for loop unroll/unroll-and-jam. Something as simple as
  "for %i = 0 to %N" wasn't being unrolled earlier (unless it had been written
  as "for %i = ()[s0] -> (0)()[%N] to %N"; addressed now.

- update/replace getTripCountExpr with buildTripCountMapAndOperands; makes it
  more powerful as it composes inputs into it

- getCleanupLowerBound and getUnrolledLoopUpperBound actually needed the same
  code; refactor and remove one.

- reorganize test cases, write previous ones better; most of these changes are
  "label replacements".

- fix wrongly labeled test cases in unroll-jam.mlir

PiperOrigin-RevId: 238014653
2019-03-29 17:14:12 -07:00
River Riddle 5e1f1d2cab Update the constantFold/fold API to use LogicalResult instead of bool.
PiperOrigin-RevId: 237719658
2019-03-29 17:10:50 -07:00
River Riddle 0310d49f46 Move the success/failure functions out of LogicalResult and into the mlir namespace.
PiperOrigin-RevId: 237712180
2019-03-29 17:10:21 -07:00
River Riddle 80d3568c0a Rename Status to LogicalResult to avoid conflictions with the Status in xla/tensorflow/etc.
PiperOrigin-RevId: 237537341
2019-03-29 17:08:50 -07:00
Uday Bondhugula ce7e59536c Add a basic model to set tile sizes + some cleanup
- compute tile sizes based on a simple model that looks at memory footprints
  (instead of using the hardcoded default value)
- adjust tile sizes to make them factors of trip counts based on an option
- update loop fusion CL options to allow setting maximal fusion at pass creation
- change an emitError to emitWarning (since it's not a hard error unless the client
  treats it that way, in which case, it can emit one)

$ mlir-opt -debug-only=loop-tile -loop-tile test/Transforms/loop-tiling.mlir

test/Transforms/loop-tiling.mlir:81:3: note: using tile sizes [4 4 5 ]

  for %i = 0 to 256 {

for %i0 = 0 to 256 step 4 {
    for %i1 = 0 to 256 step 4 {
      for %i2 = 0 to 250 step 5 {
        for %i3 = #map4(%i0) to #map11(%i0) {
          for %i4 = #map4(%i1) to #map11(%i1) {
            for %i5 = #map4(%i2) to #map12(%i2) {
              %0 = load %arg0[%i3, %i5] : memref<8x8xvector<64xf32>>
              %1 = load %arg1[%i5, %i4] : memref<8x8xvector<64xf32>>
              %2 = load %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
              %3 = mulf %0, %1 : vector<64xf32>
              %4 = addf %2, %3 : vector<64xf32>
              store %4, %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
            }
          }
        }
      }
    }
  }

PiperOrigin-RevId: 237461836
2019-03-29 17:06:51 -07:00
River Riddle 1e55ae19a0 Convert ambiguous bool returns in /Analysis to use Status instead.
PiperOrigin-RevId: 237390240
2019-03-29 17:06:21 -07:00
River Riddle 10ddae6d88 Use Status instead of bool in DialectConversion.
PiperOrigin-RevId: 237339277
2019-03-29 17:06:06 -07:00
River Riddle ba6fdc8b01 Move UtilResult into the Support directory and rename it to Status. Status provides an unambiguous way to specify success/failure results. These can be generated by 'Status::success()' and Status::failure()'. Status provides no implicit conversion to bool and should be consumed by one of the following utility functions:
* bool succeeded(Status)
  - Return if the status corresponds to a success value.

* bool failed(Status)
  - Return if the status corresponds to a failure value.

PiperOrigin-RevId: 237153884
2019-03-29 17:04:19 -07:00
River Riddle d43f630de8 NFC: Remove 'Result' from the analysis manager api to better reflect the implementation. There is no distinction between analysis computation and result.
PiperOrigin-RevId: 237093101
2019-03-29 17:02:12 -07:00
River Riddle 1d87b62afe Add support for preserving specific analyses in the analysis manager. Passes can now preserve specific analyses via 'markAnalysesPreserved'.
Example:

markAnalysesPreserved<DominanceInfo>();
markAnalysesPreserved<DominanceInfo, PostDominanceInfo>();

PiperOrigin-RevId: 237081454
2019-03-29 17:01:41 -07:00
MLIR Team c1ff9e866e Use FlatAffineConstraints::unionBoundingBox to perform slice bounds union for loop fusion pass (WIP).
Adds utility to convert slice bounds to a FlatAffineConstraints representation.
Adds utility to FlatAffineConstraints to promote loop IV symbol identifiers to dim identifiers.

PiperOrigin-RevId: 236973261
2019-03-29 16:59:21 -07:00
Uday Bondhugula 5836fae8a0 DMA generation CL flag update
- allow mem capacity to be overridden by command-line flag
- change default fast mem space to 2

PiperOrigin-RevId: 236951598
2019-03-29 16:59:05 -07:00