llvm-project

Commit Graph

Author	SHA1	Message	Date
River Riddle	aff4ed7326	[mlir][NFC] Update Operation::getResultTypes to use ArrayRef<Type> instead of iterator_range. Summary: The new internal representation of operation results now allows for accessing the result types to be more efficient. Changing the API to ArrayRef is more efficient and removes the need to explicitly materialize vectors in several places. Differential Revision: https://reviews.llvm.org/D73429	2020-01-27 19:57:48 -08:00
River Riddle	ce674b131b	[mlir] Add support for marking 'unknown' operations as dynamically legal. Summary: This allows for providing a default "catchall" legality check that is not dependent on specific operations or dialects. For example, this can be useful to check legality based on the specific types of operation operands or results. Differential Revision: https://reviews.llvm.org/D73379	2020-01-27 19:50:52 -08:00
Diego Caballero	6fb3d59746	[mlir] Remove 'valuesToRemoveIfDead' from PatternRewriter API Summary: Remove 'valuesToRemoveIfDead' from PatternRewriter API. The removal functionality wasn't implemented and we decided [1] not to implement it in favor of having more powerful DCE approaches. [1] https://github.com/tensorflow/mlir/pull/212 Reviewers: rriddle, bondhugula Reviewed By: rriddle Subscribers: liufengdb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72545	2020-01-27 14:00:34 -08:00
Mehdi Amini	308571074c	Mass update the MLIR license header to mention "Part of the LLVM project" This is an artifact from merging MLIR into LLVM, the file headers are now aligned with the rest of the project.	2020-01-26 03:58:30 +00:00
Ahmed Taei	8d1ed2940d	[mlir] Fix vectorize transform crashing on none-op operand	2020-01-23 09:57:16 -08:00
Kazuaki Ishizaki	fc817b09e2	[mlir] NFC: Fix trivial typos in comments Differential Revision: https://reviews.llvm.org/D73012	2020-01-20 03:17:03 +00:00
aartbik	0361a961c2	[mlir] [VectorOps] Rename Utils.h into VectorUtils.h Summary: First step towards the consolidation of a lot of vector related utilities that are now all over the place (or even duplicated). Reviewers: nicolasvasilache, andydavis1 Reviewed By: nicolasvasilache, andydavis1 Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72955	2020-01-17 13:39:34 -08:00
Benjamin Kramer	df186507e1	Make helper functions static or move them into anonymous namespaces. NFC.	2020-01-14 14:06:37 +01:00
River Riddle	c774840492	[mlir] Update the CallGraph for nested symbol references, and simplify CallableOpInterface Summary: This enables tracking calls that cross symbol table boundaries. It also simplifies some of the implementation details of CallableOpInterface, i.e. there can only be one region within the callable operation. Depends On D72042 Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D72043	2020-01-13 15:51:28 -08:00
River Riddle	cb89c7e3f7	[mlir] Remove unnecessary assert for single region. This was left over debugging.	2020-01-13 13:55:50 -08:00
River Riddle	2bdf33cc4c	[mlir] NFC: Remove Value::operator* and Value::operator-> now that Value is properly value-typed. Summary: These were temporary methods used to simplify the transition. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D72548	2020-01-11 08:54:39 -08:00
River Riddle	0d6ebb4f0d	[mlir] Refactor operation results to use a single use list for all results of the operation. Summary: A new class is added, IRMultiObjectWithUseList, that allows for representing an IR use list that holds multiple sub values(used in this case for OpResults). This class provides all of the same functionality as the base IRObjectWithUseList, but for specific sub-values. This saves a word per operation result and is a necessary step in optimizing the layout of operation results. For now the use list is placed on the operation itself, so zero-result operations grow by a word. When the work for optimizing layout is finished, this can be moved back to being a trailing object based on memory/runtime benchmarking. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D71955	2019-12-30 20:50:07 -08:00
River Riddle	e62a69561f	NFC: Replace ValuePtr with Value and remove it now that Value is value-typed. ValuePtr was a temporary typedef during the transition to a value-typed Value. PiperOrigin-RevId: 286945714	2019-12-23 16:36:53 -08:00
River Riddle	5d5bd2e1da	Change the `notifyRootUpdated` API to be transaction based. This means that in-place, or root, updates need to use explicit calls to `startRootUpdate`, `finalizeRootUpdate`, and `cancelRootUpdate`. The major benefit of this change is that it enables in-place updates in DialectConversion, which simplifies the FuncOp pattern for example. The major downside to this is that the cases that may modify an operation in-place will need an explicit cancel on the failure branches(assuming that they started an update before attempting the transformation). PiperOrigin-RevId: 286933674	2019-12-23 16:26:15 -08:00
Mehdi Amini	56222a0694	Adjust License.txt file to use the LLVM license PiperOrigin-RevId: 286906740	2019-12-23 15:33:37 -08:00
River Riddle	35807bc4c5	NFC: Introduce new ValuePtr/ValueRef typedefs to simplify the transition to Value being value-typed. This is an initial step to refactoring the representation of OpResult as proposed in: https://groups.google.com/a/tensorflow.org/g/mlir/c/XXzzKhqqF_0/m/v6bKb08WCgAJ This change will make it much simpler to incrementally transition all of the existing code to use value-typed semantics. PiperOrigin-RevId: 286844725	2019-12-22 22:00:23 -08:00
Manuel Freiberger	22954a0e40	Add integer bit-shift operations to the standard dialect. Rename the 'shlis' operation in the standard dialect to 'shift_left'. Add tests for this operation (these have been missing so far) and add a lowering to the 'shl' operation in the LLVM dialect. Add also 'shift_right_signed' (lowered to LLVM's 'ashr') and 'shift_right_unsigned' (lowered to 'lshr'). The original plan was to name these operations 'shift.left', 'shift.right.signed' and 'shift.right.unsigned'. This works if the operations are prefixed with 'std.' in MLIR assembly. Unfortunately during import the short form is ambigous with operations from a hypothetical 'shift' dialect. The best solution seems to omit dots in standard operations for now. Closes tensorflow/mlir#226 PiperOrigin-RevId: 286803388	2019-12-22 10:02:13 -08:00
Sean Silva	553f794b6f	Add a couple useful LLVM_DEBUG's to the inliner. This makes it easier to narrow down on ops that are preventing inlining. PiperOrigin-RevId: 286243868	2019-12-18 12:33:30 -08:00
River Riddle	2666b97314	NFC: Cleanup non-conforming usages of namespaces. * Fixes use of anonymous namespace for static methods. * Uses explicit qualifiers(mlir::) instead of wrapping the definition with the namespace. PiperOrigin-RevId: 286222654	2019-12-18 10:46:48 -08:00
Uday Bondhugula	47034c4bc5	Introduce prefetch op: affine -> std -> llvm intrinsic Introduce affine.prefetch: op to prefetch using a multi-dimensional subscript on a memref; similar to affine.load but has no effect on semantics, but only on performance. Provide lowering through std.prefetch, llvm.prefetch and map to llvm's prefetch instrinsic. All attributes reflected through the lowering - locality hint, rw, and instr/data cache. affine.prefetch %0[%i, %j + 5], false, 3, true : memref<400x400xi32> Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#225 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/225 from bondhugula:prefetch 4c3b4e93bc64d9a5719504e6d6e1657818a2ead0 PiperOrigin-RevId: 286212997	2019-12-18 10:00:04 -08:00
River Riddle	4562e389a4	NFC: Remove unnecessary 'llvm::' prefix from uses of llvm symbols declared in `mlir` namespace. Aside from being cleaner, this also makes the codebase more consistent. PiperOrigin-RevId: 286206974	2019-12-18 09:29:20 -08:00
River Riddle	74278dd01e	NFC: Use TypeSwitch to simplify existing code. PiperOrigin-RevId: 286066371	2019-12-17 14:57:41 -08:00
River Riddle	ab610e8a99	Insert signature-converted blocks into a region with a parent operation. This keeps the IR valid and consistent as it is expected that each block should have a valid parent region/operation. Previously, converted blocks were kept floating without a valid parent region. PiperOrigin-RevId: 285821687	2019-12-16 12:09:45 -08:00
River Riddle	b030e4a4ec	Try to fold operations in DialectConversion when trying to legalize. This change allows for DialectConversion to attempt folding as a mechanism to legalize illegal operations. This also expands folding support in OpBuilder::createOrFold to generate new constants when folding, and also enables it to work in the context of a PatternRewriter. PiperOrigin-RevId: 285448440	2019-12-13 16:47:26 -08:00
River Riddle	851a8516d3	Make OpBuilder::insert virtual instead of OpBuilder::createOperation. It is sometimes useful to create operations separately from the builder before insertion as it may be easier to erase them in isolation if necessary. One example use case for this is folding, as we will only want to insert newly generated constant operations on success. This has the added benefit of fixing some silent PatternRewriter failures related to cloning, as the OpBuilder 'clone' methods don't call createOperation. PiperOrigin-RevId: 285086242	2019-12-11 16:26:45 -08:00
Kazuaki Ishizaki	ae05cf27c6	Minor spelling tweaks Closes tensorflow/mlir#304 PiperOrigin-RevId: 284568358	2019-12-09 09:23:48 -08:00
Uday Bondhugula	a63f6e0bf9	Replace spurious SmallVector constructions with ValueRange Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#305 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/305 from bondhugula:value_range 21d1fae73f549e3c8e72b60876eff1b864cea39c PiperOrigin-RevId: 284541027	2019-12-09 06:26:33 -08:00
River Riddle	d6ee6a0310	Update the builder API to take ValueRange instead of ArrayRef<Value *> This allows for users to provide operand_range and result_range in builder.create<> calls, instead of requiring an explicit copy into a separate data structure like SmallVector/std::vector. PiperOrigin-RevId: 284360710	2019-12-07 10:35:41 -08:00
River Riddle	9d1a0c72b4	Add a new ValueRange class. This class represents a generic abstraction over the different ways to represent a range of Values: ArrayRef<Value >, operand_range, result_range. This class will allow for removing the many instances of explicit SmallVector<Value , N> construction. It has the same memory cost as ArrayRef, and only suffers cost from indexing(if+elsing the different underlying representations). This change only updates a few of the existing usages, with more to be changed in followups; e.g. 'build' API. PiperOrigin-RevId: 284307996	2019-12-06 20:07:23 -08:00
Alexandre E. Eichenberger	3c69ca1e69	fix examples in comments Closes tensorflow/mlir#301 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/301 from AlexandreEichenberger:vect-doc-update 7e5418a9101a4bdad2357882fe660b02bba8bd01 PiperOrigin-RevId: 284202462	2019-12-06 09:40:50 -08:00
Kazuaki Ishizaki	84a6182ddd	minor spelling tweaks Closes tensorflow/mlir#290 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/290 from kiszk:spelling_tweaks_201912 9d9afd16a723dd65754a04698b3976f150a6054a PiperOrigin-RevId: 284169681	2019-12-06 05:59:30 -08:00
River Riddle	33a64540ad	Add support for instance specific pass statistics. Statistics are a way to keep track of what the compiler is doing and how effective various optimizations are. It is useful to see what optimizations are contributing to making a particular program run faster. Pass-instance specific statistics take this even further as you can see the effect of placing a particular pass at specific places within the pass pipeline, e.g. they could help answer questions like "what happens if I run CSE again here". Statistics can be added to a pass by simply adding members of type 'Pass::Statistics'. This class takes as a constructor arguments: the parent pass pointer, a name, and a description. Statistics can be dumped by the pass manager in a similar manner to how pass timing information is dumped, i.e. via PassManager::enableStatistics programmatically; or -pass-statistics and -pass-statistics-display via the command line pass manager options. Below is an example: struct MyPass : public OperationPass<MyPass> { Statistic testStat{this, "testStat", "A test statistic"}; void runOnOperation() { ... ++testStat; ... } }; $ mlir-opt -pass-pipeline='func(my-pass,my-pass)' foo.mlir -pass-statistics Pipeline Display: ===-------------------------------------------------------------------------=== ... Pass statistics report ... ===-------------------------------------------------------------------------=== 'func' Pipeline MyPass (S) 15 testStat - A test statistic MyPass (S) 6 testStat - A test statistic List Display: ===-------------------------------------------------------------------------=== ... Pass statistics report ... ===-------------------------------------------------------------------------=== MyPass (S) 21 testStat - A test statistic PiperOrigin-RevId: 284022014	2019-12-05 11:53:28 -08:00
River Riddle	6f895bec7d	[CSE] NFC: Hash the attribute dictionary pointer instead of the list of attributes. PiperOrigin-RevId: 283810829	2019-12-04 12:32:08 -08:00
Nicolas Vasilache	edfaf925cf	Drop MaterializeVectorTransfers in favor of simpler declarative unrolling Now that we have unrolling as a declarative pattern, we can drop a full pass that has gone stale. In the future we may want to add specific unrolling patterns for VectorTransferReadOp. PiperOrigin-RevId: 283806880	2019-12-04 12:11:42 -08:00
Alex Zinenko	75175134d4	Loop coalescing: fix pointer chainsing in use-chain traversal In the replaceAllUsesExcept utility function called from loop coalescing the iteration over the use-chain is incorrect. The use list nodes (IROperands) have next/prev links, and bluntly resetting the use would make the loop to continue on uses of the value that was replaced instead of the original one. As a result, it could miss the existing uses and update the wrong ones. Make sure we increment the iterator before updating the use in the loop body. Reported-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#291. PiperOrigin-RevId: 283754195	2019-12-04 07:42:29 -08:00
Nicolas Vasilache	5c0c51a997	Refactor dependencies to expose Vector transformations as patterns - NFC This CL refactors some of the MLIR vector dependencies to allow decoupling VectorOps, vector analysis, vector transformations and vector conversions from each other. This makes the system more modular and allows extracting VectorToVector into VectorTransforms that do not depend on vector conversions. This refactoring exhibited a bunch of cyclic library dependencies that have been cleaned up. PiperOrigin-RevId: 283660308	2019-12-03 17:52:10 -08:00
Diego Caballero	330d1ff00e	AffineLoopFusion: Prevent fusion of multi-out-edge producer loops tensorflow/mlir#162 introduced a bug that incorrectly allowed fusion of producer loops with multiple outgoing edges. This commit fixes that problem. It also introduces a new flag to disable sibling loop fusion so that we can test producer-consumer fusion in isolation. Closes tensorflow/mlir#259 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/259 from dcaballe:dcaballe/fix_multi_out_edge_producer_fusion 578d5661705fd5c56c555832d5e0528df88c5282 PiperOrigin-RevId: 283531105	2019-12-03 06:09:50 -08:00
Mahesh Ravishankar	bd485afda0	Introduce attributes that specify the final ABI for a spirv::ModuleOp. To simplify the lowering into SPIR-V, while still respecting the ABI requirements of SPIR-V/Vulkan, split the process into two 1) While lowering a function to SPIR-V (when the function is an entry point function), allow specifying attributes on arguments and function itself that describe the ABI of the function. 2) Add a pass that materializes the ABI described in the function. Two attributes are needed. 1) Attribute on arguments of the entry point function that describe the descriptor_set, binding, storage class, etc, of the spv.globalVariable this argument will be replaced by 2) Attribute on function that specifies workgroup size, etc. (for now only workgroup size). Add the pass -spirv-lower-abi-attrs to materialize the ABI described by the attributes. This change makes the SPIRVBasicTypeConverter class unnecessary and is removed, further simplifying the SPIR-V lowering path. PiperOrigin-RevId: 282387587	2019-11-25 11:19:56 -08:00
Jean-Michel Gorius	104777d8e6	Unify vector op names with other dialects. Change vector op names from VectorFooOp to Vector_FooOp and from vector::VectorFooOp to vector::FooOp. Closes tensorflow/mlir#257 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/257 from Kayjukh:master dfc3a0e04114885aaec8740d5951d6984d6e1577 PiperOrigin-RevId: 281967461	2019-11-22 08:24:49 -08:00
River Riddle	4ea92a0586	NFC: Use Region::getBlocks to fix build failure with drop_begin. PiperOrigin-RevId: 281656603	2019-11-20 19:30:46 -08:00
River Riddle	fafb708b9a	Merge DCE and unreachable block elimination into a new utility 'simplifyRegions'. This moves the different canonicalizations of regions into one place and invokes them in the fixed-point iteration of the canonicalizer. PiperOrigin-RevId: 281617072	2019-11-20 15:53:19 -08:00
Sean Silva	e4f83c6c26	Add multi-level DCE pass. This is a simple multi-level DCE pass that operates pretty generically on the IR. Its key feature compared to the existing peephole dead op folding that happens during canonicalization is being able to delete recursively dead cycles of the use-def graph, including block arguments. PiperOrigin-RevId: 281568202	2019-11-20 12:55:10 -08:00
Nicolas Vasilache	fa14d4f6ab	Implement unrolling of vector ops to finer-grained vector ops as a pattern. This CL uses the pattern rewrite infrastructure to implement a simple VectorOps -> VectorOps legalization strategy to unroll coarse-grained vector operations into finer grained ones. The transformation is written using local pattern rewrites to allow composition with other rewrites. It proceeds by iteratively introducing fake cast ops and cleaning canonicalizing or lowering them away where appropriate. This is an example of writing transformations as compositions of local pattern rewrites that should enable us to make them significantly more declarative. PiperOrigin-RevId: 281555100	2019-11-20 11:49:36 -08:00
Diego Caballero	dd5a7cb488	Add getRemappedValue to ConversionPatternRewriter This method is needed for N->1 conversion patterns to retrieve remapped Values used in the original N operations. Closes tensorflow/mlir#237 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/237 from dcaballe:dcaballe/getRemappedValue 1f64fadcf2b203f7b336ff0c5838b116ae3625db PiperOrigin-RevId: 281321881	2019-11-19 11:09:39 -08:00
Jing Pu	563b5910a8	Also elide large array attribute in OpGraph Dump PiperOrigin-RevId: 281114034	2019-11-18 11:27:43 -08:00
Andy Davis	68a8da4a93	Fix Affine Loop Fusion test case reported on github. This CL utilizies the more robust fusion feasibility analysis being built out in LoopFusionUtils, which will eventually be used to replace the current affine loop fusion pass. PiperOrigin-RevId: 281112340	2019-11-18 11:20:37 -08:00
Lei Zhang	a0986bf43d	NFC: Convert CmpIPredicate in StandardOps to use EnumAttr This turns several hand-written functions to auto-generated ones. PiperOrigin-RevId: 280684326	2019-11-15 10:17:31 -08:00
Nicolas Vasilache	0b271b7dfe	Refactor the LowerVectorTransfers pass to use the RewritePattern infra - NFC This is step 1/n in refactoring infrastructure along the Vector dialect to make it ready for retargetability and composable progressive lowering. PiperOrigin-RevId: 280529784	2019-11-14 15:40:07 -08:00
Alex Zinenko	971b8dd4d8	Move Affine to Standard conversion to lib/Conversion This is essentially a dialect conversion and conceptually belongs to conversions. PiperOrigin-RevId: 280460034	2019-11-14 10:35:21 -08:00
Nicolas Vasilache	f2b6ae9991	Move VectorOps to Tablegen - (almost) NFC This CL moves VectorOps to Tablegen and cleans up the implementation. This is almost NFC but 2 changes occur: 1. an interface change occurs in the padding value specification in vector_transfer_read: the value becomes non-optional. As a shortcut we currently use %f0 for all paddings. This should become an OpInterface for vectorization in the future. 2. the return type of vector.type_cast is trivial and simplified to `memref<vector<...>>` Relevant roundtrip and invalid tests that used to sit in core are moved to the vector dialect. The op documentation is moved to the .td file. PiperOrigin-RevId: 280430869	2019-11-14 08:15:23 -08:00
River Riddle	d985c74883	NFC: Refactor block signature conversion to not erase the original arguments. This refactors the implementation of block signature(type) conversion to not insert fake cast operations to perform the type conversion, but to instead create a new block containing the proper signature. This has the benefit of enabling the use of pre-computed analyses that rely on mapping values. It also leads to a much cleaner implementation overall. The major user facing change is that applySignatureConversion will now replace the entry block of the region, meaning that blocks generally shouldn't be cached over calls to applySignatureConversion. PiperOrigin-RevId: 280226936	2019-11-13 10:27:53 -08:00
Jacques Pienaar	bcfb3d4cd6	Explicitly initialize isRecursivelyLegal This also previously triggered the warning: warning: missing field 'isRecursivelyLegal' initializer [-Wmissing-field-initializers] legalOperations[op] = {action}; ^ PiperOrigin-RevId: 279399175	2019-11-08 15:06:34 -08:00
Sean Silva	f6188b5b07	Replace some remnant uses of "inst" with "op". PiperOrigin-RevId: 278961676	2019-11-06 16:09:23 -08:00
River Riddle	2366561a39	Add a PatternRewriter hook to merge blocks, and use it to support for folding branches. A pattern rewriter hook, mergeBlock, is added that allows for merging the operations of one block into the end of another. This is used to support a canonicalization pattern for branch operations that folds the branch when the successor has a single predecessor(the branch block). Example: ^bb0: %c0_i32 = constant 0 : i32 br ^bb1(%c0_i32 : i32) ^bb1(%x : i32): return %x : i32 becomes: ^bb0: %c0_i32 = constant 0 : i32 return %c0_i32 : i32 PiperOrigin-RevId: 278677825	2019-11-05 11:57:38 -08:00
Mahesh Ravishankar	9cbbd8f4df	Support lowering of imperfectly nested loops into GPU dialect. The current lowering of loops to GPU only supports lowering of loop nests where the loops mapped to workgroups and workitems are perfectly nested. Here a new lowering is added to handle lowering of imperfectly nested loop body with the following properties 1) The loops partitioned to workgroups are perfectly nested. 2) The loop body of the inner most loop partitioned to workgroups can contain one or more loop nests that are to be partitioned across workitems. Each individual loops nests partitioned to workitems should also be perfectly nested. 3) The number of workgroups and workitems are not deduced from the loop bounds but are passed in by the caller of the lowering as values. 4) For statements within the perfectly nested loop nest partitioned across workgroups that are not loops, it is valid to have all threads execute that statement. This is NOT verified. PiperOrigin-RevId: 277958868	2019-11-01 10:52:06 -07:00
Jing Pu	736ad2061c	Dump op location in createPrintOpGraphPass for easier debugging. PiperOrigin-RevId: 277546527	2019-10-30 11:22:22 -07:00
River Riddle	a32f0dcb5d	Add support to GreedyPatternRewriter for erasing unreachable blocks. Rewrite patterns may make modifications to the CFG, including dropping edges between blocks. This change adds a simple unreachable block elimination run at the end of each iteration to ensure that the CFG remains valid. PiperOrigin-RevId: 277545805	2019-10-30 11:19:24 -07:00
Diego Caballero	c87c7f5732	Bugfix: Keep worklistMap in sync with worklist in GreedyPatternRewriter When we removed a pattern, we removed it from worklist but not from worklistMap. Then, when we tried to add a new pattern on the same Operation again, the pattern wasn't added since it already existed in the worklistMap (but not in the worklist). Closes tensorflow/mlir#211 PiperOrigin-RevId: 277319669	2019-10-29 10:58:31 -07:00
River Riddle	2f4d0c085a	Add support for marking an operation as recursively legal. In some cases, it may be desirable to mark entire regions of operations as legal. This provides an additional granularity of context to the concept of "legal". The `ConversionTarget` supports marking operations, that were previously added as `Legal` or `Dynamic`, as `recursively` legal. Recursive legality means that if an operation instance is legal, either statically or dynamically, all of the operations nested within are also considered legal. An operation can be marked via `markOpRecursivelyLegal<>`: ```c++ ConversionTarget &target = ...; /// The operation must first be marked as `Legal` or `Dynamic`. target.addLegalOp<MyOp>(...); target.addDynamicallyLegalOp<MySecondOp>(...); /// Mark the operation as always recursively legal. target.markOpRecursivelyLegal<MyOp>(); /// Mark optionally with a callback to allow selective marking. target.markOpRecursivelyLegal<MyOp, MySecondOp>([](Operation *op) { ... }); /// Mark optionally with a callback to allow selective marking. target.markOpRecursivelyLegal<MyOp>([](MyOp op) { ... }); ``` PiperOrigin-RevId: 277086382	2019-10-28 10:04:34 -07:00
River Riddle	2b61b7979e	Convert the Canonicalize and CSE passes to generic Operation Passes. This allows for them to be used on other non-function, or even other function-like, operations. The algorithms are already generic, so this is simply changing the derived pass type. The majority of this change is just ensuring that the nesting of these passes remains the same, as the pass manager won't auto-nest them anymore. PiperOrigin-RevId: 276573038	2019-10-24 15:01:09 -07:00
Alex Zinenko	edffbbcdae	Fix "set-but-unused" warning in DialectConversion The variable in question is only used in an assertion, leading to a warning in opt builds. PiperOrigin-RevId: 276352259	2019-10-23 14:32:13 -07:00
Kazuaki Ishizaki	8bfedb3ca5	Fix minor spelling tweaks (NFC) Closes tensorflow/mlir#177 PiperOrigin-RevId: 275692653	2019-10-20 00:11:34 -07:00
Nicolas Vasilache	9e7e297da3	Lower vector transfer ops to loop.for operations. This allows mixing linalg operations with vector transfer operations (with additional modifications to affine ops) and is a step towards solving tensorflow/mlir#189. PiperOrigin-RevId: 275543361	2019-10-18 14:10:10 -07:00
River Riddle	2acc220f17	NFC: Remove trivial builder get methods. These don't add any value, and some are even more restrictive than the respective static 'get' method. PiperOrigin-RevId: 275391240	2019-10-17 20:08:34 -07:00
Geoffrey Martin-Noble	6090643877	Introduce a wrapper around ConversionPattern that operates on the derived class Analogous to OpRewritePattern, this makes writing conversion patterns more convenient. PiperOrigin-RevId: 275349854	2019-10-17 15:30:38 -07:00
Nicolas Vasilache	10039d04e2	Rename LoopNestBuilder to AffineLoopNestBuilder - NFC PiperOrigin-RevId: 275310747	2019-10-17 12:13:59 -07:00
Sana Damani	3940b90d84	Update Chapter 4 of the Toy tutorial This Chapter now introduces and makes use of the Interface concept in MLIR to demonstrate ShapeInference. END_PUBLIC Closes tensorflow/mlir#191 PiperOrigin-RevId: 275085151	2019-10-16 12:19:39 -07:00
Mahesh Ravishankar	e7b49eef1d	Allow for remapping argument to a Value in SignatureConversion. The current SignatureConversion framework (part of DialectConversion) allows remapping input arguments to a function from 1->0, 1->1 or 1->many arguments during conversion. Another case is where the argument itself is dropped, but it's use are remapped to another Value*. An example of this is: The Vulkan/SPIR-V spec requires entry functions to be of type void(void). The GPU -> SPIR-V conversion implemented this without having the DialectConversion framework track the remapping that lead to some undefined behavior. The changes here addresses that. PiperOrigin-RevId: 275059656	2019-10-16 10:21:03 -07:00
River Riddle	dfe09cc621	Add support for PatternRewriter::eraseOp. This hook is useful when an operation is known to be dead, and no replacement values make sense. PiperOrigin-RevId: 275052756	2019-10-16 09:50:57 -07:00
Mehdi Amini	f1f9e3b8d1	Fix CMake configuration after introduction of LICM and LoopLikeInterface `b843cc5d5a` introduced a new op LICM transformation and a LoopLike interface, but missed the CMake aspects of it. This should fix the build. PiperOrigin-RevId: 275038533	2019-10-16 08:37:39 -07:00
Stephan Herhut	b843cc5d5a	Implement simple loop-invariant-code-motion based on dialect interfaces. PiperOrigin-RevId: 275004258	2019-10-16 04:28:38 -07:00
River Riddle	96de7091bc	Allowing replacing non-root operations in DialectConversion. When dealing with regions, or other patterns that need to generate temporary operations, it is useful to be able to replace other operations than the root op being matched. Before this PR, these operations would still be considered for legalization meaning that the conversion would either fail, erroneously need to mark these ops as legal, or add unnecessary patterns. PiperOrigin-RevId: 274598513	2019-10-14 10:01:59 -07:00
River Riddle	6b1cc3c6ea	Add support for canonicalizing callable regions during inlining. This will allow for inlining newly devirtualized calls, as well as give a more accurate cost model(when we have one). Currently canonicalization will only run for nodes that have no child edges, as the child nodes may be erased during canonicalization. We can support this in the future, but it requires more intricate deletion tracking. PiperOrigin-RevId: 274011386	2019-10-10 17:06:33 -07:00
River Riddle	438dc176b1	Remove the need to convert operations in regions of operations that have been replaced. When an operation with regions gets replaced, we currently require that all of the remaining nested operations are still converted even though they are going to be replaced when the rewrite is finished. This cl adds a tracking for a minimal set of operations that are known to be "dead". This allows for ignoring the legalization of operations that are won't survive after conversion. PiperOrigin-RevId: 274009003	2019-10-10 17:06:25 -07:00
Christian Sigg	35bb732032	Guard rewriter insertion point during signature conversion. Avoid unexpected side effect in rewriter insertion point. PiperOrigin-RevId: 273785794	2019-10-09 11:33:28 -07:00
Diego Caballero	3451055614	Add support for some multi-store cases in affine fusion This PR is a stepping stone towards supporting generic multi-store source loop nests in affine loop fusion. It extends the algorithm to support fusion of multi-store loop nests that: 1. have only one store that writes to a function-local live out, and 2. the remaining stores are involved in loop nest self dependences or no dependences within the function. Closes tensorflow/mlir#162 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/162 from dcaballe:dcaballe/multi-output-fusion 7fb7dec6fe8b45f5ce176f018bfe37b256420c45 PiperOrigin-RevId: 273773907	2019-10-09 10:37:30 -07:00
River Riddle	49b29dd186	Add a PatternRewriter hook for cloning a region into another. This is similar to the `inlineRegionBefore` hook, except the original blocks are unchanged. The region to be cloned must not have been modified during the conversion process at the point of cloning, i.e. it must belong an operation that has yet to be converted, or the operation that is currently being converted. PiperOrigin-RevId: 273622533	2019-10-08 15:45:08 -07:00
Uday Bondhugula	6136f33d59	unroll and jam: fix order of jammed bodies - bodies would earlier appear in the order (i, i+3, i+2, i+1) instead of (i, i+1, i+2, i+3) for example for factor 4. - clean up hardcoded test cases Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#170 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/170 from bondhugula:ujam b66b405b2b1894a03b376952e32a9d0292042665 PiperOrigin-RevId: 273613131	2019-10-08 15:13:11 -07:00
Jing Pu	17606a108b	Print result types when dumping graphviz. PiperOrigin-RevId: 273406833	2019-10-07 16:45:53 -07:00
Uday Bondhugula	89e7a76a1c	fix simplify-affine-structures bug Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#157 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/157 from bondhugula:quickfix bd1fcd79825fc0bd5b4a3e688153fa0993ab703d PiperOrigin-RevId: 273316498	2019-10-07 10:04:50 -07:00
Christian Sigg	85dcaf19c7	Fix typos, NFC. PiperOrigin-RevId: 272851237	2019-10-04 04:37:53 -07:00
River Riddle	5830f71a45	Add support for inlining calls with different arg/result types from the callable. Some dialects have implicit conversions inherent in their modeling, meaning that a call may have a different type that the type that the callable expects. To support this, a hook is added to the dialect interface that allows for materializing conversion operations during inlining when there is a mismatch. A hook is also added to the callable interface to allow for introspecting the expected result types. PiperOrigin-RevId: 272814379	2019-10-03 23:10:51 -07:00
River Riddle	a20d96e436	Update the Inliner pass to work on SCCs of the CallGraph. This allows for the inliner to work on arbitrary call operations. The updated inliner will also work bottom-up through the callgraph enabling support for multiple levels of inlining. PiperOrigin-RevId: 272813876	2019-10-03 23:05:21 -07:00
Jacques Pienaar	2b86e27dbd	Show type even if elementsattr is elided in graph The type is quite useful for debugging and shouldn't be too large. PiperOrigin-RevId: 272390311	2019-10-02 01:46:12 -07:00
Jacques Pienaar	c57f202c8c	Switch explicit create methods to match generated build's order The generated build methods have result type before the arguments (operands and attributes, which are also now adjacent in the explicit create method). This also results in changing the create method's ordering to match most build method's ordering. PiperOrigin-RevId: 271755054	2019-09-28 09:35:58 -07:00
Uday Bondhugula	74eabdd14e	NFC - clean up op accessor usage, std.load/store op verify, other stale info - also remove stale terminology/references in docs Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#148 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/148 from bondhugula:cleanup e846b641a3c2936e874138aff480a23cdbf66591 PiperOrigin-RevId: 271618279	2019-09-27 11:58:24 -07:00
Nicolas Vasilache	ddf737c5da	Promote MemRefDescriptor to a pointer to struct when passing function boundaries in LLVMLowering. The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio). Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues. Solving the ABI problem generally is a very complex problem and most likely involves taking a dependence on clang that we do not want atm. A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implement. PiperOrigin-RevId: 271591708	2019-09-27 09:57:36 -07:00
Jing Pu	47a7021cc3	Change the return type of createPrintCFGGraphPass to match other passes. PiperOrigin-RevId: 271252404	2019-09-25 18:33:47 -07:00
Mehdi Amini	5583252173	Add convenience methods to set an OpBuilder insertion point after an Operation (NFC) PiperOrigin-RevId: 270727180	2019-09-23 11:54:55 -07:00
Christian Sigg	c900d4994e	Fix a number of Clang-Tidy warnings. PiperOrigin-RevId: 270632324	2019-09-23 02:34:27 -07:00
Uday Bondhugula	f559c38c28	Upgrade/fix/simplify store to load forwarding - fix store to load forwarding for a certain set of cases (where forwarding shouldn't have happened); use AffineValueMap difference based MemRefAccess equality checking; utility logic is also greatly simplified - add missing equality/inequality operators for AffineExpr ==/!= ints - add == != operators on MemRefAccess Closes tensorflow/mlir#136 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/136 from bondhugula:store-load-forwarding d79fd1add8bcfbd9fa71d841a6a9905340dcd792 PiperOrigin-RevId: 270457011	2019-09-21 10:08:56 -07:00
River Riddle	91125d33ed	Avoid iterator invalidation when recursively computing pattern depth. computeDepth calls itself recursively, which may insert into minPatternDepth. minPatternDepth is a DenseMap, which invalidates iterators on insertion, so this may lead to asan failures. PiperOrigin-RevId: 270374203	2019-09-20 16:30:29 -07:00
Uday Bondhugula	727a50ae2d	Support symbolic operands for memref replacement; fix memrefNormalize - allow symbols in index remapping provided for memref replacement - fix memref normalize crash on cases with layout maps with symbols Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Reported by: Alex Zinenko Closes tensorflow/mlir#139 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/139 from bondhugula:memref-rep-symbols 2f48c1fdb5d4c58915bbddbd9f07b18541819233 PiperOrigin-RevId: 269851182	2019-09-18 11:26:11 -07:00
MLIR Team	1c73be76d8	Unify error messages to start with lower-case. PiperOrigin-RevId: 269803466	2019-09-18 07:45:17 -07:00
Uday Bondhugula	bd7de6d4df	Add rewrite pattern to compose maps into affine load/stores - add canonicalization pattern to compose maps into affine loads/stores; templatize the pattern and reuse it for affine.apply as well - rename getIndices -> getMapOperands() (getIndices is confusing since these are no longer the indices themselves but operands to the map whose results are the indices). This also makes the accessor uniform across affine.apply/load/store. Change arg names on the affine load/store builder to avoid confusion. Drop an unused confusing build method on AffineStoreOp. - update incomplete doc comment for canonicalizeMapAndOperands (this was missed from a previous update). Addresses issue tensorflow/mlir#121 Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#122 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/122 from bondhugula:compose-load-store e71de1771e56a85c4282c10cb43f30cef0701c4f PiperOrigin-RevId: 269619540	2019-09-17 11:49:45 -07:00
River Riddle	9619ba10d4	Add support for multi-level value mapping to DialectConversion. When performing A->B->C conversion, an operation may still refer to an operand of A. This makes it necessary to unmap through multiple levels of replacement for a specific value. PiperOrigin-RevId: 269367859	2019-09-16 10:38:19 -07:00
Uday Bondhugula	4f32ae61b4	NFC - Move explicit copy/dma generation utility out of pass and into LoopUtils - turn copy/dma generation method into a utility in LoopUtils, allowing it to be reused elsewhere. - no functional/logic change to the pass/utility - trim down header includes in files affected Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#124 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/124 from bondhugula:datacopy 9f346e62e5bd9dd1986720a30a35f302eb4d3252 PiperOrigin-RevId: 269106088	2019-09-14 13:23:48 -07:00
Uday Bondhugula	1366467a3b	update normalizeMemRef utility; handle missing failure check + add more tests - take care of symbolic operands with alloc - add missing check for compose map failure and a test case - add test cases on strides - drop incorrect check for one-to-one'ness Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#132 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/132 from bondhugula:normalize-memrefs 8aebf285fb0d7c19269d85255aed644657e327b7 PiperOrigin-RevId: 269105947	2019-09-14 13:21:35 -07:00
River Riddle	f1b100c77b	NFC: Finish replacing FunctionPassBase/ModulePassBase with OpPassBase. These directives were temporary during the generalization of FunctionPass/ModulePass to OpPass. PiperOrigin-RevId: 268970259	2019-09-13 13:34:27 -07:00
Smit Hinsu	1854c64c7c	Log name of the generated illegal operation name in DialectConversion debug mode PiperOrigin-RevId: 268859399	2019-09-13 01:37:38 -07:00
Jacques Pienaar	a23f69a37b	Remove redundant qualification Address GCC error: extra qualification not allowed [-fpermissive] PiperOrigin-RevId: 268133737	2019-09-09 19:50:53 -07:00
Jacques Pienaar	2660623a88	Add pass generate per block in a function a GraphViz Dot graph with ops as nodes * Add GraphTraits that treat a block as a graph, Operation* as node and use-relationship for edges; - Just basic graph output; * Add use iterator to iterate over all uses of an Operation; * Add testing pass to generate op graph; This does not support arbitrary operations other than function nor nested regions yet. PiperOrigin-RevId: 268121782	2019-09-09 18:12:41 -07:00
Mehdi Amini	6443583bfd	Refactor getUsedValuesDefinedAbove to expose a variant taking a callback (NFC) This will allow clients to implement a different collection strategy on these values, including collecting each uses within the region for example. PiperOrigin-RevId: 267803978	2019-09-07 17:03:01 -07:00
River Riddle	0ba0087887	Add the initial inlining infrastructure. This defines a set of initial utilities for inlining a region(or a FuncOp), and defines a simple inliner pass for testing purposes. A new dialect interface is defined, DialectInlinerInterface, that allows for dialects to override hooks controlling inlining legality. The interface currently provides the following hooks, but these are just premilinary and should be changed/added to/modified as necessary: * isLegalToInline - Determine if a region can be inlined into one of this dialect, or if an operation of this dialect can be inlined into a given region. * shouldAnalyzeRecursively - Determine if an operation with regions should be analyzed recursively for legality. This allows for child operations to be closed off from the legality checks for operations like lambdas. * handleTerminator - Process a terminator that has been inlined. This cl adds support for inlining StandardOps, but other dialects will be added in followups as necessary. PiperOrigin-RevId: 267426759	2019-09-05 12:24:13 -07:00
Uday Bondhugula	8c9dc690eb	pipeline-data-transfer: remove dead tag alloc's and improve test coverage for replaceMemRefUsesWith / pipeline-data-transfer - address remaining comments from PR tensorflow/mlir#87 for better test coverage for pipeline-data-transfer/replaceAllMemRefUsesWith - remove dead tag allocs the same way they are removed for the replaced buffers Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#106 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/106 from bondhugula:followup 9e868666d047e8d43e5f82f43e4093b838c710fa PiperOrigin-RevId: 267144774	2019-09-04 06:59:09 -07:00
Uday Bondhugula	54d674f51e	Utility to normalize memrefs with non-identity layout maps - introduce utility to convert memrefs with non-identity layout maps to ones with identity layout maps: convert the type and rewrite/remap all its uses - add this utility to -simplify-affine-structures pass for testing purposes Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#104 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/104 from bondhugula:memref-normalize f2c914aa1890e8860326c9e33f9aa160b3d65e6d PiperOrigin-RevId: 266985317	2019-09-03 12:14:28 -07:00
Uday Bondhugula	b1ef9dc22c	Fix affine data copy generation corner cases/bugs - the [begin, end) range identified for copying could end in between the block, which makes hoisting invalid in some cases. Change the range identification to always end with end of block. - add test case to exercise these (with fast mem capacity set to minimal so that single element memref buffers are generated at the innermost loop) - the location of begin/end of the block range for data copying was being confused with the insert points for copy in and copy out code. In cases, where we choose to hoist transfers, these are separate. - when copy loops are single iteration ones, promote their bodies at the end of the pass. - change default fast mem space to 1 (setting it to zero made it generate DMA op's that won't verify in the default case - since the DMA ops have a check for src/dest memref spaces being different). Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Co-Authored-By: Mehdi Amini <joker.eph@gmail.com> Closes tensorflow/mlir#88 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/88 from bondhugula:datacopy 88697267c45e850c3ced87671e16e4a930c02a42 PiperOrigin-RevId: 266980911	2019-09-03 11:53:16 -07:00
River Riddle	6563b1c446	Add a new dialect interface for the OperationFolder `OpFolderDialectInterface`. This interface will allow for providing hooks to interrop with operation folding. The first hook, 'shouldMaterializeInto', will allow for controlling which region to insert materialized constants into. The folder will generally materialize constants into the top-level isolated region, this allows for materializing into a lower level ancestor region if it is more profitable/correct. PiperOrigin-RevId: 266702972	2019-09-01 20:07:08 -07:00
Mehdi Amini	ce702fc8da	Add a `getUsedValuesDefinedAbove()` overload that takes an `Operation` pointer (NFC) This is a convenient utility around the existing `getUsedValuesDefinedAbove()` that take two regions. PiperOrigin-RevId: 266686854	2019-09-01 16:32:10 -07:00
River Riddle	9c8a8a7d0d	Add a canonicalization to erase empty AffineForOps. AffineForOp themselves are pure and can be removed if there are no internal operations. PiperOrigin-RevId: 266481293	2019-08-30 16:49:32 -07:00
River Riddle	037742cdf2	Add support for early exit walk methods. This is done by providing a walk callback that returns a WalkResult. This result is either `advance` or `interrupt`. `advance` means that the walk should continue, whereas `interrupt` signals that the walk should stop immediately. An example is shown below: auto result = op->walk([](Operation *op) { if (some_invariant) return WalkResult::interrupt(); return WalkResult::advance(); }); if (result.wasInterrupted()) ...; PiperOrigin-RevId: 266436700	2019-08-30 12:47:53 -07:00
River Riddle	4bfae66d70	Refactor the 'walk' methods for operations. This change refactors and cleans up the implementation of the operation walk methods. After this refactoring is that the explicit template parameter for the operation type is no longer needed for the explicit op walks. For example: op->walk<AffineForOp>([](AffineForOp op) { ... }); is now accomplished via: op->walk([](AffineForOp op) { ... }); PiperOrigin-RevId: 266209552	2019-08-29 13:04:50 -07:00
Uday Bondhugula	bc2a543225	fix loop unroll and jam - operand mapping - imperfect nest case - fix operand mapping while cloning sub-blocks to jam - was incorrect for imperfect nests where def/use was across sub-blocks - strengthen/generalize the first test case to cover the previously missed scenario - clean up the other cases while on this. Previously, unroll-jamming the following nest ``` affine.for %arg0 = 0 to 2048 { %0 = alloc() : memref<512x10xf32> affine.for %arg1 = 0 to 10 { %1 = affine.load %0[%arg0, %arg1] : memref<512x10xf32> } dealloc %0 : memref<512x10xf32> } ``` would yield ``` %0 = alloc() : memref<512x10xf32> %1 = affine.apply #map0(%arg0) %2 = alloc() : memref<512x10xf32> affine.for %arg1 = 0 to 10 { %4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32> %5 = affine.apply #map0(%arg0) %6 = affine.load %0[%5, %arg1] : memref<512x10xf32> } dealloc %0 : memref<512x10xf32> %3 = affine.apply #map0(%arg0) dealloc %0 : memref<512x10xf32> ``` instead of ``` module { affine.for %arg0 = 0 to 2048 step 2 { %0 = alloc() : memref<512x10xf32> %1 = affine.apply #map0(%arg0) %2 = alloc() : memref<512x10xf32> affine.for %arg1 = 0 to 10 { %4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32> %5 = affine.apply #map0(%arg0) %6 = affine.load %2[%5, %arg1] : memref<512x10xf32> } dealloc %0 : memref<512x10xf32> %3 = affine.apply #map0(%arg0) dealloc %2 : memref<512x10xf32> } ``` Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#98 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/98 from bondhugula:ujam ddbc853f69b5608b3e8ff9b5ac1f6a5a0bb315a4 PiperOrigin-RevId: 266073460	2019-08-28 23:42:50 -07:00
Uday Bondhugula	aa2cee9cf5	Refactor / improve replaceAllMemRefUsesWith Refactor replaceAllMemRefUsesWith to split it into two methods: the new method does the replacement on a single op, and is used by the existing one. - make the methods return LogicalResult instead of bool - Earlier, when replacement failed (due to non-deferencing uses of the memref), the set of ops that had already been processed would have been replaced leaving the IR in an inconsistent state. Now, a pass is made over all ops to first check for non-deferencing uses, and then replacement is performed. No test cases were affected because all clients of this method were first checking for non-deferencing uses before calling this method (for other reasons). This isn't true for a use case in another upcoming PR (scalar replacement); clients can now bail out with consistent IR on failure of replaceAllMemRefUsesWith. Add test case. - multiple deferencing uses of the same memref in a single op is possible (we have no such use cases/scenarios), and this has always remained unsupported. Add an assertion for this. - minor fix to another test pipeline-data-transfer case. Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#87 PiperOrigin-RevId: 265808183	2019-08-27 17:56:56 -07:00
River Riddle	2f59f76876	NFC: Remove the explicit context from Operation::create and OperationState. The context can easily be recovered from the Location in these situations. PiperOrigin-RevId: 265578574	2019-08-26 17:34:48 -07:00
Andy Ly	6a501e3d1b	Support folding of ops with inner ops in GreedyPatternRewriteDriver. This fixes a bug when folding ops with inner ops and inner ops are still being visited. PiperOrigin-RevId: 265475780	2019-08-26 09:44:39 -07:00
River Riddle	32052c8417	NFC: Add a note to 'applyPatternsGreedily' that it also performs folding/dce. Fixes tensorflow/mlir#72 PiperOrigin-RevId: 265097597	2019-08-23 11:28:45 -07:00
River Riddle	ffde975e21	NFC: Move AffineOps dialect to the Dialect sub-directory. PiperOrigin-RevId: 264482571	2019-08-20 15:36:39 -07:00
Nicolas Vasilache	b628194013	Move Linalg and VectorOps dialects to the Dialect subdir - NFC PiperOrigin-RevId: 264277760	2019-08-19 17:11:38 -07:00
River Riddle	ba0fa92524	NFC: Move LLVMIR, SDBM, and StandardOps to the Dialect/ directory. PiperOrigin-RevId: 264193915	2019-08-19 11:01:25 -07:00
Jacques Pienaar	79f53b0cf1	Change from llvm::make_unique to std::make_unique Switch to C++14 standard method as llvm::make_unique has been removed ( https://reviews.llvm.org/D66259). Also mark some targets as c++14 to ease next integrates. PiperOrigin-RevId: 263953918	2019-08-17 11:06:03 -07:00
River Riddle	9c29273ddc	Refactor DialectConversion to convert the signatures of blocks when they are moved. Often we want to ensure that block arguments are converted before operations that use them. This refactors the current implementation to be cleaner/less frequent by triggering conversion when a set of blocks are moved/inlined; or when legalization is successful. PiperOrigin-RevId: 263795005	2019-08-16 10:16:38 -07:00
Mehdi Amini	926fb685de	Express ownership transfer in PassManager API through std::unique_ptr (NFC) Since raw pointers are always passed around for IR construct without implying any ownership transfer, it can be error prone to have implicit ownership transferred the same way. For example this code can seem harmless: Pass *pass = .... pm.addPass(pass); pm.addPass(pass); pm.run(module); PiperOrigin-RevId: 263053082	2019-08-12 19:13:12 -07:00
River Riddle	5290e8c36d	NFC: Update pattern rewrite API to pass OwningRewritePatternList by const reference. The pattern list is not modified by any of these APIs and should thus be passed with const. PiperOrigin-RevId: 262844002	2019-08-11 18:34:14 -07:00
River Riddle	1e42954032	NFC: Standardize the terminology used for parent ops/regions/etc. There are currently several different terms used to refer to a parent IR unit in 'get' methods: getParent/getEnclosing/getContaining. This cl standardizes all of these methods to use 'getParent*'. PiperOrigin-RevId: 262680287	2019-08-09 20:07:52 -07:00
River Riddle	41968fb475	NFC: Update usages of OwningRewritePatternList to pass by & instead of &&. This will allow for reusing the same pattern list, which may be costly to continually reconstruct, on multiple invocations. PiperOrigin-RevId: 262664599	2019-08-09 17:20:29 -07:00
Nicolas Vasilache	39f1b9a053	Add a higher-order vector.extractelement operation in MLIR This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools. This CL adds the vector.extractelement operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL. PiperOrigin-RevId: 262545089	2019-08-09 05:58:47 -07:00
River Riddle	8089f93746	Add utility 'replaceAllUsesWith' methods to Operation. These methods will allow replacing the uses of results with an existing operation, with the same number of results, or a range of values. This removes a number of hand-rolled result replacement loops and simplifies replacement for operations with multiple results. PiperOrigin-RevId: 262206600	2019-08-07 13:48:52 -07:00
Andy Ly	55f2e24ab3	Remove ops in regions/blocks from worklist when parent op is being removed via GreedyPatternRewriteDriver::replaceOp. This fixes a bug where ops inside the parent op are visited even though the parent op has been removed. PiperOrigin-RevId: 261953580	2019-08-06 11:08:54 -07:00
Nicolas Vasilache	24647750d4	Refactor Linalg ops to loop lowering (NFC) This CL modifies the LowerLinalgToLoopsPass to use RewritePattern. This will make it easier to inline Linalg generic functions and regions when emitting to loops in a subsequent CL. PiperOrigin-RevId: 261894120	2019-08-06 05:38:16 -07:00
River Riddle	a0df3ebd15	NFC: Implement OwningRewritePatternList as a class instead of a using directive. This allows for proper forward declaration, as opposed to leaking the internal implementation via a using directive. This also allows for all pattern building to go through 'insert' methods on the OwningRewritePatternList, replacing uses of 'push_back' and 'RewriteListBuilder'. PiperOrigin-RevId: 261816316	2019-08-05 18:38:22 -07:00
Mehdi Amini	0c3923e1dc	Fix clang 5.0 by using type aliases for LLVM DenseSet/Map When inlining the declaration for llvm::DenseSet/DenseMap in the mlir namespace from a forward declaration, clang does not take the default for the template parameters if their are declared later. namespace llvm { template<typename Foo> class DenseMap; } namespace mlir { using llvm::DenseMap; } namespace llvm { template<typename Foo = int> class DenseMap {}; } namespace mlir { DenseMap<> map; } PiperOrigin-RevId: 261495612	2019-08-03 11:35:50 -07:00
Alex Zinenko	58e66d71e7	AffineDataCopyGeneration: don't use CL flag values inside the pass AffineDataCopyGeneration pass relied on command line flags for internal logic in several places, which makes it unusable in a library context (i.e. outside a standalone mlir-opt binary that does the command line parsing). Define configuration flags in the constructor instead, and set them up to command line-based defaults to maintain the original behavior. PiperOrigin-RevId: 261322364	2019-08-02 08:04:30 -07:00
Uday Bondhugula	18b8d4352b	Introduce explicit copying optimization by generalizing the DMA generation pass Explicit copying to contiguous buffers is a standard technique to avoid conflict misses and TLB misses, and improve hardware prefetching performance. When done in conjunction with cache tiling, it nearly eliminates all cache conflict and TLB misses, and a single hardware prefetch stream is needed per data tile. - generalize/extend DMA generation pass (renamed data copying pass) to perform either point-wise explicit copies to fast memory buffers or DMAs (depending on a cmd line option). All logic is the same as erstwhile -dma-generate. - -affine-dma-generate is now renamed -affine-data-copy; when -dma flag is provided, DMAs are generated, or else explicit copy loops are generated (point-wise) by default. - point-wise copying could be used for CPUs (or GPUs); some indicative performance numbers with a "C" version of the MLIR when compiled with and without this optimization (about 2x improvement here). With a matmul on 4096^2 matrices on a single core of an Intel Core i7 Skylake i7-8700K with clang 8.0.0: clang -O3: 518s clang -O3 with MLIR tiling (128x128): 24.5s clang -O3 with MLIR tiling + data copying 12.4s (code equivalent to test/Transforms/data-copy.mlir func @matmul) - fix some misleading comments. - change default fast-mem space to 0 (more intuitive now with the default copy generation using point-wise copies instead of DMAs) On a simple 3-d matmul loop nest, code generated with -affine-data-copy: ``` affine.for %arg3 = 0 to 4096 step 128 { affine.for %arg4 = 0 to 4096 step 128 { %0 = affine.apply #map0(%arg3, %arg4) %1 = affine.apply #map1(%arg3, %arg4) %2 = alloc() : memref<128x128xf32, 2> // Copy-in Out matrix. affine.for %arg5 = 0 to 128 { %5 = affine.apply #map2(%arg3, %arg5) affine.for %arg6 = 0 to 128 { %6 = affine.apply #map2(%arg4, %arg6) %7 = load %arg2[%5, %6] : memref<4096x4096xf32> affine.store %7, %2[%arg5, %arg6] : memref<128x128xf32, 2> } } affine.for %arg5 = 0 to 4096 step 128 { %5 = affine.apply #map0(%arg3, %arg5) %6 = affine.apply #map1(%arg3, %arg5) %7 = alloc() : memref<128x128xf32, 2> // Copy-in LHS. affine.for %arg6 = 0 to 128 { %11 = affine.apply #map2(%arg3, %arg6) affine.for %arg7 = 0 to 128 { %12 = affine.apply #map2(%arg5, %arg7) %13 = load %arg0[%11, %12] : memref<4096x4096xf32> affine.store %13, %7[%arg6, %arg7] : memref<128x128xf32, 2> } } %8 = affine.apply #map0(%arg5, %arg4) %9 = affine.apply #map1(%arg5, %arg4) %10 = alloc() : memref<128x128xf32, 2> // Copy-in RHS. affine.for %arg6 = 0 to 128 { %11 = affine.apply #map2(%arg5, %arg6) affine.for %arg7 = 0 to 128 { %12 = affine.apply #map2(%arg4, %arg7) %13 = load %arg1[%11, %12] : memref<4096x4096xf32> affine.store %13, %10[%arg6, %arg7] : memref<128x128xf32, 2> } } // Compute. affine.for %arg6 = #map7(%arg3) to #map8(%arg3) { affine.for %arg7 = #map7(%arg4) to #map8(%arg4) { affine.for %arg8 = #map7(%arg5) to #map8(%arg5) { %11 = affine.load %7[-%arg3 + %arg6, -%arg5 + %arg8] : memref<128x128xf32, 2> %12 = affine.load %10[-%arg5 + %arg8, -%arg4 + %arg7] : memref<128x128xf32, 2> %13 = affine.load %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2> %14 = mulf %11, %12 : f32 %15 = addf %13, %14 : f32 affine.store %15, %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2> } } } dealloc %10 : memref<128x128xf32, 2> dealloc %7 : memref<128x128xf32, 2> } %3 = affine.apply #map0(%arg3, %arg4) %4 = affine.apply #map1(%arg3, %arg4) // Copy out result matrix. affine.for %arg5 = 0 to 128 { %5 = affine.apply #map2(%arg3, %arg5) affine.for %arg6 = 0 to 128 { %6 = affine.apply #map2(%arg4, %arg6) %7 = affine.load %2[%arg5, %arg6] : memref<128x128xf32, 2> store %7, %arg2[%5, %6] : memref<4096x4096xf32> } } dealloc %2 : memref<128x128xf32, 2> } } ``` With -affine-data-copy -dma: ``` affine.for %arg3 = 0 to 4096 step 128 { %0 = affine.apply #map3(%arg3) %1 = alloc() : memref<128xf32, 2> %2 = alloc() : memref<1xi32> affine.dma_start %arg2[%arg3], %1[%c0], %2[%c0], %c128_0 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32> affine.dma_wait %2[%c0], %c128_0 : memref<1xi32> %3 = alloc() : memref<1xi32> affine.for %arg4 = 0 to 4096 step 128 { %5 = affine.apply #map0(%arg3, %arg4) %6 = affine.apply #map1(%arg3, %arg4) %7 = alloc() : memref<128x128xf32, 2> %8 = alloc() : memref<1xi32> affine.dma_start %arg0[%arg3, %arg4], %7[%c0, %c0], %8[%c0], %c16384, %c4096, %c128_2 : memref<4096x4096xf32>, memref<128x128xf32, 2>, memref<1xi32> affine.dma_wait %8[%c0], %c16384 : memref<1xi32> %9 = affine.apply #map3(%arg4) %10 = alloc() : memref<128xf32, 2> %11 = alloc() : memref<1xi32> affine.dma_start %arg1[%arg4], %10[%c0], %11[%c0], %c128_1 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32> affine.dma_wait %11[%c0], %c128_1 : memref<1xi32> affine.for %arg5 = #map3(%arg3) to #map5(%arg3) { affine.for %arg6 = #map3(%arg4) to #map5(%arg4) { %12 = affine.load %7[-%arg3 + %arg5, -%arg4 + %arg6] : memref<128x128xf32, 2> %13 = affine.load %10[-%arg4 + %arg6] : memref<128xf32, 2> %14 = affine.load %1[-%arg3 + %arg5] : memref<128xf32, 2> %15 = mulf %12, %13 : f32 %16 = addf %14, %15 : f32 affine.store %16, %1[-%arg3 + %arg5] : memref<128xf32, 2> } } dealloc %11 : memref<1xi32> dealloc %10 : memref<128xf32, 2> dealloc %8 : memref<1xi32> dealloc %7 : memref<128x128xf32, 2> } %4 = affine.apply #map3(%arg3) affine.dma_start %1[%c0], %arg2[%arg3], %3[%c0], %c128 : memref<128xf32, 2>, memref<4096xf32>, memref<1xi32> affine.dma_wait %3[%c0], %c128 : memref<1xi32> dealloc %3 : memref<1xi32> dealloc %2 : memref<1xi32> dealloc %1 : memref<128xf32, 2> } ``` Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#50 PiperOrigin-RevId: 261221903	2019-08-01 16:31:58 -07:00
Jacques Pienaar	0fa1ea704c	Initialize union to avoid -Wmissing-field-initializers warning. Reported by clang-6. PiperOrigin-RevId: 260311814	2019-07-27 11:47:26 -07:00
Nicolas Vasilache	fae4d94990	Use "standard" load and stores in LowerVectorTransfers Clipping creates non-affine memory accesses, use std_load and std_store instead of affine_load and affine_store. In the future we may also want a fill with the neutral element rather than clip, this would make the accesses affine if we wanted more analyses and transformations to happen post lowering to pointwise copies. PiperOrigin-RevId: 260110503	2019-07-26 02:34:24 -07:00
River Riddle	1293708473	Add support for an analysis mode to DialectConversion. This mode analyzes which operations are legalizable to the given target if a conversion were to be applied, i.e. no rewrites are ever performed even on success. This mode is useful for device partitioning or other utilities that may want to analyze the effect of conversion to different targets before performing it. The analysis method currently just fills a provided set with the operations that were found to be legalizable. This can be extended in the future to capture more information as necessary. PiperOrigin-RevId: 259987105	2019-07-25 11:31:07 -07:00
Nicolas Vasilache	48a1baeb8a	Refactor LoopParametricTiling as a test pass - NFC This CL moves LoopParametricTiling into test/lib as a pass for purely testing purposes. PiperOrigin-RevId: 259300264	2019-07-22 04:31:17 -07:00
River Riddle	00bdc8e070	Refactor region type signature conversion to be explicit via patterns. This cl enforces that the conversion of the type signatures for regions, and thus their entry blocks, is handled via ConversionPatterns. A new hook 'applySignatureConversion' is added to the ConversionPatternRewriter to perform the desired conversion on a region. This also means that the handling of rewriting the signature of a FuncOp is moved to a pattern. A default implementation is provided via 'mlir::populateFuncOpTypeConversionPattern'. This removes the hacky implicit 'dynamically legal' status of FuncOp that was present previously, and leaves it up to the user to decide when/how to convert the signature of a function. PiperOrigin-RevId: 259161999	2019-07-20 19:06:07 -07:00
Nicolas Vasilache	d2a872922f	Refactor stripmineSink for AffineForOp - NFC More moving less cloning. PiperOrigin-RevId: 258947575	2019-07-19 11:40:37 -07:00
Nicolas Vasilache	db4cd1c8dc	Utility function to map a loop on a parametric grid of virtual processors This CL introduces a simple loop utility function which rewrites the bounds and step of a loop so as to become mappable on a regular grid of processors whose identifiers are given by SSA values. A corresponding unit test is added. For example, using CUDA terminology, and assuming a 2-d grid with processorIds = [blockIdx.x, threadIdx.x] and numProcessors = [gridDim.x, blockDim.x], the loop: ``` loop.for %i = %lb to %ub step %step { ... } ``` is rewritten into a version resembling the following pseudo-IR: ``` loop.for %i = %lb + threadIdx.x + blockIdx.x * blockDim.x to %ub step %gridDim.x * blockDim.x { ... } ``` PiperOrigin-RevId: 258945942	2019-07-19 11:40:31 -07:00
Nicolas Vasilache	5bc344743c	Uniformize the API for the mlir::tile functions on AffineForOp and loop::ForOp This CL adapts the recently introduced parametric tiling to have an API matching the tiling of AffineForOp. The transformation using stripmineSink is more general and produces imperfectly nested loops. Perfect nesting invariants of the tiled version are obtained by selectively applying hoisting of ops to isolate perfectly nested bands. Such hoisting may fail to produce a perfect loop nest in cases where ForOp transitively depend on enclosing induction variables. In such cases, the API provides a LogicalResult return but the SimpleParametricLoopTilingPass does not currently use this result. A new unit test is added with a triangular loop for which the perfect nesting property does not hold. For this example, the old behavior was to produce IR that did not verify (some use was not dominated by its def). PiperOrigin-RevId: 258928309	2019-07-19 11:40:25 -07:00
River Riddle	28057ff3da	Add support for providing a legality callback for dynamic legality in DialectConversion. This allows for providing specific handling for dynamically legal operations/dialects without overriding the general 'isDynamicallyLegal' hook. This also means that a derived ConversionTarget class need not always be defined when some operations are dynamically legal. Example usage: ConversionTarget target(...); target.addDynamicallyLegalOp<ReturnOp>([](ReturnOp op) { return ... }; target.addDynamicallyLegalDialect<StandardOpsDialect>([](Operation *op) { return ... }; PiperOrigin-RevId: 258884753	2019-07-19 11:40:19 -07:00
River Riddle	8b447b6cad	NFC: Expose a ConversionPatternRewriter for use with ConversionPatterns. This specific PatternRewriter will allow for exposing hooks in the future that are only useful for the conversion framework, e.g. type conversions. PiperOrigin-RevId: 258818122	2019-07-19 11:40:00 -07:00
River Riddle	9e3c2650d2	Refactor the conversion of block argument types in DialectConversion. This cl begins a large refactoring over how signature types are converted in the DialectConversion infrastructure. The signatures of blocks are now converted on-demand when an operation held by that block is being converted. This allows for handling the case where a region is created as part of a pattern, something that wasn't possible previously. This cl also generalizes the region signature conversion used by FuncOp to work on any region of any operation. This generalization allows for removing the 'apply*Conversion' functions that were specific to FuncOp/ModuleOp. The implementation currently uses a new hook on TypeConverter, 'convertRegionSignature', but this should ideally be removed in favor of using Patterns. That depends on adding support to the PatternRewriter used by ConversionPattern to allow applying signature conversions to regions, which should be coming in a followup. PiperOrigin-RevId: 258645733	2019-07-19 11:38:45 -07:00
River Riddle	491ef84dc4	Add support for explicitly marking dialects and operations as illegal. This explicit tag is useful is several ways: ) This simplifies how to mark sub sections of a dialect as explicitly unsupported, e.g. my target supports all operations in the foo dialect except for these select few. This is useful for partial lowerings between dialects. ) Partial conversions will now verify that operations that were explicitly marked as illegal must be converted. This provides some guarantee that the operations that need to be lowered by a specific pass will be. PiperOrigin-RevId: 258582879	2019-07-19 11:38:25 -07:00
Nicolas Vasilache	0002e2964d	Move affine.for and affine.if to ODS As the move to ODS is made, body and region names across affine and loop dialects are uniformized. PiperOrigin-RevId: 258416590	2019-07-16 13:45:47 -07:00
River Riddle	2b9855b5b4	Refactor DialectConversion to support different conversion modes. Users generally want several different modes of conversion. This cl refactors DialectConversion to provide two: * Partial (applyPartialConversion) - This mode allows for illegal operations to exist in the IR, and does not fail if an operation fails to be legalized. * Full (applyFullConversion) - This mode fails if any operation is not properly legalized to the conversion target. This allows for ensuring that the IR after a conversion only contains operations legal for the target. PiperOrigin-RevId: 258412243	2019-07-16 13:45:41 -07:00
River Riddle	2087bf6386	Remove lowerAffineConstructs and lowerControlFlow in favor of providing patterns. These methods don't compose well with the rest of conversion framework, and create artificial breaks in conversion. Replace these methods with two(populateAffineToStdConversionPatterns and populateLoopToStdConversionPatterns respectively) that populate a list of patterns to perform the same behavior. PiperOrigin-RevId: 258219277	2019-07-16 13:44:45 -07:00
River Riddle	e7a2ef21f9	Update 'applyPatternsGreedily' to work on the regions of any operations. 'applyPatternsGreedily' is a useful utility outside of just function regions. PiperOrigin-RevId: 258182937	2019-07-16 13:44:39 -07:00
River Riddle	7d1e1e6721	Refactor the traversal of operations to Convert in DialectConversion. This cl changes the way that operations/blocks to convert are collected/traversed so that parent region operations can be legalized before their bodies. Most RewritePatterns for region operations assume that the entry arguments to each region are yet to be converted. Given that the bodies are currently converted first, this makes it difficult to fit these patterns into the same run as one converting types. The operations/blocks to convert are now collected before any legalization has run, which simplifies the conversion logic itself, as legalization may insert new operations, move blocks, etc. PiperOrigin-RevId: 258170158	2019-07-16 13:44:33 -07:00
River Riddle	40715789f8	Refactor LowerAffine to use OpRewritePattern instead of ConversionPattern. ConversionPattern should ideally only be used when the types of the operands are changing, which in this case they aren't. Using OpRewritePattern also lends to much simpler code. PiperOrigin-RevId: 258158474	2019-07-16 13:44:09 -07:00
Alex Zinenko	fc044e8929	Introduce loop coalescing utility and a simple pass Multiple (perfectly) nested loops with independent bounds can be combined into a single loop and than subdivided into blocks of arbitrary size for load balancing or more efficient parallelism exploitation. However, MLIR wants to preserve the multi-dimensional multi-loop structure at higher levels of abstraction. Introduce a transformation that coalesces nested loops with independent bounds so that they can be further subdivided by tiling. PiperOrigin-RevId: 258151016	2019-07-16 13:43:44 -07:00
Nicolas Vasilache	cca53e8527	Extract std.for std.if and std.terminator in their own dialect These ops should not belong to the std dialect. This CL extracts them in their own dialect and updates the corresponding conversions and tests. PiperOrigin-RevId: 258123853	2019-07-16 13:43:18 -07:00
River Riddle	a764c19d17	Fix a bug in DialectConversion when using RewritePattern. When using a RewritePattern and replacing an operation with an existing value, that value may have already been replaced by something else. This cl ensures that only the final value is used when applying rewrites. PiperOrigin-RevId: 258058488	2019-07-16 13:43:12 -07:00
River Riddle	e50a8bd19c	NFC: Add header blocks to DialectConversion.h to improve readability. PiperOrigin-RevId: 257903383	2019-07-13 05:55:50 -07:00
River Riddle	2566a72a21	Update the PatternRewriter constructor to take a context instead of a region. This will allow for cleanly using a rewriter for multiple different regions. PiperOrigin-RevId: 257845371	2019-07-12 17:42:52 -07:00
River Riddle	8e349a48b6	Remove the 'region' field from OpBuilder. This field wasn't updated as the insertion point changed, making it potentially dangerous given the multi-level of MLIR(e.g. 'createBlock' would always insert the new block in 'region'). This also allows for building an OpBuilder with just a context. PiperOrigin-RevId: 257829135	2019-07-12 17:42:41 -07:00
River Riddle	60a2983779	Fix a bug in the canonicalizer when replacing constants via patterns. The GreedyPatternRewriteDriver currently does not notify the OperationFolder when constants are removed as part of a pattern match. This materializes in a nasty bug where a different operation may be allocated to the same address. This causes an assertion in the OperationFolder when it gets notified of the new operations removal. PiperOrigin-RevId: 257817627	2019-07-12 17:42:24 -07:00
Nicolas Vasilache	cab671d166	Lower affine control flow to std control flow to LLVM dialect This CL splits the lowering of affine to LLVM into 2 parts: 1. affine -> std 2. std -> LLVM The conversions mostly consists of splitting concerns between the affine and non-affine worlds from existing conversions. Short-circuiting of affine `if` conditions was never tested or exercised and is removed in the process, it can be reintroduced later if needed. LoopParametricTiling.cpp is updated to reflect the newly added ForOp::build. PiperOrigin-RevId: 257794436	2019-07-12 08:44:28 -07:00
River Riddle	9dbef0bf96	Rename FunctionAttr to SymbolRefAttr. This allows for the attribute to hold symbolic references to other operations than FuncOp. This also allows for removing the dependence on FuncOp from the base Builder. PiperOrigin-RevId: 257650017	2019-07-12 08:43:42 -07:00
Alex Zinenko	054e25c079	EDSC: use affine.load/store instead of std.load/store Standard load and store operations are evolving to be separated from the Affine constructs. Special affine.load/store have been introduced to uphold the restrictions of the Affine control flow constructs on their operands. EDSC-produced loads and stores were originally intended to uphold those restrictions as well so they should use affine.load/store instead of std.load/store. PiperOrigin-RevId: 257443307	2019-07-12 08:42:28 -07:00
River Riddle	fec20e590f	NFC: Rename Module to ModuleOp. Module is a legacy name that only exists as a typedef of ModuleOp. PiperOrigin-RevId: 257427248	2019-07-10 10:11:21 -07:00
River Riddle	8c44367891	NFC: Rename Function to FuncOp. PiperOrigin-RevId: 257293379	2019-07-10 10:10:53 -07:00
Alex Zinenko	7a2e8726e8	Fix a test broken on some systems due to a mis-rebase. PiperOrigin-RevId: 257190161	2019-07-09 07:43:42 -07:00
Alex Zinenko	9d03f5674f	Implement parametric tiling on standard for loops Parametric tiling can be used to extract outer loops with fixed number of iterations. This in turn enables mapping to GPU kernels on a fixed grid independently of the range of the original loops, which may be unknown statically, making the kernel adaptable to different sizes. Provide a utility function that also computes the parametric tile size given the range of the loop. Exercise the utility function through a simple pass that applies it to all top-level loop nests. Permutability or parallelism checks must be performed before calling this utility function in actual passes. Note that parametric tiling cannot be implemented in a purely affine way, although it can be encoded using semi-affine maps. The choice to implement it on standard loops is guided by them being the common representation between Affine loops, Linalg and GPU kernels. PiperOrigin-RevId: 257180251	2019-07-09 06:37:41 -07:00
River Riddle	626b8b6a5d	NFC: Remove `Module::getFunctions` in favor of a general `getOps<T>`. Modules can now contain more than just Functions, this just updates the iteration API to reflect that. The 'begin'/'end' methods have also been updated to iterate over opaque Operations. PiperOrigin-RevId: 257099084	2019-07-08 18:28:17 -07:00
River Riddle	ce502af9cd	NFC: Remove the various "::getFunction" methods. These methods assume that a function is a valid builtin top-level operation, and removing these methods allows for decoupling FuncOp and IR/. Utility "getParentOfType" methods have been added to Operation/OpState to allow for querying the first parent operation of a given type. PiperOrigin-RevId: 257018913	2019-07-08 12:40:08 -07:00
River Riddle	474e354179	NFC: Remove Region::getContainingFunction as Functions are now Operations. PiperOrigin-RevId: 256579717	2019-07-04 13:23:10 -07:00
Andy Davis	2e1187dd25	Globally change load/store/dma_start/dma_wait operations over to affine.load/store/dma_start/dma_wait. In most places, this is just a name change (with the exception of affine.dma_start swapping the operand positions of its tag memref and num_elements operands). Significant code changes occur here: ) Vectorization: LoopAnalysis.cpp, Vectorize.cpp ) Affine Transforms: Transforms/Utils/Utils.cpp PiperOrigin-RevId: 256395088	2019-07-03 14:37:06 -07:00
River Riddle	206e55cc16	NFC: Refactor Module to be value typed. As with Functions, Module will soon become an operation, which are value-typed. This eases the transition from Module to ModuleOp. A new class, OwningModuleRef is provided to allow for owning a reference to a Module, and will auto-delete the held module on destruction. PiperOrigin-RevId: 256196193	2019-07-02 16:43:36 -07:00
River Riddle	705b80918d	Generalize the CFG graph printing for Functions to work on Regions instead. PiperOrigin-RevId: 256029944	2019-07-01 17:02:51 -07:00
River Riddle	694975ddbc	Standardize the definition and usage of getAllArgAttrs between FuncOp and Function. PiperOrigin-RevId: 255988352	2019-07-01 11:39:12 -07:00
River Riddle	54cd6a7e97	NFC: Refactor Function to be value typed. Move the data members out of Function and into a new impl storage class 'FunctionStorage'. This allows for Function to become value typed, which will greatly simplify the transition of Function to FuncOp(given that FuncOp is also value typed). PiperOrigin-RevId: 255983022	2019-07-01 11:39:00 -07:00
Alex Zinenko	5eef726bc8	TypeConversion: do not materialize conversion of the type to itself Type conversion does not necessarily affect all types, some of them may remain untouched. The type conversion tool from the dialect conversion framework will unconditionally insert a temporary cast operation from the type to itself anyway, and will try to materialize it to a real conversion operation if there are remaining uses. Simply use the original value instead. PiperOrigin-RevId: 255975450	2019-07-01 09:56:56 -07:00
Andy Davis	f487d20bf0	Add affine-to-standard lowerings for affine.load/store/dma_start/dma_wait. PiperOrigin-RevId: 255960171	2019-07-01 09:56:22 -07:00
Nicolas Vasilache	e7f51ad08a	Add a folder-based EDSC intrinsics constructor (NFC) PiperOrigin-RevId: 255908660	2019-07-01 09:55:35 -07:00
River Riddle	7c755d06aa	Refactor DialectConversion to use 'materializeConversion' when a type conversion must persist after the conversion has finished. During conversion, if a type conversion has dangling uses a type conversion must persist after conversion has finished to maintain valid IR. In these cases, we now query the TypeConverter to materialize a conversion for us. This allows for the default case of a full conversion to continue working as expected, but also handle the degenerate cases more robustly. PiperOrigin-RevId: 255637171	2019-06-28 11:29:04 -07:00
River Riddle	a4c3a6455c	Move the emitError/Warning/Remark utility methods out of MLIRContext and into the mlir namespace. Now that Locations are attributes, they have direct access to the MLIR context. This allows for simplifying error emission by removing unnecessary context lookups. PiperOrigin-RevId: 255112791	2019-06-25 21:32:23 -07:00
River Riddle	49162524d8	NFC: Uniformize the return of the LocationAttr 'get' methods to 'Location'. PiperOrigin-RevId: 255078768	2019-06-25 16:57:56 -07:00
River Riddle	66ed7d6d83	Update the OperationFolder to find a valid insertion point when materializing constants. The OperationFolder currently just inserts into the entry block of a Function, but regions may be isolated above, i.e. explicit capture only, and blindly inserting constants may break the invariants of these regions. PiperOrigin-RevId: 254987796	2019-06-25 09:43:21 -07:00
River Riddle	c32080a1b0	NFC: Move the ArgConverter methods out-of-line to improve readability. PiperOrigin-RevId: 254872695	2019-06-24 17:47:51 -07:00
Nicolas Vasilache	95cfd99616	Fix OSS build PiperOrigin-RevId: 254847773	2019-06-24 17:47:27 -07:00
Nicolas Vasilache	dac75ae5ff	Split test-specific passes out of mlir-opt Instead put their impl in test/lib and link them into mlir-test-opt PiperOrigin-RevId: 254837439	2019-06-24 17:47:12 -07:00
River Riddle	b67cab4c44	Update CSE to respect nested regions that are isolated from above. This cl also removes the unused 'NthRegionIsIsolatedFromAbove' trait as it was replaced with a more general 'IsIsolatedFromAbove'. PiperOrigin-RevId: 254709704	2019-06-24 13:44:53 -07:00
River Riddle	bcacef1a70	Add a new dialect hook 'materializeConstant' to create a constant operation that materializes an attribute value with the given type. This effectively adds support for dialect specific constant values that have different invariants than std.constant. 'OperationFolder' is updated to use this new hook, or attempt to default to std.constant when legal. PiperOrigin-RevId: 254570153	2019-06-22 13:05:27 -07:00
River Riddle	48d6cf1ced	NFC: Remove the 'context' parameter from OperationState. Now that Locations are Attributes they contain a direct reference to the MLIRContext, i.e. the context can be directly accessed from the given location instead of being explicitly passed in. PiperOrigin-RevId: 254568329	2019-06-22 13:05:10 -07:00
River Riddle	704a7fb13e	Add support for 1->N type mappings in the dialect conversion infrastructure. To support these mappings a hook must be overridden on the type converter: 'materializeConversion' :to generate a cast operation from the new types to the old type. This operation is automatically erased if all uses are removed, otherwise it remains in the IR for the user to handle. PiperOrigin-RevId: 254411383	2019-06-22 09:16:06 -07:00
River Riddle	3e99d99553	Add an overload to 'PatternRewriter::inlineRegionBefore' that accepts a parent region for the insertion position. This allows for inlining the given region into the end of another region. PiperOrigin-RevId: 254367375	2019-06-22 09:15:21 -07:00
Nicolas Vasilache	0804750c9b	Uniformize usage of OpBuilder& (NFC) Historically the pointer-based version of builders was used. This CL uniformizes to OpBuilder & PiperOrigin-RevId: 254280885	2019-06-22 09:14:49 -07:00
River Riddle	7202c4e69d	Rename ConversionTarget::isLegal to isDynamicallyLegal to better represent what the function is actually checking. PiperOrigin-RevId: 254141073	2019-06-22 09:13:45 -07:00
River Riddle	9764ae3f24	Refactor the TypeConverter to support more robust type conversions: * Support for 1->0 type mappings, i.e. when the argument is being removed. * Reordering types when converting a type signature. * Adding new inputs when converting a type signature. This cl also lays down the initial foundation for supporting 1->N type mappings, but full support will come in a followup. Moving forward, function signature changes will be driven by populating a SignatureConversion instance. This class contains all of the necessary information for adding/removing/remapping function signatures; e.g. addInputs, addResults, remapInputs, etc. PiperOrigin-RevId: 254064665	2019-06-19 23:08:33 -07:00
River Riddle	30bbd91056	Simplify usages of SplatElementsAttr now that it inherits from DenseElementsAttr. PiperOrigin-RevId: 253910543	2019-06-19 23:07:34 -07:00
Andy Davis	59b68146ff	Factor fusion compute cost calculation out of LoopFusion and into LoopFusionUtils (NFC). PiperOrigin-RevId: 253797886	2019-06-19 23:06:26 -07:00
Alex Zinenko	4291ae7431	Factor Region::getUsedValuesDefinedAbove into Transforms/RegionUtils Arguably, this function is only useful for transformations and should not pollute the main IR. Also make sure it accepts a the resulting container by-reference instead of returning it. PiperOrigin-RevId: 253622981	2019-06-19 23:03:51 -07:00
Andy Davis	898cf0e968	LoopFusion: adds support for computing forward computation slices, which will enable fusion of consumer loop nests into their producers in subsequent CLs. PiperOrigin-RevId: 253601994	2019-06-19 23:03:42 -07:00
Alex Zinenko	ee6f84aebd	Convert a nest affine loops to a GPU kernel This converts entire loops into threads/blocks. No check on the size of the block or grid, or on the validity of parallelization is performed, it is under the responsibility of the caller to strip-mine the loops and to perform the dependence analysis before calling the conversion. PiperOrigin-RevId: 253189268	2019-06-19 23:02:02 -07:00
River Riddle	6a0555a875	Refactor SplatElementsAttr to inherit from DenseElementsAttr as opposed to being a separate Attribute type. DenseElementsAttr provides a better internal representation for splat values as well as better API for accessing elements. PiperOrigin-RevId: 253138287	2019-06-19 23:01:52 -07:00
River Riddle	5da741f671	Add basic cost modeling to the dialect conversion infrastructure. This initial cost model favors specific patterns based upon two criteria: 1) Lowest minimum pattern stack depth when legalizing. - This leads the system to favor patterns that have lower legalization stacks, i.e. represent a more direct mapping to the target. 2) Pattern benefit. - When considering multiple patterns with the same legalization depth, this favors patterns with a larger specified benefit. PiperOrigin-RevId: 252713470	2019-06-19 22:59:06 -07:00
River Riddle	eb28b30940	NFC: Cleanup the naming scheme for registering legalization actions to be consistent, and move a file functions to the source file. PiperOrigin-RevId: 252639629	2019-06-11 10:14:35 -07:00
Alex Zinenko	8ad35b90ec	Use DialectConversion to lower the Affine dialect to the Standard dialect This introduces the support for region-containing operations to the dialect conversion framework in order to support the conversion of affine control-flow operations into the standard control flow with branches. Regions that belong to an operation are converted before the operation itself. The DialectConversionPattern can therefore access the converted regions of the original operation and process them further if necessary. In particular, the conversion is allowed to move the blocks from the original region to other regions and to split blocks into multiple blocks. All block manipulations must be performed through the PatternRewriter to ensure they will be undone if the conversion fails. Port the pass converting from the affine dialect (loops and ifs with bodies as regions) to the standard dialect (branch-based cfg) to use DialectConversion in order to exercise this new functionality. The modification to the lowering functions are minor and are focused on using the PatterRewriter instead of directly modifying the IR. PiperOrigin-RevId: 252625169	2019-06-11 10:14:27 -07:00
Andy Davis	e33e36f178	Return dependence result enum to distiguish between dependence result and error cases (NFC). PiperOrigin-RevId: 252437616	2019-06-11 10:12:36 -07:00
River Riddle	e7ccfb2ae8	Add support to ConversionTarget for storing legalization actions for entire dialects as opposed to individual operations. This allows for better support of unregistered operations, as well as removing the need to collect all of the operations for a given dialect(which may be very expensive). PiperOrigin-RevId: 251943590	2019-06-09 16:21:32 -07:00
River Riddle	e25796ef6e	Add support for matchAndRewrite to the DialectConversion patterns. This also drops the default "always succeed" match override to better align with RewritePattern. PiperOrigin-RevId: 251941625	2019-06-09 16:21:20 -07:00
River Riddle	0560f153b8	Add utility 'create' methods to OperationFolder that will create an operation with a given OpBuilder and automatically try to fold it, similarly to OpBuilder::createOrFold. The difference here is that these methods enable folding to constants in addition to existing values. This functionality is then used to replace linalg::FunctionConstants. PiperOrigin-RevId: 251716247	2019-06-09 16:19:51 -07:00
River Riddle	9fc00cf840	Always remap results when replacing an operation. This prevents a crash when lowering identity(passthrough) operations to the same resultant type as the original operation. PiperOrigin-RevId: 251665492	2019-06-09 16:18:44 -07:00
River Riddle	0d2492eb2e	When cleaning up after a failed legalization pattern, make sure to remove any newly created value mappings. PiperOrigin-RevId: 251658984	2019-06-09 16:18:32 -07:00
River Riddle	f1b848e470	NFC: Rename FuncBuilder to OpBuilder and refactor to take a top level region instead of a function. PiperOrigin-RevId: 251563898	2019-06-09 16:17:59 -07:00
River Riddle	9b4a02c1e9	NFC: Rename FoldHelper to OperationFolder and split a large function in two. PiperOrigin-RevId: 251485843	2019-06-09 16:17:11 -07:00
Ben Vanik	9fc4193eea	Adding additional dialect parsing utilities, conversion wrappers, and traversal helpers. - added a typed walk to Block (matching the equivalent on Function) - added token parsers (incl optional variants) for : and ( - added applyConversionPatterns that takes a list of functions to apply patterns to PiperOrigin-RevId: 251481608	2019-06-09 16:16:59 -07:00
River Riddle	95eaca3e0f	Refactor the dialect conversion framework to support multi-level conversions. Multi-level conversions are those that require multiple patterns to be applied before an operation is completely legalized. This essentially means that conversion patterns do not have to directly generate legal operations, and may be chained together to produce legal code. To accomplish this, moving forward users will need to provide a legalization target that defines what operations are legal for the conversion. A target can mark an operation as legal by providing a specific legalization action. The initial actions are: * Legal - This action signals that every instance of the given operation is legal, i.e. any combination of attributes, operands, types, etc. is valid. * Dynamic - This action signals that only some instances of a given operation are legal. This allows for defining fine-tune constraints, like say std.add is only legal when operating on 32-bit integers. An example target is shown below: struct MyTarget : public ConversionTarget { MyTarget(MLIRContext &ctx) : ConversionTarget(ctx) { // All operations in the LLVM dialect are legal. addLegalDialect<LLVMDialect>(); // std.constant op is always legal on this target. addLegalOp<ConstantOp>(); // std.return op has dynamic legality constraints. addDynamicallyLegalOp<ReturnOp>(); } /// Implement the custom legalization handler to handle /// std.return. bool isLegal(Operation *op) override { // Process the dynamic handling for a std.return op (and any others that were // marked "dynamic"). ... } }; PiperOrigin-RevId: 251289374	2019-06-03 19:27:02 -07:00
Amit Sabne	7a43da6060	Loop invariant code motion - remove reliance on getForwardSlice. Add more tests. -- PiperOrigin-RevId: 250950703	2019-06-01 20:13:30 -07:00
Geoffrey Martin-Noble	60d6249fbd	Replace checks against numDynamicDims with hasStaticShape -- PiperOrigin-RevId: 250782165	2019-06-01 20:11:31 -07:00
Jacques Pienaar	4a697a91de	Fix 5 ClangTidy - Readability findings. * the 'empty' method should be used to check for emptiness instead of 'size' * using decl 'CapturableHandle' is unused * redundant get() call on smart pointer * using decl 'apply' is unused * using decl 'ScopeGuard' is unused -- PiperOrigin-RevId: 250623863	2019-06-01 20:10:22 -07:00
River Riddle	9abdbb3189	NFC: Inline toString as operations can be streamed directly into raw_ostream. -- PiperOrigin-RevId: 250619765	2019-06-01 20:10:12 -07:00
MLIR Team	5a91b9896c	Remove "size" property of affine maps. -- PiperOrigin-RevId: 250572818	2019-06-01 20:09:02 -07:00
Andy Davis	1de0f97fff	LoopFusionUtils CL 2/n: Factor out and generalize slice union computation. ) Factors slice union computation out of LoopFusion into Analysis/Utils (where other iteration slice utilities exist). ) Generalizes slice union computation to take the union of slices computed on all loads/stores pairs between source and destination loop nests. ) Fixes a bug in FlatAffineConstraints::addSliceBounds where redundant constraints were added. ) Takes care of a TODO to expose FlatAffineConstraints::mergeAndAlignIds as a public method. -- PiperOrigin-RevId: 250561529	2019-06-01 20:08:52 -07:00
Alex Zinenko	c2d105811a	Do not assume Blocks belong to Functions Fix Block::splitBlock and Block::eraseFromFunction that erronously assume blocks belong to functions. They now belong to regions. When splitting, new blocks should be created in the same region as the existing block. When erasing a block, it should be removed from the region rather than from the function body that transitively contains the region. Also rename Block::eraseFromFunction to Block::erase for consistency with other IR containers. -- PiperOrigin-RevId: 250278272	2019-06-01 20:05:21 -07:00
Alex Zinenko	d4c071cc69	Decouple affine->standard lowering from the pass The lowering from the Affine dialect to the Standard dialect was originally implemented as a standalone pass. However, it may be used by other passes willing to lower away some of the affine constructs as a part of their operation. Decouple the transformation functions from the pass infrastructure and expose the entry point for the lowering. Also update the lowering functions to use `LogicalResult` instead of bool for return values. -- PiperOrigin-RevId: 250229198	2019-06-01 20:05:01 -07:00
River Riddle	c2d069323b	Rename DialectConversion to TypeConverter and split out pattern construction. This simplifies building the conversion pattern list from multiple sources. -- PiperOrigin-RevId: 249930583	2019-06-01 20:02:03 -07:00
Lei Zhang	ba104f871c	Add TestLoopFusion.cpp to CMakeLists.txt -- PiperOrigin-RevId: 249901490	2019-06-01 20:00:52 -07:00
Andy Davis	e53b7d2c02	Add LoopFusionUtils.cpp to CMakeLists. -- PiperOrigin-RevId: 249887371	2019-06-01 20:00:33 -07:00
Andy Davis	a560f2c646	Affine Loop Fusion Utility Module (1/n). ) Adds LoopFusionUtils which will expose a set of loop fusion utilities (e.g. dependence checks, fusion cost/storage reduction, loop fusion transformation) for use by loop fusion algorithms. Support for checking block-level fusion-preventing dependences is added in this CL (additional loop fusion utilities will be added in subsequent CLs). ) Adds TestLoopFusion test pass for testing LoopFusionUtils at a fine granularity. *) Adds unit test for testing dependence check for block-level fusion-preventing dependences. -- PiperOrigin-RevId: 249861071	2019-06-01 20:00:23 -07:00
River Riddle	ae1651368f	NFC: Rename DialectConversionPattern to ConversionPattern. -- PiperOrigin-RevId: 249857277	2019-06-01 20:00:13 -07:00
Alex Zinenko	fe2716aee3	Detemplatize convertRegion in DialectConversion Originally, FunctionConverter::convertRegion in the DialectConversion framework was implemented as a function template because it was creating a new region in the parent object, which could have been an op or a function. Since DialectConversion now operates in place, new region is no longer created so there is no need for convertRegion to be aware of the parent, only of the error reporting location. -- PiperOrigin-RevId: 249826392	2019-06-01 20:00:04 -07:00
River Riddle	4958ec2414	Apply operation rewrites before updating arguments. -- PiperOrigin-RevId: 249678839	2019-06-01 19:58:14 -07:00
River Riddle	14d1cfbccb	Decouple running a conversion from the DialectConversion class. The DialectConversion class is only necessary for type signature changes(block arguments or function arguments). This isn't always desired when performing a dialect conversion. This allows for those conversions without this need to run per function instead of per module. -- PiperOrigin-RevId: 249657549	2019-06-01 19:58:04 -07:00
River Riddle	c33862b0ed	Refactor FunctionAttr to hold the internal function reference by name instead of pointer. The one downside to this is that the function reference held by a FunctionAttr needs to be explicitly looked up from the parent module. This provides several benefits though: * There is no longer a need to explicitly remap function attrs. - This removes a potentially expensive call from the destructor of Function. - This will enable some interprocedural transformations to now run intraprocedurally. - This wasn't scalable and forces dialect defined attributes to override a virtual function. * Replacing a function is now a trivial operation. * This is a necessary first step to representing functions as operations. -- PiperOrigin-RevId: 249510802	2019-06-01 19:56:54 -07:00
River Riddle	d15d107da1	Refactor DialectConversion to operate on functions in-place without any cloning. This works by caching all of the requested pattern rewrite operations, e.g. replace operation, and only applying them on a completely successful conversion. -- PiperOrigin-RevId: 249490306	2019-06-01 19:56:24 -07:00
MLIR Team	80884d28ac	[LoopFusion] Don't count terminator op in compute cost. -- PiperOrigin-RevId: 249124895	2019-06-01 19:52:52 -07:00
Nicolas Vasilache	fdbbb3c274	Use lambdas for nesting edsc constructs. Using ArrayRef introduces issues with the order of evaluation between a constructor and the arguments of the subsequent calls to the `operator()`. As a consequence the order of captures is not well-defined can go wrong with certain compilers (e.g. gcc-6.4). This CL fixes the issue by using lambdas in lieu of ArrayRef. -- PiperOrigin-RevId: 249114775	2019-05-20 13:50:28 -07:00
Mehdi Amini	164c3c7ac5	Fix debug build: static constexpr data member must have a definition (until C++17) -- PiperOrigin-RevId: 248990338	2019-05-20 13:48:36 -07:00
River Riddle	68250edbfa	NFC: Tidy up DialectConversion.cpp and rename DialectOpConversion to DialectConversionPattern. -- PiperOrigin-RevId: 248980810	2019-05-20 13:48:19 -07:00
River Riddle	6241cf132e	Refactor the DialectConversion process to clone each function and then operate in-place, as opposed to incrementally constructing a new function. This is crucial to allowing the use of non type-conversion patterns(normal RewritePatterns) as part of the conversion process. The converter now works by inserting fake producer operations when replacing the results of an existing operation with values of a different, now legal, type. These fake operations are guaranteed to never escape the converter. -- PiperOrigin-RevId: 248969130	2019-05-20 13:48:10 -07:00
River Riddle	8780d8d8eb	Add user iterators to IRObjects, i.e. Values. -- PiperOrigin-RevId: 248877752	2019-05-20 13:47:19 -07:00
River Riddle	3de0c7696b	Rewrite the DialectOpConversion patterns to inherit from RewritePattern instead of Pattern. This simplifies the infrastructure a bit by being able to reuse PatternRewriter and the RewritePatternMatcher, but also starts to lay the groundwork for a more generalized legalization framework that can operate on DialectOpConversions as well as normal RewritePatterns. -- PiperOrigin-RevId: 248836492	2019-05-20 13:47:01 -07:00
River Riddle	1a100849c4	Add support for saving and restoring the insertion point of a FuncBuilder. This also updates the edsc::ScopedContext to use a single builder that saves/restores insertion points. This is necessary for using edscs within RewritePatterns. -- PiperOrigin-RevId: 248812645	2019-05-20 13:46:35 -07:00
River Riddle	eb5ec03960	Refactor PatternRewriter to inherit from FuncBuilder instead of Builder. This is necessary for allowing more complicated rewrites in the future that may do things like update the insertion point (e.g. for rewrites involving regions). -- PiperOrigin-RevId: 248803153	2019-05-20 13:46:26 -07:00
River Riddle	1982afb145	Unify the 'constantFold' and 'fold' hooks on an operation into just 'fold'. This new unified fold hook will take constant attributes as operands, and may return an existing 'Value *' or a constant 'Attribute' when folding. This removes the awkward situation where a simple canonicalization like "sub(x,x)->0" had to be written as a canonicalization pattern as opposed to a fold. -- PiperOrigin-RevId: 248582024	2019-05-20 13:44:24 -07:00
Chris Lattner	9ec6b5b749	Remove some extraneous const qualifiers on Type, and 0b1 -> 1 in tblgen files. (NFC) -- PiperOrigin-RevId: 248332674	2019-05-20 13:42:56 -07:00
Jacques Pienaar	cde4d5a6d9	Remove unnecessary C++ specifier in CPP files. NFC. These are only required in .h files to disambiguate between C and C++ header files. -- PiperOrigin-RevId: 248219135	2019-05-20 13:42:13 -07:00
Stella Laurenzo	1a2ad06bae	Fix lingering sign compare warnings in exposed by "ninja check-mlir". -- PiperOrigin-RevId: 248050178	2019-05-20 13:41:11 -07:00
Jacques Pienaar	c82e1da268	Remove unused function and avoid unused variable warning. NFC. -- PiperOrigin-RevId: 247991231	2019-05-20 13:40:23 -07:00
Andy Davis	90d4023c9b	Factor out loop interchange code from LoopFusion into LoopUtils (NFC). -- PiperOrigin-RevId: 247926512	2019-05-20 13:38:12 -07:00
River Riddle	d5b60ee840	Replace Operation::isa with llvm::isa. -- PiperOrigin-RevId: 247789235	2019-05-20 13:37:52 -07:00
River Riddle	adca3c2edc	Replace Operation::cast with llvm::cast. -- PiperOrigin-RevId: 247785983	2019-05-20 13:37:42 -07:00
River Riddle	c5ecf9910a	Add support for using llvm::dyn_cast/cast/isa for operation casts and replace usages of Operation::dyn_cast with llvm::dyn_cast. -- PiperOrigin-RevId: 247780086	2019-05-20 13:37:31 -07:00
MLIR Team	41d90a85bd	Automated rollback of changelist 247778391. PiperOrigin-RevId: 247778691	2019-05-20 13:37:20 -07:00
River Riddle	02e03b9bf4	Add support for using llvm::dyn_cast/cast/isa for operation casts and replace usages of Operation::dyn_cast with llvm::dyn_cast. -- PiperOrigin-RevId: 247778391	2019-05-20 13:37:10 -07:00
Chris Lattner	81e478adca	rename -memref-dependence-check to -test-memref-dependence-check since it generates remarks for testing, it isn't itself a transformation. While there, upgrade its diagnostic emission to use the streaming interface. Prune some unnecessary #includes. -- PiperOrigin-RevId: 247768062	2019-05-20 13:36:38 -07:00
River Riddle	fe7b23792d	Remove some unnecessary or duplicated header includes from IR/. -- PiperOrigin-RevId: 247762545	2019-05-20 13:36:28 -07:00
Chris Lattner	0134b5df3a	Cleanups and simplifications to code, noticed by inspection. NFC. -- PiperOrigin-RevId: 247758075	2019-05-20 13:36:17 -07:00
Mehdi Amini	91f0781000	Remove extra `;` after function definition (NFC) Fix a GCC warning -- PiperOrigin-RevId: 247670176	2019-05-10 19:29:26 -07:00
Mehdi Amini	83cce46b96	Remove unused Vectorize constructor (NFC) Fix gcc warning. -- PiperOrigin-RevId: 247647114	2019-05-10 19:29:01 -07:00
Andy Davis	6254a42d58	Fix bug in DmaGenerate pass where MemRefRegion union was not propagated to read region. Also cleaned up dma-generate.mlir a bit. -- PiperOrigin-RevId: 247417358	2019-05-10 19:25:44 -07:00
Lei Zhang	323e1bf7f8	Inline a string used in lambda function to fix capture error The string was referenced but not captured in the lambda, which causes a failure when compiling with MSVC. This issue was discovered by @loic-joly-sonarsource with a proposed fix in https://github.com/tensorflow/mlir/pull/22. -- PiperOrigin-RevId: 247085897	2019-05-10 19:23:49 -07:00
River Riddle	ae9f4f2157	Simplify the emission of various diagnostics created in Analysis/ and Transforms/ by using the new diagnostic infrastructure. -- PiperOrigin-RevId: 246955332	2019-05-10 19:23:07 -07:00
River Riddle	983e0eea95	Simplify several usages of attributes now that they always have a type and, transitively, access to the context. This also fixes a bug where FunctionAttrs were not being remapped for function and function argument attributes. -- PiperOrigin-RevId: 246876924	2019-05-10 19:22:41 -07:00
River Riddle	4ea887be41	Namespaceify a few explicit template specializations to appease errors caused by a bug in gcc versions < 7.0. (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56480) -- PiperOrigin-RevId: 246664463	2019-05-06 08:29:09 -07:00
Jacques Pienaar	2fe8ae4f6c	Fix up some mixed sign warnings. -- PiperOrigin-RevId: 246614498	2019-05-06 08:28:20 -07:00
Nicolas Vasilache	258e8d9ce2	Prepend an "affine-" prefix to Affine pass option names - NFC Trying to activate both LLVM and MLIR passes in mlir-cpu-runner showed name collisions when registering pass names. One possible way of disambiguating that should also work across dialects is to prepend the dialect name to the passes that specifically operate on that dialect. With this CL, mlir-cpu-runner tests still run when both LLVM and MLIR passes are registered -- PiperOrigin-RevId: 246539917	2019-05-06 08:26:44 -07:00
MLIR Team	9c66417569	Fix bug in LoopTiling where a loop with trip count of 1 caused a division by zero -- PiperOrigin-RevId: 246480710	2019-05-06 08:26:15 -07:00
River Riddle	b14c4b4ca8	Add support for basic remark diagnostics. This is the minimal functionality needed to separate notes from remarks. It also provides a starting point to start building out better remark infrastructure. -- PiperOrigin-RevId: 246175216	2019-05-06 08:24:02 -07:00
River Riddle	eaf7f6b671	Start sketching out a new diagnostics infrastructure. Create a new class 'DiagnosticEngine' and move the diagnostic handler support and final diagnostic emission from the MLIRContext to it. -- PiperOrigin-RevId: 246163897	2019-05-06 08:23:53 -07:00
Nicolas Vasilache	56c7a957bf	Parsing support for Range, View and Slice operations This CL implements the previously unsupported parsing for Range, View and Slice operations. A pass is introduced to lower to the LLVM. Tests are moved out of C++ land and into mlir/test/Examples. This allows better fitting within standard developer workflows. -- PiperOrigin-RevId: 245796600	2019-05-06 08:20:55 -07:00
River Riddle	1423acc03c	Rename isa_nonnull to isa_and_nonnull to match the upstream llvm name. -- PiperOrigin-RevId: 244928036	2019-04-23 22:03:14 -07:00
Feng Liu	5c757087c7	Apply patterns repeatly if the function is modified During the pattern rewrite, if the function is changed, i.e. ops created, deleted or swapped, the pattern rewriter needs to re-scan the function entirely and apply the patterns again, so the patterns whose root ops have been popped out from the working list nor an immediate users of the changed ops can be reconsidered. A command line flag is added to set the max number of iterations rescanning the function for pattern match. If the rewrite doesn' converge after this number, this compiling will continue and the result can be sub-optimal. One unit test is updated because this change fixed the missing optimization opportunities. -- PiperOrigin-RevId: 244754190	2019-04-23 22:02:16 -07:00
Amit Sabne	4aa9235ae0	Fix LLVM_DEBUG instances -- PiperOrigin-RevId: 244058051	2019-04-18 11:49:39 -07:00
Amit Sabne	7905da656e	Loop invariant code motion. -- PiperOrigin-RevId: 244043679	2019-04-18 11:49:31 -07:00
Lei Zhang	bdd56eca49	Remove checks guaranteed to be true by the type This addresses the compiler wraning of "comparison of unsigned expression >= 0 is always true [-Wtype-limits]". -- PiperOrigin-RevId: 242868703	2019-04-11 10:52:33 -07:00
Lei Zhang	2e7895d5f1	Add parentheses in various asserts to group predicates This addresses the "suggest parentheses around ‘&&’ within ‘\|\|’ [-Wparentheses]" compiler warnings. -- PiperOrigin-RevId: 242868670	2019-04-11 10:52:21 -07:00
Andy Davis	44f6dffbf8	Factor code to compute dependence components out of loop fusion pass, and into a reusable utility function (NFC). -- PiperOrigin-RevId: 242716259	2019-04-11 10:51:53 -07:00
Amit Sabne	70a416de14	Fix typos in LoopFusion -- PiperOrigin-RevId: 242679298	2019-04-11 10:51:43 -07:00
Mehdi Amini	f40634ef3a	Filter DialectConversion pattern to be considered only if the root kind matches the operation. This is the same logic as the PatterRewriter. -- PiperOrigin-RevId: 242287241	2019-04-07 18:21:34 -07:00
River Riddle	e4628b79fb	Add new utilities for RTTI Operation casting: dyn_cast_or_null and isa_nonnull * dyn_cast_or_null - This will first check if the operation is null before trying to 'dyn_cast': Value v = ...; if (auto forOp = dyn_cast_or_null<AffineForOp>(v->getDefiningOp())) ... isa_nonnull - This will first check if the pointer is null before trying to 'isa': Value *v = ...; if (isa_nonnull<AffineForOp>(v->getDefiningOp()); ... -- PiperOrigin-RevId: 242171343	2019-04-07 18:20:07 -07:00
Mehdi Amini	7a640e65e9	Fix CMake build: reflect that a new file Utils/ConstantFoldUtils.cpp was added -- PiperOrigin-RevId: 242123122	2019-04-05 07:43:50 -07:00
Mehdi Amini	01e8ec94c3	Fix CMake build: account for renamed files and add missing include on MacOS -- PiperOrigin-RevId: 242101364	2019-04-05 07:43:23 -07:00
River Riddle	c4a5386e48	NFC: Replace usages of iterator_range<operand_iterator> with operand_range. -- PiperOrigin-RevId: 242031201	2019-04-05 07:42:29 -07:00
MLIR Team	0cd589c337	Create a LoopUtil function to return perfectly nested loop set -- PiperOrigin-RevId: 242019230	2019-04-05 07:42:01 -07:00
River Riddle	a8f4b9eeeb	Iterate on the operations to fold in TestConstantFold in reverse to remove the need for ConstantFoldHelper to have a flag for insertion at the head of the entry block. This also fixes an asan bug in TestConstantFold due to the iteration order of operations and ConstantFoldHelper's constant insertion placement. Note: This now means that we cannot fold chains of operations, i.e. where constant foldable operations feed into each other. Given that this is a testing pass solely for constant folding, this isn't really something that we want anyways. Constant fold tests should be simple and direct, with more advanced folding/feeding being tested with the canonicalizer. -- PiperOrigin-RevId: 242011744	2019-04-05 07:41:52 -07:00
River Riddle	dca21299cb	Fix a few warnings for missing parentheses around '\|\|' and extra semicolons. -- PiperOrigin-RevId: 241994767	2019-04-05 07:41:43 -07:00
Lei Zhang	4e40c83291	Deduplicate constant folding logic in ConstantFold and GreedyPatternRewriteDriver There are two places containing constant folding logic right now: the ConstantFold pass and the GreedyPatternRewriteDriver. The logic was not shared and started to drift apart. We were testing constant folding logic using the ConstantFold pass, but lagged behind the GreedyPatternRewriteDriver, where we really want the constant folding to happen. This CL pulled the logic into utility functions and classes for sharing between these two places. A new ConstantFoldHelper class is created to help constant fold and de-duplication. Also, renamed the ConstantFold pass to TestConstantFold to make it clear that it is intended for testing purpose. -- PiperOrigin-RevId: 241971681	2019-04-05 07:41:32 -07:00
River Riddle	6fa3181329	Remove the non-postorder walk functions from Function/Block/Instruction and rename walkPostOrder to walk. -- PiperOrigin-RevId: 241965239	2019-04-05 07:41:23 -07:00
Andy Davis	d0d1b2a30d	Fix bug in LoopTiling where creation of tile-space loop upper bound did not handle symbol operands correctly. -- PiperOrigin-RevId: 241958502	2019-04-05 07:41:12 -07:00
Nicolas Vasilache	f1b12f5a64	Fix test that fails on non-determinism in LowerVectorTransfers This CL fixes the non-determinism across compilers in an edsc::select expression used in LowerVectorTransfers. This is achieved by factoring the expression out of the function call to ensure a deterministic order of evaluation. Since the expression is now factored out, fewer IR is generated and the test is updated accordingly. -- PiperOrigin-RevId: 241679962	2019-04-03 01:09:13 -07:00
River Riddle	67a52c44b1	Rewrite the verify hooks on operations to use LogicalResult instead of bool. This also changes the return of Operation::emitError/emitOpError to LogicalResult as well. -- PiperOrigin-RevId: 241588075	2019-04-02 13:40:47 -07:00
Andy Davis	7c1fc9e795	Enable producer-consumer fusion for liveout memrefs if consumer read region matches producer write region. -- PiperOrigin-RevId: 241517207	2019-04-02 13:39:50 -07:00
River Riddle	084669e005	Remove MLPatternLoweringPass and rewrite LowerVectorTransfers to use RewritePattern instead. -- PiperOrigin-RevId: 241455472	2019-04-02 13:39:17 -07:00
Mehdi Amini	b3a407fa68	Fix MacOS build This is making up for some differences in standard library and linker flags. It also get rid of the requirement to build with RTTI. -- PiperOrigin-RevId: 241348845	2019-04-01 11:00:30 -07:00
Jacques Pienaar	1273af232c	Add build files and update README. * Add initial version of build files; * Update README with instructions to download and build MLIR from github; -- PiperOrigin-RevId: 241102092	2019-03-30 11:23:22 -07:00
Nicolas Vasilache	c9d5f3418a	Cleanup SuperVectorization dialect printing and parsing. On the read side, ``` %3 = vector_transfer_read %arg0, %i2, %i1, %i0 {permutation_map: (d0, d1, d2)->(d2, d0)} : (memref<?x?x?xf32>, index, index, index) -> vector<32x256xf32> ``` becomes: ``` %3 = vector_transfer_read %arg0[%i2, %i1, %i0] {permutation_map: (d0, d1, d2)->(d2, d0)} : memref<?x?x?xf32>, vector<32x256xf32> ``` On the write side, ``` vector_transfer_write %0, %arg0, %c3, %c3 {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32>, index, index ``` becomes ``` vector_transfer_write %0, %arg0[%c3, %c3] {permutation_map: (d0, d1)->(d0)} : vector<128xf32>, memref<?x?xf32> ``` Documentation will be cleaned up in a followup commit that also extracts a proper .md from the top of the file comments. PiperOrigin-RevId: 241021879	2019-03-29 17:56:42 -07:00
Nicolas Vasilache	f93a5be65f	Make createMaterializeVectorsPass take a vectorSize parameter - NFC This CL allows the programmatic control of the target hardware vector size when creating a MaterializeVectorsPass. This is useful for registering passes for the tutorial. PiperOrigin-RevId: 240996136	2019-03-29 17:56:12 -07:00
Nicolas Vasilache	094ca64ab0	Refactor vectorization patterns This CL removes the reliance of the vectorize pass on the specification of a `fastestVaryingDim` parameter. This parameter is a restriction meant to more easily target a particular loop/memref combination for vectorization and is mainly used for testing. This also had the side-effect of restricting vectorization patterns to only the ones in which all memrefs were contiguous along the same loop dimension. This simple restriction prevented matmul to vectorize in 2-D. this CL removes the restriction and adds the matmul test which vectorizes in 2-D along the parallel loops. Support for reduction loops is left for future work. PiperOrigin-RevId: 240993827	2019-03-29 17:55:36 -07:00
MLIR Team	9d30b36aaf	Enable input-reuse fusion to search function arguments for fusion candidates (takes care of a TODO, enables another tutorial test case). PiperOrigin-RevId: 240979894	2019-03-29 17:54:36 -07:00
River Riddle	106dd08e99	Change the vectorizer test pass to output via diagnostics instead of llvm::outs. This allows for the output to be deterministic when multi-threading is enabled. PiperOrigin-RevId: 240905858	2019-03-29 17:54:21 -07:00
Jacques Pienaar	cd0b925dc2	Remove extra qualification PiperOrigin-RevId: 240875432	2019-03-29 17:52:36 -07:00
Alex Zinenko	3173a63f3f	Dialect Conversion: convert regions of operations when cloning them Dialect conversion currently clones the operations that did not match any pattern. This includes cloning any regions that belong to these operations. Instead, apply conversion recursively to the nested regions. Note that if an operation matched one of the conversion patterns, it is up to the pattern rewriter to fill in the regions of the converted operation. This may require calling back to the converter and is left for future work. PiperOrigin-RevId: 240872410	2019-03-29 17:52:04 -07:00
MLIR Team	9d9675fc8f	Remove overly conservative check in LoopFusion pass (enables fusion in tutorial example). PiperOrigin-RevId: 240859227	2019-03-29 17:51:16 -07:00
River Riddle	213b8d4d3b	Rename InstOperand to OpOperand. PiperOrigin-RevId: 240814651	2019-03-29 17:50:41 -07:00
Nicolas Vasilache	4dc7af9da8	Make vectorization aware of loop semantics Now that we have a dependence analysis, we can check that loops are indeed parallel and make vectorization correct. PiperOrigin-RevId: 240682727	2019-03-29 17:49:30 -07:00
Nicolas Vasilache	c3742d20b5	Give the Vectorize pass a virtualVectorSize argument. This CL allows vectorization to be called and configured in other ways than just via command line arguments. This allows triggering vectorization programmatically. PiperOrigin-RevId: 240638208	2019-03-29 17:48:12 -07:00
River Riddle	99b87c9707	Replace usages of Instruction with Operation in the Transforms/ directory. PiperOrigin-RevId: 240636130	2019-03-29 17:47:26 -07:00
Mehdi Amini	3518122e86	Simplify API uses of `getContext()` (NFC) The Pass base class is providing a convenience getContext() accessor. PiperOrigin-RevId: 240634961	2019-03-29 17:47:11 -07:00
Jacques Pienaar	b0244b66a5	Fix include path in test pass. PiperOrigin-RevId: 240628260	2019-03-29 17:46:41 -07:00
River Riddle	9c08540690	Replace usages of Instruction with Operation in the /Analysis directory. PiperOrigin-RevId: 240569775	2019-03-29 17:44:56 -07:00
Alex Zinenko	5a5bba0279	Introduce affine terminator Due to legacy reasons (ML/CFG function separation), regions in affine control flow operations require contained blocks not to have terminators. This is inconsistent with the notion of the block and may complicate code motion between regions of affine control operations and other regions. Introduce `affine.terminator`, a special terminator operation that must be used to terminate blocks inside affine operations and transfers the control back to he region enclosing the affine operation. For brevity and readability reasons, allow `affine.for` and `affine.if` to omit the `affine.terminator` in their regions when using custom printing and parsing format. The custom parser injects the `affine.terminator` if it is missing so as to always have it present in constructed operations. Update transformations to account for the presence of terminator. In particular, most code motion transformation between loops should leave the terminator in place, and code motion between loops and non-affine blocks should drop the terminator. PiperOrigin-RevId: 240536998	2019-03-29 17:44:24 -07:00
River Riddle	af45236c70	Add experimental support for multi-threading the pass manager. This adds support for running function pipelines on functions across multiple threads, and is guarded by an off-by-default flag 'experimental-mt-pm'. There are still quite a few things that need to be done before multi-threading is ready for general use(e.g. pass-timing), but this allows for those things to be tested in a multi-threaded environment. PiperOrigin-RevId: 240489002	2019-03-29 17:44:08 -07:00
River Riddle	f9d91531df	Replace usages of Instruction with Operation in the /IR directory. This is step 2/N to renaming Instruction to Operation. PiperOrigin-RevId: 240459216	2019-03-29 17:43:37 -07:00
River Riddle	9ffdc930c0	Rename the Instruction class to Operation. This just renames the class, usages of Instruction will still refer to a typedef in the interim. This is step 1/N to renaming Instruction to Operation. PiperOrigin-RevId: 240431520	2019-03-29 17:42:50 -07:00
Alex Zinenko	a7215a9032	Allow creating standalone Regions Currently, regions can only be constructed by passing in a `Function` or an `Instruction` pointer referencing the parent object, unlike `Function`s or `Instruction`s themselves that can be created without a parent. It leads to a rather complex flow in operation construction where one has to create the operation first before being able to work with its regions. It may be necessary to work with the regions before the operation is created. In particular, in `build` and `parse` functions that are executed _before_ the operation is created in cases where boilerplate region manipulation is required (for example, inserting the hypothetical default terminator in affine regions). Allow creating standalone regions. Such regions are meant to own a list of blocks and transfer them to other regions on demand. Each instruction stores a fixed number of regions as trailing objects and has ownership of them. This decreases the size of the Instruction object for the common case of instructions without regions. Keep this behavior intact. To allow some flexibility in construction, make OperationState store an owning vector of regions. When the Builder creates an Instruction from OperationState, the bodies of the regions are transferred into the instruction-owned regions to minimize copying. Thus, it becomes possible to fill standalone regions with blocks and move them to an operation when it is constructed, or move blocks from a region to an operation region, e.g., for inlining. PiperOrigin-RevId: 240368183	2019-03-29 17:40:59 -07:00
Chris Lattner	46ade282c8	Make FunctionPass::getFunction() return a reference to the function, instead of a pointer. This makes it consistent with all the other methods in FunctionPass, as well as with ModulePass::getModule(). NFC. PiperOrigin-RevId: 240257910	2019-03-29 17:40:44 -07:00
River Riddle	96ebde9cfd	Replace usages of "Op::operator->" with ".". This is step 2/N of removing the temporary operator-> method as part of the de-const transition. PiperOrigin-RevId: 240200792	2019-03-29 17:40:09 -07:00
River Riddle	5de726f493	Refactor the Pattern framework to allow for combined match/rewrite patterns. This is done by adding a new 'matchAndRewrite' function to RewritePattern that performs the match and rewrite in one step. The default behavior simply calls into the existing 'match' and 'rewrite' functions. The 'PatternMatcher' class has now been specialized for RewritePatterns and has been rewritten to make use of the new matchAndRewrite functionality. This combined match/rewrite functionality allows simplifying the majority of existing RewritePatterns, as they do not benefit from separate match and rewrite functions. Some of the existing canonicalization patterns in StandardOps have been modified to take advantage of this functionality. PiperOrigin-RevId: 240187856	2019-03-29 17:39:35 -07:00
River Riddle	af1abcc80b	Replace usages of "operator->" with "." for the AffineOps. Note: The "operator->" method is a temporary helper for the de-const transition and is gradually being phased out. PiperOrigin-RevId: 240179439	2019-03-29 17:39:19 -07:00
River Riddle	832567b379	NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for' and set the namespace of the AffineOps dialect to 'affine'. PiperOrigin-RevId: 240165792	2019-03-29 17:39:03 -07:00
Mehdi Amini	bb621a5596	Using getContext() instead of getInstruction()->getContext() on Operation (NFC) PiperOrigin-RevId: 240088209	2019-03-29 17:38:29 -07:00
Chris Lattner	e510de0305	Various small cleanups to the code, mostly removing const_cast's. PiperOrigin-RevId: 240083489	2019-03-29 17:37:58 -07:00
River Riddle	9c6e92360c	NFC: Rename the 'if' operation in the AffineOps dialect to 'affine.if'. PiperOrigin-RevId: 240071154	2019-03-29 17:36:53 -07:00
Chris Lattner	d9b5bc8f55	Remove OpPointer, cleaning up a ton of code. This also moves Ops to using inherited constructors, which is cleaner and means you can now use DimOp() to get a null op, instead of having to use Instruction::getNull<DimOp>(). This removes another 200 lines of code. PiperOrigin-RevId: 240068113	2019-03-29 17:36:21 -07:00
Chris Lattner	dd2b2ec542	Push a bunch of 'consts' out of the *Op structure, in prep for removing OpPointer. PiperOrigin-RevId: 240044712	2019-03-29 17:35:35 -07:00
Nicolas Vasilache	f26c7cd792	Cleanup ValueHandleArray We just need a way to unpack ArrayRef<ValueHandle> to ArrayRef<Value*>. No need to expose this to the user. This reduces the cognitive overhead for the tutorial. PiperOrigin-RevId: 240037425	2019-03-29 17:35:20 -07:00
Chris Lattner	986310a68f	Remove const from Value, Instruction, Argument, and the various methods on the *Op classes. This is a net reduction by almost 400LOC. PiperOrigin-RevId: 239972443	2019-03-29 17:34:33 -07:00
Chris Lattner	3d6c74fff5	Remove const from mlir::Block. This also eliminates some incorrect reinterpret_cast logic working around it, and numerous const-incorrect issues (like block argument iteration). PiperOrigin-RevId: 239712029	2019-03-29 17:30:30 -07:00
Chris Lattner	88e9f418f5	Continue pushing const out of the core IR types - in this case, remove const from Function. PiperOrigin-RevId: 239638635	2019-03-29 17:29:58 -07:00
Nicolas Vasilache	fc5bbdd6c8	Improve comment for `augmentMapAndBounds` Followup from a previous CL. PiperOrigin-RevId: 239591775	2019-03-29 17:27:57 -07:00
Chris Lattner	589df37142	Move to new `const` model, part 1: remove ConstOpPointer. This eliminate ConstOpPointer (but keeps OpPointer for now) by making OpPointer implicitly launder const in a const incorrect way. It will eventually go away entirely, this is a progressive step towards the new const model. PiperOrigin-RevId: 239512640	2019-03-29 17:26:56 -07:00
Nicolas Vasilache	d6c650cfb5	Properly propagate induction variable in tiling This CL fixes an issue where cloned loop induction variables were not properly propagated and beefs up the corresponding test. PiperOrigin-RevId: 239422961	2019-03-29 17:25:53 -07:00
Jacques Pienaar	a8ed2ca8fd	Cleanup for changes failing with std=c++11 The static constexpr were failing with undefined reference due to lacking definition at namespace scope. PiperOrigin-RevId: 239241157	2019-03-29 17:25:24 -07:00
Jacques Pienaar	57270a9a99	Remove some statements that required >C++11, add includes and qualify names. NFC. PiperOrigin-RevId: 239197784	2019-03-29 17:24:53 -07:00
Dimitrios Vytiniotis	ee4cfefca8	Avoiding allocations during argument attribute conversion. PiperOrigin-RevId: 239144675	2019-03-29 17:24:38 -07:00
Nicolas Vasilache	c3b0c6a0dc	Cleanups Vectorize and SliceAnalysis - NFC This CL cleans up and refactors super-vectorization and slice analysis. PiperOrigin-RevId: 238986866	2019-03-29 17:23:07 -07:00
Nicolas Vasilache	a89d8c0a1a	Port Tablegen'd reference implementation of Add to declarative builders. PiperOrigin-RevId: 238977252	2019-03-29 17:22:36 -07:00
Nicolas Vasilache	3a12bc5041	Remove LOAD/STORE/RETURN boilerplate in declarative builders. This CL introduces a ValueArrayHandle helper to manage the implicit conversion of ArrayRef<ValueHandle> -> ArrayRef<Value> by converting first to ValueArrayHandle. Without this, boilerplate operations that take ArrayRef<Value> cannot be removed easily. This all seems to boil down to decoupling Value from Type. Alternative solutions exist (e.g. MLIR using Value by value everywhere) but they would be very intrusive. This seems to be the lowest impedance change. Intrinsics are also lowercased by popular demand. PiperOrigin-RevId: 238974125	2019-03-29 17:22:20 -07:00
Nicolas Vasilache	f43388e4ce	Port LowerVectorTransfers from EDSC + AST to declarative builders This CL removes the dependency of LowerVectorTransfers on the AST version of EDSCs which will be retired. This exhibited a pretty fundamental staging difference in AST-based vs declarative based emission. Since the delayed creation with an AST was staged, the loop order came into existence after the clipping expressions were computed. This now changes as the loops first need to be created declaratively in fixed order and then the clipping expressions are created. Also, due to lack of staging, coalescing cannot be done on the fly anymore and needs to be done either as a pre-pass (current implementation) or as a local transformation on the generated IR (future work). Tests are updated accordingly. PiperOrigin-RevId: 238971631	2019-03-29 17:22:06 -07:00
River Riddle	27d1bb920e	Cache the simplified attributes in SimplifyAffineStructures to avoid redundant simplifications, as well as unnecessary accesses to the MLIRContext. PiperOrigin-RevId: 238654325	2019-03-29 17:20:46 -07:00
Alex Zinenko	276fae1b0d	Rename BlockList into Region NFC. This is step 1/n to specifying regions as parts of any operation. PiperOrigin-RevId: 238472370	2019-03-29 17:18:04 -07:00
Uday Bondhugula	a228b7d477	Change getMemoryFootprintBytes emitError to a warning - this is really not a hard error; emit a warning instead (for inability to compute footprint due to the union failing due to unimplemented cases) - remove a misleading warning from LoopFusion.cpp PiperOrigin-RevId: 238118711	2019-03-29 17:16:12 -07:00
Uday Bondhugula	9f2781e8dd	Fix misc bugs / TODOs / other improvements to analysis utils - fix for getConstantBoundOnDimSize: floordiv -> ceildiv for extent - make getConstantBoundOnDimSize also return the identifier upper bound - fix unionBoundingBox to correctly use the divisor and upper bound identified by getConstantBoundOnDimSize - deal with loop step correctly in addAffineForOpDomain (covers most cases now) - fully compose bound map / operands and simplify/canonicalize before adding dim/symbol to FlatAffineConstraints; fixes false positives in -memref-bound-check; add test case there - expose mlir::isTopLevelSymbol from AffineOps PiperOrigin-RevId: 238050395	2019-03-29 17:15:27 -07:00
Uday Bondhugula	075090f891	Extend loop unrolling and unroll-jamming to non-matching bound operands and multi-result upper bounds, complete TODOs, fix/improve test cases. - complete TODOs for loop unroll/unroll-and-jam. Something as simple as "for %i = 0 to %N" wasn't being unrolled earlier (unless it had been written as "for %i = ()[s0] -> (0)()[%N] to %N"; addressed now. - update/replace getTripCountExpr with buildTripCountMapAndOperands; makes it more powerful as it composes inputs into it - getCleanupLowerBound and getUnrolledLoopUpperBound actually needed the same code; refactor and remove one. - reorganize test cases, write previous ones better; most of these changes are "label replacements". - fix wrongly labeled test cases in unroll-jam.mlir PiperOrigin-RevId: 238014653	2019-03-29 17:14:12 -07:00
River Riddle	5e1f1d2cab	Update the constantFold/fold API to use LogicalResult instead of bool. PiperOrigin-RevId: 237719658	2019-03-29 17:10:50 -07:00
River Riddle	0310d49f46	Move the success/failure functions out of LogicalResult and into the mlir namespace. PiperOrigin-RevId: 237712180	2019-03-29 17:10:21 -07:00
River Riddle	80d3568c0a	Rename Status to LogicalResult to avoid conflictions with the Status in xla/tensorflow/etc. PiperOrigin-RevId: 237537341	2019-03-29 17:08:50 -07:00
Uday Bondhugula	ce7e59536c	Add a basic model to set tile sizes + some cleanup - compute tile sizes based on a simple model that looks at memory footprints (instead of using the hardcoded default value) - adjust tile sizes to make them factors of trip counts based on an option - update loop fusion CL options to allow setting maximal fusion at pass creation - change an emitError to emitWarning (since it's not a hard error unless the client treats it that way, in which case, it can emit one) $ mlir-opt -debug-only=loop-tile -loop-tile test/Transforms/loop-tiling.mlir test/Transforms/loop-tiling.mlir:81:3: note: using tile sizes [4 4 5 ] for %i = 0 to 256 { for %i0 = 0 to 256 step 4 { for %i1 = 0 to 256 step 4 { for %i2 = 0 to 250 step 5 { for %i3 = #map4(%i0) to #map11(%i0) { for %i4 = #map4(%i1) to #map11(%i1) { for %i5 = #map4(%i2) to #map12(%i2) { %0 = load %arg0[%i3, %i5] : memref<8x8xvector<64xf32>> %1 = load %arg1[%i5, %i4] : memref<8x8xvector<64xf32>> %2 = load %arg2[%i3, %i4] : memref<8x8xvector<64xf32>> %3 = mulf %0, %1 : vector<64xf32> %4 = addf %2, %3 : vector<64xf32> store %4, %arg2[%i3, %i4] : memref<8x8xvector<64xf32>> } } } } } } PiperOrigin-RevId: 237461836	2019-03-29 17:06:51 -07:00
River Riddle	1e55ae19a0	Convert ambiguous bool returns in /Analysis to use Status instead. PiperOrigin-RevId: 237390240	2019-03-29 17:06:21 -07:00
River Riddle	10ddae6d88	Use Status instead of bool in DialectConversion. PiperOrigin-RevId: 237339277	2019-03-29 17:06:06 -07:00
River Riddle	ba6fdc8b01	Move UtilResult into the Support directory and rename it to Status. Status provides an unambiguous way to specify success/failure results. These can be generated by 'Status::success()' and Status::failure()'. Status provides no implicit conversion to bool and should be consumed by one of the following utility functions: * bool succeeded(Status) - Return if the status corresponds to a success value. * bool failed(Status) - Return if the status corresponds to a failure value. PiperOrigin-RevId: 237153884	2019-03-29 17:04:19 -07:00
River Riddle	d43f630de8	NFC: Remove 'Result' from the analysis manager api to better reflect the implementation. There is no distinction between analysis computation and result. PiperOrigin-RevId: 237093101	2019-03-29 17:02:12 -07:00
River Riddle	1d87b62afe	Add support for preserving specific analyses in the analysis manager. Passes can now preserve specific analyses via 'markAnalysesPreserved'. Example: markAnalysesPreserved<DominanceInfo>(); markAnalysesPreserved<DominanceInfo, PostDominanceInfo>(); PiperOrigin-RevId: 237081454	2019-03-29 17:01:41 -07:00
MLIR Team	c1ff9e866e	Use FlatAffineConstraints::unionBoundingBox to perform slice bounds union for loop fusion pass (WIP). Adds utility to convert slice bounds to a FlatAffineConstraints representation. Adds utility to FlatAffineConstraints to promote loop IV symbol identifiers to dim identifiers. PiperOrigin-RevId: 236973261	2019-03-29 16:59:21 -07:00
Uday Bondhugula	5836fae8a0	DMA generation CL flag update - allow mem capacity to be overridden by command-line flag - change default fast mem space to 2 PiperOrigin-RevId: 236951598	2019-03-29 16:59:05 -07:00
Uday Bondhugula	02af8c22df	Change Pass:getFunction() to return pointer instead of ref - NFC - change this for consistency - everything else similar takes/returns a Function pointer - the FuncBuilder ctor, Block/Value/Instruction::getFunction(), etc. - saves a whole bunch of &s everywhere PiperOrigin-RevId: 236928761	2019-03-29 16:58:35 -07:00
Nicolas Vasilache	069c818f40	Fix lower/upper bound mismatch in stripmineSink Also beef up the corresponding test case. PiperOrigin-RevId: 236878818	2019-03-29 16:57:21 -07:00
Dimitrios Vytiniotis	a60ba7d908	Supporting conversion of argument attributes along their types. This fixes a bug: previously, during conversion function argument attributes were neither beings passed through nor converted. This fix extends DialectConversion to allow for simultaneous conversion of the function type and the argument attributes. This was important when lowering MLIR to LLVM where attribute information (e.g. noalias) needs to be preserved in MLIR(LLVMDialect). Longer run it seems reasonable that we want to convert both the function attribute and its type and the argument attributes, but that requires a small refactoring in Function.h to aggregate these three fields in an inner struct, which will require some discussion. PiperOrigin-RevId: 236709409	2019-03-29 16:55:51 -07:00
MLIR Team	d42ef78a75	Handle MemRefRegion::compute return value in loop fusion pass (NFC). PiperOrigin-RevId: 236685849	2019-03-29 16:55:20 -07:00
River Riddle	485746f524	Implement the initial AnalysisManagement infrastructure, with the introduction of the FunctionAnalysisManager and ModuleAnalysisManager classes. These classes provide analysis computation, caching, and invalidation for a specific IR unit. The invalidation is currently limited to either all or none, i.e. you cannot yet preserve specific analyses. An analysis can be any class, but it must provide the following: * A constructor for a given IR unit. struct MyAnalysis { // Compute this analysis with the provided module. MyAnalysis(Module *module); }; Analyses can be accessed from a Pass by calling either the 'getAnalysisResult<AnalysisT>' or 'getCachedAnalysisResult<AnalysisT>' methods. A FunctionPass may query for a cached analysis on the parent module with 'getCachedModuleAnalysisResult'. Similary, a ModulePass may query an analysis, it doesn't need to be cached, on a child function with 'getFunctionAnalysisResult'. By default, when running a pass all cached analyses are set to be invalidated. If no transformation was performed, a pass can use the method 'markAllAnalysesPreserved' to preserve all analysis results. As noted above, preserving specific analyses is not yet supported. PiperOrigin-RevId: 236505642	2019-03-29 16:54:50 -07:00
Uday Bondhugula	eee85361bb	Remove hidden flag from fusion CL options PiperOrigin-RevId: 236409185	2019-03-29 16:54:05 -07:00
River Riddle	f37651c708	NFC. Move all of the remaining operations left in BuiltinOps to StandardOps. The only thing left in BuiltinOps are the core MLIR types. The standard types can't be moved because they are referenced within the IR directory, e.g. in things like Builder. PiperOrigin-RevId: 236403665	2019-03-29 16:53:35 -07:00
Lei Zhang	85d9b6c8f7	Use consistent names for dialect op source files This CL changes dialect op source files (.h, .cpp, .td) to follow the following convention: <full-dialect-name>/<dialect-namespace>Ops.{h\|cpp\|td} Builtin and standard dialects are specially treated, though. Both of them do not have dialect namespace; the former is still named as BuiltinOps.* and the latter is named as Ops.*. Purely mechanical. NFC. PiperOrigin-RevId: 236371358	2019-03-29 16:53:19 -07:00
MLIR Team	d038e34735	Loop fusion for input reuse. ) Breaks fusion pass into multiple sub passes over nodes in data dependence graph: - first pass fuses single-use producers into their unique consumer. - second pass enables fusing for input-reuse by fusing sibling nodes which read from the same memref, but which do not share dependence edges. - third pass fuses remaining producers into their consumers (Note that the sibling fusion pass may have transformed a producer with multiple uses into a single-use producer). ) Fusion for input reuse is enabled by computing a sibling node slice using the load/load accesses to the same memref, and fusion safety is guaranteed by checking that the sibling node memref write region (to a different memref) is preserved. ) Enables output vector and output matrix computations from KFAC patches-second-moment operation to fuse into a single loop nest and reuse input from the image patches operation. ) Adds a generic loop utilitiy for finding all sequential loops in a loop nest. *) Adds and updates unit tests. PiperOrigin-RevId: 236350987	2019-03-29 16:52:35 -07:00
River Riddle	ddc6788cc7	Provide a Builder::getNamedAttr and (Instruction\|Function)::setAttr(StringRef, Attribute) to simplify attribute manipulation. PiperOrigin-RevId: 236222504	2019-03-29 16:50:59 -07:00
River Riddle	ed5fe2098b	Remove PassResult and have the runOnFunction/runOnModule functions return void instead. To signal a pass failure, passes should now invoke the 'signalPassFailure' method. This provides the equivalent functionality when needed, but isn't an intrusive part of the API like PassResult. PiperOrigin-RevId: 236202029	2019-03-29 16:50:44 -07:00
Uday Bondhugula	58889884a2	Change some of the debug messages to use emitError / emitWarning / emitNote - NFC PiperOrigin-RevId: 236169676	2019-03-29 16:50:29 -07:00
River Riddle	c6c534493d	Port all of the existing passes over to the new pass manager infrastructure. This is largely NFC. PiperOrigin-RevId: 235952357	2019-03-29 16:47:14 -07:00
Uday Bondhugula	7aa60a383f	Temp change in FlatAffineConstraints::getSliceBounds() to deal with TODO in LoopFusion - getConstDifference in LoopFusion is pending a refactoring to handle bounds with min's and max's; it currently asserts on some useful test cases that we want to experiment with. This CL changes getSliceBounds to be more conservative so as to not trigger the assertion. Filed b/126426796 to track this. PiperOrigin-RevId: 235826538	2019-03-29 16:45:23 -07:00
Uday Bondhugula	d4b3ff1096	Loop fusion comand line options cleanup - clean up loop fusion CL options for promoting local buffers to fast memory space - add parameters to loop fusion pass instantiation PiperOrigin-RevId: 235813419	2019-03-29 16:44:38 -07:00
River Riddle	cdbfd48471	Rewrite the dominance info classes to allow for operating on arbitrary control flow within operation regions. The CSE pass is also updated to properly handle nested dominance. PiperOrigin-RevId: 235742627	2019-03-29 16:43:35 -07:00
Nicolas Vasilache	62c54a2ec4	Add a stripmineSink and imperfectly nested tiling primitives. This CL adds a primitive to perform stripmining of a loop by a given factor and sinking it under multiple target loops. In turn this is used to implement imperfectly nested loop tiling (with interchange) by repeatedly calling the stripmineSink primitive. The API returns the point loops and allows repeated invocations of tiling to achieve declarative, multi-level, imperfectly-nested tiling. Note that this CL is only concerned with the mechanical aspects and does not worry about analysis and legality. The API is demonstrated in an example which creates an EDSC block, emits the corresponding MLIR and applies imperfectly-nested tiling: ```cpp auto block = edsc::block({ For(ArrayRef<edsc::Expr>{i, j}, {zero, zero}, {M, N}, {one, one}, { For(k1, zero, O, one, { C({i, j, k1}) = A({i, j, k1}) + B({i, j, k1}) }), For(k2, zero, O, one, { C({i, j, k2}) = A({i, j, k2}) + B({i, j, k2}) }), }), }); // clang-format on emitter.emitStmts(block.getBody()); auto l_i = emitter.getAffineForOp(i), l_j = emitter.getAffineForOp(j), l_k1 = emitter.getAffineForOp(k1), l_k2 = emitter.getAffineForOp(k2); auto indicesL1 = mlir::tile({l_i, l_j}, {512, 1024}, {l_k1, l_k2}); auto l_ii1 = indicesL1[0][0], l_jj1 = indicesL1[1][0]; mlir::tile({l_jj1, l_ii1}, {32, 16}, l_jj1); ``` The edsc::Expr for the induction variables (i, j, k_1, k_2) provide the programmatic hooks from which tiling can be applied declaratively. PiperOrigin-RevId: 235548228	2019-03-29 16:41:20 -07:00
Uday Bondhugula	dfe07b7bf6	Refactor AffineExprFlattener and move FlatAffineConstraints out of IR into Analysis - NFC - refactor AffineExprFlattener (-> SimpleAffineExprFlattener) so that it doesn't depend on FlatAffineConstraints, and so that FlatAffineConstraints could be moved out of IR/; the simplification that the IR needs for AffineExpr's doesn't depend on FlatAffineConstraints - have AffineExprFlattener derive from SimpleAffineExprFlattener to use for all Analysis/Transforms purposes; override addLocalFloorDivId in the derived class - turn addAffineForOpDomain into a method on FlatAffineConstraints - turn AffineForOp::getAsValueMap into an AffineValueMap ctor PiperOrigin-RevId: 235283610	2019-03-29 16:39:32 -07:00
River Riddle	f48716146e	NFC: Make DialectConversion not directly inherit from ModulePass. It is now just a utility class that performs dialect conversion on a provided module. PiperOrigin-RevId: 235194067	2019-03-29 16:38:57 -07:00
River Riddle	5410dff790	Rewrite MLPatternLoweringPass to no longer inherit from FunctionPass and just provide a utility function that applies ML patterns. PiperOrigin-RevId: 235194034	2019-03-29 16:38:41 -07:00
MLIR Team	8564b274db	Internal change PiperOrigin-RevId: 235191129	2019-03-29 16:38:24 -07:00
River Riddle	3e656599f1	Define a PassID class to use when defining a pass. This allows for the type used for the ID field to be self documenting. It also allows for the compiler to know the set alignment of the ID object, which is useful for storing pointer identifiers within llvm data structures. PiperOrigin-RevId: 235107957	2019-03-29 16:37:12 -07:00
Uday Bondhugula	4d3af6be82	Print debug message better + switch a dma-generate cl opt to uint64_t PiperOrigin-RevId: 234840316	2019-03-29 16:35:41 -07:00
Uday Bondhugula	a1dad3a5d9	Extend/improve getSliceBounds() / complete TODO + update unionBoundingBox - compute slices precisely where the destination iteration depends on multiple source iterations (instead of over-approximating to the whole source loop extent) - update unionBoundingBox to deal with input with non-matching symbols - reenable disabled backend test case PiperOrigin-RevId: 234714069	2019-03-29 16:33:11 -07:00
River Riddle	48ccae2476	NFC: Refactor the files related to passes. * PassRegistry is split into its own source file. * Pass related files are moved to a new library 'Pass'. PiperOrigin-RevId: 234705771	2019-03-29 16:32:56 -07:00
Uday Bondhugula	5021dc4fa0	DMA placement update - hoist loops invariant DMAs - hoist DMAs past all loops immediately surrounding the region that the latter is invariant on - do this at DMA generation time itself PiperOrigin-RevId: 234628447	2019-03-29 16:32:41 -07:00
Uday Bondhugula	4ca6219099	Update pass documentation + improve/fix some comments - add documentation for passes - improve / fix outdated doc comments PiperOrigin-RevId: 234627076	2019-03-29 16:32:11 -07:00
River Riddle	da0ebe0670	Add a generic pattern matcher for matching constant values produced by an operation with zero operands and a single result. PiperOrigin-RevId: 234616691	2019-03-29 16:31:56 -07:00
Alex Zinenko	b4dba895a6	EDSC: make Expr typed and extensible Expose the result types of edsc::Expr, which are now stored for all types of Exprs and not only for the variadic ones. Require return types when an Expr is constructed, if it will ever have some. An empty return type list is interpreted as an Expr that does not create a value (e.g. `return` or `store`). Conceptually, all edss::Exprs are now typed, with the type being a (potentially empty) tuple of return types. Unbound expressions and Bindables must now be constructed with a specific type they will take. This makes EDSC less evidently type-polymorphic, but we can still write generic code such as Expr sumOfSquares(Expr lhs, Expr rhs) { return lhs * lhs + rhs * rhs; } and use it to construct different typed expressions as sumOfSquares(Bindable(IndexType::get(ctx)), Bindable(IndexType::get(ctx))); sumOfSquares(Bindable(FloatType::getF32(ctx)), Bindable(FloatType::getF32(ctx))); On the positive side, we get the following. 1. We can now perform type checking when constructing Exprs rather than during MLIR emission. Nevertheless, this is still duplicates the Op::verify() until we can factor out type checking from that. 2. MLIREmitter is significantly simplified. 3. ExprKind enum is only used for actual kinds of expressions. Data structures are converging with AbstractOperation, and the users can now create a VariadicExpr("canonical_op_name", {types}, {exprs}) for any operation, even an unregistered one without having to extend the enum and make pervasive changes to EDSCs. On the negative side, we get the following. 1. Typed bindables are more verbose, even in Python. 2. We lose the ability to do print debugging for higher-level EDSC abstractions that are implemented as multiple MLIR Ops, for example logical disjunction. This is the step 2/n towards making EDSC extensible. *** Move MLIR Op construction from MLIREmitter::emitExpr to Expr::build since Expr now has sufficient information to build itself. This is the step 3/n towards making EDSC extensible. Both of these strive to minimize the amount of irrelevant changes. In particular, this introduces more complex pretty-printing for affine and binary expression to make sure tests continue to pass. It also relies on string comparison to identify specific operations that an Expr produces. PiperOrigin-RevId: 234609882	2019-03-29 16:31:26 -07:00
Alex Zinenko	0a4c940c1b	EDSC: introduce support for blocks EDSC currently implement a block as a statement that is itself a list of statements. This suffers from two modeling problems: (1) these blocks are not addressable, i.e. one cannot create an instruction where thus constructed block is a successor; (2) they support block nesting, which is not supported by MLIR blocks. Furthermore, emitting such "compound statement" (misleadingly named `Block` in Python bindings) does not actually produce a new Block in the IR. Implement support for creating actual IR Blocks in EDSC. In particular, define a new StmtBlock EDSC class that is neither an Expr nor a Stmt but contains a list of Stmts. Additionally, StmtBlock may have (early-) typed arguments. These arguments are Bindable expressions that can be used inside the block. Provide two calls in the MLIREmitter, `emitBlock` that actually emits a new block and `emitBlockBody` that only emits the instructions contained in the block without creating a new block. In the latter case, the instructions must not use block arguments. Update Python bindings to make it clear when instruction emission happens without creating a new block. PiperOrigin-RevId: 234556474	2019-03-29 16:30:56 -07:00
Uday Bondhugula	f97c1c5b06	Misc. updates/fixes to analysis utils used for DMA generation; update DMA generation pass to make it drop certain assumptions, complete TODOs. - multiple fixes for getMemoryFootprintBytes - pass loopDepth correctly from getMemoryFootprintBytes() - use union while computing memory footprints - bug fixes for addAffineForOpDomain - take into account loop step - add domains of other loop IVs in turn that might have been used in the bounds - dma-generate: drop assumption of "non-unit stride loops being tile space loops and skipping those and recursing to inner depths"; DMA generation is now purely based on available fast mem capacity and memory footprint's calculated - handle memory region compute failures/bailouts correctly from dma-generate - loop tiling cleanup/NFC - update some debug and error messages to use emitNote/emitError in pipeline-data-transfer pass - NFC PiperOrigin-RevId: 234245969	2019-03-29 16:30:26 -07:00
MLIR Team	58aa383e60	Support fusing producer loop nests which write to a memref which is live out, provided that the write region of the consumer loop nest to the same memref is a super set of the producer's write region. PiperOrigin-RevId: 234240958	2019-03-29 16:30:11 -07:00
MLIR Team	8f5f2c765d	LoopFusion: perform a series of loop interchanges to increase the loop depth at which slices of producer loop nests can be fused into constumer loop nests. ) Adds utility to LoopUtils to perform loop interchange of two AffineForOps. ) Adds utility to LoopUtils to sink a loop to a specified depth within a loop nest, using a series of loop interchanges. ) Computes dependences between all loads and stores in the loop nest, and classifies each loop as parallel or sequential. ) Computes loop interchange permutation required to sink sequential loops (and raise parallel loop nests) while preserving relative order among them. ) Checks each dependence against the permutation to make sure that dependences would not be violated by the loop interchange transformation. ) Calls loop interchange in LoopFusion pass on consumer loop nests before fusing in producers, sinking loops with loop carried dependences deeper into the consumer loop nest. *) Adds and updates related unit tests. PiperOrigin-RevId: 234158370	2019-03-29 16:29:26 -07:00
Alex Zinenko	d7aa700ccb	Dialect conversion: decouple function signature conversion from type conversion Function types are built-in in MLIR and affect the validity of the IR itself. However, advanced target dialects such as the LLVM IR dialect may include custom function types. Until now, dialect conversion was expecting function types not to be converted to the custom type: although the signatures was allowed to change, the outer type must have been an mlir::FunctionType. This effectively prevented dialect conversion from creating instructions that operate on values of the custom function type. Dissociate function signature conversion from general type conversion. Function signature conversion must still produce an mlir::FunctionType and is used in places where built-in types are required to make IR valid. General type conversion is used for SSA values, including function and block arguments and function results. Exercise this behavior in the LLVM IR dialect conversion by converting function types to LLVM IR function pointer types. The pointer to a function is chosen to provide consistent lowering of higher-order functions: while it is possible to have a value of function type, it is not possible to create a function type accepting a returning another function type. PiperOrigin-RevId: 234124494	2019-03-29 16:28:41 -07:00
Uday Bondhugula	6b7a49dd6a	Add -tile-sizes command line option for loop tiling; clean up cl options for for dma-generate, loop-unroll. - add -tile-sizes command line option for loop tiling to specify different tile sizes for loops in a band - clean up command line options for loop-unroll, dma-generate (remove cl::hidden) PiperOrigin-RevId: 234006232	2019-03-29 16:28:10 -07:00
Uday Bondhugula	00860662a2	Generate dealloc's for alloc's of pipeline-data-transfer - for the DMA transfers being pipelined through double buffering, generate deallocs for the double buffers being alloc'ed This change is along the lines of cl/233502632. We initially wanted to experiment with scoped allocation - so the deallocation's were usually not necessary; however, they are needed even with scoped allocations in some situations - for eg. when the enclosing loop gets unrolled. The dealloc serves as an end of lifetime marker. PiperOrigin-RevId: 233653463	2019-03-29 16:25:53 -07:00
River Riddle	4755774d16	Make IndexType a standard type instead of a builtin. This also cleans up some unnecessary factory methods on the Type class. PiperOrigin-RevId: 233640730	2019-03-29 16:25:38 -07:00
Alex Zinenko	0e59e5c49b	EDSC: move Expr and Stmt construction operators to a namespace In the current state, edsc::Expr and edsc::Stmt overload operators to construct other Exprs and Stmts. This includes some unconventional overloads of the `operator==` to create a comparison expression and of the `operator!` to create a negation expression. This situation could lead to unpleasant surprises where the code does not behave like expected. Make all Expr and Stmt construction operators free functions and move them to the `edsc::op` namespace. Callers willing to use these operators must explicitly include them with the `using` declaration. This can be done in some local scope. Additionally, we currently emit signed comparisons for order-comparison operators. With namespaces, we can later introduce two sets of operators in different namespace, e.g. `edsc::op::sign` and `edsc::op::unsign` to clearly state which kind of comparison is implied. PiperOrigin-RevId: 233578674	2019-03-29 16:25:08 -07:00
Uday Bondhugula	8b3f841daf	Generate dealloc's for the alloc's of dma-generate. - for the DMA buffers being allocated (and their tags), generate corresponding deallocs - minor related update to replaceAllMemRefUsesWith and PipelineDataTransfer pass Code generation for DMA transfers was being done with the initial simplifying assumption that the alloc's would map to scoped allocations, and so no deallocations would be necessary. Drop this assumption to generalize. Note that even with scoped allocations, unrolling loops that have scoped allocations could create a series of allocations and exhaustion of fast memory. Having a end of lifetime marker like a dealloc in fact allows creating new scopes if necessary when lowering to a backend and still utilize scoped allocation. DMA buffers created by -dma-generate are guaranteed to have either non-overlapping lifetimes or nested lifetimes. PiperOrigin-RevId: 233502632	2019-03-29 16:24:08 -07:00
River Riddle	366ebcf6aa	Remove the restriction that only registered terminator operations may terminate a block and have block operands. This allows for any operation to hold block operands. It also introduces the notion that unregistered operations may terminate a block. As such, the 'isTerminator' api on Instruction has been split into 'isKnownTerminator' and 'isKnownNonTerminator'. PiperOrigin-RevId: 233076831	2019-03-29 16:22:23 -07:00
Uday Bondhugula	c419accea3	Automated rollback of changelist 232728977. PiperOrigin-RevId: 232944889	2019-03-29 16:21:38 -07:00
River Riddle	a886625813	Modify the canonicalizations of select and muli to use the fold hook. This also extends the greedy pattern rewrite driver to add the operands of folded operations back to the worklist. PiperOrigin-RevId: 232878959	2019-03-29 16:20:06 -07:00
Uday Bondhugula	4ba8c9147d	Automated rollback of changelist 232717775. PiperOrigin-RevId: 232807986	2019-03-29 16:19:33 -07:00
River Riddle	99fee0b181	When canonicalizing only erase the operation after calling the 'fold' hook if replacement results were supplied. This fixes a bug where the operation would always get erased, even if it was modified in place. PiperOrigin-RevId: 232757964	2019-03-29 16:19:17 -07:00
River Riddle	fd2d7c857b	Rename the 'if' operation in the AffineOps dialect to 'affine.if' and namespace the AffineOps dialect with 'affine'. PiperOrigin-RevId: 232728977	2019-03-29 16:18:59 -07:00
River Riddle	90d10b4e00	NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for'. The is the second step to adding a namespace to the AffineOps dialect. PiperOrigin-RevId: 232717775	2019-03-29 16:17:59 -07:00
River Riddle	3227dee15d	NFC: Rename affine_apply to affine.apply. This is the first step to adding a namespace to the affine dialect. PiperOrigin-RevId: 232707862	2019-03-29 16:17:29 -07:00
MLIR Team	b9dde91ea6	Adds the ability to compute the MemRefRegion of a sliced loop nest. Utilizes this feature during loop fusion cost computation, to compute what the write region of a fusion candidate loop nest slice would be (without having to materialize the slice or change the IR). ) Adds parameter to public API of MemRefRegion::compute for passing in the slice loop bounds to compute the memref region of the loop nest slice. ) Exposes public method MemRefRegion::getRegionSize for computing the size of the memref region in bytes. PiperOrigin-RevId: 232706165	2019-03-29 16:17:15 -07:00
River Riddle	0c65cf283c	Move the AffineFor loop bound folding to a canonicalization pattern on the AffineForOp. PiperOrigin-RevId: 232610715	2019-03-29 16:16:11 -07:00
River Riddle	10237de8eb	Refactor the affine analysis by moving some functionality to IR and some to AffineOps. This is important for allowing the affine dialect to define canonicalizations directly on the operations instead of relying on transformation passes, e.g. ComposeAffineMaps. A summary of the refactoring: * AffineStructures has moved to IR. * simplifyAffineExpr/simplifyAffineMap/getFlattenedAffineExpr have moved to IR. * makeComposedAffineApply/fullyComposeAffineMapAndOperands have moved to AffineOps. * ComposeAffineMaps is replaced by AffineApplyOp::canonicalize and deleted. PiperOrigin-RevId: 232586468	2019-03-29 16:15:41 -07:00
MLIR Team	a78edcda5b	Loop fusion improvements: ) After a private memref buffer is created for a fused loop nest, dependences on the old memref are reduced, which can open up fusion opportunities. In these cases, users of the old memref are added back to the worklist to be reconsidered for fusion. ) Fixed a bug in fusion insertion point dependence check where the memref being privatized was being skipped from the check. PiperOrigin-RevId: 232477853	2019-03-29 16:13:50 -07:00
Uday Bondhugula	ed27b40085	Remove stray debug output - NFC PiperOrigin-RevId: 232390076	2019-03-29 16:13:17 -07:00
River Riddle	bf9c381d1d	Remove InstWalker and move all instruction walking to the api facilities on Function/Block/Instruction. PiperOrigin-RevId: 232388113	2019-03-29 16:12:59 -07:00
River Riddle	c9ad4621ce	NFC: Move AffineApplyOp to the AffineOps dialect. This also moves the isValidDim/isValidSymbol methods from Value to the AffineOps dialect. PiperOrigin-RevId: 232386632	2019-03-29 16:12:40 -07:00
Uday Bondhugula	0f50414fa4	Refactor common code getting memref access in getMemRefRegion - NFC - use getAccessMap() instead of repeating it - fold getMemRefRegion into MemRefRegion ctor (more natural, avoid heap allocation and unique_ptr where possible) - change extractForInductionVars - MutableArrayRef -> ArrayRef for the arguments. Since the method is just returning copies of 'Value *', the client can't mutate the pointers themselves; it's fine to mutate the 'Value''s themselves, but that doesn't mutate the pointers to those. - change the way extractForInductionVars returns (see b/123437690) PiperOrigin-RevId: 232359277	2019-03-29 16:12:25 -07:00
River Riddle	b499277fb6	Remove remaining usages of OperationInst in lib/Transforms. PiperOrigin-RevId: 232323671	2019-03-29 16:10:53 -07:00
River Riddle	a3d9ccaecb	Replace the walkOps/visitOperationInst variants from the InstWalkers with the Instruction variants. PiperOrigin-RevId: 232322030	2019-03-29 16:10:24 -07:00
Uday Bondhugula	b26900dce5	Update dma-generate pass to (1) work on blocks of instructions (instead of just loops), (2) take into account fast memory space capacity and lower 'dmaDepth' to fit, (3) add location information for debug info / errors - change dma-generate pass to work on blocks of instructions (start/end iterators) instead of 'for' loops; complete TODOs - allows DMA generation for straightline blocks of operation instructions interspersed b/w loops - take into account fast memory capacity: check whether memory footprint fits in fastMemoryCapacity parameter, and recurse/lower the depth at which DMA generation is performed until it does fit in the provided memory - add location information to MemRefRegion; any insufficient fast memory capacity errors or debug info w.r.t dma generation shows location information - allow DMA generation pass to be instantiated with a fast memory capacity option (besides command line flag) - change getMemRefRegion to return unique_ptr's - change getMemRefFootprintBytes to work on a 'Block' instead of 'ForInst' - other helper methods; add postDomInstFilter option for replaceAllMemRefUsesWith; drop forInst->walkOps, add Block::walkOps methods Eg. output $ mlir-opt -dma-generate -dma-fast-mem-capacity=1 /tmp/single.mlir /tmp/single.mlir:9:13: error: Total size of all DMA buffers' for this block exceeds fast memory capacity for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) { ^ $ mlir-opt -debug-only=dma-generate -dma-generate -dma-fast-mem-capacity=400 /tmp/single.mlir /tmp/single.mlir:9:13: note: 8 KiB of DMA buffers in fast memory space for this block for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) { PiperOrigin-RevId: 232297044	2019-03-29 16:09:52 -07:00
River Riddle	de2d0dfbca	Fold the functionality of OperationInst into Instruction. OperationInst still exists as a forward declaration and will be removed incrementally in a set of followup cleanup patches. PiperOrigin-RevId: 232198540	2019-03-29 16:09:19 -07:00
River Riddle	126ec14e2d	Fix the handling of the resizable operands bit of OperationState in a few places. PiperOrigin-RevId: 232163738	2019-03-29 16:08:28 -07:00
Uday Bondhugula	8be2627436	Promote local buffers created post fusion to higher memory space - fusion already includes the necessary analysis to create small/local buffers post fusion; allocate these buffers in a higher memory space if the necessary pass parameters are provided (threshold size, memory space id) - although there will be a separate utility at some point to directly detect and promote small local buffers to higher memory spaces, doing it while fusion when possible is much less expensive, comes free with fusion analysis, and covers a key common case. PiperOrigin-RevId: 232063894	2019-03-29 16:07:23 -07:00
River Riddle	5052bd8582	Define the AffineForOp and replace ForInst with it. This patch is largely mechanical, i.e. changing usages of ForInst to OpPointer<AffineForOp>. An important difference is that upon construction an AffineForOp no longer automatically creates the body and induction variable. To generate the body/iv, 'createBody' can be called on an AffineForOp with no body. PiperOrigin-RevId: 232060516	2019-03-29 16:06:49 -07:00
Nicolas Vasilache	0353ef99eb	Cleanup EDSCs and start a functional auto-generated library of custom Ops This CL applies the following simplifications to EDSCs: 1. Rename Block to StmtList because an MLIR Block is a different, not yet supported, notion; 2. Rework Bindable to drop specific storage and just use it as a simple wrapper around Expr. The only value of Bindable is to force a static cast when used by the user to bind into the emitter. For all intended purposes, Bindable is just a lightweight check that an Expr is Unbound. This simplifies usage and reduces the API footprint. After playing with it for some time, it wasn't worth the API cognition overhead; 3. Replace makeExprs and makeBindables by makeNewExprs and copyExprs which is more explicit and less easy to misuse; 4. Add generally useful functionality to MLIREmitter: a. expose zero and one for the ubiquitous common lower bounds and step; b. add support to create already bound Exprs for all function arguments as well as shapes and views for Exprs bound to memrefs. 5. Delete Stmt::operator= and replace by a `Stmt::set` method which is more explicit. 6. Make Stmt::operator Expr() explicit. 7. Indexed.indices assertions are removed to pave the way for expressing slices and views as well as to work with 0-D memrefs. The CL plugs those simplifications with TableGen and allows emitting a full MLIR function for pointwise add. This "x.add" op is both type and rank-agnostic (by allowing ArrayRef of Expr passed to For loops) and opens the door to spinning up a composable library of existing and custom ops that should automate a lot of the tedious work in TF/XLA -> MLIR. Testing needs to be significantly improved but can be done in a separate CL. PiperOrigin-RevId: 231982325	2019-03-29 16:05:23 -07:00
River Riddle	9f22a2391b	Define an detail::OperandStorage class to handle managing instruction operands. This class stores operands in a similar way to SmallVector except for two key differences. The first is the inline storage, which is a trailing objects array. The second is that being able to dynamically resize the operand list is optional. This means that we can enable the cases where operations need to change the number of operands after construction without losing the spatial locality benefits of the common case (operation instructions / non-control flow instructions with a lifetime fixed number of operands). PiperOrigin-RevId: 231910497	2019-03-29 16:05:08 -07:00
Nicolas Vasilache	d4921f4a96	Address Performance issue in NestedMatcher A performance issue was reported due to the usage of NestedMatcher in ComposeAffineMaps. The main culprit was the ubiquitous copies that were occuring when appending even a single element in `matchOne`. This CL generally simplifies the implementation and removes one level of indirection by getting rid of auxiliary storage as well as simplifying the API. The users of the API are updated accordingly. The implementation was tested on a heavily unrolled example with ComposeAffineMaps and is now close in performance with an implementation based on stateless InstWalker. As a reminder, the whole ComposeAffineMaps pass is slated to disappear but the bug report was very useful as a stress test for NestedMatchers. Lastly, the following cleanups reported by @aminim were addressed: 1. make NestedPatternContext scoped within runFunction rather than at the Pass level. This was caused by a previous misunderstanding of Pass lifetime; 2. use defensive assertions in the constructor of NestedPatternContext to make it clear a unique such locally scoped context is allowed to exist. PiperOrigin-RevId: 231781279	2019-03-29 16:04:07 -07:00
MLIR Team	1e85191d07	Fix ASAN issue: snapshot edge list before loop which can modify this list. PiperOrigin-RevId: 231686040	2019-03-29 16:03:38 -07:00
MLIR Team	d7c824451f	LoopFusion: insert the source loop nest slice at a depth in the destination loop nest which preserves dependences (above any loop carried or other dependences). This is accomplished by updating the maximum destination loop depth based on dependence checks between source loop nest loads and stores which access the memref on which the source loop nest has a store op. In addition, prevent fusing in source loop nests which write to memrefs which escape or are live out. PiperOrigin-RevId: 231684492	2019-03-29 16:03:23 -07:00
Uday Bondhugula	44064d5b3b	3000x speed improvement on compose-affine-maps by dropping NestedMatcher for a trivial inst walker :-) (reduces pass time from several minutes non-terminating to 120ms) - (fixes b/123541184) - use a simple 7-line inst walker to collect affine_apply op's instead of the nested matcher; -compose-affine-maps pass runs in 120ms now instead of 5 minutes + (non- terminating / out of memory) - on a realistic test case that is 20,000 lines 12-d loop nest - this CL is also pushing for simple existing/standard patterns unless there is a real efficiency issue (OTOH, fixing nested matcher to address this issue requires cl/231400521) - the improvement is from swapping out the nested walker as opposed to from a bug or anything else that this CL changes - update stale comment PiperOrigin-RevId: 231623619	2019-03-29 16:02:53 -07:00
River Riddle	b6928c945c	Standardize the spelling of debug info to "debuginfo" in opt flags. PiperOrigin-RevId: 231610337	2019-03-29 16:02:38 -07:00
Uday Bondhugula	c0e9e5eb07	Fix getFullMemRefAsRegion() and FlatAffineConstraints::reset PiperOrigin-RevId: 231426734	2019-03-29 16:00:39 -07:00
MLIR Team	a0f3db4024	Support fusing loop nests which require insertion into a new instruction Block position while preserving dependences, opening up additional fusion opportunities. - Adds SSA Value edges to the data dependence graph used in the loop fusion pass. PiperOrigin-RevId: 231417649	2019-03-29 16:00:04 -07:00
River Riddle	755538328b	Recommit: Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst. PiperOrigin-RevId: 231342063	2019-03-29 15:59:30 -07:00
Nicolas Vasilache	ae772b7965	Automated rollback of changelist 231318632. PiperOrigin-RevId: 231327161	2019-03-29 15:42:38 -07:00
River Riddle	5ecef2b3f6	Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst. PiperOrigin-RevId: 231318632	2019-03-29 15:42:08 -07:00
Nicolas Vasilache	1a5287d594	Replace too obscure usage of functional::map by declare + reserve + loop. Cleanup a usage of functional::map that is deemed too obscure in `reindexAffineIndices`. Also fix a stale comment in `reindexAffineIndices`. PiperOrigin-RevId: 231211184	2019-03-29 15:41:08 -07:00
Chris Lattner	b42bea215a	Change AffineApplyOp to produce a single result, simplifying the code that works with it, and updating the g3docs. PiperOrigin-RevId: 231120927	2019-03-29 15:40:38 -07:00
River Riddle	36babbd781	Change the ForInst induction variable to be a block argument of the body instead of the ForInst itself. This is a necessary step in converting ForInst into an operation. PiperOrigin-RevId: 231064139	2019-03-29 15:40:23 -07:00
Nicolas Vasilache	0e7a8a9027	Drop AffineMap::Null and IntegerSet::Null Addresses b/122486036 This CL addresses some leftover crumbs in AffineMap and IntegerSet by removing the Null method and cleaning up the constructors. As the ::Null uses were tracked down, opportunities appeared to untangle some of the Parsing logic and make it explicit where AffineMap/IntegerSet have ambiguous syntax. Previously, ambiguous cases were hidden behind the implicit pointer values of AffineMap* and IntegerSet* that were passed as function parameters. Depending the values of those pointers one of 3 behaviors could occur. This parsing logic convolution is one of the rare cases where I would advocate for code duplication. The more proper fix would be to make the syntax unambiguous or to allow some lookahead. PiperOrigin-RevId: 231058512	2019-03-29 15:40:08 -07:00
Nicolas Vasilache	81c7f2e2f3	Cleanup resource management and rename recursive matchers This CL follows up on a memory leak issue related to SmallVector growth that escapes the BumpPtrAllocator. The fix is to properly use ArrayRef and placement new to define away the issue. The following renaming is also applied: 1. MLFunctionMatcher -> NestedPattern 2. MLFunctionMatches -> NestedMatch As a consequence all allocations are now guaranteed to live on the BumpPtrAllocator. PiperOrigin-RevId: 231047766	2019-03-29 15:39:53 -07:00
River Riddle	75c21e1de0	Wrap cl::opt flags within passes in a category with the pass name. This improves the help output of tools like mlir-opt. Example: dma-generate options: -dma-fast-mem-capacity - Set fast memory space ... -dma-fast-mem-space=<uint> - Set fast memory space ... loop-fusion options: -fusion-compute-tolerance=<number> - Fractional increase in ... -fusion-maximal - Enables maximal loop fusion loop-tile options: -tile-size=<uint> - Use this tile size for ... loop-unroll options: -unroll-factor=<uint> - Use this unroll factor ... -unroll-full - Fully unroll loops -unroll-full-threshold=<uint> - Unroll all loops with ... -unroll-num-reps=<uint> - Unroll innermost loops ... loop-unroll-jam options: -unroll-jam-factor=<uint> - Use this unroll jam factor ... PiperOrigin-RevId: 231019363	2019-03-29 15:39:38 -07:00
Uday Bondhugula	b4a1443508	Update replaceAllMemRefUsesWith to generate single result affine_apply's for index remapping - generate a sequence of single result affine_apply's for the index remapping (instead of one multi result affine_apply) - update dma-generate and loop-fusion test cases; while on this, change test cases to use single result affine apply ops - some fusion comment fix/cleanup PiperOrigin-RevId: 230985830	2019-03-29 15:38:23 -07:00
Uday Bondhugula	b588d58c5f	Update createAffineComputationSlice to generate single result affine maps - Update createAffineComputationSlice to generate a sequence of single result affine apply ops instead of one multi-result affine apply - update pipeline-data-transfer test case; while on this, also update the test case to use only single result affine maps, and make it more robust to change. PiperOrigin-RevId: 230965478	2019-03-29 15:37:53 -07:00
River Riddle	c3424c3c75	Allow operations to hold a blocklist and add support for parsing/printing a block list for verbose printing. PiperOrigin-RevId: 230951462	2019-03-29 15:37:37 -07:00
Alex Zinenko	6d37a255e2	Generic dialect conversion pass exercised by LLVM IR lowering This commit introduces a generic dialect conversion/lowering/legalization pass and illustrates it on StandardOps->LLVMIR conversion. It partially reuses the PatternRewriter infrastructure and adds the following functionality: - an actual pass; - non-default pattern constructors; - one-to-many rewrites; - rewriting terminators with successors; - not applying patterns iteratively (unlike the existing greedy rewrite driver); - ability to change function signature; - ability to change basic block argument types. The latter two things required, given the existing API, to create new functions in the same module. Eventually, this should converge with the rest of PatternRewriter. However, we may want to keep two pass versions: "heavy" with function/block argument conversion and "light" that only touches operations. This pass creates new functions within a module as a means to change function signature, then creates new blocks with converted argument types in the new function. Then, it traverses the CFG in DFS-preorder to make sure defs are converted before uses in the dominated blocks. The generic pass has a minimal interface with two hooks: one to fill in the set of patterns, and another one to convert types for functions and blocks. The patterns are defined as separate classes that can be table-generated in the future. The LLVM IR lowering pass partially inherits from the existing LLVM IR translator, in particular for type conversion. It defines a conversion pattern template, instantiated for different operations, and is a good candidate for tablegen. The lowering does not yet support loads and stores and is not connected to the translator as it would have broken the existing flows. Future patches will add missing support before switching the translator in a single patch. PiperOrigin-RevId: 230951202	2019-03-29 15:37:23 -07:00
Uday Bondhugula	95f19d558c	Fix return value logic / error reporting in -dma-generate PiperOrigin-RevId: 230906158	2019-03-29 15:36:23 -07:00
MLIR Team	5c5739d42b	Change the dependence check in the loop fusion pass to use the MLIR instruction list ordering (instead of the dependence graph node id ordering). This breaks the overloading of dependence graph node ids as both edge endpoints and instruction list position. PiperOrigin-RevId: 230849232	2019-03-29 15:35:53 -07:00
Uday Bondhugula	f94b15c247	Update dma-generate: update for multiple load/store op's per memref - introduce a way to compute union using symbolic rectangular bounding boxes - handle multiple load/store op's to the same memref by taking a union of the regions - command-line argument to provide capacity of the fast memory space - minor change to replaceAllMemRefUsesWith to not generate affine_apply if the supplied index remap was identity PiperOrigin-RevId: 230848185	2019-03-29 15:35:38 -07:00
Uday Bondhugula	06d21d9f64	loop-fusion: debug info cleanup PiperOrigin-RevId: 230817383	2019-03-29 15:35:08 -07:00
Chris Lattner	934b6d125f	Introduce a new operation hook point for implementing simple local canonicalizations of operations. The ultimate important user of this is going to be a funcBuilder->foldOrCreate<YourOp>(...) API, but for now it is just a more convenient way to write certain classes of canonicalizations (see the change in StandardOps.cpp). NFC. PiperOrigin-RevId: 230770021	2019-03-29 15:34:35 -07:00
River Riddle	451869f394	Add cloning functionality to Block and Function, this also adds support for remapping successor block operands of terminator operations. We define a new BlockAndValueMapping class to simplify mapping between cloned values. PiperOrigin-RevId: 230768759	2019-03-29 15:34:20 -07:00
Uday Bondhugula	72e5c7f428	Minor updates + cleanup to dma-generate - switch some debug info to emitError - use a single constant op for zero index to make it easier to write/update test cases; avoid creating new constant op's for common zero index cases - test case cleanup This is in preparation for an upcoming major update to this pass. PiperOrigin-RevId: 230728379	2019-03-29 15:34:06 -07:00
River Riddle	f319bbbd28	Add a function pass to strip debug info from functions and instructions. PiperOrigin-RevId: 230654315	2019-03-29 15:33:50 -07:00
River Riddle	6859f33292	Migrate VectorOrTensorType/MemRefType shape api to use int64_t instead of int. PiperOrigin-RevId: 230605756	2019-03-29 15:33:20 -07:00
MLIR Team	b28009b681	Fix single producer check in loop fusion pass. PiperOrigin-RevId: 230565482	2019-03-29 15:32:20 -07:00
Uday Bondhugula	864d9e02a1	Update fusion cost model + some additional infrastructure and debug information for -loop-fusion - update fusion cost model to fuse while tolerating a certain amount of redundant computation; add cl option -fusion-compute-tolerance evaluate memory footprint and intermediate memory reduction - emit debug info from -loop-fusion showing what was fused and why - introduce function to compute memory footprint for a loop nest - getMemRefRegion readability update - NFC PiperOrigin-RevId: 230541857	2019-03-29 15:32:06 -07:00
Uday Bondhugula	92e9d9484c	loop unroll update: unroll factor one for a single iteration loop - unrolling a single iteration loop by a factor of one should promote its body into its parent; this makes it consistent with the behavior/expectation that unrolling a loop by a factor equal to its trip count makes the loop go away. PiperOrigin-RevId: 230426499	2019-03-29 15:31:35 -07:00
Uday Bondhugula	1b735dfe27	Refactor -dma-generate walker - NFC - ForInst::walkOps will also be used in an upcoming CL (cl/229438679); better to have this instead of deriving from the InstWalker PiperOrigin-RevId: 230413820	2019-03-29 15:31:03 -07:00
Uday Bondhugula	94a03f864f	Allocate private/local buffers for slices accurately during fusion - the size of the private memref created for the slice should be based on the memref region accessed at the depth at which the slice is being materialized, i.e., symbolic in the outer IVs up until that depth, as opposed to the region accessed based on the entire domain. - leads to a significant contraction of the temporary / intermediate memref whenever the memref isn't reduced to a single scalar (through store fwd'ing). Other changes - update to promoteIfSingleIteration - avoid introducing unnecessary identity map affine_apply from IV; makes it much easier to write and read test cases and pass output for all passes that use promoteIfSingleIteration; loop-fusion test cases become much simpler - fix replaceAllMemrefUsesWith bug that was exposed by the above update - 'domInstFilter' could be one of the ops erased due to a memref replacement in it. - fix getConstantBoundOnDimSize bug: a division by the coefficient of the identifier was missing (the latter need not always be 1); add lbFloorDivisors output argument - rename getBoundingConstantSizeAndShape -> getConstantBoundingSizeAndShape PiperOrigin-RevId: 230405218	2019-03-29 15:30:31 -07:00
MLIR Team	71495d58a7	Handle escaping memrefs in loop fusion pass: ) Do not remove loop nests which write to memrefs which escape the function. ) Do not remove memrefs which escape the function (e.g. are used in the return instruction). PiperOrigin-RevId: 230398630	2019-03-29 15:30:14 -07:00
Nicolas Vasilache	9f3f39d61a	Cleanup EDSCs This CL performs a bunch of cleanups related to EDSCs that are generally useful in the context of using them with a simple wrapping C API (not in this CL) and with simple language bindings to Python and Swift. PiperOrigin-RevId: 230066505	2019-03-29 15:27:58 -07:00

... 7 8 9 10 11 ...

1063 Commits