222 Commits

Author SHA1 Message Date
harsh-nod e33f301ec2 [mlir] Add support for moving reductions to outermost dimensions in vector.multi_reduction
The approach for handling reductions in the outermost
dimension follows that for innermost dimensions, outlined
below:

- First, transpose to move the reduction dims, if needed.
- Convert the reduction from n-d to 2-d canonical form.
- Then, for outer reductions, emit the appropriate op
  (add/mul/min/max/or/and/xor) and combine the results.
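
As a rough illustration of the final step only (a sketch, not the actual MLIR pattern): once the input is in 2-d canonical form laid out as [reduction, parallel], an outer reduction can combine whole rows lane by lane. The names `COMBINERS` and `reduce_outer_2d` are hypothetical and exist only for this example.

```python
import operator
from functools import reduce

# Hypothetical table mirroring the combining kinds named in the commit message.
COMBINERS = {
    "add": operator.add,
    "mul": operator.mul,
    "min": min,
    "max": max,
    "or": operator.or_,
    "and": operator.and_,
    "xor": operator.xor,
}


def reduce_outer_2d(rows, kind):
    """Reduce a 2-d [reduction, parallel] input along its outer dimension."""
    combine = COMBINERS[kind]
    # Each step combines two whole rows lane by lane, like one elementwise vector op.
    return reduce(lambda acc, row: [combine(a, b) for a, b in zip(acc, row)], rows)


print(reduce_outer_2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], "add"))  # [12, 15, 18]
```

No per-row horizontal reduction is needed in this layout, because every combine step already operates on a full row.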

Differential Revision: https://reviews.llvm.org/D107675
2021-08-13 12:59:50 -07:00
Stephen Neuendorffer 432341d8a8 [mlir] Handle cases where transfer_read should turn into a scalar load
The existing vector transforms reduce the dimension of transfer_read
ops.  However, beyond a certain point, the vector op actually has
to be reduced to a scalar load, since we can't load a zero-dimension
vector.  This handles this case.

Note that in the longer term, it may be preferable to support
zero-dimension vectors. See
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097.

Differential Revision: https://reviews.llvm.org/D103432
2021-08-03 22:53:40 -07:00
Nicolas Vasilache 14c1450d5c [mlir][Vector] Add vector to outerproduct lowering for the [reduction, parallel] case.
Differential Revision: https://reviews.llvm.org/D105373
2021-07-30 14:32:57 +00:00
Benjamin Kramer 8c63c24dca [mlir] Fix typo s/applyPermuationMap/applyPermutationMap/ 2021-07-27 12:18:54 +02:00
thomasraoux 73a9d6d0e2 [mlir][linalg] Fix bug in contraction op vectorization with output perm
When the output indexing map has a permutation, we need to account for it in
the contraction vector type.

Differential Revision: https://reviews.llvm.org/D106469
2021-07-23 08:39:43 -07:00
Matthias Springer d1a9e9a7cb [mlir][vector] Remove vector.transfer_read/write to LLVM lowering
This simplifies the vector to LLVM lowering. Previously, both vector.load/store and vector.transfer_read/write lowered directly to LLVM. With this commit, there is a single path to LLVM vector load/store instructions and vector.transfer_read/write ops must first be lowered to vector.load/store ops.

* Remove vector.transfer_read/write to LLVM lowering.
* Allow non-unit memref strides on all but the most minor dimension for vector.load/store ops.
* Add maxTransferRank option to populateVectorTransferLoweringPatterns.
* vector.transfer_reads with changing element type can no longer be lowered to LLVM. (This functionality is needed only for SPIRV.)

Differential Revision: https://reviews.llvm.org/D106118
2021-07-17 14:07:27 +09:00
Matthias Springer 4a3defa629 [mlir][vector] Refactor TransferReadToVectorLoadLowering
* TransferReadToVectorLoadLowering no longer generates memref.load ops.
* Add new pattern VectorLoadToMemrefLoadLowering that lowers scalar vector.loads to memref.loads.
* Add vector::BroadcastOp canonicalization pattern that folds broadcast chains.

Differential Revision: https://reviews.llvm.org/D106117
2021-07-17 13:53:09 +09:00
thomasraoux 6296e10972 [mlir][Vector] Remove Vector TupleOp as it is unused
TupleOp is not used anymore after recent refactoring.

Differential Revision: https://reviews.llvm.org/D105924
2021-07-13 12:39:12 -07:00
thomasraoux 291025389c [mlir][vector] Refactor Vector Unrolling and remove Tuple ops
Simplify the vector unrolling pattern to be more aligned with the rest of the
patterns and closer to vector distribution.
The new implementation uses ExtractStridedSlice/InsertStridedSlice
instead of the Tuple ops. After this change the ops based on Tuple no
longer have any uses, so they can be removed.
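
To make the slice-based unrolling idea concrete, here is a minimal sketch on 2-d nested lists. It is not the MLIR implementation: the helper names are hypothetical stand-ins for the ExtractStridedSlice/InsertStridedSlice ops, strides are assumed to be 1, and the vector shape is assumed to be a multiple of the tile shape.

```python
def extract_strided_slice(vec, offsets, sizes):
    """Return the tile of `vec` starting at `offsets` with shape `sizes` (unit strides)."""
    r0, c0 = offsets
    rows, cols = sizes
    return [row[c0:c0 + cols] for row in vec[r0:r0 + rows]]


def insert_strided_slice(tile, dest, offsets):
    """Return a copy of `dest` with `tile` written at `offsets` (unit strides)."""
    r0, c0 = offsets
    result = [row[:] for row in dest]
    for i, tile_row in enumerate(tile):
        result[r0 + i][c0:c0 + len(tile_row)] = tile_row
    return result


def unroll_elementwise(fn, lhs, rhs, tile_shape):
    """Apply `fn` elementwise, one tile at a time, and reassemble the result."""
    rows, cols = len(lhs), len(lhs[0])
    tile_rows, tile_cols = tile_shape
    result = [[0] * cols for _ in range(rows)]
    for r in range(0, rows, tile_rows):
        for c in range(0, cols, tile_cols):
            a = extract_strided_slice(lhs, (r, c), tile_shape)
            b = extract_strided_slice(rhs, (r, c), tile_shape)
            tile = [[fn(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
            result = insert_strided_slice(tile, result, (r, c))
    return result


lhs = [[1, 2, 3, 4], [5, 6, 7, 8]]
rhs = [[10, 10, 10, 10], [10, 10, 10, 10]]
print(unroll_elementwise(lambda x, y: x + y, lhs, rhs, (1, 2)))
# [[11, 12, 13, 14], [15, 16, 17, 18]]
```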

This allows removing a significant amount of dead code and will allow
extending the unrolling code going forward.

Differential Revision: https://reviews.llvm.org/D105381
2021-07-07 11:11:26 -07:00
Matthias Springer 2c115ecc41 [mlir][NFC] MemRef cleanup: Remove helper functions
Remove `getDynOperands` and `createOrFoldDimOp` from MemRef.h to decouple MemRef a bit from Tensor. These two functions are used in other dialects/transforms.

Differential Revision: https://reviews.llvm.org/D105260
2021-07-05 10:10:21 +09:00
Nicolas Vasilache cb5de7c813 [mlir][Vector] NFC - Compress vector to outerproduct lowering.
The implementation has become too unwieldy and cognitive overhead wins.
Instead compress the implementation in preparation for additional lowering paths.

This is a resubmit of https://reviews.llvm.org/D105359 without ordering ambiguities.

Differential Revision: https://reviews.llvm.org/D105367
2021-07-02 21:23:59 +00:00
Mehdi Amini 4525d52c73 Revert "[mlir][Vector] NFC - Compress vector to outerproduct lowering."
This reverts commit db188adfb1.

Breaks the GCC tests, likely because of some order of evaluation
difference between clang and gcc.
2021-07-02 17:55:06 +00:00
Nicolas Vasilache db188adfb1 [mlir][Vector] NFC - Compress vector to outerproduct lowering.
The implementation has become too unwieldy and cognitive overhead wins.
Instead compress the implementation in preparation for additional lowering paths.

Differential Revision: https://reviews.llvm.org/D105359
2021-07-02 16:41:51 +00:00
Matthias Springer c0a6318d96 [mlir][tensor] Add tensor.dim operation
* Split memref.dim into two operations: memref.dim and tensor.dim. Both ops have the same builder interface and op argument names, so that they can be used with templates in patterns that apply to both tensors and memrefs (e.g., some patterns in Linalg).
* Add constant materializer to TensorDialect (needed for folding in affine.apply etc.).
* Remove some MemRefDialect dependencies, make some explicit.

Differential Revision: https://reviews.llvm.org/D105165
2021-07-01 10:00:19 +09:00
thomasraoux 627733b5f0 [mlir][vector] Extend vector distribution to all elementwise and contract
Uses elementwise interface to generalize canonicalization pattern and add a new
pattern for vector.contract case.

Differential Revision: https://reviews.llvm.org/D104343
2021-06-30 16:22:31 -07:00
Stella Laurenzo 485cc55edf [mlir] Generate .cpp.inc files for dialects.
* Previously, we were only generating .h.inc files. We foresee the need to also generate implementations and this is a step towards that.
* Discussed in https://llvm.discourse.group/t/generating-cpp-inc-files-for-dialects/3732/2
* Deviates from the discussion above by generating a default constructor in the .cpp.inc file (and adding a tablegen bit that disables this in case it is user provided).
* Generating the destructor started as a way to flush out the missing includes (produces a link error), but it is a strict improvement on its own that is worth doing (i.e. by emitting key methods in the .cpp file, we root vtables in one translation unit, which is a non-controversial improvement).

Differential Revision: https://reviews.llvm.org/D105070
2021-06-29 20:10:30 +00:00
harsh-nod 0d6e4199e3 [mlir][vector] Order parallel indices before transposing the input in multireductions
The current code does not preserve the order of the parallel
dimensions when doing multi-reductions and thus we can end
up in scenarios where the result shape does not match the
desired shape after reduction.

This patch fixes that by ensuring that the parallel indices
are in order and then concatenates them to the reduction dimensions
so that the reduction dimensions are innermost.
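
The index ordering described here can be sketched as follows. This is an illustration under assumptions, not the patch itself: the function names are hypothetical, and sorting the reduction dims is a choice made for this example.

```python
def multi_reduction_permutation(rank, reduction_dims):
    """Build a transpose permutation: ordered parallel dims first, reduction dims last."""
    reduction = sorted(set(reduction_dims))  # assumption: reduction dims kept in order too
    parallel = [d for d in range(rank) if d not in reduction]  # already ascending
    return parallel + reduction


def transposed_shape(shape, permutation):
    """Shape of the input after applying `permutation`."""
    return [shape[d] for d in permutation]


perm = multi_reduction_permutation(4, [2, 0])
print(perm)                                  # [1, 3, 0, 2]
print(transposed_shape([2, 3, 4, 5], perm))  # [3, 5, 2, 4]
```

Because the parallel dims keep their original relative order, the leading part of the transposed shape (`[3, 5]` above) equals the source shape with the reduction dims dropped, which is the desired result shape.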

Differential Revision: https://reviews.llvm.org/D104884
2021-06-28 18:47:16 -07:00
Tobias Gysi 7cef24ee83 [mlir][linalg] Adapt the FillOp builder signature.
Change the build operand order from output, value to value, output. The patch makes the argument order consistent with the pretty printed order updated by https://reviews.llvm.org/D104356.

Differential Revision: https://reviews.llvm.org/D104359
2021-06-23 08:06:43 +00:00
Tobias Gysi a21a6f51bc [mlir][linalg] Change the pretty printed FillOp operand order.
The patch changes the pretty printed FillOp operand order from output, value to value, output. The change is a follow up to https://reviews.llvm.org/D104121 that passes the fill value using a scalar input instead of the former capture semantics.

Differential Revision: https://reviews.llvm.org/D104356
2021-06-23 07:03:00 +00:00
thomasraoux 1244bca53f [mlir][vector] Support distributing transfer op with permutation map
Differential Revision: https://reviews.llvm.org/D104263
2021-06-21 12:56:08 -07:00
Fangrui Song 558ee5843f [mlir] Fix -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build. NFC 2021-06-20 11:55:00 -07:00
Matthias Springer 2bc8ffa8af [mlir] Support permutation maps in vector transfer op folder
Fold away in_bounds attribute even if the transfer op has a non-identity permutation map.

Differential Revision: https://reviews.llvm.org/D103133
2021-05-31 17:22:46 +09:00
Nicolas Vasilache 6825bfe23e [mlir][Vector] NFC - Drop vector EDSC usage
Drop the vector dialect EDSC subdirectory and update all uses.
2021-05-19 12:44:38 +00:00
Matthias Springer fb7ec1f187 [mlir] Use VectorTransferPermutationMapLoweringPatterns in VectorToSCF
VectorTransferPermutationMapLoweringPatterns can be enabled via a pass option. These additional patterns lower permutation maps to minor identity maps with broadcasting, if possible, allowing for more efficient vector load/stores. The option is deactivated by default.

Differential Revision: https://reviews.llvm.org/D102593
2021-05-19 14:46:19 +09:00
Matthias Springer 2c9688d201 [mlir] Improve TransferOp verifier: broadcasts are in_bounds
Broadcast dimensions of vector transfer ops are always in-bounds. This is consistent with the fact that the starting position of a transfer is always in-bounds.

Differential Revision: https://reviews.llvm.org/D102566
2021-05-17 22:35:44 +09:00
Matthias Springer 7ddeffee55 [mlir] Lower permutation maps on TransferWriteOps
Add TransferWritePermutationLowering, which replaces permutation maps of TransferWriteOps with vector.transpose.

Differential Revision: https://reviews.llvm.org/D102548
2021-05-17 15:30:46 +09:00
Matthias Springer 6774e5a995 [mlir] Fix in_bounds attr handling in TransferReadPermutationLowering
The in_bounds attribute should also be transposed.

Differential Revision: https://reviews.llvm.org/D102572
2021-05-17 15:28:16 +09:00
Matthias Springer 60da33c2d4 [mlir] Support masks in TransferOpReduceRank and TransferReadPermutationLowering
These two patterns allow for more efficient codegen in VectorToSCF.

Differential Revision: https://reviews.llvm.org/D102222
2021-05-13 15:08:08 +09:00
Matthias Springer 864adf399e [mlir] Allow empty position in vector.insert and vector.extract
Such ops are no-ops and are folded to their respective `source`/`vector` operand.
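
A small model of this folding behavior on nested lists, for illustration only (the functions `extract` and `insert` below are hypothetical Python stand-ins, not the MLIR ops or their folders):

```python
def extract(vector, position):
    """Model of vector.extract: index into `vector` once per entry of `position`."""
    result = vector
    for index in position:
        result = result[index]
    return result


def insert(source, dest, position):
    """Model of vector.insert: return `dest` with `source` placed at `position`."""
    if not position:
        # Empty position: the whole destination is replaced, so the op is just `source`.
        return source
    updated = list(dest)
    updated[position[0]] = insert(source, dest[position[0]], position[1:])
    return updated


v = [[1, 2], [3, 4]]
print(extract(v, []) is v)     # True
print(insert([9, 9], v, []))   # [9, 9]
print(insert(7, v, [0, 1]))    # [[1, 7], [3, 4]]
```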

Differential Revision: https://reviews.llvm.org/D101879
2021-05-13 12:54:18 +09:00
Matthias Springer c52cbe63e4 [mlir] Fix masked vector transfer ops with broadcasts
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.
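
One way to picture the rule, as a sketch under assumptions: represent the permutation map results as a list with one entry per vector dimension, using a hypothetical `BROADCAST` marker for broadcast dimensions, and drop those dimensions to obtain the mask shape. Keeping the remaining dimensions in vector order is an assumption of this example.

```python
BROADCAST = None  # hypothetical marker for a broadcast result in the permutation map


def expected_mask_shape(vector_shape, map_results):
    """Mask shape: the vector shape with broadcast dimensions dropped."""
    if len(vector_shape) != len(map_results):
        raise ValueError("one map result is required per vector dimension")
    return [size for size, result in zip(vector_shape, map_results)
            if result is not BROADCAST]


# 2-D read where the first vector dimension is a broadcast: the mask is 1-D.
print(expected_mask_shape([4, 8], [BROADCAST, 1]))  # [8]
# No broadcasts: the mask has the same shape as the vector.
print(expected_mask_shape([4, 8], [0, 1]))          # [4, 8]
```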

This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.

Differential Revision: https://reviews.llvm.org/D101745
2021-05-13 12:46:03 +09:00
Matthias Springer 6555e53ab0 Revert "[mlir] Fix masked vector transfer ops with broadcasts"
This reverts commit c9087788f7.

Accidentally pushed old version of the commit.
2021-05-13 11:55:00 +09:00
Matthias Springer c9087788f7 [mlir] Fix masked vector transfer ops with broadcasts
Broadcast dimensions of a vector transfer op have no corresponding dimension in the mask vector. E.g., a 2-D TransferReadOp, where one dimension is a broadcast, can have a 1-D `mask` attribute.

This commit also adds a few additional transfer op integration tests for various combinations of broadcasts, masking, dim transposes, etc.

Differential Revision: https://reviews.llvm.org/D101745
2021-05-13 11:37:36 +09:00
Tres Popp 88a48999d2 Support VectorTransfer splitting on writes also.
VectorTransfer split previously only split read xfer ops. This adds
the same logic to write ops. The resulting code involves 2
conditionals for write ops while read ops only needed 1, but the created
ops are built upon the same patterns, so pattern matching/expectations
are all consistent other than with regard to the if/else ops.

Differential Revision: https://reviews.llvm.org/D102157
2021-05-11 10:33:27 +02:00
thomasraoux 6aaf06f929 [mlir][vector] Fix warning
The previous change caused another warning in some build configurations:
"default label in switch which covers all enumeration values"
2021-05-07 17:12:47 -07:00
thomasraoux b90b66bcbe [mlir] Missed clang-format 2021-05-07 13:57:34 -07:00
thomasraoux d0453a8933 [mlir][vector] Extend pattern to trim lead unit dimension to Splat Op
Differential Revision: https://reviews.llvm.org/D102091
2021-05-07 13:54:41 -07:00
thomasraoux a970e69d6b [mlir][vector] add pattern to cast away leading unit dim for elementwise op
Differential Revision: https://reviews.llvm.org/D102034
2021-05-07 07:54:09 -07:00
thomasraoux 71eb32d97e [mlir][vector] Fix typo 2021-05-06 10:12:31 -07:00
thomasraoux 933551eaeb [mlir][NFC] Fix warning in VectorTransforms.cpp 2021-05-06 08:11:42 -07:00
thomasraoux 0b303da6f8 [mlir][vector] add pattern to cast away lead unit dimension for broadcast op
Differential Revision: https://reviews.llvm.org/D101955
2021-05-06 08:02:17 -07:00
Sergei Grechanik d80b04ab00 [mlir][Affine][Vector] Support vectorizing reduction loops
This patch adds support for vectorizing loops with 'iter_args'
implementing known reductions along the vector dimension. Compared to
the non-vector-dimension case, two additional things are done during
vectorization of such loops:
- The resulting vector returned from the loop is reduced to a scalar
  using `vector.reduction`.
- In some cases a mask is applied to the vector yielded at the end of
  the loop to prevent garbage values from being written to the
  accumulator.

Vectorization of reduction loops is disabled by default. To enable it, a
map from loops to array of reduction descriptors should be explicitly passed to
`vectorizeAffineLoops`, or `vectorize-reductions=true` should be passed
to the SuperVectorize pass.

Current limitations:
- Loops with a non-unit step size are not supported.
- n-D vectorization with n > 1 is not supported.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D100694
2021-05-05 09:03:59 -07:00
Matthias Springer aa58281979 [mlir] Fix bug in TransferOpReduceRank when all dims are broadcasts
TransferReadOps that are a scalar read + broadcast are handled by TransferReadToVectorLoadLowering.

Differential Revision: https://reviews.llvm.org/D101808
2021-05-04 11:21:44 +09:00
thomasraoux 9621c1ef56 [mlir][linalg] Fix vectorization bug in vector transfer indexing map calculation
The current implementation had a bug as it was relying on the target vector
dimension sizes to calculate where to insert broadcast. If several dimensions
have the same size we may insert the broadcast on the wrong dimension. The
correct broadcast cannot be inferred from the type of the source and
destination vector.

Instead when we want to extend transfer ops we calculate an "inverse" map to the
projected permutation and insert broadcast in place of the projected dimensions.

Differential Revision: https://reviews.llvm.org/D101738
2021-05-03 12:16:38 -07:00
thomasraoux f44c76d6e9 [mlir][vector] Extend vector transfer unrolling to support permutations and broadcast
Differential Revision: https://reviews.llvm.org/D101637
2021-05-03 10:47:02 -07:00
thomasraoux 7417541fd8 [mlir][vector] Add canonicalization for extract/insert -> shapecast
Differential Revision: https://reviews.llvm.org/D101643
2021-05-03 10:41:15 -07:00
thomasraoux be8e2801a4 [mlir][vector][NFC] split TransposeOp lowering out of contractLowering
Move TransposeOp lowering into its own populate function, as in some cases
it is better to keep it during ContractOp lowering to better
canonicalize it rather than emitting scalar insert/extract.

Differential Revision: https://reviews.llvm.org/D101647
2021-05-03 10:23:45 -07:00
Ahmed Taei 499e89fc91 Add patterns to lower vector.multi_reduction into a sequence of vector.reduction
Three patterns are added to convert vector.multi_reduction into a
sequence of vector.reduction ops, as follows:

- Transpose the inputs so the innermost dimensions are always reduction.
- Reduce the rank of vector.multi_reduction to 2d with an innermost
reduction dim (to get the 2d canonical form).
- The 2D canonical form is converted into a sequence of vector.reduction.
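
The three steps can be modeled end to end on a flat row-major buffer. This is a minimal sketch assuming an `add` reduction; the function name is hypothetical and the code only mirrors the structure of the lowering, not the MLIR patterns themselves.

```python
from itertools import product
from math import prod


def lower_multi_reduction_sum(data, shape, reduction_dims):
    """Model the lowering for an add reduction on row-major `data` of the given `shape`."""
    rank = len(shape)
    parallel = [d for d in range(rank) if d not in reduction_dims]
    perm = parallel + sorted(reduction_dims)

    # Step 1: transpose so the reduction dims are innermost.
    strides = [prod(shape[d + 1:]) for d in range(rank)]
    new_shape = [shape[d] for d in perm]
    transposed = [
        data[sum(i * strides[d] for i, d in zip(index, perm))]
        for index in product(*(range(s) for s in new_shape))
    ]

    # Step 2: collapse to the 2-d canonical form [parallel, reduction].
    inner = prod(shape[d] for d in reduction_dims)
    rows = [transposed[r:r + inner] for r in range(0, len(transposed), inner)]

    # Step 3: one 1-d reduction per row (the unrolled sequence of reductions).
    return [sum(row) for row in rows]


print(lower_multi_reduction_sum([1, 2, 3, 4, 5, 6], [2, 3], [0]))  # [5, 7, 9]
print(lower_multi_reduction_sum([1, 2, 3, 4, 5, 6], [2, 3], [1]))  # [6, 15]
```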

Two things might be worth doing in a follow-up diff:

- An scf.for (maybe optionally) around vector.reduction instead of unrolling it.
- Break down the vector.reduction into a sequence of vector.reduction ops
(e.g. tree-based reduction) instead of relying on how downstream dialects
handle it.
  Note: this will require passing target-vector-length
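
For readers unfamiliar with the term, a tree-based reduction combines neighbors pairwise, level by level, instead of folding left to right. The sketch below shows only that general technique on scalars; `tree_reduce` is a hypothetical name and nothing here reflects an MLIR implementation, which the message lists as possible future work.

```python
import operator


def tree_reduce(values, combine):
    """Reduce `values` by combining neighbors pairwise until one value remains."""
    if not values:
        raise ValueError("cannot reduce an empty sequence")
    level = list(values)
    while len(level) > 1:
        # Combine elements (0, 1), (2, 3), ...; an odd trailing element carries over.
        next_level = [combine(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            next_level.append(level[-1])
        level = next_level
    return level[0]


print(tree_reduce([1, 2, 3, 4, 5], operator.add))  # 15
```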

Differential Revision: https://reviews.llvm.org/D101570
2021-04-30 10:52:21 -07:00
Nicolas Vasilache b6113db955 [mlir][Linalg] Generalize linalg vectorization
This revision adds support for vectorizing more general linalg operations with projected permutation maps.

This is achieved by eagerly broadcasting the intermediate vector to the common size
of the iteration domain of the linalg op. This allows a much more natural expression of
generalized vectorization but may introduce additional computations until all the
proper canonicalizations are implemented.

This generalization modifies the vector.transfer_read/write permutation logic and
exposes the fact that the logic employed in vector.contract was too ad-hoc.

As a consequence, changes occur in the permutation / transposition logic for contraction. In turn this prompts supporting more cases in the lowering of contract
to matrix intrinsics, which is required to make the corresponding tests pass.

Differential revision: https://reviews.llvm.org/D101165
2021-04-29 07:44:01 +00:00
Matthias Springer dd5324467d [mlir] Disallow broadcast dimensions on TransferWriteOp.
The current implementation allows for TransferWriteOps with broadcasts that do not make sense. E.g., a broadcast could write a vector into a single (scalar) memory location, which is effectively the same as writing only the last element of the vector.
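
The equivalence mentioned above can be seen in a toy model where every lane of the vector is written, in order, to the same memory location. The function name is hypothetical and the in-order write is an assumption of this sketch.

```python
def transfer_write_with_broadcast(memory, index, vector):
    """Model a write whose vector dimension is a broadcast: all lanes hit one location."""
    for value in vector:
        memory[index] = value  # each lane overwrites the previous one
    return memory


print(transfer_write_with_broadcast([0, 0, 0], 1, [7, 8, 9]))  # [0, 9, 0]
```

Only the last element survives, so the broadcast carries no extra meaning on the write side.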

Differential Revision: https://reviews.llvm.org/D100842
2021-04-21 07:43:45 +09:00
thomasraoux 3fc0fbefc8 [mlir][vector] Move transferOp on tensor opt to folder/canonicalization
Move the existing optimization for transfer op on tensor to folder and
canonicalization. This handles the write-after-write and read-after-write
cases and also adds the write-after-read case.

Differential Revision: https://reviews.llvm.org/D100597
2021-04-16 08:13:10 -07:00