This makes ignoring a result an explicit choice by the user and helps prevent accidental errors from dropped results. Marking LogicalResult as nodiscard was the intention from the beginning, but it got lost along the way.
Differential Revision: https://reviews.llvm.org/D95841
Add support to tile affine.for ops with parametric sizes (i.e., SSA
values). Currently supports hyper-rectangular loop nests with constant
lower bounds only. Move methods
- moveLoopBody(*)
- getTileableBands(*)
- checkTilingLegality(*)
- tilePerfectlyNested(*)
- constructTiledIndexSetHyperRect(*)
to allow reuse with the constant tile size API. Add a test pass,
-test-affine-parametric-tile, to test parametric tiling.
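The idea of tiling with a size that is only known at run time can be modelled outside MLIR. The following Python sketch is an illustration under stated assumptions, not the code the utility generates: the name `tile_parametric` is hypothetical, the loop has a constant lower bound and unit step, and the inter-tile loop is assumed to count tiles rather than step by the tile size.

```python
def ceildiv(a, b):
    """Ceiling division for positive b."""
    return -(-a // b)


def tile_parametric(lb, ub, tile_size):
    """Visit [lb, ub) tile by tile, with tile_size known only at run time.

    Hypothetical model: the inter-tile loop counts tiles (0, 1, 2, ...)
    instead of stepping by the tile size, and the intra-tile loop is
    clamped to ub so the last tile may be partial.
    """
    if tile_size <= 0:
        raise ValueError("tile size must be positive")
    visited = []
    num_tiles = ceildiv(ub - lb, tile_size) if ub > lb else 0
    for t in range(num_tiles):                  # inter-tile loop
        start = lb + t * tile_size
        stop = min(start + tile_size, ub)       # clamp the last tile
        for i in range(start, stop):            # intra-tile loop
            visited.append(i)
    return visited
```

Whatever tile size is passed in, the model visits every original iteration exactly once and in the original order.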
Differential Revision: https://reviews.llvm.org/D87353
This diff attempts to resolve the TODO in `getOpIndexSet` (formerly
known as `getInstIndexSet`), which states "Add support to handle IfInsts
surronding `op`".
Major changes in this diff:
1. Overload `getIndexSet`. The overloaded version considers both
`AffineForOp` and `AffineIfOp`.
2. The `getInstIndexSet` is updated accordingly: its name is changed to
`getOpIndexSet` and its implementation is based on a new API `getIVs`
instead of `getLoopIVs`.
3. Add `addAffineIfOpDomain` to `FlatAffineConstraints`, which extracts
new constraints from the integer set of an `AffineIfOp` and merges them
into the current constraint system.
4. Update how a `Value` is classified as a dim or a symbol for
`ValuePositionMap` in `buildDimAndSymbolPositionMaps`.
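The effect of item 3 can be pictured with a small Python model of a constraint system. This is only a sketch: the tuple encoding, the helper `in_index_set`, and the example loop nest and `affine.if` set are assumptions made for illustration and do not mirror the `FlatAffineConstraints` API.

```python
def in_index_set(point, constraints):
    """Return True if point satisfies every constraint.

    Each constraint is (coeffs, const, is_eq) and encodes
    sum(coeffs[k] * point[k]) + const >= 0, or == 0 when is_eq is True.
    """
    for coeffs, const, is_eq in constraints:
        value = sum(c * x for c, x in zip(coeffs, point)) + const
        if (value != 0) if is_eq else (value < 0):
            return False
    return True


# Domain of: affine.for %i = 0 to 8 { affine.for %j = 0 to 8 { ... } }
loop_domain = [
    ((1, 0), 0, False),    # i >= 0
    ((-1, 0), 7, False),   # 7 - i >= 0, i.e. i < 8
    ((0, 1), 0, False),    # j >= 0
    ((0, -1), 7, False),   # 7 - j >= 0, i.e. j < 8
]

# Constraints of a surrounding affine.if with set (i, j) : (i - j >= 0).
if_domain = [((1, -1), 0, False)]

# Merging the if constraints into the loop constraints narrows the index set.
index_set = loop_domain + if_domain
```

A point such as `(2, 3)` lies inside the loop bounds but is excluded once the `affine.if` constraint is merged in.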
Differential Revision: https://reviews.llvm.org/D84698
This revision aims to provide a new API, `checkTilingLegality`, to
verify that the loop tiling result still satisfies the dependence
constraints of the original loop nest.
Previously, there was no check for the validity of tiling. For instance:
```
func @diagonal_dependence() {
  %A = alloc() : memref<64x64xf32>
  affine.for %i = 0 to 64 {
    affine.for %j = 0 to 64 {
      %0 = affine.load %A[%j, %i] : memref<64x64xf32>
      %1 = affine.load %A[%i, %j - 1] : memref<64x64xf32>
      %2 = addf %0, %1 : f32
      affine.store %2, %A[%i, %j] : memref<64x64xf32>
    }
  }
  return
}
```
You can find more information about this example in Section 3.11 of
[1].
There are three dependences here: two flow dependences, one in
direction `(i, j) = (0, 1)` (a vector in the 2D iteration space) and one
in direction `(i, j) = (1, -1)`, and one anti dependence in direction
`(-1, 1)`.
Since two of them lie along the diagonal in opposite directions, the
default tiling method in `affine`, which tiles the iteration space into
rectangles, will violate the legality condition proposed by Irigoin and
Triolet [2]. That condition implies that two tiles cannot depend on each
other, whereas with `affine` tiling two rectangles along the same
diagonal do depend on each other, which violates the rule.
This diff attempts to put together a validator that checks whether the
rule from [2] is violated or not when applying the default tiling method
in `affine`.
The canonical way to perform such validation is to examine the effect
of adding the constraint from Irigoin and Triolet to the existing
dependence constraints.
Since we already know that `affine` tiles in a hyper-rectangular way,
and that the resulting tiles will be scheduled in the same order as
their respective loop indices, we can simplify the solution to just
checking whether all dependence components are non-negative along the
tiling dimensions.
We put this algorithm into a new API called `checkTilingLegality` under
`LoopTiling.cpp`. This function iterates over every `load`/`store` pair
and, if there is a dependence between them, gets the dependence
components and checks whether any of them is negative. The function
returns `failure` if the legality condition is violated.
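The simplified check can be sketched in a few lines of Python. This is a model under assumptions, not the implementation in `LoopTiling.cpp`: dependences are given directly as constant distance vectors (the real utility derives them from the memory accesses), and the Python function name and its `tiled_dims` parameter are hypothetical.

```python
def check_tiling_legality(dependence_vectors, tiled_dims):
    """Return True if rectangular tiling along tiled_dims is legal.

    dependence_vectors: iterable of tuples, one distance per loop dimension.
    tiled_dims: indices of the loop dimensions being tiled.
    Tiling is rejected as soon as one dependence has a negative component
    along a tiled dimension.
    """
    for vec in dependence_vectors:
        if any(vec[d] < 0 for d in tiled_dims):
            return False
    return True


# Dependence directions listed above for @diagonal_dependence.
diagonal_deps = [(0, 1), (1, -1), (-1, 1)]
```

With the vectors from the example, the check rejects tiling both loops because `(1, -1)` and `(-1, 1)` each have a negative component.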
[1] Bondhugula, U. Effective Automatic Parallelization and Locality Optimization Using the Polyhedral Model. https://dl.acm.org/doi/book/10.5555/1559029
[2] Irigoin, F. and Triolet, R. Supernode Partitioning. https://dl.acm.org/doi/10.1145/73560.73588
Differential Revision: https://reviews.llvm.org/D84882
This diff provides a concrete test case for the error that is raised when the iteration space is not hyper-rectangular.
The method used to emit this error message has been changed as well.
Differential Revision: https://reviews.llvm.org/D84531
Fix intra-tile upper bound setting in a scenario where the tile size was
larger than the trip count.
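The scenario can be reproduced with a small Python model. It only illustrates why the intra-tile upper bound must not exceed the original loop bound and does not claim to show the actual fix. The helper names are hypothetical, and a constant tile size with unit-step loops is assumed.

```python
def intra_tile_bounds(tile_start, loop_ub, tile_size):
    """Bounds of the intra-tile loop starting at tile_start, clamped to loop_ub."""
    return tile_start, min(tile_start + tile_size, loop_ub)


def iterations(lb, ub, tile_size):
    """Count iterations executed by the tiled loop over [lb, ub)."""
    count = 0
    for start in range(lb, ub, tile_size):      # inter-tile loop, step = tile size
        lo, hi = intra_tile_bounds(start, ub, tile_size)
        count += hi - lo
    return count
```

For a loop of trip count 10 and a tile size of 32, the single tile must cover 10 iterations, not 32.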
Differential Revision: https://reviews.llvm.org/D78505
Rename mlir::tileCodeGen -> mlir::tilePerfectlyNested to be consistent.
NFC clean up tiling utility code, drop dead code, better comments.
Expose isPerfectlyNested and reuse.
Differential Revision: https://reviews.llvm.org/D78423
Summary:
Modified AffineMap::get to remove support for the overload which allowed
an ArrayRef of AffineExpr but no context (and gathered the context from a
presumed first entry, resulting in bugs when there were 0 results).
Instead, we support only an ArrayRef with a context, and a version which
takes a single AffineExpr.
Additionally, removed some now-needless case logic which previously
selected which AffineMap::get overload to call.
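The failure mode of the removed overload can be modelled in Python. The classes and functions below are hypothetical stand-ins, not the MLIR API. They only show why inferring the context from the first result breaks when there are no results. In this model the failure surfaces as an `IndexError`.

```python
class Context:
    """Stand-in for an MLIRContext-like owner object."""


class Expr:
    """Stand-in for an AffineExpr-like value that remembers its context."""

    def __init__(self, context, text):
        self.context = context
        self.text = text


def get_map_inferring_context(results):
    """Old-style overload: takes the context from the first result."""
    context = results[0].context  # fails when results is empty
    return (context, tuple(results))


def get_map(results, context):
    """New-style overload: the caller always supplies the context."""
    return (context, tuple(results))
```

Passing the context explicitly makes the zero-result case well defined: `get_map([], ctx)` succeeds, while `get_map_inferring_context([])` cannot.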
Reviewers: flaub, bondhugula, rriddle!, nicolasvasilache, ftynse, ulysseB, mravishankar, antiagainst, aartbik
Subscribers: mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, bader, grosul1, frgossen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78226
Summary: Pass options are a better choice for various reasons and avoid the need for static constructors.
Differential Revision: https://reviews.llvm.org/D77707
Summary:
This is much cleaner, and fits the same structure as many other tablegen backends. This was not done originally as the CRTP in the pass classes made it overly verbose/complex.
Differential Revision: https://reviews.llvm.org/D77367
This revision removes all of the CRTP from the pass hierarchy in preparation for using the tablegen backend instead. This creates a much cleaner interface in the C++ code, and naturally fits with the rest of the infrastructure. A new utility class, PassWrapper, is added to replicate the existing behavior for passes not suitable for using the tablegen backend.
Differential Revision: https://reviews.llvm.org/D77350
This revision adds support for generating utilities for passes, such as options and statistics, that can be inferred from the tablegen definition. This removes additional boilerplate from the pass, and also makes it easier to remove the reliance on the pass registry to provide certain things (e.g., the pass argument).
Differential Revision: https://reviews.llvm.org/D76659
This generates a Passes.td for all of the dialects that have transformation passes. This removes the need for global registration for all of the dialect passes.
Differential Revision: https://reviews.llvm.org/D76657
This patch introduces a utility to separate full tiles from partial
tiles when tiling affine loop nests where trip counts are unknown or
where tile sizes don't divide trip counts. A conditional guard is
generated to separate out the full tile (with constant trip count loops)
into the then block of an 'affine.if' and the partial tile to the else
block. The separation allows the 'then' block (which has constant trip
count loops) to be optimized better subsequently: for example, for
unroll-and-jam, register tiling, and vectorization without generating
cleanup code, or for offloading to accelerators. Among techniques from the
literature, the if/else based separation leads to the most compact
cleanup code for multi-dimensional cases (because a single version is
used to model all partial tiles).
INPUT
affine.for %i0 = 0 to %M {
  affine.for %i1 = 0 to %N {
    "foo"() : () -> ()
  }
}
OUTPUT AFTER TILING W/O SEPARATION
#map0 = affine_map<(d0) -> (d0)>
#map1 = affine_map<(d0)[s0] -> (d0 + 32, s0)>
affine.for %arg2 = 0 to %M step 32 {
  affine.for %arg3 = 0 to %N step 32 {
    affine.for %arg4 = #map0(%arg2) to min #map1(%arg2)[%M] {
      affine.for %arg5 = #map0(%arg3) to min #map1(%arg3)[%N] {
        "foo"() : () -> ()
      }
    }
  }
}
OUTPUT AFTER TILING WITH SEPARATION
#map0 = affine_map<(d0) -> (d0)>
#map1 = affine_map<(d0) -> (d0 + 32)>
#map2 = affine_map<(d0)[s0] -> (d0 + 32, s0)>
#set0 = affine_set<(d0, d1)[s0, s1] : (-d0 + s0 - 32 >= 0, -d1 + s1 - 32 >= 0)>
affine.for %arg2 = 0 to %M step 32 {
  affine.for %arg3 = 0 to %N step 32 {
    affine.if #set0(%arg2, %arg3)[%M, %N] {
      // Full tile.
      affine.for %arg4 = #map0(%arg2) to #map1(%arg2) {
        affine.for %arg5 = #map0(%arg3) to #map1(%arg3) {
          "foo"() : () -> ()
        }
      }
    } else {
      // Partial tile.
      affine.for %arg4 = #map0(%arg2) to min #map2(%arg2)[%M] {
        affine.for %arg5 = #map0(%arg3) to min #map2(%arg3)[%N] {
          "foo"() : () -> ()
        }
      }
    }
  }
}
The separation is tested via a command-line flag on the loop tiling pass.
The utility itself allows one to pass in any band of contiguously nested
loops, and can be used by other transforms/utilities. The current
implementation works for hyper-rectangular loop nests.
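The guard in the separated output can be checked with a short Python model. It is a sketch under assumptions: the tile size is fixed at 32 as in the example, the helper names are hypothetical, and the body is replaced by a count of executed iterations.

```python
TILE = 32


def is_full_tile(i0, i1, m, n):
    """Mirror of #set0: both tile origins leave room for a full 32x32 tile."""
    return (-i0 + m - TILE >= 0) and (-i1 + n - TILE >= 0)


def run_separated(m, n):
    """Execute the tiled nest with if/else separation.

    Returns (number of full tiles, number of partial tiles, iterations run).
    """
    full = partial = points = 0
    for i0 in range(0, m, TILE):
        for i1 in range(0, n, TILE):
            if is_full_tile(i0, i1, m, n):
                full += 1
                # Constant trip counts: exactly TILE x TILE iterations.
                points += TILE * TILE
            else:
                partial += 1
                # Bounds clamped with min, as in #map2.
                points += (min(i0 + TILE, m) - i0) * (min(i1 + TILE, n) - i1)
    return full, partial, points
```

In this model a single else branch handles every kind of partial tile, and the total iteration count still equals `m * n`.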
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D76700
Move some of the affine transforms and their test cases to their
respective dialect directory. This patch does not complete the move, but
takes care of a good part.
Renames: prefix 'affine' to affine loop tiling cl options,
vectorize -> super-vectorize
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D76565