Commit Graph

98 Commits

Author SHA1 Message Date
Uday Bondhugula 00860662a2 Generate dealloc's for alloc's of pipeline-data-transfer
- for the DMA transfers being pipelined through double buffering, generate
  deallocs for the double buffers being alloc'ed

This change is along the lines of cl/233502632. We initially wanted to experiment with
scoped allocation - so the deallocation's were usually not necessary; however, they are
needed even with scoped allocations in some situations - for eg. when the enclosing loop
gets unrolled. The dealloc serves as an end of lifetime marker.

PiperOrigin-RevId: 233653463
2019-03-29 16:25:53 -07:00
Uday Bondhugula 8b3f841daf Generate dealloc's for the alloc's of dma-generate.
- for the DMA buffers being allocated (and their tags), generate corresponding deallocs
- minor related update to replaceAllMemRefUsesWith and PipelineDataTransfer pass

Code generation for DMA transfers was being done with the initial simplifying
assumption that the alloc's would map to scoped allocations, and so no
deallocations would be necessary. Drop this assumption to generalize. Note that
even with scoped allocations, unrolling loops that have scoped allocations
could create a series of allocations and exhaustion of fast memory. Having a
end of lifetime marker like a dealloc in fact allows creating new scopes if
necessary when lowering to a backend and still utilize scoped allocation.
DMA buffers created by -dma-generate are guaranteed to have either
non-overlapping lifetimes or nested lifetimes.

PiperOrigin-RevId: 233502632
2019-03-29 16:24:08 -07:00
Uday Bondhugula 4ba8c9147d Automated rollback of changelist 232717775.
PiperOrigin-RevId: 232807986
2019-03-29 16:19:33 -07:00
River Riddle 90d10b4e00 NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for'. The is the second step to adding a namespace to the AffineOps dialect.
PiperOrigin-RevId: 232717775
2019-03-29 16:17:59 -07:00
River Riddle 3227dee15d NFC: Rename affine_apply to affine.apply. This is the first step to adding a namespace to the affine dialect.
PiperOrigin-RevId: 232707862
2019-03-29 16:17:29 -07:00
River Riddle bf9c381d1d Remove InstWalker and move all instruction walking to the api facilities on Function/Block/Instruction.
PiperOrigin-RevId: 232388113
2019-03-29 16:12:59 -07:00
River Riddle b499277fb6 Remove remaining usages of OperationInst in lib/Transforms.
PiperOrigin-RevId: 232323671
2019-03-29 16:10:53 -07:00
River Riddle a3d9ccaecb Replace the walkOps/visitOperationInst variants from the InstWalkers with the Instruction variants.
PiperOrigin-RevId: 232322030
2019-03-29 16:10:24 -07:00
River Riddle 5052bd8582 Define the AffineForOp and replace ForInst with it. This patch is largely mechanical, i.e. changing usages of ForInst to OpPointer<AffineForOp>. An important difference is that upon construction an AffineForOp no longer automatically creates the body and induction variable. To generate the body/iv, 'createBody' can be called on an AffineForOp with no body.
PiperOrigin-RevId: 232060516
2019-03-29 16:06:49 -07:00
Chris Lattner b42bea215a Change AffineApplyOp to produce a single result, simplifying the code that
works with it, and updating the g3docs.

PiperOrigin-RevId: 231120927
2019-03-29 15:40:38 -07:00
River Riddle 36babbd781 Change the ForInst induction variable to be a block argument of the body instead of the ForInst itself. This is a necessary step in converting ForInst into an operation.
PiperOrigin-RevId: 231064139
2019-03-29 15:40:23 -07:00
Nicolas Vasilache 0e7a8a9027 Drop AffineMap::Null and IntegerSet::Null
Addresses b/122486036

This CL addresses some leftover crumbs in AffineMap and IntegerSet by removing
the Null method and cleaning up the constructors.

As the ::Null uses were tracked down, opportunities appeared to untangle some
of the Parsing logic and make it explicit where AffineMap/IntegerSet have
ambiguous syntax. Previously, ambiguous cases were hidden behind the implicit
pointer values of AffineMap* and IntegerSet* that were passed as function
parameters. Depending the values of those pointers one of 3 behaviors could
occur.

This parsing logic convolution is one of the rare cases where I would advocate
for code duplication. The more proper fix would be to make the syntax
unambiguous or to allow some lookahead.

PiperOrigin-RevId: 231058512
2019-03-29 15:40:08 -07:00
Uday Bondhugula b588d58c5f Update createAffineComputationSlice to generate single result affine maps
- Update createAffineComputationSlice to generate a sequence of single result
  affine apply ops instead of one multi-result affine apply
- update pipeline-data-transfer test case; while on this, also update the test
  case to use only single result affine maps, and make it more robust to
  change.

PiperOrigin-RevId: 230965478
2019-03-29 15:37:53 -07:00
River Riddle 6859f33292 Migrate VectorOrTensorType/MemRefType shape api to use int64_t instead of int.
PiperOrigin-RevId: 230605756
2019-03-29 15:33:20 -07:00
Uday Bondhugula 03e15e1b9f Minor code cleanup - NFC.
- readability changes

PiperOrigin-RevId: 229443430
2019-03-29 15:19:41 -07:00
Uday Bondhugula 742c37abc9 Fix DMA overlap pass buffer mapping
- the double buffer should be indexed (iv floordiv step) % 2 and NOT (iv % 2);
  step wasn't being accounted for.

- fix test cases, enable failing test cases

PiperOrigin-RevId: 228635726
2019-03-29 15:07:10 -07:00
Chris Lattner 7974889f54 Update and generalize various passes to work on both CFG and ML functions,
simplifying them in minor ways.  The only significant cleanup here
is the constant folding pass.  All the other changes are simple and easy,
but this is still enough to shrink the compiler by 45LOC.

The one pass left to merge is the CSE pass, which will be move involved, so I'm
splitting it out to its own patch (which I'll tackle right after this).

This is step 28/n towards merging instructions and statements.

PiperOrigin-RevId: 227328115
2019-03-29 14:49:52 -07:00
Uday Bondhugula f12182157e Introduce PostDominanceInfo, fix properlyDominates() for Instructions
- introduce PostDominanceInfo in the right/complete way and use that for post
  dominance check in store-load forwarding
- replace all uses of Analysis/Utils::dominates/properlyDominates with
  DominanceInfo::dominates/properlyDominates
- drop all redundant copies of dominance methods in Analysis/Utils/
- in pipeline-data-transfer, replace dominates call with a much less expensive
  check; similarly, substitute dominates() in checkMemRefAccessDependence with
  a simpler check suitable for that context
- fix a bug in properlyDominates
- improve doc for 'for' instruction 'body'

PiperOrigin-RevId: 227320507
2019-03-29 14:48:44 -07:00
Chris Lattner 456ad6a8e0 Standardize naming of statements -> instructions, revisting the code base to be
consistent and moving the using declarations over.  Hopefully this is the last
truly massive patch in this refactoring.

This is step 21/n towards merging instructions and statements, NFC.

PiperOrigin-RevId: 227178245
2019-03-29 14:44:30 -07:00
Chris Lattner 315a466aed Rename BasicBlock and StmtBlock to Block, and make a pass cleaning it up. I did not make an effort to rename all of the 'bb' names in the codebase, since they are still correct and any specific missed once can be fixed up on demand.
The last major renaming is Statement -> Instruction, which is why Statement and
Stmt still appears in various places.

This is step 19/n towards merging instructions and statements, NFC.

PiperOrigin-RevId: 227163082
2019-03-29 14:43:58 -07:00
Chris Lattner 69d9e990fa Eliminate the using decls for MLFunction and CFGFunction standardizing on
Function.

This is step 18/n towards merging instructions and statements, NFC.

PiperOrigin-RevId: 227139399
2019-03-29 14:43:13 -07:00
Chris Lattner d798f9bad5 Rename BBArgument -> BlockArgument, Op::getOperation -> Op::getInst(),
StmtResult -> InstResult, StmtOperand -> InstOperand, and remove the old names.

This is step 17/n towards merging instructions and statements, NFC.

PiperOrigin-RevId: 227121537
2019-03-29 14:42:40 -07:00
Chris Lattner 5187cfcf03 Merge Operation into OperationInst and standardize nomenclature around
OperationInst.  This is a big mechanical patch.

This is step 16/n towards merging instructions and statements, NFC.

PiperOrigin-RevId: 227093712
2019-03-29 14:42:23 -07:00
Chris Lattner 4c05f8cac6 Merge CFGFuncBuilder/MLFuncBuilder/FuncBuilder together into a single new
FuncBuilder class.  Also rename SSAValue.cpp to Value.cpp

This is step 12/n towards merging instructions and statements, NFC.

PiperOrigin-RevId: 227067644
2019-03-29 14:40:22 -07:00
Chris Lattner 3f190312f8 Merge SSAValue, CFGValue, and MLValue together into a single Value class, which
is the new base of the SSA value hierarchy.  This CL also standardizes all the
nomenclature and comments to use 'Value' where appropriate.  This also eliminates a large number of cast<MLValue>(x)'s, which is very soothing.

This is step 11/n towards merging instructions and statements, NFC.

PiperOrigin-RevId: 227064624
2019-03-29 14:40:06 -07:00
Jacques Pienaar 58d50a6325 Rename convenience methods to make type explicit.
PiperOrigin-RevId: 226939383
2019-03-29 14:36:50 -07:00
Chris Lattner 1301f907a1 Refactor ForStmt: having it contain a StmtBlock instead of subclassing
StmtBlock.  This is more consistent with IfStmt and also conceptually makes
more sense - a forstmt "isn't" its body, it contains its body.

This is step 1/N towards merging BasicBlock and StmtBlock.  This is required
because in the new regime StmtBlock will have a use list (just like BasicBlock
does) of operands, and ForStmt already has a use list for its induction
variable.

This is a mechanical patch, NFC.

PiperOrigin-RevId: 226684158
2019-03-29 14:35:19 -07:00
Uday Bondhugula b9f53dc0bd Update/Fix LoopUtils::stmtBodySkew to handle loop step.
- loop step wasn't handled and there wasn't a TODO or an assertion; fix this.
- rename 'delay' to shift for consistency/readability.
- other readability changes.
- remove duplicate attribute print for DmaStartOp; fix misplaced attribute
  print for DmaWaitOp
- add build method for AddFOp (unrelated to this CL, but add it anyway)

PiperOrigin-RevId: 224892958
2019-03-29 14:25:07 -07:00
Uday Bondhugula d59a95a05c Fix missing check for dependent DMAs in pipeline-data-transfer
- adding a conservative check for now (TODO: use the dependence analysis pass
  once the latter is extended to deal with DMA ops). resolve an existing bug on
  a test case.

- update test cases

PiperOrigin-RevId: 224869526
2019-03-29 14:24:53 -07:00
Uday Bondhugula 2ef57806ba Update/fix -pipeline-data-transfer; fix b/120770946
- fix replaceAllMemRefUsesWith call to replace only inside loop body.
- handle the case where DMA buffers are dynamic; extend doubleBuffer() method
  to handle dynamically shaped DMA buffers (pass the right operands to AllocOp)
- place alloc's for DMA buffers at the depth at which pipelining is being done
  (instead of at top-level)
- add more test cases

PiperOrigin-RevId: 224852231
2019-03-29 14:24:22 -07:00
Uday Bondhugula b6c03917ad Remove allocations for memref's that become dead as a result of double
buffering in the auto DMA overlap pass.

This is done online in the pass.

PiperOrigin-RevId: 222313640
2019-03-29 14:05:19 -07:00
Uday Bondhugula fff1efbaf5 Updates to transformation/analysis passes/utilities. Update DMA generation pass
and getMemRefRegion() to work with specified loop depths; add support for
outgoing DMAs, store op's.

- add support for getMemRefRegion symbolic in outer loops - hence support for
  DMAs symbolic in outer surrounding loops.

- add DMA generation support for outgoing DMAs (store op's to lower memory
  space); extend getMemoryRegion to store op's. -memref-bound-check now works
  with store op's as well.

- fix dma-generate (references to the old memref in the dma_start op were also
  being replaced with the new buffer); we need replace all memref uses to work
  only on a subset of the uses - add a new optional argument for
  replaceAllMemRefUsesWith. update replaceAllMemRefUsesWith to take an optional
  'operation' argument to serve as a filter - if provided, only those uses that
  are dominated by the filter are replaced.

- Add missing print for attributes for dma_start, dma_wait op's.

- update the FlatAffineConstraints API

PiperOrigin-RevId: 221889223
2019-03-29 14:00:51 -07:00
Jacques Pienaar cc9a6ed09d Initialize Pass with PassID.
The passID is not currently stored in Pass but this avoids the unused variable warning. The passID is used to uniquely identify passes, currently this is only stored/used in PassInfo.

PiperOrigin-RevId: 220485662
2019-03-29 13:50:34 -07:00
Jacques Pienaar 6f0fb22723 Add static pass registration
Add static pass registration and change mlir-opt to use it. Future work is needed to refactor the registration for PassManager usage.

Change build targets to alwayslink to enforce registration.

PiperOrigin-RevId: 220390178
2019-03-29 13:49:34 -07:00
Uday Bondhugula 8201e19e3d Introduce memref bound checking.
Introduce analysis to check memref accesses (in MLFunctions) for out of bound
ones. It works as follows:

$ mlir-opt -memref-bound-check test/Transforms/memref-bound-check.mlir

/tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#1
      %x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
           ^
/tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#1
      %x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
           ^
/tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#2
      %x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
           ^
/tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#2
      %x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
           ^
/tmp/single.mlir:12:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#1
      %y = load %B[%idy] : memref<128 x i32>
           ^
/tmp/single.mlir:12:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#1
      %y = load %B[%idy] : memref<128 x i32>
           ^
#map0 = (d0, d1) -> (d0, d1)
#map1 = (d0, d1) -> (d0 * 128 - d1)
mlfunc @test() {
  %0 = alloc() : memref<9x9xi32>
  %1 = alloc() : memref<128xi32>
  for %i0 = -1 to 9 {
    for %i1 = -1 to 9 {
      %2 = affine_apply #map0(%i0, %i1)
      %3 = load %0[%2tensorflow/mlir#0, %2tensorflow/mlir#1] : memref<9x9xi32>
      %4 = affine_apply #map1(%i0, %i1)
      %5 = load %1[%4] : memref<128xi32>
    }
  }
  return
}

- Improves productivity while manually / semi-automatically developing MLIR for
  testing / prototyping; also provides an indirect way to catch errors in
  transformations.

- This pass is an easy way to test the underlying affine analysis
  machinery including low level routines.

Some code (in getMemoryRegion()) borrowed from @andydavis cl/218263256.

While on this:

- create mlir/Analysis/Passes.h; move Pass.h up from mlir/Transforms/ to mlir/

- fix a bug in AffineAnalysis.cpp::toAffineExpr

TODO: extend to non-constant loop bounds (straightforward). Will transparently
work for all accesses once floordiv, mod, ceildiv are supported in the
AffineMap -> FlatAffineConstraints conversion.
PiperOrigin-RevId: 219397961
2019-03-29 13:46:08 -07:00
River Riddle 4c465a181d Implement value type abstraction for types.
This is done by changing Type to be a POD interface around an underlying pointer storage and adding in-class support for isa/dyn_cast/cast.

PiperOrigin-RevId: 219372163
2019-03-29 13:45:54 -07:00
Chris Lattner adbba70d82 Simplify FunctionPass to eliminate the CFGFunctionPass/MLFunctionPass
distinction.  FunctionPasses can now choose to get called on all functions, or
have the driver split CFG/ML Functions up for them.  NFC.

PiperOrigin-RevId: 218775885
2019-03-29 13:40:05 -07:00
Uday Bondhugula ccfe593715 PassResult return cleanup.
- return success as long as IR is in a valid state.

PiperOrigin-RevId: 218225317
2019-03-29 13:35:47 -07:00
Chris Lattner 7850258c49 Introduce a new Operation::erase helper to generalize some code in
the pattern matcher / canonicalizer, and rename existing eraseFromBlock methods
to align with it.

PiperOrigin-RevId: 218104455
2019-03-29 13:34:51 -07:00
Feng Liu 34927e2474 Rename Operation::getAs to Operation::dyn_cast
Also rename Operation::is to Operation::isa
Introduce Operation::cast

All of these are for consistency with global dyn_cast/cast/isa operators.

PiperOrigin-RevId: 217878786
2019-03-29 13:33:41 -07:00
Uday Bondhugula 18e666702c Generalize / improve DMA transfer overlap; nested and multiple DMA support; resolve
multiple TODOs.

- replace the fake test pass (that worked on just the first loop in the
  MLFunction) to perform DMA pipelining on all suitable loops.
- nested DMAs work now (DMAs in an outer loop, more DMAs in nested inner loops)
- fix bugs / assumptions: correctly copy memory space and elemental type of source
  memref for double buffering.
- correctly identify matching start/finish statements, handle multiple DMAs per
  loop.
- introduce dominates/properlyDominates utitilies for MLFunction statements.
- move checkDominancePreservationOnShifts to LoopAnalysis.h; rename it
  getShiftValidity
- refactor getContainingStmtPos -> findAncestorStmtInBlock - move into
  Analysis/Utils.h; has two users.
- other improvements / cleanup for related API/utilities
- add size argument to dma_wait - for nested DMAs or in general, it makes it
  easy to obtain the size to use when lowering the dma_wait since we wouldn't
  want to identify the matching dma_start, and more importantly, in general/in the
  future, there may not always be a dma_start dominating the dma_wait.
- add debug information in the pass

PiperOrigin-RevId: 217734892
2019-03-29 13:32:28 -07:00
Uday Bondhugula 86eac4618c Create private exclusive / single use affine computation slice for an op stmt.
- add util to create a private / exclusive / single use affine
  computation slice for an op stmt (see method doc comment); a single
  multi-result affine_apply op is prepended to the op stmt to provide all
  results needed for its operands as a function of loop iterators and symbols.
- use it for DMA pipelining (to create private slices for DMA start stmt's);
  resolve TODOs/feature request (b/117159533)
- move createComposedAffineApplyOp to Transforms/Utils; free it from taking a
  memref as input / generalize it.

PiperOrigin-RevId: 216926818
2019-03-29 13:29:21 -07:00
Jacques Pienaar 764fd035b0 Split BuiltinOps out of StandardOps.
* Move Return, Constant and AffineApply out into BuiltinOps;
* BuiltinOps are always registered, while StandardOps follow the same dynamic registration;
* Kept isValidX in MLValue as we don't have a verify on AffineMap so need to keep it callable from Parser (I wanted to move it to be called in verify instead);

PiperOrigin-RevId: 216592527
2019-03-29 13:28:12 -07:00
Nicolas Vasilache 1d3e7e2616 [MLIR] AffineMap value type
This CL applies the same pattern as AffineExpr to AffineMap: a simple struct
that acts as the storage is allocated in the bump pointer. The AffineMap is
immutable and accessed everywhere by value.

PiperOrigin-RevId: 216445930
2019-03-29 13:26:24 -07:00
Uday Bondhugula 82e55750d2 Add target independent standard DMA ops: dma.start, dma.wait
Add target independent standard DMA ops: dma.start, dma.wait. Update pipeline
data transfer to use these to detect DMA ops.

While on this
- return failure from mlir-opt::performActions if a pass generates invalid output
- improve error message for verify 'n' operand traits

PiperOrigin-RevId: 216429885
2019-03-29 13:26:10 -07:00
Nicolas Vasilache ce2edea135 [MLIR] Cleanup AffineExpr
This CL introduces a series of cleanups for AffineExpr value types:
1. to make it clear that the value types should be used, the pointer
AffineExpr types are put in the detail namespace. Unfortunately, since the
value type operator-> only forwards to the underlying pointer type, we
still
need to expose this in the include file for now;
2. AffineExprKind is ok to use, it thus comes out of detail and thus of
AffineExpr
3. getAffineDimExpr, getAffineSymbolExpr, getAffineConstantExpr are
similarly
extracted as free functions and their naming is mande consistent across
Builder, MLContext and AffineExpr
4. AffineBinaryOpEx::simplify functions are made into static free
functions.
In particular it is moved away from AffineMap.cpp where it does not belong
5. operator AffineExprType is made explicit
6. uses the binary operators everywhere possible
7. drops the pointer usage everywhere outside of AffineExpr.cpp,
MLIRContext.cpp and AsmPrinter.cpp

PiperOrigin-RevId: 216207212
2019-03-29 13:24:45 -07:00
Uday Bondhugula 6cfdb756b1 Introduce memref replacement/rewrite support: to replace an existing memref
with a new one (of a potentially different rank/shape) with an optional index
remapping.

- introduce Utils::replaceAllMemRefUsesWith
- use this for DMA double buffering

(This CL also adds a few temporary utilities / code that will be done away with
once:
1) abstract DMA op's are added
2) memref deferencing side-effect / trait is available on op's
3) b/117159533 is resolved (memref index computation slices).
PiperOrigin-RevId: 215831373
2019-03-29 13:23:19 -07:00
Uday Bondhugula 041817a45e Introduce loop body skewing / loop pipelining / loop shifting utility.
- loopBodySkew shifts statements of a loop body by stmt-wise delays, and is
  typically meant to be used to:
  - allow overlap of non-blocking start/wait until completion operations with
    other computation
  - allow shifting of statements (for better register
    reuse/locality/parallelism)
  - software pipelining (when applied to the innermost loop)
- an additional argument specifies whether to unroll the prologue and epilogue.
- add method to check SSA dominance preservation.
- add a fake loop pipeline pass to test this utility.

Sample input/output are below. While on this, fix/add following:

- fix minor bug in getAddMulPureAffineExpr
- add additional builder methods for common affine map cases
- fix const_operand_iterator's for ForStmt, etc. When there is no such thing
  as 'const MLValue', the iterator shouldn't be returning const MLValue's.
  Returning MLValue is const correct.

Sample input/output examples:

1) Simplest case: shift second statement by one.

Input:

for %i = 0 to 7 {
  %y = "foo"(%i) : (affineint) -> affineint
  %x = "bar"(%i) : (affineint) -> affineint
}

Output:

#map0 = (d0) -> (d0 - 1)
mlfunc @loop_nest_simple1() {
  %c8 = constant 8 : affineint
  %c0 = constant 0 : affineint
  %0 = "foo"(%c0) : (affineint) -> affineint
  for %i0 = 1 to 7 {
    %1 = "foo"(%i0) : (affineint) -> affineint
    %2 = affine_apply #map0(%i0)
    %3 = "bar"(%2) : (affineint) -> affineint
  }
  %4 = affine_apply #map0(%c8)
  %5 = "bar"(%4) : (affineint) -> affineint
  return
}

2) DMA overlap: shift dma.wait and compute by one.

Input
  for %i = 0 to 7 {
    %pingpong = affine_apply (d0) -> (d0 mod 2) (%i)
    "dma.enqueue"(%pingpong) : (affineint) -> affineint
    %pongping = affine_apply (d0) -> (d0 mod 2) (%i)
    "dma.wait"(%pongping) : (affineint) -> affineint
    "compute1"(%pongping) : (affineint) -> affineint
  }

Output

#map0 = (d0) -> (d0 mod 2)
#map1 = (d0) -> (d0 - 1)
#map2 = ()[s0] -> (s0 + 7)
mlfunc @loop_nest_dma() {
  %c8 = constant 8 : affineint
  %c0 = constant 0 : affineint
  %0 = affine_apply #map0(%c0)
  %1 = "dma.enqueue"(%0) : (affineint) -> affineint
  for %i0 = 1 to 7 {
    %2 = affine_apply #map0(%i0)
    %3 = "dma.enqueue"(%2) : (affineint) -> affineint
    %4 = affine_apply #map1(%i0)
    %5 = affine_apply #map0(%4)
    %6 = "dma.wait"(%5) : (affineint) -> affineint
    %7 = "compute1"(%5) : (affineint) -> affineint
  }
  %8 = affine_apply #map1(%c8)
  %9 = affine_apply #map0(%8)
  %10 = "dma.wait"(%9) : (affineint) -> affineint
  %11 = "compute1"(%9) : (affineint) -> affineint
  return
}

3) With arbitrary affine bound maps:

Shift last two statements by two.

Input:

  for %i = %N to ()[s0] -> (s0 + 7)()[%N] {
    %y = "foo"(%i) : (affineint) -> affineint
    %x = "bar"(%i) : (affineint) -> affineint
    %z = "foo_bar"(%i) : (affineint) -> (affineint)
    "bar_foo"(%i) : (affineint) -> (affineint)
  }

Output

#map0 = ()[s0] -> (s0 + 1)
#map1 = ()[s0] -> (s0 + 2)
#map2 = ()[s0] -> (s0 + 7)
#map3 = (d0) -> (d0 - 2)
#map4 = ()[s0] -> (s0 + 8)
#map5 = ()[s0] -> (s0 + 9)

  for %i0 = %arg0 to #map0()[%arg0] {
    %0 = "foo"(%i0) : (affineint) -> affineint
    %1 = "bar"(%i0) : (affineint) -> affineint
  }
  for %i1 = #map1()[%arg0] to #map2()[%arg0] {
    %2 = "foo"(%i1) : (affineint) -> affineint
    %3 = "bar"(%i1) : (affineint) -> affineint
    %4 = affine_apply #map3(%i1)
    %5 = "foo_bar"(%4) : (affineint) -> affineint
    %6 = "bar_foo"(%4) : (affineint) -> affineint
  }
  for %i2 = #map4()[%arg0] to #map5()[%arg0] {
    %7 = affine_apply #map3(%i2)
    %8 = "foo_bar"(%7) : (affineint) -> affineint
    %9 = "bar_foo"(%7) : (affineint) -> affineint
  }

4) Shift one by zero, second by one, third by two

  for %i = 0 to 7 {
    %y = "foo"(%i) : (affineint) -> affineint
    %x = "bar"(%i) : (affineint) -> affineint
    %z = "foobar"(%i) : (affineint) -> affineint
  }

#map0 = (d0) -> (d0 - 1)
#map1 = (d0) -> (d0 - 2)
#map2 = ()[s0] -> (s0 + 7)

  %c9 = constant 9 : affineint
  %c8 = constant 8 : affineint
  %c1 = constant 1 : affineint
  %c0 = constant 0 : affineint
  %0 = "foo"(%c0) : (affineint) -> affineint
  %1 = "foo"(%c1) : (affineint) -> affineint
  %2 = affine_apply #map0(%c1)
  %3 = "bar"(%2) : (affineint) -> affineint
  for %i0 = 2 to 7 {
    %4 = "foo"(%i0) : (affineint) -> affineint
    %5 = affine_apply #map0(%i0)
    %6 = "bar"(%5) : (affineint) -> affineint
    %7 = affine_apply #map1(%i0)
    %8 = "foobar"(%7) : (affineint) -> affineint
  }
  %9 = affine_apply #map0(%c8)
  %10 = "bar"(%9) : (affineint) -> affineint
  %11 = affine_apply #map1(%c8)
  %12 = "foobar"(%11) : (affineint) -> affineint
  %13 = affine_apply #map1(%c9)
  %14 = "foobar"(%13) : (affineint) -> affineint

5) SSA dominance violated; no shifting if a shift is specified for the second
statement.

  for %i = 0 to 7 {
    %x = "foo"(%i) : (affineint) -> affineint
    "bar"(%x) : (affineint) -> affineint
  }

PiperOrigin-RevId: 214975731
2019-03-29 13:21:26 -07:00