Using signless ints as the address was just wrong. It was probably done
to avoid incomplete lowerings, which have been addressed. Changes RAM
declaration addresses to use unsigned integer types throughout the
ESI dialect and its tests, updating both the code generator and
expected outputs.
- Switch address ports to unsigned (uiN) in MLIR tests and service
definitions.
- Update service port generation and SystemVerilog memory
instantiation to use unsigned address types, inserting bitcasts to
signless for indexing.
- Adjust PyCDE front end and software integration tests to use
UInt addressing.
Adds the ESIInstanceOp. This op allows a non-handshake design to
instantiate `handshake.func`s. Since handshake needs elastic inputs and
produces elastic outputs, this uses ESI channels.
This pass deconstructs the (rather complex) semantics of a >2 input cmerge operation into a series of 2-input cmerge operations + supporting logic. This simpler structure is better suited for `dc` lowering, which only supports lowering 2-input control merge operations.
Simplify
fix test
Also split regular merges
nit
Fix index type propagation for cmerge
Also zext non-`builtin.index`-typed `control_merge` operations
Having a direction on the service ports which just reversed the bundle directions was far too complicated. Standardize on the service always packing the bundle and sending it to the client. Standardizing that way has the additional benefit that this was already the canonical form which all the service generators saw. Remove all to_client verbiage from the IR.
This is quite invasive. This converts from the functiontype printer to the moduletype printer.
---------
Co-authored-by: Mike Urbach <mikeurbach@gmail.com>
... by requiring that handshake ops are nested within an operation that inherits the `FineGrainedDataflowRegionOpInterface`.
This is somewhat of a half-way solution, seeing as there isn't support for `HasParent` with an interface upstream. I've raised the issue and suggested a fix here https://github.com/llvm/llvm-project/pull/66196 but we'll see how long that takes to resolve.
Until then, added a `HasParentInterface` which does the same thing, albeit with a cryptic error message about which interface the parent op lacked (note: the whole issue here is that there isn't any name literal being generated for op interfaces).
I'll be monitoring the upstream stuff, and changing this over until then. For now, the motivation for adding this into circt is to unblock me in using handshake outside of a `handshake.func` while still having a restriction on where handshake ops can be used - i.e. i don't want to completely lift the `HasParent` restriction - users should still explicitly opt-into the fact that "using handshake => handshake ops is in a fine-grained dataflow region".
This commit removes the InferTypeOpInterface from ControlMergeOp because
it no longer made sense w.r.t. the variable type of the index result introduced in
#4929). Additionally, when parsing a control_merge with an index type different
than "index", it would raise an error because of the mismatch between the
parsed and inferred types. Example code that would trigger it below.
```
handshake.func @test(%arg1: i32, %arg2: i32) {
%result, %index = control_merge %arg1, %arg2 : i32, i64
return
}
```
The commit also adds a custom builder for ControlMergeOp (that was
previously auto-generated by InferTypeOpInterface) to allow creating
the operation without explicitly providing return types. The builder defaults
to an Index type for the op's second result.
[Handshake] Make ControlMergeOp idx result AnyType
This commit changes the type of ControlMergeOp's second result
(representing the index of the propagated input) from an Index to an
AnyType. A verifier is implemented for the operation that checks that
this result is either an Index or IntegerType of large enough bitwidth
to support indexing the operation's operands. This aligns the
verification logic of this result with the one for the select operand of
MuxOp's, which is semantically similar.
ControlMergeOp's assembly format is changed to also print the type of
the second result, since it is no longer necessarily an Index. If the
result types are not specified explicitly during operation construction,
the second result will default to an Index. Tests are modified to
reflect the change in assembly format.
This fixes a bug in the `EliminateCBranchIntoMuxPattern` canonicalization pattern where a `cond_br` operation would get deleted despite still having uses. Below is an example handshake.func which would trigger the bug.
```mlir
handshake.func @cbranch_into_mux_elim(%arg0 : i1, %arg1 : index, %arg2 : i32, %arg3: none) -> (i32, i32, none) {
%trueResult, %falseResult = cond_br %arg0, %arg2 : i32
%0 = mux %arg1 [%trueResult, %falseResult] : index, i32
%1 = arith.addi %trueResult, %trueResult : i32
handshake.return %0, %1, %arg3 : i32, i32, none
}
```
Here the `cond_br` would get deleted despite `%trueResult` still being used by `arith.addi`.
The above example is added to Handshake canonicalization tests.
* isControl paramater for some HandshakeOps builders
Logic to determine whether a Handshake operation is control-only
or not was moved outside the builders and into the
StandardToHandshake conversion pass. Modified builders now take an
isControl parameter.
* Cleaned up merge-like operations builders
This commit refactors and cleans up the merge insertion logic in
StandardToHandshake. The pass now creates valid merge-like operations
from the beginning with backedges as operands. All backedges are then
resolved in reconnectMergeOps. This allows us to remove some operation
builders that were only used during StandardToHandshake.
* Handshake builders cleanup and SOST interface
This commit removes a lot of custom builders for Handshake operations,
and re-enables default builders for most operations. The commit also
replaces the sost namespace (along with its operation attributes logic)
with an SOSTInterface that Handshake operations can implement to achieve
the same purpose with less code and more reliability. This in turns
facilitates the deletion of some operation attributes that were
redundant.
A lot of tests still fail on this commit due to largely unupdated
parse and print logic for Handshake operations.
* Fixed many issues, all tests passing
This commit fixes many issues introduced during Handshake operation
builders cleanup, many related to parsing, printing, or constructing
operations. In the HandshakeToFIRRTL pass, the control-ness of each
Handshake operation is cached in an attribute before lowering to
accomodate the new logic to determine whether an operation is
control-only.
* Verification and type inference for HandshakeOps
This commit adds some verification logic as well as return type
inference for a number of Handshake operations. The latter allows to
shorten the creation of Handshake operations in many places as the
return type(s) need no longer be specified at creation.
The commit also introduces a simple builder for ForkOp and LazyForkOp.
This commit adds a new MuxOp rewrite pattern that canonicalizes
muxes with two data operands originating from the same conditional
branch, resulting in the removal of at least one mux and one
cbranch. It also adds three new tests to check for the correctness
of the rewrite pattern, and updates existing tests to reflect the
expected change in canonicalized IR.
This turned out to be a bit more complicated than I had expected (assumptions placed on the hardware interface of the module, this pass still need high-level memory info, index types are used for memory in Handshake, ESI strictly requires clog2(memory size) for address signals). I'm sure there's a more principled way of doing this, but that was not what flowed out of the fingers.
A wrapper is created which instantiates an `esi.mem.ram` service for each `memref` argument of the original handshake function. This wrapper also instantiates an external module which is a stand-in for the module that will be created during `HandshakeToHW`. The load- and store ports of the to-be-lowered instance are plumbed up with esi service requests.
Given a handshake function as follows, with 1 load and 1 store port:
```mlir
handshake.func @main(%arg0: index, %arg1: index, %v: i32, %mem : memref<10xi32>, %argCtrl: none) -> none
```
the following IR is generated:
```mlir
hw.module.extern @_main_hw(%arg0: !esi.channel<i64>, %arg1: !esi.channel<i64>, %v: !esi.channel<i32>, %mem_ld0.data: !esi.channel<i32>, %mem_st0.done: !esi.channel<i0>, %argCtrl: !esi.channel<i0>, %clock: i1, %reset: i1) -> (out0: !esi.channel<i0>, mem_ld0.addr: !esi.channel<i4>, mem_st0: !esi.channel<!hw.struct<address: i4, data: i32>>)
esi.mem.ram @mem i32 x 10
hw.module @main_esi_wrapper(%arg0: !esi.channel<i64>, %arg1: !esi.channel<i64>, %v: !esi.channel<i32>, %argCtrl: !esi.channel<i0>, %clock: i1, %reset: i1) -> (out0: !esi.channel<i0>) {
esi.service.instance @mem impl as "cosim" opts {}(%clock, %reset) : (i1, i1) -> ()
%0 = esi.service.req.inout %main.mem_ld0.addr -> <@mem::@read>([]) : !esi.channel<i4> -> !esi.channel<i32>
%1 = esi.service.req.inout %main.mem_st0 -> <@mem::@write>([]) : !esi.channel<!hw.struct<address: i4, data: i32>> -> !esi.channel<i0>
%main.out0, %main.mem_ld0.addr, %main.mem_st0 = hw.instance "main" @_main_hw(arg0: %arg0: !esi.channel<i64>, arg1: %arg1: !esi.channel<i64>, v: %v: !esi.channel<i32>, mem_ld0.data: %0: !esi.channel<i32>, mem_st0.done: %1: !esi.channel<i0>, argCtrl: %argCtrl: !esi.channel<i0>, clock: %clock: i1, reset: %reset: i1) -> (out0: !esi.channel<i0>, mem_ld0.addr: !esi.channel<i4>, mem_st0: !esi.channel<!hw.struct<address: i4, data: i32>>)
hw.output %main.out0 : !esi.channel<i0>
}
handshake.func @main(%arg0: index, %arg1: index, %arg2: i32, %arg3: i32, %arg4: none, %arg5: none, ...) -> (none, i4, !hw.struct<address: i4, data: i32>) attributes {argNames = ["arg0", "arg1", "v", "mem_ld0.data", "mem_st0.done", "argCtrl"], resNames = ["out0", "mem_ld0.addr", "mem_st0"]}
```
This is intended to be a lowering used for transforming Handshake to hardware.
The transformation materializes the top-level `memref` to a new set of in- and output arguments, and plumbs to the in- and output values of the `handshake.extmem` operation which referenced the memref. This is essentially what is was done by `HandshakeToFIRRTL`/`HandshakeToHW` but can be moved as a preprocessing step.
Another motivation for doing so is to create a more natural interface for external users to interact with:
- Store ports now have a single handshake'd output port with combined addr+data.
- Load ports now only requires a single handshake'd data input. This input is then forked into its data+control parts to feed the load- and store operations.
In doing so, we also use `!hw.struct` types to ensure that correct names are being maintained for the in- and output values. This is opposed to using tuples, which does not have named fields.
The tuple requires use of `hw.struct_create` - in HandshakeToHW, this can be supported identically to `handshake.pack` which itself uses `hw.struct_create` after tuple types have been lowered to structs.
This commit exposes a new `bufferRegion` function that is independent of
`handshake.func`. Additionally, it cleans up some of the code in the
buffer insertion.
This PR introduces a handshake pass that adds a lock mechanism for functions.
Such a mechanism ensures that there is only one active control token in a function at each point in time.
An additional "RUN" directive/command was added to the integration tests that otherwise require a task-pipelining transformation. These test runs disable task-pipelining in favor of locking but should still yield the same results (albeit with lower throughput)
* Bump LLVM
Bump LLVM to the current top-of-tree.
This requires changing a few `Attribute`s to the new `TypedAttr`
interface, since attributes no longer carry a type by default. We rely
on this type in quite a few places in the CIRCT codebase, which required
a switch over to `TypedAttr`.
See: https://reviews.llvm.org/D130092
Co-authored-by: John Demme <john.demme@microsoft.com>
* [Conversions][Transforms] Add missing includes due to refactoring in upstream MLIR.
LLVM commit 36d3efea15e6202edd64b05de38d8379e2baddb2 removed a few
include statements from mlir/Pass/Pass.h, so we have to update a few of
the files here to add back in some includes.
* [Handshake][SV] Migrate from StrEnumAttr to EnumAttr.
StrEnumAttr was removed in upstream LLVM commit
60e34f8dddb4a3ae5b82e8d55728c021126c4af8. It has been replaced with
EnumAttr, which can do everything that StrEnumAttr could do but more
efficiently.
This affected two specific attributes: Handshake's BufferType, and SV's
ModportDirectionAttr. Both have been converted over to EnumAttrs. Their
parsers had to be updated, since the new EnumAttr no longer likes having
double quotes around the enum value in the assembly format.
Consequently, the tests were also updated.
* Bump llvm to 1aa4f0bb6cc21b7666718f5534c88d03152ddfb1.
Bump LLVM.
Remove one Handshake test that was testing a verifier that is superseded
by an internal LLVM/MLIR check.
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@sifive.com>
While testing handshake accelerators, it has been a general issue that kernels were not able to be function pipelined. That is, having multiple activations of a kernel live, concurrently.
The reason for this is due to the previous use of control-merge operations at backedges in the CFG (loops). By doing so, multiple kernel invocations are able to enter the loop control network, which itself is already loop pipelining. In most scenarios, this then leads to a deadlock due to insufficient buffering inside the loop, due to having to accomodate multiple invocations + loop pipelining, all at once.
The solution to this is a new structure for loop entry control. Instead of a single control merge to allow for control entry to the loop, we now have a combination of a mux + a loop "priming" register. The priming register drives the mux, and the mux selects from either external control or internal, loop control. By doing so, we essentially guard loop control to be exclusively allowed for that inside of the loop, or an external loop activation. A similar mux structure is used for the inputs to the loop header; block data input muxes are split in two, containing inputs from external loop predecessors and internal loop predecessors, and a mux - driven by the loop priming register - selects between either of these, based on where control originated from.
The loop priming register is driven by the loop exit condition. Thus, when the loop exits, the priming mux is "re-primed" allowing for control to again enter from an external activation.
Related changes:
- Merge op "dataflow conversion check" has been relaxed to now not require that all merge-like ops have an operand from all predecessor blocks. This will not be the case for the externalCtrlMerge/loopCtrlMerge.
- Allow single-input muxes, but canonicalize them away. This simplifies lowering (removes a special case), and is more easily taken care of by a canonicalization pattern.
This reverts commit 8e44671.
The problem which I thought this commit (partially) solved is actually solved by #2444. In light of that, this commit is technically a regression and should be reverted.
This commit adds support for buffer initialization. A list of integer values may be provided to sequential buffers, which will be made the reset values of the buffer registers when lowering to hardware.
In trying to get function pipelining to work, it finally seems like we've found the issue for why this hasn't been working. If we keep simple merges, and then add buffers to their outputs, function pipelining looks to be working.
This commit allows simple merges in canonical handshake IR, but keeps the option of removing them through a separate pass. After buffer insertion, the merges will be redundant, but we must keep them (at least for now) to guide buffer insertion.
Due to the previous implementation of this function being based on a graph traversal which stopped at buffer ops, running bufferAllStrategy did not actually "buffer all" unbuffered channels, if some buffers already existed.
This commit simplifies bufferAllStrategy to do what is promised.