The manual for the BLKANDNBLK warning describes that it is safe to
disable that error if the updated ranges are non-overlapping. This
however was not true (see the added t_nba_mixed_update* tests).
In this patch we change V3Delayed to use a new ShadowVarMasked
scheme for variables that have mixed blocking and non-blocking
updates (or the FlagUnique scheme for unpacked variables), which
is in fact safe to use when the updated parts are non-overlapping.
Furthermore, mixed assignments are safe as far as scheduling is
concerned if either:
- They are to independent parts (bits/members/etc) (with this patch)
- Or if the blocking assignment is in clocked (or suspendable) logic.
The risk in scheduling is a race between the Post scheduled NBA
commit, and blocking assignments in combinational logic, which might
order incorrectly.
The second point highlights that we can handle stuff like this safely,
which is sometimes used in testbenches:
```systemverilog
always @(posedge clk) begin
if ($time == 0) a = 0;
end
always @(posedge clk) begin
if ($time > 0) a <= 2;
end
````
The only dangerous case is:
```systemverilog
always @(posedge clk) foo[idx] <= val;
assign foo[0] = bar;
```
Whit this patch, this will still resolve fine at run-time if 'idx' is
never zero, but might resolve incorrectly if 'idx' is zero.
With the above in mind, the BLKANDNBLK warning is now only issued if:
- We can't prove that the assignments are to non-overlapping bits
- And the blocking assignment is in combinational logic
These are the cases that genuinely require user attention to resolve.
With this patch, there are no more BLKANDNBLK warnings in the RTLMeter
designs.
Fixes#6122.
Somewhat commonly, there is code out there that compares an expression (or
variable) against many different constants, e.g. a one-hot decoder:
```systemverilog
assign oneHot = {x == 3, x == 2, x == 1, x == 0};
```
If the width of the expression is sufficiently large, this can blow up
a GCC pass and take an egregious amount of memory and time to compile.
Adding a new DFG pass that will generate a cheap one-hot decoder:
to compute:
```systemverilog
wire [$bits(x)-1:0] idx = <the expression being compared many times>
reg tab [1<<$bits(x)] = '{default: 0};
reg [$bits(x)-1:0] pre = '0;
always_comb begin
tab[pre] = 0;
tab[idx] = 1;
pre = idx ; // This assignment marked to avoid a false UNOPFTLAT
end
```
We then replace the comparisons `x == CONST` with `tab[CONST]`.
This is generally performance neutral, but avoids the compile time and memory
blowup with GCC (128GB+ -> 1GB in one example).
We do not apply this if the comparisons seem to be part of a `COMPARE ?
val : COND` conditional tree, which the C++ compilers can turn into jump
tables.
This enables all XiangShan configurations from RTLMeter to now build with GCC,
so in this patch we enabled those in the nightly runs.
To run scheduled instances of the RTLMeter or coverage workflows, the
ENABLE_SCHEDULED_JOBS variable must explicitly be set to 'true' in the
repository settings. This enables each fork to decide whether to run the
scheduled instances or not.
Add the GitHub Actions workflows for running RTLMeter.
Runs start daily, at 02:00 UTC, on ubuntu-24.04. There are 2 runs:
- Using GCC, with default verilator options
- Using Clang, with "--threads 4"
Each run uses a maximum of 2 runners in parallel (so max 4 in total),
and takes slightly over 2 hours to complete.
The jobs will fail if a benchmark is broken, so this already serves as a
regression test for the included designs.
For now, performance metrics are recorded as artefacts of the run but
not otherwise published.
Performance metrics are always recorded for all successful jobs, even if
some cases are failing.