
Sram throughput #279

Closed · wants to merge 3 commits

Conversation

@GregAC (Contributor) commented Oct 28, 2024

This allows us to have back-to-back accesses to SRAM without any stall cycles.

Sadly it increases timing pressure. The difference shouldn't be big: the response FIFO in the SRAM now has to choose between two flopped responses and the response coming directly from SRAM, whereas previously it chose between a single flopped response and the direct SRAM response. That adds at least a logic level, which I guess has pushed a few paths closer to the edge.
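
To make that concrete, here is a minimal sketch of the response-select path implied above; the module, signal names and widths are assumptions for illustration only, not the actual sonata-system RTL:

```systemverilog
// Illustrative sketch only: with a depth-2 response buffer the read-data
// return path picks between two buffered responses plus the live SRAM
// output (a 3:1 select), where previously it was a 2:1 select, hence
// roughly one extra logic level on this path.
module example_rsp_select (
  input  logic [31:0] rsp_buf_q_0_i,  // older buffered response (hypothetical name)
  input  logic [31:0] rsp_buf_q_1_i,  // newer buffered response (hypothetical name)
  input  logic [31:0] sram_rdata_i,   // response direct from the SRAM macro
  input  logic [1:0]  rsp_sel_i,      // which source to return this cycle
  output logic [31:0] rvalid_data_o
);
  always_comb begin
    case (rsp_sel_i)
      2'd0:    rvalid_data_o = rsp_buf_q_0_i;
      2'd1:    rvalid_data_o = rsp_buf_q_1_i;
      default: rvalid_data_o = sram_rdata_i;  // bypass straight from SRAM
    endcase
  end
endmodule
```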

When you build this under Vivado it sometimes reports a timing failure but when you ask for a detailed timing report it changes its mind and says timing is fine. Possibly something to do with the over-constraining @elliotb-lowrisc put in? Or maybe the reported timing is based on a rough timing analysis and the more detailed one shows it's actually fine.

I think this is important to get in: it takes CoreMark from 1.34 to 1.47 (almost a 10% improvement) and it's clearly generally applicable.

Another change to consider is switching to a single-cycle multiplier. Unfortunately that also adds timing pressure, so I've left it off for now (but I did add the Ibex changes so we can choose which multiplier we want). It does still meet timing but pushes synthesis time up yet further.
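
As a rough sketch of what that choice looks like once the vendored Ibex changes are in, assuming the upstream Ibex RV32M parameterisation via ibex_pkg::rv32m_e (the wrapper below is hypothetical, not the actual sonata-system hierarchy):

```systemverilog
// Illustrative only: the multiplier implementation is selected by an
// RV32M parameter fed down from the top level, so switching between the
// fast multi-cycle multiplier and the single-cycle (DSP-based) one is a
// one-line change at the instantiation site.
module example_core_wrapper #(
  parameter ibex_pkg::rv32m_e RV32M = ibex_pkg::RV32MFast  // or ibex_pkg::RV32MSingleCycle
) (
  input logic clk_i,
  input logic rst_ni
);
  // A real wrapper would instantiate the core here and pass .RV32M(RV32M)
  // through; ports and other parameters are omitted in this sketch.
endmodule
```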

@elliotb-lowrisc (Contributor)

Where in the output files is it reporting this timing failure? There are definitely some points in the flow where it will give bad timing estimates. I've had a go at building this with Vivado v2021.1 (64-bit) and couldn't see anything concerning at a glance at the high-level results.

Incidentally, do you know what the register u_sonata_system/u_top_tracing/u_ibex_top/u_ibex_core/cs_registers_i/gen_cntrs[1].gen_imp.mcounters_variable_i/counter_q_reg[*] is for? It keeps popping up as the endpoint of critical paths, but looking in ibex_cs_registers.sv seems to show it as some sort of reserved counter.

@GregAC (Contributor, Author) commented Oct 28, 2024

> Where in the output files is it reporting this timing failure?

When you open the Vivado GUI it's listed in the 'Timing' section of the 'Project Summary' window, with the negative slacks noted in the columns of the 'Design Runs' tab. There's also a critical warning output from the 'Route Design' stage noting 'The design failed to meet timing requirements'.

> Incidentally, do you know what the register u_sonata_system/u_top_tracing/u_ibex_top/u_ibex_core/cs_registers_i/gen_cntrs[1].gen_imp.mcounters_variable_i/counter_q_reg[*] is for?

Yes, it's a performance counter; we could flop the increment signal for these. One cycle of latency on the performance counters is no big deal (other than for the poor person trying to get a cycle-accurate match in a simulator, but we don't do that!). I can have a look at doing this in our cheriot-ibex fork.
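
A minimal sketch of what flopping the increment could look like, assuming a simple 64-bit counter; the module and signal names are hypothetical, not the actual cheriot-ibex code:

```systemverilog
// Illustrative sketch only: registering the increment request costs one
// cycle of counter latency but removes the increment-generation logic
// from the timing path that ends at the counter register.
module example_counter_inc_flop (
  input  logic        clk_i,
  input  logic        rst_ni,
  input  logic        counter_inc_i,  // increment request (hypothetical name)
  output logic [63:0] counter_q_o
);
  logic counter_inc_q;

  always_ff @(posedge clk_i or negedge rst_ni) begin
    if (!rst_ni) begin
      counter_inc_q <= 1'b0;
      counter_q_o   <= '0;
    end else begin
      counter_inc_q <= counter_inc_i;                     // flop the increment
      counter_q_o   <= counter_q_o + 64'(counter_inc_q);  // count one cycle later
    end
  end
endmodule
```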

I've got another easy timing fix I should really get in as well; it didn't have a dramatic effect on fmax but could help overall timing pressure.

@elliotb-lowrisc (Contributor)

> 'The design failed to meet timing requirements'

Odd, that should be correct as it's output after post-route optimisation (in the logs at least). The Timing section of the Project Summary window seems clean in my build + tool-version, so perhaps it's a version-specific bug (not uncommon in my experience). Could also be something to do with the stage-specific clock over-constraining I put in I suppose, if it was not being cleared properly for some reason.

Took me a while to find the Project Summary as I had been opening the Design Checkpoints (.dcp) files. Nice to know there's this nice overview too.

@GregAC (Contributor, Author) commented Oct 28, 2024

> Took me a while to find the Project Summary as I had been opening the Design Checkpoints (.dcp) files. Nice to know there's this nice overview too.

I use the build-gui target of the fusesoc Makefile:

```
make -C ./build/lowrisc_sonata_system_0/synth-vivado/ build-gui
```

That provides the design summary when it opens.

@marnovandermaas (Contributor) left a comment


I hadn't realized this timing failure also occurs without the single-cycle multiplier. I would be hesitant to merge this without first understanding why that negative slack is reported and fixing the constraints.

@GregAC (Contributor, Author) commented Oct 29, 2024

I would guess it's a tooling bug; it seems not to occur in earlier Vivado versions (such as the v2021.1 Elliot is using). I'll download the latest release and try it today.

I would rather not block v1.0 on getting to the bottom of a tooling bug, given that the actual timing analysis reports a pass and we can always use v2021.1 if we're concerned by this. I am keen to get this in for v1.0 as it gives a noticeable performance improvement.

@GregAC (Contributor, Author) commented Oct 29, 2024

The same behaviour (negative slack that disappears when you run a full timing analysis) is seen on v2024.1, but I also see this with current main. We should investigate in more detail, but for now let's just use v2021.1 to create releases: see #289.

@marnovandermaas are you happy to accept this, given it builds fine under v2021.1 and our current main sees a similar failure under v2024.1 (i.e. whatever this issue is, it's already present in sonata-system and not inherent to this change)?

@marnovandermaas dismissed their stale review October 29, 2024 14:36

Also broken on main

@marnovandermaas (Contributor)

I'm not happy that current main is broken on 2024.1, but that is not a problem with this PR.

@GregAC (Contributor, Author) commented Oct 29, 2024

Just done a build of this on v2021.1, rebased on top of main. As reported by @elliotb-lowrisc, it builds fine and reports no timing issues.

@marnovandermaas are you happy to approve this?

@GregAC mentioned this pull request Oct 30, 2024
@marnovandermaas (Contributor)

I am using Vivado 2024.1 and main is passing for me. It is not passing with this PR, even when I open the more detailed timing report.

Need a minimum of 2 (this is what is used in OpenTitan) to enable
back-to-back requests without stall cycles.
Update code from upstream repository
https://github.com/lowrisc/cheriot-ibex.git to revision
ea2df9db3bcea776f0dc72d6d89c31c73798ecd4

* Feed RV32M through ibexc_top_tracing/ibexc_top (Greg Chadwick)
* Switch to no bitmanip by default (Greg Chadwick)
* Feed RV32B through in ibexc_top (Greg Chadwick)

Signed-off-by: Greg Chadwick <[email protected]>
This is effectively a no-op change. Before the latest Ibex was vendored
we had no bitmanip (the RV32BFull parameter was not fully passed
through) and RV32M was the fast multiplier.

Sadly the single-cycle multiplier seems to increase timing pressure. It
does just meet timing but greatly increases synthesis times. As it's
implemented with the FPGA's built-in DSP blocks it shouldn't be a big
issue to use, so this is worth examining later, but for now leave
things as they are.

@marnovandermaas (Contributor) left a comment


Even after a rebase this reports a timing failure on 2024.1 while main passes fine. I'm marking this as 'request changes' until I test it out on 2021.1.

@marnovandermaas (Contributor)

I just confirmed that timing is passing on 2021.1, but I would still prefer we figure out why timing is failing from main to this PR on 2024.1 before merging.

@marnovandermaas (Contributor)

My last force push is a rebase.

@alees24 (Contributor) left a comment


I've built this cleanly on top of latest main this evening using Vivado 2022.2 and tests/test_runner passes (modulo removal of pinmux loopback wire-dependent checks).

@GregAC (Contributor, Author) commented Oct 31, 2024

> but I would still prefer we figure out why timing is failing from main to this PR on 2024.1 before merging.

My view (as stated above) is that this is an important performance improvement that we want for 1.0. It is unfortunate it's having issues with 2024.1, but 2021.1 is the agreed sign-off tool and it works fine under that. We can investigate what's going on with 2024.1 (it could just be changes in the Tcl scripting environment meaning @elliotb-lowrisc's flow improvements aren't working properly), but I don't think it's worth spending time on right now and it certainly shouldn't be required for the 1.0 release.

@GregAC (Contributor, Author) commented Nov 3, 2024

Closing in favour of #314

@GregAC closed this Nov 3, 2024