-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sram throughput #279
Sram throughput #279
Conversation
Where in the output files is it reporting this timing failure? There are definitely some points in the flow where it will give bad timing estimates. I've had a go at building this with Vivado v2021.1 (64-bit) and couldn't see anything concerning at a glance at the high-level results. Incidentally, do you know what the register |
When you open the Vivado GUI it's listed in the 'Timing' section of the 'Project Summary' window, with the negative slacks noted in the columns in the 'Design Runs' tab plus there's a critical warning noting 'The design failed to meet timing requirements' output from the 'Route Design' stage.
Yes it's a performance counter, we could flop the increment signal for these. No big deal if your performance counters have one cycle of latency (other than for the poor person trying to get a cycle accurate match in a simulator but we don't do that!). I can have a look at doing this in our cheriot-ibex fork. I've got another easy timing fix I should really get in as well, didn't have a dramatic effect on fmax but could be good for overall timing pressure. |
Odd, that should be correct as it's output after post-route optimisation (in the logs at least). The Timing section of the Project Summary window seems clean in my build + tool-version, so perhaps it's a version-specific bug (not uncommon in my experience). Could also be something to do with the stage-specific clock over-constraining I put in I suppose, if it was not being cleared properly for some reason. Took me a while to find the Project Summary as I had been opening the Design Checkpoints (.dcp) files. Nice to know there's this nice overview too. |
I use the make -C ./build/lowrisc_sonata_system_0/synth-vivado/ build-gui That provides the design summary when it opens. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't realized this timing failure also occurs without the single cycle multiplier. I would be hesitant to merge this in without first understanding why that negative slack is reported and fix the constraining first.
I would guess it's a tooling bug, seems it doesn't occur in earlier Vivado versions (such as v2021.1 Elliot is using). I'll download the latest release and try it today. I would rather not block v1.0 on getting to the bottom of a tooling bug given the actual timing analysis reports a pass and we can always use v2021.1 if we're concerned by this. I am keen to get this in for v1.0 as it's a noticeable performance impact. |
Same behaviour (negative slack that disappears when you run a full timing analysis) is seen on v2024.1 but I also see this with current @marnovandermaas are you happy accepting this given it builds fine under v2021.1 and our current |
I'm not happy with that current main is broken on 2024, but that is not a problem with this PR. |
Just done a build on v2021.1 of this rebased on top of @marnovandermaas are you happy to approve this? |
I am using Vivado 2024.1 and main is passing for me. It is not with this PR, even when I open the more detailed timing report. |
Need a minimum of 2 (this is what is used in OpenTitan) to enable back to back requests without stall cycles.
Update code from upstream repository https://github.com/lowrisc/cheriot-ibex.git to revision ea2df9db3bcea776f0dc72d6d89c31c73798ecd4 * Feed RV32M through ibexc_top_tracing/ibexc_top (Greg Chadwick) * Switch to no bitmanip by default (Greg Chadwick) * Feed RV32B through in ibexc_top (Greg Chadwick) Signed-off-by: Greg Chadwick <[email protected]>
This is effectively a no-op change. Before the latest Ibex was vendored we had no bitmanip (the RV32BFull parameter was not fully passed through) and RV32M was the fast multiplier. Sadly the single cycle multiplier seems to be increasing timing pressure. It does just meet timing but greatly increases synthesis times. As it's implemented with in-built FPGA DSP blocks it shouldn't be a big issue to use it so something to examine here but for now leave things as they are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even after a rebase this is reporting timing failure on 2024.1 while main is passing fine. I'm marking this as request changes until I test it out on 2021.1
I just confirmed that timing is passing on 2021.1 but I would still prefer we figure out why timing is failing from main to this PR on 2024.1 before merging. |
664c2aa
to
a428412
Compare
My last force push is a rebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've built this cleanly on top of latest main this evening using Vivado 2022.2 and tests/test_runner passes (modulo removal of pinmux loopback wire-dependent checks).
My view (as stated above) is this is an important performance improvement that we want for 1.0. It is unfortunate it's having issues with 2024.1 but 2021.1 is the agreed sign-off tool and it's working fine under that. We can investigate what's going on with 2024.1 (could just be some changes in the TCL scripting environment meaning @elliotb-lowrisc's flow improvements aren't working properly) but I don't think it's worth spending time on right now and certainly shouldn't be required for the 1.0 release. |
Closing in favour of #314 |
This allows us to have back to back accesses to SRAM without any stall cycles.
Sadly it increases timing pressure (shouldn't be a big difference, now the response FIFO in SRAM must choose between two different flopped responses and the response direct from SRAM and previously it chose between a single flopped response and a response direct from SRAM, though will add a logic level at least and I guess that's pushed a few things closer to the edge).
When you build this under Vivado it sometimes reports a timing failure but when you ask for a detailed timing report it changes its mind and says timing is fine. Possibly something to do with the over-constraining @elliotb-lowrisc put in? Or maybe the reported timing is based on a rough timing analysis and the more detailed one shows it's actually fine.
I think this is important to get in, takes coremark from 1.34 to 1.47 (almost 10% improvement) and clearly is generally applicable.
Another change to be considered is switching to a single cycle multiplier. Unfortunately that also adds in more timing pressure so left that off for now (but did add in the Ibex changes so we can choose which multiplier we want). It does still meet timing but pushes synthesis time up yet more.