Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rtl] Flush pipe on all CSR modifications #2214

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

GregAC
Copy link
Collaborator

@GregAC GregAC commented Sep 17, 2024

This fixes #2193, an issue that meant bit clears in PMP related CSRs didn't immediately apply to an instruction already in the fetch stage due to a lack of a pipeline flush.

With this change the pipeline will flush in that scenario, fixing the issue. It now flushes the pipeline on all CSR modifications as this makes the pipeline more resliant against similar issues in the future (where the list of CSRs to flush on should have been updated but wasn't).

This fixes lowRISC#2193, an issue that meant bit clears in PMP related CSRs
didn't immediately apply to an instruction already in the fetch stage
due to a lack of a pipeline flush.

With this change the pipeline will flush in that scenario, fixing the
issue. It now flushes the pipeline on all CSR modifications as this
makes the pipeline more resliant against similar issues in the future
(where the list of CSRs to flush on should have been updated but
wasn't).
@GregAC
Copy link
Collaborator Author

GregAC commented Sep 17, 2024

Still need to run a full regression on this as well a specific directed test to ensure the particular bug has indeed been solved.

Copy link
Contributor

@marnovandermaas marnovandermaas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the riscv_multiple_interrupt_test is failing on this commit.

// Flush pipe on any CSR modification. Some CSR modifications alter how instructions execute (e.g.
// the PMP CSRs) so this ensures all instructions always see the latest architectural state when
// entering the fetch stage. This causes some needless flushes but performance impact is
// limited. We have a single fetch stage to flush not many stages of a deep pipeline and CSR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar question: Do you mean something like "We only have a single fetch stage and the pipeline is not deep. Also, CSR instructions are normally rare and aren't part of performance critical parts of the code."

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I do, I shall rewrite it to make it clearer

@GregAC
Copy link
Collaborator Author

GregAC commented Sep 17, 2024

Looks like the riscv_multiple_interrupt_test is failing on this commit.

I think that's a known failure, though may be we're getting more of them now as it's to do with assertions around pipeline flushes and we're doing more pipeline flushes. Could be we should use a less flaky test in our smoke suite.

// the PMP CSRs) so this ensures all instructions always see the latest architectural state when
// entering the fetch stage. This causes some needless flushes but performance impact is
// limited. We have a single fetch stage to flush not many stages of a deep pipeline and CSR
// instructions are in general rare and not part of performance critical parts of the code.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it's a common pattern to do a csrrw sp, mscratch, sp to switch to a dedicated stack for exception handling.

For exception handling it also performs quite a few other CSR reads, so I wouldn't call CSR instructions rare and not part of performance critical code.

Copy link
Collaborator Author

@GregAC GregAC Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are good points.

Though do note we don't flush on read only access. Whilst strictly all CSR instructions do perform modifications CSRRS and CSRRC with an x0 rs1 (the register that specifies which bits to set/clear) and the immediate versions with a 0 immediate do not alter the CSRs and will have csr_op_o in this RTL set to CSR_OP_READ so won't trigger the flush.

Do we expect any CSR writes other than to mscratch during normal exception handling code?

I'm wonder if we should operate a white-list system here where modifications to specific CSRs do not trigger a flush but everything else does. mscratch is a good candidate for that list, anything else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mepc would also be updated in case the exception needs to skip over the current instruction. mstatus may be updated but that actually require flushes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect mepc updates would be less common, not something you'd do in an interrupt handling routine for instance, maybe used in error handling cases but it's reasonable to expect those to be rare. Also used where you're doing software emulation of an instruction where speed would be desired but uncertain how much of that would practically be done on Ibex (the typical use case would be unaligned memory access which we already have hardware handling for).

Still putting mepc in the whitelist seems pretty safe, it's basically a special jump target that gets read out in the ID/EX stage and we'd probably need some radical change to make it have relevance from IF (and hence require flushing on modification).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realised use of ecall will require mepc modification (otherwise you get an infinite loop of ecall). Performance degradation from a pipe flush is modest (a single cycle in cases where you hit in icache or have single cycle access to instruction memory I believe) and there'd only be one mepc write per ecall so I don't think it'd be a huge issue but as above it seems a safe one to whitelist so we can do that.

Copy link
Contributor

@nbdd0121 nbdd0121 Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For exception handling I think it's not uncommon to save mepc & all registers to memory, call some routine, and unconditionally restore mepc. Also, for things like ecall, it's also needed to unconditionally increment by 4. But you're right they're not touched for interrupt handling, which matters more for than synchronous exceptions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pipeline not being flushed on PMP register clears
4 participants