[tlul,rtl] Correct the calculation of data_intg in tlul_adapter_sram #25925
This is the value that gets supplied as a user field that provides data integrity. If we are responding with an error response, we need to make sure we send integrity bits that correspond.

The logic that was previously here detected this condition with (vld_rd_rsp && reqfifo_rdata.error), but that is wrong because it shouldn't depend on vld_rd_rsp. Imagine we send a TL write with a TL error. When we read the response, the d_error flag will be high (because u_reqfifo contains the faulty TL write) and d_data will be error_blanking_data. But the integrity bits will be SecdedInv3932ZeroEcc because vld_rd_rsp is false (we haven't seen a TL read at all!).

This is also possible to trigger by using only reads. Suppose we send a TL read with a TL error and then, a few cycles later, read the TL response. When we read the response, the d_error flag will again be high (because u_reqfifo contains the faulty TL read). Again, d_data will (correctly) be error_blanking_data. Again, we should be using error_blanking_integ for the integrity bits, but we actually use SecdedInv3932ZeroEcc.

Dropping the vld_rd_rsp term will fix the behaviour in both cases.

To keep things in sync with the RTL, we also drop a conditional coverage exclusion for rom_ctrl. Tracking down how it was actually possible to see this happen is what led us to the design change.

Signed-off-by: Rupert Swarbrick <[email protected]>
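For readers who want to check the two scenarios above without opening the RTL, here is a small behavioural model in Python. It is only a sketch: the function names are made up for illustration, the strings stand in for the real signal values, and the middle branch (forwarding the integrity bits from the response FIFO on a valid read response) is an assumption about the surrounding logic rather than something stated in this description.

```python
# Behavioural model only, not the RTL. Signal names are taken from the
# description; everything else (function names, string placeholders, the
# "valid read response" branch) is an assumption made for illustration.

ERROR_BLANKING_INTEG = "error_blanking_integ"
RSP_DATA_INTG = "rspfifo_rdata.data_intg"  # assumed source for normal reads
ZERO_ECC = "SecdedInv3932ZeroEcc"


def data_intg_old(vld_rd_rsp: bool, req_error: bool) -> str:
    """Previous selection: the error case is gated by vld_rd_rsp."""
    if vld_rd_rsp and req_error:
        return ERROR_BLANKING_INTEG
    if vld_rd_rsp:
        return RSP_DATA_INTG
    return ZERO_ECC


def data_intg_new(vld_rd_rsp: bool, req_error: bool) -> str:
    """Fixed selection: the error case depends on reqfifo_rdata.error only."""
    if req_error:
        return ERROR_BLANKING_INTEG
    if vld_rd_rsp:
        return RSP_DATA_INTG
    return ZERO_ECC


# Both failing scenarios from the description have vld_rd_rsp low while the
# request FIFO entry is flagged as an error.
SCENARIOS = {
    "TL write with TL error": (False, True),
    "TL read with TL error": (False, True),
}

for name, (vld_rd_rsp, req_error) in SCENARIOS.items():
    old = data_intg_old(vld_rd_rsp, req_error)
    new = data_intg_new(vld_rd_rsp, req_error)
    print(f"{name}: old={old}, new={new}")
```

In this model the two functions agree whenever reqfifo_rdata.error is low, so the change only affects error responses.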
Thanks @rswarbrick for the fix and the carefully written description, that's very useful!
IIRC, the issue also affected reads because during a read generating a TL error, the request is not actually sent out to the SRAM and thus no response is obtained from the SRAM and u_rspfifo remains empty, meaning rspfifo_rvalid and thus vld_rd_rsp remain deasserted. Is this correct?
@nasahlpa: according to Rupert's description, we have been generating ECC errors upon TL errors in the past for the SRAMs. Is this in line with your recollection?
CHANGE AUTHORIZED: hw/ip/tlul/rtl/tlul_adapter_sram.sv This PR fixes a bug in the error reporting. Previously, the adapter inserted an ECC error when experiencing a TL-UL error. Fixing this is a good thing.
@vogelpi: Yep, I think so. Indeed, that's what I originally saw when reasoning about things. See the text starting with "This is also possible to trigger".
Thanks for spotting & fixing this!
I've created an issue #25927 as we def. should check the error response of the SRAM controller more carefully.
I think I ran into this issue on Sonata at some point as well, I just didn't know how to fix it. Thanks for this!
CHANGE AUTHORIZED: hw/ip/tlul/rtl/tlul_adapter_sram.sv
@KinzaQamar: Thanks for helping me understand this in the first place.