Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Difftest: delay step for both emu and simv #283

Merged
merged 1 commit into from
Feb 4, 2024
Merged

Conversation

klin02
Copy link
Member

@klin02 klin02 commented Feb 4, 2024

For simv, we delay step to ensure DPIC_step behind DPIC_transfer. For emu, when signal enable is high on current cycle, it will be read by Software, however DPIC will be called next cycle because verilog will use previous cycle signals as always block condition.

Such problem is not exposed when step size is fixed, so emu used to step before DPIC. Now we move delay logic to Gateway, so it can be shared by both emu and simv.

For simv, we delay step to ensure DPIC_step behind DPIC_transfer.
For emu, when signal enable is high on current cycle, it will be read
by Software, however DPIC will be called next cycle because verilog
will use previous cycle signals as always block condition.

Such problem is not exposed when step size is fixed, so emu used to step
before DPIC. Now we move delay logic to Gateway, so it can be shared
by both emu and simv.
@klin02 klin02 requested a review from poemonsense February 4, 2024 13:22
Copy link
Member

@poemonsense poemonsense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.

I would like add some comments on why this happens. Whether the difftest_step is delayed compared with the DPI-C calls is decided by how the clock is toggled. According to https://github.com/OpenXiangShan/difftest/blob/master/src/test/csrc/verilator/emu.cpp#L589, in single_cycle, the clock is first assigned with 1 followed by an eval() call, which forms the posedge of the clock. In this eval(), DPI-C calls will be evaluated, thus transmitting the difftest values from the previous cycle to the C++ data structures. After single_cycle, however, we are seeing this cycle's difftest_step. This misalignment should be fixed in this PR.

However, I'm wondering why there are not any errors previously. Did we previously test the batch or other modes which change the difftest_step every cycle? Why didn't we find this before?

@poemonsense
Copy link
Member

Also, any other signals should we fix for the one-cycle delay? Are there any top-level IOs which has one-cycle latency but we are misusing them previously?

@klin02
Copy link
Member Author

klin02 commented Feb 4, 2024

This looks good to me.这对我来说看起来不错。

I would like add some comments on why this happens. Whether the difftest_step is delayed compared with the DPI-C calls is decided by how the clock is toggled. According to https://github.com/OpenXiangShan/difftest/blob/master/src/test/csrc/verilator/emu.cpp#L589, in single_cycle, the clock is first assigned with 1 followed by an eval() call, which forms the posedge of the clock. In this eval(), DPI-C calls will be evaluated, thus transmitting the difftest values from the previous cycle to the C++ data structures. After single_cycle, however, we are seeing this cycle's difftest_step. This misalignment should be fixed in this PR.我想补充一些关于为什么会发生这种情况的评论。 difftest_step 与 DPI-C 呼叫相比,是否延迟取决于时钟的切换方式。根据 https://github.com/OpenXiangShan/difftest/blob/master/src/test/csrc/verilator/emu.cpp#L589,在single_cycle ,时钟首先被分配为 1,然后是一个 eval() 调用,这构成了时钟的正边缘。在此 eval() 中,将评估DPI-C调用,从而将上一个周期的最差值传输到C++数据结构。然而,在 single_cycle 之后,我们看到这个周期的 difftest_step .这种错位应该在此 PR 中修复。

However, I'm wondering why there are not any errors previously. Did we previously test the batch or other modes which change the difftest_step every cycle? Why didn't we find this before?但是,我想知道为什么以前没有任何错误。我们之前是否测试过改变每个周期的 difftest_step 批处理或其他模式?为什么我们以前没有发现这个?

Previous step size is fixed even with Batch mode, it is batchSize or 0 at every cycle. And difftest_nstep will do nothing is step is 0. So the flow may be step(1th)-DPIC(1th)-step(2nd)-DPIC(2nd). When step size is fixed, step(2nd) may act the same as 1th, even it happens some cycles after DPIC(1th)

@poemonsense
Copy link
Member

Previous step size is fixed even with Batch mode, it is batchSize or 0 at every cycle. And difftest_nstep will do nothing is step is 0. So the flow may be step(1th)-DPIC(1th)-step(2nd)-DPIC(2nd). When step size is fixed, step(2nd) may act the same as 1th, even it happens some cycles after DPIC(1th)

Understood. Make sense.

@klin02
Copy link
Member Author

klin02 commented Feb 4, 2024

Also, any other signals should we fix for the one-cycle delay? Are there any top-level IOs which has one-cycle latency but we are misusing them previously?

The problem with step is the separation of the control side and the data side, which may be easier to cause misalignment? When related IO is used at the same time, the behavior seems more predictive,

@klin02 klin02 merged commit ae9f7b2 into master Feb 4, 2024
5 checks passed
@klin02 klin02 deleted the delay-step-emu branch February 4, 2024 13:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants