Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[hw,tlul_adapter_reg,rtl] Do not gate a_ready with a_valid #25890

Merged
merged 1 commit into from
Jan 17, 2025

Conversation

Razer6
Copy link
Member

@Razer6 Razer6 commented Jan 15, 2025

This creates a combinational path between the input and output TLUL port. This can be pretty long and impacts the synthesis in high-frequency designs.

Originally, the busy_i signal was gated with a_vali for cases where reg_tops have different clock domains. There, the busy input is a function from the request... At the moment, this causes assertions to fail, e.g., the AReadyKnown assertion in the PWM. This is an IP that uses 2 clock domains.

@Razer6 Razer6 requested review from rswarbrick and vogelpi January 15, 2025 19:39
@rswarbrick
Copy link
Contributor

This might be obvious to others but it took me a while... The way that the version on master uses a_valid dates back to commit a49ceb6 in 2021.

@rswarbrick
Copy link
Contributor

I'm trying to make sure I understand the reason for this proposed change. Is the point that tl_o.a_ready signal was depending on stuff from tl_i which wasn't necessarily on the clock we expect? I'm a bit confused: I would have expected both of those signals to be synchronous to clk_i.

What have I missed?

// busy is selected based on address
// thus if there is no valid transaction, we should ignore busy
a_ready: ~(outstanding_q | tl_i.a_valid & busy_i),
a_ready: ~(outstanding_q | busy_i),
Copy link
Contributor

@vogelpi vogelpi Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work for register interfaces with built in CDC. Because for those, busy_i depends on whether the currently selected address has an ongoing transaction over the CDC. For example, in aon_timer_reg_top.sv:

  // register busy
  logic reg_busy_sel;
  assign reg_busy = reg_busy_sel | shadow_busy;
  always_comb begin
    reg_busy_sel = '0;
    unique case (1'b1)
      addr_hit[1]: begin
        reg_busy_sel = wkup_ctrl_busy;
      end
      addr_hit[2]: begin
        reg_busy_sel = wkup_thold_hi_busy;
      end

Register interfaces without built-in CDC only use shadow_busy which is statically zero when the module comes out of reset.

I agree that the change makes sense in general. But we should think how to handle the case with CDCs. Maybe we could simply factor tl_i.a_valid into busy_i in the reg top if and only if the register interface has a CDC? Then we would at least break this path for most IP blocks? WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds perfectly reasonable to me. When CDC is involved, it is typically done to the AON domain, which is way slower. There, these paths are not that critical I guess. But on the fast main clock it makes sense to cut the path between the input and output port. I added the change to the reg_top template and re-generated all IPs based on that.

@Razer6
Copy link
Member Author

Razer6 commented Jan 17, 2025

I'm trying to make sure I understand the reason for this proposed change. Is the point that tl_o.a_ready signal was depending on stuff from tl_i which wasn't necessarily on the clock we expect? I'm a bit confused: I would have expected both of those signals to be synchronous to clk_i.

What have I missed?

In general, a_valid in that equation adds a combinational path between tl_i and tl_o, which can be a pretty long one. This may lead to problems when trying to synthesize for frequencies > 1 GHz ;-). It seems that a_valid is not needed in the general case but only when dealing with CDC in the reg_top, where busy is a function of the TLUL request. Instead of generally adding a_valid to the equation now, we now only add that where it's really needed.

@vogelpi
Copy link
Contributor

vogelpi commented Jan 17, 2025

Thanks for making the change @Razer6 !

Lint currently fails in CI. The issue is that now English Breakfast is a real top meaning we check in it's autogenerated files, too. You should be able to run make -k -C hw top_and_cmdgen to fix this.

This creates a combinational path between the input and output TLUL
port. This can be pretty long and impacts the synthesis in high-frequency
designs.

For some cases where CDC is involved, busy depends on a_valid. For these
cases, factoring in a_valid is done in the reg_top, explicitly for when
needed. That cuts the path for cases where only the fast main clock is
involved. For CDC cases, there is still this combinational path. But
because that is typically using the way slower AON domain, this is not
that critical there.

Signed-off-by: Robert Schilling <[email protected]>
Copy link
Contributor

@vogelpi vogelpi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @Razer6 !

@vogelpi
Copy link
Contributor

vogelpi commented Jan 17, 2025

CHANGE AUTHORIZED: hw/ip/adc_ctrl/rtl/adc_ctrl_reg_top.sv
CHANGE AUTHORIZED: hw/ip/aon_timer/rtl/aon_timer_reg_top.sv
CHANGE AUTHORIZED: hw/ip/pwm/rtl/pwm_reg_top.sv
CHANGE AUTHORIZED: hw/ip/sysrst_ctrl/rtl/sysrst_ctrl_reg_top.sv
CHANGE AUTHORIZED: hw/ip/tlul/rtl/tlul_adapter_reg.sv
CHANGE AUTHORIZED: hw/ip/usbdev/rtl/usbdev_reg_top.sv
CHANGE AUTHORIZED: hw/top_earlgrey/ip_autogen/clkmgr/rtl/clkmgr_reg_top.sv
CHANGE AUTHORIZED: hw/top_earlgrey/ip_autogen/pinmux/rtl/pinmux_reg_top.sv

The change introduced by this PR doesn't have a functional impact but it alleviates TLUL timing pressure for all IPs not involving CDCs in the register file.

@rswarbrick
Copy link
Contributor

CHANGE AUTHORIZED: hw/ip/adc_ctrl/rtl/adc_ctrl_reg_top.sv
CHANGE AUTHORIZED: hw/ip/aon_timer/rtl/aon_timer_reg_top.sv
CHANGE AUTHORIZED: hw/ip/pwm/rtl/pwm_reg_top.sv
CHANGE AUTHORIZED: hw/ip/sysrst_ctrl/rtl/sysrst_ctrl_reg_top.sv
CHANGE AUTHORIZED: hw/ip/tlul/rtl/tlul_adapter_reg.sv
CHANGE AUTHORIZED: hw/ip/usbdev/rtl/usbdev_reg_top.sv
CHANGE AUTHORIZED: hw/top_earlgrey/ip_autogen/clkmgr/rtl/clkmgr_reg_top.sv
CHANGE AUTHORIZED: hw/top_earlgrey/ip_autogen/pinmux/rtl/pinmux_reg_top.sv

This has been carefully reasoned. Thanks, @Razer6, for explaining what's going on.

@vogelpi vogelpi merged commit d3e7c5a into lowRISC:master Jan 17, 2025
38 checks passed
@Razer6 Razer6 deleted the tlul-timing-fix branch January 17, 2025 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants