Client-side latency far exceeds RPC execution time for long running RPC #119

Open
Stuart0l opened this issue Jan 12, 2025 · 1 comment

Stuart0l commented Jan 12, 2025

Hi Anuj,

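// Set by the continuation once the response has arrived.
bool complete = false;

// Continuation invoked when the RPC completes.
void rpc_cont_func(void *context, void *_tag) { complete = true; }

void client_thread() {
  while (true) {
    // timer start
    rpc_->enqueue_request(session_num_, REQ, &req_, &resp_, rpc_cont_func, nullptr);
    // Poll the event loop until the continuation fires.
    while (!complete)
      rpc_->run_event_loop(1000);
    complete = false;
    // timer stop
  }
}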

My client polls for completion after each RPC and before sending the next one. Something weird happens with a long-running RPC that executes for about 60 us (measured in the server-side handler): the end-to-end latency I measure on the client side is over 200 us. That far exceeds the RPC execution time, so I can't attribute the difference to network delay. I've tried running the RPC in the background by setting req_func_type to kBackground, but the result is the same. Short-running RPCs work fine: an RPC with a few us of execution time results in ~10 us e2e latency on the client side.

This issue is somewhat similar to #116 and #104. The difference is that nothing sleeps on either the client or the server side. Is there any way to debug this, or at least to figure out whether the excess latency comes from the server side or the client side?
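One way to start separating the two sides, sketched here under assumptions rather than taken from the library: have the server handler time itself and echo that duration back in the response, then subtract it from the client-side e2e latency. The `LatencySample` and `LatencyBreakdown` names below are hypothetical helpers, not part of any RPC API, and how the handler time travels in the response is left open.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One sample per completed RPC. All names here are hypothetical.
struct LatencySample {
  double e2e_us;      // client side: from enqueue to continuation
  double handler_us;  // server side: time spent inside the request handler,
                      // assumed to be echoed back in the response payload
};

class LatencyBreakdown {
 public:
  void record(double e2e_us, double handler_us) {
    assert(e2e_us >= 0.0 && handler_us >= 0.0);
    samples_.push_back({e2e_us, handler_us});
  }

  // Median of (e2e - handler): the time not explained by the handler itself,
  // i.e. network, queueing, and event-loop overhead on both sides combined.
  double median_unexplained_us() const {
    assert(!samples_.empty());
    std::vector<double> diff;
    diff.reserve(samples_.size());
    for (const auto &s : samples_) diff.push_back(s.e2e_us - s.handler_us);
    std::sort(diff.begin(), diff.end());
    return diff[diff.size() / 2];
  }

 private:
  std::vector<LatencySample> samples_;
};
```

If the unexplained part stays small for short RPCs and grows only for long ones, the handler time itself is not the culprit. This alone does not say which side adds the delay; it only bounds how much delay there is to explain.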

Stuart0l (Author) commented:

I experimented with busy-waiting for different durations in the RPC handler, and the result is interesting:

wait time (us)    client-side latency (us)
10                24.337
20                34.294
30                44.375
40                54.743
50                228.659
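For anyone reproducing the experiment, a busy-wait of this kind could look like the sketch below. It assumes a spin on `std::chrono::steady_clock`; the helper name `busy_wait_us` is hypothetical and is not necessarily how the handler in these measurements was written.

```cpp
#include <cassert>
#include <chrono>

// Spin (without sleeping or yielding) until at least `us` microseconds have
// elapsed. Returns the elapsed time actually observed, in microseconds.
inline double busy_wait_us(double us) {
  assert(us >= 0.0);
  using clock = std::chrono::steady_clock;
  using micros = std::chrono::duration<double, std::micro>;
  const auto start = clock::now();
  const micros target(us);
  micros elapsed{0};
  do {
    elapsed = clock::now() - start;
  } while (elapsed < target);
  return elapsed.count();
}
```

Calling `busy_wait_us(50.0)` inside the handler keeps the thread on the CPU for the whole interval, so any extra latency cannot come from a sleep or a scheduler wakeup in the handler itself.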

When the wait time is below 50 us, the client-side latency is as expected. The latency shoots up once the wait time reaches 50 us, so it seems pretty obvious that some mechanism kicks in at that point. Any hints on this?
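To make the jump explicit, subtract the wait time from each measured latency. The overhead stays at roughly 14.3 to 14.7 us for waits of 10 to 40 us, then rises to about 178.7 us at 50 us. The small sketch below does this arithmetic on the numbers from the table; `Row`, `overhead_us`, and `first_jump` are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Row {
  double wait_us;     // busy-wait time in the handler
  double latency_us;  // client-side latency
};

// Values copied from the measurements above.
inline std::vector<Row> measured_rows() {
  return {{10.0, 24.337}, {20.0, 34.294}, {30.0, 44.375},
          {40.0, 54.743}, {50.0, 228.659}};
}

// Latency not accounted for by the handler's busy-wait.
inline double overhead_us(const Row &r) { return r.latency_us - r.wait_us; }

// Index of the first row whose overhead exceeds `factor` times the overhead
// of the first row, or rows.size() if there is no such row.
inline std::size_t first_jump(const std::vector<Row> &rows, double factor) {
  assert(!rows.empty() && factor > 1.0);
  const double base = overhead_us(rows.front());
  for (std::size_t i = 1; i < rows.size(); ++i) {
    if (overhead_us(rows[i]) > factor * base) return i;
  }
  return rows.size();
}
```

With a factor of 2, `first_jump(measured_rows(), 2.0)` points at the 50 us row. Repeating the measurement at finer steps between 40 and 50 us would narrow down where the threshold sits.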
