Update metrics & ema before breaking the connection loop #4414
Example stats: [2025-01-11T21:54:36.003926852Z INFO solana_metrics::metrics] datapoint: bench_vote_metrics active_connections=0i active_streams=366i new_connections=0i new_streams=0i evictions=0i connection_added_from_staked_peer=0i connection_added_from_unstaked_peer=0i connection_add_failed=0i connection_add_failed_invalid_stream_count=0i connection_add_failed_staked_node=0i connection_add_failed_unstaked_node=0i connection_add_failed_on_pruning=0i connection_removed=0i connection_remove_failed=0i connection_setup_timeout=0i connection_setup_error=0i connection_setup_error_timed_out=0i connection_setup_error_closed=0i connection_setup_error_transport=0i connection_setup_error_app_closed=0i connection_setup_error_reset=0i connection_setup_error_locally_closed=0i connection_rate_limited_across_all=0i connection_rate_limited_per_ipaddr=0i invalid_stream_size=0i packets_allocated=0i packet_batches_allocated=0i packets_sent_for_batching=0i staked_packets_sent_for_batching=0i unstaked_packets_sent_for_batching=0i bytes_sent_for_batching=0i chunks_sent_for_batching=0i packets_sent_to_consumer=0i bytes_sent_to_consumer=0i chunks_processed_by_batcher=0i chunks_received=0i staked_chunks_received=0i unstaked_chunks_received=0i packet_batch_send_error=0i handle_chunk_to_packet_batcher_send_error=0i packet_batches_sent=0i packet_batch_empty=0i stream_read_errors=0i stream_read_timeouts=0i throttled_streams=0i stream_load_ema=917i stream_load_ema_overflow=0i stream_load_capacity_overflow=0i throttled_unstaked_streams=0i throttled_staked_streams=0i process_sampled_packets_us_90pct=0i process_sampled_packets_us_min=0i process_sampled_packets_us_max=0i process_sampled_packets_us_mean=0i process_sampled_packets_count=0i perf_track_overhead_us=0i connection_rate_limiter_length=2i outstanding_incoming_connection_attempts=1i total_incoming_connection_attempts=6916i quic_endpoints_count=32i open_connections=0i refused_connections_too_many_open_connections=0i Note the persistent non 0 
value: active_streams.
@@ -1167,6 +1167,8 @@ async fn handle_connection(
CONNECTION_CLOSE_CODE_INVALID_STREAM.into(),
CONNECTION_CLOSE_REASON_INVALID_STREAM,
);
stats.total_streams.fetch_sub(1, Ordering::Relaxed);
stream_load_ema.update_ema_if_needed();
PR says update metrics but this is a logic change?
This restores the logic that was broken in commit 2be7c2e. We need to do the correct bookkeeping of total_streams and the EMA before breaking out of the connection loop. For example,
we did the following
stats.total_streams.fetch_add(1, Ordering::Relaxed);
before the inner loop.
After the inner loop, we need to correct these counters.
The direct break to the outer loop causes these corrections to be skipped.
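To make the control flow concrete, here is a minimal sketch of the pattern, not the actual handle_connection code. The names Stats, Chunk and the fix_applied flag are hypothetical stand-ins made up for illustration, and the EMA update is left out because its internals are not shown in this thread. Only the fetch_add / fetch_sub pairing around the inner loop is modelled.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for the server stats; only the counter
/// discussed in this thread is modelled.
#[derive(Default)]
pub struct Stats {
    pub total_streams: AtomicUsize,
}

/// Hypothetical outcome of reading one chunk from a stream.
#[derive(Clone, Copy)]
pub enum Chunk {
    Data,
    EndOfStream,
    Invalid,
}

/// Simulates the outer (per-connection) loop. Each element of `streams`
/// is the chunk sequence of one stream. Returns how many streams were
/// accepted before the connection loop ended.
pub fn handle_connection(stats: &Stats, streams: &[Vec<Chunk>], fix_applied: bool) -> usize {
    let mut accepted = 0;
    'conn: for chunks in streams {
        // Counter is bumped before the inner (per-chunk) loop.
        stats.total_streams.fetch_add(1, Ordering::Relaxed);
        accepted += 1;
        for chunk in chunks {
            match chunk {
                Chunk::Data => continue,
                Chunk::EndOfStream => break,
                Chunk::Invalid => {
                    // Breaking straight to the outer loop skips the
                    // decrement below, so it has to be repeated here.
                    if fix_applied {
                        stats.total_streams.fetch_sub(1, Ordering::Relaxed);
                    }
                    break 'conn;
                }
            }
        }
        // Normal bookkeeping after the inner loop.
        stats.total_streams.fetch_sub(1, Ordering::Relaxed);
    }
    accepted
}
```

With fix_applied set to false, a connection that hits an invalid chunk leaves total_streams one higher than it should be, which matches the stuck active_streams value in the example stats. With fix_applied set to true the counter returns to zero.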
Yep, I understand. What I'm saying is that the PR title says "update metrics" but the most important thing in this PR is arguably updating the EMA.
Having both changes in the same PR is fine of course, but please update PR/commit to reflect what's actually being changed. Also we're backporting this to 2.1 yeah?
okay -- updated the title to include ema.
Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.
* Update metrics and ema before breaking connection loop (cherry picked from commit 83919b8)
…ort of #4414) (#4450) Update metrics & ema before breaking the connection loop (#4414) * Update metrics and ema before breaking connection loop (cherry picked from commit 83919b8) Co-authored-by: Lijun Wang <[email protected]>
Problem
It was found that active_streams stays nonzero even when there are no connections to the server. This is because we missed updating the metrics when a connection error occurs while handling chunks.
Summary of Changes
Update metrics and EMA before breaking out of the connection loop.
Fixes #