With significant changes in many of the different components, some of the existing metrics do not make sense anymore. We probably also need a couple of new metrics. This issue tracks what we had in 0.1 and what we want for 0.2.
Some notes on the crossed-out metrics below:
with the prefix store doing the UserMergeUpdate now, tracking the merge_update related metrics is no longer possible or sensible from Rotonda itself, because we can't distinguish an update from an insert. A possible new metric to track here is the number of compare-and-swap attempts the store had to perform, though there might be other, more insightful numbers;
all the task_ related metrics (managed by TokioTaskMetrics) were not actually being tracked;
eventually all BMP state machine related code will move into routecore, we will revisit all those metrics when that happens;
we might want some diagnostics for the new ingress::Register. Moreover, the metrics output probably needs to consult the Register to populate the labels for the endpoint output.
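As a rough illustration of the compare-and-swap metric suggested in the notes above, here is a minimal sketch of counting attempts around a CAS retry loop. It does not reflect how the prefix store is implemented; `CasMetrics` and `update_with_metrics` are hypothetical names, and a plain `AtomicU64` stands in for whatever the store actually swaps.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counters for illustration; not part of Rotonda or the prefix store.
#[derive(Default)]
pub struct CasMetrics {
    /// Total compare-and-swap attempts, including the one that succeeded.
    pub attempts: AtomicU64,
    /// Attempts that failed because another writer got there first.
    pub retries: AtomicU64,
}

/// Applies `f` to the value in `cell` using a compare-and-swap loop and
/// records how many attempts were needed. Returns the value that was stored.
pub fn update_with_metrics(
    cell: &AtomicU64,
    metrics: &CasMetrics,
    f: impl Fn(u64) -> u64,
) -> u64 {
    let mut current = cell.load(Ordering::Acquire);
    loop {
        metrics.attempts.fetch_add(1, Ordering::Relaxed);
        let new = f(current);
        // The strong variant is used so that a failure always means contention.
        match cell.compare_exchange(current, new, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => return new,
            Err(observed) => {
                // Someone else changed the value; retry from what we observed.
                metrics.retries.fetch_add(1, Ordering::Relaxed);
                current = observed;
            }
        }
    }
}
```

In this sketch, `attempts - retries` equals the number of completed updates, so a rising `retries` count relative to `attempts` would hint at contention on the store.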
0.1 metrics (taken from the /status/ endpoint); crossed out means we do not include the metric in v0.2:
version: rotonda/0.1.1-dev
bgp-in num_updates: 3
bgp-in num_dropped_updates: 0
bgp-in last_update: 2024-07-30 10:40:47.909446464 UTC
bgp-in since_last_update: 41
bgp-in update_set_size: 3
bgp-in bgp_tcp_in_listener_bound_count: 1
bgp-in bgp_tcp_in_connection_accepted_count: 1
bgp-in bgp_tcp_in_connection_lost_count: 1
bgp-in bgp_tcp_in_disconnect_count: 0
bmp-in num_updates: 1
bmp-in num_dropped_updates: 0
bmp-in last_update: 2024-07-30 10:40:47.908768858 UTC
Crossed-out 0.1 metrics (not included in v0.2):
bmp-in bmp_state_machine_state router=cavefish: Dumping
bmp-in bmp_state_num_up_peers_eor_capable router=cavefish: 0
bmp-in task_instrumented_count: 0
bmp-in task_dropped_count: 0
bmp-in task_first_poll_count: 0
bmp-in task_total_first_poll_delay: 0
bmp-in task_total_idled_count: 0
bmp-in task_total_idle_duration: 0
bmp-in task_total_scheduled_count: 0
bmp-in task_total_scheduled_duration: 0
bmp-in task_total_poll_count: 0
bmp-in task_total_poll_duration: 0
bmp-in task_total_fast_poll_count: 0
bmp-in task_total_fast_poll_duration: 0
bmp-in task_total_slow_poll_count: 0
bmp-in task_total_slow_poll_duration: 0
bmp-in task_total_short_delay_count: 0
bmp-in task_total_long_delay_count: 0
bmp-in task_total_short_delay_duration: 0
bmp-in task_total_long_delay_duration: 0
rib-in-post task_instrumented_count: 0
rib-in-post task_dropped_count: 0
rib-in-post task_first_poll_count: 0
rib-in-post task_total_first_poll_delay: 0
rib-in-post task_total_idled_count: 0
rib-in-post task_total_idle_duration: 0
rib-in-post task_total_scheduled_count: 0
rib-in-post task_total_scheduled_duration: 0
rib-in-post task_total_poll_count: 0
rib-in-post task_total_poll_duration: 0
rib-in-post task_total_fast_poll_count: 0
rib-in-post task_total_fast_poll_duration: 0
rib-in-post task_total_slow_poll_count: 0
rib-in-post task_total_slow_poll_duration: 0
rib-in-post task_total_short_delay_count: 0
rib-in-post task_total_long_delay_count: 0
rib-in-post task_total_short_delay_duration: 0
rib-in-post task_total_long_delay_duration: 0
rib-in-post rib_unit_update_duration: 0
rib-in-post rib_merge_update_withdrawal_duration le=1: 0
rib-in-post rib_merge_update_withdrawal_duration le=10: 0
rib-in-post rib_merge_update_withdrawal_duration le=100: 0
rib-in-post rib_merge_update_withdrawal_duration le=1000: 0
rib-in-post rib_merge_update_withdrawal_duration le=10000: 0
rib-in-post rib_merge_update_withdrawal_duration le=+Inf: 0
rib-in-post rib_merge_update_withdrawal_duration: 0
rib-in-post rib_merge_update_withdrawal_duration: 0
rib-in-post rib_merge_update_announce_duration le=1: 0
rib-in-post rib_merge_update_announce_duration le=10: 0
rib-in-post rib_merge_update_announce_duration le=100: 0
rib-in-post rib_merge_update_announce_duration le=1000: 0
rib-in-post rib_merge_update_announce_duration le=10000: 0
rib-in-post rib_merge_update_announce_duration le=+Inf: 0
rib-in-post rib_merge_update_announce_duration: 0
rib-in-post rib_merge_update_announce_duration: 0
rib-in-pre task_instrumented_count: 0
rib-in-pre task_dropped_count: 0
rib-in-pre task_first_poll_count: 0
rib-in-pre task_total_first_poll_delay: 0
rib-in-pre task_total_idled_count: 0
rib-in-pre task_total_idle_duration: 0
rib-in-pre task_total_scheduled_count: 0
rib-in-pre task_total_scheduled_duration: 0
rib-in-pre task_total_poll_count: 0
rib-in-pre task_total_poll_duration: 0
rib-in-pre task_total_fast_poll_count: 0
rib-in-pre task_total_fast_poll_duration: 0
rib-in-pre task_total_slow_poll_count: 0
rib-in-pre task_total_slow_poll_duration: 0
rib-in-pre task_total_short_delay_count: 0
rib-in-pre task_total_long_delay_count: 0
rib-in-pre task_total_short_delay_duration: 0
rib-in-pre task_total_long_delay_duration: 0
rib-in-pre rib_unit_update_duration: 0
rib-in-pre rib_merge_update_withdrawal_duration le=1: 0
rib-in-pre rib_merge_update_withdrawal_duration le=10: 0
rib-in-pre rib_merge_update_withdrawal_duration le=100: 0
rib-in-pre rib_merge_update_withdrawal_duration le=1000: 0
rib-in-pre rib_merge_update_withdrawal_duration le=10000: 0
rib-in-pre rib_merge_update_withdrawal_duration le=+Inf: 0
rib-in-pre rib_merge_update_withdrawal_duration: 0
rib-in-pre rib_merge_update_withdrawal_duration: 0
rib-in-pre rib_merge_update_announce_duration le=1: 0
rib-in-pre rib_merge_update_announce_duration le=10: 0
rib-in-pre rib_merge_update_announce_duration le=100: 0
rib-in-pre rib_merge_update_announce_duration le=1000: 0
rib-in-pre rib_merge_update_announce_duration le=10000: 0
rib-in-pre rib_merge_update_announce_duration le=+Inf: 0
rib-in-pre rib_merge_update_announce_duration: 0
rib-in-pre rib_merge_update_announce_duration: 0
New metrics we want in 0.2: