Add metrics for WIS2 Gateways #18

Open
2 tasks done
6a6d74 opened this issue Jun 28, 2024 · 13 comments

Comments

@6a6d74
Contributor

6a6d74 commented Jun 28, 2024

  • GTS to WIS2 Gateway
  • WIS2 to GTS Gateway
@6a6d74
Contributor Author

6a6d74 commented Dec 9, 2024

Proposed metrics:

Suggest using the token "gw" to signify gateway metrics - irrespective of whether it's GTS-to-WIS2 or WIS2-to-GTS.

GTS-to-WIS2 gateway

| Name | Labels | Description | Child | Initialize |
| --- | --- | --- | --- | --- |
| wmo_wis2_gw_published_total | CCCC, report_by | Number of messages published, grouped by GTS CCCC code | - | false |
| wmo_wis2_gw_gts_received_total | CCCC, report_by | Number of messages received from GTS, grouped by GTS CCCC code (variance with wmo_wis2_gw_published_total indicates processing failure) | - | false |
| wmo_wis2_gw_published_errors_total | CCCC, report_by | Number of errors occurring while processing messages received from GTS, grouped by GTS CCCC code | - | false |
| wmo_wis2_gw_gts_last_received_timestamp_seconds | CCCC, report_by | Timestamp of most recent message received from GTS, grouped by GTS CCCC code | - | false |
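
As a rough illustration of how the variance between the received and published counters might be read, here is a minimal Python sketch. The function name `processing_shortfall` and the sample counts are hypothetical; a real gateway would expose these values through its metrics endpoint rather than plain dictionaries.

```python
def processing_shortfall(received, published):
    """Return, per CCCC, how many received messages were not published.

    `received` and `published` map a GTS CCCC code to the current value of
    wmo_wis2_gw_gts_received_total and wmo_wis2_gw_published_total.
    Only CCCC codes with a non-zero difference are returned.
    """
    shortfall = {}
    for cccc, n_received in received.items():
        diff = n_received - published.get(cccc, 0)
        if diff != 0:
            shortfall[cccc] = diff
    return shortfall


# Hypothetical sample values, for illustration only.
received = {"EGRR": 120, "EDZW": 80}
published = {"EGRR": 120, "EDZW": 77}
print(processing_shortfall(received, published))
```

A non-empty result would suggest that some messages from that CCCC were received but not published, which is the processing-failure signal described in the table.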

WIS2-to-GTS gateway

Given the similarity between a WIS2-to-GTS gateway and a Global Cache (i.e., they both subscribe to messages, download them, and re-publish them), the metrics proposed are very similar to those already agreed for the Global Cache.

| Name | Labels | Description | Child | Initialize |
| --- | --- | --- | --- | --- |
| wmo_wis2_gw_downloaded_total | centre_id, report_by | Number of data items downloaded by the gateway for republication on GTS | - | false |
| wmo_wis2_gw_downloaded_errors_total | centre_id, report_by | Number of download errors encountered | - | false |
| wmo_wis2_gw_dataserver_status_flag | centre_id, dataserver, report_by | Status of WIS2 Node dataserver (1 = up, 0 = down) | - | false |
| wmo_wis2_gw_dataserver_last_download_timestamp_seconds | centre_id, dataserver, report_by | Timestamp of last successful download | - | false |
| wmo_wis2_gw_integrity_failed_total | centre_id, report_by | Number of messages for which the integrity check failed | - | false |
| wmo_wis2_gw_connected_flag | centre_id, report_by | Connection status to upstream Global Broker | - | false |

wmo_wis2_gw_connected_flag was proposed at the ET-WISOP meeting (Geneva, Dec 2024). However, the consensus at the meeting was that it isn't needed: the status of the upstream Global Brokers can already be determined, because each Global Broker monitors and provides metrics on the status of its peer brokers. The wmo_wis2_gw_dataserver_last_download_timestamp_seconds metric is also useful for detecting problems.
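
To show how a last-download timestamp could be used to detect problems, here is a minimal sketch. The function name `is_download_stale` and the 600-second threshold are assumptions chosen for illustration; nothing in this thread specifies a threshold.

```python
import time


def is_download_stale(last_download_ts, max_age_seconds, now=None):
    """Return True if the last successful download is too old.

    `last_download_ts` is a Unix timestamp in seconds, as a
    *_last_download_timestamp_seconds metric would report it.
    `now` can be passed explicitly to make the check reproducible.
    """
    if now is None:
        now = time.time()
    return (now - last_download_ts) > max_age_seconds


# Hypothetical values: last download at t=1000 s, checked at t=1700 s.
print(is_download_stale(1000, max_age_seconds=600, now=1700))
```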

@6a6d74
Contributor Author

6a6d74 commented Dec 9, 2024

@kaiwirt - what do you think?

@kaiwirt
Collaborator

kaiwirt commented Dec 10, 2024

Do we need to distinguish between the two gateways? For example, use gw for GTS-to-WIS2 and wg for WIS2-to-GTS?

Apart from that, sounds good to me.

@shirley-xuelei

@6a6d74 For the WIS2-to-GTS gateway, here are some suggestions for your reference.

  1. I agree with Kai (@kaiwirt): we should distinguish between the two gateways.
  2. I recommend adding the following metrics:
    wmo_wis2_wg_connected_flag (centre_id|report_by) (optional)
    wmo_wis2_wg_messages_gtsproperties_total (centre_id|report_by)
    wmo_wis2_wg_messages_gtsproperties_invalid_format_total (centre_id|report_by) << some GTS headers in properties.gts may be malformed
    wmo_wis2_wg_published_total (centre_id|report_by) << total files output by the gateway
  3. Is it necessary for us to distinguish 'core' and 'recommended' in some of the metrics?

@6a6d74
Contributor Author

6a6d74 commented Dec 12, 2024

@kaiwirt @shirley-xuelei ... I'll respond to each of your points above in a separate post.

@6a6d74
Contributor Author

6a6d74 commented Dec 12, 2024

Distinguish between the gateways?

Yes. Let's do this.

  • gw for GTS-to-WIS2
  • wg for WIS2-to-GTS
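
A minimal sketch of how the two tokens could be applied when building metric names. The `GATEWAY_TOKENS` mapping and the `metric_name` helper are hypothetical and only illustrate the naming convention agreed above.

```python
# Token per gateway direction, as agreed in this thread.
GATEWAY_TOKENS = {"GTS-to-WIS2": "gw", "WIS2-to-GTS": "wg"}


def metric_name(direction, suffix):
    """Build a metric name such as wmo_wis2_wg_downloaded_total."""
    return f"wmo_wis2_{GATEWAY_TOKENS[direction]}_{suffix}"


print(metric_name("GTS-to-WIS2", "published_total"))
print(metric_name("WIS2-to-GTS", "downloaded_total"))
```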

@6a6d74
Contributor Author

6a6d74 commented Dec 12, 2024

Distinguish 'core' and the 'recommended' in some of the metrics

I don't see any strong requirement to do this. The metrics for Global Services are designed to help diagnose correct/faulty operation. Both core and recommended data are managed in exactly the same way. It might be something we could configure for a sensor centre to monitor - along with other data quality/data availability attributes.

@6a6d74
Contributor Author

6a6d74 commented Dec 12, 2024

wmo_wis2_wg_connected_flag

We discussed this at ET-WISOP and agreed that Global Brokers provide adequate means to determine if their peer GBs are not functioning. As per comment in my original post, I think we don't need this one.

@6a6d74
Contributor Author

6a6d74 commented Dec 12, 2024

wmo_wis2_wg_messages_gtsproperties_total

I would be happy to add this metric. I think this would be the number of messages received that contain properties.gts? Noting that only some of these messages will be processed for download, based on the configured whitelist of GTS Headers.
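
To make the whitelist step concrete, here is a small sketch. The whitelist format (a set of `(ttaaii, cccc)` pairs) and the function name `is_selected_for_download` are assumptions for illustration; the actual gateway configuration may look quite different.

```python
def is_selected_for_download(gts, whitelist):
    """Return True if the message's GTS header is on the whitelist.

    `gts` is the properties.gts object of a notification message and
    `whitelist` is a set of (ttaaii, cccc) pairs. The whitelist format is
    an assumption made for this sketch.
    """
    return (gts.get("ttaaii"), gts.get("cccc")) in whitelist


# Hypothetical whitelist and headers, for illustration only.
whitelist = {("SMUK01", "EGRR"), ("SNDL01", "EDZW")}
print(is_selected_for_download({"ttaaii": "SMUK01", "cccc": "EGRR"}, whitelist))
print(is_selected_for_download({"ttaaii": "FTXX99", "cccc": "EGRR"}, whitelist))
```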

I think it would also be useful to capture the total number of unique messages received by the Gateway - irrespective of whether the messages contain properties.gts. This would be a useful diagnostic to determine if the Gateway is receiving all the necessary messages. This metric would be wmo_wis2_wg_messages_total.

For both wmo_wis2_wg_messages_gtsproperties_total and wmo_wis2_wg_messages_total this should be the total once message de-duplication has happened.
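
The two counters could be derived along these lines. This is only a sketch: it assumes de-duplication on the message `id` field (a real gateway might key on something else, such as `properties.data_id`), and the function name `count_messages` and the sample messages are hypothetical.

```python
def count_messages(messages):
    """Count unique messages and unique messages carrying properties.gts.

    Returns (messages_total, gtsproperties_total), both counted after
    de-duplication. De-duplicating on the `id` field is an assumption.
    """
    seen_ids = set()
    messages_total = 0
    gtsproperties_total = 0
    for msg in messages:
        msg_id = msg["id"]
        if msg_id in seen_ids:
            continue  # duplicate, e.g. received again via another broker
        seen_ids.add(msg_id)
        messages_total += 1
        if "gts" in msg.get("properties", {}):
            gtsproperties_total += 1
    return messages_total, gtsproperties_total


# Hypothetical messages: "a" arrives twice, only "a" has properties.gts.
sample = [
    {"id": "a", "properties": {"gts": {"ttaaii": "SMUK01", "cccc": "EGRR"}}},
    {"id": "a", "properties": {"gts": {"ttaaii": "SMUK01", "cccc": "EGRR"}}},
    {"id": "b", "properties": {}},
]
print(count_messages(sample))
```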

@6a6d74
Contributor Author

6a6d74 commented Dec 12, 2024

wmo_wis2_wg_messages_gtsproperties_invalid_format_total

This would be a useful metric. Validation of ttaaii and cccc should at least cover the length of the strings and their alpha or numeric components. We may also want to check that T1, T2, A1 and A2 are valid letters based on the Tables in Attachment II-5 of the Manual on the Global Telecommunication System (WMO-No. 386).
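
A minimal sketch of the length and alpha/numeric check only. It assumes ttaaii is four uppercase letters followed by two digits and cccc is four uppercase letters; the function name `is_valid_gts_header` is hypothetical, and the table-based checks on T1, T2, A1 and A2 are not attempted here.

```python
import re

# Assumed shapes: TTAAii = 4 letters + 2 digits, CCCC = 4 letters.
TTAAII_RE = re.compile(r"[A-Z]{4}[0-9]{2}")
CCCC_RE = re.compile(r"[A-Z]{4}")


def is_valid_gts_header(gts):
    """Check the length and character classes of ttaaii and cccc.

    `gts` is the properties.gts object of a notification message.
    Missing or non-string values are treated as invalid.
    """
    ttaaii = gts.get("ttaaii")
    cccc = gts.get("cccc")
    if not isinstance(ttaaii, str) or not isinstance(cccc, str):
        return False
    return bool(TTAAII_RE.fullmatch(ttaaii)) and bool(CCCC_RE.fullmatch(cccc))


print(is_valid_gts_header({"ttaaii": "SMUK01", "cccc": "EGRR"}))
print(is_valid_gts_header({"ttaaii": "SMUK1", "cccc": "EGRR"}))
```

Each message failing this check would increment wmo_wis2_wg_messages_gtsproperties_invalid_format_total.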

My question is whether this validation should be done at the Gateway or along with the other message validation tests in the Global Broker. I can make an argument for either case. @golfvert - what do you think?

@6a6d74
Contributor Author

6a6d74 commented Dec 12, 2024

wmo_wis2_wg_published_total

I think the intent of this metric is the same as wmo_wis2_gw_downloaded_total. I don't mind which name we use, but I think we only need one. I have a marginal preference for wmo_wis2_gw_downloaded_total because we're not actually counting the number of data objects/bulletins that the Message Switch is publishing, just the number of data objects that are passed to the Message Switch for onward publication.

@golfvert
Contributor

> My question is whether this validation should be done at the Gateway or along with the other message validation tests in the Global Broker. I can make an argument for either case.

I have a strong preference for NOT doing it at the GB level. GB WNM validation shouldn't be linked to a particular domain. If we implement this at the GB level, then other domains using properties.something could ask for further validation at the GB. GBs are agnostic :)

@6a6d74
Contributor Author

6a6d74 commented Dec 12, 2024

> I have a strong preference for NOT doing it at the GB level.

OK. We should implement wmo_wis2_wg_messages_gtsproperties_invalid_format_total at the Gateway.

6a6d74 added a commit to 6a6d74/wis2-metric-hierarchy that referenced this issue Dec 17, 2024
wg = WIS2-to-GTS Gateway
gw = GTS-to-WIS2 Gateway

See Issue wmo-im#18
6a6d74 changed the title from "Add metrics for WIS2 Gatways" to "Add metrics for WIS2 Gateways" Dec 17, 2024
6a6d74 added a commit to 6a6d74/wis2-metric-hierarchy that referenced this issue Dec 17, 2024
gw.csv -> GTS-to-WIS2 Gateway
wg.csv -> WIS2-to-GTS Gateway

See Issue wmo-im#18

4 participants