💡 [Feature] Log download stats from THREDDS server #444

huard · 2024-04-05T19:19:12Z

Description

It would be useful for reporting purposes to monitor data downloads from THREDDS:

total volume of downloads (GB/day);
per-file download volume (GB/day);
per-dataset OPeNDAP streaming volume (GB/day).

References

This information can be parsed from NGINX logs, but those logs need to be exposed to Prometheus to be aggregated and archived within the current architecture.

Possible solutions:

https://www.martin-helmich.de/en/blog/monitoring-nginx.html: parses logs and exposes to Prometheus
Loki (Prometheus for logs, integrates with Grafana): it seems possible to configure metrics from logs and expose them to Prometheus
https://github.com/nginxinc/nginx-prometheus-exporter: only exposes content from the API, which is very limited
https://github.com/google/mtail (general purpose log scraper)
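
As a rough illustration of what "parsing this information from NGINX logs" involves, the sketch below sums the bytes sent per file and per OPeNDAP dataset from lines in NGINX's default `combined` log format. The URL markers (`/fileServer/`, `/dodsC/`), the suffix list, and the `summarize` helper are assumptions made for the example, not something taken from the deployed configuration. Grouping by day (from `time_local`) is left out for brevity.

```python
import re
from collections import defaultdict

# NGINX's default "combined" access-log format.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) \S+ (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+)'
)

# Assumed URL markers for the THREDDS services; adjust to the deployment.
FILE_MARKER = "/fileServer/"
OPENDAP_MARKER = "/dodsC/"
# Some of the response suffixes OPeNDAP clients append to a dataset URL.
OPENDAP_SUFFIXES = (".dods", ".dds", ".das", ".ascii", ".html", ".info")


def summarize(lines):
    """Sum bytes sent for successful requests: overall, per file, per dataset."""
    totals = {
        "total": 0,
        "per_file": defaultdict(int),
        "per_dataset": defaultdict(int),
    }
    for line in lines:
        match = LOG_PATTERN.match(line)
        # Skip unparsable lines and anything that is not a 2xx response.
        if match is None or not match["status"].startswith("2"):
            continue
        sent = int(match["body_bytes_sent"])
        path = match["path"].split("?", 1)[0]  # drop the query string
        totals["total"] += sent
        if FILE_MARKER in path:
            totals["per_file"][path.split(FILE_MARKER, 1)[1]] += sent
        elif OPENDAP_MARKER in path:
            dataset = path.split(OPENDAP_MARKER, 1)[1]
            for suffix in OPENDAP_SUFFIXES:
                if dataset.endswith(suffix):
                    dataset = dataset[: -len(suffix)]
                    break
            totals["per_dataset"][dataset] += sent
    return totals
```

Whichever tool is chosen from the list above would need to perform roughly this extraction before exposing the result to Prometheus.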

Additional info

https://www.martin-helmich.de/en/blog/monitoring-nginx.html

Concerned Organizations

fmigneault · 2024-04-08T13:35:18Z

Consider downloads from WPS outputs and STAC data proxy endpoints as well for the same reasons.

huard · 2024-04-26T12:44:15Z

ESGF uses Beats and Logstash to collect logs and compute their stats. See https://drive.google.com/drive/folders/1LbvoYeQ_6L_bzTsO-EEhwqjIx1jZ-G1k

fmigneault · 2024-04-26T20:45:55Z

If the "node collector" can be located on the same instance, logstash seems like an interesting candidate. If there is no distinction between beats or logstash as "log producers", I would favor the 2nd architecture to limit the number of configurations/technologies involved.

## Overview

This version of canarie-api permits running the proxy (nginx) container independently of the canarie-api application. This makes it easier to monitor the logs of canarie-api and proxy containers simultaneously and allows for the configuration files for canarie-api to be mapped to the canarie-api containers where appropriate.

## Changes

**Non-breaking changes**

- New component version canarie-api:1.0.0

**Breaking changes**

## Related Issue / Discussion

- Resolves [issue id](url)

## Additional Information

Links to other issues or sources.

- This might make parsing the nginx logs slightly easier as well which could help with #12 and #444

## CI Operations

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false

huard · 2024-10-04T20:20:50Z

Parser for nginx logs and prometheus counter
https://gist.github.com/huard/25ca5be3479f72546f748da54f7097e7

mishaschwartz · 2024-11-07T18:12:48Z

I've created two PRs to implement log parsing in different ways. I'd like to summarize each below and briefly discuss the pros and cons of each. We should decide here which one we are interested in.

Prometheus log parser: #473

Summary: reads log files with a lightweight python library and converts log lines to metrics using python functions that create metrics using the prometheus python client

Pros:

lightweight and easy to configure
we write the underlying log parser code so we can update it as needed
write custom python functions to convert lines to metrics (simple to write a basic parser and can be as complex as needed within the limits of python)
the prometheus python client is well maintained and is an official prometheus product
the prometheus python client is well documented and very simple, so it's easy to learn
can easily be deployed on multiple machines (supports a federated architecture)

Cons:

we write the underlying log parser code so we need to maintain it
slightly slower than promtail (written in python vs. go)
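
To make the "custom python functions to convert lines to metrics" idea more concrete, here is a minimal standard-library sketch. `LabelledCounter` is a hypothetical stand-in written for this example; it is neither the prometheus python client nor the log-parser API, and the handler name and metric name are likewise assumptions. It only shows the shape of the approach: a plain function inspects a log line and increments a labelled counter, which is then rendered in the Prometheus text exposition format.

```python
from collections import defaultdict


class LabelledCounter:
    """Tiny stand-in for a labelled Prometheus counter (illustration only)."""

    def __init__(self, name, help_text, label_names):
        self.name = name
        self.help_text = help_text
        self.label_names = tuple(label_names)
        self._values = defaultdict(float)

    def inc(self, amount, **labels):
        key = tuple(labels[name] for name in self.label_names)
        self._values[key] += amount

    def render(self):
        """Render the counter in the Prometheus text exposition format.

        Label values are not escaped here; a real client handles that.
        """
        out = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            labels = ",".join(
                f'{name}="{val}"' for name, val in zip(self.label_names, key)
            )
            out.append(f"{self.name}{{{labels}}} {value}")
        return "\n".join(out) + "\n"


def handle_download(line, counter):
    """Hypothetical line handler: add bytes sent for 2xx responses."""
    parts = line.split('"')  # the request is the first quoted field
    if len(parts) < 3:
        return
    request = parts[1].split()          # e.g. ["GET", "/a.nc", "HTTP/1.1"]
    status_and_size = parts[2].split()  # e.g. ["200", "1000"]
    if len(request) < 2 or len(status_and_size) < 2:
        return
    status, size = status_and_size[:2]
    if status.startswith("2") and size.isdigit():
        counter.inc(int(size), path=request[1])
```

The trade-off listed above shows up here: the handler is ordinary Python and can be made as elaborate as needed, but it is code the team has to maintain.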

Promtail and Loki #474

Summary: reads log files with the promtail component and converts log lines to metrics using the metrics pipeline stage. Optionally supports shipping the parsed logs themselves to grafana (through loki) for custom log inspection.

Pros:

slightly faster than the prometheus log parser (written in go vs. python)
part of the grafana stack and an official grafana product
can easily be deployed on multiple machines (supports a federated architecture)
we don't need to maintain the underlying code

Cons:

we can't customize the underlying code
officially, promtail cannot be run without loki so if we just want to generate metrics and nothing else, we can do it but it's a bit of a hack.
writing pipeline stages to extract log lines into metrics is complex and (in my opinion) very poorly documented which makes it very difficult to learn

Why not something else... logstash, beats, fluentbit ...

These could totally work as well... probably. I didn't have time to investigate them all. The main reason why I didn't choose to investigate these options is because for most of them, exporting log data to metrics required additional plugins and the complexity to set them up seemed much higher.

For our goals, I think we can achieve what we want with promtail or the prometheus log exporter. If there's a use-case that we can't achieve with either of those two, I'm happy to look into other technologies, but I'd rather stick with these two options for now.

huard · 2024-11-07T18:55:10Z

Thanks for the overview. I think one challenge we're having by plugging together different servers is the expertise required to configure each one. I don't think we have someone within our group who is fluent in Grafana, for example. I'm concerned that as we add components, we're going to make the problem worse.

In that sense, I'm leaning toward your first approach, which is simple and can be easily extended without delving into yet another configuration format.

fmigneault · 2024-11-08T17:41:55Z

I agree with @huard for the same reasons.

mishaschwartz · 2024-11-08T17:53:14Z

Thanks @huard and @fmigneault for your input. I'm happy with that decision as well. I'll un-draft #473 and close #474.

Once we're happy with #473 we can start adding some of the other metrics discussed here

tlvu · 2024-11-11T14:44:51Z

> I agree with @huard for the same reasons.

Same here.

## Overview

This component parses log files from other components and converts their logs to prometheus metrics that are then ingested by the monitoring Prometheus instance (the one created by the `components/monitoring` component).

For more information on how this component reads log files and converts them to prometheus metrics see the [log-parser](https://github.com/DACCS-Climate/log-parser/) documentation.

To configure this component:

* set the `PROMETHEUS_LOG_PARSER_POLL_DELAY` variable to a number of seconds to set how often the log parser checks if new lines have been added to log files (default: 1)
* set the `PROMETHEUS_LOG_PARSER_TAIL` variable to `"true"` to only parse new lines in log files. If unset, this will parse all existing lines in the log file as well (default: `"true"`)

To view all metrics exported by the log parser:

* Navigate to the `https://<BIRDHOUSE_FQDN>/prometheus/graph` search page
* Put `{job="log_parser"}` in the search bar and click the "Execute" button

Update the prometheus version to the current latest `v2.53.3`. This is required to support loading multiple prometheus scrape configuration files with the `scrape_config_files` configuration option.

## Changes

**Non-breaking changes**

- New component version prometheus:v2.53.3

**Breaking changes**

- None

## Related Issue / Discussion

- #444

## Additional Information

- implements parser given as an example here: #444 (comment)
- this is an alternative to #474. See discussion in #444 to help decide which we should pick.

## CI Operations

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false
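
The polling and tailing behaviour that `PROMETHEUS_LOG_PARSER_POLL_DELAY` and `PROMETHEUS_LOG_PARSER_TAIL` control can be pictured with the sketch below. This is not the log-parser implementation; the `follow` function, its parameters, and the `max_polls` escape hatch (added so the loop can terminate in a demo) are all hypothetical.

```python
import os
import time


def follow(path, poll_delay=1.0, tail=True, max_polls=None):
    """Yield complete lines from `path`, polling for new data.

    poll_delay: seconds to wait when no new data is available.
    tail: if True, skip lines already in the file and only yield new ones.
    max_polls: stop after this many empty polls (None means run forever).
    """
    with open(path, "r", encoding="utf-8") as handle:
        if tail:
            handle.seek(0, os.SEEK_END)  # start after the existing content
        buffer = ""
        polls = 0
        while max_polls is None or polls < max_polls:
            chunk = handle.readline()
            if chunk:
                buffer += chunk
                # Only emit a line once its newline has been written.
                if buffer.endswith("\n"):
                    yield buffer.rstrip("\n")
                    buffer = ""
                continue
            polls += 1
            time.sleep(poll_delay)
```

With `tail=False` the existing contents of the file are replayed first, which corresponds to parsing all existing lines as described above; with `tail=True` only lines appended after start-up are seen.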

huard added the enhancement label (New feature or request) Apr 5, 2024

huard assigned mishaschwartz, tlvu and fmigneault Apr 5, 2024

tlvu assigned Zeitsperre Apr 8, 2024

mishaschwartz mentioned this issue May 7, 2024

Bump canarie api 1.0.0 #452

Merged

This was referenced Nov 5, 2024

Add the prometheus-log-parser optional component #473

Merged

Add the promtail and loki optional components #474

Closed
