Add the prometheus-log-parser optional component #473

Merged
merged 13 commits into from
Nov 19, 2024
6 changes: 3 additions & 3 deletions .bumpversion.cfg
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 2.5.5
current_version = 2.6.0
commit = True
tag = False
tag_name = {new_version}
Expand Down Expand Up @@ -30,11 +30,11 @@ search = {current_version}
replace = {new_version}

[bumpversion:file:RELEASE.txt]
search = {current_version} 2024-11-14T21:04:17Z
search = {current_version} 2024-11-19T13:53:14Z
replace = {new_version} {utcnow:%Y-%m-%dT%H:%M:%SZ}

[bumpversion:part:releaseTime]
values = 2024-11-14T21:04:17Z
values = 2024-11-19T13:53:14Z

[bumpversion:file(version):birdhouse/components/canarie-api/docker_configuration.py.template]
search = 'version': '{current_version}'
Expand Down
30 changes: 30 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,36 @@

[//]: # (list changes here, using '-' for each new entry, remove this when items are added)

[2.6.0](https://github.com/bird-house/birdhouse-deploy/tree/2.6.0) (2024-11-19)
------------------------------------------------------------------------------------------------------------------

## Changes

- Add the `prometheus-log-parser` optional component

This component parses log files from other components and converts their logs to prometheus
metrics that are then ingested by the monitoring Prometheus instance (the one created by the
`components/monitoring` component).

For more information on how this component reads log files and converts them to prometheus metrics, see
the [log-parser](https://github.com/DACCS-Climate/log-parser/) documentation.

To configure this component:

* set the `PROMETHEUS_LOG_PARSER_POLL_DELAY` variable to a number of seconds to set how often the log parser
checks if new lines have been added to log files (default: 1)
* set the `PROMETHEUS_LOG_PARSER_TAIL` variable to `"true"` to only parse new lines in log files. If unset,
this will parse all existing lines in the log file as well (default: `"true"`)

To view all metrics exported by the log parser:

* Navigate to the `https://<BIRDHOUSE_FQDN>/prometheus/graph` search page
* Put `{job="log_parser"}` in the search bar and click the "Execute" button

- Update the prometheus version to the current latest `v2.53.3`. This is required to support
loading multiple prometheus scrape configuration files with the `scrape_config_files`
configuration option.

[2.5.5](https://github.com/bird-house/birdhouse-deploy/tree/2.5.5) (2024-11-14)
------------------------------------------------------------------------------------------------------------------

Expand Down
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Generic variables
override SHELL := bash
override APP_NAME := birdhouse-deploy
override APP_VERSION := 2.5.5
override APP_VERSION := 2.6.0

# utility to remove comments after value of an option variable
override clean_opt = $(shell echo "$(1)" | $(_SED) -r -e "s/[ '$'\t'']+$$//g")
Expand Down
8 changes: 4 additions & 4 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -18,13 +18,13 @@ for a full-fledged production platform.
* - citation
- | |citation|

.. |commits-since| image:: https://img.shields.io/github/commits-since/bird-house/birdhouse-deploy/2.5.5.svg
.. |commits-since| image:: https://img.shields.io/github/commits-since/bird-house/birdhouse-deploy/2.6.0.svg
:alt: Commits since latest release
:target: https://github.com/bird-house/birdhouse-deploy/compare/2.5.5...master
:target: https://github.com/bird-house/birdhouse-deploy/compare/2.6.0...master

.. |latest-version| image:: https://img.shields.io/badge/tag-2.5.5-blue.svg?style=flat
.. |latest-version| image:: https://img.shields.io/badge/tag-2.6.0-blue.svg?style=flat
:alt: Latest Tag
:target: https://github.com/bird-house/birdhouse-deploy/tree/2.5.5
:target: https://github.com/bird-house/birdhouse-deploy/tree/2.6.0

.. |readthedocs| image:: https://readthedocs.org/projects/birdhouse-deploy/badge/?version=latest
:alt: ReadTheDocs Build Status (latest version)
Expand Down
2 changes: 1 addition & 1 deletion RELEASE.txt
Original file line number Diff line number Diff line change
@@ -1 +1 @@
2.5.5 2024-11-14T21:04:17Z
2.6.0 2024-11-19T13:53:14Z
1 change: 1 addition & 0 deletions birdhouse/components/README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -276,6 +276,7 @@ Here is a sample setup to test autodeploy:
# then remove the running scheduler container
birdhouse compose up -d --remove-orphans

.. _Monitoring:

Monitoring
==========
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -108,8 +108,8 @@ SERVICES = {
# NOTE:
# Below version and release time auto-managed by 'make VERSION=x.y.z bump'.
# Do NOT modify it manually. See 'Tagging policy' in 'birdhouse/README.rst'.
'version': '2.5.5',
'releaseTime': '2024-11-14T21:04:17Z',
'version': '2.6.0',
'releaseTime': '2024-11-19T13:53:14Z',
'institution': '${BIRDHOUSE_INSTITUTION}',
'researchSubject': '${BIRDHOUSE_SUBJECT}',
'supportEmail': '${BIRDHOUSE_SUPPORT_EMAIL}',
Expand Down Expand Up @@ -141,8 +141,8 @@ PLATFORMS = {
# NOTE:
# Below version and release time auto-managed by 'make VERSION=x.y.z bump'.
# Do NOT modify it manually. See 'Tagging policy' in 'birdhouse/README.rst'.
'version': '2.5.5',
'releaseTime': '2024-11-14T21:04:17Z',
'version': '2.6.0',
'releaseTime': '2024-11-19T13:53:14Z',
'institution': '${BIRDHOUSE_INSTITUTION}',
'researchSubject': '${BIRDHOUSE_SUBJECT}',
'supportEmail': '${BIRDHOUSE_SUPPORT_EMAIL}',
Expand Down
2 changes: 1 addition & 1 deletion birdhouse/components/monitoring/default.env
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ export GRAFANA_VERSION="7.0.3"
export GRAFANA_DOCKER=grafana/grafana
export GRAFANA_IMAGE='${GRAFANA_DOCKER}:${GRAFANA_VERSION}'

export PROMETHEUS_VERSION="v2.19.0"
export PROMETHEUS_VERSION="v2.53.3"
export PROMETHEUS_DOCKER=prom/prometheus
export PROMETHEUS_IMAGE='${PROMETHEUS_DOCKER}:${PROMETHEUS_VERSION}'

Expand Down
3 changes: 3 additions & 0 deletions birdhouse/components/monitoring/prometheus.yml.template
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,9 @@ scrape_configs:
- targets:
- ${BIRDHOUSE_FQDN}:9100

scrape_config_files:
- "/etc/prometheus/scrape_config.d/*.yml"

rule_files:
- "/etc/prometheus/*.rules"

Expand Down
2 changes: 2 additions & 0 deletions birdhouse/components/proxy/nginx.conf.template
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@ http {
include /etc/nginx/mime.types;
default_type application/octet-stream;

# NOTE: if this log_format ever changes, make sure to update the relevant code in the
# prometheus-log-parser component as well to match the change.
log_format main '$remote_addr - $remote_user [$time_iso8601] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
Expand Down
4 changes: 4 additions & 0 deletions birdhouse/env.local.example
Original file line number Diff line number Diff line change
Expand Up @@ -632,6 +632,10 @@ export THREDDS_ADDITIONAL_CATALOG=''
#export ALERTMANAGER_EXTRA_INHIBITION=""
#export ALERTMANAGER_EXTRA_RECEIVERS=""

# Below are for the prometheus-log-parser optional component
#export PROMETHEUS_LOG_PARSER_POLL_DELAY=1 # time in seconds
#export PROMETHEUS_LOG_PARSER_TAIL=true

#############################################################################
# Emu optional vars
#############################################################################
Expand Down
44 changes: 44 additions & 0 deletions birdhouse/optional-components/README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -443,3 +443,47 @@ How to enable X-Robots-Tag Header in ``env.local`` (a copy from `env.local.examp

.. seealso::
See the `env.local.example`_ file for more details about this ``BIRDHOUSE_PROXY_ROOT_LOCATION`` behaviour.

.. _prometheus-log-parser:

Prometheus Log Parser
---------------------

Parses log files from other components and converts their logs to prometheus metrics that are then ingested by the
monitoring Prometheus instance (the one created by the :ref:`Monitoring` component).

For more information on how this component reads log files and converts them to prometheus metrics, see
the log-parser_ documentation.

To configure this component:

* set the ``PROMETHEUS_LOG_PARSER_POLL_DELAY`` variable to a number of seconds to set how often the log parser
  checks if new lines have been added to log files (default: 1)
* set the ``PROMETHEUS_LOG_PARSER_TAIL`` variable to ``"true"`` to only parse new lines in log files. If unset,
  this will parse all existing lines in the log file as well (default: ``"true"``)
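
To make the two settings concrete, here is a minimal sketch of what "tail" and "poll delay" mean for a log reader. It is not the log-parser implementation: ``open_log``, ``read_new_lines`` and ``poll`` are hypothetical helpers written only for illustration.

```python
import os
import tempfile
import time


def open_log(path, tail=True):
    """Open a log file; with tail=True, skip the lines that already exist."""
    handle = open(path, "r", encoding="utf-8")
    if tail:
        handle.seek(0, os.SEEK_END)
    return handle


def read_new_lines(handle):
    """Return the complete lines appended since the previous call."""
    lines = []
    while True:
        position = handle.tell()
        line = handle.readline()
        if not line.endswith("\n"):
            # End of file or a half-written line: rewind and retry on the next poll.
            handle.seek(position)
            break
        lines.append(line.rstrip("\n"))
    return lines


def poll(handle, line_parsers, poll_delay=1.0, rounds=1):
    """Check the file `rounds` times, sleeping `poll_delay` seconds between checks."""
    for round_number in range(rounds):
        for line in read_new_lines(handle):
            for line_parser in line_parsers:
                line_parser(line)
        if round_number < rounds - 1:
            time.sleep(poll_delay)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "access.log")
        with open(path, "w", encoding="utf-8") as log:
            log.write("old line\n")
        seen = []
        with open_log(path, tail=True) as handle:
            with open(path, "a", encoding="utf-8") as log:
                log.write("new line\n")
            poll(handle, [seen.append], poll_delay=0)
        print(seen)
```

In this sketch a larger poll delay means fewer file checks at the cost of metrics lagging behind the log, and opening with ``tail=False`` replays the whole file, which would count historical lines again after a restart.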

To view all metrics exported by the log parser:

* Navigate to the ``https://<BIRDHOUSE_FQDN>/prometheus/graph`` search page
* Put ``{job="log_parser"}`` in the search bar and click the "Execute" button
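
The same selector can also be sent to the Prometheus HTTP API instead of the search page. The sketch below only builds the request URL. It assumes the API is exposed under the same ``/prometheus`` prefix as the graph page, which this document does not confirm, and ``log_parser_query_url`` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def log_parser_query_url(fqdn, selector='{job="log_parser"}'):
    """Build an instant-query URL for the given selector.

    Assumption: Prometheus is served under the /prometheus route prefix,
    so its /api/v1/query endpoint sits at /prometheus/api/v1/query.
    """
    query = urlencode({"query": selector})
    return f"https://{fqdn}/prometheus/api/v1/query?{query}"


if __name__ == "__main__":
    # "birdhouse.example.org" is a placeholder for BIRDHOUSE_FQDN.
    print(log_parser_query_url("birdhouse.example.org"))
```

The resulting URL can be fetched with any HTTP client. Percent-encoding the selector matters because braces and double quotes are not valid unescaped in a query string.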

For developers, to create a new parser that can be used to track log files:

1. create a Python file that can be mounted as a volume to the ``PROMETHEUS_LOG_PARSER_PARSERS_DIR``
   directory on the ``prometheus-log-parser`` container.
2. mount any log files that you want to parse as a volume on the ``prometheus-log-parser`` container.
3. the Python script should create at least one `prometheus metric using the prometheus_client
   library <prometheus_python_metrics_>`_ and must contain a global constant named ``LOG_PARSER_CONFIG``
   which is a dictionary where keys are paths to log files (mounted in the container) and values are a
   list of "line parser" functions.

   * a "line parser" is any function that takes a string as a single argument (a single line from a
     log file). These functions are where you'd write the code that parses the line and converts it
     into a prometheus metric.
   * your line parser function should update one of the prometheus metrics you created previously.
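
The steps above can be sketched as a very small parser file. This is an illustration under stated assumptions, not a parser shipped with this component: the metric name ``example_proxy_responses``, the function ``count_status`` and the log path ``/var/log/example/access.log`` are all hypothetical.

```python
import re

import prometheus_client

# Matches the status code that follows the quoted request in an access log line.
STATUS_REGEX = re.compile(r'"\s(?P<status>\d{3})\s')

# Step 3: create at least one metric with the prometheus_client library.
status_counter = prometheus_client.Counter(
    name="example_proxy_responses",
    documentation="Responses seen in the log, by HTTP status",
    labelnames=("status",),
)


def count_status(line):
    """Line parser: takes one log line and updates the counter if it matches."""
    match = STATUS_REGEX.search(line)
    if match:
        status_counter.labels(status=match.group("status")).inc()


# Keys are log file paths inside the container (hypothetical path here),
# values are lists of line parser functions to call for each line.
LOG_PARSER_CONFIG = {"/var/log/example/access.log": [count_status]}
```

Lines that do not match are simply ignored, so one log file can be shared by several line parsers that each pick out the requests they care about.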

For an example of a working log parser, see
`birdhouse/optional-components/prometheus-log-parser/config/thredds/prometheus-log-exporter.py`_
(:download:`download <birdhouse/optional-components/prometheus-log-parser/config/thredds/prometheus-log-exporter.py>`).

.. _log-parser: https://github.com/DACCS-Climate/log-parser/
.. _prometheus_python_metrics: https://prometheus.github.io/client_python/instrumenting/
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
scrape_configs.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
version: "3.4"
services:
  prometheus:
    volumes:
      - ./optional-components/prometheus-log-parser/config/monitoring/scrape_configs.yml:/etc/prometheus/scrape_config.d/log_parser.yml:ro
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
scrape_configs:
  - job_name: log_parser
    honor_labels: true
    static_configs:
      - targets:
          - prometheus-log-parser:${PROMETHEUS_LOG_PARSER_CLIENT_PORT}
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
version: "3.4"
services:
  proxy:
    volumes:
      - proxy-logs:${PROXY_LOG_DIR}
  prometheus-log-parser:
    volumes:
      - proxy-logs:/var/log/proxy
    environment:
      - PROXY_LOG_FILE=${PROXY_LOG_FILE}

volumes:
  proxy-logs:
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
version: "3.4"
services:
  prometheus-log-parser:
    volumes:
      - ./optional-components/prometheus-log-parser/config/thredds/prometheus-log-exporter.py:${PROMETHEUS_LOG_PARSER_PARSERS_DIR}/thredds-prometheus-log-exporter.py:ro
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
import os
import re

import prometheus_client


# This matches a request to the THREDDS data services as defined in birdhouse/components/thredds/catalog.xml.template
THREDDS_REQ_URI_REGEX = r'\/[^\s]+\/thredds\/(?P<tds_service>dodsC|fileServer|ncss)\/(?P<dataset>[^\s]*)(?:\?(?P<variable>\w+))?'

# This matches the nginx log_format as defined in birdhouse/components/proxy/nginx.conf.template
REGEX = re.compile(
r'(?P<remote_addr>(?:^|\b(?<!\.))(?:1?\d\d?|2[0-4]\d|25[0-5])(?:\.(?:1?\d\d?|2[0-4]\d|25[0-5])){3}(?=$|[^\w.]))\s'
r'-\s'
r'(?P<remote_usr>-|[a-z_][a-z0-9_]{0,30})\s'
r'(?P<date_time>\[(?P<date>\d\d\d\d-\d\d-\d\d)T(?P<time>\d\d:\d\d:\d\d).*\])\s'
r'(?P<request>\"'
r'(?P<req_method>GET|POST|HEAD|PUT|DELETE|CONNECT|OPTIONS|TRACE|PATCH)\s'
fr'(?P<req_uri>{THREDDS_REQ_URI_REGEX})\s'
r'(?P<http_ver>HTTP/\d\.\d)'
r'\")\s'
r'(?P<status>\d{3})\s'
r'(?P<body_byte_sent>\d+)\s'
r'\"(?P<http_referer>[^\s]+)\"\s'
r'\"(?P<user_agent>[^\"]+)\"\s'
r'\"(?P<forward_for>[^\"]+)\"')

LABEL_KEYS = ("remote_addr", "date", "tds_service", "dataset", "variable")

counter = prometheus_client.Counter(
    name="thredds_transfer_size_kb",
    documentation="THREDDS data transferred",
    labelnames=LABEL_KEYS,
    unit="kb",
)


def parse_line(line):
    match = REGEX.match(line)
    if match:
        labels = {label: match.group(label) or "" for label in LABEL_KEYS}
        # Parenthesize the walrus: without the parentheses the name is bound to
        # the boolean result of `is not None` instead of the matched text.
        if (body_byte_sent := match.group("body_byte_sent")) is not None:
            # Convert bytes to kilobytes to match the unit of the metric.
            counter.labels(**labels).inc(int(body_byte_sent) / 1024)

LOG_PARSER_CONFIG = {f"/var/log/proxy/{os.getenv('PROXY_LOG_FILE')}": [parse_line]}
18 changes: 18 additions & 0 deletions birdhouse/optional-components/prometheus-log-parser/default.env
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
export PROMETHEUS_LOG_PARSER_DOCKER=marbleclimate/log-parser
export PROMETHEUS_LOG_PARSER_VERSION=v1.0.1
export PROMETHEUS_LOG_PARSER_IMAGE='${PROMETHEUS_LOG_PARSER_DOCKER}:${PROMETHEUS_LOG_PARSER_VERSION}'
export PROMETHEUS_LOG_PARSER_CLIENT_PORT=8000
export PROMETHEUS_LOG_PARSER_PARSERS_DIR=/parsers.d
export PROMETHEUS_LOG_PARSER_POLL_DELAY=1 # time in seconds
export PROMETHEUS_LOG_PARSER_TAIL=true
export PROMETHEUS_LOG_PARSER_LOG_LEVEL=INFO

OPTIONAL_VARS="
$OPTIONAL_VARS
\$PROMETHEUS_LOG_PARSER_CLIENT_PORT
"

export DELAYED_EVAL="
$DELAYED_EVAL
PROMETHEUS_LOG_PARSER_IMAGE
"
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
version: "3.4"

x-logging: &default-logging
  driver: "json-file"
  options:
    max-size: "50m"
    max-file: "10"

services:
  prometheus-log-parser:
    image: ${PROMETHEUS_LOG_PARSER_IMAGE}
    container_name: prometheus-log-parser
    entrypoint: ["log-parser-prometheus"]
    environment:
      - PROMETHEUS_LOG_PARSER_CLIENT_PORT=${PROMETHEUS_LOG_PARSER_CLIENT_PORT}
      - LOG_PARSER_PARSERS=${PROMETHEUS_LOG_PARSER_PARSERS_DIR}
      - LOG_PARSER_POLL_DELAY=${PROMETHEUS_LOG_PARSER_POLL_DELAY}
      - LOG_PARSER_TAIL=${PROMETHEUS_LOG_PARSER_TAIL}
      - LOG_PARSER_LOG_LEVEL=${PROMETHEUS_LOG_PARSER_LOG_LEVEL}
    restart: always
    logging: *default-logging
4 changes: 2 additions & 2 deletions docs/source/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -69,9 +69,9 @@
# built documents.
#
# The short X.Y version.
version = '2.5.5'
version = '2.6.0'
# The full version, including alpha/beta/rc tags.
release = '2.5.5'
release = '2.6.0'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
Expand Down