feat: add cht-sync collector and update the sql exporter example #111

Merged on Oct 18, 2024 (6 commits)
Changes from 5 commits
75 changes: 75 additions & 0 deletions exporters/postgres/cht_sync_collector.yml
@@ -0,0 +1,75 @@
collector_name: cht-sync

# Update the schema and table names as needed
queries:
  - query_name: couch2pg-query
    query: |
      SELECT
        split_part(seq,'-',1) as sequence,
        pending as pending,
        CASE
          WHEN updated_at < NOW() - INTERVAL '1 minute' THEN 0
          ELSE 1
        END AS liveness,
        split_part(source,'/',2) as db,
        split_part(source,'/',1) as cht_instance
      FROM
        v1.couchdb_progress
      WHERE
        source like '%/%' and
        seq like '%-%'
      ORDER BY
        cht_instance, db
  - query_name: dbt-latency
    query: |
      SELECT
        EXTRACT(EPOCH FROM(couchdb.latest - dbt_root.latest)) AS dbt_latency
      FROM
        (SELECT MAX(saved_timestamp) as latest FROM v1.document_metadata) dbt_root,
        (SELECT MAX(saved_timestamp) as latest FROM v1.couchdb) couchdb
Contributor

I'm getting an error:

ts=2024-10-16T22:22:52.357Z caller=klog.go:134 level=error func=Errorf msg="Error gathering metrics: [from Gatherer #1] [job=db_targets,target=db1,collector=cht-sync,query=dbt-latency] pq: relation \"v1.couchdb\" does not exist"
ts=2024-10-16T22:22:53.357Z caller=klog.go:134 level=error func=Errorf msg="Error gathering metrics: [from Gatherer #1] [job=db_targets,target=db1,collector=cht-sync,query=dbt-latency] pq: relation \"v1.couchdb\" does not exist"
ts=2024-10-16T22:22:54.356Z caller=klog.go:134 level=error func=Errorf msg="Error gathering metrics: [from Gatherer #1] [job=db_targets,target=db1,collector=cht-sync,query=dbt-latency] pq: relation \"v1.couchdb\" does not exist"

After digging through the actual data in my DB, I think this might work:

Suggested change
(SELECT MAX(saved_timestamp) as latest FROM v1.couchdb) couchdb
(SELECT max(v1.couchdb_progress.updated_at) as latest FROM v1.couchdb_progress) couchdb
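
For anyone comparing the two variants, here is a rough Python sketch of the arithmetic that `EXTRACT(EPOCH FROM (couchdb.latest - dbt_root.latest))` performs: the difference between two timestamps, expressed in seconds. The function name and the sample timestamps are made up for illustration; the only thing taken from the query is the subtraction order.

```python
from datetime import datetime, timezone


def dbt_latency_seconds(couchdb_latest: datetime, dbt_latest: datetime) -> float:
    """Seconds by which the newest dbt row trails the newest CouchDB-side timestamp."""
    return (couchdb_latest - dbt_latest).total_seconds()


# Illustrative values only, not taken from a real database.
couchdb_latest = datetime(2024, 10, 16, 22, 22, 52, tzinfo=timezone.utc)
dbt_latest = datetime(2024, 10, 16, 22, 21, 22, tzinfo=timezone.utc)
print(dbt_latency_seconds(couchdb_latest, dbt_latest))  # 90.0
```

With the suggested change, the first argument would come from `couchdb_progress.updated_at` rather than a `saved_timestamp` column, so the two variants may not measure exactly the same thing.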

Checking the Prometheus targets at http://localhost:9090/targets?search=, I see http://sql_exporter:9399/metrics marked as UP, and at that URL I see the metrics I'd expect:

# HELP couch2pg_progress_pending approximate number of changes left to sync from couch to postgres
# TYPE couch2pg_progress_pending gauge
couch2pg_progress_pending{cht_instance="192-168-68-26.local-ip.medicmobile.org",db="medic",job="db_targets",target="db1"} 0
# HELP couch2pg_progress_sequence current sequence number for couch2pg
# TYPE couch2pg_progress_sequence counter
couch2pg_progress_sequence{cht_instance="192-168-68-26.local-ip.medicmobile.org",db="medic",job="db_targets",target="db1"} 221
# HELP couch2pg_up 1 if couch2pg is running and has updated in the last minute, 0 if not
# TYPE couch2pg_up gauge
couch2pg_up{cht_instance="192-168-68-26.local-ip.medicmobile.org",db="medic",job="db_targets",target="db1"} 1
# HELP dbt_execution_time dbt run last execution time (ms)
# TYPE dbt_execution_time gauge
dbt_execution_time{job="db_targets",table_name="contact",target="db1"} 0.09859967231750488
dbt_execution_time{job="db_targets",table_name="contact_type",target="db1"} 0.07459473609924316
dbt_execution_time{job="db_targets",table_name="data_record",target="db1"} 0.10951995849609375
dbt_execution_time{job="db_targets",table_name="dbt_results",target="db1"} 0.23592472076416016
dbt_execution_time{job="db_targets",table_name="document_metadata",target="db1"} 0.2093191146850586
dbt_execution_time{job="db_targets",table_name="patient",target="db1"} 0.0865786075592041
dbt_execution_time{job="db_targets",table_name="person",target="db1"} 0.10456228256225586
dbt_execution_time{job="db_targets",table_name="place",target="db1"} 0.16501379013061523
dbt_execution_time{job="db_targets",table_name="user",target="db1"} 0.0803828239440918
# HELP dbt_latency difference between last timestamp in dbt models and current time (seconds)
# TYPE dbt_latency gauge
dbt_latency{job="db_targets",target="db1"} 1180.199879
# HELP scrape_duration_seconds How long it took to scrape the target in seconds
# TYPE scrape_duration_seconds gauge
scrape_duration_seconds{job="db_targets",target="db1"} 0.000738378
# HELP up 1 if the target is reachable, or 0 if the scrape failed
# TYPE up gauge
up{job="db_targets",target="db1"} 1
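
When a panel shows no data even though the exporter is up, one thing worth checking is whether the metric names and labels the panel queries for match what the exporter actually emits. Below is a minimal sketch of that check, assuming the simple label syntax seen in the output above (no escaped quotes inside label values). `parse_samples` and `series_for` are hypothetical helper names, and this is not a complete parser for the Prometheus text format.

```python
import re

# Matches "name{labels} value" or "name value"; assumes no "}" inside label values.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Return (metric_name, labels_dict, value) for every sample line in text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def series_for(text, metric_name):
    """Return (labels, value) pairs for one metric name."""
    return [(labels, value) for name, labels, value in parse_samples(text)
            if name == metric_name]


# Two lines copied from the exporter output above.
SAMPLE = '''
# TYPE couch2pg_progress_pending gauge
couch2pg_progress_pending{cht_instance="192-168-68-26.local-ip.medicmobile.org",db="medic",job="db_targets",target="db1"} 0
up{job="db_targets",target="db1"} 1
'''

for labels, value in series_for(SAMPLE, "couch2pg_progress_pending"):
    print(sorted(labels), value)
```

If the panel filters on a label or metric name that does not appear in this listing, that mismatch would be one possible explanation for an empty panel.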

That said, my couch2pg backlog panel in Grafana shows no data:

[Screenshot: couch2pg backlog panel in Grafana showing no data]

Contributor Author

@mrjones-plip I suspect the first error you got is because your POSTGRES_TABLE env variable was set to medic, but it was updated to couchdb in this commit. I have managed to replicate the "No data" issue, but I don't know why it happens, since the SQL query works as expected. I have booked a 30-minute session in your calendar so we can go through this together and hopefully get this over the line.
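
The kind of mismatch described here can be illustrated with a small sketch that compares the relation implied by the environment with the relations a collector query reads from. This is only an illustration: `POSTGRES_SCHEMA`, the default values, and both helper names are assumptions, and the regular expression only recognises plain `FROM schema.table` references.

```python
import os
import re


def relations_in_query(sql):
    """Collect schema-qualified names that directly follow FROM (simplified)."""
    pattern = r'\bFROM\s+([A-Za-z_]\w*\.[A-Za-z_]\w*)'
    return set(re.findall(pattern, sql, flags=re.IGNORECASE))


def expected_relation(env=None):
    """Build schema.table from environment variables (defaults are assumptions)."""
    env = os.environ if env is None else env
    schema = env.get("POSTGRES_SCHEMA", "v1")   # assumed variable name and default
    table = env.get("POSTGRES_TABLE", "couchdb")
    return f"{schema}.{table}"


# The dbt-latency query from the collector file in this PR.
DBT_LATENCY_SQL = """
SELECT
  EXTRACT(EPOCH FROM(couchdb.latest - dbt_root.latest)) AS dbt_latency
FROM
  (SELECT MAX(saved_timestamp) as latest FROM v1.document_metadata) dbt_root,
  (SELECT MAX(saved_timestamp) as latest FROM v1.couchdb) couchdb
"""

# Simulate an environment where the table is still named "medic".
relation = expected_relation({"POSTGRES_TABLE": "medic"})
if relation not in relations_in_query(DBT_LATENCY_SQL):
    print(f"collector query does not read {relation}; check the table name")
```

In that simulated environment the data would live in `v1.medic` while the query reads `v1.couchdb`, which would be consistent with the "relation does not exist" error.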

  - query_name: dbt-run-stats
    query: |
      SELECT
        status,
        execution_time,
        name as table_name
      FROM
        v1.dbt_results

metrics:
  - metric_name: couch2pg_progress_sequence
    type: counter
    help: 'current sequence number for couch2pg'
    key_labels:
      - db
      - cht_instance
    values: [sequence]
    query_ref: couch2pg-query
  - metric_name: couch2pg_progress_pending
    type: gauge
    help: 'approximate number of changes left to sync from couch to postgres'
    key_labels:
      - db
      - cht_instance
    values: [pending]
    query_ref: couch2pg-query
  - metric_name: couch2pg_up
    type: gauge
    help: '1 if couch2pg is running and has updated in the last minute, 0 if not'
    key_labels:
      - db
      - cht_instance
    values: [liveness]
    query_ref: couch2pg-query
  - metric_name: dbt_latency
    type: gauge
    help: 'difference between last timestamp in dbt models and current time (seconds)'
    values: [dbt_latency]
    query_ref: dbt-latency
  - metric_name: dbt_execution_time
    type: gauge
    help: 'dbt run last execution time (ms)'
    key_labels:
      - table_name
    values: [execution_time]
    query_ref: dbt-run-stats
7 changes: 3 additions & 4 deletions exporters/postgres/sql_servers_example.yml
@@ -16,17 +16,16 @@ global:
max_connection_lifetime: 10m

collector_files:
- "/etc/sql_exporter/couch2pg_collector.yml"
- "/etc/sql_exporter/cht_sync_collector.yml"

jobs:
- job_name: db_targets
collectors: [couch2pg]
collectors: [cht-sync] # change this to [couch2pg] to monitor couch2pg
enable_ping: true
static_configs:
- targets:
# change USERNAME, PASSWORD, DB_SERVER as needed. Likely DATABASE and PORT don't need to change.
# be sure each new server gets a unique name. A good rule of thumb is to use the name of the
# sql server (eg "postgres-rds-prod", "postgres-rds-dev1" etc.)
# 'postgres://USERNAME:PASSWORD@DB_SERVER_IP:PORT/DATABASE'
"db1": 'postgres://cht_couch2pg:[email protected]:5432/cht?sslmode=disable' # //NOSONAR - password is safe to commit

"db1": 'postgres://postgres:[email protected]:5432/data?sslmode=disable' # //NOSONAR - password is safe to commit