-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add cht-sync collector and update the sql exporter example #111
Conversation
@mrjones-plip please have another look now that the docs PR is merged. |
I went to go test and started by setting up all inclusive CHT Sync and hit a snag. however, now I realize I don't have a good way of putting data into that stand alone couchdb instance, so maybe I'll try using an existing docker helper instance to test. stay tuned! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I totally think this should work - I can't get to to work following the new docs we just published :/
I keep on getting the error in the cht-watchdog-sql_exporter-1
service:
ts=2024-09-05T21:14:26.875Z caller=klog.go:157 level=error func=Fatalf msg="Error creating exporter: unknown collector \"cht-sync\" referenced in job \"db_targets\""
Checking, this looks right as the collectors
here matches where we define the collector.
renaming the collector file to exporters/postgres/cht-sync_collector.yml
didn't help, and changing everything to either cht_sync
or chtsync
didn't fix it either 🤷
Happy to pair with you if wanna see what I see!
@mrjones-plip I needed to update |
@@ -29,4 +29,3 @@ jobs: | |||
# sql server (eg "postgres-rds-prod", "postgres-rds-dev1" etc.) | |||
# 'postgres://USERNAME:PASSWORD@DB_SERVER_IP/DATABASE:PORT | |||
"db1": 'postgres://cht_couch2pg:[email protected]:5432/cht?sslmode=disable' # //NOSONAR - password is safe to commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this ever going to be cht
database? I think, for the near term, it's always going to be data
yeah? I got this error initially:
ts=2024-10-01T22:28:33.507Z caller=klog.go:134 level=error func=Errorf msg="Error gathering metrics: [from Gatherer #1] [job=db_targets,target=db1] pq: database \"cht\" does not exist"
So I think this is the fix. Also, since we default to postgres
as the user, I think we should use that here too, yeah? Finally - I think we can further demote couch2pg
by just using password
:
"db1": 'postgres://cht_couch2pg:cht_couch2pg_password@172.17.0.1:5432/cht?sslmode=disable' # //NOSONAR - password is safe to commit | |
"db1": 'postgres://postgres:password@172.17.0.1:5432/code?sslmode=disable' # //NOSONAR - password is safe to commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for circling back on this!
I gave suggestion to ease how much the user has to work (what is the DB name? cht
? data
? other?), but also am getting this error now:
ts=2024-10-01T22:29:22.502Z caller=klog.go:134 level=error func=Errorf msg="Error gathering metrics: [from Gatherer #1] [job=db_targets,target=db1,collector=cht-sync,query=couch2pg-query] pq: syntax error at or near \"{\""
Steps to reproduce were:
- set up docker helper cht core instance
- set up bare bones watchdog to monitor that instance
- set up cht sync to monitor docker helper instance
- follow the steps to update watchdog to monitor CHT sync
- edit
sql_servers.yml
to have thistargets:
:"db1": 'postgres://postgres:[email protected]:5432/data?sslmode=disable'
- run compose command:
docker compose -f docker-compose.yml -f exporters/postgres/compose.yml up -d
- check logs with
docker logs cht-watchdog-sql_exporter-1
@mrjones-plip the error is being caused by how we define the queries with environment variables. I have tried to update it into something like A simple solution is to hard code the default variables as they are defined in CHT Sync and update the documentation to reflect that these should be changed if they are different. What do you think of this approach? |
Sorry this took so long to get back to! I didn't even read it and thought I'd need to set up my who test suite - it was just a question 😭 - I'll do better next time! So I see that we have hard coded variables (or simply no database/schema names?) in the If we must hard code a deeply linked |
Failing to specify the schema would default the queries to running in the public schema which would almost certainly be wrong. I would rather we have the hard-coded schema but make sure we specify in the docs that it should be changed should it be different. |
@mrjones-plip I have added the hard-coded schema and a comment stating the values should be updated as needed. Please review. |
sounds good! I'll get to this review today for sure. |
oh no! I might not get to this 'til tomorrow... I'll keep ya posted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK!!! We're much closer. Thanks for pushing through on this!
I'm still seeing some errors though - see my deep dive one comment.
EXTRACT(EPOCH FROM(couchdb.latest - dbt_root.latest)) AS dbt_latency | ||
FROM | ||
(SELECT MAX(saved_timestamp) as latest FROM v1.document_metadata) dbt_root, | ||
(SELECT MAX(saved_timestamp) as latest FROM v1.couchdb) couchdb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting an error:
ts=2024-10-16T22:22:52.357Z caller=klog.go:134 level=error func=Errorf msg="Error gathering metrics: [from Gatherer #1] [job=db_targets,target=db1,collector=cht-sync,query=dbt-latency] pq: relation \"v1.couchdb\" does not exist"
ts=2024-10-16T22:22:53.357Z caller=klog.go:134 level=error func=Errorf msg="Error gathering metrics: [from Gatherer #1] [job=db_targets,target=db1,collector=cht-sync,query=dbt-latency] pq: relation \"v1.couchdb\" does not exist"
ts=2024-10-16T22:22:54.356Z caller=klog.go:134 level=error func=Errorf msg="Error gathering metrics: [from Gatherer #1] [job=db_targets,target=db1,collector=cht-sync,query=dbt-latency] pq: relation \"v1.couchdb\" does not exist"
Doing some digging in the actual data in my DB, this might work?
(SELECT MAX(saved_timestamp) as latest FROM v1.couchdb) couchdb | |
(SELECT max(v1.couchdb_progress.updated_at) as latest FROM v1.couchdb_progress) couchdb |
Checking prometheus targets at http://localhost:9090/targets?search=
I see http://sql_exporter:9399/metrics
marked as UP
, and checking that URL I see metrics I'd expect:
# HELP couch2pg_progress_pending approximate number of changes left to sync from couch to postgres
# TYPE couch2pg_progress_pending gauge
couch2pg_progress_pending{cht_instance="192-168-68-26.local-ip.medicmobile.org",db="medic",job="db_targets",target="db1"} 0
# HELP couch2pg_progress_sequence current sequence number for couch2pg
# TYPE couch2pg_progress_sequence counter
couch2pg_progress_sequence{cht_instance="192-168-68-26.local-ip.medicmobile.org",db="medic",job="db_targets",target="db1"} 221
# HELP couch2pg_up 1 if couch2pg is running and has updated in the last minute, 0 if not
# TYPE couch2pg_up gauge
couch2pg_up{cht_instance="192-168-68-26.local-ip.medicmobile.org",db="medic",job="db_targets",target="db1"} 1
# HELP dbt_execution_time dbt run last execution time (ms)
# TYPE dbt_execution_time gauge
dbt_execution_time{job="db_targets",table_name="contact",target="db1"} 0.09859967231750488
dbt_execution_time{job="db_targets",table_name="contact_type",target="db1"} 0.07459473609924316
dbt_execution_time{job="db_targets",table_name="data_record",target="db1"} 0.10951995849609375
dbt_execution_time{job="db_targets",table_name="dbt_results",target="db1"} 0.23592472076416016
dbt_execution_time{job="db_targets",table_name="document_metadata",target="db1"} 0.2093191146850586
dbt_execution_time{job="db_targets",table_name="patient",target="db1"} 0.0865786075592041
dbt_execution_time{job="db_targets",table_name="person",target="db1"} 0.10456228256225586
dbt_execution_time{job="db_targets",table_name="place",target="db1"} 0.16501379013061523
dbt_execution_time{job="db_targets",table_name="user",target="db1"} 0.0803828239440918
# HELP dbt_latency difference between last timestamp in dbt models and current time (seconds)
# TYPE dbt_latency gauge
dbt_latency{job="db_targets",target="db1"} 1180.199879
# HELP scrape_duration_seconds How long it took to scrape the target in seconds
# TYPE scrape_duration_seconds gauge
scrape_duration_seconds{job="db_targets",target="db1"} 0.000738378
# HELP up 1 if the target is reachable, or 0 if the scrape failed
# TYPE up gauge
up{job="db_targets",target="db1"} 1
that all said, my couch2pg backlog panel in Grafana shows no data:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mrjones-plip I suspect the first error you got is because your POSTGRES_TABLE
env variable was set to medic but that was updated to couchdb
in this commit. I have managed to replicate the No data issue but I don't know why since the SQL query works as expected. I have booked a 30 minute session in your calendar to go through this together and hopefully get this over the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks thanks THANKS! for the big push to see this through - really good work here.
we got it all working and added the JSON to the PR
approved !
I really appreciate the time you made to review this in a call. Your attention to detail and making sure it works as expected led us to uncover a few issues that we can fix to make this better. |
🎉 This PR is included in version 1.17.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
Add a collector for CHT Sync