
Harvester Stopped Reporting #114

Open
Jahorse opened this issue Oct 21, 2023 · 2 comments

Jahorse commented Oct 21, 2023

I'm using chia-exporter v0.10.0 running as a service on Ubuntu 22.04.3 with Chia v2.0.0.

My chia-exporter and my harvester both seem to be running fine, but the harvester metrics the exporter reports haven't updated in a week. It looks like there was an i/o timeout a week ago, and after that only chia_full_node activity was being logged, even though this machine only runs a harvester. I see this in the log:

Oct 14 16:35:42 ... level=info msg="recv: chia_harvester farming_info\n"
Oct 14 16:35:45 ... level=info msg="cron: chia_full_node updating file sizes"
Oct 14 16:36:02 ... level=info msg="cron: chia_full_node updating file sizes"
Oct 14 16:36:02 ... write tcp 127.0.0.1:60624->127.0.0.1:55400: i/o timeout
Oct 14 16:36:02 ... level=error msg="write tcp 127.0.0.1:60624->127.0.0.1:55400: i/o timeout\n"
Oct 14 16:36:02 ... level=error msg="write tcp 127.0.0.1:60624->127.0.0.1:55400: i/o timeout\n"
Oct 14 16:36:02 ... Trying to reconnect...
Oct 14 16:39:35 ... Reconnected!
Oct 14 16:39:35 ... level=info msg="cron: chia_full_node updating file sizes"

From that point on, the log is filled with nothing but those chia_full_node updating file sizes lines. Before that, I saw everything, including the chia_harvester events.

Restarting the service fixed the issue, but it kind of sucks that all the counters got reset. Also, I'd need to be paying really close attention to catch when this happens: the exporter continues to serve metrics, they just don't update.
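
In the meantime, alerting on a harvester counter going flat could catch this without watching the logs. This is just a sketch, and chia_harvester_farming_info is a placeholder name; you'd pick an actual counter from the exporter's /metrics output that increments on every farming_info event:

# Fires when a normally busy harvester counter hasn't changed in 15 minutes.
# chia_harvester_farming_info is a placeholder; substitute a real counter name.
changes(chia_harvester_farming_info[15m]) == 0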

@cmmarslender
Contributor

Thanks for reporting. I think this is happening for another person internally as well. I'm not entirely sure what causes it, but I can look and see if it's possible to detect when events stop coming in and do something about it.

As far as counter resets go, that's just a quirk of how Prometheus counters work. Any metrics here that are counters you'd typically want to query with rate() or similar Prometheus functions that account for counter resets: https://prometheus.io/docs/prometheus/latest/querying/functions/#rate
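
For example (just a sketch with a placeholder metric name; substitute one of the exporter's actual counters):

# Per-second increase over the last 5 minutes; rate() treats any drop in
# the counter as a reset, so restarts don't show up as negative spikes.
rate(chia_harvester_proofs_found_total[5m])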


Jahorse commented Oct 21, 2023

Thanks, yeah, I'm not too worried about the counters resetting or about why it's happening; it would just be nice if the exporter could detect when a service's metrics are no longer being received and restart whatever it needs to fix itself.
