Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data gets duplicated 60x in influxdb? #11

Open
doctarrwho opened this issue Aug 8, 2024 · 3 comments
Open

Data gets duplicated 60x in influxdb? #11

doctarrwho opened this issue Aug 8, 2024 · 3 comments

Comments

@doctarrwho
Copy link

After importing the data, I noticed that the raw data included the same values for each minute of the hour. See screenshot:
image

The data returned from SmartHub (for me) seems to be hourly, see sample debug output:

DEBUG: Parsed data from poll endpoint:
&{Status:COMPLETE Data:map[ELECTRIC:[{Type:USAGE Series:[{Name:336288847 Data:[{UnixMillis:1719532800000 Value:2.26} {UnixMillis:1719536400000 Value:2.24} {UnixMillis:1719540000000 Value:0.97} 

I modified the local code to change the WriteMetrics function to write one point per hour which fixed the problem for me, and now I have single row for each hour with hourly value.

func WriteMetrics(records []ElectricUsage, config InfluxConfig) error {
	opts := influxdb2.DefaultOptions()
	if config.Insecure {
		opts.SetTLSConfig(&tls.Config{InsecureSkipVerify: true})
	}
	client := influxdb2.NewClientWithOptions(config.Host, config.AuthToken, opts)
	writeApi := client.WriteAPIBlocking(config.Org, config.Database)
	for _, record := range records {
		// Calculate the duration in hours, assuming the record spans an hour
		hours := record.EndTime.Sub(record.StartTime).Hours()
		if hours == 0 {
			hours = 1 // Avoid division by zero, assume at least one hour
		}

		// Create a point for the record
		point := influxdb2.NewPointWithMeasurement("electric").
			SetTime(record.StartTime).
			AddField("watts", float64(record.WattHours)/hours)

		if record.CostInCents != nil {
			point.AddField("cost", float64(*record.CostInCents)/hours)
		}
		if record.MeterName != nil {
			point.AddTag("name", *record.MeterName)
		}

		// Write the point to the database
		err := writeApi.WritePoint(context.Background(), point)
		if err != nil {
			return err
		}
	}
	return nil
}

I'm not sure if my utility provider returns different data vs. yours, but some customization here would be nice to control the point interval.

@tedpearson
Copy link
Owner

Let me try to explain why the code does what it does here in the WriteMetrics function.

  • I wanted to have a minutely metric with the wattage used in order to have a derived graph of cumulative and 1d/7d average usage. See attached screenshot.
  • The minutely values are actually correct for your data as well, as they indicate the watts used during that minute, as accurately as we know from the data.
  • In your case, with hourly data points, the watt-hours from your utility are equal to the watts used each minute. In my case with 15-minute data points, I had to multiply the watt-hours x4 to get the watts used per minute.

If you have a strong reason why hourly datapoints work better for your use case than minutely datapoints, I could add some sort of config option to toggle this feature.

image

@doctarrwho
Copy link
Author

Minutely metric would make sense if that's what the data was for, but in my case, the hourly metric is returned as kWh and is duplicated 60x (once for each minute) so the end result in the graphs is way off.

image
Data:[
  {UnixMillis:1719532800000 Value:2.26} 
  {UnixMillis:1719536400000 Value:2.24}

In this data sample, 2.26 and 2.24 are saved 60x each, so if I were to look at the cumulative usage, it shows 270kWh instead of 4.5kWh for 2 hours.

Ideally, the points written to influxdb should be based on the interval in the data returned by the utility provider, whether it's an hour (my case), or 15 mins (your case), or anything else.

@tedpearson
Copy link
Owner

Yes, the returned data from smarthub is in kilowatt-hour units. However, currently electric-usage-downloader writes a metric in watts ("realtime" usage), not watt-hours or kilowatt-hours ("cumulative" usage). I wrote it that way because my co-op returns 15-minutely data, and I wanted to see the wattage my home was using, not how many watt-hours were used each 15 minute period.

So if I were to allow for customization, it would be in the unit returned. I would want to have a metric for both watt-hours and for watts, with separate metrics or maybe labels.

But technically speaking, since the unit is watts and not watt-hours, the metric is not duplicated, it's just averaged since there is only hourly data to extrapolate from. Since the metric is in watts, the equation to accumulate it in prometheus or victoriametrics would be sum_over_time(watts[$internal])/60 to get kilowatt-hours. I haven't touched influxql in a while so I'm unsure of the equivalent influx query.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants