
Is there a feasible way to calculate a running average of an incoming stream of sensor data? I get multiple samples per second and the stream can run for several hours, so accumulating the values and keeping track of the number of samples does not seem feasible. Is there a more pragmatic approach to this problem, perhaps at the cost of some accuracy? The best solution I have come up with so far is an IIR implementation, x[n] = 0.99*x[n-1] + 0.01*y, but this does not really give me the average I need.

po.pe
  • are you looking for an overall average or a local average of n datapoints? – user3235916 Jan 30 '22 at 09:14
  • would you be satisfied with the average of a running window of the last x minutes? Maybe even give the previous data some influence on the whole average? – user1984 Jan 30 '22 at 09:15
  • It depends on what kind of accuracy you would need. Say you want some kind of output for graphing over 5s intervals. Then you would be fine just storing the number of datapoints and the accumulated values of those at a 5s resolution. If that is ok, you could use something like rrdtool. – Cheatah Jan 30 '22 at 09:17
  • It depends very much on what the output is supposed to represent. There are many types of [filter](https://en.wikipedia.org/wiki/Filter_(signal_processing)). Do you want the overall average, the short-term average, a responsive average, a sticky average, etc.? Why isn't keeping a running total feasible if you use 64 bits to sum the values? – Weather Vane Jan 30 '22 at 09:17
  • A moving average is not what I'm looking for. Reducing the number of data points, on the other hand, could be possible. Okay, if I do the math right, 64-bit should work as well... it just feels odd. I'll give it a try – po.pe Jan 30 '22 at 09:34

1 Answer


Calculating an exact average requires the sum of samples and a count of samples. There is not really a way around that.

Keeping an average, A_N, and updating it for a new sample:

A_{N+1} = (A_N * N + s_{N+1}) / (N+1)

is completely equivalent to keeping a sum (which is equal to the term A_N * N) so that is not a solution.
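To make the equivalence concrete, here is a minimal sketch in Python (the question does not name a language, and the function names and sample values are made up for illustration). Both functions should return the same average up to floating-point rounding:

```python
def mean_from_sum(samples):
    """Average kept as a running sum and a sample count."""
    total = 0
    count = 0
    for s in samples:
        total += s
        count += 1
    return total / count


def mean_from_update(samples):
    """Average kept directly and updated for each new sample.

    Uses A_{N+1} = (A_N * N + s_{N+1}) / (N + 1); the product A_N * N
    is the running sum in disguise.
    """
    avg = 0.0
    n = 0
    for s in samples:
        avg = (avg * n + s) / (n + 1)
        n += 1
    return avg


samples = [20.5, 21.0, 19.5, 22.0, 20.0]  # made-up sensor readings
print(mean_from_sum(samples), mean_from_update(samples))
```

Either way, a sample count has to be stored, and the product `avg * n` grows exactly as fast as the sum would.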

The potential problem with keeping a sum of samples is that the exact value may exceed the number of significant bits in the representation (whether it is integer or floating-point).

To get around that (in case the maximal integer size is not sufficient), either a library for arbitrarily large integers or a "home-made" solution can be used.
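Whether the maximal integer size is sufficient can be estimated with a quick worst-case calculation. The sketch below assumes unsigned 16-bit samples arriving at 1000 samples per second; both figures are hypothetical, since the question only says "multiple samples a second", so substitute the real sensor range and rate:

```python
# Hypothetical figures, not taken from the question:
SAMPLE_MAX = 2**16 - 1      # assume unsigned 16-bit readings
SAMPLES_PER_SECOND = 1000   # assume a 1 kHz stream
INT64_MAX = 2**63 - 1       # largest signed 64-bit value
INT32_MAX = 2**31 - 1       # largest signed 32-bit value


def seconds_until_overflow(sample_max, rate_hz, limit=INT64_MAX):
    """Worst case: every sample has the maximum possible value."""
    return limit // (sample_max * rate_hz)


seconds_64 = seconds_until_overflow(SAMPLE_MAX, SAMPLES_PER_SECOND)
seconds_32 = seconds_until_overflow(SAMPLE_MAX, SAMPLES_PER_SECOND, INT32_MAX)
years_64 = seconds_64 / (365 * 24 * 3600)
print(f"64-bit sum: about {years_64:.0f} years of headroom")
print(f"32-bit sum: about {seconds_32} seconds of headroom")
```

Under these assumptions a 64-bit accumulator has far more headroom than a run of several hours needs, while a 32-bit one would overflow within a minute.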

A home-made solution could be to keep "buckets of sums", either with a fixed number of samples in each bucket or with a count per bucket. The average could then be calculated as a weighted average of the per-bucket averages using floating-point calculations.
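A rough sketch of the bucket idea in Python follows. The class name `BucketedMean` and its fields are hypothetical, and it uses the variant with a count stored per bucket. Python integers do not overflow, so this only illustrates the bookkeeping; in a language with fixed-width integers, the bucket size would be chosen so that one bucket's sum always fits.

```python
class BucketedMean:
    """Average built from per-bucket sums (hypothetical helper)."""

    def __init__(self, bucket_size):
        self.bucket_size = bucket_size  # samples per full bucket
        self.buckets = []               # (sum, count) of closed buckets
        self.cur_sum = 0                # sum of the bucket being filled
        self.cur_count = 0              # samples in the bucket being filled

    def add(self, sample):
        self.cur_sum += sample
        self.cur_count += 1
        if self.cur_count == self.bucket_size:
            # Close the full bucket and start a fresh one.
            self.buckets.append((self.cur_sum, self.cur_count))
            self.cur_sum = 0
            self.cur_count = 0

    def mean(self):
        parts = list(self.buckets)
        if self.cur_count:
            parts.append((self.cur_sum, self.cur_count))
        total = sum(count for _, count in parts)
        if total == 0:
            raise ValueError("no samples yet")
        # Weighted average: each bucket average is weighted by count / total.
        return sum((s / c) * (c / total) for s, c in parts)


m = BucketedMean(bucket_size=3)
for sample in [1, 2, 3, 4, 5, 6, 7]:
    m.add(sample)
print(m.mean())
```

Note that the list of buckets still grows with the length of the stream, just more slowly by a factor of the bucket size, and the floating-point weighting may introduce small rounding errors.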

nielsen