1
   |    7    |     8     |     0     |     4     |  <- average

   |  
10 |    +-------------+
8  |    |             |                    +--------
   |    |             |                    |
4  |----+             |                    |
   |                  |                    |   
   +----5----10----15-+==20====25====30====35----40
                         t ->

Input:
[ (timestamp, new_value), ...]
e.g.: [(0,4), (5,10), (18,0), (35,8), ...]

given:
#define WINDOW_SIZE 10;

Print windowed average:
7, 8, 0, 4, ...

I'm stumped trying to think through the code for this problem. One method I could think of was to break down the (timestamp, value) pairs into their lowest unit, say, 1 time unit, which makes the code easier to write, but is computationally expensive.

Edit: For example, the first value is 7 because I calculate the area under the curve for 5 time steps (4*5 = 20) and the area under the curve for the next 5 time steps (10*5 = 50) and then add these 2 and divide by WINDOW_SIZE (70/10 = 7). However, the problem is that these timestamps are quite arbitrary, i.e, they are not periodic. So how do I know when to stop looking through the array?

Edit: This link seems to be relevant: Calculating moving average in C++

Community
  • 1
  • 1
nirvanaswap
  • 859
  • 2
  • 11
  • 21
  • How about calculating the area under the graph and dividing by time. – Weather Vane Mar 04 '16 at 17:58
  • The problem is the non-periodic nature of the input timestamps. I know to calculate the area under the curve for WINDOW_SIZE periods, but the problem becomes segregating these values into WINDOW_SIZEs, if that makes any sense. – nirvanaswap Mar 04 '16 at 18:03
  • Can't you just advance an index into your data array, calculating the total area, until you reach the frame time? Next frame, advance the index again, adding the rectangular areas. However you show rectangular areas: shouldn't they be trapezial, on the assumption that the value changed evenly, rather than suddenly? If so, you'll have to interpolate a supposed value at the frame time. – Weather Vane Mar 04 '16 at 18:10
  • 1
    Note that the semicolon in the `#define` is almost inevitably incorrect (unwanted). – Jonathan Leffler Mar 04 '16 at 18:29
  • If you want the 'moving average' over 10 time units but your data arrives at indeterminate times, then you need to find a way to treat the array with 4 entries as if it had an entry at each time — which isn't all that hard. You can then calculate as if the entries arrived on schedule, though in practice many consecutive entries will be the same value. If your data arrives at time precisions down to second level but you're interested in values to the minute (moving average over 10 minutes), then you have to decide how to handle entries — is it 600 entries at 1 second intervals that you use? – Jonathan Leffler Mar 04 '16 at 18:34
  • Yes Jonathan, I figured out this approach, and even suggested this in my question... but it's expensive in terms of both space and time... – nirvanaswap Mar 04 '16 at 18:35

2 Answers2

1

The average is the area under the curve (mathematical integral) divided by the period of time. Print the average every time you step over a time threshold:

#define WINDOW_SIZE 10

void runningAverage( int startTime, int time, int value, int *area )
{
    int A = *area;
    for ( int i = startTime/WINDOW_SIZE; i < time/WINDOW_SIZE; i ++ )
    {
        int newT = (i+1) * WINDOW_SIZE;    // timespane to next threshold
        A += ( newT - startTime ) * value; // add sub area
        printf( "%d ", A / WINDOW_SIZE );  // print average

        startTime = newT;                  // set new start time
        A = 0;                             // reset area
    }
    *area = A + (time-startTime) * value;  // add area
}

int main()
{
    int area = 0;
    runningAverage(  0,  5,  4, &area );
    runningAverage(  5, 18, 10, &area );
    runningAverage( 18, 35,  0, &area );
    runningAverage( 35, 40,  8, &area );
    return 0;
}

Output: 7 8 0 4

Rabbid76
  • 202,892
  • 27
  • 131
  • 174
0

So you have real input data that arrives at irregular moments and want to calculate a sliding window/moving average.

My approach would be to have an array (list) of input data/time stamp pairs and construct in another array the interpolated values for the missing time points. Then calculate the window.

The array must be large enough to hold all the moments between two input time stamps. After having calculated the first window, move the array the window size down and continue.

Paul Ogilvie
  • 25,048
  • 4
  • 23
  • 41