I need to compute a continuous exponential moving average (EMA) of a value in a discrete data stream. It's impossible to predict when I will receive the next sample, but EMA formulas expect the amount of time between consecutive samples to be equal.
I found this article with a demonstration of how to work around this:
    double exponentialMovingAverageIrregular( double alpha,
                                              double sample,
                                              double prevSample,
                                              double deltaTime,
                                              double emaPrev
                                            )
    {
        double a = deltaTime / ( 1 - alpha );
        double u = exp( a * -1 ); // e^(-a)
        double v = ( 1 - u ) / a;
        double emaNext = ( emaPrev * u )
                       + ( prevSample * ( v - u ) )
                       + ( sample * ( 1 - v ) );
        return emaNext;
    }
I compute alpha using the following formula: 2 / (period + 1), where period is the number of milliseconds I want my EMA to pay attention to.
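For reference, the alpha calculation described above can be written as a small helper. This is only a sketch in C (assuming the function above is C); the name alphaFromPeriod is made up here, and the period is in whatever unit the caller passes, which is milliseconds in this question.

```c
#include <assert.h>

/* Hypothetical helper: smoothing factor for an EMA over `period` units,
   using the 2 / (period + 1) formula from the question.
   The unit of `period` is whatever the caller passes in. */
double alphaFromPeriod(double period)
{
    assert(period > 0.0); /* a non-positive period makes no sense here */
    return 2.0 / (period + 1.0);
}
```

Calling alphaFromPeriod(11 * 60 * 1000) gives a value of roughly 3.03e-6, so alpha is very close to zero whenever the period is expressed in milliseconds.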
When I use this, the EMA moves way too quickly. Even with a 30-minute window, it takes only two or three samples for the EMA to equal the input.
Here are some things I could be doing wrong:
- I use milliseconds for computing alpha because that's the resolution of the timestamps on my input
- I use milliseconds for deltaTime because that's what everything else is using
- Per the suggestion of commenters on the article, I use a = deltaTime / (1 - alpha) instead of a = deltaTime / alpha. Neither fixes the problem, but the latter causes more problems.
Here is a contrived example in which all the samples are exactly one minute apart. When computing alpha, I used 11 * 60 * 1000, or 11 minutes, leaving me with alpha = 0.0000030302984389417593. Notice how each ema has followed the sample almost exactly. This is not supposed to happen with an 11 minute window.
sample 10766.26, ema 10766.260001166664, time 1518991800000
sample 10750.75, ema 10750.750258499216, time 1518991860000
sample 10750.76, ema 10750.759999833333, time 1518991920000
sample 10750.75, ema 10750.750000166665, time 1518991980000
sample 10750.76, ema 10750.759999833333, time 1518992040000
sample 10750.76, ema 10750.759999999998, time 1518992100000
sample 10750.76, ema 10750.759999999998, time 1518992160000
sample 10750, ema 10750.000012666627, time 1518992220000
sample 10719.99, ema 10719.990500165151, time 1518992280000
sample 10720, ema 10719.999999833333, time 1518992340000
sample 10719.99, ema 10719.990000166667, time 1518992400000
sample 10719.99, ema 10719.99, time 1518992460000
sample 10709.27, ema 10709.270178666126, time 1518992520000
sample 10690.26, ema 10690.260316832373, time 1518992580000
sample 10690.27, ema 10690.269999833334, time 1518992640000
sample 10690.27, ema 10690.27, time 1518992700000
sample 10695, ema 10694.999921166906, time 1518992760000
sample 10699.98, ema 10699.979917000252, time 1518992820000
sample 10702.05, ema 10702.049965500104, time 1518992880000
sample 10744.99, ema 10744.989284335501, time 1518992940000
sample 10744.12, ema 10744.120014499955, time 1518993000000
The way the function was derived was not explained, and I didn't pay attention in math class. Any pointers would be greatly appreciated.