
I have this problem where I need to compute a continuous exponential moving average of a value in a discrete data stream. It's impossible to predict when I will receive the next sample, but EMA formulas expect the amount of time between each sample of data to be equal.

I found this article with a demonstration of how to work around this:

double exponentialMovingAverageIrregular( double alpha,
                                          double sample,
                                          double prevSample, 
                                          double deltaTime,
                                          double emaPrev
                                          )
{
   double a = deltaTime / ( 1 - alpha );
   double u = exp( a * -1 );                                  // e^(-a)
   double v = ( 1 - u ) / a;

   double emaNext = ( emaPrev       *           u   )
                  + (    prevSample * (     v - u ) )
                  + (        sample * ( 1 - v     ) );
   return emaNext;
}

I compute alpha by using the following formula: 2 / (period + 1) where period is the number of milliseconds I want my EMA to pay attention to.

When I use this, the EMA moves way too quickly. Even with a 30-minute window, it takes only two or three samples for the EMA to equal the input.

Here are some things I could be doing wrong:

  • I use milliseconds for computing alpha because that's the resolution of the timestamps on my input
  • I use milliseconds for deltaTime because that's what everything else is using
  • Per the suggestion of commenters on the article, I use a = deltaTime / (1 - alpha) instead of a = deltaTime / alpha. Neither fixes the problem, but the latter causes more problems.

Here is a contrived example in which all the samples are exactly one minute apart. When computing alpha, I used 11 * 60 * 1000, or 11 minutes, leaving me with alpha = 0.0000030302984389417593. Notice how each ema has followed the sample almost exactly. This is not supposed to happen with an 11 minute window.

sample 10766.26, ema 10766.260001166664, time 1518991800000
sample 10750.75, ema 10750.750258499216, time 1518991860000
sample 10750.76, ema 10750.759999833333, time 1518991920000
sample 10750.75, ema 10750.750000166665, time 1518991980000
sample 10750.76, ema 10750.759999833333, time 1518992040000
sample 10750.76, ema 10750.759999999998, time 1518992100000
sample 10750.76, ema 10750.759999999998, time 1518992160000
sample 10750,    ema 10750.000012666627, time 1518992220000
sample 10719.99, ema 10719.990500165151, time 1518992280000
sample 10720,    ema 10719.999999833333, time 1518992340000
sample 10719.99, ema 10719.990000166667, time 1518992400000
sample 10719.99, ema 10719.99,           time 1518992460000
sample 10709.27, ema 10709.270178666126, time 1518992520000
sample 10690.26, ema 10690.260316832373, time 1518992580000
sample 10690.27, ema 10690.269999833334, time 1518992640000
sample 10690.27, ema 10690.27,           time 1518992700000
sample 10695,    ema 10694.999921166906, time 1518992760000
sample 10699.98, ema 10699.979917000252, time 1518992820000
sample 10702.05, ema 10702.049965500104, time 1518992880000
sample 10744.99, ema 10744.989284335501, time 1518992940000
sample 10744.12, ema 10744.120014499955, time 1518993000000

The way the function was derived was not explained, and I didn't pay attention in math class. Any pointers would be greatly appreciated.

user3666197
patrickjm
  • P.S. [this](https://stackoverflow.com/questions/1023860/exponential-moving-average-sampled-at-varying-times) question is different from mine because it concerns formulae and not implementation – patrickjm Feb 18 '18 at 21:59
  • Could you also provide some numerical example of the data you feed that results in "_EMA moving way too quickly_"? – SergGr Feb 18 '18 at 23:16
  • @SergGr I've edited the post to include a sample. You'll notice how quickly the ema moves with each new sample. I'm not sure if the function itself is the issue or my parameters (is alpha bad? is deltaTime using the wrong unit of measurement? etc) – patrickjm Feb 19 '18 at 00:01

1 Answer


You Get Exactly What You've Defined:

Given the way you defined alpha, the rest follows as a causal chain:

|>>> a = 60000 / 0.999997
|>>> u = exp( -a )
|>>> v = ( 1 - u ) / a
|>>> u, ( v - u ), ( 1 - v )
( 0.0, 1.6666616666666667e-05, 0.99998333338333334 )

Thus the expression

return ( ( emaPrev       *           u   ) // -> 0.       * emaPrev
       + (    prevSample * (     v - u ) ) // -> 0.000016 *    prevSample
       + (        sample * ( 1 - v     ) ) // -> 0.999983 *        sample
         );                                // ~=                   sample

returns almost exactly the sample (the smoothing effect has effectively been short-circuited out of the would-be smoothing filter).
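The arithmetic above can be reproduced with a short Python sketch. The helper name `irregular_ema_weights` is made up for illustration. It only mirrors the `a`, `u`, `v` lines of the question's function, using `deltaTime` = 60000 ms and `alpha` = 2 / (11 * 60 * 1000 + 1) as given in the question.

```python
import math


def irregular_ema_weights(alpha, delta_time):
    """Return the weights applied to (emaPrev, prevSample, sample).

    Hypothetical helper: it mirrors the a/u/v lines of the question's
    function and is not part of the original article.
    """
    a = delta_time / (1 - alpha)   # as in the question's code
    u = math.exp(-a)               # weight of the previous EMA
    v = (1 - u) / a
    return u, v - u, 1 - v


if __name__ == "__main__":
    period_ms = 11 * 60 * 1000     # 11-minute window from the question
    alpha = 2 / (period_ms + 1)
    w_ema, w_prev, w_sample = irregular_ema_weights(alpha, 60_000)
    print(w_ema, w_prev, w_sample)
```

Because `a` is roughly 60000 here, `exp(-a)` underflows to zero, so the previous EMA contributes nothing and nearly all of the weight lands on the newest sample.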

Signal filtering and smoothing are motivated differently in different fields. Strategies that work well for noisy sensor readouts from mass-bound (inertial) systems need not meet your expectations in other domains, such as quant modelling in trading, where the process under study has no mass or inertia and may exhibit principal discontinuities.

Without question, it is worth spending some time on both the math and the quant side of the subject; both will help you a lot in future work.

user3666197
  • Thanks for your answer. I maybe should have asked: how do I achieve the results I'm looking for? When I use a run-of-the-mill EMA function and sample every second (on a 1 second timer, I sample the most recent value and ignore the rest), I get EMA values that look like what you see on your standard charting tools. I'm trying to reproduce that (without the timer) and I'm lost on what to change to get there. Isn't an EMA an EMA, regardless of context? Surely then this is an issue with my `alpha`? – patrickjm Feb 19 '18 at 20:57
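One possible reading of the behaviour discussed in this thread: inside `exp(-a)`, `a` needs to be a dimensionless ratio of elapsed time to a time constant, whereas `deltaTime / (1 - alpha)` is essentially `deltaTime` itself in milliseconds. The sketch below is an assumption, not the article's or the answerer's method. It replaces `1 - alpha` with a time constant `tau_ms` in the same unit as `delta_ms`, and picks `tau_ms = period_ms / 2` as a rough analogue of `alpha = 2 / (period + 1)`. All names are hypothetical.

```python
import math


def ema_irregular(tau_ms, sample, prev_sample, delta_ms, ema_prev):
    """One step of the question's update, driven by a time constant.

    Assumption: tau_ms is expressed in the same unit as delta_ms
    (milliseconds here), so delta_ms / tau_ms is dimensionless.
    """
    if delta_ms <= 0:
        return ema_prev            # no time elapsed: keep the previous value
    a = delta_ms / tau_ms
    u = math.exp(-a)               # weight of the previous EMA
    v = (1 - u) / a                # linear-interpolation term
    return ema_prev * u + prev_sample * (v - u) + sample * (1 - v)


if __name__ == "__main__":
    tau_ms = (11 * 60 * 1000) / 2  # assumed mapping from an 11-minute window
    samples = [10766.26, 10750.75, 10750.76, 10750.75, 10750.76]  # from the question
    ema = prev = samples[0]
    for s in samples[1:]:
        ema = ema_irregular(tau_ms, s, prev, 60_000, ema)
        prev = s
        print(s, ema)
```

Under these assumptions, with samples one minute apart, the first update moves the EMA less than a tenth of the way from 10766.26 toward 10750.75 instead of jumping straight to the new sample. Whether `period / 2` is the right time constant for a given charting tool is a separate judgment call.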