Formula to calculate average number on expanding data array

Question

Say, if I had a final array of numbers, say:

{1, 5, 7, 2}

the average for them will be:

(1 + 5 + 7 + 2) / 4;   //Or, sum of all elements, divided by their number

But what if my array is constantly growing and I need to know the current average number at an instance of time when the full array is not known yet. How do you calculate that?

Say, like when I'm trying to display current data transfer rate.

For data transfer rate you really want to use a weighted moving average and not a straight average over the whole transfer. — Yaur, May 23 '13 at 21:58
possible duplicate of [How to calculate simple moving average faster in C#?](http://stackoverflow.com/questions/12884600/how-to-calculate-simple-moving-average-faster-in-c) — Yuck, May 23 '13 at 21:59

score 3 · Answer 1 · answered May 23 '13 at 22:11

Every time you get a new value update your sum and count and simply divide the two when you need to display them to the user.

For data transfer rate this is problematic approach though. Think of the scenario where you start a transfer with high bandwidth and then the connection drops, with a simple average it could take a long time for your UI to reflect that the current transfer rate is 0. Using a weighted moving average is a quick way to make your UI seem more responsive.

The simplest implementation of this is to sample the transfer rate periodically (say every 5 seconds) and calculate your rate with something like:

 float weight = 2.0; //you are going to want to tweak this to get the right balance between "responsive" and "noisy"

 void UpdateAverage(float newValue)
 {
    this.Average = (this.Average + (newValue*weight))/(weight+1)
 }

Whoever downvoted, can you explain why? This was my idea as well. — ahmd0, May 23 '13 at 22:19

score 3 · Accepted Answer · answered May 23 '13 at 22:21

3

I would go with a simple approach for having a running total (cumulative) with a constant space complexity (1). And on request of avg, return the result which is running total/total number of item. No extended time complexity since we don't iterate array afterwards to find cumulative sum.

answered May 23 '13 at 22:21

S.N

4,910
5
31
51

3

Thanks. One obvious downside of this concept is an overflow in the variable that holds the `running total`. How would you address that? – ahmd0 May 23 '13 at 22:30
@ahmd0 then store the result in a *larger* type, long,double,decimal,Biginteger etc. – I4V May 23 '13 at 22:34
@I4V: I'm already using `double` but I'm afraid I may have a problem over time... I think it loses precision the larger it grows. – ahmd0 May 23 '13 at 22:35
@ahmd0 How about [BigInteger](http://msdn.microsoft.com/en-us/library/system.numerics.biginteger.aspx) ? – I4V May 23 '13 at 22:37
@I4V: Yes, I thought about it, but then I'll lose precision at the beginning with smaller numbers... But, maybe I should just use both depending on the size of the total. Hmm... – ahmd0 May 23 '13 at 22:38
Good question. The rational behind the proposal was to give a head start with less overhead. But the question that you asked was more of how to deal scenario such as overflow. I don't have answer to it since it is you who can decide or rather know the probable case and worst case. Such as what if I have double array item and what happens to the cumulative sum. This subtle consideration stay same irrespective of algorithm you choose to proceed. – S.N May 23 '13 at 22:39

score 2 · Answer 3 · edited May 25 '14 at 21:04

I know I'm late to the party here but I solved a similar problem to this back in 2000, when I was creating an adaptive load balancer. The standard average has two problems, both mentioned in the answers given:

The overflow scenario
Recalculating average

So you can turn the normal average calculation into a what maths folk call a 'recurrence relation' which means you store the previous average, which is then used to calculate the new one (akin to Yaur's answer, which I have to agree with ahmed, I can't see it was down-voted). I wrote the original in Delphi back in the day, so I very recently did the same thing again in .NET and compared it against the standard process of calculating continual averages as lists of items get bigger.

Without wanting to self-promote (I'm posting the link just to save my typing fingers), you can find the theoretical treatment, algebra, results of the comparison experiments and how it addresses the problems at:

http://goadingtheitgeek.blogspot.co.uk/2014/05/blast-from-past.html

I hope you find it useful.

Eric · Answer 4 · 2013-05-23T22:41:17.997

My bet is that you want something fast, otherwise you already have your answer. Sum all numbers that you already have by the length of the array you already have. This is very straightforward.

However sometimes you cannot know if the array will be bounded, it can be infinite, such as the data coming from a microphone. The suggestion of the moving average is good in this case, it means that you have to take the last x values from the array and calculate the average on those values only. The algorithm and the time required to compute the result stays the same whether you have x values or 1000x values.

edit :

the x vs 1000x comes from algo complexity. Suppose that you sum 5 numbers, that is 5 operations, then you divide by 5, another operation for a total of 6 ( for the sake of the example we'll assume that they all take the same computer time but in reality dividing is slow compared to adding ). If you take the same code but with 1000 numbers, you do 1001 operations which is going to take much more time than in the first case!

With the "moving average", you always take a fixed number of numbers so your algo takes a fixed amount of time to execute no matter if you have 5 or 1000 numbers.

Moving average is just a fancy wording to say that you do not take the same numbers in your array from one time to another. Imagine the following array :

int x = { 1, 4, 6, 3, 1 }; int arrayLength = 5;

Then the average of this array will be

int runningTotal = 0;
for(int i = 0; i < arrayLength; i++)
{
   runningTotal += x[i];
}
double average = runningTotal / arrayLength

The moving average of 3 values would be

int movingLength = 3;
int runningTotal = 0;
for(int i = 0; i < movingLength; i++)
{
   runningTotal += x[arrayLength - i - 1];
}
double average = runningTotal / movingLength;

So the first values in the array are not part of the calculation when the array grows.

You know I'm still struggling to understand that "moving average" concept, so I'm not sure what you meant in your last x values or 1000x values comparison? And yes, I'm looking for something where I don't need to have the whole array each time I need to calculate the average. — ahmd0, May 23 '13 at 22:26
Thank you for the explanation. I get it now. This is as if I discard the head of the array (or early data) and calculate the average only on last N elements. Hmm. That is an interesting idea... — ahmd0, May 23 '13 at 22:46

score 1 · Answer 5 · answered May 23 '13 at 22:12

You're looking for a moving average:

    static void Main(string[] args) {
        var nums = Enumerable.Range(1, 5).Select(n => (double)n);
        nums = nums.Concat(nums.Reverse());

        var sma3 = SMA(3);
        var sma5 = SMA(5);

        foreach (var n in nums) {
            Console.WriteLine("{0}    (sma3) {1,-16} (sma5) {2,-16}", n, sma3(n), sma5(n));
        }
    }

    static Func<double, double> SMA(int p) {
        Queue<double> s = new Queue<double>(p);
        return (x) => {
            if (s.Count >= p) {
                s.Dequeue();
            }
            s.Enqueue(x);
            return s.Average();
        };
    }

Source: http://rosettacode.org/wiki/Averages/Simple_moving_average#C.23

Thanks. "Moving average" assumes that I keep the whole array of data somewhere in memory, right? Sorry, that Wiki page is too complicated to grasp quickly. — ahmd0, May 23 '13 at 22:18
You don't have to keep the whole array in memory, just the number of samples that you are going to use in your average. — Yaur, May 23 '13 at 22:32

Kevin · Answer 6 · 2013-05-23T22:39:29.597

0

Assuming you don't want a moving average..

You need to keep track of the current sum of the elements when you last calculated the average.

If your array is this...

Array = {1, 5, 7, 2}
Sum = 1 + 5 + 7 + 2 = 15
Number of elements = 4
Average = Sum / Number of elements = 3.75

You add a couple of elements and your array looks like this...

Array = {1, 5, 7, 2, 10, 6}

Assuming your actual array is much larger... To recalculate your average..

Sum = ([previous sum] + [sum of new elements]) / [number of elements]
Number of elements = 6
Average = ((15 * 4) + 10 + 6) / 6 = 5.1666667

Edit: if you are concerned about precision and size check out the BigInteger

edited May 23 '13 at 22:39

answered May 23 '13 at 22:14

Kevin

4,586
23
35

Thanks. Yes, all I need is an approx. average. So what's the formula in that case? – ahmd0 May 23 '13 at 22:21
Just keep your running total and number of elements... you just need to keep track of which elements are new to the array since you last calculated your running total. If you are adding the elements to the array then just add to your running total as you add them – Kevin May 23 '13 at 22:31
How would you address an overflow in the running total though? – ahmd0 May 23 '13 at 22:32
How big do you expect your numbers to get? If you need it to be an Integer you can always use a BigInteger. – Kevin May 23 '13 at 22:36
I don't know. If the transfer takes a long time, I can imagine that the total may grow pretty big. – ahmd0 May 23 '13 at 22:37

score 0 · Answer 7 · answered Sep 25 '15 at 12:03

if average buffer length is changable my old program is that.

namespace WindowsFormsApplication1
{
  public partial class ComTester : Form
  {
    public int []sAvgArr=new int [251];
    public int sAvgAdr,sAvgCnt;

    private void Calc_Values(object sender, int v)//v is value
    {
        int n = AverageLength;
        if (n > 250) n = 250;//average buffer maksimum
        if (n > 0)
        {
            sAvgArr[sAvgAdr] = v;
            sAvgAdr += 1;
            sAvgCnt += 1;
            if (n <= sAvgAdr) sAvgAdr = 0;
            if (n <= sAvgCnt) sAvgCnt = n;
            n = sAvgCnt;
            int f = 0, l = 0; ;
            for (l = 0; l < n; l += 1)
                f += sAvgArr[l];
            f = f / sAvgCnt;
            sAvgVal = f;
        }
        else 
        {
            sAvgVal=v;
        }
    }
  }
}

score -3 · Answer 8 · edited Nov 16 '16 at 12:25

-3

using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            int  n = 3 , x , sum = 0 ;
            double ave;

            for (int i = 0; i < n; i++ )
            {
                Console.WriteLine("enter the number:");
                x = Convert.ToInt16(Console.ReadLine());
                sum += x;
            }

            ave = (double)(sum) / 3;
            Console.WriteLine("average:");
            Console.WriteLine( ave );
            Console.ReadLine();
        }
    }
}

edited Nov 16 '16 at 12:25

Thomas Weller

55,411
20
125
222

answered Nov 16 '16 at 12:02

Ms.FMa

1
1

2

Please consider to fix the formatting of your answer as well as providing some text describing/explaining it. – simme Nov 16 '16 at 12:11

Formula to calculate average number on expanding data array

8 Answers8