I think this is a valid question, although not phrased too well. A problem is that even the question furins referred to was not phrased well, and it was closed before receiving a good answer.
The question itself is interesting, though, especially since the closed one shows it was even included in a book, so it could lead people in one direction or another.
I think neither algorithm is clearly better. With the naive average it looks like we lose precision, or, when averaging numbers several orders of magnitude apart, even lose whole numbers, but the same can probably be shown for the other algorithm as well, just with different sets of input data.
So, especially since it comes from an existing book, I think this is a perfectly valid question deserving a decent answer.
I will try to illustrate what I think about the two algorithms with an example. Imagine you have 4 numbers of roughly the same magnitude and you want to average them.
The naive method sums them first, one after another. After summing the first two you have likely lost one bit of precision at the low end (since the sum now probably has a larger exponent). By the time you add the last number, 2 bits are lost (those bits are now used to represent the high part of the sum). Then you divide by four, which in this case essentially just subtracts 2 from the exponent.
What did we lose in this process? It's easier to answer what would happen if all the numbers were truncated by 2 bits first. In that case the last two bits of the fraction of the resulting average would obviously be zero, and up to 2 additional bits' worth of error may be introduced (depending on whether the truncated bits happened to be ones or zeros in the original numbers). So essentially, if the sources were single-precision floats with 23 bits of fraction, the resulting average would have about 19 bits' worth of precision.
The actual result from the naive method is better than that, though, since the numbers summed first didn't lose that much precision.
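To make the naive method concrete, here is a minimal sketch in Python (which uses double precision rather than the single precision discussed above; the input values are ones I made up to force a visible loss, they are not from the original question):

```python
import math

def naive_average(xs):
    # Naive method: sum everything first, then divide once at the end.
    total = 0.0
    for x in xs:
        total += x
    return total / len(xs)

# Mixed magnitudes make the loss extreme: the two 1.0 terms vanish
# entirely when added to 1e17, since they fall below its last bit.
xs = [1e17, 1.0, 1.0, -1e17]
print(naive_average(xs))         # 0.0 - the small terms are gone
print(math.fsum(xs) / len(xs))   # 0.5 - from a correctly rounded sum
```

With same-magnitude inputs the effect is far milder, as the text above argues: only a couple of low bits go missing, not whole terms.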
In the differential method, each iteration adds the appropriately weighted difference to the running average. If the numbers are of the same magnitude, this difference will most likely be about one order of magnitude below them. It is then divided by the current count (nothing is lost in that operation), but the resulting difference for the last number (with i=4 in this example) may be about 3 orders of magnitude below the source numbers. We add this to the running average, which is around the same magnitude as the original numbers.
So with the differential method, in this example, adding the last number seems to lose about 3 bits of precision, and over all 4 numbers it may even look like we are down about 5 essentially lost bits of precision - maybe even worse than the naive method?
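For comparison, a sketch of the differential method, assuming the usual incremental-mean update avg += (x - avg) / i (my reading of the algorithm under discussion; the example input is my own):

```python
def running_average(xs):
    # Differential / incremental mean: avg_i = avg_{i-1} + (x_i - avg_{i-1}) / i
    avg = 0.0
    for i, x in enumerate(xs, start=1):
        avg += (x - avg) / i   # the weighted difference discussed above
    return avg

# With same-magnitude inputs, each late difference is small relative
# to the running average it is added to - that addition is where the
# low bits discussed above can drop off.
print(running_average([1.0, 2.0, 3.0, 4.0]))   # 2.5
```

Unlike the naive version, this never forms a large intermediate sum, so its error pattern depends on the data in a different way.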
The differential method is harder to follow, and maybe I made a mistake somewhere in my assumptions. But I think one thing is clear: it just doesn't seem valid to regard either algorithm as performing better in general, or if one does, it probably depends on the layout and magnitude differences of the data.