
I'm studying for an exam and I tackled this problem:

to add 0.1 + 0.2 + ... + 1.00, or 1.00 + 0.99 + 0.98 + ... + 0.1

In which order should the numbers be added to get higher accuracy in Java?

Mohamed Islam Fares
  • Do you mean for that to start `0.01 + 0.02 + ...`? – Dawood ibn Kareem Jan 19 '15 at 02:42
  • Well, it depends on how those numbers are _declared_ in the first place; if they are already `double`s to start with you'll have to deal with IEEE 754 limitations; if you have the choice, make them `String`s and use `BigDecimal`. – fge Jan 19 '15 at 02:47
  • As stated they have the same accuracy either way. – Elliott Frisch Jan 19 '15 at 02:48
  • If you _really_ have a choice, well, sum(i = 0, n, i) == n * (n+1) / 2, so that is 100 * 101 / 2 / 100 == 101 / 2 == 50.5 – fge Jan 19 '15 at 02:49
  • @ElliottFrisch What makes you say that? – Dawood ibn Kareem Jan 19 '15 at 03:35
  • @DavidWallace It appears to be two sequences of `double`(s), since only the order of `double` addition is changing you should get the same result regardless of which order you compute the sequence (aka The Commutative Law of Addition). – Elliott Frisch Jan 19 '15 at 03:38
  • Actually, @ElliottFrisch, the associative law of addition (which you need in addition to the commutative law) doesn't apply to floating point arithmetic. – Dawood ibn Kareem Jan 19 '15 at 03:40
  • @DavidWallace I'm confused. Care to provide an example sequence where it doesn't yield the same result? I actually ran a test in Java, they yield the same result "50.50000000000003". – Elliott Frisch Jan 19 '15 at 03:44
  • `System.out.println(1.1 + 0.0001 + 0.0003 + 1.2 == 1.1 + 1.2 + 0.0001 + 0.0003);` shows false. @ElliottFrisch – Dawood ibn Kareem Jan 19 '15 at 03:50
  • More strikingly, `1e100 + 1 - 1e100 == 0`. – tmyklebu Jan 19 '15 at 06:42
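
The non-associativity discussed in these comments is easy to check directly. A minimal Java sketch, using the two expressions quoted above:

```java
public class NonAssociative {
    public static void main(String[] args) {
        // Same four summands, different order of addition: the rounded results differ.
        System.out.println(1.1 + 0.0001 + 0.0003 + 1.2 == 1.1 + 1.2 + 0.0001 + 0.0003); // false

        // Adding 1 to 1e100 is lost entirely: 1 is far smaller than one ulp of 1e100.
        System.out.println(1e100 + 1 - 1e100 == 0); // true
    }
}
```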

4 Answers


Any COBOL programmer would be able to answer this immediately.

The point of the question is that if you add the big numbers first, you lose precision when you come to add the small numbers. Add the small numbers first.
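A hypothetical Java illustration of this effect (the magnitudes are mine, chosen to make the loss obvious, not the OP's sequence): when a running sum is already huge, small addends fall below half an ulp and vanish; added first, they accumulate and survive.

```java
public class SummationOrder {
    public static void main(String[] args) {
        double big = 1e16;  // ulp(1e16) == 2, so 1e16 + 1.0 is a tie that rounds back down

        double largeFirst = big;
        for (int i = 0; i < 100; i++) largeFirst += 1.0;  // every +1.0 is lost

        double smallFirst = 0.0;
        for (int i = 0; i < 100; i++) smallFirst += 1.0;  // exact: 100.0
        smallFirst += big;                                // 100.0 survives the final add

        System.out.println(smallFirst - largeFirst);      // 100.0
    }
}
```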

Dawood ibn Kareem

There is a result stating that one can get a better worst-case error bound if one adds a sequence of positive numbers in increasing order. This result does not imply that, for a given sequence of positive numbers, one gets smaller error by adding the numbers in increasing order.

Case in point:

#include <stdio.h>

int main() {
  float f = 0, g = 1;
  for (int i = 1; i <= 99; i++) {
    char buf[42];
    float ff;
    /* parse "0.01" ... "0.99" to get the nearest float to each decimal */
    sprintf(buf, "0.%02i", i);
    sscanf(buf, "%f", &ff);
    f += ff;                        /* f sums in increasing order */
    sprintf(buf, "0.%02i", 100 - i);
    sscanf(buf, "%f", &ff);
    g += ff;                        /* g sums in decreasing order (started at 1) */
  }
  f += 1;
  printf("%a %a\n", f, g);
  return 0;
}

prints

0x1.940002p+5 0x1.93fffep+5

The exact sum of the 100 floats in question, which can be computed by declaring f and g as double (but leaving ff as a float), is 0x1.93ffffff8p+5, indicating that summing the numbers in decreasing order wins here.

tmyklebu
  • I was about to upvote this answer, for pointing out the difference between worst-case error bound and actual error. But then I got to the last paragraph and decided not to. You _can't_ get an exact result by declaring `f` and `g` as `double`. Exactly the same kinds of errors exist using `double` and using `float` - they're just smaller. In this case, the correct answer is `50.5 = 0x1.94p+5`, which you can work out using pencil and paper, or using an abacus, but _not_ using a `double`. So if the two additions give `0x1.940002p+5` and `0x1.93fffep+5`, then they are _equally_ inaccurate - the error in each calculation is exactly the same. – Dawood ibn Kareem Jan 19 '15 at 19:35
  • @DavidWallace: Read the code. All of the summands are multiples of a power of two larger than 2^(-40). So yes, you can get the exact answer by declaring `f` and `g` as `double`. The exact sum of those 100 `float`s is the number I gave, which is smaller than `0x1.94p5`. – tmyklebu Jan 19 '15 at 20:27
  • Since when has `0.1` been a multiple of a power of two? – Dawood ibn Kareem Jan 19 '15 at 20:28
  • Since when has `0.1` been representable as a `float`? Read the code. (I suspect I'm making this point too subtly. I'll edit the question to make it more obvious what's going on.) – tmyklebu Jan 19 '15 at 20:29
  • Yes, but the question isn't about adding `float`s. Representability as `float` is completely irrelevant to what the OP is asking. – Dawood ibn Kareem Jan 19 '15 at 20:31
  • @DavidWallace: OK, then `0.1` is a multiple of a power of two. It is exactly the number with the following finite binary expansion: `0x1.999999999999ap-4`. And `0.1f` has the binary expansion `0x1.99999ap-4`. My point stands here, if you compute `1.0f + 0.99f + ... + 0.01f`, the final result is less wrong than if you compute `0.01f + 0.02f + ... + 1.0f`. – tmyklebu Jan 19 '15 at 20:33
  • I don't think representation is irrelevant in the context of floating-point arithmetic. Clearly these numbers must be represented in some format. The question did not state which format that is. I upvoted this answer because it states both the generally accepted (and proven, as far as I know) guideline of adding in order of increasing magnitude, but also points out as a caveat that for a particular storage format (here: `float`) and sequence of numbers this may not deliver the best result. The code illustrates the caveat nicely. – njuffa Jan 19 '15 at 21:23
  • The intention of this answer is correct. But the nearest single precision float to the exact sum of the singles is 50.5 (same for doubles, 50.5). So both results in single precision are 1 ulp off. In double precision, summing in increasing order wins. But it's just luck. Try with 0.0001+...+1 and you're 10 ulp off the nearest float (double) to exact sum (5000.5) while reverse is just 2 ulp off... I can exhibit Smalltalk 1-liner expressions if you want. – aka.nice Jan 19 '15 at 21:59
  • @aka.nice: The nearest `float` to the correct answer is indeed `50.5`, but the decreasing sum is off by 0.9990234375 ulp while the increasing sum is off by 1.0009765625 ulp. (In the first version of this comment, I thought you were talking about adding the closest floats to `0.0001` through `1` together, which is actually another weird example where summing in reverse wins.) – tmyklebu Jan 19 '15 at 22:39
  • @tmyklebu yes, in second part of the comment I was talking of the sum of 0.0001+... – aka.nice Jan 19 '15 at 22:56
  • While the numbers stated in the question sum to 50.5 mathematically, if each of those numbers is represented as an IEEE-754 single precision `float`, the *exact* sum of those `float` operands is *not* equal to 50.5 and can be represented exactly in a `double` as @tmyklebu pointed out. – njuffa Jan 19 '15 at 23:06
  • @njuffa yes, but the nearest single precision float to the exact sum of single precision floats is still 50.5f, and the nearest double to the exact sum of doubles is still 50.5... Try it, it works remarkably well even with larger sums. – aka.nice Jan 20 '15 at 10:46
  • @aka.nice: There appears to be a double-rounding anomaly in your philosophy. I'm measuring the distance from the exact result, not the correctly-rounded result, to the result one gets from a certain computation. – tmyklebu Jan 20 '15 at 12:19
  • @tmyklebu yes, agree, sorry for the noise, my point was that it's nonetheless remarkable that the exact sum of float approximates to fractions i/100 is less than 1/2 ulp from the sum of exact fractions i/100. – aka.nice Jan 20 '15 at 17:49

Since this is an arithmetic series, its sum can be calculated as follows

Sn = n * (a1 + an) / 2

Sn ... sum
n  ... number of elements
a1 ... first element
an ... last element

Since this eliminates most of the additions (and with them most of the rounding steps), this would seem the best solution.
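
For the OP's sequence this can even be evaluated with no rounding error at all, by keeping the arithmetic in integers until a single final division (a sketch; a step of 0.01 is assumed, as in fge's comment):

```java
public class ArithmeticSeries {
    public static void main(String[] args) {
        int n = 100;  // number of terms 0.01, 0.02, ..., 1.00
        // Sum 1 + 2 + ... + 100 exactly in integers, then scale down once.
        double sum = (n * (n + 1) / 2) / 100.0;  // 5050 / 100.0, exactly representable
        System.out.println(sum);                 // 50.5
    }
}
```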

Eugen Pechanec

Rather than summing all the numbers linearly, you should study a divide-and-conquer method: sum each half of the array separately, then add the two partial sums.

This is very much like Precise sum of floating point numbers
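
A minimal Java sketch of that divide-and-conquer (pairwise) summation; the method name and base-case size are my own choices:

```java
public class PairwiseSum {
    // Recursively sum arr[lo, hi): split in half, sum each half, add the partial sums.
    static double pairwiseSum(double[] arr, int lo, int hi) {
        if (hi - lo <= 2) {  // small base case: sum directly
            double s = 0.0;
            for (int i = lo; i < hi; i++) s += arr[i];
            return s;
        }
        int mid = (lo + hi) / 2;
        return pairwiseSum(arr, lo, mid) + pairwiseSum(arr, mid, hi);
    }

    public static void main(String[] args) {
        double[] xs = new double[100];
        for (int i = 0; i < 100; i++) xs[i] = (i + 1) / 100.0;  // 0.01 ... 1.00
        System.out.println(pairwiseSum(xs, 0, xs.length));
    }
}
```

The recursion keeps the partial sums of similar magnitude, which is why its worst-case error grows only logarithmically with the number of terms instead of linearly.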

aka.nice