
In IEEE 754 floating point, it is possible that

a*(b-c) != a*b-a*c // a, b, c double

So an expanded product is not guaranteed to equal the unexpanded product.
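For illustration (not part of the original question), a small C program that scans a few candidate values and reports any triple for which the two expressions differ; which triples actually differ depends on rounding, so the program simply prints what it finds:

    #include <stdio.h>

    int main(void)
    {
        /* Candidate values; whether a*(b-c) and a*b - a*c differ for a given
           triple depends on how the intermediate results round. */
        double vals[] = { 0.1, 0.2, 0.3, 0.7, 1.0 / 3.0, 1e-5 };
        int n = sizeof vals / sizeof vals[0];

        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < n; k++) {
                    double a = vals[i], b = vals[j], c = vals[k];
                    if (a * (b - c) != a * b - a * c)
                        printf("a=%g b=%g c=%g: a*(b-c)=%.17g, a*b-a*c=%.17g\n",
                               a, b, c, a * (b - c), a * b - a * c);
                }
        return 0;
    }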

But what about this:

a*(b1+b2+...+bn) == a*b1+a*b2+...+a*bn // b1==b2==...==bn

When all the b are equal, is the equality guaranteed (assuming no under- or overflow)? Does it make a difference whether the equality of the b values is known at compile time or not?

Edit:

It is not - see Eric Postpischil and Pascal Cuoq.

But does the following weaker assertion hold?

   (1.0/n)*(b1+b2+...+bn) <= 1.0
&& (1.0/n)*b1+(1.0/n)*b2+...+(1.0/n)*bn <= 1.0

// when all b<=1.0 and n integral double but not power of 2
// so that 1.0/n not exactly representable with base-2 floating point

I simply wonder whether you can guarantee that the average of a data set does not exceed some bound that is also not exceeded by any single data value, no matter how you compute the average (summing everything first and dividing once, or summing the values with each one already divided by n).
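To make the two ways of computing the average concrete, here is a minimal C sketch (the function names are mine, not part of the question):

    #include <stddef.h>

    /* Method 1: sum everything, then divide once. */
    double avg_sum_first(const double *b, size_t n)
    {
        double sum = 0.0;
        for (size_t i = 0; i < n; i++)
            sum += b[i];
        return (1.0 / (double)n) * sum;
    }

    /* Method 2: divide every value by n, then sum the quotients. */
    double avg_divide_each(const double *b, size_t n)
    {
        double inv = 1.0 / (double)n;
        double sum = 0.0;
        for (size_t i = 0; i < n; i++)
            sum += inv * b[i];
        return sum;
    }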

Edit2:

Ok, the && doesn't hold. See Eric Postpischil and David Hammen:

average of nine 1.0 -> only the first condition is true; the second exceeds 1.0.
average of ten 1.0/3 -> only the second form stays at or below the data maximum; the first exceeds 1.0/3.

So does the optimal method of computing an average depend on the expected upper limit of the data set? Or also on its size n? Or is there no method that is optimal in every case?

mb84
  • The earlier question http://stackoverflow.com/questions/21654098/why-there-is-no-errors-while-multiplying-floating-point-numbers shows that there is no chance of this being true with a=1/10., n=10 and bi=1. – Pascal Cuoq Feb 10 '14 at 13:37
  • @PascalCuoq good example, thanks. Do you have any idea with my updated question? – mb84 Feb 10 '14 at 14:22
  • You would need additional constraints to ensure the property in your updated question. One way to have the property hold is sometimes to compute intermediate results at a higher precision. See for instance http://stackoverflow.com/questions/13725802/properties-of-80-bit-extended-precision-computations-starting-from-double-precis for the comparable property `u2` ≤ result ≤ `u3`. – Pascal Cuoq Feb 10 '14 at 14:51
  • There is not even a single way to sum: ((((b+b)+b)+b)+b)+b is not necessarily equal to ((b+b)+(b+b))+(b+b) though I'd be curious if you can exhibit a difference with 4 operands – aka.nice Feb 10 '14 at 22:12
  • @aka.nice On my computer, testing all positive float via brute force (`casting unsigned int * to float *`) yields that *all* float f satisfy `((f+f)+f)+f == (f+f)+(f+f)` *except* the 2^23-1 NaNs. The same holds for your 6-operand example, and also with 5 operands. – mb84 Feb 11 '14 at 09:49
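For reference, a sketch of the kind of brute-force enumeration described in the comment above (not from the original post), covering all non-negative float bit patterns and using memcpy instead of a pointer cast to stay within defined behavior; it assumes 32-bit IEEE-754 float:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        uint32_t mismatches = 0;
        for (uint32_t bits = 0; bits <= 0x7FFFFFFFu; bits++) {
            float f;
            memcpy(&f, &bits, sizeof f);   /* reinterpret the bit pattern as a float */
            if ((((f + f) + f) + f) != ((f + f) + (f + f)))
                mismatches++;              /* NaNs compare unequal to everything */
        }
        printf("mismatching bit patterns: %u\n", mismatches);
        return 0;
    }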

3 Answers


IEEE 754 doesn't delve too much into language details. In particular, details like "compile time" aren't specified. That doesn't even make sense for some languages.

The easiest case to understand is when you have an intermediate infinity. Assume that sum(b)==INF, barely, but a is 0.5. The result a*sum(b) is still INF. However, by multiplying first, the subsequent addition no longer overflows.
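A minimal C illustration of that intermediate-overflow case, not part of the original answer, using DBL_MAX as the "barely finite" sum:

    #include <stdio.h>
    #include <float.h>

    int main(void)
    {
        double a = 0.5, b1 = DBL_MAX, b2 = DBL_MAX;

        double mul_last  = a * (b1 + b2);   /* b1 + b2 overflows to +inf first  */
        double mul_first = a * b1 + a * b2; /* each product is finite, sum fits */

        printf("a*(b1+b2)   = %g\n", mul_last);   /* inf */
        printf("a*b1 + a*b2 = %g\n", mul_first);  /* DBL_MAX */
        return 0;
    }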

MSalters
  • Ok, I meant without under-/overflow (edited it). And with "*compile time*" I meant some possible optimizations, which seem to be mentioned (by mah) in the link provided by Pascal Cuoq. I updated my question with a more practical core question behind that all. – mb84 Feb 10 '14 at 14:04

Even excluding overflow and underflow, the results may differ.

.3 * (.3+.3+.3) is 0.269999999999999962252417162744677625596523284912109375 while .3*.3 + .3*.3 + .3*.3 is 0.270000000000000017763568394002504646778106689453125 (when both are evaluated with IEEE-754 rules and 64-bit binary floating-point).
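This is easy to reproduce in C by printing both expressions with plenty of digits (assuming binary64 doubles and no extended-precision intermediates):

    #include <stdio.h>

    int main(void)
    {
        double x = .3 * (.3 + .3 + .3);
        double y = .3 * .3 + .3 * .3 + .3 * .3;

        printf("x = %.60g\ny = %.60g\n", x, y);
        printf("x == y: %d\n", x == y);   /* 0: the two values differ in the low bits */
        return 0;
    }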

Regarding the questions added in an update:

There are two questions, one of which asks whether the computed average of a set of numbers each not exceeding one can exceed one. As David Hammen points out, calculating the average of nine 1s as 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 produces 1.0000000000000002220446049250313080847263336181640625.

The other asks whether the computed average of a set can exceed all of the numbers in the set. This is a superset of the first question, but here is an example using a different calculation of the average:

Let b = 1./3 (which is 0.333333333333333314829616256247390992939472198486328125).

Then 1./10 * (b+b+b+b+b+b+b+b+b+b) is 0.33333333333333337034076748750521801412105560302734375, which is greater than b.
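Both of these cases can be checked directly in C (again assuming binary64 arithmetic with no extended-precision intermediates):

    #include <stdio.h>

    int main(void)
    {
        /* Nine values of 1.0, averaged by dividing each term: exceeds 1.0. */
        double avg9 = 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1
                    + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1;

        /* Ten values of b = 1./3, averaged by summing first: exceeds b. */
        double b = 1./3;
        double avg10 = 1./10 * (b + b + b + b + b + b + b + b + b + b);

        printf("avg9  = %.60g  (avg9 > 1.0: %d)\n", avg9, avg9 > 1.0);
        printf("avg10 = %.60g  (avg10 > b:  %d)\n", avg10, avg10 > b);
        return 0;
    }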

Eric Postpischil
  • Good example, thanks. I updated my question with a more practical core question behind that all. Do you have an idea? – mb84 Feb 10 '14 at 14:05
  • So whether a method exceeds the data set's range depends on the upper limit (I asked David Hammen this as well)? data <= 1.0 -> use my first method; data <= 1.0/3 -> use my second method? Does it also depend on `n`? E.g. I found that: data <= 1.0/3 and `n=9 or 8` -> the first method doesn't exceed. But is there some n where the *second* method would exceed for data <= 1.0/3? Or the *first* method for data <= 1.0? – mb84 Feb 10 '14 at 15:34

But does the following weaker assertion hold?

(1.0/n)*(b1+b2+...+bn) <= 1.0
&& (1.0/n)*b1+(1.0/n)*b2+...+(1.0/n)*bn <= 1.0

No. For example, this assertion fails on my computer with n=9 and bi=1.0.

I simply wonder whether you can guarantee that the average of a data set does not exceed some bound that is also not exceeded by any single data value, no matter how you compute the average (summing everything first and dividing once, or summing the values with each one already divided by n).

Once again, the answer is no. The correlation E[(X-Xbar)*(Y-Ybar)]/(sigma_x * sigma_y) between two random variables should always be between -1.0 and 1.0. Yet if you compute the statistics for two perfectly correlated (or perfectly anti-correlated) random variables, you'll oftentimes see a correlation that is slightly greater than +1 (or slightly less than -1).
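As an illustration of that kind of computation (the data set and the naive two-pass formula below are mine, not from the answer; whether the rounded result lands slightly above 1.0 depends on the data and on the order of operations):

    #include <stdio.h>
    #include <math.h>

    /* Sample Pearson correlation, computed with the naive two-pass formula. */
    double correlation(const double *x, const double *y, int n)
    {
        double xbar = 0.0, ybar = 0.0;
        for (int i = 0; i < n; i++) { xbar += x[i]; ybar += y[i]; }
        xbar /= n;
        ybar /= n;

        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - xbar) * (y[i] - ybar);
            sxx += (x[i] - xbar) * (x[i] - xbar);
            syy += (y[i] - ybar) * (y[i] - ybar);
        }
        return sxy / (sqrt(sxx) * sqrt(syy));
    }

    int main(void)
    {
        /* y is an exact affine function of x, so the true correlation is 1. */
        double x[5], y[5];
        for (int i = 0; i < 5; i++) {
            x[i] = 0.1 * (i + 1);
            y[i] = 3.0 * x[i] + 0.7;
        }

        double r = correlation(x, y, 5);
        printf("r = %.17g  (r > 1.0: %d)\n", r, r > 1.0);
        return 0;
    }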

David Hammen
  • Interesting: computing the average of nine 1.0 is exact only with `(1.0/9)*(1.0+...+1.0)`. Computing the average of ten 1.0/3, as Eric Postpischil points out, is exact (in terms of the actual representation of the not-exactly-representable 1.0/3) only with `(1.0/10)*(1.0/3)+...+(1.0/10)*(1.0/3)`. So does the choice of method depend on the upper limit? All values of the data set <= 1.0 -> first method -> average not exceeding 1.0. All values of the data set <= 1.0/3 -> second method -> average not exceeding 1.0/3? – mb84 Feb 10 '14 at 15:18
  • You need to be careful here. If you write `x=(1.0/9)*(1.0+1.0+1.0+1.0+1.0+1.0+1.0+1.0+1.0);`, your compiler might well compile this as if you had written `x=1.0;`. If you want to see how your computer will treat this using native floating point, you need to present the problem to the compiler in such a manner that the compiler won't optimize away all the places IEEE floating point can go slightly awry. I suggest writing a function that computes the average of an array of N doubles by your first approach, and another function that computes the average using your second approach. (continued) – David Hammen Feb 10 '14 at 15:54
  • As to which approach is better, I'd argue that it's the first. For one thing, it is considerably less costly computation-wise. Your first approach involves N-1 adds and one multiply. Your second approach has N multiplies and N-1 adds. Accuracy-wise, the first approach is almost always going to win as well, for the simple reason that it only has N floating point operations where things can go awry while the second has 2N-1 such operations. – David Hammen Feb 10 '14 at 15:57
  • I ran my first and second approaches with no optimization potential: indeed, the average of nine 1.0 is exact only with the first approach (the second exceeds 1.0), and the reverse holds for the ten 1.0/3. But twelve 1.0/3 are suddenly exact only with the first approach, and the second exceeds... Thanks anyway for the examples and hints. Your argumentation is clear. But I'll give the answer to Eric because he answered first (and maybe making this decision by averaging all factors also obeys some sort of uncertainty principle ;) +1 for you – mb84 Feb 10 '14 at 22:26