What is a tight lower bound on the number N of irrational numbers, expressed as doubles in Matlab on a 64-bit machine, that I can multiply together while still having confidence in k decimal digits of the product? What precision, for example, could I expect after multiplying together ~10^12 doubles encoding different random chunks of pi?
-
As phrased, your question seems to involve two sources of error: First, arbitrary values (your irrational numbers) are converted to double precision. This produces a rounding error. Second, multiplication operations are performed repeatedly to produce a single product. This produces a rounding error in each multiplication. Is that correct? – Eric Postpischil Aug 21 '12 at 02:59
-
When you ask for a tight lower bound, do you want an absolute guarantee that, if you multiply *N* numbers, the calculated result will be within 10^-*k* of the mathematically exact result? – Eric Postpischil Aug 21 '12 at 03:01
-
@Eric Postpischil Yes, both of your statements are correct. – Chalk Aug 21 '12 at 03:03
-
The absolute error bound is meaningless: if your inputs are random enough, you'll get 9 digits of precision rather than 4 in more than 99.99999999999999999999% of cases; hope that meets your confidence expectation. – aka.nice Aug 21 '12 at 14:37
3 Answers
If you ask for a tight bound, the response of @EricPostpischil is the absolute error bound, assuming all operations are performed in IEEE 754 double precision.
If you ask for confidence, I understand it as the statistical distribution of errors. Assuming a uniform distribution of error in [-e/2,e/2], you could try asking for the theoretical distribution of error after M operations on math stack exchange... My guess is that the tight bound is very conservative.
Let's illustrate an experimental estimation of those stats with some Smalltalk code (any language having large integer/fraction arithmetic could do):
nOp := 500.
"Worst-case relative error bound for nOp conversions and nOp-1 multiplications,
 where Float epsilon / 2 = 2^-53 is the maximum relative rounding error per operation"
relativeErrorBound := ((1 + (Float epsilon asFraction / 2)) raisedTo: nOp * 2 - 1) - 1.0.
nToss := 1000.
"Repeat the experiment nToss times, collecting the relative error of each float product
 measured against the exact product computed with Fraction arithmetic"
stats := (1 to: nToss)
    collect: [:void |
        | fractions exactProduct floatProduct relativeError |
        fractions := (1 to: nOp) collect: [:e | 10000 atRandom / 3137].
        exactProduct := fractions inject: 1 into: [:prod :element | prod * element].
        floatProduct := fractions inject: 1.0 into: [:prod :element | prod * element].
        relativeError := (floatProduct asFraction - exactProduct) / exactProduct.
        relativeError].
"Basic statistics over the nToss relative errors"
s1 := stats detectSum: [:each | each].
s2 := stats detectSum: [:each | each squared].
maxEncounteredError := (stats detectMax: [:each | each abs]) abs asFloat.
estimatedMean := (s1 / nToss) asFloat.
estimatedStd := ((s2 / (nToss - 1)) - (s1 / nToss) squared) sqrt.
I get these results for the multiplication of nOp=20 doubles:
relativeErrorBound -> 4.440892098500626e-15
maxEncounteredError -> 1.250926201710214e-15
estimatedMean -> -1.0984634797115124e-18
estimatedStd -> 2.9607828266493842e-16
For nOp=100:
relativeErrorBound -> 2.220446049250313e-14
maxEncounteredError -> 2.1454964094158273e-15
estimatedMean -> -1.8768492273800676e-17
estimatedStd -> 6.529482793500846e-16
And for nOp=500:
relativeErrorBound -> 1.1102230246251565e-13
maxEncounteredError -> 4.550696454362764e-15
estimatedMean -> 9.51007740905571e-17
estimatedStd -> 1.4766176010100097e-15
You can observe that the standard deviation grows much more slowly than the error bound.
UPDATE: to first approximation, (1+e)^m = 1+m*e+O((m*e)^2), so the global error is approximately distributed as a sum of m errors uniform in [-e,e], as long as m*e is small enough, and this sum is very near a normal (gaussian) distribution of variance m*(2e)^2/12. You can check in Matlab that std(sum(rand(100,5000))) is near sqrt(100/12).
We can consider this still holds for m=2*10^12-1, that is approximately m=2^41 and m*e=2^-12. In that case, the global error has a quasi-normal distribution, and the standard deviation of the global error is sigma=2^-52*sqrt(2^41/12), or approximately sigma=10^-10.
See http://en.wikipedia.org/wiki/Normal_distribution to compute P(abs(error)>k*sigma):
- In 68% of cases (1 sigma), you'll have 10 digits of precision or more.
erfc(10/sqrt(2)) gives you the probability of having less than 9 digits of precision, about 1 case out of 6*10^22, so I'll let you compute the probability of having only 4 digits of precision (you can't evaluate it in double precision, it underflows)!!!
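For instance, a quick Matlab check of those figures (the variable names are only illustrative):
m = 2*10^12 - 1;              % about 2^41 individual rounding errors
sigma = 2^-52 * sqrt(m/12)    % ~9.1e-11, i.e. on the order of 10^-10
std(sum(rand(100,5000)))      % empirical std of a sum of 100 uniform variables...
sqrt(100/12)                  % ...close to this theoretical value, ~2.89
erfc(10/sqrt(2))              % P(|error| > 10*sigma), i.e. fewer than 9 digits: ~1.5e-23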
My experimental standard deviations were a bit smaller than the theoretical ones (4e-16, 9e-16 and 2e-15 for 20, 100 and 500 doubles), but this must be due to the biased distribution of my input errors, i/3137 for i=1..10000...
That's a good reminder that the result will be dominated by the distribution of errors in your inputs, which might exceed e if they themselves result from floating point operations like M_PI*num/den.
Also, as Eric said, using only * is quite an ideal case; things might degrade faster if you mix in +.
Last note: we could try to craft a list of inputs that reaches the maximum error bound: set all elements to (1+e), which will be rounded to 1.0, and we would get the maximum theoretical error bound, but our input distribution would be quite biased! HEM, WRONG: since all the multiplications are then exact, we only get (1+e)^n, not (1+e)^(2n-1), so only about half the error...
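A one-line Matlab illustration of that rounding:
e = 2^-53;
(1 + e) == 1   % true: 1+e rounds back to 1.0 (ties to even), so every later product 1.0*1.0 is exact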
UPDATE 2: the inverse problem
Since you want the inverse: what is the length n of the sequence such that I get k digits of precision with a certain level of confidence 10^-c?
I'll answer only for k>=8, because (m*e) << 1 is required in the above approximations.
Let's take c=7; getting k digits with a confidence of 10^-7 means 5.3*sigma < 10^-k.
sigma = 2*e*sqrt((2*n-1)/12), that is n = 0.5 + 1.5*(sigma/e)^2, with e=2^-53.
Thus n ~ 3*2^105*sigma^2, and since sigma^2 < 10^(-2*k)/5.3^2, we can write n < 3*2^105*10^(-2*k)/5.3^2.
Numerically, the probability of having less than k=9 digits is less than 10^-7 for a length n=4.3e12, and around n=4.3e10 for 10 digits.
We would reach n=4 numbers for 15 digits, but here our normal distribution hypothesis is very rough and does not hold, especially in the distribution tail at 5 sigmas, so use with caution (the Berry-Esseen theorem bounds how far from normal such a distribution is: http://en.wikipedia.org/wiki/Berry-Esseen_theorem ).
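For instance, evaluating that last inequality in Matlab (just plugging in the numbers above):
k = 9;  3 * 2^105 * 10^(-2*k) / 5.3^2   % ~4.3e12
k = 10; 3 * 2^105 * 10^(-2*k) / 5.3^2   % ~4.3e10
k = 15; 3 * 2^105 * 10^(-2*k) / 5.3^2   % ~4.3, the rough n=4 figure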

The relative error in M operations as described is at most (1+2^-53)^M - 1, assuming all input, intermediate, and final values do not underflow or overflow.
Consider converting a real number a0 to double precision. The result is some number a0•(1+e), where -2^-53 ≤ e ≤ 2^-53 (because conversion to double precision should always produce the closest representable value, the quantum for double-precision values is 2^-52 of the highest bit, and the closest value is always within half a quantum). For further analysis, we will consider the worst case value of e, 2^-53.
When we multiply one (previously converted) value by another, the mathematically exact result is a0•(1+e) • a1•(1+e). The result of the calculation has another rounding error, so the calculated result is a0•(1+e) • a1•(1+e) • (1+e) = a0 • a1 • (1+e)^3. Obviously, this is a relative error of (1+e)^3. We can see the error accumulates simply as (1+e)^M for these operations: each operation multiplies all previous error terms by 1+e.
Given N inputs, there will be N conversions and N-1 multiplications, so the worst error will be (1+e)^(2N-1).
Equality for this error is achieved only for N≤1. Otherwise, the error must be less than this bound.
Note that an error bound this simple is possible only in a simple problem, such as this one with homogeneous operations. In typical floating-point arithmetic, with a mixture of addition, subtraction, multiplication, and other operations, computing a bound so simply is generally not possible.
For N=10^12 (M=2•10^12-1), the above bound is less than 2.000222062•10^12 units of 2^-53, and is less than .0002220693. So the calculated result is good to something under four decimal digits. (Remember, though, you need to avoid overflow and underflow.)
(Note on the strictness of the above calculation: I used Maple to calculate 1000 terms of the binomial (1+2^-53)^(2•10^12-1) exactly (having removed the initial 1 term) and to add a value that is provably larger than the sum of all remaining terms. Then I had Maple evaluate that exact result to 1000 decimal digits, and it was less than the bound I report above.)
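A rough cross-check of that figure in Matlab (an approximation, not a proof like the Maple computation; log1p and expm1 keep the tiny terms from being lost to cancellation):
M = 2*10^12 - 1;
expm1(M * log1p(2^-53))   % approximates (1+2^-53)^M - 1, about 2.2207e-4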

-
The error on input data is fundamental anyway; in the case of random fractions of pi as input, each one may well accumulate 3 rounding errors if computed by M_PI * num / den, and that would raise the error to (1+e)^(4*M-1)-1 – aka.nice Aug 21 '12 at 10:05
-
@aka.nice: I do not understand what you mean. If one is taking 10^12 numbers from π, they will certainly be calculated by special means, not by using M_PI and normal IEEE 754 floating point. I think π was only offered as an example. – Eric Postpischil Aug 21 '12 at 12:41
-
agree, it's more a reminder for OP: the way inputs are computed is essential, they might exceed e and then dominate the global error – aka.nice Aug 21 '12 at 13:57
-
"Equality for this error is achieved only for N≤1." I don't think you'll ever achieve equality: the worst case occurs when `a0` is `1 + 2**-53`, and in that case `e` is `-2**-53 / (1 + 2**-53)`, strictly less than `2**-53`. (And the formula doesn't make much sense when N = 0, since you can't have -1 multiplications.) – Mark Dickinson Aug 21 '12 at 17:25
-
@MarkDickinson: You are right. I was thinking you could have exactly 1/2 ULP error in the N=1 case, but of course that makes the relative error less than 2^-53. For the case of no input values, the degenerate product would be 1, which of course is exact. :-) – Eric Postpischil Aug 21 '12 at 17:49
-
@EricPostpischil yes, I suspect a tight bound is less than what you provided, it does not seem possible to have max error on both inputs and multiplications... Until now, I did not find better than (1+2^-26+2^-53)*(1+2^-27+2^-53)*(1+2^-28+2^-53)*(1+2^-29+2^-53) whose error is quasi perfect after 1st *, but about 95% of (1+e/(1+e))^5-1 after 2nd * and 89% of (1+e/(1+e))^7-1 after 3rd * – aka.nice Aug 21 '12 at 21:43
A 64-bit floating point number, assuming the standard IEEE 754, has 52+1 bits of mantissa.
That means the relative precision is between 1.0000...0 and 1.0000...1, where the number of binary digits after the binary point is 52. (You can think of the 1.000...0 as what is stored in binary in the mantissa, AKA the significand.)
The worst-case error is 1/2 to the power of 52, divided by 2 (half the resolution). Note I chose the relative precision as close to 1.0 as possible, because it is the worst case (between 1.111..01 and 1.111..11, for example, it is relatively more precise).
In decimal, the worst case relative precision of a double is 1.11E-16.
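As a quick Matlab sanity check:
eps/2    % ans = 1.1102e-16, i.e. 2^-53, the worst-case relative error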
If you multiply N doubles with this precision, the new relative precision (assuming no additional error due to intermediate rounding) is:
1 - (1 - 1.11E-16)^N
So if you multiply pi (or any double) 10^12 times, the upper bound on the error is:
1.1102e-004
That means you can have confidence in about 4-5 digits.
You can ignore intermediate rounding error if your CPU has support for extended precision floating point numbers for intermediate results.
If there is no extended precision FPU (floating point unit) used, rounding in intermediate steps introduces additional error (of the same size as the error due to each multiplication). That means a strict lower bound can be calculated as:
1 -
((1 - 1.11E-16) * (1 - 1.11E-16) * (1 - 1.11E-16)
* (1 - 1.11E-16) * (1 - 1.11E-16) % for multiplication, then rounding
... (another N-4 lines here) ...
* (1 - 1.11E-16) * (1 - 1.11E-16))
= 1-(1-1.11E-16)^(N*2-1)
If N is too large, writing out this product takes too long to run, but the closed form can be evaluated directly. For N=10^12, the possible error (with intermediate rounding) is 2.2204e-004, which is double compared to 1-(1 - 1.11E-16)^N = 1.1102e-004 without intermediate rounding.
Approximately, we can say that intermediate rounding doubles the error.
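For example, both figures can be reproduced in Matlab without writing out the long product (this just evaluates the closed forms above; log1p and expm1 avoid cancellation):
N = 10^12;
u = 1.11e-16;                  % per-operation relative error used in this answer
-expm1(N * log1p(-u))          % 1-(1-u)^N, no intermediate rounding: ~1.11e-4
-expm1((2*N-1) * log1p(-u))    % 1-(1-u)^(2N-1), with intermediate rounding: ~2.22e-4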
If you multiplied pi 10^12 times and there was no extended precision FPU (this might be because you write intermediate steps to memory, and perhaps do something else, before continuing; just make sure the compiler hasn't reordered your instructions so that there is no FPU result accumulation), then a strict upper bound on your relative error is:
2.22e-004
Note that confidence in a given number of decimal places doesn't mean that exactly those printed decimal digits will always be correct.
For example, if the answer is:
1.999999999999, and the error is 1E-5, the actual answer could be 2.000001234.
In this case, even the first decimal digit was wrong. But that really depends on how lucky you are (whether the answer falls on a boundary such as this).
This solution assumes that the doubles (including the answer) are all normalized. For denormalized results, obviously, the number of binary digits by which it is denormalized will reduce the accuracy by that many digits.

-
"If N is too large, it takes too long to run. The answer is 2.2204e-010. Compared to without intermediate rounding 1-(1 - 1.11E-16)^N=1.1102e-012." Did you mean 2.2204e-012? – Chalk Aug 21 '12 at 02:57
-
No, what I wrote was correct, I edited to clarify. When you have extended precision (ie. effectively no rounding required at intermediate steps), the maximum error is half compared to if you rounded at each intermediate step. – ronalchn Aug 21 '12 at 03:05
-
I suspect I'm misunderstanding something. With intermediate rounding, it seems like we're losing two extra digits of precision, not just doubling the error: 2.22e-010 is 200 times 1.11e-012? – Chalk Aug 21 '12 at 03:15
-
Oops, I calculated it wrong. Changed it to 2.22e-10. I was fiddling with the code in matlab, and pasted in the answer when N=1E6. Answer changed to fix it. – ronalchn Aug 21 '12 at 03:22
-
Thanks... sorry, responding on my phone, so I couldn't re-run the calculation myself. – Chalk Aug 21 '12 at 03:27
-
2.22e-4 is not a strict bound, in the absence of further proof. 2.220693e-4 is. Rounding your numbers down when doing error calculations is not strict. – Eric Postpischil Aug 21 '12 at 03:42
-
The poser of the question indicated, in response to a request for clarification, that they wanted an absolute guarantee. Several steps in this answer preclude an absolute guarantee: Rounding error bounds down (using 1.11e-16 rather than the correct 2^-53), ignoring intermediate rounding (even when extended precision is in use), justifying doubling the error “empirically” and “by estimation”, using an operation that has “almost exactly the same effect”. These steps simply do not constitute proof and do not give an absolute guarantee. – Eric Postpischil Aug 21 '12 at 04:25