
I am currently using two approaches to calculate a partial sum of the harmonic series:

  1. Brute forcing it (iteration)
  2. Using certain formulas which offer a shortcut to the solution

The problem which is bugging me is this: a double is precise to about 15 significant decimal digits, and a long double to about 19. I am dealing with partial sums ranging from 5 to about 50. For example, to reach a sum of 5 you need about 83 terms of the harmonic series, and to reach a sum of 50 you'd need about 2911068851283175669760 terms. I'm currently using double.
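
For reference, those term counts come from inverting the approximation H_n ≈ ln(n) + gamma, i.e. n ≈ exp(S - gamma); here is a minimal C sketch of that estimate (terms_needed is just an illustrative name, not code from my program):

    #include <math.h>
    #include <stdio.h>

    /* Rough estimate of how many terms are needed for H_n to reach a target
       sum S, by inverting H_n ~= ln(n) + gamma.  Illustrative only: a 22-digit
       count like the one above exceeds even long double precision, so only
       the leading digits are meaningful. */
    static long double terms_needed(long double target_sum)
    {
        const long double gamma = 0.57721566490153286060L; /* Euler-Mascheroni */
        return expl(target_sum - gamma);
    }

    int main(void)
    {
        printf("S =  5 -> about %.0Lf terms\n", terms_needed(5.0L));  /* ~83 */
        printf("S = 50 -> about %Lg terms\n", terms_needed(50.0L));   /* ~2.9e21 */
        return 0;
    }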

When calculating big sums (e.g. > 20), am I losing accuracy, whether I compute them by iteration or with the formulas?

If so, would it be beneficial to use some high precision libraries?

Apollo
  • Have you tried Kahan summation? – rtoijala Jun 06 '19 at 12:17
  • There are several C libraries offering arbitrary precision. A quick search finds the following, many more exist: LibBF, GMPLib (Disclaimer: I haven't personally used any of these) – Jun 06 '19 at 12:23
  • Post code. Sum small terms first. – chux - Reinstate Monica Jun 06 '19 at 12:29
  • Kernighan & Plauger say, in their old but classic book [The Elements of Programming Style, 2nd Edn](https://smile.amazon.com/dp/0070342075/) (1978), that: _A wise old programmer once said "floating point numbers are like little piles of sand; every time you move one, you lose a little sand and gain a little dirt"._ See also [Is floating point arithmetic broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) – Jonathan Leffler Jun 06 '19 at 13:03
  • Calculating 2911068851283175669760 (22 digits) worth of terms, with your new multi-core computer (let's give it a million cores), and a fantastic clock (let's make it 1 THz = 1000 GHz), and assuming a term takes a single clock cycle to compute, we've got a calculation that's going to take on the order of a few hours (10^22 / (10^6 * 10^12) = 10^4 seconds). But I fear you're missing out on some of those technological enhancements; maybe you have 5 GHz and 20 cores, and then you're going to be taking quite a lot longer: 10^22 / (20 * 5 * 10^9) = 10^11 seconds, which is around 3,000 years, I think. – Jonathan Leffler Jun 06 '19 at 13:11
  • Thanks for your advice. I've never heard of Kahan summation, but I'll give it a try. Jonathan, of course I won't be dealing with such an amount of elements using iteration. The mentioned number of elements is calculated by a formula, not by iteration. – Apollo Jun 06 '19 at 13:19

1 Answer


Using any language that provides exact arithmetic on (unbounded) large integers and fractions, we can evaluate the rounding error quite easily with very naive code, at least for small n (for larger n, exact arithmetic may become prohibitively slow).

I'll arbitrarily choose Squeak Smalltalk here to illustrate this.

First, let's define a closure that will return the exact partial sum:

hn := [:n | (1 to: n) detectSum: [:i | i reciprocal]].
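
The same exact reference sum can also be computed in C with GMP's rational type (GMPLib was mentioned in the comments); a minimal sketch assuming GMP is available and linked with -lgmp, with harmonic_exact as an illustrative name:

    #include <gmp.h>

    /* Exact partial sum H_n as a rational, mirroring the hn closure above.
       The caller must mpq_init sum first.  Only feasible for modest n, since
       numerator and denominator grow very quickly. */
    static void harmonic_exact(mpq_t sum, unsigned long n)
    {
        mpq_t term;
        mpq_init(term);
        mpq_set_ui(sum, 0, 1);
        for (unsigned long i = 1; i <= n; i++) {
            mpq_set_ui(term, 1, i);   /* 1/i is already in canonical form */
            mpq_add(sum, sum, term);
        }
        mpq_clear(term);
    }

mpq_get_d(sum) then converts the exact value to a double when a comparison is needed.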

Then another one that will return the Float (double precision) sum by iteration, starting with the smaller terms:

fn := [:n | (n to: 1 by: -1) detectSum: [:i | i asFloat reciprocal]].
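
In C, the equivalent double-precision loop, still adding the smallest terms first, would look like this (harmonic_backward is an illustrative name, not the OP's code):

    /* H_n in double precision, adding the smallest terms first so that the
       tiny contributions are not absorbed by an already-large running sum. */
    static double harmonic_backward(unsigned long n)
    {
        double sum = 0.0;
        for (unsigned long i = n; i >= 1; i--)
            sum += 1.0 / (double)i;
        return sum;
    }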

And now another closure that will estimate the error in terms of ulp:

un := [:n | | f | ((f := fn value: n) asFraction - (hn value: n)) abs / f ulp].
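
A cheaper C approximation of the same measurement, for readers without exact rationals at hand, is to use a long double sum as the reference; note this is only an approximation of the method used here, since long double is itself inexact:

    #include <math.h>

    /* Approximate the error of the double-precision sum in ulps, using a
       long double sum as the reference instead of exact rational arithmetic. */
    static double error_in_ulp(unsigned long n)
    {
        double d = 0.0;
        long double ld = 0.0L;
        for (unsigned long i = n; i >= 1; i--) {    /* smallest terms first */
            d  += 1.0 / (double)i;
            ld += 1.0L / (long double)i;
        }
        double one_ulp = nextafter(d, INFINITY) - d;  /* spacing at d */
        return (double)(fabsl((long double)d - ld) / one_ulp);
    }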

Now, let's search for the worst case among the first 1000 partial sums:

i := (1 to: 1000) detectMax: [:n | un value: n ].
{i. un value: i}.

The result is #(807 6.101840327204507), about 6 ulp of error for H_807. That's not much, as you can see.

If we instead try to sum the bigger terms first:

fn := [:n | (1 to: n) detectSum: [:i | i asFloat reciprocal]].

Then the result is #(471 5.714808765246509), 5.7 ulp; surprisingly, a bit less.

We can try to look a bit further with the latter fn:

un value: 10000.
-> 1.8242087614993667

Using memoization, I find that the maximum error is about 54 ulp over the first 100,000 terms (H ≈ 12) and about 26 ulp when summing more carefully, starting with the smaller terms.

So, up to a few thousand terms, there is no point in starting the summation with the smaller terms (summing in natural order also enables memoization if you're going to scan increasing n, since H_(n+1) = H_n + 1/(n+1)), and Kahan summation is not necessary either, unless you really care about the last few ulp.

Kahan summation becomes interesting for millions to billions of terms, but for bigger sums the formulas are going to be much faster and also more accurate: if you have a good approximation of the Euler-Mascheroni constant, the natural logarithm will never be more than a few ulp off. But you should show your code, otherwise this is only speculation.
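
For reference, compensated (Kahan) summation of this series would look roughly like this in C (a sketch only; harmonic_kahan is an illustrative name):

    /* Kahan (compensated) summation of H_n: the running compensation c
       recaptures the low-order bits lost when each term is added to sum. */
    static double harmonic_kahan(unsigned long n)
    {
        double sum = 0.0, c = 0.0;
        for (unsigned long i = 1; i <= n; i++) {
            double y = 1.0 / (double)i - c;
            double t = sum + y;
            c = (t - sum) - y;   /* what just got lost in sum + y */
            sum = t;
        }
        return sum;
    }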

On the formula side, for example, this naive second-order formulation

fn := [:n | n ln + (0.5772156649015328606 + ((n*2.0) reciprocal - (n squared*12.0) reciprocal))].

will give about 5.5 ulp of error for small values of n, but less than 1.5 ulp above 100,000 with a decent macOS libm.
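
In C, the same second-order approximation would read roughly as follows (harmonic_approx is an illustrative name):

    #include <math.h>

    /* Second-order asymptotic approximation of the partial sum:
       H_n ~= ln(n) + gamma + 1/(2n) - 1/(12n^2). */
    static double harmonic_approx(double n)
    {
        const double gamma = 0.5772156649015328606; /* Euler-Mascheroni */
        return log(n) + gamma + 1.0 / (2.0 * n) - 1.0 / (12.0 * n * n);
    }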

aka.nice
  • Thanks for your answer and time. I'm not really fond of showing my code here considering it is my thesis code, and I'm not sure what the policy of my faculty is regarding such matters. I was expecting bigger errors, and this was quite a surprise. Once again, thanks for your answer. – Apollo Jun 08 '19 at 13:01
  • Since the harmonic series grows very slowly, double precision retains good absolute precision for quite a long time. This explains why summing the bigger terms first does not make a big difference over the first few thousand terms. I found less than 500 ulp (9 bits) of error for the first 1 million terms, even when summing the bigger terms first. The error growth rate is increasing, though. – aka.nice Jun 08 '19 at 21:17