12

I was doing some calculations for planning an improved implementation of my prime sieve when I noticed that the Libre Office spreadsheet was showing the wrong values for numbers far below 2^53, which is the limit for precise integer calculations in FoxPro and a multitude of other languages that internally use a C double (a.k.a. IEEE 754 double).

Some quick checks showed wrong results for numbers as low as 2^50. What's worse, the flaky piece of software did not give any warning that it was working beyond its operational limits and that the displayed values were only approximate (even if the correct values were entered manually). I guess they thought what's good enough for POS like Excel then must be good enough for them (but they ought to have named it 'Guesstimate' instead 'Calc' then).

Anyway, is it possible to put the spreadsheet into some kind of accurate mode that gives at least the precision of a double, and/or get it to show some kind of warning when some calculation exceeds its numerical capabilities?

Or is there some other spreadsheet available that's suitable for programmers in an age where even phones and toasters are powered by 64-bit chips?

Background: in this particular case the need for precision arises because the exactness of the formulas is verified by comparing certain result columns with lists of well-known numbers (like the number of primes up to 2^k, available from OEIS) or with numbers derived from instrumented code. The idea is to get the formulas right by checking them for doable ranges - up to somewhere between 2^32 and 2^40 - and then to use those formulas for studying the behaviour for ranges up to 2^64 (where it is impossible to get a complete picture by running exhaustive tests).

P.S.: I found that the problem has been analysed extensively in On the Numerical Accuracy of Spreadsheets (pdf). It seems that gnumeric holds up a tad better than the rest but it's more a case of the one-eyed among the blind than of really passing muster; besides, it's not available for Windows and thus would require a Linux VM to be fired up on the platform I am developing on and for...

UPDATE/WORKAROUND

Up to 2^49 there are no problems, so it's possible to work normally. This is sufficient to get a precise idea of things. In the final tables, big numbers and high-precision results (computed elsewhere) are entered as pre-formatted text (or rather imported/pasted) so that Calc can't bork the display. The numbers can still be referenced by applying the VALUE() function, and for many inline calculations the internal precision is quite sufficient - especially for graphing.

The image below demonstrates what I posted in one of the comments, which is that the internal precision is higher than what's displayed. How are we supposed to find out then what the actual values are if the program doesn't show them? Play around with adding/subtracting small values, watch changes, create and test hypotheses regarding the relation of actual values, supposed rounding behaviour and displayed values? The mind truly boggles.

same displayed value but different behaviour

I'm offering a bounty for any of the following:

  • a practical way of restoring Calc to sane behaviour in the sense of WYSIWI (what you see is what's inside, or rather if it's inside then it can be displayed) without reducing internal precision and without forking the whole bloody source tree and fixing it oneself

  • a pointer to a spreadsheet with higher precision than IEEE 754 double (preferably arbitrary precision, like GP/PARI but as a spreadsheet) that's public domain or shareware; an online/cloud thingies would be acceptable

The reason for the first item is twofold: for one thing it's a bloody nuisance when spreadsheet precision is lower than the universal IEEE 754 double because it means that experience/knowledge regarding accuracy/precision/stability at that baseline level doesn't carry over. For another, it's an even greater nuisance if we can't trust that the displayed values are correct even when we know for certain that the actual values are.

The reason for the second item is the simple fact that it's rather cumbersome to have to script things in another program for computing things at high precision or with bignums, or to have to edit/compile/run programs for the purpose. Besides the awkward split into a spreadsheet and a bunch of scripts or source files, it would be more natural and convenient to work just inside the spreadsheet, period.

Cœur
  • 37,241
  • 25
  • 195
  • 267
DarthGizka
  • 4,347
  • 1
  • 24
  • 36
  • Are you forced to use a spreadsheet to solve your problem? Can you use something like Python with bigdecimal, a computer algebra system, etc.? – Nayuki Dec 28 '15 at 20:34
  • On a side note, if you're worried about accuracy, then you probably shouldn't be using a floating point type at all. Floating point numbers aren't intended to represent numbers exactly. Is there a particular reason why you are using a system that uses floating point numbers? (This ties into the question above by Nayuki Minase.) – Jim Pedid Dec 28 '15 at 21:10
  • 1
    Good point. Usually I use [gp PARI](http://pari.math.u-bordeaux.fr/) and [gnuplot](http://www.gnuplot.info/) for my arbitrary-precision needs, occasionally C++ with libraries like [gmp](https://gmplib.org/) or [MPIR](https://en.wikipedia.org/wiki/MPIR_%28mathematics_software%29), sometimes Java/C# with their builtin bignum support. `__int64`, `__int128` and `long double` go a long way, too. However, they all pretty much require that you know up front exactly what you want, and are less suited for thinking, exploring and trying out stuff, or visualising and documenting things. Ergo spreadsheet. – DarthGizka Dec 28 '15 at 21:16
  • The point is that the style of working with a spreadsheet is a perfect fit for my needs here; contrast this to the hoops you have to jump through for computing and visualising/inspecting things with the other systems I mentioned above. In fact, I'm usually much faster coding my thoughts in FoxPro (an xBASE dialect) which offers a 64-bit double under the hood, being careful not to exceed the range of exact integer representation (i.e. 2^53 as upper limit). It requires only a tiny fraction of the code that C++ or Java or C# need with their excessive, noisy verbosity. – DarthGizka Dec 28 '15 at 21:30
  • 2
    It's disappointing that spreadsheets - which are expressly intended for calculation - offer less accuracy than even the simplest of scripting languages, and none really seems numerically fit for more than keeping track of one's pocket money. The accuracy of spreadsheets doesn't seem to have improved since the times of 16-bit DOS... – DarthGizka Dec 28 '15 at 21:34
  • Could you put your data into Base? Base by default uses the HyperSQL Database (HSQLDB) engine (it's zipped inside the Base file, though you have the option to use Base as a front-end with a separate database back-end of your choice), which has a 64-bit double as a field option. Calc easily connects to Base as long as the file is 'registered' on your computer if you want to use Calc functions (like graphing) on the data once the calculations have been done with Base queries. – Lyrl Dec 29 '15 at 14:39
  • Interesting idea, but whatever is wrong with Calc is wrong with Base as well. Fields created as numeric or double drop bits/digits from inputted values; e.g. entering 1125899906842623 (which is 2^50-1) gives 112589990684262**0** just like in Calc and Excel. The same happens if DBFs with correct values are imported; Base can read doubles written by FoxPro but it mangles the values. Also, I tried to export from Base to DBF (via Calc, as documented) to see what's what but the exporter cr*pped out and failed. Import from CSV **previews** correct values (!) but the actual result is mangled again. – DarthGizka Dec 29 '15 at 15:57
  • 1
    I'm sorry to hear Base has this problem as well. Perhaps double only allows a 32-bit number and using the last bit for positive/negative - so if your numbers are integers BIGINT might work? Thanks for testing and reporting back your results, it looks like you've tested everything any of us have thought of so far, and hopefully this will at least help others in the future. – Lyrl Dec 30 '15 at 14:39
  • 1
    Alas, with BIGINT it's exactly the same story. However, it seems that the problem is in the **display logic**. Adding/subtracting small values from 'borked' numbers shows that they do not behave the same way as when you enter the displayed value directly. E.g. entering 4503599627370495 (== 2^52-1) shows 450359962737049**0** but adding 1 gives 4503599627370**500** whereas adding 1 to 4503599627370490 gives no change. I.e. the internal values are different even if the displayed values are the same. And yes, I do hope that someone else might find something useful in the reports of my failures... – DarthGizka Dec 30 '15 at 18:03
  • The limit is probably 999,999,999,999,999 (max 15 digit number), which is between 2^49 and 2^50. BTW, Google Sheets has the same problem. – Rick Regan Dec 30 '15 at 21:50
  • @ Rick Regan: If so then the implementation is sloppy - 999.999.999.999.999 gets displayed as 1.000.000.000.000.000 by Calc (4.4.7.2 here) though even Excel manages to get that one right... I'm guessing there's some one-size-fits-all display formatting coded by someone who believed that a double can hold any number that might need to be displayed (even in Base where the docs say about DECIMAL/NUMERIC "range: unlimited, 50 places in the GUI"). P.S.: the locale is German here, hence the '.' as thousands separator instead of ','. – DarthGizka Dec 30 '15 at 22:09
  • I was testing it in Sheets, but I just tried it in Calc and I see the rounding you describe. I was guessing that they all handled it the same way (that is, up to the "magic number" of 15 digits). – Rick Regan Dec 31 '15 at 01:27
  • 1
    have you seen: http://precisioncalc.com/123quattro.html? It's for MS Excel, but this post suggest that there is some chance that author will crate version for Open Office: http://en.libreofficeforum.org/node/7729 – Maciek Sawicki Jan 02 '16 at 04:30
  • @ Maciek Sawicki: excellent! You just earned yourself a nice bounty. :-) Could you please write this up as a proper answer, perhaps with a screenie of an example? This should be immensely useful for many people who - like myself - have access to a working copy of Excel at work or elsewhere. – DarthGizka Jan 02 '16 at 15:28
  • @Maciek Sawicki: please write this up as an answer - I can't award the bounty to a comment. – DarthGizka Jan 05 '16 at 19:30
  • 1
    Good to know my answer helped. I will rewrite the answer later today :) – Maciek Sawicki Jan 05 '16 at 20:01
  • 1
    I filled a bug report: https://bugs.documentfoundation.org/show_bug.cgi?id=96918 – Cœur Jan 06 '16 at 07:23
  • I doubt that anything related to IEEE 754 double can be used if you need absolute precision. Even storing a "simple" value like `3.1415` into a `double` will internally store something like `3.1415000000000002` - at least using vc++ on win32. – erg Jan 06 '16 at 14:22
  • @Maciek Sawicki: the bounty will go down the drain if you don't post your answer really soon now... :-/ – DarthGizka Jan 07 '16 at 20:21
  • ups, looks like I missed it. I'm sorry for that :( Thank you @DarthGizka for doing your best to motivate me to rewrite my comment! :) – Maciek Sawicki Jan 07 '16 at 21:10
  • @DarthGizka Tools->Options->Calc->Calculate "Precision as shown" – Cœur Jan 10 '16 at 02:18
  • @Cœur: This does not affect the problem discussed here in any way, as you can easily find out if you check for yourself. The setting refers to a **voluntary reduction in precision** requested by the user, like when Calc has been directed via the formatting options to display only a certain number of decimal places. – DarthGizka Jan 10 '16 at 08:09

3 Answers3

2

PrecisionCalc is an Excel add on that allows computations with arbitrary. precision.

There is a Free edition with 3,000 significant digits limit.

Maciek Sawicki
  • 6,717
  • 9
  • 34
  • 48
2

Calc 5.1 (soon available) will fix the issue: https://bugs.documentfoundation.org/show_bug.cgi?id=96918

Cœur
  • 37,241
  • 25
  • 195
  • 267
1

Maciek Sawicki (see question comments) has found an addon for Excel that allows computations with arbitrary precision: xlPrecision/PrecisionCalc. This is shareware with an extremely reasonable price.

DarthGizka
  • 4,347
  • 1
  • 24
  • 36