
I'm starting to learn Haskell using Miran Lipovaca's famous book, but my curiosity stopped me at the first interactions with the Glasgow Haskell Compiler's interactive shell (ghci).

In particular, I started playing around by dividing two integers to get floating-point decimal numbers, basically to see how Haskell handles them automatically and to learn more about its automatic casting.

λ> 1/3
0.3333333333333333
λ> 4/3
1.3333333333333333
λ> 3424/3
1141.3333333333333

These told me Haskell uses a total of 17 digits (or 18 characters including the dot?), whether or not they're significant. However, these results occurred as well:

λ> 14/3
4.666666666666667
λ> 34/3
11.333333333333334
λ> 44/3
14.666666666666666

Why is the first shorter by one digit? Why are the others erroneously rounded?

It's probably a stupid question, but I feel that by knowing the answer to such basic things I can start with a deeper understanding of how the language (or the interpreter?) works, and learn a little more about what's under the hood.

Jeffrey Lebowski
  • Do you understand floating point? – Thomas M. DuBuisson Oct 13 '18 at 15:51
  • I don't think this is a Haskell-specific problem. Floating point errors are [a well-known problem](http://www.smbc-comics.com/comics/20130605.png), and the full mechanics of floating-point numbers can be *extremely* complex. If you really want to know about it, there are [plenty](https://www.ias.ac.in/article/fulltext/reso/021/01/0011-0030) [of](http://flint.cs.yale.edu/cs422/doc/art-of-asm/pdf/CH14.PDF) [specifications](https://people.eecs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF) available online. – AJF Oct 13 '18 at 16:04
  • These floating point numbers are represented in base 2 in a computer, and all operations happen in base 2. They are converted in base 10 only just before they are printed on screen, for human readability. – chi Oct 13 '18 at 16:04
  • @AJFarmar Indeed it's not Haskell-specific: I just tried running the above divisions on Python, and got the same rounding errors. (Hardly a surprise, of course.) – chi Oct 13 '18 at 16:10
  • it works exactly as it should. this is really not a haskell problem. it's just the way how floating point numbers work. – Michael Oct 13 '18 at 16:29
  • Got it, thanks! That makes me want to try and learn Assembly before. – Jeffrey Lebowski Oct 13 '18 at 17:52
  • Actually the decimal printout should be `14/3 = 4.6666666666666670`, it seems that trailing zeros are suppressed. – gammatester Oct 13 '18 at 18:21

1 Answer


The Haskell specification leaves some slack in its requirements for floating-point formats and behaviors. It says Haskell’s Double type should cover IEEE “double-precision” in range and precision, that the default operations defined by the Haskell Prelude do not conform to certain standards, but that some aspects of IEEE floating-point have been accounted for in the Prelude class RealFloat. For this answer, I will demonstrate how the results in the question arise from the IEEE-754 basic 64-bit binary floating-point format and arithmetic.

The question states:

These told me Haskell uses a total of 17 digits (or 18 characters including the dot?), whether or not they're significant.

This is incorrect. As assumed in this answer, and as is likely the case in the OP’s Haskell implementation, the number has 53 binary digits, not 17 decimal digits. 17 digits may be displayed, but that is the result of converting the number for display, not an exact representation of the actual value used for computing.
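
You can check the binary precision from ghci directly. This is a quick look, assuming GHC’s standard 64-bit Double; floatDigits and decodeFloat are part of the Prelude class RealFloat mentioned above:

λ> floatDigits (1/3 :: Double)  -- number of digits in the binary significand
53
λ> decodeFloat (1/3 :: Double)  -- (significand m, exponent e), with the value equal to m * 2^e
(6004799503160661,-54)

So the stored value is 6004799503160661 * 2^-54, a 53-bit significand, regardless of how many decimal digits are printed later.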

The first three cases shown are unremarkable, but, for illustration, we show the internal values:

λ> 1/3
0.3333333333333333 -- 0.333333333333333314829616256247390992939472198486328125
λ> 4/3
1.3333333333333333 -- 1.3333333333333332593184650249895639717578887939453125
λ> 3424/3
1141.3333333333333 -- 1141.333333333333257542108185589313507080078125
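
If you want to reproduce these long expansions yourself, the little helper below is a sketch added for illustration (the name exactDecimal is made up, not a library function). It relies on the fact that toRational applied to a finite Double returns a fraction n % d in which d is a power of two, so n / 2^k equals (n * 5^k) / 10^k and the exact decimal expansion terminates:

import Data.List (dropWhileEnd)
import Data.Ratio (numerator, denominator)

-- Exact decimal expansion of a finite Double.
exactDecimal :: Double -> String
exactDecimal x = sign ++ intPart ++ "." ++ fracPart'
  where
    r = toRational x
    n = abs (numerator r)
    d = denominator r
    k = pow2 d                          -- d == 2^k for finite Doubles
    s = show (n * 5 ^ k)                -- the value scaled by 10^k, as an integer
    padded = replicate (k + 1 - length s) '0' ++ s
    (intPart, fracPart) = splitAt (length padded - k) padded
    fracPart' = case dropWhileEnd (== '0') fracPart of
                  "" -> "0"
                  t  -> t
    sign = if x < 0 then "-" else ""
    pow2 1 = 0
    pow2 m = 1 + pow2 (m `div` 2)

Loaded into ghci, it reproduces the values shown above:

λ> putStrLn (exactDecimal (1/3))
0.333333333333333314829616256247390992939472198486328125
λ> putStrLn (exactDecimal (4/3))
1.3333333333333332593184650249895639717578887939453125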

Now we will look at the surprising cases, starting with:

λ> 14/3
4.666666666666667

What is surprising here is that it is shown with 16 decimal digits, whereas the previous results were shown with 17. Why?

I do not see rules in the Haskell specification about how floating-point numbers should be formatted when displayed or otherwise converted to decimal. One rule that would explain this is one adopted by Java and some other software: produce just enough decimal digits that converting the decimal numeral back to the floating-point format produces the original number. That is, just enough digits are produced to uniquely identify the internal value. (Other not uncommon rules are to convert to a fixed number of digits, or to convert to a fixed number of digits and then remove trailing zeroes. Both the just-enough rule and the remove-trailing-zeroes rule would produce the results shown in the question. I will demonstrate the just-enough rule in this answer.)
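
GHC’s base library exposes the digit-generation step behind such a rule: Numeric.floatToDigits returns the shortest list of decimal digits that uniquely identifies a value, and, as far as I know, show for Double is built on it. A quick look in ghci (assuming GHC’s Double) shows exactly the 16- and 17-digit counts discussed below:

λ> import Numeric (floatToDigits)
λ> floatToDigits 10 (14/3 :: Double)   -- 16 digits; the second component is the decimal exponent
([4,6,6,6,6,6,6,6,6,6,6,6,6,6,6,7],1)
λ> floatToDigits 10 (4/3 :: Double)    -- 17 digits
([1,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3],1)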

The value produced by 14/3 is exactly 4.66666666666666696272613990004174411296844482421875. Together with the next lower and next greater representable values, we have (with a space inserted after the 16th digit, to assist visualization):

4.666666666666666 0745477201999165117740631103515625
4.666666666666666 96272613990004174411296844482421875
4.666666666666667 850904559600166976451873779296875

If we were converting 4.666666666666667 to floating-point, which of the above values should be the result? The middle one is the closest; it is only about .04 away (in units of the lowest digit), whereas the others are .93 and .85 away. Thus the 16 digits “4.666666666666667” are sufficient to uniquely identify 4.66666666666666696272613990004174411296844482421875.
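
As a quick sanity check (assuming read converts decimal text to Double with correct rounding, which GHC’s does as far as I know), the 16-digit numeral indeed reads back as the very same value:

λ> (read "4.666666666666667" :: Double) == 14/3
True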

In contrast, consider 4/3, which is 1.3333333333333332593184650249895639717578887939453125. It and its two neighbors are:

1.333333333333333 03727386009995825588703155517578125
1.333333333333333 2593184650249895639717578887939453125
1.333333333333333 481363069950020872056484222412109375

Again, the space is after the 16th digit. If we were converting the 16-digit 1.333333333333333 to floating-point, which of these should be the result? Now the first one is closer; it is only .04 units away. So “1.333333333333333” fails to represent the correct internal value. We need the 17-digit “1.3333333333333333” in order to uniquely identify the desired value.
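
With the same caveat about read, ghci confirms that 16 digits are not enough here, while 17 are:

λ> (read "1.333333333333333" :: Double) == 4/3
False
λ> (read "1.3333333333333333" :: Double) == 4/3
True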

The next case is:

λ> 34/3
11.333333333333334

The question asks why this is “erroneously rounded.” It is not. The internal value is 11.3333333333333339254522798000834882259368896484375. This number and its two neighboring representable values are:

11.333333333333332149095440399833023548126220703125
11.3333333333333339254522798000834882259368896484375
11.33333333333333570180911920033395290374755859375

The middle one is the closest to 11⅓, so it is the correct result of 34/3. And “11.333333333333334” is a correct conversion of 11.3333333333333339254522798000834882259368896484375 to a 17-digit decimal numeral.
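
This can be checked in ghci using only exact Rational arithmetic, stepping to the neighboring representable values with decodeFloat and encodeFloat (a sketch assuming GHC’s Double): the computed result is strictly closer to 34/3 than either neighbor.

λ> let x = 34/3 :: Double; (m, e) = decodeFloat x
λ> abs (toRational x - 34/3) < abs (toRational (encodeFloat (m - 1) e :: Double) - 34/3)
True
λ> abs (toRational x - 34/3) < abs (toRational (encodeFloat (m + 1) e :: Double) - 34/3)
True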

Similarly, in:

λ> 44/3
14.666666666666666

the candidate results are:

14.666666666666664 29819088079966604709625244140625
14.666666666666666 0745477201999165117740631103515625
14.666666666666667 850904559600166976451873779296875

The middle of these is closer to 14⅔, as it is about .59 units away (using units at the position I marked with a space), whereas the last one is 1.18 units away. So the correct internal result is 14.6666666666666660745477201999165117740631103515625, and the result of converting that to 17 decimal digits is 14.666666666666666.
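
Finally, the neighboring representable values listed throughout this answer can be generated the same way; combined with the exactDecimal helper sketched earlier, ghci reproduces the three candidates for 44/3 exactly (again assuming GHC’s Double):

λ> let (m, e) = decodeFloat (44/3 :: Double)
λ> putStrLn (exactDecimal (encodeFloat (m - 1) e))
14.66666666666666429819088079966604709625244140625
λ> putStrLn (exactDecimal (44/3))
14.6666666666666660745477201999165117740631103515625
λ> putStrLn (exactDecimal (encodeFloat (m + 1) e))
14.666666666666667850904559600166976451873779296875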

Eric Postpischil