164

I know that most decimals don't have an exact floating point representation (see "Is floating point math broken?").

But I don't see why 4*0.1 is printed nicely as 0.4, but 3*0.1 isn't, when both values actually have ugly decimal representations:

>>> 3*0.1
0.30000000000000004
>>> 4*0.1
0.4
>>> from decimal import Decimal
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
Aivar
  • 59
    @MorganThrapp: no it isn't. The OP is asking about the rather arbitrary-looking formatting choice. Neither 0.3 nor 0.4 can be represented exactly in binary floating point. – Bathsheba Sep 21 '16 at 14:13
  • 3
    It's not arbitrary at all, it's showing any significant digits. – Morgan Thrapp Sep 21 '16 at 14:14
  • 4
    Obligatory link under every floating point related question: http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html – BartoszKP Sep 21 '16 at 14:56
  • 43
    @BartoszKP: Having read the document several times, it doesn't explain why Python is displaying `0.3000000000000000444089209850062616169452667236328125` as `0.30000000000000004` and `0.40000000000000002220446049250313080847263336181640625` as `.4` even though they appear to have the same accuracy, and thus doesn't answer the question. – Mooing Duck Sep 21 '16 at 17:36
  • 6
    See also http://stackoverflow.com/questions/28935257/why-0-4-2-equals-to-0-2-meanwhile-0-6-3-equals-to-0-19999999999999998-in - I'm somewhat irritated that it got closed as a duplicate but this one hasn't. – Random832 Sep 22 '16 at 01:23
  • 3
    @Gilles No this is not a duplicate of that question. This is a question about *string representation* of floating points *in python*. – Bakuriu Sep 23 '16 at 10:01
  • 1
    Good ole 2 + 2 = 5 for extremely large values of 2 – coteyr Sep 23 '16 at 18:09
  • 1
    The [What's new in Python 3.1 docs (scroll to end of linked section, just before "New, Improved and Deprecated Modules")](https://docs.python.org/3/whatsnew/3.1.html#other-language-changes) are a useful explanation for why/when Python 2.7/3.1+ have much shorter `float` `repr`s for some values. Straight from the horse's mouth, so to speak. – ShadowRanger Sep 23 '16 at 18:27
  • 15
    Reopened, **please do not close this as a duplicate of "is floating point math broken"**. – Antti Haapala -- Слава Україні Sep 23 '16 at 21:06
  • @Random832 That one should duplicate *to here*; it's still a question about the rule for display, but it's not as well asked or answered as this one. I went ahead and fixed that. – Karl Knechtel Jan 26 '23 at 11:45

4 Answers

310

The simple answer is that 3*0.1 != 0.3 due to quantization (roundoff) error, whereas 4*0.1 == 0.4 because multiplying by a power of two is usually an "exact" operation. Python tries to find the shortest string that would round to the desired value, so it can display 4*0.1 as 0.4 because these are equal, but it cannot display 3*0.1 as 0.3 because these are not equal.

You can use the .hex method in Python to view the internal representation of a number (basically, the exact binary floating point value, rather than the base-10 approximation). This can help to explain what's going on under the hood.

>>> (0.1).hex()
'0x1.999999999999ap-4'
>>> (0.3).hex()
'0x1.3333333333333p-2'
>>> (0.1*3).hex()
'0x1.3333333333334p-2'
>>> (0.4).hex()
'0x1.999999999999ap-2'
>>> (0.1*4).hex()
'0x1.999999999999ap-2'

0.1 is 0x1.999999999999a times 2^-4. The "a" at the end means the digit 10 - in other words, 0.1 in binary floating point is very slightly larger than the "exact" value of 0.1 (because the final 0x0.99 is rounded up to 0x0.a). When you multiply this by 4, a power of two, the exponent shifts up (from 2^-4 to 2^-2) but the number is otherwise unchanged, so 4*0.1 == 0.4.
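To make the hex notation concrete, here is a small sketch that rebuilds 0.1 from its hex digits with exact rational arithmetic. The variable names are my own, and it assumes IEEE 754 doubles, which is what CPython uses on common platforms.

```python
from fractions import Fraction

# '0x1.999999999999ap-4' has 13 hex digits after the point, i.e. 52 bits,
# so the significand is the integer 0x1999999999999a scaled by 2**-52.
significand = 0x1999999999999A
exponent = -4

# Exact value of the double, as a rational number.
stored = Fraction(significand, 2**52) * Fraction(1, 2**-exponent)

same_as_float = stored == Fraction(0.1)        # Fraction(0.1) is the float's exact value
above_one_tenth = stored > Fraction(1, 10)     # the float sits slightly above 1/10

# Same hex digits with the exponent raised by 2, i.e. multiplied by 4.
times_four = float.fromhex('0x1.999999999999ap-2')

print(same_as_float, above_one_tenth, times_four == 0.4 == 4 * 0.1)
```

All three checks should come out true if the platform really uses binary64.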

However, when you multiply by 3, the tiny little difference between 0x0.99 and 0x0.a0 (0x0.07) magnifies into a 0x0.15 error, which shows up as a one-digit error in the last position. This causes 0.1*3 to be very slightly larger than the rounded value of 0.3.
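The rounding step can also be checked with exact arithmetic. The sketch below (names are mine, IEEE 754 doubles assumed) compares the true product of 3 and the stored 0.1 against the two doubles on either side of it.

```python
from fractions import Fraction

exact_product = 3 * Fraction(0.1)   # 3 times the stored 0.1, with no rounding
below = Fraction(0.3)               # exact value of the double nearest to 0.3
above = Fraction(3 * 0.1)           # exact value of what float multiplication returns

# The exact product is not itself a double, so the multiplication
# has to round it to one of its two neighbours.
gap_to_below = exact_product - below
gap_to_above = above - exact_product

print(below < exact_product < above)
print(float(gap_to_below), float(gap_to_above))
```

If I have the arithmetic right, the two gaps are equal, so the product is an exact tie and the default round-half-to-even rule picks the neighbour whose last bit is even (the one ending in ...334).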

Python 3's float repr is designed to be round-trippable, that is, the value shown should be exactly convertible into the original value (float(repr(f)) == f for all floats f). Therefore, it cannot display 0.3 and 0.1*3 exactly the same way, or the two different numbers would end up the same after round-tripping. Consequently, Python 3's repr engine chooses to display one with a slight apparent error.
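The round-trip property is easy to spot-check. This is only a sampling test with an arbitrary seed and range of my choosing, not a proof:

```python
import random

random.seed(0)  # arbitrary seed, only so the run is repeatable

# Spot-check float(repr(f)) == f on a sample of floats.
samples = [random.uniform(-1e6, 1e6) for _ in range(10_000)]
samples += [0.1, 0.3, 0.1 * 3, 0.4, 0.1 * 4]
round_trips = all(float(repr(f)) == f for f in samples)

# Different floats must get different strings; equal floats share one string.
distinct = repr(0.3) != repr(0.1 * 3)
shared = repr(0.4) == repr(0.1 * 4)

print(round_trips, distinct, shared)
```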

nneonneo
  • 25
    This is an amazingly comprehensive answer, thank you. (In particular, thanks for showing `.hex()`; I didn't know it existed.) – NPE Sep 21 '16 at 14:33
  • 2
    @NPE then you might be interested in `float.fromhex()` too, it does the reverse. – Mark Ransom Sep 21 '16 at 14:56
  • Out of curiosity, does Python always try to use the shortest string that is within 0.50 ulp of the given value, or does it use the shortest string that is within e.g. 0.47 ulp of the given value? Some floating-point libraries, if given a decimal string which is almost exactly halfway between two values that are representable as "double", may not always return the value which is closer to the exact value represented by the string, but printing one more decimal digit would solve that problem. – supercat Sep 21 '16 at 16:05
  • 22
    @supercat: Python tries to find the *shortest string that would round to the desired value*, whatever that happens to be. Obviously the evaluated value must be within 0.5ulp (or it would round to something else), but it may require more digits in ambiguous cases. The code is *very* gnarly, but if you want to take a peek: https://hg.python.org/cpython/file/03f2c8fc24ea/Python/dtoa.c#l2345 – nneonneo Sep 21 '16 at 16:16
  • Can we then say that Python's repr uses selective rounding (meaning it doesn't use same simple rounding rule for all floats)? – Aivar Sep 21 '16 at 17:31
  • @supercat This has changed in python3.1, see [the issue with the patch](https://bugs.python.org/issue1580). In any case: the default representation is designed to produce the more readable result that completely preserves the value of the float. This means that `eval(repr(f)) == f` for all floats `f` (and `eval(s)` does the same as `float(s)`). However `float('0.100000000000000012') == 0.1` even though it is actually closer to `0.10000000000000002` (which is the next representable double). – Bakuriu Sep 21 '16 at 18:57
  • 1
    @Bakuriu: I'm not sure what you're saying. The `float` constructor always does correct rounding. The nearest representable float to `0.100000000000000012` is `0.1000000000000000055511151231257827021181583404541015625`, which Python displays as `0.1`. – Mark Dickinson Sep 21 '16 at 19:15
  • 2
    @supercat: Always the shortest string that's within 0.5 ulp. (*Strictly* within if we're looking at a float with odd LSB; i.e., the shortest string that makes it work with round-ties-to-even). Any exceptions to this are a bug, and should be reported. – Mark Dickinson Sep 21 '16 at 19:17
  • What does the `p` stand for in that hex representation? And are these actually valid number literals (like hex integers), or are they only a custom formatting? – Bergi Sep 22 '16 at 02:17
  • @Bergi the `p` takes the place of the `e` in scientific notation, but I don't know the rationale for choosing a different letter. They are *not* valid literals, you need to use the `float.fromhex()` function with a string as I mentioned earlier. – Mark Ransom Sep 22 '16 at 04:07
  • 7
    @MarkRansom Surely they did use something else than `e` because that's already a hex digit. Maybe `p` for *power* instead of *exponent*. – Bergi Sep 22 '16 at 04:12
  • 12
    @Bergi: The use of `p` in this context goes back (at least) to C99, and also appears in IEEE 754 and in various other languages (including Java). When `float.hex` and `float.fromhex` were implemented (by me :-), Python was merely copying what was by then established practice. I don't know whether the intention was 'p' for "Power", but it seems like a nice way to think about it. – Mark Dickinson Sep 22 '16 at 07:50
  • @nneonneo "Python tries to find the shortest string that would round to the desired value." That should be the first line of your answer. – Aleksandr Dubinsky Sep 29 '16 at 12:35
77

repr (and str in Python 3) will put out as many digits as required to make the value unambiguous. In this case the result of the multiplication 3*0.1 isn't the closest value to 0.3 (0x1.3333333333333p-2 in hex), it's actually one LSB higher (0x1.3333333333334p-2) so it needs more digits to distinguish it from 0.3.

On the other hand, the multiplication 4*0.1 does get the closest value to 0.4 (0x1.999999999999ap-2 in hex), so it doesn't need any additional digits.

You can verify this quite easily:

>>> 3*0.1 == 0.3
False
>>> 4*0.1 == 0.4
True

I used hex notation above because it's nice and compact and shows the bit difference between the two values. You can do this yourself using e.g. (3*0.1).hex(). If you'd rather see them in all their decimal glory, here you go:

>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(0.3)
Decimal('0.299999999999999988897769753748434595763683319091796875')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
>>> Decimal(0.4)
Decimal('0.40000000000000002220446049250313080847263336181640625')
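The "one LSB higher" claim can be checked directly. This sketch assumes IEEE 754 binary64, where doubles in [0.25, 0.5) are spaced 2**-54 apart; the variable names are mine.

```python
# For doubles in [0.25, 0.5) the gap between neighbours is 2**-54
# (52 fraction bits, binary exponent -2).
one_lsb = 2.0 ** -54

# This addition is exact, because the result is itself a double.
next_after_0_3 = 0.3 + one_lsb

print(next_after_0_3 == 3 * 0.1)           # 3*0.1 is the neighbour just above 0.3
print((0.3).hex(), next_after_0_3.hex())
print(0.4 + one_lsb == 4 * 0.1)            # 4*0.1 is 0.4 itself, not its neighbour
```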
Mark Ransom
  • I wonder if it would be worth noting the precise decimal values of the nearest "doubles" to 0.1, 0.3, and 0.4, since a lot of people can't read floating-point hex. – supercat Sep 21 '16 at 15:04
  • @supercat you make a good point. Putting those super large doubles into the text would be distracting, but I thought of a way to add them. – Mark Ransom Sep 21 '16 at 15:36
26

Here's a simplified conclusion from other answers.

If you evaluate a float at Python's interactive prompt or print it, it goes through the repr function, which creates its string representation.

Starting with version 3.2, Python's str and repr use a complex rounding scheme, which prefers nice-looking decimals if possible, but uses more digits where necessary to guarantee bijective (one-to-one) mapping between floats and their string representations.

This scheme guarantees that the value of repr(float(s)) looks nice for simple decimals, even if they can't be represented precisely as floats (e.g. when s = "0.1").

At the same time it guarantees that float(repr(x)) == x holds for every float x.
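The two guarantees can be imitated with a brute-force search. The helper `shortest_round_trip` below is hypothetical and for illustration only; CPython's real algorithm is far more efficient and formats exponents differently. It relies on the fact that 17 significant digits are always enough to identify a double.

```python
def shortest_round_trip(x):
    """Return the shortest '%g'-style string that converts back to x.

    Brute-force stand-in for illustration; not how CPython does it.
    """
    for digits in range(1, 18):           # try 1 to 17 significant digits
        candidate = '%.*g' % (digits, x)
        if float(candidate) == x:         # does the string round-trip?
            return candidate
    raise ValueError('no round-tripping string found for %r' % (x,))


for value in (0.1, 4 * 0.1, 3 * 0.1, 0.3):
    print(shortest_round_trip(value), repr(value))
```

For these four values the helper and repr should agree: one digit is enough for 4*0.1, while 3*0.1 needs all 17.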

Aivar
  • 3
    Your answer is accurate for Python versions >= 3.2, where `str` and `repr` are identical for floats. For Python 2.7, `repr` has the properties you identify, but `str` is much simpler - it simply computes 12 significant digits and produces an output string based on those. For Python <= 2.6, both `repr` and `str` are based on a fixed number of significant digits (17 for `repr`, 12 for `str`). (And nobody cares about Python 3.0 or Python 3.1 :-) – Mark Dickinson Sep 21 '16 at 18:27
  • Thanks @MarkDickinson! I included your comment in the answer. – Aivar Sep 21 '16 at 18:37
  • 2
    Note that the rounding from shell comes from `repr` thus the Python 2.7 behaviour would be identical... – Antti Haapala -- Слава Україні Sep 23 '16 at 21:09
5

This is not really specific to Python's implementation; it should apply to any function that converts a float to a decimal string.

A floating point number is essentially a binary number, but in scientific notation with a fixed limit of significant figures.

The reciprocal of any number that has a prime factor not shared with the base always has a recurring representation after the radix point. For example, 1/7 has a prime factor, 7, that is not shared with 10, and therefore has a recurring decimal representation. The same is true in binary for 1/10, whose denominator has prime factors 2 and 5, the latter not being shared with 2; this means that 0.1 cannot be exactly represented by a finite number of bits after the radix point.
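The prime-factor rule can be turned into a short check. The function `terminates` is a hypothetical helper of my own, written as a sketch; it expects an exact rational such as a `Fraction`, not a float.

```python
from fractions import Fraction
from math import gcd


def terminates(value, base):
    """True if the rational `value` has a finite expansion in `base`."""
    denominator = Fraction(value).denominator   # Fraction keeps lowest terms
    # Strip every prime factor the denominator shares with the base.
    shared = gcd(denominator, base)
    while shared > 1:
        denominator //= shared
        shared = gcd(denominator, base)
    # Anything left over is a prime factor the base does not have.
    return denominator == 1


print(terminates(Fraction(1, 10), 10), terminates(Fraction(1, 10), 2))
print(terminates(Fraction(1, 7), 10), terminates(Fraction(1, 8), 2))
```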

Since 0.1 has no exact representation, a function that converts the approximation to a decimal point string will usually try to approximate certain values so that they don't get unintuitive results like 0.1000000000004121.

Since the floating point is in scientific notation, any multiplication by a power of the base only affects the exponent part of the number. For example 1.231e+2 * 100 = 1.231e+4 for decimal notation, and likewise, 1.00101010e11 * 100 = 1.00101010e101 in binary notation. If I multiply by a non-power of the base, the significant digits will also be affected. For example 1.2e1 * 3 = 3.6e1
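The same effect is visible on real doubles with `math.frexp`, which splits a float into a significand and a power-of-two exponent. A minimal sketch, with variable names of my choosing:

```python
import math

# math.frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1.
m1, e1 = math.frexp(0.1)
m4, e4 = math.frexp(0.1 * 4)
m3, e3 = math.frexp(0.1 * 3)
m03, e03 = math.frexp(0.3)

print(m1 == m4, e4 - e1)      # same significand, exponent moved by 2
print(m3 == m03, e3 == e03)   # same exponent, but the significands differ
```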

Depending on the algorithm used, it may try to guess common decimals based on the significant figures only. Both 0.1 and 0.4 have the same significant figures in binary, because their floats are essentially rounded versions of (8/5)(2^-4) and (8/5)(2^-2) respectively. If the algorithm identifies the 8/5 sigfig pattern as the decimal 1.6, then it will work on 0.1, 0.2, 0.4, 0.8, etc. It may also have magic sigfig patterns for other combinations, such as the float 3 divided by float 10 and other magic patterns statistically likely to be formed by division by 10.

In the case of 3*0.1, the last few significant figures will likely be different from dividing a float 3 by float 10, causing the algorithm to fail to recognize the magic number for the 0.3 constant depending on its tolerance for precision loss.

Edit: https://docs.python.org/3.1/tutorial/floatingpoint.html

Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction. For example, the numbers 0.1 and 0.10000000000000001 and 0.1000000000000000055511151231257827021181583404541015625 are all approximated by 3602879701896397 / 2 ** 55. Since all of these decimal values share the same approximation, any one of them could be displayed while still preserving the invariant eval(repr(x)) == x.
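The quoted figures can be reproduced with `float.as_integer_ratio()`. A quick check, assuming IEEE 754 doubles:

```python
# Exact numerator and denominator of the double stored for 0.1.
numerator, denominator = (0.1).as_integer_ratio()
print(numerator, denominator == 2 ** 55)

# The three decimal spellings quoted above all parse to the same double.
spellings = [
    '0.1',
    '0.10000000000000001',
    '0.1000000000000000055511151231257827021181583404541015625',
]
values = {float(s) for s in spellings}
print(len(values), [repr(float(s)) for s in spellings])
```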

There is no tolerance for precision loss: if float x (0.3) is not exactly equal to float y (0.1*3), then repr(x) is not equal to repr(y).

AkariAkaori