13

"Behavior of “round” function in Python" observes that Python rounds floats like this:

>>> round(0.45, 1)
0.5
>>> round(1.45, 1)
1.4
>>> round(2.45, 1)
2.5
>>> round(3.45, 1)
3.5
>>> round(4.45, 1)
4.5
>>> round(5.45, 1)
5.5
>>> round(6.45, 1)
6.5
>>> round(7.45, 1)
7.5
>>> round(8.45, 1)
8.4
>>> round(9.45, 1)
9.4

The accepted answer confirms that this is caused by the inexact binary representation of floats, which is logical.

Assuming that Ruby floats are just as inaccurate as Python's, how come Ruby floats round like a human would? Does Ruby cheat?

1.9.3p194 :009 > 0.upto(9) do |n|
1.9.3p194 :010 >     puts (n+0.45).round(1)
1.9.3p194 :011?>   end
0.5
1.5
2.5
3.5
4.5
5.5
6.5
7.5
8.5
9.5
Community
steenslag

3 Answers

10

Summary

Both implementations confront the same issues surrounding binary floating point numbers.

Ruby operates directly on the floating point number with simple operations (multiply by a power of ten, adjust, and truncate).

Python converts the binary floating point number to a string using David Gay's sophisticated algorithm that yields the shortest decimal representation that is exactly equal to the binary floating point number. This does not do any additional rounding; it is an exact conversion to a string.

With the shortest string representation in hand, Python rounds to the appropriate number of decimal places using exact string operations. The goal of the float-to-string conversion is to attempt to "undo" some of the binary floating point representation error (i.e. if you enter 6.6, Python rounds on the 6.6 rather than 6.5999999999999996).

In addition, Ruby differs from some versions of Python in rounding modes: round-away-from-zero versus round-half-even.

Detail

Ruby doesn't cheat. It starts with plain old binary floating point numbers, the same as Python does. Accordingly, it is subject to some of the same challenges (such as 3.35 being represented as slightly more than 3.35 and 4.35 being represented as slightly less than 4.35):

>>> Decimal.from_float(3.35)
Decimal('3.350000000000000088817841970012523233890533447265625')
>>> Decimal.from_float(4.35)
Decimal('4.3499999999999996447286321199499070644378662109375')

The best way to see the implementation differences is to look at the underlying source code:

Here's a link to the Ruby source code: https://github.com/ruby/ruby/blob/trunk/numeric.c#L1587

The Python source starts here: http://hg.python.org/cpython/file/37352a3ccd54/Python/bltinmodule.c and finishes here: http://hg.python.org/cpython/file/37352a3ccd54/Objects/floatobject.c#l1080

The latter has an extensive comment that reveals the differences between the two implementations:

The basic idea is very simple: convert and round the double to a decimal string using _Py_dg_dtoa, then convert that decimal string back to a double with _Py_dg_strtod. There's one minor difficulty: Python 2.x expects round to do round-half-away-from-zero, while _Py_dg_dtoa does round-half-to-even. So we need some way to detect and correct the halfway cases.

Detection: a halfway value has the form k * 0.5 * 10**-ndigits for some odd integer k. Or in other words, a rational number x is exactly halfway between two multiples of 10**-ndigits if its 2-valuation is exactly -ndigits-1 and its 5-valuation is at least -ndigits. For ndigits >= 0 the latter condition is automatically satisfied for a binary float x, since any such float has nonnegative 5-valuation. For 0 > ndigits >= -22, x needs to be an integral multiple of 5**-ndigits; we can check this using fmod. For -22 > ndigits, there are no halfway cases: 5**23 takes 54 bits to represent exactly, so any odd multiple of 0.5 * 10**n for n >= 23 takes at least 54 bits of precision to represent exactly.

Correction: a simple strategy for dealing with halfway cases is to (for the halfway cases only) call _Py_dg_dtoa with an argument of ndigits+1 instead of ndigits (thus doing an exact conversion to decimal), round the resulting string manually, and then convert back using _Py_dg_strtod.
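The detection rule in the quoted comment can be checked with exact rational arithmetic. What follows is only a minimal sketch, not CPython's code (which works on doubles with fmod); the helper name `is_halfway` is made up for illustration.

```python
from fractions import Fraction

def is_halfway(x, ndigits):
    """Return True if the float x lies exactly halfway between two
    multiples of 10**-ndigits, i.e. x == k * 0.5 * 10**-ndigits for odd k.

    Hypothetical helper for illustration; not part of CPython.
    """
    # Fraction(x) is the exact value stored in the float, with no rounding.
    k = Fraction(x) * 2 * Fraction(10) ** ndigits
    return k.denominator == 1 and k.numerator % 2 == 1

print(is_halfway(2.5, 0))     # 2.5 is stored exactly, so it is a true tie
print(is_halfway(1.45, 1))    # 1.45 is stored slightly below 1.45, so no tie
print(is_halfway(0.125, 2))   # 0.125 is exact, halfway between 0.12 and 0.13
print(is_halfway(150.0, -2))  # halfway between 100 and 200
```

Most decimal literals that look like ties (0.45, 1.45, ...) are not ties once stored as binary floats, which is why the halfway branch is rarely taken.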

In short, Python 2.7 goes to great lengths to accurately follow a round-away-from-zero rule.

In Python 3.3, it goes to equally great lengths to accurately follow a round-to-even rule.

Here's a little additional detail on the _Py_dg_dtoa function. Python calls the float to string function because it implements an algorithm that gives the shortest possible string representation among equal alternatives. In Python 2.6, for example, the number 1.1 shows up as 1.1000000000000001, but in Python 2.7 and later, it is simply 1.1. David Gay's sophisticated dtoa.c algorithm gives "the-result-that-people-expect" without forgoing accuracy.
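To make the 1.1 example concrete, here is a small sketch comparing three views of the same float in a modern Python (3.x assumed): the shortest round-tripping string from `repr`, a 17-significant-digit rendering like the one Python 2.6 displayed, and the exact stored value via `Decimal`.

```python
from decimal import Decimal

x = 1.1

shortest = repr(x)             # shortest string that converts back to the same float
seventeen = format(x, ".17g")  # 17 significant digits, as older reprs showed
exact = str(Decimal(x))        # exact decimal expansion of the stored binary value

print(shortest)
print(seventeen)
print(exact)

# The short form loses nothing: it still identifies the same float.
print(float(shortest) == x)
```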

That string conversion algorithm tends to make up for some of the issues that plague any implementation of round() on binary floating point numbers (i.e. it lets rounding of 4.35 start with 4.35 instead of 4.3499999999999996447286321199499070644378662109375).

That and the rounding mode (round-half-even vs round-away-from-zero) are the essential differences between the Python and Ruby round() functions.
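The two rounding modes can be compared in isolation, without any binary representation error, by rounding exact decimal strings with the `decimal` module. This is just an illustrative sketch; `round_decimal` is a made-up helper, and note that `decimal` calls ties-away-from-zero `ROUND_HALF_UP`.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_decimal(text, mode):
    """Round an exact decimal string to one decimal place using `mode`.

    Hypothetical helper for illustration only.
    """
    return Decimal(text).quantize(Decimal("0.1"), rounding=mode)

for text in ("0.35", "0.45", "-0.45"):
    print(text,
          round_decimal(text, ROUND_HALF_EVEN),  # ties go to the even digit
          round_decimal(text, ROUND_HALF_UP))    # ties go away from zero
```

The two modes only disagree on exact ties, such as the decimal 0.45; for floats the stored value is usually slightly off the tie, so the representation error decides the result first.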

Peter O.
Raymond Hettinger
  • 2
    beats me how this answers the question – Karoly Horvath Mar 31 '13 at 22:58
  • -1. While you have cited relevant source material, I have to say you should have extracted the conceptual difference and actually explained it. I doubt if this helps the OP at all and it's not even clear whether you do or do not understand it yourself. I believe you *do* understand, but then, why not just explain it? – DigitalRoss Apr 01 '13 at 00:02
  • 5
    @DigitalRoss -1's are for downright wrong answers. Downvotes should not be used for the answers that you just don't like for some reason. Just don't upvote such answers. – ovgolovin Apr 01 '13 at 00:37
  • 3
    The tooltip for the downvote arrow says "this answer is not useful". It doesn't say anything about right or wrong. If you ask me "Can you tell me what time it is" and I say "Yes", that answer is completely useless but still 100% correct. – Jörg W Mittag Apr 01 '13 at 02:24
  • +1. I've reversed the dv because Raymond added a nice explanation. It does appear, however, that my temporary dv was precisely congruent with the site design. – DigitalRoss Apr 01 '13 at 02:29
  • 1
    I'm afraid the description of Python's `round` in this answer isn't accurate. Python's `round` *doesn't* use Gay's "shortest string" code, does not do an exact conversion to string at any point, and doesn't make any attempt to undo floating-point representation error. In the `round` source, `_Py_dg_dtoa` is called with `mode=3`, which simply computes `ndigits` correctly rounded digits after the point (or before the point if `ndigits` is negative). In contrast, the shortest string algorithm used (for example) by `float.__repr__` is invoked with `mode=0`. – Mark Dickinson Nov 07 '16 at 18:39
8

The fundamental difference is:

Python: Convert to decimal and then round

Ruby:    Round and then convert to decimal

Ruby is rounding it from the original floating point bit string, but after operating on it with 10**n. You can't see the original binary value without looking very closely. The values are inexact because they are binary, and we are used to writing in decimal, and as it happens almost all of the decimal fraction strings we are likely to write do not have an exact equivalent as a base 2 fraction string.
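The two orders of operation can be imitated in Python. This is a rough sketch under stated assumptions, not either interpreter's real code: `decimal_then_round` approximates the effect of Python 3's `round` by rounding the exact decimal value of the float, and `scale_then_round` assumes Ruby 1.9 behaves like "multiply by 10**n, round ties away from zero, divide". Both function names are made up.

```python
import math
from decimal import Decimal, ROUND_HALF_EVEN

def decimal_then_round(x, ndigits):
    """Convert the float to its exact decimal value, then round (half-even)."""
    exact = Decimal(x)                    # exact value of the binary float
    step = Decimal(1).scaleb(-ndigits)    # e.g. Decimal('0.1') for ndigits=1
    return float(exact.quantize(step, rounding=ROUND_HALF_EVEN))

def scale_then_round(x, ndigits):
    """Scale in binary first, round to an integer (ties away), then unscale."""
    scale = 10.0 ** ndigits
    scaled = x * scale                    # this multiplication itself rounds
    whole = math.floor(abs(scaled))
    if abs(scaled) - whole >= 0.5:        # ties go away from zero
        whole += 1
    return math.copysign(whole, scaled) / scale

for x in (0.45, 1.45, 9.45):
    print(x, decimal_then_round(x, 1), scale_then_round(x, 1))
```

In the second function the product `x * scale` is rounded to the nearest double, which can land exactly on a tie such as 14.5 even though the stored 1.45 was slightly below it.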

In particular, 0.45 looks like this:

01111111101 1100110011001100110011001100110011001100110011001101 

In hex, that is 3fdccccccccccccd.
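The bit pattern above can be reproduced with the standard `struct` module. A short sketch (Python 3 assumed) that splits the 64 bits of 0.45 into sign, exponent and fraction fields:

```python
import struct

raw = struct.pack(">d", 0.45)          # big-endian IEEE-754 double, 8 bytes
hex_form = raw.hex()
bits = format(int.from_bytes(raw, "big"), "064b")

sign, exponent, fraction = bits[0], bits[1:12], bits[12:]

print(hex_form)
print(sign, exponent, fraction)
print((0.45).hex())                    # the same value in C99 hex-float notation
```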

It repeats in binary, the first unrepresented digit is 0xc, and the clever decimal input conversion has accurately rounded this very last fractional digit to 0xd.

This means that inside the machine, the value is greater than 0.45 by roughly 1.1e-17 (less than 1/2**50). This is obviously a very, very small number but it's enough to cause the default round-nearest algorithm to round up instead of applying the round-half-even tie-breaker.
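The size of that excess can be computed exactly rather than estimated. A small sketch using `fractions.Fraction`, which converts a float to its exact rational value:

```python
from fractions import Fraction
from decimal import Decimal

stored = Fraction(0.45)         # exact value of the double nearest to 0.45
intended = Fraction(45, 100)    # the decimal number 0.45
excess = stored - intended

print(excess)                   # an exact rational number
print(float(excess))            # its approximate magnitude
print(Decimal(0.45))            # the stored value written out in decimal
```

Because the excess is positive, the stored number is not a tie at one decimal place, so any correctly rounding algorithm sends it up to 0.5.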

Both Python and Ruby are potentially rounding more than once as every operation effectively rounds into the least significant bit.

I'm not sure I agree that Ruby does what a human would do. I think Python is approximating what decimal arithmetic would do. Python (depending on version) is applying round-nearest to the decimal string and Ruby is applying the round nearest algorithm to a computed binary value.

Note that we can see here quite clearly the reason people say that FP is inexact. It's a reasonably true statement, but it's more true to say that we simply can't convert accurately between binary and most decimal fractions. (Some do: 0.25, 0.5, 0.75, ...) Most simple decimal numbers are repeating numbers in binary, so we can never store the exact equivalent value. But, every value we can store is known exactly and all arithmetic performed on it is performed exactly. If we wrote our fractions in binary in the first place our FP arithmetic would be considered exact.
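Which decimal fractions convert exactly can be tested directly. In this sketch the helper `stored_exactly` (a made-up name) compares the exact rational value of a decimal string with the exact value of the float it converts to:

```python
from fractions import Fraction

def stored_exactly(text):
    """True if the decimal literal in `text` converts to a float without error.

    Hypothetical helper for illustration only.
    """
    return Fraction(text) == Fraction(float(text))

for text in ("0.25", "0.5", "0.75", "0.45", "0.1"):
    print(text, stored_exactly(text))
```

Only fractions whose denominator in lowest terms is a power of two pass the test; 0.45 is 9/20, and the factor of 5 in the denominator makes it repeat in binary.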

DigitalRoss
  • In other words, Python implements `round(f, n)` pretty much as `s = '%.*f' % (n, f); return float(s[:s.index('.') + n + 1])`, with special handling of halfway cases. *That* ought to be described as "what a human would do". Fascinating. – user4815162342 Apr 01 '13 at 00:50
  • But if that is the implementation, why doesn't `round(1.45, 1)` come out as `1.5`? Looking at the code, it should convert 1.45 to `"1.45"` (two decimals: one for rounding and one more to handle the halfway case), manually handle the halfway case by changing `"1.45"` to `"1.5"`, and convert `"1.5"` to - `1.5`. But string input as `"1.5"` and converted to float prints as `1.5`, not `1.4`! – user4815162342 Apr 01 '13 at 00:55
  • 1
    There is more complexity. Early versions of Python use a round-away-from-zero mode that is not even one of the *five* IEEE-754 modes. Later versions use a variant of *round-nearest*, which in IEEE-754 breaks ties to even numbers. This would round 1.45 to 1.4 *if* there were no low-order residual from the decimal conversion. – DigitalRoss Apr 01 '13 at 02:35
  • I can repeat `round(1.45) -> 1.4` with Python 2.7 which [implements round-away-from-zero](http://hg.python.org/cpython/file/240c83902fca/Objects/floatobject.c#l1099). Looking at the code, I suspect that the `halfway_case` detection evaluates to false for this number, so halfway detection which would have worked is never triggered. I.e. everything is handled correctly by `_Py_dg_dtoa` except for halfway detection. – user4815162342 Apr 01 '13 at 08:31
3

Ruby doesn't cheat. It just chose another way to implement round.

In Ruby, 9.45.round(1) is almost equivalent to (9.45*10.0).round / 10.0.

irb(main):001:0> printf "%.20f", 9.45
9.44999999999999928946=> nil
irb(main):002:0> printf "%.20f", 9.45*10.0
94.50000000000000000000=> nil

So

irb(main):003:0> puts 9.45.round(1)
9.5

If we do it the same way in Python, we get 9.5 as well.

>>> round(9.45, 1)
9.4
>>> round(9.45*10)/10
9.5
nymk