30

I don't know if this is an obvious bug, but while running a Python script for varying the parameters of a simulation, I realized the results with delta = 0.29 and delta = 0.58 were missing. On investigation, I noticed that the following Python code:

for i_delta in range(0, 101, 1):
    delta = float(i_delta) / 100

    # (...)

    filename = 'foo' + str(int(delta * 100)) + '.dat'

generated identical files for delta = 0.28 and 0.29, and likewise for 0.57 and 0.58. The reason is that Python evaluates float(29)/100 as 0.28999999999999998. But the error isn't systematic, in the sense that it doesn't happen for every integer. So I wrote the following Python script:

import sys

n = int(sys.argv[1])

for i in range(0, n + 1):
    a = int(100 * (float(i) / 100))
    if i != a:
        print(i, a)

And I can't see any pattern in the numbers for which this rounding error happens. Why does this happen with those particular numbers?

jpjandrade
  • 4
    It's just how IEEE 754 floating-point numbers work. I suggest you round to turn the float back into an integer, rather than simply truncating. – Steve Howard May 13 '11 at 19:50
  • 1
    It is not an error - it is common in many different languages. There are some workarounds, but in this case the simplest solution may be just using i_delta in the filename. Just keep in mind i_delta is not passed to outside the loop by default. – Tadeck May 13 '11 at 19:56
  • 2
    #StdSOAnswer_1. That's how floating point works. – S.Lott May 13 '11 at 21:27
  • @Tadeck I would say it's still an error, it's just endemic to most of modern day computer science. – Boris Verkhovskiy Oct 30 '19 at 11:48

2 Answers

36

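A number can be represented exactly as a binary floating-point value only if it is a finite sum of powers of two that fits in the available precision; anything else, such as 0.29, has to be approximated. Sometimes the closest approximation is slightly less than the actual number, and truncating with int() then drops it to the next lower integer.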

Read What Every Computer Scientist Should Know About Floating-Point Arithmetic.
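
As a quick check of the point above, Python can show the exact value a float stores, because converting a float to Decimal does not round. This is a minimal sketch, assuming Python 3 (where / is true division); the variable names are illustrative only.

```python
from decimal import Decimal

x = 29 / 100                 # nearest binary double to 0.29
stored = Decimal(x)          # exact value of that double, no rounding
is_below = stored < Decimal("0.29")
truncated = int(x * 100)     # int() discards the fractional part

print(stored)
print(is_below)    # True: the approximation lies just under 0.29
print(truncated)   # 28, not 29
```

Whether the stored value falls just above or just below the decimal you typed depends on the binary expansion of that particular number, which is why the affected integers look patternless.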

Mark Ransom
  • I honestly didn't see your link to the same document before posting the same link. Just shows it's such a good reference. – dr jimbob May 13 '11 at 20:00
  • @jimbob, I added the link a minute after the original post. It's a classic, but I didn't have it immediately handy. – Mark Ransom May 13 '11 at 20:04
  • 8
    For Pythonistas, there's also a shorter (and easier to read) chapter in the [Python Tutorial](http://docs.python.org/tutorial/floatingpoint.html) that deals with this issue. – Tim Pietzcker May 13 '11 at 20:13
22

It's a well-known consequence of the nature of floating-point numbers.

If you want to do decimal arithmetic rather than binary floating-point arithmetic, there are libraries for that.

E.g.,

>>> from decimal import Decimal
>>> Decimal(29)/Decimal(100)
Decimal('0.29')
>>> Decimal('0.29')*100
Decimal('29')
>>> int(Decimal('29'))
29

In general, decimal is probably overkill here, and it still has rounding errors whenever a number has no finite decimal representation (for example, any fraction whose denominator, in lowest terms, has a prime factor other than 2 or 5, the prime factors of the decimal base 10). For example:

>>> s = Decimal(7)
>>> Decimal(1)/s/s/s/s/s/s/s*s*s*s*s*s*s*s
Decimal('0.9999999999999999999999999996')
>>> int(Decimal('0.9999999999999999999999999996'))
0

So it's best to always round before casting floating-point numbers to ints, unless you actually want truncation.

>>> int(1.9999)
1
>>> int(round(1.999))
2
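
Applied to the loop from the question, rounding before the conversion might look like the sketch below. It assumes Python 3, where round() with a single argument returns an int; filename_for is a hypothetical helper, not part of the original script.

```python
def filename_for(i_delta):
    """Build the file name for one step, rounding instead of truncating."""
    delta = i_delta / 100
    return 'foo' + str(round(delta * 100)) + '.dat'

# Steps where plain truncation lands on the wrong integer.
truncation_misses = [i for i in range(101) if int(i / 100 * 100) != i]

# With rounding, every step gets its own file name.
names = [filename_for(i) for i in range(101)]

print(truncation_misses)
print(len(set(names)))  # 101
```

Building the name from the integer i_delta directly, as suggested in the comments on the question, avoids the round trip through float altogether.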

Another alternative is to use the Fraction class from the fractions module, which doesn't approximate. (It just keeps adding, subtracting, and multiplying the integer numerators and denominators as necessary.)
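
A minimal sketch of the Fraction approach, assuming Python 3; the variable names are illustrative only. It covers both the 29/100 case from the question and a shortened version of the divide-by-seven round trip that loses precision with Decimal.

```python
from fractions import Fraction

delta = Fraction(29, 100)
index = int(delta * 100)                          # exactly 29

s = Fraction(7)
round_trip = Fraction(1) / s / s / s * s * s * s  # exactly 1

print(index)
print(round_trip == 1)  # True
```

The trade-off is that numerators and denominators can grow without bound over a long computation, so this fits bookkeeping such as parameter sweeps better than heavy numerical work.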

dr jimbob
  • hmm, actually a better example is Decimal(1)/Decimal(3) * Decimal(3), which does not produce 1.0 with more precision. "when the base is not 10" should be when the fraction cannot be represented in base 10 exactly. The number is, of course, base 10. – Derek Litz Apr 20 '13 at 14:38
  • @DerekLitz - Agreed, my answer was sloppy. Your example is more concise (though both are equally valid). Should have written when the number does not have a finite decimal representation in base-10, which will happen with any fraction when the denominator is not divisible by 2 or 5. (Granted "the fraction cannot be represented in base 10 exactly. The number is, of course, in base 10." isn't exactly correct either. Numbers don't have bases. One-third = 1 /(1+1+1) exactly irrespective of base. Written as a fraction it can be represented in base 10--1/3.) – dr jimbob Apr 20 '13 at 16:14
  • @dr_jimbob I like the improvement above, however, I don't like the statement 'numbers don't have bases'. Perhaps the difference is semantic, but the meaning of words is important. A number is supposed to represent a value (or quantity if you prefer). In order to create a numbering system a base needs to be chosen, symbols need to be chosen, and voilà, we can communicate more effectively than simple tallying, but I'm sure that's what you meant :) – Derek Litz Apr 20 '13 at 17:35
  • @DerekLitz - Yes numbers represent values, but only the representation of a number has a base. One, two, twenty-eight, three-halves, π are numbers. The decimal representations are respectively: 1, 2, 28, 1.5, 3.14159... (decimal meaning base-10) and yes the numbers name often relates to base 10. In binary (base 2), they'd be as 1, 10, 11100, 1.1, 11.0010 0100 0011 1111..., and in hex: 1, 2, 1c, 1.8, 3.243f... Number has a specific mathematical meaning referring to the abstract object (e.g., number two is the second successor to zero: two = succ (succ zero)), with no regard to base. – dr jimbob Apr 20 '13 at 19:58
  • @DerekLitz - And I totally admit, this is completely nit-picking. The intended meaning of what we are saying is quite obvious. – dr jimbob Apr 20 '13 at 19:59
  • 1
    @dr_jimbob I like these conversations :). It's more of an ambiguity in the definition of 'number', which can mean either the abstraction representing the mathematical value or the mathematical value. It's certainly good to know that when I'm around the Math types I should assume the latter :) – Derek Litz Apr 22 '13 at 15:05