16

I work daily with Python 2.4 at my company. I was using the logarithm function log from the standard math library, and when I entered log(2**31, 2) it returned 31.000000000000004, which struck me as a bit odd.

I did the same thing with other powers of 2 and it worked perfectly. Then I ran log10(2**31) / log10(2) and got a round 31.0.

I tried running the same expression in Python 3.0.1, assuming it had been fixed in a more recent version, and got the same result.

Why does this happen? Is it possible that there are some inaccuracies in mathematical functions in Python?

Emil H
Avihu Turzion
  • duplicate of the perennial floating point question (why am I getting floating point errors?), can't find the best duplicate Q to post, maybe someone else can. – Jason S Jun 01 '09 at 13:55
  • I should point out that Python 3 did *not* fix the floating point error. Instead, the printing output uses a smart algorithm to display the intended floating point value, rather than the raw stored approximation. – Xavier Ho Jun 15 '10 at 15:15

9 Answers

53

This is to be expected with computer arithmetic. It follows particular rules, such as IEEE 754, that probably don't match the math you learned in school.

If this *actually matters*, use Python's decimal type.

Example:

from decimal import Decimal, Context

# a context carrying 20 significant digits, more than a binary double holds
ctx = Context(prec=20)
two = Decimal(2)
# ln() and divide() round to the context's precision at each step
ctx.divide(ctx.power(two, Decimal(31)).ln(ctx), two.ln(ctx))
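
An equivalent sketch using the thread-local context, if that reads more naturally (same idea; getcontext and Decimal.ln are standard decimal-module calls):

from decimal import Decimal, getcontext

getcontext().prec = 20  # 20 significant digits for the operations below
Decimal(2**31).ln() / Decimal(2).ln()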
Matthew Flaschen
  • +1 for a nice answer, but mostly for "If this actually matters." Probes were flown to Saturn with less precision. – dwc May 31 '09 at 14:15
  • Indeed. The italics are the most important part of the answer. – Matthew Flaschen May 31 '09 at 14:22
  • @dwc It would have mattered if the OP had been taking the ceiling of the result of the log function; then the error would have become very large. In my case, in one of my programs, I was doing this: `a = floor(log(x,b))`, and the program was crashing a few lines later because `floor(log(243,3))` was coming out as 4 – Rushil Paul Dec 11 '16 at 13:06
21

You should read "What Every Computer Scientist Should Know About Floating-Point Arithmetic".

http://docs.sun.com/source/806-3568/ncg_goldberg.html

bayer
18

Always assume that floating point operations will have some error in them, and check for equality taking that error into account (using either a relative tolerance like 0.00001% or an absolute one like 0.00000000001). This inaccuracy is a given, since not all decimal numbers can be represented in binary with a fixed number of bits of precision.
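
A minimal sketch of such a check (the name and tolerance values are arbitrary illustrations; Python 3.5's math.isclose implements the same idea):

def nearly_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    # equal if a and b differ by less than the absolute tolerance, or
    # by less than rel_tol relative to the larger of the two magnitudes
    return abs(a - b) <= max(abs_tol, rel_tol * max(abs(a), abs(b)))

nearly_equal(31.000000000000004, 31.0)  # True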

Your particular case is not one of them: if Python uses IEEE 754, 31 should be easily representable even in single precision. It's possible, however, that precision is lost in one of the many steps taken to calculate log(2**31, 2), simply because there is no code to detect special cases like a direct power of two.

paxdiablo
  • Very interesting. I have been writing code for a very long time, and this is the first time I've come across this phenomenon. But after the responses here, I'm beginning to see the bigger picture of why it happens. – Avihu Turzion Jun 02 '09 at 06:13
6

Floating-point operations are not exact in general. They return a result that has an acceptable relative error for the language/hardware infrastructure.

In general, it's quite wrong to assume that floating-point operations are precise, especially with single precision. See the "Accuracy problems" section of the Wikipedia article on floating point. :)
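
The classic one-line illustration (my own addition, not from the answer):

# 0.1, 0.2 and 0.3 each round to nearby binary fractions, and the
# rounded sum of the first two is not the same double as the third
0.1 + 0.2 == 0.3  # evaluates to False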

Nicolas Dumazet
3

IEEE double floating point numbers carry 52 explicit fraction bits, for 53 bits of precision. Since 10^15 < 2^53 < 10^16, a double has between 15 and 16 significant decimal figures. The result 31.000000000000004 is correct to 16 figures, so it is as good as you can expect.
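
A quick check (my own illustration) that the observed error is as small as the format allows:

err = 31.000000000000004 - 31  # exactly 2**-48, one ulp at this magnitude
rel = err / 31                 # about 1.1e-16, i.e. roughly 2**-53
assert rel < 2**-52            # within a couple of ulps of the true value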

John D. Cook
2

"floats are imprecise"

I don't buy that argument, because exact powers of two are represented exactly on most platforms (with underlying IEEE 754 floating point).

So if we really want the log2 of an exact power of 2 to be exact, we can have it. I'll demonstrate in Squeak Smalltalk, because it is easy to change the base system in that language; but the language does not really matter: floating-point computation is universal, and Python's object model is not that far from Smalltalk's.

For taking the log in base n, there is the log: function defined in Number, which naively uses the natural logarithm ln:

log: aNumber 
    "Answer the log base aNumber of the receiver."
    ^self ln / aNumber ln

self ln (take the natural logarithm of the receiver), aNumber ln, and / are three operations that each round their result to the nearest Float, and these rounding errors can accumulate... So the naive implementation is subject to the rounding error you observe, and I guess that Python's implementation of the log function is not much different.

((2 raisedTo: 31) log: 2) = 31.000000000000004

But if I change the definition like this:

log: aNumber 
    "Answer the log base aNumber of the receiver."
    aNumber = 2 ifTrue: [^self log2].
    ^self ln / aNumber ln

and provide a generic log2 in the Number class:

log2
    "Answer the base-2 log of the receiver."
    ^self asFloat log2

and this refinement in the Float class:

log2
    "Answer the base 2 logarithm of the receiver.
    Care to answer exact result for exact power of two."
    ^self significand ln / Ln2 + self exponent asFloat

where Ln2 is a constant (2 ln), then I effectively get an exact log2 for exact powers of two, because the significand of such a number is 1.0 (including subnormals, given Squeak's exponent/significand definition), and 1.0 ln = 0.0.

The implementation is quite trivial and should translate without difficulty to Python (probably in the VM); the runtime cost is very cheap, so it's just a matter of how important we think this feature is, or is not.
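
A rough Python sketch of the same idea (my own translation; log2_exact is a made-up name, and math.frexp does the significand/exponent split):

import math

LN2 = math.log(2.0)

def log2_exact(x):
    # frexp returns (m, e) with x == m * 2**e and 0.5 <= m < 1.0;
    # renormalize so that 1.0 <= m < 2.0, matching Squeak's convention
    m, e = math.frexp(x)
    m, e = m * 2.0, e - 1
    # for an exact power of two, m == 1.0 and log(1.0) == 0.0 exactly,
    # so the result is the exact integer exponent e
    return math.log(m) / LN2 + e

log2_exact(2.0 ** 31)  # 31.0, exactly

For what it's worth, math.log2 (added in Python 3.3) takes a similar special-case approach and returns exact results for exact powers of two.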

As I always say, the fact that floating point operations results are rounded to nearest (or whatever rounding direction) representable value is not a license to waste ulp. Exactness has a cost, both in term of runtime penalty and implementation complexity, so it's trade-offs driven.

aka.nice
2

This is normal. I would expect log10 to be more accurate than log(x, y), since it knows exactly what the base of the logarithm is; there may also be some hardware support for calculating base-10 logarithms.
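
For comparison, the two routes from the question (outputs as reported there, on Python 2.4):

from math import log, log10

log(2**31, 2)            # 31.000000000000004
log10(2**31) / log10(2)  # 31.0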

quant_dev
1

The representation (float.__repr__) of a number in Python tries to return a string of digits that, when converted back, comes as close to the stored value as possible, given that IEEE 754 arithmetic is precise only up to a limit. In any case, if you printed the result, you wouldn't notice:

>>> from math import log
>>> log(2**31,2)
31.000000000000004
>>> print log(2**31,2)
31.0

print converts its arguments to strings (in this case, through the float.__str__ method), which caters for the inaccuracy by displaying fewer digits:

>>> log(1000000,2)
19.931568569324174
>>> print log(1000000,2)
19.9315685693
>>> 1.0/10
0.10000000000000001
>>> print 1.0/10
0.1

usuallyuseless' answer is very useful, actually :)

tzot
0

If you wish to calculate the highest power of k contained in a number n, the code below might be helpful:

import math

# n and k are assumed to already hold the inputs;
# start from an overestimate (ceil guards against log() rounding low)
# and walk back down until k**answer no longer exceeds n
answer = math.ceil(math.log(n, k))
while k**answer > n:
    answer -= 1

NOTE: You shouldn't use 'if' in place of the 'while' loop, because a single correction is not always enough; in some cases, such as n=2**51-1 and k=2, the 'if' version yields 51 whereas the 'while' version yields 50, which is correct.
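
Wrapped as a function for reuse (my own packaging of the snippet above; the name highest_power is made up):

import math

def highest_power(n, k):
    # largest integer e with k**e <= n, assuming n >= 1 and k >= 2
    e = math.ceil(math.log(n, k))  # may overshoot when log() rounds up
    while k ** e > n:
        e -= 1
    return e

highest_power(2 ** 51 - 1, 2)  # 50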

Developer