Notice in the code below that the output suddenly becomes 0.0 for 0.0005**99.

In [1]: 0.0005**97
Out[1]: 6.31e-321

In [2]: 0.0005**98
Out[2]: 5e-324

In [3]: 0.0005**99
Out[3]: 0.0

In [4]: 0.0005**100
Out[4]: 0.0

I was expecting an underflow error, or at least some kind of warning, when this happens.

I'm coding a spam filter using the Naive Bayes algorithm, and computations like the ones above are common for lengthy messages. Although I can add some mathematical workarounds, I still think it's problematic that this "conversion" to 0.0 happens silently.

I ran the code above in Python 3.7.3.

Ismael Padilla
Alex
  • IEEE-754 underflow just happens silently. There's technically a flag that gets set on underflow, and you can technically configure it to not be silent IIRC, but you pretty much have to write assembly to access any of that functionality. – user2357112 Oct 09 '19 at 10:05
  • Floating point math is imperfect, so at some point the errors pile up enough to reduce your result to 0. Do you expect floats to print a warning every time they produce an incorrect result (i.e. 99% of the time)? – Aran-Fey Oct 09 '19 at 10:07
  • The way you've phrased the question, it's not at all clear that you're asking specifically about the number 0. – Aran-Fey Oct 09 '19 at 10:13
  • (Incidentally, if you can access the IEEE-754 exception handling settings - something almost no high-level language provides - you actually *can* have it raise an error for any inexact result instead of rounding. This isn't very useful.) – user2357112 Oct 09 '19 at 10:14
  • Anyway, the fact that this is an issue for your Naive Bayes thing suggests that you haven't yet learned that you need to work with logarithms for that. Work with logarithms. – user2357112 Oct 09 '19 at 10:16
  • Sure, thanks for the tip, logarithms are the mathematical workarounds I was referring to in the question. :) – Alex Oct 09 '19 at 10:19
  • @Aran-Fey The question says they expected underflow, which is about zero, so it's clear they're asking specifically about that. – Kelly Bundy Dec 20 '21 at 16:03

1 Answer

In Python, all floating-point rounding is done silently. It just so happens that in this case the value you're trying to represent is closer to 0 than to the smallest floating-point number greater than 0, so it rounds to 0.0.
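
To make that concrete, here is a small sketch (the variable names are my own, not anything from the question) that compares the exact value of the power with the smallest positive double, using only the standard library:

```python
import sys
from fractions import Fraction

# Smallest positive (subnormal) double, 2**-1074, which prints as 5e-324.
smallest = sys.float_info.min * sys.float_info.epsilon

# Exact rational value of the float 0.0005 raised to the 99th power.
exact = Fraction(0.0005) ** 99

print(smallest)                          # 5e-324
# The exact result lies below half of the smallest positive double,
# so round-to-nearest picks 0.0 rather than 5e-324.
print(exact < Fraction(smallest) / 2)    # True
print(0.0005 ** 99 == 0.0)               # True
```

If you do want a loud failure, one option is to check for a zero result from nonzero inputs yourself, since Python does not do it for you here.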

As some of the comments have suggested, working in log space will help you handle these very small numbers: instead of multiplying probabilities, you add their logarithms.
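
A minimal sketch of the idea, assuming a made-up message of 100 words that each have probability 0.0005 (the numbers and names are purely illustrative, not taken from a real filter):

```python
import math

# Hypothetical per-word probabilities for one class.
probs = [0.0005] * 100

# Naive product: silently underflows to 0.0.
product = 1.0
for p in probs:
    product *= p

# Log space: sum the logarithms instead of multiplying.
log_sum = sum(math.log(p) for p in probs)

print(product)   # 0.0
print(log_sum)   # roughly -760.09, still perfectly representable
```

To classify, compute one such log score per class (adding the log of the class prior) and pick the class with the larger score. You never need to convert back to a raw probability.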

silleknarf
  • Note that underflow is a distinct concept from rounding, and it's a design decision for Python to treat underflow like rounding (i.e. not a problem) as opposed to how it treats overflow (i.e. it raises `OverflowError` instead of rounding to `float('inf')`). – kaya3 Dec 20 '21 at 16:19