
Python 3.1

I am doing some calculations on data that has missing values. Any calculation that involves a missing value should result in a missing value.

I am thinking of using float('nan') to represent missing values. Is that safe? At the end I'll just check for it with:

def is_missing(x):
  return x!=x # I hope it's safe to use to check for NaN

It seems perfect, but I couldn't find a clear confirmation in the documentation.

I could use None of course, but it would then require that I do every single calculation with try / except TypeError to detect it. I could also use Inf, but I am even less sure about how it works.
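For reference, the check and the propagation behaviour described above can be sketched as follows. This is a minimal illustration assuming the values are plain Python floats; `MISSING` is a hypothetical name, and `math.isnan` is the standard-library spelling of the same `x != x` test.

```python
import math

MISSING = float('nan')  # hypothetical name for the missing-value marker

def is_missing(x):
    # math.isnan is True only for NaN; for floats it agrees with x != x
    return math.isnan(x)

def f(a, b, c):
    return a + b * c

result = f(1.0, MISSING, 3.0)        # NaN propagates through + and *
print(is_missing(result))            # True
print(is_missing(f(1.0, 2.0, 3.0)))  # False
print(result != result)              # True
```

Note that `math.isnan` raises TypeError for non-numeric input such as None, whereas `x != x` simply returns False.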

EDIT:

@MSN I understand using NaN is slow. But if my choice is either:

# missing value represented by NaN
def f(a, b, c):
  return a + b * c

or

# missing value represented by None
def f(a, b, c):
  if a is None or b is None or c is None:
    return None
  else:
    return a + b * c

I would imagine the NaN option is still faster, isn't it?
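One way to find out on a given machine is to time both variants. The sketch below uses the standard `timeit` module; the names `f_nan` and `f_none` are made up for this comparison, and the numbers it prints will vary by platform and Python version, so no particular outcome is implied.

```python
import timeit

NAN = float('nan')

def f_nan(a, b, c):
    # missing value represented by NaN
    return a + b * c

def f_none(a, b, c):
    # missing value represented by None
    if a is None or b is None or c is None:
        return None
    return a + b * c

# Time each variant with one missing argument.
t_nan = timeit.timeit(lambda: f_nan(1.0, NAN, 3.0), number=100000)
t_none = timeit.timeit(lambda: f_none(1.0, None, 3.0), number=100000)

print("NaN version:  %.4f s" % t_nan)
print("None version: %.4f s" % t_none)
```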

mskfisher
max

1 Answer


It's safe-ish, but if the FPU ever has to touch x, it can be very slow, because some hardware treats NaN as a special case. See the related question "Is it a good idea to use IEEE754 floating point NaN for values which are not set?"
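Apart from speed, "safe-ish" also applies to correctness: a few operations do not pass a NaN through. The following sketch, assuming ordinary CPython floats and using `nan` as an illustrative variable name, shows some cases worth knowing about if every calculation involving a missing value must itself come out missing.

```python
import math

nan = float('nan')

# Ordinary arithmetic propagates NaN.
print(math.isnan(nan + 1.0))   # True
print(math.isnan(nan * 0.0))   # True

# Cases where the NaN does not survive:
print(nan ** 0)                # 1.0
print(max(1.0, nan))           # 1.0 (the result depends on argument order)
print(nan < 1.0)               # False, a plain bool rather than "missing"
```

Comparisons and `max`/`min` in particular may need an explicit missing-value check around them.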

Community
MSN
  • I guess it won't be any faster if I instead check every single expression for missing values using `if`? – max Oct 30 '10 at 08:53
  • @max, Can you give me a code example of what you mean? I don't quite understand the question. – MSN Oct 30 '10 at 20:51
  • @max, it's certainly more terse. As for the perf difference, that's very platform specific. – MSN Nov 01 '10 at 02:48