1580

float('nan') represents NaN (not a number). But how do I check for it?

cottontail
  • 10,268
  • 18
  • 50
  • 51
Jack Ha
  • 19,661
  • 11
  • 37
  • 41

19 Answers

2030

Use math.isnan:

>>> import math
>>> x = float('nan')
>>> math.isnan(x)
True
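As a small illustration of why a dedicated check is needed, the sketch below (variable names are only illustrative) shows that an equality comparison against a NaN never succeeds, while `math.isnan` reports it:

```python
import math

x = float('nan')

# Equality never matches a NaN, not even when compared with itself.
equal_to_nan = (x == float('nan'))
equal_to_self = (x == x)

# math.isnan is the explicit check; it also accepts ints and other real numbers.
nan_detected = math.isnan(x)
int_is_nan = math.isnan(3)

print(equal_to_nan, equal_to_self, nan_detected, int_is_nan)
```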
Boris Verkhovskiy
  • 14,854
  • 11
  • 100
  • 103
gimel
  • 83,368
  • 10
  • 76
  • 104
  • Note that this works equally well with `float("nan")` as it does with `numpy.core.numeric.NaN`, while comparing the two with `is` does not work. Hence this might be the preferable solution in (legacy?) code possibly containing both definitions, if I'm not mistaken? – Eike P. Jun 22 '15 at 14:24
  • There are multiple varieties of NaN out there so I don't know what the is operator does in that scenario? – meawoppl Jun 30 '15 at 22:54
  • 2
    I got an error with the above code. Is it because of python 3? However, numpy.isnan(float('nan')) did work. Why would I use math instead of numpy? – Charlie Parker Sep 07 '16 at 18:06
  • 11
    @charlie-parker : In Python3, math.isnan is still a part of the math module. https://docs.python.org/3/library/math.html#math.isnan . Use numpy.isnan if you wish, this answer is just a suggestion. – gimel Sep 08 '16 at 04:43
  • @gimel I had a question. Why does float("nan") work when float("string") does not? –  Nov 30 '16 at 05:55
  • 3
    @SittingBull See https://docs.python.org/3/library/functions.html#float "If the argument is a string, it should contain a decimal number", or "Infinity" "inf" "nan" – gimel Nov 30 '16 at 06:19
  • 3
    Note: Only works with float; throws a TypeError when x is a str. – ChaimG Apr 09 '17 at 07:31
  • 82
    is `math.isnan` preferred to `np.isnan()` ? – TMWP Aug 01 '17 at 02:25
  • 87
    @TMWP possibly... `import numpy` takes around 15 MB of RAM, whereas `import math` takes some 0,2 MB – petrpulc Sep 12 '17 at 12:09
  • What if it is a string? – WJA Jan 03 '19 at 13:17
  • @Joel: Strings aren't numbers. You shouldn't be passing a string to this check at all. This is a check for floating-point NaN values. Also, `isdigit` is not a check for numbers. For example, `'1.0'.isdigit()` produces `False`. – user2357112 Feb 21 '19 at 00:49
  • 39
    @TMWP: If you're using NumPy, `numpy.isnan` is a superior choice, as it handles NumPy arrays. If you're not using NumPy, there's no benefit to taking a NumPy dependency and spending the time to load NumPy just for a NaN check (but if you're writing the kind of code that does NaN checks, it's likely you *should* be using NumPy). – user2357112 Feb 21 '19 at 00:51
  • Simply, without import math: if x == float('nan') – jungwook Jun 10 '19 at 01:45
  • 11
    @jungwook That actually doesn't work. Your expression is *always* false. That is, `float('nan') == float('nan')` returns `False` — which is a strange convention, but basically part of the definition of a NaN. The approach you want is actually the one posted by Chris Jester-Young, below. – Mike Jul 11 '19 at 15:38
  • 1
    `math.isnan` seems to be faster than `np.isnan` (about 20 times on my machine) – kotrfa Nov 28 '19 at 18:59
  • 4
    @kotrfa: Only if you call it on an individual scalar, and if you're doing that a lot in NumPy, you're using NumPy wrong. Using NumPy effectively is all about doing whole-array operations. [Calling `numpy.isnan` on a large array is faster than using `math.isnan` by a factor of over 200.](https://ideone.com/wp2AGO) – user2357112 Apr 14 '20 at 07:07
  • 1
    @TMWP no need to import the whole module one can just do `from numpy import isnan` – Hackaholic Jul 15 '21 at 18:15
  • why `NaN == NaN` returns False? – ORHAN ERDAY Jun 16 '22 at 11:08
  • @ORHANERDAY Because by definition, `NaN != x for every x`. Which means you can do `x=float('nan'); if x != x: print("it's not a number")` – nitzel Jul 21 '22 at 07:17
586

The usual way to test for a NaN is to check whether it is unequal to itself:

def isNaN(num):
    return num != num
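The self-inequality test relies on NaN being the only value that is unequal to itself, which Python does not guarantee for arbitrary objects. A minimal sketch, assuming a hypothetical class `NeverEqual` with overridden comparisons and a hypothetical stricter helper `is_float_nan`:

```python
def is_nan_loose(num):
    # True for NaN, but also for any object that claims to differ from itself.
    return num != num


def is_float_nan(num):
    # Hypothetical stricter variant: only floats can be reported as NaN.
    return isinstance(num, float) and num != num


class NeverEqual:
    """Illustrative class whose instances never compare equal to anything."""

    def __eq__(self, other):
        return False

    def __ne__(self, other):
        return True


odd = NeverEqual()
loose_result = is_nan_loose(odd)    # fooled by the overridden comparison
strict_result = is_float_nan(odd)   # the type check rejects the object
nan_result = is_float_nan(float('nan'))
```

The type check trades generality for safety: `isinstance(num, float)` also rejects numeric types that do not subclass `float`, so it is only a sketch of one possible guard.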
C. K. Young
  • 219,335
  • 46
  • 382
  • 435
  • 10
    Word of warning: quoting Bear's comment below "For people stuck with python <= 2.5. Nan != Nan did not work reliably. Used numpy instead." Having said that, I've not actually ever seen it fail. – mavnn Jan 26 '10 at 13:18
  • 47
    I'm sure that, given operator overloading, there are lots of ways I could confuse this function. go with math.isnan() – djsadinoff Aug 11 '11 at 22:38
  • 9
    It says in the 754 spec mentioned above that NaN==NaN should always be false, although it is not always implemented as such. Isn't is possible this is how math and/or numpy check this under the hood anyway? – Hari Ganesan Apr 01 '14 at 16:16
  • 2
    Thanks . this is also 15-20x times faster than using np.isnan if doing operation on a scalar – thomas.mac Mar 13 '19 at 11:31
  • 60
    Even though this works and, to a degree makes sense, I'm a human with principles and I hereby declare this as prohibited witchcraft. Please use math.isnan instead. – Gonzalo Oct 16 '19 at 21:09
  • 6
    @djsadinoff Is there any other drawback to confusion? math.isnan() can't check string values, so this solution seems more robust. – William Torkington May 28 '20 at 10:11
  • 3
    `math.isnan(x)` requires x to be a real number, incurring the overhead of verifying the type of x (and possibly converting x to a real number) before you can even check for NaN. `x != x` is succinct and robust -- bravo! – 2Toad Jun 16 '20 at 16:34
  • 9
    If your **input includes strings** this is the correct answer. (@williamtorkington) `np.isnan` and `math.isnan` will both break in this case. – Tobias Geisler Jun 28 '20 at 15:07
  • 7
    This answer is awful; it relies on `nan` being the only thing in the universe not equal to itself. AT THE VERY LEAST it should be `return isinstance(num, float) and num != num`. The overhead of verifying the type is better than the possibility of actually being wrong, which this can be. – kevlarr Mar 02 '21 at 16:53
  • 5
    @2Toad This is more succinct but it's hardly robust. Type checking is necessary to be accurate, and if someone is so concerned about the minimal overhead of checking the type (no conversion is necessary) then they shouldn't be using Python. – kevlarr Mar 02 '21 at 17:00
  • 1
    This solution is the fastest. It beats numpy, pandas and math libraries. – cure Apr 17 '21 at 20:05
  • This solution violates #2 in PEP-20: *Explicit is better than implicit.* It will fail if `__eq__` is defined as constant False for some abstract type and of course should have a type check. Otherwise I would say that `"some string"` is also Not A Number (and even doesn't have NaN or not-NaN semantics at all). – astentx Mar 10 '23 at 10:38
  • I'm guessing this fails to produce a value for signaling NaN's. – IInspectable Apr 03 '23 at 15:35
  • That is not always true. ``` In [1]: class LOL: ...: def __eq__(self, other): ...: return False ...: In [2]: x = LOL() In [3]: x == x Out[3]: False ``` – Jürgen Gmach Jun 28 '23 at 16:24
275

numpy.isnan(number) tells you if it's NaN or not.

Boris Verkhovskiy
  • 14,854
  • 11
  • 100
  • 103
mavnn
  • 9,101
  • 4
  • 34
  • 52
  • 1
    Thanks, stuck with 2.5, this is just what I needed – wich Jan 25 '10 at 09:57
  • 3
    Works in python version 2.7 too. – Michel Keijzers Dec 05 '12 at 14:35
  • 13
    `numpy.all(numpy.isnan(data_list))` is also useful if you need to determine if all elements in the list are nan – Jay Prall Feb 27 '14 at 22:18
  • 6
    No need for NumPy: `all(map(math.isnan, [float("nan")]*5))` – sleblanc Mar 28 '15 at 03:41
  • 8
    When this answer was written 6 years ago, Python 2.5 was still in common use - and math.isnan was not part of the standard library. Nowadays I'm really hoping that's not the case in many places! – mavnn Mar 30 '15 at 07:30
  • 1
    This is also useful if you're using numpy and don't want to import `math`. – Scimonster Apr 13 '16 at 10:12
  • 5
    note that np.isnan() doesn't handle decimal.Decimal type (as many numpy's function). math.isnan() does handle. – comte May 16 '18 at 15:53
  • I prefer this to the accepted answer, because `numpy.isnan` can handle arrays while `math.isnan` throws: `TypeError: only size-1 arrays can be converted to Python scalars`. – James Hirschorn Nov 12 '18 at 18:02
  • 1
    @comte: If you're using `Decimal`, you should use `d.is_nan()` instead of `math.isnan(d)`. Feeding `Decimal` instances to `math` functions is a bad habit to get into, because most `math` functions will convert the input to float and defeat the point of using `Decimal` in the first place. – user2357112 Apr 14 '20 at 07:11
  • `np.isnan('foo')` causes a TypeError exception because it can't handle strings. Use `pd.isna()` instead, if you need to handle strings, and are already using Pandas . That can handle `float('nan')` as well as strings. – wisbucky Sep 30 '22 at 20:30
235

Here are three ways to test whether a variable is NaN:

import pandas as pd
import numpy as np
import math

# For single variable all three libraries return single boolean
x1 = float("nan")

print(f"It's pd.isna: {pd.isna(x1)}")
print(f"It's np.isnan: {np.isnan(x1)}")
print(f"It's math.isnan: {math.isnan(x1)}")

Output

It's pd.isna: True
It's np.isnan: True
It's math.isnan: True
petezurich
  • 9,280
  • 9
  • 43
  • 57
M. Hamza Rajput
  • 7,810
  • 2
  • 41
  • 36
59

It seems that checking whether the value is unequal to itself (x != x) is the fastest.

import pandas as pd 
import numpy as np 
import math 

x = float('nan')

%timeit x != x
44.8 ns ± 0.152 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit math.isnan(x)
94.2 ns ± 0.955 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit pd.isna(x)
281 ns ± 5.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit np.isnan(x)
1.38 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
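The `%timeit` lines above need IPython. Outside IPython, a comparable measurement can be sketched with the standard-library `timeit` module. In this sketch the names are bound in `setup`, so both statements use local names and lookup cost should not dominate; the loop count is an arbitrary assumption, and no particular figures are claimed:

```python
import timeit

# Both names are created in the setup code, so the timed statements
# only use local names.
setup = "from math import isnan; x = float('nan')"
loops = 100_000  # kept small so the sketch runs quickly; raise it for steadier figures

t_self_compare = timeit.timeit("x != x", setup=setup, number=loops)
t_math_isnan = timeit.timeit("isnan(x)", setup=setup, number=loops)

print(f"x != x   : {t_self_compare / loops * 1e9:.1f} ns per loop")
print(f"isnan(x) : {t_math_isnan / loops * 1e9:.1f} ns per loop")
```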
wjandrea
  • 28,235
  • 9
  • 60
  • 81
Grzegorz
  • 1,268
  • 11
  • 11
  • 1
    It's worthwhile noting that this works even if infinities are in question. That is, if `z = float('inf')`, `z != z` evaluates to false. – npengra317 Oct 30 '20 at 19:05
  • 1
    in my computer ```z=float('inf')``` and then ```z==z``` give True. ```x=float('nan')``` and then ```x==x``` give False. – matan h Dec 10 '20 at 13:40
  • 3
    In most (if not all) cases, these speed differences will only be relevant, if repeated numerous times. Then you'll be using `numpy` or another tensor library, anyway. – rvf Jan 04 '22 at 10:58
  • 1
    This is a bad comparison. At this scale (nanoseconds) name and attribute lookup time are significant. If you use only local names, the difference between `x != x` and `math.isnan(x)` disappears; they're both about 35 ns on my system. You can use `%timeit` in cell mode to check: 1) `%%timeit x = float('nan')` `x != x` 2) `%%timeit x = float('nan'); from math import isnan` `isnan(x)` – wjandrea Apr 02 '23 at 18:29
  • Careful: These timings *only* represent checking a pre-existing variable and do not generalise well. A function such as `math.isnan` will compete very differently when a function is actually required and `x != x` would need wrapping in a `lambda`. A `numpy` functionality such as `numpy.isnan` will compete very differently when applied to a `numpy` array where `x != x` would require iteration. – MisterMiyagi Apr 03 '23 at 14:23
48

Here is an answer that works with:

  • NaN implementations that respect the IEEE 754 standard
    • i.e. Python's NaN: float('nan'), numpy.nan...
  • any other objects: strings or whatever (it does not raise exceptions if one is encountered)

A NaN implemented according to the standard is the only value for which the inequality comparison with itself should return True:

def is_nan(x):
    return (x != x)

And some examples:

import numpy as np
values = [float('nan'), np.nan, 55, "string", lambda x : x]
for value in values:
    print(f"{repr(value):<8} : {is_nan(value)}")

Output:

nan      : True
nan      : True
55       : False
'string' : False
<function <lambda> at 0x000000000927BF28> : False
x0s
  • 1,648
  • 17
  • 17
  • 1
    The series I'm checking is strings with missing values are 'nans' (???) so this solution works where others failed. – keithpjolley Nov 03 '18 at 22:49
  • `numpy.nan` is a regular Python `float` object, just like the kind returned by `float('nan')`. Most NaNs you encounter in NumPy will not be the `numpy.nan` object. – user2357112 Apr 14 '20 at 07:13
  • `numpy.nan` defines its NaN value [on its own in the underlying library in C](https://github.com/numpy/numpy/blob/35d01b2a7cd38b2fda54b148402919aa1dd7e9c4/numpy/core/include/numpy/npy_math.h#L49). It does not wrap python's NaN. But now, they both comply with IEEE 754 standard as they rely on C99 API. – x0s Apr 22 '20 at 07:59
  • @user2357112supportsMonica: Python and numpy NaN actually don't behave the same way: `float('nan') is float('nan')` (non-unique) and `np.nan is np.nan` (unique) – x0s Apr 22 '20 at 08:07
  • @x0s: That has nothing to do with NumPy. `np.nan` is a specific object, while each `float('nan')` call produces a new object. If you did `nan = float('nan')`, then you'd get `nan is nan` too. If you constructed an *actual* NumPy NaN with something like `np.float64('nan')`, then [you'd get `np.float64('nan') is not np.float64('nan')` too](https://ideone.com/gf5JeG). – user2357112 Apr 22 '20 at 10:09
  • @x0s: Those macros you're looking at in the source are a completely unrelated C-level thing. They're used in NumPy C code to get C-level NaNs, which are completely different from regular Python NaNs or NumPy array scalar NaNs. – user2357112 Apr 22 '20 at 10:17
  • It's important to understand that you **cannot** assume all NumPy NaNs are `numpy.nan`. [Even `numpy.array([numpy.nan])[0] is not numpy.nan`.](https://ideone.com/ma6jW0) – user2357112 Apr 22 '20 at 10:19
  • @user2357112supportsMonica: Thanks for your insights and supporting examples. I'll update the answer. – x0s Apr 22 '20 at 19:59
33

I actually just ran into this, but for me it was checking for nan, -inf, or inf. I just used

if float('-inf') < float(num) < float('inf'):

This is true for numbers, false for nan and both inf, and will raise an exception for things like strings or other types (which is probably a good thing). Also this does not require importing any libraries like math or numpy (numpy is so damn big it doubles the size of any compiled application).
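On Python 3.2 and later, the standard library's `math.isfinite` expresses the same test. A minimal sketch comparing it with the chained comparison (the helper names are hypothetical):

```python
import math


def is_finite_number(num):
    """Return True for ordinary numbers, False for NaN and +/-inf."""
    return math.isfinite(float(num))


def is_finite_by_comparison(num):
    # The chained comparison from the answer, wrapped in a function.
    return float('-inf') < float(num) < float('inf')


samples = [0, 1.5, -2, float('nan'), float('inf'), float('-inf')]
results = [(is_finite_number(v), is_finite_by_comparison(v)) for v in samples]
print(results)
```

Both helpers call `float()` first, so a non-numeric string such as `'abc'` raises `ValueError` in either one.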

DaveTheScientist
  • 3,299
  • 25
  • 19
  • 12
    `math.isfinite` was not introduced until Python 3.2, so given the answer from @DaveTheScientist was posted in 2012 it was not exactly "reinvent[ing] the wheel" - solution still stands for those working with Python 2. – sudo_coffee Nov 22 '16 at 17:09
  • This can be useful for people who need to check for NaN in a `pd.eval` expression. For example `pd.eval(float('-inf') < float('nan') < float('inf'))` will return `False` – Derek O May 25 '21 at 15:33
28

math.isnan()

or compare the number to itself. NaN is always != NaN, otherwise (e.g. if it is a number) the comparison should succeed.

Tomalak
  • 332,285
  • 67
  • 532
  • 628
  • 6
    For people stuck with python <= 2.5. Nan != Nan did not work reliably. Used numpy instead. – Bear Jan 18 '10 at 07:06
26

Well, I came to this post because I've had some issues with the function:

math.isnan()

There is a problem when you run this code:

import math

a = "hello"
math.isnan(a)

It raises an exception. My solution is to add another check:

def is_nan(x):
    return isinstance(x, float) and math.isnan(x)
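An alternative to the type check is to let `math.isnan` itself reject unsupported inputs and treat the resulting exception as "not NaN". A minimal sketch, with the hypothetical helper name `is_nan_safe`:

```python
import math


def is_nan_safe(x):
    """Return True only if x is a real number that is NaN."""
    try:
        return math.isnan(x)
    except (TypeError, OverflowError):
        # TypeError: x cannot be interpreted as a real number (str, None, list, ...).
        # OverflowError: x is an int too large to convert to float, so it is not NaN.
        return False


checks = [is_nan_safe(v) for v in (float('nan'), 1.0, "hello", None, 10**400)]
print(checks)
```

Unlike an `isinstance(x, float)` test, this version accepts any object that `math.isnan` can convert to a float.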
Idok
  • 3,642
  • 4
  • 21
  • 18
  • 4
    It was probably downvoted because isnan() takes a float, not a string. There's nothing wrong with the function, and the problems are only in his attempted use of it. (For that particular use case his solution is valid, but it's not an answer to this question.) – Peter Hansen Jul 07 '13 at 14:12
  • 7
    Be careful with checking for types in this way. This will not work e.g. for numpy.float32 NaN's. Better to use a try/except construction: `def is_nan(x): try: return math.isnan(x) except: return False` – Rob Mar 24 '14 at 10:25
  • 4
    NaN does *not* mean that a value is not a valid number. It is part of IEEE floating point representation to specify that a particular result is undefined. e.g. 0 / 0. Therefore asking if "hello" is nan is meaningless. – Brice M. Dempsey Jul 17 '15 at 08:50
  • 2
    this is better because NaN can land in any list of strings,ints or floats, so useful check – RAFIQ Mar 11 '16 at 08:41
  • I had to implement exactly this for handling string columns in pandas. – Cristian Garcia Jun 04 '20 at 19:09
17

Another method if you're stuck on <2.6, you don't have numpy, and you don't have IEEE 754 support:

def isNaN(x):
    return str(x) == str(1e400*0)
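The expression `1e400*0` may look cryptic. As a sketch of what it relies on (assuming a platform with the usual double-precision floats): the literal `1e400` is too large to represent and becomes infinity, and infinity times zero yields NaN.

```python
import math

overflowed = 1e400           # too large for a double, so the literal becomes inf
candidate = overflowed * 0   # inf * 0 has no defined value, so the result is NaN

print(repr(overflowed), repr(candidate), str(candidate))
```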
Josh Lee
  • 171,072
  • 38
  • 269
  • 275
10

With python < 2.6 I ended up with

def isNaN(x):
    return str(float(x)).lower() == 'nan'

This works for me with python 2.5.1 on a Solaris 5.9 box and with python 2.6.5 on Ubuntu 10

Mauro Bianchi
  • 683
  • 1
  • 8
  • 17
8

Comparison of pd.isna, math.isnan and np.isnan, and their flexibility in dealing with different types of objects.

The table below shows if the type of object can be checked with the given method:


+------------+-----+---------+------+--------+------+
|   Method   | NaN | numeric | None | string | list |
+------------+-----+---------+------+--------+------+
| pd.isna    | yes | yes     | yes  | yes    | yes  |
| math.isnan | yes | yes     | no   | no     | no   |
| np.isnan   | yes | yes     | no   | no     | yes  | <-- # will error on mixed type list
+------------+-----+---------+------+--------+------+

pd.isna

The most flexible method to check for different types of missing values.


None of the other answers cover the flexibility of pd.isna. While math.isnan and np.isnan return True for NaN values, they cannot check other types of objects such as None or strings. Both raise an error on such input, so checking a list with mixed types is cumbersome. pd.isna, by contrast, is flexible and returns the correct boolean for different kinds of types:

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: missing_values = [3, None, np.nan, pd.NA, pd.NaT, '10']

In [4]: pd.isna(missing_values)
Out[4]: array([False,  True,  True,  True,  True, False])
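Where pandas is not available, a rough approximation covering only `None` and float NaN can be sketched in plain Python. The helper name `is_missing` is hypothetical, and it does not recognise pandas-specific markers such as `pd.NA` or `pd.NaT`:

```python
def is_missing(value):
    """Hypothetical helper: True for None or a float NaN, False otherwise."""
    return value is None or (isinstance(value, float) and value != value)


values = [3, None, float('nan'), '10']
flags = [is_missing(v) for v in values]
print(flags)  # [False, True, True, False]
```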
Erfan
  • 40,971
  • 8
  • 66
  • 78
  • This!!!! I came here trying to figure out how to check for both NaN and None, which depending on user input excel sheets I could get either. If it weren't for those pesky users this would be easy! – turbonate Mar 23 '23 at 10:22
7

I am receiving the data from a web-service that sends NaN as a string 'Nan'. But there could be other sorts of string in my data as well, so a simple float(value) could throw an exception. I used the following variant of the accepted answer:

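import math

def isnan(value):
    try:
        return math.isnan(float(value))
    except (TypeError, ValueError, OverflowError):
        # float() rejects non-numeric strings, unsupported types and oversized ints
        return False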

Requirement:

isnan('hello') == False
isnan('NaN') == True
isnan(100) == False
isnan(float('nan')) == True
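This approach works because `float()` itself parses NaN strings: the match is case-insensitive, surrounding whitespace is ignored, and an optional sign is allowed. A short sketch (the list of spellings and the helper `parses_as_float` are only illustrative):

```python
import math

# Spellings that float() accepts as NaN.
nan_spellings = ['nan', 'NaN', 'NAN', ' nan ', '+nan', '-nan']
parsed = [float(s) for s in nan_spellings]
all_nan = all(math.isnan(v) for v in parsed)


def parses_as_float(text):
    """Return True if float() accepts the string, False otherwise."""
    try:
        float(text)
    except ValueError:
        return False
    return True


hello_parses = parses_as_float('hello')
print(all_nan, hello_parses)
```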
Mahdi
  • 1,778
  • 1
  • 21
  • 35
  • 1
    or `try: int(value)` – chwi Jul 06 '16 at 14:00
  • @chwi so what does your suggestion tell about `value` being `NaN` or not? – Mahdi Jul 06 '16 at 15:39
  • Well, being "not a number", anything that can not be casted to an int I guess is in fact not a number, and the try statement will fail? Try, return true, except return false. – chwi Jul 07 '16 at 09:29
  • @chwi Well, taking "not a number" literally, you are right, but that's not the point here. In fact, I am looking exactly for what the semantics of `NaN` is (like in python what you could get from `float('inf') * 0`), and thus although the string 'Hello' is not a number, but it is also not `NaN` because `NaN` is still a numeric value! – Mahdi Jul 07 '16 at 11:19
  • @chwi: You are correct, if exception handling is for specific exception. But in this answer, generic exception have been handled. So no need to check `int(value)` For all exception, `False` will be written. – Harsha Biyani Jan 15 '20 at 11:53
4

All the methods to tell if the variable is NaN or None:

None type

In [1]: import math

In [2]: a = None
In [3]: not a
Out[3]: True

In [4]: len(a or ()) == 0
Out[4]: True

In [5]: a == None
Out[5]: True

In [6]: a is None
Out[6]: True

In [7]: a != a
Out[7]: False

In [9]: math.isnan(a)
Traceback (most recent call last):
  File "<ipython-input-9-6d4d8c26d370>", line 1, in <module>
    math.isnan(a)
TypeError: a float is required

In [10]: len(a) == 0
Traceback (most recent call last):
  File "<ipython-input-10-65b72372873e>", line 1, in <module>
    len(a) == 0
TypeError: object of type 'NoneType' has no len()

NaN type

In [11]: b = float('nan')
In [12]: b
Out[12]: nan

In [13]: not b
Out[13]: False

In [14]: b != b
Out[14]: True

In [15]: math.isnan(b)
Out[15]: True
siberiawolf61
  • 77
  • 1
  • 3
4

How to remove NaN (float) item(s) from a list of mixed data types

If you have mixed types in an iterable, here is a solution that does not use numpy:

from math import isnan

Z = ['a','b', float('NaN'), 'd', float('1.1024')]

[x for x in Z if not (
                      type(x) == float # let's drop all float values…
                      and isnan(x) # … but only if they are nan
                      )]
['a', 'b', 'd', 1.1024]

Short-circuit evaluation means that isnan will not be called on values that are not of type 'float', as False and (…) quickly evaluates to False without having to evaluate the right-hand side.
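One detail of the exact-type test is that `type(x) == float` skips subclasses of `float`. A sketch of the difference, using a hypothetical subclass `Reading` purely for illustration:

```python
from math import isnan


class Reading(float):
    """Hypothetical float subclass, used only for illustration."""


Z = ['a', 'b', float('nan'), 'd', 1.1024, Reading('nan')]

# The exact type check leaves the subclass NaN in the list...
exact_type = [x for x in Z if not (type(x) == float and isnan(x))]

# ...while isinstance also drops NaNs that are instances of float subclasses.
with_isinstance = [x for x in Z if not (isinstance(x, float) and isnan(x))]

print(len(exact_type), with_isinstance)
```

Which behaviour is preferable depends on whether subclass instances should count as floats in the data at hand.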

petezurich
  • 9,280
  • 9
  • 43
  • 57
sleblanc
  • 3,821
  • 1
  • 34
  • 42
4

In Python 3.6, calling math.isnan(x) or np.isnan(x) on a string value x raises an error. So I can't check whether a given value is NaN unless I know beforehand that it's a number. The following seems to solve this issue:

# str(x) is 'nan' for a float NaN; the isinstance test excludes the literal string 'nan'
if str(x) == 'nan' and not isinstance(x, str):
    print('NaN')
else:
    print('non NaN')
1

For nan of type float

>>> import pandas as pd
>>> value = float('nan')
>>> type(value)
<class 'float'>
>>> pd.isnull(value)
True
>>>
>>> value = 'nan'
>>> type(value)
<class 'str'>
>>> pd.isnull(value)
False
J11
  • 455
  • 4
  • 8
0

If you want to check for values that are not NaN, then negate whatever is used to flag NaNs; pandas has its own dedicated function for flagging non-NaN values.

import math
import numpy as np
import pandas as pd

lst = [1, 2, float('nan')]

m1 = [e == e for e in lst]              # [True, True, False]

m2 = [not math.isnan(e) for e in lst]   # [True, True, False]

m3 = ~np.isnan(lst)                     # array([ True,  True, False])

m4 = pd.notna(lst)                      # array([ True,  True, False])

This is especially useful if you want to filter values that are not NaN. For ndarray/Series objects, == is vectorized, so it can be used as well.

s = pd.Series(lst)
arr = np.array(lst)

x = s[s.notna()]
y = s[s==s]                             # `==` is vectorized
z = arr[~np.isnan(arr)]                 # array([1., 2.])

assert (x == y).all() and (x == z).all()
cottontail
  • 10,268
  • 18
  • 50
  • 51
-5

For strings in pandas, use pd.isnull:

if not pd.isnull(atext):
  for word in nltk.word_tokenize(atext):

Here is the full function, used for feature extraction with NLTK:

def act_features(atext):
    features = {}
    # skip missing (NaN/None) text
    if not pd.isnull(atext):
        for word in nltk.word_tokenize(atext):
            if word not in default_stopwords:
                features['cont({})'.format(word.lower())] = True
    return features
Max Kleiner
  • 1,442
  • 1
  • 13
  • 14