In NumPy and Pandas, `nan != nan` and `NaT != NaT`. So, when comparing results during unit testing, how can I assert that a returned value is one of those values? A simple `assertEqual` naturally fails, even if I use `pandas.util.testing`.

- use [`isnull`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.isnull.html#pandas.isnull) – EdChum Sep 11 '15 at 12:59
- simply `value != value` should be true – Hacketo Sep 11 '15 at 13:11
3 Answers
If you're comparing scalars, one way is to use `assertTrue` with `isnull`. For example, in the DataFrame unit tests (`pandas/tests/test_frame.py`) you can find tests such as this:

    self.assertTrue(com.isnull(df.ix['c', 'timestamp']))

(`com` is an alias for `pandas/core/common.py` and so `com.isnull` calls the same underlying function as `pd.isnull`.)
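
As a minimal sketch of the scalar case in a standalone test, the same check can be written with the public `pd.isnull`. The functions `latest_reading` and `latest_timestamp` are hypothetical stand-ins for whatever code is under test; they are not part of pandas.

```python
import unittest

import numpy as np
import pandas as pd


def latest_reading(readings):
    """Hypothetical function under test: returns NaN when there is no data."""
    return readings[-1] if readings else np.nan


def latest_timestamp(stamps):
    """Hypothetical function under test: returns NaT when there is no data."""
    return stamps[-1] if stamps else pd.NaT


class NullScalarTest(unittest.TestCase):
    def test_nan_is_detected(self):
        # assertEqual(result, np.nan) would fail here because nan != nan.
        self.assertTrue(pd.isnull(latest_reading([])))

    def test_nat_is_detected(self):
        self.assertTrue(pd.isnull(latest_timestamp([])))

    def test_real_value_is_not_null(self):
        self.assertFalse(pd.isnull(latest_reading([1.5, 2.5])))
```

Saved in a test module, this can be run with `python -m unittest`.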
If, on the other hand, you're comparing Series or DataFrames with null values for equality, these are handled automatically by `tm.assert_series_equal` and `tm.assert_frame_equal`. For example:

    >>> import numpy as np
    >>> import pandas as pd
    >>> import pandas.util.testing as tm
    >>> df = pd.DataFrame({'a': [1, np.nan]})
    >>> df
        a
    0   1
    1 NaN

Normally, `NaN` is not equal to `NaN`:

    >>> df == df
           a
    0   True
    1  False

But `assert_frame_equal` treats `NaN` as being equal to itself:

    >>> tm.assert_frame_equal(df, df)
    # no AssertionError raised
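
In more recent pandas releases the same helpers are exposed publicly under `pandas.testing`, so a check along these lines might look like the sketch below. The `build_frame` helper and its column names are made up for illustration; the point is only that nulls in matching positions (both `NaN` and `NaT`) compare as equal, while a real difference still raises.

```python
import numpy as np
import pandas as pd
import pandas.testing as pdt


def build_frame():
    """Hypothetical function under test: returns a frame containing nulls."""
    return pd.DataFrame({
        "value": [1.0, np.nan],
        "when": [pd.Timestamp("2015-09-11"), pd.NaT],
    })


actual = build_frame()
expected = build_frame()

# Nulls in the same positions are treated as equal: no AssertionError here.
pdt.assert_frame_equal(actual, expected)
pdt.assert_series_equal(actual["value"], expected["value"])

# A genuine difference is still reported.
altered = expected.fillna({"value": 0.0})
try:
    pdt.assert_frame_equal(actual, altered)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```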

Testing on Python 2.7, I get the following:

    import numpy as np
    import pandas as pd

    x = np.nan
    x is np.nan   # True
    x is pd.NaT   # False
    np.isnan(x)   # True
    pd.isnull(x)  # True

    y = pd.NaT
    y is np.nan   # False
    y is pd.NaT   # True
    np.isnan(y)   # TypeError !!
    pd.isnull(y)  # True

You can also use

    x != x  # True for nan
    y != y  # True for NaT

But I don't really like this style; I can never quite convince myself to trust it.
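
One caveat with the `is` checks above: they only succeed when the value is the very same object as `np.nan` or `pd.NaT`, and a NaN produced elsewhere need not be. A small sketch of a scalar-only helper built on `pd.isnull` avoids relying on identity; the name `assert_is_null` is hypothetical, not something pandas provides.

```python
import numpy as np
import pandas as pd


def assert_is_null(value):
    """Hypothetical helper: fail unless a scalar is NaN, NaT or None."""
    if not pd.isnull(value):
        raise AssertionError("expected a null value, got %r" % (value,))


computed_nan = float("nan")

# Identity is not reliable: this NaN is a different object from np.nan.
same_object = computed_nan is np.nan

# pd.isnull does not depend on identity, so all of these pass.
assert_is_null(computed_nan)
assert_is_null(np.nan)
assert_is_null(pd.NaT)
assert_is_null(None)

# A real value is rejected.
try:
    assert_is_null(1.0)
    rejected = False
except AssertionError:
    rejected = True
```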

Before doing an `assert_frame_equal` check, you could use the `.fillna()` method on the dataframes to replace the null values with something else that won't otherwise appear in your values. You may also want to read these examples on how to use the `.fillna()` method.
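
A rough sketch of the idea, assuming a sentinel value (`SENTINEL`, chosen arbitrarily here) that cannot occur in the real data. It uses a plain element-wise `==` comparison to show the effect of filling:

```python
import numpy as np
import pandas as pd

SENTINEL = -9999.0  # hypothetical placeholder; must not occur in real data

left = pd.DataFrame({"a": [1.0, np.nan]})
right = pd.DataFrame({"a": [1.0, np.nan]})

# Element-wise comparison treats NaN as unequal to NaN.
raw_equal = bool((left == right).all().all())

# After filling, the former NaN cells hold the same sentinel and compare equal.
filled_equal = bool(
    (left.fillna(SENTINEL) == right.fillna(SENTINEL)).all().all()
)
```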

- Thank you, this is almost exactly what I have been looking for. I'm saying "almost" as you can't pass `None`, which would be ideal as a type-neutral value, but another unique scalar, such as a zero or a string (e.g. `"INCORRECT!!!1!1!"` ;-) ), is good enough for now. – Berislav Lopac Sep 12 '15 at 08:16
- @BerislavLopac: perhaps I've misunderstood exactly what you're trying to do, but `assert_frame_equal` already asserts that `NaN` is equal to `NaN`. Using `fillna()` to replace `NaN` with some other scalar to be compared for equality is redundant and so isn't used in Pandas' unit tests. – Alex Riley Sep 12 '15 at 09:22
- Gah, you're right -- I took your advice too literally and called `fillna` _before_ `assert_frame_equal`, so I missed that it works out the differences. Thanks! – Berislav Lopac Sep 12 '15 at 09:26