Suppose I have two equivalent dataframes:

import pandas as pd

df1 = pd.DataFrame({'a': [None, None, None]})
df2 = pd.DataFrame({'a': [None, None, None]})

When I use the isin method like this:

df1.isin(df2)

I get the following output:

       a
0  False
1  False
2  False

I would expect the results to be True for all. Why am I not getting my expected results?

Appreciate any guidance and clarifications from the community, thanks!

Yee Cheng

1 Answer

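It's because pandas treats None as a missing value, just like np.nan, and missing values never compare equal, not even to each other. The same happens with plain NumPy:

>>> import numpy as np
>>> np.nan == np.nan
False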
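To get the result the question expects, one option is to handle missing values explicitly. The following is a minimal sketch, not the only approach: it reuses df1 and df2 from the question, and the names elementwise, same_frame, both_missing and matches are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [None, None, None]})
df2 = pd.DataFrame({'a': [None, None, None]})

# Element-wise comparison: missing values never match each other.
elementwise = df1.isin(df2)

# DataFrame.equals treats missing values in the same position as equal.
same_frame = df1.equals(df2)

# Count a cell as a match when the values are equal
# or when both cells are missing.
both_missing = df1.isna() & df2.isna()
matches = df1.eq(df2) | both_missing

print(elementwise)
print(same_frame)
print(matches)
```

Use equals when a single yes/no answer for the whole frame is enough, and the isna-based mask when a per-cell result is needed.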

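np.nan is an ordinary Python float holding the IEEE 754 not-a-number value, the same value that float('nan') produces, and IEEE 754 defines NaN as unequal to everything, including itself. Separately, every call to float('nan') creates a new object, as you can see:

>>> id(float('nan'))
1881571529008
>>> id(float('nan'))
1881618531856

They're different objects! Identity is not what makes the comparison fail, though: np.nan is a single object, and np.nan == np.nan is still False.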
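The difference between identity and equality for NaN can be checked directly. This is a small sketch using only the standard library; the variable names are illustrative.

```python
import math

nan = float('nan')

# The same object is identical to itself...
same_object = nan is nan

# ...but NaN never compares equal, even to itself.
equal_to_self = nan == nan

# The reliable way to detect NaN is math.isnan, not ==.
detected = math.isnan(nan)

# List membership checks identity before equality,
# so the very same NaN object is found in the list.
in_list = nan in [nan]

print(same_object, equal_to_self, detected, in_list)
```

So a NaN fails == regardless of whether the two operands are the same object or two different ones.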

A toy class that mimics this behaviour, never comparing equal even to itself, could be:

class nan:
    def __repr__(self):
        return 'nan'
    def __eq__(self, other):
        return False
print(nan() == nan())
print(nan())

Output:

False
nan
U13-Forward