Why is pandas's isin method returning False for matching None?

Question

Suppose I have two equivalent dataframes:

df1 = pd.DataFrame({'a': [None, None, None]})
df2 = pd.DataFrame({'a': [None, None, None]})

When I use the isin method like this:

df1.isin(df2)

I get the following output:

       a
0  False
1  False
2  False

I would expect the results to be True for all. Why am I not getting my expected results?

Appreciate any guidance and clarifications from the community, thanks!

None is not equal to None. can't be compared, hence False – sammywemmy Aug 29 '21 at 03:39

U13-Forward · Answer 1 · 2021-08-29T03:53:43.157


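It's because pandas treats None as a missing value, just like np.nan, and missing values never compare equal to each other:

>>> np.nan == np.nan
False

When isin is given a DataFrame, it compares the two frames element by element, so every None-to-None comparison gives False.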
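To see this with the frames from the question, here is a minimal sketch. The names `plain` and `null_safe` are only illustrative, and the null-safe comparison is one possible workaround (assuming you want "both missing" to count as a match), not something isin does for you:

```python
import pandas as pd

df1 = pd.DataFrame({'a': [None, None, None]})
df2 = pd.DataFrame({'a': [None, None, None]})

# Element-wise membership test: missing values never match each other.
plain = df1.isin(df2)

# Count a cell as matching if the values are equal OR both are missing.
null_safe = df1.eq(df2) | (df1.isna() & df2.isna())

print(plain['a'].tolist())      # [False, False, False]
print(null_safe['a'].tolist())  # [True, True, True]
```

If you only need to know whether the two frames are the same as a whole, `df1.equals(df2)` also treats missing values in the same positions as equal.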

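Object identity is not what decides this. np.nan is an ordinary Python float holding the IEEE 754 NaN value, and IEEE 754 defines NaN as unequal to every value, including itself. Each call to float('nan') does create a new object, as you can see:

>>> id(float('nan'))
1881571529008
>>> id(float('nan'))
1881618531856

They're different objects, but that only matters for `is`. `==` returns False even when both sides are the same NaN object.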
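Identity (`is`) and equality (`==`) are separate checks, and for NaN they disagree. A short sketch, with variable names chosen only for illustration:

```python
import math
import numpy as np

x = float('nan')

same_object = x is x      # identity: the very same object
equal_to_self = x == x    # equality: NaN is never equal, even to itself

print(same_object)              # True
print(equal_to_self)            # False
print(np.nan is np.nan)         # True: np.nan is a single float object
print(np.nan == np.nan)         # False
print(math.isnan(x))            # True: the reliable way to test for NaN
```

This is why NaN checks should use `math.isnan`, `np.isnan`, or `pd.isna` rather than `==`.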

A toy class that mimics np.nan != np.nan (not how float is actually implemented) could be:

class nan:
    def __repr__(self):
        return 'nan'
    def __eq__(self, other):
        return False
print(nan() == nan())
print(nan())

Output:

False
nan

edited Aug 29 '21 at 03:53

answered Aug 29 '21 at 03:41

U13-Forward

The id is compared with `is`. `==` uses the equality operator. Presumably `np.nan` has `def __eq__(self, other): return False` – Acccumulation Aug 29 '21 at 03:45
@Acccumulation yeap exactly! – U13-Forward Aug 29 '21 at 03:45
