I have a pandas DataFrame with a column populated by "yes" or "no" strings. When I call .value_counts() on this column, I get the correct distribution.
But when I run .isna(), it looks as if the whole column is NaN.

I suspect this will cause problems for me later.

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[0,1,2,3,4],[40,30,20,10,0], ['yes','yes','no','no','yes']]).T, columns=['A','B','C'])

len(df['C'].isna())  # 5 --> why?!
df['C'].value_counts()  # yes: 3, no: 2 --> as expected
ArieAI

1 Answer

len gives you the length of the Series (irrespective of its content), not the number of True values.

Use sum if you want the count of True:

df['C'].isna().sum()
# 0
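
To make the difference concrete, here is a minimal sketch that rebuilds the frame from the question and compares len, sum and any on the same boolean mask. The names mask, n_rows, n_missing, has_missing, s and n_missing_after are only illustrative, and the copy with one blanked-out value is a hypothetical addition to show the count changing:

```python
import numpy as np
import pandas as pd

# Same frame as in the question; np.array coerces every value to a string here.
df = pd.DataFrame(
    np.array([[0, 1, 2, 3, 4], [40, 30, 20, 10, 0], ['yes', 'yes', 'no', 'no', 'yes']]).T,
    columns=['A', 'B', 'C'],
)

mask = df['C'].isna()           # boolean Series, one entry per row
n_rows = len(mask)              # length of the Series, whatever it contains
n_missing = int(mask.sum())     # True counts as 1, so this counts the NaNs
has_missing = bool(mask.any())  # True if at least one value is missing

# Hypothetical copy with one value blanked out, to show the count changing.
s = df['C'].copy()
s.iloc[0] = np.nan
n_missing_after = int(s.isna().sum())

print(n_rows, n_missing, has_missing, n_missing_after)
```

isna() always returns one boolean per row, so its length equals the number of rows no matter how many values are actually missing; only sum (or any) looks at the contents.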
mozway