I think the problem is that after aggregating with max you get all NaNs, so value_counts returns an empty Series:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A':[1,1,0,np.nan],
                   'npatience':[np.nan,np.nan,4,5],
                   'C':[1,0,np.nan,np.nan],
                   'D':[1,3,5,7]})
print (df)
     A    C  D  npatience
0  1.0  1.0  1        NaN
1  1.0  0.0  3        NaN
2  0.0  NaN  5        4.0
3  NaN  NaN  7        5.0
print (df.A.value_counts())
1.0 2
0.0 1
Name: A, dtype: int64
print (df.C.value_counts())
0.0 1
1.0 1
Name: C, dtype: int64
g = df.groupby('npatience').max()
print (g)
             A   C  D
npatience
4.0        0.0 NaN  5
5.0        NaN NaN  7
print (g.C)
npatience
4.0 NaN
5.0 NaN
Name: C, dtype: float64
#check if all values in the column are NaN
print (g.C.isnull().all())
True
print (g.A)
npatience
4.0 0.0
5.0 NaN
Name: A, dtype: float64
print (g.C.value_counts())
Series([], Name: C, dtype: int64)
print (g.A.value_counts())
0.0 1
Name: A, dtype: int64
print (g.C.value_counts(dropna=False))
NaN 2
Name: C, dtype: int64
print (g.A.value_counts(dropna=False))
NaN 1
0.0 1
Name: A, dtype: int64
EDIT:
groupby removes rows with NaN keys by default (it cannot group by NaNs), so it is the same as calling dropna before the groupby:
g = df.dropna(subset=['npatience'])
print (g)
     A    C  D  npatience
2  0.0  NaN  5        4.0
3  NaN  NaN  7        5.0
print (g.C)
2 NaN
3 NaN
Name: C, dtype: float64
#check if all values in the column are NaN
print (g.C.isnull().all())
True
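The equivalence can be checked directly: grouping the full frame gives the same result as dropping the rows with a NaN key first.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 0, np.nan],
                   'npatience': [np.nan, np.nan, 4, 5],
                   'C': [1, 0, np.nan, np.nan],
                   'D': [1, 3, 5, 7]})

# groupby silently drops rows whose key is NaN ...
g1 = df.groupby('npatience').max()
# ... so dropping those rows explicitly first gives the same result
g2 = df.dropna(subset=['npatience']).groupby('npatience').max()

print(g1.equals(g2))
```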
And a solution for groupby without removing the NaN rows is to replace the NaNs with a value that does not appear in df, like 1000:
g = df.fillna(1000).groupby('npatience').max()
print (g)
                A       C  D
npatience
4.0           0.0  1000.0  5
5.0        1000.0  1000.0  7
1000.0        1.0     1.0  3
print (g.C.value_counts())
1000.0 2
1.0 1
Name: C, dtype: int64
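If you are on pandas 1.1 or newer, groupby also accepts dropna=False, which keeps the rows with a NaN key as their own group without the fillna workaround. Note the aggregated values stay NaN, so value_counts still needs dropna=False to count them:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 0, np.nan],
                   'npatience': [np.nan, np.nan, 4, 5],
                   'C': [1, 0, np.nan, np.nan],
                   'D': [1, 3, 5, 7]})

# pandas >= 1.1: keep rows with a NaN key as their own group
g = df.groupby('npatience', dropna=False).max()
print(g)

# the NaN group yields C.max() == 1.0; the other two groups stay NaN,
# so count them explicitly
print(g.C.value_counts(dropna=False))
```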