I want to use the transform method on a groupby object with built-in functions (e.g. 'mean', 'sum', etc.) but keep np.nan values. For example,
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'value': np.random.randint(0, 100, 8)}, index=list('aabbccdd'))
df.iloc[[0, 6]] = np.nan  # blank out one row in group 'a' and one in group 'd'
df.groupby(level=0).transform('min')
yields
value
a 43.0
a 43.0
b 4.0
b 4.0
c 44.0
c 44.0
d 89.0
d 89.0
but I want:
value
a np.nan
a np.nan
b 4.0
b 4.0
c 44.0
c 44.0
d np.nan
d np.nan
Using my own function, such as lambda x: x.min(skipna=False), will work... eventually, but I have millions of small groups, on which lambda and NumPy methods take an eternity. Any suggestions?
Yes, there is a similar question, but note that in that question the OP wants to include np.nan groups, whereas I want to avoid skipping np.nan values within the groups.
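
For reference, here is a minimal sketch of one vectorized direction that might avoid the per-group lambda. It is not from the linked question and has not been benchmarked on millions of groups. It keeps the built-in 'min' transform and then blanks out every group that contains a NaN. The frame below uses hand-picked values rather than the random data above, and the names fast_min, has_nan and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hand-picked values, not the random data from the post)
df = pd.DataFrame(
    {"value": [np.nan, 47.0, 64.0, 67.0, 67.0, 9.0, np.nan, 21.0]},
    index=list("aabbccdd"),
)

# Built-in aggregation broadcast back to the rows; this skips NaN by default
fast_min = df.groupby(level=0)["value"].transform("min")

# True for every row whose group contains at least one NaN
has_nan = df["value"].isna().groupby(level=0).transform("any")

# Replace the result with NaN wherever the group had a NaN.
# .to_numpy() makes the mask positional, so the duplicate index labels
# are never aligned against each other.
result = fast_min.mask(has_nan.to_numpy())
print(result)
```

Both steps use built-in groupby kernels, so no Python function is called per group. Swapping "min" for "sum" or "mean" should work the same way, since the mask does not depend on the aggregation.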