
I have a DataFrame as below:

Res_id  Mean_per_year
a     10.4
a     12.4
b     4.4
b     4.5
c     17
d     9

I would like to calculate the std() with pandas on "Mean_per_year" for each Res_id. I did an aggregation and got:

Res_id  Mean_per_year_agg
a     10.4,12.4
b     4.4,4.5
c     17
d     9

But when I applied std() on Mean_per_year_agg, it did not work. My code:

data = (data.groupby(['res_id'])
            .agg({'mean_per_year': lambda x: x.tolist()})
            .rename({'res_id': 'mean_per_year'}, axis=1)
            .reset_index())

data['std'] = data.groupby('res_id')['mean_per_year_agg'].std()

Error:

/usr/local/lib/python3.6/dist-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
    129         if self._validate_ndim and self.ndim and len(self.mgr_locs) != len(self.values):
    130             raise ValueError(
--> 131                 f"Wrong number of items passed {len(self.values)}, "
    132                 f"placement implies {len(self.mgr_locs)}"
    133             )

ValueError: Wrong number of items passed 0, placement implies 1

Thanks for any help

orlp
Aris
  • `data['std'] = data.groupby('res_id')['mean_per_year_agg'].std()` simply does the job. No need to groupby and store it as list. Skip the part above this. – Raghul Raj Feb 19 '21 at 10:55
  • This is exactly what I did, and I get the error: ValueError: Wrong number of items passed 0, placement implies 1 – Aris Feb 19 '21 at 11:12

1 Answer


You simply want `data.groupby('res_id')["Mean_per_year"].std()` on your original DataFrame (drop the whole aggregation step). Use the column names exactly as they appear in your frame, including capitalization.
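A minimal sketch of that, assuming the column names are spelled as in the sample table (`Res_id`, `Mean_per_year`). The variable name `std_per_id` is only for illustration. Note that pandas computes the sample standard deviation by default (`ddof=1`), so groups with a single row, such as `c` and `d` here, come out as NaN.

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "Res_id": ["a", "a", "b", "b", "c", "d"],
    "Mean_per_year": [10.4, 12.4, 4.4, 4.5, 17, 9],
})

# One standard deviation per Res_id, computed on the original (long) frame.
# Groups with only one value yield NaN because ddof defaults to 1.
std_per_id = data.groupby("Res_id")["Mean_per_year"].std()
print(std_per_id)

# To attach the result as a column aligned with the original rows,
# transform broadcasts each group's value back to its rows.
data["std"] = data.groupby("Res_id")["Mean_per_year"].transform("std")
print(data)
```

Assigning the plain `.std()` result to `data['std']` would align on the index (`Res_id` labels versus the default integer index), which is why `transform` is used when the column has to sit next to the original rows.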

orlp