
I've been given child birth data from a hospital and asked to perform certain tasks on it:

timestamp  ethnicity  gender  body_mass
01:03:27   indian     m       8.1
01:07:20   hispanic   f       5.9
01:09:34   romani     m       7.2
...        ...        ...     ...
11:56:15   irish      f       6.3

I need to generate statistical features for every value in 'ethnicity' every 10 minutes:

timestamp  indian_avg  indian_max  indian_min  ...  irish_min
01:20:00   7.1         9.5         4.7         ...  5.1
01:40:00   7.2         8.8         5.6         ...  6.9
...        ...         ...         ...         ...  ...
12:00:00   7.6         10.1        5.1         ...  6.7

Please help. I am a beginner and have been stuck on this problem for a day now.

ansev

1 Answer


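You can use pd.Grouper to group by a 10-minute frequency and by ethnicity. pd.Grouper(freq=...) needs datetime-like values, so convert timestamp with pd.to_datetime first and either set it as the index or pass it as key.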

# assumes df['timestamp'] has already been converted with pd.to_datetime
df.groupby([pd.Grouper(key='timestamp', freq='10min'), 'ethnicity']) \
  .agg({'body_mass': ['mean', 'max', 'min']})
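As a minimal sketch of the grouping step, the snippet below builds a few toy rows in the question's layout (the values are invented for illustration, not taken from the hospital data), parses the time strings, and computes the per-interval statistics. It assumes the timestamps are plain 'HH:MM:SS' strings with no date part.

```python
import pandas as pd

# Toy rows in the question's layout (values invented for illustration).
df = pd.DataFrame({
    "timestamp": ["01:03:27", "01:07:20", "01:09:34", "01:12:05", "01:18:40"],
    "ethnicity": ["indian", "hispanic", "indian", "indian", "hispanic"],
    "gender": ["m", "f", "m", "f", "f"],
    "body_mass": [8.1, 5.9, 7.2, 6.4, 6.3],
})

# Parse the 'HH:MM:SS' strings; pandas fills in a dummy date (1900-01-01).
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%H:%M:%S")

# One row per (10-minute bin, ethnicity) pair that actually occurs in the data.
stats = (
    df.groupby([pd.Grouper(key="timestamp", freq="10min"), "ethnicity"])["body_mass"]
      .agg(["mean", "max", "min"])
)
print(stats)
```

The result is in long form, with a two-level index of interval start and ethnicity; it still has to be reshaped to get one column per ethnicity and statistic.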

To get the exact output format you want, unstack the ethnicity level and then flatten the resulting hierarchical columns (read more in: Pandas - How to flatten a hierarchical index in columns):

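# assumes df['timestamp'] has already been converted with pd.to_datetime
out = df.groupby([pd.Grouper(key='timestamp', freq='10min'), 'ethnicity']) \
  .agg({'body_mass': ['mean', 'max', 'min']}) \
  .unstack()
# columns are (column, statistic, ethnicity) tuples; flatten to e.g. 'indian_max'
out.columns = [f'{eth}_{stat}' for _, stat, eth in out.columns]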
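Putting the reshaping together, here is a hedged end-to-end sketch on invented toy data. Two choices are assumptions rather than part of the answer: `label="right"` stamps each interval with its end time, which is one reading of the desired output in the question, and the `mean` column is renamed to `avg` only to match the question's column names.

```python
import pandas as pd

# Toy data (values invented for illustration).
df = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["01:03:27", "01:07:20", "01:09:34", "01:12:05"], format="%H:%M:%S"
    ),
    "ethnicity": ["indian", "irish", "indian", "irish"],
    "body_mass": [8.1, 5.9, 7.2, 6.3],
})

wide = (
    df.groupby(
        [pd.Grouper(key="timestamp", freq="10min", label="right"), "ethnicity"]
    )["body_mass"]
      .agg(["mean", "max", "min"])
      .rename(columns={"mean": "avg"})
      .unstack("ethnicity")
)

# Columns are (statistic, ethnicity) pairs; flatten to 'ethnicity_statistic'.
wide.columns = [f"{eth}_{stat}" for stat, eth in wide.columns]

# Optional: keep each ethnicity's columns together and show only the time of day.
wide = wide.sort_index(axis=1)
wide.index = wide.index.strftime("%H:%M:%S")
print(wide)
```

Intervals in which an ethnicity has no births show up as NaN in that ethnicity's columns.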
Ian Wright