
I have the following dataframe:

      Name  Number        Date   Time  Temperature  RH  Height         AH
0     Rome     301  01/10/2019  02:00         20.5  89      10  15.830405
1     Rome     301  01/10/2019  05:00         19.4  91      10  15.176020
..     ...     ...         ...    ...          ...  ..     ...        ...
91  Napoli     600  02/10/2019  11:00         30.5  52       5  16.213860
92  Napoli     600  02/10/2019  14:00         30.3  51       5  15.731054

The "Name" column holds a few locations, and "AH" is the absolute humidity. I want to calculate the median AH for each location on each Date (there are 2 days) and show these daily medians in new columns called med_AH_[Date] (2 new columns in total).

How do I do this?

This is what I have so far:

my_data['med_AH_[Date]']= my_data.groupby('Name')['AH'].transform('median')

But that only gives me the median per Name, with no split by date.

Sash Vash

2 Answers


I believe you just need to update your groupby to include Date:

my_data['med_AH_[Date]']= my_data.groupby(['Name', 'Date'])['AH'].transform('median')
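
For illustration, here is a minimal sketch of what that two-key groupby produces. The sample values and the column name med_AH are made up for this example and are not taken from the question's data.

```python
import pandas as pd

# Hypothetical sample data; only the columns relevant to the grouping
my_data = pd.DataFrame({
    "Name": ["Rome", "Rome", "Rome", "Rome",
             "Napoli", "Napoli", "Napoli", "Napoli"],
    "Date": ["01/10/2019", "01/10/2019", "02/10/2019", "02/10/2019"] * 2,
    "AH": [15.0, 17.0, 20.0, 24.0, 10.0, 12.0, 16.0, 18.0],
})

# One median per (Name, Date) pair, broadcast back to every row of that pair.
# "med_AH" is an illustrative column name.
my_data["med_AH"] = my_data.groupby(["Name", "Date"])["AH"].transform("median")

print(my_data)
```

Because transform keeps the original index, each row carries the median of its own location and day, so this yields a single column rather than one column per date.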
Joe

You can pass more than one column to groupby, and call the built-in median method directly instead of using transform. This returns one row per (name, date) pair rather than a column aligned with the original rows:

df_grouped = df.groupby(["name", "date"], as_index=False)["AH"].median()
df_grouped

    name    date        AH
0   Milan   03/10/2019  28.0
1   Napoli  02/10/2019  31.5
2   Rome    01/10/2019  17.0
3   Rome    02/10/2019  26.0
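
If the goal is one column per date, as the question asks, the grouped result could be pivoted to a wide layout. This is only a sketch: df_grouped is rebuilt by hand from the output above, and the name wide and the "med_AH_" prefix are choices made for this example.

```python
import pandas as pd

# The grouped medians shown above, rebuilt by hand for illustration
df_grouped = pd.DataFrame({
    "name": ["Milan", "Napoli", "Rome", "Rome"],
    "date": ["03/10/2019", "02/10/2019", "01/10/2019", "02/10/2019"],
    "AH": [28.0, 31.5, 17.0, 26.0],
})

# One row per name, one column per date; (name, date) pairs with no data become NaN
wide = (
    df_grouped.pivot(index="name", columns="date", values="AH")
    .add_prefix("med_AH_")
)
wide.columns.name = None   # drop the leftover "date" label on the column axis
wide = wide.reset_index()  # turn "name" back into a regular column

print(wide)
```

Assuming the original frame has the same "name" column, something like df.merge(wide, on="name", how="left") should then attach these per-date medians to every row.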
sergiomahi