0

I have the following pandas DataFrame df:

                                                    FFDI_SFC    AET_date
latitude        longitude       time        
-39.7650000000  140.8954000000  2017-09-30 13:00:00     1   2017-09-30
                                2017-09-30 14:00:00     2   2017-10-01
                                2017-09-30 15:00:00     1   2017-10-01
                                2017-09-30 16:00:00     1   2017-10-01
                                2017-09-30 17:00:00     2   2017-10-01
                                2017-09-30 18:00:00     4   2017-10-01
                                2017-09-30 19:00:00     3   2017-10-01
                                2017-09-30 20:00:00     2   2017-10-01
                                2017-09-30 21:00:00     4   2017-10-01
                                2017-09-30 22:00:00     1   2017-10-01
                                2017-09-30 23:00:00     3   2017-10-01
                                2017-10-01 00:00:00     nan     2017-10-01
                                2017-10-01 01:00:00     nan     2017-10-01
                                2017-10-01 02:00:00     4   2017-10-01
                                2017-10-01 03:00:00     3   2017-10-01
                                2017-10-01 04:00:00     nan     2017-10-01
                                2017-10-01 05:00:00     5   2017-10-01
                                2017-10-01 06:00:00     nan     2017-10-01
                                2017-10-01 07:00:00     4   2017-10-01
                                2017-10-01 08:00:00     4   2017-10-01
                                2017-10-01 09:00:00     4   2017-10-01
                                2017-10-01 10:00:00     3   2017-10-01
                                2017-10-01 11:00:00     4   2017-10-01
                                2017-10-01 12:00:00     5   2017-10-01
                                2017-10-01 13:00:00     3   2017-10-02
                                2017-10-01 13:00:00     3   2017-10-02
                                2017-10-01 14:00:00     nan     2017-10-02
                                2017-10-01 14:00:00     4   2017-10-02
                                2017-10-01 15:00:00     5   2017-10-02
                                2017-10-01 16:00:00     nan     2017-10-02
                                2017-10-01 17:00:00     nan     2017-10-02
                                2017-10-01 18:00:00     nan     2017-10-02


                                ...     ...     ...     ...     ...
-33.9350000000  151.0466000000  2017-10-08 07:00:00     6   2017-10-08
                                2017-10-08 08:00:00     5   2017-10-08
                                2017-10-08 09:00:00     5   2017-10-08
                                2017-10-08 10:00:00     6   2017-10-08
                                2017-10-08 11:00:00     6   2017-10-08
                                2017-10-08 12:00:00     nan     2017-10-08
                                2017-10-08 13:00:00     6   2017-10-09
                                2017-10-08 13:00:00     nan     2017-10-09
                                2017-10-08 14:00:00     7   2017-10-09
                                2017-10-08 14:00:00     7   2017-10-09
                                2017-10-08 15:00:00     7   2017-10-09
                                ... ... ... ... ... ... ... ... ... ... ...
                                2017-10-10 09:00:00     nan     2017-10-10
                                2017-10-10 10:00:00     12  2017-10-10
                                2017-10-10 11:00:00     10  2017-10-10
                                2017-10-10 12:00:00     14  2017-10-10
                                2017-10-10 13:00:00     13  2017-10-11
                                2017-10-10 14:00:00     15  2017-10-11

103554880 rows × 2 columns

They are multi-indexed (latitude, longitude and time). There is a column called AET_date indicating its actual date for each record. FFDI_SFC is a nan-able value for each record.

What I want to achieve is to get the max of FFDI_SFC for the rows which have the identical latitude, longitude and AET_date. In other words this is to group rows by latitude, longitude and AET_date and get the max (daily) for each group.

The anticipated output will look like:

                                                Max_Daily_FFDI_SFC  
latitude        longitude       AET_date        
-39.7650000000  140.8954000000  2017-09-30      5   
                                2017-10-01      7   
                                2017-10-02      5
...     ...     ...     ...     ...
-33.9350000000  151.0466000000  2017-10-08      14
                                2017-10-09      12
                                2017-10-10      16
alextc
  • 3,206
  • 10
  • 63
  • 107
  • Please provide a [mre]. All the `FFDI_SFC`s provided here are `nan`. See also [How to make good reproducible pandas examples](/q/20109391/4518341) – wjandrea Nov 11 '21 at 02:47
  • 1
    Have you tried `df.groupby(['latitude', 'longitude', 'AET_date']).max()` or something like that? – wjandrea Nov 11 '21 at 02:49

0 Answers0