Finding hour of daily max using Pandas in Python

Question

I am trying to find the hour of max demand every day in my demand time series.

I have created a dataframe that looks like..

                       power
2011-01-01 00:00:00  1015.70
2011-01-01 01:00:00  1015.70
2011-01-01 02:00:00  1010.30
2011-01-01 03:00:00  1010.90
2011-01-01 04:00:00  1021.10
2011-01-01 05:00:00  1046.00
2011-01-01 06:00:00  1054.60
...

and a grouped series to find the max value from each day using .max()

grouped = df.groupby(pd.TimeGrouper('D'))
grouped['power'].max()

OUTPUT

2011-01-01    1367.30
2011-01-02    1381.90
2011-01-03    1289.00
2011-01-04    1323.50
2011-01-05    1372.70
2011-01-06    1314.40
2011-01-07    1310.60
...

However I need the hour of the max value also. So something like:

2011-01-01  18  1367.30
2011-01-02  5   1381.90
2011-01-03  22  1289.00
2011-01-04  10  1323.50
...

I have tried using idxmax() but I keep getting a ValueError

using `df.groupby(pd.TimeGrouper('D'))['power'].idxmax()` works for me when I use the small amount of data you provided. — piRSquared, May 25 '16 at 17:01
@piRSquared, if i try it - i get `KeyError: 'Column not found: power'` — MaxU - stand with Ukraine, May 25 '16 at 17:07
This works great thanks! Although it doesn't show the power, I am going to be using the daily max hours to populate a histogram. This too only works for one year of data, when I add a second year I get that ValueError, going to look over my data to make sure there is nothing wrong. — lewis2ba, May 25 '16 at 17:26
@MaxU pandas 0.18.0, Python 2.7.11. Something has to account for that difference. I'll figure it out later. — piRSquared, May 25 '16 at 17:39

MaxU - stand with Ukraine · Accepted Answer · 2018-09-19T11:46:07.773

9

UPDATE from 2018-09-19:

FutureWarning: pd.TimeGrouper is deprecated and will be removed; Please use pd.Grouper(freq=...)

solution:

In [295]: df.loc[df.groupby(pd.Grouper(freq='D')).idxmax().iloc[:, 0]]
Out[295]:
                                         power
2011-01-01 06:00:00                     1054.6
2011-01-02 06:00:00                     2054.6

Old answer:

try this:

In [376]: df.loc[df.groupby(pd.TimeGrouper('D')).idxmax().iloc[:, 0]]
Out[376]:
                                           power
2011-01-01 06:00:00                       1054.6
2011-01-02 06:00:00                       2054.6

data:

In [377]: df
Out[377]:
                                           power
2011-01-01 00:00:00                       1015.7
2011-01-01 01:00:00                       1015.7
2011-01-01 02:00:00                       1010.3
2011-01-01 03:00:00                       1010.9
2011-01-01 04:00:00                       1021.1
2011-01-01 05:00:00                       1046.0
2011-01-01 06:00:00                       1054.6
2011-01-02 00:00:00                       2015.7
2011-01-02 01:00:00                       2015.7
2011-01-02 02:00:00                       2010.3
2011-01-02 03:00:00                       2010.9
2011-01-02 04:00:00                       2021.1
2011-01-02 05:00:00                       2046.0
2011-01-02 06:00:00                       2054.6

edited Sep 19 '18 at 11:46

answered May 25 '16 at 17:05

MaxU - stand with Ukraine

205,989
36
386
419

This works great, thanks! It is working well for a years worth of data, when I try to run two years i get: `raise IndexError("single positional indexer is out-of-bounds") IndexError: single positional indexer is out-of-bounds`. I am guessing something Is wrong with my data. – lewis2ba May 25 '16 at 17:12
Do you have missing days in your data? `pd.TimeGrouper('D')` will generate groups for each date between the min and max date, so you'll get empty groups for missing days, which will result in the error you're seeing. – root May 25 '16 at 17:25
There doesn't seem to be any missing days in the data. I can run each year separately(2011 or 2012) and get the results. When I run them together(2011 and 2012) I get the value error. Going to make sure the 2 year csv file is formatting correctly – lewis2ba May 25 '16 at 17:41
@MaxU, Hi I am trying to perform same task above. But, `TimeGrouper` is depreciated now. I am trying this code : `df['time'].loc[df.groupby(pd.Grouper(key='time',freq='D')).max()]`. But, this gives error? – Msquare Sep 19 '18 at 11:41
@MaxU, Thanks a ton for responding quickly and updating your answer. I tried this code `df.loc[df.groupby(pd.Grouper(key='time',freq='D')).idxmax().iloc[:,0]]`. The output is an error with message: `IndexError: single positional indexer is out-of-bounds`. Any guess? – Msquare Sep 19 '18 at 11:50
@Msquare, i afraid one would need a small __reproducible__ data set in order to be able to help ... Can you reproduce that error using a data set from my answer? – MaxU - stand with Ukraine Sep 19 '18 at 11:52
@MaxU, I just tried your code `df.loc[df.groupby(pd.Grouper(freq='D')).idxmax().iloc[:,0]]`. The output is: `TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'`. It is actually a 150 MB data file. I can share a part of it. How to do it? – Msquare Sep 19 '18 at 11:54
@Msquare, try to create a new question including [a minimal reproducible sample data set](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples), which would allow to reproduce that error message and your desired data set. – MaxU - stand with Ukraine Sep 19 '18 at 11:56
@MaxU, I have formed a new question here https://stackoverflow.com/questions/52405615/finding-daily-maximum-and-its-time-stamp-in-python-pandas . could you please check it? – Msquare Sep 19 '18 at 12:13
Also make sure to check the index of your dataframe returns a DateTimeIndex. you can check and set this using `df.index` – sh87 Feb 01 '19 at 04:13

score 0 · Answer 2 · answered Dec 01 '20 at 00:36

You can also group by your index date with df.groupby(df.index.date) and then use idxmax() to find the index of the max value in the power column:

df.groupby(df.index.date)["power"].idxmax()

If you want the power values too, use .loc:

df.loc[df.groupby(df.index.date)["power"].idxmax()]

Finding hour of daily max using Pandas in Python

2 Answers2

Linked