How to resample intra-day intervals and use .idxmax()?

Question

I am using data from yfinance, which returns a pandas DataFrame.

                            Volume
Datetime                          
2021-09-13 09:30:00-04:00   951104
2021-09-13 09:35:00-04:00   408357
2021-09-13 09:40:00-04:00   498055
2021-09-13 09:45:00-04:00   466363
2021-09-13 09:50:00-04:00   315385
2021-12-06 15:35:00-05:00   200748
2021-12-06 15:40:00-05:00   336136
2021-12-06 15:45:00-05:00   473106
2021-12-06 15:50:00-05:00   705082
2021-12-06 15:55:00-05:00  1249763

The DataFrame contains 5-minute intraday intervals. I want to resample to daily data and get the timestamp (idxmax) of the maximum volume for each day.

df.resample("B")["Volume"].idxmax()

This raises an error:

ValueError: attempt to get argmax of an empty sequence

I used B (business days) as the resampling period, so there shouldn't be any empty sequences.

I should add that .max() works fine.

Using .agg, as suggested in another question, also raises an error:

df["Volume"].resample("B").agg(lambda x : np.nan if x.count() == 0 else x.idxmax())

error:

IndexError: index 77 is out of bounds for axis 0 with size 0

score 1 · Answer 1 · answered Dec 07 '21 at 09:00

For me, it works to test in an if-else whether all values in the group are NaN:

df = df.resample("B")["Volume"].agg(lambda x: np.nan if x.isna().all() else x.idxmax())
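A minimal sketch of why such a guard is needed, assuming a made-up four-row sample (not the asker's data) around the US market holiday on 2021-11-25: a business-day bin can still be empty when the market is closed, `.max()` fills that bin with NaN rather than raising, and `.all()` on an empty Series is True, so the `isna().all()` test also catches empty bins.

```python
import pandas as pd

# Hypothetical sample: two trading days on either side of the 2021-11-25 holiday.
idx = pd.DatetimeIndex(
    ["2021-11-24 09:30", "2021-11-24 09:35", "2021-11-26 09:30", "2021-11-26 09:35"],
    tz="America/New_York",
    name="Datetime",
)
df = pd.DataFrame({"Volume": [100, 300, 250, 50]}, index=idx)

# Number of rows in each business-day bin; the holiday is a business day with no rows.
counts = df.resample("B")["Volume"].count()
empty_days = counts[counts == 0]

# .max() puts NaN in an empty bin instead of raising.
daily_max = df.resample("B")["Volume"].max()

# .all() on an empty Series is True, so isna().all() is True for an empty bin.
empty_is_all_nan = bool(pd.Series([], dtype="float64").isna().all())

print(empty_days)
print(daily_max)
```

Checking `counts` first is a quick way to confirm whether empty bins are the cause before changing the aggregation.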

jezrael

score 1 · Accepted Answer · answered Dec 07 '21 at 10:06

You can use groupby as an alternative to resample:

>>> df.groupby(df.index.normalize())['Volume'].agg(Datetime='idxmax', Volume='max')

                      Datetime   Volume
Datetime                               
2021-09-13 2021-09-13 09:30:00   951104
2021-12-06 2021-12-06 15:55:00  1249763
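A self-contained sketch of the same groupby call, assuming a made-up four-row sample with a gap on 2021-11-25 (not the asker's data). Grouping by the normalized index only creates groups for dates that actually occur in the data, so there is no empty group for `idxmax` to fail on.

```python
import pandas as pd

# Hypothetical sample with no rows on the business day 2021-11-25.
idx = pd.DatetimeIndex(
    ["2021-11-24 09:30", "2021-11-24 09:35", "2021-11-26 09:30", "2021-11-26 09:35"],
    tz="America/New_York",
    name="Datetime",
)
df = pd.DataFrame({"Volume": [100, 300, 250, 50]}, index=idx)

# normalize() sets each timestamp to midnight, giving one key per calendar day present.
days = df.index.normalize()

# Named aggregation on a single column: output column name = aggregation name.
result = df.groupby(days)["Volume"].agg(Datetime="idxmax", Volume="max")

print(result)
```

Unlike `resample("B")`, the result has one row per day that has data, so days without trades do not appear at all.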

Corralien

Thanks. I have never seen this type of parameter in the agg function. I guess this is for when you select one column after the groupby. – Borut Flis Dec 08 '21 at 08:19

Yes, you're right. You probably know this form for DataFrame: `df.groupby(df.index.normalize()).agg(Datetime=('Volume', 'idxmax'), Volume=('Volume', 'max'))`? – Corralien Dec 08 '21 at 08:28
If the names are not valid Python identifiers (like `Datetime Peak`), you can use this form: `df.groupby(df.index.normalize())['Volume'].agg(**{'Datetime Peak': 'idxmax', 'Volume Max': 'max'})` – Corralien Dec 08 '21 at 08:53
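The three spellings mentioned in this thread can be compared side by side. The sketch below assumes a made-up four-row sample (not the asker's data); the variable names `series_form`, `frame_form`, and `dict_form` are only for illustration.

```python
import pandas as pd

# Hypothetical sample: two days of 5-minute volume data.
idx = pd.DatetimeIndex(
    ["2021-11-24 09:30", "2021-11-24 09:35", "2021-11-26 09:30", "2021-11-26 09:35"],
    tz="America/New_York",
    name="Datetime",
)
df = pd.DataFrame({"Volume": [100, 300, 250, 50]}, index=idx)
days = df.index.normalize()

# Column selected first: each keyword maps an output name to an aggregation.
series_form = df.groupby(days)["Volume"].agg(Datetime="idxmax", Volume="max")

# Whole DataFrame: each keyword maps an output name to (column, aggregation).
frame_form = df.groupby(days).agg(
    Datetime=("Volume", "idxmax"),
    Volume=("Volume", "max"),
)

# Output names that are not valid identifiers are passed by unpacking a dict.
dict_form = df.groupby(days)["Volume"].agg(
    **{"Datetime Peak": "idxmax", "Volume Max": "max"}
)

print(series_form)
print(frame_form)
print(dict_form)
```

The first two forms should produce the same values; the dict form differs only in its column labels.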
