Python pandas calculating average price within 15 minute time frame

Question

I have a dateframe that contains datetime and price.

Here is a sample chosen at random

In [2]: df
Out[2]: 
          price           datetime
239035    5.05 2016-04-14 14:13:27
1771224   5.67 2016-08-30 14:19:47
2859140   4.00 2016-12-05 20:57:01
1311384   7.71 2016-07-08 18:16:22
141709    4.19 2016-04-07 13:30:00
2802527   3.94 2016-11-30 15:36:11
1411955   7.27 2016-07-20 13:55:20
2215987   4.87 2016-10-07 19:56:13

The datetime is accurate to the second.

I want to calculate the average price every 15 minutes starting at 9:00am and ending at 4:30pm, and store the new data into a new dataframe.

I could do it the old fashion way, make a list of all the 15 minute time intervals within 9am-4:30pm for each date, and iterate through each row of the CSV file, check its time and dump it into the appropriate bucket. Then find the average value for each bucket in each day.

But I was wondering if there is a nicer way to do this in panda. if not I'll just brute force my way through it...

I updated my question with how I would usually do it the "brute force way"... But I think there should be an easier way to accomplish what I want to do, since Pandas is a data analytics tool.... — Telenoobies, Dec 09 '16 at 18:46
Can't read your data using `pd.read_clipboard()` which makes it harder. (read [this](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)). Also, adding the desired output would be nice. — Julien Marrec, Dec 09 '16 at 19:06
How about now? To test it, I just Ctrl-C the data right and run pd.clipboard() right? — Telenoobies, Dec 09 '16 at 19:47

Julien Marrec · Accepted Answer · 2016-12-09T19:12:07.930

6

You can use DataFrame.resample:

df2 = df.resample(rule='15Min', on='datetime').mean()

Then you filter out the times you don't want using boolean indexing. It's better to work with a DateTimeIndex:

df2 = df2.set_index('datetime', drop=False)
df2.between_time('9:00','16:30')

edited Dec 09 '16 at 19:12

answered Dec 09 '16 at 19:02

Julien Marrec

11,605
4
46
63

And I can pass in the start_time and end_time, so that it knows where to start and where to end? – Telenoobies Dec 09 '16 at 19:07
Yes, for this the easiest would be to use `set_index('datetime')`then you can filter. I'll add that – Julien Marrec Dec 09 '16 at 19:10
1

Worked very well. That saved a lot of manual labor. Thanks again. – Telenoobies Dec 09 '16 at 19:21
I also saw an option to resample only business days, removing all weekends and holidays. df.resample('B') is this correct? – Telenoobies Dec 09 '16 at 19:50

Python pandas calculating average price within 15 minute time frame

1 Answers1