Copy a DataFrame to another with 5 min interval

Question

I'd like to copy the DataFrame df into df2 at 5-minute intervals instead of 1-minute intervals.

How can I extract the rows at a particular interval?

>>> df = pd.read_csv('sample.txt', names=['Time', 'Index', 'Volume'], delim_whitespace=True)
>>> df.head(20)
                Time   Index  Volume
0   2019/05/16-09:15  27830    2031
1   2019/05/16-09:16  27815     995
2   2019/05/16-09:17  27829     961
3   2019/05/16-09:18  27848     663
4   2019/05/16-09:19  27873     869
5   2019/05/16-09:20  27847     854
6   2019/05/16-09:21  27828     784
7   2019/05/16-09:22  27813     676
8   2019/05/16-09:23  27828     700
9   2019/05/16-09:24  27849     665
10  2019/05/16-09:25  27851     487
11  2019/05/16-09:26  27859     451
12  2019/05/16-09:27  27874     548
13  2019/05/16-09:28  27898    1105
14  2019/05/16-09:29  27907     832
15  2019/05/16-09:30  27897    1470

My expected dataframe

>>> df2
                Time   Index  Volume
0   2019/05/16-09:15  27830    2031
5   2019/05/16-09:20  27847     854
10  2019/05/16-09:25  27851     487
15  2019/05/16-09:30  27897    1470
.....

You can use slicing. See https://stackoverflow.com/questions/10665889/how-to-take-column-slices-of-dataframe-in-pandas - WhySoSerious, May 17 '19 at 08:29
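The slicing idea from the comment can be sketched as a positional step slice. This is a minimal sketch, assuming the data has exactly one row per minute with no gaps; the frame is rebuilt here from the rows shown in the question so the example is self-contained.

```python
import pandas as pd

# Rebuild the question's 16 rows (one per minute, no gaps assumed).
times = pd.date_range("2019-05-16 09:15", periods=16, freq="min")
df = pd.DataFrame({
    "Time": times.strftime("%Y/%m/%d-%H:%M"),
    "Index": [27830, 27815, 27829, 27848, 27873, 27847, 27828, 27813,
              27828, 27849, 27851, 27859, 27874, 27898, 27907, 27897],
    "Volume": [2031, 995, 961, 663, 869, 854, 784, 676,
               700, 665, 487, 451, 548, 1105, 832, 1470],
})

# Take every 5th row by position; copy() makes df2 independent of df.
df2 = df.iloc[::5].copy()
print(df2)
```

A positional slice keeps the original row labels (0, 5, 10, 15), but it silently drifts off the 5-minute grid if any minute is missing from the data.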

score 3 · Answer 1 · answered May 17 '19 at 08:33

Use DataFrame.asfreq, which works on a DatetimeIndex: first convert the Time column to datetimes, then make it the index with DataFrame.set_index:

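# An explicit format avoids ambiguity when parsing '2019/05/16-09:15'
df['Time'] = pd.to_datetime(df['Time'], format='%Y/%m/%d-%H:%M')
# '5min' is the 5-minute alias; older pandas also accepts '5T'
df = df.set_index('Time').asfreq('5min').reset_index()
print(df)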
                 Time  Index  Volume
0 2019-05-16 09:15:00  27830    2031
1 2019-05-16 09:20:00  27847     854
2 2019-05-16 09:25:00  27851     487
3 2019-05-16 09:30:00  27897    1470
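One behaviour of asfreq worth knowing: it reindexes onto a regular grid, so a timestamp that is missing from the data still appears in the result, filled with NaN. The sketch below uses a hypothetical frame with placeholder Volume values and one minute deliberately dropped to show this.

```python
import pandas as pd

# Hypothetical data: 16 one-minute rows with placeholder values 0..15.
idx = pd.date_range("2019-05-16 09:15", periods=16, freq="min")
df = pd.DataFrame({"Volume": range(16)}, index=idx)

# Simulate a gap: the 09:20 row is missing from the source data.
df = df.drop(pd.Timestamp("2019-05-16 09:20"))

# asfreq still emits a 09:20 row, but its values are NaN.
out = df.asfreq("5min")
print(out)

# Drop the placeholder rows if only observed data is wanted.
clean = out.dropna()
print(clean)
```

Because of the NaN, the Volume column becomes floating point after asfreq; dropna removes the filler row but does not restore the integer dtype.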

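If the original string format of the Time column must be kept:

# Parse into the index only; the Time column itself stays as strings
df.index = pd.to_datetime(df['Time'], format='%Y/%m/%d-%H:%M')
df = df.asfreq('5min').reset_index(drop=True)
print(df)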
               Time  Index  Volume
0  2019/05/16-09:15  27830    2031
1  2019/05/16-09:20  27847     854
2  2019/05/16-09:25  27851     487
3  2019/05/16-09:30  27897    1470
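If the original row labels (0, 5, 10, 15) should survive as well, as in the expected output of the question, a boolean mask on the parsed minutes is one alternative. This is a different technique from asfreq, sketched under the assumption that "every 5 minutes" means wall-clock minutes divisible by 5; the frame is rebuilt from the question's Time and Volume columns.

```python
import pandas as pd

# Rebuild the question's Time and Volume columns (Time kept as strings).
times = pd.date_range("2019-05-16 09:15", periods=16, freq="min")
df = pd.DataFrame({
    "Time": times.strftime("%Y/%m/%d-%H:%M"),
    "Volume": [2031, 995, 961, 663, 869, 854, 784, 676,
               700, 665, 487, 451, 548, 1105, 832, 1470],
})

# Parse a temporary Series; df itself is left unchanged.
parsed = pd.to_datetime(df["Time"], format="%Y/%m/%d-%H:%M")

# Keep rows whose minute falls on the 5-minute grid (:00, :05, :10, ...).
on_grid = parsed.dt.minute % 5 == 0
df2 = df[on_grid].copy()
print(df2)
```

Unlike asfreq, the mask never creates rows for missing timestamps; it only filters rows that exist.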

score 1 · Answer 2 · answered May 17 '19 at 08:36

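I think you are looking for a groupby-style operation; for a datetime index there is a convenience method, .resample. It needs a DatetimeIndex, so convert Time first, and take .first() to keep the row at the start of each 5-minute bin (.last() would return the 09:19 row for the 09:15 bin):

 df['Time'] = pd.to_datetime(df['Time'], format='%Y/%m/%d-%H:%M')
 df.set_index('Time').resample('5min').first()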

produces:

Out[156]:
0   2019/05/16-09:15  27830    2031
5   2019/05/16-09:20  27847     854
10  2019/05/16-09:25  27851     487
15  2019/05/16-09:30  27897    1470
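The choice of aggregation matters with resample, because each 5-minute bin covers five rows. A small sketch comparing first() and last() on the question's data, assuming the default left-closed, left-labelled bins:

```python
import pandas as pd

# Rebuild the question's rows with a DatetimeIndex.
idx = pd.date_range("2019-05-16 09:15", periods=16, freq="min")
df = pd.DataFrame({
    "Index": [27830, 27815, 27829, 27848, 27873, 27847, 27828, 27813,
              27828, 27849, 27851, 27859, 27874, 27898, 27907, 27897],
    "Volume": [2031, 995, 961, 663, 869, 854, 784, 676,
               700, 665, 487, 451, 548, 1105, 832, 1470],
}, index=idx)

# Bins are [09:15, 09:20), [09:20, 09:25), ... labelled by their left edge.
first = df.resample("5min").first()  # row at the start of each bin
last = df.resample("5min").last()    # row at the end of each bin
print(first)
print(last)
```

Here first() reproduces the rows the question asks for, while last() returns the 09:19, 09:24 and 09:29 rows under the 09:15, 09:20 and 09:25 labels.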
