I'm trying to combine my data in 5-minute intervals: for each 5-minute window, I want to join all the details values (alphanumeric strings) into one cell.

Sample data looks like this:

TIME                details
2020-07-13 14:35    asjkdhkjh fdsfsfsf fgdgdgdgd jgkgk
2020-07-13 14:35    fgsdfdsfs  fdsfsfsf d5445435 jgkgk
2020-07-13 14:36    ssssss dffef fgdgdgdgd gfrtgtrgtr
2020-07-13 14:38    fgfgfd vccbcvbfdsfsfsf gdfgdsfs
2020-07-13 14:42    muyjkuj fdsfsfsf treteer fghgfh
2020-07-13 14:45    rtrtrtrtr rrtrtf fgdgdgdgd jjhjhj
2020-07-13 14:45    zszszszszszs fdsfsfsf kfjhdshfds
2020-07-13 14:50    cjkdfhd fdsfsfsf fgdgduhfdsjfskfd
2020-07-13 14:52    qwqwewew fdsfsfsf fgdgdgdgd trytu
2020-07-13 14:55    ncmjvhfh fdsfsfsf fgdgdgdgd jhgfd

Expected record:

TIME     details
1        asjkdhkjh fdsfsfsf fgdgdgdgd jgkgk,fgsdfdsfs  fdsfsfsf d5445435 jgkgk
2        ssssss dffef fgdgdgdgd gfrtgtrgtr, fgfgfd vccbcvbfdsfsfsf gdfgdsfs
3        muyjkuj fdsfsfsf treteer fghgfh, rtrtrtrtr rrtrtf fgdgdgdgd jjhjhj,zszszszszszs fdsfsfsf kfjhdshfds
4        cjkdfhd fdsfsfsf fgdgduhfdsjfskfd , qwqwewew fdsfsfsf fgdgdgdgd trytu, ncmjvhfh fdsfsfsf fgdgdgdgd jhgfd

I have tried the code from these questions: "Group DataFrame in 5-minute intervals" and "How to groupby time series by 10 minutes using pandas?", but I am not able to group this alphanumeric data. Does anyone know how to group it?

Thank you

dev_user

1 Answer

Use DataFrame.resample with Resampler.aggregate, passing a join function as the aggregator:

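import pandas as pd
# Parse TIME as datetimes so it can be resampled
df['TIME'] = pd.to_datetime(df['TIME'])
# Bin rows into 5-minute intervals and join the strings in each bin
df = df.resample('5Min', on='TIME')['details'].agg(' '.join).reset_index(name='new')
print(df)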
                 TIME                                                new
0 2020-07-13 14:35:00  asjkdhkjh fdsfsfsf fgdgdgdgd jgkgk fgsdfdsfs  ...
1 2020-07-13 14:40:00                    muyjkuj fdsfsfsf treteer fghgfh
2 2020-07-13 14:45:00  rtrtrtrtr rrtrtf fgdgdgdgd jjhjhj zszszszszszs...
3 2020-07-13 14:50:00  cjkdfhd fdsfsfsf fgdgduhfdsjfskfd qwqwewew fds...
4 2020-07-13 14:55:00                  ncmjvhfh fdsfsfsf fgdgdgdgd jhgfd
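
The question's expected record separates the joined values with commas rather than spaces. A minimal self-contained sketch of that variant follows, assuming the sample rows from the question; the names `out` and the `", "` separator are illustrative choices, not part of the original answer. Note that the bins are still aligned to the clock (14:35, 14:40, ...), so the groups do not match the ones listed in the expected record.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "TIME": [
        "2020-07-13 14:35", "2020-07-13 14:35", "2020-07-13 14:36",
        "2020-07-13 14:38", "2020-07-13 14:42", "2020-07-13 14:45",
        "2020-07-13 14:45", "2020-07-13 14:50", "2020-07-13 14:52",
        "2020-07-13 14:55",
    ],
    "details": [
        "asjkdhkjh fdsfsfsf fgdgdgdgd jgkgk",
        "fgsdfdsfs  fdsfsfsf d5445435 jgkgk",
        "ssssss dffef fgdgdgdgd gfrtgtrgtr",
        "fgfgfd vccbcvbfdsfsfsf gdfgdsfs",
        "muyjkuj fdsfsfsf treteer fghgfh",
        "rtrtrtrtr rrtrtf fgdgdgdgd jjhjhj",
        "zszszszszszs fdsfsfsf kfjhdshfds",
        "cjkdfhd fdsfsfsf fgdgduhfdsjfskfd",
        "qwqwewew fdsfsfsf fgdgdgdgd trytu",
        "ncmjvhfh fdsfsfsf fgdgdgdgd jhgfd",
    ],
})
df["TIME"] = pd.to_datetime(df["TIME"])

# Join the strings of each 5-minute bin with ", " (assumed separator).
out = (
    df.resample("5min", on="TIME")["details"]
      .agg(", ".join)
      .reset_index(name="details")
)
print(out)
```

If a 5-minute bin contains no rows, the join receives an empty group and yields an empty string for that bin.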
jezrael
    @python_user - One thing: it groups data into 5-minute bins such as `2020-07-13 14:35:00 - 2020-07-13 14:39:59` and `2020-07-13 14:40:00 - 2020-07-13 14:44:59`, so the output is different from the expected one. – jezrael Nov 13 '20 at 09:56
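
To check which bin a given timestamp lands in, one option is to floor it to the 5-minute frequency. This is only a sketch under the assumption that the default resample settings are used (left-closed, left-labelled bins aligned to the start of the day); the names `times` and `bins` are illustrative.

```python
import pandas as pd

# A few timestamps taken from the question's sample data.
times = pd.to_datetime(pd.Series([
    "2020-07-13 14:35",
    "2020-07-13 14:38",
    "2020-07-13 14:42",
    "2020-07-13 14:45",
]))

# Round each timestamp down to the start of its 5-minute bin.
bins = times.dt.floor("5min")
print(pd.DataFrame({"TIME": times, "bin": bins}))
```

Grouping by this floored column gives the same bin labels, but unlike resample it does not emit rows for empty bins.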