I have a Python list in the following format:

time_list = ['2021-02-04 02:40:00', '2021-02-04 02:41:00', '2021-02-04 02:42:00', '2021-02-04 02:44:00', '2021-02-04 03:01:00', '2021-02-04 03:02:00', '2021-02-04 03:03:00', '2021-02-04 03:04:00', '2021-02-04 03:05:00']

I need the start and end times grouped like this:

answer_list = [{'start': '2021-02-04 02:40:00', 'end': '2021-02-04 02:44:00'}, {'start': '2021-02-04 03:01:00', 'end': '2021-02-04 03:05:00'}]

I tried several approaches, but none of them looks correct. Can anyone tell me how to do this, or whether there is a module that can "crop" the list this way?

  • What do you mean by "crop"? Do I understand correctly that you just want to insert the values from the list into several dictionaries with alternating "start" and "end" keys? – mkrieger1 Feb 26 '21 at 07:46
  • Does this answer your question? [What is the most "pythonic" way to iterate over a list in chunks?](https://stackoverflow.com/questions/434287/what-is-the-most-pythonic-way-to-iterate-over-a-list-in-chunks) – mkrieger1 Feb 26 '21 at 07:47
  • So what *did* you try? From the given example, the specification of how you want to go from the input to the desired output is unclear to me. – FObersteiner Feb 26 '21 at 07:48
  • @mkrieger1: the OP's first chunk seems to have a size of 4, the second a size of 5? – FObersteiner Feb 26 '21 at 07:49

1 Answer

It seems to me that you want to list the start and end times by hour. If I am correct about that, then "crop" is not the right term; you mean "group". (But correct me if I am wrong.)

Here is an algorithm that is not very fancy but should do that:

from dateutil import parser as dp

time_list = ['2021-02-04 02:40:00', '2021-02-04 02:41:00', '2021-02-04 02:42:00', '2021-02-04 02:44:00', '2021-02-04 03:01:00', '2021-02-04 03:02:00', '2021-02-04 03:03:00', '2021-02-04 03:04:00', '2021-02-04 03:05:00']

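answer_list = []
start = end = None
for dtm in time_list:
    d = dp.parse(dtm)
    if start is None:
        # first timestamp opens the first group
        start = end = d
        hour = d.hour
    elif d.hour == hour:
        # same hour as the current group: extend it
        # (only the hour is compared, so the list must be sorted
        # and must not repeat the same hour on another day)
        end = d
    else:
        # hour changed: close the current group and open a new one
        answer_list.append(dict(start=str(start), end=str(end)))
        start = end = d
        hour = d.hour
# close the last group (skipped when time_list is empty)
if start is not None:
    answer_list.append(dict(start=str(start), end=str(end)))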

print(answer_list)

The output:

[{'start': '2021-02-04 02:40:00', 'end': '2021-02-04 02:44:00'}, {'start': '2021-02-04 03:01:00', 'end': '2021-02-04 03:05:00'}]
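
If you would rather avoid the `dateutil` dependency, the same grouping can be sketched with only the standard library. This is a minimal sketch, not a definitive implementation: it assumes the list is already sorted and every string uses the `YYYY-MM-DD HH:MM:SS` format from the question. The names `group_by_hour` and `FMT` are made up for this example. Unlike the loop above, it keys each group on the date plus the hour, so the same hour on two different days is not merged.

```python
from datetime import datetime
from itertools import groupby

# Assumed timestamp format, taken from the strings in the question.
FMT = "%Y-%m-%d %H:%M:%S"


def group_by_hour(times):
    """Return one {'start': ..., 'end': ...} dict per (date, hour) bucket.

    Assumes `times` is sorted; groupby only merges adjacent items.
    """
    def bucket(s):
        # Truncate to the hour; the date is kept so that 02:xx on
        # different days ends up in different groups.
        return datetime.strptime(s, FMT).replace(minute=0, second=0)

    result = []
    for _, grp in groupby(times, key=bucket):
        grp = list(grp)
        # First and last item of the run are the group's boundaries.
        result.append({"start": grp[0], "end": grp[-1]})
    return result


time_list = ['2021-02-04 02:40:00', '2021-02-04 02:41:00', '2021-02-04 02:42:00',
             '2021-02-04 02:44:00', '2021-02-04 03:01:00', '2021-02-04 03:02:00',
             '2021-02-04 03:03:00', '2021-02-04 03:04:00', '2021-02-04 03:05:00']

print(group_by_hour(time_list))
```

A group that contains a single timestamp gets the same value for `start` and `end`.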

You can also use pandas, which makes grouping and manipulating this data much easier:

import pandas as pd
time_list = ['2021-02-04 02:40:00', '2021-02-04 02:41:00', '2021-02-04 02:42:00', '2021-02-04 02:44:00', '2021-02-04 03:01:00', '2021-02-04 03:02:00', '2021-02-04 03:03:00', '2021-02-04 03:04:00', '2021-02-04 03:05:00']

df = pd.DataFrame(dict(datetime=time_list),index=pd.DatetimeIndex(time_list))
df['Hour'] = [d.hour for d in df.index]

print(df)
                                datetime  Hour
2021-02-04 02:40:00  2021-02-04 02:40:00     2
2021-02-04 02:41:00  2021-02-04 02:41:00     2
2021-02-04 02:42:00  2021-02-04 02:42:00     2
2021-02-04 02:44:00  2021-02-04 02:44:00     2
2021-02-04 03:01:00  2021-02-04 03:01:00     3
2021-02-04 03:02:00  2021-02-04 03:02:00     3
2021-02-04 03:03:00  2021-02-04 03:03:00     3
2021-02-04 03:04:00  2021-02-04 03:04:00     3
2021-02-04 03:05:00  2021-02-04 03:05:00     3

Now you can do things like this:

print(df.groupby('Hour').first())

                 datetime
Hour
2     2021-02-04 02:40:00
3     2021-02-04 03:01:00

print(df.groupby('Hour').last())

                 datetime
Hour
2     2021-02-04 02:44:00
3     2021-02-04 03:05:00

And to build the list you asked for:

answer_list = []
gbh = df.groupby('Hour')
for start,end in zip(gbh.first().values,gbh.last().values):
    answer_list.append(dict(start=start[0],end=end[0]))

print(answer_list)

[{'start': '2021-02-04 02:40:00', 'end': '2021-02-04 02:44:00'}, {'start': '2021-02-04 03:01:00', 'end': '2021-02-04 03:05:00'}]
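
The question does not say whether the groups are really clock hours or just runs of timestamps that sit close together; the sample data fits both readings. If it is the latter, a gap-based split may be closer to what you want. The following is only a sketch under that assumption: the function name `group_by_gap` and the 5-minute threshold `max_gap` are hypothetical choices, not something given in the question, and the list is assumed to be sorted.

```python
from datetime import datetime, timedelta

# Assumed timestamp format, taken from the strings in the question.
FMT = "%Y-%m-%d %H:%M:%S"


def group_by_gap(times, max_gap=timedelta(minutes=5)):
    """Start a new group whenever the gap to the previous timestamp
    is larger than `max_gap` (a hypothetical threshold)."""
    result = []
    prev = None
    for s in times:
        t = datetime.strptime(s, FMT)
        if prev is None or t - prev > max_gap:
            # First item, or gap too large: open a new group.
            result.append({"start": s, "end": s})
        else:
            # Close enough to the previous item: extend the current group.
            result[-1]["end"] = s
        prev = t
    return result


time_list = ['2021-02-04 02:40:00', '2021-02-04 02:41:00', '2021-02-04 02:42:00',
             '2021-02-04 02:44:00', '2021-02-04 03:01:00', '2021-02-04 03:02:00',
             '2021-02-04 03:03:00', '2021-02-04 03:04:00', '2021-02-04 03:05:00']

print(group_by_gap(time_list))
```

For the sample list this gives the same two groups as the hour-based versions, because the only gap longer than 5 minutes is the one between 02:44 and 03:01. The two approaches differ when a run crosses an hour boundary, for example 02:58 followed by 03:01, which this version keeps in one group.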
Daniel Goldfarb