I have a list of dates with some days missing. I'm trying to get an array of the missing date ranges as output. At the moment I can get the desired output as map objects, but I cannot convert them into a single array. My code is as follows:
import os
import pandas as pd
import numpy as np
from datetime import datetime
from itertools import groupby
from operator import itemgetter
First I convert a list of strings into datetime.date objects. newdates is my original data, which has the missing days:
In[1]:
newdates = [datetime.strptime(date, '%Y-%m-%d').date() for date in newdates]
print(newdates)
Out[1]:
[datetime.date(2013, 11, 5),..., datetime.date(2013, 12, 31)]
Creating a date range for my desired year and using .difference to output a list of strings of dates that were missing in my original data.
In[2]:
TEST = pd.date_range(start='2013-01-01', end='2013-12-31').difference(pd.to_datetime(newdates))
TEST = TEST.strftime('%Y-%m-%d').tolist()
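For anyone wanting to reproduce this step, here is a minimal self-contained sketch. The names observed, full_range and missing and the one-week window are made up for illustration; my real data covers all of 2013.

```python
from datetime import date
import pandas as pd

# Hypothetical sample: three observed days inside a one-week window
observed = [date(2013, 1, 1), date(2013, 1, 2), date(2013, 1, 5)]

full_range = pd.date_range(start='2013-01-01', end='2013-01-07')
# Convert the date objects to a DatetimeIndex so both sides hold the same type
missing = full_range.difference(pd.to_datetime(observed))
# Format the missing days back into strings
missing = missing.strftime('%Y-%m-%d').tolist()
print(missing)  # ['2013-01-03', '2013-01-04', '2013-01-06', '2013-01-07']
```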
I found code in @jab's answer to this question (Split a list of dates into subsets of consecutive dates) which groups the consecutive days. It outputs the desired data, but as multiple map objects.
def consecutive_groups(iterable, ordering=lambda x: x):
    for k, g in groupby(enumerate(iterable), key=lambda x: x[0] - ordering(x[1])):
        yield map(itemgetter(1), g)

for g in consecutive_groups(TEST, lambda x: datetime.strptime(x, '%Y-%m-%d').toordinal()):
    print(list(g))
Out[2]:
['2013-01-01',..., '2013-11-04']
['2013-11-24']
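As far as I understand it, the grouping works because the list position minus the date's ordinal stays constant within a run of consecutive days and changes as soon as a day is skipped. A small sketch of that idea, using a made-up list called days rather than my real data:

```python
from datetime import date

# Hypothetical sample: one run of three consecutive days, then a lone day
days = ['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-10']

# Key used for grouping: position in the list minus the day's ordinal
keys = [i - date.fromisoformat(d).toordinal() for i, d in enumerate(days)]

# The first three keys are identical; the fourth differs because 6 days were skipped
print(len(set(keys[:3])), keys[3] - keys[0])  # 1 -6
```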
I've tried to convert the map objects to lists (I would like a single array, though) with the following:
for g in consecutive_groups(TEST, lambda x: datetime.strptime(x, '%Y-%m-%d').toordinal()):
    dates = list(g)
This leaves dates holding only the final group as a list, not all of the groups.
I've also tried using np.fromiter, but can't figure out how to get a range.
In conclusion, I would like to convert the output (list(g)) to an array which would look like this:
[['2013-01-01',..., '2013-11-04'],['2013-11-24']]
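A rough sketch of the shape I am after, on a made-up list called sample standing in for TEST. One possible approach, which I have not confirmed on the full data, is to turn each group into a list while it is the current group and collect those lists in an outer list, instead of rebinding one variable on every pass of the loop. The helper to_ordinal and the name ranges are just illustrative.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

def consecutive_groups(iterable, ordering=lambda x: x):
    # Same generator as above: yields one lazy map object per consecutive run
    for _, g in groupby(enumerate(iterable), key=lambda x: x[0] - ordering(x[1])):
        yield map(itemgetter(1), g)

def to_ordinal(s):
    # Hypothetical helper: date string -> integer day number
    return datetime.strptime(s, '%Y-%m-%d').toordinal()

# Hypothetical stand-in for TEST
sample = ['2013-01-01', '2013-01-02', '2013-01-03', '2013-11-24']

# Materialise each group immediately and keep every one of them
ranges = [list(g) for g in consecutive_groups(sample, to_ordinal)]
print(ranges)  # [['2013-01-01', '2013-01-02', '2013-01-03'], ['2013-11-24']]
```

Since the groups have different lengths, a plain nested list may be the natural container here; a NumPy array of them would presumably have to be an object array.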