I have a list of year-day strings (a four-digit year followed by a three-digit day of the year) covering December through February, from 2003 to 2005. I want to split it into a list of lists, where each sub-list holds the year-days of one December-to-February season:

a = ['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057', '2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']

Output should be like:

b = [['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057'], ['2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']]

I then want to loop over each sub-list. I could split the list into equal-sized chunks, but some year-days may be missing, so an even split is not reliable. Any suggestions?

Ibe

2 Answers

Convert to datetime, then group by the year whose end is nearest.

import datetime
import itertools

#convert from a "year-day" string to a datetime object
def datetime_from_year_day(s):
    year = int(s[:4])
    days = int(s[4:])
    return datetime.datetime(year=year, month=1, day=1) + datetime.timedelta(days=days-1)

#returns the year whose end is closest to the date, whether in the past or future
def nearest_year_end(d):
    if d.month <= 6:
        return d.year-1
    else:
        return d.year

a = ['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057', '2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']

result = [list(v) for k,v in itertools.groupby(a, lambda s: nearest_year_end(datetime_from_year_day(s)))]
print(result)

Result:

[['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057'], ['2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']]
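
If only the grouping key is needed, the datetime conversion can probably be skipped and the key read straight off the string. A minimal sketch, assuming every entry is a 'YYYYDDD' string and the data contains only December-February entries; `season_key` is a hypothetical helper, not part of the code above, and the shortened list is made-up sample data with gaps:

```python
import itertools

def season_key(s):
    """Return the year in which this entry's winter season starts.

    Hypothetical helper: assumes s is 'YYYYDDD' and that the data only
    contains December-February entries.
    """
    year, doy = int(s[:4]), int(s[4:])
    # Entries in the first half of a year belong to the season that
    # began the previous December.
    return year - 1 if doy <= 182 else year

# Shortened sample with some year-days deliberately missing.
a = ['2003337', '2003361', '2004001', '2004057', '2004345', '2005009']

# groupby only merges consecutive items, so the list must already be sorted.
seasons = [list(g) for _, g in itertools.groupby(a, key=season_key)]

for season in seasons:
    print(season[0], '->', season[-1], len(season))
```

Because the key depends only on each entry's own date, missing year-days do not change where the groups break.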
Kevin

You can also do it with two nested if-else statements inside a for loop, which is easy to follow:

a = ['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057', '2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']
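temp = []
b = []
for day in a:
    if len(temp) == 0:
        temp.append(day)
    else:
        # previous entry in Jan/Feb (day < 60) and this one in December (day > 335): new season
        if int(temp[-1][4:]) < 60 and int(day[4:]) > 335:
            b.append(temp)
            temp = []
            temp.append(day)
        else:
            temp.append(day)
# the last season is still sitting in temp when the loop ends
if temp:
    b.append(temp)
print(b)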

Result:

[['2003337', '2003345', '2003353', '2003361', '2004001', '2004009', '2004017', '2004025', '2004033', '2004041', '2004049', '2004057'], ['2004337', '2004345', '2004353', '2004361', '2005001', '2005009', '2005017', '2005025', '2005033', '2005041', '2005049', '2005057']]
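
One thing to watch with the threshold test: it only fires when the previous entry falls in January or February, so a season whose January/February entries are all missing would run into the next one. Here is a sketch of the same loop as a function with one extra check for that case. The name `split_seasons`, the extra year comparison, and the `sparse` sample data are my own assumptions, not part of the code above:

```python
def split_seasons(days):
    """Split sorted 'YYYYDDD' strings into December-February groups.

    Hypothetical helper; assumes the input is sorted and holds only
    December-February entries.
    """
    groups = []
    for day in days:
        year, doy = int(day[:4]), int(day[4:])
        if groups:
            prev = groups[-1][-1]
            prev_year, prev_doy = int(prev[:4]), int(prev[4:])
            # Original test: previous entry in Jan/Feb, current one in December.
            wrapped = prev_doy < 60 and doy > 335
            # Added assumption: two December entries from different years
            # also mark a season boundary.
            skipped = prev_doy > 335 and doy > 335 and year != prev_year
            if wrapped or skipped:
                groups.append([])
        else:
            groups.append([])
        groups[-1].append(day)
    return groups

# The first season here has no January/February entries at all.
sparse = ['2003337', '2003361', '2004345', '2005009']
for season in split_seasons(sparse):
    print(season)
```

Returning the groups from a function also makes the "loop over each sub-list" step a plain for loop over the result.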
Naman Sogani