8

I have an access table with a 'Date' field. it has random dates for each record. I've built a script to append all the records into a list and then set the list to filter out only the unique values:

dateList = []
# cursor search through each record and append all records in the date 
# field to a python list
for row in rows:
   dateList.append(row.getValue("DATE_OBSERVATION").strftime('%m-%d-%Y'))

# Filter unique values to a set
newList = list(set(dateList))

This returns (on my test table):

['07-06-2010', '06-24-2010', '07-05-2010', '06-25-2010']

Now that I have the unique values for the "DATE_OBSERVATION" field, I want to detect if:

  • the dates are single (i.e. only one unique date is returned because that is the date in every record)
  • if the dates are a range of dates (i.e. all of the dates fall into a consecutive range)
  • if the dates are multiple dates, but are not in a range of consecutive dates

Any suggestions would be much appreciated! Mike

Mike
  • 4,099
  • 17
  • 61
  • 83
  • 1
    Short lazy reply: Convert them into datetime objects, sort them then use the `pairwise` recipe from the `itertools` doc page to compare all dates against the next one in the list to see if it's a range; for single dates, take the date of the first and check that all the rest are within the same calendar day; if both of those fail they're disparate dates. – Daenyth Mar 06 '12 at 18:25
  • 1
    if you don't select other values, use `select distinct date_observation from mytable order by date_observation desc` and don't convert dates to strings.. – Aprillion Mar 06 '12 at 18:40
  • @deathApril: why descending order? – jfs Mar 06 '12 at 18:56
  • @J.F.Sebastian uh,, no reason - i saw '07-06-2010', '06-24-2010' in the question and skipped the rest of the examples i guess.. – Aprillion Mar 06 '12 at 19:02

4 Answers4

19

Rather than rolling your own consecutive function you can simply convert date objects to integers using the .toordinal() method of datetime objects. The difference between the maximum and minimum value of the set of ordinal dates is one more than the length of the set:

from datetime import datetime

date_strs = ['07-06-2010', '06-24-2010', '07-05-2010', '06-25-2010']
# date_strs = ['02-29-2012', '02-28-2012', '03-01-2012']
# date_strs = ['01-01-2000']
dates = [datetime.strptime(d, "%m-%d-%Y") for d in date_strs]

date_ints = set([d.toordinal() for d in dates])

if len(date_ints) == 1:
    print "unique"
elif max(date_ints) - min(date_ints) == len(date_ints) - 1:
    print "consecutive"
else:
    print "not consecutive"
Michael Dunn
  • 8,163
  • 4
  • 37
  • 54
  • Thanks Michael. This worked well with my script! I appreciate your response. – Mike Mar 06 '12 at 19:36
  • @Michael Dillon: `select distinct ..` as in @deathApril's comment is even better. – jfs Mar 08 '12 at 19:34
  • if we consider ['01-01-2000', '01-01-2000', '01-02-2000'] to be not consecutive, using this code will say the dates are consecutive, how might you modified your code to consider this requirement? I don't know how sets work yet. – lessthanl0l Feb 22 '14 at 05:53
  • @lessthanl0l: sets remove duplicates. If you want to consider a list with duplicates as non-consecutive you could test to see whether the length of your list of dates is greater than the length of the set of dates. – Michael Dunn Feb 24 '14 at 09:54
1

Another version using the same logic as in my other answer.

from datetime import date, timedelta

# Definition 1: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is consider consecutive
# Definition 2: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is consider not consecutive

# datelist = [date(2014, 1, 1), date(2014, 1, 3),
#             date(2013, 12, 31), date(2013, 12, 30)]

# datelist = [date(2014, 2, 19), date(2014, 2, 19), date(2014, 2, 20),
#             date(2014, 2, 21), date(2014, 2, 22)]

datelist = [date(2014, 2, 19), date(2014, 2, 21),
            date(2014, 2, 22), date(2014, 2, 20)]

datelist.sort()

previousdate = datelist[0]

for i in range(1, len(datelist)):
    #if (datelist[i] - previousdate).days == 1 or (datelist[i] - previousdate).days == 0:  # for Definition 1
    if (datelist[i] - previousdate).days == 1:    # for Definition 2
        previousdate = datelist[i]
    else:
        previousdate = previousdate + timedelta(days=-1)

if datelist[-1] == previousdate:
    print "dates are consecutive"
else:
    print "dates are not consecutive"
lessthanl0l
  • 1,035
  • 2
  • 12
  • 21
0

Here's my version using the reduce() function.

from datetime import date, timedelta


def checked(d1, d2):
    """
    We assume the date list is sorted.
    If d2 & d1 are different by 1, everything up to d2 is consecutive, so d2
    can advance to the next reduction.
    If d2 & d1 are not different by 1, returning d1 - 1 for the next reduction
    will guarantee the result produced by reduce() to be something other than
    the last date in the sorted date list.

    Definition 1: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is consider consecutive
    Definition 2: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is consider not consecutive

    """
    #if (d2 - d1).days == 1 or (d2 - d1).days == 0:  # for Definition 1
    if (d2 - d1).days == 1:                          # for Definition 2
        return d2
    else:
        return d1 + timedelta(days=-1)

# datelist = [date(2014, 1, 1), date(2014, 1, 3),
#             date(2013, 12, 31), date(2013, 12, 30)]

# datelist = [date(2014, 2, 19), date(2014, 2, 19), date(2014, 2, 20),
#             date(2014, 2, 21), date(2014, 2, 22)]

datelist = [date(2014, 2, 19), date(2014, 2, 21),
            date(2014, 2, 22), date(2014, 2, 20)]

datelist.sort()

if datelist[-1] == reduce(checked, datelist):
    print "dates are consecutive"
else:
    print "dates are not consecutive"
lessthanl0l
  • 1,035
  • 2
  • 12
  • 21
0

Use your database to select unique dates in the ascending order:

  • if the query returns a single date it is your first case

  • otherwise find out whether the dates are consecutive:

    import datetime
    
    def consecutive(a, b, step=datetime.timedelta(days=1)):
        return (a + step) == b
    

Code layout:

dates = <query database>
if all(consecutive(dates[i], dates[i+1]) for i in xrange(len(dates) - 1)):
   if len(dates) == 1: # unique
      # 1st case: all records have the same date
   else:
      # the dates are a range of dates
else:
   # non-consecutive dates
jfs
  • 399,953
  • 195
  • 994
  • 1,670