Your format, yyyy-mm-dd, allows a lexicographic sort, so your code should work fine unless your values aren't zero-padded (e.g. 2012-10-9 instead of 2012-10-09).
Fix this problem by relying on a comparison of dates rather than strings:
sorted(datesAndText, key=lambda x: datetime.strptime(x, '%Y-%m-%d'))
This uses the key parameter of sorted: a function that accepts one argument (an element of the list being sorted) and returns the value that sorted compares to determine the order.
This has the ancillary benefit of letting you state the date's string format explicitly, which makes it easy to adjust should the format of your data ever change.
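To make the difference concrete, here is a minimal sketch comparing a plain string sort with the strptime-keyed sort. The list name dates and its contents are sample values chosen for illustration (they are not zero-padded on purpose), not the asker's actual data.

```python
from datetime import datetime

# Hypothetical sample data; the month and day are deliberately not zero-padded.
dates = ['2012-2-12', '2012-10-9', '1978-1-1', '1985-10-9']

# Plain string sort: '2012-10-9' lands before '2012-2-12' because '1' < '2'.
lexicographic = sorted(dates)

# Date-aware sort: each string is parsed into a datetime before comparison.
chronological = sorted(dates, key=lambda x: datetime.strptime(x, '%Y-%m-%d'))

print(lexicographic)
print(chronological)
```

The two results differ only in the order of the two 2012 entries, which is exactly the case where missing zero padding breaks the lexicographic sort.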
Edit:
mgilson brought up an interesting point: str.split is probably more efficient. Let's see if he's correct.
strptime solution:
bburns@virgil:~$ python -mtimeit -s"from datetime import datetime;d={'2012-2-12':None, '2012-10-9':None, '1978-1-1':None, '1985-10-9':None}" 'sorted(d, key=lambda x: datetime.strptime(x,"%Y-%m-%d"))'
10000 loops, best of 3: 79.7 usec per loop
mgilson's original str.split solution:
bburns@virgil:~$ python -mtimeit -s"from datetime import datetime;d={'2012-2-12':None, '2012-10-9':None, '1978-1-1':None, '1985-10-9':None}" 'sorted(d,key=lambda x: [int(y) for y in x.split("-")])'
100000 loops, best of 3: 17.6 usec per loop
mgilson's str.split solution with zfill:
bburns@virgil:~$ python -mtimeit -s"from datetime import datetime;d={'2012-2-12':None, '2012-10-9':None, '1978-1-1':None, '1985-10-9':None}" 'sorted(d,key=lambda x: [y.zfill(2) for y in x.split("-")])'
100000 loops, best of 3: 7.4 usec per loop
Looks like he's correct! mgilson's original answer is 4-5 times faster, and his final answer is 10-11 times faster! However, as we agreed in the comments, readability matters. Unless you're presently CPU-bound, I'd still advise going with datetime.strptime over str.split.
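For reference, here is a small sketch that puts the three key functions from the timings side by side and checks that they agree on the ordering. The function names by_strptime, by_int_split and by_zfill_split are made up for this illustration; the sample strings are the dictionary keys used in the benchmarks.

```python
from datetime import datetime

# The same strings used as dictionary keys in the timing runs.
dates = ['2012-2-12', '2012-10-9', '1978-1-1', '1985-10-9']

def by_strptime(s):
    # Parse into a real datetime; raises ValueError on malformed input.
    return datetime.strptime(s, '%Y-%m-%d')

def by_int_split(s):
    # Compare [year, month, day] as integers.
    return [int(part) for part in s.split('-')]

def by_zfill_split(s):
    # Compare zero-padded strings; this assumes every year has four digits.
    return [part.zfill(2) for part in s.split('-')]

results = [sorted(dates, key=key) for key in (by_strptime, by_int_split, by_zfill_split)]
print(results[0])
```

All three agree on this data. Note that the zfill variant only pads to two characters, so it relies on the years all having the same width, and neither split variant validates that the string is a real date the way strptime does.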