0

How can I sort a list of tuples by the first value? For a dictionary, we can use sorted(a.keys()).

How do I do the same for a list of tuples?

Suppose these are the tuple values:

t = [('2010-09-11', 'somedata', somedata),
     ('2010-06-11', 'somedata', somedata),
     ('2010-09-12', 'somedata', somedata)]

The tuples should be sorted by the dates in the first field.

Georgy
Rajeev
  • You mean you want to sort the list that contains the tuples? – Jacob Jul 25 '11 at 09:52
  • 2
    It's simply t.sort() see: http://stackoverflow.com/questions/644170/how-does-python-sort-a-list-of-tuples – Karoly Horvath Jul 25 '11 at 09:55
  • @delan but the OP doesn't actually want to sort a tuple, he wants to sort a list of tuples. – Daniel Roseman Jul 25 '11 at 09:57
  • Sorry, SO drove me nuts: it kept posting my answer as a comment (saying that it is trivial). Also, he sorts a *list*, not a tuple – Karoly Horvath Jul 25 '11 at 09:58
  • @Rajeev Writing _"tuple should be sorted based on date,first field "_ is wrong. If it is sorted on the first field, then it is sorted on a **string**. If you want to sort on a **date**, you must take into account the way in which the string expresses the date, _year-month-day_ or _year-day-month_. – eyquem Jul 25 '11 at 11:37

4 Answers

7

Usually, just sorted(t) works, because tuples are compared in lexicographical order. If you really want to ignore everything after the first item (instead of ordering tuples that share a first element by their following elements), you can supply a key that picks out the first element. The simplest way is operator.itemgetter:

import operator
...
for item in sorted(t, key=operator.itemgetter(0)):
    ...

Of course if you want to sort the list in-place, you can use t.sort(key=operator.itemgetter(0)) instead.
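
To make the difference between the two approaches concrete, here is a minimal sketch. The sample rows and the names rows, full_order and first_only are made up for the illustration and are not taken from the question.

```python
from operator import itemgetter

# Hypothetical sample data: two rows share the same first element.
rows = [('2010-09-11', 'b', 2),
        ('2010-06-11', 'z', 9),
        ('2010-09-11', 'a', 1)]

# Plain sorted() compares whole tuples, so ties on the date are
# broken by the second element, then by the third.
full_order = sorted(rows)

# With a key, only the first element is compared. Python's sort is
# stable, so rows with equal dates keep their original relative order.
first_only = sorted(rows, key=itemgetter(0))

print(full_order)
print(first_only)
```

If the first elements are all distinct, as in the question's data, both calls should give the same result.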

1

Or you can use something like this to make sure that the list of tuples is sorted by date:

from datetime import datetime
initData = [('2010-09-11','somedata',1), ('2010-06-11','somedata',2), ('2010-09-12','somedata',3)]
outData = sorted(initData, key=lambda x: datetime.strptime(x[0], "%Y-%m-%d"))
Artsiom Rudzenka
  • 2
    In the case that you are treating, that is to say the date strings expressing dates as _year-month-day_ , it isn't necessary to resort to **strptime()**: doing so is exploiting the order contained in the **struct_time** data type, while it is possible to exploit the order of the dates considered as strings, without applying **strptime()**. - However this use of **strptime()** alone, without **strftime()**, is fine and I realized this point concerning the orders. Hence I upvote – eyquem Jul 25 '11 at 11:45
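
As a quick check of the point made in the comment above, the following sketch compares plain string ordering with ordering by parsed dates. The sample list dates and the variable names are assumptions for the illustration; the strings are assumed to be zero-padded year-month-day.

```python
from datetime import datetime

# Hypothetical sample of zero-padded year-month-day strings.
dates = ['2010-09-11', '2010-06-11', '2010-09-12', '2009-12-31']

# Order the raw strings, then order them by their parsed dates.
by_string = sorted(dates)
by_date = sorted(dates, key=lambda s: datetime.strptime(s, '%Y-%m-%d'))

# Both orders agree because the most significant field comes first
# and every field has a fixed, zero-padded width.
print(by_string == by_date)
```

The agreement depends on the zero padding: a string such as '2010-9-11' would sort after '2010-10-01' as text, even though September precedes October.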
1

If '2010-09-11' is year-month-day, you do:

somedata = 'jyhghg'
t = [('2010-09-11','somedata',somedata),
     ('2010-06-11','somedata',somedata),
     ('2010-09-12','somedata',somedata),
     ('2010-08-12','somedata',somedata)]

from operator import itemgetter
t.sort(key = itemgetter(0))
print(t)

result

[('2010-06-11', 'somedata', 'jyhghg'),
 ('2010-08-12', 'somedata', 'jyhghg'),
 ('2010-09-11', 'somedata', 'jyhghg'),
 ('2010-09-12', 'somedata', 'jyhghg')]


If '2010-09-11' is year-day-month, you do:

from time import strptime,strftime

somedata = 'jyhghg'
t = [('2010-09-11','somedata',somedata),
     ('2010-06-11','somedata',somedata),
     ('2010-09-12','somedata',somedata),
     ('2010-08-12','somedata',somedata)]

t.sort(key = lambda x: strftime('%Y%m%d',strptime(x[0],'%Y-%d-%m')))
print(t)

result

[('2010-06-11', 'somedata', 'jyhghg'),
 ('2010-09-11', 'somedata', 'jyhghg'),
 ('2010-08-12', 'somedata', 'jyhghg'),
 ('2010-09-12', 'somedata', 'jyhghg')]


Edit 1

Reading the answer of Artsiom Rudzenka, in which he uses strptime() alone, I realized that strptime() produces a struct_time object that is ordered by nature. Such an object has the attributes tm_year, tm_mon, tm_mday, tm_hour, tm_min, tm_sec, tm_wday, tm_yday, tm_isdst, which are accessible through ordinary dot notation (toto.tm_mon, for example) but also by index (toto[1], for example), because the attributes of a struct_time object are stored in exactly that order. The struct_time data type has a named tuple's interface, so two such objects compare field by field, starting with the year.

Since a struct_time object is ordered by nature, it isn't necessary to apply strftime() to obtain a date string having year-month-day in this order: this order is already present in the struct_time object.

So I corrected my code for the case in which the 11 in '2010-06-11' is the month, by eliminating strftime():

from time import strptime

somedata = 'jyhghg'
t = [('2010-09-11','somedata',somedata),
     ('2010-06-11','somedata',somedata),
     ('2010-09-12','somedata',somedata),
     ('2010-08-12','somedata',somedata)]

t.sort(key = lambda x: strptime(x[0],'%Y-%d-%m'))
print(t)
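
The ordering property described in this edit can be checked directly. This is a minimal sketch; the variable names a and b are made up, and the strings are assumed to be year-day-month as in this part of the answer.

```python
from time import strptime

# Parse two strings as year-day-month, the format assumed here.
a = strptime('2010-09-11', '%Y-%d-%m')   # 9 November 2010
b = strptime('2010-08-12', '%Y-%d-%m')   # 8 December 2010

# Attribute access and index access return the same fields.
print(a.tm_year, a.tm_mon, a.tm_mday)
print(a[0], a[1], a[2])

# Comparison proceeds field by field: year first, then month, then day.
print(a < b)
```

Because the month is compared before the day, a sorts before b even though its day number is larger.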

Edit 2

Taking Kirk Strauser's info in consideration:

import re

regx = re.compile(r'(\d{4})-(\d\d)-(\d\d)')

somedata = 'jyhghg'
t = [('2010-09-11','somedata',somedata),
     ('2010-06-11','somedata',somedata),
     ('2010-09-12','somedata',somedata),
     ('2010-08-12','somedata',somedata)]

t.sort(key = lambda x: regx.match(x[0]).group(1,3,2))
print(t)
eyquem
  • I didn't think about the year-day-month possibility, +1. – utdemir Jul 25 '11 at 10:27
  • @utdemir Thank you. In fact, I fell into that trap once myself, so I remember it – eyquem Jul 25 '11 at 10:32
  • Note: strptime is _horribly_ slow for this use case. I wrote a function to split the date field on '-', then return fields 0, 2, 1. With timeit, the strptime version ran in 18.06s while the string manipulation version ran in .86s. In general, stay away from strptime unless you actually need the information it returns. – Kirk Strauser Jul 25 '11 at 13:54
  • @Kirk Strauser OK. I didn't know how slow strptime() is, and I didn't think about speed. The difference is big, so I corrected my answer. What is your function to return fields 0, 2, 1, please? – eyquem Jul 25 '11 at 14:09
  • It was pretty trivial: `def swapmonthday(date): fields = date.split('-'); return fields[0], fields[2], fields[1]`. – Kirk Strauser Jul 25 '11 at 14:26
  • @Kirk Strauser I compared the speeds of **1)** ``strptime('2010-09-11','%Y-%d-%m')`` , **2)** ``regx.match('2010-09-11').group(1,3,2)`` , **3)** ``fields = date.split('-'); return (fields[0], fields[2], fields[1])`` , **4)** ``return (date[0:4],date[5:7],date[8:])`` . The slowness of strptime() is incredible, you are right. I obtained the following times: 1) 17 seconds 2) 0.40 secs 3) 0.30 secs 4) 0.12 secs (iterating 100000 times) – eyquem Jul 25 '11 at 14:47
  • I'm kinda embarrassed that I didn't even consider using slices. D'oh! The last three methods are all fast enough that I'd say to just use whichever makes most sense to you or fits in best with the rest of the program unless it becomes a problem. But strptime? Ow! – Kirk Strauser Jul 25 '11 at 16:20
  • @Kirk Strauser I agree with you. strptime() is out. Others are acceptable. A message asks me: _" Would you like to automatically move this discussion to chat?"_ but I have nothing more to add. Thank you for your info on strptime. – eyquem Jul 25 '11 at 16:47
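
For reference, the key functions discussed in these comments can be put side by side as follows. This is a sketch only: the function names are hypothetical, the input is assumed to be a zero-padded year-day-month string, and the slicing variant reorders its slices to year, month, day (an adjustment made here so that it can serve as a sort key). No timings are reproduced.

```python
import re
from time import strptime

# Hypothetical helper names; input is assumed to be 'YYYY-DD-MM'.
_DATE_RE = re.compile(r'(\d{4})-(\d\d)-(\d\d)')

def key_strptime(date):
    # Full parse into a struct_time (the slow variant in the comments).
    return strptime(date, '%Y-%d-%m')

def key_regex(date):
    # Groups reordered to (year, month, day).
    return _DATE_RE.match(date).group(1, 3, 2)

def key_split(date):
    # Same idea as the swapmonthday function quoted above.
    fields = date.split('-')
    return fields[0], fields[2], fields[1]

def key_slice(date):
    # Slices reordered to (year, month, day) so the result sorts by date.
    return date[0:4], date[8:], date[5:7]

dates = ['2010-09-11', '2010-06-11', '2010-09-12', '2010-08-12']
expected = sorted(dates, key=key_strptime)

# Each string-based key should give the same order as the full parse.
for key in (key_regex, key_split, key_slice):
    assert sorted(dates, key=key) == expected
print(expected)
```

The string-based keys return tuples of strings rather than numbers; that still sorts correctly only because each field has a fixed width.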
0

You can simply use:

t.sort()

see: How does Python sort a list of tuples?

Community
Karoly Horvath