3

I have a sorted list of lists in which several sublists share the same first element. Currently I'm iterating over it with a loop to group them.

[['5th ave', 111, -30.00, 38.00],
['5th ave', 222, -30.00, 33.00],
['6th ave', 2224, -32.00, 34.90]]

I'd like an elegant list comprehension that groups these into a list of lists keyed on the first element, e.g.:

['5th ave', [[111, -30.00, 38.00], [222, -30.00, 33.00]]]

Thanks

sucasa

3 Answers

8

Looks like a job for collections.defaultdict:

>>> from collections import defaultdict
>>> L = [['5th ave', 111, -30.00, 38.00],
... ['5th ave', 222, -30.00, 33.00],
... ['6th ave', 2224, -32.00, 34.90]]
>>> d = defaultdict(list)
>>> for sublist in L:
...     d[sublist[0]].append(sublist[1:])
... 
>>> print d.items()
[('5th ave', [[111, -30.0, 38.0], [222, -30.0, 33.0]]), ('6th ave', [[2224, -32.0, 34.9]])]

There's absolutely no reason to use a list comprehension here. Just because something takes fewer lines does not mean it's more Pythonic.

TerryA
  • You beat me to this, but the war is not over yet! :P Great answer though. – Games Brainiac Sep 02 '13 at 05:54
  • just a joke. I think this solution breaks the sort. I was hoping to keep it in a list and still sorted. – sucasa Sep 02 '13 at 06:04
  • Why `defaultdict` and why not `setdefault` in ordinary `dict`? – thefourtheye Sep 02 '13 at 06:04
  • @thefourtheye I've never used a setdefault before. Looking at http://stackoverflow.com/questions/3483520/use-cases-for-the-setdefault-dict-method , it seems defaultdict replaces it for the majority of cases – TerryA Sep 02 '13 at 06:09
  • @user1550052 http://stackoverflow.com/questions/6190331/can-i-do-an-ordered-default-dict-in-python May help you :) – TerryA Sep 02 '13 at 06:18
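
On the follow-up about keeping the result in a list and still sorted: because the input is already sorted by its first element, `itertools.groupby` can do the grouping inside a single list comprehension, without going through a dict. This is a minimal sketch rather than part of the answer above; it assumes Python 3 and that rows with the same first element are always adjacent, and the name `rows` is only illustrative.

```python
from itertools import groupby
from operator import itemgetter

rows = [['5th ave', 111, -30.00, 38.00],
        ['5th ave', 222, -30.00, 33.00],
        ['6th ave', 2224, -32.00, 34.90]]

# groupby only merges *adjacent* rows that share a key, so the input must
# already be sorted (or at least clustered) by its first element.
grouped = [[street, [row[1:] for row in group]]
           for street, group in groupby(rows, key=itemgetter(0))]

print(grouped)
```

If the input were not sorted, it would need `rows.sort(key=itemgetter(0))` first, otherwise the same street could show up in more than one group.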
1
data = [['5th ave', 111, -30.00, 38.00],
['5th ave', 222, -30.00, 33.00],
['6th ave', 2224, -32.00, 34.90]]

previous   = ""
listOfData = []
result     = []
for currentItem in data:
    if currentItem[0] != previous:
        if listOfData:
            result.append([previous, listOfData])
            listOfData = []
        previous = currentItem[0]
    listOfData.append(currentItem[1:])

if listOfData:
    result.append([previous, listOfData])

print result

Output

[['5th ave', [[111, -30.0, 38.0], [222, -30.0, 33.0]]], ['6th ave', [[2224, -32.0, 34.9]]]]

This maintains the order as well.

Edit:

With defaultdict I could save a few lines:

from collections import defaultdict

data = [['5th ave', 111, -30.00, 38.00],
['5th ave', 222, -30.00, 33.00],
['6th ave', 2224, -32.00, 34.90]]

unique, Map = [], defaultdict(list)
for item in data:
    if item[0] not in unique: unique.append(item[0])
    Map[item[0]].append(item[1:])
print [(item, Map[item]) for item in unique]

This still maintains order.
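
For readers on a newer interpreter, the separate `unique` list may not be needed: since Python 3.7 a plain `dict` iterates in insertion order, so first-seen order is kept automatically. The sketch below is an assumption-laden variant, not the code from this answer; it targets Python 3.7+ and uses `dict.setdefault` (raised in a comment on the other answer) in place of `defaultdict`. The names `grouped` and `result` are illustrative.

```python
data = [['5th ave', 111, -30.00, 38.00],
        ['5th ave', 222, -30.00, 33.00],
        ['6th ave', 2224, -32.00, 34.90]]

grouped = {}
for item in data:
    # setdefault returns the list already stored under this key,
    # or stores a new empty list and returns that.
    grouped.setdefault(item[0], []).append(item[1:])

# Relies on dicts preserving insertion order (guaranteed from Python 3.7).
result = list(grouped.items())
print(result)
```

On Python 2, or Python 3 before 3.7, this ordering is not guaranteed, and the explicit `unique` list (or an `OrderedDict`) is still the safer choice.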

thefourtheye
  • You should really only use upper case letters at the beginning of variable names for classes – TerryA Sep 02 '13 at 06:12
  • @Haidro I updated the solution. I prefer naming the variables this way. Is there any specific reason why we should use uppercase letters only at the beginning? – thefourtheye Sep 02 '13 at 06:15
  • thanks this is similar to what i was doing. just wanted to see if it was possible with a list comprehension. – sucasa Sep 02 '13 at 06:16
  • have a look at [the PEP 8 style guide](http://www.python.org/dev/peps/pep-0008/#naming-conventions) ;) – TerryA Sep 02 '13 at 06:16
1

collections.defaultdict really is the way to go, but I suspect it might be slower, which is why I came up with this:

from itertools import imap

def RemDup(L):
    ListComp = {}
    for sublist in L:
        try: ListComp[sublist[0]].append(sublist[1:])
        except KeyError: ListComp[sublist[0]] = [sublist[1:]]
    return imap( list, ListComp.items() )

DupList = [['5th ave', 111, -30.00, 38.00],
['5th ave', 222, -30.00, 33.00],
['6th ave', 2224, -32.00, 34.90]]

print [ uniq for uniq in RemDup(DupList) ]
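
Whether `defaultdict` is actually slower than the `try`/`except KeyError` approach can be measured rather than guessed. The following is a rough sketch, assuming Python 3; the function names `group_defaultdict` and `group_try_except` and the repeated `sample` data are hypothetical, chosen only for this comparison. Timings depend on the machine and interpreter, so none are quoted here.

```python
from collections import defaultdict
from timeit import timeit

def group_defaultdict(rows):
    """Group rows by first element using collections.defaultdict."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row[1:])
    return [list(pair) for pair in groups.items()]

def group_try_except(rows):
    """Group rows by first element using a plain dict and KeyError handling."""
    groups = {}
    for row in rows:
        try:
            groups[row[0]].append(row[1:])
        except KeyError:
            groups[row[0]] = [row[1:]]
    return [list(pair) for pair in groups.items()]

# Repeat the example rows so each run does a measurable amount of work.
sample = [['5th ave', 111, -30.00, 38.00],
          ['5th ave', 222, -30.00, 33.00],
          ['6th ave', 2224, -32.00, 34.90]] * 1000

for func in (group_defaultdict, group_try_except):
    seconds = timeit(lambda: func(sample), number=200)
    print(func.__name__, round(seconds, 4))
```

With only two distinct keys the `except` branch is taken just twice per run, so this data favours the `try`/`except` version; input with many distinct keys may shift the balance.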
smac89