
I have a list of lists in the form:

testdata = [['9034968', 'ETH'],  ['14160113', 'ETH'],  ['9034968', 'ETH'],  
            ['11111', 'NOT'], ['9555269', 'NOT'],  ['15724032', 'ETH'],  
            ['15481740', 'ETH'],  ['15481757', 'ETH'],  ['15481724', 'ETH'],   
            ['10307528', 'ETH'],  ['15481757', 'ETH'],  ['15481724', 'ETH'],
            ['15481740', 'ETH'],  ['15379365', 'ETH'],  ['11111', 'NOT'],
            ['9555269', 'NOT'],  ['15379365', 'ETH']]

I would like a final result that groups each unique name with its values. So in the final list (or dictionary, or any iterable) there are only two names (ETH and NOT), each paired with a list of all its values, e.g.:

In [252]: unique_names
Out[252]: 
{'ETH': ['9034968',
  '14160113',
  '9034968',
  '15724032',
  '15481740',
  '15481757',
  '15481724',
  '10307528',
  '15481757',
  '15481724',
  '15481740',
  '15379365',
  '15379365'],
 'NOT': ['11111', '9555269', '11111', '9555269']}

To achieve this I used a dictionary and the following steps:

unique_names = []

for (x, y) in testdata:
    if y not in unique_names:
        unique_names.append(y)

# now unique_names is ['ETH', 'NOT']

unique_names = {name:list() for name in unique_names}

for (x, y) in testdata:
    unique_names[y] = unique_names[y] + [x]

#so finally I get the result above

My problem is:

  • testdata is the result of an SQL query with thousands of entries. My solution runs quite slowly (at least that's how it feels).
  • Can you do this in a more Pythonic manner?

The example data for this question was taken from a similar question about sets and lists here: Python: Uniqueness for list of lists. Unfortunately, the OP there wanted a different result, but the structure of the data was appropriate enough.

oz123
  • You're not creating a list of sets, but a list of dictionaries. – Matthias Jun 25 '14 at 11:49
  • @Matthias, type(unique_names) shows a dict. But it does not matter, like I say, I need some kind of iterable with the set of values of each key. – oz123 Jun 25 '14 at 11:52

1 Answer


You can use collections.defaultdict like this:

from collections import defaultdict
d = defaultdict(list)

for (value, key) in testdata:
    d[key].append(value)

print(d)

Or with a normal dictionary:

d = {}
for (value, key) in testdata:
    d.setdefault(key, []).append(value)
print(d)

Both examples are based on the same idea: they group the values into one list per key. dict.setdefault assigns the default value to the key if the key is not already in the dictionary, and then returns the value stored under that key. We just keep appending values to the list corresponding to the key.
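To make the equivalence concrete, here is a minimal sketch that runs both approaches on a small sample and compares the results. The three-row `sample` list and the two function names are made up for illustration; only the [value, key] row shape comes from the question.

```python
from collections import defaultdict

# Hypothetical three-row sample in the same [value, key] shape as testdata.
sample = [['9034968', 'ETH'], ['11111', 'NOT'], ['14160113', 'ETH']]


def group_with_setdefault(rows):
    """Group values by key using a plain dict and dict.setdefault."""
    grouped = {}
    for value, key in rows:
        # setdefault stores [] only when the key is missing,
        # then returns the list stored under the key either way.
        grouped.setdefault(key, []).append(value)
    return grouped


def group_with_defaultdict(rows):
    """Group values by key using collections.defaultdict."""
    grouped = defaultdict(list)
    for value, key in rows:
        # A missing key is created with an empty list on first access.
        grouped[key].append(value)
    # Convert back to a plain dict so later lookups of missing keys raise KeyError.
    return dict(grouped)


by_setdefault = group_with_setdefault(sample)
by_defaultdict = group_with_defaultdict(sample)
print(by_setdefault == by_defaultdict)
```

Both versions make a single pass over the rows and append in place. By contrast, `unique_names[y] + [x]` in the question builds a new list on every iteration, which is likely why the original feels slow on large inputs.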

thefourtheye
  • I will be damned, that is well documented in the excellent Python docs! https://docs.python.org/2/library/collections.html?highlight=defaultdict#defaultdict-examples – oz123 Jun 25 '14 at 11:50