0

I am having trouble implementing a python algorithm which does the following: (This is part of an attempt to implement a friend of friend algorithm)

Given a list of the form [[a,b],[c,d],[e,f],...] I want to create a new list of the form [[a, a1,a2,a3,...], [b, b1,b2,b3,...], [c, c1,c2,c3,...],...].

An example to make this clearer is something like the following: given a list [[0,1], [0,4], [0,3], [0,423], [1,232], [1,2], [2,444], [2,12]]

I want the output to group all the elements with the first integer so the output would be [[0, 1,4,3,432],[1, 232,2], [2, 444,12]]

Notes: I have sorted the input list according to the first element in each item.

I have been stumped on how to implement this in a somewhat efficient manner for some time now, and would love to get some advice/suggestions as to how to implement this.

P.S. Ultimately I want this to combine all "liked" terms. What i mean is taking the above example, instead of getting the output [[0, 1,4,3,432],[1, 232,2], [2, 444,12]] I would get [[0, 1,4,3,432],[1, 232,2, 444,12]], where the "2" term and its shared elements have joined the elements associated with the "1" term since 1 is associated with 2. This last part may be confusing, but if it makes sense advice would be welcomed as well! Otherwise ignore this last part. =] Thanks again!

Thanks!

AdrianV
  • 97
  • 6
  • 5
    What have you tried? You're after an efficient way to do it; can you post your "inefficient" version? – mathematical.coffee Feb 28 '12 at 06:09
  • Well, I haven't been able to come up with anything that does this, that is why I decided to ask for help. – AdrianV Feb 28 '12 at 06:12
  • It sounds like you should investigate [graph](http://en.wikipedia.org/wiki/Graph_(abstract_data_type)) data structures. – GWW Feb 28 '12 at 06:13
  • 2
    Since you have [0,1] and [1,2], why would you get `[[0, 1,4,3,432],[1, 232,2, 444,12]]`? – Felix Yan Feb 28 '12 at 06:31
  • If you're aiming at what I think you're aiming at, then [this recent question](http://stackoverflow.com/questions/9110837/python-simple-list-merging-based-on-intersection) should be relevant. – DSM Feb 28 '12 at 06:34

5 Answers5

1

Using itertools.groupby():

from itertools import groupby
from operator import itemgetter

data = [[0, 1], [0, 4], [0, 3], [0, 423], [1, 232], [1, 2], [2, 444], [2, 12]]
result = [[k] + list(zip(*g)[1]) for k, g in groupby(data, key=itemgetter(0))]

Using a dictionary:

result = {}
for k, v in data:
    result.setdefault(k, []).append(v)
result = sorted([k] + v for k, v in result.iteritems())
Andrew Clark
  • 202,379
  • 35
  • 273
  • 306
0
temp =  [[0,1], [0,4], [0,3], [0,432], [1,232], [1,2], [2,444], [2,12]]

temp1 = set()

temp2 = dict()

    for i in temp:

        first = i[0]
        second = i[1]
        if first in temp1:
            temp2[first] = temp2[first] + [second]
        else:
            temp2[first] = [second]
        temp1.add(i[0])

Here temp2 will have the required result.

lczapski
  • 4,026
  • 3
  • 16
  • 32
Madness
  • 127
  • 1
  • 6
0

Try this:

from collections import defaultdict

friends = defaultdict(set)
friendpairs =  [[0,1], [0,4], [0,3], [0,432], [1,232], [1,2], [2,444], [2,12]]

for f1,f2 in friendpairs : friends[f1].add(f2)

friendOfFriends = dict( (guy,fr.copy()) for guy,fr in friends.iteritems())

for f1 in friendOfFriends:
    for f2 in friends[f1]:
        friendOfFriends[f1].update(friends[f2])

UPD: You could also replace last line with

        friendOfFriends[f1].update(friends.get(f2,()))

to prevet appearing of empty sets in friends collection

Odomontois
  • 15,918
  • 2
  • 36
  • 71
  • Thanks, I believe this solves my issue perfectly...now i only need to read up on what exactly is happening here – AdrianV Feb 28 '12 at 07:10
0

Without the Ultimately part, you can simply do this:

>>> a = [[0,1], [0,4], [0,3], [0,423], [1,232], [1,2], [2,444], [2,12]]
>>> d = dict()
>>> for x, y in a:
...     if x in d:
...             d[x].append(y)
...     else:
...             d[x] = [y]
...
>>> d
{0: [1, 4, 3, 423], 1: [232, 2], 2: [444, 12]}
>>> [[x] + d[x] for x in d]
[[0, 1, 4, 3, 423], [1, 232, 2], [2, 444, 12]]
Felix Yan
  • 14,841
  • 7
  • 48
  • 61
0

This is fast and simple as I can get it:

data=iter([[0,1], [0,4], [0,3], [0,423], [1,232], [1,2], [2,444], [2,12]])
result = [next(data)]

for pair in  data:
    if result[-1][0]==pair[0]:
        result[-1].append(pair[1])
    else:
        result.append(pair)

print result
"[[0, 1, 4, 3, 423], [1, 232, 2], [2, 444, 12]]"
Rusty Rob
  • 16,489
  • 8
  • 100
  • 116