How can I merge lists that have common elements within a list of lists?

Question

I have a list like [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]. How can I get a list that contains lists of all the lists that contain overlapping elements added together? For the example input, the result should be [[1, 2, 4, 5], [0, 3, 6, 7, 8, 12], [14, 18]].

One (perhaps unnecessarily convoluted) way of looking at this problem is as finding the connected components of the bipartite graph, with one set of nodes being the sublists of `a` and the other set being their entries. — jme, Nov 28 '14 at 02:23
@qqvc (via [suggested edit](http://stackoverflow.com/review/suggested-edits/6336403)): Please don't edit a question to include the answer. Post it as an answer. — Pokechu22, Nov 28 '14 at 02:32

poke · Accepted Answer · 2014-11-28T02:45:53.010

a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]

result = []
for s in a:
    s = set(s)
    for t in result:
        if t & s:
            t.update(s)
            break
    else:
        result.append(s)

This will go one-by-one through the list and create a set from the current sublist (s). Then it will check in the results, if there is another set t that has a non-empty intersection with it. If that’s the case, the items from s are added to that set t. If there is no t with a non-empty intersection, then s is a new independent result and can be appended to the result list.

A problem like this is also a good example for a fixed-point iteration. In this case, you would look at the list and continue to merge sublists as long as you could still find lists that overlap. You could implement this using itertools.combinations to look at pairs of sublists:

result = [set(x) for x in a] # start with the original list of sets
fixedPoint = False # whether we found a fixed point
while not fixedPoint:
    fixedPoint = True
    for x, y in combinations(result, 2): # search all pairs …
        if x & y: # … for a non-empty intersection
            x.update(y)
            result.remove(y)

            # since we have changed the result, we haven’t found the fixed point
            fixedPoint = False

            # abort this iteration
            break

score 0 · Answer 2 · answered Nov 28 '14 at 02:03

One way I can think of doing this is through recursion. Start with one item, then loop until you find every number it's connected to. For each of these numbers, you must do the same. Hence the recursion. To make it more efficient, store numbers you've visited in a list and check it at the beginning of each recursive sequence to make sure you don't repeat any explorations.

elyase · Answer 3 · 2014-11-28T02:12:57.040

-1

A two liner:

a_set = [set(x) for x in a]
result = [list(x.union(y)) for i,x in enumerate(a_set) for y in a_set[i:] 
          if x.intersection(y) and x != y]

edited Nov 28 '14 at 02:12

answered Nov 28 '14 at 02:03

elyase

39,479
12
112
119

This drops `{14, 18}` though since it doesn’t intersect with anything else. – poke Nov 28 '14 at 02:22
This also doesn’t work at all for a list of lists, where *more than two* lists are overlapping each. For example `[[1, 2], [2, 3], [3, 4]]`. – poke Nov 28 '14 at 02:47
@poke, why should `{14, 18}` be included? The question says "a list that contains lists of all the lists in a that contain overlapping elements added together". `{14, 18}` doesn't have any overlapping element with another sublist. – elyase Nov 28 '14 at 23:38
If you look at OP’s example, you can clearly see that `[14, 18]` is included in the desired output. The result should be all lists from the input, but with those sublists combined where there are overlapping elements. – poke Nov 29 '14 at 00:07

score -1 · Answer 4 · answered Nov 28 '14 at 02:30

I have left the last step for you:

a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
result = [[1, 2, 4, 5], [0, 3, 6, 7, 8, 12], [14, 18]]
# each sub list
result2 = []
count = 0
print a
for sub_list in a:
    print count
    print "sub_list: " + str(sub_list)
    a.pop(count)
    print "a: " + str(a)
    #each int
    sub_list_extend_flag = False
    for int_in_sub_list in sub_list:
        print "int_in_sub_list: " + str(int_in_sub_list)
        for other_sub_list in a:
            print "current_other_sub_list: " + str(other_sub_list)
            if int_in_sub_list in other_sub_list:
                sub_list_extend_flag = True
                other_sub_list.extend(sub_list)

                result2.append(list(set(other_sub_list)))
    if not sub_list_extend_flag:
        result2.append(sub_list)
    count += 1
print result2

This gives `[[1, 2, 4, 5], [1, 2, 4, 5], [0, 3, 6, 7, 8, 12], [0, 3, 6, 7, 8, 12], [0, 3, 6, 7, 8, 12], [18, 14]]` for me which is not the desired result. Ignoring the duplicated lists in the results, this also doesn’t work for lists of sublists where *more than two* or *none* are overlapping. For example `[[1, 2], [2, 3], [3, 4], [5, 6]]`. — poke, Nov 28 '14 at 02:52

GLHF · Answer 5 · 2014-11-28T03:29:53.903

Simple answer:

 a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
for x in a:
    for y in x:
        print y

its more simple than first one:

box=[]
a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
for x in a:
    for y in x:
        box.append(y)
print box

Result:[1, 2, 4, 2, 5, 0, 3, 7, 8, 12, 3, 6, 18, 14]

And with this, you can compare the numbers:

box=[]
box2=""
a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
for x in a:
    for y in x:
        box.append(y)
print box

for a in box:
    box2+=str(a)
print box2

Result: 12425037812361814

Also you can make it more cute:

print " ".join(box2)

Result: 1 2 4 2 5 0 3 7 8 1 2 3 6 1 8 1 4

How can I merge lists that have common elements within a list of lists?

5 Answers5