I have a list like [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
. How can I get a list that contains lists of all the lists that contain overlapping elements added together? For the example input, the result should be [[1, 2, 4, 5], [0, 3, 6, 7, 8, 12], [14, 18]]
.

- 62,466
- 11
- 102
- 153

- 515
- 2
- 7
- 17
-
Could all the subsists be stored as lists? – will Nov 28 '14 at 02:01
-
The number of sublists is not known beforehand. – Daquicker Nov 28 '14 at 02:03
-
1One (perhaps unnecessarily convoluted) way of looking at this problem is as finding the connected components of the bipartite graph, with one set of nodes being the sublists of `a` and the other set being their entries. – jme Nov 28 '14 at 02:23
-
@qqvc (via [suggested edit](http://stackoverflow.com/review/suggested-edits/6336403)): Please don't edit a question to include the answer. Post it as an answer. – Pokechu22 Nov 28 '14 at 02:32
5 Answers
a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
result = []
for s in a:
s = set(s)
for t in result:
if t & s:
t.update(s)
break
else:
result.append(s)
This will go one-by-one through the list and create a set from the current sublist (s
). Then it will check in the results, if there is another set t
that has a non-empty intersection with it. If that’s the case, the items from s
are added to that set t
. If there is no t
with a non-empty intersection, then s
is a new independent result and can be appended to the result list.
A problem like this is also a good example for a fixed-point iteration. In this case, you would look at the list and continue to merge sublists as long as you could still find lists that overlap. You could implement this using itertools.combinations
to look at pairs of sublists:
result = [set(x) for x in a] # start with the original list of sets
fixedPoint = False # whether we found a fixed point
while not fixedPoint:
fixedPoint = True
for x, y in combinations(result, 2): # search all pairs …
if x & y: # … for a non-empty intersection
x.update(y)
result.remove(y)
# since we have changed the result, we haven’t found the fixed point
fixedPoint = False
# abort this iteration
break

- 369,085
- 72
- 557
- 602
One way I can think of doing this is through recursion. Start with one item, then loop until you find every number it's connected to. For each of these numbers, you must do the same. Hence the recursion. To make it more efficient, store numbers you've visited in a list and check it at the beginning of each recursive sequence to make sure you don't repeat any explorations.

- 121
- 2
- 12
A two liner:
a_set = [set(x) for x in a]
result = [list(x.union(y)) for i,x in enumerate(a_set) for y in a_set[i:]
if x.intersection(y) and x != y]

- 39,479
- 12
- 112
- 119
-
This drops `{14, 18}` though since it doesn’t intersect with anything else. – poke Nov 28 '14 at 02:22
-
This also doesn’t work at all for a list of lists, where *more than two* lists are overlapping each. For example `[[1, 2], [2, 3], [3, 4]]`. – poke Nov 28 '14 at 02:47
-
@poke, why should `{14, 18}` be included? The question says "a list that contains lists of all the lists in a that contain overlapping elements added together". `{14, 18}` doesn't have any overlapping element with another sublist. – elyase Nov 28 '14 at 23:38
-
If you look at OP’s example, you can clearly see that `[14, 18]` is included in the desired output. The result should be all lists from the input, but with those sublists combined where there are overlapping elements. – poke Nov 29 '14 at 00:07
I have left the last step for you:
a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
result = [[1, 2, 4, 5], [0, 3, 6, 7, 8, 12], [14, 18]]
# each sub list
result2 = []
count = 0
print a
for sub_list in a:
print count
print "sub_list: " + str(sub_list)
a.pop(count)
print "a: " + str(a)
#each int
sub_list_extend_flag = False
for int_in_sub_list in sub_list:
print "int_in_sub_list: " + str(int_in_sub_list)
for other_sub_list in a:
print "current_other_sub_list: " + str(other_sub_list)
if int_in_sub_list in other_sub_list:
sub_list_extend_flag = True
other_sub_list.extend(sub_list)
result2.append(list(set(other_sub_list)))
if not sub_list_extend_flag:
result2.append(sub_list)
count += 1
print result2

- 4,620
- 4
- 32
- 44
-
This gives `[[1, 2, 4, 5], [1, 2, 4, 5], [0, 3, 6, 7, 8, 12], [0, 3, 6, 7, 8, 12], [0, 3, 6, 7, 8, 12], [18, 14]]` for me which is not the desired result. Ignoring the duplicated lists in the results, this also doesn’t work for lists of sublists where *more than two* or *none* are overlapping. For example `[[1, 2], [2, 3], [3, 4], [5, 6]]`. – poke Nov 28 '14 at 02:52
Simple answer:
a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
for x in a:
for y in x:
print y
its more simple than first one:
box=[]
a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
for x in a:
for y in x:
box.append(y)
print box
Result:[1, 2, 4, 2, 5, 0, 3, 7, 8, 12, 3, 6, 18, 14]
And with this, you can compare the numbers:
box=[]
box2=""
a = [[1, 2, 4], [2, 5], [0, 3, 7, 8], [12, 3, 6], [18, 14]]
for x in a:
for y in x:
box.append(y)
print box
for a in box:
box2+=str(a)
print box2
Result: 12425037812361814
Also you can make it more cute:
print " ".join(box2)
Result: 1 2 4 2 5 0 3 7 8 1 2 3 6 1 8 1 4

- 3,835
- 10
- 38
- 83