5

I am trying to code a fast function which will loop through the elements in the sublists and merge the sublists if they contain element. For example, the list [[0, 3], [3, 4], [5, 6]] should be merged to [[0, 3, 4], [5, 6]]. The sublists can be of any size and each sublist can have a different size, therefore could contain many elements.

My code so far (which does not work) is shown below. The error that comes up is: slice indices must be integers or None or have an __index__ method

def join_clusters(clusters):
    for cluster in clusters:
        for j in cluster:
            for k in clusters[cluster:]:
                for h in k:
                    if j == h:
                        cluster.append(k)
                        clusters.pop(k)
                        return clusters
Tom Fuller
  • 5,291
  • 7
  • 33
  • 42
Hello
  • 97
  • 1
  • 5
  • Do you only want to merge adjacent sublists? Would `[[1,2,3], [3,4,5], [5,6,7]]` get merged to `[[1,2,3,4,5,6,7]]`? – PM 2Ring Nov 19 '16 at 14:11
  • @PM2Ring I assume so, that's the way I've done it – Tom Fuller Nov 19 '16 at 14:12
  • 1
    I think you're trying to solve a set consolidation/connected components/union-find problem -- in which case it's a duplicate of many previous questions -- but your example is a little confusing because it could be interpreted as linking neighbouring chains. Should `[[1,2],[3,1],[1,3]]` become `[[1,2,3]]`? – DSM Nov 19 '16 at 15:02
  • yes, that is what I am needing. [[1,2],[3,1],[1,3]] would become [[1,2,3]] – Hello Nov 19 '16 at 15:09

3 Answers3

2

If the subsets are sorted I would just try to do something with sets.

from itertools import islice

def merge(T):
  idx = 0
  result = [set(T[0])]
  for sublst in islice(T, 1, len(T)):
    subset = set(sublst)
    if result[idx] & subset:
      result[idx].update(subset)
    else:
      result.append(set(sublst))
      idx += 1
  return [sorted(sub) for sub in result]
Nf4r
  • 1,390
  • 1
  • 7
  • 9
1

I've used a while loop to make it easier to reference the next cluster in the list

def join_clusters(clusters):
    idx = 0
    while idx < len(clusters) - 1:
        for element in clusters[idx]:
            if element in clusters[idx + 1]:
                clusters[idx].remove(element)
                clusters[idx] = clusters[idx] + clusters[idx + 1]                
                del(clusters[idx + 1])
                break
        idx = idx + 1
    return clusters

hope this helps you :)

Tom Fuller
  • 5,291
  • 7
  • 33
  • 42
1

Here's a solution which works for any kind of sublist, regardless of whether it's sorted:

def join_clusters(clusters):
    result = clusters[:1]                          #1
    for cluster in clusters[1:]:
        if cluster[0] == result[-1][-1]:
            result[-1] = result[-1] + cluster[1:]  #2
        else:
            result.append(cluster)                 #3
    return result

Examples:

>>> c1 = [[0, 3], [3, 4], [5, 6]]
>>> join_clusters(c1)
[[0, 3, 4], [5, 6]]

>>> c2 = [[3, 1], [1, 2], [1, 3], [2, 1], [1, 3], [3, 1], [1, 2]]
>>> join_clusters(c2)
[[3, 1, 2], [1, 3], [2, 1, 3, 1, 2]]

>>> les_mis = "At the end of the day you're another day older".split()
>>> join_clusters(les_mis)
['Athend', 'of', 'the', "dayou're", 'another', 'day', 'older']

Notes:

#1: Use result = clusters[:1][:] if you want the output to contain only copies of the input, rather than the actual original sublists.

#2: result[-1] += cluster[1:] is not used, as it would mutate elements of the original list, which may be undesirable.

#3: Use result.append(cluster[:]) if you want the output to contain only copies of the input, rather than the actual original sublists.

Zero Piraeus
  • 56,143
  • 27
  • 150
  • 160
  • Would there be any way to delete the duplicates, so that [2, 1, 3, 1, 2] became [1,2,3] (order is not important) – Hello Nov 19 '16 at 15:12
  • Also, would there be any way to alter the list clusters, rather than create a copy? – Hello Nov 19 '16 at 15:14
  • 1. Not without starting from scratch - it looks like the question I thought you were asking is not the one you wanted an answer to. 2. Well, yes, but given #1 there doesn't seem much point ;-) – Zero Piraeus Nov 19 '16 at 15:32
  • Thanks for your help. – Hello Nov 19 '16 at 15:35