I have a list of unsorted sublists, like this:

listToGroup = [[123, 134], [134, 153], [134, 158], [123], [537, 190], [190], [537, 950], [950, 650]]

What I'm trying to do is group the sublists based on recurring connections between them. The start value is always a sublist with a single item, i.e. [123] or [190] in the example. The result should look something like this:

sortedList = [[123, 134, 153, 158], [190, 537, 950, 650]]

My dataset consists of about 1000 of these sublists. I thought about solving this recursively as shown below, but I think I'm way off here:

def listGrouper(startItem, listToGroup):
    groupedList = []
    checkedIndexes = []
    groupedList.append(startItem)

    for index, subList in enumerate(listToGroup):
        if len(subList) > 1:
            if startItem in subList and index not in checkedIndexes:
                if subList.index(startItem) == 0:
                    nextItem = subList[1]
                elif subList.index(startItem) == 1:
                    nextItem = subList[0]

                checkedIndexes.append(index)
                groupedList.append(listGrouper(nextItem, listToGroup))

    return [item for item in groupedList]

sortedList = []

for subList in listToGroup:
    if len(subList) == 1:
        sortedList.append(listGrouper(subList[0], listToGroup))

Sorry if the code is a bit messy. Would appreciate if anyone could point me in the right direction.

1 Answer

You're looking for the connected components of the graph whose edges are the two-item sublists. You can proceed as in this answer, but filter out the single-item sublists: they don't add any connections, and NetworkX raises an error if they are passed as edges:

import networkx as nx
G = nx.Graph()
G.add_edges_from(i for i in listToGroup if len(i) == 2)

list(nx.connected_components(G))
# [{123, 134, 153, 158}, {190, 537, 650, 950}]
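
If you'd rather avoid the dependency, or you need each group as a list that begins with its single-item start value, the same idea can be sketched in plain Python with a breadth-first search. This is only a rough sketch: the function name `group_from_starts` is made up for illustration, and it assumes every sublist has either one or two items, as in your example. It is iterative rather than recursive, so it does not revisit the same pair over and over the way the recursive attempt can.

```python
from collections import defaultdict, deque

def group_from_starts(sublists):
    """Group connected values, starting each group at a single-item sublist.

    Hypothetical helper: assumes sublists hold either one or two items.
    """
    # Build an undirected adjacency map from the two-item sublists.
    neighbours = defaultdict(list)
    for sub in sublists:
        if len(sub) == 2:
            a, b = sub
            neighbours[a].append(b)
            neighbours[b].append(a)

    groups = []
    seen = set()
    for sub in sublists:
        # Only single-item sublists act as start values.
        if len(sub) != 1 or sub[0] in seen:
            continue
        start = sub[0]
        seen.add(start)
        queue = deque([start])
        group = []
        # Breadth-first search over everything reachable from the start value.
        while queue:
            node = queue.popleft()
            group.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        groups.append(group)
    return groups

listToGroup = [[123, 134], [134, 153], [134, 158], [123],
               [537, 190], [190], [537, 950], [950, 650]]
print(group_from_starts(listToGroup))
# [[123, 134, 153, 158], [190, 537, 950, 650]]
```

Each value is visited once, so about 1000 sublists should be no problem. Note that a component with no single-item sublist is skipped by this sketch, whereas `nx.connected_components` would still return it.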
yatu