What's the right algorithm for finding isolated subsets

Question

Picture is worth a thousand words, so:

My input is the matrix on the left, and what I need to find is the sets of nodes that are maximum one step away from each other (not diagonally). Node that is more than one up/down/left/right step away would be in a separate set.

So, my plan was running a BFS from every node I find, then returning the set it traversed through, and removing it from the original set. Iterate this process until I'm done. But then I've had the wild idea of looking for a graph analysis tools - and I've found NetworkX. Is there an easy way (algorithm?) to achieve this without manually writing BFS, and traverse the whole matrix?

Thanks

What format is your input in? Is each point just listed as a coordinate pair, or do you actually have the connectivity information explicitly? Also, `networkx` does have breadth first search. — BrenBarn, Jul 09 '16 at 01:53
It's a matrix of coordinates. I can implement the bfs, no problem - but then i have to also iterate the whole matrix and reduce it every time a subset is found. Was hoping to save some work. — MeLight, Jul 09 '16 at 02:00
Why do you have to iterate the matrix again? If you marked the visited nodes already you wouldn't need to, right? Should be <10 lines of code total if you use your own BFS, but you can also use [this](https://networkx.github.io/documentation/networkx-1.9.1/reference/generated/networkx.algorithms.components.connected.connected_component_subgraphs.html) — llllvvuu, Jul 09 '16 at 02:48
I would create links between adjacent filled cells and then use https://en.wikipedia.org/wiki/Disjoint-set_data_structure — mcdowella, Jul 09 '16 at 04:12
You could just iterate over the matrix, join each point to adjacent points (if they exist), and then use the networkx function that gives you the connected components. — BrenBarn, Jul 09 '16 at 06:21
by the way - that networkx function is `nx.connected_components` — Joel, Jul 09 '16 at 13:35
Does this help? http://stackoverflow.com/questions/17795711/creating-sets-of-similar-elements-in-a-2d-array/17902398#17902398 — גלעד ברקן, Jul 09 '16 at 22:01

score 0 · Answer 1 · answered Jul 09 '16 at 20:45

What you are trying to do is searching for "connected components" and NetworX has itself a method for doing exactly that as can be seen in the first example on this documentation page as others has already pointed out on the comments.

Reading your question it seems that your nodes are on a discrete grid and the concept of connected that you describe is the same used on the pixel of an image.

Connected components algorithms are available for graphs and for images also.

If performances are important in your case I would suggest you to go for the image version of connected components. This comes by the fact that images (grids of pixels) are a specific class of graphs so the connected components algorithms dealing with grids of nodes are built knowing the topology of the graph itself (i.e. graph is planar, the max vertex degree is four). A general algorithm for graphs has o be able to work on general graphs (i.e they may be not planar, with multiple edges between some nodes) so it has to spend more work because it can't assume much about the properties of the input graph.

Since connected components can be found on graphs in linear time I am not telling the image version would be orders of magnitude faster. There will only be a constant factor between the two. For this reason you should also take into account which is the data structure that holds your input data and how much time will be spent in creating the input structures which are required by each version of the algorithm.

What's the right algorithm for finding isolated subsets

1 Answers1