1

I'm trying to develop a modification of the connected component algorithm I found as an answer to this question: Connected Component Labelling.

Basically, I have 2d- and 3d- matrices consisting of 0s and 1s. My problem is to find connected regions of 1s, labeling each region separately. The matrix sizes can be very large (consisting of 5e4-by-5e4 elements in 2-d and 1000^3 elements in 3d). So I need something which doesn't strain the stack memory, and which is fast enough to repeat several times over the course of a simulation.

The most upvoted answer to that question, using depth-first search, gives a stack overflow error (as noted in a comment). I have been trying to use the union-find algorithm suggested by another user.

The original code (by user Dukeling) works very well for large 2-d matrices, but I want to have diagonal connections between elements. Here's my code, with the example input I am trying to use:

#include <iostream>
#include <stdio.h>
#include <stdlib.h>

const int w = 8, h = 8;
int input[w][h] = {{1,0,0,0,1,0,0,1},
               {1,1,0,1,1,1,1,0},
               {0,1,0,0,0,0,0,1},
               {1,1,1,1,0,1,0,1},
               {0,0,0,0,0,0,1,0},
               {0,0,1,0,0,1,0,0},
               {0,1,0,0,1,1,1,0},
               {1,0,1,1,0,1,0,1}};
int component[w*h];

void doUnion(int a, int b)
{
// get the root component of a and b, and set the one's parent to the other
while (component[a] != a)
    a = component[a];
while (component[b] != b)
    b = component[b];
component[b] = a;
}

void unionCoords(int x, int y, int x2, int y2)
{
if (y2 < h && x2 < w && input[x][y] && input[x2][y2] && y2 > 0 && x2 > 0)
    doUnion(x*h + y, x2*h + y2);
}

int main()
{
int i, j;
for (i = 0; i < w*h; i++)
    component[i] = i;

for (int x = 0; x < w; x++)
for (int y = 0; y < h; y++)
{
    unionCoords(x, y, x+1, y);
    unionCoords(x, y, x, y+1);
    unionCoords(x, y, x+1, y+1);
    unionCoords(x, y, x-1, y+1);
    unionCoords(x, y, x+1, y-1);
    unionCoords(x, y, x-1, y-1);
}


// print the array
for (int x = 0; x < w; x++)
{
    for (int y = 0; y < h; y++)
    {
        if (input[x][y] == 0)
        {
            printf("%4d ",input[x][y]);
            continue;
        }
        int c = x*h + y;
        while (component[c] != c) c = component[c];
        printf("%4d ", component[c]);

    }
    printf("\n");
}
}

As you can see, I added 4 commands for doing diagonal connectivity between elements. Is this a valid modification of the union-find algorithm? I searched Google and stackoverflow in particular, but I can't find any example of diagonal connectivity. In addition, I want to extend this to 3 dimensions - so I would need to add 26 commands for checking. Will this way scale well? I mean the code seems to work for my case, but sometimes I randomly get an unlabeled isolated element. I don't want to integrate it with my code only to discover a bug months later.

Thanks.

rahulv88
  • 101
  • 9
  • You can rewrite a function to use an array as a stack instead of recursion, and then you don't run out of array until you run out of memory. If you have goto available, instead of calling the function, push some a marker for a return address, push its arguments onto the stack, and then goto its start. At the start, pop the arguments from the stack and carry on. To return, pop the return address and go there. If you don't have goto, turn the function into a while loop. While there is anything to do on the stack, do what you can, turning recursive calls into pushes of todo objects to the stack. – mcdowella Jun 10 '17 at 04:17
  • Hi, I did do it with an array. So, in 2d, the function recursively calls an element's four neighbors. It ran out of memory fairly quickly. I also adjusted the stack size to its maximum allowable limit. – rahulv88 Jun 10 '17 at 18:31

1 Answers1

1

There is nothing wrong with your approach using the union find algorithm. Union find runs on any graph. For each node it examines, it checks its connected nodes to determine whether they are in the same subset. Your approach appears to be doing just that, checking the 8 adjacent nodes of any observed node. The union find algorithm has nothing to do with the dimensions of your graph. You can extend that approach to 3d or any dimension, as long as your graph corresponds correctly to that dimension. If you are experiencing errors with this, you can post an example of that error, or check out code review: https://codereview.stackexchange.com/.

Jayson Boubin
  • 1,504
  • 2
  • 11
  • 19
  • Hi, thanks for the response. So the reason I even posted this question is because I couldn't get it to work in 3d. I wanted to know if there's some conceptual difficulties I'm having. If you think this is ok, I think I will edit the original question to include the 3d code. – rahulv88 Jun 10 '17 at 18:33
  • You might get more attention if you repost it as a separate question. – Jayson Boubin Jun 10 '17 at 18:34
  • Also, depending on the nature of your new question, think about whether it belongs here or in code review – Jayson Boubin Jun 10 '17 at 18:35
  • Yeah I think you are right..I should just ask a new question with the non-working code. – rahulv88 Jun 10 '17 at 18:46
  • Good idea. Also, If you think this answer helped you enough with this question, you may want to accept it as to not draw attention away from newer unanswered questions. – Jayson Boubin Jun 10 '17 at 18:50