I'm trying to modify the connected-component labelling algorithm I found in an answer to this question: Connected Component Labelling.
Basically, I have 2-D and 3-D matrices consisting of 0s and 1s. I need to find the connected regions of 1s and label each region separately. The matrices can be very large (5e4-by-5e4 elements in 2-D and 1000^3 elements in 3-D), so I need something that doesn't overflow the stack and that is fast enough to repeat several times over the course of a simulation.
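To put numbers on those sizes, here is a rough sketch of the arithmetic. The helper names (`cellCount2D`, `cellCount3D`, `labelBytes`, `fitsInInt32`) are made up for illustration, and it assumes a flat array holding one label per element:

```cpp
#include <cstdint>
#include <limits>

// Element counts, computed in 64 bits so the products cannot wrap around.
constexpr std::int64_t cellCount2D(std::int64_t w, std::int64_t h) { return w * h; }
constexpr std::int64_t cellCount3D(std::int64_t n) { return n * n * n; }

// Bytes needed for one label per element (assumption: a single flat array).
constexpr std::int64_t labelBytes(std::int64_t cells, std::int64_t bytesPerLabel) {
    return cells * bytesPerLabel;
}

// True if the largest element index (cells - 1) fits in a signed 32-bit int.
constexpr bool fitsInInt32(std::int64_t cells) {
    return cells - 1 <= std::numeric_limits<std::int32_t>::max();
}
```

If I have this right, the 2-D case has 2.5e9 elements, which is more than a signed 32-bit `int` can index, while the 3-D case (1e9 elements) still fits. With 4-byte labels the label array alone would be roughly 10 GB in 2-D and 4 GB in 3-D.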
The most upvoted answer to that question, using depth-first search, gives a stack overflow error (as noted in a comment). I have been trying to use the union-find algorithm suggested by another user.
The original code (by user Dukeling) works very well for large 2-D matrices, but I also want diagonally adjacent elements to count as connected. Here's my code, with the example input I am trying to use:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>

const int w = 8, h = 8;
int input[w][h] = {{1,0,0,0,1,0,0,1},
                   {1,1,0,1,1,1,1,0},
                   {0,1,0,0,0,0,0,1},
                   {1,1,1,1,0,1,0,1},
                   {0,0,0,0,0,0,1,0},
                   {0,0,1,0,0,1,0,0},
                   {0,1,0,0,1,1,1,0},
                   {1,0,1,1,0,1,0,1}};
int component[w*h];

void doUnion(int a, int b)
{
    // get the root component of a and b, and set the one's parent to the other
    while (component[a] != a)
        a = component[a];
    while (component[b] != b)
        b = component[b];
    component[b] = a;
}

void unionCoords(int x, int y, int x2, int y2)
{
    if (y2 < h && x2 < w && input[x][y] && input[x2][y2] && y2 > 0 && x2 > 0)
        doUnion(x*h + y, x2*h + y2);
}

int main()
{
    int i, j;
    for (i = 0; i < w*h; i++)
        component[i] = i;
    for (int x = 0; x < w; x++)
        for (int y = 0; y < h; y++)
        {
            unionCoords(x, y, x+1, y);
            unionCoords(x, y, x, y+1);
            unionCoords(x, y, x+1, y+1);
            unionCoords(x, y, x-1, y+1);
            unionCoords(x, y, x+1, y-1);
            unionCoords(x, y, x-1, y-1);
        }

    // print the array
    for (int x = 0; x < w; x++)
    {
        for (int y = 0; y < h; y++)
        {
            if (input[x][y] == 0)
            {
                printf("%4d ",input[x][y]);
                continue;
            }
            int c = x*h + y;
            while (component[c] != c) c = component[c];
            printf("%4d ", component[c]);
        }
        printf("\n");
    }
}
As you can see, I added four calls to handle diagonal connectivity between elements. Is this a valid modification of the union-find algorithm? I searched Google, and Stack Overflow in particular, but couldn't find any example of diagonal connectivity. I also want to extend this to three dimensions, which would mean 26 neighbour checks per element. Will this approach scale well? The code seems to work for my case, but sometimes I get an unlabeled isolated element, seemingly at random. I don't want to integrate it into my code only to discover a bug months later.
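To make the 3-D part of the question concrete, here is a minimal sketch of how I imagine the generalised version, not something I have tested at full scale. The names `findRoot`, `unite` and `labelComponents` are my own, the grid is assumed to be one flat array indexed as `(z*ny + y)*nx + x`, and indices are assumed to fit in `int` (which would not hold for my 5e4-by-5e4 case). Two things differ from the code above, and I am not sure whether either explains the stray unlabeled element: every coordinate is range-checked before the array is read, with index 0 allowed, whereas `unionCoords` reads `input[x2][y2]` before testing `y2 > 0 && x2 > 0`; and only half of the neighbour offsets are visited (13 of 26), since each pair of neighbours only needs to be joined once.

```cpp
#include <cstddef>
#include <vector>

// Follow parent links to the root, halving the path on the way so that
// later lookups on the same chain are shorter.
int findRoot(std::vector<int>& parent, int a) {
    while (parent[a] != a) {
        parent[a] = parent[parent[a]];
        a = parent[a];
    }
    return a;
}

// Merge the sets containing a and b.
void unite(std::vector<int>& parent, int a, int b) {
    a = findRoot(parent, a);
    b = findRoot(parent, b);
    if (a != b) parent[b] = a;
}

// Label connected regions of nonzero cells in an nz*ny*nx grid stored as a
// flat array (index = (z*ny + y)*nx + x). Diagonal neighbours count as
// connected: 26 neighbours in 3-D, which reduces to 8 when nz == 1.
// Returns the root index of each cell's region, or -1 for zero cells.
// Assumption: nz*ny*nx fits in int.
std::vector<int> labelComponents(const std::vector<unsigned char>& grid,
                                 int nz, int ny, int nx) {
    std::vector<int> parent(grid.size());
    for (std::size_t i = 0; i < parent.size(); ++i)
        parent[i] = static_cast<int>(i);

    for (int z = 0; z < nz; ++z)
    for (int y = 0; y < ny; ++y)
    for (int x = 0; x < nx; ++x) {
        const int here = (z * ny + y) * nx + x;
        if (!grid[here]) continue;
        // Visit only the "forward" half of the neighbourhood (13 of the 26
        // offsets); the mirrored half is handled from the other cell.
        for (int dz = 0; dz <= 1; ++dz)
        for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            if (dz == 0 && (dy < 0 || (dy == 0 && dx <= 0))) continue;
            const int z2 = z + dz, y2 = y + dy, x2 = x + dx;
            // Range-check every coordinate before touching the array.
            if (z2 >= nz || y2 < 0 || y2 >= ny || x2 < 0 || x2 >= nx) continue;
            const int there = (z2 * ny + y2) * nx + x2;
            if (grid[there]) unite(parent, here, there);
        }
    }

    std::vector<int> labels(grid.size(), -1);
    for (std::size_t i = 0; i < grid.size(); ++i)
        if (grid[i]) labels[i] = findRoot(parent, static_cast<int>(i));
    return labels;
}
```

Calling `labelComponents(grid, 1, 8, 8)` on my 8-by-8 example (flattened row by row) should treat it as the 2-D case. The returned labels are root indices rather than consecutive numbers, so a renumbering pass would still be needed if labels 1, 2, 3, ... are wanted.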
Thanks.