Extracting segments from a list of 8-connected pixels

Question

Current situation: I'm trying to extract segments from an image. Thanks to openCV's findContours() method, I now have a list of 8-connected point for every contours. However, these lists are not directly usable, because they contain a lot of duplicates.

The problem: Given a list of 8-connected points, which can contain duplicates, extract segments from it.

Possible solutions:

At first, I used openCV's approxPolyDP() method. However, the results are pretty bad... Here is the zoomed contours:

enter image description here

Here is the result of approxPolyDP(): (9 segments! Some overlap)

enter image description here

but what I want is more like:

enter image description here

It's bad because approxPolyDP() can convert something that "looks like several segments" in "several segments". However, what I have is a list of points that tend to iterate several times over themselves.

For example, if my points are:

0 1 2 3 4 5 6 7 8 
  9

Then, the list of point will be 0 1 2 3 4 5 6 7 8 7 6 5 4 3 2 1 9... And if the number of points become large (>100) then the segments extracted by approxPolyDP() are unfortunately not duplicates (i.e : they overlap each other, but are not strictly equal, so I can't just say "remove duplicates", as opposed to pixels for example)

Perhaps, I've got a solution, but it's pretty long (though interesting). First of all, for all 8-connected list, I create a sparse matrix (for efficiency) and set the matrix values to 1 if the pixel belongs to the list. Then, I create a graph, with nodes corresponding to pixels, and edges between neighbouring pixels. This also means that I add all the missing edges between pixels (complexity small, possible because of the sparse matrix). Then I remove all possible "squares" (4 neighbouring nodes), and this is possible because I am already working on pretty thin contours. Then I can launch a minimal spanning tree algorithm. And finally, I can approximate every branch of the tree with openCV's approxPolyDP()

To sum up: I've got a tedious method, that I've not yet implemented as it seems error-prone. However, I ask you, people at Stack Overflow: are there other existing methods, possibly with good implementations?

Edit: To clarify, once I have a tree, I can extract "branches" (branches start at leaves or nodes linked to 3 or more other nodes) Then, the algorithm in openCV's approxPolyDP() is the Ramer–Douglas–Peucker algorithm, and here is the Wikipedia picture of what it does:

enter image description here

With this picture, it is easy to understand why it fails when points may be duplicates of each other

Another edit: In my method, there is something that may be interesting to note. When you consider points located in a grid (like pixels), then generally, the minimal spanning tree algorithm is not useful because there are many possible minimal trees

X-X-X-X
|
X-X-X-X

is fundamentally very different from

X-X-X-X
| | | |
X X X X

but both are minimal spanning trees

However, in my case, my nodes rarely form clusters because they are supposed to be contours, and there is already a thinning algorithm that runs beforehand in the findContours().

Answer to Tomalak's comment:

enter image description here

If DP algorithm returns 4 segments (the segment from the point 2 to the center being there twice) I would be happy! Of course, with good parameters, I can get to a state where "by chance" I have identical segments, and I can remove duplicates. However, clearly, the algorithm is not designed for it.

Here is a real example with far too many segments:

enter image description here

`are unfortunately not duplicates (i.e : they overlap each other, but are not stricly equal, so I can't just say "remove duplicates", as opposed to pixels for example)` I don't understand this. Why not? — Lightness Races in Orbit, Jun 20 '11 at 13:05
I answered by editing my post with another magnificent Paint picture =) — B. Decoster, Jun 20 '11 at 13:22
I'm looking at the picture you added in response to Tomalak, and I have a doubt: the result of the DP algorithm is actually a valid solution, if you accept that the observed points can be as far from the "true"! segment as the central point in the image is. Do you have a specific threshold for the point-line distance that the algorithm must respect? — Coffee on Mars, Jun 20 '11 at 13:41
No, I don't have a particular threshold, but the problem is that I'm dealing with thousands of images with thousands of contours. If I set a particular value for the threshold, then for most of the contours, I'm going to have a horrible mess of overlapping segments. I understand that what DP algorithm returns is "valid". But just look at the blue segments in the first picture of my post. You can see that there are lots of segments, where only two or three would be needed. And they overlap because of the 8-connectivity nature of the list. — B. Decoster, Jun 20 '11 at 13:45

Matthias Odisio · Accepted Answer · 2011-06-28T17:03:30.947

Using Mathematica 8, I created a morphological graph from the list of white pixels in the image. It is working fine on your first image:

enter image description here

Create the morphological graph:

graph = MorphologicalGraph[binaryimage];

Then you can query the graph properties that are of interest to you.

This gives the names of the vertex in the graph:

vertex = VertexList[graph]

The list of the edges:

EdgeList[graph]

And that gives the positions of the vertex:

pos = PropertyValue[{graph, #}, VertexCoordinates] & /@ vertex

This is what the results look like for the first image:

In[21]:= vertex = VertexList[graph]

Out[21]= {1, 3, 2, 4, 5, 6, 7, 9, 8, 10}

In[22]:= EdgeList[graph]

Out[22]= {1 \[UndirectedEdge] 3, 2 \[UndirectedEdge] 4,  3 \[UndirectedEdge] 4, 
          3 \[UndirectedEdge] 5, 4 \[UndirectedEdge] 6,  6 \[UndirectedEdge] 7, 
          6 \[UndirectedEdge] 9, 8 \[UndirectedEdge] 9,  9 \[UndirectedEdge] 10}

In[26]:= pos = PropertyValue[{graph, #}, VertexCoordinates] & /@ vertex

Out[26]= {{54.5, 191.5}, {98.5, 149.5},  {42.5, 185.5}, 
          {91.5, 138.5}, {132.5, 119.5}, {157.5, 72.5},
          {168.5, 65.5}, {125.5, 52.5},  {114.5, 53.5}, 
          {120.5, 29.5}}

Given the documentation, http://reference.wolfram.com/mathematica/ref/MorphologicalGraph.html, the command MorphologicalGraph first computes the skeleton by morphological thinning:

skeleton = Thinning[binaryimage, Method -> "Morphological"]

Then the vertex are detected; they are the branch points and the end points:

verteximage = ImageAdd[
                  MorphologicalTransform[skeleton, "SkeletonEndPoints"],   
                  MorphologicalTransform[skeleton, "SkeletonBranchPoints"]]

enter image description here

And then the vertex are linked after analysis of their connectivity.

For example, one could start by breaking the structure around the vertex and then look for the connected components, revealing the edges of the graph:

comp = MorphologicalComponents[
           ImageSubtract[
               skeleton, 
               Dilation[vertices, CrossMatrix[1]]]];
Colorize[comp]

enter image description here

The devil is in the details, but that sounds like a solid starting point if you wish to develop your own implementation.

+1 But as the OP is not asking for a solution in Mathematica, you should explain the algorithms behind `MorphologicalGraph[]` — Dr. belisarius, Jun 27 '11 at 18:48
Edited accordingly, thanks for that and allowing the pictures in. — Matthias Odisio, Jun 27 '11 at 19:41
Oh, thanks a lot! Well, I don't use Mathematica, and I can't incorporate it. I'll try to look at `MorphologicalGraph[]` but if you have any idea how it works, I'll be forever grateful =). And, if no ones comes up with something amazing, the bounty's yours! — B. Decoster, Jun 28 '11 at 08:03
@Fezvez Thanks for the encouragement. I don't know how MorphologicalGraph works, but I proposed another step further. The algorithm requires anyway a set of morphological tools which, as Andrey said, may not be available in OpenCV. If you can not use Mathematica, you may want to look for another library which would provide morphological features. — Matthias Odisio, Jun 28 '11 at 17:11

score 10 · Answer 2 · answered Jun 20 '11 at 13:46

Try math morphology. First you need to dilate or close your image to fill holes.

cvDilate(pimg, pimg, NULL, 3);
cvErode(pimg, pimg, NULL);

I got this image

enter image description here

The next step should be applying thinning algorithm. Unfortunately it's not implemented in OpenCV (MATLAB has bwmorph with thin argument). For example with MATLAB I refined the image to this one:

enter image description here

However OpenCV has all needed basic morphological operations to implement thinning (cvMorphologyEx, cvCreateStructuringElementEx, etc).

Another idea.

They say that distance transform seems to be very useful in such tasks. May be so. Consider cvDistTransform function. It creates to an image like that:

enter image description here

Then using something like cvAdaptiveThreshold:

enter image description here

That's skeleton. I guess you can iterate over all connected white pixels, find curves and filter out small segments.

Thanks for the input! It's interesting stuff, but I fear there has been a misunderstanding : the image was a zoomed version of what I had. I have cleared that in my post. What I want is to take your second picture in input, and output 3 segments. — B. Decoster, Jun 20 '11 at 14:12

Sean · Answer 3 · 2011-06-30T21:03:04.273

I've implemented a similar algorithm before, and I did it in a sort of incremental least-squares fashion. It worked fairly well. The pseudocode is somewhat like:

L = empty set of line segments
for each white pixel p
  line = new line containing only p
  C = empty set of points
  P = set of all neighboring pixels of p
  while P is not empty
    n = first point in P
    add n to C
    remove n from P
    line' = line with n added to it
    perform a least squares fit of line'
    if MSE(line) < max_mse and d(line, n) < max_distance
      line = line'
      add all neighbors of n that are not in C to P
  if size(line) > min_num_points
    add line to L

where MSE(line) is the mean-square-error of the line (sum over all points in the line of the squared distance to the best fitting line) and d(line,n) is the distance from point n to the line. Good values for max_distance seem to be a pixel or so and max_mse seems to be much less, and will depend on the average size of the line segments in your image. 0.1 or 0.2 pixels have worked in fairly large images for me.

I had been using this on actual images pre-processed with the Canny operator, so the only results I have are of that. Here's the result of the above algorithm on an image: Raw image Detected segments

It's possible to make the algorithm fast, too. The C++ implementation I have (closed source enforced by my job, sorry, else I would give it to you) processed the above image in about 20 milliseconds. That includes application of the Canny operator for edge detection, so it should be even faster in your case.

score 0 · Answer 4 · edited Jun 13 '18 at 16:43

You can start by extraction straight lines from your contours image using HoughLinesP which is provided with openCV:

HoughLinesP(InputArray image, OutputArray lines, double rho, double theta, int threshold, double minLineLength = 0, double maxLineGap = 0)

If you choose threshold = 1 and minLineLenght small, you can even obtain all single elements. Be careful though, since it yields many results in case you have many edge pixels.

Extracting segments from a list of 8-connected pixels

4 Answers4