I have a JPG, BMP, or SVG image (see example below) and I need an algorithm to extract the vertices (their (X, Y) coordinates) and the edges (i.e., a list that indicates which vertices are connected). The edges can be given either as a boolean true/false for each vertex pair or simply as a list of connected vertex pairs. Any ideas welcome.
For example, I would like a function (or series of functions) that takes the image as input and outputs two lists:
Vertices:
Vertex 1: X = 1, Y = 2
Vertex 2: X = 3, Y = 5
Vertex 3: X = 3, Y = 7
...
Edges:
Edge 1: (Vertex 1, Vertex 3)
Edge 2: (Vertex 1, Vertex 4)
Edge 3: (Vertex 4, Vertex 10)
...
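To make the two edge formats concrete, here is a minimal sketch of how the lists could be held in code and how a pair list converts to the boolean-per-pair form. The names `vertices`, `edges`, and `to_adjacency` are only illustrative, and the second edge is made up for the example.

```python
from typing import List, Tuple

# The first three vertices from the example above (1-based ids, as in the lists).
vertices: List[Tuple[float, float]] = [(1, 2), (3, 5), (3, 7)]

# Edge list as vertex-id pairs. (1, 3) is from the example; (2, 3) is made up.
edges: List[Tuple[int, int]] = [(1, 3), (2, 3)]


def to_adjacency(vertex_count: int, edge_list: List[Tuple[int, int]]) -> List[List[bool]]:
    """Convert a list of 1-based vertex pairs into a symmetric boolean matrix."""
    adjacency = [[False] * vertex_count for _ in range(vertex_count)]
    for a, b in edge_list:
        adjacency[a - 1][b - 1] = True
        adjacency[b - 1][a - 1] = True  # edges are undirected
    return adjacency


adjacency = to_adjacency(len(vertices), edges)
```

Either form would work for me; the pair list is more compact when the diagram has few edges.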
The vertex coordinates can be in any coordinate system (e.g., pixels or SVG coordinates), or in some alternate user-defined coordinate system.
For example, I extracted the following coordinates (pixels) from the example image (left) and plotted them in Matlab (right).
So, for example, I can tell that the corner vertices are roughly: (10, 10), (290, 10), (290, 190), and (10, 190).
But I want an algorithm to automatically detect those coordinates and to also tell me that there is an edge between the top left vertex (10, 190) and the top right vertex (290, 190), etc. I also need to identify each of the vertices and edges for the internal blocks, etc.
I also need it to work for more complicated diagrams. For example, I am able to extract the necessary pixels and produce the following Matlab plot:
As before, it is quite clear where the vertices "should be"; however, due to the line thickness, there are many clusters of pixels that first need to be "smoothed out", etc. I'm unsure how to go about doing this and automating the process of identifying vertices/edges.
Note 1: The method I'm using to get the pixel coordinates is basically:
- Convert to Black/White
- Scan each pixel to see if colour <= threshold, save (X,Y) if it's "black"
- Plot in Matlab
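In code, the scanning step above amounts to roughly the following. This is only a sketch: it assumes the image has already been loaded as rows of grayscale values from 0 to 255, and the name `black_pixels` and the default threshold of 128 are arbitrary choices for illustration.

```python
from typing import List, Tuple


def black_pixels(gray: List[List[int]], threshold: int = 128) -> List[Tuple[int, int]]:
    """Return the (x, y) position of every pixel whose value is <= threshold.

    gray[y][x] is the grayscale value of the pixel in row y, column x,
    so y increases downward, as is usual for image data.
    """
    return [
        (x, y)
        for y, row in enumerate(gray)
        for x, value in enumerate(row)
        if value <= threshold
    ]


# Tiny 2x2 example: one pure black pixel and one dark grey pixel.
sample = [
    [255, 0],
    [100, 200],
]
points = black_pixels(sample)
```

Because image rows count downward while a Matlab plot has Y increasing upward, the Y values may need flipping (for example `height - 1 - y`) before plotting.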
A rough algorithm I'm thinking of is:
- Apply "smoothing" to get a single line instead of pixel clusters
- "Loop" through pixels in different directions; when a significant slope change occurs, identify it as a "vertex"
- After all vertices are identified, evaluate the line between each pair of vertices; if that line is mostly black, identify it as an edge
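The last step (testing whether the line between two vertices is mostly black) could be sketched as below. It assumes the vertices are already known and the black pixels are stored in a set of (x, y) tuples. The names `black_fraction` and `find_edges`, the tolerance `tol`, and the 0.95 cut-off are all placeholder choices, not tuned values.

```python
from itertools import combinations
from typing import List, Set, Tuple

Point = Tuple[int, int]


def black_fraction(black: Set[Point], p: Point, q: Point, tol: int = 1) -> float:
    """Fraction of sample points on segment p-q that lie within tol pixels of a black pixel."""
    (x0, y0), (x1, y1) = p, q
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)  # roughly one sample per pixel
    hits = 0
    for i in range(steps + 1):
        t = i / steps
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        # The small neighbourhood absorbs line thickness and rounding error.
        if any((x + dx, y + dy) in black
               for dx in range(-tol, tol + 1)
               for dy in range(-tol, tol + 1)):
            hits += 1
    return hits / (steps + 1)


def find_edges(black: Set[Point], vertices: List[Point],
               min_fraction: float = 0.95) -> List[Tuple[int, int]]:
    """Return 0-based index pairs of vertices joined by a mostly black segment."""
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(vertices), 2)
        if black_fraction(black, p, q) >= min_fraction
    ]
```

One of the issues with this approach: if three vertices A, B, C lie on one straight black line, the check also reports A-C as an edge, so collinear pairs would need extra filtering.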
There are many issues with the above algorithm, so I was hoping others might have some better ideas or similar C# code, etc.
I would like the process to be as automated as possible.
Note 2: I can also convert the image to SVG format (already implemented). It is my understanding that the SVG format may lend itself very well to my application because it can more easily automate the process; however, I find the SVG structure quite confusing.
I have read through some literature online about SVG formats and I understand how it works, but I was wondering if there was some sort of already existing library or something that would allow me to very easily identify the vertices of the "path" in the SVG file, etc.
For example, one of the "paths" that I get from one SVG file is of the form:
<path d="M70 1810 c0 -91 3 -110 15 -110 12 0 15 17 15 95 l0 95 1405
0 1405 0 0 -410 0 -411 -87 3 -88 3 -1 35 c0 19 -1 124 -2 233 l-2 197
-70 0 -70 0 0 -320 0 -320 153 0 c83 0 162 3 175 6 l22 6 0 504 0 504
-1435 0 -1435 0 0 -110z m2647 -490 c1 -113 2 -217 2 -232 l1 -27 88 -3
87 -3 0 -70 0 -70 145 0 -145 0 -3 295 c-1 162 0 301 3 308 3 9 21 12
57 10 l53 -3 2 -205z"/>
I know the path is made up of relative cubic Bezier ("c") and straight-line ("l") segments, but I was wondering whether any existing algorithms are out there to process the "path" code and extract the relevant coordinates, etc.
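To show the kind of processing I mean, here is a rough sketch that walks a path "d" string and collects the absolute end point of each segment. It handles only the M, L, H, V, C and Z commands (upper and lower case), drops the Bezier control points, and ignores any transform on a parent element. The name `path_points` is made up, and this is not a complete SVG path parser.

```python
import re
from typing import List, Tuple

# A command letter, or a number (optional sign, decimals, exponent).
_TOKEN = re.compile(r"([MmLlHhVvCcSsQqTtAaZz])|([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")

# Numeric arguments consumed per segment by each supported command.
_ARGS = {"M": 2, "L": 2, "H": 1, "V": 1, "C": 6, "Z": 0}


def path_points(d: str) -> List[List[Tuple[float, float]]]:
    """Return one list of absolute (x, y) segment end points per subpath."""
    tokens = [letter or number for letter, number in _TOKEN.findall(d)]
    subpaths: List[List[Tuple[float, float]]] = []
    x = y = 0.0
    start = (0.0, 0.0)
    cmd = None
    i = 0
    while i < len(tokens):
        if tokens[i].isalpha():
            cmd = tokens[i]
            i += 1
            if cmd.upper() not in _ARGS:
                raise ValueError(f"unsupported command {cmd!r}")
            if cmd in "Zz":
                x, y = start  # closing returns the pen to the subpath start
            continue
        if cmd is None or cmd in "Zz":
            raise ValueError(f"number {tokens[i]!r} without a drawing command")
        n = _ARGS[cmd.upper()]
        args = [float(t) for t in tokens[i:i + n]]
        if len(args) < n:
            raise ValueError("path data ends in the middle of a segment")
        i += n
        relative, op = cmd.islower(), cmd.upper()
        if op != "M" and not subpaths:
            raise ValueError("path must start with a moveto")
        if op == "H":
            x = args[0] + (x if relative else 0.0)
        elif op == "V":
            y = args[0] + (y if relative else 0.0)
        else:  # M, L, C: the end point is the last coordinate pair
            ex, ey = args[-2], args[-1]
            x, y = (x + ex, y + ey) if relative else (ex, ey)
        if op == "M":
            subpaths.append([])
            start = (x, y)
            cmd = "l" if relative else "L"  # further pairs are implicit linetos
        subpaths[-1].append((x, y))
    return subpaths


# The start of the path above, closed early to keep the example short.
demo = path_points("M70 1810 c0 -91 3 -110 15 -110 12 0 15 17 15 95 l0 95 1405 0z")
```

For a traced image like mine, the resulting points seem to follow the outline of the thick black lines rather than their centre, so they would still need to be merged into the actual vertices.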
Thanks for your help!!