C# - Black points recognition from a photo

Question

I have some photos of white pages with some black points drawn on them, like this: photo (the points aren't very circular, but I can draw them better), and I would like to find the coordinates of these points. I can binarize the images (the previous photo binarized: image), but how can I find the coordinates of these black points? I only need the coordinates of one pixel for each point, the approximate center.

This is for a school assignment.

Is the background guaranteed to be pure white? And could the black points connect to each other? — Steve, May 10 '18 at 17:08
Seems like your actual problem is more along the line of detecting which pixels belong together... which is just a matter of making some kind of linking of neighbouring pixels. Basically, you're going to have to write some kind of "bucket fill" algorithm. Once you have that, assuming they are indeed dots, you just get the max and min values of X and Y in that collection and take the average. — Nyerguds, May 10 '18 at 17:09
@Steve yes, the background is surely pure white; if two points are connected I would treat them as a single point. — Tommaso, May 10 '18 at 17:12
You can use OpenCV's thresholding function to get a binary mask of the image, which splits the image into black and white parts. From this mask you can then analyze which pixels belong together with a shape approximation algorithm (also in OpenCV) and then get the center of the approximated shape. — jAC, May 10 '18 at 17:12
Do note you have some single floating pixels there... to avoid those showing up as extra matches you could scan for other pixels in an area _around_ your current pixel (with like a 2-3 pixel radius), rather than _exactly_ neighbouring, or you could combine any overlapping bounding rectangles of the detected collections. — Nyerguds, May 10 '18 at 17:27
I added my own answer to specifically go over the fill algorithm. I hope it'll be useful. — Nyerguds, May 11 '18 at 06:53

Steve · Answer 1 · 2018-05-10T21:12:19.763

Since it's for school work, I will only provide you with a high-level algorithm.

Since the background is guaranteed to be white, you are in luck.

First you need to define a threshold for the level of black which you want to consider as the black dot's color.

#ffffff is pure white and #000000 is pure black. I would suggest somewhere around #383838 as your threshold.

Then you make a two dimensional bool array to keep track of which pixel you have visited already.

Now we can start looking at the picture.

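You read the pixels one at a time, row by row, and check whether each pixel is darker than the threshold (its value is below the threshold, since lower values are darker). If it is, you do a DFS or BFS to find the entire connected area of neighbouring pixels that are also darker than the threshold.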

During the process you will be marking the bool array we created earlier to indicate that you have already visited the pixel.

Since it's a roughly circular point, you can just take the min and max of the x and y coordinates and calculate the center point from those.

Once you are done with one point, you keep looping through the picture's pixels to find dark pixels that you have not visited yet (false in the bool array).

Since the points in your photo have some small stray dots around the edge which are not connected to the large point, you might have to do some math to check that the radius is > some number before you consider it a valid point. Or, instead of a radius-1 neighbour search, you do a BFS/DFS with a 5 - 10 pixel neighbourhood to include the pixels that are really close to the main point.
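
To make the steps above concrete, here is a minimal sketch of the scan with a visited array and a BFS. It assumes the image has already been reduced to a bool[,] indexed as [y, x] where true means "darker than the threshold"; the names DotScanner, FindCenters and minSize are made up for this illustration, and the size filter is just one possible reading of the "radius > some number" idea.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

public static class DotScanner
{
    // dark[y, x] is true for pixels darker than the threshold (assumed input).
    // Blobs whose bounding box is narrower or lower than minSize are skipped.
    public static List<Point> FindCenters(bool[,] dark, int minSize)
    {
        int height = dark.GetLength(0), width = dark.GetLength(1);
        var visited = new bool[height, width];
        var centers = new List<Point>();
        var queue = new Queue<Point>();

        for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
        {
            if (!dark[y, x] || visited[y, x]) continue;

            // New unvisited dark pixel: flood outwards with a BFS.
            int minX = x, maxX = x, minY = y, maxY = y;
            visited[y, x] = true;
            queue.Enqueue(new Point(x, y));
            while (queue.Count > 0)
            {
                Point p = queue.Dequeue();
                minX = Math.Min(minX, p.X); maxX = Math.Max(maxX, p.X);
                minY = Math.Min(minY, p.Y); maxY = Math.Max(maxY, p.Y);

                for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                {
                    int nx = p.X + dx, ny = p.Y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    if (!dark[ny, nx] || visited[ny, nx]) continue;
                    visited[ny, nx] = true;
                    queue.Enqueue(new Point(nx, ny));
                }
            }

            // Center of the bounding box, ignoring blobs that are too small.
            if (maxX - minX + 1 >= minSize && maxY - minY + 1 >= minSize)
                centers.Add(new Point((minX + maxX) / 2, (minY + maxY) / 2));
        }
        return centers;
    }
}
```

Marking a pixel as visited at the moment it is queued, rather than when it is dequeued, keeps the same pixel from being queued more than once.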

Nyerguds · Accepted Answer · 2020-04-15T11:15:33.593

The basics for processing image data can be found in other questions, so I won't go into deeper detail about that, but for the threshold check specifically, I'd do it by gathering the red, green and blue bytes of each pixel (as indicated in the answer I linked), combining them into a Color c = Color.FromArgb(r,g,b), and then testing that to be "dark" using c.GetBrightness() < brightnessThreshold. A value of 0.4 was a good threshold for your test image.
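
As a small sketch of that check: the wrapper below just packages the two calls named above. The class and method names (DarknessCheck, IsDark) are made up for this example; how you obtain the r, g and b bytes depends on your own image-reading code.

```csharp
using System.Drawing;

public static class DarknessCheck
{
    // Returns true when the pixel's brightness (0 = black, 1 = white)
    // is below the given threshold.
    public static bool IsDark(byte r, byte g, byte b, float brightnessThreshold)
    {
        Color c = Color.FromArgb(r, g, b);
        return c.GetBrightness() < brightnessThreshold;
    }
}
```

For example, DarknessCheck.IsDark(0x38, 0x38, 0x38, 0.4f) treats a dark gray pixel as part of a dot, while pure white fails the check.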

You should store the result of this threshold detection in an array in which each item is a value that indicates whether the threshold check passed or failed. This means you can use something as simple as a two-dimensional Boolean array with the original image's height and width.

If you already have methods of doing all that, all the better. Just make sure you have some kind of array in which you can easily look up the result of that binarization. If the method you have gives you the result as an image, you will be more likely to end up with a simple one-dimensional byte array, but then your lookups will simply be of a format like imagedata[y * stride + x]. This is functionally identical to how internal lookups in a two-dimensional array happen, so it won't be any less efficient.
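
A rough sketch of that lookup, under the assumption that the binarized image is stored as one byte per pixel with 0 meaning black and anything else meaning background (your own binarization may use a different layout or convention). BinarizedLookup, IsDark and ToDarkMap are hypothetical names.

```csharp
using System;

public static class BinarizedLookup
{
    // Assumption: 8 bits per pixel, 0 = black (part of a dot).
    // 'stride' is the number of bytes per image row, including padding.
    public static bool IsDark(byte[] imageData, int stride, int x, int y)
    {
        return imageData[y * stride + x] == 0;
    }

    // Converts the flat buffer into a bool[height, width] lookup table.
    public static bool[,] ToDarkMap(byte[] imageData, int width, int height, int stride)
    {
        if (stride < width || imageData.Length < stride * (height - 1) + width)
            throw new ArgumentException("Buffer is too small for the given dimensions.");

        var dark = new bool[height, width];
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                dark[y, x] = IsDark(imageData, stride, x, y);
        return dark;
    }
}
```

Whether you convert to a two-dimensional array or keep using the flat buffer directly is a matter of taste; the conversion only makes the later code a little easier to read.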

Now, the real stuff in here, as I said in my comment, would be an algorithm to detect which pixels should be grouped together to one "blob".

The general usage of this algorithm is to loop over every single pixel on the image, then check if A) it cleared the threshold, and B) it isn't already in one of your existing detected blobs. If the pixel qualifies, generate a new list of all threshold-passed pixels connected to this one, and add that new list to your list of detected blobs. I used the Point class to collect coordinates, making each of my blobs a List<Point>, and my collection of blobs a List<List<Point>>.

As for the algorithm itself, what you do is make two collections of points. One is the full collection of neighbouring points you're building up (the points list), the other is the current edge you're scanning (the current edge list). The current edge list will start out containing your origin point, and the following steps will loop as long as there are items in your current edge list:

Add all items from the current edge list into the full points list.
Make a new collection for your next edge (the next edge list).
For each point in your current edge list, get a list of its directly neighbouring points (excluding any that would fall outside the image bounds), and check for all of these points if they clear the threshold, and if they are not already in either the points list or the next edge list. Add the points that pass the checks to the next edge list.
After this loop through the current edge list ends, replace the original current edge list by the next edge list.

...and, as I said, loop these steps as long as your current edge list after this last step is not empty.

This will create an edge that expands until it matches all threshold-clearing pixels, and will add them all to the list. Eventually, as all neighbouring pixels end up in the main list, the new generated edge list will become empty, and the algorithm will end. Then you add your new points list to the list of blobs, and any pixels you loop over after that can be detected as already being in those blobs, so the algorithm is not repeated for them.
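
A condensed sketch of the loop and the expanding-edge fill described above. This is an illustration, not the linked implementation: it assumes the threshold results are in a bool[,] indexed as [y, x], it always checks all eight surrounding pixels, and it uses a single shared bool[,] to track which pixels are already taken instead of separate arrays per list. BlobFinder, FindBlobs and CollectBlob are made-up names.

```csharp
using System.Collections.Generic;
using System.Drawing;

public static class BlobFinder
{
    // Loops over every pixel and starts a fill on each dark pixel
    // that is not part of an already detected blob.
    public static List<List<Point>> FindBlobs(bool[,] dark)
    {
        int height = dark.GetLength(0), width = dark.GetLength(1);
        var taken = new bool[height, width];
        var blobs = new List<List<Point>>();
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                if (dark[y, x] && !taken[y, x])
                    blobs.Add(CollectBlob(dark, taken, new Point(x, y)));
        return blobs;
    }

    private static List<Point> CollectBlob(bool[,] dark, bool[,] taken, Point origin)
    {
        int height = dark.GetLength(0), width = dark.GetLength(1);
        var points = new List<Point>();
        var currentEdge = new List<Point> { origin };
        taken[origin.Y, origin.X] = true;

        while (currentEdge.Count > 0)
        {
            // Step 1: the current edge becomes part of the blob.
            points.AddRange(currentEdge);
            // Step 2: start a fresh list for the next edge.
            var nextEdge = new List<Point>();
            // Step 3: collect dark, not-yet-taken neighbours of the edge.
            foreach (Point p in currentEdge)
            {
                for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                {
                    int nx = p.X + dx, ny = p.Y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    if (!dark[ny, nx] || taken[ny, nx]) continue;
                    taken[ny, nx] = true;
                    nextEdge.Add(new Point(nx, ny));
                }
            }
            // Step 4: the next edge becomes the current edge.
            currentEdge = nextEdge;
        }
        return points;
    }
}
```

Because every pixel is flagged in the shared array as soon as it enters an edge list, the "already in the points list or the next edge list" check becomes a single array lookup.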

There are two ways of doing the neighbouring points; you either get the four points around it, or all eight. The difference is that using four will not make the algorithm do diagonal jumps, while using eight will. (An added effect is that one causes the algorithm to expand in a diamond shape, while the other expands in a square.) Since you seem to have some stray pixels around your blobs, I advise you to get all eight.
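
To illustrate the difference, here is a small hypothetical helper (Neighbours.Get is not from any library) that returns either the four or the eight surrounding points, leaving out any that fall outside the image bounds.

```csharp
using System.Collections.Generic;
using System.Drawing;

public static class Neighbours
{
    // Returns the in-bounds neighbours of p. With includeDiagonals = false
    // only left, right, up and down are returned; with true, all eight.
    public static List<Point> Get(Point p, int width, int height, bool includeDiagonals)
    {
        var result = new List<Point>(8);
        for (int dy = -1; dy <= 1; dy++)
        {
            for (int dx = -1; dx <= 1; dx++)
            {
                if (dx == 0 && dy == 0) continue;                        // the point itself
                if (!includeDiagonals && dx != 0 && dy != 0) continue;   // skip diagonals
                int nx = p.X + dx, ny = p.Y + dy;
                if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                result.Add(new Point(nx, ny));
            }
        }
        return result;
    }
}
```

A corner pixel has only two direct neighbours and three when diagonals are included, which is why the bounds check matters.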

As Steve pointed out in his answer, a very quick way of doing checks to see if a point is present in a collection is to create a two-dimensional Boolean array with the dimensions of the image, e.g. Boolean[,] inBlob = new Boolean[height, width];, which you keep synchronized with the actual points list. So whenever you add a point, you also mark the [y, x] position in the Boolean array as true. This will make rather heavy checks of the if (collection.Contains(point)) type as simple as if (inBlob[y,x]), which requires no iterations at all.

I had a List<Boolean[,]> inBlobs which I kept synced with the List<List<Point>> blobs I built, and in the expanding-edge algorithm I kept such a Boolean[,] for both the next edge list and the points list (the latter of which was added to inBlobs at the end).

As I commented, once you have your blobs, just loop over the points inside them per blob and get the minimums and maximums for both X and Y, so you end up with the boundaries of the blob. Then just take the averages of those to get the center of the blob.
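
A minimal sketch of that last step, assuming each blob is a List<Point> as described earlier. BlobCenter.GetCenter is a made-up name, and the integer division simply rounds the center down to a whole pixel.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

public static class BlobCenter
{
    // Finds the bounding box of the blob and returns its middle point.
    public static Point GetCenter(List<Point> blob)
    {
        if (blob == null || blob.Count == 0)
            throw new ArgumentException("Blob must contain at least one point.");

        int minX = int.MaxValue, maxX = int.MinValue;
        int minY = int.MaxValue, maxY = int.MinValue;
        foreach (Point p in blob)
        {
            minX = Math.Min(minX, p.X);
            maxX = Math.Max(maxX, p.X);
            minY = Math.Min(minY, p.Y);
            maxY = Math.Max(maxY, p.Y);
        }
        return new Point((minX + maxX) / 2, (minY + maxY) / 2);
    }
}
```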

Extras:

If all your dots are guaranteed to be a significant distance apart, a very easy way to get rid of floating edge pixels is to take the edge boundaries of each blob, expand them all by a certain threshold (I took 2 pixels for that), and then loop over these rectangles and check if any intersect, and merge those that do. The Rectangle structure has both an IntersectsWith() method for easy checks, and a static Rectangle.Inflate method for increasing a rectangle's size.
You can optimise the memory usage of the fill method by only storing the edge points (threshold-matching points with non-matching neighbours in any of the four main directions) in the main list. The final boundaries, and thus the center, will remain the same. The important thing to remember then is that, while you exclude a bunch of points from the blob list, you should mark all of them in the Boolean[,] array that's used for checking the already-processed pixels. This doesn't take up any extra memory anyway.
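
The rectangle merging from the first extra could look roughly like this. It is a sketch under the assumption that you already have one bounding Rectangle per blob; BoundsMerger.MergeClose and the margin parameter are hypothetical names, and the repeated pairwise scan favours simplicity over speed.

```csharp
using System.Collections.Generic;
using System.Drawing;

public static class BoundsMerger
{
    // Merges rectangles whose versions inflated by 'margin' pixels intersect.
    // Repeats until no more pairs can be merged.
    public static List<Rectangle> MergeClose(IEnumerable<Rectangle> bounds, int margin)
    {
        var merged = new List<Rectangle>(bounds);
        bool changed = true;
        while (changed)
        {
            changed = false;
            for (int i = 0; i < merged.Count && !changed; i++)
            {
                for (int j = i + 1; j < merged.Count; j++)
                {
                    Rectangle a = Rectangle.Inflate(merged[i], margin, margin);
                    Rectangle b = Rectangle.Inflate(merged[j], margin, margin);
                    if (!a.IntersectsWith(b)) continue;

                    // Keep the union of the original (non-inflated) bounds.
                    merged[i] = Rectangle.Union(merged[i], merged[j]);
                    merged.RemoveAt(j);
                    changed = true;
                    break;
                }
            }
        }
        return merged;
    }
}
```

Restarting the scan after each merge makes sure a rectangle that has grown gets compared against all the others again.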

The full algorithm, including optimisations, in action on your photo, using 0.4 as brightness threshold:

Detected blobs

Blue are the detected blobs, red is the detected outline (by using the memory-optimised method), and the single green pixels indicate the center points of all blobs.

[Edit]

Since it's been almost a year since I posted this, I guess I might as well link to the implementation I made of this. I actually managed to use it myself about a month after I wrote it, when recreating the video compression algorithm of an old DOS game which used chunked up diff frames.
