I have a huge set of black and white images with spots, and I want to replace every spot with a single pixel (or small circle) located at the geometric center of the spot.

enter image description here

Because the spots have different sizes, I cannot simply use the dilate operator, because it completely deletes small spots. Is there any way to do this automatically with ImageMagick?

Ivan Z
  • You can use -connected-components to find the centroids of each region and the bounding boxes. Use the bounding box to approximate the diameter of the circle or draw the same size circles for each spot. Use -draw to draw the circles. But be careful, it will mess with your arc on the right side unless you threshold to filter it out on the bounding box sizes. – fmw42 Mar 21 '23 at 18:34
  • Completely different way: Using Gimp, create a color selection on black, convert the selection to a path (so you get a collection of tiny polygons), and compute the centroid of these. If needed, create a new layer and paint the pixel at these coordinates (but maybe you just want the coordinate, in which case you can write it to a file directly). Of course if you have a non-convex spot the centroid can be outside... – xenoid Mar 21 '23 at 18:49
  • @xenoid, I have some thousands files and need a script that does all work automatically. – Ivan Z Mar 21 '23 at 19:11
  • @fmw42, can you write a script example? – Ivan Z Mar 21 '23 at 19:13
  • @IvanZ you can script Gimp and make it execute things in batch. What you want to do is less than 20 lines of Python code. The rest is explained [here](https://stackoverflow.com/questions/44430081/how-to-run-python-scripts-using-gimpfu-from-windows-command-line/44435560#44435560). – xenoid Mar 21 '23 at 20:31
  • Kindly state your operating system. Thanks. – Mark Setchell Mar 21 '23 at 21:44

2 Answers

Kudos (and upvotes) to Fred (@fmw42) for the original technique. As much for my own curiosity as anything (and to document the approach), I wanted to make a variant that is hopefully more CPU-friendly, I/O friendly and maybe more portable. This should also have advantages given that OP has large numbers of images to process.

I worked on the following aspects:

  • rather than repeatedly load the image, draw a circle and re-save for every circle, I wanted to use a script that loads it once, draws all the circles and saves it

  • reduce dependency on other tools, and their process creation times, so all the cut, grep, tr, convert, echo and so on are replaced by a single awk invocation, leveraging its built-in ability to split fields, process text, and do math. Hopefully this also makes it easier to port to Windows, as fewer binaries are needed.

So, it looks like this:

#!/bin/bash

magick black_spots.png \
   -threshold 50% -type bilevel \
   -define connected-components:mean-color=true \
   -define connected-components:area-threshold=0-300 \
   -define connected-components:verbose=true \
   -connected-components 8 null: | awk -F'[ x+]*' '
      BEGIN       { print "black_spots.png -fill white -colorize 100 -fill black" }
      /gray\(0\)/ {
                    w=$3; h=$4; x=$5; y=$6; cx=x+w/2; cy=y+h/2
                    printf("-draw \"translate %f,%f circle 0,0 0,5\"\n", cx, cy)
                  }
      END         { print "-write result.png"}
   ' | magick -script -

enter image description here


The awk part in the middle actually generates a script that looks like this:

black_spots.png -fill white -colorize 100 -fill black
-draw "translate 1429.000000,368.000000 circle 0,0 0,5"
-draw "translate 6.000000,1026.500000 circle 0,0 0,5"
-draw "translate 739.500000,378.000000 circle 0,0 0,5"
...
...
-write result.png

That is then piped into magick -script at the end. Hopefully it is clear that the input file is only read once, all the circles are drawn, then the output file is written - just once.
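
To see the generated script without running ImageMagick at all, the awk stage can be fed a canned listing. This is only a sketch: the `sample_cc_output` listing below is made up for illustration (the real one comes from `-connected-components` with `verbose=true`), the function names are hypothetical, and the separator is written as `[ x+]+` rather than `[ x+]*` on the assumption that avoiding empty matches behaves the same across awk implementations.

```shell
#!/bin/bash
# Hypothetical sample of verbose connected-components output (values made up).
sample_cc_output() {
cat <<'EOF'
Objects (id: bounding-box centroid area mean-color):
  0: 1500x1100+0+0 749.5,549.5 1640000 gray(255)
  12: 14x12+1422+362 1428.6,367.7 131 gray(0)
  40: 12x13+0+1020 5.5,1026.1 120 gray(0)
EOF
}

# Turn the listing on stdin into a magick script; $1 is the input image name.
# Only gray(0) (black) regions produce a -draw line, placed at the centre of
# the bounding box: x + w/2, y + h/2.
make_script() {
  awk -F'[ x+]+' -v img="$1" '
    BEGIN       { print img " -fill white -colorize 100 -fill black" }
    /gray\(0\)/ { printf("-draw \"translate %f,%f circle 0,0 0,5\"\n", $5+$3/2, $6+$4/2) }
    END         { print "-write result.png" }
  '
}

sample_cc_output | make_script black_spots.png
```

With this sample, the header line and the white `gray(255)` background region are skipped, and the two black regions give `translate 1429.000000,368.000000` and `translate 6.000000,1026.500000`. Inspecting the text this way is a cheap check before piping it into `magick -script -`.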


Some notes on the awk parts:

  • -F'[ x+]*' means that multiple spaces, the letter x and + signs should all be treated as field separators

  • BEGIN and END blocks are executed once at the start and end of the awk script

  • the /gray\(0\)/ block is executed only on lines containing gray(0)
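
As a quick way to check which field number holds what, here is a small sketch that prints every field of one line. The sample line is made up, `show_fields` is a hypothetical helper, and the separator uses `+` instead of `*` as a portability assumption.

```shell
#!/bin/bash
# Hypothetical line in the shape "id: WxH+X+Y centroid area mean-color".
line='  12: 14x12+1422+362 1428.6,367.7 131 gray(0)'

# Print each field with its number so the $3..$6 indices can be verified.
show_fields() {
  awk -F'[ x+]+' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

printf '%s\n' "$line" | show_fields
```

Because the line starts with spaces and the separator is a regex, `$1` comes out empty and `$2` is the id (`12:`). That is why the width, height, x and y offsets are `$3`, `$4`, `$5` and `$6`, with the centroid in `$7`.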


As regards processing large numbers of files, I would use GNU Parallel, but you have not indicated your operating system. Basically, modify the above script to accept a filename as a parameter, save it as ProcessOne, make it executable with chmod +x ProcessOne, then run:

parallel ./ProcessOne ::: *.png

and it will keep all your CPU cores busy processing all your files in parallel till they are all done. You can get progress bars and ETAs with various switches:

parallel --eta ...         # show ETA
parallel --progress ...    # report progress
parallel --bar ...         # add progress bar
parallel -j 4 ...          # just run 4 jobs in parallel
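
A minimal sketch of what `ProcessOne` might look like, assuming ImageMagick 7 (`magick`) is on the PATH. The `_result.png` naming and the helper names `out_name` and `process_one` are my assumptions, not part of the script above, and filenames are assumed to contain no whitespace since they are written unquoted into the generated script.

```shell
#!/bin/bash
# ProcessOne: per-file wrapper (sketch). Usage: ./ProcessOne image.png

# Derive the output name (naming scheme is an assumption):
# spots/a.png -> spots/a_result.png
out_name() {
  printf '%s_result.png\n' "${1%.*}"
}

# Same pipeline as the script above, with input and output names passed
# into awk as variables instead of being hard-coded.
process_one() {
  src="$1"
  dst="$(out_name "$src")"
  magick "$src" \
     -threshold 50% -type bilevel \
     -define connected-components:mean-color=true \
     -define connected-components:area-threshold=0-300 \
     -define connected-components:verbose=true \
     -connected-components 8 null: | awk -F'[ x+]+' -v src="$src" -v dst="$dst" '
        BEGIN       { print src " -fill white -colorize 100 -fill black" }
        /gray\(0\)/ { printf("-draw \"translate %f,%f circle 0,0 0,5\"\n", $5+$3/2, $6+$4/2) }
        END         { print "-write " dst }
     ' | magick -script -
}

# Do nothing unless a filename was given on the command line.
if [ "$#" -ge 1 ]; then
  process_one "$1"
fi
```

Writing the results next to the inputs means a second `parallel ./ProcessOne ::: *.png` run would also pick up the `*_result.png` files, so a separate output directory may be the better choice for thousands of images.
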
Mark Setchell
  • Nice solution. But I'm working in Windows. Can this script work there? – Ivan Z Mar 22 '23 at 16:50
  • It should be possible to make it work under Windows. First, you would need to install **ImageMagick**, then you'd need to install `gawk`. If you can do that, and let me know when you have done it, we can work out together how to adapt it to Windows. – Mark Setchell Mar 22 '23 at 17:36
  • Alternatively, it should be possible to do it using Python with **OpenCV** installed if that is easier for you. It could use multiprocessing internally too in order to speed things up for large numbers of files. – Mark Setchell Mar 22 '23 at 19:53
  • I will try to use **ImageMagick** with `awk`. – Ivan Z Mar 22 '23 at 20:18
  • Mark, it looks like you are drawing at the center of the bounding boxes and not the centroids of the regions. Is that correct? From Wikipedia: "In mathematics and physics, the centroid is also known as geometric center". Connected components reports the centroid in addition to the bounding box. So you could extract that directly rather than compute the centers of the bounding boxes. – fmw42 Mar 22 '23 at 21:26
  • Mark, I am not an AWK expert. But you have a line ` /gray\(0\)/ {` that looks to me out of place that likely was a copy and paste error from my grep "gray(0)". If not a typo, please explain further. – fmw42 Mar 22 '23 at 21:32
  • @fmw42 Yes, that's what my comment was referring to below your answer. OP asked for *"geometric centre"* which I interpreted as being the centre of the rectangle containing the blob. If I wanted the centre as calculated by weighted distances (moments) I would pretty much always say *"centroid"*. Maybe OP will clarify - it's an easy alteration, if necessary. Thank you for checking. – Mark Setchell Mar 22 '23 at 21:34
  • That is a regex that means the following block enclosed within `{...}` is only executed on lines containing `gray(0)`. The parentheses are escaped by preceding them with backslashes, and the whole thing is enclosed in `/.../`, which is how `awk` likes it. – Mark Setchell Mar 22 '23 at 21:37
  • Thanks. I did not know you could do that. – fmw42 Mar 22 '23 at 21:38
  • So `awk '/mark/ {print $0}' file` is equivalent to `grep mark file`, and you can have multiple, different patterns: `awk '/mark/{print "good"} /fred/{print "also good"}' file` – Mark Setchell Mar 22 '23 at 21:39
  • @fmw42 I'm not disagreeing or trying to argue, but my (maybe English) interpretation was like this https://www.researchgate.net/figure/Geometric-center-vs-centroid_fig3_280841445 – Mark Setchell Mar 22 '23 at 21:52
  • Interesting. I was going by Wikipedia and my interpretation of geometric center. My Google search found Wikipedia as the first thing. We will have to see what the OP actually wanted. – fmw42 Mar 22 '23 at 22:12
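
For reference, a sketch of the "easy alteration" discussed in these comments: taking the centroid that connected components already reports (field 7 with this separator) instead of computing the centre of the bounding box. The sample line and the function names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical line in the shape "id: WxH+X+Y centroid area mean-color".
line='  12: 14x12+1422+362 1428.6,367.7 131 gray(0)'

# Centre of the bounding box: x + w/2, y + h/2.
draw_at_bbox_centre() {
  awk -F'[ x+]+' '/gray\(0\)/ { printf("-draw \"translate %.1f,%.1f circle 0,0 0,5\"\n", $5+$3/2, $6+$4/2) }'
}

# Centroid exactly as reported in the listing (already in "cx,cy" form).
draw_at_centroid() {
  awk -F'[ x+]+' '/gray\(0\)/ { printf("-draw \"translate %s circle 0,0 0,5\"\n", $7) }'
}

printf '%s\n' "$line" | draw_at_bbox_centre
printf '%s\n' "$line" | draw_at_centroid
```

For this sample the two centres are 1429.0,368.0 and 1428.6,367.7, so for small, compact spots the choice makes under a pixel of difference. It matters more for irregular or elongated regions.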

Here is how to use -connected-components in ImageMagick to put circles where the spots are located.

This is a Unix Bash script using ImageMagick.

  1. Save the internal field separator (IFS)
  2. Set the internal field separator to a newline rather than a space
  3. Create an array holding the data and read the input
  4. Make a copy of the input, fill it with white and save it
  5. Threshold the input and convert it to bilevel
  6. Set the output of connected components to the mean color of the regions
  7. Threshold the regions to those that are between 0 and 300 pixels in area (this will discard your long arc on the right for the most part; it may also throw out large spots, so change as desired)
  8. Set connected components to verbose output to get the textual data
  9. Set the connected components type to 8-connected rather than 4-connected
  10. Keep only the black regions and filter to get only the second and third fields, which are the bounding box and the centroid
  11. Get the number of lines of data (one line per region)
  12. Print the number of lines of data to the terminal
  13. Start a for loop over each line of data
  14. Get the bounding box width
  15. Get the bounding box height
  16. Compute the radius from the bounding box width and height
  17. Get the centroid
  18. Print the line number, width, height, radius and centroid to the terminal
  19. Draw a black circle on the white image for each line of data, i.e. using the centroid and radius
  20. End the loop
  21. Restore the internal field separator to its saved value

Input:

enter image description here

OLDIFS=$IFS
IFS=$'\n'
dataArr=(`convert black_spots.png \
\( +clone -fill white -colorize 100 -write black_spots_result.png +delete \) \
-threshold 50% -type bilevel \
-define connected-components:mean-color=true \
-define connected-components:area-threshold=0-300 \
-define connected-components:verbose=true \
-connected-components 8 null: |\
grep "gray(0)" | awk '{print $2, $3}' | tr "x" "+"`)
num=${#dataArr[*]}
echo $num
for ((i=0; i<num; i++)); do
ww=`echo ${dataArr[$i]} | cut -d\  -f1 | cut -d+ -f1`
hh=`echo ${dataArr[$i]} | cut -d\  -f1 | cut -d+ -f2`
rad=`convert xc: -format "%[fx:($ww+$hh)/4]" info:`
cent=`echo ${dataArr[$i]} | cut -d\  -f2`
echo "$i $ww $hh $rad $cent"
convert black_spots_result.png -fill black -draw "translate $cent circle 0,0 0,$rad" black_spots_result.png
done
IFS=$OLDIFS
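
The per-region arithmetic in the loop can be checked in isolation. The sketch below parses one made-up `dataArr` element with shell parameter expansion and computes the radius with awk instead of `convert xc: ... info:`, which is a substitution on my part; `parse_entry` is a hypothetical helper name. The radius is (width + height) / 4, i.e. half of the average of the two bounding box sides.

```shell
#!/bin/bash
# Hypothetical dataArr element: "W+H+X+Y cx,cy" (after the tr "x" "+" step).
entry='14+12+1422+362 1428.6,367.7'

# Split one entry into width, height, radius and centroid.
parse_entry() {
  bbox=${1%% *}          # "14+12+1422+362"
  cent=${1#* }           # "1428.6,367.7"
  ww=${bbox%%+*}         # width:  text before the first +
  rest=${bbox#*+}
  hh=${rest%%+*}         # height: text between the first and second +
  rad=$(awk -v w="$ww" -v h="$hh" 'BEGIN { printf "%g", (w + h) / 4 }')
  echo "$ww $hh $rad $cent"
}

parse_entry "$entry"
```

For the sample entry this prints `14 12 6.5 1428.6,367.7`, which is the same set of values the loop echoes after the line number.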

Result:

enter image description here

fmw42
  • Nice solution - you already have my vote. It probably doesn't matter for small shapes (and you do indeed have a threshold of 300px) but technically the replacement circle is placed at the centroid rather than at the geometric centre as OP asked. – Mark Setchell Mar 22 '23 at 11:23
  • Mark, it is the centroid as reported in -connected-components. – fmw42 Mar 22 '23 at 15:04