I have a large collection of image files for a book, and the publisher wants a list where files are classified by "type" (greyscale graph, b/w halftone image, color image, line drawing, etc.). This is a hard problem in general, but perhaps I can do some of this automatically using image processing tools, e.g., ImageMagick with the R magick package.
I think ImageMagick is the right tool, but I don't really know how to use it for this purpose.
What I have is just a list of fig numbers & file names:
1.1 ch01-intro/fig/alcohol-risk.jpg
1.2 ch01-intro/fig/measels.png
1.3 ch01-intro/fig/numbers.png
1.4 ch01-intro/fig/Lascaux-bull-chamber.jpg
...
Can someone help get me started?
Edit: This was probably an ill-framed or overly broad question as initially stated. I thought that ImageMagick identify or the R magick::image_info() function could help, so the initial question perhaps should have been: "How to extract image information from a list of files [in R]". I can pose this separately, if not already asked.
An initial attempt at this gave me the following for my first images,
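library(magick)

# `files` is assumed to be a character vector of image paths,
# e.g. the second column of the figure list above.

# Initialize an empty data frame to hold the results of `image_info()`
figinfo <- data.frame(
  format = character(),
  width = numeric(),
  height = numeric(),
  colorspace = character(),
  matte = logical(),
  filesize = numeric(),
  density = character(),
  stringsAsFactors = FALSE
)

# Read each image and store its one-row info record
for (i in seq_along(files)) {
  img <- image_read(files[i])
  info <- image_info(img)
  figinfo[i, ] <- info
}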
I get:
> figinfo
  format width height colorspace matte filesize density
1   JPEG   661    733       sRGB FALSE    41884   72x72
2    PNG   838    591       sRGB  TRUE    98276   38x38
3    PNG   990    721       sRGB  TRUE   427253   38x38
4   JPEG   798    219       sRGB FALSE    99845 300x300
I conclude that this doesn't help much in answering the question I posed, of how to classify these images.
Edit 2: Before closing this question, I should note that the advice to look into direct use of ImageMagick identify was helpful (see https://imagemagick.org/script/escape.php). In particular, the %[type] attribute is closer to what I need. This is not exposed in magick::image_info(), so I may have to write a shell script or call system() in a loop.
For the record, here is how I can extract relevant attributes of these image files using identify directly.
# Get image characteristics via ImageMagick identify
# from: https://imagemagick.org/script/escape.php
#
# -format elements:
# %m image file format
# %f filename
# %[type] image type
# %k number of unique colors
# %h image height in pixels
# %w image width in pixels
# %r image class and colorspace
identify -format "%m,%f,%[type],%r,%k,%hx%w" imagefile

> identify -format "%m,%f,%[type],%r,%k,%hx%w" Quipu.png
PNG,Quipu.png,GrayscaleAlpha,DirectClass Gray Matte,16,449x299
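To run this over the whole figure list rather than one file at a time, something like the following shell function might be a starting point. It is only a sketch: list_image_types is a name I made up, and it assumes the list is a text file with one "fig-number path" pair per line and no spaces in the paths.

```shell
#!/bin/sh
# Sketch: run `identify` on every file named in a figure list.
# Assumptions (not from any official recipe): each line of the list is
# "<fig number> <path>", and paths contain no spaces.
# `list_image_types` is a hypothetical helper name.
list_image_types() {
    # $1: path to the figure list
    while read -r fig path; do
        # Skip blank lines and placeholder lines such as "..."
        [ -n "$path" ] || continue
        # </dev/null keeps identify from consuming the list on stdin;
        # files that identify cannot read are skipped
        info=$(identify -format '%m,%[type],%r,%k,%hx%w' "$path" </dev/null) || continue
        printf '%s,%s,%s\n' "$fig" "$path" "$info"
    done < "$1"
}
```

Calling list_image_types figlist.txt > figinfo.csv (with figlist.txt standing in for wherever the list is stored) would give one comma-separated line per figure, which could then be read into R with read.csv(header = FALSE).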
The %[type] attribute takes me towards what I want.
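As a first pass, the %[type] value could be collapsed into the coarse categories the publisher cares about. The mapping below is my own guess, not something ImageMagick defines, and coarse_class is a hypothetical name. It cannot tell a greyscale graph from a halftone photo, or a line drawing from any other two-tone image, so those would still need checking by eye.

```shell
#!/bin/sh
# Sketch: collapse an ImageMagick image type (the %[type] value) into a
# coarse category. The category labels and the mapping are assumptions.
# `coarse_class` is a hypothetical helper name.
coarse_class() {
    case "$1" in
        # Two-level images: candidates for line drawings
        Bilevel|PaletteBilevel*)              echo "black-and-white" ;;
        # Grayscale, GrayscaleAlpha, GrayscaleMatte
        Grayscale*)                           echo "greyscale" ;;
        # Indexed, RGB and CMYK variants
        Palette*|TrueColor*|ColorSeparation*) echo "color" ;;
        # Anything else is left for manual inspection
        *)                                    echo "unknown" ;;
    esac
}
```

For example, coarse_class "$(identify -format '%[type]' Quipu.png)" would print greyscale for the file shown above, since its type is GrayscaleAlpha.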