OpenCV - How to see which algorithm is used behind a function? (imread)

Question

There are several ways in Python to generate a greyscale image from an RGB version. One of them is simply to read the image as greyscale using OpenCV:

im = cv2.imread(img, 0)

where 0 equals cv2.IMREAD_GRAYSCALE.

There are many different algorithms for this operation, which are well explained here.

I'm wondering how OpenCV handles this task and which algorithm stands behind cv2.IMREAD_GRAYSCALE, but I could find neither documentation nor a reference. Does anyone have an idea? A paper reference would be great.

Thanks in advance

p.s. I'm working with jpg and png.

https://github.com/opencv/opencv/blob/2558ab3de7cdd57c91935eb64755afb2afd05f00/modules/imgcodecs/src/loadsave.cpp#L617-L629 and from there you continue — Christoph Rackwitz, Sep 21 '21 at 15:39
one commit referencing IMREAD_GRAYSCALE says each specific codec does this conversion, if it can: https://github.com/opencv/opencv/commit/9253e8bda2f15cc53c5d4097bbe1ba6aaa1d6f9e — Christoph Rackwitz, Sep 21 '21 at 15:42
I suspect that you need to look for IMREAD_UNCHANGED or other values because IMREAD_GRAYSCALE is **0**, so that might have been a default at one time, before someone changed the default flags to something else — Christoph Rackwitz, Sep 21 '21 at 15:46
It's often relegated to the particular codec -- e.g. for PNG it's done by [`png_set_rgb_to_gray`](https://github.com/opencv/opencv/blob/master/modules/imgcodecs/src/grfmt_png.cpp#L276). The main driver (`imread_`) allocates a `Mat` with appropriate number of channels based on flags passed. This is then passed to implementation of `BaseImageDecoder` for particular codec, which then selects appropriate (implementation specific) behaviour based on the properties of that `Mat`. The `imread` flags don't get passed to that class at all. — Dan Mašek, Sep 21 '21 at 17:36
Here's JPEG: https://github.com/opencv/opencv/blob/master/modules/imgcodecs/src/grfmt_jpeg.cpp#L504, and so on, should see the pattern by now. — Dan Mašek, Sep 21 '21 at 17:38
@ChristophRackwitz I see that it uses this parameter. Unfortunately, I still don't see the code that answers my question of how the image is converted. — Jürgen K., Sep 21 '21 at 18:44
@DanMašek Oh, I think with your help I am at least getting closer to the answer, at least for png: png_set_rgb_to_gray( png_ptr, 1, 0.299, 0.587 ). Strangely, there is nothing comparable to this line for jpg, so I'm still not sure. — Jürgen K., Sep 21 '21 at 18:46
The function that the JPEG implementation uses (that I linked to in my second comment) is an OpenCV helper function: https://github.com/opencv/opencv/blob/master/modules/imgcodecs/src/utils.cpp#L356 | There's a whole bunch of those, for example BMP uses several of them, e.g. https://github.com/opencv/opencv/blob/master/modules/imgcodecs/src/grfmt_bmp.cpp#L473 — Dan Mašek, Sep 21 '21 at 18:49
@DanMašek The function you linked is for CMYK. For RGB there is none as far as I see. Imread takes jpg as RGB, not CMYK. The second is rather for printers — Jürgen K., Sep 21 '21 at 18:56
You kinda have to grok the whole function -- look a bit higher - https://github.com/opencv/opencv/blob/master/modules/imgcodecs/src/grfmt_jpeg.cpp#L433-L458. When the JPEG contains 4 channels, it decodes to CMYK and uses that function I mentioned. Otherwise, it sets up `cinfo` in such way that libjpeg does the conversion to grayscale itself. — Dan Mašek, Sep 21 '21 at 19:01
@DanMašek I kind of see what you mean. Could you mark me the formula which is used to combine the 3 channels? — Jürgen K., Sep 21 '21 at 19:07
That's an implementation detail of libjpeg -- the OpenCV codebase has a version of it [here](https://github.com/opencv/opencv/tree/master/3rdparty/libjpeg). Sorry, I don't have the time right now to dig through that codebase, since I'm not as familiar with it as I am with the OpenCV imgcodecs module. Maybe later, but don't hold your breath ;) — Dan Mašek, Sep 21 '21 at 19:17
On a quick look, maybe here? https://github.com/opencv/opencv/blob/master/3rdparty/libjpeg/jdcolor.c | Might be a good starting point, anyway. — Dan Mašek, Sep 21 '21 at 19:19
@DanMašek Wow, there are at least three functions that could potentially be involved. How is it possible that everybody uses the code without knowing what actually happens in such a simple case? — Jürgen K., Sep 21 '21 at 19:24
I think in case of libraries like this (canonical implementation from the people who designed JPEG) people expect the domain experts to get it right (and judging from this site, most people "using" third party libraries don't even read the documentation, let alone worry about fine details like this). | With some work you could narrow it down to what exactly it calls -- you just have to go through the several API calls involved in the decompression sequence and trace the code path taken when the `jpeg_decompress_struct` is configured in this way. | I really need to get to work now tho ;) — Dan Mašek, Sep 21 '21 at 19:45
Anyway, if you read closely through the comments in that file, you'll notice something familiar from the site you said that explains this -- `Y = 0.299 * R + 0.587 * G + 0.114 * B`. — Dan Mašek, Sep 21 '21 at 22:50

Grillteller · Answer 1 · 2021-09-29T13:31:35.217

I think @Dan Mašek has essentially answered the question already in the comment section.
I will try to summarize the findings for jpg files as an answer, and I welcome any improvements.

CMYK to Grayscale

To see how a CMYK jpg file is converted, we have to look into grfmt_jpeg.cpp. Similar files exist for the other image codecs. The decoder configures cinfo according to the number of color channels. For CMYK images, cinfo is set up for four output components, and the function icvCvt_CMYK2Gray_8u_C4C1R is called on line 504.
This function can be found in utils.cpp:

void icvCvt_CMYK2Gray_8u_C4C1R( const uchar* cmyk, int cmyk_step,
                                uchar* gray, int gray_step, Size size )
{
    int i;
    for( ; size.height--; )
    {
        for( i = 0; i < size.width; i++, cmyk += 4 )
        {
            int c = cmyk[0], m = cmyk[1], y = cmyk[2], k = cmyk[3];
            c = k - ((255 - c)*k>>8);
            m = k - ((255 - m)*k>>8);
            y = k - ((255 - y)*k>>8);
            int t = descale( y*cB + m*cG + c*cR, SCALE );
            gray[i] = (uchar)t;
        }
        gray += gray_step;
        cmyk += cmyk_step - size.width*4;
    }
}

It uses fixed-point constants for the conversion:

#define  SCALE  14
#define  cR  (int)(0.299*(1 << SCALE) + 0.5)
#define  cG  (int)(0.587*(1 << SCALE) + 0.5)
#define  cB  ((1 << SCALE) - cR - cG)
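
A minimal Python sketch of the same per-pixel arithmetic may make the fixed-point scheme easier to follow. The name cmyk_to_gray is made up for illustration, and descale is written here as a rounding right shift, which is an assumption about the macro of that name in the OpenCV sources rather than something shown above.

```python
SCALE = 14
C_R = int(0.299 * (1 << SCALE) + 0.5)  # 4899
C_G = int(0.587 * (1 << SCALE) + 0.5)  # 9617
C_B = (1 << SCALE) - C_R - C_G         # 1868, so the weights sum to exactly 1 << SCALE


def descale(x, n):
    # Assumed behaviour of OpenCV's descale macro: add half, then shift right.
    return (x + (1 << (n - 1))) >> n


def cmyk_to_gray(c, m, y, k):
    # Hypothetical helper mirroring the loop body of icvCvt_CMYK2Gray_8u_C4C1R.
    # Each channel is first combined with K, as in the C code.
    c = k - (((255 - c) * k) >> 8)
    m = k - (((255 - m) * k) >> 8)
    y = k - (((255 - y) * k) >> 8)
    # After that step, c is weighted like R, m like G and y like B.
    return descale(y * C_B + m * C_G + c * C_R, SCALE)
```

With these constants the weighted sum is the usual 0.299 / 0.587 / 0.114 split, just scaled by 2^14 so that the loop needs only integer multiplications and a shift.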

RGB/BGR to Grayscale

If your image contains only three color channels, it seems that libjpeg performs the conversion. This can be seen in line 717. (I am not 100% sure that this is the correct line.)

In jdcolor.c, the definitions and standards for converting color channels are described starting from line 41.

The most important part for your specific question is:

the conversion equations to be implemented are therefore
R = Y + 1.402 * Cr
G = Y - 0.344136286 * Cb - 0.714136286 * Cr
B = Y + 1.772 * Cb
Y = 0.299 * R + 0.587 * G + 0.114 * B

These equations relate to ITU-R standards and are used in many other sources I found. More detailed information can be found here and here.

The second source, a Stack Overflow question, makes it clear that the conversion depends not only on the raw RGB values but also on other parameters such as the gamma value.

The standard OpenCV uses seems to be Rec. 601.
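
The equations quoted from jdcolor.c can be tried out with a small Python sketch. The function names are made up for illustration, Cb and Cr are treated as centered on zero (a simplification), and rgb_to_ycc is obtained by solving the quoted R and B equations for Cr and Cb. It is not a copy of libjpeg's table-driven implementation.

```python
def rgb_to_luma(r, g, b):
    # Rec. 601 luma: Y = 0.299 * R + 0.587 * G + 0.114 * B
    return 0.299 * r + 0.587 * g + 0.114 * b


def rgb_to_ycc(r, g, b):
    # Solve R = Y + 1.402 * Cr and B = Y + 1.772 * Cb for Cr and Cb.
    y = rgb_to_luma(r, g, b)
    cb = (b - y) / 1.772
    cr = (r - y) / 1.402
    return y, cb, cr


def ycc_to_rgb(y, cb, cr):
    # The three inverse equations from the quoted comment.
    r = y + 1.402 * cr
    g = y - 0.344136286 * cb - 0.714136286 * cr
    b = y + 1.772 * cb
    return r, g, b
```

For example, rgb_to_luma(0, 255, 0) gives about 149.7, which shows how strongly green dominates the grey value compared with red (about 76.2) and blue (about 29.1).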

user16930239 · Answer 2 · 2021-09-24T17:30:18.540


In the OpenCV documentation you can find:

IMREAD_GRAYSCALE = 0,  //!< If set, always convert image to the single channel grayscale image (codec internal conversion).

Also:

When using IMREAD_GRAYSCALE, the codec's internal grayscale conversion will be used, if available. Results may differ to the output of cvtColor()

So it depends on the codec's internal grayscale conversion.
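
As a rough illustration of why results can differ between implementations, the sketch below applies the same weights with two different rounding rules. Both functions are hypothetical and do not reproduce any particular codec or cvtColor(). They only show that such a detail is enough to change a pixel value by one grey level.

```python
def gray_truncate(r, g, b):
    # Hypothetical converter A: weighted sum, fractional part dropped.
    return int(0.299 * r + 0.587 * g + 0.114 * b)


def gray_round(r, g, b):
    # Hypothetical converter B: same weights, rounded to the nearest integer.
    return int(0.299 * r + 0.587 * g + 0.114 * b + 0.5)


pixel = (0, 255, 0)  # pure green, weighted sum is about 149.685
print(gray_truncate(*pixel), gray_round(*pixel))  # prints: 149 150
```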

More info from the OpenCV documentation:

When using IMREAD_GRAYSCALE, the codec's internal grayscale conversion will be used, if available. Results may differ to the output of cvtColor().

On Microsoft Windows* OS and MacOSX*, the codecs shipped with an OpenCV image (libjpeg, libpng, libtiff, and libjasper) are used by default. So, OpenCV can always read JPEGs, PNGs, and TIFFs. On MacOSX, there is also an option to use native MacOSX image readers. But beware that currently these native image loaders give images with different pixel values because of the color management embedded into MacOSX.

On Linux*, BSD flavors and other Unix-like open-source operating systems, OpenCV looks for codecs supplied with an OS image. Install the relevant packages (do not forget the development files, for example, "libjpeg-dev", in Debian* and Ubuntu*) to get the codec support or turn on the OPENCV_BUILD_3RDPARTY_LIBS flag in CMake.

edited Sep 24 '21 at 17:30

answered Sep 24 '21 at 00:34


So which algorithm stands behind the conversion? For jpg or png, for example. – Jürgen K. Sep 24 '21 at 09:11
The codec on your device, if available, is responsible for this. For example, the codecs used for reading JPEG files on Windows differ from those on MacOSX, so they give images with different pixel values when reading a JPEG. Check out https://docs.opencv.org/3.4.13/d4/da8/group__imgcodecs.html#ga288b8b3da0892bd651fce07b3bbd3a56 `When using IMREAD_GRAYSCALE, the codec's internal grayscale conversion will be used, if available. Results may differ to the output of cvtColor()` – user16930239 Sep 24 '21 at 17:27
"there is also an option to use native MacOSX image readers" - so this is optional – Jürgen K. Sep 25 '21 at 14:52
@JürgenK. Yes, it is optional, but on Linux, for example, OpenCV will use the OS default codecs if installed. – user16930239 Sep 25 '21 at 14:57
actually, OpenCV will *probably* bring its *own* builds of various libraries, such as libjpeg, libpng, ... so the system can't affect that. – Christoph Rackwitz Sep 26 '21 at 19:02
