0

To begin, just note that I'm avare of Accessing certain pixel RGB value in openCV and Pixel access in OpenCV 2.2. Neither of those address my problem.

What I would like to do, is to reliably read / write the pixel RGB values in OpenCV's IplImage. So far, I have seen this:

uint8_t cr = CV_IMAGE_ELEM(color_img, uint8_t, y, x * color_img->nChannels + 2);
uint8_t cg = CV_IMAGE_ELEM(color_img, uint8_t, y, x * color_img->nChannels + 1);
uint8_t cb = CV_IMAGE_ELEM(color_img, uint8_t, y, x * color_img->nChannels + 0);

Now that seems plain wrong to me.

const uint8_t *p_color_pixel = cvPtr2D(color_img, y, x, NULL);
uint8_t cr = p_color_pixel[2], cg = p_color_pixel[1], cb = p_color_pixel[0];

This is still a bit painful, but probably the fastest way. One could also use cvGet2D(), although that also seems to involve the awkward multiplication of x by the number of channels.

Anyway, my issue is that those are all quite "unsafe", in the sense that the image may be RGB or BGR or ARGB. Also, the pixels may not be uint8_t. So the result feels still pretty random, depending on what kind of image and format I'm reading. I'm actually processing images that store some 24-bit integer values, packed to RGB, so the correct order is crucial (a wrong order will essentially yield noise). The only approach I've seen that seems "semi-safe" so far is quite morbid, both code and performance-wise:

CvSize t_image_size = cvGetSize(image);
IplImage *ir = cvCreateImage(t_image_size, 8, 1);
IplImage *ig = cvCreateImage(t_image_size, 8, 1);
IplImage *ib = cvCreateImage(t_image_size, 8, 1);
cvSplit(image, left_b, left_g, left_r, 0);

double cr = cvGetReal2D(ir, x, y),
    cg = cvGetReal2D(ig, x, y), cvGetReal2D(ib, x, y);

cvReleaseImage(&ir);
cvReleaseImage(&ig);
cvReleaseImage(&ib);

I mean ... come on, we are in the 21st century, why is it so damn hard to get correct RGB?!

Note that I'd prefer using the "C" interface, as the C++ one is quite unstable at the moment. But if there was a reliable way in C++, I'd take it.

Community
  • 1
  • 1
the swine
  • 10,713
  • 7
  • 58
  • 100
  • in both, c and c++ you dont know what kind of data layout there is BGR/RGB. You should know that from the place where you got the images from ;) in IplImages i usually processed like: `for(rows)for(cols)for(channels){compute array position and read pixel/channel value}` (you can precompute in each loop level one par of the coordinate though) – Micka Oct 14 '14 at 16:10
  • 1
    your "safe" code sample will fail too if it's RGB instead of BGR and/or not 24 bit?!? – Micka Oct 14 '14 at 16:13
  • @Micka I think the color depth should be handled correctly in the "safe" mode, as the pixels are accessed using `cvGetReal2D` which needs to perform the conversion (if not, that's a bug in OpenCV right there). It is true that `cvSplit` also requires me to know the `RGB` order, you're right. – the swine Oct 14 '14 at 16:26
  • "Note that I'd prefer using the "C" interface, as the C++ one is quite unstable at the moment" - please do not. the outdated c-wrappers you want to use are mostly expensive wrappers around c++ functionality anyway. – berak Oct 14 '14 at 16:41
  • @berak Well, there is a large code base, using the "C" interface. I don't see it going away anytime soon. On the contrary, the C++ interface changed considerably in the past few `2.x` versions, we had to rewrite code and put `#ifdef`s everywhere. I don't see how that is worth it, considering maintenance and compatibility (what happens if someone tries to build our code with OpenCV version we did not test with?). That's a no-go for me. Maybe in a couple of years. – the swine Oct 14 '14 at 16:44
  • @berak Actually, the fact that they went through the trouble of wrapping their C++ functions in "C" makes it likely that they will not remove the "C" interface - they would have done it now and saved the trouble. – the swine Oct 14 '14 at 16:47
  • yea, sure, the code from the book still has to work, and such. still a dead end (unless, ofc, you're happy with the 1.1 subset) – berak Oct 14 '14 at 16:48
  • @berak I see that you're skilled in OpenCV. Maybe you could shed some light on http://stackoverflow.com/questions/18765406/opencv-2-4-6-sift-keypoints-detection-using-a-lot-of-memory-on-ios and win the bounty? – the swine Oct 14 '14 at 16:56
  • @the swine I didnt check the sourcecode of `cvSplit` but it will either reallocate your created 8 bit IplImage or convert (where does it get the depth information of the original image from? could you use this information yourself directly?) or it might fail. – Micka Oct 15 '14 at 07:38
  • @Micka Yes, in another words, If I wanted to implement it myself, I could do a better job than OpenCV does. Sure. – the swine Oct 15 '14 at 07:41
  • what prevents you from calculating the pixel position in memory (known x/y/nchannels/depth) and read/write your data type?!? if you dont know whether it's 32 bit int or 32 bit float you've lost anyways?!? – Micka Oct 15 '14 at 08:06
  • Yes, again, if I wanted to implement it myself, I could do a better job than OpenCV does. – the swine Oct 15 '14 at 09:02
  • no it's not a better job just different focus. opencv is focused on real time processing and therefore optimized pixel access. You wont get both (speed and robustness/ease-of-use) if you always extract channel ordering for each pixel access (as you do in your answer). – Micka Oct 15 '14 at 13:23
  • Extracting the order once is not slowing you down if you process a million pixels. Also, OpenCV could fix their little issues and state that the order *is* RGBA and never different and then no searching would be required in the first place. Some people think that if it's fast, it doesn't matter that it's flawed or hard to use. That's just rubbish. – the swine Oct 15 '14 at 15:05
  • As I said if you provide the precpmputed ordering for each call of your getter you'd probably want to provide other getters that use precomputed row-ptr information (e.g.), too. You'll end up in dozens of different getters ;) and yes, it's slowing you down (a bit) if you have 3-4 look-ups for each pixel access to achieve ordering. – Micka Oct 15 '14 at 20:20
  • I don't see it that way. Also, these days it is not really a problem, as all this can be templated away. In the old "C" days, this would have been done using preprocessor. Anyway, if the channel indices were variables as in my answer, it does not slow down the addressing logic at all, there is only a small constant cost for the initialization. Try to implement it yourself and see. – the swine Oct 16 '14 at 12:56

1 Answers1

0

So anyway, here's what we know. From IplImage documentation, we can see a few relevant fields:

nChannels // Number of channels. Most OpenCV functions support 1-4 channels.
depth // The supported depths are: IPL_DEPTH_8U, IPL_DEPTH_16U, ...
channelSeq // Ignored by OpenCV
colorModel // Ignored by OpenCV
dataOrder // 0 - interleaved color channels, 1 - separate color channels.
          // CreateImage() only creates images with interleaved channels.
          // For example, the usual layout of a color image is: b00r00g00b10r10g10

So at least the depth and the usual layout is knowable. What about the unusual layouts, though? We can also check imread documentation. Quite at the end, it says:

Note: In the case of color images, the decoded images will have the channels stored in B G R order.

So that seems to be it. There seems to be no "unusual" layout implemented in reading the images.

Let's see how cvGet2D() is implemented (OpenCV 2.4.7):

CV_IMPL CvScalar cvGet2D(const CvArr* arr, int y, int x)
{
    CvScalar scalar = {{0,0,0,0}};
    int type = 0;
    uchar* ptr;

    if(CV_IS_MAT(arr)) {
        CvMat* mat = (CvMat*)arr;
        if((unsigned)y >= (unsigned)(mat->rows) ||
           (unsigned)x >= (unsigned)(mat->cols))
            CV_Error(CV_StsOutOfRange, "index is out of range");
        type = CV_MAT_TYPE(mat->type);
        ptr = mat->data.ptr + (size_t)y * mat->step + x * CV_ELEM_SIZE(type);
    } else if(!CV_IS_SPARSE_MAT(arr))
        ptr = cvPtr2D(arr, y, x, &type);
    else {
        int idx[] = {y, x};
        ptr = icvGetNodePtr((CvSparseMat*)arr, idx, &type, 0, 0);
    }
    if(ptr)
        cvRawDataToScalar(ptr, type, &scalar);
    return scalar;
}

That's actually interesting. Debugging reveals that on IplImage, this calls cvPtr2D and then cvRawDataToScalar. Further debugging also reveals that channelSeq of IplImage is actually a string (damn you, OpenCV implementation, why did you keep that a secret?!). So a robust code for reading OpenCV image RGB values is:

const char *rpos = strchr(color_img->channelSeq, "R"),
           *gpos = strchr(color_img->channelSeq, "G"),
           *bpos = strchr(color_img->channelSeq, "B");
if(!rpos || !gpos || !bpos)
    throw std::runtime_error("unknown image channel sequence");
const int rIdx = rpos - color_img->channelSeq,
          gIdx = gpos - color_img->channelSeq,
          bIdx = bpos - color_img->channelSeq;
// determine channel mapping

CvScalar color = cvGet2D(color_img, y, x); // note y, x
uint8_t cr = uint8_t(color.val[rIdx]),
        cg = uint8_t(color.val[gIdx]),
        cb = uint8_t(color.val[bIdx]);
// do not multiply by 255, it is already in that range

This will correctly handle any channel order and any depth. Or if you want to go faster, use imread(filename, CV_LOAD_IMAGE_COLOR) to convert the values to 8 bit per pixel. Then this can be used:

assert(color_img->depth == IPL_DEPTH_8U); // note that imread does not really say whether it converts always to IPL_DEPTH_8U or sometimes to IPL_DEPTH_8S!
const uint8_t *p_color_pixel = cvPtr2D(color_img, y, x, NULL);
uint8_t cr = p_color_pixel[rIdx], cg = p_color_pixel[gIdx],
    cb = p_color_pixel[bIdx];

And that's it. It's not pretty, but I guess that comes with using OpenCV. Note that it might actually not be reliable as the documentation says that OpenCV is not using the IplImage::channelSeq field, but this is the best one can do now. At this point I don't trust anything that's written in OpenCV documentation, it's the worst.

I was kind of hoping that there will be a way of doing this, say cvGetRGBA2D() that would return CvScalar with RGBA order always (one line instead of my 12 lines). Ridiculous.

the swine
  • 10,713
  • 7
  • 58
  • 100