OpenCV HOG feature data layout?

Question

I'm working with OpenCV's CPU version of Histogram of Oriented Gradients (HOG). I'm using a 32x32 image with 4x4 cells, 4x4 blocks, no overlap among blocks, and 15 orientation bins. OpenCV's HOGDescriptor gives me a 1D feature vector of length 960. This makes sense, because (32*32 pixels) / (4*4 pixels per cell) = 64 cells, and 64 cells * 15 orientations = 960.
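The same count follows from the general formula: block positions per window, times cells per block, times bins. Here is a minimal sketch of that arithmetic. `Size2` and `hogDescriptorLength` are hypothetical names made up for illustration (they are not part of OpenCV), and the sketch assumes the window, block, stride and cell sizes divide evenly, as they do here.

```cpp
#include <cstddef>

// Hypothetical helper types/functions for illustration only (not OpenCV API).
struct Size2 { int width; int height; };

// Expected HOG descriptor length for a single detection window.
// Assumes (win - block) is a multiple of stride and block is a multiple of cell.
std::size_t hogDescriptorLength(Size2 win, Size2 block, Size2 stride, Size2 cell, int nbins)
{
    // Number of block positions along each axis.
    const int blocksX = (win.width  - block.width)  / stride.width  + 1;
    const int blocksY = (win.height - block.height) / stride.height + 1;
    // Number of cells inside one block.
    const int cellsPerBlock = (block.width / cell.width) * (block.height / cell.height);
    return static_cast<std::size_t>(blocksX) * blocksY * cellsPerBlock * nbins;
}
```

For the configuration above, `hogDescriptorLength({32, 32}, {4, 4}, {4, 4}, {4, 4}, 15)` evaluates to 8 * 8 * 1 * 15 = 960.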

However, I'm not sure about how these 960 numbers are laid out in memory. My guess would be that it's like this:

vector<float> descriptorsValues =
[15 bins for cell 0, 0] 
[15 bins for cell 0, 1]
...
[15 bins for cell 0, 7]
....
[15 bins for cell 7, 0] 
[15 bins for cell 7, 1]
...
[15 bins for cell 7, 7]

Of course, this is a 2D problem flattened into 1D, so it would actually look like this:

[cell 0, 0] [cell 0, 1] ... [cell 7, 0] ... [cell 7, 7]

So, do I have the right idea for the data layout? Or is it something else?

Here's my example code for this:

#include <cstdio>
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/objdetect.hpp>

//32x32 image, 4x4 blocks, 4x4 cells, 4x4 blockStride
std::vector<float> hogExample(cv::Mat img)
{
    img = img.rowRange(0, 32).colRange(0, 32); //trim image to 32x32
    bool gamma_corr = true;
    cv::Size win_size(img.cols, img.rows); //cv::Size is (width, height); using just one window
    int c = 4;
    cv::Size block_size(c, c);
    cv::Size block_stride(c, c); //no overlapping blocks
    cv::Size cell_size(c, c);
    int nOri = 15; //number of orientation bins

    cv::HOGDescriptor d(win_size, block_size, block_stride, cell_size, nOri, 1, -1,
                        cv::HOGDescriptor::L2Hys, 0.2, gamma_corr, cv::HOGDescriptor::DEFAULT_NLEVELS);

    std::vector<float> descriptorsValues;
    std::vector<cv::Point> locations;
    d.compute(img, descriptorsValues, cv::Size(0, 0), cv::Size(0, 0), locations);

    //size() returns size_t, so print it with %zu rather than %d
    printf("descriptorsValues.size() = %zu \n", descriptorsValues.size()); //prints 960
    return descriptorsValues;
}

Related resources: This StackOverflow post and this tutorial helped me to get started with the OpenCV HOGDescriptor.

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

I believe you have the right idea.

The original paper, Histograms of Oriented Gradients for Human Detection (page 2), says:

[...] The detector window is tiled with a grid of overlapping blocks in which Histogram of Oriented Gradient feature vectors are extracted. [...]

[...] Tiling the detection window with a dense (in fact, overlapping) grid of HOG descriptors and using the combined feature vector [...]

All the paper talks about is tiling the blocks together; it gives no detail on exactly how they are tiled. My guess is that nothing fancy happens here (otherwise the authors would have mentioned it), i.e. the block histograms are simply concatenated in a regular scan order (for example left to right, top to bottom).

After all, that is a reasonable and the simplest way to lay out the data.

Edit: You can convince yourself further by looking at how people access and visualize the data:

for (int blockx=0; blockx<blocks_in_x_dir; blockx++)
{
    for (int blocky=0; blocky<blocks_in_y_dir; blocky++)
    {
        // 4 cells per block (2x2) in this snippet; the question's setup has 1 cell per block
        for (int cellNr=0; cellNr<4; cellNr++)
        {
            for (int bin=0; bin<gradientBinSize; bin++)
            {
                float gradientStrength = descriptorValues[ descriptorDataIdx ];
                descriptorDataIdx++;

                // ... ...

            } // for (all bins)
        } // for (all cells)
    } // for (all block y pos)
} // for (all block x pos)

@solvingPuzzles It is in fact column major. Look how the loop with `blocky` is nested inside the one with `blockx`. — isarandi, Apr 28 '15 at 15:58
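If the nesting of the loop in the answer reflects the actual layout, as the comment suggests, the flat index of any bin can be computed directly. The sketch below assumes exactly that nesting: block x outermost, then block y, then cell within the block, then bin. `hogIndex` is a hypothetical helper, not an OpenCV function, and the ordering is worth verifying against your OpenCV version before relying on it.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only (not OpenCV API).
// Assumed ordering, taken from the loop nesting in the answer:
// block x (outermost), block y, cell within block, bin (innermost).
std::size_t hogIndex(int blockX, int blockY, int cellNr, int bin,
                     int blocksInYDir, int cellsPerBlock, int nbins)
{
    // Blocks are numbered column by column: all y positions for x = 0 come first.
    const std::size_t blockIdx = static_cast<std::size_t>(blockX) * blocksInYDir + blockY;
    return (blockIdx * cellsPerBlock + cellNr) * nbins + bin;
}
```

With the question's configuration (8 blocks per column, 1 cell per block, 15 bins), `hogIndex(0, 1, 0, 0, 8, 1, 15)` is 15 and `hogIndex(1, 0, 0, 0, 8, 1, 15)` is 120. Under this assumption the second group of 15 values belongs to the block one step down from the top-left corner, not one step to the right.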
