I'm working with the CPU version of OpenCV's Histogram of Oriented Gradients (HOG). I'm using a 32x32 image with 4x4-pixel cells, 4x4-pixel blocks (so one cell per block), no overlap among blocks, and 15 orientation bins. OpenCV's HOGDescriptor gives me a 1D feature vector of length 960.
This makes sense, because (32*32 pixels) / (4*4 pixels per cell) * (15 orientations per cell) = 960.
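To double-check that arithmetic for other configurations, here is a minimal sketch of the length calculation. `HogGeometry` and `hogDescriptorLength` are hypothetical names made up for illustration, not OpenCV APIs; as far as I can tell, `HOGDescriptor::getDescriptorSize()` should report the same number for a configured descriptor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of OpenCV): geometry of one HOG window.
// All sizes are in pixels.
struct HogGeometry {
    int winW, winH;       // window size
    int blockW, blockH;   // block size
    int strideW, strideH; // block stride
    int cellW, cellH;     // cell size
    int nbins;            // orientation bins per cell
};

// Expected descriptor length for a single window.
std::size_t hogDescriptorLength(const HogGeometry& g)
{
    // Blocks must land on the stride grid, and cells must tile a block.
    assert((g.winW - g.blockW) % g.strideW == 0);
    assert((g.winH - g.blockH) % g.strideH == 0);
    assert(g.blockW % g.cellW == 0 && g.blockH % g.cellH == 0);

    const std::size_t blocksX = (g.winW - g.blockW) / g.strideW + 1;
    const std::size_t blocksY = (g.winH - g.blockH) / g.strideH + 1;
    const std::size_t cellsPerBlock = (g.blockW / g.cellW) * (g.blockH / g.cellH);
    return blocksX * blocksY * cellsPerBlock * g.nbins;
}
```

For the setup in this question (32x32 window, 4x4 blocks, stride 4, 4x4 cells, 15 bins) this gives 8 * 8 * 1 * 15 = 960.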
However, I'm not sure about how these 960 numbers are laid out in memory. My guess would be that it's like this:
vector<float> descriptorsValues =
[15 bins for cell 0, 0]
[15 bins for cell 0, 1]
...
[15 bins for cell 0, 7]
....
[15 bins for cell 7, 0]
[15 bins for cell 7, 1]
...
[15 bins for cell 7, 7]
Of course, this is a 2D problem flattened into 1D, so it would actually look like this:
[cell 0, 0] [cell 0, 1] ... [cell 7, 0] ... [cell 7, 7]
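Written as index arithmetic, the guess above would look like the sketch below. The function names and constants are hypothetical and only describe my assumption about the layout, not confirmed OpenCV behavior. The second function is the obvious alternative (cells stored column by column) that would need to be ruled out.

```cpp
#include <cassert>
#include <cstddef>

// Assumed constants for the 32x32 example: 8x8 cells, 15 bins per cell.
constexpr std::size_t kCellsPerSide = 8;
constexpr std::size_t kBins = 15;

// Index of (row, col, bin) IF cells are stored row by row (my guess).
std::size_t indexRowMajor(std::size_t row, std::size_t col, std::size_t bin)
{
    assert(row < kCellsPerSide && col < kCellsPerSide && bin < kBins);
    return (row * kCellsPerSide + col) * kBins + bin;
}

// Index of (row, col, bin) IF cells are stored column by column instead.
std::size_t indexColMajor(std::size_t row, std::size_t col, std::size_t bin)
{
    assert(row < kCellsPerSide && col < kCellsPerSide && bin < kBins);
    return (col * kCellsPerSide + row) * kBins + bin;
}
```

The two hypotheses disagree for every off-diagonal cell: cell (0, 1) would start at index 15 in the first case and at index 120 in the second.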
So, do I have the right idea for the data layout? Or is it something else?
Here's my example code for this:
    #include <cstdio>
    #include <vector>
    #include <opencv2/opencv.hpp>

    // 32x32 image, 4x4 blocks, 4x4 cells, 4x4 blockStride
    std::vector<float> hogExample(cv::Mat img)
    {
        img = img.rowRange(0, 32).colRange(0, 32); // trim image to 32x32

        bool gamma_corr = true;
        cv::Size win_size(img.cols, img.rows); // cv::Size is (width, height); using just one window
        int c = 4;
        cv::Size block_size(c, c);
        cv::Size block_stride(c, c); // no overlapping blocks
        cv::Size cell_size(c, c);
        int nOri = 15; // number of orientation bins

        cv::HOGDescriptor d(win_size, block_size, block_stride, cell_size, nOri, 1, -1,
                            cv::HOGDescriptor::L2Hys, 0.2, gamma_corr,
                            cv::HOGDescriptor::DEFAULT_NLEVELS);

        std::vector<float> descriptorsValues;
        std::vector<cv::Point> locations;
        d.compute(img, descriptorsValues, cv::Size(0, 0), cv::Size(0, 0), locations);

        // size() returns size_t, so use %zu rather than %d
        printf("descriptorsValues.size() = %zu\n", descriptorsValues.size()); // prints 960
        return descriptorsValues;
    }
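One way I could probe the layout empirically is to find which 15-float slot of the returned vector carries the most energy. The sketch below is a hypothetical debugging helper (`strongestSlot` is my own name, not an OpenCV function) and uses only the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical debugging helper: index of the nbins-wide slot whose
// values have the largest sum of squares.
std::size_t strongestSlot(const std::vector<float>& desc, std::size_t nbins)
{
    assert(nbins > 0 && !desc.empty() && desc.size() % nbins == 0);

    std::size_t best = 0;
    float bestEnergy = -1.0f;
    for (std::size_t slot = 0; slot < desc.size() / nbins; ++slot) {
        float energy = 0.0f;
        for (std::size_t b = 0; b < nbins; ++b) {
            const float v = desc[slot * nbins + b];
            energy += v * v;
        }
        if (energy > bestEnergy) {
            bestEnergy = energy;
            best = slot;
        }
    }
    return best;
}
```

The idea, under the assumption that a feature confined to one cell mostly affects that cell's histogram: pass `hogExample` an image that is flat except for a small bright spot inside the cell at row 0, column 1, then call `strongestSlot(descriptorsValues, 15)`. A result of 1 would support the row-by-row guess, while a result of 8 would suggest the cells are stored column by column.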
Related resources: This StackOverflow post and this tutorial helped me to get started with the OpenCV HOGDescriptor.