Get HOG image features from OpenCV + Python?

Question

I've read this post about how to use OpenCV's HOG-based pedestrian detector: How can I detect and track people using OpenCV?

I want to use HOG for detecting other types of objects in images (not just pedestrians). However, the Python binding of HOGDetectMultiScale doesn't seem to give access to the actual HOG features.

Is there any way to use Python + OpenCV to extract the HOG features directly from any image?

I found this MATLAB library that was helpful: http://vision.ucsd.edu/~pdollar/toolbox/doc/index.html — lubar, Aug 08 '12 at 04:06

score 169 · Answer 1 · answered Aug 04 '14 at 13:31

In Python with OpenCV, you can compute HOG features like this:

 import cv2
 hog = cv2.HOGDescriptor()  # default parameters: 64 x 128 detection window
 im = cv2.imread(sample)  # sample is the path to an image file
 h = hog.compute(im)

ton4eg

It has to be exactly 64 x 128 pixels? Can't I change the window size? – Mauker Dec 13 '16 at 14:23
Does it have to be grayscale image? If yes, won't `cv2.imread()` have `cv2.IMREAD_GRAYSCALE`? – Prasad Raghavendra Jun 17 '20 at 00:30
I ran both. I am not seeing appreciable difference. I ran this for about 100 big images and it took 23 seconds approximately for both grayscale and RGB. I excluded reading from disk while timing. – Prasad Raghavendra Jun 17 '20 at 00:39

mdilip · Answer 2 · 2015-07-28T10:47:55.857

1. Get the built-in documentation: Running the following command in your Python console shows the structure of the HOGDescriptor class:

 import cv2
 help(cv2.HOGDescriptor())

2. Example code: Here is a snippet that initializes a cv2.HOGDescriptor with different parameters (the terms I use here are standard terms that are well defined in the OpenCV documentation here):

import cv2
image = cv2.imread("test.jpg", 0)  # flag 0 loads the image as grayscale
winSize = (64,64)
blockSize = (16,16)
blockStride = (8,8)
cellSize = (8,8)
nbins = 9
derivAperture = 1
winSigma = 4.
histogramNormType = 0
L2HysThreshold = 0.2
gammaCorrection = 0
nlevels = 64
hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins,derivAperture,winSigma,
                        histogramNormType,L2HysThreshold,gammaCorrection,nlevels)
#compute(img[, winStride[, padding[, locations]]]) -> descriptors
winStride = (8,8)
padding = (8,8)
locations = ((10,20),)
hist = hog.compute(image,winStride,padding,locations)

3. Reasoning: The resulting HOG descriptor has dimension 9 orientations x (4 corner cells that get 1 normalization + 6x4 edge cells that get 2 normalizations + 6x6 interior cells that get 4 normalizations) = 9 x 196 = 1764, because I passed only one location to hog.compute().
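
The same number can be reached by counting blocks instead of cells: a 64 x 64 window holds 7 x 7 overlapping 16 x 16 blocks, each with 4 cells of 9 bins. Here is a small sketch of that arithmetic. The helper name `hog_descriptor_length` is made up for illustration and is not part of OpenCV.

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins):
    """Number of values in one window's HOG descriptor.

    All sizes are (width, height) tuples in pixels.
    """
    # Number of block positions along each axis of the window.
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Each block contributes one nbins-long histogram per cell it contains.
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    return blocks_x * blocks_y * cells_per_block * nbins


# Parameters from the example above: 7 * 7 blocks * 4 cells * 9 bins.
print(hog_descriptor_length((64, 64), (16, 16), (8, 8), (8, 8), 9))   # 1764

# OpenCV's default 64 x 128 window: 7 * 15 blocks * 4 cells * 9 bins.
print(hog_descriptor_length((64, 128), (16, 16), (8, 8), (8, 8), 9))  # 3780
```

If hog.compute() is given several locations or a larger image, the returned vector is this length multiplied by the number of windows evaluated.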

4. Another way to initialize is from an XML file that contains all parameter values:

hog = cv2.HOGDescriptor("hog.xml")

To get an XML file, one can do the following:

hog = cv2.HOGDescriptor()
hog.save("hog.xml")

and then edit the respective parameter values in the XML file.

score 17 · Answer 3 · edited Oct 21 '17 at 09:23

Here is a solution that uses only OpenCV to compute the features (NumPy and matplotlib are used for reshaping and plotting):

import numpy as np
import cv2
import matplotlib.pyplot as plt

img = cv2.cvtColor(cv2.imread("/home/me/Downloads/cat.jpg"),
                   cv2.COLOR_BGR2GRAY)

cell_size = (8, 8)  # h x w in pixels
block_size = (2, 2)  # h x w in cells
nbins = 9  # number of orientation bins

# winSize is the size of the image cropped to a multiple of the cell size
hog = cv2.HOGDescriptor(_winSize=(img.shape[1] // cell_size[1] * cell_size[1],
                                  img.shape[0] // cell_size[0] * cell_size[0]),
                        _blockSize=(block_size[1] * cell_size[1],
                                    block_size[0] * cell_size[0]),
                        _blockStride=(cell_size[1], cell_size[0]),
                        _cellSize=(cell_size[1], cell_size[0]),
                        _nbins=nbins)

n_cells = (img.shape[0] // cell_size[0], img.shape[1] // cell_size[1])
hog_feats = hog.compute(img)\
               .reshape(n_cells[1] - block_size[1] + 1,
                        n_cells[0] - block_size[0] + 1,
                        block_size[0], block_size[1], nbins) \
               .transpose((1, 0, 2, 3, 4))  # index blocks by rows first
# hog_feats now contains the gradient amplitudes for each direction,
# for each cell of its group for each group. Indexing is by rows then columns.

gradients = np.zeros((n_cells[0], n_cells[1], nbins))

# count cells (border cells appear less often across overlapping groups)
cell_count = np.full((n_cells[0], n_cells[1], 1), 0, dtype=int)

for off_y in range(block_size[0]):
    for off_x in range(block_size[1]):
        gradients[off_y:n_cells[0] - block_size[0] + off_y + 1,
                  off_x:n_cells[1] - block_size[1] + off_x + 1] += \
            hog_feats[:, :, off_y, off_x, :]
        cell_count[off_y:n_cells[0] - block_size[0] + off_y + 1,
                   off_x:n_cells[1] - block_size[1] + off_x + 1] += 1

# Average gradients
gradients /= cell_count

# Preview
plt.figure()
plt.imshow(img, cmap='gray')
plt.show()

bin = 5  # angle is 360 / nbins * direction
plt.pcolor(gradients[:, :, bin])
plt.gca().invert_yaxis()
plt.gca().set_aspect('equal', adjustable='box')
plt.colorbar()
plt.show()

I have used HOG descriptor computation and visualization to understand the data layout and vectorized the loops over groups.
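
The averaging step in the code above divides each cell's accumulated histogram by the number of overlapping blocks that contain that cell. Here is a NumPy-only sketch of that count on a small grid, assuming a block stride of one cell as in the code above. The helper name `block_coverage` is made up for illustration.

```python
import numpy as np


def block_coverage(n_cells, block_size):
    """Count how many overlapping blocks contain each cell.

    n_cells and block_size are (rows, cols) measured in cells; blocks are
    assumed to slide with a stride of one cell.
    """
    rows, cols = n_cells
    b_rows, b_cols = block_size
    count = np.zeros((rows, cols), dtype=int)
    # For each position a cell can take inside a block, mark every grid
    # cell that appears at that position in some block.
    for off_y in range(b_rows):
        for off_x in range(b_cols):
            count[off_y:rows - b_rows + off_y + 1,
                  off_x:cols - b_cols + off_x + 1] += 1
    return count


coverage = block_coverage((4, 5), (2, 2))
print(coverage)
# [[1 2 2 2 1]
#  [2 4 4 4 2]
#  [2 4 4 4 2]
#  [1 2 2 2 1]]
```

Corner cells belong to one block, edge cells to two, and interior cells to four, which is why border cells need a smaller divisor when averaging.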

Shouldn't HOG return a histogram? What is being plotted? It looks like the image. — Tony Tannous, Apr 19 '19 at 23:22
It does return a 2D grid of histograms; the example just displays one of the orientation bins. I arbitrarily chose the fifth (`bin = 5 # angle is 360 / nbins * direction`). — pixelou, Apr 22 '19 at 10:59

omotto · Answer 4 · 2017-02-08T14:48:57.270

Although a built-in method exists, as mentioned in previous answers:

hog = cv2.HOGDescriptor()

I would like to post a Python implementation that you can find in OpenCV's examples directory, hoping it is useful for understanding how HOG works:

import cv2
import numpy as np
from numpy.linalg import norm

def hog(img):
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
    mag, ang = cv2.cartToPolar(gx, gy)
    bin_n = 16 # Number of bins
    bin = np.int32(bin_n*ang/(2*np.pi))

    bin_cells = []
    mag_cells = []

    cellx = celly = 8

    for i in range(0, img.shape[0] // celly):  # integer division so range() gets an int
        for j in range(0, img.shape[1] // cellx):
            bin_cells.append(bin[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])
            mag_cells.append(mag[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])

    hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
    hist = np.hstack(hists)

    # transform to Hellinger kernel
    eps = 1e-7
    hist /= hist.sum() + eps
    hist = np.sqrt(hist)
    hist /= norm(hist) + eps

    return hist

Regards.

@Clément F Sorry, I should add the import declaration: `from numpy.linalg import norm`. Here is a link to the function description: [link](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html) — omotto, Mar 10 '17 at 07:14
@omotto Can you please explain the above algorithm in detail? — Praveen Kumar, May 06 '17 at 07:35
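
The core steps of the function in this answer can be sketched on a single toy cell: quantize each gradient angle into a bin, sum the gradient magnitudes per bin, then apply the Hellinger-style normalization. This sketch uses NumPy's `arctan2` and `hypot` as stand-ins for `cv2.cartToPolar`. The helper names and the 2 x 2 gradient values are made up for illustration.

```python
import numpy as np


def cell_histogram(gx, gy, bin_n=16):
    """Magnitude-weighted orientation histogram for one cell."""
    mag = np.hypot(gx, gy)                       # gradient magnitude
    ang = np.arctan2(gy, gx) % (2 * np.pi)       # gradient angle in [0, 2*pi)
    # Quantize angles into bin_n bins; the modulo guards against index bin_n.
    bins = np.int32(bin_n * ang / (2 * np.pi)) % bin_n
    # Sum the magnitudes of all pixels that fall into each orientation bin.
    return np.bincount(bins.ravel(), weights=mag.ravel(), minlength=bin_n)


def hellinger_normalize(hist, eps=1e-7):
    """L1-normalize, take the square root, then L2-normalize."""
    hist = hist / (hist.sum() + eps)
    hist = np.sqrt(hist)
    return hist / (np.linalg.norm(hist) + eps)


# Toy 2 x 2 cell: the top row has gradients along +x (angle 0),
# the bottom row has gradients along +y (angle pi/2) with magnitude 2.
gx = np.array([[1.0, 1.0], [0.0, 0.0]])
gy = np.array([[0.0, 0.0], [2.0, 2.0]])

hist = cell_histogram(gx, gy)
print(hist[0], hist[4])  # 2.0 4.0 (all other bins are zero)

normalized = hellinger_normalize(hist)
```

The full function does this for every 8 x 8 cell and concatenates the per-cell histograms before normalizing the whole vector at once.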

score 1 · Answer 5 · answered Jul 12 '11 at 13:28

I would disagree with peakxu's argument. The HOG detector is, in the end, "just" a rigid linear filter. Any degrees of freedom in the "object" (e.g. persons) lead to blurring in the detector and are not actually handled by it. There is an extension of this detector using latent SVMs that does explicitly handle degrees of freedom by introducing structural constraints between independent parts (e.g. head, arms, etc.) as well as allowing for multiple appearances per object (e.g. frontal people and sideways people).

Regarding the HOG detector in OpenCV: in theory you can upload another detector to be used with the features, but as far as I know you cannot get the features themselves. Thus, if you have a trained detector (i.e. a class-specific linear filter), you should be able to upload it into the detector to get the fast detection performance of OpenCV. That said, it should be easy to hack the OpenCV source code to provide this access and propose the patch back to the maintainers.
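
To make "rigid linear filter" concrete: scoring one window amounts to a dot product between the window's HOG descriptor and a learned weight vector, plus a bias. Here is a minimal NumPy sketch with made-up toy numbers. Real descriptors are much longer, and the weights would come from a trained linear SVM.

```python
import numpy as np


def linear_detector_score(descriptor, weights, bias):
    """Score one window: positive values count as detections."""
    return float(np.dot(weights, descriptor) + bias)


# Hypothetical 3-value descriptor and filter, for illustration only.
descriptor = np.array([0.2, 0.1, 0.4])
weights = np.array([1.0, -1.0, 0.5])
bias = -0.1

score = linear_detector_score(descriptor, weights, bias)
is_detection = score > 0
print(is_detection)  # True, since 0.2 - 0.1 + 0.2 - 0.1 is positive
```

In OpenCV, such a weight vector is what `HOGDescriptor.setSVMDetector` expects, which appears to be the "upload another detector" route described above.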

score -11 · Answer 6 · answered May 24 '11 at 20:21

I would not recommend using HOG features for detecting objects other than pedestrians. In the original HOG paper, Dalal and Triggs specifically mention that their detector is built around pedestrian detection, allowing for significant degrees of freedom in the limbs while using strong structural hints around the human body.

Instead, try looking at OpenCV's HaarDetectObjects. You can learn how to train your own cascades here.

peakxu

The OpenCV HOG detector (i.e., the trained SVM) is pedestrian-specific but I've had success using HOG features to train detectors for more general object classes. So far, I've been using [this MATLAB package](http://www.mathworks.com/matlabcentral/fileexchange/28689-hog-descriptor-for-matlab) but would be interested in an OpenCV solution. – lubar May 25 '11 at 17:32
This is not an answer to the question. HOG descriptors are not the same thing as HOG detectors. A descriptor is the signature obtained from an image patch by computing the HoG feature. If one can collect positive and negative training examples of the HoG features, then it's easy to use libsvm or scikits.learn to train SVM classifiers to do recognition on new HoG features. This has been done very successfully for detecting all kinds of shapes and objects beyond human forms. I am currently looking into accessing HoG descriptors with OpenCV Python and will write back if I figure it out. – ely Sep 07 '11 at 04:46
In the computer vision literature, HOG features are widely used and quite successful, in particular as building block of the deformable parts model. I have never seen anyone use Haar features in any serious object detection work. – Andreas Mueller Oct 21 '13 at 19:06
This sentence makes no sense: "I would not recommend using HOG features for detecting objects other than pedestrians." HOG can be used as a feature vector in other applications. – omotto Aug 01 '17 at 07:29
