If I have an image made of uint16s and want to compute a histogram over every possible intensity value, i.e. a vector `x` of 0..65535 that contains the intensity values, and a vector `y` that holds the number of samples with each value, is there a vectorized NumPy / linear algebra way to compute this?

Brandon Dube
- Wonder if creating the histogram by itself can be vectorized: https://stackoverflow.com/questions/12985949/methods-to-vectorise-histogram-in-simd – SKPS Jan 29 '20 at 17:36
- Why can't you use `numpy.histogram()` or `numpy.unique(return_counts=True)`? – scleronomic Jan 29 '20 at 18:24
- The latter does what I want. `np.histogram` can be extremely slow for large images. – Brandon Dube Jan 29 '20 at 19:21
- What are the height and width (in pixels) of your large images, please? – Mark Setchell Jan 30 '20 at 00:16
- 2560x2160, 16 bits per channel x 4 channels. – Brandon Dube Jan 30 '20 at 17:38
- Please have another look - I have updated my answer with a significant speedup. – Mark Setchell Jan 31 '20 at 15:39
- It would be great not to need such a mammoth dependency as OpenCV for this. – Brandon Dube Feb 03 '20 at 21:37
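For reference, the `numpy.unique(return_counts=True)` suggestion from the comments can be sketched as follows. This is a minimal illustration, not a benchmark: the small random image and the names `img`, `values`, `counts` and `hist` are assumptions made for the example.

```python
import numpy as np

# Small random uint16 image standing in for the real data (assumption).
rng = np.random.default_rng(0)
img = rng.integers(0, 65536, size=(64, 64, 4), dtype=np.uint16)

# np.unique flattens its input by default.
# values: sorted distinct intensities; counts: how often each one occurs.
values, counts = np.unique(img, return_counts=True)

# Optional: scatter the sparse result into a dense 65536-entry histogram,
# so that hist[v] is the number of samples with intensity v.
hist = np.zeros(65536, dtype=np.int64)
hist[values] = counts
```

Note that `values` only lists intensities that actually occur, so the dense `hist` is needed if every value in 0..65535 should have an entry.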
2 Answers
I did it the obvious way with NumPy and, using your image dimensions on my Mac, it takes 300 ms. I then did the same thing with OpenCV, and at 9 ms it is 33x faster!
#!/usr/bin/env python3
import cv2
import numpy as np

# Dimensions - height, width
h, w = 2160, 2560

# Known image, channel0=1, channel1=3, channel2=5, channel3=65535
R = np.zeros((h, w, 4), dtype=np.uint16)
R[..., 0] = 1
R[..., 1] = 3
R[..., 2] = 5
R[..., 3] = 65535

def npHistogram(R):
    """Generate histogram using Numpy"""
    # An explicit range gives exactly one bin per integer value;
    # by default the bins would only span min..max of the data
    H, _ = np.histogram(R, bins=65536, range=(0, 65536))
    return H

def OpenCVHistogram(R):
    """Generate histogram using OpenCV"""
    # calcHist returns a float32 array of shape (65536, 1)
    H = cv2.calcHist([R.ravel()], [0], None, [65536], [0, 65536])
    return H

A = npHistogram(R)
B = OpenCVHistogram(R)

#%timeit npHistogram(R)
#%timeit OpenCVHistogram(R)
Results
Using IPython, I got these timings:
%timeit npHistogram(R)
300 ms ± 11.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit OpenCVHistogram(R)
9.02 ms ± 226 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
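If neither dependency is an option, `np.bincount` is a NumPy-only alternative worth trying for unsigned integer data. The sketch below is untimed here, so no speed claim is made; the function name `bincount_histogram` is hypothetical, and the image is shrunk (an assumption) so the example runs quickly.

```python
import numpy as np

def bincount_histogram(img):
    """Count occurrences of each uint16 value using only NumPy.

    minlength pads the result so it always has 65536 entries,
    even when the largest values never occur in the image.
    """
    return np.bincount(img.ravel(), minlength=65536)

# Reduced-size version of the known test image (assumption: smaller h, w).
h, w = 216, 256
R = np.zeros((h, w, 4), dtype=np.uint16)
R[..., 0] = 1
R[..., 1] = 3
R[..., 2] = 5
R[..., 3] = 65535

H = bincount_histogram(R)
```

`np.bincount` only accepts 1-D arrays of non-negative integers, which is why the image is flattened with `ravel()` first.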
Keywords: Python, histogram, slow, Numpy, np.histogram, speedup, OpenCV, image processing.

Mark Setchell
OK, if OpenCV is too big a dependency for you to get 9 ms processing time instead of 300 ms, how about Numba? This runs in 10 ms.
#!/usr/bin/env python3
import numpy as np
from numba import jit

# Dimensions - height, width
h, w = 2160, 2560

# Known image, channel0=1, channel1=3, channel2=5, channel3=65535
R = np.zeros((h, w, 4), dtype=np.uint16)
R[..., 0] = 1
R[..., 1] = 3
R[..., 2] = 5
R[..., 3] = 65535

@jit(nopython=True, nogil=True)
def NumbaHistogram(pixels):
    """Histogram of uint16 image"""
    H = np.zeros(65536, dtype=np.int32)
    # pixels must be 1-D, so pass R.ravel()
    for i in range(len(pixels)):
        H[pixels[i]] += 1
    return H

#%timeit q = NumbaHistogram(R.ravel())
Results
%timeit NumbaHistogram(R.ravel())
10.4 ms ± 54.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
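Since the test image is known, the output of any of these histogram functions can be checked against the expected counts. The helper `check_histogram` below is hypothetical, and `np.bincount` is used only as a stand-in (an assumption) so the sketch runs without OpenCV or Numba installed; the image is also shrunk to keep it quick.

```python
import numpy as np

def check_histogram(hist_fn, h=216, w=256):
    """Hypothetical helper: verify hist_fn on the known four-value image.

    hist_fn receives the flattened uint16 pixels and must return
    65536 counts (any shape that flattens to 65536 entries).
    """
    R = np.zeros((h, w, 4), dtype=np.uint16)
    R[..., 0] = 1
    R[..., 1] = 3
    R[..., 2] = 5
    R[..., 3] = 65535

    H = np.asarray(hist_fn(R.ravel())).ravel()

    # Each of the four values fills one whole channel, i.e. h*w samples.
    expected = np.zeros(65536, dtype=np.int64)
    expected[[1, 3, 5, 65535]] = h * w
    return H.shape == expected.shape and bool(np.array_equal(H, expected))

# Stand-in histogram function (assumption); swap in NumbaHistogram or similar.
ok = check_histogram(lambda px: np.bincount(px, minlength=65536))
```

The helper flattens the result first, so a column-shaped output such as the one from `cv2.calcHist` can be checked the same way.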

Mark Setchell
- Numba is arguably worse. The resistance to OpenCV is that getting it installed and linked properly can be a huge pain in the butt. Numba is virtually impossible without using Conda as your Python distribution, which is neither better nor worse. – Brandon Dube Feb 05 '20 at 20:01
- Mmm, there's no pleasing some people. This explains why I like macOS: both OpenCV and Numba are one-liners to install, without Conda. I guess there's no point making a Cython version either. – Mark Setchell Feb 05 '20 at 20:05