A few days ago I tried to find a Python implementation of Daugman's iris detection algorithm. I found only one repo on GitHub, but its implementation was very slow, around 46 seconds on my laptop. So I rewrote everything according to this formula:
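For reference, Daugman's integro-differential operator, which the code below approximates, is

$$\max_{(r,\,x_0,\,y_0)}\left|\,G_\sigma(r)\ast\frac{\partial}{\partial r}\oint_{r,\,x_0,\,y_0}\frac{I(x,y)}{2\pi r}\,ds\,\right|$$

where I(x, y) is the image, (x_0, y_0) is a candidate center, r is the radius and G_σ(r) is a Gaussian smoothing kernel: the intensity is summed along a circle and normalized by 2πr, the result is differentiated over r, smoothed, and the maximum response is taken.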
The final result is ~85 times faster (~530 ms on my laptop), but it is still not fast enough for real-time video processing: even at only 10 fps the per-frame budget is ~100 ms.
I've read these Stack Overflow topics 1, 2 and tried to vectorize the code.
I've tested map(), np.vectorize() and np.fromiter(), but they were no faster than my current solution (some of them were even slower), and I am out of ideas.
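For illustration, the np.vectorize attempt looked roughly like the sketch below (the scoring function and the ranges are placeholders, not the real per-circle computation). np.vectorize is essentially a convenience wrapper around a Python loop, so it cannot be expected to remove the per-point overhead:

import numpy as np

def score(x, y, r):
    # placeholder for the per-circle intensity computation in daugman()
    return float(x + y + r)

vec_score = np.vectorize(score)
xs, ys, rs = np.meshgrid(np.arange(36, 74, 3), np.arange(36, 74, 3), np.arange(10, 36))
values = vec_score(xs, ys, rs)  # still one Python-level call per (x, y, r) triple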
Is there a way to vectorize the code below, or do I need to use C extensions or try to run it on PyPy?
My solution:
import itertools

import cv2
import numpy as np


def daugman(center, start_r, gray_img):
    """Return the maximal intensity delta and the corresponding circle for a given center.

    center -- tuple(x, y)
    start_r -- int, initial radius
    gray_img -- grayscale picture as np.array(), it should be square
    """
    # get separate coordinates
    x, y = center
    # get img dimensions
    h, w = gray_img.shape
    # for calculation convenience
    img_shape = np.array([h, w])
    c = np.array(center)
    # define some other vars
    tmp = []
    mask = np.zeros_like(gray_img)

    # for every radius in range
    # we are presuming that iris will be no bigger than 1/3 of the picture
    for r in range(start_r, int(h / 3)):
        # draw circle on mask
        cv2.circle(mask, center, r, 255, 1)
        # get the pixels of the original image along that circle
        radii = gray_img & mask  # it is faster than np or cv2
        # normalize the summed intensity by the circle length
        tmp.append(radii[radii > 0].sum() / (2 * np.pi * r))
        # refresh mask
        mask.fill(0)

    # calculate the delta of intensity between consecutive radii
    tmp = np.array(tmp)
    tmp = tmp[1:] - tmp[:-1]
    # apply gaussian filter
    tmp = abs(cv2.GaussianBlur(tmp[:-1], (1, 5), 0))
    # get maximum value
    idx = np.argmax(tmp)
    # return value, center coords, radius
    return tmp[idx], [center, idx + start_r]
def find_iris(gray, start_r):
    """Apply daugman() on every pixel of the calculated image slice.

    gray -- grayscale img as np.array()
    start_r -- initial radius as int

    The selection of the image slice guarantees that every radius will be
    drawn within the image borders, so we don't need to check it (speed up).
    To speed up the whole process we pregenerate all centers for detection.
    """
    _, s = gray.shape
    # reduce step for better accuracy
    # 's/3' is the maximum radius of a daugman() search
    a = range(0 + int(s / 3), s - int(s / 3), 3)
    all_points = list(itertools.product(a, a))

    values = []
    coords = []

    for p in all_points:
        tmp = daugman(p, start_r, gray)
        if tmp is not None:
            val, circle = tmp
            values.append(val)
            coords.append(circle)

    # return the circle with the biggest intensity delta on the image
    # [(xc, yc), radius]
    return coords[np.argmax(values)]
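For context, a minimal usage sketch; the file name, crop and drawing step are assumptions mirroring the test setup from the update below:

import cv2

img = cv2.imread('eye.jpg')                       # test image, see "UPD" below
crop = img[20:130, 20:130]                        # rough eye region, as in the update below
gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)

center, radius = find_iris(gray, start_r=10)      # [(xc, yc), radius]
cv2.circle(crop, center, radius, (0, 255, 0), 1)  # draw the detected iris on the crop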
UPD: image for testing
UPD 2: I've tried to create a tensor with all possible (x, y, r) values and map a function over it -- it is not faster.
Tensor creation:
# prepare img
img = cv2.imread('eye.jpg')
img = img[20:130, 20:130]
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
start_r = 10
# prepare some vars
h, w = gray.shape
img_shape = np.array([h, w])
mask = np.zeros_like(gray)
# generate points
_, s = gray.shape
a = range(0 + int(s/3), s - int(s/3), 3)
b = range(start_r, int(s/3))
all_points = list(itertools.product(a, a, b))
all_points_arr = np.array(all_points)
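For scale, the search space this produces is modest (the counts below assume eye.jpg is at least 130x130, so the crop above is 110x110):

# a = range(36, 74, 3) -> 13 candidate coordinates per axis
# b = range(10, 36)    -> 26 candidate radii
print(all_points_arr.shape)  # (4394, 3), i.e. 13 * 13 * 26 candidate (x, y, r) triples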
daugman() rewritten for this case:
def daugman_6(point, gray_img=gray):
    """Return the normalized circle intensity for a single (x, y, r) point.

    point -- tuple(x, y, r)
    gray_img -- grayscale picture as np.array(), it should be square
    """
    # get separate coordinates
    x, y, r = point
    # we are presuming that iris will be no bigger than 1/3 of the picture
    # draw circle on the (global) mask
    cv2.circle(mask, (x, y), r, 255, 1)
    # get the pixels of the original image along that circle
    radii = gray_img & mask  # it is faster than np or cv2
    # refresh mask
    mask.fill(0)
    return radii[radii > 0].sum() / (2 * np.pi * r)
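To get an actual circle back out of this mapping, the delta over the radius axis still has to be taken per center. A rough sketch of that step, reusing a, b and all_points from above (it skips the Gaussian smoothing that daugman() applies):

values = np.fromiter((daugman_6(p) for p in all_points), dtype=np.float32)
# product(a, a, b) varies the radius fastest, so reshape to (x, y, r)
grid = values.reshape(len(a), len(a), len(b))
deltas = np.abs(np.diff(grid, axis=-1))          # intensity jump between consecutive radii
xi, yi, ri = np.unravel_index(np.argmax(deltas), deltas.shape)
best_center = (a[xi], a[yi])
best_radius = b[ri]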
Results on a server with a Core i9:
# iterate via numpy array with for
%%timeit
[daugman_6(i) for i in all_points_arr]
#80.6 ms ± 2.74 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# np.fromiter() on numpy array
%%timeit
np.fromiter((daugman_6(p) for p in all_points_arr), dtype=np.float32)
#82.9 ms ± 3.75 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# iteration over list of tuples
%%timeit
[daugman_6(i) for i in all_points]
#70 ms ± 2.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# original implementation
%%timeit
find_iris(gray, 10)
#71.6 ms ± 3.51 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)