Referencing our comments, I wrote a MATLAB implementation of this algorithm here: Extract a page from a uniform background in an image, and it was quite fast on large images.
If you'd like a better explanation of the algorithm, please see my other answer here: Bradley Adaptive Thresholding -- Confused (questions). This may be a good place to start if you want a better understanding of the code I wrote.
Because MATLAB and NumPy are similar, this is a re-implementation of the Bradley-Roth thresholding algorithm, but in NumPy. I convert the PIL image into a NumPy array, do the processing on this image, then convert back to a PIL image. The function takes in three parameters: the grayscale image image
, the size of the window s
and the threshold t
. This threshold is different than what you have as this is following the paper exactly. The threshold t
is a percentage of the total summed area of each pixel window. If the summed area is less than this threshold, then the output should be a black pixel - else it's a white pixel. The defaults for s
and t
are the number of columns divided by 8 and rounded, and 15% respectively:
import numpy as np
from PIL import Image
def bradley_roth_numpy(image, s=None, t=None):
# Convert image to numpy array
img = np.array(image).astype(np.float)
# Default window size is round(cols/8)
if s is None:
s = np.round(img.shape[1]/8)
# Default threshold is 15% of the total
# area in the window
if t is None:
t = 15.0
# Compute integral image
intImage = np.cumsum(np.cumsum(img, axis=1), axis=0)
# Define grid of points
(rows,cols) = img.shape[:2]
(X,Y) = np.meshgrid(np.arange(cols), np.arange(rows))
# Make into 1D grid of coordinates for easier access
X = X.ravel()
Y = Y.ravel()
# Ensure s is even so that we are able to index into the image
# properly
s = s + np.mod(s,2)
# Access the four corners of each neighbourhood
x1 = X - s/2
x2 = X + s/2
y1 = Y - s/2
y2 = Y + s/2
# Ensure no coordinates are out of bounds
x1[x1 < 0] = 0
x2[x2 >= cols] = cols-1
y1[y1 < 0] = 0
y2[y2 >= rows] = rows-1
# Ensures coordinates are integer
x1 = x1.astype(np.int)
x2 = x2.astype(np.int)
y1 = y1.astype(np.int)
y2 = y2.astype(np.int)
# Count how many pixels are in each neighbourhood
count = (x2 - x1) * (y2 - y1)
# Compute the row and column coordinates to access
# each corner of the neighbourhood for the integral image
f1_x = x2
f1_y = y2
f2_x = x2
f2_y = y1 - 1
f2_y[f2_y < 0] = 0
f3_x = x1-1
f3_x[f3_x < 0] = 0
f3_y = y2
f4_x = f3_x
f4_y = f2_y
# Compute areas of each window
sums = intImage[f1_y, f1_x] - intImage[f2_y, f2_x] - intImage[f3_y, f3_x] + intImage[f4_y, f4_x]
# Compute thresholded image and reshape into a 2D grid
out = np.ones(rows*cols, dtype=np.bool)
out[img.ravel()*count <= sums*(100.0 - t)/100.0] = False
# Also convert back to uint8
out = 255*np.reshape(out, (rows, cols)).astype(np.uint8)
# Return PIL image back to user
return Image.fromarray(out)
if __name__ == '__main__':
img = Image.open('test.jpg').convert('L')
out = bradley_roth_numpy(img)
out.show()
out.save('output.jpg')
The image is read in and converted to grayscale if required. The output image will be displayed, and it will be saved to the same directory where you ran the script to an image called output.jpg
. If you want to override the settings, simply do:
out = bradley_roth_numpy(img, windowsize, threshold)
Play around with this to get good results. Using the default parameters and using IPython, I measured the average time of execution using timeit
, and this is what I get for your image you uploaded in your post:
In [16]: %timeit bradley_roth_numpy(img)
100 loops, best of 3: 7.68 ms per loop
This means that running this function repeatedly 100 times on the image you uploaded, the best of 3 execution times gave on average 7.68 milliseconds per run.
I also get this image as a result when I threshold it:
