Python Stroke Width Transform

Question

I'm trying to implement the Stroke Width Transform in Python. I have gone through numerous Stack Overflow questions and answers and other resources on the internet, but have not found an implementation of it, so I decided to try writing my own.

After performing Canny edge detection, my first step is to calculate the x and y derivatives of the image:

sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=5)
sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=5)

How do I then combine these to get the gradient at each point in the image? Also, how do I then calculate the width of the stroke?

I'm using the steps given in the original IEEE paper (freely accessible paper direct from Microsoft here) for reference. The steps are described at the bottom of the right-hand column on page 3.

Check the C++ code [here](https://stackoverflow.com/a/31811225/5008845) to get an idea. That's a bit outdated... but it's still useful to understand how it works. HINT: it's better to keep x and y derivatives separate — Miki, Jul 07 '17 at 17:21
The SWT uses the angle of the gradient which you can easily calculate; you have the magnitude in `x` and `y` directions so you can grab the angle with inverse trig functions (like arctan). — alkasm, Jul 07 '17 at 17:27
@AlexanderReynolds or do you think I could use this - http://docs.opencv.org/2.4/modules/core/doc/operations_on_arrays.html#phase? — Sibi, Jul 07 '17 at 17:29
@Sibi yes, notice that it simply calculates `atan2(y,x)` like I mentioned. The paper discusses how to calculate the stroke width. The basic idea is you start with a pixel in Canny, follow the direction of the gradient until you hit another pixel from Canny. If the gradient direction is opposite, then that's your stroke width, unless either of those pixels has been hit before and stored a lower value. — alkasm, Jul 07 '17 at 17:57
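
To make the comments above concrete, here is a minimal sketch of the ray-casting idea they describe, not a full SWT. It assumes `edges` is the Canny output and `sobelx`/`sobely` are the derivatives from the question. The function name `stroke_width_rays` and the parameters `max_len` and `dark_on_light` are made up for illustration; they do not come from the paper or from OpenCV. The unit vector `(dx, dy)` is the same direction you would get from `atan2(sobely, sobelx)`. The paper's second pass (clamping each ray to its median) and the grouping of pixels into letters are left out.

```python
import numpy as np

def stroke_width_rays(edges, sobelx, sobely, max_len=50, dark_on_light=True):
    """Follow the gradient from each edge pixel to the opposite stroke edge.

    edges          -- 2-D array, nonzero at edge pixels (e.g. Canny output)
    sobelx, sobely -- x and y derivatives of the grayscale image
    max_len        -- assumed cap on ray length, in pixels
    dark_on_light  -- assumption: text is darker than the background
    """
    h, w = edges.shape
    mag = np.hypot(sobelx, sobely)       # gradient magnitude
    mag[mag == 0] = 1.0                  # avoid division by zero
    dx, dy = sobelx / mag, sobely / mag  # unit gradient direction
    if dark_on_light:
        # The gradient points from dark to light, so flip it to walk
        # into a dark stroke instead of out of it.
        dx, dy = -dx, -dy

    swt = np.full((h, w), np.inf)        # inf means "no stroke found here"
    min_opposite = -np.cos(np.pi / 6)    # directions opposite within ~30 deg

    for y0, x0 in zip(*np.nonzero(edges)):
        y0, x0 = int(y0), int(x0)
        gx, gy = dx[y0, x0], dy[y0, x0]
        if gx == 0 and gy == 0:
            continue                     # no usable direction at this pixel
        ray = [(y0, x0)]
        for step in range(1, max_len + 1):
            x = int(round(x0 + gx * step))
            y = int(round(y0 + gy * step))
            if not (0 <= x < w and 0 <= y < h):
                break                    # ray left the image
            if (y, x) == ray[-1]:
                continue                 # still inside the same pixel
            ray.append((y, x))
            if edges[y, x]:
                # Accept only if the far edge's gradient is roughly opposite.
                if gx * dx[y, x] + gy * dy[y, x] < min_opposite:
                    width = float(np.hypot(x - x0, y - y0))
                    for ry, rx in ray:
                        # Keep the smallest width seen so far at each pixel.
                        swt[ry, rx] = min(swt[ry, rx], width)
                break                    # stop at the first edge pixel hit
    return swt
```

For light text on a dark background you would call it again with `dark_on_light=False`; as far as I can tell the paper likewise runs the transform once per polarity.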

score 2 · Answer 1 · answered Jul 07 '17 at 18:48

There's a Python implementation of the SWT here. Unlike many SWT implementations I've seen, this one also clusters text regions into groups that likely represent words.

woodstockhausen
