I need to recognize text from forms in which each character is written in its own box.
I have tried using a bounding box for each input field and cropping that field, i.e. I can get the whole group of boxes for the 'Name' field. But when I try to detect the individual boxes within that group, OpenCV returns only one contour for all of them. The file read in the for loop contains the coordinates of each field's bounding box, and cropped_img is the image of a single field's input (e.g. Name).
Full form image
This is the image of the form.
Cropped image for each field
Each cropped field contains many boxes, one per character, but the number of contours detected is always one. Why can't I detect the individual boxes? In short, I want every individual box in cropped_img.
Also, any other ideas for approaching form OCR would be really appreciated!
    import cv2
    import numpy as np

    # file, panimg and get_contour_precedence are defined elsewhere
    for line in file.read().split("\n"):
        if len(line) == 0:
            continue
        region = list(map(int, line.split(' ')[:-1]))
        index = line.split(' ')[-1]
        text = ''
        contentDict = {}
        # uzn format: left, top, width, height; convert to left, top, right, bottom
        region[2] = region[0] + region[2]
        region[3] = region[1] + region[3]
        region = tuple(region)
        cropped_img = panimg[region[1]:region[3], region[0]:region[2]]
        index = index.replace('_', ' ')
        if index == 'sign' or index == 'picture' or index == 'Dec sign':
            continue
        kernel = np.ones((50, 50), np.uint8)
        gray = cv2.cvtColor(cropped_img, cv2.COLOR_BGR2GRAY)
        ret, threshold = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
        threshold = cv2.bitwise_not(threshold)
        dilate = cv2.dilate(threshold, kernel, iterations=1)
        ret, threshold = cv2.threshold(dilate, 127, 255, cv2.THRESH_BINARY)
        dilate = cv2.dilate(threshold, kernel, iterations=1)
        contours, hierarchy = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        # sorted() works whether findContours returns a list or a tuple
        contours = sorted(contours, key=lambda x: get_contour_precedence(x, panimg.shape[1]))
        print("Length of contours detected: ", len(contours))
        for j, ctr in enumerate(contours):
            # Get bounding box
            x, y, w, h = cv2.boundingRect(ctr)
            # Getting ROI
            roi = cropped_img[y:y+h, x:x+w]
            # show ROI
            cv2.imshow('segment no:' + str(j-1), roi)
            cv2.waitKey(0)
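To make the effect of the dilation step easier to reason about, here is a simplified 1-D illustration in plain Python (no OpenCV). It is only a sketch: the row of pixels, the 2 px border thickness and the 40 px box spacing are hypothetical values chosen for the demo, and dilate_1d and count_runs are made-up helper names. It shows that a 50 px wide kernel closes every gap narrower than about 50 px, so separate box borders fuse into a single blob, which would be consistent with getting exactly one external contour.

```python
def dilate_1d(row, kernel_width):
    """Binary dilation of a 1-D row: a pixel turns on if any pixel
    inside the kernel window centred on it is on."""
    half = kernel_width // 2
    n = len(row)
    return [1 if any(row[max(0, i - half):min(n, i + half + 1)]) else 0
            for i in range(n)]


def count_runs(row):
    """Count separate runs of consecutive 1s (a 1-D stand-in for external contours)."""
    return sum(1 for i, v in enumerate(row) if v and (i == 0 or not row[i - 1]))


# Hypothetical scan line through a field: five vertical box borders,
# each 2 px thick, starting every 40 px.
row = [0] * 200
for start in range(10, 200, 40):
    row[start] = row[start + 1] = 1

print(count_runs(row))                 # 5 separate borders before dilation
print(count_runs(dilate_1d(row, 50)))  # 1 merged blob after a 50 px dilation
```

In the posted code the kernel is 50x50 and the dilation is applied twice, while the fields listed in the coordinate file are only 37 to 87 px high, so the same merging would also be expected vertically.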
The contents of 'file' are as follows:
462 545 468 39 AO_Office
450 785 775 39 Last_Name
452 836 770 37 First_Name
451 885 772 39 Middle_Name
241 963 973 87 Abbreviation_Name
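For reference, each line above can be read as left, top, width, height followed by the field label. The small sketch below isolates that parsing step so it can be checked on its own; parse_uzn_line is a hypothetical helper name, not part of the original script.

```python
def parse_uzn_line(line):
    """Split one 'left top width height Label' line into a box and a label."""
    parts = line.split()
    left, top, width, height = map(int, parts[:4])
    label = parts[4].replace("_", " ")
    # Convert width/height into right/bottom so the box can be used for slicing.
    return (left, top, left + width, top + height), label


box, label = parse_uzn_line("462 545 468 39 AO_Office")
print(box, label)  # (462, 545, 930, 584) AO Office
```

The resulting box can then be used as panimg[top:bottom, left:right] to crop the field.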
The expected output is one contour per individual box, i.e. one for each single-letter input box in every field.
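One possible way to get that output without contours is to look for the vertical border lines directly: a column that is almost entirely ink is probably a border, and the columns between two borders form one character box. The sketch below is an assumption-laden illustration in plain Python, not a tested solution for the real form. split_boxes and min_line_fraction are hypothetical names, the input is a 2-D list of 0/1 values (which could be obtained from the inverted threshold image with something like (threshold // 255).tolist()), and the demo image is synthetic.

```python
def split_boxes(binary, min_line_fraction=0.8):
    """Return (start, end) column ranges enclosed between vertical border lines.

    binary: 2-D list of rows of 0/1, where 1 marks ink (borders or handwriting).
    A column is treated as a border when at least min_line_fraction of its
    pixels are ink. end is exclusive, so a span can be used directly for slicing.
    """
    height = len(binary)
    width = len(binary[0])
    is_border = [
        sum(row[x] for row in binary) >= min_line_fraction * height
        for x in range(width)
    ]
    spans, start, seen_border = [], None, False
    for x in range(width):
        if is_border[x]:
            if start is not None:
                spans.append((start, x))  # close the box that was open
                start = None
            seen_border = True
        elif seen_border and start is None:
            start = x  # first non-border column after a border
    return spans  # an open span with no closing border is ignored


# Synthetic field: two adjacent boxes sharing the border at column 5.
height, width = 6, 13
demo = [[0] * width for _ in range(height)]
for x in (1, 5, 9):              # vertical borders
    for y in range(height):
        demo[y][x] = 1
for x in range(1, 10):           # top and bottom borders
    demo[0][x] = demo[height - 1][x] = 1

print(split_boxes(demo))  # [(2, 5), (6, 9)]
```

Each span could then be cropped as cropped_img[:, start:end] and passed to the recognizer. Tall handwritten strokes that fill most of a column may be mistaken for borders, so the threshold would need tuning on real scans.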