How to extract text from custom blocks in an image using the Google Vision API (OCR)?

Question

When we use Google Vision's DOCUMENT_TEXT_DETECTION on an image, it decides what the blocks in the image are and what text is in each block.
Here I want to get the text for blocks that I define myself (I already have a model that identifies the different blocks in an image).
Put simply, I want the text within the blocks defined by me, not the ones defined by Google Vision.
How can I achieve this?

score 1 · Answer 1 · answered Mar 29 '22 at 04:34

I found a better way to do this. First I merge the blocks vertically into a single image, and between every pair of blocks I insert a textual separator, so every block is followed by a line of known text. Then we provide this merged image as the input to the Google Vision API. In the response we get the full text for our input, which also contains the separator text that we placed between the blocks. So we can split the whole text on that separator, which gives us the text block by block.
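The merge-and-split idea above can be sketched roughly as follows. This is a minimal sketch, not the answerer's actual code: it assumes Pillow for the image work, the `SEPARATOR` string and both function names are hypothetical, and the default bitmap font is probably too small for OCR to read reliably, so a larger TrueType font would likely be needed in practice.

```python
SEPARATOR = "=====BLOCK====="  # hypothetical marker; pick something OCR reads reliably


def stack_blocks(crops, separator=SEPARATOR, band_height=40):
    """Stack cropped block images vertically, drawing a separator line after each.

    `crops` is a list of PIL.Image objects, one per user-defined block.
    """
    # Imported here so the text-splitting helper below works without Pillow.
    from PIL import Image, ImageDraw

    width = max(c.width for c in crops)
    height = sum(c.height + band_height for c in crops)
    canvas = Image.new("RGB", (width, height), "white")
    draw = ImageDraw.Draw(canvas)
    y = 0
    for crop in crops:
        canvas.paste(crop, (0, y))
        y += crop.height
        # Assumption: default font; swap in a larger font if OCR misses it.
        draw.text((10, y + band_height // 3), separator, fill="black")
        y += band_height
    return canvas


def split_block_texts(full_text, separator=SEPARATOR):
    """Split the OCR full text on the separator into one string per block."""
    parts = [p.strip() for p in full_text.split(separator)]
    # The separator follows every block, so the final piece is empty.
    if parts and parts[-1] == "":
        parts.pop()
    return parts
```

The merged image would be sent to DOCUMENT_TEXT_DETECTION once, and the returned full text passed to `split_block_texts`. If OCR misreads even one character of the separator, the split silently merges two blocks, so it is worth checking that the number of returned parts equals the number of blocks.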

score 0 · Answer 2 · answered Mar 24 '22 at 11:16

For now, I decided to filter the symbols by the given block's vertices. It would be better if there were a way to simply find the intersecting symbols; for now, I'm going to loop through every symbol.
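The symbol loop described above could look something like this. It is a sketch under assumptions: `annotation` is taken to be the `full_text_annotation` of a DOCUMENT_TEXT_DETECTION response from the Python client (pages, blocks, paragraphs, words, symbols, each symbol with `text` and `bounding_box.vertices`), the custom block is an axis-aligned rectangle, and "inside" is decided by the symbol's center, which is a choice made here rather than anything the API prescribes. The function names are hypothetical.

```python
def center(vertices):
    """Average of a bounding polygon's vertices."""
    xs = [v.x for v in vertices]
    ys = [v.y for v in vertices]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def text_in_rect(annotation, rect):
    """Collect text whose symbol centers fall inside rect.

    rect is (x_min, y_min, x_max, y_max) in image pixel coordinates.
    Symbols of the same word are joined; words are separated by a space.
    """
    x_min, y_min, x_max, y_max = rect
    words = []
    for page in annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    kept = []
                    for symbol in word.symbols:
                        cx, cy = center(symbol.bounding_box.vertices)
                        if x_min <= cx <= x_max and y_min <= cy <= y_max:
                            kept.append(symbol.text)
                    if kept:
                        words.append("".join(kept))
    return " ".join(words)
```

Calling `text_in_rect(response.full_text_annotation, (x0, y0, x1, y1))` once per custom block gives the block-wise text. Joining words with a single space ignores the line-break information the API reports, so multi-line blocks come back as one line.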

Vidu VDS


I ran into an issue here: what should be done about overlaps? – Vidu VDS Mar 25 '22 at 05:33
With shapely, it is possible to find intersecting polygons - https://gis.stackexchange.com/questions/90055/finding-if-two-polygons-intersect-in-python – Vidu VDS Mar 25 '22 at 08:26
Finally, I wrote code that first checks for blocks in the given area and, if any exist, builds the block text. Where no block is found, I check that block's paragraphs in the given area and build the paragraph text for those that are found. Where no paragraph is found, I continue down the hierarchy (Pages -> Blocks -> Paragraphs -> Words -> Symbols). I have finished every level except the symbol level, but the output is missing a lot of content. – Vidu VDS Mar 29 '22 at 04:30
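The hierarchy walk described in the last comment might be sketched like this. It is not the commenter's code: it assumes the element layout of the Python client's `full_text_annotation`, uses axis-aligned boxes in plain Python instead of shapely polygons, and the `threshold` parameter and all function names are hypothetical. An element that lies (almost) entirely inside the region is taken whole, a disjoint one is skipped, and a partially overlapping one is descended into, down to individual symbols.

```python
LEVELS = ["pages", "blocks", "paragraphs", "words"]


def bbox(element):
    """Axis-aligned (x_min, y_min, x_max, y_max) of an element's bounding_box."""
    xs = [v.x for v in element.bounding_box.vertices]
    ys = [v.y for v in element.bounding_box.vertices]
    return min(xs), min(ys), max(xs), max(ys)


def overlap_fraction(box, region):
    """Fraction of `box` area that lies inside `region` (both axis-aligned)."""
    w = min(box[2], region[2]) - max(box[0], region[0])
    h = min(box[3], region[3]) - max(box[1], region[1])
    area = (box[2] - box[0]) * (box[3] - box[1])
    if w <= 0 or h <= 0 or area <= 0:
        return 0.0
    return (w * h) / area


def word_text(word, region=None):
    """Join a word's symbols, optionally keeping only those centered in region."""
    chars = []
    for s in word.symbols:
        if region is not None:
            x0, y0, x1, y1 = bbox(s)
            cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
            if not (region[0] <= cx <= region[2] and region[1] <= cy <= region[3]):
                continue
        chars.append(s.text)
    return "".join(chars)


def all_words(element, level):
    """Every word under `element`; `level` names the kind of `element`."""
    if level == "words":
        return [word_text(element)]
    child = LEVELS[LEVELS.index(level) + 1]
    return [w for c in getattr(element, child) for w in all_words(c, child)]


def words_in_region(element, level, region, threshold=0.95):
    """Words of `element` inside `region`, descending only on partial overlap."""
    if level != "pages":  # pages carry no bounding_box, so always descend
        frac = overlap_fraction(bbox(element), region)
        if frac == 0.0:
            return []  # disjoint: skip the whole subtree
        if frac >= threshold:
            return all_words(element, level)  # (almost) fully inside: take it whole
    if level == "words":
        text = word_text(element, region)  # partial word: filter symbol by symbol
        return [text] if text else []
    child = LEVELS[LEVELS.index(level) + 1]
    return [w for c in getattr(element, child)
            for w in words_in_region(c, child, region, threshold)]
```

For one page this would be called as `" ".join(words_in_region(page, "pages", region))`. Because a symbol is assigned by its center, it lands in at most one of two non-overlapping custom regions; when the custom regions themselves overlap, a symbol can still appear in both, and rotated text would need real polygon intersection such as shapely provides.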
