
After a broad keyword search in Google Scholar, image search, and the web, I cannot find anything related to OCR of diagonal text. There are a few close examples:

So, presumably, functions for diagonal text fields do not exist in OpenCV. Is this true? And if so, how are diagonal text fields handled?

forest.peterson

1 Answer


It seems you want to perform OCR on a page with both horizontal and diagonal text. There is no straightforward solution built into OpenCV, but you could take a divide-and-conquer approach such as:

  1. Partition the image according to prior knowledge about the document (common with forms), or the distribution of white regions (column spacing etc.)
  2. Identify regions that may contain diagonal text (one method: after blurring and thresholding, lines of diagonal text show up as fat diagonal lines)
  3. Rotate the partition and perform OCR
  4. Merge results for different partitions

You can also try a brute-force approach: rotate the image through a range of angles and perform OCR at each angle. The results from the different angles will then have to be merged.

Totoro
  • "rotating the image by a range of angles and performing OCR on all of them": that could work, but the diagonal header text is closely spaced (which is why it is set diagonally), so the rectangular recognition box also captures text from the preceding and succeeding columns, and each field ends up containing stray text. In the example form I gave in the question, http://stackoverflow.com/questions/12455018/driver-logs-image-recognition-in-c-net, "Dallas, TX" and "Corsicana, TX" overlap, so the OCR field will capture "Dallas, TX Lunch Corsica Lo". – forest.peterson Dec 16 '13 at 19:54