Is there a way to detect whether the text on a page is in landscape or portrait orientation, using JS or any library? Example: a rotated page with portrait orientation vs. a rotated page with landscape orientation.

I cannot rely on comparing width and height (width > height), or on checking whether the page is rotated, because both of these pages are rotated 90 degrees. What I cannot figure out is how to detect the orientation of the text itself.

I also do some preprocessing on the PDF using Node.js and pdfjs, so if either of those has an API or library that gives me the required information, I would appreciate the help.

IrtzaSuhail

1 Answer

You can do this with Tesseract, which is mainly used for OCR. I use it with PHP, but you can also use it with JS: https://ourcodeworld.com/articles/read/580/how-to-convert-images-to-text-with-pure-javascript-using-tesseract-js

Tesseract can detect orientation. Here is a related question that covers it for Python: "Is it possible to check orientation of an image before passing it through pytesseract ocr module"

All you would need to do is adapt that approach to JavaScript using the library from the first link above.
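As a rough illustration of what that adaptation could look like, here is a minimal sketch of interpreting an orientation detection result in JavaScript. It assumes the result object carries `orientation_degrees` and `orientation_confidence`, which is how I understand tesseract.js reports the output of `worker.detect()`; please check the field names against the version you install. The function names and the `minConfidence` threshold are made up for this example, and `worker` is simply any object with an async `detect(image)` method, so worker setup (which differs between tesseract.js versions) is left out.

```javascript
// Sketch only: interpret an orientation/script detection (OSD) result.
// Assumption: `osd` has orientation_degrees (0, 90, 180 or 270) and
// orientation_confidence, as tesseract.js appears to report them.
// `minConfidence` is an arbitrary illustrative threshold, not a library default.
function classifyOrientation(osd, minConfidence = 1.0) {
  // Snap to the nearest quarter turn and normalise into the range 0..359.
  const degrees = (((Math.round(osd.orientation_degrees / 90) * 90) % 360) + 360) % 360;
  return {
    degrees,
    // Text runs sideways relative to the rendered image at 90 or 270 degrees.
    sideways: degrees === 90 || degrees === 270,
    confident: osd.orientation_confidence >= minConfidence,
  };
}

// Hypothetical helper: `worker` is any object exposing async detect(image),
// for example a tesseract.js worker that was created with OSD support.
async function detectTextRotation(worker, image) {
  const { data } = await worker.detect(image);
  return classifyOrientation(data);
}
```

You would render each PDF page to an image first (you already preprocess with pdfjs) and pass that image in. If `sideways` is true and `confident` is true, the text on the rendered page is rotated a quarter turn, regardless of what the page's width, height or rotation flag say.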

rf1234