Tesseract/Leptonica is all you need.
A couple of caveats before answering, though:
- Tesseract's Auto OSD should handle this by default. Try that option first and you might find you are already getting the right results.
- Tesseract-OCR is not suitable for handwritten text, which makes up most of your content (how to extract handwritten text is out of scope for this question).
You can find the orientation and also correct it using Tesseract libraries.
- Make sure you have osd.traineddata in your tessdata folder. OSD stands for Orientation and Script Detection; this is the data file that lets Tesseract detect the orientation. If you do not have it, it can be found here: https://github.com/tesseract-ocr/tessdata/blob/main/osd.traineddata
- Use Tesseract.PageSegMode.AutoOsd. This differs from what you had before: it performs not only automatic page segmentation but also orientation and script detection.
- Use page.AnalyseLayout().
  - This returns the detected orientation of the text as an enum: Page Up (0 degrees, what you expect), Page Right (90 degrees), Page Down (180 degrees), or Page Left (270 degrees).
  - It also returns the deskew angle, which can be passed to Leptonica to correct the skew by rotation. You can do this through the Tesseract libraries, so there is no need to install another package.
using (var pix = PixConverter.ToPix(image))
{
    using (var page = engine.Process(pix, Tesseract.PageSegMode.AutoOsd))
    {
        using (var pageIter = page.AnalyseLayout())
        {
            pageIter.Begin();
            var pageProps = pageIter.GetProperties();
            // Get page orientation (PageUp, PageRight, PageDown, PageLeft)
            var orientation = pageProps.Orientation;
            // Rotate by DeskewAngle (in radians). Rotate returns a new Pix
            // rather than modifying pix in place, so keep the result.
            using (var deskewed = pix.Rotate(pageProps.DeskewAngle))
            {
                // Use or save the deskewed image here
            }
        }
    }
}
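The deskew angle only covers small skew, so a page detected as Page Right, Page Down, or Page Left still needs a quarter-turn correction based on the orientation value. A minimal sketch of that mapping follows. The PageOrientation enum and the OrientationFix helper are hypothetical names declared here only so the snippet compiles on its own (in real code you would switch on the wrapper's own orientation enum), and the sign convention (a positive angle rotates clockwise) is an assumption you should check against a sample image.

```csharp
using System;

// Hypothetical stand-in for the wrapper's orientation enum, declared here
// only so this sketch is self-contained.
public enum PageOrientation { PageUp, PageRight, PageDown, PageLeft }

public static class OrientationFix
{
    // Angle in degrees associated with each detected orientation,
    // following the 0 / 90 / 180 / 270 mapping described above.
    public static int DetectedDegrees(PageOrientation orientation)
    {
        switch (orientation)
        {
            case PageOrientation.PageUp: return 0;
            case PageOrientation.PageRight: return 90;
            case PageOrientation.PageDown: return 180;
            case PageOrientation.PageLeft: return 270;
            default: throw new ArgumentOutOfRangeException(nameof(orientation));
        }
    }

    // Rotation in radians that completes the detected angle to a full turn,
    // bringing the page back to Page Up.
    // Assumption: positive angles rotate clockwise; flip the sign if your
    // sample images come out upside down.
    public static float CorrectionRadians(PageOrientation orientation)
    {
        int correctionDegrees = (360 - DetectedDegrees(orientation)) % 360;
        return (float)(correctionDegrees * Math.PI / 180.0);
    }
}
```

The returned value could be passed to pix.Rotate in place of the deskew angle. Check on a sample image that the rotation direction is right and that the rotated image is not cropped, since a quarter turn swaps the width and height.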
I ran this on your images and the orientation was detected correctly for both samples (image 1 is detected as "Page Right" and image 2 as "Page Left").