Tesseract OCR build

I want to deploy Tesseract OCR on AWS Lambda. Scroll to the section where it says "adaptions for tesseract 4". I built Tesseract following this tutorial, but the build is not portable: whenever I try to use it on a new Linux instance, I have to set the PATH environment variable to /tmp or else it won't work.

Structure

├── cv2
├── lib
├── numpy
├── PIL
├── pytesseract
├── tessdata
├── tesseract
├── test.png
└── zzz.py

https://s3.amazonaws.com/tesseractstandalone/complete-package.zip

This is the link to the standalone Tesseract package. It includes a sample program, zzz.py, which contains the script for running Tesseract. When I download the zip and extract it to the /tmp/ folder on an EC2 instance, the program works fine. On Lambda, however, when I download the same package to the /tmp/ folder, I get an error saying that tesseract is not installed or it's not in your PATH. I don't know where things are going wrong, and I'm not sure whether it's a PATH issue or a Lambda issue.

Random
Bijju
  • see https://docs.aws.amazon.com/lambda/latest/dg/env_variables.html#env_limits – petey Sep 27 '18 at 15:33
  • @petey Thanks. I was wrong: PATH is not a reserved variable, so I can use it. But I could still use an answer on how to change the variable from PATH to something of my own. – Bijju Sep 27 '18 at 15:41
  • What does this have to do with the Linux kernel internals? – Barmar Sep 27 '18 at 15:57
  • Lol, he naively mentioned linux-kernel just for tagging. – Random Sep 27 '18 at 17:47
  • @Bijju I'd like to attempt that, but AWS Lambda is not my forte. You should be able to configure your Lambda following these instructions: https://docs.aws.amazon.com/lambda/latest/dg/tutorial-env_console.html#tutorial-env-configure-function (I think) – petey Sep 27 '18 at 18:54
  • @petey, thanks for offering help. – Bijju Sep 27 '18 at 19:20
  • https://s3.amazonaws.com/tesseractstandalone/complete-package.zip I have packaged everything into one file. This is a standalone package; just modify the file paths in zzz.py to get the program working. This works on an EC2 instance but not on Lambda. Updating the post. – Bijju Sep 27 '18 at 19:21

1 Answer

Finally got help from AWS support. It seems that the executable doesn't have execute permission once it has been downloaded to Lambda. Solved by running chmod 755 on the executable.
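For anyone hitting the same thing, here is a minimal sketch of doing that chmod from inside the Python handler, after the package has been extracted. The helper name `prepare_binary` and the example location /tmp/tesseract are assumptions for illustration, not something taken from zzz.py; adjust them to wherever the executable actually lands.

```python
import os


def prepare_binary(binary_path):
    """Make binary_path executable (same effect as chmod 755) and put its
    directory at the front of PATH. Returns True if it is now executable.

    prepare_binary is a hypothetical helper name, not part of the package.
    """
    os.chmod(binary_path, 0o755)  # rwxr-xr-x

    bin_dir = os.path.dirname(os.path.abspath(binary_path))
    paths = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    if bin_dir not in paths:
        os.environ["PATH"] = os.pathsep.join([bin_dir] + paths)

    return os.access(binary_path, os.X_OK)


# Assumed usage inside the Lambda handler, once the zip is extracted to /tmp:
#   prepare_binary("/tmp/tesseract")
# If you would rather not touch PATH, pytesseract can also be pointed at the
# executable directly:
#   pytesseract.pytesseract.tesseract_cmd = "/tmp/tesseract"
```

Call it once before the first pytesseract call; the permission bits are lost when the file is downloaded, so the chmod has to be repeated on every fresh Lambda container.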

Bijju