9

I need to recognise numbers from the camera image on iPhone, in real-time. I know there will be no more than 5 digits on the image.

Is this problem realistic to solve given the computational specifications of the iPhone? Does anyone have any experience using the Tesseract OCR library, and do you think it could be solved by using it?

MitLap
  • possible duplicate of: http://stackoverflow.com/questions/3140455/training-tesseract-to-use-with-iphone – Dan Hanly Feb 03 '11 at 14:26
  • 1
    @Daniel: While that question asks how to use Tesseract to read numbers in any still image, this is investigating the possibility of doing this kind of processing from a live video stream. I believe there's enough of a difference here to justify a new question. – Brad Larson Feb 03 '11 at 16:32

5 Answers

11

That depends on your definition of "real-time", but yes, it should be possible to do relatively fast recognition of just the digits 0-9 on an iPhone 4, particularly if you can constrain the fonts, lighting conditions, etc. that they will appear in.

I highly recommend reading the article on how Sudoku Grab does its recognition of puzzles using the iPhone camera. In their case, a trained neural network was used to identify the digits, which should be reasonably simple and fast on modern iOS hardware.
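To give a feel for why a small trained network is cheap to evaluate, here is a minimal sketch of the forward pass of a two-layer classifier in plain Python. It is not Sudoku Grab's actual network: the layer sizes, the function names (`dense`, `classify_digit`), and the weights you pass in are all assumptions for illustration, and real weights would come from training on sample digit images.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def sigmoid(values):
    """Element-wise logistic activation."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def classify_digit(pixels, hidden_w, hidden_b, out_w, out_b):
    """Return (predicted digit, probabilities) for a flattened glyph image.

    All weight arguments are hypothetical placeholders for trained values;
    `out_w` is assumed to have ten rows, one per digit 0-9.
    """
    hidden = sigmoid(dense(pixels, hidden_w, hidden_b))
    probs = softmax(dense(hidden, out_w, out_b))
    return probs.index(max(probs)), probs
```

Per digit, the work is a couple of small matrix-vector products, so even with at most five digits per frame the classification step is unlikely to be the bottleneck; locating and cropping the digits in the camera image usually costs more.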

The current recognition libraries out there, like OpenCV, will use the iPhone's CPU to do the processing. I've heard that they can do even more complex tasks like facial recognition fast enough to use with video sources while showing a minimal amount of stutter.

For even better performance, I believe that there's a lot of potential in the programmable GPUs on the newer iOS devices. In my benchmarks, I saw a 14X - 28X speedup when using the iPhone 4's GPU for simple image processing. While few people are looking at this right now, something like Sudoku Grab's neural network should be a parallel enough process to benefit from running on the GPU.

Brad Larson
1

There is a free SDK for that: http://rtrsdk.com/ It supports both iOS and Android, works in real time, and helps you capture any text, so numbers should not be a problem.

Disclaimer: I work for ABBYY

Tomato
1

It should be computationally possible. There are apps that can read a bar code in real time, and also an app that does real-time translation (Word Lens). I'm not sure what libraries they use, however.

Alex Argo
1

Yes, it is possible using the Tesseract engine.

Here is some sample code if you would like to check:

https://github.com/nolanbrown/Tesseract-iPhone-Demo

Ankit Srivastava
0

Yes. Bender can help you with that. It lets you build and run neural nets on iOS. As it uses Metal under the hood, it runs fast and smooth. It also supports running TensorFlow models directly.

So you can take an existing TensorFlow model trained for digit recognition and run it in Bender. See "Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras" if you need help.

Disclaimer: I worked on this project.

bryant1410